Quantifying the Importance of Latent Features in Neural Networks

Alshareef, A., Berthier, N., Schewe, S. ORCID: 0000-0002-9093-9518 and Huang, X. ORCID: 0000-0001-6267-0366
(2022) Quantifying the Importance of Latent Features in Neural Networks.

ImportanceOfLatentFeatures.pdf - Author Accepted Manuscript

Abstract

The susceptibility of deep learning models to adversarial examples raises serious concerns over their application in safety-critical contexts. In particular, the level of understanding of the underlying decision processes often lies far below what can reasonably be accepted for standard safety assurance. In this work, we provide insights into the high-level representations learned by neural network models. We specifically investigate how the distribution of features in their latent space changes in the presence of distortions. To achieve this, we first abstract a given neural network model into a Bayesian network, where each random variable represents the value of a hidden feature. We then estimate the importance of each feature by analysing the sensitivity of the abstraction to targeted perturbations. An importance value indicates the role of the corresponding feature in the underlying decision process. Our empirical results suggest that the obtained feature importance measures provide valuable insights for validating and explaining neural network decisions.

Item Type: Conference or Workshop Item (Unspecified)
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 07 Mar 2022 09:05
Last Modified: 18 Jan 2023 21:11
URI: https://livrepository.liverpool.ac.uk/id/eprint/3150228