On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation | PLOS One

Fig 1.

Visualization of the pixel-wise decomposition process.
In the classification step the image is converted to a feature vector representation and a classifier is applied to assign the image to a given category, e.g., “cat” or “no cat”. Note that computing the feature vector usually involves several intermediate representations. Our method decomposes the classification output f(x) into sums of feature and pixel relevance scores. The final relevances visualize the contributions of single pixels to the prediction. Cat image by Pixabay user stinne24.

Fig 2.

Left: A neural network-shaped classifier during prediction time.
w_ij are connection weights. a_i is the activation of neuron i. Right: The neural network-shaped classifier during layer-wise relevance computation time. R_i^(l) is the relevance of neuron i which is to be computed. In order to facilitate the computation of R_i^(l) we introduce messages R_{i←j}^(l,l+1). These messages need to be computed such that the layer-wise relevance in Eq (2) is conserved. The messages are sent from a neuron i to its input neurons j via the connections used for classification, e.g. neuron 2 is an input neuron for neurons 4, 5, 6. Neuron 3 is an input neuron for neurons 5, 6. Neurons 4, 5, 6 are the inputs for neuron 7.
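
The message passing described in this caption can be sketched on the seven-neuron network it names. This is a minimal illustration, not the paper's implementation: the weights, the input values, the connections leaving neuron 1, and the proportional redistribution rule (messages proportional to a_i · w_ij) are all assumptions made for the example.

```python
# Toy network with the connectivity named in the caption. The weights, the
# inputs, and the edges leaving neuron 1 are made-up values for illustration.
WEIGHTS = {
    (1, 4): 0.5, (1, 5): 0.2,
    (2, 4): 0.3, (2, 5): 0.4, (2, 6): 0.1,
    (3, 5): 0.6, (3, 6): 0.7,
    (4, 7): 1.0, (5, 7): 0.5, (6, 7): 0.8,
}


def inputs_of(neuron):
    """Return {source: weight} for every connection feeding `neuron`."""
    return {i: w for (i, j), w in WEIGHTS.items() if j == neuron}


def forward(inputs):
    """Prediction pass: rectified hidden neurons 4-6, linear output neuron 7."""
    a = dict(inputs)
    for j in (4, 5, 6):
        a[j] = max(0.0, sum(a[i] * w for i, w in inputs_of(j).items()))
    a[7] = sum(a[i] * w for i, w in inputs_of(7).items())
    return a


def propagate(a, upper_relevance):
    """Split each upper neuron's relevance among its inputs in proportion to a_i * w_ij."""
    lower = {}
    for j, r_j in upper_relevance.items():
        z = {i: a[i] * w for i, w in inputs_of(j).items()}
        total = sum(z.values())
        if total == 0.0:
            continue  # nothing to redistribute; a stabilizer would be needed in practice
        for i, z_ij in z.items():
            message = r_j * z_ij / total  # relevance message passed from j down to i
            lower[i] = lower.get(i, 0.0) + message
    return lower


activations = forward({1: 1.0, 2: 2.0, 3: 0.5})
r_output = {7: activations[7]}               # start from the prediction f(x)
r_hidden = propagate(activations, r_output)  # relevance of neurons 4, 5, 6
r_input = propagate(activations, r_hidden)   # relevance of neurons 1, 2, 3
```

Under these assumptions the relevances of each layer sum to the output activation of neuron 7, which is the conservation property the caption refers to.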

Fig 3.

An exemplary real-valued prediction function for classification with the dashed black line being the decision boundary which separates the blue from the green dots.
The blue dots are labeled negatively, the green dots are labeled positively. Left: Local gradient of the classification function at the prediction point. Right: Taylor approximation relative to a root point on the decision boundary. This figure depicts the intuition that a gradient at a prediction point x (indicated here by a square) does not necessarily point to a close point on the decision boundary. Instead it may point to a local optimum or to a faraway point on the decision boundary. In this example the explanation vector from the local gradient at the prediction point x has too large a contribution in an irrelevant direction. The closest neighbors of the other class can be found at a very different angle. Thus, the local gradient at the prediction point x may not be a good explanation for the contributions of single dimensions to the function value f(x). Local gradients at the prediction point in the left image and at the Taylor root point in the right image are indicated by black arrows. The nearest root point x₀ is shown as a triangle on the decision boundary. The red arrow in the right image visualizes the approximation of f(x) by Taylor expansion around the nearest root point x₀. The approximation is given as a vector representing the dimension-wise product between Df(x₀) (the black arrow in the right panel) and x − x₀ (the dashed red line in the right panel), which is equivalent to the diagonal of the outer product between Df(x₀) and x − x₀.
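
The dimension-wise product of Df(x₀) and x − x₀ can be made concrete with a small sketch. The prediction function below is a hypothetical stand-in, chosen only because its nearest root point has a closed form; for a real classifier the root point would have to be found by a search.

```python
import math


def f(x):
    """Hypothetical prediction function; its decision boundary f(x) = 0 is the unit circle."""
    return x[0] ** 2 + x[1] ** 2 - 1.0


def gradient(x):
    """Analytic gradient Df of the hypothetical f."""
    return [2.0 * x[0], 2.0 * x[1]]


def nearest_root_point(x):
    """For this particular f, the closest boundary point is x scaled to unit length."""
    norm = math.hypot(x[0], x[1])
    return [x[0] / norm, x[1] / norm]


def taylor_relevance(x):
    """Dimension-wise product of Df(x0) and (x - x0), one score per input dimension."""
    x0 = nearest_root_point(x)
    grad = gradient(x0)
    return [g * (xi - x0i) for g, xi, x0i in zip(grad, x, x0)]


x = [1.2, 0.5]
relevance = taylor_relevance(x)
print(relevance, sum(relevance), f(x))
```

For this toy function the scores sum to about 0.6 while f(x) is 0.69, so the decomposition is a first-order approximation of the prediction rather than an exact split.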

Table 1.

Notation Conventions Used in This Section.

Fig 4.

Local and global predictions for input images are obtained by following a series of steps through the classification- and pixel-wise decomposition pipelines.
Each step taken towards the final pixel-wise decomposition has a complementing analogue within the Bag of Words classification pipeline. The calculations used during the pixel-wise decomposition process make use of information extracted by those corresponding analogues. Airplane image in the graphic by Pixabay user tpsdave.

Fig 5.

Multilayer neural network annotated with the different variables and indices describing neurons and weight connections.
Left: forward pass. Right: backward pass.

Fig 6.

Pixel-wise decomposition for Bag of Words features over χ²-kernels using the Taylor-type decomposition for the third layer and the layer-wise relevance propagation for the subsequent layers.
Left: The original image. Middle: Pixel-wise prediction. Right: Superposition of the original image and the pixel-wise prediction. The decompositions were computed on tiles of size 102 × 102 with a regular offset of 34 pixels, and the decompositions from overlapping tiles were averaged. The heatmap linearly maps the interval [−1, +1] to the jet color map available in many visualization packages: green corresponds to scores close to zero, yellow and red to positive scores, and blue to negative scores. See text for interpretation.
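
The averaging over overlapping tiles can be sketched as follows. This is only an illustration of the bookkeeping: `score_tile` is a hypothetical callback standing in for the per-tile decomposition, and the toy run uses 3 × 3 tiles with offset 1, which keeps the same tile-to-offset ratio as 102 and 34.

```python
def average_tile_scores(height, width, tile, offset, score_tile):
    """Average per-pixel scores over all overlapping tiles.

    `score_tile(top, left)` is a hypothetical callback returning a
    tile x tile grid (list of rows) of relevance scores for one tile.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top in range(0, height - tile + 1, offset):
        for left in range(0, width - tile + 1, offset):
            scores = score_tile(top, left)
            for r in range(tile):
                for c in range(tile):
                    total[top + r][left + c] += scores[r][c]
                    count[top + r][left + c] += 1
    # Pixels never covered by a tile keep a score of 0.
    averaged = [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]
    return averaged, count


def toy_scores(top, left):
    """Made-up scorer: every pixel of a tile gets the tile's top row index."""
    return [[float(top)] * 3 for _ in range(3)]


heatmap, coverage = average_tile_scores(6, 6, 3, 1, toy_scores)
```

With a tile size of three times the offset, interior pixels are covered by nine tiles, while corner pixels are covered by only one.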

Fig 7.

Pixel-wise decomposition for Bag of Words features over a histogram intersection kernel using the layer-wise relevance propagation for all subsequent layers and rank-mapping for mapping local features.
Each triplet of images shows, from left to right, the original image, the pixel-wise predictions superimposed with prominent edges from the input image, and the original image superimposed with binarized pixel-wise predictions. The decompositions were computed on the whole image. Images by Pixabay users tpsdave (two images), sirocumo and Pixeleye.

Fig 8.

Pixel-wise decomposition for Bag of Words features over a histogram intersection kernel using the layer-wise relevance propagation for all subsequent layers and rank-mapping for mapping local features.
Each triplet of images shows, from left to right, the original image, the pixel-wise predictions superimposed with prominent edges from the input image, and the original image superimposed with binarized pixel-wise predictions. The decompositions were computed on the whole image. Faces below the hairline, and also hands, yield high scores. The woman in the third picture, who turns her face away from the camera, is an example showing that hair alone is not relevant. Images by Pelagio Palagi, Wikimedia users Rorschach, Frankie Fouganthin and Flickr user Le vent dans les dunes.

Fig 9.

Pixel-wise decomposition for Bag of Words features over a histogram intersection kernel using the layer-wise relevance propagation for all subsequent layers and rank-mapping for mapping local features.
Each triplet of images shows, from left to right, the original image, the pixel-wise predictions superimposed with prominent edges from the input image, and the original image superimposed with binarized pixel-wise predictions. The decompositions were computed on the whole image. Notably, the tail of a plane consistently receives negative scores. Blue sky context seems to contribute to the classification, which was already conjectured in the PASCAL VOC workshops [35] and was also observed on other images not shown here; compare the second picture against the other three images, which have a bluer sky. Images from Pixabay users Holgi, nguyentuanhung, rhodes8043 and tpsdave.

Fig 10.

Pixel-wise decomposition for Bag of Words features over a histogram intersection kernel using the layer-wise relevance propagation for all subsequent layers and rank-mapping for mapping local features.
Each triplet of images shows, from left to right, the original image, the pixel-wise predictions superimposed with prominent edges from the input image, and the original image superimposed with binarized pixel-wise predictions. The decompositions were computed on the whole image. Positive responses seem to exist for certain fur texture patterns. See also the false responses on the wood and the plaster in the second example, both of which have a texture and color similar to a cat’s fur. Images by Pixabay users LoggaWiggler and Holcan.

Fig 11.

Taylor-approximated pixel-wise predictions for a multilayer neural network trained and tested on the MNIST data set.
Each group of four horizontally aligned panels shows, from left to right, the input digit, the Taylor root point x₀, the gradient of the prediction function f at x₀ for the digit class indicated by the subscript next to f, and the approximated pixel-wise contributions for x.

Fig 12.

Pixel-wise decompositions for a multilayer neural network trained and tested on MNIST digits, using layer-wise relevance propagation as in Formula (56).
Each group shows the decomposition of the prediction for the classifier of a specific digit indicated in parentheses.

Fig 13.

Each quadruple shows: on the leftmost the input digit; on the middle left the class specific pixel-wise density ratios d_k (Eq (65)) for the digit class k for which the pixel-wise decomposition is computed; on the middle right the pixel-wise decomposition R⁽¹⁾ for that digit and the digit class k; on the rightmost the correlation between d_k and the pixel-wise decomposition R⁽¹⁾.
When considering a digit from class i and a pixel-wise decomposition for a class k ≠ i, the pixel-wise decomposition frequently shows highly positive scores on those pixels of the class-i digit that have a high relative density d_k for the digit class k.

Fig 14.

Example of non-linear activation functions g used in multilayer neural networks.

Fig 15.

Evidence for a handwritten digit being a “4” or a “9”.
Strong positive evidence for “4” is allocated to the top part of the image for keeping it blank. If trying to interpret these digits as “9”, the open top-part of the image is perceived as negative evidence for this class, because a “9” would rather have a top-dash closing the upper loop of the “4”. Explanations are consistent across a variety of neural networks and samples.

Fig 16.

Evidence for a handwritten digit being a “3” or an “8”.
Classifying as “3” is supported by the middle horizontal stroke featured in this digit and the absence of vertical connections on the left of the image. Evidence for an “8” again includes the middle horizontal stroke; however, the absence of connections on the left side of the digit constitutes negative evidence. Explanations are again stable across various models and samples.

Fig 17.

Pixel-wise decompositions for all classes for 16 randomly drawn digits from the MNIST test set.
Results are obtained using the relevance propagation Formula (56) with the rectifying network trained for 1 000 000 iterations from Section MNIST experiments II.

Fig 18.

Flipping of high-scoring non-digit pixels.
Pixels with highest positive scores are flipped first. The pixel-wise decomposition was computed for the true digit class, three (left) and four (right).

Fig 19.

Flipping of digit and non-digit pixels with positive responses.
Pixels with highest positive scores are flipped first. The pixel-wise decomposition was computed for the true digit classes three (left) and four (right).

Fig 20.

Flipping of pixels with pixel-wise decomposition score close to zero.
Pixels whose scores have the absolute value closest to zero are flipped first. Both digit and non-digit pixels may be flipped. The pixel-wise decompositions were computed for the true digit classes three (left) and four (right).

Fig 21.

Flipping of pixels with negative responses, due to a pixel-wise decomposition for prediction targets 8 (for digits 3 on the left) and 9 (for digits 4 on the right).
Pixels with lowest negative scores are flipped first.

Fig 22.

Flipping of pixels for digit and non-digit pixels, compared for each modified digit for the true class against the maximal prediction of all wrong classes.
Left: Pixels with highest positive scores are flipped first. Right: Flipping of neutrally predicted pixels, i.e. pixels with absolute value closest to zero are flipped first (solid lines), and flipping of randomly picked pixels (dashed lines). Results are averaged over digits from all digit classes in contrast to using only digit classes 3 and 4 in the preceding figures.

Fig 23.

Examples of images with an increasing amount of flipped pixels and the corresponding predictions of the classifier.
Here pixels are flipped away from the class label given in parentheses above the heatmap. Pixels were flipped in steps of 1% of all pixels until the predicted class label changed. The plots show the output of the softmax function y and the output score of the preceding linear layer yp. The pixels were sorted before flipping in decreasing order of the pixel-wise score, i.e. highest scoring pixels were flipped first. In this panel the heatmap was computed for the classifier which produced the highest score, i.e. for the predicted class label. The originally predicted label is given on the leftmost image in parentheses, the predicted label after the switch of the prediction is given in the rightmost image.
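
The flipping procedure described here (sort pixels by score, flip them in fixed-size steps, stop when the predicted label changes) can be sketched as follows. This is an illustrative toy, not the experimental code: it assumes binary pixels where flipping means replacing x by 1 − x, and it uses a made-up linear classifier with made-up weights.

```python
def flip_until_label_changes(pixels, scores, predict, step_fraction=0.01):
    """Flip the highest-scoring pixels first, in steps, until the label changes.

    Returns the number of flipped pixels and the label predicted at that point.
    """
    order = sorted(range(len(pixels)), key=lambda d: scores[d], reverse=True)
    step = max(1, round(step_fraction * len(pixels)))
    image = list(pixels)
    original_label = predict(image)
    flipped = 0
    while flipped < len(order):
        for d in order[flipped:flipped + step]:
            image[d] = 1 - image[d]  # assumption: binary pixels, flip means 1 - x
        flipped = min(flipped + step, len(order))
        label = predict(image)
        if label != original_label:
            return flipped, label
    return flipped, original_label


# Hypothetical linear two-class model on a 10-pixel "image".
TOY_WEIGHTS = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0, 0.5, 1.0, -0.5, 0.0]
TOY_BIAS = -0.5


def toy_predict(image):
    """Label 1 if the linear score is positive, otherwise label 0."""
    score = sum(w * x for w, x in zip(TOY_WEIGHTS, image)) + TOY_BIAS
    return 1 if score > 0 else 0


toy_pixels = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
# For a linear model, w_d * x_d is a natural per-pixel score.
toy_scores = [w * x for w, x in zip(TOY_WEIGHTS, toy_pixels)]
n_flipped, new_label = flip_until_label_changes(
    toy_pixels, toy_scores, toy_predict, step_fraction=0.1
)
```

In this toy run the label switches from 1 to 0 after the three highest-scoring pixels are flipped. Sorting in increasing order instead would give the "flipping towards a class" variant.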

Fig 24.

Examples of images with an increasing amount of flipped pixels and the corresponding predictions of the classifier.
Here pixels are flipped towards the class label given in parentheses above the heatmap. Pixels were flipped in steps of 1% of all pixels until the predicted class label changed. The plots show the output of the softmax function y and the output score of the preceding linear layer yp. The pixels were sorted before flipping in increasing order of the pixel-wise score, i.e. lowest scoring pixels were flipped first. In this panel the heatmap was computed for a classifier which did not produce the highest score, i.e. for a random false class label. The originally predicted label is given on the leftmost image in parentheses, the predicted label after the switch of the prediction is given in the rightmost image.

Fig 25.

Pixel-wise decompositions for example images, computed with the neural net pre-trained on ILSVRC data set images and provided by the Caffe open source package [60].
The second column shows decompositions computed by Formula (58) with stabilizer ɛ = 0.01, the third column with stabilizer ɛ = 100; the fourth column was computed by Formula (60) using α = +2, β = −1. The artifacts at the edges of the images are caused by padding the image with locally constant values, which is needed because the neural net requires square sub-parts of images as input. Pictures in order of appearance from Wikimedia Commons by authors Jens Nietschmann, Shenrich91, Sandstein, Jörg Hempel.
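
The effect of the stabilizer ɛ and of the α, β parameters can be illustrated for a single neuron. Formulas (58) and (60) are not reproduced in this listing, so the sketch assumes the commonly cited forms of an ɛ-stabilized proportional rule and an α-β rule with α + β = 1. The function names and the contributions z are made up, and bias terms are omitted.

```python
def epsilon_rule(z, relevance, eps):
    """Messages R_j * z_ij / (z_j + eps * sign(z_j)) for one upper neuron j (assumed form)."""
    z_j = sum(z)
    stabilizer = eps if z_j >= 0 else -eps
    return [relevance * z_ij / (z_j + stabilizer) for z_ij in z]


def alpha_beta_rule(z, relevance, alpha=2.0, beta=-1.0):
    """Treat positive and negative contributions separately (assumed form, alpha + beta = 1)."""
    z_pos = sum(v for v in z if v > 0)
    z_neg = sum(v for v in z if v < 0)
    messages = []
    for z_ij in z:
        if z_ij > 0:
            messages.append(relevance * alpha * z_ij / z_pos)
        elif z_ij < 0:
            messages.append(relevance * beta * z_ij / z_neg)
        else:
            messages.append(0.0)
    return messages


# Made-up weighted inputs z_ij = a_i * w_ij feeding one neuron with relevance 1.
z = [2.0, -1.0, 3.0]
small_eps = epsilon_rule(z, 1.0, eps=0.01)
large_eps = epsilon_rule(z, 1.0, eps=100.0)
alpha_beta = alpha_beta_rule(z, 1.0)
```

Under these assumptions a small ɛ passes on almost all of the relevance, a large ɛ absorbs most of it, and the α-β rule conserves the total while still assigning a negative score to the negatively contributing input.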

Fig 26.

Failure examples for the pixel-wise decomposition.
Left and Right: Failures to recognize toilet paper. The decompositions were computed by Formula (58) with stabilizer ɛ = 0.01. The neural net is the pre-trained one on ILSVRC data from the Caffe package [60]. The computing methods for each column are the same as in Fig 25. Pictures in order of appearance from Wikimedia Commons by authors Robinhood of the Burger World and Taro the Shiba Inu.

Fig 27.

The pixel-wise decomposition is different from an edge or texture detector.
Only a subset of strong edges and textures receive high scores. Panels show the original image on the left and the decomposition on the right. The decompositions were computed twice for the class table lamp and once for the class rooster. The neural net is the pre-trained one on ILSVRC data from the Caffe package [60]. The computing methods for each column are the same as in Fig 25. The last row shows the gradient norms, normalized to lie in [0, 1] and mapped by the same color scheme as the heatmaps. Pictures in order of appearance from Wikimedia Commons by authors Wtshymanski, Serge Ninanne and Immanuel Clio.
