Fig 1.
Right Greenland halibut otolith image.
The main structural parts of the otolith are indicated, together with manual year-zone readings annotated as turquoise dots. Both alternatives predict an age of 8 years. Photo provided by Kristin Windsland, Norwegian Institute of Marine Research.
Fig 2.
Age frequency distributions predicted by human experts in the training set.
Fig 3.
Different versions of the otolith image for a 13-year-old fish.
(a) Baseline image, (b) Binary image, (c) Standardized image.
Fig 4.
Examples of baseline otolith images from the training set for the different ages predicted by human readers.
Otoliths belonging to juveniles, adolescents, young adults and adults are surrounded by green, blue, orange and grey rectangles, respectively.
Fig 5.
Illustration of the LRP conceptual flow applied to an otolith image {xp} from a 13-year-old fish (inspired by Montavon et al. [24]).
In the forward propagation phase (a), the output neuron of the network xf has retained the evidence of the actual age class. In the relevance propagation phase (b), this output is first attributed the relevance score Rf before being redistributed backward in the network. The relevance scores of all the pixels can be visualized as a heatmap {Rp} that can have different characteristics depending on the chosen propagation rule. Here, the relevant pixels are highlighted in red and contribute positively to the prediction. The higher the degree of red, the more positive the contribution of the pixel to the prediction.
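The two phases described in this caption can be sketched on a toy fully connected ReLU network. This is only an illustrative sketch: it assumes the LRP-epsilon rule and random weights, whereas the caption notes that the heatmap depends on the chosen propagation rule, and the network used in the study is not reproduced here. The names `lrp_epsilon`, `weights`, `biases` and `heatmap` are hypothetical.

```python
import numpy as np

def lrp_epsilon(x, weights, biases, target, eps=1e-6):
    """Relevance of each input feature for output neuron `target` (LRP-epsilon rule)."""
    # (a) Forward propagation: keep the activations of every layer.
    activations = [x]
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = activations[-1] @ W + b
        is_last = i == len(weights) - 1
        activations.append(z if is_last else np.maximum(z, 0.0))

    # (b) Relevance propagation: only the chosen output neuron receives relevance.
    relevance = np.zeros_like(activations[-1])
    relevance[target] = activations[-1][target]

    # Redistribute the relevance backward, one layer at a time.
    for W, b, a in zip(reversed(weights), reversed(biases), reversed(activations[:-1])):
        z = a @ W + b
        z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer, avoids division by zero
        s = relevance / z
        relevance = a * (W @ s)
    return relevance

# Toy example (assumed sizes): 8 "pixels", 5 hidden units, 3 age classes, zero biases.
rng = np.random.default_rng(0)
x = rng.uniform(0.1, 1.0, size=8)
weights = [rng.normal(size=(8, 5)), rng.normal(size=(5, 3))]
biases = [np.zeros(5), np.zeros(3)]
heatmap = lrp_epsilon(x, weights, biases, target=1)
```

With zero biases, the pixel relevances sum (up to the stabilizer) to the score of the selected output neuron, which is the conservation property that makes the heatmap interpretable as a decomposition of the prediction.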
Fig 6.
Age predictions of model vs human expert for the test set based on classification using baseline data.
The markers have a radius proportional to the probability density of the data. This result can be compared with Fig 5 in Moen et al. [4], where the authors observed an underestimation of ages predicted by the model relative to human readers for the right otoliths.
Table 1.
Comparison of performance achieved on the test set composed of right otoliths in Moen et al. [4] and our classification results obtained on the baseline test set.
Fig 7.
computed from the training (a) and test (b) datasets across the different age groups (juveniles, adolescents, young adults and adults) and for the different versions of the data (baseline, binary and standardized). The different age groups are associated with different colors and the value of
is represented by a circle having a radius proportional to the predicted population size.
Table 2.
Summary of the clustering accuracy scores for the different age groups considering baseline and standardized data (training and test sets combined).
Fig 8.
Average relevance maps computed from the baseline data for different predicted ages belonging to (a) juveniles, (b) adolescents, (c) young adults and (d) adults. The number of samples belonging to a given predicted age is indicated in the upper part of each average image. For each age group, only the four ages with the highest number of predictions have their average relevance map displayed. Note that each heatmap has been normalized by its maximum, and the higher the degree of red, the more positive the contribution of the pixel to the prediction.
Fig 9.
Average relevance maps computed from the standardized data for different predicted ages belonging to (a) juveniles, (b) adolescents, (c) young adults and (d) adults. The number of samples belonging to a given predicted age is indicated in the upper part of each average image. For each age group, only the four ages with the highest number of predictions have their average relevance map displayed, except for the young adults group, where only ages 10, 11 and 13 were predicted. Note that each heatmap has been normalized by its maximum, and the higher the degree of red, the more positive the contribution of the pixel to the prediction.
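The averaging and normalization described for Figs 8 and 9 could be sketched as follows. This is a hedged illustration, not the authors' code: it assumes the per-sample heatmaps are averaged first and the averaged map is then divided by its maximum (the captions do not say whether normalization happens before or after averaging), and it uses synthetic data. The function name `average_relevance_maps` and its parameters are hypothetical.

```python
import numpy as np

def average_relevance_maps(heatmaps, predicted_ages, top_k=4):
    """Return {age: (n_samples, mean heatmap scaled by its maximum)}
    for the top_k most frequently predicted ages."""
    heatmaps = np.asarray(heatmaps, dtype=float)
    predicted_ages = np.asarray(predicted_ages)

    # Count how often each age was predicted and keep the top_k most frequent.
    ages, counts = np.unique(predicted_ages, return_counts=True)
    order = np.argsort(-counts, kind="stable")[:top_k]

    result = {}
    for age, n in zip(ages[order], counts[order]):
        mean_map = heatmaps[predicted_ages == age].mean(axis=0)
        peak = mean_map.max()
        if peak > 0:  # assumption: skip scaling when no pixel is positively relevant
            mean_map = mean_map / peak
        result[int(age)] = (int(n), mean_map)
    return result

# Synthetic example: ten 4x4 heatmaps with made-up predicted ages.
rng = np.random.default_rng(1)
maps = rng.uniform(0.0, 1.0, size=(10, 4, 4))
ages_pred = [3, 3, 3, 4, 4, 5, 5, 5, 5, 6]
summary = average_relevance_maps(maps, ages_pred, top_k=2)
```

Returning the sample count alongside each map mirrors the captions, where the number of samples per predicted age is shown above each average image.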