Benchmarking interpretability of deep learning for predictive genomics: Recall, precision, and variability of feature attribution
Fig 2
Illustration of the attribution precision metric.
The full set of features consists of real SNPs (left, green) and decoy SNPs (right, red). Within the set of real SNPs, a small subset is truly associated with the phenotype (grey circle; SNPs with associations). A DNN interpretation method identifies a set of top-K SNPs (yellow oval; DL-Salient SNPs), which contains three subsets: A (truly associated real SNPs), B (real SNPs lacking true association), and C (decoy SNPs). Because sets B and C are assumed to be comparable in size, the number of decoy SNPs among the top-K most highly attributed SNPs (|C|) is used as an estimate of the number of real SNPs lacking true association (|B|). This enables the calculation of attribution precision as 1 − |C| / |A + B|, where |A + B| is the number of real SNPs in the top-K set.
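The calculation described in this caption can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `attribution_precision`, its arguments, and the SNP identifiers in the usage example are all hypothetical, and the sketch assumes that the top-K SNPs and the decoy SNPs are available as collections of identifiers.

```python
def attribution_precision(top_k_ids, decoy_ids):
    """Estimate attribution precision as 1 - |C| / |A + B|.

    top_k_ids: identifiers of the top-K most highly attributed SNPs
               (hypothetical input format).
    decoy_ids: identifiers of all decoy SNPs (hypothetical input format).
    """
    top_k = set(top_k_ids)
    decoys = set(decoy_ids)

    # |C|: decoy SNPs that appear in the top-K set.
    n_decoy = len(top_k & decoys)

    # |A + B|: real SNPs in the top-K set, with or without true association.
    n_real = len(top_k) - n_decoy

    if n_real == 0:
        raise ValueError("no real SNPs in the top-K set; precision is undefined")

    # |C| stands in for |B|, which cannot be observed directly.
    # Taken literally, the formula gives a negative value when decoys
    # outnumber real SNPs in the top-K set; this sketch does not clip it.
    return 1.0 - n_decoy / n_real


# Usage with made-up identifiers: four real SNPs and one decoy in the top 5.
example = attribution_precision(
    ["rs1", "rs2", "rs3", "rs4", "decoy1"],
    ["decoy1", "decoy2"],
)
print(example)
```

In this made-up example, |C| = 1 and |A + B| = 4, so the estimated precision is 1 − 1/4 = 0.75.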