Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies
Figure 5
Distribution of major types of misannotation found in the NR database.
Classification of misannotated sequences follows the steps of the protocol given in Figure 2: ‘No Superfamily Association’ (NSA); ‘Missing Functionally important Residue(s)’ (MFR); ‘Superfamily Association only’ (SFA); and ‘Below Trusted Cutoff’ (BTC), as described in Methods. The codes were grouped into two sets that specify whether the misannotation is associated with overprediction or with other types of error (e.g., missing a required residue).