Figure 1.
Phylogenomic Analysis of Protein Function Using Subfamily Annotation
In the example shown above, a phylogenetic tree has been constructed for a set of G protein–coupled receptors. The molecular function of some of the members of the family has been determined experimentally and is used to annotate individual subfamilies, similar to [1]. Sequences without known function can be assigned a predicted molecular function using the tree topology to identify orthologs. When no experimental evidence is available for a subtree's molecular function (e.g., the Unknown Subtype subtree at top), the annotation would be left at a general level (e.g., “GPCR of unknown specificity, related to opioid, galanin, and somatostatin receptors”). By contrast, if the Unknown Subtype subtree were nested within a subtree whose members were consistently characterized, such as opioid receptors, a “subtree neighbors” approach could be used to assign the annotation “Putative opioid receptor” to that group [14]. The use of subfamilies as the basis of phylogenomic inference is only one approach; as noted in the text, the general methodology does not rely on subfamily groupings and would ideally use the entire tree topology.
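The subtree-based annotation described in this legend can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of [1] or [14]: the `Node` class, the `predict` function, the `GENERAL` label, and the toy sequence names (`seqA` to `seqD`) are hypothetical, and the rule used here (a subtree transfers a function only when all of its experimentally characterized members agree) is a simplification that ignores branch support, branch lengths, and the distinction between orthologs and paralogs.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Hypothetical label for sequences that cannot be annotated specifically.
GENERAL = "Function unknown (no experimental evidence in subtree)"


@dataclass
class Node:
    """A node of a rooted tree; leaves carry a sequence name."""
    name: str = ""
    function: Optional[str] = None  # experimentally determined function, if any
    children: list[Node] = field(default_factory=list)


def leaves(node: Node) -> list[Node]:
    """Return all leaves below (or equal to) the given node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def predict(node: Node) -> dict[str, str]:
    """Map each uncharacterized leaf to a predicted annotation.

    Assumed rule: a leaf is labelled "Putative X" when it sits in a subtree
    whose characterized members all share function X; a subtree with no
    characterized members is left at the general level.
    """
    members = leaves(node)
    known = {m.function for m in members if m.function}
    unannotated = [m for m in members if not m.function]

    if len(known) == 1:
        # Consistently characterized subtree: transfer its annotation.
        label = f"Putative {next(iter(known))}"
        return {m.name: label for m in unannotated}
    if not known:
        # No experimental evidence anywhere in this subtree.
        return {m.name: GENERAL for m in unannotated}

    # Conflicting annotations: descend and decide within each child subtree.
    result: dict[str, str] = {}
    for child in node.children:
        result.update(predict(child))
    return result


# Toy tree: two characterized subfamilies and one subtree of unknown subtype.
tree = Node(children=[
    Node(children=[Node("OPRM1", "opioid receptor"), Node("seqA")]),
    Node(children=[Node("SSTR2", "somatostatin receptor"), Node("seqB")]),
    Node(children=[Node("seqC"), Node("seqD")]),
])
predictions = predict(tree)

# The same unknown subtree nested inside a consistently characterized clade.
nested = Node(children=[
    Node("OPRM1", "opioid receptor"),
    Node(children=[Node("seqC"), Node("seqD")]),
])
nested_predictions = predict(nested)
```

In the toy tree, `seqA` inherits the opioid annotation from its subfamily, while `seqC` and `seqD` stay at the general level because their subtree has no characterized member. In the nested tree, the same two sequences fall inside a consistently characterized clade and are therefore labelled "Putative opioid receptor", which mirrors the "subtree neighbors" case described above.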
Table 1.
Resources for Phylogenomic Analysis
Figure 2.
Structural and Functional Differences in Distantly Related Protein Superfamilies
The three proteins shown above are all members of the Structural Classification of Proteins (SCOP) scorpion toxin–related superfamily. All retain the same basic fold but have significantly divergent functions: members of this superfamily form part of the innate immune arsenal in plants and insects, but part of the offense in scorpions. Evolution has conserved the basic structure, yet many residues within the sequences are not structurally superposable. Such positions, often in the loop regions, can be significant in determining function.