Sequence Similarity Network Reveals Common Ancestry of Multidomain Proteins

Differences in neighborhood structure of the sequence similarity network reflect differences in evolutionary history.

Network neighborhoods in which nodes represent sequences and edges connect pairs with significant sequence similarity. Edge weights, which reflect the degree of sequence similarity, are not shown. (A) The neighborhoods of the homologous pair PDGFRB and PRKG1B. PDGFRB and PRKG1B share 779 neighbors, mostly kinases (turquoise nodes). These are strong matches due to a shared kinase domain. PDGFRB has 183 unique neighbors, mostly due to weak matches with Ig domains (green nodes). PRKG1B has 142 unique neighbors due to weak matches with the cNMP-binding domain (red nodes). Other matching sequences are shown in yellow. (B) PDGFRB and NCAM2, a domain-only match, share 232 neighbors. PDGFRB has 730 unique neighbors and NCAM2 has 240, mostly due to Fn3 domains (dark blue nodes).
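The shared and unique neighbor counts in the caption can be read as set operations on the two neighborhoods. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `compare_neighborhoods`, the edge-list input, and the toy sequence names are all hypothetical, and excluding the two query sequences from each other's neighborhoods is an assumed convention.

```python
def compare_neighborhoods(edges, a, b):
    """Return (shared, unique_to_a, unique_to_b) neighbor sets for nodes a and b.

    `edges` is an iterable of (u, v) pairs, one per significant similarity.
    Edge weights are ignored, as in the figure.
    """
    neighbors = {}
    for u, v in edges:
        # The network is undirected, so record each edge in both directions.
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Assumption: the query sequences themselves are not counted as neighbors.
    na = neighbors.get(a, set()) - {a, b}
    nb = neighbors.get(b, set()) - {a, b}
    return na & nb, na - nb, nb - na


# Toy network with made-up sequence names (not data from the figure).
toy_edges = [
    ("query1", "kinase_a"), ("query1", "kinase_b"), ("query1", "ig_a"),
    ("query2", "kinase_a"), ("query2", "kinase_b"), ("query2", "cnmp_a"),
    ("query1", "query2"),
]

shared, only_1, only_2 = compare_neighborhoods(toy_edges, "query1", "query2")
print(len(shared), len(only_1), len(only_2))  # prints: 2 1 1
```

In this toy case the two queries share both kinase-like neighbors and each has one unique neighbor, which mirrors on a small scale the pattern described for panel (A).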

