Protein structural domains are evolutionary units whose relationships can be detected over long evolutionary distances. The evolutionary history of protein domains, including the origin of protein domains, the identification of domain loss, transfer, duplication and combination with other domains to form new proteins, and the formation of the entire protein domain repertoire, are of great interest.
A methodology is presented for providing a parsimonious domain history based on gain, loss, vertical and horizontal transfer derived from the complete genomic domain assignments of 1015 organisms across the tree of life. When mapped to species trees the evolutionary history of domains and domain combinations is revealed, and the general evolutionary trend of domain and combination is analyzed.
We show that this approach provides a powerful tool to study how new proteins and functions emerged and to study such processes as horizontal gene transfer among more distant species.