Phylogenetic inference of the emergence of sequence modules and protein-protein interactions in the ADAMTS-TSL family
Fig 2
Phylogenetic inference of module and phenotype appearances.
The steps of the method, illustrated here for a dummy set of sequences containing two paralogs p1 and p2 (from one species) and their ortholog p3 (from another species), are: 1) Inference of the reference gene tree from protein sequences with a standard pipeline (PASTA, RAxML, TreeFix); 2) Identification of conserved sequence modules (i.e., sets of strongly similar segments from at least two protein sequences, aligned in PLMA blocks by Paloma-D); 3) Inference of the module composition of ancestral genes in the reference tree (through Module-Gene-Species reconciliation by SEADOG-MD, using the phylogenetic tree of each module inferred with PhyML and TreeFix); 4) Annotation of proteins with known phenotypic traits of interest (here, protein-protein interactions); 5) Reconstruction of the ancestral scenario of phenotype evolution across the reference gene tree (PastML); 6) Merging of module and phenotype evolutionary information: each ancestral gene of the reference gene tree is then characterized by a module composition and a set of phenotypic traits (here, protein interactants). The final result is the prediction of functional signatures by identifying the co-appearance of module(s) and phenotypic trait(s).
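Step 6 can be illustrated with a minimal Python sketch. It assumes a toy gene tree for p1, p2 and p3 with made-up module sets (M1 to M3) and a made-up interactant (X); the names `PARENT`, `MODULES`, `TRAITS`, `gains` and `co_appearances` are hypothetical and are not part of SEADOG-MD or PastML. It also assumes that co-appearance means a module and a trait gained on the same branch, which the caption does not spell out.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy gene tree, stored as child -> parent (None marks the root).
# "dup" stands for the duplication ancestor of the paralogs p1 and p2.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "dup": "root",
    "p3": "root",
    "p1": "dup",
    "p2": "dup",
}

# Made-up module composition of each gene (output of step 3 in the caption).
MODULES: Dict[str, Set[str]] = {
    "root": {"M1"},
    "dup": {"M1", "M2"},
    "p1": {"M1", "M2"},
    "p2": {"M1", "M2", "M3"},
    "p3": {"M1"},
}

# Made-up interactants of each gene (output of steps 4 and 5 in the caption).
TRAITS: Dict[str, Set[str]] = {
    "root": set(),
    "dup": {"X"},
    "p1": {"X"},
    "p2": {"X"},
    "p3": set(),
}


def gains(state: Dict[str, Set[str]],
          parent: Dict[str, Optional[str]]) -> Dict[str, Set[str]]:
    """Return, for each non-root node, the items present in it but not in its parent."""
    result: Dict[str, Set[str]] = {}
    for node, par in parent.items():
        if par is None:
            continue  # gains are undefined at the root
        result[node] = state[node] - state[par]
    return result


def co_appearances(parent: Dict[str, Optional[str]],
                   modules: Dict[str, Set[str]],
                   traits: Dict[str, Set[str]]) -> Dict[str, Tuple[Set[str], Set[str]]]:
    """Keep the nodes where at least one module and one trait are gained together."""
    module_gains = gains(modules, parent)
    trait_gains = gains(traits, parent)
    return {
        node: (module_gains[node], trait_gains[node])
        for node in module_gains
        if module_gains[node] and trait_gains[node]
    }


if __name__ == "__main__":
    for node, (mods, ints) in co_appearances(PARENT, MODULES, TRAITS).items():
        print(node, sorted(mods), sorted(ints))
```

With these toy values, the only branch on which a module and an interactant are gained together is the one leading to `dup` (module M2 with interactant X); module M3 appears in p2 without any new interactant and is therefore not reported.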