Impact of phylogeny on the inference of functional sectors from protein sequence data
Fig 3
Impact of phylogeny and selection on ICOD, covariance and SCA spectra.
Eigenvalues of the ICOD, covariance and SCA matrices, sorted from largest to smallest, are shown for sequences generated with phylogeny only (light shades) and with both phylogeny and selection (dark shades). Different levels of phylogeny are obtained by varying μ (shown as different colors). ‘No phylogeny’ corresponds to sequences generated independently at equilibrium, which therefore contain only correlations due to selection. This data set comprises M = 2048 sequences of length L = 200 generated exactly as in Fig 2, i.e. using the Hamiltonian in Eq 1 with the same κ, τ* = 90 and vector of mutational effects as in Fig 2. Data sets without selection are generated by evolving random sequences of length L = 200 on a perfect binary tree with 11 generations and μ random mutations on each branch, providing M = 2¹¹ = 2048 sequences. Finally, data sets with phylogeny and selection are generated along a perfect binary tree with μ accepted mutations per branch (with the acceptance criterion in Eq 2, using the same κ and τ* as in the no-phylogeny case and as in Fig 2), again with 11 generations. The three values of μ shown here were chosen to illustrate different levels of phylogenetic impact. Insets show a zoom over large eigenvalues. A logarithmic y-scale is used in the center panel for readability.
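The phylogeny-only data generation described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes binary sequences with states ±1, a single random ancestral sequence at the root, and that each mutation flips one uniformly chosen site (so a site may be hit more than once on a branch). The function names `evolve_on_binary_tree` and `covariance_spectrum` are hypothetical. Only the covariance spectrum is computed here; ICOD and SCA are not defined in this caption and are left out.

```python
import numpy as np


def evolve_on_binary_tree(L=200, generations=11, mu=5, rng=None):
    """Evolve sequences without selection on a perfect binary tree.

    Assumptions (not stated in the caption): states are +/-1, the root is
    one random sequence, and each mutation flips a uniformly chosen site.
    Returns an array of shape (2**generations, L) holding the leaves.
    """
    rng = np.random.default_rng() if rng is None else rng
    seqs = rng.choice([-1, 1], size=(1, L))  # random ancestral sequence
    for _ in range(generations):
        # Every sequence branches into two children (rows 2i and 2i+1).
        children = np.repeat(seqs, 2, axis=0)
        # Apply mu random mutations independently on each branch.
        for child in children:  # each row is a view, so edits are in place
            for site in rng.integers(0, L, size=mu):
                child[site] *= -1
        seqs = children
    return seqs


def covariance_spectrum(seqs):
    """Eigenvalues of the site-site covariance matrix, largest first."""
    C = np.cov(seqs, rowvar=False)  # columns (sites) are the variables
    return np.sort(np.linalg.eigvalsh(C))[::-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    leaves = evolve_on_binary_tree(L=200, generations=11, mu=5, rng=rng)
    print(leaves.shape)                    # M = 2**11 sequences of length 200
    print(covariance_spectrum(leaves)[:5])  # the largest eigenvalues
```

Rerunning the sketch with several values of `mu` (the value 5 above is arbitrary) gives a rough feel for how the number of mutations per branch changes the largest eigenvalues. Adding selection would require the Hamiltonian in Eq 1 and the acceptance criterion in Eq 2, which are not reproduced in this caption.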