Impact of phylogeny on the inference of functional sectors from protein sequence data

Fig 6

Identifying functionally important sites in natural protein families.

The symmetrized AUC for predicting sites with large mutational effects is computed for 30 protein families with four methods (ICOD, SCA, MI and Conservation), using Deep Mutational Scan (DMS) data as ground truth. For ICOD and MI, the average product correction (APC) [4] is applied to the matrix of interest; it was found to improve the average performance for most families for these methods, but not for SCA. For ICOD, MI and SCA, the components of the eigenvector associated with the largest eigenvalue are used to predict mutational effects. Protein families are ordered by decreasing symmetrized AUC for ICOD. The mapping between protein family number and name is given in S1 Table. Protein families shaded in grey have DMS data with a unimodal shape; the others have a bimodal shape.
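
The two matrix steps named in the caption (APC, then the eigenvector of the largest eigenvalue) can be illustrated with a minimal sketch. This is not the authors' code: the function names `apc` and `leading_eigenvector`, the toy matrix, and the choice to exclude the diagonal from the APC averages are all assumptions made for illustration.

```python
import numpy as np


def apc(matrix):
    """Average product correction of a symmetric site-by-site matrix.

    Assumption: the diagonal is excluded from the row, column and overall
    means, and is set to zero in the result.
    """
    m = np.array(matrix, dtype=float)
    n = m.shape[0]
    np.fill_diagonal(m, 0.0)
    row_mean = m.sum(axis=1) / (n - 1)
    col_mean = m.sum(axis=0) / (n - 1)
    total_mean = m.sum() / (n * (n - 1))
    corrected = m - np.outer(row_mean, col_mean) / total_mean
    np.fill_diagonal(corrected, 0.0)
    return corrected


def leading_eigenvector(sym_matrix):
    """Return the largest eigenvalue of a symmetric matrix and its eigenvector."""
    # eigh returns eigenvalues in ascending order, so the last one is the largest.
    eigvals, eigvecs = np.linalg.eigh(sym_matrix)
    return eigvals[-1], eigvecs[:, -1]


# Toy symmetric matrix standing in for an MI or ICOD matrix (hypothetical data).
rng = np.random.default_rng(0)
raw = rng.random((6, 6))
toy = (raw + raw.T) / 2

corrected = apc(toy)
top_value, top_vector = leading_eigenvector(corrected)

# One component per site; the overall sign of an eigenvector is arbitrary.
print(top_vector)
```

The components of `top_vector` play the role of per-site scores. How these are turned into the symmetrized AUC against DMS data is not specified in this caption, so that step is left out here.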

doi: https://doi.org/10.1371/journal.pcbi.1012091.g006