DPCfam: Unsupervised protein family classification by Density Peak Clustering of large sequence datasets
Fig 7
Histograms showing average overlap between ECOD families and their representative MCs.
Colors reflect the contribution of each MC category to each bin (equivalent, reduced, extended and shifted; see S1(B) Fig and Methods for definitions). A: Overlap between individual ECOD families and their representative MCs (cf. Fig 6). B: Overlap between individual ECOD families or architectures and their representative MCs. Given an ECOD family and its representative MC (same pairs as in A), we search for a better overlap of the representative MC with any multi-family architecture featuring the original ECOD family and up to two additional families. The reported average overlap value is thus the larger of the overlap with the original family and the best overlap with any such ECOD architecture. Note that the category labels (equivalent/reduced/extended/shifted) are still assigned according to the overlap of the representative MC with the original ECOD family, so as to show to what extent the overlap in each MC category increases with respect to A.
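The search described for panel B can be illustrated with a minimal sketch for a single protein. It rests on several assumptions that are not taken from the paper: regions are closed residue intervals, overlap is measured as intersection over union (the actual definition is given in Methods), and an architecture is approximated by the span from the first to the last of its member families. The names `overlap` and `best_overlap` are hypothetical.

```python
from itertools import combinations


def overlap(a, b):
    """Intersection over union of two closed residue intervals.

    Assumed measure for illustration only; see Methods for the real one.
    """
    inter = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if inter <= 0:
        return 0.0
    # For intersecting intervals the union is the full span they cover.
    union = max(a[1], b[1]) - min(a[0], b[0]) + 1
    return inter / union


def best_overlap(mc, families, target, max_extra=2):
    """Best overlap of an MC region with the target family or with any
    architecture made of the target plus up to `max_extra` other families.

    families: dict mapping family name -> (start, end) on one protein.
    """
    best = overlap(mc, families[target])
    others = [name for name in families if name != target]
    for k in range(1, max_extra + 1):
        for extra in combinations(others, k):
            members = [families[target]] + [families[n] for n in extra]
            # Assumption: the architecture covers first start to last end.
            span = (min(s for s, _ in members), max(e for _, e in members))
            best = max(best, overlap(mc, span))
    return best


# Toy protein with four consecutive families and an MC covering the first two.
families = {"F1": (1, 100), "F2": (101, 200), "F3": (201, 300), "F4": (301, 400)}
mc = (1, 200)
single = overlap(mc, families["F1"])
with_architectures = best_overlap(mc, families, "F1")
```

In this toy case the MC would be labelled "extended" with respect to F1 alone, while it matches the two-family architecture F1+F2 exactly, which is the kind of increase panel B is meant to expose.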