Constructing benchmark test sets for biological sequence analysis using independent set algorithms

Fig 2

Characteristics of Pfam full families successfully split.

Each marker represents a family in Pfam. The connectivity of a sequence is the fraction of other sequences in the full family that have at least 25% pairwise identity to it. Families successfully split into a training set of size at least 400 and a test set of size at least 20 are marked by a cyan circle, whereas families that were not split are marked by a red diamond. In (B) and (D), a cyan circle indicates at least one successful split among 40 independent runs. The 34 families that Blue did not finish splitting within 6 days are not included in the Blue plots.
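
The connectivity definition in the caption can be illustrated with a minimal sketch. It assumes the pairwise identities are already available as a symmetric matrix of fractions in [0, 1]; the function name `connectivity`, the `threshold` parameter, and the toy matrix are hypothetical and are not taken from the article's software.

```python
def connectivity(identity, threshold=0.25):
    """Return, for each sequence, the fraction of OTHER sequences in the
    family whose pairwise identity to it is at least `threshold`.

    `identity` is assumed to be a symmetric n x n list of lists holding
    fractional identities (0.25 means 25%); the diagonal is ignored.
    """
    n = len(identity)
    if n < 2:
        # With no other sequences the fraction is undefined; report 0.0.
        return [0.0] * n
    result = []
    for i in range(n):
        # Count the other sequences that meet the identity threshold.
        linked = sum(
            1 for j in range(n) if j != i and identity[i][j] >= threshold
        )
        result.append(linked / (n - 1))
    return result


# Toy family of three sequences (illustrative values only).
identity = [
    [1.00, 0.30, 0.10],
    [0.30, 1.00, 0.26],
    [0.10, 0.26, 1.00],
]
print(connectivity(identity))  # [0.5, 1.0, 0.5]
```

In this toy example the second sequence is at least 25% identical to both of the others, so its connectivity is 1.0, while the first and third each reach the threshold with only one of their two neighbours.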

doi: https://doi.org/10.1371/journal.pcbi.1009492.g002