Efficient discovery of frequently co-occurring mutations in a sequence database with matrix factorization

doi:10.1371/journal.pcbi.1012391

Fig 5

Bar chart showing each CMP’s prevalence in each PANGO lineage, based on the lineage labels in the data.

For ease of comparison, only the top PANGO labels (those with counts above 100,000 per CMP) are shown here. The X-axis totals therefore may not add up to the entire database count. The Y-axis labels (C1-C30) are sorted by each CMP’s first and last occurring PANGO label (the “birth” and “death” of the CMP) and by its count.
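
The filtering and ordering described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code. The function name `prepare_bar_chart`, the input layout (a dict keyed by CMP and PANGO label), and the `label_order` list that fixes the order in which PANGO labels occur are all hypothetical. The sketch also assumes that "birth" and "death" are computed over all labels of a CMP rather than only the displayed ones, and that ties are broken by larger total count first; the caption does not specify either point.

```python
from collections import defaultdict


def prepare_bar_chart(counts, label_order, min_count=100_000):
    """Order CMPs and select the bars to display.

    counts      -- dict mapping (cmp_id, pango_label) -> sequence count (assumed layout)
    label_order -- PANGO labels in their assumed order of occurrence
    min_count   -- display threshold taken from the caption (above 100,000)
    """
    rank = {label: i for i, label in enumerate(label_order)}

    # Group the flat counts by CMP.
    per_cmp = defaultdict(dict)
    for (cmp_id, label), n in counts.items():
        per_cmp[cmp_id][label] = n

    def sort_key(cmp_id):
        ranks = [rank[label] for label in per_cmp[cmp_id]]
        total = sum(per_cmp[cmp_id].values())
        # "birth" = first occurring label, "death" = last occurring label;
        # larger totals come first among ties (an assumption).
        return (min(ranks), max(ranks), -total)

    ordered = sorted(per_cmp, key=sort_key)

    # Keep only the bars above the display threshold.
    shown = {
        cmp_id: {lab: n for lab, n in per_cmp[cmp_id].items() if n > min_count}
        for cmp_id in ordered
    }
    return ordered, shown


# Toy data, invented purely for illustration.
example_counts = {
    ("C1", "B.1.1.7"): 250_000,
    ("C1", "AY.4"): 50,
    ("C2", "B.1"): 120_000,
    ("C3", "B.1"): 300_000,
    ("C3", "B.1.1.7"): 1_000,
}
example_order = ["B.1", "B.1.1.7", "AY.4"]
ordered, shown = prepare_bar_chart(example_counts, example_order)
print(ordered)
print(shown)
```

Because small bars are dropped, the displayed counts for a CMP can sum to less than its total, which is why the X-axis need not match the full database count.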

doi: https://doi.org/10.1371/journal.pcbi.1012391.g005