
Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression

Fig 7

Multiscale Information-Based Clustering Algorithm.

A) Pan-cancer mutation data is merged across all 23 tumor types for a single gene (PTEN). B) Gaussian kernel density estimates smooth this data at 28 different bandwidths, or scales (a limited selection is shown for clarity). C) Each kernel density estimate is used to seed a multivariate mixture model consisting of normal distributions plus a single uniform distribution that represents background noise. Initial guesses for the locations of the normal distributions are taken from the local maxima of the kernel density estimates. Clusters from the mixture models (blue) are merged using a greedy algorithm, resulting in a final set of multiscale clusters (red). Green clusters are duplicates of the red clusters, shown to clarify the process. Grey bars are excluded because they contain too few mutations. D) A mutation spectrum for PTEN. E) The two annotated protein domains in PTEN from PFAM.
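The seeding step in panels B and C can be illustrated with a minimal sketch, which assumes mutations are given as one-dimensional amino-acid positions. It covers only the kernel density estimates and their local maxima, not the mixture-model fit or the greedy merge. The function names, the toy positions, the sequence length of 400, and the two bandwidths are hypothetical choices for illustration and are not taken from the article.

```python
import numpy as np

def gaussian_kde_1d(positions, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    positions = np.asarray(positions, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance from every grid point to every mutation.
    z = (grid[:, None] - positions[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel.mean(axis=1)

def local_maxima(density, grid):
    """Return grid points whose density is strictly above both neighbours."""
    interior = (density[1:-1] > density[:-2]) & (density[1:-1] > density[2:])
    return grid[1:-1][interior]

def seeds_by_scale(positions, length, bandwidths):
    """Map each bandwidth to the candidate cluster centres found at that scale."""
    grid = np.arange(1, length + 1, dtype=float)
    return {
        bw: local_maxima(gaussian_kde_1d(positions, grid, bw), grid)
        for bw in bandwidths
    }

# Toy data (assumed): two mutation hotspots at positions 100 and 300.
toy_positions = np.array([100] * 20 + [300] * 20)
seeds = seeds_by_scale(toy_positions, length=400, bandwidths=(5.0, 200.0))

for bw, centres in seeds.items():
    print(f"bandwidth {bw}: seed locations {centres.tolist()}")
```

At the narrow bandwidth the two hotspots are resolved as separate seeds, while at the wide bandwidth they blur into a single seed between them. This is why smoothing at several scales yields different candidate clusters that then need to be reconciled.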

doi: https://doi.org/10.1371/journal.pcbi.1005347.g007