Finding driver mutations in cancer: Elucidating the role of background mutational processes

doi:10.1371/journal.pcbi.1006981

Fig 1.

Mutability of all theoretically possible codon substitutions (“not observed”) and all substitutions that were observed in: (A) COSMIC v85 pan-cancer cohort; (B) MSK-IMPACT cohort. Asterisks mark differences that are significant at p < 0.01 by the Mann-Whitney-Wilcoxon test. Mutability values are shown on a negative log₁₀ scale because pan-cancer codon mutability spans several orders of magnitude.
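
The negative log₁₀ conversion mentioned in the caption can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `neg_log10_mutability` and the example values are hypothetical.

```python
import math

def neg_log10_mutability(mutability):
    """Convert a mutability (a probability-like value in (0, 1]) to -log10 scale.

    On this scale a LARGER number means a LOWER mutability, and values that
    differ by orders of magnitude become evenly spaced.
    """
    if mutability <= 0:
        raise ValueError("mutability must be positive to take a logarithm")
    return -math.log10(mutability)

# Hypothetical mutabilities spanning several orders of magnitude.
examples = [1e-4, 1e-6, 1e-8]
transformed = [neg_log10_mutability(m) for m in examples]
```

Because the transform is strictly monotonic, rank-based tests such as Mann-Whitney-Wilcoxon give the same result on raw and transformed values; the conversion only affects how the distributions are displayed.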

Fig 2.

Relationship between cancer-specific nucleotide mutability and observed reoccurrence frequency of all mutations from two cohorts.

Counts are binned and refer to how many times a particular mutation was observed in the given cancer type. ‘0’, ‘1’, ‘2’ and ‘3+’ refer to mutations that were not observed (including all possible point mutations), observed once, observed twice, or observed in three or more cancer samples, respectively. Blue boxes show mutations with the observed frequency calculated in the COSMIC v85 cohort and green boxes refer to the MSK-IMPACT cohort. (A) Breast cancer (n_COSMIC = 1,667, n_MSK = 783 samples), (B) Lung adenocarcinoma (n_COSMIC = 301, n_MSK = 1,203), (C) Colon adenocarcinoma (n_COSMIC = 369, n_MSK = 688) and (D) Skin malignant melanoma (n_COSMIC = 376, n_MSK = 182).
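
The ‘0’, ‘1’, ‘2’, ‘3+’ binning described above can be sketched in a few lines. This is an illustrative sketch only; the function names and the toy data are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def frequency_bin(count):
    """Map an observed recurrence count to the caption's bin labels."""
    if count < 0:
        raise ValueError("count cannot be negative")
    return "3+" if count >= 3 else str(count)

def group_mutability_by_bin(mutabilities, counts):
    """Collect mutability values under the bin of their observed count."""
    if len(mutabilities) != len(counts):
        raise ValueError("inputs must have the same length")
    groups = defaultdict(list)
    for mutability, count in zip(mutabilities, counts):
        groups[frequency_bin(count)].append(mutability)
    return dict(groups)

# Toy data: one mutability value and one observed count per mutation.
toy_groups = group_mutability_by_bin(
    [1e-7, 2e-6, 5e-6, 3e-6, 8e-6],
    [0, 1, 3, 2, 7],
)
```

Each resulting list corresponds to one box in a boxplot of mutability per frequency bin.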

Fig 3.

Mutability distributions by mutation type and mutation frequency.

(A) Cumulative distribution of codon mutability of silent (green), nonsense (red) and missense (blue) mutations. (C) Cumulative distribution of nucleotide mutability for silent, nonsense and missense mutations. Insets show the probability density distributions of mutability by mutation type. Significance was determined by Dunn’s test; differences with p < 0.01 are marked with a double asterisk. (B) and (D) show codon and nucleotide mutability, respectively, binned by frequency in the COSMIC v85 pan-cancer cohort. ‘0’, ‘1’, ‘2’ and ‘3+’ refer to mutations that were not observed (including all possible point mutations), observed once, observed twice, or observed in three or more cancer samples, respectively. See S1 Table for the number of mutations in each category.

Fig 4.

Relationship between codon mutability and frequency of mutations.

Histograms show the Spearman rank correlation coefficients between reoccurrence frequency and mutability across cancer genes with at least 10 observed mutations of each type: (A) missense (blue), (B) nonsense (red) and (C) silent (green). Filled bars in the left column denote genes with significant correlation at p < 0.01. Bar graphs show the Spearman correlation coefficient for genes with significant correlation at p < 0.01. Genes in bold font are tumor suppressors (TSG), underlined genes are oncogenes, and genes in plain font were categorized either as both TSG and oncogene or as fusion genes. (D-F) Scatterplots with regression lines and confidence intervals show the linear relationship between mutability and reoccurrence frequency of each type of mutation for several representative genes. Adjusted R² values are shown to convey goodness of fit. Mutation reoccurrence frequencies were taken from the pan-cancer COSMIC v85 cohort.
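
For readers unfamiliar with the statistic, a Spearman rank correlation between mutability and reoccurrence frequency for one gene can be sketched as below. This is a plain-Python illustration with average ranks for ties, not the authors' pipeline; the function names and the per-gene data are hypothetical, and no p-value is computed.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical gene: codon mutability and recurrence count per mutation.
mutability = [1e-7, 4e-7, 9e-7, 2e-6, 5e-6]
recurrence = [1, 1, 2, 4, 3]
rho = spearman_rho(mutability, recurrence)
```

Because recurrence counts are small integers, ties are common, which is why the sketch uses average ranks instead of the shortcut formula that assumes distinct values.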

Fig 5.

Relationship between codon mutability and reoccurrence frequency of mutations for different mutation types and gene functions.

Genes were grouped into oncogenes and tumor suppressors (TSG) according to their role in cancer. Mutations were binned by their reoccurrence frequency in the COSMIC v85 cohort. Boxplots show codon mutability calculated with the pan-cancer model. See S1 Table for counts.

Fig 6.

Codon mutability of missense mutations grouped by their experimental effects.

(A) Mutations from the combined dataset were categorized as neutral and non-neutral. Significant differences with p < 0.01 are marked with a double asterisk. Mutability was calculated with the pan-cancer background model. (B) Mutations binned by their reoccurrence frequency in both MSK-IMPACT (green) and COSMIC v85 (blue) cohorts. In both cohorts, the reoccurrence frequency of neutral mutations depends on mutability, whereas the reoccurrence frequency of non-neutral mutations does not scale with background mutability.

Table 1.

Comparison of different methods to distinguish neutral from non-neutral mutations.

Mutations are from the combined experimental dataset and were observed in the corresponding cancer cohorts. See S6 and S7 Tables for results on rare and all mutations. The maximum Matthews correlation coefficient (MCC) is reported for each predictor, and predictors are ranked by this value. The B-Score for each cohort is calculated with the respective cohort size: 12,013 for the COSMIC v85 cohort and 9,228 for MSK-IMPACT. For CHASM, the background model yielding the best performance was chosen.
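
As an illustration of the ranking criterion, the maximum Matthews correlation coefficient of a score-based predictor can be found by scanning candidate thresholds. The sketch below is generic and not taken from the paper; it assumes that a higher score means "more likely non-neutral", and the function names and toy data are hypothetical.

```python
from math import sqrt

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal is empty (the usual convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def max_mcc(scores, labels):
    """Scan every observed score as a threshold; return (best MCC, threshold).

    A mutation is predicted non-neutral when score >= threshold.
    labels[i] is True for experimentally non-neutral mutations.
    """
    best_value, best_threshold = None, None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and not l)
        fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l)
        tn = sum(1 for s, l in zip(scores, labels) if s < threshold and not l)
        value = mcc(tp, tn, fp, fn)
        if best_value is None or value > best_value:
            best_value, best_threshold = value, threshold
    return best_value, best_threshold

# Toy predictor output: two neutral and two non-neutral mutations.
best, cutoff = max_mcc([0.1, 0.2, 0.8, 0.9], [False, False, True, True])
```

For a predictor where lower scores indicate non-neutral mutations, the scores would need to be negated before calling `max_mcc`.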

Fig 7.

Ranking of mutations and prediction of driver mutations based on B-Score.

Snapshots from the MutaGene server show the results of the analysis of the EGFR gene with a pan-cancer model. (A) Scatterplot of expected mutability versus observed mutational frequency. (B) Top of the list of mutations ranked by their B-Scores. (C) EGFR nucleotide and translated protein sequences with per-nucleotide site mutability and per-codon mutability, as well as mutabilities of nucleotide and codon substitutions (heatmaps). Mutations observed in tumors from the ICGC repository are shown as circles colored by their prediction status: Driver, Potential driver, and Passenger. The missense mutation p.Arg252Pro is marked with a blue arrow.
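
The caption does not define the B-Score. The sketch below assumes, as an interpretation and not something stated in this text, that it is a binomial upper-tail probability: the chance of seeing a mutation at least k times in n samples if it arose only at its background mutability p. The function names and example values are hypothetical; only the cohort size of 12,013 is taken from Table 1.

```python
import math

def log_binomial_pmf(i, n, p):
    """Log of C(n, i) * p**i * (1 - p)**(n - i), computed via lgamma."""
    log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
    return log_choose + i * math.log(p) + (n - i) * math.log1p(-p)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly from the tail.

    Summing tail terms avoids the cancellation in 1 - P(X < k) that occurs
    when p is tiny, as background mutabilities typically are.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k <= 0:
        return 1.0
    if k > n or p == 0.0:
        return 0.0
    if p == 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        term = math.exp(log_binomial_pmf(i, n, p))
        total += term
        # Past the mean, terms shrink; stop once they no longer matter.
        if i > n * p and term < total * 1e-17:
            break
    return min(total, 1.0)

# ASSUMPTION: hypothetical mutation seen 3 times among 12,013 samples.
low_mutability_score = binomial_upper_tail(3, 12013, 1e-7)
high_mutability_score = binomial_upper_tail(3, 12013, 1e-5)
```

Under this assumed definition, a smaller value means the observed recurrence is harder to explain by background mutability alone, so mutations would be ranked from smallest to largest score.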
