Figure 1.
Flow scheme for the identification of cancer amplified genes with putative cancer driver activity.
TCGA datasets were mined for gene amplifications (GISTIC2 analysis, cBio portal), and 461 amplified genes were identified. The list was narrowed to 73 cancer-related genes that were potentially “druggable” based on external druggability databases. From these 73 genes, 40 putative cancer driver genes were identified based on copy number versus mRNA expression analysis of TCGA data.
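The three-stage narrowing described in this legend (amplified genes, then druggable cancer-related genes, then genes with correlated copy number and expression) can be sketched as a chain of filters. This is a minimal illustration, not the authors' pipeline: the function name, the placeholder gene symbols, the toy correlation values, and the `min_corr` cutoff are all assumptions, since the legend does not state a numeric threshold.

```python
def shortlist_drivers(amplified, druggable, mean_corr, min_corr):
    """Apply a three-stage filter in the spirit of the flow scheme.

    amplified : iterable of gene symbols amplified in at least one dataset
    druggable : set of gene symbols flagged by external druggability databases
    mean_corr : dict mapping gene -> mean copy number vs. mRNA correlation
    min_corr  : hypothetical cutoff (the legend does not give one)
    """
    # Stage 1: unique amplified genes (461 in the legend).
    stage1 = sorted(set(amplified))
    # Stage 2: keep genes flagged as potentially druggable (73 in the legend).
    stage2 = [g for g in stage1 if g in druggable]
    # Stage 3: keep genes whose expression tracks copy number (40 in the legend).
    stage3 = [g for g in stage2 if mean_corr.get(g, float("-inf")) >= min_corr]
    return stage1, stage2, stage3


# Toy inputs with placeholder gene names; none of these values come from TCGA.
toy_amplified = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_A"]
toy_druggable = {"GENE_A", "GENE_B", "GENE_C"}
toy_mean_corr = {"GENE_A": 0.8, "GENE_B": 0.6, "GENE_C": 0.1}

all_amp, druggable_amp, putative_drivers = shortlist_drivers(
    toy_amplified, toy_druggable, toy_mean_corr, min_corr=0.5
)
```

Genes with no correlation value are dropped at the last stage rather than kept by default, which is one possible design choice among several.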
Figure 2.
Identification of 73 genes amplified in TCGA datasets.
From the initial list of 461 genes amplified in one or more TCGA datasets, 73 amplified genes were identified with potentially “druggable” properties as well as established/putative roles in oncogenesis. Genes/amplicons are arranged by chromosomal location, with their genomic location marked as shown (Mb = Megabase). Colored boxes indicate cancer types with TCGA designations, as follows: BLCA - Bladder Urothelial Carcinoma, BRCA - Breast invasive carcinoma, CRC - Colorectal Cancer (COAD and READ studies combined), GBM - Glioblastoma multiforme, HNSC - Head and Neck squamous cell carcinoma, KIRC - Kidney renal clear cell carcinoma, LGG - Brain Lower Grade Glioma, LUAD - Lung adenocarcinoma, LUSC - Lung squamous cell carcinoma, OV - Ovarian serous cystadenocarcinoma, PRAD - Prostate adenocarcinoma, SKCM - Skin Cutaneous Melanoma, STAD - Stomach adenocarcinoma, UCEC - Uterine Corpus Endometrioid Carcinoma.
Figure 3.
Gene copy number and mRNA expression correlation analysis to identify putative driver genes amplified on chromosomes 1–11.
Pearson correlation coefficients were calculated by analyzing gene copy number and mRNA expression from individual patient-derived samples in TCGA datasets. Shown are the correlation coefficients for each TCGA cancer subtype and the mean correlation across all cancer types (red denotes high correlation, blue denotes low correlation). Abbreviations of TCGA datasets are listed in Figure 2.
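As an illustration of the calculation described in this legend, the sketch below computes a Pearson correlation coefficient between copy number and mRNA expression for one gene within each dataset, then averages the coefficients across datasets. It assumes paired per-sample values are already available as plain lists; the function names and the toy numbers are hypothetical and are not TCGA data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_by_dataset(per_dataset):
    """Return ({dataset: r}, mean r) for one gene.

    per_dataset maps a dataset label to (copy_numbers, expression_values),
    with one entry per patient-derived sample.
    """
    r_values = {name: pearson_r(cn, expr) for name, (cn, expr) in per_dataset.items()}
    mean_r = sum(r_values.values()) / len(r_values)
    return r_values, mean_r


# Toy values for a single hypothetical gene in two datasets (not real data).
toy_gene = {
    "DATASET_1": ([2.0, 3.0, 4.0, 6.0], [1.0, 1.5, 2.0, 3.0]),
    "DATASET_2": ([2.0, 2.0, 5.0, 8.0], [0.9, 1.1, 2.2, 3.9]),
}
per_dataset_r, mean_r = correlation_by_dataset(toy_gene)
```

The mean here is an unweighted average over datasets; weighting by sample count would be an equally reasonable choice that the legend does not settle.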
Figure 4.
Gene copy number and mRNA expression correlation analysis to identify putative driver genes amplified on chromosomes 12–20.
Pearson correlation coefficients were calculated by analyzing gene copy number and mRNA expression from individual patient-derived samples in TCGA datasets. Shown are the correlation coefficients for each TCGA cancer subtype and the mean correlation across all cancer types (red denotes high correlation, blue denotes low correlation). Abbreviations of TCGA datasets are listed in Figure 2.
Table 1.
Identification of cancer amplified genes with high copy number versus expression correlation.
Figure 5.
Cancer amplified genes in the MAP kinase pathway.
(A) KRAS shRNA activity in a panel of cancer cell lines (Project Achilles). shRNA score denotes the log2-based decrease in KRAS shRNA compared to pooled shRNA in cancer cell lines after several rounds of proliferation post-shRNA infection [11]. A negative shRNA score suggests decreased cancer cell proliferation/survival after shRNA transfection. Yellow bars indicate cell lines with KRAS copy number >4 and black bars indicate cell lines with KRAS copy number <4. (B) Copy number (x-axis) and mRNA expression (y-axis) for KRAS in a panel of ovarian cancers. The correlation coefficient for copy number and mRNA expression is listed in the top right (r value). (C) Frequency of amplification (red bar), mutation (green bar), and deletion (blue bar) for KRAS in various cancers. The percentages shown reflect the overall rate of gene amplification, mutation and/or deletion in each cancer type. Vertically aligned bars reflect samples from the same patient. (D) KRAS copy number (x-axis) and KRAS relative protein level (y-axis) as measured by western blot in a panel of lung cancer cell lines grown in vitro. (E) Gene amplifications associated with sensitivity to KRAS shRNA in cancer cell lines (Project Achilles). The y-axis shows the log10 likelihood ratio (LOD) for each gene amplification being associated with shRNA score, obtained by comparing each gene amplification model to the “null model” without any gene amplification. (F) KRAS copy number (x-axis) and KRAS shRNA score (y-axis) for individual cancer cell lines, color-coded by tumor type (data obtained from Project Achilles). The trendline shows the mean value in each copy number bin.
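Two quantities in panels E and F lend themselves to a short worked sketch: the per-bin mean used for the trendline, and a log10 likelihood ratio comparing an amplification model against a null model. The legend does not specify the statistical model or the bin width, so the version below makes assumptions that should be read as such: bins have a fixed width chosen by the caller, and the LOD compares a Gaussian model with separate means for amplified and non-amplified lines (shared variance) against a Gaussian with a single mean. All names and toy values are hypothetical.

```python
from collections import defaultdict
from math import log10


def binned_mean_trend(copy_numbers, scores, bin_width=1.0):
    """Mean shRNA score per copy-number bin as (bin_start, mean) pairs."""
    if len(copy_numbers) != len(scores):
        raise ValueError("copy_numbers and scores must have equal length")
    bins = defaultdict(list)
    for cn, score in zip(copy_numbers, scores):
        bins[int(cn // bin_width)].append(score)
    return [(k * bin_width, sum(v) / len(v)) for k, v in sorted(bins.items())]


def _rss(values):
    """Residual sum of squares around the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def lod_score(scores, amplified):
    """log10 likelihood ratio of a two-mean Gaussian model vs. a one-mean null.

    Assumed model form: with maximum-likelihood variance estimates, the
    ratio reduces to (n / 2) * log10(RSS_null / RSS_alternative).
    """
    amp = [s for s, a in zip(scores, amplified) if a]
    non = [s for s, a in zip(scores, amplified) if not a]
    if not amp or not non:
        raise ValueError("need at least one amplified and one non-amplified line")
    rss_null = _rss(scores)
    rss_alt = _rss(amp) + _rss(non)
    if rss_alt == 0:
        raise ValueError("zero residual variance; LOD is unbounded")
    return (len(scores) / 2) * log10(rss_null / rss_alt)


# Toy cell-line values (not Project Achilles data).
toy_copy_number = [2.0, 2.5, 4.2, 4.8]
toy_shrna_score = [0.0, -0.2, -1.0, -2.0]
toy_amplified = [cn > 4 for cn in toy_copy_number]  # copy number >4, as in panel A

trend = binned_mean_trend(toy_copy_number, toy_shrna_score, bin_width=1.0)
lod = lod_score(toy_shrna_score, toy_amplified)
```

Under these assumptions the LOD is never negative, because splitting the lines into two groups cannot increase the residual sum of squares; larger values indicate that amplification status explains more of the variation in shRNA score.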
Figure 6.
GRB7 and DCUN1D1 are novel cancer amplified genes with putative driver activity.
(A) GRB7 shRNA activity in a panel of cancer cell lines (Project Achilles). shRNA score denotes the log2-based decrease in GRB7 shRNA compared to pooled shRNA in cancer cell lines after several rounds of proliferation post-shRNA infection [11]. A negative shRNA score suggests decreased cancer cell proliferation/survival after shRNA transfection. Yellow bars indicate cell lines with GRB7 copy number >4 and black bars indicate cell lines with GRB7 copy number <4. (B) Copy number (x-axis) and mRNA expression (y-axis) for GRB7 in a panel of breast cancers. The correlation coefficient for copy number and mRNA expression is listed in the top right (r value). (C) Frequency of amplification (red bar), mutation (green bar), and deletion (blue bar) for GRB7 and ERBB2 in various cancers. The percentages shown reflect the overall rate of gene amplification, mutation and/or deletion in each cancer type. Vertically aligned bars reflect samples from the same patient. (D) Copy number (x-axis) and mRNA expression (y-axis) for DCUN1D1 in lung squamous cancers. The correlation coefficient for copy number and mRNA expression is listed in the top right (r value). (E) Relative proliferation (y-axis) of the cancer cell lines KYSE, T47D, SW48, and HCT15 6 days after infection with DCUN1D1 lentiviral shRNA particles, as measured by CellTiter-Glo assay.
Figure 7.
Epigenetic regulatory genes as putative cancer amplified driver genes.
(A) Copy number (x-axis) and mRNA expression (y-axis) for NSD3 and SETDB1 in breast cancers and melanomas, respectively. Correlation coefficients for copy number and mRNA expression are listed in the top right (r value). (B) BRD4 and YEATS4 shRNA activity in a panel of cancer cell lines (Project Achilles). shRNA score denotes the log2-based decrease in the representative shRNA compared to pooled shRNA in cancer cell lines after several rounds of proliferation post-shRNA infection [11]. Yellow bars indicate cell lines with BRD4 or YEATS4 copy number >4 and black bars indicate cell lines with BRD4 or YEATS4 copy number <4. (C) Frequency of amplification (red bar), mutation (green bar), and deletion (blue bar) for NSD3, SETDB1, YEATS4, and BRD4 in various cancers. The percentages shown reflect the overall rate of gene amplification, mutation and/or deletion in each cancer type. Vertically aligned bars reflect samples from the same patient. (D) Relative NSD3 protein level (y-axis, normalized to β-actin protein levels) compared with NSD3 copy number (x-axis) in SW48, H1581, SW837, and H1703 cells. (E) Relative proliferation (y-axis) and (F) relative apoptosis levels of the cancer cell lines H1581, H1703, SW48, and SW837 3 days after transfection with NSD3 siRNA, as measured by CellTiter-Glo and Caspase-Glo assays, respectively. (G) Cell cycle profile of H1703 cells 24 or 48 hours after transfection with NSD3 siRNA compared to non-transfected controls. (H) Relative changes in the fraction of cells in apoptosis, G1, or G2 phases (y-axis) in cell lines 48 hours after NSD3 siRNA transfection compared to non-transfected controls.