Fig 1.

The pan-cancer landscape of centrosome amplification-associated gene expression.

(a) For each sample, the CA20 score was calculated as the sum of the normalized (log2 median-centred) expression levels of the 20 signature genes. (b) CA20 score distribution across tumour samples of all TCGA cancer types. Cohorts are ordered by their median CA20 score. Black points and lines represent the median +/- upper/lower quartiles. (c) Tumour samples have higher CA20 levels in all 15 cancers with both tumour and matched-normal samples available (at least 10 samples per sample type; False Discovery Rate (FDR) < 0.0001, Wilcoxon rank-sum test). CA20 score distributions of tumour and normal samples are represented in red and blue, respectively. ACC: adrenocortical carcinoma; BLCA: bladder urothelial carcinoma; BRCA: breast invasive carcinoma; CESC: cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL: cholangiocarcinoma; COADREAD: colon and rectum adenocarcinoma; DLBC: lymphoid neoplasm diffuse large B-cell lymphoma; ESCA: oesophageal carcinoma; GBM: glioblastoma multiforme; HNSC: head and neck squamous cell carcinoma; KICH: kidney chromophobe; KIRC: kidney renal clear cell carcinoma; KIRP: kidney renal papillary cell carcinoma; LAML: acute myeloid leukemia; LGG: low-grade glioma; LIHC: liver hepatocellular carcinoma; LUAD: lung adenocarcinoma; LUSC: lung squamous cell carcinoma; MESO: mesothelioma; OV: ovarian serous cystadenocarcinoma; PAAD: pancreatic adenocarcinoma; PCPG: pheochromocytoma and paraganglioma; PRAD: prostate adenocarcinoma; SARC: sarcoma; SKCM: skin cutaneous melanoma; STAD: stomach adenocarcinoma; TGCT: testicular germ cell tumours; THCA: thyroid carcinoma; THYM: thymoma; UCEC: uterine corpus endometrial carcinoma; UCS: uterine carcinosarcoma; UVM: uveal melanoma.
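
The score definition in panel (a) can be illustrated with a minimal sketch. It assumes that median-centring is done per gene across samples and that expression values are strictly positive (no pseudocount); neither detail is stated in the caption. The gene and sample names are placeholders, not the actual signature.

```python
import math
from statistics import median


def ca20_scores(expression, signature_genes):
    """Sum log2 median-centred expression over the signature genes.

    expression: dict mapping gene -> {sample: positive expression value}.
    Assumption: each gene is centred on its own median across samples.
    Returns a dict mapping sample -> score.
    """
    scores = {}
    for gene in signature_genes:
        log_values = {s: math.log2(v) for s, v in expression[gene].items()}
        centre = median(log_values.values())
        for sample, log_value in log_values.items():
            scores[sample] = scores.get(sample, 0.0) + (log_value - centre)
    return scores


# Toy data with two placeholder genes instead of the 20 signature genes.
toy_expression = {
    "GENE_A": {"s1": 1.0, "s2": 2.0, "s3": 4.0},
    "GENE_B": {"s1": 8.0, "s2": 8.0, "s3": 32.0},
}
toy_scores = ca20_scores(toy_expression, ["GENE_A", "GENE_B"])
print(toy_scores)
```

With this centring, a sample whose signature genes all sit at their cohort medians scores zero, and positive scores indicate above-median expression of the signature overall.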


Fig 2.

CA20 is associated with different breast cancer clinical and molecular features.

(a-h) CA20 score distribution per (a,e) sample type, (b,f) histological and (c,g) PAM50 molecular subtype, and (d,h) tumour stage for (a-d) TCGA breast cancer and (e-h) METABRIC samples. Black points and lines represent the median +/- upper/lower quartiles. Number of samples used in each violin is shown within brackets. *** p-value < 0.001, **** p-value < 0.0001 and n.s. non-significant (Wilcoxon rank-sum test). (i-j) Luminal B and basal-like human breast carcinomas display higher levels of centrosome amplification (CA). (i) Illustration of the procedure to quantify CA in patient samples. (j) Percentage of cells displaying CA in breast tumours from the different PAM50 molecular subtypes (29 luminal A, 3 luminal B, 3 HER2 and 13 basal-like). Between 5 and 107 cells were analysed for each patient. * p-value < 0.05, *** p-value < 0.001 and n.s. non-significant (Wilcoxon rank-sum test).


Fig 3.

CA20 is associated with genomic instability features in cancer.

(a) CA20 score is associated with genome doubling. Box plots of CA20 score per whole genome doubling status. **** p-value < 0.0001 (Wilcoxon rank-sum test). (b, d-f) CA20 is associated with different genomic instability features. Smooth scatter plots showing correlation between CA20 score and (b) aneuploidy score (measured as the total number of altered chromosome arms), (d) number of mutations per Mb, (e) number of CNAs and (f) clones per tumour across TCGA tumour samples (Spearman’s correlation coefficient, r = 0.44, 0.48, 0.47 and 0.43, respectively, p-value < 2.2e-16 for all). (c) Chromosome arm alterations associated with CA20 score. The volcano plot shows the results of linear regression analyses comparing CA20 score between samples with deletion or amplification of each chromosome arm. Arms whose deletions or amplifications are associated with higher CA20 (FDR < 0.05) are represented in blue and red, respectively. Chromosome arms with FDR < 1e-5 are highlighted and box plots of CA20 score per chromosome arm alteration are shown for the 5q, 16p and 7p arms. **** p-value < 0.0001 (linear regression). (g) Hierarchical clustering of TCGA cancer types based on the independent association between the different genomic instability features and CA20 score. Unsupervised hierarchical clustering used Euclidean distances calculated from the multiple linear regression p-values for the association of CA20 with aneuploidy score, number of mutations per Mb, number of CNAs and clones per tumour, per TCGA cohort and with all cohorts together (PanCancer). The heatmap colour scale represents the -log10 of the linear regression p-values. Main clusters are highlighted with different shades of grey.
BLCA: bladder urothelial carcinoma; CESC: cervical squamous cell carcinoma and endocervical adenocarcinoma; GBM: glioblastoma multiforme; HNSC: head and neck squamous cell carcinoma; KIRC: kidney renal clear cell carcinoma; LGG: low-grade glioma; LUAD: lung adenocarcinoma; LUSC: lung squamous cell carcinoma; PRAD: prostate adenocarcinoma; SKCM: skin cutaneous melanoma; STAD: stomach adenocarcinoma; THCA: thyroid carcinoma.
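
The Spearman coefficients reported in panels (b) and (d-f) are Pearson correlations computed on ranks. The sketch below shows that calculation under the usual convention of giving tied values their average rank; the function names are illustrative and the caption does not say which software was used.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotonic rescaling of either variable, which suits scores and counts measured on very different scales.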


Fig 4.

CA20 is associated with cancer’s mutational spectrum.

(a) Somatic mutations pan-cancer-wide associated with the CA20 score. The volcano plot shows the results of linear regression analyses comparing the CA20 score between mutated and wild-type samples for 14,589 genes (at least 20 mutated samples). Genes whose mutations are associated with higher and lower CA20 (FDR < 0.05) are represented in red and blue, respectively. The top 10 genes are highlighted. (b) TP53 mutations are associated with CA20 in different cancer types. Linear regression coefficients, representing CA20 score differences between TP53 mutated and wild-type tumour samples, across TCGA cohorts with at least 20 mutated samples. Significant associations (FDR < 0.05) are coloured as in (a). (c) Mutational signatures pan-cancer-wide associated with CA20, independently of other types of genomic instability. Left: Significance of linear regression analyses (-log10 p-value) between CA20 and the contribution of each mutational signature, including, as independent variables, the four genomic instability features: aneuploidy, mutation burden, CNA and number of clones per tumour. Positive and negative significant associations (FDR < 0.05) are coloured in red and blue, respectively. Right: Smooth scatter plots showing correlations between CA20 and the contribution of mutational signature 1 (linked with ageing) in 3 TCGA cohorts. Linear regression p-values are shown. (d) Causal effect of CA20-associated mutations on CA20 levels. Scatter plot of the linear regression coefficient from (a) versus the Connectivity Map (CMap) knock-down score, ranging from 100 (CA20 up-regulation) to -100 (CA20 down-regulation), for each gene in common. Genes are coloured as in (a) and those with both a significant linear regression association and an absolute knock-down score higher than 80 are highlighted.
(e and f) Gene Set Enrichment Analysis (GSEA) of genes ranked by their CMap’s knock-down score using (e) a manually curated list of centriole duplication factors and (f) MSigDB’s Hallmark Gene Sets (unfolded protein response and mitotic spindle were significantly associated, FDR < 0.05). GSEA p-values are shown. BLCA: bladder urothelial carcinoma; BRCA: breast invasive carcinoma; CESC: cervical squamous cell carcinoma and endocervical adenocarcinoma; ESCA: oesophageal carcinoma; GBM: glioblastoma multiforme; HNSC: head and neck squamous cell carcinoma; KICH: kidney chromophobe; LGG: low-grade glioma; LIHC: liver hepatocellular carcinoma; LUAD: lung adenocarcinoma; LUSC: lung squamous cell carcinoma; OV: ovarian serous cystadenocarcinoma; PAAD: pancreatic adenocarcinoma; PRAD: prostate adenocarcinoma; SARC: sarcoma; SKCM: skin cutaneous melanoma; STAD: stomach adenocarcinoma; UCS: uterine carcinosarcoma.
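
Testing thousands of genes, as in panel (a), requires a multiple-testing correction before applying the FDR < 0.05 threshold. The caption does not name the procedure, so the sketch below assumes the Benjamini-Hochberg adjustment, which is a common choice; it is an illustration, not a statement of the authors' method.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downwards enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values; genes would be called significant where adjusted < 0.05.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_fdr = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_fdr]
```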


Fig 5.

High CA20 is associated with poor patient prognosis, hypoxia and lower stromal infiltration in cancer.

(a) Kaplan-Meier plots for patient stratification based on CA20 score (patients divided by CA20 median: lower CA20 in blue and higher CA20 in red) in 8 different cancer types. Numbers at risk every 2.5 years (tables) and 5-year survival rates (points and dashed lines) are shown. P-values for log-rank tests for differences in survival are shown. (b) CA20 upregulation is associated with hypoxia. Smooth scatter plot showing correlation between the hypoxia and the CA20 scores across TCGA tumour samples (Spearman’s correlation coefficient, r = 0.61, p-value < 2.2e-16). (c) CA20 upregulation is associated with hypoxia in different cancer types. Linear regression coefficients, representing the CA20 score dependence on hypoxia score, independently of genomic instability, across the TCGA cohorts with information for all covariates. Significant associations (FDR < 0.05) are coloured. (d) CA20 is associated with lower stromal cell infiltration. Smooth scatter plot showing correlation between the CA20 and the stromal scores across TCGA tumour samples (Spearman’s correlation coefficient, r = -0.52, p-value < 2.2e-16). (e) CA20 is associated with lower stromal cell infiltration in head and neck squamous cell carcinoma and lung adenocarcinoma. Linear regression coefficients, representing the CA20 score dependence on stromal score, independently of genomic instability, across the TCGA cohorts with information for all covariates. Significant associations (FDR < 0.05) are coloured. 
ACC: adrenocortical carcinoma; BLCA: bladder urothelial carcinoma; BRCA: breast invasive carcinoma; CESC: cervical squamous cell carcinoma and endocervical adenocarcinoma; GBM: glioblastoma multiforme; HNSC: head and neck squamous cell carcinoma; KICH: kidney chromophobe; KIRC: kidney renal clear cell carcinoma; LGG: low-grade glioma; LUAD: lung adenocarcinoma; LUSC: lung squamous cell carcinoma; MESO: mesothelioma; PAAD: pancreatic adenocarcinoma; PRAD: prostate adenocarcinoma; SKCM: skin cutaneous melanoma; STAD: stomach adenocarcinoma; THCA: thyroid carcinoma; UVM: uveal melanoma.
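
The stratification in panel (a) can be sketched as a median split followed by a Kaplan-Meier estimate for each group. This is a minimal illustration: placing patients exactly at the median in the lower group is an assumption, the helper names are hypothetical, and the log-rank test is not implemented here.

```python
from statistics import median


def split_by_median(scores):
    """Split patient ids into (low, high) groups around the median score.

    Assumption: patients exactly at the median go to the low group.
    """
    cut = median(scores.values())
    low = [p for p, s in scores.items() if s <= cut]
    high = [p for p, s in scores.items() if s > cut]
    return low, high


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival probability).

    times: follow-up times (e.g. years); events: True if death was observed,
    False if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process everyone whose follow-up ends at time t together.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Survival probability at time t, e.g. t = 5 for a 5-year rate."""
    prob = 1.0
    for time, surv in curve:
        if time <= t:
            prob = surv
    return prob
```

Running `kaplan_meier` separately on the low and high groups and reading both curves at five years gives the kind of 5-year survival rates marked on the plots.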


Fig 6.

Identification of compounds that selectively kill cancer cells with high CA20.

(a) Compounds with selective activity on cell lines with high or low CA20 score. The volcano plot shows the results of Spearman’s correlation analyses between CA20 scores and compound Area Under the dose-response Curve (AUC) across Cancer Therapeutics Response Portal (CTRP) human cancer cell lines. Note that lower AUC means higher drug activity. The compounds whose activity was associated with high and low CA20 (FDR < 0.05) are represented in blue and red, respectively. The top 6 compounds are highlighted. (b) Top 6 compounds targeting cells with higher CA20 score. Smooth scatter plots showing correlation between CA20 score and compound AUC across CTRP cell lines for the top 6 compounds from (a). Spearman’s correlation coefficients, r, and respective p-values are shown. (c) Compounds that down-regulate the CA20 gene set. Heatmap of CMap’s drug score, ranging from 100 (maximum CA20 up-regulation) to -100 (maximum CA20 down-regulation) per cell line. Drug average score (last column) is the mean of drug scores across cell lines. The 20 compounds with the lowest drug average score are shown and ranked accordingly. Tissue of origin of human cancer cell lines: PC3: prostate; VCAP: prostate; A375: melanoma; A549: lung; HA1E: kidney; HCC515: lung; HT29: colon; MCF7: breast; HEPG2: liver. (d) Compounds selectively targeting cells with higher CA20 also down-regulate these genes. Scatter plot showing correlation between CTRP’s Spearman’s correlation coefficient from (a) and CMap’s drug average score from (c) for the 164 compounds tested in both datasets (Spearman’s correlation coefficient, r = 0.26, p-value = 8.3e-4). Points are coloured as in (a) and the predicted protein targets of compounds with both significant Spearman’s correlations and drug average score lower than -90 are shown.
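
The ranking in panel (c) follows from averaging each compound's score across cell lines and keeping the lowest averages. The sketch below assumes the mean is taken over whichever cell lines have a score for that compound; the compound names and values are made up for illustration.

```python
from statistics import mean


def rank_by_average_score(drug_scores, top_n=20):
    """Return the top_n (compound, average score) pairs, lowest average first.

    drug_scores: dict mapping compound -> {cell line: score in [-100, 100]}.
    Assumption: the average uses only the cell lines with a score available.
    """
    averages = {
        compound: mean(per_line.values())
        for compound, per_line in drug_scores.items()
    }
    return sorted(averages.items(), key=lambda item: item[1])[:top_n]


# Hypothetical compounds; cell line names are taken from the caption.
toy_drug_scores = {
    "compound_x": {"MCF7": -90, "A549": -80},
    "compound_y": {"MCF7": 20, "A549": 40},
    "compound_z": {"MCF7": -100, "A549": -96},
}
toy_ranking = rank_by_average_score(toy_drug_scores, top_n=2)
print(toy_ranking)
```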
