Fig 1.
Cancer type-specific analysis of matrisome genes.
A) Schematic shows the steps involved in shortlisting genes from the individual TCGA cancer datasets based on copy number (GISTIC score), expression and survival (GEPIA2 database). The numbers of genes that met the criteria at each stage and were eventually shortlisted are indicated in the box for genes identified. Their occurrence in two or more cancers was used for the final selection (n = 1). B) Based on their copy number, the numbers of deep-deleted (BLUE bar) or amplified (RED bar) matrisome genes in individual cancers (n = 28) are arranged in descending order of total genes affected (BLUE+RED). C) Based on their mRNA expression, the top 5% of matrisome genes in individual cancers (n = 28) that are downregulated (BLUE bar) or upregulated (RED bar) are arranged in descending order of total genes affected (BLUE+RED). D) Table lists the individual cancers with one or more amplified and upregulated (RED) or deleted and downregulated (BLUE) genes that affect survival. Genes marked with an asterisk (*) are also shortlisted in the pan-cancer analysis (Fig 2F). Genes marked in bold are affected in more than one cancer type. E) Graph represents the mutation (GREEN), copy number amplification (RED) and deletion (BLUE) analysis of CTHRC1 in 30 individual cancers. Arrows point to the individual cancers in which CTHRC1 is also selected as detailed above (Fig 1D).
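The shortlisting logic in panel A can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene and cancer names are placeholders, the toy values are invented for illustration, and the cutoffs (GISTIC calls of +2 for amplification and -2 for deep deletion, survival p < 0.05) are assumptions made here.

```python
from collections import Counter

# Hypothetical records: cancer -> gene -> (GISTIC call, expression change, survival p-value).
# All names and values are made-up placeholders, not data from the study.
toy_data = {
    "CANCER_A": {"GENE1": (2, "up", 0.01), "GENE2": (-2, "down", 0.20), "GENE3": (2, "up", 0.03)},
    "CANCER_B": {"GENE1": (2, "up", 0.04), "GENE2": (-2, "down", 0.02), "GENE3": (0, "up", 0.01)},
    "CANCER_C": {"GENE1": (2, "none", 0.01), "GENE2": (-2, "down", 0.03)},
}


def shortlist_in_cancer(records, p_cutoff=0.05):
    """Keep genes whose copy number and expression agree and that affect survival."""
    selected = set()
    for gene, (gistic, expression, p_value) in records.items():
        # Assumed cutoffs: +2 = amplified, -2 = deep deleted.
        concordant = (gistic == 2 and expression == "up") or (
            gistic == -2 and expression == "down"
        )
        if concordant and p_value < p_cutoff:
            selected.add(gene)
    return selected


def recurrent_genes(data, min_cancers=2):
    """Return genes shortlisted in at least `min_cancers` cancers, sorted by name."""
    counts = Counter()
    for records in data.values():
        counts.update(shortlist_in_cancer(records))
    return sorted(gene for gene, n in counts.items() if n >= min_cancers)


recurrent = recurrent_genes(toy_data)
```

With these placeholder values, `GENE1` and `GENE2` each pass all three filters in two cancers, whereas `GENE3` passes in only one and is dropped at the final step.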
Fig 2.
Pan-cancer analysis of matrisome genes.
A) Schematic shows the steps involved in shortlisting genes from the pan-cancer TCGA dataset based on copy number (GISTIC score), expression and survival (GEPIA2 database). The number of genes shortlisted at each stage of the selection process is indicated in each box. B) Deleted (n = 52) and amplified (n = 52) genes were classified based on their mRNA expression as downregulated (BLUE), upregulated (RED) or no change (GREEN). The nested bar graph represents the percentage of each class for deleted (top graph) and amplified (bottom graph) genes. C) Bar graph shows the percentage of cancers in which the top 5% of amplified matrisome genes are also upregulated (n = 23). Genes are arranged in descending order (RED represents upregulated genes). D) Bar graph shows the percentage of cancers in which the top 5% of deleted matrisome genes are also downregulated (n = 17). Genes are arranged in descending order (BLUE represents downregulated genes). E) Bar graph shows the percentage of cancers in which survival is affected for the top 5% of matrisome genes shortlisted in C (RED bar) and D (BLUE bar). Genes are arranged in descending order. F) Tables list genes in ascending order of their score, calculated from their position in the expression (C, D) and survival (E) graphs (as detailed in Methods). A lower score indicates a higher position in these graphs. Upregulated genes are listed in RED and downregulated genes in BLUE.
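The position-based score in panel F can be sketched as follows. The exact formula is given in the Methods and is not reproduced in this legend, so the sketch assumes the simplest reading: the score is the sum of a gene's positions in the expression and survival graphs. Gene names and percentages are placeholders.

```python
def positions(values):
    """Map each gene to its 1-based position when sorted in descending order of value."""
    # Ties are broken alphabetically so the ordering is deterministic.
    ordered = sorted(values, key=lambda gene: (-values[gene], gene))
    return {gene: index for index, gene in enumerate(ordered, start=1)}


def combined_score(expression_pct, survival_pct):
    """Sum the two positions per gene (assumed scoring rule); lower is better."""
    exp_pos = positions(expression_pct)
    surv_pos = positions(survival_pct)
    shared = exp_pos.keys() & surv_pos.keys()
    scores = {gene: exp_pos[gene] + surv_pos[gene] for gene in shared}
    # Ascending order of score, as in the panel F tables.
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Placeholder percentages of cancers, not values from the study.
expression_pct = {"GENE1": 60.0, "GENE2": 45.0, "GENE3": 30.0}
survival_pct = {"GENE1": 20.0, "GENE2": 35.0, "GENE3": 10.0}

ranking = combined_score(expression_pct, survival_pct)
```

Under this assumed rule, a gene that sits near the top of both graphs receives a low score and is listed first.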
Fig 3.
Pan-cancer analysis of CTHRC1 expression and its effect on survival.
A) Graphs represent CTHRC1 expression data from the GEPIA2 portal in 30 different tumour types (T–RED bar) relative to normal (N–GREY bar). Cancers showing significant upregulation of CTHRC1 are listed first and labelled in RED. Those showing no significant change in expression are listed later and labelled in BLACK. Expression data are represented as mean ± standard deviation (S.D.) on a log scale using a box plot. A p-value < 0.05 was considered statistically significant. B) Table lists the results of univariate analysis of the effect of CTHRC1 expression on survival across 30 individual cancers. It shows the significance values (p-value) for survival in patients with “high” vs “low” CTHRC1 expression and their hazard ratio (HR). Cancers with significance p ≤ 0.05 are listed in PINK and those with p > 0.05 in BLACK, in descending order of their respective hazard ratio. C) Table shows CTHRC1 expression and survival data in 30 individual cancers. Upregulated (PURPLE) and comparable (GREEN) expression are marked accordingly. A significant effect on survival is marked in PINK and the lack thereof in ORANGE. D-E) Tables show the multivariate survival analysis for CTHRC1 expression in the context of (D) race and (E) gender in selected cancers for which data are available. They show the significance values (p-value) for survival in patients with “high” vs “low” CTHRC1 expression and their hazard ratio (HR) for comparison. Cancers with significance p < 0.05 are listed in descending order of significance.
Fig 4.
Differential gene expression analysis of CTHRC1.
A) The table lists, for the 9 selected cancers (BLCA, BRCA, HNSC, KIRC, LIHC, OV, READ, SARC and STAD–as detailed in Methods), the number of differentially expressed genes and the top 5% of genes upregulated or downregulated with a 2-fold change (as detailed in Methods). B) Venn diagram shows the overlap (if any) of the top 5% upregulated genes in the 9 cancers listed above. C) Table lists the 19 genes that overlap between 3 or more cancer types. D) Protein-protein interaction network constructed for CTHRC1 and the 19 differentially expressed genes using the STRING database. BLUE lines mark predicted interactions from gene co-occurrence data, GREEN lines mark predicted interactions based on gene neighbourhood evidence, PURPLE lines mark experimentally determined known interactions, YELLOW lines mark interactions based on text mining and LIGHT BLUE lines mark interactions based on database evidence. E-G) Functional enrichment for significant (p < 0.05) (E) biological processes, (F) molecular functions and (G) cellular components in the STRING network analysis, listed in descending order of significance. H) The table lists the pathways identified by KEGG analysis for the STRING network in descending order of significance (FDR). I) Network of 13 hub genes identified using the CytoHubba plugin in Cytoscape software. Colours of the hub genes are based on their rank, which is also listed in a table (high to low).
Fig 5.
Validation of CTHRC1 and its hub genes in cancers.
A) Graphs represent percentage survival in 30 cancers with “high” (RED plot) vs “low” (BLUE plot) expression of CTHRC1 or each of its 13 hub genes (POSTN, COMP, COL11A1, MMP13, COL10A1, OMD, OGN, SFRP4, SFRP2, THBS2, FAP, FNDC1 and ADAMTS16) using the GEPIA2 database. The significance of the difference in survival (p value) is indicated above each graph. Genes with significance (p ≤ 0.05) are listed in RED and those lacking significance in BLACK. p values reported as 0 indicate very high significance. B) Violin plots show the expression of CTHRC1 and each of its 13 hub genes across pathological stages in 30 cancers, analysed using the GEPIA2 database. Differences across the stages of cancer for each gene of interest were calculated using the ANOVA test and significance was reported. p values are as indicated above each graph. Genes with significance (p ≤ 0.05) are listed in RED. C) Scatter plots show the Spearman correlation analysis for CTHRC1 and its 13 hub genes in 30 cancers using GEPIA2. p values are as indicated above each graph. Genes with significance (p ≤ 0.05 or p = 0) are listed in RED. D) Bar graph shows the log2 odds ratio from cBioPortal for statistically significant co-occurrence between CTHRC1 and the hub genes of interest (8 genes) in 30 cancer types. E) Venn diagram shows the overlap of genes that significantly affect survival and tumour staging and are related to CTHRC1 in the correlation and co-occurrence analyses in 30 cancer types. The table lists the 5 overlapping genes detected in this analysis.
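The overlap shown in panel E amounts to intersecting the gene sets that pass each of the four analyses. A minimal Python sketch is given below; the set memberships and gene names are placeholders chosen for illustration and do not reflect the results of the study.

```python
from functools import reduce

# Placeholder gene sets, one per analysis in panels A-D (not data from the study).
criteria = {
    "survival": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "staging": {"GENE_A", "GENE_B", "GENE_D", "GENE_E"},
    "correlation": {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"},
    "co_occurrence": {"GENE_A", "GENE_D", "GENE_F"},
}


def genes_meeting_all(criteria_sets):
    """Return the genes present in every set, i.e. the central region of the Venn diagram."""
    return sorted(reduce(set.intersection, criteria_sets.values()))


overlap = genes_meeting_all(criteria)
```

Only genes present in all four sets survive the intersection, which corresponds to the region where all four circles of the Venn diagram overlap.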
Fig 6.
Protein levels of CTHRC1 and network genes in breast cancer.
Graphs represent protein levels of CTHRC1 and the shortlisted network genes (POSTN, MMP13, SFRP4 and FNDC1) in normal (BLUE) versus tumour (RED) tissue, using data from the UALCAN portal. The box plot shows the median ± standard deviation. p values are as indicated and were calculated using Student's t-test. Genes with significance (p < 0.05) are listed in RED and those lacking significance in BLACK.
Fig 7.
Functional network analysis of CTHRC1 and its 5 hub genes.
Network analysis of CTHRC1 and its 5 hub genes (POSTN, MMP13, FNDC1, SFRP4 and ADAMTS16) identifies the COL3A1 and ROR2 genes in both the co-expression and physical interaction categories. A) Image shows the co-expression network based on query genes. B-C) Images show the co-expression network based on (B) biological processes and (C) cellular component. D-E) Images show the physical interaction network based on (D) biological processes and (E) cellular component.
Fig 8.
Functional network analysis of CTHRC1, POSTN and MMP13.
Network analysis of CTHRC1, POSTN and MMP13 identifies COL3A1 in both the co-expression and physical interaction categories. A) Image shows the co-expression network based on query genes. B-C) Images show the co-expression network based on (B) biological processes and (C) cellular component. D-E) Images show the physical interaction network based on (D) biological processes and (E) cellular component.