Fig 1.
Datasets of genetic mutations and framework of analysis.
(A) Summary of tumor genetic mutation datasets collected from public platforms. The outer ring indicates the sources of all cohorts included in this study, which are connected to their respective names through T-shaped crosses. The histogram inside illustrates the sample size of each cohort, and the innermost ring indicates whether the cohort mainly consists of adults or children. (B) The table provides detailed information about all 55 cohorts, with columns for cohort names, the number of cohorts for each cancer type, cancer types, regional attributes, and the number of IntOGen genes for each cohort. The lollipop diagram on the right illustrates the numbers in the last column of the table. Different colors distinguish different countries or areas. Un: unknown. -: no records. (C) The four parts of our analysis. Lines connecting different cohorts represent comparisons conducted using EntCDP (common pathways, Com) and ModSDP (specific pathways, Spe). The background colors of cohorts correspond to the platforms shown in (A). Head and neck*: Head and neck squamous cell carcinoma. A: adults; C: children; Y: exposure groups; N: control groups.
Fig 2.
Results of the reliability test and regional differences in signaling pathways.
(A) The hypergeometric test, which assesses the overlap between the cohorts' common genes identified by IntOGen and those identified by EntCDP, yields significant results for six types of cancer with IntOGen genes in each cohort. The size of the circles corresponds to the significance level. Different colors in the bar graph represent different features of genes identified by EntCDP: dark (light) purple indicates common genes obtained from all (part of) selected cohorts by IntOGen; yellow (pink) denotes non-IntOGen genes of any selected cohort that have (have not) been verified as drivers in the literature. (B) Specific pathways of patients with a certain cancer type coming from one country (left) relative to other countries (right, denoted by colored circles). The gray-filled bar for each signaling pathway indicates the proportion of identified genes enriched in that pathway, as shown by the ratio on the right. (C, D) Common signaling pathways among, and pathways specific to, patients with breast adenocarcinoma from the United States versus those from European Union countries and Britain (C), or with bladder cancer from the United States versus China (D). The dashed line indicates no significant results. The bar above lists the names of genes, colored by the gene features defined in (A). *1: COADREAD, colorectal cancer; *2: Transcriptional misregulation in cancer; *3: Apoptotic pathways in synovial fibroblasts.
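As an illustration of the overlap test mentioned in (A), the following is a minimal sketch of a one-sided hypergeometric test for the overlap between two gene sets. It is not the authors' implementation: the function name, the argument names, and the choice of background (universe) size are all assumptions made here for illustration, and the caption does not state how the background was defined.

```python
from math import comb


def hypergeom_overlap_pvalue(universe_size, set_a_size, set_b_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    Hypothetical helper: X is the number of genes shared by two sets when
    set_b_size genes are drawn without replacement from a background of
    universe_size genes, of which set_a_size belong to the first set.
    """
    max_overlap = min(set_a_size, set_b_size)
    # Number of ways to draw set_b_size genes from the background.
    total = comb(universe_size, set_b_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only (not taken from the study): a background of
# 20000 genes, 30 genes in one list, 10 in the other, 4 shared.
p_value = hypergeom_overlap_pvalue(20000, 30, 10, 4)
print(f"p = {p_value:.3e}")
```

A smaller p-value means the observed overlap is less likely to arise by chance under the assumed background, which is what the circle sizes in (A) encode as significance.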
Fig 3.
Pairwise comparison of cancer types similar in location or function.
(A, C, E) The top panels show the common gene sets between two cancers identified by EntCDP (A & B), and the bottom panels display the specific gene sets of cancer A relative to cancer B (A / B) or vice versa (B / A) identified by ModSDP. Only significant sets are shown. Numbers in the leftmost or rightmost columns denote the parameter K, the number of genes in each set. Genes in the highlighted rows are enriched in the signaling pathway at the bottom of each ladder table. (B, D, F) Regulation of genes involved in the signaling pathways shown in (A, C, E), respectively. Genes marked in purple are identified by EntCDP or ModSDP, while genes in green are not.
Fig 4.
Interpretation of signaling pathways in pediatric and adult tumors.
(A) Common and specific pathways of acute myeloid leukemia and glioblastoma in adults and children. (B, D) Specific signals of two hematopoietic tumors (B) and four neural tumors (D). The pathways labeled on directed edges represent significant signals specific to one cancer at the tail compared to another at the arrowhead. (C, E) Corresponding to (B) and (D), genes enriched in identified pathways are stacked from bottom to top and marked in dark blue. The remaining genes, depicted in light blue, are also identified by ModSDP but are not explicitly annotated for enrichment in the corresponding signaling pathway. Both dark blue and light blue genes are displayed under the same parameter K. Cytokine signaling*: Cytokine signaling in immune system; Embryonic*: Embryonic and induced pluripotent stem cells and lineage-specific markers; CFTR activity*: Regulation of CFTR activity.
Fig 5.
Three environmental agents promote tumor progression.
Specific signaling pathways in patients with risk exposures to tobacco (A), alcohol (B) and high BMI (C) are illustrated on the left side, while those in the control groups without harmful habits or abnormal indices are shown on the right. The pie chart in the middle shows the proportions of the two groups. The heatmaps on both sides show the mutation profiles of specific genes in the exposure group (the two maps on the left) and the control group (the two maps on the right) when parameter K is 5. Heatmaps in the same horizontal line share the same gene set. Genes enriched in the identified pathways shown in the middle are filled in dark blue.