Machine learning and multi-omics data reveal driver gene-based molecular subtypes in hepatocellular carcinoma for precision treatment

doi:10.1371/journal.pcbi.1012113

Fig 1.

Workflow for obtaining stratification genes.

(A) Workflow for protein domains with significant mutation burden. (B) Process of defining DDGs.

Fig 2.

Identification of driver gene-related subtypes.

(A) Flowchart depicting the process of the subtype classification algorithm; (B) Silhouette coefficients for different values of k; (C) Silhouette plot specifically for k = 2; (D) Heatmap of the consensus matrix defining the two subtypes; (E) Five-year survival curves for the two subtypes, with CLASS A represented in red and CLASS B represented in blue.
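
Panels B and C rely on the silhouette coefficient to compare candidate values of k. As a minimal sketch of how such a comparison can be computed, the following pure-Python example scores two hand-assigned labelings of toy 2-D points. The points, the labelings, and the Euclidean distance are illustrative assumptions, not the study's data or its clustering algorithm.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data (assumption): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Hypothetical labelings standing in for clustering results at k = 2 and k = 3.
candidate_labelings = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 2, 2, 2],
}

scores = {k: silhouette_score(points, labs)
          for k, labs in candidate_labelings.items()}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On this toy input the k = 2 labeling scores higher, because splitting a compact group at k = 3 lowers the silhouette of its members.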

Table 1.

Comparison of different clustering methods.

Fig 3.

Different biological properties of the two subtypes.

(A-C) Differential analysis of KEGG, Hallmark and oncogenic signature pathways between these two subtypes. (D) The immune cell abundance in these two subtypes using TIMER, with statistical significance assessed by the Mann-Whitney U test. (E) Box plots depicting the expression levels of six immune checkpoint genes in these two subtypes, with red boxes indicating significantly differentially expressed genes (p.adjust < 0.05, |log2FC| > 1). (F) Box plot showing the mRNA stemness scores (mRNAsi) of the two subtypes.
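
The significance rule in panel E (p.adjust < 0.05 and |log2FC| > 1) can be written as a simple filter. The sketch below is illustrative only: the gene names and values are placeholders rather than results from the study, and the function name `is_significant` is hypothetical.

```python
def is_significant(p_adjust, log2fc, p_cutoff=0.05, lfc_cutoff=1.0):
    """Return True when both thresholds from the caption are met."""
    return p_adjust < p_cutoff and abs(log2fc) > lfc_cutoff


# Placeholder records (assumption): not real genes or real statistics.
toy_results = [
    {"gene": "GENE_A", "p_adjust": 0.001, "log2fc": 1.8},   # passes both
    {"gene": "GENE_B", "p_adjust": 0.20, "log2fc": 2.5},    # fails p.adjust
    {"gene": "GENE_C", "p_adjust": 0.01, "log2fc": -0.4},   # fails |log2FC|
    {"gene": "GENE_D", "p_adjust": 0.03, "log2fc": -1.2},   # passes both
]

flagged = [r["gene"] for r in toy_results
           if is_significant(r["p_adjust"], r["log2fc"])]
print(flagged)
```

Taking the absolute value of log2FC keeps both up- and down-regulated genes, which is why the placeholder with a negative fold change is still flagged.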

Fig 4.

Distinct multi-omics features of the two subtypes.

(A) Box plot showing the number of nonsynonymous mutations in each subtype. (B) Box plot displaying the CIN ratio in each subtype. (C) Heatmap illustrating CNVs of the 22 autosomes in both subtypes, with red and blue indicating copy number amplifications and deletions, respectively. (D) Oncoplot presenting the top 20 significantly mutated genes in the subtypes, based on p-values. (E) Identification of subtype-specific methylation probes in the two groups using the R package ChAMP, with a threshold of p.adjust < 0.05 and |Δβ| > 0.2. (F) Expression profiles of DNA methyltransferase family members DNMT1, DNMT3A, and DNMT3B in these two subtypes.

Fig 5.

Machine learning classifier for HCC subtypes.

(A) Workflow of the subtype SVM classifier. The process of selecting subtype-specific genes followed the differential analysis procedure described in the Methods, with genes retained based on criteria of p.adjust < 0.001 and |log2FC| > 1. The mean interval was defined as [μ−σ, μ+σ], where μ represents the average expression of the gene in the patient cohort, and σ represents the standard deviation of gene expression in the patient cohort. (B) Predictive results of the SVM_10 model on the validation set. (C) ROC curve of the SVM_10 model on the training set. (D) ROC curve of the SVM_10 model on the validation set. (E) Expression distribution of the 10 classification genes across these two subtypes. (F) Heatmap of the expression levels of the 10 classification genes in different subtypes and normal samples.
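
The mean interval [μ−σ, μ+σ] defined in panel A is straightforward to compute for a single gene. The sketch below assumes the population standard deviation (the caption does not say whether the population or sample form was used) and runs on toy expression values. How the interval is applied within the workflow is not stated in the caption, so the example only reports which values fall inside it.

```python
from statistics import mean, pstdev


def mean_interval(values):
    """Return (mu - sigma, mu + sigma) for one gene's expression values."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; use stdev() for the sample form
    return mu - sigma, mu + sigma


# Toy expression values for one gene across a cohort (assumption, not study data).
toy_expression = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

low, high = mean_interval(toy_expression)
inside = [x for x in toy_expression if low <= x <= high]
print(low, high, inside)
```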

Table 2.

Classification results of subtype classifiers in the validation set.

Fig 6.

Construction and evaluation of the clinical prognostic model.

(A) Univariate Cox analysis of clinical features and subtypes in TCGA cohort. (B) Multivariate Cox analysis of clinical features and subtypes in TCGA cohort. (C) Nomogram model predicting HCC patients’ prognosis. (D) Calibration plot showing 1-, 3-, and 5-year survival probabilities for the nomogram model. (E) ROC curve evaluating predictive performance of the nomogram model in TCGA cohort.

Fig 7.

Single-cell analysis of HCC subtypes.

(A) SVM_10 assigned subtypes to 10 primary HCC samples from GSE149614. (B) UMAP plot showing 21 cell clusters. (C) Cell type annotation in different subtypes using marker genes.

Fig 8.

Drug sensitivity analysis of HCC subtypes.

(A) The SVM_10 model assigned subtypes to 81 HCC cell lines from the LIMORE dataset. (B) KEGG enrichment analysis of the two subtypes. (C) Box plots illustrating drugs with differential activity area in the two subtypes.
