Evaluation and comparison of multi-omics data integration methods for cancer subtyping

doi:10.1371/journal.pcbi.1009224

Fig 1.

Summary of the data type selection of existing integration methods.

(A) Multi-omics data usage of different integration methods. (B) The usage frequency of each omic data type. We use “Methy”, “mRNA”, “miRNA” and “protein” to represent DNA methylation, mRNA expression, miRNA expression, and protein expression, respectively. (C) The usage frequency of different omics combinations. We use “G”, “E”, “T”, “P” and “O” to represent genomics, epigenomics, transcriptomics, proteomics, and others, respectively.

Table 1.

Summary of previous studies.

Fig 2.

Data usage.

(A) Omics data types used in this study. The main color of each heatmap (matrix) represents one type of omics data: black represents CNV, green represents miRNA expression, orange represents DNA methylation, and blue represents mRNA expression. The shade of color in each heatmap reflects the magnitude of the values. (B) Dataset construction. The TCGA group assigned BRCA patients to five subtypes (“LumA”, “LumB”, “Basal”, “Her2”, and “normal”) and COAD patients to four subtypes (“CIN”, “GS”, “MSI”, and “POLE”). In the BRCA and COAD gold standard datasets of Dataset group #3, we use these assignments as the gold standards. (C) Omics data combinations. We use “Methy”, “mRNA”, and “miRNA” to represent DNA methylation, mRNA expression, and miRNA expression, respectively.

Table 2.

Details of packages or codes used in this work.

Table 3.

Testing strategies and the datasets of the accuracy tests.

Fig 3.

Clustering-based performance of Dataset group #1 Nine-cancer Datasets.

We use “iCB”, “LRA”, “moC”, “CIM”, “MNMF”, and “SGAN” to represent iClusterBayes, LRAcluster, moCluster, CIMLR, MultiNMF, and Subtype-GAN, respectively. (A) Silhouette coefficient based on the k suggested by each method. We set k-max to 8 and let each method suggest the best k. Each of the 11 data points in a box is the silhouette coefficient of the subtyping result that the corresponding method obtained at its suggested k using one of the 11 possible combinations of data types. (B) Silhouette coefficient based on all possible k. Each of the 11 data points in a box is the average silhouette coefficient of the subtyping results from k = 2 to 8 that the corresponding method obtained using one of the 11 possible combinations of data types.
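As an illustration of the quantity plotted in panel (B), the following minimal sketch computes a silhouette coefficient for each k and then averages over k. It is not the code used in the study: the Euclidean distance, the convention of assigning a width of 0 to singleton clusters, and the toy `points` and `labels_by_k` values are all assumptions made for the example, and only two values of k are shown instead of k = 2 to 8.

```python
import math


def silhouette_coefficient(points, labels):
    """Mean silhouette width over all samples, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    widths = []
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical toy data: four samples and one clustering per value of k.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels_by_k = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}

per_k = {k: silhouette_coefficient(points, labels) for k, labels in labels_by_k.items()}
mean_silhouette = sum(per_k.values()) / len(per_k)
```

In this toy case the two-cluster solution separates the samples cleanly, so its coefficient is higher than that of the three-cluster solution, and the average lies between the two.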

Fig 4.

Clinical-based performance of Dataset group #1 Nine-cancer Datasets.

The abbreviations are the same as those in Fig 3. For each method, we calculated the -log10(log-rank test p-value) of the subtyping results for every possible k, combination, and cancer. (A) Clinical-based performance based on the k suggested by each method. The upper plot shows the average ranking of each method in terms of its ability to cluster patients into clinically significant subtypes. Each data point in the box was calculated as follows. With cancer and combination fixed, we ranked the -log10(p-value) among all methods; this ranking represents the ability of each method to cluster patients into clinically significant subtypes using the current combination. Each method therefore had 11 (combinations) × 9 (cancers) = 99 rankings, which we used to compare the methods. The lower plot shows the cumulative number of significant p-values. We used 1.301 as the threshold, which corresponds to a p-value of 0.05 before the transformation, to decide whether a subtyping result was clinically significant, and we counted the significant ones. (B) Clinical-based performance based on all possible k. The two plots have the same meaning as in (A), but they were calculated slightly differently. Each data point in the box of the upper plot was calculated as follows. With cancer, combination, and k fixed, we ranked the -log10(p-value) among all methods. Each combination therefore had 7 rankings, one for each possible k, and we averaged these 7 rankings to represent the ability of the method using the current combination. For the lower plot, we counted the number of significant p-values for each combination across all possible k and accumulated the per-combination averages to draw the plot.
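The transformation, ranking, and significance count described in this caption can be sketched as follows for a single cancer and a single combination. This is only an illustration: the p-values are invented, the convention that rank 1 is the largest -log10(p-value) is an assumption, and ties are not handled.

```python
import math

SIGNIFICANCE_THRESHOLD = 1.301  # approximately -log10(0.05)


def transform(p_value):
    """Return -log10 of a log-rank test p-value."""
    return -math.log10(p_value)


def rank_methods(p_values):
    """Rank methods by transformed p-value; rank 1 is the largest (assumed)."""
    scores = {method: transform(p) for method, p in p_values.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {method: rank for rank, method in enumerate(ordered, start=1)}


def count_significant(p_values):
    """Count results whose transformed p-value exceeds the threshold."""
    return sum(transform(p) > SIGNIFICANCE_THRESHOLD for p in p_values.values())


# Hypothetical p-values for one cancer and one data combination.
example = {"iCB": 0.003, "LRA": 0.20, "moC": 0.04, "CIM": 0.0005}

ranks = rank_methods(example)
n_significant = count_significant(example)
```

Repeating the ranking for every cancer and combination, and collecting the ranks per method, would give the sets of rankings that the upper plots summarize.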

Fig 5.

Clustering-based performance of Dataset group #3 Gold Standard Datasets.

For each metric (i.e. precision, NMI, ARI, and F-measure) and each integration method, each data point in a box is a measurement obtained using one of the 11 data type combinations for both BRCA and COAD datasets, and the white line within the box indicates the mean value of the results. (A) Clustering-based performance of gold standard datasets based on the k suggested by each method. We set k-max to 8 and let each method suggest the best k. Performance at the method-suggested k was used to evaluate and compare the methods. For the three methods that cannot suggest a best k, we clustered BRCA and COAD samples into 5 and 4 clusters, respectively. (B) Clustering-based performance of gold standard datasets based on the pre-defined k. Because the true labels of the samples and the number of clusters are known in Dataset group #3, we clustered BRCA and COAD samples into 5 and 4 clusters, respectively, and calculated the clustering-based metrics to evaluate and compare the performance of the integration methods.
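Two of the metrics named in this caption, ARI and NMI, compare predicted clusters with gold standard labels and can be sketched from a contingency count. The version below is an illustrative sketch, not the implementation used in the study: NMI is normalized by the arithmetic mean of the two entropies, which is one of several common conventions and an assumption here, and the toy labels are invented.

```python
import math
from collections import Counter


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(truth)
    pairs = Counter(zip(truth, pred))
    sum_cells = sum(math.comb(c, 2) for c in pairs.values())
    sum_truth = sum(math.comb(c, 2) for c in Counter(truth).values())
    sum_pred = sum(math.comb(c, 2) for c in Counter(pred).values())
    expected = sum_truth * sum_pred / math.comb(n, 2)
    max_index = (sum_truth + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(truth, pred):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    n = len(truth)
    pairs = Counter(zip(truth, pred))
    count_t, count_p = Counter(truth), Counter(pred)
    mutual_info = sum(
        c / n * math.log(c * n / (count_t[t] * count_p[p]))
        for (t, p), c in pairs.items()
    )
    entropy_t = -sum(c / n * math.log(c / n) for c in count_t.values())
    entropy_p = -sum(c / n * math.log(c / n) for c in count_p.values())
    denom = (entropy_t + entropy_p) / 2
    return mutual_info / denom if denom > 0 else 1.0


# Hypothetical toy labels: six samples, three gold standard subtypes.
truth = ["LumA", "LumA", "Basal", "Basal", "Her2", "Her2"]
perfect = [0, 0, 1, 1, 2, 2]
merged = [0, 0, 1, 1, 1, 1]  # two subtypes merged into one cluster

ari_perfect = adjusted_rand_index(truth, perfect)
ari_merged = adjusted_rand_index(truth, merged)
nmi_perfect = normalized_mutual_info(truth, perfect)
nmi_merged = normalized_mutual_info(truth, merged)
```

Both metrics reach their maximum for the perfect clustering and drop when two subtypes are merged, which is the behavior the box plots rely on.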

Fig 6.

Accuracy Rank Table.

The rank items listed under the table combine the metric, dataset, cancer, and whether the test is based on the method-suggested k, joined by underscores. “Sugg” indicates that the test is based on the method-suggested k. “Com” and “Sig” represent the complete and significant datasets, respectively.

Fig 7.

Robustness performance.

A robust method should satisfy two criteria: when the number of clusters is fixed, its NMI decreases slowly as the noise level increases; and for a fixed noise level, its NMI differs little across different numbers of clusters. We show the NMI comparisons for (A) BRCA noise datasets and (B) COAD noise datasets. The x-axis shows the number of clusters, with five bars for each cluster number. Each bar represents the average NMI over all 11 combinations for the datasets at a given noise level and the given number of clusters. The confidence interval around the average NMI is plotted as an error bar. (C) Robustness rank table. The rank items listed under the table combine the metric, noise level, and cancer, joined by underscores.
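The height of one bar and its error bar can be sketched as the mean of 11 NMI values with a confidence interval. The caption does not state how the interval was computed, so the normal-approximation 95% interval below is an assumption, and the NMI values are invented for illustration.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval.

    z = 1.96 corresponds to an approximate 95% interval; this choice is an
    assumption, since the caption does not specify the interval type.
    """
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * sem, mean + z * sem


# Hypothetical NMI values, one per data combination, for a single
# noise level and a single number of clusters.
nmi_values = [0.50, 0.52, 0.48, 0.55, 0.45, 0.51, 0.49, 0.53, 0.47, 0.50, 0.50]

bar_height, ci_lower, ci_upper = mean_with_ci(nmi_values)
```

Computing this triple for each noise level at a fixed number of clusters would show how quickly the average NMI falls as noise increases, which is the first robustness criterion in the caption.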

Table 4.

Data scales of different datasets.

Fig 8.

Computational efficiency performance.

(A) Running time on different datasets. Here, we only include the results for the BRCA datasets in Dataset #1 and the Pan-cancer datasets in Dataset #3. The differences in running time among the methods are similar on the other cancer datasets but less pronounced, because those datasets are much smaller than the BRCA datasets. The data combinations are placed along the x-axis in ascending order of the number of participating data types and data scales. (B) Total running time. (C) Fold changes of running time on different datasets. (D) Computational efficiency rank table. The rank items listed under the table combine the dataset and cancer, joined by underscores. “Com” and “Sig” represent the complete and significant datasets, respectively.

Fig 9.

Influences of different omics data and their integration on cancer subtyping.

(A) Influence of different omics types on the subtyping of different cancers. On the radar plot, each quadrilateral represents a cancer type and each vertex of the quadrilateral represents the influence (i.e. weighted average z-score) of one type of omics data on that cancer. (B) Effective combinations of each cancer-method pair. Each vertex in the radar plots represents the weighted average z-score of a specific data combination with respect to a particular pair of cancer and method. In the plots, we use the integers 1, 2, 3, and 4 to label the four omics data types mRNA expression, miRNA expression, DNA methylation, and CNV, respectively, so that a data combination can be written as a sequence of digits. For example, the sequence “134” corresponds to the combination of three data types: mRNA expression, DNA methylation, and CNV. The red circle on each radar plot represents a z-score of zero. Combinations with a positive weighted average z-score are colored red and are considered effective for that cancer-method pair. (C) Common effective combinations of cancer types. The combinations that are effective for most cancer types are colored red.
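The digit encoding of combinations in panel (B) can be sketched as below. The mapping of digits to data types and the positive z-score rule come from the caption; the assumption that the 11 combinations are exactly the subsets of two or more data types (6 + 4 + 1 = 11) is an inference from the counts, and the z-scores in `z_scores` are invented.

```python
from itertools import combinations

# Digit labels taken from the caption.
OMICS = {
    "1": "mRNA expression",
    "2": "miRNA expression",
    "3": "DNA methylation",
    "4": "CNV",
}


def decode(code):
    """Translate a digit sequence such as "134" into data type names."""
    return [OMICS[digit] for digit in code]


# Assumed: a combination is any subset of two or more data types.
all_codes = [
    "".join(subset)
    for size in range(2, len(OMICS) + 1)
    for subset in combinations(sorted(OMICS), size)
]

# Hypothetical weighted average z-scores for one cancer-method pair.
z_scores = {"12": -0.4, "134": 0.8, "1234": 0.1}

# A combination counts as effective when its z-score is positive.
effective = [code for code, z in z_scores.items() if z > 0]
```

Here `decode("134")` yields mRNA expression, DNA methylation, and CNV, matching the example in the caption.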

Fig 10.

Comparison of the performance of the integration methods using effective combinations with the performance of clustering methods applied to individual omics data types.

(A) Silhouette coefficients comparison. (B) Transformed p-values of the log-rank test comparison. In the bottom plot, each point in the box is the improvement percentage for a specific cancer type on that combination.
