Evaluation and comparison of multi-omics data integration methods for cancer subtyping
Fig 5
Clustering-based performance on Dataset group #3 (gold standard datasets).
For each metric (precision, NMI, ARI, and F-measure) and each integration method, each data point in a box is the measurement obtained with one of the 11 data type combinations for the BRCA and COAD datasets, and the white line within the box indicates the mean value of the results. (A) Clustering-based performance on the gold standard datasets using the k suggested by each method. We set k-max to 8 and let each method suggest the best k; the performance at the suggested k was used for evaluation and comparison. For the three methods that cannot suggest a best k, we clustered the BRCA and COAD samples into 5 and 4 clusters, respectively. (B) Clustering-based performance on the gold standard datasets using the pre-defined k. Because the true sample labels and the number of clusters are known in Dataset group #3, we clustered the BRCA and COAD samples into 5 and 4 clusters, respectively, and calculated the clustering-based metrics to evaluate and compare the integration methods.
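As an illustration of how clustering-based metrics of this kind can be computed once true labels are available, the following is a minimal sketch in plain Python. It is not the authors' evaluation code. The function names are hypothetical, NMI is normalized here by the arithmetic mean of the two entropies, and precision and F-measure are computed over sample pairs; these are assumptions, since the caption does not state which definitions were used.

```python
from collections import Counter
from math import comb, log


def _pair_counts(true_labels, pred_labels):
    """Return pair counts derived from the contingency table.

    same_both: pairs together in both the true class and the predicted cluster
    same_true: pairs together in the true class
    same_pred: pairs together in the predicted cluster
    all_pairs: total number of sample pairs
    """
    table = Counter(zip(true_labels, pred_labels))
    same_both = sum(comb(c, 2) for c in table.values())
    same_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    return same_both, same_true, same_pred, comb(len(true_labels), 2)


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index (ARI): pair agreement corrected for chance."""
    same_both, same_true, same_pred, all_pairs = _pair_counts(true_labels, pred_labels)
    if all_pairs == 0:
        return 1.0
    expected = same_true * same_pred / all_pairs
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        return 1.0
    return (same_both - expected) / (max_index - expected)


def normalized_mutual_info(true_labels, pred_labels):
    """NMI, normalized by the mean of the two entropies (an assumption)."""
    n = len(true_labels)
    table = Counter(zip(true_labels, pred_labels))
    rows, cols = Counter(true_labels), Counter(pred_labels)
    mi = sum((c / n) * log(c * n / (rows[t] * cols[p])) for (t, p), c in table.items())
    h_true = -sum((c / n) * log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * log(c / n) for c in cols.values())
    denom = (h_true + h_pred) / 2
    return 1.0 if denom == 0 else mi / denom


def pair_precision_and_f(true_labels, pred_labels):
    """Pair-counting precision and F-measure (assumed definitions)."""
    same_both, same_true, same_pred, _ = _pair_counts(true_labels, pred_labels)
    precision = same_both / same_pred if same_pred else 0.0
    recall = same_both / same_true if same_true else 0.0
    if precision + recall == 0:
        return precision, 0.0
    return precision, 2 * precision * recall / (precision + recall)
```

In a setting like panel (B), `true_labels` would hold the known subtype of each sample and `pred_labels` the cluster assignments returned by an integration method at the pre-defined k; cluster numbering does not need to match the subtype numbering, because all four metrics are invariant to relabeling.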