Fig 1.
SillyPutty algorithm.
Fig 2.
UMAP plots of four examples of simulated data sets using different parameters.
(A) 3 clusters, 600 samples, 5000 features, low noise. (B) 6 clusters, 1000 samples, 5000 features, low noise. (C) 6 clusters, 600 samples, 10000 features, medium noise. (D) 12 clusters, 1000 samples, 10000 features, high noise.
Table 1.
List of cancer genomics datasets generated by the Umpire package.
Table 2.
Average running time of algorithms across all simulations.
Fig 3.
Values over 19 replicate simulated sets were averaged. Distributions of averaged values over 27 sets of simulation parameters are displayed in ‘bean plots’ for each of the 13 methods (plus true clusters). (A) Adjusted Rand Index. (B) Mean silhouette width. (C) Normalized within-group sum of squares. (D) Entropy.
Table 3.
The averaged values of the validity indices for all clustering methods across all simulated data experiments.
Fig 4.
Effect of simulation parameters on Adjusted Rand Index.
(A) Clusters. (B) Samples. (C) Features. (D) Noise.
Fig 5.
Nonlinear t-SNE mapping depicts the arrangement of squamous cell cancers aligned with the HCSilly clustering.
Table 4.
Summary of clustering quality metrics for the squamous cell cancers dataset.