Fig 1.
For clarity, we present three random data points drawn from the three classes in the Iris dataset. Black points denote the original data points X, and blue points denote the cluster centers U. At μ = 0, X and U coincide. At intermediate values of μ (middle figure), each center in U moves toward the center of its cluster. For sufficiently large μ, U collapses onto the cluster centers (right figure). Note that in this demonstration, only the left two points have a non-zero pairwise weight w_ij. Hence, the two resulting clusters reflect the two connected components of the graph defined by the matrix of weights.
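The behavior described in this caption can be reproduced in the simplest setting of two points joined by a single non-zero weight. The sketch below assumes the standard convex clustering objective, ½ Σ_i ‖x_i − u_i‖² + μ Σ_{i<j} w_ij ‖u_i − u_j‖, which the caption itself does not state. Under that assumption the two centers slide toward each other along the segment joining them and fuse at their midpoint once μ reaches ‖x_1 − x_2‖ / (2w). A point whose weights are all zero is not penalized and stays at its data point. The function name `fuse_pair` is illustrative only.

```python
import math


def fuse_pair(x1, x2, w, mu):
    """Centers (u1, u2) for two points linked by one weight w >= 0.

    Assumes the objective
        0.5 * (|x1 - u1|^2 + |x2 - u2|^2) + mu * w * |u1 - u2|,
    whose minimizer moves each center a fraction t of the way toward
    the other point, with t capped at 1/2 (the midpoint).
    """
    d = math.dist(x1, x2)
    if d == 0.0 or w == 0.0:
        # Identical points, or no coupling: centers stay at the data.
        return list(x1), list(x2)
    t = min(0.5, mu * w / d)
    u1 = [a + t * (b - a) for a, b in zip(x1, x2)]
    u2 = [b - t * (b - a) for a, b in zip(x1, x2)]
    return u1, u2


if __name__ == "__main__":
    x1, x2 = (0.0, 0.0), (2.0, 0.0)
    for mu in (0.0, 0.5, 5.0):
        print(mu, fuse_pair(x1, x2, w=1.0, mu=mu))
```

With these inputs the centers sit at the data for μ = 0, at (0.5, 0) and (1.5, 0) for μ = 0.5, and both at the midpoint (1, 0) for μ = 5, mirroring the left, middle, and right panels.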
Fig 2.
Effects of the parameters k and ϕ on cluster paths in the Iris data.
Black, red, and green points denote the species Iris-setosa, Iris-versicolor, and Iris-virginica, respectively. These points are projections of the Iris dataset on the first two principal components (PCs). Lines trace the cluster centers as they traverse the regularization path. The subtle impact of ϕ is revealed in two cases. At k = 50, a red dot coalesces with the right cluster at ϕ = 0, but with the left cluster for larger values of ϕ. At k = 5 or k = 10, the two green dots at the extreme lower left corner coalesce later at the largest value of ϕ.
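To make the roles of k and ϕ concrete, the sketch below builds a weight matrix under a common choice that this caption does not spell out: w_ij = exp(−ϕ ‖x_i − x_j‖²) when j is among the k nearest neighbors of i (or i among those of j), and w_ij = 0 otherwise. That form, and the function name `knn_gaussian_weights`, are assumptions made for illustration. With ϕ = 0 every retained weight equals 1, so only k shapes the graph; larger ϕ down-weights distant neighbors.

```python
import math


def knn_gaussian_weights(points, k, phi):
    """Symmetric weights: Gaussian kernel restricted to k-nearest-neighbor pairs."""
    n = len(points)
    # Squared Euclidean distances between all pairs of points.
    sq = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
          for p in points]
    # Indices of the k nearest neighbors of each point (excluding itself).
    neighbors = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: sq[i][j])
        neighbors.append(set(order[:k]))
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Keep the pair if either point lists the other as a neighbor.
            if j in neighbors[i] or i in neighbors[j]:
                w[i][j] = w[j][i] = math.exp(-phi * sq[i][j])
    return w


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    for row in knn_gaussian_weights(pts, k=1, phi=0.1):
        print(row)
```

In this toy example the first and third points are never linked for k = 1, so their weight is zero regardless of ϕ.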
Table 1.
Average Rand indices (RI) as a function of noise in the Iris data.
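For reference, the Rand index compares two partitions of the same items by the fraction of item pairs on which they agree, that is, pairs placed together in both partitions or apart in both. The sketch below is a minimal pure-Python version of that standard definition; the function name `rand_index` is illustrative, and the code is not taken from the analysis behind the table.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs treated the same way by both labelings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    # A pair agrees when it is "same cluster" in both labelings
    # or "different cluster" in both.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, relabeled
    print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # crossed partition
```

The index depends only on the partitions, not on the label names, so the first call returns 1.0; the second returns 2/6 because only two of the six pairs are treated alike.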
Table 2.
Average Rand indices (RI) as a function of missingness in the Iris data.
Fig 3.
Convex clustering of the HGDP data using a large number k of nearest neighbors to infer intercontinental connections (k = 4, ϕ = 1).
Fig 4.
Hierarchical clustering of the 52 populations from the HGDP data.
Fig 5.
Convex clustering of the HGDP data using a small number k of nearest neighbors to resolve intracontinental connections (k = 1, ϕ = 1).
Fig 6.
Magnified view of the convex clustering results for the HGDP data in East Asia.
Fig 7.
Magnified view of the convex clustering results for the HGDP data in Europe and Central Asia.
Fig 8.
Magnified view of the hierarchical clustering results for the HGDP data in Europe and Central Asia.
Fig 9.
Convex clustering of the European populations from the POPRES data using ϕ = 0 and k = 40.
Fig 10.
Convex clustering of the European populations from the POPRES data using ϕ = 10 and k = 40.
Fig 11.
Hierarchical clustering of the European populations from the POPRES data.
Fig 12.
Convex clustering of the European populations from the POPRES data using ϕ = 1 and k = 3.
Fig 13.
Magnified view of results from convex clustering of Southeast Europe.
Fig 14.
Magnified view of results from convex clustering of Northeast Europe.
Fig 15.
Hierarchical clustering projection showing genetic relationships among populations in and near the British Isles.
Fig 16.
Convex clustering projection showing genetic relationships among populations in and near the British Isles.
Fig 17.
Magnified view of results from convex clustering of Swiss linguistic groups.
Fig 18.
Convex clustering of the breast cancer samples.
Points on the plot indicate data vectors projected onto the first and third principal components (PCs) of the sample. Lines trace the cluster centers as they traverse the regularization path.
Fig 19.
Average linkage hierarchical clustering of the breast cancer samples.
Table 3.
Average runtimes in seconds for different analyses.