mbkmeans: Fast clustering for single cell data using mini-batch k-means
Fig 3
The speed and memory usage of mbkmeans depend on batch size.
Performance evaluation of (A) maximum memory (RAM) used (GB) and (B) elapsed time (minutes), both on the y-axis, for increasing batch sizes (x-axis) of b = 75, 150, 300, 500, 1,000, 1,500, 3,000, 5,000, 7,500, 10,000, 20,000, 50,000, 100,000, and 200,000, on a dataset of N = 1,000,000 observations using our desktop configuration. Results for in-memory mbkmeans are in red and results for on-disk mbkmeans are in blue. We used k = 15 centroids.
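To illustrate how batch size enters this kind of benchmark, the following is a minimal sketch in pure Python. It is not the mbkmeans implementation, which is an R/Bioconductor package. It assumes a generic mini-batch k-means update with per-centroid learning rates, and it uses toy 2-D data at a scale far below the N = 1,000,000 and k = 15 reported here. The function names (`make_toy_data`, `mini_batch_kmeans`, `benchmark`) and the use of `tracemalloc` as a stand-in for maximum RAM are assumptions made for this example only.

```python
import random
import time
import tracemalloc


def make_toy_data(n, seed=0):
    """Draw n 2-D points from three Gaussian blobs (toy data, an assumption)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
    return [tuple(rng.gauss(c, 1.0) for c in rng.choice(centers)) for _ in range(n)]


def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((x - c) ** 2 for x, c in zip(point, centroids[j])),
    )


def mini_batch_kmeans(data, k, batch_size, n_iter=20, seed=0):
    """Generic mini-batch k-means sketch; only one batch is processed per iteration."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct random observations.
    centroids = [list(p) for p in rng.sample(data, k)]
    counts = [0] * k
    for _ in range(n_iter):
        batch = rng.sample(data, min(batch_size, len(data)))
        # Assign every point in the batch before moving any centroid.
        labels = [nearest(p, centroids) for p in batch]
        for point, j in zip(batch, labels):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-centroid learning rate
            centroids[j] = [(1 - eta) * c + eta * x for c, x in zip(centroids[j], point)]
    return centroids


def benchmark(data, k, batch_sizes):
    """Return (batch_size, elapsed_seconds, peak_traced_bytes) for each batch size."""
    results = []
    for b in batch_sizes:
        tracemalloc.start()
        t0 = time.perf_counter()
        mini_batch_kmeans(data, k, b)
        elapsed = time.perf_counter() - t0
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append((b, elapsed, peak))
    return results


if __name__ == "__main__":
    toy = make_toy_data(2000)
    for b, seconds, peak in benchmark(toy, k=3, batch_sizes=[10, 100, 1000]):
        print(f"b={b}: {seconds:.3f} s, peak traced memory {peak} bytes")
```

In this sketch, each iteration touches only `batch_size` observations, so both the work per iteration and the size of the batch held in memory grow with b. Note that `tracemalloc` reports only Python-level allocations, so its values are not comparable to the RAM measurements in the figure.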