ARK: Aggregation of Reads by K-Means for Estimation of Bacterial Community Composition

doi:10.1371/journal.pone.0140644

Fig 1.

A flow-chart of the ARK method.
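The aggregation step that gives ARK its name can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that reads have already been split into clusters by K-means, that an underlying method such as SEK or Quikr has produced one composition estimate per cluster, and that the final estimate is the cluster-size-weighted average of those per-cluster estimates. The function name, the dictionary layout, and the toy genus labels are hypothetical.

```python
def aggregate_cluster_estimates(cluster_sizes, cluster_estimates):
    """Combine per-cluster composition estimates into one sample-level estimate.

    cluster_sizes: number of reads assigned to each cluster.
    cluster_estimates: one dict per cluster, mapping taxon -> estimated proportion.
    Assumption: each cluster is weighted by its share of the total reads.
    """
    if len(cluster_sizes) != len(cluster_estimates):
        raise ValueError("need exactly one estimate per cluster")
    total_reads = sum(cluster_sizes)
    if total_reads <= 0:
        raise ValueError("total number of reads must be positive")

    combined = {}
    for size, estimate in zip(cluster_sizes, cluster_estimates):
        weight = size / total_reads  # fraction of all reads in this cluster
        for taxon, proportion in estimate.items():
            combined[taxon] = combined.get(taxon, 0.0) + weight * proportion
    return combined


# Hypothetical toy input: two clusters with 600 and 400 reads.
sizes = [600, 400]
estimates = [
    {"GenusA": 0.9, "GenusB": 0.1},
    {"GenusB": 0.5, "GenusC": 0.5},
]
composition = aggregate_cluster_estimates(sizes, estimates)
```

If every per-cluster estimate sums to 1, the weighted average also sums to 1, so the result can be read directly as a community composition.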

Fig 2.

Results for the random K-means clustering on the simulated data.

Mean VD error at the genus level as a function of the number of clusters. Note the improvement that ARK contributes to each method.
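The error metric in this caption can be illustrated with a short sketch. The caption does not define VD, so the code assumes it denotes the variation distance between the true and estimated compositions, taken here as half the sum of absolute differences in taxon proportions. The function names and the toy compositions are hypothetical.

```python
def variation_distance(true_composition, estimated_composition):
    """Half the L1 distance between two compositions (assumed meaning of VD).

    Both arguments map taxon -> proportion; a missing taxon counts as 0.
    """
    taxa = set(true_composition) | set(estimated_composition)
    return 0.5 * sum(
        abs(true_composition.get(t, 0.0) - estimated_composition.get(t, 0.0))
        for t in taxa
    )


def mean_vd_error(sample_pairs):
    """Average the VD over (true, estimated) pairs, one pair per sample."""
    distances = [variation_distance(truth, est) for truth, est in sample_pairs]
    return sum(distances) / len(distances)


# Hypothetical genus-level compositions for a single sample.
truth = {"GenusA": 0.5, "GenusB": 0.5}
estimate = {"GenusA": 0.4, "GenusB": 0.4, "GenusC": 0.2}
vd = variation_distance(truth, estimate)
```

Under this definition the value lies between 0 (identical compositions) and 1 (no shared taxa), which makes mean errors comparable across methods and cluster counts.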

Fig 3.

Results for the random K-means clustering on the simulated data.

Mean increase in execution time (expressed as a factor relative to running SEK or Quikr without ARK) as a function of the number of clusters. The dashed line has slope 1.

Fig 4.

Comparison of the underlying algorithms with and without ARK.

Results are for the random K-means clustering on the simulated data when fixing the number of clusters to 75. Mean VD error at the genus level. Included for comparison are results for RDP’s NBC (compare to Fig 2(b) of [3]).

Fig 5.

Comparison of the underlying algorithms with and without ARK.

Results are for the random K-means clustering on the simulated data when fixing the number of clusters to 75. Boxplot of the execution times of the individual simulated samples. Mean execution times were 1.75 seconds for Quikr and 4.71 minutes for ARK Quikr, and 21.26 seconds for SEK and 19.21 minutes for ARK SEK. Mean execution time for RDP’s NBC was 38.19 minutes.
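Because the mean times in this caption mix seconds and minutes, converting them to a common unit makes the slowdown from ARK easier to compare. The sketch below uses only the mean values quoted in the caption; the variable and function names are hypothetical.

```python
SECONDS_PER_MINUTE = 60.0

# Mean execution times quoted in the caption, converted to seconds.
mean_seconds = {
    "Quikr": 1.75,
    "ARK Quikr": 4.71 * SECONDS_PER_MINUTE,
    "SEK": 21.26,
    "ARK SEK": 19.21 * SECONDS_PER_MINUTE,
    "RDP NBC": 38.19 * SECONDS_PER_MINUTE,
}


def slowdown_factor(seconds_with_ark, seconds_without_ark):
    """How many times longer the ARK variant takes than the plain method."""
    return seconds_with_ark / seconds_without_ark


quikr_factor = slowdown_factor(mean_seconds["ARK Quikr"], mean_seconds["Quikr"])
sek_factor = slowdown_factor(mean_seconds["ARK SEK"], mean_seconds["SEK"])

print(f"ARK Quikr vs Quikr: {quikr_factor:.1f}x")
print(f"ARK SEK vs SEK: {sek_factor:.1f}x")
```

On these figures, ARK Quikr takes roughly 161 times as long as Quikr and ARK SEK roughly 54 times as long as SEK, while both ARK variants still finish sooner on average than RDP’s NBC.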
