Figure 1.
Overview of the resources of CREST.
Arrows represent the flow of information during the construction of a new reference database (top part) or during classification (bottom part). The classification tools MEGAN and LCAClassifier can use CREST taxonomy files and databases such as SilvaMod to classify environmental sequences that have been aligned to the reference database with Megablast.
Table 1.
Assignment accuracy from ten-fold cross validation.
Figure 2.
Precision-recall curves from ten-fold cross validation.
The curves show precision (number of correct assignments / number of assignments made) on the y-axis and measured recall (sensitivity, or true positive rate) on the x-axis as the LCA range or confidence cutoff is varied. Circles indicate the default cutoffs (confidence cutoff for RDP = 0.8, LCA range = 0.02).
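The two quantities plotted in Figure 2 can be illustrated with a minimal sketch. Precision follows the definition given in the caption. Recall is taken here, as an assumption, to be the number of correct assignments divided by the total number of query sequences, since every query in a cross validation has a known true taxon. The function name `precision_recall` and the use of `None` for an unassigned sequence are hypothetical choices for illustration, not part of CREST or the RDP Classifier.

```python
from typing import Optional, Sequence, Tuple


def precision_recall(
    predicted: Sequence[Optional[str]],
    truth: Sequence[str],
) -> Tuple[float, float]:
    """Return (precision, recall) for one cutoff setting.

    predicted: taxon assigned to each query, or None if no assignment was made
               (None as the "unassigned" marker is an assumption of this sketch).
    truth:     known correct taxon for each query.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")

    # Keep only the queries for which the classifier made an assignment.
    assigned = [(p, t) for p, t in zip(predicted, truth) if p is not None]
    correct = sum(1 for p, t in assigned if p == t)

    # Precision: correct assignments / assignments made (as in the caption).
    precision = correct / len(assigned) if assigned else 0.0
    # Recall (assumed definition): correct assignments / all queries.
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Four queries: three assigned, two of them correctly.
example_predicted = ["Bacillus", "Vibrio", None, "Nitrospira"]
example_truth = ["Bacillus", "Listeria", "Nitrospira", "Nitrospira"]
example_precision, example_recall = precision_recall(example_predicted, example_truth)
```

Repeating this calculation while sweeping the LCA range or the confidence cutoff would give one (recall, precision) point per setting, which is how a curve of this kind is traced.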
Table 2.
Assignment accuracy from removal-of-taxa cross validation.
Table 3.
Datasets used for performance testing.
Figure 3.
Average proportion of reads classified at different ranks in four environmental datasets.
The CREST LCAClassifier (analogous to MEGAN) was tested using the full SilvaMod and Greengenes [21] reference databases with their respective taxonomies, as well as the RDP Classifier [22] retrained with Greengenes (99% OTU dataset; executed via QIIME) and version 6 of the default RDP training dataset.
Table 4.
Results from performance testing using environmental datasets.