CREST – Classification Resources for Environmental Sequence Tags

doi:10.1371/journal.pone.0049334

Figure 1.

Overview of the resources of CREST.

Arrows represent the flow of information during the construction of a new reference database (top part) or during classification (bottom part). The classification tools MEGAN or LCAClassifier can use CREST taxonomy files and databases such as SilvaMod to classify environmental sequences that have been aligned to the reference database with Megablast.
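The classification step in the bottom part of the figure can be illustrated with a minimal sketch of lowest-common-ancestor (LCA) assignment. It is not the LCAClassifier implementation. It assumes each alignment hit has already been reduced to a bit score and a taxonomy path, and that hits scoring within a fractional range of the best bit score are retained. The function name `lca_assign`, the parameter `lca_range`, and the example hits are hypothetical.

```python
from typing import List, Tuple

Hit = Tuple[float, List[str]]  # (bit score, taxonomy path from root to leaf)


def lca_assign(hits: List[Hit], lca_range: float) -> List[str]:
    """Return the deepest taxonomy path shared by all retained hits.

    Assumption: a hit is retained when its bit score is at least
    (1 - lca_range) times the best bit score for the query.
    """
    if not hits:
        return []  # no alignment, so no assignment
    best = max(score for score, _ in hits)
    kept = [path for score, path in hits if score >= best * (1.0 - lca_range)]

    common: List[str] = []
    # Walk the retained paths rank by rank until they disagree.
    for ranks in zip(*kept):
        if all(r == ranks[0] for r in ranks):
            common.append(ranks[0])
        else:
            break
    return common


# Illustrative hits only; scores and lineages are made up.
example_hits = [
    (200.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria"]),
    (198.0, ["Bacteria", "Proteobacteria", "Alphaproteobacteria"]),
    (120.0, ["Archaea", "Euryarchaeota"]),
]
narrow = lca_assign(example_hits, lca_range=0.05)
wide = lca_assign(example_hits, lca_range=0.5)
```

With the narrow range only the two top-scoring hits are retained, so the read is assigned at the rank where they still agree. Widening the range pulls in the weaker hit and moves the assignment toward the root.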

Table 1.

Assignment accuracy from ten-fold cross validation.

Figure 2.

Precision-recall curves from ten-fold cross validation.

Precision (number of correct assignments/number of assignments made) is shown on the y-axis and measured recall (sensitivity, or true positive rate) on the x-axis, as the LCA range or confidence cutoff is varied. Circles indicate the default cutoffs (cutoff for RDP = 0.8, LCA range = 00.2).
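The two axes can be computed as in the following minimal sketch. Precision follows the definition given in the caption. Recall is taken here as correct assignments divided by the number of test sequences that could have been assigned correctly, which is an assumption about how the true positive rate was measured. The function name and the counts are hypothetical.

```python
from typing import Tuple


def precision_recall(n_correct: int, n_assigned: int, n_assignable: int) -> Tuple[float, float]:
    """Compute one point of a precision-recall curve.

    precision = correct assignments / assignments made (as in the caption)
    recall    = correct assignments / sequences that could be assigned
                correctly (assumed definition of the true positive rate)
    """
    precision = n_correct / n_assigned if n_assigned else 0.0
    recall = n_correct / n_assignable if n_assignable else 0.0
    return precision, recall


# Illustrative counts only: 100 test sequences, 90 assigned, 80 of those correct.
p, r = precision_recall(n_correct=80, n_assigned=90, n_assignable=100)
```

Each cutoff value yields one such pair, and plotting the pairs over a range of cutoffs traces a curve.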

Table 2.

Assignment accuracy from removal-of-taxa cross validation.

Table 3.

Datasets used for performance testing.

Figure 3.

Average proportion of reads classified at different ranks in four environmental datasets.

The CREST LCAClassifier (analogous to MEGAN) was tested using the full SilvaMod and Greengenes [21] reference databases with their respective taxonomies, as well as the RDP Classifier [22] retrained with Greengenes (99% OTU dataset; executed via QIIME) and version 6 of the default RDP training dataset.

Table 4.

Results from performance testing using environmental datasets.
