Using pseudoalignment and base quality to accurately quantify microbial community composition
Fig 2
The average absolute error with 95% confidence intervals from simulated samples of 1 × 10^6 75 bp reads, with every simulated dataset having a unique mix of 1,000 reference sequences. Each colored line represents a different quantification method: Karp, Kallisto, SINTAX, UCLUST, USEARCH, SortMeRNA, and the Naive Bayes method implemented in Mothur. The samples were also classified with the 16S Classifier, but its errors were above the range shown. Error refers to the average relative error (AVGRE): the difference between the true number of reads for each reference sequence present in the simulated data and the number each method would have assigned to that sequence had it classified every read in the data. (A) Reference-level error for taxa with frequency >0.1% in samples simulated from the full 16S gene sequence. (B) Reference-level error for taxa with frequency >0.1% in samples simulated using only the V4 hypervariable region of the 16S gene. (C) Genus-level error in the full-gene samples. (D) Genus-level error in the V4 samples.
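The error metric described in this caption can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function names `rescale_counts` and `mean_abs_error`, the dictionary inputs, and the exact normalization (a plain mean of absolute count differences over references above the frequency cutoff) are all assumptions, and the published AVGRE may apply additional scaling.

```python
def rescale_counts(estimated, total_reads):
    """Scale a method's counts as if it had classified every read.

    `estimated` maps reference ID -> number of reads the method assigned.
    Methods that leave some reads unclassified are scaled up proportionally
    (an assumed reading of "if each method had classified every read").
    """
    classified = sum(estimated.values())
    if classified == 0:
        return {ref: 0.0 for ref in estimated}
    scale = total_reads / classified
    return {ref: count * scale for ref, count in estimated.items()}


def mean_abs_error(true_counts, estimated, min_freq=0.001):
    """Mean absolute count difference over references above `min_freq`.

    `true_counts` maps reference ID -> true number of simulated reads.
    `min_freq=0.001` mirrors the >0.1% frequency cutoff in panels A and B.
    """
    total_reads = sum(true_counts.values())
    rescaled = rescale_counts(estimated, total_reads)
    # Keep only references whose true frequency exceeds the cutoff.
    kept = [ref for ref, n in true_counts.items() if n / total_reads > min_freq]
    if not kept:
        raise ValueError("no reference exceeds the frequency cutoff")
    # References the method never reported count as zero reads.
    errors = [abs(true_counts[ref] - rescaled.get(ref, 0.0)) for ref in kept]
    return sum(errors) / len(errors)
```

For example, with true counts `{"a": 600, "b": 400}` and a method that classified only half the reads as `{"a": 250, "b": 250}`, the estimates are first scaled to 500 reads each, and the error is then averaged over both references.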