Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible

Figure 2

Graphical summary of the two simulation frameworks.

Both Simulation A (clustering) and Simulation B (differential abundance) are represented. All simulations begin with real microbiome count data from a survey experiment referred to here as “the Global Patterns dataset” [48]. Tables of integers with multiple columns represent an abundance count matrix (“OTU table”), while a single column of integers represents a multinomial of OTU counts/proportions. In both simulation illustrations an effect size is explained and given an example value of 10 for easy mental computation, but its meaning is different for each simulation. Note that effect size is altogether different from library size; the latter is the number of reads per sample, which equals the corresponding column sum of the OTU table. A grey highlight indicates count values to which an effect has been applied in Simulation B. Protocol S1 includes the complete source code used to compute the example values shown here, as well as the full simulations discussed below.
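The distinction between library size and effect size can be illustrated with a minimal sketch. This is not the code from Protocol S1: the toy table, the function names `library_sizes` and `apply_effect`, and the choice of affected cells are all hypothetical, and the sketch assumes that the Simulation B effect acts as a multiplicative factor on selected counts, which the caption itself does not spell out.

```python
# Toy OTU table (made-up values): rows are OTUs, columns are samples.
otu_table = [
    [10, 0, 5],
    [3, 7, 0],
    [0, 2, 8],
    [6, 1, 4],
]


def library_sizes(table):
    """Return the column sums, i.e. the number of reads per sample."""
    return [sum(column) for column in zip(*table)]


def apply_effect(table, otu_rows, sample_cols, effect_size):
    """Return a copy of the table with the chosen cells multiplied by effect_size.

    Assumption: the effect is multiplicative (hypothetical reading of the caption).
    """
    perturbed = [row[:] for row in table]  # copy so the input stays unchanged
    for i in otu_rows:
        for j in sample_cols:
            perturbed[i][j] *= effect_size
    return perturbed


before = library_sizes(otu_table)
# Apply an effect size of 10 to the first OTU in the first two samples.
perturbed = apply_effect(otu_table, otu_rows=[0], sample_cols=[0, 1], effect_size=10)
after = library_sizes(perturbed)

print("library sizes before:", before)
print("library sizes after: ", after)
```

In this sketch the effect size is a parameter chosen by the simulation, whereas the library size is a property of each sample that is computed from the table and shifts as a side effect when counts are altered.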


doi: https://doi.org/10.1371/journal.pcbi.1003531.g002