Table 1.
Format of a microbiome data set for subjects and distinct taxa at an arbitrary taxonomic level (e.g., Phylum or Class).
Figure 1.
Description of Dirichlet-multinomial parameters.
Intuitive description of the meaning of the overdispersion parameter. The four plots show the taxa frequencies of five hypothetical samples (dashed lines), each with 12 taxa, and the corresponding weighted average across the five samples, given by the vector of taxa frequencies (solid line). The plots on the left show taxa frequencies of samples drawn from a multinomial distribution, and the plots on the right show taxa frequencies of five samples drawn from a Dirichlet-multinomial distribution. The top row of plots is for samples with a smaller number of sequence reads, while the bottom row is for samples with a larger number of sequence reads. As the number of reads increases, each sample's taxa frequencies converge onto the mean under the multinomial distribution, whereas under the Dirichlet-multinomial an increased number of reads is still associated with the same variability between the individual samples.
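The behaviour described in this caption can be reproduced with a small simulation. The sketch below is illustrative only: it assumes the common parameterization in which the Dirichlet concentration for each taxon is its mean frequency times (1 - theta) / theta, and the names `sample_frequencies`, `mean_spread`, `pi`, and `theta` are hypothetical, not taken from the original figure code.

```python
import random


def sample_frequencies(pi, n_reads, theta=0.0, rng=random):
    """Return observed taxa frequencies for one simulated sample.

    theta == 0 gives a plain multinomial draw around pi.
    theta > 0 first draws sample-specific frequencies from a Dirichlet
    centred on pi (assumed parameterization), then draws the reads.
    """
    if theta > 0:
        scale = (1.0 - theta) / theta  # assumed link between theta and concentration
        gammas = [rng.gammavariate(p * scale, 1.0) for p in pi]
        total = sum(gammas)
        probs = [g / total for g in gammas]  # normalized gammas are one Dirichlet draw
    else:
        probs = list(pi)
    # Assign each read to a taxon according to the sample's frequencies.
    reads = rng.choices(range(len(pi)), weights=probs, k=n_reads)
    counts = [0] * len(pi)
    for taxon in reads:
        counts[taxon] += 1
    return [c / n_reads for c in counts]


def mean_spread(pi, n_reads, theta, n_samples=5, rng=random):
    """Average absolute deviation of the sample frequencies from pi."""
    total = 0.0
    for _ in range(n_samples):
        freqs = sample_frequencies(pi, n_reads, theta, rng)
        total += sum(abs(f - p) for f, p in zip(freqs, pi)) / len(pi)
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(1)
    pi = [1 / 12] * 12  # 12 equally abundant hypothetical taxa
    for n_reads in (200, 20000):
        print(n_reads,
              mean_spread(pi, n_reads, 0.0, rng=rng),   # multinomial
              mean_spread(pi, n_reads, 0.1, rng=rng))   # Dirichlet-multinomial
```

With more reads the multinomial spread shrinks toward zero, while the Dirichlet-multinomial spread stays at roughly the same level, because the between-sample variation is introduced before the reads are drawn.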
Figure 2.
Illustration of a small and a large effect size when comparing two groups.
Figure 3.
Comparison of two metagenomic groups using a taxa composition data analysis approach.
Taxa frequency means at Class level obtained from subgingival plaque samples (blue curve) and from supragingival plaque samples (red curve): a) The mean of all taxa frequencies found in each group; b) The mean of taxa frequencies whose weighted average across both groups is larger than 1%. The remaining taxa are pooled into an additional taxon labeled as ‘Pooled taxa’.
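The pooling step in panel b) can be sketched as follows. This is a minimal illustration that assumes the weighted average is weighted by read count, so it equals the taxon's share of all reads summed over every sample in both groups; the function name `pool_rare_taxa` and the example taxa labels are hypothetical.

```python
def pool_rare_taxa(counts, taxa, threshold=0.01):
    """Merge taxa whose overall frequency is at most `threshold`.

    counts: one list of read counts per sample (samples from all groups).
    taxa:   taxon labels, in the same order as the count columns.
    Returns (new_taxa, new_counts), with 'Pooled taxa' as the last column.
    """
    n_taxa = len(taxa)
    totals = [sum(sample[j] for sample in counts) for j in range(n_taxa)]
    grand_total = sum(totals)
    # Read-weighted average frequency of each taxon (assumed weighting).
    keep = [j for j in range(n_taxa) if totals[j] / grand_total > threshold]
    rare = [j for j in range(n_taxa) if j not in keep]
    new_taxa = [taxa[j] for j in keep] + ["Pooled taxa"]
    new_counts = [
        [sample[j] for j in keep] + [sum(sample[j] for j in rare)]
        for sample in counts
    ]
    return new_taxa, new_counts


if __name__ == "__main__":
    taxa = ["taxon_a", "taxon_b", "taxon_c", "taxon_d"]  # hypothetical labels
    counts = [[90, 9, 1, 0], [80, 19, 0, 1]]             # made-up read counts
    print(pool_rare_taxa(counts, taxa))
```

Pooling keeps the total read count of each sample unchanged, so frequencies computed afterwards still sum to one.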
Table 2.
Power calculation as a function of the number of sequence reads and sample size for the comparison of taxa frequencies from the subgingiva and supragingiva populations, using as a reference the taxa frequencies obtained from the 24 samples, at 1% and 5% significance levels.
Figure 4.
Comparison of three metagenomic groups using a taxa composition data analysis approach.
Taxa frequencies at class level obtained from saliva (black line), subgingival plaque (blue line), and supragingival plaque samples (red line): a) The mean of all taxa frequencies found in each group; b) The mean of taxa frequencies whose weighted average across all three groups is larger than 1%. The remaining taxa are pooled into an additional taxon labeled as ‘Pooled taxa’.
Table 3.
Unadjusted and Bonferroni-adjusted p-values for all pairwise comparisons between saliva, supragingiva, and subgingiva samples.
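As a reminder of how a Bonferroni adjustment works, each unadjusted p-value is multiplied by the number of comparisons and capped at 1. The sketch below uses made-up p-values purely for illustration; they are not the values reported in the table, and the function name `bonferroni` is hypothetical.

```python
from itertools import combinations


def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: min(1, m * p) for m comparisons."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


if __name__ == "__main__":
    groups = ["saliva", "supragingiva", "subgingiva"]
    pairs = list(combinations(groups, 2))       # the three pairwise comparisons
    raw = dict(zip(pairs, [0.01, 0.2, 0.5]))    # placeholder p-values, not real results
    for pair, p_adj in bonferroni(raw).items():
        print(pair, raw[pair], p_adj)
```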
Figure 5.
Comparison of two metagenomic groups using rank abundance distribution data.
Mean ranked taxa frequencies at class level obtained from subgingival plaque samples (blue curve) and from supragingival plaque samples (red curve): a) The mean of all ranked taxa frequencies found in each group; b) The mean of ranked taxa frequencies whose weighted average across both groups is larger than 1%. The remaining taxa are pooled into an additional taxon labeled as ‘Pooled taxa’.
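A rank abundance distribution is commonly built by sorting each sample's taxa frequencies from most to least abundant, so that position k holds the k-th most abundant taxon regardless of its identity. The sketch below assumes that construction and, for simplicity, uses an unweighted mean across samples rather than the weighted average mentioned in the caption; the function names and counts are hypothetical.

```python
def rank_abundance(counts):
    """Convert one sample's read counts to frequencies sorted in descending order."""
    total = sum(counts)
    return sorted((c / total for c in counts), reverse=True)


def mean_ranked_frequencies(samples):
    """Unweighted mean of ranked frequencies, rank by rank, across samples."""
    ranked = [rank_abundance(sample) for sample in samples]
    n_samples = len(ranked)
    n_ranks = len(ranked[0])
    return [sum(r[k] for r in ranked) / n_samples for k in range(n_ranks)]


if __name__ == "__main__":
    # Two made-up samples whose dominant taxa differ in identity.
    samples = [[10, 30, 60], [50, 40, 10]]
    print(mean_ranked_frequencies(samples))
```

Because taxon identity is discarded, two samples dominated by different taxa can still have similar rank abundance profiles.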
Table 4.
Power calculation as a function of the number of sequence reads and sample size for the comparison of ranked taxa frequencies from the subgingiva and supragingiva populations, using as a reference the taxa frequencies obtained from the 24 samples, at 1% and 5% significance levels.
Figure 6.
Comparison of three metagenomic groups using rank abundance distribution data.
Mean ranked taxa frequencies at class level obtained from subgingival plaque samples (blue curve) and from supragingival plaque samples (red curve): a) The mean of all ranked taxa frequencies found in each group; b) The mean of ranked taxa frequencies whose weighted average across both groups is larger than 1%. The remaining taxa are pooled into an additional taxon labeled as ‘Pooled taxa’.