Fig 1.
Comparison between multivariate and univariate statistical analysis frameworks for 16S microbiome data.
(A) Multivariate mixMC framework, including processing/normalisation, optional repeated measures design, and unsupervised and supervised analyses. (B) Univariate framework, including normalisation and optional repeated measures design analysis.
Fig 2.
Most diverse data, PCoA sample plots.
Sample plots on the first two coordinates with (a) weighted UniFrac and (b) unweighted UniFrac distances calculated on the filtered OTU count table (based on 1,674 OTUs).
Fig 3.
Most diverse data, PCA sample plots.
(a) TSS and (b) TSS multilevel OTU log counts, (c) TSS-ILR and (d) TSS-ILR multilevel normalised log counts, (e) CSS and (f) CSS multilevel log counts.
Fig 4.
Most diverse TSS+CLR data, sPLS-DA sample, contribution and cladogram plots.
(a) Sample plot on the first two components with 95% confidence ellipses. (b) and (c) Contribution of each OTU feature selected on the first (10 OTUs) and second (120 OTUs) components, with OTU contribution ranked from bottom (most important) to top. Colours indicate the body site in which the OTU is most abundant. (d) Cladogram generated from the sPLS-DA result using GraPhlAn.
Table 1.
Top: Number of selected features at the OTU (family) level and mean classification error rate per component. Bottom: Number of features at the OTU (family) level contributing to each body site for each sPLS-DA component. Note that we may observe some overlap between families across the different body sites.
Fig 5.
Oral data, sPLS-DA sample plot for the different components.
(a) Component 1 vs. Component 2, (b) Component 2 vs. Component 3, with 95% confidence ellipses.
Fig 6.
Oral data, contribution and cladogram plots of the features selected for each sPLS-DA component.
(a) Component 1, (b) Component 2, (c) Component 3. In (c), only the top 150 OTUs are represented. (d) Cladogram generated from the sPLS-DA results for components 1 and 2 using GraPhlAn.