Fig 1.
Bioinformatics processing strategy for the three parallel approaches.
Table 1.
Loss in alpha diversity values caused by each of the two clustering choices, compared with the full ASV data analysis.
Values are shown as percentages of the corresponding values obtained from the taxa counts of the ASV table. They are means of the 34 samples collected across the habitats of the habitat-type gradient.
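The percentages described above can be sketched as follows. This is a minimal illustration, assuming that each index is first averaged over the samples and the clustered mean is then divided by the ASV mean; the caption does not state the order of averaging. The function name `percent_of_asv` and all numbers are hypothetical, not data from the study.

```python
from statistics import mean


def percent_of_asv(asv_values, clustered_values):
    """Express the mean of a clustered-data index as a percentage of the ASV mean."""
    asv_mean = mean(asv_values)
    if asv_mean == 0:
        raise ValueError("ASV mean is zero; the percentage is undefined")
    return 100.0 * mean(clustered_values) / asv_mean


# Hypothetical richness values for three samples (not data from the study).
asv_richness = [200, 400, 600]
otu97_richness = [150, 300, 450]

retained = percent_of_asv(asv_richness, otu97_richness)
loss = 100.0 - retained  # share of the ASV-based value lost through clustering
```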
Fig 2.
Correlation matrix (Pearson coefficient with Bonferroni-corrected p values) of the pairwise comparisons across numbers of taxa and ecological indexes of the three sequence clustering approaches.
Boxed cells with a grey background indicate significance at p<0.05. Abund: sequence read abundance; ASV, O99, O97: number of different taxa resulting from the full ASV analysis or from the OTU clustering at 99% or 97% shared homology, respectively. The same three prefixes are used in the abbreviations of the remaining indexes, whose suffixes indicate the following: BgPk: Berger-Parker Dominance; Brill: Brillouin Diversity Index; Equ: Equitability J; Ev: Community Evenness e^H/S; Fisha: Fisher alpha Diversity Index; Marg: Margalef Richness Index; Menh: Menhinick Richness Index; Sha: Shannon-Wiener H Diversity Index; Smp: Simpson 1-D Diversity Index.
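For readers unfamiliar with the abbreviated indexes, the sketch below computes most of them from a single sample's count vector using common textbook definitions. It is an illustration only: the software used in the study may differ in details, the function name `alpha_indices` is hypothetical, and Fisher alpha is omitted because it requires an iterative solution.

```python
import math


def alpha_indices(counts):
    """Return common alpha diversity indexes for one sample's taxon counts."""
    counts = [c for c in counts if c > 0]  # taxa absent from the sample are ignored
    richness = len(counts)                 # S, number of taxa
    if richness == 0:
        raise ValueError("sample contains no reads")
    n_total = sum(counts)                  # N, number of reads
    props = [c / n_total for c in counts]
    shannon = -sum(p * math.log(p) for p in props)
    nan = float("nan")
    return {
        "taxa": richness,
        "Sha": shannon,                                   # Shannon-Wiener H
        "Smp": 1.0 - sum(p * p for p in props),           # Simpson 1-D
        "BgPk": max(counts) / n_total,                    # Berger-Parker dominance
        "Equ": shannon / math.log(richness) if richness > 1 else nan,   # J = H / ln S
        "Ev": math.exp(shannon) / richness,               # e^H / S
        "Marg": (richness - 1) / math.log(n_total) if n_total > 1 else nan,
        "Menh": richness / math.sqrt(n_total),
        # Brillouin: (ln N! - sum ln n_i!) / N, with ln x! = lgamma(x + 1)
        "Brill": (math.lgamma(n_total + 1)
                  - sum(math.lgamma(c + 1) for c in counts)) / n_total,
    }


# Hypothetical, perfectly even sample with four taxa and one absent taxon.
idx = alpha_indices([10, 10, 10, 10, 0])
```

In a perfectly even sample such as this one, Equitability J and e^H/S both equal 1 and Berger-Parker dominance equals 1/S.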
Fig 3.
Neighbor Joining dendrograms obtained by the cluster analysis of each of the three different data matrix tables.
a) Jaccard distances resulting from the Centered Log Ratio (CLR) transformation of the data, applied to circumvent compositional constraints; b) Bray-Curtis dissimilarity-based dendrograms from non-normalized data. In both cases the tree topology is visibly conserved at 99% OTU clustering and is evidently lost when sequences are assembled into 97% clusters. Sample nomenclature follows [28]. Habitat acronyms: fpe: forest poplar elms; fpi: forest pines; hed: hedges; hud: humid dune; idu: inner dunes; mel: meadow on levee; odu: outer dunes; pon: pond; ppr: pines prairie; pra: prairie; sed: sea sediment; wh.: wheat; wnt: wheat no tillage.
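The two building blocks named in this caption, the CLR transformation and the Bray-Curtis dissimilarity, can be sketched as below. This is a minimal illustration and does not reproduce the dendrogram construction itself. The pseudocount of 1 used to handle zero counts is an assumption (the caption does not say how zeros were treated), and the function names `clr` and `bray_curtis` are hypothetical.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log ratio: log of each count minus the mean log (log of the geometric mean).

    The pseudocount is an assumed way of avoiding log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors over the same taxa."""
    if len(a) != len(b):
        raise ValueError("samples must be described over the same taxa")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical count vectors (not data from the study).
clr_values = clr([1, 2, 3])               # CLR values always sum to zero
identical = bray_curtis([1, 2, 3], [1, 2, 3])
disjoint = bray_curtis([5, 0], [0, 5])
partial = bray_curtis([10, 0], [5, 5])
```

Bray-Curtis ranges from 0 for identical samples to 1 for samples sharing no taxa.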
Fig 4.
Principal Coordinate Analysis (PCoA) ordination biplots based on the Raup distance metric, computed for each of the three dataset matrixes.
Beta diversity was calculated among the seven ecologically coherent sets across the land-to-sea transect (Cropped, Prairie, Hedges, Floodplain Transition, Coastal, Waters) into which the 34 samples were grouped.
Table 2.
PERMANOVA p values associated with the beta-diversity Principal Coordinate Analysis (PCoA) results obtained with each of the three dataset matrixes.
Samples were grouped into seven ecologically coherent sets across the land-to-sea transect (Cropped, Prairie, Hedges, Floodplain Transition, Coastal, Waters), among which the community diversity was assessed. P values significant at p<0.05 are in bold and marked with an asterisk (*).
Fig 5.
Linear Discriminant Analysis Effect Size (LEfSe) showing the taxa that differ significantly (p<0.05) in each of the three dataset matrixes.
Top: ASV; middle: OTU_99; bottom: OTU_97. The relative occurrence of each taxon in each of the seven ecological habitat subsets encountered stepwise along the transect is reported on the right side. The LDA scores are calculated from data transformed with the trimmed mean of M values (TMM) method.
Table 3.
Gamma diversity of the whole site.
The total number of non-redundant taxa counted across the whole series of samples is reported, together with the corresponding levels for the OTU-clustered datasets relative to the ASV values.
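The count described above can be sketched as the size of the union of taxa observed in at least one sample. This is a minimal illustration: the table layout (a dictionary mapping sample to taxon counts), the function name `gamma_diversity`, and all sample and taxon names are hypothetical.

```python
def gamma_diversity(table):
    """Number of distinct taxa with a non-zero count in at least one sample."""
    observed = set()
    for counts in table.values():
        observed.update(taxon for taxon, n in counts.items() if n > 0)
    return len(observed)


# Hypothetical tables (not data from the study).
asv_table = {
    "s1": {"asv_a": 3, "asv_b": 0, "asv_c": 1},
    "s2": {"asv_b": 2, "asv_d": 5},
}
otu_table = {
    "s1": {"otu_x": 4, "otu_y": 0},
    "s2": {"otu_x": 2, "otu_y": 5},
}

gamma_asv = gamma_diversity(asv_table)
gamma_otu = gamma_diversity(otu_table)
relative_to_asv = 100.0 * gamma_otu / gamma_asv  # OTU level as a percentage of the ASV value
```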