Watered-down biodiversity? A comparison of metabarcoding results from DNA extracted from matched water and bulk tissue biomonitoring samples

  • Mehrdad Hajibabaei,

    Roles Conceptualization, Formal analysis, Methodology, Writing – original draft

    Affiliation Centre for Biodiversity Genomics and Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada

  • Teresita M. Porter,

    Roles Formal analysis, Methodology, Writing – review & editing

    Affiliations Centre for Biodiversity Genomics and Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada, Great Lakes Forestry Centre, Natural Resources Canada, Sault Ste. Marie, Ontario, Canada

  • Chloe V. Robinson,

    Roles Methodology, Writing – review & editing

    Affiliation Centre for Biodiversity Genomics and Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada

  • Donald J. Baird,

    Roles Methodology, Writing – review & editing

    Affiliation Environment and Climate Change Canada @ Canadian Rivers Institute, Department of Biology, University of New Brunswick, Fredericton, New Brunswick, Canada

  • Shadi Shokralla,

    Roles Methodology, Writing – review & editing

    Affiliation Centre for Biodiversity Genomics and Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada

  • Michael T. G. Wright

    Roles Methodology, Writing – review & editing

    Affiliation Centre for Biodiversity Genomics and Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada


Biomonitoring programs have evolved beyond the sole use of morphological identification to determine the composition of invertebrate species assemblages in an array of ecosystems. The application of DNA metabarcoding in freshwater systems for assessing benthic invertebrate communities is now being employed to generate biological information for environmental monitoring and assessment. A possible shift from the extraction of DNA from net-collected bulk benthic samples to its extraction directly from water samples for metabarcoding has generated considerable interest based on the assumption that taxon detectability is comparable when using either method. To test this, we studied paired water and benthos samples from a taxon-rich wetland complex, to investigate differences in the detection of arthropod taxa from each sample type. We demonstrate that metabarcoding of DNA extracted directly from water samples is a poor surrogate for DNA extracted from bulk benthic samples, focusing on key bioindicator groups. Our results continue to support the use of bulk benthic samples as a basis for metabarcoding-based biomonitoring, with nearly three times greater total richness in benthic samples compared to water samples. We also demonstrated that few arthropod taxa are shared between collection methods, with a notable lack of key bioindicator EPTO taxa in the water samples. Although species coverage in water could likely be improved through increased sample replication and/or increased sequencing depth, benthic samples remain the most representative, cost-effective method of generating aquatic compositional information via metabarcoding.


Aquatic biomonitoring programs are designed to detect and interpret ecological change through analysis of biodiversity in target assemblages such as macroinvertebrates at a given sampling location [1]. The inclusion of biodiversity information in environmental impact assessment and monitoring has injected much-needed ecological relevance into a system dominated by physicochemical data [2]. However, current biomonitoring data suffer from coarse taxonomic resolution, incomplete observation (due to inadequate subsampling), and/or inconsistent observation (variable sampling designs and collection methods), and so lack the robustness needed to support the development of large-scale models for the interpretation of changing regional patterns in biodiversity [3]. As a result, practitioners of ecosystem biomonitoring struggle to provide information that can easily be scaled up to interpret large-scale regional change [4]. This is a critical deficit, as ecosystems currently face significant threats arising from large-scale, pervasive environmental drivers such as climate change, which in turn create spatially and temporally diverse and co-acting stressors [5].

Over the last decade, biodiversity science has experienced a genomics/bioinformatics revolution. The technique of DNA barcoding has supported the wider use of genetic information as a global biodiversity identification and discovery tool [6,7]. Several studies have advocated the use of DNA barcode sequences to identify bio-indicator species (e.g., macroinvertebrates) in the context of biomonitoring applications [8,9]. The use of DNA sequence information for specimen identification can significantly aid biomonitoring programs by increasing taxonomic resolution (which can provide robust species-level identification) in comparison to morphological analysis (which is often limited to genus- or family-level [order, or class-level] identification). However, this methodology still requires the sorting and separation of individual specimens from environmental samples obtained through collection methods such as benthic kick-net sampling. The samples obtained routinely contain hundreds to thousands of individual organisms, many of which are immature stages which cannot be reliably identified [10].

Advances in high-throughput sequencing (HTS) technologies have enabled massively parallelized sequencing platforms with the capacity to obtain sequence information from biota in environmental samples without separating individual organisms [11,12]. Past research has demonstrated the utility of HTS in providing biodiversity data from environmental samples that have variously been called “metagenomics”, “environmental barcoding”, “environmental DNA” or “DNA metabarcoding” [13,14]. These approaches are either targeted towards specific organisms (e.g., pathogens, invasive species, or endangered species) or aim to characterize assemblages of biota. Biomonitoring applications fall mainly into the second category where assemblages are targeted for ecological analyses [3]. For example, macroinvertebrate larvae from benthos are considered standard bio-indicator taxa for aquatic ecosystem assessment. Previous work demonstrated the use of HTS in biodiversity analysis of benthic macroinvertebrates [11,15,16] and its applicability to biomonitoring programs [3]. Various studies have contributed to this endeavor by demonstrating capabilities and limitations of HTS in aquatic biomonitoring [17–20].

An important consideration in generating DNA information via HTS analysis for biomonitoring involves the choice of samples. A wide range of sample types including water, soil, benthic sediments, gut contents, passive biodiversity samplings (e.g., malaise traps) could be used as sources for DNA extraction and analysis [21]. Depending on the size of the target organisms, in some cases whole organisms might be present in the samples (e.g., larval samples in benthos). However, a sample may also harbor DNA in residual tissue or cells shed from organisms that may not be present as a whole. For example, early work on environmental DNA focused on detecting relatively large target species (e.g., invasive amphibian or fish species) from DNA obtained from water samples [22]. The idea of analyzing DNA obtained from water has been proposed for biodiversity assessment in and around water bodies or rivers [23] and specifically for bioindicator species [24]. However, because benthos harbors microhabitats for bio-indicator species development and growth, it has been the main source of biodiversity samples for biomonitoring applications [1]. In order to evaluate the suitability of water as a source for biodiversity information of bio-indicator taxa, it is important to assess whether DNA obtained from water samples alone provides sufficient coverage of benthic bio-indicator taxa commonly used in aquatic biomonitoring.

Here, we compare benthic and water samples collected in parallel from the same wetland ponds as sources of DNA for environmental DNA (eDNA) metabarcoding analysis. Specifically, we assess whether patterns of biodiversity illuminated through DNA analysis of benthos are reflected through DNA analysis of water samples. The study system involves two adjacent deltas in northern Alberta, Canada within Wood Buffalo National Park. By comparing patterns of sequence data from operational taxonomic units (OTUs) and multiple taxonomic levels (species, genus, family, and order), we explore differences between biodiversity data (i.e., taxonomic list information) from DNA extracted from water samples as compared to DNA extracted from co-located benthic samples.


Field sampling

Eight open-water wetland sites within the Peace-Athabasca delta complex were sampled in August 2011 (see S1 Table for site information). All sites were located within Wood Buffalo National Park in Alberta, Canada. Field permits were granted by Parks Canada at Wood Buffalo National Park. The field work did not involve endangered or protected species. Three replicate samples of the benthic aquatic invertebrate community (hereafter designated as ‘benthos’) were taken from the edge of the emergent vegetation zone into the submerged vegetation zone at each site, following standard Canadian Aquatic Biomonitoring Network (CABIN) protocol [25]. Replicated, paired samples were located approximately 100 metres apart. Samples were collected using a standard CABIN kick net with a 400 μm mesh net and collecting cup attached to a pole and net frame. Effort was standardized at two minutes per sample. Sampling was conducted by moving the net up and down through the vegetation in a sinusoidal pattern while maintaining constant forward motion. If the net became impeded by dislodged vegetation, sampling was paused so extraneous vegetation could be removed. Sampling typically resulted in a large amount of vegetation within the net. After sampling, this vegetation was vigorously rinsed to dislodge attached organisms, and visually inspected to remove remaining individuals before discarding. The remaining material was removed from the net and placed in a white 1L polyethylene sample jar filled no more than half full. The net and collecting cup were rinsed and inspected to remove any remaining invertebrates. Samples were preserved in 95% ethanol in the field and placed on ice in a cooler for transport to the field base. Here they were transferred to a freezer at -20°C before shipment. A sterile net was used to collect samples at each site and field crew wore clean nitrile gloves to collect and handle samples in the field and laboratory, thereby minimizing the risk of DNA contamination between sites.

Three 1L water samples for subsequent DNA extraction were collected directly into sterile DNA/RNA free 1L polyethylene sample jars. Water samples were collected at the same locations as the benthos samples, immediately prior to benthic sampling, to avoid disturbance that would resuspend DNA from the benthos into the water column. Water samples were placed on ice prior to being transported to the lab.

Water sample filtering and benthos homogenization

Under a positive pressure sterile hood, 1L water samples were filtered with a 0.22 μm filter (MoBio Laboratories). After water filtration, total DNA was extracted from the entire filter using a PowerWater DNA extraction kit (MoBio Laboratories) and eluted in 100 μl of molecular biology grade water, according to the manufacturer’s instructions. DNA samples were kept frozen at -20°C until further PCR amplification and sequencing. A DNA extraction negative control was performed in parallel to ensure the sterility of the DNA extraction process.

For benthos samples, after removal of the EtOH [11], a crude homogenate was produced by blending the components of each sample using a standard blender that had been previously decontaminated and sterilized using ELIMINase followed by a rinse with deionized water and UV treatment for 30 min. A sample of this homogenate was transferred to 50 mL Falcon tubes and centrifuged at 1000 rpm for 5 minutes to pellet the tissue. After discarding the supernatant, the pellets were dried at 70°C, until the ethanol was fully evaporated. Once dry, the homogenate pellets were combined into a single tube and stored at -20°C.

Using a sterile spatula, ~300 mg dry weight of homogenate was subsampled into 3 MP matrix tubes containing ceramic and silica gel beads. The remaining dry mass was stored in the Falcon tubes at -20°C as a voucher.

DNA was extracted using a NucleoSpin tissue extraction kit (Macherey-Nagel) with a minor modification of the kit protocol: the crude homogenate was first lysed with 720 μL T1 buffer and then further homogenized using a MP FastPrep tissue homogenizer for 40 s at 6 m/s. Following this homogenization step, the tubes were spun down in a microcentrifuge and 100 μL of proteinase K was added to each. After vortexing, the tubes were incubated at 56°C for 24 hr. Once the incubation was completed, the tubes of digest were centrifuged for 1 min at 10,000 g and 200 μL of supernatant from each tube was transferred to each of three sterile microfuge tubes per tube of digest. The lysate was loaded to a spin column filter and centrifuged at 11,000 g for 1 min. The columns were washed twice and dried according to the manufacturer’s protocol. The dried columns were then transferred into clean 1.5 mL tubes. DNA was eluted from the filters with 30 μL of warmed molecular biology grade water. DNA extraction negative control was performed in parallel to ensure the sterility of the DNA extraction process.

Purity and concentration of DNA for each site was checked using a NanoDrop spectrophotometer and recorded. Samples were kept at -20°C for further PCR and sequencing.

Amplicon library preparation for HTS

Two fragments within the standard COI DNA barcode region were amplified with two primer sets (A_F/D_R [~250 bp], called AD, and B_F/E_R [~330 bp], called BE) [11,21] using a two-step PCR amplification regime. The first PCR used COI specific primers and the second PCR involved Illumina-tailed primers. The PCR reactions were assembled in 25 μL volumes. Each reaction contained 2 μL DNA template, 17.5 μL molecular biology grade water, 2.5 μL 10× reaction buffer (200 mM Tris–HCl, 500 mM KCl, pH 8.4), 1 μL MgCl2 (50 mM), 0.5 μL dNTPs mix (10 mM), 0.5 μL forward primer (10 mM), 0.5 μL reverse primer (10 mM), and 0.5 μL Invitrogen’s Platinum Taq polymerase (5 U/μL). The PCR conditions were initiated with heated lid at 95°C for 5 min, followed by a total of 30 cycles of 94°C for 40 s, 46°C (for both primer sets) for 1 min, and 72°C for 30 s, and a final extension at 72°C for 5 min, and hold at 4°C. Amplicons from each sample were purified using Qiagen’s MinElute PCR purification columns and eluted in 30 μL molecular biology grade water. The purified amplicons from the first PCR were used as templates in the second PCR with the same amplification conditions used in the first PCR with the exception of using Illumina-tailed primers in a 30-cycle amplification regime. All PCRs were done using Eppendorf Mastercycler ep gradient S thermal cyclers and negative control reactions (no DNA template) were included in all experiments.
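As a quick arithmetic check on the reaction recipe above, the following Python sketch sums the listed component volumes and derives final in-reaction concentrations using C_final = C_stock × V / V_total. The dictionary and function names are illustrative only and are not part of the original protocol.

```python
# Component volumes (microlitres) as listed for one 25 uL PCR reaction.
REACTION_UL = {
    "DNA template": 2.0,
    "molecular biology grade water": 17.5,
    "10x reaction buffer": 2.5,
    "MgCl2 (50 mM stock)": 1.0,
    "dNTP mix (10 mM stock)": 0.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "Platinum Taq (5 U/uL)": 0.5,
}


def total_volume(components):
    """Sum all component volumes of a reaction."""
    return sum(components.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilution formula: stock concentration scaled by the volume fraction."""
    return stock * volume_ul / total_ul


total = total_volume(REACTION_UL)                  # expected to equal the stated 25 uL
mgcl2_mM = final_concentration(50.0, 1.0, total)   # final MgCl2 concentration in mM
dntp_mM = final_concentration(10.0, 0.5, total)    # final dNTP concentration in mM
taq_units = 5.0 * 0.5                              # enzyme units per reaction

print(total, mgcl2_mM, dntp_mM, taq_units)
```

The listed volumes add up to the stated 25 μL, which gives 2 mM MgCl2, 0.2 mM dNTPs, and 2.5 U of polymerase per reaction.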

High throughput sequencing

PCR products were visualized on a 1.5% agarose gel to check the amplification success. All generated amplicon plates were dual indexed and pooled into a single tube. The pooled library was purified with AMPure beads and quantified before being sequenced on a MiSeq flowcell using a V2 MiSeq sequencing kit (250 × 2; FC-131-1002 and MS-102-2003).

Bioinformatic methods

Raw Illumina paired-end reads were processed using the SCVUC v2.3 pipeline available from Briefly, raw reads were paired with SeqPrep ensuring a minimum Phred score of 20 and minimum overlap of at least 25 bp [26]. Primers were trimmed with CUTADAPT v1.18 ensuring a minimum trimmed fragment length of at least 150 bp, a minimum Phred score of 20 at the ends, and allowing a maximum of 3 N’s [27]. All primer-trimmed reads were concatenated for a global exact sequence variants (ESV) analysis. Reads were dereplicated with VSEARCH v2.11.0 using the ‘derep_fulllength’ command and the ‘sizein’ and ‘sizeout’ options [28]. Denoising was performed using the unoise3 algorithm in USEARCH v10.0.240 [29]. This method removes sequences with potential errors, PhiX carry-over from Illumina sequencing, putative chimeric sequences, and rare reads. Here we defined rare reads to be ESVs containing only 1 or 2 reads (singletons and doubletons; [30]). An ESV x sample table was created with VSEARCH using the ‘usearch_global’ command, mapping reads to ESVs with 100% identity. ESVs were taxonomically assigned using the COI Classifier v3.2 [31].
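To make the dereplication and rare-read steps concrete, here is a toy Python sketch. It is not the VSEARCH or USEARCH implementation and does not perform unoise3 denoising or chimera removal; it only illustrates collapsing identical reads into counted variants and discarding variants supported by 1 or 2 reads. The function names and the example reads are hypothetical.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical sequences and count how many reads support each."""
    return Counter(reads)


def drop_rare(counts, min_reads=3):
    """Keep variants with at least min_reads reads (drops singletons and doubletons)."""
    return {seq: n for seq, n in counts.items() if n >= min_reads}


# Hypothetical primer-trimmed reads, shortened for readability.
toy_reads = ["ACGT", "ACGT", "ACGT", "ACGA", "ACGA", "TTGA", "ACGT"]

counts = dereplicate(toy_reads)   # ACGT: 4 reads, ACGA: 2 reads, TTGA: 1 read
kept = drop_rare(counts)          # only the variant with >= 3 reads survives

print(kept)
```

In this toy input only one variant passes the filter; the doubleton and the singleton are removed, mirroring the rare-read definition used above.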

Data analysis

Most diversity analyses were conducted in RStudio with the vegan package [32,33]. Read and ESV statistics for all taxa and for arthropods only were calculated in R. To assess whether sequencing depth was sufficient we plotted rarefaction curves using a modified vegan ‘rarecurve’ function. Before normalization, we assessed the recovery of ESVs from benthos compared with water samples and assessed the proportion of all ESVs that could be taxonomically assigned with high confidence. Taxonomic assignments were deemed to have high confidence if they had the following bootstrap support cutoffs: species ≥ 0.70 (95% correct), genus ≥ 0.30 (99% correct), family ≥ 0.20 (99% correct), as is recommended for 200 bp fragments [31]. An underlying assumption for nearly all taxonomic assignment methods is that the query taxa are present in the reference database, in which case 95–99% of the taxonomic assignments are expected to be correct using these bootstrap support cutoffs. Assignments to more inclusive ranks, e.g., order, do not require a bootstrap support cutoff to ensure that 99% of assignments are correct.
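The cutoff rule above can be sketched as a small Python function that reports the finest rank at which an ESV's assignment is considered confident. This is an illustration under assumptions: the input is a hypothetical mapping from rank name to bootstrap support, which is not the classifier's actual output format, and the function name is invented for this example.

```python
# Bootstrap support cutoffs quoted in the text, ordered from finest to coarsest rank.
CUTOFFS = (("species", 0.70), ("genus", 0.30), ("family", 0.20))


def finest_confident_rank(support):
    """Return the finest rank whose bootstrap support meets its cutoff, or None."""
    for rank, cutoff in CUTOFFS:
        if support.get(rank, 0.0) >= cutoff:
            return rank
    return None


# Hypothetical bootstrap supports for one ESV.
example = {"family": 0.98, "genus": 0.41, "species": 0.12}

print(finest_confident_rank(example))
```

For this hypothetical ESV the species support falls below 0.70, so the assignment would be reported at genus rank.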

To assess how diversity recovered from benthos and water samples may differ, we first normalized different library sizes by rarefying down to the 15th percentile library size using the vegan ‘rrarefy’ function [34]. It is known that bias present at each major sample-processing step (DNA extraction, mixed template PCR, sequencing) can distort initial template to sequence ratios rendering ESV or OTU abundance data questionable [18,35–37]. We therefore chose to transform our abundance matrix to a presence-absence matrix for all further analyses. We calculated ESV richness across different partitions of the data to compare differences across sites and collection methods (benthos or water samples). To check for significant differences we first checked for normality using visual methods (ggdensity and ggqqplot functions in R) and the Shapiro-Wilk test for normality [38]. Because our data were not normally distributed, we used a paired Wilcoxon test to test whether median richness across sites from benthic samples is greater than the median richness across sites from water samples [39]. Additionally, we created a ternary plot using package ‘Ternary’ [40].
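The normalization described above can be sketched in plain Python as follows. This is a simplified stand-in for vegan's ‘rrarefy’, not a port of it: the percentile is taken by a simple lower-rank rule rather than R's interpolated quantile, and the sample names and counts are invented for illustration.

```python
import random


def percentile_depth(library_sizes, fraction=0.15):
    """Pick a rarefaction depth near the given percentile (simple lower-rank rule)."""
    ordered = sorted(library_sizes)
    index = int(fraction * (len(ordered) - 1))
    return ordered[index]


def rarefy(counts, depth, rng):
    """Subsample one sample's reads to `depth` without replacement."""
    pool = [esv for esv, n in counts.items() for _ in range(n)]
    if len(pool) <= depth:
        return dict(counts)  # too few reads to subsample: leave unchanged
    subsample = {}
    for esv in rng.sample(pool, depth):
        subsample[esv] = subsample.get(esv, 0) + 1
    return subsample


def presence_absence(counts):
    """Reduce read counts to the set of ESVs detected at least once."""
    return {esv for esv, n in counts.items() if n > 0}


# Hypothetical ESV x sample read counts.
samples = {
    "site1_benthos": {"esv1": 50, "esv2": 30, "esv3": 20},
    "site1_water": {"esv1": 5, "esv4": 15},
    "site2_benthos": {"esv2": 40, "esv5": 20},
    "site2_water": {"esv4": 30, "esv6": 10},
}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
depth = percentile_depth([sum(c.values()) for c in samples.values()])
rarefied = {name: rarefy(c, depth, rng) for name, c in samples.items()}
presence = {name: presence_absence(c) for name, c in rarefied.items()}

print(depth, {name: len(esvs) for name, esvs in presence.items()})
```

Richness per sample is then simply the size of each presence set, which is the quantity compared between collection methods.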

To assess the overall community structure detected from different collection methods, we used non-metric multi-dimensional scaling analysis on Sorensen dissimilarities (binary Bray-Curtis) using the vegan ‘metaMDS’ function. A scree plot was used to guide our choice of 3 dimensions for the analysis (not shown). A Shepard’s curve and goodness of fit calculations were calculated using the vegan ‘stressplot’ and ‘goodness’ functions. To assess the significance of groupings, we used the vegan ‘vegdist’ function to create a Sorensen dissimilarity matrix, the ‘betadisper’ function to check for heterogeneous distribution of dissimilarities, and the ‘adonis’ function to do a permutational analysis of variance (PERMANOVA) to check for any significant interactions between groups (collection method, sample site). We calculated the Jaccard index to look at the overall similarity between water and benthos samples.
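For presence-absence data, both measures named above reduce to simple set arithmetic. The Python sketch below shows the Sorensen dissimilarity (binary Bray-Curtis), 1 − 2a/(|A| + |B|), and the Jaccard similarity index, a/|A ∪ B|, where a is the number of shared ESVs. It is a hand-rolled illustration, not vegan's ‘vegdist’, and the two ESV sets are hypothetical.

```python
def sorensen_dissimilarity(a, b):
    """Binary Bray-Curtis: 1 - 2 * shared / (|a| + |b|)."""
    if not a and not b:
        raise ValueError("both samples are empty")
    shared = len(a & b)
    return 1 - (2 * shared) / (len(a) + len(b))


def jaccard_index(a, b):
    """Jaccard similarity: shared / size of the union."""
    union = a | b
    if not union:
        raise ValueError("both samples are empty")
    return len(a & b) / len(union)


# Hypothetical ESV sets detected by each collection method at one site.
benthos = {"esv1", "esv2", "esv3", "esv4", "esv5"}
water = {"esv4", "esv5", "esv6"}

print(sorensen_dissimilarity(benthos, water), jaccard_index(benthos, water))
```

With two shared ESVs out of six in total, the toy samples have a Sorensen dissimilarity of 0.5 and a Jaccard index of one third.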

To assess the ability of traditional bioindicator taxa to distinguish among samples, we limited our dataset to ESVs assigned to the EPTO (Ephemeroptera, Plecoptera, Trichoptera, Odonata) insect orders. No significant beta dispersion was found within groups. We used PERMANOVA to test for significant interactions between groups and sources of variation such as collection method and river delta as described above. Sample replicates were pooled. We also visualized the frequency of ESVs detected from EPTO families using a heatmap generated using geom_tile (ggplot) in R.

Results and discussion

A total of 48,799,721 x 2 Illumina paired-end reads were sequenced (S2 Table). After bioinformatic processing, we retained a total of 16,841 ESVs (5,407,720 reads) that included about 11% of the original raw reads. Many reads were removed during the primer-trimming step from water samples for being too short (< 150 bp) after primer trimming. After taxonomic assignment, a total of 4,459 Arthropoda ESVs (4,399,949 reads) were retained for data analysis (S3 Table). 27% of all ESVs were assigned to Arthropoda, accounting for 81% of reads in all ESVs.

Rarefaction curves that reach a plateau show that our sequencing depth was sufficient to capture the ESV diversity in our PCRs (S1 Fig). Benthos samples generate more ESVs than water samples as shown in the rarefaction curves as well as by the median number of reads and ESVs recovered by each collection method (S2 Fig). As expected, not all Arthropoda ESVs could be taxonomically assigned with confidence (S3 Fig). This is probably because local arthropods may not be represented in the underlying reference sequence database. As a result, most of our analyses are presented at the finest level of resolution using exact sequence variants.

Analysis of sample biodiversity

Alpha diversity measures based on mean richness and beta diversity based on the Jaccard index among all samples show higher values for benthos compared to water at the ESV rank. The total richness for benthos is 1,588 and water is 658, with a benthos:water ratio of 2.4. The Jaccard index is 0.14 indicating that water and benthos samples are 14% similar. Examining the arthropod ESV richness for each sample site from benthos and water collections reinforces the general pattern of higher detected richness from benthos samples (Wilcoxon test, p-value < 0.05; Fig 1).

Fig 1. Median arthropod richness per site is higher in benthos samples than water samples.

Results are based on normalized data.

We further illustrate how arthropod richness varies with collection method (benthos or water) by looking at the number of ESVs exclusively found from benthos samples, found in both benthos and water samples, or exclusively found from water samples (Fig 2). ESVs were taxonomically assigned using the COI Classifier v3.2 available from [31]. For example, for sample 04B, 49% of ESVs are unique to benthos samples, 37% of ESVs are unique to water samples, and 14% of ESVs are shared. In fact, this sample contains the largest proportion of shared ESVs. When looking at more inclusive taxonomic ranks, more of the community is shared among benthos and water samples. When considering specific arthropod orders and genera, a greater diversity of sequence variants is detected from benthic samples even when the same higher-level taxa are also recovered from water samples (Fig 3). Some of the confidently identified arthropod genera represented by more than 100 sequence variants included: Tanytarsus (Diptera identified from benthos-B and water-W), Aeshna (Odonata, B only), Leucorrhinia (Odonata, B only), and Scapholeberis (Diplostraca, B + W).
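The three proportions plotted for each site can be derived from two presence-absence sets, as in the Python sketch below. The function name is illustrative, and the ESV identifiers are synthetic: they were constructed only so that the proportions match the 49% / 37% / 14% split reported for sample 04B, and they are not the study's data.

```python
def partition_proportions(benthos, water):
    """Return (benthos-only, water-only, shared) as fractions of all ESVs detected."""
    total = len(benthos | water)
    if total == 0:
        raise ValueError("no ESVs detected in either sample")
    return (
        len(benthos - water) / total,
        len(water - benthos) / total,
        len(benthos & water) / total,
    )


# Synthetic ESV identifiers: 49 benthos-only, 37 water-only, 14 shared.
benthos_only = {f"b{i}" for i in range(49)}
water_only = {f"w{i}" for i in range(37)}
shared = {f"s{i}" for i in range(14)}

proportions = partition_proportions(benthos_only | shared, water_only | shared)

print(proportions)
```

The three fractions always sum to one, which is what allows each site to be placed as a single point on a ternary plot.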

Fig 2. Few arthropod ESVs are shared among benthic and water samples.

The ternary plot shows the proportion of ESVs unique to benthos samples, unique to water samples, or shared. Sample names are shown directly on the plot. Results are based on normalized data.

Fig 3. A greater diversity of arthropod sequence variants are detected from benthic samples.

Each point represents a genus identified with high confidence and the number of benthic and water exact sequence variants (ESVs) with this taxonomic assignment. Only genera represented by at least 2 ESVs in both benthic and water samples are labelled in the plot for clarity. The points are color coded for the 17 arthropod orders detected in this study. A 1:1 correspondence line (dotted) is also shown. Points that fall above this line are represented by a greater number of ESVs from benthic samples. A log10 scale is shown on each axis to improve the spread of points with small values.

Samples from the same sites but collected using different methods (benthos or water), clustered according to collection method instead of site (Fig 4). The ordination was a good fit to the observed Sorensen dissimilarities (NMDS, stress = 0.12, R2 = 0.91). Visually, samples cluster both by collection method and river delta. Although we did find significant beta dispersion among collection method, river, and site dissimilarities (ANOVA, p-value < 0.01), we had a balanced design, so we used a PERMANOVA to check for any significant interactions between groups and none were found [41]. Collection site explained ~ 53% of the variance (p-value < 0.05), river delta explained ~ 10% of the variance (p-value = 0.001), and collection method explained ~ 9% of the variance in beta diversity (p-value = 0.001). Thus, even though richness measures are highly sensitive to choice of collection method, beta diversity is robust with samples clearly clustering by river delta regardless of whether benthos or water samples are analyzed (p-value = 0.001; S4 Fig).

Fig 4. Samples cluster by collection method and river delta.

The NMDS is based on rarefied data and Sorensen dissimilarities based on presence-absence data. The first plot shows sites clustered by collection method, benthos or water. The second plot shows sites clustered by river delta, Athabasca River or Peace River delta.

Analysis of key bioindicator groups

Given the importance of aquatic insects as bioindicator species in standard biomonitoring programs, and to specifically address whether water samples could be used in lieu of benthos for biomonitoring applications, we closely examined the results obtained for four insect orders of biomonitoring importance. Based on the detection of EPTO ESVs, collection method (benthos or water) accounts for 13% of the variation in ordination distances (PERMANOVA, p-value = 0.011; S4 Table). Overall, these differences stem from variation in the distribution of ESVs detected from 76 observed EPTO families (Fig 5). While the total number of ESVs and EPTO families varied from site to site, there is a dramatic shift in the composition detected from benthos and water. For example, in site 1, 888 ESVs from 40 EPTO families were detected from the benthos sample, while only 133 ESVs from 9 EPTO families were observed from the water sample, despite being taken at the exact same location and time. Within each collection method, river delta explains 11% of the variation (PERMANOVA, p-value = 0.031). This means that despite differences in the community composition detected from benthos and water, EPTO ESVs can still be used to separate samples from two river deltas.

Fig 5. More Ephemeroptera, Plecoptera, Trichoptera, and Odonata family ESVs are detected from benthos compared with water samples.

Each cell shows ESV richness colored according to the legend. Grey cells indicate zero ESVs. Only ESVs taxonomically assigned to families with high confidence (bootstrap support ≥ 0.20) are included. Based on normalized data.

Biodiversity information forms the basis of a vast array of ecological and evolutionary investigations. Given that biodiversity information for bioindicator groups, such as aquatic insects, is the main source of biological data for various environmental impact assessment and monitoring programs, it is vital for these data to provide a consistent and accurate representation of existing taxon richness [42]. Methods based on bulk sampling of environmental material (i.e. water) for identification of either single species [43] or communities [44] have been proposed as a simplified biomonitoring tool [23,24]. However, our analysis shows that water eDNA fails to provide a rich representation of the benthic community structure in aquatic ecosystems. Our sampling design allowed us to undertake a direct comparison as we were able to collect samples from benthos and water in parallel across a range of sites. These wetland sites consisted of small ponds with minimal or no flow, minimizing the chance of stream flow as a factor impacting the availability of eDNA in a given water sample.

Our analysis of taxon richness in benthos versus water illuminates the need for caution when interpreting data captured from water as an estimate of total richness in a system. In some cases, we saw several-fold decreases in richness in water versus benthos. Although a comprehensive analysis of taxon richness should not rely solely on numbers, this reduction in taxa detected indicates the inadequacy of water for solely detecting existing aquatic invertebrate communities. In comparison, a recent study suggested that eDNA metabarcoding in flowing systems recovers higher levels of richness than bulk benthos samples [24]. However, our study design allowed a direct comparison between water and benthos for both EPTO and general richness without the influence of flow, meaning this was a true assessment of local community assemblages, represented by each sample type. eDNA metabarcoding in flowing systems can therefore result in the additional detection of upstream communities [24], reflected in the greater number of taxa detected, but does not reflect the existing biodiversity at the local scale.

An important consideration when deciding effective biomonitoring methods should be the ecology of target biodiversity units. Factors including life cycle and habitat preference (i.e. benthic or water column) are likely to influence the rate of detection in different sampling approaches [45–47]. We have demonstrated in this study that whilst some ESVs are shared between both benthos and water, there is a sampling bias as to the associations of taxa, particularly EPTO, with different sample sources, which was also observed in a recent comparative study with running water [24]. The association of specific taxa with benthos enables communities to be assessed spatially, across different habitat types [15,48]. One of the major limitations of attempting to determine presence/absence of taxa in water is the uncertainty of the original DNA source. As samples are often collected at single fixed locations, taxa recovered in water can vary depending on when and where DNA was released into the aqueous environment in addition to other factors including flow rate [23]. This makes scaling up results from water challenging [49]. Conversely, benthos samples enable a real-time assessment of biodiversity originating from a known locality, which has implications for fine-scale environmental assessments [15].

Environmental factors including hydrolysis drive DNA degradation in aqueous substrates, which can negatively influence detectability of DNA in water [50]. This confounding factor could account for some of the reduction in biodiversity observed between benthos and water [51]. For water sampling to improve species coverage and gain a comparable number of observations, a considerable increase in replicates and sequencing depth is required [52,53]. Earlier research has shown that increasing the volume of water sampled up to 2 L does not appear to add taxonomic coverage [54]; however, increasing the number of both biological and technical replicates can increase the number of taxa detected [52,53,55,56]. We used triplicate sampling for each site and compared EPTO taxa between sites and between the two rivers, separately. None of these comparisons provided support for the use of water eDNA in place of benthos. We found that benthos replicates clustered more closely, with less variation in ESV abundance, than water replicates, which suggests that three replicates are sufficient for consistent species detection with benthos and that water is less consistent at representing community structure. In addition, using highly degenerate primers can increase the total biodiversity detected using eDNA metabarcoding [24]. However, highly degenerate primers have an increased likelihood of amplifying non-target regions [57] in comparison to primers with lower degeneracy, such as those used in this study. Additionally, employing highly degenerate primers in biomonitoring studies can lead to overrepresentation of some taxa (e.g. non-metazoan), which further distances such metabarcoding studies from current stream ecosystem assessment methods [24,58].
Attempting to improve taxonomic coverage of water by increasing the number of samples collected, increasing sequencing depth and utilising highly degenerate primers adds considerable costs in terms of finances, effort and comparability, without the guarantee of representative levels of biodiversity identification.
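
The link between replication and detection can be illustrated with a simple calculation. The sketch below is not part of the analysis in this study: it assumes that each replicate detects a given taxon independently with the same per-replicate probability `p` (a hypothetical parameter), in which case the chance of detecting the taxon at least once in `n` replicates is 1 − (1 − p)^n. The function names are illustrative only.

```python
def cumulative_detection(p: float, n: int) -> float:
    """Probability of detecting a taxon in at least one of n replicates.

    Assumes independent replicates with the same per-replicate
    detection probability p (an illustrative simplification).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def replicates_needed(p: float, target: float) -> int:
    """Smallest number of replicates whose cumulative detection reaches target."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while cumulative_detection(p, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Compare a readily detected taxon with a rarely detected one.
    for p in (0.5, 0.2):
        print(p, cumulative_detection(p, 3), replicates_needed(p, 0.95))
```

Under these assumptions, taxa with a low per-replicate detection probability need many more replicates than the three used per site here, which is consistent with the cost argument above. Real replicates are unlikely to be fully independent, so the numbers should be read as a rough guide only.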


It is apparent from the data generated in our comparative study that employing water column samples as a surrogate for benthic samples is not supported, as benthos DNA does not appear to be represented at detectable levels in the overlying water in these static-water wetland systems. Benthic samples are a superior source of biomonitoring DNA when compared to water in terms of providing reproducible taxon richness information at a variety of spatial scales. Choice of sampling method is a critical factor in determining the taxa detected for biomonitoring assessment, and we believe that a comprehensive assessment of total biodiversity should include multiple sampling methods to ensure that representative DNA from all target organisms can be captured.

Supporting information

S1 Table. Summary information for the eight sites sampled.

Information includes waterbody name, latitude and longitude of sample collection.


S2 Table. Summary of reads and ESVs in all taxa.


S3 Table. Summary of reads and ESVs assigned to the Arthropoda.


S4 Table. EPTO ESVs can be used to separate rivers using either benthos or water collection methods.

Sample replicates were pooled. No significant beta dispersion was detected within groups (collection method, river). No significant interaction between groups was detected (collection method, river). Summary of PERMANOVA results based on a Sorensen dissimilarity matrix of EPTO ESVs. Significant p-values are in bold. Based on normalized data.
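
For readers unfamiliar with the Sorensen dissimilarity referred to above, the following minimal sketch shows how it is computed from presence/absence data. It is an illustration only and not the code used in this study; the sample names and ESV identifiers are made up. For two samples with ESV sets A and B, the dissimilarity is 1 − 2|A ∩ B| / (|A| + |B|).

```python
from itertools import combinations


def sorensen_dissimilarity(a: set, b: set) -> float:
    """Sorensen dissimilarity between two presence/absence ESV sets.

    0.0 means identical composition, 1.0 means no shared ESVs.
    Two empty samples are treated as identical (a convention chosen here).
    """
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    shared = len(a & b)
    return 1.0 - (2.0 * shared) / total


def pairwise_sorensen(samples: dict) -> dict:
    """Dissimilarity for every unordered pair of named samples."""
    return {
        (x, y): sorensen_dissimilarity(samples[x], samples[y])
        for x, y in combinations(sorted(samples), 2)
    }


if __name__ == "__main__":
    # Hypothetical pooled samples; ESV labels are placeholders.
    samples = {
        "benthos_site1": {"ESV1", "ESV2", "ESV3"},
        "water_site1": {"ESV2", "ESV3", "ESV4", "ESV5"},
    }
    for pair, d in pairwise_sorensen(samples).items():
        print(pair, round(d, 3))
```

A matrix of such pairwise values is the input to a PERMANOVA; the permutation test itself is not reproduced here.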


S1 Fig. Rarefaction curves are saturated.

Benthos samples from each site are shown in green and water samples are shown in blue. The vertical line shows the number of reads that would be included after normalizing library size down to the 15th percentile (reads = 2,099).
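
The normalization step mentioned in this caption can be sketched as follows. This is a hedged illustration rather than the pipeline used in this study: it assumes a nearest-rank definition of the 15th percentile (other definitions interpolate and may give a slightly different depth) and random subsampling of reads without replacement. The function names and the example counts are hypothetical.

```python
import math
import random
from collections import Counter


def percentile_depth(library_sizes, pct=15.0):
    """Nearest-rank percentile of the library sizes (total reads per sample)."""
    ordered = sorted(library_sizes)
    if not ordered:
        raise ValueError("no library sizes given")
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def rarefy(counts, depth, seed=0):
    """Subsample one sample's ESV read counts down to `depth` reads.

    Returns None when the sample has fewer reads than `depth`,
    meaning it would be dropped from the normalized data set.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, then draw without replacement.
    pool = [esv for esv, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


if __name__ == "__main__":
    # Hypothetical read counts per ESV for three samples.
    samples = {
        "benthos_1": {"ESV1": 900, "ESV2": 600, "ESV3": 500},
        "water_1": {"ESV1": 40, "ESV4": 60},
        "water_2": {"ESV2": 300, "ESV5": 200},
    }
    sizes = [sum(c.values()) for c in samples.values()]
    depth = percentile_depth(sizes, 15.0)
    for name, counts in samples.items():
        print(name, rarefy(counts, depth))
```

Choosing a low percentile as the common depth keeps most samples in the analysis while discarding only the smallest libraries.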


S2 Fig. The recovery of reads and ESVs from benthos is much greater than that from water.

Results summarized before normalization.


S3 Fig. Only some arthropod ESVs could be assigned with high confidence.

Results summarized before normalization.


S4 Fig. River sites are separated whether arthropod ESVs are detected from benthos or water samples.

NMDS ordination distances well-represent observed Sorensen dissimilarities (Benthos stress = 0.08, R2 = 0.95; Water stress = 0.09, R2 = 0.95). PERMANOVA shows that river groupings are significant and explain 14–19% of the variation in beta diversity (Benthos R2 = 0.19, p-value = 0.001; Water R2 = 0.14, p-value = 0.001). Results based on normalized data.



We acknowledge field support from Parks Canada (Jeff Shatford, Ronnie & David Campbell) and Environment and Climate Change Canada (Daryl Halliwell) for the process of sample collection.


  1. Bonada N, Prat N, Resh VH, Statzner B. Developments in aquatic insect biomonitoring: a comparative analysis of recent approaches. Annu Rev Entomol. 2006;51(1):495–523.
  2. Friberg N, Bonada N, Bradley DC, Dunbar MJ, Edwards FK, Grey J, et al. Biomonitoring of Human Impacts in Freshwater Ecosystems: The Good, the Bad and the Ugly. In: Woodward G, editor. Advances in Ecological Research [Internet]. Academic Press; 2011 [cited 2019 Jun 24]. p. 1–68.
  3. Baird DJ, Hajibabaei M. Biomonitoring 2.0: a new paradigm in ecosystem assessment made possible by next-generation DNA sequencing. Mol Ecol. 2012;21(8):2039–44. pmid:22590728
  4. Hajibabaei M, Baird DJ, Fahner NA, Beiko R, Golding GB. A new way to contemplate Darwin’s tangled bank: how DNA barcodes are reconnecting biodiversity science and biomonitoring. Philos Trans R Soc B Biol Sci. 2016 Sep 5;371(1702):20150330.
  5. Dafforn KA, Johnston EL, Ferguson A, Humphrey CL, Monk W, Nichols SJ, et al. Big data opportunities and challenges for assessing multiple stressors across scales in aquatic ecosystems. Mar Freshw Res. 2016 Apr 14;67(4):393–413.
  6. Hajibabaei M, Singer GAC, Hebert PDN, Hickey DA. DNA barcoding: how it complements taxonomy, molecular phylogenetics and population genetics. Trends Genet. 2007 Apr 1;23(4):167–72. pmid:17316886
  7. Hebert PDN, Hollingsworth PM, Hajibabaei M. From writing to reading the encyclopedia of life. Philos Trans R Soc B Biol Sci. 2016 Sep 5;371(1702):20150321.
  8. Pilgrim EM, Jackson SA, Swenson S, Turcsanyi I, Friedman E, Weigt L, et al. Incorporation of DNA barcoding into a large-scale biomonitoring program: opportunities and pitfalls. J North Am Benthol Soc. 2011 Mar 1;30(1):217–31.
  9. Sweeney BW, Battle JM, Jackson JK, Dapkey T. Can DNA barcodes of stream macroinvertebrates improve descriptions of community structure and water quality? J North Am Benthol Soc. 2011 Mar 1;30(1):195–216.
  10. Orlofske JM, Baird DJ. The tiny mayfly in the room: implications of size-dependent invertebrate taxonomic identification for biomonitoring data properties. Aquat Ecol. 2013 Dec 1;47(4):481–94.
  11. Hajibabaei M, Spall JL, Shokralla S, van Konynenburg S. Assessing biodiversity of a freshwater benthic macroinvertebrate community through non-destructive environmental barcoding of DNA from preservative ethanol. BMC Ecol. 2012 Dec 23;12(1):28.
  12. Shokralla S, Spall JL, Gibson JF, Hajibabaei M. Next-generation sequencing technologies for environmental DNA research. Mol Ecol. 2012;21(8):1794–805. pmid:22486820
  13. Hajibabaei M. The golden age of DNA metasystematics. Trends Genet. 2012 Nov 1;28(11):535–7. pmid:22951138
  14. Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E. Towards next-generation biodiversity assessment using DNA metabarcoding. Mol Ecol. 2012;21(8):2045–50. pmid:22486824
  15. Hajibabaei M, Shokralla S, Zhou X, Singer GAC, Baird DJ. Environmental Barcoding: A Next-Generation Sequencing Approach for Biomonitoring Applications Using River Benthos. PLOS ONE. 2011 Apr 13;6(4):e17497. pmid:21533287
  16. Gibson JF, Shokralla S, Curry C, Baird DJ, Monk WA, King I, et al. Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing. PLOS ONE. 2015 Oct 21;10(10):e0138432. pmid:26488407
  17. Carew ME, Kellar CR, Pettigrove VJ, Hoffmann AA. Can high-throughput sequencing detect macroinvertebrate diversity for routine monitoring of an urban river? Ecol Indic. 2018 Feb 1;85:440–50.
  18. Elbrecht V, Leese F. Can DNA-Based Ecosystem Assessments Quantify Species Abundance? Testing Primer Bias and Biomass—Sequence Relationships with an Innovative Metabarcoding Protocol. PLOS ONE. 2015 Jul 8;10(7):e0130324. pmid:26154168
  19. Lejzerowicz F, Esling P, Pillet L, Wilding TA, Black KD, Pawlowski J. High-throughput sequencing and morphology perform equally well for benthic monitoring of marine ecosystems. Sci Rep. 2015 Sep 10;5:13932. pmid:26355099
  20. Dowle EJ, Pochon X, Banks JC, Shearer K, Wood SA. Targeted gene enrichment and high-throughput sequencing for environmental biomonitoring: a case study using freshwater macroinvertebrates. Mol Ecol Resour. 2016;16(5):1240–54. pmid:26583904
  21. Gibson J, Shokralla S, Porter TM, King I, van Konynenburg S, Janzen DH, et al. Simultaneous assessment of the macrobiome and microbiome in a bulk sample of tropical arthropods through DNA metasystematics. Proc Natl Acad Sci. 2014 Jun 3;111(22):8007–12. pmid:24808136
  22. Ficetola GF, Miaud C, Pompanon F, Taberlet P. Species detection using environmental DNA from water samples. Biol Lett. 2008 Aug 23;4(4):423–5. pmid:18400683
  23. Deiner K, Fronhofer EA, Mächler E, Walser J-C, Altermatt F. Environmental DNA reveals that rivers are conveyer belts of biodiversity information. Nat Commun. 2016 Aug 30;7:12544. pmid:27572523
  24. Macher J-N, Vivancos A, Piggott JJ, Centeno FC, Matthaei CD, Leese F. Comparison of environmental DNA and bulk-sample metabarcoding using highly degenerate cytochrome c oxidase I primers. Mol Ecol Resour. 2018 Nov;18(6):1456–68. pmid:30129704
  25. Environment and Climate Change Canada. CABIN Wetland Macroinvertebrate Protocol. 2018. Available from: http://publications.gc.ca/collections/collection_2018/eccc/CW66-571-2018-eng.pdf
  26. St. John J. SeqPrep. 2016. Available from: https://github.com/jstjohn/SeqPrep/releases
  27. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011 May 2;17(1):10–2.
  28. Rognes T, Flouri T, Nichols B, Quince C, Mahé F. VSEARCH: a versatile open source tool for metagenomics. PeerJ. 2016 Oct 18;4:e2584. pmid:27781170
  29. Edgar RC. UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing. bioRxiv. 2016 Oct 15;081257.
  30. Callahan BJ, McMurdie PJ, Holmes SP. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J. 2017 Dec;11(12):2639–43. pmid:28731476
  31. Porter TM, Hajibabaei M. Over 2.5 million COI sequences in GenBank and growing. PLOS ONE. 2018 Sep 7;13(9):e0200177. pmid:30192752
  32. Dixon P. VEGAN, a package of R functions for community ecology. J Veg Sci. 2003;14(6):927–30.
  33. RStudio Team. RStudio: Integrated Development Environment for R. 2016.
  34. Weiss S, Xu ZZ, Peddada S, Amir A, Bittinger K, Gonzalez A, et al. Normalization and microbial differential abundance strategies depend upon data characteristics. Microbiome. 2017 Mar 3;5(1):27. pmid:28253908
  35. Suzuki MT, Giovannoni SJ. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl Environ Microbiol. 1996 Feb 1;62(2):625–30. pmid:8593063
  36. Polz MF, Cavanaugh CM. Bias in Template-to-Product Ratios in Multitemplate PCR. Appl Environ Microbiol. 1998 Oct 1;64(10):3724–30. pmid:9758791
  37. McLaren MR, Willis AD, Callahan BJ. Consistent and correctable bias in metagenomic sequencing measurements. bioRxiv. 2019 Feb 25;559831.
  38. Shapiro SS, Wilk MB. An Analysis of Variance Test for Normality (Complete Samples). Biometrika. 1965;52(3/4):591–611.
  39. Wilcoxon F. Individual Comparisons by Ranking Methods. Biom Bull. 1945;1(6):80–3.
  40. Smith MR. Ternary: An R Package for Creating Ternary Plots version 1.1.1 from CRAN [Internet]. [cited 2019 Sep 10].
  41. Anderson MJ, Walsh DCI. PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? Ecol Monogr. 2013;83(4):557–74.
  42. Dickie IA, Boyer S, Buckley HL, Duncan RP, Gardner PP, Hogg ID, et al. Towards robust and repeatable sampling methods in eDNA-based studies. Mol Ecol Resour. 2018;18(5):940–52.
  43. Rees HC, Maddison BC, Middleditch DJ, Patmore JRM, Gough KC. REVIEW: The detection of aquatic animal species using environmental DNA–a review of eDNA as a survey tool in ecology. J Appl Ecol. 2014;51(5):1450–9.
  44. Valentini A, Taberlet P, Miaud C, Civade R, Herder J, Thomsen PF, et al. Next-generation monitoring of aquatic biodiversity using environmental DNA metabarcoding. Mol Ecol. 2016;25(4):929–42. pmid:26479867
  45. Culp JM, Armanini DG, Dunbar MJ, Orlofske JM, Poff NL, Pollard AI, et al. Incorporating traits in aquatic biomonitoring to enhance causal diagnosis and prediction. Integr Environ Assess Manag. 2011 Apr;7(2):187–97. pmid:21442732
  46. Tréguier A, Paillisson J-M, Dejean T, Valentini A, Schlaepfer MA, Roussel J-M. Environmental DNA surveillance for invertebrate species: advantages and technical limitations to detect invasive crayfish Procambarus clarkii in freshwater ponds. J Appl Ecol. 2014;51(4):871–9.
  47. Koziol A, Stat M, Simpson T, Jarman S, DiBattista JD, Harvey ES, et al. Environmental DNA metabarcoding studies are critically affected by substrate selection. Mol Ecol Resour. 2019;19(2):366–76. pmid:30485662
  48. Yoccoz NG, Bråthen KA, Gielly L, Haile J, Edwards ME, Goslar T, et al. DNA from soil mirrors plant taxonomic and growth form diversity. Mol Ecol. 2012;21(15):3647–55. pmid:22507540
  49. Roussel J-M, Paillisson J-M, Tréguier A, Petit E. The downside of eDNA as a survey tool in water bodies. J Appl Ecol. 2015;52(4):823–6.
  50. Bohmann K, Evans A, Gilbert MTP, Carvalho GR, Creer S, Knapp M, et al. Environmental DNA for wildlife biology and biodiversity monitoring. Trends Ecol Evol. 2014 Jun 1;29(6):358–67. pmid:24821515
  51. Schultz MT, Lance RF. Modeling the Sensitivity of Field Surveys for Detection of Environmental DNA (eDNA). PLOS ONE. 2015 Oct 28;10(10):e0141503. pmid:26509674
  52. Ficetola GF, Pansu J, Bonin A, Coissac E, Giguet-Covex C, Barba MD, et al. Replication levels, false presences and the estimation of the presence/absence from eDNA metabarcoding data. Mol Ecol Resour. 2015;15(3):543–56. pmid:25327646
  53. Alberdi A, Aizpurua O, Gilbert MTP, Bohmann K. Scrutinizing key steps for reliable metabarcoding of environmental samples. Methods Ecol Evol. 2018;9(1):134–47.
  54. Mächler E, Deiner K, Spahn F, Altermatt F. Fishing in the Water: Effect of Sampled Water Volume on Environmental DNA-Based Detection of Macroinvertebrates. Environ Sci Technol. 2016;50(1):305–12. pmid:26560432
  55. Furlan EM, Gleeson D, Hardy CM, Duncan RP. A framework for estimating the sensitivity of eDNA surveys. Mol Ecol Resour. 2016;16(3):641–54. pmid:26536842
  56. Lanzén A, Lekang K, Jonassen I, Thompson EM, Troedsson C. DNA extraction replicates improve diversity and compositional dissimilarity in metabarcoding of eukaryotes in marine sediments. PLOS ONE. 2017 Jun 16;12(6):e0179443. pmid:28622351
  57. Elbrecht V, Leese F. Validation and Development of COI Metabarcoding Primers for Freshwater Macroinvertebrate Bioassessment. Front Environ Sci [Internet]. 2017 [cited 2019 Jun 24];5.
  58. Weigand AM, Macher J-N. A DNA metabarcoding protocol for hyporheic freshwater meiofauna: Evaluating highly degenerate COI primers and replication strategy. Metabarcoding Metagenomics. 2018 Aug 23;2:e26869.