AmpliconDuo: A Split-Sample Filtering Protocol for High-Throughput Amplicon Sequencing of Microbial Communities

doi:10.1371/journal.pone.0141590

Fig 1.

Principle of split sample approach with AmpliconDuo filter.

DNA extracted from a sample is split into branches A and B. In each branch, an independent PCR and sequencing run is performed. Sequences occurring in both branches pass the AmpliconDuo filter (upper green sequence ACC… with 4 reads in A and 7 reads in B), while sequences occurring in only one branch are discarded (lower red sequence CCG…). Read numbers of both branches are retained for statistical analyses.
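
The filtering rule described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ampliconduo_filter`, the shortened toy sequences `"ACC"` and `"CCG"`, and the read count of the discarded sequence are assumptions made for the example.

```python
from collections import Counter

def ampliconduo_filter(reads_a, reads_b):
    """Keep sequences seen in both branches.

    Returns {sequence: (reads_in_A, reads_in_B)} so that both read
    numbers stay available for later statistics.
    """
    counts_a = Counter(reads_a)
    counts_b = Counter(reads_b)
    shared = counts_a.keys() & counts_b.keys()  # sequences present in A and in B
    return {seq: (counts_a[seq], counts_b[seq]) for seq in shared}

# Toy data: "ACC" has 4 reads in A and 7 in B, as in the figure.
# "CCG" occurs only in branch A; its count of 2 is a made-up value.
branch_a = ["ACC"] * 4 + ["CCG"] * 2
branch_b = ["ACC"] * 7

kept = ampliconduo_filter(branch_a, branch_b)
print(kept)  # {'ACC': (4, 7)}
```

Returning the pair of counts rather than a single merged count mirrors the last sentence of the caption, where the read numbers of both branches are retained.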

Fig 2.

Primer construct and amplification products.

The primers are composed of sequences specific to the sequencing platform (green): the P5 adaptor and the Illumina primer 1 for the forward primer, and the P7 adaptor and the Illumina primer 2 for the reverse primer. These are followed downstream by a sample identifier starting with a poly-N (red) region and then by the custom-defined primer (blue). In the reverse primer construct, the sample identifier is replaced by a poly-N region.

Fig 3.

Discordance plot showing significant deviations of eukaryote read numbers between split samples.

For each sample S, an individual panel shows the pairs of read numbers (r_iAS, r_iBS), on logarithmic scales, of the unique sequences i in the PCR branches X ∈ {A, B}. Red points mark sequences whose r_iAS and r_iBS deviate significantly (false discovery rate q ≤ 0.05); black points mark sequences without significant deviation (q > 0.05).
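
To make the q ≤ 0.05 criterion concrete, the sketch below converts per-sequence p-values into false discovery rates with the Benjamini-Hochberg procedure. Two assumptions are made here: the caption does not say which test produces the p-values or which FDR procedure is used, so the p-values are taken as given input and Benjamini-Hochberg is chosen only as a common default. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the order of the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p-values, one per unique sequence i of a sample.
p = [0.01, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)

# True corresponds to a red point (q <= 0.05), False to a black point.
is_discordant = [qi <= 0.05 for qi in q]
print(is_discordant)
```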

Table 1.

Discordance measures for eukaryotic samples.

Table 2.

Discordance measures for prokaryotic samples.

Fig 4.

Effect of AmpliconDuo filter on spectrum of read numbers for eukaryotic data.

Columns A and B are the experimental branches of the split sample; rows correspond to sampling sites. The numbers of sequences before and after AmpliconDuo filtering are plotted as black and orange dots, respectively. Both axes have logarithmic scales.

Fig 5.

Distribution of probability p_art of artificial random mutations.

Each dot corresponds to one p_art value computed for one experimental branch A or B according to Eq (6). In the plot, p_art values are binned in intervals of 1/30 of their total range. Eukaryotes and metazoans (first two columns) were both analyzed with the same single-read protocol, and the mean p_art values of these two groups are not significantly different. The prokaryotic samples, which were analyzed with a paired-end protocol, show a higher p_art.

Fig 6.

Effect of AmpliconDuo filtering on apparent eukaryote community similarities.

Comparison of samples with respect to Jaccard distances d_kl, Eq (5), between sequence abundance vectors. Left panel: Sequences clustered at 100% identity. Right panel: Sequences clustered at 100% identity and excluding sequences observed in only one branch of a split sample (AmpliconDuo filter).
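
As a rough illustration of the distance used in this comparison, the sketch below computes a presence/absence Jaccard distance between two abundance vectors. Eq (5) is not reproduced on this page, so the binary form is an assumption and the authors' definition may instead weight sequences by abundance. The function name, the dictionary representation, and the toy counts are hypothetical.

```python
def jaccard_distance(abund_k, abund_l):
    """Presence/absence Jaccard distance between samples k and l.

    Each argument maps a sequence identifier to its read count.
    """
    present_k = {seq for seq, n in abund_k.items() if n > 0}
    present_l = {seq for seq, n in abund_l.items() if n > 0}
    union = present_k | present_l
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(present_k & present_l) / len(union)

# Toy abundance vectors for two samples.
sample_k = {"s1": 5, "s2": 1, "s3": 0}
sample_l = {"s1": 2, "s4": 3}
d_kl = jaccard_distance(sample_k, sample_l)
print(round(d_kl, 3))  # 0.667
```

In this toy case one of three observed sequences is shared, so the distance is 1 - 1/3. Dropping sequences that occur in only one of the samples would lower it, which is the kind of shift the two panels compare.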

Fig 7.

Taxonomic composition of eukaryotic communities before (top) and after (bottom) AmpliconDuo filtering.

In the bog soil sample, many archaean taxa were captured by the broad eukaryotic primers used in this study. Archaea were therefore not discarded from the bog soil sample for this community comparison.

Fig 8.

Effect of AmpliconDuo filtering on chimeras for prokaryotic sample Pro2.

Chimeras are defined as sequences recognized by UCHIME in de novo mode with a score ≥ 1. Top: Frequency of chimeras in branches A and B of the split sample as a function of their read numbers, before (dashed lines) and after (solid lines) application of the AmpliconDuo filter. Bottom: Fraction of chimeras passing the AmpliconDuo filter (f_filtered/f_unfiltered) for read numbers 1 to 20 in both branches A and B, and the corresponding prediction P(r_iA, r_iB ≥ 1) using the Poisson model in Eq (8) with λ_i = 1, 2, …, 20.
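
The Poisson prediction in the bottom panel can be approximated as follows. Eq (8) is not reproduced on this page, so this sketch assumes the simplest reading: the read numbers of sequence i in branches A and B are independent Poisson variables with the same mean λ_i, which gives P(r_iA ≥ 1, r_iB ≥ 1) = (1 - e^(-λ_i))^2. The authors' equation may be parametrized differently, and the function name is hypothetical.

```python
import math

def prob_pass_filter(lam):
    """P(at least one read in A and in B) for independent Poisson(lam) counts."""
    p_at_least_one = 1.0 - math.exp(-lam)  # 1 - P(zero reads) in one branch
    return p_at_least_one ** 2             # both branches, assumed independent

# Predicted pass probabilities for lambda_i = 1, 2, ..., 20.
predictions = {lam: prob_pass_filter(lam) for lam in range(1, 21)}
for lam in (1, 2, 5, 20):
    print(lam, round(predictions[lam], 3))
```

Under this assumption a sequence with an expected single read per branch passes the filter only about 40% of the time, while the probability approaches 1 as λ_i grows.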
