A Direct PCR Approach to Accelerate Analyses of Human-Associated Microbial Communities

Since the composition of the human microbiome is highly variable both within and between individuals, researchers are increasingly reliant on high-throughput molecular approaches to identify linkages between the composition of these communities and human health. While new sequencing technologies have made it increasingly feasible to analyze large numbers of human-associated samples, the extraction of DNA from samples often remains a bottleneck in the process. Here we tested a direct PCR approach using the Extract-N-Amp Plant PCR Kit to accelerate the 16S rRNA gene-based analyses of human-associated bacterial communities, directly comparing this method to a more commonly-used approach whereby DNA is first extracted and purified from samples using a series of steps prior to PCR amplification. We used both approaches on replicate samples collected from each of five body habitats (tongue surface, feces, forehead skin, underarm skin, and forearm skin) from four individuals. With the exception of the tongue samples, there were few significant differences in the estimates of taxon richness or phylogenetic diversity obtained using the two approaches. Perhaps more importantly, there were no significant differences between the methods in their ability resolve body habitat differences or inter-individual differences in bacterial community composition and the estimates of the relative abundances of individual taxa were nearly identical with the two methods. Overall, the two methods gave very similar results and the direct PCR approach is clearly advantageous for many studies exploring the diversity and composition of human-associated bacterial communities given that large numbers of samples can be processed far more quickly and efficiently.


Introduction
There is growing recognition that the human body harbors diverse communities of microbes and that the composition of these microbial (mostly bacterial) communities can have important effects on human health [1][2][3]. Shifts in the composition of bacterial communities found within the mouth, skin, and the gut are often associated with intra-and inter-individual variation in immune system function, resistance to opportunistic pathogens, tissue development, and metabolism [4,5]. Research into the human microbiome and its influence on human health has long been hindered by the difficulties associated with characterizing the composition and structure of the bacterial communities. Individual samples typically harbor highly diverse bacterial communities consisting of hundreds, if not thousands, of taxa, most of these taxa can only be identified via DNA or RNA analyses and the bacterial communities found within individual body habitats are highly variable between individuals and within individuals over time [6][7][8][9][10][11]. Therefore, effectively quantifying the inter-and intraindividual variability in human-associated bacterial communities requires the analysis of a large number of samples using highthroughput molecular approaches.
The most commonly-used approach to determine the phylogenetic and taxonomic structure of any microbial community, including those associated with humans, is to extract and purify DNA from samples (e.g. skin swabs or fecal samples), PCR amplify the 16S rRNA gene (or a region of that gene), sequence the resulting amplicons, and then analyze the resulting sequence data. Large numbers of samples can now be analyzed in this manner given that PCR setup can readily be automated using liquidhandling robots and the increasing availability of next-generation DNA sequencing technologies [12,13] that allow thousands of samples to be sequenced simultaneously at a very low per-sample cost [14]. Currently, the DNA extraction step is often the bottleneck in this process because most extraction protocols utilize different chemical and/or physical lysis procedures and DNA cleanup steps, many of which are very time consuming even when conducted with liquid-handling robots. Although several recent studies have evaluated a variety of DNA extraction methods to determine the most suitable for human microbiome studies [15][16][17], these studies were primarily concerned with the impact of different DNA extractions methods on community structure representation and not with expediting the extraction process.
To assess if we could more efficiently and quickly go from sample to amplified DNA suitable for high-throughput 16S rRNA gene sequencing of human-associated bacterial communities, we tested a commercially available direct PCR kit that was originally designed for extraction and amplification of plant DNA. We analyzed samples collected from human mouths (tongue), feces, and three skin locations (face, forearm and underarm) using this direct PCR approach and compared the sequence data obtained to replicate samples extracted using a more traditional protocol in order to determine if the two approaches yielded comparable information on bacterial community structure. If so, the direct PCR approach could be useful in situations where highthroughput analyses of human-associated bacterial communities are currently hindered by the time and effort required to conduct standard DNA extractions.

Ethics statement
Volunteers were made aware of the nature of the experiment and gave written informed consent in accordance with the sampling protocol approved by the University of Colorado Human Research Committee (protocol 0409.13).

Sample collection
Samples were collected from five body habitats (tongue surface, feces, forehead skin, underarm skin, and forearm skin) on three men and one woman at a single point in time. These body habitats were selected as they likely represent a broad range of bacterial community types [6,7,18] and are sites commonly studied by microbiologists. Eight replicate samples were collected per body habitat per individual so we could compare the results obtained via the direct PCR approach with a more standard DNA extraction/ purification approach (n = 4 per approach for each body habitat and individual). Samples were collected using the sterile swabbing method described previously [19] with all eight swabs held together and simultaneously brushed over the skin, tongue, or toilet paper surface. All four individuals were between the ages of 20 and 40, in good health, and had no recent history of antibiotic usage. All 160 swabs (four individuals, five body sites per individual, eight replicate samples per body habitat/individual) were stored at 2206C immediately after collection.

DNA extraction and PCR amplification (standard method)
DNA was extracted from 80 of the 160 swabs using the approach described in detail in [7] and [19]. This approach involves extracting and purifying DNA from the swabs using the commercially available MoBio PowerSoil DNA Isolation Kit (MoBio Laboratories, Carlsbad, CA USA, catalog #12955), a kit that is widely used in human microbiome research [6,7] as it consistently yields high quality, PCR-amplifiable DNA from even low biomass samples (like skin). Although numerous kits and extraction procedures have been used in human microbiome studies, all involve the basic steps of cell lysis and DNA purification prior to PCR amplification. Therefore, for the purpose of this study, we chose the MoBio PowerSoil DNA Isolation Kit to represent a ''standard extraction/purification approach,'' as this is what we commonly use in our laboratory and what has been adopted by such large-scale projects as The Earth Microbiome Project (http://www.earthmicrobiome.org/emp-standardprotocols/). This extraction approach involves mechanical lysis, chemical lysis, and DNA purification in a series of 32 steps following the manufacturers' instructions. All 80 samples and 16 negative controls consisting of both reagent blanks and sterile swabs were extracted in a single 96-well plate which requires approximately 6-8 h to process a full plate. For the skin and tongue samples, DNA was extracted directly from individual swab tips placed into the respective wells. For the fecal samples, the four replicate swab tips per individual were placed into a sterile, DNA-  Science, Branford, CT, USA) and a 'GT' linker sequence. The 806r primer incorporated a 12-bp error-correcting barcode sequence unique to each individual sample, the Roche 454-B FLX sequencing adapter and a 'GG' linker sequence. Samples were amplified in triplicate with 2 mL of forward and reverse primers (5 mM each), 10 mL of 5Prime MasterMix (5 PRIME Inc., Gaithersburg, MD, USA), 1 mL 5Prime magnesium solution and 1 mL of DNA in a total volume of 25 mL with the following cycling

Direct PCR amplification
The other 80 samples and negative controls (again consisting of reagent blanks and sterile swabs), were prepared for sequencing using a direct PCR approach. Tongue and skin swab tips were placed directly into wells in a 2 mL 96-well Deep Well plate (Axygen Inc., Union City, CA, USA, catalog #P-DW-20-C-S-IND) along with the appropriate negative control samples. The Axygen plate was chosen for its separate well walls, which offered more uniform heating of samples as opposed to plates with shared well walls. The four replicate fecal samples from each individual were processed as described above with 25 mL of fecal slurry placed into each of four replicate wells in the 96-well plate. The plate was processed using the Extract-N-Amp Plant PCR kit (Sigma-Aldrich, Inc.) following the manufacturers' instructions except for adjusting the volume of reagents as detailed below. Wells containing fecal slurry received 100 mL of Extract-N-Amp Plant Extraction solution (catalog #E7526) and wells containing the swab tips (all negative controls, tongue, and skin samples) received 250 mL of this solution. The plate was then sealed securely with a 96 round well Impermamat Silicon Sealing Mat (Axygen Inc., Union City, CA, USA, catalog #AM-2ML-RD-IMP), heated in a water bath at 90-956C for 10 minutes, followed by centrifugation for 5 min at 2500 xg. Extract-N-Amp Plant Dilution solution (catalog #D5688) was added to the wells at a 1:1 ratio to the extraction solution and mixed gently by pipetting. The plate was resealed with the mat and stored at 46C overnight. PCR was conducted in 20 mL triplicate reactions per sample using 10 mL of Extract-N-Amp Ready Mix (catalog #E3004), 1 mL of the forward and reverse primers (the same primers described above), 5 mL of PCR-grade water, and 4 mL of the Extract-N-Amp sample solutions from the 96-well plate. PCR cycling conditions were identical to those described above, as were the amplicon processing, pooling, cleaning, and sequencing methods. A flow chart comparing the two extraction protocols is shown in Figure 1.

Sequence and statistical analyses
All sequences were processed and sorted using the default parameters in QIIME [21]. Briefly, high-quality sequences (.200 bp in length, quality score .25, exact match to barcode and primer, and containing no ambiguous characters) were clustered de novo into operational taxonomic units (OTUs) at 97% sequence identity using UCLUST [22]. Representative sequences of each OTU were then aligned against the Greengenes core set [23] using PyNAST [24] and assigned taxonomy with the RDP-classifier [25]. Aligned sequences were filtered using the Lane-mask and used to generate a phylogenetic tree with FastTree [26]. Average taxonomic composition was determined from replicate samples of each method. Figure 2. Average phylogenetic diversity observed for the communities of each body habitat of each individual using the direct PCR (dark grey) and standard extraction/purification (light grey) protocols. Bars with asterisks denote comparisons that were statistically significant within an individual (t-test, one asterisks p#0.05, two asterisks p#0.01). Side brackets with asterisks denote comparisons that were statistically significant across all individuals (paired t-test, p#0.01). Error bars are 6 one standard deviation. doi:10.1371/journal.pone.0044563.g002 Most (13/16) of the negative control samples associated with the direct PCR approach produced sufficient amplicon yields for sequencing. To correct for this, OTUs constituting greater than 1% of the total negative control sequences were removed from all samples prior to rarefaction and all downstream analyses. In total, 16 OTUs (out of 9,970) were removed with 75% of the sequences (4,915/6,521) belonging to the Gammaproteobacteria. After removal of these negative control OTUs, quality filtering of sequences, and rarefaction to 400 sequences per sample, 142 of the 160 test samples remained (Table S1). With the direct PCR protocol, none of the four replicates from the underarm of Individual C passed quality control. This was the only set of replicate samples that did not yield data with either of the approaches used.
To evaluate the suitability of the direct PCR protocol and to compare the direct PCR results to the results obtained using the extraction/purification approach, a variety of alpha and beta diversity metrics (both phylogenetic and taxonomic metrics) were calculated from the resulting sequence data. Briefly, alpha diversity is defined here as the number of taxa (OTU richness) or lineages (phylogenetic diversity, PD [27]) found in individual samples. Beta diversity is the net difference between any pair of communities: the number and abundances of taxa shared between samples (measured here using the Bray-Curtis distance metric) or the proportion of lineages shared between samples (unweighted Unifrac distance [28]). All metrics were calculated for 400 randomly selected sequences from each sample. For the alphadiversity metrics, the average value for each sample was determined from 50 resampling events of 400 sequences per sample. Replicate samples were then averaged for each method and tested for differences between methods using a Student's t-test (within an individual) and a paired t-test (across individuals). For the beta-diversity metrics, the unweighted UniFrac distance matrix was exported from QIIME and imported into PRIMER v6 where principal coordinate analysis (PCoA) and analysis of similarity (ANOSIM) were used to visualize and statistically compare the communities observed with the different methods [29]. Bray-Curtis dissimilarities were calculated in PRIMER v6 using a normalized and square root transformed OTU table generated in QIIME. PCoA and ANOSIM were also performed on the Bray-Curtis dissimilarity matrix. Pairwise UniFrac distances and Bray-Curtis dissimilarities values for replicate samples from each individual were averaged and tested for significance using a Student's t-test to evaluate the variability between replicates within a method. All statistical tests, except ANOSIM, were performed in the R software package [30].
In addition to testing for differences in alpha and beta diversity between the two methods, we also quantified how the methods compared in their estimation of the relative abundances of specific bacterial taxa. For this, the family-level taxonomy of replicate samples from each body habitat of each individual was determined and abundances of each taxonomic group were compared between methods within individuals using a Student's t-test. Only taxonomic groups appearing in at least three pairs of samples were included in the analysis. Although dozens of taxonomic groups   (Table 1).

Alpha diversity
Alpha diversity levels clearly varied across body habitats and across individuals, regardless of the metric employed (Figures 2 and S1), patterns that have been discussed in detail in other studies [7,9,18,31]. Here we focus on comparisons of the estimated alpha diversity levels between the two methods, the direct PCR approach and the DNA extraction/purification approach, to determine whether they yielded similar results. With the exception of the tongue samples, there were few significant differences in the estimates of either PD levels ( Figure 2) or OTU richness ( Figure  S1) between the two approaches. Across the four individuals, bacterial diversity on the tongue was consistently underestimated using the direct PCR approach. For the other body habitats, the two approaches yielded no consistent differences in estimated diversity across individuals, but some estimates of diversity within body habitats of specific individuals were significantly different between the two approaches, with the direct PCR approach often yielding higher estimates of diversity on skin sites than the more traditional approach. Alpha diversity estimates do appear sensitive to the method used, but the patterns across body habitats and, in most cases, the relative differences in diversity levels across individuals for a given body habitat, were consistent regardless of the method employed. We do not know how the differences between the two approaches observed here compare to the magnitude of the differences in alpha diversity estimates that are known to arise when different PCR primers [14,32,33], DNA extraction techniques [15][16][17], or data processing strategies [34][35][36] are employed. Alpha diversity estimates can clearly be affected by the methods employed and it is important to keep methods as consistent as possible when comparing estimates across sample sets or datasets.

Beta diversity
Regardless of the extraction method used, the bacterial communities primarily clustered into three groups: fecal, tongue and skin ( Figure S2). The dominant taxa found in these body habitats are described in Figure 3, and the relative abundances of these groups are what we would expect based on the large number of studies that have examined human-associated bacterial communities in these body habitats (reviewed in [8,37]. Within a given body habitat, communities clustered by individual (Figure 4, S3) regardless of the method employed, and there were no statistically significant differences between methods across individuals (p.0.01 for each body habitats; see Figures 4 and S3 for ANOSIM statistics of each body habitat). These patterns were evident when beta diversity was quantified using either a phylogenetic metric (Figure 4) or a taxonomic metric ( Figure S3). These results highlight that the inter-individual differences in bacterial community composition are typically large regardless of the body habitat in question, a pattern that has been noted previously [7,8,19]. The differences in bacterial communities between body habitats and between individuals for a given body habitat were consistent across the two methods. If the goal of a project is to resolve inter-habitat or inter-individual differences in bacterial community composition, either method should suffice and mixing datasets obtained using the two different methods could be justified given that there were no statistically significant differences between the methods at this level of inquiry.
When we restrict our analyses to the beta diversity patterns across replicate samples collected from specific body habitats of individuals, we were able to detect significant differences between the methods for nearly all of the fecal samples, some of the tongue samples, and a few of the skin samples (Tables 2 and 3). Other studies have also observed beta diversity differences between replicate samples extracted with different DNA extraction protocols [15,16]. The taxa driving the differences we observed between the methods are detailed in Table 1. In the gut samples, the direct PCR approach often underestimated the abundances of Ruminococcaceae and overestimated Veillonellaceae relative to the extraction/purification approach. Likewise, in the tongue samples the relative abundance of Prevotellaceae was, in some individuals, estimated to be higher with the direct PCR approach than the more traditional method. Although we do not know which approach is more accurate (i.e. which approach provides estimates of taxon abundances that more closely reflect true abundances), the two approaches do yield different estimates of some taxon abundances, differences that could be due to the direct PCR approach failing to release DNA from certain cell types. If the goal is to assess the intra-individual variability in bacterial communities within a given body habitat (e.g. time series studies), either method would suffice, but it is important not to mix the methods as the two methods do not yield identical estimates of taxon abundances, particularly in the fecal and tongue samples.
One important aspect of high-throughput sequencing is the variability in results obtained from replicate samples. Samples that are identical and are processed in an identical manner should yield very similar sequence data, but previous work has demonstrated that this is often not the case [38]. To quantify variability between replicate samples with each approach, pairwise phylogenetic distance (UniFrac) and taxonomic dissimilarity (Bray-Curtis) values were compared between replicate samples collected from the same body site and individual. With the unweighted UniFrac metric, the levels of variability between replicates were similar  between the two methods with only a handful of replicate sample sets exhibiting significantly different levels of variability ( Figure 5). Perhaps more importantly, one approach did not consistently yield higher variability between replicates than another, in some cases the DNA extraction/purification approach actually resulted in greater variability between replicate samples. Similar patterns were observed using Bray-Curtis dissimilarities although the direct PCR communities from the fecal and tongue samples were more variable across individuals ( Figure S4). Regardless of the approach used, replicate samples never yielded identical sequence data (a point that should be considered when designing studies) and this variability between replicates was not consistently higher with one approach than the other.

Conclusion
The direct approach is far faster -going from sample to amplicons suitable for sequencing was at least 6 to 8 times faster with the direct PCR approach than the DNA extraction/ purification approach (Figure 1). Although there are likely DNA extraction methods that are faster than the MoBio extraction kit used here, we know of no stand-alone technique for DNA extraction that yields PCR-amplifiable DNA from such a high number of human-associated samples as quickly as the direct PCR approach used here. The direct PCR approach is also likely cheaper given the considerable labor savings. For studies where it is critical to process a large number of human-associated samples quickly and efficiently for PCR-based bacterial community analyses, the direct PCR approach described here is clearly a useful tool given that the results are, in most cases, directly comparable to the results obtained using the more time consuming standard approach.
Of course, the direct PCR approach does have its limitations that render it unsuitable (or at least less suitable) for certain applications and study designs. The direct PCR approach is likely not appropriate for shotgun metagenomic analyses and, if a given sample is to be used for multiple PCR-based assays, it would probably be useful to extract DNA using a standard approach. However, we have successfully conducted multiple PCR analyses off of Extract-N-Amp dilution solutions stored at 4uC for up to 20 weeks (data not shown), but this is not likely a long term storage option. If DNA needs to be stored for extended periods of time for downstream analyses, the more standard extraction approach would likely be preferable. Also, we do not know how the direct PCR approach would work for other types of PCR-based analyses (e.g. quantitative PCR analyses targeting specific pathogens). Nevertheless, as studies are increasingly moving to very large sample numbers and amplicon sequencing becomes ever faster and cheaper, the direct PCR approach is clearly a valuable option for high-throughput studies examining the structure and diversity of human-associated bacterial communities. Figure S1 Average number of OTUs observed for each body habitat of each individual using the direct PCR Figure 5. Variation of unweighted UniFrac distances of replicate samples using the direct PCR (dark grey) and standard extraction/purification (light grey) protocols. Bars with asterisks denote comparisons that were statistically significant within an individual (t-test, one asterisks p#0.05, two asterisks p#0.01). No statistical differences between methods across individuals were observed for any body habitats (paired t-test, p.0.05). Error bars are 6 one standard deviation. doi:10.1371/journal.pone.0044563.g005 (dark grey) and standard extraction/purification (light grey) protocols. Bars with asterisks denote comparisons that were statistically significant within an individual (t-test, one asterisks p#0.05, two asterisks p#0.01). Side brackets with asterisks denote comparisons that were statistically significant across all individuals (paired t-test, p#0.01). Error bars are 6 one standard deviation. (TIF) Figure S2 PCoA plots illustrating differences in community composition across body habitats (a and c) and no differences based on which protocol was used across individuals (b and d).  Figure S3 PCoA plots derived from Bray-Curtis dissimilarities comparing the communities observed using the direct PCR (black circles) and standard extraction/ purification (grey circles) protocols. Letters A-D denote individual participants. Results of the ANOSIM testing for statistical differences between methods across individuals are shown for each body habitat. Note that individual C was not included in the underarm analysis as sequences were not obtained for the direct PCR samples. (TIF) Figure S4 Variation of Bray-Curtis dissimilarities of replicate samples using the direct PCR (dark grey) and standard extraction/purification (light grey) protocols. Bars with asterisks denote comparisons that were statistically significant within an individual (t-test, one asterisks p#0.05, two asterisks p#0.01). Only communities from the tongue showed differences between methods across individuals (paired t-test, p#0.05). Error bars are 6 one standard deviation. (TIF)

Supporting Information
Table S1 Samples used to evaluate the suitability of the direct PCR protocol for use in high-throughput 16S rRNA gene surveys of the human microbiome. Note that the before subtraction column refers to the number of sequences before removal of negative control OTUs present as 1% or greater of total negative control sequences. For all downstream analyses, samples were rarefied to 400 sequences per sample using sequences after subtraction. (DOC)