Authentication of Herbal Supplements Using Next-Generation Sequencing

Background DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. Results All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven–by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Conclusion Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product. Interpretation of results should involve an interdisciplinary approach taking into account the processes involved in production of herbal supplements, as well as biocomplexity of plant-plant and plant-fungal biological interactions.

testing and the resulting data interpretation. Most recent studies on DNA-based taxonomic authentication of herbal supplements [21,23,26] utilize Sanger sequencing which is negatively affected by primer bias, even when multiple extractions and serial dilutions are used to reduce preferential amplification of a non-target template, especially when DNA of the target species is degraded.
DNA degradation may happen during the production of standardized extracts. This process involves multiple stages [8] and uses a series of chemical and/or physical treatments to isolate the secondary metabolites of a medicinal plant, while selectively removing toxic or unwanted substances from the final products [27]. In addition to these methods, solid-state fermentation with various microorganisms is often used to improve the yield of active medicinal compounds [28][29][30][31]. These treatments often result in partial or complete degradation of plant DNA and often lead to contamination from foreign DNA sources [25].
An additional challenge for identification of DNA origin in plant materials is due to ecological interactions among species. Plants interact with a wide variety of organisms, including bacteria, viruses and fungi. Among these, plant-fungal relationships have key impact on plants through complex symbiotic, parasitic and pathogenic interactions [32]. Many potentially toxigenic fungi co-exist with plants as important endophytic and/or mycorrhizal symbionts [33][34][35][36][37]. A study of incidence and toxigenic capacity of fungi in Argentinian medicinal herbs [38] highlighted the need for standard procedures to assess acceptability limits for fungal contamination. As a result, DNA-based authentication protocols for such products must be sensitive enough to detect DNA template in mixed samples with varying concentration and levels of degradation.
Next-Generation Sequencing (NGS) offers several key advantages over Sanger sequencing: massive parallelization of sequencing reactions, clonal separation of templates, regardless of their relative concentration, superior sensitivity, and faster turnaround time [39,40]. This makes NGS the preferred method for analyzing samples with varying levels of DNA degradation, derived from multiple species, containing fillers, or contaminants. NGS was shown to be an effective and cost-efficient way to authenticate highly processed Traditional Chinese Medicine (TCM) products and to assist in monitoring their compliance with legal codes and safety regulations [41]. Despite this, there are relatively few studies utilizing this approach in authenticating herbal supplements.
Two recent comprehensive reviews on DNA-based authentication of botanicals [42,43] highlighted NGS as a prospective way to verify listed ingredients in herbal medicines and to detect adulteration. In particular, Countinho Moraes et al. [43] indicated that targeted enrichment [44,45] and whole chloroplast sequencing [46][47][48][49] have great potential to resolve closely related plant species. However, the complexity of bioinformatics and laboratory workflows can limit their applicability for diagnostic applications. Both reviews concluded that amplicon metabarcoding (NGS-facilitated DNA barcoding) can become a standardized tool for authentication of herbal supplements.
Considering the above mentioned advantages of NGS and the potential limitations of Sanger sequencing, there is urgency in establishing high-resolution standard NGS workflow with a simple bioinformatics pipeline for DNA-based taxonomic authentication of NHP's, complementary to existing quality control procedures.
Our study aimed to provide the first comprehensive evaluation of the performance of NGSbased DNA barcoding in authenticating the taxonomic provenance of NHP's, using select herbal supplements as an example. Specific goals were to: 2. compare the relative efficiency and reliability of NGS vs. Sanger sequencing when authenticating NHP's, particularly, heavily processed herbal supplements with degraded DNA; 3. identify the methodological challenges of detecting and discerning the DNA of the source plant(s), filler ingredients and contaminants of herbal supplements; 4. characterize fungal species frequently found in common herbal supplements and their potential effects on the results DNA-based authentication; 5. evaluate the overall applicability of DNA-based diagnostic approaches and their complementarity with the existing industry standards for quality control and authentication of NHP's, such as HPLC-MS.

Material
To establish a reference barcode library for five medicinal plant species, we used herbarium specimens from the Royal Ontario Museum (TRT Green Plant Herbarium) and several freshly collected specimens which were deposited in the collection of the Biodiversity Institute of Ontario, University of Guelph. Voucher and sequence data for the corresponding specimens are available in the public BOLD dataset: [DS-RLMPCCDB]: http://dx.doi.org/10.5883/ DS-RLMPCCDB. Material was selected in a way to cover a wide range of morphological origins (dried aerial parts, seeds, and roots) and forms of preparation (plant extracts, extracts combined with raw plant material, material with and without fillers). Five species of medicinal plants were chosen for testing: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Fifteen herbal supplements, three from each of the above species from three different manufacturers, were purchased in local pharmacies and health food stores (Table 1). To minimise the number of factors that could contribute to PCR inhibition, only oil-free tablets or gelatine capsules were selected for testing.

DNA extraction
Prior to DNA extraction, each gelatine capsule was opened and the contents were transferred into 2 ml tubes (three capsules or tablets per supplement). Gloves were changed after each sampling event and special care was taken to decontaminate all working surfaces and tubes from the airborne powder using ELIMINase1 (Decon Laboratories, Inc., King of Prussia, USA), DI water and ethanol.
According to the experiment setup, each of 15 supplements had six replicates per each of the three lysis buffers (three lysates and their corresponding 10× dilutions), resulting in a total of 18 replicates per supplement.
DNA extraction followed the previously published protocol for isolation of total genomic DNA from plants [50,51], with minor modifications to deal with powdered material, as described below. A volume equal to displacement of~75-100 μl of fluid was transferred from the powdered material using pre-cut 1 ml sterile tips into a screw-top Matrix A grinding vial (MP Biomedicals, LLC, Santa Ana, USA) containing 1 ml of the corresponding lysis buffer: CTAB [52], ILB [53], or WHITL [54]. Tissue was ground in the presence of the lysis buffer at 28 Hz for 1 min using TissueLyzer II (Qiagen GmbH, Hilden, Germany) and tubes were incubated at 65°C for 2 hours. Following incubation, tubes were centrifuged for 1 min at 11,000×g. Lysates and their corresponding 10× dilutions were assembled into a 96-tube rack (12×8). 50 μl of each lysate were transferred to a new 96-well plate and extracted as described in [50,51] using a 1 ml Acroprep™ 96-well plate with 1 μm glass fiber media (Pall Life Sciences, Port Washington, USA) with two WB wash stages. DNA was eluted in 50 μl of 10 mM Tris-HCl pH 8.0. Each plate contained six blank wells filled with the corresponding lysis buffer used as an internal negative extraction control. Additionally, 50 μl of each of the three lysis buffers were dispensed into 32 wells of the 96-well plate and the entire plate was extracted as described above. This plate was used as a background contamination control.

Internal amplification control
An Internal Amplification Control (IAC) was introduced to distinguish DNA degradation from PCR inhibition. To prepare the IAC plasmid, a 658 bp fragment of COI gene was amplified from lepidopteran DNA (Crinodes ritsemae) using LepF1-LepR1 [55] primers as described Hebert et al. [56]. The resulting product was cloned into the pCR™4-TOPO1Vector using the TOPO1 TA Cloning Kit for Sequencing with One Shot1 TOP10 Chemically Competent E. coli (Invitrogen, Thermo Fisher Scientific, Waltham, USA), according to manufacturer's instructions. Resulting plasmid DNA was used as a positive internal control in PCR reactions. To find the optimal concentration of IAC, serial dilution (1 ng/μl, 0.1 ng/μl, 0.01 ng/μl, 0.001 ng/μl) was prepared from a 100 ng/μl control plasmid stock and amplified with LepF1-LepR1 [55] in the presence of 0.25 ng/ μl of Brassica oleracea DNA. We used the lowest IAC plasmid concentration that allowed the reliable amplification of both templates in a gradient of annealing temperatures from 45 to 56°C. A short (~150 bp) region of rbcL was amplified in the presence of an IAC from supplements extracted with CTAB buffer. A volume of 1 μl from 0.1 ng/μl plasmid dilution and Lep-F1-LepR1 primers were added to a PCR master mix containing rbcLaF and MrbcL 163R primers ( Table 2). The PCR followed standard protocols for plant rbcL amplification [51].

ITS2 -Sanger and NGS workflow
The PCR with ITS3/ITS_S2F/ITS4 primers [59,60] followed Fazekas et al. [51]. Thermocycling consisted of two rounds to minimize PCR bias which may be caused by fusion primers with MID tags [62]. The first round PCRs were performed at three annealing temperatures to minimize PCR bias further. The first round with regular (target) primers started with an initial denaturation at 94°C for 2 min, followed by 30 cycles of 94°C for 30 s, annealing at 51°C, 53°C and 56°C for 30 s, and 72°C for 1 min, with a final extension at 72°C for 5 min. First round PCR products from each annealing temperature were unidirectionally sequenced with the ITS4 primer using the Sanger sequencing workflow, as described above.
Following the first round, aliquots of PCR products from three annealing temperatures were combined into one plate; diluted by 2× and a volume of 2 μl was transferred to the second round of PCR to create barcoded libraries with fusion primers containing Ion Xpress™ MID tags and Ion Adapters (Table 2). PCR thermocycling for the second round consisted of 94°C for 2 min, 10 cycles of 94°C for 30 s, annealing at 56°C for 30 s, and 72°C for 1 min, with a final extension at 72°C for 5 min. PCR products were visualized on a 2% agarose gel using an E-Gel961 Pre-cast Agarose Electrophoresis System (Invitrogen, Thermo Fisher Scientific). In order to amplify the ITS2 region for Ginkgo biloba, the forward primer was designed based on publicly available Ginkgo sequences for full ITS1, 5.8S and ITS2 regions (GenBank accessions: EU350117.1, EU643829.1). PCR reactions with ITS-S2F-GINK/ITS4 primers followed the same conditions as described above with the exception that only 56°C was used for annealing in first round of PCR.
PCRClean™ DX kit (Aline Biosciences, Woburn, USA), was used for double size selection purification of amplicons to remove any non-specific amplification products and primer dimers. The beads to product ratio 0.5:1 was used for the upper cut and 0.7:1 -for the lower cut. A volume of 70 μl of product was thoroughly mixed with 35 μl of beads and incubated at room temperature for 9 min, followed 2 min of incubation on DynaMag™-2 magnet (Invitrogen, Thermo Fisher Scientific); 100 μl volume of supernatant was transferred to a tube containing 23.3 μl of water and 29.7 μl of beads for the lower cut, thoroughly mixed by pipetting, incubated for 9 min at room temperature and transferred to the magnet for 2 min or until the solution was clear. The resulting supernatant was discarded and beads were washed three times with 80% ethanol (each time the beads were re-suspended by pipetting and then placed on the magnet for ethanol removal). The beads were dried at room temperature (while sitting on the magnet) until completely dry. Purified PCR products were eluted in 36 μl of water; their concentration was measured on the Qubit 2.0 spectrophotometer using Qubit1 dsDNA HS Assay Kit (Invitrogen, Thermo Fisher Scientific). All products were normalized to 1 ng/μl prior to final library dilution (~300×).
Ion PGM™ Template OT2 400 kit was used for template preparation for sequencing as per manufacturer's protocol except for the recommended library dilutions (we reduced the input of PCR product in library dilution to <12.5 pM). The Ion PGM™ 400 sequencing kit and Ion Torrent PGM™ (Ion Torrent, Thermo Fisher Scientific) were used for sequencing according to manufacturer's instructions.

NGS data analysis
Primer sequences were trimmed using Cutadapt (v1.8.1); bases with a quality score less than 20 and reads shorter than 200 bp were removed using Sickle (v1.33); reverse complement was generated using the fast reverse complement function (Fastx Toolkit v0.0.14); resulting reads were clustered into OTUs using Uclust (v1.2.22) with a 2% identity threshold, which was chosen based on reported error rate 1.4-1.5% for Ion Torrent [63][64][65][66] and filters with a minimum read depth of 100; OTUs were identified by using a custom ITS database of plant and fungal ITS2 sequences available in Genbank. Each OTU was identified in Qiime (v1.9.0 using BLAST search with a minimum identity of 90% and a minimum e-value of 0.001) using the following command: assign_taxonomy.py -i cluster.fa -m blast -r /path/to/reference/database.fa -t /path/ to/reference/taxonomy.txt -e 0.001. Corresponding Qiime scripts are also available at http:// qiime.org/scripts/assign_taxonomy.html. To evaluate BLAST-based taxonomic identifications, we calculated the pairwise distance of OTUs generated by Uclust (v.1.2.22) from each supplement to its listed medicinal plant reference sequence in our plant BOLD reference library (boldsystems.org; public dataset: Medicinal Plants-CCDB [DS-RLMPCCDB]), using MEGA ver. 6 [67]. Prior to calculating the distance, all OTUs were aligned against their reference sequence using ClustalW [68], and then checked by eye. All distances were calculated using the Maximum Composite Likelihood model [69] with transitions and transversions included, and uniform rates among sequences. All gaps and indels (insertions and deletions) were excluded; and variance was calculated by using 1000 heuristic bootstrap pseudo-replicates [70]. Distances for each sample were binned into four categories reflecting divergence with the reference sequence. Samples were classified as either identical (0%), low divergence (0-2%), moderate divergence (2-5%), or high divergence (> 5%). Mean pairwise distance for each supplement was also calculated [71] using the same parameters. Taxonomic identification of a cluster was considered robust if it was within 2% of its reference sequence. Any distance beyond 2% was considered a poor taxonomic match.

Ginkgo-chemistry analysis
Samples of Ginkgo products were prepared for HPLC analysis as described in [72] with minor modifications. Ten capsules of each product were combined and pulverized into powder using a mortar and pestle, which were decontaminated with concentrated bleach, DI water, absolute ethanol and UV light before and after each sample. 100 mg of the resulting powder were placed into 20 ml scintillation glass vials; then 20 ml of methanol was added to each sample. Resulting mixtures were sonicated in an ultrasonic bath at 42 kHz at room temperature for 40 min with periodical shaking. A volume of 1 ml was transferred to 1.5 ml tubes and centrifuged for 15 min at 15,000×g; 700 μl of supernatant was applied to Ultrafree MC-GV centrifugal filter units with 0.22 μm Durapore PVDF membrane (EMD Millipore, Merck KGaA, Darmstadt, Germany); and tubes were centrifuged for 2 min at 15,000×g.
Liquid chromatography-mass spectrometry (HPLC-MS) analyses were performed at the Mass Spectrometry Facility of the Advanced Analysis Centre, University of Guelph, using an Agilent 1200 HPLC liquid chromatograph interfaced with an Agilent UHD 6530 Q-Tof mass spectrometer (Agilent Technologies, Santa Clara, USA). A C18 column (Agilent Poroshell 120, EC-C18 50 mm x 3.0 mm 2.7 μm (Agilent Technologies) was used for chromatographic separation with the following solvents: water with 0.1% formic acid (A) and acetonitrile with 0.1%formic acid (B). The mobile phase gradient was as follows: initial conditions were 10% B for 1 min, then increasing to 100% B in 29 min, followed by column wash at 100% B for 5 min and 20 min re-equilibration. The flow rate was maintained at 0.4 ml/min. The mass spectrometer electrospray capillary voltage was maintained at 4.0 kV and the drying gas temperature at 250°C with a flow rate of 8 l/min. Nebulizer pressure was 30 psi and the fragmentor was set to 160 V. Nitrogen was used as both nebulizing and drying gas. The mass-to-charge ratio was scanned across the m/z range of 50-1500 m/z using 2GHz (extended dynamic range) in positive and negative ion modes. The acquisition rate was 2 spectra/s. The instrument was externally calibrated with the ESI TuneMix (Agilent Technologies). The sample injection volume was 10 μl.
Chromatograms were analyzed within Agilent Qualitative Analysis software B 06.0 (Agilent Technologies) finding compounds by the Molecular Feature algorithm using the chemical formulas for quercetin (C 15  ANOVA with a post-hoc analysis was used to establish pairwise statistically significant differences among the supplements.

Marker choice for Sanger and Next-Generation Sequencing
Standard markers to be used for authenticating NHP's have to meet the following criteria: 1) be amplifiable with universal primers; 2) provide good resolution at the species level; 3) have a low instance of homopolymer repeats; and 4) be well represented in available reference sequence databases (NCBI GenBank and BOLD Systems). The length of the marker should not exceed 200-400 bp, in order to allow amplification of degraded DNA and to remain compatible with the read length specifications of the selected NGS platform (e.g., Ion Torrent PGM instrument).
A two-tiered approach was suggested for plant DNA barcoding [19,20]. For the first tier, rbcL was selected, which is the best characterized chloroplast gene with sufficient discriminating power, usually to the genus level [19]. It is easy to amplify and sequence; however, the poor discriminatory ability of rbcL in closely related species limits its utility in detecting ingredient substitution. Furthermore, the amplified region is 552 bp long, which prohibits its use on older Ion Torrent NGS platforms that until very recently had a read length limit of~400 bases. For this reason, we only used rbcL as an indicator of amplification success in Sanger-based authentication of NHP's.
We used ITS2 as a second-tier marker because of its full congruence with the criteria listed above and its prior wide use in plant molecular systematics [73], DNA barcoding [20,60,74,75], and authentication of herbal supplements [11,21]. Region spanning ITS1, 5.8S and ITS2 was also accepted as the standard DNA barcode marker for fungi [76][77][78][79]. We targeted the ITS2 region both for plants and fungi using two forward universal primers: fungal ITS3 and plant ITS_S2F, and ITS4 reverse primer ( Table 2). This approach allows estimating overall plant and fungal diversity while giving relatively high resolution for plant species identification.

DNA degradation vs. PCR inhibition
DNA extraction from plants is known to be challenging due to the presence of polysaccharides and polyphenolic compounds and despite the availability of a wide selection of commercial kits and taxon-specific methods [80][81][82], including high-throughput protocols [50,54]. This is especially true for medicinal plants, valued for their secondary metabolites that can complicate extraction and/or inhibit PCR. Because DNA can be degraded or even absent from supplements containing plant extracts, internal positive controls should be used to determine whether failure to recover PCR products is due to the absence of source DNA in the sample, problems with DNA extraction, or the inability to amplify target DNA. One strategy is to add non-target plasmid DNA as an internal amplification control (IAC-see Methods section) to the same sample tube, in order to co-amplify it with the target sequence. Because IAC has the capacity to detect false negative results caused by PCR inhibitors [83][84][85][86], it was mandated for PCR-based diagnostic applications [84]. We used the non-competitive strategy [85], when the target DNA and the IAC were amplified by different sets of primers. In our study, we encountered all four possible amplification scenarios (Fig 1) that indicate the importance of IAC in detecting false negative results. We propose using IAC as a standard quality control tool for authenticating herbal supplements. Table 3 summarizes the recovery of DNA from listed and non-listed species as the number of replicates producing non-mixed sequencing signal from at least one direction. Overall, rbcL and ITS2 produced consistent results across all tested buffers for almost each supplement. Reliable detection of source DNA was achieved in only four out of 15 samples (Echinacea-1, Gingko-9, Trigonella-13 and 14). Random detection of source DNA with low sequencing success was observed in Valeriana-4 and 5 (roots) and Hypericum-10 (raw herb). Our results showed preferential amplification of filler DNA (soy) both for rbcL and ITS2 in two standardized extracts (Echinacea-2, Ginkgo-7). They also indicated stochastic preferential PCR amplification of non-listed DNA in many cases where raw herb material could have been contaminated by other plants and fungi (Echinacea-3, Valeriana-4 and 5, Ginkgo-7 and 8, and Hypericum-10). Mixed signal observed in the remaining samples was considered a failure.

Sanger sequencing results
Simultaneous amplification of multiple DNA sources renders Sanger sequencing results non-interpretable, while the preferential amplification of one or another DNA source results in biased identification outcome. Such results should be interpreted with great caution and indicate a strong need for NGS-based methods.

NGS: listed medicinal plant and listed filler DNA
The NGS workflow using the ITS2 region enabled detection of the key ingredient DNA in 8 out of 15 tested supplements: five raw herb materials and three standardized extracts (Fig 2A). The number of reads for raw herb material (Echinacea-1, Ginkgo-9, Hypericum-10, Trigonella-13 and 14) was markedly higher (21,000-262,500 reads) compared to 135-875 reads for standardized extracts (Echinacea-2, Hypericum-11 and 12). The ILB buffer produced the most consistent results across all preparation forms, especially for standardized extracts.
The quality of read clusters for five supplements with reliable coverage among three lysis buffers was evaluated by comparing the pairwise distance of OTUs from each supplement to its reference sequence in our plant supplement BOLD reference library (Fig 2B). With the exception of Hypericum-10, extracted with CTAB and ILB showing <0-2% distance, the majority of the reads fell into the 0% category, indicating acceptable quality of OTUs used for identification.

Apieae 1
Valeriana-5 The five raw herbal samples that produced high quality ITS2 OTUs with NGS also generated consistent results with rbcL and ITS2, confirming that these samples contain high-quality source DNA. Thus, the NGS approach demonstrated reliable capacity to detect source DNA in supplements that failed to produce consistent results with Sanger sequencing (Hypericum-11). The use of ITS2 as a reference marker for two of the species studied was hampered by two factors: primer specificity (Ginkgo biloba) and intraspecific variability (Valeriana officinalis). Our attempt to sequence ITS2 for G. biloba using standard ITS primers (ITS_S2F and ITS4, Table 2) failed. The design of a new taxon-specific forward primer (ITS-S2F-GINK) solved this problem and resulted in successful authentication of G. biloba in supplement Ginkgo-9, an extract containing dried leaves. Creating the BOLD reference library for V. officinalis turned out to be the most challenging. Specimens from different locations produced Sanger sequences for ITS2 with extremely high intraspecific variation (>15%) and the presence of multiple indels. V. officinalis is known to have variable genome sizes and ploidy levels between different populations varying from 2x to 8x [87,88], suggesting a possible explanation for the sources of the observed intraspecific diversity. Another possible explanation is polymorphism, paralogy, or pseudogenes reported for nrDNA of some angiosperms [89]. As well, our results are concordant with the recent study of Palhares et al. [11], who found similar difficulties while sequencing Valeriana root samples. Therefore, use of standard threshold parameters to compare the reads from Valeriana supplements with our BOLD reference library was not possible. A potential way to overcome the shortfalls of using ITS2 in Valeriana-containing herbal supplements would be to select a different DNA marker with less intraspecific variability. These examples illustrate that while standard protocols are preferred for authentication of herbal supplements, it may be impossible to provide a universal solution for authentication of all herbal remedies.
Of the five supplements containing listed fillers, both instances of soy filler were confirmed with Sanger and NGS; while only one out of three instances of rice was confirmed with NGS only (Table 1).

HPLC-MS analysis-Ginkgo supplements
DNA authentication failed to detect source DNA in all Ginkgo supplements containing standardized extracts. In order to verify the presence of active medicinal compounds, all Ginkgo supplements were also analyzed with HPLC-MS. All 10 active compounds reported for Ginkgo were detected in each of the samples, although their HPLC-MS profiles differed in intensities (Fig 3). Three replicates per each Ginkgo supplement were tested and the resulting values were mostly normally distributed, with very few exceptions. We used one-way ANOVA with a posthoc analysis to establish pairwise statistically significant differences among the supplements. As the result, only three components for the three supplements showed statistically significant (p<0.05) differences in relative peak heights. Both standardized extracts (Ginkgo-7 and Ginkgo-8) had higher peaks for bilobalide and rutin, while Ginkgo-8 had lower peak for kaempferol.

NGS: Non-listed plant DNA
All tested supplements contained non-listed, non-filler plant DNA (Fig 4). The most diverse assembly of plant species was observed in Hypericum-10 supplement, followed by Ginkgo-9 and all Valeriana supplements. Overall, the highest number of non-listed, non-filler plant species was detected in raw herb material supplements: roots (Valeriana), aerial parts (Hypericum), and leaves (Ginkgo)-the same samples where target DNA was also detected. All plant extract supplements had lower counts of non-listed plant species. For example, lowest counts were observed in Ginkgo extracts (two in Ginkgo-7 and one in Ginkgo-8), thus confirming that isolation of secondary plant metabolites during the manufacturing of standardized extracts leads to DNA degradation or loss. Trigonella seed supplements were the only samples that provided a high yield of target DNA, while showing a remarkably low count of non-listed species: three in Trigonella-13 and two in Trigonella-14.
Recent metagenomic studies of clinical samples [90,91] suggest that laboratory background contamination check must be performed with each experiment, so that the contaminants occurring in commercial and in-house prepared reagents could be subtracted from the final results. By running multiple negative controls with all NGS and Sanger sequencing experiments, we detected several common plant and fungal contaminants which were presumed to be in-laboratory contaminants and thus filtered from the final results. The origins of the remaining non-listed plant and fungal DNA cannot be traced with certainty, although we may speculate about the most likely causes.
Many non-listed plants detected in this study are common weed species (mostly from Fabaceae, Asteraceae and Poaceae), suggesting primary trace contamination during bulk harvesting of plant aerial parts. Aside non-selective harvesting, medicinal plants may be 'naturally' contaminated with root excretions, sap and pollen from neighbouring plants. Alien pollen can be transferred as a result of inconstant (generalist) foraging by pollinators [92][93][94][95][96][97][98][99][100]. Root excretions play an important role in pathogen defence mechanisms and are known to contain extracellular DNA [101,102].
Secondary contaminants may be further inadvertently introduced during manufacturing or storage. We detected the presence of other medicinal herbs and species, such as Matricaria chamomilla, Tribulus terrestris, Rhodiola crenulata, Senna alexandrina, Allium spp., and Coriandrum sativum. Plant powders can become airborne or carried over during encapsulation, if the same equipment is used in production of different supplements.
To summarize, NGS analysis suggests that, aside from intended or non-intended substitution, possible cross-contamination with trace plant DNA can occur at any stage during growing, harvesting, manufacturing, handling or laboratory analysis of plant material (Fig 5).  Detection of such non-target DNA is not always indicative of deliberate adulteration; and such results should be interpreted with caution, especially when legal ramifications are considered.

Biocomplexity of plant-fungal interactions
NGS analysis detected the presence of fungal DNA in 14 out of 15 tested supplements. The overall fungal count in our study included 111 genera representing 49 families of Ascomycota, followed by 20 genera from 16 families of Basidiomycota and nine genera from 9 families incertae sedis. The top five fungal families dominating sequence reads were Pleosporaceae, Nectriaceae, Aspergillaceae, Leptosphaeriaceae and Phaeosphaeriaceae.
The lowest fungal species count was observed in two Trigonella seed supplements (one species detected in Trigonella-13 and none in Trigonella-14, Fig 4), which may be explained by antifungal properties of fenugreek [103,104]. Additional two supplements displaying low fungal species counts were standardized Ginkgo extracts (Ginkgo-7 and 8), which likely underwent multiple purification stages of active components, in order to remove toxic ginkgolic acids under controlled conditions [27].
Fungi and pathogenic bacteria are often found in spices and herbs [111][112][113][114]. Tournas et al. [115] studied the microbiology of ginseng supplements from the North American market and detected fungal contamination (including potentially toxigenic mold species) in most ginseng supplements, except extracts. A study of incidence and toxigenic capacity of fungal strains (Aspergillus, Penicillium, and Fusarium) in Argentinian medicinal herbs [38] highlighted the need for standard procedures to assess acceptability limits for fungal contamination. Many potentially toxigenic fungi co-exist with plants as important endophytic and/or mycorrhizal symbionts [33][34][35][36][37]. Recent NGS-based studies have reconstructed complex symbiotic network architectures; for example, an assemblage of 33 plant species in a temperate forest in Japan was found to be associated with 387 functionally and phylogenetically diverse fungal taxa [116]. Many endophytic fungi participate in the production of their host's secondary metabolites and could be used as bio-producers of valuable medicinal components [33,108,[117][118][119][120][121][122][123]. The production of plant extracts often involves fungal and/or bacterial fermentation to improve yields of bioactive components [8,28,30,31,124]. Diversity of fungi in herbal supplements will be determined by a biocomplexity of plant-fungal interactions, molds proliferated during storage, and strains involved in fermentation, therefore interpretation of test results should focus on potential mycotoxin-producing fungi and human pathogens.

Conclusions
1. The NGS workflow developed in this study enables simultaneous detection of plant and fungal DNA. This protocol can be utilized by manufacturers for screening of potential mycotoxin-producing and pathogenic fungi, for quality assurance of raw plant materials, contamination control during the production process, and for assessing the purity of the final product.
2. Sanger sequencing should not be used for testing herbal supplements, due to its inability to resolve mixed signal from samples containing multiple species. NGS-based approaches are far more superior, enabling reliable and effective detection of DNA in complex mixtures.
3. Aside from intended or non-intended substitution, cross-contamination with non-target plant DNA may occur at any stage during growing, harvesting, manufacturing, handling or laboratory analysis of plant material. NGS-based methods would detect such traces, in addition to target DNA. By contrast, when the contaminant template is preferentially amplified, Sanger sequencing may detect only contaminant DNA, leading to biased and misleading outcomes.
4. Diversity of fungi in herbal supplements will be determined by a combination of pathogenic, endophytic and mycorrhizal fungi naturally associated with live plant material, saprophytic fungi proliferated during drying and storage, and strains involved in the fermentation during manufacturing of bioactive components. Although this entire spectrum would be easily detected by NGS methods, interpretation of test results should focus on potential mycotoxin-producing fungi and human pathogens.
5. Quality control of herbal supplements should utilize a synergetic approach targeting both bioactive components and DNA, especially for standardized extracts with potentially degraded DNA.