Skip to main content
Advertisement
Browse Subject Areas
?

Click through the PLOS taxonomy to find articles in your field.

For more information about PLOS Subject Areas, click here.

  • Loading metrics

Validation of the shotgun metabarcoding approach for comprehensively identifying herbal products containing plant, fungal, and animal ingredients

  • Zhaolei Zhang,

    Roles Formal analysis, Writing – original draft

    Affiliation Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China

  • Weishan Mu,

    Roles Writing – review & editing

    Affiliation Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China

  • Weijun Kong,

    Roles Writing – review & editing

    Affiliation Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China

  • Jiali Liu,

    Roles Writing – review & editing

    Affiliation Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China

  • Jingyi Zhao,

    Roles Writing – review & editing

    Affiliation Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China

  • Qing Zhao,

    Roles Methodology

    Affiliation Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China

  • Mengmeng Shi,

    Roles Formal analysis

    Affiliation Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China

  • Hongye Zhao,

    Roles Methodology

    Affiliation Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China

  • Jinxin Liu ,

    Roles Project administration, Writing – review & editing

    liujx_23@163.com (JL); linchun_shi@163.com (LS)

    Affiliations Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China

  • Linchun Shi

    Roles Writing – review & editing

    liujx_23@163.com (JL); linchun_shi@163.com (LS)

    Affiliations Hebei Key Laboratory of Study and Exploitation of Chinese Medicine, Chengde Medical University, Chengde, China, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China

Abstract

Identifying plant, fungal, and animal ingredients in a specific mixture remains challenging during the limitation of PCR amplification and low specificity of traditional methods. Genomic DNA was extracted from mock and pharmaceutical samples. Four type of DNA barcodes were generated from shotgun sequencing dataset with the help of a local bioinformatic pipeline. Taxa of each barcode was assigned by blast to TCM-BOL, BOLD, and GenBank. Traditional methods including microscopy, thin layer chromatography (TLC), and high-performance liquid chromatography (HPLC) were carried out according to Chinese pharmacopoeia. On average, 6.8 Gb shotgun reads were sequenced from genomic DNA of each sample. Then, 97, 11, 10, 14, and one operational taxonomic unit (OTU) were generated for ITS2, psbA-trnH, rbcL, matK, and COI, respectively. All the labeled ingredients including eight plant, one fungal, and one animal species were successfully detected in both the mock and pharmaceutical samples, in which Chebulae Fructus, Poria, and Fritilariae Thunbergia Bulbus were identified via mapping reads to organelle genomes. In addition, four unlabeled plant species were detected from pharmaceutical samples, while 30 genera of fungi, such as Schwanniomyces, Diaporthe, Fusarium were detected from mock and pharmaceutical samples. Furthermore, the microscopic, TLC, and HPLC analysis were all in accordance with the standards stipulated by Chinese Pharmacopoeia. This study indicated that shotgun metabarcoding could simultaneously identified plant, fungal, and animal ingredients in herbal products, which has the ability to serve as a valuable complement to traditional methods.

Introduction

Identifying plant, fungal, and animal ingredients in a specific mixture remains challenging during the limitation of PCR amplification and low specificity of traditional methods. A combination of DNA metabarcoding, DNA barcoding, and high-throughput sequencing technology has been widely used in ecology [13], food science [4, 5], and other fields since first proposed in 2012 [6]. However, the polymerase chain reaction (PCR) amplification process is prone to bias and chimera generation, affecting subsequent data analysis [7]. To eliminate the influence of PCR bias or failure in some taxon groups, Liu et al. proposed a new strategy known as "shotgun metabarcoding," which directly enriched and assembled multiple DNA barcodes via shotgun sequencing data [8]. Shotgun metabarcoding successfully detected all the prescribed ingredients of Wuhu San in mock samples, while the feasibility of this method was verified by testing commercially available samples. Furthermore, shotgun metabarcoding was applied for the quality control of Qingguo Wan, where 103, 12, 10, and 12 OTUs were obtained for ITS2, psbA-trnH, matK, and rbcL, respectively, in mock and commercial samples [9]. All eight medicinal materials of Qingguo Wan were successfully authenticated, while fungi belonging to 20 genera in 19 families were detected in addition to the labeled ingredients. Moreover, 52.16 Gb of shotgun sequencing reads were generated for Fuke Desheng Wan, while 228, 23, and 14 operational taxonomic units (OTUs) were obtained for ITS2, matK, and rbcL, respectively [10]. Six labeled ingredients from the Apiaceae, Paeoniaceae, Asteraceae and Lamiaceae families were simultaneously detected, while 25 weed species representing 16 genera in ten families were revealed by the high-throughput sequencing data. Shotgun metabarcoding displays the potential to be used as a genetic identification method to simultaneously authenticate labeled ingredients, contaminants, and substitutes in traditional patented medicines.

Tiedi Wan, a classic prescription for treating chronic pharyngitis with a long history, was first recorded in the Ming Dynasty under "throat paralysis" in Volume 6 of Dr. Gong Tingxian’s "Shoushi Baoyuan"(寿世保元) [11]. According to the Pharmacopoeia of the People’s Republic of China (simplified as Chinese Pharmacopoeia, 2020 edition), Tiedi Wan consists of eight plant, one fungal, and one animal ingredient, i.e., Ophiopogonis Radix (Maidong), Platycodonis Radix (Jiegeng), Canarii Fructus (Qingguo), Scrophulariae Radix (Xuanshen), Fritilariae Thunbergia Bulbus (Zhebeimu), Trichosanthis Pericarpium (Gualoupi), Poria (Fuling), Glycyrrhiza Radix et Rhizoma (Gancao), Membrana Follicularis Ovi (Fenghuangyi), and Chebulae Fructus (Hezirou) [12]. The quality control methods of Tiedi Wan listed in the Chinese Pharmacopoeia inc lude the microscopic identification of the seven ingredients, thin layer chromatography (TLC) authentication of gallic acid and Peimine, and the content determination of glycyrrhizic acid via high-performance liquid chromatography (HPLC) [12]. In addition, glycyrrhetinic acid [13], platycodon saponin D, beriberiin A, and beriberiin B are reportedly quality markers of Tiedi Wan [14]. In another development, a novel electro-chemical detector was used for the detection of chemical components in Chinese medicine [15]. Compared to traditional methods, electrochemical sensors may be another superior option due to their low cost, high sensitivity and selectivity.

This study combined shotgun metabarcoding with microscopy, TLC, and HPLC to authenticate the biological ingredients in Tiedi Wan. Mock samples were prepared according to the 2020 edition of the Chinese Pharmacopoeia and used to test the feasibility of the method for identifying the labeled ingredients of Tiedi Wan. Then, commercially available pharmaceutical samples of Tiedi Wan were collected to simulate the process of quality detection to verify whether shotgun metabarcoding can be used as a complement to the traditional method.

Materials and methods

Herbal materials, mock samples, and pharmaceutical samples

The 10 medicinal materials (Fig 1) for preparing the mock samples were purchased from Tongrentang Pharmacy. The authenticity of these medicinal materials was verified via morphological and DNA barcoding technology. Two mock samples, namely HSZY162 and HSZY175, were prepared according to the prescriptions delineated in the Chinese Pharmacopoeia (S1 Table). The formulae for HSZY162 and HSZY175 were exactly the same except for Panacis Quinquefolii Radix, which was in the same proportion as Poria and was added to HSZY175 as a positive control for monitoring the experimental process (S2 Table). In addition, three batches of pharmaceutical samples were collected from the same manufacturer to simulate the behavior during market detection. The three batches of samples were labeled HSZY056, HSZY143, and HSZY144, respectively.

thumbnail
Fig 1. Morphological characteristics of the ten herbal materials of Tiedi Wan.

(A) Ophiopogonis Radix, (B) Platycodonis Radix, (C) Canarii Fructus, (D) Scrophulariae Radix, (E) Fritilariae Thunbergia Bulbus, (F) Trichosanthis Pericarpium, (G) Poria, (H) Glycyrrhiza Radix Et Rhizoma, (I) Membrana Follicularis Ovi, (J) Chebulae Fructus.

https://doi.org/10.1371/journal.pone.0286069.g001

DNA extraction, PCR amplification, Sanger sequencing, and Illumina sequencing

The DNA of the herbal materials and medicinal products was extracted according to previously reported methods by Liu et al [8, 9, 16]. The DNA quality of the herbal materials was estimated using a NanoDrop one ultra-micro spectrophotometer (Thermo Fisher Scientific Inc., USA). The DNA of the medicinal products was specifically quantified using a Qubit 4.0 Fluorometer (Thermo Fisher Scientific Inc., USA), followed by shearing to prepare a PCR-free library of 350 bp. The five traditional DNA barcoding regions, namely ITS2, psbA-trnH, rbcL, matK, and COI, were amplified using primer sets, ITS2F/ITS3R [17], psbA/trnH [18], 1F/724R [19], 3F_LIM/1R_KIM [20], and LCO1490/HCO2198 [21], and then bi-directionally sequenced using an ABI 3730 XL DNA Analyzer (Thermo Fisher Scientific Inc., USA). High throughput sequencing was conducted using the Illumina NovaSeq 6000 System.

Data analysis

The ITS2, psbA-trnH, matK, rbcL and COI, primers were removed using CodonCode Aligner v9.0.1 to obtain the traditional DNA barcoding regions. All the DNA barcoding sequences were submitted and deposited in the GenBank database.

Adapter sequences and low-quality bases were gently clipped using Trimmomatic v0.38 [22]. The paired-end reads belonging to the ITS2, psbA-trnH, matK, rbcL, and COI, sequences were enriched via comparison with the GenBank public database. The enriched reads were assembled using MEGAHIT v1.2.9 and MetaSPAdes v3.13.0 [23, 24]. The DNA barcoding regions of psbA-trnH, matK, rbcL and COI, were obtained by removing the above primer sequences using cutadapt v3.8 [25], while the ITS2 regions were annotated using the hidden Markov model (HMM)-based method [26]. The OTUs of the DNA barcodes were generated using Usearch v11 (https://www.drive5.com/usearch/) based upon a 100% sequence similarity threshold, while the representative sequence for each OTU was selected for further analysis. Next, the shotgun reads were mapped to each OTU using bowtie2 v2.3.4.3 [27] to calculate the OTU sequencing depths and coverage. Low reliable OTUs were removed according to previously reported thresholds [8]. The taxonomic assignment for each OTU was performed using the basic local alignment search tool (BLAST) according to the DNA barcoding system for traditional Chinese medicines (TCM-BOL), the barcode of life data system (BOLD), and GenBank on the basis of the lowest common ancestor (LCA) strategies provided by MEGAN v6.20.12 [28].

Chemical analysis

For confirmation, the chemical tests for the pharmaceutical samples, including microscopic identification, TLC, and HPLC, were conducted according to the Chinese Pharmacopoeia. The pharmaceutical Tiedi Wan sample powder was observed under a biological microscope. Gallic acid and Peimine were used as thin-layer chromatographic identification controls to detect the presence of these two components in three commercially available samples according to the method delineated by the Chinese Pharmacopoeia. The TLC chromatograms of the sample and control show spots of the same color at corresponding positions. The glycyrrhetinic acid content was used as a test standard during HPLC. The chromatographic system for HPLC detection was as follows: an Agilent C18 chromatographic column (4.6 × 250 mm, 5 μm) was used as the stationary phase, while a mixture of methanol (A) and 0.2 mol/L Ammonium acetate solution (B) (64:36) was used as the mobile phase. The flow rate of the mobile phase was 1 ml/min, and the column temperature was set to 35 °C. The detector consisted of a diode array detector (DAD), and the spectrophotometer was set to 250 nm. The injection volume was 10 μl.

Results

Authentication of the medicinal materials for creating the mock samples using DNA barcoding and macroscopic identification methods

In addition to traditional macroscopic identification methods, DNA barcoding was used for the species confirmation of the ten ingredients. Six, seven, three, and five ITS2, psbA-trnH, matK, and rbcL barcodes were successfully obtained for the plant and fungal ingredients using universal primers, while no DNA barcodes were acquired for Chebulae Fructus and Poria. The COI, barcodes were obtained for the animal ingredient, Membrana Follicularis Ovi, verifying it as Gallus gallus domesticus. The ITS2, psbA-trnH, matK, and rbcL barcodes of the positive control, Panacis Quinquefolii Radix, have also been amplified and confirmed as Panax quinquefolius. In addition, the corresponding DNA barcodes of failed PCR were downloaded from previously published papers [16, 17]. The GenBank accession numbers of DNA barcodes obtained in this study and downloaded from GenBank are listed in Table 1.

thumbnail
Table 1. GenBank accession numbers of the medicinal materials for making mock samples.

https://doi.org/10.1371/journal.pone.0286069.t001

Summary of the shotgun meta-sequencing and data analysis

A total of 34.08 Gb of clean data were generated for the five samples via high-throughput sequencing using the Illumina NovaSeq platform, yielding 101,861, 754,010, 9,317, 16,322, and 8,113 paired-end reads for ITS2, psbA-trnH, matK, rbcL, and COI, respectively. Furthermore, 11,392 unique contigs were generated via the MEGAHIT and MetaSPAdes assemblers. Annotation of the DNA barcoding regions yielded 243, 35, 26, 28, and 2 ITS2, psbA-trnH, matK, rbcL, and COI, barcodes, respectively. Finally, 133 high-quality OTUs were obtained with the similarity of 100%. The average length of ITS2, psbA-trnH, matK, rbcL, and COI, were 203bp, 374bp, 881bp, 702bp, and 658 bp. A summary of the shotgun meta-sequencing and data analysis results can be found in Table 2 and S3 Table.

thumbnail
Table 2. Summary of the data analysis of shotgun meta-sequencing data.

https://doi.org/10.1371/journal.pone.0286069.t002

The DNA barcodes of the medicinal materials detected in the mock samples

Here, 15, 9, 8, and 10 OTUs were generated for ITS2, psbA-trnH, matK, and rbcL in the two mock samples (HSZY162 and HSZY174) created in the laboratory. The species assignment of these OTUs showed that the shotgun meta-sequencing approach detected all the prescribed ingredients except Chebulae Fructus and Poria (Table 3).

thumbnail
Table 3. High throughput sequencing (HTS) detection of the DNA barcode sequences of prescribed herbal materials in two TDW samples.

https://doi.org/10.1371/journal.pone.0286069.t003

The ITS2 sequences of Glycyrrhiza Radix et Rhizoma in the ITS2 region displayed three OTUs, while Scrophulariae Radix had three, and Trichosanthis Pericarpium had six. The identification of OTUs showed that the ITS2 sequence of Scrophulariae Radix clustered 2 OTUs, 1 of which differed by 1 base from the reference sequence obtained in Section 3.1 using the Sanger sequencing method. Three OTUs were obtained for Glycyrrhiza Radix et, with 0–4 base differences compared to the reference sequence. The ITS2 sequence of Trichosanthis Pericarpium had 6 OTUs, of which 1 OTU was consistent with the reference sequence and the other 5 OTUs differed from the reference sequence by 8 base sites. All the other prescribed ingredients obtained one type of OTU, respectively, which was the same as the sequences generated via Sanger sequencing. For the psbA-trnH, matK, and rbcL barcodes, the OTUs obtained via shotgun meta-sequencing were identical or exhibited a one-base difference compared with the sequences of corresponding species obtained via Sanger sequencing. Only one type of COI, OTU was clustered in the two mock samples, while the species were identified as Gallus gallus domesticus. In addition, the four barcodes of Panacis Quinquefolii Radix, as a positive control for the experiment, were all successfully re-assembled and exactly the same as the sequences obtained in Section 3.1. An overview of the DNA barcodes belonging to the prescribed ingredients is shown in Fig 2.

thumbnail
Fig 2. The number of OTUs and base differences detected in mock samples of Tiedi Wan groups of species.

The size of the bubbles indicates the number of OTUs obtained and the colour of the bubbles indicates base differences. Darker colours indicate more base differences.

https://doi.org/10.1371/journal.pone.0286069.g002

The plant and animal ingredients detected in the commercial pharmaceutical samples

The labeled plant ingredients in the prescription, as well as the ITS2, psbA-trnH, matK, and rbcL barcodes, were all successfully detected in the pharmaceutical samples. ITS2 sequence detection identified the six labeled ingredients in all the samples, while Chebulae Fructus and Poria were detected as partial sequences in one or two samples, and none of the samples contained the ITS2 sequence for Fritilariae Thunbergia Bulbus (Fig 3, S4 Table). For the psbA-trnH and rbcL barcodes, Trichosanthis Pericarpium, Scrophulariae Radix, and Glycyrrhiza Radix et Rhizoma were detected in all pharmaceutical samples (S1 and S2 Figs, S5 and S6 Tables). Moreover, the matK barcodes of Canarii Fructus, Platycodonis Radix, Trichosanthis Pericarpium, and Glycyrrhiza Radix et Rhizoma were present in all the pharmaceutical samples (S3 Fig, S7 Table). In addition, shotgun sequencing also detected several plant source ingredients not labeled in the prescription, such as Cullen corylifolium, Limonium gmelinii, Lablab purpureus, Toona sinensis, Alhagi sparsifolia and Zea mays, while the number of reads belonging to the unlabeled plant ingredients was 1.37% that of the labeled plant ingredients. Finally, one type of COI, sequence was detected in all the pharmaceutical samples, which was identified as Gallus gallus domesticus, referring to the legal original species of Membrana Follicularis Ovi.

thumbnail
Fig 3. The ITS2 sequences of three commercially available samples were clustered and analysed using MEGAN6.

Each taxonomic the node is drawn as a pie chart indicating the proportion of each species in the taxon for each sample.

https://doi.org/10.1371/journal.pone.0286069.g003

Organelle genome mapping for Poria and Chebulae Fructus that cannot be stably detected

In this study, the completed DNA barcodes of Poria and Chebulae Fructus could only be detected in HSZY144 and HSZY056. By mapping their ITS2 sequences, only partial Poria and Chebulae Fructus sequences could be obtained, while a few reads were mapped in other samples.

When mapping reads to the whole mitochondrial genome of Poria, the nad1, orf5, nad2, COX1, atp8, and orf30 regions of the mitochondrial genome were mapped by reads in HSZY056 and HZSZY143, while only the orf5, and COX1 regions were covered in HSZY162 and HSZY174. The COX1 region was mapped by 20 reads in sample HSZY162, while most of the reads were distributed in the region between primer LCO1490 and HCO2198 (S4 Fig). A fewer number of reads were evident in the orf5 region than in the COX1 region and were similar in all four samples.

For the whole chloroplast genome of Chebulae Fructus, 80478, 11840, 138276, 2294458 paired-end reads were mapped in 27 genes and 18 intergenic regions in HSZY143, HSZY144, HSZY162 and HSZY174, respectively. The psbA, ndhK, ndhB, ycf2, and rpl2 genes could be mapped with reads in all the samples. The psbA, ndhK, ndhB, ycf2, and rpl2 gene was mapped with relatively more reads in the four samples, and the psbA and rpl2 regions had higher coverage and sequencing depth in these gene. The psbA and rpl2 regions were mapped by much more reads than other regions in the four sample, with an average coverage of 99.43% and 99.30% respectively. The psbA region of Chebulae Fructus in HSZY162 alone displayed 100% coverage with 1542 reads (S5 Fig). More reads were mapped on the intergenic gene spacers than genes such as rpl36-rps8, rps12-ndhF, and ycf1-rps12.

The fungi detected in the mock and pharmaceutical samples

Fifty-five ITS2 OTUs were identified as fungal belonging to 30 genera of 22 families (S6 Fig, S8 Table). The number of fungal ITS2 reads was 1.62% that of the total ITS2 reads. Schwanniomyces was the most dominant in all the mock and pharmaceutical samples, followed by Diaporthe, Fusarium, Penicillium, and Aspergillus. The number of Schwanniomyces reads was 21.5% of the total number of reads belonging to the fungal community. Aspergillus was detected in both the mock and pharmaceutical samples, with a total of 250 reads. The number of Aspergillus reads in the pharmaceutical samples was 202, which was about four times that of the mock samples. In addition, the number of fungal genera detected in each sample differed, with a minimum of three in HSZY162 and a maximum of 20 in HSZY144. The prescription ingredient Poria is a fungus, of which 46 reads were detected. The ITS2 sequence of Poria was assembled into a complete DNA barcode in HSZY144 and into partial sequences in HSZY056 and HSZY143.

Microscopic identification and chemical analysis of the pharmaceutical samples

Analysis of the sample powders revealed the microscopic features of seven species, including Poria, Ophiopogonis Radix, Scrophulariae Radix, Fritilariae Thunbergia Bulbus, Chebulae Fructus, Platycodonis Radix, and Glycyrrhiza Radix et Rhizoma (Fig 4A). The microscopic features of Poria showed irregularly branched masses and a light brown mycelium. Ophiopogonis Radix presented bundled or scattered calcium oxalate needles of 24–50 μm long and about 3 μm in diameter. Scrophulariae Radix were yellowish-brown stone cells, which were round or irregularly shaped and approximately 94 μm in diameter. Fritilariae Thunbergia Bulbus displayed an ovoid amylum body. The microscopic features of Chebulae Fructus of the chebulo are the yellowish fibrous layer of pericarp. Platycodonis Radix displayed tubes of 14–25 μm in diameter, containing yellowish granular material. Glycyrrhiza Radix et Rhizoma exhibited crystal fibers formed via calcium oxalate square crystals contained in the thin-walled cells surrounding the fiber bundles. The TLC chromatograms of the Tiedi Wan samples with gallic acid and Peimine are shown in Fig 4B. The points in the chromatograms obtained for the test solutions of the three Tiedi Wan samples correspond to the position and color of the spots acquired for the control solution. The three pharmaceutical samples contained 3.06 mg/g, 3.09 mg/g, and 3.05 mg/g of glycyrrhetinic acid as the active ingredient and were detected via HPLC (Fig 4C). The results of the three pharmaceutical samples all met the standards of the Chinese Pharmacopoeia.

thumbnail
Fig 4. Microscopic identification, the TLC identification, and HPLC chromatogram of pharmaceutical samples.

A. Microscopical identification: a. Irregularly branched mass b. hypha c. Calcium oxalate needle crystal d. stone cell e. amylum body f. Fibrous layer of pericarp g. Connecting tube h. Crystal Sheath Fiber. B. Figure a shows the thin layer chromatogram of gallic acid, 1–3 represent the test solutions of HSZY056, HSZY143 and HSZY144 respectively, and 4 represents the control solution of gallic acid. Figure b shows the thin layer chromatogram of Peimine. 5–7 represent the sample solutions of HSZY056, HSZY143 and HSZY144 respectively, and 8 is the control solution of Peimine. C. HPLC chromatogram of pharmaceutical samples. The peak marked in the figure is the chromatographic peak of glycyrrhizic acid.

https://doi.org/10.1371/journal.pone.0286069.g004

Discussion

The challenges of using shotgun metabarcoding to characterize the species in traditional herbal products

The analytical results of the mock samples showed four DNA barcode sequences for most of the herbs. The high throughput data of HSZY174 revealed the presence of the positive control, Panacis Quinquefolii Radix, indicating that this method could be used to detect species with a low dosage and high sensitivity. However, the initial study showed two OTUs in the ITS2 fragment for Ginseng Radix et Rhizoma and Panacis Quinquefolii Radix. A comparison with the standard sequences of Ginseng Radix et Rhizoma and Panacis Quinquefolii Radix revealed that these two OTUs were located at positions 34 and 43, which belonged to the Ginseng Radix et Rhizoma and Panacis Quinquefolii Radix sequences, respectively. The original data was used to map the Panacis Quinquefolii Radix OTU, indicating that the Ginseng Radix et Rhizoma sequence was absent from this data, probably due to an assembly error generated by the assembly software.

The ITS2 region showed an abundance of Glycyrrhiza Radix et Rhizoma, Scrophulariae Radix, and Trichosanthis Pericarpium OTUs with base differences from the DNA barcodes obtained via Sanger sequencing in Section 3.1. The OTU differences obtained via the two sequencing methods may be related to the ITS2 region being a duplicate gene. High-throughput sequencing obtains the genome of a mixture of multiple individuals, with sequences that may be present in multiple cases. Contrarily, Sanger sequencing can only obtain one gene sequence by testing a sample segment at a time.

The ITS2 sequence of Ophiopogonis Radix was detected in HSZY174 but not in HSZY162. After assembly using the original data, some Ophiopogonis Radix ITS2 sequences were also found in HSZY162, probably due to uneven sequencing. The chloroplast genes of Ophiopogonis Radix and Canarii Fructus were not detected in the HSZY143, and HSZY144 pharmaceutical samples, probably since the chloroplast gene sequencing was not uniform and the sequences of these three DNA barcode regions were not obtained. Four barcode sequences were detected in both mock samples but not in the pharmaceutical samples.

Northern genistein was detected during the sequencing data analysis. Scrophulariae Radix is the dried root of the plant Scrophularia ningpoensis Hemsl, which belongs to the genus Scrophularia in the Scrophulariaceae family. There are two species, S. ningpoensis and S. buergeriana (S. oldhami), both of which are similar in form and differ only in inflorescence and flower color [29]. In 1959, S. ningpoensis was used for the first time as the base of Scrophulariae Radix in the first edition of Chinese Medicine and was also known as Zhejiang Scrophulariae Radix. Due to its abundance and good quality, S. ningpoensis has been used as the base of Scrophulariae Radix in all editions of the Pharmacopoeia. In the northern part of China, S. ningpoensis is often used as a substitute for S. buergeriana since they are equal in anti-inflammatory potency and analgesic effect. Therefore, it is feasible to use S. ningpoensis as a substitute for S. buergeriana in clinical practice [30].

Some non-prescription ingredients and fungi were detected in the pharmaceutical samples via ITS2 sequences with a small proportion of reads. In previous tests, sequences of non-prescribed ingredients and fungi were obtained in all analyses of proprietary Chinese medicines by Liu et al. Salix sp., Chenopodium album, Convolvulus arvensis, Citrus sp., and Scutellaria baicalensis were detected as non-prescription ingredients and 24 fungal genera in Wuhu San [8]. Fungal components of 19 families and 20 genera were detected during a study of the Qingguo Wan [9]. A total of 25 weed species and 26 fungal genera belonging to 17 families were detected in both lab-made and commercial Fuke Desheng Wan samples [10]. The detection of non-prescription ingredients may be due to the introduction of other impurities during the production and processing of Chinese medicines [31]. Fungi were detected in all samples, which may be introduced during the production, processing, transportation, and storage of traditional patent medicine [32]. Using a shotgun metabarcoding method to obtain DNA from all species in the sample for analysis, multiple species such as prescription ingredients, non-prescription ingredients and fungi are detected and species detection of mixtures can be accomplished. Shotgun metabarcoding is suitable for the identification of a wide range of unknown herbal raw materials or processed products.

The complementarity of using multiple DNA barcodes for species detection in traditional herbal products

This study used five DNA barcodes to detect the components of traditional herbal medicine, namely, ITS2, psbA-trnH, matK, rbcL, and COI, fragments, with different identification abilities. The ITS2 region can be used as a standard DNA barcode to identify medicinal plants and their relatives for determining a wider range of plant taxa [17, 33]. All the ITS2 sequences obtained in this study were identified accurately and displayed higher identification efficiency compared to the sequences of other fragments.

Chloroplasts are present in green tissues and immature seeds in the form of protoplasts [34]. Chebulae Fructus is the dried ripe fruit of Terminalia chebula Retz. or Terminalia tomentella Retz. var. tomentella Kurt. of the family Angelicaceae and chloroplast genes are not easily detected. Chloroplast genes are absent in some herbal materials, such as fungal herbs like Poria, in which three chloroplast genes could not be detected. The commonly used single DNA barcodes cannot identify all species and need to be combined with other DNA barcodes for identification [35]. The three single-gene barcodes present their own advantages and disadvantages, with matK having a high evolutionary rate, moderate length, and obvious interspecific differentiation [36] but with different identification capabilities for different species [35]. The rbcL could be used to identify the level of family and genus, but not the species [37], while it could be used in combination with other barcodes for accurate identification [38]. The properties of psbA-trnH show that it was more suitable as a universal DNA barcode. Amplification primers were obtained in conserved coding regions. A single primer pair could amplify almost all angiosperms with large sequence variation in the spacer region [39, 40]. The mock sample results indicated that the three chloroplast genes could be used as auxiliary barcodes for identifying herbal components. The Ophiopogonis Radix sequences were not found in the HSZY056, HSZY143, and HSZY144 pharmaceutical samples. The whole chloroplast genome was used to assemble the raw data of these three samples. Although the matK, rbcL, and psbA-trnH sequences were not obtained, a portion of other chloroplast sequences was identified. Using the whole chloroplast genome as DNA barcodes solved the problem of insufficient variation in a single barcode. Species are distinguished by bioinformatics identification, combined with other methods to ensure material quality [41].

Conclusions

The results suggest that the shotgun metabarcoding approach can authenticate all the biological ingredients of Tiedi Wan. This study confirms that shotgun metabarcoding displays potential as a powerful tool to simultaneously authenticate plant, fungal, and animal ingredients in Tiedi Wan, ensuring its quality in combination with other traditional methods.

Supporting information

S1 Table. Medicinal materials collected for making mock samples.

https://doi.org/10.1371/journal.pone.0286069.s001

(DOCX)

S2 Table. The proportion and dosage of herbal materials listed in the prescription of mock and pharmaceutical drug Tiedi Wan according to the Chinese Pharmacopoeia.

https://doi.org/10.1371/journal.pone.0286069.s002

(DOCX)

S3 Table. Sequencing results of the TDW samples.

https://doi.org/10.1371/journal.pone.0286069.s003

(DOCX)

S4 Table. The reads number of the prescription ingredients in the three commercially available samples based on the ITS2 sequences.

https://doi.org/10.1371/journal.pone.0286069.s004

(DOCX)

S5 Table. The reads number of the prescription ingredients in the three commercially available samples based on the psbA-trnH sequences.

https://doi.org/10.1371/journal.pone.0286069.s005

(DOCX)

S6 Table. The reads number of the prescription ingredients in the three commercially available samples based on the rbcL sequences.

https://doi.org/10.1371/journal.pone.0286069.s006

(DOCX)

S7 Table. The reads number of the prescription ingredients in the three commercially available samples based on the matK sequences.

https://doi.org/10.1371/journal.pone.0286069.s007

(DOCX)

S8 Table. The reads number of the fungi in the five samples based on the ITS2 sequences.

https://doi.org/10.1371/journal.pone.0286069.s008

(DOCX)

S1 Fig. The psbA-trnH sequences of three commercially available samples were clustered and analysed using MEGAN6.

The node is drawn as a pie chart indicating the proportion of each species in the taxon for each sample.

https://doi.org/10.1371/journal.pone.0286069.s009

(TIF)

S2 Fig. The rbcL sequences of three commercially available samples were clustered and analysed using MEGAN6.

Each taxonomic the node is drawn as a pie chart indicating the proportion of each species in the taxon for each sample.

https://doi.org/10.1371/journal.pone.0286069.s010

(TIF)

S3 Fig. The matK sequences of three commercially available samples were clustered and analysed using MEGAN6.

Each taxonomic the node is drawn as a pie chart indicating the proportion of each species in the taxon for each sample.

https://doi.org/10.1371/journal.pone.0286069.s011

(TIF)

S4 Fig. Mapping of the COX1 sequence of the mitochondrial genome of Poria in HSZY162.

https://doi.org/10.1371/journal.pone.0286069.s012

(TIF)

S5 Fig. Mapping of the psbA sequence of the chloroplast genome of Chebulae Fructus in HSZY162.

https://doi.org/10.1371/journal.pone.0286069.s013

(TIF)

S6 Fig. Distribution of the fungi for each sample at the genus level.

The data were visualized by Circos. The left half-circle indicates the distribution ratio of species in different samples at the genus level: the outer ribbon represents the species; the inner ribbon represents different groups, and the length represents the sample proportion of a particular genus. The right half-circle indicates the species composition in each sample: the color of the outer ribbon represents samples from different groups; the color of the inner ribbon represents the composition of different species in each sample, and the length of the ribbon represents the relative abundance of the corresponding species.

https://doi.org/10.1371/journal.pone.0286069.s014

(TIF)

References

  1. 1. Zinger L, Taberlet P, Schimann H, Bonin A, Boyer F, De Barba M, et al. Body size determines soil community assembly in a tropical forest. Molecular ecology 2019, 28(3):528–543. pmid:30375061
  2. 2. Del Carmen Gomez Cabrera M, Young JM, Roff G, Staples T, Ortiz JC, Pandolfi JM, et al. Broadening the taxonomic scope of coral reef palaeoecological studies using ancient DNA. Molecular ecology 2019, 28(10):2636–2652. pmid:30723959
  3. 3. Hänfling B, Lawson Handley L, Read DS, Hahn C, Li J, Nichols P, et al. Environmental DNA metabarcoding of lake fish communities reflects long-term data from established survey methods. Molecular ecology 2016, 25(13):3101–3119. pmid:27095076
  4. 4. Dobrovolny S, Blaschitz M, Weinmaier T, Pechatschek J, Cichna-Markl M, Indra A, et al. Development of a DNA metabarcoding method for the identification of fifteen mammalian and six poultry species in food. Food chemistry 2019, 272:354–361. pmid:30309555
  5. 5. Bruno A, Sandionigi A, Agostinetto G, Bernabovi L, Frigerio J, Casiraghi M, et al. Food Tracking Perspective: DNA Metabarcoding to Identify Plant Composition in Complex and Processed Food Products. Genes 2019, 10(3):248. pmid:30934656
  6. 6. Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E. Towards next-generation biodiversity assessment using DNA metabarcoding. Molecular ecology 2012, 21(8):2045–2050. pmid:22486824
  7. 7. Furlan EM, Davis J, Duncan RP. Identifying error and accurately interpreting environmental DNA metabarcoding results: A case study to detect vertebrates at arid zone waterholes. Molecular ecology resources 2020, 20(5):1259–1276. pmid:32310337
  8. 8. Liu J, Mu W, Shi M, Zhao Q, Kong W, Xie H, et al. The Species Identification in Traditional Herbal Patent Medicine, Wuhu San, Based on Shotgun Metabarcoding. Frontiers in pharmacology 2021, 12:607200. pmid:33664667
  9. 9. Liu J, Shi M, Zhao Q, Kong W, Mu W, Xie H, et al. Precise Species Detection in Traditional Herbal Patent Medicine, Qingguo Wan, Using Shotgun Metabarcoding. Frontiers in pharmacology 2021, 12:607210. pmid:33995010
  10. 10. Xie H, Zhao Q, Shi M, Kong W, Mu W, Li B, et al. Biological Ingredient Analysis of Traditional Herbal Patent Medicine Fuke Desheng Wan Using the Shotgun Metabarcoding Approach. Frontiers in pharmacology 2021, 12:607197. pmid:34483893
  11. 11. Liu S. Analysis on the Variation of Tiedi Wan Recipe. Hunan Journal of Traditional Chinese Medicine 2000(04):47–48.
  12. 12. Tnpco China. Pharmacopoeia of the People’s Republic of China Part I. Beijing: China Medical Science Press; 2020.
  13. 13. Meng L, Cao J, Zhang H, Shen S, Du S. Simultaneous determination of four components in Tiedi Wan by HPLC-ELSD. Chinese Journal of Experimental Traditional Medical Formulae 2010, 16(04):35–36.
  14. 14. Zhang L, Wang G. Simultaneous determination of four components in Tiedi Wan by HPLC-ELSD Central South Pharmacy 2020, 18(10):1692–1695.
  15. 15. Wang J, Xu Q, Liu J, Kong W, Shi L. Electrostatic Self-Assembly of MXene on Ruthenium Dioxide-Modified Carbon Cloth for Electrochemical Detection of Kaempferol. Small (Weinheim an der Bergstrasse, Germany) 2023:e2301709. pmid:37093500
  16. 16. Liu J, Shi L, Song J, Sun W, Han J, Liu X, et al. BOKP: A DNA Barcode Reference Library for Monitoring Herbal Drugs in the Korean Pharmacopeia. Frontiers in pharmacology 2017, 8:931. pmid:29326593
  17. 17. Chen S, Yao H, Han J, Liu C, Song J, Shi L, et al. Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species. PloS one 2010, 5(1):e8613. pmid:20062805
  18. 18. Hamilton M. Four primer pairs for the amplification of chloroplast intergenic regions with intraspecific variation. Molecular ecology 1999, 8(0):521–523. pmid:10199016
  19. 19. Fay’ M, Swensen S, Chase M. Taxonomic Affinities of Medusagyne oppositifolia (Medusagynaceae). Kew Bulletin 1997, 52(1):111.
  20. 20. Dunning L, Savolainen V. Broad-scale amplification of matK for DNA barcoding plants, a technical note. Botanical Journal of the Linnean Society 2010, 164(1):1–9.
  21. 21. Folmer O, Black M, Hoeh W, Lutz R, Vrijenhoek R. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Molecular marine biology and biotechnology 1994, 3(5):294–299. pmid:7881515
  22. 22. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England) 2014, 30(15):2114–2120. pmid:24695404
  23. 23. Li D, Luo R, Liu C-M, Leung C-M, Ting H-F, Sadakane K, et al. MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods (San Diego, Calif) 2016, 102:3–11. pmid:27012178
  24. 24. Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome research 2017, 27(5):824–834. pmid:28298430
  25. 25. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. J Embnet Journal 2011, 17(1):10–12.
  26. 26. Keller A, Schleicher T, Schultz J, Müller T, Dandekar T, Wolf M. 5.8S-28S rRNA interaction and HMM-based ITS2 annotation. Gene 2009, 430(1–2):50–57. pmid:19026726
  27. 27. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nature methods 2012, 9(4):357–359. pmid:22388286
  28. 28. Huson DH, Auch AF, Qi J, Schuster SC. MEGAN analysis of metagenomic data. Genome research 2007, 17(3):377–386. pmid:17255551
  29. 29. Ma L, Wang Y, Zhao J, Jin y, Zhan z. Herbal Textual Research on Scrophulariae Radix in Famous Classical Formulas Chinese Journal of Experimental Traditional Medical Formulae 2021:1–18.
  30. 30. Wang M. The comparative study on Scrophularia ningpoensis Hemsl. (Xuan-shen) and Scrophularia buergeriana Miq. (Bei Xuan-shen). master’s thesis. Northwest University; 2010.
  31. 31. Xin T, Xu Z, Jia J, Leon C, Hu S, Lin Y, et al. Biomonitoring for traditional herbal medicinal products using DNA metabarcoding and single molecule, real-time sequencing. Acta pharmaceutica Sinica B 2018, 8(3):488–497. pmid:29881688
  32. 32. Guo M, Jiang W, Yang M, Dou X, Pang X. Characterizing fungal communities in medicinal and edible Cassiae Semen using high-throughput sequencing. International journal of food microbiology 2020, 319:108496. pmid:31911209
  33. 33. Zhang H, Yao H, Cui L, Du H, Lin Z, Gao X, et al. Study on the identification of animal herbs in Chinese Pharmacopoeia based on COI barcode sequence. Modernization of Traditional Chinese Medicine and Materia Medica-World Science and Technology 2013, 15(03):371–380.
  34. 34. Rose RJ. Sustaining Life: Maintaining Chloroplasts and Mitochondria and their Genomes in Plants. The Yale journal of biology and medicine 2019, 92(3):499–510. pmid:31543711
  35. 35. Li X, Yang Y, Henry RJ, Rossetto M, Wang Y, Chen S. Plant DNA barcoding: from gene to genome. Biological reviews of the Cambridge Philosophical Society 2015, 90(1):157–166. pmid:24666563
  36. 36. Selvaraj D, Sarma RK, Sathishkumar R. Phylogenetic analysis of chloroplast matK gene from Zingiberaceae for plant DNA barcoding. Bioinformation 2008, 3(1):24–27. pmid:19052662
  37. 37. Fazekas AJ, Burgess KS, Kesanakurti PR, Graham SW, Newmaster SG, Husband BC, et al. Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PloS one 2008, 3(7):e2802. pmid:18665273
  38. 38. Hollingsworth ML, Andra Clark A, Forrest LL, Richardson J, Pennington RT, Long DG, et al. Selecting barcoding loci for plants: evaluation of seven candidate loci with species-level sampling in three divergent groups of land plants. Molecular ecology resources 2009, 9(2):439–457. pmid:21564673
  39. 39. Shaw J, Lickey EB, Schilling EE, Small RL. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: the tortoise and the hare III. American journal of botany 2007, 94(3):275–288. pmid:21636401
  40. 40. Kress WJ, Erickson DL. A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region. PloS one 2007, 2(6):e508. pmid:17551588
  41. 41. Shuo C, Ke-xuan Z, Meng-tong J, Shu-ying H, Yun-feng Z, Xun L, et al. Collagen derived species-specific peptides for distinguishing donkey-hide gelatin (Asini Corii Colla). Chinese Herbal Medicines (CHM) 2021, 13(2):261–266. pmid:36117504