Chemical and Genetic Discrimination of Cistanches Herba Based on UPLC-QTOF/MS and DNA Barcoding

Cistanches Herba (Rou Cong Rong), known as “Ginseng of the desert”, has a striking curative effect on strength and nourishment, especially in reinforcing the kidney to strengthen yang. However, the two plant origins of Cistanches Herba, Cistanche deserticola and Cistanche tubulosa, differ in pharmacological action and chemical components. To discriminate the plant origins of Cistanches Herba, a combined chemical and genetic approach, using UPLC-QTOF/MS and DNA barcoding, was employed for the first time in this study. Three potential marker compounds (an isomer of campneoside II, cistanoside C, and cistanoside A) that discriminate the two origins were identified by PCA and OPLS-DA. DNA barcoding differentiated the two origins accurately, and the NJ tree showed that the two origins clustered into two separate clades. Our findings demonstrate that the two origins of Cistanches Herba differ in chemical composition and genetic variation. This is the first reported evaluation of the two origins of Cistanches Herba, and the findings will facilitate quality control and clinical application.


Introduction
Cistanches Herba (Rou Cong Rong), known as "Ginseng of the desert", originates from the dried succulent stems of Cistanche deserticola Y.C. Ma and Cistanche tubulosa (Schrenk) Wig according to the Chinese Pharmacopoeia (2010 edition), and is popular for tonifying the kidney-yin, benefiting life essence and relaxing the bowel. Currently, Cistanches Herba is mainly distributed in arid and warm deserts in northwest China, particularly in Xinjiang and Inner Mongolia. However, the two origins of Cistanches Herba differ in their pharmacological activity and chemical components. Tu et al. investigated decoctions of three Cistanche species (C. deserticola, C. tubulosa and Cistanche salsa) and found that C. tubulosa showed the lowest effect in the Yang-deficiency mouse model [1]. Zhang et al. compared the pharmacological activity of C. deserticola, C. tubulosa and C. salsa, and found that these species had medicinal functions such as anti-fatigue and hypoxia tolerance, but not to the same extent [2]. Previous research on the chemical constituents indicated differences in both the components and their contents among the plant origins of Cistanches Herba [3]. As for clinical application and market circulation, C. tubulosa has traditionally been used in Japan as a tonic, as a blood circulation-promoting agent, and in the treatment of impotence, sterility, lumbago and body weakness [4][5][6][7][8].
Consequently, discriminating the two origins of Cistanches Herba is of great significance for quality control and clinical application. However, no research has focused on discriminating the two origins. Many methods, including microscopy, ultraviolet and infrared detection, and the inter-simple sequence repeat method, have been used to identify species of the genus Cistanche, but none specifically targets the two origins [9][10][11][12][13][14][15][16][17][18][19][20].
Here, we combined chemical and molecular techniques, namely UPLC-QTOF/MS (ultra-performance liquid chromatography coupled with quadrupole time-of-flight mass spectrometry) and DNA barcoding, to distinguish the two origins of Cistanches Herba. UPLC-QTOF/MS provides information more rapidly and efficiently than other techniques. Its high selectivity and sensitivity have led to its application in both quantitative and qualitative analyses, as well as in metabolite analysis and the identification of complex compounds in Traditional Chinese Medicine [21][22]. Principal component analysis (PCA) and orthogonal projection to latent structure-discriminant analysis (OPLS-DA) were also applied to identify potential marker compounds. DNA barcoding, an easier and more universal molecular marker technology, uses a DNA fragment to identify species or genera. It is objective, more accurate, and easier to perform than traditional identification methods and other molecular marker technologies. Moreover, DNA barcoding has been successfully applied to identify animals and plants, including medicinal plants [23][24][25][26].
The purpose of this research is to establish a method that combines UPLC-QTOF/MS and DNA barcoding for discriminating the two plant origins of Cistanches Herba.

Ethics statement
We confirm that the field studies did not involve endangered or protected species. GPS coordinates are included in the sample information (Table 1).

Plant materials and reagents
Succulent stems of Cistanches Herba were collected in May 2012 from wild desert regions in Inner Mongolia, Qinghai Province and the Xinjiang Uygur Autonomous Region, People's Republic of China (Table 1). All samples were collected in wild desert regions, not on private land, where no specific permissions were required. The botanical identities of the stems were confirmed by Dr. Linfang Huang. Voucher specimens were deposited at The Institute of Medicinal Plant Development. High-performance liquid chromatography (HPLC)-grade acetonitrile (Merck KGaA, Darmstadt, Germany) and formic acid (Tedia, USA) were used for UPLC analysis. Deionized water was purified using a Milli-Q system (Millipore, Bedford, MA, USA). All other chemicals were of analytical grade.

Sample preparation
Cistanches Herba samples (1.0 g, 65-mesh) were transferred into a 50-mL conical flask, and 50 mL of 70% methanol was added. After soaking for 30 min, ultrasonication (35 kHz) was performed at room temperature for 30 min. After centrifugation at 10,000 rev/min for 10 min, the supernatant was stored at 4 °C and filtered through a 0.22-µm membrane before injection into the UPLC-QTOF/MS system for analysis.

UPLC-QTOF/MS
For UPLC analysis, the following systems/parameters were used: a Waters Acquity system (Waters) equipped with a binary solvent delivery pump, auto-sampler and PDA detector connected to a Waters Empower 2 data station; an ultrasonic instrument (250 W, 50 kHz, Kunshan Ultrasonic Instrument Co., Zhejiang, China); and an electronic analytical balance, model AB135-2 (Mettler-Toledo, Greifensee, Zurich, Switzerland). A Waters Acquity UPLC BEH C18 column (1.7 µm, 2.1 × 100 mm, Waters) and a Waters C18 guard column (same material, Waters) were used and maintained at 30 °C. The mobile phase was 0. The UPLC/MS analysis was performed on a QTOF Synapt G2 HDMS system (Waters, Manchester, UK) equipped with an electrospray ionization (ESI) source operated in the negative-ion mode. N2 was used as the desolvation gas at a flow rate of 800 L/h, with the desolvation temperature set at 450 °C and the source temperature set at 120 °C. The capillary and cone voltages were set to 2500 and 40 V, respectively. Data were collected from 50 to 1200 Da with a 0.1-s scan time and a 0.01-s interscan delay over a 15-min analysis time. Argon was used as the collision gas at a pressure of 7.066 × 10⁻³ Pa. All MS data were collected using the LockSpray system to ensure mass accuracy and reproducibility. The [M-H]⁻ ion of leucine-enkephalin at m/z 554.2615 was used as the lock mass in negative ESI mode.

Data analysis
UPLC-QTOF/MS data for Cistanches Herba samples were analyzed to identify potential discriminant variables. Peak finding, alignment and filtering of ES raw data were carried out using the MarkerLynx applications manager, version 4.1 (Waters, Manchester, UK). The parameters used were as follows: retention time (tR) of 0-15 min, mass of 50-1200 Da, retention time tolerance of 0.02 min, and mass tolerance of 0.02 Da. Three replicate samples collected from each geographic location were used (n = 3). A total of 6,339 variables were used to create the model.
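The tolerance windows above decide when two peaks detected in different runs count as the same tR-m/z variable. The following is a minimal sketch of such a matching rule, not the MarkerLynx algorithm itself, whose internals are not described here. The function names, the greedy grouping strategy and the example peaks are all hypothetical; only the numeric windows and tolerances come from the text.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (retention time in min, m/z in Da)

RT_RANGE = (0.0, 15.0)     # retention time window from the text, min
MZ_RANGE = (50.0, 1200.0)  # mass window from the text, Da
RT_TOL = 0.02              # retention time tolerance from the text, min
MZ_TOL = 0.02              # mass tolerance from the text, Da


def in_window(peak: Peak) -> bool:
    """Keep only peaks inside the acquisition windows."""
    rt, mz = peak
    return RT_RANGE[0] <= rt <= RT_RANGE[1] and MZ_RANGE[0] <= mz <= MZ_RANGE[1]


def same_variable(a: Peak, b: Peak) -> bool:
    """Two peaks match when both tR and m/z agree within tolerance."""
    return abs(a[0] - b[0]) <= RT_TOL and abs(a[1] - b[1]) <= MZ_TOL


def align(runs: List[List[Peak]]) -> List[List[Peak]]:
    """Greedy grouping (an assumed strategy): each peak joins the first
    group whose seed peak it matches, otherwise it seeds a new group."""
    groups: List[List[Peak]] = []
    for run in runs:
        for peak in filter(in_window, run):
            for group in groups:
                if same_variable(group[0], peak):
                    group.append(peak)
                    break
            else:
                groups.append([peak])
    return groups


# Hypothetical peaks for illustration only (not data from this study).
run_a = [(5.10, 623.20), (7.43, 799.27)]
run_b = [(5.11, 623.21), (7.60, 799.27), (16.0, 300.0)]
groups = align([run_a, run_b])
```

Each resulting group corresponds to one column of the variable table passed to the multivariate model; the peak at 16.0 min is dropped because it falls outside the retention time window.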

DNA barcoding: DNA extraction, PCR amplification and sequencing
Samples (30 mg) taken from dried fleshy stems of C. deserticola and C. tubulosa were ground for 2 min at a frequency of 30 r/s. DNA was extracted according to the manufacturer's instructions (Tiangen), with the following modifications: chloroform was replaced with the same volume of a chloroform:isoamyl alcohol mixture (24:1), and buffer GP2 was replaced with the same volume of isopropanol. The ground powder was placed in a 1.5-mL Eppendorf tube, mixed with 700 µL of GP1 preheated to 65 °C and 1 µL of β-mercaptoethanol by vortexing for 10-20 s, and incubated for 60 min at 65 °C. Then 700 µL of chloroform:isoamyl alcohol (24:1) was added and the tube was centrifuged for 5 min at 12,000 rpm (~13,400 × g). The supernatant was pipetted into a new tube, 700 µL of isopropanol was added, and the contents were blended for 15-20 min. The whole mixture was transferred to a CB3 spin column and centrifuged for 40 s at 12,000 rpm. The filtrate was discarded, 500 µL of buffer GD (with the specified amount of anhydrous ethanol added before use) was added, and the column was centrifuged at 12,000 rpm for 40 s. The filtrate was discarded and the membrane was washed with 700 µL of buffer PW (with the specified amount of anhydrous ethanol added before use) by centrifugation for 40 s at 12,000 rpm, followed by a second wash with 500 µL of PW under the same conditions. After the filtrate was discarded, the column was centrifuged for 2 min at 12,000 rpm to remove residual wash buffer PW. The CB3 spin column was then transferred to a clean 1.5-mL Eppendorf tube, dried at room temperature for 3-5 min, and centrifuged for 2 min at 12,000 rpm to obtain the total DNA. Primers for polymerase chain reaction (PCR) were based on sequences reported previously [4,5]. PCR reaction mixtures contained 2 µL of DNA template, 8.5 µL of ddH2O, 12.5 µL of 2× Taq PCR Master Mix (Beijing TransGen Biotech Co., China), and 1 µL each of the forward and reverse (F/R) primers (2.5 µM), in a final volume of 25 µL. PCR amplification was conducted as described by Kress et al. [4].
The PCR primers were fwd PA: GTTATGCATGAACGTAATGCTC (5'-3') and rev TH: CGCGCATGGTGGATTCACAATCC (5'-3'). PCR products were separated and detected by 1% agarose gel electrophoresis. PCR products were purified following the manufacturer's protocol and directly subjected to sequencing.
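The reaction composition and primer sequences above lend themselves to a quick arithmetic check. The sketch below scales the stated per-reaction volumes to a batch and computes basic primer properties. The helper names and the pipetting overage are assumptions made for illustration and are not part of the published protocol.

```python
# Per-reaction volumes in µL, as listed in the text (25 µL final volume).
REACTION_UL = {
    "DNA template": 2.0,
    "ddH2O": 8.5,
    "2x Taq PCR Master Mix": 12.5,
    "forward primer": 1.0,
    "reverse primer": 1.0,
}

# Primer sequences as given in the text, written 5'-3'.
PRIMER_FWD = "GTTATGCATGAACGTAATGCTC"
PRIMER_REV = "CGCGCATGGTGGATTCACAATCC"


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale every component except the template to n reactions.

    `overage` is an assumed pipetting allowance (10% by default); the
    template is left out because it is added to each tube separately.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in REACTION_UL.items()
        if name != "DNA template"
    }


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# The listed components should add up to the stated final volume.
total_volume = sum(REACTION_UL.values())
```

Calling `master_mix(24)` gives the volumes for a 24-sample batch under the assumed overage, and `gc_percent(PRIMER_FWD)` or `gc_percent(PRIMER_REV)` gives the GC content of either primer.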

Sequence alignment and analysis
ITS and ITS2 sequences were collected from the GenBank database. Sequences obtained from the samples were submitted to the GenBank database (accession numbers are listed in Table 1), assembled with CodonCode Aligner 3.7.1 (CodonCode Co., USA) and aligned using ClustalW. Kimura 2-Parameter (K2P) distances and GC content were calculated, and Neighbor-joining (NJ) trees were constructed, using MEGA 5.05 with the bootstrap method (1000 resamplings) and the K2P model [27]. The barcoding gap (the spacer region formed between intra- and inter-specific genetic variations) and the identification efficiency (the identification ability used to compare different barcodes) were plotted and calculated based on the method reported by Meyer and Paulay [28].
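For reference, the K2P distance between two aligned sequences is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below implements that formula directly. It is not MEGA's code, and the handling of gaps and ambiguous bases (skipping those sites pair by pair) is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are
    skipped (pairwise deletion, an assumption of this sketch).
    math.log raises ValueError if the sequences are too divergent for
    the model (non-positive logarithm argument).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Applying `k2p_distance` to every pair of aligned barcode sequences yields the distance matrix on which the barcoding gap and the NJ tree are based.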

PCA of C. deserticola and C. tubulosa
PCA was employed to distinguish samples of different plant species. PCA is an unsupervised multivariate data analysis method that aims to visualize the similarities and/or differences within multivariate data of secondary metabolite composition [37]. The two-component PCA model cumulatively accounted for 46.04% of the variation (PC1, 36.43%; PC2, 9.61%). Figure 2 shows that the 24 samples clustered into two groups in the PCA scores plot according to species origin, indicating that the chemical composition of C. deserticola and C. tubulosa differed significantly.
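A minimal sketch of how PCA scores and the percentage of variation per component can be obtained from a samples-by-variables intensity matrix is given below. It assumes NumPy is available and mean-centers the columns only; the scaling actually used in the study is not stated, and the demonstration matrix is synthetic rather than the study's data.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Return PCA scores and the fraction of variance per component.

    X is a samples-by-variables matrix. Columns are mean-centered only;
    any further scaling (for example Pareto scaling) is left out of
    this sketch.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # SVD of the centered matrix: squared singular values are
    # proportional to the variance captured by each component.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = U[:, :n_components] * S[:n_components]
    return scores, explained[:n_components]


# Synthetic stand-in for a 24-sample data set (NOT the study's data):
# two groups of 12 samples that differ in the first five variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(24, 50))
X_demo[:12, :5] += 5.0
scores_demo, explained_demo = pca_scores(X_demo)
```

Plotting the first column of `scores_demo` against the second gives a scores plot of the kind shown in Figure 2, and `explained_demo` holds the fractions that the text reports as 36.43% and 9.61% for the real data.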

OPLS-DA and marker identification
To identify potential chemical markers for discrimination of the two species, the S-plot of OPLS-DA was generated (Fig. 3). In the S-plot, each point represents one tR-m/z ion pair. The X and Y axes represent the contribution and confidence of the ion, respectively; the farther a tR-m/z ion pair lies from zero, the larger its contribution to, or confidence in, the difference between the two groups. Thus, the tR-m/z ion pairs at the two ends of the 'S' represent the characteristic markers with the highest confidence in each group.
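In the usual construction of an S-plot, the contribution axis is the covariance, and the confidence axis the correlation, between each variable and the predictive score vector of the model. The sketch below computes both coordinates for every variable. The score vector `t` is taken as given (it would come from a fitted OPLS-DA model, which is not reproduced here), and the function name and the example numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def s_plot_coordinates(
    X: Sequence[Sequence[float]], t: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (covariance, correlation) with the score vector t for
    each column of X, where rows of X are samples."""
    n = len(t)
    t_mean = sum(t) / n
    tc = [v - t_mean for v in t]
    t_ss = sum(v * v for v in tc)
    coords = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        x_mean = sum(column) / n
        xc = [v - x_mean for v in column]
        x_ss = sum(v * v for v in xc)
        cross = sum(a * b for a, b in zip(tc, xc))
        cov = cross / (n - 1)
        # A constant variable carries no information: report zero.
        corr = 0.0 if x_ss == 0 or t_ss == 0 else cross / math.sqrt(t_ss * x_ss)
        coords.append((cov, corr))
    return coords


# Hypothetical intensities (4 samples x 3 variables) and scores.
X_demo = [[1, 5, 2], [2, 5, 1], [3, 5, 2], [4, 5, 1]]
t_demo = [-1.5, -0.5, 0.5, 1.5]
coords_demo = s_plot_coordinates(X_demo, t_demo)
```

Variables whose coordinates are large in magnitude on both axes fall at the ends of the 'S' and are the candidates for marker ions.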
The OPLS-DA results showed that UPLC-QTOF/MS could be used to distinguish C. deserticola from C. tubulosa (Fig. 3). A total of six credible and significant markers were determined to facilitate discrimination of these groups (Table 3). The identities of three potential markers were tentatively assigned: the components correlated with these three ions were tentatively identified as an isomer of campneoside II, cistanoside C and cistanoside A. The marker compounds a, b and c could be used to distinguish the two plant species, as the ion intensities of a and b were higher in C. deserticola than in C. tubulosa (Fig. 4A, 4B), and marker c could be detected in C. tubulosa but not in C. deserticola (Fig. 4C).
Sequence information is shown in Table 4. The average genetic distance of psbA-trnH (0.1732) was significantly larger than those of the other two regions (0.0740 and 0.1197). The average GC content of psbA-trnH (20.64%) was lower than that of the other two regions (55.00% and 55.00%). Although the success rates of ITS and ITS2 were not obtained in this study, the psbA-trnH region performed well in PCR amplification and sequencing (100% and 87.23%, respectively). Identification efficiency was assessed by BLAST1 analysis and the nearest-distance method, and mainly reflected the success rate of the barcodes. The identification efficiency of the psbA-trnH region was clearly higher than that of the other two barcodes based on the two methods. The shortage of sequences is most likely the reason that the ITS region exhibited 100% identification efficiency with the BLAST1 method and 0 with the nearest-distance method.

Analysis of genetic divergence using six parameters
Six parameters were used to analyze intra-specific variation and inter-specific divergence using three barcodes (Table 5). A significant difference between inter- and intra-specific variation is indicative of the utility of a DNA barcode. Here, the minimum interspecific distance of each of the three barcodes was higher than its maximum intraspecific distance. Moreover, the psbA-trnH region had a larger maximum intraspecific distance and average interspecific distance than the other two barcodes, indicating that the psbA-trnH region performed well in discriminating the two origins of Cistanches Herba.

Analysis of barcoding gap to identify C. deserticola and C. tubulosa
The barcoding gap reflects the difference between inter- and intra-specific variation; a clear gap appears as separate, non-overlapping distributions of intra- and inter-specific distances. In this study (Fig. 5), the distance range was set to 0-0.45, because the greatest K2P distance of psbA-trnH between C. deserticola and C. tubulosa was close to 0.45. The three barcodes exhibited distinct gaps in the distributions of intra- and inter-specific variation. Furthermore, the gap of psbA-trnH was significantly larger than those of the other two barcodes. Therefore, the psbA-trnH region could be an ideal barcode for discriminating the two origins of Cistanches Herba.
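The gap criterion used above (minimum interspecific distance greater than maximum intraspecific distance) can be checked directly from a pairwise distance matrix. The following sketch assumes a symmetric matrix of K2P distances and one species label per sample; the function names and the example values are hypothetical and are not the study's data.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def split_distances(
    matrix: Sequence[Sequence[float]], labels: Sequence[str]
) -> Tuple[List[float], List[float]]:
    """Separate pairwise distances into intra- and inter-specific lists."""
    intra: List[float] = []
    inter: List[float] = []
    for i, j in combinations(range(len(labels)), 2):
        target = intra if labels[i] == labels[j] else inter
        target.append(matrix[i][j])
    return intra, inter


def barcoding_gap(
    matrix: Sequence[Sequence[float]], labels: Sequence[str]
) -> float:
    """Minimum interspecific minus maximum intraspecific distance.

    A positive value means the two distributions do not overlap.
    Requires at least one pair within a species and one pair between
    species.
    """
    intra, inter = split_distances(matrix, labels)
    return min(inter) - max(intra)


# Hypothetical distances for two samples of each species.
labels_demo = ["deserticola", "deserticola", "tubulosa", "tubulosa"]
matrix_demo = [
    [0.00, 0.01, 0.15, 0.16],
    [0.01, 0.00, 0.17, 0.18],
    [0.15, 0.17, 0.00, 0.02],
    [0.16, 0.18, 0.02, 0.00],
]
gap_demo = barcoding_gap(matrix_demo, labels_demo)
```

Histograms of the two lists returned by `split_distances` give a distribution plot of the kind shown in Fig. 5.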

Neighbor-joining (NJ) tree
An NJ tree illustrates the relationships among species and facilitates determination of their clustering. In this study, NJ trees of the three barcodes were built based on the K2P model (Fig. 6). The results demonstrated that the two origins of Cistanches Herba clustered into two separate clades. Thus, the NJ tree clearly distinguished between C. deserticola and C. tubulosa.
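To illustrate how neighbor-joining decides which taxa to cluster, the sketch below computes the Q matrix used at each step of the algorithm, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), and returns the pair that would be joined first. It shows only this first step, not full tree construction as performed by MEGA, and the distance matrix holds illustrative integer values rather than data from this study.

```python
from typing import List, Sequence, Tuple


def nj_q_matrix(D: Sequence[Sequence[float]]) -> List[List[float]]:
    """Neighbor-joining Q matrix for a symmetric distance matrix D."""
    n = len(D)
    row_sums = [sum(row) for row in D]
    return [
        [
            0.0 if i == j else (n - 2) * D[i][j] - row_sums[i] - row_sums[j]
            for j in range(n)
        ]
        for i in range(n)
    ]


def nj_first_pair(D: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Indices of the pair with the smallest Q value, joined first."""
    Q = nj_q_matrix(D)
    n = len(D)
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    return min(pairs, key=lambda ij: Q[ij[0]][ij[1]])


# Illustrative five-taxon distance matrix (not from this study).
D_demo = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_pair = nj_first_pair(D_demo)
```

In a full implementation the joined pair is replaced by a new internal node, the distances are updated, and the step is repeated until the tree is resolved.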

Discussion and Conclusions
Cistanches Herba is an important medicinal material commonly used as a nourishing tonic in Asian communities [38]. However, its two origins, C. deserticola and C. tubulosa, differ in chemical composition and pharmacological activity, as well as in clinical application and in the commodity market. The classification of Cistanche is confused, and large quantities of substitutes and adulterants flood the market because of the shortage of resources and the special growing environment of Cistanches Herba. The genus Cistanche is accepted to include four species and one variety: C. deserticola, C. tubulosa, C. sinensis, C. salsa, and C. salsa var. albiflora [39]. Researchers in Japan considered the origin of Cistanches Herba to be C. salsa [40][41][42][43], whereas Tu identified it as C. deserticola [44][45][46]. Given this taxonomic confusion, it is hard to discriminate the two origins of Cistanches Herba.
Traditional methods for quality control of Cistanches Herba include morphological identification [47,48], microscopic identification [49], TLC (thin-layer chromatography) [50,51], FTIR (Fourier transform infrared spectroscopy) [14] and HPLC (high-performance liquid chromatography) [52,53]. Morphological and microscopic methods can easily differentiate species from different genera or families that differ greatly in morphological and microscopic characteristics, but they struggle to distinguish sibling species. TLC and FTIR can clearly discriminate species with different kinds of chemical constituents, but they cannot readily determine the individual components and their contents. HPLC is mainly used to differentiate species by the contents of their chemical constituents; however, its analysis time is longer and its sensitivity lower than those of UPLC. By comparison, UPLC-QTOF/MS determines chemical composition faster and more accurately than the other chemical methods. Molecular identification methods such as SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) [54] and AFLP (amplified fragment length polymorphism) [55,56] perform well in discrimination based on genetic variation. However, these molecular methods are not easy to operate and are not universal. By comparison, DNA barcoding can discriminate species more universally, quickly and accurately than other molecular methods. For species of the same genus with a close genetic relationship, any one of these methods alone may not perform well in identification. Here, we combined UPLC-QTOF/MS and DNA barcoding to identify C. deserticola and C. tubulosa, and evaluated the chemical and molecular markers that would allow them to be discriminated. Twenty-three qualified mass peaks were detected and 16 were identified by UPLC-QTOF/MS, and three potential marker compounds that facilitate discrimination of the two origins were found for the first time by PCA and OPLS-DA. Furthermore, four indicators were assessed by DNA barcoding for their ability to differentiate the two origins: identification efficiency, genetic divergence, barcoding gap, and NJ tree analysis. The psbA-trnH region was supported as a suitable DNA barcode for discriminating C. deserticola and C. tubulosa.
Figure 5. Relative distribution of inter-specific divergence and intra-specific variation in three barcodes. The ITS2, ITS and psbA-trnH barcodes were analyzed for the relative distribution of inter-specific divergence and intra-specific variation between C. deserticola and C. tubulosa based on the K2P genetic distance. doi:10.1371/journal.pone.0098061.g005
In conclusion, we established, for the first time, a combined molecular and chemical method for the discrimination and quality control of the two origins of Cistanches Herba. DNA barcoding discriminates the two origins by genetic variation and authenticates species universally and accurately; UPLC-QTOF/MS analyzes chemical composition to evaluate the quality of medicinal materials rapidly and accurately. The combination of DNA barcoding and UPLC-QTOF/MS makes the identification of multi-source medicinal materials more accurate and reliable, and may serve as a method for identifying other species or genera whose classification is confused.

Author Contributions
Conceived and designed the experiments: LFH SHZ LBW. Performed the experiments: SHZ LBW. Analyzed the data: SHZ LBW. Contributed reagents/materials/analysis tools: LFH. Wrote the paper: SHZ XJ LBW. Checked the manuscript: ZHW.