Discrimination of Wild Paris Based on Near Infrared Spectroscopy and High Performance Liquid Chromatography Combined with Multivariate Analysis

Different geographical origins and species of Paris obtained from southwestern China were discriminated by near infrared (NIR) spectroscopy and high performance liquid chromatography (HPLC) combined with multivariate analysis. The NIR parameter settings were scanning (64 times), resolution (4 cm−1), scanning range (10000∼4000 cm−1) and parallel collection (3 times). The NIR spectra were optimized with TQ 8.6 software, and the ranges 7455∼6852 cm−1 and 5973∼4007 cm−1 were selected according to the spectrum standard deviation. The contents of polyphyllin I, polyphyllin II, polyphyllin VI, polyphyllin VII and total steroid saponins were determined by HPLC. The data matrix of chemical component contents and the spectral data matrix were integrated and analyzed by partial least squares discriminant analysis (PLS-DA). In the PLS-DA model of the NIR spectra, Paris samples were separated into three groups according to geographical origin, with cumulative R2X and Q2Y of 99.50% and 94.03% of the total variance, respectively. The PLS-DA model for the 12 species of Paris described 99.62% of the variation in X and predicted 95.23% in Y. The contents of chemical components described the differences among collections quantitatively. The PLS-DA models showed that geographical origin had a much greater influence on Paris than species. NIR and HPLC combined with multivariate analysis could discriminate different geographical origins and different species. The quality of Paris showed regional dependence.


Introduction
Traditional Chinese medicine (TCM) is gaining greater acceptance throughout the world, especially in western countries, for improving health and preventing or healing diseases [1]. It is well known that TCM is composed of animal drugs, medicinal plants, fungi and minerals, among which medicinal plants play an important role because of their abundant sources, rich species and diverse components [2]. These plants have been used to treat various diseases for thousands of years in Eastern Asia [3]. Recently, a large number of bioactive components and metabolites have been found in medicinal plants, and these are considered the key ingredients for TCM development and utilization [4]. However, the quality and contents of bioactive components in medicinal plants are extremely variable depending on species, geographical origins, cultivation, growth altitude, soil, harvest time and climate conditions such as temperature, sun exposure time and rainfall [5], [6]. Clarifying the source and species of medicinal plants plays a decisive role in the quality control of TCM formulas, which is a fundamental prerequisite for their worldwide recognition and acceptance.
Paris, belonging to the family Liliaceae, contains about 24 species and is mainly distributed in Europe and Eastern Asia. There are 22 species of Paris in China, and the diversity center of Paris is located in Southwest China [7]. The dried rhizome (Rhizoma Paridis) is the main raw material of the Chinese patent drugs "Yunnan Baiyao", "GongXue Ning", and "Jidesheng snake tablet" [8]. Phytochemical research indicates that the dried rhizome of Paris contains abundant active ingredients, including steroidal saponins, flavonoids, fatty acid esters and endophytic fungi, and that steroid saponins such as polyphyllin I, polyphyllin II, polyphyllin VI, and polyphyllin VII are the most investigated ones [8], [9], [10], [11]. Their chemical structures are depicted in Figure 1. Modern pharmacology has demonstrated that polyphyllins have potent pharmacological activities, including styptic, spermicidal, homeostatic and analgesic effects, and are potential anti-cancer agents owing to their cytotoxicity and induction of apoptosis [12], [13], [14].
In Chinese folk medicine, several species of Paris have long been used for hemostasis and to treat snakebite, fractures, parotitis and abscesses [8]. However, only the rhizomes of P. polyphylla var. chinensis and P. polyphylla var. yunnanensis are officially recorded in the Chinese Pharmacopoeia. Closely related species have similar morphological characteristics, so it is difficult to discriminate dry rhizomes of the same genus by traditional morphological identification, especially in powdered form. On the other hand, common methods such as microscopic identification and thin layer chromatography are laborious and time-consuming.
In recent years, near infrared (NIR) spectroscopic methods have been used in the analysis of vegetables, fruit, coffee, green tea, wine, plants and pharmaceuticals [15], [16], [17], [18], [19], [20], [21]. High performance liquid chromatography (HPLC) is considered a robust method in numerous applications of qualitative and quantitative analysis of TCM because of its easy operation, high accuracy and wide suitability. In quantification, the determination of a group of principal active constituents with similar or different structures in one medicinal plant has been widely implemented [22], [23], [24], [25].
In this research, NIR and HPLC in combination with multivariate statistical analysis were applied to discriminate Paris plants of different species and different origins both quantitatively and qualitatively, and to identify the key factor in the identification of Paris.

Materials
Forty-eight samples of wild Paris including 12 species were collected from three main distribution areas in southwest China: Yunnan, Guizhou and Guangxi Provinces (Figure 2). They were identified and authenticated by Doctor J.Y. Zhang, Yunnan Academy of Agricultural Sciences (Table 1). The herbarium specimens were preserved in the Institute of Medicinal Plants, Yunnan Academy of Agricultural Sciences. The rhizomes of Paris plants were dried at 50 °C, ground to fine powder and stored in zip lock bags until further analysis. No specific permits were required for the described field studies, as no endangered or protected species were sampled, and the localities where the samples came from are not protected in any way.

Instruments and Reagents
The standards (polyphyllin I, polyphyllin II, polyphyllin VI, and polyphyllin VII) were purchased from the National Institute for Control of Pharmaceutical and Biological Products (Beijing, China). The purity of all the standards was greater than or equal to 98%. HPLC grade acetonitrile and methanol were obtained from TEDIA (Ohio, USA). Purified water (HPLC grade) was produced by Milli-Q system (Massachusetts, USA). Other reagents were all of analytical grade.
The HPLC system (Shimadzu Technologies, Kyoto, Japan) was equipped with Workstation software class-VP (Shimadzu Technologies) for recording chromatograms and was composed of an HPLC-10 integrator, an HPLC-10ATVP pump, and an SDP-M10A VP detector (DAD). All chromatographic separations were carried out on a Shim-pack VP-ODS C18 column (150 × 4.6 mm, particle size: 5 μm) from Shimadzu (Kyoto, Japan). An Antaris II Fourier Transform Near Infrared spectrometer (Thermo Fisher Scientific INC., USA) was fitted with a diffuse reflection module. The spectrum collecting software Result TM 2.1 and the analysis software TQ 8.6 included with the instrument were employed. A traditional Chinese medicine grinder DFT-100 (Zhejiang wenling Linda machinery co., LTD) and an 80 mesh stainless steel sieve tray (Tai'an of Chinese and western, Beijing) were used. The multivariate data analysis software was SIMCA-P 11.0 (Umetrics, Umea, Sweden).

Sample Preparation for NIR Analysis
The rhizome powder (20.0 g) was weighed, thoroughly mixed, then transferred to the NIR sample cup and compressed. The NIR spectra were collected with the diffuse reflection module using Result TM 2.1. The parameter settings were scanning (64 times), resolution (4 cm−1), scanning range (10000∼4000 cm−1) and parallel collection (3 times).

Sample Preparation for HPLC Analysis
The Paris rhizome powder (0.5000 g) was extracted with 25 mL alcohol under reflux for 45 min. After cooling to room temperature, 1.5 mL of the extract was transferred into a 2 mL centrifuge tube and centrifuged at 16,000 rpm for 10 min, and the supernatant was reserved for detection. The contents of polyphyllin I, polyphyllin II, polyphyllin VI, polyphyllin VII and total steroid saponins were determined by HPLC. The mobile phase solvents, flow rate, injection volume, column temperature and detection wavelength were optimized with reference to Zhang et al. [11].
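As a rough illustration of how a measured extract concentration relates to the content values reported later in mg·g−1, the sketch below converts a concentration in the extract to a content in the dried powder. The 25 mL extraction volume and the 0.5000 g sample mass come from the procedure above; the function name, the assumption that calibration has already yielded a concentration in mg·mL−1, and the example concentration are hypothetical, and the sketch ignores any solvent loss or further dilution.

```python
def content_mg_per_g(conc_mg_per_ml, extract_volume_ml=25.0, sample_mass_g=0.5000):
    """Convert an extract concentration (mg/mL) to a content in dried powder (mg/g).

    Assumes the whole sample mass was extracted into the stated volume,
    with no solvent loss and no additional dilution before injection.
    """
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return conc_mg_per_ml * extract_volume_ml / sample_mass_g


# Hypothetical concentration for illustration only (not a measured value).
example_content = content_mg_per_g(0.10)
print(f"{example_content:.3f} mg/g")
```

The same conversion applies to each polyphyllin once its own calibration has been used to obtain the concentration in the extract.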

Data Preprocessing
The NIR spectra of Paris were preprocessed successively with Norris derivative filtering, mean centering, standardization, and second derivative in TQ 8.6. The stability of 25 parallel collections of one sample (30#) was assessed at the 95% confidence level with SIMCA-P 11.0.
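The preprocessing chain can be sketched outside TQ 8.6 roughly as follows. This is a minimal illustration, not the procedure the software applies: a Savitzky-Golay second derivative from SciPy is used here as a stand-in for the Norris derivative filter, the derivative is taken before centering and scaling for simplicity, and the window length, polynomial order, function name and synthetic spectra are all assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess_spectra(spectra, window_length=11, polyorder=2):
    """Second derivative followed by column-wise mean centering and scaling.

    spectra: array of shape (n_samples, n_wavenumbers).
    """
    spectra = np.asarray(spectra, dtype=float)
    # Savitzky-Golay second derivative along the wavenumber axis
    # (a substitute for the Norris derivative filter, not the same algorithm).
    deriv = savgol_filter(spectra, window_length=window_length,
                          polyorder=polyorder, deriv=2, axis=1)
    # Mean centering: subtract the mean spectrum across samples.
    centered = deriv - deriv.mean(axis=0)
    # Standardization: divide each variable by its sample standard deviation.
    std = deriv.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant variables unscaled
    return centered / std


# Synthetic spectra for illustration only.
rng = np.random.default_rng(0)
demo_spectra = rng.normal(size=(6, 50)).cumsum(axis=1)
processed = preprocess_spectra(demo_spectra)
print(processed.shape)
```

After this step every variable has zero mean and unit variance across samples, which is the form usually passed to a PLS-DA model.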
The resulting NIR .spc files were converted to .csv data files with SIMCA-P 11.0, and the resulting HPLC .xls files were converted to .csv data files with Excel. The .csv files were then imported into SIMCA-P 11.0 for multivariate statistical analysis. Different geographical origins and species of wild Paris were identified by partial least squares discriminant analysis (PLS-DA) according to the NIR spectra and the contents of chemical components. PLS was used to visualize general clustering, trends, and outliers among the observations.

Multivariate Analysis of NIR and HPLC
The NIR spectra were optimized with TQ 8.6, and the ranges 7455∼6852 cm−1 and 5973∼4007 cm−1 were selected according to the spectrum standard deviation: the higher the standard deviation of a spectral region, the greater its contribution to classification. The data matrix of chemical component contents and the spectral data matrix were integrated and analyzed by PLS-DA. PLS-DA was applied to obtain a first understanding of the relationships within the data matrix, and to examine the differences in the spectra of Paris from different geographical origins and of different species. The efficiency and reliability of the PLS-DA model were verified by the percent variation of the x and y variables explained by the model (R2X, R2Y) and by the predictive performance of the model (Q2) [26].

Validation of the NIR Spectroscopy and HPLC Methods
The precision of the NIR spectrometer was tested by assaying the same sample thirteen times continuously. The relative standard deviation (RSD) (n = 13) of precision for Paris powder was 0.184%. The repeatability of the NIR spectrometer was evaluated by assaying 13 replicates of the same sample. The RSD (n = 13) of repeatability for Paris powder was 0.232%. The stability of the sample was analyzed every 20 min within 3 h, and the RSD for stability was 0.297%. Spectral reproducibility is an essential factor in assessing the quality of the measurement technique. To gain insight into the reproducibility of the system, 25 parallel collections of sample 30# were executed and evaluated by Hotelling's T2 (Figure 3). The results showed that the parallel spectrum acquisitions possessed satisfactory stability, with coefficients of 4.18 and 7.58 at the 95% and 99% levels, respectively. The results indicated that NIR was a reliable method for discriminant analysis.
The precision of HPLC was evaluated by analyzing the same sample extract six times continuously. The RSDs (n = 6) of precision were 1.260%, 0.752% and 2.182% for polyphyllin I, polyphyllin II and polyphyllin VII, respectively. The repeatability of HPLC was tested by assaying 6 replicates of the same sample extract. The RSDs (n = 6) of repeatability for Paris extract were 1.384%, 0.941% and 2.534% for polyphyllin I, polyphyllin II and polyphyllin VII, respectively. Stability of sample extract was tested
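For reference, the RSD used in these precision, repeatability and stability checks is the sample standard deviation expressed as a percentage of the mean. The short sketch below shows the calculation; the function name and the replicate readings are hypothetical, and the text does not state which measured response the NIR RSD values were computed from.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample standard deviation / |mean|."""
    if len(values) < 2:
        raise ValueError("at least two replicate values are required")
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)


# Hypothetical replicate readings, for illustration only.
replicates = [0.512, 0.513, 0.511, 0.512, 0.514, 0.512]
print(f"RSD = {relative_standard_deviation(replicates):.3f}%")
```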

Characterization of the Spectrum and Contents of Chemical Compositions
The NIR spectra and the chemical component contents of Paris are shown in Figure 4 and Table 2, respectively. Figure 4A shows the original spectra collected for the Paris samples: the lowest molecular absorptivities were in the region 10000-7515 cm−1, with higher values in the region 7150-5436 cm−1 and still higher absorbance levels in the region 5326-4045 cm−1. The band at 8380-8230 cm−1 corresponds to the second overtone of C-H stretching vibrations in CH3 and CH2 groups, whereas the bands located between 6900 and 6800 cm−1 are the first overtone of O-H bands. In Figure 4B, the bands at 7145 and 6953 cm−1 are related to C-H combination bands in CH2. The absorption band at 5181 cm−1 is assigned to the polysaccharide combination band of O-H stretching and HOH deformation. The band at 4400 cm−1 is the combination band of O-H and C-O stretching vibrations in glucose. In the HPLC analysis, the retention times of polyphyllin I, polyphyllin II, polyphyllin VI and polyphyllin VII were 34.828 min, 32.384 min, 23.048 min and 20.980 min, respectively. Polyphyllin VI was detected only in sample 17# and was therefore not used for discrimination analysis. The chemical components polyphyllin I, polyphyllin II, polyphyllin VII, and total steroid saponins were employed in discrimination analysis, with "0" denoting a component that could not be determined.

Discriminant Analysis of Paris from Different Geographical Origins by NIR Spectrum and HPLC
Based on the variation among the NIR spectra, PLS discriminant analysis was applied stepwise to the samples from different geographical origins. In Figure 5A, the forty-eight collections of Paris samples were clearly separated into three groups according to geographical origin. The samples from Yunnan Province were clearly separated from those of the other two regions. The model gave cumulative R2X and Q2Y values of 99.50% and 94.03% of the total variance, respectively.
To gain further insight into the samples from Yunnan Province, which cover multiple distribution areas, the forty samples were separated into four groups (Figure 5B). The cumulative R2X and Q2Y values were 99.28% and 94.32% of the total variance, respectively. However, three samples (3#, 24# and 37#) were classified incorrectly, and the reason could not be determined.
Furthermore, the average contents of chemical components (polyphyllin I, polyphyllin II, polyphyllin VII, and total steroid saponins) were compared among the geographical origins (Figure 6). The pie charts, which couple the variation of chemical components with the locations of wild Paris samples from Yunnan (Central, Northwestern, Southwestern and Southeastern), Guizhou and Guangxi Provinces, visualize the major differences among the six geographical origins. The samples from Southwestern Yunnan had the highest contents of polyphyllin I (11.147 mg·g−1) and total steroid saponins (13.363 mg·g−1), while samples collected from Guangxi Province and Southeastern Yunnan had the highest contents of polyphyllin II (2.110 mg·g−1) and polyphyllin VII (0.796 mg·g−1), respectively. The contents of polyphyllin I, polyphyllin II, polyphyllin VII, and total steroid saponins in samples from Guizhou Province were all the lowest. However, there was no significant difference in the contents of polyphyllin I, polyphyllin II, polyphyllin VII, and total steroid saponins among samples from Yunnan (Central, Northwestern, Southwestern and Southeastern), Guizhou and Guangxi Provinces (p > 0.05).
Based on the above analysis, samples from different geographical origins differed both in their NIR spectra and in their chemical components, which might be affected by factors such as geographical conditions, temperature, and rainfall in the different areas. Yunnan Province is located in southwest China and is influenced by a low-latitude plateau and mountain monsoon climate [26]. In addition, Yunnan Province has an obvious mountain climate with noticeable vertical climatic belts. Central and southeastern Yunnan are mainly middle and north subtropical areas, while northwestern Yunnan belongs to the temperate zone and southwestern Yunnan to the south subtropical zone [27]. In recent years, the temperature of central, northwestern and southwestern Yunnan has increased remarkably, while rainfall has decreased markedly [28]. Guizhou Province is located northeast of Yunnan Province and is influenced by a subtropical plateau monsoon climate; its average temperature is not as stable as that of Yunnan. Guangxi Province is located southeast of Yunnan Province and is influenced by a subtropical monsoon climate [26]. Su and Zhang [29] analyzed the relation between the photosynthesis of Paris polyphylla var. yunnanensis and environmental factors and found that the net photosynthetic rate increased as the leaf temperature rose from 11 °C to 20 °C but decreased as the temperature rose from 20 °C to 35 °C. The optimal temperature was from 16 °C to 28 °C. The net photosynthetic rate also increased as relative humidity rose from 20% to 85%, and the optimal humidity was over 75% [29]. These results indicate that the quality of Paris shows geographic and habitat dependence to some extent.

Discriminant Analysis of Different Species by NIR Spectrum and HPLC
PLS-DA was used to give a preliminary overview of the similarities and differences among the species, and the results suggested that the species of wild Paris has a significant effect on the NIR spectrum. In the established PLS-DA model (Figure 7), three significant components described 99.62% of the variation in X (R2X = 0.9962) and predicted 95.23% in Y (Q2Y = 0.9523) according to cross-validation. The forty-eight collections, comprising 12 species, were partly separated into different groups; P. cronquistii var. xichouensis, P. caobangensis, P. cronquistii, P. polyphylla var. alba and P. polyphylla var. pseudothib were clearly separated from the other species. Different species of Paris presented somewhat different information in the NIR spectra, which might be attributed to differences in chemical components or constituents that absorb in different NIR bands. Furthermore, Table 2 shows that different species of Paris contained different levels of chemical components. Analysis of variance showed that the contents of total steroid saponins differed significantly among species of Paris (p < 0.05), while the levels of polyphyllin I, polyphyllin II and polyphyllin VII did not differ significantly (p > 0.05).
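The kind of significance test behind these p values can be sketched with a one-way analysis of variance. The text does not specify the ANOVA design or software, so the one-way form, the function name and the example contents below are assumptions for illustration only and do not correspond to Table 2.

```python
from scipy.stats import f_oneway


def compare_groups(groups, alpha=0.05):
    """Run a one-way ANOVA across groups and flag significance at the given alpha.

    groups: dict mapping a group name (e.g. species or origin) to a list of contents.
    """
    statistic, p_value = f_oneway(*groups.values())
    return float(statistic), float(p_value), bool(p_value < alpha)


# Hypothetical total saponin contents (mg/g) for three groups, illustration only.
demo_groups = {
    "group_1": [1.0, 1.1, 0.9, 1.0],
    "group_2": [5.0, 5.1, 4.9, 5.0],
    "group_3": [9.0, 9.1, 8.9, 9.0],
}
f_stat, p_val, significant = compare_groups(demo_groups)
print(f"F = {f_stat:.2f}, p = {p_val:.3g}, significant = {significant}")
```

With only a few samples per species, as in this collection, such a test has limited power, so a non-significant result for an individual polyphyllin does not by itself show that species have equal contents.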

Effects of Different Geographical Origins and Species for Classification
The above analysis shows that different geographical origins and species of wild Paris can be separated by PLS-DA models based on their NIR spectra and chemical components. To understand whether origin or species is the key factor in identifying wild Paris, we examined the same species from different geographical origins and different species from the same geographical origin.
We selected 29 samples of P. polyphylla var. yunnanensis from four geographical areas (central, northwestern, southwestern and southeastern) in Yunnan Province, and analyzed the effect of geographical origin on the discrimination of Paris by NIR spectra and chemical components. Loading bi-plots of pc (corr) [1], t (corr) [1] and pc (corr) [2], t (corr) [2] were generated from the loadings scaled as correlation and the scores scaled inside the correlation model for different geographical origins (Figure 8A) and different species (Figure 8B) of wild Paris.
In Figure 8A, pc (corr) [1], t (corr) [1] played a significant role in discriminating samples of southwestern Yunnan from the others, while pc (corr) [2], t (corr) [2] had a comparatively weak impact on separating samples of northwestern Yunnan from those of central and southeastern Yunnan. Interestingly, the samples from southwestern and southeastern Yunnan were located on entirely opposite sides, while samples from central and northwestern Yunnan were located between them. These observations are in accordance with the climate of these areas as previously stated. Furthermore, the contribution of the NIR spectra and chemical components of the samples was expressed as a loading value. The NIR spectra of samples 17#, 25#, 27#, 38# and 47# and the chemical components polyphyllin I and total steroid saponins had a negative contribution to pc (corr) [1], t (corr) [1], which separated samples of southwestern Yunnan from the others, with southwestern Yunnan having a negative loading value. The results can also be found in Table 3. Although samples 17# and 38# belong to northwestern Yunnan according to administrative division, they are geographically close to southwestern Yunnan. Thus samples 17#, 25#, 27#, 38# and 47# and the chemical components polyphyllin I and total steroid saponins were the discriminating factors for P. polyphylla var. yunnanensis from southwestern Yunnan. This is consistent with the contents determined by HPLC, in which the levels of polyphyllin I and total steroid saponins were highest in southwestern Yunnan, at 11.15 mg·g−1 and 13.36 mg·g−1, respectively.
Ten samples including 5 different species of Paris from southeastern Yunnan were selected to understand the differences among species from the same geographical origin by the PLS-DA model. In Figure 8B, pc (corr) [1], t (corr) [1] significantly discriminated P. caobangensis and P. polyphylla var. chinensis from P. cronquistii, P. cronquistii var. xichouensis and P. polyphylla var. yunnanensis, while pc (corr) [2], t (corr) [2] separated P. polyphylla var. chinensis and P. polyphylla var. yunnanensis from P. caobangensis, P. cronquistii, and P. cronquistii var. xichouensis. The five species were located in four different quadrants, and P. cronquistii var. xichouensis was close to P. cronquistii, probably because P. cronquistii var. xichouensis is a variety of P. cronquistii and the two taxa therefore have a closer genetic relationship. The NIR spectra of samples 13#, 31# and 34# and the chemical components polyphyllin I, polyphyllin II, polyphyllin VII and total steroid saponins had a positive contribution to pc (corr) [1], t (corr) [1] and pc (corr) [2], t (corr) [2], which separated P. polyphylla var. chinensis from the others. These results are shown clearly in Table 4. From the NIR spectra, the five species of Paris were located in the corresponding quadrants except for P. polyphylla var. yunnanensis. In the loading bi-plot, the closer a chemical component lies to the origin, the smaller its contribution to the discrimination. The contents of polyphyllin I and total steroid saponins in P. polyphylla var. chinensis were significantly higher than in the other species, so they can be considered the components discriminating P. polyphylla var. chinensis from the other species.
From Table 3 and Table 4, the average total score for different geographical origins (8.13) was much higher than that for different species (4.98) of Paris. The results suggested that geographical origin had a much greater influence on Paris than species. Complex geographical conditions such as elevation, temperature, rainfall, sun exposure time, light quality and soil type are closely associated with the environments of different geographical origins.

Conclusions
In conclusion, the results demonstrated that the combination of NIR spectra and HPLC-based active components with multivariate analysis could be a powerful method for discriminating Paris of different origins and different species. The PLS-DA model showed that origin had a greater effect on Paris than species. The quality of Paris showed regional dependence. A further study using NIR and HPLC-based metabolic profiling coupled with multivariate analysis would extend the coverage of the metabolites of Paris and provide authoritative biomarkers responsible for the discrimination of Paris from different geographical areas. The metabolic profiling of different species of Paris also needs further study to provide a basis for extending the botanical sources of the medicine.