Classification and identification of Rhodobryum roseum Limpr. and its adulterants based on Fourier-transform infrared spectroscopy (FTIR) and chemometrics

Fourier-transform infrared spectroscopy (FTIR) with the attenuated total reflectance technique was used to distinguish Rhodobryum roseum from its four adulterants. The FTIR spectra of six samples were obtained in the range from 4000 cm−1 to 600 cm−1. Second-derivative transformation was used to resolve small and closely spaced absorption peaks. A cluster analysis was performed to classify the spectra in a dendrogram based on spectral similarity. Principal component analysis (PCA) was used to classify the six moss samples. Together, the cluster analysis and PCA identified the different genera. However, some species of the same genus exhibited highly similar chemical components and FTIR spectra. Fourier self-deconvolution and discrete wavelet transform (DWT) were therefore used to enhance the differences among the species with similar chemical components and FTIR spectra. Three scales were selected as the feature-extracting space in the DWT domain. The results show that FTIR spectroscopy with chemometrics is suitable for identifying Rhodobryum roseum and its adulterants.


Introduction
Mosses are small perennial plants that are typically 1-10 cm tall, and more than 12 000 species have been recognized worldwide [1]. Many moss species have broad geographical distributions that span several continents. Mosses commonly grow together in groups [2]. The persistent photosynthetic phase of the moss life cycle is the gametophyte generation. Spores are released from a sporophyte capsule at certain times [3].
Mosses are among the most species-rich groups of plants, but few have been used as medicinal plants. Rhodobryum roseum is a medicinal moss species. Rhodobryum roseum extractives such as ursolic acid, flavonoids, and alkaloids have been used in cardiac studies, and the extracts are increasingly used in medical research [3]. Mosses show extensive morphological and anatomical diversification in both gametophyte and sporophyte organizations [4][5][6]. Because mosses are small, it is difficult to distinguish Rhodobryum roseum from its adulterants by using traditional phytotaxonomic methods. In particular, in field collection, Rhodobryum roseum is usually confused with Rhodobryum ontariense from the same genus and with Plagiomnium actum, P. maximoviczii, and Mnium laevinerve from adjacent groups. Several modern analytical methods such as molecular biological approaches have been applied to identify the confused moss species. Rhodobryum roseum is easily identified by Random Amplified Polymorphic DNA (RAPD) [7]. However, this method uses large amounts of the material and can be very expensive.
The FTIR spectrum of a compound provides a unique "fingerprint", which allows FTIR spectroscopy to be used in the classification of different samples or the identification of unknown samples [20]. However, in many practical problems, judgments cannot be made rapidly and accurately by relying purely on the FTIR spectroscopy analysis [9]. Chemometric methods combined with FTIR spectroscopy can compensate for the errors that occur in an FTIR spectral analysis [21]. Our study aims to discriminate between Rhodobryum roseum and its adulterants by using FTIR spectra with several chemometric methods.
Cluster analysis and principal component analysis (PCA) are two multivariate analyses for identifying the natural clustering pattern and grouping objects based on the similarities among samples [22]. Cluster analysis is an undirected and unbiased statistical method for analyzing spectroscopic information [23]. Cluster analysis and PCA are widely recognized as powerful tools to obtain information about the relations in a dataset [19]. To classify and compare the spectra of species from different genera, the FTIR spectra need to be analyzed using multivariate methods.
Wavelet transform (WT) is another useful tool for various signal-processing applications. WT was developed to discriminate non-stationary signals with different frequency features [24]. The signal to be analyzed is multiplied by a wavelet function. The ability of the wavelet coefficients to resolve the same signal varies with scale. Therefore, the WT coefficients can be considered the characteristics of a signal. A few such features can reflect the major spectral information after the wavelet transform. WT can be considered one of the most efficient chemometric methods [11]. Discrete WT (DWT) decomposes a signal by using filters to extract the frequency components of interest. DWT has compact support in both time and frequency domains [25]. DWT analyzes the signal at different frequency bands with different resolutions by decomposing the signal into a coarse approximation and detailed information [26]. This technology is a signal-processing tool that has been used in numerous engineering, scientific, and mathematical applications, and DWT can solve numerous difficult problems that the Fourier transform cannot. Hence, DWT is also known as the "math microscope" [27].
In this study, six samples of mosses were classified: two samples of Rhodobryum roseum (Rhodobryum roseum1 and Rhodobryum roseum2) collected from different areas, Rhodobryum ontariense, Plagiomnium actum, P. maximoviczii, and Mnium laevinerve. The present study aimed to evaluate the potential use of FTIR spectroscopy with an attenuated total reflection (ATR) unit, second-derivative transformation, cluster analysis, and PCA to discriminate Rhodobryum roseum from the other four confused species of mosses. Fourier self-deconvolution (FSD) and DWT were used to investigate the chemical fingerprint variability among the species.

Species records
In this study, six samples of epiphytic bryophytes were collected from Xiaowutai National Nature Reserve (Zhuolu County, Hebei Province, China) in August 2011. The Xiaowutai National Nature Reserve Administration granted permission for collection at each location. We georeferenced the detailed location of each of the six samples using GPS. The geographic coordinates, altitudes, and sample collection sites are shown in Table 1. The environmental conditions and population sizes of the sample plants were similar. The voucher specimens were deposited in the herbarium of Hebei Normal University.

Sample preparation
To avoid problems caused by the effect of water on FTIR spectra, all samples were dried at 45 °C for 72 h, ground into fine powder in an agate mortar, and passed through a 200-mesh sieve (75 μm).

Spectral measurements
The FTIR spectra (4000-600 cm−1, 4 cm−1 resolution, 32 scans) were obtained with a ZnSe attenuated total reflectance (ATR) accessory (Pike Technologies, Madison, WI, USA) combined with a Bruker Optics Vertex 70 FTIR spectrometer (Ettlingen, Germany) and a DigiTect™ detector, which is designed to suppress external signal disturbance and maintain a high signal-to-noise ratio. After grinding, 6 mg of each powdered sample was placed directly on an area of approximately 2.54 mm2 at the center of the ZnSe crystal plate. All samples were pressed using an identical mechanical pressure, and the FTIR spectra were obtained. The FTIR spectra of the six moss samples were automatically baseline-corrected, and the baseline-corrected data of each sample were used for further analysis. Each sample was measured five times, and the averaged spectrum was used for the ATR-FTIR spectral analysis.

Precision, repeatability, and stability test
Sample 1 (Rhodobryum roseum1) was used to validate the method. The precision test was conducted using replicate records (5×) of the same sample. The repeatability test was conducted by gathering the data of five independently prepared portions of the same sample. The stability test was determined by five records of one sample solution within 24 h. For each test, the mean relative standard deviation (RSD) of the absorbance and wavenumber of the common peaks was calculated.
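The RSD calculation described above can be sketched as follows. This is a minimal illustration, assuming the sample standard deviation (n − 1 denominator) and a simple average of the per-peak RSD values; the function names and the absorbance readings are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def relative_standard_deviation(values):
    """Return the RSD (%) of a list of replicate measurements."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    # Sample standard deviation (n - 1 in the denominator) is an assumption
    return 100.0 * stdev(values) / abs(m)

def mean_rsd(peak_table):
    """Average the per-peak RSD over all common peaks.

    peak_table maps a peak label to its replicate absorbance readings.
    """
    return mean(relative_standard_deviation(v) for v in peak_table.values())

# Hypothetical absorbances of two common peaks across five replicate records
replicates = {
    "1640": [0.100, 0.102, 0.098, 0.101, 0.099],
    "1030": [0.250, 0.255, 0.245, 0.252, 0.248],
}
print(round(mean_rsd(replicates), 2))
```

The same helper could be applied to the wavenumber positions of the common peaks by passing peak positions instead of absorbances.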

Second-derivative transformation test
Second-derivative spectroscopy has been a common spectral analysis technique for decades [28][29][30][31][32][33]. It enables more specific identification of small and closely spaced absorption peaks that are not resolved in the original spectrum, which offers a way to increase the specificity of absorption peaks for certain components of the species. Another obvious advantage of using second derivatives is that the constant and linear components of baseline errors are removed by the differentiation [30], which increases the feasibility of second-derivative spectroscopy for quantitative work. A second-derivative spectrum was calculated for each measured pixel using the Savitzky-Golay algorithm (seven smoothing points). Further, representative chemical maps were calculated based on the derivative peaks.
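A seven-point Savitzky-Golay second derivative can be sketched as below. The text gives only the window size, so the quadratic/cubic polynomial order (and therefore the weights 5, 0, −3, −4, −3, 0, 5 divided by 42) is an assumption, as is the choice to leave the three edge points on each side undefined. The synthetic check also illustrates the point made above: constant and linear baseline terms vanish in the second derivative.

```python
# 7-point Savitzky-Golay second-derivative weights for a quadratic/cubic fit
# (the polynomial order is an assumption; the text only gives the window size)
SG7_D2 = (5, 0, -3, -4, -3, 0, 5)
SG7_NORM = 42

def second_derivative(absorbance, spacing=1.0):
    """Return the smoothed second derivative of an evenly spaced spectrum.

    The three points at each end, where the window does not fit, are
    returned as None rather than extrapolated.
    """
    half = len(SG7_D2) // 2
    n = len(absorbance)
    out = [None] * n
    for i in range(half, n - half):
        window = absorbance[i - half:i + half + 1]
        acc = sum(w * y for w, y in zip(SG7_D2, window))
        out[i] = acc / (SG7_NORM * spacing ** 2)
    return out

# Synthetic check: a parabola y = x^2 has a constant second derivative of 2,
# and adding a linear baseline (3x + 1) must not change that value.
xs = [float(x) for x in range(12)]
spectrum = [x * x + 3.0 * x + 1.0 for x in xs]
d2 = second_derivative(spectrum)
print(d2[3:9])
```

For real spectra, `spacing` would be the wavenumber step between adjacent data points.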

Data analysis
All replicate spectra from the six samples were recorded in the range of 4000-600 cm−1. The spectral data were processed using the OMNIC version 8 software (Thermo Fisher, Waltham, USA).
To further classify the spectra, multivariate methods such as cluster analysis and PCA were applied after the second-derivative transformation test. Cluster analysis and PCA were performed using the IBM SPSS 19.0 software. The cluster analysis was used to sort the FTIR spectra into similar sets or groups. Satisfactory results on the FTIR spectra of complex biological molecules have been obtained using Ward's algorithm, Euclidean distances, or the correlation coefficient as distance metrics [14,34]. In the present study, 6 samples (each with 5 replicates) were selected for clustering. To obtain more comprehensive and accurate data, we selected the absorption values in the range of 4000-600 cm−1 for the cluster analysis. Nearest-neighbor linkage and the Pearson correlation were used to construct the dendrogram.
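The clustering step can be illustrated with a small pure-Python sketch of nearest-neighbor (single-linkage) agglomeration. The study itself used SPSS; here the distance between two spectra is taken as 1 − r, where r is the Pearson correlation, which is an assumption about how the correlation was turned into a dissimilarity. The sample labels and toy spectra are hypothetical.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length spectra."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def single_linkage(spectra):
    """Agglomerative clustering with nearest-neighbor (single) linkage.

    spectra maps a sample label to its absorbance values. Returns the merge
    history as (cluster_a, cluster_b, distance) tuples, using 1 - r as the
    distance between two spectra.
    """
    labels = list(spectra)
    dist = {
        frozenset((p, q)): 1.0 - pearson(spectra[p], spectra[q])
        for i, p in enumerate(labels) for q in labels[i + 1:]
    }
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members
                d = min(dist[frozenset((p, q))]
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical toy spectra: two similar pairs with different band shapes
toy = {
    "A1": [0.10, 0.40, 0.90, 0.40, 0.10, 0.20],
    "A2": [0.11, 0.42, 0.88, 0.41, 0.10, 0.21],
    "B1": [0.50, 0.10, 0.20, 0.80, 0.60, 0.10],
    "B2": [0.52, 0.11, 0.19, 0.79, 0.62, 0.10],
}
for a, b, d in single_linkage(toy):
    print(a, b, round(d, 4))
```

The merge history is the information a dendrogram draws: early merges at small distances correspond to near-identical spectra, and the final merge joins the two dissimilar groups.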
PCA also provides information on the major spectral components, where dominant factors determine the differences among the samples [35,36]. PCA is used to extract the important features of a correlation matrix in terms of principal components (PCs). Only a few PCs are typically required to explain the majority of the observed variance. PCA can be used as a chemometric method for FTIR analysis. The analysis can be presented as two-dimensional (2D, two PCs) or three-dimensional (3D, three PCs) scatter plots [37]. The absorption values in the range of 4000-600 cm−1 were used for the PCA, and the factor loadings were plotted.
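How a few PCs come to explain most of the variance can be illustrated with a small dependency-free sketch. It is not the SPSS procedure used in the study: it centers the data without standardizing (so it works on the covariance rather than the correlation matrix) and finds the leading components by power iteration on the sample-by-sample Gram matrix. The function name, the iteration count, and the toy absorbance table are all hypothetical.

```python
def pca_scores(rows, n_components=2, iterations=500):
    """PCA of a small samples-by-variables table via the Gram matrix.

    Returns (scores, explained) where scores[i][k] is the score of sample i
    on component k and explained[k] is the fraction of variance captured.
    """
    n, p = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - col_means[j] for j in range(p)] for r in rows]
    # n x n Gram matrix shares its non-zero eigenvalues with the covariance
    gram = [[sum(a * b for a, b in zip(u, v)) for v in centered]
            for u in centered]
    total = sum(gram[i][i] for i in range(n))
    scores = [[0.0] * n_components for _ in range(n)]
    explained = []
    for k in range(n_components):
        vec = [1.0 + i for i in range(n)]  # arbitrary non-zero start vector
        eigval = 0.0
        for _ in range(iterations):
            nxt = [sum(g * v for g, v in zip(row, vec)) for row in gram]
            norm = sum(x * x for x in nxt) ** 0.5
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
            eigval = norm
        for i in range(n):
            scores[i][k] = vec[i] * eigval ** 0.5
        explained.append(eigval / total if total else 0.0)
        # Deflate: remove the component just found before the next pass
        gram = [[gram[i][j] - eigval * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return scores, explained

# Hypothetical absorbance table: 4 samples (rows) x 3 wavenumbers (columns)
toy = [
    [1.0, 2.0, 0.5],
    [1.1, 2.1, 0.4],
    [3.0, 0.5, 2.0],
    [3.1, 0.4, 2.1],
]
scores, explained = pca_scores(toy, n_components=2)
for label, row in zip(["A1", "A2", "B1", "B2"], scores):
    print(label, [round(v, 3) for v in row])
print([round(e, 4) for e in explained])
```

In this toy table the two groups separate along the first component, which carries nearly all of the variance; the sign of a component is arbitrary, so only the relative positions of the samples in a score plot are meaningful.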
FSD was performed using the OMNIC 8 software. DWT was performed using the MATLAB 7.1 software. The Daubechies wavelet, which is well suited to exploring signal singularities [38], served as the analysis wavelet. FSD and one-dimensional stationary DWT were performed on different samples.

FTIR analysis
The mean FTIR spectra of the six moss samples were recorded and divided into two sets: three Mniaceae samples and three Rhodobryum samples (Fig 1). The characteristic peaks in the FTIR spectra from the six samples are shown in Fig 1. Most peaks represent major functional groups and show comprehensive information on the protein, carbohydrate, fibrin, and lipid components of the samples [12]. Comparing the FTIR spectra can therefore provide information about the different samples. The region of 3500-3000 cm−1 presents a broad band centered at approximately 3300 cm−1, which corresponds to absorption caused by the stretching of the O-H and N-H bonds (Fig 1). The sharp peaks at 2920 cm−1 and 2850 cm−1 were attributed to the presence of polysaccharides, lipids, and carbohydrates (C-H stretch). The peak at 1640 cm−1 was attributed to the absorbance of amide (C=O bend). A second amide vibration appeared at 1420 cm−1 (C-H stretch), and an amide peak appeared at 1370 cm−1 (C-H stretch). The peak at 1030 cm−1 can be attributed to oligosaccharides, glycoprotein, and cellulose (C-O stretch). The peaks at 1300-600 cm−1 can be attributed to the absorbance of low-molecular-weight carbohydrates, polyols, and monosaccharides. This region is characterized as the fingerprint region [39].

Second-derivative transformation
The mean second-derivative spectra of the 6 samples are compared in Fig 2. No significant difference in peak location was found in the investigated spectral region, except for the narrow peaks at approximately 2800 cm−1 (C-H stretch) and 2400 cm−1 (C=O stretch) and the region of 800-1700 cm−1 caused by amide I and II, C-N stretch, and N-H bands. Although Mnium laevinerve, Plagiomnium actum and P. maximoviczii can be differentiated from the species of genus Rhodobryum, the differences between Rhodobryum roseum and Rhodobryum ontariense are not obvious after the second-derivative transformation (Fig 2).

Cluster analysis
The FTIR spectra from different species of mosses exhibit similar absorbances. The second-derivative transformation test cannot provide obvious differences between Rhodobryum roseum and its adulterants. Specific differences are also difficult to distinguish by experience. Therefore, we used multivariate statistical methods to analyze the absorption bands. A cluster analysis was conducted to investigate the relationship among the species. We selected the absorption values in the range of 4000-600 cm−1. The nearest-neighbor method was used to construct the dendrogram according to the absorption values.
The dendrogram divides the six samples (5 replicates for each sample) into two separate clusters (Fig 3): cluster 1 (C1) contains the three species from the family Mniaceae, and cluster 2 (C2) contains the three samples from the genus Rhodobryum. C1 is subdivided into two sub-clusters. Sub-cluster 1 (SC1) is composed of 5 replicate samples of Plagiomnium actum and 5 replicate samples of P. maximoviczii. Sub-cluster 2 (SC2) has 5 replicate samples from the genus Mnium (family Mniaceae). The two sub-clusters were clustered together, which is consistent with the traditional morphologic classification. In C2, the three samples of genus Rhodobryum were clustered together: the 5 replicate samples of Rhodobryum roseum 1 and the 5 replicate samples of Rhodobryum roseum 2 formed one sub-cluster (SC1), and the replicate samples of Rhodobryum ontariense were clustered separately (SC2). The replicate samples of each species could hardly be discriminated from one another.
The cluster analysis result indicates that Rhodobryum roseum is closely related to Rhodobryum ontariense, whereas the family Mniaceae is phylogenetically distant from the genus Rhodobryum. As expected, the two main groups (genus Rhodobryum and family Mniaceae) are completely divided into two parts. However, the same species collected from different areas and the replicate samples of each species can hardly be distinguished by the cluster analysis.
The results can basically reflect the relative relationship among the 6 species. Further study is required to identify the internal relationships of the genera and the species collected from different areas. PCA, FSD, and one-dimensional DWT were used in our study.

PCA analysis
In the present study, we used PCA as the second multivariate analysis. The absorption values in the range of 4000-600 cm−1 in the FTIR spectra were analyzed by PCA. Table 2 shows the variance accounted for by the first four PCs, computed from the absorbance values of the characteristic peaks used in the cluster analysis. The first three PCs summarized more variation in the data than any other PC and together accounted for more than 99.73% of the data variance. Fig 4 shows the score plot based on the first three PCs. The score plot indicates that the species of family Mniaceae and genus Rhodobryum can be grouped into two separate ellipses (A and B, Fig 4). The scores of the replicate samples of each species mostly overlap; there is nearly no variation between the replicate samples of the same species. The three studied Rhodobryum samples (Rhodobryum ontariense, Rh. roseum 1, Rh. roseum 2) formed one well-supported group (Fig 4). The Plagiomnium species lie close to Mnium laevinerve in the score plot. The cluster analysis and PCA data suggest that the species of genus Plagiomnium (Plagiomnium actum, P. maximoviczii) are closely related to Mnium laevinerve (Fig 4). In the traditional morphologic classification system, these two genera belong to the same family, whereas the genus Rhodobryum belongs to the family Bryaceae. The PCA results are consistent with the cluster analysis results. The cluster analysis and PCA results can be used to simply, rapidly, and accurately identify different families and genera.

FSD and wavelet analysis of the FTIR spectral data
The species of the same genus contain similar chemical components (e.g., protein, carbohydrate, and plant hormones) [36]. The FTIR spectra of species of the same genus exhibit close absorbance values and nearly identical wavenumbers. In particular, the FTIR absorptions of samples of the same species from different places are difficult to distinguish. Although the cluster analysis and PCA can basically discriminate different families and genera, the differences between samples of the same species and between replicate samples are barely visible. Therefore, in our study, FSD and DWT were used to extract the FTIR spectral features for further identification. The 1800-600 cm−1 region, which includes the fingerprint region, carries more characteristic molecular structural information and was therefore used for the FSD and DWT analysis.

FSD analysis
According to the classical taxonomy of bryophytes, genus Rhodobryum belongs to the family Bryaceae, whereas genera Plagiomnium and Mnium belong to the family Mniaceae. The common spectral peaks and absorbance values of the three species are not easy to classify by experience. FSD can be used to distinguish the small differences between different genera. Although it makes the signal peaks narrower, FSD does not change the position or area of the peaks. Therefore, we applied the FSD method to the six samples: Rhodobryum ontariense, Rh. roseum 1, Rh. roseum 2, Plagiomnium actum, P. maximoviczii and Mnium laevinerve. The 1800-600 cm−1 range includes the fingerprint region, which contains more molecular structure information, so we used this region to extract spectral features. The FTIR-FSD spectral results in the region of 1800-600 cm−1 are displayed in Fig 5, which shows that the output waveforms of all 6 samples vary markedly. Certain differences were observed between Rhodobryum roseum (Rhodobryum roseum1, Rhodobryum roseum2) and Rh. ontariense, both of which belong to the genus Rhodobryum. The three samples contain similar chemical components (e.g., amino acids, proteins, stigmasterol, and friedelin [40]). The FTIR-FSD spectra of the three samples of genus Rhodobryum are shown in Fig 5B. However, the outcome is unsatisfactory. The absorption bands at 1600 cm−1 (N = H) and 1500 cm−1 (N = O) in Rhodobryum roseum and Rh. ontariense have obviously different shapes, but the bands at 1500-600 cm−1 are similar. In particular, the two samples of Rhodobryum roseum collected from different areas (Rhodobryum roseum1 and Rhodobryum roseum2) are almost identical (Fig 5C). The FSD spectra of the replicate samples of Rhodobryum roseum are nearly identical.

DWT analysis
Two samples (Rhodobryum roseum1 and Rh. roseum2) were collected from different areas. The chemical components of the two samples of Rhodobryum roseum are not significantly different, and the FTIR spectra of the two samples were difficult to differentiate. The FSD spectra of the three samples of genus Rhodobryum are also very similar to one another (Fig 5B). To further distinguish the two Rhodobryum roseum samples and their replicate samples, DWT was applied to the FTIR spectra. One-dimensional DWT was used to decompose the FTIR spectral data of the two samples into different frequency bands. The spectral signals were decomposed up to five levels by using the Daubechies 4 mother wavelet, and the five scales were compared. Scales 1-5 present the detailed information after decomposition (Fig 6). The DWT coefficients effectively reflected the features of the spectra (Fig 6). Scale 1 contains substantial noise, which makes it unsuitable for analyzing the differences within the same species. Determining the differences in Scale 5 is also difficult. Scales 2-4 present the differences between Rhodobryum roseum1 and Rh. roseum2 (Fig 6). Therefore, decomposition levels 2-4 in the DWT domain were selected as the feature-extraction region to show the intraspecific variation between Rhodobryum roseum1 and Rh. roseum2. The results indicate that DWT can also be used to extract the features of the FTIR absorptions of different samples of the same species. The replicate samples are quite difficult to distinguish by the other methods, which include the second-derivative transformation, cluster analysis, PCA and FSD. The DWT coefficients of the replicate samples (Rhodobryum roseum 2.1 and Rhodobryum roseum 2.2) were found to have some differences in scales 1 and 2 (Fig 6). DWT can therefore be used to identify the replicate samples of Rhodobryum roseum. The result further indicates that closely related species can be distinguished by DWT.
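The multi-level decomposition described above can be sketched in a few lines. This is a simplified illustration rather than the MATLAB procedure used in the study: it uses the four-tap Daubechies filter (MATLAB's `db4` has eight taps), a decimated rather than stationary transform, and periodic boundary extension, all of which are assumptions made to keep the example short. The synthetic "spectra" and the helper names are hypothetical.

```python
import math

SQ3 = math.sqrt(3.0)
DENOM = 4.0 * math.sqrt(2.0)
# Four-tap Daubechies low-pass filter and its quadrature-mirror high-pass
LOW = [(1 + SQ3) / DENOM, (3 + SQ3) / DENOM,
       (3 - SQ3) / DENOM, (1 - SQ3) / DENOM]
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]

def dwt_step(signal):
    """One decomposition level with periodic boundary extension."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):
        taps = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, taps)))
        detail.append(sum(g * x for g, x in zip(HIGH, taps)))
    return approx, detail

def wavedec(signal, levels):
    """Return ([detail_1, ..., detail_levels], final_approximation)."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return details, approx

def band(x, center, width, height):
    """Gaussian absorption band used to build the synthetic spectra."""
    return height * math.exp(-((x - center) / width) ** 2)

# Hypothetical spectra: a broad band, with and without a weak narrow shoulder
a = [band(x, 32, 10, 1.0) for x in range(64)]
b = [band(x, 32, 10, 1.0) + band(x, 40, 2, 0.05) for x in range(64)]
details_a, approx_a = wavedec(a, 5)
details_b, approx_b = wavedec(b, 5)
for level, (da, db) in enumerate(zip(details_a, details_b), start=1):
    diff = sum((p - q) ** 2 for p, q in zip(da, db)) ** 0.5
    print(level, round(diff, 4))
```

Comparing the detail coefficients scale by scale, as in the loop above, mirrors the way the scales in Fig 6 are inspected: a small spectral difference tends to show up most clearly at the scales whose width matches that of the differing feature.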
Sample stability test. The sample stability was determined using Rhodobryum roseum as an example. The same sample solution was analyzed at 0, 4, 8, 16, and 24 h after preparation. The RSD of the FTIR absorbance values of the common peaks was 3.47% (S1 Table). The similarity of the results indicates that this sample remained stable for 24 h. The results show that the FTIR absorbance values of the characteristic common peaks of the same species are stable. Therefore, the method is reliable, exhibits good repeatability, and can be applied in the analysis of other moss samples.

Conclusion
In recent years, Rhodobryum roseum has received increasing interest among researchers [41]. Discriminating Rhodobryum roseum from its adulterants remains difficult. The traditional circumscriptions are mainly based on the characteristics of the leaf cells, which makes the classification of the taxa difficult. The results of the present study show that FTIR spectroscopy with PCA and cluster analysis can be used to discriminate Rhodobryum roseum from its adulterants. In our study, we find that the cluster analysis and PCA can basically classify the species into groups. FSD is applied to extract the features and enhance the differences among the species from different genera with similar FTIR spectra. FSD with DWT can successfully identify Rhodobryum roseum samples from different areas and replicate samples of the same species. Thus, the FTIR spectroscopy method with DWT is suitable for discriminating different species of mosses from controversial groups. The results show the possibility of using optical methods such as FTIR to differentiate the genera and species of mosses. The use of FTIR with chemometric methods is a rapid and efficient technique that can help routine laboratories identify mosses.
Supporting information S1