Holistic Evaluation of Quality Consistency of Ixeris sonchifolia (Bunge) Hance Injectables by Quantitative Fingerprinting in Combination with Antioxidant Activity and Chemometric Methods

A widely used herbal medicine, Ixeris sonchifolia (Bge.) Hance Injectable (ISHI), was investigated for quality consistency. Characteristic fingerprints of 23 batches of the ISHI samples were generated at five wavelengths and evaluated by the systematic quantitative fingerprint method (SQFM) as well as by simultaneous analysis of the content of seven marker compounds. Chemometric methods, i.e., support vector machine (SVM) and principal component analysis (PCA), were used to assist in fingerprint evaluation of the ISHI samples. Qualitative classification of the ISHI samples by SVM was consistent with PCA and in agreement with the quantitative evaluation by SQFM. In addition, the antioxidant activities of the ISHI samples were determined by both the off-line and on-line DPPH (2,2-diphenyl-1-picrylhydrazyl) radical scavenging assays. A fingerprint–efficacy relationship linking the chemical components and in vitro antioxidant activity was established and validated using the partial least squares (PLS) and orthogonal projection to latent structures (OPLS) models, and the on-line DPPH assay further revealed the components that contributed positively to the total antioxidant activity. Therefore, the combined use of the chemometric methods, quantitative fingerprint evaluation by SQFM, and multiple marker compound analysis, in conjunction with the assay of antioxidant activity, provides a powerful and holistic approach to evaluating the quality consistency of herbal medicines and their preparations.


Introduction
Traditional Chinese Medicine (TCM) and herbal preparations have been widely used by billions of people around the world for thousands of years. The World Health Organization (WHO) has recommended chromatographic fingerprinting as a means of identification and quality evaluation since 1991 [1]. The Chinese State Food and Drug Administration (SFDA), US Food

The Theory of SQFM
The sample fingerprint and reference fingerprint vectors are defined as $\vec{x} = (x_1, x_2, \ldots, x_n)$ and $\vec{y} = (y_1, y_2, \ldots, y_n)$, where $x_i$ and $y_i$ are the peak areas of the component peaks in the sample fingerprint and reference fingerprint, respectively. The cosine of the angle between the sample and reference fingerprint vectors gives the qualitative similarity ($S_F$), as defined in Eq 1. Although $S_F$ clearly reflects the degree of similarity in the chemical compositions of the sample fingerprint and reference fingerprint in terms of distribution ratio, it is biased towards the large peaks, which raises a serious question about its validity. To limit the influence of the large peaks and give each peak equal weight, the sample fingerprint ($\vec{x}$) and reference fingerprint ($\vec{y}$) vectors are transformed to $\vec{P}_s = (x_1/y_1, x_2/y_2, \ldots, x_n/y_n)$ and $\vec{P}_0 = (1, 1, \ldots, 1)$, respectively. The cosine of the angle between the vectors $\vec{P}_0$ and $\vec{P}_s$ is defined as the qualitative ratio similarity ($S'_F$), as calculated by Eq 2. The macro qualitative similarity ($S_m$) is obtained by averaging $S_F$ and $S'_F$, as shown in Eq 3. For quantitative assessment of the fingerprints, the projection of $\vec{x}$ onto $\vec{y}$ is defined as the projection content similarity (C), as calculated in Eq 4.
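The two cosine definitions and their average can be sketched in a few lines of Python. This is a minimal illustration of Eqs 1-3 as they are described in the text, not the authors' software; the function names are hypothetical, and every reference peak area is assumed to be non-zero so that the ratios x_i/y_i exist.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def qualitative_similarity(x, y):
    """S_F: cosine between sample (x) and reference (y) peak areas."""
    return cosine(x, y)


def ratio_similarity(x, y):
    """S'_F: cosine between the ratio vector (x_i / y_i) and (1, ..., 1)."""
    ratios = [xi / yi for xi, yi in zip(x, y)]  # assumes every y_i != 0
    return cosine(ratios, [1.0] * len(ratios))


def macro_qualitative_similarity(x, y):
    """S_m: average of S_F and S'_F."""
    return 0.5 * (qualitative_similarity(x, y) + ratio_similarity(x, y))
```

Because a cosine is unchanged by uniform scaling, a sample in which every peak is, say, twice the reference area still gives S_F = S'_F = S_m = 1, which is why the qualitative factors alone cannot detect a change in total content.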
The projection content similarity factor (C) reflects the degree of similarity in the chemical compositions of the sample fingerprint and reference fingerprint in terms of total content, but still suffers from the bias of the large peaks over the small peaks. The quantitative similarity (P) is the ratio of the total content corrected by the qualitative similarity factor $S_F$, as shown in Eq 5. Combining these two quantitative properties yields the macro quantitative similarity ($P_m$), as defined in Eq 6, which monitors the overall content of chemical components in the sample fingerprint. Finally, the fingerprint leveling coefficient (α), as defined in Eq 7, is another quantitative parameter that can detect differences between the sample fingerprint and the reference fingerprint.
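The quantitative factors are described only verbally here, so the following Python sketch rests on stated assumptions rather than on the authors' exact formulas: C is taken as the scalar projection of the sample vector onto the reference vector relative to the reference length, P as the ratio of summed peak areas multiplied by S_F, and P_m as their arithmetic mean (by analogy with S_m), all expressed as percentages. The leveling coefficient α is omitted because its definition is not spelled out in the text. Function names are hypothetical.

```python
import math


def projection_content_similarity(x, y):
    """C (%): projection of x onto y relative to |y| (assumed form)."""
    dot = sum(a * b for a, b in zip(x, y))
    return 100.0 * dot / sum(b * b for b in y)


def quantitative_similarity(x, y):
    """P (%): total-content ratio corrected by S_F (assumed form)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    s_f = dot / (norm_x * norm_y)  # qualitative similarity S_F
    return 100.0 * (sum(x) / sum(y)) * s_f


def macro_quantitative_similarity(x, y):
    """P_m (%): mean of C and P (assumed, by analogy with S_m)."""
    return 0.5 * (projection_content_similarity(x, y)
                  + quantitative_similarity(x, y))
```

Under these assumptions, a sample with every peak at half the reference area gives C = P = P_m = 50%, whereas a sample identical to the reference gives 100%.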
SQFM combines the macro qualitative and quantitative similarity factors ($S_m$ and $P_m$) [13,25,26]. The quality of TCM and herbal preparations can be assessed and classified into grades based on the values of $S_m$, $P_m$ and α; the SQFM evaluation criteria are listed in Table A in S3 File, where grade 1 is the best quality and grade 8 the worst. $S_m$, $P_m$ and α are used together for classification, and the final quality grade is the worst of the three individual grades. For example, if $S_m$ = 0.96 (grade 1), $P_m$ = 95.6% (grade 1) and α = 0.02 (grade 1), the quality is grade 1; if $S_m$ = 0.89 (grade 3), $P_m$ = 89.5% (grade 3) and α = 0.07 (grade 2), the quality is grade 3; if $S_m$ = 0.87 (grade 3), $P_m$ = 104.6% (grade 1) and α = 0.25 (grade 5), the quality is grade 5.
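The "worst grade wins" rule can be written as a one-line maximum. The sketch below is illustrative only: the thresholds that map $S_m$, $P_m$ and α to individual grades are given in Table A in S3 File and are not reproduced here, so the hypothetical function takes the three already-assigned grades as input.

```python
def final_quality_grade(grade_sm, grade_pm, grade_alpha):
    """Return the overall grade: the worst (largest) of the three grades."""
    grades = (grade_sm, grade_pm, grade_alpha)
    if any(g not in range(1, 9) for g in grades):
        raise ValueError("grades must be integers from 1 (best) to 8 (worst)")
    return max(grades)
```

Applied to the three examples in the text, the grade triples (1, 1, 1), (3, 3, 2) and (3, 1, 5) yield overall grades 1, 3 and 5, respectively.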

Materials and reagents
A total of 23 batches of ISHI injectable preparations (20 mL, apparent concentration = 1.0 g/mL), all manufactured by Shenyang Shuangding Pharmaceutical Co., Ltd., were obtained from different pharmacies in Shenyang, China. UR standard was purchased from Sigma Chemical Co. (St. Louis, MO, USA). The standards of AD, CGA and CFA were acquired from the National Institute for the Control of Pharmaceutical and Biological Products (Beijing, China). The standards of CCA and LGR were supplied by Chengdu Puri France Science and Technology Development Co., Ltd. (Chengdu, China). LG standard was provided by Shanghai Winherb Medical Technology Co., Ltd. (Shanghai, China). All the standard compounds had a purity above 98%. The structures of the marker compounds are shown in Fig 1. Methanol (HPLC grade) and acetonitrile (HPLC grade) were purchased from Yuwang Industry Co., Ltd. (Shandong, China), and glacial acetic acid (HPLC grade) from Kermel Chemistry Reagent Co., Ltd. (Tianjin, China). De-ionized water and other reagents were of analytical grade.

Equipment and chromatographic conditions
HPLC analysis was performed on an Agilent 1100 HPLC system comprising an online degasser, a low-pressure mixing quaternary pump, an auto-sampler, and a diode array detector (DAD), controlled by a ChemStation workstation (Agilent Technologies, California, USA). The chromatographic separation was carried out on an Arcus EP-C18 column (250 mm × 4.6 mm, 5 μm) from Exformma Technologies (Shanghai, China). The off-line antioxidant activity assay was performed on a 722S spectrophotometer (Shanghai Precision Instrument Co., Ltd., Shanghai, China).

Preparation of standard and sample solutions
The reference standards of the marker compounds (UR, AD, CGA, CFA, CCA, LGR and LG) were accurately weighed separately and dissolved in methanol, then diluted with methanol to appropriate concentration ranges for the calibration curves, and stored at 4°C prior to use.

Antioxidant activity assay
Off-line DPPH assay. The DPPH radical stock solution (1 mM) was prepared in methanol immediately before the experiments and protected from light. DPPH free radical scavenging capacity was determined from the decrease in absorbance at 517 nm upon reduction by an antioxidant. The DPPH assay was performed according to Pamita Bhandari et al. [31] with slight modification. Briefly, a 0.127 mM DPPH solution was prepared in methanol, and 2 mL of this solution was added to 2 mL of the ISHI sample solution diluted in methanol to various concentrations (apparent concentration = 1-6 mg/mL). The mixtures were allowed to stand in the dark for 40 minutes, and the absorbance was measured at 517 nm against a blank. All tests were performed in triplicate. The radical scavenging capacity is expressed as percent inhibition and calculated using the following equation: % inhibition = [($A_{control}$ − $A_{sample}$) / $A_{control}$] × 100, where $A_{control}$ is the absorbance of the negative control and $A_{sample}$ is the absorbance in the presence of the ISHI sample. The percent inhibition was plotted against the sample concentration to calculate the IC50 value (the sample concentration required to scavenge 50% of DPPH radicals).
On-line HPLC-DAD-DPPH assay. This on-line assay was performed using the method introduced by Jyh-Horng Wu et al. [32] with slight modifications. The ISHI solution (11 μL) was injected into the HPLC system (see section 'Equipment and chromatographic conditions') at a flow rate of 0.8 mL/min. The chemical components in the ISHI solution were separated and detected at 260 nm. The eluted compounds then reached a reaction coil (5000 mm × 0.007/0.18 mm i.d. PEEK tubing from Agilent), where the 0.127 mM methanolic DPPH solution was delivered via another LC pump (Iso pump, Agilent 1100 series) at a flow rate of 0.3 mL/min. After the eluent had mixed with the DPPH solution, negative peaks were detected at 517 nm.
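The percent-inhibition formula and the IC50 read-off of the off-line assay can be sketched as follows. The text does not say how IC50 was obtained from the inhibition-concentration plot, so linear interpolation between the two bracketing concentrations is an assumption made for illustration; the function names and the numbers in the note below are hypothetical.

```python
def percent_inhibition(a_control, a_sample):
    """% inhibition = (A_control - A_sample) / A_control * 100."""
    return (a_control - a_sample) / a_control * 100.0


def ic50_by_interpolation(concentrations, inhibitions):
    """Concentration giving 50% inhibition, by linear interpolation.

    Assumes inhibition rises with concentration around the 50% point.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("50% inhibition is not bracketed by the data")
```

For instance, inhibitions of 20%, 40%, 60% and 80% at 1, 2, 3 and 4 mg/mL would give an interpolated IC50 of 2.5 mg/mL.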

Chemometric analysis
SVM. SVM is an efficient classification method that is widely used in disease diagnosis and medical assistance [33,34]. In this study, SVM with an RBF kernel was used to classify the ISHI samples using SPSS statistical software (SPSS Clementine 12.0, SPSS Inc., USA). The training dataset comprised 23 samples $(x_i, y_i)$, where $x_i$ is a feature vector of the contents of the seven markers in a d-dimensional feature space $R^d$ and $y_i \in \{-1, +1\}$; y = −1 represents an integrated grade ≤ 2 and y = +1 represents a grade > 2.
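For reference, the RBF (Gaussian) kernel measures the similarity between two feature vectors x and z as exp(−γ‖x − z‖²). A small Python sketch follows; the value of γ used in the study is not reported here, so `gamma` is left as a free parameter, and the function name is hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2) for two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)
```

The kernel equals 1 for identical marker-content vectors and decays towards 0 as the vectors move apart, so samples with very different marker profiles contribute little to each other's classification.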
PCA. PCA is used to qualitatively analyze the samples by reducing the number of variables and data dimensionality. The score plot of PCA is a map of the observations that shows the possible presence of any outliers in the data [14,15]. In this study, PCA analysis was performed on 49 common peaks detected in all the ISHI samples using SIMCA-P+ software (Version 13.0, Umetrics, Umea, Sweden), and the significance level was set at 95%.
PLS and OPLS analysis. PLS is a versatile linear regression algorithm that can be used to predict either continuous or discrete/categorical variables. OPLS, another linear regression method, is an extension of PLS that reduces model complexity while preserving predictive ability by removing variation in the descriptor variables X (data set) that is not correlated (i.e., is orthogonal) to the property variables Y (response set) [16,35]. In this study, both PLS and OPLS models were constructed to characterize the correlation between the total antioxidant activity and the chemical content of the ISHI samples, using the areas of 49 characteristic peaks as the descriptor matrix X and the 1/IC50 values as the response matrix Y, with the SIMCA-P+ software (Version 13.0, Umetrics, Umea, Sweden). The confidence level was set at 95%.

Optimization of chromatographic conditions and method validation
In order to achieve reproducible separation and acceptable resolution in a short analysis time, we investigated four mobile phase (MP) conditions (MP1-MP4) and three gradient elution programs (GEP; GEP1-GEP3). The index of fingerprint information amount (I) [36], which represents the signal size, signal homogenization and information amount, was adopted to optimize the mobile phase condition and gradient program. As shown in S1 File, the I values for MP1-MP4 were 15.7, 14.9, 15.9 and 15.5, respectively, and those for GEP1-GEP3 were 13.6, 16.4 and 16.7, respectively. Therefore, MP3 (I = 15.9) and GEP3 (I = 16.7), which gave the highest I values, were selected as the optimized conditions.
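The selection step amounts to picking the condition with the largest I value. A minimal Python sketch using the I values reported above is given below; the calculation of I itself follows [36] and is not reproduced, and the helper name is hypothetical.

```python
# I values reported in the text (from S1 File)
mobile_phase_index = {"MP1": 15.7, "MP2": 14.9, "MP3": 15.9, "MP4": 15.5}
gradient_index = {"GEP1": 13.6, "GEP2": 16.4, "GEP3": 16.7}


def best_condition(index_values):
    """Return the condition name with the highest information index I."""
    return max(index_values, key=index_values.get)
```

Calling `best_condition` on the two dictionaries returns "MP3" and "GEP3", matching the conditions chosen in the study.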
The calibration curves were established by plotting the peak area against the concentration of each standard marker compound over a concentration range suitable for the expected concentration of the marker compounds in the ISHI samples. Table 1 summarizes the linearity results; the method shows acceptable linearity ($R^2 \geq 0.9996$) for all the marker compounds in the targeted concentration ranges. The limit of detection (LOD, S/N = 3) and the limit of quantification (LOQ, S/N = 10) were determined to be in the ranges of 0.10-0.37 μg·mL⁻¹ and 0.41-1.63 μg·mL⁻¹, respectively, for the marker compounds. The system repeatability was evaluated by analyzing six individual mixed standard solutions; the stability of the sample solution was validated by analyzing a single standard mixture solution stored at room temperature for 0, 2, 4, 8, 16, and 24 h; intra-day and inter-day precision of the method were evaluated by nine replicate injections of the standard mixture solution three times a day over three consecutive days. The relative standard deviation (RSD%) values for the repeatability, stability, intra-day and inter-day precision were all less than 0.4% for the relative retention time and less than 2.2% for the peak area of the seven marker standards. The recovery was validated by a standard spiking test, and the average recovery values for the marker standards were between 96.4% and 108.3% with RSD% less than 3.1%, suggesting that the method was accurate. Method validation demonstrated that the method was precise, accurate and sensitive enough for simultaneous quantitative analysis of the seven marker compounds in ISHIs.
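The two statistics used throughout the validation, the calibration line with its $R^2$ and the RSD%, can be sketched in plain Python. This is a generic ordinary least-squares illustration, not the authors' ChemStation workflow; the function names are hypothetical and any data passed in would be the analyst's own.

```python
import statistics


def fit_calibration_line(concentrations, peak_areas):
    """Least-squares fit of peak area = slope * concentration + intercept.

    Returns (slope, intercept, r_squared).
    """
    mean_x = statistics.mean(concentrations)
    mean_y = statistics.mean(peak_areas)
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(concentrations, peak_areas))
    ss_tot = sum((y - mean_y) ** 2 for y in peak_areas)
    return slope, intercept, 1.0 - ss_res / ss_tot


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0
```

A fitted line can then be inverted, concentration = (peak area − intercept) / slope, to quantify a marker compound in a sample.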

Fingerprinting and marker compound analysis by HPLC-DAD
The marker compounds show very different UV absorption as shown in Fig 2D. It is reasonable to believe that the components in the ISHI samples also have different absorption behavior. Therefore, the fingerprints of the ISHI samples were generated at five different wavelengths (i.e., 260 nm, 265 nm, 330 nm, 335 nm and 350 nm) corresponding to the absorption maxima of the marker compounds in order to capture as many peaks as possible. A total of 49, 46, 38, 39 and 33 peaks common to all the ISHI samples were identified at 260 nm, 265 nm, 330 nm, 335 nm and 350 nm, respectively. Fig 2A shows the overlay chromatograms of 23 ISHI samples at 260 nm and the representative chromatograms of the sample and the marker compounds are shown in Fig 2B and 2C, respectively. The reference fingerprint (RFP) was generated by averaging all the sample chromatograms.
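Generating the reference fingerprint by averaging can be sketched as a peak-wise mean over the samples. Representing each fingerprint as a list of peak areas for the common peaks is an assumption made for illustration, and the function name is hypothetical.

```python
def reference_fingerprint(sample_fingerprints):
    """Peak-wise mean of equal-length sample fingerprints (peak areas)."""
    lengths = {len(fp) for fp in sample_fingerprints}
    if len(lengths) != 1:
        raise ValueError("all fingerprints must have the same number of peaks")
    n_samples = len(sample_fingerprints)
    return [sum(peak) / n_samples for peak in zip(*sample_fingerprints)]
```

In this study the input would be 23 fingerprints, each with the peaks common to all samples at a given wavelength (for example, 49 peaks at 260 nm).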
The content of the marker compounds was simultaneously determined in all 23 batches of the ISHI samples using the established calibration curves (Table 1). The quantitative information on the marker compounds is presented in Table 2. CCA and LGR were found to be the main components in all the samples, with average values of 68.9 and 97.1 mg/L, respectively. A large variation in the content of AD and LG was observed across the samples, as reflected by relative standard deviations (%RSD) of 53.1% for AD and 47.7% for LG. The high %RSD was due to the undetectable or very low levels of AD and very high levels of LG in Sample 21 (S21) and Sample 23 (S23). S21 and S23 also differed significantly from the other samples in the content of the other marker compounds: the contents of AD, CGA, CFA and CCA were markedly lower, and the contents of LGR and LG higher, in S21 and S23 than in the other samples.
Note: LGR, Luteolin-7-β-D-glucuronide; LG, Luteolin-7-glucoside. y is the peak area and x is the concentration injected (μg·mL⁻¹).

Fingerprint evaluation by SQFM
The fingerprints of the ISHI samples generated at five wavelengths (260 nm, 265 nm, 330 nm, 335 nm and 350 nm) were evaluated using the SQFM, with the reference fingerprint at each wavelength obtained by averaging the fingerprints of the 23 batches. The macro qualitative and quantitative similarity factors ($S_m$ and $P_m$) as well as the leveling coefficient (α) were computed by importing the signals of the sample fingerprint and reference fingerprint into the in-house software "Digitized Evaluation System for Super-Information Characteristics of TCM-CFPs 4.0" (software certificate No. 0407573, China). A separate set of integrated $S_m$, $P_m$ and α values ($S'_m$, $P'_m$ and α′) was also calculated according to Eqs 8-10 to avoid potential bias from the different wavelengths. The calculated similarity factors and leveling coefficients for all the samples are presented in Table 3.
The integrated macro qualitative similarity $S'_m$ was used mainly to evaluate the similarity of the samples in chemical composition, and the integrated macro quantitative similarity $P'_m$ and leveling coefficient α′ were subsequently used to gauge quantitatively the similarity in overall chemical content relative to the reference fingerprint. The quality grade (G) of each sample, shown in Table 3, was assigned according to the SQFM criteria.
The relationship between the sample fingerprints and the quantitative content of the marker compounds was also investigated. Linear regression was performed using the macro quantitative similarity factors $P_m$ (%) calculated with the 49 common peaks at 260 nm and the mean value of the content of the seven marker compounds ($P_{7C}$ %) for each sample (Table 2). A reasonable linear correlation was obtained between the quantitative similarity factors of the fingerprints and the actual contents of the seven marker compounds in the ISHI samples (r = 0.906). This relationship demonstrates that the selected marker compounds (UR, AD, CGA, CFA, CCA, LGR and LG) changed largely in step with the overall content of the chemical constituents of the ISHI preparations. Hence, quantitative evaluation of the fingerprints by SQFM has the potential to replace the use of multiple marker compounds and provides a reliable and feasible means to control the quality consistency of the ISHI preparations.

SVM and PCA analysis
As shown in Table B in S3 File, the SVM classification agreed with the observed classification for 86.96% of the samples, indicating that the SVM classified the samples efficiently. The predicted probability by SVM (Table C in S3 File) shows that the 23 samples could be divided into two groups, namely Cluster I with S1-S20 and S22 (P ≥ 0.9) and Cluster II with S21 and S23 (P < 0.9). The PCA was performed using a three-component model that explained 89.3% of the total variance (PC1 = 51.2%, PC2 = 25.8%, and PC3 = 12.3%). The PCA score plot in Fig 3 reveals that most samples fall into one cluster except S10, S21 and S23, with S21 and S23 clearly in the same cluster. The PCA results are in good agreement with the SVM analysis and provide strong evidence that the quality of S21 and S23 may differ from that of the other samples. Notably, the α′ value of S10 was the highest among S10, S21 and S23, which may explain its position in the PCA score plot. Taking all the evidence together (SVM, PCA and SQFM analysis), we conclude that S21 and S23, with α′ = 0.17, are indeed markedly different from the other samples.

Antioxidant activity
Total antioxidant activity by off-line DPPH assay. Antioxidant activity has been demonstrated to be an effective in vitro measure of the biological activity of the ISHI preparations [19,27]. The total antioxidant activities of the ISHI samples were assayed by the off-line DPPH method, and the IC50 values determined are shown in Table 2. The IC50 value represents the sample concentration required to scavenge 50% of DPPH radicals, and lower IC50 values indicate stronger antioxidant activities. The majority of the ISHI samples were found to possess acceptable antioxidant activities, with IC50 values less than 5 mg/mL; however, S21 and S23 showed higher IC50 values (>5 mg/mL), indicating lower antioxidant activity.
The established PLS and OPLS models were validated using five samples that were not used for calibration. Good predictive power was demonstrated by root mean square error of prediction (RMSEP) values of 0.0195 and 0.0178 for PLS and OPLS, respectively (Table 4).
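RMSEP is the square root of the mean squared difference between the predicted and observed responses of the validation samples. A minimal Python sketch of this standard definition is shown below; the function name is hypothetical, and in this study the responses would be the 1/IC50 values of the five held-out samples.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction over a validation set."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower RMSEP means the model's predictions lie closer to the measured values, which is the basis for comparing the PLS and OPLS models.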
Individual fingerprint components and antioxidant activity. The regression coefficients were also calculated for the scaled and centered X-variables (i.e., the 49 common peaks) at the 95% confidence level to explore the relationship between individual fingerprint components and antioxidant activity. The regression coefficient plots in Fig 4B and 4D reveal that the majority of the fingerprint components (34 and 31 out of 49 peaks based on the PLS and OPLS models, respectively) appear to have a positive influence on the total antioxidant activity, and all seven marker compounds have positive regression coefficients in both the PLS and OPLS models. The regression coefficients also indicate that the phenolic acids (CFA, CGA and CCA) have relatively higher antioxidant activity and the flavonoid compounds (LGR and LG) lower antioxidant activity. Therefore, it is not surprising that S21 and S23 had weaker total antioxidant activities than the other samples (Table 2), because they contain lower amounts of the marker compounds with higher antioxidant activity (CFA, CGA, and CCA) but higher amounts of the marker compounds with lower antioxidant activity (LGR and LG).
Antioxidant activity by on-line DPPH assay. The antioxidant activities of the ISHI samples were also determined by the on-line HPLC-DAD-DPPH method. Fig 5A and 5B display the chromatographic fingerprint of Sample 5 (S5) with 49 common peaks detected at 260 nm and the antioxidant activity fingerprint at 517 nm, respectively. The negative peaks in the activity fingerprint indicate components with free radical scavenging activity. Among the seven marker compounds identified in the chromatographic fingerprint, antioxidant activity was clearly observed for five (CGA-24, CFA-26, CCA-34, LGR-38 and LG-40 in Fig 5B), which is consistent with their positive correlation with the antioxidant activity in the PLS and OPLS models (Fig 4B and 4D). In comparison, antioxidant activity was not directly observed for UR-4 in the activity fingerprint, possibly because of the chromatographic resolution of the activity peaks (peak 2 in Fig 5B) and/or the sensitivity of the on-line DPPH method. Uridine elutes close to an unknown compound (peak 2 in the chromatographic fingerprint), so the UR peak in the activity fingerprint may not have been resolved from activity peak 2. In addition, UR is the least abundant marker compound; therefore, its activity peak may be below the sensitivity of the on-line DPPH method. Antioxidant activity was not detected for AD, as shown in Fig 5B, most likely because of its very low (if any) antioxidant activity. The on-line DPPH assay data suggest that the antioxidant activity of the ISHI samples might be attributed to the presence of phenolic acid [36] and flavonoid components, but not to nucleoside components. In addition to the marker compounds, other unknown components in the ISHI sample also showed significant antioxidant activities, for example, peaks 2, 14, 16 and 17 (Fig 5B).
In contrast, other unknown components (peaks 8 and 44) detected in the chromatographic fingerprint did not show any antioxidant activity in the activity fingerprint, consistent with their negative regression coefficients calculated from the PLS model (Fig 4B). The other 22 samples showed similar antioxidant activity fingerprints (not shown). The on-line antioxidant activity assay has a clear advantage over the off-line assay in that the individual contribution of each chemical component to the total antioxidant activity can be determined, and unknown compounds with significant antioxidant activity can be identified for further structural elucidation.

Conclusions
A multi-prong approach including chemometric methods (SVM and PCA), quantitative fingerprint evaluation, and antioxidant activity assay was employed to evaluate the quality consistency of 23 batches of the ISHI injectable preparations. Clustering based on SVM and PCA identified two samples (S21 and S23) that are not similar to the other samples. Simultaneous analysis of the seven known marker compounds provides quantitative information that helps to explain the differences observed in the SVM and PCA patterns. S21 and S23 had a much lower content of one marker compound (AD), but a higher content of two other marker compounds (LGR and LG), than the other samples. The characteristic fingerprints generated at multiple wavelengths further disclose the qualitative and quantitative differences in the chemical composition of these samples when evaluated with SQFM. In addition to the known marker compounds, other unknown components of the ISHI samples were evaluated for similarity based on their fingerprints. Again, S21 and S23 are shown to be different from the other samples by both the qualitative similarity factor ($S'_m$) and the quantitative parameters (particularly α′), consistent with the chemometric and quantitative marker compound analyses. Moreover, the total antioxidant activities of the ISHI samples were determined by the off-line DPPH assay, and a correlation was established between the antioxidant activity and the total amount of the marker compounds based on the PLS and OPLS models. The on-line DPPH assay further elucidates the individual contributions of the chemical components to the total antioxidant activity and provides a solid explanation of why S21 and S23 had lower antioxidant activity. Therefore, this multi-prong strategy provides a holistic means of evaluating the quality consistency of complex multi-component TCMs and their preparations.
Although the chemometric methods (SVM and PCA) are able to identify different samples based on clustering and patterns, it is very difficult to apply these methods to the quality control of TCM and herbal preparations in a manufacturing environment, since clustering or pattern recognition requires the comparison of a large number of samples. Multiple marker compounds could, in theory, be used for quality control; however, the biological or pharmacological effects of the marker compounds must be known, and it is also difficult to quantify multiple marker compounds in a fast-paced QC laboratory. In comparison, the SQFM has significant advantages for quality control. First, the qualitative similarity factor $S_m$ can reveal differences in the chemical composition of the samples, similar to the SVM and PCA methods. Second, the quantitative measures (i.e., the quantitative similarity factor $P_m$) can also be used to evaluate differences in the overall content of the fingerprints. Finally, the leveling coefficient α is a sensitive parameter for subdividing the category of samples. Once the standard is set (i.e., the reference fingerprint is established before any sample is tested), the two similarity factors ($S_m$, $P_m$) and the leveling coefficient (α) can be used simply and effectively for the product release of a single lot of any TCM or herbal medicine, which cannot be done using the chemometric methods because they require many samples.

Supporting Information
LGR, Luteolin-7-β-D-glucuronide; LG, Luteolin-7-glucoside; MP, mobile phase; GEP, gradient elution program. (DOC)
S3 File. Table A in S3 File. The quality grades classified by SQFM. Table B in S3 File. Comparing the observed and predicted classification by SVM. Table C