Colorimetric Grading Scale Can Promote the Standardization of Experiential and Sensory Evaluation in Quality Control of Traditional Chinese Medicines

Experiential and sensory evaluation is an ancient method that remains important in the current quality control system of Traditional Chinese Medicines (TCMs). The process is rapid and convenient when evaluating the quality of crude materials in TCM markets. However, sensory evaluation has been met with skepticism because it is mainly based on experience and lacks a scientific basis. In this study, rhubarb was selected to demonstrate how color-based sensory evaluation could differentiate the quality of herbal medicines objectively. The colors of the rhubarb samples, expressed as RGB values, were obtained from different parts and forms of the plant, including the plant’s surface, fracture surface color, and a powdered form with or without treatment with a color-developing reagent. We first divided the rhubarb samples into three grades based on the total content of five hydroxyanthraquinone derivatives, the major pharmacological components in rhubarb. Then, a three-layer back-propagation artificial neural network (BP-ANN), calibrated with selected training samples, was used to correlate the quality of the rhubarb with its color. The color of the rhubarb powder after coloration attained the highest accuracy (92.3%) in predicting the quality grade of the test samples with the established artificial neural networks. Finally, a standardized colorimetric grading scale was created based on the spatial distribution of the rhubarb samples in a two-dimensional chromaticity diagram according to the colors of the powdered rhubarb after color enhancement. By comparing the color between the scale and the tested samples, similar to performing a pH test with indicator paper, subjects without sensory evaluation experience could quickly determine the quality grade of rhubarb. This work illustrates the technical feasibility of the color-based grading of rhubarb quality and offers references for quantifying and standardizing the sensory evaluation of TCMs, foods and other products.


Introduction
Experiential and sensory evaluation is a rapid and effective evaluation method that has been used widely in various fields [1][2][3]. The technique has been used for more than 30 years in the food industry as a quality control test, characterizing the taste and colors of foods, and is becoming increasingly important in food markets [4][5][6]. A unique set of sensory evaluation methods has been developed as Traditional Chinese Medicines (TCMs) have evolved. Traditional sensory evaluation relies primarily on the human senses to assess shape, size, texture, color, odor and taste. Sensory evaluation remains one of the most effective methods to assess the raw materials used in TCM. Compared with modern quality control methods that involve spectroscopy, chromatography, mass spectrometry and cutting-edge biotechnology, sensory evaluation is convenient, fast and effective [7]. Nevertheless, these high-tech modern methods have been studied extensively, whereas the conventional sensory evaluation methods interest few modern researchers. In China, quality evaluation based on sensory characteristics is widely used in the TCM markets. Some commercial specifications of TCMs are conventionally accepted by growers, medicinal vendors, TCM physicians and consumers. Practitioners also conventionally accept Dao-di herbs (good quality herbs produced in their native areas) as a way to differentiate the quality of TCMs. Those commercial specifications and the concept of Dao-di herbs are generally defined by a set of specific morphological features. Growers of medicinal herbs typically select a superior species or variety to cultivate based on those specific morphological features. Similarly, medicinal vendors classify and price TCMs based on different specifications, and consumers primarily evaluate the quality of a TCM on the basis of sensory characteristics [8].
The repeatability of experiential and sensory assessment by practitioners for TCMs such as rhubarb and Coptis chinensis has been demonstrated in our previous studies [9,10]. Although traditional sensory evaluation is regularly applied in China, it has some disadvantages. One concern is whether sensory evaluation has a scientific foundation and chemical basis. A second issue is how the practice can be applied by people with little sensory evaluation experience, specifically users in the Western world.
Color is one of the most important characteristics in sensory evaluation for TCMs. Some ancient Chinese herbal medicine literature, such as the Origins of the Materia Medica (Ben Cao Yuan Shi), emphasizes the importance of color as an identifier of high-quality herbal materials (Figure 1). For example, an Aucklandia root (muxiang in Chinese) that is cadet blue in color is considered to be superior, mid-grade if yellowish-white, and inferior in quality when characterized by an oleaginous black color [11]. The correlation between color and chemical composition, however, is not well studied. Furthermore, sensory evaluation relies greatly on personal experience, which is difficult to quantify and standardize. Thus, its application has been restricted. To our knowledge, no previous quantification and standardization of color evaluation for Chinese herbal medicine has been reported.
Rhubarb, one of the oldest and best-known Chinese herbal medicines, has been recognized for centuries in traditional medicine for its pharmacological properties, including its purgative [12], kidney-protecting [13,14], liver-protecting [15], antimicrobial and hemostatic properties [16]. The major pharmacological components in rhubarb are five hydroxyanthraquinones (HAQs), namely aloe-emodin, rhein, emodin, chrysophanol and physcion, which are also quality control markers in the Chinese pharmacopoeia [17]. The quality of rhubarb correlates directly with the content of the five HAQs. Rhubarbs of different grades usually exhibit different colors, such as dark red, brown yellow or yellow. According to traditional sensory evaluation, rhubarbs of good quality are often bright yellow and those of poor quality are often dark yellow or brown. The objectivity and reproducibility of sensory evaluation for rhubarb have been demonstrated using the Delphi method [9]. In this study, rhubarb was selected as a model drug for its representative color in sensory evaluation.
The overall flowchart of this study is shown in Figure 2. First, we divided the rhubarb samples into three grades based on the total content of the five HAQs. Second, we used the three-layer back-propagation artificial neural network (BP-ANN) to find relationships between the quality and the color of the samples. Finally, we defined a colorimetric grading scale for the quality of rhubarb. By comparing the color of a sample with this scale, similar to matching the color of litmus paper with a color scale to ascertain pH, customers can quickly determine rhubarb quality. To investigate which kind of sample produced the best colors to use for grading rhubarb, colors from the plant surface, the fracture surface and the powder, with or without coloration by a color-developing reagent, were analyzed. Through this work, we illustrated the feasibility of color-based grading for rhubarb and developed a practical tool, a standardized colorimetric grading scale, to classify the rhubarb. Furthermore, we provided some references for the quantification and standardization of sensory evaluation of TCMs.

Materials
Thirty-four batches of rhubarb were collected from various sources in China (Table 1). All of the voucher specimens were deposited at the China Military Institute of Chinese Materia Medica. The samples were identified by Professor Xiao-He Xiao as the dried roots and rhizomes of Rheum palmatum L., Rheum tanguticum Maxim. ex Balf. and Rheum officinale Baill. The chemical reagents were analytical grade and obtained from the Beijing Chemical Factory (Beijing, China).

Ethics
No specific permits were required for the described field studies. The aforementioned locations are neither privately owned nor protected by the Chinese government, and no endangered or protected species were involved in the specific locations where we collected the rhubarb samples.

Camera System
Images were acquired using a camera system consisting of a standard lightbox, a computer, ZoomBrowser EX software, data lines and a digital camera (PowerShot G5 Pro 5.0 Megapixel, Canon Inc., Nagasaki, Japan). The standard lightbox consisted of a steel framework, a fluorescent lamp (LUMILUX, FH14W/865HE, OSRAM Company, München, Germany), a ballast and a shading cloth.

BP Neural Network Analysis
A BP neural network with one input layer, one hidden layer, and one output layer was established in this study. According to the experimental results, the RGB values were set as three of the evaluation parameters. In this experiment, there were seven initial nodes in the hidden layer, and the number of nodes was defined to be 2N+1, where N is the number of input layer nodes. In the output layer, three nodes were designated to represent the three grades of rhubarb. The hidden layer and input vectors were connected, and the output layer and input vectors were not. In addition, the training sample set and the desired values were normalized with premnmx from the MATLAB toolbox. Finally, the BP network was established using the newcf function and the transfer function of neurons in each layer was tansig and purelin. The network training function was trainParam. After training, the grading standards were saved in the network, and the BP neural network was able to produce a prediction.
Figure 1. [...] Dynasty, has been considered the first of the herbal classics in China, which described the sensory characteristics for identification of commercial herbal materials (the medicinal parts of the plants) in detail rather than the plant morphology. b) The original description in the book for the color-based quality evaluation of Aucklandia root (Aucklandia lappa Decne, muxiang in Chinese). Aucklandia root is an herb that has been commonly used to treat gastroenterological diseases in China for over two thousand years. c) Photographs of the Aucklandia root samples in different colors. The quality of Aucklandia root can be placed into three grades based on the color of its fracture surfaces: cadet blue denotes superior quality (left); yellowish-white indicates ordinary quality (middle); and an oleaginous black color suggests inferior quality (right). doi:10.1371/journal.pone.0048887.g001
Although the determined sample grades are discrete numbers (1, 2, or 3) in the training process, the predicted values are continuous real numbers. The grade of an input sample is designated based on the predicted value. For example, if the output value is in the range of (0.51-1.50], the sample is designated as grade 1 (Figure 4). For grade 2 and grade 3, the ranges are (1.51-2.50] and (2.51-3.50], respectively. In the results predicted by the BP network, some output values were not in the range of (0.51-3.50]. These values were considered incorrect predictions.
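The mapping from a continuous network output to a discrete grade can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB code. The function name `grade_from_output` is hypothetical, and the ranges are treated as the half-open intervals (0.50, 1.50], (1.50, 2.50] and (2.50, 3.50], which assumes that outputs are reported to two decimal places.

```python
# Upper bound of each grade's output range, taken from the text.
GRADE_UPPER_BOUNDS = ((1, 1.50), (2, 2.50), (3, 3.50))

# ASSUMPTION: "(0.51-1.50]" is read as the half-open interval (0.50, 1.50].
LOWEST_VALID_OUTPUT = 0.50


def grade_from_output(value):
    """Return grade 1, 2 or 3 for a network output, or None if out of range.

    An out-of-range output corresponds to what the text counts as an
    incorrect prediction.
    """
    if value <= LOWEST_VALID_OUTPUT:
        return None
    for grade, upper in GRADE_UPPER_BOUNDS:
        if value <= upper:
            return grade
    return None


if __name__ == "__main__":
    for output in (0.30, 1.20, 1.51, 2.49, 3.70):
        print(output, grade_from_output(output))
```

Returning `None` for out-of-range outputs keeps such cases visible to the caller instead of silently forcing them into the nearest grade.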

Color Conversion for Rhubarb Samples
Because the RGB color space consists of three dimensions, it is difficult to use RGB information to summarize the distribution of the rhubarb samples in each quality grade. To distinguish the lower and upper thresholds for grading different rhubarb samples, the RGB values were transformed into the CIE 1931 XYZ color space, where the values can be normalized and plotted in a two-dimensional CIE 1931 xy chromaticity diagram.
The formulas used to convert between the CIE XYZ and RGB color spaces are well documented [18,19]. The CIE XYZ color space is deliberately designed so that the Y parameter is a measure of the brightness or luminance of a color. The chromaticity of a color is then specified by the derived parameters x and y, two of the three normalized values which are functions of the three tristimulus values X, Y, and Z.
The derived color space specified by x and y can represent all of the chromaticities visible to the average person within a two-dimensional chromaticity diagram, which is a widely used tool to specify colors [18].
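The conversion from RGB to chromaticity coordinates can be illustrated with a short sketch. The text does not state which RGB primaries or white point were used, so this example assumes 8-bit sRGB values with a D65 white point and the standard sRGB transfer curve; a different camera profile would need a different matrix. The function names are hypothetical. The final step is the normalization x = X/(X+Y+Z), y = Y/(X+Y+Z).

```python
# ASSUMPTION: 8-bit sRGB input with a D65 white point. The paper cites its
# conversion formulas [18,19] but does not name the RGB variant it used.
SRGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def linearize(channel_8bit):
    """Undo the sRGB transfer curve for one 0-255 channel value."""
    c = channel_8bit / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def rgb_to_xy(r, g, b):
    """Return CIE 1931 (x, y) chromaticity, or None for pure black."""
    linear = [linearize(v) for v in (r, g, b)]
    big_x, big_y, big_z = (
        sum(coef * value for coef, value in zip(row, linear))
        for row in SRGB_TO_XYZ
    )
    total = big_x + big_y + big_z
    if total == 0:
        return None  # chromaticity is undefined when X + Y + Z = 0
    return big_x / total, big_y / total


if __name__ == "__main__":
    print(rgb_to_xy(255, 255, 255))  # close to the D65 white point
    print(rgb_to_xy(255, 0, 0))      # close to the sRGB red primary
```

Because x and y are ratios, a uniform scaling of the linear RGB values leaves the chromaticity unchanged, which is why brightness (Y) can be handled separately from the two-dimensional diagram.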

Quality Grade Assignment for the Rhubarb Samples
Before the BP-ANN analysis, the rhubarb samples were divided into three grades according to the total content of the five HAQs (T_HAQs) based on the following definitions: grade 1, T_HAQs ≥ 40.00 mg·g⁻¹; grade 2, 25.00 < T_HAQs < 40.00 mg·g⁻¹; and grade 3, T_HAQs ≤ 25.00 mg·g⁻¹ (Table S1).
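The grade definitions above amount to a simple threshold rule, sketched here in Python. The function name `grade_from_total_haq` is hypothetical; the thresholds (40.00 and 25.00 mg per g) are the ones given in the text.

```python
GRADE_1_MIN = 40.00  # mg/g, total content of the five HAQs
GRADE_3_MAX = 25.00  # mg/g


def grade_from_total_haq(total_mg_per_g):
    """Assign a quality grade from the total HAQ content in mg/g."""
    if total_mg_per_g >= GRADE_1_MIN:
        return 1
    if total_mg_per_g > GRADE_3_MAX:
        return 2
    return 3


if __name__ == "__main__":
    for content in (45.2, 40.0, 31.7, 25.0, 12.4):  # illustrative values only
        print(content, grade_from_total_haq(content))
```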

Color Quantization into RGB Values for the Rhubarb Samples
Images of a sample were captured in the standard lightbox and under natural light at different times (morning, noon, afternoon, evening) during the day using the optimized shooting conditions described above. Table 2 shows the relative standard deviations (RSD) of the R, G and B channels, which were 10.24%, 10.22% and 11.75%, respectively, when shooting under natural light at different time points. In contrast, the RSDs for the R, G and B values were 0.94%, 1.13% and 1.81%, respectively, when shooting in the lightbox. The lighting conditions provided by the lightbox therefore enable good reproducibility when acquiring images for color information. In this study, we acquired different colors of the herb, including the outside surface, a fracture surface, and a powder, [...] Table 1. The surface color data obtained after coloration are not listed and were excluded from the study because they varied greatly between different zones of the herb surface.
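The reproducibility comparison rests on the relative standard deviation, RSD = 100 × (standard deviation / mean). The sketch below shows the calculation for one color channel. The readings are invented for illustration and are not the values behind Table 2, and the use of the sample (n-1) standard deviation is an assumption, since the text does not say which form was used.

```python
import statistics


def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample standard deviation / mean.

    ASSUMPTION: the sample (n - 1) standard deviation is used.
    """
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical R-channel readings of one sample at four shooting times.
    natural_light = [182, 205, 171, 150]
    lightbox = [190, 191, 189, 192]
    print(round(relative_standard_deviation(natural_light), 2))
    print(round(relative_standard_deviation(lightbox), 2))
```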

Neural Network Establishment and Prediction
The neural network was first established by processing the training set, and then the other samples were taken as the test set to assess the validity of the predictions and results from the network. We assumed the samples containing intermediate [...] Figure 4. The color of the rhubarb powder after coloration produced the fewest incorrect predictions of the quality grade across all the samples.

Calculation of the Boundaries that Differentiate Grades of Rhubarb
The sixteen representative samples (samples 12, 15, 19 and 31 in grade 1; samples 1, 3, 4, 24, 26, 28 and 32 in grade 2; and samples 9, 10, 14, 33 and 34 in grade 3) showing the least overlap in their distribution in the chromaticity diagram (i.e., distributed approximately 80% of the possible maximum distance from the center of each quality grade) were selected to determine the boundaries for differentiating the rhubarb grades. The acquired RGB values of the representative samples were converted to values within the CIE 1931 xy color space and then plotted in the chromaticity diagram (Figure 5a). The three grades of the rhubarb samples separated into distinct regions. According to the calculations described above, the mathematical expectation points (x, y) of each grade are O1 (0.4315, 0.3031), O2 (0.4819, 0.3408) and O3 (0.4277, 0.3677). The division point D (x, y) is defined as the geometric center of the three mathematical expectation points calculated for the three grades and occurs at (0.4470, 0.3372). Figure 5b shows that the rhubarb samples in each grade distribute away from the division point (D) in three directions, each toward a different color category. Grade 1 tends toward the red-purple region, grade 2 toward the orange region, and grade 3 toward the yellow region. Thus, how the samples distribute can be used to determine the grade of an unknown rhubarb sample, specifically by comparing the color of the [...] (Figure 6).
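The division point can be checked directly from the three expectation points, as the sketch below does. The second function, `nearest_expectation_grade`, is an illustrative stand-in only: it assigns an unknown chromaticity to the closest expectation point, which is an assumption made here for demonstration and not the boundary calculation used in the study. Both function names are hypothetical.

```python
import math

# Mathematical expectation points (x, y) of each grade, from the text.
EXPECTATION_POINTS = {
    1: (0.4315, 0.3031),
    2: (0.4819, 0.3408),
    3: (0.4277, 0.3677),
}


def geometric_center(points):
    """Return the mean x and mean y of a collection of (x, y) points."""
    points = list(points)
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    return mean_x, mean_y


def nearest_expectation_grade(xy):
    """ASSUMPTION: classify by the nearest expectation point (Euclidean).

    This is a simple stand-in, not the grade boundaries from the paper.
    """
    return min(
        EXPECTATION_POINTS,
        key=lambda grade: math.dist(xy, EXPECTATION_POINTS[grade]),
    )


if __name__ == "__main__":
    d_x, d_y = geometric_center(EXPECTATION_POINTS.values())
    print(round(d_x, 4), round(d_y, 4))
    print(nearest_expectation_grade((0.4800, 0.3400)))
```

Rounded to four decimals, the computed center agrees with the division point D reported in the text.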

Discussion
Because of the convenience and speed of sensory evaluation compared with the laboratory analyses of chemical content, the desired color-based grading method would help in assessing the quality of rhubarb in the TCM marketplace. The approach for obtaining the color information is important for sensory evaluation. A colorimeter is a device that is commonly used to measure the color components of solid samples. In our preliminary experiment, we found that measuring the color of rhubarb by colorimeter was difficult because the plant's surface or fracture surface was neither smooth nor uniformly colored, which decreased the repeatability of the results. Unfortunately, most herbal medicines have various shapes that affect reproducibility. In this study, we used a digital camera to capture images of samples and extracted the color data from these images using Photoshop (Adobe Corporation, USA). Through this approach, the color data of the herbs are consistent with how colors are perceived by human vision. The RGB (red, green, blue) color model is one of the most widely used color systems. It is an additive model in which red, green, and blue light are summed to produce a broad array of colors, and it includes almost all of the colors that can be perceived by the human eye. In this study, the RGB values were used to represent color quantitatively. The RGB values could be converted into visual color, which made it possible to create a color scale that could be used as a reference for grading the quality of rhubarb. Different concentrations of HAQs display different shades of yellow or red under a neutral or alkaline environment, respectively, which indicates a relationship between the color of the herb and its HAQ content. However, we found that the absolute values of Pearson's correlation coefficients between the contents of the five HAQs and the RGB values ranged from 0.004 to 0.460 (Table 1), indicating no significant linear correlation between the two variables.
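A low Pearson coefficient only rules out a linear relationship, which the following sketch illustrates. The function `pearson_r` is a plain implementation of the standard formula, and the data are invented for illustration: the second variable depends entirely on the first, yet the coefficient is zero because the dependence is quadratic rather than linear.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical data: y is fully determined by x, but not linearly.
    xs = [-2, -1, 0, 1, 2]
    ys = [x * x for x in xs]
    print(pearson_r(xs, ys))
    print(pearson_r([1, 2, 3], [2, 4, 6]))
```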
A direct correlation between the color and quality of rhubarb could not be found because the RGB values are distributed in a non-linear space. We therefore used artificial neural networks (ANNs), which are generally developed for non-linear mapping, generalization, self-organization and self-learning [20], to find relationships between the quality and the color of the samples. The most popular method used in ANN-based pattern recognition is the back-propagation (BP) trained, first-order neural network [21], which was used in this study. Figure 4 shows that the outside surface color generated the lowest accuracy compared with the fracture surface color and the powder. After additional color development of the samples with a 0.5% NaOH aqueous solution, the rhubarb displayed a red to dark red color based on the Bornträger reaction of the base with the HAQs. Consequently, the coloration of the samples enhanced the accuracy of the network and increased the number of correct predictions for all sample types. The results suggest that measuring the surface color of rhubarb is of limited use in assessing its quality, because the color on the outside surface can be more easily affected over the course of production and storage than the other sources of color data. The non-uniformity of the surface color is another drawback. When the rhubarb is ground into powder, the uniformity of color improves. In addition, developing the color of the HAQs reduces interference from other colored substances. Hence, the color-enhanced powder of rhubarb is preferred in color-based assessment of its quality.
The ANN results indicate that there is a positive but non-linear correlation between the quality of rhubarb and its color. However, the neural network is still a machine-based approach. To facilitate the quality assessment of rhubarb, we sought to establish a colorimetric grading scale to visually discriminate rhubarbs of varying quality. The boundaries of each quality grade of rhubarb were calculated according to their distribution in the chromaticity diagram (Figure 5b). The differences between the colors at the division point (D) and the colors of the calculated expected values for each grade can be distinguished with the naked eye, which enables the development of a color scale that can be printed on a card and used to grade rhubarb in the field [22]. For widespread use, the colorimetric grading scale could be printed on paper, similar to professional color cards such as Munsell color cards, to minimize the deviation between the theoretical and individually perceived colors.
In summary, the results of this study revealed that color or color parameters (RGB values) provided important information for the classification of quality of TCMs, such as rhubarb. However, the results varied depending on where the color was sampled: at the surface or fracture surface of the rhubarb, and with or without color development. The BP-ANN network achieved the highest percentage of correct predictions when analyzing powdered samples that were chemically processed to express color, which suggests that powders may be the best type of sample for the accurate grading of rhubarb. On the basis of the distribution of several rhubarb samples in the chromaticity diagram, we have designed a colorimetric grading scale to discriminate between rhubarbs of varying quality. By comparing the color of the sample with the colorimetric grading scale on the card, similar to determining pH using litmus paper, customers without experience in sensory evaluation could quickly assess the quality of rhubarb. This work illustrates the technical feasibility of color-based grading of rhubarb quality and provides useful references for the quantification and standardization of sensory evaluation-based quality control for TCMs, foods and other products.

Supporting Information
Table S1 Contents (mg·g⁻¹) of the five HAQs in the rhubarb samples. (DOC)

Text S1 Ultra Performance Liquid Chromatography (UPLC) analysis. (DOC)

Figure 6. Differences between the color at the division point and the colors of the mathematical expectation points for each grade. doi:10.1371/journal.pone.0048887.g006