Identification of Nitrogen, Phosphorus, and Potassium Deficiencies in Rice Based on Static Scanning Technology and Hierarchical Identification Method

Establishing an accurate, fast, and operable method for diagnosing crop nutrition is very important for crop nutrient management. In this study, static scanning technology was used to collect images of a rice sample's fully expanded top three leaves and corresponding sheaths. From these images, 32 spectral and shape characteristic parameters were extracted using an RGB mean value function and the Regionprops function in MATLAB. Hierarchical identification was used to identify NPK deficiencies. First, the normal samples and non-normal (NPK deficiencies) samples were identified. Then, N deficiency and PK deficiencies were identified. Finally, P deficiency and K deficiency were identified. In every hierarchy, support vector feature selection (SVFS) was used to select the optimal characteristic set for different deficiencies in a targeted manner, and Fisher discriminant analysis was used to build the diagnosis model. In the first hierarchy, the selected characteristics were the leaf sheath R, leaf sheath G, leaf sheath B, leaf sheath length, leaf tip R, leaf tip G, leaf area and leaf G. In the second hierarchy, the selected characteristics were the leaf sheath G, leaf sheath B, white region of the leaf sheath, leaf B, and leaf G. In the third hierarchy, the selected characteristics were the leaf G, leaf sheath length, leaf area/leaf length, leaf tip G, difference between the 2nd and 3rd leaf lengths, leaf sheath G, and leaf lightness. The results showed that the overall identification accuracies of NPK deficiencies were 86.15, 87.69, 90.00 and 89.23% for the four growth stages. Data from multiple years were used for validation, and the identification accuracies were 83.08, 83.08, 89.23 and 90.77%.


Introduction
In this research, scanned images of rice leaves and sheaths under NPK deficiencies and normal nutrition levels were compared, and the differences in the rice leaf and sheath characteristics under the different nutrition conditions were analyzed. Fisher discriminant analysis was used to develop the rules and to construct a model for the identification of NPK deficiencies.
During the identification of the rice nutrition deficiencies, the standard identification process was to simultaneously identify the different types of deficiencies. When different deficiencies caused similar symptoms, it was easier to misjudge the deficiency type during the identification process. To improve the identification accuracy and to reduce misjudgments, hierarchical identification was used. With hierarchical identification, the identification process was formulated to identify particular nutrition deficiencies; therefore, a better identification could be performed.

Ethics Statement
This study was designed to aid in the diagnosis of rice nutrition. All of the data in this study can be published and shared.

Experimental Design
The experiment was designed to study rice under different degrees of NPK deficiencies. Rice seeds (cultivar ZheYou-NO. 1) were pre-germinated in moist sand at 30°C for 3 days, and seedlings were individually transplanted 7 days after emergence into 5 L polyvinyl chloride (PVC) pots that contained clean, sieved, and thoroughly leached river sand to allow for precise nutrient control. In 2012 and 2013, the experiment was carried out in a greenhouse on the ZiJinGang campus of Zhejiang University (30°17′N, 120°05′E) in Hangzhou, China. The plants were grown under natural light conditions. The temperature of the greenhouse was maintained at 30°C/25°C (day/night), and the relative humidity was maintained at 50%. The nutrient solution was prepared with deionized water and contained 110.8 mg/L CaCl2, 405 mg/L MgSO4·7H2O, and 16.5 mg/L Na2SiO3·9H2O. The pots were arranged in 13 different nutrition levels (4 N level treatments, 4 P level treatments, 4 K level treatments, and a normal nutrition treatment), with 6 replications for each level and 10 rice plants in each pot. The 13 treatment levels (NH4NO3 at 0, 28.60, 57.20, and 85.70 mg/L; NaH2PO4·2H2O at 0, 12.60, 25.20, and 37.80 mg/L; K2SO4 at 0, 22.30, 44.70, and 67.00 mg/L; and normal nutrition (NH4NO3 114.30 mg/L, NaH2PO4·2H2O 50.40 mg/L, K2SO4 89.30 mg/L)) were produced via nutriculture (hydroponic) solutions and were added to different pots. The nutrient solutions in the pots were replaced every 14 days. Every 5 days, the pH of the nutrient solution in each pot was measured and adjusted to 5 using 1 mol/L NaOH.

Image acquisition
The leaf samples were taken on August 4th, 18th, and 27th and September 8th of 2013. The top three leaves of 10 rice plants at each of the 13 nutrition levels, totaling 1560 samples and representing all growth stages, were collected. A total of 480 rice leaf and leaf sheath samples under 4 different N levels, 480 rice leaf and leaf sheath samples under 4 different P levels, 480 rice leaf and leaf sheath samples under 4 different K levels and 120 rice leaf and leaf sheath samples with normal nutrition levels were collected to build the diagnosis rules and the identification model for the NPK deficiencies. The rice leaf and leaf sheath samples collected on July 29th and August 13th, 20th, and 31st of 2012 were used to validate the model; these samples include leaves and leaf sheaths of plants grown under 4 different N levels (240 samples), 4 different P levels (240 samples), 4 different K levels (240 samples) and normal nutrition levels (60 samples).
All of the samples were analyzed in the laboratory. First, the leaves and sheaths were placed on a scanner (EPSON GT20000, Seiko Epson Corporation, Suwa, Nagano-ken, Japan) with a maximum scanning area of 11.7 × 17.0 inches and an R/G/B and BK color CCD line sensor (full-color images consist of red (R), green (G), and blue (B) channels). The output image data were 16 bits per pixel per internal color and 1 to 8 bits per pixel per external color. The resolution was set to 300 dpi (dots per inch). The leaf area (cm²) is calculated as the number of pixels within the leaf region multiplied by (2.54/300)²; the length (cm) equals the number of pixels along the vein multiplied by 2.54/300, and the width (cm) equals the number of pixels across the widest zone multiplied by 2.54/300.
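The pixel-to-centimeter conversion described above can be sketched as follows. This is a minimal illustration, assuming the pixel counts have already been obtained from a segmented scan; the function names are hypothetical and are not taken from the study.

```python
# Scan resolution stated in the text: 300 dots (pixels) per inch.
DPI = 300
CM_PER_INCH = 2.54
CM_PER_PIXEL = CM_PER_INCH / DPI  # side length of one pixel in cm


def leaf_area_cm2(pixel_count):
    """Area in cm^2 from the number of pixels inside the leaf region."""
    return pixel_count * CM_PER_PIXEL ** 2


def leaf_length_cm(pixels_along_vein):
    """Length in cm from the number of pixels along the vein."""
    return pixels_along_vein * CM_PER_PIXEL


def leaf_width_cm(pixels_across_widest_zone):
    """Width in cm from the number of pixels across the widest zone."""
    return pixels_across_widest_zone * CM_PER_PIXEL
```

For example, a vein that spans 3000 pixels corresponds to 3000 × 2.54/300 = 25.4 cm.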
In addition, because leaves with a phosphorus deficiency are dark green in color, lightness (LI) was added as a characteristic parameter in this study to increase identification accuracy [15].
$$LI = 0.299R + 0.587G + 0.114B \tag{1}$$

Because N, P, and K are very mobile within the plant and are translocated to young leaves from old, senescing leaves, the symptoms of rice suffering from NPK deficiencies often first appear at the tip of the leaf and subsequently spread to the entire leaf. Under N deficiency, leaves become light green at the tip, and the color then spreads to the entire leaf. Under PK deficiencies, the symptoms are similar, and the leaf tips become yellowish brown [1]. Therefore, the color of the leaf tip can be used to effectively identify symptoms of NPK deficiencies. In this research, the "thinning of color" characteristic was the mean color value for 1/5 of the leaf length from the tip and is expressed using the leaf tip R (LTR), leaf tip G (LTG), and leaf tip B (LTB).
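As an illustration of the lightness formula and the leaf tip color parameters, the following sketch computes LI from mean R, G, and B values and averages the color over the first fifth of the leaf. It assumes the leaf color has already been sampled as a list of (R, G, B) tuples ordered from tip to base; that input layout and the function names are assumptions made for the example, not the procedure used in the study.

```python
def lightness(r, g, b):
    """Lightness LI from Eq. (1): LI = 0.299R + 0.587G + 0.114B."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def leaf_tip_mean_rgb(samples_tip_to_base, tip_divisor=5):
    """Mean (R, G, B) over the first 1/tip_divisor of the leaf length.

    samples_tip_to_base: list of (R, G, B) tuples ordered from the leaf
    tip to the leaf base (hypothetical input layout).
    Returns (LTR, LTG, LTB) as floats.
    """
    n_tip = max(1, len(samples_tip_to_base) // tip_divisor)
    tip = samples_tip_to_base[:n_tip]
    return tuple(sum(px[c] for px in tip) / n_tip for c in range(3))
```

With ten samples along the leaf, the two samples nearest the tip are averaged to give LTR, LTG, and LTB.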
In the rice-growing process, the color of the heart leaf is light compared to the next functional leaf. In the fully functional stage, when the second and third functional leaves have the darkest color and when the difference between the colors of neighboring leaves is small, rice grows well. In contrast, a larger difference means that the rice is suffering from a nutrition deficiency. At normal nutrition levels, the leaf length gradually increases with higher leaf positions; the top second or top third leaves are the longest. When the length of one leaf is shorter than or equal to the next leaf, the growth in the rice is restrained. Before the jointing stage, every leaf sheath is densely distributed on the tillering node, and the increased length of the leaf sheath reflects the leaf spacing, which is the distance between neighboring leaves. The leaf spacing increases as the leaf grows. When the rice plant suffers from a nutrition deficiency, the spacing between leaves shortens. In line with this mechanism, this research added 8 parameters: the difference between the 2nd and 3rd leaf lengths (L23), the ratio of the 1st to 3rd leaf lengths (L1/3), the ratio of the leaf and leaf sheath lengths (L/LS), the difference between the 2nd and 3rd leaf R (R23), the difference between the 2nd and 3rd leaf G (G23), the difference between the 2nd and 3rd leaf B (B23), the leaf spacing of the 1st and 2nd leaves (LS12), the leaf spacing of the 2nd and 3rd leaves (LS23), and the difference in leaf spacing (LS12-LS23). In total, 28 color and shape parameters were extracted from the scanned images of the leaf and sheath (Table 2).
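The between-leaf parameters can be illustrated with a short sketch. It assumes each leaf is described by a dictionary with hypothetical keys (`length`, `sheath_length`, `R`, `G`, `B`) and that each "difference" is taken as the 2nd leaf value minus the 3rd leaf value; the text does not state the sign convention, so this is an assumption. The leaf spacing parameters (LS12, LS23) are left out because the text does not specify how they are measured from the image.

```python
def between_leaf_parameters(leaves):
    """Compute L23, L1/3, L/LS, R23, G23, and B23 for one plant.

    leaves: list of three dicts ordered 1st (top), 2nd, 3rd leaf, each
    with keys "length", "sheath_length", "R", "G", "B" (hypothetical
    layout). Differences are 2nd minus 3rd (assumed sign convention).
    """
    first, second, third = leaves
    return {
        "L23": second["length"] - third["length"],
        "L1/3": first["length"] / third["length"],
        # One leaf-to-sheath length ratio per leaf position.
        "L/LS": [leaf["length"] / leaf["sheath_length"] for leaf in leaves],
        "R23": second["R"] - third["R"],
        "G23": second["G"] - third["G"],
        "B23": second["B"] - third["B"],
    }
```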
Using the R, G, and B mean value function in MATLAB, the spectral characteristics of the leaf sheaths (LSR, LSG, LSB) under NPK stress and under normal nutrition levels were determined. Using the magic wand tool in Adobe Photoshop CS5, a segment in the white region of the leaf sheath (WRA) resulting from nitrogen stress was selected, and its area was calculated using the Regionprops function in MATLAB (Fig. 3). The four additional parameters for leaf sheath are shown in Table 3.
As shown in Fig. 4, the four newly added parameters of the leaf sheath showed obvious differences under the different nutrition conditions in the four growth stages.

Research method
In this research, hierarchical identification was used to identify the deficiency category. Different characteristics were selected in each hierarchy; the characteristics selected by SVFS were then used with Fisher discriminant analysis to establish the identification rule and model for that hierarchy. First, the 1st rule was established for the identification of normal and non-normal (NPK nutrition deficiency) states. According to the identification results, the non-normal samples were included in the second identification hierarchy. In the second hierarchy, the 2nd rule was established for the identification of N and PK nutrition deficiencies. After removing the samples identified as having N nutrition deficiencies, the samples with PK nutrition deficiencies were included in the third identification hierarchy. In the third hierarchy, the 3rd rule was established for the identification of P and K nutrition deficiencies (Fig. 4). Finally, the samples that were correctly identified in all of the identification processes were counted to calculate the identification accuracy.
As shown in Fig. 5 and Fig. 6, the color and shape characteristics of the leaf and leaf sheath were different under different nutrition deficiencies. The nutrition deficiency category can be identified from the characteristics, but using many characteristics introduces redundant information, which increases the number of calculations and can reduce the identification accuracy; it is difficult to recognize stress reliably with so many sensitive characteristics. Therefore, to quickly diagnose nutrition deficiencies, it is necessary to choose the optimal set of characteristics using an effective feature-selection method.

Characteristic | Formula | Explanation
A/L | A/L = Area / Length | The ratio of leaf area to leaf length
A/P | A/P = Area / Perimeter | The ratio of leaf area to leaf perimeter
Eccentricity | Eccentricity = AxisLength_long / AxisLength_short | The ratio of leaf length to leaf width
Rectangularity | Rectangularity = Area_object / Area_bounding-box | The ratio of leaf area to the area of the smallest box encasing the leaf
Area Convexity | Area convexity = Area / ConvexArea | The ratio of leaf area to the area of the convex hull of the leaf
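The shape characteristics defined in the table above can be illustrated with a small, self-contained sketch. The study used the Regionprops function in MATLAB on scanned images; the code below is not that implementation. It assumes the leaf outline is available as a simple polygon of (x, y) vertices, uses an axis-aligned bounding box, and takes the leaf length and width as the long and short sides of that box (reasonable only if the leaf lies along an axis of the scanner). All function names are hypothetical.

```python
import math


def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(pts):
    """Sum of the edge lengths of a closed polygon."""
    return sum(
        math.hypot(pts[(i + 1) % len(pts)][0] - pts[i][0],
                   pts[(i + 1) % len(pts)][1] - pts[i][1])
        for i in range(len(pts))
    )


def convex_hull(pts):
    """Convex hull (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def shape_characteristics(outline):
    """Ratios from the table, computed for a polygonal leaf outline."""
    area = polygon_area(outline)
    xs = [p[0] for p in outline]
    ys = [p[1] for p in outline]
    box_w, box_h = max(xs) - min(xs), max(ys) - min(ys)
    length, width = max(box_w, box_h), min(box_w, box_h)  # assumed axes
    return {
        "A/L": area / length,
        "A/P": area / polygon_perimeter(outline),
        "eccentricity": length / width,
        "rectangularity": area / (box_w * box_h),
        "area_convexity": area / polygon_area(convex_hull(outline)),
    }
```

Note that the eccentricity here follows the length-to-width ratio given in the table, which is not the same quantity as the ellipse-based `Eccentricity` property that regionprops reports.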
The support vector feature selection (SVFS) method was used to select the optimal characteristic set to reduce the calculation burden and remove redundant information. This method makes full use of the advantage of SVM, namely, its generalizability from a small training sample. Additionally, this method can improve the operational efficiency and ensure a rapid and stable screening process. The optimal characteristic set has the maximum classification prediction ability. The method can remove the redundant characteristics of the subset by identifying high correlations between characteristics. Therefore, the optimal characteristic subset can represent the set of sensitive characteristics of the different types of nutrition statuses [16][17][18][19]. In this research, SVFS was run with Libsvm-3.12 to screen the characteristic subset.
In the screening process, the rate of contribution to the identification was first estimated for each characteristic; the characteristics were then sorted, and the unimportant characteristics were removed until the screening process was complete. SVFS measures the contribution of each feature to identification by perturbing the objective function of the SVM. SVFS was used to select the optimal characteristic set from the twenty-eight characteristics extracted from the samples having NPK deficiencies and having normal nutrition levels. The essence of the screening is to select the optimal subset from the original combination of characteristics to ensure accurate identification based on a minimum number of characteristics.
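The rank-and-remove idea behind the screening can be sketched with a generic backward elimination loop. This is not SVFS itself and does not use Libsvm: the `score` callable is a hypothetical stand-in for whatever criterion measures identification ability (in SVFS, a quantity derived from the SVM objective), and a characteristic's contribution is estimated simply as the drop in score when it is left out.

```python
def backward_elimination(features, score, min_features=1):
    """Repeatedly drop the characteristic that contributes least.

    features: list of characteristic names.
    score: callable taking a list of names and returning a float
           (higher is better); a stand-in for an SVM-based criterion.
    Returns (best_subset, removal_order).
    """
    current = list(features)
    best_subset, best_score = list(current), score(current)
    removal_order = []
    while len(current) > min_features:
        base = score(current)
        # Contribution of f = loss in score when f is left out.
        contribution = {
            f: base - score([g for g in current if g != f]) for f in current
        }
        weakest = min(current, key=lambda f: contribution[f])
        current.remove(weakest)
        removal_order.append(weakest)
        new_score = score(current)
        # Prefer the smaller subset when the score is at least as good.
        if new_score >= best_score:
            best_subset, best_score = list(current), new_score
    return best_subset, removal_order
```

In practice the score would come from cross-validated classification on the training samples; the loop structure stays the same.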
Fisher discriminant analysis can be used to determine which class a research object belongs to by observing and measuring the value of the variables. Discriminant analysis establishes a discriminant function by filtering the variables and by including information from those variables that can be used to determine the classification and characteristics of the object. The error rate can be minimized using this method, and its common formulation is

$$y = a_1 x_1 + a_2 x_2 + a_3 x_3 + \dots + a_n x_n$$

Here, y is the discriminant value; x_1, x_2, x_3, …, x_n are variables that reflect the characteristics of the research object; and a_1, a_2, a_3, …, a_n are discriminant coefficients [20,21].
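For a two-class decision, such as each step of the hierarchy, the discriminant coefficients can be obtained from the class means and the pooled within-class scatter matrix. The sketch below is a textbook two-class Fisher discriminant written in plain Python; the study used IBM SPSS Statistics, so this is an illustration of the technique rather than the procedure actually run. The midpoint threshold and all function names are assumptions.

```python
def _mean(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def _scatter(rows, mean):
    """Within-class scatter matrix: sum of (x - m)(x - m)^T."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [r[j] - mean[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("within-class scatter matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def discriminant_value(coeffs, x):
    """y = a_1 x_1 + a_2 x_2 + ... + a_n x_n."""
    return sum(a * v for a, v in zip(coeffs, x))


def fit_fisher(class_a, class_b):
    """Return (coefficients, threshold); y > threshold means class A."""
    ma, mb = _mean(class_a), _mean(class_b)
    sa, sb = _scatter(class_a, ma), _scatter(class_b, mb)
    d = len(ma)
    sw = [[sa[i][j] + sb[i][j] for j in range(d)] for i in range(d)]
    coeffs = _solve(sw, [ma[j] - mb[j] for j in range(d)])
    # Assumed cut-off: midpoint between the projected class means.
    threshold = (discriminant_value(coeffs, ma)
                 + discriminant_value(coeffs, mb)) / 2.0
    return coeffs, threshold
```

A new sample is assigned to class A when its discriminant value exceeds the threshold and to class B otherwise.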

Results and Discussion
The symptoms of NPK deficiencies in a rice leaf were markedly different between the four growth stages. Under N deficiency, the old leaves, and sometimes all leaves, become light green and chlorotic at the tip. Except for young leaves, which are greener, deficient leaves are narrow, short, and lemon-yellowish. Under P deficiency, the plants are stunted and dark green with erect leaves. The leaves are narrow, short, very erect, and 'dirty' dark green, and the stems are thin and spindly. Young leaves may appear to be healthy, but older leaves turn brown and die. Under K deficiency, the plants are dark green, and yellowish-brown leaf margins or dark-brown necrotic spots first appear on the tips of older leaves. Under severe K deficiency, the leaf tips are yellowish-brown. Symptoms first appear on older leaves and subsequently along the leaf edge and, finally, on the leaf base. Older leaves change from yellow to brown, and if the deficiency is not corrected, discoloration gradually appears on younger leaves [1]. In this research, the color of the leaf sheath was introduced to first diagnose the nutrition status. Under P deficiency, red and purple colors may develop in the leaf sheaths if the variety has a tendency to produce anthocyanin. When suffering from N deficiency, rice stems appear light green, and older sheaths become lemon-yellowish. The base of the leaf sheath appears white under severe stress [14].
In this research, the optimal subsets of the characteristics represented the set of the most sensitive characteristics under different nutrition deficiencies. The results are shown in Table 4 (the numbers represent the characteristics).
As shown in Table 4, the optimal set of characteristics of the different leaf positions included the leaf color, leaf shape, and sheath length in every growth stage, which means that every nutrition deficiency can affect the 3 types of characteristics. Taking the leaf and leaf sheath samples of all growth stages together, 9 characteristics (LG, LW, AC, LSL, L23, L1/3, LTR, and LTG) were determined to be universal characteristics for all growth stages for the identification of nutrition deficiencies.
After screening the characteristics, Fisher discriminant analysis was used to identify the nutrition status of the rice. N, P, K and Normal represented nitrogen deficiency, phosphorus deficiency, potassium deficiency and normal nutrition, respectively. In this research, the IBM SPSS Statistics 20 software package was used for the analysis.
For every growth stage, 390 samples were used to build the model. The results are shown in Table 5. As shown in Table 5, the accuracy was low for the overall identification of NPK deficiencies. In later testing, N deficiency was easily misjudged as K deficiency, which led to a reduced overall accuracy. To increase the model's identification accuracy, new characteristics specific to N deficiency were needed. Previous research reported that old rice leaves and stems become light green under N stress and that the leaf sheath base becomes white under severe N deficiency [14]. Thus, in the present study, the color characteristics of the leaf sheath were introduced to increase the N stress identification accuracy.
After adding the four parameters, Fisher discriminant analysis was used to identify the nutrition deficiencies. The results in Table 6 show that the overall identification accuracy was greatly improved, which verifies the importance of using the leaf sheath in the diagnosis of nutrition deficiencies.
According to Table 6, the leaf position with the highest identification accuracy was the same for the four growth stages. The optimal leaf position was the third leaf, which coincided with NPK being very mobile in the rice. Under NPK deficiencies, the nutrients were translocated into the young leaves from the old, senescing leaves, and the symptoms first appeared on the older leaves. Further testing was conducted to discover additional details about the identification of NPK deficiencies. The results are shown in Tables 7 and 8. According to the results, Normal nutrition and N deficiency had higher identification accuracies for all growth stages compared with P and K deficiencies. As shown in Table 8, the probabilities of identification error for P and K deficiencies were higher for all growth stages. The universal characteristics were used to build the rule in the identification process, but the rule was not targeted to identify particular nutrition deficiencies. To improve the model's identification accuracy and to reduce the misjudgment of P and K deficiencies, hierarchical identification was used.
In every hierarchy, the identification rule was targeted at one nutrition deficiency; thus, different optimal characteristics were selected in different hierarchies. SVFS was used to screen the optimal characteristic parameters for every identification hierarchy, and the results are shown in Table 9. Putting all of the leaf and leaf sheath samples together, the universal characteristic set that was suitable for all stages was screened in every hierarchy for the targeted nutrition deficiency. In the results described above, normal nutrition had the highest identification accuracy. Therefore, the first hierarchy was the identification of Normal and Non-normal nutrition, for which the set was LSR, LSG, LSB, LSL, LTR, LTG, LA and LG. For the identification of N and PK deficiencies, the set was LSG, LSB, WRA, LB, and LG. For the identification of P and K deficiencies, the set was LG, LSL, A/L, LTG, L23, LSG, and LI.
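The three-step decision described above can be sketched as a cascade of binary rules, each restricted to the characteristic set selected for its hierarchy. The characteristic sets below are the ones listed in the text; the coefficients, thresholds, and sample values are made-up numbers chosen only to make the example run, and the function names are hypothetical.

```python
# Characteristic sets per hierarchy, as listed in the text.
STAGE_FEATURES = {
    1: ["LSR", "LSG", "LSB", "LSL", "LTR", "LTG", "LA", "LG"],
    2: ["LSG", "LSB", "WRA", "LB", "LG"],
    3: ["LG", "LSL", "A/L", "LTG", "L23", "LSG", "LI"],
}


def make_rule(stage, coefficients, threshold, label_above, label_below):
    """Build a linear rule y = sum(a_i * x_i) for one hierarchy."""
    unknown = set(coefficients) - set(STAGE_FEATURES[stage])
    if unknown:
        raise ValueError(f"not selected for hierarchy {stage}: {unknown}")

    def rule(sample):
        y = sum(a * sample[name] for name, a in coefficients.items())
        return label_above if y > threshold else label_below

    return rule


def identify(sample, rule1, rule2, rule3):
    """Normal vs. non-normal, then N vs. PK, then P vs. K."""
    if rule1(sample) == "Normal":
        return "Normal"
    if rule2(sample) == "N":
        return "N"
    return rule3(sample)


def overall_accuracy(samples, true_labels, rules):
    """Share of samples whose final label matches the true label."""
    predicted = [identify(s, *rules) for s in samples]
    correct = sum(p == t for p, t in zip(predicted, true_labels))
    return correct / len(samples)


# Illustrative rules only: coefficients and thresholds are invented.
RULES = (
    make_rule(1, {"LSG": 1.0}, 150.0, "Non-normal", "Normal"),
    make_rule(2, {"WRA": 1.0}, 1.0, "N", "PK"),
    make_rule(3, {"LI": 1.0}, 85.0, "K", "P"),
)
```

Because a sample only reaches the third rule after passing the first two, an error at an early hierarchy cannot be corrected later, which is why the overall accuracy counts only samples that are identified correctly through every step.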
In the hierarchical identification process, Normal and Non-normal were identified first; thus, the characteristics that appeared under Non-normal nutrition were screened. Under N deficiency, rice leaves become light green, and red and purple colors may develop in stems under P deficiency due to anthocyanin. Under K deficiency, leaf tips are yellowish-brown; thus, the characteristics screened in the first hierarchy were leaf, leaf tip and leaf sheath color. For the identification of N and PK deficiencies in the second hierarchy, the characteristics that mainly appeared under N deficiency were screened. Under N deficiency, rice leaves and sheaths become light green; thus, the selection of characteristics mainly focused on leaf and leaf sheath color. For the identification of P and K deficiencies in the third hierarchy, each specific characteristic of P and K deficiencies was selected for identification. Under P deficiency, leaves are narrow, very erect, and 'dirty' dark green, and leaf sheaths are red and purple; thus, LI, EC, LSR, etc. were selected. Under P and K deficiencies, the leaves become chlorotic at the tip, but the affected area is different. Therefore, the color of the leaf tip was also selected.
According to the screening results, Fisher discriminant analysis was used to establish three targeted rules for the identification of NPK deficiencies. For the 2013 dataset, 1560 samples (4 growth stages) were used to build the model to identify NPK deficiencies. The identification accuracy is shown in Table 10. As shown in Table 10, the leaf position with the highest identification accuracy was the same for the four growth stages. The optimal leaf position was the third leaf, the same as the optimal leaf position identified above. Under NPK deficiencies, nutrients were translocated into the young leaves from old, senescing leaves; thus, the symptoms first appeared on older leaves.
Comparing Table 6 with Table 10 shows that using hierarchical identification can effectively improve the identification accuracy. Table 11 shows the identification accuracy for each nutrition status (N, P, K, and Normal) when the overall identification accuracy was the highest. According to Table 13, the validation accuracy for each nutrition status (N, P, K, and Normal) was higher using hierarchical identification.

Conclusions
This paper takes the top three leaves and sheaths of rice under different NPK nutrition conditions as the object of research. Under laboratory conditions, the color and shape parameters were acquired from the scanned images of rice leaves and sheaths. Then, SVFS was used to select the optimal characteristic set for the identification of NPK deficiencies, and Fisher discriminant analysis was used to build the diagnosis model. To improve the accuracy of the identification, this research introduced hierarchical identification, which, compared with the standard approach of identifying all deficiency types simultaneously, can be used to build a targeted identification rule for each nutrition deficiency. The identification process was divided into three hierarchies. First, the normal and non-normal (NPK deficiencies) samples were identified. In this hierarchy, the selected characteristics were LSR, LSG, LSB, LSL, LTR, LTG, LA and LG. In the second hierarchy, N deficiency and PK deficiencies were identified. In this hierarchy, the selected characteristics were LSG, LSB, WRA, LB, and LG. Finally, P deficiency and K deficiency were identified. In this hierarchy, the selected characteristics were LG, LSL, A/L, LTG, L23, LSG, and LI. The results showed that hierarchical identification can effectively improve the accuracy of identification (86.15, 87.69, 90.00 and 89.23% for the 4 growth stages). Data representing different years were used for validation, and the validation accuracies were 83.08, 83.08, 89.23 and 90.77% for the 4 growth stages.
The study provides evidence that rice nutrient status can be diagnosed quickly and that rice NPK deficiencies can be identified accurately with scanning technology. Other crops, such as maize and wheat, usually exhibit characteristic symptoms on the leaves and sheaths when suffering from nutrition deficiencies, so this method could potentially be used to diagnose their nutrition status as well. Therefore, the technology introduced in the study has application value and development potential.