Cotton seed cultivar identification based on the fusion of spectral and textural features

Xiao Liu; Peng Guo; Quan Xu; Wenling Du

doi:10.1371/journal.pone.0303219

Abstract

Mixing cotton seeds of different cultivars and qualities can lead to differences in growth conditions and make field management difficult. Besides yield loss, it can also lead to inconsistent cotton quality and poor textile product quality, causing huge economic losses to farmers and the cotton processing industry. However, traditional cultivar identification methods for cotton seeds are time-consuming, labor-intensive, and cumbersome, and cannot meet the needs of modern agriculture and the modern cotton processing industry. Therefore, there is an urgent need for a fast, accurate, and non-destructive method for identifying cotton seed cultivars. In this study, hyperspectral images (397.32–1003.58 nm) of five cotton cultivars, namely Jinke 20, Jinke 21, Xinluzao 64, Xinluzao 74, and Zhongmiansuo 5, were captured using a Specim IQ camera, and the average spectral information of the seeds of each cultivar was then used for spectral analysis, aiming to establish a cotton seed cultivar identification model. Because the collected spectral data were visibly noisy below 400 nm and above 1000 nm, spectra from 400 nm to 1000 nm were selected as the representative spectra of the seed samples. Then, various denoising techniques, including Savitzky-Golay (SG), Standard Normal Variate (SNV), and First Derivative (FD), were applied individually and in combination to improve the quality of the spectra. Additionally, a successive projections algorithm (SPA) was employed for spectral feature selection. Based on the full-band spectra, a Partial Least Squares-Discriminant Analysis (PLS-DA) model was established. Furthermore, spectral features and textural features were fused to create Random Forest (RF), Convolutional Neural Network (CNN), and Extreme Learning Machine (ELM) identification models. The results showed that: (1) The SNV-FD preprocessing method showed the optimal denoising performance. 
(2) SPA highlighted the near-infrared region (800–1000 nm), red region (620–700 nm), and blue-green region (420–570 nm) for identifying cotton cultivars. (3) The fusion of spectral features and textural features did not consistently improve the accuracy of all modeling strategies, suggesting the need for further research on appropriate modeling strategies. (4) The ELM model had the highest cotton cultivar identification accuracy, with an accuracy of 100% for the training set and 98.89% for the test set. In conclusion, this study successfully developed a highly accurate cotton seed cultivar identification model (ELM model). This study provides a new method for the rapid and non-destructive identification of cotton seed cultivars, which will help ensure the cultivar consistency of seeds used in cotton planting, and improve the overall quality and yield of cotton.

Citation: Liu X, Guo P, Xu Q, Du W (2024) Cotton seed cultivar identification based on the fusion of spectral and textural features. PLoS ONE 19(5): e0303219. https://doi.org/10.1371/journal.pone.0303219

Editor: Narendra Khatri, Manipal Institute of Technology, INDIA

Received: November 2, 2023; Accepted: April 21, 2024; Published: May 28, 2024

Copyright: © 2024 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data are available from the Open Science Framework database. https://osf.io/yxbvq/?view_only=c7aa038da26640b48127846b25d80264 DOI 10.17605/OSF.IO/YXBVQ.

Funding: This research was funded by the National Natural Science Foundation of China (grant number: U2003109) and the Graduate Student Innovation Plan Project of Xinjiang Uygur Autonomous Region (grant number: 2023057). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

China is a major producer and consumer of cotton, and the cotton industry occupies an important position in the national economy. Different cotton cultivars differ in characteristics such as resistance to pests and diseases, fiber quality, and yield [1–3]. The market’s high quality requirements for textile products and cotton require cotton seeds to reach a certain level of purity and quality. Xinjiang is China’s largest cotton production base, and a large number of cotton cultivars are sold in the local seed markets. Driven by economic interests, some merchants mix cotton seeds of different cultivars and qualities. Such mixing cannot easily be detected by cotton farmers during trading and causes great economic losses to cotton farmers and downstream textile enterprises. Mixed seeds of different cotton cultivars can result in inconsistent lint quality during processing, leading to low-quality textile products and significant economic losses [4]. Additionally, the presence of cotton seeds of non-elite cultivars in the market poses a significant threat to agricultural production [5]. Therefore, accurate identification of cotton seed cultivars before planting is necessary and is crucial for improving cotton quality, assisting cotton seed market administration, and safeguarding the interests of cotton farmers.

Traditional methods of cotton seed cultivar identification, such as field planting, protein electrophoresis, and DNA marker analysis, are complex, time-consuming, laborious, costly, and require destructive sampling, making them unsuitable for actual scenarios [6, 7]. With the rapid development of information technology, visible/near-infrared spectral feature analysis has been applied to seed cultivar identification. For example, Zhang et al. [8] developed an identification model based on near-infrared spectra (NIR) to distinguish the different cultivars of vegetable seeds. Cui et al. [9] identified a large number of maize seed cultivars based on near-infrared reflectance spectroscopy (NIRS) and stoichiometry. Visible/near-infrared (Vis/NIR) spectroscopy technology can achieve rapid and non-destructive identification of seed cultivars [10, 11]. However, because it only contains seed spectral information, the accuracy of seed cultivar identification is not high [4]. As a new technology, hyperspectral imaging technique can integrate spectra and images, that is, it not only contains rich spectral information of seeds, but also contains image information such as shape, texture, and color. This technique solves the problem that single spectral or texture information is not sufficient to distinguish the differences between different cultivars, which is helpful to improve the identification accuracy [1, 12–14]. Especially, this technology can rapidly and non-destructively analyze the internal structure and chemical composition of samples. Thus, it has gained popularity in seed inspection [15–17]. Many researchers have analyzed the relationship between spectral and image information and predicted attributes to establish models for seed cultivar identification [13–15]. For example, Sun et al. 
[18] used PLS-DA, support vector machine (SVM), and K-nearest neighbor (KNN) models to identify black soybean seed cultivars based on spectral features, image features, and the combination of spectral and image features. Sofacles Figueredo et al. [19] successfully identified corn seed cultivars using a PLS-DA model. Zhang et al. [20] successfully identified watermelon seeds of different cultivars using an ELM model established with spectral features. Liu et al. [21] accurately identified wheat seeds of different cultivars using an SVM model. It is important to note that different models perform differently in the identification of seeds of different cultivars. For example, Wang et al. [22] reported that the CNN model outperformed the SVM and KNN models in the identification of corn seeds of different cultivars.

At present, most research on the identification of seed cultivars by hyperspectral imaging technology focuses on the seeds of corn, wheat, soybean, etc. [5, 23, 24]. The performance of hyperspectral imaging technology in the identification of cotton seed cultivars is still unclear. In addition, previous hyperspectral data acquisition required the use of a mobile platform, making the process cumbersome, and the acquired hyperspectral images had to be corrected by manual computation [25, 26]. In contrast, the portable Specim IQ handheld push-broom hyperspectral camera offers real-time data acquisition and ease of operation. Moreover, the data acquired with the Specim IQ camera are already corrected, eliminating the need for additional computation, which improves efficiency. The usability of the Specim IQ camera was already demonstrated by Jan et al. [27], who compared it with the Specim V10E sensor and evaluated its measurement quality. In this study, a PLS-DA identification model was established using full-spectrum data. After preprocessing the spectral data of cotton seeds with the selected optimal preprocessing method, the SPA was used to select the spectral features that were most sensitive to differences between cotton seed cultivars. Additionally, textural features were extracted using the gray-level co-occurrence matrix from the first principal component after performing principal component analysis on the hyperspectral images. Lastly, the performance of the Random Forest (RF), Convolutional Neural Network (CNN), and Extreme Learning Machine (ELM) identification models established based on spectral features, textural features, or the fusion of spectral and textural features in identifying cotton seeds of different cultivars was compared. 
The objectives of this study were to select the optimal preprocessing method for denoising cotton seed spectral data, extract spectral features for different cultivars, and compare the performance of RF, CNN, and ELM models established based on different features. This study will provide a new method for rapid non-destructive identification of cotton seed cultivars.

Materials and methods

Experimental materials

Cotton seeds (cultivars Jinke 20, Jinke 21, Xinluzao 64, Xinluzao 74, and Zhongmiansuo 125) were obtained from a seed company (Jinfenghe, Shihezi, China) (60 seeds per cultivar). The variations in shape, size, and skin roughness among the seed samples were minimized [4], to avoid any potential errors in the spectral data caused by inconsistency in the seeds.

Hyperspectral image acquisition

In this study, a Specim IQ hyperspectral camera (Oulu, Finland) was used to acquire hyperspectral images of cotton seeds. The camera has a wavelength range of 400–1000 nm and a spectral resolution of 7 nm. It captures images with a pixel count of 512 per line and provides 204 spectral points within the wavelength range.

The Custom mode of the Specim IQ camera was used for image acquisition. A white board was used as a reference. Green materials with distinct color difference and different absorbance from the epidermis of cotton seeds were selected as the background for indoor image acquisition. The cotton seeds were placed individually for hyperspectral image acquisition (Fig 1).


Fig 1. Hyperspectral image acquisition system.

https://doi.org/10.1371/journal.pone.0303219.g001

The obtained hyperspectral images were synthesized into pseudo-colors using three bands: 449.35, 548.55, and 598.60 nm. These images showed the spectral features of the cotton seeds and were used for further analysis of differences between cultivars.

Data sets

The ENVI 5.6 software was used to extract the spectral data of cotton seeds, with each cotton seed taken as a region of interest (ROI).

To obtain the average spectrum of each seed, the average of the spectra of all pixels within the ROI was calculated. This yielded 300 average spectra in total (60 per cultivar). These data were saved in a matrix with dimensions of 300 × 204 (raw spectral dataset). The raw spectral dataset was processed and analyzed to investigate the differences between cultivars and extract their spectral features.
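The per-seed averaging described above can be sketched in a few lines of Python. This is only an illustration, assuming each ROI has already been exported as a list of pixel spectra; the function name `mean_spectrum` and the toy values are hypothetical and are not part of the ENVI workflow used in the study.

```python
def mean_spectrum(roi_pixels):
    """Average the reflectance of all ROI pixels band by band.

    roi_pixels: list of pixel spectra, each a list with one value per band.
    Returns one spectrum (list) with the same number of bands.
    """
    n_pixels = len(roi_pixels)
    n_bands = len(roi_pixels[0])
    return [sum(pixel[b] for pixel in roi_pixels) / n_pixels
            for b in range(n_bands)]

# Toy ROI with two pixels and three bands (the real data have 204 bands).
roi = [[0.10, 0.20, 0.30],
       [0.30, 0.40, 0.50]]
seed_spectrum = mean_spectrum(roi)
print(seed_spectrum)
```

Stacking one such spectrum per seed would give the 300 × 204 matrix mentioned above.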

The data of 42 seeds of each cultivar were used as the training set, and the data of the remaining 18 seeds were used as the test set (7:3) (Table 1). The values 1, 2, 3, 4, and 5 were assigned to the data of Jinke 20, Jinke 21, Xinluzao 64, Xinluzao 74, and Zhongmiansuo 125, respectively.
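As a minimal sketch of this 42/18 split per cultivar, the following Python snippet shuffles the seed indices within each class. The random seed and the function name `split_per_cultivar` are assumptions made for illustration; the text does not state how the seeds were assigned to the two sets.

```python
import random

CULTIVARS = ["Jinke 20", "Jinke 21", "Xinluzao 64",
             "Xinluzao 74", "Zhongmiansuo 125"]

def split_per_cultivar(n_per_class=60, n_train=42, seed=0):
    """Split seed indices within each cultivar into training and test parts.

    Returns two lists of (label, seed_index) pairs; labels run from 1 to 5.
    """
    rng = random.Random(seed)  # fixed seed only for reproducibility of the sketch
    train, test = [], []
    for label in range(1, len(CULTIVARS) + 1):
        indices = list(range(n_per_class))
        rng.shuffle(indices)
        train += [(label, i) for i in indices[:n_train]]
        test += [(label, i) for i in indices[n_train:]]
    return train, test

train, test = split_per_cultivar()
print(len(train), len(test))  # 210 90
```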


Table 1. Division of training set and testing set.

https://doi.org/10.1371/journal.pone.0303219.t001

Data analysis

Spectral preprocessing

Background, uneven illumination, and other factors interfered with hyperspectral image acquisition, introducing noise into the raw spectra. To reduce the influence of these interferences and extract spectral features, three denoising algorithms, Savitzky-Golay smoothing (SG), first-order derivative (FD), and standard normal variate (SNV), and their combinations (SG-SNV, SG-FD, SNV-FD, and SG-SNV-FD) were used to preprocess the raw spectra, and their denoising performances were compared.

The SG smoothing is a smoothing method based on local polynomial fitting [28]. It smooths the spectral curve by fitting a local polynomial, which reduces the influence of high-frequency noise. The choice of the number of smoothing points directly affects the smoothing performance [29, 30]. In this study, SG quadratic polynomial 7-point smoothing was chosen to preprocess the spectral data (Eq (1)).

$$x_{a,\mathrm{smooth}} = \frac{1}{H}\sum_{i=-\omega}^{+\omega} x_{a+i}\,h_i \tag{1}$$

where $x_{a,\mathrm{smooth}}$ is the reflectance at point $a$ after SG preprocessing, $x_{a+i}$ is the raw reflectance at point $a+i$, $h_i$ is the smoothing coefficient, $H$ is the normalization factor, and $\omega$ is half the window width.
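A minimal Python sketch of 7-point quadratic SG smoothing is given here for illustration. It uses the standard tabulated coefficients for a 7-point quadratic window, (-2, 3, 6, 7, 6, 3, -2) with normalization factor 21. Leaving the first and last three points unsmoothed is an assumption of this sketch, since the text does not say how the spectrum edges were handled.

```python
# Standard convolution coefficients for a 7-point quadratic SG smoother.
SG7_COEFFS = (-2, 3, 6, 7, 6, 3, -2)
SG7_NORM = 21  # normalization factor: the sum of the coefficients

def sg_smooth_7pt(spectrum):
    """Smooth a spectrum with a 7-point quadratic Savitzky-Golay window.

    Edge points (first and last three) are copied unchanged (assumption).
    """
    half = len(SG7_COEFFS) // 2  # half window width, 3 for a 7-point window
    smoothed = list(spectrum)
    for a in range(half, len(spectrum) - half):
        weighted = sum(h * spectrum[a + i - half]
                       for i, h in enumerate(SG7_COEFFS))
        smoothed[a] = weighted / SG7_NORM
    return smoothed

# A quadratic curve passes through the filter unchanged, as expected
# for a smoother that fits a local quadratic polynomial.
quadratic = [k * k for k in range(10)]
print(sg_smooth_7pt(quadratic))
```

In practice a library routine such as `scipy.signal.savgol_filter` would normally be preferred; the hand-written version only makes the weighting explicit.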

The first derivative highlights the edge features of the spectrum and suppresses low-frequency noise. By differentiating the spectral data, the negative effects caused by spectral baseline drift can be eliminated [31, 32].

$$D(\lambda_i) = \frac{R(\lambda_{i+1}) - R(\lambda_{i-1})}{\lambda_{i+1} - \lambda_{i-1}} \tag{2}$$

where $D(\lambda_i)$ represents the first-order derivative reflectance at wavelength $\lambda_i$, and $R(\lambda_{i+1})$ and $R(\lambda_{i-1})$ represent the raw spectral reflectance at wavelengths $\lambda_{i+1}$ and $\lambda_{i-1}$, respectively.
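The following Python sketch illustrates a central-difference first derivative and why it removes a constant baseline offset. The function name and the toy spectrum are hypothetical, and dividing by the wavelength difference between the two neighbouring bands is an assumption about the exact form used.

```python
def first_derivative(wavelengths, reflectance):
    """Central-difference first derivative of a spectrum.

    Returns one value per interior band (the two end bands are dropped).
    """
    return [
        (reflectance[i + 1] - reflectance[i - 1])
        / (wavelengths[i + 1] - wavelengths[i - 1])
        for i in range(1, len(reflectance) - 1)
    ]

# Toy spectrum on a uniform 3 nm grid, plus the same spectrum with a
# constant baseline shift of 0.1 added to every band.
wl = [400.0, 403.0, 406.0, 409.0, 412.0]
raw = [0.20, 0.23, 0.29, 0.38, 0.50]
shifted = [r + 0.1 for r in raw]

d_raw = first_derivative(wl, raw)
d_shifted = first_derivative(wl, shifted)
print(d_raw)
```

Both derivative spectra agree up to floating-point error, which is the baseline-drift removal mentioned above.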

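The Standard Normal Variate (SNV) transform can eliminate the impacts of solid particle size, surface scattering, and brightness differences on the spectral data [33]. The SNV algorithm processes the spectral matrix row by row, that is, one spectrum at a time (Eqs (3) and (4)).

$$x_{\mathrm{SNV}} = \frac{x - \bar{x}}{\sqrt{\dfrac{\sum_{k=1}^{m}\left(x_k - \bar{x}\right)^2}{m-1}}} \tag{3}$$

$$\bar{x} = \frac{1}{m}\sum_{k=1}^{m} x_k \tag{4}$$

where $x_{\mathrm{SNV}}$ is the reflectance after SNV preprocessing, $x$ is the raw reflectance, $x_k$ is the raw reflectance at the $k$-th wavelength, $\bar{x}$ is the average reflectance of the spectrum, and $m$ is the number of wavelengths.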
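A short Python sketch of the row-wise SNV transform follows. It assumes the sample standard deviation (divisor m - 1), which is the usual convention for SNV; the toy spectrum is made up for illustration.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own
    mean and sample standard deviation (divisor m - 1)."""
    mean = statistics.mean(spectrum)
    std = statistics.stdev(spectrum)
    return [(x - mean) / std for x in spectrum]

# A multiplicative brightness change plus an additive offset leaves the
# SNV-transformed spectrum unchanged up to floating-point error.
raw = [0.21, 0.25, 0.32, 0.41, 0.38, 0.30]
brighter = [1.5 * x + 0.05 for x in raw]
print(snv(raw))
print(snv(brighter))
```

This invariance is why SNV is commonly used to suppress scattering and brightness differences between samples.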

Feature selection

Feature selection was conducted to eliminate redundant and collinear information in the full-spectrum data. Such redundancy not only hinders the extraction of spectral features but also increases computational complexity. Therefore, the SPA, a forward variable selection algorithm that minimizes collinearity, was employed to eliminate redundant information within the spectra and extract spectral features [34–36]. The spectral data preprocessed by the optimal preprocessing method and the category assignments were used as inputs. Features were selected through successive iterations, and multiple linear regression analysis was performed after each iteration. The algorithm ran until the root mean square error (RMSE) of the test set stabilized at its minimum value; the output at that point was the optimal number of variables, the selected spectral features, and their contributions.
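The projection step at the core of the SPA can be sketched as follows. This is a simplified illustration only: it covers the successive orthogonal projections for one fixed starting wavelength and a fixed number of variables, and it omits the regression and RMSE evaluation that the study used to choose the number of variables. The function name `spa_select` and the toy matrix are hypothetical.

```python
def spa_select(X, n_select, start=0):
    """Select n_select column indices of X by successive projections.

    X: list of rows (samples), each a list of values (wavelengths).
    At each step every unselected column is projected onto the subspace
    orthogonal to the last selected column, and the column with the
    largest remaining norm (least collinear) is chosen next.
    """
    n_cols = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_cols)]  # column-major copy
    selected = [start]
    for _ in range(n_select - 1):
        ref = cols[selected[-1]]
        ref_norm2 = sum(v * v for v in ref)
        best_j, best_norm2 = None, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            coef = sum(a * b for a, b in zip(cols[j], ref)) / ref_norm2
            cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm2 = sum(v * v for v in cols[j])
            if norm2 > best_norm2:
                best_j, best_norm2 = j, norm2
        selected.append(best_j)
    return selected

# Column 1 is an exact multiple of column 0, so it is never picked.
X = [[1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.5],
     [0.0, 0.0, 0.0, 0.1]]
print(spa_select(X, 3))  # [0, 2, 3]
```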

Textural feature extraction

Textural features show the structural features of the object surface in hyperspectral images. These features can provide valuable information for distinguishing cotton seed cultivars. In this study, the gray-level co-occurrence matrix including 14 types of textural features was used to extract textural features [37]. The gray-level co-occurrence matrix provides information about the spatial relationships between pixels, which is useful for texture analysis [38].

Firstly, the first principal component after principal component analysis (PCA) was selected. This component represents the most significant variation in the data and helps reduce dimensionality [39]. Then, eight commonly used textural features were selected from the gray-level co-occurrence matrix. These features include mean, variance, homogeneity, contrast, dissimilarity, entropy, second-order moments, and correlation.
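To make the gray-level co-occurrence matrix concrete, the sketch below builds a symmetric, normalized matrix for one pixel offset and computes five of the eight features named above using their common textbook definitions. Several points are assumptions of this illustration: the image is already quantized to a small number of gray levels, a single horizontal offset is used, and the whole image is treated as one window rather than a moving window.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Common texture descriptors of a normalized co-occurrence matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "second_moment": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }

# A flat patch has no contrast; a checkerboard patch has maximal contrast
# for two gray levels.
flat = texture_features(glcm([[1, 1], [1, 1]], levels=2))
checker = texture_features(glcm([[0, 1], [1, 0]], levels=2))
print(flat["contrast"], checker["contrast"])
```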

The size of the moving window used in textural feature extraction is crucial [40]. In this study, the identification performance obtained with window sizes of 3 × 3, 5 × 5, 7 × 7, and 9 × 9 was compared. It was found that 3 × 3 was the optimal window size.

The extracted useful textural features were used in modeling. These textural features provide additional information beyond spectral data and can enhance the accuracy and reliability of the identification model.

Machine learning

A PLS-DA (Partial Least Squares Discriminant Analysis) model based on the full-band spectral data of cotton seeds of the five cultivars was established to evaluate the influence of different preprocessing methods on the accuracy of the cotton seed cultivar identification model. PLS-DA is a supervised multivariate statistical method based on PLS regression [41, 42]; here it uses the spectral data to construct the identification model. The study also employed the RF algorithm, known for its ability to handle high-dimensional features and its fast training speed, for modeling [43]. RF constructs multiple decision trees by randomly selecting features and classifies samples through a voting mechanism, which helps mitigate overfitting [44].

Additionally, the study used the CNN model, a deep learning algorithm capable of distinguishing spectral and image information, for modeling. CNN achieves accurate detection and prediction through hierarchical layer stacking and specially designed network structures [45, 46]. In this study, the CNN model structure, which was modified from the model structure of Wang et al. [22], consisted of 12 layers: an input layer, two convolutional layers, two normalization layers, two activation layers, two pooling layers, two fully connected layers, and one output layer. The size of the convolution kernel was 2 × 1, and the activation function was ReLU. To avoid overfitting, L2 regularization was incorporated in both convolutional layers, and the coefficient was set to 0.0001. Lastly, the ELM algorithm was employed for modeling [47]. ELM randomly initializes the hidden-layer weights and uses simple linear equations to obtain the output-layer weights. Compared with traditional neural network models, ELM has faster learning speed and better generalization ability [48].
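The ELM training procedure described above (random hidden-layer weights, output weights from a linear solve) can be sketched with NumPy as follows. This is not the configuration used in the study: the tanh activation, the number of hidden nodes, the random seeds, and the toy data are all assumptions, and class labels are 0-based here rather than the 1 to 5 coding used for the cultivars.

```python
import numpy as np

def elm_train(X, y, n_hidden=50, seed=0):
    """Fit a basic ELM classifier; y holds integer labels starting at 0."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden-layer biases
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    T = np.eye(int(y.max()) + 1)[y]              # one-hot target matrix
    beta = np.linalg.pinv(H) @ T                 # least-squares output weights
    return W, b, beta

def elm_predict(model, X):
    """Predict class labels as the index of the largest output node."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy data: two well-separated clusters standing in for two cultivars.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.3, size=(10, 2)),
               rng.normal(2.0, 0.3, size=(10, 2))])
y = np.array([0] * 10 + [1] * 10)
model = elm_train(X, y, n_hidden=20)
print((elm_predict(model, X) == y).mean())
```

Because only the output weights are solved for, training reduces to a single pseudo-inverse, which is the source of the fast learning speed mentioned above.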

The performance of the cotton seed cultivar identification models established with these three algorithms by combining the selected spectral features and textural features was compared.
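One simple way to combine the two feature sets is to concatenate them sample by sample, as sketched below in Python. The z-score standardization of each column before concatenation is an assumption of this sketch, added because spectral and textural values can sit on very different scales; the text does not state whether or how the features were scaled before fusion. The toy values are made up.

```python
import statistics

def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance.

    Constant columns are left centred but unscaled to avoid division by zero.
    """
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def fuse(spectral, textural):
    """Concatenate standardized spectral and textural features per sample."""
    return [s + t for s, t in zip(zscore_columns(spectral),
                                  zscore_columns(textural))]

# Three toy samples with 15 spectral and 8 textural features each.
spectral = [[0.1 * (i + j) for j in range(15)] for i in range(3)]
textural = [[10.0 * (i + 1) + j for j in range(8)] for i in range(3)]
fused = fuse(spectral, textural)
print(len(fused), len(fused[0]))  # 3 23
```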

Model evaluation

In this study, the Accuracy (the proportion of correctly identified cotton seed samples to the total samples) was used to evaluate the model. The higher the Accuracy, the higher identification accuracy of the model.
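As a small worked example of this metric, the Python snippet below relates the accuracy percentages reported in this paper to seed counts, using the set sizes of 210 training seeds and 90 test seeds. The counts of correctly identified seeds (89, 169, and 64) are back-calculated here from the reported percentages and are not stated in the text.

```python
def accuracy(n_correct, n_total):
    """Proportion of correctly identified samples."""
    return n_correct / n_total

n_train = 5 * 42  # 210 training seeds
n_test = 5 * 18   # 90 test seeds

# 98.89% on the test set is consistent with 89 of 90 seeds correct.
print(f"{accuracy(89, n_test):.2%}")    # 98.89%
# 80.48% and 71.11% are consistent with 169 of 210 and 64 of 90.
print(f"{accuracy(169, n_train):.2%}")  # 80.48%
print(f"{accuracy(64, n_test):.2%}")    # 71.11%
```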

Results

Selection of optimal spectral pre-processing methods

Raw spectral data consisted of 204 spectral points (wavelength range: 397–1004 nm). The spectral reflectance of the five cotton cultivars mostly overlapped across the entire spectral range (Fig 2(a)), except for 400–700 nm and 800–900 nm (Fig 2(b)). Notably, the differences in spectral reflectance between Jinke 20 and the other four cultivars were the most apparent. To facilitate feature selection from the raw data, the noisy bands at the front and back ends of the spectra were removed. Finally, the spectra between 400 and 1000 nm were preprocessed using SG, SNV, FD, SG-SNV, SG-FD, SNV-FD, and SG-SNV-FD.


Fig 2. Average spectrum of single seeds (a) and average spectrum of single cultivars (b).

https://doi.org/10.1371/journal.pone.0303219.g002

The SG smoothing yielded smooth curves and eliminated jagged noise across the entire spectral range (Fig 3(a)). The SNV transform reduced intra-cultivar spectral differences but increased inter-cultivar differences (Fig 3(b)). The first-order derivative highlighted the slope changes in the raw spectral data and made the positions and shapes of the spectral peaks more prominent, highlighting the spectral features (Fig 3(d)). In summary, the preprocessing methods substantially reduced the noise in the raw spectral data, yielding a more reliable dataset for subsequent feature extraction.


Fig 3. Spectral reflectance curves after preprocessing with SG smoothing (a), SNV (Standard normal variate) (b), SG-SNV (c), FD (First derivative) (d), SG-FD (e), SNV-FD (f), and SG-SNV-FD (g).

https://doi.org/10.1371/journal.pone.0303219.g003

In this study, PLS-DA was performed on the training and test sets of the preprocessed full-band spectra. Table 2 shows the PLS-DA results for each preprocessing method. The optimal results were obtained for the spectra preprocessed using the SNV-FD method, for which the accuracy of both the training set and the test set was 1. It is worth noting that although the SG smoothing eliminated noise from the spectra, it also eliminated some valid information, resulting in the poorest PLS-DA results.


Table 2. Identification results of PLS-DA models based on different preprocessing methods.

https://doi.org/10.1371/journal.pone.0303219.t002

Feature selection

In this study, fifteen variables were selected as spectral features (569.12 nm, 637.08 nm, 472.59 nm, 501.72 nm, 960.40 nm, 954.24 nm, 868.55 nm, 905.18 nm, 966.55 nm, 963.47 nm, 917.42 nm, 874.64 nm, 975.79 nm, 920.48 nm, and 420.40 nm) (Fig 4a), and the contribution of each was calculated.


Fig 4. Number of wavelengths selected by successive projections algorithm (SPA) (a) and the wavelengths selected by successive projections algorithm (SPA) (b).

https://doi.org/10.1371/journal.pone.0303219.g004

Among these wavelengths, the absorption peaks in the near-infrared (NIR) region are primarily attributed to the water content of the cotton seeds, especially the presence of the O-H group. The absorption peaks near the red region are related to the stretching vibration of the C-H chemical bond in the samples [49]. These bands near the red region are crucial for capturing the differences in chemical composition and physical state among different seeds [50].

To validate the selected spectral features, the positional distribution of the spectral features was analyzed using the data from the first sample of Jinke 20. The results showed that these spectral features were predominantly distributed in the near-infrared (800–1000 nm), red (620–700 nm), and blue-green (420–570 nm) regions (Fig 4b).

Overall, the selected spectral features provided valuable information for identifying different cotton seed samples by chemical composition and physical state.

Cotton seed cultivar identification based on spectral features

In this study, the 15 spectral features extracted by SPA after SNV-FD preprocessing were used for RF, CNN, and ELM modeling. The model identification results based on the training set and test set are presented (Fig 5). All three models achieved good identification results on both the training and test sets. The SNV-FD-SPA-ELM model performed the best, achieving 100% accuracy on the training set and 98.89% accuracy on the test set. The SNV-FD-SPA-RF and SNV-FD-SPA-CNN models performed worse.


Fig 5. The identification results of RF (a, training set; b, test set), CNN (c, training set; d, test set), and ELM (e, training set; f, test set) models based on spectral features extracted by SPA.

https://doi.org/10.1371/journal.pone.0303219.g005

These results indicate that the extracted spectral features can be used for identifying different cultivars of cotton seeds. The SNV-FD preprocessing method effectively eliminated noises in the spectrum, the SPA extracted the most representative wavelengths, and the ELM modeling yielded optimal identification results.

Cotton seed identification based on textural features

The analysis results of the textural features of the five cultivar seeds (Table 3) showed that the Mean, Variance, and Contrast features exhibited significant differences between cultivars, making them useful for identifying cotton seed cultivars. The Homogeneity feature of Xinluzao 74 was obviously different from that of the other four cultivars. The Dissimilarity feature was effective in identifying Jinke 21 and Xinluzao 74 from the other three cultivars. Although Entropy, Second Moment, and Correlation features did not exhibit obvious differences, they still had subtle variations that could contribute to identifying the five cotton cultivars to some extent.


Table 3. Mean values of textural features of each cotton cultivar.

https://doi.org/10.1371/journal.pone.0303219.t003

Based on these textural feature analysis results, the RF, CNN, and ELM models were established. Among these models, the ELM model demonstrated the best identification performance (Fig 6). However, compared with the identification based on spectral features, the identification performance based on textural features was poorer, with an accuracy of 80.48% for the training set and 71.11% for the test set. The RF and CNN models achieved their highest identification accuracy for Xinluzao 64, with the number of misclassified samples not exceeding three for both the training and test sets. This may be because the Second Moment and Correlation features of Xinluzao 64 were obviously different from those of the other four cultivars. It should be noted that, compared with Xinluzao 64, Jinke 20 and Jinke 21 were more likely to be misclassified, as were Xinluzao 74 and Zhongmiansuo 125.


Fig 6. The identification results of the RF (a, training set; b, test set), CNN (c, training set; d, test set), and ELM (e, training set; f, test set) models based on textural features extracted by gray-level co-occurrence matrix.

https://doi.org/10.1371/journal.pone.0303219.g006

Identification of cotton seed cultivars by combining spectral and textural features

Fig 7 presents the results of cotton seed cultivar identification by combining spectral and textural features. Compared with the results based on spectral or textural features alone (Figs 5 and 6), the accuracy of both the RF and ELM models was higher. After combining spectral and textural features, the ELM model had the highest accuracy (100% for the training set and 98.89% for the test set) among the three models. The CNN model showed a significant increase in accuracy compared with the CNN model based on textural features, but a decrease in accuracy (19.52% and 26.67% for the training and test sets, respectively) compared with the CNN model based on spectral features.


Fig 7. The identification results of the RF (a, training set; b, test set), CNN (c, training set; d, test set), and ELM (e, training set; f, test set) models established by combining spectral and textural features.

https://doi.org/10.1371/journal.pone.0303219.g007

Discussion

Effects of different preprocessings

Previous studies have explored the selection of pre-processing methods for identification tasks. For instance, a study on cotton seed cultivar identification found that the combination of SG (Savitzky-Golay) smoothing (with a seven-point quadratic filter) and normalization yielded the best results [4]. Sharma et al. [41] confirmed that the model based on the SG2 preprocessing obtained the highest wheat seed cultivar identification accuracy. Additionally, a study on the identification of frost-damaged rice seeds using hyperspectral imaging and a deep forest model found that the Multiplicative Scatter Correction pre-processing was the most effective in increasing identification accuracy [51]. In this study, various spectral pre-processing methods were compared to select the optimal one for pre-processing the raw spectra. The performance of different pre-processing methods was evaluated by building a PLS-DA model using the pre-processed full-band spectral data. It was found that the combination of SNV and FD was the most effective in eliminating noises from the spectra of cotton seeds and enhancing the accuracy of the model. Compared with the SG smoothing combined with normalization spectral preprocessing method used by Huang et al. [4] in the study of the cotton seed cultivar identification based on hyperspectral image technology, the SNV-FD preprocessing method used in this study can effectively reduce the noise of cotton seed spectral data. Especially, the identification accuracy of the PLS-DA model based on the spectral data preprocessed by SNV-FD in this study is 4% (training set) and 6% (test set) higher than that of the PLS-DA model based on the spectral data preprocessed by SG smoothing combined with normalization [4]. These findings highlight that different pre-processing methods can yield different outcomes depending on the specific context and object characteristics. 
Therefore, when selecting the optimal pre-processing method, it is crucial to consider the specific environment, object characteristics, etc., and conduct evaluations. This can ensure the high quality and usability of spectral data for the given task.

Spectral variability analysis

Feature selection using the SPA can effectively reduce redundant information, improving the processing efficiency of spectral data and the accuracy of the identification model [34–36]. Therefore, in this study, the SPA was employed for spectral feature selection, and the importance of different wavelengths in identifying cotton seed cultivars was analyzed. It was found that the top 15 wavelengths by contribution were predominantly in the near-infrared (800–1000 nm), red (620–700 nm), and blue-green (420–570 nm) regions. This indicates that the spectral features in the blue-green, red, and red-edge regions are crucial in differentiating cotton seed cultivars. The spectral features in the blue-green region can reflect the chlorophyll content and growth status of cotton plants [4]. The spectral features in the red region can reflect the leaf thickness and chlorophyll content of cotton plants [52]. The spectral features in the red-edge region can reflect the leaf structure and growth status of cotton plants [53, 54]. The chlorophyll content, leaf thickness, leaf structure, and growth status of different cotton cultivars vary greatly. Hence, the blue-green, red, and red-edge bands can be used to identify different cotton seed cultivars. By comprehensively using the spectral features of the blue-green, red, and red-edge regions, accurate identification of different cotton seed cultivars can be realized.

Identification performance based on the combination of spectral and textural features

In this study, to validate the effectiveness of textural features for the identification of cotton seed cultivars, spectral features were combined with eight textural features selected from the gray-level co-occurrence matrix. The results showed that the inclusion of textural features significantly improved the identification accuracy of the RF and ELM models, that is, the overall identification accuracy of the models established using spectral and textural features was higher than that of the models based solely on full-spectrum data, spectral features, or textural features (Figs 5–7). However, the accuracy of the CNN model decreased after the inclusion of textural features (Figs 5–7). This indicates that increasing the number of features does not necessarily increase the identification accuracy. This may be because the spectral features play a dominant role in identification, and the CNN model may become overfitted or insufficiently trained when textural features are added, leading to a decrease in identification accuracy. Therefore, when constructing a model for cotton seed cultivar identification, the selection of appropriate features is necessary. Additionally, it is crucial to combine the selected features with a suitable modeling strategy for further optimization. This can help identify the most effective feature sets and the optimal modeling strategy to achieve the highest accuracy in cotton seed cultivar identification.

In this study, various preprocessing methods were combined with the RF, CNN, and ELM algorithms for cotton seed cultivar identification. The highest accuracy (100%) was achieved by combining SNV-FD preprocessing, the fusion of spectral and textural features, and the ELM modeling strategy. This is similar to the accuracy reported by Zhang et al. [20] for watermelon seed cultivar identification using near-infrared hyperspectral imaging. However, this study still has some limitations. First, the seeds were separated for hyperspectral image acquisition, whereas in practice seeds are stacked; whether the model can still perform well on stacked seeds needs to be tested in future studies. Second, only five cotton cultivars widely planted in Xinjiang, China were studied, whereas many more cultivars are available in the cotton seed trade market. Therefore, follow-up research will include more cotton cultivars to improve the stability and applicability of the model.
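For readers unfamiliar with the SNV-FD combination, a minimal sketch of the two steps is given below. It is an assumption-laden illustration rather than the processing code of this study: SNV is implemented with the sample standard deviation, the first derivative is approximated by a plain difference between adjacent bands (a Savitzky-Golay derivative is a common alternative), and the function names are hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard Normal Variate: center and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1); an assumption
    if std == 0:
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return [(v - mean) / std for v in spectrum]


def first_derivative(spectrum):
    """First derivative approximated by differences between adjacent bands."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def snv_fd(spectrum):
    """Apply SNV first, then the first derivative."""
    return first_derivative(snv(spectrum))


# Two toy spectra that differ only by a multiplicative factor.
print(snv_fd([1, 2, 3]))     # [1.0, 1.0]
print(snv_fd([10, 20, 30]))  # [1.0, 1.0]
```

The two toy spectra give identical outputs because SNV removes additive offsets and multiplicative scaling within each spectrum, and the derivative then emphasizes changes in slope between bands.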

Conclusions

In this study, the raw spectral data were preprocessed using various spectral preprocessing methods before feature selection with the successive projections algorithm. The spectral and textural features were then fused to establish RF, CNN, and ELM identification models. The main conclusions are as follows:

The SNV-FD preprocessing method effectively eliminated noise and highlighted spectral features. Based on the selected spectral features, the PLS-DA model achieved the highest accuracy (the accuracy, R², and Q² were 1.0000, 0.9883, and 0.9868, respectively, for the training set, and 1.0000, 0.9900, and 0.9818, respectively, for the test set).

The importance of different spectral regions in distinguishing cotton seed cultivars varied. The spectral features selected from the near-infrared (800–1000 nm), red (620–700 nm), and blue-green (420–570 nm) regions by the SPA algorithm were found to be the most effective for identifying cotton seed cultivars. Utilizing the information from these bands improved the cotton seed cultivar identification accuracy.

The inclusion of textural features did not universally improve the accuracy of all models. The identification accuracy of the RF and ELM models was significantly improved, while that of the CNN model decreased. This may be attributed to overfitting or insufficient training of the CNN model after fusing spectral and textural features. Therefore, selecting an appropriate modeling strategy is crucial to enhancing identification accuracy in practical applications.

The ELM model showed the highest cotton seed cultivar identification accuracy (100% for the training set and 98.89% for the test set). Therefore, the ELM model has strong generalization ability and can be applied in practical production.

Overall, the study highlights the importance of selecting appropriate preprocessing methods, feature selection techniques, and modeling strategies in cotton seed cultivar identification, and provides a valuable reference for practical applications.

References

1. Zhu S, Zhou L, Gao P, Bao Y, He Y, Feng L. Near-infrared hyperspectral imaging combined with deep learning to identify cotton seed varieties. Molecules. 2019; 24(18): 3268. pmid:31500333
2. Satturu V, Rani D, Gattu S, Md J, Mulinti S, Nagireddy RK, et al. DNA Fingerprinting for identification of rice varieties and seed genetic purity assessment. Agric Res. 2018;7(4):379–390.
3. Korir NK, Han J, Shangguan L, Wang C, Kayesh E, Zhang Y, et al. Plant variety and cultivar identification: advances and prospects. Crit Rev Biotechnol. 2013;33(2):111–125. pmid:22698516
4. Huang DY. Study on Identification Method of Delinted Cottonseeds Varieties Based on Hyperspectral Image Technology. M.Sc. Thesis, Shihezi University, 2018.
5. Wang H, Wang K, Wu JZ, Han P. Progress in research on rapid and non-destructive detection of seed quality based on spectroscopy and imaging technology. Spectrosc Spect Anal. 2021;41(1):52–59.
6. Bao Y, Mi C, Wu N, Liu F, He Y. Rapid classification of wheat grain varieties using hyperspectral imaging and chemometrics. Appl Sci. 2019;9(19):4119.
7. Wang JF. SSR Markers-based Cotton Purity and Authenticity Identification. M.Sc. Thesis, Chinese Academy of Agricultural Sciences, 2009.
8. Zhang GK, Sun J, Wu XH, Li QL, Jiang SY. Identification of greengrocery seeds based on NIR and different pretreatment methods. Adv Mater Res. 2014; 1049: 1237–1240.
9. Cui Y, Xu L, An D, Liu Z, Gu J, Li S, et al. Identification of maize seed varieties based on near infrared reflectance spectroscopy and chemometrics. Int J Agric Biol Eng. 2018;11(2):177–183.
10. Li C, Wang X, Meng Z, Fan P, Cai J. Pepper seed variety identification based on visible/near-infrared spectral technology. In Infrared, Millimeter-Wave, and Terahertz Technologies IV. Bellingham, WA: SPIE; 2016, pp. 447–455.
11. Wu D, Feng L, He Y. Fast variety discrimination of Chinese cabbage seed based on Vis/NIR spectroscopy technique. In International Conference on Complex Systems and Applications; Waterloo: Watam Press; 2007.
12. Fayyazi S, Abbaspour-Fard MH, Rohani A, Monadjemi SA, Sadrnia H. Identification and classification of three iranian rice varieties in mixed bulks using image processing and MLP neural network. Int J Food Eng. 2017;13(5):20160121.
13. Rahman A, Cho BK. Assessment of seed quality using non-destructive measurement techniques: A review. Seed Sci. Res. 2016;26(4):285–305.
14. Gowen AA, O’Donnell CP, Cullen PJ, Downey G, Frias JM. Hyperspectral imaging–an emerging process analytical tool for food quality and safety control. Trends Food Sci Technol. 2007;18(12):590–598.
15. Feng L, Zhu S, Liu F, He Y, Bao Y, Zhang C. Hyperspectral imaging for seed quality and safety inspection: A review. Plant Methods. 2019;15(1):1–25. pmid:31406499
16. Orozco J, Manian V, Alfaro E, Walia H, Dhatt BK. Graph Convolutional Network Using Adaptive Neighborhood Laplacian Matrix for Hyperspectral Images with Application to Rice Seed Image Classification. Sensors. 2023; 23(7): 3515. pmid:37050573
17. Díaz-Martínez V, Orozco S. A deep learning framework for processing and classification of hyperspectral rice seed images grown under high day and night temperatures. Sensors. 2023; 23(9): 4370. pmid:37177572
18. Sun J, Jiang S, Mao H, Wu X, Li Q. Classification of black beans using visible and near infrared hyperspectral imaging. Int J Food Prop. 2016; 19(8): 1687–1695.
19. Carreiro Soares SF, Medeiros EP, Pasquini C, De Lelis Morello C, Harrop Galvao RK, Ugulino Araújo M. Classification of individual cotton seeds with respect to variety using near-infrared hyperspectral imaging. Anal Methods. 2016; 8(48): 8498–8505.
20. Zhang C, Liu F, Kong W, Zhang H, He Y. Fast identification of watermelon seed variety using near infrared hyperspectral imaging technology. Trans Chin Soc Agric Eng. 2013; 29(20): 270–277.
21. Liu J, Liu S, Shi T, Wang X, Chen Y, Liu F, et al. A modified feature fusion method for distinguishing seed strains using hyperspectral data. Int J Food Eng. 2020; 16(7): 20190362.
22. Wang LG, Wang LF. Variety identification model for maize seeds using hyperspectral pixel-level information combined with convolutional neural network. J Remote Sens. 2021; 25(11): 2234–2244.
23. Michelon TB, Vieira ESN, Panobianco M. Spectral imaging and chemometrics applied at phenotyping in seed science studies: a systematic review. Seed Sci Res. 2023;33(1):9–22.
24. Zhao L, Haque S, Wang R. Automated seed identification with computer vision: challenges and opportunities. Seed Sci Technol. 2022; 50(2): 75–102.
25. Jin S, Zhang W, Yang P, Zheng Y, An J, Zhang Z, et al. Spatial-spectral feature extraction of hyperspectral images for wheat seed identification. Comput Electr Eng. 2022; 101: 108077.
26. Huang M, He C, Zhu Q, Qin J. Maize seed variety classification using the integration of spectral and image features combined with feature transformation based on hyperspectral imaging. Appl Sci. 2016; 6(6): 183.
27. Behmann J, Acebron K, Emin D, Bennertz S, Matsubara S, Thomas S, et al. Specim IQ: evaluation of a new, miniaturized handheld hyperspectral camera and its application for plant phenotyping and disease detection. Sensors. 2018; 18(2): 441. pmid:29393921
28. Zhang J, Mouazen AM. Fractional-order Savitzky–Golay filter for pre-treatment of on-line vis–NIR spectra to predict phosphorus in soil. Infrared Phys Technol. 2023; 131: 104720.
29. Prasad KA, Gnanappazham L, Selvam V, Ramasubramanian R, Kar CS. Developing a spectral library of mangrove species of Indian east coast using field spectroscopy. Geocarto Int. 2015; 30(5): 580–599.
30. Savitzky A, Golay MJ. Smoothing and differentiation of data by simplified least squares procedures. Anal Chem. 1964; 36(8):1627–1639.
31. Al-Moustafa T, Armitage RP, Danson FM. Mapping fuel moisture content in upland vegetation using airborne hyperspectral imagery. Remote Sens Environ. 2012; 127: 74–83.
32. ElMasry GM, Nakauchi S. Image analysis operations applied to hyperspectral images for non-invasive sensing of food quality–A comprehensive review. Biosyst Eng. 2016; 142: 53–82.
33. Shen Y, Li B, Li G, Lang C, Wang H, Zhu J, et al. Rapid identification of producing area of wheat using terahertz spectroscopy combined with chemometrics. Spectrochim. Acta A Mol. Biomol. Spectrosc. 2022; 269: 120694. pmid:34922288
34. Wang Q, Zhang H, Li F, Gu C, Qiao Y, Huang S. Assessment of calibration methods for nitrogen estimation in wet and dry soil samples with different wavelength ranges using near-infrared spectroscopy. Comput Electron Agric. 2021; 186: 106181.
35. Yang DF, Li AC, Liu JM, Chen ZG, Shi C, Hu J. Optimization of Seed Vigor Near-Infrared Detection by Coupling Mean Impact Value With Successive Projection Algorithm. Spectrosc Spectr Anal. 2022; 42(10): 3135–3142.
36. Zhang N, Zhang X, Wang C, Li L, Bai T. Cotton LAI Estimation Based on Hyperspectral and Successive Projection Algorithm. Trans Chin Soc Agric Mach. 2022; 53(S1): 257–262.
37. Haralick RM, Shanmugam K, Dinstein IH. Textural features for image classification. IEEE Trans Syst Man Cybern. 1973; 6: 610–621.
38. Zhang HK, Roy DP, Yan L, Li Z, Huang H, Vermote E, et al. Characterization of Sentinel-2A and Landsat-8 top of atmosphere, surface, and nadir BRDF adjusted reflectance and NDVI differences. Remote Sens Environ. 2018; 215: 482–494.
39. Fu H, Sun G, Ren J, Zhang A, Jia X. Fusion of PCA and segmented-PCA domain multiscale 2-D-SSA for effective spectral-spatial feature extraction and data classification in hyperspectral imagery. IEEE Trans Geosci Remote Sens. 2020; 60: 1–14.
40. Yang H, Wang Z, Cao J, Wu Q, Zhang B. Estimating soil salinity using Gaofen-2 imagery: A novel application of combined spectral and textural features. Environ Res. 2023; 217:114870. pmid:36435496
41. Sharma A, Singh T, Garg N. Combining near-infrared hyperspectral imaging and ANN for varietal classification of wheat seeds. In Third International Conference on Intelligent Computing Instrumentation and Control Technologies (ICICICT). Piscataway, NJ: IEEE; 2022, pp. 1103–1108.
42. Canaza-Cayo AW, Cozzolino D, Alomar D, Quispe E. A feasibility study of the classification of Alpaca (Lama pacos) wool samples from different ages, sex and color by means of visible and near infrared reflectance spectroscopy. Comput Electron Agric. 2012; 88: 141–147.
43. Ballanti L, Blesius L, Hines E, Kruse B. Tree species classification using hyperspectral imagery: A comparison of two classifiers. Remote Sens. 2016; 8(6): 445.
44. Ge X, Wang J, Ding J, Cao X, Zhang Z, Liu J, et al. Combining UAV-based hyperspectral imagery and machine learning algorithms for soil moisture content monitoring. PeerJ. 2019; 7: e6926. pmid:31110930
45. Pang L, Men S, Yan L, Xiao J. Rapid vitality estimation and prediction of corn seeds based on spectra and images using deep learning and hyperspectral imaging techniques. IEEE Access. 2020; 8: 123026–123036.
46. Fricker GA, Ventura JD, Wolf JA, North MP, Davis FW, Franklin J. A convolutional neural network classifier identifies tree species in mixed-conifer forest from hyperspectral imagery. Remote Sens. 2019; 11(19): 2326.
47. Huang GB, Zhu QY, Siew CK. Extreme learning machine: theory and applications. Neurocomputing. 2006; 70(1–3): 489–501.
48. Lan Y, Soh YC, Huang GB. Ensemble of online sequential extreme learning machine. Neurocomputing. 2009; 72(13–15): 3391–3395.
49. Yuan RR, Wang B, Liu GS, He JG, Wan GL, Fan NY, et al. Study on the Detection and Discrimination of Damaged Jujube Based on Hyperspectral Data. Spectrosc Spect Anal. 2021; 41(9): 2879–2885.
50. Deng XQ, Zhu QB, Huang M. Variety discrimination for single rice seed by integrating spectral, texture and morphological features based on hyperspectral image. Laser Optoelectron Prog. 2015; 52(2): 21001.
51. Zhang L, Sun H, Rao Z, Ji H. Hyperspectral imaging technology combined with deep forest model to identify frost-damaged rice seeds. Spectrochim. Acta A Mol. Biomol Spectrosc. 2020; 229: 117973. pmid:31887678
52. You J. The Detection Method Research on Delinted Cottonseeds’ Vigor Based on Hyperspectral Imaging. M.Sc. Thesis, Shihezi University. 2017.
53. Chen B, Wang KR, Li SK, Wang J, Bai JH, Xiao CH, et al. Spectrum characteristics of cotton canopy infected with verticillium wilt and inversion of severity level. In Computer And Computing Technologies In Agriculture, Volume II: First IFIP TC 12 International Conference on Computer and Computing Technologies in Agriculture (CCTA 2007). New York: Springer; 2008, pp. 1169–1180.
54. Tang Y, Wang R, Huang J. Relations between red edge characteristics and agronomic parameters of crops. Pedosphere. 2004; 14(4): 467–474.

