Abstract
The icing failures of wind turbine blades are critical factors affecting both power-generation efficiency and safety. To improve diagnostic accuracy and speed, an icing fault diagnosis model is proposed in which an improved weighted kernel extreme learning machine (IWKELM) is optimized by a multi-strategy adaptive coati optimization algorithm (MACOA), i.e., MACOA-IWKELM. Firstly, to improve optimization performance, the MACOA is proposed by introducing chaotic-mapping Lévy flights, a nonlinear inertial step factor, an improved coati vigilante mechanism, and an improved objective function. Secondly, the weighted kernel extreme learning machine (WKELM) is improved with weighting parameters that account for the influence of the internal distribution of samples on the diagnostic model. Finally, the MACOA is applied to the IWKELM and combined with the random forest (RF) dimensionality reduction technique to form the icing diagnostic model. The method is evaluated against other models on two sets of real SCADA data of wind turbine blade icing. In the two sets of experiments, the accuracy reaches 92.22% and 96.94% respectively, and the standard deviation of the accuracy over 50 experiments is 2.53% and 1.92% respectively.

Keywords: Multi-strategy adaptive coati optimization algorithm; Improved weighted extreme learning machine; Wind turbine blade icing fault detection; Fault detection.
Citation: Wu X, Ding Y, Zhao R, Ding D, Zhang H, Wang L (2025) Fault diagnosis model based on multi-strategy adaptive COA and improved weighted kernel ELM: A case study on wind turbine blade icing. PLoS One 20(8): e0329332. https://doi.org/10.1371/journal.pone.0329332
Editor: Aykut Fatih Güven, Yalova University, TÜRKIYE
Received: April 23, 2025; Accepted: July 15, 2025; Published: August 28, 2025
Copyright: © 2025 Wu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All raw data used to replicate the results of this study are available at the following link: https://ww2.mathworks.cn/matlabcentral/fileexchange/180667-macoa_iwkelm.
Funding: D.D. was awarded by the Chenguang Program of Shanghai Education Development Foundation and Shanghai Municipal Education Commission with Grant No. 23CGA76. D.D. play a key role in investigation and visualization of the manuscript. The URL of the funder website is https://edu.sh.gov.cn/. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Wind energy is commonly used in various applications, including power generation, heating, and water pumping [1]. However, during the process of wind power generation, the turbine blades are susceptible to icing due to low-temperature conditions. Consequently, it is essential to study icing fault diagnosis. Currently, there are two primary categories of methods for diagnosing icing faults in wind turbine blades: mechanistic models and data-driven approaches [2].
Mechanistic models are based on physical and engineering principles to investigate the operational mechanisms and failure modes of wind turbines. However, these models tend to be complex, incur high computational costs, and pose challenges in terms of maintenance and updates. On the other hand, data-driven methods involve constructing intelligent models based on extensive datasets to detect and analyze the operational conditions of wind turbine blades, thereby assessing their operational status. This approach requires less specialized knowledge and has proven effective in actual predictive scenarios [3].
Common data-driven fault diagnosis methods are based on classifiers such as BP, ELM, KNN, SVM, and DT, among others [4]. While these methods have a well-established theoretical foundation and are cost-effective and widely applicable, they often depend on expert knowledge and face challenges in real-time monitoring, as well as the risk of misdiagnosis and omissions [5]. The Extreme Learning Machine (ELM) [6], proposed by Huang, is frequently employed in fault diagnosis due to its remarkable characteristics, including strong learning capability, effective testing performance, rapid training speed, and robust generalization ability. However, ELM exhibits limited generalization in nonlinear systems and is particularly sensitive to noise. To address these nonlinear issues, the Kernel-Based Extreme Learning Machine (KELM) was introduced [7]. Additionally, to tackle the problem of imbalanced data, Weighted Kernel-Based Extreme Learning Machine (WKELM) was proposed [8]. However, WKELM only applies weights to the two types of samples as a whole, overlooking the distribution within the samples, indicating that there is still room for improvement.
Since optimization algorithms can screen initial solutions for traditional models and improve their optimization search process, using them to optimize fault diagnosis methods is both feasible and beneficial to diagnostic quality. Yan Y et al. [9] proposed an On-Load Tap-Changer fault diagnosis method based on a Weighted Extreme Learning Machine optimized by an Improved Grey Wolf Algorithm. Guo X Y et al. [10] used an ELM model optimized by the Genetic Algorithm. In literature [11], the Grey Wolf Optimization-Ant Lion Optimizer-Extreme Learning Machine model was proposed. In literature [12], a Kernel Extreme Learning Machine optimized by Grey Wolf Optimization was presented. The Coati Optimization Algorithm (COA), a heuristic algorithm that simulates the natural behaviour of long-nosed coatis [13], has strong optimization ability, which makes it competitive among similar algorithms. Jia et al. [14] introduced a sound-based search encirclement strategy as well as a physical exertion strategy to improve the COA but failed to take into account the optimization of the generation of the initial population. Zhang et al. [15] improved the COA and applied it to real engineering problems, such as the three-bar truss design problem, but only a simple nonlinear strategy was used. Barak [16] proposed combining the COA with the grey wolf optimization algorithm for active suspension linear quadratic regulator controller tuning. Baş et al. [17] proposed the Enhanced Coati Optimization Algorithm (ECOA), which improved the COA by balancing exploitation and exploration capacity but did not consider eliminating the imbalance by optimizing the exploitation phase.
With the development of ELM, more and more models based on extreme learning machines have appeared. Tong R et al. [18] proposed a new ellipsoid nearest neighbour graph computation strategy and fused it with ELM to form the ESS-ELM model. A short-term load forecasting model for distributed energy systems was introduced based on a KELM optimized by the Fireworks Algorithm, combined with Kernel Principal Component Analysis [19]. Vijaya et al. [20] proposed a prediction model combining Variational Mode Decomposition with a Multi Kernel Extreme Learning Machine Auto Encoder. Shang S et al. [21] optimized the ELM with an Improved Zebra Optimization Algorithm (IZOA). Pustokhina IV et al. [22] used a WELM model optimized by a multi-objective rainfall optimization algorithm. Wang C L et al. [23] proposed a sound quality prediction model based on an ELM optimized by fuzzy adaptive Particle Swarm Optimization.
To address the issue of imbalanced wind turbine blade icing data, weighted parameters that vary according to the internal distribution of samples are introduced into the traditional Weighted Kernel Extreme Learning Machine (WKELM) model. This leads to the proposal of the Improved Weighted Kernel Extreme Learning Machine (IWKELM). In addition, to improve the performance of parameter optimization, this paper proposes a multi-strategy adaptive coati optimization algorithm (MACOA). The proposed MACOA uses a chaotic mapping mechanism to enhance the diversity and quality of the initial population. MACOA introduces a nonlinear inertial step size factor during the global optimization process to improve optimization efficiency. During the local optimization process, MACOA incorporates an improved sparrow vigilante mechanism to prevent the algorithm from falling into local optima. Additionally, an improved objective function is introduced during algorithm iteration to provide solutions for escaping local optima. Finally, MACOA is employed to optimize the parameters of the IWKELM model, resulting in the development of the MACOA-IWKELM icing diagnostic model. This model is compared with the BP, ELM, and KELM models, and experiments are conducted using the CEC2017 dataset, 12 publicly available datasets, and two sets of real turbine operation SCADA datasets to validate the effectiveness of the proposed method.
Fundamental theories
Xm×n is an input data matrix consisting of n samples with m features; xij denotes the jth feature value of the ith sample. The output matrix is defined as Ym×n.
Weighted kernel extreme learning machine
According to the literature [6], the ELM is modelled as shown in Eq. (1) and Eq. (2):
where the hidden-layer output for a sample xi is h(xi), I is the identity matrix, H expresses the output matrix of the hidden-layer neurons, and C indicates the regularization parameter.
T = [t1, t2,...,tN]T expresses the desired output of training sets. L represents the number of hidden layer neurons, and the internal parameters of the hidden neurons (wi and bi) are randomly generated.
The kernel function K(xi, xj) is employed to solve the nonlinear mapping problem, shown in Eq. (3) [7]:
where the kernel matrix is Ω = HHᵀ, Ωij expresses the element located in the ith row and jth column, i.e., Ωij = h(xi)·h(xj) = K(xi, xj), and K(xi, xj) is the Gaussian kernel function shown in Eq. (4).
When samples are trained using the traditional Kernel-Based Extreme Learning Machine (KELM), each sample is assigned equal importance. This approach significantly impacts the classification performance, particularly when there is interference from noise and outliers, or when the distribution of sample classes is highly imbalanced. To solve the problem, the WKELM model [8] is produced as shown in Eq. (5):
where W is the weighted matrix, the formula is shown in Eq. (6). W+(i) = δ1 and W-(i) = δ2 denote the weights of the positive and negative class samples, respectively.
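Concretely, the WKELM solution amounts to solving a weighted, regularized kernel system. The block below is a minimal sketch (in Python/NumPy rather than the authors' MATLAB implementation), assuming the standard weighted-KELM output formula α = (I/C + WΩ)⁻¹WT with a Gaussian kernel; all function and variable names are illustrative.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), computed pairwise
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def wkelm_train(X, t, w, C=10.0, gamma=1.0):
    # Solve (I/C + W*Omega) alpha = W*t for the output weights alpha
    n = X.shape[0]
    Omega = gaussian_kernel(X, X, gamma)      # kernel matrix Omega = H H^T
    W = np.diag(w)                            # per-sample weight matrix
    return np.linalg.solve(np.eye(n) / C + W @ Omega, W @ t)

def wkelm_predict(X_test, X_train, alpha, gamma=1.0):
    # decision values for new samples: K(X_test, X_train) * alpha
    return gaussian_kernel(X_test, X_train, gamma) @ alpha
```

The sample weights `w` are supplied by the chosen weighting scheme (e.g., Eq. (6) for WKELM, or the improved per-sample weights of the IWKELM later in the paper).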
Coati optimization algorithm
The COA is a population intelligence optimization algorithm based on the behaviour of long-nosed coatis in nature [13]. In the COA, each individual coati is a candidate solution. They have two natural behaviours in the hunting period: (1) Hunting for iguana, (2) Escaping from predators. It can be interpreted in the algorithm as two phases: exploration and exploitation.
Hunting for iguana (exploration).
During the exploration phase, the coatis initiate a hunt and attack the iguana: some coatis climb a tree to get close to the iguana, while the others wait beneath the tree to hunt the iguana once it falls to the ground. This strategy enables individual coatis to relocate to various positions within the search space, which demonstrates the global search capability of the COA within the problem space, i.e., exploration.
During the exploration phase, the position of the best individual in the population is taken as the position of the iguana. Half of the coatis ascend the tree, while the other half remain on the ground, waiting for the iguana to fall. The position of a coati in the tree is updated as shown in Eq. (7).
where Xi is the position of the ith individual, t denotes the current iteration number, and r denotes a random number in [0,1]. RI denotes a random integer from {1,2}. N denotes the population size, and M expresses the dimension.
After the iguana’s falling, it is placed randomly. Then, the coatis, which stay on the ground, move through the space, searching for the iguana. The position is updated by Eq. (8) and Eq. (9) below:
where lbj and ubj express the lower and upper limits of the jth dimensional variable, fitness(·) is the formula for calculating fitness, and the new position of the iguana after falling is generated randomly within these bounds. xi,j is the value of the jth dimensional variable of the ith individual at the current iteration.
If the new position improves the fitness value, it is accepted as the new position; otherwise, the coati remains in its previous position, i.e., a greedy selection is performed as shown in Eq. (10).
Escaping from predators (exploitation).
During the exploitation phase, the updating of the coati’s location is modeled after the natural behavior of a coati escaping from a predator. This action allows the coati to move closer to a safer position nearby, reflecting the local search capability of the COA, which is indicative of exploitation.
During the exploitation phase, random positions are generated near every coati’s location, as shown in Eq. (11) and Eq. (12):
where T represents the maximum iteration count and t denotes the current iteration number. The local upper and lower bounds of the jth dimensional variable are updated (narrowed) with each iteration. r denotes a random number in the range of [0,1].
Finally, one more greedy choice is made, i.e., Eq. (10).
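The two phases above can be condensed into a short sketch of the base COA loop on a box-constrained minimization problem. This is an illustrative reading of Eqs. (7)-(12), not the authors' implementation; the shrinking local bounds in the exploitation phase and the random integer RI ∈ {1,2} follow the original COA description [13], and all names are assumptions.

```python
import numpy as np

def coa_minimize(f, lb, ub, n=20, dim=5, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n, dim))            # coati positions
    fit = np.array([f(x) for x in X])
    for t in range(1, iters + 1):
        iguana = X[np.argmin(fit)]               # best individual = iguana
        for i in range(n):
            if i < n // 2:                       # coatis in the tree, Eq. (7)
                cand = X[i] + rng.random(dim) * (iguana - rng.integers(1, 3) * X[i])
            else:                                # coatis on the ground, Eqs. (8)-(9)
                ig_fall = rng.uniform(lb, ub, dim)   # iguana dropped at random
                if f(ig_fall) < fit[i]:
                    cand = X[i] + rng.random(dim) * (ig_fall - rng.integers(1, 3) * X[i])
                else:
                    cand = X[i] + rng.random(dim) * (X[i] - ig_fall)
            cand = np.clip(cand, lb, ub)
            fc = f(cand)
            if fc < fit[i]:                      # greedy selection, Eq. (10)
                X[i], fit[i] = cand, fc
        # exploitation: local bounds shrink with the iteration, Eqs. (11)-(12)
        lbl, ubl = lb / t, ub / t
        for i in range(n):
            step = (1 - 2 * rng.random(dim)) * (lbl + rng.random(dim) * (ubl - lbl))
            cand = np.clip(X[i] + step, lb, ub)
            fc = f(cand)
            if fc < fit[i]:                      # greedy selection again
                X[i], fit[i] = cand, fc
    return X[np.argmin(fit)], fit.min()
```

On a simple sphere function the loop converges toward the origin, which is enough to see the exploration/exploitation split in action.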
Multi-strategy adaptive COA and improved weighted kernel ELM
Multi-strategy adaptive coati optimization algorithm
Chaotic mapping and Lévy flight.
The chaotic mapping mechanism is characterized by high uncertainty and sensitivity. It can produce complex and unpredictable dynamic behaviors, allowing for a broader exploration of the search space [24,25]. The Lévy flight is a specialized random-walk model that describes movement patterns with long-tailed step distributions [26]. Lévy flights are incorporated into the initialization process of the MACOA, as illustrated in Eqs. (13), (14) and (15):
where X(t) denotes the position of the ith coati, ⊕ expresses element-wise multiplication, α is the weight controlling the step size, and β is the shape parameter of the step distribution, which is set to 1.5 in this paper.
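Lévy-distributed steps of this kind are commonly drawn with Mantegna's algorithm; the sketch below is an assumption (the paper's Eqs. (14)-(15) are not reproduced in this excerpt) and generates steps with shape parameter β = 1.5.

```python
import math
import numpy as np

def levy_steps(size, beta=1.5, rng=None):
    # Mantegna's algorithm: step = u / |v|^(1/beta),
    # with u ~ N(0, sigma_u^2) and v ~ N(0, 1)
    rng = rng or np.random.default_rng()
    sigma_u = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
               / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)
```

Initial coati positions would then be perturbed as X ⊕ α·levy_steps(X.shape), in the spirit of Eq. (13).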
Nonlinear inertia step size factor.
The introduction of a nonlinear inertia step size factor can significantly improve search efficiency and convergence performance, allowing the COA to dynamically adjust the search behavior. This mechanism maintains a high level of exploration capability during the initial stages, while the gradual reduction of weights in later stages encourages a more focused local search. Considering that updating a coati’s position is influenced by its current position, a nonlinear inertia step size factor is introduced. This factor adjusts the relationship between the coati’s position update and the current position information based on the individual coati’s location. The factor is then calculated using Eq. (16):
where Cn is a constant greater than 1 that controls the degree of nonlinearity; it is set to 2 in this paper.
Initially, the value of ω is small, which means that position updates are less influenced by the current position. This allows for a broader search range for the algorithm and enhances its global exploration capability. As the search process progresses, the value of ω increases over time, resulting in a greater influence from the current coati position. This adjustment helps the algorithm in finding the optimal solution and also improves its convergence speed and local exploration ability.
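Eq. (16) itself is not reproduced in this excerpt; one simple form consistent with the behaviour described, small at the start and growing nonlinearly toward 1 with Cn > 1 controlling the curvature, is ω(t) = (t/T)^Cn. The sketch below uses this assumed form only for illustration.

```python
def inertia(t, T, Cn=2.0):
    # Assumed nonlinear inertia step factor: small early in the search
    # (broad exploration), approaching 1 late (focused local search).
    # Cn > 1 controls the degree of nonlinearity (the paper uses Cn = 2).
    return (t / T) ** Cn
```
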
The improved formula for modelling coati positions in the first stage is shown in Eq. (17):
Improved sparrow vigilante mechanism.
The Sparrow Search Algorithm is inspired by the behavior of sparrows while foraging for food, where some individuals act as vigilantes, responsible for monitoring their surroundings and sounding an alarm when a potential threat is detected. This approach enables the COA to maintain a higher degree of flexibility and dynamism in exploring the solution space, thereby enhancing its ability to adapt to uncertain problems [27].
Introducing the sparrow vigilante mechanism during the exploitation phase enhances the vigilance capability of the COA to search within an optimal range. Coatis at the edge of the population will quickly move away to find a safe area when they sense danger. Meanwhile, the coatis located in the center will move randomly to get closer to others in the population. The formula for the Sparrow Vigilante Mechanism is presented in Eq. (18):
where Xbest represents the global optimal position in the current iteration, β represents the step-control parameter with β ~ N(0,1), K is a random number with values in [−1,1], fi is the fitness value of the current individual, fg is the global best fitness value, fw is the worst one, and ε is a very small constant that avoids division by zero.
In order to escape from predation, coatis in the middle stay close to each other.
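As a reference point, the standard Sparrow Search Algorithm vigilante rule [27], on which Eq. (18) is based, can be sketched as follows; the paper's exact formulation may differ in details, and the names here are illustrative.

```python
import numpy as np

def vigilante_update(x, f_x, x_best, f_best, x_worst, f_worst, rng, eps=1e-12):
    # Edge individuals (worse than the global best) jump toward a safe
    # region around the best position; central individuals move randomly
    # relative to the worst position (sparrow vigilante rule).
    if f_x > f_best:
        beta = rng.normal(0.0, 1.0)          # step-control parameter, beta ~ N(0,1)
        return x_best + beta * np.abs(x - x_best)
    K = rng.uniform(-1.0, 1.0)               # random number in [-1, 1]
    return x + K * np.abs(x - x_worst) / (f_x - f_worst + eps)
```
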
Eq. (18) can be further optimized to address the problem of limited global search capability. A dynamically adjusted step factor [28] is introduced, as shown in Eq. (19):
where β(t) and K(t) are dynamically adjusted step factors, as shown in Eq. (20) and Eq. (21) respectively, and rand ∈ [0,1].
The introduction of dynamic step factors β(t) and K(t) allows the algorithm to adjust its search behavior dynamically. In the initial stages of the algorithm, the focus is on exploration, while the later phases emphasize exploitation. These optimizations enhance the adaptability and robustness of the COA, particularly in complex and high-dimensional problems, enabling it to find the global optimal solution more efficiently.
Improved objective function.
Traditional objective functions often exhibit sensitivity to initial values, a tendency to converge on local optimal solutions, and a lack of robustness. Therefore, an improved objective function is proposed. In general, the dataset is divided into three subsets: the training set, the validation set, and the test set. Alternatively, it can be divided into two subsets: the training set and the test set. When the dataset is split into a training set and a test set, the objective function used to optimize the model parameters is either the number of classification errors (ERROR) or the root mean square error (RMSE) of the test results. ERROR and RMSE are calculated as shown in Eqs. (22) and (23).
When ERROR is used as the objective function, the particle can be viewed as approaching a decreasing extreme value during the reduction of the ERROR. However, there may be instances where, after reaching a certain extreme value, the particle fails to find a more optimal direction, leading to convergence at a local extreme value.
When RMSE is used as the objective function, it is possible for the RMSE value to decrease while the ERROR value increases. Although the overall direction of optimization is correct, the iteration may reduce the RMSE for the overall samples, resulting in most test samples being classified correctly. However, some samples may be misclassified in the next iteration, causing their classification results to change from correct to incorrect.
Therefore, an improved objective function is proposed, i.e., Eq. (24):
where ERMSE is the value of the root mean square for the error sample.
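Eq. (24) is not reproduced in this excerpt; the sketch below is one plausible composite consistent with the description, in which the integer part counts misclassifications (ERROR) and a bounded fractional part derived from ERMSE breaks ties between solutions with equal error counts. The exact combination is an assumption.

```python
import numpy as np

def improved_objective(y_true, y_pred):
    # ERROR: number of misclassified samples (labels via the sign of the output)
    labels = np.sign(y_pred)
    wrong = labels != y_true
    error = int(wrong.sum())
    # ERMSE: root mean square error over the misclassified samples only
    ermse = float(np.sqrt(np.mean((y_pred[wrong] - y_true[wrong]) ** 2))) if error else 0.0
    # Assumed composite: the integer part ranks candidates by error count,
    # the fractional part (always < 1) distinguishes equal error counts,
    # giving the optimizer a descent direction when ERROR alone stagnates.
    return error + ermse / (1.0 + ermse)
```
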
Improved weighted kernel extreme learning machine
In the traditional Weighted Kernel Extreme Learning Machine (WKELM) model, the weighted parameter only influences the overall weight of each class of positive and negative samples. This approach results in the algorithm treating the two classes of samples as a whole during the optimization process, without considering the internal distribution of the samples. As a result, the information provided by the internal distribution is overlooked, which may negatively impact the model’s classification performance. To address this issue, the Improved Weighted Kernel Extreme Learning Machine (IWKELM) model is proposed. This model not only takes into account the overall weight distribution of the two types of samples but also focuses on the weights within each class, which vary according to their distribution, thereby enhancing the weighting of both types of samples.
For all positive sample weights, the formula was modified to Eq. (25):
For all negative sample weights, the formula was modified to Eq. (26):
where d+(i) and d-(i) denote the Euclidean distance of the positive and negative samples to the centre of the respective two samples, and the formulae for the calculation of the respective centres of the two samples are given in Eq. (27) and Eq. (28).
For example, in the weighted formula for positive class samples, the term (d+(i)/max(d+)*δ1+1) is introduced into the product, in addition to the weighting factor δ3 that affects all positive class samples. The term d+(i)/max(d+) normalizes the distances between the class centre and all positive class samples, and (d+(i)/max(d+)*δ1+1) maps the normalized distances into the range [1+δ1, 1]. After multiplication by δ3, the weights of the positive class samples therefore fall in the range [(1+δ1)*δ3, δ3].
Clearly, δ3 represents the upper limit of the weights for the positive class samples, while (1+δ1)*δ3 serves as the lower limit. δ3 is proportional to the total weights of the positive class samples and inversely proportional to the total weights of the negative class samples. Consequently, the closer a sample is to the center of the positive class, the closer its weight is to δ3. Conversely, as the distance increases, the weights of the edge-positive class samples approach (1+δ1)*δ3.
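The positive-class weighting described above (Eqs. (25) and (27)) can be computed directly; the negative-class weights of Eqs. (26) and (28) are analogous. A minimal sketch, with illustrative names:

```python
import numpy as np

def iwkelm_pos_weights(X_pos, delta1=-0.5, delta3=1.0):
    # distance of each positive sample to the positive-class centre (Eq. (27))
    centre = X_pos.mean(axis=0)
    d = np.linalg.norm(X_pos - centre, axis=1)
    # normalized distance mapped into [1 + delta1, 1], then scaled by delta3:
    # centre samples get weight ~delta3, edge samples ~(1 + delta1) * delta3
    return (d / d.max() * delta1 + 1.0) * delta3
```

With δ1 ∈ (−1, 0], samples near the class centre keep weights close to δ3 while edge samples are down-weighted toward (1+δ1)·δ3, exactly the behaviour described above.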
For positive class samples, the relationship between the size of the sample weights and the distances from the samples to the sample centres is shown in Fig 2.
In Fig 2, after fixing δ3, it is evident that the closer δ1 is to 0, the closer the weights of all positive samples are to δ3, meaning the internal distribution of the positive samples has little effect. Conversely, as δ1 approaches −1, the weights of samples close to the centre of the positive class remain near δ3, while those of samples far from the centre tend toward 0, meaning the internal distribution of the positive samples exerts a strong influence. Similarly, for negative samples, δ2 controls the degree of influence exerted by the distribution of positions within the negative samples, and the overall weights of all negative samples are adjusted by controlling δ3.
To test the performance of IWKELM in handling the internal distribution of samples, marginal samples were taken from the KEEL dataset based on Z-score for experimentation. The specific experimental results are shown in Table 2.
The experimental results show that the diagnostic performance of the IWKELM model far exceeds that of traditional models. Furthermore, the diagnostic accuracy of COA-IWKELM is 0.15% and 0.69% higher than that of COA-WKELM in the two marginal data sets, respectively. The diagnostic accuracy of MACOA-IWKELM is 0.08% and 0.85% higher than that of MACOA-WKELM, respectively. The results show that IWKELM has a significant advantage in handling the internal distribution of samples.
The specific structure of the modelling of the IWKELM is shown in Fig 3, and the flowchart is shown in Fig 4.
Experiments for the multi-strategy adaptive coati optimization algorithm
This section presents simulation studies and evaluations of the optimization efficiency of the Multi-strategy Adaptive Coati Optimization Algorithm (MACOA). Given that the individual coatis in the proposed MACOA possess strong optimization capabilities, there is no need to set a large population for the algorithm. However, certain requirements exist regarding the number of iterations. Therefore, the experimental conditions, including the population size and the maximum number of iterations, are outlined in Table 3.
Benchmark functions and compared algorithms
Twenty-nine standard benchmark functions from the IEEE CEC-2017 [29] have been utilized to evaluate MACOA's capability in addressing various objective functions. A comparison of MACOA's performance with eleven well-known algorithms is performed in order to assess its quality in providing optimal solutions, namely COA [13], SABO [30], WSO [31], SCSO [32], GJO [33], TSA [34], WOA [35], GWO [36], TLBO [37], GSA [38] and PSO [39]. The results are displayed using four metrics: mean, standard deviation (std), rank, and execution time (ET). The values of the control parameters for all competing algorithms are detailed in Table 4.
Experimental results and analysis
CEC-2017 includes thirty standard benchmark functions of various types, as shown in Table 5.
The test function F2 from CEC-2017 is not used in this paper because of its unstable performance (as also noted by other authors [15]). Complete information and details on these test functions can be found in the literature [29].
The proposed Multi-strategy Adaptive Coati Optimization Algorithm (MACOA) and the baseline algorithms were run independently on the 29 CEC-2017 benchmark functions, with 200,000 function evaluations (FEs) per run. The experiments used three dimensionalities of test functions: 30, 50, and 100. The ranking results for the experiments are presented in Tables 6–8. The results for the 30-dimensional case (m = 30) indicate that the MACOA is the best algorithm for solving the F4, F10, F11, F22, F24–F26, F28, and F29 functions.
The results for the 50-dimensional case (m = 50) clearly indicate that MACOA is the best optimization algorithm for solving the F1, F4, F10, F11, F16, F18, F22–F26, and F29 functions. Similarly, the results for the 100-dimensional case (m = 100) demonstrate that MACOA excels in solving the F1, F4, F10, F12, F14, F16, F17, F22–F26, F29, and F30 functions. A comparison of the experimental results shows that MACOA outperforms the competing algorithms for most of the tested functions. Overall, MACOA consistently delivers the best performance across different dimensions (30, 50, and 100) of the CEC-2017 test functions.
Compared with the other 11 algorithms, the proposed MACOA has stronger exploration, exploitation, and search capability, and shows superior overall performance.
Wind turbine blade icing fault diagnosis model based on MACOA-IWKELM
To enhance the diagnostic accuracy of the IWKELM, a wind turbine blade icing diagnosis model based on MACOA-IWKELM is proposed. The specific modelling process is as follows:
- (1) All wind turbine blade SCADA data are cleaned and grouped: over-power samples are removed, some attributes are averaged, and all data are then normalized using the min-max standardization method.
- (2) All data are processed with the Random Forest algorithm for dimensionality reduction, to prevent excessive dimensionality from degrading training results.
- (3) The MACOA-IWKELM model is used for wind turbine blade icing fault diagnosis among the dataset obtained after the dimensionality reduction process, and a compared classification model is set up for experimentation.
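Steps (1) and (2) can be sketched as below. This is an illustrative pipeline: the use of scikit-learn's RandomForestClassifier, the number of retained features, and all names are assumptions (the paper's models are implemented in MATLAB).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def minmax_normalize(X):
    # min-max standardization to [0, 1], per feature (constant features left at 0)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def rf_select(X, y, k=8, seed=0):
    # rank features by random-forest importance and keep the top k
    rf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    top = np.argsort(rf.feature_importances_)[::-1][:k]
    return np.sort(top)
```

The retained feature indices are then used to slice the normalized data before training the MACOA-IWKELM classifier.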
The framework of MACOA-IWKELM is shown in Fig 5.
Model diagnostic experiments
Introduction to the datasets and models
All experiments are performed in a test environment with an AMD R7 CPU at 3.20 GHz, 16 GB RAM, and Windows 11 64-bit. The PCA method is performed using SPSSPRO software. The BP neural network, Support Vector Machine (SVM), and Decision Tree (DT) models are trained using the MATLAB toolkit. The k-nearest neighbour (KNN), ELM, and their derived models are programmed in MATLAB 2018a.
Twelve datasets are used in the experiments: datasets 1–4 from UCI and datasets 5–12 from KEEL. All datasets are normalized. The experimental datasets are summarized in Table 9, which lists the sample name, source, number of features, total number of samples, and numbers of positive and negative class samples.
A total of 12 models are used in the comparison experiments: BP, ELM, KELM, KNN, SVM, DT, COA-KELM, MACOA-KELM, COA-WKELM, MACOA-WKELM, COA-IWKELM, and MACOA-IWKELM. Here, COA-KELM denotes the KELM optimized by the COA, MACOA-WKELM denotes the WKELM optimized by the MACOA, and so on.
100 training samples and 100 test samples are randomly selected in the data set for each experiment, with half of the samples in each of the positive and negative categories. In each experiment, all models use this randomly selected data at the same time. A total of 50 experiments are conducted, and the experimental results are averaged.
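The balanced per-experiment sampling (100 training and 100 test samples, 50 of each class, redrawn for every run) can be sketched as follows; names are illustrative.

```python
import numpy as np

def balanced_split(X, y, per_class=50, rng=None):
    # draw per_class training and per_class test indices from each class,
    # without overlap between the two sets
    rng = rng or np.random.default_rng()
    tr, te = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.where(y == c)[0])
        tr.extend(idx[:per_class])
        te.extend(idx[per_class:2 * per_class])
    return np.array(tr), np.array(te)
```

All 12 models see the same draw within a run, and the run is repeated 50 times with fresh draws before averaging.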
In this experiment, the datasets are considerably less complex than the CEC-2017 test functions used in the MACOA experiments, so the maximum number of iterations does not need to be set very high. The experimental hardware conditions are shown in Table 3. The population size and maximum number of iterations are set to 20 and 200, respectively. The fixed model parameters and particle optimization ranges are shown in Table 10.
Results of diagnostic experiments on the dataset
In order to confirm that MACOA and IWKELM can improve classification performance when optimizing the model parameters, datasets 1–12 are selected for the experiment. The experimental results are presented in Tables 11 and 12, and the distribution over the 50 experiments is shown in the box plots in Fig 6.
In this experiment, since there are more models and more combinations, a side-by-side comparison is needed, so some of the models are combined to facilitate the comparison, and the groups set are as follows:
- Group 1: BP, ELM, KELM, KNN, SVM, DT, COA-KELM
- Group 2: COA-KELM, MACOA-KELM
- Group 3: COA-WKELM, MACOA-WKELM
- Group 4: COA-IWKELM, MACOA-IWKELM
- Group 5: COA-KELM, COA-WKELM, COA-IWKELM
- Group 6: MACOA-KELM, MACOA-WKELM, MACOA-IWKELM
The following conclusions can be drawn from the classification accuracies in Table 11, the accuracy standard deviations in Table 12, and the box plots in Fig 6.
Group 1 is selected for comparison, and the results indicate that the BP, ELM, KELM, KNN, SVM, and DT models do not achieve a high classification accuracy. The highest average accuracy reaches only 81.86%, with the lowest average standard deviation at just 3.49%. In contrast, the average accuracy of the COA-KELM model is 87.54%, significantly higher than the KELM model’s accuracy of 79.96%. This discrepancy arises because the traditional model lacks optimization of its parameters, which hinders improvements in classification performance and reduces stability. Additionally, as shown in the box plot in Fig 6, the traditional model exhibits more outliers and larger classification errors.
The comparisons in groups 2, 3, and 4 reveal that the average accuracy of MACOA-KELM reaches 87.71%, which is 0.17% higher than the average accuracy of COA-KELM. Additionally, the average standard deviation is only 2.68%, which is 0.14% lower than that of the COA-KELM model. Furthermore, the average accuracy of MACOA-WKELM is 88.53%, representing a 0.35% improvement over the average accuracy of COA-WKELM, with an average standard deviation of just 2.57%. The MACOA-IWKELM model achieves an average accuracy of 88.88%, which is 0.48% higher than the average accuracy of COA-IWKELM, and an average standard deviation of only 2.32%, which is 0.24% lower than that of the COA-IWKELM model.
Overall, the MACOA demonstrates a higher correct classification rate and a smaller standard deviation compared to the COA, effectively improving stability. This improvement is attributed to the initial population generated by the Lévy flight, which is more conducive to optimization, and the optimization speed is significantly enhanced by the nonlinear factor. Additionally, the proposed coati vigilance mechanism ensures that the algorithm can escape local optima and avoid missing the global optimum. Furthermore, the optimized objective function enhances the optimization logic and provides a solution when the original iteration fails to yield a better value. The box plot also illustrates that MACOA exhibits significant superiority and stability.
From the comparative models in groups 5 and 6, COA-WKELM achieves an average accuracy of 88.18%, which is 0.64% higher than the average accuracy of COA-KELM. The average standard deviation is only 2.53%, which is 0.29% lower than that of COA-KELM. COA-IWKELM achieves an average accuracy of 88.40%, which is 0.22% higher than the average accuracy of COA-WKELM. The average accuracy of MACOA-WKELM reaches 88.53%, representing an increase of 0.82% over the accuracy of MACOA-KELM, while the average standard deviation of MACOA-WKELM is only 2.57%, which is 0.11% lower than that of MACOA-KELM. Furthermore, the average accuracy of MACOA-IWKELM reaches 88.88%, which is 0.35% higher than that of MACOA-WKELM, and the average standard deviation of MACOA-IWKELM is only 2.32%, which is 0.25% lower than that of MACOA-WKELM.
Therefore, the weight parameters introduced into IWKELM can further enhance classification accuracy. The box plot also demonstrates that IWKELM markedly increases the stability of repeated predictions, with very few outliers. However, for some models the average standard deviation of WKELM is nearly equal to that of IWKELM. This similarity can be attributed to the limitations of certain datasets and to the instability introduced by the chaotic mapping mechanism; it could be mitigated by using more datasets, increasing the number of iterations, and conducting more extensive experiments. Overall, MACOA-IWKELM exhibits superior optimization speed and convergence compared with the other models.
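The weighted kernel ELM at the heart of these comparisons can be sketched as follows. This is a minimal illustration assuming an RBF kernel and the common 1/class-count weighting of Zong et al.; the paper's IWKELM replaces this with an improved weighting that also accounts for the internal distribution of the samples:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class WKELM:
    """Weighted kernel ELM: a diagonal weight matrix W up-weights
    minority-class samples (here simply 1/class_count per sample)."""
    def __init__(self, C=1.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X = X
        classes, counts = np.unique(y, return_counts=True)
        cnt = dict(zip(classes, counts))
        w = np.array([1.0 / cnt[c] for c in y])
        self.classes = classes
        # one-vs-rest targets in {-1, +1}
        T = np.where(y[:, None] == classes[None, :], 1.0, -1.0)
        K = rbf_kernel(X, X, self.gamma)
        W = np.diag(w)
        n = len(y)
        # closed-form output weights: (I/C + W K) alpha = W T
        self.alpha = np.linalg.solve(np.eye(n) / self.C + W @ K, W @ T)
        return self

    def predict(self, Xq):
        scores = rbf_kernel(Xq, self.X, self.gamma) @ self.alpha
        return self.classes[scores.argmax(1)]
```

Because the weights scale each sample's contribution to the closed-form solution, the minority (icing) class is no longer drowned out by the majority class, which is why the weighted variants outperform plain KELM on imbalanced data.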
Wind turbine blade icing diagnostic experiment.
The experimental data presented in this paper is sourced from the Industrial Big Data Innovation Competition. The dataset records operational data from November 1, 2015, to January 1, 2016, for two turbines, identified as Turbine 15 and Turbine 21, each containing 20 features.
Before conducting the experiments, the wind turbine operation data were processed to remove duplicates, average samples sharing the same timestamp, and eliminate samples with power outputs greater than 2 kW. This resulted in 39,465 normal samples and 2,841 icing samples for Turbine 15, and 17,602 normal samples and 1,274 icing samples for Turbine 21. Subsequently, the per-blade pitch angle, pitch speed, and pitch motor temperature readings were each averaged across the three blades, yielding a total of 20 features. The dataset information is summarized in Table 13, and the corresponding attribute numbers for the wind turbine blade operation data are detailed in Table 14.
Random forest dimensionality reduction.
Random Forest (RF) Dimensionality Reduction is a feature selection and dimensionality reduction technique based on the Random Forest algorithm [40]. In terms of dimensionality reduction, Random Forest effectively identifies and selects the features that have the greatest impact on the target variable, thereby reducing the dimensionality of the data.
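A minimal sketch of this selection step is shown below, using scikit-learn's `RandomForestClassifier` on synthetic data in place of the SCADA features (the data, feature count, and hyperparameters here are illustrative, not the paper's):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 20))                  # 20 SCADA-like features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # only features 0 and 3 matter

# fit the forest and rank features by impurity-based importance
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranked = np.argsort(rf.feature_importances_)[::-1]

top8 = ranked[:8]          # keep the 8 highest-scoring features, as in the paper
X_reduced = X[:, top8]
```

The reduced matrix `X_reduced` then serves as the input feature vector for the downstream diagnostic models, discarding the low-importance attributes.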
The SCADA data of the wind turbine blades are processed by RF dimensionality reduction. The attribute scores of the operational SCADA data for Turbines 15 and 21 obtained with the RF method are shown in Figs 7 and 8, and the corresponding feature importance heat maps are shown in Figs 9 and 10.
Based on the results presented in Figs 7–10, the importance of the top 8 attributes for Turbines 15 and 21 is significantly greater than that of the other attributes. In particular, the importance of the eighth-ranked feature, Generator RPM, is three times that of the ninth-ranked feature. Therefore, experiments were conducted on datasets with 8 or fewer extracted features.
Accordingly, based on the experimental results in Tables 15 and 16, this paper selects the top 8 features with the highest scores as the input feature vectors for each experimental model, while the remaining attributes are disregarded. The top 8 highest-scoring features are wind speed, yaw position, average pitch motor temperature, ambient temperature, output power, cabin temperature, average pitch angle, and generator RPM.
Diagnostic results and comparative analysis of MACOA-IWKELM.
The SCADA data from the two turbines were reduced in dimensionality and then balanced using the SMOTE oversampling technique. Before oversampling, Turbine 15 comprised 39,465 normal samples and 2,841 icing samples, and Turbine 21 comprised 17,602 normal samples and 1,274 icing samples.
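The core of SMOTE can be sketched in a few lines: each synthetic minority sample is an interpolation between a real minority sample and one of its k nearest minority-class neighbours. This is a simplified NumPy illustration (the paper presumably used a standard SMOTE implementation such as imbalanced-learn's):

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating
    between each chosen sample and a random one of its k nearest
    minority-class neighbours."""
    rng = rng or np.random.default_rng(0)
    n = len(X_min)
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]           # k nearest neighbours per sample
    idx = rng.integers(0, n, n_new)             # base sample for each synthetic point
    nbr = nn[idx, rng.integers(0, k, n_new)]    # random neighbour of that base
    lam = rng.random((n_new, 1))                # interpolation coefficient in [0, 1)
    return X_min[idx] + lam * (X_min[nbr] - X_min[idx])
```

Because every synthetic point lies on a segment between two real icing samples, the oversampled class stays inside the region the minority data already occupies rather than introducing arbitrary noise.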
The processed data are then fed into the classification models for experimentation. The comparison models include BP, ELM, KELM, SVM, KNN, DT, COA-KELM, MACOA-KELM, COA-WKELM, MACOA-WKELM, COA-IWKELM, and MACOA-IWKELM, totaling 12 models.
The fixed parameters for the experimental models and the optimization algorithm's search range are consistent with those in Section 6.1. The experimental hardware conditions are shown in Table 3. The population size and maximum number of iterations are set to 20 and 200, respectively. The fixed model parameters and particle optimization ranges are shown in Table 10.
The diagnostic accuracy of the experiments for wind turbines No. 15 and No. 21 is shown in Tables 17–19. The distribution of the 50 experiments is shown in the box plots in Figs 11 and 12, and the confusion matrices generated by the diagnostic experiments for the two turbines are shown in Figs 13 and 14.
According to the evaluation indicators in Table 18, in the Turbine No. 15 experiment, the indicators of the COA-KELM model exceed those of all traditional models. Meanwhile, the F1 score of MACOA-WKELM is 1.32% higher than that of MACOA-KELM, and the F1 score of MACOA-IWKELM is 0.36% higher than that of MACOA-WKELM. Likewise, in the Turbine No. 21 experiment, all indicators of COA-KELM are superior to those of the traditional models: the F1 score of COA-WKELM is 0.50% higher than that of COA-KELM, and the F1 score of COA-IWKELM is 0.07% higher than that of COA-WKELM. This demonstrates the effectiveness of the IWKELM improvements. Furthermore, across the two experiments, the F1 score of MACOA-IWKELM is 0.11% and 0.90% higher, respectively, than that of COA-IWKELM, demonstrating the superiority of MACOA over COA.
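The F1 scores compared above combine precision and recall on the icing (positive) class, which matters for imbalanced data where plain accuracy can be misleading. A minimal implementation of this metric:

```python
import numpy as np

def binary_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (icing) class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))  # true positives
    fp = np.sum((y_pred == positive) & (y_true != positive))  # false alarms
    fn = np.sum((y_pred != positive) & (y_true == positive))  # missed icing events
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, with true labels `[1, 1, 1, 0, 0]` and predictions `[1, 1, 0, 0, 1]` there are 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all equal 2/3.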
From the results presented in Tables 17 and 19 and the box plots of the 50 experiments shown in Figs 11 and 12, the prediction accuracy of MACOA-KELM for Turbines No. 15 and No. 21 reaches 90.20% and 95.48%, respectively, both significantly higher than those of traditional models such as BP and ELM. Moreover, the standard deviations of the 50 predictions for Turbines No. 15 and No. 21 are only 2.86% and 2.33%, respectively, much smaller than those of the traditional models. The accuracy of MACOA-IWKELM is 0.12% and 0.88% higher than that of COA-IWKELM for Turbines 15 and 21, respectively. Additionally, the standard deviations of the 50 MACOA-IWKELM predictions are only 2.53% and 1.92%, lower than those of the 50 COA-IWKELM experiments on Turbines 15 and 21 by 0.28% and 0.31%, respectively. It can therefore be concluded that MACOA significantly improves prediction accuracy through the chaotic mapping mechanism, the nonlinear inertia weight factors, the improved vigilante mechanism adapted from the sparrow search algorithm, and the enhanced objective function. Regardless of whether the optimized model is KELM, WKELM, or IWKELM, both the correct classification rate and the stability of the experimental results are significantly improved compared with the original COA.
The experimental results indicate that in the Turbine No. 15 experiment, the prediction accuracy of MACOA-IWKELM is 0.38% higher than that of MACOA-WKELM, while the standard deviation is 0.28% lower. In the Turbine No. 21 experiment, the prediction accuracy of MACOA-IWKELM is 0.88% higher than that of MACOA-WKELM, with a standard deviation 0.31% lower. Therefore, IWKELM can significantly enhance prediction accuracy when handling data with more features, owing to the inclusion of a weight parameter that varies with the individual samples. In conclusion, both MACOA and IWKELM improve the accuracy and stability of wind turbine blade icing fault diagnosis.
Conclusion and future prospects
To improve diagnostic accuracy, a wind turbine blade icing fault diagnosis model based on MACOA-IWKELM is proposed. Firstly, weight parameters are introduced into the method and adjusted according to the internal distribution of the samples, yielding the IWKELM model. Additionally, to enhance the convergence performance and stability of the Coati Optimization Algorithm (COA), chaotic mapping Lévy flight is employed to optimize the initial population, and nonlinear inertia weight factors are added to improve convergence speed. An improved vigilante mechanism adapted from the sparrow search algorithm is utilized to enhance stability, and the performance of the COA is further improved by incorporating an enhanced objective function into the iteration process.
The effectiveness of MACOA is validated through comparative experiments, which demonstrate that the multi-strategy adaptive Coati Optimization Algorithm outperforms the other 11 comparison algorithms. MACOA is used to optimize IWKELM, resulting in the proposed MACOA-IWKELM model. Experiments conducted with 12 publicly available datasets from UCI and KEEL indicate that the model significantly enhances classification accuracy and stability. Finally, the MACOA-IWKELM model is applied to diagnose faults in two sets of real turbine operation data. Based on the experimental results, the improved model shows a significant increase in fault diagnosis accuracy and stability.
However, the proposed model does have some limitations, primarily related to the parameter settings for population size and maximum number of iterations, which are based on empirical values. In the future, further optimization of the model will be necessary to achieve even better diagnostic results.
References
- 1. Adams EM, Gulka J, Williams KA. A review of the effectiveness of operational curtailment for reducing bat fatalities at terrestrial wind farms in North America. PLoS One. 2021;16(11):e0256382. pmid:34788295
- 2. Jun L, Chenliang Z. Fast fault diagnosis of smart grid equipment based on deep neural network model based on knowledge graph. PLoS One. 2025;20(2):e0315143. pmid:39951439
- 3. Gao Z, Cecati C, Ding SX. A Survey of Fault Diagnosis and Fault-Tolerant Techniques—Part I: Fault Diagnosis With Model-Based and Signal-Based Approaches. IEEE Trans Ind Electron. 2015;62(6):3757–67.
- 4. Wu C, Zeng Z. A fault diagnosis method based on Auxiliary Classifier Generative Adversarial Network for rolling bearing. PLoS One. 2021;16(3):e0246905. pmid:33647055
- 5. Hameed Z, Hong YS, Cho YM, Ahn SH, Song CK. Condition monitoring and fault detection of wind turbines and related algorithms: A review. Renewable and Sustainable Energy Reviews. 2009;13(1):1–39.
- 6. Huang G-B, Zhu Q-Y, Siew C-K. Extreme learning machine: a new learning scheme of feedforward neural networks. In: 2004 IEEE international joint conference on neural networks, 2004.
- 7. Huang G-B, Ding X, Zhou H. Optimization method based extreme learning machine for classification. Neurocomputing. 2010;74(1–3):155–63.
- 8. Zong W, Huang G-B, Chen Y. Weighted extreme learning machine for imbalance learning. Neurocomputing. 2013;101:229–42.
- 9. Yan Y, Qian Y, Ma H, Hu C. Research on imbalanced data fault diagnosis of on-load tap changers based on IGWO-WELM. Math Biosci Eng. 2023;20(3):4877–95. pmid:36896527
- 10. Guo X, Yang J, Shen Y, Zhang X. Prediction of agricultural carbon emissions in China based on a GA-ELM model. Front Energy Res. 2023;11.
- 11. Zhou Z, Ji H, Yang X. Illumination correction of dyed fabric based on extreme learning machine with improved ant lion optimizer. Color Research & Application. 2022;47(4):1065–77.
- 12. Li C, Zhou J, Dias D, Gui Y. A Kernel Extreme Learning Machine-Grey Wolf Optimizer (KELM-GWO) Model to Predict Uniaxial Compressive Strength of Rock. Applied Sciences. 2022;12(17):8468.
- 13. Dehghani M, Montazeri Z, Trojovská E, Trojovský P. Coati Optimization Algorithm: A new bio-inspired metaheuristic algorithm for solving optimization problems. Knowledge-Based Systems. 2023;259:110011.
- 14. Jia H, Shi S, Wu D, Rao H, Zhang J, Abualigah L. Improve coati optimization algorithm for solving constrained engineering optimization problems. Journal of Computational Design and Engineering. 2023;10(6):2223–50.
- 15. Qi Z, Yingjie D, Shan Y, Xu L, Dongcheng H, Guoqi X. An improved Coati Optimization Algorithm with multiple strategies for engineering design optimization problems. Sci Rep. 2024;14(1):20435. pmid:39227613
- 16. Başak H. Hybrid coati–grey wolf optimization with application to tuning linear quadratic regulator controller of active suspension systems. Engineering Science and Technology, an International Journal. 2024;56:101765.
- 17. Baş E, Yildizdan G. Enhanced Coati Optimization Algorithm for Big Data Optimization Problem. Neural Process Lett. 2023;55(8):10131–99.
- 18. Tong R, Li P, Gao L, Lang X, Miao A, Shen X. A Novel Ellipsoidal Semisupervised Extreme Learning Machine Algorithm and Its Application in Wind Turbine Blade Icing Fault Detection. IEEE Trans Instrum Meas. 2022;71:1–16.
- 19. Fan Y, Wang H, Zhao X, Yang Q, Liang Y. Short-Term Load Forecasting of Distributed Energy System Based on Kernel Principal Component Analysis and KELM Optimized by Fireworks Algorithm. Applied Sciences. 2021;11(24):12014.
- 20. Krishna Rayi V, Mishra SP, Naik J, Dash PK. Adaptive VMD based optimized deep learning mixed kernel ELM autoencoder for single and multistep wind power forecasting. Energy. 2022;244:122585.
- 21. Shang S, Zhu J, Liu Q, Shi Y, Qiao T. Low-altitude small target detection in sea clutter background based on improved CEEMDAN-IZOA-ELM. Heliyon. 2024;10(4):e26500. pmid:38420380
- 22. Pustokhina IV. Multi-objective rain optimization algorithm with WELM model for customer churn prediction in telecommunication sector. Complex & Intelligent Systems. 2021;1–13.
- 23. Wang C, Yang G, Li J, Huang Q. Fuzzy Adaptive PSO-ELM Algorithm Applied to Vehicle Sound Quality Prediction. Applied Sciences. 2023;13(17):9561.
- 24. Oueslati R, Manita G, Chhabra A, Korbaa O. Chaos Game Optimization: A comprehensive study of its variants, applications, and future directions. Computer Science Review. 2024;53:100647.
- 25. Akgul A, Karaca Y, Pala MA, Çimen ME, Boz AF, Yildiz MZ. Chaos theory, advanced metaheuristic algorithms and their newfangled deep learning architecture optimization applications: a review. Fractals. 2024;32(03).
- 26. Chawla M, Duhan M. Levy Flights in Metaheuristics Optimization Algorithms – A Review. Applied Artificial Intelligence. 2018;32(9–10):802–21.
- 27. Xue J, Shen B. A novel swarm intelligence optimization approach: sparrow search algorithm. Systems Science & Control Engineering. 2020;8(1):22–34.
- 28. Das S, Suganthan PN. Differential Evolution: A Survey of the State-of-the-Art. IEEE Trans Evol Computat. 2010;15(1):4–31.
- 29. Wu G, Mallipeddi R, Suganthan PN. Problem definitions and evaluation criteria for the CEC 2017 competition on constrained real-parameter optimization. Changsha, Hunan, PR China: National University of Defense Technology; 2017.
- 30. Trojovský P, Dehghani M. Subtraction-Average-Based Optimizer: A New Swarm-Inspired Metaheuristic Algorithm for Solving Optimization Problems. Biomimetics (Basel). 2023;8(2):149. pmid:37092401
- 31. Braik M, Hammouri A, Atwan J, Al-Betar MA, Awadallah MA. White Shark Optimizer: A novel bio-inspired meta-heuristic algorithm for global optimization problems. Knowledge-Based Systems. 2022;243:108457.
- 32. Seyyedabbasi A, Kiani F. Sand Cat swarm optimization: a nature-inspired algorithm to solve global optimization problems. Engineering with Computers. 2023;39(4):2627–51.
- 33. Chopra N, Mohsin Ansari M. Golden jackal optimization: A novel nature-inspired optimizer for engineering applications. Expert Systems with Applications. 2022;198:116924.
- 34. Kaur S, Awasthi LK, Sangal AL, Dhiman G. Tunicate Swarm Algorithm: A new bio-inspired based metaheuristic paradigm for global optimization. Engineering Applications of Artificial Intelligence. 2020;90:103541.
- 35. Mirjalili S, Lewis A. The Whale Optimization Algorithm. Advances in Engineering Software. 2016;95:51–67.
- 36. Mirjalili S, Mirjalili SM, Lewis A. Grey Wolf Optimizer. Advances in Engineering Software. 2014;69:46–61.
- 37. Rao RV, Savsani VJ, Vakharia DP. Teaching–Learning-Based Optimization: An optimization method for continuous non-linear large scale problems. Information Sciences. 2012;183(1):1–15.
- 38. Rashedi E, Nezamabadi-pour H, Saryazdi S. GSA: A Gravitational Search Algorithm. Information Sciences. 2009;179(13):2232–48.
- 39. Kennedy J, Eberhart R. Particle swarm optimization. In: Proceedings of ICNN'95 - International Conference on Neural Networks. 1995. p. 1942–8.
- 40. Breiman L. Random forests. Machine learning. 2001;45(1):5–32.