
Machine fault detection model based on MWOA-BiLSTM algorithm

Abstract

This paper proposes the Modulated Whale Optimization Algorithm (MWOA), an innovative metaheuristic derived from the classic bionics-inspired Whale Optimization Algorithm (WOA). MWOA tackles common optimization pitfalls such as entrapment in local optima and premature convergence by reworking two key mechanisms: the shrinking encircling move and the spiral position update. In essence, it keeps the search from settling on suboptimal solutions too soon, encouraging exploration of a broader solution space before converging. By incorporating Cauchy variation and a perturbation term, MWOA achieves effective optimization over a wide search space. Comparisons were then conducted between MWOA and seven recently proposed metaheuristics on the CEC2005 benchmark functions to assess MWOA’s optimization performance, and the Wilcoxon rank-sum test was used to verify the effectiveness of the proposed algorithm. Finally, MWOA combined with the BiLSTM classifier was juxtaposed with six other metaheuristics combined with the same classifier, confirming that MWOA-BiLSTM outperforms its counterparts on crucial metrics such as accuracy, precision, recall, and F1-score. The study results demonstrate that MWOA possesses strong optimization capability, adeptly striking a harmonious balance between exploration and exploitation.

1. Introduction

In contemporary industrial and commercial settings, maintenance activities transcend mere routine operations; they emerge as pivotal elements in a company’s enduring success. In the realm of mechanical equipment and industrial systems, the management of machine breakdowns directly influences the reliability and cost-effectiveness of production. Notably, in sectors like wind turbine and oil and gas industries, maintenance expenses constitute a substantial portion of the overall costs. For example, reports indicate that the Operation and Maintenance expenses for offshore wind turbines constitute a range of 20% to 35% of the overall revenue generated from electricity production [1], and maintenance expenses within the oil and gas sector may range from 15% to 70% of the overall production costs, reflecting a substantial portion of the total expenses involved in production operations [2]. Furthermore, insufficient maintenance can result in unplanned downtime, adversely affecting not only a company’s core business but also its financial performance directly. In 2013, Amazon experienced a $4 million loss in revenue as a result of just 49 minutes of downtime [3]. This underscores the significance that for major e-commerce entities such as Amazon, even a brief period of downtime can precipitate substantial losses. As per a Ponemon Institute survey, organizations face an average loss of $138,000 for every hour of downtime [4]. This emphasizes that the economic ramifications of downtime are pervasive across diverse industries, exerting a direct and significant influence on a company’s financial position. In conclusion, unplanned downtime of any mechanical equipment or machinery has the potential to undermine or disrupt a company’s core operations, possibly leading to significant penalties and inestimable damage to its reputation [3]. 
Hence, placing emphasis on maintenance and proactive measures to minimize machine downtime is imperative for enhancing the economic efficiency and sustainability of the company. A well-devised maintenance strategy, coupled with the implementation of preventive maintenance, can aid organizations in averting potential losses and enhancing the stability and reliability of their production systems.

In the face of escalating complexity in manufacturing and equipment, coupled with the limitations of traditional manual troubleshooting methods to address all situations comprehensively [5], modern companies are progressively turning to intelligent solutions to optimize their maintenance strategies [6]. Within the realm of maintenance, machine learning stands out as a pivotal technology facilitating more precise failure prediction [7]. Traditional methods often fall short due to their inability to process and analyze large volumes of data efficiently. Machine learning, however, excels in this area by harnessing extensive historical data and real-time monitoring information. This capability allows systems to learn equipment behavior dynamically. Machine learning models can be trained to identify normal operating modes and potential failure modes, allowing for the timely detection of anomalies and proactive maintenance actions. Malburg et al. [8] evaluated three state-of-the-art target detection systems in order to investigate the suitability of machine learning for detecting artifacts and recognizing faulty situations that require adaptation. Gonzalez-Jimenez et al. [9] introduced a troubleshooting approach leveraging machine learning techniques. This method assists auxiliary maintenance teams in detecting faults specifically within the power connections of induction machines. Tai et al. [10] applied the Hidden Markov Model to the detection of machine failures in process control. In the domain of machine fault detection, support vector machines (SVM) and long short-term memory (LSTM) networks enjoy widespread usage and preference. SVM excels in handling supervised binary classification problems and stands out for its proficiency in constructing efficient classifiers [11].
Additionally, its capability to accurately classify various classes of failure modes by efficiently identifying the optimal hyperplane renders it applicable to a diverse range of industrial systems. Ghate and Dudul [12] developed a fault detection system for small and medium-sized induction motors utilizing SVM technology. Lee et al. [13] employed SVM for identifying defects caused by shaft misalignment in rotating machinery shafts, while Senanayaka et al. [14] utilized SVM algorithms to promptly detect and classify bearing failures, given their pivotal role in motor and generator malfunctions. Conversely, long short-term memory networks, a category within recurrent neural networks, excel in capturing the complexities of sequential data [15]. LSTMs excel in comprehending and predicting intricate patterns in time series by capturing long-term dependencies within the data. This capability positions them as superior in the domain of machine fault detection. Li et al. [16] developed a pioneering anomaly detection technique for mechanical equipment using SAE-LSTM, allowing for the unsupervised identification of anomalies. Borré et al. [17] tackled the challenge of predicting motor failures by anticipating potential anomalies in the data using CNN-LSTM. In a related context, Han et al. [18] devised a new long short-term memory-based variational autoencoder for fault detection in offshore components on ships.

Although machine learning models like SVMs and LSTMs excel in learning intricate patterns, there are instances where the system’s state may not be fully observable or the data may be affected by noise. In this context, heuristic algorithms emerge as empirical and intuition-based search and optimization techniques that offer a more intuitive and flexible approach to decision-making based on experience and intuition. Their application in machine fault classification and detection arises from the imperative to effectively address optimization problems within complex industrial systems [19]. Within the maintenance domain, heuristic algorithms prove valuable in optimizing maintenance schedules, enhancing equipment life, and reducing costs [20]. By integrating machine learning with heuristic algorithms, we can leverage the strengths of both approaches. Machine learning models can first process large datasets to identify patterns and make initial predictions about potential faults. These predictions can then be refined using heuristic algorithms, which apply domain-specific knowledge and intuitive problem-solving to handle noisy or incomplete data. For example, heuristic algorithms can optimize maintenance schedules based on machine learning predictions, ensuring that maintenance tasks are prioritized effectively and adjusted in real-time to changing conditions.

Additionally, heuristic algorithms can dynamically adjust the hyperparameters of machine learning models, further enhancing their performance. By fine-tuning hyperparameters such as learning rates, regularization terms, and number of hidden layer nodes, heuristic algorithms ensure that machine learning models operate at their optimal capacity. This combination allows for a more adaptive and robust maintenance strategy, ultimately leading to improved equipment life and reduced costs. Through the application of heuristic algorithms, companies can achieve more accurate predictions of potential equipment failures. This, in turn, empowers the development of intelligent, data-driven maintenance plans aimed at minimizing downtime and repair costs [21]. This approach utilizes a large amount of historical data and real-time monitoring information to predict possible failure scenarios. This makes maintenance more predictable and efficient by enabling timely detection of anomalies and preventive maintenance measures in real-time monitoring, minimizing production disruptions and repair costs [22]. By complementing each other with machine learning and heuristic algorithms, a more comprehensive, flexible and intelligent machine fault detection and maintenance management system can be realized. For instance, Cuong-Le et al. [23] proposed a method for damage identification in structural health monitoring and non-destructive damage detection utilizing PSO-SVM. This approach utilizes the structural response under dynamic excitation. Addressing the challenge of low accuracy in the mechanical diagnosis of high voltage circuit breakers, Yang et al. [24] proposed a WOA-SVM-based fault diagnostic. Furthermore, Samanta et al. [25] explored the effectiveness of gear failure detection using GA-SVM.

This paper proposes the Modulated Whale Optimization Algorithm (MWOA), which builds upon the classical WOA in bionics-inspired optimization. MWOA addresses issues of local optimization and premature convergence by modifying the original linear shrinking encircling mechanism and spiral updating position approach. By integrating Cauchy variation and a perturbation term, MWOA enhances its performance in navigating complex search spaces, introducing greater complexity and diversity to the previously fixed search method. This algorithm offers a promising solution to optimization problems by providing improved search capabilities and overcoming convergence limitations.

The subsequent sections of this article unfold as follows. Section 2 provides a detailed exposition of the process of constructing the mathematical model for BiLSTM. Section 3 delves into an in-depth elucidation of the modulation mechanism in MWOA. Section 4 evaluates the efficacy of MWOA on the CEC2005 benchmark functions, juxtaposing its performance against seven prominent nature-inspired metaheuristic algorithms, and then performs the Wilcoxon rank-sum test. Section 5 tests and compares the performance of the MWOA-BiLSTM model against six other models for machine fault detection. Finally, Section 6 encapsulates the findings and proffers insights into future avenues of research.

2. Related work

In this section, we present the principles of Long Short-Term Memory (LSTM) neural networks and Bidirectional Long Short-Term Memory (BiLSTM) neural networks, as well as the Whale Optimization Algorithm (WOA). By understanding these foundational concepts, we lay the groundwork for improving metaheuristic algorithms and their integration with machine learning models in subsequent discussions.

2.1 LSTM principle of neural network

In the realm of machine learning, the issue of gradient vanishing poses a pervasive challenge [26], particularly pronounced when handling lengthy sequential data. Traditional Recurrent Neural Networks encounter difficulties over multiple iterations as the gradient information may swiftly approach zero (resulting in gradient vanishing) or experience rapid escalation (leading to gradient explosion) due to successive multiplication operations [27]. This phenomenon significantly impedes the network’s ability to effectively learn dependencies in long time-series sequences.

To tackle this challenge, Hochreiter and Schmidhuber proposed the concept of memory units in 1997 as a solution to the issues of inadequate gradient and diminishing error backpropagation in traditional RNN training [28]. The essence of the memory unit lies in preserving a persistent state [29], empowering the network to adeptly retain crucial information when handling extensive sequences [30]. This design attains meticulous control over the memory unit’s state by incorporating three gates to modulate the flow of information. Refer to Fig 1 for a comprehensive elucidation of the intricate details within.

The fundamental concept behind LSTM is to employ memory cells with constant, fixed-weight self-connections as their core recurrence [30]. The crucial aspect lies in the stability introduced by these fixed weights, which prevents the gradient from either vanishing or exploding when errors are back-propagated through the memory cell [31]. This innovative architecture significantly improves LSTM’s capability to capture temporal dependencies within extensive sequential data [32], proficiently addressing the issue of vanishing gradients and establishing LSTM as a potent tool for handling intricate temporal tasks.

The forget gate regulates the extent to which old information will be discarded, as given by Eqs (1) and (2):

ft = σ(Wf·[ht−1, xt] + bf) (1)

Ct = ft ⊙ Ct−1 + it ⊙ C̃t (2)

where σ represents the sigmoid function, ft varies between 0 and 1, and ⊙ denotes element-wise multiplication. Wf denotes the weight of the forget gate and bf its bias, xt stands for the input of the current layer at time t, ht−1 represents the output at the previous time step, and it and C̃t are the input-gate activation and candidate state defined in Eqs (3) and (4).

The input gate is tasked with determining the amount of new information to be incorporated into the memory cell. Eqs (3) and (4) delineate the mathematical formulations in question:

it = σ(Wi·[ht−1, xt] + bi) (3)

C̃t = tanh(Wc·[ht−1, xt] + bc) (4)

where it ranges between 0 and 1, Wi represents the weight of the input gate, bi signifies the input gate bias, Wc denotes the weight of the candidate gate, and bc stands for the bias of the candidate gate.

By employing the sigmoid activation function, the inputs and previous hidden states undergo weighting and summation, producing a gating signal within the (0,1) range. This signal plays a pivotal role in determining the significance of each element. Concurrently, the input and previous hidden states are subjected to the tanh activation function, yielding a new information vector. The input gates then perform element-wise multiplication on the information, thereby defining the new information that will be integrated into the memory cell.

The output gate generates the final hidden state by blending the sigmoid and tanh activation functions. At time t, the input to the output gate consists of the previous output ht−1 and the current input xt, while the output ot and hidden state ht are computed according to Eqs (5) and (6):

ot = σ(Wo·[ht−1, xt] + bo) (5)

ht = ot ⊙ tanh(Ct) (6)

The cell state Ct is passed through the tanh activation function, and the output gate performs element-wise multiplication with the result to derive the final hidden state, which is subsequently transmitted to the next layer of the network.
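As a concrete illustration, Eqs (1)–(6) can be sketched in a few lines of NumPy. This is a minimal forward pass only; the weights here are random stand-ins rather than a trained model, and the gate names follow the notation above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following Eqs (1)-(6).

    W and b hold the weights/biases of the forget (f), input (i),
    candidate (c) and output (o) gates; z is the concatenated input
    [h_{t-1}, x_t].
    """
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate,  Eq (1)
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate,   Eq (3)
    c_hat = np.tanh(W["c"] @ z + b["c"])    # candidate,    Eq (4)
    c_t = f_t * c_prev + i_t * c_hat        # cell update,  Eq (2)
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate,  Eq (5)
    h_t = o_t * np.tanh(c_t)                # hidden state, Eq (6)
    return h_t, c_t

# toy dimensions: 3 inputs, 4 hidden units, random weights for illustration
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
W = {g: rng.standard_normal((n_h, n_h + n_in)) * 0.1 for g in "fico"}
b = {g: np.zeros(n_h) for g in "fico"}
h, c = np.zeros(n_h), np.zeros(n_h)
for x in rng.standard_normal((5, n_in)):    # run a length-5 sequence
    h, c = lstm_step(x, h, c, W, b)
```

Because ot lies in (0, 1) and tanh(Ct) in (−1, 1), every component of the hidden state is bounded by 1 in magnitude, which is one reason the recurrence stays numerically stable.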

2.2 BiLSTM principle of neural network

Deep bidirectional LSTM networks (BiLSTM) [33] present a refinement of traditional LSTMs (illustrated in Fig 2). Unlike the standard training paradigm, which processes a sequence strictly from its first time step to its last [34], BiLSTM is trained bidirectionally, reading the sequence both forwards and backwards in time [35]. In essence, it assimilates training information from both directions, rendering BiLSTM more potent in capturing temporal relationships and comprehensively understanding context. In terms of structure, LSTM exhibits a unidirectional configuration whose parameters are shared across all time steps [36]. In contrast, BiLSTM features independent forward and backward parameter sets, which increases the model’s capacity and complexity.

Another crucial distinction lies in parameter updating and hidden states. LSTM conducts parameter updating through both forward and backward propagation [37], with the gradient propagating from moment t to t−1. In contrast, BiLSTM executes propagation in both temporal directions simultaneously, allowing the gradient to flow through both directions concurrently. This concurrent propagation enhances the efficiency of capturing long-range dependencies [38]. Furthermore, LSTM features a single hidden state, encapsulating the forward information of the sequence. In contrast, BiLSTM incorporates two hidden states, representing the forward (→ht) and backward (←ht) information. The ultimate output of BiLSTM combines these two hidden states into a more comprehensive representation, denoted as ht = [→ht, ←ht].
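The bidirectional wiring described above can be sketched as follows. The tanh cell here is a hypothetical stand-in for a full LSTM cell (used only to keep the example short); what the sketch shows is how the forward pass, the time-reversed backward pass, and the per-step concatenation fit together.

```python
import numpy as np

def run_direction(xs, step, h0, c0):
    """Run a recurrent cell over a sequence, collecting hidden states."""
    h, c, out = h0, c0, []
    for x in xs:
        h, c = step(x, h, c)
        out.append(h)
    return out

def bilstm_outputs(xs, step_fwd, step_bwd, n_h):
    """Concatenate forward and (re-reversed) backward states per step."""
    z = np.zeros(n_h)
    hf = run_direction(xs, step_fwd, z, z)              # reads x_1 .. x_T
    hb = run_direction(xs[::-1], step_bwd, z, z)[::-1]  # reads x_T .. x_1
    return [np.concatenate([f, b]) for f, b in zip(hf, hb)]

# stand-in cell: a tanh RNN used only to illustrate the wiring
def make_cell(seed, n_in, n_h):
    rng = np.random.default_rng(seed)
    Wx = rng.standard_normal((n_h, n_in)) * 0.1
    Wh = rng.standard_normal((n_h, n_h)) * 0.1
    return lambda x, h, c: (np.tanh(Wx @ x + Wh @ h), c)

xs = np.random.default_rng(1).standard_normal((6, 3))
ys = bilstm_outputs(xs, make_cell(2, 3, 4), make_cell(3, 3, 4), 4)
# each per-step output combines both directions: ys[0].shape == (8,)
```

Note that the backward hidden states are reversed again before concatenation, so that ys[t] pairs the forward state after reading x_1..x_t with the backward state after reading x_T..x_t.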

After recognizing the potent sequence modeling capabilities of BiLSTM, we delve into its practical applications. BiLSTM has gained widespread usage in the processing of time-series data [39], owing to its exceptional performance in handling complex sequence relationships. Specifically, within the realm of machine fault detection, Yahyaoui et al. [40] proposed a KPCA-based BiLSTM approach for efficiently detecting and diagnosing faults in power converters within wind turbine systems. Jiahao et al. [41] introduced SVM-BiLSTM, a deep learning-based fault detection method tailored for IoT systems at gas stations. Bharatheedasan et al. [42] introduced an approach based on CNN-BiLSTM for diagnosing rolling bearing faults.

However, it’s notable that most of these applications incorporating BiLSTM are rooted in machine learning methods [43], and there exists a scarcity of instances where BiLSTM is combined with heuristic algorithms for machine fault detection and classification. Zhang et al. [44] proposed a novel method for predicting ship motion attitude, leveraging an adaptive dynamic particle swarm optimization algorithm in combination with BiLSTM. Zhen et al. [45] introduced a GA-based improved Bi-LSTM for microgrid photovoltaic power prediction. Heuristic algorithms, known for their global search capabilities and extensive exploration of parameter space, prove valuable in finding optimal model configurations. This is particularly crucial for parameter-heavy BiLSTM models.

2.3 Whale optimization algorithm

WOA represents a collective intelligence algorithm crafted to tackle continuous optimization issues [46]. Empirical evidence supports the assertion that this algorithm demonstrates superior or comparable performance when benchmarked against several existing algorithmic techniques [47]. The inspiration behind WOA can be traced to the fascinating hunting behavior observed in humpback whales [47]. This choice of modeling, derived from the natural world, underscores the algorithm’s effectiveness in navigating and optimizing complex solution spaces [48]. In the context of WOA, every solution is metaphorically represented as a whale. Within this conceptualization, each whale, or solution, endeavors to explore a novel location within the search space, using the best-performing element in the group as a reference. This mimicry of the natural behavior of whales reflects the algorithm’s strategy of dynamically adjusting and refining individual solutions based on the success of the collective [49]. The utilization of such a bio-inspired approach enhances the algorithm’s adaptability and efficiency in navigating the optimization landscape. The WOA process is shown in Fig 3. The fundamental WOA comprises three primary procedures.

  1. Surrounding and encompassing prey.
  2. Utilizing bubble-net tactics during the exploitation stage.
  3. Hunting for prey during the exploration phase.

2.3.1 Surrounding and encompassing prey.

Humpback whales demonstrate an extraordinary skill in locating and surrounding their prey. Similarly, the WOA operates under the assumption that, like whales discerning the best prey locations, the target solution within the search space is either the optimal candidate or closely located to the optimum. Upon identifying the most promising search agent and understanding its characteristics, the subsequent phase entails other search agents adjusting their positions towards this optimal agent. This adaptive behavior is mathematically expressed through Eqs (7) and (8), as detailed in the literature [47]:

D = |C·X*(t) − X(t)| (7)

X(t+1) = X*(t) − A·D (8)

where t denotes the present iteration, the vectors A and C denote coefficient vectors, X*(t) represents the positional vector of the most optimal solution achieved thus far, X(t) signifies the position vector, and |·| denotes the element-wise absolute value. An important observation is that X* needs to be updated in each iteration whenever a better solution is found. This emphasizes the continuous refinement of X* to incorporate any improvements found during the iterative process. Vectors A and C are computed according to Eqs (9) and (10) respectively.

A = 2a·r − a (9)

C = 2r (10)

As the iterative process progresses, the parameter a is decreased from 2 to 0, driving the transition from the exploration phase to the exploitation phase. Meanwhile, r is a random vector within the interval [0, 1].
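A minimal sketch of the encircling update in Eqs (7)–(10), assuming the standard linear decay of a from 2 to 0 over T iterations; the positions are random placeholders for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
dim, t, T = 5, 10, 100

a = 2 * (1 - t / T)                 # a decays linearly from 2 to 0
r1, r2 = rng.random(dim), rng.random(dim)
A = 2 * a * r1 - a                  # Eq (9): components of A lie in [-a, a]
C = 2 * r2                          # Eq (10): components of C lie in [0, 2]

x_best = rng.standard_normal(dim)   # X*(t): best solution found so far
x = rng.standard_normal(dim)        # X(t): current whale position

D = np.abs(C * x_best - x)          # Eq (7): distance to the best agent
x_new = x_best - A * D              # Eq (8): shrink toward the best agent
```

As a decays, the magnitude of A shrinks, so late-stage updates land ever closer to the best agent, which is exactly the exploration-to-exploitation transition described above.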

2.3.2 Bubble-net attacking strategy.

Two different methodologies are used to develop a mathematical model for the whales. These approaches are described as follows.

  1. Shrinking encircling mechanism:
    To instill this behavior, the parameter a in Eq (9) undergoes a reduction. It’s important to emphasize that the variability range of A is also constrained by a: A denotes a random value within the range [−a, a], while a gradually diminishes from 2 to 0 over the iterations. When A is randomly assigned values from the interval [−1, 1], the recalculated position of a search agent falls between its initial position and the location of the current optimal agent.
  2. Spiral updating position:
    In crafting this behavior, the researchers computed the distance between the current whale position and the prey. Following this distance determination, they devised a spiral equation, illustrated in Eq (11) [47], to replicate the whale’s spiral movement from its existing location to the prey position.
X(t+1) = D′·e^(bl)·cos(2πl) + X*(t) (11)

In this context, D′ = |X*(t) − X(t)| represents the distance between the whale and its prey, which is the best solution obtained up to the current moment, while b signifies a constant determining the logarithmic spiral shape. The variable l represents a randomly generated number within the interval [−1, 1].

When simulating the movements of humpback whales around prey, a notable observation is the simultaneous utilization of a shrinking encircling technique and a spiral path toward the prey. The probability of whales transitioning between these two behaviors is set at 50%, and this transition is mathematically modeled through Eq (12) [47]:

X(t+1) = X*(t) − A·D,  if p < 0.5
X(t+1) = D′·e^(bl)·cos(2πl) + X*(t),  if p ≥ 0.5 (12)

where p is a randomly generated number within the range of [0, 1].

2.3.3 Hunting for prey.

An alternative approach, which hinges on the variation of the A vector, can also be utilized during the exploration phase when hunting for prey. Essentially, humpback whales engage in random searches relative to each other’s positions. In this scenario, an A vector with random values exceeding 1 or falling below −1 directs a search agent to move significantly away from a designated reference whale. Unlike the exploitation phase, where the best-performing agent dictates movement, the exploration phase involves updating a search agent’s position based on a randomly selected peer rather than the top-performing one. This mechanism, along with the condition |A| > 1, accentuates exploration, enabling the WOA to conduct a comprehensive exploratory quest on a global scale. The corresponding formulas are given in Eqs (13) and (14).

D = |C·Xrand − X| (13)

X(t+1) = Xrand − A·D (14)

The variable Xrand denotes a randomly determined position, signifying the location of a whale selected at random from the pool of available whales. This random selection process contributes to the diversity and exploration aspects of the algorithm.
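Putting the three behaviors together, one full WOA position update can be sketched as follows. This is a simplified reading of the standard algorithm: fitness evaluation, boundary handling, and the update of the best solution are omitted, and the |A| < 1 test is applied component-wise.

```python
import numpy as np

def woa_step(pop, x_best, t, T, b=1.0, rng=None):
    """One WOA position update combining encircling (Eqs 7-8),
    the spiral move (Eq 11) and random exploration (Eqs 13-14)."""
    rng = rng or np.random.default_rng()
    a = 2 * (1 - t / T)                       # linear decay of a
    new = np.empty_like(pop)
    for i, x in enumerate(pop):
        A = 2 * a * rng.random(x.shape) - a   # Eq (9)
        C = 2 * rng.random(x.shape)           # Eq (10)
        p = rng.random()                      # behavior switch, Eq (12)
        if p < 0.5:
            if np.all(np.abs(A) < 1):         # exploitation: encircle best
                D = np.abs(C * x_best - x)    # Eq (7)
                new[i] = x_best - A * D       # Eq (8)
            else:                             # exploration: random peer
                x_rand = pop[rng.integers(len(pop))]
                D = np.abs(C * x_rand - x)    # Eq (13)
                new[i] = x_rand - A * D       # Eq (14)
        else:                                 # spiral toward best, Eq (11)
            l = rng.uniform(-1, 1)
            D = np.abs(x_best - x)
            new[i] = D * np.exp(b * l) * np.cos(2 * np.pi * l) + x_best
    return new

pop = np.random.default_rng(0).standard_normal((10, 5))
pop = woa_step(pop, pop[0], t=1, T=100)
```

In a full optimizer this step would be wrapped in a loop that evaluates each whale's fitness and refreshes x_best before the next iteration.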

3. Improved whale optimization algorithm

In this section, we will propose two improvement strategies aimed at solving the problems of local optimality and slow convergence in the whale bubble-net attacking strategy. By introducing these improvements, we will enhance the performance of the algorithm in complex optimization problems and effectively avoid falling into local optimal solutions. Meanwhile, we propose a Modulated Whale Optimization Algorithm (MWOA) that combines these strategies to enhance the global search capability and speed up the convergence rate, thus improving the overall optimization results.

3.1 Insufficiency of the algorithms

The WOA algorithm stands out among optimization techniques by drawing inspiration from the sophisticated hunting strategies employed by humpback whales [50]. It intricately mimics the collaborative hunting behaviors of these marine mammals, offering a unique approach to navigating and optimizing solution spaces [51]. In contrast to other algorithms that may derive inspiration from various sources with distinct search strategies and communication methods, WOA distinctly models the foraging conduct of humpback whales during their pursuit of prey. Much like the GWO, WOA incorporates specific mechanisms for information exchange and knowledge sharing among its search agents [52]. However, WOA is not without its limitations. Primarily, it shows sensitivity to the initial solution selection, where the quality of the chosen starting point significantly influences the algorithm’s ultimate optimization performance. Furthermore, the convergence speed of WOA may exhibit variability, potentially leading to instability and the emergence of locally optimal solutions.

Moreover, when confronted with high-dimensional problems, WOA encounters challenges. The simulation of whale behavior may necessitate adaptation to effectively explore spaces with numerous dimensions. While WOA boasts distinctive characteristics and advantages, it is imperative to recognize and address these limitations in practical applications. As a result, it remains crucial to carefully choose an optimization algorithm that suits the particular problems and needs at hand.

3.2 The Modulated whale optimization algorithm

WOA strategically employs parameters to find a delicate equilibrium between exploration and exploitation. Nevertheless, it encounters formidable challenges, including the persistence of suboptimal solutions and premature convergence, hindering the overall advancement of the algorithm. Within the WOA, a group of whales assumes the pivotal role of guiding the search, with specific whales designated as leader whales presumed to occupy optimal positions for prey consumption. In response to the inherent limitations of the WOA, we introduce an enhanced variant named the MWOA. MWOA incorporates a modulation mechanism into the whale guidance process, aiming to mitigate issues related to premature convergence and the stagnation of suboptimal solutions. In MWOA, the three primary phases undergo a redefinition, and modulation is strategically employed to dynamically adjust their influence throughout the optimization process.

By augmenting the original WOA with modulation strategies, MWOA endeavors to surmount algorithmic challenges and enhance convergence speed. This refinement not only preserves the fundamental essence of WOA but also introduces novel features to bolster the exploration-exploitation balance, effectively addressing identified shortcomings in the original algorithm.

First, parameter a is enhanced by replacing its linear decay with a nonlinear one, aiming to strike a harmonious equilibrium between exploration and exploitation. Parameter a is determined by Eq (15).

a = amax·e^(−λt/T) (15)

where amax is the initial value of a, λ > 0 is a decay constant, and T denotes the maximum number of iterations.

This formula indicates that the parameter a experiences exponential decay as the number of iterations increases. This implies that initially, a decreases rapidly but gradually slows down as iterations progress. The use of exponential decay offers a notable advantage over linear decay, providing greater flexibility in parameter tuning. More precisely, it enables rapid exploration of the search space during initial iterations, followed by a more focused approach to local refinement in later iterations. This attribute is advantageous in optimization algorithms as it allows for a delicate equilibrium between the exploration of broader solutions and the exploitation of more localized ones. The variation of a with t is shown in Fig 4.

Fig 4. The variation of a with t (take T = 100 for example).

https://doi.org/10.1371/journal.pone.0310133.g004

In the context of exponential decay, it becomes essential to set constraints on the range of a to maintain algorithmic stability and reasonability. Imposing limits, denoted as amin for the minimum and amax for the maximum, ensures that a stays within a reasonable range [amin, amax]. These constraints play a crucial role in preventing a from becoming excessively small or large, thereby upholding the stability of the algorithm. The judicious application of range restrictions contributes to a more controlled and effective optimization process.
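The decay-with-clamping behavior described here is easy to sketch; the values of amin, amax, and the decay constant below are assumed for illustration, not taken from the paper.

```python
import numpy as np

T = 100
t = np.arange(T + 1)
a_max, a_min = 2.0, 0.1   # assumed bounds [a_min, a_max] for the clamp
lam = 4.0                 # assumed decay constant

# exponential decay of a, clipped so it never leaves [a_min, a_max]
a = np.clip(a_max * np.exp(-lam * t / T), a_min, a_max)
# a falls quickly at first, then flattens onto the a_min floor
```

Compared with the linear schedule a = 2(1 − t/T), most of the decrease happens in the early iterations, matching the "explore fast, then refine locally" rationale above.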

3.2.1 Adaptive control parameters.

The introduction of the control-type parameters α and β serves the purpose of addressing challenges faced by the original algorithm when confronted with diverse search-space features. Without these parameters, the spiral motion’s shape could become overly rigid during certain iteration phases. This inflexibility in the spiral shape constrains the algorithm’s performance across various problem spaces, particularly in complex, dynamic, or multimodal search spaces. The consequence of this rigidity is a potential bias of the algorithm towards extensive exploration in some stages and excessive focus on local optimization in others. These biases result in a search procedure that lacks the adaptability required to accommodate the changing attributes of the search space across different iteration counts. Integrating the control parameters α and β introduces flexibility, empowering the algorithm to dynamically alter the spiral motion’s configuration as iterations progress.

The adaptability inherent in this feature promotes a refined equilibrium between exploration and exploitation, thereby improving the algorithm’s capacity to navigate intricate and dynamic problem landscapes. Ultimately, the refined algorithm with these control-type parameters exhibits improved flexibility and adaptability, mitigating the limitations observed in the original approach. Following the inclusion of the adaptive control parameters, the model is represented by Eq (16):

X(t+1) = D′·e^(bl)·e^(−βt)·cos(2παl) + X*(t) (16)

where t denotes the current number of iterations, α is an adaptive parameter controlling the helix period, and β is an adaptive parameter controlling the helix shape.

The parameterization of the number of iterations, denoted as t, plays a crucial role in shaping the spiral motion, inducing a gradual reduction during the optimization process. This deliberate adjustment aids the algorithm in conducting an extensive search in the initial stages, progressively honing in on more accurate regions in the later stages.

The incorporation of the α and β parameters facilitates the dynamic adjustment of the spiral motion’s shape with increasing iterations. This adaptability enhances the algorithm’s flexibility, enabling it to respond effectively to diverse search-space features. Introducing the α parameter allows the spiral motion’s frequency to be modulated: a smaller α results in a broader spiral shape, emphasizing global search, while a larger α leads to a more compact spiral, enhancing focus on detailed local search. The inclusion of the e^(−βt) term enables larger spiral amplitudes in the early iterations, promoting extensive exploration; as the iterations progress, the amplitude gradually diminishes, fostering a more refined local search.

This refinement significantly accelerates the algorithm’s convergence, particularly in complex search spaces. The dynamic adjustment of the spiral motion expedites convergence towards the optimal solution across the entire search space, showcasing the algorithm’s effectiveness in navigating intricate optimization landscapes.
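To make the role of the two parameters concrete, the sketch below evaluates an illustrative spiral term of the form e^(−βt)·cos(2παt); the values α = 0.3 and β = 0.01 are assumed demonstration values, not the paper’s tuned settings.

```python
import math

# Illustrative adaptive spiral term: alpha modulates the oscillation
# frequency, while e^(-beta*t) shrinks the amplitude as iterations advance.
# alpha = 0.3 and beta = 0.01 are assumed demonstration values.
def spiral_term(t, alpha=0.3, beta=0.01):
    return math.exp(-beta * t) * math.cos(2 * math.pi * alpha * t)

# The amplitude envelope decays monotonically: broad moves early in the
# run, fine-grained moves late in the run.
envelope_early = math.exp(-0.01 * 10)   # envelope at t = 10
envelope_late = math.exp(-0.01 * 900)   # envelope at t = 900
assert envelope_early > envelope_late
assert abs(spiral_term(900)) <= envelope_late + 1e-15
```

The decaying envelope is what lets the algorithm take broad exploratory steps early and increasingly fine local steps late in a run.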

3.2.2 Cauchy perturbation.

By incorporating Cauchy variation and a perturbation term, the algorithm introduces increased diversity in each iteration [53]. This strategic improvement helps prevent the algorithm from getting stuck in local optima, enhancing its performance in complex search environments. The Cauchy variation and the perturbation term strengthen the algorithm’s exploratory nature, facilitating a broader exploration of the entire search space. The elongated tail of the Cauchy distribution is particularly beneficial, allowing samples to be generated at more distant locations and thereby increasing the algorithm’s likelihood of traversing the entire search space. The Cauchy variant promotes a higher probability of exploring beyond the current localized region, contributing to a more comprehensive coverage of the complete solution space. In summary, the Cauchy variation and the perturbation term impart a vital exploratory element to the algorithm, preventing it from becoming ensnared in local optima and enabling a more thorough exploration of complex search spaces. The Cauchy variational perturbation is shown in Eq (17):

cauchy(X*, X) = (1/π) · γ / (γ² + ‖X* − X‖²)   (17)

where X* stands for the vector representing the currently optimal solution, X refers to the vector indicating the present position, and γ is the scale parameter of the Cauchy distribution.

In instances where the disparity between the currently optimal solution and the present position is substantial, the denominator of the Cauchy distribution becomes larger, yielding a smaller perturbation term and diminishing the impact of random perturbations. This deliberate adjustment mitigates random perturbations during the global search phase, preventing premature convergence to a local optimum. This modification transforms the spiral updating position phase into Eq (18), where p is a random number in the range [0, 1], and the remaining terms are, respectively, a constant vector controlling the Cauchy variation and a random vector obeying the Cauchy distribution. The pseudocode of MWOA is shown in Table 1.
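A minimal stdlib sketch of this damping behavior follows; the Cauchy-density-style weight γ/(π(γ² + ‖X* − X‖²)) and the inverse-CDF Cauchy sampling are assumptions chosen to match the description above, not the paper’s exact Eq (17).

```python
import math
import random

random.seed(1)

def cauchy_perturbation(x_best, x, gamma=1.0):
    # Cauchy-density-style weight: it shrinks as the distance to the
    # best solution grows, damping random moves in the global phase.
    dist2 = sum((b - c) ** 2 for b, c in zip(x_best, x))
    weight = gamma / (math.pi * (gamma ** 2 + dist2))
    # Heavy-tailed standard-Cauchy draws via the inverse-CDF method.
    step = [math.tan(math.pi * (random.random() - 0.5)) for _ in x]
    return [weight * s for s in step]

# The damping weight shrinks as the distance to the best solution grows.
w_near = 1.0 / (math.pi * (1.0 + 0.01))   # ||X* - X||^2 = 0.01
w_far = 1.0 / (math.pi * (1.0 + 100.0))   # ||X* - X||^2 = 100
assert w_far < w_near
```

The heavy tail of the Cauchy draws occasionally produces very large steps, which is exactly what lets the search escape a localized region.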

4. Simulation results and analysis

To assess the performance of the MWOA, a series of experiments was conducted across 23 benchmark functions widely utilized in previous research [54]. These functions are classified into unimodal functions, characterized by a limited number of local minima, and multimodal functions, known for possessing numerous local minima (refer to Tables 2–4). The breakdown of these functions is as follows.

  1. Unimodal Benchmark Functions (F1~F7). These functions have exactly one global optimum, so they are mainly used to test the local search ability and convergence efficiency of optimization algorithms. The goal with a unimodal function is to find the global optimum quickly, making this class suitable for evaluating an algorithm’s convergence speed and optimization accuracy. Such functions are generally simple and inexpensive to compute.
  2. Multimodal Benchmark Functions (F8~F13). The multimodal functions F8~F13 have many local optima, so an optimizer can easily become trapped during the search. This class is mainly used to test an algorithm’s global search ability, that is, whether it can escape local optima effectively and find the global optimum. These functions are deliberately more complex, simulating the intricate search spaces of real problems and challenging an algorithm’s global optimization capability.
  3. Fixed-Dimension Multimodal Benchmark Functions (F14~F23). These functions are multimodal, but their dimensionality is fixed and usually low. They are used to test an algorithm’s optimization ability and stability in a specific dimension. Fixed-dimension multimodal functions typically contain many local optima and complex structures, making them suitable for testing performance in specific scenarios, particularly when practical applications impose dimensional limits.
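As a concrete illustration of the first two categories, the snippet below implements two standard test functions in this spirit: the unimodal sphere function (cf. F1) and the multimodal Rastrigin function (cf. F9). The exact CEC2005 definitions may include shifts and rotations not shown here.

```python
import math

# Unimodal: a single global minimum at the origin (cf. F1).
def sphere(x):
    return sum(v * v for v in x)

# Multimodal: a global minimum at the origin surrounded by a regular
# grid of local minima (cf. F9).
def rastrigin(x):
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

origin = [0.0] * 10
assert sphere(origin) == 0.0            # unique global minimum
assert abs(rastrigin(origin)) < 1e-9    # global minimum among many local ones
assert rastrigin([1.0] * 10) > 1.0      # a nearby local basin scores worse
```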
Table 4. Description of fixed-dimension multimodal benchmark functions.

https://doi.org/10.1371/journal.pone.0310133.t004

This comprehensive set of benchmark functions allows for a thorough evaluation of MWOA across a spectrum of optimization challenges, ranging from unimodal to multimodal landscapes and from variable- to fixed-dimension scenarios. The results obtained from these experiments provide valuable insights into the algorithm’s robustness and effectiveness across diverse problem domains.

4.1 Computational complexity of MWOA

The computational complexity of MWOA is delineated by considering two critical aspects: time complexity and space complexity. These facets play pivotal roles in assessing the overall performance of an algorithm. Evaluating the time complexity provides insights into the algorithm’s efficiency in terms of execution speed, while analyzing space complexity helps gauge its efficiency in memory utilization. Both are crucial metrics in determining the algorithm’s practicality and suitability for various computational tasks.

(1) Time complexity.

MWOA’s performance is intricately affected by essential factors such as the number of particles (N), the number of iterations (t), and the cost of a function evaluation (c). A comprehensive assessment of time complexity necessitates integrating their collective effects to derive an accurate evaluation. Notably, the time complexity of MWOA remains on par with that of the WOA, as indicated by the constancy maintained through Eqs (19) and (20). This observation underscores the stability of MWOA in terms of time complexity, emphasizing its efficiency in handling optimization tasks across various settings and computational scenarios.

O(MWOA) = O(N × t × c)   (19)

O(WOA) = O(N × t × c)   (20)

(2) Space complexity.

In terms of space complexity, the consideration is focused solely on the initial stage—specifically, the entirety of the search space. In this context, the space complexity of MWOA is succinctly expressed as O(n). This notation signifies a linear relationship with the dimensionality of the problem, underscoring the algorithm’s ability to efficiently manage memory resources as it scales with the size of the optimization problem.

4.2 Comparison algorithm selection

To assess the fault detection classification capability of MWOA, it underwent comparison with several established nature-inspired metaheuristic algorithms. Table 5 presents the parameter settings for these comparison algorithms.

Table 5. The initial parameter settings for the corresponding algorithms.

https://doi.org/10.1371/journal.pone.0310133.t005

4.3 Sensitivity analysis of MWOA’s own parameter selection

At the beginning of this section, we tested the effect of varying the MWOA parameter values on performance. Different scenarios were selected based on the values of the MWOA parameters k and α; with three candidate values for each parameter (k taking 4, 5, and 6), we constructed 9 different scenarios (as shown in Table 6). We evaluated the performance of each scenario on the 23 benchmark functions and collected the corresponding statistical results, detailed in Table 7.
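The scenario grid can be sketched as a Cartesian product over the two parameters. The α candidates below are illustrative placeholders (the best-performing setting reported later, α = 0.3, is among them), while k takes the stated values 4, 5, and 6.

```python
from itertools import product

# A 3 x 3 grid of candidate (alpha, k) settings yields 9 scenarios,
# mirroring the sensitivity study in Table 6. The alpha candidates are
# placeholders for illustration.
alphas = [0.1, 0.3, 0.5]
ks = [4, 5, 6]
scenarios = list(product(alphas, ks))

assert len(scenarios) == 9
assert (0.3, 5) in scenarios   # the best-performing scenario reported
```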

Table 7. The influence of the MWOA parameters (i.e., k and α) on CEC2005 functions.

https://doi.org/10.1371/journal.pone.0310133.t007

From these results, it can be seen that the fifth scenario (i.e., α = 0.3, k = 5) achieves the best performance across all the functions tested, followed closely by the seventh and sixth scenarios, which obtained the second and third rankings, respectively. This suggests that combining α = 0.3 with different values of k has a significant effect on the optimization performance of the MWOA algorithm, with the effect most pronounced at k = 5.

Through this study, we are able to gain a deeper understanding of the impact of parameter settings on the performance of MWOA. Optimizing the parameter settings not only helps to improve the performance of MWOA on multiple benchmark functions, but also enhances its robustness and adaptability in practical applications.

4.4 Comparison of the MWOA with traditional algorithms

In the assessment of applicability and interpretability, a comparative analysis was conducted among various optimization algorithms, including traditional and widely used methods such as WOA, GWO [55], MPA [56], PSO [57], ABC [58], AOA [59] and SABO [60]. To ensure fair competition between algorithms, each function was run independently 50 times with a population size of N = 50, and T = 1000 iterations was used as the termination condition. The best fitness value (Best), the worst fitness value (Worst), the average fitness value (Mean) and the standard deviation (Std) of each algorithm over the 50 runs are used as metrics for statistical analysis. The outcomes presented in Table 8 highlight the superior performance of the MWOA, as evidenced by its first average ranking and overall ranking among the considered algorithms.
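The four per-function statistics can be computed as below; the fitness values used here are illustrative placeholders, not results from the paper.

```python
import statistics

# Best / Worst / Mean / Std over independent runs, as reported per
# function in the comparison tables.
def summarize(fitness_values):
    return {
        "Best": min(fitness_values),
        "Worst": max(fitness_values),
        "Mean": statistics.mean(fitness_values),
        "Std": statistics.stdev(fitness_values),
    }

runs = [0.01, 0.02, 0.015, 0.03, 0.012]   # placeholder fitness values
stats = summarize(runs)
assert stats["Best"] == 0.01 and stats["Worst"] == 0.03
```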

Table 8 results underscore the effectiveness of MWOA across a spectrum of optimization challenges, positioning it favorably when compared to other well-established algorithms. This success can be attributed to its unique characteristics and adaptability in navigating diverse solution spaces. Moreover, the convergence curves depicted in Fig 5 offer insights into the dynamic behavior of the MWOA in comparison to WOA, GWO, MPA, PSO, ABC, AOA and SABO algorithms. The curves illustrate that MWOA exhibits rapid convergence and adeptly avoids local stagnation. This characteristic is crucial in ensuring the algorithm’s efficiency in exploring the solution space and reaching optimal solutions in a timely manner.

Fig 5. Convergence curve of MWOA and other traditional algorithms.

https://doi.org/10.1371/journal.pone.0310133.g005

Table 8. Experimental comparison of MWOA with other algorithms.

https://doi.org/10.1371/journal.pone.0310133.t008

In order to further verify the optimization stability of the MWOA algorithm, the boxplots of each algorithm on the CEC2005 benchmark function are recorded in Fig 6. It is obvious from the figure that the results of the MWOA algorithm exhibit smaller upper and lower bound gaps among the 50 independent experiments, which indicates that the algorithm’s results have high consistency and stability across different runs. In contrast, the other algorithms have significantly larger upper and lower bound gaps, indicating that the results of these algorithms fluctuate more across runs. In addition, the MWOA algorithm also outperforms the other algorithms in all experiments in terms of worst-case results, showing its reliability and robustness in dealing with complex optimization problems.

Fig 6. Boxplots of MWOA and 7 comparison algorithms on CEC2005 functions.

https://doi.org/10.1371/journal.pone.0310133.g006

The overall findings suggest that MWOA stands out as a promising optimization algorithm, showcasing competitive performance against established counterparts and demonstrating its potential for practical applications across various domains.

4.5 Statistical tests

Comparisons between algorithms do not fully guarantee the superiority and effectiveness of the algorithms due to the chance nature of the test results. Therefore, this subsection uses various statistical tests to demonstrate the statistical superiority of MWOA. Specifically, we performed statistical tests on the algorithm’s results on the CEC2005 function. In order to verify the significant difference between the MWOA algorithm and other comparative algorithms, the Wilcoxon rank sum test [61] was used for nonparametric testing. At the 5% significance level, if the p-value is less than 0.05, it means that the two algorithms are significantly different in a function, otherwise the difference is not significant.
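The decision rule above can be sketched with a stdlib-only normal-approximation version of the rank sum test (assuming no tied values; a production analysis would typically use `scipy.stats.ranksums`):

```python
import math

def rank_sum_p(a, b):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.
    A minimal stdlib sketch assuming no tied values."""
    n1, n2 = len(a), len(b)
    pooled = sorted(a + b)
    w = sum(pooled.index(v) + 1 for v in a)          # rank sum of sample a
    mu = n1 * (n1 + n2 + 1) / 2                      # mean of W under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # std of W under H0
    z = (w - mu) / sigma
    # two-sided p-value from the standard normal distribution
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Clearly separated samples are significant at the 5% level ...
a = [0.010, 0.020, 0.015, 0.012, 0.018, 0.011, 0.013, 0.017]
b = [0.90, 1.10, 0.95, 1.05, 1.20, 0.98, 1.02, 1.15]
assert rank_sum_p(a, b) < 0.05
# ... while heavily interleaved samples are not.
assert rank_sum_p([1, 3, 5, 7], [2, 4, 6, 8]) > 0.05
```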

First, we used MWOA as a control algorithm for pairwise comparisons with the other algorithms to generate p-values. Table 9 shows the results of the Wilcoxon rank sum test (equivalent to the Mann-Whitney U test) for the MWOA algorithm proposed in this paper at the 5% significance level. In this table, “+” indicates that the algorithm has a significant advantage and “−” indicates that the algorithm is not significantly competitive in statistical terms. Of the 161 comparison tests conducted, 154 showed a significant advantage. The results of the Wilcoxon rank sum test clearly indicate that the MWOA algorithm outperforms the other compared algorithms.

Table 9. Wilcoxon rank sum test results from MWOA and 7 comparison algorithms on CEC2005 functions.

https://doi.org/10.1371/journal.pone.0310133.t009

In summary, despite some randomness and uncertainty in algorithm comparison, the results of Wilcoxon rank sum test can prove that the MWOA algorithm is statistically significantly superior. This further validates the effectiveness and competitiveness of MWOA in solving the CEC2005 function problem.

5. MWOA-BiLSTM diagnostic model

This study utilizes the MWOA to investigate the optimization performance of machine fault detection within industrial contexts. The algorithm iteratively improves the detection model by updating the number of hidden-layer nodes, the learning rate, and the regularization parameter of the BiLSTM. Consequently, this process constructs a robust machine fault detection model, continuously refining these parameters for optimal performance. Through this approach, the study enhances the effectiveness of the BiLSTM model, yielding significant improvements in classification rates and notable reductions in error rates. The MWOA-BiLSTM machine fault detection process is shown in Fig 7.

5.1 Machine predictive maintenance classification dataset

This paper uses the Machine Predictive Maintenance Classification dataset from the UCI Machine Learning Repository. The AI4I 2020 Predictive Maintenance Dataset is a synthetic collection designed to mirror real-world predictive maintenance data commonly found in industrial settings. It comprises 10,000 data points organized as rows, each containing 14 features across various columns. The dataset, comprising the five attributes illustrated in Table 10, is employed to identify the type of machine failure and validate the accuracy of the model. Furthermore, the dataset was partitioned into training and testing subsets, with a distribution ratio of 80% for training and 20% for testing.
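The 80/20 partitioning can be sketched as a shuffled index split; the seed and shuffling procedure below are assumptions, since the paper does not specify its splitting routine.

```python
import random

# An 80/20 train-test split of 10,000 records, mirroring the partitioning
# described for the AI4I 2020 dataset (indices stand in for dataset rows).
random.seed(42)
indices = list(range(10_000))
random.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

assert len(train_idx) == 8_000 and len(test_idx) == 2_000
assert not set(train_idx) & set(test_idx)   # the subsets are disjoint
```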

Table 10. The attribution of the predictive maintenance dataset.

https://doi.org/10.1371/journal.pone.0310133.t010

5.1.1 Dataset visualization.

By labeling the faulty and non-faulty cases in the AI4I 2020 Predictive Maintenance Dataset with Target "1" and "0" respectively, one can gain insight into the distribution of faulty data. This distribution is depicted in Fig 8, offering a visual portrayal for better comprehension.

Among the 10,000 data points, 9,652 indicate a non-failure condition, while 348 denote a failure condition. These failures are categorized into five distinct situations based on various metrics: Power Failure, Tool Wear Failure, Overstrain Failure, Random Failure, and Heat Dissipation Failure. Detailed distributions are illustrated in Fig 9.
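The reported label counts imply a heavily imbalanced problem, which the sketch below makes explicit:

```python
from collections import Counter

# The reported label distribution: 9,652 non-failure ("0") and 348
# failure ("1") records out of 10,000 data points.
labels = [0] * 9_652 + [1] * 348
counts = Counter(labels)
failure_rate = counts[1] / sum(counts.values())

assert counts[0] == 9_652 and counts[1] == 348
assert failure_rate < 0.04   # a heavily imbalanced classification task
```

With under 4% positive cases, accuracy alone would be misleading, which is why precision, recall, and F1-Score are also reported later.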

5.2 MWOA optimization of BiLSTM hyperparameters

In this paper, MWOA is integrated with the BiLSTM classifier to optimize its hyperparameters, including the learning rate, the number of hidden-layer nodes, and the regularization coefficient. The goal is to maximize the classifier’s performance across four metrics: accuracy, precision, recall, and F1-Score. The learning rate, which controls the step size for updating network weights, is tuned within the range [10^−6, 10^−1]. The number of hidden-layer nodes is the number of nodes in the BiLSTM layer, with a value range of [10, 30], while the regularization coefficient, which addresses overfitting, is adjusted within the range [10^−5, 10^−1]. The optimization problem is three-dimensional, as it involves three independent hyperparameters.
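A candidate solution in this search space can be sketched as a 3-D vector; the log-uniform sampling for the two log-scaled ranges is an assumption for illustration, not the paper’s stated encoding.

```python
import random

# One candidate in the 3-D hyperparameter search space: learning rate in
# [1e-6, 1e-1], hidden-layer nodes in [10, 30], and regularization
# coefficient in [1e-5, 1e-1].
random.seed(0)

def random_candidate():
    lr = 10 ** random.uniform(-6, -1)      # learning rate (log scale)
    hidden = random.randint(10, 30)        # BiLSTM hidden-layer nodes
    reg = 10 ** random.uniform(-5, -1)     # regularization coefficient
    return lr, hidden, reg

lr, hidden, reg = random_candidate()
assert 1e-6 <= lr <= 1e-1
assert 10 <= hidden <= 30
assert 1e-5 <= reg <= 1e-1
```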

5.3 Experimental analysis

To validate the accuracy of the proposed MWOA-BiLSTM model in machine fault detection, this study assembled a variety of models for comparison, including WOA-BiLSTM, ABC-BiLSTM, BOA-BiLSTM, PSO-BiLSTM, SSA-BiLSTM, and MPA-BiLSTM. The evaluation of the MWOA-BiLSTM model’s performance is based on criteria such as accuracy, precision of the final classification, recall, and F1-Score. These metrics play a crucial role in assessing the efficiency and dependability of the MWOA-BiLSTM model in identifying machine faults.
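The four criteria follow directly from a binary confusion matrix; the counts below are illustrative placeholders, not results from the paper.

```python
# The four evaluation criteria computed from a binary confusion matrix
# (tp/fp/fn/tn = true/false positives/negatives).
def metrics(tp, fp, fn, tn):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = metrics(tp=60, fp=10, fn=20, tn=1910)
assert abs(prec - 60 / 70) < 1e-12   # precision = tp / (tp + fp)
assert abs(rec - 0.75) < 1e-12       # recall = tp / (tp + fn)
```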

It is evident from Fig 10 that the MWOA-BiLSTM model achieves a higher final classification rate than the other methods. This underscores the excellent exploratory capability of the MWOA and its effectiveness in escaping local optima in machine fault identification tasks. Notably, the MWOA-BiLSTM model demonstrates outstanding accuracy in recognizing machine faults, a critical aspect in industrial settings. The model surpasses its counterparts in key metrics such as accuracy, precision, recall, and F1-Score, showcasing superior performance in tackling the challenges of machine fault identification. This outcome further confirms the robustness and efficiency of the MWOA within the search space. The application of the MWOA-BiLSTM model in the industrial field therefore holds significant importance, helping to predict and maintain equipment faults more efficiently, thereby increasing productivity and reducing costs. To sum up, the incorporation of the MWOA algorithm substantially improves both the classification rate and the overall performance of machine fault recognition models. During training, MWOA’s potent exploratory capability mitigates the risk of falling into local optima through adaptive tuning of the learning rate and regularization parameter, promotes effective updating of the weights and biases of the BiLSTM layer, and thus improves the classification rate.

Fig 10. Comparison of evaluation indicators of different metaheuristic algorithm classifiers.

https://doi.org/10.1371/journal.pone.0310133.g010

6. Conclusion

This study systematically scrutinizes the performance of the newly developed metaheuristic algorithm, MWOA, through a comprehensive comparative analysis with other state-of-the-art meta-heuristics and proto-algorithms, including WOA, GWO, MPA, PSO, ABC, AOA and SABO. The evaluation of MWOA’s optimization capabilities was conducted using the CEC2005 benchmark functions and the AI4I 2020 Predictive Maintenance Dataset. Additionally, we conducted a comprehensive comparative analysis of MWOA-BiLSTM against its competitors. The key findings from this investigation are summarized below.

  1. Across various functions, MWOA consistently outperforms the comparison algorithms.
  2. MWOA exhibits faster convergence than the comparison algorithms across a wide range of functions, particularly evident in functions F1~F5, F7~F13, F15, and F21~F23.
  3. MWOA offers notable advantages in terms of computational cost and complexity while ensuring optimal results.
  4. MWOA was verified to be significantly different from the other comparison algorithms by the Wilcoxon rank sum test.
  5. The performance of MWOA-BiLSTM on the AI4I 2020 Predictive Maintenance Dataset, including metrics such as accuracy, precision, recall, and F1-Score, significantly surpasses that of WOA-BiLSTM, ABC-BiLSTM, BOA-BiLSTM, PSO-BiLSTM, SSA-BiLSTM, and MPA-BiLSTM.

Inspired by MWOA, in the realm of optimization strategy design, researchers can contemplate the integration of different metaheuristics to leverage synergistic advantages. This entails the simultaneous application of multiple metaheuristic algorithms in problem-solving endeavors, harnessing their respective strengths across various stages or contexts. Such an amalgamative approach holds the promise of enhancing optimization performance and fortifying algorithms with heightened resilience and adaptability.

References

  1. Gong X, Qiao W. Current-based mechanical fault detection for direct-drive wind turbines via synchronous sampling and impulse detection[J]. IEEE Transactions on Industrial Electronics, 2014, 62(3): 1693–1702.
  2. Bevilacqua M, Braglia M. The analytic hierarchy process applied to maintenance strategy selection[J]. Reliability Engineering & System Safety, 2000, 70(1): 71–83.
  3. Ran Y, Zhou X, Lin P, et al. A survey of predictive maintenance: Systems, purposes and approaches[J]. arXiv preprint arXiv:1912.07383, 2019.
  4. Wiboonrat M. Human Factors Psychology of Data Center Operations and Maintenance. In Proceedings of the 2020 6th International Conference on Information Management (ICIM), London, UK, 27–29 March 2020; pp. 167–171.
  5. Zhu D, Feng X, Xu X, et al. Robotic grinding of complex components: a step towards efficient and intelligent machining–challenges, solutions, and applications[J]. Robotics and Computer-Integrated Manufacturing, 2020, 65: 101908.
  6. Zhong RY, Xu X, Klotz E, et al. Intelligent manufacturing in the context of industry 4.0: a review[J]. Engineering, 2017, 3(5): 616–630.
  7. Zhao J, Han X, Ouyang M, et al. Specialized deep neural networks for battery health prognostics: Opportunities and challenges[J]. Journal of Energy Chemistry, 2023.
  8. Malburg L, Rieder M P, Seiger R, et al. Object detection for smart factory processes by machine learning[J]. Procedia Computer Science, 2021, 184: 581–588.
  9. Gonzalez-Jimenez D, del-Olmo J, Poza J, et al. Machine learning-based fault detection and diagnosis of faulty power connections of induction machines[J]. Energies, 2021, 14(16): 4886.
  10. Tai AH, Ching WK, Chan LY. Detection of machine failure: Hidden Markov Model approach[J]. Computers & Industrial Engineering, 2009, 57(2): 608–619.
  11. Talukder M A, Islam M M, Uddin M A, et al. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction[J]. Journal of Big Data, 2024, 11(1): 33.
  12. Ghate V N, Dudul S V. Induction machine fault detection using support vector machine based classifier[J]. WSEAS Transactions on Systems, 2009, 8(5): 591–603.
  13. Lee Y E, Kim B K, Bae J H, et al. Misalignment detection of a rotating machine shaft using a support vector machine learning algorithm[J]. International Journal of Precision Engineering and Manufacturing, 2021, 22: 409–416.
  14. Senanayaka J S L, Kandukuri S T, Van Khang H, et al. Early detection and classification of bearing faults using support vector machine algorithm[C]//2017 IEEE Workshop on Electrical Machines Design, Control and Diagnosis (WEMDCD). IEEE, 2017: 250–255.
  15. Babichev S, Liakh I, Kalinina I. Applying a Recurrent Neural Network-Based Deep Learning Model for Gene Expression Data Classification[J]. Applied Sciences, 2023, 13(21): 11823.
  16. Li Z, Li J, Wang Y, et al. A deep learning approach for anomaly detection based on SAE and LSTM in mechanical equipment[J]. The International Journal of Advanced Manufacturing Technology, 2019, 103: 499–510.
  17. Borré A, Seman L O, Camponogara E, et al. Machine Fault Detection Using a Hybrid CNN-LSTM Attention-Based Model[J]. Sensors, 2023, 23(9): 4512.
  18. Han P, Ellefsen A L, Li G, et al. Fault detection with LSTM-based variational autoencoder for maritime components[J]. IEEE Sensors Journal, 2021, 21(19): 21903–21912.
  19. Wang H, Peng T, Nassehi A, et al. A data-driven simulation-optimization framework for generating priority dispatching rules in dynamic job shop scheduling with uncertainties[J]. Journal of Manufacturing Systems, 2023, 70: 288–308.
  20. Morcous G, Lounis Z. Maintenance optimization of infrastructure networks using genetic algorithms[J]. Automation in Construction, 2005, 14(1): 129–142.
  21. Tao F, Qi Q, Liu A, et al. Data-driven smart manufacturing[J]. Journal of Manufacturing Systems, 2018, 48: 157–169.
  22. Pech M, Vrchota J, Bednář J. Predictive maintenance and intelligent sensors in smart factory[J]. Sensors, 2021, 21(4): 1470.
  23. Cuong-Le T, Nghia-Nguyen T, Khatir S, et al. An efficient approach for damage identification based on improved machine learning using PSO-SVM[J]. Engineering with Computers, 2021: 1–16.
  24. Yang L, Zhang K, Chen Z, et al. Fault diagnosis of WOA-SVM high voltage circuit breaker based on PCA Principal Component Analysis[J]. Energy Reports, 2023, 9: 628–634.
  25. Samanta B. Gear fault detection using artificial neural networks and support vector machines with genetic algorithms[J]. Mechanical Systems and Signal Processing, 2004, 18(3): 625–644.
  26. Kaur R, Singh S. A comprehensive review of object detection with deep learning[J]. Digital Signal Processing, 2023, 132: 103812.
  27. Wen Y, Rahman M F, Xu H, et al. Recent advances and trends of predictive maintenance from data-driven machine prognostics perspective[J]. Measurement, 2022, 187: 110276.
  28. Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735–1780.
  29. Cortez B, Carrera B, Kim Y J, et al. An architecture for emergency event prediction using LSTM recurrent neural networks[J]. Expert Systems with Applications, 2018, 97: 315–324.
  30. Gao G, Wang C, Wang J, et al. CNN-Bi-LSTM: A complex environment-oriented cattle behavior classification network based on the fusion of CNN and Bi-LSTM[J]. Sensors, 2023, 23(18): 7714.
  31. Eshraghian J K, Ward M, Neftci E O, et al. Training spiking neural networks using lessons from deep learning[J]. Proceedings of the IEEE, 2023.
  32. Yu Y, Si X, Hu C, et al. A review of recurrent neural networks: LSTM cells and network architectures[J]. Neural Computation, 2019, 31(7): 1235–1270.
  33. Schuster M, Paliwal K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673–2681.
  34. Manfren M, Caputo P, Costa G. Paradigm shift in urban energy systems through distributed generation: Methods and models[J]. Applied Energy, 2011, 88(4): 1032–1048.
  35. Tao Y, Sun H, Cai Y. Predictions of deep excavation responses considering model uncertainty: Integrating BiLSTM neural networks with Bayesian updating[J]. International Journal of Geomechanics, 2022, 22(1): 04021250.
  36. Xia T, Song Y, Zheng Y, et al. An ensemble framework based on convolutional bi-directional LSTM with multiple time windows for remaining useful life estimation[J]. Computers in Industry, 2020, 115: 103182.
  37. Tian H, Ren D, Li K, et al. An adaptive update model based on improved long short term memory for online prediction of vibration signal[J]. Journal of Intelligent Manufacturing, 2021, 32(1): 37–49.
  38. Liu S, Yu H, Liao C, et al. Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting[C]//International Conference on Learning Representations. 2021.
  39. Kumar S, Kumar V. Multi-view Stacked CNN-BiLSTM (MvS CNN-BiLSTM) for urban PM2.5 concentration prediction of India’s polluted cities[J]. Journal of Cleaner Production, 2024: 141259.
  40. Yahyaoui Z, Hajji M, Mansouri M, et al. Effective fault detection and diagnosis for power converters in wind turbine systems using KPCA-based BiLSTM[J]. Energies, 2022, 15(17): 6127.
  41. Jiahao Y, Jiang X, Wang S, et al. SVM-BiLSTM: A fault detection method for the gas station IoT system based on deep learning[J]. IEEE Access, 2020, 8: 203712–203723.
  42. Bharatheedasan K, Maity T, Kumaraswamidhas L A, et al. An intelligent of fault diagnosis and predicting remaining useful life of rolling bearings based on convolutional neural network with bidirectional LSTM[J]. Sādhanā, 2023, 48(3): 131.
  43. Kour H, Gupta M K. An hybrid deep learning approach for depression prediction from user tweets using feature-rich CNN and bi-directional LSTM[J]. Multimedia Tools and Applications, 2022, 81(17): 23649–23685.
  44. Zhang G, Tan F, Wu Y. Ship motion attitude prediction based on an adaptive dynamic particle swarm optimization algorithm and bidirectional LSTM neural network[J]. IEEE Access, 2020, 8: 90087–90098.
  45. Zhen H, Niu D, Wang K, et al. Photovoltaic power forecasting based on GA improved Bi-LSTM in microgrid without meteorological information[J]. Energy, 2021, 231: 120908.
  46. Gharehchopogh F S, Gholizadeh H. A comprehensive survey: Whale Optimization Algorithm and its applications[J]. Swarm and Evolutionary Computation, 2019, 48: 1–24.
  47. Mirjalili S, Lewis A. The whale optimization algorithm[J]. Advances in Engineering Software, 2016, 95: 51–67.
  48. Kalita K, Ramesh J V N, Cepova L, et al. Multi-objective exponential distribution optimizer (MOEDO): a novel math-inspired multi-objective algorithm for global optimization and real-world engineering design problems[J]. Scientific Reports, 2024, 14(1): 1816.
  49. Amiriebrahimabadi M, Mansouri N. A comprehensive survey of feature selection techniques based on whale optimization algorithm[J]. Multimedia Tools and Applications, 2023: 1–72.
  50. Chen X, Cheng L, Liu C, et al. A WOA-based optimization approach for task scheduling in cloud computing systems[J]. IEEE Systems Journal, 2020, 14(3): 3117–3128.
  51. Mohammed H, Rashid T. A novel hybrid GWO with WOA for global numerical optimization and solving pressure vessel design[J]. Neural Computing and Applications, 2020, 32(18): 14701–14718.
  52. Hatta N M, Zain A M, Sallehuddin R, et al. Recent studies on optimisation method of Grey Wolf Optimiser (GWO): a review (2014–2017)[J]. Artificial Intelligence Review, 2019, 52: 2651–2683.
  53. Choi T J, Ahn C W. An improved LSHADE-RSP algorithm with the Cauchy perturbation: iLSHADE-RSP[J]. Knowledge-Based Systems, 2021, 215: 106628.
  54. Suganthan P N, Hansen N, Liang J J, et al. Problem definitions and evaluation criteria for the CEC 2005 special session on real-parameter optimization[J]. KanGAL Report, 2005, 2005005(2005): 2005.
  55. Mirjalili S, Mirjalili S M, Lewis A. Grey wolf optimizer[J]. Advances in Engineering Software, 2014, 69: 46–61.
  56. Faramarzi A, Heidarinejad M, Mirjalili S, et al. Marine Predators Algorithm: A nature-inspired metaheuristic[J]. Expert Systems with Applications, 2020, 152: 113377.
  57. Kennedy J, Eberhart R. Particle swarm optimization[C]//Proceedings of ICNN’95-International Conference on Neural Networks. IEEE, 1995, 4: 1942–1948.
  58. Karaboga D. Artificial bee colony algorithm[J]. Scholarpedia, 2010, 5(3): 6915.
  59. Abualigah L, Diabat A, Mirjalili S, et al. The arithmetic optimization algorithm[J]. Computer Methods in Applied Mechanics and Engineering, 2021, 376: 113609.
  60. Trojovský P, Dehghani M. Subtraction-average-based optimizer: A new swarm-inspired metaheuristic algorithm for solving optimization problems[J]. Biomimetics, 2023, 8(2): 149.
  61. Wilcoxon F. Individual comparisons by ranking methods. In: Breakthroughs in Statistics. Springer, New York, NY, 1992: 196–202.