
A feature selection method based on the Golden Jackal-Grey Wolf Hybrid Optimization Algorithm

  • Guangwei Liu,

    Roles Conceptualization, Funding acquisition, Project administration, Supervision, Writing – review & editing

    Affiliation College of Mining, Liaoning Technical University, Fuxin, Liaoning, China

  • Zhiqing Guo ,

    Roles Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    mathgzq@gmail.com

    Affiliation College of Mining, Liaoning Technical University, Fuxin, Liaoning, China

  • Wei Liu,

    Roles Funding acquisition, Supervision, Validation

    Affiliation College of Science, Liaoning Technical University, Fuxin, Liaoning, China

  • Feng Jiang,

    Roles Software, Validation, Visualization

    Affiliation College of Science, Liaoning Technical University, Fuxin, Liaoning, China

  • Ensan Fu

    Roles Formal analysis, Supervision, Visualization

    Affiliation College of Mining, Liaoning Technical University, Fuxin, Liaoning, China

Abstract

This paper proposes a feature selection method based on a hybrid optimization algorithm that combines the Golden Jackal Optimization (GJO) and Grey Wolf Optimizer (GWO). The primary objective of this method is to create an effective data dimensionality reduction technique for eliminating redundant, irrelevant, and noisy features within high-dimensional datasets. Drawing inspiration from the Chinese idiom “Chai Lang Hu Bao,” hybrid algorithm mechanisms, and cooperative behaviors observed in natural animal populations, we amalgamate the GWO algorithm, the Lagrange interpolation method, and the GJO algorithm to propose the multi-strategy fusion GJO-GWO algorithm. In Case 1, the GJO-GWO algorithm addressed eight complex benchmark functions. In Case 2, GJO-GWO was utilized to tackle ten feature selection problems. Experimental results consistently demonstrate that under identical experimental conditions, whether solving complex benchmark functions or addressing feature selection problems, GJO-GWO exhibits smaller means, lower standard deviations, higher classification accuracy, and reduced execution times. These findings affirm the superior optimization performance, classification accuracy, and stability of the GJO-GWO algorithm.

1. Introduction

With the continuous evolution and innovation in computer science technology and storage hardware and their widespread applications in fields such as finance, social media, and biomedicine, various forms of unstructured data have experienced exponential growth [1]. However, these unstructured data types typically contain numerous redundant, irrelevant, and noisy features, making subsequent data mining and scientific research processes challenging [2]. Therefore, the rational and practical identification of optimal feature subsets within these unstructured data collections is essential for subsequent data engineering research.

Feature selection (FS) is a method employed to reduce the dimensionality of data by eliminating a substantial number of redundant, irrelevant, and noisy features from the original dataset while endeavoring to retain all the essential attributes [3–5]. Based on the variations in search and evaluation techniques, feature selection is traditionally classified into three fundamental categories: filter, wrapper, and embedded methods [6, 7]. Notably, a trade-off characterizes the relationship between filter and wrapper methods. While filter methods are computationally more efficient, wrapper methods excel in feature selection tasks by incorporating the classification model’s feedback [8]. Consequently, this paper’s primary research focuses on the wrapper-type feature subset search process.

The crux of addressing the feature selection problem lies in searching for and evaluating feature subsets [9]. This search for a feature subset can be seen as a combinatorial optimization problem, traditionally approached through exhaustive techniques or heuristic approaches. However, the perpetual accumulation of data has sparked the predicament known as the “curse of dimensionality” [10], rendering traditional methods that rely on exhaustive sampling of every data point or heuristic techniques impractical [11]. In essence, selecting informative feature subsets from high-dimensional data presents notable challenges, necessitating the development of effective feature selection methods to efficiently reduce the original data into a lower-dimensional space [12, 13].

Recently, metaheuristic algorithms have emerged as a preferred tool for tackling combinatorial optimization problems [8, 14–16]. These algorithms are highly regarded for their straightforward heuristics, robust global search capabilities, and insensitivity to parameter settings, making them versatile solutions across various domains [17–19]. When employed as search strategies for feature subset selection to address feature selection challenges, metaheuristic algorithms prove advantageous in circumventing the issues associated with traditional optimization methods.

Consequently, various metaheuristic algorithms have been applied to feature selection problems, yielding meaningful research outcomes [20–26]. Notable examples of these algorithms include the Genetic Algorithm [27–29], Particle Swarm Optimization (PSO) [30–32], Ant Colony Optimization (ACO) [33–36], Artificial Bee Colony Algorithm (ABC) [37–39], Grey Wolf Optimizer (GWO) [40–46], Whale Optimization Algorithm (WOA) [47–53], Multi-verse Optimizer (MVO) [41, 54–56], Salp Swarm Algorithm (SSA) [57–62], Atom Search Optimization (ASO) [63, 64], Harris Hawks Optimizer (HHO) [65–68], Grasshopper Optimization Algorithm (GOA) [69–71], and Sooty Tern Optimization Algorithm (STOA) [72, 73], among others. Table 1 summarizes select metaheuristic algorithms employed for feature selection in the past three years.

Table 1. Typical metaheuristic algorithms used for feature selection in the past three years.

https://doi.org/10.1371/journal.pone.0295579.t001

The metaheuristic algorithms mentioned above have each made valuable contributions to feature selection at different points in time, capitalizing on their respective strengths. Nevertheless, as per the No-Free-Lunch theorem [82], it is essential to acknowledge that no single algorithm can universally address all optimization problems. This recognition drives researchers towards a continuous quest for more advanced and versatile algorithms capable of addressing diverse challenges.

The Golden Jackal Optimization (GJO) algorithm [83] has emerged as a promising contender among these metaheuristic algorithms. GJO draws inspiration from the hunting behavior of golden jackals and is renowned for its minimal parameterization, swift search capabilities, and remarkable global exploration potential. It has found applications across a spectrum of complex problem domains. However, the foundational GJO algorithm grapples with certain limitations when confronted with intricate optimization challenges, particularly issues associated with local optima and diminished solution precision. Therefore, a significant impetus behind this study is enhancing the GJO algorithm to improve its optimization performance. Furthermore, another crucial motivation is to explore the application of this enhanced GJO algorithm in tackling feature selection problems. To be specific, the main contributions of this paper are outlined as follows:

  1. Drawing inspiration from the Chinese idiom “Chai Lang Hu Bao” and the principles of hybrid algorithm mechanisms, we incorporated the leadership strategy of the head wolf and the hierarchical structure from GWO into the GJO algorithm. This integration serves to diversify the solutions during the algorithm’s iterations. By enhancing solution diversity, we have increased the GJO algorithm’s ability to escape local optima, thus reinforcing its global exploration capabilities.
  2. Drawing inspiration from the collaborative mechanisms observed in natural populations, we introduced the Lagrange interpolation method to update the population’s positions within the GJO algorithm. This addition aims to enhance the algorithm’s convergence accuracy. The novel population updating mechanism strengthens the algorithm’s local exploitation capabilities.
  3. We amalgamated the GWO algorithm, Lagrange interpolation method, and GJO algorithm to introduce the multi-strategy fusion GJO-GWO algorithm. Subsequently, we successfully integrated this algorithm with the KNN classifier to address feature selection problems.
  4. We applied the proposed GJO-GWO algorithm to eight benchmark functions and ten feature selection problems. Experimental results indicate that, under identical experimental conditions, the GJO-GWO algorithm exhibits superior optimization performance, classification performance, and stability.

The organizational structure of this study is outlined as follows: In Section 2, we introduce the standard GJO algorithm. In Section 3, we present the multi-strategy fusion GJO-GWO algorithm. Section 4 explores the search and optimization performance of the GJO-GWO algorithm when dealing with complex benchmark functions. Section 5 investigates the feature selection method’s convergence and classification performance based on GJO-GWO. Finally, Section 6 summarizes the current research and discusses potential future research directions.

2. The GJO algorithm

The Golden Jackal Optimization (GJO) algorithm, developed by Chopra et al., draws inspiration from the biological population habits and predatory behavior of golden jackals. It is a novel metaheuristic algorithm that employs mathematical modeling techniques to simulate the hunting behavior of golden jackal populations, encompassing prey search, tracking, surrounding, and attacking processes. In the GJO algorithm, each individual within the population represents an initial feasible solution. The algorithm iteratively updates this population, simulating the golden jackal population’s search, tracking, surrounding, and attacking behavior until the pack successfully captures its prey, constituting the algorithm’s stopping condition. When this condition is met, it indicates no significant change between the previous and subsequent generations of the population, signifying the discovery of the optimal solution or optimal solution set. The GJO algorithm comprises four main processes.

(1) Population initialization—Algorithm initialization

Like other metaheuristic algorithms, the initial population of the GJO algorithm is randomly distributed across the search space. It can be defined as: (1) $Y_0 = Y_{\min} + rand \times (Y_{\max} - Y_{\min})$, where $Y_0$ represents the initial population of golden jackals, $Y_{\max}$ and $Y_{\min}$ correspond to the upper and lower boundaries of the search space, respectively, and $rand$ denotes a random number within the range of [0, 1].

In the GJO, the initial matrix of prey is defined as follows: (2) $Prey = \begin{bmatrix} y_{1,1} & y_{1,2} & \cdots & y_{1,d} \\ y_{2,1} & y_{2,2} & \cdots & y_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ y_{n,1} & y_{n,2} & \cdots & y_{n,d} \end{bmatrix}$, where Prey represents the prey matrix, $y_{i,j}$ represents the value of the jth dimension for the ith prey, n denotes the number of prey, and d represents the dimensionality of the problem being solved. During the algorithm’s iterative process, the fitness value of each prey is calculated using an appropriate fitness function. Therefore, the fitness values of all the prey can be expressed as follows: (3) $FOA = \begin{bmatrix} f(y_{1,1}, y_{1,2}, \cdots, y_{1,d}) \\ f(y_{2,1}, y_{2,2}, \cdots, y_{2,d}) \\ \vdots \\ f(y_{n,1}, y_{n,2}, \cdots, y_{n,d}) \end{bmatrix}$, where FOA is the fitness value matrix of all preys and f is the fitness function.
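To make the initialization step concrete, the following minimal Python sketch (an illustration, not the authors' code) generates a random population inside the search bounds as in Eq (1) and evaluates its fitness values as in Eq (3); the sphere function is used here only as a stand-in fitness function.

    import numpy as np

    def initialize_population(n, d, y_min, y_max, rng=None):
        # Random prey/jackal positions inside [y_min, y_max] (Eq 1).
        rng = np.random.default_rng() if rng is None else rng
        return y_min + rng.random((n, d)) * (y_max - y_min)

    def evaluate_population(prey, fitness_fn):
        # Fitness vector FOA: one value per prey row (Eq 3).
        return np.array([fitness_fn(row) for row in prey])

    prey = initialize_population(n=30, d=10, y_min=-100.0, y_max=100.0)
    foa = evaluate_population(prey, lambda y: float(np.sum(y ** 2)))
    print(prey.shape, foa.shape)  # (30, 10) (30,)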

(2) Searching and tracking the prey—iterative search process

Golden jackals exhibit inherent autonomous prey perception and tracking capabilities in the natural world. When a member of the population senses the presence of prey, the male jackal assumes the role of the leader, guiding the female jackal in the pursuit of the prey. This process can be represented through mathematical modeling as follows:

(4) $Y_1(t) = Y_M(t) - E \cdot \left| Y_M(t) - rl \cdot Prey(t) \right|$

(5) $Y_2(t) = Y_{FM}(t) - E \cdot \left| Y_{FM}(t) - rl \cdot Prey(t) \right|$

Where t represents the current iteration number. Prey(t) is the prey position at the tth iteration. $Y_M(t)$ and $Y_{FM}(t)$ represent the positions of the male jackal and the female at the tth iteration, respectively. $Y_1(t)$ and $Y_2(t)$ represent the updated positions of the male jackal and the female. E is the energy function of the prey evading the golden jackal and is defined as:

(6) $E = E_1 \times E_0$

(7) $E_1 = c_1 \times \left( 1 - \frac{t}{T} \right)$

(8) $E_0 = 2r - 1$

Where $E_1$ represents the energy decline process of the prey, $E_0$ is the initial energy state of the prey, r is a random number between [0, 1], $c_1$ is a constant equal to 1.5, and T represents the maximum number of iterations of the algorithm.

In Eq (4) and Eq (5), rl represents a random number generated from the Levy distribution, and it can be calculated using the following formula: (9) $rl = 0.05 \times LF(y)$, where LF is the Levy flight function, defined as: (10) $LF(y) = 0.01 \times \frac{\mu \times \sigma}{|v|^{1/\beta}}, \quad \sigma = \left( \frac{\Gamma(1+\beta)\sin(\pi\beta/2)}{\Gamma\!\left(\frac{1+\beta}{2}\right)\beta \, 2^{(\beta-1)/2}} \right)^{1/\beta}$, where μ and v are random numbers between [0, 1] and β = 1.5.
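As an illustration of Eqs (6)–(10), the sketch below (our own reading of the standard GJO formulation, not the authors' code) computes the decaying escape energy E and a Levy-distributed step rl; following the text, μ and v are drawn uniformly from [0, 1] and β = 1.5, and a small constant guards against division by zero.

    import numpy as np
    from math import gamma, pi, sin

    def escape_energy(t, T, c1=1.5, rng=None):
        # E = E1 * E0 with E1 = c1 * (1 - t / T) and E0 = 2r - 1 (Eqs 6-8).
        rng = np.random.default_rng() if rng is None else rng
        e0 = 2.0 * rng.random() - 1.0
        e1 = c1 * (1.0 - t / T)
        return e1 * e0

    def levy_step(d, beta=1.5, rng=None):
        # rl = 0.05 * LF(y) with the Levy flight function of Eq (10).
        rng = np.random.default_rng() if rng is None else rng
        sigma = (gamma(1 + beta) * sin(pi * beta / 2)
                 / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
        mu, v = rng.random(d), rng.random(d)
        return 0.05 * 0.01 * mu * sigma / (np.abs(v) + 1e-12) ** (1 / beta)

    print(escape_energy(t=10, T=500), levy_step(5))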

(3) Surrounding and attacking the prey—iterative approximation process

As time elapses, the prey’s diminishing escape energy leads to a gradual encirclement and attack by the population of golden jackals. This process can be mathematically represented as follows:

(11) $Y_1(t) = Y_M(t) - E \cdot \left| rl \cdot Y_M(t) - Prey(t) \right|$

(12) $Y_2(t) = Y_{FM}(t) - E \cdot \left| rl \cdot Y_{FM}(t) - Prey(t) \right|$

(4) Capturing the prey—algorithm termination

The population of golden jackals cooperatively surrounds and attacks the prey, eventually resulting in the successful capture of the prey. This process can be delineated as follows: (13) $Y(t+1) = \frac{Y_1(t) + Y_2(t)}{2}$

Where Y(t+1) is the position of the golden jackal at the (t+1)th iteration. When Y1(t) and Y2(t) do not change significantly, that is, when Y(t+1) and Y(t) do not change significantly, the golden jackal successfully captures the prey, the algorithm iteration terminates, and the algorithm finds the optimal solution.
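For reference, the following sketch combines the two search phases with the averaging rule of Eq (13). The exploration and exploitation forms used here are the standard GJO equations as published by Chopra et al. (our reconstruction, so treat the exact expressions as an assumption); male, female, and prey_pos denote the male-jackal, female-jackal, and current search-agent positions.

    import numpy as np

    def gjo_update(male, female, prey_pos, E, rl):
        # Exploration when |E| > 1 (Eqs 4-5), exploitation otherwise (Eqs 11-12).
        if abs(E) > 1:
            y1 = male - E * np.abs(male - rl * prey_pos)
            y2 = female - E * np.abs(female - rl * prey_pos)
        else:
            y1 = male - E * np.abs(rl * male - prey_pos)
            y2 = female - E * np.abs(rl * female - prey_pos)
        return (y1 + y2) / 2.0  # averaged new position (Eq 13)

    pos = gjo_update(np.zeros(3), np.ones(3), np.full(3, 0.5), E=0.8, rl=np.full(3, 0.05))
    print(pos)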

3. GJO-GWO hybrid optimization algorithm based on multi-strategy fusion

In the GJO algorithm, the individual search mechanism of the jackal serves as an efficient strategy for achieving rapid convergence. On the other hand, the collective behavior of the jackal ensures the algorithm’s capability to approach the global optimum. Therefore, striking a balance between these two strategies is paramount to facilitate the algorithm’s swift convergence toward the global optimal solution. Nevertheless, adhering to the No-Free-Lunch theorem [82], no solitary approach can comprehensively address all problems. To enhance the convergence and optimization performance of the GJO algorithm, this paper introduces the wolf search strategy from the GWO algorithm and integrates the Lagrange interpolation method.

3.1 A variation of GJO and GWO

3.1.1 The GWO algorithm.

The GWO is a classical metaheuristic algorithm that simulates the hunting behavior of a pack of grey wolves consisting of an α wolf, β wolf, δ wolf, and ω wolf. These wolves collaborate to search, track, and surround their prey. The algorithm is based on the mathematical model that emulates the hunting process of a grey wolf pack, aiming to optimize the objective function in the solution space iteratively. The primary model of the GWO algorithm is as follows:

(1) Predation and hunting model of grey wolves. In the GWO, wolves of different levels cooperate with each other and jointly search for prey. When the prey is found, the α wolf leads the wolves of other levels to track, surround, and attack the prey until it is captured. The mathematical model of this process is as follows: (14) $D_p = \left| C_p \cdot X_p(t) - X(t) \right|$, (15) $X_p'(t) = X_p(t) - A_p \cdot D_p$, where t is the current iteration number and p = 1, 2, 3 corresponds to the α wolf, β wolf, and δ wolf. $X_p(t)$ and X(t) represent the positions of the prey and the grey wolves at the tth iteration. $D_p$ represents the distance between the α, β, and δ wolves and the prey in the tth iteration ($D_p \to 0$ means that the grey wolves chase, attack, and gradually surround and capture the prey). $A_p$ and $C_p$ are important parameters that control the hunting step of the grey wolves.

(2) The location update model of grey wolves. In the process of hunting in a grey wolf pack, the position update equation for the wolves is defined as follows: (16) $X(t+1) = \frac{X_1'(t) + X_2'(t) + X_3'(t)}{3}$. In this equation, X(t+1) represents the updated positions of the grey wolves after the tth iteration, which corresponds to the initial positions of the grey wolves at the (t+1)th iteration. The algorithm iterates until there is no significant change in the positions of the grey wolves between two consecutive iterations. This indicates that the grey wolves have successfully captured their prey, and the algorithm stops iterating.
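The sketch below implements the canonical GWO update described by Eqs (14)–(16) (again our reconstruction rather than the authors' code): each of the three leaders contributes a candidate position and the wolf moves to their average; the control parameter a is assumed to decay linearly from 2 to 0 over the iterations.

    import numpy as np

    def gwo_update(x, alpha, beta, delta, a, rng=None):
        # One grey-wolf position update driven by the alpha, beta, and delta leaders.
        rng = np.random.default_rng() if rng is None else rng
        candidates = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            A = 2 * a * r1 - a                 # step-control parameter A_p
            C = 2 * r2                         # step-control parameter C_p
            D = np.abs(C * leader - x)         # distance to the leader (Eq 14)
            candidates.append(leader - A * D)  # candidate position (Eq 15)
        return np.mean(candidates, axis=0)     # averaged update (Eq 16)

    x_new = gwo_update(np.zeros(4), np.ones(4), np.full(4, 0.5), np.full(4, -0.5), a=1.0)
    print(x_new)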

3.1.2 Introducing the α-wolf in the GJO algorithm.

The integration of the Grey Wolf Optimization (GWO) algorithm into the GJO algorithm is motivated by the Chinese idiom: “Chai Lang Hu Bao.” As social animals such as jackals and wolves engage in cooperative hunting, their collaboration enables them to search and encircle prey across a more comprehensive spatial area. This collaborative effort compensates for individual differences, leading to an enhanced hunting success rate and an improved survival rate for the population. Even though both wolf and jackal packs collaborate in hunting, they exhibit distinctive hunting strategies. Jackal packs usually lack a strict hierarchical structure and rely on individual jackals forming groups to hunt collectively. In contrast, wolf packs are typically led by a dominant alpha wolf, adhering to a rigorous leadership hierarchy for coordinating group hunting activities.

To bolster the optimization performance of the GJO algorithm, this paper introduces the leadership strategy of the alpha wolf and the hierarchical structure concept from the GWO algorithm, disregarding competitive interactions between different populations. The alpha wolf is seamlessly integrated into the GJO algorithm as a secondary-tier individual, distinct from the golden jackals. The alpha wolf’s primary role is to aid the golden jackals in their search for and encirclement of prey. This introduction of leadership strategy and hierarchical structure from the GWO algorithm is a deliberate effort to enhance the GJO algorithm’s optimization capability.

Introducing the alpha wolf necessitates notable adjustments to the critical iterative processes of the Improved Golden Jackal Optimization (IGJO) algorithm. These specific modifications are detailed below.

  1. (1) Searching and tracking the prey—iterative search process.
    (17) (18) (19)
  2. (2) Surrounding and attacking the prey—iterative approximation process.
    (20) (21) (22)

Introducing the alpha wolf to the GJO algorithm instigates a transformation from the initial solitary hunting mechanism, reliant solely on the cooperation between golden jackals, to a collaborative hunting approach led by the golden jackal pair and the alpha wolf. As illustrated by Eqs (17) through (22), the inclusion of the alpha wolf extends the search radius of the initial population during the early stages (Eqs 17–19), consequently elevating the chances of detecting the prey. Additionally, it heightens the effectiveness of surrounding and capturing the prey (Eqs 20–22). The alpha wolf, guided by the golden jackal pair, significantly accelerates and enhances the encircling of the prey, reducing the probability of the prey escaping the population’s pursuit range. This, in turn, results in improved hunting speed and precision for the population.

However, with the introduction of the alpha wolf, the IGJO algorithm incorporates new parameters A and C, leading to an increase in algorithm complexity. To mitigate the potential impact of these new parameters, we substitute parameters A and C with parameters E and rl from the original GJO algorithm. As a result, Eq (19) and Eq (22) are modified as follows: (23) (24)

3.2 Collaborative updating mechanism of GJO-GWO based on Lagrange interpolation

Incorporating the hierarchical structure from the GWO algorithm and introducing the alpha wolf in the IGJO algorithm necessitate a modification of the original position update equation (Eq (13)) used in the basic GJO algorithm. In the fundamental GJO algorithm, when there is no alteration between the population positions of the previous generation and the current generation, signifying convergence, both the male and female jackals should occupy the same position; combining Eq (13), it can be observed that Y1(t) and Y2(t) should be equal in this case. However, with the introduction of the hierarchical structure of the grey wolf pack and the α wolf, using Eq (13) as the position update equation does not account for the α wolf’s influence (Y3(t)). This directly affects the precise application of the improvement mechanism proposed in Section 3.1 for enhancing the optimization performance of the GJO algorithm.

Considering the integration of the grey wolf optimization algorithm, the population now consists of three fundamental elements: the male jackal, the female jackal, and the α wolf. The positions of these three essential elements can be simplified as three points (Y1(t), Y2(t), and Y3(t)). To ensure the convergence of the position update equation, the constructed three-point iteration formula must intersect; additionally, considering the cooperation and convergence among the male jackal, female jackal, and α wolf, there should be interactions among the three points. Based on this, we introduce the Lagrange three-point interpolation formula, resulting in the new population update equation: (25) where Y1(t), Y2(t), and Y3(t) represent the positions of the male jackal, the female jackal, and the α wolf in the tth iteration, respectively. Y(t) represents the population position after the last iteration, that is, the initial position of the current population iteration, and Y(t+1) represents the population position after the iterative update. The constant 3 mainly controls the convergence of the iterative equation and ensures that the distribution weights of the male jackal, female jackal, and α wolf are consistent.
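As background for Eq (25), the snippet below evaluates a generic three-point Lagrange interpolant; it is meant only to illustrate the mathematical tool that the update equation is built on, not to reproduce the paper's exact combination of Y1(t), Y2(t), and Y3(t).

    def lagrange_three_point(x_nodes, y_nodes, x_query):
        # Quadratic Lagrange interpolant through three (x, y) pairs, evaluated at x_query.
        assert len(x_nodes) == len(y_nodes) == 3
        result = 0.0
        for i in range(3):
            basis = 1.0
            for j in range(3):
                if j != i:
                    basis *= (x_query - x_nodes[j]) / (x_nodes[i] - x_nodes[j])
            result += y_nodes[i] * basis
        return result

    # The parabola through (0, 1), (1, 3), (2, 9) is y = 2x^2 + 1, so this prints 5.5.
    print(lagrange_three_point([0.0, 1.0, 2.0], [1.0, 3.0, 9.0], 1.5))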

Incorporating the enhancements detailed in Section 3.1 and Section 3.2, the improved iteration process of the Golden Jackal Optimization algorithm, referred to as the Multi-Strategy Integrated Golden Jackal-Grey Wolf Hybrid Optimization Algorithm (GJO-GWO), is illustrated in the basic flowchart presented in Fig 1.

3.3 GJO-GWO algorithm execution

Combining the algorithm improvement mechanisms with Fig 1, the pseudocode for the proposed GJO-GWO algorithm is detailed in Algorithm 1.

Algorithm 1: The GJO-GWO Algorithm

Input: The population N, variable dimension d, and Maximum number of iterations T.

01: Initializing the population

02: while (t<T)

03:    if f < fM

04:        fM = f

05:    elseif fM < f < fFM

06:        fFM = f

07:    elseif fM < fFM < f < fα

08:        fα = f

09:    end if

10:    Calculating the random number rl associated with the levy function using Eqs (9) and (10)

11:    for (Iterating through each individual in the population)

12:            Computing the energy function E for prey avoiding the jackal wolves based on Eqs (6), (7), and (8)

13:            if |E|<1 (EXPLOITATION)

14:                Update Y1(t) using Eq (20)

15:                Update Y2(t) using Eq (21)

16:                Update Y3(t) using Eq (24)

17:            else (EXPLORATION)

18:                Update Y1(t) using Eq (17)

19:                Update Y2(t) using Eq (18)

20:                Update Y3(t) using Eq (23)

21:        end if

22:    Updating the population positions according to Eq (25)

23:      end for

24:      Boundary handling

25:      t = t+1

26: end while

Output: Y1(t) and fM
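To show how the pieces of Algorithm 1 fit together, here is a runnable Python skeleton of the main loop. The leader updates inside the two branches are assumptions: they mirror the standard GJO forms for the male and female jackals, apply the same form to the α wolf with A and C replaced by E and rl as stated in Section 3.1.2, and stand in for the exact Eqs (17)–(24); the final equal-weight combination of Y1, Y2, and Y3 is likewise only a placeholder for the Lagrange-based update of Eq (25).

    import numpy as np
    from math import gamma, pi, sin

    def gjo_gwo_sketch(fitness_fn, n=30, d=10, y_min=-100.0, y_max=100.0, T=200, seed=0):
        rng = np.random.default_rng(seed)
        beta, c1 = 1.5, 1.5
        sigma = (gamma(1 + beta) * sin(pi * beta / 2)
                 / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
        Y = y_min + rng.random((n, d)) * (y_max - y_min)        # population init (Eq 1)
        fit = np.array([fitness_fn(y) for y in Y])
        order = np.argsort(fit)                                 # male < female < alpha
        male, female, alpha = (Y[k].copy() for k in order[:3])
        f_male = fit[order[0]]
        for t in range(T):
            for i in range(n):
                E = c1 * (1 - t / T) * (2 * rng.random() - 1)   # escape energy (Eqs 6-8)
                rl = 0.05 * 0.01 * rng.random(d) * sigma / (rng.random(d) + 1e-12) ** (1 / beta)
                if abs(E) < 1:                                  # EXPLOITATION (assumed forms)
                    Y1 = male - E * np.abs(rl * male - Y[i])
                    Y2 = female - E * np.abs(rl * female - Y[i])
                    Y3 = alpha - E * np.abs(rl * alpha - Y[i])
                else:                                           # EXPLORATION (assumed forms)
                    Y1 = male - E * np.abs(male - rl * Y[i])
                    Y2 = female - E * np.abs(female - rl * Y[i])
                    Y3 = alpha - E * np.abs(alpha - rl * Y[i])
                Y[i] = (Y1 + Y2 + Y3) / 3.0                     # placeholder for Eq (25)
            Y = np.clip(Y, y_min, y_max)                        # boundary handling
            fit = np.array([fitness_fn(y) for y in Y])
            order = np.argsort(fit)
            if fit[order[0]] < f_male:
                male, f_male = Y[order[0]].copy(), fit[order[0]]
            female, alpha = Y[order[1]].copy(), Y[order[2]].copy()
        return male, f_male

    best, best_f = gjo_gwo_sketch(lambda y: float(np.sum(y ** 2)))
    print(best_f)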

3.4 Computational complexity

This subsection primarily analyzes the time complexity and space complexity of the proposed GJO-GWO algorithm in this paper.

3.4.1 Time complexity.

Similar to other metaheuristic algorithms, the complexity of the GJO-GWO algorithm is predominantly influenced by three key processes: initialization, fitness evaluation, and individual updating [84, 85]. Notably, the complexity of fitness evaluation is intricately dependent on the intricacy of the specific optimization problem under consideration; consequently, we shall refrain from an exhaustive examination of this aspect. The GJO-GWO algorithm’s initialization process encompasses two distinct sub-processes: jackal initialization and wolf initialization. As a result, the initialization time complexity of the GJO-GWO algorithm stands at O(2N).

Further within the GJO-GWO algorithm, during the individual updating process, the male and female jackals and the wolf undergo updates subject to distinct constraint conditions. To elaborate, the individual update time complexity for the male and female jackals stands at O(2×T×N×d), while the wolf’s individual update is characterized by a time complexity of O(T×N×d). Hence, the individual updating time complexity of the GJO-GWO algorithm amounts to O(3×T×N×d).

In summary, the overall time complexity of the GJO-GWO algorithm is O(2N + 3×T×N×d), where N denotes the population size, T represents the maximum number of iterations for the algorithm, and d signifies the dimensionality of the problem at hand.

3.4.2 Space complexity.

For the GJO-GWO algorithm, initializing the golden jackal and grey wolf populations occupies the most significant space. Therefore, the spatial complexity of the GJO-GWO algorithm can be characterized as O(2×N×d).

4. Test case 1: GJO-GWO for benchmark functions

In this section, we will comprehensively describe the experimental results of the GJO-GWO algorithm concerning benchmark functions. Through these experimental results, we will discuss the algorithm’s capabilities in finding optimal values and its convergence performance. Finally, we will employ two statistical tests, Wilcoxon and Friedman, to validate the statistical significance of the GJO-GWO algorithm’s superiority.

4.1 Experiment environment

(1) Environment.

Operating system: 64-bit Windows 11.

CPU: 12th Gen Intel(R) Core(TM) i5-12500H @ 2.50 GHz. Memory: 8 GB.

(2) Datasets.

To validate the optimization performance of the GJO-GWO algorithm in solving complex functions, this study conducted numerical simulation experiments on eight benchmark functions. Detailed information on these benchmark functions can be found in Table 2.

(3) Parameter settings.

This paper employed specific parameter settings for each algorithm to ensure fair and objective comparisons in the experimental setup. The parameter configurations for each algorithm are presented in Table 3.

4.2 Experimental results

4.2.1 Convergence analysis.

To conduct a preliminary analysis of the proposed GJO-GWO algorithm, we evaluated it on the eight benchmark functions presented in Table 2. The experimental results of the GJO-GWO algorithm and nine other metaheuristic algorithms are summarized in Table 4.

Table 4. Optimization results of different algorithms on benchmark functions.

https://doi.org/10.1371/journal.pone.0295579.t004

Table 4 illustrates the results obtained under identical experimental conditions:

  1. Overall Optimization Results: The GJO-GWO algorithm consistently demonstrates smaller mean values and standard deviations than the other algorithms across most benchmark functions. This signifies that the enhanced algorithm exhibits superior convergence performance and optimization capabilities compared to its counterparts.
  2. Optimization Results for Unimodal Functions (F1-F4): GJO-GWO consistently achieves smaller mean values and standard deviations than the nine other algorithms across the four unimodal functions. This is evidence of the algorithm’s exceptional optimization performance and stability in locating global optima. These findings underscore GJO-GWO’s enhanced local exploitation capabilities on unimodal functions.
  3. Optimization Results for Multimodal Functions (F5-F8): Except for a slightly lower performance on F6, GJO-GWO consistently outperforms the other algorithms when optimizing multi-peak functions. Consequently, concerning the overall optimization results for multimodal functions, GJO-GWO exhibits superior global exploration capabilities compared to its counterparts.

The experimental results unequivocally establish that the GJO-GWO algorithm achieves convergence in solving complex functions. It consistently outperforms the competing algorithms regarding mean values and standard deviations, thus highlighting its robust optimization and exploration capabilities. These outcomes validate the efficacy of the GJO-GWO algorithm in tackling function optimization tasks.

To visually compare and analyze the superior performance of the GJO-GWO algorithm in solving complex functions in comparison to the other nine metaheuristic algorithms, we have plotted basic graphs of the eight benchmark functions and convergence curves for each algorithm after 1000 iterations, as depicted in Fig 2.

Fig 2. Benchmark function graphs and convergence curves of different algorithms.

https://doi.org/10.1371/journal.pone.0295579.g002

As observed in Fig 2, it becomes evident that the GJO-GWO algorithm exhibits the fastest and earliest convergence for both unimodal and multimodal functions. This observation strongly suggests that the GJO-GWO algorithm possesses a higher convergence rate and superior convergence accuracy, thus affirming its exceptional optimization performance in addressing complex functions.

In summary, under uniform experimental conditions, the GJO-GWO algorithm, which combines multiple strategies from the Golden Jackal Optimization and Grey Wolf Optimization, surpasses the other nine metaheuristic algorithms in terms of both mean and standard deviation indicators. This substantiates GJO-GWO’s superior local exploitation capability and enhanced global exploration capability in solving unimodal and multimodal functions. Furthermore, the minimal standard deviation indicates GJO-GWO’s heightened robustness when optimizing complex functions.

4.2.2 Statistical test analysis.

To comprehensively and objectively assess the optimization performance of the GJO-GWO algorithm, this study employed two statistical tests for evaluation.

Firstly, to assess the significant differences between the GJO-GWO algorithm and the other algorithms, pairwise comparisons were conducted using the GJO-GWO algorithm as the control. The Wilcoxon rank-sum test [90] was performed at a significance level of 5%, and the corresponding p-values are presented in Table 5. In Table 5, the symbols ‘+’ and ‘-’ indicate whether the algorithm has a statistically significant advantage (‘+’) or not (‘-’). The results in Table 5 reveal that the p-values obtained from the Wilcoxon rank-sum test for the GJO-GWO algorithm against all the other algorithms are significantly smaller than 0.05. This signifies that the GJO-GWO algorithm demonstrates a noteworthy advantage over the nine compared algorithms regarding optimization performance.

Secondly, while the Wilcoxon rank-sum test primarily focuses on comparing the performance of two algorithms, it is also necessary to evaluate the performance of each algorithm within the entire set. We therefore employed the Friedman test [91], a non-parametric test, to determine whether there were significant differences among multiple algorithm distributions. This test utilizes ranks to assess the overall optimization performance of the GJO-GWO algorithm across the eight benchmark functions and to identify significant differences among the observed data. The results of the Friedman test for the GJO-GWO algorithm are presented in Table 6. As shown in Table 6, the GJO-GWO algorithm achieves the best rank among the ten algorithms, thus confirming its significant advantage over the nine compared metaheuristic algorithms.
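For readers who wish to reproduce this kind of analysis, the snippet below shows how the two tests can be run with SciPy on hypothetical run results (the numbers are made up for illustration and are not the paper's data).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    gjo_gwo_runs = rng.normal(loc=1e-8, scale=1e-9, size=30)     # best fitness over 30 runs
    competitor_runs = rng.normal(loc=1e-6, scale=1e-7, size=30)

    # Pairwise Wilcoxon rank-sum test at the 5% significance level.
    stat, p_value = stats.ranksums(gjo_gwo_runs, competitor_runs)
    print(f"rank-sum p-value: {p_value:.3e}, significant: {p_value < 0.05}")

    # Friedman test across three algorithms over eight benchmark functions
    # (each argument holds one algorithm's mean result per function).
    alg_a, alg_b, alg_c = rng.random(8), rng.random(8), rng.random(8)
    chi2, p_friedman = stats.friedmanchisquare(alg_a, alg_b, alg_c)
    print(f"Friedman p-value: {p_friedman:.3e}")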

In conclusion, under uniform constraint conditions, the GJO-GWO algorithm exhibits superior overall metrics (lower mean and standard deviation) and superior statistical results (in the Wilcoxon rank-sum and Friedman tests) compared to the nine algorithms. These findings underscore the GJO-GWO algorithm’s enhanced local exploitation and global exploration capabilities.

4.3 Discussion

In Case 1, we conducted both convergence analysis and statistical tests to evaluate the performance of the GJO-GWO algorithm when solving benchmark functions of varying modes. By studying the experimental results of ten metaheuristic algorithms across different function modes, we comprehensively understood how the GJO-GWO algorithm performs in optimizing problems.

The GJO-GWO algorithm performs better in finding optimal values, stability, convergence, and statistical significance. These remarkable achievements can be attributed to the introduction of the alpha wolf and the cooperative strategies among the alpha wolf, male jackal, and female jackal. Primarily, during the initial iterations of the algorithm, the introduction of the alpha wolf expands the algorithm’s search space, increasing the likelihood of the population discovering prey and thereby enhancing the algorithm’s chances of escaping local optima. Furthermore, in the later iterations of the algorithm, the cooperation between the alpha wolf, male jackal, and female jackal accelerates the population’s updating process, facilitating the algorithm in converging to the global optimum more rapidly.

However, it should be noted that as a hybrid optimization algorithm, the GJO-GWO algorithm exhibits a significant increase in both time and space complexity (as detailed in Section 3.4). This implies that while the proposed algorithm enhances performance, it also demands higher computational resources. Nevertheless, we consider this performance improvement worthwhile because, within our acceptable limits, we are willing to make certain computational sacrifices to achieve superior optimization performance. Hence, we regard the GJO-GWO algorithm as a meaningful algorithm enhancement.

5. Test case 2: GJO-GWO for feature selection

This section will provide a detailed description of applying the GJO-GWO algorithm to feature selection problems. Firstly, we will outline the specific implementation process of the GJO-GWO algorithm in the context of feature selection. Secondly, we will assess the performance of the GJO-GWO algorithm in feature selection problems based on experimental results. Finally, we will employ statistical analysis methods to confirm the exceptional performance of the GJO-GWO algorithm in feature selection tasks.

5.1 Implementation process

5.1.1 Initialization.

  1. Binary Conversion. During the initialization phase, the GJO-GWO algorithm randomly generates an initial population of N candidate solutions, where each individual represents a feature subset to be evaluated. However, feature selection problems are typically binary discrete problems. Therefore, when using the GJO-GWO algorithm to select feature subsets for evaluation, it is necessary to map the feature vectors from continuous to binary discrete space. This transformation is defined as: (26) where xbinary represents the feature value after binarization, xij indicates the real value, i = 1,2,⋯,N, and j = 1,2,⋯,D.
  2. Fitness function calculation. Once the feature subsets are selected, it is necessary to calculate the fitness function for these feature subsets to determine their quality. The equation for computing the fitness function is defined as: (27) $Fitness = \alpha \cdot \gamma_R(D) + \beta \cdot \frac{|R|}{|C|}$
    Where γR(D) represents the KNN classification error rate, |R| represents the length of the selected feature subset, |C| represents the total number of features in the datasets, α∈(0,1) represents the importance of classification quality, and β = (1−α) represents the importance of the subset length [2]. A minimal code sketch of both steps follows this list.
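The sketch below illustrates both initialization steps. The fixed 0.5 threshold used for binarization is an assumption standing in for Eq (26), and α = 0.99 is a common setting in the wrapper-FS literature rather than a value taken from the paper; the fitness follows Eq (27).

    import numpy as np

    def binarize(x, threshold=0.5):
        # Map a continuous position vector to a binary feature mask (stand-in for Eq 26).
        return (x > threshold).astype(int)

    def feature_selection_fitness(mask, error_rate, alpha=0.99):
        # Wrapper fitness of Eq (27): alpha * error + (1 - alpha) * |R| / |C|.
        beta = 1.0 - alpha
        return alpha * error_rate + beta * mask.sum() / mask.size

    mask = binarize(np.array([0.1, 0.7, 0.4, 0.9]))
    print(mask, feature_selection_fitness(mask, error_rate=0.05))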

5.1.2 Updating solutions.

Solution updating is a crucial component of the optimization algorithm for feature selection problems, and different algorithms employ various strategies. In this critical step, the GJO-GWO algorithm continually adjusts each selected solution using Eqs (17) through (24) to pursue improved solutions. Then, through Eq (27), the fitness evaluation of the new generation of feature subsets is performed to determine the best feature combinations. Typically, this process requires multiple iterations until the termination criteria are met. In this research, the termination criteria usually refer to reaching the maximum number of iterations, which helps evaluate the performance level of the GJO-GWO algorithm.

5.1.3 Classification.

As a typical wrapper feature selection method, the feature selection approach based on GJO-GWO not only employs GJO-GWO to search for feature subsets but also requires combining a learning algorithm to simultaneously evaluate these subsets, ensuring that while reducing the number of features, a high classification accuracy is maintained. In this study, we utilized a KNN classifier (k = 5) as the learning algorithm to evaluate the feature subsets selected by the GJO-GWO algorithm. We adopted the hold-out method to classify the original dataset, randomly splitting it into two portions: 80% as the training set and 20% as the test set. The KNN classifier (k = 5) assessed the classification accuracy.
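A minimal version of this evaluation step is sketched below using scikit-learn; the breast-cancer dataset serves only as a stand-in, since the paper's ten UCI datasets are those listed in Table 7.

    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    def evaluate_subset(X, y, mask, test_size=0.2, seed=0):
        # Hold-out evaluation of one feature subset with a KNN classifier (k = 5).
        X_sub = X[:, mask.astype(bool)]                    # keep only selected features
        X_tr, X_te, y_tr, y_te = train_test_split(X_sub, y, test_size=test_size, random_state=seed)
        knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
        return knn.score(X_te, y_te)                       # classification accuracy

    X, y = load_breast_cancer(return_X_y=True)
    mask = np.ones(X.shape[1], dtype=int)                  # here: all features selected
    print(f"accuracy: {evaluate_subset(X, y, mask):.4f}")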

5.2 Experimental evaluation

In this section, we present the experimental results and discuss the performance of the proposed feature selection method based on the GJO-GWO algorithm. To achieve this, a set of ten UCI [92] classification datasets with multiple features and redundant information was selected for analysis under the same constraints.

5.2.1 Experiment environment.

  1. (1) Datasets. To validate the effectiveness of applying the GJO-GWO algorithm to feature selection problems, numerical experiments were conducted on ten datasets from the UCI repository [92]. Detailed information about these datasets is presented in Table 7.
  2. (2) Parameter settings. To provide a comprehensive and objective validation of the superiority and feasibility of the feature selection method based on GJO-GWO, the parameter settings for each algorithm are presented in Table 8.

Table 8 displays the parameter configurations for the comparative feature selection algorithms used in the experiments. These parameters encompass the population size, maximum number of iterations, and other algorithm-specific settings.

5.2.2 Evaluation metrics.

To comprehensively evaluate the GJO-GWO algorithm’s performance in feature selection problems, we utilized the metrics listed in Table 9 to assess the algorithm [100].

Table 9. Evaluation metrics of the models’ performance.

https://doi.org/10.1371/journal.pone.0295579.t009

In Table 9, Accuracy represents the classification accuracy of the algorithm on each dataset, and TP, TN, FP, and FN refer to true positives, true negatives, false positives, and false negatives, respectively. The average classification accuracy is taken across the repeated experiments on each dataset; a higher average classification accuracy indicates better classification performance of the algorithm. The average number of selected features is likewise averaged across the repeated experiments; a lower value indicates a more significant reduction in redundant information. The average runtime is the mean runtime of the algorithm across the repeated experiments; a smaller average runtime implies faster optimization performance. Acc(i) represents the classification accuracy in the ith experiment, Fea(i) represents the number of selected features in the ith experiment, and Runtime(i) is the runtime in the ith experiment, where i ranges from 1 to 10, denoting the ten repeated experiments.
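The aggregate metrics reduce to simple averages over the ten repeated runs, as in the sketch below (the numbers shown are placeholders, not experimental results).

    import numpy as np

    def summarize_runs(accuracies, feature_counts, runtimes):
        # Average accuracy, selected-feature count, and runtime over repeated runs (Table 9).
        return {
            "mean_accuracy": float(np.mean(accuracies)),
            "mean_features": float(np.mean(feature_counts)),
            "mean_runtime_s": float(np.mean(runtimes)),
        }

    print(summarize_runs(
        accuracies=[0.97, 0.98, 0.96, 0.97, 0.99, 0.98, 0.97, 0.96, 0.98, 0.97],
        feature_counts=[12, 11, 13, 12, 12, 11, 12, 13, 12, 11],
        runtimes=[3.2, 3.1, 3.4, 3.3, 3.2, 3.1, 3.3, 3.2, 3.1, 3.2],
    ))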

5.2.3 Experimental results.

Based on the parameter settings in Table 8, we conducted numerical experiments to compare the performance of the feature selection method based on GJO-GWO with 13 other metaheuristic algorithms on the ten classification datasets listed in Table 7. The experimental results are presented in Tables 10 and 11. Tables 10 and 11 display the average number of selected features and average classification accuracy of GJO-GWO when used for feature selection compared to the 13 other algorithms on the ten classification datasets, all with k = 5. The best-performing values in the tables have been highlighted in bold text. Additionally, Figs 3 and 4 visually represent each algorithm’s average number of selected features and average classification accuracy.

Fig 4. Bar chart of average classification accuracy for different algorithms.

https://doi.org/10.1371/journal.pone.0295579.g004

Table 11. Average classification accuracy of different algorithms on datasets.

https://doi.org/10.1371/journal.pone.0295579.t011

  1. (1) Impact of a single indicator on algorithm performance. Regarding the average number of selected features, as indicated by Table 10 and Fig 3, the feature selection method based on MPA demonstrates superior overall performance. Conversely, the performance of the feature selection method based on GJO-GWO is relatively weaker than that of the other 13 comparison algorithms on most datasets. Nevertheless, it still reduces the number of selected features to roughly half of the total feature count.

Regarding the average classification accuracy indicator, as shown in Table 11 and Fig 4, the feature selection method based on GJO-GWO achieves the best classification accuracy among all 14 algorithms on eight datasets, ranking first overall. Moreover, it attains 100% accuracy on five datasets (D2, D3, D4, D8, and D9).

  2. (2) Impact of multiple indicators on algorithm performance. When considering both the average number of selected features and the average classification accuracy, the feature selection method based on GJO-GWO exhibits suboptimal performance in terms of the average number of selected features but demonstrates superior classification accuracy. This aligns well with the goal of feature selection, which is to balance the number of selected features while ensuring high classification accuracy. Therefore, considering the combined influence of these two metrics, the feature selection method based on GJO-GWO outperforms the other 13 algorithms.

In summary, if one considers only the impact of a single metric, the performance of the feature selection method based on GJO-GWO proposed in this paper is moderate. However, when the interplay between the two metrics is considered comprehensively, the performance of the proposed algorithm stands out as optimal. This apparent inconsistency arises because an algorithm that identifies fewer features through the search does not necessarily achieve higher classification accuracy, let alone superior overall performance; being locally optimal at every step does not guarantee global optimality. An algorithm can be deemed superior only by selecting an appropriate number of features while ensuring optimal classification accuracy.

At first glance, the objective of the feature selection problem may seem straightforward: minimize the number of selected features while maximizing classification accuracy. However, when we consider storage space constraints, the motivation behind choosing fewer features becomes clear—it is to ensure faster algorithm execution within the same experimental environment. Consequently, in the context of solving the feature selection problem, the number of selected features is intimately linked with an algorithm’s runtime. Therefore, scrutinizing the specific runtime of algorithms in solving the feature selection problem becomes paramount for performance assessment.

In this context, to validate that the GJO-GWO-based feature selection method achieves superior classification accuracy and exhibits faster runtime, we meticulously recorded the average runtime of each algorithm across ten experiments on the ten classification datasets. This data is presented in detail in Table 12, with the best-performing values highlighted in bold for clarity. We provide a visual representation of the average runtime of each algorithm in Fig 5.

Combined with the insights from Table 12 and the visual representation in Fig 5, it becomes evident that the feature selection method utilizing GJO-GWO boasts a significantly reduced runtime compared to the comparative algorithms across all ten datasets. Remarkably, even as dataset sizes grow (measured by the number of features multiplied by the number of instances), the algorithm’s runtime remains notably lower than that of the comparative algorithms. Considering the findings from Table 11 and Fig 4, it is clear that the proposed feature selection method based on GJO-GWO achieves enhanced computational efficiency and superior classification accuracy.

In conclusion, by considering the collective impact of three pivotal indicators: average selected feature count, average classification accuracy, and average runtime, it is evident that the feature selection method based on GJO-GWO excels in addressing the feature selection problem.

5.2.4 Convergence curve analysis.

The convergence curve depicts the trend of the best fitness value at particular points or over intervals of the search, offering insights into the convergence and stability of an algorithm during the optimization process. Therefore, this section provides a detailed analysis of the convergence curves. Fig 6 presents the convergence curves of 13 metaheuristic algorithms for feature selection. By examining the convergence curves in Fig 6, it becomes evident that, except for D6, D7, and D10, the GJO-GWO algorithm achieves faster convergence to the optimal solution on the other datasets. This observation underscores that the GJO-GWO algorithm excels in terms of convergence speed and accuracy in most cases, further highlighting its outstanding performance in optimization.

Fig 6. Convergence curves of the different algorithms on the feature selection.

https://doi.org/10.1371/journal.pone.0295579.g006

5.2.5 Statistical analysis.

To comprehensively evaluate the GJO-GWO-based feature selection method’s performance concerning three key metrics: average selected feature count, classification accuracy, and runtime, we have employed the following approach:

  1. Comprehensive Ranking: We initiated the analysis by computing comprehensive rankings for the various algorithms across these three metrics based on data from Tables 10–12. The summarized rankings are presented in Table 13.
Table 13. Comprehensive rankings of different algorithms in three indicators.

https://doi.org/10.1371/journal.pone.0295579.t013

  2. Radar Chart Visualization: Subsequently, we used the rankings from Table 13 to create radar charts, exemplified in Fig 7. These radar charts visually depict algorithmic rankings across the performance metrics, with a smaller enclosed area indicating superior performance.
Fig 7. The three-indicator ranking radar chart of different algorithms.

https://doi.org/10.1371/journal.pone.0295579.g007

Table 13 and Fig 7 are vital tools for assessing algorithmic performance based on average selected feature count, classification accuracy, and runtime. Fig 7, the radar chart, is crafted to visually represent the ranking outcomes from Table 13. A smaller enclosed area within the radar chart signifies superior algorithmic performance.

Fig 7 visually portrays algorithm rankings across the three essential performance indicators; the radar chart is constructed using the complete ranking data from Table 13. When the three indicators are weighted equally, the triangle formed by the feature selection method based on EO encloses a smaller area than those of the other 13 comparative algorithms, and the feature selection method based on GJO-GWO ranks third. However, when average classification accuracy and average runtime are treated as the primary influencing factors, the feature selection method based on GJO-GWO performs best. Regarding the average number of selected features, the MVO algorithm exhibits the best performance, but at the cost of lower average classification accuracy and longer average runtime.

  3. Optimal Fitness Boxplots: We conducted ten independent experiments for each algorithm and recorded the optimal fitness values obtained in solving the feature selection problem. These values are showcased in the boxplots in Fig 8.
Fig 8. Boxplot of optimal fitness for 10 independent experiments of different algorithms.

https://doi.org/10.1371/journal.pone.0295579.g008

Fig 8 depicts the box plots of the optimal fitness values obtained by different algorithms in 10 independent experiments. From Fig 8, it can be observed that the feature selection method based on GJO-GWO exhibits a distribution of optimal fitness values in a favorable and narrow range across all ten datasets. This indicates that the improved algorithm possesses better search performance and demonstrates superior stability in finding optimal feature subsets.

In summary, considering the combined impact of the three metrics on algorithm performance, the higher average classification accuracy, shorter average running time, and more reasonable selection of feature subsets validate that the feature selection method based on GJO-GWO not only achieves faster search but also demonstrates more robust stability in solving feature selection problems.

  4. Wilcoxon rank-sum test: To better analyze the superiority of the GJO-GWO algorithm in feature selection, we recorded the average fitness of each algorithm for the feature selection tasks. Subsequently, a rank-sum test was conducted for statistical analysis, and the results are presented in Table 14.

According to the statistical results in Table 14, when pairwise comparisons are made with other algorithms, the rank-sum test values of the GJO-GWO algorithm are significantly less than 0.05 for most datasets. This indicates that, in statistical terms, GJO-GWO demonstrates a significant advantage over the 13 comparison algorithms. This profound insight underscores the outstanding performance of GJO-GWO in feature selection problems, providing robust support for its reliability and effectiveness in practical applications.

5.3 Discussion

In Case 2, we conducted an in-depth investigation into the application of the GJO-GWO algorithm in combination with the KNN classifier for feature selection problems. We analyzed the experimental results on various complex datasets for 13 metaheuristic algorithms. Through this analysis, we gained a comprehensive understanding of the performance of the GJO-GWO algorithm when applied to feature selection problems. Here are the specific findings summarized below.

First and foremost, among the 13 metaheuristic algorithms applied to feature selection problems, the GJO-GWO algorithm demonstrates exceptional exploratory and exploitative performance. Its superior performance on high-dimensional datasets highlights its versatility in addressing FS problems. Additionally, the KNN-based GJO-GWO algorithm achieves higher classification accuracy and exhibits faster convergence on most datasets compared to other optimization algorithms. Lastly, the shorter average runtime implies that the KNN-based GJO-GWO algorithm is well-suited for swiftly solving complex feature selection problems.

While the KNN-based GJO-GWO algorithm generally produces efficient results for feature selection tasks, the experiments reveal that it does not excel in the number of features selected. This highlights the inherent challenge feature selection algorithms face in maintaining high classification accuracy while reducing the number of features: there is a trade-off between the number of features and classification accuracy, and selecting the most suitable feature selection method for the specific requirements at hand can yield better results. Additionally, as the optimization results are inherently based on non-exact but repeatable processes, applying the GJO-GWO algorithm in various scenarios or problems may result in different feature subsets. Finally, it is essential to note that the KNN-based GJO-GWO feature selection algorithm, being a classical wrapper-based feature selection method, may exhibit variations in runtime, classification accuracy, and the number of selected features when used with different classifiers.

6. Conclusion and future directions

This research has been centered around addressing feature selection problems to improve the optimization performance of the GJO algorithm. Through mechanistic analysis and numerical experiments, the following conclusions have been drawn:

  1. A multi-strategy integrated Golden Jackal-Grey Wolf Hybrid Optimization Algorithm (GJO-GWO) has been proposed. Compared with nine other metaheuristic algorithms on eight benchmark functions, the proposed GJO-GWO exhibits significant advantages in terms of convergence and stability. These advantages mainly manifest in two aspects: a) Introducing the Grey Wolf Algorithm increases solution diversity. b) The position update strategy based on Lagrange interpolation enhances the algorithm’s convergence performance.
  2. An efficient feature selection method based on GJO-GWO for classification tasks has been provided. On ten high-dimensional datasets, when compared to 13 state-of-the-art feature selection techniques, the proposed feature selection method demonstrates significant advantages in terms of accuracy, convergence speed, and runtime. These advantages mainly manifest in two aspects: a) Introducing the Gray Wolf Algorithm enhances solution diversity and improves the algorithm’s runtime efficiency due to its programming framework. b) The position update strategy based on Lagrange interpolation effectively increases the algorithm’s convergence speed. The clever integration of these strategies allows the algorithm to adaptively adjust the balance between exploration and exploitation at different search stages.

Despite the overall better performance of the feature selection method based on GJO-GWO presented in this paper, some aspects could still be improved. For instance, it often selects a relatively large number of features during feature selection for classification, which might not be ideal for subsequent machine learning or deep learning tasks. Therefore, we plan to conduct further research to address these issues in the future, as outlined below:

  1. In the future, we intend to design a corresponding optimization and development framework around the GJO-GWO algorithm. This framework will be suitable for handling single or multi-objective optimization problems such as real-time feature selection, autonomous intelligent scheduling, image threshold segmentation, power system dispatch optimization, and neural network architecture search.
  2. In the future, we aim to build a data mining and analytics system based on GJO-GWO. We will explore using the GJO-GWO algorithm or a combination of various metaheuristic algorithms as the underlying algorithms to create an integrated data mining and analytics system that encompasses feature engineering, parameter optimization, machine learning, deep learning, and decision optimization. This system will facilitate rapid analysis of real-world engineering application problems.

References

  1. Manbari Z, AkhlaghianTab F, Salavati C. Hybrid fast unsupervised feature selection for high-dimensional data. Expert Systems with Applications. 2019;124:97–118.
  2. Guo Z. Research on Feature Selection Method Based on Improved Whale Optimization Algorithm. Master's thesis, Liaoning Technical University; 2022. https://doi.org/10.27210/d.cnki.glnju.2022.000421
  3. Guha R, Ghosh KK, Bera SK, Sarkar R, Mirjalili S. Discrete equilibrium optimizer combined with simulated annealing for feature selection. Journal of Computational Science. 2023;67:101942.
  4. Adeen Z, Ahmad M, Neggaz N, Alkhayyat A. MHGSO: A Modified Hunger Game Search Optimizer Using Opposition-Based Learning for Feature Selection. Proceedings of Trends in Electronics and Health Informatics: TEHI 2021: Springer; 2022. p. 41–52.
  5. AbdElminaam DS, Neggaz N, Gomaa IAE, Ismail FH, Elsawy A. Aom-mpa: Arabic opinion mining using marine predators algorithm based feature selection. 2021 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC): IEEE; 2021. p. 395–402.
  6. Li J, Cheng K, Wang S, Morstatter F, Trevino RP, Tang J, et al. Feature Selection: A Data Perspective. ACM Computing Surveys. 2018;50(6):1–45.
  7. Neggaz N, Houssein EH, Hussain K. An efficient henry gas solubility optimization for feature selection. Expert Systems with Applications. 2020;152:113364.
  8. Boussaïd I, Lepagnot J, Siarry P. A survey on optimization metaheuristics. Information Sciences. 2013;237:82–117.
  9. Liu H, Motoda H. Feature Selection for Knowledge Discovery and Data Mining. Springer Science & Business Media; 2012.
  10. Dash M, Liu H. Feature selection for classification. Intelligent Data Analysis. 1997;1(1):131–56.
  11. Houssein EH, Hosney ME, Mohamed WM, Ali AA, Younis EMG. Fuzzy-based hunger games search algorithm for global optimization and feature selection using medical data. Neural Computing and Applications. 2023;35(7):5251–75. pmid:36340595
  12. Gangeh MJ, Zarkoob H, Ghodsi A. Fast and scalable feature selection for gene expression data using hilbert-schmidt independence criterion. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2017;14(1):167–81. pmid:28182548
  13. Maldonado S, López J. Dealing with high-dimensional class-imbalanced datasets: Embedded feature selection for SVM classification. Applied Soft Computing. 2018;67:94–105.
  14. Dokeroglu T, Sevinc E, Kucukyilmaz T, Cosar A. A survey on new generation metaheuristic algorithms. Computers & Industrial Engineering. 2019;137:106040.
  15. Houssein EH, Mahdy MA, Shebl D, Mohamed WM. A survey of metaheuristic algorithms for solving optimization problems. Metaheuristics in Machine Learning: Theory and Applications: Springer; 2021. p. 515–43.
  16. Khanduja N, Bhushan B. Recent advances and application of metaheuristic algorithms: A survey (2014–2020). Metaheuristic and Evolutionary Computation: Algorithms and Applications. 2021:207–28.
  17. El-Kenawy E-SM, Mirjalili S, Alassery F, Zhang Y-D, Eid MM, El-Mashad SY, et al. Novel meta-heuristic algorithm for feature selection, unconstrained functions and engineering problems. IEEE Access. 2022;10:40536–55.
  18. Zhang Y-J, Wang Y-F, Tao L-W, Yan Y-X, Zhao J, Gao Z-M. Self-adaptive classification learning hybrid JAYA and Rao-1 algorithm for large-scale numerical and engineering problems. Engineering Applications of Artificial Intelligence. 2022;114:105069.
  19. Zhang Y-J, Yan Y-X, Zhao J, Gao Z-M. AOAAO: The hybrid algorithm of arithmetic optimization algorithm with aquila optimizer. IEEE Access. 2022;10:10907–33.
  20. Agrawal P, Abutarboush HF, Ganesh T, Mohamed AW. Metaheuristic Algorithms on Feature Selection: A Survey of One Decade of Research (2009–2019). IEEE Access. 2021;9:26766–91.
  21. Ahmad SR, Bakar AA, Yaakub MR. Metaheuristic algorithms for feature selection in sentiment analysis. 2015 Science and Information Conference (SAI); 2015. p. 222–6.
  22. Sharma M, Kaur P. A Comprehensive Analysis of Nature-Inspired Meta-Heuristic Techniques for Feature Selection Problem. Archives of Computational Methods in Engineering. 2021;28(3):1103–27.
  23. Yusta SC. Different metaheuristic strategies to solve the feature selection problem. Pattern Recognition Letters. 2009;30(5):525–34.
  24. El-Kenawy E-SM, Albalawi F, Ward SA, Ghoneim SSM, Eid MM, Abdelhamid AA, et al. Feature selection and classification of transformer faults based on novel meta-heuristic algorithm. Mathematics. 2022;10(17):3144.
  25. Neggaz I, Neggaz N, Fizazi H. Boosting Archimedes optimization algorithm using trigonometric operators based on feature selection for facial analysis. Neural Computing and Applications. 2023;35(5):3903–23. pmid:36267472
  26. Dokeroglu T, Deniz A, Kiziloz HE. A comprehensive survey on recent metaheuristics for feature selection. Neurocomputing. 2022;494:269–96.
  27. Huang C-L, Wang C-J. A GA-based feature selection and parameters optimization for support vector machines. Expert Systems with Applications. 2006;31(2):231–40.
  28. 28. Tao Z, Huiling L, Wenwen W, Xia Y. GA-SVM based feature selection and parameter optimization in hospitalization expense modeling. Applied soft computing. 2019;75:323–32.
  29. 29. Maleki N, Zeinali Y, Niaki STA. A k-NN method for lung cancer prognosis with the use of a genetic algorithm for feature selection. Expert Systems with Applications. 2021;164:113981.
  30. 30. Amoozegar M, Minaei-Bidgoli B. Optimizing multi-objective PSO based feature selection method using a feature elitism mechanism. Expert Systems with Applications. 2018;113:499–514.
  31. 31. Liu Y, Wang G, Chen H, Dong H, Zhu X, Wang S. An improved particle swarm optimization for feature selection. Journal of Bionic Engineering. 2011;8(2):191–200.
  32. 32. Li A-D, Xue B, Zhang M. Improved binary particle swarm optimization for feature selection with new initialization and search space reduction strategies. Applied Soft Computing. 2021;106:107302.
  33. 33. Karimi F, Dowlatshahi MB, Hashemi A. SemiACO: A semi-supervised feature selection based on ant colony optimization. Expert Systems with Applications. 2023;214:119130.
  34. 34. Ma W, Zhou X, Zhu H, Li L, Jiao L. A two-stage hybrid ant colony optimization for high-dimensional feature selection. Pattern Recognition. 2021;116:107933.
  35. 35. Hashemi A, Joodaki M, Joodaki NZ, Dowlatshahi MB. Ant colony optimization equipped with an ensemble of heuristics through multi-criteria decision making: A case study in ensemble feature selection. Applied Soft Computing. 2022;124:109046.
  36. 36. Sun L, Chen Y, Ding W, Xu J, Ma Y. AMFSA: Adaptive fuzzy neighborhood-based multilabel feature selection with ant colony optimization. Applied Soft Computing. 2023;138:110211.
  37. 37. Hanbay K. A new standard error based artificial bee colony algorithm and its applications in feature selection. Journal of King Saud University-Computer and Information Sciences. 2022;34(7):4554–67.
  38. 38. Hancer E, Xue B, Zhang M, Karaboga D, Akay B. Pareto front feature selection based on artificial bee colony optimization. Information Sciences. 2018;422:462–79.
  39. 39. Hancer E, Xue B, Karaboga D, Zhang M. A binary ABC algorithm based on advanced similarity scheme for feature selection. Applied Soft Computing. 2015;36:334–48. %U https://www.sciencedirect.com/science/article/pii/S1568494615004639.
  40. 40. Al-Tashi Q, Abdul Kadir SJ, Rais HM, Mirjalili S, Alhussian H. Binary Optimization Using Hybrid Grey Wolf Optimization for Feature Selection. IEEE Access. 2019;7:39496–508.
  41. 41. Al-Tashi Q, Abdulkadir SJ, Rais HM, Mirjalili S, Alhussian H, Ragab MG, et al. Binary Multi-Objective Grey Wolf Optimizer for Feature Selection in Classification. IEEE Access. 2020;8:106247–63.
  42. 42. Emary E, Zawbaa HM, Hassanien AE. Binary grey wolf optimization approaches for feature selection. Neurocomputing. 2016;172:371–81%U https://www.sciencedirect.com/science/article/pii/S0925231215010504.
  43. 43. Faris H, Aljarah I, Al-Betar MA, Mirjalili S. Grey wolf optimizer: a review of recent variants and applications. Neural Computing and Applications. 2018;30(2):413–35%U https://doi.org/10.1007/s00521-017-3272-5.
  44. 44. Mirjalili S, Mirjalili SM, Lewis A. Grey Wolf Optimizer. Advances in Engineering Software. 2014;69:46–61%U https://www.sciencedirect.com/science/article/pii/S0965997813001853.
  45. 45. Wang J, Lin D, Zhang Y, Huang S. An adaptively balanced grey wolf optimization algorithm for feature selection on high-dimensional classification. Engineering Applications of Artificial Intelligence. 2022;114:105088.
  46. 46. El-Kenawy E-SM, Eid MM, Saber M, Ibrahim A. MbGWO-SFS: Modified binary grey wolf optimizer based on stochastic fractal search for feature selection. IEEE Access. 2020;8:107635–49.
  47. 47. Agrawal RK, Kaur B, Sharma S. Quantum based Whale Optimization Algorithm for wrapper feature selection. Applied Soft Computing. 2020;89:106092%U https://www.sciencedirect.com/science/article/pii/S1568494620300326.
  48. 48. Gharehchopogh FS, Gholizadeh H. A comprehensive survey: Whale Optimization Algorithm and its applications. Swarm and Evolutionary Computation. 2019;48:1–24%U https://www.sciencedirect.com/science/article/pii/S2210650218309350.
  49. 49. Liu W, Guo Z, Jiang F, Liu G, Wang D, Ni Z. Improved WOA and its application in feature selection. PLOS ONE. 2022;17(5):e0267041%U https://journals.plos.org/plosone/article?id=10.1371/journal.pone. pmid:35588402
  50. 50. Mafarja M, Mirjalili S. Whale optimization approaches for wrapper feature selection. Applied Soft Computing. 2018;62:441–53%U https://www.sciencedirect.com/science/article/pii/S1568494617306695.
  51. 51. Mafarja MM, Mirjalili S. Hybrid Whale Optimization Algorithm with simulated annealing for feature selection. Neurocomputing. 2017;260:302–12%U https://www.sciencedirect.com/science/article/pii/S092523121730807X.
  52. 52. Mirjalili S, Lewis A. The Whale Optimization Algorithm. Advances in Engineering Software. 2016;95:51–67%U https://www.sciencedirect.com/science/article/pii/S0965997816300163.
  53. 53. Eid MM, El-kenawy E-SM, Ibrahim A. A binary sine cosine-modified whale optimization algorithm for feature selection. 2021 National Computing Colleges Conference (NCCC): IEEE; 2021. p. 1–6%@ 1-72816-719-1.
  54. 54. Aljarah I, Faris H, Heidari AA, Mafarja MM, Al-Zoubi AM, Castillo PA, et al. A Robust Multi-Objective Feature Selection Model Based on Local Neighborhood Multi-Verse Optimization. IEEE Access. 2021;9:100009–28.
  55. 55. Ewees AA, El Aziz MA, Hassanien AE. Chaotic multi-verse optimizer-based feature selection. Neural Computing and Applications. 2019;31(4):991–1006%U https://doi.org/10.7/s00521-017-3131-4.
  56. 56. Mirjalili S, Mirjalili SM, Hatamlou A. Multi-Verse Optimizer: a nature-inspired algorithm for global optimization. Neural Computing and Applications. 2016;27(2):495–513%U https://doi.org/10.1007/s00521-015-1870-7.
  57. 57. Neggaz N, Ewees AA, Abd Elaziz M, Mafarja M. Boosting salp swarm algorithm by sine cosine algorithm and disrupt operator for feature selection. Expert Systems with Applications. 2020;145:113103.
  58. 58. Hegazy AE, Makhlouf MA, El-Tawel GS. Improved salp swarm algorithm for feature selection. Journal of King Saud University—Computer and Information Sciences. 2020;32(3):335–44%U https://www.sciencedirect.com/science/article/pii/S1319157818303288.
  59. 59. Ibrahim RA, Ewees AA, Oliva D, Abd Elaziz M, Lu S. Improved salp swarm algorithm based on particle swarm optimization for feature selection. Journal of Ambient Intelligence and Humanized Computing. 2019;10(8):3155–69%U https://doi.org/10.1007/s12652-018-1031-9.
  60. 60. Mirjalili S, Gandomi AH, Mirjalili SZ, Saremi S, Faris H, Mirjalili SM. Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems. Advances in Engineering Software. 2017;114:163–91%U https://www.sciencedirect.com/science/article/pii/S0965997816307736.
  61. 61. Tubishat M, Ja’afar S, Alswaitti M, Mirjalili S, Idris N, Ismail MA, et al. Dynamic Salp swarm algorithm for feature selection. Expert Systems with Applications. 2021;164:113873%U https://www.sciencedirect.com/science/article/pii/S0957417420306801.
  62. 62. Thawkar S. A hybrid model using teaching–learning-based optimization and Salp swarm algorithm for feature selection and classification in digital mammography. Journal of Ambient Intelligence and Humanized Computing. 2021;12:8793–808.
  63. 63. Too J, Rahim Abdullah A. Binary atom search optimisation approaches for feature selection. Connection Science. 2020;32(4):406–30%U https://doi.org/10.1080/09540091.2020.1741515.
  64. 64. Zhao W, Wang L, Zhang Z. Atom search optimization and its application to solve a hydrogeologic parameter estimation problem. Knowledge-Based Systems. 2019;163:283–304.
  65. 65. Long W, Jiao J, Xu M, Tang M, Wu T, Cai S. Lens-imaging learning Harris hawks optimizer for global optimization and its application to feature selection. Expert Systems with Applications. 2022;202:117255.
  66. 66. Turabieh H, Azwari SA, Rokaya M, Alosaimi W, Alharbi A, Alhakami W, et al. Enhanced Harris Hawks optimization as a feature selection for the prediction of student performance. Computing. 2021;103:1417–38.
  67. 67. Zhang Y, Liu R, Wang X, Chen H, Li C. Boosted binary Harris hawks optimizer and feature selection. Engineering with Computers. 2021;37(4):3741–70%U https://doi.org/10.1007/s00366-020-1028-5.
  68. 68. Hussain K, Neggaz N, Zhu W, Houssein EH. An efficient hybrid sine-cosine Harris hawks optimization for low and high-dimensional feature selection. Expert Systems with Applications. 2021;176:114778.
  69. 69. Aljarah I, Al-Zoubi AM, Faris H, Hassonah MA, Mirjalili S, Saadeh H. Simultaneous feature selection and support vector machine optimization using the grasshopper optimization algorithm. Cognitive Computation. 2018;10:478–95.
  70. 70. Dey C, Bose R, Ghosh KK, Malakar S, Sarkar R. LAGOA: Learning automata based grasshopper optimization algorithm for feature selection in disease datasets. Journal of Ambient Intelligence and Humanized Computing. 2022:1–20.
  71. 71. Mafarja M, Aljarah I, Heidari AA, Hammouri AI, Faris H, Ala’M A-Z, et al. Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems. Knowledge-Based Systems. 2018;145:25–45.
  72. 72. Jia H, Li Y, Sun K, Cao N, Zhou HM. Hybrid Sooty Tern Optimization and Differential Evolution for Feature Selection. Computer Systems Science & Engineering. 2021;39(3).
  73. 73. Houssein EH, Oliva D, Celik E, Emam MM, Ghoniem RM. Boosted sooty tern optimization algorithm for global optimization and feature selection. Expert Systems with Applications. 2023;213:119015.
  74. 74. Song X, Zhang Y, Gong D, Liu H, Zhang W. Surrogate sample-assisted particle swarm optimization for feature selection on high-dimensional data. IEEE Transactions on Evolutionary Computation. 2022.
  75. 75. Chattopadhyay S, Dey A, Singh PK, Ahmadian A, Sarkar R. A feature selection model for speech emotion recognition using clustering-based population generation with hybrid of equilibrium optimizer and atom search optimization algorithm. Multimedia Tools and Applications. 2023;82(7):9693–726.
  76. 76. Kumar S, John B. A novel gaussian based particle swarm optimization gravitational search algorithm for feature selection and classification. Neural Computing and Applications. 2021;33(19):12301–15.
  77. 77. Kumar KA, Boda R, Kumar P, Amgothu V. Feature Selection using Multi-Verse Optimization for Brain Tumour Classification. Annals of the Romanian Society for Cell Biology. 2021;25(6):3970–82.
  78. 78. Strumberger I, Rakic A, Stanojlovic S, Arandjelovic J, Bezdan T, Zivkovic M, et al. Feature selection by hybrid binary ant lion optimizer with covid-19 dataset. 2021 29th Telecommunications Forum (TELFOR): IEEE; 2021. p. 1–4%@ 1-66542-585-7.
  79. 79. Long W, Jiao J, Liang X, Wu T, Xu M, Cai S. Pinhole-imaging-based learning butterfly optimization algorithm for global optimization and feature selection. Applied Soft Computing. 2021;103:107146.
  80. 80. Kale GA, Yüzgeç U. Advanced strategies on update mechanism of Sine Cosine Optimization Algorithm for feature selection in classification problems. Engineering Applications of Artificial Intelligence. 2022;107:104506.
  81. 81. Ewees AA, Ismail FH, Sahlol AT. Gradient-based optimizer improved by Slime Mould Algorithm for global optimization and feature selection for diverse computation problems. Expert Systems with Applications. 2023;213:118872.
  82. 82. Wolpert DH, Macready WG. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation. 1997;1(1):67–82.
  83. 83. Chopra N, Mohsin Ansari M. Golden jackal optimization: A novel nature-inspired optimizer for engineering applications. Expert Systems with Applications. 2022;198:116924%U https://www.sciencedirect.com/science/article/pii/S095741742200358X.
  84. 84. Liu G, Guo Z, Liu W, Cao B, Chai S, Wang C. MSHHOTSA: A variant of tunicate swarm algorithm combining multi-strategy mechanism and hybrid Harris optimization. PLOS ONE. 2023;18(8):e0290117%U https://journals.plos.org/plosone/article?id=10.1371/journal.pone. pmid:37566618
  85. 85. El-Kenawy E-SM, Khodadadi N, Mirjalili S, Makarovskikh T, Abotaleb M, Karim FK, et al. Metaheuristic optimization for improving weed detection in wheat images captured by drones. Mathematics. 2022;10(23):4421.
  86. 86. Abualigah L, Yousri D, Abd Elaziz M, Ewees AA, Al-qaness MAA, Gandomi AH. Aquila Optimizer: A novel meta-heuristic optimization algorithm. Computers & Industrial Engineering. 2021;157:107250%U https://www.sciencedirect.com/science/article/pii/S0360835221001546.
  87. 87. Abualigah L, Diabat A, Mirjalili S, Abd Elaziz M, Gandomi AH. The Arithmetic Optimization Algorithm. Computer Methods in Applied Mechanics and Engineering. 2021;376:113609%U https://www.sciencedirect.com/science/article/pii/S0045782520307945.
  88. 88. Arora S, Singh S. Butterfly optimization algorithm: a novel approach for global optimization. Soft Computing. 2019;23(3):715–34%U https://doi.org/10.1007/s00500-018-3102-4.
  89. 89. Abualigah L, Shehab M, Alshinwan M, Alabool H. Salp swarm algorithm: a comprehensive survey. Neural Computing and Applications. 2020;32(15):11195–215%U https://doi.org/10.1007/s00521-019-4629-4.
  90. 90. Wilcoxon F. Individual Comparisons by Ranking Methods. In: Kotz S, Johnson NL, editors. Breakthroughs in Statistics: Methodology and Distribution. Springer Series in Statistics. New York, NY: Springer; 1992. p. 196–202%@ 978-1-4612-380-9%U https://doi.org/10.1007/978-1-4612-380-9_16.
  91. 91. Sheldon MR, Fillyaw MJ, Thompson WD. The use and interpretation of the Friedman test in the analysis of ordinal-scale data in repeated measures designs. Physiotherapy Research International. 1996;1(4):221–8%U https://onlinelibrary.wiley.com/doi/abs/10.1002/pri.66. pmid:9238739
  92. 92. K. Bache ML. UCI Machine Learning Repository %U http://archive.ics.uci.edu/ml. http://archiveicsuciedu/ml2013.
  93. 93. Rodrigues D, Pereira LAM, Nakamura RYM, Costa KAP, Yang X-S, Souza AN, et al. A wrapper approach for feature selection based on Bat Algorithm and Optimum-Path Forest. Expert Systems with Applications. 2014;41(5):2250–8%U https://www.sciencedirect.com/science/article/pii/S0957417413007574.
  94. 94. Hancer E. Differential evolution for feature selection: a fuzzy wrapper–filter approach. Soft Computing. 2019;23(13):5233–48%U https://doi.org/10.1007/s00500-018-3545-7.
  95. 95. Nagpal S, Arora S, Dey S, Shreya. Feature Selection using Gravitational Search Algorithm for Biomedical Data. Procedia Computer Science. 2017;115:258–65%U https://www.sciencedirect.com/science/article/pii/S1877050917319610.
  96. 96. Too J, Abdullah AR, Mohd Saad N. A new co-evolution binary particle swarm optimization with multiple inertia weight strategy for feature selection. Informatics: MDPI; 2019. p. 21%@ 2227–9709.
  97. 97. Varzaneh ZA, Hossein S, Mood SE, Javidi MM. A new hybrid feature selection based on Improved Equilibrium Optimization. Chemometrics and Intelligent Laboratory Systems. 2022;228:104618.
  98. 98. Peng L, Cai Z, Heidari AA, Zhang L, Chen H. Hierarchical Harris hawks optimizer for feature selection. Journal of Advanced Research. 2023. pmid:36690206
  99. 99. Beheshti Z. BMPA-TVSinV: a binary marine predators algorithm using time-varying sine and V-shaped transfer functions for wrapper-based feature selection. Knowledge-Based Systems. 2022;252:109446.
  100. 100. Alhussan AA, Khafaga DS, El-Kenawy E-SM, Ibrahim A, Eid MM, Abdelhamid AA. Pothole and plain road classification using adaptive mutation dipper throated optimization and transfer learning for self driving cars. IEEE Access. 2022;10:84188–211.