TS-SSA: An improved two-stage sparrow search algorithm for large-scale many-objective optimization problems

  • Xiaozhi Du ,

    Roles Conceptualization, Investigation, Methodology, Writing – original draft

    xzdu@xjtu.edu.cn

    Affiliation School of Software Engineering, Xi’an Jiaotong University, Shaanxi, China

  • Kai Chen ,

    Contributed equally to this work with: Kai Chen, Hongyuan Du

    Roles Investigation, Methodology, Resources, Writing – original draft

    Affiliation School of Software Engineering, Xi’an Jiaotong University, Shaanxi, China

  • Hongyuan Du ,

    Contributed equally to this work with: Kai Chen, Hongyuan Du

    Roles Investigation, Software, Validation, Writing – review & editing

    Affiliation School of Software Engineering, Xi’an Jiaotong University, Shaanxi, China

  • Zongbin Qiao

    Roles Software, Validation, Writing – original draft, Writing – review & editing

    Affiliation School of Software Engineering, Xi’an Jiaotong University, Shaanxi, China

Abstract

Large-scale many-objective optimization problems (LSMaOPs) are a current research hotspot. However, because LSMaOPs involve a large number of variables and objectives, state-of-the-art methods face a huge search space that is difficult to explore comprehensively. This paper proposes an improved sparrow search algorithm (SSA) for solving LSMaOPs that manages convergence and diversity separately, called the two-stage sparrow search algorithm (TS-SSA). In the first stage of TS-SSA, a many-objective sparrow search algorithm (MaOSSA) mainly manages convergence through an adaptive population dividing strategy and a random bootstrap search strategy. In the second stage, a dynamic multi-population search strategy mainly manages the diversity of the population through a dynamic population dividing strategy and a multi-population search strategy. TS-SSA has been experimentally compared with 10 state-of-the-art MOEAs on the DTLZ and LSMOP benchmark test problems with 3 to 20 objectives and 300 to 2000 decision variables. The results show that TS-SSA has significant performance and efficiency advantages in solving LSMaOPs. In addition, we apply TS-SSA to a real case (automatic test scenario generation), and the result shows that TS-SSA outperforms other algorithms on diversity.

Introduction

With the rapid development of technology in various industries, a large number of highly complex optimization problems have emerged. These problems may involve high-dimensional decision variables, complex nonlinear constraints, computationally expensive objective functions, and the need to optimize multiple conflicting objectives simultaneously [1,2]. The most common of them are multi-objective optimization problems (MOPs). Some studies have achieved good results by converting MOPs into single-objective problems through the weighting method. However, the weighting method requires the weight of each objective to be predetermined, which makes it difficult to set optimal weights when there are complex coupling relationships among the objectives [3]. To better solve MOPs, researchers have proposed a large number of multi-objective optimization algorithms, among which multi-objective evolutionary algorithms (MOEAs) perform particularly well [4–9]. Traditional MOEAs perform well on two-objective and three-objective problems, but their performance decreases as the number of objectives increases [10]. Real-world optimization problems often involve more than three objectives; such problems are called many-objective optimization problems (MaOPs), and many-objective optimization algorithms are an important area of current research [11]. In addition, some MaOPs involve high-dimensional decision variables; these are called large-scale many-objective optimization problems (LSMaOPs). LSMaOPs have a more complex search space, which poses greater challenges to optimization algorithms. In the past few years, many excellent many-objective optimization evolutionary algorithms (MaOEAs) have been proposed [12–16] and have achieved good results in solving MaOPs. However, their performance on LSMaOPs is poor because they mainly consider high-dimensional objectives and ignore high-dimensional decision variables. Other studies have proposed large-scale optimization algorithms [17–19], but they mainly solve large-scale multi-objective problems (LSMOPs). Only a few studies have considered both many objectives and high-dimensional decision variables [20,21].

The main challenge of LSMaOPs is to maintain the convergence and diversity of the population in a huge search space. However, most MaOEAs use a single search strategy, which leads to poor performance in huge search spaces. Genetic algorithms (GA) and differential evolution (DE) are the most commonly used search operators in MaOEAs, yet a large number of studies have shown that biological population intelligence algorithms [22–24] perform significantly better than GA and DE on single-objective optimization problems. Few studies have extended such algorithms to MaOPs. To solve the above problems, this paper proposes a large-scale many-objective evolutionary algorithm (LSMaOEA) based on the sparrow search algorithm, termed the two-stage sparrow search algorithm (TS-SSA). The main contributions of this paper are as follows:

  1. This paper proposes the TS-SSA for solving LSMaOPs. TS-SSA is divided into two stages to manage the convergence and diversity of the population separately. The first stage mainly manages convergence and the second stage mainly manages diversity of the population. The two stages will adaptively alternate according to the population characteristics, thus balancing the convergence and diversity of the algorithm.
  2. Based on SSA, this paper proposes an adaptive population dividing strategy and a random bootstrap search strategy, which effectively improve the convergence of the algorithm and enable the SSA to be applied to LSMaOPs for the first time.
  3. Based on the internal mechanism of the SSA, this paper proposes a dynamic multi-population search strategy that mainly manages the diversity of the population through a dynamic population dividing strategy and a multi-population search strategy. It enables full exploration of large-scale decision spaces, which avoids becoming trapped in local optima and increases population diversity.

The rest of this paper is organized as follows. Sect 2 reviews related work and discusses the unresolved issues. In Sect 3, the proposed TS-SSA method is described in detail. In Sect 4, some experiments are conducted to evaluate the feasibility and effectiveness of our method. Sect 5 concludes this paper and discusses some future work.

Related work

Many-objective optimization evolutionary algorithms

Researchers have proposed many MaOEAs in the past few years. These algorithms can be classified into four categories: selection pressure-based, decomposition-based, metrics-based, and dimensionality reduction-based.

In MaOPs, most solutions are mutually non-dominated, a phenomenon also known as dominance resistance. Consequently, the selection pressure based on Pareto dominance is not sufficient to discriminate between solutions, which makes it difficult for traditional MOEAs to converge [25]. He et al. [13] constructed a fuzzy Pareto dominance relation based on fuzzy logic to increase the selection pressure. Yuan et al. [14] introduced an α = 0.10-dominance relation in the reference point selection strategy, which divides the solution set into multiple groups based on the reference points and maintains a competitive relationship within each group. This method increases the selection pressure by reducing the number of solutions involved in each selection. Zhang et al. [15] introduced inflection point driving to improve the convergence of the algorithm. Other studies increase the selection pressure by improving the environmental selection strategy [12] or by proposing a new dominance relation [26]. Most of these methods improve convergence by selecting offspring with better convergence. In LSMaOPs, however, the many objectives and high-dimensional decision variables make complex dominance relations computationally expensive, which makes it difficult for these algorithms to maintain good performance.

Decomposition-based methods follow the divide-and-conquer idea: a complex MaOP is decomposed into smaller MOPs or single-objective problems to be solved. MOEA/D, proposed by Zhang and Li [27], is a classical decomposition-based method that introduces weight vectors to decompose an MOP into smaller subproblems, which improves convergence and diversity and allows MOEA/D to maintain good performance at a low computational complexity. Since then, many studies have improved MOEA/D [28–30]. Among them, MOEA/D-M2M, proposed by Liu et al. [31], uses direction vectors instead of the weight vectors of MOEA/D. MOEA/D-M2M divides the objective space into multiple subspaces by direction vectors, which realizes the parallel evolution of multiple populations and effectively improves population diversity. It also avoids the difficulty of choosing optimal weight vector values. In addition, SPEA/R [16] decomposes the original objective space based on reference vectors, and Asafuddoula et al. [5] introduced reference points in MOEA/D to improve its performance in solving MaOPs.

Metrics-based methods use performance metrics as selection criteria in the environmental selection strategy. Rostami and Neri [32] proposed a fast search algorithm based on hypervolume (HV) and used Monte Carlo simulation to accelerate the computation of HV. Li et al. [33] proposed a two-stage evolutionary algorithm that uses the R2 metric for the initial selection of solutions, while Sun et al. [34] used the inverted generational distance (IGD) for the optimal selection of offspring. IGD has a lower computational complexity than HV, which results in better performance of the method. Although metrics-based methods show good performance, they still have drawbacks. First, the computational complexity of most metrics is high; even HV estimated by Monte Carlo simulation and IGD remain computationally costly. Second, metrics-based methods rely heavily on the true Pareto front and the shape of the front of the problem to be solved, which leads to poor results in real-world applications [35,36]. Especially in LSMaOPs, computing performance metrics in each iteration incurs high computational costs.

Dimensionality reduction-based methods analyze the features of the objectives and merge objectives with the same features to reduce the dimensionality of the objective space. Huang et al. [37] improved the ability of MOEA/D to solve constrained problems by introducing principal component analysis into MOEA/D. S. Liu et al. [38] introduced an adaptive clustering method to cluster the population, which allowed the population to better fit the Pareto front. R. Liu et al. [39] improved performance on LSMOPs by clustering the decision variables with dimensionality reduction, which divides the decision variables into convergence-related and diversity-related variables and treats the two groups separately.

The above methods employ different strategies from different perspectives to improve the convergence and diversity of the population, but two issues remain. First, selection pressure-based methods and metrics-based methods mainly improve the environmental selection strategy to increase the selection pressure. In LSMaOPs, however, many objectives and high-dimensional decision variables make it difficult to implement a complex environmental selection strategy effectively. Therefore, in this paper, we improve the convergence and diversity of the population from the perspective of solution generation, which reduces the computational burden of the environmental selection strategy. Second, most methods manage the convergence and diversity of the population together and use a single search strategy to guide population evolution, so they have difficulty maintaining convergence and diversity in a huge search space and tend to fall into local optima. Therefore, we propose a dynamic multi-population strategy that runs multiple search strategies in parallel to improve the search efficiency in a huge search space without increasing the population size, and thus better maintains the diversity of the population.

Large-scale optimization algorithms

Large-scale optimization algorithms can be divided into two categories based on their core ideas: decomposition-based and large-scale exploration-based.

Decomposition-based methods follow the divide-and-conquer strategy: the original problem is decomposed into smaller subproblems, and each subproblem is then optimized. Liu et al. [40] proposed two alternating grouping strategies to divide high-dimensional decision variables: a convergence-related grouping strategy and a diversity-related grouping strategy, which alternate continuously during population evolution and thus balance the convergence and diversity of the population. A Bayesian-based parameter adjustment strategy is used to balance the accuracy and computational cost of the grouping. However, all subproblems in the grouping strategy share the same decision space, so the subproblems may converge on the same subspace instead of searching all subspaces, which reduces the diversity of the solutions. Therefore, Yin and Cao [41] decomposed the decision space into many independent subspaces, each of which contains at least one Pareto optimal solution; individuals can only perform crossover and mutation with other individuals in the same subspace, which ensures the diversity of the entire population.

The main idea of the large-scale exploration-based methods is to fully explore the high-dimensional decision space through multiple search strategies, thus optimizing all decision variables simultaneously. Qi et al. [42] proposed a two-stage multi-strategy search method. In the first stage, the population is divided into levels by the objective function values of the individuals, and different levels are updated by different strategies. Individuals with better objective function values focus on local exploitation, and individuals with poor objective function values focus on global exploration, which keeps the population from falling into local optima. In the later stages of the search, the algorithm enters the second stage, the detailed search stage, where individuals at the same level learn from each other and search only locally to ensure population convergence. Wang et al. [43] proposed a superiority combination learning strategy based on the master-slave multi-subpopulation distributed model. The population is divided into subpopulations according to the objective function value, and the dominant subpopulation generates a learning particle from which the inferior subpopulation learns and evolves, which ensures sufficient information exchange between subpopulations and improves the diversity of the population. Gu et al. [44] proposed a chaotic differential strategy and a symmetric direction sampling strategy, which alternate during the evolution process to ensure the convergence and diversity of the population. The symmetric direction sampling strategy generates symmetric vectors of the direction vectors, which increases the diversity of the search directions and allows the high-dimensional decision space to be explored fully.

Decomposition-based methods require the decision variables to be grouped before the optimization starts, so the effectiveness of the grouping strategy directly affects the optimization results. Moreover, as the number of decision variables increases, the correlations between them become more complex and grouping requires more computational resources, which leaves fewer resources for the evolution itself. In contrast, the large-scale exploration-based methods explore the large-scale decision space through a multi-directional search strategy and optimize all decision variables simultaneously, and their computational cost does not increase significantly with the number of decision variables. Therefore, when the number of decision variables is particularly large and the correlations are very complex, the large-scale exploration-based methods are more effective than the decomposition-based methods.

Population-intelligent optimization algorithms

Many population-intelligent optimization algorithms (PIOAs) have been proposed, and many studies have shown that these algorithms outperform GA and DE in solving single-objective optimization problems [22–24]. Some studies have also extended PIOAs to MOPs, where they outperform some traditional MOEAs [45,46]. However, there are two difficulties in extending PIOAs to many objectives. First, the search mechanism of PIOAs selects the individual with the optimal objective value to guide the movement of the other individuals, so that the population rapidly converges to the current optimal position. The population then searches for the global optimal position starting from the current optimal position, which greatly improves search efficiency. In single-objective problems, the optimal individual is easy to choose. In MOPs, however, choosing the optimal non-dominated individual requires additional computation because of the dominance relation, so MOEAs based on PIOAs perform one more environmental selection than other MOEAs. Most current multi-objective extensions of PIOAs select the optimal non-dominated individual by crowding degree [46], which gives them a high computational complexity in the many-objective space and makes them far less efficient than other MaOEAs. Second, both the convergence and the diversity of the population need to be considered in MOPs: convergence ensures that the population is close to the Pareto front, and diversity ensures that the population represents the Pareto front as completely as possible. However, the search mechanism of PIOAs leads to strong convergence and weak diversity, and the gap between the two is more obvious in the many-objective space, which leads to poor overall performance of PIOA-based MaOEAs on MaOPs. Liang et al. [45] proposed a two-stage strategy to extend the sparrow search algorithm to many objectives and achieved good results. However, that study did not consider large-scale and many-objective problems together, and the algorithm performed poorly in terms of diversity.

Therefore, this paper proposes a novel many-objective extension of PIOAs, which exploits the strong convergence of PIOAs by managing the convergence of the population separately, thereby avoiding their poor diversity. Because this part of the method does not need to consider diversity, there is no need to select the individual with the best diversity as the leading individual, which effectively reduces the computational complexity of PIOAs in the many-objective space.

The sparrow search algorithm (SSA) is a PIOA that simulates the predatory and anti-predatory behaviors of a sparrow population [22]. Compared with other PIOAs, the SSA has two advantages: (1) it converges more strongly; (2) its search strategy is multi-directional, which makes it more suitable for large-scale optimization problems. In the SSA, the population is divided into three subpopulations: discoverers, followers, and vigilantes. The entire population is sorted according to the objective value. A certain percentage of individuals with superior objective values are assigned to the discoverer subpopulation, and the rest are assigned to the follower subpopulation. The vigilante subpopulation is randomly selected from the entire population, so both discoverers and followers can become vigilantes. Each subpopulation has its own search strategy.
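As an illustration, the division described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: a single objective to be minimized, and the function name `divide_population` and the ratios `discoverer_ratio` and `vigilante_ratio` are hypothetical choices for this example, not values taken from the SSA.

```python
import random


def divide_population(objective_values, discoverer_ratio=0.2,
                      vigilante_ratio=0.1, rng=None):
    """Split individual indices into discoverers, followers and vigilantes.

    Assumes one objective to be minimized. The ratios are illustrative
    defaults, not values prescribed by the paper.
    """
    rng = rng or random.Random()
    n = len(objective_values)
    # Sort indices from best (smallest objective value) to worst.
    ranked = sorted(range(n), key=lambda i: objective_values[i])
    n_discoverers = max(1, round(discoverer_ratio * n))
    discoverers = ranked[:n_discoverers]
    followers = ranked[n_discoverers:]
    # Vigilantes are drawn from the whole population, so they may
    # overlap with either of the other two groups.
    n_vigilantes = max(1, round(vigilante_ratio * n))
    vigilantes = rng.sample(range(n), n_vigilantes)
    return discoverers, followers, vigilantes
```

Because the vigilantes are sampled independently of the ranking, an individual can be both a discoverer (or follower) and a vigilante in the same iteration.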

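The population is shown in Eq (1):

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix} \tag{1}$$

where n is the population size and d is the dimension of the decision variables.

The objective function is shown in Eq (2):

$$F_X = \begin{bmatrix} f([x_{1,1}, x_{1,2}, \ldots, x_{1,d}]) \\ f([x_{2,1}, x_{2,2}, \ldots, x_{2,d}]) \\ \vdots \\ f([x_{n,1}, x_{n,2}, \ldots, x_{n,d}]) \end{bmatrix} \tag{2}$$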

The search strategy of the discoverer subpopulation is shown in Eq (3):

$$X_{i}^{t+1} = \begin{cases} X_{i}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R < ST \\ X_{i}^{t} + Q, & R \ge ST \end{cases} \tag{3}$$

where i is the position of the individual in the population after sorting, t is the current iteration number, α ∈ (0, 1) is a random number, $iter_{max}$ is the maximum number of iterations, Q is a random number obeying a normal distribution, R ∈ (0, 1) is the risk value, and ST ∈ (0.5, 1) is the risk threshold. The discoverer subpopulation chooses its search strategy according to the degree of environmental risk. When R < ST, the degree of environmental risk is low and the discoverer subpopulation adopts a local exploitation strategy to fully search the local space; when R ≥ ST, the degree of environmental risk is high and the discoverer subpopulation adopts a global exploration strategy to find other safe areas in a larger space. The discoverer subpopulation is reordered internally after the search is completed.
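The two-branch behavior of the discoverers can be illustrated with a short Python sketch. It assumes the standard form of the SSA discoverer rule from [22]: the position is multiplied by exp(−i / (α · maximum iterations)) when R < ST, and a normally distributed step Q is added otherwise. The function name `update_discoverer` and the default `safety_threshold=0.8` are hypothetical choices for this example.

```python
import math
import random


def update_discoverer(position, rank, max_iterations,
                      safety_threshold=0.8, rng=None):
    """Return the new position of the discoverer ranked `rank` (1-based).

    Assumed rule: shrink the position when the risk value R is below the
    threshold ST, otherwise take a normally distributed jump.
    """
    rng = rng or random.Random()
    risk = rng.random()                       # R in the text
    if risk < safety_threshold:               # low risk: local exploitation
        alpha = rng.random() or 1e-12         # random number; guard against 0
        scale = math.exp(-rank / (alpha * max_iterations))
        return [x * scale for x in position]
    # High risk: global exploration with the same random step Q
    # added to every coordinate.
    step = rng.gauss(0.0, 1.0)                # Q in the text
    return [x + step for x in position]
```

In the exploitation branch the scale factor lies in [0, 1), so a discoverer contracts toward the origin; lower-ranked discoverers (larger i) contract more strongly.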

The search strategy of the follower subpopulation is shown in Eq (4):

$$X_{i}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i}^{t}}{i^{2}}\right), & i > n/2 \\ X_{P}^{t+1} + \left|X_{i}^{t} - X_{P}^{t+1}\right| \cdot A^{+}, & \text{otherwise} \end{cases} \tag{4}$$

where $X_{worst}^{t}$ is the individual with the worst current objective value, $X_{P}^{t+1}$ is the individual with the optimal objective value after reordering within the discoverer subpopulation, $A^{+} = A^{T}(AA^{T})^{-1}$, and A is a 1 × d vector whose elements are randomly set to 1 or −1. When individuals are at the back of the population (i > n/2), they can only move as far away as possible from the individual in the worst position because they are farther away from the discoverers, whereas individuals at the front or middle of the population move toward the optimal position to get more food.
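A minimal Python sketch of the follower behavior is given below. It assumes the standard SSA follower rule from [22]: individuals ranked in the back half take the step Q · exp((worst − x) / i²), and the others move to the best discoverer plus |x − best| · A⁺, where A is a 1 × d vector of ±1 and A⁺ = Aᵀ(AAᵀ)⁻¹. The function and argument names are hypothetical.

```python
import math
import random


def update_follower(position, rank, population_size,
                    best_discoverer, worst, rng=None):
    """Return the new position of the follower ranked `rank` (1-based)."""
    rng = rng or random.Random()
    d = len(position)
    if rank > population_size / 2:
        # Back of the population: position is driven by the distance
        # to the worst individual.
        q = rng.gauss(0.0, 1.0)               # Q in the text
        return [q * math.exp((w - x) / rank ** 2)
                for x, w in zip(position, worst)]
    # Front or middle: move toward the best discoverer.
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]   # A
    # For a 1 x d vector of +-1, A A^T = d, so A^+ = A^T / d and
    # |X_i - X_P| A^+ reduces to one scalar step.
    step = sum(abs(x - b) * s
               for x, b, s in zip(position, best_discoverer, signs)) / d
    return [b + step for b in best_discoverer]
```

Since the pseudo-inverse term collapses to a scalar, the follower lands at the best discoverer's position shifted equally in every dimension, with a shift no larger than the mean coordinate distance between the two individuals.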

The search strategy of the vigilante subpopulation is shown in Eq (5):

$$X_{i}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left|X_{i}^{t} - X_{best}^{t}\right|, & f_{i} > f_{g} \\ X_{i}^{t} + K \cdot \dfrac{\left|X_{i}^{t} - X_{worst}^{t}\right|}{(f_{i} - f_{w}) + \varepsilon}, & f_{i} = f_{g} \end{cases} \tag{5}$$

where β is a random number obeying a normal distribution, $X_{best}^{t}$ and $X_{worst}^{t}$ are the current best and worst positions, $f_{i}$ is the objective function value of the current individual, $f_{g}$ is the objective function value of the optimal individual, $f_{w}$ is the objective function value of the worst individual, K ∈ [−1, 1] is a random number, and ε is a small constant that avoids the denominator being 0. The two search strategies of the vigilante subpopulation differ mainly in whether the optimal individual is a vigilante. When the optimal individual is not a vigilante, the optimal position of the population is safe, and the whole population clusters toward it. When the individual at the optimal position is a vigilante, the current optimal position is no longer safe, and the population conducts a large-scale search to find a new optimal position, thus jumping out of local optima.
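The vigilante behavior can be sketched in Python as follows. The sketch assumes the standard SSA vigilante rule from [22] with minimization: an individual that is worse than the best moves to best + β · |x − best|, while the best individual itself moves by K · |x − worst| / ((f_i − f_w) + ε). The names and the default `eps=1e-8` are hypothetical choices for this example.

```python
import random


def update_vigilante(position, fitness, best, best_fitness,
                     worst, worst_fitness, eps=1e-8, rng=None):
    """Return the new position of a vigilante (minimization assumed)."""
    rng = rng or random.Random()
    if fitness > best_fitness:
        # The best individual is not this vigilante: cluster toward
        # the best position.
        beta = rng.gauss(0.0, 1.0)
        return [b + beta * abs(x - b) for x, b in zip(position, best)]
    # The best individual itself is alarmed: jump relative to the
    # worst position; eps keeps the denominator away from zero.
    k = rng.uniform(-1.0, 1.0)
    denom = (fitness - worst_fitness) + eps
    return [x + k * abs(x - w) / denom for x, w in zip(position, worst)]
```

The second branch is what lets the population leave a local optimum: the step size grows with the distance between the best and worst positions.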

Motivation

Although a large body of research has proposed many excellent many-objective optimization algorithms and large-scale multi-objective optimization algorithms and achieved good results on the corresponding practical problems, the field of large-scale many-objective optimization is still developing. Because large-scale many-objective optimization algorithms need to deal with high-dimensional decision variables and high-dimensional objectives at the same time, the performance requirements are higher and the algorithm design is more complex. Existing LSMaOEAs have two main limitations. On the one hand, most of them still use the GA operator as the evolutionary operator, which leads to poor quality of the generated solutions and requires complex selection strategies to increase the selection pressure. Several studies have shown PIOAs to be superior to GA in single-objective and multi-objective optimization, but few studies have replaced GA with PIOAs in many-objective optimization. On the other hand, the decision variable grouping strategy consumes a large amount of computational resources, which makes it difficult to balance efficiency and accuracy. Especially in LSMaOPs, where high-dimensional objectives must be handled simultaneously, it is difficult to ensure the overall optimization efficiency of the algorithm when high-dimensional decision variables are handled by decomposition-based methods.

Based on the above motivation, this paper proposes a two-stage method based on SSA. The first stage accelerates population convergence through a many-objective SSA. The second stage fully explores the large-scale decision space through a dynamic multi-population search strategy, which improves the diversity of the population and helps it jump out of local optima while maintaining efficiency. The two stages alternate adaptively according to the population characteristics, thus balancing the convergence and diversity of the population.

The proposed TS-SSA method

In this paper, we propose a two-stage method based on the sparrow search algorithm (SSA) and multi-population search strategy to solve large-scale many-objective optimization problems (LSMaOPs), called the two-stage sparrow search algorithm (TS-SSA). An LSMaOP can be defined as follows:

Given an n-dimensional decision vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, the goal is to optimize m objective functions $F(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x}))$, where m ≥ 4 represents the many objectives that need to be optimized simultaneously. A solution $\mathbf{x}_a$ is said to dominate another solution $\mathbf{x}_b$ if $\mathbf{x}_a$ is no worse than $\mathbf{x}_b$ in all objectives and strictly better in at least one objective. The set of all nondominated solutions constitutes the Pareto front, which represents the trade-offs among objectives where no objective can be improved without sacrificing another. In LSMaOPs, maintaining a balance between convergence and diversity in such a high-dimensional decision space is crucial. To address this, our proposed method uses nondominated sorting, which ranks solutions based on Pareto dominance.
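The dominance relation defined above translates directly into code. The following Python sketch assumes that all objectives are minimized; the function names `dominates` and `nondominated` are illustrative, and the quadratic pairwise scan is the simplest possible approach rather than the sorting procedure used by the method.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(objectives):
    """Indices of the objective vectors not dominated by any other vector."""
    return [
        i for i, a in enumerate(objectives)
        if not any(dominates(b, a)
                   for j, b in enumerate(objectives) if j != i)
    ]
```

For example, among the two-objective vectors (1, 4), (2, 2), (3, 3), and (4, 1), only (3, 3) is dominated (by (2, 2)), so the other three form the nondominated set.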

We employ this formulation to drive the design of the TS-SSA algorithm; the general flow of the method is shown in Fig 1. First, the population is chaotically initialized so that the initial population is evenly distributed in the search space, which improves the search efficiency of the method in the early stage. Then, the population is sorted by non-domination. If some individuals have not yet converged to the Pareto front, the individuals on the Pareto front are assigned to the discoverer subpopulation, the rest are assigned to the follower subpopulation, and some individuals are randomly selected from the population as the vigilante subpopulation. Each subpopulation updates its positions by the improved search formula. Once the whole population has converged to the Pareto front, the sizes of the three subpopulations are determined by the relationship between the vigilance value and the vigilance threshold, and each subpopulation searches with its own within-subpopulation search strategy. At the end of each search round, the optimal subpopulation is selected by the reference point environmental selection strategy.

The first stage mainly manages convergence and ensures that the population converges quickly to the Pareto front (PF). To enable SSA to be applied to many-objective optimization problems, this paper proposes an adaptive population dividing strategy and a random bootstrap search strategy. Based on the result of non-dominated sorting, the adaptive population dividing strategy assigns the individuals in the optimal layer to the discoverer subpopulation and the rest to the follower subpopulation. The random bootstrap search strategy lets each follower randomly select a discoverer as its target to follow, thus achieving fast convergence of the population to the Pareto front. Meanwhile, to ensure search efficiency in the early stage, this paper proposes a chaotic population initialization strategy so that the initial solutions are distributed as widely and randomly as possible, thus reducing ineffective search in the early stage.

The second stage mainly manages the diversity of the population to ensure that the population is widely distributed on the PF. To fully explore the large-scale decision space, this paper proposes a dynamic multi-population search strategy. The original population is first divided into three smaller subpopulations by a dynamic population dividing strategy, and each subpopulation adopts a different search strategy so that together they fully explore the large-scale decision space. The discoverer subpopulation adopts the simulated binary crossover search strategy to fully exploit the local space; the vigilante subpopulation adopts the reverse learning search strategy to detect potential global optimal positions and keep the algorithm from falling into local optima; and the follower subpopulation adopts a differential search strategy and learns from the other two subpopulations through a learning factor to perform a comprehensive search.

The two stages will alternate adaptively according to the population characteristics so as to better balance the convergence and diversity of the population.

Many-objective sparrow search algorithm

Population chaotic initialization.

Initializing the population with uniformly distributed random numbers can leave the population insufficiently spread out in the early stage, which reduces early search efficiency. Especially in the huge search space of LSMaOPs, a population that is not widely distributed tends to perform a large number of blind searches in the early stage, which affects convergence. Chaos theory describes the complex behavior of nonlinear deterministic systems with stochastic and ergodic properties [47]. A population generated by a chaotic map can be distributed more uniformly in the space [48], and it has been shown on single-objective optimization problems that initializing the population with chaotic sequences can effectively improve early search efficiency [49]. Therefore, this paper uses the Singer map to generate chaotic sequences for population initialization, which improves the early search capability of the algorithm in the large-scale decision space. The Singer map is shown in Eq (6):

$$x_{k+1} = \mu \left(7.86 x_{k} - 23.31 x_{k}^{2} + 28.75 x_{k}^{3} - 13.302875 x_{k}^{4}\right) \tag{6}$$

where $x_{k}$ is the k-th value of the chaotic sequence, and the Singer map has chaotic properties for μ ∈ [0.9, 1.08]. The Singer map has a larger value domain than other chaotic maps, which allows it to accommodate more types of problems. In this paper, μ is set to 1.
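A chaotic initialization of this kind can be sketched in Python as shown below. The polynomial is the commonly cited form of the Singer map; how the chaotic sequence is mapped onto the population is not specified in this section, so the sketch assumes one sequence that runs across all individuals and coordinates and is scaled linearly to the variable bounds. The names `singer_map` and `chaotic_population`, and the starting interval (0.05, 0.95), are hypothetical choices for this example.

```python
import random


def singer_map(x, mu=1.0):
    """One step of the Singer map (commonly cited polynomial form)."""
    return mu * (7.86 * x - 23.31 * x ** 2
                 + 28.75 * x ** 3 - 13.302875 * x ** 4)


def chaotic_population(n, d, lower, upper, mu=1.0, seed=None):
    """Build n individuals of dimension d from a single chaotic sequence.

    Assumption: each chaotic value in (0, 1) is scaled linearly to the
    bounds [lower[j], upper[j]] of the corresponding variable.
    """
    rng = random.Random(seed)
    x = rng.uniform(0.05, 0.95)               # start of the chaotic sequence
    population = []
    for _ in range(n):
        individual = []
        for j in range(d):
            x = singer_map(x, mu)
            individual.append(lower[j] + x * (upper[j] - lower[j]))
        population.append(individual)
    return population
```

With μ = 1, as used in this paper, the iterates stay inside (0, 1), so every generated coordinate respects its bounds; for μ close to 1.08 the map can slightly exceed 1, and a bounds check would be advisable.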

Adaptive population dividing strategy.

The subpopulation sizes in the original SSA are generally set subjectively, with 20% to 40% of the population being discoverers and the rest being followers. Vigilantes make up 10% to 20% of the population. However, in MaOPs, a fixed dividing strategy may assign non-dominated individuals to the follower subpopulation, which causes the non-dominated individuals to aggregate, reduces the diversity of the population, and incurs additional computational cost. It may also assign dominated individuals to the discoverer subpopulation, which prevents them from converging toward the Pareto front and reduces the convergence efficiency. Both situations reduce the optimization efficiency of the algorithm. Therefore, this paper proposes an adaptive population dividing strategy that first performs a fast non-dominated sorting of the population. All non-dominated individuals on the current Pareto front are then assigned to the discoverer subpopulation, and the remaining dominated individuals are assigned to the follower subpopulation. Non-dominated sorting is a method used to evaluate and rank solutions, particularly within the context of multi-objective genetic algorithms (MOGAs). The technique categorizes solutions into different levels, or fronts. The first front includes all non-dominated solutions, that is, those solutions that are not dominated by any other solution. Each subsequent front contains the solutions that become non-dominated once all preceding fronts are removed. This hierarchical ranking lets the algorithm focus on solutions close to the Pareto front while maintaining diversity within the population. The Pareto front is a concept central to multi-objective optimization problems. It refers to the set of solutions for which no other solution improves one objective without worsening another.
Formally, for a given set of solutions S in a multi-objective optimization problem (assuming minimization) with objectives f_1, …, f_M, a solution x ∈ S is said to be on the Pareto front if there does not exist any other solution y ∈ S such that f_i(y) ≤ f_i(x) for all i and f_j(y) < f_j(x) for at least one j. Solutions on the Pareto front are considered “optimal” because improving one objective will necessarily degrade at least one other objective.
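The dominance relation and the resulting split into discoverers and followers can be illustrated with a minimal Python sketch. It assumes minimization and uses a brute-force O(N²M) dominance check rather than the fast non-dominated sorting used in the paper; the function names `dominates` and `adaptive_divide` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def adaptive_divide(objs: List[Sequence[float]]) -> Tuple[List[int], List[int]]:
    """Split individual indices into discoverers and followers.

    Discoverers are the non-dominated individuals (the first front);
    followers are all remaining, dominated individuals.
    """
    discoverers, followers = [], []
    for i, fi in enumerate(objs):
        dominated = any(dominates(fj, fi) for j, fj in enumerate(objs) if j != i)
        (followers if dominated else discoverers).append(i)
    return discoverers, followers


# Five individuals with two objectives each (illustrative values only).
objs = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (5.0, 5.0)]
discoverers, followers = adaptive_divide(objs)
print("discoverers:", discoverers)
print("followers:", followers)
```

In this toy population, the last two individuals are dominated by (2.0, 2.0) and therefore become followers, while the first three form the current front.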

The discoverer subpopulation will randomly wander on the Pareto front, which can increase the diversity of the population while avoiding local optima. The follower subpopulation will converge rapidly toward the Pareto front through the random bootstrap search strategy. It makes little sense for the follower subpopulation to engage in local exploitation or global exploration: none of the followers are on the Pareto front, so in most cases the explored positions are no better than the current Pareto front, which leads to a large number of pointless blind searches that add to the computational cost. Therefore, this paper lets the follower subpopulation quickly converge to the discoverer subpopulation, so that more individuals can search on the Pareto front, thus improving the search efficiency. In the original SSA, the vigilante subpopulation is randomly selected from the whole population, and both discoverers and followers have the chance to become vigilantes. However, in multi-objective optimization problems, a follower acting as a vigilante only increases the cost of blind search because it is not on the current Pareto front. Therefore, in the adaptive population dividing strategy, vigilantes are not selected from the followers, but only randomly from the discoverers. At the same time, because vigilantes are selected only from the discoverers, this paper also expands the selection ratio of vigilantes and adaptively adjusts their proportion according to Eq (7):

(7)

where rov is the proportion of vigilantes, iter is the current iteration number, and α, β, γ, and σ are hyperparameters. The proportion also depends on the maximum number of iterations.

Random bootstrap search strategy.

In the original SSA, the followers converge to the individual with the optimal objective function value. Thus, most studies on multi-objective improvement of SSA determine the optimal non-dominated individual by calculating the crowding distance [46,50]. Calculating crowding distances with high-dimensional objectives has a very high computational complexity, which makes it impractical to extend SSA to many-objective problems with this strategy. Therefore, this paper proposes a random bootstrap search strategy that can guarantee fast convergence of the population in a many-objective space with low computational complexity.

The random bootstrap search strategy for the follower subpopulation is shown in Eq (8):

(8)

where one reference individual is randomly selected from the follower subpopulation (excluding individual i) and the other is randomly selected from the discoverer subpopulation. L is a 1 × d vector, A is a 1 × d vector whose entries are randomly set to 1 or -1, n is the population size, and the size of the discoverer subpopulation also enters the update. SSA itself has excellent convergence, but its diversity is poor due to its search and bootstrapping mechanisms. In contrast, MOPs need to consider both convergence and diversity. Therefore, in other studies on multi-objective improvement of SSA, where SSA is required to manage both convergence and diversity, the diversity of SSA can only be maximized by selecting the optimal non-dominated individual through the crowding distance.

The disadvantage of this approach is that the optimal non-dominated individual also attracts other non-dominated individuals to itself, which destroys the broad distribution of the population on the Pareto front and is not conducive to managing diversity. In addition, computing crowding distances is more expensive in a high-dimensional objective space. In TS-SSA, however, convergence and diversity are managed separately: in this stage SSA mainly needs to ensure that the population converges quickly to the Pareto front, and the population diversity is increased as much as possible afterward. Therefore, randomly selecting any individual on the Pareto front is enough to ensure fast convergence of the dominated individuals. It also avoids the high computational cost of selecting the optimal non-dominated individual by computing the crowding distance. When a follower is at the back of the population and the discoverer subpopulation contains no more than half of the total population, it is better for that follower to stay away from other followers. Because there are few discoverers at this time, guiding the followers toward them would cause the population to aggregate and reduce its diversity.

The random bootstrap search strategy for the vigilante subpopulation is shown in Eq (9):

(9)

where β is a random number obeying a normal distribution, M is the number of objectives, the two objective terms are the objective function values of the current individual and of a random individual of the follower subpopulation, K ∈ [−1, 1] is a random number, and ε is a small constant that avoids a zero denominator. Because all vigilantes are selected from the discoverer subpopulation, the number of vigilantes is small when the discoverer subpopulation contains few individuals. In that case, the vigilantes randomly interact with each other for local exploitation. When there are many individuals in the discoverer subpopulation, the number of vigilantes also increases, at which point the vigilantes randomly select followers to communicate with, thus completing the global exploration in search of a potential Pareto optimal front. The discoverer subpopulation still searches according to the discoverer update rule of the original SSA.

Dynamic multi-population search strategy

After the population has all converged to the Pareto front, limited by the performance of SSA, the diversity of the population is poor at this point, while it is difficult to increase the diversity of the population through SSA again. Therefore, this paper proposes a dynamic multi-population search strategy based on the SSA to mainly manage the diversity. The dynamic multi-population search strategy includes the dynamic population dividing strategy and the multi-population search strategy. First, the dynamic population dividing strategy divides the population into three subpopulations: discoverer subpopulation, follower subpopulation, and vigilante subpopulation. The size of each subpopulation is dynamically adjusted, which allows the main search direction of the population to change dynamically for full exploration. For the multi-population search strategy, each subpopulation performs a different search strategy in different directions, thus fully exploring the large-scale decision space, searching for the potential Pareto optimal front, and increasing the diversity of the population.

Dynamic population dividing strategy.

In the original SSA, the risk value R represents the population’s risk assessment of the environment, and the risk threshold ST represents the population’s risk tolerance. The dynamic change of R constantly adjusts the search direction of the population, which keeps the algorithm from falling into a local optimum.

The dynamic population dividing strategy is shown in Eq (10):

(10)

where the three sizes are those of the discoverer, follower, and vigilante subpopulations, and n is the population size. The discoverer subpopulation performs local exploitation, the vigilante subpopulation performs global exploration, and the follower subpopulation learns from the search information of the other two subpopulations. Therefore, when R < ST, i.e., when the population is currently in a safer position, the discoverer subpopulation expands and the population performs local exploitation to increase its diversity at the local level. When R ≥ ST, i.e., the current position of the population is no longer safe, the vigilante subpopulation expands and the population mainly performs global exploration for potentially optimal positions to avoid falling into a local optimum and to improve the global diversity of the population. Dynamic changes in subpopulation size change the dominant search direction of the population, leading to a multi-directional search that allows full exploration of large-scale decision spaces.
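The size rule can be sketched in Python using the proportions that appear in lines 13 to 20 of Algorithm 1 (N/2 for the expanding subpopulation, N/4 for the followers, and the remainder for the third subpopulation). The function name `dynamic_sizes` and the integer rounding are assumptions made for this illustration, not details taken from the paper.

```python
from typing import Dict


def dynamic_sizes(n: int, R: float, ST: float) -> Dict[str, int]:
    """Return subpopulation sizes following lines 13-20 of Algorithm 1.

    Rounding with integer division is an assumption of this sketch.
    """
    half, quarter = n // 2, n // 4
    if R < ST:
        # Safe environment: discoverers expand (local exploitation).
        discoverer, follower = half, quarter
        vigilante = n - discoverer - follower
    else:
        # Risky environment: vigilantes expand (global exploration).
        vigilante, follower = half, quarter
        discoverer = n - vigilante - follower
    return {"discoverer": discoverer, "follower": follower, "vigilante": vigilante}


print(dynamic_sizes(100, R=0.2, ST=0.6))
print(dynamic_sizes(100, R=0.8, ST=0.6))
```

The three sizes always sum to n, so every individual belongs to exactly one subpopulation in each iteration.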

However, R in the original SSA obeys a uniform distribution, which makes it difficult to obtain sufficiently random dynamic changes. Therefore, this paper also applies the Singer mapping to R to generate a chaotic sequence that ensures the randomness and ergodicity of R, and thus the randomness of the dynamic dividing of the population. Second, ST in the original SSA is a fixed preset value, which does not fully take into account the different needs of the algorithm’s early and late search stages. In the early search stage, a large amount of potential space is still unexplored, so global exploration should be the main focus; in the late optimization stage, most of the space has been explored, so local exploitation should be the main focus.
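As an illustration of how a chaotic sequence for R could be produced, the following Python sketch iterates the Singer map in its commonly cited polynomial form. The coefficients are an assumption here, since the defining equation of the map is not reproduced in this part of the paper, and the function names are hypothetical. Only μ = 1 is taken from the text.

```python
from typing import List


def singer_map(x: float, mu: float = 1.0) -> float:
    """One step of the Singer map (commonly cited form; coefficients assumed)."""
    return mu * (7.86 * x - 23.31 * x**2 + 28.75 * x**3 - 13.302875 * x**4)


def chaotic_sequence(x0: float, length: int, mu: float = 1.0) -> List[float]:
    """Generate `length` successive values of R, starting from x0 in (0, 1)."""
    values, x = [], x0
    for _ in range(length):
        x = singer_map(x, mu)
        values.append(x)
    return values


# R would be updated once per iteration of the main loop.
risk_values = chaotic_sequence(0.7, 10)
print([round(r, 4) for r in risk_values])
```

With μ = 1, the iterates in this sketch remain strictly between 0 and 1, so they can be compared directly with a threshold ST in that range.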

Therefore, the ST in this paper is adaptively adjusted according to Eq (11):

(11)

where the two bounds are the maximum and minimum values of ST, and iter is the current iteration number. Fig 2 shows the change of ST over the iterations for one parameter setting. In the early search stage, we keep ST at a low value to increase the frequency of dynamic dividing of the population, so that the main search direction of the population changes constantly and both local exploitation and global exploration are carried out fully. As the search continues, ST gradually increases to allow the discoverer subpopulation to gradually become dominant. In the later search stage, ST is kept at a high level so that the discoverer subpopulation becomes the dominant subpopulation, which results in the entire population being dominated by local exploitation.

Multi-population search strategy.

In order to fully explore the large-scale decision space, avoid falling into local optima, and search for a potential Pareto optimal front, this paper proposes a multi-population search strategy that dynamically changes the search direction during the optimization process to achieve full local exploitation and global exploration. The discoverer subpopulation adopts the simulated binary crossover (SBX) search strategy for local exploitation. The vigilante subpopulation adopts the reverse learning (RL) search strategy for global exploration. The follower subpopulation adopts the differential evolution (DE) search strategy for comprehensive exploration.

The SBX operator used by the discoverer subpopulation is a search strategy that simulates single-point binary crossover. The two parents are shown in Eq (12):

(12)

The two children are then generated by the crossover in Eq (13):

(13)

where β is determined by the distribution factor η according to Eq (14):

(14)

η is a user-specified crossover distribution index that controls the degree of similarity between the parents and children. The larger η is, the more similar the children are to the parents, which causes SBX to favor local exploitation. The smaller η is, the less similar the children are to the parents, which causes SBX to favor global exploration. However, SBX is essentially an information exchange between individuals of a population, so its local exploitation ability is stronger. Therefore, in this paper, the crossover distribution index is set to a larger value to favor thorough local exploitation.
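The effect of η can be seen in a minimal Python sketch of the textbook form of SBX. This is the standard operator as it is usually stated, which is assumed here to correspond to Eqs (12) to (14); the function name `sbx_pair` and the choice of drawing one random number per decision variable are assumptions of this illustration. The default η = 20 follows the distribution index reported in the parameter settings.

```python
import random
from typing import List, Sequence, Tuple


def sbx_pair(p1: Sequence[float], p2: Sequence[float], eta: float = 20.0,
             rng: random.Random = None) -> Tuple[List[float], List[float]]:
    """Textbook simulated binary crossover on two parents (no bound handling)."""
    rng = rng or random.Random()
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        u = rng.random()  # u in [0, 1)
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        # Children are spread symmetrically around the parents' midpoint.
        c1.append(0.5 * ((1.0 + beta) * a + (1.0 - beta) * b))
        c2.append(0.5 * ((1.0 - beta) * a + (1.0 + beta) * b))
    return c1, c2


parent1, parent2 = [0.2, 0.5, 0.9], [0.4, 0.1, 0.7]
child1, child2 = sbx_pair(parent1, parent2, eta=20.0, rng=random.Random(0))
print(child1, child2)
```

Because the exponent 1/(η + 1) shrinks as η grows, β concentrates near 1 for large η and the children stay close to their parents, which matches the local exploitation role described above.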

The RL operator used by the vigilante subpopulation is based on Eq (15):

(15)

where upper is the upper limit of the decision variables and lower is the lower limit of the decision variables. The RL operator is computationally simple but changes individuals rapidly and can search over a wide range. In single-objective optimization problems, the RL operator is destructive to population convergence, so constraints are usually imposed to reduce this effect. The most common constraint method is to decide whether to keep RL individuals by computing and comparing objective values. However, in LSMaOPs, the additional computation and comparison of objective values is more costly, so in this paper we constrain the RL operator through R, ST, and the two other subpopulations. Compared with other search operators, the RL operator performs well on diversity and can search more extensively and quickly. Especially in LSMaOPs, the RL operator can search a huge space quickly and extensively. Moreover, in this paper, RL only needs to manage diversity, which reduces its impact on convergence.
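A minimal Python sketch of a reverse learning step is given below. It uses the basic opposition form, in which each variable is reflected within its bounds; this is an assumption, and Eq (15) may include additional terms such as a random scaling factor. The function name `reverse_learning` is hypothetical.

```python
from typing import List, Sequence


def reverse_learning(x: Sequence[float], lower: Sequence[float],
                     upper: Sequence[float]) -> List[float]:
    """Reflect each decision variable inside [lower, upper] (basic opposition form)."""
    return [lo + up - xi for xi, lo, up in zip(x, lower, upper)]


lower, upper = [0.0, 0.0], [1.0, 10.0]
individual = [0.2, 3.0]
opposite = reverse_learning(individual, lower, upper)
print(opposite)
```

The operator needs no objective evaluations and always produces a point inside the bounds, which is consistent with the low cost and wide search range attributed to RL in the text.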

The DE operator used by the follower subpopulation is a search strategy based on differential information between individuals. It has strong search ability in complex environments. The DE operator in this paper is realized based on Eq (16):

(16)

where the first pair of individuals consists of two random individuals from the offspring produced by the discoverer subpopulation, and the second pair consists of two random individuals from the offspring produced by the vigilante subpopulation. α is the learning factor, which controls how much the follower subpopulation learns from the other two subpopulations, and is defined in Eq (17):

(17)

α is automatically adjusted according to the environmental risk and the number of iterations. When R > ST, i.e., the environmental risk is high, the follower subpopulation is more inclined to learn from the vigilante subpopulation and adopt global exploration. On the contrary, when R ≤ ST, i.e., the environmental risk is low, the follower subpopulation is more inclined to learn from the discoverer subpopulation and adopt conservative search. At the same time, as the search enters the later stage, the follower subpopulation chooses a more balanced learning rate for a comprehensive search. Also, considering that the discoverer subpopulation and the vigilante subpopulation are unequal in size, to ensure traversal ability, individuals that have already been selected are removed from the candidate set, and when the set is empty, all removed individuals of that subpopulation are added to the set again. The follower subpopulation performs its own search by learning the differential information of the offspring of the other two subpopulations, while continuously adjusting its search strategy to improve the diversity of the subpopulation.
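The selection-with-removal rule described above can be sketched in Python as a pool that hands out each individual once before it is refilled. The class name `RefillingPool` and the use of shuffling to randomize the order are assumptions of this illustration.

```python
import random
from typing import List, Sequence


class RefillingPool:
    """Draw items at random without repetition; refill once the pool is empty."""

    def __init__(self, items: Sequence, rng: random.Random = None):
        if len(items) == 0:
            raise ValueError("pool needs at least one item")
        self._all: List = list(items)
        self._rng = rng or random.Random()
        self._remaining: List = []

    def draw(self):
        if not self._remaining:
            # All individuals have been used: add them back in a new random order.
            self._remaining = self._all[:]
            self._rng.shuffle(self._remaining)
        return self._remaining.pop()


# Indices of offspring from a small subpopulation, sampled by many followers.
pool = RefillingPool(range(4), rng=random.Random(1))
draws = [pool.draw() for _ in range(8)]
print(draws)
```

Every offspring of the smaller subpopulation is therefore used once before any of them is used a second time, regardless of how many followers draw from the pool.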

Environmental selection strategy

In this paper, we use a reference point-based environmental selection strategy similar to that of NSGA-III [12]. In a many-objective space, this method has lower computational complexity and stronger selection pressure than crowding-based environmental selection.

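First, we perform a fast non-dominated sorting of the mixed population of parents and children to divide it into different Pareto layers. If the first layer contains exactly NP individuals, these individuals are placed in the next population and go directly to the next iteration. If the first layer contains more than NP individuals, the reference point selection strategy is applied to it and NP individuals are selected for the next population. If the first layer contains fewer than NP individuals, the individuals in each layer are put into the next population in order until the next layer no longer fits completely. The remaining individuals are then selected from that layer by the reference point selection strategy.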
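The layer-filling logic of this selection can be sketched in Python as follows. The reference point selection itself is not implemented; it is passed in as a callable, and the stand-in `take_first` used in the example is purely a placeholder. All function names are hypothetical.

```python
from typing import Callable, List


def environmental_selection(fronts: List[List[int]], NP: int,
                            niche_select: Callable[[List[int], int], List[int]]) -> List[int]:
    """Fill the next population layer by layer.

    `fronts` holds the Pareto layers (best first) as lists of individual indices.
    `niche_select(front, k)` must return k individuals chosen from `front`.
    """
    next_pop: List[int] = []
    for front in fronts:
        if len(next_pop) + len(front) <= NP:
            next_pop.extend(front)  # the whole layer fits
            if len(next_pop) == NP:
                break
        else:
            # Only part of this layer fits: choose the remaining slots from it.
            next_pop.extend(niche_select(front, NP - len(next_pop)))
            break
    return next_pop


def take_first(front: List[int], k: int) -> List[int]:
    """Placeholder for reference point selection (keeps the first k)."""
    return front[:k]


print(environmental_selection([[0, 1], [2, 3, 4], [5]], NP=4, niche_select=take_first))
```

The same loop covers all three cases in the text, because a first layer that is larger than NP simply triggers the partial selection branch in the first iteration.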

In this paper, compared with the complex environmental selection strategy in other studies, we do not make much improvement in the environmental selection strategy. We choose a simpler strategy with low computational complexity. Since TS-SSA produces population with good convergence and diversity, it is not necessary to give too much selection pressure in the environmental selection strategy to select the better offspring to ensure the overall performance of the population.

The pseudo-code for TS-SSA is shown in Algorithm 1.

Algorithm 1. TS-SSA.

Input: N: population size, M: number of objective functions, D: number of decision variables, lower: lower limit of the decision variable, upper: upper limit of the decision variable.

Output: Optimal solutions Pop.

1: Pop ← chaosInitialization(N, M, D)

2: R ← rand()

3: while termination criterion not fulfilled do

4:  R ← chaosMapping(R)

5:  ST ← adaptionST()

6:  {Subpop_discoverer, Subpop_follower} ← NDSort(Pop)

7:  if |Subpop_discoverer| < N then

8:   Offspring_discoverer ← MaOSSA_discoverer(Subpop_discoverer)

9:   Offspring_follower ← MaOSSA_follower(Subpop_follower)

10:   Subpop_vigilante ← randomSelection(Subpop_discoverer/5)

11:   Offspring_vigilante ← MaOSSA_vigilante(Subpop_vigilante)

12:  else

13:   if R < ST then

14:    Subpop_discoverer ← randomSelection(Pop, N/2)

15:    Subpop_follower ← randomSelection(Pop \ Subpop_discoverer, N/4)

16:    Subpop_vigilante ← Pop \ (Subpop_discoverer ∪ Subpop_follower)

17:   else

18:    Subpop_vigilante ← randomSelection(Pop, N/2)

19:    Subpop_follower ← randomSelection(Pop \ Subpop_vigilante, N/4)

20:    Subpop_discoverer ← Pop \ (Subpop_vigilante ∪ Subpop_follower)

21:   end if

22:  end if

23:  Offspring_discoverer ← SBX(Subpop_discoverer)

24:  Offspring_vigilante ← RL(Subpop_vigilante)

25:  Offspring_follower ← DE(Subpop_follower)

26:  Offspring ← Offspring_discoverer ∪ Offspring_follower ∪ Offspring_vigilante

27:  Pop ← EnvironmentalSelection(Pop, Offspring)

28: end while

In summary, the overall best time complexity of TS-SSA is O ( MNlogN ) and the worst time complexity is .

Experimental results and discussions

In this section, we will verify the effectiveness of the proposed method through a series of experiments. Benchmark problems, performance metric, parameter settings, and experimental results and analysis are described next.

Experimental design

Benchmark problems.

The benchmark problems used in this paper include DTLZ1-7 [52], IDTLZ1-2 [53], SDTLZ1-2 [12] and LSMOP1-9 [54] to evaluate the performance and effectiveness of the proposed method. These problems have different Pareto fronts and characteristics, which can better evaluate the overall performance of the method on different types of problems. Table 1 shows the settings of the relevant parameters of the benchmark problems.

Table 1. Parameters related to benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t001

Performance metric.

The inverted generational distance (IGD) is chosen to evaluate the overall performance of the method [55]. The IGD is calculated as shown in Eq (18):

(18)

where PF is the approximate Pareto optimal front computed by the method, PF* is the true Pareto front, z* is an individual in PF*, and minDistance(z*, PF) computes the minimum Euclidean distance from z* to PF. IGD evaluates the overall performance of the methods by calculating the average of the minimum distances from the set of points on the true Pareto front to the approximate Pareto optimal front. Therefore, IGD is effective in comprehensively evaluating the convergence and diversity of the solution set, provided that the true Pareto front is known.
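The averaging described above can be written as a short Python sketch. It follows the verbal definition (the mean, over sampled points of the true front, of the minimum Euclidean distance to the approximate front); the function name `igd` and the sample points are illustrative assumptions.

```python
import math
from typing import Sequence


def igd(approx_front: Sequence[Sequence[float]],
        true_front: Sequence[Sequence[float]]) -> float:
    """Mean distance from each true-front point to its nearest approximate point."""
    total = 0.0
    for z in true_front:
        total += min(math.dist(z, p) for p in approx_front)
    return total / len(true_front)


true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
approx_front = [(0.0, 1.0), (1.0, 0.0)]
print(igd(approx_front, true_front))
```

A lower value is better: the metric grows both when the approximate front is far from the true front and when it leaves parts of the true front uncovered, as with the missing middle point in this example.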

Parameter settings.

To verify the effectiveness of the method proposed in this paper, ten advanced MOEAs, FDV [56], LMOEADS [57], IMMOEAD [58], DGEA [59], LERD [60], MOCGDE [61], AGEMOEAII [62], HEA [63], SGECF [64], and UCLMO [65], are selected for comparison. The experiments are performed on a computer with an Intel Core I7-11800 @2.30 GHz CPU and 32 GB RAM. The program is written in MATLAB R2023a. All experiments are performed on the open source MOEA platform PlatEMO 4.2 [66], where some of the relevant code is available (https://github.com/BIMK/PlatEMO).

TS-SSA (available at https://figshare.com/s/2cb32f7acb4b6cc3e27e) uses the environmental selection strategy based on reference points, so the reference points in the experiments are generated by the two-layer (inner and outer) sampling method. The population size is set as shown in Table 2. All methods use the same population size for fair comparison. The hyperparameters in Eq (7) are set to α = 0.3, β = −0.2, γ = 10, and σ = 10.

As shown in Table 2, the maximum number of function evaluations (MaxFEs) is the population size multiplied by 1000. For example, the population size for the 8-objective problems is 156, so the maximum number of function evaluations is 156000. To ensure fairness, all algorithms have the same maximum number of function evaluations. Each method is run 30 times on each problem, and the mean and standard deviation of the performance metric are used for statistical analysis and comparison. The statistical test used for the experiments is the Mann-Whitney-Wilcoxon rank sum test [67] with a 5% significance level.
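To make the statistical comparison concrete, the following Python sketch implements a rank sum test with the normal approximation. It is a simplified illustration: ties receive average ranks, but no tie or continuity correction is applied to the variance, so established statistical software may report slightly different p-values. The function name `rank_sum_test` and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def rank_sum_test(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (z, two-sided p) of the rank sum test via the normal approximation."""
    pooled = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Illustrative IGD values of two methods over five runs (not data from the paper).
igd_method_a = [0.11, 0.12, 0.13, 0.14, 0.15]
igd_method_b = [0.21, 0.22, 0.23, 0.24, 0.25]
z, p = rank_sum_test(igd_method_a, igd_method_b)
print(z, p, p < 0.05)
```

A p-value below 0.05 rejects the null hypothesis that the two sets of runs come from the same distribution, and the sign of z shows which method has the smaller (better) IGD values.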

Most of the methods use simulated binary crossover and polynomial mutation operators, so for a fair comparison, the distribution indices of all crossover and mutation operators are set to 20. For methods that require special parameters to be set individually, the experiments use the recommended values given in the original papers.

Experimental results and analysis

Results and analysis of comparisons with other methods.

First, we conducted comparative experiments between TS-SSA and 10 other state-of-the-art methods on 20 benchmark problems with 5 different numbers of objectives, and the results of the IGD metric are shown in Tables 3, 4, 5, 6, and 7. In each table, the first four columns are the benchmark problem, the population size, the number of objectives, and the number of decision variables. The mean of the metric results for the 30 runs is shown outside the parentheses, and the standard deviation is shown in parentheses. Bold numbers indicate the best result on each benchmark problem. The symbols “+”, “-” and “=” indicate that, according to the Mann-Whitney-Wilcoxon rank sum test at the 5% significance level, the compared method is significantly better than, significantly worse than, or statistically similar to TS-SSA, respectively. The last row of each table gives the summed results of the rank sum test, where the three numbers from left to right indicate the number of times the method is better, worse, and equal compared to the method proposed in this paper.

Table 3. IGD metric of different methods on 3-objective benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t003

Table 4. IGD metric of different methods on 8-objective benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t004

Table 5. IGD metric of different methods on 10-objective benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t005

Table 6. IGD metric of different methods on 15-objective benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t006

Table 7. IGD metric of different methods on 20-objective benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t007

From the IGD results in Tables 3, 4, 5, 6, and 7, TS-SSA demonstrates significant performance advantages. FDV is a two-stage method that is divided into a fuzzy evolution stage and a precise evolution stage. The fuzzy evolution stage manages convergence for global exploration and the precise evolution stage manages diversity for local exploitation. From the results, TS-SSA outperforms FDV on most of the benchmark problems. On the 3-objective benchmark problems, FDV outperforms TS-SSA on the IGD metric only on LSMOP9. Although FDV performs better as the number of objectives increases, it still outperforms TS-SSA on only 6 of the 20-objective benchmark problems. The main reason is that although FDV also manages convergence and diversity separately, its two stages are completely separated. The fuzzy evolution stage takes place only in the early search stage and the precise evolution stage takes place only in the late search stage, so FDV cannot enter the precise evolution stage at the moment diversity needs to be improved. Therefore, FDV performs a large number of meaningless searches, which reduces its search efficiency. UCLMO is also a two-stage method. Based on cultural learning, UCLMO proposes an individual selection strategy and an assisted evolution strategy, which alternate during the search process, so UCLMO outperforms FDV but is still not as good as TS-SSA. This is because the two stages of UCLMO alternate in a fixed manner, so the algorithm does not always enter the right stage at the right time. For example, it may enter the diversity improvement stage when it needs to improve convergence, which makes the search less efficient. In contrast, the two stages of TS-SSA alternate adaptively based on population characteristics, which increases the efficiency and accuracy.

LMOEADS is an LSMOEA and therefore performs well on the 3-objective benchmark problems. However, its performance degrades as the number of objectives increases. On the 3-objective problems, LMOEADS outperforms TS-SSA five times on the IGD metric, while on the 20-objective problems it does not outperform TS-SSA once. Compared to MOEAs, TS-SSA has a significant advantage in dealing with MaOPs. IMMOEAD is a decomposition-based inverse modeling method, which decomposes the objective space, constructs an inverse mapping from the objective space to the decision space, generates solutions in the objective space, and then maps them to the decision space. From the results, TS-SSA significantly outperforms IMMOEAD. Especially on the LSMOP benchmark problems, TS-SSA outperforms IMMOEAD across the board from 3 objectives to 20 objectives. LERD and MOCGDE are also decomposition-based methods. LERD divides the decision variables into convergence and diversity variables according to a decision variable analysis, and MOCGDE discretizes the conjugate gradient to drive individuals to converge and improve diversity on the Pareto front. MOCGDE has better IGD values on the DTLZ benchmark problems, but is significantly worse than TS-SSA on the LSMOP benchmark problems. To some extent, this shows that when the decision space is large, methods based on large-scale exploration are superior to those based on decomposition. DGEA improves optimization from the perspective of generating better offspring. However, it generates better offspring through a pre-selection strategy, and its evolutionary operator is still GA, which is overall less effective and efficient than SSA. The experimental results also confirm this. HEA enhances selection pressure to select superior offspring through the hyper-dominance degree. However, it performs poorly on the IDTLZ and SDTLZ benchmark problems, which have Pareto fronts with complex shapes. This is the opposite of TS-SSA, which performs better on complex problems. This may indicate to some extent that, for solving complex optimization problems, methods that generate better offspring are better than methods that enhance selection pressure.

Table 8. CPU runtime of different methods on 20-objective benchmark problems.

https://doi.org/10.1371/journal.pone.0313772.t008

Overall, TS-SSA’s performance on the DTLZ problem series deteriorates as the number of objectives increases, especially for the 20-objective DTLZ1-5. However, for the more complex and harder-to-optimize IDTLZ and SDTLZ problem series, TS-SSA again shows better performance. A possible reason is that the search space of these two problem types is more complex, and the search strategies of other methods have difficulty searching it thoroughly, so they fall into local optima. In contrast, TS-SSA specializes in search capability and is more adaptable to a huge and complex search space. Similar conclusions can be drawn from the results on the LSMOP problems. On the 3-objective LSMOPs, TS-SSA does not show much advantage. However, as the number of objectives increases and the search space becomes more complex, the advantage of TS-SSA becomes more obvious. Especially on LSMOPs with more than 10 objectives, TS-SSA has a significant advantage over other methods. The results in Table 7 show that none of the methods outperforms TS-SSA on the 20-objective LSMOPs. MOCGDE, which performs well on the 20-objective DTLZs, also performs significantly worse than TS-SSA on the LSMOPs.

Table 8 shows the CPU runtime of each method on the 20-objective benchmark problems. According to the results, TS-SSA has a significant speed advantage on all problems. Due to the length of the article, the CPU runtimes for other numbers of objectives are not listed. However, according to those results, the speed advantage of TS-SSA also increases with the number of objectives. From Table 8, we can see that the three decomposition-based methods IMMOEAD, LERD, and MOCGDE are far inferior to TS-SSA in terms of time efficiency, especially on LSMOP. In a large-scale decision space, the decomposition-based methods need to spend a lot of computational resources on analysis and grouping, which leaves insufficient computational resources for evolutionary computation and leads to poor optimization results. Therefore, in large-scale optimization problems, when the computational resources are limited, the large-scale exploration-based method is superior to the decomposition-based method.

Fig 3 shows the IGD convergence curves of different methods on the 20-objective LSMOP4. According to Fig 3, TS-SSA converges to the Pareto front more quickly than the other methods. Fig 4 shows the final solution sets of the different methods on the 20-objective LSMOP4, where the horizontal coordinate is the objective dimension and the vertical coordinate is the objective value. The solution sets obtained by the TS-SSA proposed in this paper show better convergence and diversity than those of the other methods.

In summary, TS-SSA is able to generate populations with higher convergence and diversity than other state-of-the-art MOEAs, which confirms the feasibility of addressing the dominance resistance problem in many-objective spaces from the perspective of generating better populations. Meanwhile, compared to other LSMaOEAs, TS-SSA shows a significant advantage in the performance test on the LSMOP suite, which demonstrates the effectiveness of our proposed method in a huge and complex search space. The performance test results on the DTLZ suite show that TS-SSA is more advantageous in dealing with problems with complex search spaces, but in regular search spaces, TS-SSA performs slightly worse than advanced MaOEAs.

Fig 3. IGD convergence curves of different methods on a 20-objective LSMOP4.

https://doi.org/10.1371/journal.pone.0314584.g003

Fig 4. Solution sets of different methods on the 20-objective LSMOP4. (a), (b), (c), (d), (e), (f), (g), (h), (i), (j) and (k) are TS-SSA, FDV, LMOEADS, IMMOEAD, DGEA, LERD, MOCGDE, AGEMOEAII, HEA, SGECF and UCLMO on 20-objective LSMOP4.

https://doi.org/10.1371/journal.pone.0314584.g004

Fig 5. IGD results for 3 methods on 3-20 objective LSMOP. (a), (b), (c), (d), (e), (f), (g), (h) and (i) are the IGD results for the 3 methods with 3-20 objectives on LSMOP1 to LSMOP9, respectively.

https://doi.org/10.1371/journal.pone.0314584.g005

Experiments to demonstrate the validity of the two-stage method.

In this section, we demonstrate experimentally that it is effective to manage convergence and diversity separately through the two stages. First, we divide TS-SSA into TS-SSA1 and TS-SSA2. In TS-SSA1, only the many-objective sparrow search algorithm is executed, and in TS-SSA2, only the dynamic multi-population search strategy is executed. To avoid the situation in which TS-SSA1 has no follower subpopulation once all individuals have converged to the Pareto front, we make an adjustment to TS-SSA1: after all individuals converge to the Pareto front, 40% of the individuals in the population are randomly selected as discoverers, and the rest are selected as followers. Fig 5 shows the IGD results of the three methods on the LSMOP problems with 3, 8, 10, 15, and 20 objectives. According to the IGD results in Fig 5, TS-SSA shows significant advantages over TS-SSA1 and TS-SSA2 in most cases. TS-SSA1 and TS-SSA2 each have their own strengths and weaknesses. TS-SSA1 outperforms TS-SSA2 on LSMOP3, 5, 7, 8, and 9, while TS-SSA2 outperforms TS-SSA1 on LSMOP1, 2, 4, and 6. However, neither is as effective as TS-SSA, which demonstrates the effectiveness of managing convergence and diversity separately through the two stages.

Application in automatic test scenarios generation

In this section, we apply TS-SSA to a real-world case to examine its optimization effect. Automatic test scenarios generation is a large-scale many-objective optimization problem because test scenarios are intended to cover many test objectives and have many decision nodes [68]. The next part of this section will describe the many-objective automatic test scenarios generation based on UML activity diagrams.

UML activity diagrams preprocessing.

Our previous study successfully automated the generation of test scenarios from UML activity diagrams [69]. However, that study generated test scenarios from a single-objective perspective, and its final results show that many different execution paths remain uncovered. The main problem is that a single-objective formulation considers too few objectives, so some test scenarios are easily omitted. This is especially true when the UML activity diagram contains concurrent activities: they produce a large number of different execution paths, and it is difficult for a single-objective algorithm to generate test scenarios adequately. Therefore, in this paper we generate test scenarios from a many-objective perspective to reduce the number of missed test scenarios. Fig 6 shows the UML activity diagram of the real case used in our previous study; the preprocessing procedure is described there and is not repeated here.

Fig 6. The UML Activity Diagram for Smart Wrench Torque System.

https://doi.org/10.1371/journal.pone.0314584.g006

Code design of the solution.

Automatic test scenarios generation from UML activity diagrams is essentially a path generation task on the diagram, which is a discrete problem. In a control flow diagram, paths fork at decision nodes, which are generally branch nodes and loop nodes. Therefore, in this paper, each decision node is treated as one position (bit) of the solution, and the value range of that position depends on the number of branches of the decision node. For example, a two-branch node takes the values 0 and 1, and a three-branch node takes the values 0, 1, and 2. A loop node only needs to execute its loop once to cover the cyclic path, so we allow the loop to be executed only 0 or 1 time, and the value domain of a loop node is 0 and 1. The code length of the solution grows with the number of decision nodes in the UML activity diagram. A complete UML activity diagram of a large system often contains hundreds or thousands of decision nodes, so the number of decision variables is huge, and the problem is a large-scale optimization problem. In the first stage of TS-SSA, SSA searches a continuous solution space, so this paper designs a mapping function that maps the solution to the discrete space. The mapping function is shown in Eq (19):

(19)

Eq (19) involves the j-th position of the i-th solution, the width of the value domain of the j-th position, the current iteration number t, the j-th position of a random individual in the discoverer subpopulation, the j-th position of a random individual in the follower subpopulation, and the maximum number of iterations tmax. The discoverers, as the Pareto-optimal individuals, are discretely mapped with reference only to their own previous generation, and the influence of the previous generation gradually increases with the number of iterations, because the positions of the individuals are largely settled in the later stage of the search. The followers are mapped so that they move toward the discoverers and away from the other followers, which ensures the convergence of the solutions. For the followers, the influence of the previous generation also increases in the later stages of the search.

The search operators in the second stage of TS-SSA all have discrete versions. Therefore, this paper directly replaces them with their discrete versions instead of applying a discrete mapping.
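The encoding described in this subsection (one position per decision node, with a value domain given by the number of branches, and loops executed 0 or 1 time) can be sketched in Python as follows. This is only an illustration of the encoding, not the mapping function of Eq (19); the class `DecisionNode`, the helper functions, and the node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionNode:
    name: str            # hypothetical label, not taken from Fig 6
    kind: str            # "branch" or "loop"
    n_branches: int = 2  # only meaningful for branch nodes

    @property
    def domain_size(self) -> int:
        # A loop is executed 0 or 1 time, so its value domain is {0, 1}.
        return 2 if self.kind == "loop" else self.n_branches


def is_valid_code(code, nodes):
    """Check that every position lies inside the value domain of its node."""
    return len(code) == len(nodes) and all(
        0 <= gene < node.domain_size for gene, node in zip(code, nodes)
    )


def describe(code, nodes):
    """Translate an encoded solution into readable decisions."""
    decisions = []
    for gene, node in zip(code, nodes):
        if node.kind == "loop":
            action = "run loop once" if gene == 1 else "skip loop"
        else:
            action = f"take branch {gene}"
        decisions.append(f"{node.name}: {action}")
    return decisions


nodes = [
    DecisionNode("check_torque", "branch", 2),
    DecisionNode("select_mode", "branch", 3),
    DecisionNode("retry", "loop"),
]
code = [1, 2, 0]
print(is_valid_code(code, nodes))
print(describe(code, nodes))
```

The code length equals the number of decision nodes, which is why a diagram with hundreds or thousands of decision nodes leads to a large-scale problem.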

Objective function design.

The purpose of automatic test scenarios generation is to generate a set of test scenarios that covers as many test objectives as possible. Therefore, the set of test scenarios should cover as many nodes and paths as possible. The first minimization objective is to minimize the number of uncovered nodes, as shown in Eq (20):

(20)

where x is a single test scenario, node is a node covered by the test scenario x, and w is the weight of that node. The initial weight of each node is 0, and each test scenario that covers the node increases its weight by 1. We maintain a set that includes all nodes. In each iteration, the weights of all nodes are first reset to their initial value, and then all individuals of the population are traversed to update the weight of each node. Individuals with lower summed weights are better because they cover targets that are otherwise left uncovered.

The second minimization objective is to minimize the number of uncovered edges, as shown in Eq (21):

(21)

where edge is the edge covered by the test scenario x, and w is the weight of the edges. The initial weight of each edge is 0, and when any test scenario covers the edge, its weight is increased by 1. We will maintain a set that includes all edges. During each round of iteration, the weights of all edges are first initialized, and then all individuals of the population are traversed to update weight values of each edge.
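The weight bookkeeping described for the first two objectives can be sketched as follows. The sketch assumes that each individual is represented by the set of items (nodes or edges) it covers and that the objective value of an individual is the sum of the weights of the items it covers; the function names are hypothetical, and the exact form of Eqs (20) and (21) may differ.

```python
from collections import Counter


def coverage_weights(population):
    """Recompute item weights from scratch for one iteration.

    `population` is a list of sets; each set holds the nodes (or edges)
    covered by one test scenario. An item's weight is the number of
    scenarios that cover it.
    """
    weights = Counter()  # every weight implicitly starts at 0
    for covered in population:
        weights.update(covered)  # each covering scenario adds 1
    return weights


def weight_objective(covered, weights):
    """Summed weight of the items covered by one scenario (lower is better)."""
    return sum(weights[item] for item in covered)


# Toy population: node names are placeholders.
population = [{"A", "B", "C"}, {"A", "B", "D"}, {"A", "E"}]
weights = coverage_weights(population)
scores = [weight_objective(covered, weights) for covered in population]
print(dict(weights))
print(scores)
```

The same two functions apply unchanged to edges if each set contains the edges covered by a scenario instead of the nodes.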

To further improve coverage, the third minimization objective is to minimize similarity: individuals with lower similarity are better. The similarity metrics used in this paper are the Gower-Legendre (Dice) similarity measure and the Sokal-Sneath (anti-Dice) similarity measure, which are defined as follows:

(22)

where x is the current individual, y is an individual other than x, |x ∩ y| denotes the number of nodes covered by both individuals, and |x ∪ y| denotes the number of nodes covered by at least one of the two individuals. The Gower-Legendre and Sokal-Sneath similarity measures apply different weights in the denominator of the formula, and the two are used together to better measure the similarity of individuals.
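For readers who want a concrete reference point, the two measures can be sketched in Python using the forms in which they commonly appear in the test-case similarity literature: the number of shared nodes divided by itself plus a weighted count of non-shared nodes, with weight 0.5 for Gower-Legendre and weight 2 for Sokal-Sneath. These weights, the function names, and the convention for two empty sets are assumptions; the paper's own equation should be taken as authoritative.

```python
def _weighted_similarity(x, y, weight):
    """|x ∩ y| / (|x ∩ y| + weight * (|x ∪ y| - |x ∩ y|)) for two node sets."""
    common = len(x & y)
    differing = len(x | y) - common
    denominator = common + weight * differing
    # Convention (assumption): two empty scenarios are treated as identical.
    return common / denominator if denominator else 1.0


def gower_legendre(x, y):
    """Dice-style measure: non-shared nodes count half."""
    return _weighted_similarity(x, y, 0.5)


def sokal_sneath(x, y):
    """Anti-Dice-style measure: non-shared nodes count double."""
    return _weighted_similarity(x, y, 2.0)


x = {"A", "B", "C"}
y = {"A", "B", "D"}
print(gower_legendre(x, y), sokal_sneath(x, y))
```

For the two example sets, two nodes are shared and two are not, so the Gower-Legendre value is 2/3 and the Sokal-Sneath value is 1/3: the second measure penalizes differences more strongly, which is why using both gives a fuller picture of similarity.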

Experimental results and analysis.

The application experiments are also run on the PlatEMO platform. DGEA, MOCGDE, and UCLMO, which performed well in the previous comparison experiments, are selected as the comparison algorithms.

The evaluation metrics include Node Coverage, Edge Coverage, and the Number of Test Scenarios. Node coverage is the ratio of nodes covered by the set of test scenarios to the total number of nodes. Edge coverage is the ratio of edges covered by the set of test scenarios to the total number of edges. A higher number of test scenarios means that more test objectives are covered and the algorithm is more effective.
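The three evaluation metrics can be computed as in the following sketch. It makes simplifying assumptions: each test scenario is a sequence of node names, edges are pairs of consecutive nodes, and the number of test scenarios is the number of distinct sequences. Concurrent activities would need a richer path representation, and the function and variable names are hypothetical.

```python
def edges_of(path):
    """Edges traversed by a path given as a sequence of node names."""
    return set(zip(path, path[1:]))


def evaluate_scenarios(scenarios, all_nodes, all_edges):
    """Return node coverage, edge coverage, and the number of distinct scenarios."""
    distinct = {tuple(path) for path in scenarios}  # drop duplicate scenarios
    covered_nodes = {node for path in distinct for node in path} & set(all_nodes)
    covered_edges = {edge for path in distinct for edge in edges_of(path)} & set(all_edges)
    return {
        "node_coverage": len(covered_nodes) / len(all_nodes),
        "edge_coverage": len(covered_edges) / len(all_edges),
        "n_test_scenarios": len(distinct),
    }


# Toy diagram with three alternative branches between start and end.
all_nodes = {"start", "A", "B", "C", "end"}
all_edges = {
    ("start", "A"), ("A", "end"),
    ("start", "B"), ("B", "end"),
    ("start", "C"), ("C", "end"),
}
scenarios = [
    ("start", "A", "end"),
    ("start", "A", "end"),  # duplicate, counted once
    ("start", "B", "end"),
]
result = evaluate_scenarios(scenarios, all_nodes, all_edges)
print(result)
```

In this toy case the branch through C is never taken, so four of five nodes and four of six edges are covered by two distinct scenarios.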

The UML activity diagram used for the experiment is the Smart Wrench Torque System activity diagram shown in Fig 6. The population size is set to 100, and the maximum number of function evaluations is 10000. The average of ten replicate experiments is taken as the final result.

Table 9 shows the experimental results of the automatic test scenarios generation. From Table 9, we can see that all algorithms cover all nodes and edges, but the number of test scenarios differs greatly between them. Because of the concurrent modules, many test scenarios cover the same nodes along different execution paths, which demands greater diversity from the algorithms. According to the results, TS-SSA produces the highest number of test scenarios and outperforms the other algorithms in terms of diversity. This also supports the effectiveness of the two-stage search strategy.

Table 9. The result of automatic test scenarios generation.

https://doi.org/10.1371/journal.pone.0313772.t009

Conclusions and future works

This paper proposes a two-stage method to solve LSMaOPs by managing the convergence and diversity of the population separately. In the first stage, the many-objective sparrow search algorithm is proposed to mainly manage convergence through adaptive population dividing strategy and random bootstrap search strategy, which ensures the fast convergence of the population and avoids the extra computational consumption caused by selecting the optimal non-dominant individual. In the second stage, the dynamic multi-population search strategy is proposed to manage diversity. The dynamic multi-population search strategy dynamically changes the three subpopulation sizes and the main search direction of the population according to the value at risk, thus focusing on global exploration in the early search stage and local exploitation in the later search stage. The multi-population search strategy achieves full exploration of the large-scale decision space by searching in three different directions simultaneously.

To validate the performance of the proposed method, we conducted several sets of comparison experiments on the DTLZ and LSMOP benchmark problems. According to the results of the comparison experiments, the proposed method is competitive in solving LSMaOPs, achieving higher accuracy with lower computational complexity. We then designed an experiment to demonstrate the effectiveness of managing convergence and diversity separately in two stages. The experimental results show that managing convergence and diversity separately in two stages improves the accuracy of the method, which supports the effectiveness of the proposed method. In addition, we applied TS-SSA to a real case, and the result shows that TS-SSA outperforms the other algorithms on diversity.

Although the method proposed in this paper achieves good results in solving LSMaOPs, several aspects can still be improved. First, TS-SSA has not yet been applied to many-objective optimization problems with constraints or with expensive objective function evaluations, which can be explored in future work. Second, TS-SSA currently switches stages adaptively based on whether all individuals of the population are on the Pareto front; this condition is rather strict and may perform poorly in some cases, so in future work we will look for a better switching method. Third, the many-objective improvement proposed in this paper has so far been applied only to the sparrow search algorithm. We will try to make similar improvements to other biological population intelligence algorithms and propose a possible generalized improvement method.

References

  1. Abbassi R, Abbassi A, Heidari AA, Mirjalili S. An efficient salp swarm-inspired algorithm for parameters identification of photovoltaic cell models. Energy Convers Manag. 2019;179:362–72.
  2. Faris H, Al-Zoubi AM, Heidari AA, Aljarah I, Mafarja M, Hassonah MA, et al. An intelligent system for spam detection and identification of the most relevant features based on evolutionary Random Weight Networks. Inf Fusion. 2019;48:67–83.
  3. Chittur Ramaswamy P, Tant J, Pillai JR, Deconinck G. Novel methodology for optimal reconfiguration of distribution networks with distributed energy resources. Electric Power Syst Res. 2015;127:165–76.
  4. Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Computat. 2002;6(2):182–97.
  5. Corne DW, Jerram NR, Knowles JD, Oates MJ. PESA-II: region-based selection in evolutionary multiobjective optimization. In: Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation. San Francisco, California: Morgan Kaufmann Publishers Inc.; 2001. p. 283–90.
  6. Zitzler E, Laumanns M, Thiele L. SPEA2: improving the strength pareto evolutionary algorithm. TIK-report, 103; 2001.
  7. Asafuddoula M, Ray T, Sarker R. A decomposition-based evolutionary algorithm for many objective optimization. IEEE Trans Evol Computat. 2015;19(3):445–60.
  8. Tao K, Li Y, Hu Y, Li Y, Zhang D, Li C, et al. Overexpression of ZmEXPA5 reduces anthesis-silking interval and increases grain yield under drought and well-watered conditions in maize. Mol Breed. 2023;43(12):84.
  9. Hutahaean J, Demyanov V, Christie M. Many-objective optimization algorithm applied to history matching. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI). 2016. p. 1–8.
  10. Gong D, Sun J, Miao Z. A set-based genetic algorithm for interval many-objective optimization problems. IEEE Trans Evol Computat. 2018;22(1):47–60.
  11. Li B, Li J, Tang K, Yao X. Many-objective evolutionary algorithms. ACM Comput Surv. 2015;48(1):1–35.
  12. Deb K, Jain H. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, Part I: solving problems with box constraints. IEEE Trans Evol Computat. 2014;18(4):577–601.
  13. He Z, Yen GG, Zhang J. Fuzzy-based pareto optimality for many-objective evolutionary algorithms. IEEE Trans Evol Computat. 2014;18(2):269–85.
  14. Yuan Y, Xu H, Wang B, Yao X. A new dominance relation-based evolutionary algorithm for many-objective optimization. IEEE Trans Evol Computat. 2016;20(1):16–37.
  15. Zhang X, Tian Y, Cheng R, Jin Y. An efficient approach to nondominated sorting for evolutionary multiobjective optimization. IEEE Trans Evol Computat. 2015;19(2):201–13.
  16. Jiang S, Yang S. A strength Pareto evolutionary algorithm based on reference direction for multiobjective and many-objective optimization. IEEE Trans Evol Computat. 2017;21(3):329–46.
  17. Antonio LM, Coello CAC. Use of cooperative coevolution for solving large scale multiobjective optimization problems. In: 2013 IEEE Congress on Evolutionary Computation. IEEE; 2013. p. 2758–65.
  18. Antonio LM, Coello CAC, Brambila SG, González JF, Tapia GC. Operational decomposition for large scale multi-objective optimization problems. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. ACM; 2019. p. 225–6.
  19. Tian Y, Lu C, Zhang X, Tan KC, Jin Y. Solving large-scale multiobjective optimization problems with sparse optimal solutions via unsupervised neural networks. IEEE Trans Cybern. 2021;51(6):3115–28. pmid:32217494
  20. Zhang X, Tian Y, Cheng R, Jin Y. A decision variable clustering-based evolutionary algorithm for large-scale many-objective optimization. IEEE Trans Evol Computat. 2018;22(1):97–112.
  21. Yao X, Zhao Q, Gong D, Zhu S. Solution of large-scale many-objective optimization problems based on dimension reduction and solving knowledge-guided evolutionary algorithm. IEEE Trans Evol Computat. 2023;27(3):416–29.
  22. Xue J, Shen B. A novel swarm intelligence optimization approach: sparrow search algorithm. Syst Sci Control Eng. 2020;8(1):22–34.
  23. Li S, Chen H, Wang M, Heidari AA, Mirjalili S. Slime mould algorithm: a new method for stochastic optimization. Future Gen Comput Syst. 2020;111:300–23.
  24. Li J, Lei H, Alavi AH, Wang G-G. Elephant herding optimization: variants, hybrids, and applications. Mathematics. 2020;8(9):1415.
  25. Ishibuchi H, Setoguchi Y, Masuda H, Nojima Y. Performance of decomposition-based many-objective algorithms strongly depends on pareto front shapes. IEEE Trans Evol Computat. 2017;21(2):169–90.
  26. Tian Y, Cheng R, Zhang X, Su Y, Jin Y. A strengthened dominance relation considering convergence and diversity for evolutionary many-objective optimization. IEEE Trans Evol Computat. 2019;23(2):331–45.
  27. Zhang Q, Li H. MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans Evol Computat. 2007;11(6):712–31.
  28. Ho-Huu V, Hartjes S, Visser HG, Curran R. An improved MOEA/D algorithm for bi-objective optimization problems with complex Pareto fronts and its application to structural optimization. Exp Syst Appl. 2018;92:430–46.
  29. Wang X, Ge H, Zhang N, Hou Y, Sun L. A universal large-scale many-objective optimization framework based on cultural learning. Appl Soft Comput. 2023;145:110538.
  30. Yang X, Zou J, Yang S, Zheng J, Liu Y. A fuzzy decision variables framework for large-scale multiobjective optimization. IEEE Trans Evol Computat. 2023;27(3):445–59.
  31. Liu H-L, Gu F, Zhang Q. Decomposition of a multiobjective optimization problem into a number of simple multiobjective subproblems. IEEE Trans Evol Computat. 2014;18(3):450–5.
  32. Asafuddoula M, Ray T, Sarker R. A decomposition-based evolutionary algorithm for many objective optimization. IEEE Trans Evol Computat. 2015;19(3):445–60.
  33. Panichella A. An improved Pareto front modeling algorithm for large-scale many-objective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference; 2022. p. 565–73.
  34. Li F, Cheng R, Liu J, Jin Y. A two-stage R2 indicator based evolutionary algorithm for many-objective optimization. Appl Soft Comput. 2018;67:245–60.
  35. Sun Y, Yen GG, Yi Z. IGD indicator-based evolutionary algorithm for many-objective optimization problems. IEEE Trans Evol Computat. 2019;23(2):173–87.
  36. Tian Y, Zhang X, Cheng R, Jin Y. A multi-objective evolutionary algorithm based on an enhanced inverted generational distance metric. In: 2016 IEEE Congress on Evolutionary Computation (CEC). IEEE; 2016. p. 5222–9.
  37. Marciani G, Carmignani L, Djakovic I, Roussel M, Arrighi S, Rossini M, et al. The uluzzian and châtelperronian: no technological affinity in a shared chronological framework. J Paleolit Archaeol. 2025;8(1):3.
  38. Huang W, Xu T, Li K, He J. Multiobjective differential evolution enhanced with principle component analysis for constrained optimization. Swarm Evolution Comput. 2019;50:100571.
  39. Liu S, Yu Q, Lin Q, Tan KC. An adaptive clustering-based evolutionary algorithm for many-objective optimization problems. Inf Sci. 2020;537:261–83.
  40. Liu R, Ren R, Liu J, Liu J. A clustering and dimensionality reduction based evolutionary algorithm for large-scale multi-objective problems. Appl Soft Comput. 2020;89:106120.
  41. Liu T, Zhu J, Cao L. A stable large-scale multiobjective optimization algorithm with two alternative optimization methods. Entropy (Basel). 2023;25(4):561.
  42. Yin F, Cao B. A two-space-decomposition-based evolutionary algorithm for large-scale multiobjective optimization. Swarm Evolution Comput. 2023;83:101397.
  43. Qi S, Zou J, Yang S, Zheng J. A level-based multi-strategy learning swarm optimizer for large-scale multi-objective optimization. Swarm Evolution Comput. 2022;73:101100.
  44. Wang Z-J, Yang Q, Zhang Y-H, Chen S-H, Wang Y-G. Superiority combination learning distributed particle swarm optimization for large-scale optimization. Appl Soft Comput. 2023;136:110101.
  45. Gu Q, Huang S, Wang Q, Li X, Liu D. A chaotic differential evolution and symmetric direction sampling for large-scale multiobjective optimization. Inf Sci. 2023;639:119003.
  46. Liang S, Yin M, Sun G, Li J, Li H, Lang Q. An enhanced sparrow search swarm optimizer via multi-strategies for high-dimensional optimization problems. Swarm Evolution Comput. 2024;88:101603.
  47. Li B, Wang H. Multi-objective sparrow search algorithm: a novel algorithm for solving complex multi-objective optimisation problems. Exp Syst Appl. 2022;210:118414.
  48. Gao S, Wu R, Wang X, Liu J, Li Q, Tang X. EFR-CSTP: encryption for face recognition based on the chaos and semi-tensor product theory. Inf Sci. 2023;621:766–81.
  49. Geng J, Sun X, Wang H, Bu X, Liu D, Li F, et al. A modified adaptive sparrow search algorithm based on chaotic reverse learning and spiral search for global optimization. Neural Comput Appl. 2023;35(35):24603–20.
  50. Tang A, Zhou H, Han T, Xie L. A chaos sparrow search algorithm with logarithmic spiral and adaptive step for engineering problems. Comput Model Eng Sci. 2022;130(1):331–64.
  51. Li L-L, Xiong J-L, Tseng M-L, Yan Z, Lim MK. Using multi-objective sparrow search algorithm to establish active distribution network dynamic reconfiguration integrated optimization. Exp Syst Appl. 2022;193:116445.
  52. Zhang X, Tian Y, Jin Y. A knee point-driven evolutionary algorithm for many-objective optimization. IEEE Trans Evol Computat. 2015;19(6):761–76.
  53. Deb K, Thiele L, Laumanns M, Zitzler E. Scalable test problems for evolutionary multiobjective optimization. In: Abraham A, Jain L, Goldberg R, editors. Evolutionary multiobjective optimization: theoretical advances and applications. London: Springer; 2005. p. 105–45.
  54. Jain H, Deb K. An evolutionary many-objective optimization algorithm using reference-point based nondominated sorting approach, Part II: Handling constraints and extending to an adaptive approach. IEEE Trans Evol Computat. 2014;18(4):602–22.
  55. Cheng R, Jin Y, Olhofer M, Sendhoff B. Test problems for large-scale multiobjective and many-objective optimization. IEEE Trans Cybern. 2017;47(12):4108–21. pmid:28113614
  56. Zhou A, Jin Y, Zhang Q, Sendhoff B, Tsang E. Combining model-based and genetics-based offspring generation for multi-objective optimization using a convergence criterion. In: 2006 IEEE International Conference on Evolutionary Computation; 2006. p. 892–9.
  57. Yang Z, Qiu H, Gao L, Chen L, Liu J. Surrogate-assisted MOEA/D for expensive constrained multi-objective optimization. Inf Sci. 2023;639:119016.
  58. Qin S, Sun C, Jin Y, Tan Y, Fieldsend J. Large-scale evolutionary multiobjective optimization assisted by directed sampling. IEEE Trans Evol Computat. 2021;25(4):724–38.
  59. Farias LRC, Araujo AFR. IM-MOEA/D: an inverse modeling multi-objective evolutionary algorithm based on decomposition. In: 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC); 2021. p. 462–7.
  60. He C, Cheng R, Yazdani D. Adaptive offspring generation for evolutionary large-scale multiobjective optimization. IEEE Trans Syst Man Cybern Syst. 2022;52(2):786–98.
  61. He C, Cheng R, Li L, Tan KC, Jin Y. Large-scale multiobjective optimization via reformulated decision variable analysis. IEEE Trans Evol Computat. 2024;28(1):47–61.
  62. Tian Y, Chen H, Ma H, Zhang X, Tan KC, Jin Y. Integrating conjugate gradients into evolutionary algorithms for large-scale continuous multi-objective optimization. IEEE/CAA J Autom Sinica. 2022;9(10):1801–17.
  63. Liu Z, Han F, Ling Q, Han H, Jiang J. A many-objective optimization evolutionary algorithm based on hyper-dominance degree. Swarm Evolution Comput. 2023;83:101411.
  64. Wu C, Tian Y, Zhang Y, Jiang H, Zhang X. A sparsity-guided elitism co-evolutionary framework for sparse large-scale multi-objective optimization. In: 2023 IEEE Congress on Evolutionary Computation (CEC). 2023. p. 1–8.
  65. Wang Q, Gu Q, Chen L, Guo Y, Xiong N. A MOEA/D with global and local cooperative optimization for complicated bi-objective optimization problems. Appl Soft Comput. 2023;137:110162.
  66. Tian Y, Cheng R, Zhang X, Jin Y. PlatEMO: a MATLAB platform for evolutionary multi-objective optimization [educational forum]. IEEE Comput Intell Mag. 2017;12(4):73–87.
  67. Steel RGD, Torrie JH, Dickey DA. Principles and procedures of statistics: a biometrical approach. New York, NY, USA: McGraw-Hill; 1997.
  68. Panichella A, Kifetew FM, Tonella P. Automated test case generation as a many-objective optimisation problem with dynamic selection of the targets. IEEE Trans Softw Eng. 2018;44(2):122–58.
  69. Du X, Zhang J, Chen K, Zhou Y. DFS-KeyLevel: A two-layer test scenario generation approach for UML activity diagram. J Electron Test. 2023;39(1):71–88.