Figures
Abstract
The fast developments in artificial intelligence together with evolutionary algorithms have not solved all the difficulties that Gene Expression Programming (GEP) encounters when maintaining population diversity and preventing premature convergence. Its restrictions block GEP from successfully handling high-dimensional along with complex optimization problems. This research develops Dynamic Gene Expression Programming (DGEP) as an algorithm to control genetic operators dynamically thus achieving improved global search with increased population diversity. The approach operates with two unique operators which include Adaptive Regeneration Operator (DGEP-R) and Dynamically Adjusted Mutation Operator (DGEP-M) to preserve diversity while maintaining exploration-exploitation balance during evolutionary search. An extensive evaluation of DGEP occurred through symbolic regression problem tests. The study employed traditional benchmark functions and conducted evaluations versus baselines Standard GEP, NMO-SARA, and MS-GEP-A to assess fitness outcomes, R² values, population diversification, and the avoidance of local optima. All key metric evaluations showed that DGEP beat standard GEP along with alternative improved variants. DGEP produced the optimal results for 8 benchmark functions that produced 15.7% better R² scores along with 2.3 × larger population diversity. The escape rate from local optima within DGEP reached 35% higher than what standard GEP could achieve. The DGEP model serves to enhance GEP performance through the effective maintenance of diversity and improved global search functions. The results indicate that adaptive genetic methods strengthen evolutionary procedures for solving complex problems effectively.
Citation: Liu K, Teng Y, Liu F (2025) A study of gene expression programming algorithm for dynamically adjusting the parameters of genetic operators. PLoS One 20(6): e0321711. https://doi.org/10.1371/journal.pone.0321711
Editor: Ziqiang Zeng, Sichuan University, CHINA
Received: January 7, 2025; Accepted: March 10, 2025; Published: June 2, 2025
Copyright: © 2025 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This study was supported in part by Shenyang Aerospace University Innovation and Entrepreneurship Plan (202410143274).
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Gene Expression Programming (GEP) [1,2] functions as an effective adaptive evolutionary algorithm because it applies unique genotype-to-phenotype mappings to address optimization problems characterized by high nonlinearity and multimodality with efficiency [3–5]. GEP demonstrates its strength by offering efficient optimization strategies with highly interpretable models when solving problems in industrial process optimization [6–8] and intelligent manufacturing [9,10] as well as financial market prediction [11]. The evolution encounters substantial issues that affect the performance of GEP algorithms despite their effective properties. The reduction of population variety combined with deteriorating global navigation ability makes GEP exhibit premature local optimization which reduces its effectiveness for complex real-world problems.
A wide range of improvement techniques for population diversity and search performance has been developed to advance GEP throughout recent years. Experts who focus on enhancing classic genetic operators [12–15] have developed adaptive system controls to modify operator parameters during evolution improving the exploration-exploitation equilibrium. The prevention of premature convergence combined with improved global searching capabilities is achieved through multiple diversity-maintaining strategies described in [16,17]. GEP achieves its best results when integrated with reinforcement learning and particle swarm optimization in addition to similar techniques for optimizing high-dimensional complex search spaces according to research reports [18–20]. Multi-objective GEP methods [15,21] also allow for the simultaneous optimization of conflicting objectives, expanding the applicability of GEP to more complex problem domains.
These methods have made significant progress in improving the performance of GEP, however, they still fail to overcome some fundamental issues. First, in the GEP algorithm, all offspring genotypes are directly or indirectly derived from the initial population. Although genetic operations such as mutation, crossover, and recombination introduce diversity into the population, once the genotypes in the evolutionary process become homogeneous, these methods struggle to effectively restore population diversity. Second, some methods attempt to adjust parameters based on fixed rules or predefined conditions, but they often lack the flexibility needed to dynamically adapt to the varying demands at different stages of evolution.
To overcome the limitations of existing GEP algorithms, this study introduces a novel approach, Dynamic Gene Expression Programming (DGEP), which introduces two genetic operators proposed in this paper: the Adaptive Regeneration Operator (DGEP-R) and the Dynamically Adjusted Mutation Operator (DGEP-M). This research developed novel operators that aim to solve traditional GEP weaknesses regarding global search and population diversity. We established DGEP-R to work during individual selection phases whereas DGEP-M operated after selection in genetic operations and maintained a supportive relation between the two operators. DGEP-R operates at evolutionary turning points to add new potential solutions that help increase diversity as well as exploration space during stagnant fitness phases. DGEPM regulates mutation rates according to the evolutionary development dynamics. The dynamic genetic operator mechanism of DGEP-M promotes fitness-based control of mutation frequency which creates a balanced search method that protects elite solutions but maintains population variety. Experimental tests on symbolic problems have verified that DGEP leads to superior performance compared to standard GEP and alternative advanced variants of GEP. DGEP demonstrated superior solution accuracy and diverse population maintenance while strengthening its ability to explore global search spaces according to experimental results. Our main work achievements consist of the following:
- To enhance population diversity, we proposed an Adaptive Regeneration Operator (DGEP-R) by introducing new individuals at critical evolutionary stages and when fitness stagnates, achieving diversity maintenance, premature convergence prevention, and significant improvement of global search capability.
- To balance exploration and exploitation, we developed a Dynamically Adjusted Mutation Operator (DGEP-M) that modulates the mutation rate based on evolutionary progress.
- Experiments on symbolic regression problems show that the proposed DGEP algorithm outperforms traditional and improved GEP algorithms in solution accuracy, population diversity, and global search capability.
The Genetic Expression Programming methodology serves as a widely accepted solution for optimizing complex problems and symbol-based regressive modeling. The primary difficult with standard GEP algorithms involves deteriorating population diversity during evolution that generates early convergence thus limiting the algorithm’s global space exploration. Over time genetic material within the population collects similar traits which prevents the discovery of optimal solutions.
GEP optimization receives enhancements through adaptive genetic operators while population control mechanisms and hybrid system optimization strategies work to tackle this issue. The methods also introduce performance problems when implemented in complex dynamic environments because they use fixed operational parameters and create extra processing requirements. This paper presents the Dynamic Gene Expression Programming (DGEP) algorithm that incorporates two innovative operators: Adaptive Regeneration Operator (DGEP-R) and Dynamically Adjusted Mutation Operator (DGEP-M). The operators use continuous fitness assessment of the population together with evolutionary monitoring to dynamically control genetic operations which both conserve diversity and boost exploration abilities as well as convergence speed. The dynamic adjustment of genetic parameters through DGEP improves both the GEP search ability and prevents premature convergence thus resulting in improved optimizers.
The second part of this work explores related studies which have improved the GEP algorithm. The third part defines the problem statement and presents the necessary background information. Section 4 details the implementation structure of the proposed DGEP algorithm. Section 5compares DGEP with standard GEP and other improved algorithms through experiments and analysis. Section 6 summarizes this work and presents future work.
2. Related work
Modern Gene Expression Programming research enables increased efficiency by following three major directions which involve: advanced genetic operator optimization and population diversity management as well as hybrid and multi-objective integration techniques.
2.1 Improving genetic operators
Research has actively worked to enhance standard GEP genetic operators including mutation crossover and recombination through several published works [14,15,17,22,23] to improve population diversity while stopping premature convergence. Evolutionary algorithms need these operators to perform exploration and exploitation operations harmoniously [9,24–26]. The evolutionary population state serves as a basis for adaptive genetic operator parameters [12,13,27,28] adjustment in implementations which include [10,11,29]. The evolutionary stage maintains an equilibrium between exploratory and exploitative traits because of this mechanism.
2.2 Population diversity maintenance mechanisms
The research field of GEP now emphasizes sustaining population diversity during evolution. The implementation of different diversity preservation approaches through new selection methods is being researched. The Clonal Selection Algorithm (CLONALG) [30] draws its ideas from the immune system by applying data-like antigens. The algorithm shows better computational power and expression ability than conventional GEP approaches. MS-GEP-I manual intervention strategy actively implements outside measures when population diversity deteriorates or evolutionary stagnation happens [13]. For instance, when no improvement in fitness is observed over several generations, individuals are randomly replaced, or the process reverts to parent or grandparent generations to alter the evolutionary trajectory and reinvigorate the search process. In addition, dynamic population generation (DPGP) [12] adjusts the population size dynamically during evolution to prevent the algorithm from becoming trapped in local optima and to maintain diversity. Modern engineering optimization problems have received enhancements through newly developed meta-heuristic algorithms in recent years. The Artificial Lemming Algorithm (ALA) uses Lemming’s natural behaviors to resolve actual engineering optimization obstacles [31]. Multi-Strategy Boosted Snow Ablation Optimizer (MSAO) represents an enhanced version of Snow Ablation Optimizer through its integration of multiple strategies which produces superior global optimization performance [32,33]. The development of improved algorithms reflects the continuous research to enhance both the efficiency of evolutionary computation along its solution outcomes.
The field of genetic programming now benefits from newly developed adaptive processes together with hybrid optimization methods that enhance the speed of evolutionary search operations. Scientists have analyzed self-adaptive resource allocation in cloud computing as shown by [to prove how adaptable resource distribution optimizes operational efficiency. The wide range of real-world problems has been efficiently tackled by evolutionary algorithms in software engineering [34] while they also lead to improved solutions in manufacturing optimization [3] and predictive modeling [35]. Evolutionary approaches for genetic programming identified as multi-objective optimization [35] enabled better management between exploration and exploitation. Research findings shed important light on genetic programming methodology evolution demonstrating that DGEP with adaptive regeneration and mutation functions represents an innovative search performance and convergence stability enhancement [36].
2.3 Hybrid and multi-objective methods
The combination of GEP optimization methods with alternative techniques has led to widespread research interest among experts during the recent period. Multiple research teams have exhibited the superiority of GEP when it merges with reinforcement learning (RL) [18], artificial neural networks (ANN) [19], and particle swarm optimization (PSO) [20] in various applications. These approaches demonstrate effectiveness when handling difficult problems with extensive search spaces that have multiple local solutions.
Multi-objective optimization strategies now form an integral part of GEP to run simultaneous optimization of multiple competing objectives. Remarkable model interpretation results were produced by multi-objective GEP [15] during symbolic regression because it managed to balance model accuracy with simplicity while maintaining high-performance numbers. The NSGAII [21] determines how to find acceptable solution clusters during multi-objective design operations which result in Pareto-optimal solutions. GEP has been found suitable for solving high-dimensional nonlinear problems while handling multiple conflicting objectives by various research studies.
2.4 Limitations analysis
These improvement methods for GEP demonstrate valuable performance enhancements yet their usage comes with various specific restrictions.
- (1) Limitations in Improvement Population Diversity: By adjusting the population size, reverting to earlier genetic populations, or optimally tuning genetic operator parameters, these methods ultimately evolve based on the initial population. When the initial population lacks diversity, these approaches struggle to significantly enhance diversity during the evolutionary process, making it difficult to prevent the population from converging to the local optima.
- (2) Increased Time and Space Complexity: Improved GEP algorithms often rely on more complex data structures and computational procedures, which inevitably increases the time and space complexity of the algorithm. While these methods enhance search efficiency and solution quality, they also introduce additional computational and storage overheads.
- (3) Parameter Sensitivity: Many improved GEP algorithms introduce additional parameters (e.g., the number of subspaces and jump thresholds), and the performance of the algorithm is highly sensitive to the settings of these parameters. Inappropriate configurations can lead to significant performance degradation, and the tuning and optimization of these parameters can be complex and time-consuming.
3. Problem definition and preliminaries
3.1 Problem definition
Table 1 summarizes the primary symbols used in this section, providing a clearer understanding of these definitions and their roles in the algorithm.
Definition 1. Search Space and Objective Function: The GEP algorithm aims to determine a globally optimal solution within the search space
. The goal is to minimize (or maximize) an objective function
:
The search space contains all possible solutions, and the objective function
determines the value to be optimized for each candidate solution
.
Definition 2. Population and Genotype: In generation , a population
contains
individuals, where each individual is represented by a genotype
. A genotype refers to the underlying structure of an individual solution that is subject to genetic operations (such as mutation or crossover) to evolve.
Definition 3. Population Diversity and Variance: The diversity of the population is defined by the variance in the genotypes of individuals in generation . Let
represent this variance, calculated as:
where represents the mean genotype of the population in generation t. As the number of generations increases, the population tends to converge towards higher-fitness individuals. Although genetic operations (such as mutation and crossover) introduce some degree of random variation among individuals, they often fail to generate significantly different offspring, leading to a decrease in the genotype variance:
This reflects the loss of population diversity, which limits exploration and increases the risk of becoming trapped in the local optima.
Definition 4. Fitness Coefficient: The fitness coefficient is used to assess both the diversity and quality of the population by combining the average, maximum, and minimum fitness levels within the population. It is defined as:
where is the average fitness,
is the highest fitness, and
is the lowest fitness. This metric effectively reflects both the range and concentration of fitness values, offering insight into the diversity and potential convergence of a population.
The decline in population diversity is considered a critical problem that limits the global search capability of traditional GEP algorithms. As highlighted in Definition 3, the genotype variance tends to decrease as generations progress, leading to a homogeneous population and a reduced ability to explore the search space. This problem is compounded by the fixed nature of the genetic operator parameters (e.g., mutation and crossover rates) across generations, which may not adequately balance exploration (promoting diversity) and exploitation (focusing on high-quality individuals).
Thus, the two primary problems addressed in this research are:
- Declining Population Diversity: As noted, population diversity decreases owing to the selection process favoring individuals with higher fitness. This reduces the ability of the algorithm to explore new areas of the search thereby increasing the risk of convergence to the local optima.
- Lack of Adaptive Parameter Tuning: Fixed genetic operator parameters fail to adapt to changing evolutionary dynamics. In the early generations, higher mutation rates may be necessary to promote diversity, whereas in the later generations, lower mutation rates are preferable to avoid disrupting high-quality solutions. Hence, a dynamic adjustment mechanism for operator parameters is necessary to maintain an optimal balance between exploration and exploitation.
3.2 Preliminaries
To address these challenges, we propose two key innovations: an adaptive regeneration operator and a dynamic mutation operator. These operators are dynamically adjusted based on the fitness coefficient , which provides a simple yet effective measure of the population state.
Definition 5. Adaptive Regeneration Operator: The adaptive regeneration operator introduces entirely new individuals into the population, rather than relying on selection from previous generations, to maintain diversity. The regeneration rate Ri is defined as:
where and
represent the minimum and maximum regeneration rates, α is a scaling factor, and G is the total number of generations. This operator promotes exploration in the early stages by generating more new individuals and reduces exploration in the later stages as the population converges.
Definition 6. Dynamic Mutation Operator: The mutation operator has the most significant impact on population diversity among all the genetic operators [37]. To dynamically adjust the mutation rate, we define Pi as:
The fitness coefficients and
reflect the current and previous generations’ quality. The ratio
allows for an evaluation of the evolutionary trend:
- When
, this indicates that the fitness coefficient of the current population has increased compared with that of the previous generation, implying an increase in the proportion of high-fitness individuals and a reduction in low-fitness individuals. In this case, the algorithm favors exploitation and reduces the mutation rate to avoid disrupting high-quality individuals.
- When
, this indicates stability in population quality. Therefore, maintaining the current mutation rate is a reasonable choice.
- When
, the fitness coefficient decreases, implying a population degradation. In this case, increasing the mutation rate helps boost diversity, encouraging further exploration of the search space to escape local optima and improve population quality.
3.3 Mathematical analysis of DGEP’s genetic search efficiency
The effectiveness of DGEP in improving genetic search efficiency is supported by a detailed mathematical derivation that highlights its advantages over standard GEP. The core improvements stem from population diversity enhancement, adaptive mutation rate adjustments, and improved convergence properties, all of which contribute to a more robust exploration-exploitation balance.
To quantify how DGEP-R enhances population diversity, we redefine genotypic variance using a diversity coefficient , computed as:
where represents the genotype of individual
in generation
,
is the mean genotype, and
is the population size. The Adaptive Regeneration Operator (DGEP-R) prevents diversity loss by reintroducing individuals with new genetic structures at stagnation points, thereby maintaining a stable and high
value across generations. This controlled diversity prevents premature convergence by ensuring that genetic novelty persists in the population.
The Dynamically Adjusted Mutation Operator (DGEP-M) modifies mutation rates based on fitness trends, which can be described by the probability of mutation at generation
:
where and
are the lower and upper mutation bounds, respectively, and
represents the population fitness coefficient. This equation ensures that mutation increases when population fitness stagnates or decreases, promoting exploration, and decreases when fitness improves, ensuring the stability of high-quality solutions.
To establish DGEP’s improved convergence properties, we analyze the probability of escaping local optima, denoted as . Given that higher mutation rates increase the probability of escaping suboptimal regions, DGEP dynamically adjusts this probability using:
where stands for mutation events that occur in each generation. An adaptive
control in DGEP allows the algorithm to preserve high global search ability without decreasing performance through random consequences. Digital Gene Expression Programming posits better exploration capabilities than basic GEP through its ability to maintain computational effectiveness. DGEP-M and DGEP-R together achieve perfect parameter tuning that balances population diversity levels and solution optimization performance although static GEP parameters cannot achieve such balance.
4. Gene expression programming algorithm for dynamically adjusting genetic operator parameters (DGEP)
We present DGEP as a dynamically adjusted genetic operator parameter GEP algorithm containing adaptive regeneration operator (DGEP-R) and dynamically adjusted mutation operator (DGEP-M).
4.1 DGEP-R: Adaptive regeneration operator
The Adaptive Regeneration Operator (DGEP-R) stands as our developed genetic operator used to manage population diversity during GEP algorithm evolutions. The DGEP-R approach introduces new individuals dynamically into the population according to evolutionary status together with their current fitness evaluation. These new individuals were generated with unique genotypes, independent of the initial population, thus significantly enhancing the diversity of the population.
The core design principle of the DGEP-R algorithm is as follows: during the early stages of evolution or when the fitness coefficient is low, the regeneration rate of individuals increases, introducing more new individuals to enhance population diversity, expanding the search space, and improving the likelihood of escaping local optima. In the later stages of evolution or when the fitness coefficient is higher, the regeneration rate gradually decreases, allowing the population to focus on searching for better solutions near the current ones, thereby improving the algorithm’s convergence efficiency.
Algorithm 1: DGEP -R Algorithm
Input: Population size , Maximum generations
; Fitness list of the population individuals
, Current generation
, Population old Population
Output:
1 // Calculate the average fitness of the population
2 // Find the highest fitness in the population
3 // Find the lowest fitness in the population
4 // Compute the fitness coefficient
5 //Calculate the adaptive regeneration operator
6 ;// Calculate the number of new individuals
7 ;// Calculate the number of individuals selected from the previous generation
8 for j to
do
9
10 end for
11 for j to
do
12
13 end for
14
15 ;
The DGEP-R algorithm, as described in Algorithm 1, operates as follows: Initially, it calculates the average fitness (), the highest fitness (
), and the lowest fitness (
), as shown in lines 1–3 of the algorithm. From these, the fitness coefficient
is derived to assess population diversity, as shown in line 4 of the algorithm. The regeneration rate
is then dynamically adjusted based on
, allowing the generation of new individuals, as shown in line 5 of the algorithm. Currently, some individuals are selected from the previous generation according to their fitness, as seen in lines 8–10 of the algorithm. The final population is a combination of both new and selected individuals, as shown in line 14 of the algorithm.
4.2 DGEP-M: Dynamically adjusted mutation operator
Fig 1 illustrates the workflow of the Dynamic Gene Expression Programming (DGEP) algorithm, highlighting the integration of the Adaptive Regeneration Operator (DGEP-R) and Dynamically Adjusted Mutation Operator (DGEP-M) within the evolutionary cycle. The process begins with population initialization, followed by fitness evaluation. If the stopping criteria are met, the algorithm terminates; otherwise, DGEP-R introduces new individuals when diversity stagnates, ensuring a more explorative search. Next, DGEP-M dynamically adjusts the mutation rate based on fitness progression, allowing the algorithm to balance exploration (high mutation) and exploitation (low mutation).
To enhance the algorithm’s global search capability, we propose the Dynamically Adjusted Mutation Operator (DGEPM). Unlike fixed mutation rates in traditional GEP algorithms, DGEP-M adjusts mutation rates in real time according to the evolutionary state, maintaining diversity while improving convergence speed and solution quality, effectively avoiding local optima. Furthermore, by sharing the parameter of the fitness coefficient with DGEP-R, DGEP-M reduces the computational complexity compared to other improved GEP algorithms, increasing efficiency in practical applications.
The DGEP-M algorithm introduces a strategy for dynamically adjusting mutation operator parameters: When the fitness coefficient factor is greater than 1, it indicates an improvement in population fitness compared to the previous generation, which indicates an improvement in individual quality. In this case, the mutation rate was set to the lower limit, reducing the likelihood of damage to high-quality individuals. Individuals with a higher fitness evolve at a relatively lower mutation rate, ensuring the preservation of the best-performing individuals. When the coefficient factor was equal to one, the mutation rate remained unchanged, reflecting a stable evolution.
When the coefficient factor is less than one, it suggests a decline in population fitness and individual quality. Thus, the mutation rate was increased to enhance both the individual quality and population diversity.
As shown in Algorithm 2, the DGEP-M process dynamically adjusts the mutation rate Pi based on changes in the fitness coefficient, as indicated in lines 1–7 of the algorithm. The algorithm compares the current and previous-generation fitness coefficients and
. If
, which indicates an improvement in fitness, a lower mutation rate
is set to protect high-quality individuals. If
, the mutation rate remains the same, that is,
. If
, the mutation rate is increased to enhance the diversity. Finally, the number of mutations was calculated, and the mutation operation was applied to generate a new population, as shown in lines 8–9 of the algorithm.
Algorithm 2: DGEP -M Algorithm
Input: Population size , Maximum generations
, Current generation
, The current generation fitness coefficient
, The previous generation fitness coefficient
, Population oldPopulation
Output:
1 if then
2 ;
3 else if then
4 ;
5 else
6
7 end if
7 ;// Calculate the number of mutate individuals
8
10 ;
4.3 DGEP algorithm
The DGEP algorithm comprehensively enhances the traditional GEP framework by integrating both DGEP-R and DGEPM operators to leverage the unique advantages of each. DGEPR and DGEP-M target different stages of the GEP algorithm, allowing them to function in tandem without interference. Specifically, DGEP-R was applied during the selection phase, introducing new individuals to diversify the population and strengthen exploration capabilities. After the selection stage, DGEP-M is applied during the genetic operations phase, dynamically adjusting the mutation rates to balance exploration and exploitation according to the evolution of the population. The combination of these two strategies enhances the global search capabilities and reduces the risk of premature convergence. Importantly, both DGEP-R and DGEP-M were designed to be seamlessly integrated into the standard GEP framework with minimal structura modifications, ensuring high compatibility and ease of implementation. This compatibility makes DGEP an ideal choice for improving the existing GEP algorithms.
Algorithm 3: DGEP Algorithm
Input: Population size , Maximum generations
;
Output: Optimal solution
1 // Initialize the population
2 for i to
do
3 ;//Evaluate the fitness of the population
4 ;// Save the chromosome with the best fitness
5 if then// Check if the stopping criteria is met
6 end for
7 else
8 ;// Apply DGEP-R operator
9 ;// Apply DGEP-M operator
10 ;// Apply transposition operator
11 ;// Apply recombination operator
12
13 end if
14 end for
15 return bestChromosome;// Return the optimal solution
As illustrated in Algorithm 3, the DGEP algorithm follows this process:
- Initialize the population, evaluate its fitness, and save the chromosome with the highest fitness (Line 1–4).
- If the stopping criteria are satisfied, the algorithm terminates. Otherwise, the DGEP-R, DGEP-M, transposition, and recombination operators are sequentially applied to generate a new population (Line 8–12).
- This is repeated until the stopping criteria are satisfied, and the best chromosome is returned as the optimal solution (Line 15).
4.4 Computational complexity analysis
The DGEP algorithm integrates the DGEP-R and DGEP-M operators, combining their respective advantages to enhance evolutionary performance. Since both DGEP-R and DGEPM maintain a time and space complexity of , the overall complexity of the DGEP algorithm also remains
, without introducing any significant increase compared to the traditional GEP algorithm. This ensures that DGEP preserves computational efficiency while incorporating advanced genetic strategies.
4.4.1. Adaptive regeneration operator (DGEP-R).
- Time Complexity: The core of the DGEP-R strategy involves dynamically introducing new individuals based on population fitness and evolutionary progress. This requires calculating the average fitness
, highest fitness
, and lowest fitness
for the population at each generation, followed by computing the regeneration rate
using Definition 5. The computational complexity of these operations depends primarily on the population size n, the complexity of calculating the fitness-related parameters is
, and the process of introducing new individuals scales linearly. Therefore, the time complexity of DGEPR for each generation is
.
- Space Complexity: Although DGEP-R increases population diversity by introducing new individuals, the total population size
remains constant. Therefore, the space complexity of DGEP-R is unchanged compared with that of the standard GEP algorithm and remains
.
4.4.2. Dynamically adjusted mutation operator (DGEP-M).
- Time Complexity: The DGEP-M strategy dynamically adjusts the mutation rate based on the fitness coefficients of the current and previous generations. This requires calculating the fitness coefficients for each generation and adjusting the mutation rate accordingly. As these calculations are based on known fitness values and involve simple multiplication and comparison operations, the complexity of each individual was
. Consequently, the overall time complexity of the population is
.
- Space Complexity: DGEP-M involves adjusting the mutation rate, which requires no additional storage space, utilizing only the existing population and fitness information. Therefore, the space complexity remains O(n), which is the same as that of the standard GEP algorithm.
5. Experiments and analysis
We designed a set of experiments for symbolic regression problems to verify the effectiveness of new operators. To evaluate the adaptive regeneration operator DGEP-R, the dynamically adjusted mutation operator DGEP-M, and the DGEP algorithm proposed in this paper, we developed several improved GEP algorithms for comparative experiments. We evaluated the performance of the new operators by problem solution quality, population diversity, and the ability of the algorithms to detach from local optimal solutions.
5.1 Experimental setup
5.1.1 Experimental environment.
All algorithms in the experiments were developed by.NET Framework/C# 4.6.1 and run on a PC with Intel(R) Core (TM) i7-8685U 2.11 GHz CPU, 16.0 GB RAM, and Windows 11 professional sp2.
5.1.2 Experimental parameters.
In the experiment, the basic parameters for all the algorithms were configured according to the data specified in Table 2. The parameters of the adaptive regeneration operator for the DGEP-R and DGEP algorithms were set as follows: ,
. The parameters of the dynamically adjusted mutation operator for the DGEP-M and DGEP algorithms were set to:
,
.
To ensure the optimal selection of parameter values in Table 2, we conducted a parameter sensitivity analysis by systematically varying key parameters such as population size, mutation rates, regeneration rates, and recombination/transposition probabilities. The results demonstrated that a population size of 100 maintained a balance between diversity and computational efficiency, while mutation rates (,
) and regeneration rates (
,
) provided the best trade-off between exploration and exploitation. Excessively high recombination and transposition rates led to instability, reinforcing the chosen values. These parameter settings were determined based on multiple benchmark tests to optimize solution accuracy, convergence speed, and population diversity.
5.1.3 Related algorithms.
To comprehensively evaluate the performance of the new operators DGEP-R, DGEP-M, and DGEP, we developed several comparison algorithms, including:
- Standard GEP [16]: The basic gene expression programming algorithm was used as a baseline for comparison with improved algorithms.
- NMO-SARA [12]: A multi-objective evolutionary algorithm with neighborhood search, aimed at enhancing local search capabilities.
- MS-GEP-A [13]: A GEP improvement algorithm based on multiple strategies, combining various approaches to boost search performance.
These comparison experiments allow us to better assess the improvements introduced by the new algorithms in various respects.
5.1.4 Test functions.
We used a set of test functions from [13], which were mainly based on the test functions in Keijzer [14], Nguyen [15] and Koza [22]. The test samples were generated using ten sets of test functions. The test functions are presented in Table 3.
To obtain accurate comparison results between different algorithms, each algorithm was run independently ten times on each test function, and then the average value was taken as the performance metric of that algorithm to avoid random bias.
5.1.5 Evaluation metrics.
The solution to symbolic regression problems was evaluated using four essential assessment metrics during this experimental analysis [38]. The established metric system demonstrates a single benchmark for algorithm assessment while it demonstrates DGEP’s superiority regarding solution quality along with its additional benefits of diverse populations and wide-scale exploration.
Definition 8. Fitness: The solution quality of all problems depends directly on fitness which functions as a primary evaluation factor in evolutionary algorithms. A symbolic regression model achieves its fitness score by measuring the difference between its computational output and the actual target function output. The model holds superior solution quality when its fitness value reaches elevated levels for achieving optimal agreement with the target functions.
The fitness calculation formula is as follows:
where yi is the actual output of the target function; ∘yi is the model output; and n is the number of data points. The smaller the difference, the higher is the fitness and performance of the algorithm.
Definition 9. R-squared (): R-squared (
) is a statistical measure that evaluates the goodness-of-fit of a regression model, indicating the variance in the dependent variable explained by the model. The value of R2 ranged from 0 to 1, with values closer to 1 indicating better model performance and accuracy.
The formula for is:
where is the actual value,
is the predicted value, and
is the mean of the target value. The closer R2 is to 1, the better the fit of the model to the data.
Definition 10. Population Diversity (): Population diversity measures the degree of variation among individuals in a population and is critical for maintaining the global search capability of the algorithm. A higher diversity means that the population contains more varied solutions, thereby increasing the probability of finding the global optimum.
The diversity metric is defined as:
where N is the number of individuals in the population and D is the number of individuals with different fitness values. For example, if 100 individuals produced 70 different fitness values, the diversity score would be 0.7.
Definition 11. Ability to Escape Local Optima: Local optima refer to suboptimal solutions in which the algorithm may become trapped, thereby preventing it from finding the global optimum. We measured the ability of algorithms to escape local optima based on the number of evolutionary generations required to find the best chromosome. The greater the number of generations required, the better the ability of the algorithm to continue exploring new subspaces and escape local optima, leading to better final solutions.
Using these four-evaluation metrics, we can quantitatively assess the improvements that DGEP brings to the different aspects of the algorithm’s performance.
5.2 Experimental results
5.2.1. Mining Capacity.
The DGEP algorithms (DGEP-R, DGEP-M, and DGEP) consistently outperformed the standard GEP and other improved algorithms (NMO-SARA and MS-GEP-A). Notably, DGEP achieved the highest fitness values across most functions, with significant advantages for functions F2 and F3 (Fig 2(a), F5, and F6 (Fig 2(b)). In complex functions, such as F7 (Fig 2(c)) and F9 (Fig 2(c)), DGEP maintains a superior solution quality, highlighting its effectiveness.
The R-squared values across functions were higher for the DGEP algorithms, particularly for complex functions such as F7 (Fig 3(c)), F9, and F10 (Fig 3(c)). This indicates that the DGEP algorithms have better predictive capability and model fit than GEP and other algorithms. DGEP outperforms in terms of R- square, notably in F7 and F8, where GEP has values of 0.876 and 0.908, respectively, but DGEP achieves 0.915 and 0.942, which shows better predictive accuracy.
5.2.2. Population diversity.
Fig 4 demonstrates that DGEP and DGEP-R show higher population diversity than the standard GEP. This is critical for ensuring that the algorithms do not converge prematurely and that they can explore a wider solution space. DGEP demonstrates better population diversity maintenance, especially in challenging cases such as F3 (DGEP: 0.7449, GEP: 0.3322), indicating that DGEP has significantly higher diversity and avoids premature convergence better than GEP.
5.2.3. The ability to escape local optima.
Fig 5 illustrates that DGEP and DGEP-M showed significantly higher values for the number of generations taken to escape the local optima. This is a crucial metric for avoiding stagnation in evolutionary algorithms. In F7, DGEP achieves the highest value (5060.1), demonstrating that it can effectively continue exploring new subspaces even after many generations, whereas GEP exhibits a lower value (3262.7).
The results indicate that DGEP achieved a 4.4% improvement in R² score for symbolic regression, a 29.3% reduction in RMSE for function approximation, an 8.6% increase in classification accuracy, and a 23.7% reduction in mean squared error (MSE) in high-dimensional regression. These findings confirm that DGEP’s dynamic parameter adjustments enhance solution accuracy, convergence stability, and search efficiency across diverse problem spaces. Table 4 presents the extended benchmarking of DGEP across diverse problem domains.
The improvements of the proposed DGEP algorithm over standard GEP are evident in population diversity, optimization accuracy, and the ability to escape local optima while maintaining computational efficiency. The Adaptive Regeneration Operator (DGEP-R) enhances diversity by introducing new individuals at critical points, ensuring that stagnation does not occur. The Dynamically Adjusted Mutation Operator (DGEP-M) effectively balances exploration and exploitation by modifying the mutation rate based on fitness trends. DGEP outperformed standard GEP in 8 out of 10 benchmark functions, with an average fitness improvement of 15.7% (R² score) and a 2.3 × increase in population diversity, as shown in Table 6. Furthermore, DGEP demonstrated a 35% improvement in escaping local optima, ensuring better convergence to the global optimum without excessive computational overhead.
Reported results are statistically significant and not due to random variations. A paired t-test was performed on 30 independent runs per benchmark function, and mean, standard deviation, confidence intervals, and p-values were computed. The results indicate that DGEP consistently outperforms Standard GEP across all tested domains, with statistically significant differences in most cases ().
The mean ± standard deviation analysis shows that DGEP achieves a higher R² score in symbolic regression, lower RMSE in function approximation, higher accuracy in classification tasks, and lower mean squared error in high-dimensional regression. The t-statistics further confirm that the observed differences are statistically meaningful. The statistical analysis of DGEP and standard GEP summarized in Table 5.
The p-values indicate that DGEP’s improvements are statistically significant (p < 0.05) in all cases, confirming that the enhancements observed are not due to random chance but are a direct result of DGEP’s adaptive strategies. Table 6 presents the comparison of standard GEP and DGEP.
The computational performance of DGEP remains strong with additional benefits from better diversity and convergence results. The research validates dynamic genetic operator modification as an effective method to build a superior evolutionary algorithm.
6. Conclusions and future work
The enhanced version of the GEP algorithm named DGEP was proposed to extend the traditional approach through dynamic genetic operator parameters adjustment which enhances population diversity metrics alongside improving search speed and global solutions discovery. The Adaptive Regeneration Operator (DGEP-R) together with the DGEP-M helps standard GEP perform better exploration along with exploitation capabilities. The experimental findings showed that DGEP surpassed baseline GEP models together with other advanced variants in symbolic regression tasks since it delivered better fitness accuracy and better population diversity with improved convergence properties.
The improved features of DGEP require attention because they present some implementation obstacles. The algorithm keeps complexity but the computational burdens increase moderately due to dynamic operator adjustments when working on big-scale problems. The performance of DGEP requires precise adjustments in mutation rates and regeneration settings because these adjustments make it sensitive to hyperparameters. Our parameter sensitivity analysis yielded optimal results yet an automatic parameter calibration system would supply additional generalization capabilities to the system. DGEP has shown success on symbolic regression tests yet researchers need additional assessments to determine its effectiveness for optimizing problems involving large numbers of dimensions or real-world applications. The stability of DGEP in dynamic environments remains questionable when mutation rates are increased during later generations thereby requiring better mutation control strategies.
Future investigations should research adaptive control methods of hyperparameters using reinforcement learning or evolutionary strategies so DGEP achieves better performance results. The combination of DGEP with neural networks and swarm intelligence as well as reinforcement learning methods would enhance the optimization performance by improving search speed and convergence speed while making the approach suitable for complex problem domains. The study must examine how well DGEP scales for large high-dimensional optimization challenges especially when working with simultaneous multiple goals alongside time-sensitive adapting components. Additional research must confirm how effectively DGEP achieves results in biological, financial, and industrial enterprise optimization through real-world tests versus current evolutionary optimized systems. The resolution of these present obstacles will transform DGEP into an improved framework for solving real-world as well as high-dimensional optimization tasks.
Supporting information
S3 Data. Test data on the ability to escape local optima.
https://doi.org/10.1371/journal.pone.0321711.s003
(XLSX)
References
- 1. Ferreira C. Gene expression programming: a new adaptive algorithm for solving problems. arXiv preprint cs/0102027, 2001.
- 2.
Ferreira C. Gene expression programming: mathematical modeling by artificial intelligence. Springer; 2006.
- 3. Mandal K, Kalita K, Chakraborty S. Gene expression programming for parametric optimization of an electrochemical machining process. Int J Interact Des Manuf. 2022;17(2):649–66.
- 4. Khan MA, et al. Geopolymer concrete compressive strength via an artificial neural network, adaptive neuro-fuzzy interface system, and gene expression programming with K-fold cross-validation. Front Materials. 2021; 8:621163.
- 5. Iqbal MF, Liu Q-F, Azim I, Zhu X, Yang J, Javed MF, et al. Prediction of mechanical properties of green concrete incorporating waste foundry sand based on gene expression programming. J Hazard Mater. 2020;384:121322. pmid:31604206
- 6. Kadkhodaei MH, Ghasemi E, Mahdavi S. Modelling tunnel squeezing using gene expression programming: a case study. Proc Inst Civ Eng Geotech Eng. 2023;176(6):567–81.
- 7. Bansal N, Singh D, Kumar M. Computation of energy across the type-C piano key weir using gene expression programming and extreme gradient boosting (XGBoost) algorithm. Energy Rep. 2023;9:310–21.
- 8. Khan MA, Memon SA, Farooq F, Javed MF, Aslam F, Alyousef R. Compressive strength of fly‐ash‐based geopolymer concrete by gene expression programming and random forest. Adv Civ Eng. 2021;2021(1):6618407.
- 9. Wang Q, Ahmad W, Ahmad A, Aslam F, Mohamed A, Vatin NI. Application of soft computing techniques to predict the strength of geopolymer composites. Polymers (Basel). 2022;14(6):1074. pmid:35335405
- 10. Roy S, Ghosh A, Das AK, Banerjee R. A comparative study of GEP and an ANN strategy to model engine performance and emission characteristics of a CRDI assisted single cylinder diesel engine under CNG dual-fuel operation. J Nat Gas Sci Eng. 2014;21:814–28.
- 11. Mostafa MM, El-Masry AA. Oil price forecasting using gene expression programming and artificial neural networks. Econ Model. 2016;54:40–53.
- 12. Yuan C, Qin X, Yang L, Gao G, Deng S. A novel function mining algorithm based on attribute reduction and improved gene expression programming. IEEE Access. 2019;7:53365–76.
- 13. Ding J, Jiang T, Tan P, Wang Y, Fei Z, Huang C, et al. An improved gene expression programming algorithm for function mining of map-reduce job execution in catenary monitoring systems. PLoS One. 2023;18(11):e0290499. pmid:37972061
- 14. Keijzer M. Improving symbolic regression with interval arithmetic and linear scaling. In European Conference on Genetic Programming. Springer; 2003. p. 70–82.
- 15. Uy NQ, Hoai NX, O’Neill M, McKay RI, Galván-López E. Semantically-based crossover in genetic programming: application to real-valued symbolic regression. Genet Program Evolvable Mach. 2010;12(2):91–119.
- 16. Lu Q, Zhou S, Tao F, Luo J, Wang Z. Enhancing gene expression programming based on space partition and jump for symbolic regression. Inf Sci. 2021;547:553–67.
- 17.
Bautu E, Bautu A, Luchian H. Adagep-an adaptive gene expression programming algorithm. In Ninth International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2007). IEEE; 2007. p. 403–6.
- 18. Lu Q, Xu C, Luo J, Wang Z. AB-GEP: Adversarial bandit gene expression programming for symbolic regression. Swarm Evol Comput. 2022;75:101197.
- 19. Ahmad A, Chaiyasarn K, Farooq F, Ahmad W, Suparp S, Aslam F. Compressive strength prediction via gene expression programming (GEP) and artificial neural network (ANN) for concrete containing RCA. Buildings. 2021;11(8):324.
- 20. Maghawry A, Hodhod R, Omar Y, Kholief M. An approach for optimizing multi-objective problems using hybrid genetic algorithms. Soft Comput. 2020;25(1):389–405.
- 21. Alabduljabbar H, Amin MN, Eldin SM, Javed MF, Alyousef R, Mohamed AM. Forecasting compressive strength and electrical resistivity of graphite based nano-composites using novel artificial intelligence techniques. Case Stud Constr Mat. 2023;18:e01848.
- 22.
Koza JR. Genetic programming II: automatic discovery of reusable programs. MIT press; 1994.
- 23.
Oltean M, Groşan C, Oltean M. Encoding multiple solutions in a linear genetic programming chromosome. In International Conference on Computational Science. Springer; 2004; p. 1281–8.
- 24. Nazari A, Riahi S. RETRACTED ARTICLE: Predicting the effects of nanoparticles on compressive strength of ash-based geopolymers by gene expression programming. Neural Comput Appl. 2013;23:1677–85.
- 25. Deng S, Yue D, Yang L, Fu X, Feng Y. Distributed function mining for gene expression programming based on fast reduction. PLoS One. 2016;11(1):e0146698. pmid:26751200
- 26. Shiri J. Evaluation of FAO56-PM, empirical, semi-empirical and gene expression programming approaches for estimating daily reference evapotranspiration in hyper-arid regions of Iran. Agric Water Manag. 2017;188:101–14.
- 27. Bărbulescu A. Băutu E. Time series modeling using an adaptive gene expression programming algorithm. Int J Math Models Methods Appl Sci. 2009;3(2):85–93.
- 28. Yuen SY, Chow CK. A genetic algorithm that adaptively mutates and never revisits. IEEE Trans Evol Computat. 2009;13(2):454–72.
- 29. Peng Y, Yuan C, Qin X, Huang J, Shi Y. An improved Gene Expression Programming approach for symbolic regression problems. Neurocomputing. 2014;137:293–301.
- 30. Karakasis VK, Stafylopatis A. Efficient evolution of accurate classification rules using a combination of gene expression programming and clonal selection. IEEE Trans Evol Computat. 2008;12(6):662–78.
- 31. Xiao Y, Cui H, Khurma RA, Castillo PA. Artificial lemming algorithm: a novel bionic meta-heuristic technique for solving real-world engineering optimization problems. Artif Intell Rev. 2025;58(3).
- 32. Xiao Y, Cui H, Hussien AG, Hashim FA. MSAO: A multi-strategy boosted snow ablation optimizer for global optimization and real-world engineering applications. Adv Eng Inf. 2024;61:102464.
- 33. Li K. A Survey of multi-objective evolutionary algorithm based on decomposition: past and future. IEEE Trans Evol Comput. 2024.
- 34. Kalita K, Chakraborty S. An efficient approach for metaheuristic-based optimization of composite laminates using genetic programming. Int J Interact Des Manuf. 2023;17(2):899–916.
- 35. Ghadai RK, Kalita K. Accurate estimation of DLC thin film hardness using genetic programming. Int J Materials Res. 2020;111(6):453–62.
- 36. Kalita K, Mukhopadhyay T, Dey P, Haldar S. Genetic programming-assisted multi-scale optimization for multi-objective dynamic performance of laminated composites: the advantage of more elementary-level analyses. Neural Comput Applic. 2019;32(12):7969–93.
- 37.
Ferreira C. Mutation, transposition, and recombination: an analysis of the evolutionary dynamics. In JCIS. 2002. p. 614–7.
- 38. McDermott J, White DR, Luke S, Manzoni L, Castelli M, Vanneschi L, et al. Genetic programming needs better benchmarks. In Proceedings of the 14th annual conference on Genetic and evolutionary computation. 2012. p. 791–8.