An improved genetic algorithm and its application in neural network adversarial attack

The choice of crossover and mutation strategies plays a crucial role in the search ability, convergence efficiency and precision of genetic algorithms. In this paper, a novel improved genetic algorithm is proposed by improving the crossover and mutation operations of the simple genetic algorithm, and it is verified on 15 test functions. The qualitative results show that, compared with three other mainstream swarm intelligence optimization algorithms, the proposed algorithm not only improves the global search ability, convergence efficiency and precision, but also increases the success rate of converging to the optimal value under the same experimental conditions. The quantitative results show that the algorithm performs best on 13 of the 15 test functions. The Wilcoxon rank-sum test was used for statistical evaluation, showing a significant advantage of the algorithm at the 95% confidence level. Finally, the algorithm is applied to neural network adversarial attacks. The results show that the method needs no structural or parameter information from inside the neural network model, and it can obtain adversarial samples with high confidence in a short time using only the classification and confidence information output by the neural network.


Introduction
In real life, optimization problems such as shortest path, path planning, task scheduling, and parameter tuning are becoming more and more complex, exhibiting features such as nonlinearity, multiple constraints, high dimensionality, and discontinuity (Deng et al., 2021). Although a series of artificial intelligence algorithms represented by deep learning can solve some optimization problems, they lack mathematical interpretability due to the large number of nonlinear functions and parameters inside their models, so they are difficult to apply widely in the field of information security. Traditional optimization algorithms and artificial intelligence algorithms can hardly solve complex optimization problems with high dimensionality and nonlinearity in this field.
Therefore, it is necessary to find an effective optimization algorithm for such problems. Against this background, various swarm intelligence optimization algorithms have been proposed, such as Particle Swarm Optimization (PSO) (Kennedy and Eberhart, 1995; Eberhart and Kennedy, 1995) and the Grey Wolf Optimizer (GWO) (Mirjalili et al., 2014). Subsequently, a variety of improved optimization algorithms have also been proposed, for example, an improved genetic algorithm for cloud environment task scheduling (Zhou et al., 2020), an improved genetic algorithm for flexible job shop scheduling (Zhang et al., 2020), and an improved genetic algorithm for green fresh food logistics (Li et al., 2020).
However, these improved optimization algorithms target domain-specific optimization problems and do not improve the accuracy, convergence efficiency, or generality of the algorithms themselves. In this paper, the crossover operator and mutation operator of the genetic algorithm are improved to increase the convergence efficiency and precision of the algorithm without affecting its effectiveness on most optimization problems. The effectiveness of the improved genetic algorithm is also verified through extensive comparison experiments and through an application in the field of neural network adversarial attacks.
The main contributions of this paper are as follows:
• The single-point crossover of the SGA is improved: the fitness function is used as the evaluation index for selecting the child after crossover, which reduces the number of iterations and accelerates convergence.
• The basic bitwise mutation of the SGA is improved: each gene of the offspring is traversed and selectively mutated, with different mutation rates for the two halves of a chromosome, which improves the global search while a local optimum remains stable.
• The improved genetic algorithm is applied to the field of neural network adversarial attack, which speeds up the generation of adversarial samples and helps improve the robustness of the neural network model.

Genetic Algorithm
The Genetic Algorithm is a family of simulated evolutionary algorithms proposed by Holland et al. (1975) and later systematized by De Jong, Goldberg, and others. The general flowchart of the Genetic Algorithm is shown in Figure 1. The algorithm first encodes the problem, then calculates the fitness, then selects the father and mother individuals by roulette-wheel selection, and generates children with high fitness through crossover and mutation; after many iterations it produces individuals with high fitness, which constitute a satisfactory or optimal solution to the problem. The Simple Genetic Algorithm (SGA) uses single-point crossover and simple mutation to realize information exchange between individuals and local search; it does not rely on gradient information, so it can find the global optimal solution.
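As a minimal Python sketch of the roulette-wheel selection step described above (function and variable names are illustrative, not taken from the paper's code):

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    r = random.uniform(0, total)  # spin the wheel
    acc = 0.0
    for individual, fitness in zip(population, fitnesses):
        acc += fitness
        if acc >= r:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Because selection probability scales with fitness, high-fitness parents are drawn more often, but low-fitness individuals still have a nonzero chance, which preserves diversity.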

Other Meta-heuristic Algorithms
The meta-heuristic algorithm is problem-independent: it does not exploit the specifics of a problem and serves as a general solver. In general, it is not greedy and can explore the solution space broadly. The PSO algorithm (Kennedy and Eberhart, 1995; Eberhart and Kennedy, 1995) is a swarm intelligence-based global stochastic search algorithm inspired by research on artificial life and by simulating the migration and flocking behavior of bird flocks during foraging. The GWO algorithm is a swarm intelligence optimization algorithm proposed by Mirjalili et al. (2014); it is inspired by the prey-hunting activity of grey wolves and has strong convergence performance, few parameters, and easy implementation. The Marine Predator Algorithm (MPA) (Faramarzi et al., 2020) is mainly inspired by foraging strategies widely found in marine predators, namely Lévy and Brownian motion, together with optimal encounter rate strategies in biological interactions between predators and prey. The Artificial Gorilla Troops Optimizer (GTO) (Abdollahzadeh et al., 2021b) was inspired by the group life behavior of gorillas and is characterized by fast search speed and high solution accuracy. The African Vultures Optimization Algorithm (AVOA) (Abdollahzadeh et al., 2021a) was inspired by the foraging and navigation behavior of African vultures; it is fast, has high solution accuracy, and is widely used in single-objective optimization. The Remora Optimization Algorithm (ROA) (Jia et al., 2021) is an intelligent optimization algorithm inspired by the biological habits of the remora in nature; it offers good solution accuracy and high practical engineering value both in finding extreme values of functions and in typical engineering optimization problems.
Szegedy et al. (2013) first demonstrated that a highly accurate deep neural network can be misled into a misclassification by adding to an image a slight perturbation that is imperceptible to the human eye, and also found that the robustness of deep neural networks can be improved by adversarial training. Such phenomena are far-reaching and have attracted many researchers to the area of adversarial attacks and deep learning security. Akhtar and Mian (2018) surveyed 12 attack methods and 15 defense methods for neural network adversarial attacks. The main attack methods are finding the minimum additive term of the loss function (Szegedy et al., 2013), increasing the loss function of the classifier (Kurakin et al., 2016), limiting the l0 norm (Papernot et al., 2016), and changing only one pixel value (Su et al., 2019). Nguyen et al. (2015) continued to explore the question of "what differences remain between computer and human vision" raised by Szegedy et al. (2013). They used an Evolutionary Algorithm to generate high-confidence adversarial images by iterating over directly encoded images and CPPN (Compositional Pattern-Producing Network) encoded images, respectively. They obtained high-confidence adversarial samples (fooling images) with the Evolutionary Algorithm on a LeNet model pre-trained on the MNIST dataset (LeCun, 1998) and an AlexNet model pre-trained on the ILSVRC 2012 ImageNet dataset (Deng et al., 2009; Russakovsky et al., 2015), respectively.

Neural Network Adversarial Attack
Neural network adversarial attacks are divided into black-box attacks and white-box attacks. Black-box attacks do not require the internal structure or parameters of the neural network; adversarial samples can be generated with optimization algorithms as long as the output classification and confidence information is known. The study of neural network adversarial attacks not only helps to understand the working principles of neural networks but also increases their robustness when they are trained with adversarial samples.

Approaches
This section improves the single-point crossover and simple mutation of the SGA. The fitness function is used as the evaluation index in the crossover step, and every crossover point of the whole chromosome is tried in order to improve the efficiency of the search for the optimum. A selective mutation is performed on each gene of the children's chromosomes, and the mutation rate of the latter half of the chromosome is set to twice that of the first half to improve the global search while a local optimum remains stable.

Improved Crossover Operation
Algorithm 1 gives the Python pseudocode of the improved crossover operation. The single-point crossover of the SGA generates a random number within the range of the parental chromosome length and then splices the first part of the father's chromosome with the remaining part of the mother's chromosome at that point to breed the child. In this paper, the operation is improved by trying every crossover point within the chromosome length one by one, calculating the fitness of each resulting child, and keeping the child with the highest fitness. Experimental data show that this improvement reduces the number of iterations and speeds up the convergence of fitness.
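A minimal sketch of this idea (identifiers are illustrative; the paper's Algorithm 1 is not reproduced verbatim here):

```python
def improved_crossover(father, mother, fitness):
    """Try every crossover point and keep the fittest child.

    father and mother are equal-length gene lists; fitness is a callable
    scoring a whole chromosome.
    """
    best_child, best_fitness = None, float("-inf")
    for point in range(1, len(father)):
        # splice the father's prefix with the mother's suffix at this point
        child = father[:point] + mother[point:]
        score = fitness(child)
        if score > best_fitness:
            best_child, best_fitness = child, score
    return best_child
```

Compared with the SGA's single random crossover point, this exchanges one extra pass over the chromosome (and the corresponding fitness evaluations) for a guaranteed best child from the two parents.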

Improved Mutation Operation
Algorithm 2 gives the pseudocode of the improved mutation operation. The simple mutation of the SGA sets a relatively large mutation rate and mutates one randomly chosen gene of the incoming child's chromosome when the generated random number is smaller than the mutation rate. In this paper, we improve the operation by setting a small mutation rate and then selectively mutating each gene of the incoming child's chromosome: when the generated random number is smaller than the mutation rate, the gene is mutated, and when the traversed gene position is beyond half of the chromosome length, the mutation rate is doubled (the second half of the genes has relatively less influence on the result). This gives the first half and the second half of the chromosome their own chances of mutation, and both can mutate in the same pass. When the gene length is 784, the probability that the whole chromosome mutates is 1 − (1 − 0.025)^392 · (1 − 0.05)^392, which greatly improves species diversity while preserving species stability (improving the global search ability while a local optimum remains stable); experimental data show that it improves the search capability.
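A sketch of this per-gene mutation pass for a binary chromosome (identifiers are illustrative, and the rates 0.025/0.05 are the ones implied by the whole-chromosome probability above):

```python
import random

def improved_mutation(child, base_rate=0.025):
    """Selectively mutate each gene of a binary chromosome in one pass.

    Genes in the second half flip at twice the base rate, matching the
    factors in 1 - (1 - 0.025)**392 * (1 - 0.05)**392 for length 784.
    """
    half = len(child) // 2
    mutated = []
    for position, gene in enumerate(child):
        rate = base_rate if position < half else 2 * base_rate
        # flip the binary gene when the random draw falls below the rate
        mutated.append(1 - gene if random.random() < rate else gene)
    return mutated
```

For length 784 the whole-chromosome mutation probability is essentially 1 (since (1 − 0.025)^392 ≈ 5 × 10⁻⁵), so nearly every child differs from its parent somewhere, while each individual gene stays stable.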

Test Functions
In order to evaluate the optimization performance of the proposed improved genetic algorithm, 15 representative test functions from the AVOA paper of Abdollahzadeh et al. (2021a) and from Wikipedia (2021) are selected in this paper. Since the proposed improved genetic algorithm is mainly intended for the neural network adversarial attack problem, and neural networks have multi-dimensional parameters, the test functions are evaluated in 30, 50, and 100 dimensions, respectively. The formula, dimensions, range, and minimum of the 15 test functions are shown in Table 1, Table 2, and Table 3.

Experimental Environment
The hardware environment of the experiment includes 8 GB of RAM and an i7-4700MQ CPU; the software environment includes the Windows 10 system and Python 3.8.8. In order to compare optimization performance, the IGA is tested against the SGA (Simple Genetic Algorithm), PSO, and GWO, with the parameter settings shown in Table 4.

Qualitative Result Analysis
As shown in Figure 3, Figure 4, Figure 5 and Figure 6, F12-F15 are used to evaluate the qualitative behavior of the IGA. Each optimization algorithm was run 10 times on F12-F15 under the same experimental conditions. "Population distribution" is the scatter plot of the distribution of all individuals of each optimization algorithm over the 10 experiments, with the density computed by Formula (1) and population size = 50. "Best record" is the scatter plot of the distribution of the best individual of each experiment, with the intensity computed by Formula (2). The figures show that the density of the best individuals of the IGA in each round of experiments is better than that of the other three optimization algorithms, and the IGA also retains a strong global search capability in the last iteration. On the F15 test function, the SGA, PSO, and GWO fall into local optima several times. From the convergence curves, the IGA converges before the other three optimization algorithms, and its precision after convergence is better.

Quantitative Result Analysis
In order to make a quantitative comparison with the other three mainstream optimization algorithms, the four optimization algorithms were each run independently 10 times on the F1-F11 test functions in 30, 50, and 100 dimensions, respectively. The purpose of the high-dimensional function tests is to verify the convergence superiority of the IGA in high-dimensional spaces for application to neural network adversarial attacks. Table 5, Table 6, and Table 7 give the results on test functions F1-F11 in 30, 50, and 100 dimensions, respectively. Table 8 shows the results of the four optimization algorithms on test functions F12-F15. The best result, worst result, mean, median, standard deviation, and P-value are compared over the 10 experiments, where the P-value is the result of the Wilcoxon rank-sum statistical test and a P-value below 0.05 indicates a significant difference.
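The Wilcoxon rank-sum comparison can be reproduced with `scipy.stats.ranksums`; the stdlib-only sketch below uses the normal approximation (adequate for two samples of size 10 without ties), with illustrative data rather than the paper's measurements:

```python
import math
from statistics import NormalDist

def ranksum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum P-value via the normal approximation.

    Assumes no ties between the pooled values; scipy.stats.ranksums
    computes the same statistic.
    """
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    # sum of 1-based ranks of the x sample within the pooled ordering
    rank_sum = sum(rank + 1 for rank, (_, label) in enumerate(pooled)
                   if label == "x")
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2          # expected rank sum under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))
```

A P-value below 0.05 rejects, at the 95% level, the hypothesis that the two sets of 10-run results come from the same distribution.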
In Table 5, the IGA achieves significantly superior performance on 9 test functions, PSO is better on F3, and the SGA is slightly better on F8. In Table 6 and Table 7, the IGA achieves significantly superior performance on 10 test functions, while PSO performs better on F3. It can be seen that the performance loss of the IGA with increasing dimensionality is smaller than that of the other three optimization algorithms. In Table 8, the IGA achieves significantly superior performance on 3 test functions, and PSO performs slightly better on F14.
In general, the IGA has better iteration efficiency, global search capability, and convergence success rate than the other three optimization algorithms (Yasue, 2018).

Implementation
As shown in Figure 7, a population of a specified size (set to 100 in this paper) is first generated and then input to the neural network to obtain the confidence of the specified label. To reduce the computational expense, the input is reduced to a 28 × 28 binary image, and the randomly generated binary images are iterated with the IGA proposed in this paper. Among the 100 individuals, fathers and mothers with relatively high confidence are selected by roulette-wheel selection, children are then generated with the improved crossover operation, and the children form a new population through the improved mutation operation, until the specified number of iterations is reached. Finally, the individual with the highest confidence is picked from the 100 individuals; this is the binary image that receives the highest confidence from the neural network.
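Putting the pieces together, the generation loop just described can be sketched as below; `confidence_fn` stands in for the pre-trained DCNN's confidence output for the target label, and all names and parameters are illustrative:

```python
import random

def generate_fooling_image(confidence_fn, pop_size=100, n_genes=28 * 28,
                           iterations=99, base_rate=0.025):
    """Black-box adversarial search using only the model's confidence output.

    confidence_fn maps a flat binary image (list of n_genes bits) to the
    confidence of the target class; the real experiments query a DCNN
    pre-trained on MNIST.
    """
    def select(pop, scores):
        # roulette wheel: probability proportional to confidence
        r, acc = random.uniform(0, sum(scores)), 0.0
        for individual, score in zip(pop, scores):
            acc += score
            if acc >= r:
                return individual
        return pop[-1]

    def crossover(father, mother):
        # improved crossover: try every point, keep the fittest child
        return max((father[:p] + mother[p:] for p in range(1, n_genes)),
                   key=confidence_fn)

    def mutate(child):
        # improved mutation: second-half genes flip at twice the base rate
        half = n_genes // 2
        return [1 - g if random.random() < (base_rate if i < half
                                            else 2 * base_rate) else g
                for i, g in enumerate(child)]

    population = [[random.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(iterations):
        scores = [confidence_fn(ind) for ind in population]
        population = [mutate(crossover(select(population, scores),
                                       select(population, scores)))
                      for _ in range(pop_size)]
    return max(population, key=confidence_fn)
```

Reducing the input to a 28 × 28 binary image keeps the chromosome length at 784, so only the model's label and confidence outputs are ever queried, never its weights or gradients.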

Results
As shown in Figure 8, the confidence of the DCNN for sample "2" is 99.98% after 99 iterations. Samples "6" and "4" converge most slowly: after 99 iterations the confidence of sample "6" is 78.84%, and the confidence of sample "4" is 78.84%. The statistics of the experimental results are shown in Table 9. The binary image of sample "1" generated after 999 iterations is assigned a confidence of 99.94% by the DCNN, which is much higher than the confidence of sample "1" from the MNIST test set in the DCNN control group. In the statistics of the results after initializing the population with the MNIST test set, the increase in confidence during iteration is smaller, because the overall confidence of a population initialized with the test set is already higher. The confidence of the sample selected from the MNIST test set is 99.56%; after 10 iterations the confidence is 99.80% and the digit "1" becomes vertical; after 89 iterations the confidence is 99.98% and the digit "1" gradually tends to "decompose". As shown in Figure 9, the likely reason is that the confidence, as a function of the image input, is a multi-peak function, and the interval in which the test set images are distributed is not the highest peak of the confidence function. This causes the images generated by the IGA to "stray" from some pixels of the initial test set population.

Conclusion
The comparison and simulation experiments show that the improvements proposed in this paper are effective and greatly improve the convergence efficiency, global search capability, and convergence success rate. Applying the IGA to the field of neural network adversarial attacks can also quickly produce adversarial samples with high confidence, which is meaningful for improving the robustness and security of neural network models.
Although this paper improves the genetic algorithm and enhances its performance, the method is still based on the genetic algorithm, so it cannot be completely separated from the general framework of the genetic algorithm, and the problem that a single iteration of the genetic algorithm is relatively slow remains unsolved. We hope to explore a new nature-inspired optimization algorithm in future work. In addition, we believe that the reason the neural network model has so many adversarial samples is a design flaw in the architecture of the neural network model. In future work, we will also try to explore a completely new neural network infrastructure so as to compress the space of adversarial samples.
With the wide application of artificial intelligence and deep learning in the field of computer vision, face recognition performs outstandingly in access control systems and payment systems, which require a fast response to the input face image; however, this has instead become a weakness to be exploited. For face recognition systems without liveness detection, the method in this paper can quickly obtain high-confidence images using only the output labels and confidence information. In summary, neural networks have many pitfalls due to their lack of interpretability, and their use in important areas still needs to be considered carefully.

Figure 3: Qualitative results for the F12 function
Figure 4: Qualitative results for the F13 function

Figure 5: Qualitative results for the F14 function
Figure 6: Qualitative results for the F15 function
As shown in Figure 7(a), the Deep Convolutional Neural Network (DCNN) pre-trained on the MNIST dataset (LeCun, 1998) is used as the experimental object in this paper; the accuracy of the model is 99.35% with a loss value of 0.9632. As shown in Figure 7(b), the model of the network adversarial attack is presented.

Figure 7: The model of the network adversarial attack

Figure 8: The confidence change of the binary image after iteration

Figure 9: The confidence curve of a binary image

Numerical Experiments and Analysis
Algorithm 1: Cross the parents at every possible point and select the best child by the fitness function
Algorithm 2: Mutate the child by altering each gene when a random number is less than the mutation rate

Table 1: Details of unimodal test functions
Table 1 contains the multi-dimensional unimodal test functions, Table 2 the multi-dimensional multi-modal test functions, and Table 3 the fixed-dimension test functions.

Table 2: Details of multi-modal test functions

Table 3: Details of fixed-dimension test functions

Table 4: The parameter settings

Table 8: Test results on the test functions (F12-F15) with fixed dimensions
The MNIST dataset (LeCun, 1998) is one of the most well-known datasets in the field of machine learning and is used in applications ranging from simple experiments to published research. It consists of handwritten digit images from 0 to 9. The MNIST image data are single-channel grayscale images of 28 × 28 pixels, with each pixel taking a value between 0 and 255; there are 60,000 samples in the training set and 10,000 samples in the test set. The general usage of the MNIST dataset is to learn on the training set first and then use the learned model to measure how well the test set can be correctly classified.

Table 9: Statistical table of experimental results