Abstract
The parameter values of a neural network directly affect its performance, so choosing an appropriate parameter tuning method is important for improving network performance. In this paper, an improved Beluga Whale Optimization is used for hyperparameter optimization of the ResNet model, yielding a new model, EBWO-ResNet. Firstly, to enrich the initial population of the original Beluga Whale Optimization, the Tent chaotic map is introduced into the algorithm, producing a new algorithm, EBWO. Secondly, to address the low accuracy and difficult parameter tuning of ResNet, the EBWO algorithm is integrated into ResNet to construct the new model EBWO-ResNet. Finally, to verify the effectiveness of the EBWO algorithm, it was applied to three engineering problems and compared with five other swarm intelligence algorithms; to verify the effectiveness of the EBWO-ResNet model, it was applied to maize disease identification, with the aim of improving recognition accuracy and safeguarding maize yield, and compared with seven other models on three evaluation indexes. The experimental results show that EBWO provides the best solutions on the three engineering problems, and EBWO-ResNet performs best in identifying maize diseases, with an accuracy of 96.3%, which is 0.2-1.5 percentage points higher than that of the other models.
Citation: Liu H, Qu S, Zhang S, Zhang Y, Li Y (2025) Hyperparameter optimization ResNet by improved Beluga Whale Optimization. PLoS One 20(10): e0333575. https://doi.org/10.1371/journal.pone.0333575
Editor: Ananth JP, Dayananda Sagar University, INDIA
Received: May 9, 2025; Accepted: September 16, 2025; Published: October 24, 2025
Copyright: © 2025 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: This study was financially supported by the Jilin Provincial Department of Education in the form of a project award (JJKH20240243KJ) received by YL. No additional external funding was received for this study.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Neural networks have attracted attention since they were first proposed, especially for datasets with a large number of samples, on which they can be trained and fitted. Because feature extraction is embedded in the network structure, a neural network can extract features autonomously without manual feature engineering, which improves the quality of the extracted features and benefits subsequent tasks. Owing to these advantages, neural networks are trusted and widely used in medicine [1], agriculture [2], fault diagnosis [3], and other fields. However, as application demands have grown, the shortcomings of traditional neural networks have gradually become apparent, and requirements on their performance have increased. Consequently, a large body of research has focused on neural network optimization, and many neural network variants have appeared.
Wang et al. designed a hybrid CNN (HCNN) based on global reasoning [4], in which a single-layer three-dimensional convolutional neural network and a single-layer two-dimensional convolutional neural network jointly extract features, and an SS-GloRe-Unit was designed to fully extract global features. To improve the performance of ResNet, Yang et al. proposed an MA-ResNet model based on multiple attention mechanisms [5]; the experimental results show that the proposed model converges faster, is more accurate, and classifies small targets better than other feature extraction models. Yang et al. proposed an rE-GoogLeNet convolutional neural network model [6], which replaced the convolution kernel of the first layer, added the ECA attention mechanism to the Inception module, used a residual-connection improvement of the module, replaced the ReLU activation function with leaky ReLU, and finally simplified the auxiliary classifier; the simulation results show that the proposed model performs best. It can be seen that optimizing a neural network can indeed improve model performance; in particular, the internal parameter values of the network directly affect its performance, so selecting appropriate parameter values is of great significance. Traditional neural network parameter tuning relies mainly on manual tuning, grid search, and random search, which are inefficient and time-consuming; in particular, it is difficult to find the optimal parameter combination by manual tuning.
In recent years, meta-heuristic algorithms have been applied to complex optimization problems because of their powerful search ability, with good results, and they have also shown strong performance when applied to neural network parameter optimization. Guo et al. first proposed the dung beetle optimized convolutional neural network (DBO-CNN) [7], in which the dung beetle optimization algorithm was used to optimize the hyperparameters of the convolutional neural network, and the experimental results show that the accuracy of DBO-CNN reached 97.93%. Wang et al. proposed an adaptive dimensional Gaussian mutation PSO (ADGMPSO) algorithm [8] and applied it to the parameter configuration of a convolutional neural network; the experimental results show that the optimized network has higher accuracy and generalization ability. To improve the performance of CNNs, Gadekallu et al. optimized CNN hyperparameters with the Harris Hawks Optimization (HHO) meta-heuristic, and comparative analysis showed that the proposed hybrid HHO-CNN model [9] performed better than existing models. Heng et al. proposed a new hybrid neural network model based on algorithm optimization [10], in which a long short-term memory (LSTM) network captures the time-series features and depth features of the CNN output, and the slime mould algorithm (SMA) adaptively configures the hyperparameters of the LSTM.
In summary, it is feasible to apply meta-heuristics in place of traditional manual parameter tuning of neural networks. In this paper, the Beluga Whale Optimization is improved with the Tent chaotic map, and the improved algorithm is integrated into the ResNet model so that the parameters are modified adaptively. The rest of this article is organized as follows:
The second section details the methods proposed in this paper. The third section describes the datasets, experimental configuration, and experimental results. The fourth section summarizes the experiments and outlines future research directions.
Experimental methods
Improved Beluga Whale Optimization
Beluga Whale Optimization.
The Beluga Whale Optimization (BWO) [11] [12] [13] is a meta-heuristic algorithm developed in 2022 which, like other population-based meta-heuristics, simulates collective behavior in nature, in this case the life behavior of beluga whales. Beluga whales are social animals: they move together, exchange information, hunt, and breed. Beluga whales have natural predators and can also die in accidents, a phenomenon called the "whale fall". The mathematical model of BWO therefore has three stages: swimming, predation, and whale fall. In BWO, each individual is treated as a separate search agent, and for a population of n agents in d dimensions the initial search agent matrix is modeled as

X = [x_{1,1} ... x_{1,d}; x_{2,1} ... x_{2,d}; ...; x_{n,1} ... x_{n,d}]   (1)
In the exploration phase of BWO, the swimming behavior of the beluga whale is mathematically modeled as

X^{t+1}_{i,j} = X^t_{i,pj} + (X^t_{r,p1} − X^t_{i,pj})(1 + r1)·sin(2π·r2), if j is even
X^{t+1}_{i,j} = X^t_{i,pj} + (X^t_{r,p1} − X^t_{i,pj})(1 + r1)·cos(2π·r2), if j is odd   (2)

where t is the current iteration, X^t_{i,j} is the position of the i-th individual in the j-th dimension, pj is a dimension selected at random from the d dimensions, X^t_{r,p1} is the position of a randomly selected individual r, and r1 and r2 are random numbers in the range (0, 1).
In the exploitation stage of BWO, the predatory behavior of beluga whales is mathematically modeled as

X^{t+1}_i = r3·X^t_best − r4·X^t_i + C1·L_F·(X^t_r − X^t_i)   (3)

where X^t_best is the best position found so far, and r3 and r4, like r1 and r2, are random numbers in the range (0, 1). C1 is calculated as

C1 = 2·r4·(1 − t/T)   (4)
where T is the maximum number of iterations. The Levy flight function L_F is calculated as

L_F = 0.05 × (u·σ)/|v|^{1/β}   (5)

σ = [Γ(1+β)·sin(πβ/2) / (Γ((1+β)/2)·β·2^{(β−1)/2})]^{1/β}   (6)

where u and v are random numbers that obey a normal distribution, and β is a default constant.
BWO uses a balance factor Bf to determine whether the algorithm shifts from exploration to exploitation, calculated as

Bf = B0·(1 − t/(2T))   (7)

where B0 varies randomly in (0, 1).
To simulate the whale fall, BWO defines a whale fall step Xstep, calculated as

Xstep = (UB − LB)·exp(−C2·t/T)   (8)

where UB and LB are the upper and lower bounds of the variables, and C2 = 2·Wf·n is the step factor. Wf is calculated as

Wf = 0.1 − 0.05·t/T   (9)
Therefore, the whale fall is mathematically modeled as

X^{t+1}_i = r5·X^t_i − r6·X^t_r + r7·Xstep   (10)

where r5, r6, and r7 are also random numbers in the range (0, 1).
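The three update rules above can be sketched in code. The following NumPy implementation of a single BWO iteration is an illustrative sketch based on the published formulation; the function and variable names are our own, and `beta = 1.5` is the commonly used default for the Levy exponent.

```python
import numpy as np
from math import gamma, pi, sin

def levy_flight(dim, beta=1.5):
    """Levy step L_F = 0.05 * u*sigma / |v|^(1/beta), with u, v ~ N(0, 1)."""
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.randn(dim) * sigma
    v = np.random.randn(dim)
    return 0.05 * u / np.abs(v) ** (1 / beta)

def bwo_step(X, X_best, t, T, lb, ub):
    """One BWO iteration over the population X (n x d matrix of agents)."""
    n, d = X.shape
    X_new = X.copy()
    B0 = np.random.rand(n)
    Bf = B0 * (1 - t / (2 * T))               # balance factor per agent
    for i in range(n):
        r = np.random.randint(n)              # a randomly chosen agent
        if Bf[i] > 0.5:                       # exploration: swimming
            r1, r2 = np.random.rand(2)
            for j in range(d):
                pj = np.random.randint(d)     # randomly chosen dimension
                trig = np.sin if j % 2 == 0 else np.cos
                X_new[i, j] = X[i, pj] + (X[r, pj] - X[i, pj]) * (1 + r1) * trig(2 * pi * r2)
        else:                                 # exploitation: predation
            r3, r4 = np.random.rand(2)
            C1 = 2 * r4 * (1 - t / T)
            X_new[i] = r3 * X_best - r4 * X[i] + C1 * levy_flight(d) * (X[r] - X[i])
    # whale fall: agents with a small balance factor fall and are relocated
    Wf = 0.1 - 0.05 * t / T
    for i in range(n):
        if Bf[i] <= Wf:
            r5, r6, r7 = np.random.rand(3)
            r = np.random.randint(n)
            C2 = 2 * Wf * n                   # step factor
            X_step = (ub - lb) * np.exp(-C2 * t / T)
            X_new[i] = r5 * X[i] - r6 * X[r] + r7 * X_step
    return np.clip(X_new, lb, ub)             # keep agents inside the bounds
```

In a full run, `bwo_step` would be called T times, re-evaluating the fitness of each agent and updating `X_best` after every iteration.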
The algorithm proposed in this paper.
To address the problems that the initial solutions are too random and the population is not diverse enough [14] [15], the Tent chaotic map [16] [17] [18] is introduced into BWO. The Tent map is a discrete map with a simple form but a complex sequence; it generates pseudo-random sequences quickly and distributes them uniformly. It is calculated as

x_{k+1} = x_k/a, if 0 ≤ x_k < a
x_{k+1} = (1 − x_k)/(1 − a), if a ≤ x_k ≤ 1   (11)

where a ∈ (0, 1) is the tent parameter.
Introducing the Tent chaotic map into BWO avoids excessive randomness during initialization, because the Tent map replaces purely random scattering of the population and distributes the individuals relatively evenly over the search space. The improved algorithm is named EBWO.
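A Tent-map population initialization can be sketched as follows. This is an illustrative sketch: the seed `x0 = 0.37` and the symmetric breakpoint `a = 0.5` are our assumptions, as the paper does not state the values used.

```python
import numpy as np

def tent_sequence(length, x0=0.37, a=0.5):
    """Generate a chaotic sequence in [0, 1] with the tent map.
    a is the tent breakpoint; a = 0.5 gives the classic symmetric tent."""
    seq = np.empty(length)
    x = x0
    for k in range(length):
        x = x / a if x < a else (1 - x) / (1 - a)
        seq[k] = x
    return seq

def tent_init(n, d, lb, ub):
    """Initialize an n x d population by mapping a tent sequence into [lb, ub]."""
    chaos = tent_sequence(n * d).reshape(n, d)
    return lb + chaos * (ub - lb)
```

Compared with uniform random sampling, the chaotic sequence spreads the initial agents more evenly, which is the motivation for EBWO's initialization step.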
Improve the neural network model
Basic ResNet model.
Deepening a neural network can indeed improve its accuracy [19] [20] [21], but beyond a certain depth the accuracy drops significantly, contradicting the intuition that "the more layers, the higher the accuracy"; researchers call this the degradation phenomenon. Degradation makes it difficult for the network to learn even an identity mapping and is accompanied by problems such as vanishing and exploding gradients. To address these issues, ResNet [22] [23] [24] introduces shortcut connections that allow the output of one layer to skip one or more layers and connect directly to the input of a subsequent layer. This makes it possible for certain layers to pass information through even without applying any transformation. In this experiment, ResNet50 was selected as the basic model; the structure of ResNet50 is shown in Fig 1.
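The shortcut connection can be illustrated with a minimal sketch: the block computes a residual branch F(x) and adds the unchanged input x before the nonlinearity, so that even when F(x) is near zero the layer still passes information through. This NumPy sketch is illustrative only, not the full ResNet50 bottleneck block.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = ReLU(F(x) + x), where F is a two-layer transform and the
    shortcut is the identity (input added back unchanged)."""
    f = relu(x @ W1) @ W2   # residual branch F(x)
    return relu(f + x)      # shortcut: add the input before the activation
```

If W1 and W2 are all zero, F(x) = 0 and the block degenerates to ReLU(x), which is exactly why a residual layer can "do nothing" without losing information.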
The model proposed in this paper.
In this paper, the traditional parameter tuning method is abandoned, and the improved EBWO algorithm is used to adaptively find the best parameter combination for ResNet; the tuned parameters are the training algorithm, momentum, batch size, number of epochs, and validation frequency. The resulting model is named EBWO-ResNet. The EBWO-ResNet pseudocode is as follows:
1. Set the parameters N and T
2. Initialize the population and calculate the fitness
3. While t < T
4. For each search agent i = 1 to N
5. If Bf > 0.5
6. Update the position according to Eq (2).
7. Else
8. Update the position according to Eq (3).
9. End if
10. End for
11. For each search agent i = 1 to N
12. If Bf <= Wf
13. Update the position according to Eq (10).
14. End if
15. End for
16. t = t + 1
17. End while
18. Output the optimal solution
19. Assign the optimal solution to ResNet50
In the improved EBWO-ResNet model, the EBWO algorithm adaptively searches for the optimal parameter combination and passes it directly back to the ResNet model without manual assignment, as shown in Fig 2.
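The coupling between EBWO and ResNet can be sketched as a decode-and-evaluate step: each continuous EBWO agent is decoded into a concrete hyperparameter setting, and its fitness is the error of a ResNet trained with that setting. The ranges and optimizer names below are hypothetical placeholders for illustration; the paper does not specify the exact search space, and `train_and_validate` stands in for the full ResNet50 training run.

```python
import numpy as np

# Hypothetical search space for the five tuned hyperparameters.
OPTIMIZERS = ["sgdm", "adam", "rmsprop"]

def decode(agent):
    """Map a continuous EBWO agent (5 values in [0, 1]) to hyperparameters."""
    return {
        "optimizer": OPTIMIZERS[min(int(agent[0] * len(OPTIMIZERS)), len(OPTIMIZERS) - 1)],
        "momentum": 0.5 + 0.49 * agent[1],          # 0.5 .. 0.99
        "batch_size": int(8 + agent[2] * 56),       # 8 .. 64
        "epochs": int(5 + agent[3] * 45),           # 5 .. 50
        "val_frequency": int(10 + agent[4] * 90),   # 10 .. 100
    }

def fitness(agent, train_and_validate):
    """EBWO minimizes fitness, so return 1 - validation accuracy of the
    network trained with the decoded hyperparameters."""
    params = decode(agent)
    return 1.0 - train_and_validate(params)
```

After the EBWO loop terminates, the best agent is decoded once more and its hyperparameters are assigned to ResNet50 for the final training, matching step 19 of the pseudocode.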
Experimental results
Experimental setup
All the research in this paper was carried out in the laboratory, and the experiments were implemented in MATLAB. The CPU is an AMD Ryzen 7 5800H with Radeon Graphics, the GPU is an RTX 4060, and the operating system is Windows 11. To verify the effectiveness of the proposed EBWO algorithm, it was applied to three engineering optimization problems and compared with five other meta-heuristic algorithms, namely BWO, the Sine Cosine Algorithm (SCA) [25], the Grey Wolf Optimizer (GWO) [26], the Genetic Algorithm (GA) [27], and Particle Swarm Optimization (PSO) [28]. Each experiment was run 30 times independently and the results were averaged. To verify the effectiveness of the proposed EBWO-ResNet model, it was applied to maize disease identification and compared with ResNet models optimized by seven other meta-heuristic algorithms: BWO, the Fruit Fly Optimization Algorithm (FOA) [29], GWO, PSO, the Firefly Algorithm (FA) [30], Ant Colony Optimization (ACO) [31], and GA. The models were evaluated on accuracy, sensitivity, and precision.
Algorithm comparison experiments
Pressure vessel design.
Pressure vessel design is a classic engineering problem. Containers such as gas storage tanks, oil tanks, and chemical-industry vessels are unavoidable in practice; their safety directly affects product safety, and properties such as sealing and pressure resistance affect product use, so careful vessel design is essential for safe operation. Vessel design involves many factors, such as thickness, size, and structure. In this experiment, four design variables were considered: the thickness TS of the container wall, the thickness TH of the hemispherical head, the inner radius R, and the length L of the cylindrical section. The experimental results are shown in Table 1.
The smaller the fitness value in the table, the better the optimization performance of the algorithm. As can be seen from Table 1, EBWO ranks first with the smallest fitness value, and BWO ranks second with a fitness value 1102.242 higher than EBWO's, indicating a clear gap between the two. SCA ranks third, with a fitness value similar to BWO's but far from EBWO's, indicating that SCA's performance is inferior to EBWO's. GWO, GA, and PSO occupy the last three places with average performance; PSO in particular has the largest fitness value, differing from EBWO's by 7605.585.
Three-bar truss design.
The three-bar truss is one of the most common structures in engineering construction. Its main function is load bearing, so project safety is ensured only if the truss has sufficient bearing capacity. A three-bar truss consists of three rods, so its bearing capacity depends mainly on the rod material, the cross-sectional sizes of the rods, and the connection method. Accordingly, this experiment optimized the cross-sectional areas, and the experimental results are shown in Table 2, where X1 and X2 represent different cross-sectional areas. Table 2 is interpreted in the same way as Table 1: EBWO again provides the lowest fitness value, 263.9106, ranking first, and shows high stability in the optimization of variables X1 and X2.
Tension and compression spring design.
The tension-compression spring is also one of the most commonly used components in engineering. Its main function is to keep other machinery stable and balanced and to ensure the normal operation of the whole mechanical system. To achieve this, the spring must provide sufficient tension, which depends on factors such as the material properties, size, and stiffness coefficient. Accordingly, this experiment optimized three design variables: the wire diameter d, the coil diameter D, and the number of coils p. The experimental results are shown in Table 3, from which it can be seen that the fitness value of EBWO (0.012708) is again better than those of the other algorithms, showing strong overall performance in the optimization of variables X1, X2, and X3.
Model comparison experiments
Datasets.
The dataset is a public dataset from the Kaggle open source platform. It contains a total of 4187 images in 4 categories [32]: blight, common rust, gray leaf spot, and healthy. Examples from the dataset are shown in Fig 3. The dataset is divided into a training set (70%) and a test set (30%).
Comparison results.
The results of ResNet optimized by the different swarm intelligence algorithms are shown in Fig 4, in order: EBWO-ResNet, BWO-ResNet, FOA-ResNet, GWO-ResNet, PSO-ResNet, FA-ResNet, ACO-ResNet, GA-ResNet. Each confusion matrix in Fig 4 has 5 rows and 5 columns: the first four rows are the predicted categories (blight, common rust, gray leaf spot, and healthy) and the last row gives the recognition sensitivity; the first four columns are the true categories and the last column gives the model recognition accuracy. EBWO-ResNet has the highest accuracy at 96.3%; BWO-ResNet ranks second at 96.1%, 0.2 percentage points lower. Among all the models, only EBWO-ResNet and BWO-ResNet exceed 96% accuracy. GWO-ResNet ranks third at 95.6%, 0.7 percentage points below EBWO-ResNet, and FOA-ResNet and ACO-ResNet are equal at 95.5%. The remaining models fall below 95%, with PSO-ResNet and FA-ResNet tied at 94.9% and GA-ResNet lowest at 94.82%, 1.48 percentage points below EBWO-ResNet. In terms of accuracy, therefore, EBWO-ResNet clearly performs best.
Sensitivity and precision results are shown in Fig 5, in order: the sensitivity plot and the precision plot. For the sensitivity index, all models reached 100% in recognizing the healthy class. Although EBWO-ResNet did not achieve the best result on blight, it achieved the best result on common rust, reaching 99.0%. Most importantly, among all the models, only EBWO-ResNet achieved more than 90% on every category, while the other models failed to exceed 90% on gray leaf spot. Overall, EBWO-ResNet therefore also has the best sensitivity. For the precision index, only PSO-ResNet, FA-ResNet, and GA-ResNet failed to reach 100% on the healthy class; all the other models reached 100%. No model exceeded 90% precision on all categories. EBWO-ResNet provided the best precision on blight, reaching 95.2%, while the best of the other models reached only 93.7%, a difference of 1.5 percentage points. On the other two categories, EBWO-ResNet reached 98.2% and 87.2%, respectively, which is also satisfactory. Based on the three evaluation indexes, EBWO-ResNet has the best overall performance.
Conclusion
To solve the problem of low accuracy in maize disease identification, an EBWO-ResNet model was proposed. To improve the performance of the swarm intelligence algorithm, the Tent map was integrated into the BWO algorithm, and the resulting EBWO algorithm was then applied to improve the ResNet model. To verify the effectiveness of EBWO, it was applied to three engineering experiments, where it proved the best and most stable, achieving the best results in all three. To verify the effectiveness of EBWO-ResNet, it was applied to maize disease identification and compared with seven other swarm-intelligence-optimized models; the experimental results show that EBWO-ResNet has the highest accuracy, reaching 96.3%. In summary, the proposed EBWO-ResNet method is feasible and can serve as an auxiliary method for maize disease identification. In future work we will address the limitations of this study: the dataset came from a single open source platform, so we will try to collect data from more platforms and even build experimental fields to collect data ourselves; only one improvement mechanism was used to enhance the BWO algorithm, so further mechanisms can be tried; and since there are many kinds of crops, follow-up work can also apply the method proposed in this paper to the identification of other crops.
References
- 1. Zhang H, Cheng Y, Chen Z, Cong X, Kang H, Zhang R, et al. Clot burden of acute pulmonary thromboembolism: Comparison of two deep learning algorithms, Qanadli score, and Mastora score. Quant Imaging Med Surg. 2022;12(1):66–79. pmid:34993061
- 2. Hasanah SA, Pravitasari AA, Abdullah AS, Yulita IN, Asnawi MH. A deep learning review of ResNet architecture for lung disease identification in CXR image. Appl Sci. 2023;13(24):13111.
- 3. Zhang L, Du J, Dong S, Wang F, Xie C, Wang R. AM-ResNet: Low-energy-consumption addition-multiplication hybrid ResNet for pest recognition. Comput Electron Agric. 2022;202:107357.
- 4. Yu ZT, Zhang L, Kim J. The performance analysis of PSO-ResNet for the fault diagnosis of vibration signals based on the pipeline robot. Sensors. 2023;23(9).
- 5. Wang W, Ma X, Leng L, Wang Y, Liu B, Sun J. A hybrid CNN based on global reasoning for hyperspectral image classification. IEEE Geosci Remote Sensing Lett. 2022;19:1–5.
- 6. Yang L, Zhong J, Zhang Y, Bai S, Li G, Yang Y, et al. An improving faster-RCNN with multi-attention ResNet for small target detection in intelligent autonomous transport with 6G. IEEE Trans Intell Transport Syst. 2023;24(7):7717–25.
- 7. Yang L, Yu X, Zhang S, Long H, Zhang H, Xu S, et al. GoogLeNet based on residual network and attention mechanism identification of rice leaf diseases. Comput Electron Agric. 2023;204:107543.
- 8. Guo XH, et al. Speaker recognition based on dung beetle optimized CNN. Appl Sci-Basel. 2023;13(17).
- 9. Wang CX, Shi TT, Han DN. Adaptive dimensional gaussian mutation of PSO-optimized convolutional neural network hyperparameters. Appl Sci-Basel. 2023;13(7).
- 10. Gadekallu TR, Srivastava G, Liyanage M, M. I, Chowdhary CL, Koppu S, et al. Hand gesture recognition based on a Harris Hawks optimized Convolution Neural Network. Comput Electr Eng. 2022;100:107836.
- 11. Heng F, Gao J, Xu R, Yang H, Cheng Q, Liu Y. Multiaxial fatigue life prediction for various metallic materials based on the hybrid CNN-LSTM neural network. Fatigue Fract Eng Mat Struct. 2023;46(5):1979–96.
- 12. Yuan H, Chen Q, Li H, Zeng D, Wu T, Wang Y, et al. Improved beluga whale optimization algorithm based cluster routing in wireless sensor networks. Math Biosci Eng. 2024;21(3):4587–625. pmid:38549341
- 13. Huang J, Hu H. Hybrid beluga whale optimization algorithm with multi-strategy for functions and engineering optimization problems. J Big Data. 2024;11(1).
- 14. Kouhpah Esfahani K, Mohammad Hasani Zade B, Mansouri N. Multi-objective feature selection algorithm using Beluga Whale Optimization. Chemometr Intell Lab Syst. 2025;257:105295.
- 15. Punia P, Raj A, Kumar P. An enhanced Beluga whale optimization algorithm for engineering optimization problems. J Syst Sci Syst Eng. 2024.
- 16. Kopets E, Rybin V, Vasilchenko O, Butusov D, Fedoseev P, Karimov A. Fractal tent map with application to surrogate testing. Fractal Fract. 2024;8(6):344.
- 17. Qi Y, Jiang A, Gao Y. A Gaussian convolutional optimization algorithm with tent chaotic mapping. Sci Rep. 2024;14(1):31027. pmid:39730896
- 18. Anusic AR, De Leo. Structural graph and backward asymptotics of tent-like unimodal maps. J Differ Equ Applic. 2025;31(3):299–329.
- 19. Valle ME, Vital WL, Vieira G. Universal approximation theorem for vector- and hypercomplex-valued neural networks. Neural Netw. 2024;180:106632. pmid:39173201
- 20. Zhang J, Zheng G, Koike-Akino T, Wong K-K, Burton FA. Hybrid quantum-classical neural networks for downlink beamforming optimization. IEEE Trans Wireless Commun. 2024;23(11):16498–512.
- 21. Zhao J, Chang D, Cao B, Liu X, Lyu Z. Multiobjective evolution of the deep fuzzy rough neural network. IEEE Trans Fuzzy Syst. 2025;33(1):242–54.
- 22. Cai W, Li M, Jin G, Liu Q, Lu C. Comparison of residual network and other classical models for classification of interlayer distresses in pavement. Appl Sci. 2024;14(15):6568.
- 23. Shan C, Geng X, Han C. Remote sensing image road network detection based on channel attention mechanism. Heliyon. 2024;10(18):e37470. pmid:39309790
- 24. Li X, Zou X, Liu W. Residual network with self-adaptive time step size. Pattern Recogn. 2025;158:111008.
- 25. Pan J-S, Zhang S-Q, Chu S-C, Hu C-C, Wu J. Efficient FPGA implementation of sine cosine algorithm using high level synthesis. JIT. 2024;25(6):865–76.
- 26. Huang YR, et al. CMGWO: Grey wolf optimizer for fusion cell-like P systems. Heliyon. 2024;10(14).
- 27. Hameed S, Tanoli IK, Khan TA, Ahmad S, Alluhaidan ASD, Plawiak P, et al. A novel self-healing genetic algorithm for optimizing single objective flow shop scheduling problem. Arab J Sci Eng. 2024;50(10):7069–84.
- 28. Li P-F, Liu W, Zhang Z-Y, Kang J, Zhang J-P. Numerical control machining step error calculation based on hybrid particle swarm optimization method. Int J Adv Manuf Technol. 2024;133(7–8):3151–62.
- 29. Qin S, Wang J, Wang J, Liu S, Guo X, Qi L, et al. An improved fruit fly optimization algorithm for disassembly lines requiring multiskilled workers. IEEE Trans Comput Soc Syst. 2024;11(5):5671–84.
- 30. Panliang M, Madaan S, Babikir Ali SA, J G, Khatibi A, Alsoud AR, et al. Enhancing feature selection for multi-pose facial expression recognition using a hybrid of quantum inspired firefly algorithm and artificial bee colony algorithm. Sci Rep. 2025;15(1):4665. pmid:39920157
- 31. Bai X, Liu D, Xu X. A review of improved methods for ant colony optimization in path planning. J Ship Res. 2024;68(02):77–92.
- 32. Li ZS, et al. Enhanced sea horse optimization algorithm for hyperparameter optimization of agricultural image recognition. Mathematics. 2024;12(3).