## Figures

## Abstract

As a nature-inspired search algorithm, firefly algorithm (FA) has several control parameters, which may have great effects on its performance. In this study, we investigate the parameter selection and adaptation strategies in a modified firefly algorithm — adaptive firefly algorithm (AdaFa). There are three strategies in AdaFa including (1) a distance-based light absorption coefficient; (2) a gray coefficient enhancing fireflies to share difference information from attractive ones efficiently; and (3) five different dynamic strategies for the randomization parameter. Promising selections of parameters in the strategies are analyzed to guarantee the efficient performance of AdaFa. AdaFa is validated over widely used benchmark functions, and the numerical experiments and statistical tests yield useful conclusions on the strategies and the parameter selections affecting the performance of AdaFa. When applied to the real-world problem — protein tertiary structure prediction, the results demonstrated improved variants can rebuild the tertiary structure with the average root mean square deviation less than 0.4Å and 1.5Å from the native constrains with noise free and 10% Gaussian white noise.

**Citation: **Cheung NJ, Ding X-M, Shen H-B (2014) Adaptive Firefly Algorithm: Parameter Analysis and its Application. PLoS ONE 9(11):
e112634.
https://doi.org/10.1371/journal.pone.0112634

**Editor: **Haipeng Peng, Beijing University, China

**Received: **August 12, 2014; **Accepted: **October 9, 2014; **Published: ** November 14, 2014

**Copyright: ** © 2014 Cheung et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Data Availability: **The authors confirm that all data underlying the findings are fully available without restriction. Protein data set information is contained in Table 3 of the paper. Source code is available on GitHub: https://github.com/Ngaam/AdaFa/blob/master/AdaFa-PAA.zip.

**Funding: **This work was supported by the China Scholarship Council (NJC, No. 201406230154), the National Natural Science Foundation of China (HBS, No. 61222306, 91130033, 61175024), Shanghai Science and Technology Commission (HBS, No. 11JC1404800), and A Foundation for the Author of National Excellent Doctoral Dissertation of PR China (HBS, No. 201048). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** Co-author Hong-Bin Shen is a PLOS ONE Editorial Board member. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.

## Introduction

Firefly algorithm (FA) is a simple yet quite efficient nature-inspired search technique for global optimization. Since FA was developed, it has attracted a lot of attentions and becomes more popular in solving various real-world problems [1]–[6]. FA is a swarm-based intelligence algorithm, which mimics the flashing behavior of fireflies [7]. A firefly flashes as a signal to attract others for some purposes, e.g. predation or mating. Accordingly, this biological phenomenon is formulated as a meta-heuristic algorithm depending on following three rules [7], [8]:

- All fireflies are attracted by each other without respect to their sex;
- Attractiveness is proportional to its brightness, that is, the less bright one will move towards the brighter one;
- If there are no brighter fireflies than a particular firefly, it will move randomly in the space.

Similar to other heuristic algorithms, FA also has several control parameters to be tuned, for instance, the light absorption coefficient, the randomization control factor, and the population size, for good performance on different problems. The values of these control parameters greatly determine the qualities of the achieved solutions and the efficiency of the FA algorithm. Generally, it is a problem dependent task to select suitable control parameters for FA. As to a complex problem, it will be a hard case to deal with if the problem with numerous local optima, in which most optimization algorithms will be trapped [9]–[11]. Although it is highly important, there is no consistent methodology for determining the control parameters of a FA variant. Mostly, the parameters are fixed experientially or set arbitrarily within some predefined ranges.

In FA algorithm, there are two important issues: (1) the variation of light intensity; and (2) formulation of attractiveness [7]. However, these parameters are either set constants or fixed empirically in the traditional FA [7], which may make the algorithm inefficient for the problems with complex landscapes [3]. Researchers have made numerous contributions to the improvement of FA considering the alteration of the control parameters. For example, Gandomi et al. applied several chaos mechanisms to tune light absorption coefficient and attractiveness coefficient [12]. In ref. [1], a geometric progression reduction scheme for the randomization parameter was introduced in FA to enhance the solution quality. Coelho and Mariani adopted Gaussian distribution probability functions to tune the light absorption coefficient and the randomization parameter [3]. In ref. [2], a chaotic mapping operator and a chaotic component were used to generate initial solutions and replace the random component of the standard FA, respectively. The former was to improve the quality of the initial population, while the latter was to perform ergodic search of the solution space.

Although much progress has been achieved on the FA-based algorithms since 2008, more efforts are required to further improve their performance:

- Providing the sufficient analysis for the control parameter settings;
- Efficient strategies or mechanisms for the selections of the control parameters;
- Employing heterogeneous search rules to enhance the performance of FA.

In this paper, the contributions are to develop several mechanisms and strategies for improving standard firefly algorithm, involving: (1) distance-based light absorption coefficient and gray coefficient for adaptively altering the attractiveness and enhancing difference information sharing, respectively; (2) five strategies for controlling the randomization parameter; and (3) employing heterogeneous search rules supported by the adaptively altering control parameters, to enhance the search ability of the original FA. The control parameters are adjusted over time or depending on heuristic rules, which employ the evolutionary information among the fireflies.

## Methods

### Firefly algorithm

The firefly algorithm (FA) is a nature-inspired optimization method [7], which maintains a population of fireflies to find the global optimum of an optimization problem.

In FA, the distance between any two fireflies and at and , respectively, can be defined as the Euclidean distance , which is formulated as follows,(1)where is the dimension of an optimization problem.

Indeed, the larger the distance is, the less light the fireflies can see from each other. Accordingly, it is necessary to define monotonically decreasing functions for light intensity and attractiveness, respectively. They are presented in Eqs. (2) and (3).(2)where is initial light intensity, and is the light absorption coefficient, which controls the decrease of light intensity. Accordingly, the attractiveness of a firefly is defined as shown in Eq. (3).(3)where is a constant, which is the attractiveness at .

The step of a firefly is attracted to move to another more attractive (brighter) firefly is determined by(4)where is a constant vector and is the time step, is drawn from a normal distribution . is the step size of the th firefly moving. The first term is the attraction from the th firefly, while the second term is randomization controlled by , which is a constant in the range of . Therefore, the update of the th firefly is formulated as follows,(5)

The Eqs. (4) and (5) show that the th firefly will move towards the th firefly, which is a more attractive one.

The procedure of FA algorithm is summarized as follows (*Algorithm S1*):

The selections of parameters are crucially important in FA, such as the light absorption coefficient in Eq. (3) and the randomization parameter in Eq. (4), but the values of the control parameters are chosen in predefined ranges dogmatically [13]. During the search process, traditional FA does not alter the values of the control parameters or only use constant parameters throughout the whole process. Also the information of the search or the knowledge achieved by the fireflies are not taken into account in the selections of parameters. All these static designs may be optimal for one problem, but not efficient or even fail to guarantee convergence for another one [13]. Proper selections of these parameters highly determine the quality of the solution and the search efficiency. Although researchers have proposed many improved FA variants [1], [14]–[16], premature convergence can still occur in the original firefly algorithm and its variants. FA may be easily trapped in local regions when it is used to deal with the complex problems with numerous local optima if the randomness is reduced to quickly [3]. To overcome these weaknesses in FA, we develop five variants of FA based on two main mechanisms, which will be described in details in following sections.

### Adaptive firefly algorithm

The standard FA employs three parameters for solving the optimizations, and the parameters may result in significantly different performance of FA, such as the absorption coefficient and the randomization parameter . Proper selections of these parameters can be a useful way to improve the search ability of FA. However, considering different problems with distinguish features, it is difficult to manually tune the parameters. To enhance FA, two main mechanisms and five strategies are employed to avoid the premature convergence of the classical FA. Accordingly, five variants of FA are yielded to balance the exploitation (local search) and exploration (global search), which are denoted as AdaFa-. The two main mechanisms are distance-based adaptive mechanism for different information sharing and gray-based coefficients for efficiently enhancing the heterogeneous search. All the two mechanisms are adaptive to exchange messages and applied to tune the control parameters in FA. Additionally, we also propose five different strategies for the selection of the randomization parameter in Eq. (4).

#### Distance-based adaptive strategy.

The motivation for us to investigate a distance-based strategy is that the adaptive absorption coefficient can efficiently deal with different problems whatever their landscape are, while the traditional FA uses a constant one throughout the search process.

It is obvious that there are two limited cases for a constant light absorption coefficient [4], which can be concluded as follows:

- The attractiveness of other fireflies will be a constant when approaches . That is, a firefly can be seen by all the other ones. In this case, FA is the same as a classical PSO.
- If , the attractiveness will be equal to . All the fireflies cannot take their bearings to move but in random flight. In this case, FA becomes a pure random search algorithm.

As can be seen, the parameter is crucially important in characterizing the variations of the attractiveness, and the speed of the convergence is also dependent on [12]. As a result, the performance of FA will be significantly constrained when a constant is used to solve the optimization problems as done in traditional FA. As is well known, the attractiveness should be linked with the distance among the fireflies, and it should also vary with the different distances among the population during the search process. The information of the distances is useful for promising search adaptively. Hence, we propose to use the distance to adaptively adjust the trajectories of the fireflies. The mean distance of the th firefly to the other fireflies is calculated as follows,(6)where is the number of the fireflies.

Based on the distance calculated in Eq. (6), we then define the distance ratio as follows,(7)where is the distance from the th firefly to the global best firefly. and are the maximum and the minimum distances from the firefly to other fireflies, respectively.

Accordingly, we define an adaptive absorption coefficient based on the distance ratio, which is to adaptively track the promising flight direction. It is defined as follows,(8)where denotes as amplitude factor, which controls the amplitude of . is called contraction index. The relationship between and the two factors is illustrated in Fig. 1.

As and increase, will decrease while the value of will sharply increase when and are very small (or large).

In Fig. 1, the value of is set to 1, which does not have effect on the lower and upper boundaries of . As can be seen, the larger the values of and are, the smaller the value of is; vice versa. If we remove the effect of , that is , then will be a constant. As a result, it is greatly difficult to deal with the balance between exploration and exploitation. Because the light absorption coefficient is determined by the distance information adaptively, the fireflies are able to adjust their flight directions for promising search.

Then, the constant light absorption coefficient in traditional FA is replaced by the adaptive coefficient for efficient search. The new attractiveness is obtained in Eq. (9).(9)

#### Gray-based coefficients.

Gray relational analysis (GRA) is a similarity measure for finite sequences with incomplete information [17]. Let be the reference sequence, and be the th comparative sequence, then the gray relational coefficient between and can be defined as(10)where , , , and is employed to distinguish from and called as distinguishing constant. Based on Eq. (10), we can calculate the corresponding gray relational grade as follows(11)where is a weighted constant of the gray relational coefficient and subjects to .

To properly use the gray relational analysis, the new best firefly is employed as the reference sequence, while all the fireflies are the comparative ones. Base on these assumptions, we can use the gray relational analysis to measure the similarity between them. Let the gray relational grade between the new best firefly and the th firefly be . As can be seen from Eq. (11), the closer the new best firefly and the th firefly are, the larger the is. Accordingly, the relational grade can be used to control the diversity of the firefly population. Since the gray relational grade involves the information of population distribution, we define a gray coefficient to satisfy the requirement of diversity as follows,(12)where and are the upper and the lower boundaries, which ensure the population can converge in finite time. and .

#### Update rules.

Fireflies in the traditional FA and most of its modifications, follow the same search law and share similar information throughout the search process. Due to the same search characteristics, the fireflies cannot always exhibit diverse and useful information for promising search. In an optimization algorithm, it is important to balance the exploration and exploitation. The algorithm should make contribution to exploration initially, while focusing on exploitation and convergence in the later search process [3], [18]. To accomplish the task, we introduce heterogeneous update rules to improve the search abilities of the fireflies. Two updating equations are defined and selected randomly in the search process, which are presented in Eq. (13).(13)where is the maximum number of generations. is the randomization term, which is defined based on new -strategies as follows,(14)where and are the upper and lower boundaries, respectively.

In FA, the strategy () for is(15)

As can be seen from Eq. (15), decreases linearly depending on the generation number, and it does not always work well for different problems. As an important parameter controlling the randomization step, has significant effect on the performance of FA. To enhance its search ability, we propose five different strategies for , which all decrease dynamically with the generation number, population size and the size of the optimized problem. These designs promise AdaFa to deal with different types of problems, and all of them increase the range of . The strategies are presented in Eq. (16)–Eq. (20).

- strategy

(16)where is constant defined by user to clamp , and is also a constant, which is .

(18)where and are the population size and the size of an optimization problem, respectively.

- strategy

(19)where is the power of population size .

- strategy

(20)In Eq. (16)–Eq. (20), . From Eq. (16)–Eq. (20), varies with time non-linearly, and the analysis of the parameters in the strategies are presented in following section.

Based on the designs of different adaptive mechanisms and strategies discussed above, the fireflies are allowed to learn more useful information from others and adjust the flight directions adaptively. The AdaFa algorithm is summarized in (*Algorithm S2*).

## Results

In this section, we demonstrate the performance of AdaFa variants over twelve benchmark functions – summarized in Table S1 of Supplementary Materials (more details can be referred to the study [19], [20]) and apply AdaFa variants to rebuild protein tertiary structure. Firstly, we conduct numerical experiments for parameters analysis and then compare AdaFa with standard particle swarm optimization (SPSO) [21], adaptive particle swarm optimization (APSO) [19], grey particle swarm optimization (GPSO) [20] and FA [8]. In numerical experiments, we analyze the parameter sensitivity, provide the parameter settings of each algorithm, present the results of the numerical experiment, and analyze the numerical results.

### Parameter sensitivity analysis

In this section, we analyze four parameters in AdaFa including the amplitude factor , the contraction index , and . The different selections of the four parameters may have influence on its performance. To investigate the impacts of these parameters, we conducted the experiments on functions –, , and –, which are Sphere, Schwefel, Rosenbrock, Rastrigin, Ackley and Griewank functions, respectively. The maximum number of generations was set to , and the population size was (This is similar to ref. [8], where the authors analyzed the effect of population size on the optimization problems). The averages of best-so-far values were used to measure the performance of AdaFa.

To investigate the amplitude factor and the contraction index , and were set to and , respectively. AdaFa was performed 20 times with different and . was varied from to with an increment of , while was changed from to in the same increment. Fig. 2 reveals that the relationship between the best-so-far and the two factors. As can be seen, the optimal and were around and , respectively. Hence, in this study and were used for all the following experiments.

In the gray strategy, we employ a gray coefficient to trade-off between the th firefly and the best firefly. As shown in Eq. (12), it is necessary to clamp in a fixed interval, such as . To fix and , we conducted trial tests on the same six benchmark functions, and the maximum number of generations was also set to . and were set in and , respectively, with the same increment of . Fig. 3 shows the relationship between and performance of AdaFa. From Fig. 3, AdaFa will perform well when and . Hence, in this study, we use and to conduct the experiments.

In strategies and of (Eq. (16)–Eq. (18)), can be defined by user before the optimization. How to select is important to . We thus analyze the relationship between and , which is illustrated in Fig. 4. As shown in Fig. 4, the larger is, the larger the range of is. It is interesting to note that there is a tail in strategy at the latter of generations, which can enhance the search abilities of the fireflies in exploitation region at latter search process.

In strategy and strategy , the values of are dependent on the population size and the size of an optimization problem, and we also analyze the relationship among them. As illustrated in Fig. 5, in strategy the smaller is, the more similar the trajectories of are. As well as strategy –, the range of is enlarged with the increment of the value of and population size . In strategy , as shown in Fig. 6, although the trajectories of are different from each other varying with , they all converge to similar points. These points are independent of the population size .

In Fig. 7, different strategies of are compared, where the ranges of of all proposed strategies are larger than the strategy in standard FA. The larger ranges allow the fireflies with stronger exploration and exploitation abilities throughout the search process.

### Parameters settings

Experiments were conducted to compare different algorithms on 12 benchmark functions with dimensions. In these experiments, the maximum number of generation was set to , and the population of each algorithm was set to . According to previous analysis, the parameter settings are listed as follows:

### Numerical Results

The experimental results over the twelve benchmark functions are presented in this section. Table 1 shows the mean values and the standard deviation of the results over independent trials of each algorithm, and the success rate. The best are highlighted in bold, and the mean values are also illustrated in Fig. S1 in File S1 to compare the convergence of each algorithm (more details can be found in Supplementary Materials).

As illustrated in Table 1 and Fig. S1 in File S1, the five versions of AdaFa achieved better results than the other four algorithms on eight benchmark functions (–, , and –), which can be also observed from Fig. S1 in File S1. SPSO achieved the target solution on function , although the five versions of AdaFa did not as well as SPSO, they obtained the comparable results to each other, and the results were better than APSO, GPSO and FA. AdaFa- was better than AdaFa with other four different strategies on function –, while other algorithms were all trapped and failed to obtain good results. It is easy to solve function by the FA-based methods as shown in Table 1, as shown FA and five variants of AdaFa all found the target solutions to the problem. Due to the complex landscape of function , GPSO was the sole algorithm that found a most approaching results to the target. Similarly, AdaFa- and AdaFa- were the only ones that achieved sharp result than the compared algorithms on functions and , respectively. Although AdaFa-, AdaFa- and AdaFa- were all trapped into the local regions on function , they performed so well that they all highly better than the three variants of PSO and FA on functions and . FA was good for the function , on which it was superior to others, but the rest possessed similar convergent characteristic as illustrated in Fig. S1 in File S1.

### Comparison of CPU efficiency

To compare the computational efficiency, we use CPU time to measure the complexity of each algorithm. For each algorithm, the computational efficiency is formulated as follows,(21)where is the computational time of an algorithm on the benchmark function , while the is the total time of all the algorithms on function .

In Fig. 8, the computational efficiency is illustrated. As can be seen, APSO is the fastest among all the algorithms, and it is followed by FA. Their computational efficiency are quite similar to each other over each benchmark function. The costs of the proposed five versions of AdaFa are also similar to each other, and they are all comparable to those of SPSO. However, all the five version of AdaFa are worse than FA in terms of computational time due to several new mechanisms have been employed.

### Statistical analysis

Generally, it is necessary to use the non-parametric tests to analyze the experimental results. In this section, the Friedman, Aligned Friedman and Quade tests [22] were used to validate the performances of all the algorithms.

Table 2 presents the average rankings calculated from the Friedman, Aligned Friedman, and Quade tests. Each algorithm and its score are listed in ascending order. The statistics and the corresponding -values are shown at the bottom of the table. In terms of computed -values, there exists significant differences among the algorithms with the significance level.

### Simulations on trans-membrane protein helix

Optimization approaches have been widely used in computational biology, for instance, immune algorithm (IA) was applied to discover a protein conformation with minimal energy based on lattice models [23]. Estimation of distribution algorithms (EDAs) [24] was used to solve the PSP in simplified models. Also based on simple protein models, Islam and Chetty employed memetic algorithm (MA) with several features to accomplish the structure prediction [25]. One important step of *ab initio* PSP methods is to reconstruct the tertiary structure of a protein by some optimization algorithms. Many methods reconstruct the tertiary structure of a protein depending on the residue contact maps, which can be solved in the framework of optimizing NP-hard problem [26], for example, in [27], a stochastic method based on simulated annealing (SA) was developed to derive a three-dimensional structure from a contact map. Vassura et al. used the a heuristic method and contact map to accelerate the process [28]. As we know, the aim of PSP is to obtain the Cartesian coordinates of all the atoms, which are bonded together by inter-atomic forces called chemical bonds. It has been observed that the bond lengths subject to a Gaussian distribution with a small standard deviation in high resolution protein structural data [29], and contain the essential information to determine the backbone structure of a protein [30]. Hence, given the torsion angle and bond length constrains, conformation of the geometry of the global protein structure is also an optimization problem, where the final structure resolution is significantly dependent on the algorithms.

In this study, we applied the developed AdaFa variants on the protein structure prediction problem with focus on the transmembrane helix proteins (TMHP). Membrane proteins account for 30% of the whole genome and more than 70% known drug targets. However, because of the hydrophobic environment of TMHP, they are the extreme difficult targets for the experimental structure biology studies [31], [32].

In high resolution protein structural data, it has been investigated that the bond lengths and angles subject to Gaussian distribution with a small standard deviation [33]. Here, we try to rebuild the tertiary structure directly from the bond lengths and angles with AdaFa. The data set of TMHP is filtered from the RCSB protein data bank (PDB) benchmark with 30% sequence identity cutoff, and it is released after Jan. st, 2010. We use 30 TMHPs for the tests, where the number of the residues (Res. No.) is less than or equal to , and the atomic resolutions of all the proteins are better than 2.5Å, without missed internal residues and sequence redundancies. These benchmark proteins are presented in Table 3.

From the 30 PDB structural data set, we extract only the angles of protein backbone including bond angles and torsion angles, which will be used as the constrains of the construction by the proposed AdaFa. In the experiments, the coordinate of the first backbone C_{α} atom was set randomly, the second one was determined by the standard bond length, the third one was fixed by two standard bond lengths and a bond angle, and the fourth one was calculated by the former three bond lengths, two bond angles and a dihedral angle. From the fifth C_{α} atom, the coordinates were determined by the same constrains involving four bond lengths, three bond angles and two dihedral angles. When considering the whole process as an optimization problem, we define the objective function as an energy, which is to calculate the distance between the undetermined C_{α} atom and its former four fixed C_{α} atom. For example, there are four determined coordinates of C_{α} atoms (, ), and we need to determine the coordinate of the th C_{α} atom using the bond lengths, bond angles, and dihedral angles among the C_{α} atom , , and in the backbone of a chain. We transform the obtained knowledge into distance-based constrains in geometrical respect, and the distances can be denoted as , , and (). Accordingly, the energy can be formulated as , where is transformed from the native constrains. The energy is used as the criterion of the proposed AdaFa in the optimization process. Hence, the backbone of a protein can be achieved by AdaFa iteratively. We then use the entire backbone of a protein chain as input of the PULCHRA [34] to determine the other atoms, such as hydrogen atoms and nitrogen atoms.

As illustrated in Fig. S2 in File S1, the tertiary structures of the proteins listed in Table 3 were predicted by AdaFa with high accuracy from the native constrains of each protein. In the experiments over 30 proteins, the averaged RMSDs of AdaFa-–AdaFa- are Å, Å, Å, Å, and Å, respectively. To validate the robustness of AdaFa, the native constrains with 10% Gaussian white noise were used to predict the tertiary structures. According to the results in Fig. 9 and Fig. S3 in File S1, it can be seen that AdaFa variants (AdaFa-–AdaFa-) are able to construct the tertiary structure of the protein with high accuracy and good robustness, where the averaged RMSDs of AdaFa-–AdaFa- are Å, Å, Å, Å, and Å, respectively.

AdaFa-–AdaFa- were represented by red solid line, black dotted line, blue dotted dashed line, magenta dashed line, and green solid line, respectively.

The different scores (TM-Score [35], GDT-TS-Score [36] and GDT-HA-Score [37]) and RMSD are illustrated by kernel smoothing density estimate [38] in Fig. 9 (More details can be found in Supporting Information). From Fig. 9 (a), the proposed AdaFa with five various strategies achieved different TM-Scores, in which AdaFa-, AdaFa- and AdaFa- were better than the rest two variants of AdaFa. AdaFa- was superior to the other one in terms of GDT-TS-Score and GDT-HA-Score as shown in Fig. 9 (b) and Fig. 9 (c), while the AdaFa- was the worst one among the variants of AdaFa over the protein benchmark dataset. On the other hand, AdaFa- and AdaFa- exhibited a little different and competed with each other in the two score items. In respect of RMSD as shown in Fig. 9 (d), AdaFa- occupied an overwhelming position, which was far better than the other four AdaFa variants, among which AdaFa-–AdaFa- possessed similar density estimation.

## Conclusion

In this paper, we develop an adaptive firefly algorithm (AdaFa) and its five variants to enhance the search ability of the original FA. In AdaFa, we propose a distance-based technique to overcome the two main drawbacks in using a constant light absorption coefficient, which tunes the light among the fireflies dynamically to control the sharing distance information leading to the variation of the attractiveness. Simultaneously, the differences among the fireflies can be adequately used to enhance the local search ability of each firefly, hence we employ the gray relational analysis to design a gray coefficient as another self-adaptively altering parameter. According to the designed parameters, AdaFa uses heterogeneous update laws to accomplish the balance between the exploitation and the exploration throughout the search process.

In the numerical experiments, we compared the performances of all five proposed AdaFa variants with FA and other three PSO-based algorithms, and the statistical results demonstrated the five AdaFa variants were significantly better than the other four algorithms with the significance level. The experiments on the reconstruction of the tertiary structure of the protein showed that AdaFa had the potential ability in predicting the helix structures with high accuracy and good robustness. Although AdaFa exhibited good performance on either the numerical experiments or real-world application on the prediction of the proteins' tertiary structures, it is still a challenging problem to deal with the cooperation among the fireflies for further improving the performance of FA, which is our future efforts. The codes of AdaFa is available upon request at http://www.csbio.sjtu.edu.cn/bioinf/AdaFa-PAA/.

## Supporting Information

### File S1.

**Combined file of supporting figures and tables.** Figure S1: The mean value over the benchmark functions with 30-dimensions. Figure S2: Simulation results over thirty proteins. Figure S3: The kernel smoothing density estimates of different measurement metrics. Table S1: Benchmark Functions.

https://doi.org/10.1371/journal.pone.0112634.s003

(ZIP)

## Author Contributions

Conceived and designed the experiments: NJC XMD HBS. Performed the experiments: NJC. Analyzed the data: NJC XMD. Wrote the paper: NJC XMD HBS. Method development and implementation: NJC HBS.

## References

- 1. Gandomi AH, Yang XS, Alavi AH (2011) Mixed variable structural optimization using firefly algorithm. Computers & Structures 89: 2325–2336.
- 2. Kazem A, Sharifi E, Hussain FK, Saberi M, Hussain OK (2013) Support vector regression with chaos-based firefly algorithm for stock market price forecasting. Applied Soft Computing 13: 947–958.
- 3. dos Santos Coelho L, Mariani VC (2013) Improved firefly algorithm approach applied to chiller loading for energy conservation. Energy and Buildings 59: 273–278.
- 4. Miguel LFF, Lopez RH, Miguel LFF (2013) Multimodal size, shape, and topology optimisation of truss structures using the firefly algorithm. Advances in Engineering Software 56: 23–37.
- 5.
Amiri B, Hossain L, Crawford JW, Wigand RT (2013) Community detection in complex networks: Multiobjective enhanced firefly algorithm. Knowledge-Based Systems.
- 6.
Su Z, Li L, Peng H, Kurths J, Xiao J, et al.. (2014) Robustness of interrelated traffic networks to cascading failures. Sci Rep 4..
- 7.
Yang XS (2008) Nature-Inspired Metaheuristic Algorithms. Luniver Press.
- 8.
Yang XS (2009) Firefly algorithms for multimodal optimization. In: Stochastic Algorithms: Foundations and Applications, SAGA 2009. volume 5792 , pp. 169–178.
- 9. Liang JJ, Qin AK, Suganthan PN, Baskar S (2006) Comprehensive learning particle swarm optimizer for global optimization of multimodal functions. IEEE Transactions on Evolutionary Computation 10: 281–295.
- 10. Cheung NJ, Ding XM, Shen HB (2014) OptiFel: A convergent heterogeneous particle swarm optimization algorithm for Takagi-Sugeno fuzzy modeling. Fuzzy Systems, IEEE Transactions on 22: 919–933.
- 11. Li L, Peng H, Kurths J, Yang Y, Schellnhuber HJ (2014) Chaos-order transition in foraging behavior of ants. Proceedings of the National Academy of Sciences 111: 8392–8397.
- 12. Gandomi A, Yang XS, Talatahari S, Alavi A (2013) Firefly algorithm with chaos. Communications in Nonlinear Science and Numerical Simulation 18: 89–98.
- 13. Brest J, Greiner S, Boskovic B, Mernik M, Zumer V (2006) Self-adapting control parameters in differential evolution: a comparative study on numerical benchmark problems. IEEE Transactions on Evolutionary Computation 10: 646–657.
- 14.
Yang XS (2010) Firefly algorithms for multimodal optimization. ArXiv e-prints.
- 15.
Yang XS, Deb S (2010) Eagle strategy using Lévy walk and firefly algorithms for stochastic optimization. In: Gonzlez J, Pelta D, Cruz C, Terrazas G, Krasnogor N, editors, Nature Inspired Cooperative Strategies for Optimization (NICSO 2010), Springer Berlin Heidelberg, volume 284 of
*Studies in Computational Intelligence*. pp. 101–111. - 16. Yang XS (2010) Firefly algorithm, stochastic test functions and design optimisation. International Journal of Bio-inspired Computation 2: 78–84.
- 17. Deng JL (1989) Introduction to grey system theory. J Grey Syst 1: 1–24.
- 18.
Engelbrecht AP (2010) Heterogeneous particle swarm optimization. In: Swarm Intelligence, Springer Berlin Heidelberg, volume 6234 of
*Lecture Notes in Computer Science*. pp. 191–202. - 19. Zhan ZH, Zhang J, Li Y, Chung HH (2009) Adaptive particle swarm optimization. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 39: 1362–1381.
- 20. Leu MS, Yeh MF (2012) Grey particle swafrm optimization. Applied Soft Computing 12: 2985–2996.
- 21.
Clerc M (2012). Standard particle swarm optimisation. http://clerc.maurice.free.fr/pso/SPSO_descriptions.pdf.
- 22. Derrac J, García S, Molina D, Herrera F (2011) A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm and Evolutionary Computation 1: 3–18.
- 23. Cutello V, Nicosia G, Pavone M, Timmis J (2007) An immune algorithm for protein structure prediction on lattice models. IEEE Transactions on Evolutionary Computation 11: 101–117.
- 24. Santana R, Larranaga P, Lozano J (2008) Protein folding in simplified models with estimation of distribution algorithms. IEEE Transactions on Evolutionary Computation 12: 418–438.
- 25.
Islam M, Chetty M (2013) Clustered memetic algorithm with local heuristics for
*ab initio*protein structure prediction. IEEE Transactions on Evolutionary Computation 17: 558–576. - 26. Breu H, Kirkpatrick DG (1998) Unit disk graph recognition is NP-hard. Computational Geometry 9: 3–24.
- 27. Vendruscolo M, Kussell E, Domany E (1997) Recovery of protein structure from contact maps. Folding and Design 2: 295–306.
- 28. Vassura M, Margara L, Lena PD, Medri F, Fariselli P, et al. (2008) Reconstruction of 3D structures from protein contact maps. IEEE/ACM Transactions on Computational Biology and Bioinformatics 5: 357–367.
- 29. Engh RA, Huber R (1991) Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallographica Section A 47: 392–400.
- 30. Wu S, Zhang Y (2008) ANGLOR: A composite machine-learning algorithm for protein backbone torsion angle prediction. PLoS ONE 3: e3400.
- 31. Shen HB, Chou JJ (2008) MemBrain: Improving the accuracy of predicting transmembrane helices. PLoS ONE 3: e2399.
- 32. Yang J, Jang R, Zhang Y, Shen HB (2013) High-accuracy prediction of transmembrane inter-helix contacts and application to GPCR 3D structure modeling. Bioinformatics 29: 2579–2587.
- 33.
Engh RA, Huber R (2006) Structure quality and target parameters, John Wiley & Sons, Ltd. doi:10.1107/97809553602060000695.
- 34. Rotkiewicz P, Skolnick J (2008) Fast procedure for reconstruction of full-atom protein models from reduced representations. Journal of Computational Chemistry 29: 1460–1465.
- 35. Zhang Y, Skolnick J (2004) Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics 57: 702–710.
- 36. Zemla A, MJFK, Venclovas Č (1991) Processing and analysis of casp3 protein structure predictions. Proteins S3: 22–29.
- 37. Zemla A (2003) LGA: a method for finding 3D similarities in protein structures. Nucleic Acids Research 31: 3370–3374.
- 38.
Bowman AW, Azzalini A (1997) Applied Smoothing Techniques for Data Analysis. Oxford University Press, USA.