Training radial basis function networks for wind speed prediction using PSO enhanced differential search optimizer

This paper presents an integrated hybrid optimization algorithm for training the radial basis function neural network (RBF NN). Training of neural networks is still a challenging exercise in the machine learning domain. Traditional training algorithms generally become trapped in local optima and converge prematurely, which makes them ineffective when applied to datasets with diverse features. Training algorithms based on evolutionary computation are becoming popular due to their robustness in overcoming the drawbacks of the traditional algorithms. Accordingly, this paper proposes a hybrid training procedure in which the differential search (DS) algorithm is functionally integrated with particle swarm optimization (PSO). To overcome local trapping of the search procedure, a new population initialization scheme is proposed using the Logistic chaotic sequence, which enhances the population diversity and aids the search capability. To demonstrate the effectiveness of the proposed RBF hybrid training algorithm, experimental analysis is performed on seven publicly available benchmark datasets. Subsequently, experiments are conducted on a practical application case of wind speed prediction to demonstrate the superiority of the proposed RBF training algorithm in terms of prediction accuracy.


Introduction
Artificial neural networks (ANN) are a class of artificial intelligence systems fundamentally designed to overcome some of the challenges on which mathematical models fail, namely complex and ill-defined problems. They are fault tolerant and solve a problem by learning from similar examples. ANN are capable of handling noisy and ambiguous data, with the ability to predict and generalize once efficiently trained [1].
Radial basis function (RBF) networks are another class of ANN, simulating the locally tuned response observed in biological neurons [2]. The structure of an RBF network consists of three layers, namely the input, hidden and output layers. RBF training involves two stages: first, the centres of the hidden layer are determined in a self-organising manner [3]; second, the weights connecting the hidden layer to the output layer are computed. Generally, RBF training is accomplished by computing the weights and biases that yield the target output by minimizing the error function. To accomplish this, the following methods are widely used in the literature: matrix inversion techniques, gradient-based training approaches and evolutionary computation methods [4]. Accordingly, this section reviews the literature on various RBF training methods. As the training phase determines the success of any network, the training of the radial basis function network involves three-step learning [5], which is the fastest, as the centres are determined by an unsupervised method and the output weights are determined by less complex algorithms. Though the gradient descent method offers precise results, it involves derivatives that increase the computation time, so it is not preferred for use alone.
Training neural networks with heuristic search algorithms such as differential evolution (DE) [6] has been attempted previously and the results compared with gradient descent methods; however, no significant improvement in performance was observed with DE. Similarly, in [7] the authors suggest solutions for the stagnation of DE when used with neural networks, where the individuals do not improve even under favourable conditions. Careful initialization, merging DE with specific mutation operators and the population size of DE are some of the key areas discussed.
In [8], as the training of both MLP and RBFN is difficult, an evolutionary algorithm, the Genetic Algorithm (GA), optimizes the subset of input data used to determine the number of centres, which helps in alleviating the over-fitting problem. In another work [9], the authors carried out short-term wind speed prediction with inputs from five different meteorological stations, tested with an ANN trained by PSO.
Similar work involving neural networks and PSO is reported in [10]: to improve the reliability of electric power generation, wind power is predicted with enhanced particle swarm optimization (EPSO) in combination with standard neural networks, with the weights of the networks being optimized. For time series inputs, GA is used in [11] to optimize all three parameters of the RBFN. Also for time series data, a nonlinear time-varying evolution PSO was proposed in [12] for training the RBFN and tuning the acceleration coefficients for short-term electric power prediction in Taiwan.
Likewise, in [13] adaptive differential evolution (ADE) is hybridized with the back propagation neural network (BPNN) to improve its forecasting accuracy. Similarly, in [14] the global and local search capabilities of adaptive PSO and BP are efficiently exploited for finding the global optimum in the given search space. In another work [15], the authors proposed an improved dynamic PSO which, together with the AdaBoost algorithm, adjusts the parameters (centers, widths, shape parameters and connection weights) to train the RBF NN.
Among similar recently developed hybrid models, in [16] the biogeography based optimization (BBO) algorithm is used for training multi-layer perceptron (MLP) networks and tested with several classification and approximation datasets. In another work [17], a modified bat algorithm is employed to optimize the weights, biases and structure of the neural network and tested on classification, benchmark time series and real time series (e.g. rainfall) datasets. Again in [18], two strategies, the Ring and Master-slave methods, are proposed in a modified bat-inspired algorithm to improve the diversity of the population, and the weights and structure of the ANN are simultaneously optimized.
To improve the performance of training the RBFN, a combination of PSO, K-NN and OSD is presented in [19]. PSO replaces K-means clustering for finding centres, as the random selection of centres in K-means was deficient. Subsequently, PSO is used in [20] for parallel optimization of the RBFN parameters by handling two different swarms, which exchange information on the optimized parameters between themselves. Again in [21], a variant of PSO, i.e. PSO with a mutation operation, is used to train RBF parameters such as the weights and the sigma of the activation function.
For predicting the electric load demand, a hybrid PSO-GA-RBF method is presented in [22]. Since GA is binary coded and PSO is real-value coded, this algorithm is a mixed-coded one in which the network structure is optimized by GA and the weights and basis are optimized by PSO. Similarly, a Support Vector Regression (SVR) model is hybridized with the differential empirical mode decomposition (DEMD) method and PSO-support vector machine for electric load forecasting [23]. In [24], the authors applied RBFN to solar power prediction, with wind speed and a two-dimensional representation of solar irradiation as its inputs. Again, [25] uses a hybrid PSO-GA for finding the parameters of a radial basis network in rainfall prediction.
Similarly, forecasting of stock indices using an RBF optimized by an artificial fish swarm algorithm (AFSA) is discussed in [26]. K-means clustering is adopted for finding the centres of the RBF, while the weights linking the hidden and output layers are optimized by AFSA. Meanwhile, in [27] a hybrid perturbation artificial bee colony trainer for a local linear RBF NN is presented. Another hybrid method, integrating empirical mode decomposition with an adaptive neural network based fuzzy inference system (ANFIS) for short-term wind speed forecasting, is presented in [28].
Since improving the diversity of individuals gives a higher chance of searching in the direction of the global optimum, [29] proposes an integrated hybrid method with PSO and GA for RBF NN training. Similarly, [30][31][32] propose PSO based training of RBF NN for diverse applications, and [33] presents a spatial correlation model algorithm for training ANN for wind speed and power forecasting.
Generally, the inconsistency of a single technique can be resolved by combining two or more techniques, which overcomes the deficiencies of single models and yields more accurate results [34][35][36]. Accordingly, this paper proposes a hybrid model combining the salient features of PSO with the differential search algorithm. The new hybrid optimizer, called PSODS, serves as the trainer of the RBF NN for wind speed prediction. Before establishing the applicability of the proposed technique to wind speed prediction, experiments are carried out on seven publicly available test datasets to demonstrate that the results produced by the new scheme are superior in many aspects to those of other reported methods for training RBF NN.
Although any developed technique can be tested and shown to be effective on standard test problems, it is more important to justify its performance on a real-time system. Accordingly, after establishing the performance of the proposed trainer for RBF NN, this research applies it to a practical wind prediction problem. Wind is one of the green renewable energy sources widely available for electric power generation. In spite of its chaotic nature, wind can be effectively utilized for power generation with suitable planning. Several factors influence the speed of the wind, and hence prediction of wind speed helps electric power companies to make good use of the energy tapped from wind and to minimize the expense of fossil power generation.
The literature on the application of NN to wind speed prediction is extensive. Here, selected articles pertaining to the content of this research are reviewed. In [37], wind speed prediction is carried out using three NN, namely adaptive linear element, back propagation and radial basis function, and it is demonstrated that no particular NN outperforms the others in terms of all evaluation metrics. In [38], a self-organising map is used to handle the uncertainties of wind behaviour, and its output is then processed using an RBF NN. Similarly, an adaptive neuro fuzzy system is proposed along with the similar day method and shown to be effective [39].
In [40], a NN model for predicting real-time information obtained from various locations in the mountainous regions of the Himalaya is presented. Similarly, a recurrent NN model is developed for predicting the wind power generated from wind turbines installed across a coastal region [41]. In another work [42], a least square support vector machine (LSSVM) with the empirical wavelet transform as a pre-processor is presented. Similarly, two different statistical models using the same datasets, with inputs drawn from a range of atmospheric variables, are presented in [43].
Complexity is one of the key factors that trigger the advent of new solution techniques where existing mathematical programming techniques fail. Evolutionary computation algorithms are promising alternatives when searching complex spaces, and they overcome several drawbacks that mathematical programming techniques face when applied to such problems.
The search space of a neural network, where weight determination is the key problem, is also complex and cumbersome in nature. This solution space not only makes it challenging for any method to produce quality solutions but also raises other issues such as local trapping and premature convergence. An inherent feature of most population based algorithms is their capability of balancing exploration and exploitation when searching a complex solution space. Similarly, the key concern for any evolutionary computation algorithm is to avoid trapping in local optima and poor convergence.
Based on this, three contributions are made as follows:
• A new population initialization algorithm is proposed using a chaotic sequence called the 'Logistic iterator', so that information about the search space can be extracted with enhanced population diversity. In addition, an opposition based population is subsequently generated from the population produced by the chaotic sequence, to further diversify the initial population.
• A new optimizer based on the differential search algorithm is proposed, functionally modified by incorporating the local search feature of PSO. Thereby the exploration of DS is retained and the exploitation of PSO is well utilized.
• The newly proposed optimizer, named the PSO enhanced differential search (PSODS) algorithm, is used to train radial basis function neural networks; the best possible settings for centroids, spreads and weights are estimated, and its suitability for both theoretical and practical prediction problems is demonstrated.
The rest of this paper is organized as follows. Section 3 presents a brief introduction to the RBF NN, followed by the Logistic chaotic sequence based initial population generation algorithm. Subsequently, after a brief overview of the DS and PSO algorithms, the modelling of the PSODS algorithm for training the RBF NN is presented in Section 4. Section 5 summarizes the simulation results for the seven publicly available regression test datasets and then for the wind speed prediction problem. Finally, the paper concludes by summarizing the merits of the proposed approach.

Nomenclature:
$v_{id}^{t}$, $v_{id}^{t-1}$: present and previous velocities of the particle
$P_{id}^{t}$: local best value of the particle
$x_{id}^{t}$: present position of the particle
$p_{gd}^{t}$: global best position of the particle
spo: super-organism population
$\rho_{gd}$: gamma distribution based random number generation

Radial basis function neural network (RBF NN): An overview
The radial basis function neural network (RBF NN) [2,3] is a general class of non-linear, three-layer feed-forward neural networks comprising (i) an input layer with n nodes, (ii) a hidden layer with m neurons or RBFs, and (iii) an output layer with one or several nodes (Fig 1). The unsupervised layer is defined between the input nodes and the hidden neurons, while the supervised layer exists between the hidden neurons and the output nodes. The j-th output $y_j(i)$ of the network is defined in terms of the hidden-layer responses and the weights connecting the hidden layer to the output layer, given by $w_k = [w_1, w_2, \ldots, w_m]^T$. According to [30], the basis function can be defined in several ways; some of the most commonly used basis functions are the Gaussian, multi-quadric, inverse multi-quadric, generalised inverse multi-quadric, thin plate spline, cubic and linear functions. In this study, the RBF is represented by the Gaussian function, which acts as the activation function for the neurons in the hidden layer formed by every term $\delta_k$. The output layer applies a linear combination of these functions, so that the j-th output is a weighted sum of the Gaussian responses. The parameters of the RBFN, namely $w_{jk}$, $\mu_k$ and $\sigma_k$, are to be optimised so that the error function is minimised, where M is the number of samples used to train the RBF NN.
The key problem in the RBF neural network structure is to determine the number of hidden layer neurons and their corresponding spreads $\sigma_k$ and centroids $\mu_k$. To obtain these parameters for an RBF neural network used in prediction problems, the root mean square error (RMSE) is formulated as an optimization problem. Accordingly, the fitness function for the optimization procedure is given below.

Fitness function
The fitness function is the key element in determining suitable parameters for better performance of the RBF NN. Consider a dataset with samples $S = \{(X_j, Y_j),\ j = 1, 2, 3, \ldots, M\}$, where $X_j$ is the j-th input sample, M is the number of samples and n is the number of inputs. $t_j$ is the target output as per the data and $y_j(i)$ is the output estimated by the RBF NN for the input sample $X_j$. The error function, which also serves as the fitness function, is the RMSE given by Eq (5). This paper thus proposes a new hybrid optimizer to determine the appropriate weights $w_{jk}$, spreads $\sigma_k$ and centroids $\mu_k$, as they are of particular importance for the performance of the RBF neural network.
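As an illustration of how the fitness value can be computed, the following minimal sketch evaluates a single-output Gaussian RBF network and its RMSE over a set of samples. The function names and the $2\sigma_k^2$ scaling inside the exponential are assumptions made for this sketch, since several conventions for the Gaussian width exist; it is not necessarily the exact form used in Eq (5).

```python
import math

def gaussian_rbf(x, centre, spread):
    """Gaussian response of one hidden neuron (2*spread**2 scaling is assumed)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * spread ** 2))

def rbf_output(x, centres, spreads, weights):
    """Single network output: weighted sum of the hidden-layer responses."""
    return sum(w * gaussian_rbf(x, c, s)
               for w, c, s in zip(weights, centres, spreads))

def rmse(samples, targets, centres, spreads, weights):
    """Root mean square error over all M samples, used as the fitness value."""
    squared_errors = [(t - rbf_output(x, centres, spreads, weights)) ** 2
                      for x, t in zip(samples, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower RMSE corresponds to a fitter candidate, so an optimizer using this function as its objective minimizes it.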

Logistic chaotic sequence based Initial population generation
In any evolutionary computation procedure, the convergence speed and the final optimum solution obtained are greatly influenced by the initialization of the candidate solutions or population. Mostly, the initial candidate solutions are randomly generated within the range of the variable limits, as no information about the solution space is available [44,45]. Recently, because of their randomness and sensitive dependence on initial conditions, chaotic maps have been adopted in several evolutionary computation procedures for initializing the candidate solutions. Chaotic maps are capable of extracting diversity within the solution space and thereby generate an initial population that is spread more widely throughout the search space than a regular randomly initialized population.
In this work, the chaotic map adopted is the one that has proved most successful in various applications, namely the Logistic iterator [46], given by $chrnd_{i+1,j} = \alpha \, chrnd_{i,j} \, (1 - chrnd_{i,j})$, where $chrnd_{0,j} = 0.2027$ and $\alpha = 4$. Once the population has been initialized with the chaotic map, a further improvement is made by applying opposition based population diversification [47]. This diversification is carried out for the entire population, and the place of each candidate in the search is decided by its fitness, so that out of twice the population size only the half of the candidates with the highest fitness enters the PSODS routine. The combined algorithm for population initialization using chaotic maps and the opposition based method is presented in Algorithm 1.

Algorithm 1: Chaotic opposition-based population initialization
01: The maximum number of chaotic iterations CHITR is set to 300,
02: population size is POP.
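The initialization idea can be sketched as follows. This is a minimal sketch and not a reproduction of Algorithm 1: the function names are hypothetical, the CHITR iterations are assumed to act as a warm-up of the logistic map, the opposite of a candidate is assumed to be lower + upper - candidate, and "highest fitness" is interpreted as lowest error.

```python
def logistic_sequence(length, seed=0.2027, alpha=4.0, chitr=300):
    """Return `length` values of the logistic map x <- alpha * x * (1 - x)."""
    x = seed
    for _ in range(chitr):          # assumed warm-up: CHITR chaotic iterations
        x = alpha * x * (1.0 - x)
    values = []
    for _ in range(length):
        x = alpha * x * (1.0 - x)
        values.append(x)
    return values

def chaotic_opposition_init(pop_size, lower, upper, fitness):
    """Build 2 * pop_size candidates (chaotic + opposite), keep the best half."""
    dim = len(lower)
    chaos = logistic_sequence(pop_size * dim)
    candidates = []
    for i in range(pop_size):
        # Scale the chaotic numbers in [0, 1] to the variable limits.
        cand = [lower[d] + chaos[i * dim + d] * (upper[d] - lower[d])
                for d in range(dim)]
        # Opposition-based counterpart of the same candidate.
        opposite = [lower[d] + upper[d] - cand[d] for d in range(dim)]
        candidates.append(cand)
        candidates.append(opposite)
    candidates.sort(key=fitness)    # lower error is treated as fitter
    return candidates[:pop_size]
```

For example, `chaotic_opposition_init(100, lower, upper, fitness)` would return 100 candidates selected from 200, ready to be passed to the optimizer.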

Differential search algorithm: An overview
Differential search (DS) is one of the recently developed evolutionary computation procedures for solving constrained global optimization problems. It has lately been gaining attention in a wide range of applications that require a rigorous search of the solution space [48,49]. The DS algorithm can be summarized in three stages as follows:
• The set of candidate solutions of a particular problem is considered as an artificial-super-organism migrating towards better fitness.
• In the course of migration, the artificial-super-organism examines whether a randomly selected location (stopover site) is suitable for temporary settlement.
• If the location (based on fitness evaluation) is suitable for a stopover during the migration, the member of the super-organism that found this location positions itself there.
The above procedure is continued until all members of the artificial-super-organism have examined candidate locations and settled at an acceptable position as per the problem requirement. The DS procedure is inspired by the movement of a super-organism, which closely resembles a Brownian-like random-walk model. The flow chart of the differential search algorithm is shown in Fig 2. A salient feature of the DS algorithm is that only two parameters (P1 and P2) normally need to be set appropriately for the algorithm to search for a better solution. DS is very simple, with good exploration capability but poor exploitation. Hence it requires a large number of iterations to obtain a good result.
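The three stages above can be sketched as a single migration step. This is a simplified sketch under stated assumptions rather than the full DS algorithm: the name `ds_step`, the single participation probability `p_active` (standing in for the two control parameters P1 and P2), the gamma-based scale expression and the clamping of out-of-range values are all assumptions made for illustration.

```python
import random

def ds_step(population, fitness_values, fitness, lower, upper,
            p_active=0.3, rng=random):
    """One simplified DS migration step with greedy stopover-site selection."""
    pop_size, dim = len(population), len(population[0])
    # Donors: a shuffled copy of the super-organism.
    donors = population[:]
    rng.shuffle(donors)
    # Scale factor: gamma-distributed magnitude times a signed uniform term
    # (assumed form; the small constant keeps the gamma shape positive).
    scale = rng.gammavariate(2.0 * rng.random() + 1e-12, 1.0) \
        * (rng.random() - rng.random())
    for i in range(pop_size):
        # Choose which dimensions take part in the move (at least one).
        active = [rng.random() < p_active for _ in range(dim)]
        if not any(active):
            active[rng.randrange(dim)] = True
        stopover = []
        for d in range(dim):
            value = population[i][d]
            if active[d]:
                value += scale * (donors[i][d] - population[i][d])
            # Clamp to the variable limits (assumed boundary handling).
            stopover.append(min(max(value, lower[d]), upper[d]))
        new_fitness = fitness(stopover)
        # Settle at the stopover site only if it is fitter (lower error).
        if new_fitness < fitness_values[i]:
            population[i], fitness_values[i] = stopover, new_fitness
    return population, fitness_values
```

Because a member moves only when the stopover site improves its fitness, repeated calls never worsen any member, which matches the greedy settlement rule described in the third stage.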

Particle swarm optimization: An overview
Introduced as a simple real-number optimization algorithm, PSO is a widely used swarm intelligence algorithm in a variety of applications [34,35]. The algorithm is inspired by the behaviour of bird flocks, known as a swarm, in search of food. Its simple steps for reaching a quality solution, with control over both global and local search capability, have made it a popular optimization algorithm. The PSO algorithm can be summarized in three stages as follows:
• The set of candidate solutions of a particular problem is considered as particles with positions in the search space, moving towards better fitness.
• The movement of the particles is based on their own personal information and on the information of all other particles in the search space.
• If a better fitness is found during the movement, the particles move to the new position; otherwise they stay where they are.
The above procedure is continued until all the particles have updated their positions and settled at an acceptable position as per the problem requirement. The flow chart of the PSO algorithm is shown in the corresponding figure. One of the features of the PSO algorithm is its ability to control both global and local search of the solution space. To realize this, the algorithm is modelled with a linearly decreasing inertia weight, which supports global exploration at the beginning and ensures local exploitation at the end. One significant weakness of the PSO algorithm is that during exploration the particles often miss better solution regions. Additionally, when the particle positions are updated neglecting the previous velocities, the search tends to become local to the region where the particles are positioned. In this research work, PSO uses a neighbourhood topology to exploit the solutions of DS through a thorough search of the solution region. This neighbourhood topology is based on the ring topology, with neighbours fetched considering both fitness and the candidates themselves. Accordingly, the velocity equation given in Eq (6) is modified as in Eq (7).
In Eqs (6) and (7), $v_{id}^{t}$ and $v_{id}^{t-1}$ are the present and previous velocities of the particles, and $c_1$, $c_2$, $r_1$, $r_2$ are scale factors and random numbers. $P_{id}^{t}$, $x_{id}^{t}$ and $p_{gd}^{t}$ are the local best, present position and global best of the particles. In this research, a new topology is used in which both fitness-based and candidate-based neighbours are chosen for updating the solutions. This is explained in detail in the next section.
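To make the role of these symbols concrete, the sketch below shows a generic inertia-weight velocity update in which the global best is replaced by the best personal-best position among index-based ring neighbours. It is an illustrative local-best variant, not a reconstruction of Eq (7): the function names, the acceleration coefficients $c_1 = c_2 = 2$ and the purely index-based neighbourhood are assumptions, whereas the topology proposed here also considers fitness-based neighbours.

```python
import random

def ring_best(index, pbest_positions, pbest_fitness):
    """Best personal-best position among a particle and its two ring neighbours."""
    n = len(pbest_positions)
    neighbours = [(index - 1) % n, index, (index + 1) % n]
    best = min(neighbours, key=lambda k: pbest_fitness[k])  # lower error is fitter
    return pbest_positions[best]

def update_velocity(velocity, position, personal_best, neighbourhood_best,
                    inertia=1.0, c1=2.0, c2=2.0, rng=random):
    """Inertia-weight velocity update, applied dimension by dimension."""
    new_velocity = []
    for v, x, p, g in zip(velocity, position, personal_best, neighbourhood_best):
        r1, r2 = rng.random(), rng.random()
        new_velocity.append(inertia * v
                            + c1 * r1 * (p - x)    # pull towards own best
                            + c2 * r2 * (g - x))   # pull towards neighbourhood best
    return new_velocity
```

The new position is then obtained by adding the updated velocity to the present position of the particle.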

PSO enhanced differential search optimizer
In this section, the new hybrid algorithm PSODS that functionally integrates PSO with DS will be discussed in detail. As discussed earlier, DS is very simple with good exploration capability but poor at exploitation. Similarly, PSO algorithm is good in exploitation with adjustment in its inertia weight. Hence taking the advantages of both the techniques, a new hybrid technique is formulated.
Step by step procedure of PSODS algorithm.

(Initialization and Generation of the initial artificial organism)
Initialize the super-organism population such that POP is the size of the super-organism and K is the dimension of the problem, also taken as the size of one clan. In addition, assume initial values for the control parameters $P_1$ and $P_2$. Initiate the artificial organisms using the chaotic sequence initialization procedure discussed in Section 4 (Algorithm 1), such that each artificial organism describes an RBF NN with K hidden layer neurons and comprises the parameters $w_{ik}$, $\mu_k$ and $\sigma_k$ to be optimized.
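One way of mapping an artificial organism onto the RBF NN parameters is sketched below. The flat layout (all centres first, then the spreads, then the output weights), the single-output network and the name `decode_organism` are assumptions made for illustration; the encoding used in this work may order the parameters differently.

```python
def decode_organism(vector, n_inputs, n_hidden):
    """Split a flat organism into centres, spreads and output weights.

    Assumed layout: [centres (n_hidden * n_inputs), spreads (n_hidden),
    weights (n_hidden)] for a single-output RBF network.
    """
    expected = n_hidden * (n_inputs + 2)
    if len(vector) != expected:
        raise ValueError(f"expected {expected} values, got {len(vector)}")
    split = n_hidden * n_inputs
    centres = [vector[k * n_inputs:(k + 1) * n_inputs] for k in range(n_hidden)]
    spreads = vector[split:split + n_hidden]
    weights = vector[split + n_hidden:]
    return centres, spreads, weights
```

With this layout, a network with 13 inputs and 65 hidden neurons would be described by an organism of 65 × (13 + 2) = 975 values.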

(Fitness Function Evaluations)
Each artificial organism should be evaluated for its fitness value using the fitness function designed for the problem of interest.
Thus the fitness function here is fit, the RMSE of Eq (5). The super-organism population is initialized as $spo = T_p$, with fitness $FT_{spo} = fit(spo)$.
The search stage that follows involves three sub-stages, described below.

Determination of Scale Factor (η)
The scale factor is determined using $\rho_{gd}$, which generates random numbers from a gamma distribution, and $\rho_i \sim U(0,1)$, uniformly distributed random numbers.

Determination of Donor organisms (DNR)
The donor, or target, is determined by shuffling the super-organism.

Estimation of new temporary position (MUT)
In this stage, three mutation strategies are adopted, governed by the control parameters and by the random numbers $\rho_i \sim U(0,1)$, i = 1, 2, 3, . . .
Evaluate the fitness of the newly generated positions, giving $FT_{cspo}$ and $FT_{fspo} = fit(fspo_i^{t+1})$.
PS04: Compare the fitness values $FT_{fspo}$ and $FT_{cspo}$ with the fitness of $spo_i^{t}$; an improvement in fitness value replaces the position of $spo_i^{t}$.
PS05: Check for t == t_max; else set t = t + 1 and repeat from PS03.
END PSO routine
5. Check for IT == IT_MAX.
END PSODS
A flowchart of the PSODS algorithm is shown in Fig 4.

The experiments are performed and demonstrated using the proposed PSODS trained RBF NN. To prove the efficiency of the proposed optimizer, experiments were also conducted using PSO trained RBF, DS trained RBF and basic RBF. The obtained results are also compared with the results reported in the literature.

Dataset description
The benchmark datasets are obtained from the UCI repository. The seven datasets considered for testing the performance of the proposed PSODS method are described below:
• The Boston House Price data concerns house prices in the Boston area, with 13 attributes such as crime rate, population, pollution level and accessibility to schools, highways and workplaces as inputs and house price as the output. Of the total 506 instances, 253 are used for training and 253 for testing.
• The Concrete compressive strength data has 8 input attributes such as concrete, water, fly ash, coarse aggregate and fine aggregate, with concrete compressive strength as its output. Out of 1030 total samples, 680 are used for training and 350 for testing.
• The Airfoil self-noise dataset has scaled sound pressure level as its output and 5 inputs, for a total of 1503 samples, of which 1000 instances are used for training and 503 for testing.
• The Istanbul stock exchange dataset deals with stock exchange returns, with the indices of 8 other stock exchanges as its input attributes. Of the 536 instances, 400 are used for training and the remaining 136 for testing.
• The Forest fires dataset is used to estimate the burned area of a forest from 12 relevant inputs, with 517 instances, of which 450 are training instances and 67 are testing instances.
• The Abalone dataset is used primarily to predict the age of abalone from 8 input attributes, with a total of 4177 instances, of which 2977 are used for training and 1200 for testing.
• The Auto MPG dataset is mainly used to predict MPG (miles per gallon) values from 8 automobile attributes, for 398 samples, of which 199 are used for training and 199 for testing.
In order to demonstrate the efficacy of the proposed PSODS trained RBF NN, several experiments have been conducted. PSO and DS are also used independently to train the RBF NN for the sake of comparison with the proposed PSODS algorithm. For the DS algorithm, p1 and p2 are set as (0.3 × rand) and the population size is 100. For the PSO algorithm, the swarm size is 10 and the inertia weight is set to 1. The number of iterations is kept at 1000 in all cases for the PSODS algorithm. The PSO routine performs the search for 50 iterations or until there is no improvement in the solution for 10 iterations. The results for each dataset using all techniques are obtained by performing the following experiments. First, each dataset is simulated for 30 trial runs to obtain the best, worst and mean RMSE; the standard deviation (SD) for each case is also listed. Secondly, to identify a suitable number of hidden neurons for the RBF NN, the number of hidden neurons is varied in the range of 40 to 70 in steps of 5 neurons, with 30 trial runs for each setting. In addition, experiments are carried out by varying the training and test samples, in contrast to the standard procedure: the testing samples are gradually increased by 5% and 30 trial runs are performed using the proposed PSODS trained RBF NN. The variations in the error statistics are shown to prove the robustness of the PSODS algorithm.
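The reporting protocol described above can be sketched as follows. The helper name `summarise_rmse` is hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not state which one is reported.

```python
import statistics

def summarise_rmse(rmse_values):
    """Best, worst, mean and SD of the RMSE values collected over trial runs."""
    return {
        "best": min(rmse_values),      # lowest error over the runs
        "worst": max(rmse_values),     # highest error over the runs
        "mean": statistics.mean(rmse_values),
        "sd": statistics.stdev(rmse_values),  # sample SD (assumed)
    }

# Hidden-layer sizes examined: 40 to 70 in steps of 5 neurons.
hidden_sizes = list(range(40, 71, 5))
```

In use, the RMSE of each of the 30 trial runs would be appended to a list for every entry of `hidden_sizes`, and `summarise_rmse` applied to each list.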

Performance evaluation on number of hidden neurons
In this section, the RBF NN is trained using the proposed PSODS algorithm, and also independently using the DS and PSO algorithms. The number of hidden layer neurons is varied in the range of 40 to 70 in steps of 5 neurons, with 30 trial runs for each setting. All seven datasets are used to decide the most suitable size of the hidden layer for effective prediction. The numbers of training and testing samples are fixed as per the standard figures given in the UCI database.
Figs 5 to 11 show the bar charts for the seven datasets: Boston housing, Concrete Compressive strength, Airfoil self-noise, Istanbul Stock Exchange, Forest Fires, Abalone and Auto MPG, respectively. Each chart shows the training RMSE (on the left) and the testing RMSE (on the right) obtained by the three algorithms.
The following observations can be made from Figs 5 to 11. As mentioned earlier, the RMSE for both training and testing samples is plotted against the hidden layer size. All the results are based on 30 different trial runs. Since this data is too large to be tabulated, a 3D bar chart is plotted. In almost all the datasets, the best RMSE is attained at a size of 65 neurons.
The PSODS algorithm proves its merit by producing a better RMSE than PSO and DS. In some cases the worst results of PSODS are even better than those of PSO and DS (e.g., Airfoil self-noise and Abalone). In the Boston housing and Istanbul Stock Exchange cases, the best hidden layer size is close to 60. Still, the next best size is 65 and the difference in the RMSE produced is comparatively small. Thus, based on the above results, the hidden layer size for the PSODS trained RBF NN is fixed at 65 neurons. Further studies and experiments in this paper are performed with these parameters.
Subsequently, the results corresponding to each dataset used for experimentation are summarized in Tables 2 to 8. Here the RMSE results of the proposed PSODS algorithm are compared with those of PSO, DS and other algorithms reported in the literature. As mentioned earlier, each method is run for 30 trials, and the tabulated results show the performance of the algorithms for the RBF NN with 65 neurons in all cases. The results of the general RBF NN are also tabulated as Classic results for the sake of comparison.
The following observations can be made from Table 2, which shows the results for Boston housing. Here, out of 506 samples, 253 samples have been used for training and the same number for testing. As can be seen, the proposed PSODS method is superior in terms of producing quality solutions compared to all the other methods tabulated. PSODS is superior in both training RMSE (0.0977) and testing RMSE (0.1181), as in both cases the RMSE obtained is much lower than that of the other methods. It is followed by the DS method, whose testing RMSE is better than that of the other networks [50,51].
In support of this, Fig 12(a) shows the convergence of the three methods toward the best RMSE. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one among the convergence data of 30 different trial runs. Similarly, Fig 12(b) shows the accuracy in predicting the test sample targets by the RBF trained using the three methods. To allow some leniency when comparing accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm has correctly predicted many more samples (an average of 147 samples) than the DS (an average of 125 samples) and PSO (an average of 110 samples) methods.
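The tolerance-based accuracy count can be sketched as follows. The function name is hypothetical, and the criterion (absolute error between prediction and target not exceeding the tolerance) is an assumption, since the exact comparison rule is not spelled out in the text.

```python
def count_within_tolerance(predictions, targets, tolerance=0.01):
    """Number of test samples whose absolute prediction error is within tolerance."""
    return sum(1 for y, t in zip(predictions, targets)
               if abs(y - t) <= tolerance)
```

Averaging this count over the 30 trial runs would give figures comparable to the per-method averages quoted above.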
Similarly Table 3 shows the results for Concrete Compressive strength. Here, out of 1030 samples 680 samples have been used for training and 350 samples are used for testing. As can be seen, the proposed PSODS method is superior in terms of producing quality solutions compared to the results of all the other methods tabulated. The PSODS is superior in both training RMSE of 0.1269 and testing RMSE of 0.1320, as in both the cases the RMSE obtained is much less compared to other methods, followed by the DS method, as its testing RMSE is better compared to other networks [50].
In support of this, Fig 13(a) shows the convergence of the three methods toward the best RMSE for Concrete Compressive strength. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one of the convergence curves from the 30 trial runs. Similarly, Fig 13(b) shows the accuracy with which the RBF NN trained by each of the three methods predicts the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm predicts more samples within this tolerance (245 on average) than the DS (225 on average) and PSO (221 on average) methods.
Likewise, Table 4 shows the results for Airfoil self-noise. Out of 1503 samples, 1000 are used for training and 503 for testing. The proposed PSODS method produces better-quality solutions than all the other methods: its training RMSE of 0.1308 and testing RMSE of 0.1337 are both much lower than those of the other methods. The DS method follows, as it is better than the others [50,51].
In support of this, Fig 14(a) shows the convergence of the three methods toward the best RMSE for Airfoil self-noise. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one of the convergence curves from the 30 trial runs. Similarly, Fig 14(b) shows the accuracy with which the RBF NN trained by each of the three methods predicts the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm predicts more samples within this tolerance (393 on average) than the DS (370 on average) and PSO (352 on average) methods.
The results for Istanbul Stock Exchange are shown in Table 5. Out of 536 samples, 400 are used for training and 136 for testing. The proposed PSODS method produces better-quality solutions than all the other methods tabulated: its training RMSE of 0.1269 and testing RMSE of 0.1320 are both much lower than those of the other methods. The DS method follows, as its testing RMSE is better than that of the other networks [50].
In support of this, Fig 15(a) shows the convergence of the three methods toward the best RMSE for Istanbul Stock Exchange. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one of the convergence curves from the 30 trial runs. Similarly, Fig 15(b) shows the accuracy with which the RBF NN trained by each of the three methods predicts the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm predicts more samples within this tolerance (97 on average) than the DS (85 on average) and PSO (80 on average) methods.
Next, Table 6 summarizes the results for Forest Fires. Out of 517 samples, 450 are used for training and 67 for testing. The proposed PSODS method produces better-quality solutions than all the other methods tabulated: its training RMSE of 0.0408 and testing RMSE of 0.0599 are both much lower than those of the other methods. The DS method follows, as its testing RMSE is better than that of the other networks [51].
In support of this, Fig 16(a) shows the convergence of the three methods toward the best RMSE for Forest Fires. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one of the convergence curves from the 30 trial runs. Similarly, Fig 16(b) shows the accuracy with which the RBF NN trained by each of the three methods predicts the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm predicts more samples within this tolerance (47 on average) than the DS (42 on average) and PSO (38 on average) methods.
Table 7 shows the results for Abalone. Out of 4177 samples, 2977 are used for training and 1200 for testing. The proposed PSODS method produces better-quality solutions than all the other methods tabulated: its training RMSE of 0.0884 and testing RMSE of 0.0935 are both much lower than those of the other methods. The DS method follows, as its testing RMSE is better than that of the other networks [51].
In support of this, Fig 17(a) shows the convergence of the three methods toward the best RMSE for Abalone. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one of the convergence curves from the 30 trial runs. Similarly, Fig 17(b) shows the accuracy with which the RBF NN trained by each of the three methods predicts the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm predicts more samples within this tolerance (970 on average) than the DS (925 on average) and PSO (910 on average) methods.
Finally, the results for Auto MPG are shown in Table 8. Out of 398 samples, 199 are used for training and 199 for testing. The proposed PSODS method produces better-quality solutions than all the other methods tabulated: its training RMSE of 0.0780 and testing RMSE of 0.0844 are both much lower than those of the other methods. The DS method follows, as its testing RMSE is better than that of the other networks [52].
In support of this, Fig 18(a) shows the convergence of the three methods toward the best RMSE for Auto MPG. As can be seen from the convergence plot, the PSODS algorithm converges faster than DS and PSO. This plot is one of the convergence curves from the 30 trial runs. Similarly, Fig 18(b) shows the accuracy with which the RBF NN trained by each of the three methods predicts the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The PSODS algorithm predicts more samples within this tolerance (157 on average) than the DS (135 on average) and PSO (125 on average) methods.

Comparison of error statistics using PSODS
In this experiment, the proposed PSODS trained RBF NN is tested for its ability to predict a testing set enlarged beyond its standard size. To this end, the training samples are reduced in steps of 5% of their original size and the testing samples are increased by the same number. The seven benchmark datasets are tested and the box plots are shown in Fig 19. Instead of the RMSE value, the normalized RMSE (NRMSE) is plotted so that the 7 datasets can be compared together easily. The NRMSE is calculated using Eq (8):
$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{L} \qquad (8)$$
where RMSE is the root mean squared error given in (5) and L is the difference between the maximum and minimum RMSE of the respective case. Based on the plots, the following observations are made:
• the error variation of the proposed PSODS method is minimal in almost all cases when the testing samples are increased by up to 10%, whereas the variation becomes considerably higher when the increase goes beyond 15%.
• the median remains almost the same for increases in testing samples of up to about 10%, and then rises slightly for further increases.
• the box span indicates the spread of the error, and here again PSODS produces similar variations for increases in testing samples of up to 10%.
• the outliers also indicate that the PSODS method produces similar error statistics for a 10% increase in testing samples (equivalently, a 10% decrease in training samples).
• although the full data on the performance of the PSO and DS methods in this experiment are not presented in this paper, neither method showed any improvement when the test samples were increased.
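For illustration, the normalization can be sketched as below, assuming Eq (8) has the usual range-normalized form NRMSE = RMSE / L, with L the difference between the maximum and minimum RMSE of the case. The function name `nrmse` is hypothetical.

```python
def nrmse(rmse_values):
    """Normalize each RMSE of one case by L = max(RMSE) - min(RMSE)."""
    span = max(rmse_values) - min(rmse_values)  # L in the text
    if span == 0:
        raise ValueError("all RMSE values are equal, so L is zero")
    return [value / span for value in rmse_values]


# Illustrative per-trial RMSE values for one dataset; not results from the paper.
print(nrmse([0.10, 0.12, 0.15]))
```

Dividing by the range of each case puts datasets with different error magnitudes on a comparable scale, which is what allows the seven box plots to be read side by side.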

Wind speed prediction
This is a practical wind speed prediction problem whose data were measured by Suzlon Energy Ltd, India, during June 2015. The terrain is tropical (Palghat Pass, India) and the data are regressive. The data description is similar to that of other wind prediction models: the wind speed is the desired output, and the corresponding atmospheric variables, namely wind vane direction, temperature, atmospheric pressure, air density and relative humidity at an altitude of 65 m, are the input attributes. A total of 832 hourly data samples are considered, of which 500 are used for training and 332 for testing the performance of the algorithm. Based on the performance of the RBF NN trained by the PSODS algorithm, the number of hidden layer neurons is set to 65. The simulation parameters of the algorithms are kept the same as those used for the 7 benchmark datasets. The wind prediction simulation is run for 30 trials using each of the three algorithms to train the RBF NN. Table 9 summarizes the results obtained and shows the superiority of the proposed PSODS algorithm over PSO and DS in all cases. The convergence plot for 1000 iterations is shown in Fig 20(a). Again the proposed PSODS algorithm outperforms the other two algorithms and reaches a better solution faster. Owing to the search features blended from both the DS and PSO algorithms, PSODS reaches a quality solution in the early iterations. In the same way, Fig 20(b) plots the accuracy attained by the three algorithms in predicting the target samples, that is, the number of samples successfully predicted within a tolerance of 0.01. This plot shows that the PSODS algorithm enables the RBF NN to predict a relatively higher number of samples than the PSO and DS methods.
To further substantiate the performance of the proposed PSODS algorithm over the PSO and DS methods, Fig 21 portrays the variance of the RMSE obtained by the three methods as the number of testing samples is increased. Each algorithm is run for 30 trials. As discussed earlier, there are 832 hourly data samples, of which 500 are used for training and 332 (+0%) for testing. In these simulations the testing samples are increased by 10% and 20%, and the training samples are reduced accordingly. It is evident from Fig 21 that the PSODS algorithm predicts better results than the other two methods. Thus PSODS again establishes itself as a suitable method for prediction.
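The shifting of samples from the training set to the testing set can be sketched as follows. The text does not state the base of the percentage; this sketch assumes it is taken from the original training-set size, following the description in the error statistics experiment, and the name `shifted_split` is hypothetical.

```python
def shifted_split(n_train, n_test, percent):
    """Move `percent` % of the original training samples to the testing set."""
    # Assumption: the percentage refers to the original training-set size.
    moved = round(n_train * percent / 100)
    return n_train - moved, n_test + moved


# Wind speed case: 832 hourly samples, 500 for training and 332 for testing.
for percent in (0, 10, 20):
    train_size, test_size = shifted_split(500, 332, percent)
    print(f"+{percent}%: {train_size} training, {test_size} testing")
```

Under a different reading, with the percentage taken from the testing-set size, only the computation of `moved` would change; the total of 832 samples is preserved either way.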
To demonstrate the superiority of the proposed PSODS technique over other existing neural network methods for wind speed prediction, three networks are chosen and run for 30 trials each: the basic RBF NN [19], the extreme learning machine (ELM) [50] and the multi-layer perceptron [16] trained by the back propagation algorithm (MLP-BP). Table 11 summarizes the accuracy of the three networks in predicting the test sample targets. To allow a lenient comparison of accuracy, a tolerance of 0.01 is set for all the methods. The proposed PSODS trained RBF predicts more samples successfully than all the other networks tabulated. Thus, over the 30-trial experiments, the PSODS trained RBF is superior to the other NN on all 7 datasets and the wind problem, with a lower testing RMSE and a larger number of successfully predicted test samples.
Similarly, to justify the merits of the proposed chaotic opposition-based population (COP) initialization algorithm over the general population (GP) initialization algorithm, both are tested to show how the former helps the PSODS trainer to predict the test samples swiftly. Experiments are conducted for 30 trials and Table 12 summarizes the results obtained. For this purpose, the best and worst fitness values (RMSE) are recorded during initialization, along with the final solution obtained by the PSODS-RBF NN on termination of the algorithm. The summarized results make clear that the proposed COP initialization algorithm generates better initial search solutions, reaches a better solution region and produces a better RMSE than the regular random initialization algorithm. The PSODS-RBF NN with COP also reaches the better solution well before the termination criterion.
Thus the proposed chaotic opposition-based population initialization algorithm has a great influence on the training and convergence of the proposed PSODS algorithm.
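A generic sketch of a chaotic opposition-based initialization is given below for orientation. It is not the paper's exact COP procedure. It assumes a logistic map with r = 4 to generate candidates, the common opposition rule (each coordinate reflected about the centre of its range), and selection of the fittest candidates from the union of both sets. All function names, and the sphere function used as a stand-in fitness, are hypothetical.

```python
import random


def logistic_sequence(length, x0, r=4.0):
    """Return `length` successive iterates of the logistic map x <- r*x*(1-x)."""
    values = []
    x = x0
    for _ in range(length):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def cop_initialize(fitness, pop_size, dim, lower, upper, seed=0):
    """Chaotic opposition-based initialization (generic sketch, see assumptions)."""
    rng = random.Random(seed)
    chaotic = []
    for _ in range(pop_size):
        # Each candidate gets its own starting value in (0, 1) for the map.
        x0 = rng.uniform(0.01, 0.99)
        chaotic.append([lower + c * (upper - lower)
                        for c in logistic_sequence(dim, x0)])
    # Opposite candidate: reflect every coordinate about the centre of the range.
    opposite = [[lower + upper - v for v in cand] for cand in chaotic]
    # Assumed selection rule: keep the pop_size fittest of the union (lowest error).
    return sorted(chaotic + opposite, key=fitness)[:pop_size]


def sphere(candidate):
    """Stand-in fitness; in the paper the fitness is the RMSE of the RBF NN."""
    return sum(v * v for v in candidate)


population = cop_initialize(sphere, pop_size=10, dim=3, lower=-1.0, upper=1.0)
print("best initial fitness:", sphere(population[0]))
```

Evaluating both a candidate and its opposite doubles the cost of initialization but gives the optimizer a better starting population, which is consistent with the better initial fitness values reported for COP.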

Conclusions
This paper presents an integrated hybrid optimization algorithm for training the radial basis function neural network to predict standard benchmark regression datasets and one real-time wind speed case. Accordingly, a hybrid training procedure in which the differential search (DS) algorithm is functionally integrated with PSO is modelled and tested. DS is used as the main optimizer, and PSO uses a neighbourhood topology to exploit the solutions of DS through a thorough search of the solution region. This neighbourhood topology is based on the ring topology, with neighbours selected considering both fitness and the candidates themselves. A new chaotic map based algorithm for generating the initial population is proposed to help the PSODS algorithm search the n-dimensional space thoroughly by increasing the diversity of the population, and to reach better optimum regions swiftly. To demonstrate the potency of the PSODS method and to generalize the RBF NN architecture, careful experiments are carried out to find the optimum number of hidden layer neurons.
Numerical experiments on 7 publicly available benchmark datasets are performed using the proposed PSODS algorithm for 30 trial runs, evaluating the RMSE to ensure that the RBF NN is prepared to predict the outputs of regression sample databases. In all cases PSODS outperforms the PSO and DS methods in terms of convergence rate, and it is reliable as the variations in the error statistics are fairly small. To demonstrate the applicability of PSODS with fewer training samples, experiments are carried out by reducing the training samples and testing with correspondingly more samples; again the error statistics show that the PSODS method is robust in prediction.
The prediction accuracy is also demonstrated by counting the number of samples predicted closely (within a tolerance of 0.01) by each of the three methods for training the RBF NN, with a standard of 1000 iterations fixed for all three methods. Subsequently, experiments were conducted on a practical wind speed prediction case to demonstrate the superiority of the proposed PSODS training algorithm in terms of prediction accuracy. In future work, the proposed PSODS method for training the RBF NN will be demonstrated on problems with more attributes and problems with missing data. Simulations with other types of neural networks, such as extreme learning machines (ELM), will also be of interest. It is also worthwhile to explore the proposed PSODS algorithm further on other prediction problems such as electricity price forecasting and solar irradiance and solar radiation prediction.