Application of the hybrid ANFIS models for long term wind power density prediction with extrapolation capability

In this paper, the suitability and performance of ANFIS (adaptive neuro-fuzzy inference system), ANFIS-PSO (particle swarm optimization), ANFIS-GA (genetic algorithm) and ANFIS-DE (differential evolution) have been investigated for the prediction of monthly and weekly wind power density (WPD) at four locations in Malaysia: Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas. For this aim, standalone ANFIS, ANFIS-PSO, ANFIS-GA and ANFIS-DE prediction algorithms are developed on the MATLAB platform. The performance of the proposed hybrid ANFIS models is determined by computing statistical parameters, namely mean absolute bias error (MABE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination (R²). ANFIS-PSO and ANFIS-GA achieve higher accuracy than the other models, and they can be suggested for practical application to predict monthly and weekly mean wind power density. In addition, the capability of the proposed hybrid ANFIS models to predict wind data for locations where measured wind data are not available is examined, and the results are compared with the measured wind data from nearby stations.


Introduction
The primary energy sources (fossil fuels) will soon be exhausted, since they are used at a much higher rate than they are found in the earth's crust. Moreover, the price of fossil fuels is highly unstable, and their use causes large greenhouse gas (GHG) emissions and environmental pollution [1,2]. On the other hand, wind energy is a free, environmentally friendly and clean renewable energy source. Consequently, in the fight against global climate change, wind energy is a major solution [3][4][5]. Globally, installed wind power capacity reached 432.9 GW at the end of 2015, of which 63 GW was added in 2015 alone [6]. However, wind energy is unstable and intermittent; thus, accurate prediction of wind speed and wind power is a vital part of the successful establishment of a wind energy conversion system [7]. Likewise, to build a wind farm at any particular location, analysis of wind data and estimation of wind power and energy density are essential [8,9]. The wind power density (WPD) of a particular location measures the potential of its wind resources and the chance of extracting wind energy at different wind speeds from that location. Knowledge of WPD also helps designers and investors to understand the performance of a wind turbine and to choose the optimal number of wind turbines with a suitable power rating [10,11].
Wind power can be computed by several numerical methods [12,13]. The problem with numerical methods is that they need high computation time. In recent years, artificial intelligence (AI) techniques have gained wide popularity in wind energy systems and other engineering applications because they offer fast computation, require no knowledge of internal system parameters and give compact solutions [14][15][16][17][18][19][20]. Generally, wind speed and power prediction is divided into three categories: short-term (30 min to 6 h), medium-term (6 h to 24 h) and long-term (24 h and longer) predictions [7,21].

Short-term wind prediction
In ref. [22], two short-term wind power prediction methods were developed: an individual ANN and a hybrid strategy based on physical and statistical methods. The individual ANN and the hybrid strategy resulted in root mean square errors (RMSE) of 10.67% and 2.01% respectively. However, the hybrid strategy was 5 times slower than the individual ANN. An ANFIS-based hybrid model was developed in [23] to predict short-term wind power in Portugal; it resulted in a MAPE of 5.41%, outperforming five other approaches. The authors used historical wind power data as inputs. In [24], the authors applied both ANN and ANFIS models for hourly wind power prediction for a wind farm located in Southern Italy, and the prediction accuracy worsened as the prediction horizon increased. Further literature on the application of AI methodologies to the prediction of short-term wind speed and power can be found in refs. [16,21].

Medium-term wind prediction
In [25], an ANN model was employed for the prediction of daily mean wind speed at 11 locations in India where measured wind data are not available. The authors used meteorological variables of the target locations from the NASA surface meteorology and solar energy database, and the prediction accuracy was compared with measured wind data collected from a nearby meteorological station in Hamirpur. A hybrid method consisting of wavelet transform (WT), ANFIS, SVM and GS was proposed in [26] for 6 h ahead wind power forecasting. This study showed that the proposed method can predict wind power with a MAPE of 12.16% to 13.83%. Further literature on the application of soft computing methodologies to the prediction of medium-term wind speed and power can be found in ref. [21].
power, an ANN and statistical based models were conducted in ref. [7] where the proposed method showed rather promising results in view of the very small mean absolute error (MAE). In study [29], a hybrid model denoted by WT+FA+FF+SVM was reported and the computed MAPE were in the range of 13.46-18.74% for weekly prediction. More literature review regarding the application of soft computing methodologies for the prediction of long-term wind speed and power can be found in the ref. [21,30].
Long-term prediction of wind speed has become a research hotspot in many areas such as restructured electricity markets, energy management and optimal wind farm design. Although ANFIS merges the learning power of ANNs with the knowledge representation of fuzzy logic, constructing its membership functions (MFs) remains difficult: the functions must be tuned to build a model with high accuracy and good performance. Therefore, this study proposes three hybrid ANFIS models, ANFIS-PSO, ANFIS-GA and ANFIS-DE, to predict long-term (monthly and weekly) wind power density for four places in Malaysia: Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas. The main benefit of combining these three techniques (PSO/GA/DE) with ANFIS is to reduce the error rates by tuning and optimizing the membership functions. In addition, this study examines the wind speed prediction capabilities of the proposed models for locations where measured wind data are not available, and the result of the wind speed extrapolation is compared with the measured wind data collected from the nearby meteorological station.

Site description
The aim of this study is to predict the long-term (monthly and weekly) wind power of four locations situated in four different states of Malaysia, shown in Table 1. For this purpose, wind speed data for the period 2004-2014 were collected from the Malaysian Meteorological Department (MMD) stations at the respective locations. As presented in Table 1, the wind data were measured at different heights above sea level by rotating cup-type anemometers: 43.6 m, 5.2 m, 6.4 m and 2.46 m for Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas respectively. Wind data recorded at different heights need to be adjusted to the same height, because wind speed varies with altitude. Wind shear, the variation of wind velocity with altitude, is most pronounced near the surface (sea and land). Due to surface drag and the viscosity of air, the wind blows faster at higher altitudes.
Typically, the variation of wind speed in the daytime follows the 1/7th power law, whereas at night, when the temperature becomes more stable, the wind speed close to the ground usually subsides while at turbine altitudes it does not decrease as much and may even increase. Thus, the daily average wind speed data collected from the meteorological stations were adjusted to a turbine hub height of 50 m using the power law. The power law for wind speed adjustment to a different hub height is defined as [12]:

$$v = v_o \left(\frac{h}{h_o}\right)^{\alpha}$$

where $v$ is the wind speed at the desired height $h$, $v_o$ is the wind speed at the measured height $h_o$, and $\alpha$ is the power law coefficient. The exponent $\alpha$ is an empirically derived coefficient that varies with the stability of the atmosphere. For neutral stability conditions, it is approximately 1/7, or 0.143, which is commonly assumed to be constant in wind resource assessments. This is because the differences between the two levels are not usually so great as to introduce substantial errors into the estimation (usually < 50 m). The value of the coefficient varies from less than 0.10 for very flat land, water or ice to more than 0.25 for heavily forested landscapes, with a typical value of 0.14 for low roughness surfaces. The value 0.143 has been chosen for this assessment [12,31,32].
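The height adjustment described above can be illustrated with a minimal Python sketch. The function name `adjust_to_hub_height` and the dictionary of station heights are illustrative only (the study itself was carried out in MATLAB); the heights, the 50 m hub height and the exponent 0.143 are the values quoted in the text.

```python
def adjust_to_hub_height(v_measured, h_measured, h_target=50.0, alpha=0.143):
    """Scale a wind speed from the measurement height to the target height
    with the power law v = v_o * (h / h_o) ** alpha."""
    if h_measured <= 0 or h_target <= 0:
        raise ValueError("heights must be positive")
    return v_measured * (h_target / h_measured) ** alpha


# Anemometer heights (m) quoted in the site description.
station_heights = {
    "Mersing": 43.6,
    "Kuala Terengganu": 5.2,
    "Pulau Langkawi": 6.4,
    "Bayan Lepas": 2.46,
}

# Multiplicative factor applied to every measured speed at each station.
scale_factors = {
    name: adjust_to_hub_height(1.0, height)
    for name, height in station_heights.items()
}
```

Because the exponent is small, the correction is modest for Mersing (measured close to 50 m) and considerably larger for Bayan Lepas (measured at 2.46 m).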

Wind power density (WPD)
The wind power density (WPD) is an essential indicator of the wind potential at a specific location. WPD can be computed either directly from measured wind data or from the Weibull distribution function. The power in the wind at a measured wind speed $v$ passing through a blade sweep area $A$ can be expressed as [13,33]:

$$P(v) = \frac{1}{2} \rho A v^3$$

$$WPD = \frac{P(v)}{A} = \frac{1}{2n} \sum_{i=1}^{n} \rho v_i^3 = \frac{1}{2} \rho \overline{v^3}$$

where $n$ is the number of data points over a time period, $\overline{v^3}$ is the mean of the cube of wind speed and $\rho$ is the air density (kg/m³), taken as 1.175 kg/m³ in this study. As presented in [12,31], the 2-parameter Weibull distribution function is the most appropriate, recommended and accepted model for wind potential analysis. In the Weibull distribution, the probability distribution function (PDF) gives the probability of the wind at a given velocity $V$ and can be expressed as [12,34]:

$$f(V) = \frac{k}{c} \left(\frac{V}{c}\right)^{k-1} \exp\left[-\left(\frac{V}{c}\right)^{k}\right]$$

On the other hand, the cumulative distribution function (CDF) of wind speed $V$ indicates the probability that the wind velocity is equal to or lower than $V$, or within a given wind speed range. The CDF can be expressed as [12,34]:

$$F(V) = 1 - \exp\left[-\left(\frac{V}{c}\right)^{k}\right]$$

where $V$ is the wind speed (m/s), $k$ (dimensionless) is the shape factor and $c$ (m/s) is the scale factor. The factors $k$ and $c$ can be computed by several empirical methods. A commonly used one is the standard deviation method, in which $k$ and $c$ are defined as follows [12,13]:

$$k = \left(\frac{\sigma}{\bar{v}}\right)^{-1.086}$$

$$c = \frac{\bar{v}}{\Gamma\left(1 + \frac{1}{k}\right)}$$

where $\bar{v}$ is the average wind speed (m/s), $\sigma$ is the standard deviation and $\Gamma(x)$ is the gamma function, which is defined as [35,36]:

$$\Gamma(x) = \int_{0}^{\infty} t^{x-1} e^{-t} \, dt$$

Another empirical method is the power density or energy pattern factor method. In this method, $E_{pf}$ needs to be estimated to compute the shape factor $k$ and scale factor $c$ (m/s).
$E_{pf}$ is known as the wind pattern factor, which is used for wind turbine aerodynamic design, and it is defined as follows [33,37]:

$$E_{pf} = \frac{\overline{v^3}}{\bar{v}^{\,3}}$$

In simple words, $E_{pf}$ is the ratio of the mean of the cubed wind speed to the cube of the mean wind speed. Once $E_{pf}$ is known, the shape factor $k$ can be estimated by the following formula [33,37]:

$$k = 1 + \frac{3.69}{E_{pf}^{2}}$$

The wind power density on the basis of the Weibull probability density function is estimated using the following equation [13]:

$$WPD = \frac{1}{2} \rho c^3 \, \Gamma\left(1 + \frac{3}{k}\right)$$
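The two routes to WPD described in this section (directly from measured speeds, and through Weibull parameters) can be sketched in Python as follows. This is an illustrative sketch, not the authors' MATLAB implementation: the function names are hypothetical, the population standard deviation is an assumption (the text does not say which estimator was used), and the coefficients -1.086 and 3.69 are the standard ones for the standard deviation and energy pattern factor methods.

```python
import math
from statistics import mean, pstdev

RHO = 1.175  # air density in kg/m^3, the value quoted in the text


def wpd_measured(speeds, rho=RHO):
    """WPD (W/m^2) from measured speeds: 0.5 * rho * mean(v^3)."""
    return 0.5 * rho * mean(v ** 3 for v in speeds)


def weibull_std_method(speeds):
    """Shape k and scale c from the standard deviation method."""
    v_bar = mean(speeds)
    sigma = pstdev(speeds)  # assumption: population standard deviation
    k = (sigma / v_bar) ** -1.086
    c = v_bar / math.gamma(1.0 + 1.0 / k)
    return k, c


def weibull_epf_method(speeds):
    """Shape k and scale c from the energy pattern factor method."""
    v_bar = mean(speeds)
    e_pf = mean(v ** 3 for v in speeds) / v_bar ** 3
    k = 1.0 + 3.69 / e_pf ** 2
    c = v_bar / math.gamma(1.0 + 1.0 / k)
    return k, c


def wpd_weibull(k, c, rho=RHO):
    """WPD (W/m^2) from Weibull parameters: 0.5 * rho * c^3 * Gamma(1 + 3/k)."""
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)
```

For a reasonably well-behaved sample, `wpd_weibull(*weibull_epf_method(speeds))` should land close to `wpd_measured(speeds)`, which gives a quick sanity check on the fitted parameters.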

Methodology
As mentioned in section 2.1, to accomplish the study objective, 11 years (2004-2014) of long-term wind speed data measured by rotating cup-type anemometers were collected from four locations in Malaysia: Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas. The raw data were thereafter adjusted to a turbine hub height of 50 m. The raw data used in this study are presented in S1 Dataset: Supporting Information.rar. Afterward, the monthly and weekly mean wind speed at 50 m and the corresponding wind power density from measured data were applied to the developed standalone ANFIS and hybrid ANFIS models. In this study, the data sizes used for training and testing the prediction models are denoted P and Q respectively. The purpose of the training process in the ANFIS model is to minimize the error between the actual target and the ANFIS output. Based on the literature on ANNs, the share of training data must be higher than that of testing data for the system to learn effectively before it can produce a good result. The developed models were trained and validated with several data segments: P = 80%, Q = 20%; P = 70%, Q = 30%; and P = 60%, Q = 40%. No specific rule was applied to choose the data sizes for training and testing; instead, each split was tried and the one giving the minimum error in the statistical indicators was selected.
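As an illustration of the P/Q partitioning, the sketch below splits a monthly series in time order, with the first P observations used for training and the remaining Q for testing. The helper name `chronological_split` and the placeholder series are hypothetical; the P values (105, 92, 80 out of 132 monthly observations) are the ones reported in the results.

```python
def chronological_split(series, p):
    """Return (training, testing): the first p observations and the rest."""
    if not 0 < p < len(series):
        raise ValueError("p must lie strictly between 0 and len(series)")
    return series[:p], series[p:]


# Placeholder for 11 years x 12 monthly WPD values (2004-2014).
monthly_series = list(range(132))

split_sizes = []
for p in (105, 92, 80):
    train, test = chronological_split(monthly_series, p)
    split_sizes.append((len(train), len(test)))
```

Keeping the split chronological means the test set always lies after the training period, which matches the date ranges given for the P = 92 and P = 80 cases.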

ANFIS (adaptive neuro-fuzzy inference system)
The term adaptive neuro-fuzzy inference system (ANFIS) was first introduced by Jang in 1993. ANFIS is a hybrid intelligent scheme that merges the learning power of ANNs with the knowledge representation of fuzzy logic to produce a powerful processing tool [38]. The fuzzy inference system (FIS) is the core of ANFIS. FIS is based on expertise expressed in terms of 'IF-THEN' rules, so it can be used to predict the behavior of many uncertain systems. One advantage of FIS is that it does not require knowledge of the underlying physical process as a pre-condition for operation. ANFIS integrates FIS with the back propagation learning algorithm of a neural network. This allows the fuzzy modeling procedure to learn from the available data set, in order to compute the membership function parameters that best allow the associated fuzzy inference system to track the given input/output data, as shown in Fig 1. For one input, two fuzzy 'IF-THEN' rules are generated, with a maximum equal to 1 and a minimum equal to 0. The fuzzy inference system employed in this study uses one input $x$ and one output $f$. A first-order Sugeno fuzzy model with two fuzzy if-then rules is used as follows [39]:

Rule 1: if $x$ is $A$ then $f_1 = p_1 x + t_1$
Rule 2: if $x$ is $B$ then $f_2 = p_2 x + t_2$

Layer 1 contains the membership functions (MFs) of the input variables and provides the input values for the next layer. In the 1st layer, each node is adaptive with output $O = \mu_{AB}(x)$, where $\mu_{AB}(x)$ are the MFs. The bell-shaped MF is presented in Eq 13, for which the lowest and highest values are 0 and 1, respectively.

$$\mu(x) = \frac{1}{1 + \left|\frac{x - c}{a}\right|^{2b}} \tag{13}$$

The function depends on the parameters $a$, $b$ and $c$: $a$ is the half width of the curve, $b$ defines the gradient together with $a$, and $c$ is the midpoint of the membership function, as shown in Fig 2. In the 2nd layer (the membership layer), the weights of the membership functions are considered. Input values for this layer are supplied from the first layer. The nodes in the second layer are fixed nodes. The output of each node is the product of all its incoming signals, which for a single input is the membership grade itself:

$$w_1 = \mu_A(x), \qquad w_2 = \mu_B(x)$$

The output of each node indicates the firing strength of a rule. In layer 3, the rule layer, every node performs the pre-condition matching of the fuzzy rules, that is, it calculates each rule's activation level as the normalized firing strength. This is a fixed layer as well, and every node computes the ratio of the firing strength of the $i$th rule to the sum of the firing strengths of all rules:

$$\bar{w}_i = \frac{w_i}{w_1 + w_2}, \qquad i = 1, 2$$

The outputs of this layer are called normalized weights or normalized firing strengths. In layer 4, the defuzzification layer, the adaptive nodes provide the output values resulting from the inference of the rules:

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \left(p_i x + t_i\right)$$

Here, the parameter set is $\{p_i, t_i\}$. Layer 5, the output layer, sums the outputs of layer 4. This layer also converts the results of fuzzy classification into a crisp value. Here, the single node is a fixed node, and all incoming signals are summed to produce the overall output:

$$O_5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

Three optimization techniques, PSO, GA and DE, were employed to adjust the ANFIS membership function parameters. The main benefit of combining these three techniques with ANFIS is to reduce the error rates by tuning and optimizing the membership functions.
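To make the five layers concrete, here is a minimal forward pass for a single-input, two-rule first-order Sugeno system with bell-shaped MFs. It is a sketch under stated assumptions rather than the authors' MATLAB model: the function names are hypothetical, the consequent of rule i is assumed to be linear in x with a slope and an offset, and the numeric parameters in the usage lines are made up for illustration.

```python
def bell_mf(x, a, b, c):
    """Generalized bell membership function with half width a, slope b, centre c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def anfis_forward(x, premise, consequent):
    """Evaluate a single-input first-order Sugeno ANFIS.

    premise:    one (a, b, c) tuple per rule
    consequent: one (slope, offset) tuple per rule (assumed linear consequent)
    """
    # Layer 1: membership grades of the input.
    grades = [bell_mf(x, a, b, c) for (a, b, c) in premise]
    # Layer 2: firing strengths (one input, so the grade itself).
    strengths = grades
    # Layer 3: normalized firing strengths.
    total = sum(strengths)
    normalized = [w / total for w in strengths]
    # Layer 4: weighted rule outputs.
    rule_outputs = [
        w_bar * (slope * x + offset)
        for w_bar, (slope, offset) in zip(normalized, consequent)
    ]
    # Layer 5: crisp output as the sum of the rule outputs.
    return sum(rule_outputs)


# Illustrative parameters only (not taken from the study).
premise_params = [(2.0, 2.0, 0.0), (2.0, 2.0, 10.0)]
consequent_params = [(1.0, 0.0), (0.0, 5.0)]
midpoint_output = anfis_forward(5.0, premise_params, consequent_params)
```

The premise tuples are exactly the quantities that PSO, GA or DE would tune in the hybrid models: an optimizer proposes new (a, b, c) values, the forward pass is evaluated on the training data, and the resulting error is returned as fitness.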

ANFIS-PSO
Particle swarm optimization (PSO) is an approach for optimizing "continuous" and "discontinuous" decision-making functions, developed by Kennedy and Eberhart in 1995 [40]. PSO is modeled on the sociological and biological behavior of animals, such as groups of birds searching for food [41]. It is a population-based search approach in which each particle of the population (swarm) represents an individual potential solution. In this method, the position of each particle is changed constantly in the search space until the optimum solution or the computational limits are reached [42]. The swarm starts with a group of random solutions, each of which is called a particle, and $\vec{s}_i$ represents the particle's position. The particle swarm moves in the problem space, where $\vec{v}_i$ expresses the particle's velocity. A function $f$ is evaluated at each time step with input $\vec{s}_i$. Every particle records the best position it has reached so far, in terms of fitness, in the vector $\vec{p}_i$, while $\vec{p}_g$ tracks the best position identified by any member of the neighborhood. In the global form of PSO, $\vec{p}_g$ represents the best position in the entire population. In each iteration, a new velocity is obtained for every particle $i$ from its individual best position $\vec{p}_i(t)$ and the neighborhood best position $\vec{p}_g(t)$. The new velocity can be presented by:

$$\vec{v}_i(t+1) = w \vec{v}_i(t) + c_1 \vec{\varphi}_1 \cdot \left(\vec{p}_i(t) - \vec{s}_i(t)\right) + c_2 \vec{\varphi}_2 \cdot \left(\vec{p}_g(t) - \vec{s}_i(t)\right) \tag{18}$$

where $w$ represents the inertia weight, the positive acceleration coefficients are represented by $c_1$ and $c_2$, and $\vec{\varphi}_1$ and $\vec{\varphi}_2$ are uniformly distributed random vectors in [0, 1]. The limit on the velocity range depends on the problem; if the velocity exceeds this limit, it is reset within its suitable limits. The position of every particle changes according to its velocity as:

$$\vec{s}_i(t+1) = \vec{s}_i(t) + \vec{v}_i(t+1) \tag{19}$$

According to Eqs (18) and (19), the particles tend to gather around the best position. Fig 3 depicts the sequential PSO and ANFIS combination [43].
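The use of PSO for designing a fuzzy system (FS), that is, for parameter optimization, is expressed as [44]:

$$R_i: \text{If } x_1(k) \text{ is } A_{i1} \text{ and } \ldots \text{ and } x_n(k) \text{ is } A_{in}, \text{ then } u(k) \text{ is } a_i \tag{20}$$

where $a_i$ is a crisp value, $k$ represents the time step, the input variables are $x_1(k), \ldots, x_n(k)$, $A_{ij}$ is a fuzzy set and $u(k)$ signifies the system output variable. For the FS in Eq (20), which comprises $r$ rules and $n$ input variables, its free parameters are defined through a position vector:

$$\vec{s} = [m_{11}, b_{11}, \ldots, m_{1n}, b_{1n}, a_1, \ldots, m_{r1}, b_{r1}, \ldots, m_{rn}, b_{rn}, a_r]$$

where $m_{ij}$ and $b_{ij}$ are the antecedent part parameters of fuzzy set $A_{ij}$. Following the process of rule creation and initialization, the preliminary antecedent part parameters are determined. According to Eq (21) and Eq (22), the $i$th solution vector $\vec{s}_i$ is created as:

$$\vec{s}_i = [m_{11} + \Delta m_{11}, b_{11} + \Delta b_{11}, \ldots, m_{1n} + \Delta m_{1n}, b_{1n} + \Delta b_{1n}, a_1, \ldots, m_{rn} + \Delta m_{rn}, b_{rn} + \Delta b_{rn}, a_r] \tag{23}$$

In Eq (23), $\Delta m_{ij}$ and $\Delta b_{ij}$ represent small random numbers, and $a_i$ designates a random number distributed uniformly in the FS output range. The evaluation function $f$ for $\vec{s}_i$ is calculated from the FS performance.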
PSO searches for the best antecedent part parameters. $P_s$ represents the population size, and $\vec{w}_i$ represents a random vector. The initial velocities of all particles, $\vec{v}_i(0)$, $i = 1, \ldots, P_s$, are generated randomly. Each particle's performance is evaluated according to the FS it represents. The evaluation function $f$ is the error index $E(t)$ mentioned above. The best position $\vec{p}_i$ of each particle and the best particle $\vec{p}_g$ in the whole population are obtained according to $f$. Eqs (18) and (19) update the velocity and position of each particle. The whole learning procedure ends as soon as a pre-defined criterion is met [44].
The main PSO parameters used in the experiments are shown in Table 2: the maximum number of iterations, the population size, the inertia weight and its damping ratio, and the global and personal learning coefficients. For these case studies, the optimum values of the parameters were determined by trial and error.
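The PSO loop described in this section can be sketched in plain Python as below. This is a generic illustration, not the study's MATLAB code: `pso_minimize` and all default parameter values are hypothetical (the actual settings are in Table 2, which is not reproduced here), the velocity limit of 10% of the search range is an assumption, and a simple sphere function stands in for the ANFIS training error that would serve as fitness in the hybrid model.

```python
import random


def pso_minimize(fitness, dim, bounds, n_particles=25, max_iter=100,
                 w=1.0, w_damp=0.99, c1=1.0, c2=2.0, seed=0):
    """Global-best PSO with a damped inertia weight (illustrative defaults)."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = 0.1 * (hi - lo)  # assumed velocity limit
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_cost = [fitness(p) for p in pos]
    g_idx = min(range(n_particles), key=p_cost.__getitem__)
    g_best, g_cost = p_best[g_idx][:], p_cost[g_idx]
    history = [g_cost]
    for _ in range(max_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + personal pull + global pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (p_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                # Reset the velocity into its limits, then move the particle.
                vel[i][d] = max(-v_max, min(v_max, vel[i][d]))
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            cost = fitness(pos[i])
            if cost < p_cost[i]:
                p_best[i], p_cost[i] = pos[i][:], cost
                if cost < g_cost:
                    g_best, g_cost = pos[i][:], cost
        w *= w_damp  # inertia weight damping
        history.append(g_cost)
    return g_best, g_cost, history


def sphere(p):
    """Stand-in fitness; the hybrid model would use the ANFIS error instead."""
    return sum(x * x for x in p)


pso_best, pso_cost, pso_history = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

In an ANFIS-PSO setting, each particle position would hold the membership function parameters, and `fitness` would return an error measure such as the RMSE of the resulting ANFIS on the training data.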

ANFIS-GA
The genetic algorithm (GA) is a global search heuristic used to find solutions to optimization problems and to solve highly complex search problems. It belongs to the class of evolutionary computing methods inspired by the natural selection process, and it implements inheritance, mutation, selection and recombination [45]. In this hybrid approach, GA is combined with ANFIS to extend its prediction proficiency. GA is implemented to improve ANFIS performance and minimize the error rates by tuning and optimizing the membership functions of a Sugeno-type fuzzy inference system. The ANFIS-GA forecast describes the upcoming behavior of the wind power density and therefore helps determine the viability of wind power plants at any location.
The hybrid ANFIS-GA model used is shown in Fig 4 [46]. The GA begins with a set of solutions (referred to as chromosomes) that form the population. Each new population is derived from the previous one: new solutions (offspring) are formed from selected solutions, which are chosen according to their fitness.
This process is repeated until some condition (for example, the number of populations or the improvement of the optimal solution) is fulfilled. To achieve this, the ANFIS algorithm plays an important role as part of the fitness function, f(x). The fitness function built on the ANFIS output is represented by an expression in which m is the number of feature attributes, a_i is the output derived through ANFIS and d_i is the desired wind power density. The next fitness function is expressed in terms of n, the total number of input features, with d_i set to the minimum, a_i the actual value of wind power density and n − m the remaining undesired features. The final objective is to minimize f(x). For this case study, we determined the specific parameter initialization for the GA: the number of iterations, the population size, and the mutation and crossover percentages. The selection of these parameters decides, to a great extent, the ability of the designed controller. The range of the tuning parameters is listed in Table 3. After the fitness f(x) of each chromosome x in the population is evaluated, a new population is created, and the following steps are repeated until it is complete. Parent chromosomes with better fitness have a bigger chance of being selected from the population. The parents are then crossed over, with the crossover probability, to form new offspring. Next, with the mutation probability, the new offspring are mutated at each position in the chromosome. As the solutions go through reproduction, crossover and mutation with the parameter settings from Table 3, the best solution in the current population is returned once the end condition is satisfied. The optimal solution found in this way gives the optimized membership functions.
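The reproduction, crossover and mutation cycle described above can be sketched as a small real-coded GA. Everything specific here is an assumption made for illustration: the function name `ga_minimize`, binary tournament selection, blend crossover, Gaussian mutation, elitist survivor selection, and all default values (the actual settings are in Table 3, which is not reproduced here). A sphere function again stands in for the ANFIS error.

```python
import random


def ga_minimize(fitness, dim, bounds, pop_size=20, max_iter=50,
                crossover_pct=0.7, mutation_pct=0.3, mutation_rate=0.2, seed=0):
    """Real-coded GA with elitist merging (illustrative operators and defaults)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    pop.sort(key=fitness)
    history = [fitness(pop[0])]
    n_cross = 2 * round(crossover_pct * pop_size / 2)  # offspring from crossover
    n_mut = round(mutation_pct * pop_size)             # offspring from mutation

    def pick():
        # Binary tournament: the fitter of two random chromosomes is selected.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    for _ in range(max_iter):
        children = []
        for _ in range(n_cross // 2):
            p1, p2 = pick(), pick()
            mix = [rng.random() for _ in range(dim)]
            # Blend crossover produces two complementary offspring.
            children.append([m * x + (1 - m) * y for m, x, y in zip(mix, p1, p2)])
            children.append([m * y + (1 - m) * x for m, x, y in zip(mix, p1, p2)])
        for _ in range(n_mut):
            child = pick()[:]
            for d in range(dim):
                # Each position mutates with probability mutation_rate.
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[d] = min(hi, max(lo, child[d] + step))
            children.append(child)
        # Keep the best pop_size chromosomes of parents plus offspring.
        pop = sorted(pop + children, key=fitness)[:pop_size]
        history.append(fitness(pop[0]))
    return pop[0], history[-1], history


def sphere(p):
    """Stand-in fitness; the hybrid model would use the ANFIS error instead."""
    return sum(x * x for x in p)


ga_best, ga_cost, ga_history = ga_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

Because parents and offspring are merged before truncation, the best fitness can never get worse from one generation to the next, which makes the convergence history easy to inspect.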

ANFIS-DE
Differential evolution (DE) was first introduced as a heuristic method by Storn and Price [47] for global optimization, that is, for minimizing possibly nonlinear and non-differentiable functions over continuous spaces. Since both DE and GA belong to the evolutionary computing methodologies, DE functions in almost the same manner as GA. The difference is that DE uses actual real numbers in a strict mathematical sense, so it can be applied to real-valued problems over a continuous space. As a result, the designs of crossover and mutation are significantly different. The idea behind DE is that the difference between two vectors produces a difference vector, which is used with a scaling factor to traverse the search space [48].
In the ANFIS-DE hybrid approach, DE is combined with ANFIS to improve the prediction proficiency of ANFIS. Differential evolution is initialized with a population of popsize individual solutions, represented as $x_i^t$, $i = 1, 2, \ldots, popsize$, where $i$ indexes the individual within the population and $t$ denotes the generation to which the population belongs. The algorithm then relies on three main operators: mutation, crossover and selection, as shown in Fig 5 [46].
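The mutation operator is the main operator of DE and distinguishes it from other EAs. We implemented the DE/rand/1 mutation strategy as described in Eq 28:

$$u_i^t = x_{r_1}^t + F \left(x_{r_2}^t - x_{r_3}^t\right) \tag{28}$$

where $u_i^t$ is the mutant individual for $x_i^t$, and $r_1$, $r_2$, $r_3$ are the indices of randomly selected individuals, which are mutually different and not equal to the running index $i$. $F$ is a control parameter in the range [0,2]. The crossover process is carried out after the mutation phase is completed and can be described as:

$$v_{i,j}^t = \begin{cases} u_{i,j}^t, & \text{if } rand_j \le CR \text{ or } j = j_{rand} \\ x_{i,j}^t, & \text{otherwise} \end{cases}$$

where $v_i^t$ is the trial individual, $j$ indexes its components, $rand_j$ is a uniform random number in [0, 1], $CR$ is the crossover rate and $j_{rand}$ is a randomly chosen component index. The selection operation in DE is also different from other evolutionary algorithms. Once all popsize trial individuals are generated, the selection operation is processed as:

$$x_i^{t+1} = \begin{cases} v_i^t, & \text{if } fitness(v_i^t) \le fitness(x_i^t) \\ x_i^t, & \text{otherwise} \end{cases}$$

where fitness(x) is the fitness function. Table 4 shows the initial parameters of the ANFIS-DE model.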
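A compact sketch of the DE/rand/1 scheme with binomial crossover and greedy selection follows. It is illustrative only: `de_minimize`, the binomial crossover variant and the default values of `F` and `CR` are assumptions (the study's settings are in Table 4, which is not reproduced here), and a sphere function stands in for the ANFIS error used as fitness in the hybrid model.

```python
import random


def de_minimize(fitness, dim, bounds, pop_size=20, max_iter=100,
                F=0.5, CR=0.9, seed=0):
    """DE/rand/1 with binomial crossover (illustrative defaults)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    cost = [fitness(x) for x in pop]
    history = [min(cost)]
    for _ in range(max_iter):
        new_pop, new_cost = [], []
        for i in range(pop_size):
            # Three mutually different individuals, none equal to i.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[r1][d] + F * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
            # Binomial crossover; j_rand guarantees one mutant component.
            j_rand = rng.randrange(dim)
            trial = [
                mutant[d] if (rng.random() <= CR or d == j_rand) else pop[i][d]
                for d in range(dim)
            ]
            trial = [min(hi, max(lo, t)) for t in trial]
            # Greedy selection between the trial and the current individual.
            trial_cost = fitness(trial)
            if trial_cost <= cost[i]:
                new_pop.append(trial)
                new_cost.append(trial_cost)
            else:
                new_pop.append(pop[i])
                new_cost.append(cost[i])
        pop, cost = new_pop, new_cost
        history.append(min(cost))
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best], history


def sphere(p):
    """Stand-in fitness; the hybrid model would use the ANFIS error instead."""
    return sum(x * x for x in p)


de_best, de_cost, de_history = de_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

All trial individuals of a generation are built from the previous population before any replacement takes place, which mirrors the statement that selection is processed once all trial individuals have been generated.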

Statistical indicators model performance evaluation
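The performance of the proposed system can be checked by computing several statistical parameters. The most popular statistical error indicators are the mean absolute bias error (MABE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination (R²). The MABE is the average of all absolute bias errors between the predicted and the measured values:

$$MABE = \frac{1}{n} \sum_{i=1}^{n} \left|V_{i,P} - V_{i,M}\right| \tag{31}$$

The MAPE is the mean absolute percentage difference between the predicted and measured wind power density:

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left|\frac{V_{i,P} - V_{i,M}}{V_{i,M}}\right| \times 100 \tag{32}$$

The RMSE presents the accuracy of the model by comparing the deviation between predicted and measured wind power density. The value of RMSE is always positive and it is defined as:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(V_{i,P} - V_{i,M}\right)^2} \tag{33}$$

The coefficient of determination (R²) indicates the strength of the linear relationship between the predicted and measured wind power density. R² is obtained by:

$$R^2 = \frac{\left[\sum_{i=1}^{n} \left(V_{i,P} - \bar{V}_P\right)\left(V_{i,M} - \bar{V}_M\right)\right]^2}{\sum_{i=1}^{n} \left(V_{i,P} - \bar{V}_P\right)^2 \sum_{i=1}^{n} \left(V_{i,M} - \bar{V}_M\right)^2} \tag{34}$$

In Eqs (31-34), $V_{i,P}$ and $V_{i,M}$ are the wind power density estimated from the developed prediction models and from measured data respectively, $\bar{V}_P$ and $\bar{V}_M$ are their means, and $n$ is the number of observations.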
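The four indicators can be computed with a few lines of Python, sketched below. The function names are illustrative, and `r_squared` assumes that R² is taken as the squared linear correlation between predicted and measured values, in line with its description as the strength of the linear relationship; other definitions of R² exist and would give different numbers.

```python
import math


def mabe(pred, meas):
    """Mean absolute bias error."""
    return sum(abs(p - m) for p, m in zip(pred, meas)) / len(meas)


def mape(pred, meas):
    """Mean absolute percentage error (in percent)."""
    return 100.0 * sum(abs((p - m) / m) for p, m in zip(pred, meas)) / len(meas)


def rmse(pred, meas):
    """Root mean square error."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(pred, meas)) / len(meas))


def r_squared(pred, meas):
    """Squared linear correlation between predicted and measured values (assumed form)."""
    p_bar = sum(pred) / len(pred)
    m_bar = sum(meas) / len(meas)
    cov = sum((p - p_bar) * (m - m_bar) for p, m in zip(pred, meas))
    var_p = sum((p - p_bar) ** 2 for p in pred)
    var_m = sum((m - m_bar) ** 2 for m in meas)
    return cov ** 2 / (var_p * var_m)
```

Note that MAPE is undefined when a measured value is zero, so months with zero measured WPD would need separate handling.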

Monthly wind power density prediction
In the current study, hybrid ANFIS (ANFIS-PSO, ANFIS-GA, ANFIS-DE) and standalone ANFIS models were developed to predict wind power density. These models were trained and tested with different data sizes. Tables 5 and 6 show descriptive values of the input and output parameters respectively, namely maximum (Max.), minimum (Min.), standard deviation (St Dev.) and range, for the different locations. Fig 6 shows the wind speed at 50 m hub height for the study locations. It can be seen in Fig 6 that the maximum wind speed prevails in Mersing, followed by Pulau Langkawi, Bayan Lepas and Kuala Terengganu. The raw data used in this study are presented in S1 Dataset: Supporting Information.rar. Table 7 presents the error metrics (MAPE, MABE, RMSE) obtained while training the prediction models with the input-output data sets for Mersing, Langkawi, Bayan Lepas and Kuala Terengganu. The data sizes for training and testing the models were P = 105 and Q = 27 respectively. In this study, the performance of the prediction models is ranked by lowest RMSE. It can be observed from Table 7 that the ANFIS-PSO model has the lowest RMSE, MAPE and MABE in the training stage for the data sets of Mersing, Bayan Lepas and Kuala Terengganu, whereas ANFIS-GA ranks first for the data set of Pulau Langkawi. Table 8 summarizes the RMSE, MAPE and MABE performance metrics when the testing data sets of the different locations are applied to the prediction models. As can be seen from Table 8, the ANFIS-GA model ranks first for the data sets of Mersing and Kuala Terengganu. On the other hand, the ANFIS-PSO model provides the smallest error metrics for the data sets of Pulau Langkawi and Bayan Lepas.
It is important to mention that sometimes ANFIS model provides better performance than hybrid ANFIS models but does not provide the best model performance for any of the underlying locations. For instance, ANFIS shows second and third best performance for the data set of Bayan Lepas and Mersing respectively when testing the prediction models.
R² measures the correlation between measured and predicted WPD and has a maximum value of one. A close inspection of R² in Tables 7 and 8 reveals a very good correlation between measured and predicted WPD when training and testing the prediction models with the data sets from Mersing, Pulau Langkawi and Bayan Lepas. On the other hand, the measured and predicted WPD of Kuala Terengganu show a comparatively lower correlation in both training and testing.
Furthermore, the performance of the developed prediction models is compared across different training and testing data sizes to illustrate the effect of data size on the prediction accuracy. Tables 9-12 summarize the RMSE, MABE, MAPE and R² when the input-output data sets of Mersing, Pulau Langkawi, Bayan Lepas and Kuala Terengganu respectively are applied to the prediction models. In this case, the training and testing data sets consist of P = 92 (70%) and Q = 40 (30%) observations, covering January 2004 to August 2011 and September 2011 to December 2014 respectively. Furthermore, in order to choose the data size that gives the lowest error in WPD prediction, all the developed models were trained and tested with another input-output data size, which consists of P = 80 (60%) and Q = 52 (40%) observations covering January 2004 to August 2010 and September 2010 to December 2014 respectively. The statistical error metrics (RMSE, MABE, MAPE and R²) for this category of training and testing data set are also shown in Tables 9-12.
A close inspection of Tables 7-12 reveals that:
i. For WPD prediction of Mersing, ANFIS-PSO is the best model in training, and the values of RMSE obtained are 4.96, 5.23 and 5.41 for the data sizes P = 105, Q = 27; P = 92, Q = 40; and P = 80, Q = 52 respectively. On the other hand, when testing the prediction models, ANFIS-GA is the best for the data size P = 105, Q = 27, with an RMSE of 5.04, whereas ANFIS-PSO is the best model for the data sizes P = 92, Q = 40 and P = 80, Q = 52, resulting in RMSE of 5.37 and 5.22 respectively.
ii. For WPD prediction of Bayan Lepas, ANFIS-PSO shows the best performance for all of the above data sizes when training the prediction models. In this case, the computed values of RMSE are 2.14, 2.17 and 2.18 when the data sizes are P = 105, Q = 27; P = 92, Q = 40; and P = 80, Q = 52 respectively. ANFIS-PSO also shows the best performance in testing for the data sizes P = 105, Q = 27 and P = 92, Q = 40, resulting in RMSE of 2.37 and 2.54 respectively. It is not surprising that the ANFIS model has the best accuracy in testing with the data size P = 80, Q = 52, with an RMSE of 2.65.
iii. For WPD prediction of Pulau Langkawi, ANFIS-GA and ANFIS-PSO show the best performance in training and testing respectively with the data size P = 105, Q = 27, which results in RMSE of 3.13 and 1.53 in training and testing respectively. For the data size P = 92, Q = 40, ANFIS-PSO ranks first in both training and validation, with RMSE of 3.29 and 1.69 respectively. For the data size P = 80, Q = 52, the ANFIS and ANFIS-GA models show the optimal RMSE, which are 3.45 and 2.04 in training and testing respectively.
iv. For WPD prediction of Kuala Terengganu, ANFIS-PSO and ANFIS-GA show the best performance in training and testing respectively with the data size P = 105, Q = 27, which results in RMSE of 3.73 and 4.79 respectively. For the data size P = 92, Q = 40, the ANFIS model shows the optimal RMSE of 3.45 in the training stage, and ANFIS-PSO presents the optimal RMSE of 2.23 in the testing stage. For the data size P = 80, Q = 52, ANFIS-PSO ranks first in both training and validation of the prediction models, with RMSE of 3.99 and 4.58 respectively.
Based on the above discussion, the data size P = 105, Q = 27 gives the smallest prediction error for WPD at all of the underlying locations. For this data size, ANFIS-PSO and ANFIS-GA overall showed a higher correlation between measured and predicted WPD for all locations in both the training and testing stages.
On the other hand, when the P = 92, Q = 40 data set is applied to the prediction models, ANFIS-PSO shows the best performance in both training and testing stages for predicting WPD at Mersing, Pulau Langkawi and Bayan Lepas. However, when predicting WPD at Kuala Terengganu, ANFIS-PSO ranks second in training and first in testing. It is important to mention that the P = 80, Q = 52 data size gives the largest error for predicting WPD at all of the underlying locations; therefore, this data set is not used to assess model performance. Figs 7-10 show the measured and predicted WPD when testing ANFIS-PSO and ANFIS-GA for Mersing, Bayan Lepas, Pulau Langkawi and Kuala Terengganu respectively, using the best data size, i.e. P = 105, Q = 27.
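The three data sizes compared above can be reproduced with a simple chronological split. The sketch below is illustrative only: it assumes that P counts training months and Q testing months, that the monthly series holds 132 points (P + Q = 132 in every case, which is consistent with 11 years of monthly records), and it uses placeholder values rather than measured WPD. The function name `chronological_split` is hypothetical.

```python
def chronological_split(series, p):
    """Return the first p points for training and the remainder for testing."""
    if not 0 < p < len(series):
        raise ValueError("p must leave at least one point on each side")
    return series[:p], series[p:]


# Placeholder monthly series of 132 points (assumption: 11 years x 12 months).
monthly_wpd = list(range(132))

# Build the three training/testing partitions discussed in the text.
splits = {p: chronological_split(monthly_wpd, p) for p in (105, 92, 80)}

for p, (train, test) in splits.items():
    share = p / len(monthly_wpd)
    print(f"P = {p}, Q = {len(test)}, training share = {share:.1%}")
```

Keeping the split chronological, rather than shuffling, means the testing months always follow the training months in time.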
However, both ANFIS-PSO and ANFIS-GA provide good overall performance in both the training and testing stages. Therefore, hybrid ANFIS, especially ANFIS-PSO and ANFIS-GA, can be suggested for practical use in WPD prediction for locations with similar wind resource conditions. Fig 11 shows the error distribution of WPD prediction using ANFIS-PSO and ANFIS-GA for the underlying locations. According to the definition of relative percentage error (RPE) presented in [27,28], an RPE within the interval -10% to 10% can be considered acceptable. The RPE values presented in Fig 11 were obtained from the proposed ANFIS-PSO and ANFIS-GA models when the 27-month testing data sets of the underlying locations were applied to the models. It can be observed that most of the wind power
Again, the linear regression analysis presented in Fig 13 shows the correlation between actual and predicted WPD obtained from ANFIS-PSO when testing the prediction models with data from the different underlying locations. The R² values obtained while testing the proposed ANFIS-PSO model for Mersing, Pulau Langkawi, Bayan Lepas and Kuala Terengganu are 0.9691, 0.9774, 0.9749 and 0.9456 respectively, which supports the conclusion that ANFIS-PSO predicts WPD with high precision.
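The RPE acceptance band and the R² check described above can be illustrated with a short sketch. The formulas are assumptions, since the definitions from [27,28] are not reproduced in this section: RPE is taken as 100 × (predicted - measured) / measured, and R² as 1 - SS_res / SS_tot. The function names and the sample values are hypothetical, not results from this study.

```python
def relative_percentage_error(measured, predicted):
    """Signed RPE in percent for each sample (assumed definition)."""
    return [100.0 * (p - m) / m for m, p in zip(measured, predicted)]


def share_within(rpe, limit=10.0):
    """Fraction of samples whose RPE lies inside [-limit, +limit]."""
    return sum(abs(e) <= limit for e in rpe) / len(rpe)


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Made-up WPD values in W/m^2, for illustration only.
measured = [50.0, 80.0, 120.0, 60.0]
predicted = [52.0, 76.0, 126.0, 69.0]

rpe = relative_percentage_error(measured, predicted)
acceptable = share_within(rpe, limit=10.0)
r2 = r_squared(measured, predicted)
print(rpe, acceptable, round(r2, 4))
```

With these sample values, three of the four predictions fall inside the ±10% band.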

Weekly wind power density prediction
A detailed explanation of the data collection and site description can be found in section 2. For weekly wind power density prediction, 11 years (2004-2014) of long-term daily average wind speed data were converted into weekly averages for all of the underlying locations in Malaysia, namely Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas. Afterward, the weekly mean wind speed at 50 m and the corresponding wind power density from measured data were applied to the developed standalone ANFIS and hybrid ANFIS models. The first 80% of the data were used for training and the remaining 20% for testing. The results presented in Figs 14-19 were mainly obtained when training and testing the ANFIS-PSO and ANFIS-GA models, as they were the best prediction models for the monthly wind power density prediction presented in section 5.1.
-5% up to +5%. The RMSE values were 1.76 and 2.11 when testing ANFIS-PSO and ANFIS-GA respectively. As mentioned in section 5.1, R² measures the correlation between measured and predicted WPD and has a maximum value of one. In Fig 19, the R² values were 0.9844 and 0.9701 when testing ANFIS-PSO and ANFIS-GA respectively.
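The weekly data preparation described in this section (daily speeds averaged into weeks, paired with WPD, then split 80/20) can be sketched as follows. Several points are assumptions rather than the authors' procedure: weeks are taken as consecutive 7-day blocks, WPD is computed as 0.5 ρ v³ from the weekly mean speed with a standard air density of 1.225 kg/m³, and the daily series is synthetic. All function names are hypothetical.

```python
AIR_DENSITY = 1.225  # kg/m^3, standard sea-level value (assumption)


def weekly_means(daily_speeds):
    """Average consecutive 7-day blocks; an incomplete trailing week is dropped."""
    n_weeks = len(daily_speeds) // 7
    return [sum(daily_speeds[7 * w:7 * w + 7]) / 7 for w in range(n_weeks)]


def wind_power_density(speed):
    """WPD in W/m^2 from a wind speed in m/s, using 0.5 * rho * v^3 (assumption)."""
    return 0.5 * AIR_DENSITY * speed ** 3


def split_80_20(samples):
    """Chronological split: first 80% for training, remaining 20% for testing."""
    cut = len(samples) * 4 // 5
    return samples[:cut], samples[cut:]


# Synthetic daily speeds for 10 weeks (not measured data).
daily = [4.0 + 0.5 * (d % 7) for d in range(70)]

weekly = weekly_means(daily)
pairs = [(v, wind_power_density(v)) for v in weekly]  # (input, target) samples
train, test = split_80_20(pairs)
print(len(weekly), len(train), len(test), pairs[0])
```

Because WPD depends on the cube of the speed, computing it from a weekly mean speed generally gives a different value than averaging daily WPD values, so the order of these two steps matters when reproducing results.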

Extrapolation capabilities of the proposed models
Measured wind speed data, which are needed to identify possible wind energy applications, are not available for many locations in Malaysia, including remote islands (smaller than 200 km²) and decentralized places. In this study, the capability of the proposed hybrid ANFIS models to extrapolate wind speed has been examined for a location on Tioman Island (latitude 2°48'30" N, longitude 104°8'29" E), where measured wind data are not available. The result is then compared with the measured wind data at Mersing station (latitude 2°27' N, longitude 103°50' E), which is the nearby station with similar climate conditions. As the study location does not have any measured meteorological data, the daily average solar radiation (kWh/m²/day), daily average air temperature, maximum and minimum air temperature, air pressure, relative humidity and altitude data were collected from the NASA surface meteorology and solar energy database for the whole of 2004 and used to predict the daily average wind speed (m/s). Figs 20 and 21 show the predicted daily average wind speed for Tioman Island using ANFIS-PSO and ANFIS-GA respectively, compared with the wind speed measured at Mersing meteorological station at 50 m above sea level in 2004.
From the error analysis, the MAPE and MABE for ANFIS-PSO were found to be 31.56% and 0.9672% respectively, whereas they were 32.21% and 0.9872% for ANFIS-GA. It can be observed from the figures that the wind characteristics on Tioman Island obtained with the proposed ANFIS-PSO and ANFIS-GA models do not closely match the measured wind at Mersing in many cases. However, it is important to note that the performance of the prediction models can be assessed accurately only once measured wind data for Tioman Island become available for comparison.
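For reference, the error statistics used in this comparison can be computed as in the sketch below. It assumes the usual textbook definitions of MABE, MAPE and RMSE, which may differ in detail from those given earlier in the paper, and the sample wind speeds are made up for illustration.

```python
import math


def mabe(measured, predicted):
    """Mean absolute bias error, in the units of the data."""
    return sum(abs(p - m) for m, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((p - m) / m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(measured, predicted):
    """Root mean square error, in the units of the data."""
    squares = [(p - m) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squares) / len(squares))


# Made-up daily wind speeds in m/s (reference station vs. model output).
measured = [2.0, 4.0, 5.0]
predicted = [3.0, 3.0, 5.0]

print(mabe(measured, predicted), mape(measured, predicted), rmse(measured, predicted))
```

MAPE divides by the measured value, so days with very low measured wind speed can dominate the average.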

Conclusion
Wind energy potential assessment is very important for independent power producers and governmental organizations to determine how efficiently wind power can be extracted from a certain location. The wind power density (WPD) is the key assessment parameter in wind potential analysis. Therefore, efficient soft computing prediction models based on ANFIS-PSO, ANFIS-GA, ANFIS-DE and standalone ANFIS were developed in this paper to predict the long-term (monthly and weekly) average wind power density of four different locations in Malaysia. The ANFIS technique was chosen for its simplicity, reliability and computational efficiency, its ease of combination with optimization and other adaptive techniques, and its ability to handle complex parameters. The most significant advantage of hybrid ANFIS is that PSO, GA or DE tunes the membership functions of the ANFIS model to minimize the error. The prediction models were trained and tested using wind speed data collected from the meteorological stations of the underlying locations and the measured wind power density. Moreover, different training and testing data sizes were applied to the prediction models to find the data size that gives the smallest error. Using the first 80% of the data for training and the remaining 20% for testing gives the lowest error in WPD prediction. Based on the results for the best data size, no single model was uniformly superior to the others for all locations in both the training and testing stages. Overall, ANFIS-PSO and ANFIS-GA outperformed standalone ANFIS and ANFIS-DE. Therefore, the results and analysis confirm that the proposed hybrid ANFIS models, especially ANFIS-PSO and ANFIS-GA, are capable of predicting WPD with higher accuracy and precision.
In further study, other soft computing techniques applicable to wind speed and power density prediction for other parts of the world can be developed and compared with hybrid ANFIS.