Jatropha half-sib family selection with high adaptability and genotypic stability

Jatropha (Jatropha curcas) has become one of the most important species for biofuel production. Currently, the Genotype x Environment (GxE) interaction is the biggest challenge that breeders must overcome to increase selection accuracy in plant breeding. Therefore, the objectives of this study were to estimate the parameters in 180 Jatropha half-sib families evaluated for five production years, to verify the significance of the GxE interaction variance, to evaluate the adaptability and stability of each family based on three prediction methods, to select superior half-sib families based on the adaptability and stability analyses, and to estimate the prediction accuracy for the sixth production year. Jatropha half-sib families were classified and selected using the following adaptability and stability methods: linear regression, bi-segmented linear regression, and the mixed-model approach called harmonic mean of the relative performance of genetic values (HMRPGV). The prediction accuracy was estimated as the Pearson correlation between the genetic values predicted by the adaptability and stability methods and the phenotypic values in the sixth production year. As a result, most half-sib families were classified as having general adaptability and general stability for the evaluated traits. The selection gain obtained via HMRPGV was higher than that of the other methods. The prediction accuracy for the sixth production year was 0.45. Therefore, HMRPGV is efficient for maximizing the genetic gain, and it can be a useful strategy for selecting genotypes with high adaptability and stability in Jatropha breeding, as well as in other species that must be evaluated over many years to reach a suitable selection accuracy.


Introduction
Jatropha (Jatropha curcas L.) is a perennial, monoecious plant that belongs to the Euphorbiaceae family. This species has several uses, such as living fences, phytoremediation, and medicinal purposes [1]. However, owing to worldwide concern about climate change, Jatropha has become important for biofuel production because it has a high seed oil content [2] and an oil/grain ratio superior to that of traditional oilseed crops such as soybean [3]. Although Jatropha is an exotic crop, it has been cultivated in almost all Brazilian regions, which indicates that it is adapted to different environments [4][5][6].
PLOS ONE | https://doi.org/10.1371/journal.pone.0199880 | July 12, 2018
The Genotype x Environment (GxE) interaction has been the most challenging factor in plant breeding because it makes variety recommendation more difficult. However, it is necessary to evaluate the GxE interaction and to use it as an advantage when recommending genotypes for different soil and climate conditions. In addition, it is important to evaluate its magnitude through adaptability and stability studies [7]. Several methods can be found in the literature, such as those based on linear regression [8], on bi-segmented linear regression [9], and, more recently, on the mixed-model approach called harmonic mean of the relative performance of genetic values (HMRPGV). It is worth mentioning that the choice of method directly influences the selection of superior genotypes.
Currently, the Eberhart and Russell [8] method is the most widely used for studying adaptability and stability because its parameters are simple to calculate and its results are easy to interpret. It assumes that genotype performance in each environment can be estimated by a linear regression of the phenotypic values on an environmental gradient [7]. In other words, adaptability is evaluated by fitting one linear regression equation per genotype, whereas stability is estimated from the sum of the regression deviations calculated for each environment.
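As a minimal sketch of the regression idea described above, the Python function below fits one line per genotype on an environmental index and derives a deviation-based stability statistic. The function name, the input layout (a table of genotype-by-environment means from a balanced trial) and the scaling of the deviation mean square by the number of replications are assumptions made for illustration; they follow a common textbook formulation and are not taken from the software used in this study. The input numbers are invented.

```python
def eberhart_russell(means, ms_residual, n_reps):
    """Sketch of an Eberhart and Russell type analysis (hypothetical helper).

    means[i][j] : mean of genotype i in environment j (balanced, >= 3 environments)
    ms_residual : residual mean square from the joint analysis of variance
    n_reps      : number of replications behind each mean
    """
    n_gen, n_env = len(means), len(means[0])
    grand_mean = sum(map(sum, means)) / (n_gen * n_env)
    # Environmental index: environment mean minus overall mean (sums to zero).
    env_index = [
        sum(row[j] for row in means) / n_gen - grand_mean for j in range(n_env)
    ]
    ss_index = sum(i * i for i in env_index)

    results = []
    for row in means:
        b0 = sum(row) / n_env  # intercept: genotype mean
        b1 = sum(y * i for y, i in zip(row, env_index)) / ss_index  # slope
        # Sum of squared deviations from the fitted line.
        ss_dev = sum((y - b0 - b1 * i) ** 2 for y, i in zip(row, env_index))
        msd = n_reps * ss_dev / (n_env - 2)  # mean square of the deviations
        s2d = (msd - ms_residual) / n_reps   # deviation variance (stability)
        results.append({"b0": b0, "b1": b1, "s2d": s2d})
    return env_index, results


# Invented toy data: 3 genotypes x 3 environments.
env_index, fits = eberhart_russell(
    [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 3.0, 3.0]],
    ms_residual=0.5,
    n_reps=2,
)
```

In this toy input the three slopes are 1, 2 and 0, so the first genotype responds like the average of the trial, the second responds more strongly to better environments, and the third does not respond at all.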
Cruz et al. [9] presented a modification of the Eberhart and Russell [8] method that simplifies the estimation of parameters and the calculation of sums of squares and, consequently, has statistical properties more suitable for breeding purposes. This method is based on bi-segmented linear regression, in which three adaptability parameters are estimated: the mean, the linear response to unfavorable environments, and the linear response to favorable environments. Based on this method, it is possible to select superior genotypes for general performance as well as for favorable and unfavorable environments. Thus, this method has the advantage of analyzing more parameters.
Although GxE interaction methodologies have improved substantially, up to the analysis based on bi-segmented linear regression, these methods still have some limitations. In particular, they cannot be applied to unbalanced experiments, non-orthogonal designs, or experiments evaluated in many environments with heterogeneous variances. These situations are very common in plant breeding, especially in forest breeding. In addition, these methods usually treat genotype effects as fixed, which is a disadvantage and is not appropriate when the objective is to estimate variance components and genetic parameters [10].
de Resende [11] proposed a methodology that makes it possible to select superior genotypes with general adaptability and stability based on mixed models (HMRPGV). This method allows genotypes to be selected for three features (general mean, adaptability and stability) simultaneously, and it presents several advantages: genotype effects are treated as random, and therefore the method estimates genotypic, instead of phenotypic, adaptability and stability; it can be used for unbalanced experiments and non-orthogonal designs; it can handle heterogeneous variances and correlated errors within blocks or environments; it can be applied with any number of environments; it makes it possible to consider adaptability and stability when selecting superior genotypes within progenies; it does not depend on the estimation or interpretation of other parameters, such as regression coefficients; it removes the noise of the GxE interaction because it takes the heritability of this effect into account; it generates results on the same scale as the phenotypic values of the trait; and it estimates the genetic gain for the three features simultaneously [10].
For the reasons discussed above, the objectives of this study were: i) to estimate the parameters in 180 Jatropha half-sib families evaluated for five production years; ii) to verify the significance of the GxE interaction variance; iii) to estimate the adaptability and stability of each family based on three prediction methods; iv) to select superior half-sib families based on the adaptability and stability analyses; and v) to estimate the prediction accuracy for the sixth production year.

Experimental design
The experiment was installed in the experimental area of Embrapa Cerrados, in Planaltina, DF (lat. 15˚35'30''S, long. 47˚42'30''W, at 1,007 m asl) in November 2008. The climate is tropical with dry winter and rainy summer (Aw), according to the Köppen classification, with average annual temperature of 22˚C, relative humidity of 73%, and average annual rainfall of 1,100 mm. The predominant soil at the site was classified as Red Latosol, with a high clay content.
A total of 180 Jatropha half-sib families were evaluated in a randomized block design with two replications and five plants per plot, spaced 4 x 2 m apart. The parents used to develop the half-sib families had genetic variability for yield production (PROD) and weight of 100 seeds (W100S), and they were sampled at random from Jatropha fields across Brazil. Management practices were based on Dias et al. [12], adapted according to the results of studies on Jatropha in Brazil and worldwide. The half-sib families were evaluated over five crop years (2010 to 2014) for PROD and four crop years (2010 to 2013) for W100S.

Statistical analysis
First, analysis of variance and deviance analysis were performed for each year following the model:
$$Y_{ij} = \mu + B_j + G_i + \varepsilon_{ij}$$
in which $Y_{ij}$ is the phenotypic value of the i-th family in the j-th block; $\mu$ is the general mean; $B_j$ is the effect of the j-th block, treated as fixed; $G_i$ is the effect of the i-th family, treated as random; and $\varepsilon_{ij}$ is the random error.
Second, a joint analysis of variance over all years was performed based on the model:
$$Y_{ijk} = \mu + B_k + G_i + E_j + GE_{ij} + \varepsilon_{ijk}$$
in which $Y_{ijk}$ is the phenotypic value of the i-th family evaluated in the j-th year in the k-th block; $\mu$ is the general mean; $B_k$ is the effect of the k-th block, treated as fixed; $G_i$ is the effect of the i-th family, treated as random; $E_j$ is the effect of the j-th year, treated as fixed; $GE_{ij}$ is the GxE interaction effect, treated as random; and $\varepsilon_{ijk}$ is the random error.
Variance components were estimated by the least squares method:
$$\hat{\sigma}^2_g = \frac{MS_g - MS_r}{e\,r}$$
in which $\hat{\sigma}^2_g$ is the genotypic variance among half-sib families; $MS_g$ is the mean square of the half-sib families; $MS_r$ is the mean square of the residual; $e$ is the number of environments (years); and $r$ is the number of replications (blocks).
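The moment estimator implied by the symbols defined above can be illustrated with a short Python sketch. It assumes the usual form in which the difference between the family and residual mean squares is divided by the number of environments times the number of replications; the function name and the input numbers are hypothetical and are not results of this study.

```python
def genotypic_variance(ms_families, ms_residual, n_env, n_reps):
    """Moment estimator of the among-family variance: (MS_g - MS_r) / (e * r).

    Assumed form; the estimate is negative whenever MS_g < MS_r.
    """
    return (ms_families - ms_residual) / (n_env * n_reps)


# Invented mean squares, for illustration only.
sigma2_g = genotypic_variance(ms_families=10.0, ms_residual=4.0, n_env=3, n_reps=2)
negative_case = genotypic_variance(ms_families=2.0, ms_residual=4.0, n_env=3, n_reps=2)
```

The second call shows that an estimator of this kind can return a value outside the parameter space when the mean square of interest is smaller than the one subtracted from it.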
The heritability among half-sib families, $\hat{h}^2$, was estimated from the variance components obtained by analysis of variance. Subsequently, the data were analyzed by restricted maximum likelihood (REML) under a mixed model. The variance components were estimated by REML via the Expectation-Maximization algorithm, in which $C^{22}$ and $C^{33}$ are obtained from $C$, the coefficient matrix of the mixed model equations; tr is the matrix trace operator; $r(X)$ is the rank of the matrix $X$; $N$ is the total number of data points; $q$ is the number of half-sib families; and $s$ is the number of GxE combinations. The heritability among half-sib families ($h^2$), the coefficient of determination of the GxE interaction effects ($c^2$), the experimental coefficient of variation ($CV_e$), the genetic coefficient of variation ($CV_g$), and the $CV_g/CV_e$ ratio ($CV_r$) were also estimated.
The adaptability and stability method proposed by Eberhart and Russell [8] is based on simple linear regression, which measures the response of each genotype to environmental variation according to the equation:
$$Y_{ij} = \beta_{0i} + \beta_{1i} I_j + \omega_{ij}$$
in which $Y_{ij}$ is the phenotypic value of the i-th half-sib family in the j-th environment; $\beta_{0i}$ is the linear coefficient of the i-th half-sib family; $\beta_{1i}$ is the regression coefficient that measures the response of the i-th half-sib family to the j-th environment; and $I_j$ is the environmental index, estimated as the mean of the j-th environment minus the overall mean:
$$I_j = \bar{Y}_{.j} - \bar{Y}_{..}$$
In addition, $\omega_{ij}$ is the random error, which can be decomposed as:
$$\omega_{ij} = \delta_{ij} + \bar{\varepsilon}_{ij}$$
in which $\delta_{ij}$ is the regression deviation and $\bar{\varepsilon}_{ij}$ is the mean experimental error. The stability parameter of the Eberhart and Russell [8] method is estimated from mean squares according to:
$$\hat{\sigma}^2_{di} = \frac{MSD_i - MS_r}{r}$$
in which $MSD_i$ is the mean square of the deviations of the i-th half-sib family; $MS_r$ is the residual mean square; and $r$ is the number of replications. The hypotheses of interest are $H_0: \beta_{1i} = 1$ versus $H_1: \beta_{1i} \neq 1$, and $H_0: \sigma^2_{di} = 0$. These hypotheses were evaluated by t and F tests, respectively.
The method proposed by Cruz et al. [9] is based on bi-segmented linear regression according to the model:
$$Y_{ij} = \beta_{0i} + \beta_{1i} I_j + \beta_{2i} T(I_j) + \psi_{ij}$$
in which $Y_{ij}$ is the mean of the i-th genotype in the j-th environment; $\beta_{0i}$ is the linear coefficient of the i-th genotype; $\beta_{1i}$ is the linear response of the i-th genotype to unfavorable environments; $\beta_{1i} + \beta_{2i}$ is the linear response of the i-th genotype to favorable environments; $I_j$ is the environmental index; $T(I_j) = 0$ if $I_j < 0$ and $T(I_j) = I_j - \bar{I}_{+}$ if $I_j > 0$, where $\bar{I}_{+}$ is the mean of the positive environmental indexes; and $\psi_{ij}$ is the random error, which can be decomposed as $\psi_{ij} = \delta_{ij} + \bar{\varepsilon}_{ij}$, where $\delta_{ij}$ is the regression deviation and $\bar{\varepsilon}_{ij}$ is the mean experimental error.
The adaptability and stability parameters of the Cruz et al. [9] method were obtained by least squares estimators. The hypotheses are $H_0: \beta_{1i} = 1$ versus $H_1: \beta_{1i} \neq 1$; $H_0: \beta_{1i} + \beta_{2i} = 1$ versus $H_1: \beta_{1i} + \beta_{2i} \neq 1$; and $H_0: \sigma^2_{di} = 0$. The first two hypotheses were evaluated by t tests and the last one by an F test.
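The bi-segmented regression can be sketched in a few lines of Python for a single genotype. This is an illustrative ordinary least squares fit, not the routine implemented in the software used in this study: the function name is hypothetical, the environmental indexes are assumed to sum to zero, at least two favorable environments with different indexes are assumed, and an index exactly equal to zero is treated as unfavorable because the definition above only covers strictly negative and strictly positive indexes.

```python
def cruz_bisegmented(y, env_index):
    """Least squares fit of Y_j = b0 + b1*I_j + b2*T(I_j) for one genotype.

    Hypothetical helper. Assumes env_index sums to zero and that I_j == 0
    counts as unfavorable (assumption, not stated in the source).
    Returns (b0, response in unfavorable envs, response in favorable envs).
    """
    positive = [i for i in env_index if i > 0]
    mean_pos = sum(positive) / len(positive)
    # T(I_j) = 0 for unfavorable environments, I_j - mean(I+) otherwise.
    t = [i - mean_pos if i > 0 else 0.0 for i in env_index]

    # I and T both sum to zero, so the intercept is the genotype mean.
    b0 = sum(y) / len(y)

    # Normal equations for the two slopes, solved with Cramer's rule.
    s_ii = sum(i * i for i in env_index)
    s_it = sum(i * k for i, k in zip(env_index, t))
    s_tt = sum(k * k for k in t)
    s_yi = sum(v * i for v, i in zip(y, env_index))
    s_yt = sum(v * k for v, k in zip(y, t))
    det = s_ii * s_tt - s_it ** 2
    b1 = (s_yi * s_tt - s_it * s_yt) / det
    b2 = (s_ii * s_yt - s_it * s_yi) / det
    return b0, b1, b1 + b2


# Invented data generated from b0 = 10, b1 = 0.5, b2 = 1.0 without error.
b0, b_unfavorable, b_favorable = cruz_bisegmented(
    [8.5, 9.5, 9.5, 12.5], [-3.0, -1.0, 1.0, 3.0]
)
```

Because the toy data follow the model exactly, the fit recovers a response of 0.5 in unfavorable environments and 1.5 in favorable ones, which is the pattern of a genotype that takes advantage of improved conditions.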
Half-sib families were grouped into six classes based on the Eberhart and Russell [8] and Cruz et al. [9] methods according to Table 1.
The accuracy and genetic value based on the Eberhart and Russell [8] and Cruz et al. [9] methods were estimated by the equation proposed by Resende [13]. The harmonic mean of the genotypic values (HMGV), used to evaluate stability in the HMRPGV method, was estimated according to the equation:
$$HMGV_i = \frac{n}{\sum_{j=1}^{n} \dfrac{1}{GV_{ij}}}$$
in which $n$ is the number of environments where the i-th genotype was evaluated and $GV_{ij}$ is the genotypic value of the i-th genotype in the j-th environment, which can be expressed as a proportion of the environment mean [14].
The relative performance of the genotypic values (RPGV) is used to evaluate adaptability in the HMRPGV method and is estimated by the equation:
$$RPGV_i = \frac{1}{n} \sum_{j=1}^{n} \frac{GV_{ij}}{M_j}$$
in which $M_j$ is the trait mean (PROD or W100S) in the j-th environment. Joint selection considering, simultaneously, PROD or W100S, stability and adaptability is given by the statistic harmonic mean of the relative performance of genetic values (HMRPGV) [14]:
$$HMRPGV_i = \frac{n}{\sum_{j=1}^{n} \dfrac{M_j}{GV_{ij}}}$$
Therefore, genotypes with higher HMRPGV are those that simultaneously have high adaptability and high stability across the environments evaluated in this study for PROD and W100S.
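To make the three statistics concrete, the following Python sketch computes HMGV, RPGV and HMRPGV for one family from its genotypic values and the environment means. It assumes the standard harmonic-mean forms suggested by the definitions above (all values strictly positive); the function names and the numbers are hypothetical and do not come from Selegen or from this data set.

```python
def harmonic_mean(values):
    """Harmonic mean of strictly positive values."""
    return len(values) / sum(1.0 / v for v in values)


def hmrpgv_statistics(gv, env_means):
    """Stability, adaptability and joint statistics for one family.

    gv[j]        : genotypic value of the family in environment j
    env_means[j] : trait mean of environment j (M_j)
    """
    # Genotypic values expressed as a proportion of each environment mean.
    relative = [g / m for g, m in zip(gv, env_means)]
    return {
        "HMGV": harmonic_mean(gv),               # stability
        "RPGV": sum(relative) / len(relative),   # adaptability
        "HMRPGV": harmonic_mean(relative),       # both, plus the mean
    }


# Two invented families with the same average relative performance.
unstable = hmrpgv_statistics(gv=[1.0, 2.0], env_means=[2.0, 2.0])
stable = hmrpgv_statistics(gv=[1.5, 1.5], env_means=[2.0, 2.0])
```

Both toy families have an RPGV of 0.75, but the family whose values swing between environments gets a lower HMRPGV (about 0.67 against 0.75), which illustrates how the harmonic mean penalizes instability.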

Table 1. Classes and practical classification of the half-sib families.
Class 1: General adaptability and low stability
Class 2: Specific adaptability for favorable environments and low stability
Class 3: Specific adaptability for unfavorable environments and low stability
Class 4: General adaptability and high stability ($\beta_1 = 1$ and $\sigma^2_{di} = 0$)
Class 5: Specific adaptability for favorable environments and high stability ($\beta_1 > 1$ and $\sigma^2_{di} = 0$)
Class 6: Specific adaptability for unfavorable environments and high stability ($\beta_1 < 1$ and $\sigma^2_{di} = 0$)
For each method and each trait, the 20 families that simultaneously had general adaptability and high stability were selected. Subsequently, the coincidence index and the predicted gain from selection were calculated for each method. For each method, the 60 half-sib families that simultaneously had high adaptability and predictability for PROD during the first five years were selected. From this selection, the prediction accuracy for PROD in the sixth production year was calculated, and the genotypic value was predicted for all of these families. The prediction accuracy was estimated as the squared Pearson correlation between the phenotypic value and the genotypic value. The genotypic value was predicted from the betas estimated by the Eberhart and Russell [8] and Cruz et al. [9] methods for each family.
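The accuracy measure described in this paragraph can be sketched as follows. The code computes the Pearson correlation between predicted genotypic values and observed phenotypic values and then squares it, as stated above; the function names and the data are hypothetical, and how the predicted values are obtained from the betas is left outside the sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def prediction_accuracy(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    return pearson(predicted, observed) ** 2


# Invented values for four families (not data from this study).
predicted_gv = [1.0, 2.0, 3.0, 4.0]
observed_year6 = [1.0, 3.0, 2.0, 4.0]
r = pearson(predicted_gv, observed_year6)
accuracy = prediction_accuracy(predicted_gv, observed_year6)
```

For these toy values the correlation is 0.8 and its square is 0.64, so the squared form is always the smaller of the two whenever the correlation is below one in absolute value.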
All statistical analyses were performed using the software Genes [15] and Selegen [14].

Estimates of genetic parameters
Analysis of variance (ANOVA) and mixed models were used to estimate genetic parameters for yield production (PROD) and weight of 100 seeds (W100S) evaluated over five years. There was genetic variability among Jatropha half-sib families, and the genetic variance estimated by ANOVA was twice that estimated by mixed models (Table 2). The GxE interaction was also significant, mainly for PROD. However, ANOVA was unable to estimate the GxE interaction variance accurately for W100S because the estimate fell outside the parameter space (negative variance) (Table 2).
The heritability among half-sib families ($\hat{h}^2$) estimated by ANOVA was higher than the REML estimate (Table 2). Moreover, the coefficient of determination of the GxE interaction effects ($\hat{c}^2$) was low for both traits according to de Resende [11]. The experimental coefficient of variation ($CV_e$) was considered high for PROD (greater than 20%) and low for W100S (Table 2). The $CV_g/CV_e$ ratio was less than one for both traits.
Table 2. Estimates of genetic parameters via analysis of variance (ANOVA) and mixed models for yield production (PROD) and weight of 100 seeds (W100S) evaluated in 180 Jatropha half-sib families during five years.
Because the GxE interaction was significant, ANOVA and mixed models were also fitted for each year separately (Table 3). The genetic variance estimated by REML was greater in magnitude than that estimated by ANOVA for both traits. However, the heritability was the same for the two methods, for both traits and in all years.

The genetic variance estimated by mixed models was four times greater than that estimated by ANOVA for PROD and W100S (Table 3). The heritability for PROD ranged from 0.17 to 0.65 across years, while the heritability for W100S ranged from 0.49 to 0.61 (Table 3). The $CV_e$ was high (greater than 20%) for PROD and low for W100S in all evaluated years (Table 3).

Classification of the half-sib families based on adaptability and stability methods
Adaptability and stability analyses were performed using two methods widely used in plant breeding to identify superior genotypes for favorable and unfavorable environments: the method proposed by Eberhart and Russell [8] and the method proposed by Cruz et al. [9].
Based on the method proposed by Eberhart and Russell [8], a large group composed of families with general adaptability and high stability was found for PROD and W100S (Figs 1 and 2, respectively). In addition, three small groups were formed for PROD: specific adaptability for favorable environments and high stability (11 families), general adaptability and low stability (11 families), and specific adaptability for unfavorable environments and high stability (13 families) (Fig 1). On the other hand, two groups were formed by just one family each: specific adaptability for favorable environments and low stability, and specific adaptability for unfavorable environments and low stability (Fig 1). For W100S, the group with general adaptability and low stability was formed by four families (Fig 2), while the groups with specific adaptability for favorable environments and high stability and with specific adaptability for favorable environments and low stability were formed by just one family each. In addition, the groups with specific adaptability for favorable and unfavorable environments presented no families (Fig 2).
A large group with general adaptability and high stability was also formed by the method proposed by Cruz et al. [9] for PROD and W100S (Figs 3 and 4). In addition, three small groups were formed for PROD: specific adaptability for favorable environments and high stability (13 families), general adaptability and low stability (13 families), and specific adaptability for unfavorable environments and high stability (six families) (Fig 3). One family formed the group with specific adaptability for favorable environments and low stability for PROD (Fig 3). Moreover, the group with general adaptability and low stability was formed by nine families for W100S (Fig 4), whereas the other groups had no families for W100S.
Families highlighted in red in Figs 1, 2, 3 and 4 had means greater than the overall mean for PROD and W100S. Almost all families with means above the overall mean belonged to the group with general adaptability and high stability under both methodologies. For PROD, families with means above the overall mean were also found in the groups with general adaptability and low stability and with specific adaptability for favorable environments and high stability (Figs 1 and 3).

Family selection and estimation of selection gain via adaptability and stability methods
The Eberhart and Russell [8], Cruz et al. [9], and HMRPGV methods were applied to select the 20 superior Jatropha half-sib families with general adaptability and high stability. The prediction accuracy was higher for the Cruz et al. [9] and Eberhart and Russell [8] methods than for HMRPGV for PROD and W100S (Tables 4 and 5, respectively). The beta estimates calculated by the Eberhart and Russell [8] and Cruz et al. [9] methods for the 20 superior half-sib families are presented in S1 and S2 Tables.
The coincidence index among the adaptability and stability methods was calculated, and there was a high coincidence between the Jatropha half-sib families selected by the Eberhart and Russell [8] and Cruz et al. [9] methods for PROD and W100S (Table 6). On the other hand, the coincidence between these methods and HMRPGV was low for PROD and W100S.
The selection gain estimated by HMRPGV was greater than that estimated by the other methods for PROD and W100S (Table 7). The selection gain in percentage for PROD ranged from (HMRPGV) to 9 (Eberhart and Russell [8] and Cruz et al. [9] methods) times greater than the selection gain for W100S.
The prediction accuracy for the sixth production year, based on the 60 superior Jatropha half-sib families selected after the fifth production year by the Eberhart and Russell [8] and Cruz et al. [9] methods, was 0.45 for both methods. The beta values of the Jatropha half-sib families evaluated in the sixth production year and their genotypic values for PROD are shown in S3 Table.

Estimates of genetic parameters
Differences among estimates of genetic parameters and variance components such as those found in this study have been reported previously in research comparing ANOVA and REML [16], and they may be due to unbalanced data, because there were missing plants in the plots. Moreover, REML yielded lower estimates because it considers essentially the random effects in the statistical model associated with the phenotypic values, with the data adjusted for the fixed effects and for the unequal number of plants per plot based on mixed models [17,18].
The GxE interaction is one of the greatest challenges in plant breeding because it makes genotype recommendation for many environments more difficult [7]. Indeed, Jatropha is still considered an undomesticated crop in Brazil, and consequently few studies have evaluated its GxE interaction [19,20]. As the environments evaluated in this study were production years, the differences among Jatropha half-sib families across environments can be attributed to climate factors such as precipitation, temperature and humidity, which affected genotype performance over the years and, consequently, made the GxE interaction significant. Therefore, adaptability and stability analysis is a useful tool to help breeders select superior Jatropha half-sib families more accurately for W100S and PROD. The heritability $\hat{h}^2$ estimated by ANOVA was higher than that estimated by REML because ANOVA overestimated $\hat{\sigma}^2_g$. For this reason, REML is more recommended for analyzing experiments with unbalanced data. In addition, the magnitudes of $h^2$ estimated by REML were similar to those reported previously in the literature for W100S and PROD [21][22][23][24]. High $CV_e$ for PROD and low $CV_e$ for W100S have also been reported previously in the literature [20,25].
Table 4. Selected Jatropha half-sib families via adaptability and stability methods for yield production (PROD), and their genetic values and individual predicted accuracy.
There was a drastic reduction in the number of observations available for estimating genetic parameters when each year was analyzed separately, and this made $\hat{\sigma}^2_g$ higher when estimated by REML, because this method tends to overestimate $\hat{\sigma}^2_g$ in experiments with a reduced number of data points [26]. Furthermore, as the number of missing data was reduced when each year was evaluated separately, $\hat{h}^2$ was the same for both methodologies in all production years. According to de Farias Neto and de Resende [18], REML and ANOVA give the same or only slightly different results when the experiment is balanced, and therefore, in this case, ANOVA is efficient.

Family selection and selection gain estimates via adaptability and stability methods
The identification of genotypes adapted to favorable and unfavorable environments is important in Jatropha breeding, and it will make it possible to expand the cultivation of this important crop to other Brazilian regions. Genotypes with specific adaptability for favorable environments, high average PROD, and high stability may be used by farmers with a high technological level, because these genotypes are able to improve their performance as environmental conditions improve. On the other hand, genotypes with specific adaptability for unfavorable environments, high average PROD, and high stability may be planted by farmers with a low technological level, because these genotypes are highly rustic and, consequently, will maintain their yield even when subjected to adverse conditions.
In addition, genotypes with general adaptability, high average PROD, and high stability can be cultivated in all environments. However, as environmental conditions improve, their performance will be lower than that of genotypes with specific adaptability for favorable environments, and their performance will be more seasonal than that of genotypes with specific adaptability for unfavorable environments.
Interestingly, the classification of Jatropha half-sib families for W100S based on adaptability and stability methods is important because this trait is highly and positively correlated with PROD [25], has higher heritability, and is easy to measure. Therefore, genotypes that have high stability and a high average for this trait can be used in Jatropha breeding for indirect selection for PROD in favorable and unfavorable environments.
The Eberhart and Russell [8], Cruz et al. [9] and HMRPGV methods selected different half-sib families for PROD and W100S. However, the coincidence index between the Eberhart and Russell [8] and Cruz et al. [9] methods was high. These methods based on linear regression are widely used to identify and select genotypes with high adaptability and stability. However, it must be considered that the environmental mean is composed of the environment effects plus the GxE interaction, which can lead to mistakes about the real causes of the GxE interaction [27]. Besides that, according to Duarte [28], genotype behavior may not have a linear relationship with the environment.
Genotypic values predicted by HMRPGV were higher than those predicted by the other methods. These results were expected because the genotypic values predicted by HMRPGV are expressed as a proportion of the overall mean of each environment, which penalizes genotypes with low stability. This methodology allows the selection gain to be maximized compared with the other adaptability and stability methods for PROD and W100S. Besides that, genotype recommendation is easy with this method because it ranks the genotypes based on the HMRPGV, and this value is calculated from the genotype mean, adaptability and stability. Therefore, genotypes with a high HMRPGV value have high mean, adaptability, and stability.
Table 7. Estimates of selection gain based on the 20 superior Jatropha half-sib families via adaptability and stability methods for yield production (PROD) and weight of 100 seeds (W100S).

Future applications
Jatropha is a perennial species that has been widely used for oil production, and the results reported here show that it is possible to obtain gains from selection in this crop based on genetic information. Therefore, the techniques currently used in Jatropha breeding, combined with more robust statistical methodologies such as HMRPGV and with molecular markers such as Single Nucleotide Polymorphisms (SNPs), can be the key to improving the selection accuracy, reducing the cycle time, and decreasing the cost per cycle.
The use of SNPs should be exploited and applied in future research aiming to improve the prediction accuracy. Recent developments in next-generation sequencing platforms have allowed researchers to genotype populations with a large number of individuals quickly and at low cost [29].
Therefore, the use of molecular markers that exploit the GxE interaction has emerged as a useful strategy for selecting superior genotypes, especially in forest species, in which the cycle time is very long [30] and evaluations should be performed after the genotypes have stabilized their yield, which happens at around 11 years of age in Jatropha. Based on the theoretical and practical studies discussed in this research, the introduction of genome-wide selection models that account for the GxE interaction may increase the efficiency of Jatropha breeding by reducing the cycle time and/or omitting the progeny test phase [31].

Conclusion
There is genetic variability among Jatropha half-sib families. The GxE interaction was statistically significant for both traits, and the adaptability and stability methods were able to distinguish favorable and unfavorable environments. HMRPGV is efficient for maximizing the genetic gain compared with the other methods, making it a suitable method for selecting superior genotypes in long-cycle species. The half-sib families 20, 40, 43, and 101 have high yield, specific adaptability for favorable environments and high stability. The half-sib family 44 has high yield, specific adaptability for unfavorable environments and high stability. The Cruz et al. [9] and Eberhart and Russell [8] methods were able to predict the sixth production year based on the selected half-sib families with moderate accuracy.