Multi-trait multi-environment Bayesian model reveals G x E interaction for nitrogen use efficiency components in tropical maize

Identifying maize inbred lines that are more efficient in nitrogen (N) use is an important strategy and a necessity in the context of environmental and economic impacts attributed to the excessive N fertilization. N-uptake efficiency (NUpE) and N-utilization efficiency (NUtE) are components of N-use efficiency (NUE). Despite the most maize breeding data have a multi-trait structure, they are often analyzed under a single-trait framework. We aimed to estimate the genetic parameters for NUpE and NUtE in contrasting N levels, in order to identify superior maize inbred lines, and to propose a Bayesian multi-trait multi-environment (MTME) model. Sixty-four tropical maize inbred lines were evaluated in two experiments: at high (HN) and low N (LN) levels. The MTME model was compared to single-trait multi-environment (STME) models. Based on deviance information criteria (DIC), both multi- and single-trait models revealed genotypes x environments (G x E) interaction. In the MTME model, NUpE was found to be weakly heritable with posterior modes of heritability of 0.016 and 0.023 under HN and LN, respectively. NUtE at HN was found to be highly heritable (0.490), whereas under LN condition it was moderately heritable (0.215). We adopted the MTME model, since combined analysis often presents more accurate breeding values than single models. Superior inbred lines for NUpE and NUtE were identified and this information can be used to plan crosses to obtain maize hybrids that have superior nitrogen use efficiency.


Introduction
Maize is the leading cereal crop in terms of production and, together with rice and wheat, it is one of the most important sources of the global population's daily caloric requirement [1]. Most of the maize planted in the tropics is grown in areas where soils are acidic and low in nutrients, especially nitrogen (N). Nitrogen plays an essential role in the cycle of most crops and maize requires a large amount of this nutrient. In tropical areas, together with drought, low N conditions represent the major cause of yield loss in maize [2]. On the other hand, some a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 under contrasting N levels in the soil as well as to perform comparisons with single trait models. We aimed also to identify superior maize inbred lines for N-uptake and N-utilization efficiencies.

Plant material
The genetic material consisted of 64 tropical maize inbred lines. These maize inbred lines represent a set of diverse germplasm from the maize breeding program of Universidade Federal de Viçosa (UFV), Viçosa, Minas Gerais state, Brazil. They were obtained from different sources of tropical maize germplasm: commercial tropical maize hybrids, maize populations and maize open-pollinated varieties (OPV; Table 1). It is important to mention that, in Brazil, it is allowed to use any commercial variety as germplasm source for breeding purposes and the development of new maize inbred lines; therefore it is possible to use maize hybrids, except the ones with transgenic events (10 th article, of the third paragraph of the No. 9456 Brazilian Law of April 25 th , 1997). The genetic sources used for our program to develop new inbred lines are conventional varieties. The 64 maize inbred lines were developed using a modified pedigree method in the same region where they were evaluated. Most of tropical hybrids used as source to develop the inbred lines studied in this work are recommend to tropical areas; consequently, they are adapted to the region where the inbred lines were studied. The CMS28 is a tropical maize population introduced in Brazil from International Maize and Wheat Improvement Center (CIMMYT) by the Brazilian Agricultural Research Corporation (EMBRAPA), Maize and Sorghum unit, in the 70's [22]. The Nitroflint, CMS50 and BR106 are tropical maize populations developed by EMBRAPA Maize and Sorghum. The Nitroflint population is adapted to low N soil and, it has been used as source germplasm for the development of new inbred lines to tolerance to low N in soil [23]. The CMS50 is a tropical maize population produced by the intercrossing of three single-cross and two double-crosses [24]. In addition, the BR106 is a tropical open-pollinated variety produced by the intercrossing of three Brazilian varieties (Maya, Centralmex and Compose Dent) and an exotic introduction (Tuxpeño 1) [25]. It is cultivated by some small farmers in Brazil and used in some recurrent selection programs in public sector.

Field experiments
The 64 tropical maize inbred lines were evaluated in 2014 at UFV Experimental Station in Coimbra (latitude 20˚51'24''S; longitude 42˚48'10''W; altitude of 720 m asl), located in southwest Minas Gerais state, Brazil. Maize inbred lines were evaluated in two independent experiments under contrasting levels of N: low nitrogen (LN) and high nitrogen (HN). In this context, there is a totally different environmental condition for plants, thus characterizing different environments, mainly because they are very contrasting nitrogen levels.
The design of each experiment was an 8 x 8 lattice square with two replications and tworow plots. The plot size was 6.4 m 2 (4 m long with 0.8 m row spacing and 0.2 m plant spacing). At the LN level, 30 kg ha -1 of N was applied. At the HN level, 180 kg ha -1 was applied. Trait management was the same for all N level experiments, employing standard agricultural practices.
Grain yield was recorded from all ears on the plot at physiological maturity. Ears were shelled, the grain weight and grain moisture percentage were recorded, and grain yield (kg ha -1 ) was calculated at 145 g kg -1 moisture. At physiological maturity, five representative plants were harvested from each plot by cutting them close to the soil surface. All plant stover together with cobs (with kernels removed) at maturity were chopped and oven-dried to a constant weight at 70˚C for 72 hours. The harvest ears were also oven-dried at 70˚C for 72 hours. Grain and stover samples were milled using an analytical mill and analyzed for N according to the Kjeldahl method [26]. The components of NUE were calculated according to [3]: N-uptake efficiency was calculated as the ratio of the total N (kg ha -1 ) in the aboveground biomass to total N in the soil (kg ha -1 ) and N-utilization efficiency was calculated as the ratio of grain yield (kg ha -1 ) to total N (kg ha -1 ) in the aboveground biomass.

Statistical analysis
N-uptake efficiency and N-utilization efficiency were analyzed using single and multi-trait models via the Markov Chain Monte Carlo (MCMC) Bayesian approach. The idea was to compare: (i) the full model (considering the interaction between the genotypes and N levels) with the null model (not considering the interaction); (ii) the estimates of the genetic parameters from single-and multi-trait models at LN and HN levels.
The multi-trait multi-environment model was given by: Which can be rewritten as: Where: ' represents low and h represents high N; 1 represents NUpE and 2 represents NUtE; y is the vector of the phenotypic values of the traits (NUpE and NUtE); β is the vector of systematic effects of environment (N levels), β~N(μ, I ∑ β ); r is the vector of the replication inside the environment effect, r~N(0, I ∑ r ); b is the vector of random effects of block inside replication inside environment, b~N(0, I ∑ b ); u is the vector of random effects of genotypes, u~N(0, I ∑ u ); ε is the vector of random residual effects, ε~N(0, I ∑ ε ). X is the incidence matrix of environment (N level) effects, Z 1 is the incidence matrix of replication inside the environment effect, Z 2 is the incidence matrix of block inside replication inside environment effects and Z 3 is the incidence matrix of genotype effects. The (co)variance matrix estimates are given by: where: ' represents LN and h represents HN; 1 represents NUpE and 2 represents NUtE. We assumed that the variance-covariance matrices follow an inverted Wishart distribution, which was used as a priori to model the variance-covariance matrix [27].
The model was implemented in the "MCMCglmm" R package [28,29] of R software (R Development Core Team). A total of 1,000,000 samples were generated, and assuming a burnin period and sampling interval of 500,000 and 5 iterations, respectively, this resulted in a final total of 100,000 samples. The convergence of MCMC was checked by the criterion of [30], which was performed using the "boa" [31] and "CODA" (convergence diagnosis and output analysis) [32] R packages. Even though Bayesian and frequentist frameworks are not directly compared, mainly in the field of Genetics and Breeding [17], the same models were also fitted based on REML (Restricted Maximum Likelihood) estimation method. However, the convergence was not achieved through AI (Average Information) and EM (Expectation-Maximization) algorithms.
The full (considering the inbred lines x N levels interaction) models were compared with the null (not considering the interaction) models by the deviance information criterion (DIC) proposed by [33]: yÞ is a point estimate of the deviance obtained by replacing the parameters with their posterior means estimates in the likelihood function and p D is the effective number of parameters in the model. Models with smaller DIC should be preferred to models with higher DIC.
Variance components, broad-sense heritability per plot, coefficients of residual and genetic variation, the variation index and the genotypic correlation coefficients between traits and breeding values were calculated from the posterior distribution. The package "boa" [31] of R software was used to calculate the highest posterior density (HPD) intervals for all parameters. Posterior estimates for the broad-sense heritability of NUpE and NUtE for each interaction were calculated from the posterior samples of variance components obtained by the model, using the expression: Where: s 2ðiÞ g , s 2ðiÞ r , s 2ðiÞ b and s 2ðiÞ ε are the genetic, replication, block inside replication and residual variances for each iteration, respectively. The genetic association between the pairs of traits in each environment was calculated as:

Results and discussion
All chains achieved convergence via the criterion of [30]. According to the deviance information criteria (DIC), there was positive evidence of interactions between inbred lines and N levels for all models analyzed (  (Table 3). In the multi-trait multi-environment (MTME) model, NUpE at HN and LN were found to be weakly heritable with Bayesian credible intervals (probability of 95%): h 2 = 4.177x10 -5 to 0.090 and h 2 = 6.644x10 -5 to 0.135, posterior modes: h 2 = 0.016 and h 2 = 0.023, at HN and LN, respectively. Just a comment about this, we would like to emphasize that the low heritability observed only for NUpE did not depend on the number of evaluated samples, since the used Bayesian framework is essentially recommended for situations involving small sample sizes. The NUtE at HN was found to be highly heritable whereas at LN it was moderately heritable, with credible intervals (probability of 95%) ranging from: h 2 = 0.240-0.674 and h 2 = 0.084-0.396, posterior modes: h 2 = 0.490 and h 2 = 0.215, respectively. The difference between mean, mode and median heritability estimates (Tables 3 and 4) reflects some lack of symmetry in the posterior distribution estimates [21]. In our study, we found higher heritability for NUpE at LN and higher heritability for NUtE at HN level. In a study on maize hybrids in three N levels, [7] also found that, for NUpE, the h 2 increased with a decrease in N level, whereas for NUtE, the highest heritability was estimated at an HN level.
In the current study, the posterior mean of the genetic correlation between NUpE and NUtE was not significantly different from zero (95% credible intervals) under both N levels (HN: -0.330 to 0.241; LN: -0.328 to 0.338) ( Table 3). Although some studies consider NUpE and NUtE to be not independent from a statistical point of view, in the present study the opposite was observed. [7] evaluated N-utilization efficiency and N-uptake efficiency in two sets of maize hybrids, one produced from crosses among maize inbred lines selected at HN and another set of hybrids from inbred lines selected at LN. The authors found that at LN, for hybrids from lines selected at HN, N-utilization efficiency was positively related to NUpE, whereas at HN, it was negatively and highly related to NUpE. However, for hybrids from lines selected at LN, there was no correlation among components of NUE and the components of NUE were independent. [12] also found that the genetic correlation between components of  Table 3. Posterior inferences for the mean and genetic variance; the mode, mean, median and higher posterior density (HPD) interval of the broad-sense heritability; and the mode, mean, median and higher posterior density (HPD) interval of the genetic correlation, considering the proposed multi-trait multi-environment model.  [35], in a review of the genetics of NUE in crop plants, stated that maize studies can be divided into two groups, the comparison studies of hybrids from different eras and the ones related to inbred lines use or mapping populations. The authors also discussed that remains difficult to determine whether maize hybrids respond differently than inbred lines to fertilizer N. According to [6], if only elite hybrids are evaluated, probably specific genes for adaptation to low nitrogen condition have been lost due to selection in favorable conditions over the years. Thus, according to the results from the present study, for maize inbred lines NUpE cannot be used for indirect selection of NUtE, and vice versa, and the genetic improvement for NUE in maize breeding programs should consider the improvement of both NUEcomponents, NUtE and NUpE. For the single-and multi-trait models, the means estimates for NUpE and NUtE were always higher in LN inputs than in HN (Tables 3 and 4), with a more pronounced difference between environments for NUpE than NUtE. [6] reported that NUE is higher at LN level than at HN level and this is due to the fact that the efficiency of N decreases as the fertilization level increases. These authors also mentioned that quantitative trait loci (QTL) studies tend to confirm that variation in NUtE is greater than variation in NUpE at LN environments and the opposite occurs at HN environments. In the present study, the genetic variance was greater for NUpE in LN and greater for NUtE in HN (Tables 3 and 4). These different values between the environments (N levels) may be due to the biological response of the inbred lines to the nutritional N stress. [36] verified that the genes responsible for the control of NUE are expressed according to the availability of the nutrient to the plant and consequently, the magnitude of genetic variability is expressed differently in contrasting environments. It was noticed here that posterior inferences for the genetic parameters obtained through the multi-trait model were very similar to the ones obtained through the single-trait models (Tables 3 and 4), this could be due to the lack of correlation between traits.
One of the advantages of Bayesian methods is the ability to access the posterior density intervals of genetic parameters (Figs 1 and 2). The breeding values of the maize inbred lines for each trait and their highest posterior density (HPD) intervals were obtained and were used to assist in the selection of genotypes.
The relative variation index is the ratio of the coefficient of genotypic variation to the coefficient of residual variation (CVg/CVe). Relative variation indices that are greater than the unit suggest that genetic variation is more influential than residual variation. This was observed in this study for both traits but only in HN input, suggesting that there was a greater influence of the residual variation under low N level than high N level. Highest CVe and CVg were observed for NUpE in HN input, whereas lowest CVg was observed for NUtE in LN input (Table 5). We have chosen to adopt the MTME model for the inbred lines selection since a combined analysis takes more data into consideration and consequently is often more accurate to predict breeding values than single models [37,38]. According to [39], the increase in accuracy with the use of multi-trait best linear unbiased prediction (BLUP) analysis compared with singletrait analysis is proportional to the difference between the genetic and environmental correlations of the analyzed traits. In order to compare the ranking of the most efficient tropical maize inbred lines between multi-and single-trait models for each trait in a given environment, the percentage of coincidence of the top 10 inbred lines list (not considering their order in the list) was calculated. For NUpE the coincidence percentage between models was 80% and 70% at LN and HN, respectively. The coincidence percentage for NUtE was 100% under both N levels. The high percentage of coincidence between rankings of the multi-and single-trait models may be, as already mentioned, due to the fact that the genetic correlation between the traits was not significant. Genetic merit estimates of individuals, whether or not they consider genomic information, are in general estimated more accurately when they are based on analyses that consider all the traits simultaneously-that is, under a multivariate approach [40,41]. In a study with popcorn half-sib families, [42] found that multi-trait BLUP (Best Linear Unbiased Prediction) was more accurate and efficient in family selection than single-trait BLUP.  Table 5. Coefficient of residual variation (CVe, %), coefficient of genotypic variation (CVg, %) and relative variation index (CVg/CVe) for the multi-trait multi-environment model and the single-trait multi-environment models. The effect of the G x E interaction was significant, which indicates that the maize inbred lines responded differently to N levels. In order to compare the ranking of inbred lines between the environments (HN and LN) for each trait in a given model, the percentage of coincidence of the top 10 inbred lines list was calculated as described above. For the trait NUpE, 20% and 30% of coincidence between N levels was observed for the multi-trait and single-trait models, respectively, whereas for NUtE, the coincidence was 100% for both models. This result suggests that the trait NUpE was more influenced by the N level than NUtE. Superior inbred lines in a given environment (N level) for a given trait were identified through the breeding values obtained in the MTME analysis. The superior inbred lines presented greater breeding values for the given trait. The inbred lines that stood out from the others for NUpE were, at the HN level, VML010, VML017, VML077, VML014, VML037, VML009, VML016, VML032, VML005, VML040, whereas at the LN level, VML017, VML051, VML016, VML002, VML033, VML019, VML022, VML188, VML048 and VML034 presented good performances compared to the other inbred lines. For NUtE, the most efficient maize inbred lines were VML081, VML022, L038, VML032, VML051, VML016, VML020, VML027, VML028 and VML013 for both N levels. Another observation that must be reported is that the VML016 inbred line was the only one that appeared in the top 10 list for both traits in the two considered N levels, possibly constituting an important source of genes for NUE traits.

Conclusions
We demonstrated the feasibility of the proposed multi-trait multi-environment Bayesian model for plant breeding involving a set of genotypes that are evaluated for multiple traits across a range of environments. The accurate estimates of genetic parameters bring new perspectives on the application of Bayesian methods to solve modeling problems in maize breeding.
NUpE was not genetically correlated to NUtE in HN and LN environments. The panel of inbred lines evaluated presents genetic variability, which is fundamental for the selection and improvement of traits of interest. This variability is particularly important in low nitrogen environments, which is the condition that most small farmers that grow maize in the tropics experience. Superior inbred lines were identified for each environment (N level), thus allowing to design selection strategies specifically for each evaluated environment.
Supporting information S1