Identification of environment-specific QTL and of stable QTL with consistent genetic effects across a wide range of environments is of great importance in plant breeding. Inclusive Composite Interval Mapping (ICIM) has been proposed for additive, dominant and epistatic QTL mapping in biparental populations in a single environment. In this study, ICIM was extended to QTL by environment interaction (QEI) mapping for multi-environmental trials, where the QTL average effect and QEI effects could be properly estimated. Stepwise regression was first applied in each environment to identify the most significant marker variables, which were then used to adjust the phenotypic values. One-dimensional scanning was then conducted on the adjusted phenotypic values across the environments in order to detect QTL with an average effect, QEI effects, or both. In this way, the genetic background could be well controlled while conventional interval mapping was applied. An empirical method to determine the threshold of the logarithm of odds (LOD) was developed, and the efficiency of ICIM QEI mapping was demonstrated in simulated populations under different genetic models. One actual recombinant inbred line population was used to compare mapping results between QEI mapping and single-environment analysis.
Citation: Li S, Wang J, Zhang L (2015) Inclusive Composite Interval Mapping of QTL by Environment Interactions in Biparental Populations. PLoS ONE 10(7): e0132414. https://doi.org/10.1371/journal.pone.0132414
Editor: Xiaoming Pang, Beijing Forestry University, CHINA
Received: March 3, 2015; Accepted: June 12, 2015; Published: July 10, 2015
Copyright: © 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This study was funded by the National Natural Science Foundation of China (project no. 31271798); the National 973 Research Program of China (project no. 2014CB138105), and the HarvestPlus Challenge Program of CGIAR.
Competing interests: The authors have declared that no competing interests exist.
QTL by environment interaction (QEI) widely exists in crops and other organisms. Studies on QEI contribute to the efficient use of marker-assisted selection (MAS) in breeding, and better understanding of genetic architecture of important quantitative traits and genotype by environment interactions [1,2,3]. As a consequence, many theoretical and applied studies have been conducted on QEI analysis in multi-environmental trials.
Analysis of variance (ANOVA), the simplest method, tests one marker at a time and has no background control, which gives rise to many false positive QTL. Composite interval mapping was applied to detect QEI when multiple environments were regarded as multiple traits [5,6], but the effect of a QTL in the current interval may be absorbed by the background variables, resulting in biased estimation. Tinker and Mather proposed simplified composite interval mapping, which is suitable for large datasets. An approximate analysis of QEI was proposed with no limit on the number of environments. Hackett et al. proposed a multi-trait QTL mapping method for QTL position estimation based on the regression mapping approach of Haley and Knott.
Mixed model approaches have also been used in QEI detection. Wang et al. developed a methodology using mixed linear models, but the results may be sensitive to the specified models. Piepho showed how random QEI effects and genetic correlation could be straightforwardly handled in the mixed model framework. Malosetti et al. presented a strategy combining mixed models with factorial regression. A method using factorial regression and partial least squares was an extension of the statistical approaches developed by Crossa et al. and van Eeuwijk et al. A strategy combining mixed models with simple and composite interval mapping, and introducing environmental covariables, was developed by Boer et al. However, inconsistent mapping results were observed owing to different assumptions about fixed and random effects and different choices of variance-covariance structure, for example, identical genetic variation, compound symmetry, first-order analytic model, uniform covariance and heterogeneous variance. Further studies are needed to validate the efficiency of the mixed-model based methods. In addition, mixed models have high computational complexity and are time-consuming, which makes them less suitable for large data sets.
Recently, methods based on Bayesian models [18,19] were proposed for QEI analysis, using Markov chain Monte Carlo. In these methods, estimation of QEI was simplified by treating each marker as a putative QTL. Strategies combining composite interval mapping with the additive main effects and multiplicative interaction model were proposed to decrease the noise of phenotypic values, especially when the genotype by environment interaction (GEI) noise is large. Generally speaking, Bayesian models are hard to implement because of their computational burden and the difficulty of choosing appropriate prior distributions [18,19].
Inclusive Composite Interval Mapping (ICIM) was proposed for additive, dominant and epistatic QTL mapping in biparental populations [7,21–24]. ICIM applies a two-step mapping strategy. First, stepwise regression is conducted to select the significant markers for additive QTL mapping, or marker pairs for epistatic QTL mapping, considering all marker information simultaneously. Second, the phenotypic values are adjusted by the marker variables retained in the regression equation, except the two markers flanking the current scanning position(s), for background control. The adjusted phenotypic values are subsequently used in interval mapping. This strategy effectively separates the cofactor selection from the interval mapping, which uses the Maximum Likelihood (ML) method. Genetic background control decreases the variance of the estimated genetic parameters, and therefore increases the accuracy of estimates and the detection power [5,7,21,25]. Extensive simulations have illustrated that ICIM is an efficient mapping method with higher detection power, lower false discovery rate (FDR) and less biased estimates of QTL effect and position [7,22–24,26,27]. With the user-friendly software QTL IciMapping (freely available from www.isbreeding.net), ICIM has been widely applied in QTL mapping research (for examples, see [28–31]).
In this study, we focused on additive QEI analysis. Our objectives were: (1) to propose an inclusive linear model capable of absorbing all genetic effects in QEI mapping; (2) to extend ICIM to additive QTL by environment interactions, and estimate the average effect, QEI effects and the phenotypic variance explained (PVE); (3) to propose an empirical method to determine the LOD threshold in QEI mapping; (4) to validate the proposed QEI mapping method in simulated populations and in an actual maize recombinant inbred line (RIL) population.
Materials and Methods
Genetic and linear regression models in QEI mapping
A doubled haploid (DH) population is used to illustrate ICIM QEI mapping. For simplicity, suppose that two inbred lines P1 and P2 differ at m QTL, located in m intervals defined by m+1 markers on one chromosome. If no QTL is located in a marker interval, the average and interaction effects of the QTL are treated as zero. QTL genotypes of P1 and P2 are assumed to be Q1Q1Q2Q2…QmQm and q1q1q2q2…qmqm, respectively. Suppose that a DH population is derived from the F1 hybrids of P1 and P2 and phenotyped in e environments. For each individual, X = (x1, x2,…, xm, xm+1) represents the marker variables equal to 1 or -1, standing for the two marker types (i.e. P1 marker type and P2 marker type). G = (g1, g2,…, gm) represents the QTL variables equal to 1 or -1, standing for the two QTL genotypes (i.e. P1 QTL type and P2 QTL type). Let a1h, a2h,…, amh represent the additive effects of the m QTL in the hth environment, respectively. Under the assumption of additivity of QTL effects, Gh, the genotypic value of an individual in the hth environment under the additive genetic model, can be written as Eq (1),
$$G_h = \mu_h + \sum_{j=1}^{m} a_{jh}\, g_j \tag{1}$$
where $\mu_h$ is the mean genotypic value in the hth environment.
The expectation of the jth QTL variable conditional on the marker types depends on the position of the QTL in the chromosomal interval flanked by the jth and (j+1)th markers and on the length of this interval [7,23,32,33], i.e.,
$$E(g_j \mid X) = \lambda_j x_j + \rho_j x_{j+1} \tag{2}$$
where λj and ρj are functions of the three recombination fractions between the jth marker and jth QTL, between the jth QTL and (j+1)th marker, and between the jth and (j+1)th markers. The expectation of the genotypic value Gh conditional on marker type X can be denoted as,
$$E(G_h \mid X) = b_{0h} + \sum_{j=1}^{m+1} b_{jh}\, x_j \tag{3}$$
where $b_{0h} = \mu_h$, $b_{1h} = \lambda_1 a_{1h}$, $b_{jh} = \rho_{j-1} a_{(j-1)h} + \lambda_j a_{jh}$ (j = 2,…,m), and $b_{(m+1)h} = \rho_m a_{mh}$. The coefficient of the jth marker in the hth environment, i.e. bjh, is only affected by QTL in the (j-1)th and jth marker intervals. Therefore, when QTL are isolated by at least one blank interval, bjh and b(j+1)h contain all the position and additive effect information of the QTL in the jth interval. These statistical properties provide the theoretical basis of QEI mapping.
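The linearity behind Eq (2) can be checked numerically. The sketch below assumes a DH population, no crossover interference, and marker and QTL variables coded as 1 or -1. The helper names (`qq_prob`, `lambda_rho`) and the closed forms used for λ and ρ are derived here under those assumptions for illustration; they are not quoted from the paper.

```python
def qq_prob(x_left, x_right, r_left, r_right):
    """P(QTL genotype is QQ | flanking marker types) in a DH population.

    r_left / r_right: recombination fractions between the QTL and the
    left / right flanking marker (no interference assumed).
    """
    # Relative weight of QQ: a marker of P1 type (1) means no recombination.
    w_qq = ((1 - r_left) if x_left == 1 else r_left) * \
           ((1 - r_right) if x_right == 1 else r_right)
    # Relative weight of qq: a marker of P1 type (1) means a recombination.
    w_alt = (r_left if x_left == 1 else (1 - r_left)) * \
            (r_right if x_right == 1 else (1 - r_right))
    return w_qq / (w_qq + w_alt)


def lambda_rho(r_left, r_right):
    """Coefficients such that E(g | x_j, x_{j+1}) = lam * x_j + rho * x_{j+1}."""
    r = r_left * (1 - r_right) + (1 - r_left) * r_right  # marker-marker fraction
    same = (1 - r_left - r_right) / (1 - r)  # E(g | both markers are P1 type)
    diff = (r_right - r_left) / r            # E(g | left P1 type, right P2 type)
    return (same + diff) / 2, (same - diff) / 2


def marker_coefficient(rho_prev, a_prev, lam, a_cur):
    """b_jh = rho_{j-1} * a_{(j-1)h} + lambda_j * a_{jh}, as stated below Eq (3)."""
    return rho_prev * a_prev + lam * a_cur
```

When the QTL sits exactly on the left marker (`r_left = 0`), the sketch gives λ = 1 and ρ = 0, so the whole effect is carried by the left marker coefficient.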
Suppose a DH mapping population has observations on a quantitative trait of interest and genotyping information on m+1 ordered markers. The following linear regression model can be used in additive QEI mapping, i.e.,
$$y_{ih} = b_{0h} + \sum_{j=1}^{m+1} b_{jh}\, x_{ij} + \varepsilon_{ih} \tag{4}$$
where yih is the phenotypic value of the ith individual in the hth environment; b0h is the overall mean of the linear model in the hth environment; xij is the indicator variable for the jth marker genotype of the ith individual, which is equal to 1 or -1 for P1 type or P2 type, respectively; bjh is the partial regression coefficient of phenotype on the jth marker in the hth environment; and εih is the residual random error in the hth environment, which is assumed to be normally distributed. Stepwise regression can therefore be conducted on the phenotypic values in each environment to select significant markers, similar to additive mapping in a single environment.
One-dimensional scanning of QEI mapping
For a testing position in the kth marker interval, the phenotypic value of the ith individual in the hth environment was adjusted by
$$\Delta y_{ih} = y_{ih} - \sum_{j \neq k,\, k+1} \hat{b}_{jh}\, x_{ij} \tag{5}$$
where $\hat{b}_{jh}$ is the estimate of bjh for the significant markers selected by stepwise regression in model (4). The adjusted phenotypic value Δyih contains the QTL information in the current interval and does not change until the testing position moves to the next interval. Traditional interval mapping was conducted on the adjusted phenotypic values given by Eq (5).
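The adjustment in Eq (5) can be sketched for one environment as follows. This is a minimal illustration, not the QTL IciMapping implementation: the function name `adjust_phenotype`, the 0-based marker indexing and the dictionary of retained coefficients are assumptions made for the example, and markers dropped by stepwise regression are simply absent from the dictionary (coefficient zero).

```python
def adjust_phenotype(y, x, b_hat, k):
    """Adjusted phenotypes for a testing position in marker interval k.

    y     : phenotypic values in one environment
    x     : marker codes per individual (1 = P1 type, -1 = P2 type)
    b_hat : {marker index: estimated coefficient} for retained markers
    k     : 0-based index of the left flanking marker (right flank is k + 1)
    """
    flanking = {k, k + 1}
    adjusted = []
    for y_i, x_i in zip(y, x):
        # Remove background marker effects, keeping the two flanking markers.
        background = sum(b * x_i[j] for j, b in b_hat.items() if j not in flanking)
        adjusted.append(y_i - background)
    return adjusted


# Hypothetical data: two individuals, four markers, three retained markers.
y = [10.0, 8.0]
x = [[1, -1, 1, 1], [-1, -1, 1, -1]]
b_hat = {0: 0.5, 2: 1.0, 3: -0.2}
dy_interval_2 = adjust_phenotype(y, x, b_hat, k=2)  # flanked by markers 2 and 3
```

Because the flanking markers are excluded from the adjustment, the adjusted values stay the same for every testing position inside one interval, as noted above.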
For a testing position in an interval, individuals of a DH population can be classified into four groups based on the types of the two flanking markers. If there is one QTL (with the two alleles denoted as Q and q) at the current testing position, each marker group contains both QTL genotypes QQ and qq, and hence follows a mixture distribution of two components $N(\mu_{1h}, \sigma_h^2)$ and $N(\mu_{2h}, \sigma_h^2)$. In each marker group, frequencies of the two QTL genotypes depend on the recombination fractions between the putative QTL and the two flanking markers, and they are different for the four marker groups. Existence of a QTL at the current scanning position can be tested by the following hypotheses:
$$H_0: \bar{a} = 0 \text{ and } a_{eh} = 0 \text{ for all } h; \qquad H_1: \bar{a} \neq 0 \text{ or } a_{eh} \neq 0 \text{ for at least one } h; \qquad H_2: \bar{a} = 0 \text{ and } a_{eh} \neq 0 \text{ for at least one } h$$
where $\bar{a}$ is the average effect of the putative QTL across the environments, i.e., $\bar{a} = \frac{1}{e}\sum_{h=1}^{e} a_h$ with $a_h$ the additive effect in the hth environment, and $a_{eh} = a_h - \bar{a}$ is the QEI effect in the hth environment. In H1, $\bar{a} \neq 0$, or $a_{eh} \neq 0$ for at least one environment. Therefore, the likelihood ratio of hypotheses H1 versus H0 can be used to test all effects, i.e. both the average effect and QEI effects. In H2, $\bar{a} = 0$, but $a_{eh} \neq 0$ for at least one environment. Therefore, H2 is nested in H1 by adding the condition $\bar{a} = 0$. The size of the average effect makes the difference between H1 and H2, so the likelihood ratio of hypotheses H1 versus H2 can test the significance of the average effect. The difference between the likelihood functions of H2 and H0 comes from the QEI effects, as both of them have the restriction $\bar{a} = 0$. Therefore, the likelihood ratio of hypotheses H2 versus H0 can test the significance of QEI effects.
The log-likelihood function under the alternative hypothesis H1 is,
$$L_1 = \sum_{h=1}^{e} \sum_{l=1}^{4} \sum_{i \in S_l} \log\left[ \pi_{l1}\, f(\Delta y_{ih}; \mu_{1h}, \sigma_h^2) + \pi_{l2}\, f(\Delta y_{ih}; \mu_{2h}, \sigma_h^2) \right] \tag{6}$$
where Sl (l = 1, 2, 3, and 4) denotes the lth marker type group; πl1 and πl2 are the proportions of the two QTL genotypes QQ and qq in the lth group, respectively; and $f(\cdot\,; \mu_{kh}, \sigma_h^2)$ represents the density of the kth normal distribution in the hth environment.
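The expectation and conditional maximization (ECM) algorithm was used to estimate the two means and one variance in the hth environment in Eq (6). Their initial values can be determined by: where n1:3 is the summation of n1 to n3. In the E-step, the posterior probability of the ith individual (i = 1, …, nh; h = 1, …, e) belonging to the kth QTL genotype (k = 1, 2) was calculated as,
$$w_{ik} = \frac{\pi_{lk}\, f(\Delta y_{ih}; \mu_{kh}, \sigma_h^2)}{\pi_{l1}\, f(\Delta y_{ih}; \mu_{1h}, \sigma_h^2) + \pi_{l2}\, f(\Delta y_{ih}; \mu_{2h}, \sigma_h^2)}$$
where l denotes the marker group into which the ith individual was classified. In the M-step, the three groups of parameters were updated as,
$$\mu_{kh} = \frac{\sum_{i=1}^{n_h} w_{ik}\, \Delta y_{ih}}{\sum_{i=1}^{n_h} w_{ik}} \ (k = 1, 2), \qquad \sigma_h^2 = \frac{1}{n_h} \sum_{i=1}^{n_h} \sum_{k=1}^{2} w_{ik}\, (\Delta y_{ih} - \mu_{kh})^2$$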
The algorithm continued until the difference in the likelihood between two consecutive iterations fell below a pre-assigned precision, say $10^{-6}$. The ML estimates thus obtained were represented by $\hat{\mu}_{1h}$, $\hat{\mu}_{2h}$ and $\hat{\sigma}_h^2$. Then the additive effect in the hth environment was calculated as $\hat{a}_h = (\hat{\mu}_{1h} - \hat{\mu}_{2h})/2$.
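The estimation step under H1 can be sketched for a single environment as a two-component normal mixture with known mixing proportions per marker group. This is a minimal sketch under stated assumptions, not the authors' code: it uses a plain EM iteration, a DH population without interference, crude starting values chosen for the example, and hypothetical helper names (`dh_group_probs`, `simulate_dh`, `em_two_means`).

```python
import math
import random

GROUPS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # flanking marker types


def dh_group_probs(r_left, r_right):
    """P(QQ | marker group) for the four groups (DH, no interference)."""
    probs = []
    for xl, xr in GROUPS:
        w_qq = ((1 - r_left) if xl == 1 else r_left) * ((1 - r_right) if xr == 1 else r_right)
        w_alt = (r_left if xl == 1 else 1 - r_left) * (r_right if xr == 1 else 1 - r_right)
        probs.append(w_qq / (w_qq + w_alt))
    return probs


def simulate_dh(n, a, r_left, r_right, rng):
    """Simulate adjusted phenotypes and marker groups for one QTL (error sd = 1)."""
    dy, group = [], []
    for _ in range(n):
        g = rng.choice((1, -1))
        xl = g if rng.random() > r_left else -g
        xr = g if rng.random() > r_right else -g
        dy.append(a * g + rng.gauss(0.0, 1.0))
        group.append(GROUPS.index((xl, xr)))
    return dy, group


def normal_pdf(y, mu, var):
    return math.exp(-(y - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_means(dy, group, pi_qq, tol=1e-6, max_iter=1000):
    """Return (mu_QQ, mu_qq, variance, log-likelihood trace) for one environment."""
    n = len(dy)
    mean = sum(dy) / n
    var = sum((v - mean) ** 2 for v in dy) / n
    half_sd = 0.5 * math.sqrt(var)
    mu1, mu2 = mean + half_sd, mean - half_sd  # crude starting values (assumption)
    trace = []
    for _ in range(max_iter):
        # E-step: posterior probability that each individual is QQ.
        w, loglik = [], 0.0
        for v, l in zip(dy, group):
            p1 = pi_qq[l] * normal_pdf(v, mu1, var)
            p2 = (1 - pi_qq[l]) * normal_pdf(v, mu2, var)
            loglik += math.log(p1 + p2)
            w.append(p1 / (p1 + p2))
        trace.append(loglik)
        if len(trace) > 1 and abs(trace[-1] - trace[-2]) < tol:
            break
        # M-step: weighted means and a common variance.
        s1 = sum(w)
        mu1 = sum(wi * v for wi, v in zip(w, dy)) / s1
        mu2 = sum((1 - wi) * v for wi, v in zip(w, dy)) / (n - s1)
        var = sum(wi * (v - mu1) ** 2 + (1 - wi) * (v - mu2) ** 2
                  for wi, v in zip(w, dy)) / n
    return mu1, mu2, var, trace
```

The additive effect in the environment then follows as `(mu1 - mu2) / 2`. Fitting H2 would additionally require the constraint that the average effect across environments is zero, which this sketch does not cover.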
Let L0, L1 and L2 denote the maximized log-likelihoods under H0, H1 and H2, respectively. The LOD score (denoted by LODA) calculated from L1-L2 indicates whether there is a significant average effect at the testing position. The LOD score (denoted by LODAE) calculated from L2-L0 indicates whether there are significant QEI effects. The sum of LODA and LODAE (denoted by LOD) gives the overall test statistic indicating the significance of both the average effect and QEI effects.
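The relationship among the three LOD scores can be written as a short helper. The sketch assumes the three log-likelihoods are natural logarithms, so each difference is divided by ln 10; the function name `lod_scores` and the example values are hypothetical.

```python
import math


def lod_scores(loglik_h0, loglik_h1, loglik_h2):
    """Return (LOD, LODA, LODAE) from natural-log likelihoods under H0, H1, H2."""
    ln10 = math.log(10.0)
    lod_a = (loglik_h1 - loglik_h2) / ln10   # average effect: H1 versus H2
    lod_ae = (loglik_h2 - loglik_h0) / ln10  # QEI effects: H2 versus H0
    return lod_a + lod_ae, lod_a, lod_ae


# Hypothetical log-likelihoods at one testing position.
lod, lod_a, lod_ae = lod_scores(-520.0, -500.0, -510.0)
```

By construction the overall LOD equals the score for H1 versus H0, since the H2 term cancels in the sum.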
Phenotypic variation explained (PVE) by the identified QTL
In the QTL IciMapping software, PVE of additive and QEI effects at each scanning position was estimated from the posterior probabilities $w_{ik}$ and the two QTL genotypic means $\hat{\mu}_{kh}$, which had been estimated by the EM algorithm. For illustration, we assume there is one QTL at the current scanning position, and the expected frequency of the kth QTL genotype (QQ or qq) in the hth environment is $f_{kh}$ (k = 1 and 2, and h = 1,…, e). The marginal frequency of a QTL genotype is defined as the sum of its frequencies over all environments, i.e., $f_{k\cdot} = \sum_{h=1}^{e} f_{kh}$, which can be estimated from the posterior probabilities $w_{ik}$. QTL genotype is independent of environment, therefore $f_{kh}$ can be estimated by $\hat{f}_{kh} = \hat{f}_{k\cdot}/e$. Genetic variance and QEI variance can be calculated from Table 1, a two-way table of two QTL genotypic means and a set of environments.
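Each genotypic mean in the two-way table can be decomposed as
$$\mu_{kh} - \mu_{\cdot\cdot} = (\mu_{k\cdot} - \mu_{\cdot\cdot}) + (\mu_{\cdot h} - \mu_{\cdot\cdot}) + (\mu_{kh} - \mu_{k\cdot} - \mu_{\cdot h} + \mu_{\cdot\cdot})$$
where a dot replaces the index over which the frequency-weighted mean is taken. The three components stand for QTL average effect, environmental effect and QEI effect, respectively. It can be proved that the decomposition is orthogonal. In a DH population, genetic variance is equal to the additive variance, and the additive and QEI variances can be calculated as,
$$V_A = \sum_{k} f_{k\cdot}\, (\mu_{k\cdot} - \mu_{\cdot\cdot})^2, \qquad V_{AE} = \sum_{k} \sum_{h} f_{kh}\, (\mu_{kh} - \mu_{k\cdot} - \mu_{\cdot h} + \mu_{\cdot\cdot})^2$$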
Then, PVEA and PVEAE can be calculated as $V_A/V_P$ and $V_{AE}/V_P$, respectively, where $V_A$ and $V_{AE}$ are the additive and QEI variances and $V_P$ is the average of the phenotypic variances in the e environments. Take a DH population and four environments as an example. Assume the expected marginal frequencies of QTL genotypes QQ and qq are 0.4 and 0.6, so that the frequencies of the two genotypes in each environment are 0.1 and 0.15, respectively. Phenotypic variances in the four environments were set at 30, 20, 10 and 40, respectively, and the values of $\hat{\mu}_{kh}$ were given in Table 2. PVEA and PVEAE can be calculated as follows,
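A two-way variance decomposition of this kind can be sketched as below. The genotype frequencies (0.4 and 0.6) and phenotypic variances (30, 20, 10 and 40) are taken from the example above, but the genotypic means are hypothetical placeholders, because the values of Table 2 are not reproduced here. The function name `variance_components` and the use of cell frequencies equal to the marginal frequency divided by e are assumptions of this sketch.

```python
def variance_components(mu, f_marginal):
    """Split the variance of QTL genotypic means into (V_A, V_E, V_AE).

    mu         : mu[k][h], mean of genotype k in environment h
    f_marginal : marginal genotype frequencies (sum to 1); each cell
                 frequency is taken as f_marginal[k] / e
    """
    n_geno, e = len(mu), len(mu[0])
    geno_mean = [sum(mu[k]) / e for k in range(n_geno)]
    env_mean = [sum(f_marginal[k] * mu[k][h] for k in range(n_geno)) for h in range(e)]
    grand = sum(f_marginal[k] * geno_mean[k] for k in range(n_geno))
    v_a = sum(f_marginal[k] * (geno_mean[k] - grand) ** 2 for k in range(n_geno))
    v_e = sum((m - grand) ** 2 for m in env_mean) / e
    v_ae = sum(f_marginal[k] / e * (mu[k][h] - geno_mean[k] - env_mean[h] + grand) ** 2
               for k in range(n_geno) for h in range(e))
    return v_a, v_e, v_ae


# Hypothetical genotypic means (rows: QQ, qq; columns: four environments).
mu = [[11.0, 9.0, 10.5, 10.0],
      [9.0, 9.5, 9.5, 9.0]]
f_marginal = [0.4, 0.6]
v_p = (30 + 20 + 10 + 40) / 4  # average phenotypic variance from the example
v_a, v_e, v_ae = variance_components(mu, f_marginal)
pve_a, pve_ae = 100 * v_a / v_p, 100 * v_ae / v_p
```

With product-form cell frequencies the three components add up to the total variance of the genotypic means, which is the orthogonality property mentioned in the text.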
Empirical formula of LOD threshold in QEI mapping
In single-environment analysis, when the null hypothesis is true, the likelihood ratio test (LRT) statistic at each scanning position follows the χ2 distribution with the degree of freedom (df) equal to the number of genetic parameters to be estimated in the genetic population [32,35]. Sun et al. found that the number of independent tests (denoted as Meff) was proportional to the genome length in one-dimensional QTL scanning, and the proportion varied with marker density and population type. So Meff can be estimated as the product of the proportionality coefficient and the genome length. Let αg be the genome-wide type-I error; the type-I error at each testing position should then be $\alpha_p = \alpha_g / M_{\mathrm{eff}}$ based on the Bonferroni correction. Therefore, the empirical LOD threshold can be determined by the formula $\mathrm{LOD} = \chi^2_{\alpha_p}(\lambda) / (2 \ln 10)$, where $\chi^2_{\alpha_p}(\lambda)$ is the inverse χ2 distribution that returns the critical value of a right-tailed probability αp for the degree of freedom λ. In QTL mapping, λ is equal to the number of genetic parameters to be estimated.
This formula can also be used to determine the LOD threshold in QEI mapping by considering the difference in degree of freedom. In QEI mapping, each QTL genotype has its own distribution in each environment, and the number of independent genetic parameters to be estimated is the sum of parameters in each environment. In other words, λ is equal to e for BC1 (or DH and RIL) population and 2e for F2 population. For validation, LOD threshold was determined in simulated BC1 and F2 populations under null genetic model (S1 and S2 Files). The genomic information and mapping parameters were the same as unlinked and linked QTL models (to be described in the next section), except that there was no QTL located on the genome. LOD thresholds for αg = 0.05 and αg = 0.01 were estimated at the 95th and 99th percentiles of the 1000 maximum LOD scores out of 1000 runs.
Putative genetic models in simulation studies
QTL IciMapping is integrated software for linkage map construction and QTL detection. QEI mapping has been implemented in version 4.0 of the software as the MET functionality . In this study, unlinked and linked QTL models were both considered to evaluate the efficiency of QEI mapping. The genome consisted of six chromosomes, each of 150 cM in length with 16 evenly distributed markers. Two environments were considered with equal heritability in the broad sense in both models. In the unlinked QTL model, five QTL were located on five chromosomes, and the broad sense heritability was 0.5 for both environments. QTL additive effects in the two environments were given in Table 3, representing three QEI levels, i.e., strong interaction (Q2), environment-specific interaction (Q3 and Q4) and no interaction (Q1 and Q5).
Eight QTL effect scenarios were considered for two linked QTL (Table 4), i.e., Q1 and Q2, located at 25 and 55 cM on chromosome 1. These scenarios represented different QEI levels and linkage phases (coupling or repulsion phases). For example, Q1 and Q2 had strong QEI, and they were linked in the coupling phase in model L3, and in the repulsion phase in model L7. Three levels of heritability were considered, i.e., H2 = 0.1, H2 = 0.5 and H2 = 0.8.
One thousand DH populations, each of a size of 200, were generated for unlinked model (S3 File) and for each effect scenario of the two linked QTL under each heritability level (S4–S6 Files). The LOD threshold was set at 3.11 by empirical formula to ensure the genome-wide Type-I error rate (αg) to be less than 0.05. The scanning step was set at 1 cM. The two probabilities for entering and removing variables in stepwise regression were set at 0.001 and 0.002.
Detection power and FDR were used to evaluate the efficiency of QEI mapping. Each predefined QTL was assigned to a support interval of 10 cM centered at the predefined location. Power of each QTL was calculated as the percentage of the simulation runs having significant peaks higher than the LOD threshold in its support interval. QTL identified but out of this interval were treated as false positives. FDR was calculated as the ratio of the number of false positives to the total number of significant discovery [26,38]. For each genetic model, estimated positions and effects were calculated as the average values of all detected QTL.
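The power and FDR definitions above can be sketched as follows. This is one possible bookkeeping under stated assumptions, not the procedure used by the software: the function name `power_and_fdr` and the input layout are hypothetical, every significant peak is counted as one discovery, and a peak counts as a true positive when it lies within 5 cM of a predefined QTL on the same chromosome (a 10 cM support interval).

```python
def power_and_fdr(runs, true_qtl, half_width=5.0):
    """Return (power per QTL in %, FDR in %) over simulation runs.

    runs     : one list per run of (chromosome, position_cM) for peaks
               above the LOD threshold
    true_qtl : list of (chromosome, position_cM) of the predefined QTL
    """
    hits = [0] * len(true_qtl)
    false_positives = 0
    discoveries = 0
    for peaks in runs:
        found = set()
        for chrom, pos in peaks:
            discoveries += 1
            matched = [q for q, (c, p) in enumerate(true_qtl)
                       if c == chrom and abs(pos - p) <= half_width]
            if matched:
                found.update(matched)
            else:
                false_positives += 1  # peak outside every support interval
        for q in found:
            hits[q] += 1              # a QTL is counted once per run
    power = [100.0 * h / len(runs) for h in hits]
    fdr = 100.0 * false_positives / discoveries if discoveries else 0.0
    return power, fdr


# Hypothetical outcome of four runs with two predefined QTL.
true_qtl = [(1, 16.0), (2, 3.0)]
runs = [[(1, 15.0), (2, 4.0)], [(1, 30.0)], [(1, 17.5), (3, 50.0)], []]
power, fdr = power_and_fdr(runs, true_qtl)
```

In this toy input the first QTL is found in two of four runs, the second in one, and two of the five peaks fall outside all support intervals.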
One RIL population in maize
The actual maize population used in this study was derived from a cross between inbred parents CML444 and SC-Malawi, consisting of 236 RILs. A subset of 160 markers with an average missing rate of 5.12% was used to build the linkage map. Total genome length was 2105.6 cM by the Haldane mapping function, and the average marker density was 13.16 cM. Days of male flowering (MFLW) were investigated in seven environments, i.e., water-stress conditions in Mexico (WSM, in 2003 and 2004) and Zimbabwe (WSZ, in 2003 and 2004), and well-watered conditions in Mexico (WWM, in 2003 and 2004) and Zimbabwe (WWZ, in 2004), which were named WSM1, WSM2, WSZ1, WSZ2, WWM1, WWM2, and WWZ2, respectively. The average MFLW (d) in environments WSM, WSZ, WWM and WWZ were 104.2, 121.1, 65.4 and 75.6 for parental line CML444, 101.1, 114.1, 64.5 and 74.9 for parental line SC-Malawi, and 101.1, 117.3, 64.1 and 75.5 for the RIL population. Single-environment analysis and QEI mapping were conducted with QTL IciMapping 4.0 (S7 File). LOD thresholds were set at 3.0 and 5.67 for single-environment analysis and QEI mapping, respectively, the same as in Messmer et al. To compare with the empirical LOD threshold values, 1000 permutation tests were conducted on the maize population.
Empirical LOD threshold in QEI mapping
Fig 1 shows the LOD thresholds based on the empirical formula and the simulation method. The LOD threshold increased with the number of environments, and the two methods gave similar LOD thresholds at the same significance level for both BC1 and F2 populations. Thus, the empirical formula developed for individual environments is also suitable for QEI mapping in multiple environments, provided the change in degree of freedom is taken into account.
(A) BC1 population and αg = 0.05. (B) BC1 population and αg = 0.01. (C) F2 population and αg = 0.05. (D) F2 population and αg = 0.01.
For a RIL population, the accumulated recombination frequency (represented by R) is much larger than the one-meiosis recombination frequency (represented by r) because of continuous self-pollination. Their relationship is well known as $R = 2r/(1+2r)$, indicating that R is approximately twice r when r is small. The larger recombination frequency estimated in a RIL population is equivalent to a longer genome. When the map distance is 1 cM in a DH population, r and R are equal to 0.0099 and 0.0194 under the Haldane mapping function. The corresponding map distance is 1.98 cM for the accumulated recombination frequency in a RIL population. Therefore 1 cM in a DH population is expanded by 1.98 times in a RIL population. Similarly, 5 cM and 10 cM are expanded by 1.91 and 1.83 times. In most genetic populations, marker density is around 5 to 10 cM, and the genome size is expanded by about 1.9 times. Therefore, the genome length should be multiplied by 1.9 before applying the empirical LOD threshold formula to a RIL population.
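The expansion factors quoted above can be reproduced with a few lines. The sketch assumes the Haldane mapping function in both directions and the selfing relationship R = 2r/(1+2r); the function names are chosen for this example.

```python
import math


def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def haldane_cm(r):
    """Map distance in cM for a recombination fraction (inverse Haldane)."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def ril_expansion(d_cm):
    """Ratio of the apparent RIL map distance to the one-meiosis distance."""
    r = haldane_r(d_cm)
    big_r = 2.0 * r / (1.0 + 2.0 * r)  # accumulated recombination after selfing
    return haldane_cm(big_r) / d_cm


for d in (1.0, 5.0, 10.0):
    print(d, round(ril_expansion(d), 2))
```

The factor declines slowly as the interval widens, which is why a single value of about 1.9 is a reasonable summary for marker densities between 5 and 10 cM.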
When marker density is 10 cM, the number of independent tests is about 0.072 (αg = 0.05) or 0.084 (αg = 0.01) times the genome size. Empirical LOD thresholds for common genome lengths are given in Table 5, from which the appropriate threshold can be looked up and applied. For example, suppose one population is planted in four environments, the genome length is 1000 cM, and the average marker density is 10 cM. Then df is equal to 4 for BC1, DH or RIL populations, and 8 for an F2 population. According to Table 5, under αg = 0.05 the LOD threshold should be 4.19 for BC1 and DH populations, 5.87 for an F2 population, and around 4.50 for a RIL population (referring to a length of 1900 cM, which is between 1800 cM and 2000 cM). For the actual maize population, “a QTL was considered to be significant (comparison-wise Type-I error rate αc = 0.001, experiment-wise error rate αe = 0.02) when the LOD exceeded the appropriate threshold 5.67 (joint QTL, seven experiments)” in Messmer et al. LOD thresholds of 6.08 and 6.89 were obtained under αg = 0.05 and αg = 0.01 by 1000 permutation tests. According to the empirical formula, the LOD thresholds were 6.20 and 6.05 for marker densities of 10 and 20 cM under αg = 0.05, and 7.11 and 7.09 for marker densities of 10 and 20 cM under αg = 0.01, respectively. The marker density of this population was about 13.16 cM, so the empirical LOD threshold should be around 6.1 under αg = 0.05 and 7.1 under αg = 0.01, close to the values from permutation tests and from Messmer et al.
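The threshold values in this example can be reproduced with a short calculation. The sketch below assumes the threshold is the right-tail χ2 critical value at αg/Meff divided by 2 ln 10, with Meff equal to 0.072 times the genome length (10 cM marker density, αg = 0.05) and a factor of 1.9 applied to RIL genomes. To stay dependency-free it uses closed-form χ2 tail probabilities for integer df and a bisection search; the function names are chosen for this example.

```python
import math


def chi2_sf(x, df):
    """Right-tail probability of a chi-square variable with integer df >= 1."""
    t = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= t / i
            total += term
        return math.exp(-t) * total
    total = 0.0
    term = math.sqrt(t) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= t / (i + 0.5)
    return math.erfc(math.sqrt(t)) + math.exp(-t) * total


def chi2_isf(p, df):
    """Critical value whose right-tail probability is p (bisection search)."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def lod_threshold(genome_cm, df, alpha_g=0.05, coef=0.072, ril=False):
    """Empirical LOD threshold; coef is the Meff-per-cM proportion."""
    if ril:
        genome_cm *= 1.9            # genome expansion for RIL populations
    alpha_p = alpha_g / (coef * genome_cm)  # Bonferroni-type correction
    return chi2_isf(alpha_p, df) / (2.0 * math.log(10.0))


print(round(lod_threshold(1000, 4), 2))            # BC1 or DH, four environments
print(round(lod_threshold(1000, 8), 2))            # F2, four environments
print(round(lod_threshold(1000, 4, ril=True), 2))  # RIL, four environments
```

For other marker densities or significance levels the proportion `coef` has to be replaced by the corresponding value, for example 0.084 for αg = 0.01 at 10 cM density.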
Power analysis for the unlinked QTL model
Fig 2 displayed five clear peaks along the average LOD profiles at the five predefined QTL positions. LOD scores around the predefined QTL positions increased with PVE. For example, PVE of Q1 and Q3 were 12.5 and 6.25%, and LOD scores at the two predefined positions were 15.86 and 8.15, respectively. Q1, Q2 and Q5, having the same PVE = 12.5%, had almost the same LOD score. Likewise, LODA and LODAE increased with PVEA and PVEAE.
LOD, LODA and LODAE are LOD scores for detecting QTL with both average and QEI effects, QTL only with average effect, and QTL only with QEI effects, respectively.
Detection powers of the predefined QTL were higher than 80%, and FDR was 14.11% (Table 6). The larger PVE was, the higher detection power would be. Q1, Q2 and Q5 had the same PVE = 12.5%, and their detection powers were 91.2, 98.3 and 89.5% respectively; the detection powers were 85.4 and 80.7% for Q3 and Q4, respectively, having the same PVE of 6.25%.
When detection powers were calculated by marker intervals along the six chromosomes, it can be seen that most detected QTL were distributed around the marker intervals where the QTL were located (Fig 3). In other words, the ICIM QEI mapping was less likely to locate a QTL in chromosome regions far from the predefined QTL or in other chromosomes where no QTL were located. For example, Q1 was located at 16 cM on chromosome 1. Power at the marker interval where Q1 was located was 83.4%, and powers at its nearest left and right intervals were 5.1% and 14.0%, respectively. Powers at other intervals on chromosome 1 were close to 0.
Estimation of QTL positions and effects for the unlinked model
Estimated QTL positions and effects from the 1000 simulated populations given in Table 6 showed their unbiasedness. Estimated positions of Q1 to Q5 were 16.21, 2.83, 32.59, 26.16 and 34.92 cM, corresponding to the true values 16, 3, 33, 26 and 35 cM on chromosomes 1 to 5. Estimated average additive effects of Q1 to Q5 were 0.47, 0.00, 0.24, 0.24 and 0.46, corresponding to the true values 0.5, 0, 0.25, 0.25 and 0.5 respectively. Estimated QEI effects in E1 were 0.00, 0.47, -0.24, 0.24 and 0.00, corresponding to the true values 0, 0.5, -0.25, 0.25 and 0 respectively. Estimated QEI effects in E2 were 0.00, -0.47, 0.24, -0.24 and 0.00, corresponding to the true effects 0, -0.5, 0.25, -0.25 and 0 respectively. The standard errors of the estimated positions and effects across 1000 simulations ranged from 2.01 to 2.62 and from 0.05 to 0.08, respectively.
Power analysis for the linked QTL model
Fig 4 showed average LOD score profiles of the eight effect scenarios under three heritability levels. Clear peaks were observed around the predefined positions of the two linked QTL, especially for the higher heritabilities 0.5 and 0.8 (Fig 4B and 4C). The trend that higher PVE resulted in a larger LOD score was also observed in the linked QTL model. For example, PVE of Q1 and Q2 in models L3 and L6 were both 16.14% under H2 = 0.5, and similar LOD scores at peaks around the predefined positions were obtained, i.e., 20.84 and 20.60 in model L3, and 20.32 and 20.85 in model L6, respectively. PVE of Q1 and Q2 in model L2 were both 25% under H2 = 0.5, and the LOD scores at peaks around the QTL were 32.46 and 32.22, respectively, which were much higher than those in models L3 and L6 (Fig 4B).
(A) H2 = 0.1. (B) H2 = 0.5. (C) H2 = 0.8. LOD scores on other chromosomes were not shown because no QTL was defined there. LOD score was close to zero on chromosomes 2 to 6.
For the same QTL effect model, higher heritability results in larger PVE and consequently increases the LOD score. Compared with H2 = 0.1, LOD scores at QTL peaks under H2 = 0.5 were larger for all effect models. LOD scores at QTL peaks under H2 = 0.8 were the largest among the three heritability levels (Fig 4). For instance, LOD scores at peaks around Q1 and Q2 in model L8 were 5.20 and 1.67 for H2 = 0.1, 48.99 and 22.01 for H2 = 0.5, and 117.96 and 51.75 for H2 = 0.8. Linked QTL with larger PVE were also easier to separate. For example, for H2 = 0.1, PVE of Q1 and Q2 in model L1 were 4.88 and 2.44%, and only one peak appeared in the average LOD profile (Fig 4A). PVE of Q1 and Q2 in model L7 were both 11.08% for H2 = 0.1, and two clear peaks appeared around the two predefined QTL positions.
Fig 5 displayed the detection power and FDR for the two linked QTL in QEI mapping. Detection power increased with heritability and with the PVE of the QTL. For example, in model L1 under H2 = 0.1, 0.5 and 0.8, PVE of Q1 were 4.88, 24.40 and 39.05%, and PVE of Q2 were 2.44, 12.20 and 19.52%, respectively. Their detection powers were 60.4 and 19.4% for H2 = 0.1, 98.1 and 81.3% for H2 = 0.5, and 100 and 98.2% for H2 = 0.8. FDR decreased as heritability increased. Taking model L1 as an example, FDR were 41.7, 14.7 and 8.8% for H2 = 0.1, 0.5 and 0.8, respectively. Powers of all QTL in all effect models increased to about 80% for H2 = 0.5 and nearly 100% for H2 = 0.8. Meanwhile, FDR of all effect models was less than 22% for H2 = 0.5 and less than 16% for H2 = 0.8.
Power (A) of each QTL was calculated as the proportion of runs where the QTL was identified in a 10 cM support interval. FDR (B), false discovery rate, was calculated as the proportion of false positive QTL to total QTL detected for each model and each heritability level.
When detection powers were calculated by marker intervals along chromosome 1, most detected QTL were distributed around the marker intervals where the two QTL were located especially for H2 = 0.5 and H2 = 0.8 (Fig 6). For example, Q1 and Q2 in model L3 were located at 25 and 55 cM on chromosome 1. Powers at their two marker intervals were 96.4 and 96.9%, respectively for H2 = 0.5. Powers were 3.8 and 3.7% at the nearest left and right intervals of Q1, and 4.4 and 4.1% at the nearest left and right intervals of Q2. Powers were rather low at other intervals. It could be found that even for the linked QTL, ICIM QEI mapping was less likely to locate a QTL in chromosome regions far away from the predefined QTL or in other chromosomes where no QTL were located.
(A) H2 = 0.1. (B) H2 = 0.5. (C) H2 = 0.8. Power was calculated as the proportion of runs where a QTL in the interval was detected. There were 15 marker intervals defined by the 16 markers evenly distributed on chromosome 1. Power was close to zero on chromosomes 2 to 6, where no QTL was predefined.
QEI mapping in the maize RIL population
Through stepwise regression, 4, 5, 2, 3, 4, 2 and 4 markers were selected for environments WSM1, WSM2, WSZ1, WSZ2, WWM1, WWM2 and WWZ2 respectively. In QEI mapping, profiles of LOD, LODA and LODAE along the maize genome were shown in Fig 7A. Under the LOD threshold 5.67, a total of 13 QTL affecting MFLW were identified across the seven environments: one each on chromosomes 8 and 10, two each on chromosomes 2, 3, 4, and 6, and three on chromosome 1 (Table 7). Although QEI was observed in some chromosomal regions, e.g., chromosomes 2, 3 and 10 (Fig 7A), most identified QTL were relatively stable with large LODA and small LODAE (Table 7). Five QTL had positive average effects (Table 7). qMFLW-2-2 had the highest LOD = 24.45, LODA = 18.45 and LODAE = 5.99, and had the largest average and QEI effects as well. Average additive effect was -0.43, indicating the allele from CML444 at this locus would reduce MFLW by 0.43 days on the basis of population mean. qMFLW-1-1 was relatively stable, whose LOD = 5.76, LODA = 5.07 and LODAE = 0.70. The allele of CML444 at this locus would delay MFLW by 0.35 days. qMFLW-2-1 had strong QEI, whose LOD = 12.82, LODA = 0.31 and LODAE = 12.50.
The dash line denotes the LOD threshold of 5.67 in QEI mapping.
LOD profiles along the maize genome by single-environment analysis were shown in Fig 7B. Under the LOD threshold 3.0, a total of 19 QTL were identified, four of which were coincident in more than two environments. Respectively, 2, 5, 2, 3, 4, 1 and 2 QTL were identified in environments WSM1, WSM2, WSZ1, WSZ2, WWM1, WWM2 and WWZ2 (Table 8). Ten QTL detected by QEI mapping, i.e., qMFLW-1-2, qMFLW-2-1, qMFLW-2-2, qMFLW-3-1, qMFLW-3-2, qMFLW-4-1, qMFLW-4-2, qMFLW-6-1, qMFLW-6-2 and qMFLW-10, were also detected by single-environment analysis, while other QTL were detected only by single-environment analysis (Tables 7 and 8). Take a common QTL around 220 cM on chromosome 1 as an example, the estimated positions were 217 and 220 cM by single-environment analysis in WSM1 and WSZ1 respectively, which was close to qMFLW-1-2 detected by QEI mapping. Compared with single-environment analysis, QEI mapping has the following properties.
- Both QTL stability and QEI effect can be analyzed. For example, the major effect of qMFLW-1-1 was -0.22, and its additive by environment effects in the seven environments were 0.16, -0.04, -0.10, 0.00, -0.04, -0.09 and 0.11, respectively. This QTL can be considered stable, as the absolute value of its major effect was larger than that of any interaction effect. In contrast, the major effect of qMFLW-2-1 was -0.06, but its additive by environment effects in the seven environments were -0.28, 0.68, -0.26, 0.27, -0.19, -0.12 and -0.09, respectively. The much larger interaction effects indicate a higher level of QEI and lower stability.
- Most QTL identified by single-environment analysis can also be detected by QEI mapping, especially QTL detected in more than one environment. QEI mapping can also detect some QTL that are not identified by single-environment analysis. In this study, qMFLW-1-1, qMFLW-1-3 and qMFLW-8 were detected only by QEI mapping (Tables 7 and 8). Small peaks with LOD scores below the threshold of 3.0 were observed around these QTL positions on the LOD profiles of several environments. It is therefore not surprising that they were detected by QEI mapping (Fig 7). For example, qMFLW-8 identified by QEI mapping was located at 133 cM on chromosome 8. Peaks of LOD scores were also observed around this position in environments WSM1 and WSZ2 by single-environment analysis.
- For some common QTL, the positions estimated by single-environment analysis fluctuated around the positions estimated by QEI mapping. For example, qMFLW-3-1 was located at 55 cM on chromosome 3 by QEI mapping, but at 56, 54 and 53 cM on chromosome 3 in WSM2, WSZ1 and WSZ2, respectively, by single-environment analysis (Tables 7 and 8). QEI mapping uses the multi-environment phenotypic data simultaneously and may therefore give more precise and reliable estimates of QTL position.
- The estimated effects by QEI mapping and single-environment analysis were similar, although minor differences were observed. Taking qMFLW-4-2 as an example, the estimated effects were 0.50 and 0.61 in WSM2 and WWM1 by single-environment analysis, corresponding to 0.45 and 0.54 estimated by QEI mapping (Tables 7 and 8).
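The relationship between the major effect, the additive by environment effects and the effect in each environment can be illustrated with a minimal Python sketch. It assumes that the effect in an environment equals the major effect plus that environment's interaction effect, which is consistent with the interaction effects reported above summing to approximately zero. The function names and the stability rule in `is_stable` are hypothetical illustrations, not part of the QTL IciMapping software or a criterion defined in this study.

```python
from statistics import fmean


def decompose_effects(env_effects):
    """Return (average effect, per-environment interaction effects).

    The interaction effects are deviations from the average, so they sum to
    zero by construction.
    """
    average = fmean(env_effects)
    interactions = [effect - average for effect in env_effects]
    return average, interactions


def is_stable(average, interactions):
    """Hypothetical rule: stable if |average| exceeds every |interaction|."""
    return abs(average) > max(abs(ae) for ae in interactions)


# Reported major and additive by environment effects (seven environments).
reported = {
    "qMFLW-1-1": (-0.22, [0.16, -0.04, -0.10, 0.00, -0.04, -0.09, 0.11]),
    "qMFLW-2-1": (-0.06, [-0.28, 0.68, -0.26, 0.27, -0.19, -0.12, -0.09]),
}

for name, (average, interactions) in reported.items():
    # Per-environment effects implied by the assumed additive decomposition.
    env_effects = [average + ae for ae in interactions]
    recovered_average, _ = decompose_effects(env_effects)
    print(name, round(recovered_average, 2), is_stable(average, interactions))
```

Under this illustrative rule, qMFLW-1-1 is flagged as stable and qMFLW-2-1 is not, in line with the interpretation given in the first property above.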
Compared with the mapping results in Messmer et al. [39], the ICIM QEI mapping detected more QTL. The four QTL reported by the joint analysis of seven environments in Messmer et al. [39] were all identified in this study, i.e., qMFLW-1-2, qMFLW-3-1, qMFLW-4-2 and qMFLW-6-1. In addition, nine more QTL were detected by QEI mapping in this study, many of which were also detected in single-environment analysis. For example, the LOD score of qMFLW-2-2 was 24.45 by QEI mapping. It was identified in environments WSM2 and WSZ2 with LOD scores of 13.17 and 3.46 by single-environment analysis. In addition, peaks below the LOD threshold were present on the single-environment LOD profiles of WSZ1, WWM2 and WWZ2. Similar patterns were observed for qMFLW-2-1, qMFLW-4-1 and qMFLW-10. The single-environment analysis therefore supports the validity of these QTL.
QEI can be investigated when genetic populations are planted in multiple locations and/or years. The QEI information thus obtained is of great value to breeders and genetic researchers. Based on the QTL mapping results, breeders can design ideal genotypes that combine favorable alleles and perform marker-assisted selection more efficiently. Stable QTL for agronomic traits are useful across a wide range of environments, while environment-specific QTL can be used within specific target environments. However, the detection of QEI is not easy.
“Results of separate analysis by environment is hard to interpret, and cannot take advantage of built-in replication provided by multiple environments”, as pointed out by Tinker and Mather [8]. Single-environment analysis is subject to the errors of the individual environments, which probably results in different estimated positions and effects for the same QTL. It is therefore inconvenient to evaluate QTL stability and QEI effects by directly comparing the effects estimated by single-environment analysis. Some studies conduct QTL mapping using the mean phenotypic value across multiple environments or the genotypic value predicted by best linear unbiased prediction [41,42], but this approach can only detect QTL with significant major effects. When dealing with multi-environment phenotypic data, the positions and effects estimated by QEI mapping are more reliable than those from single-environment analysis, as the data across all environments are used simultaneously.
The two-step strategy used in ICIM simplifies the mapping procedure by separating the cofactor selection from interval mapping. This study demonstrated that the superiority of ICIM is maintained when it is extended to QEI mapping. Stepwise regression was conducted only once, and the phenotype was then adjusted on that basis during the interval mapping. This strategy avoids the repeated interval mapping in Boer et al. [1], and requires much less computing time. Another feature distinguishing ICIM from other methods is that the major effect and QEI effects of a QTL are estimated from the genotypic values of the two QTL genotypes, QQ and qq, across multiple environments through orthogonal decomposition. QTL stability and QEI level can be directly evaluated from the mapping results, including three LODs, three PVEs, major effects and QEI effects.
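As an illustration of how the three LOD scores can be read together, the following Python sketch summarizes the share of the total LOD attributable to QEI for three QTL reported in this study. It assumes that LOD is approximately the sum of LODA and LODAE, which the reported values satisfy up to rounding. The function `qei_share` and the ratio it returns are hypothetical summaries introduced here for illustration, not statistics defined in the study.

```python
def qei_share(lod_a, lod_ae):
    """Fraction of the combined LOD (LODA + LODAE) attributable to LODAE."""
    total = lod_a + lod_ae
    if total <= 0:
        raise ValueError("LODA + LODAE must be positive")
    return lod_ae / total


# (LOD, LODA, LODAE) as reported for three QTL in the maize RIL population.
reported = {
    "qMFLW-1-1": (5.76, 5.07, 0.70),
    "qMFLW-2-1": (12.82, 0.31, 12.50),
    "qMFLW-2-2": (24.45, 18.45, 5.99),
}

shares = {}
rounding_gaps = {}
for name, (lod, lod_a, lod_ae) in reported.items():
    # Gap between the reported LOD and LODA + LODAE (expected to be rounding).
    rounding_gaps[name] = abs(lod - (lod_a + lod_ae))
    shares[name] = qei_share(lod_a, lod_ae)
    print(f"{name}: QEI share of LOD = {shares[name]:.2f}")
```

A share close to zero corresponds to a QTL whose signal comes mostly from its major effect, such as qMFLW-1-1, while a share close to one corresponds to a QTL dominated by QEI, such as qMFLW-2-1.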
Using algorithms similar to those described in this study, ICIM has also been extended to dominant QEI mapping and epistatic QEI mapping. In epistatic QEI mapping, the first step is to use stepwise regression to select significant markers and significant marker pairs in each environment. The second step is to apply two-dimensional interval mapping to the adjusted phenotypic values. Both the major epistatic effect and the epistasis by environment interaction effects can be estimated. QEI mapping of additive, additive and dominant, and epistatic effects in most biparental populations has been implemented in the QTL IciMapping software [37].
The LOD threshold is used to control false positives in QTL mapping. The choice of a suitable threshold is an important issue, as it determines the number of identified QTL and controls the genome-wide error rate. For convenience, we used an empirical formula to select the LOD threshold in this study. The LOD threshold to define a stable QTL can also be obtained from the empirical formula. For LODA, df is equal to 1 for DH, BC1 and RIL populations, and 2 for F2 populations, when one-dimensional scanning is conducted. Similarly, for LODAE, df is equal to e-1 for DH, BC1 and RIL populations, and 2(e-1) for F2 populations, to obtain the LOD threshold for significant QEI.
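The degrees of freedom stated above can be tabulated with a short Python sketch. It encodes only the df rules given in this paragraph and does not reproduce the empirical threshold formula. The helper names are hypothetical. The function `lod_to_lrt` uses the standard relationship between a base-10 LOD score and the likelihood ratio statistic, 2 ln(10) × LOD, and is included only to show the scale on which df is usually interpreted; whether the empirical formula operates on that scale is not assumed here.

```python
import math

# Populations with two genotypes per locus (one additive parameter).
ONE_PARAMETER_POPULATIONS = {"DH", "BC1", "RIL"}


def degrees_of_freedom(population, n_env):
    """Return (df for LODA, df for LODAE) in one-dimensional scanning."""
    if n_env < 1:
        raise ValueError("n_env must be at least 1")
    if population in ONE_PARAMETER_POPULATIONS:
        per_locus = 1  # additive effect only
    elif population == "F2":
        per_locus = 2  # additive and dominance effects
    else:
        raise ValueError(f"unsupported population type: {population}")
    df_a = per_locus
    df_ae = per_locus * (n_env - 1)
    return df_a, df_ae


def lod_to_lrt(lod):
    """Convert a base-10 LOD score to the likelihood ratio statistic."""
    return 2.0 * math.log(10.0) * lod


# Maize RIL population evaluated in seven environments.
print(degrees_of_freedom("RIL", 7))
print(degrees_of_freedom("F2", 7))
```

For the maize RIL population in seven environments, this gives df = 1 for LODA and df = 6 for LODAE; an F2 population in the same number of environments would give 2 and 12.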
The LOD threshold in QEI mapping is much higher than that in single-environment analysis, due to the increased degrees of freedom. Some QTL detected in a single environment may produce peaks lower than the threshold, and will not be reported in QEI mapping. However, QTL detected in more than one environment and QTL with higher LOD scores in single-environment analysis are more likely to be detected in QEI mapping. In the maize RIL population, most QTL detected under the WS condition had higher LOD scores than those detected under the WW condition. For example, 12 QTL were detected in WS, 8 of which had LOD scores over 4. Under the WW condition, 7 QTL were detected, 3 of which had LOD scores over 4. In addition, more QTL in WS were detected in more than one environment. For example, the QTL around 217 cM on chromosome 1 was detected in WSM1 and WSZ1, and the QTL around 120 cM on chromosome 2 was detected in WSM2 and WSZ2. In comparison, more QTL in WW were detected in only one environment, especially those not identified by QEI mapping.
The genetic architecture of quantitative traits could be more complicated than the additive, dominance and digenic epistasis effects discussed in this study. The environments where the traits are phenotyped can be equally complicated. Although we see benefits in applying QEI analysis to multi-environmental trials, we cannot exclude the use of QTL mapping in each environment followed by a summary of the mapping results across environments. Neither can we exclude the use of estimated breeding values in QTL mapping where the target is to locate highly adapted and stable genes.
S1 File. Input files for simulating BC1 populations under null genetic model and the 1000 simulated BC1 populations to validate empirical LOD threshold formula under 1, 2, 3, 4, 5, 6, 8 and 10 environments, respectively.
S2 File. Input files for simulating F2 populations under null genetic model and the 1000 simulated F2 populations to validate empirical LOD threshold formula under 1, 2, 3, 4, 5, 6, 8 and 10 environments, respectively.
S3 File. Input files for simulating DH populations under unlinked genetic model and the 1000 simulated DH populations.
S4 File. Input files for simulating DH populations under linked genetic models (L1-L8) with H2 = 0.1 and the 1000 simulated DH populations.
S5 File. Input files for simulating DH populations under linked genetic models (L1-L8) with H2 = 0.5 and the 1000 simulated DH populations.
S6 File. Input files for simulating DH populations under linked genetic models (L1-L8) with H2 = 0.8 and the 1000 simulated DH populations.
S7 File. Input files for maize RIL population used in single-environment analysis and QEI mapping.
S8 File. QTL identified in the 1000 simulated DH populations under unlinked genetic model.
Conceived and designed the experiments: JW LZ. Performed the experiments: SL LZ. Analyzed the data: SL. Contributed reagents/materials/analysis tools: SL JW LZ. Wrote the paper: SL JW LZ.
- 1. Boer MP, Wright D, Feng L, Podlich DW, Luo L, Cooper M, et al. A mixed-model quantitative trait loci (QTL) analysis for multiple-environment trial data using environmental covariables for QTL-by-environment interactions, with an example in maize. Genetics. 2007; 177: 1801–1813. pmid:17947443
- 2. Collard BCY, Mackill DJ. Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Phil Trans R Soc B. 2008; 363: 557–572. pmid:17715053
- 3. El-Soda M, Malosetti M, Zwaan BJ, Koornneef M, Aarts MGM. Genotype × environment interaction QTL mapping in plants: lessons from Arabidopsis. Trends Plant Sci. 2014; 19: 390–398. pmid:24491827
- 4. Pillen K, Zacharias A, Léon J. Advanced backcross QTL analysis in barley (Hordeum vulgare L). Theor Appl Genet. 2003; 107: 340–352. pmid:12677407
- 5. Jiang C, Zeng Z. Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics. 1995; 140: 1111–1127. pmid:7672582
- 6. Jansen RC, Van Ooijen JW, Stam P, Lister C, Dean C. Genotype-by-environment interaction in genetic mapping of multiple quantitative trait loci. Theor Appl Genet. 1995; 91: 33–37. pmid:24169664
- 7. Li H, Ye G, Wang J. A modified algorithm for the improvement of composite interval mapping. Genetics. 2007; 175: 361–374. pmid:17110476
- 8. Tinker NA, Mather DE. Methods for QTL analysis with progeny replicated in multiple environments. J Quantitative Trait Loci. 1995; Available: http://wheat.pw.usda.gov/jag/papers95/paper195/jqtl15.html.
- 9. Korol AB, Ronin YI, Nevo E. Approximate analysis of QTL-environment interaction with no limits on the number of environments. Genetics. 1998; 148: 2015–2028. pmid:9560414
- 10. Hackett CA, Meyer RC, Thomas WTB. Multi-trait QTL mapping in barley using multivariate regression. Genet Res (Camb). 2001; 77: 95–106.
- 11. Haley CS, Knott SA. A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity. 1992; 69: 315–324. pmid:16718932
- 12. Wang DL, Zhu J, Li ZK, Paterson AH. Mapping QTLs with epistatic effects and QTL×environment interactions by mixed linear model approaches. Theor Appl Genet. 1999; 99: 1255–1264.
- 13. Piepho H. A mixed-model approach to mapping quantitative trait loci in barley on the basis of multiple environment data. Genetics. 2000; 156: 2043–2050. pmid:11102394
- 14. Malosetti M, Voltas J, Romagosa I, Ullrich SE, van Eeuwijk FA. Mixed models including environmental covariables for studying QTL by environment interaction. Euphytica. 2004; 137: 139–145.
- 15. Vargas M, van Eeuwijk FA, Crossa J, Ribaut J. Mapping QTLs and QTL×environment interaction for CIMMYT maize drought stress program using factorial regression and partial least squares methods. Theor Appl Genet. 2006; 112: 1009–1023. pmid:16538513
- 16. Crossa J, Vargas M, van Eeuwijk FA, Jiang C, Edmeades GO, Hoisington D. Interpreting genotype×environment interaction in tropical maize using linked molecular markers and environmental covariables. Theor Appl Genet. 1999; 99: 611–625. pmid:22665197
- 17. van Eeuwijk FA, Crossa J, Vargas M, Ribaut J. Analysing QTL-environment interaction by factorial regression, with an application to the CIMMYT drought and low-nitrogen stress programme in maize. In: Kang MS (ed) Quantitative genetics, genomics and plant breeding. Wallingford: CABI; 2002. pp. 245–256.
- 18. Chen X, Zhao F, Xu S. Mapping environment-specific quantitative trait loci. Genetics. 2010; 186: 1053–1066. pmid:20805558
- 19. Zhao F, Xu S. Genotype by environment interaction of quantitative traits: a case study in barley. G3. 2012; 2: 779–788. pmid:22870401
- 20. Gauch HG Jr, Rodrigues PC, Munkvold JD, Heffner EL, Sorrell M. Two new strategies for detecting and understanding QTL×environment interactions. Crop Sci. 2011; 51: 96–113.
- 21. Wang J, Li H, Zhang L. Genetic Mapping and Breeding Design. 1st ed. Beijing: Science Press; 2014.
- 22. Li H, Ribaut J, Li Z, Wang J. Inclusive composite interval mapping (ICIM) for digenic epistasis of quantitative traits in populations. Theor Appl Genet. 2008; 116: 243–260. pmid:17985112
- 23. Zhang L, Li H, Li Z, Wang J. Interactions between markers can be caused by the dominance effect of quantitative trait loci. Genetics. 2008; 180: 1177–1190. pmid:18780741
- 24. Wang J. Inclusive composite interval mapping of quantitative trait genes. Acta Agron Sin. 2009; 35: 239–245 (in Chinese with English abstract).
- 25. Wu W, Li W, Lu H. A general approach for filtrating genetic background noise in QTL mapping. Journal of Biomathematics. 1998; 13: 592–595.
- 26. Li H, Zhang L, Wang J. Estimation of statistical power and false discovery rate of QTL mapping methods through computer simulation. Chin Sci Bull. 2012; 57: 2701–2710.
- 27. Zhang L, Li H, Wang J. The statistical power of inclusive composite interval mapping in detecting digenic epistasis showing common F2 segregation ratios. J Integr Plant Biol. 2012; 54: 270–279. pmid:22348947
- 28. Alves AA, Rosado CCG, Faria DA, Guimarães LMS, Lau D, Brommonschenkel SH, et al. Genetic mapping provides evidence for the role of additive and non-additive QTLs in the response of inter-specific hybrids of Eucalyptus to Puccinia psidii rust infection. Euphytica. 2012; 183: 27–38.
- 29. Li X, Chen X, Xiao Y, Xia X, Wang D, He Z, et al. Identification of QTLs for seedling vigor in winter wheat. Euphytica. 2014; 198: 199–209.
- 30. Njau PN, Bhavani S, Huerta-Espino J, Keller B, Singh RP. Identification of QTL associated with durable adult plant resistance to stem rust race Ug99 in wheat cultivar ‘Pavon 76’. Euphytica. 2013; 190: 33–44.
- 31. Yuste-Lisbona FJ, Capel C, Sarria E, Torreblanca R, Gómez-Guillamón ML, Capel J, et al. Genetic linkage map of melon (Cucumis melo L) and localization of a major QTL for powdery mildew resistance. Mol Breed. 2011; 27: 181–192.
- 32. Zeng Z. Precision mapping of quantitative trait loci. Genetics. 1994; 136: 1457–1468. pmid:8013918
- 33. Whittaker JC, Thompson R, Visscher PM. On the mapping of QTL by regression of phenotype on marker-type. Heredity. 1996; 77: 23–32.
- 34. Meng X, Rubin DB. Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika. 1993; 80: 267–278.
- 35. Piepho HP. A quick method for computing approximate thresholds for quantitative trait loci detection. Genetics. 2001; 157: 425–432. pmid:11139522
- 36. Sun Z, Li H, Zhang L, Wang J. Properties of the test statistic under null hypothesis and the calculation of LOD threshold in quantitative trait loci (QTL) mapping. Acta Agron Sin. 2013; 39: 1–11 (in Chinese with English abstract).
- 37. Meng L, Li H, Zhang L, Wang J. QTL IciMapping: Integrated software for genetic linkage map construction and quantitative trait locus mapping in biparental populations. The Crop Journal 2015; in press.
- 38. Li H, Hearne S, Bänziger M, Li Z, Wang J. Statistical properties of QTL linkage mapping in biparental genetic populations. Heredity. 2010; 105: 257–267. pmid:20461101
- 39. Messmer R, Fracheboud Y, Bänziger M, Vargas M, Stamp P, Ribaut J. Drought stress and tropical maize: QTL-by-environment interactions and stability of QTLs across environments for yield components and secondary traits. Theor Appl Genet. 2009; 119: 913–930. pmid:19597726
- 40. Churchill GA, Doerge RW. Empirical threshold values for quantitative trait mapping. Genetics. 1994; 138: 963–971. pmid:7851788
- 41. Qin Y, Liu R, Mei H, Zhang T, Guo W. QTL Mapping for Yield Traits in Upland Cotton (Gossypium hirsutum L). Acta Agron Sin. 2009; 35: 1812–1821 (in Chinese with English abstract).
- 42. Li Q, Yang X, Xu S, Cai Y, Zhang D, Han Y, et al. Genome-wide association studies identified three independent polymorphisms associated with α-tocopherol content in maize kernels. PLoS One 2012 May 15; 7(5).