A Complete Solution for Dissecting Pure Main and Epistatic Effects of QTL in Triple Testcross Design

Epistasis plays an important role in genetics, evolution and crop breeding. To detect the epistasis, triple test cross (TTC) design had been developed several decades ago. Classical procedures for the TTC design use only linear transformations Z1, Z2 and Z3, calculated from the TTC family means of quantitative trait, to infer the nature of the collective additive, dominance and epistatic effects of all the genes. Although several quantitative trait loci (QTL) mapping approaches in the TTC design have been developed, these approaches do not provide a complete solution for dissecting pure main and epistatic effects. In this study, therefore, we developed a two-step approach to estimate all pure main and epistatic effects in the F2-based TTC design under the F2 and F∞ metric models. In the first step, with Z1 and Z2 the augmented main and epistatic effects in the full genetic model that simultaneously considered all putative QTL on the whole genome were estimated using empirical Bayes approach, and with Z3 three pure epistatic effects were obtained using two-dimensional genome scans. In the second step, the three pure epistatic effects obtained in the first step were integrated with the augmented epistatic and main effects for the further estimation of all other pure effects. A series of Monte Carlo simulation experiments has been carried out to confirm the proposed method. The results from simulation experiments show that: 1) the newly defined genetic parameters could be rightly identified with satisfactory statistical power and precision; 2) the F2-based TTC design was superior to the F2 and F2:3 designs; 3) with Z1 and Z2 the statistical powers for the detection of augmented epistatic effects were substantively affected by the signs of pure epistatic effects; and 4) with Z3 the estimation of pure epistatic effects required large sample size and family replication number. The extension of the proposed method in this study to other base populations was further discussed.


Introduction
Epistasis, the interaction between genes, plays an important role in genetics, evolution and crop breeding. First, it is an important genetic component in the genetic architecture of complex traits [1,2]. Next, it can lead to heterosis [3][4][5][6][7], which is very important in hybrid breeding. In addition, it is a driving force in evolution and plays a central role in founder effect models of speciation [1,8,9]. Over the past several decades, many attempts have been made to detect the epistasis. One important attempt was triple test cross (TTC) design developed by Kearsey and Jinks [10], which is a powerful breeding design as well. Therefore, the great importance associated with the epistasis necessitates an in-depth study of the TTC design.
The TTC design is to cross the ith individual (i = 1,2,…n) of an F 2 population (or backcross, recombinant inbred lines (RIL) and near isogenic lines (NIL)) to the same three testers, the two inbred lines (P 1 and P 2 ) and their F 1 , to produce 3n families. The design is considered the most efficient model as it provides not only a precise test for epistasis, but also unbiased estimates of additive and dominance components if epistasis is absent [10]. In early studies, only the phenotypic data of quantitative traits were used in the TTC to infer the nature of the additive, dominance and epistatic effects of polygenes using classical generation mean [11][12][13] and variance component analysis [10,12,[14][15][16][17]. However, these conventional biometrical genetic procedures deal only with the collective effects of all the polygenes [6,7,11,12]. The introduction of molecular markers has facilitated the mapping of quantitative trait loci (QTL) in numerous species, and substantial progress has been achieved in the detection of individual QTL and their interaction in the RIL-and NIL-based TTC designs.
In the RIL-based TTC designs, Kearsey et al. [12] employed the marker difference regression of Kearsey and Hyne [18] to detect QTL for 22 quantitative traits in Arabidopsis thaliana. Frascaroli et al. [16] used composite interval mapping [19] to identify main-effect QTL and the mixed linear model approach [20] to detect digenic epistatic QTL in the analyses of heterosis in maize. The method has been used to identify the main-effect QTL and digenic epistatic QTL underlying the heterosis of nine important agronomic and economic traits in rice by Li et al. [17]. However, the additive and dominant effects estimated from the above approaches are confounded with epistatic effect if epistasis is present. To overcome this issue, Melchinger et al. [21] derived quantitative genetic expectations of QTL main and interaction effects in the RIL-based TTC design. On their theoretical findings, using one-dimensional genome scans, we can estimate augmented additive and dominance effects [7] and QTL-bygenetic background interaction, whereas using two-way ANOVA between all pairs of marker loci, we can estimate additive-byadditive (aa) and dominance-by-dominance (dd) interactions. Kusterer et al. [22] applied the novel approaches of Melchinger et al. [7,21] to detect QTL for heterosis of biomass-related traits in Arabidopsis. In the above studies, only one variable was involved at one time. To increase the power of QTL detection, Kusterer et al. [22] adopted multi-variable joint analysis [23], as proposed by Melchinger et al. [7] for QTL mapping in the NCIII design.
In the NIL-based TTC design, Melchinger et al. [21] used two QTL mapping methods to study heterosis in Arabidopsis. In the generation means approach, additive, dominance and QTL 6 genetic background epistasis effects were tested and estimated, and the approach along with particular two-segment NILs was applied by Reif et al. [24] to map aa digenic interaction. In addition, Zhu and Zhang [25] derived formulae for calculating the statistical power in the detection of epistasis; and Wang et al. [26] used interval mapping [27] to detect QTL underlying endosperm traits and demonstrated that the TTC provided a reasonably precise and accurate estimation of QTL positions and effects, especially the two dominant effects, which perfectly overcomes the drawback of the F 2:3 design.
The objective of this study was to estimate, in an unambiguous and unbiased manner, all the main and epistatic effects of QTL in the F 2 -based TTC design. A series of Monte Carlo simulation experiments was carried out to confirm the proposed approach. The extension of the new method to other base populations in the TTC was discussed as well.

Genetic design and data collection
An F 2 population was derived from two inbred lines (P 1 and P 2 ) that differed significantly in the quantitative traits of interest and possessed abundant polymorphism molecular markers. A random sample of n F 2 individuals (female parents) was backcrossed to three testers, the two parental lines and their F 1 , to produce 3n families (L 1i , L 2i and L 3i ). All of the 3n families, each with m replications, were planted. Molecular marker information was observed from all of the n F 2 individuals, whereas quantitative traits were measured for all of the 3nm TTC progeny. The phenotypic observations were denoted by y tij , where t~1,2 and 3 for L 1 , L 2 and L 3 ; i~1,2, Á Á Á ,n and j~1,2, Á Á Á ,m. The family means were denoted by L L ti~P m j~1 y tij .
m. Following Kearsey and Jinks [10] and Melchinger et al. [21], we performed three linear transformations: Z 1i~ L L 1i z L L 2i , Z 2i~ L L 1i { L L 2i and Z 3i~ L L 1i z L L 2i {2 L L 3i . The association between Z t and the marker genotypes of the F 2 plants were used to infer the genetic architecture of the trait.

Genetic models for mapping QTL in the F 2 -based TTC design
The expected genetic values of Z 1i , Z 2i and Z 3i depended on the choice of the metric. Two main metrics, the F 2 and F ' metrics, were adopted for the populations derived from the cross between the two inbred lines [28][29][30]. The derivation of the expected genetic values of Z 1i , Z 2i and Z 3i under both the F 2 and the F ' metric models was presented in Table S1, Table S2, Table S3,  Table S4, Table S5, Table S6, and Supporting Information S2. The genetic effect symbols adopted in this study were referred to Kao and Zeng [28].
Statistical genetic models for mapping QTL under the F 2 metric model. According to the expected genetic value of Z 1i under the F 2 metric model in Table S5, the phenotypic value of Z 1i can be described as: Z 1i~2 mzx a 1 i a 1 zx a 2 i a 2 zx a 1 a 2 i i a 1 a 2 zx a 1 d 2 i i a 1 d 2 zx d 1 a 2 i i d 1 a 2 zx d 1 d 2 where m is the mean genotypic value of the F 2 population; a k and d k are additive and dominance effects of the kth QTL, respectively; i a 1 a 2 , i a 1 d 2 , i d 1 a 2 and i d 1 d 2 are additive-by-additive, additive-by-dominance, dominance-by-additive and dominanceby-dominance interactions between the 1st and 2nd QTL, respectively; x a 1 i , x a 2 i , x a 1 a 2 i , x a 1 d 2 i , x d 1 a 2 i and x d 1 d 2 i are dummy variables and determined by the genotype of the ith F 2 plant (Table S5); and e 1i is the residual error with an N(0,s 2 1 ) distribution. According to the results in Table S5, there are To solve the genetic parameters, model (1) must be reduced to: where a 1 a 2 i~xd 1 d 2 i z 1 2 . If the quantitative trait was controlled by q QTL, model (2) should be extended to: where model mean m Z 1~2 m{ 1  (Table 1). The coefficients for the genotype M k m k M l m l were integrated by the frequencies of M k M l =m k m l and M k m l =m k M l . The augmented epistatic effects ( i < kl ) are ignored in Melchinger et al. [21], this may result in a bigger residual error and lower statistical power.
In the same way, the phenotypic value of Z 2i can be described as: where u d 1 i , u d 2 i , u a 1 a 2 i , u a 1 d 2 i and u d 1 a 2 i are determined by the genotype of the ith F 2 plant (Table S5); and e 2i is the residual error with an N(0,s 2 2 ) distribution. According to the results in Table S5, there are u a 1 d 2 i~ud 1 a 2 i and u a 1 a 2 i~{ 1 2 (u d 1 i zu d 2 i ). To solve the genetic parameters, model (4) must be reduced to: where m Z 2~a 1 za 2 , d Ã 1~d 1 { 1 2 i a 1 a 2 , d Ã 2~d 2 { 1 2 i a 1 a 2 ,ĩi 12ĩ a 1 d 2 zi d 1 a 2 and u~i i 12 i~u a 1 d 2 i~ud 1 a 2 i .
If the quantitative trait was controlled by q QTL, model (5) should be extended to: where model mean i a k a l is augmented dominance effect of QTL k;ĩi kl~ia k d l zi d k a l is augmented epistatic effect between QTL k and l; and dummy variables u d k i and u~i i kl i are determined by the genotypes of the kth and lth QTL of the ith F 2 plant ( Table 1). The augmented epistatic effects (ĩi kl ) are overlooked in Melchinger et al. [21], this may result in a bigger residual error and lower statistical power. Similarly, the phenotypic value of Z 3i can be described as: where m Z 3~r i a 1 a 2 ; r is the recombination fraction between two QTL under study; and dummy variables v a 1 d 2 i , v d 1 a 2 i and v d 1 d 2 i are determined by the genotype of the ith F 2 plant (Table 1 and  Table S5). Here pure ad, da and dd epistatic effects can be estimated with two-dimensional genome scans. This differs from that in Melchinger et al. [21], in which only dd epistasis is estimated with two-way ANOVA. Models (3), (6) and (7) were working models for our QTL mapping approach in the F 2 -based TTC design. Here we proposed a two-step approach to obtain all the pure main and epistatic effects in the presence of epistasis. In the first step, model (3) can be used to estimate the augmented additive (a Ã k ) and epistatic ( i < kl ) effects, model (6) can be used to estimate the augmented dominance (d Ã k ) and epistatic (ĩi kl ) effects, and model (7) can be used to estimate three types of pure epistatic effects (i a k d l , i d k a l and i d k d l ). In the second step, all estimated epistatic effects in models (3), (6) and (7) were integrated for the estimation of all four types of the pure epistatic effects using i a k a l~i 3. These pure epistatic effects further integrate with the estimates of both a Ã k and d Ã k for the estimation of pure additive and dominance effects, using a k~( a Ã k z 1 2 P q l~1,l=k i d k a l ) and d k~( d Ã k z 1 2 P q l~1,l=k i a k a l ). When epistasis is absent, pure additive (a k ) and dominance (d k ) effects can be directly obtained from model (3) and model (6), respectively.
Genetic models for mapping QTL under the F ' metric model. With Z 1i , Z 2i and Z 3i genetic models for mapping QTL under the F ' metric model have the same forms as described in models (3), (6) and (7), respectively. The detailed derivation was described in Table S6 and Supporting information S1 and the detailed comparisons were given in Tables 1 and 2 MkMk Ml Ml Mkmk ml ml P q l~1,l=k (i a k d l {i d k a l ) and d k~½ d Ã k z 1 2 P q l~1,l=k (i a k a l {i d k d l ).

Genetic parameter estimation
Models (3) and (6) have a uniform appearance. However, the true number of QTL (q) is hard to determine. Variable selection via a stepwise regression or a stochastic search variable selection is the common procedure for epistatic QTL analysis. But these methods are computationally intensive and may not be optimal [31][32][33]. Thus, we adopted the empirical Bayes (E-Bayes) method of Xu [33] for the estimation of parameters in the above models. The E-Bayes approach assumes that there is one QTL standing on each marker throughout the genome and shrinks the genetic effects of all ''nonsignificant'' QTL toward zero. Here, we only gave some necessary procedures; for the technical details of the E-Bayes refer to the original study of Xu [33].
In the expectation and maximization (EM) algorithm of the E-Bayes method [33], model (9) is a typical mixed model and b is treated as a fixed effect, whereas ª is treated as a random effect. Therefore, y has a multivariate normal distribution with the mean m~Xb and the variance-covariance matrix V~P p j~1 Z j Z T j s 2 j zIs 2 . In the EM algorithm of E-Bayes, the genetic parameters ª are the focus of interest and the normal prior is assigned to c j , i.e., c j~N (0,s 2 j ) and s 2 j is further assigned a scaled inverse x 2 prior, The EM algorithm procedures are as follows: 1) Choose j~(t,v)~(0,0) and assign initial values: s 2 1s 2 2~Á Á Á~s 2 p~1 :0, b~(X T X) {1 X T y, s 2~( y{Xb) T (y{Xb)=n. 2) E-step: the best linear unbiased prediction (BLUP) estimation of the expectation of the quadratic term 3) M-step: the maximum-likelihood estimation for s 2 j , fixed effects and residual variance 4) repeat steps 2) -3) until a certain criterion of convergence is satisfied, e.g. the difference of parameter estimate values between two adjacent iterations were less than 10 210 .
In addition, we performed a two-dimension scan using the maximum likelihood approach for the estimation parameters in models (7).

Likelihood ratio test
If we only want to report QTL with relatively large effects and give readers accurate information about how significant the identified QTL were, statistical test should be conducted. The usual likelihood ratio test (LRT) cannot be carried out with the E-Bayes method owing to an oversaturated epistatic genetic model. We proposed the following two-stage selection process to screen the QTL [31]. In the first stage, all QTL with t j~Db b j D .ŝ s j w2:0 are picked up. In the second stage, the epistatic genetic model is modified so that only effects past the first round of selection are included in the model. Owing to the smaller dimensionality of the reduced model, we can use the maximum likelihood method to reanalyze the data and perform the LRT [31]. The test statistic is where h is the parameters vector in the statistical genetic model in the second stage analysis of model (

Experiment I
The purpose of the simulation experiment was: (1) to evaluate the statistical performance of the proposed approach; (2) to compare the proposed method with previous approaches, such as Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] or Melchinger et al. [7,21] and Kusterer et al. [22], according to statistical power, standard deviation and accuracy measure; and (3) to compare the TTC design with the F 2 and F 2:3 genetic designs.
The simulated genome consisted of three chromosomes (chr1, chr2 and chr3), and 11 evenly spaced markers covered each chromosome with an average marker interval of 10.0 cM. We simulated three main-effect QTL and one pair-wise interaction QTL, all of which overlapped with markers. All three main-effect QTL were located at the center (50.0 cM) of each chromosome, and QTL 2 on chr2 interacted with QTL 3 on chr3. The genetic parameters under both the F 2 and the F ' metric models were as follows: m~100:00; a 1~1 :50 and d 1~1 :50 for QTL 1 ; a 2~2 :00 and d 2~{ 1:00 for QTL 2 ; a 3~{ 1:00 and d 3~2 :00 for QTL 3 ; i a2a3~1 :00, i a2d3~1 :50, i d2a3~1 :00 and i d2d3~1 :50 for the epistatic effects between QTL 2 and QTL 3 . The marginal heritabilities of these genetic effects varied from 1.01% to 36.54%. The sample size (n), the number of individual in the F 2 population, was set at two levels: 200 and 400. The number of individuals (m) for each TTC family was set at 1, 5 and 10. The environmental variance (s 2 e ) was set at 4.00 and 1.00. To implement the last objective of the simulation experiment, two other kinds of populations, the F 2 and F 2:3 populations, were also simulated. However, molecular marker information for all three populations was derived from the corresponding F 2 individuals. Each treatment was replicated 200 times for the TTC and F 2:3 designs and 400 times for the F 2 design. In the analyses of the TTC family data, two approaches were adopted: 1) Method A, the proposed method in this study, and 2) Method B, the modified method of Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] or Melchinger et al. [7,21] and Kusterer et al. [22], by removing the augmented epistatic effects from models (3) and (6). In the analyses of the F 2 and F 2:3 datasets, all of the main effects and all of the pair-wise interaction effects for all of the markers on the whole genome were simultaneously included in the genetic model. For each simulated QTL, we counted the samples in which the LOD statistic was greater than 2.5 and the identified QTL was within 20.0 cM of the simulated QTL. The estimate for QTL parameter was the average of the corresponding estimates in the counted samples. The ratio of the number of such samples to the total number of replicates represented the empirical power of this QTL.
To achieve the first objective of the simulation experiment, Z 1 , Z 2 and Z 3 were analyzed by Method A. In the first step, with Z 1 or Z 2 33 augmented additive or dominance effects (a Ã k or d Ã k ) and 528 augmented epistatic effects ( i < kl orĩi kl ) were estimated, and with Z 3 1584 pure epistatic effects (i akdl ,i dkal and i dkdl ) were estimated. All the effects were tested by likelihood ratio statistic in order that real QTL could be identified. The results for detected QTL under the F 2 metric model were listed in Table 3, Table 4, Table 5. The results show that the newly defined parameters, i.e., m Z k , a Ã k , d Ã k (k~1,2,3), i < 23 andĩi 23 , were estimated in an almost unambiguous and unbiased manner, and all of the main-effect QTL were identified with a high statistical power and precision in the estimated effects and positions of the QTL by taking the TTC family mean as the unit of phenotypic measurement. The augmented epistatic QTL ( i < 23 andĩi 23 ) were also well detected, except for the situation when n~200, m~5 and s 2 e~4 :00. In the second step, all the pure main and epistatic effects would be estimated in an unbiased manner (Table 6). It should also be noted that a large sample (n §400), a greater family replication number (m §10), and moderate QTL heritability (s 2 e ƒ1:00) are needed for the partition of the augmented epistatic effects ( i < 23 andĩi 23 ) into its components (aa, ad, da and dd), and detecting dd epistasis is more difficult than detecting ad epistasis (Tables 5 and 6). The theoretical explanation is that ad (also da) has a larger contribution to the genetic variance of Z 3 than dd when r 23~0 :50, Supporting Information S2). In addition, the powers in the detection of the augmented epistatic effects ( i < 23 in Table 3 andĩi 23 in Table 4) were always much higher than those of pure epistatic effects (ad, da and dd in Table 5). The possible explanations lie in that 1) the augmented epistatic effects ( i < 23~ia2a3 zi d2d3 andĩi 23~ia2d3 zi d2a3 ) were the sum of two epistatic effects with the same signs in Experiment I and were inflated, and 2) these epistatic effects have different contributions to the genetic variances of Z 1 , Z 2 and Z 3 (Supporting Information S2).
To achieve the second objective of the simulation experiment, Z 1 and Z 2 were re-analyzed by method B and the results under the F 2 metric model were also listed in Tables 3 and 4. The results show that the Z 1 and Z 2 could still be used to unbiasedly   * n denotes sample size; m is number of replications; and s 2 e is residual variance for the phenotypic trait value y tij . The numbers in parentheses are standard deviation and the same is true for the later tables except for Table 6 to  Tables 3 and 4 were indeed the newly defined additive effects (a Ã 2 and a Ã 3 ) and the new dominance effects (d Ã 2 and d Ã 3 ) with slightly poorer precision (little larger in standard deviation) in estimated QTL effects and positions and lower statistical power. This means that the new method was better than the previous methods of Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] in the presence of epistasis. The higher statistical power and smaller error variance for method A over method B shows that the new method was also superior to the methods of Melchinger et al. [7,21] and Kusterer et al. [22].
To achieve the third objective of the simulation experiment, the F 2 and F 2:3 data were analyzed and the results under the F 2 metric model were listed in Tables 7 and 8. The results show that many effects could be estimated in an unambiguous and unbiased manner in the F 2 and F 2:3 genetic designs. In the situation of m~1, the F 2 design was superior to the both TTC and F 2:3 designs. The reasons are as follows. In all the above three designs, marker genotypes were from F 2 individuals. If m~1, genotype sampling error was large for both TTC and F 2:3 designs. Meanwhile, the proposed approach in this study did not consider the mixed distribution of the F 2:3 (or TTC) progeny derived from heterozygous F 2 parents. However, the powers in the detection of the main and epistatic QTL were smaller for the F 2 design than for the TTC design with m~5 (or 10) when sample size (n) was small and/or environmental variance (s 2 e ) was large, and the same trend was obtained for the precision of the estimates for the effects and the positions of the main and epistatic QTL. For example, when n~200 and s 2 e~4 :00, the power for main effects a 1 and d 1 were 0.850 and 0.775 and the standard deviation (SD) were 0.253 and 0.308, respectively, in F 2 design (Table 7); while the power for a 1 and d 1 were 1.000 and 1.000 and the SD were 0.118 and 0.104, respectively, in TTC design with a family replication of 10 (Tables 3 and 4). This may be due to the fact that the phenotypic value is measured from F 2 individuals and from the TTC family, and the family mean can be used to decrease the residual variance and to improve the precision of the phenotypic data. Both the TTC and F 2:3 designs use family mean to decrease environmental variance and improve the precision of phenotype of quantitative trait. In addition, the dominant components decrease significantly in the F 2:3 design due to its self-crossing, and the statistical powers for detecting dominance effects, additive by dominance (dominance by additive) epistatic effect and especially dominance by dominance epistatic effect in the F 2:3 design will be lower than that in the TTC design. For example, when n~400, m~10 and s 2 e~4 :00, the power of 0.170 for i d2d3 in F 2:3 (Table 8) was much lower than that of 0.490 in the TTC ( Table 5). The genetic variance contributed by the simulated three QTL under TTC and F 2:3 designs were These variance component can be used to interpret the above simulated experiments results.

Experiment II
The purpose of the simulation experiment was to show the statistical properties of the proposed approach in the TTC design when the augmented epistatic effects consisted of two epistatic effects of equal strength in opposite directions. The genetic parameters under both the F 2 and the F ' the metric models were as follows: m~100:00; a 1~1 :50, d 1~1 :50 for QTL 1 ; a 2~2 :00, d 2~{ 1:00 for QTL 2 ; a 3~{ 1:00, d 3~2 :00 for QTL 3 ; i a2a3~1 :00, i a2d3~1 :50, i d2a3~{ 1:00 and i d2d3~{ 1:50 for the epistatic effects between QTL 2 and QTL 3 . The marginal heritabilities of these genetic effects now varied from 0.98% to 38.75%. The value of m was set at 5 and 10. The other settings were the same as those in Experiments I.
The results for Experiments II are listed in Table 9, Table 10, Table 11. The results show that the powers in the detection of the augmented epistatic effects ( i < 23 in Table 9 andĩi 23 in Table 10) were very low. The results are reasonable because the genetic contributions of the augmented epistatic effects to the genetic variance of Z 1 and Z 2 were low. However, the powers for pure epistatic effects (i ad , i da and i dd ) remained steady (Tables 5 and 11) because the genetic contributions for these effects do not change.

Experiment III
We simulated a large genome to explore the performance of the proposed method in real data analysis. The simulated genome was 1000.0 cM in total length and covered by 210 markers (10 chromosomes, each covered with twenty-one 5.0 cM equally spaced markers). Ten main-effect QTL and three pairs of interacted QTL, which totally explained ,50% variation of L 1 , L 2 and L 3 , were assumed (Tables 12 and 13). The environmental variance (s 2 e ), sample size and family replication number were set at 6.0, 500 and 10, respectively. The mapping results from 200 samples under the F 2 metric model were presented in Table 12 for the main-effect QTL and Table 13 for the epistatic QTL. Results from Table 12 showed that all the augmented main effects were unbiasedly estimated with satisfactory powers; and most pure additive and dominance effects were also unbiasedly estimated with the exception of pure dominance effects for QTL 5 and QTL 8 . The results from Table 13 demonstrated that with Z 1 and Z 2 the augmented epistatic effects ( i < andĩi ) were well estimated when they consisted of two epistatic effects with same sign (QTL 4 and QTL 7 , QTL 9 and QTL 10 ) and were poorly detected when they consisted of two epistatic effects of equal strength in opposite directions ( i < 58 andĩi 58 for QTL 5 and QTL 8 ); with Z 3 all the pure epistatic effects (i ad , i da and i dd ) were well estimated, and no matter what signs they were; and all pure epistatic effects (i aa , i ad , i da and i dd ) estimated in the second stage were unbiased except for i aa for QTL 5

Experiment IV
This simulation experiment was to consider the situation that QTL stands on the position in the marker interval. The three simulated QTL were placed at 45.0 (the middle of marker Table 6. Estimation of pure main and epistatic effects of QTL in the F 2 -based TTC design using the two-step approach under the cases of n = 400, m = 10 and s 2 e~1 :00 (200 replicates). interval), 52.5 (the right of the sixth marker) and 47.5 cM (the left of the sixth marker), respectively. The number of individuals (m) for each TTC family was set at 5 and 10. The other settings were the same as those in the Experiment I. The results were shown in Table 14, Table 15, Table 16. The accuracies for the effects and the positions of QTL, as well as the empirical power, were satisfied but lower than those presented in Table 3, Table 4, Table 5; and the QTL effects were slightly underestimated because of the recombination between QTL and its adjacent marker.

Discussion
Compared to previous studies on the methodologies for the TTC, the method described here offers advantages over the previous approaches. First, with Z 1 or Z 2 all augmented main and epistatic effects (a Ã k , d Ã k , i < kl andĩi kl ) were included simultaneously in one genetic model and estimated together by the E-Bayes approach. Our simulation studies showed that these augmented effects could be estimated with very high power and precision when the component epistatic effects (i a k a l and i d k d l or i a k d l and i d k a l ) of i < kl andĩi kl have the same direction (Tables 3, 4 and 13). Even though these epistatic effects have different signs, the new approach works well for augmented main-effect QTL parameters (Tables 9, 10 and 12).
Second, with Z 3 three pure epistatic effects (i a k d l , i d k a l and i d k d l ) were estimated simultaneously in this study by two-dimensional genome scans. Although we attempted to use a full genetic model that included all the digenic epistatic effects for the estimation of all the epistatic effects under the framework of E-Bayes, it failed. The reasons are unclear. To date, there have been several approaches to detect the epistasis in the RIL-based TTC and NCIII designs, little is currently reported about the estimation of more than two epistatic effects in the TTC. Frascaroli et al. [16] and Li et al. [17] adopted the mixed linear model approach of Wang et al. [20] to detect i a k a l in the analyses of Z 1 and i d k d l in the analyses of Z 2 ; and Kusterer et al. [22] and Melchinger et al. [21] used two-way ANOVA on L 3 and Z 3 for the detection of i a k a l and i d k d l , respectively. However, the two studies involved only one digenic epistatic effect. Although multiple interval mapping has been used to detect the augmented epistatic effects ( i < kl andĩi kl ) by Garcia et al. [34], the genetic design is NCIII and the estimate is a compound effect, not a pure epistatic effect. In addition, Reif et al. [24] proposed a two-step procedure to detect i a k a l with particular two-segment NILs.
Finally, many main and epistatic effects can be estimated in an unambiguous and unbiased manner by our two-step approach. In the first step, the augmented main and epistatic effects (a Ã k ,d Ã k , i < kl andĩi kl ) and three pure epistatic effects (i akdl , i dkal and i dkdl ) may be estimated in the separate analyses of Z 1 , Z 2 and Z 3 . In the next step, all four pure epistatic effects (i a k a l , i a k d l , i d k a l and i d k d l ) may be estimated by using the equation i < kl~iakal zi dkdl and i i kl~( i a k d l zi d k a l ) and pure additive and dominant effects may be further estimated by using the equations of a Ã k and d Ã k . The simulation results show that the two-step approach works well (Tables 6, 12 and 13). However, the pure epistatic effects (i a k d l , i d k a l and i d k d l ) could not be detected with satisfactory statistical power when the sample size (n) and family replication number (m) were low (Tables 5 and 11). Therefore, a large n and m are needed for the detection of epistasis. To accommodate larger n, suitable field experimental designs, such as split-plot design [13,16] and block in replication [35], are desired to control for environmental error.
The F 2 -based TTC design is superior to the F 2 design for the detection of main-effect and epistatic QTL when there is a small Table 7. Results of QTL mapping in F 2 population under the F 2 metric model (400 replications).        sample size and a large residual variance (Tables 3, 4, 5 and 7), and is more powerful for estimating d k , i a k d l (or i d k a l ) and especially i d k d l than the F 2:3 design (Tables 4, 5 and 8). The new method may be extended to the TTC design derived from other base populations, such as RIL, BC and DH. This is because the genetic models for Z 1 , Z 2 and Z 3 in these new TTC designs can be described in the same manner. In Tables S7, S8 and Supporting Information S3 we only presented the expected genetic values and genetic variance for Z 1 , Z 2 and Z 3 under both the F 2 and the F ' metric models in the RIL-based TTC design.
The proposed approach in this study differs from the previous methods of Kearsey et al. [12], Frascaroli et al. [16], Melchinger et al. [7,21] and Li et al. [17]. First, the former derives the linear regression models for Z 1 , Z 2 and Z 3 and the latter makes use of ANOVA. Thus, the precondition for the former is to derive the dummy variables for each genetic effects, whereas the precondition for the latter is to obtain the expectation and expected mean squares. In the expectation and expected mean squares, if one effect is confounded by another effect, these confounded effects may be estimated together. That is the augmented effect in the above ANOVA. If there are multicollinear relationships among dummy variables, the corresponding effects cannot be estimated. However, the effect combination is estimable. That is the augmented effect in the linear regression analysis. This can explain why we construct augmented effects. Second, we consider all the main-effect QTL and all the digenic interactions in one model of Z 1 or Z 2 , all the augmented additive, dominance and epistatic effects have been rightly defined, and all the pure main and epistatic effects can be unbiasedly estimated. Although in the previous studies the augmented additive and dominant effects (a Ã k and d Ã k ) have been rightly defined and are clearly confounded by QTL 6 genetic background epistasis in the RIL-based TTC and NCIII designs [7,21,22], the augmented epistatic effects have been ignored. This neglect would result in a biased estimation for the augmented main effects, a larger residual variance and a lower power of QTL detection (Tables 3 and 4). In addition, with Z 3 we can estimate three types of pure epistatic effects (ad, da and dd) using two-dimensional genome scans. This differs from Melchinger et al. [21], in which only dd epistasis can be obtained.
The F 2 and F ' are two main metrics that are adopted for populations derived from a cross between two inbred lines. The F 2 metric is orthogonal for the F 2 population when epistatic genes are under linkage equilibrium, whereas the F ' metric is orthogonal for homozygous lines [28][29][30]]. An orthogonal model implies that estimates of the genetic effects are consistent in a full and reduced model and is directly related to the partition of the genetic variance in the population. Using different models does not influence the detection of the main and epistatic QTL, but it does influence the estimation and interpretation of genetic effects [30]. Melchinger et al. [7,21] and Kusterer et al. [13,22] advocated the F 2 metric in the RIL-based NCIII and TTC designs for three reasons: (1) it has the advantage that each variance component is proportional to the sum of the squares of the corresponding genetic effects and does not involve any other type of genetic effects that could obscure their interpretation; (2) epistatic interactions by two-way ANOVAs for pairs of marker loci using Z 3i was just i dd ; and (3)  effects (i ad , i da , and i dd ) could also be detected and well estimated under both metrics when the sample size and number of family replications were large in our simulation studies (Tables 5, 11 and 13). The differences under the two metrics may be as follows: (1) the newly defined main effects and model means are different for the Z 1 and Z 2 under the two models; and (2) the F 2 metric model seems to behave better than the F ' metric model (higher power and precision) (data not shown).
The proposed approach in this study assumes that all the QTL stand on the markers. When marker density is high, all the QTL can be detected with a high power and precision. When marker density is sparse, the QTL effects are slightly underestimated because of the recombination between QTL and its adjacent marker. To solve the issue, some virtual marker (treated as missing data) may be inserted. At this time marker imputation techniques may be used.
The drawbacks for our method may lie in two aspects: (1) with Z 1 and Z 2 the augmented epistatic effects ( i < andĩi ) were poorly detected when their corresponding components have an equal strength in opposite directions (Tables 9, 10 and 13). This would result in biased estimate for pure aa epistatic effect, such as i a5a8 in Table 13, and further cause bad estimate for pure dominance effect, such as d 5 and d 8 in Table 12; and (2) The estimation error for the pure main and epistatic effects using the two-step approach seemed to be a little large. This will be studied in the future.

Supporting Information
Supporting Information S1 Statistical genetic models for mapping QTL in the TTC design under the F ' metric model.