
A Complete Solution for Dissecting Pure Main and Epistatic Effects of QTL in Triple Testcross Design

  • Xiao-Hong He,

    Affiliation Section on Statistical Genomics, State Key Laboratory of Crop Genetics and Germplasm Enhancement/Chinese National Center for Soybean Improvement, College of Agriculture, Nanjing Agricultural University, Nanjing, Jiangsu, China

  • Yuan-Ming Zhang

    soyzhang@njau.edu.cn

    Affiliation Section on Statistical Genomics, State Key Laboratory of Crop Genetics and Germplasm Enhancement/Chinese National Center for Soybean Improvement, College of Agriculture, Nanjing Agricultural University, Nanjing, Jiangsu, China

Abstract

Epistasis plays an important role in genetics, evolution and crop breeding. To detect epistasis, the triple test cross (TTC) design was developed several decades ago. Classical procedures for the TTC design use only the linear transformations Z1, Z2 and Z3, calculated from the TTC family means of a quantitative trait, to infer the nature of the collective additive, dominance and epistatic effects of all the genes. Although several quantitative trait loci (QTL) mapping approaches in the TTC design have been developed, these approaches do not provide a complete solution for dissecting pure main and epistatic effects. In this study, therefore, we developed a two-step approach to estimate all pure main and epistatic effects in the F2-based TTC design under the F2 and F metric models. In the first step, with Z1 and Z2, the augmented main and epistatic effects in a full genetic model that simultaneously considered all putative QTL on the whole genome were estimated using an empirical Bayes approach, and with Z3, three pure epistatic effects were obtained using two-dimensional genome scans. In the second step, the three pure epistatic effects obtained in the first step were integrated with the augmented epistatic and main effects for the further estimation of all other pure effects. A series of Monte Carlo simulation experiments was carried out to validate the proposed method. The results from the simulation experiments show that: 1) the newly defined genetic parameters could be correctly identified with satisfactory statistical power and precision; 2) the F2-based TTC design was superior to the F2 and F2:3 designs; 3) with Z1 and Z2, the statistical powers for the detection of augmented epistatic effects were substantially affected by the signs of pure epistatic effects; and 4) with Z3, the estimation of pure epistatic effects required a large sample size and family replication number. The extension of the proposed method to other base populations is also discussed.

Introduction

Epistasis, the interaction between genes, plays an important role in genetics, evolution and crop breeding. First, it is an important component of the genetic architecture of complex traits [1], [2]. Next, it can lead to heterosis [3]-[7], which is very important in hybrid breeding. In addition, it is a driving force in evolution and plays a central role in founder effect models of speciation [1], [8], [9]. Over the past several decades, many attempts have been made to detect epistasis. One important attempt was the triple test cross (TTC) design developed by Kearsey and Jinks [10], which is a powerful breeding design as well. The importance of epistasis therefore calls for an in-depth study of the TTC design.

In the TTC design, the ith individual (i = 1, 2, …, n) of an F2 population (or of a backcross, recombinant inbred line (RIL) or near isogenic line (NIL) population) is crossed to the same three testers, the two inbred lines (P1 and P2) and their F1, to produce 3n families. The design is considered the most efficient model because it provides not only a precise test for epistasis but also unbiased estimates of the additive and dominance components if epistasis is absent [10]. In early studies, only the phenotypic data of quantitative traits were used in the TTC to infer the nature of the additive, dominance and epistatic effects of polygenes, using classical generation mean [11]-[13] and variance component analyses [10], [12], [14]-[17]. However, these conventional biometrical genetic procedures deal only with the collective effects of all the polygenes [6], [7], [11], [12]. The introduction of molecular markers has facilitated the mapping of quantitative trait loci (QTL) in numerous species, and substantial progress has been achieved in the detection of individual QTL and their interactions in the RIL- and NIL-based TTC designs.

In the RIL-based TTC designs, Kearsey et al. [12] employed the marker difference regression of Kearsey and Hyne [18] to detect QTL for 22 quantitative traits in Arabidopsis thaliana. Frascaroli et al. [16] used composite interval mapping [19] to identify main-effect QTL and the mixed linear model approach [20] to detect digenic epistatic QTL in the analyses of heterosis in maize. The same method was used by Li et al. [17] to identify the main-effect QTL and digenic epistatic QTL underlying the heterosis of nine important agronomic and economic traits in rice. However, the additive and dominance effects estimated by the above approaches are confounded with epistatic effects if epistasis is present. To overcome this issue, Melchinger et al. [21] derived quantitative genetic expectations of QTL main and interaction effects in the RIL-based TTC design. Based on their theoretical findings, one-dimensional genome scans can be used to estimate augmented additive and dominance effects [7] and QTL-by-genetic background interaction, whereas two-way ANOVA between all pairs of marker loci can be used to estimate additive-by-additive (aa) and dominance-by-dominance (dd) interactions. Kusterer et al. [22] applied the approaches of Melchinger et al. [7], [21] to detect QTL for heterosis of biomass-related traits in Arabidopsis. In the above studies, only one variable was analyzed at a time. To increase the power of QTL detection, Kusterer et al. [22] adopted multi-variable joint analysis [23], as proposed by Melchinger et al. [7] for QTL mapping in the NCIII design.

In the NIL-based TTC design, Melchinger et al. [21] used two QTL mapping methods to study heterosis in Arabidopsis. In the generation means approach, additive, dominance and QTL × genetic background epistasis effects were tested and estimated, and the approach along with particular two-segment NILs was applied by Reif et al. [24] to map aa digenic interaction. In addition, Zhu and Zhang [25] derived formulae for calculating the statistical power in the detection of epistasis; and Wang et al. [26] used interval mapping [27] to detect QTL underlying endosperm traits and demonstrated that the TTC provided a reasonably precise and accurate estimation of QTL positions and effects, especially the two dominant effects, which perfectly overcomes the drawback of the F2:3 design.

In summary, two issues in the detection of QTL in the TTC need to be addressed. First, only a few studies are built on F2-based TTC [25], [26], whereas most are built on RIL [7], [12], [16], [17], [21], [22] and NIL [6], [24]. Second, additive and dominance effects were confounded with QTL-by-genetic background interaction [7], [21], [22] and only aa and dd digenic interactions were evaluated in the RIL-based TTC [16], [17], [21], [22].

The objective of this study was to estimate, in an unambiguous and unbiased manner, all the main and epistatic effects of QTL in the F2-based TTC design. A series of Monte Carlo simulation experiments was carried out to confirm the proposed approach. The extension of the new method to other base populations in the TTC was discussed as well.

Methods

Genetic design and data collection

An F2 population was derived from two inbred lines (P1 and P2) that differed significantly in the quantitative traits of interest and possessed abundant polymorphic molecular markers. A random sample of n F2 individuals (female parents) was backcrossed to three testers, the two parental lines and their F1, to produce 3n families (L1, L2 and L3). All of the 3n families, each with m replications, were planted. Molecular marker information was observed from all of the n F2 individuals, whereas quantitative traits were measured for all of the 3nm TTC progeny. The phenotypic observations were indexed by family type (1, 2 and 3 for L1, L2 and L3), by F2 individual and by replicate, and the family means were obtained by averaging over replicates. Following Kearsey and Jinks [10] and Melchinger et al. [21], we performed three linear transformations of the family means: Z1, Z2 and Z3. The association between each of Z1, Z2 and Z3 and the marker genotypes of the F2 plants was used to infer the genetic architecture of the trait.
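The data reduction described above (replicate observations, then family means, then three linear transformations) can be sketched as follows. The explicit formulas for the transformations did not survive in this copy of the text, so the contrasts used here, Z1 = (L1 + L2)/2, Z2 = L1 - L2 and Z3 = L1 + L2 - 2·L3, are an assumption based on the conventional TTC contrasts; the function name and array layout are likewise illustrative only.

```python
import numpy as np

def ttc_transformations(y):
    """Family means and three TTC contrasts (illustrative sketch).

    y has shape (3, n, m): family type (L1, L2, L3), F2 parent, replicate.
    The scaling of Z1, Z2 and Z3 below is an ASSUMPTION, not taken
    from the text.
    """
    y = np.asarray(y, dtype=float)
    family_means = y.mean(axis=2)      # shape (3, n): average over m replicates
    L1, L2, L3 = family_means
    Z1 = (L1 + L2) / 2.0               # assumed "additive-type" contrast
    Z2 = L1 - L2                       # assumed "dominance-type" contrast
    Z3 = L1 + L2 - 2.0 * L3            # assumed "epistasis-type" contrast
    return Z1, Z2, Z3

# Toy data: n = 2 F2 parents, m = 2 replicates per family
y = np.array([[[10.0, 12.0], [8.0, 8.0]],    # L1 families
              [[6.0, 6.0], [4.0, 6.0]],      # L2 families
              [[7.0, 9.0], [5.0, 7.0]]])     # L3 families
Z1, Z2, Z3 = ttc_transformations(y)
```

Each of the returned vectors has one entry per F2 parent, which is the unit that is later regressed on the marker genotypes.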

Genetic models for mapping QTL in the F2-based TTC design

The expected genetic values of Z1, Z2 and Z3 depended on the choice of the metric. Two main metrics, the F2 and F metrics, have been adopted for populations derived from a cross between two inbred lines [28]-[30]. The derivation of the expected genetic values of Z1, Z2 and Z3 under both the F2 and the F metric models is presented in Table S1, Table S2, Table S3, Table S4, Table S5, Table S6, and Supporting Information S2. The genetic effect symbols adopted in this study follow Kao and Zeng [28].

Statistical genetic models for mapping QTL under the F2 metric model.

According to the expected genetic value of under the F2 metric model in Table S5, the phenotypic value of can be described as:(1)where is the mean genotypic value of the F2 population; and are additive and dominance effects of the kth QTL, respectively; , , and are additive-by-additive, additive-by-dominance, dominance-by-additive and dominance-by-dominance interactions between the 1st and 2nd QTL, respectively; , , , , and are dummy variables and determined by the genotype of the ith F2 plant (Table S5); and is the residual error with an distribution. According to the results in Table S5, there are , and . To solve the genetic parameters, model (1) must be reduced to:(2)where , , , and .

If the quantitative trait was controlled by QTL, model (2) should be extended to:(3)where model mean ; is augmented additive effect of QTL ; is augmented epistatic effect between QTL and ; and and are determined by the genotypes of the kth and lth QTL (marker) of the ith F2 plant (Table 1). The coefficients for the genotype were integrated by the frequencies of and . The augmented epistatic effects () are ignored in Melchinger et al. [21], this may result in a bigger residual error and lower statistical power.

Table 1. Dummy variable values for genetic parameters in the genetic models of Z1, Z2 and Z3 under various marker genotypes of the F2 plant and the F2 and the F metric models.

https://doi.org/10.1371/journal.pone.0024575.t001

In the same way, the phenotypic value of can be described as:(4)where , , , and are determined by the genotype of the ith F2 plant (Table S5); and is the residual error with an distribution. According to the results in Table S5, there are and . To solve the genetic parameters, model (4) must be reduced to:(5)where , , , and .

If the quantitative trait was controlled by QTL, model (5) should be extended to:(6)where model mean ; is augmented dominance effect of QTL ; is augmented epistatic effect between QTL and ; and dummy variables and are determined by the genotypes of the kth and lth QTL of the ith F2 plant (Table 1). The augmented epistatic effects () are overlooked in Melchinger et al. [21], this may result in a bigger residual error and lower statistical power.

Similarly, the phenotypic value of can be described as:(7)where ; is the recombination fraction between two QTL under study; and dummy variables , and are determined by the genotype of the ith F2 plant (Table 1 and Table S5). Here pure ad, da and dd epistatic effects can be estimated with two-dimensional genome scans. This differs from that in Melchinger et al. [21], in which only dd epistasis is estimated with two-way ANOVA.

Models (3), (6) and (7) were working models for our QTL mapping approach in the F2-based TTC design. Here we proposed a two-step approach to obtain all the pure main and epistatic effects in the presence of epistasis. In the first step, model (3) can be used to estimate the augmented additive () and epistatic () effects, model (6) can be used to estimate the augmented dominance () and epistatic () effects, and model (7) can be used to estimate three types of pure epistatic effects (, and ). In the second step, all estimated epistatic effects in models (3), (6) and (7) were integrated for the estimation of all four types of the pure epistatic effects using , and . These pure epistatic effects further integrate with the estimates of both and for the estimation of pure additive and dominance effects, using and . When epistasis is absent, pure additive () and dominance () effects can be directly obtained from model (3) and model (6), respectively.

Genetic models for mapping QTL under the F metric model.

With , and genetic models for mapping QTL under the F metric model have the same forms as described in models (3), (6) and (7), respectively. The detailed derivation was described in Table S6 and Supporting information S1 and the detailed comparisons were given in Tables 1 and 2. The pure epistatic effects under the two metrics are calculated in the same way and the pure additive and dominance effects under the two metrics are calculated in different ways, here and .

Table 2. Genetic parameter component and parameter estimation method for the genetic models of Z1, Z2 and Z3 under the F2 and the F metric models.

https://doi.org/10.1371/journal.pone.0024575.t002

Genetic parameter estimation

Models (3) and (6) have the same form. However, the true number of QTL is hard to determine. Variable selection via stepwise regression or stochastic search variable selection is the common procedure for epistatic QTL analysis, but these methods are computationally intensive and may not be optimal [31]-[33]. Thus, we adopted the empirical Bayes (E-Bayes) method of Xu [33] to estimate the parameters in the above models. The E-Bayes approach assumes that there is one QTL at each marker throughout the genome and shrinks the genetic effects of all “nonsignificant” QTL toward zero. Here, we give only the necessary procedures; for the technical details of E-Bayes, refer to the original study of Xu [33].

Models (3) and (6) can be uniformly written as:(8)where is the model mean; is the augmented main effect of the kth QTL; is the augmented epistatic effect between the kth and lth QTL; is the total number of genetic effects, including the augmented main and epistatic effects; and is the residual error. Model (8) can be expressed in matrix form:(9)where ; ; ; ; and .

In the expectation and maximization (EM) algorithm of the E-Bayes method [33], model (9) is a typical mixed model and is treated as a fixed effect, whereas is treated as a random effect. Therefore, has a multivariate normal distribution with the mean and the variance-covariance matrix .

In the EM algorithm of E-Bayes, the genetic parameters are the focus of interest and the normal prior is assigned to , i.e., and is further assigned a scaled inverse prior, i.e., . The has uniform prior distribution.

The EM algorithm procedures are as follows:

1) Choose and assign initial values: , , .

2) E-step: the best linear unbiased prediction (BLUP) estimation of the expectation of the quadratic term(10)

3) M-step: the maximum-likelihood estimation for , fixed effects and residual variance(11)

4) Repeat steps 2) and 3) until a convergence criterion is satisfied, e.g., the difference in parameter estimates between two successive iterations is less than 10^−10.
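The four steps listed above can be illustrated with a minimal EM sketch for a generic normal shrinkage model with a scaled inverse chi-square prior on each effect variance. This is not the exact algorithm of Xu [33]: the update formulas, the hyperparameter names `tau` and `omega` and their default values, the starting values and the convergence rule are all assumptions chosen to give one plausible, runnable instance of this class of method.

```python
import numpy as np

def ebayes_em(y, Z, tau=-1.0, omega=0.0, max_iter=200, tol=1e-8):
    """EM sketch for y = 1*b0 + Z @ gamma + e with gamma_k ~ N(0, s2_k).

    tau and omega are ASSUMED prior hyperparameters; see lead-in.
    Returns (b0, posterior means of gamma, effect variances, residual variance).
    """
    n, p = Z.shape
    X = np.ones((n, 1))                        # intercept as the only fixed effect
    s2_k = np.ones(p)                          # step 1: initial effect variances
    s2_e = float(np.var(y))                    # step 1: initial residual variance
    gamma = np.zeros(p)
    for _ in range(max_iter):
        V = (Z * s2_k) @ Z.T + s2_e * np.eye(n)          # marginal covariance of y
        beta = np.linalg.solve(X.T @ np.linalg.solve(V, X),
                               X.T @ np.linalg.solve(V, y))  # GLS fixed effect
        r = y - X @ beta
        Vinv_r = np.linalg.solve(V, r)
        Vinv_Z = np.linalg.solve(V, Z)
        # Step 2 (E-step): BLUP mean and variance, hence E(gamma_k^2)
        new_gamma = s2_k * (Z.T @ Vinv_r)
        post_var = s2_k - s2_k**2 * np.einsum("ij,ij->j", Z, Vinv_Z)
        e_quad = new_gamma**2 + post_var
        # Step 3 (M-step): update effect variances and residual variance
        s2_k = (e_quad + omega) / (tau + 3.0)
        s2_e = float(r @ (r - Z @ new_gamma)) / n
        # Step 4: stop when successive estimates barely change
        done = np.max(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if done:
            break
    return float(beta[0]), gamma, s2_k, s2_e

# Toy example: 20 unlinked markers coded -1/0/1, one real effect at marker 5
rng = np.random.default_rng(0)
Z = rng.choice([-1.0, 0.0, 1.0], size=(200, 20), p=[0.25, 0.5, 0.25])
y = 5.0 + 2.0 * Z[:, 5] + rng.normal(size=200)
b0, gamma, s2_k, s2_e = ebayes_em(y, Z)
```

In this toy run the effect at marker 5 should dominate, while the variances of the other effects decay toward zero, which is the shrinkage behaviour described in the text.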

In addition, we performed a two-dimensional scan using the maximum likelihood approach to estimate the parameters in model (7).

Likelihood ratio test

If we only want to report QTL with relatively large effects and give readers accurate information about how significant the identified QTL were, statistical test should be conducted. The usual likelihood ratio test (LRT) cannot be carried out with the E-Bayes method owing to an oversaturated epistatic genetic model. We proposed the following two-stage selection process to screen the QTL [31]. In the first stage, all QTL with are picked up. In the second stage, the epistatic genetic model is modified so that only effects past the first round of selection are included in the model. Owing to the smaller dimensionality of the reduced model, we can use the maximum likelihood method to re-analyze the data and perform the LRT [31]. The test statistic is (12)where is the parameters vector in the statistical genetic model in the second stage analysis of model (8); is the parameters vector in excluding the currently tested genetic effect ; and are the log maximum likelihood function for and , respectively. For simplicity, we took and 3.0 as the critical values in our small and larger genome simulation experiments, respectively.
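The second-stage test described above compares a reduced model with and without one effect. A minimal sketch of that comparison is given below for an ordinary Gaussian linear model; the function names are hypothetical, and the conversion LOD = LR / (2 ln 10) is the standard relation between the likelihood ratio statistic and the LOD score rather than something stated in the text.

```python
import numpy as np

def gaussian_max_loglik(y, X):
    """Maximised log-likelihood of y = X b + e, e ~ N(0, s2), with ML s2."""
    n = len(y)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / n                      # ML (not unbiased) variance
    return -0.5 * n * (np.log(2.0 * np.pi * s2) + 1.0)

def lrt_and_lod(y, X_full, tested_col):
    """LR statistic and LOD score for dropping one column of X_full."""
    X_reduced = np.delete(X_full, tested_col, axis=1)
    lr = -2.0 * (gaussian_max_loglik(y, X_reduced)
                 - gaussian_max_loglik(y, X_full))
    lod = lr / (2.0 * np.log(10.0))             # LR = 2 ln(10) * LOD
    return lr, lod

# Toy example: intercept, one real effect (column 1), one null effect (column 2)
rng = np.random.default_rng(1)
x1 = rng.choice([-1.0, 0.0, 1.0], size=200, p=[0.25, 0.5, 0.25])
x2 = rng.choice([-1.0, 0.0, 1.0], size=200, p=[0.25, 0.5, 0.25])
X = np.column_stack([np.ones(200), x1, x2])
y = 1.0 + 1.0 * x1 + rng.normal(size=200)
lr_real, lod_real = lrt_and_lod(y, X, tested_col=1)
lr_null, lod_null = lrt_and_lod(y, X, tested_col=2)
```

An effect would then be reported only if its LOD score exceeded the chosen critical value.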

Results

Experiment I

The purpose of the simulation experiment was: (1) to evaluate the statistical performance of the proposed approach; (2) to compare the proposed method with previous approaches, such as Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] or Melchinger et al. [7], [21] and Kusterer et al. [22], according to statistical power, standard deviation and accuracy measure; and (3) to compare the TTC design with the F2 and F2:3 genetic designs.

The simulated genome consisted of three chromosomes (chr1, chr2 and chr3), and 11 evenly spaced markers covered each chromosome with an average marker interval of 10.0 cM. We simulated three main-effect QTL and one pair-wise interaction QTL, all of which overlapped with markers. All three main-effect QTL were located at the center (50.0 cM) of each chromosome, and QTL2 on chr2 interacted with QTL3 on chr3. The genetic parameters under both the F2 and the F metric models were as follows: ; and for QTL1; and for QTL2; and for QTL3; , , and for the epistatic effects between QTL2 and QTL3. The marginal heritabilities of these genetic effects varied from 1.01% to 36.54%. The sample size (n), the number of individual in the F2 population, was set at two levels: 200 and 400. The number of individuals (m) for each TTC family was set at 1, 5 and 10. The environmental variance () was set at 4.00 and 1.00. To implement the last objective of the simulation experiment, two other kinds of populations, the F2 and F2:3 populations, were also simulated. However, molecular marker information for all three populations was derived from the corresponding F2 individuals. Each treatment was replicated 200 times for the TTC and F2:3 designs and 400 times for the F2 design. In the analyses of the TTC family data, two approaches were adopted: 1) Method A, the proposed method in this study, and 2) Method B, the modified method of Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] or Melchinger et al. [7], [21] and Kusterer et al. [22], by removing the augmented epistatic effects from models (3) and (6). In the analyses of the F2 and F2:3 datasets, all of the main effects and all of the pair-wise interaction effects for all of the markers on the whole genome were simultaneously included in the genetic model. For each simulated QTL, we counted the samples in which the LOD statistic was greater than 2.5 and the identified QTL was within 20.0 cM of the simulated QTL. 
The estimate for each QTL parameter was the average of the corresponding estimates in the counted samples. The ratio of the number of such samples to the total number of replicates represented the empirical power for this QTL.
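The counting rule described above (LOD above 2.5 and detected position within 20.0 cM of the simulated QTL) can be written as a short helper. The function name and argument layout are illustrative assumptions; the thresholds are the ones quoted in the text.

```python
import numpy as np

def empirical_power(lod, position_cm, effect, true_position_cm,
                    lod_threshold=2.5, window_cm=20.0):
    """Power, mean and SD of the estimate over the counted replicates."""
    lod = np.asarray(lod, dtype=float)
    position_cm = np.asarray(position_cm, dtype=float)
    effect = np.asarray(effect, dtype=float)
    # A replicate is counted if the QTL is significant AND close enough
    counted = (lod > lod_threshold) & (
        np.abs(position_cm - true_position_cm) <= window_cm)
    power = counted.mean()
    mean_effect = effect[counted].mean() if counted.any() else np.nan
    sd_effect = effect[counted].std(ddof=1) if counted.sum() > 1 else np.nan
    return power, mean_effect, sd_effect

# Four toy replicates for a QTL simulated at 50.0 cM
power, mean_effect, sd_effect = empirical_power(
    lod=[3.0, 1.0, 4.0, 2.6],
    position_cm=[50.0, 50.0, 90.0, 45.0],
    effect=[1.0, 0.2, 0.9, 1.2],
    true_position_cm=50.0,
)
```

Here the second replicate fails the LOD threshold and the third is too far from the simulated position, so two of the four replicates are counted.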

To achieve the first objective of the simulation experiment, , and were analyzed by Method A. In the first step, with or 33 augmented additive or dominance effects ( or ) and 528 augmented epistatic effects ( or ) were estimated, and with 1584 pure epistatic effects (, and ) were estimated. All the effects were tested by likelihood ratio statistic in order that real QTL could be identified. The results for detected QTL under the F2 metric model were listed in Table 3, Table 4, Table 5. The results show that the newly defined parameters, i.e., , , (), and , were estimated in an almost unambiguous and unbiased manner, and all of the main-effect QTL were identified with a high statistical power and precision in the estimated effects and positions of the QTL by taking the TTC family mean as the unit of phenotypic measurement. The augmented epistatic QTL ( and ) were also well detected, except for the situation when , and . In the second step, all the pure main and epistatic effects would be estimated in an unbiased manner (Table 6). It should also be noted that a large sample (), a greater family replication number (), and moderate QTL heritability () are needed for the partition of the augmented epistatic effects ( and ) into its components (, , and ), and detecting epistasis is more difficult than detecting epistasis (Tables 5 and 6). The theoretical explanation is that (also ) has a larger contribution to the genetic variance of than ( when , Supporting Information S2). In addition, the powers in the detection of the augmented epistatic effects ( in Table 3 and in Table 4) were always much higher than those of pure epistatic effects (, and in Table 5). The possible explanations lie in that 1) the augmented epistatic effects ( and ) were the sum of two epistatic effects with the same signs in Experiment I and were inflated, and 2) these epistatic effects have different contributions to the genetic variances of , and (Supporting Information S2).

Table 3. Comparison of the proposed approach (Method A) with previous method (Method B) that does not consider augmented epistasis for mapping QTL of Z1 under the F2 metric model.

https://doi.org/10.1371/journal.pone.0024575.t003

Table 4. Comparison of the proposed approach (Method A) with previous method (Method B) that does not consider augmented epistasis for mapping QTL of Z2 under the F2 metric model.

https://doi.org/10.1371/journal.pone.0024575.t004

Table 6. Estimation of pure main and epistatic effects of QTL in the F2-based TTC design using the two-step approach under the cases of n = 400, m = 10 and (200 replicates).

https://doi.org/10.1371/journal.pone.0024575.t006

To achieve the second objective of the simulation experiment, Z1 and Z2 were re-analyzed by Method B, and the results under the F2 metric model are also listed in Tables 3 and 4. The results show that Z1 and Z2 could still be used to obtain unbiased estimates of the QTL additive and dominance effects when the QTL (QTL1) acted independently, but gave biased estimates of the additive and dominance effects when the QTL interacted (QTL2 and QTL3). The additive and dominance effects of the interacting QTL obtained by Method B in Tables 3 and 4 were in fact the newly defined additive and dominance effects, estimated with slightly poorer precision (a slightly larger standard deviation) in QTL effects and positions and with lower statistical power. This means that the new method was better than the previous methods of Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] in the presence of epistasis. The higher statistical power and smaller error variance of Method A relative to Method B show that the new method was also superior to the methods of Melchinger et al. [7], [21] and Kusterer et al. [22].

To achieve the third objective of the simulation experiment, the F2 and F2:3 data were analyzed and the results under the F2 metric model were listed in Tables 7 and 8. The results show that many effects could be estimated in an unambiguous and unbiased manner in the F2 and F2:3 genetic designs. In the situation of , the F2 design was superior to the both TTC and F2:3 designs. The reasons are as follows. In all the above three designs, marker genotypes were from F2 individuals. If , genotype sampling error was large for both TTC and F2:3 designs. Meanwhile, the proposed approach in this study did not consider the mixed distribution of the F2:3 (or TTC) progeny derived from heterozygous F2 parents. However, the powers in the detection of the main and epistatic QTL were smaller for the F2 design than for the TTC design with (or 10) when sample size () was small and/or environmental variance () was large, and the same trend was obtained for the precision of the estimates for the effects and the positions of the main and epistatic QTL. For example, when and , the power for main effects and were 0.850 and 0.775 and the standard deviation (SD) were 0.253 and 0.308, respectively, in F2 design (Table 7); while the power for and were 1.000 and 1.000 and the SD were 0.118 and 0.104, respectively, in TTC design with a family replication of 10 (Tables 3 and 4). This may be due to the fact that the phenotypic value is measured from F2 individuals and from the TTC family, and the family mean can be used to decrease the residual variance and to improve the precision of the phenotypic data. Both the TTC and F2:3 designs use family mean to decrease environmental variance and improve the precision of phenotype of quantitative trait. 
In addition, the dominant components decrease significantly in the F2:3 design due to its self-crossing, and the statistical powers for detecting dominance effects, additive by dominance (dominance by additive) epistatic effect and especially dominance by dominance epistatic effect in the F2:3 design will be lower than that in the TTC design. For example, when , and , the power of 0.170 for in F2:3 (Table 8) was much lower than that of 0.490 in the TTC (Table 5). The genetic variance contributed by the simulated three QTL under TTC and F2:3 designs were (Supporting Information S2):

Table 7. Results of QTL mapping in F2 population under the F2 metric model (400 replications).

https://doi.org/10.1371/journal.pone.0024575.t007

Table 8. Results of QTL mapping in F2:3 population under the F2 metric model (200 replications)

https://doi.org/10.1371/journal.pone.0024575.t008

These variance components can be used to interpret the results of the above simulation experiments.

Experiment II

The purpose of this simulation experiment was to show the statistical properties of the proposed approach in the TTC design when the augmented epistatic effects consisted of two epistatic effects of equal strength in opposite directions. The genetic parameters under both the F2 and the F metric models were as follows: ; , for QTL1; , for QTL2; , for QTL3; , , and for the epistatic effects between QTL2 and QTL3. The marginal heritabilities of these genetic effects now varied from 0.98% to 38.75%. The value of m was set at 5 and 10. The other settings were the same as those in Experiment I.

The results for Experiment II are listed in Tables 9, 10 and 11. The results show that the powers in the detection of the augmented epistatic effects (Tables 9 and 10) were very low. This is reasonable because the contributions of the augmented epistatic effects to the genetic variances of Z1 and Z2 were low. However, the powers for the pure epistatic effects remained steady (Tables 5 and 11) because the genetic contributions of these effects did not change.

Table 9. Results of mapping QTL of Z1 under F2 metric model while augmented epistatic effects consisted of two epistatic effects of equal strength in opposite directions (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t009

Table 10. Results of mapping QTL of Z2 under the F2 metric model while augmented epistatic effects consisted of two epistatic effects of equal strength in opposite directions (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t010

Table 11. Results of mapping QTL of Z3 under F2 metric model while augmented epistatic effects consisted of two epistatic effects of equal strength in opposite directions (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t011

Experiment III

We simulated a large genome to explore the performance of the proposed method under conditions closer to real data analysis. The simulated genome was 1000.0 cM in total length and covered by 210 markers (10 chromosomes, each covered with twenty-one markers equally spaced at 5.0 cM). Ten main-effect QTL and three pairs of interacting QTL, which together explained ∼50% of the variation of L1, L2 and L3, were assumed (Tables 12 and 13). The environmental variance, sample size and family replication number were set at 6.0, 500 and 10, respectively. The mapping results from 200 samples under the F2 metric model are presented in Table 12 for the main-effect QTL and Table 13 for the epistatic QTL. The results in Table 12 show that all the augmented main effects were estimated without bias and with satisfactory power, and most pure additive and dominance effects were also estimated without bias, with the exception of the pure dominance effects for QTL5 and QTL8. The results in Table 13 demonstrate that, with Z1 and Z2, the augmented epistatic effects were well estimated when they consisted of two epistatic effects with the same sign (QTL4 and QTL7, QTL9 and QTL10) and were poorly detected when they consisted of two epistatic effects of equal strength in opposite directions (QTL5 and QTL8); with Z3, all the pure epistatic effects were well estimated regardless of their signs; and all pure epistatic effects estimated in the second stage were unbiased except for one type of effect for QTL5 and QTL8. The failure to detect the corresponding augmented epistatic effect resulted in a biased estimate of that pure epistatic effect, which in turn caused poor estimates of the pure dominance effects of QTL5 and QTL8. These results were similar to those in simulation Experiments I and II. The computing time was ∼4.70 h per sample on our personal computer (CPU: Intel® Core™ 2 Duo 3.0 GHz; memory: 2.0 GB).
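Marker genotypes for a genome with the layout described above could be simulated along the following lines. The text does not say how recombination was modelled, so the Haldane map function, the absence of crossover interference and the 0/1/2 genotype coding are assumptions, and the function names are illustrative.

```python
import numpy as np

def haldane_recombination(distance_cm):
    """Recombination fraction for a map distance in cM (Haldane, ASSUMED)."""
    return 0.5 * (1.0 - np.exp(-2.0 * distance_cm / 100.0))

def simulate_f2_genome(n, n_chr=10, markers_per_chr=21, spacing_cm=5.0, rng=None):
    """Return an (n, n_chr * markers_per_chr) array of F2 marker genotypes.

    Genotypes count P1 alleles (0, 1 or 2). Chromosomes segregate
    independently; adjacent markers recombine with probability r.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = haldane_recombination(spacing_cm)

    def f1_gametes():
        g = np.empty((n, n_chr, markers_per_chr), dtype=int)
        g[:, :, 0] = rng.integers(0, 2, size=(n, n_chr))   # first marker: coin flip
        for j in range(1, markers_per_chr):
            crossover = rng.random((n, n_chr)) < r
            g[:, :, j] = np.where(crossover, 1 - g[:, :, j - 1], g[:, :, j - 1])
        return g.reshape(n, n_chr * markers_per_chr)

    # An F2 individual is the union of two independent F1 gametes
    return f1_gametes() + f1_gametes()

# Layout of the large genome: 10 chromosomes x 21 markers at 5.0 cM, n = 500
genotypes = simulate_f2_genome(500, rng=np.random.default_rng(2))
```

With the default arguments each chromosome spans 100 cM, giving the 1000.0 cM and 210 markers quoted in the text; the small genome of Experiment I corresponds to `n_chr=3, markers_per_chr=11, spacing_cm=10.0`.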

Table 12. Simulated and estimated main-effect QTL position and effects for large genome data under the F2 metric model (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t012

Table 13. Simulated and estimated epistatic QTL positions and effects for large genome data under the F2 metric model (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t013

Experiment IV

This simulation experiment considered the situation in which QTL are located within marker intervals rather than at markers. The three simulated QTL were placed at 45.0 cM (the middle of a marker interval), 52.5 cM (to the right of the sixth marker) and 47.5 cM (to the left of the sixth marker), respectively. The number of individuals (m) for each TTC family was set at 5 and 10. The other settings were the same as those in Experiment I. The results are shown in Tables 14, 15 and 16. The accuracies of the estimated QTL effects and positions, as well as the empirical power, were satisfactory but lower than those presented in Tables 3, 4 and 5, and the QTL effects were slightly underestimated because of recombination between each QTL and its adjacent marker.
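The size of the underestimation mentioned above can be gauged with a rough calculation. For an additively coded effect in an F2 population, the effect seen at a marker is expected to be about (1 - 2r) times the effect at the QTL, where r is the recombination fraction between them. The sketch below assumes the Haldane map function and applies only to additive-type effects; it is an approximation for orientation, not a result from the paper.

```python
import numpy as np

def additive_attenuation(distance_cm):
    """Expected ratio (marker effect / QTL effect) for an additive effect.

    Uses r from the Haldane map function (ASSUMED), so the ratio
    1 - 2r simplifies to exp(-2d) with d in Morgans.
    """
    r = 0.5 * (1.0 - np.exp(-2.0 * distance_cm / 100.0))
    return 1.0 - 2.0 * r

# Distance from each simulated QTL to its nearest marker (markers every 10 cM)
qtl_positions_cm = np.array([45.0, 52.5, 47.5])
marker_positions_cm = np.arange(0.0, 101.0, 10.0)
nearest_distance = np.min(
    np.abs(qtl_positions_cm[:, None] - marker_positions_cm[None, :]), axis=1)
ratios = additive_attenuation(nearest_distance)
```

Under these assumptions a QTL in the middle of a 10 cM interval retains roughly 90% of its additive effect at the nearest marker, and a QTL 2.5 cM from a marker retains roughly 95%, which is in line with the slight underestimation reported.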

Table 14. Results of mapping QTL of Z1 under F2 metric model while the simulated QTL were placed on the position in the marker intervals (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t014

Table 15. Results of mapping QTL of Z2 under F2 metric model while the simulated QTL were placed on the position in the marker intervals (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t015

Table 16. Results of mapping QTL of Z3 under F2 metric model while the simulated QTL were placed on the position in the marker intervals (200 replications).

https://doi.org/10.1371/journal.pone.0024575.t016

Discussion

The method described here offers several advantages over previous approaches to the analysis of the TTC design. First, with Z1 or Z2, all augmented main and epistatic effects were included simultaneously in one genetic model and estimated together by the E-Bayes approach. Our simulation studies showed that these augmented effects could be estimated with very high power and precision when the component epistatic effects of each augmented epistatic effect have the same direction (Tables 3, 4 and 13). Even when these component epistatic effects have different signs, the new approach works well for the augmented main-effect QTL parameters (Tables 9, 10 and 12).

Second, with Z3, three pure epistatic effects (ad, da and dd) were estimated simultaneously in this study by two-dimensional genome scans. We also attempted to estimate all the epistatic effects with a full genetic model that included all the digenic epistatic effects under the E-Bayes framework, but this attempt failed, and the reasons are unclear. Although several approaches have been proposed to detect epistasis in the RIL-based TTC and NCIII designs, little has been reported about the estimation of more than two epistatic effects in the TTC. Frascaroli et al. [16] and Li et al. [17] adopted the mixed linear model approach of Wang et al. [20] to detect in the analyses of and in the analyses of ; and Kusterer et al. [22] and Melchinger et al. [21] used two-way ANOVA on and for the detection of and , respectively. However, the two studies involved only one digenic epistatic effect. Although multiple interval mapping has been used by Garcia et al. [34] to detect the augmented epistatic effects, the genetic design was NCIII and the estimate was a compound effect, not a pure epistatic effect. In addition, Reif et al. [24] proposed a two-step procedure to detect with particular two-segment NILs.

Finally, many main and epistatic effects can be estimated in an unambiguous and unbiased manner by our two-step approach. In the first step, the augmented main and epistatic effects and three pure epistatic effects (ad, da and dd) are estimated in separate analyses of Z1, Z2 and Z3. In the second step, all four pure epistatic effects (aa, ad, da and dd) are obtained from the equations that relate the augmented epistatic effects to the pure epistatic effects, and the pure additive and dominance effects are then obtained from the corresponding equations for the augmented main effects. The simulation results show that the two-step approach works well (Tables 6, 12 and 13). However, the pure epistatic effects estimated with Z3 could not be detected with satisfactory statistical power when the sample size and family replication number were low (Tables 5 and 11). Therefore, a large sample size and a large family replication number are needed for the detection of epistasis. To accommodate such larger experiments, suitable field experimental designs, such as split-plot design [13], [16] and block in replication [35], are desirable to control environmental error.
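As a hedged illustration of the inputs to the first step, the sketch below computes three contrasts from TTC family means using the classical Kearsey and Jinks forms (sum, difference, and L1 + L2 − 2·L3). The function name `ttc_contrasts`, the toy numbers and the absence of scaling constants are assumptions made for illustration. The exact definitions of Z1, Z2 and Z3 in this study are those given in its Methods and may differ in form or scaling.

```python
import numpy as np


def ttc_contrasts(l1, l2, l3):
    """Classical TTC contrasts from family means (illustrative scaling only).

    l1, l2, l3: arrays of family means for the testcrosses of each
    individual to parent 1, parent 2 and the F1, respectively.
    """
    l1, l2, l3 = (np.asarray(x, dtype=float) for x in (l1, l2, l3))
    z_sum = l1 + l2 + l3         # classical contrast for additive variation
    z_diff = l1 - l2             # classical contrast for dominance variation
    z_epi = l1 + l2 - 2.0 * l3   # classical contrast for epistasis
    return z_sum, z_diff, z_epi


# Toy family means for two individuals (made-up numbers, not study data).
L1 = [10.0, 12.0]
L2 = [8.0, 11.0]
L3 = [9.0, 11.5]

z_sum, z_diff, z_epi = ttc_contrasts(L1, L2, L3)
print(z_sum, z_diff, z_epi)
```

In this toy case L3 equals the average of L1 and L2 for both individuals, so the epistasis contrast is zero, which is the expectation in the absence of epistasis under the classical test.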

The F2-based TTC design is superior to the F2 design for the detection of main-effect and epistatic QTL when the sample size is small and the residual variance is large (Tables 3, 4, 5 and 7), and is more powerful for estimating , (or ) and especially than the F2:3 design (Tables 4, 5 and 8). The new method may be extended to TTC designs derived from other base populations, such as RIL, BC and DH, because the genetic models for Z1, Z2 and Z3 in these designs can be described in the same manner. In Tables S7 and S8 and Supporting Information S3, we present only the expected genetic values and genetic variances for Z1, Z2 and Z3 under both the F2 and the F metric models in the RIL-based TTC design.

The proposed approach in this study differs from the previous methods of Kearsey et al. [12], Frascaroli et al. [16], Melchinger et al. [7], [21] and Li et al. [17]. First, the former derives linear regression models for Z1, Z2 and Z3, whereas the latter make use of ANOVA. Thus, the precondition for the former is to derive the dummy variables for each genetic effect, whereas the precondition for the latter is to obtain the expectations and expected mean squares. In the expectations and expected mean squares, if one effect is confounded with another effect, the confounded effects can only be estimated together. That is the augmented effect in the ANOVA. Likewise, if there are multicollinear relationships among dummy variables, the corresponding individual effects cannot be estimated, but their combination is estimable. That is the augmented effect in the linear regression analysis. This explains why we construct augmented effects. Second, we consider all the main-effect QTL and all the digenic interactions in one model of Z1 or Z2, so that all the augmented additive, dominance and epistatic effects are correctly defined and all the pure main and epistatic effects can be estimated without bias. Although in the previous studies the augmented additive and dominance effects were correctly defined and are clearly confounded with QTL × genetic background epistasis in the RIL-based TTC and NCIII designs [7], [21], [22], the augmented epistatic effects were ignored. This neglect would result in biased estimates of the augmented main effects, a larger residual variance and a lower power of QTL detection (Tables 3 and 4). In addition, with Z3 we can estimate three types of pure epistatic effects (ad, da and dd) using two-dimensional genome scans. This differs from Melchinger et al. [21], in which only dd epistasis can be obtained.
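The point about multicollinearity can be illustrated with a minimal numerical sketch. Two dummy variables that are perfectly collinear give a rank-deficient design matrix, so their individual coefficients are not identifiable, but their sum (the analogue of an augmented effect) is. The data, coefficient values and variable names below are made up for illustration and do not come from this study.

```python
import numpy as np

# One genotype code repeated, used for two perfectly collinear dummy variables.
x = np.tile([-1.0, 0.0, 1.0], 10)
X = np.column_stack([x, x])

# Hypothetical "pure" effects; only their sum can be recovered from X.
b1, b2 = 0.8, 0.4
y = X @ np.array([b1, b2])      # noise-free response for clarity

rank = np.linalg.matrix_rank(X)  # 1, although X has two columns

# lstsq returns the minimum-norm solution for a rank-deficient system.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
combined = coef.sum()            # estimable combination of the two effects

print(rank, coef, combined)
```

The individual coefficients returned here are an arbitrary split chosen by the minimum-norm criterion, whereas their sum reproduces b1 + b2.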

The F2 and F metrics are the two main metrics adopted for populations derived from a cross between two inbred lines. The F2 metric is orthogonal for the F2 population when epistatic genes are in linkage equilibrium, whereas the F metric is orthogonal for homozygous lines [28]–[30]. An orthogonal model implies that estimates of the genetic effects are consistent between a full and a reduced model, and it is directly related to the partition of the genetic variance in the population. Using different models does not influence the detection of the main and epistatic QTL, but it does influence the estimation and interpretation of genetic effects [30]. Melchinger et al. [7], [21] and Kusterer et al. [13], [22] advocated the F2 metric in the RIL-based NCIII and TTC designs for three reasons: (1) it has the advantage that each variance component is proportional to the sum of the squares of the corresponding genetic effects and does not involve any other type of genetic effects that could obscure their interpretation; (2) epistatic interactions by two-way ANOVAs for pairs of marker loci using was just ; and (3) with digenic epistasis, midparent heterosis (MPH) involves only beside dominance effects, whereas under the F metric MPH is additionally influenced by . For the F2-based TTC design, neither the F2 nor the F metric model is orthogonal (Supporting Information S2). With Z1 and Z2, the newly defined parameters were all correctly identified and estimated by our full-model method under both metrics (Tables 3, 4, 12 and 13), and with Z3 the pure epistatic effects could also be detected and well estimated under both metrics when the sample size and number of family replications were large in our simulation studies (Tables 5, 11 and 13).
The differences under the two metrics may be as follows: (1) the newly defined main effects and model means are different for the Z1 and Z2 under the two models; and (2) the F2 metric model seems to behave better than the F metric model (higher power and precision) (data not shown).

The proposed approach in this study assumes that all the QTL are located at marker positions. When marker density is high, all the QTL can be detected with high power and precision. When markers are sparse, the QTL effects are slightly underestimated because of recombination between each QTL and its adjacent marker. To address this issue, virtual markers (treated as missing data) may be inserted within marker intervals, and marker imputation techniques may then be used to infer their genotypes.
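Placing virtual markers can be sketched as follows. The function fills every interval wider than a chosen threshold with equally spaced pseudo-marker positions and flags them so that their genotypes can later be treated as missing and imputed. The function name, the `max_gap_cm` parameter and its default are assumptions for illustration; the imputation step itself is not shown.

```python
import math


def insert_virtual_markers(marker_positions_cm, max_gap_cm=2.0):
    """Return (position_cM, is_virtual) pairs for one chromosome.

    Real markers are kept as given. Any interval wider than max_gap_cm is
    split into equal segments no longer than max_gap_cm, and the interior
    break points are added as virtual markers.
    """
    positions = sorted(marker_positions_cm)
    if not positions:
        return []
    augmented = [(positions[0], False)]
    for left, right in zip(positions, positions[1:]):
        gap = right - left
        if gap > max_gap_cm:
            n_segments = math.ceil(gap / max_gap_cm)
            step = gap / n_segments
            for k in range(1, n_segments):
                augmented.append((left + k * step, True))
        augmented.append((right, False))
    return augmented


# Example: three real markers 10 cM apart, virtual markers at most 5 cM apart.
example = insert_virtual_markers([0.0, 10.0, 20.0], max_gap_cm=5.0)
print(example)
```

A smaller `max_gap_cm` gives a denser grid of candidate positions at the price of more model terms, so the choice is a trade-off between resolution and computing time.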

Our method has two drawbacks. (1) With Z1 and Z2, the augmented epistatic effects were poorly detected when their component effects had equal strength in opposite directions (Tables 9, 10 and 13). This results in a biased estimate of the pure aa epistatic effect, as for QTL5 and QTL8 in Table 13, and in turn in poor estimates of the pure dominance effects, as for QTL5 and QTL8 in Table 12. (2) The estimation error for the pure main and epistatic effects obtained with the two-step approach seemed to be somewhat large. This will be studied in the future.

Supporting Information

Supporting Information S1.

Statistical genetic models for mapping QTL in the TTC design under the F metric model.

https://doi.org/10.1371/journal.pone.0024575.s001

(DOC)

Supporting Information S2.

The expected genetic values of Z1, Z2 and Z3 under the F2 and the F metric models in the F2-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s002

(DOC)

Supporting Information S3.

The expected genetic values of Z1, Z2 and Z3 under the F and the F2 metric models in the RIL-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s003

(DOC)

Table S1.

Genetic constitutions of the F2-based TTC family means L1i, L2i and L3i.

https://doi.org/10.1371/journal.pone.0024575.s004

(DOC)

Table S2.

Expected genetic value of L1i family under the F2 and the F metric models in the F2-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s005

(DOC)

Table S3.

Expected genetic value of L2i family under the F2 and the F metric models in the F2-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s006

(DOC)

Table S4.

Expected genetic value of L3i family under the F2 and the F metric models in the F2-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s007

(DOC)

Table S5.

Expected genetic values of Z1, Z2 and Z3 under the F2 metric model in the F2-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s008

(DOC)

Table S6.

Expected genetic values of Z1, Z2 and Z3 under the F metric model in the F2-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s009

(DOC)

Table S7.

Expected genetic values of Z1, Z2 and Z3 under the F2 metric model in the RIL-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s010

(DOC)

Table S8.

Expected genetic values of Z1, Z2 and Z3 under the F metric model in the RIL-based TTC design.

https://doi.org/10.1371/journal.pone.0024575.s011

(DOC)

Acknowledgments

The authors thank the anonymous reviewers for their comments on an earlier version of the manuscript.

Author Contributions

Conceived and designed the experiments: Y-MZ. Performed the experiments: X-HH. Analyzed the data: X-HH. Contributed reagents/materials/analysis tools: X-HH. Wrote the paper: Y-MZ X-HH.

References

  1. Carlborg Ö, Haley CS (2004) Epistasis: too often neglected in complex trait studies. Nat Rev Genet 5: 618–625.
  2. Moore JH, Williams SM (2005) Traversing the conceptual divide between biological and statistical epistasis: systems biology and a more modern synthesis. BioEssays 27: 637–646.
  3. Jinks JL, Jones RM (1958) Estimation of the components of heterosis. Genetics 43: 223–234.
  4. Yu SB, Li JX, Xu CG, Tan YF, Gao YJ, et al. (1997) Importance of epistasis as the genetic basis of heterosis in an elite rice hybrid. Proc Natl Acad Sci USA 94: 9226–9231.
  5. Lippman ZB, Zamir D (2006) Heterosis: revisiting the magic. Trends in Genetics 23: 60–66.
  6. Melchinger AE, Piepho HP, Utz HF, Muminovic J, Wegenast T, et al. (2007) Genetic basis of heterosis for growth-related traits in Arabidopsis investigated by testcross progenies of near-isogenic lines reveals a significant role of epistasis. Genetics 177: 1827–1837.
  7. Melchinger AE, Utz HF, Piepho HP, Zeng ZB, Schön CC (2007) The role of epistasis in the manifestation of heterosis: a systems-oriented approach. Genetics 177: 1815–1825.
  8. Wright S (1980) Genic and organismic selection. Evolution 34: 825–843.
  9. Carson HL, Templeton AR (1984) Genetic revolutions in relation to speciation phenomena: the founding of new populations. Annu Rev Ecol Syst 15: 97–131.
  10. Kearsey MJ, Jinks JL (1968) A general method of detecting additive, dominance and epistatic variation for metrical traits. I. Theory. Heredity 23: 403–409.
  11. Kearsey MJ, Pooni HS (1996) The genetical analysis of quantitative traits. Chapman and Hall: London.
  12. Kearsey MJ, Pooni HS, Syed NH (2003) Genetics of quantitative traits in Arabidopsis thaliana. Heredity 91: 456–464.
  13. Kusterer B, Muminovic J, Utz HF, Piepho HP, Barth S, et al. (2007) Analysis of a triple testcross design with recombinant inbred lines reveals a significant role of epistasis in heterosis for biomass-related traits in Arabidopsis. Genetics 175: 2009–2017.
  14. Jinks JL, Perkins JM (1970) A general method for the detection of additive, dominance and epistatic components of variation III. F2 and backcross populations. Heredity 25: 419–429.
  15. Perkins JM, Jinks JL (1970) Detection and estimation of genotype-environmental, linkage and epistatic components of variation for a metrical trait. Heredity 25: 157–177.
  16. Frascaroli E, Canè MA, Landi P, Pea G, Gianfranceschi L, et al. (2007) Classical genetic and quantitative trait loci analyses of heterosis in a maize hybrid between two elite inbred lines. Genetics 176: 625–644.
  17. Li L, Lu K, Chen Z, Mu T, Hu Z, et al. (2008) Dominance, overdominance and epistasis condition the heterosis in two heterotic rice hybrids. Genetics 180: 1725–1742.
  18. Kearsey MJ, Hyne V (1994) QTL analysis: a simple ‘marker-regression’ approach. Theor Appl Genet 89: 698–702.
  19. Zeng ZB (1994) Precision mapping of quantitative trait loci. Genetics 136: 1457–1468.
  20. Wang DL, Zhu J, Li ZK, Paterson AH (1999) Mapping QTL with epistatic effects and QTL × environment interactions by mixed model approaches. Theor Appl Genet 99: 1255–1264.
  21. Melchinger AE, Utz HF, Schön CC (2008) Genetic expectations of quantitative trait loci main and interaction effects obtained with the triple testcross design and their relevance for the analysis of heterosis. Genetics 178: 2265–2274.
  22. Kusterer B, Piepho HP, Utz HF, Schön CC, Muminovic J, et al. (2007) Heterosis for biomass-related traits in Arabidopsis investigated by quantitative trait loci analysis of the triple testcross design with recombinant inbred lines. Genetics 177: 1839–1850.
  23. Jiang C, Zeng ZB (1995) Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140: 1111–1127.
  24. Reif JC, Kusterer B, Piepho HP, Meyer RC, Altmann T, et al. (2009) Unraveling epistasis with triple testcross progenies of near-isogenic lines. Genetics 181: 247–257.
  25. Zhu C, Zhang R (2007) Efficiency of triple test cross for detecting epistasis with marker information. Heredity 98: 401–410.
  26. Wang XF, Song W, Yang ZF, Wang YM, Tang ZX, et al. (2009) Improved genetic mapping of endosperm traits using NCIII and TTC designs. Journal of Heredity 100: 496–500.
  27. Lander ES, Botstein D (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199.
  28. Kao CH, Zeng ZB (2002) Modeling epistasis of quantitative trait loci using Cockerham's model. Genetics 160: 1243–1261.
  29. Yang RC (2004) Epistasis of quantitative trait loci under different gene action models. Genetics 167: 1493–1505.
  30. Zeng ZB, Wang T, Zou W (2005) Modeling quantitative trait loci and interpretation of models. Genetics 169: 1711–1725.
  31. Zhang YM, Xu S (2005) A penalized maximum likelihood method for estimating epistatic effects of QTL. Heredity 95: 96–104.
  32. Xu S (2007) An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics 63: 513–521.
  33. Xu S (2010) An expectation–maximization algorithm for the Lasso estimation of quantitative trait locus effects. Heredity 105: 483–494.
  34. Garcia AAF, Wang S, Melchinger AE, Zeng ZB (2008) Quantitative trait loci mapping and the genetic basis of heterosis in maize and rice. Genetics 180: 1707–1724.
  35. Cockerham CC, Zeng ZB (1996) Design III with marker loci. Genetics 143: 1437–1456.