Abstract
Epistasis plays an important role in genetics, evolution and crop breeding. To detect epistasis, the triple test cross (TTC) design was developed several decades ago. Classical procedures for the TTC design use only the linear transformations Z1, Z2 and Z3, calculated from the TTC family means of a quantitative trait, to infer the nature of the collective additive, dominance and epistatic effects of all the genes. Although several quantitative trait loci (QTL) mapping approaches for the TTC design have been developed, these approaches do not provide a complete solution for dissecting pure main and epistatic effects. In this study, therefore, we developed a two-step approach to estimate all pure main and epistatic effects in the F2-based TTC design under the F2 and F∞ metric models. In the first step, with Z1 and Z2, the augmented main and epistatic effects in a full genetic model that simultaneously considered all putative QTL across the whole genome were estimated using an empirical Bayes approach, and with Z3, three pure epistatic effects were obtained using two-dimensional genome scans. In the second step, the three pure epistatic effects obtained in the first step were integrated with the augmented epistatic and main effects to estimate all other pure effects. A series of Monte Carlo simulation experiments was carried out to validate the proposed method. The simulation results show that: 1) the newly defined genetic parameters could be correctly identified with satisfactory statistical power and precision; 2) the F2-based TTC design was superior to the F2 and F2:3 designs; 3) with Z1 and Z2, the statistical powers for detecting augmented epistatic effects were substantially affected by the signs of the pure epistatic effects; and 4) with Z3, the estimation of pure epistatic effects required a large sample size and family replication number. The extension of the proposed method to other base populations is also discussed.
Citation: He X-H, Zhang Y-M (2011) A Complete Solution for Dissecting Pure Main and Epistatic Effects of QTL in Triple Testcross Design. PLoS ONE 6(9): e24575. https://doi.org/10.1371/journal.pone.0024575
Editor: Kerby Shedden, University of Michigan, United States of America
Received: April 14, 2011; Accepted: August 14, 2011; Published: September 19, 2011
Copyright: © 2011 He, Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by grant 2011CB109300 from the National Basic Research Program of China, grants 30900842, 31000666 and 30971848 from the National Natural Science Foundation of China, grant KYT201002 from the Fundamental Research Funds for the Central Universities, grant 20100097110035 from Specialized Research Fund for the Doctoral Program of Higher Education, grant B08025 from the 111 Project, grant KJ08001 from the NAU Youth Sci-Tech Innovation Fund and a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Epistasis, the interaction between genes, plays an important role in genetics, evolution and crop breeding. First, it is an important component of the genetic architecture of complex traits [1], [2]. Second, it can lead to heterosis [3]–[7], which is very important in hybrid breeding. Third, it is a driving force in evolution and plays a central role in founder effect models of speciation [1], [8], [9]. Over the past several decades, many attempts have been made to detect epistasis. One important attempt was the triple test cross (TTC) design developed by Kearsey and Jinks [10], which is also a powerful breeding design. The importance of epistasis therefore warrants an in-depth study of the TTC design.
In the TTC design, the ith individual (i = 1, 2, …, n) of an F2 population (or of a backcross, recombinant inbred line (RIL) or near isogenic line (NIL) population) is crossed to the same three testers, the two inbred lines (P1 and P2) and their F1, to produce 3n families. The design is considered the most efficient model because it provides not only a precise test for epistasis but also unbiased estimates of the additive and dominance components if epistasis is absent [10]. In early studies, only the phenotypic data of quantitative traits were used in the TTC to infer the nature of the additive, dominance and epistatic effects of polygenes using classical generation mean [11]–[13] and variance component analyses [10], [12], [14]–[17]. However, these conventional biometrical genetic procedures deal only with the collective effects of all the polygenes [6], [7], [11], [12]. The introduction of molecular markers has facilitated the mapping of quantitative trait loci (QTL) in numerous species, and substantial progress has been achieved in the detection of individual QTL and their interactions in the RIL- and NIL-based TTC designs.
In the RIL-based TTC designs, Kearsey et al. [12] employed the marker difference regression of Kearsey and Hyne [18] to detect QTL for 22 quantitative traits in Arabidopsis thaliana. Frascaroli et al. [16] used composite interval mapping [19] to identify main-effect QTL and the mixed linear model approach [20] to detect digenic epistatic QTL in the analyses of heterosis in maize. The same method was used by Li et al. [17] to identify the main-effect QTL and digenic epistatic QTL underlying the heterosis of nine important agronomic and economic traits in rice. However, the additive and dominance effects estimated by the above approaches are confounded with epistatic effects if epistasis is present. To overcome this issue, Melchinger et al. [21] derived quantitative genetic expectations of QTL main and interaction effects in the RIL-based TTC design. Based on their theoretical findings, augmented additive and dominance effects [7] and QTL-by-genetic background interactions can be estimated using one-dimensional genome scans, whereas additive-by-additive (aa) and dominance-by-dominance (dd) interactions can be estimated using two-way ANOVA between all pairs of marker loci. Kusterer et al. [22] applied the novel approaches of Melchinger et al. [7], [21] to detect QTL for heterosis of biomass-related traits in Arabidopsis. In the above studies, only one variable was involved at a time. To increase the power of QTL detection, Kusterer et al. [22] adopted multi-variable joint analysis [23], as proposed by Melchinger et al. [7] for QTL mapping in the NCIII design.
In the NIL-based TTC design, Melchinger et al. [21] used two QTL mapping methods to study heterosis in Arabidopsis. In the generation means approach, additive, dominance and QTL-by-genetic background epistatic effects were tested and estimated, and this approach, together with particular two-segment NILs, was applied by Reif et al. [24] to map aa digenic interactions. In addition, Zhu and Zhang [25] derived formulae for calculating the statistical power in the detection of epistasis, and Wang et al. [26] used interval mapping [27] to detect QTL underlying endosperm traits and demonstrated that the TTC provided a reasonably precise and accurate estimation of QTL positions and effects, especially the two dominance effects, which overcomes a drawback of the F2:3 design.
In summary, two issues in the detection of QTL in the TTC need to be addressed. First, only a few studies are built on F2-based TTC [25], [26], whereas most are built on RIL [7], [12], [16], [17], [21], [22] and NIL [6], [24]. Second, additive and dominance effects were confounded with QTL-by-genetic background interaction [7], [21], [22] and only aa and dd digenic interactions were evaluated in the RIL-based TTC [16], [17], [21], [22].
The objective of this study was to estimate, in an unambiguous and unbiased manner, all the main and epistatic effects of QTL in the F2-based TTC design. A series of Monte Carlo simulation experiments was carried out to confirm the proposed approach. The extension of the new method to other base populations in the TTC was discussed as well.
Methods
Genetic design and data collection
An F2 population was derived from two inbred lines (P1 and P2) that differed significantly in the quantitative traits of interest and possessed abundant polymorphic molecular markers. A random sample of n F2 individuals (female parents) was backcrossed to three testers, the two parental lines and their F1, to produce 3n families (,
and
). All of the 3n families, each with m replications, were planted. Molecular marker information was observed from all of the n F2 individuals, whereas quantitative traits were measured for all of the 3nm TTC progeny. The phenotypic observations were denoted by
, where
and 3 for
,
and
;
and
. The family means were denoted by
. Following Kearsey and Jinks [10] and Melchinger et al. [21], we performed three linear transformations:
,
and
. The association between
and the marker genotypes of the F2 plants was used to infer the genetic architecture of the trait.
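As a rough illustration of the data reduction described above, the following Python sketch averages the m replicates within each of the 3n families and then forms three contrasts of the family means. The exact transformation formulas are not reproduced in this text, so the contrasts used here, (L1 + L2)/2, L1 − L2 and L1 + L2 − 2·L3, are an assumption based on the classical TTC contrasts, and their scaling may differ from that used in the study. The array layout and the function name `ttc_transforms` are likewise hypothetical.

```python
import numpy as np

def ttc_transforms(pheno):
    """Family means and three TTC contrasts (assumed forms, see lead-in).

    pheno has shape (3, n, m): tester (P1, P2, F1) x F2 plant x replicate.
    """
    fam_means = pheno.mean(axis=2)      # family means, shape (3, n)
    l1, l2, l3 = fam_means              # crosses to P1, P2 and F1
    z1 = (l1 + l2) / 2.0                # assumed additive-type contrast
    z2 = l1 - l2                        # assumed dominance-type contrast
    z3 = l1 + l2 - 2.0 * l3             # assumed epistasis-type contrast
    return z1, z2, z3

# Toy data: n = 2 F2 plants, m = 2 replicates per family.
pheno = np.array([
    [[10.0, 12.0], [8.0, 8.0]],   # crosses to P1
    [[6.0, 6.0], [4.0, 6.0]],     # crosses to P2
    [[8.0, 9.0], [7.0, 7.0]],     # crosses to F1
])
z1, z2, z3 = ttc_transforms(pheno)
```

Each returned vector has one entry per F2 plant and can then be regressed on the marker genotypes of that plant.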
Genetic models for mapping QTL in the F2-based TTC design
The expected genetic values of Z1, Z2 and Z3 depend on the choice of the metric. Two main metrics, the F2 and F∞ metrics, have been adopted for populations derived from a cross between two inbred lines [28]–[30]. The derivation of the expected genetic values of Z1, Z2 and Z3 under both the F2 and the F∞ metric models is presented in Tables S1–S6 and Supporting Information S2. The genetic effect symbols adopted in this study follow Kao and Zeng [28].
Statistical genetic models for mapping QTL under the F2 metric model.
According to the expected genetic value of Z1 under the F2 metric model in Table S5, the phenotypic value of Z1 can be described as:
(1)where
is the mean genotypic value of the F2 population;
and
are additive and dominance effects of the kth QTL, respectively;
,
,
and
are additive-by-additive, additive-by-dominance, dominance-by-additive and dominance-by-dominance interactions between the 1st and 2nd QTL, respectively;
,
,
,
,
and
are dummy variables and determined by the genotype of the ith F2 plant (Table S5); and
is the residual error with an
distribution. According to the results in Table S5, there are
,
and
. To solve the genetic parameters, model (1) must be reduced to:
(2)where
,
,
,
and
.
If the quantitative trait was controlled by QTL, model (2) should be extended to:
(3)where model mean
;
is augmented additive effect of QTL
;
is augmented epistatic effect between QTL
and
; and
and
are determined by the genotypes of the kth and lth QTL (marker) of the ith F2 plant (Table 1). The coefficients for the genotype
were integrated by the frequencies of
and
. The augmented epistatic effects (
) are ignored in Melchinger et al. [21], which may result in a larger residual error and lower statistical power.
In the same way, the phenotypic value of Z2 can be described as:
(4)where
,
,
,
and
are determined by the genotype of the ith F2 plant (Table S5); and
is the residual error with an
distribution. According to the results in Table S5, there are
and
. To solve the genetic parameters, model (4) must be reduced to:
(5)where
,
,
,
and
.
If the quantitative trait was controlled by QTL, model (5) should be extended to:
(6)where model mean
;
is augmented dominance effect of QTL
;
is augmented epistatic effect between QTL
and
; and dummy variables
and
are determined by the genotypes of the kth and lth QTL of the ith F2 plant (Table 1). The augmented epistatic effects (
) are overlooked in Melchinger et al. [21], which may result in a larger residual error and lower statistical power.
Similarly, the phenotypic value of Z3 can be described as:
(7)where
;
is the recombination fraction between two QTL under study; and dummy variables
,
and
are determined by the genotype of the ith F2 plant (Table 1 and Table S5). Here pure ad, da and dd epistatic effects can be estimated with two-dimensional genome scans. This differs from that in Melchinger et al. [21], in which only dd epistasis is estimated with two-way ANOVA.
Models (3), (6) and (7) were working models for our QTL mapping approach in the F2-based TTC design. Here we proposed a two-step approach to obtain all the pure main and epistatic effects in the presence of epistasis. In the first step, model (3) can be used to estimate the augmented additive () and epistatic (
) effects, model (6) can be used to estimate the augmented dominance (
) and epistatic (
) effects, and model (7) can be used to estimate three types of pure epistatic effects (
,
and
). In the second step, all estimated epistatic effects in models (3), (6) and (7) were integrated for the estimation of all four types of the pure epistatic effects using
,
and
. These pure epistatic effects further integrate with the estimates of both
and
for the estimation of pure additive and dominance effects, using
and
. When epistasis is absent, pure additive (
) and dominance (
) effects can be directly obtained from model (3) and model (6), respectively.
Genetic models for mapping QTL under the F∞ metric model.
With Z1, Z2 and Z3, the genetic models for mapping QTL under the F∞ metric model have the same forms as models (3), (6) and (7), respectively. The detailed derivation is described in Table S6 and Supporting Information S1, and detailed comparisons are given in Tables 1 and 2. The pure epistatic effects are calculated in the same way under the two metrics, whereas the pure additive and dominance effects are calculated differently; here
and
.
Genetic parameter estimation
Models (3) and (6) have a uniform appearance. However, the true number of QTL () is hard to determine. Variable selection via stepwise regression or stochastic search variable selection is the common procedure for epistatic QTL analysis, but these methods are computationally intensive and may not be optimal [31]–[33]. Thus, we adopted the empirical Bayes (E-Bayes) method of Xu [33] to estimate the parameters in the above models. The E-Bayes approach assumes that there is one QTL at each marker throughout the genome and shrinks the genetic effects of all “nonsignificant” QTL toward zero. Here, we give only the necessary procedures; for the technical details of E-Bayes, refer to the original study of Xu [33].
Models (3) and (6) can be uniformly written as:(8)where
is the model mean;
is the augmented main effect of the kth QTL;
is the augmented epistatic effect between the kth and lth QTL;
is the total number of genetic effects, including the augmented main and epistatic effects; and
is the residual error. Model (8) can be expressed in matrix form:
(9)where
;
;
;
;
and
.
In the expectation and maximization (EM) algorithm of the E-Bayes method [33], model (9) is a typical mixed model and is treated as a fixed effect, whereas
is treated as a random effect. Therefore,
has a multivariate normal distribution with the mean
and the variance-covariance matrix
.
In the EM algorithm of E-Bayes, the genetic parameters are the focus of interest and the normal prior is assigned to
, i.e.,
and
is further assigned a scaled inverse
prior, i.e.,
. The
has uniform prior distribution.
The EM algorithm procedures are as follows:
1) Choose and assign initial values:
,
,
.
2) E-step: the best linear unbiased prediction (BLUP) estimation of the expectation of the quadratic term(10)
3) M-step: the maximum-likelihood estimation for , fixed effects and residual variance
(11)
4) Repeat steps 2) and 3) until a convergence criterion is satisfied, e.g., the difference in parameter estimates between two successive iterations is less than 10^−10.
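The four steps above can be sketched in Python as follows. Because equations (10) and (11) and the initial values are not reproduced in this text, the update formulas below are an assumption: they follow a generic EM scheme for a mixed model with effect-specific normal priors and a scaled inverse chi-square hyperprior, with hypothetical hyperparameter defaults `tau = -1` and `omega = 0`. The function name `ebayes_em` and the toy data are illustrative only and are not the authors' implementation.

```python
import numpy as np

def ebayes_em(y, Z, tau=-1.0, omega=0.0, max_iter=500, tol=1e-8):
    """EM sketch for y = mu + Z @ gamma + e with gamma_k ~ N(0, s2_k)."""
    n, p = Z.shape
    ones = np.ones(n)
    s2 = np.ones(p)                # step 1: initial prior variances (assumed)
    sigma2 = float(np.var(y))      # step 1: initial residual variance (assumed)
    mu = float(np.mean(y))
    gamma = np.zeros(p)
    for _ in range(max_iter):
        # Marginal covariance of y: V = Z diag(s2) Z' + sigma2 * I
        V = (Z * s2) @ Z.T + sigma2 * np.eye(n)
        Vinv = np.linalg.inv(V)
        # Fixed effect (model mean) by generalized least squares
        mu = (ones @ Vinv @ y) / (ones @ Vinv @ ones)
        r = y - mu
        # Step 2 (E-step): BLUP of each effect and expectation of its square
        ZtVinv = Z.T @ Vinv
        new_gamma = s2 * (ZtVinv @ r)
        post_var = s2 - s2**2 * np.einsum("kn,nk->k", ZtVinv, Z)
        e_quad = new_gamma**2 + np.maximum(post_var, 0.0)
        # Step 3 (M-step): update prior variances and residual variance
        s2 = (e_quad + omega) / (tau + 3.0)
        sigma2 = float(r @ (r - Z @ new_gamma)) / n
        # Step 4: stop when the effects no longer change
        converged = np.max(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if converged:
            break
    return mu, gamma, s2, sigma2

# Toy data: 20 markers with F2-like coding, one true effect of size 2.0.
rng = np.random.default_rng(1)
n, p = 200, 20
Z = rng.choice([-1.0, 0.0, 1.0], size=(n, p), p=[0.25, 0.5, 0.25])
true_gamma = np.zeros(p)
true_gamma[3] = 2.0
y = 5.0 + Z @ true_gamma + rng.normal(0.0, 1.0, size=n)
mu_hat, gamma_hat, s2_hat, sigma2_hat = ebayes_em(y, Z)
```

In this toy run the effect at the fourth marker should stay close to its simulated value while the remaining effects are shrunk toward zero, which is the behavior the shrinkage prior is meant to produce.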
In addition, we performed a two-dimensional scan using the maximum likelihood approach to estimate the parameters in model (7).
Likelihood ratio test
If we want to report only QTL with relatively large effects and give readers accurate information about how significant the identified QTL are, a statistical test should be conducted. The usual likelihood ratio test (LRT) cannot be carried out with the E-Bayes method owing to the oversaturated epistatic genetic model. We used the following two-stage selection process to screen the QTL [31]. In the first stage, all QTL with are picked up. In the second stage, the epistatic genetic model is modified so that only effects that passed the first round of selection are included in the model. Owing to the smaller dimensionality of the reduced model, we can use the maximum likelihood method to re-analyze the data and perform the LRT [31]. The test statistic is
(12)where
is the parameters vector in the statistical genetic model in the second stage analysis of model (8);
is the parameters vector in
excluding the currently tested genetic effect
;
and
are the log maximum likelihood function for
and
, respectively. For simplicity, we took
and 3.0 as the critical values in our small and larger genome simulation experiments, respectively.
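The second-stage test can be illustrated with a short Python sketch. It assumes a normal linear model fitted by ordinary least squares, for which the maximized log-likelihood has a closed form, and it converts the LRT statistic to a LOD score with the standard relation LOD = LRT / (2 ln 10). The helper names `max_loglik` and `lod_for_effect` and the toy data are hypothetical, and the sketch is not the authors' implementation.

```python
import numpy as np

def max_loglik(y, X):
    """Maximized normal log-likelihood of the linear model y = X b + e."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return -0.5 * n * (np.log(2.0 * np.pi * rss / n) + 1.0)

def lod_for_effect(y, X, j):
    """LRT and LOD for dropping column j of X (the currently tested effect)."""
    full = max_loglik(y, X)
    reduced = max_loglik(y, np.delete(X, j, axis=1))
    lrt = -2.0 * (reduced - full)
    return lrt, lrt / (2.0 * np.log(10.0))

# Toy reduced model: intercept, one real effect (column 1), one null effect.
rng = np.random.default_rng(2)
n = 200
x1 = rng.choice([-1.0, 0.0, 1.0], size=n, p=[0.25, 0.5, 0.25])
x2 = rng.choice([-1.0, 0.0, 1.0], size=n, p=[0.25, 0.5, 0.25])
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 1.0 * x1 + rng.normal(0.0, 1.0, size=n)
lrt1, lod1 = lod_for_effect(y, X, 1)
lrt2, lod2 = lod_for_effect(y, X, 2)
```

An effect would be reported when its LOD exceeds the chosen critical value, for example 2.5 or 3.0.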
Results
Experiment I
The purpose of the simulation experiment was: (1) to evaluate the statistical performance of the proposed approach; (2) to compare the proposed method with previous approaches, such as Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] or Melchinger et al. [7], [21] and Kusterer et al. [22], according to statistical power, standard deviation and accuracy measure; and (3) to compare the TTC design with the F2 and F2:3 genetic designs.
The simulated genome consisted of three chromosomes (chr1, chr2 and chr3), and 11 evenly spaced markers covered each chromosome with an average marker interval of 10.0 cM. We simulated three main-effect QTL and one pair-wise interaction QTL, all of which overlapped with markers. All three main-effect QTL were located at the center (50.0 cM) of each chromosome, and QTL2 on chr2 interacted with QTL3 on chr3. The genetic parameters under both the F2 and the F∞ metric models were as follows: ;
and
for QTL1;
and
for QTL2;
and
for QTL3;
,
,
and
for the epistatic effects between QTL2 and QTL3. The marginal heritabilities of these genetic effects varied from 1.01% to 36.54%. The sample size (n), the number of individuals in the F2 population, was set at two levels: 200 and 400. The number of individuals (m) for each TTC family was set at 1, 5 and 10. The environmental variance (
) was set at 4.00 and 1.00. To implement the last objective of the simulation experiment, two other kinds of populations, the F2 and F2:3 populations, were also simulated. However, molecular marker information for all three populations was derived from the corresponding F2 individuals. Each treatment was replicated 200 times for the TTC and F2:3 designs and 400 times for the F2 design. In the analyses of the TTC family data, two approaches were adopted: 1) Method A, the proposed method in this study, and 2) Method B, the modified method of Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] or Melchinger et al. [7], [21] and Kusterer et al. [22], by removing the augmented epistatic effects from models (3) and (6). In the analyses of the F2 and F2:3 datasets, all of the main effects and all of the pair-wise interaction effects for all of the markers on the whole genome were simultaneously included in the genetic model. For each simulated QTL, we counted the samples in which the LOD statistic was greater than 2.5 and the identified QTL was within 20.0 cM of the simulated QTL. The estimate for QTL parameter was the average of the corresponding estimates in the counted samples. The ratio of the number of such samples to the total number of replicates represented the empirical power of this QTL.
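The counting rule for empirical power described above can be written as a small Python helper. The sketch assumes that each simulation replicate yields one LOD score, one estimated position and one effect estimate per simulated QTL, and that "within 20.0 cM" includes the boundary; the function name `empirical_power` and the toy numbers are hypothetical.

```python
import numpy as np

def empirical_power(lod, pos, est, true_pos, lod_cut=2.5, window=20.0):
    """Power, mean estimate and SD over the counted replicates."""
    lod, pos, est = map(np.asarray, (lod, pos, est))
    # A replicate is counted if the LOD passes the threshold and the
    # detected position lies within the window around the simulated QTL.
    counted = (lod > lod_cut) & (np.abs(pos - true_pos) <= window)
    power = counted.mean()
    mean_est = est[counted].mean() if counted.any() else np.nan
    sd_est = est[counted].std(ddof=1) if counted.sum() > 1 else np.nan
    return power, mean_est, sd_est

# Four toy replicates for a QTL simulated at 50.0 cM.
power, mean_est, sd_est = empirical_power(
    lod=[3.1, 2.0, 5.4, 2.9],
    pos=[50.0, 50.0, 80.0, 40.0],
    est=[1.9, 1.0, 2.5, 2.1],
    true_pos=50.0,
)
```

Here the second replicate fails the LOD threshold and the third lies 30 cM from the simulated position, so only two of the four replicates are counted.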
To achieve the first objective of the simulation experiment, Z1, Z2 and Z3 were analyzed by Method A. In the first step, with Z1 or Z2, 33 augmented additive or dominance effects (
or
) and 528 augmented epistatic effects (
or
) were estimated, and with Z3, 1584 pure epistatic effects (
,
and
) were estimated. All the effects were tested by the likelihood ratio statistic so that real QTL could be identified. The results for detected QTL under the F2 metric model are listed in Tables 3–5. The results show that the newly defined parameters, i.e.,
,
,
(
),
and
, were estimated in an almost unambiguous and unbiased manner, and all of the main-effect QTL were identified with a high statistical power and precision in the estimated effects and positions of the QTL by taking the TTC family mean as the unit of phenotypic measurement. The augmented epistatic QTL (
and
) were also well detected, except for the situation when
,
and
. In the second step, all the pure main and epistatic effects would be estimated in an unbiased manner (Table 6). It should also be noted that a large sample (
), a greater family replication number (
), and moderate QTL heritability (
) are needed for the partition of the augmented epistatic effects (
and
) into its components (
,
,
and
), and detecting
epistasis is more difficult than detecting
epistasis (Tables 5 and 6). The theoretical explanation is that
(also
) has a larger contribution to the genetic variance of
than
(
when
, Supporting Information S2). In addition, the powers in the detection of the augmented epistatic effects (
in Table 3 and
in Table 4) were always much higher than those of pure epistatic effects (
,
and
in Table 5). The possible explanations lie in that 1) the augmented epistatic effects (
and
) were the sum of two epistatic effects with the same signs in Experiment I and were inflated, and 2) these epistatic effects have different contributions to the genetic variances of
,
and
(Supporting Information S2).
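The effect counts quoted for Experiment I (33, 528 and 1584) are consistent with one main effect per marker, one augmented epistatic effect per marker pair, and three pure epistatic effects per marker pair. The short Python check below makes this arithmetic explicit; reading 1584 as 3 × 528 is an interpretation of the text, and the function name `effect_counts` is hypothetical.

```python
from math import comb

def effect_counts(n_markers):
    """Number of main effects, marker pairs, and three effects per pair."""
    main = n_markers
    pairs = comb(n_markers, 2)
    return main, pairs, 3 * pairs

print(effect_counts(33))  # (33, 528, 1584)
```

The same function shows how quickly the model grows with marker number, which is why a shrinkage method and two-dimensional scans are used instead of ordinary least squares on the full model.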
To achieve the second objective of the simulation experiment, Z1 and Z2 were re-analyzed by Method B, and the results under the F2 metric model are also listed in Tables 3 and 4. The results show that Z1 and Z2 could still be used to obtain unbiased estimates of the QTL additive (
) and dominance effect (
) when the QTL (QTL1) acted independently; but provided biased estimation of QTL additive (
and
) and dominance effects (
and
) when the QTL acted dependently (QTL2 and QTL3). The additive (
and
) and dominance effects (
and
) of interactive QTL obtained by Method B in Tables 3 and 4 were indeed the newly defined additive effects (
and
) and the new dominance effects (
and
) with slightly poorer precision (a slightly larger standard deviation) in the estimated QTL effects and positions and lower statistical power. This means that the new method was better than the previous methods of Kearsey et al. [12], Frascaroli et al. [16] and Li et al. [17] in the presence of epistasis. The higher statistical power and smaller error variance of Method A relative to Method B show that the new method was also superior to the methods of Melchinger et al. [7], [21] and Kusterer et al. [22].
To achieve the third objective of the simulation experiment, the F2 and F2:3 data were analyzed and the results under the F2 metric model are listed in Tables 7 and 8. The results show that many effects could be estimated in an unambiguous and unbiased manner in the F2 and F2:3 genetic designs. In the situation of , the F2 design was superior to both the TTC and F2:3 designs. The reasons are as follows. In all three designs, marker genotypes were obtained from F2 individuals. If
, genotype sampling error was large for both TTC and F2:3 designs. Meanwhile, the proposed approach in this study did not consider the mixed distribution of the F2:3 (or TTC) progeny derived from heterozygous F2 parents. However, the powers in the detection of the main and epistatic QTL were smaller for the F2 design than for the TTC design with
(or 10) when sample size (
) was small and/or environmental variance (
) was large, and the same trend was obtained for the precision of the estimates for the effects and the positions of the main and epistatic QTL. For example, when
and
, the power for main effects
and
were 0.850 and 0.775 and the standard deviation (SD) were 0.253 and 0.308, respectively, in F2 design (Table 7); while the power for
and
were 1.000 and 1.000 and the SDs were 0.118 and 0.104, respectively, in the TTC design with a family replication number of 10 (Tables 3 and 4). This may be because the phenotypic value is measured on single individuals in the F2 design but on families in the TTC design: averaging over the m replicates of a family reduces the residual variance of the family mean and thus improves the precision of the phenotypic data. The F2:3 design likewise uses family means to decrease the environmental variance. In addition, the dominance components decrease markedly in the F2:3 design because of selfing, so the statistical powers for detecting dominance effects, additive-by-dominance (dominance-by-additive) epistatic effects and especially dominance-by-dominance epistatic effects in the F2:3 design will be lower than those in the TTC design. For example, when
,
and
, the power of 0.170 for
in F2:3 (Table 8) was much lower than that of 0.490 in the TTC (Table 5). The genetic variance contributed by the simulated three QTL under TTC and F2:3 designs were (Supporting Information S2):
These variance components can be used to interpret the above simulation results.
Experiment II
The purpose of the simulation experiment was to show the statistical properties of the proposed approach in the TTC design when the augmented epistatic effects consisted of two epistatic effects of equal strength in opposite directions. The genetic parameters under both the F2 and the F∞ metric models were as follows: ;
,
for QTL1;
,
for QTL2;
,
for QTL3;
,
,
and
for the epistatic effects between QTL2 and QTL3. The marginal heritabilities of these genetic effects now varied from 0.98% to 38.75%. The value of m was set at 5 and 10. The other settings were the same as those in Experiment I.
The results for Experiment II are listed in Tables 9–11. The results show that the powers in the detection of the augmented epistatic effects ( in Table 9 and
in Table 10) were very low. The results are reasonable because the genetic contributions of the augmented epistatic effects to the genetic variance of
and
were low. However, the powers for pure epistatic effects (
,
and
) remained steady (Tables 5 and 11) because the genetic contributions for these effects do not change.
Experiment III
We simulated a large genome to explore the performance of the proposed method in a setting resembling real data analysis. The simulated genome was 1000.0 cM in total length and covered by 210 markers (10 chromosomes, each covered by 21 markers equally spaced at 5.0 cM). Ten main-effect QTL and three pairs of interacting QTL, which together explained ∼50% of the variation in L1, L2 and L3, were assumed (Tables 12 and 13). The environmental variance (), sample size and family replication number were set at 6.0, 500 and 10, respectively. The mapping results from 200 samples under the F2 metric model are presented in Table 12 for the main-effect QTL and Table 13 for the epistatic QTL. The results in Table 12 show that all the augmented main effects were estimated without bias and with satisfactory power, and that most pure additive and dominance effects were also estimated without bias, with the exception of the pure dominance effects for QTL5 and QTL8. The results in Table 13 demonstrate that with Z1 and Z2 the augmented epistatic effects (
and
) were well estimated when they consisted of two epistatic effects with same sign (QTL4 and QTL7, QTL9 and QTL10) and were poorly detected when they consisted of two epistatic effects of equal strength in opposite directions (
and
for QTL5 and QTL8); with Z3, all the pure epistatic effects (
,
and
) were well estimated regardless of their signs; and all pure epistatic effects (
,
,
and
) estimated in the second stage were unbiased except for
for QTL5 and QTL8 (
). The failure of detecting
resulted in a biased estimate of
, which in turn caused poor estimates of
and
. These results were similar to those in simulation Experiments I and II. The computing time was ∼4.70 h per sample on our personal computer (CPU: Intel® Core™ 2 Duo, 3.0 GHz; memory: 2.0 GB).
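The marker layout of the large genome described above (10 chromosomes, 21 markers each, 5.0 cM spacing, 1000.0 cM in total) can be mimicked with a short simulation of F2 marker genotypes. The sketch below is only an illustration under stated assumptions: recombination between adjacent markers follows the Haldane map function with no interference, genotypes are coded as the number of P2 alleles, and the function name `simulate_f2_markers` is hypothetical. The text does not state which map function or simulation procedure was actually used.

```python
import numpy as np

def simulate_f2_markers(n_ind, n_chr=10, n_mk=21, spacing_cm=5.0, seed=0):
    """Simulate F2 marker genotypes (0, 1 or 2 copies of the P2 allele)."""
    rng = np.random.default_rng(seed)
    # Haldane map function: recombination fraction between adjacent markers
    r = 0.5 * (1.0 - np.exp(-2.0 * spacing_cm / 100.0))

    def gametes():
        # One F1 gamete per individual: allele 0 (P1) or 1 (P2) per marker
        g = np.empty((n_ind, n_chr, n_mk), dtype=int)
        g[:, :, 0] = rng.integers(0, 2, size=(n_ind, n_chr))
        for k in range(1, n_mk):
            rec = rng.random((n_ind, n_chr)) < r
            g[:, :, k] = np.where(rec, 1 - g[:, :, k - 1], g[:, :, k - 1])
        return g

    geno = gametes() + gametes()          # two gametes form one F2 plant
    return geno.reshape(n_ind, n_chr * n_mk)

geno = simulate_f2_markers(500)           # 500 F2 plants x 210 markers
```

With 210 markers, a model containing all main effects and all pairwise interactions involves 210 + 21945 effects, which gives a sense of why the analysis of each sample took several hours.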
Experiment IV
This simulation experiment considered the situation in which QTL are located within marker intervals rather than at markers. The three simulated QTL were placed at 45.0 cM (the middle of a marker interval), 52.5 cM (to the right of the sixth marker) and 47.5 cM (to the left of the sixth marker), respectively. The number of individuals (m) for each TTC family was set at 5 and 10. The other settings were the same as those in Experiment I. The results are shown in Tables 14–16. The accuracies of the estimated QTL effects and positions, as well as the empirical power, were satisfactory but lower than those presented in Tables 3–5, and the QTL effects were slightly underestimated because of recombination between each QTL and its adjacent marker.
Discussion
Compared with previous methodologies for the TTC, the method described here offers several advantages. First, with Z1 or Z2, all augmented main and epistatic effects (
,
,
and
) were included simultaneously in one genetic model and estimated together by the E-Bayes approach. Our simulation studies showed that these augmented effects could be estimated with very high power and precision when the component epistatic effects (
and
or
and
) of
and
have the same direction (Tables 3, 4 and 13). Even when these epistatic effects have different signs, the new approach works well for augmented main-effect QTL parameters (Tables 9, 10 and 12).
Second, with Z3, three pure epistatic effects (
,
and
) were estimated simultaneously in this study by two-dimensional genome scans. We also attempted to estimate all the epistatic effects under the E-Bayes framework using a full genetic model that included all the digenic epistatic effects, but this attempt failed for reasons that remain unclear. Although several approaches have been used to detect epistasis in the RIL-based TTC and NCIII designs, little has been reported about the estimation of more than two epistatic effects in the TTC. Frascaroli et al. [16] and Li et al. [17] adopted the mixed linear model approach of Wang et al. [20] to detect
in the analyses of
and
in the analyses of
; and Kusterer et al. [22] and Melchinger et al. [21] used two-way ANOVA on
and
for the detection of
and
, respectively. However, the two studies involved only one digenic epistatic effect. Although multiple interval mapping has been used to detect the augmented epistatic effects (
and
) by Garcia et al. [34], the genetic design is NCIII and the estimate is a compound effect, not a pure epistatic effect. In addition, Reif et al. [24] proposed a two-step procedure to detect
with particular two-segment NILs.
Finally, many main and epistatic effects can be estimated in an unambiguous and unbiased manner by our two-step approach. In the first step, the augmented main and epistatic effects (,
,
and
) and three pure epistatic effects (
,
and
) may be estimated in the separate analyses of Z1, Z2 and Z3. In the next step, all four pure epistatic effects (
,
,
and
) may be estimated by using the equation
and
and pure additive and dominant effects may be further estimated by using the equations of
and
. The simulation results show that the two-step approach works well (Tables 6, 12 and 13). However, the pure epistatic effects (
,
and
) could not be detected with satisfactory statistical power when the sample size (
) and family replication number (
) were low (Tables 5 and 11). Therefore, a large
and
are needed for the detection of epistasis. To accommodate larger
, suitable field experimental designs, such as split-plot design [13], [16] and block in replication [35], are desired to control for environmental error.
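The logic of the second step can be illustrated with a minimal sketch. It assumes only that each augmented effect is a known linear combination of pure effects, so that once some pure effects are available from the first step, the remaining ones follow from a linear solve. The coefficient matrix, the effect ordering and the numerical values below are hypothetical placeholders chosen for illustration; they are not the equations or results of this study.

```python
import numpy as np

def solve_remaining_effects(coef, augmented, known):
    """Recover pure effects from augmented effects.

    coef      : (m, p) matrix with augmented = coef @ pure (hypothetical here)
    augmented : length-m vector of augmented-effect estimates (step one)
    known     : dict {column index: value} of pure effects already estimated
    Returns the full length-p vector of pure effects.
    """
    coef = np.asarray(coef, dtype=float)
    augmented = np.asarray(augmented, dtype=float)
    p = coef.shape[1]
    known_idx = sorted(known)
    free_idx = [j for j in range(p) if j not in known]

    # Move the contribution of the already-known pure effects to the right-hand side.
    rhs = augmented.copy()
    for j in known_idx:
        rhs -= coef[:, j] * known[j]

    # Least-squares solve for the pure effects that are still unknown.
    free_vals, *_ = np.linalg.lstsq(coef[:, free_idx], rhs, rcond=None)

    pure = np.empty(p)
    for j in known_idx:
        pure[j] = known[j]
    for j, value in zip(free_idx, free_vals):
        pure[j] = value
    return pure

# Hypothetical ordering of pure effects: [a, d, aa, dd].
# Each row of `coef` defines one made-up augmented effect.
coef = [[1.0, 0.0, 0.0, 0.5],
        [0.0, 1.0, 0.0, -0.5],
        [0.0, 0.0, 1.0, 1.0]]
augmented = [1.7, 1.3, -0.2]   # toy step-one estimates
pure = solve_remaining_effects(coef, augmented, known={3: -0.6})
print(pure)
```

In this toy example the third augmented value is small even though both of its components are not, because the two components have opposite signs.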
The F2-based TTC design is superior to the F2 design for the detection of main-effect and epistatic QTL when there is a small sample size and a large residual variance (Tables 3, 4, 5 and 7), and is more powerful for estimating ,
(or
) and especially
than the F2:3 design (Tables 4, 5 and 8). The new method may be extended to TTC designs derived from other base populations, such as RIL, BC and DH, because the genetic models for Z1, Z2 and Z3 in these new TTC designs can be described in the same manner. In Tables S7 and S8 and Supporting Information S3, we present only the expected genetic values and genetic variances for Z1, Z2 and Z3 under both the F2 and the F∞ metric models in the RIL-based TTC design.
The approach proposed in this study differs from the previous methods of Kearsey et al. [12], Frascaroli et al. [16], Melchinger et al. [7], [21] and Li et al. [17] in two respects. First, our approach derives linear regression models for Z1, Z2 and Z3, whereas the previous methods make use of ANOVA. Thus, the precondition for our approach is to derive the dummy variables for each genetic effect, whereas the precondition for the previous methods is to obtain the expectations and expected mean squares. In the expectations and expected mean squares, if one effect is confounded with another, the confounded effects can only be estimated together; this combination is the augmented effect in the ANOVA. Likewise, if there are multicollinear relationships among dummy variables, the corresponding effects cannot be estimated individually, but their combination is estimable; this combination is the augmented effect in the linear regression analysis. This explains why we construct augmented effects. Second, we consider all the main-effect QTL and all the digenic interactions in one model of Z1 or Z2; all the augmented additive, dominance and epistatic effects are properly defined; and all the pure main and epistatic effects can be estimated without bias. Although in the previous studies the augmented additive and dominant effects (
and
) have been rightly defined and are clearly confounded by QTL × genetic background epistasis in the RIL-based TTC and NCIII designs [7], [21], [22], the augmented epistatic effects have been ignored. This neglect would result in a biased estimation for the augmented main effects, a larger residual variance and a lower power of QTL detection (Tables 3 and 4). In addition, with Z3 we can estimate three types of pure epistatic effects (ad, da and dd) using two-dimensional genome scans. This differs from Melchinger et al. [21], in which only dd epistasis can be obtained.
The F2 and F∞ are two main metrics that are adopted for populations derived from a cross between two inbred lines. The F2 metric is orthogonal for the F2 population when epistatic genes are under linkage equilibrium, whereas the F∞ metric is orthogonal for homozygous lines [28]–[30]. An orthogonal model implies that estimates of the genetic effects are consistent in a full and reduced model and is directly related to the partition of the genetic variance in the population. Using different models does not influence the detection of the main and epistatic QTL, but it does influence the estimation and interpretation of genetic effects [30]. Melchinger et al. [7], [21] and Kusterer et al. [13], [22] advocated the F2 metric in the RIL-based NCIII and TTC designs for three reasons: (1) it has the advantage that each variance component is proportional to the sum of the squares of the corresponding genetic effects and does not involve any other type of genetic effects that could obscure their interpretation; (2) epistatic interactions by two-way ANOVAs for pairs of marker loci using was just
; and (3) with digenic epistasis, midparent heterosis
involves only
beside dominance effects, whereas under the F∞ metric MPH is additionally influenced by
. For the F2-based TTC design, neither the F2 nor the F∞ metric model is orthogonal (Supporting Information S2). With Z1 and Z2, the newly defined parameters (
,
,
and
) were all rightly identified and estimated by our full model methods under both metrics (Tables 3, 4, 12 and 13), and with Z3 the pure epistatic effects (
,
, and
) could also be detected and well estimated under both metrics when the sample size and number of family replications were large in our simulation studies (Tables 5, 11 and 13). The differences under the two metrics may be as follows: (1) the newly defined main effects and model means are different for the Z1 and Z2 under the two models; and (2) the F2 metric model seems to behave better than the F∞ metric model (higher power and precision) (data not shown).
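The meaning of orthogonality can be checked numerically for the simplest case mentioned above, an F2 population with two unlinked loci in linkage equilibrium (not the F2-based TTC design itself). The sketch below assumes the usual dominance codings, (−1/2, 1/2, −1/2) for the F2 metric and (0, 1, 0) for the F∞ metric over genotypes QQ, Qq and qq, and computes the covariance between the dominance variable of locus 1 and the dominance × dominance variable. The function and constant names are illustrative.

```python
import itertools
import numpy as np

# One-locus genotype frequencies in an F2 population.
FREQ = {"QQ": 0.25, "Qq": 0.5, "qq": 0.25}

# Dominance codings under the two metrics (assumed standard codings).
DOM_F2 = {"QQ": -0.5, "Qq": 0.5, "qq": -0.5}
DOM_FINF = {"QQ": 0.0, "Qq": 1.0, "qq": 0.0}

def cov_d_vs_dd(dom):
    """Covariance between the locus-1 dominance variable and the dd variable,
    taken over the nine two-locus genotypes of an F2 with unlinked loci."""
    weights, z1, zz = [], [], []
    for g1, g2 in itertools.product(FREQ, repeat=2):
        weights.append(FREQ[g1] * FREQ[g2])   # independence of the two loci
        z1.append(dom[g1])
        zz.append(dom[g1] * dom[g2])
    weights, z1, zz = (np.array(v) for v in (weights, z1, zz))
    return float(np.sum(weights * z1 * zz)
                 - np.sum(weights * z1) * np.sum(weights * zz))

cov_f2 = cov_d_vs_dd(DOM_F2)
cov_finf = cov_d_vs_dd(DOM_FINF)
print("F2 metric:  ", cov_f2)
print("F-inf metric:", cov_finf)
```

A zero covariance means that dropping the dd term from the model does not change the estimate of the dominance effect; a non-zero covariance means that the main-effect estimate depends on whether the epistatic term is included.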
The approach proposed in this study assumes that all the QTL are located at marker positions. When marker density is high, all the QTL can be detected with high power and precision. When markers are sparse, the QTL effects are slightly underestimated because of recombination between a QTL and its adjacent markers. To address this issue, virtual markers (with genotypes treated as missing data) may be inserted between the observed markers, and their genotypes may then be inferred by marker imputation techniques.
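As a minimal sketch of how virtual markers might be placed, the function below inserts evenly spaced positions into every marker interval that is wider than a chosen threshold. The 5 cM default and the function name are assumptions made for illustration; the genotypes at the inserted positions would still have to be imputed in a separate step, which is not shown.

```python
import math

def add_virtual_markers(positions_cm, max_gap_cm=5.0):
    """Return (position, is_virtual) pairs for one chromosome.

    Evenly spaced virtual markers are inserted so that no interval between
    neighbouring positions exceeds max_gap_cm (an illustrative threshold).
    """
    if max_gap_cm <= 0:
        raise ValueError("max_gap_cm must be positive")
    observed = sorted(positions_cm)
    result = []
    for left, right in zip(observed, observed[1:]):
        result.append((left, False))
        # Number of equal segments needed so that each is at most max_gap_cm.
        n_segments = max(1, math.ceil((right - left) / max_gap_cm))
        step = (right - left) / n_segments
        for k in range(1, n_segments):
            result.append((left + k * step, True))
    if observed:
        result.append((observed[-1], False))
    return result

# Toy map with observed markers at 0, 10 and 22 cM.
markers = add_virtual_markers([0.0, 10.0, 22.0], max_gap_cm=5.0)
for position, is_virtual in markers:
    print(position, "virtual" if is_virtual else "observed")
```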
The drawbacks of our method may lie in two aspects: (1) with Z1 and Z2, the augmented epistatic effects (
and
) were poorly detected when their corresponding components had equal strength in opposite directions (Tables 9, 10 and 13). This would result in a biased estimate of the pure aa epistatic effect, such as
in Table 13, and would further lead to poor estimates of the pure dominance effects, such as
and
in Table 12; and (2) the estimation error for the pure main and epistatic effects obtained with the two-step approach seemed to be somewhat large. These issues will be studied in the future.
Supporting Information
Supporting Information S1.
Statistical genetic models for mapping QTL in the TTC design under the F∞ metric model.
https://doi.org/10.1371/journal.pone.0024575.s001
(DOC)
Supporting Information S2.
The expected genetic values of Z1, Z2 and Z3 under the F2 and the F∞ metric models in the F2-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s002
(DOC)
Supporting Information S3.
The expected genetic values of Z1, Z2 and Z3 under the F∞ and the F2 metric models in the RIL-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s003
(DOC)
Table S1.
Genetic constitutions of the F2-based TTC family means L1i, L2i and L3i.
https://doi.org/10.1371/journal.pone.0024575.s004
(DOC)
Table S2.
Expected genetic value of L1i family under the F2 and the F∞ metric models in the F2-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s005
(DOC)
Table S3.
Expected genetic value of L2i family under the F2 and the F∞ metric models in the F2-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s006
(DOC)
Table S4.
Expected genetic value of L3i family under the F2 and the F∞ metric models in the F2-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s007
(DOC)
Table S5.
Expected genetic values of Z1, Z2 and Z3 under the F2 metric model in the F2-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s008
(DOC)
Table S6.
Expected genetic values of Z1, Z2 and Z3 under the F∞ metric model in the F2-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s009
(DOC)
Table S7.
Expected genetic values of Z1, Z2 and Z3 under the F2 metric model in the RIL-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s010
(DOC)
Table S8.
Expected genetic values of Z1, Z2 and Z3 under the F∞ metric model in the RIL-based TTC design.
https://doi.org/10.1371/journal.pone.0024575.s011
(DOC)
Acknowledgments
The authors thank the anonymous reviewers for their comments on an earlier version of the manuscript.
Author Contributions
Conceived and designed the experiments: Y-MZ. Performed the experiments: X-HH. Analyzed the data: X-HH. Contributed reagents/materials/analysis tools: X-HH. Wrote the paper: Y-MZ X-HH.
References
- 1. Carlborg Ö, Haley CS (2004) Epistasis: too often neglected in complex trait studies. Nat Rev Genet 5: 618–625.
- 2. Moore JH, Williams SM (2005) Traversing the conceptual divide between biological and statistical epistasis: systems biology and a more modern synthesis. BioEssays 27: 637–646.
- 3. Jinks JL, Jones RM (1958) Estimation of the components of heterosis. Genetics 43: 223–234.
- 4. Yu SB, Li JX, Xu CG, Tan YF, Gao YJ, et al. (1997) Importance of epistasis as the genetic basis of heterosis in an elite rice hybrid. Proc Natl Acad Sci USA 94: 9226–9231.
- 5. Lippman ZB, Zamir D (2007) Heterosis: revisiting the magic. Trends in Genetics 23: 60–66.
- 6. Melchinger AE, Piepho HP, Utz HF, Muminovic J, Wegenast T, et al. (2007) Genetic basis of heterosis for growth-related traits in Arabidopsis investigated by testcross progenies of near-isogenic lines reveals a significant role of epistasis. Genetics 177: 1827–1837.
- 7. Melchinger AE, Utz HF, Piepho HP, Zeng ZB, Schön CC (2007) The role of epistasis in the manifestation of heterosis: a systems-oriented approach. Genetics 177: 1815–1825.
- 8. Wright S (1980) Genic and organismic selection. Evolution 34: 825–843.
- 9. Carson HL, Templeton AR (1984) Genetic revolutions in relation to speciation phenomena: the founding of new populations. Annu Rev Ecol Syst 15: 97–131.
- 10. Kearsey MJ, Jinks JL (1968) A general method of detecting additive, dominance and epistatic variation for metrical traits. I. Theory. Heredity 23: 403–409.
- 11. Kearsey MJ, Pooni HS (1996) The genetical analysis of quantitative traits. Chapman and Hall: London.
- 12. Kearsey MJ, Pooni HS, Syed NH (2003) Genetics of quantitative traits in Arabidopsis thaliana. Heredity 91: 456–464.
- 13. Kusterer B, Muminovic J, Utz HF, Piepho HP, Barth S, et al. (2007) Analysis of a triple testcross design with recombinant inbred lines reveals a significant role of epistasis in heterosis for biomass-related traits in Arabidopsis. Genetics 175: 2009–2017.
- 14. Jinks JL, Perkins JM (1970) A general method for the detection of additive, dominance and epistatic components of variation III. F2 and backcross populations. Heredity 25: 419–429.
- 15. Perkins JM, Jinks JL (1970) Detection and estimation of genotype-environmental, linkage and epistatic components of variation for a metrical trait. Heredity 25: 157–177.
- 16. Frascaroli E, Canè MA, Landi P, Pea G, Gianfranceschi L, et al. (2007) Classical genetic and quantitative trait loci analyses of heterosis in a maize hybrid between two elite inbred lines. Genetics 176: 625–644.
- 17. Li L, Lu K, Chen Z, Mu T, Hu Z, et al. (2008) Dominance, overdominance and epistasis condition the heterosis in two heterotic rice hybrids. Genetics 180: 1725–1742.
- 18. Kearsey MJ, Hyne V (1994) QTL analysis: a simple ‘marker-regression’ approach. Theor Appl Genet 89: 698–702.
- 19. Zeng ZB (1994) Precision mapping of quantitative trait loci. Genetics 136: 1457–1468.
- 20. Wang DL, Zhu J, Li ZK, Paterson AH (1999) Mapping QTL with epistatic effects and QTL × environment interactions by mixed model approaches. Theor Appl Genet 99: 1255–1264.
- 21. Melchinger AE, Utz HF, Schön CC (2008) Genetic expectations of quantitative trait loci main and interaction effects obtained with the triple testcross design and their relevance for the analysis of heterosis. Genetics 178: 2265–2274.
- 22. Kusterer B, Piepho HP, Utz HF, Schön CC, Muminovic J, et al. (2007) Heterosis for biomass-related traits in Arabidopsis investigated by quantitative trait loci analysis of the triple testcross design with recombinant inbred lines. Genetics 177: 1839–1850.
- 23. Jiang C, Zeng ZB (1995) Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140: 1111–1127.
- 24. Reif JC, Kusterer B, Piepho HP, Meyer RC, Altmann T, et al. (2009) Unraveling epistasis with triple testcross progenies of near-isogenic lines. Genetics 181: 247–257.
- 25. Zhu C, Zhang R (2007) Efficiency of triple test cross for detecting epistasis with marker information. Heredity 98: 401–410.
- 26. Wang XF, Song W, Yang ZF, Wang YM, Tang ZX, et al. (2009) Improved genetic mapping of endosperm traits using NCIII and TTC designs. Journal of Heredity 100: 496–500.
- 27. Lander ES, Botstein D (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199.
- 28. Kao CH, Zeng ZB (2002) Modeling epistasis of quantitative trait loci using Cockerham's model. Genetics 160: 1243–1261.
- 29. Yang RC (2004) Epistasis of quantitative trait loci under different gene action models. Genetics 167: 1493–1505.
- 30. Zeng ZB, Wang T, Zou W (2005) Modeling quantitative trait loci and interpretation of models. Genetics 169: 1711–1725.
- 31. Zhang YM, Xu S (2005) A penalized maximum likelihood method for estimating epistatic effects of QTL. Heredity 95: 96–104.
- 32. Xu S (2007) An empirical Bayes method for estimating epistatic effects of quantitative trait loci. Biometrics 63: 513–521.
- 33. Xu S (2010) An expectation–maximization algorithm for the Lasso estimation of quantitative trait locus effects. Heredity 105: 483–494.
- 34. Garcia AAF, Wang S, Melchinger AE, Zeng ZB (2008) Quantitative trait loci mapping and the genetic basis of heterosis in maize and rice. Genetics 180: 1707–1724.
- 35. Cockerham CC, Zeng ZB (1996) Design III with marker loci. Genetics 143: 1437–1456.