Current address: Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden

Current address: CEES, Department of Biology, Oslo University, Oslo, Norway

Analyzed the data: JÁ AL. Contributed reagents/materials/analysis tools: JÁ AL ÖC. Wrote the paper: JÁ. Conceived part of the content of the article and developed part of the theory: JÁ ÖC.

The authors have declared that no competing interests exist.


Although the genotype-phenotype map plays a central role in both Quantitative and Evolutionary Genetics, the formalization of a completely general and satisfactory model of genetic effects, particularly accounting for epistasis, remains a theoretical challenge. Here, we use a two-locus genetic system in simulated populations with epistasis to show the convenience of using a recently developed model, NOIA, to perform estimates of genetic effects and the decomposition of the genetic variance that are orthogonal even under deviations from the Hardy-Weinberg proportions. We develop the theory for how to use this model in interval mapping of quantitative trait loci using Haley-Knott regressions, and we analyze a real data set to illustrate the advantage of using this approach in practice. In this example, we show that departures from the Hardy-Weinberg proportions that are expected by sampling alone substantially alter the orthogonal estimates of genetic effects when other statistical models, like F_{2} or G2A, are used instead of NOIA. Finally, for the first time from real data, we provide estimates of functional genetic effects as sets of effects of natural allele substitutions in a particular genotype, which enriches the debate on the interpretation of genetic effects as implemented both in functional and in statistical models. We also discuss further implementations leading to a completely general genotype-phenotype map.

The rediscovery of Mendel's laws of inheritance of genetic factors gave rise to the research field of Genetics at the very beginning of the last century. The idea of traits being determined by the effects of inherited genes is thus the conceptual core of Genetics. After more than one century, however, we still lack a completely general mathematical description of how genes can control traits. Such descriptions are called genotype-phenotype maps, or models of genetic effects, and they become particularly cumbersome in the presence of interaction among genes, also referred to as epistasis. Models of genetic effects are necessary for unraveling the genetic architecture of traits (finding the genes underlying them and obtaining estimates of their individual effects and interactions) and for meaningfully using that information to investigate their evolution and to improve the response to selection in traits of economic importance. Here, we illustrate the convenience of using a recently developed model of genetic effects with arbitrary epistasis, NOIA, to inspect the genetic architecture of traits. We implement NOIA for practical use with a regression method and exemplify that theory with a real dataset. Further, we discuss the state of the art of genetic modeling and the future perspectives of this subject.

There is an increasing interest in Quantitative Genetics and Evolutionary Biology in identifying genetic effects, and more particularly gene interactions, on a genome-wide scale and in understanding their role in the genetic architecture of complex traits.

Regarding the first issue mentioned above, the models of genetic effects, the definition of the genetic effects in Haley and Knott's F_{∞} model

The statistical formulation of the recently developed NOIA (Natural and Orthogonal InterActions) model of genetic effects is orthogonal in situations where previous models are not, namely under departures from the Hardy-Weinberg proportions (HWP) at any number of loci, and it is therefore a more appropriate choice for estimating genetic effects from data in genetic mapping _{2} population or into an outbred population of interest. Second, using the functional formulation of NOIA, it is also possible to express the genetic effects as effects of allele substitutions from reference individual genotypes, instead of from population means as in the statistical formulation. In other words, starting from the orthogonal genetic effects of a population or sample under study, which are the ideal ones for performing model selection and have a particular meaning, NOIA enables us to obtain the values of the genetic effects that are associated with other desired meanings and are therefore useful for inspecting different aspects of the evolution of a population, or of selective breeding for increasing or decreasing trait values.

Our motivation for this communication is to show how to use models of genetic effects to obtain estimates of genetic effects from data that have the meaning required by a particular scientific purpose. To this end, we first inspect how much of a difference it makes to use the classical models for ideal populations, such as ideal F_{2} populations, to compute genetic effects in a non-ideal situation, under departures from the HWP. We address this issue by generating simulated populations that depart from the HWP to several degrees and analyzing them with NOIA and other models. We quantify the deviations from orthogonal estimates caused by using models that assume ideal conditions in the populations under study, thus showing the practical convenience of using the NOIA model for performing real estimates of genetic effects in QTL experiments. Second, we develop an implementation of NOIA with HKR, making it available for immediate practical use, and illustrate its performance using an example with real data. With this example we provide estimates of genetic effects with different meanings and, for the first time, functional estimates of genetic effects (using an individual genotype as reference) from a real data set. We discuss how this feature opens new possibilities of using real data to analyze important topics in Evolutionary Genetics.

_{2}), the genetic effects of a two-locus and two-allele genetic system (_{A}_{2}_{2} is almost completely absent. The second group contains the reference point—the mean of the population, μ—and the single locus effects of locus _{B}_{B}_{A}_{2} model always give the same values independently of the genetic constitution of the population. The F_{2} thus fails to capture the effects of departures from HWP at all. Thus, unless when the studied population is an ideal F_{2} (and the deviances from HWP are zero, see _{2} is biased and the genetic estimates do not reflect the average effects of allele substitutions in the population under study. Those deviations become more severe as the departure from HWP increases (

The genetic effects were obtained using the F_{2}, G2A and NOIA models in a two-locus genetic system that was simulated in nine F_{2} populations with departures from HWP ranging from zero to 97% (see text for details).

Genotype at locus A | Genotype at locus B | | |
 | B_{1}B_{1} | B_{1}B_{2} | B_{2}B_{2} |
A_{1}A_{1} | 0.25 | −0.75 | −0.75 |
A_{1}A_{2} | −0.75 | 2.25 | 2.25 |
A_{2}A_{2} | −0.75 | 2.25 | 2.25 |

_{2} population, where the three models coincide), and thus there exist covariances between the genetic effects that would need to be accounted for to obtain the true genetic variance of the population _{2} models is, thus, non-orthogonal. The G2A leads to a greater departure from an orthogonal decomposition of variance than the F_{2} model for the particular kind of departures from HWP simulated here. Both the G2A and F_{2} models underestimate the additive variance and therefore also the heritability of the trait in the simulated populations.

The variance decomposition was performed for the same cases as in the preceding figure. V_{P} is the phenotypic variance, which (in the absence of environmental variance) is equal to V_{G}, the genetic variance. V_{A} is the additive variance, V_{D} is the dominance variance and V_{I} is the epistatic (interaction) variance.

To illustrate the advantage of using NOIA for analyzing experimental data, we reanalyze a two-locus (_{2} cross between Red junglefowl and White leghorn layer chickens _{2} and F_{∞}. As explained in the previous subsection, NOIA is orthogonal under departures from the HWP, whereas the other models are not. The F_{∞} model deviates severely from the estimates obtained by NOIA. Deviations are expected since the F_{∞} model is non-orthogonal even in an ideal F_{2} population with no deviations from the expected frequencies due to sampling errors. The F_{2} and G2A models, on the other hand, would be orthogonal under ideal circumstances, and the observed deviations from orthogonality of those models when analyzing these experimental data are due to sampling (as explained above). _{2} and G2A differ substantially from those of NOIA (up to 18/42% for the G2A and 53/138% for the F_{2} model, for the genetic effects/variance component estimates). This example with real data thus shows that using NOIA to compute genetic effects and the variance decomposition in QTL mapping experiments is a substantial improvement over the classical models of genetic effects designed to fit ideal experimental situations.

Vector of genetic effects, E, with the component of variance associated with each genetic effect given in parentheses

Model | μ | α_{A} | δ_{A} | α_{B} | δ_{B} | αα | αδ | δα | δδ |
NOIA | 269.49 (169) | 1.00 (0.45) | 6.74 (11.28) | 4.47 (9.75) | −11.75 (34.32) | 9.67 (20.78) | −20.30 (46.66) | 8.22 (8.18) | −24.80 (37.87) |
G2A | 269.32 (164) | 1.18 (0.64) | 7.00 (12.25) | 4.15 (8.43) | −10.74 (28.66) | 9.68 (20.83) | −20.21 (46.28) | 8.28 (8.35) | −24.80 (38.19) |
F_{2} | 269.68 (177) | 1.53 (1.07) | 7.44 (13.84) | 4.90 (11.80) | −11.15 (31.08) | 10.48 (24.76) | −19.70 (44.56) | 9.50 (11.07) | −24.80 (38.44) |
F_{∞} | 265.23 (581) | 11.38 (59.46) | 19.84 (212.83) | 0.15 (0.01) | 1.25 (0.80) | 10.48 (24.76) | −19.70 (90.72) | 9.50 (23.94) | −24.80 (169.37) |

The variances in the μ column are the total genetic variances, computed as the sum of the components of variance given in the rest of the columns.
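The table footnote and the percentage deviations quoted in the text can be checked with a few lines of arithmetic. The sketch below only re-uses the numbers printed in the table; the dictionary layout and variable names are ours and purely illustrative.

```python
# Components of variance per model, in the column order of the table:
# alpha_A, delta_A, alpha_B, delta_B, aa, ad, da, dd
components = {
    "NOIA": [0.45, 11.28, 9.75, 34.32, 20.78, 46.66, 8.18, 37.87],
    "G2A": [0.64, 12.25, 8.43, 28.66, 20.83, 46.28, 8.35, 38.19],
    "F2": [1.07, 13.84, 11.80, 31.08, 24.76, 44.56, 11.07, 38.44],
    "Finf": [59.46, 212.83, 0.01, 0.80, 24.76, 90.72, 23.94, 169.37],
}
# Total genetic variances as reported next to the mean in the table
reported_total = {"NOIA": 169, "G2A": 164, "F2": 177, "Finf": 581}

# Sum of the components for each model
summed = {model: sum(values) for model, values in components.items()}
for model in components:
    print(model, round(summed[model], 2), reported_total[model])

# Additive effect of locus A (estimate, variance) for three of the models
alpha_a = {"NOIA": (1.00, 0.45), "G2A": (1.18, 0.64), "F2": (1.53, 1.07)}


def percent_deviation(model):
    """Deviation from NOIA, in percent, for (effect, variance component)."""
    ref_effect, ref_var = alpha_a["NOIA"]
    effect, var = alpha_a[model]
    return (100 * (effect - ref_effect) / ref_effect,
            100 * (var - ref_var) / ref_var)


for model in ("G2A", "F2"):
    print(model, [round(x) for x in percent_deviation(model)])
```

The sums agree with the reported totals to within rounding, and the deviations of the additive effect of locus A and of its variance component reproduce the 18/42% and 53/138% figures quoted for G2A and F_{2}.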

From the statistical estimates in _{1}_{1}_{1}_{1}” as reference genotype, the estimates of functional genetic effects, and the standard deviations associated to these estimates, are shown in _{1}_{1}_{1}_{1}”.

_{B} = −10.33 | _{B} | |

σ_{aB} | σ_{dB} | |

_{A} | ||

σ_{aA} | σ_{aa} | σ_{ad} |

_{A} | ||

σ_{dA} | σ_{da} | σ_{dd} |

QTL on chromosome 2 (486 cM).

QTL on chromosome 3 (117 cM).

To illustrate the usefulness of these functional genetic effects for understanding how epistatic effects can contribute to phenotype change, we consider the role of this QTL pair in increasing the growth rate in the Red junglefowl. For simplicity, we assume hereafter that

To analyze what would happen if the two mutations were eventually present at the same time in the population, we also have to consider the interaction effects. The double homozygote for the White leghorn layer alleles increases the phenotype by roughly forty grams (four times _{1111}), relative to the expected value without epistasis, which is a decrease of roughly 20 grams from the Red junglefowl. In total, this makes the phenotype of the White leghorn layer 20 grams higher than that of the Red junglefowl. However, to inspect whether these results support the White leghorn layer alleles being likely to reach fixation, we also need to consider the phenotypes of the heterozygotes. Interactions involving dominance in locus _{2}. The role of allele _{2} is not as obvious, since _{1}_{2}_{2}_{2}” is roughly 30 grams higher than the Red junglefowl (computed again from

The statistical formulation of NOIA is orthogonal under random deviations from ideal experimental populations and outbreeding pedigrees

Second, we used experimental data on epistatic QTL from a previously published study

After model selection and the estimation of genetic effects have been properly carried out using an orthogonal model, the obtained estimates provide the effects of allele substitutions in the sample of individuals used in the study, and the decomposition of variance is also the appropriate one in that particular sample of individuals. The NOIA model provides convenient tools for transforming those estimates into the ones with any other desired meaning, like the orthogonal estimates and the decomposition of variance in a different population

One example of this is removing from the estimates the characteristics of the data that are not supposed to be properties of a target population. The departures from HWP in the experimental data we dealt with in this article are in fact supposed to be due only to sampling, instead of being caused by real Hardy-Weinberg disequilibrium in the F_{2} population. If we were interested in the genetic effects or in the decomposition of variance of the ideal F_{2} as a target population (in which the departures from HWP are absent), we could use the transformation tool of NOIA to obtain, from the original estimates with the reference of the mean of the sample population, the ones with the reference of the mean of an ideal F_{2} population. Further, as illustrated in the example with real data, it is possible to transform statistical estimates of genetic effects into functional ones, using a particular reference genotype. Another situation in which these transformations are valuable is, for instance, a three-locus genetic system with pairwise epistasis. In this case, NOIA would make it easy to retain only the significant genetic effects and to recompute the genotypic values from them alone (assuming the non-significant third-order interactions to be zero).
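The change of reference described above can be sketched for a single locus as follows. This is a minimal illustration, not the implementation used in this article: it assumes the single-locus NOIA statistical and functional genetic-effects design matrices in the form we recall from the original NOIA formulation, and the sample frequencies and effect values are hypothetical. The transformation itself, E_2 = S_2^{-1} S_1 E_1, only requires that both models describe the same genotypic values.

```python
import numpy as np


def noia_statistical(p11, p12, p22):
    """Single-locus statistical design matrix (assumed NOIA form).

    Rows are genotypes 11, 12, 22; columns are mean, additive, dominance.
    """
    n_mean = p12 + 2.0 * p22                  # mean count of allele 2
    v = p11 + p22 - (p11 - p22) ** 2          # variance of that count
    return np.array([
        [1.0, 0.0 - n_mean, -2.0 * p12 * p22 / v],
        [1.0, 1.0 - n_mean, 4.0 * p11 * p22 / v],
        [1.0, 2.0 - n_mean, -2.0 * p11 * p12 / v],
    ])


# Functional matrix with genotype 11 as reference (assumed form):
# G11 = R, G12 = R + a + d, G22 = R + 2a
S_FUNCTIONAL = np.array([[1.0, 0.0, 0.0],
                         [1.0, 1.0, 1.0],
                         [1.0, 2.0, 0.0]])


def change_reference(s_from, s_to, effects_from):
    """Return E_to = S_to^{-1} S_from E_from (same genotypic values)."""
    return np.linalg.solve(s_to, s_from @ effects_from)


# Hypothetical sample that departs from HWP, with hypothetical estimates
s_sample = noia_statistical(0.30, 0.45, 0.25)
e_sample = np.array([10.0, 1.5, 0.8])         # mean, additive, dominance

# Reference: mean of an ideal F2 population
s_ideal_f2 = noia_statistical(0.25, 0.50, 0.25)
e_ideal_f2 = change_reference(s_sample, s_ideal_f2, e_sample)

# Reference: individual genotype 11
e_functional = change_reference(s_sample, S_FUNCTIONAL, e_sample)

print(e_ideal_f2)
print(e_functional)
```

With frequencies ¼, ½, ¼ the statistical matrix reduces to the classical F_{2} parameterization, which is why the ideal F_{2} can be treated as just another reference.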

Statistical models of genetic effects are necessary for QTL analysis and for performing orthogonal decompositions of the genetic variance in populations. Functional models of genetic effects, on the other hand, are convenient, especially in the presence of epistasis, for studying evolutionary properties of populations such as adaptation in the presence of drift and speciation (see _{2} experimental population, into functional genetic effects as allele substitutions performed from a reference individual. Concerning these functional estimates of genetic effects, we have shown in the previous section how they can improve the understanding of the genetic system by inspecting a two-locus model obtained from real data. Notice that when changing the reference of the model, the genetic effects can change their magnitudes and even their signs (see _{1}_{1}_{1}_{1}”, the genetic effects have to be described with a model that uses that particular genotype as reference point. Those are the only ones that are meaningful for analyzing the problem under consideration.
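The dependence of functional effects on the reference genotype can be illustrated with the genotypic values of the simulated two-locus system. The sketch below is an assumption-laden illustration: it uses the functional single-locus matrix in the form we recall from NOIA, builds the two-locus matrix as a Kronecker product, and obtains the matrix for the opposite homozygote as reference simply by relabelling the alleles. It does not use the real-data estimates.

```python
import numpy as np

# Functional single-locus matrices, rows = genotypes 11, 12, 22
# Reference 11: G11 = R, G12 = R + a + d, G22 = R + 2a (assumed form)
REF_11 = np.array([[1.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0],
                   [1.0, 2.0, 0.0]])
# Reference 22: same model after swapping the allele labels
REF_22 = np.array([[1.0, 2.0, 0.0],
                   [1.0, 1.0, 1.0],
                   [1.0, 0.0, 0.0]])

# Genotypic values of the simulated system: the 3x3 table is the outer
# product of (-0.5, 1.5, 1.5) with itself, flattened row by row
g_locus = np.array([-0.5, 1.5, 1.5])
g_values = np.kron(g_locus, g_locus)


def functional_effects(ref_matrix, g):
    """Solve G = (S x S) E for the two-locus functional effects E.

    Effect order: R, a_B, d_B, a_A, aa, ad, d_A, da, dd.
    """
    return np.linalg.solve(np.kron(ref_matrix, ref_matrix), g)


e_1111 = functional_effects(REF_11, g_values)
e_2222 = functional_effects(REF_22, g_values)

# Coefficients of the double homozygote 22,22 relative to reference 11,11:
# R + 2 a_B + 2 a_A + 4 aa
double_homozygote_row = np.kron(REF_11, REF_11)[-1]

print(e_1111)
print(e_2222)
print(double_homozygote_row)
```

The last row of the two-locus matrix shows why the additive-by-additive effect enters the double homozygote four times, and comparing the two effect vectors shows dominance effects changing sign when the reference genotype changes.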

The computation of genetic effects using NOIA in the example with real data required the theory developed in this article, namely the implementation of the model to handle missing data (1). When performing IM to search for the positions and estimates of genetic effects in QTL mapping experiments, missing data occur at two levels. First, the genotype of the QTL located in a marker interval is not known and needs to be estimated from the observed flanking marker genotypes. Second, in most experimental datasets there are missing genotypes for many genetic markers that can be imputed from genotypes at closely linked informative markers. Thus, the implementation of HKR with NOIA enables us to perform IM with a regression method while using a model of genetic effects that is orthogonal regardless of how far the available data are from the HWP.

The HKR has been assessed as a good approximation of IM when dense marker maps are available and missing data are few and random

Models of genetic effects need to be further generalized. Two important cases that need to be accounted for are multiple-alleles and LD, which have been addressed in several recent publications dealing with statistical models of genetic effects. Yang

We use a simulated numerical example to show how departures from the HWP affect the estimates of genetic effects in several models of genetic effects. We simulate a trait controlled by two biallelic loci, _{2} population (_{2} population of 800 individuals in strict HWP and LE. From this population we subsequently removed 24 _{2}_{2} individuals and added eight _{1}_{1} and 16 _{1}_{2} individuals in a balanced way, without affecting the population size, the frequencies at locus _{1}_{1} versus _{1}_{2} individuals or LE. Only deviations from the HWP against the _{2}_{2} homozygote were introduced in the data. We repeated this procedure eight times in total and saved each population data, until only eight _{2}_{2} individuals remained. We measured the departures from HWP in these populations by computing the percentage of reduction of _{2}_{2} individuals relative to _{1}_{1}, which of course was zero in the ideal F_{2} population we started from.
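The sequence of simulated populations can be reproduced from the counts given above. The sketch below tracks only the genotype counts at the locus that was modified (the other locus stays at 200, 400, 200); the function names are ours, and the departure measure is our reading of "percentage of reduction of homozygotes 22 relative to homozygotes 11".

```python
def simulate_counts(n_steps=8):
    """Genotype counts (n11, n12, n22) at the modified locus.

    Starts from an ideal F2 of 800 individuals and, at each step, removes
    24 individuals of genotype 22 and adds 8 of genotype 11 and 16 of 12.
    """
    n11, n12, n22 = 200, 400, 200
    populations = [(n11, n12, n22)]
    for _ in range(n_steps):
        n11, n12, n22 = n11 + 8, n12 + 16, n22 - 24
        populations.append((n11, n12, n22))
    return populations


def departure_percent(n11, n12, n22):
    """Reduction of 22 homozygotes relative to 11 homozygotes, in percent."""
    return 100.0 * (1.0 - n22 / n11)


populations = simulate_counts()
for counts in populations:
    print(counts, round(departure_percent(*counts), 1))
```

The population size stays at 800 and the ratio of 11 to 12 individuals stays at 1:2 throughout; the last population retains eight 22 homozygotes, which corresponds to the 97% departure quoted for the most extreme case.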

We analyzed the simulated data by computing the genetic effects of the system using three models: NOIA, G2A and F_{2}. The F_{2} model, described in _{2} populations, although it is only orthogonal in ideal F_{2} populations with the genotypic frequencies being exactly ¼, ½, ¼. The NOIA model is as described in _{1}. The genetic effects were computed for each individual genotype using the genetic-effects design matrices and the estimates of genetic effects from each of the three models, which produced different outcomes. The additive, dominance and interaction variances were obtained as the corresponding sums of the variances of each genetic effect (for instance, the sum of the variances of the additive effects of each of the loci gives the additive variance).
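The variance decomposition described above can be sketched for the most extreme simulated population as follows. The sketch assumes the single-locus NOIA statistical matrix in the form we recall from the NOIA formulation, linkage equilibrium between the two loci, and the genotypic values of the simulated system; the G2A model is omitted and all function names are illustrative.

```python
import numpy as np


def noia_statistical(p11, p12, p22):
    """Single-locus statistical design matrix (assumed NOIA form)."""
    n_mean = p12 + 2.0 * p22
    v = p11 + p22 - (p11 - p22) ** 2
    return np.array([
        [1.0, 0.0 - n_mean, -2.0 * p12 * p22 / v],
        [1.0, 1.0 - n_mean, 4.0 * p11 * p22 / v],
        [1.0, 2.0 - n_mean, -2.0 * p11 * p12 / v],
    ])


# Classical F2 parameterization, rows = genotypes 11, 12, 22
F2_MATRIX = np.array([[1.0, -1.0, -0.5],
                      [1.0, 0.0, 0.5],
                      [1.0, 1.0, -0.5]])


def decompose(s_a, s_b, freq_a, freq_b, g):
    """Effects, per-effect variances and total genetic variance.

    Effect order: mu, alpha_B, delta_B, alpha_A, aa, ad, delta_A, da, dd.
    """
    s = np.kron(s_a, s_b)                  # two-locus design matrix
    p = np.kron(freq_a, freq_b)            # two-locus frequencies under LE
    effects = np.linalg.solve(s, g)
    contrib = s * effects                  # contribution of each effect
    components = p @ (contrib - p @ contrib) ** 2
    total = p @ (g - p @ g) ** 2
    return effects, components, total


g_locus = np.array([-0.5, 1.5, 1.5])
g_values = np.kron(g_locus, g_locus)       # the 3x3 table, row by row
freq_a = np.array([0.25, 0.50, 0.25])      # unmodified locus
freq_b = np.array([264, 528, 8]) / 800     # most extreme population

_, noia_comp, total = decompose(noia_statistical(*freq_a),
                                noia_statistical(*freq_b),
                                freq_a, freq_b, g_values)
_, f2_comp, _ = decompose(F2_MATRIX, F2_MATRIX, freq_a, freq_b, g_values)

additive_noia = noia_comp[1] + noia_comp[3]
additive_f2 = f2_comp[1] + f2_comp[3]
print(total, noia_comp.sum(), f2_comp.sum())
print(additive_noia, additive_f2)
```

Under these assumptions the NOIA components add up to the total genetic variance, whereas the F_{2} components do not, and the F_{2} model gives a smaller additive variance than NOIA.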

We recall the required theory behind the HKR and NOIA in _{S}_{2} model in _{11}, _{12} and _{22} in the NOIA statistical formulation (S5) are the exact genotype frequencies at the considered loci. In the HKR, the genotype frequencies are not known, but can be estimated as:^{*} be the column-vector of observed phenotypes, ^{*}_{k}_{k}_{S}_{AB}
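The combination of HKR with NOIA can be sketched at a single putative QTL position as follows. This is an illustration under stated assumptions, not the implementation used for the analyses in this article: the genotype frequencies are taken to be the averages of the individual genotype probabilities, the regressors are the expected rows of the single-locus NOIA statistical matrix (in the form we recall from the NOIA formulation), and the genotype probabilities and phenotypes are simulated with arbitrary values rather than derived from flanking markers.

```python
import numpy as np


def noia_statistical(p11, p12, p22):
    """Single-locus statistical design matrix (assumed NOIA form)."""
    n_mean = p12 + 2.0 * p22
    v = p11 + p22 - (p11 - p22) ** 2
    return np.array([
        [1.0, 0.0 - n_mean, -2.0 * p12 * p22 / v],
        [1.0, 1.0 - n_mean, 4.0 * p11 * p22 / v],
        [1.0, 2.0 - n_mean, -2.0 * p11 * p12 / v],
    ])


rng = np.random.default_rng(1)
n = 300

# Hypothetical data: true genotypes and imperfect genotype probabilities,
# standing in for probabilities computed from flanking markers
genotype = rng.choice(3, size=n, p=[0.3, 0.5, 0.2])
probs = np.full((n, 3), 0.05)
probs[np.arange(n), genotype] = 0.90       # each row sums to one

# Genotype frequencies estimated as mean genotype probabilities (assumption)
p11, p12, p22 = probs.mean(axis=0)
s = noia_statistical(p11, p12, p22)

# Regressors: expected design row of each individual
x = probs @ s

# Hypothetical phenotypes with arbitrary genotypic values and unit noise
genotypic_values = np.array([10.0, 12.0, 11.0])
y = genotypic_values[genotype] + rng.normal(0.0, 1.0, size=n)

# Ordinary least squares: mean, additive and dominance estimates
estimates, *_ = np.linalg.lstsq(x, y, rcond=None)
print(estimates)
```

Because the frequencies entering the matrix are the column means of the probabilities, the additive and dominance regressors have zero mean in the sample, so the first estimate equals the sample mean of the phenotypes.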

Carlborg _{2} intercross of roughly 800 individuals between one Red junglefowl male and three White leghorn females. A simultaneous two-dimensional genome scan was performed to identify pairs of interacting loci regardless of whether their marginal effects were significant or not. We have studied in more detail one of the detected pairs involving QTL on chromosome 2 (486 cM) and 3 (117 cM), hereafter loci

We have computed the genetic effects of the epistatic pair involving loci _{∞} model, which was the one also used by Carlborg _{2} model, which was designed for F_{2} populations. Third, the G2A model, which can account for departures of the gene frequencies from ½, and finally the statistical formulation of NOIA, which can adapt to the genotype frequencies of the sample used for the estimation of QTL effects. In these analyses we have made use of the theory developed in this article: the implementation of HKR with NOIA. These developments enable us to deal both with missing data and with the estimation of genetic effects at positions inside the marker intervals.

Álvarez-Castro and Carlborg _{1}⋅_{1}, and the inverse of the _{2}⋅_{2}:_{1} can be expressed as functions of the estimates in _{2} as:_{2}, can be computed from the ones in _{1} as:_{2}, _{2}, from the vector of variances of the estimates _{1}, _{1}, we just rewrite (3) in algebraic notation as:

Background information on the HKR and NOIA. Concepts and equations related to the original formulation of the HKR and to the NOIA statistical formulation that will help the reader to understand in more depth some details of the methods used in the article.

(0.09 MB DOC)

The authors thank Lars Rönnegård and Carl Nettelbald for fruitful discussions. Örjan Carlborg acknowledges funding from the Knut and Alice Wallenberg Foundation.