Epistasis Is a Major Determinant of the Additive Genetic Variance in Mimulus guttatus

The influence of genetic interactions (epistasis) on the genetic variance of quantitative traits is a major unresolved problem relevant to medical, agricultural, and evolutionary genetics. The additive genetic component is typically a high proportion of the total genetic variance in quantitative traits, despite that underlying genes must interact to determine phenotype. This study estimates direct and interaction effects for 11 pairs of Quantitative Trait Loci (QTLs) affecting floral traits within a single population of Mimulus guttatus. With estimates of all 9 genotypes for each QTL pair, we are able to map from QTL effects to variance components as a function of population allele frequencies, and thus predict changes in variance components as allele frequencies change. This mapping requires an analytical framework that properly accounts for bias introduced by estimation errors. We find that even with abundant interactions between QTLs, most of the genetic variance is likely to be additive. However, the strong dependency of allelic average effects on genetic background implies that epistasis is a major determinant of the additive genetic variance, and thus, the population’s ability to respond to selection.


Author Summary
Complex traits are influenced not only by the effects of individual genes but also by the myriad ways that these genes interact with one another, commonly referred to as epistasis. Theory suggests that epistasis could have important population-level implications in terms of the genetic variance components that govern evolution in response to natural or artificial selection. Unfortunately, empirical examples extending from observed interactions between genes to genetic variances are scant, particularly for natural populations. Here, we characterize epistasis between naturally segregating polymorphisms in M. guttatus and determine the cumulative effect of epistasis on population genetic variance components. To do this, we first elaborate the necessary statistical theory to accommodate estimation error in genetic effects, as failing to do so will upwardly bias variance predictions. We find that gene interactions have a net positive effect on both the total and Introduction Epistasis, the interactions between genetic loci, is an important determinant of phenotypes across a large number of taxa [1][2][3][4][5][6][7][8]. Yet for quantitative (complex) traits, the net effect of epistasis on the components of variation, specifically the additive genetic variance (V A ) that determines the response to natural or artificial selection, remains polemical. This is evidenced by a renewed debate over the evolutionary relevance of epistasis as exemplified by Crow [9] and Hansen [10]. An unfortunate source of confusion sustaining this debate is the simultaneous use of terms to describe both the effects of individual genes as well as the genetic variance components of populations (additive, dominance, and epistatic). It has long been known that high additive genetic variance does not imply additive gene action [11], a conclusion reiterated by the theoretical demonstration in Hill et al. [12] (see also [13]). However, there is little empirical evidence regarding the extent to which gene interactions determine the additive genetic variance, [14][15][16][17], which leaves several important questions unanswered. For instance, do interactions among genes tend to increase or decrease V A of traits, on average? As allele frequencies change in response to selection, does epistasis accelerate or dampen the corresponding change in V A ?
For a particular locus, the contribution to V A depends on the average effect of substitution at the locus and on the frequencies of the different alleles in the population [11]. The total V A is a simple sum over all loci affecting the trait. With epistasis, the average effect of a locus will change as the frequency of its epistatic partners in the population change [16]. Thus, the contribution of a locus to V A depends simultaneously on its own allele frequencies as well as the allele frequencies of all other segregating loci. Association mapping studies can estimate the V A contributed by a locus (e.g [7]), but this estimate is an average over the genetic backgrounds in the population. The extent to which locus-specific V A is determined by interactions with other loci remains unknown. An alternative to association mapping is to estimate genetic effects from genotypes produced from experimental crosses, with the remainder of the genetic background held constant. These genetic effects, often termed functional effects [18], can be defined as deviations from a reference genotype, and therefore, do not depend on an unknown distribution of genetic backgrounds. Given allele frequencies in a population, it is straightforward to calculate total and locus-specific variances based on these genetic effects. One can also calculate these variances under the assumption of additivity of loci (i.e. no epistasis). Contrasting these variances to those calculated from genetic effects based on multi-locus measurements provides a simple, direct demonstration of the effect of epistasis on V A [15]. Further, one can observe how this contrast changes as population allele frequencies change.
Standard equations used to calculate V A from genetic effects [11] assume that effects are estimated without error. Estimation error in genetic effects is often substantial even with large sample sizes, and failing to account for this error will result in an upward bias in variance predictions [19]. This is because genetic effects are squared and different effects are multiplied together when variances are calculated. As the true value for a genetic effect is the estimate minus a residual (the estimation error), treating the genetic effect estimates as the truth results in the inclusion of squared residual terms thus biasing variance components upwards. Luo et al. [19] derived a correction that incorporates the variance-covariance matrix of the genetic effect estimates for the case of a single locus. Here, we extend the bias-correction to multiple loci, accommodating epistatic terms, and demonstrate its validity using parametric bootstrapping. Then, we use the corrected variances to explore the effects of epistasis on the total and additive genetic variances under different models of allele frequency.
We consider loci affecting floral morphology and development rate that are polymorphic within a single population of Mimulus guttatus (yellow monkeyflower). Mimulus guttatus is an emerging model in evolutionary genetics; notable for its high degree of phenotypic and genetic variation within and between populations as well as its ability to adapt to novel environments [20]. The species is broadly distributed across western North America and ranges from high alpine to low-elevation coastal environments. In Mimulus guttatus, multiple studies have shown that epistasis contributes to within-population variation in floral morphology, development time, and fitness components [3,21]. Interaction effects are routinely of the same magnitude as single-locus effects, although the magnitude and direction of epistasis between loci is highly variable.
In this study, we maximize statistical power to estimate direct and epistatic effects between QTL using Double-NIL lines (DNILs, hereafter), in which two loci segregate in an otherwise isogenic genetic background. We confirm the finding of important but highly variable epistasis between these QTL, but also quantify the contribution of epistasis to population genetic variance using the genetic effect estimates and a model of allele frequency. We find that the average effect, which determines the locus-specific response to selection, depends heavily on genetic background or rather the frequency of different genetic backgrounds in the population (i.e. the genotypes at all other loci). For some traits, this leads to an average increase in genetic variance components, whereas in others, the effect is opposite albeit minimally. Overall, it is clear that epistasis is an important determinant of both individual phenotype as well as the genetic variance components, which govern the ability of a population to respond to natural or artificial selection.

Genetic lines
The DNILs were derived from a previous study by Kelly and Mojica [3]. The process of mapping the original QTL began with a large-scale artificial selection experiment on a collection of lines derived from a single natural population at Iron Mountain in Central Oregon. This resulted in populations with highly divergent floral traits [22]. Individuals from the tails of the distribution were randomly selected and crossed to produce three F1 populations and each of these populations were backcrossed for six generations to IM767, a commonly used inbred line with medium floral trait values derived from the same natural population [23]. This resulted in 3 panels of Nearly Isogenic Lines (NILs; 493 NILs in total), with each NIL containing a random segment of donor genome from one of the three F1 populations. NILs were measured for corolla width, and selective genotyping of NILs from the tails of the distribution identified 7 QTLs affecting corolla width. Three rounds of background-cleaning were performed for each of the 7 NILs, using a combination of selfing and backcrossing to IM767 to eliminate segments of donor, non-IM767 genome from other parts of the genome.
The seven NILs containing the QTL were crossed in each possible pairing to produce 21 F1's each of which contained solely the double-heterozygote for the donor alleles present in their parents. A single individual from each F1 was self-fertilized, and the resulting F2's were genotyped at the relevant loci (see S1 Table for the list of diagnostic markers for the 7 QTL). The four true-breeding (double-homozygous) genotypes from each F2 were set aside and self-fertilized, in order to serve as the parents of a single Double-NIL line set. Each DNIL is essentially a collection of nine genotypes corresponding to two biallelic QTL segregating in an otherwise uniform genetic background. For a single DNIL, we create these nine genotypes by selfing the true-breeders as well as crossing them in all possible directions to produce the five heterozygous genotypes. With seven QTL, there are 21 possible pairs of loci; however, this study examines only 11 of the DNILs owing to loss of several lines used in Kelly and Mojica [3].

Phenotype data
Plants were grown in the University of Kansas greenhouse in five large cohorts. All nine twolocus genotypes for a particular Double-NIL were included in a cohort, which meant that only a subset of Double-NILs could be included in any one cohort. Within a cohort, multiple seed families from each genotype were sprinkled into unique 2x2 in. pots and watered generously. After approximately 10 days, individual plants were transplanted to their own 2x2 in pot. The pot locations were randomized initially and rotated regularly to avoid effects of inconsistent conditions within the greenhouse. Plants were watered every other day following transplant and fertilized once a week.
Upon flowering, plants were measured for several traits: corolla width (CW), distance between stigma and anther (SA separation), pistil length, and the number of days until first flower (DTF). Measurements were taken on all open flowers present at time first flower, which was typically only one or two. The corolla width is the widest distance of the flattened width of the lower lip of the corolla, while the rest of the measurements are self-explanatory. Kelly and Mojica [3] measured the double-homozygotes for 17 of the 21 DNILs. For this study, we elaborated measurements to include all nine two-locus genotypes for 11 of the 21 DNILs. As there is significant overlap between individuals in this and the aforementioned study, we combined the relevant data from Kelly and Mojica [3], which included plants grown in seven distinct cohorts. This resulted in a highly unbalanced dataset, but provided greater accuracy for estimating particular genotypic means. All individuals were grown at the KU greenhouse under the same watering and fertilizer regiment.

Confirmation genotyping
Crossing two of the true-breeding genotypes within a DNIL will necessarily result in genetically identical heterozygous offspring; however, we genotyped a subset of individuals from each cohort via touch-down PCR at gene-based markers diagnostic of particular QTL (S1 Table) in order to confirm that progeny genotypes were as expected (incorrect genotypes occasionally result from mislabeling or accidental pollen transfer during selfing/crossing). The few individuals with incorrect genotypes that were identified (typically < 5% per cohort) were removed from the analysis along with all of their siblings. PCR fragments were analyzed on ABI 3130 BioAnalyzer, and genotype calls were made using GeneMapper (Applied Biosystems, Foster City, CA, USA).

The overall test for epistasis
We performed a likelihood ratio test to compare a full and reduced model for each trait corresponding to models with and without epistatic parameters, respectively. In R, we fit each model using REML as implemented in the "lme4" package followed by a call to the "anova. merMod()" function [24]. This produces a likelihood for each model from which a likelihood ratio is calculated and compared to a Chi-squared distribution with degrees of freedom equal to the difference in the number of parameters between the full and reduced model. There were 60 degrees of freedom in the full model and 16 in the reduced model, which corresponds to the number of genetic effects plus the cohort and family effect. The difference, 44, is the degrees of freedom for each of the likelihood ratio tests.

Linear model for estimation of genetic effects
We estimated the single-locus and epistatic genetic effects using the NOIA functional genetic effect model [18]: where Here, Z is the trait value, C k is the random effect due to cohort, F l is the effect of seed family (environmental maternal effect), which is nested within genotype; a and d are the single-locus effects, and aa, ad, da, and dd are the epistatic effects. The residual variance applies to variance within families. The X a and X d variables (corresponding to the design matrix in [18]) are numerical values that, together, specify an individual's diploid genotype at a locus. In this case, W is the donor allele and w is the reference IM767 allele at a QTL. For a pair of loci, the four pairwise products of these variables provide contrasts by which the four analogous epistatic parameters are estimated. For a completely homozygous IM767 individual (ww at all loci), all X a and X d terms are 0, and therefore, the standard inbred IM767 line serves as the reference point in the functional NOIA model by which all genetic effects are defined as deviations from. These genetic effects can then be used (as described in the proceeding section) to generate predicted genotypic values for multi-locus genotypes. To determine the predicted genotypic values in the absence of epistasis (non-epistatic values), we fit separate models to estimate single-locus effects using only data corresponding to single-locus genotypes (essentially, the NIL genotype data that forms a subset of the DNIL data). Again, we use the functional NOIA parameterization (Eq 1 without epistatic terms), specifying the homozygous IM767 genotype as the reference point. By this method, we define the non-epistatic value as the predicted multi-locus genotypic value given only information from individual loci in an isogenic background.
It should be noted that the NOIA model has both a statistical formulation and a functional formulation, which is used here. Effectively, Eq 1 is the traditional animal model of genetic effects, and the functional NOIA (referred to as NOIA, hereafter) simply refers to the index variables used to specify an individual's genotype. Here, we require functional genetic effects, in order to predict variances for any set of population allele frequencies. We investigated alternative, functional parameterizations for the index variables including the F 1 model [25] and the unweighted-regression (UWR) of Cheverud and Routman [15], but these models use a different reference point and the parameters have a different quantitative interpretation. F 1 and UWR yield the same predicted genotype values as NOIA (the models are inter-convertible), but we prefer the NOIA because parameters are defined as deviations from a reference genotype and, therefore, are more clearly interpretable between the full model (Eq 1) and the reduced (non-epistatic) model (fit to the reduced dataset). As a result, we find that the nonepistatic coefficients of NOIA (a and d terms of Eq 1) are "stable." If we fit the full NOIA model (all terms) to the full dataset (all genotypic combinations included) we get estimates for the a and d terms that are nearly equivalent to what we get when we fit the reduced NOIA model (no interactions terms) to the reduced dataset (plants of the reference genotype plus those that differ from the reference genotype at only one QTL). This is not true of analogous estimates from the F 1 or UWR models, which do not use a common reference genotype as the reference point. While this consistency is convenient for the interpretation of genetic effects, it is not crucial to the results. It is the genotypic values predicted with and without epistasis that serve as the basis for determining the effect of epistasis on genetic variance, and as stated previously, the multiple parameterizations that we investigated all provide the same predicted values.
We used REML (implemented in JMP, Version 11. SAS Institute Inc., Cary, NC, 1989-2014) to estimate the fixed genetic effects as well as accommodate random effects (cohort and family) in the model. While Eq 1 is specified for only 2 loci, all relevant genetic effects were included in the linear model and fit to the entire DNIL (full model) or NIL data (reduced model). Fitting this larger model accommodated the fact that many DNILs have overlapping genotypes. Models were fit separately for each trait. In total, there were 14 single-locus effects (seven 'a' terms and seven 'd' terms) as well as 44 epistatic terms (four epistatic terms per DNIL x 11 DNILs). Model fits were based on 4263 measurements carried forward from Kelly and Mojica [3] plus 6234 measurements from the five additional grow-ups.

From effects to variances
Estimation error in genetic effect estimates must be properly accommodated because genetic effect estimates are squared and different effects are multiplied together, when calculating variances. This can introduce bias with or without epistasis. Consider the single locus, 2-allele model [11], where the additive genetic variance (V A ) is Here, a and d are the additive effect and dominance deviation, respectively, and p and q are the frequencies of alternative alleles. An experimental study will yield estimates for the genetic effects,â andd, but even if unbiased, these estimates will be encumbered with estimation error: Here, the γ are residuals; random variables with mean 0 and a variance contingent on experimental design (e.g. sample sizes). If direct substitution is used to estimate V A , i.e. V A ¼ 2pq½â þdðq À pÞ 2 , bias is introduced: The second term of the sum, involving the estimation variances (Var[γ a ], Var[γ d ]) and the covariance of errors (Cov[γ a , γ d ]), is the bias. A bias corrected estimate, denoted V Ã A , can be derived using standard dispersion statistics: where s 2 a is the estimated variance of γ a (the squared standard error ofâ), s 2 d is the estimated variance of γ d and s ad is the sampling co-variance. This statistical issue has been addressed for a single locus [19], and we here generalize bias-correction for genetic variance predictions when there are interactions among loci. We extend the logic of Eq 5 to all eight genetic effect estimates associated with each QTL pair (see "Linear model for estimation of genetic effects" section above).

Prediction of variance components
The genetic variance of a trait for a particular population is a function of the individual genotypic values and their frequency in the population. For a particular multi-locus genotype (u), we can calculate the predicted genotypic value using the genetic effect estimates from the linear model fit as follows:Ẑ whereb i is the estimate for the i'th effect (B is the total number of effects), and X i,u is the relevant indicator variable from Eq 1 for genotype u. If we let Z G represent the genotypic value of an individual drawn randomly from a population, then the total genetic variance, V G , is simply the variance of Z G , which can be found by, These expected values are functions of the genotypic values and the multi-locus genotype frequencies. For example, where Ω is the set of all possible multi-locus genotypes and F u is the population frequency of the multi-locus genotype, u. ExpandingẐ 2 G (temporarily suppressing the u subscript), we see thatẐ We see thatẐ 2 G is biased because E½b 2 i > b 2 i and E½b ibj 6 ¼ b i b j , essentially the same reason evident in Eq 4 We correct for this upward bias by subtracting off the relevant sampling variance/covariance term, such that the corrected value is, More generally, . Thus, the expected value for Eq 11 is Correcting the estimate for E[Z G ] 2 is slightly more involved because the full expansion (substituting Eq 6 for Z G ) produces terms of squares and cross-products within and across multi-locus genotypes. The bias corrected estimator for The additive genetic variance, V A , due to a set of L bi-allelic loci is where α k is the average effect of the allele with frequency p k at locus k. The average effect is Here, E[Z | W k W k ] is the mean phenotype across all multi-locus genotypes that are homozygous for allele 1 at locus k, E[Z | W k w k ] is the corresponding conditional mean for heterozygotes, and E[Z | w k w k ] is the mean for allele 2 homozygotes. Eqs 14-16 assume Hardy-Weinberg genotype proportions. We focus on the Hardy-Weinberg case, because without random mating, the additive genetic variance is not a sufficient statistic to predict response to selection [26][27][28].
The average effect is a linear function of genetic effects, and as a consequence, direct substitution of effect estimates into Eqs 14 and 15 yields unbiased estimates. However, when α k is squared (Eq 13), upward bias is introduced by estimation error. Computation of bias-corrected estimates α k 2 follows the same method of Eqs 7-13, although the relevant sums are over all loci except k (code to implement these calculations was written in C; available in supplemental information). Importantly, in these calculations, we assume that epistasis is absent for pairs of QTL corresponding to DNILs for which we have no measurements. We also assume no higher-order interactions. All parameter estimates from the linear model fit were incorporated regardless of their statistical significance. Our method accounts for uncertainty in parameter estimates by directly incorporating the sampling variance/covariance of estimates into the calculation of variance components. Predictions of V G , V A , and locus specific α k were generated for each set of simulated allele frequencies.

Allele frequency model
We investigated the distributions of genetic variance components under two differing allele frequency models, a uniform distribution and a U-shaped distribution as in Hill et al. [12]. Allele frequencies were sampled independently for each locus to create a set of 7 frequencies per set. We drew 200 sets of frequencies from each distribution, and these have been included in the supplemental information. For each set of allele frequencies and for each trait, we calculated the corrected and uncorrected genetic variance components variance (Eqs 6-16). We performed this operation, first, using the entire suite of single-locus and epistatic genetic effects to predict genotypic values, and then a second time, using only the single-locus genetic effects estimated from the reduced data set consisting only of single-locus genotypic data. This allows us to observe the effect of epistasis on the total and locus-specific genetic variance components.

A simulation test of the bias-correction procedure
The bias-correction procedure will produce unbiased estimates of variance components if the estimated sampling (co)variances (sb ibj ) are unbiased estimates of the true sampling (co)variances. However, precision of bias-corrected estimates may be reduced, as including sb ibj in the calculations could increase sampling variance of bias-corrected estimates, particularly if sb ib j terms are large. We performed parametric bootstrapping to determine the effect of our procedure on the precision of estimates as well as to confirm its efficacy in eliminating bias. Using the estimated genetic effects as well as the within-group variance estimated from the linear model fits, we simulated trait values for individuals of particular genotypes producing 500 replicate data sets. Each simulated dataset for each trait was the same size and structure as the original dataset. We then estimated the genetic effects and sampling (co)variance matrices for each replicate dataset and calculated the corrected and uncorrected genetic variance components for two sets of allele frequencies. For the first set, the reference allele frequency was set equal to 0.5 for all loci. For the second set, the reference allele frequency was set equal to 0.05, and thus, the donor allele frequency was set to 0.95 (see "Linear model for estimation of genetic effects" section above for definition of reference vs. donor). To determine the bias exhibited by each variance calculation, we calculated the true variances given the genetic effects that we specified to simulate the data. We calculated the mean square error for each distribution of variance calculations by dividing the sum of squared deviations from the true value by the number of replicates (500). To standardize the mean square error, we divided the sum by the square of the true variance. In addition, we calculated the standardized bias as x Ã Àx x , where x Ã is the average of estimates and x is the true value.

Prevalence and patterns of epistasis between QTL
There is strong evidence for epistasis for all of the traits (Table 1), consistent with the prior study of these loci based solely on homozygous genotypes [3]. Concerning the individual terms of the models, we find that epistatic genetic effects are occasionally significant and typically of the same order as single-locus effects (S2 Table and S3 Table). There was no clear trend towards positive or negative epistasis, although additive-by-additive interactions are more frequently observed to be significant than other forms. Particular types of epistasis are illustrated by example in Fig 1 (see S1 Fig for the full collection of graphs), which contrast the predicted genotypic value with epistasis (the bars) to the corresponding non-epistatic value (the 'X'). The non-epistatic value is the genotypic value predicted using only the relevant single-locus effects (a and d terms) estimated from a linear model fit based on single-locus genotype data (essentially, NILs). The epistatic genotypic value is based on all genetic effects (single-locus and epistatic) estimated from the full linear model fit to the DNIL data. The deviation between these values is the contribution of epistasis to the observed genotypic value. Fig 1E provides an example of sign epistasis, wherein the positive effect of the donor allele at QTL x8 in the QTL x10a AA background exhibits a negative effect in the Aa background. Fig 1A-1C demonstrate the potential for consistent epistatic deviations for certain genotypes across multiple traits; particularly, the AABb genotype exhibits striking positive epistasis for all morphological traits. Conversely, some genetic backgrounds have variable effects when combined with other loci. For instance, positive synergism is observed for QTL x1 in the QTL x9 AA background (Fig 1D), whereas negative diminishing returns epistasis is observed for QTL x5a in the same AA background of QTL x9. Lastly, Fig 1A-1C also provides evidence of the potential of epistasis to modify dominance relationships. In the AA background, overdominance emerges for all traits, despite single-locus predictions of partial dominance (Fig 1A and 1B) and underdominance ( Fig 1C).
When considered together, the epistatic deviations (deviation between the 'X's and bars in Fig 1) illustrate the pattern of epistasis. Focusing on only the genotypes unobserved in the single-locus NIL model fit (AABB, AaBB, AABb, and AaBb), we find that the average epistatic deviation is near zero for corolla width, SA separation, and pistil length (-0.04, 0.01, and 0.06 respectively), but is more appreciable for days to flower (-0.49). This indicates that plants tended to flower earlier than expected based on non-epistatic predictions. The standard deviations of the epistatic deviations speaks to the variability of epistatic effects and are 0.43, 0.20, 0.30, and 0.68 for corolla width, SA separation, pistil length, and days to flower, respectively. Comparing the sum of absolute deviations from the reference genotype for non-epistatic versus epistatic values provides additional information on the pattern of epistasis. If we subtract the sum of epistasis values from the sum of non-epistatic values, a negative difference would indicate synergism (sometimes called positive epistasis), wherein epistasis gives rise to greater deviations from the reference point, on average. The converse would indicate a diminishing effect of mutations under epistasis. For corolla width, days to flower, pistil length, and SA separation, the percent difference between the sum of absolute deviations for no-epistasis vs. epistasis was 0.04, 0.02, -0.11, and 0.01, respectively. Evidence for cumulative synergism or diminishing returns is rather weak for all cases.

Pleiotropy and correlation among traits
Pleiotropy is common for both single-locus and epistatic effects (S2 Table and S3 Table). Effects were typically significant for between two and three traits. Pleiotropy seems to be modular in the sense that QTL/DNILs with a significant effect on one floral trait tends to also affect other floral traits, in contrast to the day of flowering. Effects were significantly correlated between pistil length and corolla width (r = 0.47; p = 0.0002), between pistil length and days to flower (r = -0.28, p = 0.0305), and between pistil length and stigma-anther separation (r = 0.29, p = 0.0292; S4 Table for full list of values). The traits themselves were also strongly correlated (S5 Table). While some of the effects are in line with the correlational structure of the data (e.g. single-locus additive effect of QTL x10b), this was not a consistent pattern (e.g. single-locus additive effect of QTL x5a).

Extension of genetic effects to genetic variance predictions
Density plots of V A across the 200 simulated allele frequency sets (uniform distribution) calculated from the single-locus (termed "No Epistasis") and DNIL ("Epistasis") data are depicted in  (Table 2) regardless of the allele frequency distribution. As expected, average V A and V G are always larger for Uniform distribution compared to the Ushaped distribution, whereas the proportion of variance that is additive is greater for the Ushaped distribution. Notably, the genetic variance is mostly additive even in the presence of substantial epistatic interactions. Epistasis also significantly affected the shape (variance) of the distributions of genetic variance estimates in addition to the location (mean). This is evident in Fig 2, as well as S4-S10 Figs. Epistasis tended to increase the variance, oftentimes producing long tails representing the observance of more extreme values.
The bias-correction procedure significantly reduced estimated V G values (particularly for days to flower; DTF) although there was considerable variation among sets ( Table 2). The larger reduction due to bias-correction of DTF is expected given that estimation error is greatest for this trait and that this estimation error is directly related to the degree of bias in variance calculations. Bias-correction also reduced predicted V A , but to a lesser extent. As a consequence, the ratio of V A to V G is substantially greater for bias-corrected values (60-80% across the four traits) than for uncorrected variance components (45-60%). This is true regardless of whether one calculates V A /V G for each simulation replicate and then averages, or takes the ratio of mean V A to mean V G (as in Hill et al. [12]). Occasionally, unrealistic, negative values for V A result from the bias-correction procedure, and this tendency seems to be exacerbated by greater estimation error of the coefficient estimates. For example, effect estimates for Days to Flower routinely had the largest standard errors, and this is accompanied by many negative values (Fig 2, lower left panel). It should be noted that uncorrected variance estimates also produce negative values albeit to a lesser extent (Figs S8 and S10), which is true for any estimate whose standard error is large relative to its magnitude.
The distributions resulting from the bias-correction simulations demonstrate that the correction procedure is effective (Table 3 and Figs S11 and S12). The mean of corrected estimates matches the truth more closely than uncorrected estimates (Std. Bias in Table 3). There is a slight negative bias to the bias corrected estimate (typically 1-4%), perhaps because the sampling (co)variances of estimates are only approximate. The mean square error of corrected statistics is lower than for the uncorrected (Table 3). This is due entirely to the bias reductions given that the distributions of the corrected and uncorrected statistics have nearly identical variances.
The V A for traits is a weighted sum of squared average effects (Eqs 14-16). When considering the effects of epistasis on individual loci, we see dependence of the average effect, α, on genetic background (Fig 3 for selected examples, S2 Fig for full collection). The points in the figure are α values for a locus calculated from our set of 200 allele frequencies (uniform distribution), and the dashed line represents the best-fit line through the points. The solid black line Table 2. Mean and standard deviation (in parentheses) for additive and total genetic variance calculated with and without bias-correction (above and below the line, respectively), as well as with and without epistasis for both of the allele frequency distributions. shows the values for α without epistasis, which depend only on the genetic effects and allele frequency at that locus. If the locus showed entirely additive gene action, this line would be perfectly horizontal, whereas dominance gives rise to a non-zero slope. Over-or underdominance is implied when lines cross 0 on the y-axis. Deviations between the dashed and solid lines in Fig 3 demonstrate the effect of epistasis averaged over genetic backgrounds. A change in slope between solid and dashed lines indicates the statistical dominance effect depends on epistasis. In Fig 3A, we see that locus QTL x10a is predicted to contribute little to no V A without epistasis, but exhibits a substantial average effect when epistasis is considered. Fig  3C indicates a case in which epistasis does not affect dominance, but changes the sign of α. However, epistasis often affects the dominance properties of a locus as evidenced by differences in the slope of the dashed and solid lines. Epistasis is seen to make an additive locus exhibit dominance (Fig 3A and 3D), reduce dominance to near additivity (Fig 3B and 3E), and give rise to over or underdominance (Fig 3A and 3F). Some loci are relatively less sensitive (more robust) to background than others: Note the small scatter and relatively shallow trajectory of x5a relative to x10a for Days to Flower (Fig 3A and 3D).

Discussion
Epistasis is often a major factor in the mapping from genotype to phenotype [2,5,29,30], but its relevance to heritability and evolution remains contentious [9,10]. In part, this is due to persistent confusion about how genetic effects are defined and, thus, relate to genetic variation. Kempthorne [31] and Cockerham [32] defined genetic effects to be orthogonal, such that additive, dominance, and epistatic variance were each attributed solely to the corresponding genetic effects. The orthogonal property of these models is convenient, and the resulting estimates relate directly to evolutionary change. However, the estimated effects are specific to the population under consideration and depend entirely on allele frequencies. Epistatic effects defined in this way do not contribute to the additive variance (V A ), despite that interactions between genes certainly do affect V A . This has led many to the incorrect assumption that epistasis is unimportant for evolution. Functional models provide an alternative view that better reflect the physiological or molecular interactions that determine the mapping from genotype to phenotype [15]. A substantial body of theory and simulation has shown that gene interactions can be an important determinant of heritable variation and, thus, the response to selection [14,16,30]. Unfortunately, key empirical evidence about how epistasis influences heritability is lacking. Most basically, do interactions among genes tend to increase or decrease V A on average? How does epistasis affect the evolution of V A under sustained selection? How do interactions among QTLs affect the allele frequency dynamics that underpin changes in quantitative traits? While there are many empirical studies demonstrating gene interactions, nearly all lack a clear population context and heritability is a population-specific statistic. While crosses between divergent populations/species routinely reveal epistasis, the loci segregating in these crosses may never have been segregating (simultaneously) within a specific natural population (e.g. Dobzansky-Muller incompatibilities). More importantly, the magnitude and pattern of effects may not be representative of the segregating polymorphisms that determine genetic variation within populations, perhaps due to multiple, subsequent mutation in a locus [33] (see Hansen [34] for a discussion on the evolution of epistatic interactions). By focusing on epistasis between loci polymorphic in a single, natural population of M. guttatus, we provide an empirically calibrated evaluation of how gene interactions alter genetic variance components in nature. We find that epistasis has a net positive effect on both V A and V G for most measured traits, providing an empirical demonstration of epistasis in determining heritable variation and the response to selection.
The contribution of epistasis to genetic variance depends, in part, on the existence of particular patterns of epistasis among loci [14,35]. Synergistic epistasis among mutations, termed positive epistasis by [14] (but see [36]), will increase variability among genotypes, whereas diminishing returns epistasis should have the opposite effect. However, when interactions are variable and idiosyncratic as we observe in this study, the effect on genetic variance will likely depend more on allele frequencies in the population than any average pattern across loci. As allele frequencies change, the balance between positive and negative effects of gene interactions on genetic variance also changes in proportion to the relative frequency of different genotypic combinations. Indeed, we routinely observe both positive and negative effects in the genetic variance predictions for all traits across simulated allele frequency sets (Fig 2). Epistasis will thus facilitate evolution in some cases, but hinder it in others. It may do both at different points in time within the same evolving population.
The average tendency for epistasis to increase genetic variance for days to flower, pistil length, and SA separation suggest a positive average effect of interactions on the additive genetic variance. The exception is corolla width where the mean V A is slightly lower with epistasis; a result that could be anticipated from the observation in Kelly and Mojica [3] that corolla width QTLs had smaller effects in combination than predicted from individual effects in the isogenic background. For all traits, epistasis increased the variance of the V A across allele frequency sets (Fig 2 and Table 2). This speaks directly to how V A will change under directional selection. With only additive effects, V A is predicted to change slowly with selection, unless there are major loci and/or if allele frequencies are initially extreme. The greater dispersion of V A with interactions (Fig 2) suggests that epistasis is likely to generally accelerate changes in V A .
The strong dependency of the average effect of substitution at a locus on interactions with other genotypes (Fig 3) indicates that epistasis also has important implications for the dynamics of individual loci. With directional selection, the change in allele frequency at a QTL is determined by the locus-specific V A , which, in turn, is proportional to the average effect [37,38]. Our results suggest that average effects and, consequently, locus-specific V A will be highly malleable throughout the selection response as allele frequencies change simultaneously across all loci. This is an important consideration now that we are able to directly monitor allele frequency change within populations under sustained directional selection [39][40][41][42][43][44]. Allele frequency trajectories will be variable in finite populations, and epistasis is likely to amplify this variability, generating idiosyncratic responses to replicate selection events [42,[45][46][47]. Even when genetic drift is inconsequential, sign epistasis and emergent over/under dominance imply that the ultimate loss, fixation, or balance of an allele will depend on the order of fixation of alleles at other loci. A thorough characterization of epistasis is, therefore, necessary if one wishes to understand allelic dynamics in response to selection and how this translates to the observed change in trait means.
Characterizing higher order epistastic interactions involving three or more loci may be necessary for a complete understanding of the selection response in terms of the underlying loci. Nevertheless, studies of pairwise epistasis like this one provide important information on the relative role of at least a subset of possible interactions. First, they provide a baseline estimate of the genetic variance attributable to epistasis. Second, they provide insight regarding the interactive properties of a locus (i.e. how frequently and strongly does the locus exhibit epistasis, and to what degree does this influence the average effect at the locus). Lastly, pairwise estimates allow us to determine the improvement in predictive ability of the selection response due to the inclusion of epistasis. For example, Carlborg et al. [48] demonstrate that pairwise estimates of epistasis are necessary to predict the long-term selection response in an artificial selection experiment on growth in domestic chickens. While only a first step towards understanding the entire network of epistatic interactions, pairwise estimates illustrate the relevance of epistasis to heritable variation and the evolutionary process.
To extend from genetic effects to genetic variances, it is essential to accommodate estimation error. Otherwise, predictions will necessarily be upwardly biased. Though this issue has been addressed previously [19], it has gone unnoticed in the vast majority of studies extending genetic effects to variances. Even in studies utilizing highly replicated measurements from inbred lines, such as this one, estimation error is appreciable, and the greater the estimation error, the greater the upward bias in variance predictions. This is particularly disconcerting given the disproportionate effects on variance components documented in this study, as this will impact estimates of heritability in the narrow sense. We have shown that our method of correcting variance calculations does indeed remove this bias and remains as precise as uncorrected calculations, although the latter point is likely to depend on the sample size of a study and the inherent variation of a particular trait. In smaller studies, one must make the choice between the tradeoff of a biased estimate vs. an imprecise one. However, we stress that future studies attempting to unite genetic effects with heritability take greater care to accommodate the potential for estimation error to inflate estimates of genetic variance components.

Conclusion
The results of this study, documenting the role of variable epistasis in determining genetic variance components are timely given the renewed interest and debate on the subject [9,10]. Given that most genetic variance remains additive in the presence of epistasis and that additive variance is largely sufficient to predict the response to selection, it would seem, at first glance, as if epistasis is irrelevant. Upon further inspection, we find that epistasis contributes substantially to additive genetic variance, increasing it on average for most traits (Table 2 and Fig 2), which should accelerate the response to selection. We note however that epistasis reduces the additive variance for particular combinations of allele frequencies with all traits. Contrary to the perspective that epistasis will have only transient effects on selection dynamics due to allelic combinations held together by linkage disequilibrium [37], our results suggest that the principal effect of epistasis may be as a major determinant of V A [10,14], although empirical evidence supporting the generality of this conclusion is currently limited [15].
Semantics has been a major impediment to connecting epistasis and the additive variance; the terms additive, dominance, and epistatic effects are used in a broad range of genetic effect models, yet differ substantially both in their interpretation as well as their relationship to genetic variance. This has led several authors to make general statements regarding epistasis that may be valid in the context of their own study, but are incorrect generally. For instance, it is not uncommon for authors to simply claim that non-additive effects, such as dominance and epistasis, do not contribute to additive genetic variance [49] and, therefore, are unimportant for the evolution of polygenic traits [9]. While this is true of statistical models of genetic effects, it is not true of functional models, as this and several other articles have argued [10,14,15]. In addition to partially determining additive variance, epistasis implies that allelic dynamics will depend on initial frequencies, such that replicate selection events are expected to be largely idiosyncratic and perhaps unrepeatable in terms of changes at underlying loci. Therefore, understanding functional epistastic interactions are important for understanding the fate of individual genes as well as populations exposed to selection.
Supporting Information S1 Data. Phenotype date used for estimation of genetic effects as well as sampling (co)variance matrices. (XLSX) S1 Code. C code used to calculate variance components. (C) S1