Abstract
Gene-environment interaction (G × E) studies hold promise for identifying genetic loci mediating the effects of environmental risk on disease. However, interpretation of G × E effects is often confounded by two fundamental issues: the dependence of interaction estimates on outcome scale and the presence of endogenous treatment effects, in which genetic liability influences environmental exposure. These factors can induce apparent G × E signals—even when genetic and environmental contributions are purely additive on an unobserved scale. In this work, we demonstrate that any monotone convex transformation of an outcome induces sign-consistent G × E effects: the sign of the interaction term aligns with the sign of the corresponding main genetic effect. Convex transformations are a broad class of functions that include many commonly used data transformations, such as exponential and logarithmic functions, the square root, and other power transformations. We further show that endogenous treatment effects, modeled as threshold-based interventions, generate G × E effects with a similar directional signature. Exploiting this property, we propose a simple diagnostic: sign consistency across G × E estimates can signal when interactions are driven by outcome scaling or exposure endogeneity. We validate our framework in the UK Biobank using transcriptome-wide interaction studies (TxEWAS) across multiple trait–environment pairs, observing widespread sign consistency in some settings—suggesting confounding by scaling or treatment bias. Our results provide both a theoretical foundation and a practical tool for interpreting G × E findings, enabling researchers to assess whether the observed G × E signal may depend substantially on outcome scaling or be influenced by exposure endogeneity.
Author summary
Gene-environment interaction (G × E) studies examine the extent to which genetic differences modulate environmental impacts on individuals’ health outcomes. However, their results depend on how these outcomes are measured or modeled, and are often confounded by endogenous treatment effects, where exposure to an environment depends on the health outcome itself (for example, individuals with high blood pressure are more likely to receive blood pressure reducing medications). We demonstrate that both a wide class of scaling functions and endogenous treatment effects induce sign-consistent G × E: the direction of the interaction aligns with the direction of the main genetic effect. This property can be used as a diagnostic to assess when an apparent G × E signal could be driven by outcome scaling or exposure endogeneity.
Citation: Sadowski M, Dahl AW, Zaitlen N, Border R (2026) The geometry of G × E: How scaling and endogenous treatment effects shape interaction direction. PLoS Genet 22(4): e1012073. https://doi.org/10.1371/journal.pgen.1012073
Editor: Xiang Zhou, Yale University, UNITED STATES OF AMERICA
Received: August 4, 2025; Accepted: February 27, 2026; Published: April 1, 2026
Copyright: © 2026 Sadowski et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The UK Biobank data underlying the results presented in this study were accessed under application 33127 and cannot be further distributed in accordance with UK Biobank policies. Researchers may obtain access to these data by submitting an application directly to the UK Biobank: https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access. All data and code necessary to reproduce the figures and results presented in this study are publicly available in a GitHub repository at https://github.com/michalsad/gxe_sign.
Funding: This work was funded by the National Institutes of Health grants R01MH130581, U01MH126798, R01MH122688, R01HG006399, R01HG011345, and R01GM142112 (NZ, RB, MS); L30HG013856 (RB); and R35GM150822 and K25HL157603 (AWD). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Individuals exhibit substantial phenotypic heterogeneity in response to environmental perturbations. Part of this heterogeneity arises from individual differences in genetic background and is referred to as gene-environment interaction (G × E). Several interactions identified to date have important implications for human health. For example: (1) dietary treatment prevents symptoms of phenylketonuria—a genetic disorder caused by mutations in the PAH gene [1]; (2) physical activity blunts the effects of obesity risk variants in the fat mass and obesity-associated gene, FTO [2]; (3) a variant of the NAT2 gene elevates the risk of bladder cancer in smokers [3]; and (4) a multitude of gene polymorphisms have been shown to impact drug response or toxicity [4–8]. These examples showcase the potential of G × E discovery to enhance disease prevention and management, to enable design of individualized treatments that are safer and more effective, and, more generally, to advance our understanding of disease etiology. To unlock this potential, many methods for G × E detection have been developed [9–12] and they are continually being optimized. Most recent approaches enable genome-wide G × E screens in large-scale studies of human populations [13–15].
Standard statistical methods assess G × E by comparing additive and non-additive models of genetic and environmental effects. Although valuable for exploratory analysis, this definition depends on modeling assumptions and might not reflect biological mechanisms of interaction [16,17]. Here, we focus on two fundamental issues that complicate the interpretation of current G × E approaches: (1) dependence on phenotype scale and (2) endogenous treatment effects. In case (1), the detection of an interaction effect and its direction depend on the scale on which the outcome is measured, or to which it might be transformed [18–21]. For example, even though a genetic variant G and an environmental factor E impact an outcome Y additively, an interaction test performed on Y that has been log-transformed (e.g., as part of quality control processing) can yield a highly significant G × E effect (Fig 1). More generally, many interaction effects can be induced or removed by monotonic non-linear transformations of the data. In case (2), exposure and genetic liability are causally intertwined. For example, imagine that a treatment is administered to taper the level of a heritable phenotype when it crosses some threshold (e.g., statins may be prescribed to lower low-density lipoprotein (LDL) cholesterol levels). In this case, exposure to the intervention is related to genetic factors influencing phenotype. Such endogenous treatment effects can result in apparent G × E, even when gene and environment act additively on the observed scale. As a result, most G × E findings require the caveat that they are specific to a particular measurement or may be a consequence of endogenous treatment effects [22–25].
A: Depiction of the effects of the genetic variant G (with reference allele A and alternative allele B, MAF = 0.4), the environmental factor E (drawn from a standard normal distribution), and the interaction between the two (G × E) on the outcome Y (generated as for 1,000 samples). The p-value for the G × E effect (PGxE) is given in the right lower corner. B: Depiction of the same effects on the log-transformed Y. Whereas the G × E effect is not detectable for Y (A), it is detectable for the log-transformed Y (B).
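The scenario in Fig 1 can be illustrated with a minimal simulation. The sketch below is not the authors' code: the generative model (intercept 20, genetic effect 4, environmental effect 3, noise SD 0.5), the haploid coding of G, the random seed, and the helper names `slope` and `gxe_coefficient` are all assumptions chosen for illustration. For a binary G, the G × E coefficient of the interaction model equals the difference between the within-genotype slopes of the outcome on E, so no regression library is needed.

```python
import math
import random


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def gxe_coefficient(g, e, y):
    """G x E coefficient of y ~ G + E + G*E for binary G:
    slope of y on E among carriers minus the slope among non-carriers."""
    carriers = [(ei, yi) for gi, ei, yi in zip(g, e, y) if gi == 1]
    others = [(ei, yi) for gi, ei, yi in zip(g, e, y) if gi == 0]
    e1, y1 = zip(*carriers)
    e0, y0 = zip(*others)
    return slope(e1, y1) - slope(e0, y0)


rng = random.Random(1)
n = 1000
g = [1 if rng.random() < 0.4 else 0 for _ in range(n)]  # allele frequency 0.4
e = [rng.gauss(0.0, 1.0) for _ in range(n)]             # standard normal exposure
# Assumed additive model on the original scale (no G x E term).
y = [20.0 + 4.0 * gi + 3.0 * ei + rng.gauss(0.0, 0.5) for gi, ei in zip(g, e)]

gxe_raw = gxe_coefficient(g, e, y)
gxe_log = gxe_coefficient(g, e, [math.log(v) for v in y])
print(f"G x E on the original scale: {gxe_raw:+.4f}")
print(f"G x E on the log scale:      {gxe_log:+.4f}")
```

Because the logarithm is convex up and both main effects are positive in this setup, the induced interaction is expected to be negative, while the estimate on the original scale should stay close to zero.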
Not all observed interactions require these caveats. For instance, there is an extreme form of interaction, also known as a crossover effect, in which the direction of an association (rather than only its magnitude) depends on a moderating factor. This type of interaction cannot be eliminated by a monotonic transformation [16] and, when sufficiently large, has a relatively straightforward interpretation [17]. Other statistical interactions, however, can in principle be removed by a monotonic transformation—meaning that, after transforming the outcome, the data are adequately described by an additive model.
Here, we demonstrate that monotone convex transformations of an outcome induce sign-consistent G × E, where the direction of the interaction effects is determined by the sign of the corresponding main effects. We further show that endogenous treatment effects, modeled as threshold-based interventions, also generate sign-consistent G × E. Finally, we discuss examples of non-convex transformations, like the logistic function, showing why and under what circumstances they induce this particular type of interaction effects. Our results indicate that a simple examination of sign consistency across detected G × E can rule out the possibility that all interactions have been induced by a monotone convex scaling of the outcome or endogenous treatment effects. Another consequence of this result is that if G × E signal is not sign-consistent, it cannot be eliminated by a monotone convex transformation.
Monotone convex transformations describe a broad class of functions that include many of the most commonly used data transformations in genetic studies and data analysis in general. For example, Box–Cox transformations, which are widely used to reduce departures from normality, are convex. Examples of transformations that are convex down include the square function, power transformations with even exponents, and the exponential function. Transformations that are convex up include the logarithm, the square root, and power transformations with exponents between zero and one. Many more commonly used transformations are locally convex; for instance, the logistic function and the hyperbolic tangent are convex down on one half of their domain and convex up on the other. We restrict our attention to monotone transformations as they preserve (or completely reverse) the order of data values.
We demonstrate the usefulness of sign consistency examination in real data, as our analysis of this property identifies that statin use induces false positive gene-age interaction effects on LDL cholesterol levels.
Results
Sign-consistent interaction property
We investigate the relationship between the signs of regression coefficients estimated in the G × E model across multiple genetic variants. More concretely, consider testing two haploid variants, G1 and G2, for an interaction with a binary environmental exposure E against phenotype Y in two single-variant regressions:
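$$Y = \beta_0^{(i)} + \beta_E^{(i)} E + \beta_{G_i} G_i + \beta_{G_i \times E}\, G_i E + \epsilon_i, \qquad i \in \{1, 2\},$$
where $\beta_0^{(i)}$, $\beta_E^{(i)}$, $\beta_{G_i}$ and $\beta_{G_i \times E}$ are coefficients estimated in regression i, and $\epsilon_i$ represents homoskedastic residual variation. Critically, neither of these models needs to accurately describe causal relationships; we just assume we can fit these regression models and have well-behaved errors.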
Next, consider the same two tests performed for the same phenotype Y measured on a different scale:
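$$\psi(Y) = \beta_0^{\psi,(i)} + \beta_E^{\psi,(i)} E + \beta^{\psi}_{G_i} G_i + \beta^{\psi}_{G_i \times E}\, G_i E + \epsilon^{\psi}_i, \qquad i \in \{1, 2\},$$
where $\psi$ is the map from the scale of the former measurement of Y to the scale of the latter measurement of Y, and the corresponding coefficients and residuals are marked with superscript $\psi$.
We demonstrate that the signs of the interaction effects, $\beta^{\psi}_{G_1 \times E}$ and $\beta^{\psi}_{G_2 \times E}$, induced by a monotone convex transformation $\psi$ depend on the signs of the main genetic effects, $\beta_{G_1}$ and $\beta_{G_2}$. The interaction effects satisfy a precise sign rule: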
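Theorem 1 (Sign-consistent interaction property). Assume measurement Y has homogeneous variance and exhibits no G × E effects ($\beta_{G_1 \times E} = \beta_{G_2 \times E} = 0$) on the original scale, with G and E independent, mean-centered, and having finite variance. If $\psi$ is a monotone convex transformation, then the G × E effects $\beta^{\psi}_{G_1 \times E}$ and $\beta^{\psi}_{G_2 \times E}$ estimated for $\psi(Y)$ satisfy:
$$\operatorname{sign}\big(\beta^{\psi}_{G_i \times E}\big) = s_{\psi} \cdot \operatorname{sign}(\beta_E) \cdot \operatorname{sign}(\beta_{G_i}), \qquad i \in \{1, 2\}, \tag{1}$$
where $s_{\psi}$ denotes the sign of the second derivative of $\psi$ (positive for convex down, negative for convex up) and $\beta_E$ is the main environmental effect.
Note that if we assume that the alleles of G1 and G2 are encoded so that their main effects have the same direction (i.e., $\operatorname{sign}(\beta_{G_1}) = \operatorname{sign}(\beta_{G_2})$), the above property becomes:
$$\operatorname{sign}\big(\beta^{\psi}_{G_1 \times E}\big) = \operatorname{sign}\big(\beta^{\psi}_{G_2 \times E}\big), \tag{2}$$
which we call sign consistency. By contraposition, if the homoskedasticity, independence, and allele-coding assumptions hold but property (2) does not, non-zero G × E effects $\beta^{\psi}_{G_1 \times E}$ and $\beta^{\psi}_{G_2 \times E}$ are not both induced by a monotone convex scaling of Y.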
Though we focus on the simple case of haploid genotypes and binary environmental exposures in the primary text, these results apply to diploid genotypes and continuous environmental exposures.
Corollary 1 (Sign consistency for diploid genotypes). The sign rule (1) extends to diploid genotypes under Hardy-Weinberg equilibrium, with E any random variable satisfying the independence and moment conditions above.
We further generalize this result to allow for moderate G-E correlations:
Theorem 2 (Sign consistency for correlated G and E). When G and E are correlated, the induced interaction effect includes an additional correlation term:
$$\beta^{\psi}_{G \times E} = T_{\psi} + \Delta_{\rho},$$
where $T_{\psi}$ is the transformation-induced effect described by Theorem 1 and $\Delta_{\rho}$ arises from the non-orthogonality of predictors. The sign rule (1) holds when the transformation effect dominates: $|T_{\psi}| > |\Delta_{\rho}|$.
Corollary 2 (Dominance of transformation effect). For small gene-environment correlations or strong curvature $|\psi''|$, the sign rule approximately holds. In practice, if observed interactions are strongly sign-consistent across many variants, the transformation effect likely dominates any correlation-induced deviations.
Finally, we show that endogenous treatment effects—where treatment is assigned based on a phenotype threshold—also induce sign-consistent interactions:
Theorem 3 (Opposite-sign rule for endogenous treatment). Consider an environmental exposure E assigned when phenotype Y exceeds a threshold t, with treatment effect $\delta$ on Y. For a genetic variant with effect $\beta_G$ on Y, if the treatment threshold exceeds the reference mean ($t > \mu_0$), the induced interaction $\beta^{*}_{G \times E}$ satisfies:
$$\operatorname{sign}\big(\beta^{*}_{G \times E}\big) = -\operatorname{sign}(\beta_G).$$
We prove these results rigorously in S1 Appendix.
More generally, abstracting from this two-variant example, we demonstrate that if there is a scale on which phenotype Y has homogeneous variance across values of environmental factor E and genetic variant G, and on which those factors have only additive effects on Y, then the direction of the G × E effect estimated for a measurement that is a monotone convex transformation of Y depends on the direction of the corresponding main genetic effect.
We, therefore, propose to examine observed G × E effects for the sign consistency property, as it provides a means to exclude the family of monotone convex transformations as the sole source of these effects.
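As a worked illustration of the sign rule, consider a special case that is not taken from the source: the exponential transformation applied to an additive outcome with normally distributed residuals and binary E and G. The symbols $b_0$, $b_E$, $b_G$, $\sigma$, $u$ and $v$ are introduced here only for this example. In this case the conditional means of the transformed outcome are available in closed form, and for a saturated model with binary predictors the regression coefficients equal contrasts of these means:

```latex
\begin{align*}
Y &= b_0 + b_E E + b_G G + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2), \\
\mathbb{E}\left[e^{Y} \mid E = u,\, G = v\right] &= \exp\!\left(b_0 + b_E u + b_G v + \tfrac{\sigma^2}{2}\right), \\
\text{induced main genetic effect} &= e^{\,b_0 + \sigma^2/2}\left(e^{b_G} - 1\right), \\
\text{induced } G \times E \text{ effect} &= e^{\,b_0 + \sigma^2/2}\left(e^{b_E} - 1\right)\left(e^{b_G} - 1\right).
\end{align*}
```

Since $e^{x} - 1$ has the same sign as $x$, the induced interaction has the sign of $b_E \cdot b_G$. For a positive environmental effect it therefore matches the sign of the induced main genetic effect for every variant, as expected for an increasing convex down transformation.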
Sketch of argument
Suppose that phenotype Y has homogeneous variance across values of environmental factor E and genotype G:
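$$Y \mid E, G \sim \mathcal{D}\big(\nu_{E,G},\, \sigma^2\big), \tag{3}$$
where $\mathcal{D}(\nu, \sigma^2)$ is a symmetric distribution with mean ν and variance $\sigma^2$. We fit the following linear regression model to test the effect of an interaction between G and E on this phenotype:
$$Y = \beta_0 + \beta_E E + \beta_G G + \beta_{G \times E}\, G E + \varepsilon, \tag{4}$$
where $\beta_0$, $\beta_E$, $\beta_G$ and $\beta_{G \times E}$ are estimated coefficients and ε is the error. We consider a simplified example, where E and G are binary variables, which corresponds to the case of a haploid genetic variant and a binary environmental exposure. Importantly, our reasoning is not contingent on whether this model is correct, only that the homoskedasticity condition (3) holds. The coefficients in (4) can be related to the empirical conditional expectations:
$$\beta_0 = P_{0,0}, \qquad \beta_E = P_{1,0} - P_{0,0}, \qquad \beta_G = P_{0,1} - P_{0,0}, \qquad \beta_{G \times E} = (P_{1,1} - P_{1,0}) - (P_{0,1} - P_{0,0}),$$
where $P_{a,b} = \hat{\mathbb{E}}[Y \mid E = a, G = b]$. For example, P0,1 is the average value of phenotype Y in individuals with genotype G = 1 who are not exposed to environmental factor E (i.e., E = 0).
Without loss of generality, suppose further that coefficients $\beta_0$, $\beta_E$ and $\beta_G$ are estimated to be positive, and that the estimate of the G × E effect $\beta_{G \times E}$ is zero, as depicted on the x-axis of Fig 2A. Consider now a regression similar to (4), but performed on phenotype Y measured on a scale different from the original one:
$$\psi(Y) = \beta^{\psi}_0 + \beta^{\psi}_E E + \beta^{\psi}_G G + \beta^{\psi}_{G \times E}\, G E + \varepsilon^{\psi}. \tag{5}$$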
A: The x-axis shows the intercept ($\beta_0$), the main effect of E ($\beta_E$), the main effect of G ($\beta_G$) and the G × E effect ($\beta_{G \times E}$) estimated by regressing phenotype Y on E, G and GE. If, as assumed here, $\beta_E$ and $\beta_G$ are positive and $\beta_{G \times E}$ is null, a similar regression on this phenotype transformed with an increasing convex down function $\psi$ will yield a main effect of G ($\beta^{\psi}_G$) and a G × E effect ($\beta^{\psi}_{G \times E}$) that are positive. The sign of the G × E effect can be calculated by $\operatorname{sign}(\Delta_1 - \Delta_0)$, where $\Delta_a = \psi(P_{a,1}) - \psi(P_{a,0})$ and $P_{a,b}$ is the mean of Y among individuals with E = a and G = b. B: Similar to A, but shows the signs of $\beta^{\psi}_G$ and $\beta^{\psi}_{G \times E}$ when $\beta_G$ is negative. C: Similar to A, but shows the signs of $\beta^{\psi}_G$ and $\beta^{\psi}_{G \times E}$ when $\psi$ is increasing convex up. The colored segments on the x-axis indicate the sign of $\beta_G$: red if positive and blue if negative. Orange and cyan segments on the y-axis denote, respectively, positive and negative values of $\Delta_0$ or $\Delta_1$, whichever has smaller magnitude. The green segment represents the difference between these two differences, which may be positive or negative depending on the transformation.
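Here, $\psi$ is the map from the original scale of Y to the new scale, and the coefficients estimated on the new scale are marked with superscript $\psi$.
If we assume that $\psi$ is increasing convex down (Fig 2A), then the signs of $\beta^{\psi}_G$ and $\beta^{\psi}_{G \times E}$ can be related to points Pa,b as:
$$\operatorname{sign}\big(\beta^{\psi}_G\big) = \operatorname{sign}\big(\psi(P_{0,1}) - \psi(P_{0,0})\big), \tag{6}$$
$$\operatorname{sign}\big(\beta^{\psi}_{G \times E}\big) = \operatorname{sign}\big([\psi(P_{1,1}) - \psi(P_{1,0})] - [\psi(P_{0,1}) - \psi(P_{0,0})]\big). \tag{7}$$
That is, with the above assumptions, the direction of the effects estimated for the scaled phenotype can be expressed using quantities Pa,b defined on the original scale of Y (see Methods for a derivation of this fact). Looking at Fig 2A, it is easy to see that in our example:
- The signs of differences $\psi(P_{0,1}) - \psi(P_{0,0})$ and $\psi(P_{1,1}) - \psi(P_{1,0})$ are the same, and, by (6), follow the sign of $\beta^{\psi}_G$.
- The magnitude of $\psi(P_{1,1}) - \psi(P_{1,0})$ is larger than the magnitude of $\psi(P_{0,1}) - \psi(P_{0,0})$.
From (7) and these two facts it follows that the sign of $\beta^{\psi}_{G \times E}$ is positive. Note that this will be true for any genetic variant whose main effect, $\beta^{\psi}_G$, tested in (5) is positive. If, on the other hand, $\beta^{\psi}_G$ is negative, the G × E effect, $\beta^{\psi}_{G \times E}$, will be negative (Fig 2B). In general, we have the following relation:
$$\operatorname{sign}\big(\beta^{\psi}_{G \times E}\big) = \operatorname{sign}\big(\beta^{\psi}_G\big).$$
Applying a similar argument, it can be shown that when $\psi$ is increasing convex up, the opposite relation, $\operatorname{sign}(\beta^{\psi}_{G \times E}) = -\operatorname{sign}(\beta^{\psi}_G)$, holds (Fig 2C). In general, the direction of this relation depends on whether $\psi$ is convex down or convex up, and on the sign of $\beta^{\psi}_E$, that is, the estimated effect of environmental factor E (Table 1). A complete proof discussing these cases is given in Methods.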
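The pattern described in this sketch can be checked numerically. The snippet below is an illustrative sketch rather than the authors' code: it ignores residual noise and evaluates each transformation directly at the four cell means (a simplification in the spirit of the sign relations above), and the baseline value 5, the effect sizes, and the names `induced_signs` and `TRANSFORMS` are assumptions. The curvature label is +1 for convex down and -1 for convex up, following the terminology of the text.

```python
import math


def sign(x):
    """Return -1, 0 or +1."""
    return (x > 0) - (x < 0)


def induced_signs(psi, p00, beta_e, beta_g):
    """Signs of the main G effect and the G x E effect on the psi scale,
    given additive cell means on the original scale (noise-free)."""
    p10 = p00 + beta_e
    p01 = p00 + beta_g
    p11 = p00 + beta_e + beta_g
    d0 = psi(p01) - psi(p00)  # genetic contrast among the unexposed
    d1 = psi(p11) - psi(p10)  # genetic contrast among the exposed
    return sign(d0), sign(d1 - d0)


# Increasing transformations with their curvature label.
TRANSFORMS = {
    "exp": (math.exp, +1),
    "square": (lambda v: v * v, +1),  # increasing on positive values
    "log": (math.log, -1),
    "sqrt": (math.sqrt, -1),
}

results = {}
for name, (psi, curvature) in TRANSFORMS.items():
    for beta_e in (1.0, -1.0):
        for beta_g in (0.5, -0.5):
            main, gxe = induced_signs(psi, 5.0, beta_e, beta_g)
            results[(name, beta_e, beta_g)] = (main, gxe)
            print(f"{name:6s} beta_E={beta_e:+.1f} beta_G={beta_g:+.1f} "
                  f"main={main:+d} GxE={gxe:+d}")
```

In every row, the sign of the induced interaction equals the curvature label times the sign of the environmental effect times the sign of the transformed main genetic effect.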
Endogenous treatment effects
Suppose that the environmental factor E tested in the G × E regression model is a treatment administered to taper the level of a heritable phenotype when it crosses some threshold (e.g., statin therapy for individuals with high LDL cholesterol). In this case, exposure to the intervention is related to genetic factors influencing the phenotype, which we refer to as endogenous treatment effects. As shown in [8], endogenous treatment effects can cause false discoveries when the observed levels of the phenotype (subjected to treatment) are tested for G × E. Below, we prove that the G × E effects induced in a simple model of endogenous treatment effects are sign-consistent.
Consider the following model of phenotype Y:
$$Y = \sum_i \beta_i G_i + \varepsilon,$$
where, again, Gi indicates the presence of an alternative allele at haploid variant i, $\beta_i$ is the effect of this allele on Y, and ε is the environmental noise. We assume that the environmental noise is homoskedastic ($\varepsilon \sim \mathcal{N}(0, \sigma^2)$).
Suppose that if the level of Y is high, an individual is administered treatment E:
$$E = \mathbb{1}[Y > t],$$
where t is some threshold. When applied, treatment E changes the level of Y by $\delta$:
$$Y^{*} = Y + \delta E. \tag{8}$$
Claim 1 (Endogenous treatment effect interaction sign property). Suppose that we observe phenotype $Y^{*}$ and test the effect $\beta^{*}_{G_j \times E}$ of the interaction between variant Gj and environmental factor E on this phenotype:
$$Y^{*} = \beta^{*}_0 + \beta^{*}_E E + \beta^{*}_{G_j} G_j + \beta^{*}_{G_j \times E}\, G_j E + \varepsilon^{*}. \tag{9}$$
Then the direction of the estimated G × E effect, $\beta^{*}_{G_j \times E}$, is opposite to the direction of the main genetic effect, $\beta^{*}_{G_j}$.
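As in Sketch of argument, we define quantities P0 and P1 on the scale of Y, which, transformed, can be used to compute the main genetic effect, $\beta^{*}_{G_j}$, and the G × E effect, $\beta^{*}_{G_j \times E}$, on the scale of $Y^{*}$. Note that in this case, not only the signs, but also the values of these effects can be expressed in terms of P0 and P1 transformed by a certain function. We investigate the properties of this function to determine the properties of $\beta^{*}_{G_j}$ and $\beta^{*}_{G_j \times E}$. Specifically, we define $P_0 = (t - \mu_0)/\sigma$ and $P_1 = (t - \mu_1)/\sigma$, where $\mu_g$ is the mean of Y among individuals with $G_j = g$, and show (Methods) that the main genetic effect $\beta^{*}_{G_j}$ and the G × E effect $\beta^{*}_{G_j \times E}$ estimated in (9) can be related to these points as:
$$\beta^{*}_{G_j} = -\sigma \big\{ (P_1 - P_0) + [\lambda(P_1) - \lambda(P_0)] \big\},$$
$$\beta^{*}_{G_j \times E} = \sigma \big\{ [\lambda(-P_1) - \lambda(-P_0)] + [\lambda(P_1) - \lambda(P_0)] \big\},$$
where $\lambda$ is the inverse Mills ratio ($\lambda(x) = \phi(x)/\Phi(x)$, where $\phi$ and $\Phi$ are the standard normal probability density function and cumulative distribution function, respectively). Importantly, $\lambda$ is decreasing and strictly convex down [26].
Suppose that $P_1 > P_0$. Then, looking at Fig 3, we see that:
- The difference $\lambda(P_1) - \lambda(P_0)$ has the opposite sign to difference $P_1 - P_0$.
- The magnitude of $P_1 - P_0$ is larger than the magnitude of $\lambda(P_1) - \lambda(P_0)$.
- The difference $\lambda(-P_1) - \lambda(-P_0)$ has the same sign as difference $P_1 - P_0$.
- The magnitude of $\lambda(-P_1) - \lambda(-P_0)$ is greater than the magnitude of $\lambda(P_1) - \lambda(P_0)$.
The x-axis shows the quantities $P_0$ and $P_1$ defined for phenotype Y affected by the haploid genetic variant G. The main effect of G and the effect of GE on phenotype Y, after treatment E is applied to reduce levels of Y that exceed threshold t, can be expressed as functions of P0 and P1 and their images under the inverse Mills ratio, $\lambda(P_0)$ and $\lambda(P_1)$. The signs of those functions are dependent.
From our assumption that $P_1 > P_0$ and facts 1 and 2 above, it follows that the sign of $\beta^{*}_{G_j}$ is negative. From the same assumption and facts 1, 3 and 4, it follows that the sign of $\beta^{*}_{G_j \times E}$ is positive. Note that once the sign of difference $P_0 - P_1$ is established, the signs of $\beta^{*}_{G_j}$ and $\beta^{*}_{G_j \times E}$ can be determined based on the properties of $\lambda$. In our example, $P_0 - P_1$ is negative, which makes $\beta^{*}_{G_j}$ negative and $\beta^{*}_{G_j \times E}$ positive. On the other hand, when $P_0 - P_1$ is positive, $\beta^{*}_{G_j}$ is positive and $\beta^{*}_{G_j \times E}$ is negative. Therefore, we have shown that G × E effects induced by endogenous treatment effects (modeled as in (8)) have opposite directions to the corresponding main genetic effects. That is, for any j,
$$\operatorname{sign}\big(\beta^{*}_{G_j \times E}\big) = -\operatorname{sign}\big(\beta^{*}_{G_j}\big).$$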
The same property holds when treatment E is administered whenever the level of Y is below threshold t (Methods).
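The opposite-sign behavior of threshold-based treatment can be illustrated with a small simulation. The sketch below is not the authors' implementation: it uses a single haploid variant, unit-variance normal noise, a threshold of 1.0 (above both genotype means), a treatment effect of -1.0, an allele frequency of 0.4, and the hypothetical helper `treated_cell_contrasts`, all of which are assumptions made for illustration. With binary G and E, the coefficients of the interaction regression equal contrasts of the four cell means, which is what the function computes.

```python
import random


def treated_cell_contrasts(beta_g, n=200_000, threshold=1.0, delta=-1.0, seed=7):
    """Simulate threshold-based treatment and return the main genetic
    effect and the G x E effect estimated on the treated phenotype."""
    rng = random.Random(seed)
    sums = {(e, g): 0.0 for e in (0, 1) for g in (0, 1)}
    counts = {(e, g): 0 for e in (0, 1) for g in (0, 1)}
    for _ in range(n):
        g = 1 if rng.random() < 0.4 else 0
        y = beta_g * g + rng.gauss(0.0, 1.0)  # additive, homoskedastic phenotype
        e = 1 if y > threshold else 0         # treatment assigned above threshold
        y_obs = y + delta * e                 # observed phenotype after treatment
        sums[(e, g)] += y_obs
        counts[(e, g)] += 1
    mean = {key: sums[key] / counts[key] for key in sums}
    main_g = mean[(0, 1)] - mean[(0, 0)]           # genetic effect among untreated
    gxe = (mean[(1, 1)] - mean[(1, 0)]) - main_g   # difference of genetic contrasts
    return main_g, gxe


main_pos, gxe_pos = treated_cell_contrasts(beta_g=0.3)
main_neg, gxe_neg = treated_cell_contrasts(beta_g=-0.3)
print(f"beta_G=+0.3: main={main_pos:+.3f}, GxE={gxe_pos:+.3f}")
print(f"beta_G=-0.3: main={main_neg:+.3f}, GxE={gxe_neg:+.3f}")
```

In both runs the estimated interaction should point in the direction opposite to the estimated main genetic effect, even though the simulated phenotype contains no interaction term.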
Non-convex transformations
An arbitrary non-linear scaling of an outcome may or may not induce sign-consistent G × E. To provide more intuition on this, we discuss properties of G × E that can be produced by two examples of non-convex scaling commonly used in genetic analyses: (1) the logistic function and (2) the inverse normal transformation (INT).
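Specifically, we are interested in the relationship between the signs of the main effect $\beta^{\psi}_G$ of genetic variant G and the effect $\beta^{\psi}_{G \times E}$ of the interaction between G and environmental factor E on the $\psi$-transformed phenotype Y in the linear regression:
$$\psi(Y) = \beta^{\psi}_0 + \beta^{\psi}_E E + \beta^{\psi}_G G + \beta^{\psi}_{G \times E}\, G E + \varepsilon^{\psi}. \tag{10}$$
Following Sketch of argument, we assume that E and G are binary and that Y (on the original scale) has homogeneous variance and does not exhibit G × E effects, meaning that the linear regression:
$$Y = \beta_0 + \beta_E E + \beta_G G + \beta_{G \times E}\, G E + \varepsilon$$
yields $\beta_{G \times E} = 0$.
As demonstrated earlier, given these assumptions, $\beta^{\psi}_G$ and $\beta^{\psi}_{G \times E}$ can be defined in terms of the empirical conditional means of untransformed Y:
$$\beta^{\psi}_G = \int \big[\psi(P_{0,1} + \varepsilon) - \psi(P_{0,0} + \varepsilon)\big] f(\varepsilon)\, d\varepsilon, \tag{11}$$
$$\beta^{\psi}_{G \times E} = \int \Big\{ \big[\psi(P_{1,1} + \varepsilon) - \psi(P_{1,0} + \varepsilon)\big] - \big[\psi(P_{0,1} + \varepsilon) - \psi(P_{0,0} + \varepsilon)\big] \Big\} f(\varepsilon)\, d\varepsilon, \tag{12}$$
where $P_{a,b} = \hat{\mathbb{E}}[Y \mid E = a, G = b]$ and f is the PDF of ε.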
Case study: the logistic function.
Let $\psi$ be a logistic function:
$$\psi(x) = \frac{1}{1 + e^{-(x - x_0)}}. \tag{13}$$
Using definitions (11) and (12), it is easy to see that the sign of $\beta^{\psi}_{G \times E}$ induced by the logistic scaling depends not only on the relative positions of points P0,0 and P1,0 (as was the case for monotone convex transformations), but also on their values and the width of the distribution of ε (Fig 4A). More specifically, the values of P0,0 and P1,0 determine, up to noise, the relation between the magnitudes of $\psi(P_{1,1} + \varepsilon) - \psi(P_{1,0} + \varepsilon)$ and $\psi(P_{0,1} + \varepsilon) - \psi(P_{0,0} + \varepsilon)$ in (12). Since these values will be different for different genetic variants, the relationship between the signs of $\beta^{\psi}_G$ and $\beta^{\psi}_{G \times E}$ can be variant-specific.
A: Depiction of the effect that the logistic transformation of the outcome may have on the regression-based G × E test. Compare with Fig 2. B: An example of two genetic variants (green and orange) with positive effects on the phenotype that, after this phenotype is transformed with the logistic function $\psi$, exhibit G × E effects of opposite directions. Compare with Fig 2.
Consider an illustrative example of two genetic variants G1 and G2. For simplicity, we assume $\varepsilon = 0$, which simplifies (11) and (12) to:
$$\beta^{\psi}_G = \psi(P_{0,1}) - \psi(P_{0,0}), \qquad \beta^{\psi}_{G \times E} = \big[\psi(P_{1,1}) - \psi(P_{1,0})\big] - \big[\psi(P_{0,1}) - \psi(P_{0,0})\big].$$
For G1 we assume: $P_{0,0} = 5$, $P_{1,0} = 8$, $P_{0,1} = 5.5$ and $P_{1,1} = 8.5$. This means that in unexposed individuals carrying the reference allele at G1 the average value of phenotype Y is 5, whereas in exposed individuals carrying the same allele it is 8; and the effect of G1 in both unexposed and exposed groups is 0.5. For variant G2, on the other hand, we assume: $P_{0,0} = 5.7$, $P_{1,0} = 8.7$, $P_{0,1} = 6.7$ and $P_{1,1} = 9.7$, which corresponds to average values of 5.7 and 8.7 in unexposed and exposed non-carriers, respectively, and the genetic effect of 1 (see x-axis of Fig 4B). Now imagine that we convert phenotype Y to a risk scale where the value of 7 corresponds to the risk of 50%: $\psi(x) = 1/(1 + e^{-(x - 7)})$, and perform regression (10). For variant G1, this regression yields a positive main genetic effect and a positive G × E effect. For variant G2, it produces a positive main effect, but a negative G × E effect (Fig 4B). Thus, in this example the logistic transformation induces G × E effects that are not sign-consistent.
There are, however, scenarios where the logistic scaling will induce G × E that are sign-consistent. A plausible example of such a scenario in healthcare data occurs when x0 in (13) is large (meaning that the cases are called at high phenotype values) and the environmental effect and individual genetic effects on (untransformed) Y are relatively small, so that all points Pa,b for all considered genetic variants are smaller than x0. Since the logistic function is convex on the domain $(-\infty, x_0)$, transforming points Pa,b with this function yields, for a positive environmental effect, the relation $\operatorname{sign}(\beta^{\psi}_{G \times E}) = \operatorname{sign}(\beta^{\psi}_G)$ between the G × E effect and the main genetic effect estimated on the transformed scale (Table 1). In general, the sign of a G × E effect induced by the logistic scaling depends on the relative positions of P0,0, P1,0, P0,1 and P1,1 with respect to x0; all possible cases are detailed in S1 File.
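The two-variant example from this case study can be verified directly. The sketch below uses the cell means stated in the text (5 and 8 with a genetic effect of 0.5 for G1; 5.7 and 8.7 with a genetic effect of 1 for G2) and a logistic curve centered at 7. The unit slope of the logistic function, the noise-free evaluation, and the function names are assumptions made for illustration.

```python
import math


def logistic(x, x0=7.0):
    """Logistic risk curve with midpoint x0 (unit slope assumed)."""
    return 1.0 / (1.0 + math.exp(-(x - x0)))


def transformed_effects(p_unexposed, p_exposed, genetic_effect, psi=logistic):
    """Main G effect and G x E effect on the transformed scale, computed
    from noise-free cell means that are additive on the original scale."""
    main = psi(p_unexposed + genetic_effect) - psi(p_unexposed)
    among_exposed = psi(p_exposed + genetic_effect) - psi(p_exposed)
    return main, among_exposed - main


main_g1, gxe_g1 = transformed_effects(5.0, 8.0, 0.5)
main_g2, gxe_g2 = transformed_effects(5.7, 8.7, 1.0)
print(f"G1: main={main_g1:+.4f}, GxE={gxe_g1:+.4f}")
print(f"G2: main={main_g2:+.4f}, GxE={gxe_g2:+.4f}")
```

Both variants have positive main effects on the risk scale, but the induced interaction is positive for G1 and negative for G2, so the logistic scaling does not produce sign-consistent effects in this configuration.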
Case study: the inverse normal transformation.
Another data transformation commonly used in genetic analyses is the INT. It matches quantiles of the data distribution with the quantiles of the standard normal distribution. Because the transformation depends on the data’s distribution, its consequences cannot be generalized. More specifically, INT preserves the order of data points, but not the distances between them. In particular, the relationship between the transformed differences of the conditional means, $\psi(P_{1,1}) - \psi(P_{1,0})$ and $\psi(P_{0,1}) - \psi(P_{0,0})$ (the two bracketed differences in (12), with $\psi$ denoting the transformation and $P_{a,b}$ the mean of Y among individuals with E = a and G = b), depends not only on the ordering of these conditional means but also on their magnitudes. Consequently, this relationship need not be the same across variants whose original effects have the same direction. As a result, the INT can induce G × E effects in any direction with respect to the main genetic effect of a given sign.
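To make the order-preserving but distance-distorting behavior concrete, the following sketch implements a basic rank-based INT using only the Python standard library. The quantile offset `(rank - 0.5) / n`, the absence of tie handling, and the function name are assumptions; published analyses often use other offsets and average the ranks of tied values.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Rank-based inverse normal transformation.
    Assumes no tied values and uses the (rank - 0.5) / n quantile offset."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        out[idx] = std_normal.inv_cdf((rank - 0.5) / n)
    return out


raw = [1.0, 2.0, 3.0, 100.0]
transformed = inverse_normal_transform(raw)
print([round(v, 3) for v in transformed])
```

The ordering of the four values is unchanged, but the outlying value 100 ends up exactly as far from its neighbor as the smallest value is from its own neighbor, because only ranks enter the transformation.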
Previously published interaction results exhibit sign consistency property
We have examined sign consistency for several G × E studies, selecting E-outcome pairs for which interactions have previously been found (Fig 5A). More specifically, we performed TxEWAS [8,27] in the UK Biobank [28] population of unrelated white British individuals (Methods). TxEWAS tests the effect of the interaction between predicted expression of a gene G and environmental exposure E on phenotype Y using the following linear regression model:
A: Main vs interaction effects for identified genes. For each gene, we plot the estimates corresponding to the tissue with the strongest interaction p-value. is the main environmental effect. B: The fraction of G × E that have sign-consistent effects. This fraction was calculated among interacting genes (called at hFDR < 10%) whose main effects were nominally significant at 5%. The tissue with the strongest interaction p-value for a given gene was considered.
where Ci is the i-th additional environmental covariate included in the model, Greek letters represent effect sizes, and . Among additional covariates we included: age, sex, birth date, Townsend deprivation index, and the first 16 genetic principal components (PCs) [29] (if not already used as E). In a single study, we performed multiple tests for a single gene—corresponding to multiple tissues in which this gene was expressed—and used the hierarchical FDR (hFDR) correction to call significant interactions from aggregated results [8]. Sign consistency was examined considering these interactions in tissues, in which they had the strongest effects.
We have investigated gene-sex interaction effects on the primary male sex hormone, testosterone, and the end-product of the purine metabolism, urate; gene-smoking interaction effects on body mass index (BMI); gene-age interaction effects on LDL cholesterol levels; and gene-statin interaction effects on statins’ primary target, LDL cholesterol, and phenotypes related to their potential side effect on diabetes risk [30,31]—blood glucose and hemoglobin A1c (see Methods for phenotype definitions and preprocessing details). We have observed moderate to strong evidence for sign-consistent G × E effects across these traits. More specifically, the fraction of sign-consistent G × E effects was moderate for the sex-testosterone E-outcome pair, high for the sex-urate and statin-hemoglobin A1c pairs, and maximal for the rest of our studies (Fig 5B).
Such a high degree of sign consistency calls for careful interpretation, as many of these interaction effects may have been induced by the outcome measurement scaling or endogeneity. Monotone convex transformations systematically amplify G × E effects whose sign is consistent with that of the corresponding main genetic effect, while attenuating interactions with the opposite sign. As a result, the degree of sign consistency after such a transformation can be high even in the presence of G × E with the opposite sign pattern on the untransformed scale. For example, under an increasing convex up transformation and a positive main environmental effect, G × E effects opposing the main genetic effects are amplified, whereas those aligned with the genetic main effects are reduced or eliminated (Fig 6A and 6B). Consistent with this intuition, our simulations show high sign consistency rates after applying monotone convex transformations to outcomes with randomly directed G × E effects (Figs 6 and S1 Fig). Even when the phenotypic variance explained by interaction effects exceeds that of the additive effects, commonly used transformations—such as the logarithm or square—yield sign consistency rates exceeding 75% (Figs 6C and S1B Fig). This rate depends on the directionality and size of the interaction effects on the untransformed scale, and on the specific transformation applied. Consequently, there is no universal threshold that indicates when scaling and endogenous treatment effects should be identified as major drivers of observed G × E signal. Nevertheless, a predominance of sign-consistent interactions indicates that the results should be interpreted with care and may deserve closer examination. For comparison, our simulations show that the inverse normal transformation, which is not convex, does not alter the sign consistency rate relative to the original scale (S2 Fig).
A: Z-scores for main genetic (G) and interaction (G × E) effects estimated for the outcome before (left) and after (right) transformation. G × E effects were simulated using . B: Number of detected G × E effects for outcomes on the original and transformed scales as a function of the variance of the simulated G × E effects,
. C: Estimated rate of sign consistency for outcomes on the original and transformed scales as a function of the variance of the simulated G × E effects,
. The sign consistency rate was defined as the proportion of G × E effects exhibiting the more prevalent sign relationship with their corresponding main effects. Due to this definition, sign consistency rate for the untransformed outcome may exceed 0.5.
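The sign consistency rate defined in this caption is simple to compute from paired effect estimates. The function below is a minimal sketch of that definition; its name, the decision to skip estimates that are exactly zero, and the example inputs are assumptions, and any filtering by significance (such as the thresholds used for Fig 5B) is assumed to have been applied beforehand.

```python
def sign_consistency_rate(main_effects, gxe_effects):
    """Proportion of G x E effects showing the more prevalent sign
    relationship (same or opposite) with their main effects."""
    pairs = [(m, x) for m, x in zip(main_effects, gxe_effects)
             if m != 0 and x != 0]  # zero estimates carry no sign
    if not pairs:
        raise ValueError("no non-zero effect pairs supplied")
    same = sum(1 for m, x in pairs if (m > 0) == (x > 0))
    fraction_same = same / len(pairs)
    return max(fraction_same, 1.0 - fraction_same)


# Hypothetical estimates: three of four pairs share a sign.
rate_a = sign_consistency_rate([1.0, -2.0, 0.5, -1.0], [0.2, -0.1, -0.3, -0.4])
# Hypothetical estimates: two of three pairs have opposite signs.
rate_b = sign_consistency_rate([1.0, 1.0, 1.0], [-1.0, -1.0, 1.0])
print(rate_a, rate_b)
```

Because the more prevalent relationship is used, the rate is bounded below by 0.5, which is why values above 0.5 can occur even for outcomes without any systematic sign pattern.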
As a concrete example, we hypothesized that the interaction effects detected in the age-LDL cholesterol study were a consequence of endogenous treatment effects between statin use and LDL cholesterol levels. The reasoning is that the probability of taking statins, which are prescribed at high LDL cholesterol levels, increases with age, meaning that genetic variation associated with LDL cholesterol levels is also correlated with age. Indeed, when we included statin use in our model as a covariate, the G × E effects disappeared.
Sign consistency alone cannot determine the extent to which an observed G × E signal is driven by endogenous treatment effects. For instance, many gene-statin interactions for LDL cholesterol identified by TxEWAS were replicated in a retrospective longitudinal pharmacogenomic study [8], in which major sources of endogeneity were controlled. To determine the mechanisms underpinning such observed associations, additional analyses and experiments are necessary.
Discussion
We have demonstrated that if there is a scale on which an outcome has homogeneous variance across values of environmental factor E and genetic variant G, and these factors have only additive effects on this outcome, then the direction of the G × E effect estimated on the scale that is a monotone convex transformation of the original outcome scale is determined by the direction of the main effect of G. In addition, we have shown that endogenous treatment effects, modeled as threshold-based interventions, can only produce G × E effects with the same sign property.
A consequence of our result is that if G × E effects in both directions with respect to the main genetic effects are observed, there is no monotone convex transformation that can eliminate the G × E effects. Furthermore, they could not have been all induced by endogenous treatment effects. Our results are related to prior conditions under which outcome scaling can eliminate interaction effects [20], especially prior results bounding interaction effect sizes as a function of the curvature of the scaling function [32].
Our argument assumes a null interaction effect on some scale to assess the properties of a signal fully attributable to an outcome transformation. Our heuristic examines whether observed interactions are consistent with this hypothesis at a large number of loci. Although it is unreasonable in general to imagine that all variants interact in the same way relative to an environmental moderator, a monotone convex transformation of the outcome results in a high sign consistency rate even if this null hypothesis is not true for every locus. Thus, a predominance of sign-consistent interactions provides a meaningful indication that the results should be interpreted with care and may deserve closer examination.
Despite apparent similarities between endogenous treatment effects and gene-environment (G-E) correlation, the two phenomena differ. In the considered model of outcome-dependent treatment allocation, genetic factors that influence the outcome become associated with treatment status, and, critically, treatment status becomes correlated with the error term. A correlation between genetic factors and exposure alone does not induce statistical G × E effects. Although G-E correlation can produce spurious G × E signals when genetic markers such as tag SNPs are analyzed instead of the true causal variants [33], this arises from a different data-generating process and is likely a much weaker source of misleading G × E findings [34].
Sign consistency of G × E effects does not imply that they are induced by endogenous treatment effects, nor does the fact that they can be eliminated by an outcome transformation imply that the outcome should be analyzed on the transformed scale. Whenever possible, the outcome scale should be chosen to ensure that results are interpretable and practically meaningful. For example, the relevant scale may be determined by the specific mechanistic model of a biological phenomenon under study or by the public health intervention being evaluated. However, as it is generally unclear what the correct scale is for a given phenotype, examining sign consistency across observed G × E effects can help assess the extent to which a particular type of outcome transformation may alter the results. Moreover, such examination can rule out the possibility that all observed interactions can be attributed to endogenous treatment effects in studies where such effects may be present and the underlying causal mechanisms are unknown. Our analysis of real data sets demonstrates that our approach can help identify potential confounding.
To reduce inaccuracy in assessing sign consistency, we recommend applying the sign consistency property to genome-wide significant interactions (or, at a minimum, to a threshold determined a priori), as done in the analyses presented in this paper.
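As an illustration of the diagnostic, the sign consistency rate defined in this paper (the proportion of G × E effects showing the more prevalent sign relationship with their main genetic effects) can be computed as in the following minimal sketch. The function name and the z-scores are hypothetical and are not taken from the analyses reported here; the input is assumed to contain only interactions that already passed the chosen significance threshold.

```python
def sign_consistency_rate(main_effects, interaction_effects):
    """Proportion of G x E effects showing the more prevalent sign relationship
    with their main genetic effects (the value therefore lies in [0.5, 1])."""
    # Effects estimated as exactly zero carry no sign information and are skipped.
    pairs = [(g, gxe) for g, gxe in zip(main_effects, interaction_effects)
             if g != 0 and gxe != 0]
    if not pairs:
        raise ValueError("no effects with non-zero main and interaction estimates")
    same = sum(1 for g, gxe in pairs if (g > 0) == (gxe > 0))
    return max(same, len(pairs) - same) / len(pairs)


# Hypothetical z-scores for five significant interactions (illustration only).
main_z = [4.2, -3.1, 5.0, -2.8, 3.3]
gxe_z = [3.9, -4.4, 4.1, 2.9, 3.6]
rate = sign_consistency_rate(main_z, gxe_z)
print(f"sign consistency rate: {rate:.2f}")
```

Because the rate is defined through the more prevalent relationship, it cannot fall below 0.5, which is why no universal threshold is attached to it in this sketch.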
We note that the homoskedasticity assumption made in our proofs is also an assumption of the linear regression model. Violation of this assumption results in a biased test for the interaction effect [8,35]. In observed data, it is particularly common that the variance of the outcome differs across strata defined by the environmental factor [36]. Owing to its importance and incomplete characterization, we comprehensively examine the conditional heteroskedasticity bias in S1 Supporting Information. We analytically describe the conditions under which this bias is expected to arise and the direction of its effect. It has been established that, in the presence of heteroskedasticity, G × E should be modeled using the double generalized linear model or a standard linear model modified to incorporate robust standard errors [8,35].
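The effect of heteroskedasticity on the interaction test can be illustrated with a small sketch for binary G and E, where the interaction estimate is a difference of differences of four cell means. The robust standard error below lets each cell keep its own variance, in the spirit of the robust standard errors mentioned above; it is not the double generalized linear model and not the procedure used in this paper. Cell sizes, standard deviations, and function names are hypothetical.

```python
import math
import random
import statistics


def interaction_with_se(cells):
    """cells maps (g, e) -> list of outcomes for binary G and E.
    Returns the interaction estimate, a pooled-variance SE and a robust SE."""
    means = {k: statistics.fmean(v) for k, v in cells.items()}
    estimate = (means[1, 1] - means[0, 1]) - (means[1, 0] - means[0, 0])
    n_total = sum(len(v) for v in cells.values())
    # Pooled residual variance: the homoskedastic linear-model assumption.
    rss = sum((y - means[k]) ** 2 for k, v in cells.items() for y in v)
    pooled_var = rss / (n_total - 4)
    se_pooled = math.sqrt(pooled_var * sum(1 / len(v) for v in cells.values()))
    # Robust version: every cell contributes its own sample variance.
    se_robust = math.sqrt(sum(statistics.variance(v) / len(v) for v in cells.values()))
    return estimate, se_pooled, se_robust


# Hypothetical setting: no true interaction, larger outcome variance when E = 1,
# and far fewer exposed than unexposed samples.
rng = random.Random(0)
sizes = {(0, 0): 4000, (1, 0): 4000, (0, 1): 200, (1, 1): 200}
cells = {(g, e): [rng.gauss(0.0, 3.0 if e else 1.0) for _ in range(n)]
         for (g, e), n in sizes.items()}
estimate, se_pooled, se_robust = interaction_with_se(cells)
print(f"estimate={estimate:.3f} pooled SE={se_pooled:.3f} robust SE={se_robust:.3f}")
```

In this configuration the pooled standard error is dominated by the large low-variance cells and understates the uncertainty of the interaction estimate, which is the kind of bias a robust standard error is meant to avoid.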
Methods
Sign consistency of G × E effects under monotone convex transformations of the outcome
Here we provide geometric intuition for the sign-consistent interaction property; a direct algebraic derivation is given in S1 Appendix.
Suppose that there is a scale on which a phenotype exhibits no G × E and has homogeneous variance. We show that any monotone convex transformation of this phenotype can only induce sign-consistent G × E effects (Theorem 1). The implication is that if G × E effects in both directions with respect to the main genetic effects are observed, there is no such transformation that can eliminate the G × E effects.
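Specifically, consider phenotype $Y$ that has homogeneous variance across values of binary environmental factor $E$ and haploid genotype $G$:
$$Y \mid E, G \sim \mathcal{D}(\nu, \sigma^2),$$
where $\mathcal{D}$ is a symmetric distribution with mean $\nu$ (which may depend on $E$ and $G$) and variance $\sigma^2$. Consider further fitting the following linear regression model to $Y$:
$$Y = \beta_0 + \beta_E E + \beta_G G + \beta_{G \times E}\, GE + \varepsilon, \tag{14}$$
where $\beta_0$, $\beta_E$, $\beta_G$ and $\beta_{G \times E}$ are estimated coefficients, and $\varepsilon$ is the error. We assume that:
$$\beta_{G \times E} = 0.$$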
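The coefficients in (14) can be related to the empirical conditional means of $Y$, which we denote by points $P_{g,e}$ (first index: genotype; second index: environment):
$$P_{g,e} = \widehat{\mathbb{E}}[Y \mid G = g, E = e], \qquad P_{0,0} = \beta_0, \quad P_{0,1} = \beta_0 + \beta_E, \quad P_{1,0} = \beta_0 + \beta_G, \quad P_{1,1} = \beta_0 + \beta_E + \beta_G + \beta_{G \times E}.$$
The order of $P_{0,0}$ and $P_{0,1}$, $P_{1,0}$ and $P_{1,1}$, $P_{0,0}$ and $P_{1,0}$, and $P_{0,1}$ and $P_{1,1}$ is determined by the signs of coefficients $\beta_E$ and $\beta_G$. To see this, note that if $\beta_E > 0$, then $P_{0,0} < P_{0,1}$. Alternatively, if $\beta_E < 0$, then $P_{0,0} > P_{0,1}$. Furthermore, by the assumption that $\beta_{G \times E}$ is null, $P_{1,1} - P_{1,0} = P_{0,1} - P_{0,0} = \beta_E$ and $P_{1,1} - P_{0,1} = P_{1,0} - P_{0,0} = \beta_G$ (Fig 7).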
When $\beta_{G \times E} = 0$, the signs of regression coefficients $\beta_E$ and $\beta_G$ in (14) determine the order of $P_{0,0}$ and $P_{0,1}$, $P_{1,0}$ and $P_{1,1}$, $P_{0,0}$ and $P_{1,0}$, and $P_{0,1}$ and $P_{1,1}$. Here,
.
We will use this fact to show that a regression similar to (14) on a monotone convex transformation of Y yields G × E effects whose directions depend on the directions of the corresponding main genetic effects.
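Consider $Y$ transformed by a function $h$, and a linear regression of this transformed $Y$ on $E$, $G$ and $GE$:
$$h(Y) = \gamma_0 + \gamma_E E + \gamma_G G + \gamma_{G \times E}\, GE + \varepsilon'. \tag{15}$$
We can relate the coefficients in (15) to points $P_{0,0}$, $P_{0,1}$, $P_{1,0}$, and $P_{1,1}$ that we have defined on the original scale of $Y$:
$$\gamma_G = \int \left[ h(P_{1,0} + e) - h(P_{0,0} + e) \right] f(e)\, de, \tag{16}$$
and likewise for $\gamma_{G \times E}$:
$$\gamma_{G \times E} = \int \left[ h(P_{1,1} + e) - h(P_{0,1} + e) - h(P_{1,0} + e) + h(P_{0,0} + e) \right] f(e)\, de, \tag{17}$$
where $f$ is the PDF of $\varepsilon$. Note that each point $P_{a,b}$ above is always shifted by the same value; and that the signs of the above expressions are invariant to this shift if $h$ is monotone convex:
$$\operatorname{sign}\left[ h(P_{1,0} + e) - h(P_{0,0} + e) \right] = \operatorname{sign}\left[ h(P_{1,0}) - h(P_{0,0}) \right],$$
$$\operatorname{sign}\left[ h(P_{1,1} + e) - h(P_{0,1} + e) - h(P_{1,0} + e) + h(P_{0,0} + e) \right] = \operatorname{sign}\left[ h(P_{1,1}) - h(P_{0,1}) - h(P_{1,0}) + h(P_{0,0}) \right].$$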
Without loss of generality, suppose that $h$ is increasing convex down. To determine the sign of $\gamma_{G \times E}$, we need to know the signs of differences $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$, and the relation between their magnitudes. Since $P_{1,0}$ and $P_{1,1}$ are shifted from $P_{0,0}$ and $P_{0,1}$ by the same value, $\beta_G$, the signs of $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$ are the same, and, by (16), follow the sign of $\gamma_G$ (Fig 8A). The relation between their magnitudes depends on the sign of $\beta_E$. If $\beta_E$ is positive, the magnitude of $h(P_{1,1}) - h(P_{0,1})$ is greater than the magnitude of $h(P_{1,0}) - h(P_{0,0})$, and the opposite is true if $\beta_E$ is negative.
A: The relation between the signs of $\gamma_G$ and $\gamma_{G \times E}$ when $\beta_G$ and $\beta_E$ are positive and $h$ is increasing convex down. B: Similar to A, but when $\beta_G$ is negative. C: Similar to A, but when $h$ is increasing convex up. D: Similar to A, but when $\beta_G$ is negative and $h$ is increasing convex up.
If two genetic variants, $G_1$ and $G_2$, are regressed like $G$ in (15) (and the assumptions of model (14) are met) such that the sign of $\gamma_E$ in these two cases is the same, the sign of $\gamma_{G \times E}$ differs between these regressions only if the sign of $\gamma_G$ differs.
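For example, when $h$ is increasing convex down and $\beta_E$ is positive, then:
$\beta_G > 0$ implies $\gamma_{G \times E} > 0$, because both $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$ are positive, and $h(P_{1,1}) - h(P_{0,1}) > h(P_{1,0}) - h(P_{0,0})$ (Fig 8A).
$\beta_G < 0$ implies $\gamma_{G \times E} < 0$, because both $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$ are negative, and $h(P_{1,1}) - h(P_{0,1}) < h(P_{1,0}) - h(P_{0,0})$ (Fig 8B).
Therefore, in this example, $\operatorname{sign}(\gamma_{G \times E}) = \operatorname{sign}(\gamma_G)$.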
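Similarly, when $\beta_E$ is positive, but $h$ is increasing convex up, then:
$\beta_G > 0$ implies $\gamma_{G \times E} < 0$, because both $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$ are positive, and $h(P_{1,1}) - h(P_{0,1}) < h(P_{1,0}) - h(P_{0,0})$ (Fig 8C).
$\beta_G < 0$ implies $\gamma_{G \times E} > 0$, because both $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$ are negative, and $h(P_{1,1}) - h(P_{0,1}) > h(P_{1,0}) - h(P_{0,0})$ (Fig 8D).
Therefore, in this example, $\operatorname{sign}(\gamma_{G \times E}) = -\operatorname{sign}(\gamma_G)$.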
Note that if $h$ is increasing convex down, $-h$ is decreasing convex up; and if $h$ is increasing convex up, $-h$ is decreasing convex down. Change of the sign of the function inverts the directions of both $\gamma_G$ and $\gamma_{G \times E}$ (see (16) and (17)). Thus, those pairs of transformations induce the same relation between the signs of $\gamma_G$ and $\gamma_{G \times E}$.
Finally, change of the sign of $\beta_E$ inverts the relation between the magnitudes of differences $h(P_{1,1}) - h(P_{0,1})$ and $h(P_{1,0}) - h(P_{0,0})$, which, for a given transformation, results in an inverted relation between the signs of $\gamma_G$ and $\gamma_{G \times E}$. We summarize all possible cases in Table 2.
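In S1 Appendix, we show that the distinction between increasing and decreasing transformations is absorbed into the observed coefficients $\gamma_G$ and $\gamma_E$, yielding the unified sign rule $\operatorname{sign}(\gamma_{G \times E}) = \operatorname{sign}(h'') \cdot \operatorname{sign}(\gamma_G) \cdot \operatorname{sign}(\gamma_E)$, where $\operatorname{sign}(h'')$ is positive for convex down and negative for convex up transformations. In practice, this formula can be applied directly using the estimated coefficients without needing to determine whether the transformation is increasing or decreasing.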
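The sign relations derived in this section can be checked numerically. The sketch below assumes a phenotype that is additive in a binary G and E on the original scale, uses a small symmetric set of error values as a stand-in for the symmetric error distribution, and compares the sign of the induced interaction with the product of the signs of the two main effects for several common transformations. The helper names, parameter values, and the encoding of curvature (+1 for convex down, such as the exponential; -1 for convex up, such as the logarithm) are illustrative assumptions.

```python
import math


def sign(x):
    return (x > 0) - (x < 0)


def induced_effects(h, beta_g, beta_e, mu=5.0, noise=(-0.5, 0.0, 0.5)):
    """Main and interaction effects of h(Y) when Y = mu + beta_g*G + beta_e*E + eps,
    with eps taking each value in `noise` with equal probability."""
    def mean_h(g, e):
        return sum(h(mu + beta_g * g + beta_e * e + eps) for eps in noise) / len(noise)

    gamma_g = mean_h(1, 0) - mean_h(0, 0)
    gamma_e = mean_h(0, 1) - mean_h(0, 0)
    gamma_gxe = mean_h(1, 1) - mean_h(1, 0) - mean_h(0, 1) + mean_h(0, 0)
    return gamma_g, gamma_e, gamma_gxe


# (name, transformation, curvature); all are monotone on the positive range used here.
transformations = [
    ("exp", math.exp, 1),
    ("square", lambda y: y * y, 1),
    ("exp(-y)", lambda y: math.exp(-y), 1),
    ("log", math.log, -1),
    ("sqrt", math.sqrt, -1),
    ("-exp", lambda y: -math.exp(y), -1),
]

checks = []
for name, h, curvature in transformations:
    for beta_g in (0.4, -0.4):
        for beta_e in (0.7, -0.7):
            gamma_g, gamma_e, gamma_gxe = induced_effects(h, beta_g, beta_e)
            checks.append(sign(gamma_gxe) == curvature * sign(gamma_g) * sign(gamma_e))

all_match = all(checks)
print(f"{sum(checks)} of {len(checks)} configurations follow the sign rule")
```

The decreasing transformations are included to show that flipping the direction of the transformation flips both main-effect signs together, so the product of their signs, and hence the rule, is unaffected.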
Sign consistency of G × E effects under a threshold-based model of endogenous treatment effects
Consider the following model of phenotype $Y$:
$$Y = \sum_i \beta_i G_i + \varepsilon,$$
where $G_i$ indicates the presence of an alternative allele at variant $i$, $\beta_i$ is the effect of this allele on $Y$, and $\varepsilon$ is the environmental noise. We assume that the genotypes are independent: $G_i \perp G_k$ for $i \neq k$, and that the environmental noise is homoskedastic: $\varepsilon \sim \mathcal{N}(0, \sigma_\varepsilon^2)$ independently of the genotypes. Note that, unlike in our previous derivation where no specific generating model is assumed, here we assume this is the actual generating process. Without loss of generality, let $\operatorname{Var}(Y) = 1$.
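Suppose that if the level of $Y$ is high, treatment $E$ is administered:
$$E = \mathbb{1}[Y > t],$$
where $t$ is some threshold. When applied, treatment $E$ changes the level of $Y$ by $\delta$:
$$\tilde{Y} = Y + \delta E.$$
Suppose further that we observe phenotype $\tilde{Y}$ and test the effect $\alpha_{G_j \times E}$ of the interaction between variant $G_j$ and environmental factor $E$ on this phenotype:
$$\tilde{Y} = \alpha_0 + \alpha_E E + \alpha_{G_j} G_j + \alpha_{G_j \times E}\, G_j E + \eta. \tag{18}$$
We prove that the sign of $\alpha_{G_j \times E}$ is determined by the sign of the main effect $\alpha_{G_j}$ of $G_j$ (Claim 1).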
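Note that coefficients $\alpha_{G_j}$ and $\alpha_{G_j \times E}$ can be related to empirical conditional expectations of $\tilde{Y}$:
$$\alpha_{G_j} = \widehat{\mathbb{E}}[\tilde{Y} \mid G_j = 1, E = 0] - \widehat{\mathbb{E}}[\tilde{Y} \mid G_j = 0, E = 0],$$
$$\alpha_{G_j \times E} = \left( \widehat{\mathbb{E}}[\tilde{Y} \mid G_j = 1, E = 1] - \widehat{\mathbb{E}}[\tilde{Y} \mid G_j = 0, E = 1] \right) - \alpha_{G_j}.$$
Furthermore, note that phenotype $\tilde{Y}$ conditioned on the value of $E$ has a truncated normal distribution, and its conditional mean is given by:
$$\mathbb{E}[\tilde{Y} \mid E = 0] = \mu - \frac{\phi(t - \mu)}{\Phi(t - \mu)}, \qquad \mathbb{E}[\tilde{Y} \mid E = 1] = \mu + \delta + \frac{\phi(t - \mu)}{1 - \Phi(t - \mu)},$$
where $\phi$ is the probability density function and $\Phi$ is the cumulative distribution function of the standard normal distribution, and $\mu$ is the mean of $Y$. Furthermore, the value of $\mu$ depends on the genotype $G_j$:
$$\mu_g = \mathbb{E}[Y \mid G_j = g] = \beta_j g + \sum_{i \neq j} \beta_i\, \mathbb{E}[G_i],$$
where $g \in \{0, 1\}$.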
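To simplify the above expressions, we denote the inverse Mills ratio $\lambda(x) = \phi(x) / \Phi(x)$, and note that $\lambda(-x) = \phi(x) / (1 - \Phi(x))$, because $\phi$ is even, and $\Phi(-x) = 1 - \Phi(x)$. Furthermore, we define points $P_0 = t - \mu_0$ and $P_1 = t - \mu_1$, and express the estimated effects $\alpha_{G_j}$ and $\alpha_{G_j \times E}$ in (18) as a function of these points:
$$\alpha_{G_j} = (P_0 - P_1) - \left[ \lambda(P_1) - \lambda(P_0) \right], \tag{19}$$
$$\alpha_{G_j \times E} = \left[ \lambda(-P_1) - \lambda(-P_0) \right] + \left[ \lambda(P_1) - \lambda(P_0) \right]. \tag{20}$$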
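The function $\lambda$ is decreasing and strictly convex down [26] (Fig 9). As a result, the order of points $P_0$ and $P_1$ determines the signs of $\alpha_{G_j}$ and $\alpha_{G_j \times E}$. Without loss of generality, let $t > 0$. Since $t$ distinguishes “high” from “normal” levels of phenotype $Y$, it is reasonable to assume that means $\mu_0$ and $\mu_1$ are smaller than $t$ (note that variants $G_i$ are independent); that is, any individual SNP does not result in high enough $Y$ to receive the treatment, as is likely for any polygenic trait. There are therefore two possible cases:
$\beta_j > 0$, which imposes the following order on the points used in definitions (19) and (20): $-P_0 < -P_1 < 0 < P_1 < P_0$ (Fig 9A). Given this order and the properties of $\lambda$, we have: 1) $\lambda(-P_1) - \lambda(-P_0) < 0$, $\lambda(P_1) - \lambda(P_0) > 0$, and $|\lambda(-P_1) - \lambda(-P_0)| > |\lambda(P_1) - \lambda(P_0)|$, which implies that $\alpha_{G_j \times E}$ is negative; and 2) $\lambda(P_1) - \lambda(P_0) < P_0 - P_1$, which implies that $\alpha_{G_j}$ is positive.
$\beta_j < 0$, which results in: $-P_1 < -P_0 < 0 < P_0 < P_1$ (Fig 9B). Given this order and the properties of $\lambda$, we have: 1) $\lambda(-P_1) - \lambda(-P_0) > 0$, $\lambda(P_1) - \lambda(P_0) < 0$, and $|\lambda(-P_1) - \lambda(-P_0)| > |\lambda(P_1) - \lambda(P_0)|$, which implies that $\alpha_{G_j \times E}$ is positive; 2) $\lambda(P_0) - \lambda(P_1) < P_1 - P_0$, which implies that $\alpha_{G_j}$ is negative.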
A: The x-axis shows the quantities $P_0$ and $P_1$ defined for phenotype $Y$ affected by the haploid genetic variant $G_j$, where $\beta_j > 0$. The main effect of $G_j$ and the effect of $G_j E$ on phenotype $Y$, after treatment $E$ is applied to reduce levels of $Y$ that exceed threshold $t$, can be expressed as functions of $P_0$ and $P_1$ and their images under the inverse Mills ratio, $\lambda(P_0)$ and $\lambda(P_1)$. The signs of those functions are dependent. B: Similar to A, but when $\beta_j < 0$.
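We have, therefore, shown that:
$$\operatorname{sign}(\alpha_{G_j \times E}) = -\operatorname{sign}(\alpha_{G_j}), \tag{21}$$
which means that the estimated effects $\alpha_{G_j}$ and $\alpha_{G_j \times E}$ in regression (18) have opposite directions. It can be analogously shown that relation (21) holds when treatment E is administered whenever the level of phenotype Y is below threshold t.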
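The threshold-based argument can be illustrated with a short Monte Carlo sketch. It assumes a single haploid variant with allele frequency 0.5, a standard normal background that lumps together all other variants and the environmental noise, a treatment given when Y exceeds the threshold, and a treatment effect that lowers Y by one unit. The closed-form expressions in `expected_effects` follow from the truncated normal mean with unit variance and the inverse Mills ratio defined as the ratio of the standard normal density to its distribution function; they are an assumed reading of the derivation, not a transcription of the paper's equations. All parameter values and function names are hypothetical.

```python
import math
import random


def inverse_mills(x):
    """pdf(x) / cdf(x) for the standard normal distribution."""
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-x / math.sqrt(2.0))
    return pdf / cdf


def expected_effects(beta_j, threshold):
    """Assumed closed forms with baseline mean 0, unit variance, P_g = t - mu_g."""
    p0, p1 = threshold, threshold - beta_j
    main = (p0 - p1) - (inverse_mills(p1) - inverse_mills(p0))
    interaction = (inverse_mills(-p1) - inverse_mills(-p0)) \
        + (inverse_mills(p1) - inverse_mills(p0))
    return main, interaction


def simulate_effects(beta_j, threshold=1.0, delta=-1.0, n=200_000, seed=1):
    """Main and interaction effects of G_j on the treated phenotype, from cell means."""
    rng = random.Random(seed)
    sums = {(g, e): 0.0 for g in (0, 1) for e in (0, 1)}
    counts = {(g, e): 0 for g in (0, 1) for e in (0, 1)}
    for _ in range(n):
        g = 1 if rng.random() < 0.5 else 0
        y = beta_j * g + rng.gauss(0.0, 1.0)
        e = 1 if y > threshold else 0      # treatment is given when Y is high
        sums[g, e] += y + delta * e        # observed, treated phenotype
        counts[g, e] += 1
    mean = {k: sums[k] / counts[k] for k in sums}
    main = mean[1, 0] - mean[0, 0]
    interaction = (mean[1, 1] - mean[0, 1]) - main
    return main, interaction


results = {}
for beta_j in (0.3, -0.3):
    results[beta_j] = (expected_effects(beta_j, 1.0), simulate_effects(beta_j))
    (exp_main, exp_gxe), (sim_main, sim_gxe) = results[beta_j]
    print(f"beta_j={beta_j:+.1f} main: {exp_main:+.3f} / {sim_main:+.3f} "
          f"interaction: {exp_gxe:+.3f} / {sim_gxe:+.3f}")
```

The treatment effect cancels from the interaction contrast, so the opposite signs of the main and interaction estimates in this sketch arise from the threshold-based allocation alone.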
Simulation details
We conducted simulations to assess sign consistency of estimated G × E effects after applying monotone convex transformations to outcomes that exhibit G × E on the original scale. In each simulation, outcomes were generated according to a linear model with additive effects of $X$ and interaction effects of $E \odot X$, where $E$ is a binary environmental exposure, $X$ is a matrix of 200 independent diploid SNPs, and $E \odot X$ denotes the element-wise product of $E$ with each column of $X$. Additive genetic effects were drawn from a standard normal distribution. We randomly selected 100 SNPs to also have interaction effects, with half having interaction effects aligned in sign and half opposed in sign relative to their corresponding main effects. Interaction effects for these 100 SNPs were drawn from a Gaussian distribution with mean zero and variance $\sigma^2_{G \times E}$. The residual variance was scaled to achieve heritability 0.5 among samples with $E = 0$. A total of 10,000 samples were simulated (5,000 per environmental condition). Statistical significance of G × E effects was assessed at a 5% false positive rate using single-SNP regressions applied to the original outcome $Y$ and to the transformed outcome.
To estimate the number of detected G × E effects and the sign consistency rate, 100 independent replicates were performed for each value of $\sigma^2_{G \times E}$, and results were summarized using the mean and standard deviation.
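A scaled-down sketch of this kind of simulation is shown below. It is not the configuration used for the reported results: it uses 30 SNPs instead of 200, a single replicate, an exponential transformation, an assumed main environmental effect of 1.5, effect sizes rescaled so that the outcome has a small variance, and no significance filter. Because E is binary, the regression of the outcome on E, G, and G × E is fitted as two separate simple regressions within E = 0 and E = 1, which gives the same main and interaction estimates. All names and parameter values are illustrative assumptions.

```python
import math
import random


def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def gxe_estimates(genotypes, exposure, outcome):
    """Main and interaction effects from Y ~ E + G + G*E with binary E."""
    g0 = [g for g, e in zip(genotypes, exposure) if e == 0]
    y0 = [y for y, e in zip(outcome, exposure) if e == 0]
    g1 = [g for g, e in zip(genotypes, exposure) if e == 1]
    y1 = [y for y, e in zip(outcome, exposure) if e == 1]
    main = slope(g0, y0)
    return main, slope(g1, y1) - main


def consistency_rate(pairs):
    same = sum(1 for main, gxe in pairs if (main > 0) == (gxe > 0))
    return max(same, len(pairs) - same) / len(pairs)


def simulate(n=10_000, m=30, beta_e=1.5, seed=7):
    rng = random.Random(seed)
    exposure = [i % 2 for i in range(n)]  # half of the samples per environment
    # Diploid genotypes with allele frequency 0.5 (sum of two Bernoulli draws).
    snps = [[(rng.random() < 0.5) + (rng.random() < 0.5) for _ in range(n)]
            for _ in range(m)]
    beta = [rng.gauss(0.0, 1.0) for _ in range(m)]
    # Assumption: rescale so the genetic variance among E = 0 samples is 0.125.
    scale = math.sqrt(0.125 / (0.5 * sum(b * b for b in beta)))
    beta = [b * scale for b in beta]
    gamma = []
    for k, b in enumerate(beta):
        if k >= m // 2:
            gamma.append(0.0)  # half of the SNPs have no interaction effect
        else:
            # Alternate between interactions aligned with and opposed to the main effect.
            direction = math.copysign(1.0, b) if k % 2 == 0 else -math.copysign(1.0, b)
            gamma.append(direction * abs(rng.gauss(0.0, 0.5 * scale)))
    noise_sd = math.sqrt(0.125)  # heritability 0.5 among samples with E = 0
    y = []
    for i in range(n):
        e = exposure[i]
        value = beta_e * e + rng.gauss(0.0, noise_sd)
        for k in range(m):
            value += (beta[k] + gamma[k] * e) * snps[k][i]
        y.append(value)
    exp_y = [math.exp(v) for v in y]
    raw = [gxe_estimates(snps[k], exposure, y) for k in range(m)]
    transformed = [gxe_estimates(snps[k], exposure, exp_y) for k in range(m)]
    return consistency_rate(raw), consistency_rate(transformed)


rate_raw, rate_exp = simulate()
print(f"sign consistency: original scale {rate_raw:.2f}, after exp {rate_exp:.2f}")
```

A fuller version would repeat this over many replicates and interaction variances and restrict the rate to interactions passing a significance threshold, as described in the text.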
TxEWAS in the UK Biobank
The TxEWAS presented in this work were performed following the Sadowski et al. protocol [27]. The studied UK Biobank population of 342,257 unrelated white British individuals was identified by performing the steps described by Sadowski et al. [8].
We imputed gene expression into the UK Biobank using eQTL weights trained in 48 tissues of The Genotype-Tissue Expression (GTEx v7) project, linked by the TxEWAS protocol [27]. Hierarchical FDR (hFDR < 10%) was used to account for multiple hypothesis testing across genes and tissues [37,38].
Individuals who took statins were identified by codes: 1140861958, 1140861970, 1141146138, 1140888594, 1140888648, 1140910632, 1140910654, 1141146234, 1141192410, 1141192414, 1141188146, 1140881748, and 1140864592 in the UK Biobank field 20003-0.0-47. Smoking status was derived from the UK Biobank field 20116-0.0 by encoding the “current” category as 1, and the categories of “never” and “previous” as 0.
For all tested outcomes except testosterone, we discarded measurements greater than five standard deviations from the mean, with the assumption that such extreme levels were results of non-modeled circumstances. The distribution of testosterone levels was bimodal, but the sign consistency pattern for this phenotype presented in Fig 5A remained similar after inverse normally transforming it.
We included age, sex, birth date, Townsend deprivation index, and the first 16 genetic PCs [29] as covariates in our studies. All non-binary covariates were standardized (transformed to mean zero, variance one) before calculating interaction variables.
Supporting information
S1 Appendix. Derivation of the sign-consistent interaction property.
https://doi.org/10.1371/journal.pgen.1012073.s001
(PDF)
S1 File. Supporting Information.
Supplementary notes.
https://doi.org/10.1371/journal.pgen.1012073.s002
(PDF)
S1 Fig. Sign consistency after square-transformation of a simulated outcome with randomly directed G × E effects (Methods).
A: Number of detected G × E effects for outcomes on the original and transformed scales as a function of the variance of the simulated G × E effects. B: Estimated rate of sign consistency for outcomes on the original and transformed scales as a function of the variance of the simulated G × E effects. The sign consistency rate was defined as the proportion of G × E effects exhibiting the more prevalent sign relationship with their corresponding main effects. Because of this definition, the sign consistency rate for the untransformed outcome may exceed 0.5.
https://doi.org/10.1371/journal.pgen.1012073.s003
(PDF)
S2 Fig. Sign consistency after inverse normal transformation of a simulated outcome with randomly directed G × E effects (Methods).
A: Number of detected G × E effects for outcomes on the original and transformed scales as a function of the variance of the simulated G × E effects. B: Estimated rate of sign consistency for outcomes on the original and transformed scales as a function of the variance of the simulated G × E effects. The sign consistency rate was defined as the proportion of G × E effects exhibiting the more prevalent sign relationship with their corresponding main effects. Because of this definition, the sign consistency rate for the untransformed outcome may exceed 0.5.
https://doi.org/10.1371/journal.pgen.1012073.s004
(PDF)
References
- 1. Hillert A, Anikster Y, Belanger-Quintana A, Burlina A, Burton BK, Carducci C, et al. The Genetic Landscape and Epidemiology of Phenylketonuria. American Journal of Human Genetics. 2020;107(2):234–50.
- 2. Rampersaud E, Mitchell BD, Pollin TI, Fu M, Shen H, O’Connell JR, et al. Physical activity and the association of common FTO gene variants with body mass index and obesity. Arch Intern Med. 2008;168(16):1791–7. pmid:18779467
- 3. Freedman ND, Silverman DT, Hollenbeck AR, Schatzkin A, Abnet CC. Association between smoking and risk of bladder cancer among men and women. JAMA. 2011;306(7):737–45. pmid:21846855
- 4. Pirmohamed M. Pharmacogenomics: current status and future perspectives. Nature Reviews Genetics. 2023;24(6):350–62.
- 5. Franczyk B, Rysz J, Gluba-Brzózka A. Pharmacogenetics of Drugs Used in the Treatment of Cancers. Genes (Basel). 2022;13(2):311. pmid:35205356
- 6. Wang C-W, Preclaro IAC, Lin W-H, Chung W-H. An Updated Review of Genetic Associations With Severe Adverse Drug Reactions: Translation and Implementation of Pharmacogenomic Testing in Clinical Practice. Front Pharmacol. 2022;13:886377. pmid:35548363
- 7. Takeuchi F, McGinnis R, Bourgeois S, Barnes C, Eriksson N, Soranzo N, et al. A genome-wide association study confirms VKORC1, CYP2C9, and CYP4F2 as principal genetic determinants of warfarin dose. PLoS Genet. 2009;5(3):e1000433. pmid:19300499
- 8. Sadowski M, Thompson M, Mefford J, Haldar T, Oni-Orisan A, Border R, et al. Characterizing the genetic architecture of drug response using gene-context interaction methods. Cell Genom. 2024;4(12):100722. pmid:39637863
- 9. Marderstein AR, Kulm S, Peng C, Tamimi R, Clark AG, Elemento O. A polygenic-score-based approach for identification of gene-drug interactions stratifying breast cancer risk. Am J Hum Genet. 2021;108(9):1752–64. pmid:34363748
- 10. Miao J, Lin Y, Wu Y, Zheng B, Schmitz LL, Fletcher JM, et al. A quantile integral linear model to quantify genetic effects on phenotypic variability. Proc Natl Acad Sci U S A. 2022;119(39):e2212959119. pmid:36122202
- 11. Zhu C, Ming MJ, Cole JM, Edge MD, Kirkpatrick M, Harpak A. Amplification is the primary mode of gene-by-sex interaction in complex human traits. Cell Genom. 2023;3(5):100297. pmid:37228747
- 12. Durvasula A, Price AL. Distinct explanations underlie gene-environment interactions in the UK Biobank. Am J Hum Genet. 2025;112(3):644–58. pmid:39965571
- 13. Pazokitoroudi A, Liu Z, Dahl A, Zaitlen N, Rosset S, Sankararaman S. A scalable and robust variance components method reveals insights into the architecture of gene-environment interactions underlying complex traits. Am J Hum Genet. 2024;111(7):1462–80. pmid:38866020
- 14. Zhu X, Yang Y, Lorincz-Comi N, Li G, Bentley AR, de Vries PS, et al. An approach to identify gene-environment interactions and reveal new biological insight in complex traits. Nat Commun. 2024;15(1):3385. pmid:38649715
- 15. Di Scipio M, Khan M, Mao S, Chong M, Judge C, Pathan N, et al. A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets. Nat Commun. 2023;14(1):5196. pmid:37626057
- 16. Wang X, Elston RC, Zhu X. The meaning of interaction. Hum Hered. 2010;70(4):269–77. pmid:21150212
- 17. Thompson WD. Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol. 1991;44(3):221–32. pmid:1999681
- 18. Greenland S. Interactions in epidemiology: relevance, identification, and estimation. Epidemiology. 2009;20(1):14–7. pmid:19234397
- 19. Gauderman WJ, Mukherjee B, Aschard H, Hsu L, Lewinger JP, Patel CJ, et al. Update on the State of the Science for Analytical Methods for Gene-Environment Interactions. Am J Epidemiol. 2017;186(7):762–70. pmid:28978192
- 20. Sverdlov S, Thompson EA. The epistasis boundary: Linear vs. nonlinear genotype-phenotype relationships. bioRxiv. 2018;2018:503466.
- 21. Sverdlov S, Thompson E. Combinatorial Methods for Epistasis and Dominance. Journal of Computational Biology. 2017;24(4):267–79.
- 22. Ottman R. Gene-environment interaction: definitions and study designs. Prev Med. 1996;25(6):764–70. pmid:8936580
- 23. Dick DM. Gene-environment interaction in psychological traits and disorders. Annu Rev Clin Psychol. 2011;7:383–409. pmid:21219196
- 24. Barcellos SH, Carvalho LS, Turley P. Education can reduce health differences related to genetic risk of obesity. Proc Natl Acad Sci U S A. 2018;115(42):E9765–72. pmid:30279179
- 25. Westerman KE, Sofer T. Many roads to a gene-environment interaction. Am J Hum Genet. 2024;111(4):626–35. pmid:38579668
- 26. Sampford MR. Some Inequalities on Mill’s Ratio and Related Functions. Ann Math Statist. 1953;24(1):130–2.
- 27. Sadowski M, Dahl AW, Zaitlen N. Protocol to estimate the heritability of drug response with GxEMM and identify gene-drug interactions with TxEWAS. STAR Protoc. 2025;6(2):103780. pmid:40249708
- 28. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562(7726):203–9. pmid:30305743
- 29. Privé F, Aschard H, Carmi S, Folkersen L, Hoggart C, O’Reilly PF, et al. Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. Am J Hum Genet. 2022;109(1):12–23. pmid:34995502
- 30. Preiss D, Welsh P, Murphy SA, Ho JE, Waters DD, Demicco DA. Risk of incident diabetes with intensive-dose. JAMA. 2011;305(24):2556–64.
- 31. Collins R, Reith C, Emberson J, Armitage J, Baigent C, Blackwell L, et al. Interpretation of the evidence for the efficacy and safety of statin therapy. Lancet. 2016;388(10059):2532–61. pmid:27616593
- 32. Sheppard B, Rappoport N, Loh P-R, Sanders SJ, Zaitlen N, Dahl A. A model and test for coordinated polygenic epistasis in complex traits. Proc Natl Acad Sci U S A. 2021;118(15):e1922305118. pmid:33833052
- 33. Dudbridge F, Fletcher O. Gene-environment dependence creates spurious gene-environment interaction. Am J Hum Genet. 2014;95(3):301–7. pmid:25152454
- 34. Dahl A, Nguyen K, Cai N, Gandal MJ, Flint J, Zaitlen N. A Robust Method Uncovers Significant Context-Specific Heritability in Diverse Complex Traits. Am J Hum Genet. 2020;106(1):71–91. pmid:31901249
- 35. Almli LM, Duncan R, Feng H, Ghosh D, Binder EB, Bradley B, et al. Correcting systematic inflation in genetic association tests that consider interaction effects: application to a genome-wide association study of posttraumatic stress disorder. JAMA Psychiatry. 2014;71(12):1392–9. pmid:25354142
- 36. Mefford J, Smullen M, Zhang F, Sadowski M, Border R, Dahl A, et al. Beyond predictive R2: Quantile regression and non-equivalence tests reveal complex relationships of traits and polygenic scores. Am J Hum Genet. 2025;112(6):1363–75. pmid:40480198
- 37. Peterson CB, Bogomolov M, Benjamini Y, Sabatti C. TreeQTL: hierarchical error control for eQTL findings. Bioinformatics. 2016;32(16):2556–8. pmid:27153635
- 38. Peterson CB, Bogomolov M, Benjamini Y, Sabatti C. Many Phenotypes Without Many False Discoveries: Error Controlling Strategies for Multitrait Association Studies. Genet Epidemiol. 2016;40(1):45–56. pmid:26626037