Mendelian Randomisation and Causal Inference in Observational Epidemiology

Nuala Sheehan and colleagues describe how Mendelian randomization provides an alternative way of dealing with the problems of observational studies, especially confounding.


The Problem of Inferring Causality in Epidemiology
The notion of risk is central to epidemiological research, both in its original context of studying conditions thought to be caused by a particular factor and, more broadly, in predicting the probability of a condition for prognostic purposes. For prognostic research, all factors associated with the outcome are of interest, whether they are causal or not. In aetiological research, on the other hand, causality is meaningful. Here, the focus is often on assessing the effect of some modifiable exposure on a disease with a view to informing health interventions at the individual or population level, or health advice for particular risk groups. For such intervention or advice to be effective, it is important to verify that the observed association between the exposure and disease means that the exposure is in fact causal for the disease. For example, once the relationship between periconceptual maternal folate supplementation and risk of neural tube defects was established [1,2], the United States, Canada, and Chile implemented mandatory fortification of cereal flour and related foods with folic acid and reported reductions in neural tube defect incidence between 27% and just over 50% [3]. However, observational research has had several high-profile failures when exposures that seemed to affect disease risk were later shown to be non-causal in follow-up randomised controlled trials (RCTs). For instance, observational evidence that seemed to suggest that vitamin E is protective for cardiovascular disease, beta-carotene for cancer, and, more recently, oestrogen for dementia, has now been refuted [4]. Since only candidate causes with the strongest observational support tend to be followed up in RCTs when these are possible, it is likely that many more reported observational findings are not actually causal [5].
Inferring causality from observational data is problematic as it is not always clear which of two associated variables is the cause and which the effect, or whether both are common effects of a third unobserved variable, or confounder (see Glossary). The direction of causality can sometimes be determined by temporal criteria (e.g., the cause must precede the effect) or from knowledge of the underlying biology. Confounding is more difficult to deal with because it is mainly due to social, behavioural, or physiological factors that are difficult to measure and control for. In practice, one can never be sure that the relevant confounders have been identified and accounted for. Besides the fact that RCTs are not feasible or ethical for many exposures of public health relevance, such as toxins, physical activity, or complex nutritional regimes, observational studies also have some advantages over RCTs; for example, the subjects in the latter are not always representative of the population for which an intervention is being considered [6]. "Mendelian randomisation" provides an alternative way of dealing with the problems of observational studies [6][7][8][9], especially for the case where confounding is believed to be present but cannot be controlled for because it is not fully understood.

Mendelian Randomisation
We outline the idea now known as "Mendelian randomisation" using the example provided by Katan [10] in his early description of the concept in 1986, although the first implementation of this basic idea in an epidemiological setting under the flag of "Mendelian randomisation" was more recent [11]. Details of the derivation of the approach and its nomenclature are provided in a recent review [12].
In the mid-1980s, there was considerable debate over the hypothesis that low serum cholesterol levels might directly increase the risk of cancer. Alternative explanations for the observed association were that cholesterol levels were lowered by the presence of latent tumours in future cancer patients (reverse causation), or that both cancer risk and cholesterol levels might be affected by confounding factors like diet and smoking. The observation that individuals with abetalipoproteinaemia, and hence negligible levels of serum cholesterol, did not seem to be predisposed to cancer led Katan to the idea of finding a larger group of individuals genetically inclined towards lower cholesterol levels. The apolipoprotein E (ApoE) gene was known to affect serum cholesterol, the ApoE2 variant being associated with lower levels. Katan's idea was that many individuals will carry the ApoE2 variant and thus will naturally have lower cholesterol levels from birth. Crucially, since genes are randomly assigned during meiosis (which gives rise to the name "Mendelian randomisation"), these ApoE2 carriers will not be systematically different from carriers of the other ApoE alleles in any other respect, and in consequence there should be no confounding. Only if low serum cholesterol is really causal for the disease should cancer patients have more ApoE2 alleles than controls. Otherwise the distribution of ApoE alleles should be similar in both groups. This can be easily checked from the observed distributions.
Katan's reasoning corresponds exactly to what is known as an instrumental variable method in econometrics [13][14][15][16]. The genetic variant acts as a so-called instrumental variable (or instrument) and helps to disentangle the confounded causal relationship between intermediate phenotype and disease. Once this theoretical connection had been made, epidemiologists were able to learn from and adapt the methods that were so well known in econometrics [7,17].
The three key assumptions for Katan's idea to work, and hence for a genetic variant to qualify as an instrumental variable, are illustrated graphically in Figure 1 and interpreted as follows.
1. The genetic variant is unrelated to (independent of) the typical confounding factors, i.e., the graph has no arrow (in either direction) connecting ApoE with the confounders.
2. The genetic variant is (reliably) associated with the exposure, i.e., there is an arrow connecting ApoE to serum cholesterol and we can accurately quantify the relationship this represents.
3. For known exposure status (cholesterol level) and known confounders (if the confounders were observable), i.e., conditional on exposure and confounders, the genetic variant is independent of the outcome, i.e., ApoE does not provide any additional information for the prediction of cancer once these two variables are measured. An equivalent way of expressing this, which is less precise but perhaps more intuitive, is to say that there is no direct effect of genotype on disease (no single arrow between ApoE and cancer) nor any other mediated effect other than through the exposure of interest (no other routes in the graph between ApoE and cancer).
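The three assumptions can also be written compactly in conditional-independence notation. The symbols below are introduced here for illustration and do not appear in the original text: G is the genotype (ApoE), X the exposure (serum cholesterol), Y the outcome (cancer), and U the unobserved confounders.

```latex
% Notation assumed for this sketch only:
% G = genotype, X = exposure, Y = outcome, U = unobserved confounders.
\begin{align*}
\text{(1)}\quad & G \perp\!\!\!\perp U
  && \text{genotype independent of the confounders} \\
\text{(2)}\quad & G \not\perp\!\!\!\perp X
  && \text{genotype associated with the exposure} \\
\text{(3)}\quad & Y \perp\!\!\!\perp G \mid (X, U)
  && \text{genotype uninformative for the outcome given exposure and confounders}
\end{align*}
```

Written this way, it is apparent that (1) and (3) involve U and therefore cannot be checked from observed data alone, whereas (2) involves only observed quantities.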
Note that these assumptions have to be justified from background knowledge of the underlying biology. Neither the first nor the third assumption can be tested statistically since they depend on the confounding factors, which, by definition, are unobserved. The first assumption means that you must have reasonable belief that your genetic variant is unaffected by the sort of confounding that might generally be expected of such an exposure-disease relationship. Fortunately, the very basis of Mendelian randomisation rests on the knowledge that alleles are randomly assigned from parental alleles at meiosis (see above), and this implies that, across the population, genetic effects are relatively robust, although not immune to confounding [7,18]. Furthermore, the type of information needed to explore this assumption is often available in practice, as it is usually well-studied genetic variants that are proposed as instruments. Assumption 3 demands a comprehensive understanding of the underlying biological and clinical science, and may appropriately be considered in a sensitivity analysis. Unlike the first and third, the second assumption can formally be tested using the observed data, and the method works better the stronger the association between gene and exposure.
If the three assumptions seem reasonable (i.e., Figure 1 is believable), then it can be shown that, as Katan originally hypothesised, a simple statistical test of association between the ApoE genotype and cancer amounts to a test for causal effect of cholesterol levels on cancer [19].
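The logic of this test can be illustrated with a small simulation. The sketch below is not taken from the article: the allele frequency (0.3), the effect sizes, the use of a continuous rather than binary outcome, and all function names are assumptions chosen purely for illustration. Data are generated so that the three key assumptions hold, once with no causal effect of exposure on outcome and once with a causal effect of 0.5.

```python
import math
import random


def simulate(causal_effect, n=20000, seed=1):
    """Draw genotype G, exposure X and outcome Y so that the three key assumptions hold."""
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # variant allele count: 0, 1 or 2
        u = rng.gauss(0, 1)                                    # unobserved confounder
        x = -0.5 * g + u + rng.gauss(0, 1)                     # exposure, lowered by the variant
        y = causal_effect * x + u + rng.gauss(0, 1)            # outcome; no direct gene effect
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)
    return genotype, exposure, outcome


def slope_and_z(a, b):
    """Least-squares slope of b on a, with its z statistic (slope / standard error)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    slope = s_ab / s_aa
    rss = sum((bi - mean_b - slope * (ai - mean_a)) ** 2 for ai, bi in zip(a, b))
    se = math.sqrt(rss / (n - 2) / s_aa)
    return slope, slope / se


results = {}
for effect in (0.0, 0.5):
    g, x, y = simulate(effect)
    observational_slope, _ = slope_and_z(x, y)   # confounded exposure-outcome association
    _, gene_outcome_z = slope_and_z(g, y)        # the Mendelian randomisation test
    results[effect] = (observational_slope, gene_outcome_z)
    print(f"true effect {effect}: exposure-outcome slope {observational_slope:.2f}, "
          f"gene-outcome z {gene_outcome_z:.1f}")
```

In this set-up the exposure-outcome slope is clearly non-zero even when the true effect is zero, because of the shared confounder, whereas the gene-outcome test statistic is large only when the exposure really does affect the outcome.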
The idea of using a gene as an instrument to test for a causal effect of an intermediate phenotype on a disease has been used for a range of other traits, some of which are summarised in Table 1 [9,20-28]. For example, raised plasma fibrinogen levels have been associated with an elevated risk of coronary heart disease (CHD) in large-scale prospective studies, prompting suggestions that methods to reduce fibrinogen levels should be sought [29]. If the fibrinogen-CHD relationship were causal, then such interventions could have considerable clinical and public health benefits. However, interventions to lower plasma fibrinogen levels would not be warranted if the association was explained by confounding or reverse causation. Doubts about a causal link between fibrinogen and CHD have been raised by evidence that the association is considerably attenuated by adjustment for smoking, body mass index, and plasma apolipoprotein B/A1 ratio [20], and that there are many known correlates of fibrinogen, only some of which are typically measured and adjusted for in individual studies [30]. Furthermore, bezafibrate was found to reduce plasma fibrinogen in randomised controlled trials, but it did not have a greater effect on CHD risk than could already be explained by its cholesterol-lowering effect [31].
[Figure 1 caption: The arrows can be thought of as representing causal relationships, but this is not what matters here. What is essential is the absence of an arrow between ApoE and the confounders and between ApoE and cancer, as detailed in the three key assumptions in the text.]
Additional light can be cast on this relationship from relevant genetic studies. A recent large meta-analysis of genetic association studies of fibrinogen promoter region polymorphisms (G-455→A and C-148→T) showed that there was a mean increase in fibrinogen of 0.12 g/l (95% confidence interval [CI] 0.09 to 0.14) per copy of the A or T allele. However, these same alleles were not associated with CHD risk: the odds ratio per allele was 0.98 (95% CI 0.92 to 1.04) [21]. Since the 95% confidence interval includes the null hypothesis value of 1, we cannot reject the null hypothesis at the 5% level and hence conclude that the data provide little or no evidence for a causal effect of fibrinogen on CHD. This could be due to random error or lack of power of the statistical test, which is a problem with genetic association studies when relatively small effects are being sought. The findings are also consistent with the hypothesis that the associations shown previously in observational studies are partially or wholly explained by reverse causation or confounding. Of course, as with any test, the fact that an exposure appears to be non-causal does not necessarily mean that it is not clinically useful. Clearly, it would be dangerous to stop investigating the role of fibrinogen in CHD risk because of such an outcome. What is implied, however, is that more investigation is required before making any great investment in intervening on fibrinogen levels.
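The reasoning from confidence interval to test can be made explicit with some simple arithmetic on the figures quoted above. The sketch below assumes that each reported interval is of the usual form estimate ± 1.96 standard errors (on the log scale for the odds ratio) and uses a normal approximation; neither assumption is stated in the meta-analysis as summarised here, and the function name is hypothetical.

```python
import math


def wald_z_and_p(estimate, lower, upper, ratio_scale):
    """Recover a z statistic and two-sided p-value from an estimate and its 95% CI."""
    if ratio_scale:  # odds ratios are handled on the log scale, where the null value is 0
        estimate, lower, upper = math.log(estimate), math.log(lower), math.log(upper)
    se = (upper - lower) / (2 * 1.96)      # assumes the CI is estimate +/- 1.96 * SE
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value, normal approximation
    return z, p


# Gene-exposure association: 0.12 g/l (0.09 to 0.14) increase in fibrinogen per allele
z_exposure, p_exposure = wald_z_and_p(0.12, 0.09, 0.14, ratio_scale=False)

# Gene-disease association: odds ratio 0.98 (0.92 to 1.04) per allele
z_disease, p_disease = wald_z_and_p(0.98, 0.92, 1.04, ratio_scale=True)

print(f"gene-exposure: z = {z_exposure:.1f}, p = {p_exposure:.2g}")
print(f"gene-disease:  z = {z_disease:.2f}, p = {p_disease:.2f}")
```

Under these assumptions the gene-exposure association is very strongly supported, while the gene-disease p-value is well above 0.05, in line with the interval including 1. Because the published limits are rounded, the recovered standard errors are only approximate.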
Mendelian randomisation can also be applied when the exposure of interest is a modifiable behaviour rather than an intermediate phenotype. For example, Chen et al. [9] consider the causal effect of alcohol intake on blood pressure. An RCT would be problematic here, and measurement of alcohol intake is prone to error. Hence, observational data have to be considered in a setting where the causal relationship of interest is known to be heavily confounded. In some populations, a particular variant (*2) of the ALDH2 gene is quite common. The *2 variant is associated with accumulation of acetaldehyde, and therefore unpleasant symptoms, after drinking alcohol. Carriers of this variant tend to limit their alcohol consumption, and alleles at the ALDH2 locus can hence be used as a surrogate or proxy for alcohol intake [9]. Based on this assumption, a Mendelian randomisation meta-analysis approach, combining evidence from several studies, indicated that

Problems and Limitations
The limitations of Mendelian randomisation fall into two main categories. Firstly, the key assumptions for a genotype to be an instrument (see above) may not be plausible, in which case any inference about the causal effect will typically be biased. Such limitations include the presence of linkage disequilibrium, genetic heterogeneity, pleiotropy, population stratification, canalisation, or lack of knowledge about the confounding factors. These limitations have received a lot of attention in the literature [6,7,33]. However, graphs can be used as a visual check, and some apparent violations may not actually be problems in practice [19]. For example, Figure 2 addresses the case where the chosen instrument, Gene1, is in linkage disequilibrium with another gene, Gene2, which has not been observed. Here, Gene2 directly affects the disease level or risk, and hence Gene1 is not an instrument due to violation of the third key assumption. However, if Gene2 only affects the disease via its effect on the same intermediate exposure, as shown in Figure 3, there is no such violation and Gene1 can be used as an instrument in a Mendelian randomisation analysis. Note that Gene1 would also qualify as an instrument if its association with the exposure was only via its association with Gene2 (Figure 4). Hence, it does not really matter whether Gene1 or Gene2 is the causal variant for the exposure when they are in linkage disequilibrium, as either one qualifies as an instrumental variable in this case.
A similar check for violations can be applied to the situation described in Lawlor et al. [34], where the hypothesised causal effect of maternal adiposity on offspring adiposity is investigated using maternal FTO genotype as an instrument. The reason that one must also adjust for offspring FTO genotype in the relevant regression in order to perform a Mendelian randomisation analysis can be illustrated quite simply by the graph in Figure 5. Without adjusting for (conditioning on) offspring FTO, key assumption 3 would be violated due to the existence of an alternative path to the outcome via this genotype. Note that this situation is specific to the graph in Figure 5, which assumes that there is no other confounder of offspring FTO and offspring adiposity (such as paternal FTO).
If the three key assumptions of an instrumental variable are satisfied by the genetic variant, testing for a causal effect of phenotype on disease by testing for an association between genotype and disease is straightforward for most practical purposes. Any statistical test that is appropriate for the variables being considered will suffice. However, calculation of the magnitude of the causal effect requires additional strong assumptions, such as linearity of all relationships (e.g., constant increase of disease with exposure) and no interactions. If these assumptions are satisfied, we can obtain an estimate of the causal effect from a mathematically simple combination of the observed genotype-disease and genotype-exposure associations [13]. The second class of limitations of Mendelian randomisation concerns the validity of such additional assumptions. These limitations have not generated so much discussion to date, although in many observational studies the outcome is a binary variable, and, under the mathematical models that are typically applied (e.g., logistic or probit regression), conventional linearity is not satisfied [19]. In consequence, the estimate that is valid in the all-linear case should not really be applied to binary outcome data, although it has sometimes been advocated [17,26]. Generalisations of the instrumental variable method to the non-linear case can be found in the literature [8,15,35-39], but are typically aimed at very different kinds of applications. Their usefulness in the context of Mendelian randomisation has yet to be investigated. It is, perhaps, important to stress that these extra distributional assumptions are only an issue for estimation of the magnitude of the causal effect and not for testing for the presence of such an effect.
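For the all-linear case, the "simple combination" is taken here to be the standard instrumental variable ratio: the genotype-outcome slope divided by the genotype-exposure slope. The article does not spell the formula out, so this reading, together with the simulated data, the effect sizes, and the variable names below, should be treated as assumptions made for illustration. The outcome is continuous, so the linearity and no-interaction assumptions hold by construction.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


TRUE_EFFECT = 0.5            # causal effect of exposure on outcome (assumed)
rng = random.Random(7)
genotype, exposure, outcome = [], [], []
for _ in range(50000):
    g = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # variant allele count: 0, 1 or 2
    u = rng.gauss(0, 1)                                    # unobserved confounder
    x = -0.5 * g + u + rng.gauss(0, 1)                     # linear gene-exposure relationship
    y = TRUE_EFFECT * x + u + rng.gauss(0, 1)              # linear exposure-outcome relationship
    genotype.append(g)
    exposure.append(x)
    outcome.append(y)

# Slopes from regressing exposure and outcome on genotype
gene_exposure = covariance(genotype, exposure) / covariance(genotype, genotype)
gene_outcome = covariance(genotype, outcome) / covariance(genotype, genotype)

ratio_estimate = gene_outcome / gene_exposure  # instrumental variable (ratio) estimate
naive_estimate = covariance(exposure, outcome) / covariance(exposure, exposure)  # confounded

print(f"naive regression estimate: {naive_estimate:.2f}")
print(f"ratio estimate:            {ratio_estimate:.2f} (true effect {TRUE_EFFECT})")
```

In this sketch the naive regression of outcome on exposure is inflated by the confounder, whereas the ratio estimate lies close to the true effect. Replacing the continuous outcome with a binary one modelled by logistic regression breaks the linearity on which the ratio relies, which is the limitation described above.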

The Future for Mendelian Randomisation
A Mendelian randomisation analysis does not aim to identify genetic factors that are causal for disease risk in order to target individuals on the basis of their genotype. On the contrary, the focus is on the causal association between an exposure and a disease with a view to informing the potential impact of non-genetic interventions on that exposure. To that end, such analysis exploits a well-studied genetic    In order to widen the applicability of the approach, more general methods for the common, but statistically nonstandard, case with a binary disease outcome need to be developed. In particular, the relevance to observational epidemiology of related methods in other areas, especially in terms of the particular assumptions required, is currently being investigated. We should also stress the importance of obtaining good estimates from genetic association studies, in particular ensuring sufficiently large sample sizes with adequate power to detect the typically modest effects one might expect for the determinants of common multifactorial diseases [6,20,40]. The need to formally combine information from different sources, such as the large biobanks that are currently being set up worldwide, is also essential [41].
Mendelian randomisation has received its fair share of criticism (e.g., [42]). One objection is that good genetic instruments are not easy to find, but recent rapid advances in genetic epidemiology are addressing this issue [5]. Most criticisms concern the violations of the key assumptions implicit in Figure 1. Confounding of the genotype-disease relationship is one such violation that has received some attention. However, it has recently been re-emphasised that this violation may not be as serious as may at first appear because, as outlined above, Mendelian randomisation analyses are fundamentally less susceptible to confounding than conventional epidemiology analyses [18].

Summary
It is often unavoidable (and sometimes desirable) to use observational data to infer causality, but it may then be difficult to disentangle causation from association, especially in the presence of confounding. We would argue that some of the confusion and misleading interpretations of results from observational studies are partly due to the lack of a clear formal approach to distinguish between association and causation. Causal terminology is often used loosely in the medical literature. It is intended to convey more than a simple association between potential risk factors and their effects, but this is rarely made explicit. More formal approaches are based on the idea of a hypothetical intervention [43,44], which seems particularly suited to the present context where we have potential health interventions in mind. These formal approaches highlight the usefulness of Mendelian randomisation studies for inferring causality and enable precise specification of the key assumptions (as depicted in Figure 1) necessary for the method to be valid.
Given the tendency of high-profile findings to persist in the literature, and influence public health and clinical policy, long after they have been formally refuted by RCT analyses [4], and given the expense and the scientific and ethical constraints of RCTs, it is fortunate that advances in biology, biotechnology, and epidemiology have provided us with an alternative tool, in the shape of Mendelian randomisation, that can help us to formally assess causality based on observational data. But the approach demands a sound understanding both of the underlying biomedicine and of the statistical assumptions invoked in its application. If it is used wisely, Mendelian randomisation could make a major contribution to our understanding of the aetiological architecture of complex diseases; but if it is used unthinkingly, it could sow seeds of confusion and set back progress in bioscience. This short article is aimed at encouraging the former and avoiding the latter.

Genetic Terms
Alleles are the different variants of a gene at a locus. They are sometimes called polymorphisms.
Canalisation is a developmental compensation that can atone for disruptive environmental or genetic forces.
Genetic heterogeneity refers to the situation where a phenotype is influenced by several alleles, usually at different genetic loci.
Linkage disequilibrium refers to (statistical) association between alleles at different loci. One reason for such an association is that the relevant genetic loci are physically close on the chromosome, and the alleles tend to be inherited together.
Pleiotropy refers to the situation where a genetic variant has more than one specific phenotypic effect.
Population stratification occurs when allele frequencies and disease rates, or allele frequencies and exposure rates, vary widely between different sub-groups of the population and cause an association between the two at the overall population level.

Statistical Terms
Conditional independence: For variables X, Y, and Z, we say that X and Y are conditionally independent given Z if knowledge of X (or alternatively Y) does not improve our prediction of Y (alternatively X) once we actually know Z.

Confounding: The effect of a variable X on another variable Y is said to be confounded if the observed association between X and Y does not correspond to the causal effect. Confounding is often due to the existence of another cause of Y that is also associated with X.
Interaction: Variables X1 and X2 interact in their association with Y if the association of X1 with Y varies for different values or levels of X2.

Linear relationship: The relationship between variables X and Y is linear if the change in Y caused by a unit change in X is constant for all values or levels of X. Any departure from this criterion is a nonlinear relationship.