MR-pheWAS with stratification and interaction: Searching for the causal effects of smoking heaviness identified an effect on facial aging

Mendelian randomization (MR) is an established approach to evaluate the effect of an exposure on an outcome. The gene-by-environment (GxE) study design can be used to determine whether the genetic instrument affects the outcome through pathways other than via the exposure of interest (horizontal pleiotropy). MR phenome-wide association studies (MR-pheWAS) search for the effects of an exposure, and can be conducted in UK Biobank using the PHESANT package. In this proof-of-principle study, we introduce the novel GxE MR-pheWAS approach, that combines MR-pheWAS with the use of GxE interactions. This method aims to identify the presence of effects of an exposure while simultaneously investigating horizontal pleiotropy. We systematically test for the presence of causal effects of smoking heaviness–stratifying on smoking status (ever versus never)–as an exemplar. If a genetic variant is associated with smoking heaviness (but not smoking initiation), and this variant affects an outcome (at least partially) via tobacco intake, we would expect the effect of the variant on the outcome to differ in ever versus never smokers. We used PHESANT to test for the presence of effects of smoking heaviness, instrumented by genetic variant rs16969968, among never and ever smokers respectively, in UK Biobank. We ranked results by the strength of interaction between ever and never smokers. We replicated previously established effects of smoking heaviness, including detrimental effects on lung function. Novel results included a detrimental effect of heavier smoking on facial aging. We have demonstrated how GxE MR-pheWAS can be used to identify potential effects of an exposure, while simultaneously assessing whether results may be biased by horizontal pleiotropy.


75
Mendelian randomization (MR) is an established approach that uses genetic variants as proxies 76 for a modifiable exposure, to estimate the causal effect of the modifiable exposure on a 77 downstream outcome (1). MR is often implemented within an instrumental variables (IV) 78 framework, with a key assumption being that the genetic instrument affects the outcome solely 79 through the exposure of interest. Horizontal pleiotropy (2), where the genetic instrument affects 80 the outcome through pathways not via the exposure, invalidates this assumption. Whilst the 81 absence of horizontal pleiotropy cannot be statistically tested, MR studies can include an 82 investigation of evidence for a violation of this MR assumption (3). For instance, where several 83 independent genetic instruments exist for an exposure, a difference in the effect estimates 84 would indicate they may be acting through different pathways (4)(5)(6). The majority of these 85 approaches are only applicable where there are several genetic variants that proxy for the 86 exposure of interest (3,7). 87 88 A design in which the genetic variant is interacted with another variable can provide evidence 89 regarding violation of the IV assumptions. The additional variable can be of many forms; not 90 necessarily environmental. For example, an early study interacted a genetic variant related to 91 alcohol with sex in a population where women drank very little, such that an effect of the 92 variant on the outcome in women would suggest that at least some of the effect of the variant 93 on the outcome is not acting through alcohol consumption (8). Here we report interaction with 94 an environmental factor, and refer to the design henceforth as involving gene x environment 95 (GxE) interaction (8)(9)(10)(11)(12). In contrast to traditional GxE studies that aim to identify genetic 96 variants whose effects on an outcome are modified by an environmental factor (or vice-versa) 97 (13), the aim here is to determine the extent that horizontal pleiotropy may be biasing estimated 98 Identified main effects from MR-pheWAS in ever smokers. The results of our MR-147 pheWAS among ever smokers includes 18 514 tests ranked by P value of the estimated effect 148 on each outcome, given in S1 File (S1 Fig in S1 Text shows the number of fields reaching each 149 stage of the PHESANT pipeline). A QQ plot is given in Figure 3a. We identified 70 results at 150 a false discovery rate of 5% (using a P value threshold of 0.05x70/18 514=1.89x10 -4 ), given in 151 Figure 4 and see Table B in S1 Text, and of these, 30 results had a P value lower than a stringent 152 Bonferroni corrected threshold of 2.70x10 -6 (0.05/18 514). In contrast, we identified 8 results 153 at a false discovery rate of 5% among never smokers (see Table C in S1 Text), 47 results among 154 our full sample (see Table D in S1 Text), and 9 results using a two-step approach (identifying 155 the outcomes associated using the full sample then ranking by interaction strength among this 156 subset; see Table F in S1 Text). 157 158 Identified interactions from MR-pheWAS in ever smokers versus never smokers. Of the 159 16 683 interactions tested, we identified 8 results with a P value lower than a stringent 160 Bonferroni corrected threshold of 3.00x10 -6 (0.05/16 683), where an increase in rs16969968 161 smoking-heaviness increasing allele dosage was associated with worse lung function (3 162 phenotypes), and less likelihood of being a morning person, in ever compared with never 163 smokers. We found a further 4 results at a false discovery rate of 5% (using a P value threshold 164 of 0.05x12/16 683=3.60x10 -5 ) (see Table E in S1 Text), where an increase in rs16969968 165 smoking-heaviness increasing allele dosage was associated with a higher risk of chronic 166 obstructive pulmonary disease, emphysema and cancer diagnoses and more facial aging, in 167 ever compared with never smokers. A QQ plot is given in Figure 3b. 168

169
The results identified when ranking by our P value for interaction were a subset of those 170 identified when ranking by the P value of the main effects among ever smokers, except for a 171 binary outcome describing whether the participant has had a keratinizing squamous cell 172 carcinoma (field ID [FID]=40011, value 8071). A higher rs16969968 smoking-heaviness 173 increasing allele dosage was associated with a higher risk of keratinizing squamous cell 174 carcinoma among ever smokers, but a lower risk among never smokers. Overlap  to heavier smoking was associated with an increased risk of looking older (as perceived by 180 others) relative to your age. Estimates of association are shown in Table 1 Testing the impact of collider bias. Our conclusions about the causal effect of smoking 199 heaviness on our outcomes may be affected by collider bias (14). This is because we stratified 200 on smoking status, and we found an effect of the smoking heaviness genetic variant on smoking 201 status. If any cause of an outcome also causes smoking status, then an association may be 202 induced between the smoking heaviness genetic variant and that outcome (S3 Fig in S1 Text). 203 If there is in truth an association between the genetic variant for smoking heaviness and the 204 outcome, then it could be estimated with bias in this scenario. We performed a simulation to 205 determine the degree of confounding needed to induce an association between the genetic 206 variant and our identified facial aging phenotype, given there is no true causal effect of the 207 SNP on the outcome. Further details of our simulation approach are given in Section S1 in S1 208 Text. Our simulation indicated that any collider bias due to the effect of the genetic variant on 209 smoking status is likely to have had a negligible impact on the estimated effect of the genetic 210 variant on facial aging (S4 Fig in S1 Text), given the same sample size and numbers of ever 211 and never smokers as our UK Biobank sample. For example, assuming no effect of the SNP 212 on the outcome, and given both a confounder that is the only cause of facial aging, and large 213 effect of the confounder on smoking status (OR=10), the estimated effect of the SNP on the 214 facial aging in ever and never smokers are consistent and their confidence intervals both 215 include the null, i.e. the degree of bias is not sufficient to (incorrectly) suggest an effect of the 216 SNP on facial aging via either smoking heaviness or some other path. 217 218 219

220
In this study, we searched for causal effects of smoking heaviness, using the PHESANT 222 package to perform a GxE MR-pheWAS, by estimating the association of genetic variant 223 rs16969968 with each outcome while restricting to ever and never smokers, respectively. We 224 used two approaches to identify the causal effects of smoking heaviness from our PHESANT 225 results. The first, ranking results based on the strength of the effect of rs16969968 among ever 226 smokers, identifies causal effects under the assumption that the combined effect through 227 smoking heaviness and any pleiotropic effect is not null. For instance, given a positive effect 228 of a genetic variant on an outcome via smoking heaviness, and an equal but opposite effect via 229 horizontal pleiotropy, then the effect of the variant would appear null in ever smokers and 230 negative in never smokers, and hence this outcome would not be identified when ranking by 231 the effect among ever smokers. The second approach, ranking results by the strength of 232 interaction between ever and never smokers, is commensurate with our aims, as a genetic 233 variant that affects an outcome (at least in part) through smoking heaviness will exhibit a 234 different effect size on this outcome, among ever and never smokers, respectively. However, 235 this approach lacks statistical power, especially when combined with the need to correct for 236 the multiple tests performed in an MR-pheWAS. 237 238 Our GxE MR-pheWAS of smoking heaviness identified 70 results when ranking on strength 239 of effect in ever smokers, and 8 when ranking on interaction strength. Of the 8 identified in the 240 latter, 7 of these were also identified in the former. Furthermore, the majority of results 241 identified when ranking on the effect in ever smokers were qualitative, with an effect seen 242 among ever smokers but not among never smokers (type F in Figure 1). Our results confirmed 243 several established or previously reported causal effects of smoking heaviness. For example, 244 our findings suggested higher smoking heaviness was associated with worse lung function 245 (FIDs={3063, 20150, 20151}) (23), higher resting heart rate (FID=102) (24,25), higher risk of 246 chronic obstructive pulmonary disease (FID=22130) (23) and lung cancer (FID=40001 value 247 C349) (26), and lower odds of being a morning person (FID=1180) (27). 248 249 Our PHESANT-estimated results identified a weak negative effect of rs16969968 on BMI in 250 ever smokers (that did not meet the 5% FDR threshold). Previous MR studies have also 251 identified an effect of smoking heaviness on BMI, where additional smoking-increasing alleles 252 was associated with a lower BMI among current smokers (10,15,19). Since we used ever 253 smokers rather than current smokers our weaker association may be because the effect on BMI 254 weakens over time after smoking cessation (10). Furthermore, while previous studies found an 255 effect of rs16969968 on BMI in never smokers (10,15,19), we found little evidence of an 256 association in UK Biobank. 257

258
We identified other novel results, including a detrimental effect on facial aging. We estimated 259 that a 1SD increase in lifetime smoking causes a 1.28 [95% CI: 1.08, 1.52] higher odds of 260 reporting that others say you look older than you are. A 1SD increase in lifetime smoking is, 261 for example, equivalent to being a current smoker who has smoked 5 cigarettes per day for 12 262 years, or a former smoker who smoked 5 cigarettes per day for 21 years but stopped smoking 263 10 years ago, rather than a never smoker. This identified association should be further 264 investigated and replicated in an independent sample; it may reflect a true causal effect of 265 smoking heaviness, may be due to chance, or may arise because the genetic variants have 266 horizontal pleiotropic effects and are thus invalid instruments for smoking heaviness or lifetime 267 smoking. However, we examined the extent to which pleiotropy might be biasing our smoking 268 heaviness results, by estimating the effect of the smoking heaviness genetic variant among 269 never smokers, and found little evidence of an association. Furthermore, another recent study 270 assessed perceptions of facial attractiveness using a two-alternative forced choice design where 271 participants were randomly shown prototypical faces for smokers and non-smokers and asked 272 to select the most attractive, and found that smoking faces were deemed less attractive (28). 273 Our study using MR is likely to have different sources of potential biases such that triangulation 274 of these results adds further evidence of the detrimental effect of smoking on facial appearance 275 in general (29,30). such as risk of operative procedures on the tarsometatarsal joint and eye problems, these 293 associations may be due to pleiotropy (or chance) rather than a causal effect of the smoking 294 heaviness variant through smoking status, as associations were consistent in ever and never 295 smokers. 296 297 There are some limitations of this work that should be considered. First, due to the automated 298 processing of PHESANT where pre-processing of an outcome may depend on the number of 299 participants with each outcome value, a different statistical test is used for some outcomes (e.g. 300 linear versus ordered logistic regression), for ever versus never smokers, such that tests of 301 interaction between these could not be performed. Second, while ranking by interaction 302 strength between ever and never smokers would be the preferred approach to identify causal 303 effects of smoking heaviness (i.e. outcomes where the effect of the genetic variant differs 304 among ever versus never smokers), this lacked power and identified only a small number of 305 results. However, ranking using the strength of association among ever smokers has better 306 power and, although in theory some results may be missed, in practice we identified many 307 potentially interesting novel results with this method, including all but one of the results based 308 on interaction strength. 309 310 Third, UK Biobank is a highly selected sample of the UK population, having a response rate 311 of 5.5% (32). For example, UK Biobank participants are, on average, less likely to smoke, and 312 have less self-reported health conditions, compared with the general population (33). Among 313 smokers, the difference between smoking heaviness in UK Biobank compared with the general 314 population varies by age and sex. For instance, smokers aged 45-54 years in UK Biobank, and 315 women smokers aged 55-64 years in UK Biobank on average smoke more heavily than those 316 in the general population, but this difference was not seen among male smokers age 55-64. If 317 these differences are due to an effect of smoking heaviness on selection into the study, then our 318 estimates may be biased (34). Also, if selection is additionally dependent on a given outcome, 319 then associations may be biased by a particular form of collider bias -selection-induced 320 collider bias (14). In general, collider bias may occur when two variables (A and B) affect a 321 third variable and this third variable (C) is conditioned upon in analyses. Selection induced 322 collider bias may occur when variable C represents whether a person is selected into the sample 323 (i.e. variables A and B both affect participation in the study). Hence, estimates of association 324 between two phenotypes -such as our smoking heaviness genetic variant and a given outcome 325 in our study -can be biased, if inclusion in the study is affected by both phenotypes. 326 327 Fourth, we found an association between rs16969968 and smoking status (ever versus never). 328 This means that associations may be biased by selection induced collider bias, because we 329 stratify our sample on smoking status. While our simulations indicated that any collider bias 330 due to the SNP effect on smoking status would have a negligible impact on the estimated effect 331 of the SNP on a given outcome, it is possible that this has increased our type 1 error rate across 332 the large number of tests performed in our MR-pheWAS. 333 334 Fifth, the estimates used to identify interactions are of the direct association of the genetic 335 variant with the outcome, and so are not estimates of the causal effect of smoking heaviness. 336 We cannot follow-up results using a formal IV analysis because there is no accurate measure 337 of tobacco exposure, and using poor measures such as cigarettes smoked may give biased 338 associations (7). For other exposures that are measured accurately, GxE interactions can be 339 used to estimate the size of causal effects in the presence of pleiotropy (35). 340 341 Sixth, it is possible that reporting bias of smoking status may have biased associations. For 342 instance, if some 'ever' smokers reported that they have never smoked, then for outcomes 343 affected by rs16969968 via smoking heaviness the estimate in never smokers would be biased 344 towards that in ever smokers. This would bias estimates of interaction between ever and never 345 smokers towards the null. Furthermore, if the effect of smoking heaviness is transient then our 346 interaction estimates may also be biased towards the null, because previous smokers are 347 assigned to the ever smoker group but the effect may no longer be present. In this case, testing 348 the interaction between current and never smokers may be more appropriate. participants as FALSE (i.e. assuming no missingness across all participants). This is not 361 appropriate for sex specific codes, where analyses should be restricted to a particular sex. We 362 further investigated this result by restricting to male participants, and testing the effect of 363 rs16969968 on: 1) 'malignant neoplasm of testis, descended' and 2) 'malignant neoplasm of 364 testis, unspecified'. The latter serves as a replication for the former, under the assumption that 365 the proportion of participants with descended versus undescended testis in the 'unspecified' 366 group is the same as the ICD codes (C62.1 versus C62.0) where this is known (i.e. the majority 367 of the unspecified group are descended; the ratio in UK Biobank is 1:16). While the positive 368 association with the 'descended' group remains in both ever and never smokers 369 (N_descended=26), we find little evidence of an association in the unspecified group 370 (N_unspecified=199), suggesting that the association with the particular data field is due to 371 chance. 372 373 We used the freely available PHESANT package to search for causal effects across thousands 374 of outcomes. We have shown that the GxE MR-pheWAS is an effective approach to search for 375 the causal effects of an exposure using observational data, when the degree to which an 376 exposure manifests is different across known subsets of the population. Other potential 377 applications include searching for the causal effects of alcohol intake (stratified by never versus 378 ever drinkers), cannabis use (stratified by never versus ever used) and milk consumption 379 (stratified by ever or never drinking milk). The approach could also be applied to qualitative 380 gene by gene interactions. Our study of smoking heaviness serves as a model for future studies 381 seeking to search for the causal effects of an exposure using GxE MR-pheWAS.  (36). This cohort includes a large and diverse 389 range of data from blood, urine and saliva samples and health and lifestyle questionnaires (37). 390

391
Of the 487 406 participants with genetic data, we removed 373 with genetic sex different to 392 reported sex, and 471 with sex chromosome aneuploidy (identified as putatively carrying sex 393 chromosome configurations that are not either XX or XY). We found no outliers in 394 heterozygosity and missing rates, which would indicate poor quality of the genotypes. We 395 removed 78 309 participants not of white British ancestry (38). We removed 73 277 396 participants who were identified as being related, having a kinship coefficient denoting a third 397 degree (or closer) relatedness (38). We removed 8 individuals with withdrawn consent, giving 398 a sample of 334 968 participants (we refer to as our full sample). Of these, 182 961 and 150 399 831 reported being never and ever (comprising previous and current) smokers, respectively. A 400 participant flow diagram is given in Figure 2. We excluded 74 fields a priori (see Table A in S1 Text) for the following reasons: 1 field 415 denoting the assessment centre; 2 fields described by UK Biobank as 'polymorphic', 416 containing values with mixed data types; 7 fields that, although listed in the data showcase, 417 were not currently available; 17 genetic descriptor fields, 1 sex field, 4 age fields, 17 fields 418 UK Biobank. For this reason, we first identified the subset of outcomes, where PHESANT used 493 the same type of regression to test the association, among ever and never smokers. For this 494 subset, we determined the strength of interaction between ever versus never smokers using 495 meta regression (metan command in Stata), and ranked the results by the P value of the 496 generated Q-test statistic -a measure of the heterogeneity across subgroups. We tested 497 interactions of binary and ordered categorical results in terms of the log odds (i.e. the 498 interactions are assumed to be multiplicative). To identify potential causal effects, we used 499 both a Bonferroni and 5% FDR threshold, as described above. 500 501 It may be reasonable to assume that rs16969968 affects most outcomes only via smoking 502 heaviness (i.e. there is no horizontal pleiotropy), such that an association between rs16969968 503 and the outcome in the whole sample would indicate an effect of smoking heaviness on this 504 outcome. We perform a two-step approach to identify interactions, similar to an approach for 505 identifying gene-environment interactions proposed previously (40). First, we rank outcomes 506 by the strength of association in the whole sample and used a 5% FDR threshold to identify 507 results to take forward to step two. Second, we rank these results by the strength of interaction 508 between ever and never smokers, ranking by the P value of the Q-test statistic. This approach 509 is valid because the strength of association in the whole sample is independent of the strength 510 of the interaction between ever and never smokers. 511 512 Follow-up analyses of identified associations. We identified an association with a facial 513 aging phenotype. Participants were asked 'do people say you look:' and asked to select either 514 'younger than you are', 'about your age' or 'older than you are'. It is possible that the 515 PHESANT automated approach made inappropriate decisions in its analysis, hence we re-516 examined this association to ensure it is not erroneous. We estimated the effect of rs16969968 517 on facial aging, using ordered logistic regression (ologit Stata command), among ever and 518 never smokers respectively, adjusting for covariates as described above. We also adjusted for 519 age, sex and the first 40 genetic principal components, as a sensitivity analysis. 520

521
We derived a measure of lifetime smoking that incorporates smoking heaviness, duration and 522 time since cessation into a single measure (41) called the lifetime smoking index as described 523 previously (42). We generated an lifetime smoking genetic instrument using the 124 identified 524 SNPs from a recent GWAS of the lifetime smoking index (42). As this GWAS was also 525 conducted in UK Biobank, we calculated our lifetime smoking genetic instrument as the sum 526 of the lifetime smoking index increasing alleles (i.e. we did not weight by the effect size). We 527 derived a binary measure of facial aging representing whether 'people say you look older than 528 you are' or not, by combining the 'younger than your age' and 'about your age' categories. We 529 determined the strength of our lifetime smoking genetic score as an instrument for lifetime 530 smoking index using linear regression, adjusting for age, sex and the first 10 genetic principal 531 components. We generated confidence intervals using bootstrapping (with 1000 bootstraps) as 532 the residuals of this regression (due to non-normality of the lifetime smoking index) were not 533 normally distributed. We estimated the causal effect of lifetime smoking on facial aging, using 534 IV probit regression. Again, we generated confidence intervals using bootstrapping (with 1000 535 bootstraps) as the residuals of the first stage were not normally distributed. We take the 536 exponent of 1.6 times the estimates, to approximate the association in terms of the change of 537 odds (43).     a) Bonferroni threshold: 0.05/18514 =2.70x10 -6 ; 5% FDR threshold: 0.05x70/18514= 1.89x10 -4 . b) Bonferroni threshold: 0.05/16683=3.00x10 -6 ; 5% FDR threshold: 0.05x12/16683=3.60x10 -5 . Results shown are those where an interaction was identified after correcting for multiple testing. One result was an unordered categorical result (Field 1448 reference 3), and is not shown as P value was generated using a likelihood ratio test such that an estimate and confidence interval is not available.
7 of 13 ordered categorical results are shown, with PHESANT ordered categorical result for never smokers. 33 of 38 binary results are shown, with PHESANT binary result for never smokers. 14 of 18 linear results are shown, with PHESANT linear result for never smokers.