Genetically determined serum urate levels and cardiovascular and other diseases in UK Biobank cohort: A phenome-wide mendelian randomization study

Background
The role of urate in cardiovascular diseases (CVDs) has been extensively investigated in observational studies; however, the extent of any causal effect remains unclear, making it difficult to evaluate its clinical relevance.

Methods and findings
A phenome-wide association study (PheWAS) together with a Bayesian analysis of tree-structured phenotypic model (TreeWAS) was performed to examine disease outcomes related to genetically determined serum urate levels in 339,256 unrelated White British individuals (54% female) in the UK Biobank who were aged 40–69 years (mean age, 56.87; SD, 7.99) when recruited from 2006 to 2010. Mendelian randomization (MR) analyses were performed to replicate significant findings using various genome-wide association study (GWAS) consortia data. Sensitivity analyses were conducted to examine possible pleiotropic effects on metabolic traits of the genetic variants used as instruments for urate. PheWAS analysis, examining the association with 1,431 disease outcomes, identified 13 distinct phecodes representing 4 disease groups (inflammatory polyarthropathies, hypertensive disease, circulatory disease, and metabolic disorders) and 9 disease outcomes (gout, gouty arthropathy, pyogenic arthritis, essential hypertension, coronary atherosclerosis, ischemic heart disease, chronic ischemic heart disease, myocardial infarction, and hypercholesterolemia) that were associated with genetically determined serum urate levels after multiple testing correction (p < 3.35 × 10⁻⁴). TreeWAS analysis, examining 10,750 ICD-10 diagnostic terms, identified more sub-phenotypes of cardiovascular and cerebrovascular diseases (e.g., angina pectoris, heart failure, cerebral infarction). MR analysis successfully replicated the association with gout, hypertension, heart diseases, and blood lipid levels but indicated the existence of genetic pleiotropy. Sensitivity analyses support an inference that pleiotropic effects of genetic variants on urate and metabolic traits contribute to the observational associations with CVDs. The main limitations of this study relate to possible bias from pleiotropic effects of the considered genetic variants and possible misclassification of cases for mild disease that did not require hospitalization.

Conclusion
In this study, high serum urate levels were found to be associated with increased risk of different types of cardiac events. The finding of genetic pleiotropy indicates the existence of common upstream pathological elements influencing both urate and metabolic traits, and this may suggest new opportunities and challenges for developing drugs targeting a common mediator that would be beneficial for both the treatment of gout and the prevention of cardiovascular comorbidities.


Author summary
Why was this study done?
• Serum urate level has been extensively studied in epidemiological studies in relation to various diseases, but the extent of any causal effect is still unclear, making it difficult to evaluate the clinical importance of urate.
• Mendelian randomization (MR) uses naturally occurring genetic variants as instruments to infer the causal role of a risk factor in a disease or outcome of interest. Previous MR studies were typically hypothesis driven, and few studies have comprehensively investigated how serum urate level might influence overall health.

What did the researchers do and find?
• We implemented a phenome-wide association study (PheWAS) followed by a Bayesian analysis of tree-structured phenotypic model (TreeWAS) and MR analyses to explore the association between urate and a broad range of disease outcomes in the UK Biobank.
• We identified gout, hypertension, hypercholesterolemia, and a multitude of cardiovascular and cerebrovascular diseases (e.g., coronary atherosclerosis, myocardial infarction, angina pectoris, ischemic heart disease, heart failure, and cerebral infarction) that were associated with serum urate levels.
• MR analysis using various GWAS consortia data successfully replicated these associations but indicated the existence of genetic pleiotropy. Sensitivity analyses examining

Introduction
The role of urate has been explored in a large number of observational studies in relation to a multitude of health outcomes [1]. Apart from gout, compelling evidence exists for the associations between high serum urate level and an increased risk of non-crystal deposition disorders, including hypertension, cardiovascular diseases (CVDs), and metabolic syndrome [2,3]. Although considerable research efforts have been made in trying to understand the pathological role of urate in such disorders, its causal role has not been clearly established. Therefore, it has been argued that either these associations are confounded by other risk factors, such as obesity, or they represent reverse causality [4]. As is typical in complex traits, genetic determinants are implicated in the regulation of serum urate levels. Genetic studies among twins and families have reported a substantial heritable component of serum urate level with an estimated heritability of 40%-60% [5,6]. The genetic determinants of serum urate level have been explored in several genome-wide association studies (GWASs) [7][8][9][10] and the wealth of resultant data allows for the identification and application of genetic variants as instruments to help separate causal from noncausal associations, given that genotypes are generally independent of environmental exposures and the transmission of genetic information is usually unidirectional. Investigating the associations between genetic variants related to serum urate and disease outcomes might help provide causal evidence in support of the hypotheses that link urate to multiple clinical disorders. Previous mendelian randomization (MR) studies using the genetic variants as instruments of serum urate levels reported inconsistent findings [11][12][13][14]. While some supported a causal effect on health outcomes beyond gout (e.g., diabetic macrovascular disease, CVD mortality, and sudden cardiac death), the majority reported no causal relationships [1]. 
Specifically, most of the negative results of previous MR studies are perhaps due to the selection of genes specifically involved in renal handling of urate, while a substantial portion of cardiovascular risk is probably due to pleiotropic genes controlling xanthine oxidase activity and urate production.
Our recently published MR-phenome-wide association study (PheWAS) analysis on the interim release data of UK Biobank (n = 120,091) provided an overview of the disease outcomes that were associated with the urate genetic risk loci [15]. Our study demonstrated that serum urate level shared the same genetic risk loci with multiple disease outcomes, particularly those related to cardiovascular/metabolic diseases and autoimmune disorders [15]. These findings provide a rationale for further investigating whether these cross-phenotype associations are causal. Although we have applied multiple methodologies to distinguish the PheWAS associations that were causal from those due to pleiotropy or genetic linkage, the use of the interim release data of UK Biobank set power limitations to our investigation and did not allow us to investigate less prevalent phenotypes. The release of the full UK Biobank GWAS genotype dataset provides a unique opportunity to further explore the previous MR-PheWAS findings, repeat analysis with the larger available cohort, and include phenotypes that were not investigated in the previous study due to an insufficient number of cases.
In this study, we performed an updated phenome-wide mendelian randomization study (PWMR) by using data from the full UK Biobank cohort. A weighted genetic risk score (GRS) incorporating effect estimates of multiple genetic risk loci taken from the most recent and largest GWAS of serum urate was employed as a proxy of serum urate level [8]. The framework of phenome was defined by using both the PheCODE schema (also used in the previous MR-PheWAS) [15] and a novel Bayesian analysis framework, termed TreeWAS (tree-structured phenotypic model) [16]. Any replication of previous findings and/or novel findings was further explored in this study.

Methods
This study is reported as per the STROBE guideline (S1 STROBE Checklist). UK Biobank has ethics approval from the North West Multi-Centre Research Ethics Committee (11/NW/0382). Appropriate informed consent was obtained from participants and ethical approval was covered by the UK Biobank, from which data for this work were obtained (under approved data request application ID 10775). Any additional ethical approval was adjudged unnecessary for the present study. Although there is no formal or documented protocol for this study, the main analyses of PheWAS, TreeWAS, and the replication study were prespecified; the sensitivity analysis for the GRS of genetic polymorphisms involved in renal handling of urate was supplemented according to the reviewer's comments to better interpret the findings.

UK Biobank
UK Biobank is a large-scale, population-based prospective cohort study, which recruited over 500,000 participants aged between 40 and 69 years in 2006-2010 and combined extensive measurement of baseline data and genotype data with linked national medical records (e.g., inpatient hospital episode records, cancer registry, and death registry) for longitudinal follow-up. This study was constrained to a subset of unrelated White British individuals with high-quality genotype data in order to minimize the influence of diverse population structure within UK Biobank. Details about genotype data and phenotype data and the procedures of quality control are described in S1 Text.

Weighted genetic risk score
To generate a genetic proxy for serum urate, genetic variants associated with urate were searched across the GWAS catalogue and literature. Thirty-one genetic variants associated with urate among European populations were identified from previous GWASs [7,8] and were selected as components of the genetic proxy for serum urate level. The overall proportion of variance (adjusted R²) of urate explained by the 31 genetic variants was around 7% [8]. The SNP effect on urate (effect size and standard error [SE]) was taken from the largest meta-analysis of GWASs performed by the Global Urate Genetics Consortium (GUGC) [8]. A weighted GRS was constructed by incorporating effect estimates of the 31 urate variants for UK Biobank participants. Specifically, for each participant, the number of urate-increasing alleles at each SNP was multiplied by that SNP's effect size on serum urate level (regression beta coefficient), and these weighted counts were summed across all 31 SNPs.
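The weighted score described above can be illustrated with a minimal Python sketch. The function name `weighted_grs` and the toy allele counts and beta values are illustrative assumptions, not data or code from this study, whose analyses were run in R.

```python
from typing import Sequence


def weighted_grs(allele_counts: Sequence[float], betas: Sequence[float]) -> float:
    """Weighted genetic risk score for one participant.

    allele_counts: number of urate-increasing alleles per SNP (0 to 2).
    betas: per-allele effect of each SNP on serum urate from the reference GWAS.
    """
    if len(allele_counts) != len(betas):
        raise ValueError("allele_counts and betas must have the same length")
    if any(not 0 <= count <= 2 for count in allele_counts):
        raise ValueError("allele counts must lie between 0 and 2")
    # Multiply each allele count by its SNP effect size, then sum over SNPs.
    return sum(count * beta for count, beta in zip(allele_counts, betas))


# Toy example with 4 hypothetical SNPs (the study used 31).
toy_counts = [0, 1, 2, 0]
toy_betas = [0.10, 0.05, 0.20, 0.02]
toy_score = weighted_grs(toy_counts, toy_betas)
```

Because the weights are expressed in units of serum urate per allele, the score is on the same scale as the exposure, which is what allows a GRS value to be read as an expected difference in urate level.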

Phenome framework
We analyzed three phenotypic datasets (i.e., inpatient hospital records, cancer registry data, and death registry data) available in the UK Biobank database. As we were interested in disease phenotypes, the ontology of the phenome was defined based on the ICD codes in the electronic medical records. We pooled the hospital episode data, cancer registry data, and death registry data together and included both the primary and secondary ICD codes. Individual ICD codes could not be directly used to define the phenome, as they represent specific sub-phenotypes of a similar set of diseases, instead of independent phenotypes. To account for the correlations between ICD codes, we applied two strategies: (i) the PheCODE schema that has been recently updated and successfully adopted in our previous MR-PheWAS [15]; and (ii) a novel Bayesian analysis of a TreeWAS that was developed by researchers from the Wellcome Trust Centre for Human Genetics [16].
PheCODE schema. The PheCODE system was developed to combine one or more related ICD codes into distinct disease groups [17]. To develop a phenotyping method applicable to the ICD-10 coding system in UK Biobank, we created a map to match ICD-9/10 codes to phecodes [15]. The latest version of the PheCODE system includes 1,866 hierarchical phenotype codes that could be directly matched to the ICD-9/10 codes and provides a scheme to automatically exclude patients that have similar or potentially overlapping disease states from the corresponding control group (e.g., excluding type 1 diabetes from being in the control group when analyzing the phenotype of type 2 diabetes). The PheCODE map is made publicly accessible via the link https://phewascatalog.org/phecodes_icd10.
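The case-control logic of the PheCODE schema can be sketched as follows. This is a simplified illustration only: the phecode labels (`PHE_T1D`, `PHE_T2D`, `PHE_GOUT`), the three-character ICD-10 lookup, and the exclusion table are hypothetical stand-ins for the published map, not an excerpt from it.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical miniature map from 3-character ICD-10 categories to phecode labels.
ICD10_TO_PHECODE: Dict[str, str] = {
    "E10": "PHE_T1D",   # type 1 diabetes mellitus
    "E11": "PHE_T2D",   # type 2 diabetes mellitus
    "M10": "PHE_GOUT",  # gout
}

# Hypothetical exclusion table: carriers of these phecodes are dropped from
# the control group of the key phenotype.
CONTROL_EXCLUSIONS: Dict[str, Set[str]] = {
    "PHE_T2D": {"PHE_T1D"},
    "PHE_T1D": {"PHE_T2D"},
}


def phecodes_for(icd10_codes: Iterable[str]) -> Set[str]:
    """Collapse a participant's ICD-10 codes into the set of mapped phecodes."""
    categories = (code[:3] for code in icd10_codes)
    return {ICD10_TO_PHECODE[c] for c in categories if c in ICD10_TO_PHECODE}


def case_control_status(icd10_codes: Iterable[str], target: str) -> Optional[int]:
    """Return 1 for a case, 0 for a control, None if excluded from controls."""
    codes = phecodes_for(icd10_codes)
    if target in codes:
        return 1
    if codes & CONTROL_EXCLUSIONS.get(target, set()):
        return None
    return 0
```

Under these assumptions, a participant with only a type 1 diabetes code is neither a case nor a control when type 2 diabetes is analyzed, which mirrors the exclusion example given in the text.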

TreeWAS.
A novel Bayesian analysis on a TreeWAS has recently been developed to interrogate the increasingly specific sub-phenotypes defined by the ICD-10 coding system. It has been suggested that this model has higher statistical power for detecting genotype-phenotype associations [16]. In principle, this phenotyping method models the genetic coefficients across all phenotypes as a set of random variables. To model the correlations of the hierarchical treelike structure of ICD-10 codes (termed as TreeWAS), a Markov process is applied to allow the genetic coefficients to evolve down the tree trunk and branches. The tree structure is determined based on the classification hierarchy of the ICD-10 coding system, in which each node in the tree represents a clinical term in the classification. More details about the tree-structured phenotyping process are described elsewhere [16].

Statistical analysis
To take advantage of both phenotyping models, we explored the association between the weighted GRS of urate and the phenome framework defined by both the PheCODE schema (described as PheWAS analysis) and the tree-structured phenotypic model (described as TreeWAS analysis). The correlation with weighted GRS was examined for a number of potential confounding factors including sex, age, body mass index (BMI), assessment center, and the first 5 genetic principal components (PCs). In the PheWAS analysis, associations between weighted GRS and phecodes (with no fewer than 20 cases) were examined by logistic regression. Given that phenotypes investigated are not totally independent in the PheCODE system, because multiple levels of phenotypic granularity were used for the definition of the case-control groups, we applied the false discovery rate (FDR) method (corresponding to the FDR of q < 0.05) to account for multiple comparisons instead of the more stringent Bonferroni correction [18]. In the TreeWAS analysis, associations between the weighted GRS and the phenome variables were tested by the Bayesian network analysis at both terminal and internal nodes of the tree structure. The marginal posterior probability (PP) for each node in the tree (where its genetic coefficient was nonzero) and the corresponding maximum a posteriori (MAP) effect estimate with 95% credible interval were determined by using the MAP estimator. Any association with any node of the tree at PP ≥ 0.95 was reported for further investigation. Details about the TreeWAS analysis have been described previously [16]. All the statistical analyses were implemented in R 3.3.2.
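As an illustration of FDR control across many phecodes, the following Python sketch implements the Benjamini-Hochberg step-up procedure. The text specifies only "the FDR method" at q < 0.05, so the choice of Benjamini-Hochberg is an assumption here, and the function name and p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Flag which hypotheses are rejected at FDR level q (step-up procedure)."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_rank = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four phecodes.
toy_flags = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

The p-value at the largest qualifying rank acts as an effective significance threshold for the whole scan, which is how a single cutoff can be quoted for an FDR-corrected analysis.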

Replication in MR-base database
To validate findings, PheWAS associations were further examined in the MR-base database for replication in different populations [19,20]. We used this platform to make causal inference by performing two-sample MR analysis using available GWAS consortia data. We applied the simplest inverse variance weighted mendelian randomization (MR IVW) approach as crude analysis; if there was horizontal pleiotropy that violated the assumptions of the MR IVW, we applied a mixture-of-experts machine learning framework of mendelian randomization (MR-MoE) to predict the performance of three main classes of MR analytical approaches (mean-based, median-based, and mode-based methods) in the context of different models of pleiotropy and then selected the most likely unbiased causal estimate for each specific circumstance [20]. Full details of these MR approaches, including their different assumptions, are provided in S1 Text and S1 Table. The schematic presentation of the overall study design is shown in Fig 1.
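For orientation, the crude MR IVW estimate can be written in a few lines of Python. This is a minimal fixed-effect sketch assuming uncorrelated instruments and summary statistics aligned to the same effect allele; it is not the MR-base implementation, and the function name and numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect inverse variance weighted causal estimate and its SE."""
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    # Weighted regression of outcome effects on exposure effects through the
    # origin, with weights 1 / se_outcome**2.
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical summary statistics for three SNPs.
toy_estimate, toy_se = ivw_estimate(
    beta_exposure=[0.1, 0.2, 0.4],
    beta_outcome=[0.05, 0.10, 0.20],
    se_outcome=[0.01, 0.02, 0.02],
)
```

For a binary outcome the estimate is on the log-odds scale, so exponentiating it gives an odds ratio per unit increase in the exposure. The estimator is unbiased only when the instruments have no horizontal pleiotropy (or pleiotropy that balances out), which is why the text falls back on alternative estimators when that assumption is violated.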

Sensitivity analyses
We created a separate GRS of seven genetic polymorphisms, which are involved in renal handling of urate (six urate transporter-coding genes: SLC22A12, SLC2A9, ABCG2, SLC22A11, SLC17A1, SLC16A9, and one urate transporter-related scaffolding gene, PDZK1), as a sensitivity analysis. We then performed additional sensitivity analyses to further explore any pleiotropic associations. To identify genetic variants showing pleiotropy, we examined their association with a set of metabolic traits (i.e., BMI, waist-to-hip ratio [WHR], total cholesterol [TC], low-density lipoprotein cholesterol [LDL-c], high-density lipoprotein cholesterol [HDL-c], fasting glucose, 2-hour glucose, glycoproteins, systolic blood pressure [SBP], and diastolic blood pressure [DBP]) through publicly available resources from various GWAS consortia (a summary of these GWASs is provided in S1 Text). An association was declared as pleiotropic when these GWAS summary data reported any association between the serum urate risk loci and these metabolic traits at p < 1.61 × 10⁻³ (the threshold was determined based on the Bonferroni correction with a significance level of α = 0.05 divided by the number of 31 serum urate risk loci analyzed in this study). These 31 urate genetic risk loci were then divided into five categories, accordingly: (i) urate-specific loci, including 14 SNPs with no pleiotropic effect on the examined metabolic traits (S2 Table); (ii) urate-obesity pleiotropic loci, including 10 SNPs with pleiotropic effects on BMI or WHR (S3 Table); (iii) urate-BP pleiotropic loci, including 10 SNPs with pleiotropic effects on blood pressures (BPs) (i.e., DBP and SBP) (S4 Table); (iv) urate-lipid pleiotropic loci, including 6 SNPs with pleiotropic effects on lipids (i.e., TC, LDL-c, and HDL-c) (S5 Table); and (v) urate-glucose pleiotropic loci, including 3 SNPs with pleiotropic effects on blood glucose (fasting glucose, 2-hour glucose, glycoproteins) (S6 Table). A set of GRSs was created accordingly to recalculate the effect estimates in PheWAS analysis.
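The threshold arithmetic and the locus classification rule can be checked with a short Python sketch. The helper names and the example p-values are hypothetical; only the constants (α = 0.05, 31 loci) come from the text.

```python
from typing import Dict, List

ALPHA = 0.05
N_LOCI = 31
# Bonferroni-corrected threshold: 0.05 / 31, approximately 1.61e-3.
PLEIOTROPY_THRESHOLD = ALPHA / N_LOCI


def pleiotropic_traits(
    trait_p_values: Dict[str, float],
    threshold: float = PLEIOTROPY_THRESHOLD,
) -> List[str]:
    """Metabolic traits associated with a locus below the threshold."""
    return sorted(trait for trait, p in trait_p_values.items() if p < threshold)


def classify_locus(trait_p_values: Dict[str, float]) -> str:
    """Label a locus 'pleiotropic' if any trait passes, else 'urate-specific'."""
    return "pleiotropic" if pleiotropic_traits(trait_p_values) else "urate-specific"


# Hypothetical p-values for two loci across a few metabolic traits.
locus_a = {"BMI": 0.20, "SBP": 0.01, "LDL-c": 0.40}
locus_b = {"BMI": 1e-5, "SBP": 0.30, "LDL-c": 5e-4}
```

In this sketch a single locus can pass the threshold for several traits at once (here `locus_b` for both BMI and LDL-c), so the pleiotropic categories are not mutually exclusive and their SNP counts need not sum to the number of pleiotropic loci.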

Results
We included 339,256 unrelated White British individuals from the full UK Biobank cohort, consisting of 157,146 men and 182,110 women. The mean age of the study population was 56.87 years (SD: 7.99). Other sociodemographic characteristics of the study population are summarized in S7 Table. The mean value of weighted GRS among the study population was 0.44 (SD: 0.31), which is equivalent to 0.44 mg/dL of serum urate level. The correlations between the weighted GRS and potential confounding factors (i.e., age, sex, BMI, assessment center, and the PCs) are provided in S7 Table. Of these, two variables (i.e., assessment center and the PCs) were statistically significantly correlated with the weighted GRS and therefore were adjusted as covariates.

Replication in MR-base database
To validate the findings, we performed two-sample MR analyses on associated diseases (i.e., gout, rheumatoid arthritis [RA], coronary heart disease [CHD], myocardial infarction, ischemic stroke) or on their corresponding intermediate traits or surrogate outcomes (i.e., SBP, DBP, TC, LDL-c, HDL-c) (Table 3). Results from MR IVW suggested that a genetically determined high serum urate […] (Table 3). Causal estimates from each of the MR analytical approaches are provided in S9-S17 Tables.

Sensitivity analyses
PheWAS analysis using the GRS of 7 SNPs involved in renal handling of urate showed significant associations with gout (p = 3.04 × 10⁻⁹¹) and related diseases (e.g., inflammatory polyarthropathies, gouty arthropathy) after FDR correction. A GRS of the remaining 24 SNPs (excluding genetic polymorphisms involved in renal handling of urate) showed significant associations with gout, hypertension, hypercholesterolemia, and CVDs (e.g., coronary atherosclerosis, ischemic heart diseases, and myocardial infarction) (S18 Table). Given that most of the related outcomes were CVDs, we performed further sensitivity analyses to examine the potential for any pleiotropic effect of urate risk variants on metabolic traits. We recalculated the PheWAS estimates by using a number of GRSs created based on their association with a set of metabolic traits (Fig 3 and S19 Table), and the specific metabolic traits investigated were further determined by the availability of summary GWAS data. The GRS of urate-specific loci was only associated with gout and its upper disease group of inflammatory polyarthropathies, but not with any cardiovascular/metabolic diseases. In contrast, the GRSs of pleiotropic loci on obesity, BP, lipids, and glucose showed significant association with both gout and the CVDs. Specifically, the GRS of pleiotropic loci on lipids was significantly associated with all CVDs, including hypertensive diseases (i.e., essential hypertension), heart diseases (i.e., ischemic heart diseases), and metabolic disorders (i.e., hypercholesterolemia). Additionally, the GRS of pleiotropic loci on glucose was significantly associated with diabetes (i.e., type 2 diabetes). When removing any group of pleiotropic loci from the creation of GRS, their associations with hypertensive diseases, heart diseases, and metabolic disorders were not statistically significant (S20 Table).
The effects of pleiotropic loci (mapped with genes) on serum urate level against their effects on four representative disease outcomes were plotted in S1 Fig, in which the two urate transporter genes (SLC2A9 and ABCG2) are recognised as the leading loci driving the association with gout, the GCKR gene is the leading locus driving the association with hypercholesterolemia, and the PTPN11/ATXN2 gene is the leading locus driving the association with hypertension and ischemic heart diseases.

Discussion
The present study demonstrated that genetically determined high serum urate level was consistently associated with increased risk of several disease groups, including inflammatory polyarthropathies (e.g., gout and gouty arthropathy), hypertensive diseases (e.g., essential hypertension), heart diseases (e.g., coronary atherosclerosis, myocardial infarction, angina pectoris, ischemic heart disease, and heart failure), and metabolic disorders (e.g., hypercholesterolemia). This study, using data from the full UK Biobank cohort (n = 339,256), verified the associations discovered in the previous MR-PheWAS study based on the interim release of UK Biobank genetic data (n = 120,091) [15] and identified a number of new sub-phenotypes of diseases (e.g., gouty arthropathy, angina pectoris, and heart failure). Some disease outcomes (e.g., disorders of iron metabolism, celiac disease) reported in the previous study were not identified in the present study, as these associations were derived from the genetic linkage disequilibrium between two single variants and therefore were diluted by the use of a weighted GRS of multiple genetic instruments. Association between urate and the risk of gout, hypertension, CHD, myocardial infarction, and a decreased level of HDL-c was successfully replicated in different European populations by analyzing various GWAS consortia data documented in the MR-base database [19], but a causal relationship was only supported for gout. Overall, findings from the current study support the observational associations between high serum urate level and increased risk of hypertensive diseases, heart diseases, and metabolic disorders and also indicate that these associations were more likely due to genetic pleiotropy than to causality.
A recent umbrella review summarized the published MR studies and examined the causal relationship of serum urate level with a wide range of health outcomes, including gout, cardiovascular, metabolic, and neurocognitive disorders, and for the majority of investigated traits, causality was not verified [1]. There were nine disease outcomes (e.g., diabetic macrovascular disease, arterial stiffness [internal diameter of carotid artery], adverse renal events, Parkinson disease, lifetime anxiety disorders, memory performance, CVD mortality, sudden cardiac death, and gout) reported to have a nominally statistically significant causal relationship with urate, but most of them presented with discordant results between MR studies or suffered from methodological limitations (e.g., inadequate study power, invalid genetic instruments), and only that for gout was verified based on convincing evidence.
Fig 3. The red circles represent the disease outcomes associated with the weighted GRS of the 31 urate genetic risk loci; the blue circles represent disease outcomes associated with the weighted GRS of urate-specific risk loci; the green circles represent disease outcomes associated with urate-obesity pleiotropic loci; the orange circles represent disease outcomes associated with urate-lipid pleiotropic loci; and the pink circles represent disease outcomes associated with urate-lipid pleiotropic loci. E70-E90, metabolic disorders; GRS, genetic risk score; I10-I15, hypertensive diseases; I20-I25, ischemic heart diseases; M05-M14, inflammatory polyarthropathies; PheWAS, phenome-wide association study. https://doi.org/10.1371/journal.pmed.1002937.g003
Specifically, our finding that genetically predicted serum urate level is causally associated with increased risk of gout is not surprising, as it is well known that gout is caused by the deposition of monosodium urate (MSU) crystals, which leads to acute local inflammation in joints. Moreover, this study also detected an association between urate and the disease group of inflammatory polyarthropathies. To investigate whether any other types of inflammatory polyarthropathies (beyond gout) were associated with urate, we examined the association of urate with all specific diseases included in this group, but none of them were statistically significant. When gout was excluded from this disease group, the association was no longer statistically significant, indicating that the observed association was driven by gout.
Numerous epidemiological studies have reported that elevated serum urate level is related to increased risk of hypertension, and their relationship has been consistent, showing a dose-response relationship of similar magnitude [21]. Findings from our current study support this association, but the magnitude of estimated effect size (OR = 1.07, 95% CI: 1.05-1.11) is smaller than that of traditional epidemiological studies [22]. In our PheWAS, TreeWAS, and MR IVW analysis, we consistently showed a moderate association between urate and different types of heart disease, including coronary atherosclerosis, angina pectoris, ischemic heart diseases, acute/old myocardial infarction, and heart failure; however, the MR-MoE analysis did not support the causal inference after accounting for the presence of pleiotropy.
Large epidemiological studies have established an association between high serum urate level and the increased risk of metabolic disorders [23]. The NHANES III survey study suggested that a high serum urate level was associated with increased levels of serum LDL-c, triglycerides, TC, and apolipoprotein-B and a decreased level of HDL-c [24]. Our study further strengthened this epidemiological evidence and highlighted an association between urate and hypercholesterolemia. Our MR IVW analysis replicated the corresponding association with its surrogate outcome (i.e., HDL-c) but suggested the presence of pleiotropy instead of causality. Additionally, epidemiological studies have also indicated that high serum urate level is associated with increased risk of diabetes [25]. However, this association was not detected in the main PheWAS or TreeWAS analysis, while sensitivity analysis using the GRS of urate-glucose pleiotropic loci (i.e., GCKR, IGF1R, and SLC16A9) identified significant association with type 2 diabetes.
To explore how genetic pleiotropy influences the association with cardiovascular/metabolic diseases, we analysed all 31 urate loci across a set of metabolic traits and identified 14 SNPs (urate-specific loci) that were exclusively associated with urate and 17 SNPs (pleiotropic loci) that were associated with metabolic traits. When examining the urate-specific loci, their GRSs were only associated with gout and its upper disease group of inflammatory polyarthropathies, but not with any cardiovascular or metabolic diseases. In contrast, when categorizing the pleiotropic loci into different groups (e.g., GRS of urate-obesity loci, GRS of urate-BP loci, GRS of urate-lipid loci, and GRS of urate-glucose loci), the GRSs of pleiotropic loci showed consistent associations with both gout and the cardiovascular/metabolic diseases. When removing any group of pleiotropic loci from the creation of GRS (e.g., GRS of urate without pleiotropic loci on BP, or GRS of urate without pleiotropic loci on lipids), their association with heart diseases and metabolic disorders was not statistically significant. Based on these findings, our study suggests that the association between urate and CVDs is probably due to the pleiotropic effects of genetic variants on urate and metabolic traits.
Examining the associations between individual urate genetic risk loci and the related disease outcomes highlighted two loci, GCKR and PTPN11/ATXN2, which drive their association with hypercholesterolemia, hypertension, and ischemic heart disease. Pathway network analysis of the leading pleiotropic genes provides some clues on how genetic pleiotropy contributes to the association between urate and cardiovascular/metabolic disease. Genetic variation in GCKR is shown to be associated with concentrations of urate, triglyceride, and glucose [26]. The most plausible explanation for this observation is that GCKR affects both serum urate and triglyceride and glucose levels by a common unconfirmed mediator, which is proposed to be glucose-6-phosphate [27]. GCKR controls the hepatic production of glucose-6-phosphate, which is catabolized for triglyceride synthesis via glycolysis, pyruvate, and acetyl coenzyme A, while glucose-6-phosphate is also a precursor of purine (uric acid) metabolism [27]. Additionally, gene functional annotation of PTPN11/ATXN2 highlights another subnetwork around hemostasis pathways, including platelet activation, aggregation, and sensitization (activated by LDL-c) [28], and these may be relevant to the observed association with hypertension and heart diseases; but how this gene influences serum urate level has not yet been clearly demonstrated.
The detection of a multitude of cross-phenotype associations in this study adds to our understanding of the extent of shared genetic/biological components between urate and metabolic traits. Further characterizing the associations between urate and disease outcomes as causal or pleiotropic contributes to our knowledge of how the role of urate should be interpreted and used in clinical practice in the management of related disease conditions. Given that the observational associations between urate and cardiometabolic diseases are more likely due to pleiotropy than to causality, our study supports the notion that urate could be a predictor of cardiovascular/metabolic disease risk but is probably not a direct target for compounds intended to reduce that risk. The linked biological pathways between urate and metabolic traits indicate that the frequent coexistence of gout with hypertension, CVDs, and hyperlipidemia reflects a range of interrelated disease outcomes arising from linked pathogenic components, rather than isolated events. This supports the European League Against Rheumatism (EULAR) recommendation of systematic screening and assessment of cardiovascular/metabolic comorbidities in gout patients [29]. Classifying high serum urate levels as due either to renal handling dysfunction or to high urate production would improve the identification of gout patients at higher risk of metabolic disease and CVD, and promote a more selective and effective use of urate-lowering drugs. The finding of genetic pleiotropy indicates the existence of common upstream pathological elements influencing both urate and metabolic traits, which may present new opportunities and challenges for developing drugs that target a common mediator and would benefit both the treatment of gout and the prevention of cardiovascular/metabolic comorbidities.
This study has focused on the detection of cross-phenotype associations and highlighted the importance of pleiotropy in the links between these complex diseases. We have attempted to interpret the cross-phenotype associations in the context of a pleiotropy model, but functionally characterizing the underlying biological mechanisms remains a challenge in this field and warrants further investigation.
The strengths of this study include its potential to examine a broad spectrum of disease outcomes related to urate and to reflect the shared biological relevance among associated phenotypes, given that previous MR studies were typically hypothesis driven and few studies have comprehensively investigated how serum urate level might influence overall health. Compared with the previous MR-PheWAS [15], the present study extends the prior findings by combining genetic risk loci of urate into a weighted GRS, exploring genetic pleiotropy on a set of metabolic traits systematically, investigating more disease outcomes, assessing their associations with >3-fold more cases, examining consistency of findings across two different phenotyping models to reduce the probability of false positive/negative findings due to factors related to the model, and replicating the findings by performing two-sample MR in different populations. Our study demonstrated the performance of two phenotyping models by accounting for the differences in the specificity and granularity of different phenome definitions and by characterizing the phenotypic correlations among different levels of ICD hierarchy. TreeWAS is shown to increase statistical power and can detect new associations missed by conventional PheWAS [16]. One of the major accomplishments of this study together with the previous MR-PheWAS has been the establishment of a framework or workflow for PheWAS [15]. We believe this study would be an excellent starting point for researchers who plan to use the UK Biobank resource to comprehensively interrogate the clinical significance of biomarkers. The updated version of PheCODE schema used in this study is made available for researchers who are interested in performing PheWAS in UK Biobank.
This study also has limitations. Causal inference in our study is limited by the common difficulty of pleiotropy that arises when multiple genetic instruments are used. Although we performed sensitivity analyses by grouping the pleiotropic loci by metabolic trait and exploring their associations separately, undetected pleiotropy may remain, and relatively weak causal effects of urate on diseases may have been concealed by the strong pleiotropic effects of the genetic variants on metabolic traits. Moreover, as most cases were identified from inpatient hospital records, case ascertainment may be incomplete, especially for diseases that do not usually lead to hospitalization. Incorporating self-reported data would mitigate this limitation, but it is also likely to include participants who do not have a true diagnosis and thereby introduce information bias. As UK Biobank is currently performing disease adjudication and processing linkages to general practice records and outpatient data, more comprehensive and accurately defined case ascertainment for PheWAS should become possible in the future.

Conclusions
Overall, taking together the findings from PheWAS/TreeWAS, MR replication, and sensitivity analyses, we conclude that there is a robust association between urate and a group of diseases including gout, hypertensive diseases, heart diseases, and metabolic disorders of lipids, but a causal role of urate is supported only for gout. Our study indicates that the association between urate and CVDs is probably due to the pleiotropic effects of genetic variants on urate and metabolic traits. These findings support the view that urate could be a good predictor of cardiovascular/metabolic disease risk. Further investigation of therapies targeting the shared biological pathways between urate and metabolic traits would be beneficial for the treatment of gout and the primary prevention of cardiovascular/metabolic comorbidities.

Table. Results from MR-MoE analysis for urate and gout. MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S10 Table. Results from MR-MoE analysis for urate and DBP. DBP, diastolic blood pressure; MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S11 Table. Results from MR-MoE analysis for urate and SBP. MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization; SBP, systolic blood pressure. (DOCX)
S12 Table. Results from MR-MoE analysis for urate and CHD. CHD, coronary heart disease; MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S13 Table. Results from MR-MoE analysis for urate and myocardial infarction (MI). MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S14 Table. Results from MR-MoE analysis for urate and TC. MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization; TC, total cholesterol. (DOCX)
S15 Table. Results from MR-MoE analysis for urate and HDL-c. HDL-c, high-density lipoprotein cholesterol; MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S16 Table. Results from MR-MoE analysis for urate and LDL-c. LDL-c, low-density lipoprotein cholesterol; MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S17 Table. Results from MR-MoE analysis for urate and ischemic stroke (IS). MR-MoE, a mixture-of-experts machine learning framework of mendelian randomization. (DOCX)
S18 Table. Sensitivity analysis by using the GRS of genetic polymorphisms involved in renal handling of urate. GRS, genetic risk score. (DOCX)
S19 Table