Causal linkage between adult height and kidney function: An integrated population-scale observational analysis and Mendelian randomization study

As adult height is linked to various health outcomes, further investigation of its causal effects on kidney function later in life is warranted. This study involved a cross-sectional observational analysis and summary-level Mendelian randomization (MR) analysis. First, the observational association between height and estimated GFR determined by creatinine (eGFRcreatinine) or cystatin C (eGFRcystatinC) was investigated in 467,182 individuals aged 40–69 using UK Biobank. Second, the genetic instrument for adult height, as reported by the GIANT consortium, was implemented, and summary-level MR of eGFRcreatinine and CKDcreatinine in a CKDGen genome-wide association study was performed (N = 567,460), with multivariable MR being adjusted for the effects of genetic predisposition on body mass index. To replicate the findings, additional two-sample MR using the summary statistics of eGFRcystatinC and CKDcystatinC in UK Biobank was performed (N = 321,405). In observational analysis, adult height was inversely associated with both eGFRcreatinine (per 1 SD, adjusted beta -1.039, standard error 0.129, P < 0.001) and eGFRcystatinC (adjusted beta -1.769, standard error 0.161, P < 0.001) in a multivariable model adjusted for clinicodemographic, anthropometric, metabolic, and social factors. Moreover, multivariable summary-level MR showed that a taller genetically predicted adult height was causally linked to a lower log-eGFRcreatinine (adjusted beta -0.007, standard error 0.001, P < 0.001) and a higher risk of CKDcreatinine (adjusted beta 0.083, standard error 0.019, P < 0.001). Other pleiotropy-robust sensitivity MR analysis results supported the findings. In addition, similar results were obtained by two-sample MR of eGFRcystatinC (adjusted beta -1.303, standard error 0.140, P < 0.001) and CKDcystatinC (adjusted beta 0.153, standard error 0.025, P < 0.001) in UK Biobank. 
In conclusion, the results of this study suggest that a taller adult height is causally linked to worse kidney function in middle-aged to elderly individuals, independent of the effect of body mass index.

Introduction
Kidney disease is a major category of comorbidity, and the prevalence of chronic kidney disease (CKD) is increasing with the aging population and global obesity epidemic [1]. Impaired kidney function negatively affects quality of life and the outcomes of various disorders. Recent evidence suggests that an estimated 5-10 million people die due to kidney disease each year [2]. Thus, identifying various factors affecting kidney function is important.
Kidney function is commonly evaluated by estimated values, namely, the estimated glomerular filtration rate (eGFR), due to its greater availability than a directly measured GFR. Because of widespread serum creatinine testing and robustly developed eGFR equations [3,4], eGFR is often determined in routine health exams or when assessing the general medical condition of a patient. Lower eGFR, indicating a kidney function decline, is a sensitive biomarker related to risks of various health outcomes. Furthermore, considering that serum creatinine has a limitation in that it can be affected by body mass or dietary factors, the advantage of using a cystatin C-based eGFR, which is less affected by such bias, has been reported [4,5].
Height is an important factor related to health outcomes [6], as is body mass index (BMI) which has been the primary focus of modern medicine regarding obesity issues. Common variants explain the majority (60%) of the genetic heritability in height. Previous studies have reported that taller height may confer a lower risk of coronary artery disease and a better socioeconomic status based on Mendelian randomization (MR) analysis [7][8][9], which can reveal causal effects of exposure traits on the occurrence of complex diseases. However, the causal effect of height, independent of BMI, on kidney function has rarely been studied. Nevertheless, a previous study showed that a taller height is related to a higher single-nephron GFR in individuals with normal kidney function, similar to obesity, suggesting that a taller height is a potential risk factor for glomerular overload related to consequent kidney function impairment in later life [10]. Additionally, a recent study identified a larger glomerular size, as in those with obesity or hypertension, in taller individuals [11]. On the other hand, a possibility remains that biomarkers for determining eGFR may be affected by height, resulting in inappropriately estimated kidney function [12]. Thus, further investigation is necessary to examine the association and causal linkage between adult height and kidney function in middle-aged to elderly individuals.
In the abovementioned MR analysis, genetic instruments were used to examine the association between genetically predicted exposure and outcome traits [13]. As one's genotype is determined before birth, the genetically proxied phenotype is less likely to be affected by confounding effects or reverse causation. Therefore, MR analysis is widely adopted in the medical literature to investigate causative pathways between complex exposure and outcome traits, also benefiting from the recent availability of large-scale genetic data [14]. MR analysis has also been recently implemented in the field of nephrology to reveal risk factors for CKD [15][16][17][18][19][20].
Here, we present a cross-sectional observational study and an MR analysis focusing on causal effects of adult height on eGFR in middle-aged to elderly individuals. Along with our recent efforts to identify causative factors for kidney function through MR investigations [15][16][17][18], we hypothesized that a taller adult height, as adjusted for the effects of BMI, would be causally linked to lower kidney function in the middle-aged to elderly population.

Ethics approval
The study was performed in accordance with the Declaration of Helsinki. The study was approved by the institutional review boards of Seoul National University Hospital (No. E-2005-006-1120) and the UK Biobank consortium (application No. 53799). Informed consent was waived as the study analyzed anonymous databases.

Study setting
The study incorporated three databases or their findings: 1) the previous GIANT consortium genome-wide association study (GWAS) meta-analysis for height, which was implemented to identify genetic instruments for height; 2) the previous CKDGen consortium GWAS for kidney function traits based on serum creatinine values, which was implemented as the outcome summary statistics for summary-level MR; and 3) UK Biobank data, which were used for cross-sectional observational analysis. The UK Biobank data were also used to generate a second set of outcome summary statistics, for kidney function traits based on cystatin C levels, for summary-level MR.
The study mainly consisted of two parts (Fig 1). First, cross-sectional observational analysis was performed to investigate the association between adult height and eGFR measured in the middle-aged to elderly population in UK Biobank. However, as the investigation was cross-sectional and observational, the possibility of unmeasured confounding effects or reverse causation existed, so the results would be supportive at best and causal inference limited. Therefore, we further performed summary-level MR analyses for the eGFR or CKD outcomes using a genetic instrument for adult height developed from the GIANT study. UK Biobank data and other summary statistics for kidney function traits by the CKDGen consortium were utilized as outcome information of the summary-level MR [21].

Cross-sectional observational analysis
The abovementioned cross-sectional observational analysis was performed with UK Biobank data. In brief, UK Biobank is a population-scale prospective cohort including participants aged 40-69. The study recruited > 500,000 participants from 2006 to 2010 from 22 assessment centers in the United Kingdom, and the study is powered by its extensively phenotyped and genotyped information. Other details of the UK Biobank project can be found in the literature [22,23].
For the cross-sectional observational study, we included 467,182 UK Biobank participants with adult height values, as measured by assessment staff during the inclusion visit, and both eGFR (creatinine) and eGFR (cystatin C), as calculated by the CKD-EPI equation, which includes information on biomarkers, age, sex, and ethnicity [3,4]. The serum creatinine and cystatin C values reported in UK Biobank were measured using standardized enzymatic methods, and a previous study found that eGFR (cystatin C) is less affected by body shape measures or diet and showed superior value when predicting patient-oriented outcomes [5]. In the current observational analysis, we investigated both eGFR (creatinine) and eGFR (cystatin C) measured during the baseline visits as outcome variables.
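As an illustration of how the two eGFR values are derived from the biomarkers, the following is a minimal sketch. It assumes that the cited equations [3,4] are the CKD-EPI 2009 creatinine equation and the CKD-EPI 2012 cystatin C equation; the function and parameter names are hypothetical, and serum creatinine is assumed to be in mg/dL and cystatin C in mg/L.

```python
def egfr_creatinine_ckdepi_2009(scr_mg_dl, age, female, black=False):
    """eGFR (mL/min/1.73 m^2) from serum creatinine, CKD-EPI 2009 (assumed version)."""
    kappa = 0.7 if female else 0.9        # sex-specific creatinine knot
    alpha = -0.329 if female else -0.411  # exponent applied below the knot
    ratio = scr_mg_dl / kappa
    egfr = 141.0 * min(ratio, 1.0) ** alpha * max(ratio, 1.0) ** -1.209 * 0.993 ** age
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def egfr_cystatin_c_ckdepi_2012(scys_mg_l, age, female):
    """eGFR (mL/min/1.73 m^2) from serum cystatin C, CKD-EPI 2012 (assumed version)."""
    ratio = scys_mg_l / 0.8
    egfr = 133.0 * min(ratio, 1.0) ** -0.499 * max(ratio, 1.0) ** -1.328 * 0.996 ** age
    if female:
        egfr *= 0.932
    return egfr
```

Neither equation takes height as an input, so any association between height and eGFR arises through the biomarker concentrations themselves.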
First, the overall association between adult height and eGFR values was investigated by generalized additive models adjusted for age and BMI for each sex. Second, linear regression analysis was performed with the construction of stepwise multivariable models. The first model included basic factors: age, sex, and BMI. The second multivariable model included a list of body index variables, adding waist circumference, weight, and whole-body fat-free mass, as measured by a BC418 MA body composition analyzer (Tanita, Tokyo, Japan). The third multivariable model included a wide range of metabolic parameters and social factors collected at inclusion visits, such as hypertension history, systolic BP, diastolic BP, dyslipidemia medication, LDL cholesterol, HDL cholesterol, triglycerides, diabetes mellitus, hemoglobin A1c, testosterone level, uric acid level, history of angina/heart attack/stroke, history of cancer, average moderate physical activity frequency, and smoking history. As information was missing in the dataset, the third fully adjusted model was repeatedly constructed in an imputed dataset established by multiple imputation with the chained equations method [24]. The analysis was performed with the "glm" command in R (version 3.6.2, the R Foundation). The details of the covariates collected and missing information are described in S1 Methods.

Concept of MR and MR assumptions
As causal inference by observational analysis is limited, we further extended our findings to an MR investigation. In MR, genetic variants, which are determined at conception, are used as genetic instruments for an exposure trait of interest, and the association between genetically predicted exposure and an outcome trait is tested. Genetically proxied exposure is less likely to be affected by confounders, and the direction of the association runs from exposure to outcome; thus, a significant finding in MR can support causal effects of the exposure trait on the outcome.
A valid MR study requires three assumptions to be met to suggest causal inference [13]. First, the relevance assumption is that the genetic instrument should be closely associated with the exposure phenotype of interest. As the genetic instrument explained a certain variance regarding height and was reported in a large-scale GWAS, this assumption was considered to have been met. Particular caution is necessary with regard to the other two assumptions, i.e., exclusion-restriction and independence assumptions, meaning the absence of a horizontal pleiotropic pathway. The independence assumption is that the instrument should not be closely associated with confounding phenotypes. We performed various pleiotropy-robust MR sensitivity analyses to support attainment of the assumption and calculated the MR-Egger intercept P value, which is used to identify the presence of a significant pleiotropic effect. In addition, we performed sensitivity analysis by excluding single-nucleotide polymorphisms (SNPs) strongly associated with potential confounders. Although the exclusion restriction assumption is yet untestable, some pleiotropy-robust MR analyses relax this assumption in a portion of the instruments, further supporting the main causal estimates.

Genetic instruments for height exposure
We used the genetic instrument for adult height reported by the GIANT consortium [25]; that study was a GWAS meta-analysis of adult height using summary statistics from 79 GWASs involving 253,288 individuals of European ancestry. The study reported 697 independent SNPs reaching the genome-wide significance level (P < 5×10⁻⁸) for height by conditional and joint association analysis. Using the SNPs reported in the study has particular strength for an MR analysis, as the samples included in the GWAS meta-analysis were completely independent from the UK Biobank data, allowing a two-sample MR analysis. Two-sample MR analysis is advantageous because the possibility of false-positive findings is minimal, which supports the robustness of a positive finding in MR. Such a two-sample approach would not have been possible had we used different, more recently reported GWAS results [26], because of overlapping samples.
The previously reported 697 SNPs associated with adult height used in this study explained 15.9% of the variance in height. We first included all 697 SNPs in our investigations (S2, S3, S4, S5 Tables); we also performed further sensitivity analysis excluding SNPs with potential associations with possible confounders to more robustly satisfy the independence assumption (S6 Table).

Outcome summary statistics for summary-level MR
As mentioned above, first, we implemented the recent CKDGen GWAS meta-analysis for log-transformed eGFR based on creatinine and CKD [creatinine-based eGFR < 60 mL/min/1.73 m²] for 567,460 individuals of European ancestry [21]. These data are the largest scale summary statistics for kidney function traits, and the study reported 308 index SNPs explaining 7.1% eGFR variance and 19.6% eGFR genetic heritability. The individuals of European ancestry in CKDGen had a median age of 54 years, and 50% of them were male, which suggests that the eGFR values were obtained from a generally middle-aged population. The CKDGen data had partial overlap with the samples included in the GIANT study (e.g., TWINGENE). Summary-level MR was performed with the genetic instrument for the summary statistics of kidney function outcomes.
Because creatinine values are likely to be affected by various body shape factors, we produced summary statistics for continuous eGFR based on the cystatin C level and CKD (eGFR < 60 mL/min/1.73 m² or prevalent end-stage kidney disease history) using UK Biobank data, which are independent from the samples involved in the genetic instrument identification [4]. We identified 337,138 individuals of white British ancestry passing the basic quality control process; there were no cases of excess kinship, sex chromosome aneuploidy, or outliers in terms of heterozygosity or missing rate. The detailed genotyping and imputation process for the UK Biobank genetic data has been previously described [14]. Among the participants, baseline eGFR (cystatin C) values were available for 321,405, who were included for generating outcome summary statistics. Linear regression or logistic regression for the eGFR (cystatin C) and CKD (cystatin C) model was constructed while adjusting for age, sex, age×sex, age², and the first 10 principal components of the genetic information, and summary statistics for the SNPs included in the genetic instrument for adult height were obtained. The above outcome summary statistics were developed using PLINK 2.0 [27].
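To make the covariate-adjusted association step concrete, the following is a minimal sketch of a single-SNP linear regression for a continuous outcome. It is not the PLINK 2.0 implementation used in the study; the function name, the additive dosage coding, and the use of ordinary least squares via NumPy are assumptions made for illustration.

```python
import numpy as np


def snp_association(dosage, outcome, covariates):
    """Return (beta, SE) for one SNP from an OLS fit of outcome on dosage plus covariates."""
    g = np.asarray(dosage, dtype=float)      # 0/1/2 effect-allele counts (assumed coding)
    y = np.asarray(outcome, dtype=float)     # e.g., eGFR (cystatin C)
    c = np.asarray(covariates, dtype=float)  # e.g., age, sex, age*sex, age^2, PCs
    X = np.column_stack([np.ones_like(g), g, c])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(coef[1]), float(np.sqrt(cov[1, 1]))
```

Repeating this fit for each SNP in the height instrument would yield the per-SNP outcome effects and standard errors that serve as inputs to summary-level MR.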
F statistics, which should be over 10 to avoid weak instrument bias, were 153.7 and 87.0 for the CKDGen and the UK Biobank outcome data, respectively [28].

Statistical methods for summary-level MR for kidney function traits
Overlapping SNPs with compatible alleles that were not palindromic with intermediate allele frequencies were utilized as the genetic instrument for our summary-level MR investigation.
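The exclusion of palindromic SNPs with intermediate allele frequencies can be sketched as follows. The dictionary keys and the 0.42 minor allele frequency cutoff are illustrative assumptions; the threshold actually applied in the study is not stated.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(allele_1, allele_2):
    """A SNP is palindromic when its two alleles are complementary (A/T or C/G)."""
    return COMPLEMENT[allele_1.upper()] == allele_2.upper()


def drop_ambiguous_snps(snps, maf_threshold=0.42):
    """Keep SNPs whose strand can be resolved; the threshold is an assumed value."""
    kept = []
    for snp in snps:
        maf = min(snp["eaf"], 1.0 - snp["eaf"])  # minor allele frequency
        if is_palindromic(snp["effect_allele"], snp["other_allele"]) and maf > maf_threshold:
            continue  # strand cannot be inferred from allele frequency
        kept.append(snp)
    return kept
```

Palindromic SNPs with a clearly unequal allele frequency are retained in this sketch because the frequency itself indicates which strand was reported.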
The main method for summary-level MR was the standard multiplicative random-effect inverse variance weighting method, which can handle variant-specific heterogeneity and is more conservative than the conventional fixed-effect model [29]. Heterogeneity was assessed by Cochran's Q statistics. To test whether there was a disproportionate effect from a single SNP, single-SNP analysis, calculating the causal estimates from one SNP at a time, and leave-one-out analysis, leaving one SNP out of the analysis at a time, were performed.
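The inverse variance weighting estimator and Cochran's Q can be sketched in a few lines. This is a minimal illustration under first-order weights (outcome standard errors only), not the R package code used in the study; the function and variable names are hypothetical.

```python
import numpy as np


def ivw_multiplicative_random(beta_x, beta_y, se_y):
    """Return (estimate, SE, Cochran's Q) for multiplicative random-effects IVW."""
    bx = np.asarray(beta_x, dtype=float)  # SNP effects on the exposure (height)
    by = np.asarray(beta_y, dtype=float)  # SNP effects on the outcome (eGFR or CKD)
    w = 1.0 / np.asarray(se_y, dtype=float) ** 2
    # Weighted regression of by on bx through the origin.
    estimate = np.sum(w * bx * by) / np.sum(w * bx**2)
    # Cochran's Q: weighted sum of squared residuals around the IVW line.
    q = float(np.sum(w * (by - estimate * bx) ** 2))
    # Multiplicative random effects: inflate the fixed-effect variance by Q/(n-1),
    # never deflating it below the fixed-effect value.
    phi = max(1.0, q / (len(bx) - 1))
    se = float(np.sqrt(phi / np.sum(w * bx**2)))
    return float(estimate), se, q
```

Because the dispersion factor is floored at 1, the standard error is never smaller than the fixed-effect one, which is why the random-effect model is the more conservative choice.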
Causal estimates by the inverse variance weighting method can still be biased by unbalanced pleiotropic effects; thus, additional pleiotropy-robust MR sensitivity analysis is necessary to support MR assumptions. The current literature suggests several robust methods, and performing a set of analyses, preferably from each domain, is encouraged [30]. First, MR-Egger regression, which yields pleiotropy-robust causal estimates, was conducted [31], with calculation of the MR-Egger intercept P value, which is used to indicate the presence of a significant pleiotropic effect. Another sensitivity analysis was carried out using the weighted median method, which provides valid causal estimates even in the presence of invalid instruments in up to 50% of instrumented weights [32]. Among outlier-robust methods, we used MR-Robust, which entails inverse variance weighting analysis but downweights outlier effects [33]. We also applied MR-Robust Adjusted Profile Score (RAPS), which provides estimates that are valid when pleiotropic effects are normally distributed around zero [34]. Next, the contamination mixture method, which can detect subgroups of genetic variants with similar causal effects and perform robust MR analysis in the presence of invalid instrumental variables, was implemented [35].
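Two of these sensitivity methods, MR-Egger regression and the weighted median, can be sketched as follows. This is a simplified illustration with hypothetical function names: it returns point estimates (plus regression standard errors for MR-Egger) and omits the bootstrap standard error normally reported for the weighted median.

```python
import numpy as np


def mr_egger(beta_x, beta_y, se_y):
    """Weighted regression of outcome effects on exposure effects with an intercept."""
    bx = np.asarray(beta_x, dtype=float)
    by = np.asarray(beta_y, dtype=float)
    w = 1.0 / np.asarray(se_y, dtype=float) ** 2
    # Orient every SNP so that its effect on the exposure is positive.
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign
    X = np.column_stack([np.ones_like(bx), bx])
    xtwx = X.T @ (w[:, None] * X)
    coef = np.linalg.solve(xtwx, X.T @ (w * by))
    resid = by - X @ coef
    phi = max(1.0, float(np.sum(w * resid**2)) / (len(bx) - 2))  # dispersion, floored at 1
    se = np.sqrt(np.diag(phi * np.linalg.inv(xtwx)))
    # A non-zero intercept suggests directional pleiotropy; the slope is the causal estimate.
    return {"intercept": float(coef[0]), "slope": float(coef[1]),
            "se_intercept": float(se[0]), "se_slope": float(se[1])}


def weighted_median(beta_x, beta_y, se_y):
    """Weighted median of per-SNP ratio estimates (point estimate only)."""
    bx = np.asarray(beta_x, dtype=float)
    by = np.asarray(beta_y, dtype=float)
    se = np.asarray(se_y, dtype=float)
    ratio = by / bx
    weight = bx**2 / se**2  # inverse variance of each ratio estimate (first order)
    order = np.argsort(ratio)
    ratio, weight = ratio[order], weight[order]
    # Cumulative weight at the midpoint of each SNP, scaled to (0, 1).
    cum = (np.cumsum(weight) - 0.5 * weight) / np.sum(weight)
    return float(np.interp(0.5, cum, ratio))
```

The weighted median ignores the magnitude of outlying ratio estimates as long as they carry less than half of the total weight, which is the property described in the text.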
Finally, we performed multivariable MR [36] while adjusting for the effects of BMI, which is one of the most widely recognized anthropometric factors affecting various health outcomes, including the risk of CKD. Multivariable MR has particular strength when genetic variants that are only associated with an exposure trait of interest but not with related confounders are rare, as the method can directly adjust the genetic effects of instrumented SNPs on confounding phenotypes. For multivariable MR to support causal effects, genetic instruments should satisfy the same assumptions as in conventional MR analysis. Multivariable MR additionally requires an exclusive association between the instruments and risk factors. As BMI and height are very closely related anthropometric parameters and as the biological effect of BMI is large, we aimed to investigate causal estimates from height independent of BMI by implementing multivariable MR. The effects of BMI were provided by another GWAS meta-analysis undertaken also by the GIANT consortium and focusing on BMI traits [37]. Conditional F statistics for eGFR and CKD outcomes were 18.5 and 16.5, respectively, and information for the input data used for the multivariable MR analysis is presented in S7 and S8 Tables.
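The adjustment for BMI can be illustrated with a multivariable extension of inverse variance weighting, in which SNP effects on the outcome are regressed jointly on SNP effects on height and on BMI. This is a minimal sketch with hypothetical names, not the MVMR package code used in the study.

```python
import numpy as np


def mvmr_ivw(beta_exposures, beta_y, se_y):
    """Return (estimates, SEs), one entry per exposure, from multivariable IVW."""
    X = np.asarray(beta_exposures, dtype=float)  # shape (n_snps, n_exposures)
    by = np.asarray(beta_y, dtype=float)
    w = 1.0 / np.asarray(se_y, dtype=float) ** 2
    # Weighted least squares without an intercept.
    xtwx = X.T @ (w[:, None] * X)
    coef = np.linalg.solve(xtwx, X.T @ (w * by))
    resid = by - X @ coef
    n_snps, n_exposures = X.shape
    phi = max(1.0, float(np.sum(w * resid**2)) / (n_snps - n_exposures))
    se = np.sqrt(np.diag(phi * np.linalg.inv(xtwx)))
    return coef, se
```

With the height effects in the first column and the BMI effects in the second, the first coefficient is the estimated effect of height on the outcome with BMI held fixed.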
The summary-level MR analysis was performed using the TwoSampleMR, MendelianRandomization, and MVMR packages in R [38,39]. Bidirectional MR was not carried out because kidney function traits in this study were measured mostly in the middle- to old-age population, whereas height was generally determined before reaching adulthood. Sex-stratified MR analysis was not applied because the CKDGen data do not provide sex-specific summary statistics.

Baseline characteristics of the observational dataset
The median age of the UK Biobank participants in the observational analysis dataset was 58 years old, and 45.6% of them were males. The baseline characteristics of the study participants according to adult height above the median (≥ 162 cm for females, ≥ 176 cm for males) for each sex are described in Table 1. Those with a taller height had a younger median age and a lower BMI or proportion of obesity. However, among male participants, central obesity was common in those with a taller height. Those with a height taller than the median also had a higher household income. Regarding comorbidities, those who were taller than the median had a lower proportion of hypertension, dyslipidemia, and diabetes than those shorter than the median.

PLOS ONE

Cross-sectional observational association between height and eGFR
Overall, median eGFR values appeared to be higher in those who were taller than the median than in those shorter than the median (Table 1). However, after adjusting for age and BMI, a taller height was generally associated with a lower eGFR, as determined by both cystatin C and creatinine and in both males and females (Fig 2). In linear regression analysis (Table 2 and S9 Table), a taller height was significantly associated with a lower eGFR, as determined both by creatinine (beta -1.109, SE (standard error) 0.025, P < 0.001) and cystatin C (beta -0.821, SE 0.031, P < 0.001), in the model adjusted for age, sex, and BMI. Furthermore, these results were maintained in subgroups according to median age (age 58 years old) or sex. The inverse association between a taller height and lower eGFR was observed in further adjusted multivariable models, including those adjusted for various anthropometric measurements, metabolic parameters, comorbidities, and social factors. Only the association between eGFR (cystatin C) and height in males in the model adjusted for body shape factors was in disagreement (beta 0.600, SE 0.285, P = 0.035) with the overall trends, but the inverse association as the main result was again noted after adjustment for other characteristics.

Summary-level MR results
After disregarding 27 SNPs that were palindromic with intermediate allele frequencies and 5 SNPs that did not overlap with the CKDGen results, 665 SNPs were utilized as the genetic instrument for eGFR, and 666 SNPs were used for CKD (S2 and S3 Tables). Causal estimates for eGFR (creatinine) and CKD (creatinine) indicated that a taller genetically predicted adult height was causally linked to a lower log-eGFR (beta -0.006, SE 0.001, P < 0.001) and a higher risk of CKD (beta 0.069, SE 0.019, P < 0.001) when using the CKDGen dataset and the multiplicative random effect inverse variance method (Fig 3 and Table 3). This result remained significant in pleiotropy-robust MR sensitivity analyses, and the MR-Egger regression test for directional pleiotropy indicated no significant directional pleiotropy in the causal estimates for eGFR (MR-Egger intercept P value = 0.572) and CKD (MR-Egger intercept P value = 0.261). In addition, single-SNP analysis, leave-one-out analysis, and funnel plots visually showed the absence of disproportionate effects from some SNPs (S1, S2, S3 Figs). Although the causal estimates remained significant by the weighted median method for log-eGFR (beta = -0.005, SE = 0.001, P < 0.001), the results were nonsignificant for the CKD outcome (beta = 0.034, SE = 0.025, P = 0.178), with generally similar effect sizes and directions as the main causal estimates. Moreover, the results remained significant in multivariable MR analysis adjusted for the effects of BMI on log-eGFR (beta = -0.007, SE = 0.001, P < 0.001) and CKD (beta = 0.083, SE = 0.019, P < 0.001). The scatter plots by the main MR methods are presented in S4 Fig. The causal estimates were similar when we further excluded SNPs with genome-wide significant associations with diabetes mellitus, hypertension, or obesity.
These results were reproduced when we introduced the GWAS summary statistics from UK Biobank for eGFR (cystatin C) and CKD (cystatin C). Using 667 SNPs of the genetic instrument that were not palindromic (28 SNPs) or did not overlap (2 SNPs) in the UK Biobank dataset (S4 and S5 Tables), the causal estimates of a taller genetically predicted adult

Discussion
In the current study, which involved MR analysis, we identified a causal linkage between taller height and lower eGFR or higher risk of CKD as defined by eGFR. The observational findings showed that adult height is inversely associated with estimated kidney function, even in the stringently adjusted model considering a large number of covariates. The identified observational results are extended to causal inference by our MR analysis. The MR analysis results consistently indicated that a taller adult height causally reduces estimated kidney function in middle- to old-age individuals, independent of the effects of BMI.
Two possible hypotheses can be considered to explain the study results. First, a simple explanation is that eGFR is still an "estimated" metric of kidney function relying on equations and biomarkers; a taller height may cause a higher serum creatinine or cystatin C value, independent of other factors, yielding a lower eGFR without an actual decrease in kidney function [12]. The significant inverse observational association found even after adjusting for various anthropometric measures, and the results for eGFR (cystatin C), which is less affected by such bias, may argue against this interpretation [40]. Regardless, as directly measured GFR values were unavailable, this explanation could not be ruled out. If the explanation is valid, the causal linkage between height and eGFR still raises an important issue about underestimating kidney function when relying on eGFR values in taller individuals, even when cystatin C values are measured.
Second, there may be causal effects of height itself on actual kidney function in middle-aged to elderly individuals and not only on estimated kidney function, as the results remained significant for eGFR (cystatin C) and after adjusting for various body size parameters. This hypothesis is supported by the idea that greater height may be linked to the relative overload of single nephrons or a larger glomerular size, similar to that observed with obesity [10,11,41,42]. After reaching an age beyond the 40s, which the human body had not evolved to reach in the past, the kidney may fail to maintain its workload, which would be larger in taller individuals than in shorter individuals, resulting in decreased kidney function. Such glomerular overload has explained the consequences of obesity or diabetes in terms of kidney function, starting with glomerular hyperfiltration and finally leading to impaired kidney function [42,43]. Thus, height, another important component of body size, may affect kidney function in a manner similar to obesity [11]. However, as a direct mechanistic explanation is impossible based on this MR study, further research is warranted to investigate the biological mechanisms of the possible effects of height on kidney function.
There are several limitations to this study. First, as measured GFR values were absent, directly proving the effects of height on actual kidney function was impossible. Despite efforts to reduce the effects of the difference between estimated and actual kidney function by including cystatin C values, further research is necessary to draw conclusions regarding the mechanisms of the causal effects of adult height on kidney function in middle-aged to elderly individuals. Second, the study was mainly based on individuals of European ancestry; thus, it is unclear whether the findings can be directly applied to other ethnicities. Third, there was some sample overlap between the CKDGen data and the samples of the GWAS used to identify genetic instruments, which may cause bias towards false-positive findings. However, the results from the two-sample MR using the independent UK Biobank data were similar to those from the CKDGen data; thus, this issue is unlikely to change the findings of the current study. Fourth, the study only addressed adult height and estimated kidney function in those of middle to old age, and the link between height and kidney function in the pediatric population or young adults was beyond the scope of this study. Last, as MR analysis is weak in detecting nonlinear effects, it remains to be determined whether the findings can be applied to an extreme height range (e.g., very low height).
In conclusion, taller adult height causally reduces eGFR, as determined by either creatinine or cystatin C values, in middle-aged to elderly individuals. Further study is necessary to elucidate the mechanism of this causal link, and whether a taller adult height causes glomerular overload leading to impaired kidney function later in life should be investigated.