Association of eGFR-Related Loci Identified by GWAS with Incident CKD and ESRD

Family studies suggest a genetic component to the etiology of chronic kidney disease (CKD) and end stage renal disease (ESRD). Previously, we identified 16 loci for eGFR in genome-wide association studies, but the associations of these single nucleotide polymorphisms (SNPs) with incident CKD or ESRD are unknown. We thus investigated the association of these loci with incident CKD in 26,308 individuals of European ancestry free of CKD at baseline drawn from eight population-based cohorts followed for a median of 7.2 years (including 2,122 incident CKD cases defined as eGFR <60 ml/min/1.73 m² at follow-up) and with ESRD in four case-control studies in subjects of European ancestry (3,775 cases, 4,577 controls). SNPs at 11 of the 16 loci (UMOD, PRKAG2, ANXA9, DAB2, SHROOM3, DACH1, STC1, SLC34A1, ALMS1/NAT8, UBE2Q2, and GCKR) were associated with incident CKD; p-values ranged from p = 4.1×10⁻⁹ in UMOD to p = 0.03 in GCKR. After adjusting for baseline eGFR, six of these loci remained significantly associated with incident CKD (UMOD, PRKAG2, ANXA9, DAB2, DACH1, and STC1). SNPs in UMOD (OR = 0.92, p = 0.04) and GCKR (OR = 0.93, p = 0.03) were nominally associated with ESRD. In summary, the majority of eGFR-related loci are either associated or show a strong trend towards association with incident CKD, but have modest associations with ESRD in individuals of European descent. Additional work is required to characterize the association of genetic determinants of CKD and ESRD at different stages of disease progression.


Introduction
Chronic kidney disease (CKD) and end stage renal disease (ESRD) are associated with significant cardiovascular morbidity and mortality, with substantial economic burden [1][2][3][4]. Diabetes and hypertension are the primary risk factors for CKD and ESRD [5][6][7][8] but do not fully account for CKD and ESRD risk [9][10][11]. Studies indicate familial aggregation of ESRD [12]. In African Americans, high risk common variants in the MYH9/APOL1 locus account for much of the excess genetic risk for non-diabetic ESRD compared to their counterparts of European descent. In contrast, comparable genetic risk loci of severe renal phenotypes have not been identified in individuals of European ancestry [13][14][15].
Recently, 16 genetic risk loci associated with estimated glomerular filtration rate (eGFR) and prevalent CKD were identified and replicated by genome-wide association studies (GWAS) in about 70,000 individuals of European ancestry in the CKDGen consortium [16,17]. Two of these loci were also identified by an independent consortium [18]. However, these studies focused on eGFR and prevalent CKD (defined as eGFR <60 ml/min/1.73 m²) at one time point, which encompasses the entire spectrum of CKD and does not address the question of whether these genetic factors are involved in the initiation of CKD or in the progression to ESRD, the most advanced stage of CKD. We thus sought to analyze the association of the previously identified 16 eGFR-associated loci with the development of CKD and with ESRD in a total of over 34,000 individuals of European descent.

Association of SNPs with Incident CKD
Overall, 26,308 individuals of European descent, from eight population-based prospective studies, who were free of CKD at baseline were included in the incident CKD analysis (Table 1). At baseline, mean age ranged from 40.5 to 71.7 years. After a median follow-up of 7.2 years, 2122 participants developed incident CKD.
At each of the significant loci, the direction and the magnitude of the association were similar to those from the discovery analyses of eGFR and prevalent CKD [17]. For example, at the UMOD locus, each copy of the minor T allele at rs12917707 was associated with a 24% reduced risk for incident CKD, while in the CKDGen consortium the same allele was associated with higher eGFR [17]. Though the associations between incident CKD and SNPs in SLC7A9, ATXN2, PIP5K1B and VEGFA were not significant, the direction and magnitude of associations were consistent with our previous findings for the phenotypes eGFR and prevalent CKD [16,17]. TFDP2 was the only locus where we did not observe association with incident CKD. Of the 16 SNPs tested, 15 had the same direction of association with incident CKD as their original associations with prevalent CKD. The probability of observing this many SNPs with consistency in direction of associations is 0.0002. We did not observe evidence for heterogeneity between studies at any of the 16 loci (test for heterogeneity p>0.05 for all SNPs).
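The text does not say how the probability of 0.0002 for directional consistency was obtained. One plausible reading, offered here only as an assumption, is a sign test: under the null hypothesis each SNP is equally likely to show either direction, so the number of concordant SNPs follows a binomial distribution with n = 16 and p = 0.5. The function names below are illustrative.

```python
from math import comb


def prob_exact(k: int, n: int, p: float = 0.5) -> float:
    """Binomial probability of exactly k concordant SNPs out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Binomial probability of k or more concordant SNPs out of n."""
    return sum(prob_exact(i, n, p) for i in range(k, n + 1))


# Assumption: directions are independent and equally likely under the null.
p_exact_15 = prob_exact(15, 16)        # 16 / 2**16
p_at_least_15 = prob_at_least(15, 16)  # 17 / 2**16

print(f"P(exactly 15 of 16)  = {p_exact_15:.6f}")
print(f"P(at least 15 of 16) = {p_at_least_15:.6f}")
```

Under this assumption the probability of exactly 15 concordant SNPs is 16/65,536 (about 0.00024), which agrees with the reported value; the tail probability of 15 or more is 17/65,536 (about 0.00026).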

Association of SNPs with ESRD
For the ESRD analysis, we included four case-control studies with a total of 3775 ESRD patients and 4577 controls of European descent without CKD (Table 3). Mean age ranged from 50.7 to 66.2 years in cases and from 47.7 to 62.1 years in controls. Although the direction and magnitude of association for 8 SNPs (at the UMOD, GCKR, PIP5K1B, PRKAG2, STC1, VEGFA, SHROOM3, and ALMS1/NAT8 loci) were consistent with our previous findings for eGFR and prevalent CKD [16,17], only two SNPs showed nominally significant associations with ESRD (Table 2): rs1260326 in GCKR (OR = 0.93; p-value = 0.03) and rs12917707 in UMOD (OR = 0.92; p-value = 0.04). The lack of association was not likely due to heterogeneity of ESRD cases, as only two SNPs showed moderate heterogeneity in their associations with ESRD (Table 2): rs4744712 at the PIP5K1B locus (p = 0.04 for heterogeneity) and rs626277 at the DACH1 locus (p = 0.02 for heterogeneity).

Author Summary
Chronic kidney disease (CKD) affects about 6%-11% of the general population, and progression to end stage renal disease (ESRD) has a significant public health impact. Family studies suggest that the risk for CKD and ESRD is heritable. Unraveling the genetic underpinning of risk for these diseases may lead to the identification of novel mechanisms and thus diagnostic and therapeutic tools. We have previously identified 16 genetic markers in association with kidney function and prevalent CKD in general population studies. However, little is known about the relevance of these SNPs to the initial development of CKD or to ESRD risk. Therefore, we have now analyzed the association of these markers with the initiation of CKD in more than 26,000 individuals from the general population using serial estimations of kidney function, and with ESRD in four case-control studies in subjects of European ancestry (3,775 cases, 4,577 controls). We show that many of the 16 markers are also associated or show a strong trend towards association with initiation of CKD, while only 2 markers are nominally associated with ESRD. Further work is required to characterize the association of genetic determinants of different stages of CKD progression.

Discussion
Among individuals of European ancestry, most genetic loci associated with the quantitative trait eGFR are also associated with risk for initiation of CKD, with more than half of these associations independent of eGFR at the baseline examination. In contrast, only two SNPs were nominally associated with ESRD.
For the ESRD analysis, we had adequate power to detect effects that were similar to those for prevalent CKD in the discovery GWAS, where odds ratios ranged from 0.8 to 1.19 [16,17]. In the present study, where associations were observed, the odds ratios for ESRD tended to be smaller and ranged from 0.92 to 1.11. There are several potential explanations for this effect dilution. First, the mechanisms involved in the initiation of CKD, the progression of CKD, and the incidence of ESRD may differ [30][31][32][33]. Experimental animal data and gene expression profiling in human kidney biopsies suggest differential biological pathways contributing to kidney disease initiation and progression [34][35][36]. Second, the majority of patients with CKD die of cardiovascular disease before developing ESRD [37][38][39].
Thus, the genetic findings for kidney function in the general population may not apply to the highly selected group of dialysis populations. Finally, the process of progression from CKD to ESRD often involves repeated insults including episodes of acute kidney injury by diagnostic and operative procedures and therapies [40][41][42][43], cardiac function deterioration [44], variation in access to adequate health care [45,46] and other non-genetic factors [47]. Jointly, these factors may further decrease the relative impact of the small effects of SNPs derived from GWAS of eGFR in the general population at the earliest stage of disease initiation. The observed small effect sizes for ESRD in our study are in contrast to the large effect sizes observed in relatively small cohorts of individuals of African descent for variants in the MYH9/APOL1 locus, where odds ratios for ESRD ranged from 7.3 for the G1-G2 haplotype at the APOL1 locus to 2.38 for the E1 haplotype in the MYH9 locus [13][14][15]. However, the strong effect at this locus is an exceptional case and may be a consequence of a pronounced positive selection against vulnerability for Trypanosoma brucei rhodesiense infection at the price of a higher susceptibility for nondiabetic ESRD in African Americans not observed in other ethnicities. The establishment of large cohorts is thus needed for performing GWAS of CKD initiation and progression as well as ESRD to overcome the challenge of identifying novel loci significantly associated with these phenotypes with small effect sizes.
The strength of our work lies in the large number of individuals studied. Further, we exclusively analyzed candidate SNPs identified by the unbiased method of GWAS [16,17]. However, some limitations warrant mention. First, seven of the eight cohorts used for the incident CKD analysis were also part of the CKDGen discovery effort; thus the two samples are not entirely "independent". However, the phenotype studied differs substantially: in Köttgen et al. [17], we used prevalent eGFR data including those with CKD, while follow-up data in those without CKD at the baseline examination were used for the present incident CKD analysis. In the present work, we demonstrate robustness of our findings independent of baseline eGFR. Second, we relied on only two serum creatinine measurements to define incident CKD, which may have introduced misclassification and biased our findings towards the null. Third, we did not account for pharmacological treatment with inhibitors of the renin-angiotensin-aldosterone system. Since these drugs may affect kidney function independently of kidney damage, their use may have diluted observable genetic effects [48]. Fourth, our study was not designed to detect fluctuations in eGFR. Furthermore, the etiology of ESRD in the cases we examined may vary between studies, though we observed a low degree of heterogeneity. Finally, our sample consisted of individuals of European ancestry; findings may not be generalizable to other ethnicities.
SNPs associated with eGFR in population-based studies are associated with incident CKD, whereas modest associations were observed with ESRD. Additional work is necessary to characterize the genetic underpinnings across the full range of kidney disease phenotypes, which could ultimately lead to novel diagnostic and therapeutic strategies.

Ethics statement
In all studies, all participants gave informed consent. All studies were approved by their appropriate Research Ethics Committees.

Study design and phenotype definition
In population-based cohorts, serum creatinine measurements were calibrated to the National Health and Nutrition Examination Survey (NHANES) standards in all studies to account for between-laboratory variation across studies, as described previously [10,16,17]. Using calibrated serum creatinine, we calculated the estimated glomerular filtration rate (eGFR) with the 4-variable MDRD equation [49].
For incident CKD, we analyzed eight population-based cohorts in the CKDGen consortium with follow-up available: ARIC, CHS, CoLaus, FHS, KORA S3/F3, KORA S4/F4, the Rotterdam Study and SHIP. Each study's design is shown in Text S1. Incident CKD cases were defined as those free of CKD at baseline (defined as eGFR ≥60 ml/min/1.73 m²) but with a follow-up eGFR <60 ml/min/1.73 m². Controls were those free of CKD at baseline and at follow-up. For the ESRD analysis, we performed four case-control studies of ESRD. Cases were drawn from six cohorts of ESRD patients: CHOICE, ArMORR, GENDIAN, 4D, MMKD and FHKS. Controls were those free of CKD (defined as eGFR ≥60 ml/min/1.73 m²) in three population-based cohorts (KORA F3, KORA F4, SAPHIR) and one type 2 diabetes cohort (GENDIAN). Each study's design is shown in Text S1.
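The phenotype definition above can be illustrated with a short sketch. It assumes the original 4-variable MDRD coefficients (186, with exponents -1.154 for serum creatinine in mg/dl and -0.203 for age); the text does not state which version of the equation reference [49] corresponds to, and a re-expressed version with coefficient 175 exists for standardized creatinine assays. The function names and the "excluded" label are illustrative and not taken from the study.

```python
CKD_THRESHOLD = 60.0  # ml/min/1.73 m^2, the cut-off used in the text


def mdrd_egfr(scr_mg_dl: float, age_years: float,
              female: bool, black: bool = False) -> float:
    """4-variable MDRD eGFR in ml/min/1.73 m^2.

    Assumption: original coefficient of 186; calibrated serum creatinine
    is supplied in mg/dl.
    """
    egfr = 186.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def classify_incident_ckd(baseline_egfr: float, followup_egfr: float) -> str:
    """Apply the case/control definition for the incident CKD analysis."""
    if baseline_egfr < CKD_THRESHOLD:
        return "excluded"  # prevalent CKD at baseline, not at risk
    if followup_egfr < CKD_THRESHOLD:
        return "case"      # free of CKD at baseline, CKD at follow-up
    return "control"       # free of CKD at both examinations


# Hypothetical participant: 50-year-old man, creatinine 1.0 then 1.4 mg/dl
baseline = mdrd_egfr(1.0, 50, female=False)
followup = mdrd_egfr(1.4, 57, female=False)
print(round(baseline, 1), round(followup, 1),
      classify_incident_ckd(baseline, followup))
```

Because case status rests on a single threshold crossing between two measurements, small creatinine fluctuations near 60 ml/min/1.73 m² can change the classification, which is the misclassification concern raised in the Discussion.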

Statistical methods
In each study, we performed age- and sex-adjusted logistic regression of incident CKD, with and without additional adjustment for baseline eGFR, or of ESRD status on each SNP. In multicenter studies, further adjustment for study center was performed to account for possible differences between recruiting centers. For family-based studies, we applied logistic regression via generalized estimating equations (GEE) to account for familial relatedness. Study-specific results were then combined by meta-analysis using a fixed effects model, using METAL (http://www.sph.umich.edu/csg/abecasis/Metal/index.html) [50]. When significant heterogeneity between studies was observed (p for heterogeneity between studies <0.05), we used the random effects model [51]. Statistical significance was defined as a one-sided p-value <0.05 for each SNP without adjustment for multiple testing, since all SNPs examined had strong prior probabilities of being associated with the outcomes and the same alleles were hypothesized to be associated with lower eGFR, incident CKD, and ESRD.
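The pooling step can be sketched as follows. This is a minimal illustration, not the METAL implementation: it assumes inverse-variance weighting of per-study log odds ratios (the text does not say which METAL weighting scheme was used) and a DerSimonian-Laird estimator for the random effects model (an assumption about reference [51]). All input values are hypothetical.

```python
import math


def fixed_effects(betas, ses):
    """Inverse-variance pooled log OR and its standard error."""
    weights = [1.0 / se**2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def cochran_q(betas, ses):
    """Cochran's Q; compared with chi-square on k-1 degrees of freedom."""
    beta_fixed, _ = fixed_effects(betas, ses)
    return sum(((b - beta_fixed) / se) ** 2 for b, se in zip(betas, ses))


def random_effects(betas, ses):
    """DerSimonian-Laird pooled log OR, standard error, and tau^2."""
    k = len(betas)
    w = [1.0 / se**2 for se in ses]
    c = sum(w) - sum(x * x for x in w) / sum(w)
    tau2 = max(0.0, (cochran_q(betas, ses) - (k - 1)) / c)
    w_star = [1.0 / (se**2 + tau2) for se in ses]
    total = sum(w_star)
    beta = sum(x * b for x, b in zip(w_star, betas)) / total
    return beta, math.sqrt(1.0 / total), tau2


def one_sided_p(beta, se, expect_negative=True):
    """P-value in the pre-specified direction of effect."""
    z = -beta / se if expect_negative else beta / se
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical per-study log odds ratios and standard errors for one SNP
betas = [-0.25, -0.30, -0.20]
ses = [0.10, 0.12, 0.08]

beta_fe, se_fe = fixed_effects(betas, ses)
beta_re, se_re, tau2 = random_effects(betas, ses)
print(f"fixed OR = {math.exp(beta_fe):.3f}, Q = {cochran_q(betas, ses):.3f}")
print(f"tau^2 = {tau2:.4f}, one-sided p = {one_sided_p(beta_fe, se_fe):.2e}")
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the random effects estimate coincides with the fixed effects one, so switching models matters only for SNPs with evident heterogeneity.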

Power estimation
We used the QUANTO software for power estimation, assuming an additive genetic model (http://hydra.usc.edu/GxE) [52]. For the ESRD analysis and for SNPs with minor allele frequency ranging from 0.2 to 0.4, we had 80-100% power to detect an OR ≥ 1.10, whereas power was borderline for an OR of 1.05 to 1.09. For example, for the SNP rs12917707 at UMOD, we had 100% power to detect an association with ESRD in the 3775 ESRD cases and 4577 controls, assuming that the effect in ESRD would be the same as or larger than the effect observed for prevalent CKD previously [16,17].
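To give a feel for the orders of magnitude involved, the sketch below approximates power with a normal approximation to an allele-based test of the log odds ratio. It is not the QUANTO algorithm, and its results need not match QUANTO output exactly. It assumes Hardy-Weinberg proportions, a log-additive effect, the minor allele frequency given for controls, and the one-sided alpha of 0.05 described under Statistical methods.

```python
import math
from statistics import NormalDist


def allelic_power(odds_ratio: float, maf_controls: float,
                  n_cases: int, n_controls: int,
                  alpha: float = 0.05, one_sided: bool = True) -> float:
    """Approximate power to detect a per-allele odds ratio.

    Assumption: variance of the log OR is taken from allele counts,
    1/(2*N*p*(1-p)) summed over cases and controls.
    """
    odds_controls = maf_controls / (1.0 - maf_controls)
    odds_cases = odds_controls * odds_ratio
    maf_cases = odds_cases / (1.0 + odds_cases)
    variance = (1.0 / (2 * n_cases * maf_cases * (1.0 - maf_cases))
                + 1.0 / (2 * n_controls * maf_controls * (1.0 - maf_controls)))
    z_effect = abs(math.log(odds_ratio)) / math.sqrt(variance)
    tail = alpha if one_sided else alpha / 2.0
    z_alpha = NormalDist().inv_cdf(1.0 - tail)
    return NormalDist().cdf(z_effect - z_alpha)


N_CASES, N_CONTROLS = 3775, 4577  # sample sizes of the ESRD analysis

for maf in (0.2, 0.4):
    for odds_ratio in (1.05, 1.10):
        power = allelic_power(odds_ratio, maf, N_CASES, N_CONTROLS)
        print(f"MAF={maf:.1f} OR={odds_ratio:.2f} power={power:.2f}")
```

Under these assumptions the approximation gives power of roughly 0.80 for an OR of 1.10 at a minor allele frequency of 0.2, rising with allele frequency, and markedly lower power for an OR of 1.05, in line with the qualitative statements above.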

Genotyping methods and quality control
For the incident CKD analysis, we used the allele dosage information of each of the 16 SNPs from each study's genome-wide data set imputed to HapMap CEU samples, as described previously [17,18]. Imputation provides a common SNP panel across all studies to facilitate a meta-analysis across all contributing SNPs. Information on each study's genotyping and imputation platform and quality control procedures is shown in Table S1. Table S2 summarizes each SNP's imputation quality.
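For readers unfamiliar with allele dosages, the following sketch shows how a dosage is conventionally derived from imputed genotype probabilities: it is the expected number of copies of the coded allele, a value between 0 and 2. The three-probability input format and the function name are assumptions for illustration; the text does not describe the file formats used by the individual studies.

```python
def expected_dosage(p_aa: float, p_ab: float, p_bb: float) -> float:
    """Expected count of the coded allele B from genotype probabilities.

    Assumption: the three posterior probabilities (AA, AB, BB) sum to 1.
    """
    total = p_aa + p_ab + p_bb
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"genotype probabilities sum to {total}, not 1")
    return p_ab + 2.0 * p_bb


# Hypothetical imputed genotype: most likely BB, some uncertainty
dosage = expected_dosage(0.1, 0.3, 0.6)
print(dosage)
```

Using the dosage as the predictor in logistic regression, rather than the single most likely genotype, carries the imputation uncertainty into the association estimate.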

Supporting Information
Table S1 Genotyping and Imputation Platforms Used by Studies in the incident CKD analysis. (DOC)
Text S1 Study-specific details. (DOC)