Genome-Wide Association and Trans-ethnic Meta-Analysis for Advanced Diabetic Kidney Disease: Family Investigation of Nephropathy and Diabetes (FIND)

Abstract
Diabetic kidney disease (DKD) is the most common etiology of chronic kidney disease (CKD) in the industrialized world and accounts for much of the excess mortality in patients with diabetes mellitus. Approximately 45% of U.S. patients with incident end-stage kidney disease (ESKD) have DKD. Independent of glycemic control, DKD aggregates in families and has higher incidence rates in African, Mexican, and American Indian ancestral groups relative to European populations. The Family Investigation of Nephropathy and Diabetes (FIND) performed a genome-wide association study (GWAS) of 6,197 unrelated individuals of European American, African American, Mexican American, or American Indian ancestry, contrasting cases with advanced DKD against healthy and diabetic individuals lacking nephropathy. A large-scale replication and trans-ethnic meta-analysis included 7,539 additional European American, African American and American Indian DKD cases and non-nephropathy controls. Within-ethnic-group meta-analysis of discovery GWAS and replication set results identified genome-wide significant evidence for association between DKD and rs12523822 on chromosome 6q25.2 in American Indians (P = 5.74 × 10⁻⁹). The strongest signal of association in the trans-ethnic meta-analysis was with a SNP in strong linkage disequilibrium with rs12523822 (rs955333; P = 1.31 × 10⁻⁸), with directionally consistent results across ethnic groups. These 6q25.2 SNPs are located between the SCAF8 and CNKSR3 genes, a region with DKD-relevant changes in gene expression and an eQTL with IPCEF1, a gene co-translated with CNKSR3. Several other SNPs demonstrated suggestive evidence of association with DKD, within and across populations. These data identify a novel DKD susceptibility locus with consistent directions of effect across diverse ancestral groups and provide insight into the genetic architecture of DKD.

Introduction
Diabetic kidney disease (DKD) is a devastating complication in patients with diabetes mellitus (DM) and is associated with high risk for cardiovascular disease and death. [1,2] DKD is the leading cause of end-stage kidney disease (ESKD) requiring renal replacement therapy in developed nations; these procedures incur high healthcare costs with great personal, family and societal burden. [3] The prevalence of DKD continues to rise in the United States in proportion to the growing prevalence of DM. Unfortunately, intensification of glycemic, lipid and blood pressure control has not dramatically impacted the prevalence of DKD. [3,4] Hyperglycemia alone is insufficient to cause DKD. Genetic factors appear critical in its pathogenesis based upon variable incidence rates of DKD between population groups, aggregation of DKD-associated ESKD in families, and the highly heritable nature of diabetic renal histologic changes, estimated glomerular filtration rate (eGFR) and proteinuria. [5] Genome-wide association studies (GWAS) have identified multiple loci for kidney function and chronic kidney disease (CKD) in population- and community-based cohorts, primarily of European ancestry. [6-10] However, CKD phenotypes in many studies included minimally to moderately reduced eGFR, not fully reflective of the progressive forms of CKD seen in kidney disease clinics. In early reports, published GWAS signals for DKD were equivocal, confounded by small sample sizes and failure to consistently replicate. Recently, the GEnetics of Nephropathy: an International Effort (GENIE) consortium identified genome-wide significant, replicated signals in a meta-analysis of over 12,000 type 1 (T1) DM patients with DKD of European ancestry. [9] Type 2 (T2) DM is far more prevalent than T1DM, accounting for 90% of cases worldwide and for the majority of prevalent cases of DKD.
Relative to European Americans (EAs) with T2DM, African American (AA), American Indian (AI), and Mexican American (MA) patients with T2DM are disproportionately affected by severe DKD, [3] yet under-represented in genetic analyses. Defining the underlying genetic architecture responsible for advanced T2DM-associated kidney disease in multiple populations could provide critical insights into pathogenesis and identify new molecular targets for therapy. We report the results of a GWAS in AA, EA, MA, and AI patients with DKD enrolled in the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)-sponsored "Family Investigation of Nephropathy and Diabetes" (FIND) [11] and the corresponding large replication study and trans-ethnic meta-analysis.

Results and Discussion
Demographic characteristics of the Discovery and FIND Large Replication (FILR) study samples that met FIND phenotype qualifications and genotype quality control (QC) are summarized in Table 1 and S1 Table. The proportion of females, age at ESKD or enrollment, hemoglobin A1c and proportion with diabetic retinopathy (DR) varied by ancestry but were generally comparable between the Discovery and FILR samples within a specific ethnic population. Phenotypic differences among these populations and their genetic and DKD prevalence differences motivated the meta-analysis approach.
The principal component (PC) analysis identified PCs that genetically partitioned the Discovery sample into ancestry groups consistent with self-report. S1 Fig displays the two-dimensional partitioning via PC analysis with the boundaries for inclusion into the GWAS analysis. The logistic regression model that included the PCs as covariates reduced the inflation factor to nominal levels and, combined with the P-P plot, shows no evidence of systematic inflation (S2 Fig). In the replication, the inflation factor was λ = 1.05 using 278 ancestry informative markers (AIMs). Scaled to 1,000 cases and 1,000 controls, this corresponds to λ₁₀₀₀ = 1.017, an appropriate inflation factor for the replication study. S3 Fig provides a summary of the statistical power analyses for the race-specific discovery, replication analysis, and meta-analyses. These calculations show that for risk-predisposing variants shared across ancestries the study has power >0.50 and 0.80 to detect odds ratios (OR) on the order of 1.06 to 1.15 for a minor allele frequency (MAF) = 0.45 and 1.30 to 1.36 for a MAF = 0.05, respectively. Further, leveraging the differences in linkage disequilibrium (LD) among the four ancestries, the study is powered to potentially reduce the size of the associated region via trans-ethnic mapping.
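The rescaling of the replication inflation factor can be reproduced with a short calculation. The sketch below assumes the conventional sample-size rescaling of λ to 1,000 cases and 1,000 controls; the text does not spell out the formula, so the function name and the formula are assumptions, while the case and control counts are the FILR totals listed under "Samples analyzed".

```python
def lambda_1000(lam, n_cases, n_controls):
    """Rescale a genomic inflation factor to 1,000 cases and 1,000 controls.

    Assumed (conventional) formula:
    lambda_1000 = 1 + (lambda - 1) * (1/n_cases + 1/n_controls) / (1/1000 + 1/1000)
    """
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale


# FILR replication totals, summed over the AA, AI and EA groups.
n_cases = 950 + 471 + 582                                   # DKD cases
n_controls = (50 + 1887) + (340 + 486) + (205 + 2568)       # DM and non-DM controls

print(round(lambda_1000(1.05, n_cases, n_controls), 3))     # 1.017
```

Under these assumptions the calculation returns the reported value of 1.017, which suggests the stated λ₁₀₀₀ is based on the pooled replication sample.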

Trans-ethnic Meta-Analysis Associations
The only locus that reached genome-wide significance for DKD in the trans-ethnic meta-analysis encompassing all FIND Discovery and Replication samples was rs955333 on chromosome 6 (minimum p-value 1.31 × 10⁻⁸ [additive]; minimum p-value 9.02 × 10⁻¹¹ [dominant]) (Table 2). Fig 1 contains the Manhattan plot for the meta-analysis across all ancestries included in the Discovery and Replication samples. Consistent directions of association were present in three ethnic groups (the SNP did not pass QC only in AA samples) and several supporting single nucleotide polymorphisms (SNPs) were detected in the region (regional plot in Fig 2). This SNP lies between the SR-like carboxyl-terminal domain associated factor 8 gene (SCAF8) and the connector enhancer of KSR family of scaffold proteins gene (CNKSR3), suggesting a possible role in transcription regulation. CNKSR3 is a direct mineralocorticoid receptor target gene involved in regulation of the epithelial sodium channel (ENaC) on the apical membrane of cells in the distal nephron. [12] CNKSR3 is highly expressed in the renal cortical collecting duct and upregulated in response to physiologic aldosterone concentrations. ENaC precisely regulates renal sodium absorption and plays important roles in maintenance of plasma volume and blood pressure. Ziera et al. [12] suggested that CNKSR3, a PSD-95/DLG-1/ZO-1 (PDZ) domain containing protein, inhibits the RAS/ERK signaling pathway, stimulating ENaC activity with enhanced renal sodium absorption. More recently, CNKSR3 was shown to function as an aldosterone-induced scaffolding platform that orchestrated assembly of ENaC and its regulators Nedd4-2, Raf-1 and SGK-1 and was essential for stimulation of ENaC function by aldosterone. [13] Clinically, renin-angiotensin-aldosterone system (RAAS) blockade serves as a mainstay of therapy for patients with DKD and other proteinuric kidney diseases. [14,15] Inhibition of aldosterone may further limit renal fibrosis, independent of natriuretic effects. [16,17]
Hence, significant association between DKD and markers near CNKSR3 is consistent with clinical trial data demonstrating that blockade of the renin-angiotensin system or the aldosterone receptor slows DKD progression. However, further experiments are needed to demonstrate that the associated SNP regulates the pathogenesis of progressive DKD. Further studies will be necessary to assess whether CNKSR3 regulates DKD pathogenesis indirectly by its effects on ENaC activity or directly by promoting aldosterone-dependent fibrosis.
Less is known about the function of SCAF8, also known as RBM16. SCAF8 is an RNA maturation factor recruited to the carboxy-terminal domain of RNA polymerase II in a phosphorylation-dependent manner. [18] It also is a target for ataxia telangiectasia mutated (ATM) kinase, a crucial component of the DNA damage response required for DNA repair and cell cycle control. [19] ATM kinase is associated with responsiveness of patients with DM to the insulin sensitizer metformin in some but not all studies. [20,21] Thus, genes in the region of rs955333 are suggestive of DKD-related pathogenesis.
GWAS loci identify elements that may regulate gene expression, and recent data indicate GWAS associations are located in regions bounded by recombination hot spots near non-coding causal variants, which regulate transcription. [22,23] We next contrasted transcript abundance of the genes within the megabase region centered on rs955333 (TIAM2, SCAF8, CNKSR3, IPCEF1 and OPRM1) in DKD and living donor kidney biopsies. DKD biopsies were obtained from European and AI cohorts and were analyzed separately. All five genes show statistically significant differential expression in at least one kidney tissue compartment of one population. SCAF8 steady state mRNA levels show increased expression in DKD compared to living donor biopsies in glomerular and tubulo-interstitial compartments of both populations (S2B Table); TIAM2 and OPRM1 show glomerular-specific differential expression; IPCEF1 is repressed in both tissue compartments of AI subjects; and CNKSR3 is increased in the tubulo-interstitial compartment of AIs (S2B Table). Normalized tubulo-interstitial expression of CNKSR3 correlated with urine albumin (r = 0.78, q = 0.0056) and urine albumin:creatinine ratio (UACR) (r = 0.74, q = 0.0107). In addition, IPCEF1, located downstream of CNKSR3, has been reported to be translated with CNKSR3 as one protein, [24] and has a tubulo-interstitial expression quantitative trait locus (eQTL) (NM_001130699, rs249964, P = 2.34 × 10⁻⁴) (S2A Table). LD between rs249964 and the sentinel variants in the region significantly associated with DKD in the trans-ethnic (rs955333) and AI association analysis (rs12523822; see below) is negligible (D' = 0.43, r² = 0.01 in AI). However, tubulo-interstitial expression of IPCEF1 in kidney tissue from AIs was significantly correlated with the DKD phenotype UACR (r = -0.54, q = 0.031). These studies were limited by the small number of available biopsies and the narrow criteria used to define the region of interest (see Methods). As proxies, disease-dependent differential gene expression and the rs249964 eQTL demonstrate DKD regulatory activity in the locus. Significant results of eQTL and differential gene expression analyses for other loci in Table 2 are also shown in S2A Table and S2B Table, respectively.
Table 2 notes. Direction: RA is risk allele. The odds ratio (OR) is presented for the risk allele, compared with the non-risk allele, for a given model. The FIND ancestry groups are presented in the following order: AA-AI-EA-MA. A "+" or "-" indicates the direction of the effect in individuals of a specific ancestry. A "?" denotes that the indicated SNP did not pass QC in that ancestry and the results were not included in the meta-analysis. *a, additive; r, recessive; d, dominant. doi:10.1371/journal.pgen.1005352.t002
Fig 2. doi:10.1371/journal.pgen.1005352.g002
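The direction notation used with Table 2 (ancestries ordered AA-AI-EA-MA, "+" or "-" for the effect direction, "?" for a SNP that failed QC and was left out) can be illustrated with a small pooling routine. The sketch below assumes an inverse-variance fixed-effects meta-analysis with a normal-approximation p-value; the study's actual method is described in S1 Text and may differ, and the per-ancestry log odds ratios and standard errors are hypothetical values chosen only for illustration.

```python
import math

ANCESTRY_ORDER = ["AA", "AI", "EA", "MA"]  # order used in the Table 2 direction column


def fixed_effect_meta(estimates):
    """Pool per-ancestry log odds ratios with inverse-variance weights.

    estimates maps ancestry -> (beta, se), or None when the SNP failed QC.
    Returns (direction string, pooled odds ratio, two-sided p-value).
    """
    direction = ""
    weighted_sum = 0.0
    weight_total = 0.0
    for ancestry in ANCESTRY_ORDER:
        estimate = estimates.get(ancestry)
        if estimate is None:
            direction += "?"          # excluded from the pooled estimate
            continue
        beta, se = estimate
        direction += "+" if beta > 0 else "-"
        weight = 1.0 / se ** 2
        weighted_sum += weight * beta
        weight_total += weight
    beta_pooled = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    z = beta_pooled / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    return direction, math.exp(beta_pooled), p_value


# Hypothetical inputs: the SNP failed QC in AA and has positive effects elsewhere.
example = {"AA": None, "AI": (0.40, 0.08), "EA": (0.25, 0.09), "MA": (0.20, 0.10)}
direction, pooled_or, p_value = fixed_effect_meta(example)
print(direction, round(pooled_or, 2), f"{p_value:.1e}")
```

With these made-up inputs the direction string is "?+++", the same pattern described for rs955333, where three groups agree in direction and the AA result is unavailable.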

African American Associations
No SNP reached genome-wide significance (P < 5 × 10⁻⁸) in the AA GWAS; however, a number provided suggestive evidence for association with DKD (Table 3; S5A Table and S6A Table summarize the top 200 SNP associations in the discovery GWAS and replication study, respectively). The strongest associations were found within the apolipoprotein L1 (APOL1) and non-muscle myosin heavy chain 9 (MYH9) gene region on 22q (Table 2, Discovery + FILR meta-analysis: rs5750250, P = 7.7 × 10⁻⁸; rs136161, P = 5.23 × 10⁻⁷). Since G1 and G2 variants of APOL1 are strongly associated with non-diabetic nephropathy in AA patients, [25-27] the G1/G2 compound risk was modeled under a recessive genetic model and these variants accounted for the associations on 22q in Table 2 (rs5750250 P = 7.70 × 10⁻⁸, OR = 1.27; rs136161 P = 5.23 × 10⁻⁷, OR = 1.36). Association with G1/G2 within APOL1 likely exists due to inclusion of non-FIND AA cases with coincident DM and unrecognized non-diabetic kidney disease. [28] APOL1 was not associated with T2D-ESKD in a logistic regression analysis adjusting for age, gender and global ancestry restricted to FIND MALD and CHOICE (Choices for Healthy Outcomes In Caring for End-stage renal disease) study cases meeting the original FIND DKD case definition (rs73885319 P = 0.1098; rs71785313 P = 0.1182). [29] Regions beyond 22q provided suggestive evidence of association in the AA Discovery + FILR meta-analysis including rs1298908 on 10q22 (OR = 1.36, P = 8.83 × 10⁻⁷) between MAT1A and ANXA11, in a region dense with regulatory elements and transcription factors. There was also an association on 3p26 (rs304029, OR = 1.26, P = 1.10 × 10⁻⁶) within inositol 1,4,5-trisphosphate receptor, type 1 (ITPR1), a gene involved in cerebellar and autoimmune disorders but not renal involvement. [30] The genes in these other candidate regions (ANXA11, MAT1A and ITPR1) also show statistically significant differential expression in at least one population and compartment, as do IGSF22 near candidate rs11766496 on chromosome 11 and TNFRSF19 near rs95107795 on chromosome 13. Other top AA associated regions in Table 2 do not have clear connections to kidney disease.
Since APOL1 association likely reflected inclusion of non-FIND cases with non-diabetic nephropathy, a GWAS was re-computed within AAs in the discovery sample, which only included subjects lacking two APOL1 [...] Table. The correlation between the −log10(p-values) from the GWAS with and without AA subjects carrying two APOL1 risk variants is r = 0.82 (S4 Fig). The top association in this subset GWAS was rs2780902 on 1p31 (OR = 0.52, P = 2.98 × 10⁻⁷) within Janus kinase 1 (JAK1), a member of the protein-tyrosine kinases. [31] The ENCODE data show that this SNP resides within a region with numerous transcription factors and DNase I hypersensitivity sites. JAK1 is a widely expressed membrane-associated phosphoprotein and is involved in the interferon transduction pathway. This kinase links cytokine ligand binding to tyrosine phosphorylation of various known signaling proteins and the signal transducers and activators of transcription (STATs). Another interesting association among the top 10 associations is rs2596230 on 15q14 (OR = 1.56, P = 9.36 × 10⁻⁶) within ryanodine receptor 3 (RYR3). [32] The protein encoded by RYR3 functions to release calcium from intracellular storage in many cellular processes and the gene is expressed in the kidney. The closely related gene, RYR2, is associated with albuminuria. [33] Our prior analyses of transcript expression in DKD biopsies provide additional support for the associations. Both JAK1 and RYR3 (and RYR2) show differential expression that is restricted to the European subjects with Stage III and Stage IV CKD.
JAK1 expression is increased in DKD in both compartments, while RYR3 and RYR2 are depressed in the glomerulus. [34] We also recomputed the genome-wide discovery and trans-ethnic meta-analysis removing AA subjects with APOL1. The top 200 associations are summarized in S8 Table.

American Indian Associations
Several regions provided evidence of association with DKD in AIs (Table 3; S5B Table and S6B Table summarize the top 200 SNP associations in the discovery GWAS and replication study, respectively). [...] Thus, the high-risk allele at this locus does not appear to be Amerindian specific. The p-value for association is 0.0013 in European Americans and 1.3 × 10⁻⁶ in American Indians, suggesting that the signal does not come entirely from American Indian samples. Further fine-mapping or sequencing will be necessary to fully characterize the association signal within and across ethnic groups. Another association that approached genome-wide significance was rs13254600 (OR = 0.58, P = 5.54 × 10⁻⁸) on 8q24 within WD repeat domain 67 (WDR67). This gene is expressed in a wide variety of tissues, including kidney, and may affect cellular membrane functions by regulating Rab GTPase activity. [35] TBC1D31 (WDR67) mRNA is increased in both compartments of kidney tissue from AIs, but only in the glomerulus for European subjects with more advanced DKD. Another SNP of interest is rs10019835 (OR = 0.70, P = 5.47 × 10⁻⁷) on 4q32 within guanylate cyclase 1, soluble, alpha 3 (GUCY1A3); the protein encoded by GUCY1A3 serves as a receptor for nitric oxide, [36] which through its role in endothelial function may be a mediator of DKD. [37] GUCY1A3 is differentially expressed in both tissue compartments and both DKD biopsy cohorts, and shows one of the strongest differences of all genes in candidate regions (especially among the European subjects who have more advanced DKD) (S3A Table and S3B Table; S6 Fig). In addition, the candidate SNP rs10019835 has a tubulo-interstitial specific eQTL with the full-length isoform of GUCY1A3 (NM_000856, P = 4.97 × 10⁻⁴).
The shortest isoform of the gene (NM_001130687) has a glomerular eQTL with rs12504357 (P = 2.63 × 10⁻⁵), an intronic SNP that is 5 kb upstream of the associated variant. These two eQTL SNPs have D' = 1 in some populations, likely reflecting low allele frequencies in the reference populations. Integrin alpha 6 (ITGA6, rs13421350, 2q31, OR = 0.58, P = 5.54 × 10⁻⁸) is involved in cell adhesion and is expressed in the kidney. The gene shows negative differential expression in Europeans with DKD, and it has both glomerular and tubulo-interstitial eQTL. The glomerular eQTL is with the SNP rs6758468 (P = 5.41 × 10⁻⁴), which is 143 kb from the candidate, while the tubulo-interstitial eQTL is with rs12469788 (P = 3.26 × 10⁻⁴), which is 5 kb from the candidate with D' = 1 but negligible r². Finally, rs10952362 on 7q36 (OR = 1.91, P = 7.99 × 10⁻⁸), near XRCC2, a gene involved in DNA repair, was strongly associated with DKD. [38] We find that XRCC2 is repressed in the tubulo-interstitial kidney tissue from AIs.
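The combination of D' = 1 with negligible r² reported for some SNP pairs follows directly from the standard definitions of the two LD measures when one variant is rare. The sketch below works through those definitions; the haplotype and allele frequencies are hypothetical and are not taken from the study data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


# Hypothetical case: a rare allele A (2%) that always travels with a common allele B (50%).
d_prime, r_squared = ld_stats(p_ab=0.02, p_a=0.02, p_b=0.50)
print(round(d_prime, 2), round(r_squared, 3))   # 1.0 0.02
```

Here every copy of the rare allele sits on the same background, so D' reaches 1, yet the rare allele predicts very little of the common variant's variation, so r² stays near 0.02.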

European American Associations
EA subjects comprised the smallest group within FIND and power to detect variants associated with DKD was limited (S3 Fig). None of the associations in the EA Discovery + FILR meta-analysis had a p-value <10⁻⁵ (Table 3; S5C Table and S6C Table summarize the top 200 SNP associations in the discovery GWAS and replication study, respectively).

Mexican American Associations
Several suggestive associations were identified in the MA Discovery GWAS (Table 3; S5D Table summarizes the top 200 SNP associations in the GWAS). No replication cohort was available to be genotyped in FILR, so only the Discovery GWAS and trans-ethnic meta-analysis are reported (Tables 2 and 3). The strongest association was on 12q24 for rs7975752, located 242 kb downstream of the mediator complex subunit 13-like (MED13L) gene (OR = 1.76, P = 1.67 × 10⁻⁶). MED13L functions as a transcriptional coactivator for RNA polymerase II-transcribed genes. While its functional significance in DKD is unclear, gene variants 4 Mb downstream (rs614226) and upstream (rs653178) on 12q24 show genome-wide significant association with ESKD [9] and CKD [39] in Europeans. We see that MED13L is repressed in both compartments in kidney tissue from AIs but only in the glomerular transcriptome in the European subjects. Association was observed between DKD and rs731565 (P = 4.06 × 10⁻⁶) residing within an intronic region of the contactin-associated protein-like 2 (CNTNAP2) gene on 7q36. SNP rs7805747, approximately 4 Mb downstream from rs731565, has been associated with CKD in European populations. [39] Finally, rs4849965, 1.2 Mb upstream of the SRY-related HMG-box 11 (SOX11) gene on 2p25.2, trended toward association with DKD (OR = 1.50, 95% CI 1.26-1.79; P = 6.18 × 10⁻⁶) and has previously been associated with CKD in Europeans. [39] We find that absolute tubulo-interstitial expression of SOX11 in AIs is correlated with ACR (r = 0.66, q = 0.029).
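For the SOX11-region SNP, the reported odds ratio and confidence interval allow a rough consistency check of the p-value. The sketch below assumes a Wald test on the log odds ratio with a symmetric 95% interval, which is a common convention but is not stated in the text; the function name is illustrative.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value
    from an odds ratio and its 95% confidence interval (Wald assumption)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Values reported for rs4849965: OR 1.50, 95% CI 1.26-1.79.
se, z, p_value = wald_from_or_ci(1.50, 1.26, 1.79)
print(round(se, 3), round(z, 2), f"{p_value:.1e}")
```

Under these assumptions the recovered p-value is on the order of 6 × 10⁻⁶, in line with the reported P = 6.18 × 10⁻⁶; the small difference is expected given the rounding of the published OR and interval.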

Conclusions
The current FIND GWAS comprises the largest genetic analysis for severe DKD based upon risk for progression to ESKD in EA and high-risk non-European ethnic groups including AAs, AIs, and MAs. As in other GWAS, results support a role for multiple DKD susceptibility genes, each with weak effects. A number of the SNPs most strongly associated with DKD had additional support from compartment-specific gene expression measures and eQTL analysis obtained in European and American Indian populations. A novel chromosome 6q25.2 DKD locus was identified in AI samples; SNPs in this region had genome-wide significant association and consistent directions of effect in the meta-analysis across all ethnic groups. Independent support for this region comes from an association with serum creatinine/eGFR in a GWAS in East Asian populations (P = 2.6 × 10⁻⁵ at rs4870304). [40] Strengths of the FIND GWAS were the severe phenotype in cases, focus on DKD in T2D, and inclusion of non-European populations. The 6q25.2 locus requires fine mapping and additional replication in independent sample sets of diabetic subjects with and without DKD that have sufficient power to detect associated, common variants with moderate effect size. Once localized and replicated, functional studies in animal and cell culture models will be necessary to discover the biological mechanisms responsible for the association of DKD with the underlying genetic architecture.
As in other GWAS for complex disease, many previously identified DKD loci were not replicated in the FIND analyses. The inconsistency between our data and published DKD GWAS could reflect that FIND limited the DKD case group to subjects with ESKD and DKD with heavy proteinuria felt to be at high risk for progression to ESKD. FIND did not include microalbuminuric participants as "cases" in the Discovery cohort, choosing instead to focus on advanced nephropathy. However, some microalbuminuric participants with ACR <100 mg/g were included in the replication analysis. Prior GWAS focused on European and Asian DKD populations, often enriched for T1D-associated DKD. Genetic associations may not replicate across other populations; for example, association of APOL1 variants with non-diabetic kidney disease is limited to populations with recent African ancestry. Another possible interpretation is that the variants regulating DKD pathogenesis are distinct for T1D and T2D, although a meta-analysis including both T1D and T2D subjects may identify shared loci. Finally, the DKD phenotype in the FIND GWAS relied on standard, stringent clinical criteria for advanced DKD. This approach limited phenotypic heterogeneity but potentially minimized the utility of cross-study comparisons. Although heavy proteinuria is a hallmark of DKD, recent analyses suggest approximately one third of patients with diabetes and an eGFR <60 ml/min per 1.73 m² had normal urinary protein excretion. [4] This would justify the focus of FIND on advanced DKD. Although not the only DKD phenotype with a genetic component, several investigators recently proposed using ESKD as the optimal DKD phenotype in genetic association studies. [41,42] The availability of bio-samples from patients with advanced DKD is limited.
Therefore, entry criteria in the present replication cohorts were loosened to increase sample size; this likely included a small number of participants with non-diabetic CKD (or DKD less likely to progress to ESKD). The AA non-FIND cases used in our replication cohort appear to have included individuals with DM and coincident focal segmental glomerulosclerosis (FSGS), an effect addressed via partitioning based on APOL1 G1 and G2. [28] As in all GWAS, some non-nephropathy controls may develop DKD. This effect would bias results toward the null, making it less likely to detect significant association.
FIND was well-powered to detect common risk variants with moderate effect sizes shared across ethnic groups. It was also well powered to use differences in effect sizes to help localize the region of association via trans-ethnic mapping. However, it was not powered to detect modest ethnic-specific effects that are not shared with another ethnicity or gene-gene interactions. Thus, these ethnic-specific scans provide important hypothesis-generating results for subsequent meta-analyses and pathway enrichment analyses.

Ethics Statement
The FIND was completed in accordance with the principles of the Declaration of Helsinki. Written informed consent was obtained from all participants. The Institutional Review Board at each participating center (Case Western Reserve University, Cleveland, OH; Harbor-University of California Los Angeles Medical Center; Johns Hopkins University, Baltimore; National Institute of Diabetes and Digestive and Kidney Diseases, Phoenix, AZ; University of California, Los Angeles, CA; University of New Mexico, Albuquerque, NM; University of Texas Health Science Center at San Antonio, San Antonio, TX; Wake Forest School of Medicine, Winston-Salem, NC) approved all procedures. A certificate of confidentiality was filed at the National Institutes of Health.

Samples
Discovery cohorts. FIND is a multi-ancestry family study of severe DKD. [11] Index cases had advanced DKD, likely to progress to ESKD based on clinical criteria, and at least one informative sibling with either DKD or long-standing DM without nephropathy. Detailed phenotype criteria for enrollment have been reported. [26,43,44] Index cases of AA, EA, MA and AI ethnicity were included in the Discovery GWAS; all had DM duration >5 years and/or DR, with UACR >1 g/g or ESKD. Unrelated controls had DM duration ≥9 years, UACR <30 mg/g (equating to overnight albumin excretion <20 mcg/min), and serum creatinine <1.6 mg/dl ([122 μmol/L] men) or <1.4 mg/dl ([107 μmol/L] women). In AA, EA and MA, only unrelated cases and controls were included; since AI participants were largely recruited from relatively small communities, all available cases and controls meeting criteria were included, regardless of relationships.
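The Discovery case and control definitions can be written as a simple rule set. The following is a minimal sketch, not the study's enrollment software: the `Participant` fields and the function name are hypothetical, "DM duration >5 years and/or DR" is read as a logical OR, and the comparison for the 9-year control threshold is assumed to be "at least 9 years" because the operator is not legible in the text.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    dm_years: float                 # duration of diabetes in years
    retinopathy: bool               # diabetic retinopathy (DR) present
    uacr_mg_per_g: float            # urine albumin:creatinine ratio
    eskd: bool                      # end-stage kidney disease
    serum_creatinine_mg_dl: float
    male: bool


def discovery_status(p: Participant) -> str:
    """Classify a participant as 'case', 'control' or 'neither'."""
    # Case: (DM > 5 years or DR) together with (UACR > 1 g/g or ESKD).
    if (p.dm_years > 5 or p.retinopathy) and (p.uacr_mg_per_g > 1000 or p.eskd):
        return "case"
    # Control: long-standing DM (assumed >= 9 years), normal albuminuria,
    # and serum creatinine below the sex-specific limit.
    creatinine_limit = 1.6 if p.male else 1.4
    if (p.dm_years >= 9 and p.uacr_mg_per_g < 30
            and p.serum_creatinine_mg_dl < creatinine_limit and not p.eskd):
        return "control"
    return "neither"


print(discovery_status(Participant(12, True, 1500, False, 2.1, True)))    # case
print(discovery_status(Participant(15, False, 12, False, 0.9, False)))    # control
```

Participants who meet neither rule, such as those with short diabetes duration or intermediate albuminuria, fall outside both groups, which reflects the focus on advanced nephropathy versus clearly unaffected diabetic controls.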
Additional non-FIND-study DKD cases and controls (with and without DM) were genotyped to increase power (S1 Table). Non-FIND samples included unrelated DKD cases and controls of self-reported African American ethnicity recruited at Wake Forest, [45] Case Western Reserve [46] and Howard Universities; [47] unrelated cases and controls of EA ethnicity recruited at Wake Forest [48] and Case Western Reserve; [49,50] cases and controls of AI ethnicity recruited at NIDDK-Phoenix; [51] and cases and controls of MA ethnicity recruited in San Antonio and Los Angeles. [52]
Replication cohorts. The FIND Large Replication (FILR) Study comprised samples independent from the Discovery cohorts. AA and EA replication cohorts were unrelated individuals recruited at Wake Forest, Johns Hopkins, Case Western Reserve and Harbor UCLA Universities and out-of-study control data from the Genetic Association Information Network (GAIN) consortium; [53] AI replication cohorts consisted of pedigree data from NIDDK-Phoenix [51] and from the Dakota and Oklahoma centers of the Strong Heart Family Study. [54] Replication cases had DM duration >5 years and/or DR, UACR ≥0.3 g/g (equating to overnight albumin excretion >200 mcg/min) and/or proteinuria >500 mg/day or ESKD. DM controls had an eGFR >60 ml/min/1.73 m², UACR <30 mg/g after 10 years of DM duration or UACR <100 mg/g after 15 years of DM duration. GAIN study subjects with and without DM were used as controls; no kidney function data were available for these individuals. GAIN samples were excluded for specific SNPs if MAFs were inconsistent with those in FIND controls. Additional MA subjects were not available for inclusion in FILR.
Samples analyzed. Based on ancestry, the FIND discovery GWAS samples included: (i) AA: 1,564 DKD cases (633 in FIND, 931 out of study), 369 controls with DM lacking nephropathy (277 in FIND, 92 out of study) and 1,288 non-diabetic non-nephropathy controls (all out of study); (ii) AI: 538 DKD cases, 319 controls with DM lacking nephropathy; (iii) EA: 342 DKD cases, 404 controls with DM lacking nephropathy; and (iv) MA: 779 DKD cases and 594 controls with DM lacking nephropathy. The FILR replication study included: (i) AA: 950 DKD cases, 50 controls with DM lacking nephropathy and 1,887 non-diabetic non-nephropathy controls; (ii) AI: 471 DKD cases, 340 controls with DM lacking nephropathy and 486 non-diabetic non-nephropathy controls; and (iii) EA: 582 DKD cases, 205 controls with DM lacking nephropathy and 2,568 non-diabetic non-nephropathy controls. FILR samples were genotyped at loci including the top associated SNPs from the Discovery GWAS, eQTL associations, literature-based candidate gene polymorphisms and ancestry informative markers (AIMs). S1 Table delineates the sample sources in the Discovery GWAS and FILR, stratified by ancestry.
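The per-ancestry counts above can be tallied to confirm that they add up to the overall sample sizes quoted in the Abstract (6,197 for discovery and 7,539 for replication). The dictionary layout and key names in this sketch are illustrative; the numbers are those listed in the preceding paragraph.

```python
discovery = {
    "AA": {"cases": 1564, "dm_controls": 369, "non_dm_controls": 1288},
    "AI": {"cases": 538, "dm_controls": 319},
    "EA": {"cases": 342, "dm_controls": 404},
    "MA": {"cases": 779, "dm_controls": 594},
}
replication = {
    "AA": {"cases": 950, "dm_controls": 50, "non_dm_controls": 1887},
    "AI": {"cases": 471, "dm_controls": 340, "non_dm_controls": 486},
    "EA": {"cases": 582, "dm_controls": 205, "non_dm_controls": 2568},
}


def total_samples(cohort):
    """Sum every case and control count across ancestry groups."""
    return sum(sum(groups.values()) for groups in cohort.values())


def total_cases(cohort):
    """Sum only the DKD cases across ancestry groups."""
    return sum(groups["cases"] for groups in cohort.values())


print(total_samples(discovery), total_cases(discovery))       # 6197 3223
print(total_samples(replication), total_cases(replication))   # 7539 2003
```

The totals show that 6,197 and 7,539 count cases and controls together, with 3,223 and 2,003 DKD cases in the discovery and replication sets, respectively.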

Genotyping and Statistical Methods
See Supplementary Methods (S1 Text).

SNP Selection for the Discovery and Replication Study
The DNA samples that comprise the Discovery cohorts, plus an additional 244 blind duplicates, were genotyped on the Affymetrix Genome-Wide Human 6.0 SNP array (see S1 Text Supplemental Methods for details). The FILR replication samples were genotyped for 3,937 SNPs selected based on the strength of the statistical association from the Discovery GWAS. Additional SNPs were included based on the FIND eQTL association and candidate gene SNPs previously reported to be associated with DKD (see S1 Text Supplemental Methods for details). Specifically, within each ancestry group, the SNPs with the strongest statistical evidence of association were identified; a few additional SNPs from each region with supportive but weaker evidence of association were also identified (i.e., associations due to LD but r² <0.95 with the primary associated SNP). This redundancy was designed to limit the number of regions not represented in the replication study due to genotyping failure. In total, 3,019 SNPs (821 AA, 790 AI, 608 EA, and 800 MA) were genotyped for FILR based solely on statistical association with DKD within an ethnicity. The trans-ethnic meta-analysis of the discovery cohort identified another 436 SNPs nominally associated with DKD (p <0.0003). In addition, 482 SNPs (121 AA, 133 AI, 122 EA, 14 MA, meta-analysis 92) were chosen with the smallest L2-norm (i.e., Euclidean distance) of the −log10(p-values) from GWAS and eQTL association analyses, provided that p <0.01 from GWAS. Here, the L2-norm was defined relative to the maximum of the −log10(p-values) from the GWAS and eQTL and provides an ordering of the combined evidence for eQTL and association with DKD. SNP associations in FILR were considered "replicated" if both the association reached statistical significance and the direction of the association was consistent with the Discovery analysis. Finally, 278 AIMs were genotyped to allow for adjustment of potential population substructure.
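One way to read the L2-norm ordering of combined GWAS and eQTL evidence is as a distance from the best observed point in the plane of the two −log10(p-values). The sketch below implements that reading under explicit assumptions: the reference point is taken as the pair of maxima among SNPs passing the GWAS filter, the function name is hypothetical, and the SNP identifiers and p-values are invented for illustration. The exact definition used in the study is given only in S1 Text.

```python
import math


def rank_by_combined_evidence(snps, gwas_p_max=0.01):
    """Order SNPs by Euclidean distance to the best (GWAS, eQTL) evidence.

    snps: iterable of (snp_id, gwas_p, eqtl_p).
    SNPs with gwas_p >= gwas_p_max are dropped before ranking.
    """
    kept = [(snp_id, -math.log10(gwas_p), -math.log10(eqtl_p))
            for snp_id, gwas_p, eqtl_p in snps if gwas_p < gwas_p_max]
    if not kept:
        return []
    best_gwas = max(g for _, g, _ in kept)
    best_eqtl = max(e for _, _, e in kept)
    scored = [(math.hypot(best_gwas - g, best_eqtl - e), snp_id)
              for snp_id, g, e in kept]
    return [snp_id for _, snp_id in sorted(scored)]   # smallest distance first


# Invented example values; snp_d fails the GWAS filter despite a strong eQTL.
candidates = [
    ("snp_a", 1e-6, 1e-2),
    ("snp_b", 1e-4, 1e-4),
    ("snp_c", 1e-3, 1e-6),
    ("snp_d", 0.2, 1e-8),
]
print(rank_by_combined_evidence(candidates))   # ['snp_b', 'snp_c', 'snp_a']
```

In this toy example the SNP with balanced evidence from both analyses ranks ahead of SNPs that are strong in only one, which is the intended effect of combining the two p-values into a single ordering.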
Thus, FILR was designed as a replication study and not a large-scale trans-ethnic fine-mapping study. Subsequent studies will complete fine-mapping to localize associations.
Table. A. Kidney tissue-compartment eQTL in AI biopsy participants corresponding to genomic regions determined by trans-ethnic meta-analysis GWAS results (Table 2). B. Differential expression of genes occurring in genomic regions determined by trans-ethnic meta-analysis GWAS candidates (Table 2).
Table. A. Kidney tissue-compartment eQTL in AI biopsy participants corresponding to genomic regions determined by candidates from ethnicity specific GWAS (Table 3). B. Differential expression of genes occurring in genomic regions determined by candidates from ethnicity specific GWAS (Table 3).
Table. Counts of differentially expressed genes at q 0.05 for ERCB and AI biopsy cohorts against the Living Donor cohort.
The full-length isoform NM_000856 has a tissue-specific tubulo-interstitial eQTL with AI GWAS candidate rs10019835 (P = 4.97 × 10⁻⁴, glomerulus not significant at p > 0.00024), and short isoform NM_001130687 has a glomerular eQTL with intronic SNP rs12504357 (P = 2.63 × 10⁻⁵, tubulo-interstitium not significant at p > 0.05). Both isoforms satisfy the test for expression in both tissues.