Impact of common genetic determinants of Hemoglobin A1c on type 2 diabetes risk and diagnosis in ancestrally diverse populations: A transethnic genome-wide meta-analysis

Abstract

Background
Glycated hemoglobin (HbA1c) is used to diagnose type 2 diabetes (T2D) and assess glycemic control in patients with diabetes. Previous genome-wide association studies (GWAS) have identified 18 HbA1c-associated genetic variants. These variants proved to be classifiable by their likely biological action as erythrocytic (also associated with erythrocyte traits) or glycemic (associated with other glucose-related traits). In this study, we tested the hypotheses that, in a very large-scale GWAS, we would identify more genetic variants associated with HbA1c and that HbA1c variants implicated in erythrocytic biology would affect the diagnostic accuracy of HbA1c. We therefore expanded the number of HbA1c-associated loci and tested the effect of genetic risk scores composed of erythrocytic or glycemic variants on incident diabetes prediction and on prevalent diabetes screening performance. Throughout this multiancestry study, we kept a focus on interancestry differences in HbA1c genetics performance that might influence race-ancestry differences in health outcomes.

Methods & findings
Using genome-wide association meta-analyses in up to 159,940 individuals from 82 cohorts of European, African, East Asian, and South Asian ancestry, we identified 60 common genetic variants associated with HbA1c. We classified variants as implicated in glycemic, erythrocytic, or unclassified biology and tested whether additive genetic scores of erythrocytic variants (GS-E) or glycemic variants (GS-G) were associated with higher T2D incidence in multiethnic longitudinal cohorts (N = 33,241). Nineteen glycemic and 22 erythrocytic variants were associated with HbA1c at genome-wide significance. GS-G was associated with higher T2D risk (incidence OR = 1.05, 95% CI 1.04-1.06, per HbA1c-raising allele, p = 3 × 10⁻²⁹), whereas GS-E was not (OR = 1.00, 95% CI 0.99-1.01, p = 0.60). In Europeans and Asians, erythrocytic variants in aggregate had only modest effects on the diagnostic accuracy of HbA1c. Yet, in African Americans, the X-linked G6PD G202A variant (T-allele frequency 11%) was associated with an absolute decrease in HbA1c of 0.81%-units (95% CI 0.66-0.96) per allele in hemizygous men, and 0.68%-units (95% CI 0.38-0.97) in homozygous women. The G6PD variant may cause approximately 2% (N = 0.65 million, 95% CI 0.55-0.74) of African American adults with T2D to remain undiagnosed when screened with HbA1c. Limitations include the smaller sample sizes for non-European ancestries and the inability to classify approximately one-third of the variants. Further studies in large multiethnic cohorts with HbA1c, glycemic, and erythrocytic traits are required to better determine the biological action of the unclassified variants.

Conclusions
As G6PD deficiency can be clinically silent until illness strikes, we recommend investigation of the possible benefits of screening for the G6PD genotype along with using HbA1c to diagnose T2D in populations of African ancestry or groups where G6PD deficiency is common. Screening with direct glucose measurements, or genetically-informed HbA1c diagnostic thresholds in people with G6PD deficiency, may be required to avoid missed or delayed diagnoses.

Author summary
Why was this study done?
• Blood glucose binds in an irreversible manner to circulating hemoglobin in red blood cells (RBCs), generating "glycated hemoglobin," called HbA1c. HbA1c is used to diagnose and monitor diabetes.
• Previous large-scale human genetic studies have demonstrated that HbA1c is influenced by genetic variants. Some variants are thought to influence the function, structure, and lifespan of the red blood cell itself ("erythrocytic variants"), while others are thought to influence blood glucose control ("glycemic variants"). This study aimed to identify additional variants influencing HbA1c levels and to investigate the extent to which variants affecting this measurement independently of blood glucose concentration may lead to misdiagnosis, mistreatment, and human health disparities.
What did the researchers do and find?
• We studied genetic variants and their association with HbA1c levels in almost 160,000 people of European, African, East Asian, and South Asian ancestry from 82 separate studies worldwide. We found 60 genetic variants influencing HbA1c, of which 42 variants were new. Of the 60 variants, we found 19 glycemic variants and 22 erythrocytic variants.
• In approximately 33,000 people from 5 ancestry groups followed carefully over time, we found that the more glycemic variants a person had, the higher their risk of developing diabetes (OR = 1.05 per HbA1c-raising allele, p = 3 × 10⁻²⁹). However, more erythrocytic variants did not lead to a higher risk of diabetes, meaning erythrocytic variants that lower HbA1c levels independently of glucose concentration could lead to missed diagnosis of diabetes.
• Next, we found that in everyone but those of African ancestry, those with more versus those with fewer of the 60 HbA1c genetic variants had a fairly small difference in HbA1c (about 0.2 units), while those of African ancestry had a larger difference (about 0.8 units, a fairly large number for this medical test).
• This difference in African ancestry was explained by one erythrocytic variant on the X chromosome. This variant mutates the protein made by the gene "glucose-6-phosphate dehydrogenase" (G6PD), which can shorten RBC lifespan, and thus lower HbA1c levels, no matter the blood glucose level.
• About 11% of people of African American ancestry carry at least one copy of this G6PD variant, while almost no one of any other ancestry does. We estimated that if we tested all Americans for diabetes using HbA1c, about 650,000 African Americans would be missed because of these genetically lowered HbA1c levels.
What do these findings mean?
• We may want to investigate the benefits of screening for the G6PD genotype in specific communities or perform additional diagnostic tests to avoid health disparities between communities.
• It will also be important to follow up with additional studies to check whether new standardized thresholds for diagnoses should be recommended for those that have this G6PD variant.

Introduction
Type 2 diabetes (T2D) is a health scourge rising unabated worldwide, escaping all past and current control measures, in part because only half of prevalent T2D worldwide has been clinically diagnosed [1]. Glycated hemoglobin (HbA1c) is an accepted diagnostic test for T2D and a principal clinical measure of glycemic control in individuals with diabetes. T2D arises from the environment interacting with genetics. Studies investigating genetic contributions to HbA1c in individuals of European [2][3][4] and Asian ancestry [5][6][7] have identified 18 loci influencing HbA1c through glycemic and nonglycemic pathways, the latter primarily reflecting erythrocytic biology. Alterations in HbA1c that are due to genetic variation acting through nonglycemic pathways may not accurately reflect ambient glycemia or T2D risk and could affect the validity of HbA1c as a diagnostic test and measure of glycemic control in some individuals or populations. Some genetic variants (e.g., the sickle cell variant HbS) that vary in frequency across ancestries can interfere with the accuracy of certain assays [8]. Further, certain hematologic conditions associated with shortened erythrocyte lifespan (e.g., hemolytic anemias) lower HbA1c values irrespective of the assay performed. HbA1c values in such patients may no longer accurately reflect ambient glycemia [9]. Epidemiologic studies have reported ethnic differences in HbA1c, with African Americans having, on average, higher HbA1c than European ancestry Americans [10]. While these differences are largely due to demographic and metabolic factors [11,12], genetic factors associated with hematologic conditions that impact erythrocyte turnover may confound the relationship between HbA1c and glycemia, causing misclassification of T2D diagnosis [8,13].
This study had 3 aims. The first was to expand genetic discovery efforts to larger sample sizes, including populations of ancestries not previously studied, to uncover novel loci that influence HbA1c and might capture a greater fraction of the variability in HbA1c. Second, as done in previous studies, we aimed to classify the variants as acting through glycemic or erythrocytic biology. Third, as erythrocytic variants may influence HbA1c through effects on the red blood cell (RBC), we wished to explore whether this might lead to HbA1c values that no longer reflect ambient glycemia. To do this, we specifically tested the hypothesis that HbA1c-associated genetic variants, particularly those that act through erythrocytic pathways, influence the performance of HbA1c for diabetes risk prediction and diabetes diagnosis (S1 Fig).

Methods
Analysis plans were followed and can be found in S1 Analysis Plans.

Genetic discovery study participants
In the genetic discovery analysis, we combined data from up to 159,940 participants (maximum number available for any variant) of European, African American, East Asian, and South Asian ancestry, including subsets from previous publications [4,5] (S1 Table, S2 Fig). All participants were free of diabetes defined by physician diagnosis, medication use, or fasting glucose (FG) ≥ 7 mmol/L. A small number of cohorts also removed individuals with 2hr glucose (2hrGlu) ≥ 11.1 mmol/L, or HbA1c ≥ 6.5%, where FG was not available (details of exclusions by individual cohorts, S1 Table). Analysis followed the details in S1 Analysis Plans (Hemoglobin A1c Genetic Discovery Analysis Plan).

Genotyping and quality control
Each cohort was genotyped on commercially available genome-wide arrays (for instance, the Affymetrix Genome-Wide Human SNP Array 6.0 or the Illumina Human610-Quad BeadChip) or the Illumina CardioMetabochip (Metabochip) [15]. Variant and sample quality control (QC) was conducted within each cohort following a shared analysis plan (S1 Analysis Plans). Cohorts were advised to keep SNPs with a Hardy-Weinberg equilibrium test p-value ≥ 1 × 10⁻⁶, SNP genotyping call rate ≥ 95%, and minor allele frequency (MAF) ≥ 1% (full details of SNP and sample QC can be found in S1 Table). Following QC, studies with genome-wide array data were imputed (primarily using the Phase 2 of the International HapMap Project reference panel [16], see S1 Table, row 40), and poorly imputed variants (variants that could not reliably be inferred from surrounding variants) were excluded based on standard imputation quality thresholds (R-sq < 0.3, INFO < 0.4). Approximately 2.5 million SNPs were available for analysis after imputation and QC (S1 Table). QC of the Metabochip data is described elsewhere, but included filtering out poorly genotyped individuals or low-quality SNPs [17]. Variant association testing in men and women combined was conducted under an additive model adjusting for study-specific covariates and was limited to variants with MAF of at least 1% in each cohort. Details of the study cohorts, genotyping platforms and QC criteria, imputation reference panel, covariates in the analysis, and software used are provided for each study in S1 Table. Our study followed STREGA guidelines (S1 Checklist).

Genetic discovery using ancestry-specific and trans-ancestry meta-analyses
Association data were combined within each ancestry group using a fixed-effects meta-analysis in METAL, which assumes the SNP effect is the same for each study within an ancestry [18]. Results for each cohort were corrected for any systematic biases, such as residual population structure, using the genomic control inflation factor, λGC [17,19]. We excluded variants from further follow-up if they had an ancestry-specific sample size N < 20,000 in Europeans, N < 3,000 in African Americans, N < 7,000 in East Asians, and N < 3,000 in South Asians (minimum number of samples, where the threshold was chosen to minimize signals driven by a single cohort), or evidence of significant within-ancestry heterogeneity, suggesting that effect size significantly differs between cohorts of the same ancestry (Cochran's Q-test heterogeneity p-value < 0.0001). We retained the lead variant in the X-chromosome analysis of the African American ancestry data (rs1050828, G202A in G6PD) despite significant heterogeneity, as it was a strong biological candidate.
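
The within-ancestry pooling described above can be illustrated with a minimal sketch. It assumes inverse-variance weighting (one of the weighting schemes METAL supports) and applies genomic control by inflating each cohort's standard error by the square root of λGC. The function names and the per-cohort numbers are hypothetical and are not taken from the study.

```python
import math

def genomic_control(se, lambda_gc):
    """Inflate a standard error by sqrt(lambda_GC); leave it unchanged if lambda_GC <= 1."""
    return se * math.sqrt(max(lambda_gc, 1.0))

def fixed_effects_meta(betas, ses):
    """Return the inverse-variance weighted pooled effect, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    cochran_q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return pooled_beta, pooled_se, cochran_q

# Hypothetical per-cohort results for one variant (effect on HbA1c in %-units per allele).
cohort_betas = [0.030, 0.024, 0.036]
cohort_ses = [0.006, 0.008, 0.010]
cohort_lambdas = [1.02, 1.00, 1.05]

corrected_ses = [genomic_control(se, lam) for se, lam in zip(cohort_ses, cohort_lambdas)]
pooled_beta, pooled_se, cochran_q = fixed_effects_meta(cohort_betas, corrected_ses)
print(pooled_beta, pooled_se, cochran_q)
```

Cochran's Q from this calculation is the statistic behind the heterogeneity filter; converting it to a p-value requires a chi-squared distribution with (number of cohorts − 1) degrees of freedom, which is omitted here.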

Identification of primary and secondary distinct HbA1c-associated signals
Variants were considered to be significantly associated with HbA1c when they met standard genome-wide significance thresholds (based on p = 0.05 divided by the estimated number of independent tests across the genome) of p < 5 × 10⁻⁸ in the European and Asian, or p < 2.5 × 10⁻⁸ in African American [21] ancestry-specific meta-analyses, or a log10 Bayes factor ≥ 6 in the transancestry meta-analysis. All significant variants within 500 kb of a lead (most significantly associated) variant were grouped into a single locus. Novel loci were by definition >500 kb from previously reported HbA1c-associated variants. We ran approximate conditional analyses using the Genome-wide Complex Trait Analysis (GCTA) software [22,23] (following analysis plans detailed in S1 Analysis Plans, Conditional analyses in GCTA) using the Women's Genome Health Study (WGHS, Europeans), Jackson Heart Study (JHS, African Americans), Singapore Malay Eye Study (SiMES, East Asians), and the London Life Sciences Prospective Population Study (LOLIPOP, South Asians) as reference populations for linkage disequilibrium (LD) estimates, to confirm the lead variants on the autosomes (within 1 Mb) were distinct, and similarly used exact conditional regression for the African American signals on the X-chromosome in JHS.
To identify distinct signals at associated loci (that is, secondary signals), we performed approximate conditional analyses using GCTA, conditioning on lead variants identified in the transancestry MANTRA analysis. Where the lead variant was absent in a cohort, an exact proxy (r² = 1) was used, unless the variant was very low frequency or monomorphic.
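
The locus definition above (all significant variants within 500 kb of a lead variant form one locus) can be sketched as a greedy grouping. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the example variants are hypothetical, and the conditional analyses run in GCTA are not reproduced here.

```python
WINDOW_BP = 500_000  # 500 kb either side of the lead variant

def group_into_loci(variants, p_threshold=5e-8, window=WINDOW_BP):
    """Greedily group significant variants around the most significant unassigned variant.

    variants: list of dicts with keys 'id', 'chrom', 'pos', 'p'.
    """
    significant = sorted(
        (v for v in variants if v["p"] < p_threshold), key=lambda v: v["p"]
    )
    assigned = set()
    loci = []
    for lead in significant:
        if lead["id"] in assigned:
            continue  # already absorbed into a more significant locus
        members = [
            v for v in significant
            if v["id"] not in assigned
            and v["chrom"] == lead["chrom"]
            and abs(v["pos"] - lead["pos"]) <= window
        ]
        assigned.update(v["id"] for v in members)
        loci.append({"lead": lead["id"], "members": [v["id"] for v in members]})
    return loci

# Hypothetical association results.
example_variants = [
    {"id": "varA", "chrom": "1", "pos": 1_000_000, "p": 1e-12},
    {"id": "varB", "chrom": "1", "pos": 1_300_000, "p": 1e-9},
    {"id": "varC", "chrom": "1", "pos": 2_000_000, "p": 1e-10},
    {"id": "varD", "chrom": "2", "pos": 1_000_000, "p": 1e-3},
]
example_loci = group_into_loci(example_variants)
print(example_loci)
```

Passing `p_threshold=2.5e-8` would apply the stricter threshold used for the African American meta-analysis.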

Classification of variants as glycemic or erythrocytic
We extracted summary association statistics from publicly available meta-analysis results for glycemic [17,24,25,26] and blood-cell [27] traits and asked a subset of the genome-wide discovery cohorts to repeat association analyses for each lead variant, conditioning on any one of FG, 2hrGlu, hemoglobin level (Hb), mean corpuscular volume (MCV), or mean corpuscular hemoglobin (MCH), where available (S3 Fig, S2 Table and S3 Table).
Variants were classified as "glycemic" if they were associated (p < 0.0001) with any of the glycemic traits from publicly available results or had ≥25% attenuation of variant HbA1c effect size in association models conditioned on fasting or 2hrGlu. That is, evidence of being associated with any of the glycemic traits, or a reduction in the effect of the variant on HbA1c after repeated association analysis in a model additionally adjusting for fasting/2hrGlu, suggested the observed association with HbA1c was being driven through an association with fasting/2hrGlu. Variants not classified as glycemic were classified as "erythrocytic" if they were associated (p < 0.0001) with Hb, MCH, MCV, PCV, RBC, or MCHC in the publicly available results or, as above, had ≥25% attenuation of effect size in Hb-, MCV-, or MCH-conditioned models (suggesting the observed association with HbA1c was being driven through an association with these blood cell traits). The 25% attenuation threshold was chosen as the optimal balance between specificity and sensitivity based on comparisons with the classification based only on association with any of the glycemic/erythrocytic traits. Two SNPs were classified based on evidence from the literature: rs12132919 (TMEM79) was classified as erythrocytic based on association with MCHC in Japanese individuals [28], and rs7616006 (SYN2) was classified as erythrocytic based on association with platelet count in Europeans [29].
Variants associated with HbA1c but not glycemic or erythrocytic traits remained "unclassified" (S3 Fig). A single variant (rs579459 near ABO) was classified as both glycemic and erythrocytic, but as we were primarily concerned about variants that might affect HbA1c without reflecting ambient glycemia and this variant also affected glycemia, we treated it as glycemic in all analyses.
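
The classification rule described in this section can be summarized as a small decision function. The sketch below assumes that attenuation is computed as one minus the ratio of the conditioned to the unconditioned effect size, which is our reading of "≥25% attenuation" rather than a formula stated in the text. Glycemic evidence is checked first, so a variant with both kinds of evidence is treated as glycemic, as was done for rs579459. All names and input values are hypothetical.

```python
P_LOOKUP = 1e-4            # lookup threshold in published trait results
ATTENUATION_CUTOFF = 0.25  # 25% attenuation in conditioned models

def attenuation(beta_unconditioned, beta_conditioned):
    """Fractional reduction in effect size after conditioning (assumed definition)."""
    return 1.0 - beta_conditioned / beta_unconditioned

def has_support(beta, lookup_pvalues, conditioned_betas):
    """True if any lookup p-value or any conditioned model meets its criterion."""
    by_lookup = any(p < P_LOOKUP for p in lookup_pvalues)
    by_conditioning = any(
        attenuation(beta, b) >= ATTENUATION_CUTOFF for b in conditioned_betas
    )
    return by_lookup or by_conditioning

def classify_variant(beta, glycemic_ps, glycemic_cond, erythrocytic_ps, erythrocytic_cond):
    """Classify a variant; glycemic evidence takes precedence over erythrocytic."""
    if has_support(beta, glycemic_ps, glycemic_cond):
        return "glycemic"
    if has_support(beta, erythrocytic_ps, erythrocytic_cond):
        return "erythrocytic"
    return "unclassified"

# Hypothetical variant: no glycemic evidence, strong lookup association with MCV.
label = classify_variant(
    beta=0.04,
    glycemic_ps=[0.5], glycemic_cond=[0.039],
    erythrocytic_ps=[1e-6], erythrocytic_cond=[0.04],
)
print(label)
```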

Effect of HbA1c genetic scores on reclassification of prevalent undiagnosed T2D for population screening using HbA1c
Analyses on the reclassification of prevalent T2D around the HbA1c 6.5% threshold before and after accounting for the contribution of erythrocytic variants were conducted in up to 19,380 individuals and incident T2D prediction analyses in up to 33,241 individuals from European, African, and East Asian ancestry cohorts (derived in part from discovery cohorts; in S4 Table, and following the details in the S1 Analysis Plans, Net-reclassification analysis). We acknowledge that nonindependent GWAS discovery and application cohorts can lead to inflated effect estimates [30]; however, this was not evident in our study, and effect estimates across all cohorts were similar with low heterogeneity.
We estimated reclassification of prevalent T2D status by HbA1c after accounting for the contribution of erythrocytic loci in 5 population-based cohorts with 3 ancestries partially overlapping with the discovery GWAS: the Framingham Heart Study (FHS), the Atherosclerosis Risk in Communities Study (ARIC), and the Multiethnic Study of Atherosclerosis (MESA) in individuals of European ancestry; ARIC and MESA in African Americans; and MESA, the Taiwan-Metabochip Study for Cardiovascular Disease (TAICHI), and the Singapore Prospective Study (SP2) in East Asians (N = 19,380). Variant-adjusted HbA1c was calculated as:

$$Y_i^{\text{adj}} = Y_i - \sum_{k} \beta_k \left( g_{ki} - E(g_{ki}) \right)$$

where $Y_i$ is the measured HbA1c for individual $i$, $\beta_k$ is the ancestry-specific, meta-analytic β coefficient for the $k$th erythrocytic SNP, $g_{ki}$ is the dosage (estimated number of HbA1c-raising alleles), and $E(g_{ki})$ is two times the HbA1c-raising allele frequency. When the less frequent (minor) allele was associated with higher HbA1c, it was coded as the HbA1c-raising allele; when it was associated with lower HbA1c, the more frequent (major) allele was coded as the HbA1c-raising allele. As some HbA1c-raising alleles in one ancestry could be HbA1c-lowering in a different ancestry, we coded HbA1c-raising alleles by ancestry. Participants on antidiabetic therapy were excluded, and screen-detected T2D was defined as FG ≥ 7 mmol/L. For the reclassification analysis, we constructed 2-by-2 tables showing the proportion of participants reclassified around the HbA1c 6.5% diagnostic threshold, with and without adjusting measured HbA1c for the contribution of erythrocytic loci.
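
As a minimal sketch of the adjustment and reclassification step, the code below assumes that the variant-adjusted value is the measured HbA1c minus, for each erythrocytic SNP, the β coefficient times the deviation of the individual's dosage from its expected value (two times the HbA1c-raising allele frequency). The function names and the example individual are hypothetical.

```python
HBA1C_THRESHOLD = 6.5  # diagnostic threshold in %-units

def variant_adjusted_hba1c(measured, dosages, betas, raising_allele_freqs):
    """Remove the erythrocytic genetic contribution, centered on its expectation.

    dosages: HbA1c-raising allele counts (0..2) per erythrocytic SNP.
    betas: ancestry-specific effect per HbA1c-raising allele, in %-units.
    raising_allele_freqs: frequency of each HbA1c-raising allele.
    """
    genetic_shift = sum(
        beta * (dosage - 2.0 * freq)
        for beta, dosage, freq in zip(betas, dosages, raising_allele_freqs)
    )
    return measured - genetic_shift

def is_reclassified(measured, adjusted, threshold=HBA1C_THRESHOLD):
    """True if adjustment moves the value across the diagnostic threshold."""
    return (measured >= threshold) != (adjusted >= threshold)

# Hypothetical individual carrying no HbA1c-raising alleles at two common SNPs.
measured_hba1c = 6.4
adjusted_hba1c = variant_adjusted_hba1c(
    measured_hba1c,
    dosages=[0, 0],
    betas=[0.08, 0.06],
    raising_allele_freqs=[0.5, 0.5],
)
print(adjusted_hba1c, is_reclassified(measured_hba1c, adjusted_hba1c))
```

In this example the individual carries fewer HbA1c-raising alleles than expected, so the adjusted value is higher than the measured one and crosses 6.5%. Tallying `is_reclassified` separately among participants with and without FG-defined T2D gives the cells of the 2-by-2 tables.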
Calculation of genetic risk scores. Genetic risk scores of erythrocytic variants and glycemic variants (GS-E and GS-G, respectively) were calculated as detailed in S1 Analysis Plans (Investigate the Effect of Glycemic and Erythrocytic Hemoglobin A1c (HbA1c) Genetic Variants on Diabetes Prediction), as standard in the field, by summing the number of ancestry-specific HbA1c-raising alleles at each variant (0, 1, 2, or expected number of alleles based on the probability of each genotype), multiplied by their ancestry-specific β coefficients for HbA1c from the genome-wide association study (GWAS) meta-analysis multiplied by the number of variants and divided by the sum of β coefficients [31]. This means the contribution of each associated variant to the trait, in a given individual, is influenced by the number of "risk alleles" (or in this case HbA1c-raising alleles) and the effect of the variant on the trait (increase in HbA1c estimated from the meta-analysis).
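
The weighted score described above can be written compactly: the β-weighted sum of HbA1c-raising allele dosages is multiplied by the number of variants and divided by the sum of the β coefficients, so the score stays on an allele-count scale. The sketch below is illustrative only; the function name and the inputs are hypothetical.

```python
def weighted_genetic_score(dosages, betas):
    """Weighted allele score rescaled to the allele-count scale.

    dosages: number (or expected number) of HbA1c-raising alleles per variant.
    betas: ancestry-specific HbA1c effect per HbA1c-raising allele.
    """
    n_variants = len(betas)
    weighted_sum = sum(beta * dosage for beta, dosage in zip(betas, dosages))
    return weighted_sum * n_variants / sum(betas)

# Hypothetical individual: two copies at a large-effect variant, none at a small one.
score = weighted_genetic_score(dosages=[2, 0], betas=[0.03, 0.01])
print(score)
```

With equal β coefficients the score reduces to a simple count of HbA1c-raising alleles; with unequal coefficients, alleles at larger-effect variants count for more than one and alleles at smaller-effect variants for less.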

Effect of HbA1c genetic scores on prediction of incident T2D
We tested the hypothesis that glycemic and erythrocytic HbA1c loci predicted incident T2D differently in Europeans, East Asians, and African Americans from 5 cohorts (partially overlapping with the discovery GWAS) with prospective follow-up: FHS, the European Prospective Investigation into Cancer and Nutrition InterAct project (EPIC-InterAct), ARIC, MESA, and the Singapore Chinese Health Study (SCHS) (N = 33,241). Using age- and sex-adjusted regression models, we tested the association between the genetic scores GS-E or GS-G and incident T2D, defined by FG ≥ 7 mmol/L, 2hrGlu ≥ 11.1 mmol/L, antidiabetic medication use, or a physician diagnosis of T2D, accrued over a 10-to-15-year follow-up period. Clinical practice guidelines did not include HbA1c as a diagnostic test until 2010. As the majority of incident T2D cases were accrued before 2010, participants are very unlikely to have received a T2D diagnosis based only on HbA1c measurements. To test whether individuals with higher GS-E, compared to those with lower GS-E, had lower T2D risk for the same HbA1c, we adjusted models for baseline HbA1c. We meta-analyzed results using a fixed-effects meta-analysis and assessed heterogeneity using Higgins' I-squared. See S1 Analysis Plans (Investigate the Effect of Glycemic and Erythrocytic Hemoglobin A1c (HbA1c) Genetic Variants on Diabetes Prediction) for the analysis plan.
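
For readers unfamiliar with the heterogeneity measure, the sketch below pools cohort-level log odds ratios with inverse-variance weights and computes I-squared as max(0, (Q − df) / Q) × 100, where Q is Cochran's Q and df is the number of cohorts minus one. The weighting scheme is an assumption, and the cohort estimates are hypothetical, not the study's results.

```python
import math

def pooled_or_and_i_squared(log_ors, ses):
    """Return the pooled odds ratio, its 95% CI, and I-squared (percent)."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    q = sum(w * (b - pooled_log_or) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    ci = (
        math.exp(pooled_log_or - 1.96 * pooled_se),
        math.exp(pooled_log_or + 1.96 * pooled_se),
    )
    return math.exp(pooled_log_or), ci, i_squared

# Hypothetical per-cohort log odds ratios per weighted allele and their standard errors.
example_log_ors = [math.log(1.04), math.log(1.06), math.log(1.05)]
example_ses = [0.010, 0.012, 0.008]
pooled_or, pooled_ci, heterogeneity = pooled_or_and_i_squared(example_log_ors, example_ses)
print(pooled_or, pooled_ci, heterogeneity)
```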

Ancestral differences in the genetic architecture of HbA1c
In FHS, ARIC, MESA, and SCHS, we calculated the difference in HbA1c of individuals at the bottom and top 5% of the distribution of an ancestry-specific GS composed of all 60 variants (GS-Total) and an equivalent analysis using GS-E.
We also pursued additional analyses at chromosome X rs1050828 because this single variant showed the largest effect on HbA1c in African Americans and was monomorphic in the other ancestries. The T allele is known to be associated with glucose-6-phosphate dehydrogenase (G6PD) deficiency, an enzymatic defect causing hemolytic anemia [32,33]. Imperfect correlation between HbA1c and glycemia may indicate the impact of reduced erythrocyte lifespan on HbA1c in individuals with the T allele. Fructosamine, a measure of serum protein glycation not influenced by erythrocyte-related factors, reflects average glycemia over the previous 2-3 weeks. Following the analysis plan detailed in S1 Analysis Plans (The Difference Between Fructosamine-inferred HbA1c and Measured HbA1c) we thus calculated the estimated residuals from a linear regression of HbA1c on fructosamine in ARIC African Americans (N = 1,676) to determine whether the T allele was associated with lower HbA1c than predicted by fructosamine, suggesting that the T allele artificially lowered HbA1c through a reduction in the average erythrocytic lifespan. We then reported the mean estimated residuals by genotype (women: CC, CT, TT; men: C, T).
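
The fructosamine comparison can be sketched as follows: fit a simple linear regression of HbA1c on fructosamine across all individuals, take each individual's residual (measured minus fructosamine-predicted HbA1c), and average the residuals within each genotype group. This is an illustration only; the function names and the small data set are hypothetical, and the study's actual models are described in S1 Analysis Plans.

```python
from collections import defaultdict

def ols_fit(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_residual_by_genotype(fructosamine, hba1c, genotypes):
    """Mean (measured - predicted) HbA1c per genotype group."""
    slope, intercept = ols_fit(fructosamine, hba1c)
    residuals = defaultdict(list)
    for f, h, g in zip(fructosamine, hba1c, genotypes):
        residuals[g].append(h - (intercept + slope * f))
    return {g: sum(r) / len(r) for g, r in residuals.items()}

# Hypothetical men: T carriers have lower HbA1c at the same fructosamine level.
fructosamine_values = [200, 220, 240, 260, 200, 220, 240, 260]
hba1c_values = [5.0, 5.4, 5.8, 6.2, 4.4, 4.8, 5.2, 5.6]
genotype_labels = ["C", "C", "C", "C", "T", "T", "T", "T"]

mean_residuals = mean_residual_by_genotype(
    fructosamine_values, hba1c_values, genotype_labels
)
print(mean_residuals)
```

A negative mean residual in a genotype group indicates measured HbA1c below the level predicted from fructosamine, which is the pattern the analysis looks for in carriers of the T allele.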
Estimated number of African Americans with T2D in the United States whose diagnosis would be missed due to the G6PD variant if screened with HbA1c. Using publicly available data from the National Health and Nutrition Examination Survey (NHANES) 2013-2014 [34], a nationally representative sample of US residents, we calculated the proportion of African American adults (aged ≥ 18 years) with T2D who would be missed by not accounting for rs1050828 when using a single HbA1c diagnostic threshold of 6.5%, assuming the observed effect size of rs1050828, allele frequency of 11%, and accounting for NHANES sampling design. The study sample was restricted to 1,133 adults, aged ≥ 18 years, who self-identified as non-Hispanic black with measured HbA1c in 2013-2014. We defined known T2D by self-reported physician diagnosis or medication use. Assuming Hardy-Weinberg Equilibrium and a T allele frequency of 11% for the G6PD variant in our sample, we lowered the diagnostic threshold from the widely accepted 6.5%-units cut-point to 5.7%-units in men with the T genotype, 5.8%-units in women with the TT genotype, and 6.2%-units in women with the CT genotype. We then calculated the proportion of African American individuals with missed T2D diagnosis if screened with HbA1c using the 6.5% diagnostic threshold. We applied procedures to account for sampling probabilities and complex sampling design to enable population-level inferences. Data analysis was performed using SAS (version 9.2 or 9.3; SAS Institute, Cary, NC).
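
The genotype assumptions used in this calculation can be made concrete with a short sketch. It derives the expected genotype frequencies for an X-linked variant under Hardy-Weinberg equilibrium (men carry one allele, women two) from the 11% T allele frequency, and applies the genotype-specific thresholds quoted above. The function names are hypothetical, and NHANES survey weighting is not reproduced.

```python
T_ALLELE_FREQ = 0.11     # from the text
DEFAULT_THRESHOLD = 6.5  # standard HbA1c diagnostic threshold, %-units

# Genotype-specific thresholds quoted in the text.
GENOTYPE_THRESHOLDS = {
    ("male", "T"): 5.7,
    ("female", "TT"): 5.8,
    ("female", "CT"): 6.2,
}

def x_linked_genotype_frequencies(t_freq):
    """Expected genotype frequencies for an X-linked variant under Hardy-Weinberg."""
    c_freq = 1.0 - t_freq
    men = {"C": c_freq, "T": t_freq}
    women = {"CC": c_freq ** 2, "CT": 2.0 * c_freq * t_freq, "TT": t_freq ** 2}
    return men, women

def missed_by_single_threshold(hba1c, sex, genotype):
    """True if HbA1c is below 6.5% but at or above the genotype-specific threshold."""
    threshold = GENOTYPE_THRESHOLDS.get((sex, genotype), DEFAULT_THRESHOLD)
    return hba1c < DEFAULT_THRESHOLD and hba1c >= threshold

men_freqs, women_freqs = x_linked_genotype_frequencies(T_ALLELE_FREQ)
print(men_freqs, women_freqs)
print(missed_by_single_threshold(6.0, "male", "T"))
```

Under these assumptions about 11% of men are hemizygous for T, about 1.2% of women are TT, and about 19.6% of women are CT.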

HbA1c-associated genetic variants and classification into glycemic and nonglycemic pathways
To discover new genetic loci influencing HbA1c in populations from 4 different ancestries (European, African American, East Asian, and South Asian), we performed within-ancestry fixed-effects genome-wide association meta-analyses and transancestry meta-analyses using a model that allowed for different effects between ancestry groups (Methods, S2 Fig). Using this approach in up to 159,940 participants without diabetes, we identified 60 variants associated with HbA1c at genome-wide significance (Fig 1, Table 1 and S5 Table). Of 60, 18 have been previously reported, and 42 were novel, including distinct secondary signals at 5 known loci. To classify the associated loci into groups reflecting their likely mode of action on HbA1c, we repeated association analyses conditioning on erythrocytic or glycemic traits and performed lookups in publicly-available association results summary statistics for additional glycemic and erythrocytic traits (Methods, S3 Fig, S2 Table and S3 Table). Based on the combined results from conditional and lookup results, we were able to classify 22 variants as erythrocytic and 19 as glycemic, with 19 remaining unclassified (Fig 1, Table 1 and S5 Table).

Effect of HbA1c genetic scores on reclassification of prevalent undiagnosed T2D in population screening using HbA1c
Next, we tested whether erythrocytic variants influenced the ability of HbA1c to accurately classify individuals with diabetes when screening populations using a single HbA1c measurement. In 5 cohorts, among the 767 individuals with undiagnosed T2D by FG ≥ 7 mmol/L, 390 (50.8%) had measured HbA1c < 6.5% and would remain undiagnosed based on HbA1c. After accounting for the effect of erythrocytic variants, 5 (1.3%) of these individuals were correctly reclassified to having a HbA1c ≥ 6.5%. Among the 18,613 individuals without T2D by FG < 7 mmol/L, 266 (0.3%) had measured HbA1c ≥ 6.5% and would be incorrectly diagnosed with T2D by HbA1c. After accounting for the effect of erythrocytic variants, 50 (18.8%) of these individuals [13 of 80 (16.3%) European ancestry, 28 of 109 (25.7%) African ancestry, 9 of 77 (11.7%) Asian ancestry] were correctly reclassified to having a HbA1c < 6.5% (Table 2, S6 Table). While adjusting for the effect of erythrocytic variants improved reclassification for   Table), suggesting that accounting for the contribution of erythrocytic variants may not be relevant for individuals who already meet diagnostic thresholds using both FG and HbA1c.

Effect of HbA1c genetic scores on prediction of incident T2D
Next, we tested whether erythrocytic variants influenced the ability of HbA1c to predict incident diabetes in initially nondiabetic populations. GS-G was associated with increased incidence of T2D (odds ratio [OR] per weighted allele 1.05, 95% CI 1.04-1.06, p = 2.5 × 10⁻²⁹) overall, although not in African Americans (Fig 2, S7 Table). GS-E was not associated overall with incident T2D (OR 1.00, 95% CI 0.99-1.01, p = 0.60) (Fig 3, S7 Table), but was negatively associated with incident T2D after adjustment for baseline HbA1c (Fig 4, S4 Fig and S7 Table). Individuals with a higher GS-E therefore have a lower risk of developing T2D at the same HbA1c value, suggesting that the same HbA1c value does not reflect the same level of glycemia in these individuals.

Ancestral differences in the genetic architecture of HbA1c
The population genetic history of African ancestry groups has undergone selective pressure due to the effects of malaria and other infectious diseases on erythrocytes, unlike in most European ancestry populations [35]. This led us to seek ancestral differences in the genetic determinants of HbA1c. The variance in HbA1c levels explained by all 60 genetic variants over a basic regression model including age and sex was 4.2%-5.8% in Europeans, 6.0%-14.3% in East Asians, and 8.9%-9.7% in African Americans (S8 Table). In addition, compared to Europeans and East Asians, African Americans had the largest difference in mean HbA1c between the bottom and top 5% of the GS-Total distribution (0.91%-units, 95% CI 0.78-1.05; Fig 5 and S9 Table). Erythrocytic variants alone explained around one-fifth to three-quarters of the total explained genetic variance in HbA1c (S8 Table). The absolute differences in mean HbA1c from the bottom and top 5% of the GS-E distribution were similar to GS-Total, implying that genetically-induced differences in HbA1c may be largely driven by erythrocytic variants (S9 Table  and S10 Table). In African Americans, this difference was largely driven by the C-to-T missense variant (G202A) in G6PD, rs1050828 on chromosome X. This variant alone explained 14.4% of variance in HbA1c (MESA; 9.6% in women; 19.9% in men). Men with the T allele had, on average, an absolute 0.81%-units (95% CI 0.66-0.96) lower HbA1c than those with the C allele. Homozygous TT women had, on average, an absolute 0.68%-units (95% CI 0.38-0.97) lower HbA1c compared to CC homozygous women. The effect size was similar after excluding those with anemia (Hb < 12 g/dL in women and < 13 g/dL in men, S11 Table).
Fructosamine is another measure of serum protein glycation, which reflects glycemia over a 2-3 week window, but unlike HbA1c it is not influenced by RBC traits; therefore, we sought to explore the difference between fructosamine-inferred HbA1c and measured HbA1c (Methods, S1 Analysis Plans) to test the hypothesis that the G6PD variant might be influencing HbA1c levels independently of ambient glycemia. Among African Americans, the T allele at rs1050828 was associated with measured HbA1c that was lower than fructosamine-predicted HbA1c (0.31%-units, 95% CI 0.25-0.37, p = 6.4 × 10⁻¹⁹). Among men with the C allele, measured HbA1c was similar to fructosamine-predicted HbA1c (residuals, 0.04%-units, 95% CI −0.04 to 0.12, N = 351). This suggested that only the T allele was associated with markedly lower HbA1c than expected from glycemic measurements (S11 Table).
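The comparison of measured and fructosamine-inferred HbA1c can be illustrated with a simple residual calculation. The sketch below is not the study's method (which is described in the Methods and S1 Analysis Plans); it assumes an ordinary least-squares line of HbA1c on fructosamine fitted in a reference group, and all values and names (`fit_line`, `hba1c_residuals`) are hypothetical toy inputs chosen only to show the arithmetic.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hba1c_residuals(fructosamine, hba1c, slope, intercept):
    """Measured minus fructosamine-predicted HbA1c (%-units)."""
    return [h - (intercept + slope * f) for f, h in zip(fructosamine, hba1c)]


# Toy reference group (illustrative values, not study data).
ref_fructosamine = [200.0, 220.0, 240.0, 260.0]  # umol/L
ref_hba1c = [5.0, 5.4, 5.8, 6.2]                 # %-units

slope, intercept = fit_line(ref_fructosamine, ref_hba1c)

# A hypothetical carrier whose measured HbA1c is lower than glycemia predicts.
carrier_residual = hba1c_residuals([240.0], [5.5], slope, intercept)[0]
print(f"carrier residual: {carrier_residual:.2f} %-units")
```

A negative residual indicates measured HbA1c below what fructosamine predicts, which is the pattern reported for the T allele.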

Public health implications of the G6PD variant on T2D screening
Given the large effects of the G6PD G202A variant on HbA1c levels, we sought to investigate the impact this variant would have on diabetes detection if using HbA1c as a screening tool. To do this, we used publicly available data from NHANES 2013-2014 [34], a nationally representative sample of the US, to calculate the proportion of African American adults with T2D who would be missed by not accounting for rs1050828 when using a single HbA1c diagnostic threshold of 6.5%, assuming the observed effect size of rs1050828, allele frequency of 11%, and accounting for NHANES sampling design. In the NHANES sample of African Americans (N = 1,133), the mean age was 44.2 years (standard error 0.9), 55.2% were women, and mean HbA1c, excluding those with physician-diagnosed T2D, was 5.5%-units (standard error 0.02). Overall, 13.45% of African American adults aged ≥ 18 years had physician-diagnosed T2D, and an additional 2.50% had undiagnosed T2D by HbA1c ≥ 6.5%. An additional estimated 2.17% (95% CI 1.88-2.46) with HbA1c < 6.5% may be considered to have T2D if the effect of rs1050828 was accounted for by using genotype-specific diagnostic thresholds of 5.7% for T in men, 5.8% for TT, and 6.2% for TC in women. According to the 2014 United States Census Bureau, approximately 29.9 million adults identified themselves as African American [36], suggesting that 0.65 (95% CI 0.55-0.74) million adults with T2D would remain undiagnosed when screened by a single HbA1c measurement if this genetic information were not taken into account (S12 Table).
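As a rough illustration of this screening calculation, the sketch below applies the genotype-specific thresholds quoted above and scales the estimated proportion to the census population. It is a simplification under stated assumptions: the function and variable names are hypothetical, and the plain multiplication ignores the NHANES sampling design, so it is not expected to reproduce the reported confidence bounds exactly.

```python
SINGLE_THRESHOLD = 6.5  # standard HbA1c diagnostic threshold (%)

# Genotype-specific thresholds quoted in the text for rs1050828 (%).
GENOTYPE_THRESHOLDS = {
    ("male", "T"): 5.7,
    ("female", "TT"): 5.8,
    ("female", "TC"): 6.2,
}


def diagnostic_threshold(sex: str, genotype: str) -> float:
    """Return the genotype-specific threshold, or 6.5% for non-carriers."""
    return GENOTYPE_THRESHOLDS.get((sex, genotype), SINGLE_THRESHOLD)


def missed_by_single_threshold(hba1c: float, sex: str, genotype: str) -> bool:
    """True if HbA1c is below 6.5% but at or above the genotype-specific threshold."""
    return diagnostic_threshold(sex, genotype) <= hba1c < SINGLE_THRESHOLD


def scale_to_population(percent: float, population_millions: float) -> float:
    """Convert a percentage of the population into millions of people."""
    return percent / 100 * population_millions


# Point estimate: 2.17% of 29.9 million African American adults.
missed_millions = scale_to_population(2.17, 29.9)
print(f"{missed_millions:.2f} million")
```

For example, a man carrying the T allele with HbA1c of 6.0% would be flagged by the genotype-specific threshold but not by the single 6.5% threshold.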

Discussion
In a very large transancestry GWAS of HbA1c, we identified 42 novel and 18 known genetic variants associated with HbA1c, explaining 4%-14% of the trait variance. Genetic variants influencing HbA1c through erythrocytic pathways did not predict future T2D, and adjusting for their contribution to HbA1c led to a moderate misclassification of T2D by adjusted HbA1c. Notably, we detected strong ancestral differences in the contribution of genetic variants to HbA1c that substantially altered the performance of HbA1c as a diagnostic test for T2D in African Americans compared with Europeans and East Asians.
Our findings elucidate the contribution of common genetic variants to the genetic architecture of HbA1c and identify an important interface of modern human genetics with clinical and public health. In people of European and Asian ancestry, we found multiple genetic loci with small-to-modest effects, whereas, in individuals of African American ancestry, the genetic architecture was dominated by a single variant at G6PD (G202A). This variant was responsible for a 0.81%-units HbA1c difference in men and 0.68%-units in homozygous TT women, corresponding to adjusted T2D diagnosis thresholds of 5.7% (95% CI 5.5-5.8) and 5.8% (95% CI 5.5-6.1), respectively. To meet the NGSP certification criteria, laboratory-reported HbA1c ought to be within 6% of the standard reference laboratory mean values (e.g., 6.5%-units ± 0.4%-units) for the majority of patient samples [14]. The limits of acceptable analytic variability were exceeded by this G6PD variant. This may also have important implications for the management of diabetes, with carriers of the HbA1c-lowering G6PD allele requiring adjusted (lower) HbA1c treatment targets. Previous epidemiologic studies have shown that a 1%-unit increase in HbA1c in individuals without T2D was associated with a more than 2-fold increase in risk of future T2D and a 20%-50% increased risk of cardiovascular disease (CVD) [37]. Individuals with HbA1c ≥ 6.5%, compared to those with HbA1c < 5.7%, had a higher risk of kidney disease and retinopathy [38].
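The arithmetic behind these comparisons can be checked with a short sketch. It assumes, for illustration only, that an adjusted threshold is obtained by subtracting the genotype effect from the standard 6.5% threshold and rounding to one decimal place; the study's actual derivation may differ, and the names used here are hypothetical.

```python
STANDARD_THRESHOLD = 6.5        # HbA1c diagnostic threshold (%)
NGSP_RELATIVE_TOLERANCE = 0.06  # "within 6%" of the reference mean


def adjusted_threshold(effect: float) -> float:
    """Standard threshold minus the HbA1c-lowering effect (%-units)."""
    return round(STANDARD_THRESHOLD - effect, 1)


def ngsp_tolerance(value: float = STANDARD_THRESHOLD) -> float:
    """Absolute tolerance (%-units) implied by the 6% relative criterion."""
    return NGSP_RELATIVE_TOLERANCE * value


effect_men = 0.81       # %-units, men with the T allele
effect_women_tt = 0.68  # %-units, homozygous TT women

print(adjusted_threshold(effect_men))       # threshold for men with T
print(adjusted_threshold(effect_women_tt))  # threshold for TT women
print(round(ngsp_tolerance(), 2))           # tolerance at 6.5%

# Both genotype effects exceed the analytic tolerance at the 6.5% threshold.
exceeds_tolerance = effect_men > ngsp_tolerance() and effect_women_tt > ngsp_tolerance()
```

Under these assumptions, the subtraction gives 5.7% and 5.8%, and the 6% criterion at 6.5% corresponds to about 0.39%-units, which both genotype effects exceed.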
Only one other African-specific variant, rs11954649, located in the intron of SOX30, reached genome-wide significance in African Americans. However, this variant had a relatively small effect size (β = 0.12 per G allele) on HbA1c and was not classified as glycemic or erythrocytic. The variant was thus not included in the genetic scores and, unlike G6PD, the causal transcript and biological mechanism through which it influences HbA1c remains unclear. Future studies on larger sample sizes of ethnic minorities can focus on dissecting the genomic and biological implications of novel HbA1c-related variants.
When considering all ethnicities, both glycemic and erythrocytic variants influence measured HbA1c; yet, only glycemic variants were associated with increased T2D risk (5% per allele) over a decade-long follow-up period. For an equivalent HbA1c, individuals carrying more erythrocytic HbA1c-raising alleles, or fewer HbA1c-lowering alleles, had lower incident T2D risk (−5% per allele), implying that for the same HbA1c level those individuals with the greater number of erythrocytic HbA1c-raising alleles have artificially higher HbA1c values that do not reflect ambient glycemia. Thus, the influence of erythrocytic HbA1c variants may partly explain why some individuals with the same HbA1c may have different risks of future T2D. We note that the estimates of variance explained by genetic variants underlying HbA1c were comparable with those for FG in Europeans (4.8%) [17].
Our results on the reclassification of prevalent T2D were consistent with previous reports indicating that a diagnostic cut-point at 6.5% for HbA1c classified fewer cases than FG ≥ 7 mmol/L [39,40]. Adjusting for the contribution of erythrocytic variants correctly reclassified approximately 1 in 5 individuals with FG < 7 mmol/L who were incorrectly diagnosed as having T2D (HbA1c ≥ 6.5%) to having HbA1c < 6.5%, suggesting that a subset of these individuals may have artificially elevated HbA1c due to the contribution of the erythrocytic variants.
Though the specific G6PD variant we identified is monomorphic in Asian and European ancestry, other diverse G6PD variant alleles have reached polymorphic frequencies in malaria-endemic regions around the world [35]. G6PD deficiency is unlikely to be identified through routine screening for anemia in healthy individuals, and universal screening for G6PD deficiency is not currently recommended worldwide [32,41]. Testing for G6PD deficiency is only performed on individuals before being prescribed specific drugs, such as antimalarial medications, or in patients with clinical presentation consistent with the disease; for instance, prolonged neonatal jaundice or hemolytic crisis following exposure to specific drugs, infections, or foods [32]. Thus, asymptomatic individuals often remain unaware of their G6PD genotype status, and screening for the G6PD genotype before using HbA1c to diagnose T2D may be warranted in populations or ethnic groups where G6PD deficiency is common. Similarly, a recent study identified a significant hemolytic risk in women heterozygous for the G6PD Mahidol variant who were treated with primaquine and were not detected by current screening methods [42]. Rarer hematologic conditions that reduce erythrocyte lifespan, e.g., hereditary hemolytic anemias, hereditary spherocytosis, and hemoglobinopathies, have also been shown to lower HbA1c [9,43], and should also be considered before using HbA1c in these patients. We recommend additional testing using direct glucose measurements (e.g., FG or oral glucose tolerance testing) or other erythrocyte-independent methods to diagnose T2D. This supports the use of a combination of HbA1c and FG to confirm T2D diagnosis in routine screening [44]. Future studies could also explore G6PD effect modification by HbA1c assay type.
Further studies in large cohorts with HbA1c, glycemic, and erythrocytic traits are required to better determine the biological action of genetic variants that have yet to be classified. Similarly, future analyses conditional on RBC distribution width or reticulocyte count will help to better understand the effects of erythrocytic HbA1c-associated variants, should such data become available. The relatively small sample size for Asian and African ancestry cohorts limited the discovery of ancestry-specific genetic variants, beyond the African-specific G6PD variant, and could explain why GS-G was associated with higher incident T2D in European, but not other, ancestries. This underscores the need to extend such studies to non-European populations, particularly those with a high prevalence of some hemoglobinopathies or iron deficiency anemias. Epidemiologic studies have reported higher mean HbA1c in African Americans compared to European ancestry individuals in the US [45,46]. While our genetic findings could not determine whether this difference was completely attributable to relative hyperglycemia, accounting for the effect of the G6PD variant that lowers HbA1c only in African Americans would further widen this disparity.
In conclusion, HbA1c remains an appropriate diagnostic test for the majority of people of diverse genetic backgrounds, having lower intraindividual variability compared to FG with the ability to capture chronic hyperglycemia, and robust associations with T2D-related complications [37]. Nevertheless, nonglycemic lowering of measured HbA1c for 1 in 10 African American men who carry this G6PD variant, and 1 in 100 African American women homozygous for this variant, could amount to 0.65 (95% CI 0.56-0.74) million African American adults in the US with a missed T2D diagnosis using HbA1c as a screening test. We therefore recommend investigation of the possible benefits of screening for the G6PD genotype along with using HbA1c to diagnose T2D in populations of African ancestry or groups where G6PD deficiency is common, and screening with direct glucose measurements, or genetically-informed HbA1c diagnostic thresholds in people with G6PD deficiency. This work supports a role for a precision medicine application to reduce race-ethnic health disparities using HbA1c genetics to improve T2D diagnosis and prediction and to inform screening strategies for T2D across the African continent where the prevalence of the G6PD variant can reach 20%.