Metabolomics and Incidence of Atrial Fibrillation in African Americans: The Atherosclerosis Risk in Communities (ARIC) Study

Background: Atrial fibrillation (AF) is a common arrhythmia. Application of metabolomic approaches, which may identify novel pathways and biomarkers of disease risk, to longitudinal epidemiologic studies of AF has been limited.
Methods: We determined the prospective association of 118 serum metabolites identified through untargeted metabolomics profiling with the incidence of newly-diagnosed AF in 1919 African-American men and women from the Atherosclerosis Risk in Communities study without AF at baseline (1987–1989). Incident AF cases through 2011 were ascertained from study electrocardiograms, hospital discharge codes, and death certificates.
Results: During a median follow-up of 22 years, we identified 183 incident AF cases. In Cox proportional hazards models adjusted for age, sex, smoking, body mass index, systolic blood pressure, use of antihypertensive medication, diabetes, prevalent heart failure, prevalent coronary heart disease, and kidney function, two conjugated bile acids (glycolithocholate sulfate and glycocholenate sulfate) were significantly associated with AF risk after correcting for multiple comparisons (p<0.0004). Multivariable-adjusted hazard ratios (95% confidence intervals) of AF were 1.22 (1.12–1.32) for glycolithocholate sulfate and 1.22 (1.10–1.35) for glycocholenate sulfate per 1-standard deviation higher levels. Associations were not appreciably different after additional adjustment for alcohol consumption or concentrations of circulating albumin and liver enzymes.
Conclusion: We found an association of higher levels of two bile acids with an increased risk of AF, pointing to a potential novel pathway in AF pathogenesis. Replication of results in independent studies is warranted.


Introduction
Atrial fibrillation (AF) is a common cardiac arrhythmia associated with increased mortality and an elevated risk of stroke, heart failure, myocardial infarction, and dementia. [1] The last two decades have seen major advances in our understanding of the pathophysiology of AF; however, key knowledge gaps remain. [2] Identification of novel biomarkers of AF risk could shed new light on relevant biological pathways and open new avenues for development of preventive and therapeutic strategies. [3] In this regard, large-scale, high-throughput "omics" techniques, such as genomics, epigenomics, transcriptomics, proteomics, and metabolomics, offer a unique opportunity to discover new mechanisms and to refine our comprehension of established etiopathogenic pathways. [4] The application of metabolomics (the systematic study of small molecules in a particular tissue) to research on cardiac arrhythmias, specifically AF, has been limited. Published studies have examined the metabolomic profile of atrial tissue from AF patients or from a canine model of AF. [5,6] These studies found changes in molecules involved in energy metabolism compared to AF-free controls. However, no prospective assessment of the association between metabolomic profiles and risk of AF has been published previously. To address this gap, we explored the association of molecules identified through untargeted metabolomics with risk of newly-diagnosed AF in a subset of participants of the Atherosclerosis Risk in Communities (ARIC) study.

Study sample
The ARIC study is a prospective cohort originally designed to assess risk factors for cardiovascular disease in the general population. A total of 15,792 men and women aged 45-64 years were recruited from four communities (Forsyth County, North Carolina; Jackson, Mississippi; northwest suburbs of Minneapolis, Minnesota; and Washington County, Maryland) in 1987-89. Participants were mostly white in the Minneapolis and Washington County sites and white and African-American in Forsyth County, while only African-American individuals were

Metabolomic profiling
As previously described in detail in the context of a study assessing metabolomic predictors of heart failure [8], metabolomic profiling was performed in 2010 in serum samples obtained from a subset of 1977 African Americans in the Jackson field center. These samples had been kept at -80°C since their collection in 1987-89 and were assayed with an untargeted, gas chromatography/mass spectrometry and liquid chromatography/mass spectrometry-based metabolomic quantification protocol by Metabolon, Inc. (Durham, North Carolina). This approach identified and quantified named compounds with known chemical identities as well as unnamed compounds without current chemical standards. For the present analysis, we considered 118 named compounds with adequate medium-term reliability (defined as a reliability coefficient ≥0.6 in repeat samples obtained 4-6 weeks apart in 60 individuals) [9] and ≤80% of values missing or below the limit of detection.

Ascertainment of AF
A thorough description of AF ascertainment in the ARIC study has been previously published. [10,11] Briefly, cases of AF were identified from three sources: electrocardiograms (ECG) at the study examinations, hospital discharge codes, and death certificates. At all examinations, participants underwent a standard 12-lead ECG with MAC PC ECG machines (Marquette Electronics, Milwaukee, WI). Digital ECG information was transferred to the ARIC ECG Reading Center located at the Epidemiological Cardiology Research Center, Wake Forest School of Medicine, Winston-Salem, NC, where it was automatically processed using the GE Marquette 12-SL program (GE Marquette, Milwaukee, WI). All automatically detected AF cases were reviewed by an experienced cardiologist. Hospitalizations during follow-up were identified through annual follow-up calls (response rate >90%) and surveillance of local hospitals, and hospitalization discharge codes were recorded. AF was considered present if ICD-9-CM code 427.31 or 427.32 appeared in any position in a hospitalization record and was not accompanied by a procedure code for open cardiac surgery. This approach for case ascertainment has demonstrated adequate validity in the ARIC cohort and other studies. [10,12] Finally, AF was considered present if the death certificate included ICD-9 code 427.3 or ICD-10 code I48.

Assessment of other covariates
At baseline, information on age, sex, race, and smoking status was self-reported. Alcohol consumption was ascertained by an interviewer-administered questionnaire. Height and weight were measured with the participant lightly dressed. Body mass index was calculated as weight in kilograms divided by height in meters squared. Sitting blood pressure was measured three times using a random-zero sphygmomanometer after five minutes of rest, and the second and third measurements were averaged. Diabetes was defined as a fasting blood glucose ≥126 mg/dL, non-fasting blood glucose ≥200 mg/dL, a self-reported physician diagnosis of diabetes, or current use of antidiabetic medication. Estimated glomerular filtration rate (eGFR CKD-EPI) was calculated from serum creatinine using the CKD-EPI equation. [13] Serum albumin was measured with a Coulter DACOS (Coulter Diagnostics) using Coulter's bromcresol green colorimetric assay. Liver enzymes (aspartate aminotransferase, alanine aminotransferase, gamma-glutamyl transpeptidase) were measured in serum samples collected at visit 2 (1990-1992) using Roche reagents on the Roche Modular P800 Chemistry analyzer (Roche Diagnostics Corporation). Prevalent heart failure was defined using the Gothenburg criteria, [14] while prevalent coronary heart disease was considered present if the participant self-reported a history of myocardial infarction, coronary bypass surgery, or coronary angioplasty, or had evidence of a previous myocardial infarction by ECG at the baseline visit.
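The covariate definitions above translate directly into simple derivation rules. The following Python sketch is illustrative only and is not the code used in the study; the `BaselineExam` record and its field names are hypothetical, and the thresholds are those stated in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class BaselineExam:
    """Hypothetical record of one participant's baseline measurements."""
    weight_kg: float
    height_m: float
    sbp_readings: Tuple[float, float, float]  # three sitting systolic readings, mmHg
    glucose_mg_dl: float
    fasting: bool
    self_reported_diabetes: bool
    on_antidiabetic_medication: bool


def body_mass_index(exam: BaselineExam) -> float:
    # Weight in kilograms divided by height in meters squared.
    return exam.weight_kg / exam.height_m ** 2


def mean_systolic_bp(exam: BaselineExam) -> float:
    # The first reading is discarded; the second and third are averaged.
    _, second, third = exam.sbp_readings
    return (second + third) / 2.0


def has_diabetes(exam: BaselineExam) -> bool:
    # Glucose threshold depends on fasting status: >=126 fasting, >=200 non-fasting.
    threshold = 126.0 if exam.fasting else 200.0
    return (
        exam.glucose_mg_dl >= threshold
        or exam.self_reported_diabetes
        or exam.on_antidiabetic_medication
    )


example = BaselineExam(81.0, 1.80, (138.0, 130.0, 126.0), 118.0, True, False, False)
print(body_mass_index(example), mean_systolic_bp(example), has_diabetes(example))
```

Any one of the three diabetes criteria is sufficient, so a participant with normal glucose who reports a physician diagnosis is still classified as diabetic.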

Statistical analysis
The association of each metabolite with newly-diagnosed AF was evaluated using Cox proportional hazards regression. Follow-up time was defined as the time between the baseline examination and the incidence of AF, death, loss to follow-up, or December 31, 2011, whichever occurred first. Metabolites with <50% of values missing or below the detection limit were mean centered and modeled as continuous variables in standard deviation units; missing values for this group were imputed using the lowest detectable value in the study sample. Metabolites with 50-80% of values missing or below the detection limit were modeled as an ordinal variable with 3 levels: missing/below the detection limit, detected below the median, and detected equal to or above the median. Models were initially adjusted for age and sex. A second model additionally adjusted for smoking status (current vs former/never), body mass index, systolic blood pressure, hypertension medications, diabetes, prevalent heart failure, and prevalent coronary heart disease. Finally, a third model added eGFR CKD-EPI to the previously listed covariates. Significance tests were corrected using the Dubey/Armitage-Parmar algorithm, a modified Bonferroni procedure. [15] Applying this method to the metabolomics data, two-tailed p-values <0.0004 were considered statistically significant. Because the two metabolites significantly associated with AF risk in our analysis may be considered markers of liver function or damage (see below), we conducted a sensitivity analysis further adjusting for baseline serum albumin, baseline alcohol consumption (in grams/week), and liver enzymes (aspartate aminotransferase, alanine aminotransferase, gamma-glutamyl transpeptidase) measured at visit 2. Also, given the role of bile acids in cholesterol metabolism and the potential link of blood lipids with AF incidence [16,17], we conducted an analysis adjusting for baseline serum LDL cholesterol, HDL cholesterol and triglycerides.
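The metabolite coding rules and the multiple-comparison threshold described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the SAS code used for the analysis. All function names are hypothetical. Imputing before standardizing is an assumption, since the text does not specify the order. The threshold function uses the Dubey/Armitage-Parmar adjustment as it is commonly stated, with an effective number of tests M^(1 - mean correlation); the mean correlation applied in the study is not reported, so the value of 0 below is only an example.

```python
import numpy as np


def standardize_with_min_imputation(values):
    """Continuous coding: impute missing (NaN) with the lowest detected value,
    then mean-center and scale to standard deviation units.
    Assumption: imputation is done before standardization."""
    x = np.asarray(values, dtype=float).copy()
    x[np.isnan(x)] = np.nanmin(x)
    return (x - x.mean()) / x.std(ddof=1)


def three_level_ordinal(values):
    """Ordinal coding: 0 = missing/below detection limit,
    1 = detected below the median, 2 = detected at or above the median."""
    x = np.asarray(values, dtype=float)
    detected = ~np.isnan(x)
    median = np.median(x[detected])
    levels = np.zeros(x.shape, dtype=int)
    levels[detected & (x < median)] = 1
    levels[detected & (x >= median)] = 2
    return levels


def encode_metabolite(values):
    """Choose the coding from the fraction of missing/undetected values."""
    x = np.asarray(values, dtype=float)
    frac_missing = np.mean(np.isnan(x))
    if frac_missing < 0.5:
        return "continuous", standardize_with_min_imputation(x)
    if frac_missing <= 0.8:
        return "ordinal", three_level_ordinal(x)
    raise ValueError("more than 80% missing: metabolite excluded from analysis")


def dap_threshold(n_tests, mean_corr=0.0, alpha=0.05):
    """Per-test significance threshold under a Dubey/Armitage-Parmar-style
    adjustment (assumed form): effective tests = n_tests ** (1 - mean_corr)."""
    m_eff = n_tests ** (1.0 - mean_corr)
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


kind, coded = encode_metabolite([2.0, np.nan, 4.0, 6.0])
print(kind, coded)
print(dap_threshold(118))  # uncorrelated case, reduces to the Sidak threshold
```

With 118 metabolites and no correlation between them, this form gives a per-test threshold of roughly 0.0004, in line with the cutoff reported above; higher average correlation would relax the threshold toward 0.05.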
Additionally, we performed sex-specific analyses for the two metabolites significantly associated with AF risk. The proportional hazards assumption was assessed with Schoenfeld residuals and by including interaction terms between time and the corresponding metabolite in the regression model. We explored the shape of the association between the significant metabolites and AF risk by modeling the metabolite as restricted cubic splines with knots at the 5th, 27.5th, 50th, 72.5th, and 95th percentiles, as recommended by Harrell. [18] Finally, to determine the joint association of the two significant metabolites with AF risk, we conducted two additional analyses. First, we ran a multivariable Cox model including both metabolites simultaneously. Second, we categorized the study participants by quintiles of the two metabolites and created a new variable as the sum of the quintile ranks (range 2-10); this variable was modeled both as a continuous and as a categorical variable (with individuals in the bottom quintile of both metabolites as the reference group).
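Two of the derived variables described above, the restricted cubic spline terms and the sum of quintile ranks, can be sketched in Python. This is a minimal illustration, not the study's SAS implementation. Function names are hypothetical, the spline basis follows the standard truncated-power form of the restricted cubic spline with the usual scaling by the squared knot range, and the handling of ties at quintile boundaries is an assumption.

```python
import numpy as np

# Knot percentiles for a 5-knot restricted cubic spline, as given in the text.
FIVE_KNOT_PERCENTILES = (5, 27.5, 50, 72.5, 95)


def rcs_basis(x, knots):
    """Restricted cubic spline basis: a linear column plus (k - 2) nonlinear
    columns that are constrained to be linear beyond the outer knots."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)

    def pos_cubed(u):
        return np.clip(u, 0.0, None) ** 3

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps columns on a scale comparable to x
    columns = [x]
    for j in range(len(t) - 2):
        term = (
            pos_cubed(x - t[j])
            - pos_cubed(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cubed(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        columns.append(term / scale)
    return np.column_stack(columns)


def quintile_rank(x):
    """Rank 1 (bottom quintile) to 5 (top quintile).
    Assumption: values equal to a cut point go to the higher quintile."""
    x = np.asarray(x, dtype=float)
    cuts = np.quantile(x, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(cuts, x, side="right") + 1


def quintile_sum_score(metabolite_a, metabolite_b):
    """Sum of the two quintile ranks, ranging from 2 to 10."""
    return quintile_rank(metabolite_a) + quintile_rank(metabolite_b)


levels = np.linspace(0.0, 10.0, 1001)  # synthetic metabolite levels
knots = np.percentile(levels, FIVE_KNOT_PERCENTILES)
print(rcs_basis(levels, knots).shape)
```

The columns returned by `rcs_basis` would enter the Cox model in place of the single linear metabolite term; a score of 2 from `quintile_sum_score` corresponds to the reference group in the bottom quintile of both metabolites.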
All analyses were conducted using SAS version 9.2 or 9.3 (SAS Institute, Inc., Cary, NC).

Results
Of 1977 participants with available metabolomics profiling, 58 were excluded because they had prevalent AF or did not have a baseline electrocardiogram, leaving 1919 eligible individuals.
During a median follow-up of 22 years, 183 incident cases of AF were identified (incidence rate: 5.0 per 1000 person-years). Table 1 reports selected baseline characteristics by AF status during follow-up. Compared to those without diagnosed AF during follow-up, those who developed AF were slightly older, more likely to be male and current smokers, had higher body mass index and systolic blood pressure and higher prevalence of diabetes and prevalent cardiovascular diseases.
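As a check on the figures above, the reported incidence rate implies the total follow-up time accrued by the cohort. The short Python sketch below is illustrative; the function names are hypothetical, and because the rate is rounded to one decimal place the derived person-years are approximate.

```python
def incidence_rate_per_1000(cases, person_years):
    """Incidence rate expressed per 1000 person-years."""
    return 1000.0 * cases / person_years


def implied_person_years(cases, rate_per_1000):
    """Person-years of follow-up implied by a case count and a rate."""
    return 1000.0 * cases / rate_per_1000


person_years = implied_person_years(183, 5.0)
mean_follow_up = person_years / 1919  # average years contributed per participant
print(person_years, mean_follow_up)
```

The implied total is about 36,600 person-years, or roughly 19 years per participant on average. A mean below the 22-year median is expected when follow-up ends early for participants who develop AF, die, or are lost to follow-up.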
In models adjusted for age and sex, 8 metabolites were significantly associated with AF risk (p<0.0004) (Table 2). After additional adjustment for several AF risk factors, two metabolites, the bile acids glycolithocholate sulfate and glycocholenate sulfate, remained significantly associated. The hazard ratios (HR) and 95% confidence intervals (CI) of AF per 1-standard deviation higher levels were 1.22 (95% CI 1.12-1.32) for glycolithocholate sulfate and 1.22 (95% CI 1.10-1.35) for glycocholenate sulfate. Further adjustment for kidney function did not change the associations (Table 2). When modeled as restricted cubic splines, both glycolithocholate [...] 1). Complete results for the 118 metabolites are provided as a supplementary file (S1 Table). We did not find any evidence of violation of the proportional hazards assumption.
Table 1. Selected baseline characteristics by atrial fibrillation (AF) status during follow-up in a subsample (N = 1919) of participants from the Atherosclerosis Risk in Communities (ARIC) study Jackson, Mississippi field center without AF at baseline (1987-1989).
We performed some additional analyses. Since sulfated bile acids have been described as potential markers of hepatobiliary diseases, we performed an analysis adjusting for baseline serum albumin, alcohol consumption, and liver enzymes measured at visit 2. After these additional adjustments, both glycolithocholate sulfate and glycocholenate sulfate remained associated with AF incidence (HR, 95% CI: 1.18, 1.09-1.29 and 1.23, 1.10-1.36, respectively). Likewise, both metabolites showed similar associations with AF incidence after adjustment for baseline serum LDL cholesterol, HDL cholesterol and triglycerides (HR, 95% CI: 1.17, 1.08-1.27 for glycolithocholate sulfate and 1.18, 1.07-1.30 for glycocholenate sulfate). We also explored the association of both metabolites with incidence of AF in men and women separately. No differences were observed in the association of glycocholenate sulfate with AF incidence by sex; however, glycolithocholate sulfate was more strongly associated with AF incidence in men than in women (p for interaction < 0.001) (S2 Table).
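The hazard ratios and confidence intervals reported for the main multivariable models can be related to the significance threshold of p<0.0004 by recovering an approximate Wald statistic from each interval. The Python sketch below is a reader's back-calculation, not part of the original analysis; the function name is hypothetical, and because the published estimates are rounded to two decimals the resulting p-values are only approximate.

```python
import math

Z_975 = 1.96  # approximate 97.5th percentile of the standard normal distribution


def wald_from_ci(hr, lower, upper):
    """Approximate Wald z statistic and two-sided p-value implied by a
    hazard ratio and its 95% confidence interval (log scale)."""
    log_hr = math.log(hr)
    std_err = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    z = log_hr / std_err
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


reported = {
    "glycolithocholate sulfate": (1.22, 1.12, 1.32),
    "glycocholenate sulfate": (1.22, 1.10, 1.35),
}
for name, (hr, lower, upper) in reported.items():
    z, p = wald_from_ci(hr, lower, upper)
    print(f"{name}: z = {z:.2f}, p = {p:.1e}")
```

Both back-calculated p-values fall below 0.0004, consistent with the two metabolites passing the multiple-comparison threshold.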
Finally, we assessed whether associations of both metabolites with AF risk were independent of each other. The two bile acids were moderately correlated (r = 0.45). When included simultaneously in a multivariable Cox model, the association of both metabolites with AF incidence became weaker and remained significant at the traditional p = 0.05 level only for glycolithocholate sulfate, but not for glycocholenate sulfate (HR 1.16, 95% CI 1.02-1.31, p = 0.02 for glycolithocholate sulfate and 1.09, 95% CI 0.94-1.25, p = 0.25 for glycocholenate sulfate). When the circulating levels of both metabolites were jointly modeled as the sum of their quintile ranks (range 2 to 10), the association became stronger: the HR (95% CI) of AF among participants in the top quintile of both bile acids, compared to those in the bottom quintiles, was 2.99 (1.12-7.99), and the p-value for the sum of quintiles modeled as a continuous variable was 0.003.
Results from Cox proportional hazards model with metabolites modeled using restricted cubic splines (knots at 5th, 27.5th, 50th, 72.5th, and 95th percentiles), adjusted for age, sex, body mass index, smoking, diabetes, systolic blood pressure, use of antihypertensive medication, prevalent coronary heart disease, prevalent heart failure, and eGFR CKD-EPI. Median value of the metabolite was considered the reference (HR = 1). The histograms represent the frequency distribution of both metabolites in the study sample. The red dots indicate the position of the knots used in the restricted cubic splines. Atherosclerosis Risk in Communities Study subsample, 1990-2011. eGFR CKD-EPI: CKD-EPI creatinine-based estimated glomerular filtration rate.

Discussion
In this systematic assessment of circulating metabolites and AF risk, we found that increased levels of two conjugated bile acids, glycolithocholate sulfate and glycocholenate sulfate, were associated with increased incidence of AF. This association was independent of other risk factors for AF, including kidney function, and of alcohol consumption and markers of liver damage and function.
To date, the use of metabolomics in the study of AF has been limited to cross-sectional studies. Mayr and colleagues used a combined metabolomic and proteomic approach in human atrial tissue from patients with AF and controls in sinus rhythm [5]. Using high-resolution proton nuclear magnetic resonance spectroscopy, the authors found that levels of beta-hydroxybutyrate, involved in ketone body metabolism, ketogenic amino acids, and glycine were elevated in cardiac tissue from patients with persistent AF. Also employing a combined metabolomics and proteomic approach, De Souza and colleagues studied left-atrial cardiomyocytes of ventricular-tachypaced dogs, an animal model of heart failure-induced AF, and sham controls [6]. Observed metabolic changes suggested increased metabolic stress paired with inefficient energy utilization, and a shift from glycolysis to ketoacid metabolism. In contrast to our analysis of the ARIC cohort, these two studies focused on metabolomics of AF-affected cardiac tissue, not directly addressing the association of circulating metabolites with the risk of developing AF.
The two molecules associated with AF risk in our study are glycine-conjugated bile acids that have also undergone sulfation. Specifically, glycolithocholate sulfate is derived from glycine-conjugated lithocholic acid, a secondary bile acid synthesized by intestinal bacteria from chenodeoxycholic acid, a primary bile acid excreted by the liver. Lithocholic acid in abnormally high concentrations can be cytotoxic; sulfation reduces its toxic effects through increased hydrophilicity and enhanced fecal and urinary excretion [19]. High levels of glycolithocholate sulfate in urine have been associated with the presence of chronic hepatic disease, suggesting a potential role as a marker of liver damage [20]. Glycocholenate sulfate, on the other hand, is possibly synthesized through glycine amidation and sulfation of 3-beta-hydroxy-5-cholenoic acid (also known as cholenate). Previous literature has described elevations of 3-beta-hydroxy-5-cholenoic acid in patients with liver disease, particularly in those with primary biliary cirrhosis [21], a condition characterized by progressive inflammatory destruction of the bile ducts, leading to cirrhosis. The untargeted metabolomics assessment measured other bile acids in serum, including glycocholate, glycodeoxycholate, and ursodeoxycholate, but none of these were significantly associated with AF risk after correction for multiple comparisons.
The mechanism responsible for the association of higher levels of bile acids with risk of AF is unclear. Both glycolithocholate sulfate and glycocholenate sulfate may be elevated in the context of liver disease. We and others have shown that higher circulating levels of liver enzymes, markers of liver damage, are associated with an increased risk of AF [22,23]. In the present analysis, however, the association of the two bile acids with higher AF risk was independent of markers of liver damage and function, pointing to different pathways underlying this association. Limited previous evidence has linked bile acids with the occurrence of arrhythmias. Experiments in rat ventricular muscle have demonstrated negative inotropic effects of high levels of bile acids [24], and studies in neonatal rat cardiomyocyte cultures have also demonstrated that taurine-conjugated bile acids cause bradycardia and loss of synchronous beating [25]. Moreover, intrahepatic cholestasis of pregnancy, a disorder characterized by elevated maternal serum bile acids, has been linked to the presence of fetal cardiac arrhythmias [26]. Recently, Rainer and colleagues explored the potential arrhythmogenic effects of bile acids in the adult human heart [27]. Their studies found that increasing concentrations of taurine- and glycine-conjugated bile acids caused increased occurrence of arrhythmic extra contractions in myocardial tissue extracted from the right atrium of adult patients undergoing heart surgery. They also found higher concentrations of non-ursodeoxycholate bile acids in patients with AF compared with controls in sinus rhythm [27].
These findings, together with growing evidence of the systemic metabolic effects of circulating bile acids through activation of farnesoid X receptor [28], and the effect that the gut microbiota (responsible for secondary bile acids synthesis) has in cardiometabolic health [29], suggest that future research should pay additional attention to the role of bile acids in cardiac arrhythmogenesis.
Strengths of our study include the well-characterized cohort with excellent follow-up and the availability of rigorous information on potential confounders. Some limitations need to be mentioned. First, our method of AF ascertainment probably missed asymptomatic cases and those managed exclusively in the outpatient setting. Previous work, however, has shown the high specificity and validity of AF ascertainment using hospitalization discharge codes in the ARIC cohort and other epidemiologic studies [10,12]. Second, we had only a single metabolomic assessment, which may be insufficient to characterize long-term levels of these metabolites. Nonetheless, as described in the methods, we only considered in our analysis metabolites with adequate medium-term reliability [9]. Third, we do not have direct evidence of the stability of the metabolite profile over the more than 20 years that the samples remained in storage. Finally, a replication sample was not available. Although not a limitation, the restriction of our study to African Americans makes the generalizability of our results to other racial groups uncertain.

Conclusions
We have identified a novel association of two secondary bile acids with the incidence of AF. Replication of these results in additional studies is critical. If our observations are replicated, additional research should address the potential pathways linking bile acid metabolism with the etiopathogenesis of AF.
Supporting Information S1