Matrix Gla protein polymorphism rs1800801 associates with recurrence of ischemic stroke

The MGP single nucleotide polymorphism (SNP) rs1800801 has previously been associated with recurrent ischemic stroke in a Spanish cohort. Here, we tested for association of this SNP with ischemic stroke recurrence in a North American Caucasian cohort. Acute ischemic stroke patients who were admitted between 10/2009 and 12/2016 to three hospitals within a large healthcare system in the northeastern United States and who were enrolled in a healthcare system-wide exome sequencing program were retrospectively reviewed. Patients with recurrent stroke within 1 year after the index event were compared to those without recurrence. Of 9,348 suspected acute ischemic strokes admitted between 10/2009 and 12/2016, 1,727 (18.5%) were enrolled in the exome sequencing program. Among those, 1,068 patients had exome sequencing completed and were eligible for inclusion. Recurrent stroke within the first year after the index stroke was observed in 79 patients (7.4%). In multivariable analysis, stroke prior to the index stroke (OR = 9.694, 95% CI 5.793–16.224, p ≤ 0.001), pro-coagulant status (OR = 3.563, 95% CI 1.504–8.443, p = 0.004) and the AA genotype of SNP rs1800801 (OR = 2.408, 95% CI 1.079–4.389, p = 0.004) were independently associated with recurrent stroke within the first year. The AA genotype of the MGP SNP rs1800801 is associated with recurrence within the first year after ischemic stroke in North American Caucasians. Study of stroke subtypes and additional populations will be required to determine whether incorporating allelic status at this SNP into current risk scores improves prediction of recurrent ischemic stroke.

Introduction
Annually, about 795,000 Americans suffer a stroke, and almost one quarter of these strokes (185,000) are recurrent [1]. Although several prediction models for recurrent stroke have been reported [2][3][4][5][6], their certainty is moderate at best [7]. Recent data on the inclusion of genetic markers in stroke prediction models have been mixed. Achterberg and colleagues did not find additional value of genetic information in predicting recurrence of vascular events, including stroke, after cerebral ischemia [8]. In the Genotyping Recurrence Risk of Stroke (GRECOS) project, however, Fernández-Cadenas et al. found an association of the single nucleotide polymorphism (SNP) rs1800801 in the matrix carboxyglutamic acid (Gla) protein (MGP) gene with first-year recurrent ischemic stroke in Spanish Caucasians [9]. MGP is an extracellular matrix protein involved in the inhibition of calcification of arteries and cartilage [10]. Recently, a meta-analysis reported that the SNP rs1800801 (12-15038788C-T, G>A) is associated with an increased risk for vascular calcification and atherosclerotic disease in Caucasians [11]. Here, we investigated the impact of rs1800801 on the risk for first-year recurrent stroke in a North American Caucasian population.

Study design
Caucasian acute ischemic stroke patients admitted between October 2009 and December 2016 to three hospitals that are part of a large healthcare system in the northeastern United States were retrospectively reviewed. Patients were identified through the American Heart Association "Get With The Guidelines®-Stroke" center database and cross-checked against individuals enrolled in MyCode, a healthcare system-wide biobanking and exome sequencing program that links genetic samples to electronic health record (EHR) data. The institutional phenomic analytics and clinical data core extracted EHR data from the clinical documentation improvement specialist (CDIS) and various disparate data sources, followed by manual chart review (N.S., M.A., S.K.) to verify the accuracy of the diagnosis of acute ischemic stroke and to document subsequent clinical management. This study was carried out in accordance with the recommendations of the Geisinger Institutional Review Board. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Geisinger Institutional Review Board (IRB#: 2017-0521).

MyCode enrollment process and whole exome sequencing
The MyCode Community Health Initiative was established in 2007 as a discovery research initiative and currently has more than 200,000 consented subjects [12]. Recruitment occurs in primary care and specialty clinics throughout the health system without regard to underlying diseases. The health system established a research collaboration with the Regeneron Genetics Center (RGC) that includes conducting whole-exome sequencing in MyCode participants and linking sequence data to EHR data. Whole-exome sequencing is performed at RGC as previously described [13]. Exome-sequencing data were available for 92,455 individuals at the commencement of this study. The MGP SNP rs1800801 genotype was obtained in eligible stroke patients. rs1800801 is a single nucleotide G>A variant in the 5'UTR; the minor allele frequency (MAF) is 0.31 for the A allele [14].
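As a rough illustration of what a minor allele frequency of 0.31 implies, the sketch below computes the genotype frequencies expected under Hardy-Weinberg equilibrium. The function name `expected_genotype_frequencies` is hypothetical, and the calculation assumes a biallelic site and random mating in the reference population; it is not part of the study's sequencing or genotyping pipeline.

```python
def expected_genotype_frequencies(maf):
    """Return expected GG, GA and AA frequencies under Hardy-Weinberg equilibrium.

    `maf` is the frequency of the minor (A) allele; the major (G) allele
    frequency is taken as 1 - maf (biallelic site assumed).
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie between 0 and 1")
    q = maf          # frequency of the A allele
    p = 1.0 - maf    # frequency of the G allele
    return {"GG": p * p, "GA": 2.0 * p * q, "AA": q * q}


# MAF of 0.31 for the A allele, as cited in the text [14].
freqs = expected_genotype_frequencies(0.31)
for genotype, freq in freqs.items():
    print(f"{genotype}: {freq:.4f}")
```

Under these assumptions, just under 10% of individuals in the reference population would be expected to carry the AA genotype.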

Patient characteristics, clinical variables, and outcome measures
The diagnosis of acute ischemic stroke was based on the neurologic examination and confirmed by CT scan or magnetic resonance imaging (MRI). Ischemic strokes were classified into subtypes according to TOAST criteria [15]. Clinical and past-medical history variables were collected in a non-blinded fashion. Functional outcome was assessed using the modified Rankin Scale (mRS), with mRS 0-2 representing a favorable and mRS 3-6 an unfavorable functional outcome (mRS 6 = death).

The agreement between Geisinger and Regeneron, Geisinger's industry partner, states that summary statistics can be shared, such as in this manuscript. Regeneron reviewed and approved this submission. Any data sharing on the patient level (i.e. individual patient SNPs) requires the execution of a data sharing agreement between the requestor and Geisinger that is also aligned with Geisinger-Regeneron data sharing terms. For such requests, research contracts at Geisinger and Geisinger's Institutional Review Board have to be contacted. Please direct requests to irb@geisinger.edu. Once appropriate data use agreements are executed, others will be able to access the data in the same manner. There are no special privileges that others will be unable to obtain once appropriate data use agreements are executed.

Competing interests:
The affiliation Geisinger Health does not alter our adherence to PLOS ONE policies on sharing data and materials. We also declare that there are no relevant declarations relating to consultancy, patents, products in development, or marketed products.

First-year recurrent ischemic stroke and non-recurrent ischemic strokes
Patients who experienced an acute ischemic stroke between 10/2009 and 12/2016 were screened for first-year recurrent stroke via chart review of the health system database. Patients were categorized as (1) recurrent stroke within the first year of index stroke versus (2) no recurrent stroke. Those patients who experienced a recurrent stroke beyond the first year were excluded from this analysis.
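The grouping rule described above can be sketched as a small function. This is only an illustration: the function name `classify_recurrence`, the group labels, and the choice of 365 days as the definition of "first year" are assumptions, and the dates are invented rather than taken from the study.

```python
from datetime import date, timedelta

ONE_YEAR = timedelta(days=365)  # assumption: "first year" taken as 365 days


def classify_recurrence(index_date, recurrence_date=None):
    """Assign a patient to a study group from the index and recurrence dates.

    Returns "recurrent" for a recurrence within one year of the index stroke,
    "non-recurrent" when no recurrence is recorded, and "excluded" when the
    recurrence falls beyond the first year.
    """
    if recurrence_date is None:
        return "non-recurrent"
    if recurrence_date < index_date:
        raise ValueError("recurrence cannot precede the index stroke")
    if recurrence_date - index_date <= ONE_YEAR:
        return "recurrent"
    return "excluded"


# Hypothetical dates, for illustration only.
print(classify_recurrence(date(2012, 3, 1), date(2012, 9, 15)))  # recurrent
print(classify_recurrence(date(2012, 3, 1)))                     # non-recurrent
print(classify_recurrence(date(2012, 3, 1), date(2014, 1, 10)))  # excluded
```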

Statistical analysis
Continuous variables are presented as mean ± standard deviation and categorical variables as frequency and percent. Univariable analyses were carried out using binary logistic regression, Chi-square and Fisher's exact tests, as appropriate. Post hoc testing for crosstabs exceeding 2x2 dimensions was performed by calculating adjusted standardized residuals (z-scores) and the corresponding p-values. Significance for crosstabs was evaluated after Bonferroni correction. Multivariable analysis was performed by integrating variables with a possible association with first-year recurrent stroke (p < 0.15). Stepwise backward elimination was performed, and p-values < 0.05 were considered statistically significant. Discrimination of predictive models was assessed using the area under the receiver operating characteristic curve (AUC). Statistical analyses were performed using IBM SPSS version 22 and MedCalc software.
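The post hoc procedure for larger crosstabs can be sketched as follows. This is a minimal illustration of one common way to compute adjusted standardized residuals and apply a Bonferroni threshold, not the SPSS procedure the authors ran. The function names are hypothetical, the counts are invented, and dividing the significance level by the number of cells is an assumption about how the correction was applied.

```python
import math


def adjusted_residuals(table):
    """Adjusted standardized residuals (z-scores) for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        z_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            z_row.append((observed - expected) / math.sqrt(variance))
        residuals.append(z_row)
    return residuals


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts, NOT study data: rows = recurrence yes/no, columns = GG, GA, AA.
counts = [[20, 30, 15], [450, 420, 90]]
z_scores = adjusted_residuals(counts)
n_cells = len(counts) * len(counts[0])
alpha_bonferroni = 0.05 / n_cells  # assumed: one comparison per cell

for row in z_scores:
    for z in row:
        p = two_sided_p(z)
        print(f"z = {z:+.2f}, p = {p:.4f}, significant = {p < alpha_bonferroni}")
```

In a 2x2 table the squared adjusted residual of any cell equals the Pearson chi-square statistic, which is why this post hoc step only adds information for larger tables.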

Patient characteristics
A total of 9,348 suspected acute ischemic strokes were admitted to one of three health system hospitals between 10/2009 and 12/2016, with 1,727 (18.5%) enrolled in the exome sequencing program. Among those, 1,068 MyCode patients had exome sequencing data available at the commencement of this study and were eligible for inclusion. The one-year incidence rate of recurrent stroke was 79/1,068 (7.4%) in our study population. Mean age of patients with and without recurrent stroke was 66.9 and 68.2 years, respectively. Functional outcome at discharge was favorable (mRS 0-2) in 28/79 (35.4%) recurrent stroke patients and 352/985 (35.7%) non-recurrent stroke patients. In-hospital mortality was 1.3% (1/79) in recurrent stroke patients and 2.4% (24/985) in non-recurrent stroke patients. At 90 days follow-up, 38/77 (49.4%) recurrent stroke patients had a favorable functional outcome, and 6/79 (7.8%) recurrent stroke patients were deceased (Table 1).

Association of matrix Gla protein polymorphism, clinical variables and recurrent stroke
The MGP SNP rs1800801 was in Hardy-Weinberg equilibrium in patients with and without first-year recurrent stroke (p = 0.928 and p = 0.424, respectively). Genotype distributions differed significantly between the two groups (p = 0.011). Linear-by-linear association testing demonstrated a significant association of the minor A allele (reported as T in genomic coordinates, 12-15038788C-T) with the rate of recurrent stroke (p = 0.004) (Table 2). In univariable analysis, dyslipidemia, peripheral vascular disease, coronary artery disease, prior stroke, pro-coagulant disorder, and positive family history of stroke were associated with recurrent stroke within the first year and were thus analyzed in multivariable analysis. Home medications (antiplatelet agents, anticoagulants, statins, ACE/AT1 inhibitors, beta-blockers, and oral antidiabetics) were each also associated with recurrent stroke within the first year. However, these variables were not integrated into the multivariable analysis because they represent indirect indicators of cerebrovascular disease and other health conditions that were directly assessed through past medical history variables. In multivariable analysis, stroke prior to the index stroke (OR = 9.694, 95% CI 5.793-16.224, p ≤ 0.001), pro-coagulant status (hypercoagulability) (OR = 3.563, 95% CI 1.504-8.443, p = 0.004) and the AA genotype of SNP rs1800801 compared with the GG and GA genotypes (OR = 2.408, 95% CI 1.079-4.389, p = 0.004) were independently associated with recurrent stroke within the first year. A trend towards recurrent stroke was observed for positive family history (OR = 1.698, 95% CI 0.988-2.916, p = 0.055) (Table 3).
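A Hardy-Weinberg equilibrium check of the kind reported above can be sketched with a chi-square goodness-of-fit test on genotype counts. The text does not state which test was used, so the chi-square variant with one degree of freedom is an assumption, as are the function name `hwe_chi_square` and the invented counts.

```python
import math


def hwe_chi_square(n_gg, n_ga, n_aa):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_gg + n_ga + n_aa
    q = (2 * n_aa + n_ga) / (2 * n)   # estimated A allele frequency
    p = 1.0 - q                       # estimated G allele frequency
    if q in (0.0, 1.0):
        raise ValueError("both alleles must be observed")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_gg, n_ga, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts, NOT study data.
chi2, p_value = hwe_chi_square(n_gg=480, n_ga=420, n_aa=100)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A large p-value, as reported for both patient groups, means the observed genotype counts do not deviate detectably from the proportions expected from the allele frequencies.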
Subgroup analysis comparing the AA genotype with the GA genotype (excluding the GG genotype) within the same model yielded OR = 2.062, 95% CI 1.013-3.937, p = 0.029, indicating that the AA genotype carries the highest risk. The area under the receiver operating characteristic curve (AUC) for the model including SNP rs1800801 was 0.744 (95% CI 0.717-0.770), whereas the model without the SNP rs1800801 yielded an AUC of 0.740 (95% CI 0.712-0.766) (Fig 1).
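To make the AUC comparison concrete, the sketch below computes an AUC as the fraction of recurrent/non-recurrent patient pairs in which the recurrent patient received the higher predicted risk (the Mann-Whitney formulation). The function name `auc` and the predicted probabilities are hypothetical; the study's AUCs were obtained with statistical software, not with this code.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U / (n1 * n0)).

    Counts the fraction of (recurrent, non-recurrent) pairs in which the
    recurrent patient has the higher predicted risk; ties count as one half.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities, NOT study data.
recurrent = [0.30, 0.22, 0.15, 0.09]
non_recurrent = [0.20, 0.10, 0.08, 0.05, 0.04, 0.03]
print(f"AUC = {auc(recurrent, non_recurrent):.3f}")
```

Read this way, an increase from 0.740 to 0.744 means that adding the genotype changes the ranking of only a very small share of patient pairs.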

Discussion
Identification of patients at risk for recurrent stroke is critical because approximately one-quarter of all strokes are recurrent and recurrence portends poor outcomes. Additional cerebrovascular ischemic events or systemic vascular ischemic events increase the risk for morbidity and mortality [1]. Thus, secondary prevention in patients who recently experienced a cerebrovascular event is critical and warrants investigation of clinical and non-clinical risk factors associated with stroke recurrence. In this analysis, we observed an independent association between the AA genotype of the MGP SNP rs1800801 and recurrent stroke within the first year in North American Caucasians. MGP is an extracellular matrix protein primarily synthesized by vascular smooth-muscle cells (VSMCs) and chondrocytes. Mice deficient in MGP display significant artery calcification and suffer pathological cartilage calcification causing osteopenia and fractures [10]. The underlying mechanism of vascular calcification and the role of MGP in this multifactorial process are still under investigation. An interplay of uremic toxins, calcium and phosphate deposits, lipidaceous vesicles, and VSMC transdifferentiation has been debated [16]. Herrmann and colleagues identified 8 MGP polymorphisms and found that individuals with femoral atherosclerotic plaque calcification and myocardial infarction were more frequently carriers of the minor A allele of rs1800801. However, they could not demonstrate a function for this allele in vitro [17]. Harbuzova et al. analyzed the role of MGP in cerebrovascular disease. They found that the AA genotype of rs1800801 was associated with an increased risk of ischemic atherothrombotic stroke in Ukrainian females [18,19]. They also reported that the AA genotype was associated with an altered prothrombin time, presumably causing hypercoagulability [20].
This study group also observed that the AA genotype of rs1800801 occurred significantly more frequently in patients who suffered from acute coronary syndrome [21]. Recently, Sheng et al. examined the association of the MGP SNPs rs1800801, rs1800802, and rs4236 with vascular calcification and atherosclerotic disease in a meta-analysis [22]. Across 23 included case-control studies with 5,280 cases and 5,773 controls, the A allele of rs1800801 was significantly associated with vascular calcification and atherosclerotic disease in Caucasians but not in Asians. Moreover, no association was found with rs1800802 or rs4236. Fernández-Cadenas and colleagues recently reported that rs1800801 was associated with first-year recurrent ischemic stroke in a Spanish cohort in the Genotyping Recurrence Risk of Stroke (GRECOS) study [9]. Interestingly, the G allele was the risk allele and the A allele was protective. Their derivation cohort consisted of 1,494 white patients enrolled in 23 Spanish hospitals, and the results were replicated in a Spanish cohort of 1,305 patients. However, the authors did not find a significant association of rs1800801 with recurrent ischemic stroke in a North American cohort of 1,683 patients who were also enrolled in the Vitamin Intervention for Stroke Prevention (VISP) trial [23]. The GRECOS authors raised the possibility that vitamin intervention in the VISP trial, genetic differences between North American and Spanish Caucasian populations, and/or exclusion criteria in the VISP trial could account for the discord between the Spanish and North American cohorts [9]. The findings of the current study in a Caucasian North American cohort are also discordant with those of Fernández-Cadenas et al. We identified the A allele of rs1800801 as the risk allele. This is in line with the findings, discussed above, that the A allele is the risk allele in cerebrovascular disease.
Whether genetic information adds significant value to the prediction of recurrent stroke has recently gained increasing attention. Achterberg et al. found that genetic information did not improve a risk stratification model for recurrence of vascular events, including stroke, after cerebral ischemia. The AUC increased negligibly from 0.65 (95% CI 0.54-0.65) to 0.66 (95% CI 0.54-0.66) when genetic information was added [8]. In the current study, prediction of recurrent ischemic stroke was likewise not improved significantly by adding rs1800801 genotype. The AUC of the prediction model including the genetic status of rs1800801 was 0.744, compared with 0.740 for the prediction model without genetic status. Stroke recurrence is clearly multifactorial, and the extent to which this or other SNPs, alone or in combination, have an impact is currently unknown. Comparison of the two AUCs illustrates the limitations of genetic factors when analyzed in combination with clinical characteristics. Thus, additional investigation is required to determine whether genetic differences, including MGP SNP rs1800801, contribute to the risk of recurrent stroke in acute ischemic stroke patients and whether they can improve accuracy when integrated into clinical prediction models. To date, the role of MGP SNP rs1800801 remains inconclusive. With respect to prediction of recurrent ischemic stroke, it remains to be determined whether MGP SNP rs1800801 is a valid biomarker or surrogate marker.

Limitations
Data collection and analysis were performed retrospectively and, as such, are subject to incomplete datasets. Association of a SNP does not provide insight into whether the given SNP is in linkage disequilibrium with another SNP or SNPs that account for the findings. We focused on rs1800801 and the first-year risk of recurrent stroke based on the findings by Fernández-Cadenas et al. [9]. A recurrent stroke within the first year was observed in 7.4% of our cohort. This is in line with other studies [2,9,24]. Whether setting the cutoff for early recurrence of stroke at one year is justified will require further exploration. The purpose of the present study was to validate the work of Fernández-Cadenas et al., in which only recurrences within the first year were assessed [9]. The distribution of TOAST stroke subtypes and other clinical characteristics is comparable to that of other large stroke studies [15,25]. Additionally, the MyCode stroke database is not subject to selection bias [26]. Some patients may have had a recurrent stroke managed outside of our healthcare system during the follow-up period or suffered another stroke without further treatment.

Conclusions
We found that the AA genotype of MGP SNP rs1800801 is associated with recurrent stroke within the first year after ischemic stroke in North American Caucasians. Whether incorporation of the genetic status at this locus can contribute to a more accurate prediction of recurrent ischemic strokes in subgroups of Caucasians or other populations remains to be determined.