Prediction of recurrent stroke among ischemic stroke patients with atrial fibrillation: Development and validation of a risk score model

Background There is currently no validated risk prediction model for recurrent events among patients with acute ischemic stroke (AIS) and atrial fibrillation (AF). Considering that the application of conventional risk scores has contextual limitations, new strategies are needed to develop such a model. Here, we set out to develop and validate a comprehensive risk prediction model for stroke recurrence in AIS patients with AF.
Methods Data on AIS patients with AF were collected from multicenter registries in South Korea and Japan. A developmental dataset was constructed with 5648 registered cases from both countries for the period 2011‒2014. An external validation dataset was also created, consisting of Korean AIS subjects with AF registered between 2015 and 2018. Event outcomes were collected for 1 year after the index stroke. A multivariable prediction model was developed using the Fine–Gray subdistribution hazard model with non-stroke mortality as a competing risk. The model incorporated 21 clinical variables and was further validated, calibrated, and revised using the external validation dataset.
Results The developmental dataset consisted of 4483 Korean and 1165 Japanese patients (mean age, 74.3 ± 10.2 years; male 53%); 338 patients (6%) had recurrent stroke and 903 (16%) died. The clinical profiles of the external validation set (n = 3668) were comparable to those of the developmental dataset. The c-statistic of the final model was 0.68 (95% confidence interval, 0.66‒0.71). The developed prediction model did not show better discriminative ability for predicting stroke recurrence than the conventional risk prediction tools (CHADS2, CHA2DS2-VASc, and ATRIA).
Conclusions Neither conventional risk stratification tools nor our newly developed comprehensive prediction model using available clinical factors seemed to be suitable for identifying patients at high risk of recurrent ischemic stroke among AIS patients with AF in this modern direct oral anticoagulant era. Detailed individual information, including imaging, may be warranted to build a more robust and precise risk prediction model for stroke survivors with AF.


Introduction
Atrial fibrillation (AF) is a well-known risk factor for systemic embolic events, including ischemic stroke [1]. Nonvalvular AF independently increases the risk of stroke by almost five-fold across all age groups [2]. The excess event rate of stroke due to AF was estimated to be 10.4/1000 person-years in middle-aged and 18.3/1000 person-years in older individuals in a Japanese cohort study [3]. Anticoagulation with warfarin or direct oral anticoagulants (DOACs) has been proven to reduce the risk of recurrent stroke and systemic embolization [4-7]. However, considering the potential risk of bleeding complications, it is necessary to weigh the benefits and risks of anticoagulation before initiating treatment.
Various risk stratification tools to predict stroke in non-valvular AF patients have been developed and are widely used in clinical practice; these include the CHADS2, CHA2DS2-VASc, and ATRIA scores [8-10]. However, these scores have limited applicability in treatment decisions for patients with acute ischemic stroke (AIS) and AF. The score schemas were developed from community-based cohorts; thus, stroke survivors were rare in the developmental datasets of these tools. Furthermore, the mainstay treatment for secondary prevention at the time when these scores were developed was vitamin K antagonists; hence, the validity of their usage in the modern DOAC era is questionable. Recent advances in electronic health record systems and stroke imaging make it possible to obtain the ample information that is required to choose an antithrombotic strategy in patients with AIS. AF patients may suffer ischemic stroke despite antithrombotic medication, and such cases have an elevated risk of recurrent stroke [11,12]. Moreover, the risks of both ischemic and hemorrhagic strokes were numerically higher in the Asian population [13-16], which was not adequately represented in the developmental datasets of the existing risk stratification tools.
Because conventional risk scores are limited by the clinical context and the datasets in which they were developed, a new development strategy may be required to build a stroke recurrence prediction tool for AIS patients with AF. In this study, we developed and validated a comprehensive risk prediction model for recurrent strokes using prospective stroke registries from South Korea and Japan.

Study subjects and clinical data collection
This study was a retrospective analysis of prospectively collected databases from multicenter registries in South Korea and Japan. AIS patients with documented non-valvular AF who were hospitalized between 2011 and 2014 were identified from the Clinical Research Collaboration for Stroke in Korea (CRCS-K; n = 4844) and the Stroke Acute Management with Urgent Risk-factor Assessment and Improvement (SAMURAI)-NVAF study (n = 1192) [17,18]. Among the 6036 collected patients, 388 patients were excluded due to in-hospital death (n = 385) or a lack of outcome information (n = 3). A total of 5648 patients were thus included in the developmental dataset. The external validation dataset comprised 3668 AIS patients with non-valvular AF who were hospitalized between 2015 and 2018 and were registered in the CRCS-K. The developmental and external validation datasets were mutually exclusive (Fig 1). The data dictionaries and elements were harmonized to generate a comparable and interchangeable common dataset using the CRCS-K and SAMURAI-NVAF databases. The common dataset included demographic data, baseline clinical profiles, stroke information, laboratory information, in-hospital treatments, discharge medications, and outcome data. Functional outcomes were modified Rankin Scale scores at 3 months and at 1 year after the index stroke. Recurrent stroke, myocardial infarction, and death for up to 1 year were collected as event outcomes. All the information recorded in the source databases was retrieved to construct a common dataset. All the study participants or their next of kin had given written consent to participate in the CRCS-K or SAMURAI-NVAF studies. The local institutional review boards (IRBs) of all participating centers approved the original CRCS-K and SAMURAI-NVAF studies. Secondary use of the registry data and additional review of medical records for the current study were approved by the IRB of Seoul National University Bundang Hospital [B-1705/396-306].
The source data could not be made publicly available due to legal constraints, specifically the Personal Information Protection Act (2014). No explicit informed consent for public archiving of the pseudonymized source data has been obtained, in which case local regulations preclude public archiving of the data. The pseudonymized data that support the findings of this study are available from the corresponding author, Dr. Hee-Joon Bae, or the IRB of Seoul National University Bundang Hospital (82, Gumi-ro 173 Beon-gil, Bundang-gu, Seongnam-si, Gyeonggi-do 13605, South Korea; https://msri.snubh.org) upon reasonable request, subsequent approval from the local IRB, and completion of a legal data sharing agreement.

Development and validation of the prediction model for recurrent stroke
A risk prediction model for recurrent stroke was developed and validated according to published guidelines [19,20]. Potential predictors for recurrent stroke were retrieved from the developmental dataset. Candidate variables were selected based on published evidence, clinical experience, and the availability of data elements. Variables related to antithrombotic medications at discharge (anticoagulants and antiplatelet agents) were included in the models. Selected variables were checked for missing data, multicollinearity, influential observations, and goodness-of-fit in the models. For explanatory variables whose relationships with the outcome variable (logarithm of time to recurrent stroke) were nonlinear, an appropriate transformation was chosen based on Akaike's information criterion to maximize the predictability of the model. Finally, a multivariable model incorporating significant interaction terms between predictor variables was developed using the Fine-Gray subdistribution hazard model. Non-stroke mortality was considered a competing event (n = 903).
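To illustrate why non-stroke death is handled as a competing event rather than as censoring, the following is a minimal sketch of the nonparametric cumulative incidence estimator that underlies competing-risk analysis. It is not the Fine-Gray regression used in this study (which was fitted in SAS); the function name, the event coding (0 = censored, 1 = recurrent stroke, 2 = non-stroke death), and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence of `cause` under competing risks.

    times  : follow-up time for each patient
    events : 0 = censored, 1 = recurrent stroke, 2 = non-stroke death
             (this coding is an assumption of the sketch)
    Returns a list of (time, cumulative incidence) at each event time.
    """
    by_time = {}
    for t, e in zip(times, events):
        by_time.setdefault(t, Counter())[e] += 1

    n_at_risk = len(times)
    surv = 1.0  # probability of being free of ANY event just before time t
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_cause = counts[cause]
        d_any = sum(c for e, c in counts.items() if e != 0)
        if d_any > 0:
            # Only patients still event-free can contribute a new event.
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= sum(counts.values())
    return curve


# Hypothetical toy cohort of five patients.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
competing = cumulative_incidence(toy_times, toy_events)
# Recoding deaths as censored reproduces the naive 1 - Kaplan-Meier estimate.
naive = cumulative_incidence(toy_times, [e if e == 1 else 0 for e in toy_events])
```

On this toy cohort the competing-risk estimate at the last event time is 0.5, whereas treating the death as censoring gives 0.6, which shows how ignoring the competing event inflates the apparent risk of recurrence.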
For internal validation, regression parameter estimates were re-estimated with the bootstrapping method, in which the whole dataset was resampled with replacement 999 times [21]. Measures used to examine the model's predictive performance were Harrell's c-statistic for discrimination ability, Nagelkerke's R² for variation explained, and a discrimination slope for agreement between predicted and observed probabilities.
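As a rough sketch of the two ideas in this paragraph, the code below computes Harrell's c-statistic for right-censored data and a percentile bootstrap interval from 999 resamples drawn with replacement. It is a simplified illustration, not the study's procedure: the study bootstrapped the regression parameter estimates, and this sketch ignores competing risks. The function names and toy data are hypothetical.

```python
import random


def harrell_c(times, events, risk):
    """Harrell's c: share of usable pairs in which the patient with the
    earlier observed event also has the higher predicted risk."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable


def bootstrap_c(times, events, risk, n_boot=999, seed=0):
    """Percentile 95% interval for the c-statistic from n_boot resamples."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        r = [risk[k] for k in idx]
        try:
            stats.append(harrell_c(t, e, r))
        except ZeroDivisionError:
            continue  # resample contained no usable pair
    stats.sort()
    m = len(stats)
    return stats[int(0.025 * m)], stats[min(m - 1, int(0.975 * m))]


# Hypothetical toy data: predicted risk falls as the event time increases.
toy_t = [1, 2, 3, 4]
toy_e = [1, 1, 1, 0]
toy_r = [0.9, 0.7, 0.5, 0.1]
c_value = harrell_c(toy_t, toy_e, toy_r)
c_interval = bootstrap_c(toy_t, toy_e, toy_r)
```

A c-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the final model's 0.68 should be read.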
External validation was performed to calibrate and revise the regression coefficients of the developed model, using an independently collected dataset of the 3668 AIS patients with AF. The overall slope was calibrated by refitting a null model using the linear predictors of the developed model as an offset variable. Next, for each of the variables with p-values less than 0.5, the regression parameter was revised according to the method described previously [20]. To examine the performance of the final prediction model, the model's predicted risks were categorized into deciles, and their percent prediction for recurrent stroke was compared with the event proportions according to the conventional CHADS2, CHA2DS2-VASc, and ATRIA scores in both the developmental and external validation datasets. Because information on proteinuria was lacking in the developmental and external validation datasets, it was randomly imputed from a Bernoulli (p = 0.5) distribution.
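The decile comparison and the random imputation described above can be sketched as follows. This is an illustrative simplification under stated assumptions: the observed proportion is a crude event fraction that ignores censoring and competing risks, and both function names and the toy inputs are hypothetical rather than taken from the study's SAS code.

```python
import random


def decile_calibration(pred, outcome, n_groups=10):
    """Group patients by predicted risk and return, for each group,
    (mean predicted risk, observed event proportion, group size)."""
    order = sorted(range(len(pred)), key=lambda k: pred[k])
    n = len(order)
    rows = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not members:
            continue
        mean_pred = sum(pred[k] for k in members) / len(members)
        observed = sum(outcome[k] for k in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows


def impute_bernoulli(values, p=0.5, seed=0):
    """Replace missing entries (None) with random 0/1 draws, P(1) = p."""
    rng = random.Random(seed)
    return [v if v is not None else int(rng.random() < p) for v in values]


# Hypothetical toy inputs: 100 patients, events only in the top risk decile.
toy_pred = [i / 100 for i in range(100)]
toy_outcome = [1 if i >= 90 else 0 for i in range(100)]
rows = decile_calibration(toy_pred, toy_outcome)

# Hypothetical proteinuria indicator with two missing values.
filled = impute_bernoulli([1, None, 0, None])
```

A well-calibrated, well-discriminating model would show observed proportions rising steadily across the rows and tracking the mean predicted risk. Note that a Bernoulli (p = 0.5) draw carries no patient-level information, so an imputed variable of this kind cannot contribute to discrimination.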
Baseline characteristics were summarized as frequencies (percentages), mean ± standard deviation (SD), or median (interquartile range, IQR), as appropriate. Differences between categories were evaluated using the chi-squared test or Student's t-test. A Fine-Gray subdistribution hazard model was used to estimate the cumulative incidence of recurrent stroke. The significance level was set at a two-tailed p-value of < 0.05. All statistical analyses were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC, USA).

Results
The 5648 AIS patients with documented non-valvular AF who were included in the developmental dataset were recruited from South Korea (n = 4483; 79%) and Japan (n = 1165; 21%). Mean age was 74 years, and 53.1% were male (Table 1). Vascular risk factors including hypertension, diabetes, dyslipidemia, and smoking were prevalent in this population. Before the index stroke, 35.5% of patients used antiplatelet medications and 20% used anticoagulants. Intravenous thrombolysis was administered in 16.2% of patients and endovascular recanalization treatment was administered in more than 10%. During the first year after the index stroke, 6.0% of patients had a recurrent stroke and 16.0% died.
The clinical profiles of the included subjects differed by country. Japanese patients were more likely to be older, on anticoagulants prior to the index stroke, and less likely to be smokers. In the developmental dataset, a prescription of DOACs at the time of discharge was more frequent in the SAMURAI-NVAF dataset (40%) than in the CRCS-K database (3%). The median values of the CHADS2, CHA2DS2-VASc, and ATRIA scores were 4 (IQR, 3-4), 5 (4-6), and 9 (9-10), respectively, and their distributions were numerically comparable between the two countries (S1 and S2 Figs).
The external validation dataset was constructed using AIS patients with non-valvular AF who were hospitalized and registered in the CRCS-K between 2015 and 2018. Their clinical profiles were generally comparable to those of Korean patients in the developmental dataset. However, the frequency of DOAC prescription at discharge had increased to 49% in the external validation dataset (S1 Table).
We constructed a clinical prediction model for the risk of recurrent stroke among stroke survivors with non-valvular AF, treating all-cause mortality as a competing risk. The prediction model, incorporating the appropriately transformed variables and significant interaction terms, underwent internal validation through 999 bootstrap samples. We performed further calibration and revision of the model through the external validation dataset (Fig 2; S1 File). The final model is presented in Table 2.
The final model showed modest performance in predicting recurrent stroke, as assessed by the c-index (0.68 [95% CI, 0.66-0.71]). Table 3 and Fig 3 show the event rates of recurrent stroke for each of the currently available risk scores as well as the deciles of our prediction model, based on the developmental dataset of 5648 AIS patients with non-valvular AF. Neither the conventional risk scores nor our newly developed model showed a consistent dose-dependent relationship. The observed incidence rates of recurrent stroke according to the CHADS2 and CHA2DS2-VASc scores dropped at the penultimate strata (5 points for the CHADS2 score and 7 points for the CHA2DS2-VASc score). The incidence rate according to the ATRIA scores decreased in the higher score range. Our newly developed prediction model showed limited differentiation in the lower score range.

Discussion
We built a clinical prediction model for recurrent stroke based on the 5648 AIS patients with non-valvular AF recruited from South Korea and Japan, using detailed clinical information that was easily collected during clinical practice. The model was further calibrated and revised using an external validation dataset. The comprehensive final model showed only modest utility in individual risk stratification, with performance similar to that of the conventional risk scores, such as the CHADS2, CHA2DS2-VASc, and ATRIA scores.
The model development and validation process adhered to the academic standards and published guidelines [19,20]. Patient data were collected from two countries with different epidemiological characteristics and healthcare systems, to ensure the generalizability of the  [24].
Currently, there is no validated risk prediction tool for recurrent stroke among patients with non-valvular AF who survive the acute phase of ischemic stroke. Instead, the conventional risk scores are utilized even in patients who have already scored at least two points on the CHADS2 and CHA2DS2-VASc risk schemas, and for whom, therefore, anticoagulation is automatically indicated. Considering the low risk of bleeding while on DOACs, it may be feasible to combine DOACs with antiplatelet therapy for patients with non-valvular AF and concomitant advanced atherosclerosis [25]. There is an urgent need to develop a new risk stratification tool for AIS patients with AF. However, the discrimination ability of both the newly developed model and conventional risk scores was unsatisfactory over the entire risk score strata. Overall, the risk prediction tools, including our newly developed model, showed modest performance in predicting recurrent stroke (Fig 3). There are irregularities that limit the applicability of these tools in clinical practice.
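The point that every stroke survivor already scores at least two points follows directly from how the conventional scores are defined. The sketch below encodes the standard published point assignments for CHADS2 and CHA2DS2-VASc; the function and argument names are hypothetical, and the ATRIA score is omitted because its age-by-history weighting is more involved.

```python
def chads2(chf, hypertension, age, diabetes, prior_stroke_tia):
    """CHADS2: 1 point each for congestive heart failure, hypertension,
    age >= 75 years, and diabetes; 2 points for prior stroke or TIA."""
    return (int(chf) + int(hypertension) + int(age >= 75)
            + int(diabetes) + 2 * int(prior_stroke_tia))


def cha2ds2_vasc(chf, hypertension, age, diabetes, prior_stroke_tia,
                 vascular_disease, female):
    """CHA2DS2-VASc: as CHADS2, but age >= 75 scores 2 and age 65-74
    scores 1, with 1 point each for vascular disease and female sex."""
    age_points = 2 if age >= 75 else 1 if age >= 65 else 0
    return (int(chf) + int(hypertension) + age_points + int(diabetes)
            + 2 * int(prior_stroke_tia) + int(vascular_disease) + int(female))


# A hypothetical 60-year-old man whose only risk item is the index stroke.
minimal_chads2 = chads2(False, False, 60, False, True)
minimal_vasc = cha2ds2_vasc(False, False, 60, False, True, False, False)
```

Because the index stroke alone contributes two points, the lower strata of both scores are empty in an AIS cohort, which compresses the range over which these tools can separate patients.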
This unsatisfactory performance of the conventional tools and our newly developed model may be due to the following factors. First, ischemic stroke is a heterogeneous entity [26]. AF contributes strongly to the occurrence of ischemic stroke, but atherothrombosis or lacunar stroke may also occur in a patient with AF. Additional biomarkers are needed to identify high-risk individuals more accurately [27]. Second, systemic embolism related to AF occurs subsequent to thrombus generation in the cardiac chamber. To measure the individual risk of ischemic events more precisely, it would be necessary to consider the function of the cardiac chamber, atrial myopathy, duration and type of AF, serum and imaging biomarkers, genetic predisposition, and so forth [28-31]. Third, improved medication adherence following the introduction of DOACs may mitigate the differential risk of recurrent stroke over the whole range of the risk scores [32]. Lastly, the number of recurrent stroke events in the developmental and validation datasets was relatively small, so statistical power was not optimal.
A few points need further clarification. Our study was based on Korean and Japanese stroke populations; therefore, the generalizability of the study results to other races is uncertain. Japanese stroke patients have been reported to have a lower long-term mortality than that reported elsewhere in previous studies [33]. Because the final model incorporated non-stroke mortality as a competing risk for recurrent stroke, checking the reclassification performance of the conventional scores according to mortality was not feasible. Applying conventional risk scores to the AIS population was beyond the purpose for which these scores were developed. The number of DOAC prescriptions rapidly increased in Korea after 2015, when DOACs were added to the reimbursement list. Thus, the proportion of DOAC usage increased to 49% in the external validation set from 10% in the development set.
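As a consistency check on the 10% figure for the development set, the pooled proportion can be approximated from the rounded country-specific discharge rates reported in the Results (40% of 1165 SAMURAI-NVAF patients and 3% of 4483 CRCS-K patients). Because the inputs are rounded percentages, the result is only approximate.

```latex
\[
\frac{0.40 \times 1165 + 0.03 \times 4483}{5648}
\approx \frac{466 + 134}{5648}
\approx 0.106
\]
```

This is in line with the roughly 10% DOAC use quoted for the development set, against 49% in the external validation set.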

Conclusion
We developed and validated a comprehensive risk prediction model for recurrent stroke in East Asian patients with ischemic stroke and non-valvular AF. The newly developed model showed only modest utility in discriminating the risk of recurrence, similar to the conventional risk scores (ATRIA, CHADS2, and CHA2DS2-VASc scores). Detailed individual information, including brain imaging, serum biomarkers, and cardiac function, may be needed to build a more robust and precise risk prediction model.

S1 Table. Clinical profile of the external validation dataset. (PDF)
S1 File. Development, recalibration, and revision processes of the prediction model. The model's predictability was compared to the observed probability of recurrent stroke stratified by the deciles of predicted risks. (PDF)