Abstract
Background
Stroke remains a global health challenge, with high rates of mortality and rehospitalization placing significant demands on healthcare systems. Identifying the factors that determine post-hospitalization outcomes can improve resource allocation. Traditional statistical prediction models are suboptimal for the analysis of complex, multi-dimensional datasets. The objective of our study is to define an extended list of clinical and non-clinical predictors, which we believe can be achieved using Explainable Machine Learning (XML) models as an expansion of conventional methods.
Methods
We evaluated 11 established XML models that represent key ML methodologies to predict 90-day outcomes, namely mortality and rehospitalization, among stroke survivors. The study population comprised 1,300 post-stroke individuals enrolled in the Transitions of Care Stroke Disparities Study (TCSD-S) (NIH/NIMH, NCT03452813) between June 2018 and October 2022. Data on care after transition were sourced from participating comprehensive stroke centers and from the Florida Stroke Registry. The analysis incorporated clinical factors (e.g., age, stroke severity, comorbidities) and non-clinical factors, including Social Drivers of Health (SDOH). A combined ranking approach, using Weighted Importance Scores and Frequency Counts, identified significant predictors across models.
Results
The resulting list of selected predictors included both established clinical factors and non-clinical factors, which enhanced prediction accuracy. Out of 38 identified predictors, 20 are non-clinical variables reflecting the importance of SDOH, environmental factors, and behavioral modifications beyond traditional clinical predictors of death/readmission. A secondary analysis restricted to ischemic stroke patients (n = 1,038) yielded virtually identical predictive performance, indicating robustness of the model within this subgroup.
Conclusions
Integrating SDOH, environmental factors, and behavioral modifications alongside traditional clinical predictors enhances the predictive accuracy of post-stroke outcome models. This underscores the critical role of addressing socioeconomic disparities during post-stroke transitions of care. Moreover, XML models’ ability to identify predictors spanning clinical and non-clinical domains suggests their potential to guide recovery. The resulting predictors are crucial for post-hospital care and hold strong potential for identifying individuals at risk of stroke, making them potentially significant across pre-stroke and hospitalization stages.
Citation: Veledar E, Zhou L, Veledar O, Gardener H, Gutierrez CM, Brown SC, et al. (2025) Identifying determinants of readmission and death post-stroke using explainable machine learning. PLoS One 20(9): e0332371. https://doi.org/10.1371/journal.pone.0332371
Editor: Noah Hammarlund, University of Florida, UNITED STATES OF AMERICA
Received: March 22, 2025; Accepted: August 29, 2025; Published: September 18, 2025
Copyright: © 2025 Veledar et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The FSR uses data from Get with The Guidelines-Stroke® (GWTG-S). Due to data-sharing agreements, researchers must apply for access at http://www.heart.org/qualityresearch, with proposals reviewed by GWTG-S and FSR committees upon reasonable request.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Stroke challenges healthcare systems across the entire continuum of care. As the third leading cause of mortality and a major source of disability worldwide [1], stroke is a major chronic non-communicable condition associated with reduced population well-being [2]. Assessing stroke risk factors is a crucial initial step in decreasing its burden. Predictive models can be applied at three stages: primary prevention, response to acute treatment, and post-stroke outcomes including post-hospitalization recurrence and readmissions. The complexity of stroke adjudication, diagnosis, and treatment algorithms often restricts model effectiveness, demanding phase-specific approaches. Current meta-analyses reveal that no high-quality predictive or explanatory models exclusively addressing stroke risk exist [3,4]. Existing models often rely on non-modifiable risk factors and borrow heavily from frameworks developed for composite cardiovascular diseases. For the general population, all measures of model quality remain low, highlighting significant gaps in predictive accuracy and reliability. While recent studies have not substantially improved prediction intervals, they have introduced greater inclusivity by incorporating non-traditional biomarkers, such as genetic and polygenic scores, and by leveraging novel ML methods, marking incremental progress in methodology and scope. Our focus is on developing and improving models specifically for rehospitalization and death.
In the first (primary prevention) phase, predictive models aim to assess stroke risk within the general population but face substantial limitations, with most models achieving ROC scores below 0.7. These models rely on cardiovascular markers such as the AHA Life’s Essential 8 [5], which lack stroke specificity. Adding stroke-specific indicators can enhance prediction accuracy and is typical for high-risk individuals, but it is impractical for general screening [6].
The second phase focuses on patients receiving hospital care for acute stroke. Here, predictive power increases with detailed clinical data from advanced diagnostics, such as brain and head/neck vessel imaging, enabling the use of ML models to refine treatment. However, the wide variation in stroke subtypes and in the specific predictors for each subtype limits the development of a universal model, necessitating predictor-specific models for different stroke etiologies.
In the post-hospitalization stroke period, stroke survivors face heightened risks of complications, recurrence, and rehospitalization. Previous studies have reported that the proportion of stroke survivors discharged home ranges from 43% to 92% [7–12], and these patients often require ongoing care, rehabilitation, and adherence to treatment plans. A vast number of studies have focused on analyzing individual medical characteristics of stroke patients. Our research takes a different approach, concentrating on additional variables that capture broader, non-individual characteristics. This study aims to predict 90-day post-discharge outcomes from acute stroke hospitalization by identifying key predictors of mortality and readmission. By leveraging data from multiple sources, the study employs XML models, which integrate clinical factors, SDOH, and health behaviors. Using multiple XML models, we aim to isolate robust, modifiable predictors of 90-day outcomes of death or readmission post-stroke. Our approach underscores XML’s ability to capture complex, non-linear relationships, enhancing interpretability and trust in predictive insights to inform post-stroke care. Therefore, our main research question is how XML models improve the prediction of 90-day mortality and readmission outcomes following acute stroke hospitalization, even when the sample size is small and the design is unbalanced.
2. Materials and methods
We adopt the Weighted Importance Score and Frequency Count (WISFC) framework for aggregating feature‐importance across multiple models, as recently detailed [13]. This approach systematically combines magnitude and consistency information from diverse explainers to produce a robust, consensus‐based ranking of predictors.
2.1. Study population
Our study population comprised 1,300 post-stroke individuals enrolled in the Transitions of Care Stroke Disparities Study (TCSD-S) (NIH/NIMH, NCT03452813) between June 2018 and October 2022, supplemented with information from participating comprehensive stroke centers and the Florida Stroke Registry [14]. The TCSD-S is an observational study of stroke survivors investigating factors that influence successful transitions post-stroke. Eligible participants were adults aged 18 years and older, diagnosed with either acute ischemic stroke or intracerebral hemorrhage, and discharged to either a rehabilitation facility or directly home. Although ischemic and intracerebral hemorrhage (ICH) strokes differ pathophysiologically, in our cohort both groups had relatively low Modified Rankin Scale scores at discharge and showed no statistically significant difference in 90-day outcome rates. Given this similarity, and to preserve sample size and generalisability, both stroke types were retained in the analysis. Hospital care coordinators conducted interviews at discharge to assess SDOH. Follow-up structured interviews at 30 and 90 days post-discharge tracked readmissions, emergency room visits, discharge education, and behavioral modifications. The TCSD-S enrollee data from 10 collaborating comprehensive stroke centers were linked to the American Heart Association’s Get With The Guidelines–Stroke (GWTG-S) Database, providing additional clinical and demographic information such as race/ethnicity, sex, age, insurance status, stroke severity, and pre-stroke health conditions. Patients who died within 30 days or had specific post-discharge dispositions (e.g., transferred to hospice care) were excluded from the study.
Only patients who provided written informed consent participated in the TCSD-S. The study protocol was approved by the Institutional Review Board of the University of Miami (Protocol ID20170892).
2.2. Study variables
The primary outcome variable was a composite of hospital readmission or mortality within 90 days after discharge from the index hospitalization. Since approximately half of all readmissions within one year after a stroke occur within the first 90 days [15–17], the 90-day window is a clinically relevant prediction horizon.
Independent variables were extracted from three sources: the TCSD-S, which provides insight into individual SDOH; the GWTG-Stroke database, which provides clinical and index stroke characteristics; and the Social Contextual Indicators for Research and Analysis (SCIERA) database [18], which provides neighborhood-level SDOH factors (including detailed zip code characteristics) and offers a broader context for individual patient data.
We included nine comprehensive arrays of potential predictors encompassing patients’ sociodemographic characteristics, individual social determinants of health, health characteristics before the stroke, index stroke characteristics, acute care variables, hospital characteristics, discharge status, neighborhood socio-economic and demographic characteristics, and measures of Adequate Transition of Care [19]:
- Patients’ Sociodemographic Characteristics: Age, sex, race/ethnicity, and type of insurance.
- Individual SDOH: Language spoken at home, education status, prior work status, difficulty paying for medical care, difficulty paying for necessities (e.g., food, electricity), living arrangements, and social support [20].
- Health Characteristics Before the Stroke: Prior ambulation status, diabetes, hypertension, dyslipidemia, atrial fibrillation, peripheral vascular disease, coronary artery disease, previous stroke/Transient Ischemic Attack (TIA), carotid stenosis, chronic renal insufficiency, sleep apnea, depression, smoking status, drug/alcohol use, overweight/obesity, and history of deep vein thrombosis/pulmonary embolism (DVT/PE).
- Index Stroke Characteristics: Stroke etiology, National Institutes of Health Stroke Scale (NIHSS) score, final clinical diagnosis related to stroke, presence of weakness/paresis, altered level of consciousness, aphasia/language disturbance, other neurological signs/symptoms, mode of arrival (e.g., ambulance, self-transport), and length of hospital stay.
- Acute Care Variables: Intravenous thrombolysis, endovascular therapy, discharge medications including antiplatelet agents, anticoagulants, and statins, provision of defect-free care, and presence of active bacterial or viral infection at admission or during hospitalization.
- Hospital Characteristics: Stroke center type (e.g., primary stroke center, comprehensive stroke center), teaching hospital status, and the number of beds.
- Discharge Status: Modified Rankin Scale (mRS) score at discharge, discharge disposition (e.g., home, rehabilitation facility), and ambulation status at discharge.
- Neighborhood (Zip-Code) Characteristics: Percentages of Hispanic, Non-Hispanic Black, and Non-Hispanic White residents; percentage of residents with a bachelor’s degree; median household income; percentage of high school completion; rural-urban commuting area codes; total housing population; the ratio of owner-occupied housing; housing density [21]; percentage below the poverty line; unemployment rate; densities of tobacco, alcohol, restaurant, fast food, grocery, pharmacy, and gym businesses; and counts of hospitals, clinics, and rehabilitation centers in the area. Crowding, in our context, is a measure of population density within a zip code. It is calculated by dividing the total housing population by the adjusted number of housing units, where the adjusted number is the count of housing units divided by the median number of rooms per unit. This metric provides insight into how densely populated a given area is relative to its housing capacity. Total Housing Population is defined as the total number of people living in owner- or renter-occupied housing in a zip code. Social Support Size is a measure of how many persons a patient knows and feels close to (i.e., persons they can talk to or reach out to if needed), with 3 categories: “None”, “1-2”, “3 or more”.
- Adequate Transition of Care (ATOC): Adherence to at least 75% of applicable transition of care behavior modifications, including filling medications and taking them as prescribed 90–100% of the time; attending outpatient therapy, or having attended and completed therapy, if prescribed; seeing a medical provider after discharge; stopping the use of tobacco, alcohol, marijuana, and other drugs; exercising by regular walking on a treadmill or outside, or regular exercise other than walking; and modifying diet per recommendation after stroke.
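The crowding index and the 75% ATOC threshold described in the list above are simple arithmetic rules. A minimal Python sketch follows, for illustration only: the function and argument names are hypothetical, the handling of patients with no applicable behaviors is an assumption, and the study itself reports using R.

```python
def crowding_index(total_housing_population, housing_units, median_rooms_per_unit):
    """Crowding = housing population / (housing units / median rooms per unit)."""
    adjusted_units = housing_units / median_rooms_per_unit
    return total_housing_population / adjusted_units


def adequate_transition_of_care(behaviors):
    """Return True when at least 75% of the applicable behaviors were met.

    `behaviors` maps a behavior name to True (met), False (not met), or
    None (not applicable to this patient, so left out of the denominator).
    """
    applicable = [met for met in behaviors.values() if met is not None]
    if not applicable:
        return False  # assumption: no applicable behaviors counts as not adequate
    return sum(applicable) / len(applicable) >= 0.75


# Made-up zip code: 12,000 residents, 5,000 housing units, median of 5 rooms per unit
example_crowding = crowding_index(12000, 5000, 5)

# Made-up patient: 3 of 4 applicable behaviors met, therapy was never prescribed
example_atoc = adequate_transition_of_care({
    "medications_taken": True,
    "saw_provider": True,
    "diet_modified": True,
    "stopped_tobacco": False,
    "outpatient_therapy": None,
})
```

In this made-up example the adjusted number of housing units is 5,000 / 5 = 1,000, so the crowding index is 12, and the patient meets exactly 75% of the applicable behaviors.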
The methods for obtaining outcome data and adjudicating endpoints have been detailed in a prior publication [19].
2.3. XML techniques and implementation
To predict post-discharge stroke outcomes, we applied a comprehensive array of XML techniques aimed at identifying the most important predictive variables from a pool of diverse clinical, socioeconomic, and behavioral data detailed above. This section outlines the methodology used to handle the 73 potential variables, the application of 11 different XML models, and the process of ranking variable importance.
Step 1: Identifying and Selecting Variables: The 73 study variables were selected as they represent strong candidates based on prior evidence [14,19,22] and their availability in the dataset. As detailed in the previous section, they were drawn directly from multiple data sources and span key domains such as sociodemographic characteristics, social determinants of health, pre-stroke health status, stroke-specific and acute care data, hospital and discharge characteristics, neighborhood factors, and transition of care adequacy. By incorporating these diverse variables, we ensured a comprehensive basis for modeling post-stroke recovery and readmission risk.
Step 2: Applying XML Methods: To process these 73 variables, we employed 11 different XML models. We use logistic regression as a universal benchmark, alongside 10 additional models encompassing regression-based, tree-based, and distance-based algorithms. These 10 models represent state-of-the-art approaches widely accepted in ML, ensuring comprehensive coverage of predictive methodologies. Each XML model was trained under 10-fold cross-validation to optimize hyperparameters and estimate out-of-sample performance. In this procedure, the data set is randomly divided into 10 parts; nine parts are used for training and the tenth for testing, and the procedure is repeated 10 times, reserving a different tenth for testing each time [23]. All continuous predictors in the training set were mean-centered and scaled to unit variance; the same centering and scaling parameters were subsequently applied to the test set to avoid data leakage. No additional feature-engineering transformations were performed beyond this normalization step.
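The fold construction and the leakage-free scaling described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' R code: the helper names, the random seed, and the simulated age values are all hypothetical.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle row indices once and deal them into k disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def standardize_fold(values, train_idx, test_idx):
    """Center and scale one continuous predictor using training rows only."""
    train = [values[i] for i in train_idx]
    mean = statistics.fmean(train)
    sd = statistics.stdev(train)  # sample SD; assumes the predictor is not constant
    train_z = [(values[i] - mean) / sd for i in train_idx]
    # The test rows reuse the training mean and SD, so no test information leaks in
    test_z = [(values[i] - mean) / sd for i in test_idx]
    return train_z, test_z


# Simulated continuous predictor (for example, age) for 50 hypothetical patients
rng = random.Random(42)
ages = [rng.gauss(64, 14) for _ in range(50)]

folds = k_fold_indices(len(ages), k=10)
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(ages)) if i not in held_out]
    train_z, test_z = standardize_fold(ages, train_idx, test_idx)
    # A model would be fitted on train_z and evaluated on test_z at this point
```

Computing the mean and standard deviation inside each fold, rather than once on the full data set, is what keeps the held-out tenth from influencing the scaling.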
All candidate predictors were assessed for completeness prior to model development. Patients with missing values in any of the predictors listed in Table 1 were excluded from the analysis; no imputation methods were applied. Of the original cohort of 1,200 stroke patients, 73 (6.1%) were removed due to incomplete records, yielding a final sample of 1,127 patients with fully observed data.
To assess whether model performance differed when restricted to ischemic stroke, we performed a secondary analysis in the ischemic stroke subset. We excluded the 89 hemorrhagic cases from our final cohort of 1,127 patients, yielding 1,038 ischemic strokes (92.1%). Predictors specific to stroke subtype (variables 16 and 38) were omitted. Discrimination (AUC) and calibration metrics remained virtually unchanged compared to the primary analysis.
All models were tuned or set under a uniform 10-fold cross-validation framework. For the penalized regressions (LASSO, Ridge, and Elastic Net with α = 0.5), we selected the smallest λ that minimized cross-validated error in each fold. Principal Component Regression was run with a fixed 10 components (k = 10). The k-Nearest Neighbors model evaluated 20 candidate k values via cross-validation. Support Vector Machines employed the radial basis kernel with default cost and γ settings. Random Forests were grown with 500 trees and the default mtry. Gradient Boosting was configured with 500 trees, interaction depth = 3, and shrinkage = 0.1. Finally, XGBoost models were trained for up to 100 rounds (with early stopping after 10 rounds), max_depth = 3, η = 0.3, subsample = 0.8, and colsample_bytree = 0.7.
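For reference, the settings listed above can be gathered into a single structure. The sketch below transcribes only the values stated in the text; the key names are hypothetical and do not correspond to the argument names of any particular R or Python package.

```python
# Settings transcribed from the paragraph above; key names are hypothetical.
CV_FOLDS = 10

MODEL_SETTINGS = {
    "logistic_regression": {},  # benchmark; no tuning parameters are described
    "lasso": {"lambda": "selected by cross-validated error"},
    "ridge": {"lambda": "selected by cross-validated error"},
    "elastic_net": {"alpha": 0.5, "lambda": "selected by cross-validated error"},
    "pcr": {"n_components": 10},
    "knn": {"n_candidate_k": 20},
    "svm": {"kernel": "radial", "cost": "default", "gamma": "default"},
    "random_forest": {"n_trees": 500, "mtry": "default"},
    "gradient_boosting": {"n_trees": 500, "interaction_depth": 3, "shrinkage": 0.1},
    "xgboost": {
        "max_rounds": 100,
        "early_stopping_rounds": 10,
        "max_depth": 3,
        "eta": 0.3,
        "subsample": 0.8,
        "colsample_bytree": 0.7,
    },
}
```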
These models were selected to capture both linear and non-linear relationships within the data and to explore varying perspectives on how the variables influence patient outcomes. The following methods were used:
- Regression-based algorithms: Logistic regression, LASSO, ridge regression, elastic net [23].
- Distance-based algorithms: Support Vector Machines (SVM), K-Nearest Neighbors (KNN) [24].
- Tree-based algorithms: Random Forest, gradient boosting, XGBoost [25].
Each of these models assessed the data from different angles, providing a broad spectrum of variable importance evaluations. However, each of our 11 XML models carries its own assumptions and potential weaknesses, e.g., penalized regressions presume linear relationships and may miss nonlinear effects, PCR depends on variance‐based dimensionality reduction that can overlook low‐variance but predictive features, k-NN is sensitive to feature scaling and local density, SVMs hinge on kernel choice and can struggle with large datasets, and tree-based learners (RF, GBM, XGBoost) may overemphasize variables with many split points or categorical levels. By design, however, our framework remains agnostic to any single method’s limitation: every algorithm contributes its 12 most influential predictors under the same 10-fold CV regime, and we synthesize these into a unified importance profile. Even if one model misses interactions, overfits, or is skewed by sparse data, its particular weaknesses get smoothed out when we combine results from all models.
For regression-based algorithms, variable importance is determined by the magnitude of the coefficients, where larger absolute values of the t-statistics or standardized coefficients indicate greater importance. For tree-based algorithms, importance is assessed using model-specific criteria. In the case of Random Forest, variable importance is measured by the Mean Decrease in the Gini Index, which quantifies how much each variable reduces Gini impurity across all trees. In Gradient Boosting, importance is calculated based on the relative influence of each variable, determined by the reduction in the loss function whenever the variable is used for splitting. In XGBoost, importance is assessed based on the improvement in accuracy contributed by each variable when used for splitting branches. For KNN, variable importance is estimated using a proxy method that examines the impact of variable scaling. For SVM, importance is derived from the coefficients of the hyperplane, with larger absolute values indicating higher importance.
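To make the Random Forest criterion concrete, the sketch below computes the decrease in Gini impurity for a single split on a binary outcome. It is a simplified illustration with made-up labels: a real forest accumulates such decreases over every split that uses a variable and over all trees, and the function names here are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a set of binary (0/1) outcome labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def gini_decrease(parent, left, right):
    """Impurity reduction from splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Made-up node: 3 events among 8 patients, split on a hypothetical predictor
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left, right = [1, 1, 1, 0], [0, 0, 0, 0]
decrease = gini_decrease(parent, left, right)
```

Here the parent impurity is 2 × 3/8 × 5/8 = 0.46875, the weighted child impurity is 0.1875, and the split therefore reduces impurity by 0.28125.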
Step 3: Target Outcomes: The goal of applying these methods was to determine the importance of each variable in predicting the two main outcomes: death and hospital readmission.
Step 4: Quantifying Model Performance: We assessed the performance of each XML model using standard evaluation metrics, including accuracy, area under the ROC curve (AUC), and logistic loss. This quantifiable performance estimate ensured that each model’s predictive capacity was considered when interpreting the variable importance rankings.
Step 5: Variable Ranking for Each Method: For each XML method, we produced a ranked list of 73 variables and selected the top 12 variables based on their importance in predicting the target outcomes. The decision to include 12 variables accounts for the traditional “events per variable” rule of logistic regression, which typically supports one significant variable per 20 outcomes. To accommodate variations across different model types and ensure comprehensive coverage of the strongest predictors, we extended the list to include two additional variables. Since each method approaches the data differently, the top 12 variables differed across models. To capture these differences, a comparison of the variable rankings across all methods was created, showing how each method prioritized different variables.
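The arithmetic behind the choice of 12 variables can be retraced as follows. The sketch assumes that the event count is the 206 readmissions or deaths reported in the Results for the full cohort; the text does not state which count was used, so treat that value as an assumption.

```python
EVENTS = 206              # 90-day readmissions or deaths reported for the full cohort
EVENTS_PER_VARIABLE = 20  # one variable per 20 outcome events
EXTRA_VARIABLES = 2       # the two additional variables described above

# Whole number of variables supported by the events-per-variable rule
base_variables = EVENTS // EVENTS_PER_VARIABLE
top_n = base_variables + EXTRA_VARIABLES

print(base_variables, top_n)  # prints "10 12"
```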
Step 6: Aggregating and Filtering Variables: Based on the rankings generated by each method, some variables appeared in the top 12 across multiple methods, while others were highly ranked by only a few methods. Variables that did not appear in the top 12 for any method were excluded from further analysis. This process left us with 38 key variables for continued evaluation.
Step 7: Weighting Variable Importance: To ensure fair comparison and robust ranking, we ranked variables in two ways. The first reflects the number of models in which the variable appears among the top 12 variables. The second assigns point weights ranging from 12 to 1 to each variable, according to its rank order within each model, ensuring that variables consistently ranked highly across multiple methods are given more importance.
Step 8: Final Ranking of Variables: The outcome of this process was a ranked list of variables that were considered most important in predicting post-discharge stroke outcomes. These top variables provide insights for healthcare providers to focus on during the 90-day post-stroke recovery period.
2.4. Evaluation metrics and statistical analysis
To assess the performance of the predictive models, several evaluation metrics were calculated: Accuracy [26], C-statistic (Area Under the ROC Curve) [27], Squared-Error Loss (Mean Squared Error, MSE) [28], Logistic Loss (Log Loss or Cross-Entropy Loss) [29] and Misclassification Rate [26]. These metrics provide a comprehensive assessment of each model’s strengths and weaknesses in terms of discrimination and calibration.
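For readers less familiar with these metrics, the sketch below computes them for a binary outcome from predicted probabilities. It is a plain-Python illustration rather than the authors' implementation: the function name, the 0.5 classification threshold, the clipping constant, and the four example predictions are assumptions.

```python
import math


def evaluation_metrics(y_true, y_prob, threshold=0.5, eps=1e-15):
    """Compute the five metrics for 0/1 outcomes and predicted probabilities."""
    pairs = list(zip(y_true, y_prob))
    n = len(pairs)
    mse = sum((p - y) ** 2 for y, p in pairs) / n
    # Clip probabilities away from 0 and 1 so the logarithm stays finite
    clipped = [(y, min(max(p, eps), 1 - eps)) for y, p in pairs]
    log_loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in clipped
    ) / n
    errors = sum(int(p >= threshold) != y for y, p in pairs)
    misclassification = errors / n
    # C-statistic: share of (event, non-event) pairs ranked correctly; ties count half
    events = [p for y, p in pairs if y == 1]
    non_events = [p for y, p in pairs if y == 0]
    concordant = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in events
        for b in non_events
    )
    c_statistic = concordant / (len(events) * len(non_events))
    return {
        "accuracy": 1 - misclassification,
        "c_statistic": c_statistic,
        "mse": mse,
        "log_loss": log_loss,
        "misclassification": misclassification,
    }


# Four made-up patients: two with the outcome, two without
metrics = evaluation_metrics([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
```

The c-statistic depends only on how the predictions are ordered, whereas squared-error and logistic loss also penalize probabilities that are far from the observed outcome.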
All statistical methods (including ML) suffer from class imbalance [30], which occurs when the distribution of classes in a dataset is highly skewed, leading to challenges in model training and performance as the algorithm may favor the majority class while neglecting the minority class. Applying multiple ML models and the creation of the combined predictor list offers a solution to class imbalance.
Continuous variables were summarized using means and standard deviations or medians with interquartile ranges, depending on their distribution. Categorical variables were presented as counts and percentages. Comparisons between patient groups were conducted using Chi-square tests for categorical variables and t-tests or Mann–Whitney U tests for continuous variables, as appropriate. Paired t-tests were used to compare the averages of estimates from different dataset pairs. All statistical analyses were performed using R version 4.4.
To avoid relying on any one method’s own biases, we employed 11 distinct XML models (regression-based models, tree-based models, and distance-based models), each of which independently identified its top 12 predictors and evaluated performance across multiple metrics via 10-fold cross-validation. This consensus‐driven approach ensures that our findings are not dominated by one method’s assumptions or parameter settings, but instead reflect variables that consistently emerge as influential across all algorithms. The result is a more stable and broadly applicable set of drivers for the outcome.
2.5. Creation of the combined predictor list
The list creation starts with the extraction of the top 12 predictors from each model. The aggregation step combines the predictors from all models into a single list and ranks them in two ways: Weighted Importance Scores, which assign weights based on predictor rank and sum them across models, and Frequency Counts, which track the number of appearances in the top 12 predictors across all models. Final rankings are based on these two scores and highlight the factors linked to the combined outcome of 90-day readmission or mortality.
The resulting sorted list of the 38 strongest predictors combines the top 12 of the 73 ranked variables from each of the 11 models. The list captures insights from diverse modeling perspectives. This approach mitigates the lack of established consensus on the optimal method for combining predictors from different models to generate a comprehensive list [31,32]. The resulting aggregation method combines Weighted Importance Scores and Frequency Counts to ensure a balanced and robust selection process.
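The aggregation described in this section can be sketched as follows. This is a minimal illustration, assuming that the weights run linearly from the list length down to 1 as stated in Step 7. The function name, the tie-breaking rule, and the toy variable lists (shortened to three entries per model) are hypothetical.

```python
from collections import defaultdict


def aggregate_rankings(top_lists, top_n=12):
    """Combine per-model top-N lists into frequency counts and weighted scores.

    A variable ranked 1st in a model earns `top_n` points, the 2nd earns
    `top_n - 1`, and so on down to 1 point for the last listed position.
    """
    counts = defaultdict(int)
    scores = defaultdict(int)
    for ranked in top_lists.values():
        for position, variable in enumerate(ranked[:top_n]):
            counts[variable] += 1
            scores[variable] += top_n - position
    # Assumed tie-break: the other score, then alphabetical order
    by_count = sorted(counts, key=lambda v: (-counts[v], -scores[v], v))
    by_score = sorted(scores, key=lambda v: (-scores[v], -counts[v], v))
    return dict(counts), dict(scores), by_count, by_score


# Toy example with three models and top_n = 3 (the study uses 11 models and 12)
toy_lists = {
    "lasso": ["age", "nihss", "insurance"],
    "random_forest": ["nihss", "age", "crowding"],
    "xgboost": ["age", "crowding", "social_support"],
}
counts, scores, by_count, by_score = aggregate_rankings(toy_lists, top_n=3)
```

In the toy example, "age" appears in all three lists and collects 3 + 2 + 3 = 8 points, so it leads both rankings; variables that never reach any model's top list are absent from the output, mirroring the filtering in Step 6.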
3. Results
A total of 1,300 stroke survivors were included in the analysis. The cohort had a mean age of 63.8 years (SD = 13.9), and 56% of the participants were male. The ethnic composition included 22% Hispanic, 23% Non-Hispanic Black, and 51% Non-Hispanic White individuals. Ischemic strokes accounted for 92% of cases, while 8% were intracerebral hemorrhages (ICH). The overall 90-day readmission or mortality rate was 15.8%, affecting 206 of the 1,300 patients. Table 1 summarises the demographic and stroke-related characteristics of the study population, stratified by the presence of the 90-day outcome (readmission or mortality).
Out of the 1,300 patients included in the study, 206 experienced an adverse 90-day outcome (either readmission or death) while 1,094 remained event-free. Specifically, 22 patients died and 187 were readmitted to the hospital within 90 days post-discharge.
To ensure the integrity of our analyses, we excluded the 73 patients with missing values and retained only the 1,227 individuals with complete data for all covariates. All subsequent analyses were therefore based on these 1,227 patients.
Table 2 summarizes the 10-fold cross-validation model fit statistics, including c-statistic, squared-error loss, logistic loss, misclassification rate, precision, recall and F1 across all models. The best values for each metric are highlighted in bold. Ridge Regression demonstrated the highest discriminative ability with a c-statistic of 0.660, while LASSO and Elastic Net excelled in capturing outcome probabilities, achieving a logistic loss of 0.414. Principal Component Regression (PCR) showed its effectiveness in minimizing classification errors with the lowest misclassification rate of 0.156.
3.1. Variable importance
Table 3 presents the top 12 variables for each XML model, while Table 4 aggregates the importance of these variables across all models. These results highlight a combination of clinical and non-clinical factors that significantly impact patient outcomes after a stroke. Each model evaluates the contribution of different variables to predicting 90-day outcomes, resulting in rankings for each algorithm.
3.2. Summary of results
Table 4 provides a summary of the top variables across all 11 models. It ranks these 38 variables based on the number of models in which they appeared (count) among the top 12 predictors (R1). The cumulative ranking approach (sum) identifies variables consistently significant across models (R2), emphasizing their critical role in predicting 90-day outcomes for stroke survivors. The emerging socio-contextual risk determinants are shown in bold.
3.3. Sensitivity analysis
To assess the robustness of our findings, we repeated all analyses under three alternative scenarios: predicting 90-day readmission only; repeating the combined endpoint analysis in the ischaemic stroke subgroup; and predicting readmission only in the ischaemic subgroup. In all cases, the top predictors remained consistent with those reported above, and model discrimination and calibration measures showed minimal deviation from the primary results (data not shown, but further discussed).
4. Discussion
Our study provides a unique look at estimating death and readmission risks in the first 90 days post-stroke, going beyond the typical clinical risk factors, and integrating 3 large data sets to provide a comprehensive view of the impact of clinical, individual social, and community-level risk factors on transitions of care. When only regression analysis models are used to model risks, they are limited to using a small number of predictors that operate in the same way on everyone, and uniformly throughout their range [33].
For studies where the goal is to predict the occurrence of an outcome and not measure the association between specific risk factors and an event in a clinically interpretable way, traditional regression models can be modified or abandoned in favour of models that produce a more flexible relationship among the predictor variables and the outcome [33]. These methods have similar goals to regression-based approaches but different motivating philosophies (Fig 1). They do not require pre-specification of a model structure but instead search for the optimal fit within certain constraints (specific to the individual algorithm). This can result in a better final prediction model at the sacrifice of interpretability of how risk factors relate to the outcome of interest.
Predictive models can serve both explanatory and predictive purposes [34]. In our work, we focus on the former: identifying a concise set of explanatory variables without delving into the precise functional relationships among them. While this strategy does not eliminate multicollinearity, multicollinearity has only a marginal effect on the robustness of our variable selection and on the overall interpretability of the resulting model.
By adding 10 XML models, our study advances the understanding of 90-day post-stroke outcomes by integrating clinical and non-clinical factors, particularly SDOH, into predictive models. By focusing on stroke-specific and non-classic predictors, we deliver a unified, systematic, and targeted perspective, advancing the precision and relevance of predictive modeling in stroke research. The stroke-specific list and findings underscore the value of XML models in identifying and ranking predictors that consistently influence readmission and mortality, thereby bridging gaps in traditional risk prediction frameworks. These XML models also assist medical practitioners in evaluating the validity of a diagnosis while ensuring the output is interpretable and comprehensible, even for patients [35]. By focusing on modifiable factors during the critical post-hospitalization period, this study contributes actionable insights for improving patient outcomes and resource allocation.
4.1. Clinical and non-clinical predictors
Key predictors of post-stroke outcomes, such as stroke severity, age, comorbidities (e.g., coronary artery disease) [36], active infection [37,38] and length of hospital stay, reinforce established clinical knowledge [39]. SDOH variables, such as socioeconomic status, education level, and neighborhood characteristics, emerged as significant contributors to model accuracy, aligning with prior findings [40], which demonstrated that incorporating SDOH improved mortality prediction for non-Hispanic Black patients with heart failure (HF). Similarly, our study found that integrating non-clinical factors enhanced predictive power, supporting the need for tailored, equitable healthcare strategies.
We also confirm the previously indicated importance of social determinants, such as (Medicare) insurance [17,41], housing status, social support, educational level, and employment status [42]. In particular, the role of an individual’s social support network has been shown to significantly impact functional recovery after stroke, especially within the first three months post-discharge. Future research should explore whether interventions outside the medical context, such as transitional care resources, community health workers, or rehabilitation strategies that strengthen social networks and incorporate group activities with family or caregivers, could mitigate the effects of limited social networks and enhance recovery outcomes for stroke survivors [42].
4.2. Machine learning methodologies
We employed 11 XML models, including tree-based algorithms such as Random Forest and XGBoost, which excelled at capturing non-linear relationships within the data. The use of interpretable models provides transparency, fostering clinician trust and aligning with the recommendations in [43] and [44] for adopting trustworthy and explainable AI in healthcare. ML methods have also been applied during patients’ initial hospitalization to identify those at high risk for readmission or mortality [45]. By synthesizing results through Weighted Importance Scores and Frequency Counts, this study provides a robust cumulative ranking of variable importance, extending beyond the limitations of traditional regression methods. This approach is novel, as the prevailing practice involves using multiple ML exploratory models in Phase 2 to predict clinical outcomes [46], without combining or synthesizing findings across different models. Our approach creates two distinct rankings, addressing the common challenge of interpreting results when multiple models are employed, a problem that remains unresolved in existing methodologies, some of which are shown in [47].
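The idea of synthesizing variable importance across models into two rankings can be illustrated with a minimal sketch. The exact formulas are not given in this section, so the code below makes two assumptions: the Weighted Importance Score is taken to be each model's normalized importance multiplied by a per-model weight (here a performance value such as AUC) and summed across models, and the Frequency Count is taken to be the number of models in which a variable appears among the top k. The function name, variable names, importances, and weights are all hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def combined_ranking(importances, model_weights, top_k=3):
    """Aggregate per-model variable importances into two rankings.

    importances:   {model_name: {variable: raw importance >= 0}}
    model_weights: {model_name: weight}; assumed here to be a
                   performance metric such as AUC (an assumption).
    """
    weighted = defaultdict(float)
    frequency = defaultdict(int)
    for model, scores in importances.items():
        total = sum(scores.values())
        if total == 0:
            continue  # this model carries no importance information
        # Normalize so that each model's importances sum to 1,
        # then weight by the model-level weight.
        for var, raw in scores.items():
            weighted[var] += model_weights[model] * raw / total
        # Count how often a variable is among this model's top_k.
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        for var in top:
            frequency[var] += 1
    # Ties are broken alphabetically so the output is deterministic.
    by_score = sorted(weighted, key=lambda v: (-weighted[v], v))
    by_count = sorted(weighted, key=lambda v: (-frequency[v], v))
    return by_score, by_count, dict(weighted), dict(frequency)


# Hypothetical inputs for illustration only (not study results).
importances = {
    "random_forest": {"nihss": 0.40, "age": 0.30, "housing": 0.20, "insurance": 0.10},
    "xgboost": {"nihss": 0.50, "age": 0.10, "housing": 0.30, "insurance": 0.10},
    "lasso": {"nihss": 0.20, "age": 0.20, "housing": 0.10, "insurance": 0.50},
}
model_weights = {"random_forest": 0.70, "xgboost": 0.72, "lasso": 0.68}

by_score, by_count, weighted, frequency = combined_ranking(
    importances, model_weights, top_k=2
)
print("Ranking by weighted score:", by_score)
print("Ranking by frequency count:", by_count)
```

Keeping the two rankings separate, rather than merging them into one number, shows whether a variable ranks highly because a few well-performing models rely on it heavily or because many models select it consistently.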
Over the past few years, AI has become an industry disruptor: as we demonstrate, it can refine and streamline data prediction more accurately. Future AI processes may also create patient-specific, guideline-based treatment plans; however, these projects must be carried out securely and ethically [48].
While we did not apply explicit resampling or class-weighting to rebalance the training data, our assessment framework inherently counteracts imbalance in two ways. First, by integrating predictions from eleven diverse XML models, each with different inductive biases and error-minimization strategies, we avoid overreliance on any single algorithm’s tendency to favor the majority class. Second, we evaluated every model using multiple performance metrics that capture different aspects of predictive quality: the c-Statistic (AUC) for discrimination, squared-error loss and logistic loss for probabilistic calibration, and misclassification rate for raw accuracy. By insisting that strong performance be sustained across all four metrics, we ensure that a model cannot succeed simply by predicting the majority class. Together, these two safeguards provide robust protection against the distortions introduced by class imbalance.
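To make the role of the four metrics concrete, the following sketch computes them in plain Python for a toy cohort with the same 15.8% event rate. It is not the evaluation code used in the study: the function names, the 0.5 classification threshold, and the synthetic cohort of 1,000 patients are assumptions made for illustration. The example scores every patient with the same constant risk, which looks acceptable on misclassification rate alone but shows no discrimination on the c-statistic.

```python
import math


def c_statistic(y_true, y_prob):
    """Chance that a random event case is scored above a random non-event
    case; ties count as one half (equivalent to ROC-AUC)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def squared_error_loss(y_true, y_prob):
    """Mean squared difference between predicted risk and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def logistic_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def misclassification_rate(y_true, y_prob, threshold=0.5):
    """Share of cases whose thresholded prediction differs from the outcome."""
    wrong = sum((p >= threshold) != (y == 1) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


# Synthetic cohort (assumption): 1,000 patients, 158 adverse outcomes.
y_true = [1] * 158 + [0] * 842
# A "model" that assigns everyone the overall event rate.
constant_risk = [0.158] * len(y_true)

metrics = {
    "c_statistic": c_statistic(y_true, constant_risk),
    "squared_error": squared_error_loss(y_true, constant_risk),
    "logistic_loss": logistic_loss(y_true, constant_risk),
    "misclassification": misclassification_rate(y_true, constant_risk),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the misclassification rate equals the event rate (0.158), while the c-statistic is 0.5, which is why requiring good performance on all four metrics screens out models that simply predict the majority class.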
4.3. The role of SDOH and disparities in stroke outcomes
In the absence of relevant socio-environmental variables, XML models show only a modest improvement in prediction performance and explainability compared to traditional modeling techniques, even in cases where complete relevant electronic records are available (e.g., Sweden) [49]. Our findings emphasize the impact of SDOH on stroke outcomes, echoing evidence from the [40] study, which highlighted disparities in SDOH-related predictive improvements between Black and non-Black patients. While clinical predictors remain essential, the integration of neighborhood-level SDOH variables, such as access to healthcare and socioeconomic stability, offers a more holistic perspective on patient risk, paving the way for tailored and community-level interventions to address health disparities. This represents a significant advancement in the field. While many studies focus on identifying disparities and social needs, the critical next step is to develop and implement solutions that address these needs effectively.
Future public health policies need to address the heightened mortality and readmission rates among stroke survivors from vulnerable areas, which highlight the need to enhance care transitions and support for underserved patient populations.
4.4. Clinical implications
Prior analyses evaluating long-term stroke outcomes were based on electronic medical records [50] that do not include detailed SDOH data. Our study considers socio-environmental variables that significantly impact stroke survivors’ lives. By identifying high-impact variables, healthcare providers can prioritize high-risk patients, personalize preventive and rehabilitation strategies, and optimize resource allocation. The integration of XML models into clinical workflows may offer real-time risk assessment capabilities, enabling more proactive care management. These models also empower clinicians with interpretable outputs, bridging the gap between advanced analytics and practical decision-making, as recommended by other experts [44].
4.5. Strengths and limitations
Our use of a diverse stroke cohort from the state of Florida and a two-layer nested cross-validation approach ensures methodological robustness. However, the reliance on regional data may limit generalizability, and less than 10% of the cohort had acute intracerebral hemorrhage, so the results are not generalizable to this population. Although our cohort is geographically confined to Florida and predominantly ischemic stroke (92.1%), the secondary ischemic-only analysis demonstrated equivalent performance, supporting the model’s robustness in this common stroke subtype. Limited individual-level SDOH data constrains the depth of insights into disparities, and challenges such as class imbalance and missing data persist. Validation in diverse populations and integration of real-time data could enhance predictive accuracy and enable dynamic care adjustments.
We conducted sensitivity analyses restricted to readmission only and to the ischemic stroke subgroup. These confirmed that our main results, which center on the prominence of SDOH factors, were robust to endpoint definition and stroke subtype. We have not included all of these additional tables here, but we plan to include them in a dedicated follow-up study.
Future research should aim to validate these models across diverse populations and incorporate real-time, individual-level data to enhance prediction accuracy and equity in care delivery.
While there are signs that similar predictors are vital across geographies, indicators of stroke outcomes and care may vary significantly between high-income and low-income countries due to differences in acute stroke management, post-stroke care, rehabilitation practices, and methodological approaches, making direct comparisons challenging [51].
Despite the methodological robustness of the XML models, the study has limitations. First, the sample size (n = 1,300) is moderate relative to the number of potential predictors. This may constrain the detection of weaker associations and limit model generalizability. Second, the data are derived from stroke centers within a single U.S. state (Florida), which may introduce regional biases related to healthcare delivery, socioeconomic conditions, and demographic composition. As a result, the findings may not fully generalize to populations with different healthcare systems, geographic contexts, or stroke care practices. Future research should aim to validate these results using larger, multi-regional datasets that reflect broader clinical and social diversity.
Our cohort’s adverse outcome rate of 15.8% reflects the typical imbalance seen in post‐stroke prognostic studies. All stroke‐prediction models face this challenge, and few have achieved a ROC‐AUC substantially above 0.7 without risking overfitting. Since our goal was to evaluate an array of modeling methods rather than to fine‐tune a single algorithm, we did not apply formal imbalance‐correction techniques such as SMOTE or differential class‐weighting. Consequently, precision and recall (particularly for the minority class) are inherently constrained by the low event rate. Future work may explore resampling strategies or threshold adjustment to optimize these metrics, but such approaches must be balanced against potential bias and reduced generalizability.
4.6. Can modeling help post-acute stroke care?
The 15.8% adverse-outcome rate in our cohort reflects the low-prevalence challenge faced by all stroke-prognosis models. Historically, few stroke-prediction models have achieved an ROC-AUC substantially above 0.7 without risking overfitting. Because our primary aim was to benchmark multiple modeling frameworks rather than to fine-tune one classifier, we did not implement SMOTE, differential class weighting, or other resampling strategies. As a result, precision and recall (particularly for the minority adverse-outcome class) remain limited, with recall values around 0.50. While these metrics are critical for clinical decision-making, improving them may require imbalance-focused methods that carry their own risk of bias and reduced generalizability.
In this paper, our focus is on answering three fundamental questions for post‑acute stroke care coordination: what interventions to implement, who should carry them out, and in which settings. Once a patient returns to their changed pre‑stroke environment, most medical and clinical factors are already addressed through established protocols; what remains poorly understood is the broader exposome, the totality of exposures and conditions a patient experienced prior to and after their stroke. We approximate these influences using variables commonly grouped under SDOH. Notably, 19 of the 38 top‑ranked predictors in our analysis represent modifiable or preventable socioeconomic factors. We do not claim to establish causal hierarchies or precise effect sizes for these variables, but we believe they highlight critical opportunities for improving post‑acute care coordination, an aspect often neglected in current practice.
We calculated precision, recall, and F1-score for each model and reported them in Table 3. Across models, recall values hover around 0.50, indicating that approximately half of the true adverse outcomes are identified (a characteristic consequence of the 15.8% event rate and consistent with prior stroke-prediction studies). Although these threshold-dependent metrics are critical for understanding the performance of the minority class, the low prevalence constrains improvements without risking overfitting. As our goal was to compare modeling frameworks rather than to implement imbalance-specific corrections, we have not pursued additional resampling or class-weighting strategies here.
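The threshold dependence of precision, recall, and F1-score can be sketched with a small example. The 20 patients, their predicted risks, and the two thresholds below are hypothetical and are not taken from Table 3; the sketch only shows how lowering the classification threshold tends to raise recall for the minority class while lowering precision.

```python
def precision_recall_f1(y_true, y_prob, threshold):
    """Precision, recall, and F1 for the positive (adverse-outcome) class."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: 4 adverse outcomes among 20 patients.
y_true = [1] * 4 + [0] * 16
y_prob = [0.8, 0.6, 0.4, 0.2, 0.7, 0.5, 0.45, 0.38] + [0.1] * 12

at_050 = precision_recall_f1(y_true, y_prob, threshold=0.5)
at_035 = precision_recall_f1(y_true, y_prob, threshold=0.35)

for label, (prec, rec, f1) in (("0.50", at_050), ("0.35", at_035)):
    print(f"threshold {label}: precision={prec:.3f} recall={rec:.3f} F1={f1:.3f}")
```

In this toy data, moving the threshold from 0.50 to 0.35 identifies one more true adverse outcome but adds two false alarms, which illustrates the trade-off that any threshold adjustment would have to weigh.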
4.7. Value of social determinants of health in post‐discharge prediction
As shown in Table 4, we find that 19 (marked in bold) of the 38 candidate predictors are SDOH variables. While clinical measures such as NIHSS score and comorbidity burden are paramount during the acute and early recovery phases, their prognostic influence attenuates once patients leave the hospital environment. In contrast, modifiable SDOH factors (housing security, access to care, social support networks, and neighborhood socioeconomic status) emerge as increasingly salient drivers of long‐term outcomes. In our permutation‐importance analysis, multiple SDOH predictors placed within the top ten features for several modeling approaches, underscoring their empirical contribution to discrimination and calibration. This finding aligns with prior studies from our group demonstrating SDOH’s role in shaping post‐stroke functional recovery and readmission risk [52,53]. Collecting SDOH data may impose additional burden, but these variables capture patient exposome dimensions essential for accurate, real‐world prognostication.
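For readers unfamiliar with permutation importance, a minimal sketch of the general technique follows. It is not the study's implementation: the rule-based "model", the synthetic two-feature dataset, the use of accuracy as the score, and the function names are assumptions chosen to keep the example self-contained. The idea is to shuffle one predictor at a time and record how much the model's score drops.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, n_features, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving all other columns untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (assumption): feature 0 determines the outcome,
# feature 1 is unrelated noise.
n = 200
rows = [(i / n, (i * 7 % n) / n) for i in range(n)]
labels = [int(r[0] > 0.5) for r in rows]


def model(row):
    """Hypothetical fitted model that relies on feature 0 only."""
    return int(row[0] > 0.5)


importances = permutation_importance(model, rows, labels, n_features=2)
print("Importance of feature 0:", round(importances[0], 3))
print("Importance of feature 1:", round(importances[1], 3))
```

Because the toy model ignores feature 1, shuffling it leaves accuracy unchanged and its importance is zero, whereas shuffling feature 0 degrades accuracy substantially. A predictor that ranks highly under this procedure is one the fitted model actually depends on.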
5. Conclusion
This study identifies several predictors of 90-day readmission or mortality in stroke survivors using explainable Machine Learning (XML) models, which outperform conventional regression techniques. These findings enable more accurate identification of high-risk patients and support tailored post-discharge care. With over seven million U.S. stroke survivors, leveraging predictive tools and expanding data sources can make a significant impact on the national stroke burden by refining interventions, improving outcomes, and inspiring innovation in care strategies. Machine learning models hold the potential to better harness large datasets across all phases, paving the way for unified, phase-spanning models that guide stroke risk reduction and improved recovery from onset to long-term care.
By applying eleven different modeling frameworks, we aim to present multiple perspectives on post-acute stroke care rather than to rank any one algorithm as inherently superior. These models shed light on diverse aspects of patient life and treatment pathways, particularly environmental exposures and Social Determinants of Health (SDOH), that remain poorly understood. In a field urgently in need of better predictive tools and interpretability, our work demonstrates how ML methods can offer valuable preliminary insights to guide future research and intervention design when no prevailing explanatory frameworks yet exist.
With this knowledge, we plan to create a solution that can be integrated into Electronic Health Record (EHR) systems to harness diverse data sources (e.g., hospital EHRs, the Florida Stroke Registry, and socioeconomic datasets) and build a dynamic risk scoring system and Digital Twin model that predicts stroke outcomes and guides treatment decisions in real-time. By integrating established prognostic scores with guideline-driven management trees and leveraging real-world data, the future solution will deliver personalized, evidence-based care.
References
- 1. Institute for Health Metrics and Evaluation (IHME). Global Burden of Disease 2021: Findings from the GBD 2021 Study. Seattle: Institute for Health Metrics and Evaluation (IHME). 2024.
- 2. Norrving B, Davis SM, Feigin VL, Mensah GA, Sacco RL, Varghese C. Stroke prevention worldwide–what could make it work? Neuroepidemiology. 2015;45:215–20.
- 3. Saxena A, McGranaghan P, Das S, Rubens M, Salami J, Veledar E. Abstract WP242: Modeling Prevention, Prediction or Explanation of Stroke Risk: Insights From Meta-Analysis of Studies With Models Predicting Stroke or Composite Outcomes. Stroke. 2019;50:AWP242–AWP242.
- 4. Saxena A, Zhang Z, Ahmed MA, McGranaghan P, Rubens M, Veledar E. Abstract TMP98: Insights From Meta-analysis Of Studies With Models Predicting Stroke Or Composite Outcomes: A 2021 Study Update. Stroke. 2022;53(Suppl_1).
- 5. Lloyd-Jones DM, Allen NB, Anderson CAM, Black T, Brewer LC, Foraker RE, et al. Life’s Essential 8: Updating and Enhancing the American Heart Association’s Construct of Cardiovascular Health: A Presidential Advisory From the American Heart Association. Circulation. 2022;146(5):e18–43. pmid:35766027
- 6. Islam U, Mehmood G, Al-Atawi AA, Khan F, Alwageed HS, Cascone L. NeuroHealth guardian: A novel hybrid approach for precision brain stroke prediction and healthcare analytics. J Neurosci Methods. 2024;409:110210. pmid:38968974
- 7. Cho J, Place K, Salstrand R, Rahmat M, Mansouri M, Fell N, et al. Developing a Predictive Tool for Hospital Discharge Disposition of Patients Poststroke with 30-Day Readmission Validation. Stroke Res Treat. 2021;2021:5546766. pmid:34457232
- 8. Albert GP, McHugh DC, Roberts DE, Kelly AG, Okwechime R, Holloway RG, et al. Hospital Discharge and Readmissions Before and During the COVID-19 Pandemic for California Acute Stroke Inpatients. J Stroke Cerebrovasc Dis. 2023;32(8):107233. pmid:37364401
- 9. Bettger JP, Thomas L, Liang L. Comparing recovery options for stroke patients. 2019.
- 10. Schrage T, Thomalla G, Härter M, Lebherz L, Appelbohm H, Rimmele DL, et al. Predictors of Discharge Destination After Stroke. Neurorehabil Neural Repair. 2023;37(5):307–15. pmid:37039307
- 11. Stein J, Borg-Jensen P, Sicklick A, Rodstein BM, Hedeman R, Bettger JP, et al. Are Stroke Survivors Discharged to the Recommended Postacute Setting? Arch Phys Med Rehabil. 2020;101(7):1190–8. pmid:32272107
- 12. Clery A, Bhalla A, Bisquera A, Skolarus LE, Marshall I, McKevitt C, et al. Long-Term Trends in Stroke Survivors Discharged to Care Homes: The South London Stroke Register. Stroke. 2020;51(1):179–85. pmid:31690255
- 13. Veledar E, Zhou L, Veledar O, Gardener H, Gutierrez CM, Romano JG, et al. Synthesizing Explainability Across Multiple ML Models for Structured Data. Algorithms. 2025;18(6):368.
- 14. Johnson KH, Gardener H, Gutierrez C, Marulanda E, Campo-Bustillo I, Gordon Perue G, et al. Disparities in transitions of acute stroke care: The transitions of care stroke disparities study methodological report. J Stroke Cerebrovasc Dis. 2023;32(9):107251. pmid:37441890
- 15. National Institutes of Health. MyRIAD: Atrial Cardiopathy and Antithrombotic Drugs in Prevention After Cryptogenic Stroke.
- 16. Benbassat J, Taragin M. Hospital readmissions as a measure of quality of health care: advantages and limitations. Arch Intern Med. 2000;160(8):1074–81. pmid:10789599
- 17. Zhou LW, Lansberg MG, de Havenon A. Rates and reasons for hospital readmission after acute ischemic stroke in a US population-based cohort. PLoS One. 2023;18(8):e0289640. pmid:37535655
- 18. SCIERA Inc. Your smart marketing intelligence partner. https://www.sciera.com. 2021.
- 19. Dong C, Wang K, Di Tullio MR, Gutierrez C, Koch S, García EJ, et al. Disparities and temporal trends in stroke care outcomes in patients with atrial fibrillation: The FLiPER-AF stroke study. Int J Cerebrovasc Dis Stroke. 2019;2:19.
- 20. Griffith DM, Towfighi A, Manson SM, Littlejohn EL, Skolarus LE. Determinants of Inequities in Neurologic Disease, Health, and Well-being: The NINDS Social Determinants of Health Framework. Neurology. 2023;101(7 Suppl 1):S75–81. pmid:37580154
- 21. Badland H, Foster S, Bentley R, Higgs C, Roberts R, Pettit C, et al. Examining associations between area-level spatial measures of housing with selected health and wellbeing behaviours and outcomes in an urban context. Health Place. 2017;43:17–24. pmid:27894015
- 22. Kapral MK, Wang H, Mamdani M, Tu JV. Effect of socioeconomic status on treatment and mortality after stroke. Stroke. 2002;33(1):268–73. pmid:11779921
- 23. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer; 2009.
- 24. Murphy KP. Machine learning: a probabilistic perspective. MIT Press; 2012.
- 25. Molnar C. Interpretable machine learning. Lulu.com. 2020.
- 26. Stehman SV. Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment. 1997;62(1):77–89.
- 27. Harrell FE Jr, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984;3(2):143–52. pmid:6463451
- 28. Wang Z, Bovik AC. Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures. IEEE Signal Process Mag. 2009;26(1):98–117.
- 29. Bishop CM. Pattern Recognition and Machine Learning. New York: Springer. 2006.
- 30. Leevy JL, Khoshgoftaar TM, Bauder RA, Seliya N. A survey on addressing high-class imbalance in big data. J Big Data. 2018;5(1).
- 31. Brankovic A, Cook D, Rahman J, Huang W, Khanna S. Evaluation of popular XAI applied to clinical prediction models: can they be trusted? arXiv preprint. 2023.
- 32. Krishna S, Han T, Gu A, Wu S, Jabbari S, Lakkaraju H. The disagreement problem in explainable machine learning: A practitioner’s perspective. arXiv preprint. 2022.
- 33. Goldstein BA, Navar AM, Carter RE. Moving beyond regression techniques in cardiovascular risk prediction: applying machine learning to address analytic challenges. Eur Heart J. 2017;38(23):1805–14. pmid:27436868
- 34. Veledar E, Veledar O, Gardener H, Rundek T, Garelnabi M. Harnessing Statistical and Machine Learning Approaches to Analyze Oxidized LDL in Clinical Research. Cell Biochem Biophys. 2025. doi:10.1007/s12013-025-01837-9. pmid:40884728
- 35. Tahirovic E, Krivic S. Interpretability and explainability of logistic regression model for breast cancer detection. In: ICAART. 2023. 161–8.
- 36. Rao A, Barrow E, Vuik S, Darzi A, Aylin P. Systematic Review of Hospital Readmissions in Stroke Patients. Stroke Res Treat. 2016;2016:9325368. pmid:27668120
- 37. Boehme AK, Kulick ER, Canning M, Alvord T, Khaksari B, Omran S, et al. Infections Increase the Risk of 30-Day Readmissions Among Stroke Survivors. Stroke. 2018;49(12):2999–3005. pmid:30571394
- 38. Boehme AK, Oka M, Cohen B, Elkind MSV, Larson E, Mathema B. Readmission Rates in Stroke Patients with and without Infections: Incidence and Risk Factors. J Stroke Cerebrovasc Dis. 2022;31(1):106172. pmid:34798436
- 39. Chiou L-J, Lang H-C. Potentially preventable hospital readmissions after patients’ first stroke in Taiwan. Sci Rep. 2022;12(1):3743. pmid:35260680
- 40. Segar MW, Hall JL, Jhund PS, Powell-Wiley TM, Morris AA, Kao D, et al. Machine Learning-Based Models Incorporating Social Determinants of Health vs Traditional Models for Predicting In-Hospital Mortality in Patients With Heart Failure. JAMA Cardiol. 2022;7(8):844–54. pmid:35793094
- 41. Spiegler KM, Irvine H, Torres J, Cardiel M, Ishida K, Lewis A, et al. Characteristics associated with 30-day post-stroke readmission within an academic urban hospital network. J Stroke Cerebrovasc Dis. 2024;33(11):107984. pmid:39216710
- 42. Bishop L, Brown SC, Gardener HE, Bustillo AJ, George DA, Gordon Perue G, et al. The association between social networks and functional recovery after stroke. Int J Stroke. 2025;20(1):95–104. pmid:39215634
- 43. Linardatos P, Papastefanopoulos V, Kotsiantis S. Explainable AI: A Review of Machine Learning Interpretability Methods. Entropy (Basel). 2020;23(1):18. pmid:33375658
- 44. Albahri AS, Duhaim AM, Fadhel MA, Alnoor A, Baqer NS, Alzubaidi L, et al. A systematic review of trustworthy and explainable artificial intelligence in healthcare: Assessment of quality, bias risk, and data fusion. Information Fusion. 2023;96:156–91.
- 45. Hung L-C, Sung S-F, Hu Y-H. A Machine Learning Approach to Predicting Readmission or Mortality in Patients Hospitalized for Stroke or Transient Ischemic Attack. Applied Sci. 2020;10(18):6337.
- 46. Matsumoto K, Nohara Y, Soejima H, Yonehara T, Nakashima N, Kamouchi M. Stroke Prognostic Scores and Data-Driven Prediction of Clinical Outcomes After Acute Ischemic Stroke. Stroke. 2020;51(5):1477–83. pmid:32208843
- 47. Voigtlaender S, Pawelczyk J, Geiger M, Vaios EJ, Karschnia P, Cudkowicz M, et al. Artificial intelligence in neurology: opportunities, challenges, and policy implications. J Neurol. 2024;271(5):2258–73. pmid:38367046
- 48. Saini H, Rose DZ. The Ghost in the Machine: Artificial Intelligence in Neurocardiology Will Advance Stroke Care. 2024.
- 49. Otieno JA, Häggström J, Darehed D, Eriksson M. Developing machine learning models to predict multi-class functional outcomes and death three months after stroke in Sweden. PLoS One. 2024;19(5):e0303287. pmid:38739586
- 50. Bopche R, Gustad LT, Afset JE, Ehrnström B, Damås JK, Nytrø Ø. In-hospital mortality, readmission, and prolonged length of stay risk prediction leveraging historical electronic patient records. JAMIA Open. 2024;7(3):ooae074. pmid:39282081
- 51. Nkoke C, Jingi AM, Noubiap JJ, Nkouonlack C, Njume D, Dzudie A. Readmission and mortality during the first year after an acute stroke: A prospective cohort study from Cameroon. PLoS One. 2024;19(10):e0311893. pmid:39466804
- 52. Schoon BA, Hansen D, Roozenbeek B, Oude Groeniger J, van der Steen W, van der Lugt A, et al. Neighborhood Socioeconomic Status and the Functional Outcome of Patients Treated With Endovascular Thrombectomy for Ischemic Stroke. Neurology. 2025;105(1):e213615. pmid:40513055
- 53. Voura EB, Abdul-Malak Y, Jorgensen TM, Abdul-Malak S. A retrospective analysis of the social determinants of health affecting stroke outcomes in a small hospital situated in a health professional shortage area (HPSA). PLOS Glob Public Health. 2024;4(1):e0001933. pmid:38190408