Predicting need for heart failure advanced therapies using an interpretable tropical geometry-based fuzzy neural network

  • Yufeng Zhang,

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    chloezh@umich.edu

    Affiliation Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America

  • Keith D. Aaronson,

    Roles Conceptualization, Funding acquisition, Investigation, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, Michigan, United States of America

  • Jonathan Gryak,

    Roles Data curation, Project administration, Resources, Software

    Affiliation Department of Computer Science, Queens College, City University of New York, New York, New York, United States of America

  • Emily Wittrup,

    Roles Project administration, Supervision, Writing – review & editing

    Affiliation Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America

  • Cristian Minoccheri,

    Roles Project administration

    Affiliation Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America

  • Jessica R. Golbus,

    Roles Conceptualization, Funding acquisition, Investigation, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, Michigan, United States of America

  • Kayvan Najarian

    Roles Conceptualization, Funding acquisition, Investigation, Project administration, Resources, Supervision

    Affiliations Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America; Michigan Institute for Data Science, University of Michigan, Ann Arbor, Michigan, United States of America; Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, United States of America; Department of Emergency Medicine, University of Michigan, Ann Arbor, Michigan, United States of America

Abstract

Background

Timely referral for advanced therapies (i.e., heart transplantation, left ventricular assist device) is critical for ensuring optimal outcomes for heart failure patients. Using electronic health records, our goal was to use data from a single hospitalization to develop an interpretable clinical decision-making system for predicting the need for advanced therapies at the subsequent hospitalization.

Methods

Michigan Medicine heart failure patients from 2013–2021 with a left ventricular ejection fraction ≤ 35% and at least two heart failure hospitalizations within one year were used to train an interpretable machine learning model constructed using fuzzy logic and tropical geometry. Clinical knowledge was used to initialize the model. The performance and robustness of the model were evaluated with the mean and standard deviation of the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), and the F1 score of the ensemble. We inferred membership functions from the model for continuous clinical variables, extracted decision rules, and then evaluated their relative importance.

Results

The model was trained and validated using data from 557 heart failure hospitalizations from 300 patients, of whom 193 received advanced therapies. The mean (standard deviation) of AUC, AUPRC, and F1 scores of the proposed model initialized with clinical knowledge was 0.747 (0.080), 0.642 (0.080), and 0.569 (0.067), respectively, showing superior predictive performance or increased interpretability over other machine learning methods. The model learned critical risk factors predicting the need for advanced therapies in the subsequent hospitalization. Furthermore, our model displayed transparent rule sets composed of these critical concepts to justify the prediction.

Conclusion

These results demonstrate the ability to successfully predict the need for advanced heart failure therapies by generating transparent and accessible clinical rules, although further research is needed to prospectively validate the risk factors identified by the model.

Introduction

Heart failure (HF) affected 6.5 million adults in the United States in 2017 and is expected to impact 8 million by 2030 [1,2]. While medical and device therapies have improved outcomes over the past four decades, 5-year mortality for heart failure remains high at nearly 50% [1,3]. Amongst patients with HF, approximately 5% annually progress to an advanced disease state also known as Stage D HF [4]. For these patients, heart transplantation (HT) and left ventricular assist devices (LVADs) [5–8] offer the best opportunities for long-term survival with improved quality of life. HF advanced therapies, however, carry risks for adverse events and there exists a supply-demand mismatch for donor hearts, necessitating careful patient selection [9]. Therefore, whether and when to refer a HF patient for advanced therapies relies on timely clinical judgment requiring an effective and efficient method of identifying potential candidates amongst all HF patients.

With the high prevalence of HF in the United States, most patients are cared for by general cardiologists or primary care clinicians who may lack specialized training in HF advanced therapies [10]. However, the capacity of advanced HF specialists to evaluate patients for advanced therapies is finite. There is thus a need for decision support systems that can identify patients with HF in need of advanced therapies to ensure that they are referred to a HF cardiologist at the appropriate time. While several risk models have been developed to assist in risk stratification of patients with advanced HF, their focus has been on predicting HF hospitalizations and mortality, typically at pre-specified time points such as 1 or 5 years, rather than assisting providers at the bedside with the timing of advanced therapy delivery. In addition, a subset incorporates data types not collected in routine practice [11–13]. Machine learning has gained significant popularity in recent years and has found applications across various domains, such as biology, medicine, and healthcare, as evident from numerous studies [14–18]. Notably, recent advancements in machine learning have led to successful models for identifying high-risk patients [19–21]. However, these models come with certain limitations, primarily the challenge of interpretability and transparency in model recommendations. This lack of interpretability, often referred to as the "black box" aspect of traditional machine learning models, may pose challenges in gaining acceptance from healthcare providers and in their subsequent integration into healthcare practices. Although some machine learning methods provide a way to interpret the importance of every feature (interpretability), most are not transparent enough to present their decision logic in a rule format (transparency). Consequently, there is an urgent need for more transparent risk prediction models that can be broadly implemented within electronic health records (EHRs) and that use routinely collected data to predict the need for HF advanced therapies. Such models can be used to ensure that HF patients are referred to advanced HF cardiologists at the appropriate time or to prompt HF cardiologists to initiate a timely advanced therapies evaluation.

Recently, an interpretable algorithm based on a tropical geometry-based fuzzy neural network (TGFNN) was developed [22]. Unlike traditional machine learning methods, this model incorporates existing clinical knowledge and produces a set of criteria by which to explain the rationale for its recommendations [22,23]. We extend that early work herein from classification to risk prediction, predicting the future need for HF advanced therapies using routinely collected clinical variables from a single hospitalization.

Methods

The study utilized a TGFNN to identify patients who would require advanced therapies for heart failure during a subsequent hospitalization. The proposed system’s flow diagram is depicted in Fig 1A. Prior to analysis, the study obtained approval from the University of Michigan Institutional Review Board (HUM00184418), which waived the need for informed consent, and the EHR data used in the study were completely deidentified prior to analysis. The data were accessed from Michigan Medicine on May 18th, 2021.

Fig 1. Flowchart of the clinical decision support system and the interpretable tropical geometry-based fuzzy neural network algorithm.

Fig 1A describes the system and training strategy: The EHR dataset is collected from Michigan Medicine, and patient selection and outcome definition are then performed. The data are split into training and test sets. Five-fold cross-validation is performed on the training dataset. Rules extracted from the five folds are ensembled. The model is then retrained on the whole training dataset with ensembled rule initialization. The trained model is later validated on the test set. Fig 1B depicts the structure of the tropical geometry-based fuzzy neural network: The encoding layer encodes the input features into ‘low’, ‘medium’ and ‘high’ fuzzy concepts. The rule layer combines different concepts to generate several rules, and decisions are made at the final inference layer by leveraging all rules. The edges between modules are trainable parameters used to optimize the model. KEY: x_i = continuous variables; the three encoding functions are the low, medium, and high concept membership functions; A = attention matrix; M = connection matrix; W = inference matrix.

https://doi.org/10.1371/journal.pone.0295016.g001

Study cohort

Eligible patients were 18–80 years of age at the time of a hospitalization at Michigan Medicine for acute on chronic HF, as derived from billing codes, where they received at least one dose of an intravenous diuretic. Qualifying hospitalizations were between January 1, 2013, and January 30, 2021. Patients were included if their most recent ejection fraction recorded in the EHR at the time of admission was less than or equal to 35% and excluded if they had a body mass index (BMI) > 50 kg/m². Patients were required to have at least two eligible HF hospitalizations in one year to support the study design.

To predict the next-visit state, eligible pairs of HF hospitalizations were then labeled as positive if the patient received an urgent HT or LVAD during their second hospitalization, with urgent HTs defined as those transplanted at Status 1A or 1B prior to October 17, 2018, or at statuses 1–4 thereafter. The remaining hospitalization pairs were labeled as negative and included those too well for HF advanced therapies, which we defined as patients who survived at least two years after their first HF hospitalization without the need for HT or LVAD implantation. The resulting cohort consisted of 557 HF hospitalization pairs (samples) from 300 patients, 193 of whom received advanced therapy at their second hospitalization. Within our study cohort, 157 patients had two encounters, 66 had three encounters, and the remaining patients had more than three encounters. The time intervals between two consecutive visits were distributed as follows: the first quartile was 27 days, the median was 61 days, and the third quartile was 189 days.
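The labeling rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the field names `lvad`, `ht_status`, and `ht_date` are hypothetical, and treating October 17, 2018 itself as falling under the newer status definition is an assumption, since the text says only "prior to" and "thereafter". The two-year survival criterion used to characterize patients too well for advanced therapies is not modeled here.

```python
from datetime import date

# Date at which the urgent-status definition changes, as stated in the text.
STATUS_CHANGE = date(2018, 10, 17)


def is_urgent_transplant(status: str, transplant_date: date) -> bool:
    """Return True if a heart transplant counts as urgent under the cohort definition."""
    if transplant_date < STATUS_CHANGE:
        return status in {"1A", "1B"}
    # Assumption: the change date itself uses the newer statuses 1-4.
    return status in {"1", "2", "3", "4"}


def label_pair(second_hospitalization: dict) -> int:
    """Label a hospitalization pair (1 = positive) from events at the second hospitalization.

    The dictionary keys are hypothetical field names chosen for this sketch.
    """
    if second_hospitalization.get("lvad", False):
        return 1
    status = second_hospitalization.get("ht_status")
    when = second_hospitalization.get("ht_date")
    if status is not None and when is not None and is_urgent_transplant(status, when):
        return 1
    return 0
```

For example, a Status 1B transplant in 2017 and a Status 3 transplant in 2019 would both yield a positive label, whereas a Status 2 transplant in 2017 would not.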

Clinical features and preprocessing

Our model incorporated continuous and categorical clinical variables from each hospitalization identified by HF cardiologists as being of clinical value in the setting of HF, including laboratory values, vital signs, and comorbidities as determined by Elixhauser [24] (S1 and S2 Tables in S1 File). For most continuous variables, we only utilized the first measurement obtained during a given hospitalization. For brain natriuretic peptide (BNP) and creatinine, the relative change over each hospitalization (first to last measurement) was expressed as the percent change over the hospitalization. Furthermore, the first measurement of both systolic and diastolic blood pressure (BP) was used to calculate mean arterial pressure (MAP) and pulse pressure. Standardization was executed for all continuous features after data partition, while one-hot encoding was employed for all categorical features to transform them into binary or ordinal representations. Data imbalance was not addressed during this phase; instead, weighted loss was implemented throughout the entire training process by assigning higher weights to positive samples.
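The derived features described above can be illustrated with a short sketch. The text does not give the formula used for MAP, so the common bedside approximation (diastolic pressure plus one third of the pulse pressure) is assumed here. The function names are hypothetical, and the use of the population standard deviation is likewise an assumption. The standardizer is fitted on the training partition only, which is one reading of "after data partition".

```python
from statistics import mean, pstdev


def pulse_pressure(sbp: float, dbp: float) -> float:
    """Pulse pressure: systolic minus diastolic blood pressure (mmHg)."""
    return sbp - dbp


def mean_arterial_pressure(sbp: float, dbp: float) -> float:
    """Assumed approximation: MAP = DBP + (SBP - DBP) / 3."""
    return dbp + pulse_pressure(sbp, dbp) / 3.0


def percent_change(first: float, last: float) -> float:
    """Relative change from first to last measurement, in percent."""
    return 100.0 * (last - first) / first


def fit_standardizer(train_values):
    """Estimate mean and SD on the training partition only."""
    return mean(train_values), pstdev(train_values)


def standardize(values, mu, sigma):
    """Apply training-set statistics to any partition (train, validation, or test)."""
    return [(v - mu) / sigma for v in values]
```

Fitting the scaling statistics on the training data and reusing them for the validation and test data keeps information from held-out patients out of the preprocessing step.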

For missing values, carry-forward imputation was applied between encounters, and any remaining missing values were imputed using multiple imputation [25]. Although echocardiographic features had an approximately 40% missing rate, they were retained due to their importance in HF decision-making. Any other features with a missing rate greater than 60% were removed, and any patient with more than ten missing values was excluded from the analysis.
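A minimal sketch of the missing-data steps described above follows. It assumes each encounter is a dictionary with `None` marking a missing value, which is a representation chosen for illustration. The order of the steps and whether the ten-value limit applies per encounter or per patient are not specified in the text, so both are assumptions. Multiple imputation is not implemented here.

```python
def carry_forward(encounters):
    """Fill missing values from the patient's most recent earlier encounter.

    `encounters` is a chronologically ordered list of dicts for one patient.
    """
    filled, last_seen = [], {}
    for encounter in encounters:
        row = {}
        for feature, value in encounter.items():
            if value is None and feature in last_seen:
                value = last_seen[feature]
            if value is not None:
                last_seen[feature] = value
            row[feature] = value
        filled.append(row)
    return filled


def missing_rate(rows, feature):
    """Fraction of rows in which `feature` is missing."""
    return sum(row[feature] is None for row in rows) / len(rows)


def drop_sparse_features(rows, max_rate=0.60, keep=()):
    """Remove features whose missing rate exceeds `max_rate`.

    Features listed in `keep` are always retained, regardless of their missing rate.
    """
    kept = [f for f in rows[0] if f in keep or missing_rate(rows, f) <= max_rate]
    return [{f: row[f] for f in kept} for row in rows]


def exceeds_missing_limit(row, limit=10):
    """True if a record has more than `limit` missing values (assumed per record)."""
    return sum(value is None for value in row.values()) > limit
```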

Tropical geometry-based interpretable machine learning method

We used the TGFNN to predict the future need for advanced therapies. While the TGFNN model has demonstrated successful application in the classification task using the REVIVAL and INTERMACS registries, its potential in the prediction task with EHR data remains unexplored. In this proof-of-concept study for the prediction task, we utilize EHR data from previous hospitalizations as input. The overall architecture of the algorithm is depicted in Fig 1B. The TGFNN algorithm consists of three modules: encoding, rule extraction, and inference. In the encoding module, each comorbidity is assigned, using an indicator function, a value of 1 if it exists in the patient’s medical history and 0 otherwise. At the same time, continuous variables are encoded into three fuzzy concepts: ‘low’, ‘medium’ and ‘high’. Fuzzy concepts offer a valuable approach to address the complexities and uncertainties associated with determining cutoff points in clinical practice by encoding continuous variables into three categories, thereby accommodating the inherent ambiguity in defining thresholds. The three fuzzy concepts correspond to three membership functions (low, medium, and high), which are defined in Eqs (1)–(4), where a_{i,j} denotes the cutoff parameters for the concepts and ε controls the smoothness of the membership functions. The membership function defines to what degree the measurements belong to each concept rather than trichotomizing variables that exist along a continuum. Therefore, the uncertainty that fuzzy concepts introduce can make the model more flexible in terms of interpretability. In addition, the ‘cut-off points’ for the three fuzzy concepts are learned from the study cohort by the algorithm rather than pre-defined.
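To make the idea of smooth, learnable cutoffs concrete, here is an illustrative sketch of three fuzzy membership functions. The exact functional forms used by the TGFNN are not shown in this text, so the sigmoid-based construction below is an assumption chosen only because it has the properties described: two cutoff parameters per variable and a single parameter ε that controls how sharply one concept gives way to the next. The names `a_low` and `a_high` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def memberships(x: float, a_low: float, a_high: float, eps: float) -> dict:
    """Degrees of membership of x in the 'low', 'medium', and 'high' concepts.

    Assumes a_low < a_high and eps > 0. Smaller eps gives sharper transitions.
    This is an illustrative construction, not the authors' definition.
    """
    low = sigmoid((a_low - x) / eps)
    high = sigmoid((x - a_high) / eps)
    medium = 1.0 - low - high  # non-negative whenever a_low < a_high
    return {"low": low, "medium": medium, "high": high}
```

With this construction a measurement exactly at the lower cutoff belongs equally to ‘low’ and to the other concepts, and the three memberships always sum to 1, so a value near a cutoff contributes partially to rules on both sides instead of being forced into one category.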

The rule module consists of two layers. The first layer (the first message-passing step) relates the three concepts of variable i to rule k using an attention tensor A ∈ ℝ^{N×3×K}. Each entry A_{i,j,k} gives the importance of x_i being of concept j to rule k. Every entry is normalized to [0, 1] and trainable; a higher value in A indicates higher importance in the decision system. The second layer measures the importance of the i-th variable to the k-th rule using a connection matrix M of size N×K, whose entries are also trainable and normalized to [0, 1]. In the second message-passing step of the rule module, r_k measures the firing strength of rule k using a parameterized T-norm. The parameterized T-norm with two inputs is defined by Eqs (5) and (6).

As ε_2 approaches 1, the parameterized T-norm approaches the multiplication operation; as ε_2 approaches 0, it approaches the minimum of its inputs. The higher the value in M, the higher the contribution to the firing strength. The N-input T-norm is defined in Eq (7).

In the inference module, an inference matrix W of size K×C is learned to weight the rule firing strengths, where C denotes the number of outcome classes. Each entry W_{k,c} of the inference matrix represents the contribution of rule k to the final prediction of class c, which is computed using a parameterized T-conorm. As ε_3 → 1, the parameterized T-conorm approaches the addition operation; as ε_3 → 0, it approaches the maximum of its inputs. The two-input T-conorm is generalized to a K-input T-conorm in Eq (8).
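The limiting behaviour of the parameterized T-norm and T-conorm can be checked numerically with a small sketch. The defining equations are not shown in this text, so the functional forms below are assumptions chosen because they reproduce the stated limits: the Aczél–Alsina family for the T-norm (product at ε = 1, minimum as ε → 0) and a power-sum form for the T-conorm (addition at ε = 1, maximum as ε → 0). They should not be read as the authors' Eqs (5)–(8).

```python
import math


def t_norm(values, eps: float) -> float:
    """Assumed Aczél–Alsina-style T-norm with exponent 1/eps, for inputs in [0, 1].

    eps = 1 gives the product of the inputs; eps -> 0 approaches their minimum.
    """
    if any(v <= 0.0 for v in values):
        return 0.0
    lam = 1.0 / eps
    total = sum(max(0.0, -math.log(min(v, 1.0))) ** lam for v in values)
    return math.exp(-(total ** eps))


def t_conorm(values, eps: float) -> float:
    """Assumed power-sum T-conorm, for non-negative inputs.

    eps = 1 gives plain addition; eps -> 0 approaches the maximum.
    Very small inputs combined with very small eps can underflow to 0.
    """
    lam = 1.0 / eps
    return sum(v ** lam for v in values) ** eps
```

For inputs 0.5 and 0.8, the T-norm sketch moves from 0.4 (the product) at ε = 1 toward 0.5 (the minimum) as ε shrinks, which mirrors the transition from a soft AND to a strict one.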

The three smoothness parameters ε_1, ε_2, and ε_3 are all trainable under the constraint 0 < ε_1, ε_2, ε_3 < 1 and control (1) the sharpness of the membership functions and (2) the behavior of the T-norm and T-conorm functions. In our current study, we used the same ε for all three for simplicity. All parameters except ε were trained using the Adam optimizer. ε was initialized at 0.99 and then decreased at every training step using the scheduling formula ε = max(ε_min, ε·γ^{training_step}), where γ is the decay rate. Furthermore, one-hot encoded categorical features do not require membership functions, but their category levels behave like the three concepts of continuous variables. Therefore, the weighting process in the rule and inference modules can be applied to the categorical variables as well if we adapt the number of concepts from 3 to the number of category levels.
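The decay schedule for ε can be sketched as follows. The sketch assumes that the ε on the right-hand side of the scheduling formula is the initial value (0.99), so that the schedule is a clipped exponential decay. The values of `eps_min` and `gamma` are hypothetical defaults, since the text does not report them.

```python
def epsilon_schedule(step: int, eps_init: float = 0.99,
                     eps_min: float = 0.1, gamma: float = 0.999) -> float:
    """Clipped exponential decay: max(eps_min, eps_init * gamma ** step).

    eps_min and gamma are illustrative values, not taken from the study.
    """
    return max(eps_min, eps_init * gamma ** step)
```

Under this reading, training starts close to the smooth regime (ε near 1, where the T-norm behaves like a product) and gradually moves toward sharper behaviour without ever dropping below `eps_min`.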

The algorithm was trained through backpropagation with a weighted cross-entropy loss and two regularization terms. The first regularization term, which favors feature sparsity, is defined in Eq (9).
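As an illustration of the weighted cross-entropy component only, the following sketch up-weights errors on positive samples for a binary outcome. The specific weight used in the study is not reported in this text, so `pos_weight` is a free parameter here. The two regularization terms are not included.

```python
import math


def weighted_cross_entropy(y_true, p_pred, pos_weight: float,
                           tiny: float = 1e-12) -> float:
    """Mean binary cross-entropy with a higher weight on positive samples.

    y_true holds 0/1 labels and p_pred the predicted probability of class 1.
    pos_weight is an illustrative free parameter, not a value from the study.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, tiny), 1.0 - tiny)  # guard against log(0)
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)
```

One common heuristic, offered here only as an example, is to set `pos_weight` to the ratio of negative to positive samples in the training data.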

The second regularization term adds a penalty for highly correlated rules and is formulated in Eq (10), where S is constructed from the entries of the attention matrix A and the connection matrix M as S_{i,d,k} = A_{i,d,k} × M_{i,k}, with i ∈ {1,…,N}, d ∈ {1,2,3}, and k ∈ {1,…,K}. The contribution matrix S represents the contributions of individual concepts and variables to each rule. The total loss function can therefore be written as Eq (11), where vec(∙) denotes matrix vectorization.
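The construction S_{i,d,k} = A_{i,d,k} × M_{i,k} is a simple element-wise product with M broadcast over the concept axis, and can be sketched with nested lists. The helper `top_concepts` is a hypothetical addition showing how the largest entries of S for one rule could be read off as that rule's most important variable-concept pairs. It is not the authors' rule-extraction procedure.

```python
def contribution_tensor(A, M):
    """Compute S[i][d][k] = A[i][d][k] * M[i][k].

    A has shape N x 3 x K (variable, concept, rule); M has shape N x K.
    """
    n_vars, n_rules = len(A), len(M[0])
    return [
        [[A[i][d][k] * M[i][k] for k in range(n_rules)] for d in range(3)]
        for i in range(n_vars)
    ]


def top_concepts(S, rule: int, n: int = 3):
    """Return the n (variable index, concept index) pairs contributing most to a rule."""
    scored = [
        (S[i][d][rule], (i, d)) for i in range(len(S)) for d in range(3)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:n]]
```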

Through the attention matrix and connection matrix, the algorithm operates with complete transparency, enabling the extraction and display of the underlying rules, thus revealing the overall decision-making logic.

TGFNN rule initialization, ensemble, and extraction

To enhance the TGFNN model with clinical knowledge, we gathered four simplified rules from HF cardiologists and from the medical literature [1,2,26]. Rules used for network initialization are provided below:

  • IF left ventricular ejection fraction (LVEF) is low AND systolic BP (SBP) is low, THEN refer to HT/LVAD
  • IF LVEF is low AND mitral regurgitation is high (severe), THEN refer to HT/LVAD
  • IF LVEF is low AND BNP change is elevated (positive delta from first to last measurements during the initial hospitalization), THEN refer to HT/LVAD
  • IF LVEF is low AND serum sodium is low, THEN refer to HT/LVAD

These rules were used to initialize the network for every fold prior to training. The resulting learned rules were filtered based on their firing strength and correlations with one another, while the corresponding features were selected by their contribution to each individual rule. Thus, the highest-weighted and least-correlated rules with important variables were retained and ensembled for network re-initialization. Details are illustrated in the supporting information.

Experimental design

This study compared the TGFNN to the following classical machine learning models: Random Forest, XGBoost, Logistic Regression, Support Vector Machines, Naive Bayes, and Decision Trees on the same dataset. We performed patient-wise five-fold cross-validation on the training dataset to evaluate model performance and robustness. A random search algorithm for hyperparameter tuning was employed during the training stage. The models were then evaluated on the validation and test datasets, which were kept separate from the training dataset. In our experimental setting, the validation sets were utilized to assess the model’s performance and robustness as an unseen dataset with the same data distribution as the training dataset. The test set served as an external dataset for overall model evaluation. The details of the data split are further described in supplementary materials.
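Patient-wise cross-validation means that all hospitalization pairs from one patient fall into the same fold, so that no patient contributes to both training and validation data within a split. A minimal sketch follows. The round-robin assignment and the fixed seed are assumptions made for illustration, and the authors' actual split is described in their supplementary materials.

```python
import random


def patient_wise_folds(patient_ids, n_folds: int = 5, seed: int = 0):
    """Assign sample indices to folds so that each patient appears in exactly one fold.

    `patient_ids[i]` is the patient identifier of sample i.
    Returns a list of n_folds lists of sample indices.
    """
    unique_patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_patients)
    # Round-robin assignment of shuffled patients to folds (illustrative choice).
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique_patients)}
    folds = [[] for _ in range(n_folds)]
    for index, pid in enumerate(patient_ids):
        folds[fold_of[pid]].append(index)
    return folds
```

Each fold then serves once as the validation set while the remaining folds are used for training.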

Model evaluation

Models were evaluated using the mean (standard deviation [SD]) of the following metrics: accuracy ((true positives + true negatives)/(all positives + all negatives)), recall (true positives/(true positives + false negatives)), specificity (true negatives/(true negatives + false positives)), precision (true positives/(true positives + false positives)), F1 score (harmonic mean of precision and recall), area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPRC), and Matthews correlation coefficient (MCC) [27]. Each of the machine learning models was also assessed with respect to (1) whether the model could evaluate variable importance (interpretability) and (2) whether model predictions could be explained by clinical rules (transparency).
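The threshold-based metrics above follow directly from the four confusion-matrix counts, as the sketch below shows. AUC and AUPRC are omitted because they require ranked prediction scores, not counts. Returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
import math


def _safe_div(numerator: float, denominator: float) -> float:
    """Division that returns 0.0 when the denominator is zero (sketch convention)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, recall, specificity, precision, F1, and MCC from confusion counts."""
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "specificity": _safe_div(tn, tn + fp),
        "precision": precision,
        "f1": _safe_div(2 * precision * recall, precision + recall),
        "mcc": _safe_div(tp * tn - fp * fn, mcc_denominator),
    }
```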

Results

Prediction performance

Patient characteristics are shown in Table 1.

Table 1. Demographic characteristics of patients requiring HT/LVAD evaluation (“Positive”) and those too well for HF advanced therapies (“Negative”).

Displayed are mean (standard deviation) for continuous variables or N (%) for categorical variables.

https://doi.org/10.1371/journal.pone.0295016.t001

The cross-validation results for both the TGFNN and other standard machine-learning models are summarized in Table 2. The model’s performance when initialized with clinical knowledge achieved an F1 score of 0.569, an AUC of 0.747, and an AUPRC of 0.642. Our TGFNN with clinical initialization outperformed all standard machine learning models with respect to their F1 scores, AUC, and AUPRC except for XGBoost and Random Forest, which are ensemble models that lack explicit rules and operate through complex combinations of decision boundaries, making it challenging to interpret their inner workings. These results highlight the advantage of our model regarding transparency, interpretability, and performance in comparison to traditional ensemble approaches.

Table 2. Performance of machine learning models on HF dataset using 5-fold cross validation.

Models are referred to as transparent if they can explain their recommendations in a way understood by humans. The column ‘Interpretability’ indicates whether feature importance can be provided by the model. Although Random Forest, XGBoost and SVM are listed as interpretable, these models can only be interpreted using an external approach such as SHAP (SHapley Additive exPlanations). The column ‘Rules’ refers to whether the model provides a set of clinical rules by which to explain its prediction.

https://doi.org/10.1371/journal.pone.0295016.t002

We also assessed model performance on the holdout test dataset as shown in Table 3. The re-initialized TGFNN model with ensembled rules extracted from 5 folds improved the F1 score from 0.577 to 0.656. In addition, the re-initialized TGFNN model achieved the highest AUC (0.855) and AUPRC (0.833) of all machine learning models. In clinical practice, there is a preference for models that can provide their underlying logic, as it enables a better understanding of the reasoning behind their predictions for healthcare professionals.

Table 3. Performance of machine learning models on the HF test dataset.

https://doi.org/10.1371/journal.pone.0295016.t003

TGFNN rules

The rules extracted from the re-initialized TGFNN model are presented in Fig 2, showing how the network makes decisions. Among the seven rules, rules 2–5 were learned from the data apart from the initially injected rules. For example, patients with low systolic blood pressure, low BMI, and low LVEF were more likely to be recommended for heart transplantation.

Fig 2. Clinical rules extracted from the network: In the heatmap, each column represents a rule, while each row represents one concept of a clinical feature.

The number beneath every rule measures the contribution of the rule. The color shades on the heatmap indicate the importance of individual concepts for each rule. Rule 1 can be written as: IF Systolic Blood Pressure is low AND Left Ventricular Ejection Fraction is low, THEN refer for heart transplantation/LVAD. KEY: BMI = body mass index; BNP = brain natriuretic peptide; CREAT = creatinine; HGB = hemoglobin; LVEF = left ventricular ejection fraction; MAP = mean arterial pressure; SBP = systolic blood pressure; SOD = sodium.

https://doi.org/10.1371/journal.pone.0295016.g002

Based on the learned rules, low SBP, low MAP, and low LVEF on hospital admission are the most important indicators of needing HF advanced therapies. Other factors, such as a relative increase in creatinine or hyponatremia, were also important indicators of need for HF advanced therapies. Demonstrative rules for identifying patients in need of HF advanced therapy are presented below:

  • Rule: IF MAP is low AND Creatinine increases AND LVEF is low, THEN refer for HT/LVAD
  • Rule: IF SBP is low AND Hemoglobin is low AND LVEF is low AND No Diabetes, THEN refer for HT/LVAD
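The two demonstrative rules above can be evaluated for a single patient as a fuzzy AND over concept memberships. The sketch below uses the minimum as the AND operator, which is the limiting case the text describes for the T-norm as ε approaches 0. The membership values, the feature and concept labels, and the 0.5 firing threshold are all hypothetical, and the learned weights in A, M, and W are ignored.

```python
# Antecedents of the two demonstrative rules, as (feature, concept) pairs.
# The labels are hypothetical names chosen for this sketch.
RULES = {
    "MAP low AND creatinine increases AND LVEF low": [
        ("MAP", "low"), ("creatinine_change", "increase"), ("LVEF", "low"),
    ],
    "SBP low AND hemoglobin low AND LVEF low AND no diabetes": [
        ("SBP", "low"), ("hemoglobin", "low"), ("LVEF", "low"),
        ("diabetes", "absent"),
    ],
}


def firing_strength(antecedents, memberships) -> float:
    """Fuzzy AND of the antecedent memberships, using the minimum."""
    return min(memberships[a] for a in antecedents)


def fired_rules(memberships, threshold: float = 0.5) -> dict:
    """Return rules whose firing strength reaches the (illustrative) threshold."""
    fired = {}
    for name, antecedents in RULES.items():
        strength = firing_strength(antecedents, memberships)
        if strength >= threshold:
            fired[name] = strength
    return fired
```

A patient with strong membership in low MAP, rising creatinine, and low LVEF but only weak membership in low hemoglobin would trigger the first rule and not the second.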

Range inference

Membership functions for several important continuous medical variables drawn from the model are shown in Fig 3, depicting how concepts (low, medium, high) are assigned to clinical variables to generate fuzzy sets in the rules. Furthermore, our model allows the transition between the ’low’ and ’medium’ concepts for SBP to be smooth and the change between the ’medium’ and ’high’ concepts to be sharp. Asymmetrical smoothness is vital as it provides greater flexibility and uncertainty in decision-making, which is useful for interpretation, suggesting the different disease progressions at the boundary between the adjacent fuzzy concepts.

Fig 3. Membership function visualization: Continuous clinical features are encoded into three concepts: ‘low’, ‘medium’ and ‘high’.

Membership values range from 0 to 1. The x-axis of each membership function represents the range of possible values, while the y-axis represents the degree of membership of each value in the corresponding fuzzy set. The x-coordinates of the intersection of two membership functions indicate where the transition from one concept to another occurs. KEY: SBP = systolic blood pressure; MAP = mean arterial pressure; SOD = sodium; HGB = hemoglobin; BMI = body mass index; CREAT = creatinine.

https://doi.org/10.1371/journal.pone.0295016.g003

We can infer the possible ranges leading to decision-making by utilizing these membership function boundaries. The learned boundaries for these three concepts for clinical features are shown in Table 4. We observe consistency by comparing the learned boundaries and possible reference ranges that HF cardiologists employ in clinical practice.

Table 4. Critical values of membership functions learned from the study cohort by the algorithm.

These critical values indicate the potential threshold where adjacent concepts transition.

https://doi.org/10.1371/journal.pone.0295016.t004

Individual performance

In addition to the population-level rule extraction and predictive performance, rules can also be applied at the individual level. When a patient’s EHR profile is fed into the model, the model not only generates a prediction but also highlights the specific rules that are triggered for the given case. We illustrate this by using one patient in our test set who was referred for advanced therapies (Fig 4). The TGFNN with ensemble initialization successfully predicted the patient’s need for HF advanced therapies at his subsequent hospitalization. Rules 1, 2, 4, 5 and 6 in Fig 2 were activated for this patient, leading to the recommendation for advanced therapies.

Fig 4. Profile for a patient in the test dataset, showing the composite rules that fired.

https://doi.org/10.1371/journal.pone.0295016.g004

Discussion

Herein we describe a transparent and interpretable machine learning model capable of using EHR data to predict whether HF patients will require advanced therapies at a future HF hospitalization. Our model outperformed all but two of the standard machine learning models to which it was compared and was the only model to be both transparent, providing the rationale for its recommendations, and interpretable. Importantly, unlike prior machine learning models, the current model uses routinely collected EHR data from a single hospitalization to predict the need for HF advanced therapies during a subsequent hospitalization. Such an approach allows the mobilization of critical resources to ensure that patients are able to undergo a comprehensive, advanced therapies evaluation in an anticipatory rather than a reactive manner, with the latter placing patients at risk for clinical deterioration precluding advanced therapies.

In light of the burgeoning numbers of HF patients, there has been growing interest in developing clinical decision-support systems capable of identifying patients with advanced HF [28]. These models have differentiated themselves from many of the historical regression-based models whose limitations have included a focus on mortality and hospitalizations at pre-specified time points, reliance on data not routinely collected in practice, need for a relatively small number of clinical variables, and inability to account for non-linear relationships amongst variables. Recent machine learning models have overcome many of the limitations of these traditional models. These include an augmented intelligence-enabled workflow for identifying outpatients with Stage D HF warranting clinical review to determine need for referral to a HF cardiologist [19] and an ensemble deep learning model trained to predict all-cause death, listing for HT, or extracorporeal membrane oxygenation (ECMO)/VAD within 1 year [20].

Our model distinguished itself from the previous methods in a number of ways. First, the transparent structure of the TGFNN method allowed for the justification of treatment recommendations at both the population and individual levels through fuzzy rules. These rules enable the evaluation of feature importance and feature interaction and can be quickly verified by clinicians and tested for applicability in other clinical settings. Second, the model defined abnormal ranges for continuous variables, aiding in model interpretability and in the utilization of these ranges when caring for patients when clinical decision support may be unavailable. Finally, our model predicted the future need for advanced HF therapies using routinely collected data from a single hospitalization, thereby moving from classification to prediction and avoiding the risk of missing the optimal advanced therapies window.

This study should be interpreted within the context of its limitations. First, the data were obtained from a single medical center, limiting the generalizability of our study findings. Thus, our work will need to be validated in additional settings with a larger sample. Second, the model requires prospective validation using the EHR with subsequent clinician review of model recommendations. Such an approach, when implemented elsewhere, led to an increase in clinical referrals to HF cardiologists as well as an increase in advanced therapies evaluations [19]. Third, the current algorithm only uses the previous visit to predict whether the patient will subsequently require advanced therapies. Future enhancements of the model will incorporate more extensive longitudinal data, potentially improving model performance. Finally, our analysis only incorporated a subset of the clinical variables with known associations with advanced HF. A greater number of diverse variables will be added to the analysis for future exploration, potentially improving model performance and allowing the generation of additional clinical rules.

In conclusion, this study applied a TGFNN, an interpretable and transparent machine learning method, to predict the future need for HF advanced therapies using data routinely collected in the EHR. The results show that this method outperforms existing traditional machine learning methods while extracting clinical rules that are easily interpretable and verifiable. Future research is needed, however, to incorporate longitudinal data and a broader sample of HF patients for long-term prediction.

Supporting information

S1 File. Contains the training details, the clinical characteristics of patient encounters from Michigan Medicine, and the data split information.

https://doi.org/10.1371/journal.pone.0295016.s001

(DOCX)

References

  1. Tsao CW, Aday AW, Almarzooq ZI, et al. Heart disease and stroke statistics—2022 update: a report from the American Heart Association. Circulation. 2022;145(8):e153–e639. pmid:35078371
  2. Virani SS, Alonso A, Benjamin EJ, et al. Heart disease and stroke statistics—2020 update: a report from the American Heart Association. Circulation. 2020;141(9):e139–e596. pmid:31992061
  3. Parikh KS, Sharma K, Fiuzat M, et al. Heart failure with preserved ejection fraction expert panel report: current controversies and implications for clinical trials. JACC: Heart Failure. 2018;6(8):619–632. pmid:30071950
  4. Kalogeropoulos AP, Samman-Tahhan A, Hedley JS, et al. Progression to stage D heart failure among outpatients with stage C heart failure and reduced ejection fraction. JACC Heart Fail. 2017;5(7):528–537. pmid:28624484
  5. Miller L, Birks E, Guglin M, et al. Use of ventricular assist devices and heart transplantation for advanced heart failure. Circulation Research. 2019;124(11):1658–1678. pmid:31120817
  6. Stevenson LW, Pagani FD, Young JB, et al. INTERMACS profiles of advanced heart failure: the current picture. The Journal of Heart and Lung Transplantation. 2009;28(6):535–54
  7. Mehra MR, Uriel N, Naka Y, Cleveland JC Jr, Yuzefpolskaya M, Salerno C, et al. A fully magnetically levitated circulatory pump for advanced heart failure. N Engl J Med. 2017;376(5):440–450. pmid:27959709
  8. Cogswell R, John R, Estep JD, Anand I, Boyle A, Butler J, et al. An early investigation of outcomes with the new 2018 donor heart allocation system in the United States. J Heart Lung Transplant. 2020;39(1):1–4. pmid:31810767
  9. Fanaroff AC, DeVore AD, Mentz RJ, Daneshmand MA, Patel CB. Patient selection for advanced heart failure therapy referral. Crit Pathw Cardiol. 2014;13(1):1. pmid:24526143
  10. Thorvaldsen T, Lund LH. Focusing on referral rather than selection for advanced heart failure therapies. Card Fail Rev. 2019;5(1):24. pmid:30847241
  11. Aaronson KD, Schwartz JS, Chen TM, Wong KL, Goin JE, Mancini DM. Development and prospective validation of a clinical index to predict survival in ambulatory patients referred for cardiac transplant evaluation. Circulation. 1997;95(12):2660–2667. pmid:9193435
  12. Levy WC, Mozaffarian D, Linker DT, Sutradhar SC, Anker SD, Cropp AB, et al. The Seattle Heart Failure Model: prediction of survival in heart failure. Circulation. 2006 Mar 21;113(11):1424–33. pmid:16534009
  13. Meta-analysis Global Group in Chronic Heart Failure (MAGGIC). The survival of patients with heart failure with preserved or reduced left ventricular ejection fraction: an individual patient data meta-analysis. European Heart Journal. 2012;33(14):1750–1757. pmid:21821849
  14. Hassoun S, Jefferson F, Shi X, Stucky B, Wang J, Rosa E Jr. Artificial intelligence for biology. Integrative and Comparative Biology. 2021 Dec;61(6):2267–75.
  15. Bao W, Gu Y, Chen B, Yu H. Golgi_DF: Golgi proteins classification with deep forest. Frontiers in Neuroscience. 2023 May 12;17:1197824. pmid:37250391
  16. Bao W, Cui Q, Chen B, Yang B. Phage_UniR_LGBM: phage virion proteins classification with UniRep features and LightGBM model. Computational and Mathematical Methods in Medicine. 2022 Apr 15;2022. pmid:35465015
  17. Ladbury C, Amini A, Govindarajan A, Mambetsariev I, Raz DJ, Massarelli E, et al. Integration of artificial intelligence in lung cancer: Rise of the machine. Cell Reports Medicine. 2023 Feb 3. pmid:36738739
  18. Yu KH, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nature Biomedical Engineering. 2018 Oct;2(10):719–31. pmid:31015651
  19. Cheema B, Mutharasan RK, Sharma A, Jacobs M, Powers K, Lehrer S, et al. Augmented intelligence to identify patients with advanced heart failure in an integrated health system. JACC: Advances. 2022 Apr;1(4):1–11.
  20. McGilvray MM, Heaton J, Guo A, Masood MF, Cupps BP, Damiano M, et al. Electronic health record-based deep learning prediction of death or severe decompensation in heart failure patients. JACC Heart Failure. 2022 Sep;10(9):637–47.
  21. Miller RJ, Sabovčik F, Cauwenberghs N, Vens C, Khush KK, Heidenreich PA, et al. Temporal shift and predictive performance of machine learning for heart transplant outcomes. The Journal of Heart and Lung Transplantation. 2022 Jul;41(7):928–36. pmid:35568604
  22. Yao H, Derksen H, Golbus JR, Zhang J, Aaronson KD, Gryak J, Najarian K. A Novel Tropical Geometry-Based Interpretable Machine Learning Method: Pilot Application to Delivery of Advanced Heart Failure Therapies. IEEE Journal of Biomedical and Health Informatics. 2022 Oct 4;27(1):239–50.
  23. Yao H, Golbus JR, Gryak J, Pagani FD, Aaronson KD, Najarian K. Identifying potential candidates for advanced heart failure therapies using an interpretable machine learning algorithm. The Journal of Heart and Lung Transplantation. 2022 Dec;41(12):1781–9. pmid:36192320
  24. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Medical Care. 1998;36(1):8–27. pmid:9431328
  25. Jakobsen JC, Gluud C, Wetterslev J, et al. When and how should multiple imputation be used for handling missing data in randomised clinical trials–a practical guide with flowcharts. BMC Medical Research Methodology. 2017;17(1):1–10.
  26. Morris AA, Kalogeropoulos AP, Tang WHW, et al. Guidance for timely and appropriate referral of patients with advanced heart failure: a scientific statement from the American Heart Association. Circulation. 2021;144(15):e238–e250. pmid:34503343
  27. Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Mining. 2021 Dec;14(1):1–22.
  28. Miller RJ, Chew DS, Howlett JG. Can Machines Find the Sweet Spot in End-Stage Heart Failure? Vol 1. American College of Cardiology Foundation Washington DC; 2022:1–3.