
Using interpretable machine learning to predict bloodstream infection and antimicrobial resistance in patients admitted to ICU: Early alert predictors based on EHR data to guide antimicrobial stewardship

  • Davide Ferrari ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    davide.ferrari@kcl.ac.uk

    Affiliations School of Life Course and Population Sciences, King’s College London, London, United Kingdom, Centre for Clinical Infection & Diagnostics Research, St. Thomas’ Hospital, London, United Kingdom

  • Pietro Arina,

    Roles Conceptualization, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Bloomsbury Institute of Intensive Care Medicine, University College London, London, United Kingdom

  • Jonathan Edgeworth,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliation Centre for Clinical Infection & Diagnostics Research, St. Thomas’ Hospital, London, United Kingdom

  • Vasa Curcin,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliation School of Life Course and Population Sciences, King’s College London, London, United Kingdom

  • Veronica Guidetti,

    Roles Conceptualization, Methodology, Software, Writing – review & editing

    Affiliation FIM Department, University of Modena and Reggio Emilia, Italy

  • Federica Mandreoli,

    Roles Methodology, Supervision, Writing – review & editing

    Affiliation FIM Department, University of Modena and Reggio Emilia, Italy

  • Yanzhong Wang

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliation School of Life Course and Population Sciences, King’s College London, London, United Kingdom

Abstract

Nosocomial infections and Antimicrobial Resistance (AMR) are formidable healthcare challenges on a global scale. To address them, infection control protocols and personalized treatment strategies, guided by laboratory tests, aim to detect bloodstream infections (BSI) and assess the potential for AMR. In this study, we introduce a machine learning (ML) approach based on Multi-Objective Symbolic Regression (MOSR), an evolutionary method that produces ML models in the form of readable mathematical equations and optimizes several objectives at once, overcoming the limitations of standard single-objective approaches. The method uses readily available clinical data collected upon admission to intensive care units to predict the presence of BSI and AMR. We assess its performance by comparing it with established ML algorithms, using both naturally imbalanced real-world data and data balanced through oversampling. Our findings reveal that traditional ML models perform poorly across all training scenarios. In contrast, MOSR, configured to minimize false negatives by also optimizing for the F1-Score, outperforms the other ML algorithms and delivers consistent results irrespective of training set balance, with F1-Scores .22 and .28 higher than any other alternative. This research indicates a promising path forward for Antimicrobial Stewardship (AMS) strategies. Notably, the MOSR approach can be readily implemented on a large scale, offering a new ML tool for critical healthcare problems affected by limited data availability.

Author summary

This study confronts the global healthcare challenges posed by hospital-acquired infections and antibiotic resistance. It introduces an innovative machine learning approach known as Multi-Objective Symbolic Regression (MOSR), designed to predict bloodstream infections and evaluate antibiotic resistance risks using readily available clinical data from intensive care unit admissions. Unlike conventional models, MOSR consistently outperforms its counterparts, delivering reliable results even when faced with data imbalances. This advancement holds significant promise for enhancing Antimicrobial Stewardship (AMS) strategies, potentially curbing the unnecessary use of antibiotics. The simplicity and scalability of MOSR indicate its potential for widespread implementation, offering a robust solution to address these critical healthcare issues on a larger scale and ultimately improve patient outcomes.

Introduction

Emerging infectious diseases and Antimicrobial Resistance (AMR) are two of the most relevant threats to healthcare systems and society worldwide [1]. They contribute to patients’ morbidity and mortality [2], and severely increase the prevalence of poor and adverse outcomes [3,4]. Individuals admitted to the Intensive Care Unit (ICU) face a heightened susceptibility to bloodstream infections (BSIs), with a noteworthy incidence rate of 35%. These infections are often attributed to Gram-negative bacteria, particularly those resistant to carbapenem antibiotics, and to pathogens categorized as DTR (difficult-to-treat and drug-resistant) [5–7]. From the moment a patient is admitted to the ICU with suspected sepsis or infection, the current protocol focuses on collecting multiple biological samples for bacterial culture, a process that typically takes two to three days to yield conclusive results. Meanwhile, clinicians face a difficult decision: whether to initiate broad-spectrum antibiotics for source control to address the underlying infection while still awaiting the results of laboratory tests [8,9] (Fig 1 depicts a typical timeline).

Fig 1. Timeline of the decision-making process regarding antimicrobial therapy in ICU.

https://doi.org/10.1371/journal.pdig.0000641.g001

The widespread or improper utilization of antibiotics can fuel the emergence of AMR [10]. Additionally, improper infection control (IC) measures can further propagate AMR, posing a significant issue in the ICU environment [11] and subjecting patients to elevated risks of increased morbidity [3,4] and mortality [2]. Addressing this issue involves a multifaceted approach that not only encompasses traditional methods like monitoring, pre-emptive measures, swift identification, and the creation of novel medications and immunizations, but also leverages modern advancements. Although cutting-edge rapid solutions for point-of-care diagnostics exist, their widespread adoption is still in progress [5].

As a result, research efforts are also geared towards identifying new correlations in data available upon admission, while waiting for lab results, to develop more universally applicable models. Forecasting BSI and AMR has recently gained prominence due to the rise of machine learning (ML) techniques and the availability of extensive clinical datasets [12]. Though BSI and AMR are significant concerns, they are statistically rare events. This rarity inherently complicates their detection and management using traditional ML models, which often lack the required specificity and sensitivity for such tasks [13,14]. A compelling alternative for tackling these challenges is the application of Multi-Objective Symbolic Regression (MOSR) to medical databases; this approach has yielded promising results, improving both performance and the potential for clinical interpretability [15].

This retrospective study endeavours to tackle a persistent challenge within the clinical landscape by introducing MOSR, an ML algorithm that has proven very effective in handling naturally imbalanced datasets [15]. In particular, the choice of MOSR rests on its multi-objective capabilities and on the ability to steer training so that the F1-Score is optimized together with the standard Binary Crossentropy. By choosing MOSR as the ML algorithm, we expect to replicate the behaviour reported in the literature and obtain a set of models that produce more balanced performance metrics and pay more attention to the less represented of the two classes.

Our focus is the early detection of AMR and BSI in a clinical context using routinely accessible EHR data such as blood biomarkers, ongoing pharmacological interventions, and established medical histories. We define two interconnected tasks: first, discerning the presence or absence of BSI, and second, establishing the correlation between infections and AMR. By evaluating prevailing ML methodologies for tabular data on both tasks, we aim to expose their limitations in this clinical scenario. This study therefore not only highlights a persisting problem but also proposes a viable path forward for addressing it.

Methods

Ethics statement

Clinical, demographic, microbiological, antimicrobial treatment, intervention, bed occupancy, and staffing level data were extracted from intensive care (CareVue; Philips), microbiology (MC&S; GSTT), and electronic patient administration systems (iSoft) to form the anonymized Guy’s and St Thomas’ Staphylococcal Transmission and Antimicrobial Record database, with approval from the hospital ethics committee (10/H1102/80). Access to the data is subject to appropriate checks and vetting and researchers can refer to the contact of the database manager Finola Higgins (finola.higgins@kcl.ac.uk).

This research was funded/supported by King’s College London, DRIVE-Health, the KCL-funded Centre for Doctoral Training (CDT) in Data-Driven Health, and Guy’s and St. Thomas’ NHS Foundation Trust. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. The authors acknowledge the use of the King’s College London CREATE computing infrastructure [16] to run the experiments in the Trusted Research Environment.

Dataset description

Initially, our dataset encompassed over 5000 heterogeneous patients of Guy’s and St. Thomas’ Hospitals in London admitted to the ICU from Accident and Emergency, hospital wards or the operating theatre, with data collected at the patients’ admission. The data collection spans 8 years and encompasses all types of complex patients cared for in a high-complexity university hospital. However, in the interest of data completeness, we excluded individuals with missing values in any of the selected variables. Consequently, our final dataset consists of 1142 patients, and Table 1 provides an overview of variable distributions categorized by one of the primary outcomes, BSI.

As we selected non-linear ML models, we opted for a non-linear feature selection approach. Namely, we used the Hilbert-Schmidt Independence Criterion Lasso (HSIC Lasso) [17] to find the minimal and optimal subset of features to explain a given phenomenon. Only the features with an HSIC score larger than 10⁻³ are used in the ML algorithms. The HSIC Lasso method finds the optimal feature subset by solving a feature-wise kernelized Lasso optimization problem with a non-negativity constraint. We use the same feature set to train all ML algorithms in the comparison.
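
To make the dependence measure behind this selection step more concrete, the sketch below computes a biased empirical HSIC estimate between a single feature and a binary outcome, then keeps the features whose value exceeds 10⁻³. It is a simplification under stated assumptions: it does not solve the kernelized Lasso problem that HSIC Lasso actually uses, and the kernel choices (Gaussian for the feature, delta for the label), the bandwidth, the function names and the toy data are all illustrative rather than taken from the study.

```python
import math

def gaussian_kernel(values, sigma=1.0):
    """Gram matrix of a Gaussian kernel over a list of scalar values."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in values]
            for a in values]

def delta_kernel(labels):
    """Gram matrix of a delta kernel: 1 if two labels match, else 0."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]

def center(K):
    """Double-centre a Gram matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)]
            for i in range(n)]

def hsic(x, y):
    """Biased empirical HSIC estimate: trace(Kc Lc) / (n - 1)^2."""
    n = len(x)
    Kc, Lc = center(gaussian_kernel(x)), center(delta_kernel(y))
    return sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n)) / (n - 1) ** 2

# Toy data (illustrative only): one feature tracks the label, one is constant.
y = [0, 0, 0, 1, 1, 1]
features = {"informative": [0.0, 0.0, 0.0, 1.0, 1.0, 1.0], "constant": [0.5] * 6}

# Simplified stand-in for the selection rule: threshold the raw HSIC value.
selected = [name for name, col in features.items() if hsic(col, y) > 1e-3]
```

A constant feature has a centred Gram matrix of zeros, so its HSIC value is zero and it is dropped, while the feature that tracks the label is retained.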

HSIC Lasso reduces the number of variables from 100 to 25. These include individual components of the APACHE-II score [18], laboratory data, symptoms, past medical history, and medical therapies that patients were undergoing at the time of admission to the ICU. Circulatory system diseases were defined with the ICD-9 code [19] between 390 and 459. These variables are usually available at patient admission as they do not rely on previous health records but reflect the patient’s status.
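
As a small illustration of the circulatory system definition above, the following sketch flags ICD-9 diagnosis codes whose category lies between 390 and 459. The function name is hypothetical, and it assumes codes are stored as strings with a decimal point separating the three-digit category from its subdivisions (for example "410.9"); other storage formats would need different parsing.

```python
def is_circulatory_icd9(code: str) -> bool:
    """Return True if an ICD-9 diagnosis code falls in the 390-459 range."""
    head = code.strip().split(".")[0]
    # V and E codes are supplementary classifications, not numeric chapters.
    if not head.isdigit():
        return False
    return 390 <= int(head) <= 459
```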

In our study, we model BSI and AMR as binary outcomes due to the limited quantity and representation of available data, which prevented us from delving deeper into individual organism details. BSI is defined as a positive blood culture, whereas AMR is defined as a positive blood culture presenting resistance to at least one antimicrobial medicine. We did not assess the effectiveness of treatment given before the blood culture because the samples for the cultures were collected at the time of admission, which did not allow for a proper evaluation of the efficacy of the antibiotic (AB) therapy.

Although this is the first attempt to predict BSI and AMR from readily available patient variables, we prioritize safety and knowledge generation. To this end, we selected and compared the following state-of-the-art algorithms: Logistic Regression (LR), Decision Trees (DT), Random Forests (RF), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting (LightGBM), Linear and Quadratic Discriminant Analyses (LDA and QDA), Extra Tree (ET), AdaBoost, and Symbolic Regression (SR). We did not use Deep Learning strategies because we aimed for algorithms that provide an easily understandable interpretation, either intrinsic (as in LR or SR) or post hoc (as with SHAP [20] or LIME [21] for tree-based methods). For all algorithms, a comprehensive grid search optimization of hyperparameters was conducted during training.

Symbolic regression

Symbolic regression is an evolutionary algorithm that generates predictive models in the shape of tree expressions where internal nodes are operations while leaves are features or constants [22]. The algorithm starts with a group of P initial random formulas, i.e., the initial population. At each training step, it generates new individuals by applying random mutations, and then it keeps the best P individuals based on their predictive performance.
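
The loop described above can be sketched in a few lines. This is a minimal single-objective illustration, not the implementation used in the study: formulas are nested tuples, only a subset of operators is included, fitness is mean squared error on a toy target, and every name and parameter (`evolve`, `pop_size`, `steps`, the mutation probabilities) is an assumption made for the example.

```python
import math
import random

# A formula is a nested tuple such as ("add", left, right) or ("sqrt", child);
# leaves are feature names (str) or numeric constants (float).
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b, "max": max, "min": min}
UNARY = {"sqrt": lambda a: math.sqrt(abs(a)), "exp": lambda a: math.exp(min(a, 50.0))}

def evaluate(node, row):
    """Evaluate a formula tree on one data row (a dict of feature values)."""
    if isinstance(node, str):
        return row[node]
    if isinstance(node, (int, float)):
        return float(node)
    op, *children = node
    args = [evaluate(c, row) for c in children]
    return BINARY[op](*args) if op in BINARY else UNARY[op](*args)

def random_tree(features, rng, depth=2):
    """Build a random formula of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(features) if rng.random() < 0.7 else round(rng.uniform(-1, 1), 2)
    if rng.random() < 0.7:
        op = rng.choice(sorted(BINARY))
        return (op, random_tree(features, rng, depth - 1), random_tree(features, rng, depth - 1))
    return (rng.choice(sorted(UNARY)), random_tree(features, rng, depth - 1))

def mutate(node, features, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(node, tuple) or rng.random() < 0.3:
        return random_tree(features, rng, depth=1)
    op, *children = node
    i = rng.randrange(len(children))
    children[i] = mutate(children[i], features, rng)
    return (op, *children)

def evolve(rows, targets, features, pop_size=20, steps=30, seed=0):
    """Mutate the population each step and keep the best pop_size formulas."""
    rng = random.Random(seed)

    def mse(tree):
        err = 0.0
        for r, t in zip(rows, targets):
            d = evaluate(tree, r) - t
            err += d * d
        err /= len(rows)
        return err if math.isfinite(err) else math.inf  # penalise overflow/NaN

    population = [random_tree(features, rng) for _ in range(pop_size)]
    for _ in range(steps):
        offspring = [mutate(t, features, rng) for t in population]
        population = sorted(population + offspring, key=mse)[:pop_size]
    best = min(population, key=mse)
    return best, mse(best)

# Toy problem (illustrative only): recover a + b from two features in [0, 1].
rows = [{"a": x / 4, "b": y / 4} for x in range(5) for y in range(5)]
targets = [r["a"] + r["b"] for r in rows]
best, err = evolve(rows, targets, ["a", "b"])
```

Because survivors are chosen from the union of parents and offspring, the best error can never get worse from one step to the next.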

The SR method [23,24] has been shown to handle naturally imbalanced data exceptionally well [22,25] and to model highly non-linear relations with limited data, both in terms of data points and of available features. Because the algorithm can choose a subset of variables, it can also be considered a feature selection approach in itself, which adds to its potential in the healthcare domain. Unlike other non-linear ML algorithms, SR generates flexible mathematical equations built from a customizable set of mathematical operations and feature interactions. In this work, we considered: sum, multiplication, exponential, logarithm, square root, power, and minimum/maximum between two numbers. Although SR is the least common and least documented approach, its use has recently been revived in clinical applications thanks to the remarkable potential of a multi-objective training setup [22,25].

MOSR can simultaneously optimize multiple performance measures, also called fitness functions. Its optimisation can produce several equally optimal models, identified through dominance criteria, that form the so-called first Pareto front. We used a MOSR implementation available on GitHub [26] that relies on the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [27].
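
The dominance criterion that defines the first Pareto front can be illustrated as follows. This is only a sketch of that one concept: NSGA-II additionally ranks the remaining fronts and uses a crowding-distance measure, neither of which is shown here. The candidate scores are made-up values, and all objectives are assumed to be minimised (so F1 appears as 1 - F1).

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one. All objectives are assumed to be minimised."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def first_pareto_front(scores):
    """Return the indices of the score vectors that no other vector dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

# Hypothetical candidates scored as (BCE, 1 - F1, complexity); illustrative values.
candidates = [(0.60, 0.55, 3), (0.50, 0.60, 5), (0.65, 0.70, 9), (0.50, 0.60, 7)]
front = first_pareto_front(candidates)
```

Here the first two candidates trade BCE against F1 and complexity, so neither dominates the other and both sit on the front; the third is worse than the first on every objective, and the fourth matches the second but is more complex.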

The optimal formulas show different properties and qualities and allow domain experts to choose the most appropriate model. The flexible and usually compact models arising from SR are naturally interpretable and allow directly inspecting the relations between the variables and identifying the most relevant ones used in the prediction. These features are very useful in a clinical environment as they allow for maintaining control of the model complexity (e.g., limiting non-linearities or setting a maximum length). In this experiment we optimize Binary Crossentropy (BCE), F1 Score and the complexity of the model (how many operations and variables are used).
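
For reference, the three objectives named above can be computed as in the sketch below. The details are assumptions made for illustration: predictions are taken to be probabilities, F1 is computed at a 0.5 threshold, and complexity is counted as the number of nodes (operations, variables and constants) in a nested-tuple formula; the implementation used in the study may define them differently.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def f1_score(y_true, y_prob, threshold=0.5):
    """F1 = 2TP / (2TP + FP + FN) after thresholding the probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def complexity(tree):
    """Count the nodes of a formula written as a nested tuple."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(complexity(child) for child in tree[1:])

# Illustrative values only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]
formula = ("add", "x", ("mul", "y", 2.0))

# One objective vector per candidate model, all to be minimised.
objectives = (binary_crossentropy(y_true, y_prob), 1 - f1_score(y_true, y_prob), complexity(formula))
```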

The clinical interpretation of an equation generated by MOSR is still an open research challenge and requires further methodological exploration.

From a technical implementation point of view, particularly for Clinical Decision Support Systems (CDSS) integrated with Electronic Health Records (EHR), one of the unique features of MOSR is that models are plain expressions that can be easily transferred to any information system (even copied and pasted into a spreadsheet), so their technical implementation is much simpler than that of any other ML alternative. The precondition for this seamless integration is the same as for any other ML approach: consistent data preprocessing to produce the model’s inputs. How to represent predictions to support clinical decision making remains a decision in and of itself, as with all other ML alternatives, because it strongly depends on whether the output should be an action instruction (e.g., “prescribe Enalapril”) or an additional piece of information for a wider diagnostic reasoning (e.g., “Risk of death is 5%”).
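
To show how little code such a deployment could need, here is a hedged sketch of a formula-based predictor. Everything in it is hypothetical: the variable names, the min/max bounds, the expression itself and the sigmoid used to map the output to a probability are placeholders, not a model from this study. The only element taken from the paper is that inputs are scaled to the range 0 to 1 before the equation is applied.

```python
import math

# Hypothetical min/max bounds from a training set; placeholder values.
BOUNDS = {"heart_rate": (30.0, 200.0), "creatinine": (20.0, 800.0)}

def scale(name, value):
    """Min-max scale a raw value to [0, 1], clipping values outside the bounds."""
    lo, hi = BOUNDS[name]
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)

def risk(patient):
    """Apply a placeholder symbolic expression and squash it to (0, 1)."""
    hr = scale("heart_rate", patient["heart_rate"])
    cr = scale("creatinine", patient["creatinine"])
    # Placeholder expression, NOT a model reported in the study.
    z = 2.0 * math.sqrt(hr) + max(cr, 0.1) - 1.5
    return 1.0 / (1.0 + math.exp(-z))

example = risk({"heart_rate": 115.0, "creatinine": 410.0})
```

The same expression could be written as a single spreadsheet formula, which is the portability argument made above; the preprocessing (here, the bounds used for scaling) has to be shipped alongside it.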

Experimental design and statistics

We partitioned the dataset into training (80%) and test (20%) sets, stratifying by class frequency because of the substantial class imbalance between BSI and non-BSI cases. We trained our ML models under two distinct conditions: first, without balancing the training set, to assess each ML algorithm’s performance in real-world scenarios; second, after preprocessing the training set with the SMOTE [28] method alone to oversample the minority class, to evaluate how performance changes under ideal training conditions. The test dataset was left unaltered and imbalanced to validate the ML models. For each ML algorithm, we tuned hyperparameters via a standard grid search for the best predictive performance and incorporated class weighting, assigning each class a weight equal to the inverse of its prevalence in the dataset.
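
The three preprocessing ideas in this paragraph can be sketched without any ML library. The code below is an illustration under assumptions, not the pipeline used in the study: the function names, the random seeds, and the simplified SMOTE-style interpolation (a synthetic point placed between a minority sample and one of its k nearest minority neighbours) are chosen for clarity.

```python
import random
from collections import Counter

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) preserving class proportions."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test += idx[:n_test]
        train += idx[n_test:]
    return sorted(train), sorted(test)

def inverse_prevalence_weights(labels):
    """Weight each class by 1 / prevalence, i.e. N / class count."""
    counts = Counter(labels)
    return {cls: len(labels) / n for cls, n in counts.items()}

def smote_like(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        neighbours = sorted(
            (p for p in X_min if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

# Illustrative labels: 10% positives, roughly mimicking a rare outcome.
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels)
weights = inverse_prevalence_weights(labels)
```

Oversampling would be applied to the training rows only, so that the test split keeps its natural class distribution, as stated above.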

In our binary classification task, for both outcomes, training favours conservative models that achieve high Sensitivity at the cost of a penalty in Specificity. Specifically, we place greater emphasis on minimizing false negatives (FNs), as these errors correspond to patients with BSI or AMR who may be overlooked, posing significant risks for the patient. To enhance the models discussed in this study, future efforts should focus on training on larger datasets and on ways to increase their Specificity, thus ensuring their reliability for deployment in real-world clinical scenarios.
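
The trade-off between the two metrics can be seen with a small worked example. The sketch below computes Sensitivity and Specificity from binary predictions and shows, on made-up probabilities, how a lower decision threshold removes false negatives at the expense of more false positives. Threshold adjustment is used here only as a generic illustration; in the study the preference for Sensitivity is expressed through the training objectives.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); Specificity = TN / (TN + FP)."""
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    tn = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec

def predict(y_prob, threshold):
    """Turn probabilities into binary predictions at a given threshold."""
    return [1 if p >= threshold else 0 for p in y_prob]

# Illustrative values only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.8, 0.45, 0.35, 0.6, 0.4, 0.2, 0.1, 0.05]

strict = sensitivity_specificity(y_true, predict(y_prob, 0.5))
conservative = sensitivity_specificity(y_true, predict(y_prob, 0.3))
```

With the stricter threshold two of the three positive cases are missed; with the lower one none are, while Specificity drops from 0.8 to 0.6.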

Results

Epidemiological analysis

This retrospective study uses data from 1142 patients collected in the ICUs of Guy’s and St. Thomas’ NHS Foundation Trust in Central London. We conducted a detailed examination of trends over the study period. All patients aged at least 18 years at their admission to ICU are included in this dataset.

Fig 2 reports data from the study and reveals that although the number of requested blood cultures decreased by a modest 17%, the count of BSI dropped by over 50% during the same timeframe. Furthermore, antibiotic (AB) usage per patient increased by more than 27%. Notably, the number of blood cultures requested and the infections detected showed a high positive correlation of 0.84 (p<0.05), in line with expectations. In contrast, as the rate of AB usage increased, the infections detected decreased, and vice versa, as shown by a negative correlation of -0.70 (p<0.05). These results indicate that the infections detected and the rate of AB usage moved in opposite directions, highlighting a discrepancy between the actual therapeutic need for antibiotics and their utilization over the period analysed.

Fig 2. Evolution of the number of blood cultures, BSI detected, and the average use of AMs per patient.

https://doi.org/10.1371/journal.pdig.0000641.g002
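
The correlations reported for these trends are of the Pearson type sketched below, assuming the authors used the standard product-moment coefficient (the text does not name it). The yearly series in the example are synthetic numbers invented to show the sign of the relationship; they are not the values plotted in Fig 2.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic yearly series (illustrative only, not study data).
cultures = [100, 97, 95, 92, 90, 88, 85, 83]
infections = [40, 37, 33, 30, 27, 24, 22, 19]
ab_per_patient = [1.00, 1.03, 1.08, 1.10, 1.15, 1.19, 1.22, 1.27]

r_cultures = pearson_r(cultures, infections)      # both series fall together
r_antibiotics = pearson_r(ab_per_patient, infections)  # one rises as the other falls
```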

Data description

In Table 1, we provide a description of the selected variables and conduct a comparison of their distribution among groups categorized by the presence or absence of BSI. The statistical relevance is assessed by means of the appropriate statistical test based on the data type and distribution (t-test for the continuous variables, ANOVA for categorical variables). It is worth noting that most variables exhibit non-statistically significant differences between the two groups, primarily due to the limited biological relevance of these variables to the outcome. Nevertheless, these variables are retained in our experiments because certain non-linear methods, such as SR, have the capacity to generate complex features that are functions of the original variables. These newly created composite features may reveal statistically distinct distributions among groups of patients. The presence or absence of such non-linearities cannot be ruled out solely by comparing feature distributions.

The class imbalance that manifests in this analysis is representative of the general population and of the intrinsic behaviour of many clinical phenomena. This paper deliberately leaves this distribution unaltered and proposes an ML methodology able to cope with it better than state-of-the-art approaches, thereby enabling a new way of tackling this class of clinical prediction tasks.

MOSR performance compared to state-of-the-art ML algorithms

Tables 2 and 3 showcase the performance metrics assessed on the test dataset, encompassing BSI and AMR detection respectively, across various ML approaches. We examine both scenarios of training datasets, one with balanced class distribution and the other with imbalanced class distribution. In the case of standard ML models, we present the outcomes of the performance calculated on the test dataset. For the MOSR case, we provide two key models: one for the model with the highest AUROC score and one for the model yielding the highest F1 score. Highlighting the model with best F1 score in the table is crucial because it offers a valuable measure of model performance, especially in scenarios where precision and recall need to be balanced, making it essential for tasks like BSI and AMR detection where both false positives and false negatives carry different, yet significant consequences.

In S1 Appendix, more details on the models and the confusion matrices are reported.

Table 2. Detection of BSI: predictive performance on test dataset.

https://doi.org/10.1371/journal.pdig.0000641.t002

Table 3. Detection of AMR: predictive performance on test dataset.

https://doi.org/10.1371/journal.pdig.0000641.t003

Our results indicate that traditional ML approaches struggle to predict BSI or AMR presence in ICU patients from standard EHR data, while MOSR emerges as a promising alternative. MOSR consistently achieves the highest AUROC and F1, remaining stable in both imbalanced and balanced training scenarios. Even though SMOTE improves some metrics, it falls short of ensuring balanced predictive performance. The improved performance of MOSR on the test sets is an indication of its ability to mitigate overfitting. In summary, MOSR outperforms other ML algorithms with both balanced and imbalanced training datasets, holding potential for predictive modelling in this critical healthcare context.

The chosen MOSR individuals show much higher, more balanced, and thus more reliable performance, with F1-Scores of .42 and .44 respectively, compared with the next best alternatives (DT and AdaBoost, respectively). This indicates that the multi-objective training of MOSR on BCE and F1, alongside model complexity, was successful.

The MOSR approach is characterized by innovation, yet it could be further enhanced through the incorporation of established machine learning methodological practices.

External validation of MOSR models would be required for a complete assessment of this approach and more experimentation will be needed to further assess the full potential of the multi-objective training. We leave the problem of choosing the best MOSR model for clinical purposes for future work. In fact, only once more extensive and representative datasets are explored will it be appropriate to dive into the details of model choice and explanation. Nonetheless, our work remains a solid proof-of-concept putting light on the way ahead for AMR and AB stewardship research.

Discussion and conclusion

In summary, the challenges posed by BSI and AMR detection persist as formidable issues within healthcare systems, giving rise to the inappropriate use of antibiotics and unfavourable patient outcomes. This matter is particularly critical in high-risk settings like the ICU, where vulnerable patients are exposed to heightened risks of morbidity and mortality. ICU clinicians make daily decisions regarding antibiotic therapy, often in the absence of conclusive bacterial culture results at the time of admission.

Throughout the period we observed, the implementation of the IC initiative led to a decrease in infection rates. Paradoxically, this reduction in infections coincided with an uptick in the use of antibiotics per patient. This unexpected divergence between antibiotic utilization and infection prevalence at the hospital level points to the persistent need for rapid diagnostic support. Despite the considerable passage of time and the routine incorporation of the IC program, the approach to managing ICU admissions, encompassing infection detection technology and clinical protocols, has remained largely unaltered, paving the way for the implementation of a clinical decision support tool like ours.

In this study, advanced ML models were used to predict the occurrence of BSI and AMR based on variables already available for every patient at ICU admission. Our models outperformed classical ML models and demonstrated promise for further ML development in this field. Our dataset reflects an era in which the technological and behavioural approaches to AM stewardship in ICU admissions have remained largely unchanged. In this context, our findings reveal that MOSR emerges as the most promising approach for BSI and AMR detection, using commonly available Electronic Health Record (EHR) data and without relying on previous microbiological data on infections and resistances. These results hold substantial implications for improving early diagnosis and targeted AM therapy, ultimately contributing to the ongoing battle against the growing threat of AMR in healthcare settings.

A comprehensive literature search on PubMed, Google Scholar and Scopus produced a selection of seven papers that approach the problem of AMR prediction using EHR and ML techniques. Traditional approaches to identifying AMR use AM susceptibility testing based on phenotypic testing [29], while more recent methods use ML and deep learning to analyse genome sequences directly [30,31], including Lewin-Epstein et al. [32], where an ensemble of ML models is used to predict resistance to five ABs. However, all these approaches start from bacterial culture data rather than using, in a predictive manner, the EHR data commonly available at ICU admission. Looking at EHR-focused efforts, a study by Moran et al. [33] used EHRs to predict resistance to co-amoxiclav and piperacillin/tazobactam with AUC ranging from .61 to .67. Their predictive analysis relies on previously acquired blood cultures. Garcia-Vidal et al. [34] show how their ML approach can use EHRs to predict multidrug-resistant Gram-negative bacilli with AUC ranging from .79 to .97. The chosen variables may be hard to obtain, as they rely on previous blood cultures and on an extensive past medical history that is not often available. Moreover, although they used 3235 records, the number of patients is only 349, making the model very prone to overfitting. Pascual-Sanchez et al. [35] study a similar application and strategy, using a single data point per patient and re-balancing the outcomes by under-sampling the records not presenting AMR in the final dataset. Overall, the reviewed approaches make use of variables that are not readily available in the EHR at patient admission, whereas relying only on such variables was a core study design choice of our proposal. Moreover, this work is the first of its kind to propose MOSR and multi-objective training for this problem.

To the best of our knowledge, this is the first work aiming to predict the outcome of bacterial cultures using easily and readily collectible EHR data in a competitive way. This makes our result an exciting and appealing front-line tool for clinicians waiting for laboratory evidence. In our future work, we will focus on improving our models’ robustness, reliability, and interpretability to facilitate their use in real-world situations. Moreover, collecting new data would allow extending the current approach to individual microorganism AMR detection and sharpening the algorithm performance.

This study, while valuable, is constrained by several limitations. The first is data scarcity, with our dataset comprising approximately 1000 patients. This limited size may affect the robustness and generalizability of our findings. A possible solution to this limitation is the use of Federated Learning strategies to allow more clinical institutions to participate in the training process with their data.

The second limitation is data imbalance, particularly in binary outcomes such as BSI and AMR. This disparity can introduce bias and affect the performance of ML models. Despite this, our approach offers a promising solution for imbalanced prediction tasks.

The third limitation relates to the need to identify the most effective and reliable representation of the predictions for proper integration with clinical decision support systems. This would require closer work with practicing clinicians to understand the nuances of their decision-making environment and what type of representation would convey the prediction best.

Lastly, the absence of external validation limits the broader applicability of our approach. However, the primary focus of this work is to propose a novel method for addressing imbalanced prediction tasks in healthcare, rather than resolving clinical challenges.

To address these limitations, we are actively working on expanding our dataset, both in terms of size and diversity, to enhance the representativeness of our findings. Additionally, we are reevaluating our approach to potentially explore predictions at a finer level, focusing on individual organisms rather than generic binary outcomes. These efforts aim to overcome the data limitations, refine our methodology, and ultimately provide more comprehensive insights into the complex dynamics of healthcare-associated infections and AMR.

Supporting information

S1 Appendix. Models and confusion matrices.

When the equations are of reasonable complexity, we report them below; otherwise we omit them from the manuscript. Variables need to be scaled between 0 and 1.

https://doi.org/10.1371/journal.pdig.0000641.s001

(DOCX)

References

  1. Hernando-Amado S., Defining and combating antibiotic resistance from One Health and Global Health perspectives, Nature Microbiology 4 (2019) 1432–1442. pmid:31439928
  2. Vincent J.L., International Study of the Prevalence and Outcomes of Infection in Intensive Care Units, JAMA 302 (2009) 2323–2329. pmid:19952319
  3. Weinstein M.P., The Clinical Significance of Positive Blood Cultures in the 1990s: A Prospective Comprehensive Evaluation of the Microbiology, Epidemiology, and Outcome of Bacteremia and Fungemia in Adults, Clinical Infectious Diseases 24 (1997) 584–602. pmid:9145732
  4. Vallés J., Nosocomial Bacteremia in Critically Ill Patients: A Multicenter Study Evaluating Epidemiology and Prognosis, Clinical Infectious Diseases 24 (1997) 387–395. https://doi.org/10.1093/CLINIDS/24.3.387.
  5. Timsit J.F., Bloodstream infections in critically ill patients: an expert statement, Intensive Care Med 46 (2020) 266–284. pmid:32047941
  6. Bassetti M., Bloodstream infections in the Intensive Care Unit, Virulence 7 (2016) 267–279. pmid:26760527
  7. Tabah A., Epidemiology and outcomes of hospital-acquired bloodstream infections in intensive care unit patients: the EUROBACT-2 international cohort study, Intensive Care Med 49 (2023) 178–190. pmid:36764959
  8. Arina P., Pathophysiology of sepsis, Curr Opin Anaesthesiol 34 (2021) 77–84. pmid:33652454
  9. Evans T., Diagnosis and management of sepsis, Clinical Medicine 18 (2018) 146–149. pmid:29626019
  10. Adebisi Y.A., Balancing the risks and benefits of antibiotic use in a globalized world: the ethics of antimicrobial resistance, Global Health 19 (2023) 1–7. https://doi.org/10.1186/S12992-023-00930-Z/METRICS.
  11. Llor C., Antimicrobial resistance: risk associated with antibiotic overuse and initiatives to reduce the problem, Ther Adv Drug Saf 5 (2014) 229–241. pmid:25436105
  12. Sakagianni A., Using Machine Learning to Predict Antimicrobial Resistance―A Literature Review, Antibiotics 12 (2023) 452. pmid:36978319
  13. Mandreoli F., Real-world data mining meets clinical practice: Research challenges and perspective, Front Big Data 5 (2022) 99. pmid:36338334
  14. Arina P., Prediction of Complications and Prognostication in Perioperative Medicine: A Systematic Review and PROBAST Assessment of Machine Learning Tools, Anesthesiology (2023). https://doi.org/10.1097/ALN.0000000000004764.
  15. Ferrari D., Multi-objective Symbolic Regression to Generate Data-driven, Non-fixed Structure and Intelligible Mortality Predictors using EHR: Binary Classification Methodology and Comparison with State-of-the-art, AMIA Annual Symposium Proceedings 2022 (2022) 442. /pmc/articles/PMC10148348/ (accessed September 13, 2023). pmid:37128446
  16. King’s College London e-Research, (n.d.). https://docs.er.kcl.ac.uk/ (accessed October 13, 2023).
  17. 17. Climente-González H., Block HSIC Lasso: model-free biomarker detection for ultra-high dimensional data, Bioinformatics 35 (2019) i427–i435. pmid:31510671
  18. 18. Knaus W.A., APACHE II: A severity of disease classification system, Crit Care Med 13 (1985). https://journals.lww.com/ccmjournal/fulltext/1985/10000/apache_ii__a_severity_of_disease_classification.9.aspx. pmid:3928249
  19. 19. International classification of diseases : [9th] ninth revision, basic tabulation list with alphabetic index, (n.d.). https://iris.who.int/handle/10665/39473 (accessed October 13, 2023).
  20. 20. Lundberg S.M., A Unified Approach to Interpreting Model Predictions, Adv Neural Inf Process Syst 30 (2017). https://github.com/slundberg/shap (accessed March 12, 2023).
  21. 21. Ribeiro M.T., “Why Should I Trust You?”: Explaining the Predictions of Any Classifier, NAACL-HLT 2016–2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Demonstrations Session (2016) 97–101. https://doi.org/10.48550/arxiv.1602.04938.
  22. 22. Ferrari D., Multi-Objective Symbolic Regression for Data-Driven Scoring System Management, 2022 IEEE International Conference on Data Mining (ICDM) (2022) 945–950. https://doi.org/10.1109/ICDM54844.2022.00112.
  23. 23. Koza J.R., Genetic programming as a means for programming computers by natural selection, Stat Comput 4 (1994) 87–112. https://doi.org/10.1007/BF00175355/METRICS.
  24. 24. La Cava W., Contemporary Symbolic Regression Methods and their Relative Performance, (2021). https://arxiv.org/abs/2107.14351v1 (accessed March 21, 2023). pmid:38715933
  25. 25. Ferrari D., Multi-objective Symbolic Regression to Generate Data-driven, Non-fixed Structure and Intelligible Mortality Predictors using EHR: Binary Classification Methodology and Comparison with State-of-the-art, in: AMIA 2022, American Medical Informatics Association Annual Symposium, Washington, DC, USA, November 5–9, 2022, AMIA, 2022. https://knowledge.amia.org/76677-amia-1.4637602/f006-1.4642154/f006-1.4642155/877-1.4642417/511-1.4642414.
  26. 26. Multi-objective Symbolic Regression open-source code, (n.d.). https://github.com/davideferrari92/multiobjective_symbolic_regression (accessed March 2, 2023).
  27. 27. Deb K., A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary Computation 6 (2002) 182–197. https://doi.org/10.1109/4235.996017.
  28. 28. Chawla N. V., SMOTE: Synthetic Minority Over-sampling Technique, Journal Of Artificial Intelligence Research 16 (2011) 321–357. https://doi.org/10.1613/jair.953.
  29. 29. Davis J.J., Antimicrobial Resistance Prediction in PATRIC and RAST, Nature Publishing Group (2016). https://doi.org/10.1038/srep27930.
  30. 30. Ren Y., Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning, Bioinformatics 38 (2022) 325–334. pmid:34613360
  31. 31. Liu Z., Evaluation of Machine Learning Models for Predicting Antimicrobial Resistance of Actinobacillus pleuropneumoniae From Whole Genome Sequences, Front Microbiol 11 (2020) 48. pmid:32117101
  32. 32. Lewin-Epstein O., Predicting Antibiotic Resistance in Hospitalized Patients by Applying Machine Learning to Electronic Medical Records, Clinical Infectious Diseases 72 (2021) e848–e855. pmid:33070171
  33. 33. Moran E., Towards personalized guidelines: using machine-learning algorithms to guide antimicrobial selection, Journal of Antimicrobial Chemotherapy 75 (2020) 2677–2680. pmid:32542387
  34. 34. Garcia-Vidal C., Machine Learning to Assess the Risk of Multidrug-Resistant Gram-Negative Bacilli Infections in Febrile Neutropenic Hematological Patients, Infect Dis Ther 10 (2021) 971–983. pmid:33860912
  35. 35. Pascual-Sanchez L., Predicting Multidrug Resistance Using Temporal Clinical Data and Machine Learning Methods, Proceedings—2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 (2021) 2826–2833. https://doi.org/10.1109/BIBM52615.2021.9669829.