Abstract
Background
Surgical site infections (SSIs) lead to increased mortality and morbidity, as well as increased healthcare costs. Multiple models for the prediction of this serious surgical complication have been developed, with an increasing use of machine learning (ML) tools.
Objective
The aim of this systematic review was to assess the performance as well as the methodological quality of validated ML models for the prediction of SSIs.
Methods
A systematic search in PubMed, Embase and the Cochrane Library was performed from inception until July 2023. Exclusion criteria were the absence of reported model validation, SSIs as part of a composite adverse outcome, and pediatric populations. ML performance measures were evaluated, and ML performances were compared to regression-based methods for studies that reported both methods. Risk of bias (ROB) of the studies was assessed using the Prediction model Risk of Bias Assessment Tool.
Results
Of the 4,377 studies screened, 24 were included in this review, describing 85 ML models. Most models were only internally validated (81%). The C-statistic was the most used performance measure (reported in 96% of the studies) and only two studies reported calibration metrics. A total of 116 different predictors were described, of which age, steroid use, sex, diabetes, and smoking were most frequently incorporated (in 75% to 100% of models). Thirteen studies compared ML models to regression-based models and showed a similar performance of both modelling methods. For all included studies, the overall ROB was high or unclear.
Conclusions
A multitude of ML models for the prediction of SSIs are available, with large variability in performance. However, most models lacked external validation, performance was reported limitedly, and the risk of bias was high. In studies describing both ML models and regression-based models, one modelling method did not outperform the other.
Citation: van Boekel AM, van der Meijden SL, Arbous SM, Nelissen RGHH, Veldkamp KE, Nieswaag EB, et al. (2024) Systematic evaluation of machine learning models for postoperative surgical site infection prediction. PLoS ONE 19(12): e0312968. https://doi.org/10.1371/journal.pone.0312968
Editor: Mohamad K. Abou Chaar, Mayo Clinic Rochester, UNITED STATES OF AMERICA
Received: January 21, 2024; Accepted: October 15, 2024; Published: December 12, 2024
Copyright: © 2024 van Boekel et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: B.F. Geerts declares to be shareholder and owner of Healthplus.ai S.L. van der Meijden, M. Wiewel, E.B. Nieswaag, K.F.T. Jochems, J. Holtz, A. van IJlzinga Veenstra, and J. Reijman declare to be an employee of Healthplus.ai. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Introduction
Surgical site infections (SSIs) are known complications following surgery and are among the most frequently occurring hospital-acquired infections. The incidence of SSIs ranges between 0.6% and 18%, depending on the type of surgical procedure and setting [1–4]. Surgical site infections lead to increased morbidity, mortality and hospital stay, with a negative impact on the patient's health-related quality of life [2]. Moreover, SSIs increase healthcare costs due to prolonged hospitalization, the need for additional diagnostic tests and interventions, and prolonged treatment. Recent meta-analyses showed an additional length of hospital stay of 2.1 to 54 days for patients with an SSI [2], with an estimated cost ranging from USD 10,443 to USD 25,546 per case [3]. Early detection and treatment are important for reducing these negative effects of SSIs.
Several risk factors for the development of SSIs have been identified, such as sex, BMI, comorbidity, American Society of Anesthesiologists (ASA) score, smoking, age and surgical approach [5, 6]. Several prognostic prediction models have been developed to identify which patients are at risk of developing an SSI. Besides traditional models, such as those using logistic regression [7], machine learning (ML) models are increasingly being developed and used for this purpose. ML comprises a wide spectrum of algorithms that automatically learn from presented and new input data in a continuous iterative process, with variable selection performed by the algorithms themselves. This is in contrast to traditional models, where variable selection and internal model settings are more dictated by humans [8–10]. ML models benefit not only from this iterative learning process, but also from using more, and more diverse, types of input variables. Their complex algorithmic structure can capture non-linear relations between variables, in contrast to traditional regression-based models [11]. A disadvantage of ML models is that they produce “black-box” predictions, in which the data used for the model output, the (relative) importance of these data and their possible mutual effects are less evident than in regression-based models [12, 13].
To evaluate the statistical performance of prediction models, discriminative performance in terms of concordance statistics (C-statistic), also known as area under the receiver operating characteristic curve (ROC or area under the curve -AUC-), and calibration in terms of calibration plots with slope and intercept are most often assessed [14]. Discriminative performance is the ability of the model to distinguish between patients with and without the outcome, whereas calibration is the agreement between the predicted probability and the proportion of patients with the actual outcome. Prediction models are first internally validated, using for example cross-validation or bootstrapping. Thereafter, external validation should be performed either on other hospital datasets, prospectively in time, or both, to ensure generalizability [15].
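As an illustration of the two measures described above, the C-statistic can be computed as the proportion of all (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, and calibration can be summarized "in the large" as the gap between mean predicted and observed risk. A minimal sketch in Python, with toy data and function names of our own (a full calibration assessment would also require the slope and intercept of a calibration plot, as the text notes):

```python
def c_statistic(y_true, y_score):
    """Concordance statistic (AUC): fraction of (event, non-event) pairs
    in which the event patient gets the higher predicted risk; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_in_the_large(y_true, y_score):
    """Mean predicted risk minus observed event rate
    (0 means well calibrated on average)."""
    return sum(y_score) / len(y_score) - sum(y_true) / len(y_true)

# Toy example: 2 patients with an SSI (label 1) and 2 without (label 0).
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]
print(c_statistic(y, p))              # 0.75: 3 of 4 pairs are concordant
print(calibration_in_the_large(y, p))
```

A model can discriminate perfectly (C-statistic 1.0) yet be badly calibrated, for example when every predicted risk is systematically too high, which is why the review treats the two measures as complementary.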
ML models are increasingly being developed for many different purposes in surgery [7]. Elfanagely et al [16] described 45 ML models used for the prediction of surgical outcomes, and another review [17] summarized the outcomes of 212 articles with ML models developed for predicting a broad spectrum of outcomes in vascular surgery. The ML models performed reasonably well, but there were concerns regarding the risk of bias. A recent systematic review and meta-analysis by Wu et al. showed that many different ML models exist for the prediction or detection of SSIs, but that the validation of these models is generally insufficient [18]. Wu et al. mainly focused on the methodological aspects of the models and made no distinction between SSI prediction and SSI detection for surveillance purposes. Moreover, a clear overview of the available models per surgical specialty or SSI subtype (superficial, deep or organ space SSI) is still missing. The number of models developed for SSI prediction is increasing, and new models have been developed since 2021. The aim of this systematic review was therefore to describe the performance of all internally or externally validated ML models for the prediction of SSI, to assess the methodological quality of the studies describing these models, and to give an overview of the available models per surgical specialty and SSI subtype.
Methods
A systematic review of the published literature on the prediction of postoperative infections was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (S1 Appendix). The protocol for this study was registered in PROSPERO (registration number 248953).
Search strategy
The literature search was performed in MEDLINE, EMBASE and the Cochrane Library from inception to July 1, 2023. The complete search strings are shown in the Supplementary material (S2 Appendix).
Inclusion and exclusion criteria
All original studies that developed and validated (internally or externally) ML models for the prediction of SSIs, as well as studies that externally validated previously developed ML models, were included. A model was considered an ML model if a non-regression-based approach was used for model development, such as random forests, support vector machines or neural networks. As outcome, prediction of all types of SSIs within 30 days postoperatively was included. Models that only predicted SSIs as part of a composite adverse outcome were excluded. Other exclusion criteria were pediatric populations (age <18 years), no full-text article available, and articles not written in English.
Screening and data extraction
Study selection was performed using the Covidence® software program (www.covidence.org, Melbourne, Australia). After removal of duplicates, titles and abstracts were screened against the inclusion criteria by two independent authors (AB, BG, or MW). Full-text analysis of the remaining articles was performed by the same authors. All conflicts were resolved by a third reviewer (MB or SA).
The following data were obtained from the included articles: type of SSI predicted (superficial, deep or organ space), surgical specialty, number of surgeries and/or patients, performance parameters of the model (sensitivity, specificity, accuracy, calibration and C-statistic), method of validation, variables used as predictors, and all types of developed and/or validated models (ML as well as regression-based models). A complete list of the extracted data is provided in S2 Table. Reviewers used a standardized data extraction form based on the CHARMS checklist (CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies) [19]. Extracted data were double-checked for inconsistencies by AB and BG, and discrepancies were resolved by consensus.
Descriptive analyses
Results were summarized using descriptive statistics. We did not perform a meta-analysis due to the heterogeneity in reported outcome measures and definitions. Analyses were performed in R (R Core Team, Vienna, Austria) using RStudio (version 2023.06.1+524).
Risk of bias
The methodological quality of all included studies was assessed using the Prediction model Risk of Bias Assessment Tool (PROBAST) [20, 21]. The PROBAST is designed to critically appraise prediction model studies and covers two aspects: risk of bias, assessed in four domains (participant selection, predictor selection, outcome definition and analysis), and applicability to the review question. In total there are 20 signaling questions, each scored as ‘yes’, ‘probably yes’, ‘probably no’, ‘no’, or ‘no information’, which combined lead to a rating of low, unclear, or high risk of bias and concerns regarding applicability.
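The aggregation of signaling-question answers into domain and overall ratings can be sketched as a simplified rule. This is only an approximation of the published PROBAST guidance (which also allows rater judgment, e.g. rating a domain low risk despite a 'no' when justified), not a faithful implementation of the tool:

```python
def domain_rating(answers):
    """Simplified PROBAST domain rating from signaling-question answers
    ('Y', 'PY', 'PN', 'N', 'NI'). Any (probable) 'no' raises the domain
    to high risk; missing information makes it unclear."""
    if any(a in ('N', 'PN') for a in answers):
        return 'high'
    if 'NI' in answers:
        return 'unclear'
    return 'low'

def overall_rob(domain_ratings):
    """Overall risk of bias: high if any domain is high, low only if
    all domains are low, otherwise unclear."""
    if 'high' in domain_ratings:
        return 'high'
    if all(r == 'low' for r in domain_ratings):
        return 'low'
    return 'unclear'

# Example: one 'no' in the analysis domain drives the overall rating to high,
# regardless of how the other three domains score.
ratings = [domain_rating(d) for d in
           [['Y', 'Y'], ['PY', 'Y'], ['Y', 'NI'], ['Y', 'N']]]
print(ratings)            # ['low', 'low', 'unclear', 'high']
print(overall_rob(ratings))  # high
```

This coarse aggregation also illustrates the criticism discussed later in the article: a domain with a single 'no' answer and a domain with all answers 'no' receive the same high-risk rating.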
Results
A flowchart of the search is summarized in Fig 1. Of the 4,377 publications identified, 24 studies were included for further analysis. See S1 Table for the reasons for exclusion of full-text articles.
Characteristics of included studies
The 24 included studies described a total of 85 different ML models. Sixty-nine models (81%) were internally validated and 16 (19%) were externally validated, including one model (the Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) Calculator) that was externally validated in five separate studies. The most frequently predicted outcome was SSI in general (i.e., a combination of superficial, deep and organ space SSI, or unspecified); 11 models predicted superficial SSI, nine predicted deep SSI and 24 predicted organ space SSI. Abdominal surgery was the surgical specialty for which most models were developed (47%), followed by general surgery (21%) and orthopedic surgery (8%). See Table 1 for an overview of all included studies.
Performance of ML models
The most commonly reported measure of model performance was the C-statistic, reported in 96% of the studies. Other reported performance measures were sensitivity, specificity, negative predictive value (NPV) and positive predictive value (PPV). Only two studies reported calibration metrics, of which one also reported the Brier score [39, 44]. For the internally validated models, the median C-statistic was 0.62 (range 0.44 to 0.99); for the externally validated models, the median C-statistic was 0.79 (range 0.55 to 0.87). Sensitivity, specificity, NPV and PPV were reported for one externally validated model, by Grass et al., and were 0.47, 0.80, 0.97 and 0.10, respectively. Of the internally validated models, sensitivity was reported for twenty (29%) models and ranged from 0.24 to 0.90, specificity for fifteen (22%) models (0.25 to 0.91), NPV for four (6%) models (0.87 to 0.98) and PPV for eleven (16%) models (0.06 to 0.90). Overall, the performance of the models varied widely and there was no clear difference between surgical specialties or between types of SSI predicted (Tables 2–5).
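The reported pattern of a high NPV alongside a low PPV is exactly what one would expect at the low event rates typical of SSIs. A hypothetical illustration in Python: the cohort size and cell counts below are our own construction, chosen to roughly match the sensitivity and specificity reported for the Grass et al. model, and are not data from that study:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Classification metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical cohort of 1,000 patients with a 5% SSI rate (50 events),
# a sensitivity of 0.46 and a specificity of 0.80.
m = diagnostic_metrics(tp=23, fp=190, tn=760, fn=27)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
# Low prevalence pushes PPV down (~0.11) while NPV stays high (~0.97),
# even with unchanged sensitivity and specificity.
```

Because PPV and NPV depend on prevalence while sensitivity and specificity do not, predictive values reported for one surgical population cannot simply be carried over to another with a different SSI rate.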
Predictors used in ML models
Of the 85 included ML models, the number of predictors used in the model was reported for 20 models (24%), with feature importance (determined by SHAP values) reported for 15 models (18%). In total, 116 different predictors were used in these 20 models. The median number of included predictors per model was 22, ranging from 5 to 56. The most commonly included predictors were age (100%), oral corticosteroid use (85%), sex (85%), smoking (80%), and diabetes (75%) (Fig 2).
All predictors used five times or more are included in the figure. ASA, American Society of Anesthesiologists classification; BMI, body mass index; COPD, chronic obstructive pulmonary disease; INR, international normalized ratio; PT, prothrombin time; WBC, white blood cell count.
Regression-based models
Of the 24 studies, thirteen (54%) also included regression-based models and compared their performance to that of the developed ML models (Fig 3 and S3 Table). The C-statistic for regression-based models ranged from 0.41 to 0.95. For the prediction of SSIs, ML performed slightly better than regression-based models in four studies [27, 29, 35, 44], whereas regression-based models performed better in two studies [26, 41]. In the other studies reporting both regression-based and ML models, performances were similar [23, 30, 31, 39, 40, 43, 45]. See Fig 3 for an overview of the AUCs of the studies presenting both ML and regression-based models.
Green dots represent the AUC of the ML models, orange dots represent the AUC of the regression-based models. The green and orange lines represent the median.
Risk of bias
The ROB was assessed for all models described in the 24 studies. ROB was low in the participants domain. ROB was high or unclear in the predictors domain and outcome domain, as studies often poorly reported the used predictors and whether predictors were selected independent of the outcome status. In the analysis domain, all studies had a high or unclear ROB, mostly caused by statistical issues such as poor reporting of performance measures, not taking competing risks into account and inappropriate methods to handle missing data. There were no concerns on applicability for all studies. See Fig 4 for an overview of ROB, and S4 Table for the complete ROB.
Green: low risk of bias; yellow: unclear risk of bias due to lack of information; red: high risk of bias. ROB, risk of bias.
Discussion
This systematic review showed that a multitude of 85 different validated ML prediction models for SSIs exists. Most models were developed and tested in patient populations that underwent abdominal surgery. Most of these models (81%) were only internally validated. The most frequently reported performance parameter was the C-statistic, which varied widely between the different models, and only two studies reported calibration metrics. This corresponds with previous studies on the use of ML in other fields, which found that calibration is rarely reported and that only a minority of models are externally validated [11, 46, 47]. However, for proper assessment of model performance, both discrimination and calibration are essential for the interpretation of the predicted probabilities [14]. Without external validation of a prediction model, it is difficult to accurately estimate its actual performance in other clinical settings. Furthermore, it is common that retraining or recalibration of an ML model is necessary to fit an unseen population [48]. Therefore, newly developed ML prediction models as well as already existing models need to be retrained, recalibrated, and validated again for new populations. Furthermore, their effect on patient care should then be evaluated and reported in impact studies.
Thirteen of the included studies described both regression-based and ML models and compared their performances in the same population. Both the regression-based models as well as the ML models showed large variability of performance, which is in accordance with previous literature on regression-based models for the prediction of SSIs [49–51]. When compared, the ML and regression-based models did not outperform each other. This is in accordance with previous studies that compared ML models with regression-based models, although some studies suggest that certain subtypes of ML (i.e. gradient boosting trees) perform better than regression-based models [52, 53]. ML models generally need larger datasets to use their full potential. It is possible that this condition was not met in all studies, as the median number of predictors was 22 and the sample size ranged from 256 to 5,881,881.
Model explainability is an important issue with ML prediction models. In general, ML models are considered more complex and less transparent than regression-based models with respect to which variables are used for the prediction. Furthermore, in our study, transparency of ML models was further limited because the predictors used were reported for only a minority of the ML models (24%). This contrasts with regression-based models, which are usually presented with regression coefficients representing the strength of the relation between individual predictors and the outcome [54]. Despite being less transparent, ML models are able to utilize large and heterogeneous datasets and data types, can take into account more complex relationships between predictors, can be adapted to the local setting if the model has been validated or recalibrated for this population, and can be incorporated into the electronic health record system, making them potentially more beneficial when implemented into clinical care [10].
The ROB was high or unclear for almost all studies, suggesting considerable methodological issues. ROB was scored using the PROBAST, which is the most commonly used tool to estimate the ROB of prediction studies. Although a high or unclear ROB for almost all studies is in agreement with previous reviews using the PROBAST [55–57], the PROBAST has been criticized for poor inter-rater agreement [56, 57]. Moreover, it is not possible to distinguish domains rated high ROB on the basis of a single signaling question answered with ‘no’ from domains with all signaling questions answered with ‘no’. Despite these limitations, the PROBAST remains a useful tool to assess methodological shortcomings in prediction studies. Caution is therefore recommended when interpreting the findings of these ML models for SSI prediction. Recently, the new TRIPOD+AI guidelines have been published, and newly developed ML models should follow these guidelines in order to prevent bias [58].
Strengths and limitations
The major strength of this systematic review is that it included all presently available validated ML models for the prediction of SSIs, without restrictions on surgical specialty or SSI subtype. In addition, we described the comparison of regression-based models with ML models where possible. As both types of models were compared within the same population, bias was minimized.
Some limitations exist. Differences in the quality and the heterogeneity of the data prevented a sound meta-analysis comparing the different ML models. Furthermore, this review is limited to SSIs as outcome, although other postoperative infections such as pneumonia and bloodstream infections are also clinically relevant.
Conclusions
This systematic review showed that many ML models for the prediction of SSIs exist, and that their performance is generally comparable to that of regression-based models. Machine learning techniques are still developing and are seen as a promising tool to improve medical care. However, there are multiple methodological issues with the currently available models, and there is still a substantial gap between the existing models and their practical and safe implementation in clinical settings. The recently published TRIPOD+AI guidelines should be used to reduce methodological flaws. To create clinically relevant prediction models for future use, more collaboration between clinicians and data scientists, as well as post-implementation studies, is needed.
Supporting information
S2 Table. Extracted parameters from the data.
https://doi.org/10.1371/journal.pone.0312968.s004
(DOCX)
S3 Table. Studies with both ML and regression-based models.
https://doi.org/10.1371/journal.pone.0312968.s005
(DOCX)
S4 Table. Risk of bias assessment with the use of the PROBAST score.
https://doi.org/10.1371/journal.pone.0312968.s006
(DOCX)
Acknowledgments
The authors would like to thank Rory Monahan for proofreading the pre-final manuscript.
References
- 1. Global patient outcomes after elective surgery: prospective cohort study in 27 low-, middle- and high-income countries. Br J Anaesth. 2016;117(5):601–9.
- 2. Badia JM, Casey AL, Petrosillo N, Hudson PM, Mitchell SA, Crosby C. Impact of surgical site infection on healthcare costs and patient outcomes: a systematic review in six European countries. J Hosp Infect. 2017;96(1):1–15. pmid:28410761
- 3. Gillespie BM, Harbeck E, Rattray M, Liang R, Walker R, Latimer S, et al. Worldwide incidence of surgical site infections in general surgical patients: A systematic review and meta-analysis of 488,594 patients. Int J Surg. 2021;95:106136. pmid:34655800
- 4. European Centre for Disease Prevention and Control. Healthcare-associated infections: surgical site infections. ECDC. Annual epidemiological report for 2018–2020. Stockholm; 2023.
- 5. Qu H, Liu Y, Bi DS. Clinical risk factors for anastomotic leakage after laparoscopic anterior resection for rectal cancer: a systematic review and meta-analysis. Surg Endosc. 2015;29(12):3608–17. pmid:25743996
- 6. Dietz N, Sharma M, Alhourani A, Ugiliweneza B, Wang D, Drazin D, et al. Evaluation of Predictive Models for Complications following Spinal Surgery. J Neurol Surg A Cent Eur Neurosurg. 2020;81(6):535–45. pmid:32797468
- 7. Guo Y, Hao Z, Zhao S, Gong J, Yang F. Artificial Intelligence in Health Care: Bibliometric Analysis. J Med Internet Res. 2020;22(7):e18228. pmid:32723713
- 8. Rajkomar A, Dean J, Kohane I. Machine Learning in Medicine. N Engl J Med. 2019;380(14):1347–58. pmid:30943338
- 9. Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to Machine Learning, Neural Networks, and Deep Learning. Transl Vis Sci Technol. 2020;9(2):14. pmid:32704420
- 10. Miotto R, Wang F, Wang S, Jiang X, Dudley JT. Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform. 2018;19(6):1236–46. pmid:28481991
- 11. Andaur Navarro CL, Damen JAA, van Smeden M, Takada T, Nijman SWJ, Dhiman P, et al. Systematic review identifies the design and methodological conduct of studies on machine learning-based prediction models. J Clin Epidemiol. 2023;154:8–22. pmid:36436815
- 12. Solomonides AE, Koski E, Atabaki SM, Weinberg S, McGreevey JD, Kannry JL, et al. Defining AMIA’s artificial intelligence principles. J Am Med Inform Assoc. 2022;29(4):585–91. pmid:35190824
- 13. Hunter DJ, Holmes C. Where Medical Statistics Meets Artificial Intelligence. N Engl J Med. 2023;389(13):1211–9. pmid:37754286
- 14. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128–38. pmid:20010215
- 15. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014;35(29):1925–31. pmid:24898551
- 16. Elfanagely O, Toyoda Y, Othman S, Mellia JA, Basta M, Liu T, et al. Machine Learning and Surgical Outcomes Prediction: A Systematic Review. J Surg Res. 2021;264:346–61. pmid:33848833
- 17. Li B, Feridooni T, Cuen-Ojeda C, Kishibe T, de Mestral C, Mamdani M, et al. Machine learning in vascular surgery: a systematic review and critical appraisal. NPJ Digit Med. 2022;5(1):7. pmid:35046493
- 18. Wu G, Khair S, Yang F, Cheligeer C, Southern D, Zhang Z, et al. Performance of machine learning algorithms for surgical site infection case detection and prediction: A systematic review and meta-analysis. Ann Med Surg (Lond). 2022;84:104956. pmid:36582918
- 19. Moons KG, de Groot JA, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11(10):e1001744. pmid:25314315
- 20. Moons KGM, Wolff RF, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A Tool to Assess Risk of Bias and Applicability of Prediction Model Studies: Explanation and Elaboration. Ann Intern Med. 2019;170(1):W1–w33. pmid:30596876
- 21. de Jong Y, Ramspek CL, Zoccali C, Jager KJ, Dekker FW, van Diepen M. Appraising prediction research: a guide and meta-review on bias and applicability assessment using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Nephrology (Carlton). 2021;26(12):939–47. pmid:34138495
- 22. Bertsimas D, Dunn J, Velmahos GC, Kaafarani HMA. Surgical Risk Is Not Linear: Derivation and Validation of a Novel, User-friendly, and Machine-learning-based Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) Calculator. Ann Surg. 2018;268(4):574–83. pmid:30124479
- 23. Bonde A, Varadarajan KM, Bonde N, Troelsen A, Muratoglu OK, Malchau H, et al. Assessing the utility of deep neural networks in predicting postoperative surgical complications: a retrospective study. Lancet Digit Health. 2021;3(8):e471–e85. pmid:34215564
- 24. Chang B, Sun Z, Peiris P, Huang ES, Benrashid E, Dillavou ED. Deep Learning-Based Risk Model for Best Management of Closed Groin Incisions After Vascular Surgery. J Surg Res. 2020;254:408–16. pmid:32197791
- 25. El Hechi MW, Maurer LR, Levine J, Zhuo D, El Moheb M, Velmahos GC, et al. Validation of the Artificial Intelligence-Based Predictive Optimal Trees in Emergency Surgery Risk (POTTER) Calculator in Emergency General Surgery and Emergency Laparotomy Patients. J Am Coll Surg. 2021. pmid:33705983
- 26. Gowd AK, Agarwalla A, Amin NH, Romeo AA, Nicholson GP, Verma NN, et al. Construct validation of machine learning in the prediction of short-term postoperative complications following total shoulder arthroplasty. J Shoulder Elbow Surg. 2019;28(12):e410–e21. pmid:31383411
- 27. Grass F, Storlie CB, Mathis KL, Bergquist JR, Asai S, Boughey JC, et al. Challenges of Modeling Outcomes for Surgical Infections: A Word of Caution. Surg Infect (Larchmt). 2020.
- 28. Ke C, Jin Y, Evans H, Lober B, Qian X, Liu J, et al. Prognostics of surgical site infections using dynamic health data. J Biomed Inform. 2017;65:22–33. pmid:27825798
- 29. Liu WC, Ying H, Liao WJ, Li MP, Zhang Y, Luo K, et al. Using Preoperative and Intraoperative Factors to Predict the Risk of Surgical Site Infections After Lumbar Spinal Surgery: A Machine Learning-Based Study. World Neurosurg. 2022;162:e553–e60. pmid:35318153
- 30. Liu X, Lei S, Wei Q, Wang Y, Liang H, Chen L. Machine Learning-based Correlation Study between Perioperative Immunonutritional Index and Postoperative Anastomotic Leakage in Patients with Gastric Cancer. Int J Med Sci. 2022;19(7):1173–83. pmid:35919820
- 31. Mamlook REA, Wells LJ, Sawyer R. Machine-learning models for predicting surgical site infections using patient pre-operative risk and surgical procedure factors. Am J Infect Control. 2023;51(5):544–50. pmid:36002080
- 32. Maurer LR, Chetlur P, Zhuo D, El Hechi M, Velmahos GC, Dunn J, et al. Validation of the AI-based Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) Calculator in Patients 65 Years and Older. Ann Surg. 2020;Publish Ahead of Print.
- 33. Mazaki J, Katsumata K, Ohno Y, Udo R, Tago T, Kasahara K, et al. A Novel Predictive Model for Anastomotic Leakage in Colorectal Cancer Using Auto-artificial Intelligence. Anticancer Res. 2021;41(11):5821–5. pmid:34732457
- 34. Merath K, Hyer JM, Mehta R, Farooq A, Bagante F, Sahara K, et al. Use of Machine Learning for Prediction of Patient Risk of Postoperative Complications After Liver, Pancreatic, and Colorectal Surgery. J Gastrointest Surg. 2020;24(8):1843–51. pmid:31385172
- 35. Nudel J, Bishara AM, de Geus SWL, Patil P, Srinivasan J, Hess DT, et al. Development and validation of machine learning models to predict gastrointestinal leak and venous thromboembolism after weight loss surgery: an analysis of the MBSAQIP database. Surg Endosc. 2021;35(1):182–91. pmid:31953733
- 36. Ohno Y, Mazaki J, Udo R, Tago T, Kasahara K, Enomoto M, et al. Preliminary Evaluation of a Novel Artificial Intelligence-based Prediction Model for Surgical Site Infection in Colon Cancer. Cancer Diagn Progn. 2022;2(6):691–6. pmid:36340449
- 37. Sanger PC, van Ramshorst GH, Mercan E, Huang S, Hartzler AL, Armstrong CA, et al. A Prognostic Model of Surgical Site Infection Using Daily Clinical Wound Assessment. J Am Coll Surg. 2016;223(2):259–70.e2. pmid:27188832
- 38. Taylor J, Meng X, Renson A, Smith AB, Wysock JS, Taneja SS, et al. Different models for prediction of radical cystectomy postoperative complications and care pathways. Ther Adv Urol. 2019;11:1756287219875587. pmid:31565072
- 39. Van Esbroeck A, Rubinfeld I, Hall B, Syed Z. Quantifying surgical complexity with machine learning: looking beyond patient factors to improve surgical models. Surgery. 2014;156(5):1097–105. pmid:25108343
- 40. van Kooten RT, Bahadoer RR, Ter Buurkes de Vries B, Wouters M, Tollenaar R, Hartgrink HH, et al. Conventional regression analysis and machine learning in prediction of anastomotic leakage and pulmonary complications after esophagogastric cancer surgery. J Surg Oncol. 2022;126(3):490–501. pmid:35503455
- 41. Velmahos CS, Paschalidis A, Paranjape CN. The Not-So-Distant Future or Just Hype? Utilizing Machine Learning to Predict 30-Day Post-Operative Complications in Laparoscopic Colectomy Patients. Am Surg. 2023:31348231167397. pmid:36992631
- 42. Walczak S, Davila M, Velanovich V. Prophylactic antibiotic bundle compliance and surgical site infections: an artificial neural network analysis. Patient Saf Surg. 2019;13:41. pmid:31827618
- 43. Weller GB, Lovely J, Larson DW, Earnshaw BA, Huebner M. Leveraging electronic health records for predictive modeling of post-surgical complications. Stat Methods Med Res. 2018;27(11):3271–85. pmid:29298612
- 44. Ying H, Guo BW, Wu HJ, Zhu RP, Liu WC, Zhong HF. Using multiple indicators to predict the risk of surgical site infection after ORIF of tibia fractures: a machine learning based study. Front Cell Infect Microbiol. 2023;13:1206393. pmid:37448774
- 45. Zhang N, Fan K, Ji H, Ma X, Wu J, Huang Y, et al. Identification of risk factors for infection after mitral valve surgery through machine learning approaches. Front Cardiovasc Med. 2023;10:1050698. pmid:37383697
- 46. van der Endt VHW, Milders J, Penning de Vries BBL, Trines SA, Groenwold RHH, Dekkers OM, et al. Comprehensive comparison of stroke risk score performance: a systematic review and meta-analysis among 6 267 728 patients with atrial fibrillation. Europace. 2022;24(11):1739–53. pmid:35894866
- 47. de Jong Y, Ramspek CL, van der Endt VHW, Rookmaaker MB, Blankestijn PJ, Vernooij RWM, et al. A systematic review and external validation of stroke prediction models demonstrates poor performance in dialysis patients. J Clin Epidemiol. 2020;123:69–79. pmid:32240769
- 48. de Hond AAH, Kant IMJ, Fornasa M, Cinà G, Elbers PWG, Thoral PJ, et al. Predicting Readmission or Death After Discharge From the ICU: External Validation and Retraining of a Machine Learning Model. Crit Care Med. 2023;51(2):291–300. pmid:36524820
- 49. Kunutsor SK, Whitehouse MR, Blom AW, Beswick AD. Systematic review of risk prediction scores for surgical site infection or periprosthetic joint infection following joint arthroplasty. Epidemiol Infect. 2017;145(9):1738–49. pmid:28264756
- 50. Gwilym BL, Ambler GK, Saratzis A, Bosanquet DC. Groin Wound Infection after Vascular Exposure (GIVE) Risk Prediction Models: Development, Internal Validation, and Comparison with Existing Risk Prediction Models Identified in a Systematic Literature Review. Eur J Vasc Endovasc Surg. 2021;62(2):258–66. pmid:34246547
- 51. Lubelski D, Alentado V, Nowacki AS, Shriver M, Abdullah KG, Steinmetz MP, et al. Preoperative Nomograms Predict Patient-Specific Cervical Spine Surgery Clinical and Quality of Life Outcomes. Neurosurgery. 2018;83(1):104–13. pmid:29106662
- 52. Song X, Liu X, Liu F, Wang C. Comparison of machine learning and logistic regression models in predicting acute kidney injury: A systematic review and meta-analysis. Int J Med Inform. 2021;151:104484. pmid:33991886
- 53. Christodoulou E, Ma J, Collins GS, Steyerberg EW, Verbakel JY, Van Calster B. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J Clin Epidemiol. 2019;110:12–22. pmid:30763612
- 54. van Smeden M, Heinze G, Van Calster B, Asselbergs FW, Vardas PE, Bruining N, et al. Critical appraisal of artificial intelligence-based prediction models for cardiovascular disease. Eur Heart J. 2022;43(31):2921–30. pmid:35639667
- 55. Venema E, Wessler BS, Paulus JK, Salah R, Raman G, Leung LY, et al. Large-scale validation of the prediction model risk of bias assessment Tool (PROBAST) using a short form: high risk of bias models show poorer discrimination. J Clin Epidemiol. 2021;138:32–9. pmid:34175377
- 56. Langenhuijsen LFS, Janse RJ, Venema E, Kent DM, van Diepen M, Dekker FW, et al. Systematic metareview of prediction studies demonstrates stable trends in bias and low PROBAST inter-rater agreement. J Clin Epidemiol. 2023;159:159–73. pmid:37142166
- 57. Kaiser I, Pfahlberg AB, Mathes S, Uter W, Diehl K, Steeb T, et al. Inter-Rater Agreement in Assessing Risk of Bias in Melanoma Prediction Studies Using the Prediction Model Risk of Bias Assessment Tool (PROBAST): Results from a Controlled Experiment on the Effect of Specific Rater Training. J Clin Med. 2023;12(5). pmid:36902763
- 58. Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Van Calster B, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. Bmj. 2024;385:e078378. pmid:38626948