Abstract
Objective
To establish whether or not a natural language processing technique could identify two common inpatient neurosurgical comorbidities using only text reports of inpatient head imaging.
Materials and methods
A training and testing dataset of reports of 979 CT or MRI scans of the brain for patients admitted to the neurosurgery service of a single hospital in June 2021 or to the Emergency Department between July 1–8, 2021, was identified. A variety of machine learning and deep learning algorithms utilizing natural language processing were trained on the training set (84% of the total cohort) and tested on the remaining images. A subset comparison cohort (n = 76) was then assessed to compare output of the best algorithm against real-life inpatient documentation.
Results
For “brain compression”, a random forest classifier outperformed other candidate algorithms with an accuracy of 0.81 and area under the curve (AUC) of 0.90 in the testing dataset. For “brain edema”, a random forest classifier again outperformed other candidate algorithms with an accuracy of 0.92 and AUC of 0.94 in the testing dataset. In the provider comparison dataset, for “brain compression,” the random forest algorithm demonstrated better accuracy (0.76 vs 0.70) and sensitivity (0.73 vs 0.43) than provider documentation. For “brain edema,” the algorithm again demonstrated better accuracy (0.92 vs 0.84) and sensitivity (0.45 vs 0.09) than provider documentation.
Citation: Sastry RA, Setty A, Liu DD, Zheng B, Ali R, Weil RJ, et al. (2024) Natural language processing augments comorbidity documentation in neurosurgical inpatient admissions. PLoS ONE 19(5): e0303519. https://doi.org/10.1371/journal.pone.0303519
Editor: Vijayalakshmi Kakulapati, Sreenidhi Institute of Science and Technology, INDIA
Received: August 10, 2022; Accepted: April 4, 2024; Published: May 9, 2024
Copyright: © 2024 Sastry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant anonymized data and code are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Timely and accurate medical documentation is a quality and safety imperative. Precise documentation can advance efficacious inpatient care, enhance transitions across the healthcare ecosystem, reduce needless variation and excess utilization, facilitate clinical research efforts, and capture the intensity and quality of care, on which hospital reimbursements are based [1–3]. Education and training regarding best practices in documentation, however, can be perceived as extraneous or of minimal importance, especially in the context of resident education and training. Such inattention can result in substantial underestimations of the intensity of care provided to operative and non-operative surgical inpatients [2–6]. Estimated revenue losses of up to 40% have clear and obvious consequences for hospital operations, particularly in the context of hospitalized trauma patients who do not undergo surgical intervention [3]. An array of interventions, including provider education, constant clinician review of electronic medical records (EMR), manually generated documentation queries, and others, have been implemented at various centers. However, they are often additive to the work of busy clinicians and trainees, who already spend historically large amounts of time on documentation, or by expanding the numbers of clinical documentation staff to constantly assess provider practices [5, 7–10]. These additive measures are of the work harder, not smarter, framework and have been documented across healthcare to be a growing source of clinician discontent and burnout [11, 12].
In the United States, inpatient reimbursements are determined by broad classifications of patient diagnoses known as diagnosis-related groups (DRGs), which were originally implemented as part of Medicare’s Prospective Payment System in 1983 [1, 13, 14]. Medicare Severity DRGs (MS-DRGs), the most common system used in the United States, are stratified into three categories: (1) DRG without complication or comorbidity (CC) and without major CC; (2) DRG with a CC; and (3) DRG with MCC [1, 14]. Inpatient and relevant outpatient documentation of pertinent medical and surgical diagnoses, as well as the specific treatments or interventions that treat these diagnoses, determines the CCs and MCCs used as secondary diagnoses during and after admission. In this context, our neurosurgery department at a large American level 1 trauma center recently implemented a provider-based initiative to improve inpatient documentation and comorbidity capture rates [4].
Given the success of machine learning (ML) approaches in a variety of medical contexts [15–17], we hypothesized that a natural language processing (NLP)-based ML algorithm may be able to identify neurosurgical inpatients likely to have 1 or more commonly encountered CC/MCCs based solely on text interpretations of computed tomography (CT) or magnetic resonance imaging (MRI) reports obtained during hospital admission regardless of underlying pathology.
Materials and methods
Patient cohort
The protocol for this study was reviewed and approved by the Institutional Review Board of Rhode Island Hospital (Providence, RI). As the proposed research was a retrospective observational study, the need for patient consent was waived by the aforementioned Institutional Review Board. Data were fully anonymized at the time of chart review. A retrospective cohort of 979 images comprised all scans and respective radiological impressions of patients admitted to the neurosurgery service in June 2021 who underwent either CT or MRI of the brain, and all scans of patients seen in the emergency department from July 1–8, 2021 who underwent either CT or MRI of the brain. This cohort was devised to include an appropriate number of positive and negative controls for algorithm training and development, even though the target population of this effort consists exclusively of neurosurgical admissions. Given the nature of these inclusion criteria, in some cases, multiple scans were included for a given patient/admission. A separate provider comparison cohort, comprising 76 patients who were admitted to the neurosurgery service in October 2021 and underwent inpatient CT or MRI of the brain, was also identified to facilitate comparison between algorithm performance and real-world documentation. October 2021 was selected as a representative month because it reflected steady-state documentation practices after the recent implementation of a documentation improvement protocol and progress note template [4]. In this cohort, only the first scan obtained within our hospital’s system, regardless of indication or modality, was included (thus resulting in one scan per patient). This cohort was used exclusively as a subset of the test data so that model performance could be compared with provider performance across the entire cohort.
The combined dataset (n = 1055) was split into a training cohort (n = 885, 83.9%) and a testing set (n = 170, 16.1%) for the purpose of algorithm selection and training. Class imbalance was a major consideration in our data split structure, as model fitting can be biased by highly imbalanced datasets. We ensured that our training and testing sets had relatively similar class proportions (Table 1). Images were not excluded on the basis of elective vs. emergent admission or surgical vs. non-surgical pathologies.
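A stratified split of this kind can be sketched with scikit-learn, which the study used for modeling. This is a minimal illustration with invented reports and labels, not the study's data; the ~16% test fraction mirrors the split described above.

```python
# Sketch of a stratified train/test split that preserves class proportions
# in both partitions. The reports, labels, and positive rate are invented.
from sklearn.model_selection import train_test_split

# Hypothetical report texts and binary gold labels (1 = comorbidity present)
reports = [f"report {i}" for i in range(1000)]
labels = [1 if i % 5 == 0 else 0 for i in range(1000)]  # ~20% positive

X_train, X_test, y_train, y_test = train_test_split(
    reports, labels,
    test_size=0.16,     # roughly the 84%/16% split described in the text
    stratify=labels,    # keep similar class proportions in train and test
    random_state=0,
)

# Both partitions retain roughly the same positive-class fraction
print(sum(y_train) / len(y_train), sum(y_test) / len(y_test))
```

Passing `stratify=labels` is what guards against a lopsided split when one class is rare, which is the concern the paragraph above raises.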
Gold labels
All 1,055 patient images were reviewed by a single author (RAS) in a blinded fashion and were assessed for the presence or absence of “brain compression” or “brain edema,” both of which are common neurosurgical CC/MCCs that were the primary targets of a recent intra-departmental documentation improvement effort [4].
Human prediction
Records for patients in the provider comparison cohort, the temporal range of which was chosen to reflect documentation practices after successful implementation of a provider-education intervention in late 2020, were also manually queried for discharge summary documentation of “brain compression” and “brain edema”; as such, for patients in this cohort, the presence or absence of either term in the discharge summary was used to assess the performance of real-world provider documentation against the gold standard of author review.
Data pre-processing
We used only the impression texts of CT and MRI radiology reports to predict “brain compression” and “brain edema” classifications. All word/data tokenization was completed using the Natural Language Toolkit (NLTK) [18] package in the Python programming language (Python Software Foundation, https://www.python.org/). All texts were first “tokenized” into single-word vectors by splitting the text on white space, so that each word became one word “token”. The list of tokens was then parsed: all words were cast to lowercase, and all stop words and punctuation were removed to isolate significant word tokens. The list of word vectors was then scored using two different word-scoring strategies: term frequency-inverse document frequency (TF-IDF) and term frequency (TF) (Fig 1). These word vectors were then fed into ML and deep learning (DL) algorithms with a bag-of-words technique to predict lesion classification. Bag-of-words featurization allows sentences to be vectorized based on the words they contain. The dimension of the sentence vector space is set to the number of unique word tokens, where each index of the vector represents a unique word. Each sentence vector is constructed using either a TF approach or a TF-IDF approach. The TF approach assigns values to each index in a sentence vector based on the frequency of that word in the sentence. The TF-IDF approach discounts words that occur with high frequency in the corpus by a discounting factor of log(N/df), where N represents the number of reports in the dataset and df represents the number of documents in which a specific term is present. The values at each index of a sentence vector constructed using the TF-IDF approach are the product of the discounting factor and the respective TF value (Fig 1).
TF-IDF = term frequency-inverse document frequency.
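The preprocessing and featurization steps described above can be sketched directly from their definitions. The example impressions, stop-word list, and vocabulary below are invented for illustration; the TF-IDF discount is the log(N/df) factor given in the text.

```python
# Minimal sketch of the pipeline described above: split on white space,
# lowercase, strip stop words and punctuation, then build TF and TF-IDF
# bag-of-words vectors. Impressions and the stop list are stand-ins.
import math
import string

STOP_WORDS = {"the", "of", "with", "a", "is"}  # stand-in stop list

def tokenize(text):
    tokens = text.split()                               # one word -> one token
    tokens = [t.lower().strip(string.punctuation) for t in tokens]
    return [t for t in tokens if t and t not in STOP_WORDS]

corpus = [
    "Mass effect with compression of the left lateral ventricle.",
    "No acute intracranial abnormality.",
    "Vasogenic edema surrounding the lesion.",
]
docs = [tokenize(d) for d in corpus]
vocab = sorted({t for doc in docs for t in doc})
N = len(corpus)
# document frequency: number of reports containing each term
df = {t: sum(t in doc for doc in docs) for t in vocab}

def tf_vector(doc):
    # term frequency: raw count of each vocabulary word in the report
    return [doc.count(t) for t in vocab]

def tfidf_vector(doc):
    # TF discounted by log(N / df), as defined in the text
    return [doc.count(t) * math.log(N / df[t]) for t in vocab]

print(tfidf_vector(docs[0]))
```

A term appearing in every report gets a discount of log(N/N) = 0, so ubiquitous words contribute nothing to the TF-IDF vector, which is the intended effect of the discounting factor.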
Overall, the entire dataset of radiology reports mentioned “compression” in only 1.7% of reports and “edema” in 15.5% of reports. The presence of these specific tokens does not deterministically correlate with a positive label (27% and 54% of reports containing the compression and edema tokens were positive for compression and edema, respectively), which demonstrates the need for a more sophisticated NLP approach to classification.
As previously noted, the primary patient cohort data was split into an 84% training set and 16% testing set. After tokenizing and preprocessing the radiology data, we used a multitude of ML and DL supervised learning models to predict lesion classifications.
Machine learning and deep learning prediction
We used the Python packages scikit-learn [19] and TensorFlow [20] to fit the ML and DL models, respectively. We trained random forest, logistic regression, support vector machine, and naïve Bayes classifiers on both word-scoring techniques (TF-IDF and TF). For DL models, we fit a single-layer perceptron and a multilayer perceptron for classification. Each model was fit once for brain compression and once for brain edema; all models were binary classifiers. Both DL methods used a binary cross-entropy loss function, the random forest used the Gini impurity criterion, and the remaining ML classifiers used their respective default loss functions/techniques prebuilt in scikit-learn’s library.
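The ML side of this setup can be sketched as a set of scikit-learn pipelines. The tiny corpus and labels below are invented, and the model settings are defaults rather than the study's tuned hyperparameters.

```python
# Sketch of fitting the four candidate ML classifiers on TF-IDF features.
# Texts and labels are illustrative only; swap TfidfVectorizer for
# CountVectorizer to use plain TF features instead.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical impression texts with binary "brain compression" labels
texts = [
    "mass effect with midline shift and ventricular compression",
    "no acute intracranial abnormality",
    "effacement of sulci with compression of the lateral ventricle",
    "stable postoperative changes without mass effect",
] * 5
labels = [1, 0, 1, 0] * 5

candidates = {
    "random_forest": RandomForestClassifier(random_state=0),
    "logistic": LogisticRegression(max_iter=1000),
    "svm": SVC(probability=True),
    "naive_bayes": MultinomialNB(),
}

fitted = {}
for name, clf in candidates.items():
    # one pipeline per candidate: vectorize text, then classify
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(texts, labels)
    fitted[name] = model

print(fitted["random_forest"].predict(["compression of the third ventricle"]))
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary learned from the training reports consistent between fitting and prediction.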
Hyperparameter optimization was carried out through grid search, optimizing for area under the curve (AUC) to avoid biased model fitting due to our slightly unbalanced datasets. Each model was fit using 5-fold cross validation, in which the training data was split into 5 equally sized groups and the model was trained on 4 of the 5 folds, with the remaining fold used as a validation set. The 5-fold validation was carried out with a shuffle split to ensure the training data was shuffled and randomized before being divided into folds. The training and validation accuracies were used only to select the best hyperparameters for each model and are not reported here. For each model, AUC and accuracy, the proportion of correct binary classifications among the samples in the test set, are reported in addition to other select metrics.
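The tuning procedure described above can be sketched with scikit-learn's grid search. The synthetic feature matrix and the grid values are illustrative stand-ins, not the study's actual search space.

```python
# Sketch of grid search scored by AUC with shuffled, stratified 5-fold
# cross validation, as described in the text. Data and grid are invented.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Stand-in feature matrix (e.g., TF-IDF vectors) with imbalanced labels
X, y = make_classification(n_samples=200, n_features=20, weights=[0.8],
                           random_state=0)

# shuffle=True randomizes the data before it is divided into 5 folds
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    scoring="roc_auc",   # optimize AUC to limit bias from class imbalance
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Scoring by `roc_auc` rather than accuracy is the relevant design choice here: with an imbalanced dataset, a trivial majority-class classifier can score well on accuracy but not on AUC.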
Results
Characteristics of the patient cohort and included imaging studies are summarized in Table 2. The performance of included ML and DL algorithms for prediction of “brain compression” among the testing cohort are presented in Fig 2. Among ML methods, a random forest classifier with TF-IDF tokenization outperformed other candidate algorithms with an accuracy of 0.81 (Standard Deviation [SD] 0.01) and AUC of 0.90 (SD 0.01). Among DL methods, a multilayer perceptron method with frequency tokenization outperformed other candidate algorithms with an accuracy of 0.78 (SD 0.02) and AUC of 0.88 (SD 0.01). The performance of included ML and DL algorithms for prediction of “brain edema” are presented in Fig 3. Among ML methods, the random forest classifier with term frequency-inverse document frequency again outperformed other candidate algorithms with an accuracy of 0.92 (SD 0.02) and AUC of 0.94 (SD 0.01). Among DL methods, a multi-layer perceptron method with term frequency-inverse document frequency outperformed other candidate algorithms with an accuracy of 0.89 (SD 0.01) and AUC of 0.87 (SD <0.01).
(A) Machine learning classifiers’ performance with both term frequency (TF) and term frequency-inverse document frequency (TF-IDF) tokenization strategies. (B) Deep learning classifiers’ performance with both TF and TF-IDF tokenization strategies. SVM = support vector machine; NB = naïve Bayes; Log = logistic regression.
(A) Machine learning classifiers’ performance with both term frequency (TF) and term frequency-inverse document frequency (TF-IDF) tokenization strategies. (B) Deep learning classifiers’ performance with both TF and TF-IDF tokenization strategies. SVM = support vector machine; NB = naïve Bayes; Log = logistic regression.
Receiver operating characteristic (ROC) curves for both random forest classifiers are shown in Fig 4. For the optimal compression classifier, we chose a point on the ROC curve that corresponded to a classifier with an accuracy of 0.81, specificity of 0.88, and sensitivity of 0.65. For the optimal edema classifier, we chose a point on the ROC curve that corresponded to a classifier with an accuracy of 0.92, specificity of 1.0, and sensitivity of 0.48. A more complete characterization of each ML and DL classifier’s performance is broken down in Table 3A and 3B.
(A) Estimator trained for brain compression classification. (B) Estimator trained for brain edema classification. AUC = area under the curve.
A. Performance Metrics for ML and DL Models on Brain Compression. B. Performance Metrics for ML and DL Models on Brain Edema.
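Choosing an operating point on the ROC curve, as described above, amounts to sweeping classification thresholds and selecting one that meets a performance target. The sketch below uses synthetic data and a hypothetical specificity target of 0.9; it is not the study's actual selection rule.

```python
# Sketch of picking an ROC operating point: take the most sensitive
# threshold whose specificity (1 - FPR) meets a target. Illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]        # positive-class probabilities
fpr, tpr, thresholds = roc_curve(y_te, scores)

# keep thresholds whose specificity (1 - FPR) is at least 0.9,
# then take the one with the highest sensitivity (TPR)
ok = np.where(1 - fpr >= 0.9)[0]
best = ok[np.argmax(tpr[ok])]
print(thresholds[best], tpr[best], 1 - fpr[best])
```

Each point on the ROC curve is one such threshold, so trading sensitivity against specificity, as the chosen compression and edema operating points do, is a matter of moving along this curve.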
Based on these data, the random forest classifier was selected as the best performing algorithm and compared against discharge summary documentation in Fig 5. For documentation of “brain compression”, the random forest algorithm demonstrated better accuracy (0.76 vs 0.70) and sensitivity (0.73 vs 0.43) than provider documentation. The logistic regression also performed very well, with identical accuracy (0.76) and slightly higher sensitivity (0.76 vs 0.73), albeit with a slightly lower AUC than the random forest (0.87 vs 0.89). For “brain edema,” the random forest algorithm again demonstrated better accuracy (0.92 vs 0.84) and sensitivity (0.45 vs 0.09) than provider documentation. The logistic regression also performed very well, with identical accuracy (0.92) and higher sensitivity (0.54 vs 0.45), albeit with a lower AUC than the random forest (0.88 vs 0.91).
(A) Estimators for compression dataset. (B) Estimators for edema dataset. SVM = support vector machine; NB = Naïve bayes; Log = Logistic regression.
Overall, our optimal classifiers for both brain compression and edema vastly outperformed provider documentation in sensitivity due to their ability to more readily identify true brain compression and brain edema cases. We do see, however, a lower specificity in our brain compression classifiers relative to providers (0.80 vs 0.95), but this can likely be attributed to the fact that provider documentation overwhelmingly failed to document brain compression. This led to a high number of true negatives and a low number of false positives, resulting in very high specificity. A detailed comparison of our classifiers’ performance against provider documentation is presented in Table 4A and 4B.
A. Estimator and Provider Performance Comparison for Brain Compression. B. Estimator and Provider Performance Comparison for Brain Edema.
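The accuracy, sensitivity, and specificity figures compared above all derive from the same confusion-matrix counts. The sketch below shows that computation on invented label vectors, not the study's data.

```python
# Sketch of the metric comparison: accuracy, sensitivity, and specificity
# computed from binary predictions against gold labels. Labels are invented.
from sklearn.metrics import confusion_matrix

gold      = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
algorithm = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # hypothetical model output
provider  = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # hypothetical documentation

def metrics(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

print(metrics(gold, algorithm))
print(metrics(gold, provider))
```

Note how the hypothetical provider vector, which documents almost nothing, still achieves perfect specificity: never flagging a diagnosis produces no false positives, which mirrors the explanation given above for the providers' high specificity.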
Discussion
In contemporary American healthcare, the benefits of improved documentation are at best infrequently and indirectly apparent to those on whom the burden of documentation falls. As such, despite the longevity of the DRG-based reimbursement system, sporadic hospital- and practice-based efforts to optimize inpatient documentation abound [1, 2, 4–6, 10, 14, 21–29]. Given the relatively large financial impact of neurosurgical procedures on overall hospital finances and the significant costs of non-operative trauma care, developing simple, reproducible, and efficacious mechanisms for documentation improvement for inpatient neurosurgical practitioners is of paramount importance [2, 5, 30]. In this context, we report the successful development and validation of an NLP/ML-based algorithm for the identification of two common neurosurgical CC/MCCs from the reports of CTs or MRIs of the brain. When assessed against real-life performance of inpatient neurosurgical providers, our algorithm outperformed baseline provider documentation after the recent implementation of a documentation improvement effort. These results suggest that ML-based decision support should be considered an efficient and cost-effective component of future documentation improvement efforts and, in this specific context, could suggest diagnoses for documentation along with diagnosis-specific treatment plans. More broadly, the implementation of an efficient, text-based algorithm could have many applications to inpatient care outside of neurosurgery alone.
Time spent documenting in the EMR already consumes multiple hours in the average surgical workday [9, 31]. This documentation burden is increasingly significant for inpatient medical and surgical residents, who, along with inpatient APPs, perform the majority of consequential documentation for hospital inpatients in academic centers [7, 8, 14, 23, 31, 32]. Surgeon perception that additional documentation may not be clinically meaningful necessarily limits the implementation of documentation improvement programs, nearly all of which require investment in the form of time, personnel, or both [33]. As previously noted, many previous interventions have coupled targeted provider education sessions with ongoing chart review to provide providers with feedback or to generate further documentation queries [2–6, 14, 21, 24, 25, 27, 28]. For instance, Fox et al report a cost greater than $350,000 and return on investment of 220% for a program that involved personalized documentation teaching sessions and allocation of documentation specialists to round with a trauma surgery team and to review notes at a Level 1 trauma center [10]. Similarly, Spurgeon et al reported that nurses working 10–15 hours/week on documentation improvement were able to review fewer than half of inpatient neurosurgical notes over an 8-week period [5]. Efforts to minimize the time investment required by providers to update notes underpinned the development of the documentation query, in which a provider need only respond “yes” or “no” for the presence or absence of a given diagnosis [34]; however, even with simple systems, time-consuming manual review by documentation experts is still required to generate queries.
The progress note template, which standardizes common comorbidities during documentation efforts, is another low-cost documentation improvement intervention; indeed, simple, paper-based checklists have, perplexingly, been shown to yield more thorough documentation than EMR-based approaches [29]. In clinical contexts, “brain compression” and “brain edema” can be reliably extracted only from neuroimaging and rarely convey meaningful clinical information relative to more commonly used expressions; as such, ML-based approaches to extract these diagnoses, which are both common and commonly undocumented, may yield significant benefits at low cost.
ML and NLP applications in neurosurgery and neuroimaging are numerous and varied [15, 35–39] and have only increased in breadth and depth since the widespread popularization of large language models (LLMs) such as ChatGPT [40–44]. A variety of methods utilizing radiology reports, raw images, or both have been successfully applied in numerous clinical settings [45, 46]. Decision support for clinical documentation may offer a particularly fruitful application of these technologies, especially given that the imperative is to augment documentation at provider discretion without necessarily changing the course of patient care. Documentation efforts likely require a flexible approach for ML applications: certain diagnoses, such as “brain compression”, can be learned exclusively from imaging reports. Others, such as “encephalopathy”, cannot, and would instead require parsing provider documentation and medication administration records. Another challenge of clinical documentation efforts is that documentation requirements for various stakeholders may not necessarily overlap and, furthermore, may change over time with the release of new documentation standards. A final challenge, which will likely become more prominent with the availability of public LLMs, is the requirement to protect patient data confidentiality [47]. For instance, an internally developed algorithm such as our own may not jeopardize protected health information (PHI), as both the training and implementation of the model are local; use of public LLMs, however, may easily risk transmission of PHI to the servers of an external organization. Future applications of these technologies will need to account for these particular risks. Nevertheless, the opportunities for NLP are significant and likely extend beyond comorbidity documentation to clinical decision support, safety oversight, telehealth, clinical encounter documentation, and informed patient consent, among many others.
While this project does demonstrate the feasibility of NLP-based decision support for clinical neurosurgical documentation, it does have notable limitations. Our optimal random forest classifiers demonstrated relatively low sensitivity (0.65 and 0.48, for compression and edema, respectively) relative to their high specificity (0.88 and 1.0, for compression and edema, respectively). However, for a clinical decision support system, high specificity in the context of lower sensitivity is preferable to low specificity and high sensitivity, as an optimal clinical decision support system should generate few false positives and a high number of true positives. Our lower sensitivity numbers are likely attributable to the use of a more naïve NLP approach that considers only the presence of individual word tokens rather than processing and interpreting word tokens in the context of the report as a whole. Future studies should focus on increasing sensitivity, which would likely occur with a larger dataset and the use of more intricate NLP architectures, such as recurrent neural networks or transformers, that can more readily contextualize blocks of text as a whole. From the perspective of data collection, the development of this particular NLP model was based on radiology reports generated within a single health care system; as such, its applicability to different reporting systems or in other languages may be limited. Furthermore, the reports were reviewed by a single author. As previously noted, this effort evaluated the diagnosis of two particular comorbidities that could be readily ascertained from neuroimaging. Finally, tactful EMR implementation will be necessary to present results of this algorithm to clinicians in a way that encourages responses and meaningfully improves clinical documentation.
Conclusions
An NLP-based ML algorithm can reliably detect 2 major comorbidities for neurosurgical patients from radiology reports. Algorithm performance exceeds real-life documentation performance.
References
- 1. Aiello FA, Judelson DR, Durgin JM, Doucet DR, Simons JP, Durocher DM, et al. A physician-led initiative to improve clinical documentation results in improved health care documentation, case mix index, and increased contribution margin. J Vasc Surg. 2018;68: 1524–1532. pmid:29735302
- 2. Barnes SL, Waterman M, MacIntyre D, Coughenour J, Kessel J. Impact of standardized trauma documentation to the hospital’s bottom line. Surgery. 2010;148: 793–798. pmid:20797746
- 3. Reyes C, Greenbaum A, Porto C, Russell JC. Implementation of a Clinical Documentation Improvement Curriculum Improves Quality Metrics and Hospital Charges in an Academic Surgery Department. J Am Coll Surg. 2017;224: 301–309. pmid:27919741
- 4. Ali R, Syed S, Sastry RA, Abdulrazeq H, Shao B, Roye GD, et al. Toward more accurate documentation in neurosurgical care. Neurosurg Focus. 2021;51: E11. pmid:34724645
- 5. Spurgeon A, Hiser B, Hafley C, Litofsky NS. Does Improving Medical Record Documentation Better Reflect Severity of Illness in Neurosurgical Patients? Neurosurgery. 2011;58: 155–163. pmid:21916142
- 6. Momin SR, Lorenz RR, Lamarre ED. Effect of a Documentation Improvement Program for an Academic Otolaryngology Practice. JAMA Otolaryngol Neck Surg. 2016;142: 533–537. pmid:27055147
- 7. Oxentenko AS, West CP, Popkave C, Weinberger SE, Kolars JC. Time Spent on Clinical Documentation: A Survey of Internal Medicine Residents and Program Directors. Arch Intern Med. 2010;170: 377–380. pmid:20177042
- 8. Hripcsak G, Vawdrey DK, Fred MR, Bostwick SB. Use of electronic clinical documentation: time spent and team interactions. J Am Med Inform Assoc JAMIA. 2011;18: 112–117. pmid:21292706
- 9. Golob JFJ, Como JJ, Claridge JA. The painful truth: The documentation burden of a trauma surgeon. J Trauma Acute Care Surg. 2016;80: 742–747. pmid:26886003
- 10. Fox N, Swierczynski P, Willcutt R, Elberfeld A, Mazzarelli AJ. Lost in translation: Focused documentation improvement benefits trauma surgeons. Injury. 2016;47: 1919–1923. pmid:27156039
- 11. Shanafelt TD, Dyrbye LN, West CP. Addressing Physician Burnout: The Way Forward. JAMA. 2017;317: 901–902. pmid:28196201
- 12. Downing NL, Bates DW, Longhurst CA. Physician Burnout in the Electronic Health Record Era: Are We Ignoring the Real Cause? Ann Intern Med. 2018;169: 50–51. pmid:29801050
- 13. Steinwald B, Dummit LA. Hospital Case-Mix Change: Sicker Patients Or Drg Creep? Health Aff (Millwood). 1989;8: 35–47. pmid:2501203
- 14. Rosenbaum BP, Lorenz RR, Luther RB, Knowles-Ward L, Kelly DL, Weil RJ. Improving and Measuring Inpatient Documentation of Medical Care within the MS-DRG System: Education, Monitoring, and Normalized Case Mix Index. Perspect Health Inf Manag. 2014;11: 1c. pmid:25214820
- 15. Raju B, Jumah F, Ashraf O, Narayan V, Gupta G, Sun H, et al. Big data, machine learning, and artificial intelligence: a field guide for neurosurgeons. J Neurosurg. 2020;1: 1–11. pmid:33007750
- 16. Luo JW, Chong JJR. Review of Natural Language Processing in Radiology. Neuroimaging Clin N Am. 2020;30: 447–458. pmid:33038995
- 17. Kehl KL, Elmarakeby H, Nishino M, Van Allen EM, Lepisto EM, Hassett MJ, et al. Assessment of Deep Natural Language Processing in Ascertaining Oncologic Outcomes From Radiology Reports. JAMA Oncol. 2019;5: 1421–1429. pmid:31343664
- 18. Bird S, Klein E, Loper E. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O’Reilly Media, Inc.; 2009.
- 19. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12: 2825–2830.
- 20. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv:1603.04467 [cs]. 2016 [cited 14 Mar 2022]. Available: http://arxiv.org/abs/1603.04467
- 21. Arquiette JM, Moss HA, Truong T, Pieper CF, Havrilesky LJ. Impact of a documentation intervention on health-assessment metrics on an inpatient gynecologic oncology service. Gynecol Oncol. 2019;153: 385–390. pmid:30824212
- 22. Campbell S, Giadresco K. Computer-assisted clinical coding: A narrative review of the literature on its benefits, limitations, implementation and impact on clinical coding professionals. Health Inf Manag J. 2020;49: 5–18. pmid:31159578
- 23. Castaldi M, McNelis J. Introducing a Clinical Documentation Specialist to Improve Coding and Collectability on a Surgical Service. J Healthc Qual JHQ. 2019;41: e21. pmid:31094954
- 24. Elkbuli A, Godelman S, Miller A, Boneva D, Bernal E, Hai S, et al. Improved clinical documentation leads to superior reportable outcomes: An accurate representation of patient’s clinical status. Int J Surg. 2018;53: 288–291. pmid:29653245
- 25. Frazee RC, Matejicka AV, Abernathy SW, Davis M, Isbell TS, Regner JL, et al. Concurrent Chart Review Provides More Accurate Documentation and Increased Calculated Case Mix Index, Severity of Illness, and Risk of Mortality. J Am Coll Surg. 2015;220: 652–656. pmid:25724608
- 26. Grogan EL, Speroff T, Deppen SA, Roumie CL, Elasy TA, Dittus RS, et al. Improving documentation of patient acuity level using a progress note template. J Am Coll Surg. 2004;199: 468–475. pmid:15325618
- 27. Johnson CE, Peralta J, Lawrence L, Issai A, Weaver FA, Ham SW. Focused Resident Education and Engagement in Quality Improvement Enhances Documentation, Shortens Hospital Length of Stay, and Creates a Culture of Continuous Improvement. J Surg Educ. 2019;76: 771–778. pmid:30552003
- 28. Spellberg B, Harrington D, Black S, Sue D, Stringer W, Witt M. Capturing the Diagnosis: An Internal Medicine Education Program to Improve Documentation. Am J Med. 2013;126: 739–743.e1. pmid:23791207
- 29. Weinberg JA, Chapple KM, Gagliano RA, Israr S, Petersen SR. Back to the Future: Impact of a Paper-Based Admission H&P on Clinical Documentation Improvement at a Level 1 Trauma Center. Am Surg. 2019;85: 611–619.
- 30. Resnick AS, Corrigan D, Mullen JL, Kaiser LR. Surgeon Contribution to Hospital Bottom Line. Ann Surg. 2005;242: 530–539. pmid:16192813
- 31. Cox ML, Farjat AE, Risoli TJ, Peskoe S, Goldstein BA, Turner DA, et al. Documenting or Operating: Where Is Time Spent in General Surgery Residency? J Surg Educ. 2018;75: e97–e106. pmid:30522828
- 32. Chaiyachati KH, Shea JA, Asch DA, Liu M, Bellini LM, Dine CJ, et al. Assessment of Inpatient Time Allocation Among First-Year Internal Medicine Residents Using Time-Motion Observations. JAMA Intern Med. 2019;179: 760–767. pmid:30985861
- 33. Zalatimo O, Ranasinghe M, Harbaugh RE, Iantosca M. Impact of improved documentation on an academic neurosurgical practice: Clinical article. J Neurosurg. 2014;120: 756–763. pmid:24359011
- 34. Morrison RJ, Malloy KM, Bakshi RR. Improved Comorbidity Capture Using a Standardized 1-Step Quality Improvement Documentation Tool. Otolaryngol Neck Surg. 2018;159: 143–148. pmid:29557262
- 35. Jumah F, Raju B, Nagaraj A, Shinde R, Lescott C, Sun H, et al. Uncharted Waters of Machine and Deep Learning for Surgical Phase Recognition in Neurosurgery. World Neurosurg. 2022;160: 4–12. pmid:35026457
- 36. English M, Kumar C, Ditterline BL, Drazin D, Dietz N. Machine Learning in Neuro-Oncology, Epilepsy, Alzheimer’s Disease, and Schizophrenia. Acta Neurochir Suppl. 2022;134: 349–361. pmid:34862559
- 37. Muhlestein WE, Akagi DS, Davies JM, Chambless LB. Predicting Inpatient Length of Stay After Brain Tumor Surgery: Developing Machine Learning Ensembles to Improve Predictive Performance. Neurosurgery. 2019;85: 384–393. pmid:30113665
- 38. Muhlestein WE, Akagi DS, Kallos JA, Morone PJ, Weaver KD, Thompson RC, et al. Using a Guided Machine Learning Ensemble Model to Predict Discharge Disposition following Meningioma Resection. J Neurol Surg Part B Skull Base. 2018;79: 123. pmid:29868316
- 39. Merali ZA, Colak E, Wilson JR. Applications of Machine Learning to Imaging of Spinal Disorders: Current Status and Future Directions. Glob Spine J. 2021;11: 23S–29S. pmid:33890805
- 40. Ali R, Connolly ID, Tang OY, Mirza FN, Johnston B, Abdulrazeq HF, et al. Bridging the literacy gap for surgical consents: an AI-human expert collaborative approach. Npj Digit Med. 2024;7: 1–6. pmid:38459205
- 41. Roman A, Al-Sharif L, AL Gharyani M. The Expanding Role of ChatGPT (Chat-Generative Pre-Trained Transformer) in Neurosurgery: A Systematic Review of Literature and Conceptual Framework. Cureus. 15: e43502. pmid:37719492
- 42. Dubinski D, Won S-Y, Trnovec S, Behmanesh B, Baumgarten P, Dinc N, et al. Leveraging artificial intelligence in neurosurgery—unveiling ChatGPT for neurosurgical discharge summaries and operative reports. Acta Neurochir (Wien). 2024;166: 38. pmid:38277081
- 43. Goodman KE, Yi PH, Morgan DJ. AI-Generated Clinical Summaries Require More Than Accuracy. JAMA. 2024;331: 637–638. pmid:38285439
- 44. Patel SB, Lam K. ChatGPT: the future of discharge summaries? Lancet Digit Health. 2023;5: e107–e108. pmid:36754724
- 45. Pons E, Braun LMM, Hunink MGM, Kors JA. Natural Language Processing in Radiology: A Systematic Review. Radiology. 2016;279: 329–343. pmid:27089187
- 46. Chartrand G, Cheng PM, Vorontsov E, Drozdzal M, Turcotte S, Pal CJ, et al. Deep Learning: A Primer for Radiologists. RadioGraphics. 2017;37: 2113–2131. pmid:29131760
- 47. Kanter GP, Packel EA. Health Care Privacy Risks of AI Chatbots. JAMA. 2023;330: 311–312. pmid:37410449