
Applications of natural language processing at emergency department triage: A narrative review

  • Jonathon Stewart ,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Visualization, Writing – original draft, Writing – review & editing

    Jonathon.Stewart@research.uwa.edu.au

    Affiliations School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia, Harry Perkins Institute of Medical Research, Murdoch, Western Australia, Australia, Department of Emergency Medicine, Fiona Stanley Hospital, Murdoch, Western Australia, Australia

  • Juan Lu,

    Roles Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – review & editing

    Affiliations School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia, Harry Perkins Institute of Medical Research, Murdoch, Western Australia, Australia, Department of Computer Science and Software Engineering, The University of Western Australia, Crawley, Western Australia, Australia

  • Adrian Goudie,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliation Department of Emergency Medicine, Fiona Stanley Hospital, Murdoch, Western Australia, Australia

  • Glenn Arendts,

    Roles Conceptualization, Resources, Supervision, Writing – review & editing

    Affiliations School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia, Department of Emergency Medicine, Fiona Stanley Hospital, Murdoch, Western Australia, Australia

  • Shiv Akarsh Meka,

    Roles Data curation, Software, Visualization, Writing – review & editing

    Affiliation HIVE & Data and Digital Innovation, Royal Perth Hospital, Perth, Western Australia, Australia

  • Sam Freeman,

    Roles Writing – review & editing

    Affiliations Department of Emergency Medicine, St Vincent’s Hospital Melbourne, Melbourne, Victoria, Australia, SensiLab, Monash University, Melbourne, Victoria, Australia

  • Katie Walker,

    Roles Supervision, Writing – review & editing

    Affiliation School of Clinical Sciences at Monash Health, Monash University, Melbourne, Victoria, Australia

  • Peter Sprivulis,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliation Western Australia Department of Health, East Perth, Western Australia, Australia

  • Frank Sanfilippo,

    Roles Funding acquisition, Supervision, Writing – review & editing

    Affiliation School of Population and Global Health, University of Western Australia, Crawley, Western Australia, Australia

  • Mohammed Bennamoun,

    Roles Funding acquisition, Supervision, Writing – review & editing

    Affiliation Department of Computer Science and Software Engineering, The University of Western Australia, Crawley, Western Australia, Australia

  • Girish Dwivedi

    Roles Conceptualization, Funding acquisition, Resources, Supervision, Writing – review & editing

    Affiliations School of Medicine, The University of Western Australia, Crawley, Western Australia, Australia, Harry Perkins Institute of Medical Research, Murdoch, Western Australia, Australia, Department of Cardiology, Fiona Stanley Hospital, Murdoch, Western Australia, Australia

Abstract

Introduction

Natural language processing (NLP) uses various computational methods to analyse and understand human language, and has been applied to data acquired at Emergency Department (ED) triage to predict various outcomes. The objective of this review is to evaluate how NLP has been applied to data acquired at ED triage, assess whether NLP based models outperform humans or current risk stratification techniques when predicting outcomes, and assess whether incorporating free-text improves the predictive performance of models compared to predictive models that use only structured data.

Methods

All English language peer-reviewed research that applied an NLP technique to free-text obtained at ED triage was eligible for inclusion. We excluded studies focusing solely on disease surveillance, and studies that used information obtained after triage. We searched the electronic databases MEDLINE, Embase, Cochrane Database of Systematic Reviews, Web of Science, and Scopus for medical subject headings and text keywords related to NLP and triage. Databases were last searched on 01/01/2022. Risk of bias in studies was assessed using the Prediction model Risk of Bias Assessment Tool (PROBAST). Due to the high level of heterogeneity between studies and the high risk of bias, a meta-analysis was not conducted. Instead, a narrative synthesis is provided.

Results

In total, 3730 studies were screened, and 20 studies were included. The population size varied greatly between studies, ranging from 598 triage notes to 1.8 million patients. The most common outcomes assessed were prediction of triage score, prediction of admission, and prediction of critical illness. NLP models achieved high accuracy in predicting need for admission, triage score, critical illness, and mapping free-text chief complaints to structured fields. Incorporating both structured data and free-text data improved results when compared to models that used only structured data. However, the majority of studies (80%) were assessed to have a high risk of bias, and only one study reported the deployment of an NLP model into clinical practice.

Conclusion

Unstructured free-text triage notes have been used by NLP models to predict clinically relevant outcomes. However, the majority of studies have a high risk of bias, most research is retrospective, and there are few examples of implementation into clinical practice. Future work is needed to prospectively assess if applying NLP to data acquired at ED triage improves ED outcomes when compared to usual clinical practice.

Introduction

Millions of patients attend emergency departments (EDs) around the world every year [1]. Queues for care are common, so patients are often triaged on arrival to the ED by a trained nurse. Triage is central to the practice of emergency medicine [2]. In the face of excess demand, triage allows EDs to allocate their finite resources in an equitable, efficient, and standardised way [3, 4]. Triage systems in current use include the Emergency Severity Index (ESI), Australasian Triage Scale (ATS), Manchester Triage Scale (MTS), and the Korean Triage and Acuity Scale (KTAS) [3, 5]. Triage systems aim to aid emergency care providers in making a structured decision regarding the urgency of care that a patient requires, and in doing so, identify and prioritise those patients with time-sensitive care needs [3, 4]. No triage tool is perfect, and all have issues with sensitivity and specificity resulting in over- and under-triage, particularly for certain demographic groups and conditions [6–8]. There is opportunity to improve triage performance in identifying patients with critical illness, and to improve triage accuracy and the consistency of triage categorisation between healthcare workers [3].

Machine learning (ML) is a subfield of artificial intelligence (AI) that uses various methods to automatically deduce patterns in data and then make predictions [9]. These patterns are learned from the data rather than being explicitly pre-programmed by humans. ML models are iteratively improved through a process called training. In supervised ML training, the model’s predicted output is compared to a "ground truth", and the error between the predicted value and the ground truth is progressively reduced through the training process [9]. ML models may have the potential to improve risk stratification and outcome prediction in the ED setting [10–12].
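The supervised training loop described above can be sketched in a few lines. The following is an illustrative logistic regression fitted by gradient descent on synthetic data (all values are invented), not any of the models reviewed here:

```python
# Minimal sketch of supervised ML training: predictions are compared to a
# "ground truth" and the error is progressively reduced over iterations.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "structured triage data": two features, binary ground-truth labels.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

w = np.zeros(2)  # model parameters, improved iteratively
b = 0.0
lr = 0.5  # learning rate (illustrative)

def predict(X, w, b):
    """Sigmoid output: predicted probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

losses = []
for _ in range(100):
    p = predict(X, w, b)
    # Cross-entropy loss between prediction and ground truth.
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    # Gradient step: the prediction error drives the parameter update.
    grad = p - y
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

accuracy = np.mean((predict(X, w, b) > 0.5) == y)
```

The loss falls as training proceeds, which is the "progressive reduction of error" the text describes.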

Triage has been identified as a promising area to apply ML in the ED [13, 14]. ML has previously been applied to structured data acquired at triage (such as patient age and vital signs) in attempts to predict outcomes including need for admission and intensive care [15, 16]. Triage nurses routinely collect structured data alongside an unstructured free-text history of presenting complaint, capturing their impression and subjective assessment of the presentation. This free-text may be more expressive and nuanced, and contain a higher level of information, than structured data [17]. Prior work has suggested that incorporating free-text may improve the performance of ML at ED triage, and that this is an important area for future research despite the challenges of integrating free-text data into models [18–20].

Natural language processing (NLP) uses computational methods to analyse and understand human language and its structure [21]. Early NLP techniques were relatively simple. For example, a “bag-of-words” model bases its decision on the relative frequencies of words in the text, ignoring their order [22]. These early models often lacked the ability to assess context and negation, and as a result had numerous limitations [23]. Significant advancements in NLP have been made over the last few years through the use of Deep Learning (DL), a subfield of ML [24, 25]. DL models pass data through multiple processing layers and in doing so, achieve increasingly abstract representations of the input data, enabling them to learn complex functions [26]. Massive DL based NLP models have recently been developed [27–29]. These models have been trained on datasets containing billions of words and have achieved high levels of performance [27–29]. Some large, pre-trained models, such as Bidirectional Encoder Representations from Transformers (BERT), are publicly available [27]. Using a pre-trained model allows researchers to take a high performing model as their starting point, and then customise it to their unique needs through fine-tuning the model on their local data. For example, Tahayori et al. were able to accurately predict admission from ED using only free-text triage notes and a BERT based NLP model [30]. Multimodal models integrate NLP with other types of ML to analyse combinations of both free-text data and structured data (such as age and vital signs).
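As an illustration of the bag-of-words approach described above, the following toy classifier ignores word order entirely and decides from word counts alone. The triage-style notes and admission labels are invented for demonstration and bear no relation to any study dataset:

```python
# Toy bag-of-words model: text is reduced to word counts, order is discarded.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented triage-style notes and an invented binary outcome (1 = admitted).
notes = [
    "chest pain radiating to left arm diaphoretic",
    "severe chest pain short of breath",
    "ankle injury after fall swelling",
    "twisted ankle playing sport mild pain",
]
admitted = [1, 1, 0, 0]

model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(notes, admitted)

# Word order is ignored: "pain chest" counts the same as "chest pain".
pred = model.predict(["pain chest crushing"])[0]
```

A bag-of-words model like this cannot distinguish "no chest pain" from "chest pain", which is exactly the negation limitation noted above.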

Objectives

This review aims to evaluate the applications of NLP at ED triage by answering the following questions:

  1. How has NLP been applied to data acquired at ED triage?
  2. Do NLP based models outperform humans or current risk stratification techniques when predicting outcomes?
  3. Does incorporating free-text improve predictive performance of ML models when compared to ML models that use only structured data?

Methods

A review protocol was prepared in accordance with PRISMA-P guidelines and registered with the International Prospective Register of Systematic Reviews (PROSPERO) on 04/10/2021 (Registration ID: CRD42021276980) [31, 32]. All English language peer-reviewed research that applied an NLP technique to free-text obtained at ED triage was eligible for inclusion. As this study aims to broadly assess the capability of NLP at triage, all outcomes and comparators were included. We excluded studies focusing solely on disease surveillance, and studies that used information obtained after triage (such as emergency physician clinical notes and investigations performed within the ED).

We searched PubMed (MEDLINE), Embase, Cochrane Database of Systematic Reviews, Web of Science, and Scopus for research published from database inception to the present day. Electronic databases were first searched on 16/09/2021 and last searched on 01/01/2022. We searched for medical subject headings (MeSH) and text keywords related to NLP and triage. The search strategy was iteratively developed by the multidisciplinary project team that included emergency physicians and computer scientists (S1 File). Reference lists of the included studies and the authors’ personal archives were reviewed for further relevant literature.

Citations and abstracts were screened independently by two reviewers (JS and JL) against the inclusion and exclusion criteria. Both reviewers were blind to the journal titles, study authors, and institutions. Full text articles were obtained for any articles identified by one reviewer to meet inclusion criteria. Two reviewers (JS and JL) then evaluated the full text reports against the inclusion and exclusion criteria. Data were extracted by JS and JL using a standardised form that included study country, study design, outcomes, number of sites, study population, input data, NLP and ML models used, comparison, and results. The form was piloted, and calibration exercises were conducted prior to formal data extraction to ensure consistency between reviewers. In cases of conflict or discrepancy, additional review authors were involved until a decision was reached. There were no uncertainties that required authors of the included studies to be contacted. Risk of bias in studies was assessed independently by two authors (JS and JL) using the Prediction model Risk of Bias Assessment Tool (PROBAST) [33]. We chose PROBAST as it is a well-designed, commonly used, and generally accepted tool to assess risk of bias and applicability concerns in prediction model studies.

Heterogeneity was assessed by the study team through review of the included papers and the results table. The application of NLP to data acquired at triage was not a homogeneous and consistent intervention. Initial review of included studies revealed a wide range of settings, with associated differences in language, healthcare systems, and triage systems. Further variation was seen in the range of inputs, which were then studied using different models based on a wide variety of ML approaches (ranging from traditional ML techniques, to early deep learning models, to modern transformer-based techniques). The outputs of the models were also variable, including disposition, identification of critical illness, triage score, investigation ordering, and specific disease groups. Most studies were subsequently assessed as having a high risk of bias. Despite some studies assessing NLP performance in predicting similar outcomes, the models and inputs used were often very different. There were no randomised controlled studies, few studies compared NLP to usual practice, and few studies had an appropriate comparator. For these reasons, the consensus decision of the research team was that a meta-analysis would not be meaningful and would likely be misleading. Instead, a narrative synthesis is provided to summarise review findings.

Results

Study selection

This process is summarised in a PRISMA Flow Diagram (Fig 1). There were 5329 records identified following database searching and a further 11 records identified through other sources. Following removal of duplicates, 3730 records remained and underwent title and abstract screening, after which 3557 records were excluded. The remaining 173 full-text articles were assessed for eligibility. In total, 153 articles were excluded, and 20 studies remained for inclusion (Fig 1). There were no unresolved disagreements as to study inclusion or results of data extraction.

Characteristics of included studies

A summary of the included studies is shown in Table 1. There were 19 retrospective studies [17, 18, 30, 34–49]. One study reported that their ML model was developed using retrospective data and then validated using prospective data [50]. All used observational cohort designs. Two studies were international multi-centre studies (USA and Portugal), 12 were conducted in the USA, two were from South Korea, and one each was from Australia, Brazil, China, and France. The most common outcomes assessed were prediction of triage score (six studies), prediction of admission (five studies), and prediction of critical illness (three studies). Two studies predicted need for imaging within the ED, two studies looked at the assignment of a provider-assigned chief complaint label, one study predicted diagnosis of infection in the ED, and one study aimed to identify and classify temporal expressions used in triage notes.

The population size varied greatly between studies ranging from 598 triage notes to 1.8 million patients. Five studies used a population of under 100 000, four studies had a population of between 100 000 and 200 000, six studies had a population of between 200 001 and 300 000, and six studies had a population of over 300 000. Twelve studies used data from a single site and eight studies used data from multiple sites. The largest number of sites used was 642 by Zhang et al.

Fifteen studies applied NLP to free-text history of presenting complaint, seven studies applied NLP to a free-text chief complaint, two studies applied NLP to a structured chief complaint label, and one study applied NLP to simulated triage dialogues that had been transcribed by either a human or an ML model. The other most frequently used input variables were patient demographics (13 studies), patient vital signs (heart rate, respiratory rate, oxygen saturation, blood pressure, and temperature) (15 studies), pain score (12 studies), triage score (10 studies), mode of arrival (10 studies), time of arrival (9 studies) and past medical history (7 studies). Other input variables included mental status (5 studies), and blood glucose level (5 studies).

Prediction of admission

Overall, NLP models and multimodal models achieved a high Area Under the Receiver Operating Characteristic Curve (AUC) in predicting admission at time of triage for adult and paediatric patients (Table 2) [18, 30, 35, 41, 46]. Of the five studies focusing on predicting admission to hospital, Roquette et al. achieved the highest AUC using a gradient boosting model (AUC 0.89). Tahayori et al. achieved a similar AUC (0.88) using only the free-text history of presenting complaint. Tahayori et al. were the only authors that compared their model to emergency physician performance. Their model achieved a higher accuracy than five emergency consultants (0.83 vs 0.78) and higher specificity (0.86 vs 0.77), but lower sensitivity (0.72 vs 0.9). Roquette et al. and Zhang et al. both compared ML models trained using structured data only with ML models that incorporated both structured data and text data. They found that the addition of text data resulted in a small improvement (Zhang AUC 0.823 to AUC 0.844, Roquette AUC 0.872 to AUC 0.891) when compared to the use of structured data alone. This improvement was not assessed for statistical significance.
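The AUC figures quoted throughout this section can be computed from a model's predicted probabilities and the observed outcomes. A minimal sketch with scikit-learn, using invented admission outcomes and probabilities rather than values from any included study:

```python
# AUC measures how well predicted probabilities rank positive outcomes
# (admissions) above negative ones (discharges).
from sklearn.metrics import roc_auc_score

# Hypothetical observed outcomes (1 = admitted) and model probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.3]

# Every admitted patient here scores above every discharged one, so AUC = 1.0;
# a model no better than chance would score about 0.5.
auc = roc_auc_score(y_true, y_prob)
```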

Table 2. Results of included studies, grouped by outcome.

https://doi.org/10.1371/journal.pone.0279953.t002

Prediction of critical illness

Of the three studies that predicted critical illness at triage (defined as ICU admission, cardiopulmonary arrest within 24 hours, or death within 24 hours of triage), Fernandes et al. achieved the highest AUC (0.96) in predicting in-hospital death or cardiopulmonary arrest within 24 hours of triage using an extreme gradient boosting model (Table 2) [43–45]. They found no difference in AUC when using clinical variables only or clinical variables and structured chief complaint processed by NLP. Joseph et al. found their NLP model (AUC 0.857) significantly outperformed an abnormal vital sign trigger (AUC 0.521) and ESI ≤ 2 (AUC 0.672) in predicting critical illness. The addition of free-text data improved the performance of their neural network model (from AUC 0.820 to AUC 0.857).

Prediction of triage score

NLP has been retrospectively applied to data acquired from multiple different triage systems [17, 36–38, 47, 48]. NLP models and multimodal models have achieved high AUCs in assigning triage categories using structured and free-text data (Table 2) [17, 36–38, 47, 48]. Wang et al. achieved the highest performance in predicting ESI using their "DeepTriager" model (AUC 0.96). Kim et al. achieved an AUC of 0.89 in assigning a KTAS category to auto-transcribed simulated triage dialogue. This was only slightly lower than the performance achieved using human-transcribed simulated triage dialogue (AUC 0.90).

Three studies compared the accuracy of triage scores assigned by multimodal models incorporating NLP to triage scores assigned by nurses [17, 36, 47]. Such models were reported to be more accurate than nurses in two out of three papers [17, 36, 47]. Ivanov et al. (2021) used a random sample test set of 729 records to assess their model’s ability to predict ESI. In this test set, ESI had been assigned to each record with unanimous agreement by three expert clinicians. They also compared ED site nurses’ original ESI against the expert consensus assigned ESI. When applied to the test set, their clinical-NLP model achieved an AUC of 0.85 in predicting ESI and original ED nurse triage achieved an AUC of 0.75. Three members of the study team (two emergency clinicians and one emergency nurse) also assigned ESI to this test set, achieving similar AUCs to the NLP model (clinician one AUC 0.86, clinician two AUC 0.85, clinician three AUC 0.82). Sterling et al. (2020) retrospectively calculated number of resources used (ESI), and then used a test set of 1000 randomly selected records to compare their model performance against two experienced ED nurses. In this test set, their model achieved a similar F1 score (0.589 vs 0.659) but lower accuracy (0.589 vs 0.659) than the two ED nurses in predicting number of resources used. Gligorijevic et al. (2018) approximated nurses’ performance by analysing predicted (ESI) versus actual number of resources used. They then compared approximated nurses’ performance against their model’s capability at predicting actual resources used. Overall, their NLP model achieved a higher accuracy than approximated nurses’ performance in assigning number of resources used category (59.6% vs 43.6%). The addition of text data compared to structured data alone improved performance in assigning triage score [36, 37].

Prediction of provider-assigned chief complaint

NLP models and multimodal models incorporating NLP were able to accurately map free-text history of presenting complaint to structured chief complaints (Table 2) [42, 50]. Chang et al. (2020) used BERT to predict provider-assigned chief complaint labels (Top-5 structured label AUC 0.92). Greenbaum et al. (2019) applied NLP to free text triage notes to rank structured chief complaint labels by their predicted probability. This improved structured data capture from 26.2% to 97.2%.
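The label-ranking idea behind this kind of mapping can be sketched as follows. The notes, structured labels, and model choice here are illustrative assumptions, not the pipelines actually used by Chang et al. or Greenbaum et al.:

```python
# Sketch of ranking structured chief-complaint labels by predicted probability,
# analogous to mapping free-text to structured fields / contextual autocomplete.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented free-text notes with invented structured chief-complaint labels.
notes = [
    "crushing chest pain", "chest tightness on exertion",
    "short of breath wheeze", "difficulty breathing asthma",
    "headache photophobia", "worst headache of life",
]
labels = ["chest pain", "chest pain", "dyspnoea", "dyspnoea",
          "headache", "headache"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(notes, labels)

# Rank every structured label for a new note, highest probability first;
# the top few could then be offered to the user as autocomplete candidates.
probs = clf.predict_proba(["sudden chest pain at rest"])[0]
ranked = [lab for _, lab in sorted(zip(probs, clf.classes_), reverse=True)]
top = ranked[0]
```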

Prediction of investigations

Multimodal models incorporating NLP were used to predict diagnostic imaging performed in the ED (Table 2) [39, 40]. Zhang et al. developed a model to predict need for advanced diagnostic imaging (computed tomography, ultrasound, magnetic resonance imaging) in the ED, obtaining an AUC of 0.78. Zhang et al. also achieved an AUC of 0.824 in predicting the need for any diagnostic imaging in a paediatric population. In both cases, the inclusion of structured variables improved performance slightly when compared to unstructured variables alone (AUC from 0.74 to 0.78, and AUC from 0.81 to 0.82).

Identifying infection

Horng et al. (2017) found that the incorporation of free-text data improves the discriminatory ability (increase in AUC from 0.67 to 0.86) for identifying sepsis (defined by ICD-9-CM code) in the ED at triage (Table 2).

Extracting temporal information from triage notes

Irving et al. (2008) applied an NLP system to extract and classify temporal information contained in free-text triage notes. Such information included the relative time of event compared to triage time, and duration of event. They report better performance was obtained using a Decision Tree compared to Naive Bayes.

Multimodal models

Eleven papers compared ML models that used only structured data to multimodal models that incorporated both structured data and free-text data (Table 2) [34–40, 43–46]. The best performing model in each of these papers incorporated free-text. The largest improvement in model performance from incorporating free-text was found by Horng et al. (increase in AUC from 0.67 to 0.86 for identifying infection). The addition of free-text did not improve model AUC in one case; however, it did improve model average precision [44]. There were no cases where the incorporation of free-text into the model resulted in worse performance. Six papers assessed models that used only free-text, with no structured data [30, 36, 37, 39, 40, 42]. Tahayori et al. were able to use only free-text data to predict admission with high accuracy (83%). Zhang et al. used free-text to predict performance of diagnostic imaging. Gligorijevic’s “Deep Attention” models using only unstructured data outperformed those using only structured data. Incorporating both structured data and free-text data improved results when compared to models that used only free-text data, though often only a small improvement was found.
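A multimodal model of the kind compared above can be sketched by concatenating TF-IDF features from the free-text note with structured variables. The data, column names, and model choice below are invented for illustration and are not taken from any included study:

```python
# Multimodal sketch: free-text features and structured features are joined
# into one feature matrix before a single classifier is fitted.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Invented triage records: one free-text column plus structured variables.
df = pd.DataFrame({
    "note": ["chest pain sweating", "ankle sprain",
             "severe dyspnoea", "minor laceration"],
    "age": [64, 23, 71, 30],
    "heart_rate": [110, 80, 125, 75],
    "admitted": [1, 0, 1, 0],
})

# The text column goes through TF-IDF; numeric columns pass straight through.
features = ColumnTransformer([
    ("text", TfidfVectorizer(), "note"),
    ("structured", "passthrough", ["age", "heart_rate"]),
])
model = Pipeline([
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(df, df["admitted"])

# 9 distinct words in the notes + 2 structured columns = 11 features total.
n_features = model.named_steps["features"].transform(df).shape[1]
```

Dropping the `"text"` transformer from the `ColumnTransformer` yields the structured-only baseline the papers above compare against.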

Modern NLP compared to traditional NLP

Three papers directly compared modern NLP based on DL to more traditional ML techniques such as bag-of-words and topic modelling (Table 2) [30, 38, 48]. Modern DL based NLP outperformed traditional ML based NLP in two cases [30, 38]. In contrast, Kim et al. found that a BERT based DL model did not perform better than ML based models, though their population was relatively small. Chang et al. compared the performance of multiple modern DL based models, finding BERT slightly outperformed Embeddings from Language Models (ELMo) and Long Short-Term Memory (LSTM) networks in mapping free-text chief complaints to structured fields.

Integration into practice

Greenbaum et al. was the only study that reported the deployment of an NLP based model into clinical practice. Greenbaum et al. aimed to increase the ease of high-quality structured data collection of patients’ chief complaint at triage through the use of an NLP based model. Their model used both free-text triage notes and structured data to provide contextual autocomplete of the chief complaint label, and also showed the user a list of the top five most likely chief complaints. Prior to implementation of their model, the chief complaint was more commonly entered as unstructured free text, with only 26.2% of patient encounters resulting in structured data capture. Following implementation, this increased to 97.2%. The authors aggregated multiple incidents of unscheduled downtime that occurred throughout the study to opportunistically assess the impact of their model. When ML based autocomplete was not operational (and alphabetised autocomplete was shown instead), the percentage of encounters that resulted in structured data capture decreased from 97.2% to 89.2%. The number of keystrokes typed for each presenting problem decreased from 11.6 pre-implementation to 0.6 post-implementation. Contextual autocomplete was associated with qualitatively more complete and higher quality (as assessed by three independent reviewers on a four-point Likert scale) structured documentation of chief complaints.

Study quality—Risk of bias within and across studies

A summary of the PROBAST assessment is provided in Table 3. Overall, 16 out of 20 studies were considered to have a high risk of bias. A common reason for overall high risk of bias in the PROBAST assessment was lack of external validation. Four studies were assessed as having a low risk of bias. One study had high applicability concerns and 19 studies had low applicability concerns. The four studies assessed as having low risk of bias also had low applicability concerns. No studies referred to a previously published or publicly registered protocol.

Availability of datasets and code

Availability of study datasets and code is shown in Table 4. Data were publicly available for three studies (all by Zhang et al.) and were available on request from study authors for a further four studies [30, 34, 35, 39, 40, 43, 44]. One study reported plans to release a modified de-identified dataset; however, at the time of this review, this was still pending approval [45]. The model code was publicly available for two studies [42, 45]. Notably, the code repository from Chang et al. was well organised and contained clear instructions for researchers on how to download their pretrained model and apply it to their own dataset.

Table 4. Availability of dataset and code for included studies.

https://doi.org/10.1371/journal.pone.0279953.t004

Discussion

NLP at triage

This review finds that NLP has been applied to data available at the time of ED triage to predict a range of outcomes, with a focus on predicting need for admission and assigned triage score. This review also finds that combining free-text nursing triage notes with structured data appears to result in the best model performance; however, free-text nursing triage notes alone have been used by NLP algorithms to predict need for admission and need for diagnostic imaging [18, 30, 39, 40]. A potential benefit of developing models that require only free-text as an input is that it may allow for easier portability of predictive models between different triage systems [30].

Structured data capture

Accurate and consistent structured capture of patients’ presenting complaints is important for research, service improvement, and public health initiatives [50]. Common medical ontologies also improve system interoperability [51]. However, collection of structured data is often difficult, especially when contrasted with the ease and expressiveness of free-text entry [50]. In a rare example of NLP being deployed into routine clinical practice at ED triage, Greenbaum et al. developed, implemented, and prospectively evaluated an NLP driven user interface in an attempt to improve structured data capture [50]. Promisingly, they report that their NLP based contextual autocomplete did not add additional burden to users, made structured data collection easier than unstructured data collection, and significantly increased structured data collection.

NLP compared to humans

Human performance may be a reasonable baseline for ML models to meet to be considered accurate enough for implementation into clinical practice. Few studies have compared NLP models at triage to human performance. Such comparisons will be crucial in future work. Tahayori et al. was the only study that compared results from NLP models to emergency physicians [30]. Ivanov et al., Sterling et al., and Gligorijevic et al. compared NLP based models to nurses in assigning triage scores and found model accuracy was similar to that of nurses [17, 36, 47].

Modern NLP

While it is difficult to compare studies due to their heterogeneity, advanced DL based NLP appears to outperform traditional NLP. This is certainly the case when compared internally within studies, and is consistent with previous NLP research [52]. BERT appears to be the most popular advanced NLP model used. BERT was released in October 2018 and, at the time of release, outperformed other NLP models [27]. However, of the 16 papers published since the release of BERT, only three have used it. Other large models have subsequently been released. For example, GPT-3 is a 175-billion parameter language model that was released in 2020 and is reported to outperform BERT in various circumstances [28]. Chowdhery et al. have recently published the Pathways Language Model (PaLM), a 540-billion parameter model that achieves further increases in performance [29].

Future directions

NLP at ED triage appears to be a promising area for future research. Triage datasets often contain a large volume of data with clearly labelled outcomes (such as admission or discharge), which is useful for developing NLP models. Triage information may be available hours before emergency physician documentation, and accurate predictions made at triage have the potential to increase healthcare system efficiency [18]. There is also the possibility of close human oversight if deployed in practice. Future work could aim to predict other important patient-oriented outcomes at the time of triage, such as wait times, need for advanced cardiovascular investigations, or need for surgery.

Incorporating clinical gestalt

Sterling et al. 2020 noted the difficulty in capturing the general clinical impression of the triage nurse [17]. Ivanov et al. also noted that important contextual aspects at triage were not available for consideration by ML models [47]. Future work could assess the impact of incorporating triage nurses’ gestalt into predictive models. Other contextual data available at the time of triage, such as the number of patients currently waiting to be seen, the number of patients currently in the ED, and the number of admitted patients in the hospital, could also be incorporated into ML models. However, it is unknown if the inclusion of such data would improve model performance.
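One simple design for incorporating such contextual data would be to append departmental counts to a note's text-derived feature vector before it reaches the classifier. The sketch below is illustrative only; the feature names are hypothetical and none of the reviewed models are known to use exactly these inputs:

```python
def combine_features(text_vector, context):
    """Append departmental context to a note's text-derived features.

    `context` keys are hypothetical examples of load indicators available
    at the time of triage.
    """
    order = ["patients_waiting", "patients_in_ed", "admitted_in_hospital"]
    return list(text_vector) + [float(context[k]) for k in order]

features = combine_features(
    [0, 1, 1],  # e.g. bag-of-words counts for a triage note
    {"patients_waiting": 12, "patients_in_ed": 45, "admitted_in_hospital": 230},
)
```

In practice the count features would need scaling so they do not dominate sparse text features, and, as noted above, whether they improve performance at all is an open question.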

Integration with other AI systems

Kim et al. provides an interesting example of how various AI based technologies can be combined [48]. Future work could assess if it is feasible to integrate triage NLP models with other novel AI based interventions, such as automated monitoring of patients’ vital signs while they are in the waiting room, or with data entered by patients themselves in self-triage applications. However, combining multiple AI based predictive models may not result in improved performance.

Pre-trained models for ED triage

Publicly available large DL-based language models have often been trained on corpora containing text from newspapers, books, and websites [27, 28]. Triage notes are often quite short and contain a number of unique and idiosyncratic abbreviations and acronyms not common in everyday English [17, 30]. The NLP models that have been applied to triage notes were often based on models that were not developed specifically for this purpose. DL-based NLP models that have been fine-tuned on large corpora of medical text have been released; however, they have not been applied to ED triage. Large publicly available clinical databases such as MIMIC-IV that contain ED triage notes with linked outcomes may be helpful in further model development and may facilitate direct comparisons between models developed by different research groups [53, 54]. Triage-focused NLP research could potentially benefit from groups sharing large language models that have been pre-trained on triage data, though it is unknown whether such models’ performance would generalise across different healthcare settings and triage systems.
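One lightweight way to bridge the vocabulary gap between triage shorthand and the everyday English a general-purpose model was pre-trained on is to expand known abbreviations before tokenisation. The abbreviation map below is purely illustrative; any real deployment would need a locally curated list, since the same shorthand can carry different meanings across departments:

```python
# Hypothetical abbreviation map; a real system would need a locally
# curated and clinically validated list.
ABBREVIATIONS = {
    "sob": "shortness of breath",
    "loc": "loss of consciousness",
    "abdo": "abdominal",
}

def normalise_triage_note(note):
    """Lowercase a note and expand known shorthand before tokenisation,
    mapping triage jargon onto everyday English."""
    tokens = note.lower().split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)

normalise_triage_note("SOB and abdo pain denies LOC")
# → "shortness of breath and abdominal pain denies loss of consciousness"
```

A rule-based step like this is crude compared with domain-specific pre-training, but it illustrates why models fine-tuned on medical or triage text may hold an advantage over general-purpose ones.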

Interpretability

NLP models have become increasingly complex. Inputs to DL-based NLP models may be processed through billions of interconnected processing units before the model generates an output [26]. This has increased their predictive capability, but it has also made it harder for humans to interpret why a model gives a certain output when presented with a particular input. Some have suggested that it is unethical to implement such models into clinical practice if the reasoning behind their predictions cannot be interpreted by humans [55]. It has also been suggested that improved interpretability may help end users detect model biases and improve patient safety [56]. Others, however, argue that model interpretability is unnecessary and should not be pursued at the expense of model accuracy [57]. It is currently unknown whether clinicians will accept a model into routine clinical practice if the model cannot explain, in a way that is interpretable to humans, why an output was given. Developing “explainable AI” models is an active area of research [58, 59]. Few papers attempted to address the human interpretability of their NLP model’s output. Wang et al. show how models could be made somewhat more interpretable [38]. Their triage model highlights free-text triage notes, with a darker colour corresponding to the sections of text that were more heavily weighted by the model. This provides an initial "sense check" that humans can then combine with their own experience and knowledge.

Prospective and external validation is needed

The majority of research so far has been retrospective with a high risk of bias, and completed in the USA. There is a significant need for prospective evaluation and external validation, especially in other countries and triage systems. Further research is also required to assess the impact of integrating NLP at triage on patient-oriented outcomes, as there is currently little evidence that NLP at triage improves outcomes compared to usual clinical practice.

Clinical impact and risk

NLP models have rarely been deployed at ED triage. As such, it is unknown what impact these tools could have on clinical practice. The introduction of a new tool into a complex system is likely to have unintended consequences, and use of the tool may itself change practice. Triage notes may be written in a different way if it is known that they are being used for predictive purposes. It is also unknown if the length of triage notes impacts model performance. There may also be unintended harms. For example, telling a patient at triage that they are likely to be admitted or to have a long wait time could potentially influence their behaviour and affect the number of patients who leave without being seen. It may be useful to establish the performance benchmarks predictive models must meet prior to implementation into clinical practice. This could be achieved through further studies comparing NLP model performance to emergency physicians and nurses.

Predictive models are trained on data that reflects current practice. This engrains the assumption that current practice is appropriate, which may not be the case. If NLP models are retrained and updated as new data becomes available, then model performance may change over time. It will be important to ensure that there is appropriate algorithm stewardship in place prior to clinical use [60].

Acceptability

It is also unknown if the use of NLP at triage is acceptable to patients and staff. It will be important to involve clinicians, patients, and healthcare consumer groups in the development and governance of any future implementation projects. It will also be important to ensure that these systems do not place further burden on users. Ease of use and perceived clinical impact will likely be important factors for adoption by clinicians.

Ethical issues

Racial, age, and gender biases at ED triage have been previously reported [61–63]. Concerns over bias in ML models have been well described [64, 65]. The assessment and reduction of bias in ML is an ongoing and active area of research [66]. At its best, NLP at triage could help reduce bias by standardising triage decisions and providing a more objective triage score. At its worst, however, NLP at triage could further ingrain existing biases into practice, under the guise of objectivity and hidden in the opacity of abstract algorithms. Patient apprehensions and concerns about the use of AI will also need to be considered. An emerging body of literature suggests that while patients generally view AI positively, they have some concerns about its use in healthcare [67]. These include perceptions that AI will be less accurate than clinicians, that there is a lack of transparency in predictions, and that there are risks to the privacy of their personal healthcare data [68–73]. Further research investigating the impact of NLP-based tools on vulnerable and minority populations is warranted.

Limitations

Study level

Only one study contained prospectively validated results, and no studies contained results that were externally validated at a separate site. Results reported may not be generalisable to other settings. There was inconsistent reporting of methods and results among studies. The majority of studies (79%) were assessed to have a high risk of bias.

Review level

Heterogeneity of the included studies precluded meta-analysis, which limits the level of evidence this review provides. All studies reported positive results for NLP at triage, which may reflect publication bias. While we took significant care to ensure our search strategy was broad enough to capture all relevant literature, the variety of NLP and ML terminology means that some studies may have been missed. Non-English articles and articles published prior to 2012 were also excluded from our search. We used the PROBAST tool to assess risk of bias and applicability concerns for included studies. However, other tools to assess risk of bias do exist, and their use may have resulted in different risk of bias assessments.

Conclusion

NLP has been applied to triage data in attempts to predict important patient-oriented outcomes including need for admission and need for critical care. However, there are few examples of implementation into clinical practice and most research is retrospective and at a high risk of bias. Despite these limitations, NLP at triage appears to be a promising area for future research. Further work is needed to prospectively assess the acceptability and clinical impact of implementing NLP at triage on staff, patients, and the healthcare system, and if there are any added benefits over usual clinical practice.

Supporting information

S1 Checklist. PRISMA 2020 for abstracts checklist.

https://doi.org/10.1371/journal.pone.0279953.s001

(PDF)

S1 File. Search strategy.

Search strategy for PubMed (MEDLINE), Embase, Cochrane Database of Systematic Reviews, Web of Science, and Scopus.

https://doi.org/10.1371/journal.pone.0279953.s003

(DOCX)

References

1. Morley C, Unwin M, Peterson GM, Stankovich J, Kinsman L. Emergency department crowding: A systematic review of causes, consequences and solutions. PLoS One. 2018;13(8):e0203316. pmid:30161242
2. Iserson KV, Moskop JC. Triage in medicine, part I: Concept, history, and types. Ann Emerg Med. 2007 Mar;49(3):275–81. pmid:17141139
3. Hinson JS, Martinez DA, Cabral S, George K, Whalen M, Hansoti B, et al. Triage performance in emergency medicine: a systematic review. Ann Emerg Med. 2019 Jul;74(1):140–52. pmid:30470513
4. Cameron P, Little M, Mitra B, Deasy C, editors. Textbook of adult emergency medicine. Fifth edition. Edinburgh: Elsevier; 2020.
5. Park JB, Lim TH. Korean Triage and Acuity Scale (KTAS). Journal of The Korean Society of Emergency Medicine. 2017;28(6):547–51.
6. Zachariasse JM, van der Hagen V, Seiger N, Mackway-Jones K, van Veen M, Moll HA. Performance of triage systems in emergency care: a systematic review and meta-analysis. BMJ Open. 2019 May 28;9(5):e026471. pmid:31142524
7. Jeppesen E, Cuevas-Østrem M, Gram-Knutsen C, Uleberg O. Undertriage in trauma: an ignored quality indicator? Scand J Trauma Resusc Emerg Med. 2020 May 6;28(1):34. pmid:32375842
8. Banco D, Chang J, Talmor N, Wadhera P, Mukhopadhyay A, Lu X, et al. Sex and race differences in the evaluation and treatment of young adults presenting to the emergency department with chest pain. J Am Heart Assoc. 2022 May 17;11(10):e024199. pmid:35506534
9. Murphy KP. Machine learning: a probabilistic perspective. Cambridge, Mass.: MIT Press; 2012.
10. Stewart J, Sprivulis P, Dwivedi G. Artificial intelligence and machine learning in emergency medicine. Emerg Med Australas. 2018 Dec;30(6):870–4. pmid:30014578
11. Kareemi H, Vaillancourt C, Rosenberg H, Fournier K, Yadav K. Machine learning versus usual care for diagnostic and prognostic prediction in the emergency department: a systematic review. Acad Emerg Med. 2021 Feb;28(2):184–96. pmid:33277724
12. Stewart J, Lu J, Goudie A, Bennamoun M, Sprivulis P, Sanfillipo F, et al. Applications of machine learning to undifferentiated chest pain in the emergency department: A systematic review. PLoS One. 2021;16(8):e0252612. pmid:34428208
13. Levin S, Toerper M, Hamrock E, Hinson JS, Barnes S, Gardner H, et al. Machine-learning-based electronic triage more accurately differentiates patients with respect to clinical outcomes compared with the emergency severity index. Ann Emerg Med. 2018 May;71(5):565–574.e2.
14. Sánchez-Salmerón R, Gómez-Urquiza JL, Albendín-García L, Correa-Rodríguez M, Martos-Cabrera MB, Velando-Soriano A, et al. Machine learning methods applied to triage in emergency services: A systematic review. Int Emerg Nurs. 2022 Jan;60:101109. pmid:34952482
15. Hong WS, Haimovich AD, Taylor RA. Predicting hospital admission at emergency department triage using machine learning. PLoS One. 2018;13(7):e0201016. pmid:30028888
16. Kwon JM, Lee Y, Lee Y, Lee S, Park H, Park J. Validation of deep-learning-based triage and acuity score using a large national dataset. PLoS One. 2018;13(10):e0205836. pmid:30321231
17. Sterling NW, Brann F, Patzer RE, Di M, Koebbe M, Burke M, et al. Prediction of emergency department resource requirements during triage: An application of current natural language processing techniques. J Am Coll Emerg Physicians Open. 2020 Dec;1(6):1676–83. pmid:33392576
18. Sterling NW, Patzer RE, Di M, Schrager JD. Prediction of emergency department patient disposition based on natural language processing of triage notes. Int J Med Inform. 2019 Sep;129:184–8. pmid:31445253
19. Spasic I, Nenadic G. Clinical text data in machine learning: systematic review. JMIR Med Inform. 2020 Mar 31;8(3):e17984. pmid:32229465
20. Leaman R, Khare R, Lu Z. Challenges in clinical natural language processing for automated disorder normalization. J Biomed Inform. 2015 Oct;57:28–37. pmid:26187250
21. Russell SJ, Norvig P, Davis E. Artificial intelligence: a modern approach. 3rd ed. Upper Saddle River: Prentice Hall; 2010.
22. Manning CD, Schütze H. Foundations of statistical natural language processing. Cambridge, Mass: MIT Press; 1999.
23. Juluru K, Shih HH, Keshava Murthy KN, Elnajjar P. Bag-of-words technique in natural language processing: a primer for radiologists. Radiographics. 2021;41(5):1420–6. pmid:34388050
24. Young T, Hazarika D, Poria S, Cambria E. Recent trends in deep learning based natural language processing [review article]. IEEE Computational Intelligence Magazine. 2018 Aug;13(3):55–75.
25. Wu S, Roberts K, Datta S, Du J, Ji Z, Si Y, et al. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc. 2020 Mar 1;27(3):457–70. pmid:31794016
26. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015 May 28;521(7553):436–44. pmid:26017442
27. Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) [Internet]. Minneapolis, Minnesota: Association for Computational Linguistics; 2019 [cited 2022 Apr 6]. p. 4171–86. Available from: https://aclanthology.org/N19-1423
28. Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, et al. Language models are few-shot learners. In: Advances in Neural Information Processing Systems [Internet]. Curran Associates, Inc.; 2020 [cited 2022 Apr 8]. p. 1877–901. Available from: https://papers.nips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
29. Chowdhery A, Narang S, Devlin J, Bosma M, Mishra G, Roberts A, et al. PaLM: scaling language modeling with Pathways [Internet]. arXiv; 2022 [cited 2022 Apr 5]. Available from: http://arxiv.org/abs/2204.02311
30. Tahayori B, Chini-Foroush N, Akhlaghi H. Advanced natural language processing technique to predict patient disposition based on emergency triage notes. Emerg Med Australas [Internet]. 2020;33(3):480–4. Available from: pmid:33043570
31. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015 Jan 1;4(1):1. pmid:25554246
32. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021 Mar 29;372:n71. pmid:33782057
33. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019 Jan 1;170(1):51–8. pmid:30596875
34. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. pmid:28384212
35. Zhang X, Kim J, Patzer RE, Pitts SR, Patzer A, Schrager JD. Prediction of emergency department hospital admission based on natural language processing and neural networks. Methods Inf Med. 2017 Oct 26;56(5):377–89. pmid:28816338
36. Gligorijevic D, Stojanovic J, Satz W, Stojkovic I, Schreyer K, Del Portal D, et al. Deep attention model for triage of emergency department patients. In: Proceedings of the 2018 SIAM International Conference on Data Mining (SDM) [Internet]. Society for Industrial and Applied Mathematics; 2018 [cited 2022 Dec 17]. p. 297–305. (Proceedings). Available from: https://epubs.siam.org/doi/abs/10.1137/1.9781611975321.34
37. Choi SW, Ko T, Hong KJ, Kim KH. Machine learning-based prediction of Korean Triage and Acuity Scale level in emergency department patients. Healthc Inform Res. 2019 Oct;25(4):305–12. pmid:31777674
38. Wang G, Liu X, Xie K, Chen N, Chen T. DeepTriager: a neural attention model for emergency triage with electronic health records. In: 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2019. p. 978–82.
39. Zhang X, Bellolio MF, Medrano-Gracia P, Werys K, Yang S, Mahajan P. Use of natural language processing to improve predictive models for imaging utilization in children presenting to the emergency department. BMC Med Inform Decis Mak. 2019 Dec 30;19(1):287. pmid:31888609
40. Zhang X, Kim J, Patzer RE, Pitts SR, Chokshi FH, Schrager JD. Advanced diagnostic imaging utilization during emergency department visits in the United States: A predictive modeling study for emergency department triage. PLoS One. 2019;14(4):e0214905. pmid:30964899
41. Arnaud É, Elbattah M, Gignon M, Dequen G. Deep learning to predict hospitalization at triage: integration of structured data and unstructured text. In: 2020 IEEE International Conference on Big Data (Big Data). 2020. p. 4836–41.
42. Chang D, Hong WS, Taylor RA. Generating contextual embeddings for emergency department chief complaints. JAMIA Open. 2020 Jul;3(2):160–6. pmid:32734154
43. Fernandes M, Mendes R, Vieira SM, Leite F, Palos C, Johnson A, et al. Predicting Intensive Care Unit admission among patients presenting to the emergency department using machine learning and natural language processing. PLoS One. 2020;15(3):e0229331. pmid:32126097
44. Fernandes M, Mendes R, Vieira SM, Leite F, Palos C, Johnson A, et al. Risk of mortality and cardiopulmonary arrest in critical patients presenting to the emergency department using machine learning and natural language processing. PLoS One. 2020;15(4):e0230876. pmid:32240233
45. Joseph JW, Leventhal EL, Grossestreuer AV, Wong ML, Joseph LJ, Nathanson LA, et al. Deep-learning approaches to identify critically ill patients at emergency department triage using limited information. J Am Coll Emerg Physicians Open. 2020 Oct;1(5):773–81. pmid:33145518
46. Roquette BP, Nagano H, Marujo EC, Maiorano AC. Prediction of admission in pediatric emergency department with deep neural networks and triage textual data. Neural Netw. 2020 Jun;126:170–7. pmid:32240912
47. Ivanov O, Wolf L, Brecher D, Lewis E, Masek K, Montgomery K, et al. Improving ED emergency severity index acuity assignment using machine learning and clinical natural language processing. J Emerg Nurs. 2021 Mar;47(2):265–278.e7. pmid:33358394
48. Kim D, Oh J, Im H, Yoon M, Park J, Lee J. Automatic classification of the Korean Triage Acuity Scale in simulated emergency rooms using speech recognition and natural language processing: a proof of concept study. J Korean Med Sci. 2021 Jul 12;36(27):e175. pmid:34254471
49. Irvine AK, Haas SW, Sullivan T. TN-TIES: A system for extracting temporal information from emergency department triage notes. AMIA Annu Symp Proc. 2008 Nov 6;2008:328–32. pmid:18998945
50. Greenbaum NR, Jernite Y, Halpern Y, Calder S, Nathanson LA, Sontag DA, et al. Improving documentation of presenting problems in the emergency department using a domain-specific ontology and machine learning-driven user interfaces. Int J Med Inform. 2019 Dec;132:103981. pmid:31605881
51. Liyanage H, Krause P, De Lusignan S. Using ontologies to improve semantic interoperability in health data. J Innov Health Inform. 2015 Jul 10;22(2):309–15. pmid:26245245
52. Li H. Deep learning for natural language processing: advantages and challenges. National Science Review [Internet]. 2018 Jan 1 [cited 2022 Jun 16];5(1):24–6. Available from: https://academic.oup.com/nsr/article/5/1/24/4107792
53. Johnson A, Bulgarelli L, Pollard T, Celi LA, Mark R, Horng S. MIMIC-IV-ED [Internet]. PhysioNet; [cited 2022 Jun 22]. Available from: https://physionet.org/content/mimic-iv-ed/2.0/
54. Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000 Jun 13;101(23):E215–220. pmid:10851218
55. Amann J, Blasimme A, Vayena E, Frey D, Madai VI, Precise4Q consortium. Explainability for artificial intelligence in healthcare: a multidisciplinary perspective. BMC Med Inform Decis Mak. 2020 Nov 30;20(1):310.
56. Yoon CH, Torrance R, Scheinerman N. Machine learning in medicine: should the pursuit of enhanced interpretability be abandoned? J Med Ethics. 2022 Sep;48(9):581–5. pmid:34006600
57. London AJ. Artificial intelligence and black-box medical decisions: accuracy versus explainability. Hastings Cent Rep. 2019 Jan;49(1):15–21. pmid:30790315
58. Adadi A, Berrada M. Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access. 2018;6:52138–60.
59. Linardatos P, Papastefanopoulos V, Kotsiantis S. Explainable AI: a review of machine learning interpretability methods. Entropy (Basel). 2020 Dec 25;23(1):E18.
60. Eaneff S, Obermeyer Z, Butte AJ. The case for algorithmic stewardship for artificial intelligence and machine learning technologies. JAMA. 2020 Oct 13;324(14):1397–8. pmid:32926087
61. Schrader CD, Lewis LM. Racial disparity in emergency department triage. J Emerg Med. 2013 Feb;44(2):511–8. pmid:22818646
62. Kuhn L, Page K, Rolley JX, Worrall-Carter L. Effect of patient sex on triage for ischaemic heart disease and treatment onset times: A retrospective analysis of Australian emergency department data. Int Emerg Nurs. 2014 Apr;22(2):88–93. pmid:24071742
63. Vigil JM, Coulombe P, Alcock J, Kruger E, Stith SS, Strenth C, et al. Patient ethnicity affects triage assessments and patient prioritization in U.S. Department of Veterans Affairs emergency departments. Medicine (Baltimore). 2016 Apr;95(14):e3191. pmid:27057847
64. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med. 2018 Nov 1;178(11):1544–7. pmid:30128552
65. Linardatos P, Papastefanopoulos V, Kotsiantis S. Explainable AI: a review of machine learning interpretability methods. Entropy (Basel). 2020 Dec 25;23(1):18. pmid:33375658
66. Vokinger KN, Feuerriegel S, Kesselheim AS. Mitigating bias in machine learning for medicine. Commun Med (Lond). 2021 Aug 23;1:25. pmid:34522916
67. Young AT, Amara D, Bhattacharya A, Wei ML. Patient and general public attitudes towards clinical artificial intelligence: a mixed methods systematic review. Lancet Digit Health. 2021 Sep;3(9):e599–611. pmid:34446266
68. Ongena YP, Haan M, Yakar D, Kwee TC. Patients’ views on the implementation of artificial intelligence in radiology: development and validation of a standardized questionnaire. Eur Radiol. 2020 Feb;30(2):1033–40. pmid:31705254
69. Bala S, Keniston A, Burden M. Patient perception of plain-language medical notes generated using artificial intelligence software: pilot mixed-methods study. JMIR Form Res. 2020 Jun 5;4(6):e16670. pmid:32442148
70. Nelson CA, Pérez-Chada LM, Creadore A, Li SJ, Lo K, Manjaly P, et al. Patient perspectives on the use of artificial intelligence for skin cancer screening: a qualitative study. JAMA Dermatol. 2020 May 1;156(5):501–12. pmid:32159733
71. Jutzi TB, Krieghoff-Henning EI, Holland-Letz T, Utikal JS, Hauschild A, Schadendorf D, et al. Artificial intelligence in skin cancer diagnostics: the patients’ perspective. Front Med (Lausanne). 2020;7:233. pmid:32671078
72. Nadarzynski T, Miles O, Cowie A, Ridge D. Acceptability of artificial intelligence (AI)-led chatbot services in healthcare: A mixed-methods study. Digit Health. 2019;5:2055207619871808. pmid:31467682
73. Palmisciano P, Jamjoom AAB, Taylor D, Stoyanov D, Marcus HJ. Attitudes of patients and their relatives toward artificial intelligence in neurosurgery. World Neurosurg. 2020 Jun;138:e627–33. pmid:32179185