Advanced diagnostic imaging utilization during emergency department visits in the United States: A predictive modeling study for emergency department triage

Background: Emergency department (ED) crowding is associated with negative health outcomes, patient dissatisfaction, and longer length of stay (LOS). The addition of advanced diagnostic imaging (ADI), namely CT, ultrasound (U/S), and MRI to ED encounter work up is a predictor of longer length of stay. Earlier and improved prediction of patients’ need for advanced imaging may improve overall ED efficiency. The aim of the study was to detect the association between ADI utilization and the structured and unstructured information immediately available during ED triage, and to develop and validate models to predict utilization of ADI during an ED encounter.
Methods: We used the United States National Hospital Ambulatory Medical Care Survey data from 2009 to 2014 to examine which sociodemographic and clinical factors immediately available at ED triage were associated with the utilization of CT, U/S, MRI, and multiple ADI during a patient’s ED stay. We used natural language processing (NLP) topic modeling to incorporate free-text reason for visit data available at time of ED triage in addition to other structured patient data to predict the use of ADI using multivariable logistic regression models.
Results: Among the 139,150 adult ED visits from a national probability sample of hospitals across the U.S., 21.9% resulted in ADI use, including 16.8% who had a CT, 3.6% who had an ultrasound, 0.4% who had an MRI, and 1.2% of the population who had multiple types of ADI. The c-statistic of the predictive models was greater than or equal to 0.78 for all imaging outcomes, and the addition of text-based reason for visit information improved the accuracy of all predictive models.
Conclusions: Patient information immediately available during ED triage can accurately predict the eventual use of advanced diagnostic imaging during an ED visit. Such models have the potential to be incorporated into the ED triage workflow in order to more rapidly identify patients who may require advanced imaging during their ED stay and assist with medical decision-making.


Introduction
Emergency department (ED) crowding is a well-recognized problem in the United States [1][2][3]. Problems associated with ED crowding have been extensively documented: longer wait times and length of stay (LOS) during ED visits; staff and patient dissatisfaction; higher hospital costs; and negative patient outcomes [4][5][6][7][8]. As a result, many emergency departments are moving toward physician triage models in which physicians perform rapid evaluations to expedite work-ups and dispositions while patients are still in the waiting area. This method has shown promise, with studies demonstrating decreased LOS and fewer patients who leave without being seen [9][10][11][12]. Algorithmic clinical decision support, specifically predictive analytics, may be of benefit in this clinical setting [13]; however, its use has not been sufficiently described or tested.
The decision by an ED provider to pursue advanced diagnostic imaging (ADI) studies during an ED visit is a major contributor to increased ED LOS [14,15], and ADI use in the ED has been increasing for more than a decade [16]. The median LOS for ED patients with ADI is 114 minutes longer than for those without ADI [17]. This increased LOS can be attributed to clinical factors, such as the amount of time it takes to obtain and interpret a CT scan, and diagnostic factors, such as the time it takes to clinically evaluate a patient and decide if they will need ADI. Early prediction of eventual ADI use has the potential to shorten diagnostic time. To date, research has not examined the role of a predictive model that can use information immediately available to a triage provider upon patient arrival (e.g. patient demographics, vitals, medical history, and the patient's own descriptions of their reason for visit) to estimate the probability that the patient will undergo ADI during their ED visit. Such a predictive model, if implemented and tested in the clinical setting, perhaps as an adjunct to the Electronic Health Record (EHR) or as a standalone program, could support clinicians in making rapid, informed decisions regarding ADI. Furthermore, few studies have utilized the important information that exists within the free-text reason for visit that patients give on arrival to the ED to make predictions regarding processes and outcomes in the ED [18,19]. This free-text reason for visit information can be utilized via natural language processing (NLP), a method through which text data can be extracted and processed for analysis and which has been shown to improve models related to health outcomes [20][21][22][23].
In a nationally representative sample, we examined patient information that would readily be available during the ED triage process, including free-text reason for visit, to develop predictive models for ADI use, including computed tomography (CT), ultrasound (U/S), and magnetic resonance imaging (MRI), during the ED encounter.

Study population
This study is a secondary analysis of data collected from the 2009-2014 National Hospital Ambulatory Medical Care Survey ED Subfile (NHAMCS-ED) [24][25][26], a multistage, stratified probability sample of ED visits in the United States administered by the National Center for Health Statistics, a branch of the Centers for Disease Control and Prevention. The NHAMCS-ED sample is collected during a random 4-week period each year. Study staff visit approximately 300 hospital-based EDs, which are randomly selected from approximately 1,900 geographically defined areas covering all 50 States and the District of Columbia. A standardized form and protocol are utilized to abstract data from approximately 100 patient charts per ED. Details of the survey methodology are available from the National Center for Health Statistics [25,26]. A total of 179,036 patient visits were included in the survey datasets from 2009 to 2014. After excluding pediatric visits (n = 39,886), 139,150 (77.7%) adult patient visits (≥18 years old) remained for analysis.
Missing data. Missing values for age, sex, race, and ethnicity, approximately 0.1%, 0.9%, 16.8%, and 30.3% respectively, were imputed by the NHAMCS investigators in the dataset prior to public release. According to the NHAMCS, the investigators imputed age and sex using a hot deck based on 3-digit ICD-9-CM code for primary diagnosis, triage level, ED volume, and geographic region, while they imputed patient ethnicity using a model-based single, sequential regression method [26]. We imputed missing values for all other variables with the median of the corresponding variable before establishing the statistical models for this study; these variables include vital signs, mode of arrival, patient's residence type, source of payment, episode of care, whether the visit was related to injury/poisoning, triage level, and pain scale (Table 1).
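The median imputation applied to the remaining variables can be sketched as follows. This is a minimal illustration in Python rather than the study's own code (the analyses were run in SAS, R, and Matlab); the helper name impute_median, the field names, and the values are hypothetical.

```python
from statistics import median

def impute_median(rows, field):
    """Replace None in `field` with the median of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]

# Hypothetical triage records; None marks a missing value.
visits = [
    {"pulse": 72, "pain_scale": 4},
    {"pulse": None, "pain_scale": 7},
    {"pulse": 96, "pain_scale": None},
    {"pulse": 88, "pain_scale": 2},
]

# Impute each variable independently from its own observed values.
for name in ("pulse", "pain_scale"):
    visits = impute_median(visits, name)
```

Median imputation of this kind keeps every visit in the analysis but shrinks the variance of the imputed variable, which is one reason a complete-case sensitivity analysis is a useful companion check.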

Statistical analyses
Topic modeling. NLP is a branch of computational linguistic techniques that extract and analyze information from unstructured and semi-structured text or speech data. Topic modeling is a commonly used technique for NLP, which can identify patterns hidden in the free text to evaluate an underlying theme or topic of the text [28]. The model based on the Latent Dirichlet Allocation (LDA) algorithm [29,30] was used to break all the free text into different themes after preprocessing [31,32].
The mathematical principles and algorithm of LDA have been described in prior research [28,29]. Briefly, the free-text reason for visit from each patient is treated as a mixture of several topics, each composed of a set of words. For example, in a two-topic model, the reason for visit from patient 1 may contain 20% topic A (gastrointestinal problems) and 80% topic B (respiratory problems), while patient 2's reason for visit could be 90% topic A and 10% topic B. The most common terms in the gastrointestinal topic might be "hematemesis" and "vomit", while the respiratory topic may be composed of words including "breath", "asthma", and "shortness". The LDA method can identify the mixture of topics that describes each free-text reason for visit, while determining the mixture of words that is associated with each topic. The correlation coefficient between each patient and each topic can be estimated. In this way, the free-text reasons for visit were transformed into a structured matrix of correlation coefficients between the patients and topics, which can be used for predicting the outcome. We employed the ldatatuning package in R for this analysis as previously described [32].
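The two-topic example can be made concrete with a short sketch. It does not fit an LDA model; it only shows the mixture structure that a fitted model produces, in which the probability of a word in a patient's text is the topic-weighted sum of per-topic word probabilities. The patient-topic weights come from the example above, while the per-topic word probabilities and all names in the code are hypothetical.

```python
# Topic-word probabilities. Values are hypothetical, chosen for illustration.
topic_words = {
    "A_gastrointestinal": {"hematemesis": 0.4, "vomit": 0.5, "breath": 0.1},
    "B_respiratory": {"breath": 0.5, "asthma": 0.3, "shortness": 0.2},
}

# Patient-topic weights taken from the two-patient example in the text.
doc_topics = {
    "patient_1": {"A_gastrointestinal": 0.2, "B_respiratory": 0.8},
    "patient_2": {"A_gastrointestinal": 0.9, "B_respiratory": 0.1},
}

def word_probability(doc, word):
    """p(word | doc) = sum over topics of weight[doc][topic] * p(word | topic)."""
    return sum(
        weight * topic_words[topic].get(word, 0.0)
        for topic, weight in doc_topics[doc].items()
    )

# Structured feature matrix: one row per patient, one column per topic.
topics = sorted(topic_words)
feature_matrix = [[doc_topics[d][t] for t in topics] for d in sorted(doc_topics)]
```

Each row of feature_matrix is the kind of structured representation of a free-text reason for visit that can be passed to a regression model alongside the other triage variables.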
Regression modelling. Logistic regression models were used to measure the association between the outcome and the structured and unstructured predictors, and to predict the outcomes. To determine the predictive performance in identifying patients with advanced imaging use, we analyzed three models: (1) models with structured variables; (2) models with free-text data using NLP; (3) models with both structured and free-text variables. This was done for any ADI use, any CT use (including multiple cases), any U/S scan (including multiple cases), any MRI (including multiple cases), and multiple types of ADI use.
Ten-fold cross-validation was used to validate the performance of each model. The dataset was randomly divided into 10 sets; 9 of the 10 sets were used to train the models while the remaining set was used as the testing set. The area under the receiver operating characteristic (ROC) curve was recorded for the testing set. The average ROC curve was derived by combining the predicted values from all 10 cross-validation testing sets. The probabilities of ADI use for each patient were calculated with this model. The best cutoff of the probabilities was determined by using the point on the ROC curve with the shortest distance to the upper left corner (where sensitivity = 1 and specificity = 1). The best cutoff of the probabilities for prediction and the corresponding sensitivity, specificity, and overall accuracy were recorded [33]. To evaluate the effect of the missing values on the models, specifically for the five comorbidity variables that were not part of the dataset prior to 2012 as well as one variable that was not collected in one survey year, we performed a sensitivity analysis using cases without missing values in any of the variables considered. Among a total of 139,150 cases, there were 14,009 (10.1%) cases without any missing values. Basic data organization was done in SAS 9.4. The text analyses were performed in R 3.3.2. The modeling of logistic regression was performed in Matlab R2016b.
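The fold assignment and the cutoff rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the study's Matlab implementation: the function names are hypothetical, the labels and probabilities are made-up held-out predictions, and candidate cutoffs are restricted to the observed probability values.

```python
import math
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def sens_spec(labels, probs, cutoff):
    """Sensitivity and specificity when predicting positive for prob >= cutoff."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(labels, probs):
    """Cutoff whose ROC point lies closest to (sensitivity = 1, specificity = 1)."""
    def distance(c):
        sens, spec = sens_spec(labels, probs, c)
        return math.hypot(1 - sens, 1 - spec)
    return min(sorted(set(probs)), key=distance)

# Hypothetical held-out labels (1 = ADI used) and predicted probabilities.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.02, 0.03, 0.04, 0.06, 0.05, 0.10, 0.30, 0.60]

folds = k_fold_indices(len(labels), k=4)  # each fold serves once as the test set
cutoff = best_cutoff(labels, probs)
sens, spec = sens_spec(labels, probs, cutoff)
```

The closest-to-corner rule weights sensitivity and specificity equally; a deployment that valued missed imaging differently from unnecessary flags would need a different criterion.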

Characteristics of ED patients
Among 139,150 ED patient visits from December 2008 to December 2014, 21.9% of visits resulted in ADI use, including 16.8% who had CTs, 3.6% with U/S, 0.4% with MRIs, and 1.2% who had multiple types of ADI. The ADI use proportion increased in the older age groups.

Factors associated with ADI use
The adjusted odds ratios of ED visits resulting in different types of ADI use (vs. no ADI use) for each variable using multinomial logistic regression analyses are presented in Table 2 and S1 Table. Age, triage level, arrival mode, place of residence, and certain comorbidities were also predictive of the eventual use of ADI. For example, the odds of ADI use increased progressively with increasing age; compared to patients in the age 18-29 group, the adjusted odds of ADI use was 1.97 times higher for patients ≥75 years old (95% CI 1.86-2.08), 1.69 times higher for patients in the 65-74 age group (95% CI 1.59-1.79), 1.25 times higher for patients in the 45-64 age group (95% CI 1.20-1.30), and 1.11 times higher for patients in the 30-44 age group (95% CI 1.06-1.15) (Table 2). Those who arrived via ambulance were 1.87 times more likely to receive ADI than those who did not (95% CI 1.80-1.93). Compared to those who lived in a private residence, nursing home patients were 10% less likely (OR: 0.90, 95% CI 0.83-0.98), while those who were homeless were 46% less likely to have any ADI use (OR: 0.54, 95% CI 0.46-0.64). Comorbidities had varying likelihood of any ADI use; patients with history of cerebrovascular disease were 1.74 times more likely (95% CI 1.63-1.85) and those with dementia were 1.56 times more likely (95% CI 1.35-1.80) than those who did not have respective comorbidities to have any ADI use (Table 2). Increasing pain scale was associated progressively with an increased likelihood of any ADI use, as well as of CT use (S1 Fig). The trends in the likelihood of CT use only and multiple types of ADI use (vs. no ADI use) closely mirrored that of any ADI use with race, age, triage level, mode of arrival, place of residence, source of payment, and certain comorbidities (cerebrovascular disease and dementia) being most predictive based on the odds ratios (Table 2).
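For readers less familiar with how such figures are obtained, the sketch below shows the standard conversion from a logistic regression coefficient to an odds ratio, and from an odds ratio to a percent change in odds. The Wald-type interval is an assumption about how the intervals were computed, the function names are hypothetical, and the coefficient and standard error in the second example are made up.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with a Wald 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percent change in odds vs. the reference group."""
    return (odds_ratio - 1.0) * 100.0

# Reported OR for homeless patients vs. private residence: 0.54,
# which corresponds to 46% lower odds of any ADI use.
change = percent_change_in_odds(0.54)

# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_with_ci(beta=0.5, se=0.1)
```

Strictly, these quantities describe odds rather than probabilities, so phrases such as "less likely" are a shorthand for lower odds.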

Text variables extracted using topic modelling to predict ADI use
The top 10 terms in each topic for the first 20 topics are presented in S3 Table. Although these topics cannot all be generalized into terms that are clinically meaningful, words that have been grouped into topics may indicate a theme. For example, the first topic shows a theme related to the extremities, the second gastrointestinal problems, the third respiratory problems, and the fourth trauma. The first 20 topics all show significant odds in all types of ADI use; for example, "topic 12" has an odds ratio of 0.07 (95% CI 0.04-0.11) for predicting any ADI use.

Predictive performance of multivariable logistic regression models
Applying the three logistic regression models (model 1: structured variables only, model 2: text-based reason for visit variables only, and model 3: both text-based and structured variables), we found that the predictive accuracy for ADI use was higher for models with text-based reason for visit variables only compared to models with structured variables only. The predictive accuracy was the highest when both text-based reason for visit and structured variables were included (Table 3 and Fig 1). For models that included both structured and unstructured variables, the AUC was 0.78 (0.77-0.78) for any ADI use, 0.79 (0.79-0.79) for CT use, 0.83 (0.82-0.84) for U/S use, 0.80 (0.79-0.80) for MRI use, and 0.78 (0.77-0.79) for multiple ADI use. Estimated coefficients and standardized coefficients of the structured variables from logistic regression between the outcome of ADI use and the predictors are presented as a modeling example (S4 Table), which can be used for prospective study. Standardized coefficients can be compared to show which variables have a greater effect on the ADI use prediction. The item "whether the injury/poisoning intentional" and the immediate triage level had the highest standardized coefficients among the structured variables.

Discussion
Improving ED efficiency may help address the continued problem and negative consequences of ED crowding in the U.S. [2,6,34]. One previously unexplored solution to address this problem may be to identify, earlier in their ED encounter, patients who are more likely to eventually obtain ADI. Our study applied predictive analytics and natural language processing modeling techniques with six years of nationally representative survey data to create a model to predict ADI use during the ED triage process. One of the novel aspects of this study was the use of not only structured variables (examples: age, race, residence type), but also text-based information (reason for visit and cause of injury) via natural language processing. Specifically, we chose LDA topic modelling, which balances predictive performance and ease of information interpretation by grouping words into topics [28]. With the inclusion of reason for visit information in the model, the AUC ranged from 0.78 to 0.83 for all outcomes. Using the best probability cut-off identified in the study (p = 0.05) as the threshold, the overall accuracy of the model for ultrasound use, for example, reached 78% (with sensitivity of 0.73 and specificity of 0.78). In other words, at this threshold the model correctly identifies 73% of patients who will eventually receive ultrasound during their ED stay and 78% of those who will not.
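The relationship between the three ultrasound figures can be checked with a short calculation: overall accuracy is the prevalence-weighted average of sensitivity and specificity. The sketch below uses the sensitivity and specificity reported above; the 3.6% prevalence is the share of visits with ultrasound reported in the Results and is used here as an assumption, since the modeled outcome may also include visits with multiple imaging types.

```python
def overall_accuracy(sensitivity, specificity, prevalence):
    """Accuracy = sensitivity * prevalence + specificity * (1 - prevalence)."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

# Sensitivity and specificity as reported for the ultrasound model;
# prevalence of 3.6% is an assumption taken from the descriptive results.
accuracy = overall_accuracy(sensitivity=0.73, specificity=0.78, prevalence=0.036)
```

Because ultrasound is uncommon in this sample, the overall accuracy of about 78% is driven almost entirely by specificity, which is why it nearly coincides with the specificity figure.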
During our exploration and model development, we observed surprising and substantial racial and socioeconomic disparities in the use of ADI in this sample. Similar to previous studies, African Americans were less likely to have ADI compared to white patients [35,36]. There are several potential explanations for these differences. For example, some evidence suggests that injury severity varies by race, thus warranting differential use of ADI [35,37]. In addition, the extent of overcrowding in an ED has been shown to affect the thoroughness of patients' evaluation, which disproportionately affects hospitals that serve a higher number of African Americans [37,38]. Other potential explanations for racial differences in ADI use include provider implicit bias [35], and/or potential overuse of ADI in white patients rather than underuse in African American patients [39]. Patients with Medicaid and uninsured patients were also less likely to receive ADI compared to patients with private insurance [40]. Compared to patients who live in private residences, patients from nursing homes and patients who were homeless had decreased likelihood of any ADI use. Reasons for these disparities should be further explored in future research to determine the appropriateness of including or excluding these variables in prediction models [37] based on the clinical context, such as the one proposed in this study. This will be important to determine whether such prediction models can serve as a more objective tool to predict whether a patient will need ADI by excluding factors that may be influenced by clinician bias, for example. It may also be of value to explore the relationship between measurements of disparities, such as the role that insurance type plays in the racial differences we observed in ADI, or the influence of ED-specific characteristics such as urbanicity, teaching hospital designation, or safety net designation on ADI utilization.
Further studies are needed to determine the effect of predictive clinical decision support algorithms such as the one constructed for this study, on the clinical use of ADI in settings where it can potentially be deployed to reduce racial and socioeconomic disparities in ADI use.
We also found that age, triage level, arrival mode, place of residence, and certain comorbidities were predictive of the eventual use of ADI during an ED visit. As expected, patients with emergent and immediate triage levels had the highest likelihood of ADI use. These patients are often immediately placed in an ED room for workup shortly after arrival and early identification of their need for ADI would likely have less of an impact on ED LOS, and the decision tree that can lead to ADI in these patients often bypasses the traditional triage processes. However, patients who were triaged as urgent (typically triage level 3) or semi-urgent (typically triage level 4) also had increased odds of ADI use. These patients typically spend a longer portion of their LOS in the waiting area prior to being placed in a room, after which a provider typically makes the decision to pursue ADI. Urgent and semi-urgent patients comprise the majority of all ED patients (80.2% in this sample) and stand to benefit the most from this form of predictive modelling as they utilize the majority of ED ADI (79.8% of ADI in this sample).
When the use of each type of ADI (CT, MRI, U/S) was analyzed, we found that the general trends closely mirrored that of any ADI use except for U/S. One explanation for this difference is the increasing number of ultrasounds performed unofficially as a point of care test at bedside or in a fashion that was not captured in the dataset. For example, the Focused Assessment with Sonography in Trauma (FAST) ultrasound [41] is often not captured, which would result in under-reporting in the current dataset. Although the pain scale has been shown in prior research to be a poor predictor of health outcomes [42,43], we found that patients indicating higher levels of pain on the traditional ten-point pain scale had increased odds of receiving ADI (Table 2, S1 Fig). This may reflect the fact that physicians tend to do more for patients who complain of severe pain [44].
The triage process in the ED represents the earliest in-person point of contact between a patient and a medical provider, often a nurse, after arriving in the ED. This is an extremely important encounter, but one that is often quite brief. A decision support system built on models such as the ones proposed in this study may be valuable to triage personnel, charge nurses, hospital leaders, hospital flow coordinators, and ED physicians. Further research will be needed to test the effect of using such a system in the clinical setting. Because these models were derived from nationally representative survey data, the clinical use of this type of modelling strategy may benefit from location-specific data and additional calibration of models for specific regions of the country or patient populations.
The present study has several limitations. First, missing values in the datasets affect the predictive performance of the models; sensitivity analyses were performed to limit this potential source of bias. Second, the interpretation of the topic models used for natural language processing in extracting data from unstructured information is not always straightforward as these topics are computer generated and take into account multiple layers of variable interactions. Third, the specific test ordered (example: CT of head) was not available in the dataset for analysis; however, this may be important to explore in future studies. Fourth, the dataset does not provide the results of ADI studies, the medical indication for those studies, who ordered the study, or the time of ordering the study. Therefore, we lack the temporal information to know if an ADI study was ordered for a patient immediately upon arrival by triage personnel or if it was ordered later by a different provider who was privy to additional clinical information such as lab results or changes in clinical status. Additionally, we used the ED physicians' decision to pursue ADI as the gold standard to establish the outcome for the predictive model, not whether the imaging yielded or ruled out a diagnosis. As a result, the appropriateness of these decisions cannot be assessed. To determine the utility of these models in supporting physicians' ED triage decisions, these predictive models should be developed and prospectively validated in the clinical setting. Because these predictive models were designed on national data and focused on a specific time point (triage) during an ED encounter, it is not possible to account for the impact of serial interactions in the ED that lead toward or away from ADI. Further studies are needed to assess the impact of using this type of predictive model on the triage behaviors, ordering patterns, clinical pathways, overall imaging utilization, and ED flow.

Conclusion
This investigation used six years of nationally representative ED data to construct statistical models to predict the eventual use of advanced diagnostic imaging, using only the information that would be available at the time of ED triage: vital signs, general medical information, and the patient's stated reason for visit. The overall discriminatory accuracy of these models supports prospective testing for use as an adjunct clinical decision support tool.