Identifying Patients with Bacteremia in Community-Hospital Emergency Rooms: A Retrospective Cohort Study

Objectives (1) To develop a clinical prediction rule to identify patients with bacteremia, using only information that is readily available in the emergency room (ER) of community hospitals, and (2) to test the validity of that rule with a separate, independent set of data. Design Multicenter retrospective cohort study. Setting To derive the clinical prediction rule we used data from 3 community hospitals in Japan (derivation). We tested the rule using data from one other community hospital (validation), which was not among the three “derivation” hospitals. Participants Adults (age ≥ 16 years old) who had undergone blood-culture testing while in the ER between April 2011 and March 2012. For the derivation data, n = 1515 (randomly sampled from 7026 patients), and for the validation data n = 467 (from 823 patients). Analysis We analyzed 28 candidate predictors of bacteremia, including demographic data, signs and symptoms, comorbid conditions, and basic laboratory data. Chi-square tests and multiple logistic regression were used to derive an integer risk score (the “ID-BactER” score). Sensitivity, specificity, likelihood ratios, and the area under the receiver operating characteristic curve (i.e., the AUC) were computed. Results There were 241 cases of bacteremia in the derivation data. Eleven candidate predictors were used in the ID-BactER score: age, chills, vomiting, mental status, temperature, systolic blood pressure, abdominal sign, white blood-cell count, platelets, blood urea nitrogen, and C-reactive protein. The AUCs were 0.80 (derivation) and 0.74 (validation). For ID-BactER scores ≥ 2, the sensitivities for derivation and validation data were 98% and 97%, and specificities were 20% and 14%, respectively. Conclusions The ID-BactER score can be computed from information that is readily available in the ERs of community hospitals. Future studies should focus on developing a score with a higher specificity while maintaining the desired sensitivity.


Introduction
Delaying treatment of bacteremia can be fatal. [1] Thus, clinicians need to very quickly identify patients who have bacteremia, a diagnosis that is confirmed with blood-culture results. [2] However, clinicians may suspect that sepsis is present in too many patients in whom it is, in fact, absent. If blood cultures are used for all patients in whom bacteremia might be suspected, then most of the results will be negative. Specifically, in previous studies 4-7% of blood-culture results have been positive. [3][4] Blood-culture analyses can be costly, resulting in a 20% increase in total hospital costs for patients with false-positive blood-culture results. [5][6][7] Moreover, the results of blood-culture testing may not affect clinical decisions. [8][9] Unnecessary blood cultures of course waste medical resources and healthcare workers' time, and expose them and their patients to unnecessary risks.
Furthermore, because of the time required, blood-culture results do not inform the decision of whether or not to start treatment. That decision is challenging also because the clinical presentation of bacteremia varies greatly, depending on the cause of the infection. [10] Physicians often overestimate a patient's likelihood of having bacteremia. In the emergency room (ER) of a community hospital, if one could identify those patients who are very likely to have community acquired bacteremia, then it might be possible to avoid, at least in a few cases, unnecessary punctures for blood samples, unnecessary antibiotic therapy, and unnecessary admission to hospital. [11] Attempts to devise a quick procedure to identify patients with bacteremia have had only limited success. [12] Some procedures are useful only in elderly patients [13] or only in those with urinary tract infections [14] or with pneumonia. [15] Others require complex calculations, [16] or use rare or difficult-to-obtain measurements. [17,18] While some studies reported rules for predicting hospital-acquired bacteremia, [3,13,16,19-21] at least three procedures have been developed to identify community acquired bacteremia among ER patients. [17,18,22] They were developed using data from one university hospital each, so their utility in ERs of various community hospitals is unclear. In addition, they use bands or procalcitonin values, which are usually not available in ERs of community hospitals, [17,18] and their development did not include validation studies. [18,22] The objective of this study was to develop a new prediction rule to identify bacteremia, one that overcomes some of the limitations of prior studies. Here we report on the development and testing of a highly sensitive procedure that uses only information that is readily available in ERs of community hospitals, to identify patients who have community acquired bacteremia.

Study design
In this multicenter retrospective cohort study we derived and evaluated a clinical prediction rule. First we used clinical data from 3 hospitals (the "derivation" data) to develop a procedure for identifying patients in the ER who have bacteremia, and then we used data from one other hospital (the "validation" data) to test that procedure.

Setting
The setting was 4 community hospitals in Japan, all of which receive patients on an emergency basis and all of which provide primary, secondary, and tertiary care. The Japanese Red Cross Nagoya Daini Hospital and Tenri Hospital have transplant wards, and Okinawa General Hospital has a Level I trauma center. The data were obtained from patients aged 16 years and older who had undergone blood-culture testing while in the ER between 1 April 2011 and 31 March 2012. In these ERs, physicians drew blood for cultures from different sites each time, and with no time interval between blood samples. In 2009 in Japan, the median percentage of blood-culture tests involving multiple sets of samples was 67.2%. [23] In 2014, it became more common to take multiple sets of samples for blood cultures, because their cost began to be covered under the national health insurance system. [24] In this study, we excluded patients from whom there was only one sample for blood culture.
Derivation data came from the Japanese Red Cross Nagoya Daini Hospital, Okinawa General Hospital, and Shizuoka General Hospital. Those hospitals have, respectively, 812 beds, 550 beds, and 720 beds. The numbers of ER patients who underwent blood-culture tests were 2264 (4.9% of 45,779 visits in one year), 4180 (11.3% of 37,106 visits in one year), and 582 (6.6% of 9738 visits in one year) for those three hospitals, respectively.
One very common practice is to include at least 10 cases with the outcome (i.e., bacteremia) for each potential predictor in a multivariable model. We considered that we might need to use up to 15 predictors, and so, with "10 cases per predictor" in mind, we expected to need about 150 cases of bacteremia. Next, to be conservative, we presumed that only about 10% of the patients to be studied in fact had bacteremia. Thus, we estimated that we would need to study records from about 1500 patients (as 10% of 1500 is 150). Sampling approximately equally from the three hospitals would result in about 500 records from each hospital. We assumed that some of the records would be unusable (because of excessive missing data, inconsistencies, etc.) and so we sampled, randomly, somewhat more than 500 from two of the hospitals (in Nagoya and in Okinawa), and we used as many of those sampled records as possible. Regarding the third hospital (Shizuoka General Hospital), the total number available was not far from 500 (it was 582), so we used data from all 582. Thus, we collected information from a total of 1570 patients: 505 sampled randomly from those in Nagoya, 483 sampled randomly from those in Okinawa, and all 582 from Shizuoka. Of those 1570 patients, 55 underwent only one blood-culture test. We excluded those 55 and analysed the data from the remaining 1515 (96.5% of 1570) (Fig 1).
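The sample-size reasoning above can be written out as a short calculation. This is only a sketch of the arithmetic described in the text (10 cases per predictor, up to 15 predictors, an assumed 10% prevalence); the function name and its parameters are illustrative and do not come from the study.

```python
import math
from fractions import Fraction


def required_sample_size(n_predictors, cases_per_predictor=10,
                         assumed_prevalence=Fraction(1, 10)):
    """Return (cases needed, records needed) under a cases-per-predictor rule.

    The prevalence is passed as a Fraction so that the division is exact.
    """
    cases_needed = n_predictors * cases_per_predictor
    records_needed = math.ceil(cases_needed / assumed_prevalence)
    return cases_needed, records_needed


# Values taken from the text: up to 15 predictors, 10% presumed prevalence.
cases, records = required_sample_size(15)
records_per_hospital = records // 3  # approximately equal sampling from 3 hospitals
print(cases, records, records_per_hospital)
```

With the values from the text, the calculation gives 150 cases, 1500 records, and 500 records per hospital.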
Validation data came from Tenri Hospital (in Nara Prefecture), which has 815 beds. The number of ER patients who underwent blood-culture tests was 823 (5.9% of 13,997 visits in one year). We randomly chose 500 patients, then excluded 33 because they had undergone only one test, and analysed data from the remaining 467 (93.4% of 500) (Fig 1).

Candidate predictors
In choosing candidate predictors, we excluded those that could not be readily obtained while a patient was in the ER of a community hospital. We also took into account the results of recent studies [3, 12-22, 25, 26] and had discussions with 4 physicians, each of whom had more than 10 years of experience in clinical practice. The candidate predictors (Table 1) included demographic data, signs and symptoms, mental status, comorbid conditions, and laboratory data (measured within 12 hours of arrival at the ER). Demographic data and symptoms were gender, age (≥ 65 years), fever, chills including shaking chills, vomiting (nausea alone was not counted), dyspnea (including orthopnea), and spontaneous abdominal pain. Whether patients had fever, chills, vomiting, dyspnea, or abdominal pain at the initial visit was evaluated by the doctors who worked in the ER of each hospital. Comorbid conditions were hemodialysis, stroke (including cerebral hemorrhage, cerebral infarction, subarachnoid hemorrhage, and a history of those conditions), diabetes, cirrhosis (including any stage of cirrhosis), malignancy (including any stage of malignancy and past history of malignancy), urinary catheter, and the presence of an internal device (i.e., an implanted central-venous access device, an implanted pacemaker, an implanted cardioverter defibrillator, or an artificial valve). Comorbid conditions were assessed from information in the medical records and from medical interviews done by the doctors at the initial visit. Vital signs and physical findings were altered mental status, which was defined as a score ≤ 14 on the Glasgow coma scale or a score ≥ 1 on the Japan coma scale, body temperature ≥ 38.0 degrees C, systolic blood pressure < 90 mmHg, pulse rate ≥ 100/min, respiratory rate ≥ 20/min, heart murmur, and focal abdominal sign including tenderness, rebound tenderness, and muscular defense (guarding).
Laboratory data were white blood cell count ≥ 15,000/μL, hemoglobin < 10 g/dL, platelets < 150,000/μL, blood urea nitrogen (BUN) ≥ 20 mg/dL, creatinine (Cre) ≥ 1.5 mg/dL, lactate dehydrogenase (LDH) ≥ 400 IU/L, and C-reactive protein (CRP) ≥ 10 mg/dL. From each participant's medical record we collected data regarding those candidate predictors.

Definition of true bacteremia
True bacteremia was defined as growth of known pathogenic bacteria in at least 1 blood culture or as growth of common skin pathogens (i.e., coagulase-negative Staphylococcus species, diphtheroids, Bacillus species, Propionibacterium species, or micrococci) in at least 2 blood cultures. [19] True bacteremia was distinguished from contamination by the judgment of 2 physicians working independently, each of whom had more than 10 years of clinical experience. They referred to the results of at least 2 blood cultures and to the patient's clinical course. [27] We calculated kappa to quantify the agreement between those 2 physicians. When the 2 physicians' first judgements did not agree, they discussed the case until they reached agreement.

Data collection
For data collection we used a standardized clinical research form. For the derivation data, records were in electronic form at three institutions. The data were collected by four physicians. For the validation data, the one institution from which data were collected had records on paper only, and the data in those records were collected by 18 physicians. Records with missing values and outliers were checked by the first author (TT).

Development of clinical prediction rule
First we used the chi-square test to identify variables associated with true bacteremia. Those variables for which the result of the chi-square test was statistically significant (p < 0.05) were then entered into a multiple logistic regression model. Only predictors with a p value of less than 0.05 were kept in the final model. We then used regression coefficients to make the score-based prediction rule, as previously discussed by Moons et al. [28] For each variable, we divided its beta coefficient by the beta coefficient for temperature, converted the result to an integer, and then used that for the score. Then we divided the patients into five groups by risk-score, and we compared the observed percentages of bacteremia among those groups. The overall discriminative power of the model was quantified as the area under the receiver-operating characteristic (ROC) curve. [29] Calibration, that is, the agreement between the predicted outcomes and the observed outcomes, was evaluated by using the Hosmer-Lemeshow chi-square statistic [30] and the slope and intercept of the calibration plot. [31] In the calibration plot, the predictions are on the x-axis and the observations are on the y-axis. An ideal calibration line has a slope of 1 and an intercept of 0. [31]

Assessment of test performance
For all possible cut-off scores we computed the sensitivity, the specificity, and the likelihood ratios for positive and negative test results. [29]

Validation testing
For internal validation testing we used the bootstrap method (1000 iterations) with the derivation data. [32] For each bootstrap sample, patients were drawn randomly (with replacement) from the derivation data set, and each bootstrap sample was the same size as the derivation sample. For each iteration the model was refitted, and discrimination was again quantified as the area under the ROC curve (AUC). For external validation testing we used the validation data.
We computed the total risk score of each patient in the validation data set. Then, for each of the categories described above we computed the percentages of patients who had bacteremia. We also computed the AUC, the sensitivity and specificity, and the likelihood ratios, also as described above. Further, we compared AUCs between the model developed as described above and a model incorporating only temperature.
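The score-construction step in the development of the prediction rule (dividing each beta coefficient by the coefficient for temperature and converting the result to an integer) might be sketched as follows. The coefficients below are hypothetical placeholders, not values from this study, and rounding to the nearest integer is an assumption, because the text says only that the ratio was "converted to an integer".

```python
def integer_scores(betas, reference="temperature"):
    """Convert logistic-regression coefficients to integer points.

    Each coefficient is divided by the reference predictor's coefficient.
    Rounding to the nearest integer is an ASSUMPTION about how the
    ratio was converted to an integer.
    """
    ref = betas[reference]
    return {name: round(beta / ref) for name, beta in betas.items()}


def total_score(points, findings):
    """Sum the points of the predictors that are present in one patient."""
    return sum(points[name] for name, present in findings.items() if present)


# HYPOTHETICAL coefficients, for illustration only (not from the study).
example_betas = {"temperature": 0.50, "chills": 1.10, "vomiting": 0.45}
points = integer_scores(example_betas)

# A hypothetical patient with fever and chills but no vomiting.
patient = {"temperature": True, "chills": True, "vomiting": False}
print(points, total_score(points, patient))
```

By construction the reference predictor always receives 1 point, so every other score expresses a predictor's weight relative to temperature.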

Software, and research ethics
The data were analyzed with Stata version 11.2 (Stata Corp., College Station, Texas). The Research Ethics Committee of Kyoto University approved this study (assessment number E1382). The data were anonymized before analyses.

Results
Table 1 shows demographic and clinical characteristics of the patients from whom the derivation and validation data were obtained. Most differences between those two groups of patients were small, but the two groups differed significantly with regard to 12 characteristics: age ≥ 65, arrival by ambulance, fever, chills, vomiting, abdominal pain, diabetes, devices, systolic blood pressure < 90 mm Hg, respiratory rate ≥ 20/min, focal abdominal sign, and LDH ≥ 400 IU/L.

Bacteremia
The derivation data included 241 cases of bacteremia: 70 in Nagoya, 42 in Okinawa, and 129 in Shizuoka. The validation data included 87 cases of bacteremia. In the derivation data, 419 pathogens were isolated by blood culture, of which 274 were judged to be true pathogens, with the others judged to be contamination. In the validation data, 121 pathogens were isolated, of which 91 were judged to be true pathogens. The kappa statistic for the two reviewers' judgments was 0.94 (95% CI, 0.92-0.96).

Clinical prediction rule
With the exceptions of data on respiratory rate and lactate dehydrogenase (LDH), very few data were missing. As shown in Table 2, 19 of the 28 candidate predictors had statistically significant associations with bacteremia: age ≥ 65 years, 9 signs and symptoms, 3 comorbid conditions, and 6 blood-test findings. The results of multivariate analysis (with n = 1288) indicated that 11 predictors were associated with bacteremia: age ≥ 65 years old, chills, vomiting, altered mental status, temperature ≥ 38°C, systolic blood pressure < 90 mm Hg, focal abdominal sign, white blood cell count ≥ 15,000/μL, platelets < 150,000/μL, BUN ≥ 20 mg/dL, and CRP ≥ 10 mg/dL. Those predictors and their integer scores are shown in Table 3. The sum of those scores can range from 0 to 12. Because that sum is to be used to identify bacteremia in ERs, we call it the ID-BactER score. For the patients in the derivation set, the mean ID-BactER score was 3.3 (95% CI, 3.2-3.4, median: 3, interquartile range: 2-4, range of observed scores: 0-9).
Each patient in the derivation set was assigned to one of five categories, by ID-BactER score. The lower ID-BactER score categories had lower percentages of patients with bacteremia. Categories defined by higher ID-BactER scores had higher percentages of patients with bacteremia (Fig 2). Specifically, bacteremia was detected in 1.8% (4/222) of patients with a score of 0 or 1; in 6.7% (36/537) with a score of 2 or 3; in 24.2% (105/434) with a score of 4 or 5; in 51.3% (59/115) with a score of 6 or 7; and in 71.4% (10/14) with a score greater than 7. The AUC was 0.80 (95% CI, 0.77-0.83, Fig 3). The Hosmer-Lemeshow chi-square statistic was 5.84 (p = 0.66). The calibration slope was 1.02 and the intercept was -0.01. Sensitivities, specificities, and likelihood ratios for possible cut-off scores are shown in the upper part of Table 4. With 2 as the cutoff score, the sensitivity was 98%, the specificity was 20%, the positive likelihood ratio was 1.22, the negative likelihood ratio was 0.10, and the percentage of false negatives was 1.8% (4/222).
External validation: For the external validation data, the mean ID-BactER score was 3.6 (95% CI, 3.5-3.8, median: 3, interquartile range: 2-5, range of observed scores: 0-10). The relationship between the category of ID-BactER score and the percentage of patients with bacteremia was the same with the validation data as with the derivation data (Fig 2). Specifically, bacteremia was detected in 4.4% (2/45) with a score of 0 or 1; in 10.1% (15/149) with a score of 2 or 3; in 23.9% (32/134) with a score of 4 or 5; in 46.0% (23/50) with a score of 6 or 7; and in 55.6% (5/9) with a score greater than 7. The AUC was 0.74 (95% CI, 0.68-0.80, Fig 3). In contrast, the AUC derived from the model incorporating only temperature was 0.60 (95% CI, 0.54-0.65). The Hosmer-Lemeshow chi-square statistic was 4.17 (p = 0.90). The calibration slope was 1.15 and the intercept was -0.03. Sensitivities, specificities, and likelihood ratios for possible cut-off scores are shown in the lower part of Table 4. With 2 as the cutoff score, the sensitivity was 97%, the specificity was 14%, the positive likelihood ratio was 1.13, the negative likelihood ratio was 0.19, and the percentage of false negatives was 4.4% (2/45).
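The test-performance figures at a cutoff of 2 can be reproduced from the category counts reported above. The sketch below uses only those published counts; the function name and the tuple layout are illustrative choices, and the calculation is valid only for cutoffs that coincide with a category boundary.

```python
def metrics_at_cutoff(categories, cutoff):
    """Compute test performance from score categories.

    categories: list of (lowest_score_in_category, bacteremia_cases, patients).
    A category counts as test-positive when its lowest score is >= cutoff.
    """
    tp = sum(c for low, c, n in categories if low >= cutoff)
    fn = sum(c for low, c, n in categories if low < cutoff)
    fp = sum(n - c for low, c, n in categories if low >= cutoff)
    tn = sum(n - c for low, c, n in categories if low < cutoff)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
    }


# Category counts as reported in the text (scores 0-1, 2-3, 4-5, 6-7, >7).
derivation = [(0, 4, 222), (2, 36, 537), (4, 105, 434), (6, 59, 115), (8, 10, 14)]
validation = [(0, 2, 45), (2, 15, 149), (4, 32, 134), (6, 23, 50), (8, 5, 9)]

for label, data in (("derivation", derivation), ("validation", validation)):
    m = metrics_at_cutoff(data, cutoff=2)
    print(label, {k: round(v, 3) for k, v in m.items()})
```

Within rounding, the results agree with the sensitivities, specificities, and likelihood ratios stated in the text for both data sets.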
Complete data from 1709 patients seen from April 2011 through March 2012 were available in derivation and validation data. When using the same cutoff score of 2 in the complete data, the percentage of true positives was 16.7% (285/1709), for false positives it was 67.7% (1157/1709), for false negatives it was 0.4% (6/1709), and for true negatives it was 15.3% (261/1709).

Discussion
To the best of our knowledge, this study is the first to develop and test a procedure to identify ER patients who have community acquired bacteremia using data available at community hospitals. It uses information on 11 variables, all of which can be obtained easily within a few hours of the time that a patient arrives in the ER of a community hospital. The necessary information comes from the patient's history, vital signs, physical examination, and results of common blood tests. Unlike other clinical prediction rules for bacteremia, information about bands or procalcitonin is not required. The new procedure results in a score that can range from 0 to 12. With all scores other than 0 and 1 taken as indicators of bacteremia (i.e., test-positive), the sensitivity to true bacteremia in the validation data was 97%.

Independent predictors compared with existing literature
Most of the predictors used in the new method have previously been found to be associated with community acquired bacteremia. Being older than 65, chills, vomiting, elevated body temperature, hypotension, leukocytosis, thrombocytopenia, and blood urea nitrogen (BUN) were mentioned in a study of ER patients. [17,18,22] The strongest predictor in the present analysis, chills, was also the strongest predictor in a review of 35 studies. [12] C-reactive protein (CRP) has been associated with bacteremia in observational studies. [16,18,25] However, we identified altered mental status and focal abdominal sign as new predictors to identify community acquired bacteremia. In contrast, procalcitonin has been used as a predictor of bacteremia [18], but it was not measured at any of the 4 ERs in this study. Measurement of bands [17] was possible at only 1 of the 4 hospitals in this study. In future studies, this biomarker could increase the diagnostic performance of the score.
We compared 3 prediction models previously developed in ER settings (Table 5). [17,18,22] Each of the three used data from a single institution. The predictors used by Shapiro [17] were similar to those we used in this study, although Shapiro also used bands. Their resulting AUCs were similar to those in the present study: 0.80 for the derivation data and 0.75 for the validation data. The AUC in the study by Su [18] was slightly higher (0.85), but the internal validation (bootstrap method) resulted in a low AUC (0.66). The number of patients whose data were used in that study was relatively small (n = 558) and there was no external validation. Studying patients with fever, Lee [22] used predictors some of which were the same as those we used in the present study, and the resulting AUC was high: 0.91. However, the number of patients whose data were used in that study was quite small (n = 396) and there was no external validation.

Discrimination and calibration of the prediction model
With the ID-BactER score, the AUCs were 0.80 from the derivation data and 0.74 from the validation data. This has been described as "moderate accuracy". [29] The high p values of the Hosmer-Lemeshow chi-square statistics (0.66 from the derivation data and 0.90 from the validation data) indicate no statistically significant differences between predicted and observed bacteremia. The calibration slope of approximately 1 and intercept of approximately 0 indicate almost-perfect calibration. One potential problem in the development of any prediction model is overfitting. For this model, the lack of overfitting is evidenced by the calibration slope of 1.15 from the validation data. [31]

Clinical relevance and implications
For a bacteremia prediction rule to be clinically useful it must yield very few false negatives, that is, it must be very sensitive. In one study, the median sensitivity that 149 physicians required for a bacteremia prediction rule was 95%. [33] When scores ≥ 2 were taken as indicators of bacteremia (i.e., test-positive), the sensitivity of this model was 98% with the derivation data and 97% with the validation data, and the negative likelihood ratios were 0.10 and 0.19 respectively. With that cut-off, false negatives were extremely rare: 1.8% (derivation data) and 4.4% (validation data). This prediction rule meets clinicians' need for a very sensitive indicator of bacteremia. Therefore we would recommend that clinicians obtain blood-culture results if the ID-BactER score is ≥ 2. If the score is 0 or 1, then there may be no need to draw blood for cultures. Still, with a pre-test probability of 0.1, the post-test probabilities after a score of 0 or 1 were 0.01 (derivation data) and 0.02 (validation data), which are low but not zero. [34] Therefore, even with scores of 0 or 1, exceptions might be reasonable (i.e., drawing blood for cultures might be indicated) in some cases.
For example, we would not generalize the findings of this study to patients who are suspected to have infectious endocarditis, because very few patients in this study had that diagnosis (0.9% in the derivation data and 0.2% in the validation data).
With a cutoff score of 2 the specificity was 20% for the derivation data and 14% for the validation data. Given a pre-test probability of 0.1, the post-test probability after a positive result would be 0.12 for the derivation data and 0.11 for the validation data. [34] The low positive likelihood ratios do not allow us to "rule in" bacteremia when using a cutoff score of 2. In addition, that cutoff was associated with 1,157 false-positive results. As noted by Hranjec [35], it is important to minimize the number of unnecessary treatments, and so even if a cutoff score of 2 is used great care must be taken in the prescription of antibiotic drugs. Nonetheless, when a physician strongly suspects that a patient has severe sepsis, then blood for cultures should be drawn as soon as possible and antibiotic treatment should be started early, even before calculating the ID-BactER score.
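The post-test probabilities quoted in this section follow from the standard conversion between probabilities and odds. The sketch below applies that conversion to the pre-test probability of 0.1 and the likelihood ratios reported in the text; the function name is an illustrative choice.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds.

    odds = p / (1 - p); post-test odds = pre-test odds * LR;
    probability = odds / (1 + odds).
    """
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PRE_TEST = 0.1  # pre-test probability used in the text

# Likelihood ratios at a cutoff score of 2, as reported in the text.
reported = {
    "derivation, score 0-1 (LR- 0.10)": 0.10,
    "validation, score 0-1 (LR- 0.19)": 0.19,
    "derivation, score >= 2 (LR+ 1.22)": 1.22,
    "validation, score >= 2 (LR+ 1.13)": 1.13,
}
for label, lr in reported.items():
    print(label, round(post_test_probability(PRE_TEST, lr), 2))
```

A negative result lowers the probability of bacteremia from 0.1 to about 0.01 or 0.02, whereas a positive result raises it only to about 0.11 or 0.12, which is why the score is better suited to ruling out bacteremia than to confirming it.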

Strengths
This study has five strengths. First, this rule was developed and tested with data from randomly selected samples of all patients who underwent blood cultures in the ERs studied, so it is applicable very broadly. It can be used whether the patient originally had only a urinary tract infection, or pneumonia, or cellulitis, etc. Second, all of the data required for the ID-BactER score are available within a few hours of presentation at the ER of a community hospital, even if the patient arrives at night. We note that all of the measurements needed were available at the institution from which the validation data were obtained. Third, for validation we used external data, which is done only rarely despite being very important in this context. [36] Fourth, we used a robust criterion to diagnose bacteremia (i.e., results of 2 or more blood cultures). With that criterion most cases of bacteremia are detected and true bacteremia can be distinguished from contamination. [37] Finally, the outcome was judged by two physicians with more than 10 years of experience in general practice, one of whom was certified as an infectious-disease specialist.

Limitations
Our study has at least the following eight limitations. The first is the possibility of spectrum bias. [38] The number of patients who in fact had bacteremia could be higher than the number reported here, because in some patients with bacteremia blood cultures might not have been taken. These patients, if they had been included in the study, might have had low ID-BactER scores, and their non-inclusion (because of a lack of blood-culture data) could have caused overestimation of the score's sensitivity. One challenge for future research is to include all emergency-room patients. The difference in contamination rates between the derivation data and the validation data (35% vs. 25%) could reflect different approaches of clinicians with regard to drawing blood cultures and may limit interpretation of the score's performance. The second limitation is missing data. [39] The data that were most commonly missing were respiratory rate (24.6%) and LDH (23.7%). For all other candidate predictors fewer than 5% of the data were missing. However, neither respiratory rate nor LDH was an independent predictor of bacteremia in previous studies, and neither was associated with true bacteremia in univariate analysis in this study. So, we believe that missing data did not influence the main result of this study. Third, the study was retrospective, which made it difficult to ensure the quality of the data. As examples, it is possible that not all of the physicians asked about chills, and it is possible that not all of them evaluated each patient's level of consciousness. Fourth, with the cutoff set at a score of 2, the resulting sensitivity is very high but the specificity is low, which makes the score inappropriate for confirming bacteremia. Future work should focus on building a model that maintains the desired sensitivity and yet is also more specific. 
Fifth, some physicians may find the score impractical to use when an emergency department is very busy, as the 11 variables could be difficult to remember. Devising a simpler, more easily computed score is another important goal of future work. Sixth, with regard to the sample size we note that, very unfortunately, "there are no generally accepted approaches to estimate the sample size requirements for derivation and validation studies of risk prediction models." [40] Thus, as described above in the Methods section, our estimate of "about 1500" was based on general considerations for all multivariable models (at 10 cases per predictor variable) and it was also based on the estimate that only about 10% of the records to be reviewed would be from patients who actually had bacteremia (which was a conservative assumption). General rules or widely-applicable guidelines for estimating sample size for development and testing of prediction rules would be welcome. Seventh, the model building did not use a standardized method such as stepwise selection. We considered that clinical variable selection could avoid the potential overfitting, but it is controversial whether clinical variable selection is superior to stepwise selection. Finally, we used data from community hospitals only in Japan, which could reduce the external validity. [41] Further testing with data from other countries is certainly needed.

Conclusion
Using the ID-BactER score, physicians in the ERs of community hospitals can identify the patients who are most likely to have community acquired bacteremia. The ID-BactER score can be computed from information that is readily available in the ERs of community hospitals. A score that maintains the ID-BactER's current sensitivity but also has a higher specificity should be developed in future studies. In addition, further validation testing should focus on all emergency-room patients and should use data from different countries.
introducing us to collaborators for this project. We thank Dr. Joseph Green for his myriad comments and for suggesting that we give the name "ID-BactER" to the score used to IDentify patients with Bacteremia in the Emergency Room.