External validation of the NOBLADS score, a risk scoring system for severe acute lower gastrointestinal bleeding

Background We aimed to evaluate the generalizability of NOBLADS, a severe lower gastrointestinal bleeding (LGIB) prediction model which we had previously derived when working at a different institution, using an external validation cohort. NOBLADS comprises the following factors: non-steroidal anti-inflammatory drug use, no diarrhea, no abdominal tenderness, blood pressure ≤ 100 mmHg, antiplatelet drug use, albumin < 3.0 g/dL, disease score ≥ 2, and syncope. Methods We retrospectively analyzed 511 patients emergently hospitalized for acute LGIB at the University of Tokyo Hospital, from January 2009 to August 2016. The areas under the receiver operating characteristic curves (ROC-AUCs) for severe bleeding (continuous and/or recurrent bleeding) were compared between the original derivation cohort and the external validation cohort. Results Severe LGIB occurred in 44% of patients. Several clinical factors were significantly different between the external and derivation cohorts (p < 0.05), including background, laboratory data, NOBLADS scores, and diagnosis. The NOBLADS score predicted the severity of LGIB with an AUC of 0.74 in the external validation cohort and 0.77 in the derivation cohort. In the external validation cohort, the score predicted the risk for blood transfusion need (AUC, 0.71), but was not adequate for predicting intervention need (AUC, 0.54). The in-hospital mortality rate was higher in patients with a score ≥ 5 than in those with a score < 5 (AUC, 0.83). Conclusions Although the external validation cohort clinically differed from the derivation cohort in many ways, we confirmed the moderately high generalizability of NOBLADS, a clinical risk score for severe LGIB. Appropriate triage using this score may support early decision-making in various hospitals.


Introduction
Acute lower gastrointestinal bleeding (LGIB) is a common indication for hospital admission in the United States [1] and accounts for approximately 35.7 per 100,000 adult hospitalizations annually. Although the rate of upper gastrointestinal bleeding (UGIB) has decreased rapidly over the past 10 years [2], the incidence of LGIB has increased slightly [2], and a similar trend has been reported in Asia. [3,4] Patients with acute LGIB often experience persistent or recurrent bleeding and require blood transfusions, long hospitalization stays, and interventions such as colonoscopic, radiological, and surgical treatment. [5][6][7] In addition, a certain proportion of patients with acute LGIB die during their hospital stay (< 4%). [7,8] Therefore, for appropriate triage to emergency hospitalization or early intervention, and for ultimately better outcomes, a risk stratification tool to predict severe LGIB is required. However, unlike UGIB [9], predictive clinical scores with high generalizability have not been established for severe acute LGIB.
Although some studies have investigated predictors for severe acute LGIB [10][11][12][13], few predictive scores have been validated in different settings. Strate et al. validated their score, but only 39% of patients (n = 107) in the validation study were recruited from a hospital other than where the score was developed. [11] Using an external cohort (n = 172), Newman et al. assessed the utility of the BLEED criteria, which suggested predictors for poor outcomes of GI bleeding, but the criteria were not very useful because the area under the receiver operating characteristic curve (ROC-AUC) was relatively low (0.60). [13] We have recently developed and prospectively validated NOBLADS, a clinical risk scoring system for severe LGIB. [14] However, our validation study was conducted at the same hospital where we developed the score and the number of patients was relatively small (n = 161). Thus, it remains to be determined whether this model can be generalized to other hospitals and to other patients. Because the background, diagnosis, and management of acute LGIB may vary according to different institutions, the ability of this model to provide accurate predictions at other institutions needs to be confirmed.
To evaluate the generalizability of the NOBLADS score for severe LGIB, we investigated an external validation cohort, which was composed of a large number of patients who were hospitalized for acute overt LGIB on an emergency basis.

Study design, setting, and participants
This study complied with the Declaration of Helsinki. The design was approved by the ethics committee of The University of Tokyo (approval number 11528) and by the institutional review board at the National Center for Global Health and Medicine (approval number 2163). This was a retrospective observational study conducted using the opt-out method via our hospital website. We retrospectively identified patients who were admitted to the University of Tokyo Hospital on an emergency basis due to the onset of acute, continuous, or frequent overt LGIB between January 2009 and August 2016. This hospital is one of the referral university hospitals in the Tokyo metropolitan area, and it is a different institution from the hospital where we developed NOBLADS, a clinical risk score for severe LGIB. [14] Data were collected from admission databases and a recorded endoscopic database. The endoscopic database is a searchable collection of records into which endoscopists prospectively input data after completing endoscopies. We searched the endoscopic database and selected patients with overt GI bleeding who were assessed by colonoscopy (Fig 1). We subsequently reviewed the endoscopic and clinical findings of these patients using the electronic medical record system and excluded patients with (i) UGIB, (ii) inpatient-onset LGIB, and (iii) elective admission with chronic LGIB. Ultimately, 511 patients with outpatient-onset acute LGIB were analyzed for the external validation of the NOBLADS score.

Outcome criteria
Outcomes were defined as in the derivation and internal validation studies. [14] The main outcome was severe LGIB comprising: (i) continuous bleeding during the first 24 h (transfusion of ≥ 2 units of packed red blood cells and/or a decrease in hematocrit of ≥ 20%) and/or (ii) recurrent bleeding after initial colonoscopy (rectal bleeding accompanied by a further decrease in hematocrit of ≥ 20% and/or additional blood transfusions) as previously described. [10] Secondary outcomes included blood transfusion requirement, length of stay (LOS), intervention (endoscopy, interventional radiology, or surgery), and in-hospital mortality. Blood transfusion was indicated when hemoglobin levels fell below 7.0 g/dL (or 8.0 g/dL when vital signs were unstable). After spontaneous cessation of bleeding with conservative treatment or hemostasis, all patients were started on a liquid diet and gradually progressed to a solid diet over a period of three days before being discharged. Endoscopic intervention was the first-line treatment when stigmata of recent hemorrhage (SRH) were detected on colonoscopy.
Interventional radiology was performed in patients with extreme bleeds that did not resolve with endoscopic treatment. Patients with persistent bleeds after endoscopic treatment and/or interventional radiology were surgically treated. Data concerning death during hospitalization were collected from the medical records and death certificates of the study hospitals.
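The main outcome definition above can be expressed as a small classification rule. The following Python sketch is illustrative only: the record type `BleedingCourse` and all of its field names are hypothetical, and it assumes that the "decrease in hematocrit of ≥ 20%" is a relative decrease from the preceding value, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class BleedingCourse:
    """Hypothetical record of one admission; field names are illustrative."""
    prbc_units_first_24h: int        # packed red blood cell units in the first 24 h
    hct_admission: float             # hematocrit (%) at admission
    hct_lowest_first_24h: float      # lowest hematocrit (%) during the first 24 h
    rebleed_after_colonoscopy: bool  # rectal bleeding after the initial colonoscopy
    hct_before_rebleed: float        # hematocrit (%) before the rebleed
    hct_after_rebleed: float         # hematocrit (%) after the rebleed
    transfused_after_rebleed: bool   # additional transfusion after the rebleed


def relative_drop(before: float, after: float) -> float:
    """Decrease as a fraction of the starting value (assumption: relative, not absolute)."""
    if before <= 0:
        raise ValueError("starting hematocrit must be positive")
    return (before - after) / before


def is_severe_lgib(c: BleedingCourse) -> bool:
    """Severe LGIB = continuous bleeding and/or recurrent bleeding."""
    # (i) continuous bleeding during the first 24 h
    continuous = (
        c.prbc_units_first_24h >= 2
        or relative_drop(c.hct_admission, c.hct_lowest_first_24h) >= 0.20
    )
    # (ii) recurrent bleeding after the initial colonoscopy
    recurrent = c.rebleed_after_colonoscopy and (
        relative_drop(c.hct_before_rebleed, c.hct_after_rebleed) >= 0.20
        or c.transfused_after_rebleed
    )
    return continuous or recurrent
```

Under these assumptions, a patient who received two units of packed red blood cells in the first 24 h would be classified as severe even without rebleeding.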

Risk scoring system (NOBLADS score)
We previously used multivariate logistic regression to detect risk factors for severe bleeding in a retrospectively collected cohort of 439 patients emergently hospitalized for acute LGIB at the National Center for Global Health and Medicine in Japan, from January 2009 to December 2013 (i.e., derivation cohort). From these data, we developed NOBLADS, a clinical risk scoring system for severe LGIB. [14] This score comprises the following factors: NSAID use, no diarrhea, no abdominal tenderness, blood pressure ≤ 100 mmHg, antiplatelet drug use (non-aspirin), albumin < 3.0 g/dL, disease score ≥ 2 (according to the Charlson comorbidity index), and syncope (Table 1). Each predictor was given a weight of 1 point. In this report, we assessed the external validity of the six-level NOBLADS score using 0, 1, 2, 3, 4, or ≥ 5 predictors.
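As an illustration of how the eight equally weighted predictors combine, the following Python sketch computes the score and its six-level category. The record type `Presentation` and its field names are hypothetical; the thresholds are those listed above, and the blood pressure field is assumed to hold the value the score refers to (the text does not specify which measurement).

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Hypothetical admission record; field names are illustrative."""
    nsaid_use: bool
    diarrhea: bool
    abdominal_tenderness: bool
    blood_pressure_mmhg: float
    non_aspirin_antiplatelet_use: bool
    albumin_g_dl: float
    charlson_index: int
    syncope: bool


def noblads_score(p: Presentation) -> int:
    """Sum of the eight predictors, 1 point each (range 0 to 8)."""
    predictors = [
        p.nsaid_use,                     # N: NSAID use
        not p.diarrhea,                  # O: no diarrhea
        p.blood_pressure_mmhg <= 100,    # B: blood pressure <= 100 mmHg
        p.albumin_g_dl < 3.0,            # L: albumin < 3.0 g/dL
        p.non_aspirin_antiplatelet_use,  # A: antiplatelet drug use (non-aspirin)
        p.charlson_index >= 2,           # D: disease score >= 2
        p.syncope,                       # S: syncope
        not p.abdominal_tenderness,      # no abdominal tenderness
    ]
    return sum(predictors)


def noblads_level(score: int) -> str:
    """Six-level category used for validation: 0, 1, 2, 3, 4, or >=5."""
    return ">=5" if score >= 5 else str(score)
```

For example, a patient with diarrhea and abdominal tenderness, normal blood pressure and albumin, and no other predictor scores 0, whereas a patient with all eight predictors scores 8 and falls in the ">=5" level.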

Diagnosis of LGIB and data collection
All patients in this study were assessed by colonoscopy using high-resolution electronic video endoscopes (type PCF-240I, PCF-Q260AI, or PCF-Q260JI; Olympus Optical, Tokyo, Japan) after bowel preparation with a polyethylene glycol solution. If bowel preparation was inadequate, an Olympus Flushing Pump water-jet (Olympus Optical) was applied to improve visualization. Colonoscopy was repeated for a more detailed assessment if colon preparations were insufficient at the first colonoscopy or if rebleeding occurred. The diagnostic criteria for diverticular bleeding were classified as definitive and presumptive. [15] A definitive diagnosis was based on colonoscopic visualization of a colonic diverticulum with SRH such as active bleeding, adherent clot, or a visible vessel. A presumptive diagnosis was based on fresh blood localized at a colonic diverticulum in the presence of a potential bleeding source on complete colonoscopy, or bright red blood in the rectum confirmed by objective color testing and colonoscopy showing a single potential bleeding source in the colon complemented by negative upper or negative capsule endoscopic findings. [15] Overt LGIB of unknown origin or hemorrhoidal bleeding was defined as a clinically significant decrease in hematocrit of ≥ 10% and/or a decrease in hemoglobin levels of ≥ 2 g/dL from baseline. [16] All required variables (symptoms, vital signs, comorbidities, medications, and laboratory findings) were collected in the emergency department within two hours of a patient presenting at our hospital. We evaluated 19 comorbidities using the Charlson comorbidity index. [17]

Statistics
Characteristics of the derivation cohort and the external validation cohort were compared using a univariate analysis with Pearson's Chi-squared test, Fisher's exact test, or the Wilcoxon rank sum test as appropriate.
We used previously published data [14] in the derivation cohort (n = 439) and in the internal validation cohort (n = 161) to compare the validity and prediction ability of the NOBLADS score. We assessed the validity of the NOBLADS score using ROC-AUC of the external validation cohort compared with the derivation cohort and the internal validation cohort. Model calibration in the external validation cohort was evaluated using the Hosmer-Lemeshow goodness-of-fit test. The ability of the score to predict severe bleeding and secondary outcomes including blood transfusion requirement, LOS, intervention requirements and in-hospital mortality was determined in the external validation cohort. These relationships were assessed using a nonparametric trend test (nptrend in Stata) or Fisher's exact test.
A value of P < 0.05 was considered to indicate statistical significance. The STATA version 13 software was used to perform all analyses (StataCorp, College Station, TX, USA).
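For readers unfamiliar with the ROC-AUC, it can be read as the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, with ties counted as one half. The Python sketch below illustrates this definition on invented toy data; it is not the Stata code used in this study, and the function name `roc_auc` is ours.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], outcomes: Sequence[bool]) -> float:
    """ROC-AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties in score contribute one half, which matters for an ordinal score
    with few distinct values.
    """
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    events = [s for s, y in zip(scores, outcomes) if y]
    non_events = [s for s, y in zip(scores, outcomes) if not y]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy example (invented data, not from the study cohorts).
toy_scores = [0, 1, 2, 3, 4, 5]
toy_severe = [False, False, True, False, True, True]
toy_auc = roc_auc(toy_scores, toy_severe)  # 8 of 9 pairs are concordant
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.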

Patient characteristics
We analyzed data from 511 patients (male, 66.1%; mean age, 68.7 years; range, 16-99 years) with LGIB (Table 2). Compared with the derivation cohort, the external validation cohort had a significantly higher proportion of males, more comorbid diseases, lower initial hematocrit levels, a greater number of blood transfusions, higher NOBLADS scores, and more frequent diagnoses of diverticular bleeding. The two cohorts had similar initial vital signs. Regarding the NOBLADS factors, rates of no diarrhea, no abdominal tenderness, non-aspirin antiplatelet drug use, and Charlson comorbidity index ≥ 2 were significantly higher in the external validation cohort than in the derivation cohort, whereas rates of NSAID use, blood pressure ≤ 100 mmHg, albumin < 3.0 g/dL, and syncope were similar in the two cohorts.

Discussion
We validated the usefulness of the NOBLADS score, which consists of eight predictors, in a relatively large group of acute LGIB patients. Although many clinical factors including background, laboratory data, NOBLADS factors, and endoscopic diagnosis differed between the external and derivation cohorts, the predictive accuracy of the score was moderately high in the external validation cohort (AUC, 0.74) as well as in the derivation cohort (AUC, 0.77). [14] The score predicted the risk for blood transfusion requirement and in-hospital mortality, but was not able to predict longer hospital stay and intervention, in this external validation cohort.
As shown in Table 2, the external validation cohort appeared to have more patients with a high vascular event risk and with more severe bleeding, compared with the derivation cohort. Moreover, the distribution of diseases differed somewhat between these two cohorts. As shown in Figs 2 and 3, the NOBLADS score effectively predicted severe bleeding in a cohort that was more unwell, and had more comorbidities, than the earlier cohort. This supports the generalizability of the score.
Although the p value for trend was significant in the analysis of LOS, caution should be exercised in interpreting this result because the mean LOS of patients with low-risk scores (0-1) was longer than that of patients with higher scores (2-4) (Fig 4B). One reason might be that the low-risk group (scores 0-1) included 67% of the patients with inflammatory bowel disease, which tended to require longer hospital stays for remission induction treatments such as corticosteroids. In addition, in this cohort, the NOBLADS score could not significantly predict intervention (p = 0.060, for trend; AUC, 0.54). Thus, further data are needed to determine if this score can be widely used to predict these important outcomes, although the score predicted these outcomes in the original study [14]. Particularly for intervention, an appropriate predictive model is needed, because all seven previous models failed to predict intervention in a recent LGIB study. [18]
The mortality rate of patients with a score ≥ 5 was 6.5% (3/46), whereas that of patients with a score < 5 was 0.2% (1/465). Although the in-hospital mortality of LGIB is generally low [7,8], our results suggest that patients with NOBLADS scores ≥ 5 should be monitored closely, with appropriate allocation of resources, given their high risk of in-hospital mortality. As NOBLADS includes the Charlson comorbidity index [17,19], hypoalbuminemia [19,20], and NSAID use [7,8], which were earlier established as mortality predictors, the NOBLADS score may predict in-hospital mortality. Recently, Sengupta et al. [19] derived and validated a prognostic score predicting LGIB mortality. In our external validation cohort, the AUC of the score by Sengupta et al. was 0.95. Because there were only a few events in this study, further data are needed to validate the utility of using NOBLADS to predict mortality.
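The mortality comparison above (3/46 versus 1/465) can be checked with a small worked calculation. The Python sketch below recomputes the two rates and, as an illustration of how a two-group comparison with so few events might be tested, evaluates a two-sided Fisher's exact test using only the standard library. The function name is ours, and the resulting p value is an illustrative calculation, not a result reported in this study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Counts taken from the text: deaths / patients in each score group.
deaths_high, n_high = 3, 46    # score >= 5
deaths_low, n_low = 1, 465     # score < 5

rate_high = round(100 * deaths_high / n_high, 1)  # percent
rate_low = round(100 * deaths_low / n_low, 1)     # percent
p_value = fisher_exact_two_sided(
    deaths_high, n_high - deaths_high, deaths_low, n_low - deaths_low
)
```

With only four deaths in total, any such estimate is imprecise, which is consistent with the caution expressed above about the small number of events.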
Previously, some studies investigated predictors of severe acute LGIB [10-13, 18, 21-23], and a few scores have been validated in other settings. The BLEED criteria [21] failed to be validated in other LGIB patients (AUC, 0.60). [13] The score by Strate et al. [11] may not be generalizable, because more than half of the patients in the validation study were recruited from the same hospital where the score was developed. Although Das et al. validated an artificial neural network (ANN) model [22], it is cumbersome and requires the entry of as many as 26 variables. The scores by Velayos et al. [12], Newman et al. [13], and Chong et al. [23] lack validation. Recently, the score by Oakland et al. [18] afforded a better discriminative performance to identify patients for safe outpatient management than the NOBLADS score and other predictive models. The main difference between the NOBLADS score and the score by Oakland et al. [18] is that the hemoglobin level was the most important predictor in the score by Oakland et al. [18]. Prospective comparisons are needed to determine which scoring system (with or without the hemoglobin level) performs best.
Our study had some limitations. First, this study was retrospectively designed. Second, even though the patient populations were distinct, this external validation study was not fully independent from the derivation study since the same investigators performed both studies. Third, because neither the derivation nor the validation cohort included inpatient-onset patients or patients who were discharged from the emergency room, it is unclear if the NOBLADS score can be applied to these patients. Fourth, restricting inclusion to patients who underwent colonoscopy might have caused selection bias. In Japan, most institutions including our hospital perform endoscopy first for the diagnosis and treatment of almost all GIB patients. We did not perform colonoscopy in a few cases, such as patients who had already undergone colonoscopy for GIB in the past few months, or patients whose activities of daily living were too low for colonoscopy. Therefore, it is also unclear whether this score can be applied to these patients. Our study also had strengths. For example, by only including patients who underwent colonoscopy, we were able to measure the rate of intervention. Furthermore, we evaluated the usefulness of the NOBLADS score in patients whose characteristics varied widely from those of the derivation cohort, and the sample size (n = 511) was larger than that of previous validation studies [11,22] (n ≤ 252).
In conclusion, we externally validated NOBLADS, a clinical risk score for severe LGIB. This score may guide a standardized approach to managing acute LGIB. Further prospective studies in other countries are warranted to examine whether the consistent application of this score to LGIB management can reduce adverse outcomes and resource utilization.