External validation of geriatric influenza death score: A multicenter study

Yuan Kao; Wei-Jing Lee; Kang-Ting Tsai; Chung-Feng Liu; Chien-Chin Hsu; Hung-Jung Lin; Chien-Cheng Huang; How-Ran Guo

doi:10.1371/journal.pone.0283475

Abstract

The Geriatric Influenza Death (GID) score was developed to help decision making in older patients with influenza in the emergency department (ED), but external validation is unavailable. Thus, we conducted a study to fill the data gap. We recruited all older patients (≥65 years) with influenza who visited the ED of three hospitals between 2009 and 2018. Demographic data and clinical characteristics were retrospectively collected. Discrimination, goodness of fit, and performance of the GID score were evaluated. Of the 5,508 patients (121 died) with influenza, the mean age was 76.6±7.4 (standard deviation) years, and 49.3% were males. The GID score was higher in the mortality group (1.7±1.1 vs. 0.8±0.8, p <0.01). With 0 as the reference, the odds ratio for mortality with a score of 1, 2, and ≥3 was 3.08 (95% confidence interval [CI]: 1.66–5.71), 6.69 (95% CI: 3.52–12.71), and 23.68 (95% CI: 11.95–46.93), respectively. The area under the curve was 0.722 (95% CI: 0.677–0.766), and the Hosmer–Lemeshow goodness of fit test was 1.000. The GID score had excellent negative predictive values with different cut-offs. The GID score had good external validity, and further studies are warranted for wider application.

Citation: Kao Y, Lee W-J, Tsai K-T, Liu C-F, Hsu C-C, Lin H-J, et al. (2023) External validation of geriatric influenza death score: A multicenter study. PLoS ONE 18(3): e0283475. https://doi.org/10.1371/journal.pone.0283475

Editor: Muhammad Tarek Abdel Ghafar, Tanta University Faculty of Medicine, EGYPT

Received: June 18, 2022; Accepted: March 9, 2023; Published: March 24, 2023

Copyright: © 2023 Kao et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Because the data contain potentially sensitive information, derived data supporting the findings of this study are available from the Institutional Review Board of Chi Mei Medical Center on request (contact information: https://www.chimei.org.tw/main/cmh_department/59024/indexInternet.htm).

Funding: This study was supported by Grant CMNCKU10816 and Grant CMHCR10954 from Chi Mei Medical Center. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Older adults are at higher risk of developing serious complications from influenza than the younger population because of age-related immune changes [1]. Approximately 70%–85% of influenza-related deaths and 50%–70% of influenza-related hospitalizations occur in the older population [1]. When older adults are infected with influenza, 67% of them become housebound temporarily and 25% become bedbound temporarily [2]. Therefore, influenza contributes a significant burden of mortality and morbidity in the older population [2].

Because mortality and morbidity in older adults with influenza are high, predicting adverse outcomes and intervening early are important. However, clinical manifestations in older adults with influenza are usually atypical and complicated by underlying illness [3]. A Canadian study recruiting patients aged ≥60 years in the emergency department (ED) revealed that only 31% had a temperature of ≥37.8°C and cough, with or without sore throat, which are the criteria for influenza-like illness defined by the United States Centers for Disease Control and Prevention [3]. In 2018, Chung et al. proposed the Geriatric Influenza Death (GID) score as a clinical decision rule (CDR) to help in decision making for this population [4]. The GID score consists of five predictors: severe coma (Glasgow coma scale score of ≤8, 2 points), history of cancer (1 point), history of coronary artery disease (1 point), elevated C-reactive protein (CRP) level (>10 mg/dl, 1 point), and bandemia (>10% band cells, 1 point) [4]. The GID score ranges from 0 to 6 [4]. When an older adult is diagnosed with influenza in the ED, the GID score can be calculated easily to predict 30-day mortality [4]. The original study defined three risk groups: low risk (≤1 point, 1.1% mortality), moderate risk (2 points, 16.7% mortality), and high risk (≥3 points, 40% mortality); the area under the curve (AUC) and the Hosmer–Lemeshow goodness of fit test for the GID score were 0.861 and 0.578, respectively [4]. Although the previous study showed that the GID score performed well in predicting mortality in older patients with influenza in the ED, no validation study is available. Therefore, we conducted this multicenter study to externally validate the GID score.
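The scoring rule described above can be written down as a short function. The following is a minimal sketch in Python, not code from the original study; the class name, field names, and function names are hypothetical and chosen only for illustration, while the thresholds, points, and risk groups are those stated in the text.

```python
from dataclasses import dataclass


@dataclass
class GidInputs:
    """The five GID predictors (field names are hypothetical)."""
    gcs: int             # Glasgow coma scale score
    has_cancer: bool     # history of cancer
    has_cad: bool        # history of coronary artery disease
    crp_mg_dl: float     # C-reactive protein in mg/dl
    band_percent: float  # band cells in percent


def gid_score(p: GidInputs) -> int:
    """Sum the points of the five predictors; the result ranges from 0 to 6."""
    score = 0
    if p.gcs <= 8:           # severe coma: 2 points
        score += 2
    if p.has_cancer:         # history of cancer: 1 point
        score += 1
    if p.has_cad:            # history of coronary artery disease: 1 point
        score += 1
    if p.crp_mg_dl > 10:     # elevated CRP: 1 point
        score += 1
    if p.band_percent > 10:  # bandemia: 1 point
        score += 1
    return score


def risk_group(score: int) -> str:
    """Risk groups of the derivation study: <=1 low, 2 moderate, >=3 high."""
    if score <= 1:
        return "low"
    if score == 2:
        return "moderate"
    return "high"


# Hypothetical patient: GCS 7 and elevated CRP, no other predictor present.
patient = GidInputs(gcs=7, has_cancer=False, has_cad=False,
                    crp_mg_dl=12.5, band_percent=3.0)
print(gid_score(patient), risk_group(gid_score(patient)))
```

Note that the thresholds are strict for CRP and band cells (>10) but inclusive for the Glasgow coma scale (≤8), as stated in the definition.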

Methods

Study design, setting, and participants

We conducted a retrospective multicenter study at three hospitals (Chi Mei Medical Center, Chi Mei Liouying Hospital, and Chi Mei Chiali Hospital) and recruited all older patients (≥65 years) with influenza who visited these EDs between 2009 and 2018. The study population was drawn from Tainan City, which had about 1.8 million people, 17.3% of whom were aged ≥65 years as of December 31, 2021 [5]. Influenza was defined as a diagnosis coded as 487–488 in the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) or J09–J11 in ICD-10, or a prescription of oseltamivir, peramivir, or Relenza at the index ED visit. We identified patients and collected their demographic data, underlying comorbidities, laboratory data, and mortality from electronic medical records.

Definitions of variables

Age was divided into three subgroups: young elderly (65–74 years), moderate elderly (75–84 years), and old elderly (≥85 years) [4]. The recorded vital signs included Glasgow coma scale (GCS), systolic blood pressure (SBP), heart rate, respiratory rate, and body temperature. Underlying comorbidities included in the study were hypertension, diabetes, chronic obstructive pulmonary disease (COPD), coronary artery disease (CAD), cerebrovascular accident (CVA), malignancy, congestive heart failure (CHF), dementia, and bedridden status. Laboratory data included white blood cell count (WBC), bandemia, hemoglobin, platelet count, serum creatinine, and high-sensitivity CRP (hs-CRP). Severe coma was defined as GCS ≤8 [4]. Bandemia was defined as band cells >10% [4].

Measurement of outcome

The outcome was defined as in-hospital mortality, whereas the outcome used to develop the GID score was 30-day mortality. We chose in-hospital mortality because it was the outcome of interest in this study. Patients who were discharged from the ED and had no record of mortality in the electronic medical record were classified as survivors.

Ethical statement

This study was approved by the Institutional Review Board of the Chi Mei Medical Center. Because this is a retrospective study containing de-identified information, the requirement for informed consent was waived, as the waiver would not affect the rights and welfare of the patients.

Statistics

To evaluate differences in demographic characteristics, underlying comorbidities, and laboratory data between the mortality and the survival groups, we used independent t-tests for continuous variables and chi-square tests for categorical variables. Logistic regression analysis was performed to compare the odds ratio (OR) among patients with different GID scores. The AUC was used to evaluate the discrimination of the score, and the Hosmer–Lemeshow test was used to evaluate goodness of fit. The performance of the GID score was evaluated by sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The level of significance was set at 0.05 (two-tailed). Because this study is retrospective and missing data are unavoidable in the real world, we assigned normal values to missing data for Glasgow coma scale, bandemia, and hs-CRP.
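The four performance measures named above all derive from a 2×2 table of predicted versus observed outcome at a given cut-off. The sketch below shows one way to compute them in Python; it is an illustration rather than the software used in this study, the function name is hypothetical, and the toy data are invented and are not patient data. It assumes every margin of the table is non-empty, so no division by zero is handled.

```python
from typing import Dict, Sequence


def performance_at_cutoff(scores: Sequence[int], died: Sequence[bool],
                          cutoff: int) -> Dict[str, float]:
    """Treat score >= cutoff as a positive prediction and tabulate the 2x2 table."""
    tp = fp = fn = tn = 0
    for score, dead in zip(scores, died):
        positive = score >= cutoff
        if positive and dead:
            tp += 1      # predicted death, died
        elif positive:
            fp += 1      # predicted death, survived
        elif dead:
            fn += 1      # predicted survival, died
        else:
            tn += 1      # predicted survival, survived
    return {
        "sensitivity": tp / (tp + fn),  # deaths correctly flagged
        "specificity": tn / (tn + fp),  # survivors correctly not flagged
        "ppv": tp / (tp + fp),          # flagged patients who died
        "npv": tn / (tn + fn),          # unflagged patients who survived
    }


# Invented toy data for illustration only.
toy_scores = [0, 0, 1, 1, 2, 3, 0, 1, 2, 4]
toy_died = [False, False, False, True, False, True, False, False, True, True]
for cutoff in (1, 2, 3):
    print(cutoff, performance_at_cutoff(toy_scores, toy_died, cutoff))
```

Raising the cut-off moves patients from the positive to the negative side of the table, which generally trades sensitivity for specificity.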

Results

A total of 5,508 patients were identified in this study (Table 1). The mortality rate was 2.2% (121/5508). The mean age (standard deviation) was 76.6 (7.4) years, and the percentage of males was 49.3%. Among the patients who died, the moderate elderly accounted for the largest share (41.3%), followed by the old elderly (38.8%) and the young elderly (19.8%). The mortality rate was highest in the old elderly (5.2%, 47/902), followed by the moderate elderly (2.2%, 50/2234) and the young elderly (1.0%, 24/2372). Compared to the survival group, the mortality group had lower GCS, SBP, and body temperature but higher heart rate and respiratory rate. Patients who died had more underlying comorbidities, including hypertension, COPD, CVA, malignancy, CHF, dementia, and bedridden status, than patients who survived. As to laboratory data, patients who died had higher WBC, bandemia, platelet count, serum creatinine, and hs-CRP but lower hemoglobin than those who survived. The average stay in hospital was 7.2 ± 7.5 days. The GID score was higher in patients who died than in those who survived (1.7 ± 1.1 vs. 0.8 ± 0.8, p < 0.01). The missing data and given values are listed in the S1 Table.
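The two sets of percentages in this paragraph answer different questions: the share of all deaths contributed by each age group, and the mortality rate within each age group. A short Python sketch recomputing both from the counts reported above may make the distinction concrete; the variable names are hypothetical.

```python
# Deaths and patients per age group, taken from the counts reported in the text.
groups = {
    "young elderly (65-74)": (24, 2372),
    "moderate elderly (75-84)": (50, 2234),
    "old elderly (>=85)": (47, 902),
}

total_deaths = sum(deaths for deaths, _ in groups.values())
total_patients = sum(n for _, n in groups.values())

for name, (deaths, n) in groups.items():
    rate = 100 * deaths / n               # mortality rate within the age group
    share = 100 * deaths / total_deaths   # share of all deaths from this group
    print(f"{name}: mortality rate {rate:.1f}%, share of deaths {share:.1f}%")

print(f"overall mortality rate: {100 * total_deaths / total_patients:.1f}%")
```

The moderate elderly contribute the most deaths because the group is large, while the old elderly have the highest rate because the group is small relative to its number of deaths.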


Table 1. Characteristics of all older patients with influenza between 2009 and 2018 in ED.

https://doi.org/10.1371/journal.pone.0283475.t001

Compared to patients with a GID score of 0, the ORs for mortality in patients with a GID score of 1, 2, and ≥3 were 3.08 (95% confidence interval [CI]: 1.66–5.71), 6.69 (95% CI: 3.52–12.71), and 23.68 (95% CI: 11.95–46.93), respectively (Table 2). The AUC was 0.722 (95% CI: 0.677–0.766) (S1 Fig), and the Hosmer–Lemeshow goodness of fit test was 1.000.
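In logistic regression, an OR is the exponential of a regression coefficient, and a confidence interval is usually built symmetrically around that coefficient on the log scale. The Python sketch below assumes the reported intervals are of this Wald type (the text does not state the interval method) and back-calculates the coefficient and an approximate standard error from the reported values; the function name is hypothetical.

```python
import math


def log_scale_summary(or_point: float, ci_low: float, ci_high: float):
    """Back-calculate the log-odds coefficient and approximate SE from an OR and 95% CI."""
    beta = math.log(or_point)
    # Assumption: a Wald interval, i.e. beta +/- 1.96 * SE on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    # Under that assumption the OR is the geometric mean of the two bounds.
    geometric_mean = math.sqrt(ci_low * ci_high)
    return beta, se, geometric_mean


# ORs and 95% CIs as reported in the text (reference: GID score of 0).
reported = {
    "score 1": (3.08, 1.66, 5.71),
    "score 2": (6.69, 3.52, 12.71),
    "score >=3": (23.68, 11.95, 46.93),
}
for label, (or_point, low, high) in reported.items():
    beta, se, gm = log_scale_summary(or_point, low, high)
    print(f"{label}: beta={beta:.3f}, SE={se:.3f}, sqrt(low*high)={gm:.2f}")
```

The geometric mean of each pair of bounds agrees with the reported OR to two decimal places, which is consistent with intervals that are symmetric on the log scale.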


Table 2. Logistic regression analysis, AUC, and Hosmer–Lemeshow goodness of fit test for validating the accuracy of GID score in older patients with influenza in ED.

https://doi.org/10.1371/journal.pone.0283475.t002

In the performance analysis, the NPV of the GID score was excellent at all three cut-offs (0.994 with GID score ≥1, 0.987 with GID score ≥2, and 0.982 with GID score ≥3). For predicting mortality, the GID score cut-off of 1 had the best sensitivity (0.893) and NPV (0.994), and the cut-off of 3 had the best specificity (0.968) (Table 3). Mortality rates associated with GID scores of ≥3, 2, 1, and 0 were 13.3%, 4.2%, 1.9%, and 0.6%, respectively (Fig 1).


Fig 1. Mortality rates in four groups of GID score (0 vs. 1 vs. 2 vs. ≥3).

GID, geriatric influenza death.

https://doi.org/10.1371/journal.pone.0283475.g001


Table 3. Performance of GID score in predicting mortality in older patients with influenza in ED.

https://doi.org/10.1371/journal.pone.0283475.t003

Discussion

This external validation study showed that the GID score had acceptable discrimination and a good fit for predicting mortality in older patients with influenza in the ED. The mortality rate increased with the GID score. Among the performance measures, the GID score performed best in NPV. A GID score of ≥1 had the best sensitivity and NPV for predicting mortality, whereas a GID score of ≥3 had the best specificity.

Compared to the original study that developed the GID score, the mortality rate in this validation study was lower (2.2% vs. 4.9%). Possible reasons are differences in the study population and in the measurement of mortality. The original GID score was developed in an ED of a medical center in northern Taiwan [4], whereas the validation study was conducted in three hospitals in southern Taiwan. In addition, the validation study recruited 5,508 patients, a far larger sample than that of the original study (409 patients). We adopted in-hospital mortality in the validation study because of the difficulty in following 5,508 patients after discharge, which also differs from the original study. In-hospital mortality should be lower than 30-day mortality because of the shorter follow-up period. In the literature, the mortality in older adults with influenza varies greatly, ranging from 0.009% to 14.3% in nursing homes for the elderly [6], owing to differences in race, medical care system, and time period.

The AUC in the validation study was 0.722, lower than the AUC of 0.861 in the original study [4]. Other studies have shown that CDRs generally perform better in the dataset from which they were derived than in internal or external validation [6, 7]. Lower performance in a validation study may be due to differences in patient samples and prevalence of the disease, over-fitting, unsatisfactory model derivation, absence of important predictors, and differences in interpretation and measurement of predictors [8, 9]. Even a well-developed CDR is not necessarily generalizable to new populations [8]. Thus, external validation, such as this study, is an essential process to assess the performance of a CDR [8].
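For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient who died has a higher score than a randomly chosen survivor, with ties counted as one half. The Python sketch below computes it by direct pairwise comparison; it illustrates the definition, is not the software used in this study, and uses invented toy data and a hypothetical function name.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], died: Sequence[bool]) -> float:
    """AUC as the fraction of (death, survivor) pairs ranked correctly; ties count 0.5."""
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    wins = 0.0
    for s_death in deaths:
        for s_surv in survivors:
            if s_death > s_surv:
                wins += 1.0   # pair ranked correctly
            elif s_death == s_surv:
                wins += 0.5   # tied pair counts half
    return wins / (len(deaths) * len(survivors))


# Invented toy data for illustration only.
toy_scores = [0, 0, 1, 1, 2, 3]
toy_died = [False, False, False, True, False, True]
print(auc_from_scores(toy_scores, toy_died))
```

Because the GID score takes only integer values from 0 to 6, many pairs are tied, and the half-credit rule for ties matters more than it would for a continuous predictor. The pairwise loop is quadratic in sample size and is meant for explanation rather than large datasets.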

External validation involves taking the original model with its predictors and regression coefficients, measuring the predictor and outcome variables in a new population, applying the original CDR to these data to predict the outcome of interest, and assessing the predictive performance of the CDR by analyzing outcomes [8, 10]. The performance of a CDR can be evaluated by discrimination, calibration, and measures that quantify clinical usefulness, such as decision curve analysis [8, 11]. A CDR can be revised according to the results if it performs poorly in external validation [8]. Although external validation is an essential process for the clinical use of CDRs, few CDRs have been externally validated [8, 12–14]. A systematic review of 101 CDRs showed that 76% of them had no such validation, 17% had narrow validation, 8% had broad validation, and none had impact analysis [14].

The GID score performed well in NPV across different cut-offs in this study. Besides the score being well developed, another explanation may be the low mortality rate in the study population, which also explains the low PPV (0.031 with the GID score cut-off of 1, 0.058 with the cut-off of 2, and 0.132 with the cut-off of 3) in this study. The GID score cut-off of 1 had a sensitivity of 0.893, which suggests that the cut-off of 1 may be used for identifying patients with a high mortality risk. The GID score cut-off of 3 had a specificity of 0.968, which suggests that the cut-off of 3 may be used for identifying patients with a low mortality risk. The choice of cut-off depends on the aim of the clinician.
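The dependence of PPV and NPV on the mortality rate follows from Bayes' rule. The Python sketch below illustrates this with the sensitivity of 0.893 reported for the cut-off of 1 and the two mortality rates mentioned in this paper (2.2% here, 4.9% in the original study). The specificity of 0.40 is an assumed placeholder chosen only for illustration, because the specificity at this cut-off is not stated in the text; the function names are hypothetical.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from Bayes' rule."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SENSITIVITY = 0.893          # reported for the cut-off of 1
ASSUMED_SPECIFICITY = 0.40   # assumption for illustration, not a reported value

for mortality_rate in (0.022, 0.049):
    p = ppv(SENSITIVITY, ASSUMED_SPECIFICITY, mortality_rate)
    n = npv(SENSITIVITY, ASSUMED_SPECIFICITY, mortality_rate)
    print(f"mortality {mortality_rate:.1%}: PPV={p:.3f}, NPV={n:.3f}")
```

With sensitivity and specificity held fixed, a lower mortality rate lowers the PPV and raises the NPV, so predictive values from one population should not be carried over to a population with a different mortality rate.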

The development and evaluation of a CDR comprise three main stages: derivation, validation, and impact analysis [8, 15]. Suggested requirements for designing external validation include a prospective multicenter design, a minimum of 100 outcome events, and a framework of generalizability to enhance the interpretation of findings [8, 16]. Suggested requirements for types of external validation are conducting temporal, geographical, and domain validation studies and a meta-analysis using a published framework to summarize the overall performance of the CDR [8, 17]. After validation, refinement of the CDR, including model updating or adjustment, is suggested [8]. Because the final aim of a CDR is to improve the quality of care, examining the impact of a validated CDR on patient outcomes and clinician behavior is suggested [8, 18]. Evaluation of cost-effectiveness and of long-term implementation and dissemination is also advised after the impact analysis [8].

The major strength of this study is that it is the first external validation study of the GID score. The study had a large sample size and used a multicenter design. There are some limitations. First, different epidemics occurred within the study period, which could have biased the results. The major circulating viruses were influenza A (H1N1) in 2009, 2010, 2015, and 2018; influenza A (H3N2) in 2012, 2013, 2014, and 2016; and influenza B in 2011 and 2017 [19]. Second, the original study defined the disease as the presence of fever and identification of influenza in nasal swabs, whereas this study defined it as an influenza diagnosis or a prescription of common anti-influenza drugs. The outcome in this study was in-hospital mortality, which differs from the 30-day mortality used in the original study that developed the GID score. The differences in disease definition and outcome measurement between the two studies may have contributed to the lower accuracy in this study. However, the results of this study for predicting in-hospital mortality could provide an important reference for validating the GID score against a different outcome. Because we had no data on the percentage of deaths within 30 days after discharge, further study of this issue is needed. Before the GID score is widely used, more validation studies, evaluation of the effects of epidemics of different viruses, refinement, and preparation for impact analysis may be warranted.

Conclusions

This multicenter study assessed the external validity of the GID score. The discrimination is acceptable, and the fit is good. The GID score has the best performance for NPV. The GID score cut-off of 1 is suggested for identifying patients with a high mortality risk, and the cut-off of 3 is suggested for identifying patients with a low mortality risk. Further studies, including more validation studies, refinement, and preparation for impact analysis, are suggested.

Supporting information

S1 Fig. The AUC of GID score.

AUC, area under the curve; GID, Geriatric Influenza Death.

https://doi.org/10.1371/journal.pone.0283475.s001

(DOCX)

S1 Table. Missing data and their given values.

https://doi.org/10.1371/journal.pone.0283475.s002

(DOCX)

Acknowledgments

We thank Ms. Chia-Jung Chen and Ms. Tzu-Lan Liu for their assistance in data collection and management and Ms. Yu-Shan Ma for the assistance with statistics. We thank Enago for the English revision.

References

1. Flu & People 65 Years and Older. 2022 [cited 2023 January 13]; https://www.cdc.gov/flu/highrisk/65over.htm.
2. Talbot H.K., Influenza in Older Adults. Infect Dis Clin North Am, 2017. 31(4): p. 757–766. pmid:28911829
3. Lam P.P., et al., Predictors of influenza among older adults in the emergency department. BMC Infect Dis, 2016. 16(1): p. 615. pmid:27793117
4. Chung J.Y., et al., Geriatric influenza death (GID) score: a new tool for predicting mortality in older people with influenza in the emergency department. Sci Rep, 2018. 8(1): p. 9312. pmid:29915256
5. Statistics of the population. 2021 [cited 2023 January 1]; https://bca.tainan.gov.tw/News_Content.aspx?n=1134&s=8156.
6. Taniguchi K., et al., Epidemiology and burden of illness of seasonal influenza among the elderly in Japan: A systematic literature review and vaccine effectiveness meta-analysis. Influenza Other Respir Viruses, 2021. 15(2): p. 293–314. pmid:32997395
7. Moons K.G., et al., Risk prediction models: II. External validation, model updating, and impact assessment. Heart, 2012. 98(9): p. 691–8. pmid:22397946
8. Cowley L.E., et al., Methodological standards for the development and evaluation of clinical prediction rules: a review of the literature. Diagn Progn Res, 2019. 3: p. 16. pmid:31463368
9. Toll D.B., et al., Validation, updating and impact of clinical prediction rules: a review. J Clin Epidemiol, 2008. 61(11): p. 1085–94. pmid:19208371
10. Altman D.G., et al., Prognosis and prognostic research: validating a prognostic model. BMJ, 2009. 338: p. b605. pmid:19477892
11. Vickers A.J. and Elkin E.B., Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making, 2006. 26(6): p. 565–74. pmid:17099194
12. Keogh C., et al., Developing an international register of clinical prediction rules for use in primary care: a descriptive analysis. Ann Fam Med, 2014. 12(4): p. 359–66. pmid:25024245
13. Steyerberg E.W., et al., Prognosis Research Strategy (PROGRESS) 3: prognostic model research. PLoS Med, 2013. 10(2): p. e1001381. pmid:23393430
14. Maguire J.L., et al., Clinical prediction rules for children: a systematic review. Pediatrics, 2011. 128(3): p. e666–77. pmid:21859912
15. McGinn T., Putting Meaning into Meaningful Use: A Roadmap to Successful Integration of Evidence at the Point of Care. JMIR Med Inform, 2016. 4(2): p. e16. pmid:27199223
16. Debray T.P., et al., A new framework to enhance the interpretation of external validation studies of clinical prediction models. J Clin Epidemiol, 2015. 68(3): p. 279–89. pmid:25179855
17. Debray T.P., et al., A framework for meta-analysis of prediction model studies with binary and time-to-event outcomes. Stat Methods Med Res, 2019. 28(9): p. 2768–2786. pmid:30032705
18. Moons K.G., et al., Prognosis and prognostic research: application and impact of prognostic models in clinical practice. BMJ, 2009. 338: p. b606. pmid:19502216
19. Gong Y.N., et al., Centennial review of influenza in Taiwan. Biomed J, 2018. 41(4): p. 234–241. pmid:30348266

