Early individualized risk prediction using clinical data for children during the febrile phase of dengue in outpatient settings in Vietnam and Thailand

  • Sorawat Sangkaew ,

    Roles Conceptualization, Formal analysis, Methodology, Validation, Writing – original draft

    ‡ SY, AH and ID are joint senior authors on this work. SS and BCD are joint first author on this work.

    Affiliations Department of Social Medicine, Hatyai Hospital, Songkhla, Thailand, Section of Adult Infectious Disease, Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom

  • Bethan Cracknell Daniels ,

    Roles Formal analysis, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliation MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom

  • Damien K. Ming,

    Roles Conceptualization, Methodology, Validation, Writing – review & editing

    Affiliation Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom

  • Bernard Hernandez,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliation Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom

  • Pau Herrero,

    Roles Conceptualization, Methodology, Writing – review & editing

    Affiliations Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom, The David Price Evans Global Health and Infectious Diseases Unit, Faculty of Health & Life Sciences, University of Liverpool, Liverpool, United Kingdom

  • Piyarat Suntarattiwong,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Pediatrics, Queen Sirikit National Institute of Child Health, Bangkok, Thailand

  • Siripen Kalayanarooj,

    Roles Conceptualization, Data curation, Resources, Supervision, Writing – review & editing

    Affiliation Department of Pediatrics, Queen Sirikit National Institute of Child Health, Bangkok, Thailand

  • Anon Srikiatkhachorn,

    Roles Conceptualization, Data curation, Methodology, Resources, Writing – review & editing

    Affiliations Faculty of Medicine, King Mongkut’s Institute of Technology Ladkrabang, Bangkok, Thailand, Institute for Immunology and Informatics and Department of Cell and Molecular Biology, University of Rhode Island, Providence, Rhode Island, United States of America

  • Alan L. Rothman,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation Institute for Immunology and Informatics and Department of Cell and Molecular Biology, University of Rhode Island, Providence, Rhode Island, United States of America

  • Darunee Buddhari,

    Roles Data curation, Writing – review & editing

    Affiliation Department of Virology, WRAIR-AFRIMS, Bangkok, Thailand

  • Nguyen Lam Vuong,

    Roles Conceptualization, Data curation, Methodology, Writing – review & editing

    Affiliations Oxford University Clinical Research Unit, Wellcome Trust Asia Programme, Ho Chi Minh City, Vietnam, Faculty of Public Health, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Vietnam

  • Phung Khanh Lam,

    Roles Methodology, Writing – review & editing

    Affiliations Oxford University Clinical Research Unit, Wellcome Trust Asia Programme, Ho Chi Minh City, Vietnam, Faculty of Public Health, University of Medicine and Pharmacy at Ho Chi Minh City, Ho Chi Minh City, Vietnam

  • Minh Tuan Nguyen,

    Roles Writing – review & editing, Data curation

    Affiliation Children’s Hospital No. 1, Ho Chi Minh City, Vietnam

  • Bridget Wills,

    Roles Conceptualization, Supervision

    Affiliations Oxford University Clinical Research Unit, Wellcome Trust Asia Programme, Ho Chi Minh City, Vietnam, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, United Kingdom

  • Cameron Simmons,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliation Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Parkville, Melbourne, Australia

  • Christl A. Donnelly,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliations MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom, Department of Statistics, University of Oxford, Oxford, United Kingdom

  • Sophie Yacoub ,

    Roles Conceptualization, Data curation, Methodology, Resources, Supervision, Writing – review & editing

    Affiliations Oxford University Clinical Research Unit, Wellcome Trust Asia Programme, Ho Chi Minh City, Vietnam, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, United Kingdom

  • Alison Holmes ,

    Roles Conceptualization, Supervision, Writing – review & editing

    Affiliations Centre for Antimicrobial Optimisation, Imperial College London, London, United Kingdom, The David Price Evans Global Health and Infectious Diseases Unit, Faculty of Health & Life Sciences, University of Liverpool, Liverpool, United Kingdom

  • Ilaria Dorigatti

    Roles Conceptualization, Formal analysis, Methodology, Supervision, Validation, Visualization, Writing – original draft

    i.dorigatti@imperial.ac.uk

    Affiliation MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom

Abstract

Dengue severity prediction models are usually developed using hospitalized patient data, but triage and hospital admission are mainly evaluated in outpatient settings. This study developed models using clinical and laboratory data from patients in outpatient settings during the febrile phase. Data from two cohort studies in Vietnam and Thailand were used to develop and validate six models: logistic regression with warning signs, Lasso-selected logistic regression, random forest, extreme gradient boosted classification, support vector machine, and artificial neural network. Models predicted dengue shock syndrome (DSS) as the primary endpoint and moderate plasma leakage and/or DSS as the secondary endpoint. We assessed model performance, discrimination, and calibration, using sensitivity, specificity, accuracy, Brier score, AUROC, CITL, calibration slope, calibration plots, and decision curve analysis. The optimal model was the Lasso-selected logistic regression for predicting DSS and the combined endpoint of moderate plasma leakage and/or DSS (Brier score: 0.044 [95% CI 0.043, 0.044] and 0.104 [95% CI 0.104, 0.105]; AUROC: 0.789 [95% CI 0.787, 0.791] and 0.741 [95% CI 0.740, 0.742]). We identified hematocrit, platelet count, lymphocyte count, and aspartate aminotransferase as predictors for DSS, and abdominal pain or tenderness, vomiting, mucosal bleeding, white blood cell count, lymphocyte count, platelet count, aspartate aminotransferase, and serum albumin as predictors for the secondary endpoint. Logistic regression and machine learning models using clinical and laboratory data during the febrile phase can support early prediction of severe disease in outpatient settings. Integrating risk prediction models into a decision support system could improve triage and optimize healthcare and resource allocation in endemic and resource-limited areas.

Author summary

Most dengue risk models are developed from hospitalized patient data, despite triage occurring in outpatient settings. Few studies have examined early outpatient predictors, and none have undergone external validation across countries. In this study, we developed and validated dengue risk prediction models using logistic regression and machine learning with outpatient data from Vietnam and Thailand. Our prior systematic review and expert consultation informed predictor selection. The models outperformed the WHO warning signs alone in predicting dengue shock syndrome and moderate plasma leakage, demonstrating better discrimination and calibration. Models incorporating four to eight routinely collected clinical parameters show promise for guiding early triage and improving care allocation, especially in resource-limited, dengue-endemic settings.

Introduction

Dengue is a mosquito-borne viral disease that heavily burdens public health systems globally, with 3.83 billion (3.45–4.09) people currently living in areas suitable for dengue transmission [1]. In 2024, the number of dengue cases reached the highest level on record, with more than 14,000,000 cases reported globally [2]. However, estimates suggest the true burden was 77.8 million cases (95% CI 50.1–101.2 million) [3,4]. Globally, Asia has the greatest dengue burden, with ~70% of the global dengue burden in this region [1,5]. The annual incidence of dengue hospitalization in Vietnam and Thailand is estimated at 142 and 136 people per 100,000, respectively [3]. In 2023, the estimated direct medical cost per dengue case increased markedly with inpatient care, rising from USD 7.51 (outpatient) to USD 160.09 (inpatient) in Thailand and from USD 27.82 to USD 65.84 in Vietnam [6]. Thus, inpatient care is a primary driver of the global dengue economic burden [7] despite only 7% of cases being treated in inpatient settings [3]. Around 1–5% of symptomatic cases develop severe clinical syndrome, typically on days 4–6 of illness, and can be fatal without prompt supportive therapy [8]. No antiviral treatments are available, and the vaccines developed and licensed to date have complex efficacy profiles and are expected to have modest impacts on hospitalizations [9]. Therefore, early recognition of patients at higher risk of severe disease, requiring close monitoring in hospitals or appropriate for outpatient care, is critical to improving case management and healthcare resource allocation.

The World Health Organization (WHO), in collaboration with the Special Program for Research and Training in Tropical Diseases, recommends using warning signs to help triage patients in the febrile phase by identifying those at higher risk of developing severe dengue [8]. The warning signs checklist shows high sensitivity but moderate specificity, potentially causing many unnecessary admissions [10–12]. Notably, the checklist does not provide an individualized prediction for the risk of severe dengue.

Clinical data-driven prediction models can estimate individual risk of severe disease, improving patient triage during the early febrile phase of dengue illness. Although several prediction models for dengue severity exist, most rely on hospitalized patient data [13–16], which may create a selection bias toward more severe presentations from the outset. Conversely, risk prediction models using data collected in the early febrile phase from outpatient settings can inform the triage and hospital admission of patients. As well as enabling the early identification of high-risk individuals, this also reduces the unnecessary hospitalization of low-risk patients, which is particularly important in resource-limited settings or during large dengue outbreaks, which can quickly overwhelm healthcare settings. We identified five studies that developed risk prediction models using outpatient data [17–21]. All studies showed moderate to high performance on internal training and validation sets; however, no studies included external validation on an independent dataset from a different country, limiting the generalizability of their findings.

This study fills this gap by developing statistical and machine learning models to predict progression to (i) dengue shock syndrome (DSS) and (ii) moderate plasma leakage and/or DSS using data from the febrile phase of outpatient illness. We train our model using data collected from a cohort study in Vietnam and validate it on an independent dataset collected in Thailand.

Results

Patient characteristics

The Vietnamese study enrolled 8,100 patients, of whom 2,245 had laboratory-confirmed dengue; 45% (1,019) of these patients were hospitalized, and the remaining 55% (1,226) were managed as outpatients (non-hospitalized) (S1 Fig). Among outpatients, 1,185 (96.66%) completed follow-up; among hospitalized patients, 110 (10.79%) developed DSS, and 185 (18.16%) developed moderate plasma leakage. In the Thai dataset, 1,210 patients were enrolled; 524 had laboratory-confirmed dengue, all hospitalized. Amongst these, 36 (6.87%) children developed DSS, and 182 (34.73%) developed moderate plasma leakage (S1 Fig).

The Thai and Vietnamese patient characteristics with complete data on the day of enrolment are shown in S1–S2 Tables. In both datasets, the mean age was 8–9 years, and 54–56% of patients were male. Patients with DSS were hospitalized later in both Vietnam and Thailand. In the Vietnamese dataset, vomiting and abdominal pain/tenderness were associated with both DSS and the combined endpoint, and mucosal bleeding was associated with the combined endpoint. No significant symptom differences were found in the Thai dataset (S1–S2 Tables). Higher AST and lower platelet counts were associated with DSS and the combined endpoint in both datasets, but higher hematocrit and lower serum albumin levels were only significant in the Vietnamese dataset. Secondary dengue infection (defined as detecting at least one positive dengue-specific IgG on the febrile and convalescence samples) was associated with DSS and the combined endpoint in both countries, while dengue serotype differences were observed in Vietnam but not in Thailand (S1–S2 Tables).

Selected predictors

Based on our previous systematic review and meta-analysis [22] and discussion with dengue experts, 12 and 11 candidate predictors for DSS and the combined endpoint were selected, respectively, comprising demographic information (age and nutritional status), signs and symptoms (vomiting, abdominal pain or tenderness, skin hemorrhage, mucosal bleeding), and laboratory data (hematocrit, platelet count, white blood cell count, lymphocyte count, AST, serum albumin) collected during the febrile phase (S3 Table). Hematocrit was excluded for the combined endpoint as it was part of the outcome definition. Using Lasso selection, we selected four predictors for DSS (hematocrit, platelet count, lymphocyte count, AST) and eight predictors for the combined endpoint (vomiting, mucosal bleeding, abdominal pain and/or tenderness, platelet count, white blood cell count, lymphocyte count, AST, serum albumin).

Table 1 shows the association of candidate predictors measured at enrolment with the two clinical endpoints. Vomiting (OR = 1.67 [95%CI 1.13, 2.47] and OR = 1.68 [95%CI 1.31, 2.15]) and abdominal pain (OR = 2.98 [95%CI 1.28, 6.09] and OR = 2.50 [95%CI 1.38, 4.34]) were associated with both DSS and the combined endpoint. Skin bleeding was associated with DSS (OR = 1.89 [95%CI 1.14, 3.02]), and mucosal bleeding was associated with the combined endpoint (OR = 1.94 [95%CI 1.16, 3.11]). During the febrile phase, higher hematocrit and AST, and lower platelet count, lymphocyte count, and serum albumin were significantly associated with DSS and the combined endpoint (Table 1).
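For readers less familiar with how an odds ratio and its confidence interval arise for a binary predictor such as vomiting, the following minimal Python sketch computes both from a 2x2 table. For a single binary predictor, the odds ratio from a univariate logistic regression equals the cross-product ratio of the table, and a Wald interval is one common large-sample choice. The function name and the counts are hypothetical and are not taken from the study data; the interval method used by the authors is not stated in this section.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: symptom present, outcome present    b: symptom present, outcome absent
    c: symptom absent,  outcome present    d: symptom absent,  outcome absent
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only (not study data).
or_, lo, hi = odds_ratio_wald(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f} [95% CI {lo:.2f}, {hi:.2f}]")
```

An interval that spans 1, as in this toy table, would not indicate a statistically significant association at the 5% level.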

Table 1. Mean and 95% confidence interval (CI) of the Odds Ratio of developing dengue shock syndrome (DSS) and the combined endpoint (moderate plasma leakage and/or DSS) for each candidate predictor derived from univariate logistic regression.

https://doi.org/10.1371/journal.pdig.0001171.t001

Predictive performance

We assessed the predictive performance of two regression models (the reference model: logistic regression model using the WHO warning signs parameters as predictors and logistic regression with Lasso selection) and four machine learning models developed using all candidate predictors (random forest, RF, extreme gradient boosted tree classification, XGB, support vector machine, SVM, and artificial neural network, ANN). Fig 1 summarizes the conceptual framework of model development and internal validation, including 10-fold cross-validation to tune hyperparameters and 10-fold calibration to calibrate the model. Model training and validation were repeated 45 times to obtain mean and 95% CI estimates of predictive performance using Block Jackknife estimation. Predictive performance was assessed using the Brier score (a strictly proper scoring rule which measures the accuracy of probabilistic predictions), a discrimination measurement (the area under receiver operating characteristic curves, AUROC), calibration measurements (calibration plots, calibration in the large, CITL, and calibration slope), decision curve analysis (quantifies the clinical net benefit of models across a range of threshold probabilities for decision-making), the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
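As a rough illustration of two of the metrics named above, the sketch below computes the Brier score (mean squared difference between predicted probability and observed outcome) and the AUROC (probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, counting ties as one half). The function names and the toy labels and probabilities are hypothetical; the authors' software is not described in this section.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def auroc(y_true, y_prob):
    """AUROC as the fraction of (case, non-case) pairs ranked correctly."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                score += 1.0      # case ranked above non-case
            elif pc == pn:
                score += 0.5      # ties count as half
    return score / (len(cases) * len(controls))

# Toy data, for illustration only (not study data).
y = [0, 0, 1, 1]
p = [0.10, 0.40, 0.35, 0.80]
print(brier_score(y, p), auroc(y, p))
```

A lower Brier score and a higher AUROC both indicate better performance; the Brier score is sensitive to calibration as well as ranking, whereas the AUROC depends on ranking alone.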

Fig 1. Description of the workflow used for model development and internal validation.

After data preparation (Steps 1 and 2), the data are used to develop risk prediction models (Step 3). The dataset is split into 45 random blocks stratified by the outcome (Step 4). Each block (green boxes) is then used for hyperparameter optimization (Process T). The model is trained and validated on 44 of the 45 blocks with 10-fold cross-validation (Process V), leaving one block out in turn (Step 5). The training set (blue boxes) is divided into ten folds (in the Calibration loop): nine folds (red boxes) are used to develop the model, and the other fold (grey boxes) is used for Platt model calibration (Process C). Outputs from the calibrated model are the predicted probability of developing the endpoint (DSS in the primary analysis; moderate plasma leakage and/or DSS in the secondary analysis). The 10-fold validation and 10-fold calibration were repeated 45 times, each time leaving a different block and using the optimized hyper-parameterization. The central estimates and variance of the predictive performances are estimated from the 45 blocks using Block Jackknife estimation (Process E).

https://doi.org/10.1371/journal.pdig.0001171.g001
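The Block Jackknife step in the workflow (Process E) can be sketched as follows. This section does not give the estimator's formula, so the sketch assumes the standard delete-one-block jackknife: each entry is a performance estimate obtained with one block left out, the point estimate is their mean, and the variance is (n - 1)/n times the sum of squared deviations. The function name and the three toy AUROC values are hypothetical.

```python
import math

def block_jackknife(leave_one_out_estimates, z=1.96):
    """Delete-one-block jackknife mean, standard error, and normal CI.

    leave_one_out_estimates: one performance value per left-out block.
    Assumes the standard jackknife variance, (n - 1) / n * sum of squares.
    """
    n = len(leave_one_out_estimates)
    mean = sum(leave_one_out_estimates) / n
    variance = (n - 1) / n * sum((t - mean) ** 2 for t in leave_one_out_estimates)
    se = math.sqrt(variance)
    return mean, se, (mean - z * se, mean + z * se)

# Toy leave-one-block-out AUROC values (not study data); the study used 45 blocks.
mean, se, ci = block_jackknife([0.78, 0.79, 0.80])
print(f"{mean:.3f} [95% CI {ci[0]:.3f}, {ci[1]:.3f}]")
```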

The predictive performance of the models developed on the Vietnamese dataset and internally validated using 10-fold cross-validation is presented in Table 2. The optimal hyperparameters and the coefficients of the multivariable logistic regression models with Lasso selection are presented in S4–S8 Tables. When predicting DSS, the reference model (logistic regression with the WHO warning signs) achieved the best overall and discrimination performance with an average accuracy of 0.756 [95% CI 0.752, 0.761], Brier score of 0.041 [95% CI 0.040, 0.041], and AUROC of 0.789 [95% CI 0.787, 0.791] (Table 2). Logistic regression with Lasso selection achieved better calibration performance (CITL -0.001 [95% CI -0.002, 0.001], calibration slope 0.942 [95% CI 0.933, 0.951]) with comparable discrimination performance and AUROC (Table 2). Amongst the machine learning models, XGB performed best in terms of discrimination and calibration (Table 2). The optimal hyperparameters selected for the machine learning models are presented in S9 Table.

Table 2. Predictive performance of the risk prediction models for DSS and the combined endpoint of moderate plasma leakage and/or DSS using multivariable logistic regression with the WHO warning signs as predictors, logistic regression with variables selected by Lasso selection, random forest, extreme gradient boosted tree, support vector machine, and artificial neural network with 2 hidden layers, all trained on the Vietnamese dataset and internally validated using 10-fold cross-validation.

https://doi.org/10.1371/journal.pdig.0001171.t002

When predicting the combined endpoint, model performance was generally lower compared to DSS. The reference model (logistic regression using WHO warning signs) again obtained the best Brier score; however, its AUROC was lower than the logistic regression with Lasso selection (0.697 [95% CI 0.696, 0.698] versus 0.741 [95% CI 0.740, 0.742]). This reflects the higher sensitivity and specificity of the Lasso model in predicting the combined endpoint (Table 2). Overall, XGB achieved the highest discrimination (AUROC 0.745 [95% CI 0.744, 0.747]) and accuracy (0.721 [95% CI 0.717, 0.724]), but its calibration performance (CITL -0.005 [95% CI -0.007, -0.002], calibration slope 1.040 [95% CI 1.031, 1.050]) was worse than logistic regression with Lasso selection which showed the best overall calibration (CITL 0.001 [95% CI 0.001, 0.001], calibration slope 0.968 [95% CI 0.962, 0.974]) (Fig 2 and Table 2). RF and SVM underperformed in overall performance, discrimination, and calibration (Table 2).

Fig 2. Calibration plots showing the mean predicted probability (x-axis) versus the mean observed probability (y-axis) for DSS (triangles) and the combined endpoint of moderate plasma leakage and/or DSS (points) using multivariable logistic regression with the WHO warning signs as predictors (WS) (validated only internally), logistic regression with variables selected by Lasso selection (LR), random forest (RF), extreme gradient boosted tree (XGB), support vector machine (SVM), and artificial neural network with 2 hidden layers (ANN).

The diagonal dotted line in each panel represents the perfect agreement between predicted and observed risk. DSS: dengue shock syndrome.

https://doi.org/10.1371/journal.pdig.0001171.g002
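The two numerical calibration summaries reported alongside these plots can be illustrated with a small, dependency-free sketch. It assumes the usual definitions: calibration in the large (CITL) is the intercept of a logistic model for the outcome with logit(predicted risk) as a fixed offset, and the calibration slope is the coefficient on logit(predicted risk) when both intercept and slope are estimated. The function names, the Newton-Raphson fitting loop, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def _logit(p):
    # Predicted risks must lie strictly between 0 and 1.
    return math.log(p / (1.0 - p))

def calibration_in_the_large(y, p, iters=50):
    """Intercept a in logit(P(y=1)) = a + logit(p), with the slope fixed at 1."""
    lp = [_logit(pi) for pi in p]
    a = 0.0
    for _ in range(iters):  # Newton-Raphson on a single parameter
        mu = [_sigmoid(a + l) for l in lp]
        grad = sum(yi - mi for yi, mi in zip(y, mu))
        hess = sum(mi * (1.0 - mi) for mi in mu)
        a += grad / hess
    return a

def calibration_slope(y, p, iters=50):
    """Fit logit(P(y=1)) = a + b * logit(p); return (a, b)."""
    lp = [_logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):  # Newton-Raphson on two parameters
        mu = [_sigmoid(a + b * l) for l in lp]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * l for yi, mi, l in zip(y, mu, lp))
        w = [mi * (1.0 - mi) for mi in mu]
        h00 = sum(w)
        h01 = sum(wi * l for wi, l in zip(w, lp))
        h11 = sum(wi * l * l for wi, l in zip(w, lp))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Toy example 1: every patient is given a 20% risk but 40% have the outcome.
citl = calibration_in_the_large([1, 1, 0, 0, 0], [0.2] * 5)

# Toy example 2: risks of 10% and 90% are predicted, but 20% and 80% are observed.
y_toy = [1, 0, 0, 0, 0] + [1, 1, 1, 1, 0]
p_toy = [0.1] * 5 + [0.9] * 5
intercept, slope = calibration_slope(y_toy, p_toy)
print(citl, intercept, slope)
```

In the first toy example the CITL is positive, meaning risk is underestimated on average; in the second the slope is below 1, meaning the predictions are too extreme. These are the same interpretations used when reading the calibration columns of the performance tables.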

Sensitivity, specificity, and accuracy for all models are presented in Table 2. For DSS, the Lasso model achieved a sensitivity of 0.780 (95% CI: 0.774–0.787), specificity of 0.728 (95% CI: 0.722–0.734), and accuracy of 0.731 (95% CI: 0.725–0.736), which were comparable to the WHO warning signs model (sensitivity 0.779 [95% CI: 0.773–0.785], specificity 0.755 [95% CI: 0.750–0.760], accuracy 0.756 [95% CI: 0.752–0.761]). For the combined endpoint of moderate plasma leakage or DSS, the Lasso model performed better, with sensitivity 0.706 (95% CI: 0.701–0.711), specificity 0.703 (95% CI: 0.698–0.707), and accuracy 0.703 (95% CI: 0.700–0.707). Similar patterns were observed across the machine-learning models.
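The classification metrics quoted above all derive from the four cells of a confusion matrix at a chosen probability threshold. The sketch below makes those definitions explicit. The function name is hypothetical, and the counts are invented for illustration: they were chosen to be of a similar magnitude to the reported DSS results at an assumed prevalence of 5%, and are not study data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),   # cases correctly flagged
        "specificity": tn / (tn + fp),   # non-cases correctly cleared
        "accuracy": (tp + tn) / n,
        "ppv": tp / (tp + fp),           # flagged patients who are cases
        "npv": tn / (tn + fn),           # cleared patients who are non-cases
    }

# Hypothetical counts for 1,000 patients with 50 cases (not study data).
m = classification_metrics(tp=39, fp=258, tn=692, fn=11)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because non-cases greatly outnumber cases when the endpoint is rare, accuracy tracks specificity closely, and the PPV stays low even when sensitivity and specificity are both above 0.7.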

Including days of illness (DOI) as an additional predictor did not improve model performance (S10 Table). For DSS, the AUROC was 0.789 (95% CI: 0.787–0.791) with the Lasso model and 0.778 (95% CI: 0.704–0.853) when DOI was included, with calibration and classification metrics remaining stable. Similar findings were observed for the combined endpoint of plasma leakage or DSS (AUROC 0.741 [95% CI: 0.740–0.742] without DOI vs. 0.732 [95% CI: 0.693–0.771] with DOI). In a subgroup analysis of hospitalized cases with the combined endpoint, logistic regression with Lasso selection achieved the highest performance (AUROC 0.648 [95%CI 0.647, 0.650], CITL 0.000 [95%CI -0.001, 0.001], calibration slope 0.903 [95%CI 0.892, 0.914]) (S11 Table).

Fig 3 visualizes the decision curve analysis showing the net benefit at a given risk threshold (probability of positive diagnosis). At a low threshold probability (< 0.1), the ANN model provided the highest net benefit for predicting DSS. Overall, however, the combined endpoint was more clinically valuable, with logistic regression with Lasso selection, XGB, and ANN achieving higher net benefits than logistic regression with the WHO warning signs. However, at higher threshold probabilities (>25%), logistic regression with the WHO warning signs had a higher net benefit. RF and SVM performed the worst across all thresholds (Fig 3).

Fig 3. Decision curves, showing the net benefit (y-axis) against the threshold probability (x-axis) for predicting moderate plasma leakage and/or dengue shock syndrome (DSS) (left panel) and DSS alone (right panel).

Models evaluated are logistic regression using the WHO warning signs (WS), logistic regression with variables selected by Lasso selection (LR), random forest (RF), extreme gradient boosted tree (XGB), support vector machine (SVM), and artificial neural network with 2 hidden layers (ANN). All (pink diagonal line) represents treating all patients. None (grey horizontal line) represents treating no patients.

https://doi.org/10.1371/journal.pdig.0001171.g003
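To make the quantity on the y-axis of the decision curves concrete, the following sketch computes net benefit at a single threshold probability using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The "treat all" reference follows from the same formula when every patient is classified as positive, and "treat none" has a net benefit of zero. Function names and toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are down-weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat all' strategy at the same threshold."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy data, for illustration only (not study data).
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p = [0.90, 0.40, 0.30, 0.20, 0.10, 0.10, 0.05, 0.05, 0.05, 0.05]
nb_model = net_benefit(y, p, threshold=0.25)
nb_all = net_benefit_treat_all(y, threshold=0.25)
print(nb_model, nb_all)
```

A decision curve repeats this calculation over a grid of thresholds; a model is useful at a given threshold only if its net benefit exceeds both the "treat all" and "treat none" values.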

External validation

The Vietnamese (training) and Thai (external validation) datasets had different case mixes (membership model concordance statistic of 0.84 [95% CI: 0.81–0.86]). Differences in predicted risk further support these dataset differences (S2 Fig). Predictive performance on the external validation dataset was lower than on the internal validation sets, with AUROC values for DSS and the combined endpoints 11% and 17% lower on average, respectively (Tables 2–3). Calibration metrics suggest prediction of both endpoints is generally underestimated (CITL higher than 0) and too extreme (calibration slope lower than 1) (Table 3 and Fig 2). The artificial neural network achieved the highest AUROC for DSS (0.690), but calibration remained imperfect (intercept 0.331, slope 0.912), consistent with the pattern observed in internal validation. By comparison, logistic regression with Lasso selection (AUROC 0.677, intercept 0.188, slope 0.531) and XGBoost (AUROC 0.679, intercept 0.248, slope 0.544) showed slightly lower discrimination but more stable calibration (Table 3). In the Thai cohort, application of the Lasso model reduced false positives compared with WHO warning signs (13% vs. 47% for DSS; 41% vs. 76% for plasma leakage), but at the cost of higher false negatives (26% vs. 6% for DSS; 2% vs. 0% for plasma leakage) (S12 Table). When predicted probabilities of DSS and plasma leakage were plotted against each predictor and stratified by country, the patterns of association were consistent across the Vietnamese and Thai datasets, suggesting that differences in baseline patient characteristics did not materially alter predictor–outcome relationships (S5–S6 Figs).

Table 3. Predictive performance of the risk prediction models for DSS and the combined endpoint of moderate plasma leakage and/or DSS using multivariable logistic regression with the WHO warning signs as predictors, logistic regression with variables selected by Lasso selection, random forest, extreme gradient boosted tree, support vector machine, and artificial neural network with 2 hidden layers, trained on the Vietnamese (training) dataset and validated on the Thai (external validation) dataset.

https://doi.org/10.1371/journal.pdig.0001171.t003

Sensitivity analysis

In a sensitivity analysis, we optimized the classification threshold by considering the cost of false negatives and DSS prevalence for the logistic regression with Lasso selection and XGB models. We found that a cost value of approximately 20 maintains the accuracy level observed in the baseline scenario, while increasing the cost beyond 20 improves sensitivity, with the logistic regression with Lasso selection model identifying 90.1% of DSS cases at a cost of 100, while excluding 61.0% of non-DSS cases (S13 Table).
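One way to see why raising the cost of a false negative increases sensitivity is the textbook expected-cost rule for a calibrated model: classify a patient as positive when the predicted risk is at least cost_FP / (cost_FP + cost_FN). The sketch below applies that rule. It is an assumption made for illustration only; the exact optimization used in the sensitivity analysis, including how DSS prevalence enters, is given in the supplementary material and is not reproduced here. Function names and toy data are hypothetical.

```python
def cost_threshold(cost_fn, cost_fp=1.0):
    """Risk threshold minimizing expected cost for calibrated probabilities.

    Assumes a simple two-cost model: cost_fn per missed case, cost_fp per
    unnecessary positive, and no cost for correct decisions.
    """
    return cost_fp / (cost_fp + cost_fn)

def sens_spec_at(y_true, y_prob, threshold):
    """Sensitivity and specificity when risk >= threshold is called positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data, for illustration only (not study data).
y = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
p = [0.30, 0.03, 0.20, 0.06, 0.04, 0.02, 0.02, 0.01, 0.01, 0.005]
for cost in (20, 100):
    t = cost_threshold(cost)
    sens, spec = sens_spec_at(y, p, t)
    print(f"cost={cost}: threshold={t:.4f}, sensitivity={sens}, specificity={spec}")
```

Under this assumed rule, a higher false-negative cost lowers the threshold, so more patients are flagged: sensitivity rises and specificity falls, which is the direction of the trade-off described above.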

Discussion

In outpatient settings, effective triage is key for identifying patients at risk of severe dengue. We developed risk prediction models using data from outpatient departments in Vietnam and Thailand to identify clinical and laboratory predictors collected during the febrile phases of illness. Internal validation showed that logistic regression with predictors selected by Lasso achieved the most balanced performance, combining good discrimination and calibration. Among machine-learning methods, XGBoost and ANN performed well, with XGBoost showing the most consistent and optimal results, particularly in external validation.

Unlike most previous studies [13–16], our models use predictors collected in outpatient settings during the febrile phase, providing earlier estimates of DSS risk to improve triage and patient management. Of the five studies previously predicting risk of severe dengue using outpatient data, two included no validation [18,19], and three performed model validation by splitting their dataset, either temporally or based on missing data [17,20,21]. Thus, by training our model on a Vietnamese dataset and validating it on an external Thai dataset, ours is the first to demonstrate the generalizability of a model trained on outpatient data across independent settings. We found good predictive performance on the external validation dataset (maximum AUROC 0.69) despite differences between the training and validation sets in transmission setting, hospital admission protocol, clinical management, and monitoring frequency.

To our knowledge, ours is also the first study using outpatient data to include a decision-curve analysis, allowing us to evaluate the net clinical benefit of our prediction models compared to hospitalization of all or none of the patients presenting at an outpatient setting, ensuring the models improve patient outcomes. We showed that at low threshold probabilities, machine learning models (ANN) offered the greatest net benefit in predicting DSS.

Given DSS is rare, we additionally considered a combined endpoint of moderate plasma leakage and/or DSS, increasing the PPV from 0.138 [95%CI 0.135, 0.142] to 0.273 [95%CI 0.270, 0.275] using the Lasso model. This combined endpoint also had a higher PPV compared to a similar study predicting DSS (PPV 0.10 [95%CI 0.09, 0.12]), reducing unnecessary inpatient care [21]. However, failing to identify individuals progressing to severe dengue poses a greater risk. In a sensitivity analysis, we showed how DSS prevalence and the cost of failing to identify at-risk individuals can be incorporated into the prediction model, tailoring it to individual settings and prioritizing DSS identification. All predictors included in this study are conventional laboratory parameters widely available in outpatient settings. They were carefully chosen based on our previous systematic review and meta-analysis [22] and on discussions with dengue experts. Using multivariable logistic regression and Lasso selection, we found four variables (hematocrit levels, platelet count, lymphocyte count, AST) and eight variables (vomiting, abdominal pain, mucosal bleeding, platelet count, white blood cell count, lymphocyte count, AST, serum albumin levels) were predictors of DSS and the combined endpoint, respectively. Platelet count and AST were consistently strong predictors in all models, in agreement with our previous meta-analysis [22], supporting testing and monitoring of these during the febrile phase. Our results also agree with vomiting, mucosal bleeding, and abdominal pain as established risk factors for developing severe dengue, which are warning signs in the 2009 WHO Dengue guidelines [8,22].

In this analysis, we investigated the association of severe dengue with the presence or absence of these symptoms, but the frequency and amount of vomiting, mucosal bleeding, and abdominal pain can also indicate the severity of the disease. In future work, it will be relevant to explore whether quantitative differences in vomiting, mucosal bleeding, and abdominal pain improve the prediction of DSS and/or plasma leakage. Additionally, we only considered the risk of severe dengue, whereas previous work has identified clinical fluid accumulation and elevated hematocrit as significant predictors of time taken to severe dengue progression in a hospitalized pediatric population in India [23]. Future work using Cox regression for semi-parametric analysis or discrete-time hazard models that leverage machine learning approaches to predict time to severe dengue progression could provide clinically actionable predictions for optimizing monitoring intensity, timing interventions, and allocating intensive care resources based on individual patient trajectories.

This study has additional limitations. The combined endpoint definition assumed non-hospitalized Vietnamese patients did not develop moderate plasma leakage unless detected during outpatient visits, potentially leading to underestimation. Limited chest X-rays for detecting plasma leakage might have caused missed cases. A higher hemoconcentration cut-off (e.g., 20% instead of 15%) would reduce under-reporting and the frequency of plasma leakage, affecting the relationship between predictors and the combined outcome. The cut-off threshold for plasma leakage also influences the relatedness between the Vietnamese (training) and Thai (external validation) datasets. Reciprocal validation using the Thai dataset was not feasible because the limited number of outcome events would not have supported robust model development. Finally, our Thai and Vietnamese cohorts (1994–2008 and 2010–2013, respectively) may have limited contemporary transportability given temporal changes in dengue patient characteristics, including demographic shifts toward older patients, increasing obesity rates, and higher baseline liver enzyme levels in the region [24–27]. Prospective validation of our models in contemporary cohorts is therefore essential.

The analysis offers new insights into predictive variables associated with severe dengue progression from two independent cohort studies in Thailand and Vietnam. It provides evidence that monitoring platelet count and AST during the febrile phase is useful for predicting severe disease. The models developed in this study are a step towards a decision support system for triage and clinical decision-making. While further testing and validation are needed, this work lays the foundation for integrating evidence-based, data-driven methods into a decision support system, optimizing healthcare allocation and resource use.

Materials and methods

This study was approved by the scientific and ethical committees of collaborating hospitals and the Oxford University Tropical Research Ethical Committee (NCT01421732) and The Research Ethics Review Committee of Queen Sirikit National Institute of Child Health (QSNICH) (REC.082/2562). This study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines for prediction model development [28].

Data resource

We developed risk prediction models using two existing datasets from (i) a prospective cohort study of children with dengue conducted by researchers at the Oxford University Clinical Research Unit (OUCRU) in Ho Chi Minh City with seven collaborator hospitals in Southern Vietnam between 2010 and 2013 and (ii) a prospective cohort study of children with dengue conducted at the QSNICH in Bangkok, Thailand. We used dataset (i) as the training dataset and (ii) as the external validation dataset, as the Thai dataset contained too few outcome events for both endpoints to support robust model development without risking overfitting and unstable estimates. A summary of the data collection process is provided in the S1 Text pp 1–2, and full details are available in Nguyen et al. [21] and Kalayanarooj et al. [29].

Prediction outcome

This study considered two clinical endpoints: DSS as the primary endpoint and moderate plasma leakage and/or DSS as a secondary endpoint. DSS was defined based on pulse pressure or hypotension with signs of poor perfusion, while moderate plasma leakage was determined by hematocrit changes or imaging evidence of plasma leakage (see S1 Text pp 2 for outcome definitions and measurement). The combined endpoint identifies patients with less severe disease requiring close monitoring and prompt medical intervention in hospitals, given that DSS alone was rare and represents extreme physiological derangement. In the Vietnamese dataset, all non-hospitalized patients were assumed not to have developed either endpoint.

Candidate predictors and predictor selection

Candidate predictors were selected based on our previous systematic review and meta-analysis [22], discussion with dengue experts (SY, SK, and PS), and data availability in the cohort. The number of candidate predictors was limited according to the 10–15 events per variable rule [30,31]. Two logistic regression models were developed. The first used predefined predictors based on WHO warning signs (vomiting, mucosal bleeding, abdominal pain or tenderness, clinical fluid accumulation, lethargy or restlessness, and platelet count). In the second model, Lasso regression was used to select the most relevant variables. We included the predictors with the highest model accuracy and stability in the final model [32]. Model accuracy was assessed by choosing the regularization parameter (λ) that minimized misclassification error in stratified 10-fold cross-validation, using the Glmnet package [33]. Model stability and variable selection frequencies were assessed using 1,000 bootstrap samples. All candidate predictors were included in the machine learning risk prediction models. See S1 Text p 2 for handling of outliers and missing data.
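The bootstrap stability assessment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the study used Lasso selection via Glmnet in R, whereas `select_variables` here is a hypothetical placeholder selector (a simple mean-difference rule), and the data, variable names, and `min_gap` threshold are invented for illustration. The sketch only shows how selection frequencies per variable and per model can be tallied across resamples.

```python
import random
from collections import Counter


def select_variables(rows, outcome, names, min_gap=0.5):
    """Hypothetical stand-in for Lasso selection: keep a predictor when the
    mean difference between events and non-events exceeds min_gap."""
    events = [r for r, y in zip(rows, outcome) if y == 1]
    controls = [r for r, y in zip(rows, outcome) if y == 0]
    if not events or not controls:
        return tuple()
    selected = []
    for j, name in enumerate(names):
        gap = abs(sum(r[j] for r in events) / len(events)
                  - sum(r[j] for r in controls) / len(controls))
        if gap > min_gap:
            selected.append(name)
    return tuple(selected)


def bootstrap_selection(rows, outcome, names, n_boot=1000, seed=1):
    """Resample rows with replacement, rerun selection, and tally results."""
    rng = random.Random(seed)
    n = len(rows)
    var_counts = Counter()    # how often each variable is selected
    model_counts = Counter()  # how often each exact variable set is selected
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = select_variables([rows[i] for i in idx],
                                  [outcome[i] for i in idx], names)
        var_counts.update(chosen)
        model_counts[chosen] += 1
    inclusion = {name: var_counts[name] / n_boot for name in names}
    return inclusion, model_counts.most_common(10)


# Synthetic data: "signal" differs between outcome groups, "noise" does not.
data_rng = random.Random(0)
names = ["signal", "noise"]
outcome = [1 if i % 5 == 0 else 0 for i in range(200)]
rows = [[data_rng.gauss(2.0 * y, 1.0), data_rng.gauss(0.0, 1.0)] for y in outcome]
inclusion, top_models = bootstrap_selection(rows, outcome, names, n_boot=200)
```

Here `inclusion` holds the proportion of resamples in which each variable was selected, and `top_models` lists the most frequently selected variable sets, analogous in spirit to the top-10 model tables in the Supporting information.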

Model development

We aimed to predict the probability of developing DSS (primary endpoint) and the combined endpoint of moderate plasma leakage and/or DSS (secondary endpoint) using six different models: a logistic regression model using the WHO warning signs parameters as predictors (the reference model), a multivariable logistic regression model including all predictors selected by Lasso regression (see Results, Selected Predictors), and four machine learning-based models (Random forest [RF], Extreme gradient boosted tree classification [XGB], Support vector machine [SVM], and Artificial neural network [ANN]) developed using all candidate predictors. See S1 Text p 3 for further details on the machine learning models. Logistic regression and Lasso regression were applied using the Glmnet package [33]. RF, SVM, and XGB were developed using the Caret package [34], and ANN models were implemented using the Keras package [35] in R [36]. All continuous predictors, including age and laboratory variables, were standardized for the SVM and ANN algorithms. Platelet count and aspartate aminotransferase (AST) were log-transformed (natural logarithm) because their distributions had long right tails.
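As an illustration of the preprocessing described above, the following Python sketch applies a natural-log transform followed by standardization. It is a sketch under stated assumptions rather than the study's R code: the platelet values are hypothetical, and fitting the mean and standard deviation on the training data only (then reusing them for new patients) is an assumption about good practice, not something the text specifies.

```python
import math
import statistics


def log_transform(values):
    """Natural-log transform for right-skewed, strictly positive values."""
    return [math.log(v) for v in values]


def fit_standardizer(train_values):
    """Return the mean and sample SD estimated from training data."""
    return statistics.fmean(train_values), statistics.stdev(train_values)


def standardize(values, mean, sd):
    """Convert values to z-scores using previously fitted mean and SD."""
    return [(v - mean) / sd for v in values]


# Hypothetical platelet counts (x10^9/L); not taken from the study data.
platelets_train = [250.0, 180.0, 95.0, 40.0, 310.0]
platelets_new = [120.0]

log_train = log_transform(platelets_train)
mean, sd = fit_standardizer(log_train)
z_train = standardize(log_train, mean, sd)
# Assumption: new data reuse the training mean and SD to avoid leakage.
z_new = standardize(log_transform(platelets_new), mean, sd)
```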

Fig 1 summarizes the conceptual framework of model development and internal validation. The dataset was split into 45 random blocks, stratified by outcome. Each model was trained and validated on 44 blocks simultaneously, leaving one block out in turn. 10-fold cross-validation was used to tune hyperparameters, and 10-fold calibration to calibrate the model. Model training and validation were repeated 45 times to obtain mean and 95% CI estimates of predictive performance using Block Jackknife estimation.
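The blocking and Block Jackknife steps above can be sketched in Python as follows. This is a minimal illustration with assumptions stated plainly: the data are synthetic, the statistic is a simple mean standing in for a performance metric such as AUROC, the round-robin block assignment is one possible way to stratify by outcome, and the normal-approximation interval (z = 1.96) with the standard delete-one jackknife variance is assumed rather than confirmed by the text.

```python
import math
import random
import statistics


def stratified_blocks(outcome, n_blocks=45, seed=1):
    """Assign row indices to blocks so events are spread evenly across blocks."""
    rng = random.Random(seed)
    blocks = [[] for _ in range(n_blocks)]
    position = 0
    for label in (1, 0):  # deal events first, then non-events
        members = [i for i, y in enumerate(outcome) if y == label]
        rng.shuffle(members)
        for i in members:
            blocks[position % n_blocks].append(i)
            position += 1
    return blocks


def block_jackknife(estimates, z=1.96):
    """Mean and normal-approximation CI from leave-one-block-out estimates."""
    g = len(estimates)
    if g < 2:
        raise ValueError("need at least two leave-one-block-out estimates")
    center = statistics.fmean(estimates)
    # Delete-one jackknife variance: (g - 1) / g * sum of squared deviations.
    variance = (g - 1) / g * sum((e - center) ** 2 for e in estimates)
    half_width = z * math.sqrt(variance)
    return center, center - half_width, center + half_width


# Synthetic example: 900 rows with 10% events, split into 45 blocks.
data_rng = random.Random(0)
outcome = [1] * 90 + [0] * 810
score = [data_rng.gauss(0.7 if y else 0.3, 0.1) for y in outcome]
blocks = stratified_blocks(outcome, n_blocks=45)

total = sum(score)
estimates = []
for block in blocks:
    left_out = sum(score[i] for i in block)
    # Statistic recomputed on the 44 retained blocks.
    estimates.append((total - left_out) / (len(score) - len(block)))

center, lower, upper = block_jackknife(estimates)
```

In a real analysis each entry of `estimates` would be a performance metric from a model trained and validated with one block removed, rather than a simple mean.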

Model validation

Internal validation.

We assessed the predictive performance of the models with 10-fold cross-validation and an overall measurement (the Brier score), a discrimination measurement (the area under receiver operating characteristic curves, AUROC), calibration measurements (calibration plots, calibration in the large, CITL, and calibration slope), and decision curve analysis. We also assessed sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). We estimated the mean and CI of the predictive performance metrics using the Block Jackknife technique (Fig 1). Calibration and decision curve analyses were performed on 10-fold cross-validation.
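Several of the metrics listed above have short, standard definitions, sketched here in Python on a small hypothetical set of outcomes and predicted probabilities. The calibration measures and decision curve analysis are not covered, and the 0.5 threshold is used only for illustration.

```python
def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def auroc(y, p):
    """Probability that a random event is ranked above a random non-event
    (ties count as one half)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def _ratio(numerator, denominator):
    """Return NaN instead of failing when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def confusion_metrics(y, p, threshold):
    """Sensitivity, specificity, PPV, and NPV at a probability threshold."""
    tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= threshold)
    fn = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi < threshold)
    fp = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi >= threshold)
    tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < threshold)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Hypothetical outcomes (1 = event) and predicted probabilities.
y = [1, 1, 0, 0, 0, 1, 0, 0]
p = [0.9, 0.6, 0.4, 0.2, 0.1, 0.3, 0.7, 0.05]

brier = brier_score(y, p)
auc = auroc(y, p)
metrics = confusion_metrics(y, p, threshold=0.5)
```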

The threshold probability for classification was defined using the minimum distance from the left-upper corner approach, to minimize the false positive and false negative misclassification. In a sensitivity analysis, we redefined the threshold probability for classification, considering the cost of a false negative and disease prevalence, reflecting that a false negative is a far worse outcome from a clinical perspective:

Where . We varied the cost value from 2-100. We used the same candidate predictors and performance metrics adopted in the main analysis and predicted DSS using the two best-fitting models (Lasso regression and XGB).
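The main-analysis rule, choosing the threshold whose ROC point lies closest to the upper-left corner (sensitivity 1, specificity 1), can be sketched in Python as below. The data are hypothetical, candidate thresholds are taken to be the observed predicted probabilities (an assumption), and only the unweighted rule is shown.

```python
import math


def roc_points(y, p):
    """Return (threshold, sensitivity, specificity) for each observed score."""
    n_pos = sum(1 for yi in y if yi == 1)
    n_neg = len(y) - n_pos
    points = []
    for t in sorted(set(p)):
        tp = sum(1 for yi, pi in zip(y, p) if yi == 1 and pi >= t)
        tn = sum(1 for yi, pi in zip(y, p) if yi == 0 and pi < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def closest_to_corner(y, p):
    """Pick the threshold minimizing distance to (sensitivity=1, specificity=1)."""
    return min(roc_points(y, p),
               key=lambda point: math.hypot(1 - point[1], 1 - point[2]))


# Hypothetical outcomes (1 = event) and predicted probabilities.
y = [1, 1, 0, 0, 0, 1, 0, 0]
p = [0.9, 0.6, 0.4, 0.2, 0.1, 0.3, 0.7, 0.05]

threshold, sensitivity, specificity = closest_to_corner(y, p)
```

For these toy data the selected threshold is 0.6, where the distance to the corner is about 0.39, slightly smaller than at 0.3 (distance 0.40).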

In a subgroup analysis, hospitalized cases with moderate plasma leakage and/or DSS were evaluated. Logistic regression with Lasso selection, RF, SVM, and XGB models were developed and validated, and their performance was assessed using the same criteria as the main analysis.

External validation.

The models were externally validated on the Thai dataset. We quantified the degree of relatedness between the Vietnamese (training) and Thai (external validation) datasets using the AUROC of membership and the mean and standard deviation of the linear predictors obtained from the training set and external validation set, as suggested by Debray et al. [37]. Predictive performance on the external validation was assessed using the Brier scores, AUROCs, and calibration measurements (reliability diagrams, CITL, and calibration slope).
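One part of the relatedness assessment above, comparing the mean and standard deviation of the linear predictor between the training and validation sets, can be sketched in Python. The coefficients and predictor values are hypothetical, and the membership-model AUROC is not shown. Expressing the spread comparison as a ratio of standard deviations is one possible choice made here for illustration.

```python
import statistics


def linear_predictor(intercept, coefs, row):
    """Log-odds predicted by a logistic model for one patient."""
    return intercept + sum(b * x for b, x in zip(coefs, row))


def compare_case_mix(intercept, coefs, train_rows, valid_rows):
    """Difference in mean and ratio of SD of the linear predictor."""
    lp_train = [linear_predictor(intercept, coefs, r) for r in train_rows]
    lp_valid = [linear_predictor(intercept, coefs, r) for r in valid_rows]
    return {
        "mean_difference": statistics.fmean(lp_valid) - statistics.fmean(lp_train),
        "sd_ratio": statistics.stdev(lp_valid) / statistics.stdev(lp_train),
    }


# Hypothetical model and data; the validation set is shifted on the first predictor.
intercept, coefs = -1.0, [0.5, 1.0]
train_rows = [[0, 1], [1, 0], [2, 1], [3, 0]]
valid_rows = [[1, 1], [2, 0], [3, 1], [4, 0]]

summary = compare_case_mix(intercept, coefs, train_rows, valid_rows)
```

A mean difference away from zero suggests a shift in average predicted risk between settings, while an SD ratio away from one suggests a more or less heterogeneous case mix in the validation set.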

Supporting information

S1 Fig. Flow chart describing patients’ outcomes in the Vietnamese training (A) and Thai external validation datasets (B).

https://doi.org/10.1371/journal.pdig.0001171.s001

(DOCX)

S2 Fig. Results of difference in means with 95% CI (the y axis) and relative difference with 95% CI in standard deviation (the x axis) of predicted logarithmic odds on the training (Left) and validation sets (Right).

The vertical and horizontal dotted lines reflect no relative difference in standard deviation and no difference in means of predicted logarithmic odds between the training and validation sets. LR: models with logistic regression and lasso selection; RF: models with random forest; XGB: models with extreme gradient boosted tree; SVM: models with support vector machine; ANN: models with artificial neural networks with 2 hidden layers.

https://doi.org/10.1371/journal.pdig.0001171.s002

(DOCX)

S5 Fig. Predicted probability of dengue shock syndrome (DSS) by day of illness, age, hematocrit, white blood cell count, platelet count, lymphocyte count, albumin, and AST, stratified by country (Thailand vs. Vietnam).

https://doi.org/10.1371/journal.pdig.0001171.s003

(DOCX)

S6 Fig. Predicted probability of plasma leakage by day of illness, age, hematocrit, white blood cell count, platelet count, lymphocyte count, albumin, and AST, stratified by country (Thailand vs. Vietnam).

https://doi.org/10.1371/journal.pdig.0001171.s004

(DOCX)

S1 Table. Characteristics and clinical variables of patients with laboratory confirmed dengue infection developing dengue shock syndrome (DSS) in the training (Vietnamese) and validation (Thai) datasets.

https://doi.org/10.1371/journal.pdig.0001171.s005

(DOCX)

S2 Table. Characteristics and clinical variables of patients with laboratory confirmed dengue infection developing the combined endpoint of moderate plasma leakage and/or dengue shock syndrome in the training (Vietnamese) and validation (Thai) datasets.

https://doi.org/10.1371/journal.pdig.0001171.s006

(DOCX)

S3 Table. Considered candidate predictors of Dengue Shock Syndrome (DSS) and the combined endpoint of moderate plasma leakage and/or DSS in the Vietnamese dataset.

Predictors included in each analysis are denoted by +.

https://doi.org/10.1371/journal.pdig.0001171.s007

(DOCX)

S4 Table. Global model, model selected by lasso selection with the minimum lambda, and bootstrap-derived quantities for assessing model uncertainty.

https://doi.org/10.1371/journal.pdig.0001171.s008

(DOCX)

S5 Table. Top 10 most frequently selected models across 1,000 bootstrap resamples.

https://doi.org/10.1371/journal.pdig.0001171.s009

(DOCX)

S6 Table. Global model, model selected by lasso selection with the minimum lambda, and bootstrap-derived quantities for assessing model uncertainty for a combined endpoint of moderate plasma leakage or DSS.

https://doi.org/10.1371/journal.pdig.0001171.s010

(DOCX)

S7 Table. Top 10 most frequently selected models across 1,000 bootstrap resamples for a combined endpoint of moderate plasma leakage or DSS.

https://doi.org/10.1371/journal.pdig.0001171.s011

(DOCX)

S8 Table. Summary of models developed using multivariable logistic regression with Lasso selection on the training set.

https://doi.org/10.1371/journal.pdig.0001171.s012

(DOCX)

S9 Table. Optimal hyperparameters selected using Bayesian Global Optimisation with Gaussian Processes of models for dengue shock syndrome.

https://doi.org/10.1371/journal.pdig.0001171.s013

(DOCX)

S10 Table. Predictive performance of the risk prediction models for the combined endpoint of moderate plasma leakage and/or DSS in a subgroup analysis trained on hospitalised patients in the Vietnamese dataset on internal validation using 10-fold cross validation.

https://doi.org/10.1371/journal.pdig.0001171.s014

(DOCX)

S11 Table. Dengue shock syndrome predictive performance of the logistic regression with lasso selection and extreme gradient boosted tree risk prediction models trained on the Vietnamese dataset, using different cost values of a false-negative result.

https://doi.org/10.1371/journal.pdig.0001171.s015

(DOCX)

S12 Table. Lower and upper bounds of hyperparameters used in each machine learning algorithm.

https://doi.org/10.1371/journal.pdig.0001171.s016

(DOCX)

S13 Table. Dengue shock syndrome predictive performance of the logistic regression with lasso selection and extreme gradient boosted tree risk prediction models trained on the Vietnamese dataset, using different cost values of a false-negative result. CI: Confidence interval.

https://doi.org/10.1371/journal.pdig.0001171.s017

(DOCX)

Acknowledgments

The authors acknowledge the research teams at the Oxford University Clinical Research Unit (OUCRU), the seven participating hospitals in Vietnam, the Queen Sirikit National Institute of Child Health (QSNICH), and the Armed Forces Research Institute of Medical Sciences (AFRIMS) for conducting the prospective cohort studies in Vietnam and Thailand and for providing the data used in this study.

Material has been reviewed by the Walter Reed Army Institute of Research. There is no objection to its presentation and/or publication. The opinions or assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting the views of the Department of the Army or the Department of Defense.

References

  1. Messina JP, Brady OJ, Golding N, Kraemer MUG, Wint GRW, Ray SE, et al. The current and future global distribution and population at risk of dengue. Nat Microbiol. 2019;4(9):1508–15. pmid:31182801
  2. Haider N, Hasan MN, Onyango J, Billah M, Khan S, Papakonstantinou D, et al. Global dengue epidemic worsens with record 14 million cases and 9000 deaths reported in 2024. Int J Infect Dis. 2025;158:107940. pmid:40449873
  3. DengueMap. Available from: https://arbomap.org/dengue/may24/FOI
  4. Cattarino L, Rodriguez-Barraquer I, Imai N, Cummings DAT, Ferguson NM. Mapping global variation in dengue transmission intensity. Sci Transl Med. 2020;12(528):eaax4144. pmid:31996463
  5. Bhatt S, Gething PW, Brady OJ, Messina JP, Farlow AW, Moyes CL, et al. The global distribution and burden of dengue. Nature. 2013;496(7446):504–7. pmid:23563266
  6. Leelavanich D, Dorigatti I, Turner HC. The economic burden of dengue: A systematic literature review of cost-of-illness studies. medRxiv. 2025;2025.08.21.25334162.
  7. Shepard DS, Undurraga EA, Halasa YA, Stanaway JD. The global economic burden of dengue: a systematic analysis. Lancet Infect Dis. 2016;16(8):935–41. pmid:27091092
  8. Dengue: Guidelines for Diagnosis, Treatment, Prevention and Control: New Edition. Geneva: World Health Organization; 2009.
  9. Cracknell Daniels B, Ferguson NM, Dorigatti I. Efficacy, public health impact and optimal use of the Takeda dengue vaccine. Nat Med. 2025;31(8):2663–72. pmid:40563017
  10. Kalayanarooj S. Dengue classification: current WHO vs. the newly suggested classification for better clinical application? J Med Assoc Thai. 2011;94(Suppl 3):S74-84.
  11. Barniol J, Gaczkowski R, Barbato EV, da Cunha RV, Salgado D, Martínez E, et al. Usefulness and applicability of the revised dengue case classification by disease: multi-centre study in 18 countries. BMC Infect Dis. 2011;11:106. pmid:21510901
  12. Srikiatkhachorn A, et al. Dengue--how best to classify it. Clin Infect Dis. 2011;53:563–7.
  13. Phakhounthong K, Chaovalit P, Jittamala P, Blacksell SD, Carter MJ, Turner P, et al. Predicting the severity of dengue fever in children on admission based on clinical features and laboratory indicators: application of classification tree analysis. BMC Pediatr. 2018;18(1):109. pmid:29534694
  14. Tan KW, Tan B, Thein TL, Leo Y-S, Lye DC, Dickens BL, et al. Dynamic dengue haemorrhagic fever calculators as clinical decision support tools in adult dengue. Trans R Soc Trop Med Hyg. 2020;114(1):7–15. pmid:31943116
  15. Lee VJ, Lye DC, Sun Y, Leo YS. Decision tree algorithm in deciding hospitalization for adult patients with dengue haemorrhagic fever in Singapore. Trop Med Int Health. 2009;14(9):1154–9. pmid:19624479
  16. Lam PK, Tam DTH, Dung NM, Tien NTH, Kieu NTT, Simmons C, et al. A Prognostic Model for Development of Profound Shock among Children Presenting with Dengue Shock Syndrome. PLoS One. 2015;10(5):e0126134. pmid:25946113
  17. Park S, Srikiatkhachorn A, Kalayanarooj S, Macareo L, Green S, Friedman JF, et al. Use of structural equation models to predict dengue illness phenotype. PLoS Negl Trop Dis. 2018;12(10):e0006799. pmid:30273334
  18. Talukdar S, Thanachartwet V, Desakorn V, Chamnanchanunt S, Sahassananda D, Vangveeravong M, et al. Predictors of plasma leakage among dengue patients in Thailand: A plasma-leak score analysis. PLoS One. 2021;16(7):e0255358. pmid:34324559
  19. Potts JA, Gibbons RV, Rothman AL, Srikiatkhachorn A, Thomas SJ, Supradish P-O, et al. Prediction of dengue disease severity among pediatric Thai patients using early clinical laboratory indicators. PLoS Negl Trop Dis. 2010;4(8):e769. pmid:20689812
  20. Pang J, Lindblom A, Tolfvenstam T, Thein T-L, Naim ANM, Ling L, et al. Discovery and Validation of Prognostic Biomarker Models to Guide Triage among Adult Dengue Patients at Early Infection. PLoS One. 2016;11(6):e0155993. pmid:27286230
  21. Nguyen MT, Ho TN, Nguyen VVC, Nguyen TH, Ha MT, Ta VT, et al. An Evidence-Based Algorithm for Early Prognosis of Severe Dengue in the Outpatient Setting. Clin Infect Dis. 2017;64(5):656–63. pmid:28034883
  22. Sangkaew S, Ming D, Boonyasiri A, Honeyford K, Kalayanarooj S, Yacoub S, et al. Risk predictors of progression to severe disease during the febrile phase of dengue: a systematic review and meta-analysis. Lancet Infect Dis. 2021;21(7):1014–26. pmid:33640077
  23. Sreenivasan P, Geetha S, Kumar AS. WHO 2009 Warning Signs as Predictors of Time Taken for Progression to Severe Dengue in Children. Indian Pediatr. 2020;57(10):899–903. pmid:33089804
  24. Chen C-Y, Chiu Y-Y, Chen Y-C, Huang C-H, Wang W-H, Chen Y-H, et al. Obesity as a clinical predictor for severe manifestation of dengue: a systematic review and meta-analysis. BMC Infect Dis. 2023;23(1):502. pmid:37525106
  25. Huang AT, Takahashi S, Salje H, Wang L, Garcia-Carreras B, Anderson K, et al. Assessing the role of multiple mechanisms increasing the age of dengue cases in Thailand. Proc Natl Acad Sci U S A. 2022;119(20):e2115790119. pmid:35533273
  26. Suwarto S, Diahtantri RA, Hidayat MJ, Widjaya B. Nonalcoholic fatty liver disease is associated with increased hemoconcentration, thrombocytopenia, and longer hospital stay in dengue-infected patients with plasma leakage. PLoS One. 2018;13(10):e0205965. pmid:30332476
  27. Kuruppu H, Karunananda M, Jeewandara C, Gomes L, Dissanayake DMCB, Ranatunga C, et al. Oxidative stress induced liver damage in dengue is exacerbated in those with obesity. medRxiv. 2025;2025.03.18.25324170. pmid:40166538
  28. Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, et al. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–73. pmid:25560730
  29. Kalayanarooj S, Vaughn DW, Nimmannitya S, Green S, Suntayakorn S, Kunentrasai N, et al. Early clinical and laboratory indicators of acute dengue illness. J Infect Dis. 1997;176(2):313–21. pmid:9237695
  30. Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996;49(12):1373–9. pmid:8970487
  31. Harrell FE. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. Cham, Switzerland: Springer International Publishing; 2015.
  32. Heinze G, Wallisch C, Dunkler D. Variable selection - A review and recommendations for the practicing statistician. Biom J. 2018;60(3):431–49. pmid:29292533
  33. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33(1):1–22. pmid:20808728
  34. Kuhn M. Caret: Classification and Regression Training. 2020.
  35. Allaire J, Chollet F. keras: R Interface to 'Keras'. R package version 2.16.0. 2025. Available from: https://CRAN.R-project.org/package=keras
  36. R Core Team. R: A Language and Environment for Statistical Computing. 2020.
  37. Debray TPA, Vergouwe Y, Koffijberg H, Nieboer D, Steyerberg EW, Moons KGM. A new framework to enhance the interpretation of external validation studies of clinical prediction models. J Clin Epidemiol. 2015;68(3):279–89. pmid:25179855