A Prediction Rule to Stratify Mortality Risk of Patients with Pulmonary Tuberculosis

Tuberculosis imposes high human and economic tolls, including in Europe. This study was conducted to develop a severity assessment tool for stratifying mortality risk in pulmonary tuberculosis (PTB) patients. A derivation cohort of 681 PTB cases was retrospectively reviewed to generate a model based on multiple logistic regression analysis of prognostic variables, with 6-month mortality as the outcome measure. A clinical scoring system was developed and tested against a validation cohort of 103 patients. Five risk features were selected for the prediction model: hypoxemic respiratory failure (OR 4.7, 95% CI 2.8–7.9), age ≥50 years (OR 2.9, 95% CI 1.7–4.8), bilateral lung involvement (OR 2.5, 95% CI 1.4–4.4), ≥1 significant comorbidity, i.e., HIV infection, diabetes mellitus, liver failure or cirrhosis, congestive heart failure or chronic respiratory disease (OR 2.3, 95% CI 1.3–3.8), and hemoglobin <12 g/dL (OR 1.8, 95% CI 1.1–3.1). A tuberculosis risk assessment tool (TReAT) was developed, stratifying patients into low (score ≤2), moderate (score 3–5) and high (score ≥6) mortality risk groups. The mortality associated with each group was 2.9%, 22.9% and 53.9%, respectively. The model performed equally well in the validation cohort. We provide a new, easy-to-use clinical scoring system to identify PTB patients at high mortality risk in settings with good healthcare access, helping clinicians decide which patients need closer medical care during treatment.
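The abstract gives the five predictors and the risk-band cut-offs but not the per-predictor point values. The sketch below therefore uses hypothetical integer weights purely to show how such a score is applied; only the band cut-offs (≤2, 3–5, ≥6) and the observed mortality per band are taken from the text. A minimal sketch in Python:

```python
# Illustrative TReAT-style scorer. The five predictors and the risk bands
# (low <=2, moderate 3-5, high >=6) come from the abstract; the integer
# point weights are HYPOTHETICAL stand-ins, not the published weights.
POINTS = {
    "hypoxemic_respiratory_failure": 3,  # hypothetical weight
    "age_ge_50": 2,                      # hypothetical weight
    "bilateral_lung_involvement": 2,     # hypothetical weight
    "significant_comorbidity": 1,        # hypothetical weight
    "hemoglobin_lt_12": 1,               # hypothetical weight
}

def treat_score(patient: dict) -> int:
    """Sum the points of every risk feature present in the patient record."""
    return sum(pts for feature, pts in POINTS.items() if patient.get(feature))

def risk_band(score: int) -> str:
    """Map a total score to the mortality-risk strata from the abstract."""
    if score <= 2:
        return "low"       # 2.9% observed mortality in the derivation cohort
    if score <= 5:
        return "moderate"  # 22.9%
    return "high"          # 53.9%

example = {"age_ge_50": True, "bilateral_lung_involvement": True,
           "hemoglobin_lt_12": True}
print(treat_score(example), risk_band(treat_score(example)))  # 5 moderate
```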


Participant description: p.7, p.17 (Table 1)
Details of treatments received, if relevant: NA
Study dates: p.7 (study design)

OUTCOME(S) TO BE PREDICTED
Definition and method for measurement of outcome: p.5 (definition of death outcome and survival time), p.5-6 (logistic regression with 6-month mortality as the outcome measure)
Was the same outcome definition (and method for measurement) used in all patients? Yes
Type of outcome (e.g., single or combined endpoints): Single
Was the outcome assessed without knowledge of the candidate predictors (i.e., blinded)?
No. All data, including date of death or date of loss to follow-up, were collected at the same time. However, the clinical prediction model (i.e., the selection of relevant predictors) was defined independently.
Were candidate predictors part of the outcome (e.g., in panel or consensus diagnosis)?
No
Time of outcome occurrence or summary of duration of follow-up: p.5 (event within 6 months)

PREDICTORS (OR INDEX TESTS)
Number and type of predictors (e.g., demographics, patient history, physical examination, additional testing, disease characteristics): p.4-5 (data collection), p.17 (Table 1)
Definition and method for measurement of candidate predictors: p.4-5 (data collection), p.5 (criteria for redefining continuous variables into binary factors), p.17 (Table 1 footnote)
Timing of predictor measurement (e.g., at patient presentation, at diagnosis, at treatment initiation): p.5 (data collection at baseline, meaning at patient diagnosis)
Were predictors assessed blinded for outcome, and for each other (if relevant)?
No. All data, including date of death or date of loss to follow-up, were collected at the same time. However, the clinical prediction model (i.e., the selection of relevant predictors) was defined independently.
Handling of predictors in the modelling (e.g., continuous, linear, non-linear transformations or categorised): p.7-8 (development of a practical CPR to assess risk of death) and p.8-9 (development of the TReAT scoring system to stratify the risk of death)
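The paper's criteria for recoding continuous variables into binary factors (p.5) are not restated here, but the abstract names two of the resulting cut-offs. A minimal sketch of that recoding, with column names assumed for the example:

```python
# Dichotomise continuous predictors at the cut-offs named in the abstract
# (age >= 50 years, hemoglobin < 12 g/dL). Column names are assumptions.
import pandas as pd

df = pd.DataFrame({"age": [34, 61, 52], "hemoglobin": [13.5, 10.2, 11.8]})
df["age_ge_50"] = (df["age"] >= 50).astype(int)
df["hemoglobin_lt_12"] = (df["hemoglobin"] < 12).astype(int)
print(df)
```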

SAMPLE SIZE
Number of participants and number of outcomes/events: p.7 (study design) and Figure 1
Number of outcomes/events in relation to the number of candidate predictors (Events Per Variable): p.7 (study design) and Figure 1

MISSING DATA
Number of participants with any missing value (include predictors and outcomes): p.18 (Table 3; the sample sizes of the two sets, n=539 and n=103, correspond to the numbers of participants without any missing value)
Number of participants with missing data for each predictor: p.17 (Table 1)
Handling of missing data (e.g., complete-case analysis, imputation, or other methods): p.6 (Heckman's selection model)
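The paper handles missingness with Heckman's selection model (p.6); its exact specification is not restated here. As a generic illustration of the classic two-step estimator (a probit selection equation, then the inverse Mills ratio added as a regressor to the outcome equation) on synthetic data:

```python
# Two-step Heckman selection sketch on synthetic data. Illustration of the
# general technique only; the paper's actual specification is on p.6.
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 1000
z = rng.normal(size=(n, 2))   # covariates driving selection (completeness)
x = rng.normal(size=n)        # covariate of the outcome model

# Selection: the outcome is observed only when the latent index is positive.
u = rng.normal(size=n)
selected = (0.5 + z @ np.array([1.0, -0.8]) + u) > 0

# Outcome, correlated with selection through the shared error term u.
y = 1.0 + 2.0 * x + 0.6 * u + rng.normal(scale=0.8, size=n)

# Step 1: probit for selection, then the inverse Mills ratio.
Z = sm.add_constant(z)
probit = sm.Probit(selected.astype(float), Z).fit(disp=0)
xb = Z @ probit.params               # linear predictor of the probit
imr = norm.pdf(xb) / norm.cdf(xb)    # inverse Mills ratio

# Step 2: outcome regression on the selected cases, with the IMR as an
# extra regressor to correct for non-random selection.
X = sm.add_constant(np.column_stack([x[selected], imr[selected]]))
print(sm.OLS(y[selected], X).fit().params)  # ~[1.0, 2.0, selection term]
```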

MODEL DEVELOPMENT
Modelling method (e.g., logistic, survival, neural network, or machine learning techniques): p.5-6 (logistic regression)
Modelling assumptions satisfied: Yes. We believe this is implicit throughout the text (see p.5-8): binary dependent variable; large sample size with >10 cases per predictor; independent observations; no multicollinearity.
Method for selection of predictors for inclusion in multivariable modelling (e.g., all candidate predictors, pre-selection based on unadjusted association with the outcome): p.5-8 (pre-selection based on unadjusted association with the outcome)
Method for selection of predictors during multivariable modelling (e.g., full model approach, backward or forward selection) and criteria used (e.g., p-value, Akaike Information Criterion): p.5 and p.7-8 (stepwise backward selection; see the sketch below)
Shrinkage of predictor weights or regression coefficients (e.g., no shrinkage, uniform shrinkage, penalized estimation): No shrinkage
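The exact selection criteria are in the paper (p.5, p.7-8). As a rough illustration of p-value-driven backward elimination after univariable pre-selection, on synthetic data and with the 0.05 threshold assumed for the example:

```python
# Backward stepwise elimination for a logistic model: start from the
# pre-selected candidates and repeatedly drop the least significant
# predictor until every remaining p-value clears the threshold.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def backward_select(df: pd.DataFrame, outcome: str, candidates: list[str],
                    p_threshold: float = 0.05) -> list[str]:
    kept = list(candidates)
    while kept:
        model = sm.Logit(df[outcome], sm.add_constant(df[kept])).fit(disp=0)
        pvals = model.pvalues.drop("const")
        worst = pvals.idxmax()
        if pvals[worst] <= p_threshold:
            break
        kept.remove(worst)
    return kept

# Synthetic example: x1 and x2 drive the outcome, x3 is noise.
rng = np.random.default_rng(1)
n = 600
df = pd.DataFrame({c: rng.integers(0, 2, n) for c in ["x1", "x2", "x3"]})
logit = -1.5 + 1.2 * df["x1"] + 0.9 * df["x2"]
df["death"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))
print(backward_select(df, "death", ["x1", "x2", "x3"]))  # typically ['x1', 'x2']
```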

MODEL PERFORMANCE
Calibration (calibration plot, calibration slope, Hosmer-Lemeshow test) and discrimination (C-statistic, D-statistic, log-rank) measures with confidence intervals: p.6 (models were assessed for goodness-of-fit using receiver operating characteristic (ROC) curves and the Hosmer-Lemeshow test; see the sketch below)
Classification measures (e.g., sensitivity, specificity, predictive values, net reclassification improvement) and whether a-priori cut points were used: p.8-9 and p.18 (Table 3)
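For these goodness-of-fit checks, scikit-learn provides the ROC AUC directly, while the Hosmer-Lemeshow statistic is usually hand-rolled. A minimal decile-based version on synthetic data (the decile grouping and df = groups - 2 are the conventional choices, not taken from the paper):

```python
# Goodness-of-fit sketch: ROC AUC for discrimination and a decile-based
# Hosmer-Lemeshow chi-square for calibration. Synthetic data only.
import numpy as np
from scipy.stats import chi2
from sklearn.metrics import roc_auc_score

def hosmer_lemeshow(y_true, y_prob, groups: int = 10):
    """Chi-square over risk-decile bins; df = groups - 2 for a fitted model."""
    order = np.argsort(y_prob)
    bins = np.array_split(order, groups)
    stat = 0.0
    for idx in bins:
        obs = y_true[idx].sum()        # observed events in the bin
        exp = y_prob[idx].sum()        # expected events in the bin
        n = len(idx)
        stat += (obs - exp) ** 2 / (exp * (1 - exp / n))
    return stat, chi2.sf(stat, groups - 2)

rng = np.random.default_rng(2)
p = rng.uniform(0.01, 0.6, size=800)   # predicted 6-month mortality risks
y = rng.binomial(1, p)                 # outcomes drawn from those risks
print("AUC:", roc_auc_score(y, p))
print("HL stat, p-value:", hosmer_lemeshow(y, p))
```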

MODEL EVALUATION
Method used for testing model performance: development dataset only (random split of data, resampling methods, e.g., bootstrap or cross-validation, none) or separate external validation (e.g., temporal, geographical, different setting, different investigators): p.9 and p.18 (Table 4)
In case of poor validation, whether model was adjusted or updated (e.g., intercept recalibrated, predictor effects adjusted, or new predictors added): Validation performed well; there was no need for any adjustment or update.
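No update was needed here, but for the checklist item on model updating, intercept recalibration is the lightest-touch option: re-estimate only the intercept on the validation data while holding the original linear predictor fixed as an offset. A sketch of that general technique on synthetic data:

```python
# Intercept recalibration sketch: keep the original linear predictor as a
# fixed offset and re-estimate only the intercept on the validation data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 400
lp = rng.normal(-1.0, 1.0, n)                        # original linear predictor
y = rng.binomial(1, 1 / (1 + np.exp(-(lp - 0.5))))   # outcomes, shifted baseline

ones = np.ones((n, 1))
recal = sm.GLM(y, ones, family=sm.families.Binomial(), offset=lp).fit()
print("intercept shift:", recal.params[0])           # ~ -0.5 in this example
```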

RESULTS
Final and other multivariable models (e.g., basic, extended, simplified) presented, including predictor weights or regression coefficients, intercept, baseline survival, model performance measures (with standard errors or confidence intervals): p.8-9 (development of a scoring system, taking into account the weights derived from the regression coefficients; see also the sketch below)
Any alternative presentation of the final prediction models (e.g., sum score, nomogram, score chart, predictions for specific risk subgroups with performance): Figure 2C
Comparison of the distribution of predictors (including missing data) for development and validation datasets: Supplementary Table 2
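The score points are derived from the regression coefficient weights (p.8-9). One common recipe, sketched below, scales each coefficient by the smallest and rounds to the nearest integer; the coefficients here are the natural logs of the abstract's adjusted ORs, used as stand-ins, so the resulting points are not necessarily the published ones.

```python
# Turn logistic regression coefficients into integer score points by
# scaling to the smallest coefficient and rounding. The coefficients are
# ln(OR) for the abstract's adjusted ORs, used as illustrative stand-ins.
coefs = {
    "hypoxemic_respiratory_failure": 1.55,  # ln(4.7)
    "age_ge_50": 1.06,                      # ln(2.9)
    "bilateral_lung_involvement": 0.92,     # ln(2.5)
    "significant_comorbidity": 0.83,        # ln(2.3)
    "hemoglobin_lt_12": 0.59,               # ln(1.8)
}
base = min(coefs.values())
points = {name: round(c / base) for name, c in coefs.items()}
print(points)  # e.g. {'hypoxemic_respiratory_failure': 3, 'age_ge_50': 2, ...}
```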

INTERPRETATION AND DISCUSSION
Interpretation of presented models (confirmatory, i.e., model useful for practice, versus exploratory, i.e., more research needed): p.9-12
Comparison with other studies; discussion of generalizability, strengths and limitations: p.9-12