The leap to ordinal: Detailed functional prognosis after traumatic brain injury with a flexible modelling approach

  • Shubhayu Bhattacharyay,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    sb2406@cam.ac.uk

    Affiliations Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom, Department of Clinical Neurosciences, University of Cambridge, Cambridge, United Kingdom, Laboratory of Computational Intensive Care Medicine, Johns Hopkins University, Baltimore, MD, United States of America

  • Ioan Milosevic,

    Roles Formal analysis, Methodology, Software, Writing – review & editing

    Affiliation Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom

  • Lindsay Wilson,

    Roles Writing – review & editing

    Affiliation Division of Psychology, University of Stirling, Stirling, United Kingdom

  • David K. Menon,

    Roles Funding acquisition, Resources, Supervision, Writing – review & editing

    Affiliation Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom

  • Robert D. Stevens,

    Roles Writing – review & editing

    Affiliations Laboratory of Computational Intensive Care Medicine, Johns Hopkins University, Baltimore, MD, United States of America, Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University, Baltimore, MD, United States of America

  • Ewout W. Steyerberg,

    Roles Methodology, Supervision, Writing – review & editing

    Affiliation Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands

  • David W. Nelson,

    Roles Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation Department of Physiology and Pharmacology, Section for Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden

  • Ari Ercole,

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations Division of Anaesthesia, University of Cambridge, Cambridge, United Kingdom, Cambridge Centre for Artificial Intelligence in Medicine, Cambridge, United Kingdom

  • the CENTER-TBI investigators and participants

    A full list of the CENTER-TBI investigators and participants can be found in the Acknowledgments.

Abstract

When a patient is admitted to the intensive care unit (ICU) after a traumatic brain injury (TBI), an early prognosis is essential for baseline risk adjustment and shared decision making. TBI outcomes are commonly categorised by the Glasgow Outcome Scale–Extended (GOSE) into eight, ordered levels of functional recovery at 6 months after injury. Existing ICU prognostic models predict binary outcomes at a certain threshold of GOSE (e.g., prediction of survival [GOSE > 1]). We aimed to develop ordinal prediction models that concurrently predict probabilities of each GOSE score. From a prospective cohort (n = 1,550, 65 centres) in the ICU stratum of the Collaborative European NeuroTrauma Effectiveness Research in TBI (CENTER-TBI) patient dataset, we extracted all clinical information within 24 hours of ICU admission (1,151 predictors) and 6-month GOSE scores. We analysed the effect of two design elements on ordinal model performance: (1) the baseline predictor set, ranging from a concise set of ten validated predictors to a token-embedded representation of all possible predictors, and (2) the modelling strategy, from ordinal logistic regression to multinomial deep learning. With repeated k-fold cross-validation, we found that expanding the baseline predictor set significantly improved ordinal prediction performance while increasing analytical complexity did not. Half of these gains could be achieved with the addition of eight high-impact predictors to the concise set. At best, ordinal models achieved 0.76 (95% CI: 0.74–0.77) ordinal discrimination ability (ordinal c-index) and 57% (95% CI: 54%–60%) explanation of ordinal variation in 6-month GOSE (Somers’ Dxy). Model performance and the effect of expanding the predictor set decreased at higher GOSE thresholds, indicating the difficulty of predicting better functional outcomes shortly after ICU admission. Our results motivate the search for informative predictors that improve confidence in prognosis of higher GOSE and the development of ordinal dynamic prediction models.

Introduction

Globally, traumatic brain injury (TBI) is a major cause of death, disability, and economic burden [1]. The treatment of critically ill TBI patients is largely guided by an initial prognosis made within a day of admission to the intensive care unit (ICU) [2]. Early outcome prediction models set a baseline against which clinicians consider the effect of therapeutic strategies and compare patient trajectories. Therefore, well-calibrated and reliable prognostic models are an essential component of intensive care.

Outcome after TBI is most often evaluated on the ordered, eight-point Glasgow Outcome Scale–Extended (GOSE) [3–6], which stratifies patients by their highest level of functional recovery according to participation in daily activities. Existing baseline prediction models used in the ICU dichotomise the GOSE into binary endpoints for TBI outcome. For example, the Acute Physiologic Assessment and Chronic Health Evaluation (APACHE) II [7] model predicts in-hospital survival (GOSE > 1) while the International Mission for Prognosis and Analysis of Clinical Trials in TBI (IMPACT) [8] models focus on predicting functional independence (GOSE > 4, or ‘favourable outcome’) and survival at 6 months post-injury.

Dichotomised GOSE prediction employs a fixed threshold of favourability among the eight levels of recovery for all patients. However, there is no empirical justification for an ideal treatment-effect threshold of GOSE [9]. Moreover, dichotomisation removes each patient or caregiver’s ability to define a different level of recovery as ‘favourable’ during prognosis. By concealing the nuanced differences in outcome defined by the GOSE, dichotomisation also limits the prognostic information made available during a shared treatment decision making process. For example, when clinicians, patients, or next of kin must together decide whether to withdraw life-sustaining measures (WLSM) after severe TBI, knowing the probability of different levels of functional recovery in addition to the baseline probability of survival would enable better quality-of-life consideration and confidence in the decision (Fig 1B) [10]. These problems of dichotomisation cannot be addressed simply by independently training a combination of binary prediction models at several GOSE thresholds. If model predictions are not constrained across the thresholds (i.e., ensuring probabilities do not increase with higher thresholds) during training, then combining multiple threshold outputs may result in nonsensical values. For example, the purported probability of survival (GOSE > 1) might be lower than that of recovering functional independence (GOSE > 4).
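The inconsistency described above can be made concrete with a few invented numbers. The following is a minimal sketch, not study code: the threshold probabilities are hypothetical, and `is_monotone` and `conditional` are helper names introduced here purely for illustration.

```python
# Hypothetical threshold probabilities Pr(GOSE > k); all values are invented.
unconstrained = {1: 0.62, 3: 0.55, 4: 0.66}  # e.g. independently trained binary models
constrained = {1: 0.80, 3: 0.55, 4: 0.40}    # e.g. a single ordinal model


def is_monotone(threshold_probs):
    """Return True if Pr(GOSE > k) never increases as the threshold k rises."""
    ordered = [threshold_probs[k] for k in sorted(threshold_probs)]
    return all(later <= earlier for earlier, later in zip(ordered, ordered[1:]))


def conditional(threshold_probs, higher, lower):
    """Pr(GOSE > higher | GOSE > lower) = Pr(GOSE > higher) / Pr(GOSE > lower)."""
    if higher <= lower:
        raise ValueError("the conditioning threshold must be the lower one")
    return threshold_probs[higher] / threshold_probs[lower]


print(is_monotone(unconstrained))  # False: Pr(GOSE > 4) exceeds Pr(GOSE > 1)
print(is_monotone(constrained))    # True
print(conditional(constrained, higher=4, lower=1))
```

With the unconstrained values, the stated probability of functional independence exceeds that of survival, which is exactly the kind of output a constrained ordinal model rules out.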

Fig 1. Comparison of ordinal outcome prediction to binary outcome prediction in terms of model architecture and clinical application.

GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. ReLU = rectified linear unit. Pr(●) = Probability operator, i.e., “probability of ●.” Pr(●|○) = Conditional probability operator, i.e., “probability of ●, given ○.” (A) Output layer architectures of binary and ordinal GOSE prediction models. Ordinal prediction models must not only have a more complicated output structure (in terms of learned weights and outcome encoding choices) but also constrain probabilities across the possible levels of functional outcome (indicated by ‘Constraint’ in the ordinal model representations). The constraint for multinomial outcome encoding is performed with a softmax activation function while the constraint for ordinal outcome encoding is performed with subtractions of output values (implemented with a negative ReLU transformation) from lower thresholds. In the provided legend formula for the softmax activation function, z_i represents the output value of the ith node of the multinomial outcome encoding layer (i.e., the node representing the ith possible score of GOSE) preceding the softmax transformation. (B) A sample patient case to demonstrate the difference in prognostic information between ordinal and binary GOSE prediction models. Binary models predict outcomes at one GOSE threshold while ordinal models predict outcomes at every GOSE threshold concurrently and provide conditional predictions of higher GOSE threshold outcomes given lower GOSE threshold outcomes. Bespoke conditional probability diagrams can be constructed between any number of GOSE thresholds, as desired by model users, so long as lower thresholds (e.g., GOSE > 1) precede higher thresholds (e.g., GOSE > 3) in directionality. Conditional probabilities are calculated by dividing the model probability at the higher threshold by the model probability at the lower threshold (e.g., Pr(GOSE > 3 | GOSE > 1) = Pr(GOSE > 3) / Pr(GOSE > 1)).

https://doi.org/10.1371/journal.pone.0270973.g001

A practical solution would be to train ordinal outcome prediction models, which concurrently return probabilities at each GOSE threshold by learning the interdependent relationships between the predictor set and the possible levels of functional recovery (Fig 1A). Ordinal GOSE prediction models would allow users to interpret the probability of different levels of functional recovery. Additionally, they can provide insight into the conditional probability of obtaining greater levels of recovery given lower levels (see Fig 1B for a practical clinical application of this information). However, moving from binary to ordinal outcome prediction poses three key challenges. First, there is no guarantee that widely accepted TBI outcome predictor sets, validated either by binary or ordinal regression analysis, will be able to capture the nuanced differences between levels of functional recovery well enough for reliable prediction. Second, ordinal prediction models typically need to be more complicated than binary models to encode the possibility of more outcomes and the constrained relationship between them [11]. For GOSE prediction, ordinal models can either encode the outcomes as: (1) multinomial, in which nodes exist for each GOSE score and collectively undergo a softmax transformation (to constrain the sum of values to one) and probabilities are calculated by accumulating values up to each threshold, or (2) ordinal, in which nodes exist for each threshold between consecutive GOSE scores, constrained such that output values must not increase with higher thresholds, and probabilities for each threshold are calculated with a sigmoid transformation (Fig 1A). Third, assessment of prediction performance is not as intuitive with an ordinal outcome as with a binary outcome. Widely used dichotomous prediction performance metrics such as the c-index (i.e., the area under the receiver operating characteristic curve [AUC]) do not trivially extend to the ordinal case [12], so assessment of ordinal prediction models requires the consideration of multifactorial metrics and visualisations that may complicate interpretations of model performance [13].
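The two outcome encodings described above can be sketched in a few lines of plain Python. This is an illustrative reading of the description, not the study's implementation: the function names, the example output values, and the exact form of the ordinal constraint (subtracting a ReLU-transformed value from the preceding threshold's value) are assumptions. Seven outcome categories and six thresholds are used because GOSE scores 2 and 3 are combined in this dataset.

```python
import math

THRESHOLDS = [">1", ">3", ">4", ">5", ">6", ">7"]


def softmax(values):
    """Numerically stable softmax; the outputs sum to one."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def thresholds_from_multinomial(logits):
    """Multinomial encoding: one output per GOSE category, lowest first.

    Pr(GOSE > category) is one minus the probability accumulated up to and
    including that category.
    """
    probs = softmax(logits)
    cumulative = 0.0
    result = []
    for p in probs[:-1]:
        cumulative += p
        result.append(1.0 - cumulative)
    return result


def thresholds_from_ordinal(raw):
    """Ordinal encoding: one output per threshold, lowest first.

    Assumed constraint: each later value equals the previous value minus a
    non-negative (ReLU) amount, so values cannot increase with the threshold.
    """
    constrained = [raw[0]]
    for value in raw[1:]:
        constrained.append(constrained[-1] - max(value, 0.0))
    return [sigmoid(v) for v in constrained]


# Invented output-layer values for a single patient.
multinomial_probs = thresholds_from_multinomial([0.2, 1.0, 0.5, 0.1, -0.3, -0.8, -1.2])
ordinal_probs = thresholds_from_ordinal([2.0, 0.9, -0.4, 0.6, 0.3, 1.1])

for name, p_mn, p_or in zip(THRESHOLDS, multinomial_probs, ordinal_probs):
    print(f"Pr(GOSE {name}): multinomial={p_mn:.3f}, ordinal={p_or:.3f}")
```

In both cases the threshold probabilities are non-increasing by construction, so conditional probabilities between any lower and higher threshold remain between zero and one.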

As part of the Collaborative European NeuroTrauma Effectiveness Research in TBI (CENTER-TBI) project, we aim to address the challenges of ordinal outcome prediction. Our analyses cover a range of modelling strategies and predictors available within the first 24 hours of admission to the ICU.

Materials and methods

Study population and dataset

The study population was extracted from the ICU stratum of the core CENTER-TBI dataset (v3.0) using Opal database software [14]. The project objectives and experimental design of CENTER-TBI have been described in detail by Maas et al. [15] and Steyerberg et al. [16]. Study patients were prospectively recruited at one of 65 participating ICUs across Europe with the following eligibility criteria: admission to the hospital within 24 hours of injury, indication for CT scanning, and informed consent according to local and national requirements.

Per project protocol, each patient’s follow-up schedule included a GOSE assessment at 6 months post-injury, or, more precisely, within a window of 5–8 months post-injury. GOSE assessments were conducted using structured interviews [6] and patient/carer questionnaires [17] by the clinical research team of CENTER-TBI. The eight, ordinal scores of GOSE, representing the highest levels of functional recovery, are decoded in the heading of Table 1. Since patient/carer questionnaires do not distinguish vegetative patients (GOSE = 2) into a separate category, GOSE scores 2 and 3 (lower severe disability) were combined into one category (GOSE ∈ {2,3}) in our dataset. Of the 2,138 ICU patients in the CENTER-TBI dataset available for analysis, we excluded patients in the following order: (1) age less than 16 years at ICU admission (n = 82), (2) follow-up GOSE was unavailable (n = 283), and (3) ICU stay was less than 24 hours (n = 223). Our resulting sample size was n = 1,550. For 1,351 patients (87.2%), either the patient died during ICU stay (n = 205) or results from a GOSE evaluation at 5–8 months post-injury were available in the dataset (n = 1,146). For the remaining 199 patients (12.8%), GOSE scores were imputed using a Markov multi-state model based on the observed GOSE scores recorded at different timepoints between 2 weeks and one year post-injury [18]. A flow diagram for study inclusion and follow-up is provided in S1 Fig, and summary characteristics of the study population are detailed in Table 1.

Table 1. Summary characteristics of the study population at ICU admission stratified by ordinal 6-month outcomes.

https://doi.org/10.1371/journal.pone.0270973.t001

Repeated k-fold cross-validation

We used the ‘scikit-learn’ module (v0.23.2) [20] in Python (v3.7.6) to create 100 stratified partitions of our study population for repeated k-fold cross-validation (20 repeats, 5 folds). Within each of the partitions, approximately 80% of the population would constitute the training set (n ≈ 1,240 patients) and 20% of the population would constitute the corresponding testing set (n ≈ 310 patients). For parametric (i.e., deep learning) models, we implemented a stratified shuffle split on each of the 100 training sets to set 15% (n ≈ 186 patients) aside for validation and hyperparameter optimisation.
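The partitioning scheme can be illustrated without any dependencies. The sketch below is a plain-Python stand-in for a repeated stratified k-fold splitter, not the scikit-learn code used in the study; the function name, the random seed, and the toy outcome labels are assumptions made for illustration.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=20, seed=0):
    """Yield (train_idx, test_idx) pairs, reshuffling within outcome strata per repeat."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        offset = 0
        for label in sorted(by_label):
            members = by_label[label][:]
            rng.shuffle(members)
            # Deal patients round-robin so every fold receives a similar share
            # of each outcome category; the offset keeps fold sizes balanced.
            for position, idx in enumerate(members):
                folds[(position + offset) % n_splits].append(idx)
            offset += len(members)
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train_idx, test_idx


# Toy outcome labels: 1,550 patients spread over 7 outcome categories.
toy_labels = [i % 7 for i in range(1550)]
partitions = list(repeated_stratified_kfold(toy_labels))

print(len(partitions))                             # number of train/test partitions
print(len(partitions[0][0]), len(partitions[0][1]))  # training and testing set sizes
```

With 1,550 patients, 5 folds, and 20 repeats, this yields 100 partitions with 1,240 training and 310 testing patients each.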

Selection and preparation of concise predictor set

In selecting a concise predictor set, our primary aim was to find a small group of well-validated, widely measured clinical variables that are commonly used for TBI outcome prognosis in existing ICU practice. We selected the ten predictors from the extended IMPACT binary prediction model [8] for moderate-to-severe TBI–defined by a baseline Glasgow Coma Scale (GCS) [21, 22] score between 3 and 12, inclusive–to represent our concise set. While 26.6% of our study population falls outside this GCS range (Table 1), we find that the IMPACT predictor set is the most rigorously validated [23–27] baseline set available for the overall critically ill TBI population. The ten predictors, characterised in Table 2, are all measured within 24 hours of ICU admission and include demographic characteristics, clinical severity scores, CT characteristics, and laboratory measurements. The predictors as well as empirical justification for their inclusion in the IMPACT model have been described in detail [28]. In this manuscript, each of the models trained on the IMPACT predictor set is denoted as a concise-predictor-based model (CPM).

Table 2. Concise baseline predictors of the study population stratified by ordinal 6-month outcomes.

https://doi.org/10.1371/journal.pone.0270973.t002

Seven of the concise predictors had missing values for some of the patients in our study population (S2 Fig). In each repeated cross-validation partition, we trained an independent, stochastic predictive mean matching imputation function on the training set and imputed all missing values across both sets using the ‘mice’ package (v3.9.0) [30] in R (v4.0.0) [31]. The result was a multiply imputed (m = 100) dataset with a unique imputation per partition, allowing us to simultaneously account for the variability due to resampling and the variability due to missing value imputation during repeated cross-validation.

Prior to the training of CPMs, the multi-categorical variables (i.e., GCSm, Marshall CT, and unreactive pupils in Table 2) were one-hot encoded and the continuous variables (i.e., age, glucose, and haemoglobin) were standardised based on the mean and standard deviation of each training set with the ‘scikit-learn’ module in Python.
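The key point of this preparation step is that the scaling statistics come from the training set only and are then reused on the testing set. A minimal dependency-free sketch follows; the function names and example values are hypothetical, and the population standard deviation is used on the assumption that it mirrors the default behaviour of scikit-learn's standard scaler.

```python
import statistics


def fit_standardiser(train_values):
    """Learn the mean and (population) standard deviation from training data only."""
    return statistics.mean(train_values), statistics.pstdev(train_values)


def standardise(values, mean, sd):
    """Apply previously learned statistics to any set of values."""
    return [(v - mean) / sd for v in values]


def one_hot(value, categories):
    """Encode a multi-categorical value as a list of 0/1 indicators."""
    return [1 if value == category else 0 for category in categories]


# Invented ages for one training set and its corresponding testing set.
train_age = [20.0, 40.0, 60.0]
test_age = [40.0, 70.0]

age_mean, age_sd = fit_standardiser(train_age)
train_age_z = standardise(train_age, age_mean, age_sd)
test_age_z = standardise(test_age, age_mean, age_sd)  # reuses training statistics

# Hypothetical motor score categories 1 to 6.
motor_categories = [1, 2, 3, 4, 5, 6]
motor_encoded = one_hot(5, motor_categories)

print(test_age_z)
print(motor_encoded)
```

Fitting the statistics separately within each cross-validation partition avoids leaking information from the testing set into the training process.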

Selection of concise-predictor-based models (CPMs)

We tested four CPM types, each denoted by a subscript: (1) multinomial logistic regression (CPMMNLR), (2) proportional odds (i.e., ordinal) logistic regression (CPMPOLR), (3) class-weighted feedforward neural network with a multinomial (i.e., softmax) output layer (CPMDeepMN), and (4) class-weighted feedforward neural network with an ordinal (i.e., constrained sigmoid at each threshold) output layer (CPMDeepOR). These models were selected because, in the setting of ordinal GOSE prediction, we wished to compare the performance of: (1) nonparametric logistic regression models (CPMMNLR and CPMPOLR) to nonlinear, parametric deep learning networks (CPMDeepMN and CPMDeepOR), and (2) multinomial outcome encoding (CPMMNLR and CPMDeepMN) to ordinal outcome encoding (CPMPOLR and CPMDeepOR). Each of these model types returns a predicted probability for each of the GOSE thresholds at 6 months post-injury from the concise set of predictors (Fig 1A). A detailed explanation of CPM architectures, hyperparameters for the parametric CPMs, loss functions, and optimisation algorithms is provided in S1 Appendix.

CPMBest denotes the optimal CPM for a given performance metric in the Results. CPMMNLR and CPMPOLR were implemented with the ‘statsmodels’ module (dev. v0.14.0) [32] in Python, and CPMDeepMN and CPMDeepOR were implemented with the ‘PyTorch’ (v1.10.0) [33] module in Python.

Design of all-predictor-based models (APMs)

In contrast to the CPMs, we designed and trained prediction models on all baseline (i.e., available to ICU clinicians at 24 hours post-admission) clinical information (excluding high-resolution data such as full brain images or physiological waveforms) in the CENTER-TBI database. Each of these models is designated as an all-predictor-based model (APM).

For our study population, there are 1,151 predictors [34], each being in one of the 14 categories listed in Table 3, with variable levels of missingness and frequency per patient. This information also includes 81 predictors denoting treatments or interventions within the first 24 hours of ICU care (e.g., type and dose of medication administered) and 76 predictors denoting the explicit impressions or rationales of ICU physicians (e.g., reason for surgical intervention and expected prognosis with or without surgery).

Table 3. Predictor baseline tokens per patient in the CENTER-TBI dataset.

https://doi.org/10.1371/journal.pone.0270973.t003

To prepare this information into a suitable format for training APMs, we tokenised and embedded heterogeneous patient data [35] in a process visualised in Fig 2. Predictor tokens were constructed in one of the following ways: (1) for categorical predictors, a token was constructed by concatenating the predictor name and value, e.g., ‘GCSTotalScore_04,’ (2) for continuous predictors, a token was constructed by learning the distribution of that predictor from the training set and discretising into 20 quantile bins, e.g., ‘SystolicBloodPressure_BIN17,’ (3) for text-based entries, we removed all special characters, spaces, and capitalisation from the text and appended the unformatted text to the predictor name, e.g., ‘InjuryDescription_skullfracture,’ and (4) for missing values, a separate token was created to designate missingness, e.g., ‘PriorMedications_NA’ (Fig 2A). The unique tokens from a patient’s first 24 hours of ICU stay made up his or her individual predictor set, and the median number of unique tokens (excluding missing value tokens) per patient per predictor category is provided in Table 3. Notably, this process does not require any data cleaning, missing value imputation, outlier removal, or domain-specific knowledge for a large set of variables and imposes no constraints on the number or type of predictors per patient [35]. Additionally, by including missing value tokens, models can discover meaningful patterns of missingness if they exist [36].
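The four token construction rules can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: the function names, the zero-based bin numbering, the two-digit formatting of integer categories, and the toy blood pressure values are all hypothetical. The placeholder entry at index 0 follows the trained token dictionary described in Fig 2A.

```python
import bisect
import re


def fit_bin_edges(train_values, n_bins=20):
    """Learn interior quantile cut points from training values only."""
    ordered = sorted(train_values)
    n = len(ordered)
    return [ordered[min(n - 1, (i * n) // n_bins)] for i in range(1, n_bins)]


def tokenise(name, value, kind, bin_edges=None):
    """Build one token from a predictor name and its value."""
    if value is None:
        return f"{name}_NA"  # missingness gets its own token
    if kind == "categorical":
        label = f"{value:02d}" if isinstance(value, int) else str(value)
        return f"{name}_{label}"
    if kind == "continuous":
        bin_index = bisect.bisect_right(bin_edges, value)  # assumed 0-based bins
        return f"{name}_BIN{bin_index:02d}"
    if kind == "text":
        cleaned = re.sub(r"[^a-z0-9]", "", value.lower())
        return f"{name}_{cleaned}"
    raise ValueError(f"unknown predictor kind: {kind}")


def build_dictionary(train_tokens):
    """Index training tokens; index 0 is reserved for unseen tokens."""
    vocab = {"<unrecognised>": 0}
    for token in sorted(set(train_tokens)):
        vocab[token] = len(vocab)
    return vocab


# Toy training distribution for systolic blood pressure (invented values).
sbp_edges = fit_bin_edges(list(range(80, 180)))

tokens = [
    tokenise("GCSTotalScore", 4, "categorical"),
    tokenise("SystolicBloodPressure", 167, "continuous", sbp_edges),
    tokenise("InjuryDescription", "Skull fracture!", "text"),
    tokenise("PriorMedications", None, "categorical"),
]
vocab = build_dictionary(tokens)
unseen_index = vocab.get("Age_BIN03", 0)  # falls back to the placeholder

print(tokens)
print(unseen_index)
```

Because the bin edges and the dictionary are learned from the training set alone, any token that first appears in the testing set maps to the placeholder index.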

Fig 2. Tokenisation and embedding procedure for the development of ordinal all-predictor-based models (APMs).

ICU = intensive care unit. ER = emergency room. Hx = history. SES = socioeconomic status. CSF = cerebrospinal fluid. GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. (A) Process of converting all clinical information, from the first 24 hours of each patient, into an indexed dictionary of tokens during model training. The tokenisation process is illustrated with three example predictors and their associated values in step 2. The first entry in the trained token dictionary (‘0) <unrecognised>‘) of step 3 is a placeholder token for any tokens encountered in the testing set that were not seen in the training set. (B) Visual representation of token embedding and significance-weighted averaging pipeline during APM prediction runs. After tokenising an individual patient’s clinical information, the vector of tokens is converted to a vector of the indices corresponding to each token in the trained token dictionary. The corresponding vectors and significance weights of the indices are extracted to weight-average the patient information into a single vector. The embedding layer and significance weights are learned through stochastic gradient descent during model training, and significance weights are constrained to be positive with an exponential function. While not explicitly shown, the weighted vectors are divided by the number of vectors during weight-averaging. The individual, weight-averaged vector then feeds into an ordinal prediction model to return probabilities at each GOSE threshold. The ordinal prediction model could either have multinomial output encoding (APMMN) or ordinal outcome encoding (APMOR), as represented in Fig 1A.

https://doi.org/10.1371/journal.pone.0270973.g002

Taking inspiration from natural language processing in artificial intelligence (AI) [37, 38], all the predictor tokens from the training set (excluding the validation set) are used to construct a token dictionary. APMs learn a lower-dimensional vector as well as a positive significance weight for each entry in the dictionary during training. The vectors for each of the tokens of a single patient are significance-weight-averaged into a single vector which is then fed into a class-weighted feedforward neural network (Fig 2B). If the neural network has no hidden layers, then the APM is analogous to logistic regression, while if it does have hidden layers, the APM corresponds to deep learning. In this work, we train APMs with one of two kinds of output layers: multinomial, i.e., softmax, (APMMN), or ordinal, i.e., constrained sigmoid at each GOSE threshold, (APMOR). Both model types output a predicted probability for each of the GOSE thresholds at 6 months post-injury. A detailed explanation of APM architectures, hyperparameters, loss functions, and optimisation algorithms is provided in S2 Appendix.
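The significance-weighted averaging step can be written out in plain Python to show the arithmetic. This is a sketch only: in the study the embedding vectors and weights are learned with PyTorch, whereas here they are fixed, invented numbers, and `embed_patient` is a hypothetical helper name. Following the Fig 2 legend, weights are made positive with an exponential function and the weighted sum is divided by the number of token vectors.

```python
import math


def embed_patient(token_indices, embeddings, log_weights):
    """Combine a patient's token vectors into one significance-weighted average."""
    if not token_indices:
        raise ValueError("a patient needs at least one token")
    dim = len(embeddings[0])
    total = [0.0] * dim
    for idx in token_indices:
        weight = math.exp(log_weights[idx])  # exponential keeps the weight positive
        for d in range(dim):
            total[d] += weight * embeddings[idx][d]
    # Divide by the number of token vectors, not by the sum of weights.
    return [component / len(token_indices) for component in total]


# Invented two-dimensional embeddings for a three-entry token dictionary.
embeddings = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
log_weights = [0.0, 0.0, math.log(2.0)]  # weights of 1, 1, and 2 after exponentiation

patient_vector = embed_patient([1, 2], embeddings, log_weights)
print(patient_vector)
```

The resulting fixed-length vector is what the feedforward network receives, regardless of how many tokens the patient has.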

APMBest denotes the optimal APM for a given performance metric in the Results. APMMN and APMOR were implemented with the ‘PyTorch’ module in Python.

Predictor importance in all-predictor-based models (APMs)

The relative importance of predictor tokens in the trained APMs was measured with absolute Shapley additive explanation (SHAP) [39] values, which, in our case, can be interpreted as the magnitude of the relative contribution of a token towards a model output for a single patient. For APMMN, this corresponds to the predictor contributions towards each node (after softmax transformation, Fig 1A) corresponding to the probability at a GOSE score. For APMOR, this corresponds to the predictor contributions towards each node (after sigmoid transformation, Fig 1A) corresponding to the probability at a GOSE threshold. Absolute SHAP values were measured for each patient in the testing set of every repeated cross-validation partition, and we averaged these values over the partitions to derive our individualised importance scores per token. These scores were averaged, once again, over the entire patient set to calculate the mean absolute SHAP values of each token. Finally, to derive importance scores for each predictor, we calculated the maximum of the mean absolute SHAP values of the possible tokens from the predictor.
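The aggregation from patient-level values to predictor-level importance can be sketched as three steps. The code below is illustrative: the absolute SHAP values are invented, the function and variable names are hypothetical, and averaging only over patients who carry a given token is an assumption about how the patient-level mean was taken.

```python
from collections import defaultdict
from statistics import mean


def predictor_importance(abs_shap, token_to_predictor):
    """Aggregate absolute SHAP values to one importance score per predictor.

    abs_shap maps (patient, token) to a list of absolute SHAP values, one for
    each partition in which that patient was in the testing set.
    """
    per_token = defaultdict(list)
    for (patient, token), values in abs_shap.items():
        per_token[token].append(mean(values))  # step 1: average over partitions

    token_score = {token: mean(scores) for token, scores in per_token.items()}  # step 2: over patients

    importance = defaultdict(float)
    for token, score in token_score.items():
        predictor = token_to_predictor[token]
        importance[predictor] = max(importance[predictor], score)  # step 3: max over tokens
    return dict(importance)


# Invented absolute SHAP values for two patients and three tokens.
abs_shap = {
    ("p1", "Age_BIN03"): [0.10, 0.30],
    ("p2", "Age_BIN03"): [0.20],
    ("p1", "Age_BIN17"): [0.50],
    ("p2", "GCSTotalScore_04"): [0.40, 0.20],
}
token_to_predictor = {
    "Age_BIN03": "Age",
    "Age_BIN17": "Age",
    "GCSTotalScore_04": "GCSTotalScore",
}

importance = predictor_importance(abs_shap, token_to_predictor)
print(importance)
```

Taking the maximum over a predictor's tokens means a predictor is ranked by its single most influential value rather than by an average over all of its possible values.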

Selection and preparation of extended concise predictor set

We selected a small set of the most important APM predictors by mean absolute SHAP values to add to the concise predictor set and observe the change in model performance. Since the concise predictor set does not include any information on intervention decisions or physician impressions from the first day, we did not consider these predictor types. Moreover, for every multi-categorical predictor selected, we examined the mean absolute SHAP values of each of the predictor’s possible tokens to determine which of the categories should be explicitly encoded (e.g., including 10 categories for employment status or just one indicator variable for retirement). The extended concise predictor set, including the 10 original concise predictors and the 8 added predictors, in our study population is listed and characterised in S1 Table. Each of the models trained on the concise set with these variables added is denoted as an extended concise-predictor-based model (eCPM).

The process of multiple imputation (m = 100), one-hot encoding, and standardisation of the extended concise predictor set was identical to that of the concise predictor set, as described earlier.

Selection of extended concise-predictor-based models (eCPMs)

The four eCPM model types we tested are identical to the four CPM model types, as described earlier and in S1 Appendix, but trained on the extended concise predictor set: (1) multinomial logistic regression (eCPMMNLR), (2) proportional odds (i.e., ordinal) logistic regression (eCPMPOLR), (3) class-weighted feedforward neural network with a multinomial (i.e., softmax) output layer (eCPMDeepMN), and (4) class-weighted feedforward neural network with an ordinal (i.e., constrained sigmoid at each threshold) output layer (eCPMDeepOR).

eCPMBest denotes the optimal eCPM for a given performance metric in the Results.

Assessment of model discrimination and calibration

All model metrics, curves, and associated confidence intervals (CI) were calculated from testing set predictions using the repeated Bootstrap Bias Corrected Cross-Validation (BBC-CV) method [40] with 1,000 resamples of unique patients for bootstrapping. The collection of metrics from the bootstrapped testing set resamples for each model then formed our unbiased estimation distribution for statistical inference (i.e., CI).
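The resampling idea behind these confidence intervals can be illustrated with a simple percentile bootstrap over unique patients. This sketch is a deliberate simplification and does not reproduce the full BBC-CV procedure [40]; the function name, the toy accuracy metric, and the seed are assumptions.

```python
import random


def bootstrap_ci(patient_ids, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling patients with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        sample = [rng.choice(patient_ids) for _ in patient_ids]
        estimates.append(metric(sample))
    estimates.sort()
    lower_idx = int(n_resamples * alpha / 2)
    upper_idx = n_resamples - 1 - lower_idx
    return estimates[lower_idx], estimates[upper_idx]


# Toy testing set predictions: three in four patients are classified correctly.
correct = {f"patient_{i}": (i % 4 != 0) for i in range(200)}


def accuracy(sample):
    """Toy metric: proportion of sampled patients classified correctly."""
    return sum(correct[patient] for patient in sample) / len(sample)


lower, upper = bootstrap_ci(list(correct), accuracy)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

Resampling whole patients, rather than individual predictions, keeps all of a patient's predictions together in each resample.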

In this work, we assess model discrimination performance (i.e., how well do the models separate patients with different GOSE scores?) and probability calibration (i.e., how reliable are the predicted probabilities at each threshold?). The metrics and visualisations are explained in detail, with mathematical derivation and intuitive examples, in S3 Appendix. In this section, we will only list the metrics, their interpretations, and their range of feasible values. Feasible values range from the value corresponding to no model information or random guessing (i.e., the no information value [NIV]) to the value corresponding to ideal model performance (i.e., the full information value [FIV]).

Our primary metric of model discrimination performance is the ordinal c-index (ORC) [13]. ORC has two interpretations: (1) the probability that a model correctly separates two patients with two randomly chosen GOSE scores and (2) the average proportional closeness between a model’s functional outcome ranking of a set of patients (which includes one randomly chosen patient from each possible GOSE score) to their true functional outcome ranking. In addition, we calculate Somers’ Dxy [41, 42], which is interpreted as the proportion of ordinal variation in GOSE that can be explained by the variation in model output. Our final metrics of model discrimination are dichotomous c-indices (i.e., AUC) at each threshold of GOSE. Each is interpreted as the probability of a model correctly discriminating a patient with GOSE above the threshold from one with GOSE below. The ranges of feasible values for the discrimination metrics are: NIV = 0.5 to FIV = 1 for ORC, NIV = 0 to FIV = 1 for Somers’ Dxy, and NIV = 0.5 to FIV = 1 for each dichotomous c-index. ORC is the only discrimination metric that is independent of the sample prevalence of each GOSE category [13].
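A small worked example may help with the first interpretation of ORC. The sketch below computes ORC as the unweighted mean of pairwise c-indices over all pairs of outcome categories, and Somers’ Dxy from concordant and discordant patient pairs. It assumes each patient has a single scalar ranking score, which is a simplification; the function names and toy data are hypothetical, and the exact definitions used in the study are those of S3 Appendix.

```python
from itertools import combinations


def pairwise_c(scores_low, scores_high):
    """Probability that a higher-category patient outranks a lower-category one."""
    wins = 0.0
    for low in scores_low:
        for high in scores_high:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5  # ties count as half
    return wins / (len(scores_low) * len(scores_high))


def ordinal_c_index(scores, outcomes):
    """Unweighted mean of pairwise c-indices across all pairs of categories."""
    by_category = {}
    for score, outcome in zip(scores, outcomes):
        by_category.setdefault(outcome, []).append(score)
    pairs = list(combinations(sorted(by_category), 2))
    return sum(pairwise_c(by_category[a], by_category[b]) for a, b in pairs) / len(pairs)


def somers_dxy(scores, outcomes):
    """(concordant - discordant) / number of pairs with different outcomes."""
    concordant = discordant = comparable = 0
    for (s1, y1), (s2, y2) in combinations(zip(scores, outcomes), 2):
        if y1 == y2:
            continue
        comparable += 1
        product = (s1 - s2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / comparable


# Toy data: two patients in each of three outcome categories.
toy_outcomes = [1, 1, 3, 3, 4, 4]
toy_scores = [0.1, 0.4, 0.3, 0.6, 0.5, 0.9]

orc = ordinal_c_index(toy_scores, toy_outcomes)
dxy = somers_dxy(toy_scores, toy_outcomes)
print(f"ORC = {orc:.3f}, Somers' Dxy = {dxy:.3f}")
```

Because every pair of categories contributes equally to the mean, the ORC in this sketch does not depend on how many patients fall into each category, whereas the pair counts in Somers’ Dxy do.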

To assess the calibration of predicted probabilities at each GOSE threshold, we use the logistic recalibration framework [43] to measure calibration slope [44]. A calibration slope less than one indicates overfitting (i.e., high predicted probabilities are overestimated while low predicted probabilities are underestimated) while a calibration slope greater than one indicates underfitting [45]. We also examine smoothed probability calibration curves [46] to detect miscalibrations that may be overlooked by the logistic recalibration framework [45]. The ideal calibration curve is a diagonal line with slope one and y-intercept 0 while one indicative of random guessing would be a horizontal line with a y-intercept at the proportion of the study population above the given threshold. We accompany each calibration curve with the integrated calibration index (ICI) [47], which is the mean absolute error between the smoothed and the ideal calibration curves, to aid comparison of curves across model types. For the ICI, FIV = 0, but NIV varies based on the outcome distribution at each threshold (S3 Appendix).
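The calibration slope can be estimated by regressing the observed outcome on the logit of the predicted probability. The sketch below fits that two-parameter logistic model with Newton-Raphson iterations in plain Python; it illustrates the general logistic recalibration idea and is not the study's code. The function names, iteration count, and toy data are assumptions. For a deterministic demonstration, the "observed" values are expected outcome frequencies rather than 0/1 outcomes.

```python
import math


def logit(p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)  # guard against probabilities of exactly 0 or 1
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def calibration_slope(predicted, observed, n_iter=25):
    """Fit observed ~ sigmoid(a + b * logit(predicted)); return (a, b)."""
    x = [logit(p) for p in predicted]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, observed):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu          # gradient with respect to the intercept
            g1 += (yi - mu) * xi   # gradient with respect to the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Toy overfitted model: predictions are more extreme than the true frequencies.
predicted = [i / 20 for i in range(1, 20)]
observed = [sigmoid(0.5 * logit(p)) for p in predicted]

intercept, slope = calibration_slope(predicted, observed)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In this toy case the fitted slope is close to 0.5, below one, matching the overfitting pattern described above in which high predictions are too high and low predictions are too low.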

All metrics were calculated using the ‘scikit-learn’ and ‘SciPy’ (v1.6.2) [48] modules in Python and figures were plotted using the ‘ggplot2’ package (v3.3.2) [49] in R.

Computational resources

All computational and statistical components of this work were performed in parallel on the Cambridge Service for Data Driven Discovery (CSD3) high performance computer, operated by the University of Cambridge Research Computing Service (http://www.hpc.cam.ac.uk). The training of each APM was accelerated with graphical processing units and the ‘PyTorch Lightning’ (v1.5.0) [50] module. The training of all parametric models (CPMDeepMN, CPMDeepOR, APMMN, APMOR, eCPMDeepMN, and eCPMDeepOR) was made more efficient by dropping out consistently underperforming parametric configurations, on the validation sets, with the Bootstrap Bias Corrected with Dropping Cross-Validation (BBCD-CV) method [40] with 1,000 resamples of unique patients. The results of hyperparameter optimisation are detailed in S4 Appendix.

Results

CPM and APM discrimination performance

The discrimination performance metrics for each CPM are listed in S2 Table. Deep learning models (CPMDeepMN and CPMDeepOR) made no significant improvement (based on 95% CI) over logistic regression models (CPMMNLR and CPMPOLR). The only significant difference in discrimination among the model types was observed in CPMDeepOR, which had a significantly lower ORC and Somers’ Dxy than the other models. The discrimination performance metrics for each APM are listed in S3 Table. APMMN had a significantly higher ORC, Somers’ Dxy, and dichotomous c-indices at lower GOSE thresholds (i.e., GOSE > 1 and GOSE > 3) than did APMOR. Moreover, in S4 Appendix, we see that the best-performing parametric configurations of APMMN did not contain additional hidden layers between the token embedding and output layers. Our results of performance within predictor sets consistently demonstrate that increasing analytical complexity, in terms of using deep learning (for CPMs) or adding hidden network layers (for APMs), did not improve discrimination of outcomes. In the case of deep learning models, multinomial outcome encoding significantly outperformed ordinal outcome encoding (Fig 1A).

The discrimination performance metrics of the best-performing CPMs (CPMBest), compared with those of the best-performing APMs (APMBest), are listed in Table 4. In contrast to the case of analytical complexity, we observe that expanding the predictor set yielded a significant improvement in ORC, Somers’ Dxy, and each threshold-level dichotomous c-index except for those of the highest GOSE thresholds (i.e., GOSE > 6 and GOSE > 7). On average, models trained on the concise predictor set (CPMs) correctly separated two randomly selected patients from two randomly selected GOSE categories 70% (95% CI: 68%–71%) of the time, while models trained on all baseline predictors (APMs) in the CENTER-TBI dataset did so 76% (95% CI: 74%–77%) of the time. These percentages also correspond to the average proportional closeness of predicted rankings to true GOSE rankings of patient sets. The model outputs of CPMBest explained 44% (95% CI: 41%–48%) of the ordinal variation in GOSE, while those of APMBest explained 57% (95% CI: 54%–60%). At increasing GOSE thresholds, the dichotomous c-indices of CPMBest and APMBest, as well as the gap between them, consistently decreased (Table 4). This signifies that predicting higher 6-month functional outcomes is more difficult than predicting lower 6-month functional outcomes. Moreover, the gains in discrimination earned from expanding the predictor set mostly come from improved performance at lower GOSE thresholds (i.e., predicting survival, return of consciousness, or recovery of functional independence).
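The interpretation of ORC given above (the chance of correctly separating two randomly selected patients from two randomly selected GOSE categories) can be sketched as an unweighted average of pairwise c-indices over all pairs of outcome categories. The sketch below assumes that each patient has a single scalar ranking score (for instance, an expected outcome derived from the model output); that assumption, the function names `pairwise_c_index` and `ordinal_c_index`, and the toy values are illustrative only, and the exact definition used in the study is given in S3 Appendix.

```python
from itertools import combinations


def pairwise_c_index(scores_low, scores_high):
    """Chance that a patient from the higher category gets the higher score.

    Tied scores count as half a correct separation.
    """
    wins = 0.0
    for lo in scores_low:
        for hi in scores_high:
            if hi > lo:
                wins += 1.0
            elif hi == lo:
                wins += 0.5
    return wins / (len(scores_low) * len(scores_high))


def ordinal_c_index(scores, categories):
    """Unweighted mean of pairwise c-indices over all pairs of categories."""
    by_category = {}
    for score, category in zip(scores, categories):
        by_category.setdefault(category, []).append(score)
    levels = sorted(by_category)  # lower category first in every pair
    pair_values = [
        pairwise_c_index(by_category[low], by_category[high])
        for low, high in combinations(levels, 2)
    ]
    return sum(pair_values) / len(pair_values)


# Hypothetical toy example: four patients in three outcome categories.
toy_scores = [0.1, 0.9, 0.5, 0.7]
toy_categories = [1, 1, 2, 3]
print(round(ordinal_c_index(toy_scores, toy_categories), 3))
```

Because every pair of categories is weighted equally, rare outcome categories influence this average as much as common ones, which is one reason it can differ from a prevalence-weighted rank correlation.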

Table 4. Best ordinal model discrimination and calibration performance per predictor set.

https://doi.org/10.1371/journal.pone.0270973.t004

CPM and APM calibration performance

The calibration slopes and calibration curves for each CPM are displayed in S2 Table and S3 Fig, respectively. Both logistic regression CPMs (CPMMNLR and CPMPOLR) are significantly overfitted at the three highest GOSE thresholds (i.e., GOSE > 5, GOSE > 6, and GOSE > 7). The graphical calibration of CPMDeepOR was significantly worse than that of the other CPMs (S3 Fig). The calibration slopes and calibration curves for each APM are displayed in S3 Table and S4 Fig, respectively. APMOR is poorly calibrated at each threshold of GOSE. APMMN is significantly overfitted at the three highest GOSE thresholds (i.e., GOSE > 5, GOSE > 6, and GOSE > 7).

The calibration slopes and calibration curves for the best-calibrated CPMs (CPMBest), compared against those for the best-calibrated APMs (APMBest), are displayed in Table 4 and Fig 3, respectively. Unlike CPMBest, APMBest could not avoid significant overfitting at the three highest GOSE thresholds (i.e., GOSE > 5, GOSE > 6, and GOSE > 7). At these thresholds, we observe that the calibration curve of APMBest significantly veered off the diagonal line of ideal calibration for higher predicted probabilities. However, due to the relative infrequency of these predictions (comparative histograms in Fig 3), the ICI of APMBest is not significantly higher than that of CPMBest. Our results suggest that APMBest requires more patients with higher functional outcomes, in both the training and validation sets, to mitigate overfitting [45].

Fig 3. Ordinal calibration curves of best-performing concise-predictor-based model (CPMBest) and best-performing all-predictor-based model (APMBest).

GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. In each panel, a comparative histogram (200 uniform bins), centred at a horizontal line in the bottom quarter, displays the distribution of predicted probabilities for CPMBest (above the line) and APMBest (below the line) at the given GOSE threshold. CPMBest and APMBest correspond to the CPM (S2 Table) and APM (S3 Table), respectively, with the lowest unweighted average of integrated calibration indices (ICI) across the thresholds. Shaded areas are 95% confidence intervals derived using bias-corrected bootstrapping (1,000 resamples) to represent the variation across repeated k-fold cross-validation folds (20 repeats of 5 folds) and, for CPMBest, 100 missing value imputations. The values in each panel correspond to the mean ICI (95% confidence interval) at the given threshold. The diagonal dashed line represents the line of perfect calibration (ICI = 0).

https://doi.org/10.1371/journal.pone.0270973.g003

Predictor importance

Given that APMMN significantly outperforms APMOR in discrimination and calibration, we focus the assessment of predictor importance on APMMN. A bar plot of the mean absolute SHAP values associated with the 15 most important predictors in APMMN is provided in Fig 4. We find that the subjective early prognoses of ICU physicians had the greatest contribution towards APMMN predictions, particularly for the prediction of death (GOSE = 1) within 6 months. Initially, this result (along with the high contribution of other physician impressions) seems to suggest that integration of a physician’s interpretations of a patient’s baseline status may add important prognostic information. These impressions likely summarise information from a variable number of other predictors along with the physician’s own experience-based judgement, resulting in high prediction contributions. However, inclusion of these variables may result in problematic self-fulfilling prophecies [51]. For instance, a physician’s poor prognosis directly influences WLSM, which was instituted in 144 (70.2%) of the 205 patients who died in the ICU [52]. Including a variable for physician prognosis may then negatively bias the outcome prediction and unduly promote WLSM. Therefore, we do not consider physician impression predictors for our extended concise predictor set. We also observe that ‘age at admission’ was the only concise predictor among the 15 most important ones. The importance ranks (out of 1,151) of the concise predictors (Table 2) are: age = 5th, glucose = 23rd, Marshall CT = 25th, pupillary reactivity = 29th, GCSm = 42nd, haemoglobin = 50th, hypoxia = 284th, tSAH = 301st, EDH = 414th, and hypotension = 420th. The eight remaining predictors of the top 15 (Fig 4) were added to the concise predictor set to form our extended concise predictor set. Within the tokens for “employment status before injury,” we found that the single token indicating retirement was much more important than the others. Thus, instead of encoding all 10 options for employment status, we included a single indicator variable for retirement in our extended concise predictor set. The eight added predictors included 2 demographic variables (retirement status and highest level of formal education), 4 protein biomarker concentrations (neurofilament light chain [NFL], glial fibrillary acidic protein [GFAP], total tau protein [T-tau], and S100 calcium-binding protein B [S100B]), and 2 clinical assessment variables (worst abbreviated injury score [AIS] among head, neck, brain, and cervical spine injuries and incidence of post-traumatic amnesia at ICU admission). The extended concise predictor set, including the ten original concise predictors and the eight added predictors, is statistically characterised in S1 Table.

Fig 4. Mean absolute shapley additive explanation (SHAP) values of most important predictors for multinomial-encoding all-predictor-based model (APMMN).

ICU = intensive care unit. ER = emergency room. CT = computerised tomography. GOS = Glasgow Outcome Scale (not extended). UO = unfavourable outcome, defined by functional dependence (i.e., GOSE ≤ 4). AIS = Abbreviated Injury Scale. GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. CPM = predictors that are included in the original concise predictor set. eCPM = predictors that are added to the original concise predictor set to form the extended concise predictor set. The mean absolute SHAP value is interpreted as the average magnitude of the relative additive contribution of a predictor’s most important token towards the predicted probability at each GOSE score for a single patient. Predictor types are denoted by the coloured boundary around predictor names. Physician impression predictors denote predictors that encode the explicit impressions or rationales of ICU physicians and are not considered for the extended concise predictor set.

https://doi.org/10.1371/journal.pone.0270973.g004

A bar plot of the mean absolute SHAP values of APMMN for each of the five folds of the first repeat is provided in S5 Fig. Most of the eight added predictors, along with age at admission, are consistently represented among the most important predictors across the five folds.

eCPM discrimination and calibration

The discrimination and calibration metrics for the best-performing extended-predictor-based model (eCPMBest) are listed in Table 4. Inclusion of the eight selected predictors accounted for about half of the gains in discrimination performance achieved by APMBest over CPMBest according to ORC, Somers’ Dxy, and the dichotomous c-indices. Based on the difference in Somers’ Dxy, the eight added predictors allowed models to explain an additional 6% of the ordinal variation in GOSE at 6 months post-injury. Unlike APMBest, eCPMBest is not significantly overfitted at any threshold. The calibration curves of eCPMs (S6 Fig) are largely similar to those of the corresponding CPMs (S3 Fig), except at the highest threshold (i.e., GOSE > 7). Similar to those of APMMN, the calibration curves of eCPMs veer off the line of ideal calibration at higher predicted probabilities of GOSE > 7. The eCPM results support the finding that discrimination performance can be improved with the expansion of the predictor set. Furthermore, by limiting the number of added predictors and the analytical complexity of the model, eCPM avoided the significant miscalibration of APM at higher thresholds.

The discrimination and calibration metrics for each eCPM are listed in S4 Table.

Discussion

To our knowledge, this is the most comprehensive evaluation of early ordinal outcome prognosis for critically ill TBI patients. Our analysis cross-compares a range of ordinal prediction modelling strategies with a large range of available baseline predictors to determine the relative contribution of each towards model performance. Employing an AI tokenisation and embedding technique, we develop highly flexible ordinal prediction models that can learn from the entire, heterogeneous set of 1,151 predictors, available within the first 24 hours of ICU stay, in the CENTER-TBI dataset. This information includes not only all baseline clinical data currently deemed significant for ICU care of TBI but also advanced sub-study results (e.g., protein biomarkers, central haemostatic markers, genetic markers, and advanced MRI results) that represent the experimental frontier of clinical TBI assessment [1, 15, 16]. Therefore, our work reveals the interpretable limits of baseline ordinal, 6-month GOSE prediction in the ICU at this time.

Our key finding is that augmenting the baseline predictor set was much more relevant for improving ordinal model prediction performance than was increasing analytical complexity with deep learning. Within a given predictor set, artificial neural networks did not perform better than logistic regression models (S2 and S4 Tables), and APMs with additional hidden layers did not perform better than those without (S4 Appendix). This result is consistent with findings in the binary prediction case [53]. On the other hand, augmenting the predictor set, from CPM to APM, substantially improved ordinal discrimination (ORC: +8.6%, Table 4) and prediction at lower GOSE thresholds (e.g., GOSE > 1 c-index: +8.4%, Table 4). Adding just eight predictors to the concise predictor set accounted for about half of the gains in discrimination. However, the addition of predictors negatively affected model calibration, particularly at higher GOSE thresholds (Fig 3, Table 4). This result underlines the need for careful consideration of probability calibration during model development (e.g., recalibrating with isotonic regression to mitigate overfitting).
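To make the isotonic-regression suggestion concrete, the following minimal sketch fits a non-decreasing mapping from predicted probabilities to observed outcome frequencies with the pool-adjacent-violators algorithm and then applies it to new predictions by linear interpolation. It was not part of this study's pipeline; the function names `fit_isotonic` and `recalibrate`, the interpolation choice, and the toy data are assumptions for illustration. In practice, such a mapping would normally be fitted on data held out from model training.

```python
from bisect import bisect_left


def fit_isotonic(pred_probs, outcomes):
    """Pool-adjacent-violators fit of outcomes on predicted probabilities.

    Returns (xs, ys): sorted unique predictions and non-decreasing fitted values.
    """
    sums, counts = {}, {}
    for p, y in zip(pred_probs, outcomes):  # pool tied predictions first
        sums[p] = sums.get(p, 0.0) + y
        counts[p] = counts.get(p, 0) + 1
    xs = sorted(sums)
    blocks = []  # each block: [sum of outcomes, n observations, n unique xs]
    for x in xs:
        blocks.append([sums[x], counts[x], 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, n, k = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] += k
    ys = []
    for s, n, k in blocks:
        ys.extend([s / n] * k)
    return xs, ys


def recalibrate(p, xs, ys):
    """Map a new prediction through the fitted curve (linear interpolation)."""
    if p <= xs[0]:
        return ys[0]
    if p >= xs[-1]:
        return ys[-1]
    j = bisect_left(xs, p)  # xs[j - 1] < p <= xs[j]
    t = (p - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + t * (ys[j] - ys[j - 1])


# Hypothetical toy data: predictions and observed binary outcomes.
toy_preds = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
toy_outcomes = [0, 1, 0, 0, 1, 1]
xs, ys = fit_isotonic(toy_preds, toy_outcomes)
print([round(v, 3) for v in ys])
```

The fitted values never decrease with the predicted probability, so the ranking of patients (and hence discrimination) is preserved up to ties while the probabilities themselves are adjusted.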

At the same time, our results also indicate that ordinal early outcome prognosis for critically ill TBI patients is limited in capability. The best-performing model, which learns from all baseline information in the CENTER-TBI dataset, can only correctly discriminate two randomly chosen patients with two randomly chosen GOSE scores 76% (95% CI: 74%–77%) of the time. Equivalently, if the best-performing model were tasked with ranking seven randomly chosen patients (each with a different true GOSE) by predicted GOSE, an average of 5.10 (95% CI: 4.74–5.46) of the 21 possible pairwise orderings would be incorrect. Currently, ordinal model outputs explain, at best, 57% (95% CI: 54%–60%) of the ordinal variation in 6-month GOSE. Ordinal prediction models struggle to reliably predict full recovery (GOSE > 7 c-index: 75% [95% CI: 72%–79%], Table 4), and gains from expanding the predictor set diminish with higher GOSE thresholds.
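The pairwise-ordering figure above follows from simple arithmetic: seven patients with distinct outcome categories form 21 pairs, and, treating ORC as the average chance of ordering a pair from two different categories correctly, the expected number of misordered pairs is (1 − ORC) × 21. A minimal sketch of that calculation follows; the function name `expected_misordered_pairs` is an assumption, and the rounded ORC of 0.76 is used as input because the unrounded value is not stated in this section.

```python
from math import comb


def expected_misordered_pairs(orc, n_categories=7):
    """Expected number of incorrectly ordered pairs among one patient per category.

    Assumes ORC is the mean pairwise probability of a correct ordering.
    """
    n_pairs = comb(n_categories, 2)
    return n_pairs, (1.0 - orc) * n_pairs


n_pairs, n_wrong = expected_misordered_pairs(0.76)
print(n_pairs, round(n_wrong, 2))
```

With the rounded input this gives about 5.04 of 21 pairs, close to the reported 5.10; the small gap is presumably due to rounding of the ORC.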

It is important to acknowledge that the predictor importance results of this article should not be interpreted for predictor discovery or validation. SHAP values are visualised (Fig 4) solely to globally interpret APMMN predictions and to form the extended concise predictor set. Risk factor validation, which falls out of the scope of this work, would require investigating the robustness and clinical plausibility of the relationship between predictor values and their corresponding SHAP values [54]. Moreover, causal analysis with apt consideration of confounding factors or dataset biases would be necessary before commenting on the potential effects or mechanisms of individual predictors.

We recognise several limitations in our study. While the concise predictor set was originally designed for prognosis after moderate-to-severe TBI [8] (i.e., baseline GCS 3–12), 26.6% of our study population had experienced mild (i.e., baseline GCS 13–15) TBI (Table 1). Predictor sets have been designed for mild TBI patients (e.g., UPFRONT study predictors [55]). However, in line with the aims of the CENTER-TBI project [15], we define the TBI population not by initial characterisation with GCS but by stratum of care (i.e., admission to the ICU). Therefore, we selected the single concise predictor set that was best validated for the majority of critically ill TBI patients. Our outcome categories (GOSE at 6 months post-injury) were statistically imputed for 13% of our dataset using available GOSE between 2 weeks and one year post-injury. Although this method was strongly validated on the same (CENTER-TBI) dataset [18], we do recognise that our outcome labels may not be precisely correct. The focus of this work is on the prediction of functional outcomes through GOSE; nonetheless, it is worth considering other outcomes, such as quality-of-life and psychological health, that are important for clinical decision making [56]. Finally, before the AI models developed in this work and in subsequent iterations could be integrated into ICU practice, limitations of generalisability must be addressed [57]. Our models were developed on a multicentre, adult population, prospectively recruited between 2014 and 2017 [25], across Europe, and may encode recruitment, collection, and clinical biases native to our patient set. AI models must be continuously updated, that is, iteratively retrained on incoming information, to help counter the effect these biases may have on returned prognoses for a given patient.

In the setting of TBI prognosis, we encourage the use of AI not to add analytical complexity (i.e., make models “deeper”) but to expand the predictor set (i.e., make models “wider”). Studies have uncovered promising prognostic value in neuro-inflammatory markers [58, 59] and high-resolution TBI monitoring and imaging modalities (e.g., intracranial and cerebral perfusion pressure [60–62], accelerometry [63], and MRI [64–66]), and we recommend integrating these features into ordinal prognostic models, especially to improve prediction of higher functional outcomes. We also believe that there is a feasible performance limit to reliable ordinal outcome prognosis if only statically considering the clinical information from the first 24 hours of ICU stay. It would seem far-fetched to expect all relevant information pertaining to an outcome at 6 months to be encapsulated in the first 24 hours of ICU treatment. Heterogeneous pathophysiological processes unfold over time in patients after TBI [67, 68], and dynamic prediction models, which return model outputs longitudinally with changing clinical information, are better equipped to consider these temporal effects on prognosis. Dynamic prognosis models have been developed for TBI patients [69] and the greater ICU population (not exclusive to TBI) [35, 70, 71], but none of them predict functional outcomes on an ordinal scale. We suggest that the next iteration of this work should be to develop ordinal dynamic prediction models on all clinical information available during the complete ICU stay.

Ethical approval statement

The CENTER-TBI study has been conducted in accordance with all relevant laws of the European Union and all relevant laws of the country where the recruiting sites were located, including (but not limited to) the relevant privacy and data protection laws and regulations, the relevant laws and regulations on the use of human materials, and all relevant guidance relating to clinical studies from time to time in force including (but not limited to) the ICH Harmonised Tripartite Guideline for Good Clinical Practice (CPMP/ICH/135/95) and the World Medical Association Declaration of Helsinki entitled “Ethical Principles for Medical Research Involving Human Subjects.” Written informed consent by the patients and/or the legal representative/next of kin was obtained (according to local legislation) for all patients recruited in the core dataset of CENTER-TBI and documented in the electronic case report form. Ethical approval was obtained for each recruiting site.

The list of sites, ethical committees, approval numbers and approval dates can be found on the website: https://www.center-tbi.eu/project/ethical-approval.

Supporting information

S1 Appendix. Explanation of selected ordinal prediction models for CPM and eCPM.

https://doi.org/10.1371/journal.pone.0270973.s001

(PDF)

S2 Appendix. Explanation of APM for ordinal GOSE prediction.

https://doi.org/10.1371/journal.pone.0270973.s002

(PDF)

S3 Appendix. Detailed explanation of ordinal model performance and calibration metrics.

https://doi.org/10.1371/journal.pone.0270973.s003

(PDF)

S4 Appendix. Hyperparameter optimisation results.

https://doi.org/10.1371/journal.pone.0270973.s004

(PDF)

S1 Fig. CONSORT-style flow diagram for patient enrolment and follow-up.

CENTER-TBI = Collaborative European NeuroTrauma Effectiveness Research in TBI. ICU = intensive care unit. GOSE = Glasgow Outcome Scale–Extended. MSM = Markov multi-state model (see Materials and methods). The dashed, olive-green line in the lower-middle of the diagram divides the enrolment flow diagram (above) and the follow-up breakdown (below).

https://doi.org/10.1371/journal.pone.0270973.s005

(TIF)

S2 Fig. Characterisation of missingness among concise predictor set.

U.P. = unreactive pupils. GCSm = motor component score of the Glasgow Coma Scale. Hb = haemoglobin. Glu. = glucose. HoTN = hypotension. Marshall = Marshall computerised tomography classification. tSAH = traumatic subarachnoid haemorrhage. EDH = extradural haematoma. (A) Proportion of total sample size (n = 1,550) with missing values for each IMPACT extended model predictor. (B) Missingness matrix where each column represents a concise predictor, and each row represents a combination of missing predictors (red) and non-missing predictors (blue) found in the dataset. The prevalence of each combination (i.e., row) in the study population is shown with a horizontal histogram (far right) labelled with the proportion of the study population with the corresponding combination of missing predictors. For example, the bottom row of the matrix shows that 54.77% of the study population had no missing concise predictors while the penultimate row shows that 14.71% of the study population had only glucose and haemoglobin missing among the concise predictors.

https://doi.org/10.1371/journal.pone.0270973.s006

(TIF)

S3 Fig. Ordinal calibration curves of each concise-predictor-based model (CPM).

GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. Shaded areas are 95% confidence intervals derived using bias-corrected bootstrapping (1,000 resamples) to represent the variation across repeated k-fold cross-validation folds (20 repeats of 5 folds) and 100 missing value imputations. The values in each panel correspond to the mean integrated calibration index (ICI) (95% confidence interval) at the given threshold. The diagonal dashed line represents the line of perfect calibration (ICI = 0). The CPM types (CPMMNLR, CPMPOLR, CPMDeepMN, and CPMDeepOR) are decoded in the Materials and methods and described in S1 Appendix.

https://doi.org/10.1371/journal.pone.0270973.s007

(TIF)

S4 Fig. Ordinal calibration curves of each all-predictor-based model (APM).

GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. Shaded areas are 95% confidence intervals derived using bias-corrected bootstrapping (1,000 resamples) to represent the variation across repeated k-fold cross-validation folds (20 repeats of 5 folds). The values in each panel correspond to the mean integrated calibration index (ICI) (95% confidence interval) at the given threshold. The diagonal dashed line represents the line of perfect calibration (ICI = 0). The APM types (APMMN and APMOR) are decoded in the Materials and methods and described in S2 Appendix.

https://doi.org/10.1371/journal.pone.0270973.s008

(TIF)

S5 Fig. Mean absolute SHAP values of the most important predictors for APMMN in each of the five folds of the first repeat.

ICU = intensive care unit. CT = computerised tomography. ER = emergency room. GOS = Glasgow Outcome Scale (not extended). AIS = Abbreviated Injury Scale. UO = unfavourable outcome, defined by functional dependence (i.e., GOSE ≤ 4). FIBTEM = fibrin-based extrinsically activated test with tissue factor and cytochalasin D. GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. The mean absolute SHAP value is interpreted as the average magnitude of the relative additive contribution of a predictor’s most important token towards the predicted probability at each GOSE score for a single patient.

https://doi.org/10.1371/journal.pone.0270973.s009

(TIF)

S6 Fig. Ordinal calibration curves of each extended concise-predictor-based model (eCPM).

GOSE = Glasgow Outcome Scale–Extended at 6 months post-injury. Shaded areas are 95% confidence intervals derived using bias-corrected bootstrapping (1,000 resamples) to represent the variation across repeated k-fold cross-validation folds (20 repeats of 5 folds) and 100 missing value imputations. The values in each panel correspond to the mean integrated calibration index (ICI) (95% confidence interval) at the given threshold. The diagonal dashed line represents the line of perfect calibration (ICI = 0). The eCPM types (eCPMMNLR, eCPMPOLR, eCPMDeepMN, and eCPMDeepOR) are decoded in the Materials and methods and described in S1 Appendix.

https://doi.org/10.1371/journal.pone.0270973.s010

(TIF)

S1 Table. Extended concise baseline predictors of the study population stratified by ordinal 6-month outcomes.

https://doi.org/10.1371/journal.pone.0270973.s011

(PDF)

S2 Table. Ordinal concise-predictor-based model (CPM) discrimination and calibration performance.

https://doi.org/10.1371/journal.pone.0270973.s012

(PDF)

S3 Table. Ordinal all-predictor-based model (APM) discrimination and calibration performance.

https://doi.org/10.1371/journal.pone.0270973.s013

(PDF)

S4 Table. Ordinal extended concise-predictor-based model (eCPM) discrimination and calibration performance.

https://doi.org/10.1371/journal.pone.0270973.s014

(PDF)

Acknowledgments

We are grateful to the patients and families of our study for making our efforts to improve TBI care and outcome possible.

S.B. would like to thank: Abhishek Dixit (Univ. of Cambridge) for helping access the CENTER-TBI dataset, Jacob Deasy (Univ. of Cambridge) for aiding the development of modelling methodology, and Kathleen Mitchell-Fox (Princeton Univ.) for offering comments on the manuscript. All authors would like to thank Andrew I. R. Maas (Antwerp Univ. Hospital) for offering comments on the manuscript.

The CENTER-TBI investigators and participants

The co-lead investigators of CENTER-TBI are designated with an asterisk (*), and their contact email addresses are listed below.

Cecilia Åkerlund1, Krisztina Amrein2, Nada Andelic3, Lasse Andreassen4, Audny Anke5, Anna Antoni6, Gérard Audibert7, Philippe Azouvi8, Maria Luisa Azzolini9, Ronald Bartels10, Pál Barzó11, Romuald Beauvais12, Ronny Beer13, Bo-Michael Bellander14, Antonio Belli15, Habib Benali16, Maurizio Berardino17, Luigi Beretta9, Morten Blaabjerg18, Peter Bragge19, Alexandra Brazinova20, Vibeke Brinck21, Joanne Brooker22, Camilla Brorsson23, Andras Buki24, Monika Bullinger25, Manuel Cabeleira26, Alessio Caccioppola27, Emiliana Calappi27, Maria Rosa Calvi9, Peter Cameron28, Guillermo Carbayo Lozano29, Marco Carbonara27, Simona Cavallo17, Giorgio Chevallard30, Arturo Chieregato30, Giuseppe Citerio31,32, Hans Clusmann33, Mark Coburn34, Jonathan Coles35, Jamie D. Cooper36, Marta Correia37, Amra Čović38, Nicola Curry39, Endre Czeiter24, Marek Czosnyka26, Claire Dahyot-Fizelier40, Paul Dark41, Helen Dawes42, Véronique De Keyser43, Vincent Degos16, Francesco Della Corte44, Hugo den Boogert10, Bart Depreitere45, Đula Đilvesi46, Abhishek Dixit47, Emma Donoghue22, Jens Dreier48, Guy-Loup Dulière49, Ari Ercole47, Patrick Esser42, Erzsébet Ezer50, Martin Fabricius51, Valery L. Feigin52, Kelly Foks53, Shirin Frisvold54, Alex Furmanov55, Pablo Gagliardo56, Damien Galanaud16, Dashiell Gantner28, Guoyi Gao57, Pradeep George58, Alexandre Ghuysen59, Lelde Giga60, Ben Glocker61, Jagoš Golubovic46, Pedro A. Gomez62, Johannes Gratz63, Benjamin Gravesteijn64, Francesca Grossi44, Russell L. Gruen65, Deepak Gupta66, Juanita A. Haagsma64, Iain Haitsma67, Raimund Helbok13, Eirik Helseth68, Lindsay Horton69, Jilske Huijben64, Peter J. Hutchinson70, Bram Jacobs71, Stefan Jankowski72, Mike Jarrett21, Ji-yao Jiang58, Faye Johnson73, Kelly Jones52, Mladen Karan46, Angelos G. Kolias70, Erwin Kompanje74, Daniel Kondziella51, Evgenios Kornaropoulos47, Lars-Owe Koskinen75, Noémi Kovács76, Ana Kowark77, Alfonso Lagares62, Linda Lanyon58, Steven Laureys78, Fiona Lecky79,80, Didier Ledoux78, Rolf Lefering81, Valerie Legrand82, Aurelie Lejeune83, Leon Levi84, Roger Lightfoot85, Hester Lingsma64, Andrew I.R. Maas43,*, Ana M. Castaño-León62, Marc Maegele86, Marek Majdan20, Alex Manara87, Geoffrey Manley88, Costanza Martino89, Hugues Maréchal49, Julia Mattern90, Catherine McMahon91, Béla Melegh92, David Menon47,*, Tomas Menovsky43, Ana Mikolic64, Benoit Misset78, Visakh Muraleedharan58, Lynnette Murray28, Ancuta Negru93, David Nelson1, Virginia Newcombe47, Daan Nieboer64, József Nyirádi2, Otesile Olubukola79, Matej Oresic94, Fabrizio Ortolano27, Aarno Palotie95,96,97, Paul M. Parizel98, Jean-François Payen99, Natascha Perera12, Vincent Perlbarg16, Paolo Persona100, Wilco Peul101, Anna Piippo-Karjalainen102, Matti Pirinen95, Dana Pisica64, Horia Ples93, Suzanne Polinder64, Inigo Pomposo29, Jussi P. Posti103, Louis Puybasset104, Andreea Radoi105, Arminas Ragauskas106, Rahul Raj102, Malinka Rambadagalla107, Isabel Retel Helmrich64, Jonathan Rhodes108, Sylvia Richardson109, Sophie Richter47, Samuli Ripatti95, Saulius Rocka106, Cecilie Roe110, Olav Roise111,112, Jonathan Rosand113, Jeffrey V. Rosenfeld114, Christina Rosenlund115, Guy Rosenthal55, Rolf Rossaint77, Sandra Rossi100, Daniel Rueckert61, Martin Rusnák116, Juan Sahuquillo105, Oliver Sakowitz90,117, Renan Sanchez-Porras117, Janos Sandor118, Nadine Schäfer81, Silke Schmidt119, Herbert Schoechl120, Guus Schoonman121, Rico Frederik Schou122, Elisabeth Schwendenwein6, Charlie Sewalt64, Ranjit D. Singh101, Toril Skandsen123,124, Peter Smielewski26, Abayomi Sorinola125, Emmanuel Stamatakis47, Simon Stanworth39, Robert Stevens126, William Stewart127, Ewout W. Steyerberg64,128, Nino Stocchetti129, Nina Sundström130, Riikka Takala131, Viktória Tamás125, Tomas Tamosuitis132, Mark Steven Taylor20, Braden Te Ao52, Olli Tenovuo103, Alice Theadom52, Matt Thomas87, Dick Tibboel133, Marjolein Timmers74, Christos Tolias134, Tony Trapani28, Cristina Maria Tudora93, Andreas Unterberg90, Peter Vajkoczy135, Shirley Vallance28, Egils Valeinis60, Zoltán Vámos50, Mathieu van der Jagt136, Gregory Van der Steen43, Joukje van der Naalt71, Jeroen T.J.M. van Dijck101, Inge A. M. van Erp101, Thomas A. van Essen101, Wim Van Hecke137, Caroline van Heugten138, Dominique Van Praag139, Ernest van Veen64, Thijs Vande Vyvere137, Roel P. J. van Wijk101, Alessia Vargiolu32, Emmanuel Vega83, Kimberley Velt64, Jan Verheyden137, Paul M. Vespa140, Anne Vik123,141, Rimantas Vilcinis132, Victor Volovici67, Nicole von Steinbüchel38, Daphne Voormolen64, Petar Vulekovic46, Kevin K.W. Wang142, Daniel Whitehouse47, Eveline Wiegers64, Guy Williams47, Lindsay Wilson69, Stefan Winzeck47, Stefan Wolf143, Zhihui Yang113, Peter Ylén144, Alexander Younsi90, Frederick A. Zeiler47,145, Veronika Zelinkova20, Agate Ziverte60, Tommaso Zoerle27

1Department of Physiology and Pharmacology, Section of Perioperative Medicine and Intensive Care, Karolinska Institutet, Stockholm, Sweden

2János Szentágothai Research Centre, University of Pécs, Pécs, Hungary

3Division of Surgery and Clinical Neuroscience, Department of Physical Medicine and Rehabilitation, Oslo University Hospital and University of Oslo, Oslo, Norway

4Department of Neurosurgery, University Hospital Northern Norway, Tromso, Norway

5Department of Physical Medicine and Rehabilitation, University Hospital Northern Norway, Tromso, Norway

6Trauma Surgery, Medical University Vienna, Vienna, Austria

7Department of Anesthesiology & Intensive Care, University Hospital Nancy, Nancy, France

8Raymond Poincare hospital, Assistance Publique–Hopitaux de Paris, Paris, France

9Department of Anesthesiology & Intensive Care, S Raffaele University Hospital, Milan, Italy

10Department of Neurosurgery, Radboud University Medical Center, Nijmegen, The Netherlands

11Department of Neurosurgery, University of Szeged, Szeged, Hungary

12International Projects Management, ARTTIC, Munchen, Germany

13Department of Neurology, Neurological Intensive Care Unit, Medical University of Innsbruck, Innsbruck, Austria

14Department of Neurosurgery & Anesthesia & intensive care medicine, Karolinska University Hospital, Stockholm, Sweden

15NIHR Surgical Reconstruction and Microbiology Research Centre, Birmingham, UK

16Anesthesie-Réanimation, Assistance Publique–Hopitaux de Paris, Paris, France

17Department of Anesthesia & ICU, AOU Città della Salute e della Scienza di Torino—Orthopedic and Trauma Center, Torino, Italy

18Department of Neurology, Odense University Hospital, Odense, Denmark

19BehaviourWorks Australia, Monash Sustainability Institute, Monash University, Victoria, Australia

20Department of Public Health, Faculty of Health Sciences and Social Work, Trnava University, Trnava, Slovakia

21Quesgen Systems Inc., Burlingame, California, USA

22Australian & New Zealand Intensive Care Research Centre, Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia

23Department of Surgery and Perioperative Science, Umeå University, Umeå, Sweden

24Department of Neurosurgery, Medical School, University of Pécs, Hungary and Neurotrauma Research Group, János Szentágothai Research Centre, University of Pécs, Hungary

25Department of Medical Psychology, Universitätsklinikum Hamburg-Eppendorf, Hamburg, Germany

26Brain Physics Lab, Division of Neurosurgery, Dept of Clinical Neurosciences, University of Cambridge, Addenbrooke’s Hospital, Cambridge, UK

27Neuro ICU, Fondazione IRCCS Cà Granda Ospedale Maggiore Policlinico, Milan, Italy

28ANZIC Research Centre, Monash University, Department of Epidemiology and Preventive Medicine, Melbourne, Victoria, Australia

29Department of Neurosurgery, Hospital of Cruces, Bilbao, Spain

30NeuroIntensive Care, Niguarda Hospital, Milan, Italy

31School of Medicine and Surgery, Università Milano Bicocca, Milano, Italy

32NeuroIntensive Care, ASST di Monza, Monza, Italy

33Department of Neurosurgery, Medical Faculty RWTH Aachen University, Aachen, Germany

34Department of Anesthesiology and Intensive Care Medicine, University Hospital Bonn, Bonn, Germany

35Department of Anesthesia & Neurointensive Care, Cambridge University Hospital NHS Foundation Trust, Cambridge, UK

36School of Public Health & PM, Monash University and The Alfred Hospital, Melbourne, Victoria, Australia

37Radiology/MRI department, MRC Cognition and Brain Sciences Unit, Cambridge, UK

38Institute of Medical Psychology and Medical Sociology, Universitätsmedizin Göttingen, Göttingen, Germany

39Oxford University Hospitals NHS Trust, Oxford, UK

40Intensive Care Unit, CHU Poitiers, Poitiers, France

41University of Manchester NIHR Biomedical Research Centre, Critical Care Directorate, Salford Royal Hospital NHS Foundation Trust, Salford, UK

42Movement Science Group, Faculty of Health and Life Sciences, Oxford Brookes University, Oxford, UK

43Department of Neurosurgery, Antwerp University Hospital and University of Antwerp, Edegem, Belgium

44Department of Anesthesia & Intensive Care, Maggiore Della Carità Hospital, Novara, Italy

45Department of Neurosurgery, University Hospitals Leuven, Leuven, Belgium

46Department of Neurosurgery, Clinical centre of Vojvodina, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia

47Division of Anaesthesia, University of Cambridge, Addenbrooke’s Hospital, Cambridge, UK

48Center for Stroke Research Berlin, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany

49Intensive Care Unit, CHR Citadelle, Liège, Belgium

50Department of Anaesthesiology and Intensive Therapy, University of Pécs, Pécs, Hungary

51Departments of Neurology, Clinical Neurophysiology and Neuroanesthesiology, Region Hovedstaden Rigshospitalet, Copenhagen, Denmark

52National Institute for Stroke and Applied Neurosciences, Faculty of Health and Environmental Studies, Auckland University of Technology, Auckland, New Zealand

53Department of Neurology, Erasmus MC, Rotterdam, the Netherlands

54Department of Anesthesiology and Intensive care, University Hospital Northern Norway, Tromso, Norway

55Department of Neurosurgery, Hadassah-Hebrew University Medical Center, Jerusalem, Israel

56Fundación Instituto Valenciano de Neurorrehabilitación (FIVAN), Valencia, Spain

57Department of Neurosurgery, Shanghai Renji Hospital, Shanghai Jiaotong University/School of Medicine, Shanghai, China

58Karolinska Institutet, INCF International Neuroinformatics Coordinating Facility, Stockholm, Sweden

59Emergency Department, CHU, Liège, Belgium

60Neurosurgery clinic, Pauls Stradins Clinical University Hospital, Riga, Latvia

61Department of Computing, Imperial College London, London, UK

62Department of Neurosurgery, Hospital Universitario 12 de Octubre, Madrid, Spain

63Department of Anesthesia, Critical Care and Pain Medicine, Medical University of Vienna, Austria

64Department of Public Health, Erasmus Medical Center-University Medical Center, Rotterdam, The Netherlands

65College of Health and Medicine, Australian National University, Canberra, Australia

66Department of Neurosurgery, Neurosciences Centre & JPN Apex trauma centre, All India Institute of Medical Sciences, New Delhi-110029, India

67Department of Neurosurgery, Erasmus MC, Rotterdam, the Netherlands

68Department of Neurosurgery, Oslo University Hospital, Oslo, Norway

69Division of Psychology, University of Stirling, Stirling, UK

70Division of Neurosurgery, Department of Clinical Neurosciences, Addenbrooke’s Hospital & University of Cambridge, Cambridge, UK

71Department of Neurology, University of Groningen, University Medical Center Groningen, Groningen, Netherlands

72Neurointensive Care, Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK

73Salford Royal Hospital NHS Foundation Trust Acute Research Delivery Team, Salford, UK

74Department of Intensive Care and Department of Ethics and Philosophy of Medicine, Erasmus Medical Center, Rotterdam, The Netherlands

75Department of Clinical Neuroscience, Neurosurgery, Umeå University, Umeå, Sweden

76Hungarian Brain Research Program—Grant No. KTIA_13_NAP-A-II/8, University of Pécs, Pécs, Hungary

77Department of Anaesthesiology, University Hospital of Aachen, Aachen, Germany

78Cyclotron Research Center, University of Liège, Liège, Belgium

79Centre for Urgent and Emergency Care Research (CURE), Health Services Research Section, School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, UK

80Emergency Department, Salford Royal Hospital, Salford, UK

81Institute of Research in Operative Medicine (IFOM), Witten/Herdecke University, Cologne, Germany

82VP Global Project Management CNS, ICON, Paris, France

83Department of Anesthesiology-Intensive Care, Lille University Hospital, Lille, France

84Department of Neurosurgery, Rambam Medical Center, Haifa, Israel

85Department of Anesthesiology & Intensive Care, University Hospitals Southampton NHS Trust, Southampton, UK

86Cologne-Merheim Medical Center (CMMC), Department of Traumatology, Orthopedic Surgery and Sportmedicine, Witten/Herdecke University, Cologne, Germany

87Intensive Care Unit, Southmead Hospital, Bristol, UK

88Department of Neurological Surgery, University of California, San Francisco, California, USA

89Department of Anesthesia & Intensive Care, M. Bufalini Hospital, Cesena, Italy

90Department of Neurosurgery, University Hospital Heidelberg, Heidelberg, Germany

91Department of Neurosurgery, The Walton Centre NHS Foundation Trust, Liverpool, UK

92Department of Medical Genetics, University of Pécs, Pécs, Hungary

93Department of Neurosurgery, Emergency County Hospital Timisoara, Timisoara, Romania

94School of Medical Sciences, Örebro University, Örebro, Sweden

95Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland

96Analytic and Translational Genetics Unit, Department of Medicine; Psychiatric & Neurodevelopmental Genetics Unit, Department of Psychiatry; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA

97Program in Medical and Population Genetics; The Stanley Center for Psychiatric Research, The Broad Institute of MIT and Harvard, Cambridge, MA, USA

98Department of Radiology, University of Antwerp, Edegem, Belgium

99Department of Anesthesiology & Intensive Care, University Hospital of Grenoble, Grenoble, France

100Department of Anesthesia & Intensive Care, Azienda Ospedaliera Università di Padova, Padova, Italy

101Dept. of Neurosurgery, Leiden University Medical Center, Leiden, The Netherlands and Dept. of Neurosurgery, Medical Center Haaglanden, The Hague, The Netherlands

102Department of Neurosurgery, Helsinki University Central Hospital

103Division of Clinical Neurosciences, Department of Neurosurgery and Turku Brain Injury Centre, Turku University Hospital and University of Turku, Turku, Finland

104Department of Anesthesiology and Critical Care, Pitié-Salpêtrière Teaching Hospital, Assistance Publique, Hôpitaux de Paris and University Pierre et Marie Curie, Paris, France

105Neurotraumatology and Neurosurgery Research Unit (UNINN), Vall d’Hebron Research Institute, Barcelona, Spain

106Department of Neurosurgery, Kaunas University of Technology and Vilnius University, Vilnius, Lithuania

107Department of Neurosurgery, Rezekne Hospital, Latvia

108Department of Anaesthesia, Critical Care & Pain Medicine NHS Lothian & University of Edinburgh, Edinburgh, UK

109Director, MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, UK

110Department of Physical Medicine and Rehabilitation, Oslo University Hospital/University of Oslo, Oslo, Norway

111Division of Orthopedics, Oslo University Hospital, Oslo, Norway

112Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway

113Broad Institute, Cambridge MA, Harvard Medical School, Boston MA, Massachusetts General Hospital, Boston MA, USA

114National Trauma Research Institute, The Alfred Hospital, Monash University, Melbourne, Victoria, Australia

115Department of Neurosurgery, Odense University Hospital, Odense, Denmark

116International Neurotrauma Research Organisation, Vienna, Austria

117Klinik für Neurochirurgie, Klinikum Ludwigsburg, Ludwigsburg, Germany

118Division of Biostatistics and Epidemiology, Department of Preventive Medicine, University of Debrecen, Debrecen, Hungary

119Department Health and Prevention, University Greifswald, Greifswald, Germany

120Department of Anaesthesiology and Intensive Care, AUVA Trauma Hospital, Salzburg, Austria

121Department of Neurology, Elisabeth-TweeSteden Ziekenhuis, Tilburg, the Netherlands

122Department of Neuroanesthesia and Neurointensive Care, Odense University Hospital, Odense, Denmark

123Department of Neuromedicine and Movement Science, Norwegian University of Science and Technology, NTNU, Trondheim, Norway

124Department of Physical Medicine and Rehabilitation, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway

125Department of Neurosurgery, University of Pécs, Pécs, Hungary

126Division of Neuroscience Critical Care, Johns Hopkins University School of Medicine, Baltimore, USA

127Department of Neuropathology, Queen Elizabeth University Hospital and University of Glasgow, Glasgow, UK

128Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands

129Department of Pathophysiology and Transplantation, Milan University, and Neuroscience ICU, Fondazione IRCCS Cà Granda Ospedale Maggiore Policlinico, Milano, Italy

130Department of Radiation Sciences, Biomedical Engineering, Umeå University, Umeå, Sweden

131Perioperative Services, Intensive Care Medicine and Pain Management, Turku University Hospital and University of Turku, Turku, Finland

132Department of Neurosurgery, Kaunas University of Health Sciences, Kaunas, Lithuania

133Intensive Care and Department of Pediatric Surgery, Erasmus Medical Center, Sophia Children’s Hospital, Rotterdam, The Netherlands

134Department of Neurosurgery, King's College London, London, UK

135Neurologie, Neurochirurgie und Psychiatrie, Charité – Universitätsmedizin Berlin, Berlin, Germany

136Department of Intensive Care Adults, Erasmus MC–University Medical Center Rotterdam, Rotterdam, the Netherlands

137icoMetrix NV, Leuven, Belgium

138Movement Science Group, Faculty of Health and Life Sciences, Oxford Brookes University, Oxford, UK

139Psychology Department, Antwerp University Hospital, Edegem, Belgium

140Director of Neurocritical Care, University of California, Los Angeles, USA

141Department of Neurosurgery, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway

142Department of Emergency Medicine, University of Florida, Gainesville, Florida, USA

143Department of Neurosurgery, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany

144VTT Technical Research Centre, Tampere, Finland

145Section of Neurosurgery, Department of Surgery, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada

*Co-lead investigators: andrew.maas@uza.be (AIRM) and dkm13@cam.ac.uk (DM)

References

  1. Maas AIR, Menon DK, Adelson PD, Andelic N, Bell MJ, Belli A, et al. Traumatic brain injury: integrated approaches to improve prevention, clinical care, and research. Lancet Neurol. 2017;16: 987–1048. pmid:29122524
  2. Lingsma HF, Roozenbeek B, Steyerberg EW, Murray GD, Maas AI. Early prognosis in traumatic brain injury: from prophecies to predictions. Lancet Neurol. 2010;9: 543–554. pmid:20398861
  3. Jennett B, Snoek J, Bond MR, Brooks N. Disability after severe head injury: observations on the use of the Glasgow Outcome Scale. J Neurol Neurosurg Psychiatry. 1981;44: 285–293. pmid:6453957
  4. Horton L, Rhodes J, Wilson L. Randomized Controlled Trials in Adult Traumatic Brain Injury: A Systematic Review on the Use and Reporting of Clinical Outcome Assessments. J Neurotrauma. 2018;35: 25–2014.
  5. McMillan T, Wilson L, Ponsford J, Levin H, Teasdale G, Bond M. The Glasgow Outcome Scale—40 years of application and refinement. Nat Rev Neurol. 2016;12: 477–485. pmid:27418377
  6. Wilson JT, Pettigrew LE, Teasdale GM. Structured interviews for the Glasgow Outcome Scale and the extended Glasgow Outcome Scale: guidelines for their use. J Neurotrauma. 1998;15: 573–585. pmid:9726257
  7. Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13: 818–829. pmid:3928249
  8. Steyerberg EW, Mushkudiani N, Perel P, Butcher I, Lu J, McHugh GS, et al. Predicting Outcome after Traumatic Brain Injury: Development and International Validation of Prognostic Scores Based on Admission Characteristics. PLoS Med. 2008;5: e165. pmid:18684008
  9. Zuckerman D, Giacino J, Bodien Y. Traumatic Brain Injury: What Is a Favorable Outcome? J Neurotrauma. 2021. pmid:34861770
  10. Turgeon AF, Lauzier F, Simard J, Scales DC, Burns KEA, Moore L, et al. Mortality associated with withdrawal of life-sustaining therapy for patients with severe traumatic brain injury: a Canadian multicentre cohort study. CMAJ. 2011;183: 1581–1588. pmid:21876014
  11. Harrell FE Jr, Margolis PA, Gove S, Mason KE, Mulholland EK, Lehmann D, et al. Development of a clinical prediction model for an ordinal outcome: the World Health Organization Multicentre Study of Clinical Signs and Etiological Agents of Pneumonia, Sepsis and Meningitis in Young Infants. Stat Med. 1998;17: 909–944. pmid:9595619
  12. Hilden J. The Area under the ROC Curve and Its Competitors. Med Decis Making. 1991;11: 95–101. pmid:1865785
  13. Van Calster B, Van Belle V, Vergouwe Y, Steyerberg EW. Discrimination ability of prediction models for ordinal outcomes: Relationships between existing measures and a new measure. Biom J. 2012;54: 674–685. pmid:22711459
  14. Doiron D, Marcon Y, Fortier I, Burton P, Ferretti V. Software Application Profile: Opal and Mica: open-source software solutions for epidemiological data management, harmonization and dissemination. Int J Epidemiol. 2017;46: 1372–1378. pmid:29025122
  15. Maas AIR, Menon DK, Steyerberg EW, Citerio G, Lecky F, Manley GT, et al. Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI): A Prospective Longitudinal Observational Study. Neurosurgery. 2014;76: 67–80. pmid:25525693
  16. Steyerberg EW, Wiegers E, Sewalt C, Buki A, Citerio G, De Keyser V, et al. Case-mix, care pathways, and outcomes in patients with traumatic brain injury in CENTER-TBI: a European prospective, multicentre, longitudinal, cohort study. Lancet Neurol. 2019;18: 923–934. pmid:31526754
  17. Wilson JTL, Edwards P, Fiddes H, Stewart E, Teasdale GM. Reliability of postal questionnaires for the Glasgow Outcome Scale. J Neurotrauma. 2002;19: 999–1005. pmid:12482113
  18. Kunzmann K, Wernisch L, Richardson S, Steyerberg EW, Lingsma H, Ercole A, et al. Imputation of Ordinal Outcomes: A Comparison of Approaches in Traumatic Brain Injury. J Neurotrauma. 2021;38. pmid:33108942
  19. Harrell FE. Ordinal Logistic Regression. In: Harrell FE. Regression Modeling Strategies. 2nd ed. Cham: Springer; 2015. pp. 311–325. https://doi.org/10.1007/978-3-319-19425-7_13
  20. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12: 2825–2830.
  21. Teasdale G, Jennett B. Assessment of coma and impaired consciousness. A practical scale. Lancet. 1974;304: 81–84. pmid:4136544
  22. Teasdale G, Maas A, Lecky F, Manley G, Stocchetti N, Murray G. The Glasgow Coma Scale at 40 years: standing the test of time. Lancet Neurol. 2014;13: 844–854. pmid:25030516
  23. Dijkland SA, Foks KA, Polinder S, Dippel DWJ, Maas AIR, Lingsma HF, et al. Prognosis in Moderate and Severe Traumatic Brain Injury: A Systematic Review of Contemporary Models and Validation Studies. J Neurotrauma. 2020;37: 1–13. pmid:31099301
  24. Han J, King NKK, Neilson SJ, Gandhi MP, Ng I. External Validation of the CRASH and IMPACT Prognostic Models in Severe Traumatic Brain Injury. J Neurotrauma. 2014;31: 1146–1152. pmid:24568201
  25. Roozenbeek B, Lingsma HF, Lecky FE, Lu J, Weir J, Butcher I, et al. Prediction of outcome after moderate and severe traumatic brain injury: External validation of the International Mission on Prognosis and Analysis of Clinical Trials (IMPACT) and Corticoid Randomisation After Significant Head injury (CRASH) prognostic models. Crit Care Med. 2012;40: 1609–1617. pmid:22511138
  26. Lingsma H, Andriessen TMJC, Haitsema I, Horn J, van der Naalt J, Franschman G, et al. Prognosis in moderate and severe traumatic brain injury: External validation of the IMPACT models and the role of extracranial injuries. J Trauma Acute Care Surg. 2013;74: 639–646. pmid:23354263
  27. Panczykowski DM, Puccio AM, Scruggs BJ, Bauer JS, Hricik AJ, Beers SR, et al. Prospective Independent Validation of IMPACT Modeling as a Prognostic Tool in Severe Traumatic Brain Injury. J Neurotrauma. 2012;29: 47–52. pmid:21933014
  28. Murray GD, Butcher I, McHugh GS, Lu J, Mushkudiani NA, Maas AIR, et al. Multivariable Prognostic Analysis in Traumatic Brain Injury: Results from The IMPACT Study. J Neurotrauma. 2007;24: 329–337. pmid:17375997
  29. Licht C. New methods for generating significance levels from multiply-imputed data. Dr. rer. pol. Thesis, The University of Bamberg. 2010. Available from: https://fis.uni-bamberg.de/handle/uniba/263
  30. van Buuren S, Groothuis-Oudshoorn CGM. mice: Multivariate Imputation by Chained Equations in R. J Stat Softw. 2011;45.
  31. R Core Team. R: A Language and Environment for Statistical Computing. 2020;4.0.0.
  32. Seabold S, Perktold J. Statsmodels: Econometric and Statistical Modeling with Python. In: van der Walt S, Millman J, editors. Proceedings of the 9th Python in Science Conference (SciPy 2010). Austin: SciPy; 2010. pp. 92–96. https://doi.org/10.25080/Majora-92bf1922-011
  33. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In: Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R, editors. Advances in Neural Information Processing Systems 32 (NeurIPS 2019). Vancouver: NeurIPS; 2019.
  34. CENTER-TBI Investigators and Participants. Data Dictionary. CENTER-TBI. [Cited 2022 January 26]. Available from: https://www.center-tbi.eu/data/dictionary
  35. Deasy J, Liò P, Ercole A. Dynamic survival prediction in intensive care units from heterogeneous time series without the need for variable selection or curation. Sci Rep. 2020;10: 22129. pmid:33335183
  36. Ercole A, Dixit A, Nelson DW, Bhattacharyay S, Zeiler FA, Nieboer D, et al. Imputation strategies for missing baseline neurological assessment covariates after traumatic brain injury: A CENTER-TBI study. PLoS ONE. 2021;16: e0253425. pmid:34358231
  37. Bengio Y, Ducharme R, Vincent P, Jauvin C. A neural probabilistic language model. J Mach Learn Res. 2003;3: 1137–1155.
  38. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J. Distributed Representations of Words and Phrases and their Compositionality. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, editors. Advances in Neural Information Processing Systems 26 (NIPS 2013). Lake Tahoe: NIPS; 2013.
  39. Lundberg SM, Lee S. A Unified Approach to Interpreting Model Predictions. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R, editors. Advances in Neural Information Processing Systems 30 (NIPS 2017). Long Beach: NIPS; 2017.
  40. Tsamardinos I, Greasidou E, Borboudakis G. Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation. Mach Learning. 2018;107: 1895–1922. pmid:30393425
  41. Somers RH. A New Asymmetric Measure of Association for Ordinal Variables. Am Sociol Rev. 1962;27: 799–811.
  42. Kim J. Predictive Measures of Ordinal Association. Am J Sociol. 1971;76: 891–907.
  43. Cox DR. Two further applications of a model for binary regression. Biometrika. 1958;45: 562–565.
  44. Miller ME, Langefeld CD, Tierney WM, Hui SL, McDonald CJ. Validation of Probabilistic Predictions. Med Decis Making. 1993;13: 49–57. pmid:8433637
  45. Van Calster B, Nieboer D, Vergouwe Y, De Cock B, Pencina MJ, Steyerberg EW. A calibration hierarchy for risk models was defined: from utopia to empirical data. J Clin Epidemiol. 2016;74: 167–176. pmid:26772608
  46. Austin PC, Steyerberg EW. Graphical assessment of internal and external calibration of logistic regression models by using loess smoothers. Stat Med. 2014;33: 517–535. pmid:24002997
  47. Austin PC, Steyerberg EW. The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models. Stat Med. 2019;38: 4051–4065. pmid:31270850
  48. Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods. 2020;17: 261–272. pmid:32015543
  49. Wickham H. ggplot2: Elegant Graphics for Data Analysis. 2nd ed. New York: Springer; 2016. https://doi.org/10.1007/978-3-319-24277-4
  50. Falcon WA, et al. PyTorch Lightning. GitHub. 2019. Available from: https://github.com/PyTorchLightning/pytorch-lightning
  51. Izzy S, Compton R, Carandang R, Hall W, Muehlschlegel S. Self-Fulfilling Prophecies Through Withdrawal of Care: Do They Exist in Traumatic Brain Injury, Too? Neurocrit Care. 2013;19: 347–363. pmid:24132565
  52. van Veen E, van der Jagt M, Citerio G, Stocchetti N, Gommers D, Burdorf A, et al. Occurrence and timing of withdrawal of life-sustaining measures in traumatic brain injury patients: a CENTER-TBI study. Intensive Care Med. 2021;47: 1115–1129. pmid:34351445
  53. Gravesteijn BY, Nieboer D, Ercole A, Lingsma HF, Nelson D, van Calster B, et al. Machine learning algorithms performed no better than regression models for prognostication in traumatic brain injury. J Clin Epidemiol. 2020;122: 95–107. pmid:32201256
  54. Farzaneh N, Williamson CA, Gryak J, Najarian K. A hierarchical expert-guided machine learning framework for clinical decision support systems: an application to traumatic brain injury prognostication. NPJ Digit Med. 2021;4: 78. pmid:33963275
  55. van der Naalt J, Timmerman ME, de Koning ME, van der Horn HJ, Scheenen ME, Jacobs B, et al. Early predictors of outcome after mild traumatic brain injury (UPFRONT): an observational cohort study. Lancet Neurol. 2017;16: 532–540. pmid:28653646
  56. Kean J, Malec JF. Towards a Better Measure of Brain Injury Outcome: New Measures or a New Metric? Arch Phys Med Rehabil. 2014;95: 1225–1228. pmid:24732171
  57. Futoma J, Simons M, Panch T, Doshi-Velez F, Celi LA. The myth of generalisability in clinical research and machine learning in health care. Lancet Digit Health. 2020;2: e489–e492. pmid:32864600
  58. Zeiler FA, Thelin EP, Czosnyka M, Hutchinson PJ, Menon DK, Helmy A. Cerebrospinal Fluid and Microdialysis Cytokines in Severe Traumatic Brain Injury: A Scoping Systematic Review. Front Neurol. 2017;8: 331. pmid:28740480
  59. Thelin EP, Tajsic T, Zeiler FA, Menon DK, Hutchinson PJA, Carpenter KLH, et al. Monitoring the Neuroinflammatory Response Following Acute Brain Injury. Front Neurol. 2017;8: 351. pmid:28775710
  60. Zeiler FA, Donnelly J, Smielewski P, Menon DK, Hutchinson PJ, Czosnyka M. Critical Thresholds of Intracranial Pressure-Derived Continuous Cerebrovascular Reactivity Indices for Outcome Prediction in Noncraniectomized Patients with Traumatic Brain Injury. J Neurotrauma. 2018;35: 1107–1115. pmid:29241396
  61. Zeiler FA, Ercole A, Cabeleira M, Carbonara M, Stocchetti N, Menon DK, et al. Comparison of Performance of Different Optimal Cerebral Perfusion Pressure Parameters for Outcome Prediction in Adult Traumatic Brain Injury: A Collaborative European NeuroTrauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) Study. J Neurotrauma. 2019;36: 1505–1517. pmid:30384809
  62. Svedung Wettervik T, Howells T, Enblad P, Lewén A. Temporal Neurophysiological Dynamics in Traumatic Brain Injury: Role of Pressure Reactivity and Optimal Cerebral Perfusion Pressure for Predicting Outcome. J Neurotrauma. 2019;36: 1818–1827. pmid:30595128
  63. Bhattacharyay S, Rattray J, Wang M, Dziedzic PH, Calvillo E, Kim HB, et al. Decoding accelerometry for classification and prediction of critically ill patients with severe brain injury. Sci Rep. 2021;11: 23654. pmid:34880296
  64. Yuh EL, Mukherjee P, Lingsma HF, Yue JK, Ferguson AR, Gordon WA, et al. Magnetic resonance imaging improves 3-month outcome prediction in mild traumatic brain injury. Ann Neurol. 2013;73: 224–235. pmid:23224915
  65. Griffin AD, Turtzo LC, Parikh GY, Tolpygo A, Lodato Z, Moses AD, et al. Traumatic microbleeds suggest vascular injury and predict disability in traumatic brain injury. Brain. 2019;142: 3550–3564. pmid:31608359
  66. Wallace EJ, Mathias JL, Ward L. The relationship between diffusion tensor imaging findings and cognitive outcomes following adult traumatic brain injury: A meta-analysis. Neurosci Biobehav Rev. 2018;92: 93–103. pmid:29803527
  67. Stocchetti N, Carbonara M, Citerio G, Ercole A, Skrifvars MB, Smielewski P, et al. Severe traumatic brain injury: targeted management in the intensive care unit. Lancet Neurol. 2017;16: 452–464. pmid:28504109
  68. Wang KKW, Moghieb A, Yang Z, Zhang Z. Systems biomarkers as acute diagnostics and chronic monitoring tools for traumatic brain injury. In: Southern Š, editor. Proceedings (Volume 8723) of SPIE Defense, Security, and Sensing: Sensing Technologies for Global Health, Military Medicine, and Environmental Monitoring III. Baltimore: SPIE; 2013. https://doi.org/10.1117/12.2020030
  69. Raj R, Luostarinen T, Pursiainen E, Posti JP, Takala RSK, Bendel S, et al. Machine learning-based dynamic mortality prediction after traumatic brain injury. Sci Rep. 2019;9: 17672. pmid:31776366
  70. Meiring C, Dixit A, Harris S, MacCallum NS, Brealey DA, Watkinson PJ, et al. Optimal intensive care outcome prediction over time using machine learning. PLoS ONE. 2018;13: e0206862. pmid:30427913
  71. Thorsen-Meyer H, Nielsen AB, Nielsen AP, Kaas-Hansen B, Toft P, Schierbeck J, et al. Dynamic and explainable machine learning prediction of mortality in patients in the intensive care unit: a retrospective study of high-frequency data in electronic patient records. Lancet Digit Health. 2020;2: e179–e191. pmid:33328078