The risk of disabling, surgery and reoperation in Crohn’s disease – A decision tree-based approach to prognosis

Introduction Crohn’s disease (CD) is a chronic inflammatory bowel disease known to carry a high risk of disabling and often requiring surgical interventions. This article describes a decision tree-based approach that defines CD patients’ risk of undergoing disabling events, surgical interventions and reoperations, based on clinical and demographic variables. Materials and methods This multicentric study involved 1547 CD patients retrospectively enrolled and divided into two cohorts: a derivation one (80%) and a validation one (20%). Decision trees were built by applying the CHAID algorithm for the selection of variables. Results Three-level decision trees were built for the risk of disabling and reoperation, whereas the risk of surgery was described in a two-level one. A receiver operating characteristic (ROC) analysis was performed, and the area under the curve (AUC) was higher than 70% for all outcomes. The defined risk cut-off values show usefulness for the assessed outcomes: risk levels above 75% for disabling had an odds test positivity of 4.06 [3.50–4.71], whereas risk levels below 34% and 19% excluded surgery and reoperation with an odds test negativity of 0.15 [0.09–0.25] and 0.50 [0.24–1.01], respectively. Overall, patients with B2 or B3 phenotype had a higher proportion of disabling disease and surgery, while patients with later introduction of pharmacological therapy (more than 1 month after initial surgery) had a higher proportion of reoperation. Conclusions The decision tree-based approach used in this study, with demographic and clinical variables, has been shown to be a valid and useful approach to depict the risks of disabling, surgery and reoperation.



Introduction
Crohn's disease (CD) is a chronic inflammatory bowel disease for which no definitive treatment has been described. As such, clinicians approach the disease by attempting to control the symptoms, avoid disease complications and improve patients' quality of life [1]. The most frequent CD complications are related to uncontrolled inflammation of the bowel, which may cause obstruction and perforation of the small intestine or of the colon, abscess, fistulae and/or intestinal bleeding. The occurrence of these events may require a surgical intervention, which ends up being a common strategy in CD management. In fact, previous studies have reported that around 50% of all CD patients will undergo bowel surgery within 10 years after the diagnosis, whereas 80% will eventually require surgery throughout the entire disease course [1,2]. Moreover, recurrence is extremely frequent, and the rate of reoperations in previous studies ranged from 40% to 80% [2,3].
As for the concept of disabling, this term was introduced by Beaugerie et al. in 2006 [4] and by Loly et al. [5] in 2008: both groups evaluated the impact of the disease using clinical and measurable criteria. These studies reported a proportion of disabling disease of 85% and 58%, respectively. Five years after the initial study on this topic, Yang et al. [6] presented a new report that settled the proportion of disabling at 80%. However, this last study used a slightly different definition of disabling disease. In fact, given the rapid evolution of disease control strategies, there is currently no consensus on the concept of disabling disease.
The definition of a strong and accurate prognosis model is a key step towards better disease control and a higher quality of life in CD patients. In this context, this study aimed to unveil the differential impact of several clinical and demographic variables on CD patients' risk of surgery, disabling and reoperation, using a decision tree-based strategy.

Derivation and validation cohort
This manuscript describes a multicentric retrospective cohort study including 1547 CD patients recruited from six Portuguese inflammatory bowel disease (IBD) specialist hospitals. Patients were included if they: 1) had a definitive diagnosis of CD; 2) had at least three years of follow-up; 3) had at least one consultation with a physician involved in this study during 2014 or 2015; and 4) had undergone at least one X-ray computed tomography (CT) or magnetic resonance imaging (MRI) examination during the follow-up. A hold-out strategy was followed to enable a generalized validation of the prognostic models: the cohort was randomly split into two groups. The first one comprised 80% of patients and constituted the derivation cohort; the remaining held-out 20% of patients constituted the validation cohort.
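The hold-out split described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `holdout_split`, the fixed seed and the use of integer patient identifiers are all assumptions. Note that a strict 80/20 rounding of 1547 patients gives 1238 and 309, whereas the study reports 1245 and 302 patients, so the original split was evidently not an exact rounding.

```python
import random


def holdout_split(patient_ids, derivation_fraction=0.8, seed=0):
    """Randomly split identifiers into a derivation and a validation set.

    The seed is an assumption made only so the sketch is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_derivation = round(len(ids) * derivation_fraction)
    return ids[:n_derivation], ids[n_derivation:]


# Hypothetical identifiers 0..1546 standing in for the 1547 patients.
derivation, validation = holdout_split(range(1547))
print(len(derivation), len(validation))
```

Every patient ends up in exactly one of the two groups, which is what allows the validation cohort to serve as an independent check of rules learned on the derivation cohort.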

Clinical and demographic variables
All data were retrieved from the GEDII (Grupo de Estudos de Doenças Inflamatórias Intestinais, the Portuguese IBD group) database [7] and included clinical and demographic variables, the dates on which the patients underwent bowel surgeries or started immunosuppression, and their classification regarding steroid dependence and refractoriness. Steroid dependence was defined as the inability to reduce steroids below the equivalent of 10 mg per day of prednisolone within 3 months of starting steroids without recurrent active disease, or disease relapse within 3 months of stopping steroids. Steroid resistance was defined as the presence of active disease despite a prednisolone dose of up to 0.75 mg/kg per day over a period of 4 weeks [8]. The presence and timing of immunosuppressive medication were stratified into four categories: 1) no pharmacological treatment; 2) pharmacological treatment both before and after the first surgery (started within the first month after surgery); 3) pharmacological treatment only after the surgery (starting more than 1 month after the first surgery); and 4) pharmacological treatment only before the first surgery.

Outcomes analyzed
Three different outcomes were analyzed in this study: 1. disabling disease, defined as a composite endpoint characterized by the presence of at least one of the following events: more than one abdominal surgery or two hospital admissions in the follow-up period; steroid dependence or steroid refractoriness; need for switching the first immunosuppression or anti-TNFα; and the appearance of new clinical events after the index episode (stenosis, anal disease or penetrating disease) [9]; 2. surgery, defined as the need for a surgical intervention (abdominal surgery only for CD); 3. reoperation, defined as the need for more than one surgical intervention (abdominal surgery only for CD).
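The three outcome definitions can be expressed as simple predicates over a patient record. The sketch below is illustrative only: the `PatientRecord` class and its field names are hypothetical, and the hospital admission criterion is read here as "at least two admissions", which is one possible interpretation of the wording above.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary over the follow-up period."""
    abdominal_surgeries: int          # abdominal surgeries for CD only
    hospital_admissions: int
    steroid_dependent: bool
    steroid_refractory: bool
    switched_first_therapy: bool      # first immunosuppressant or anti-TNF
    new_clinical_event: bool          # stenosis, anal or penetrating disease


def had_surgery(p: PatientRecord) -> bool:
    return p.abdominal_surgeries >= 1


def had_reoperation(p: PatientRecord) -> bool:
    return p.abdominal_surgeries > 1


def has_disabling_disease(p: PatientRecord) -> bool:
    """Composite endpoint: true if at least one criterion is met."""
    return (
        p.abdominal_surgeries > 1
        or p.hospital_admissions >= 2   # assumed reading of the criterion
        or p.steroid_dependent
        or p.steroid_refractory
        or p.switched_first_therapy
        or p.new_clinical_event
    )


example = PatientRecord(1, 0, False, True, False, False)
print(had_surgery(example), had_reoperation(example),
      has_disabling_disease(example))
```

Under these definitions every patient with a reoperation also has disabling disease, since more than one abdominal surgery is itself one of the disabling criteria.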

Statistical analyses
The results of the statistical analysis performed during this study are summarized into decision trees, which are a graphical representation of a possible combination of variables based on specific conditions. A decision tree uses a divide-and-conquer strategy to solve a decision problem, which works by dividing a complex problem into simpler problems, recursively applying the same strategy. The different solutions of the sub-problems are then combined in the form of a tree to produce a solution for the original problem. Each split in the tree (a node) is produced by specifying the percentage of the outcome present in each of the categories of one independent variable (the one that has the highest impact at that level), while the final leaves convey an estimate of the outcome for the subgroup of patients that recursively traversed the tree along that path. Therefore, each path in the tree (from root to leaf) represents an exclusive decision rule associated with an estimate for the outcome. Whereas most decision trees supporting clinical decision problems are expert-based and follow deductive reasoning, inductively learning the decision tree from data, e.g. using recursive partitioning, is a valid method to generate a data-driven decision model [10]. In order to determine the relationship between clinical/demographic factors and outcomes, decision tree classifiers were built from the derivation cohort by applying the CHAID algorithm [11], which is based on chi-squared significance testing with Bonferroni correction. The following variables were analyzed for the outcomes disabling and surgery: gender, smoking habits, age at diagnosis, disease location, behavior, upper tract involvement (L4) and perianal disease. The presence and timing of medical therapeutics were also included when considering the outcome reoperation. The decision tree parameters were validated on the independently held-out validation cohort.
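To make the split selection step more concrete, the sketch below picks the predictor with the smallest Bonferroni-adjusted chi-squared p-value at a single node. It is a deliberately simplified stand-in for CHAID, not the SPSS implementation used in the study: full CHAID also merges similar categories and derives its Bonferroni multiplier from the number of possible merges, whereas here the p-value is simply multiplied by the number of candidate predictors. All function names and the toy records are assumptions, and the data are synthetic.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (stdlib only)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_statistic(xs, ys):
    """Pearson chi-squared statistic and degrees of freedom for two factors."""
    n = len(xs)
    rows, cols, cells = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            stat += (cells[(r, c)] - expected) ** 2 / expected
    return stat, (len(rows) - 1) * (len(cols) - 1)


def best_split(records, predictors, outcome, alpha=0.05):
    """Return (predictor, adjusted p) for the most significant split, or None."""
    ys = [rec[outcome] for rec in records]
    best = None
    for name in predictors:
        stat, df = chi2_statistic([rec[name] for rec in records], ys)
        if df == 0:
            continue  # predictor or outcome is constant at this node
        p_adj = min(1.0, chi2_sf(stat, df) * len(predictors))
        if best is None or p_adj < best[1]:
            best = (name, p_adj)
    return best if best is not None and best[1] < alpha else None


# Synthetic toy data: behavior is associated with surgery, gender is not.
records = []
for behavior, gender, surgery, count in [
    ("B1", "F", 0, 8), ("B1", "M", 0, 8), ("B1", "F", 1, 2), ("B1", "M", 1, 2),
    ("B2", "F", 1, 8), ("B2", "M", 1, 8), ("B2", "F", 0, 2), ("B2", "M", 0, 2),
]:
    records += [{"behavior": behavior, "gender": gender,
                 "surgery": surgery}] * count

print(best_split(records, ["behavior", "gender"], "surgery"))
```

Applying `best_split` recursively to the subgroups created by each chosen variable, and stopping when it returns `None`, would yield a tree of the kind described above.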
The predictive quality of the leaves was evaluated on both cohorts estimating the proportion of the outcome for each of the derived rules.
To assess the discriminative ability of the trees for each outcome, specific cut-off values were chosen after analyzing the ROC curves in the derivation cohort. For disabling disease, a rule-in approach was applied aiming at a high positive predictive value (around 80%). For surgery and reoperation, a rule-out approach was applied aiming at a high negative predictive value (also around 80%). The derived trees (defining exclusive decision rules) were evaluated in both cohorts for the estimation of sensitivity, specificity, accuracy, predictive values, likelihood ratios and post-test odds.
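The performance parameters listed above all follow from a 2×2 table of test result against observed outcome. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not taken from the study. Post-test odds are obtained as pre-test odds multiplied by the corresponding likelihood ratio, which is algebraically the same as PPV/(1 − PPV) for a positive result and (1 − NPV)/NPV for a negative one.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic test metrics from a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    pre_test_odds = (tp + fn) / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "post_test_odds_positive": pre_test_odds * lr_positive,
        "post_test_odds_negative": pre_test_odds * lr_negative,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=81, fp=19, fn=40, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the PPV is 0.81 and the post-test odds for a positive result are 81/19, about 4.26, which shows how a predictive value of around 80% translates into post-test odds of roughly 4.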
Variables were described through absolute (n) and relative (%) frequencies. The comparison between the derivation and validation cohorts was made by applying a chi-square test. All reported p-values were two-sided, for a significance level of 5%. All data were arranged, processed and analyzed with SPSS v.24.0 (Statistical Package for Social Sciences).
The data collection that was used in this work has been approved by the Portuguese National Committee of Data Protection. This study was conducted according to the principles expressed in the Declaration of Helsinki.

Population baseline characteristics and measured outcomes
The derivation cohort consisted of 1245 CD patients, the majority of them female (54%), nonsmokers (53%) and diagnosed as young adults (17 to 40 years old, 69%) (Table 1). Disease location and behavior were classified according to the Montreal classification [12]: 16% had colonic disease and only 12% presented upper tract involvement. Concerning behavior, 46% had a non-stricturing/non-penetrating phenotype, whereas 26% had perianal disease. Disabling disease occurred in 68% of patients, 47% underwent bowel surgery, and 38% (among the latter) needed reoperation.

Disabling disease
Disabling disease occurred in 68% of the derivation cohort patients. The induced decision tree, computed from all independent variables with the exception of the presence and timing of pharmacological therapy (as this variable is itself involved in the disabling definition), resulted in a three-level model (Fig 1). The first level was defined by the behavior phenotype, a two-way split that separated the disabling risk of the B1 phenotype (54%) from that of the B2 and B3 phenotypes (80%). The second level consisted of the presence of perianal disease. Location and gender defined the third level for patients that had the B1 phenotype and the absence or presence of perianal disease, respectively. The set of rules defined by this tree can be summarized by the different risk levels (32% to 90%) reported in Table 2. Overall, patients with phenotypes B2 or B3 have a higher risk of disabling disease, while for phenotype B1, gender plays an important role, with female patients having a higher risk than males.

Surgery
The outcome surgery affected 47% of the derivation cohort patients. The induced decision tree, computed using the same variables as those used for the disabling decision tree, resulted in a two-level model (Fig 2). As for disabling, the first level was defined by disease behavior, although in this case a three-way split separated the surgery risk of all phenotypes: 16% for B1, 68% for B2, and 75% for B3. The second level of the model included information on upper tract involvement (L4) (for patients with the B2 phenotype) and gender (for patients with the B3 phenotype). The set of rules hereby defined represents different risk levels (17% to 81%), which are depicted in Table 2. Globally, patients with a B3 phenotype have a higher risk of surgery than those with a B1 behavior, and this risk is further aggravated in male B3 patients.

Reoperation
The rate of reoperation was defined among those patients that underwent bowel surgery: 38% required additional surgical interventions. The induced decision tree, computed using all variables described before and including the timing and presence of pharmacological therapeutics, resulted in a three-level model (Fig 3). The first level was defined by the presence and timing of the anti-TNF introduction, separating those that started anti-TNF more than one month after surgery (53% reoperation risk) from all the others (29% reoperation risk). The second level included behavior for the former (stratified in B1 vs. B2/B3) and presence and timing of AZA introduction for the latter (discriminating between patients that have either never been medicated or been medicated only before surgery and the remaining ones). The third level encompassed the disease location, separating L1 from L2 and L3. The defined set of rules resulted in different risk levels (13% to 58%) that are listed in Table 2. Overall, patients with a later introduction of pharmacological intervention (more than 1 month after initial surgery) had a worse outcome, i.e., they have a higher probability of undergoing more than one surgery during the disease course.

Model validation
The validation cohort consisted of 302 patients and was similar to the derivation cohort concerning the frequency of the analyzed variables and outcomes (Table 1). The risk of each outcome following the decision rules extracted from the trees in the derivation and validation cohorts is shown in S1 Fig. The proportion of the outcomes is similar in both cohorts, and their confidence intervals overlap, therefore attesting the robustness of the decision rules. Receiver operating characteristic (ROC) analyses were performed independently for the derivation and validation cohorts; the respective AUC values were similar and had overlapping 95% CIs for disabling and surgery, but not for reoperation (S2 Fig). Moreover, the AUCs were rather heterogeneous for the different outcomes: the derivation cohort presented an AUC of 72% for disabling, 80% for surgery and 69% for reoperation.
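Because a decision tree assigns every patient in a leaf the same risk, its ROC curve has only a few operating points and many tied scores. A minimal sketch of the AUC for such scores uses the rank-based (Mann-Whitney) formulation, counting ties as one half. The function name is an assumption; the risk values reuse the first-level surgery risks quoted earlier (16%, 68%, 75%), while the outcomes are invented for illustration.

```python
def auc(risks, outcomes):
    """Probability that a random positive case is ranked above a random
    negative case, with ties counted as one half (Mann-Whitney AUC)."""
    positives = [r for r, o in zip(risks, outcomes) if o]
    negatives = [r for r, o in zip(risks, outcomes) if not o]
    score = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(positives) * len(negatives))


# Predicted risks per patient and invented outcomes (1 = surgery).
risks = [0.16, 0.16, 0.68, 0.75, 0.75, 0.16]
outcomes = [0, 0, 1, 1, 0, 1]
print(round(auc(risks, outcomes), 3))
```

Computing this quantity separately on the derivation and validation cohorts corresponds to the independent ROC analyses described above.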
The derivation cohort ROC curves were used to establish cut-offs able to assess the likelihood of the occurrence of each outcome. Positive results were defined for risk levels above 75% concerning disabling disease, above 34% concerning surgery, and above 19% concerning reoperation. These cut-offs enabled the computation of a simplified set of rules that are listed in Table 2. The performance parameters of the chosen cut-offs considering each outcome are depicted in Table 3 for the derivation and the validation cohorts. Most of the 95% CIs overlapped between both cohorts, once again validating the initial model. Overall, the application of the cut-offs to the validation cohort resulted in 81% [74%-86%] PPV for disabling disease, 87% [80%-92%] NPV for surgery, and 67% [57%-75%] NPV for reoperation.
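Applying the cut-offs amounts to comparing the risk estimated by the tree with the threshold for the outcome of interest. The sketch below is a hypothetical helper, not one of the bedside tools from the study. The thresholds are those stated above, and the example risks are the first-level node risks quoted in the Results rather than the leaf-level values of Table 2.

```python
# Cut-offs stated in the text; a result is positive when the risk is above them.
CUTOFFS = {"disabling": 0.75, "surgery": 0.34, "reoperation": 0.19}


def is_positive(outcome, risk):
    """True if the tree-derived risk exceeds the cut-off for this outcome."""
    return risk > CUTOFFS[outcome]


# First-level node risks reported for the derivation cohort.
print(is_positive("disabling", 0.80))   # B2/B3 phenotypes
print(is_positive("disabling", 0.54))   # B1 phenotype
print(is_positive("surgery", 0.16))     # B1 phenotype
print(is_positive("surgery", 0.75))     # B3 phenotype
```

For disabling disease a positive result is used to rule the outcome in, whereas for surgery and reoperation it is the negative result (risk at or below the cut-off) that is used to rule the outcome out.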

Discussion
Given the impact and frequency of recurrences among CD patients, the development of prognostic models is a cornerstone to guide physicians in their therapeutic choices and to improve patients' well-being. The most important characteristics of these models are their user-friendliness and readability, allowing a fast and effortless readout during patient encounters or upon the need to decide on a therapeutic approach. This cohort presented a disabling rate of 68%, a similar value to that depicted in previous studies of different Portuguese cohorts [9,13]. However, other authors have reported higher disabling rates [4][5][6]. This difference is likely due to the fact that the disabling definition used in this study is stricter than that used in previous ones, namely by excluding the need for immunosuppression or anti-TNF as criteria. In our opinion, the introduction of pharmacological therapy does not qualify as disabling, given the top-down and accelerated step-up strategies currently followed to approach CD.
Surgery, in turn, affected 47% of the patients in the derivation cohort. This value is lower than that presented by Bernal et al. [2], which could be related to the fact that the cohort analyzed in that study was composed of older patients (data collected since 1955). The rapid evolution of CD therapeutics and the current strategies used to approach the disease, together with the fact that our cohort included patients that have been more recently diagnosed, may explain our lower surgery rate. Moreover, a recent meta-analysis has reported a 47% risk of surgery within 10 years after diagnosis [14], thereby supporting the results described here. Reoperation, on the other hand, affected 38% of the patients who underwent a first surgery, a rate similar to that presented in a recent meta-analysis that settled the 10-year risk of reoperation at 33% [15].
The results from this study are depicted in three decision trees that represent the risk for each of the outcomes described above, taking into account specific combinations of clinical and demographic variables. These decision trees were validated in an independent cohort by: a) comparing the proportion of the outcome in each derived rule; and b) comparing the diagnostic performance parameters of the chosen cut-offs. The variables used in the decision trees were chosen by applying the CHAID algorithm, which is similar to a chi-square test with post hoc Bonferroni correction. The final selection of variables was the same as that used in previous studies that employed different selection methods, therefore attesting the robustness of the computed decision trees [2,[4][5][6][16][17][18]. Decision tree analysis has some advantages over other methods that are more widely used (e.g. logistic regression). An undeniable strength of this method is its graphical representation, which allows a quick and intuitive reading. In addition, decision trees are rather flexible in that they do not assume any data distribution. Another advantage is the attribute selection, which restricts the variables in the model to those that are non-redundant. The interpretability of the trees is also one of their strong points: complex decisions can be approximated by simple or local decisions. Finally, decision trees allow an easy comparison of patients' subgroups, as decision rules can be created directly from the tree. Overall, patients with B2 or B3 phenotype had a higher proportion of disabling disease and surgery, while patients with later introduction of pharmacological interventions (more than one month after initial surgery) had a higher proportion of reoperation. Although analyzed in a retrospective fashion and using retrospectively defined outcomes, this study presents an analysis of a large multicentric cohort formally validated by the application of the derived results in a validation cohort.
In conclusion, we have shown that variables such as disease behavior, upper gastrointestinal involvement, gender, perianal disease, location and medical therapeutics affect the risk of disabling disease, surgery and reoperation in CD patients. Moreover, these variables impact the aforementioned outcomes at different levels, having different weights in sub-groups of patients with different combinations of variables. Our results are represented in three graphical and user-friendly bedside tools that can be used by physicians to assess the risk of disabling, surgery and reoperation in CD patients, therefore supporting the decision-making process regarding therapeutic strategies. A disabling risk above 75% allows the prediction of disabling events with a PPV of 81% and an odds post-test positivity of 4.24, whereas a surgery risk below 34% allows the exclusion of future surgeries with an NPV of 87% and an odds post-test negativity of 0.15. Reoperation was the hardest outcome to predict, although a risk below 19% could be useful for excluding future events (NPV: 67% and odds post-test negativity: 0.5).