Confounding by indication of the safety of de-escalation in community-acquired pneumonia: A simulation study embedded in a prospective cohort

Observational studies have demonstrated that de-escalation of antimicrobial therapy is independently associated with lower mortality. This most probably results from confounding by indication. Reaching clinical stability is associated with the decision to de-escalate and with survival. However, studies rarely adjust for this confounder. We quantified the potential confounding effect of clinical stability on the estimated impact of de-escalation on mortality in patients with community-acquired pneumonia. We used data from the Community-Acquired Pneumonia immunization Trial in Adults (CAPiTA). The primary outcome was 30-day mortality. We performed Cox proportional-hazards regression with de-escalation as a time-dependent variable and adjusted for baseline characteristics using propensity scores. The potential impact of unmeasured confounding was quantified by simulating a variable representing clinical stability on day three, using data on prevalence and associations with mortality from the literature. Of 1,536 included patients, 257 (16.7%) were de-escalated, 123 (8.0%) were escalated and in 1,156 (75.3%) the antibiotic spectrum remained unchanged. Crude 30-day mortality was 3.5% (9/257) and 10.9% (107/986) in the de-escalation and continuation groups, respectively. The adjusted hazard ratio of de-escalation for 30-day mortality (compared to patients with unchanged coverage), without adjustment for clinical stability, was 0.39 (95% CI: 0.19–0.79). If 90% to 100% of de-escalated patients were clinically stable on day three, the fully adjusted hazard ratio would be 0.56 (95% CI: 0.27–1.12) to 1.04 (95% CI: 0.49–2.23), respectively. The simulated confounder was substantially stronger than any of the baseline confounders in our dataset. Quantifying the effects of de-escalation on patient outcomes without proper adjustment for clinical stability results in strong negative bias. This study suggests that the effect of de-escalation on mortality needs further well-designed prospective research to determine the effect size more accurately.


Introduction
The aim of antimicrobial stewardship is to improve antibiotic use without compromising clinical outcomes at the individual level [1]. De-escalation of empirical antimicrobial therapy is highly recommended in antimicrobial stewardship programs. In a recent systematic review, de-escalation of empirical antimicrobial therapy was associated with a 56% (95% CI 34%-70%) relative risk reduction in mortality [2]. Although it seems a safe strategy, most studies evaluating de-escalation and reporting mortality were observational, with a high risk of bias and high clinical heterogeneity, and were not sufficiently powered to demonstrate safety with respect to mortality. To the best of our knowledge, there are two randomized trials evaluating de-escalation, and these trials did not show a survival benefit for de-escalation [3,4]. A possible physiological mechanism for decreased mortality after de-escalation could be that narrow-spectrum antibiotics are more effective, or that continuing unnecessary broad-spectrum antibiotics causes more (or more severe) side effects. However, it seems highly unlikely that either mechanism would lead to increased mortality among patients who continue broad-spectrum therapy. Therefore, the association between de-escalation and improved survival in observational studies is most likely biased by unmeasured confounding by indication. Confounding by indication is present if the indication for the intervention (here: de-escalation of empirical antimicrobial therapy) is also a prognostic factor for the outcome (mortality). De-escalation is usually only performed when clinical stability is reached in the first days after starting antimicrobial therapy, and clinical stability is also a strong prognostic factor for patient outcome. However, hardly any of the observational studies adjust for clinical stability during admission. In the aforementioned systematic review [2], only one of nineteen observational studies corrected for this confounder [5]. Possibly the authors of the other studies did not consider this to be an important confounder, or they lacked data on clinical stability during admission. Not taking clinical stability into account causes a negative bias (towards a protective effect). However, the magnitude of this bias has never been established. The aim of the current study was to quantify the potential effect of unmeasured confounding by indication due to clinical stability on the association between de-escalation and patient outcome in patients with community-acquired pneumonia.

Data collection
We used data from the Community-Acquired Pneumonia immunization Trial in Adults (CAPiTA) [6]. This study was a parallel-group, randomized, placebo-controlled, double-blind trial to assess the efficacy of a 13-valent pneumococcal conjugate vaccine. The study included 84,496 immunocompetent community-dwelling adults, 65 years of age and above. Surveillance for suspected pneumonia was performed in 58 hospitals in the Netherlands in the period September 2008-August 2013. The study was approved by the Central Committee on Research Involving Human Subjects and by the Ministry of Health, Welfare and Sport in the Netherlands, and all participants provided written informed consent. For the current analysis, patients who received antibiotics on the day of admission and who were admitted to a non-intensive care unit (non-ICU) ward with a working diagnosis of CAP were included. We think the effect of de-escalation on mortality in the ICU population differs from that in the non-ICU population, and including ICU patients would result in a more heterogeneous population. Moreover, factors such as culture results and clinical stability may play a very different role in the ICU population. Patients were excluded from the current analysis if they participated in a simultaneously running interventional trial evaluating different antibiotic regimens for CAP [7], since this trial interfered with the choice of empirical antibiotic treatment, or if they died within 24 hours of admission, because these patients were not eligible for de-escalation.

Definitions
To define de-escalation, antibiotics were ranked based on their spectrum of activity against CAP pathogens, from rank 1 ('narrow-spectrum') to rank 3 ('extended / restricted spectrum') antibiotics ( Table 1). The ranking was performed by a team of experts: two clinical microbiologists (CHEB, MJMB), one infectious diseases specialist (JJO), two clinical pharmacists (IvH, PDvdL) and one epidemiologist (CHvW). In the Dutch setting, penicillin and amoxicillin are in general classified as narrow-spectrum antibiotics. For mild CAP in primary care and moderate-severe CAP (non-ICU ward) these antibiotics are first choice treatment with tetracyclines as an alternative in case of allergies [8]. Sweden and Denmark have similar policies [9,10]. These antibiotics were classified as rank 1. Antibiotics with a 'restricted' label, advised by the national guide for antibiotic stewardship teams were classified as rank 3 [11]. All other regimens were classified as rank 2. In patients with combination therapy, the highest rank of any individual antibiotic was counted, except for combination therapy of β-lactam therapy and a macrolide, which was considered as rank 3, as for respiratory pathogens this combination results in a much broader spectrum than any of the individual antibiotics. Therapy adjustment was defined as the first switch from empirical therapy to another antimicrobial class during hospitalization, independent of the reason for switching. De-escalation and escalation were defined as a change to a lower rank or a higher rank, respectively. Continued regimens or adjustments to an equivalent rank were defined as continuation.
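The classification rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Drug` type, the function names and the example drugs are hypothetical, the actual drug-to-rank mapping is the one in Table 1 (not reproduced here), and the rank 2 assigned to the example macrolide is an assumption made purely so the example runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drug:
    name: str
    drug_class: str  # e.g. "beta-lactam", "macrolide"
    rank: int        # 1 = narrow spectrum, 3 = extended / restricted spectrum


def regimen_rank(drugs):
    """Rank of a regimen: the highest individual rank, except that a
    beta-lactam combined with a macrolide always counts as rank 3."""
    classes = {d.drug_class for d in drugs}
    if {"beta-lactam", "macrolide"} <= classes:
        return 3
    return max(d.rank for d in drugs)


def classify_adjustment(empirical, adjusted):
    """Compare the regimen rank before and after the first switch."""
    before, after = regimen_rank(empirical), regimen_rank(adjusted)
    if after < before:
        return "de-escalation"
    if after > before:
        return "escalation"
    return "continuation"


# Amoxicillin is rank 1 according to the text; the macrolide rank is assumed.
amoxicillin = Drug("amoxicillin", "beta-lactam", 1)
example_macrolide = Drug("example macrolide", "macrolide", 2)

print(classify_adjustment([amoxicillin, example_macrolide], [amoxicillin]))
```

In this sketch, stopping the macrolide in a β-lactam plus macrolide combination moves the regimen from rank 3 to rank 1 and is therefore counted as de-escalation.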

Statistical analysis
Descriptive statistics were used to describe clinical practice of de-escalation. Patient characteristics were compared between patients with and without de-escalation using Student's t test or χ² tests. Frequencies of de-escalation, escalation and continuation were described visually and numerically. We tested the proportional hazards assumption for a follow-up period of 90 days, which revealed that the hazards were proportional up to 30 days and not thereafter (see Fig 1). Therefore we used 30-day mortality as the outcome. To determine the effect of de-escalation on clinical outcome we excluded patients starting in rank 1, since their therapy cannot be de-escalated. We performed Cox proportional hazards regression with de-escalation as a time-dependent variable and adjusted for baseline characteristics using propensity score analyses. Propensity scores were calculated from a logistic regression model to estimate a patient's propensity for de-escalation and included the variables: age, gender, smoking status, history of diabetes mellitus, history of chronic pulmonary disease, antibiotic use two weeks before admission, rank on day 1, season of admission, weekday vs. weekend day (the latter defined as Saturday or Sunday), culture results and all variables from the Pneumonia Severity Index (PSI) score (nursing home resident, comorbidities (neoplastic disease, liver disease history, congestive heart failure history, cerebrovascular disease history, renal disease history), altered mental status, respiratory rate, systolic blood pressure, temperature, heart rate, pH, blood urea nitrogen, sodium, glucose, hematocrit, partial pressure of oxygen and pleural effusion on x-ray). Propensity scores were then included as a continuous variable in the Cox proportional hazards regression model. Patients with escalation of therapy were censored at the time of escalation so that only the days before escalation contributed to the analysis. Other patients were censored at day 30.
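To illustrate what treating de-escalation as a time-dependent variable means in practice, the following Python sketch splits each patient's follow-up into intervals in the counting-process (start, stop] layout that time-varying Cox models typically expect. It is a minimal sketch, not the code used in the study: the function name, argument names and day-based time scale are assumptions, and the Cox fit and propensity score are left out.

```python
def counting_process_rows(followup_days, died, deesc_day=None, esc_day=None, horizon=30):
    """Return (start, stop, deescalated, event) rows for one patient.

    followup_days: day of death or last follow-up since admission
    died:          True if the patient died at followup_days
    deesc_day:     day of de-escalation, or None
    esc_day:       day of escalation, or None (patient is censored there)
    horizon:       administrative censoring at day 30
    """
    stop = min(followup_days, horizon)
    event = 1 if (died and followup_days <= horizon) else 0

    # Escalation censors the patient: only earlier days contribute.
    if esc_day is not None and esc_day < stop:
        stop, event = esc_day, 0

    # Never de-escalated within the observed window: one unexposed interval.
    if deesc_day is None or deesc_day >= stop:
        return [(0, stop, 0, event)]
    if deesc_day <= 0:
        return [(0, stop, 1, event)]

    # Time before the switch counts as unexposed, time after as exposed.
    return [(0, deesc_day, 0, 0), (deesc_day, stop, 1, event)]


# A patient de-escalated on day 3 who died on day 12 contributes two intervals.
for row in counting_process_rows(12, True, deesc_day=3):
    print(row)
```

Attributing the days before the switch to the unexposed state is what prevents the survival time that a patient must accumulate before de-escalation from being counted in favor of de-escalation.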

Effect of confounding by indication
To quantify the effect of unmeasured confounding by indication we simulated clinical stability during hospital admission as a new confounder. We defined clinical stability during admission as a binary variable evaluated at 72 hours, because clinical stability in patients with CAP is often reached within 48 hours and therapy is often evaluated after three days (with culture results also available) [8,12,13]. The strength of any given confounder is determined by the following three parameters: (1) the prevalence in the group with the determinant (de-escalation), (2) the prevalence in the group without the determinant (continuation) and (3) the association with patient outcome (mortality). For the simulation of clinical stability at 72 hours we reviewed the literature for reasonable assumptions for the three parameters. We assumed that 80% of CAP patients admitted to a non-ICU ward will be clinically stable at day three, based on three randomized controlled trials evaluating intravenous to oral switches in patients [14][15][16]. As the prevalence of clinical stability in the total study population is a weighted average of the prevalence of clinical stability in the de-escalation and the continuation group, the prevalence in one group can be calculated from the prevalence in the other group. We assumed a high prevalence of clinical stability in the de-escalation group, so we varied the prevalence from 80% to 100%, with corresponding calculated prevalences in the continuation group between 80% and 75%, to arrive at the overall prevalence of 80%. The assumed crude odds ratio (OR) between clinical stability at 72 hours and 30-day mortality was 0.14, based on unpublished data of a randomized controlled trial evaluating the effect of adjunct prednisone therapy versus placebo on time to clinical stability for patients with CAP (courtesy of Dr. Blum) [17]. In this trial, clinical stability was measured every 12 hours during hospital stay and was defined as time (days) until stable normalized vital signs for ≥ 24 hours: temperature ≤ 37.8°C without antipyretic agents, heart rate ≤ 100 beats per minute, spontaneous respiratory rate ≤ 24 per minute, systolic blood pressure ≥ 90 mmHg (≥ 100 mmHg for patients diagnosed with hypertension) without vasopressor support, mental status back to level before CAP, oxygenation on room air or oxygen therapy (PaO2 ≥ 60 mmHg or pulse oximetry ≥ 90%, or PaO2 or pulse oximetry measurement back to baseline for patients with chronic hypoxemia or chronic oxygen therapy) [17]. To simulate the confounder of clinical stability at 72 hours in our dataset, we randomly assigned the presence and the absence of clinical stability such that the aforementioned assumptions about the three parameters were met. Subsequently, the HR of de-escalation on mortality adjusted for clinical stability was determined by including clinical stability as an extra covariate in the propensity score adjusted model. The robustness of the resulting adjusted HRs was tested by repeating the random assignment three times with a different random seed, which verified that the same adjusted HRs were obtained. Finally, we plotted the crude HR, the adjusted HR without clinical stability, and the resulting HRs for different prevalences of clinical stability.
We also quantified the strength of each confounder as the change in HR between the models with and without that confounder. For the simulated confounder (clinical stability) we used the corresponding adjusted HR when it was added to the model with prevalences of 90% and 100%, respectively, in the de-escalation group. Data analysis was performed using SPSS for Windows, v.25.0 (SPSS, Chicago, IL, USA) and R v.3.4.3 http://www.R-projects.org/.
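The simulation set-up described in this section can be sketched as follows. The weighted-average step follows directly from the text; the group sizes (257 de-escalated, 986 continued) are taken from the denominators of the crude mortality figures. The remaining steps are assumptions made for illustration only: the paper does not state how the odds ratio of 0.14 was imposed, so this sketch applies it within each treatment group and solves for the prevalence of stability among survivors and non-survivors by bisection. All function names are hypothetical.

```python
import random

N_DEESC, N_CONT = 257, 986   # group sizes from the crude mortality denominators
P_OVERALL = 0.80             # assumed overall prevalence of stability at day 3
ODDS_RATIO = 0.14            # assumed OR between stability and 30-day mortality


def continuation_prevalence(p_deesc, n_deesc=N_DEESC, n_cont=N_CONT, p_overall=P_OVERALL):
    """Prevalence in the continuation group implied by the weighted average."""
    stable_total = p_overall * (n_deesc + n_cont)
    return (stable_total - p_deesc * n_deesc) / n_cont


def prevalence_by_outcome(p_group, mortality, odds_ratio=ODDS_RATIO, tol=1e-10):
    """Return (P(stable | survived), P(stable | died)) matching the group
    prevalence and the odds ratio. Assumption: the OR holds within the group."""
    if p_group >= 1.0:
        return 1.0, 1.0
    if p_group <= 0.0:
        return 0.0, 0.0

    def p_died(p_surv):
        odds = odds_ratio * p_surv / (1.0 - p_surv)
        return odds / (1.0 + odds)

    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:          # the marginal prevalence increases with p_surv
        mid = (lo + hi) / 2.0
        marginal = mortality * p_died(mid) + (1.0 - mortality) * mid
        if marginal < p_group:
            lo = mid
        else:
            hi = mid
    p_surv = (lo + hi) / 2.0
    return p_surv, p_died(p_surv)


def simulate_stability(died_flags, p_group, seed=1):
    """Randomly assign stability (True/False) to every patient in one group."""
    rng = random.Random(seed)
    mortality = sum(died_flags) / len(died_flags)
    p_surv, p_dead = prevalence_by_outcome(p_group, mortality)
    return [rng.random() < (p_dead if died else p_surv) for died in died_flags]


for p in (0.80, 0.90, 1.00):
    print(p, round(continuation_prevalence(p), 3))
```

Independent random draws reproduce the target prevalences only approximately in a finite sample, which is one reason to repeat the assignment with several seeds, as described above. The simulated variable would then be added as a covariate to the propensity score adjusted Cox model.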

Effect of confounding by indication due to clinical stability
The results of the simulation analysis are depicted in Fig 4. Not adjusting for clinical stability yields the aforementioned HR of 0.39. When using the assumed odds ratio of 0.14 between clinical stability at 72 hours and 30-day mortality, the adjusted HR for de-escalation gradually increased to 1.04 as the prevalence of clinical stability in patients with de-escalation increased to 100%. The upper boundary of the 95% confidence interval exceeded 1 when the prevalence of clinical stability in the de-escalated patients was ≥ 87%. Determination of the strength of the simulated confounder, clinical stability, revealed that it was substantially stronger than any of the observed confounders in our dataset (Table 3).
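As a reading aid for these estimates, the short Python sketch below converts each reported HR into a percentage change in hazard, derives the standard error of log(HR) implied by the confidence limits, and checks whether the interval includes 1. The HRs and confidence limits are the values reported in this paper; the function name and the assumption of a symmetric (Wald-type) interval on the log scale are ours.

```python
import math

# (HR, lower 95% limit, upper 95% limit) as reported in this paper.
SCENARIOS = {
    "not adjusted for clinical stability": (0.39, 0.19, 0.79),
    "90% of de-escalated patients stable": (0.56, 0.27, 1.12),
    "100% of de-escalated patients stable": (1.04, 0.49, 2.23),
}


def summarize(hr, lower, upper, z=1.96):
    """Summaries that follow from an HR and its 95% CI on the log scale."""
    return {
        "hazard_change_pct": round((hr - 1.0) * 100),
        "se_log_hr": (math.log(upper) - math.log(lower)) / (2 * z),
        "ci_midpoint": math.sqrt(lower * upper),  # geometric mean of the limits
        "ci_includes_1": lower <= 1.0 <= upper,
    }


for label, (hr, lower, upper) in SCENARIOS.items():
    print(label, summarize(hr, lower, upper))
```

For a Wald-type interval the geometric mean of the limits should be close to the point estimate, which gives a quick consistency check on the rounded values.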

Discussion
In this observational study of patients hospitalized with CAP, de-escalation of antimicrobial therapy was associated with a 61% lower hazard of day-30 mortality after adjustment for observed baseline confounders. However, our simulations have demonstrated that clinical stability at 72 hours, which was not measured in our study, could fully explain this effect under reasonable, literature-based assumptions. Based on these findings we conclude that the effects of de-escalation on patient outcome cannot be reliably quantified without adjustment for clinical stability and that the true effect of de-escalation on mortality needs to be quantified by a well-designed prospective study.
De-escalation occurred in 16.7% of the patients. During the enrolment period of our study antibiotic stewardship was not yet well established. Therefore, we expect the proportion of de-escalation in current practice to be larger. In our population, most patients continued the antibiotic regimen, even though the majority should be clinically stable based on data from the literature. In the absence of antibiotic stewardship, physicians might be more inclined to continue the regimen when it appears to be effective.
Fig 4. The line reflects the hazard ratios for 30-day mortality (based on Cox proportional hazards regression analysis adjusted with propensity scores) with 95% confidence interval (shaded area) for different prevalences of clinical stability in patients with and without de-escalation (horizontal axis). At the left side the weighted average of the two proportions is fixed at 80%, which reflects the adjusted hazard ratio without adjustment for clinical stability. The dashed line represents a HR of 1. The HR rises from 0.39 to 1.04 when the prevalence of clinical stability increases to 100% in the de-escalated group. From a prevalence of clinical stability of 87% and above in the de-escalated group, the upper limit of the 95% confidence interval exceeded 1. For example, a prevalence of 90% in de-escalated patients results in an adjusted HR of 0.56 (95% CI: 0.27-1.12) and a prevalence of 100% results in a HR of 1.04 (95% CI: 0.49-2.23). https://doi.org/10.1371/journal.pone.0218062.g004
In a systematic review including different infectious diseases, de-escalation of empirical antimicrobial therapy was associated with a large reduction in mortality [2]. Although our study only included CAP patients, we expect that the mechanism of bias applies to all infectious diseases for which empirical broad-spectrum antibiotic treatment is common practice. This bias, introduced by not including clinical stability during admission, applies to all previous studies evaluating de-escalation in patients with CAP hospitalized at a non-ICU ward [18][19][20][21][22]. To the best of our knowledge, there are four observational studies on the association between de-escalation and mortality that adjusted for clinical stability or a similar time-varying confounder. In the first study, by Joung et al., patients with intensive care unit-acquired pneumonia were included and clinical stability during admission was measured with two scores, APACHE II (Acute Physiology and Chronic Health Evaluation II) and modified CPIS (clinical pulmonary infection score), both measured on day 5 after development of pneumonia. Both a high APACHE II score (≥24) on day 5 and a high CPIS (≥10) on day 5 were associated with an increased 30-day pneumonia-related mortality. After including these confounders, alongside other baseline covariates, in the multivariable analysis, the aHR for the association between no de-escalation of antibiotics and 30-day mortality was 3.988 (95% CI 0.047-6.985) [23].
The objective of that study was to determine independent risk factors for mortality; hence the focus of model building was not on selecting appropriate confounders, and one should be careful about interpreting the results as a causal effect. In the second study, by Garnacho-Montero et al., patients admitted to the ICU with severe sepsis or septic shock were included and clinical stability during admission was measured as the Sequential Organ Failure Assessment (SOFA) score on the day when culture results were available. A high SOFA score on that day was associated with a higher in-hospital mortality. When this covariate was included alongside other covariates, the aOR for the association between de-escalation and in-hospital mortality was 0.55 (95% CI 0.32-0.98, p = 0.022) [5]. In the third study, by Montravers et al., patients admitted to the ICU with health care-associated intra-abdominal infection were included and clinical stability during admission was measured by SOFA score. Here a decreased SOFA score at day three after initiation of empirical antimicrobial therapy was associated with a lower 28-day mortality. Including this covariate alongside other covariates in the analysis resulted in an aHR of 0.566 (95% CI 0.2503-1.278, p = 0.171) for the association between de-escalation and 28-day mortality. However, this multivariate analysis also aimed to identify risk factors for 28-day mortality rather than to select appropriate confounders [24]. The fourth study, by Lee et al., included patients with community-onset monomicrobial Escherichia coli, Klebsiella species and Proteus mirabilis bacteremia treated empirically with broad-spectrum beta-lactams, and clinical stability during admission was measured by the Pitt bacteremia score. A high Pitt bacteremia score (≥4) at day three was associated with 4-week mortality.
After propensity score matching there was no statistically significant difference between the de-escalation and no-switch groups in 2-week, 4-week and 8-week mortality [25]. Comparison of the studies is difficult because different criteria for de-escalation and different definitions of disease severity during admission were used, and different populations were studied. The first three studies included ICU patients, and in this setting registering scores representing clinical stability is part of routine care, which makes it more feasible to include such parameters in observational studies. Although the definition of clinical stability for CAP as provided by Halm et al. [13] is widely accepted, in clinical practice patients can be declared stable based on other criteria (e.g. feeling well, eating and drinking) even if they do not meet the formal criteria. A critique of the aforementioned studies is that all used de-escalation as a fixed variable. However, de-escalation is performed on a different day for each individual and should be analyzed as a time-dependent variable; otherwise it introduces immortal time bias [26]. It is recommended to include sensitivity analyses to estimate the potential impact of unmeasured confounding in every non-randomized study on causal associations [27]. However, for observational studies evaluating de-escalation of antimicrobial therapy this has never been done before. To strengthen our sensitivity analysis we based our assumptions about the prevalence of clinical stability and its association with mortality on existing high-quality data. We further assumed that physicians will only de-escalate when a patient is clinically stable or to initiate targeted treatment for an identified pathogen. In the latter case, we still expect that most patients in whom the physician decides to de-escalate will be clinically stable. We therefore expect that at least 90% and probably close to 100% of de-escalated patients will be clinically stable on day three.
Strengths of our study include the pragmatic approach of using prospectively collected data from a large population of patients treated with empiric antibiotics for a working diagnosis of CAP. This included patients without an identified pathogen, which increases the generalizability of our study results. The effect of de-escalation on mortality may differ from one country to another, or even between hospitals within one country, depending on local antibiotic practices. However, we think that the confounding effect of clinical stability is generalizable to other countries and also applies to other severe bacterial infections, because clinical stability will always be a major determinant of de-escalation. A limitation of our study is that we had to exclude 165 patients due to participation in a concurrent trial, which could result in selection bias. However, this was a small number of patients and participation was hospital dependent, so the influence of selection bias will be small. Another limitation of our study was that we had to make assumptions for the prevalence of clinical stability in the de-escalated and continued groups and for the association between clinical stability and day-30 mortality. These were derived from different study populations, all representing CAP patients hospitalized in a non-ICU ward. Our findings suggest that adjustment for clinical stability will result in a non-significant effect of de-escalation on mortality, which would be biologically plausible. Our findings also demonstrate that the individual baseline confounders, as measured in our study, are poorly predictive of de-escalation, indicating that their correlation with clinical stability is probably also weak.
Another simplification in our analysis was that we modelled clinical stability as a binary variable on day three, which does not represent reality well. For future studies we recommend measuring clinical stability repeatedly over time, as a time-varying confounder and on a continuous scale. Finally, we did not have information on the quality of the sputum samples in which the pathogen was identified. Quality of sputum samples is also a prognostic factor for de-escalation of empirical antimicrobial therapy; however, we could not correct for this in our model.
The results of our analysis may also suggest that the possibility of clinically relevant harm due to de-escalation cannot be excluded, as the upper boundary of the 95% confidence interval for the HR was above 2 in the most extreme scenario. The scientific evidence for the safety of de-escalation is de facto based on two RCTs. However, neither RCT was powered for mortality. The first, a prospective, open-label, randomized clinical trial, included patients with hospital-acquired pneumonia in an ICU without inclusion criteria regarding baseline clinical stability. After randomization, de-escalation was performed three to five days after initiation of empirical treatment, when culture results were available. For the association between de-escalation and 14-day mortality the RR was 0.67 (95% CI 0.31-1.43), for 28-day mortality the RR was 0.75 (95% CI 0.46-1.23) and for in-hospital mortality the RR was 0.64 (95% CI 0.37-1.13) (calculated by the authors based on the data reported in [3]). The other, a multicenter non-blinded randomized non-inferiority trial, evaluated the safety of de-escalation with 90-day mortality as a secondary outcome in patients with severe sepsis admitted to an ICU without inclusion criteria regarding baseline clinical stability. After randomization, de-escalation was performed after culture results were available (IQR 2-4 days after initiation of empirical therapy). In the de-escalation group 18 of 59 patients (31%) died within 90 days, compared to 13 of 57 patients (23%) in the continuation group, yielding an adjusted HR of 1.7 (95% CI 0.79-3.49, p = 0.18). Although not statistically significant, this trend may indicate potential harm rather than improved outcome due to de-escalation [4]. As we have demonstrated, observational studies performed so far do not contribute to determining the safety of de-escalation because the amount of confounding by indication due to clinical stability is insurmountable.
As appropriate adjustment of confounding by indication was not performed in the majority of the published observational studies on de-escalation, the ones that adjusted for clinical stability had other important limitations, and only two small RCTs have been performed, we conclude that the safety of this widely propagated antibiotic stewardship intervention should be studied more appropriately. We recommend that future observational studies addressing this research question include clinical stability in the analysis, preferably as a time-varying variable because clinical stability may change over time. It has been suggested that in the case of time-varying confounders a marginal structural model is appropriate [28]. Ultimately, although more expensive, de-escalation would be optimally studied in a pragmatic randomized controlled trial.
To conclude, the previously observed protective effect of de-escalation on mortality is likely due to confounding by unobserved factors such as clinical stability during admission. This study suggests the effect of de-escalation on mortality needs further prospective research to determine effect size more accurately.