Predictive accuracy of physicians’ estimates of outcome after severe stroke

Introduction End-of-life decisions after stroke should be guided by accurate estimates of the patient’s prognosis. We assessed the accuracy of physicians’ estimates regarding mortality, functional outcome, and quality of life in patients with severe stroke. Methods Treating physicians predicted mortality, functional outcome (modified Rankin scale (mRS)), and quality of life (visual analogue scale (VAS)) at six months in patients with major disabling stroke who had a Barthel Index ≤6 (of 20) at day four. Unfavorable functional outcome was defined as mRS >3, non-satisfactory quality of life as VAS <60. Patients were followed up at six months after stroke. We compared physicians’ estimates with actual outcomes. Results Sixty patients were included, with a mean age of 72 years. Of fifteen patients who were predicted to die, one actually survived at six months (positive predictive value (PPV), 0.93; 95% CI, 0.66–0.99). Of thirty patients who survived, one was predicted to die (false positive rate (FPR), 0.03; 95% CI, 0.00–0.20). Of forty-six patients who were predicted to have an unfavorable outcome, four had a favorable outcome (PPV, 0.93; 95% CI, 0.81–0.98; FPR, 0.30; 95% CI, 0.08–0.65). Prediction of non-satisfactory quality of life was less accurate (PPV, 0.63; 95% CI, 0.26–0.90; FPR, 0.18; 95% CI, 0.05–0.44). Conclusions In patients with severe stroke, treating physicians’ estimation of the risk of mortality or unfavorable functional outcome at six months is relatively inaccurate. Prediction of quality of life is even more imprecise.

Introduction
More than half of the patients with acute stroke are dead or disabled after two years [1]. In US studies, most in-hospital deaths of these patients occurred after a decision to withhold or withdraw life-sustaining therapies [2,3]. These decisions usually evolve from complex discussions, in which accurate predictions of prognosis are crucial.
A wide range of prognostic models have been developed to aid prognostication after stroke, but none of these models is sufficiently accurate in the prediction of mortality or poor functional outcome to serve as the sole basis of decisions to limit treatment [4]. In addition, the large majority of these models are based on factors collected in the first hours after stroke onset, whereas treatment restrictions are often considered when there is no meaningful improvement during the first days or weeks [4]. Prognostication based on a physician's estimate rather than on prognostic models can take into account factors that are usually not included in prognostic models, such as complications of stroke, previous comorbidities, changes in functional status over the course of hospitalization and estimated quality of life. However, the accuracy of prognostic estimates regarding mortality, functional outcome, and quality of life is uncertain. If the accuracy of physicians' estimates is poor, physicians should be aware of prognostic uncertainties and their consequences when discussing end-of-life decisions.
In this study, we assessed the accuracy of treating physicians' estimates in predicting mortality, functional outcome, and quality of life at six months in patients with severe disability at four days after stroke, and interpreted these findings in the context of the end-of-life decision-making process.

Patient selection
We studied patients included in the Advance Directive And Proxy opinions in acute sTroke (ADAPT) cohort, a prospective, two-center cohort study [5]. Consecutive patients admitted to the stroke unit with major disability, defined as Barthel Index (BI) ≤6 (out of 20) [6] at day four after ischemic stroke or intracerebral hemorrhage, were eligible for participation. We restricted ourselves to this population because these are the patients in whom treatment restrictions are most often considered. Patients were included as soon as possible from four days after stroke and could be included until discharge.
Patients with a subarachnoid hemorrhage and patients without an available legal representative were excluded from the study. Patients were included between September 2012 and December 2013 in the University Medical Center Utrecht, a tertiary referral hospital, and between January and December 2013 in the St. Elisabeth hospital in Tilburg, a large regional teaching hospital, both in The Netherlands.
The study was approved by the institutional review board of the University Medical Center Utrecht and of the St. Elisabeth hospital. Written informed consent was obtained from each patient or a legal representative.

Data collection
We collected information on patient characteristics (age, sex), type of stroke (ischemic stroke or intracerebral hemorrhage), stroke severity on admission (by means of the National Institutes of Health Stroke Scale (NIHSS)) [7], and pre-stroke comorbidity using the Charlson Comorbidity Index (CCI) [8].

Physicians' estimates
The treating physicians were neurology residents assigned to the daily care of patients during admission. These residents were supervised in the daily care of patients by experienced stroke neurologists. Prognosis and the need to institute end-of-life decisions were discussed by the neurologist and the resident on a daily basis. Immediately after patient inclusion, the neurology resident predicted outcome at six months by completing a questionnaire on mortality, functional outcome (as measured with the modified Rankin Scale (mRS)) [9], and quality of life (as measured with a visual analogue scale (VAS)) [10]. Scores on the mRS range from 0 (no symptoms) to 5 (severe disability); for statistical purposes, death was given a score of 6. The VAS was a vertical line of 10 centimeters with a '☺' at the top demarcating the best possible quality of life and a '☹' at the lower end for the worst possible quality of life. Scores were calculated as the indicated level in (centimeters/10) × 100. Quality of life was considered acceptable if VAS ≥60 [11]. No formal prediction models were used in the daily care of the patients, nor in the estimation of outcomes.

Follow-up
A single trained investigator (FASdK), blinded to the physicians' predictions, visited each patient and caregiver at six months (± six weeks) after stroke to assess functional outcome (as measured with the mRS and BI) and quality of life (as measured with a VAS and with the Medical Outcomes Study 36-item short-form health survey (SF-36)). For the SF-36, two summary scores were calculated as a representation of physical or mental health [12].

Statistical analyses
The primary outcomes were the physicians' accuracies regarding the prediction of mortality, functional outcome, and quality of life at six months. Predictions of functional outcome were considered correct if the prediction of either favorable (mRS 0-3) or unfavorable (mRS 4-6) functional outcome was correct. Prediction of quality of life was considered correct if the prediction of satisfactory quality of life (VAS ≥60) or non-satisfactory quality of life (VAS <60) was correct.
In a secondary analysis, prediction of functional outcome was considered correct if there was an exact agreement on the mRS.
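The scoring and dichotomization rules described above can be written out as a minimal Python sketch. The function names and the example values are hypothetical and serve only as illustration; the cut-offs (mRS 0-3 favorable, VAS of 60 or higher satisfactory, VAS score = (centimeters/10) × 100) are the ones stated in the Methods.

```python
def vas_score(level_cm: float) -> float:
    """Convert the level marked on the 10 cm line to a 0-100 score."""
    return level_cm / 10 * 100


def is_unfavorable(mrs: int) -> bool:
    """mRS 4-6 counts as unfavorable functional outcome (6 = death)."""
    return mrs > 3


def is_satisfactory(vas: float) -> bool:
    """VAS of 60 or higher counts as satisfactory quality of life."""
    return vas >= 60


def dichotomized_mrs_correct(predicted: int, observed: int) -> bool:
    """Primary analysis: same side of the favorable/unfavorable split."""
    return is_unfavorable(predicted) == is_unfavorable(observed)


def exact_mrs_correct(predicted: int, observed: int) -> bool:
    """Secondary analysis: exact agreement on the mRS."""
    return predicted == observed


def vas_prediction_correct(predicted_vas: float, observed_vas: float) -> bool:
    """Same side of the satisfactory/non-satisfactory split."""
    return is_satisfactory(predicted_vas) == is_satisfactory(observed_vas)


# Hypothetical patient: predicted mRS 5, observed mRS 4.
primary_ok = dichotomized_mrs_correct(5, 4)
secondary_ok = exact_mrs_correct(5, 4)
print(primary_ok, secondary_ok)
```

In this hypothetical case the prediction counts as correct in the primary (dichotomized) analysis but not in the secondary (exact agreement) analysis, which shows why the two analyses can give different accuracy figures.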
Accuracy for mortality and the dichotomized mRS and VAS outcomes was expressed as the positive predictive value (PPV), negative predictive value (NPV), and false positive rate (FPR), with corresponding 95% confidence intervals (CI). We present PPV, NPV, and FPR as measures of test performance (rather than sensitivity and specificity) because these measures are the most clinically relevant in the context of end-of-life decisions. An incorrect prediction of a poor outcome can have the irreversible consequence of withdrawing or withholding life-sustaining therapies. In this context, we are interested in how likely it is that, if the physician predicts a poor outcome (the test is positive), the patient really has a poor outcome (and the institution of a treatment restriction can be justified). This is represented by the positive predictive value. In other words, a high positive predictive value means that the physician's prediction can be trusted if it is positive. The same is true for a predicted good outcome (the test is negative), which is represented by the negative predictive value.
We used the χ² test to compare PPVs between groups. As a cut-off point for predictive accuracy, we used a false positive rate for predicted mortality, poor functional outcome or unsatisfactory quality of life of 0.01.
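As an illustration of these accuracy measures, the following Python sketch computes a proportion with a 95% confidence interval and compares the FPR with the 0.01 cut-off. The text does not state which interval method was used, so the Wilson score interval here is an assumption and its bounds need not match the reported ones exactly. The function name is hypothetical; the counts are those reported in the abstract for predicted mortality.

```python
from math import sqrt
from statistics import NormalDist

# With "test positive" = poor outcome predicted:
#   PPV = TP / (TP + FP)   NPV = TN / (TN + FN)   FPR = FP / (FP + TN)


def wilson_ci(successes: int, n: int, level: float = 0.95):
    """Return (proportion, lower, upper) using the Wilson score interval.

    The interval method is an assumption; the source does not name one.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# 14 of the 15 patients predicted to die had died at six months (PPV).
ppv, ppv_lo, ppv_hi = wilson_ci(14, 15)
# 1 of the 30 survivors had been predicted to die (FPR).
fpr, fpr_lo, fpr_hi = wilson_ci(1, 30)

FPR_CUTOFF = 0.01  # cut-off for predictive accuracy stated in the Methods
meets_cutoff = fpr <= FPR_CUTOFF

print(f"PPV {ppv:.2f} ({ppv_lo:.2f}-{ppv_hi:.2f})")
print(f"FPR {fpr:.2f} ({fpr_lo:.2f}-{fpr_hi:.2f}), meets cut-off: {meets_cutoff}")
```

With only 15 positive predictions, even a single error leaves a wide interval around the PPV, and one false positive among 30 survivors already puts the point estimate of the FPR above the 0.01 cut-off.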

Sample size calculation.
To achieve a precision (the maximum difference between the estimated FPR and the true value) of 10% for FPR, using an expected FPR of 0.05 and an expected prevalence of 0.50 for either mortality, poor functional outcome or unsatisfactory quality of life, we needed a total sample size of 39 patients.
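The reasoning behind such a calculation can be sketched in Python. The exact method behind the figure of 39 is not stated, so the normal-approximation formula below is an assumption, and the function name is hypothetical. The idea is that the FPR is estimated only among patients without the poor outcome, so the number needed in that group is scaled up by the expected proportion of such patients.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_fpr(expected_fpr: float, precision: float,
                        prevalence: float, level: float = 0.95) -> int:
    """Total sample size for estimating an FPR to within +/- precision.

    Normal-approximation sketch; not necessarily the authors' method.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Patients WITHOUT the poor outcome needed to estimate the FPR.
    n_negatives = z**2 * expected_fpr * (1 - expected_fpr) / precision**2
    # Scale up so the cohort is expected to contain that many negatives.
    return ceil(n_negatives / (1 - prevalence))


# Inputs from the text: expected FPR 0.05, precision 0.10, prevalence 0.50.
n_total = sample_size_for_fpr(0.05, 0.10, 0.50)
print(n_total)
```

Under these assumptions the formula gives a total of about 37 patients, close to but not identical with the 39 reported; the difference presumably reflects a different formula or rounding convention.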
Subgroup analyses.
Predefined subgroup analyses were done with regard to type of stroke and in patients who had no treatment restrictions, including no do-not-resuscitate (DNR) orders. The relation between treatment restrictions and predicted outcomes was calculated with Poisson regression analysis with a robust error, and expressed as relative risk (RR) with corresponding 95% CI.
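For a single binary predictor, the relative risk from a Poisson regression model is simply the ratio of the two observed risks, and the robust standard error of log(RR), without a small-sample correction, should reduce to the usual log-scale formula. The Python sketch below uses that shortcut instead of fitting a regression model, which is a simplification of the analysis described above. The function name is hypothetical and the counts are invented for illustration; they are not study data.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def relative_risk(events_exposed: int, n_exposed: int,
                  events_unexposed: int, n_unexposed: int,
                  level: float = 0.95):
    """Return (RR, lower, upper) with a log-scale confidence interval."""
    if min(events_exposed, events_unexposed) <= 0:
        raise ValueError("both groups need at least one event")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p1 = events_exposed / n_exposed
    p0 = events_unexposed / n_unexposed
    rr = p1 / p0
    # Standard error of log(RR): sqrt(1/a - 1/n1 + 1/c - 1/n0).
    se = sqrt((1 - p1) / events_exposed + (1 - p0) / events_unexposed)
    return rr, exp(log(rr) - z * se), exp(log(rr) + z * se)


# HYPOTHETICAL counts, not study data: treatment restrictions in 20 of 30
# patients with a predicted poor outcome versus 10 of 30 without one.
rr, rr_lo, rr_hi = relative_risk(20, 30, 10, 30)
print(f"RR {rr:.2f} (95% CI {rr_lo:.2f}-{rr_hi:.2f})")
```

A full regression model becomes necessary once the relation is adjusted for other covariates, which this shortcut cannot do.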

Results
We included 60 patients with a mean age of 72 years (SD, 15) and a median Barthel Index of 0 (range, 0-6). Thirty-six (60%) patients had an ischemic stroke. The median time from admission to inclusion was six days (range, 4-10). Baseline characteristics are presented in Table 1.
Twenty-one neurology residents, supervised by 14 stroke neurologists, filled out the questionnaires. The number of patients treated by one physician ranged from one to eight.

Functional outcome
Functional outcome at six months could be assessed in 59 patients; one patient declined follow-up. Thirty patients died; of the survivors, 19 (65%) had an unfavorable outcome. The median mRS in survivors was 4 (range, 2-5) (Table 3 and Fig 1).

Subgroup analyses
There were no differences between patients with ischemic stroke and intracerebral hemorrhage concerning the predictive accuracy of mortality (P = 0.27), unfavorable functional outcome (P = 0.11), or unsatisfactory quality of life (P = 0.69) (S1 and S2 Tables). Eighteen of 60 patients had no treatment restrictions. In these patients, predictive accuracy of unfavorable functional outcome was essentially the same as in the total group (PPV, 0.88; 95% CI, 0.47-0.99), but prognostic errors were more optimistic (S3 Table).

Discussion
This study shows that in patients with severely disabling stroke, defined as a Barthel Index ≤6 after four days, treating physicians' estimation of the risk of death or unfavorable functional outcome at six months is relatively inaccurate. Prediction of quality of life is even more imprecise.
Accurate information about the expected outcome of disease is required to guide physicians and other professionals, patients, and their relatives in making decisions related to the withdrawal or withholding of life-sustaining treatments. If an expected negative outcome (death, unfavorable functional outcome or a non-satisfactory quality of life) is used as a basis for treatment restrictions, the predictive accuracy should be very high to prevent unfounded pessimism, which can lead to early withdrawal of treatment in a patient who otherwise could have recovered. The false positive rate of a predicted poor outcome should preferably be zero, with a narrow confidence interval. At present, such predictive accuracy only exists for prognostic models in comatose patients after cardiopulmonary resuscitation for cardiac arrest [13] and not for stroke patients. In our opinion, the predictive accuracy of physicians is insufficient to serve as the sole basis of decisions to limit treatment. Physicians should be aware of prognostic uncertainties and their consequences when discussing end-of-life decisions.
In this study, physicians predicted unfavorable functional outcome better than a non-satisfactory quality of life, probably because quality of life is not only related to the severity of disability, but also to factors such as the presence or absence of meaningful activities and social or emotional support [14], which are often not identified during admission. Moreover, patients often report greater happiness and quality of life than healthy people predict they would feel under the same circumstances, a phenomenon often referred to as a 'disability paradox', which is explained in part by the capacity of patients with chronic illness or disability to adapt to their circumstances [15]. The accuracy of the physicians' prognostic estimates in this study is in the same range as in a previous study where neurovascular fellows predicted functional outcome at six months in patients with subarachnoid hemorrhage [16] and in a study where junior neurointensivists predicted functional outcome and quality of life at six months in patients requiring mechanical ventilation in any neurological disease [17]. Previous studies compared the accuracy of prediction models to physicians' estimates on mortality and functional outcome, and found the prediction models to be more accurate in patients with ischemic stroke [18,19], but not in patients with intracerebral hemorrhage [20].
The frequently used iScore for ischemic stroke and ICH score for intracerebral hemorrhage have areas under the curve for case fatality at 30 days of 0.79 and 0.88, respectively, in validation studies [21,22]. This means that both prognostic models and physicians' prognostic estimates lack the accuracy to serve as the sole basis for end-of-life decisions. The predictive accuracy might increase when using a combination of 'mathematical' prediction models and physicians' prognostic estimates, but this requires further research.
This study has limitations. First, we included patients with both ischemic and hemorrhagic stroke. These entities are inherently different with respect to prognosis. Second, our findings cannot be generalized to an unselected population of stroke patients. We included patients who were alive but severely dependent on day four after stroke, because treatment restrictions are probably most often instituted in this patient group. As a result of this selection, our results apply only to this selected group of patients and cannot be extrapolated to patients in the first days after stroke, those who are less severely disabled at day four after stroke or to patients with other diseases. Quality of life predictions were assessed in an even more selected group, because only survivors could be evaluated. The number of patients available to assess the predictive accuracy of quality of life was too small to provide the statistical power needed for firm conclusions. Third, quality of life data should be interpreted with caution because patients could have given socially desirable answers during the home visit. Fourth, the cut-off value for predictive accuracy is rather arbitrary and represents the authors' interpretation based on previously published views [4]. The lower the false positive rate, the more decisions to withhold or withdraw life-sustaining therapies are based on correct predictions of poor outcome. Fifth, self-fulfilling prophecies are a major concern when assessing prognostic accuracy. In the present study, patients with a predicted poor prognosis more frequently had treatment restrictions, which have been associated with an actual poor outcome in previous studies [2,5,23,24]. Only 18 patients received full supportive care, a number too small to draw firm conclusions. Finally, improvement of functional outcome and quality of life may still occur after six months [11], and our results therefore do not represent a completed recovery.
In patients with severe stroke, treating physicians' estimation of the risk of mortality or unfavorable functional outcome at six months is relatively inaccurate. Prediction of quality of life is even more imprecise.
Future research should focus on how to improve predictive accuracy, for example by using a combination of 'mathematical' prediction models and physicians' prognostic estimates, and on how to identify in the acute stage after stroke patients who will recapture a good quality of life despite poor functional outcome.