The authors have declared that no competing interests exist.
Conceived and designed the experiments: BM BD. Analyzed the data: BM. Contributed reagents/materials/analysis tools: RS SK. Wrote the paper: BM BD AK RM.
Prognostic models are often used to estimate the length of patient survival. The Cox proportional hazards model has traditionally been applied to assess the accuracy of prognostic models. However, it may be suboptimal because it cannot flexibly model the baseline survival function and because the proportional hazards assumption may be violated. The aim of this study was to use internal validation to compare the predictive power of a flexible Royston-Parmar family of survival functions with the Cox proportional hazards model. We applied the Palliative Performance Scale to a dataset of 590 hospice patients at the time of hospice admission. The retrospective data were obtained from the Lifepath Hospice and Palliative Care center in Hillsborough County, Florida, USA. The criteria used to evaluate and compare the models' predictive performance were the explained variation statistic R^{2}, the scaled Brier score, and the discrimination slope. The explained variation statistic demonstrated that overall the Royston-Parmar family of survival functions provided a better fit (R^{2} = 0.298; 95% CI: 0.236–0.358) than the Cox model (R^{2} = 0.156; 95% CI: 0.111–0.203). The scaled Brier scores and discrimination slopes were consistently higher under the Royston-Parmar model. Researchers involved in prognosticating patient survival are encouraged to consider the Royston-Parmar model as an alternative to Cox.
Prognostic models are often used to estimate the length of patient survival, and improvements in the accuracy of prognosis translate into superior quality of patient care. Precise prognosis of survival using modeling techniques requires rigorous methods for developing prognostic models and testing their accuracy. Developing a prognostic model entails having accurate patient data for prognosis, and selecting clinically relevant candidate predictors and measures of model performance, usually in the context of multivariable regression
In the hospice setting, accurate prognostication of survival affords patients and their families a vital opportunity to attend to matters such as planning, prioritizing, and preparing for death
The Cox proportional hazards (CPH) model
In this manuscript we compare CPH with an alternative method of estimating survival in the form of the class of flexible Royston-Parmar (RP) parametric functions
In addition to PPS, other risk factors such as age, cancer status and gender have been reported to be significant predictors of palliative patient survival in several studies
The patient data were obtained from the Lifepath Hospice and Palliative Care Center, licensed since 1983 to serve Hillsborough County, Florida. Hospice care focuses on pain control and symptom management. To avoid selection bias, we retrospectively and sequentially extracted data for 590 patients who, as of January 2009, were deceased. This study was a retrospective review of the deceased patients' medical records. Only data pertaining to outcomes were collected; personal information was not collected and the data were de-identified prior to analysis. Since we did not collect any information that can identify deceased patients or their family members, under HIPAA rules and regulations (45 CFR 164.512) the requirement for consent does not apply. The study and consent procedures were approved by the University of South Florida Institutional Review Board prior to study initiation. Two research assistants extracted all data necessary to populate the model variables and two faculty members randomly checked 25% of the data for accuracy. The models were tested against observed survival duration.
The Palliative Performance Scale (PPS) was developed and reported by Anderson et al.
Validating a prognostic model is generally accepted to mean that, given a patient population, it works in a data set other than the one it was developed on
The role of the baseline survival is significant as it quantifies the absolute patient survival probabilities over time. For a vector of covariates
When the goal of a survival analysis is to estimate hazard ratios (the effect of covariates on the changing hazard function), the baseline function is of no consequence. The CPH is appropriate as the baseline function gets absorbed when coefficient
An alternative to the CPH is the RP family of models, which resembles generalized linear models and can be viewed as a parametric extension of Cox proportional hazards models
In the RP framework, if the proportional hazards assumption is violated, the probit link function g(s) = −Φ^{−1}(s) can be applied, where Φ^{−1}(.) is the inverse standard normal distribution function. The baseline survival function
We compared RP and CPH by performing internal validation (assessing validity in the population from which the development data originated) on the whole data set (naïve) and using split-sample cross-validation. We performed 10-fold cross-validation by splitting the data into development and validation sets and repeating the process 20 times. The methods can be readily implemented in Stata
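The split-sample scheme described above can be sketched as follows. This is only an illustration in Python rather than Stata, and it assumes that each of the 20 repetitions reshuffles the 590 patients before forming the 10 folds; the function name `repeated_kfold` and the fixed seed are hypothetical choices, and model fitting is deliberately left out.

```python
import random

def repeated_kfold(n, k=10, repeats=20, seed=0):
    """Yield (repeat, fold, development_idx, validation_idx) tuples.

    Assumption: each repetition reshuffles all n patients and then
    deals them into k roughly equal validation folds.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        # Deal shuffled indices round-robin into k folds.
        folds = [idx[i::k] for i in range(k)]
        for f, val in enumerate(folds):
            val_set = set(val)
            dev = [i for i in idx if i not in val_set]
            yield r, f, dev, sorted(val)

# 590 patients, 10 folds, 20 repetitions, as in the text.
splits = list(repeated_kfold(590))
```

Each tuple gives the indices on which a model would be developed and the held-out indices on which its predictions would be validated.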
Model performance is the ability of the estimated risk score to predict survival and is assessed using the measures of explained variation, calibration, and discrimination. Calibration refers to how closely the predicted survival at a prespecified time agrees with the observed survival. For cross-validation, we compared the average fitted probabilities of survival under RP and CPH for the first 15 days to observed probabilities estimated nonparametrically using Kaplan-Meier curves
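As a minimal sketch of the nonparametric reference used for calibration, the Kaplan-Meier product-limit estimate can be computed as below. The function name `kaplan_meier` and the five toy survival times are hypothetical and are not taken from the study data; the event indicator is included for generality, although the present dataset had no censored observations.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  : survival or censoring times (e.g. in days)
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Toy example (hypothetical data): five patients, all deaths observed.
toy_curve = kaplan_meier([1, 2, 2, 5, 9], [1, 1, 1, 1, 1])
```

Calibration at a given day would then compare the average model-predicted survival probability with the Kaplan-Meier value at that day.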
The Brier score is a quadratic scoring rule that calculates the differences between the actual outcomes and predicted probabilities
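A minimal sketch of the Brier score at a fixed time point is given below. It assumes no censoring, so each patient's status at that time is known, and it assumes the common scaling in which the score is compared with that of a null model predicting the overall proportion alive for everyone; the exact scaling used in the study is not stated here. The function names and toy values are hypothetical.

```python
def brier_score(alive, pred):
    """Mean squared difference between outcome (1 = alive at time t,
    0 = dead) and the predicted probability of being alive at t."""
    return sum((y - p) ** 2 for y, p in zip(alive, pred)) / len(alive)

def scaled_brier(alive, pred):
    """1 - BS / BS_ref, where BS_ref is the Brier score of a null
    model that predicts the observed proportion alive for everyone
    (assumed scaling; 1 is perfect, 0 is no better than the null)."""
    base = sum(alive) / len(alive)
    bs_ref = brier_score(alive, [base] * len(alive))
    return 1.0 - brier_score(alive, pred) / bs_ref

# Toy example (hypothetical values): status and predictions at time t.
alive = [1, 1, 0, 0]
pred = [0.9, 0.6, 0.4, 0.1]
bs = brier_score(alive, pred)
bs_scaled = scaled_brier(alive, pred)
```

Under this scaling a higher value indicates better overall accuracy, which matches the direction in which the scaled Brier scores are interpreted in this paper.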
For a particular risk score, discrimination is the ability to differentiate between the patients who died versus those who survived. The Kaplan-Meier plot of survival for patients in different risk groups can be used to test for separation, indicating that the different risk groups are well defined
The discrimination or Yates slope is a measure of how well the subjects with and without the outcome are separated. It is defined as the absolute difference in mean predictions of survival (mean[p_{i}]) between those who died and those who survived at time t
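The definition above can be illustrated with a short sketch. It assumes that each patient's status at time t is known (no censoring) and that `pred` holds the predicted probability of surviving beyond t; the function name and toy values are hypothetical.

```python
def discrimination_slope(alive, pred):
    """Absolute difference between the mean predicted survival of
    patients alive at time t and that of patients who died by t."""
    survivors = [p for y, p in zip(alive, pred) if y == 1]
    deceased = [p for y, p in zip(alive, pred) if y == 0]
    mean_survivors = sum(survivors) / len(survivors)
    mean_deceased = sum(deceased) / len(deceased)
    return abs(mean_survivors - mean_deceased)

# Toy example (hypothetical values): status and predictions at time t.
alive = [1, 1, 0, 0]
pred = [0.9, 0.6, 0.4, 0.1]
slope = discrimination_slope(alive, pred)
```

A slope near 0 means the two groups receive similar predictions, while a larger slope indicates better separation.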
All statistical calculations were performed using Stata version 11.2
The patient characteristics of the retrospective cohort are summarized in
Variable  Result 
Total no. of patients  590 (100%) 
Age at Treatment  
<45  37 (6.3%) 
45–64  187 (31.7%) 
65–74  110 (18.6%) 
75–84  129 (21.9%) 
85+  127 (21.5%) 
Gender  
Male  293 (49.7%) 
Female  295 (50%) 
Unknown  2 (0.3%) 
No. of patients with cancer/noncancer  
Noncancer  363 (61.5%) 
Cancer  227 (38.5%) 
Diagnosis category for cancer  
Brain  10 (1.7%) 
Gastrointestinal  35 (5.9%) 
Genital-female  12 (2%) 
Genital-male  12 (2%) 
Head and neck  8 (1.4%) 
Hematopoietic  10 (1.7%) 
Pancreas  24 (4.2%) 
Respiratory  49 (8.3%) 
Skin  2 (0.3%) 
Urinary  4 (0.6%) 
Other  61 (10.3%) 
Diagnosis category for noncancer  
AIDS  12 (2%) 
Cardiovascular  74 (12.5%) 
Neurological  119 (20.2%) 
Respiratory  37 (6.3%) 
Other  121 (20.6%) 
Survival Times (in Days)  
Variable  Mean (95% CI)  Median (95% CI)  Range  No. of Patients (%) 
Total no. of patients  
Overall  14 (12, 17)  6 (5, 6)  1–371  590 
Age at Treatment  
<45  15 (8, 22)  8 (4, 12)  1–95  37 (6.3%) 
45–64  14 (11, 17)  7 (5, 9)  1–114  187 (31.7%) 
65–74  14 (8, 20)  5 (4, 6)  1–271  110 (18.6%) 
75–84  14 (8, 20)  6 (5, 7)  1–371  129 (21.9%) 
85+  15 (9, 21)  5 (4, 6)  1–313  127 (21.5%) 
Gender  
Male  14 (10, 18)  6 (5, 7)  1–371  293 (49.7%) 
Female  15 (11, 19)  6 (5, 7)  1–271  295 (50%) 
No. of patients with cancer  
Noncancer  12 (8, 16)  5 (4, 6)  1–371  363 (61.5%) 
Cancer  17 (14, 20)  9 (7, 11)  1–113  227 (38.5%) 
Diagnosis category for cancer  
Brain  27 (16, 39)  28 (14, 42)  3–55  10 (1.7%) 
Gastrointestinal  21 (14, 29)  11 (5, 17)  1–82  35 (5.9%) 
Genital-female  15 (6, 24)  8 (1, 15)  2–55  12 (2%) 
Genital-male  26 (7, 45)  13 (4, 22)  1–100  12 (2%) 
Head and neck  10 (2, 18)  5 (1, 9)  1–36  8 (1.4%) 
Hematopoietic  4 (2, 6)  3 (1, 5)  1–10  10 (1.7%) 
Pancreas  18 (7, 29)  7 (3, 11)  1–113  24 (4.2%) 
Respiratory  15 (10, 20)  10 (7, 13)  1–71  49 (8.3%) 
Skin  11  11  11–11  2 (0.3%) 
Urinary  25 (1, 58)  9 (1, 39)  4–76  4 (0.6%) 
Other  17 (12, 22)  9 (5, 12)  1–103  61 (10.3%) 
Diagnosis category for noncancer  
AIDS  18 (3, 33)  8 (1, 15)  1–85  12 (2%) 
Cardiovascular  14 (5, 23)  5 (3, 7)  1–271  74 (12.5%) 
Neurological  8 (5, 11)  5 (4, 6)  1–77  119 (20.2%) 
Respiratory  25 (1, 49)  3 (1, 5)  1–371  37 (6.3%) 
Other  11 (1, 15)  5 (4, 6)  1–174  121 (20.6%) 
Initial PPS Score  
PPS 10%  5 (3, 7)  3 (2, 4)  1–77  188 (32.6%) 
PPS 20%  16 (8, 24)  5 (4, 6)  1–371  125 (21.7%) 
PPS 30%  15 (11, 19)  7 (5, 9)  1–140  123 (21.4%) 
PPS 40%  24 (18, 30)  14 (11, 17)  1–147  96 (16.7%) 
PPS 50–80%  28 (21, 35)  18 (9, 27)  1–76  44 (7.6%) 
The time of admission was the starting point for survival time. The Kaplan-Meier curves stratified by initial PPS level are shown in
No. of knots m  PH (AIC, BIC, R^{2})  PO (AIC, BIC, R^{2})  Probit (AIC, BIC, R^{2}) 
0  2033, 2042, 0.156  1887, 1896, 0.321  1872, 1881, 0.295 
1  1889, 1902, 0.178  1883, 1896, 0.322  [missing] 
2  1871, 1888, 0.170  1870, 1887, 0.312  1857, 1874, 0.296 
3  1870, 1892, 0.172  1870, 1892, 0.311  1858, 1880, 0.297 
4  [missing]  [missing]  1855, 1881, 0.296 
5  1866, 1896, 0.171  1865, 1896, 0.309  1856, 1886, 0.296 
R^{2} was higher in the RP model (R^{2} = 0.298; 95% CI: 0.236–0.358) than in the Cox model (R^{2} = 0.156; 95% CI: 0.111–0.203), indicating that the RP model explained significantly more variation than CPH. To illustrate the differences for the baseline function,
Cross-validation showed that the relation between the two predicted survival estimates is approximately linear, with the RP model consistently estimating a higher probability, which is particularly evident for higher PPS scores corresponding to longer survival times (
The scaled Brier scores and discrimination slopes are both consistently higher for RP, indicating better accuracy and discrimination.
The results from our study show that the RP family of models predicts survival more accurately than CPH through its flexible modeling of the baseline survival function. Using the RP flexible baseline function modeling would allow for more precise calibration in the prognostication phase than CPH. As
There are limitations to our study, the primary one being the use of retrospective data. The RP family of parametric functions needs to be applied prospectively to assess the accuracy of prognostic models through external validation. Furthermore, the dataset was limited to the hospice setting, with no censored observations and with the majority of patients having a very short follow-up time. For future studies, application of the proposed methodology should account for these limitations, and comparisons with parametric prognostic survival models should be explored.
The flexible models discussed in this paper could greatly improve the ability of researchers to accurately predict survival. An advantage of RP is that it can be used to validate published models for which the original individual patient data are unavailable. If the scale used (hazard, probit or odds), the knot positions, and the estimates of prognostic indices are known, then it would be possible to use RP. In the case of CPH this is not possible, since the baseline function would not be available.
The authors wish to thank Dr. Jane Carver for her help in preparing the manuscript.