Symptomatic spinal metastasis: A systematic literature review of the preoperative prognostic factors for survival, neurological, functional and quality of life in surgically treated patients and methodological recommendations for prognostic studies

Purpose
While several clinical prediction rules (CPRs) of survival exist for patients with symptomatic spinal metastasis (SSM), these have variable prognostic ability and there is no recognized CPR for health related quality of life (HRQoL). We undertook a critical appraisal of the literature to identify key preoperative prognostic factors of clinical outcomes in patients with SSM who were treated surgically. The results of this study could be used to modify existing or develop new CPRs.
Methods
Seven electronic databases were searched (1990–2015), without language restriction, to identify studies that performed multivariate analysis of preoperative predictors of survival, neurological, functional and HRQoL outcomes in surgical patients with SSM. Individual studies were assessed for class of evidence. The strength of the overall body of evidence was evaluated using GRADE for each predictor.
Results
Among 4,818 unique citations, 17 were included; all were in English, rated Class III and focused on survival, revealing a total of 46 predictors. The strength of the overall body of evidence was very low for 39 and low for 7 predictors. Due to considerable heterogeneity in patient samples and prognostic factors investigated as well as several methodological issues, our results had a moderately high risk of bias and were difficult to interpret.
Conclusions
The quality of evidence for predictors of survival was, at best, low. We failed to identify studies that evaluated preoperative prognostic factors for neurological, functional, or HRQoL outcomes in surgical patients with SSM. We formulated methodological recommendations for prognostic studies to promote acquiring high-quality evidence to better estimate predictor effect sizes to improve patient education, surgical decision-making and development of CPRs.


Introduction
Symptomatic spinal metastasis (SSM) afflicts up to 10% of cancer patients [1][2][3], of whom approximately 10% are surgically managed. [4] Given that over 14 million Americans lived with a diagnosis of cancer in 2014 and almost 19 million are expected to do so by 2024 [5], the number of cancer survivors expected to undergo surgery for SSM will increase by approximately 36% over the next 10 years.
Since the randomized controlled trial conducted by Patchell et al. [6] showed that surgery followed by radiotherapy provided superior neurologic outcomes compared to radiotherapy alone in patients suffering from a single cervical or thoracic SSM with a life expectancy of ≥ 3 months, this life expectancy threshold has been widely adopted in decision-making for surgical treatment. [7][8][9] However, clinicians and surgeons tend to estimate survival in patients with advanced cancer inaccurately. [10][11][12][13] Also, although several studies reported that surgical intervention improved health related quality of life (HRQoL) [6,9,14-20], SSM treated with surgery is the most costly skeletal-related event in patients with cancer. [21] Clinical prediction rules (CPRs), which combine various clinical factors from an individual with a given health state and provide an estimate of the risk of experiencing a specific endpoint within a certain period [22], may allow physicians to make more precise clinical estimates and thus assist therapeutic decision-making and counselling. [22,23] Although several CPRs of survival have been developed, we are not aware of any CPR for HRQoL for SSM patients. Also, current CPRs of survival have variable prognostic ability. [24][25][26] This may be due to differences between the patient samples used to derive these CPRs and those used to assess their prognostic value. For instance, Bartels et al. [27] created a CPR of survival based on a cohort of patients who received radiotherapy. In their most recent external validation study [28], misspecification of their model was attributed to the surgical patient subgroup.
The majority of published series assessed preoperative predictors of survival rather than HRQoL. We conducted a systematic review to ascertain the preoperative prognostic factors for 1) survival, 2) neurologic status, 3) functional status, and 4) HRQoL in surgical SSM patients. We also appraised the methodology and reporting of prognostic studies that met our eligibility criteria. The results of this study could not only be used to modify existing CPRs of survival to improve their prognostic value, but also to improve the theoretical framework to develop new CPRs for survival and HRQoL outcomes specific to surgical SSM patients.

Methods
This systematic review and best-evidence synthesis was conducted and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [29]. In compliance with the guidelines, our systematic review protocol was registered with the International Prospective Register of Systematic Reviews (PROSPERO) [30] on June 24th, 2015 and was last updated on July 12th, 2016 (registration number CRD42015023831).

Literature search
In adult patients who underwent surgery for SSM, we sought to answer the following four key questions (KQs): What are the preoperative clinical factors associated with postoperative (1) survival; (2) neurologic status, such as muscle power on the Medical Research Council (MRC) scale for testing muscle strength, neurologic outcome measures (e.g. American Spinal Injury Association (ASIA) or Frankel grade) or autonomic functions (bladder / bowel control); (3) functional status, in terms of ambulatory status, functional outcome measures, such as functional independence measure (FIM), Barthel index, Eastern Cooperative Oncology Group (ECOG) or Karnofsky performance status (KPS); and (4) HRQoL, in terms of score on any HRQoL measure, such as short form health survey (SF-36), EuroQol 5 dimensions (EQ-5D) or Oswestry Disability Index (ODI)?
The electronic databases MEDLINE, MEDLINE in Process, Embase, Web of Science, CINAHL, Cochrane Central Register of Controlled Trials, and Scopus were systematically searched for studies performed in humans from January 1, 1990 to December 31, 2015 with no language restrictions applied. The search strategies were developed in consultation with information specialists at the University Health Network Health Sciences Libraries. S1 Table presents our complete search strategies. The reference lists of studies meeting the eligibility criteria and relevant review papers were manually screened for additional studies.

Eligibility criteria
Citations were screened for eligibility by following a priori determined inclusion and exclusion criteria (Table 1). Original studies with an identifiable surgical treatment arm or surgical cohort of at least 30 patients, who underwent de novo spinal surgery for a single symptomatic metastatic spinal lesion, with a postoperative follow-up of at least 6 months, published in peer-reviewed journals included in Ulrichsweb [31] at the time of publication, describing and reporting both the preoperative prognostic clinical factors assessed and the univariate and multivariate analyses conducted, were considered for inclusion. Studies that included surgical/postoperative predictors in their multivariate analyses, patients < 18 years old, patients operated for recurrent SSM or primary spinal tumor were excluded.

Screening and selection
All duplicates were removed using EndNote X4 followed by manual elimination. Two authors (AN and ARM) independently (1) screened the titles and abstracts to identify potential eligible studies to undergo full-text assessment and then (2) reviewed the selected full-text articles for final inclusion. Discrepancies between the two reviewers were resolved by consensus agreement; persisting disagreements were settled by consulting the senior author (MGF).

Data extraction and synthesis
The following data were extracted by AN and then checked by ARM: 1) first author and publication date; 2) publication language; 3) study design; 4) purpose; 5) patient sample and characteristics, with relevant inclusion and exclusion criteria; 6) preoperative predictors; 7) outcome assessed; 8) postoperative follow-up characteristics, including length, rate, and information about how missing data were handled; 9) methodology, including details related to predictors'

Critical appraisal of the literature
We are not aware of any consensus regarding a standardized approach for assessing the quality of prognostic studies.

Risk of bias in individual studies
AN and ARM independently assessed the risk of bias of individual articles (Class I to IV) using the method described by Skelly et al. [32,33] for prognostic studies (S2 Table). The final class-of-evidence rating was assigned following consensus agreement.

Risk of bias across studies: Overall quality of evidence
Once all articles were individually evaluated, the strength of the overall body of evidence with respect to each predictor was allocated using the approach developed by the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group. [34] The baseline strength of the overall body of evidence was assigned "High" if the majority of the studies were Class I or II and "Low" if the majority of the studies were Class III or IV. The strength could then be downgraded by one or two levels based on the risk of bias, consistency, directness, precision and publication bias. Alternatively, the strength could be upgraded by one or two levels if the effect was large, there was evidence of a dose-response gradient or all plausible confounders would either reduce a demonstrated effect or would suggest a spurious effect when the results showed no effect. The final strength of the overall body of evidence for each predictor was classified as High, Moderate, Low or Very Low and expresses our confidence that the evidence reflects the true effect and the likelihood that further research will change our confidence in the estimate of effect (S3 Table). Overall, this method adheres to the general principles described by Hayden et al. [35] for assessing the quality of prognostic studies in systematic reviews.
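The grading logic described above can be illustrated with a minimal sketch. It is not the GRADE Working Group's tool or the authors' implementation; the function names, the handling of ties, and the clamping of the result to the four-level scale are assumptions made for illustration only.

```python
from collections import Counter

# Ordered from weakest to strongest, as named in the text.
LEVELS = ["Very Low", "Low", "Moderate", "High"]


def baseline_strength(study_classes):
    """Baseline is 'High' if most studies are Class I/II, 'Low' if most are Class III/IV."""
    counts = Counter(study_classes)
    upper = counts["I"] + counts["II"]
    lower = counts["III"] + counts["IV"]
    # Assumption: a tie is resolved conservatively as "Low" (the text does not cover ties).
    return "High" if upper > lower else "Low"


def final_strength(study_classes, downgrade_levels=0, upgrade_levels=0):
    """Apply downgrades and upgrades to the baseline and return the final level."""
    index = LEVELS.index(baseline_strength(study_classes))
    index = index - downgrade_levels + upgrade_levels
    # Assumption: the result cannot leave the four-level scale.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# Hypothetical predictor supported by two Class III studies,
# downgraded one level for high risk of bias.
example = final_strength(["III", "III"], downgrade_levels=1)
```

In this hypothetical case the baseline is Low and a single downgrade gives Very Low, which mirrors how a predictor supported only by Class III studies with high risk of bias would be graded.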

Results
The search yielded 4,818 unique citations, of which the title and abstract were screened, leading to the selection of 152 articles for full-text review. Among these, a total of 135 studies were excluded for one of the following reasons: preoperative prognostic factors were not evaluated or were assessed as part of a scoring system and not evaluated individually; multivariate analysis was not conducted; multivariate analysis included surgical and/or postoperative factors as predictors; the journal was not peer-reviewed at the time of publication on Ulrichsweb [31]; surgical patients were not evaluated separately from non-surgical patients; the study included fewer than 30 surgical patients; spinal metastases were not distinguished from extraspinal bony metastases; patient sample included patients < 18 years of age; the study involved metastasis from primary central nervous system tumors; postoperative follow-up was less than six months. No additional studies were added after manually checking reference lists (Fig 1). All 17 articles meeting our eligibility criteria were published in English and addressed KQ1, i.e. the preoperative clinical factors associated with survival in surgical SSM patients. There were six additional studies that examined the clinical prognostic factors of functional status (KQ3) in terms of the ability to walk [36][37][38][39][40][41] or regaining the ability to walk [37] postoperatively and one study that isolated key predictors of survival (KQ1) and HRQoL (KQ4) using the postoperative EQ-5D score as the dependent variable [42]. However, these studies included surgical and/or postoperative factors in their multivariate analysis, leading to their exclusion from this review.

Risk of bias of individual studies
Prospective prognostic studies meeting the following criteria for a good-quality cohort study are considered Class I evidence: (1) patients were followed for sufficient periods in order that outcomes could occur, (2) follow-up rate was ≥ 80%, (3) patients were at similar point in the course of their disease, and (4) the study accounted for other prognostic factors (S2 Table). Although this review included three prospective studies [24,47,56], they were considered as Class III evidence due to violation of two of the criteria for good-quality cohort studies: follow-up period and drop-out rate were not clearly specified. The remaining 14 retrospective studies were also considered Class III due to violation of at least one of the criteria for good-quality studies.
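As a rough illustration, the four Class I criteria listed above can be written as a checklist. This is a sketch under stated assumptions, not the appraisal instrument itself: the `StudyAppraisal` fields and the function name are hypothetical, unreported items are treated as unmet, and no attempt is made to assign Class II to IV because those rules are given in S2 Table and are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StudyAppraisal:
    """Hypothetical record of one study; None means 'not clearly reported'."""
    prospective: bool
    sufficient_follow_up: Optional[bool]
    follow_up_rate: Optional[float]  # proportion between 0 and 1
    similar_disease_stage: bool
    adjusted_for_other_factors: bool


def unmet_class_i_criteria(study: StudyAppraisal) -> List[str]:
    """List the Class I requirements that the study does not clearly satisfy."""
    unmet = []
    if not study.prospective:
        unmet.append("prospective design")
    # Assumption: an unreported item counts as unmet.
    if study.sufficient_follow_up is not True:
        unmet.append("sufficient follow-up period")
    if study.follow_up_rate is None or study.follow_up_rate < 0.80:
        unmet.append("follow-up rate >= 80%")
    if not study.similar_disease_stage:
        unmet.append("similar point in disease course")
    if not study.adjusted_for_other_factors:
        unmet.append("accounted for other prognostic factors")
    return unmet


# Hypothetical prospective study that reports neither follow-up length nor rate.
study = StudyAppraisal(True, None, None, True, True)
unmet = unmet_class_i_criteria(study)
```

For this hypothetical study the two unmet items are the follow-up period and the follow-up rate, the same two criteria cited for the three prospective studies in this review.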
Prognostic factors for survival varied substantially according to primary tumor type. The following factors were negatively associated with survival: for breast cancer, shorter time interval from cancer diagnosis to SSM surgery, emergency hospital admission, primary tumor with poor/undifferentiated histologic grade and negative progesterone receptors [48]; for HHC, low serum albumin and high lactate dehydrogenase [52]; for prostate cancer, KPS 50-70% [54], Gleason score > 8
For predictive factors used as categorical variables, the referent is underlined only when it was clearly reported in a table or specified in the text.
* The authors report Frankel score as being a statistically significant predictor when they reported setting p < 0.05 as significant.

Methodological issues
All 17 studies used the Cox proportional hazards (PH) regression method for their multivariate survival analysis. Five studies [24,43,46,47,49] did not provide a clear definition, measurement or categorization of their predictors, e.g. "High versus Intermediate versus Low KPS" without defining the corresponding KPS numerical range. One study [50] identified CCIS ≥ 2 as a predictor of survival although CCIS ≥ 2 is not a discriminatory factor given that the sole presence of metastatic solid tumor gives a CCIS of 6 [58]. Four studies [43,44,49,53] did not clearly report which predictors were assessed using univariate analysis and, among these studies, one [49] did not report any results for these analyses. Three studies [52,53,57] did not specify how predictors were selected to enter the multivariate analysis. Four studies [24,48,50,51] did not analyze predictors that were described in the Methods section, and two studies [47,53] did not clearly distinguish the results from uni- and multivariate analyses. Only three studies [43,48,50] mentioned testing for the proportional hazards assumption (PHA). While two [48,50] of these studies specified the statistical method used to test the PHA, none actually reported their result. One study [43] reported testing for collinearity but reported neither the technique used nor the results. Eight studies [43,45,48,49,51,53,55,56] did not report how many patients died during follow-up. Among these, four [45,51,55,56] included more predictor degrees of freedom in their multivariate model than their total sample size n divided by 10.
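The last check above can be sketched in a few lines. Because the number of deaths can never exceed the total sample size n, a model whose predictor degrees of freedom exceed n / 10 would also exceed the limit computed from the true, unreported number of events. The function names and the example numbers below are hypothetical, and the threshold of 10 is used here as a rule of thumb rather than a strict requirement.

```python
def max_supported_df(n_events, per_df=10):
    """Largest number of predictor degrees of freedom supported by n_events."""
    return n_events // per_df


def exceeds_rule_of_thumb(predictor_df, n_events, per_df=10):
    """True when the model has more predictor degrees of freedom than n_events / per_df."""
    return predictor_df > n_events / per_df


# Hypothetical study: deaths not reported, so the total sample size (60)
# is used as an upper bound on the number of events.
sample_size = 60
flagged = exceeds_rule_of_thumb(predictor_df=8, n_events=sample_size)
```

With 60 patients the bound allows at most 6 predictor degrees of freedom, so a hypothetical model with 8 is flagged even under the most favourable assumption that every patient died during follow-up.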

Overall strength of evidence related to survival
Seven studies examined the preoperative clinical factors associated with survival in patients with SSM from all sites of primary tumor including multiple myeloma (MM). A total of 20 factors were identified, among which 11 were related to the site/histology of the primary tumor [24, 43-45, 50, 56, 57]. Two studies [24,59] (Table 3). Various preoperative factors of survival were identified in multivariate analysis in specific groups of SSM patients. Although two studies examined the preoperative factors of survival in patients with SSM from prostate cancer, they did not consider the same predictors. KPS 50-70% [54], Gleason score > 8 [49], total number of metastases > 5 [49], presence of lymph node metastases at the time of surgery [49] and degree of canal compression > 25% [49] had a Low strength of evidence at baseline and their respective final strength of evidence was Very low due to high risk of bias (Table 3). Based on two studies [41,51], the four predictors of survival for SSM resulting from NSCLC also had a Low strength of evidence at baseline. All predictors were downgraded to a Very low final strength of evidence: low performance status and 14 days from onset of motor deficit to surgery because of high risk of bias, and ≥ 3 vertebral metastases and presence of visceral metastasis because of high risk of bias and inconsistency (Table 3). Preoperative prognostic factors in patients with (1) unknown site of primary tumor at the time of SSM surgery, (2) breast cancer, (3) HHC, (4) lung cancer, and (5) age ≥ 60 years with SSM from heterogeneous primary tumors were each derived from a single study, all of which had a Low strength of evidence at baseline. The final strength of evidence for predictors of survival in breast cancer [48] was Low while all the others [46,47,52,55] were downgraded to Very low due to high risk of bias [46,47,52,55] and imprecision [47] (Table 4).

Summary of findings
To our knowledge, this is the first systematic literature review that has sought to determine the key preoperative predictors of survival (KQ1), neurologic (KQ2), functional (KQ3), and HRQoL (KQ4) outcomes in patients with SSM who underwent surgical treatment. This systematic review identified 17 studies related to our KQ1 that conducted multivariate analysis and reported a total of 46 preoperative prognostic factors of survival in surgical SSM patients. All 17 prognostic studies were rated as having a moderately high risk of bias (Class III evidence). The final strength of the overall body of evidence was graded low for 7 and very low for the remaining 39 predictors of survival.
In spite of performing a literature search designed to maximize sensitivity, this review was only able to identify studies addressing KQ1. Six studies examined the clinical prognostic factors of functional status (KQ3) and one study isolated predictors of HRQoL (KQ4), but these studies were excluded because they included surgical and/or postoperative predictors. Inclusion of such predictors in the multivariate analysis runs the risk of precluding relevant preoperative predictors from either being selected in the final model or showing statistical significance. Also, final models that retain surgical/postoperative predictors are not relevant in the preoperative period, which is the critical time-point for clinical decision-making. Therefore, while this review had limited success in establishing preoperative prognostic factors of survival, it also highlights the dearth of evidence related to predictors of neurologic, functional, and HRQoL outcomes in surgical SSM patients.

Methodological considerations and recommendations
Due to the nature of cohort studies, prognostic data from these will be biased and may not be generalizable to patients with spinal metastases. Cohort studies are often limited by cost and timescales, and by significant losses to follow-up. Spinal centers may cover large geographical areas, and patients may be transferred elsewhere for subsequent oncological treatments. Patients may fail to return for spinal clinic follow-up at prearranged appointment times because of travel constraints, because they are too unwell or undergoing other treatments, or because they prefer to be reviewed by local oncologists instead. In addition, survival analyses are inherently complex, and our attempt to synthesize this group of 17 such studies was challenging due to considerable heterogeneity in patient samples and prognostic factors investigated. Furthermore, the design and reporting of the statistical analyses were problematic in many studies, leading to a moderately high risk of bias and difficulty interpreting the results. Since multivariate techniques applied to systematically collected data from a specific patient population may improve clinical prediction by identifying key prognostic factors [23,60] and it is likely that various factors conjointly influence clinical outcomes such as survival or HRQoL, performing multivariate analysis was one of our inclusion criteria. Conducting multivariate analysis not only helps control for confounders, thus enhancing the confidence in the validity of the study results [61], but also provides an estimate of the actual effect size, offering both a clinical and statistical assessment of the impact of each factor on the outcome variable. [62] Prognostic studies should be designed and conducted to minimize potential biases related to six domains. [35] (1) Study participants and sample: Data should be collected prospectively. Patients should be at a common point in the course of their disease. 
Patient sample assembly should include method, period, place of recruitment, and eligibility criteria. Patient sample should be adequately described for key characteristics. (2) Study attrition: The follow-up period should be long enough for the outcome(s) of interest to occur. The proportion of participants completing the study should be reported and adequate for the study design and analyses. If applicable, the reason(s) for loss to follow-up should be recorded. Account and measurement of (3) prognostic factors, (4) outcomes and (5) confounding factors: the definition and method of measurement of prognostic factors, outcomes and confounders should be clearly described, valid, reliable and appropriate. An adequate proportion of the study sample should have complete data for prognostic factor, outcome and confounders, and if imputation is used for missing data, the method should be described and appropriate. (6) Analysis: The statistical analyses, including model selection and building, should be suitable for the study design, assumptions should be verified, and if applicable, adequate adjustment for confounding should be undertaken. Finally, all results should be adequately reported. [35,63,64]

The development of clinical prediction rules
While there is no well recognized CPR for HRQoL, the variable prognostic ability of current CPRs of survival [24][25][26] in this patient population may be related to the fact that patients who are deemed surgical candidates are fundamentally different, with overall greater life expectancy and fewer comorbidities, than patients selected for conservative or radiotherapy treatment alone. Selecting relevant predictors from a larger set of candidate predictors is one of the steps involved in the first phase of development of CPRs; these predictors are typically derived from best literature evidence. 
These CPRs could be of high clinical value by providing more accurate estimates of survival and HRQoL after surgery, helping not only to guide therapeutic decision-making during informed consent discussions, but also to help patients form more realistic expectations relative to surgical outcomes.

Strengths and limitations
The systematic literature review was conducted in accordance with the PRISMA guidelines.
We assessed the quality of the studies and evaluated the strength of the overall body of evidence for each preoperative predictor identified through our sensitive and rigorous literature review. However, this review aimed to identify predictors of a wide range of outcomes, combining the results of studies with substantial heterogeneity in the prognostic factors, outcome measures, and patient populations that were assessed, which may constitute a problem with internal validity. Furthermore, our a priori eligibility criteria were relatively narrow in their requirement and may have excluded studies that produced pertinent findings. In predicting future outcomes by using patient data available at presentation, there will always be a degree of randomness or "chaos" in the system affecting clinical outcomes and survival. [65] Although we may improve prediction by establishing better methodology, there will always be random variability between studies and between patients, and there comes a point where studying preoperative patient variables too closely may not be helpful, due to the inherent variation that does not improve regardless of increasing sample size.

Conclusions
Life expectancy and HRQoL are cornerstones of clinical decision-making in surgical SSM patients. Based on the results of 17 pertinent studies, this systematic review found a low overall strength of evidence for seven preoperative predictors of survival and very low strength of evidence for 39 additional predictors. Consequently, we have low confidence that the evidence reflects the true effect size of these predictors. Furthermore, no evidence was found for the prediction of neurologic, functional, and HRQoL outcomes. Further rigorously conducted prospective studies are needed to better understand what preoperative factors are prognostic of these various outcomes, for the purpose of surgical decision-making, development of CPRs, patient education and managing treatment expectations. Genetic analysis of tumor subtypes will also need to be included in future prediction models, since novel chemotherapies and immunotherapies are showing promising influence on survival and HRQoL.