Time to Definitive Health-Related Quality of Life Score Deterioration in Patients with Resectable Metastatic Colorectal Cancer Treated with FOLFOX4 versus Sequential Dose-Dense FOLFOX7 followed by FOLFIRI: The MIROX Randomized Phase III Trial

Purpose: We previously showed that sequential chemotherapy with dose-dense oxaliplatin (FOLFOX7) and irinotecan (FOLFIRI; irinotecan plus 5-fluorouracil/leucovorin) is not superior to FOLFOX4 in patients with advanced-stage colorectal cancer with liver metastases. Here we aimed to determine whether time until definitive deterioration (TUDD) of the health-related quality of life (HRQoL) score differs by study arm.
Methods: HRQoL was evaluated using the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 at baseline and every 4 cycles until the end of the study or death. Functional scales, symptom scales, global health status, and financial difficulties were analyzed. The TUDD was defined as the time interval between randomization and the first decrease in HRQoL score of ≥ 5 points with no further improvement in HRQoL score of ≥ 5 points or any further HRQoL data. TUDD was estimated using the Kaplan-Meier method and compared using the log-rank test. Cox regression analyses were used to identify factors influencing TUDD. Sensitivity analyses were done using a multiple imputation method and different definitions of TUDD.
Results: Of the 284 patients, 171 (60.2%) completed HRQoL questionnaires. Cox multivariate analysis showed no statistically significant difference in TUDD for most of the QLQ-C30 scales between treatments. TUDD was significantly longer for dyspnea when there was a delay >12 months between diagnosis of the primary tumor and metastases (HR 0.48 [0.26–0.89]) and for diarrhea in patients without symptoms at baseline (HR 0.59 [0.36–0.96]).
Conclusion: This study shows that TUDD does not differ significantly according to type of treatment. The TUDD method produces meaningful longitudinal HRQoL results that may facilitate effective clinical decision making in patients with mCRC.
Trial Registration: ClinicalTrials.gov NCT00268398


Introduction
Colorectal cancer (CRC) is the third leading cause of cancer death worldwide [1]. Almost 50% of patients with CRC will present with metastases at some stage of their disease [2]; metastases are the main cause of mortality associated with this cancer.
In cases of isolated CRC metastases (mainly to the liver), complete resection is the cornerstone of treatment, allowing longer survival or even cure. Resection of metastases is possible in only 15%-25% of patients, with 5-year survival rates between 25% and 65% in the most recent evaluations [3][4][5]. Unfortunately, the majority of patients (50%-75%) will relapse within 2 years.
Additional systemic chemotherapy is increasingly used to reduce the risk of relapse. Based on the results of EORTC trial 40983, the standard of care for CRC patients is perioperative chemotherapy consisting of 12 cycles (6 cycles before and 6 cycles after surgery) of FOLFOX4 (oxaliplatin 85 mg/m²) [6]. To reduce the risk of oxaliplatin-related neuropathy and to increase efficacy, a modified schedule containing 6 cycles of FOLFOX7 (oxaliplatin 130 mg/m²) followed by 6 cycles of FOLFIRI (irinotecan plus 5-fluorouracil/leucovorin) was assessed [7]. In the phase III MIROX trial, we previously compared this modified schedule with 12 cycles of FOLFOX4 in patients with resectable metastatic CRC (mCRC). Chemotherapy was either perioperative or postoperative, depending on patient and disease characteristics (especially synchronicity of metastases). Sequential chemotherapy with FOLFOX7-FOLFIRI was not superior to FOLFOX4 in that study.
The importance of health-related quality of life (HRQoL) is well recognized, particularly in patients with advanced cancer. Still, HRQoL results are rarely used to modify therapeutic strategies, mostly because of the complexity of longitudinal analysis and the lack of standardization, which make it difficult to produce clinically meaningful HRQoL data. Moreover, in clinical studies of advanced-stage disease, missing data (often arising when patients miss visits or do not fill in certain questionnaires due to rapid deterioration or death) are an important problem that can hamper the interpretation of HRQoL results [8].
Time until definitive deterioration (TUDD) in QoL score has been defined as a method of longitudinal analysis in oncology [9][10][11][12]. In the metastatic setting, this method allows patients' data to be preserved for analysis even if some of their questionnaires are missing, and produces results that are clinically meaningful and readable for clinicians, such as a Kaplan-Meier survival curve and a hazard ratio (HR). The aims of this analysis were to compare TUDD for QLQ-C30 scales between the FOLFOX7-FOLFIRI and FOLFOX4 arms and to investigate the applicability of this method to mCRC using a subset of patients enrolled in the MIROX study.

Patients and eligibility criteria
Eligibility criteria and study design have been described previously [7]. This was an open-label, randomized, phase III trial conducted in 19 French centers, comparing the efficacy of FOLFOX7-FOLFIRI with that of FOLFOX4 in mCRC. Patients were eligible if they had histologically confirmed colorectal adenocarcinoma with initially resectable/resected metastases in only one site (liver, lung, ovary, or peritoneum). For the peritoneum, only a single, completely resected metastasis discovered during resection of the primary tumor was allowed. Apart from this case, there was no upper limit on the number of metastases. Other eligibility criteria included age 18-75 years, WHO performance status 2, and adequate hematological, renal, and hepatic functions. Prior adjuvant chemotherapy for CRC was allowed if ended 12 months before relapse. Patients were randomized (1:1) with a minimization technique stratified by chemotherapy timing (perioperative versus postoperative), local intervention (surgery versus radiofrequency ablation [RFA] with/without surgery), and Fong's score (0-1 versus 2-3 versus 4-5). Patients received either 12 FOLFOX4 cycles (oxaliplatin dose: 85 mg/m²) or 6 FOLFOX7 cycles (oxaliplatin dose: 130 mg/m²) followed by 6 FOLFIRI cycles (irinotecan dose: 180 mg/m²), with 1 cycle every 2 weeks.
All patients were fully informed of the study and provided signed written informed consent. The protocol was approved by the ethics committees of Lille, France ("Comité de Protection des Personnes"). The MIROX study (Combination Chemotherapy in Treating Patients with Colorectal Cancer and Resectable Metastases) was registered on ClinicalTrials.gov (Identifier: NCT00268398). The protocol for this trial (including the written informed consent form and the list of Ethics Committees) and the supporting CONSORT checklist are available as supporting information (S1 and S2 Protocols and S1 Checklist).
The primary endpoint was 2-year disease-free survival (DFS), and the analysis was conducted by intention to treat (ITT). Secondary endpoints were overall survival (OS), objective response rate (ORR), resection type (R0-R2), toxicity, and HRQoL. The TUDD approach was used for the longitudinal analysis of HRQoL. Between May 2004 and June 2010, 284 patients were enrolled.

HRQoL assessment
HRQoL was assessed using the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 [13]. Assessment was performed at baseline (the week before randomization) and every 4 cycles thereafter. The QLQ-C30 is a 30-item cancer-specific tool that generates global health status (GHS), five functional scales (physical, role, emotional, cognitive, and social), eight symptom scales (fatigue, nausea/vomiting, pain, dyspnea, insomnia, anorexia, constipation, diarrhea), and one financial difficulties item. Scoring was completed according to the EORTC scoring manual. Raw scores were linearly transformed to a 0 to 100 scale. For GHS, lower scores represent worse HRQoL and higher scores better HRQoL, while for symptom scales lower scores represent better HRQoL and higher scores worse HRQoL.
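The linear transformation to the 0 to 100 scale can be illustrated with a minimal sketch. It follows the general form given in the EORTC scoring manual (raw score as the mean of the answered items, then rescaling by the item range), but the function name and arguments are hypothetical, and the manual's rule for partially completed scales is not handled here.

```python
from typing import Sequence


def qlq_c30_score(items: Sequence[int], item_range: int, functional: bool) -> float:
    """Transform item responses into a 0-100 scale score.

    items       : responses for one scale (1-4 for most items, 1-7 for GHS items)
    item_range  : maximum minus minimum possible response (3 or 6)
    functional  : True for functional scales, where the direction is reversed
                  so that a higher score means better functioning
    """
    raw = sum(items) / len(items)        # raw score = mean of the items
    scaled = (raw - 1) / item_range      # rescale to the 0-1 interval
    if functional:
        scaled = 1 - scaled              # reverse functional scales
    return scaled * 100


# Hypothetical responses, for illustration only
physical = qlq_c30_score([1, 1, 1, 1, 1], item_range=3, functional=True)
fatigue = qlq_c30_score([2, 3], item_range=3, functional=False)
ghs = qlq_c30_score([7, 7], item_range=6, functional=False)
```

With this convention, a patient answering "not at all" to every physical functioning item obtains the best functional score, whereas a higher fatigue score indicates a heavier symptom burden.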

Statistical methods
The primary endpoint was DFS, defined as the interval between randomization and first evidence of relapse or death from any cause. ORR was evaluated according to the RECIST v.1.0 criteria [14]. Adverse events (AEs) were graded according to the NCI-CTCAE v.2.0 and an oxaliplatin-specific scale for neuropathy [15]. The sample size was based on the hypothesis that the 2-year DFS might be improved from 30% with FOLFOX4 chemotherapy to 45% with FOLFOX7-FOLFIRI (Hazard Ratio [HR] of 0.66). To demonstrate this 15% difference with 80% power and a two-sided type I error (α) of 5%, 188 events were required. Based on an estimated inclusion period of 36 months and 24 months of follow-up, at least 248 patients had to be enrolled. Assuming a drop-out rate of 20% (disease progression before surgery, R2 resections, or loss to follow-up), a total of 284 patients, 142 per treatment arm, was required.
Analyses were carried out on all patients who received at least one dose of treatment, based on a modified ITT (mITT) approach. Survival and median follow-up were estimated by the Kaplan-Meier and the reverse Kaplan-Meier method, respectively [16,17]. Outcomes were compared between treatment groups using a log-rank test. Proportional hazard assumptions were tested using scaled Schoenfeld residuals [18].
Continuous variables were described by means with standard deviations (SD) and medians (min-max), and qualitative variables by percentages. Patient characteristics were described according to the completion of the questionnaire at baseline in order to identify a non-random profile of patients with missing data. Questionnaire completion rates were calculated as a percentage of all patients who completed a questionnaire at a given time point. Completion rates and baseline HRQoL scores were compared according to treatment arm. Randomized patients with available HRQoL scores at baseline, regardless of eligibility criteria, were included in the HRQoL analyses (modified ITT analysis).
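The arithmetic behind the sample size hypothesis can be checked with a short sketch. It assumes exponential survival (so that the HR equals the ratio of the log survival proportions) and uses Schoenfeld's approximation for the number of events with 1:1 allocation; the text does not state which formula the authors used, so the function names and the choice of approximation are assumptions.

```python
from math import log
from statistics import NormalDist


def hazard_ratio_from_survival(s_control: float, s_experimental: float) -> float:
    """HR implied by two survival proportions at the same time point,
    assuming exponential (constant-hazard) survival in both arms."""
    return log(s_experimental) / log(s_control)


def schoenfeld_events(hr: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate number of events for a two-sided log-rank test
    with 1:1 allocation (Schoenfeld's approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided type I error
    z_beta = z.inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hr) ** 2


# 2-year DFS of 30% (FOLFOX4) versus 45% (FOLFOX7-FOLFIRI), as hypothesized
hr = hazard_ratio_from_survival(0.30, 0.45)
events = schoenfeld_events(hr)
print(round(hr, 2), round(events))
```

Under these assumptions the implied HR rounds to the 0.66 quoted above, and the required number of events is of the same order as the 188 reported; an exact match is not expected, since the original calculation may have used a different formula or rounding.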

Analysis of HRQoL
TUDD was defined as the interval between randomization and the first decrease in HRQoL score of ≥ 5 points compared to the baseline score with no further improvement of ≥ 5 points; a patient who dropped out after a 5-point decrease (resulting in missing data) or who died was also considered to have experienced an event [10,19]. Patients who were alive were censored at the last HRQoL follow-up if no 5-point deterioration from baseline was observed, or if a 5-point decrease was followed by a subsequent 5-point improvement [10]. All randomly assigned patients with a baseline and at least one post-baseline HRQoL assessment were included in TUDD analyses.
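The definition above can be made concrete with a minimal per-patient sketch. Several points are assumptions made for illustration and are not specified in the text: a visit counts as deteriorated when its score is at least 5 points worse than baseline; the deterioration is treated as definitive when every later visit also remains deteriorated; and a death without prior definitive deterioration is counted as an event at the time of death. The function name and data layout are hypothetical.

```python
from typing import List, Optional, Tuple

MCID = 5.0  # threshold for a clinically relevant change, in points


def tudd(baseline: float,
         visits: List[Tuple[float, float]],
         death_time: Optional[float] = None,
         higher_is_better: bool = True) -> Tuple[float, bool]:
    """Return (time, event) for one patient and one scale.

    visits: (time since randomization, score) for post-baseline assessments.
    higher_is_better: True for GHS and functional scales, False for symptoms.
    """
    sign = 1.0 if higher_is_better else -1.0
    # Flag each visit as deteriorated by >= MCID points versus baseline
    flagged = [(t, sign * (baseline - s) >= MCID) for t, s in sorted(visits)]

    # Walk backwards to find the start of the final unbroken run of
    # deteriorated visits: the first decrease with no further improvement
    event_time = None
    for t, deteriorated in reversed(flagged):
        if not deteriorated:
            break
        event_time = t

    if event_time is not None:
        return event_time, True          # definitive deterioration
    if death_time is not None:
        return death_time, True          # death counted as an event
    last_visit = flagged[-1][0] if flagged else 0.0
    return last_visit, False             # censored at last HRQoL follow-up


# Hypothetical GHS trajectory: dip, recovery, then lasting deterioration
print(tudd(70, [(2, 60), (4, 72), (6, 62), (8, 55)]))
```

In this hypothetical trajectory the early dip at the first visit is not counted because the score later recovers; the event is dated at the start of the final deteriorated run. A patient whose last recorded score is deteriorated and who then drops out is counted as an event, in line with the definition.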
TUDD was estimated using the Kaplan-Meier method and compared between arms with the log-rank test. TUDD was described using medians with 95% confidence intervals (CI). The univariate Cox model was used to calculate HRs with 95% CIs. The multivariate Cox model, with treatment arm and other covariates, was applied to identify independent factors associated with TUDD for each scale. All variables meeting the univariate P value threshold of 0.20 in the Cox univariate analyses were eligible for multivariate analyses. Correlations were tested for eligible variables. To prevent collinearity, when two variables were significantly correlated, one variable was retained according to its clinical relevance or to the value of the likelihood ratio. The treatment arm was forced into the multivariate analyses. Progression status was included in the Cox analyses as a time-dependent variable.
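For readers less familiar with the estimator, the Kaplan-Meier step applied to (time, event) pairs such as those produced by a TUDD definition can be sketched as follows. This is a plain product-limit implementation written for illustration, with hypothetical data; it does not reproduce the Stata routines used in the study, and the log-rank test and Cox models are not shown.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival) at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # deteriorations observed at time t
        n_leaving = 0  # events plus censored observations at time t
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def median_time(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical TUDD data (months); False marks a censored patient
times = [2, 4, 4, 6, 8]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
print(curve, median_time(curve))
```

Censored patients contribute to the number at risk until their last HRQoL assessment, which is how the method retains patients with incomplete follow-up.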

Sensitivity analyses
As only patients with a baseline HRQoL score were considered in the TUDD analyses, sensitivity analyses were performed to evaluate the effect of the discarded group of patients on the results. Multiple imputation with the predictive mean matching (PMM) method was used to handle missing baseline scores. PMM matches the missing value to the observed value with the closest predicted mean (or linear prediction) [20]. The TUDD analyses were repeated on the multiply imputed data in the same way as on the original data set.
Two sensitivity analyses varying the definition of an event for TUDD were also performed. In the first approach, death was excluded as an event from the TUDD definition. The second approach used time to deterioration (TTD), defined as the interval between randomization and the first 5-point decrease in HRQoL compared to baseline HRQoL [11]. No further HRQoL assessment after this deterioration was considered. Patients were censored at the time of the last HRQoL assessment if they had not deteriorated before that.
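The matching idea behind PMM can be shown with a deliberately simplified sketch: a single predictor, an ordinary least-squares fit on the complete cases, and one nearest donor. Proper multiple imputation additionally draws the regression parameters at random and samples among several close donors to create multiple data sets; none of that is shown here, and the variable names and data are hypothetical.

```python
from typing import List, Optional, Sequence


def pmm_impute(x: Sequence[float],
               y: Sequence[Optional[float]]) -> List[float]:
    """Fill missing y values with the observed y of the donor whose
    predicted mean is closest (single-donor, single-predictor PMM)."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(xi: float) -> float:
        return intercept + slope * xi

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        # Donor = complete case whose linear prediction is nearest
        donor = min(observed, key=lambda obs: abs(predict(obs[0]) - predict(xi)))
        completed.append(donor[1])   # impute the donor's observed value
    return completed


# Hypothetical predictor and baseline scores; None marks a missing score
predictor = [50, 60, 70, 68]
baseline_score = [40.0, 58.3, 75.0, None]
print(pmm_impute(predictor, baseline_score))
```

Because the imputed value is always borrowed from an actual respondent, it stays within the range of plausible questionnaire scores, which is one reason PMM is often preferred to imputing the regression prediction itself.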
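The TTD definition differs from TUDD in that the first qualifying decrease is the event, whatever happens afterwards. A minimal sketch, written for a scale on which higher scores are better and using hypothetical names and data, makes the contrast explicit.

```python
from typing import List, Tuple


def ttd(baseline: float,
        visits: List[Tuple[float, float]],
        threshold: float = 5.0) -> Tuple[float, bool]:
    """Return (time, event): the event is the first visit whose score is
    at least `threshold` points below baseline; later visits are ignored."""
    ordered = sorted(visits)
    for t, score in ordered:
        if baseline - score >= threshold:
            return t, True
    last_visit = ordered[-1][0] if ordered else 0.0
    return last_visit, False   # censored at last HRQoL assessment


# Hypothetical trajectory with a transient dip at the first visit
print(ttd(70, [(2, 60), (4, 72), (6, 62), (8, 55)]))
```

For a trajectory with a transient early dip, TTD records the event at the dip, whereas a definitive-deterioration definition would date it later or censor the patient, so TTD generally yields shorter times.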
As HRQoL was a secondary endpoint of the MIROX trial, no multiplicity adjustment was performed. All tests were two-sided and analyses were performed with Stata 11 software [21].

Patients
Between May 2004 and June 2010, 284 patients were enrolled. One hundred forty-two patients received FOLFOX4 and 142 were given FOLFOX7 followed by FOLFIRI. The treatment groups were well balanced for baseline characteristics (S1 Table). Median age was 62 years, 67% of patients were male, 68% had colon cancer, and 33% had disease symptoms at baseline. Details have been given elsewhere [7].

Baseline HRQoL score and HRQoL compliance
Compliance with HRQoL assessment is summarized in Fig 1. One hundred seventy-one (60.2%) patients completed at least one HRQoL questionnaire during the study period: 83 (48.5%) in the FOLFOX4 arm and 88 (51.5%) in the FOLFOX7-FOLFIRI arm. The differences between the number of patients who responded to the questionnaire as compared to the nonresponders were as follows: 130 (76.0%) vs 141 (82.4%) at baseline, 100 (58.5%) vs 105 (61.4%) after 4 cycles, 65 (30.0%) vs 70 (40.9%) after 8 cycles, and 28 (16.3%) vs 29 (16.9%) after 12 cycles. Baseline characteristics were similar between the two populations, except for gender and presence of disease symptoms (S2 Table). Patients were more likely to participate in the HRQoL evaluation if they were women (49% vs 35% of men) and if they had no symptoms (44% vs 28% of patients with symptoms). Patients who completed the HRQoL questionnaire at baseline (130; 76.0%) had similar characteristics, except for age (Table 1). At least one score was missing in 16% of patients younger than 63 years compared to 32% of those older than 63 years. The two treatment arms had similar baseline HRQoL scores in all domains except for constipation score (
Multivariate Cox analyses (Fig 4 and S3 Table) showed no statistically significant difference in TUDD between treatments for the studied scales of the QLQ-C30, except for three dimensions. TUDD of pain was significantly associated with treatment arm (P = 0.044). Patients treated with FOLFOX7-FOLFIRI had a significantly longer TUDD (40% increase). Dyspnea was significantly associated with a delay > 12 months between diagnosis of the primary tumor and metastases (metachronous), the univariate HR of 0.

Sensitivity analyses
In the univariate sensitivity analysis of TUDD excluding death as an event (Fig 5, S4 Table), the TUDD did not differ significantly according to type of treatment. In addition, this result was  Table). In multivariate Cox analysis using multiple imputation data (not shown), TUDD of the pain symptom was no longer significantly associated with treatment arm (HR 0.7 [0.44-1.10] for FOLFOX7-FOLFIRI compared to FOLFOX4, P = 0.13).

Discussion
HRQoL has become an increasingly important treatment outcome in the care of cancer patients, especially those with advanced disease. In the context of mCRC, the HRQoL endpoint takes on great importance and complements the traditional endpoints in the assessment of treatment effectiveness [20]. Consequently, there is a need for a statistical method for the longitudinal analysis of HRQoL that can capture meaningful changes in HRQoL scores.
In this study, we used TUDD, with or without death as an event, as the primary endpoint of the HRQoL analysis, following the methodology described by Bonnetain et al. [10]. We chose this conservative method, which accounts for non-ignorable missing data, for several reasons. Given that a definitive deterioration of HRQoL can occur before patients withdraw from the study, TUDD is less sensitive to the presence of missing data in the context of advanced disease than classical repeated-measures analyses. Patients with some missing questionnaires are not excluded from the statistical analysis as long as data from at least one HRQoL assessment are obtained. Moreover, if a patient died during follow-up or experienced deterioration and did not have assessments after that point, this was taken to reflect a definitive deterioration of the patient's health. Furthermore, the measure of TUDD is robust and more familiar to clinicians because it is based on Kaplan-Meier survival curves and HRs, which they already use to interpret survival outcomes. The TUDD approach is close to other time-to-event analyses, such as time to progression, and has already been used in the analysis of HRQoL in other cancer locations [9][10][11][22,23]. In the setting of CRC, Kabbinavar et al. [22] showed that HRQoL was similar whatever the treatment. In that study, the time to deterioration in HRQoL for CRC patients was analyzed with death or disease progression counted as events. We did not include progression of the disease as an event in our analysis. However, when it was used as a time-dependent variable, no association between TTD and progression was found.
The importance of HRQoL dimensions has been reported in many studies, since they affect HRQoL in CRC survival. We analysed the longitudinal changes in HRQoL in mCRC patients treated with FOLFOX4 versus FOLFOX7-FOLFIRI. Our results show that type of treatment did not significantly influence TUDD for the main dimensions of the QLQ-C30 scores, suggesting that switching from oxaliplatin to irinotecan in the treatment of resectable mCRC does not improve patients' HRQoL. This result is consistent with those observed in studies of FOLFOX and FOLFIRI in this setting. The presence of diarrhea and pain hampers HRQoL among CRC patients with advanced disease [24,25]. In our analysis, patients without symptoms at inclusion had a significantly longer TUDD for diarrhea. Similarly, TUDD for dyspnea was significantly longer when the delay between diagnosis of the primary tumor and metastases exceeded 12 months. The latter observation may be explained by the fact that a longer interval between diagnosis and metastases gives the patient a substantial chance of recovering from the effects of first-line treatment (surgery and/or adjuvant chemotherapy) and, in turn, of reaching a satisfactory GHS.
In the present study, we found that patients treated with FOLFOX7-FOLFIRI deteriorated less rapidly on the pain symptom scale than those treated with FOLFOX4. However, this result must be interpreted with caution given that the difference between the arms reached statistical significance only in multivariate analysis, possibly due to the effect of unbalanced sample size, an extremely large within-group variation relative to between-group variation, and the influence of missing data.
Some limitations of the analysis presented here should be noted. Firstly, although the MIROX study represents a large dataset, not all patients completed questionnaires, which could have an impact on the validity of the results. Although the non-responder rate was high (39.8%), the characteristics of non-responders were similar to those of responders. Nevertheless, a 60.2% HRQoL response rate is considerable for a study in an advanced-disease population. Secondly, as we did not correct for multiple comparisons, this may have introduced possible inflation of the type I error and in turn resulted in a lack of significant findings or in findings with particularly low significance [26,27].
In conclusion, the present HRQoL results support our recent findings of no clinically or statistically significant difference between FOLFOX4 and FOLFOX7-FOLFIRI in the main dimensions of the QLQ-C30 scores in CRC patients with advanced disease. TUDD is an accessible statistical approach for the longitudinal analysis of HRQoL whose results are readily meaningful to clinicians and are more likely to influence clinical decision making.