In evaluating the clinical benefit of a new targeted agent in a phase 3 trial enrolling molecularly selected patients with advanced non-small cell lung cancer (NSCLC), overall survival (OS) seems to be of limited use as an endpoint because of the high level of treatment crossover permitted for ethical reasons. A more efficient and useful indicator for assessing efficacy is needed.
Methods and Findings
We identified 18 phase 3 trials in the literature that compared EGFR-tyrosine kinase inhibitors (TKIs) or ALK-TKIs, now approved for the treatment of NSCLC, with standard cytotoxic chemotherapy (eight trials were performed in molecularly selected patients and ten used an “all-comer” design). Receiver operating characteristic analysis was used to identify the best threshold by which to divide the groups. Although trials enrolling molecularly selected patients and all-comer trials had similar OS-hazard ratios (OS-HRs) (0.99 vs. 1.04), the former showed a greater PFS benefit, i.e., lower progression-free survival-hazard ratios (PFS-HRs) (mean, 0.40 vs. 1.01; P<0.01). A PFS-HR of 0.60 successfully distinguished between the two types of trials (sensitivity 100%, specificity 100%). The odds ratio for overall response was higher in trials with molecularly selected patients than in all-comer trials (mean: 6.10 vs. 1.64; P<0.01). An odds ratio of 3.40 for response afforded a sensitivity of 88% and a specificity of 90%.
Citation: Hotta K, Kato Y, Leighl N, Takigawa N, Gaafar RM, Kayatani H, et al. (2015) Magnitude of the Benefit of Progression-Free Survival as a Potential Surrogate Marker in Phase 3 Trials Assessing Targeted Agents in Molecularly Selected Patients with Advanced Non-Small Cell Lung Cancer: Systematic Review. PLoS ONE 10(3): e0121211. https://doi.org/10.1371/journal.pone.0121211
Academic Editor: Keitaro Matsuo, Kyushu University Faculty of Medical Science, JAPAN
Received: September 27, 2014; Accepted: February 11, 2015; Published: March 16, 2015
Copyright: © 2015 Hotta et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The authors have no support or funding to report.
Competing interests: KH and NT have received honoraria from Eli Lilly Japan, AstraZeneca, Pfizer, Taiho Pharmaceutical, Chugai Pharmaceutical, and Sanofi. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.
Platinum-based and single-agent cytotoxic chemotherapies have been the standard treatments for advanced non-small cell lung cancer (NSCLC) patients in first-line and salvage settings, respectively [1–5]. Unfortunately, even with such standard therapies, nearly all patients with advanced NSCLC experience disease progression, and thus the ultimate goals of palliative chemotherapy are to prolong overall survival (OS) and to improve symptoms and quality of life, rather than to cure. To date, evaluation of the efficacy of treatment strategies and the approval of most new agents used to treat advanced NSCLC have been based principally on OS prolongation in randomized clinical trials [6]. OS, the time from randomization to death from any cause, represents a direct measure of clinical benefit to the patient, and to date, no other endpoint has been shown statistically to serve as a suitable surrogate for OS in advanced NSCLC [7–12].
Many molecular targeted agents, including epidermal growth factor receptor (EGFR)- and anaplastic lymphoma kinase (ALK)-tyrosine kinase inhibitors (TKIs), have been assessed in phase 3 trials, compared with standard cytotoxic chemotherapy. Several trials have failed to demonstrate significant improvement in OS and/or progression-free survival (PFS), mainly because appropriate patient selection was not applied, i.e. the trials used molecularly unselected or all-comer designs. However, even in trials using only patients selected after evaluation of EGFR-mutant or ALK-fusion status (thus, molecularly selected patients) [13–21], EGFR- and ALK-TKIs failed to demonstrate any significant advantage in terms of OS, although these drugs are now widely approved in the U.S., E.U., and Japan. This would be explained by the inevitable high levels of crossover, essential from an ethical viewpoint, that allow control-arm patients to access these highly active investigational agents [13,18]. Thus, the data do not reflect inappropriate patient numbers or inadequate efficacy of the tested agents.
Currently, medical oncologists have strong views that in trials with crossover designs in molecularly selected patients, 1) the lack of any observed effect on OS does not necessarily mean that the agent is not efficacious; 2) the use of OS as the primary endpoint is limited; and 3) other endpoints are now urgently required to evaluate the efficacy of molecular targeted agents. We hypothesized that, if a substantial benefit in PFS or overall response is evident specifically in molecularly selected but not all-comer trials, the PFS-hazard ratio (HR) or the odds ratio for the overall response would serve as a useful novel indicator in the former trial setting. To propose novel, efficient, and useful markers of efficacy in this setting, we reviewed published phase 3 trials that compared EGFR- or ALK-TKIs (gefitinib, erlotinib, afatinib, or crizotinib) with traditional cytotoxic chemotherapy. We then identified differences in the magnitudes of PFS-HRs or odds ratios for the overall response between the two types of trials, those conducted in molecularly selected populations versus those conducted in all-comers.
We performed a literature search of trials published between January 2003 and June 2014. To avoid publication bias, both published and unpublished trials were identified using a computer-based search of the PubMed database and of abstracts from conferences of the American Society of Clinical Oncology (ASCO), the European Society for Medical Oncology (ESMO), and the World Conference on Lung Cancer (WCLC) of the International Association for the Study of Lung Cancer. The following search terms were used: “lung cancer AND advanced AND phase III study OR phase 3 study OR phase 3 trial OR phase III trial OR randomized controlled trial OR clinical trial OR controlled clinical trial”. Our search was also guided by a thorough examination of the reference lists of original and review articles, books, and meeting abstracts (ASCO, ESMO, and WCLC), and of the Physician Data Query registry of clinical trials.
Eligible phase 3 trials were those that evaluated EGFR-TKIs or ALK-TKIs in the treatment of advanced NSCLC (Fig. 1) and provided data on PFS, the overall response rate, and OS. Drugs acting on known specific molecular targets were defined as molecular targeted agents [22,23]. Trials designed to assess combined modality treatment, including radiotherapy and/or surgery, were excluded. We selected phase 3 trials that compared EGFR- or ALK-TKIs with existing cytotoxic chemotherapy.
To avoid bias, two certified medical oncologists (K.H. and Y.K.) independently abstracted the trial data and compared their results, as described previously [5,24]. The following information was obtained from each report: year of trial initiation, number of patients randomized, treatment regimens, line of treatment, publication type, primary endpoint, PFS- and OS-HRs, and the number of responders. All data were verified for internal consistency, and disagreements were resolved by discussion between the investigators. The principal investigators of the trials were contacted and invited to confirm or update published data.
We performed linear regression analysis to investigate associations after assigning weights determined by sample size to each trial. The strength of each association was defined a priori using the commonly accepted criterion of the coefficient of determination (the R-squared value; r²), which ranges from 0 to 1, with a higher value indicating a stronger association [24,25].
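As an illustration of the weighted analysis described above, the following is a minimal sketch of a sample-size-weighted simple linear regression and its R-squared value. It is not the STATA procedure used in this study; the function name `weighted_r_squared` and the numeric inputs are hypothetical and are not data from the reviewed trials.

```python
def weighted_r_squared(x, y, w):
    """Fit y = intercept + slope * x by weighted least squares.

    Returns (intercept, slope, r_squared). The weights w play the role
    of trial sample sizes; this helper is illustrative only.
    """
    total_w = sum(w)
    # Weighted means of the predictor and the outcome.
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Weighted cross-product and weighted sum of squares of x.
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Weighted residual and total sums of squares give R-squared.
    ss_res = sum(wi * (yi - (intercept + slope * xi)) ** 2
                 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical values for illustration, NOT data from the reviewed trials.
pfs_hr = [0.4, 0.5, 0.9, 1.1]
os_hr = [1.0, 0.9, 1.0, 1.1]
n_patients = [200, 150, 600, 800]
intercept, slope, r2 = weighted_r_squared(pfs_hr, os_hr, n_patients)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R-squared={r2:.3f}")
```

An R-squared value near 0 from such a fit would indicate that the PFS-HR explains little of the between-trial variability in the OS-HR.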
Any influence of trial design (molecularly selected patients vs. all-comers) on the PFS-HR or the odds ratio of the overall response was evaluated by multiple stepwise regression analysis using the following stepping criteria: P-value allowing model entry, ≤ 0.05; P-value compelling removal from the model, ≥ 0.20, with adjustment for several confounders including the year of trial initiation, line of treatment, primary endpoint, number of randomized patients, and type of reporting.
The significance of differences between groups was assessed using t-tests. Receiver operating characteristic (ROC) analysis was used to identify the most accurate discrimination thresholds dividing the groups. The most suitable cutoff level was defined as that closest to the top-left corner. The odds ratio for the overall response was calculated as follows: ([number of patients in the investigational arm who achieved a complete or partial response: A]/[number of randomized patients allocated to the investigational arm − A])/([number of patients in the control arm who achieved a complete or partial response: B]/[number of randomized patients allocated to the control arm − B]).
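The odds ratio defined above can be written as a short function. This is a sketch for illustration; the function name `response_odds_ratio` and the counts in the example are hypothetical and do not come from any of the reviewed trials.

```python
def response_odds_ratio(resp_inv, n_inv, resp_ctl, n_ctl):
    """Odds ratio for overall response, investigational vs. control arm.

    resp_inv / resp_ctl: patients with a complete or partial response (A, B).
    n_inv / n_ctl: randomized patients allocated to each arm.
    """
    non_resp_inv = n_inv - resp_inv
    non_resp_ctl = n_ctl - resp_ctl
    if non_resp_inv <= 0 or non_resp_ctl <= 0 or resp_ctl <= 0:
        # The ratio is undefined when a denominator would be zero.
        raise ValueError("odds ratio is undefined for these counts")
    odds_inv = resp_inv / non_resp_inv
    odds_ctl = resp_ctl / non_resp_ctl
    return odds_inv / odds_ctl


# Hypothetical counts: 70/100 responders vs. 30/100 responders.
print(round(response_odds_ratio(70, 100, 30, 100), 2))  # prints 5.44
```

With equal response rates in both arms, the function returns 1.0, which corresponds to no difference in the odds of response.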
All P values were calculated using two-sided tests, and the level of significance was set at P < 0.05. Statistical analyses were performed using STATA (Version 11; StataCorp, College Station, TX, USA).
Of the 6,950 trials screened, 18 phase 3 trials that investigated four molecular targeted agents (gefitinib, erlotinib, afatinib, and crizotinib) in patients with advanced NSCLC were identified (S1 Table). The trials included a total of 7,633 randomized patients (Fig. 1). The trial characteristics are listed in Table 1. We found eight trials enrolling molecularly selected patients and 10 all-comer trials. Sixteen trials evaluated EGFR-tyrosine kinase inhibitors (TKIs) in patients with EGFR-mutant NSCLC, and the remaining two trials assessed the use of crizotinib, an ALK-TKI, to treat ALK-rearranged NSCLC.
Correlation between the OS-HR and PFS-HR and between the OS-HR and the odds ratio of the overall response
First, we examined the strength of the correlation between the PFS-HR and the OS-HR. As shown in Fig. 2A, the PFS-HR had no meaningful association with the OS-HR (overall R-squared value = 0.233), suggesting that the PFS-HR explained only 23.3% of the overall variability in the OS-HR. This weak association was especially apparent in molecularly selected patient trials compared with all-comer design trials (R-squared values = 0.002 vs. 0.409, respectively; Fig. 2B). Similar observations were made when the association between the odds ratio of the overall response and the OS-HR was examined (overall R-squared value = 0.101, Fig. 2C). Again, the association was weaker in trials with molecularly selected patients than in all-comer trials (R-squared values = 0.039 vs. 0.429, respectively; Fig. 2D).
Association between the PFS-HR and OS-HR (R-squared = 0.233) (A), and that after stratification by trial design (B) (the molecularly selected patient design [blue], R-squared = 0.002, vs. the all-comer design [pink], R-squared = 0.409; P-value for interaction = 0.34). Associations between the odds ratio of the overall response and OS-HR (R-squared = 0.101) (C), and that after stratification by trial design (D) (the molecularly selected patient design [blue], R-squared = 0.039, vs. the all-comer design [pink], R-squared = 0.429; P-value for interaction = 0.03). All analyses were weighted by trial size.
Neither the PFS-HR nor the odds ratio of the overall response accurately predicted OS when a linear regression model was used to analyze data from molecularly selected patient trials.
OS-HRs in trials with molecularly selected patient and all-comer designs
We found no significant difference in the OS-HRs between the two trial types (mean, 0.99 vs. 1.04 in molecularly selected patient trials vs. all-comer trials, respectively; P = 0.50) (Fig. 3A). In contrast, median survival time in molecularly selected patient trials was approximately double that in all-comer trials (median 23.1 and 26.6 months in the investigational and control arms of molecularly selected trials, respectively, compared with 11.9 and 12.2 months, respectively, in all-comer trials).
The left and right columns in each panel represent data from molecularly selected patient trials and all-comer trials, respectively. The diameter of each circle is representative of the size of the trial.
A, Trials with molecularly selected patients had almost identical OS-HRs compared with those of all-comer trials (mean, 0.99 vs. 1.04; P = 0.50). B, The PFS-HRs were 0.40 vs. 1.01 in the two trial types (P < 0.01). C, Trials with molecularly selected patients had significantly greater odds ratios in terms of the overall response (mean, 6.10 vs. 1.64; P < 0.01).
PFS-HRs in both molecularly selected patient and all-comer trials
We next investigated differences in the PFS-HRs between the two trial groups. Molecularly selected patient trials had a significantly lower PFS-HR, i.e., a greater PFS benefit, than did all-comer trials (mean, 0.40 vs. 1.01; P < 0.01; Fig. 3B). This significant influence of trial design on the PFS-HR was observed even after adjustment for several potential confounders in multivariate analysis (trials using molecular selection had a PFS-HR 0.42 points lower than that of the all-comer trials; P < 0.01; Table 2).
ROC analysis revealed that a PFS-HR of 0.60 was a useful cutoff point to distinguish the two types of trial designs, with a sensitivity and specificity of 100% and 100%, respectively, and an area under the ROC curve (AUC) of 1.00 (Table 3, Fig. 4A).
The most suitable cutoff was defined as that closest to the upper left corner. A, A PFS-HR of 0.60 was the optimal cutoff for distinguishing molecularly selected patient trials from all-comer trials (sensitivity 100%, specificity 100%, and AUC [area under the receiver operating characteristic curve], 1.00). B, The odds ratio for an overall response of 3.40 was a potentially useful cutoff to distinguish trials with molecularly selected patients from all-comer trials (sensitivity 88%, specificity 90%, and AUC = 0.95).
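The cutoff criterion used here, the ROC point closest to the upper-left corner, can be sketched as follows. This is an illustrative sketch rather than the STATA routine used in the study; the function name `closest_to_top_left` and the input values are hypothetical, and candidate cutoffs are restricted to observed values as a simplifying assumption.

```python
import math


def closest_to_top_left(values, is_selected):
    """Pick the cutoff whose ROC point lies nearest to (0, 1).

    A trial is classified as 'molecularly selected' when its value
    (e.g., a PFS-HR) is <= cutoff. Returns (cutoff, sensitivity, specificity).
    """
    n_pos = sum(1 for s in is_selected if s)
    n_neg = len(is_selected) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, s in zip(values, is_selected) if s and v <= cutoff)
        fp = sum(1 for v, s in zip(values, is_selected) if not s and v <= cutoff)
        sens = tp / n_pos
        spec = 1 - fp / n_neg
        # Distance from the ROC point (1 - spec, sens) to the corner (0, 1).
        dist = math.hypot(1 - spec, 1 - sens)
        if best is None or dist < best[0]:
            best = (dist, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical PFS-HRs, NOT data from the reviewed trials.
pfs_hr = [0.30, 0.40, 0.50, 0.90, 1.00, 1.10]
selected = [True, True, True, False, False, False]
print(closest_to_top_left(pfs_hr, selected))  # prints (0.5, 1.0, 1.0)
```

In this toy example the two groups do not overlap, so any cutoff between 0.50 and 0.90 separates them perfectly. For a measure in which higher values indicate the selected design, such as the odds ratio for response, the values could be negated before calling the function.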
Odds ratios of the overall response both in molecularly selected patient and all-comer trials
The odds ratio of the overall response was higher in trials with molecularly selected patients than in all-comer trials (mean, 6.10 vs. 1.64, respectively; P < 0.01; Fig. 3C). This difference was maintained in multivariate analysis (the former trial type had an odds ratio 4.46 points greater than that of the all-comer trials; P < 0.001; Table 2).
The ROC curve indicated that an odds ratio of 3.40 for the overall response was a potentially useful cutoff point to identify trials with molecularly selected patients, affording a sensitivity of 88%, a specificity of 90%, and an AUC of 0.95 (Table 3, Fig. 4B). The odds ratio, even in combination with the PFS-HR, did not increase the probability of detecting trials of molecularly selected patients (Table 3).
We noted robust benefits in terms of both the PFS and overall response in trials evaluating approved EGFR-TKIs or ALK-TKIs in molecularly selected patients (Fig. 3B and 3C, Table 3). In particular, a PFS-HR of approximately 0.6 was a useful cutoff for distinguishing molecularly selected patient trials from all-comer trials, with a sensitivity of 100% and a specificity of 100% (Fig. 4A, Table 3). To date, PFS has not been shown to be a statistically acceptable surrogate for OS because of the lack of a strong association between PFS and OS in advanced NSCLC patients [7,9]. Thus, PFS is not a formal surrogate endpoint but rather a potential future indicator of the clinical benefit of molecular targeted agents in trial designs in which an OS endpoint is of limited utility.
The principal result of our study of 18 phase 3 trials assessing EGFR-TKIs and an ALK-TKI was that the PFS benefit yielded by the approved molecular targeted agents in molecularly selected patient trials was sufficiently large to allow distinction between the two trial types, with high accuracy, at a PFS-HR cutoff level of 0.6 (Fig. 3B and Table 3). Assuming that trials with molecularly selected patients have shown and will continue to show only small differences in OS, caused by high levels of crossover treatment, regardless of the effectiveness of the investigational agent, the extent of PFS benefit could serve as an important measure of the clinical benefit (Fig. 3A and Table 3). The U.S. Food and Drug Administration (FDA) considers PFS a valid clinical endpoint for advanced NSCLC in regulatory decisions on drug approval when the magnitude of the effect is substantial [6]. However, “substantial magnitude” remains poorly defined. Here, we describe a cutoff level that will be of potential use in future trials using molecularly selected patients; use of this cutoff may help resolve this long-standing issue.
In contrast, the odds ratio for the overall response seemed less useful for distinguishing trials using molecularly selected patients (Fig. 3C and Table 3), possibly because the overall response did not accurately reflect dramatic tumor shrinkage; rather, it reflected the proportion of patients in whom the tumor diameter was reduced by ≥ 30%, thus ignoring profound shrinkage. A novel concept is required to establish surrogacy of the overall response; the proportion of patients exhibiting “dramatic responses”, as revealed by a waterfall plot, might suffice.
Neither of the recent randomized trials LUX-Lung 3 and LUX-Lung 6 revealed any significant OS advantage of afatinib, one of the existing EGFR-TKIs, over platinum-based chemotherapy, although both a combined analysis of these two trials and some subgroup analyses showed an OS benefit of the investigational agent. Our current results would also apply in such a situation, as long as a trial demonstrates a large PFS benefit but no OS benefit.
A limitation of our study was that all analyses were performed in the absence of any detailed individual patient information, and thus future patient-based data analyses may be necessary to confirm our present findings [27,28]. In addition, we included a limited number of clinical trial analyses that were retrospective in design, and we only analyzed trials evaluating EGFR- or ALK-TKIs. Furthermore, PFS might be a more useful endpoint if it were combined with other endpoints such as quality of life, but no relevant data on this were available to us. Therefore, our work is still at the stage of hypothesis generation; we believe further studies are strongly warranted.
In conclusion, OS is no longer of utility in trials using molecularly selected patients that allow subsequent crossover to active investigational agents. In this situation, molecularly targeted trials using PFS as an endpoint would be considered positive if the PFS-HR is less than or equal to 0.6. Although the desired threshold might differ in an individual trial, we have contributed critical information to the long-standing debate on potential endpoints alternative to the traditional OS endpoint used in trial design.
Conceived and designed the experiments: KH YK. Performed the experiments: KH YK. Analyzed the data: KH YK. Contributed reagents/materials/analysis tools: KH YK. Wrote the paper: KH YK NL NT RMG HK TH KD TK M. Tabata M. Tanimoto KK.
- 1. Non-Small Cell Lung Cancer Collaborative Group. Chemotherapy in non-small cell lung cancer: a meta-analysis using updated data on individual patients from 52 randomized clinical trials. BMJ 311:899–909, 1995 pmid:7580546
- 2. Matsuo K, Ueoka H, Kiura K, Tabata M, Tanimoto M: Meta-analysis of randomized clinical trials comparing cisplatin to carboplatin in patients with advanced non-small-cell lung cancer. J Clin Oncol 22:3852–3859, 2004 pmid:15326195
- 3. Hotta K, Matsuo K: Long-standing debate on cisplatin- versus carboplatin-based chemotherapy in the treatment of advanced non-small-cell lung cancer. J Thorac Oncol 2:96, 2007 pmid:17410021
- 4. Fujiwara Y, Matsuo K, Suzuki T, Kiura K, Tabata M, et al: Recent improvement in the survival of patients with advanced non-small-cell lung cancer enrolled in phase III trials of first-line systemic chemotherapy. Cancer 109:939–948, 2007 pmid:17285602
- 5. Shepherd FA, Dancey J, Ramlau R, Mattson K, Gralla R, O'Rourke M, et al. Prospective randomized trial of docetaxel versus best supportive care in patients with non-small-cell lung cancer previously treated with platinum-based chemotherapy. J Clin Oncol 18:2095–2103, 2000 pmid:10811675
- 6. United States, Department of Health and Human Services, Food and Drug Administration (FDA). Guidance for Industry: Clinical Trial Endpoints for the Approval of Non-Small Cell Lung Cancer Drugs and Biologics. Bethesda, MD: FDA; 2011. [Available online at: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM259421.pdf; accessed Aug 3, 2014]
- 7. Laporte S, Squifflet P, Baroux N, Fossella F, Georgoulias V, Pujol JL, et al. Prediction of survival benefits from progression-free survival benefits in advanced non-small-cell lung cancer: evidence from a meta-analysis of 2334 patients from 5 randomised trials. BMJ Open 3: e001802. 2013 pmid:23485717
- 8. Hotta K, Suzuki E, Di Maio M, Chiodini P, Fujiwara Y, Takigawa N, et al: Progression-free survival and overall survival in phase III trials of molecular-targeted agents in advanced non-small-cell lung cancer. Lung Cancer 79:20–26, 2013 pmid:23164554
- 9. Hotta K, Kiura K, Fujiwara Y, Takigawa N, Hisamoto A, Ichihara E, et al: Role of Survival Post-progression in Phase III Trials of Systemic Chemotherapy in Advanced Non-Small-Cell Lung Cancer: A Systematic Review. PLoS ONE 6:e26646, 2011 pmid:22114662
- 10. Hotta K: Defining the gold standard: PFS v OS. Educational book in 15th World Conference on Lung Cancer. 2013: 219–222. [Available online at: http://www.2013worldlungcancer.org/documents/WCLC2013-EducationalBook-2013.12.10.pdf; accessed August 3, 2014]
- 11. American Society of Clinical Oncology. Clinically Meaningful Outcomes. [Available online at: http://www.asco.org/practice-research/clinically-meaningful-outcomes; accessed August 3, 2014].
- 12. Hotta K, Fujiwara Y, Matsuo K, Kiura K, Takigawa N, Tabata M, et al: Time to Progression as a Surrogate Marker for Overall Survival in Patients with Advanced Non-small Cell Lung Cancer. J Thorac Oncol 4:311–317, 2009 pmid:19190515
- 13. Maemondo M, Inoue A, Kobayashi K, Sugawara S, Oizumi S, Isobe H, et al. Gefitinib or chemotherapy for non-small-cell lung cancer with mutated EGFR. N Engl J Med 362:2380–2388, 2010 pmid:20573926
- 14. Rosell R, Carcereny E, Gervais R, Vergnenegre A, Massuti B, Felip E, et al: Erlotinib versus standard chemotherapy as first-line treatment for European patients with advanced EGFR mutation-positive non-small-cell lung cancer (EURTAC): a multicentre, open-label, randomised phase 3 trial. Lancet Oncol 13:239–246, 2012 pmid:22285168
- 15. Zhou C, Wu YL, Chen G, Feng J, Liu XQ, Wang C, et al: Erlotinib versus chemotherapy as first-line treatment for patients with advanced EGFR mutation-positive non-small-cell lung cancer (OPTIMAL, CTONG-0802): a multicentre, open-label, randomized, phase 3 study. Lancet Oncol 12:735–742, 2011 pmid:21783417
- 16. Mitsudomi T, Morita S, Yatabe Y, Negoro S, Okamoto I, Seto T, et al: Updated overall survival results of WJTOG 3405, a randomized phase III trial comparing gefitinib with cisplatin plus docetaxel as the first-line treatment for patients with non-small cell lung cancer harboring mutations of the epidermal growth factor receptor (EGFR). J Clin Oncol 30: suppl. abstr. 7521, 2012
- 17. Shaw AT, Kim DW, Nakagawa K, Seto T, Crinó L, Ahn MJ, et al: Crizotinib versus chemotherapy in advanced ALK-positive lung cancer. N Engl J Med 368:2385–2394, 2013 pmid:23724913
- 18. Inoue A, Kobayashi K, Maemondo M, Sugawara S, Oizumi S, Isobe H, et al: Updated overall survival results from a randomized phase III trial comparing gefitinib with carboplatin-paclitaxel for chemo-naïve non-small cell lung cancer with sensitive EGFR gene mutations (NEJ002). Ann Oncol 24:54–59, 2013 pmid:22967997
- 19. Solomon BJ, Mok T, Kim DW, Wu YL, Nakagawa K, Mekhail T, et al. First-line crizotinib versus chemotherapy in ALK-positive lung cancer. N Engl J Med. 371:2167–77, 2014. pmid:25470694
- 20. Wu YL, Zhou C, Hu CP, Feng J, Lu S, Huang Y, et al. Afatinib versus cisplatin plus gemcitabine for first-line treatment of Asian patients with advanced non-small-cell lung cancer harbouring EGFR mutations (LUX-Lung 6): an open-label, randomised phase 3 trial. Lancet Oncol 15:213–222, 2014 pmid:24439929
- 21. Sequist LV, Yang JC, Yamamoto N, O'Byrne K, Hirsh V, Mok T, et al. Phase III study of afatinib or cisplatin plus pemetrexed in patients with metastatic lung adenocarcinoma with EGFR mutations. J Clin Oncol 31:3327–3334, 2013 pmid:23816960
- 22. Yamanaka T, Okamoto T, Ichinose Y, Oda S, Maehara Y. Methodological aspects of current problems in target-based anticancer drug development. Int J Clin Oncol 11:167–175, 2006 pmid:16850122
- 23. Fujiwara Y, Hotta K, Di Maio M, et al: Time trend in treatment-related deaths of patients with advanced non-small-cell lung cancer enrolled into phase III trials of systemic treatment. Ann Oncol 22:376–382, 2011 pmid:20699278
- 24. Hotta K, Kiura K, Fujiwara Y, Takigawa N, Oze I, Ochi N, et al. Association between incremental gains in the overall response rate and survival improvement in phase III trials of first-line chemotherapy for extensive disease small-cell lung cancer. Ann Oncol 20:829–834, 2009 pmid:19221150
- 25. Chu T-M, Weir B, Wolfinger R. A systematic statistical linear modeling approach to oligonucleotide array experiments. Math Biosci 176: 35–51, 2002 pmid:11867082
- 26. Blumenthal GM, Karuri S, Khozin S, Kazandjian D, Zhang H, Zhang L, et al. Overall response rate (ORR) as a potential surrogate for progression-free survival (PFS): A meta-analysis of metastatic non-small cell lung cancer (mNSCLC) trials submitted to the U.S. Food and Drug Administration (FDA). J Clin Oncol 32:5s, 2014 (suppl; abstr 8012)
- 27. Piedbois P, Buyse M: Meta-analysis based on abstracted data: A step in the right direction, but only a first step. J Clin Oncol 22:3839–3841, 2004 pmid:15326196
- 28. Matsuo K, Hotta K, Ueoka H: They also serve who stand and wait: Do individual clinical researchers too? Hope for individual patient data as public domain. J Clin Oncol 23: 1334–1335, 2005 pmid:15718348