Comparative efficacy and safety of licensed treatments for previously treated non-small cell lung cancer: A systematic review and network meta-analysis

Purpose. This systematic review with network meta-analysis compared the efficacy and safety of currently licensed second-line treatments in patients with late stage non-small cell lung cancer (NSCLC).
Methods. Randomised controlled trials (RCTs) of participants with advanced/metastatic NSCLC receiving second/third line treatments were screened. We searched electronic databases (MEDLINE; EMBASE; Web of Science) from January, 2000 to July, 2017. Two reviewers screened bibliographic records, extracted data, and assessed risk of bias of included studies. The outcomes were overall survival (OS), progression-free survival (PFS), and drug-related grade 3–5 adverse events (AEs). We pooled study-specific hazard ratios (HR; for OS and PFS) and risk ratios (RR; for AEs) using conventional and network meta-analyses, and ranked interventions by the surface under the cumulative ranking curve.
Findings. We included 11 RCTs (7,581 participants) comparing nine drugs. All drugs except for erlotinib significantly improved OS compared to docetaxel. Nivolumab was the highest ranking drug followed by atezolizumab and pembrolizumab. There was no significant difference in OS across these three drugs (HR = 0.98, 95% CI 0.79, 1.21 for nivolumab vs atezolizumab; HR = 0.98, 95% CI 0.77, 1.25 for nivolumab vs pembrolizumab). For PFS, ramucirumab + docetaxel and nivolumab were the drugs with the highest ranking. All interventions except ramucirumab + docetaxel had a reduced risk for severe drug-related AEs vs. docetaxel. Of the drugs with the highest ranking on AEs, nivolumab was significantly safer compared to atezolizumab (RR = 0.55, 95% CI 0.38, 0.79) or pembrolizumab (RR = 0.52, 95% CI 0.34, 0.81).
Implications. Nivolumab, pembrolizumab and atezolizumab exhibited superior benefit/risk balance compared to other licensed drugs used in late stage NSCLC. Our results indicate that the use of immunotherapies in people diagnosed with non-specific late stage NSCLC should be promoted. The use of docetaxel may now be judged irrelevant as a comparator intervention for approval of new drugs for second line treatment of NSCLC.
Study registration number. PROSPERO CRD42017065928.


Introduction
Lung cancer remains one of the most common cancers worldwide [1], with non-small cell lung cancer (NSCLC) accounting for 85 to 90% of all forms of lung cancer. [2] Because NSCLC is predominantly diagnosed at a late stage, most patients are not eligible for otherwise curative surgery, and thus have poor prognoses. While many first-line chemotherapies are available for patients with advanced/metastatic NSCLC, second-line therapeutic options have been limited to docetaxel. [3] The development of targeted therapies and immunotherapies promises to fill some of the unmet need for the treatment of advanced/metastatic NSCLC. In 2017, 13 agents had a label indication for the treatment of advanced/metastatic NSCLC in patients after failure to respond to first-line chemotherapy. These include three immune checkpoint inhibitors (nivolumab, pembrolizumab, and atezolizumab). Although the effectiveness and safety of these drugs have been compared to those of docetaxel, they have not been compared to each other head-to-head.
In this systematic review and network meta-analysis (NMA), we compared the clinical efficacy and safety of the agents according to their licensed indication in patients with NSCLC (free of anaplastic lymphoma kinase [ALK] positive and epidermal growth factor receptor [EGFR] positive expression) for whom first-line treatments had failed.

Methods
We registered a protocol for this review in PROSPERO (CRD42017065928) (Study protocol in S1 File; PRISMA checklist in S2 File).

Eligibility criteria: Studies, participants, and interventions
We included randomised controlled trials (RCTs) of people with advanced or metastatic (stage IIIB or IV) NSCLC of squamous, non-squamous, or mixed histology who experienced failure of prior first-line chemotherapy. Study populations had to have negative or predominantly negative expressions of ALK and EGFR. Patients with ALK and/or EGFR positive expression were ineligible, since they would be offered targeted therapies (e.g., erlotinib, gefitinib, osimertinib, crizotinib, or ceritinib). [1] The interventions of interest were the drugs with a European Medicines Agency (EMA) label indication for the population described above as of June, 2017: Docetaxel (DOC), Pemetrexed (PEME), Ramucirumab plus docetaxel (RAMU+DOC), Erlotinib (ERLO), Nintedanib plus docetaxel (NINTE+DOC), Afatinib (AFA), Nivolumab (NIVO), Pembrolizumab (PEMBRO), and Atezolizumab (ATEZO). The outcomes assessed were overall survival (OS), progression-free survival (PFS), the proportion of patients reporting at least one drug-related grade 3 to 5 adverse event (AE), and the proportion of patients discontinuing study medication due to a drug-related AE.

Search strategy and study selection
English language studies were searched in databases (MEDLINE; EMBASE; Web of Science) from January, 2000 to July, 2017 (Supplementary online material A in S3 File).
Reference lists of relevant studies were scanned to identify additional citations. We consulted the EMA website to identify trials submitted by manufacturers in support of included drugs and sought relevant conference abstracts via relevant web sites.
Three reviewers (X.A., A.T., & M.C.) independently screened all titles/abstracts and examined full-text publications of potentially relevant citations. Disagreements were discussed and resolved through consensus. The study flow and reasons for exclusion at the full-text level were documented in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow-chart. [4]

Review outcomes and data extraction
Two reviewers (X.A. & A.T.) independently extracted relevant data using an a priori defined pre-piloted extraction sheet. Data extracted included study author, country, funding source, sample size, patient characteristics (age, sex, diagnosis, data on tumour stage/histology), type, mode, dose and duration of treatments, dropouts, and efficacy/safety outcomes of interest. The data extracted were cross-checked and any disagreements were resolved by discussion or recourse to another reviewer (M.C.).
For each study, we ascertained the estimates of hazard ratio (HR) for OS and PFS and risk ratios (RR) for drug-related grade 3 to 5 AEs, and discontinuation of study medication due to drug-related AE with corresponding 95% confidence intervals (95% CI). We extracted the HRs as reported in the primary studies. These were all derived from Cox regression stratified according to strata specified for randomisation. HRs adjusted for variables additional to randomisation strata were not included in the NMA. If time to progression (TTP) was reported, but not PFS, we used the TTP HR as a proxy for PFS HR. We used "treatment-emergent AEs" as a proxy for drug-related grade 3 to 5 events, if the latter was not reported.
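As an illustration of how a reported HR and its 95% CI can be put on the scale used for pooling, the sketch below recovers the log HR and its standard error from the confidence limits. It assumes the interval is symmetric on the log scale, which is the usual case for Cox model output; the function name `log_hr_and_se` is hypothetical, and the numbers are the NIVO vs ATEZO OS estimate quoted in the abstract, used here only as a numeric example.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z=1.959964):
    """Return (log HR, standard error) from a reported HR and its 95% CI.

    Assumes the CI was built as exp(log HR +/- z * SE), i.e. it is
    symmetric on the log scale.
    """
    log_hr = math.log(hr)
    # The CI width on the log scale equals 2 * z * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_hr, se

# Numeric example only: HR = 0.98 (95% CI 0.79, 1.21), as quoted in the abstract.
log_hr, se = log_hr_and_se(0.98, 0.79, 1.21)
print(f"log HR = {log_hr:.4f}, SE = {se:.4f}")
```

The same conversion applies to risk ratios, since RRs are also pooled on the log scale.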
When study results were available for different follow-ups, we extracted the outcomes from the latest follow-up irrespective of the publication type. To address incomplete reporting of outcomes, we used methods published by Tierney et al. [5] and by Guyot et al. [6] in the non-squamous analyses. The label indication for PEME specifies NSCLC "other than predominantly squamous histology," hence PEME was excluded from squamous analyses. For PEMBRO [9], we analysed data from the licensed 2 mg/kg arm.
We used pairwise random-effects meta-analysis to pool the study-specific estimates with 95% CIs. The heterogeneity across trials was examined by visual inspection of forest plots and I² statistics (I² > 50% indicating a substantial degree of heterogeneity). Sensitivity analyses were planned to assess the robustness of effect estimates across two RoB domains: allocation concealment and blinding of outcome assessors.
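The pooling step and the I² statistic can be sketched as follows. This is a minimal example that assumes the DerSimonian-Laird estimator for the between-study variance; the text above does not state which random-effects estimator was used, so that choice, the function name `random_effects_pool`, and the input values are all illustrative assumptions rather than the review's actual procedure or data.

```python
import math

def random_effects_pool(log_effects, std_errors):
    """Pool log-scale effects with a DerSimonian-Laird random-effects model.

    Returns (pooled log effect, its SE, tau^2, I^2 in percent).
    """
    k = len(log_effects)
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, log_effects)) / sum(w)
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_effects))
    df = k - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, se_pooled, tau2, i2

# Hypothetical study-level log HRs and standard errors (not from the review).
effects = [math.log(0.70), math.log(0.80), math.log(0.95)]
ses = [0.10, 0.12, 0.15]
pooled, se_pooled, tau2, i2 = random_effects_pool(effects, ses)
low = math.exp(pooled - 1.959964 * se_pooled)
high = math.exp(pooled + 1.959964 * se_pooled)
print(f"HR = {math.exp(pooled):.2f} ({low:.2f}, {high:.2f}), I2 = {i2:.0f}%")
```

With a single study per comparison, Q and tau² are both zero and the pooled estimate reduces to that study's estimate.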
We assessed the transitivity assumption [11] by examining the distribution of the effect modifiers across studies (age, sex, performance status, stage IIIB vs IV at inclusion, and number of prior lines) and the dosages of common comparators used as anchor(s). Where possible, we planned to use a node-splitting test within each network with a loop to assess inconsistency between direct and indirect evidence. [12] We undertook random-effects network meta-analyses in the frequentist framework. Where there were few studies for each contrast between two treatments, we used a fixed-effect model. Summary league tables were generated for all comparisons. [13] We generated the surface under the cumulative ranking curve (SUCRA) to rank each intervention (i.e., probability of an intervention being superior in effectiveness or safety compared to DOC). [13] Clustered ranking plots for efficacy/safety outcomes were produced. [14] The threshold for statistical significance was set at a two-tailed alpha of 0.05. All statistical analyses were performed using Stata version 14.2 (StataCorp, USA).
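For readers unfamiliar with SUCRA, the sketch below computes it in the commonly used form: the average of the cumulative rank probabilities over the first a-1 ranks, where a is the number of treatments. The rank probabilities shown are hypothetical, and the function name `sucra` is illustrative; this is not the Stata routine used for the analyses.

```python
def sucra(rank_probs):
    """Compute SUCRA for one treatment.

    rank_probs[r] is the probability that the treatment has rank r + 1,
    with rank 1 being the best. The probabilities should sum to 1.
    """
    a = len(rank_probs)
    cumulative = 0.0
    total = 0.0
    # Sum the cumulative probabilities for ranks 1 .. a-1.
    for p in rank_probs[:-1]:
        cumulative += p
        total += cumulative
    return total / (a - 1)

# Hypothetical rank probabilities for three treatments (not from the review).
ranks = {
    "A": [0.70, 0.25, 0.05],
    "B": [0.25, 0.60, 0.15],
    "C": [0.05, 0.15, 0.80],
}
for name, probs in ranks.items():
    print(name, round(sucra(probs), 3))
```

A SUCRA of 1 means the treatment is certain to be ranked best and 0 means it is certain to be ranked worst, so plotting the SUCRA for efficacy against the SUCRA for safety gives the clustered ranking view.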

Results
Of 1,949 records identified and screened at title/abstract level, 94 were examined for full-text, of which 46 records [8][9][10] corresponding to 11 RCTs with a total of 7,581 participants were included (Fig 1).

Characteristics of included studies
The mean age at inclusion in the eleven RCTs ranged from 57 to 66 years, with a majority of male participants. The sample size ranged from 219 [8] to 1,314 [10] patients. All studies included predominantly people with stage IV NSCLC and performance status 1. Only two studies had histology-specific inclusion criteria. [47,48] The included RCTs compared nine different drugs (AFA, ATEZO, DOC, ERLO, NINTE+DOC, NIVO, PEMBRO, PEME, RAMU+DOC), the majority of which were compared to DOC. Six RCTs [10,47,48,50,51,57] included only people receiving second-line treatment, while four others [9,49,53,54] included those receiving both second- and third-line treatments. In the KEYNOTE-010 [9] study (PEMBRO vs DOC), patients had tumours expressing PD-L1 with a ≥1% tumour proportion score (TPS) (consistent with the marketing authorisation of PEMBRO). The characteristics of included studies are presented in Table 1.
Nine studies [8,9,47-49,51,53,54,57] were considered at high risk of bias for PFS and OS (due to the lack of blinding of participants and personnel). The five RCTs [9,47-49,54] evaluating immunotherapies were open-label and therefore were rated as high-risk on the domain of performance bias.

Efficacy outcomes (overall analysis regardless of histology groups)
The evidence formed a connected star-shaped network with only a single RCT for most of the comparisons (Fig 2). [8,9,50] Four included RCTs were not presented in the network plot because, in these, one of the evaluated interventions was restricted in its label indication to one specific histology subgroup (i.e. the intervention is not licensed for NSCLC irrespective of the patient's tumour histology). [10,51,53,57] These four RCTs were used in the analyses by histological subgroup, the results of which are reported in the subsequent sections.

OS comparisons (findings are expressed as HR (95% CI); random-effects model).
Discontinuation due to drug-related AE. No NMA could be conducted for this outcome, because unlike for the previous outcome (Supplementary online material E in S3 File) the RR estimates from direct comparisons were not stable across different points of study follow-up (Supplementary online material F in S3 File).
Overall results (cluster rank analysis). Overall, NIVO, ATEZO and PEMBRO exhibited dominance in efficacy and safety over alternative therapies. According to the cluster rank analysis, NIVO was the drug with the highest probability of being both the most effective (overall survival) and the safest (drug-related grade 3-5 AEs), followed by ATEZO and PEMBRO (Fig 4).

Efficacy outcomes by histology subgroups
The NMA for safety outcomes could not be performed due to sparse data.
Squamous histology. Head-to-head comparisons for OS and PFS are reported in Supplementary online materials G and H (both in S3 File), respectively. The studies formed connected, but sparse networks for OS and PFS, because not all studies reported these outcomes (Supplementary online material I in S3 File).
For OS, the SUCRA rankings suggested that NIVO (0.89) was the best intervention followed by ATEZO (0. For PFS, the network plot included one closed loop allowing a mixed treatment comparison between DOC, ERLO, and PEME (Supplementary online material N2 in S3 File). There was no evidence of inconsistency for the mixed treatment comparison (DOC, ERLO, PEME comparisons) within this loop (p = 0.07). The SUCRA rankings from the NMA suggested that RAMU+DOC (0.85) and NINTE+DOC (0.83) were the best interventions, followed by PEMBRO (0.58), NIVO (0.49), PEME (0.49), and DOC (0.16), with ERLO (0.10) ranking last (Supplementary online material P in S3 File). Among the four drugs with the highest rankings on PFS, no significant difference was observed.

Discussion
Overall, the evidence in this review indicated that the checkpoint inhibitors (NIVO, ATEZO, and PEMBRO) were superior in improving OS compared to non-immunotherapies irrespective of population histology (mixed, squamous or non-squamous) in people with advanced or metastatic NSCLC after failure to prior chemotherapy.
Indirect comparisons showed significantly reduced risks of drug-related grade 3-5 AEs with checkpoint inhibitors (NIVO, ATEZO, and PEMBRO) compared to RAMU+DOC. Taken together with OS results, this evidence suggested that the three immunotherapies were superior to other treatments (AFA, ERLO, PEME, DOC).
Note: Y and X axes represent the surface under the cumulative ranking curve (SUCRA) used to rank each intervention (i.e., probability between 0 and 1 of an intervention being superior in effectiveness or in safety compared to DOC); the plot guides a reader with respect to the trade-off between safety (measured as drug-related grade 3-5 AEs) and effectiveness (measured as OS) across the interventions: interventions in the right upper corner tend to be safer (higher SUCRA for AEs) and more effective (higher SUCRA for OS) than those in the left lower corner of the plot (with lower SUCRAs on both factors). Thus, Fig 4 supports a superior efficacy and safety for NIVO, ATEZO, and PEMBRO as opposed to DOC or ERLO. Also, although NIVO had similar effectiveness to ATEZO and PEMBRO, it appeared safer than the latter two. https://doi.org/10.1371/journal.pone.0199575.g004
The occurrence of drug-related AEs is a time-varying outcome, so intervention comparisons are best examined using similar periods of exposure/follow-up per patient. In the included studies, safety outcomes were reported at different points of follow-up.
Results based on indirect comparisons suggested a significantly reduced risk of drug-related grade 3-5 AEs with NIVO vs. ATEZO or PEMBRO (through DOC as the common comparator). One explanation could be the non-uniform occurrence rate of these events in the DOC arms (range: 35.9% [52] to 58.1% [46]) even though the same licensed dose regimen was used and duration of DOC treatment was comparable across the studies. Baseline characteristics of included patients do not suggest a particular reason explaining these differences. The incidence of drug-related grade 3-5 AEs across immunotherapy arms also showed slight differences between the three immunotherapies (range: 7.6% for NIVO [48] to 14.8% for ATEZO [54]). Owing to the above-mentioned discrepancies and the limited number of trials for each comparison, the observed more favourable safety profile of NIVO should be viewed with caution.
Peng et al. [58] have previously reported similar results regarding the better safety profile of NIVO vs PEMBRO.
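To make concrete why differences in the DOC arms matter, the sketch below shows an adjusted indirect comparison of two drugs through a common comparator, in the style of the Bucher method. In a star-shaped network with one trial per comparison, an NMA estimate is broadly of this form, but this is a simplified illustration and not the model fitted in Stata. The function name `indirect_comparison` and all input values are hypothetical.

```python
import math

def indirect_comparison(log_a_vs_c, se_a_vs_c, log_b_vs_c, se_b_vs_c, z=1.959964):
    """Indirect estimate of A vs B through a common comparator C.

    Inputs are log-scale effects (log RR or log HR) with standard errors.
    Returns (ratio, lower 95% limit, upper 95% limit) on the natural scale.
    """
    diff = log_a_vs_c - log_b_vs_c
    # Variances add because the two trials are independent.
    se = math.sqrt(se_a_vs_c ** 2 + se_b_vs_c ** 2)
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)

# Hypothetical RRs vs the common comparator (not taken from the review).
rr, low, high = indirect_comparison(math.log(0.20), 0.15, math.log(0.35), 0.12)
print(f"Indirect RR = {rr:.2f} ({low:.2f}, {high:.2f})")
```

Because each drug is only ever measured against its own trial's comparator arm, a higher event rate in one comparator arm lowers that drug's RR and carries straight through to the indirect estimate.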
In this work focusing on wild-type NSCLC (ALK and EGFR expression predominantly or 100% negative), ERLO was included although the summary of product characteristics for this drug indicates that "no survival benefit or other clinically relevant effects of the treatment have been demonstrated in patients with EGFR negative tumours". We nevertheless included ERLO in our review, because we considered that the label indication does still theoretically include people with EGFR-negative expression.
In patients with squamous histology, NIVO and ATEZO were the only drugs significantly improving OS compared to DOC. The effectiveness of PEMBRO vs DOC was similar to that of ATEZO vs DOC, but the former was not statistically significant, one explanation for which could be lower statistical power in KEYNOTE-010 to show an OS benefit per histology. The higher ranking of NIVO compared to ATEZO and PEMBRO observed for OS could be explained by a lower rate of OS in the DOC arm in CHECKMATE-017 [48] compared to that in OAK [54] or in REVEL. [26] The low number of studies per comparison limited the interpretation of these findings. Although this subgroup analysis suggested the immunotherapies as the most effective for OS, there was little evidence showing one of the three drugs of this class being superior to another.
The meta-analyses in patients with non-squamous histology showed significantly improved OS with all the drugs except for ERLO compared to DOC. None of the indirect comparisons across PEMBRO, ATEZO, NIVO, PEME, NINTE+DOC and RAMU+DOC showed a significant improvement in OS. We were unable to meaningfully compare drugs on safety outcomes in the histology-specific subgroups of patients.
A recently published systematic review with NMA synthesised 102 RCTs to assess the efficacy and safety of 61 second-line treatments for patients with NSCLC, regardless of whether the drugs (or drug combinations) were licensed or commercialised in this population. [59] Although the review authors provided a comprehensive evidence synthesis, their findings may have limited applicability to routine clinical practice. In contrast, the focus on licensed indications and dose regimens renders our review clinically more relevant.
Our work has several limitations. Although we used a systematic search approach we may have missed some unpublished relevant studies with null findings, so the potential for publication bias cannot be excluded. Because of the scarcity of evidence, we could not assess if RoB affected the NMA results due to either the lack of blinding or to industry sponsorship that potentially might influence some findings. Different definitions of safety outcomes and their reporting at different follow-ups may have affected the validity of drug comparisons. A further limitation is that in our NMA we used Cox regression model-based HR estimates that were stratified according to characteristics specified for randomisations, the use of which was not entirely consistent across the analysed studies.
In general, the differences in potential effect modifiers across studies were not substantial enough to violate the transitivity assumption.
The applicability of this review's results may be limited owing to a changing landscape for first-line treatment, because immunotherapies are becoming standard treatments in this setting. This is particularly the case for PEMBRO, which demonstrated improved survival outcomes compared to platinum-based chemotherapy in people with PD-L1 expression ≥50%. [60] Should PEMBRO become standard care at first line, one can assume that people with PD-L1 expression ≥50% receiving PEMBRO at first line and progressing will not receive subsequent lines of other immunotherapies. Therefore, our findings may not be applicable for people with PD-L1 expression ≥50% (around 30% of NSCLC [60]).

Conclusions
In this review, we advanced the existing knowledge by comparing drugs approved in people with non-specific late-stage NSCLC. Our results indicate that the use of immunotherapies in people diagnosed with non-specific late stage NSCLC should be promoted. Amongst our included studies, more than 3,500 patients received licensed dosing of DOC, which proved relatively unsuccessful on both survival and safety. The use of DOC may now be judged irrelevant as a comparator intervention for approval of new drugs for second line treatment of NSCLC.