First-line targeted therapies of advanced hepatocellular carcinoma: A Bayesian network analysis of randomized controlled trials

Purpose A variety of targeted drugs have been developed and proved effective and safe in clinical trials. Our study aims to compare the efficacy and safety of different targeted drugs as first-line treatment of advanced hepatocellular carcinoma (HCC) using a Bayesian network meta-analysis approach. Methods PubMed, Embase, and the Cochrane Library were searched for randomized controlled trials (RCTs) of patients with advanced HCC treated with different targeted drugs. Time to progression (TTP), overall survival (OS) and progression-free survival (PFS) were expressed as hazard ratios (HRs). Objective response rate (ORR) and the proportion of Grade 3–5 adverse events (G3-5AE) were expressed as odds ratios (ORs). We pooled study-specific HRs and ORs using Bayesian network meta-analyses, and ranked first-line drugs by the surface under the cumulative ranking curve (SUCRA). Results A total of 22 RCTs with 9288 patients were included. Brivanib, linifanib, lenvatinib and sorafenib significantly improved TTP compared to placebo (HR range, 0.45–0.72). Sunitinib (HR = 1.99) and nintedanib (HR = 2.17) showed significantly worse TTP than lenvatinib. Vandetanib (HR = 0.44) and sorafenib (HR = 0.73) significantly improved OS compared to placebo. There was no significant difference in PFS, ORR or G3-5AE across the different drugs. According to cluster rank analysis, vandetanib was both more effective (OS) and safer (G3-5AE) than sorafenib, followed by nintedanib. Conclusions This network meta-analysis suggests that vandetanib, linifanib, lenvatinib and nintedanib may be the best substitutes for sorafenib as first-line targeted drugs against advanced HCC. Vandetanib seems to be the best choice, although the quality of evidence is low. For better survival, novel targeted treatment options for HCC are sorely needed.


Search strategy and study selection
Two researchers (W.D. & Y.T.) systematically searched PubMed, Embase and the Cochrane Library from inception to June 30, 2019, using a well-developed search strategy without language restriction (S2 Table). The reference lists of relevant articles were also searched. Unpublished literature and conference abstracts were not included.
Two reviewers (W.X. & Y.W.) independently screened candidate articles by scanning all titles, abstracts and full texts. Disagreements about candidate articles were resolved through consensus, with a third reviewer (W.D.) making the final decision.

Data extraction
Two reviewers (W.D. & Y.T.) independently extracted relevant data, including study author, publication date, region, sample size, patient characteristics (age, gender, Eastern Cooperative Oncology Group [ECOG] score, Barcelona Clinic Liver Cancer [BCLC] stage), mode, dose and duration of treatments, and outcomes of interest. Disagreements were resolved via discussion, with a third reviewer (X.X.) making the final decision.

Quality assessment
The quality and risk of bias of the RCTs in this study were assessed using the Cochrane Collaboration's tool (S1 Table) [10]. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group approach was used to assess the quality of evidence (QoE) for each of the direct, indirect, and NMA estimates [11,12]. For direct comparisons, we graded evidence on five aspects (risk of bias, inconsistency, indirectness, imprecision and publication bias) using the standard GRADE approach. For indirect comparisons, we rated evidence according to the lower grade of the contributing direct comparisons and intransitivity. For NMA estimates, we rated evidence according to the higher grade of the direct and indirect comparisons and incoherence.

Data synthesis and analysis
Results for OS, PFS and TTP are expressed as hazard ratios (HRs) with 95% confidence intervals (CIs). Results for ORR and G3-5AE are expressed as odds ratios (ORs) with 95% CIs. If HRs could not be obtained directly, they were extracted from Kaplan-Meier curves using the method described by Parmar et al [13]. If an article reported different HRs or ORs based on different evaluation criteria, we selected the result based on the latest criteria. We performed direct pairwise meta-analyses of head-to-head comparisons with RevMan version 5.3.0 (Cochrane Collaboration). Heterogeneity among studies was evaluated with Cochran's Q test and Higgins' I² statistic. Heterogeneity was considered significant when I² > 50% and/or P < 0.05, in which case a random-effects model (DerSimonian-Laird method) was used; otherwise, a fixed-effect model (Mantel-Haenszel method) was used.
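The heterogeneity rule above can be illustrated with a minimal Python sketch. It is not the RevMan implementation: it uses generic inverse-variance weights on the log-HR scale, and the function names (se_from_ci, pool_hazard_ratios) are hypothetical. The example inputs are the two sorafenib vs placebo OS estimates quoted in the Discussion (HR 0.69, 95% CI 0.55-0.87, and HR 0.68, 95% CI 0.50-0.93).

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Standard error of a log HR recovered from its reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def pool_hazard_ratios(hrs, ses):
    """Pool HRs on the log scale; return Q, I^2 (%), tau^2 and the pooled HR.

    ses are standard errors of the log HRs. tau^2 is the DerSimonian-Laird
    moment estimate; when it is 0 the result equals the fixed-effect pool.
    """
    logs = [math.log(hr) for hr in hrs]
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, logs)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logs))  # Cochran's Q
    df = len(hrs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, logs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "q": q,
        "i2": i2,
        "tau2": tau2,
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - Z_95 * se_pooled),
               math.exp(pooled + Z_95 * se_pooled)),
    }


# Two sorafenib vs placebo OS estimates quoted in the Discussion.
result = pool_hazard_ratios(
    [0.69, 0.68],
    [se_from_ci(0.55, 0.87), se_from_ci(0.50, 0.93)],
)
print(result)
```

Because the two estimates nearly coincide, Q falls below its degrees of freedom, so I² and the between-study variance are both truncated to zero and the random-effects result reduces to the fixed-effect one.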
We performed the Bayesian network meta-analysis with the package 'rjags' version 4-9 and the package 'GeMTC' version 0.8-2 in R version 3.6.1 (https://www.r-project.org). The pooled HRs and/or ORs of relative treatment effects are reported as the median and accompanying 95% credible interval (95% CrI) of the posterior distribution. We drew network diagrams with Stata/MP version 14.0 (4905 Lakeway Drive, College Station, TX 77845, USA). Hierarchical Bayesian modeling in the present network meta-analysis conformed to the National Institute for Health and Care Excellence Decision Support Unit (NICE DSU) guidelines [14]. To confirm the transitivity and loop-specific consistency assumptions, pairwise direct and indirect effect estimates in closed loops of evidence were inspected for any disagreement [15]. Transitivity was assessed by examining patient baseline characteristics across studies (age, gender, performance status and tumor stage), treatment stage and treatment protocol [16]. The global test of the inconsistency assumption was conducted by comparing the consistency and inconsistency (unrelated mean effects) models. Consistency between direct and indirect comparisons was assessed using a node-splitting test within each network containing a loop [17]. Heterogeneity in the network meta-analysis was evaluated with the posterior median of the between-trials standard deviation (σ) [14], while comparison-adjusted funnel plots were used to detect small-study effects or publication bias.
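To make the idea of an indirect estimate concrete, the following Python sketch applies a simple adjusted indirect comparison through a common comparator. This is an illustrative frequentist shortcut, not the hierarchical Bayesian model fitted with GeMTC, and the function name and input values are hypothetical rather than taken from the included trials.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def indirect_hr(hr_a_vs_b, se_a_vs_b, hr_b_vs_c, se_b_vs_c):
    """Indirect HR of A vs C through the common comparator B.

    Under the consistency assumption, log HR(A vs C) is the sum of
    log HR(A vs B) and log HR(B vs C), and the variances add. The standard
    errors are those of the log HRs.
    """
    log_hr = math.log(hr_a_vs_b) + math.log(hr_b_vs_c)
    se = math.sqrt(se_a_vs_b ** 2 + se_b_vs_c ** 2)
    return (
        math.exp(log_hr),
        math.exp(log_hr - Z_95 * se),
        math.exp(log_hr + Z_95 * se),
    )


# Hypothetical inputs: drug A vs sorafenib, and sorafenib vs placebo.
hr, lower, upper = indirect_hr(0.80, 0.10, 0.70, 0.10)
print(f"A vs placebo: HR {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The indirect interval is always wider than either contributing direct interval because the two variances add. A node-splitting test asks whether an estimate of this kind disagrees with the direct evidence for the same comparison.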
We used Markov chain Monte Carlo (MCMC) simulation for Bayesian inference to calculate the posterior distributions of the interrogated nodes within the framework of the chosen models and likelihood function on the basis of prior assumptions. We fitted the model with four chains, each started from a different set of initial values, yielding 400,000 iterations (100,000 per chain) to obtain the posterior distributions of model parameters, with 50,000 burn-in iterations and a thinning interval of 10 for each chain. The autocorrelation function was used to assess the convergence of iterations. Global model fit and parsimony were compared between the fitted models to decide on the most appropriate model. The posterior mean of the total residual deviance and the deviance information criterion (DIC) were used to choose the more appropriate model [18,19]; the model with the lower DIC was preferred. The threshold for statistical significance was a two-tailed alpha of 0.05.
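For reference, the DIC used for this model comparison is conventionally defined as follows. This is the standard textbook definition, stated here as background rather than quoted from the study's analysis code; the symbols D, θ and p_D are the usual notation and do not appear elsewhere in this article.

```latex
\bar{D} = \mathbb{E}_{\theta \mid y}\left[ D(\theta) \right], \qquad
p_D = \bar{D} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \bar{D} + p_D
```

Here D(θ) is the deviance, \bar{D} its posterior mean (a measure of fit), and p_D the effective number of parameters (a penalty for complexity), so a lower DIC indicates a better trade-off between fit and parsimony.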
To determine the ranking of interventions for each outcome, rank probabilities were extracted from the network meta-analysis. By combining the rank probabilities of each drug, we generated the surface under the cumulative ranking curve (SUCRA), which condenses the ranking information into a single number per treatment [20]. SUCRA ranges from 0 to 1: it is 1 when a treatment is certain to be the best and 0 when a treatment is certain to be the worst. To compare the efficacy and safety of each drug simultaneously, we jointly presented the SUCRA values for OS and G3-5AE on a clustered ranking plot.
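The SUCRA calculation can be sketched in a few lines of Python. This follows the usual definition (the mean of the cumulative rank probabilities over the first a - 1 ranks, where a is the number of treatments); the function name and the example probabilities are hypothetical and are not taken from the study's results.

```python
def sucra(rank_probs):
    """SUCRA for one treatment.

    rank_probs[j] is the posterior probability that the treatment takes
    rank j + 1 (rank 1 = best); the probabilities must sum to 1.
    """
    n_treatments = len(rank_probs)
    if n_treatments < 2:
        raise ValueError("SUCRA needs at least two competing treatments")
    if abs(sum(rank_probs) - 1.0) > 1e-6:
        raise ValueError("rank probabilities must sum to 1")
    cumulative = 0.0
    area = 0.0
    # Sum the cumulative probabilities for ranks 1 .. a-1; the last
    # cumulative value is always 1 and carries no information.
    for p in rank_probs[:-1]:
        cumulative += p
        area += cumulative
    return area / (n_treatments - 1)


# Hypothetical rank probabilities in a three-treatment network.
print(sucra([1.0, 0.0, 0.0]))  # certain to be the best
print(sucra([0.0, 0.0, 1.0]))  # certain to be the worst
print(sucra([0.5, 0.3, 0.2]))  # mostly ranked first or second
```

Plotting the SUCRA for OS against the SUCRA for G3-5AE for each intervention gives the coordinates used in a clustered ranking plot.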

Results
A total of 2,808 articles were retrieved from the databases mentioned above. After removing all duplicate articles and checking all titles and abstracts, 26 studies remained. After further full-text screening, four studies were excluded (one study [21] lacked a control group, and three studies [22][23][24] were sub-studies of previous trials). Finally, a total of 22 RCTs including 9288 patients from all over the world were included in this network meta-analysis (Fig 1) [6,.

Study characteristics
The main characteristics of the included studies are summarized in Table 1. The median age in the 22 RCTs ranged from 51 to 70 years, and the majority of participants were male. The sample size ranged from 67 to 1155 patients. The majority of ECOG scores were 0-1. The majority of BCLC stages were B-C. The included RCTs compared thirteen different drugs (bevacizumab, brivanib, dovitinib, erlotinib, everolimus, lenvatinib, linifanib, nintedanib, orantinib, sorafenib, sunitinib, tigatuzumab, vandetanib), which were only compared to sorafenib or placebo. The targeted drug treatment programs and their abbreviations are shown in S4 file. As shown in S1 Table, twelve studies [25-29, 32, 34, 35, 37, 38, 41, 42] were considered at high risk of bias for blinding of participants and personnel because of their open-label design. There was no evidence of substantial imbalance in the distribution of the effect modifiers across trials in the network. A connected network diagram formed by all the evidence is provided in Fig 2. The dosage regimens of the same drugs were consistent across studies. Based on examination of the patient baseline characteristics, treatment stage and protocol, there was no evidence that the transitivity assumption was violated in any of the networks.

PLOS ONE
First-line targeted therapies of advanced HCC:A network meta-analysis of RCTs

The NMA synthesis showed that four drugs (brivanib, lenvatinib, linifanib and sorafenib) achieved a significant benefit on TTP over placebo (HR range, 0.45-0.72). According to SUCRA, the three highest-ranking drugs were lenvatinib (0.94), linifanib (0.84) and brivanib (0.67), which are shaded in red in Table 2.

The NMA synthesis showed that there was no significant difference in PFS among drugs. According to SUCRA, the three highest-ranking drugs were lenvatinib (0.77), vandetanib (0.77) and orantinib (0.68), which are shaded in red in Table 3.

Cluster rank analysis
According to the meta-analysis performed above, ten interventions (Bri, Dov, Erl + Sor, Eve + Sor, Lin, Nin, Pla, Sor, Van 100mg and Van 300mg) could be compared with each other on both OS and G3-5AE. According to cluster rank analysis, Van 100mg was both more effective (OS) and safer (G3-5AE) than Sor, followed by Nin (Fig 3).

Consistency, heterogeneity and quality of evidence
Inconsistency in the evidence networks was assessed by evaluating the agreement between the consistency and inconsistency (unrelated mean effects) models (S3 Table). The results of comparisons in the consistency and inconsistency models were roughly consistent, indicating a robust and homogeneous network of evidence. Additionally, the node-splitting approach showed good consistency between the direct and indirect comparisons (S3, S8, S11 and S14 Figs). Although a fixed-effect model would provide similar numerical results with shorter credible intervals, the random-effects model was more appropriate according to the residual deviance and DIC criteria (S2 Table). Visual inspection of the funnel plots showed no obvious asymmetry suggestive of publication bias (S16 Fig). The GRADE ratings of the direct, indirect, and NMA estimates for OS and G3-5AE are shown in S4 and S5 Tables. The quality of most of the evidence was low.

Values shaded in red are the three highest SUCRAs. Text shaded in yellow denotes targeted drugs.


Discussion
The SHARP trial was the first study to demonstrate the efficacy of targeted therapy for patients with unresectable HCC (HR = 0.69, 95% CI 0.55-0.87, for sorafenib vs placebo on OS) [6]. Subsequently, an Asia-Pacific study confirmed this conclusion (HR = 0.68, 95% CI 0.50-0.93) [45]. Based on the results of these two trials, sorafenib, a multi-targeted TKI, was approved by regulatory authorities around the world and became the standard systemic treatment for patients with advanced unresectable HCC [46]. However, the survival advantage and the improvements in symptoms or quality of life in these two trials were modest. Several clinical trials followed in search of more effective targeted drugs. Disappointingly, most of the results were negative. Several targeted drugs were compared with sorafenib directly in this review [25-29, 34-38, 40, 42]. For TTP, only Len (HR = 0.63, 95% CI 0.54-0.74) and Lin (HR = 0.76, 95% CrI 0.64-0.91) performed better than sorafenib, while the other comparisons showed no statistically significant difference. For PFS, Len (HR = 0.66, 95% CrI 0.56-0.77) and Lin (HR = 0.81, 95% CrI 0.69-0.95) again performed better than sorafenib, while the other comparisons showed no statistically significant difference. For OS, no targeted drug was superior to sorafenib, and Sun performed significantly worse than sorafenib. These direct comparison results are disappointing. Encouragingly, an RCT showed that Van 100mg was superior to placebo in improving OS, although it did not indicate that Van 100mg was better than sorafenib.
To compare the different targeted drugs with each other, we performed this Bayesian network analysis. In this meta-analysis, brivanib, lenvatinib and linifanib were superior to placebo in improving TTP. However, they were not superior to placebo in terms of either PFS or OS. Sorafenib was superior in improving both TTP and OS, while Van 100mg was also superior in improving OS. Although Tig 6mg + Sor, Van 300mg and Van 100mg were the three highest-ranking interventions, they were not superior to sorafenib in terms of OS. For ORR and G3-5AE, there was no significant difference across the targeted drugs. In general, sorafenib appeared to remain superior in the present analysis.
There are several potential reasons why HCC trials have failed to meet the primary endpoint of prolonging OS. First, the inclusion criteria of clinical trials are mainly based on Child-Pugh scores and BCLC stages. However, this screening method cannot eliminate the histologic heterogeneity of HCC. Therefore, several biomarkers (e.g., c-MET, RAS and FGF19) have recently been used as a basis for screening [47,48]. The lack of predictive biomarkers was also one of the reasons for the failure. Second, analysis of the targets of the included drugs shows that most were anti-angiogenic multikinase inhibitors sharing some common pathways [49]. In these trials, only marginal differences relative to sorafenib would be expected. To avoid similar targets, several trials tested a new drug in combination with sorafenib vs sorafenib alone, for instance, erlotinib targeting the epidermal growth factor receptor, and everolimus targeting the mammalian target of rapamycin. However, none of these combinations was superior to sorafenib in improving OS. Therefore, there must be other reasons for failure in HCC trials. Third, the endpoint OS is affected by advanced cirrhosis, since advanced HCC is often accompanied by severe cirrhosis. The differences in curative effect among targeted drugs may not be enough to cause major improvements in survival. To some extent, TTP may be more suitable as an endpoint in advanced hepatocellular carcinoma treated with molecular targeted therapy [50]. Fourth, liver cirrhosis is frequently associated with hepatic insufficiency. Because the synthetic and metabolic functions of the liver are impaired, the expected drug effect may not be achieved. Meanwhile, the side effects of drugs often lead to treatment interruption.
According to the cluster rank analysis, Van 100mg, Van 300mg and nintedanib were more effective and safer than sorafenib, although the advantages were not statistically significant. Although vandetanib has limited clinical activity and its further development as first-line therapy for advanced HCC was not warranted [43], research on vandetanib in HCC has continued. Vandetanib-eluting radiopaque beads for locoregional treatment of HCC were under development [51]. A recent study suggested that nintedanib might have efficacy similar to that of sorafenib in patients with advanced HCC, with a manageable safety profile [25].
To our knowledge, this is the first network meta-analysis of all RCTs to evaluate the efficacy and safety of targeted drugs for the treatment of HCC patients. Several limitations should be taken into consideration. First, the distributions of BCLC stages were not identical across studies. Patients with stage B or C disease often have a worse prognosis than those with stage A disease, so the BCLC criteria for enrollment could have an impact on overall survival. Fortunately, the vast majority of patients included in this analysis were in stage B or C. Second, cirrhosis is also an important factor correlated with survival. Third, some HRs [26] were calculated from data extracted from the survival curves when they could not be obtained directly from the original article. Fourth, Response Evaluation Criteria in Solid Tumors (RECIST) v1.0, RECIST v1.1 and Modified RECIST (mRECIST) were all used in the included studies, as were both Version 3.0 and Version 4.0 of the National Cancer Institute Common Terminology Criteria for Adverse Events.
Our study also has several strengths. First, we performed a comprehensive literature search to provide as detailed a summary as possible of targeted therapies for HCC. Second, in contrast to previous meta-analyses, the included studies were all RCTs, which supports the reliability of the evidence. Third, we performed a cluster rank analysis considering both efficacy and safety in order to support clinical decision-making.

Conclusion
Taken together, our network meta-analysis suggests that vandetanib, linifanib, lenvatinib and nintedanib may be the best substitutes for sorafenib against advanced HCC. For OS, Van 100mg and Van 300mg seem to be the best options, with low and moderate quality of evidence, respectively. For G3-5AE, Van 100mg and Van 300mg also seem to be the best interventions, with low or very low quality of evidence in all cases. Further studies are necessary to explore the curative effect in particular subgroups of HCC patients, especially subgroups defined by BCLC stage, Child-Pugh score and hepatitis B infection status. For better survival, novel targeted treatment options for HCC are sorely needed.