Limited Generalizability of Registration Trials in Hepatitis C: A Nationwide Cohort Study

Background
Approval of drugs in chronic hepatitis C is supported by registration trials. These trials might have limited generalizability through use of strict eligibility criteria. We compared effectiveness and safety of treatment in real world hepatitis C patients eligible and ineligible for registration trials.

Methods
We performed a nationwide, multicenter, retrospective cohort study of chronic hepatitis C patients treated in the real world. We applied a combined set of inclusion and exclusion criteria of registration trials to our cohort to determine eligibility. We compared effectiveness and safety in eligible vs. ineligible patients, and performed sensitivity analyses with strict criteria. Further, we used log binomial regression to assess relative risks of criteria on outcomes.

Results
In this cohort (n = 467), 47% of patients would have been ineligible for registration trials. Main exclusion criteria were related to hepatic decompensation and co-morbidity (cardiac disease, anemia, malignancy and neutropenia), and were associated with an increased risk for serious adverse events (RR 1.45–2.31). Ineligible patients developed significantly more serious adverse events than eligible patients (27% vs. 11%, p < 0.001). Effectiveness was decreased if strict criteria were used.

Conclusions
Nearly half of real world hepatitis C patients would have been excluded from registration trials, and these patients are at increased risk of developing serious adverse events. Hepatic decompensation and co-morbidity were important exclusion criteria, and were related to toxicity. Therefore, new drugs should also be studied in these patients, to genuinely assess benefits and risks of therapy in the real world population.


Introduction
Regulatory approval of drugs and the development of guidelines are supported by evidence generated by registration trials. These trials aim for high internal validity through use of strict eligibility criteria, although this may jeopardize generalizability. [1,2] Some studies suggest that many real world patients would be excluded from registration trials and that drugs tested through these trials are less effective or less well tolerated in these patients. [3][4][5] The treatment arsenal for patients with chronic hepatitis C (CHC) has increased enormously with the introduction of Direct Acting Antivirals (DAAs). DAAs were approved by regulatory authorities for use in clinical practice, with evidence coming from registration trials having strict criteria. [6] Indeed, real world cohorts contain large numbers of treated CHC patients who would be excluded from registration trials. [7][8][9][10] A lack of generalizability is only an issue when ineligible patients have worse outcomes, and whether this is the case in CHC is not known. We hypothesize that CHC patients who are ineligible for trials but are treated in clinical practice have characteristics that are risk factors for treatment failure and toxicity.
Therefore, we aim to compare effectiveness and safety in real world CHC patients who are eligible or ineligible for registration trials. Our secondary aim is to identify criteria that impact trial eligibility and assess the risk of these criteria on outcomes.

Population and design
We conducted a nationwide, multicenter, retrospective real world cohort study of CHC patients in the Netherlands. We chose genotype 1 patients treated between 2011 and 2015 with telaprevir or boceprevir with peg-interferon and ribavirin as an example cohort. We identified CHC patients using up-to-date local databases. Treatment indication, choice of therapy, drug dosing and duration were at the discretion of the physician, following national guidelines. [11] Patients co-infected with HIV or hepatitis B virus (HBV) were excluded.
Formal evaluation was waived by the institutional review board (Committee on Research Involving Human Subjects Arnhem-Nijmegen) given the retrospective character of our study. However, approval in participating centers was obtained according to local regulations. The study was conducted in accordance with good clinical practice guidelines and the code of conduct for medical research (www.federa.org). We obtained oral informed consent or collected data anonymously in accordance with the code of conduct for medical research. No identifying patient data were collected, and all patient data were entered anonymously in the database.

Identification of registration trials and general set of eligibility criteria
We identified registration trials of telaprevir and boceprevir in CHC patients through a systematic search (S1 Table). We extracted eligibility criteria from published protocols, and used the least stringent criteria of all studies to develop a general criteria set (Table 1). We applied the general set to our real world population to determine eligibility. If variables were missing, we assumed the patient would be eligible for that criterion.
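The eligibility rule described above, including the handling of missing variables, can be sketched in a few lines. This is a minimal illustration only: the variable names and thresholds below are hypothetical placeholders and are not the criteria from Table 1. The one point taken from the text is that a missing value never makes a patient ineligible.

```python
from typing import Callable, Dict, List, Optional

# A patient record maps a variable name to a value, or to None when missing.
Patient = Dict[str, Optional[float]]

# Hypothetical exclusion rules: each returns True when the value violates
# the criterion. Names and cut-offs are illustrative assumptions and are
# NOT the values from Table 1.
EXCLUSION_RULES: Dict[str, Callable[[float], bool]] = {
    "hemoglobin_g_dl": lambda v: v < 12.0,
    "neutrophils_10e9_l": lambda v: v < 1.5,
    "decompensated_liver_disease": lambda v: v == 1,
}


def violated_criteria(patient: Patient) -> List[str]:
    """Return the names of all exclusion criteria the patient violates."""
    violated = []
    for variable, is_violation in EXCLUSION_RULES.items():
        value = patient.get(variable)
        if value is None:
            # Missing variable: assume eligible for this criterion.
            continue
        if is_violation(value):
            violated.append(variable)
    return violated


def is_eligible(patient: Patient) -> bool:
    """A patient is eligible only if no exclusion criterion is violated."""
    return not violated_criteria(patient)
```

Returning the list of violated criteria, rather than only a yes/no flag, also supports the frequency counts used to rank the most important criteria.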

Data acquisition and definitions
We extracted demographics, CHC characteristics, and laboratory values from the patients' medical records on a pre-designed case report form. Baseline variables were collected at the start of treatment, not exceeding one year prior to treatment. Baseline concomitant medication was collected prior to possible medication switch for expected interactions. Data were collected until 24 weeks after cessation of treatment. We recorded whether patients had a history of or current decompensated liver disease, defined as a history or signs of ascites, variceal bleed or hepatic encephalopathy. Effectiveness was defined as sustained virological response (SVR): undetectable hepatitis C virus RNA 12 or 24 weeks after cessation of treatment. Safety data included adverse events (AEs) and serious adverse events (SAEs). AEs were defined as any event that required 1) dose reduction of peg-interferon or ribavirin, 2) prescription of medication or 3) referral. We used the FDA definition for SAEs. [12] We categorized AEs and SAEs by common terminology criteria for adverse events (CTCAE version 4.0). [13] We recorded data anonymously in an Access database (Microsoft Access 2007).

Outcomes and analysis
The primary outcomes were SVR and (S)AE rates, which were compared between patients eligible and ineligible for registration trials. Furthermore, we identified criteria that affected eligibility and were associated with the outcomes. Analyses were performed on an intention-to-treat population, where telaprevir and boceprevir treated patients were pooled. To check validity of pooling, we compared baseline characteristics and treatment outcomes between telaprevir and boceprevir patients. [14] SVR rates and (S)AE rates were compared with the χ² test (or Fisher's exact test if counts were <5), and the median number of (S)AEs with the Mann-Whitney U test. For analyses on SVR, we separated patients into two groups based on expected similar effectiveness: 1) treatment-naive and relapse patients, and 2) patients with a prior non-response, viral breakthrough or early discontinuation [15]; for safety outcomes this distinction was not made. We used frequency counts to identify the most important eligibility criteria. To study the association of criteria and outcomes, we performed log binomial regression (relative risk) or Poisson regression. [16] To explore the validity of our generated set of the least stringent criteria from the protocols, we performed three sensitivity analyses: a) with the most stringent criteria (S2 Table), b) with strictest exclusion of co-morbidity, and c) with the most important factor for exclusion eliminated from the criteria set. All analyses were two-sided with a significance level of p < 0.05, and performed in SPSS (IBM SPSS Statistics 20).
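The group comparisons described above can be illustrated for a single 2x2 table (criterion present or absent vs. outcome yes or no). The sketch below uses only the Python standard library and is not the SPSS procedure used in the study. Two assumptions are made here: "counts <5" is read as any observed cell count below 5, and the relative risk is computed as a crude risk ratio with a Wald interval on the log scale. For a single binary criterion, a log binomial model reproduces this crude risk ratio, so no model is fitted.

```python
import math
from typing import Tuple

# Table layout (all counts are integers):
#                 outcome yes   outcome no
# criterion yes        a             b
# criterion no         c             d


def chi_square_p(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square test (1 df, no continuity correction), two-sided p."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def fisher_exact_p(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p: sum of all tables as or less likely than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x events in the first row.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


def two_by_two_p(a: int, b: int, c: int, d: int) -> float:
    """Use Fisher's exact test when any observed cell is below 5 (assumed rule)."""
    if min(a, b, c, d) < 5:
        return fisher_exact_p(a, b, c, d)
    return chi_square_p(a, b, c, d)


def risk_ratio(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Crude risk ratio with a Wald confidence interval on the log scale.

    Requires a > 0 and c > 0.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr, rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)
```

With several criteria in one model, the adjusted relative risks would differ from these crude ratios and would require an actual regression fit.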

Population
We identified 489 treated patients from 45 centers, and we excluded 22 patients (Fig 1). Centers treated a median of 8 patients (range 1-53). Overall, the majority of patients (60%) were treatment naive, 52% had advanced fibrosis or cirrhosis and 5% had a history of decompensated liver disease. Baseline characteristics are shown in Table 2. We pooled telaprevir (n = 265) and boceprevir (n = 202) data, as there were no significant differences in characteristics and treatment outcomes between patients (S3 and S4 Tables).

Registration trials and outcomes eligible vs. ineligible
Our search yielded eight trials of telaprevir and boceprevir [17][18][19][20][21][22][23][24], and five registration trials were included (S1 Table). [22][23][24] On the basis of the general criteria (Table 1), 47% of patients treated in real world practice would be excluded from registration trials. We then compared the eligible to the ineligible population with respect to safety parameters. We found that ineligible patients had significantly more SAEs compared to eligible patients (27% vs. 11%, p < 0.001) (Fig 2). A total of 37 SAEs occurred in 28 eligible patients (1 patient died due to an accident), compared to 103 SAEs which occurred in 60 ineligible patients (7 patients died) (S5 Table). Also, after excluding patients with a history of decompensated liver disease (n = 24) from the analysis, ineligible patients had significantly higher SAE rates (24% vs. 11%, p < 0.001). Further, ineligible patients had a higher median number of AEs and SAEs (p = 0.039 and p < 0.001 respectively, S6 Table). The incidence of some typical hepatic or therapy related (S)AEs (anemia, thrombopenia and hepatobiliary events) was significantly higher in the ineligible patients (Fig 3).
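As a rough consistency check on these proportions, the reported numbers of patients with at least one SAE can be related to approximate group sizes. The group sizes are not stated in this section, so the calculation below assumes that 47% ineligible corresponds to about 219 of the 467 patients, leaving 248 eligible patients.

```latex
\begin{align*}
\text{ineligible: } & \frac{60}{219} \approx 0.27, \\
\text{eligible: }   & \frac{28}{467 - 219} = \frac{28}{248} \approx 0.11, \\
\text{crude risk ratio: } & \frac{60/219}{28/248} \approx 2.4 .
\end{align*}
```

Under that assumption the reported rates of 27% and 11% are reproduced. The crude ratio is descriptive only and is not one of the criterion-specific relative risks from the regression.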
We found lower SVR rates in ineligible patients, but the difference was not statistically significant. Two sensitivity analyses detected lower SVR rates in ineligible patients (treatment naive-relapse group): when applying the most stringent criteria (81% vs. 67%, p = 0.01) or when applying the most stringent exclusion of patients with co-morbidity (76% vs. 65%, p = 0.02). We observed no difference in SVR in the third sensitivity analysis, where we excluded concomitant medication from the criteria set (Fig 4). No significant differences in effectiveness were found in the non-responder group (S1 Fig).

Discussion
This study casts doubt on the generalizability of registration trials to the real world CHC population. One of the key findings of our study is that nearly half of treated CHC patients would be ineligible for registration trials. The most important exclusion criteria relate to signs or history of hepatic decompensation and co-morbidity (cardiac disease, anemia, malignancy and neutropenia). Patients meeting those exclusion criteria developed more SAEs (RR between 1.45 and 2.31) and were less likely to reach SVR (RR between 0.49 and 0.66), especially when strict criteria were used. Vice versa, eligible patients had SVR and SAE rates comparable to published trials. [17][18][19][20][21] Altogether, this indicates that results from registration trials are only generalizable to the real world patients who fulfill the eligibility criteria. Translating results originating from registration trials to patients who would be ineligible should be done with caution.
The difference between registration trials and the real world reflects a 'development paradox'. Drugs are developed through a phase II-III program that targets easy-to-treat patients, while in the real world difficult-to-treat patients are prioritized for treatment. [1,25,26] The sequence of drug development starting with easy-to-treat patients seems appropriate, but the final hurdle to perform trials that specifically target difficult-to-treat patients is often sidestepped or delayed until after market authorization. As a result, this population, which has a clear treatment indication, is exposed to DAAs in the real world without proper data on efficacy and toxicity. [27] This results in an increased proportion of adverse events and dropouts, and hence lower effectiveness. [28] Our results support the 'development paradox' and provide reasons why real world outcomes differ from registration trials.
Our data on limited generalizability of registration trials accord with the literature. An increased likelihood for SAEs in patients with a history of decompensated cirrhosis who would have been excluded from registration trials was reported in a large CHC cohort (n = 2084). [9] Some 30-47% of compensated cirrhotic patients treated with first-generation protease inhibitors would be ineligible for registration trials, and this study showed unexpectedly high SAE rates in that population. [7] In addition, a study on ledipasvir/sofosbuvir in advanced liver disease patients, published after FDA and EMA approval, reported much higher SAE rates (23%) compared to registration trials (3%). [29] For another CHC regimen, paritaprevir/ritonavir, ombitasvir and dasabuvir, the FDA label changed within one year following approval based on review of adverse events. This regimen is now contra-indicated in patients with Child-Pugh B cirrhosis. [30] It is likely that this could have been prevented if these patients had been studied in trials prior to approval of the regimen. There is literature that suggests that serious adverse events might be related to disease course instead of therapy. [31] Nonetheless, timely controlled studies in CHC patients with decompensated liver disease are necessary to accurately gauge the risk-benefit balance for these individual patients.
Here, we used patients treated with first-generation protease inhibitors as an example cohort. We believe that our results are also applicable to new generation DAAs, because eligibility criteria of registration trials are comparable to the set used in the current study (S7 Table). [31][32][33][34][35][36][37] Indeed, a Canadian HIV/HCV cohort found that up to 94% of patients from that cohort would be ineligible for registration trials with new generation DAAs. [10] Furthermore, a real world cohort showed that liver decompensation and SAEs during sofosbuvir containing regimens were associated with lower baseline albumin and higher total bilirubin, which are general exclusion criteria. [38] As toxicity of new generation DAAs decreases, the difference between trials and the real world might become smaller; however, with the high ineligibility rate of real world patients, generalization of results remains difficult.
Limited generalizability of registration trials is also seen in other liver diseases such as hepatocellular carcinoma (HCC) and HBV infection. For example, sorafenib was approved for HCC treatment on the basis of studies that excluded Child-Pugh B and C cirrhotic patients. [39,40] A real world cohort reported significantly decreased overall survival with sorafenib in Child-Pugh B compared to Child-Pugh A cirrhotics. [41] Likewise, post-marketing studies of entecavir for chronic HBV infection show lower proportions of ALT normalization than was shown in registration trials. [42]
Our study comes with strengths and limitations. Strengths of this study are the nationwide and multicenter character, resulting in a large and representative real world cohort. A limitation of this study is its retrospective character, which resulted in some missing values. We handled this conservatively, by classifying a patient with a missing value as eligible for that criterion. Furthermore, chart review may result in reporting bias, but we used strict definitions to reduce this. Another limitation is that patients received first-generation protease inhibitors, peg-interferon and ribavirin, which may increase the potential for toxicity. However, we think that our results are also valid for new generation DAAs.
In conclusion, nearly half of CHC patients treated in real world practice would be ineligible for registration trials. In these patients we found impaired safety and effectiveness related to specific eligibility criteria (hepatic decompensation, co-morbidity). Prior to regulatory approval, new drugs should also be studied in the difficult-to-treat population, including patients with hepatic decompensation and co-morbidity, to genuinely assess the benefits and risks of treatment in the real world population.

Table. Search strategy. This is the flowchart of the systematic search for registration trials with telaprevir and boceprevir. (DOCX)
S2 Table. Set of least and most stringent combined inclusion and exclusion criteria of registration trials. This table shows the least stringent and most stringent criteria of different registration trials per variable. The least stringent criteria set was used for primary analyses and the most stringent criteria set for a sensitivity analysis. (DOCX)
S3 Table. Baseline characteristics telaprevir and boceprevir treated patients.