A Novel Bayesian Seamless Phase I/II Design

This paper proposes a novel Bayesian phase I/II design that uses a hybrid mTPI method in phase I to target the MTD and an adaptive randomization scheme in phase II to assign patients to desirable doses. The mechanism of simultaneously escalating doses in phase I and expanding promising doses into phase II is inherited from a design proposed in the literature. Extensive simulation studies indicate that, compared with two competing designs, the proposed design substantially reduces the sample size and assigns more patients to the optimal dose.


Introduction
Although several designs have been proposed in recent years under the name of phase I/II, most of them target the maximum tolerated dose (MTD), that is, the highest dose that produces the desired effect without unacceptable toxicity. In the authors' view, they would more appropriately be called phase I/II dose-finding designs ([1], [2], [3], [4], [5], [6]). To our knowledge, only a few currently available designs can be called ''true'' seamless phase I/II designs, among them the XJT design proposed by Xie et al. ([7]) and the three-stage design proposed by Pan et al. ([8]). While both perform well, the three-stage design outperforms the XJT design. However, in its phase I stage the three-stage design does not use a more flexible adaptive design such as CRM or mTPI (both are model-based Bayesian adaptive phase I designs). In this paper, we equip the three-stage design with the mTPI design. We go one step further by adopting the hybrid mTPI in phase I, and we conduct extensive simulations to compare the performance of the designs. The major reason for selecting mTPI rather than CRM to improve the efficiency of phase I in the three-stage design is that mTPI has been shown to have statistical performance similar to that of CRM while being simpler to use ([9], [10], [11]). To our knowledge, no study in the literature has combined the mTPI design with an integrated phase I and phase II process.
We briefly introduce the paradigm of the three-stage design as follows. The three stages refer to phase I, phase IIa, and phase IIb, respectively. The design integrates the processes of dose escalation and dose expansion. Dose escalation is guided by the 3+3 approach, a classical design that has been regarded as the gold standard in phase I trials ([10]). Once the currently administered dose is escalated and a new dose is opened for toxicity study, the current dose is expanded to phase IIa (stage 2) for preliminary research. The efficacy information is updated by a beta-binomial model. Stage 2 involves two interim rules: a futility rule, which determines when the current dose should be dropped from the study, and a graduation rule, which determines whether the current dose should graduate to phase IIb (stage 3). In stage 3, an adaptive randomization procedure is implemented to assign patients to desirable dose levels. Readers are referred to [8] for details. This paper is organized as follows. Section 2 covers the mTPI design and its hybrid version. In Section 3, the general design structure is introduced in depth. Section 4 elaborates on the extensive simulation studies. Finally, Section 5 is devoted to discussion and conclusions.

Phase I mTPI Design and its Hybrid Version
We first introduce the mTPI design and then describe its hybrid version.

mTPI Design
The dose-finding rules for the mTPI method involve two major steps. In the first step, one introduces an equivalence interval (EI), which partitions the probability space (0,1) into three toxicity probability intervals corresponding to under-dosing, proper dosing, and over-dosing, respectively. Building upon the EI, the mTPI method computes the unit probability mass (UPM, defined as the ratio of the probability of an interval to the length of that interval) for each of the three toxicity probability intervals and sets up a decision-theoretic framework that guides dose-escalation decisions by a Bayes rule.
Specifically, define $w$ to be the target toxicity probability of the MTD (e.g., $w = 0.25$). The goal of phase I clinical trials is to find the highest dose with a toxicity probability closest to $w$. Let $p = (p_1, \ldots, p_D)'$ denote the toxicity probabilities for doses $j = 1, \ldots, D$, where $D$ is the total number of candidate doses in the trial. The observed data include the $n_j$ patients treated at dose $j$ and the corresponding $x_j$ patients experiencing toxicity. The likelihood function is a product of binomial densities,
$$l(p) \propto \prod_{j=1}^{D} p_j^{x_j} (1 - p_j)^{n_j - x_j}. \tag{1}$$
The mTPI design assumes independence among dose responses and proposes to use models with vague priors for $p_j$, so that the shape of the resulting posterior distributions is determined mainly by the shape of the likelihood based on the observed data. In this design, we set the priors of the $p_j$ as Beta(1,1), where the Beta$(a, b)$ density is proportional to $x^{a-1}(1-x)^{b-1}$. Combined with the likelihood in (1), the posteriors of the $p_j$ are independent Beta$(1 + x_j, 1 + n_j - x_j)$ distributions, for $j = 1, \ldots, D$. Evidently, when strong prior information on the toxicity of the candidate doses is available, informative beta priors can replace the vague priors.
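The conjugate update described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`beta_cdf_int`, `toxicity_posterior`, `prob_above_target`) and the example cohort are hypothetical, and the Beta CDF is evaluated with the binomial-sum identity, which is valid here only because the Beta(1,1) prior makes both posterior parameters integers.

```python
from math import comb


def beta_cdf_int(c, a, b):
    """CDF at c of Beta(a, b) for positive integers a and b.

    Uses the binomial-sum identity for the regularized incomplete beta function.
    """
    m = a + b - 1
    return sum(comb(m, i) * c**i * (1 - c) ** (m - i) for i in range(a, m + 1))


def toxicity_posterior(x, n):
    """Beta(1, 1) prior with x toxicities among n patients -> Beta(1 + x, 1 + n - x)."""
    return 1 + x, 1 + n - x


def prob_above_target(x, n, w):
    """Posterior probability that the toxicity rate of a dose exceeds the target w."""
    a, b = toxicity_posterior(x, n)
    return 1.0 - beta_cdf_int(w, a, b)


if __name__ == "__main__":
    # Hypothetical cohort: 1 toxicity among 3 patients, target w = 0.25.
    a, b = toxicity_posterior(x=1, n=3)
    print("posterior: Beta(%d, %d), mean %.3f" % (a, b, a / (a + b)))
    print("Pr(p_j > w | data) = %.4f" % prob_above_target(1, 3, 0.25))
```

With no data at a dose, the posterior reduces to the uniform prior, so the probability of exceeding a target of 0.25 is simply 0.75.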
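Assume dose $j$ is currently used to treat patients. To apply mTPI, one simply calculates the three UPMs for the under-, proper-, and over-dosing intervals, given by
$$\mathrm{UPM}(E, j) = \frac{P(p_j - w < -\epsilon_1 \mid \text{data})}{w - \epsilon_1} \quad \text{for under-dosing},$$
$$\mathrm{UPM}(S, j) = \frac{P(-\epsilon_1 \le p_j - w \le \epsilon_2 \mid \text{data})}{\epsilon_1 + \epsilon_2} \quad \text{for proper dosing},$$
$$\mathrm{UPM}(D, j) = \frac{P(p_j - w > \epsilon_2 \mid \text{data})}{1 - w - \epsilon_2} \quad \text{for over-dosing},$$
where $(w - \epsilon_1, w + \epsilon_2)$ is the equivalence interval and $E$, $S$, and $D$ denote the decisions to escalate, to stay at the current dose, and to de-escalate, respectively. A dose-assignment rule $B_j$ based on these three UPMs chooses the decision with the largest UPM, that is,
$$B_j = \arg\max_{m \in \{D, S, E\}} \mathrm{UPM}(m, j).$$
The mTPI design imposes an extra safety rule that prevents escalation to doses that have previously been used and found to be too toxic. Introducing a random variable $T_j = 1\{P(p_j > w \mid \text{data}) > \xi\}$, where $1\{\cdot\}$ is the indicator function and $\xi \in (0,1)$ is a cutoff value (e.g., $\xi = 0.95$), mTPI incorporates $T_{j+1}$ into the dose-assignment rule $B_j$. Let $\mathrm{UPM}(\tilde{E}, j) = \mathrm{UPM}(E, j)(1 - T_{j+1})$ and define the new dose-assignment rule with this toxicity exclusion to be
$$B_j^{e} = \arg\max_{m \in \{D, S, \tilde{E}\}} \mathrm{UPM}(m, j).$$
When $T_{j+1} = 1$, dose $j+1$ is considered highly toxic and the UPM associated with escalation equals 0, so escalation will never be chosen. Readers are referred to Ji et al. (2010) [11] for details.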
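As a rough illustration of the dose-assignment rule with the toxicity exclusion, the sketch below computes the three UPMs from the Beta posterior and returns the decision with the largest UPM. It is a hedged sketch, not the published mTPI software: the function names, the equivalence-interval half-widths of 0.05 used in the example, and the way data at the next higher dose are passed in (`next_x`, `next_n`) are assumptions made for illustration only.

```python
from math import comb


def beta_cdf_int(c, a, b):
    """CDF at c of Beta(a, b) for positive integers a and b (binomial-sum identity)."""
    m = a + b - 1
    return sum(comb(m, i) * c**i * (1 - c) ** (m - i) for i in range(a, m + 1))


def mtpi_decision(x, n, w, eps1, eps2, next_x=0, next_n=0, xi=0.95):
    """Return (decision, upm) for a dose with x toxicities among n patients.

    decision is "E" (escalate), "S" (stay) or "D" (de-escalate).
    next_x / next_n are the data at dose j + 1, used only for the safety rule.
    """
    a, b = 1 + x, 1 + n - x                 # Beta(1, 1) prior -> Beta posterior
    lo, hi = w - eps1, w + eps2             # equivalence interval
    f_lo, f_hi = beta_cdf_int(lo, a, b), beta_cdf_int(hi, a, b)
    upm = {
        "E": f_lo / lo,                     # under-dosing interval (0, w - eps1)
        "S": (f_hi - f_lo) / (hi - lo),     # proper-dosing interval
        "D": (1.0 - f_hi) / (1.0 - hi),     # over-dosing interval (w + eps2, 1)
    }
    # Safety rule: T_{j+1} = 1 if Pr(p_{j+1} > w | data) > xi, which zeroes UPM(E, j).
    p_next_toxic = 1.0 - beta_cdf_int(w, 1 + next_x, 1 + next_n - next_x)
    if p_next_toxic > xi:
        upm["E"] = 0.0
    return max(upm, key=upm.get), upm


if __name__ == "__main__":
    # Illustrative values only: w = 0.25 and eps1 = eps2 = 0.05 are assumptions.
    for x, n in [(0, 3), (1, 3), (3, 3)]:
        decision, _ = mtpi_decision(x, n, w=0.25, eps1=0.05, eps2=0.05)
        print(x, n, decision)
```

Passing `next_x=3, next_n=3` for the dose above shows the exclusion at work: escalation is suppressed even when the current dose looks safe.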

Hybrid mTPI Design
The hybrid version of the CRM design was proposed by Yuan & Yin ([12]), who demonstrated that the hybrid CRM design outperforms the 3+3 and CRM designs. We borrow their essential idea to construct a hybrid version of mTPI, which inherits the robustness of the Bayesian hybrid dose-finding method. Specifically, in phase I, if the data observed at the current dose are informative enough to tell whether this dose is below or above the MTD, we make the corresponding dose-assignment decision (i.e., escalate, de-escalate, or stay at the current dose) immediately, without invoking the adaptive phase I design. If the information observed at the current dose is insufficient to make a definite decision, we fall back on the mTPI design, so that information from all the doses under study can be used to guide dose assignment. The hybrid mTPI design is described in detail below.
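Suppose $y_j$ out of $n_j$ patients have experienced toxicity at dose level $j$. To evaluate the distance between the toxicity probability of dose level $j$ and the target toxicity probability $w$ of the MTD, the following hypotheses are introduced:
$$H_1: p_j < w - \delta, \qquad H_2: w - \delta \le p_j \le w + \delta, \qquad H_3: p_j > w + \delta,$$
where $p_j$ is the toxicity probability of dose level $j$, and $\delta$ is the tolerable margin prespecified by physicians. The hypotheses $H_1$, $H_2$, and $H_3$ represent the situations in which dose level $j$ is below, approximately equal to, and above the MTD, respectively. We set up $H_2$ as an interval hypothesis $w - \delta \le p_j \le w + \delta$ rather than a traditional point hypothesis $p_j = w$ because, in clinical practice, a dose can be chosen as the MTD as long as its toxicity probability is adequately close to $w$. For example, with $w = 0.25$ and $\delta = 0.02$, a dose with a toxicity probability within $(0.23, 0.27)$ would be accepted as the MTD. Given the data observed at dose level $j$, $(n_j, y_j)$, we quantify the evidence supporting each hypothesis by its posterior probability. We assign the toxicity probability $p_j$ a uniform prior distribution under each hypothesis, that is, a Beta(1,1) prior restricted to the corresponding interval:
$$\pi(p_j \mid H_1) = \mathrm{Unif}(0, w - \delta), \qquad \pi(p_j \mid H_2) = \mathrm{Unif}(w - \delta, w + \delta), \qquad \pi(p_j \mid H_3) = \mathrm{Unif}(w + \delta, 1).$$
It then follows that the marginal distribution of $y_j$ under $H_1$ is given by
$$p(y_j \mid H_1) = \frac{F_{\mathrm{beta}}(w - \delta;\, y_j + 1,\, n_j - y_j + 1)}{(w - \delta)(n_j + 1)},$$
where $F_{\mathrm{beta}}(c; a, b)$ is the cumulative distribution function of a beta distribution with shape parameters $a$ and $b$, evaluated at $c$. Similarly, the marginal distributions of $y_j$ under $H_2$ and $H_3$ are given by
$$p(y_j \mid H_2) = \frac{F_{\mathrm{beta}}(w + \delta;\, y_j + 1,\, n_j - y_j + 1) - F_{\mathrm{beta}}(w - \delta;\, y_j + 1,\, n_j - y_j + 1)}{2\delta (n_j + 1)},$$
$$p(y_j \mid H_3) = \frac{1 - F_{\mathrm{beta}}(w + \delta;\, y_j + 1,\, n_j - y_j + 1)}{(1 - w - \delta)(n_j + 1)},$$
respectively. Therefore, at dose level $j$, the posterior probability of $H_k$ ($k = 1, 2, 3$) is given by
$$p(H_k \mid y_j) = \frac{p(H_k)\, p(y_j \mid H_k)}{\sum_{i=1}^{3} p(H_i)\, p(y_j \mid H_i)},$$
which is equivalent to
$$p(H_k \mid y_j) = \left[ \sum_{i=1}^{3} \frac{p(H_i)}{p(H_k)}\, \mathrm{BF}_{ik} \right]^{-1}, \tag{5}$$
where $\mathrm{BF}_{ik} = p(y_j \mid H_i) / p(y_j \mid H_k)$ ($i, k = 1, 2, 3$) is the Bayes factor of $H_i$ against $H_k$.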
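The posterior probabilities of the three hypotheses can be computed directly from the beta CDF, as in the sketch below. This is an illustrative sketch under stated assumptions: the function names are hypothetical, equal prior probabilities of 1/3 are used as the default, and the CDF helper relies on the posterior parameters being integers.

```python
from math import comb


def beta_cdf_int(c, a, b):
    """CDF at c of Beta(a, b) for positive integers a and b (binomial-sum identity)."""
    m = a + b - 1
    return sum(comb(m, i) * c**i * (1 - c) ** (m - i) for i in range(a, m + 1))


def hypothesis_posteriors(y, n, w, delta, prior=(1 / 3, 1 / 3, 1 / 3)):
    """Posterior probabilities of H1 (below), H2 (near) and H3 (above the MTD)."""
    a, b = y + 1, n - y + 1
    f_lo = beta_cdf_int(w - delta, a, b)
    f_hi = beta_cdf_int(w + delta, a, b)
    # Marginal likelihoods p(y | H_k) under the interval-restricted uniform priors.
    marginals = (
        f_lo / ((w - delta) * (n + 1)),
        (f_hi - f_lo) / (2 * delta * (n + 1)),
        (1.0 - f_hi) / ((1.0 - w - delta) * (n + 1)),
    )
    weighted = [p * m for p, m in zip(prior, marginals)]
    total = sum(weighted)
    return tuple(v / total for v in weighted)


if __name__ == "__main__":
    # w = 0.25 and delta = 0.02 follow the worked example in the text;
    # the toxicity counts are hypothetical.
    for y, n in [(0, 3), (0, 6), (6, 6)]:
        print(y, n, ["%.3f" % p for p in hypothesis_posteriors(y, n, 0.25, 0.02)])
```

A quick sanity check is that with no patients treated (`y = 0`, `n = 0`) each marginal equals 1, so the posterior probabilities reduce to the prior probabilities.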
To gauge the strength of the evidence in favor of each hypothesis, Jeffreys ([14]) suggested interpreting the Bayes factor in units of 1/2 on the $\log_{10}$ scale: if $\log_{10} \mathrm{BF}_{12} > 1/2$, the data contain substantial evidence in favor of $H_1$ against $H_2$; if $\log_{10} \mathrm{BF}_{12} > 1$, the evidence is strong; and if $\log_{10} \mathrm{BF}_{12} > 2$, the evidence is decisive. In our case, if $\log_{10} \mathrm{BF}_{12} > 1/2$ and $\log_{10} \mathrm{BF}_{13} > 1/2$, which under equal prior probabilities implies $p(H_1 \mid y_j) > 0.61$, there is substantial evidence in favor of $H_1$ against both $H_2$ and $H_3$, suggesting that dose level $j$ is far below the MTD. As a result, we should directly escalate the dose to level $j+1$, without the need to borrow any information from other doses. Similarly, if $p(H_3 \mid y_j) > 0.61$, we should de-escalate the dose to level $j-1$, as there is substantial evidence that dose level $j$ is far above the MTD. Finally, if $p(H_2 \mid y_j) > 0.61$, there is substantial evidence that dose level $j$ is close to the MTD, and the next cohort should stay at the same level.
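The 0.61 cutoff can be traced back to the Bayes factor thresholds with a short calculation. The derivation below is a sketch that assumes equal prior probabilities for the three hypotheses; it shows the direction from the Bayes factor conditions to the posterior probability bound.

```latex
% Assumes equal prior probabilities p(H_1) = p(H_2) = p(H_3) = 1/3.
\begin{align*}
p(H_1 \mid y_j) &= \frac{1}{1 + \mathrm{BF}_{21} + \mathrm{BF}_{31}}, \\
\log_{10}\mathrm{BF}_{12} > \tfrac{1}{2} \ \text{and}\ \log_{10}\mathrm{BF}_{13} > \tfrac{1}{2}
  &\;\Longrightarrow\; \mathrm{BF}_{21},\ \mathrm{BF}_{31} < 10^{-1/2} \approx 0.316, \\
  &\;\Longrightarrow\; p(H_1 \mid y_j) > \frac{1}{1 + 2 \times 10^{-1/2}} \approx 0.613 .
\end{align*}
```

Because 0.61 exceeds 1/2, at most one of the three posterior probabilities can pass the cutoff, so the three decisions can never conflict.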
When none of the posterior probabilities of the hypotheses is greater than 0.61, that is, $p(H_k \mid y_j) \le 0.61$ for all $k$, the data at dose level $j$ are not informative enough to support any action. In that case, we invoke the mTPI approach, pooling the information from all the dose levels to guide the dose assignment for new patients. In other words, if the toxicity information at the currently administered dose is strong enough, we base the decision on the Bayes factors obtained in (5); otherwise, we resort to the model-based approach to borrow information across different dose levels.
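The resulting decision logic is simple enough to state as a small function. The sketch below takes the three posterior probabilities as input and is only an illustration: the function name and the string labels are hypothetical, and the fallback label stands for handing the decision over to the mTPI rule.

```python
def hybrid_decision(post, cutoff=0.61):
    """Map posterior probabilities (p(H1|y), p(H2|y), p(H3|y)) to an action.

    Returns "escalate", "de-escalate" or "stay" when one hypothesis has
    posterior probability above the cutoff, and "mTPI" otherwise, meaning
    that the decision is deferred to the mTPI dose-assignment rule.
    """
    p_below, p_near, p_above = post
    if p_below > cutoff:
        return "escalate"       # dose level j is far below the MTD
    if p_above > cutoff:
        return "de-escalate"    # dose level j is far above the MTD
    if p_near > cutoff:
        return "stay"           # dose level j is close to the MTD
    return "mTPI"               # data at dose j alone are not informative enough


if __name__ == "__main__":
    # Hypothetical posterior probabilities, for illustration only.
    print(hybrid_decision((0.72, 0.25, 0.03)))
    print(hybrid_decision((0.45, 0.35, 0.20)))
```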

Design
We replace the 3+3 method in the three-stage design with the hybrid mTPI approach described above. After the completion of the phase I stage, the adaptive randomization approach of Yuan & Yin ([15]) is adopted to assign patients to the ideal dose level. Our design uses the beta-binomial model for efficacy responses. Let $Y_j$ denote the number of responses among the $n_j$ patients treated in dose arm $j$ ($j = 1, \ldots, D$), and let $Y_0$ denote the number of responses among the $n_0$ patients treated in the placebo arm. Let $Y_j$ ($j = 1, \ldots, D$) and $Y_0$ be independent random variables following the binomial distributions $\mathrm{Bin}(n_j, q_j)$ and $\mathrm{Bin}(n_0, q_0)$, respectively. The joint likelihood function for all doses can be written as
$$L(q_1, \ldots, q_D, q_0) \propto \prod_{j=1}^{D} q_j^{Y_j} (1 - q_j)^{n_j - Y_j} \times q_0^{Y_0} (1 - q_0)^{n_0 - Y_0}.$$
In our design, the response rates $q_j$ and $q_0$ are assumed a priori to be independent and identically distributed as Beta(0.5, 0.5), where Beta$(a, b)$ denotes a beta distribution with density proportional to $x^{a-1}(1-x)^{b-1}$. By Bayes' theorem, the posterior distribution of $q_j$ is Beta$(0.5 + Y_j, 0.5 + n_j - Y_j)$ and the posterior distribution of $q_0$ is Beta$(0.5 + Y_0, 0.5 + n_0 - Y_0)$.
Table 1. Dose response rates ($q_d$) and placebo response rate ($q_0$) scenarios.
Table 2, note: The first row is the average total sample size in our proposed design; numbers in parentheses are the percentage reductions in sample size compared with the XJT and the three-stage designs. doi:10.1371/journal.pone.0073060.t002
Regarding phase I, assuming that $p(H_1) = p(H_2) = p(H_3) = 1/3$, dose finding proceeds as follows: (1) patients in the first cohort are treated at the lowest dose $d_1$; (2) at the current dose level $j^{\mathrm{curr}}$ with the observed data $y_{j^{\mathrm{curr}}}$, we calculate $p(H_1 \mid y_{j^{\mathrm{curr}}})$, $p(H_2 \mid y_{j^{\mathrm{curr}}})$, and $p(H_3 \mid y_{j^{\mathrm{curr}}})$. If $p(H_1 \mid y_{j^{\mathrm{curr}}}) > 0.61$, we escalate the dose level to $j^{\mathrm{curr}} + 1$; if $p(H_3 \mid y_{j^{\mathrm{curr}}}) > 0.61$, we de-escalate the dose level to $j^{\mathrm{curr}} - 1$; and if $p(H_2 \mid y_{j^{\mathrm{curr}}}) > 0.61$, the dose stays at level $j^{\mathrm{curr}}$ for the next cohort of patients; (3) otherwise, we switch to the mTPI method to continue dose finding.
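For the efficacy side, the beta-binomial update with the Beta(0.5, 0.5) prior described above amounts to adding the observed counts to the prior parameters. The snippet below is a minimal sketch of that update; the function names and the example counts are hypothetical.

```python
def efficacy_posterior(responses, n, prior_a=0.5, prior_b=0.5):
    """Beta(0.5, 0.5) prior with `responses` among n patients -> Beta posterior."""
    if not 0 <= responses <= n:
        raise ValueError("responses must lie between 0 and n")
    return prior_a + responses, prior_b + n - responses


def posterior_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


if __name__ == "__main__":
    # Hypothetical arm: 4 responses among 10 patients.
    a, b = efficacy_posterior(4, 10)
    print("posterior: Beta(%.1f, %.1f), mean %.3f" % (a, b, posterior_mean(a, b)))
```

The same function applies to the placebo arm by passing $Y_0$ and $n_0$.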
During phase I, if one or more promising doses have graduated to phase II, the proposed design adaptively randomizes incoming patients to the graduated doses or placebo. In our design, the cohort size in phase II is 3 patients, and the estimated response rates for dose levels $d_1$ to $d_D$ are denoted by $q_1$ to $q_D$, respectively. An adaptive allocation procedure tends to assign more patients to the most efficacious dose, and the allocation mechanism is the essential part of an adaptive randomization approach. Several approaches have been proposed in recent years ([1], [16], [17]). One traditional approach lets the assignment probability for dose $j$ be proportional to its response rate $q_j$ as estimated from the data accumulated so far. This approach does not perform well when the sample size is small, because the estimates of $q_j$ are then unreliable and unstable, yet small sample sizes are characteristic of early-phase studies. A Bayesian approach compares the response rates with a target rate, say $q$, and lets the allocation probabilities be proportional to the posterior probabilities $\Pr(q_j > q \mid \text{data})$. Nonetheless, as pointed out in [15], adaptive randomization may not work well with this approach when all of the true response rates are much higher or much lower than $q$. In our design, we adopt the Bayesian moving-reference adaptive randomization (MAR) approach proposed in [15]. A distinctive feature of MAR is that the set of treatments under comparison is successively reduced, which gives a high resolution for distinguishing the treatments.
The MAR approach works as follows, where, following [15], $p_j$ denotes the response rate of arm $j$ ($q_j$ in our notation), $\pi_j$ its randomization probability, and $\mathcal{D}$ the observed data. Let $\bar{A}$ and $A$ denote the sets of indices of the treatment arms that have and have not been assigned randomization probabilities, respectively. First, start with $\bar{A} = \emptyset$, the empty set, and $A = \{1, 2, \ldots, D\}$. Second, compute the average response rate of the arms belonging to $A$, $\bar{p} = \sum_{j \in A} p_j / \sum_{j \in A} 1$, and use $\bar{p}$ as the reference to determine $R_j = \Pr(p_j > \bar{p} \mid \mathcal{D})$ for $j \in A$. Identify the arm with the smallest value of $R_j$, $R_l = \min_{j \in A} R_j$; then assign arm $l$ the randomization probability
$$\pi_l = \frac{R_l}{\sum_{j \in A} R_j} \Bigl( 1 - \sum_{j' \in \bar{A}} \pi_{j'} \Bigr),$$
and update $A$ and $\bar{A}$ by moving arm $l$ from $A$ into $\bar{A}$. Last, repeat the computation and assignment steps, spending the remaining randomization probability, until all of the arms have been assigned randomization probabilities $(\pi_1, \ldots, \pi_D)$, and then randomize the next cohort of patients to the $j$-th arm with probability $\pi_j$. This approach overcomes the disadvantage mentioned above. A detailed description is given in [15].
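The moving-reference procedure can be approximated by Monte Carlo, as in the sketch below. Several points are assumptions rather than details taken from the source or from [15]: the function name, the use of posterior draws to estimate each $R_j$, the reading of the reference as the per-draw average of the remaining arms' response rates, and the convention that the last remaining arm simply receives the unspent probability (its own $R$ would otherwise be zero).

```python
import random


def mar_probabilities(posteriors, n_draws=20000, seed=0):
    """Approximate MAR randomization probabilities by Monte Carlo.

    posteriors: list of (a, b) Beta posterior parameters, one pair per arm.
    Returns a list of randomization probabilities in the same arm order.
    """
    if not posteriors:
        raise ValueError("at least one arm is required")
    rng = random.Random(seed)
    # Joint posterior draws of the response rates (arms are independent).
    draws = [[rng.betavariate(a, b) for a, b in posteriors] for _ in range(n_draws)]
    remaining = list(range(len(posteriors)))   # set A: arms not yet assigned
    pi = {}                                    # arms in A-bar with their probabilities
    while len(remaining) > 1:
        counts = dict.fromkeys(remaining, 0)
        for d in draws:
            ref = sum(d[j] for j in remaining) / len(remaining)  # moving reference
            for j in remaining:
                if d[j] > ref:
                    counts[j] += 1
        R = {j: counts[j] / n_draws for j in remaining}
        worst = min(remaining, key=R.get)      # arm l with the smallest R_j
        unspent = 1.0 - sum(pi.values())
        pi[worst] = R[worst] / sum(R.values()) * unspent
        remaining.remove(worst)
    # Assumption: the final arm takes whatever probability is left.
    pi[remaining[0]] = 1.0 - sum(pi.values())
    return [pi[j] for j in range(len(posteriors))]


if __name__ == "__main__":
    # Hypothetical posteriors: 2/10, 5/10 and 8/10 responses with a Beta(0.5, 0.5) prior.
    print(mar_probabilities([(2.5, 8.5), (5.5, 5.5), (8.5, 2.5)]))
```

Fixing the seed makes the allocation reproducible; in practice the number of draws would be chosen large enough that Monte Carlo error in the $R_j$ is negligible.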
We propose the following rules that are applied to the accumulating data for each dose arm.
where $\Delta$ is a physician-specified superiority treatment margin. If $R_2 > k$, the dose graduates to phase II.
In sum, our proposed design is schematized as follows.
Trial initiation. Patients of the first cohort are treated at the lowest dose level.
Onset of phase I. Phase I dose-finding starts after the first cohort is enrolled. Dose escalation proceeds based on the hybrid mTPI design.
Dose expansion. If an adjacent higher dose arm is opened for safety testing, we simultaneously expand the currently administered dose to phase II.
Onset of phase II. Once a dose graduates, phase II starts. Patients will be randomized to the graduated doses or a placebo arm. For arm j, the randomization probability is proportional to the probability computed by the MAR approach.
Trial termination. The trial is terminated when either of two conditions is met: 1) no dose is left in either phase; or 2) the prespecified maximum sample size is reached.

Simulation settings
For a fair comparison, we use simulation scenarios identical to those of Xie et al. ([7]), whose study considers two toxicity patterns across the five dose levels: equal toxicity rates, with $t_j = 0.05$ for $j = 1, \ldots, 5$, and increasing toxicity rates, with $(t_1, \ldots, t_5) = (0.03, 0.06, 0.09, 0.12, 0.15)$. Two control response rates ($q_0$) are used in the simulation, 0.2 and 0.5. The true treatment response rates ($q_j$) are chosen to cover the scenarios encountered in practice, namely null, increasing, decreasing, n-shaped, u-shaped, and equal (please refer to Table 1 for details).
In the simulations, the design parameters were selected through exploratory simulation studies so that the competing designs achieve a similar type I error rate of 0.05. The maximum sample size for the trial is 180, and the maximum sample size for each dose arm or the placebo arm is 30 patients. The superiority margin $\Delta$ is 0.2, and the cutoff $k$ is set to 0.90 by calibration. For each of the 24 scenarios, the proposed design was compared with both the XJT design and the three-stage design in terms of the average total sample size, the optimal dose selection percentage, the average number of patients at each dose level, and the toxicity rates.
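The simulation settings listed above can be collected in a small configuration object, sketched below. The class and constant names are hypothetical, and the enumeration of the 24 scenarios as 2 toxicity patterns by 2 placebo rates by 6 response shapes, in that order, is an assumption that merely agrees with how the scenarios are described in the text; the actual response rates are given in Table 1 and are not reproduced here.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SimulationSettings:
    max_total_n: int = 180             # maximum sample size for the trial
    max_arm_n: int = 30                # maximum per dose arm or placebo arm
    superiority_margin: float = 0.2    # Delta
    graduation_cutoff: float = 0.90    # k, set by calibration
    phase2_cohort_size: int = 3


TOXICITY_PATTERNS = {
    "equal": (0.05, 0.05, 0.05, 0.05, 0.05),
    "increasing": (0.03, 0.06, 0.09, 0.12, 0.15),
}
PLACEBO_RATES = (0.2, 0.5)
RESPONSE_SHAPES = ("null", "increasing", "decreasing", "n-shaped", "u-shaped", "equal")

# Assumed ordering: scenario number = position in this list + 1.
SCENARIOS = list(product(TOXICITY_PATTERNS, PLACEBO_RATES, RESPONSE_SHAPES))


if __name__ == "__main__":
    settings = SimulationSettings()
    print(settings)
    print(len(SCENARIOS), "scenarios; scenario 14 is", SCENARIOS[13])
```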

Simulation results
Sample size reduction. When the toxicity pattern across doses $j = 1, \ldots, 5$ is equal or increasing, the average number of patients under the XJT design is 135 or 129, respectively. Compared with the XJT design, the proposed design requires a substantially smaller average sample size across all scenarios, a reduction of approximately 25% on average; compared with the three-stage design, the average sample size is also reduced, by approximately 12% on average (see Table 2 for details). These results show that the proposed design is very competitive in terms of cost, implying a shorter drug development process and, because fewer patients are needed, an ethical advantage. The good performance is attributable to the efficient hybrid mTPI design adopted in phase I and the adaptive randomization procedure used in phase II.
Optimal allocations. The dose selection percentages and the average numbers of patients treated at the various doses are presented in Table 3. Scenarios 14 and 20, in which both toxicity and efficacy increase with dose, are the situations most often encountered in clinical practice. In these two cases, the dose selection percentages for doses $d_1, \ldots, d_5$ increase from 13.7 to 28.9 and from 12.4 to 25.6, respectively, and the average numbers of patients assigned to the doses increase from 8.3 to 15.1 and from 11.1 to 23.1, correspondingly. To examine the net effect of an increasing dose response on patient allocation, consider scenario 2, in which the toxicity rates are constant across doses while the response rates increase from 20% to 60%. As shown in Table 3, the number of patients assigned to the doses ranges from 7.9 to 19.3, or from 11.4% to 34.2% in percentage terms. These results show that the proposed design efficiently assigns more patients to the most effective dose levels. Scenario 8 shows similar results. As for the net effect of increasing toxicity rates, in scenario 13 the response rates remain unchanged while the toxicity rates vary between 3% and 15%. From Table 3, the average number of patients per dose decreases from 17.3 to 9.9, or from 26.4% to 14.1% in percentage terms. Scenario 19 exhibits similar results.
In brief, the proposed design substantially reduces the sample size, and in all scenarios the optimal doses are selected with high probability and a large proportion of patients are assigned to the efficacious and safe dose levels.

Discussion
The early phases of clinical trials, phases I and II, play a vital role in drug development, and the success of phase III confirmatory trials is contingent on them. However, traditional procedures separate the early phases into two distinct stages and fail to borrow information across phases I and II. A design that efficiently integrates the information accumulated in phases I and II would therefore be especially beneficial to drug development, given its current high risk and cost. The design described in this paper is intended as an upgraded version of the three-stage design. Its highlights are the adoption of a novel Bayesian phase I design, the hybrid mTPI design, and of the MAR randomization procedure. Extensive simulation studies were conducted to verify the claimed good performance.
Readers may wonder why we did not choose a hybrid CRM for the phase I stage. The CRM approach requires one to select a set of skeletons before the trial, and how to choose the skeletons remains an open problem, although one method has been proposed ([13]). In fact, we also ran the study using the hybrid CRM in the phase I stage, but the results are not presented in the tables; the findings were essentially the same as those obtained with the hybrid mTPI. Accordingly, we believe that the hybrid mTPI approach is easier to adopt and more user-friendly in practice.
In summary, the design proposed in this paper is more ethical in that more patients are assigned to the optimal doses, it can expedite the clinical process, and it can reduce the cost of drug development through a smaller sample size. The design structure is easy to understand, and applying the design does not require calibration of design parameters.