How to design a dose-finding study on combined agents: Choice of design and development of R functions

Background In oncology, the aim of dose-finding phase I studies is to find the maximum tolerated dose for further studies. The use of combinations of two or more agents is increasing. Several dose-finding designs have been proposed for this situation. Numerous publications have however pointed out the complexity of evaluating therapies in combination due to difficulties in choosing between different designs for an actual trial, as well as complications related to their implementation and application in practice. Methods In this work, we propose R functions for Wang and Ivanova’s approach. These functions compute the dose for the next patients enrolled and provide a simulation study in order to calibrate the design before it is applied and to assess the performance of the design in different scenarios of dose-toxicity relationships. This choice of the method was supported by a simulation study which the aim was to compare two designs in the context of an actual phase I trial: i) in 2005, Wang and Ivanova developed an empirical three-parameter model-based method in Bayesian inference, ii) in 2008, Yuan and Yin proposed a simple, adaptive two-dimensional dose-finding design. In particular, they converted the two-dimensional dose-finding trial to a series of one-dimensional dose-finding sub-trials by setting the dose of one drug at a fixed level. The performance assessment of Wang’s design was then compared with those of designs presented in the paper by Hirakawa et al. (2015) in their simulation context. Results and conclusion It is recommended to assess the performances of the designs in the context of the clinical trial before beginning the trial. The two-dimensional dose-finding design proposed by Wang and Ivanova is a comprehensive approach that yields good performances. The two R functions that we propose can facilitate the use of this design in practice.


Methods
In this work, we propose R functions for Wang and Ivanova's approach. These functions compute the dose for the next patients enrolled and provide a simulation study in order to calibrate the design before it is applied and to assess the performance of the design in different scenarios of dose-toxicity relationships. This choice of the method was supported by a simulation study which the aim was to compare two designs in the context of an actual phase I trial: i) in 2005, Wang and Ivanova developed an empirical three-parameter model-based method in Bayesian inference, ii) in 2008, Yuan and Yin proposed a simple, adaptive twodimensional dose-finding design. In particular, they converted the two-dimensional dose-finding trial to a series of one-dimensional dose-finding sub-trials by setting the dose of one drug at a fixed level. The performance assessment of Wang's design was then compared with those of designs presented in the paper by Hirakawa et al. (2015) in their simulation context.

Results and conclusion
It is recommended to assess the performances of the designs in the context of the clinical trial before beginning the trial. The two-dimensional dose-finding design proposed by Wang and Ivanova is a comprehensive approach that yields good performances. The two R functions that we propose can facilitate the use of this design in practice. PLOS

Introduction
In oncology, the aim of dose-finding phase I studies is to find the maximum tolerated dose (MTD) for further studies. Currently, most designs use the number of dose-limiting toxicities (DLTs) as a primary endpoint to estimate the MTD. The MTD is then defined as the dose associated with an acceptable estimated probability of DLTs. Combining two or more agents to improve antitumoral activity and therefore efficacy is common. However, Phase I trials on such combinations, aiming to find the MTD for the combination (also rated by MTD, in order to simplify the rating) are more difficult and complex than those on conventional single agents.
In single-agent trials, the MTD is based on the paradigm of an increasing dose-toxicity relationship. The continual reassessment method (CRM) is widely used in this case [1]. However, in studies on combinations, the ordering of toxicity probabilities between all combinations is not known prior to the study. For instance, the combination of two agents with three and four dose levels respectively, leads to 12 dose combinations. The ordering is only partially known among all these combinations. This leads to uncertainty in the ordering of monotonic toxicities for some dose combinations. This paper was prompted by the phase I VINILO trial, an international phase I trial evaluating the combination of vinblastine and nilotinib in children and adolescents with refractory or recurrent low-grade glioma (ClinicalTrials.gov Identifier: NCT01884922). It was planned to explore respectively four and three dose levels for vinblastine (from 3 to 6 mg/m 2 ) and nilotinib (115 to 350 mg/m 2 ). The study received regulatory approval in January 2013 and officially started in France in May 2013. The protocol was amended in 2014. The main modification concerned the dose-escalation method to apply the method published by Wang and Ivanova [2], whereas the initial version of the protocol used the method published by Yuan and Yin [3]. Both designs correspond to the particularities of the VINILO trial, which are detailed in the next section, and the choice of the final design was based on the simulation study, which will be presented here. Indeed, for studies testing combinations, many designs have been proposed, either algorithm-based or model-based [2,[4][5][6][7]. For example, Wages et al. proposed a partial order CRM design [7], which is able to identify the MTD based on various possible orderings with a partial order between combinations. This ordering between combinations is assumed to be monotonic and increases with the dose. Ivanova and Wang proposed an 'up-and-down algorithm' method based on isotonic regression to estimate a set of possible MTDs [4]. Mandrekar et al. developed a model-based design considering both the toxicity and efficacy of each agent to identify the optimal dose [5,6]. Wang and Ivanova developed a three-parameter model-based method for which the parameters were estimated using Bayesian inference [2]. This list from the literature is not exhaustive and many other methods exist. Some authors have compared different methods in simulation studies. Riviere et al. [8] compared two algorithm-based [4,9] aand four model-based dose-finding methods [2,7,10] in a simulation using 10 realistic scenarios describing different dose-toxicity relationships. One of their conclusions was that the model-based methods performed better than the algorithm-based methods. This point corresponds to the findings in single-agent studies [11]. Hirakawa et al. proposed a comparative study on phaseI combination therapies [12]; they focused only on model-based designs in which the primary interest was to find the MTD for only one combination. One of their conclusions was that the average performance of methods across the 16 scenarios used in their simulation study varied depending on (i) whether the dose-combination matrix is square or not; (ii) whether the true MTDs exist within the same group along the diagonals of the dosecombination matrix; and (iii) the number of true MTDs.
In the setting we encountered in the VINILO trial, we propose an approach to designing a dose-finding study on a combination of agents. Our presentation of this work is limited to the two designs used in VINILO [2,3]: i)Wang and Ivanova developed an empirical three-parameter model-based method for which the parameters were estimated using Bayesian inference. The dose assigned for the next patients was defined based on the neighboring doses considering a two-dimensional approach, ii) Yuan and Yin proposed a simple, adaptive two-dimensional dose-finding design that can accommodate any type of single-agent dose-finding method. In particular, they converted the two-dimensional dose-finding trial to a series of one-dimensional dose-finding sub-trials by setting the dose of one drug at a fixed level. Subtrials were then conducted sequentially using the CRM. The performances of these two approaches was compared under different scenarios in a simulation study in the context of the VINILO trial. We then compared the performance of Wang's design with those presented in the paper by Hirakawa et al. [12] in the context of their simulation studies. Six methods were used: the method based on a copula-type model, termed the YYC [10], the method using a hierarchical Bayesian model, termed the BW method [13], the likelihood-based dose-finding method using a shrinkage logistic model, termed the HHM method [14], the likelihood-based CRM for partial ordering, termed the WCO method [15], the order-restricted inference method of Conaway et al., termed CDP [16], and the Bayesian dose-finding design based on the logistic model termed RYDZ, [17].
While several other methods exist, they are rarely used in practice [8], mainly due to complications related to their implementation and the lack of simple tools. We propose a tool involving two R functions, for use with the Wang and Ivanova design: the two-dimensional dose-finding in discrete dose space [2]. The choice was prompted by the fact that: i) among the designs compared by Riviere et al., Wang's method performed well in terms of identifying the MTD, and ii) our simulation studies yielded good performances with this method.
The Methods are presented in Section 2. In Section 3, we present the comparison study in the context of the VINILO trial. We report the results in Section 4. In Section 5, two R functions are described, one to run the simulation study in order to assess the performance of the method in various scenarios, and one to compute the dose for the next patients in a phase I trial. In Section 6, the conclusions and discussion are presented.

Motivation for the research
The VINILO trial is an international phase I trial evaluating the combination of vinblastine and nilotinib in children, adolescents and young adults with low-grade gliomas in progression after first-line treatment with chemotherapy or radiotherapy. The aim of the trial is to determine the MTDs of nilotinib and vinblastine when used in combination for phase II trials, i.e. the dose associated with an estimated probability of DLT close to 20%. It was planned to explore four and three dose levels respectively for vinblastine (from 3 to 6 mg/m 2 ) and nilotinib (115 to 350 mg/m 2 ). A total of 12 dose levels could potentially be explored, however, after discussion with the principal investigator, the dose levels of 4, 5 and 6 mg for vinblastine combined with the dose level of 115 mg for nilotinib were not explored (see Fig 1). The trial started in France in 2013 using Yuan's approach, which converts the two-dimensional dose-finding trial into a series of one-dimensional dose-finding sub-trials by setting the dose of one drug at a fixed level. The protocol was amended in 2014. The main change concerned the dose-escalation method to follow the method published by Wang and Ivanova [2], whereas the initial version of the protocol had proposed the method published by Yuan and Yin in 2008 [3]. The decision was based on the simulation study, which we will present here.

Two-dimensional dose finding in discrete dose space
Wang and Ivanova proposed an approach based on an empirical model using a two-dimensional dose finding design in a Bayesian framework.
Let us consider two agents to be evaluated: Agent A where a set {s 1 , . . .s J } of doses will be evaluated, and Agent B where a set {t 1 , . . .t K } of doses will be evaluated, with K < J. The goal of Wang's approach is to find a collection of dose combinations that yield the target probability of toxicity, for each dose level of Agent B t k .
Wang and Ivanova proposed a dose-combination model with interaction term as: With θ = (α, β, γ): the vector of parameters to be estimated, where γ describes the interaction between the two agents.
To satisfy the assumption of monotonicity of toxicity, the parameters should satisfy the following constraints: α > 0, β > 0, and γ < 0.
The vectors a j and b k were used in place of actual doses, with 0 < a 1 < . . . < a J < 1 and 0 < b 1 < . . . < b K < 1. These vectors were considered as initial guesses which corresponded to the "desirable range of toxicity probabilities for testing" [1].
The authors proposed to estimate the parameters of the model using a Bayesian framework. The parameter θ is assumed to follow a joint prior distribution with density g(θ). The posterior distribution of θ given toxicity data from the first i subjects is:  where f(D i |θ) is the joint conditional density of the observed responses given θ: Cðs j ; t k ; yÞ y l ð1 À Cðs j ; t k ; yÞÞ ð1À y l Þ After the inclusion of each new cohort, the posterior mean of the probability of toxicity was to be updated and the authors recommended making the next dose assignment to a combination adjacent to the current one. If the current dose was (j, k), the next dose assigned was a combination from the set:: Due to ethical constraints, combinations where both agents were increased at the same time (j + 1, k + 1) were not permitted.
At the end of the trial, the MTD was defined, for each dose level of Agent B, as the dose combination where the estimated probability of toxicity was closest to the target.
Wang and Ivanova proposed a start-up rule in order to gather information before estimating parameters. Patients in phase I studies are allocated to doses sequentially. In practice, the first group of the start-up is treated at the lowest dose combination (s 1 , t 1 ). The start-up phase is then conducted according the following rules: i) if no toxicity is observed, the dose of the first agent is escalated; ii) if at least one toxicity is observed, the start-up is terminated at the current level (i,j), and the next dose combination is (i-2, j + 1). The start-up is terminated if all dose levels of Agent B were explored. At the end of the start-up, the vector of initial guesses is used to obtain the estimates of the toxicity rate for each combination. For safety's sake in the dose-escalation process, we proposed to start the two-dimensional design at the lowest level of Agent B at a dose combination with the probability of toxicity closest to the target toxicity.
We note that the start-up rule can be defined depending on the context of the clinical trial, for example priority can be given to exploring certain dose combinations. The start-up rules should be defined before beginning the trial.

Sequential CRM for two-dimensional dose-finding designs
Yuan and Yin proposed to convert the dose-escalation design for dual agents into k onedimensional dose-escalation trials, with k being the number of dose levels for the second agent, which had fewer dose levels to be explored. Each one-dimensional trial is referred to as a sub-trial, where the dose of the second drug is set at a fixed level, and the dose of the first drug varies. In this design, the idea is to first divide the entire two-dimensional trial into groups of sub-trials and then conduct the sub-trials sequentially in a specific order. The intermediate-dose sub-trial is run first, in order to find the MTD. The MTD that has been determined in the completed sub-trial can be used as the truncation boundary to shrink the search space of other sub-trials that have not yet been carried out. The low-dose and high-dose subtrials were then run simultaneously. For example, if dose level 3 is the MTD in the intermediate sub-trial, then based on the assumption of monotonicity, the MTD must be higher than or equal to dose level 3 in the lower sub-trial, and the MTD must be lower than or equal to dose level 3 in the higher trial. In other words, determining dose level 3 as the MTD in the intermediate sub-trial shrinks the number of patients in the lower and higher sub-trials, by eliminating "inadmissible" doses.
The two sub-trials are not independent of the main trial because the vectors of initial guesses of the low-dose and high-dose sub-trials are proportional to the posterior toxicity probability of the main sub-trial. For each sub-trial, the conventional Bayesian CRM with a one-parameter empirical model is used [1]. At the end of the trial, this method provides an MTD for each dose level of the second agent. The MTD is defined as the dose combination with an estimated probability of toxicity closest to the target.

Design parameters
The performances of Wang and Yuan approaches were compared under different scenarios in a simulation study in the context of the VINILO trial. After discussion with the principal investigator, the dose levels of 4, 5 and 6 mg of vinblastine combined with the dose level of 115 mg of nilotinib were not explored (see Fig 1). As the trial was started using Yuan's approach, no start-up rule was proposed for Wang's method. To be consistent across the different methods, the trial was to stop when the prespecified number of patients, n, was exhausted. The number of patients was set at 40, which was the capacity of trial recruitment. Using Yuan's approach, two sub-trials were considered, and the number of patients was divided into 24 patients in the main trial and 16 in the high-dose sub-trial. The cohort size was 3 patients. The target probability of DLTs was set at 20%.
For each scenario, we simulated 1000 repetitions of a trial. Skipping of dose levels during dose escalation was not permitted. No intra-patient dose escalation was permitted. At least 2 patients fully observed with no DLTs were required at a given dose before dose escalation. We used the same definition of the MTD in both methods, i.e. the dose associated with an estimated DLT probability closest to, but below, the target DLT. The target was set at 20%.
For the Wang and Ivanova approach, the parameter γ was set at 0 assuming no interaction between the two agents nilotinib and vinblastine. The prior distribution for the parameters (α, β) was the product of two independent Exponential with a mean of 1. The Bayesian estimation was based on 5000 Monte Carlo iterations with no start-up.
For the Yuan and Yin approach, as recommended by the authors, a non-informative prior distribution Gaussian (0, 2) was assigned in the Bayesian computation, and the proportionality coefficients for the initial guesses in the sub-trial were set at (0.85, 1.15), as recommended by the authors. We used the dfcrm R package to apply the CRM for each sub-trial.
As the initial comparison concerned the methods of both Wang and Yuan in the VINILO context, we completed the performance assessment of Wang's design, which was used in VINILO trial, by comparing it with those of designs presented in the paper by Hirakawa et al. [12] in the context of their simulation studies. Please refer to their paper for the designs compared and for the 16 scenarios studied in their simulation studies. We used the simulation setting of Hirakawa et al. [12]; the target toxicity was set to 0.3, no stopping rule was specified, and the pre-specified maximum sample size was set at 30. Each simulation study consisted of 1000 trials. The values of the prior toxicity probabilities for Agent A and Agent B were set at the same level as the prior toxicity probabilities for both agents as used in their paper (i.e.

Sensitivity analysis
A sensitivity analysis was performed to evaluate the performance of the method where: i) the proportionality coefficients were set at (0.85,1.5) and (0.85,2) instead of (0.85,1.15), and ii) different initial guesses using the R function "getprior" in the dfcrm package were used. As stated by Yin G and Lin R [18], typically, the maximum dose for each drug in a combination trial cannot exceed their individual MTDs, and as a result, the remaining lower doses can only be fractions of the corresponding MTDs. Thus, the maximum toxicity probability for each drug in the combination typically cannot exceed 20%, as synergism in DLTs is often a major concern.
Parameters of the scenarios The simulation study was performed using the scenarios reported by Wang and Yuan. A total of 8 scenarios were investigated ( Table 1). The first four were based on those reported by Wang and the other four were based on those reported by Yuan, with a target probability of DLT set at 20% in all scenarios. These scenarios spanned a broad range of hypotheses for dosetoxicity relationships in the context of VINILO.

Simulation of trials and metrics of comparison
In the context of VINILO, as the aim of the trial is to identify one MTD at the end of the trial, the methods were compared in terms of the percentage of dose recommendations, in particular the percentage of correct selection (PCS). The average number of observed DLTs at each dose level and the average number of patients treated at each dose level were also calculated (results available on request).

Results
The distribution of the recommended dose is presented in Table 2. Wang's method yielded a good performance with a PCS varying from 45% to 72%. In 6 of the 8 scenarios studied, Wang's method presented a PCS higher than that with Yuan's approach, with a difference in PCS varying from 7.6 to 44.6%. The control of overdosing was better with Wang's method in 6 of the 7 cases concerned. The percentage of patients treated at each dose level was compared. The number of patients treated at the MTD per dose level of the second dose was better with Wang's method (results available on request).
The performances of Wang's method compared to the designs used in the paper by Hirakawa for the 16 scenarios studied are presented in Table 3 [12,19]. The results are presented in terms of PCS and the average number of patients allocated to the MTD. The operating characteristics of Wang's method were inferior to those of the designs presented in the paper by Hirakawa, with an average difference of -11% in the PCS. Over-dosing was well controlled in Wang's method. The average number of patients allocated to the MTD was excellent with Wang's method, with a difference of +19. Wang's method appeared to be more conservative in the dose escalation process, with many more patients allocated to the MTD-1, in particular when it was the next closest dose to the target, leading to a conservative method, which may have been due to the start-up step used at the beginning of the trial. In fact, using the start-up steps, doses starting at the lowest dose level can be explored, and then the first agent can be explored while keeping the dose of the second agent at a fixed level until the first DLT occurs. This rule can however consume a large number of patients at subtherapeutic doses before exploring the MTD, and can thus lead to a shortage of patients to explore the MTD. For example, in scenarios 5 to 7 where the first doses had very low toxicity, many patients were used during the start-up before starting the two-dimensional trial. In these scenarios, a total of 16 doses were explored for 30 patients, resulting in an average of two patients per dose. We ran the same simulations using a variant of Wang's method with no start-up step. The results are presented in Table 3, noted as "Wang variant". The operating characteristics of the Wang variant were similar to those of the designs presented in the paper by Hirakawa, with an average difference in PCS less than 8%, with the best PCS in scenarios 8 and 10. The average number of patients allocated to the MTD was excellent with Wang's method, with a difference of +25 patients treated at the MTD. We ran the same simulations using 50 patients instead of 30, and the results were improved by more than 10% in all of the scenarios (results available on request).

Sensitivity analysis
We studied the operating characteristics of Yuan's method with different values for the proportionality coefficients for the initial guesses in the sub-trials. We considered pairs of values (0.85, 1.5) and (0.85, 2) instead of (0.85, 1.15). Similar results were obtained (results available upon request). We also compared the performance of Wang's method with different initial guesses. The results were similar in most scenarios (results available upon request).

Program description
We proposed scripts using Wang's method. We presented two functions: i) Wang's step-bystep (SBS) design, which computes the dose for the next patient in an actual phase I trial, and ii) Wang's simulation (SIM) design, which provides simulation replicates of a phase I trial using the Wang and Ivanova design. Both functions were written in R language. No downloading of R packages was required. The functions must be loaded in each work session and require several arguments, detailed below.

Example
The example concerns a phase I trial evaluating two agents, Agent A with five dose levels and Agent B with three dose levels. Thirty patients will be included in the trial, by cohorts of 3 patients. The first cohort will be allocated to dose level (1,1). Assuming that no toxicity is been observed at this dose, the next dose for the next cohort of patients is given as follows: SBS_design(cur_d1,cur_d2,nsamp, nd1, nd2, prior_1, prior_2, target, npts, ntox, ntot) To apply our R function, the arguments will be set as: prior_1 c(0. 12 The program first verifies consistencies in the parameters and error messages can be displayed. For example, if the length of prior_1 is superior or inferior to nd1, the program stops and signals that "prior_1 must be a vector (length nd1)". If there is no problem and the parameters are considered correct, calculations can then be continued, with the following message: "+ Parameters OK".
A summary of the number of patients treated at each dose level, the numbers of DLTs observed at each dose level and the next dose to be explored are then given.

Application: Validity of R scripts for Wang and Ivanova approach
To validate our R scripts, we ran the same scenarios (dose-toxicity relationships) proposed by Wang for the simulation study. The results were compared to those published in their paper. The results using R function were similar to those published by Wang.
Using the R functions, the results on the percentage of DLTs observed, and a recommended MTD at the end of the trial can be also obtained. The MTD is defined as the dose combination associated with a probability of DLT closest to the target toxicity (results available on request).

Conclusions and discussion
As stated by Hirakawa et al., thus far, there are no silver bullet designs to resolve the two-agent dose-finding problem. The operating characteristics of model-based dose-finding methods may be varied depending on prior toxicity probabilities, prior distributions, and number of patients included in trials. It is recommended to assess the performance of the designs in the context of the clinical trial before beginning the trial. The two-dimensional dose-finding design proposed by Wang and Ivanova is a comprehensive approach that yields good performances. The two R functions that we propose can facilitate the use of this design in practice. With the first function, the design can be applied to an actual phase I trial. With the second function, a simulation study can be run in order to calibrate the design before its application for a phase I trial, or in order to compare it with other designs. Both R functions are simple, and each part of the algorithm (start-up, decision rule. . .) can be modified depending on the requirements of the trial: i) the start-rule can be updated depending on the clinical context, ii) even if the design proposes to identify a set of MTDs, one MTD can be chosen from the set. For example, the MTD with a probability of DLT closest to the target toxicity can be chosen, or clinicians may prefer to choose to identify the dose for further trials depending on the clinical context.