What Can Be Learnt about Disease Progression in Breast Cancer Dormancy from Relapse Data?

Breast cancer patients have an anomalously high rate of relapse many years–up to 25 years–after apparently curative surgery removed the primary tumour. Disease progression during the intervening years between resection and relapse is poorly understood. There is evidence that the disease persists as dangerous, tiny metastases that remain at a growth restricted, clinically undetectable size until a transforming event restarts growth. This is the starting point for our study, where patients who have metastases that are all tiny and growth-restricted are said to have cancer dormancy. Can long-term follow-up relapse data from breast cancer patients be used to extract knowledge about the progression of the undetected disease? Here, we evaluate whether this is the case by introducing and analysing four simple mathematical models of cancer dormancy. These models extend the common assumption that a random transforming event, such as a mutation, can restart growth of a tiny, growth-restricted metastasis; thereafter, cancer dormancy progresses to detectable metastasis. We find that physiopathological details, such as the number of random transforming events that metastases must undergo to escape from growth restriction, cannot be extracted from relapse data. This result is unsurprising. However, the same analysis suggested a natural question that does have a surprising answer: why are interesting trends in long-term relapse data not more commonly observed? Further, our models indicate that (a) therapies which induce growth restriction among metastases but do not prevent increases in metastases' tumourigenicity may introduce a time post-surgery when more patients are prone to relapse; and (b), if a number of facts about disease progression are first established, how relapse data might be used to estimate clinically relevant variables, such as the likely numbers of undetected growth-restricted metastases. This work is a necessary, early step in building a quantitative mechanistic understanding of cancer dormancy.


Model 1: the cells of micrometastases can seed secondary micrometastases and micrometastases can disappear
The patient has a number of micrometastases n in total. A subset of these, of number n M , can disseminate cells that seed secondary micrometastases. In increments of time ∆t, the pair (n, n M ) ∈ Z 2 ≥0 ∪{G} changes value as probabilities being up to o(∆t). Relapse inevitably follows a growth event after growth time τ .
Model 2: micrometastases require two rate-limiting steps to escape from growth restriction The patient has micrometastases in different states denoted by S, SV , and V of number n S , n SV , and n V respectively. In increments of time ∆t, the triple (n S , n SV , n V ) ∈ Z 3 ≥0 ∪ {G} changes value as (n S , n SV , n V ) → (n S − 1, n SV + 1, n I ) prob. κ S n S ∆t, G = growth prob. (κ SV n SV + κ V n V ) ∆t, probabilities being up to o(∆t). Relapse inevitably follows a growth event after growth time τ . This definition is more general than that of the main text where κ SV = κ V Model 3: micrometastases escape from growth restriction at a rate that changes with time In increments of time ∆t, the patient's number of micrometastases n ∈ Z ≥0 ∪ {G} changes value as: n → G = growth prob. κ(t) n ∆t probabilities being up to o(∆t).

The specified growth time τ
For model fitting, the growth time τ was specified as 3 years for EBCTCG data and 1.5 years for Chia et al. data; the median doubling time in 199 cases of growing primary and metastatic breast cancer was 3.5 months [1], which gives an estimated time of 3 years for micrometastases to grow from 1 mm in diameter to a detectable size; study [2] estimated the metastasis growth time to be only 9 months; in other studies, estimates from the tumour volume-doubling time of breast metastases suggest that τ is approximately 2.5 years [3]. The choices for τ are sensible, and for the specific data sets used in this study, they demonstrate the results optimally; the conclusions are independent of the choices for τ . Recall that because relapse data from times < τ years is excluded from analyses, all patients who already had detectable or growing metastases upon resection are excluded from analyses.

Supplementary Material for Analyses
Explicit expressions for RFI curves used for rapid data fitting In this section, for Models 0-3 we present explicit analytic expressions for RFI curves f τ [t + τ ] at time t + τ post-resection 1 , where the subscript denotes normalization at time τ (necessary in order to exclude from analyses patients with growing or detectable metastases upon resection). These expressions enabled models to be fitted to data rapidly. The waiting time T R denotes the time at which a patient relapses, equivalent to the first time at which a metastasis becomes detectable; the waiting time T G denotes the time at which a growth event first occurs. For a given model of dormancy, parameters of the model may be different in different patients. Then the RFI curve is the probability that patients have not relapsed by time t + τ averaged among patients, i.e. averaged over parameters of the model according to some prescribed distribution representing parameter values among patients at the time of resection. Expressions are first derived for Models 0, 3, and 2 by elementary probability, then for Model 1 by a more lengthy derivation.

The Zeroth Model and Model 3
Upon resection a patient has n(0) micrometastases (of a single type) at time t, each undergoing growth events at a rate κ(t) that may vary with time. This is Model 3, and in the special case κ (t) = 0 it is the Zeroth Model. By definition, By normalizing RFI data at the growth time τ and analysing it only beyond this time, we are excluding from our analyses all patients who upon resection have either growing or detectable metastases. Initially upon resection at time 0, then, P {T G > 0} = 1, and the unique solution of this differential equation is 1 A Kaplan-Meier survival curve is derived from a set of times 0 ≤ t 1 < t 2 < . . . at which time the members of the cohort reach a specified 'end point'. First, the data is discretized into time intervals [u i , u i+1 ). If there are r i surviving patients at time u i of whom d i reach the end point in [u i , u i+1 ), then the Kaplan-Meier curve iŝ Now averaging over parameter values upon resection, this gives the explicit expression for the RFI curve normalized at time τ where throughout our analyses and in the main text the expectation operator with a subscript E t [·] denotes the average over patients without growing or detectable metastases at time t post-resection. In the main text, a linear dependence on time is assumed for κ(t) = κ 0 + κ 0 t, then (3) becomes Assuming also that κ 0 , κ 0 take the same values in all patients, and that among patients n(0) is distributed as a Poisson random variable with mean N , (4) gives explicit functional forms for the RFI curves of Model 3 and the Zeroth Model.

Model 2
In Model 2, the patient's micrometastases must undergo either one growth event to escape from growth restriction (type V ), or two growth events to escape from growth restriction (type S). Here T G denotes the first time at which a micrometastasis starts growing to a detectable size without any further intervening periods of growth restriction. The number of micrometastases of type V and type S upon resection are denoted by N V and N S respectively. In the case (n V (0), n S (0)) = (1, 0), P {T G > t} is again got directly from equation (3), where T S is the waiting time to the first growth event occurring at steady rate κ S and T SV is the waiting time for the second growth even occurring at steady rate κ SV . Again, as in (3), with corresponding probability density functions (p.d.f.s) From these expressions and the independence of micrometastases, we have , t > 0. (5)

Model 1
The following derivation [4] is a simple extension of work in [5] on a one-type linear birth-death process with killing. It enables rapid simulations of RFI curves for Model 1 by numerically solving Kolmogorov equations using standard software. In the special case p M = 1, there is an explicit analytic expression for RFI curves which can be found in [5].
Let n M = n − n M . The stopping times T C and T G denote the time of cancer clearance (n = 0) and the time of a growth event n → G respectively. The probability generating function (p.g.f.) of (n M , n M ) The coefficients of g j,k [w, z, t] tell us everything about the evolution of (n M , n M ). Importantly, and Assume µ > 0. (n M , n M ) is eventually either absorbed at 0 (clearance) or reaches G (growth) (proof not shown). The probability that (n M , n M ) is eventually absorbed at 0 is This is because s M , s M must satisfy the following conditioning on the first transition (here (·, ·) 1st −→ (·, ·) denotes the first transition and C|(j, k) denotes the event of eventual clearance from state (n M , n M ) = (j, k)): From this we see that (n M ,n M ) transits in infinitesimal time ∆t as up to o(∆t). (n M ,n M ) is a two-type birth-death process with absorption at (0, 0) but no growth, so well known theory applies. The p.g.f. of (n M , n M ) in terms of the p.g.f. of (n M ,n M ) is and For convenience of notation now define λ 1 = p M λs M , µ 1 = µ/s M , λ 2 = (1 − p M )λs M , and µ 2 = µ/s M . By definition up to o(∆t). Differential equations are got by setting t = ∆t in (11) & (12) and substituting the expressions (13) & (14) into the right-hand sides of (11) & (12), then expanding into Taylor series in terms of the increment ∆t; for ψ (M ) in (11), up to o(∆t). Rearranging and letting ∆t ↓ 0 obtains an ODE. A similar exercise for ψ ( M ) gets a second ODE; together these ODEs are called the Kolmogorov backward equations.
In summary, we have Backward equations: Forward equations: Initial condition: Note that as the forward equation is a linear PDE with only first order terms, the initial condition is sufficient for the uniqueness of solutions (see e.g. the chapter on Method of Characteristics, [6]). Given that initially n M (0) ∼ Bin(n(0), p M ) (equivalently, n(0) ∼ Poisson(N ) and n M (0) ∼ Poisson(p M × N )),

Models 1 and 3
In the following analysis we allow seeding and disappearance of micrometastases, as in Model 1, and a variable rate of the growth event, as in Model 3. Therefore, the following analysis applies to Model 1 in the special case κ(t) is constant in time, and to Model 3 in the special case of no secondary metastasis (λ = 0) and no disappearance of micrometastases (µ = 0). It is necessary to begin with a lemma.
be the expectation of variable (·) conditioned upon no growth event by time t. Then where the variable n is the total number of micrometastases in a patient at time t, and n M is the number of these in environments that permit secondary metastasis. Proof.
as required, assuming in line three that the series u o u (∆t) converges uniformly in ∆t so that u o u (∆t) = o(∆t).
Here n M = n − n M and, unlike in Model 3, κ = κ(t) may depend on time. Let P (j,k)→(l,m) (t) be the probability that (n M , n M )(t) = (l, m) given (n M , n M )(0) = (j, k). Let (·)(t) denote expectation of (·) at time t when the definition of (n M , n M )(t) is altered slightly: when T G ≤ t, (n M , n M )(t) = (0, 0), otherwise (n M , n M )(t) is as usual. Then  For any α, β clearly κ n ), and substituting the ODE for n , where κ is the derivative of κ with respect to time, defined for t such that P {T G > t} > 0.
We are now ready to derive the expression presented in the Methods section of the main text.
Parameters may vary among patients: f Ψ denotes the p.d.f. of the set of parameters Ψ among patients who do not have growing or detectable cancers upon resection. Then f τ [t + τ ] = P {T G > t|Ψ}f Ψ dΨ, and Note that these equations are independent of the initial condition.

Model 2
Again, it is necessary to begin with a lemma, the proof of which is very similar to the proof of the preceding lemma. Lemma. Proof.
as required, assuming in line three that the series u,v o u,v (∆t) converges uniformly in ∆t so that u,v o u,v (∆t) = o(∆t). Let P (a,b,c)→(k,l,m) (t) be the probability that (n S , n SV , n I )(t) = (k, l, m) given (n S , n SV , n V )(0) = (a, b, c). We use the same trick as before. Let (·)(t) denote expectation of (·) at time t when the definition of (n S , n SV , n V ) is altered slightly: when T G ≤ t, (n S , n SV , n V )(t) = (0, 0, 0), otherwise (n S , n SV , n V )(t) is as usual. Then equation (18) can be expressed as d dt P {T G > t} = κ SV n SV + κ V n V and when κ SV and κ V are constant in time this gets n SV (t + ∆t) | (n S , n SV , n V )(t) = (k, l, m) · P (a,b,c)→(k,l,m) (t) + 0 · P {T G < t} = (k,l,m) Substituting the expressions for n SV and n V into the expression for d 2 P {T G > t}/dt 2 gets as required.
The next step in the derivation is precisely analogous to the derivation following the lemma for Models 1 and 3 above. We have, where the expectation E t [·] and variance Var t [·] are over patients without growing or detectable metastases at time t post-resection. Note that again these equations are independent of the initial conditions.

Supplementary Material for Results
Unusual shapes including two late peaks in the hazard rate can be explained by the patient cohort being heterogeneous Hazard rates can have slightly more exotic shapes other than being primarily flat or having one maximum followed by a continual decrease (see Figure 1; in this data set [7] there is also an earlier peak (not shown); this earlier peak can be accounted for by surgery stimulating the growth of metastases [8,9]). In Figure  1, we show that a heterogeneous patient cohort made up of mixed, unclassified cancer dormancy types, which can be represented by taking the weighted linear combination of distinct parameter sets in Models 1-3, can account approximately for such hazard rates. Parameters must satisfy the inequalities in results section Maxima in hazard rates can be explained by an increase in the tumourigenicity of micrometastases and be such that the hazard rates of different dormancy types peak at different times.