Predicting regrowth of low-grade gliomas after radiotherapy

Diffuse low grade gliomas are invasive and incurable brain tumors that inevitably transform into higher grade ones. A classical treatment to delay this transition is radiotherapy (RT). Following RT, the tumor gradually shrinks during a period of typically 6 months to 4 years before regrowing. To improve the patient’s health-related quality of life and help clinicians build personalized follow-ups, one would benefit from predictions of the time during which the tumor is expected to decrease. The challenge is to provide a reliable estimate of this regrowth time shortly after RT (i.e. with few data), although patients react differently to the treatment. To this end, we analyze the tumor size dynamics from a batch of 20 high-quality longitudinal data, and propose a simple and robust analytical model, with just 4 parameters. From the study of their correlations, we build a statistical constraint that helps determine the regrowth time even for patients for which we have only a few measurements of the tumor size. We validate the procedure on the data and predict the regrowth time at the moment of the first MRI after RT, with precision of, typically, 6 months. Using virtual patients, we study whether some forecast is still possible just three months after RT. We obtain some reliable estimates of the regrowth time in 75% of the cases, in particular for all “fast-responders”. The remaining 25% represent cases where the actual regrowth time is large and can be safely estimated with another measurement a year later. These results show the feasibility of making personalized predictions of the tumor regrowth time shortly after RT.


Introduction
Diffuse gliomas are primary brain tumours originating from glial cells (oligodendrocytes and/or astrocytomas).In its 2016 classification, The World Health Organization defines four grades [1]: while the first grade gliomas are benign, second grade gliomas (or low grade gliomas, LGG) are invasive, growing at a rate of 2 to 8 mm/year [2] in diameter, but without involving metastasis or necrosis.Unfortunately, they cannot be cured by oncological treatments [3] so one needs to contain their growth as long as possible, before they transform into grade III and IV (glioblastomas) with a dramatically low survival rate.LGG are detected with magnetic resonance imaging (MRI) scans under a T2-FLAIR sequence.Since they are diffuse tumours that extend beyond the observed boundaries [4,5], the uncertainty on their size is irreducible.Classical treatments include resection (when possible), chemotherapy and radiotherapy (RT) [6].
Standard conformational radiotherapy for LGG is generally performed during 6 weeks (5 days a week) and the classical dose is around 50 Gy.Irradiation of gliomas involves a large number of physical processes [7] and its effect varies across patients.However, some general features emerge: the tumour shrinks during a period that varies between a few months and several years, before regrowing at a rate similar to the one observed before radiotherapy.
Mathematical modelling of natural and under treatment tumour growth has a long and rich history (in particular for gliomas, one can refer to the recent review [8]).For invasive tumour such as gliomas that cannot be removed by surgery, one aspect that is of special interest for clinicians is the response of tumour to treatments and in particular, radiotherapy [9][10][11][12][13].Its primary goal is to optimize treatments "virtually": for example, choosing the optimal radiation fractionation of doses [14], finding the best way to combine it to chemotherapy [15] or studying its interplay with the immune system [16].Beyond describing qualitatively the different processes at stake, the real usefulness of a model would be to predict the response of individual patients to a treatment, even before the end of the treatment.Such predictions would allow the clinician to personalise the follow-up (and the treatment) for each patient.There has been some attempts to predict tumour growth and the effects of treatments on individual patients.If purely statistical or image-based models can be used to predict glioma growth [17], mechanistic models are usually used for instance to predict the metastatic relapse in breast cancer [18], tumour growth in leukaemia and ovarian cancer [19], response of high grade gliomas to chemoradiation [20], or the patient-specific evolution of resistance in the context of prostate cancer [21].
For low-grade gliomas, individualized predictions from the tumour size dynamics and genetic characteristics, have been made for the response to a chemotherapy treatment [22].To our knowledge, such individual predictions do not exist in the case of low-grade glioma and RT.In this article, we show that it is possible to predict the evolution of LGGs under RT, for individual patients, with an approach based on a practical mechanistic model, even in the case where the number of patients is not sufficient to apply standard machine-learning techniques.
In order to be used for predictions, a model should have a limited number of parameters.Models that are too detailed are useful to describe qualitatively the tumour evolution but usually involve too many unknown parameters [23].Given the scarcity of clinical data, only a small number of parameters can be deduced.The goal here is to keep as few as possible parameters but still to capture the essential dynamics of the tumour.
In a previous work [13], we analysed a large number (43) of LGG radial evolutions under RT and proposed a physically motivated model, with 4 parameters, that fitted well all the profiles of tumour evolution during patient's follow-up.Following that work, we now try to make predictions using that model.This is challenging given the variety of possible in-vivo reactions to radiation.We have chosen to focus on the moment when the tumour stops shrinking and starts to regrow, what we call in the following the "regrowth time".This is an essential feature of the tumour dynamics for two reasons.First, the patients often ask their clinician when the tumour will regrow in order to plan some major life projects (as having a child, travelling, retiring, etc.).This would be a valuable information to improve their life-quality.Second, the the dates for the next MRIs are currently fixed and not optimal on an individual basis.By making predictions we may adjust them more precisely for personalised follow-ups.
1 Materials and Methods

Standard protocol approvals, registrations and patient consents
The study received required authorizations (IRB#1: 2021/20) from the human research institutional review board (IRB00011687).The requirement to obtain informed consent was waived according to French legislation (observational retrospective study).

The patients
We had at our disposal a set of 43 patients with LGGs who were diagnosed at the Sainte-Anne Hospital (Paris, France) from 1989 to 2000.These patients were selected according to precise criteria that are detailed elsewhere [24].In short, only adults with typical LGGs (that is, no angiogenesis and, thus, no contrast enhancement on gadolinium-T1 images), available clinical and imaging follow-ups before, during, and after RT, and RT as their first oncological treatment except for stereotactic biopsies were eligible.The external conformational RT was given using the same methodology (total dose, 50.4-54Gy; 6-week period) at 2 outside institutions.The patients had an MRI follow-up before, during, and after RT.Three tumour diameters in the axial, coronal, and sagittal planes on each MRI image with T2-weighted and FLAIR sequences were measured manually.The mean radiological tumour radius was defined as half the geometric mean of these three diameters and was measured as a function of time.The error bars for the measured mean radii were estimated by clinicians and were set to ±1 mm.From this cohort, we discarded the patients that did not have any sign of tumour regrowth at the last time point or those that had fewer than five time points in their follow-up.

The model
A biologically motivated model with the effect of RT on LGG has been presented in [13] and validated by the fits on 43 patient follow-ups.It is based on a standard diffusion-proliferation equation [25] and RT is modelled with a time-dependent death rate (κ D (t)).The evolution of the glioma cell density then follows the equation where ρ(r, t) is a function of the radius r (assuming a spherical symmetry) and time t (conventionally set to zero at the beginning of RT), D is the diffusion coefficient and κ the proliferation rate.In its most simple (thus predictive) form, the death-term is characterized by an amplitude and a characteristic time and is considered as null before RT.
Assuming that the tumour growth-rate when patients consult is already in the asymptotic state, i.e. that it evolves linearly with a speed v = √ 2Dκ, and neglecting diffusion after RT, the radius evolution can be approximated by [13] Inspired by this formula, we simplify the model eq.( 1) by proposing a purely geometrical one in the form which has 4 free parameters: R 0 , v, k, τ.We emphasize that eq. ( 4) should be considered as an ad-hoc model that cannot be related to the one obtained by solving numerically eq. ( 1) since eq.( 3) neglects diffusion.This simple geometrical model has the considerable advantage of being analytical.The role of each terms is clear and sketched in Fig 1 .It captures the 3 phases of the evolution: first the linear growth, then the exponential decay of a fraction of the tumour and therefore of its radius, third, the regrowth with the same velocity as before RT.
To test whether this model does fit our data appropriately, we construct a classical objective function as the mean squared error from the set of measured values where σ i = 1 mm.This is a 4-parameter real valued function that we minimize easily with a standard optimization algorithm [26] since the model is analytical.To obtain physical results, we impose as limits that all parameters be positive and that the radial asymptotic speed lie in the range 0.5 ≤ v ≤ 4 mm/yr [2].We then obtain the best-fit parameter values given in Table 1 and show, in Fig 2, the comparison between the data and the fitted model on a set of 20 patients who possess at least 9 data points.The agreement is excellent.Although the results are quite similar to the ones obtained in [13], we have considerably simplified the model and reduced drastically the run-time which will be useful later in making predictions.

Constraining the parameters space
We now study whether some common features appear in our best-fit parameters.No parameter displays a clearly peaked distribution.For a given patient, the expected parameters are random variables and a priori unpredictable, although within some bounds.
We now consider the correlation between the variables by computing their Pearson coefficients and show the results in Table 2.
Illustration of the analytical model describing the tumour radial evolution.Before RT (t < 0), the radius evolves linearly with the asymptotic speed v and reaches R 0 at t = 0. Then it becomes the sum of an exponential decay of amplitude R 0 − k and characteristic time τ (death term) and a linear vt regrowth.
The structure is far from being diagonal, indicating non-trivial correlations among most pairs of variables.Of particular interest is the large (k, v) correlation since it relates a quantity defined before RT (v) to a one after RT (k).
To make use of the information in the most efficient way, we first decorrelate the variables.This is performed by diagonalizing the covariance matrix 1 .From the eigenvectors, we build the transformation matrix T that projects our parameters p T = (R 0 , v, k, τ) onto an orthogonal basis where the new variables X T = (x 1 , x 2 , x 3 , x 4 ) are uncorrelated.From our data we measure the following projection matrix: and the linear change of variables is then simply Considering the important terms in the matrix, we see that the first 2 lines link essentially the size of the tumour (R 0 ) to the amplitude of the RT reaction (k).The next two ones relate in a non-trivial way, the growth speed (v) to the RT effect (k, τ).
We now consider the distribution of these new {x i=1,••• ,4 } variables that which, we recall, are mutually uncorrelated by construction.Their histograms are shown on Fig 4.
The nice feature now is that, unlike the original variables (Fig 3), the distributions are now approximately Gaussian2 .For each variable we fit the mean (µ i ) and standard-deviation σ i .
We can now build a term that contains the extra-information about the correlations among the variables in the form     We then add this term to the original χ 2 function (eq.( 5)) and perform the minimization.The constraint acts as a Bayesian prior, i.e. it includes all the a priori information R 0 1 0.17 0.46 -0.09 v 1 0.73 0.10 k 1 0.52 τ 1 Table 2. Correlation coefficients measured between the 20 bestfit parameters of our model.Since the matrix is symmetric with ones on the diagonal we only show its upper half.we have between the parameters.It has no sense to use it on the previous fits (Fig 2 ) since it was derived from them.But we have checked that the best-fits obtained using χ 2 T OT are exactly the same as the ones with only χ 2 , meaning that we are not over-constraining the parameters with the χ 2 cons term.So why add such a term?Suppose we have few data, for instance 2 measurements before RT and one after, then we have only 3 points to determine 4 parameters.Using eq. ( 8) we introduce some extra equations and the problem becomes at least technically solvable.
In the following we focus on the regrowth time, which according to our model (eq.( 4)) is It depends mostly on τ and logarithmically on the relative speed between the shrinkage due to RT (k/τ = v d ) and the intrinsic tumour growth (v).The χ 2 T OT (R 0 , v, k, τ) minimization leads to the ( R0 , v, k, τ) estimates and we use those values in eq.(10) to estimate the regrowth time.

Data validation
We first validate the procedure on our dataset by assessing the performances of our predictions with a single point after RT.
Among our patients, we choose 6 follow-ups, with at least two points before RT and enough subsequent points for the minimum of the fit to be robust (see Fig 2).We then take the points before RT and the first one just after it, and perform the constrained minimization (eq.( 9)).We obtain an estimate of t min and compare it to the one from the full fit.In order to avoid mixing the training and test samples, for each patient we rebuild the constraint on the lines of section 1.4, removing each time the patient's data from the datasets.The results are shown in Fig 5. We then only consider the red circled points consisting of all the points before RT and the first one just after, and perform the constrained minimization described in the text.The result for the model is the dashed red curve with the estimated regrowth time shown as the vertical red line.
The t min predictions for most of the patients are quite precise; they lie within a few months of the value determined with all data.For patient (13) it is slightly larger (10 months).This is an interesting case, since the point after RT is above the one before.This can be due to statistical fluctuations or to the fact that RT produces sometimes an oedema that can be misidentified as the tumour radius.However even in this case, we obtain a reasonable estimate.This shows that, at the date of the first MRI after RT, we could have guessed in most cases efficiently the regrowth time of the tumour and plan more efficiently the dates of the next MRIs.

Predictions
We now evaluate on virtual patients a strategy to estimate as soon as possible the tumour regrowth time.To this aim, we must first fix the times of the MRI measurements which are constrained in the following way.
1.Although we showed results with many points before RT (Fig 5), today's clinical paradigm is to reduce the tumour as soon as possible.We thus consider the case where the radiotherapy sessions are planned immediately after the first MRI within typically 6 months.
2. Since this is a central point, a second MRI should be performed around the RT date. 8/14 3. Fast-responders reacting within a few months, we propose to perform an MRI measurement 3 months after RT.
We will then consider the cases where the measurement times are located at t mes = [−6, 0, +3] months and test if we can still make some predictions for the regrowth time.This is a very challenging situation since we only have 3 nearby points with important relative errors.To assess statistically the performances of the prediction, we adopt a Monte-Carlo approach.For a given set of "true" parameters (R 0 , v, k, τ), we first compute the tumour radius at t mes .We then add to each point a random Gaussian noise with a σ = 1 mm standard deviation, and from these virtual measurements, estimate the regrowth time.We repeat the procedure 1000 times and consider the mean of the predictions and the 95% confidence-level interval (obtained from the [0.05, 0.95] percentiles) that we compare to the true t min value.This procedure is illustrated in Fig 6.The black curve represents a model (which is here the bestfit of patient ( 18)) with its minimum shown as the vertical black line.
One draws some Gaussian noise of σ = 1 mm at the measurement times t mes = [−6, 0, 3] months, and performs the t min estimation described in the text.This is repeated 1000 times which allows to construct the red histogram of all the t min estimates.The red vertical dashed line shows its mean value and the horizontal one the [0.05,0.95]percentile region.
We use our 20 best-fits as a representative set of "true models".We perform the Monte-Carlo study described previously for each set of parameters and compare the mean and 95% confidence-level interval of our estimated regrowth times to the true value on Fig 7.
First, we notice that 15 predictions out of 20 (75%) are good, the mean value being typically within 6 months of the true.In these cases, the guess follows roughly the true values which confirms that the method is not only driven by the constraint (which would lead always to the same interval) but also incorporates the information of the 3 measurements.Fast-responders (patients (1),( 4),( 8) and ( 11)) are correctly predicted and tend to lead to predictions under 1 year which could be the threshold to plan a next MRI rapidly (possibly 3 months later).
There are also 5 outliers out of 20 (25%) corresponding to the cases where the true regrowth times are the largest (3 − 4 years), i.e. to the slowest responders.A point at 3 months for them is much too soon to infer any information about the curvature, so that the prediction is only driven by the constraint and goes to its mean value of about 2 years.More precisely, by Taylor-expanding our model near where v d ≡ k/τ is the speed of the collapse and the curvature term is = v d /τ.For slow-responders, there is almost no curvature at 3 months, → 0 and τ = v d / diverges leading to a very broad (and even sometimes bi-modal) t min distribution.In this case, the prediction is only driven by the constraint.9/14 0 1 2 3 4 t min [yr] (0) ( Although pessimistic for the patient, the predicted value is still large (around 2 years, see Fig 7).Thus we can safely plan a next MRI 1 year after RT.We consider the case where the times for the radial measurement are at t mes = [−6, 0, 3, 12] months and perform the prediction again.The result is shown in black in Fig 8.
Unfortunately, the constraint eq. ( 8) is still pulling t min to too low values.We need to switch to a looser constraint.As is clear from eq. ( 11), the linear term, that is the best constrained, is related to the slopes measured before (v) and after (v d ) RT.In the absence of good knowledge of the curvature, we may try to relate these slopes to the regrowth time.Indeed, on our dataset, we observe a strong correlation between v d and t min (Fig 9) that we fit to a power-law The origin of this heuristic constraint remains to be understood, but we can use it to build a new estimator for t min : we fit on the data only the linear terms in eq. ( 11) in order to get v d which we transform according to eq. (12).Since this method uses a single correlation, we call it the loose constraint.We show the result of applying this procedure on the outliers in blue on  values.One may ask why not always use this constraint.As clear in the figure, the uncertainty is larger with the loose-constraint method.This is the classical bias/variance trade-off of any estimator.Although it gives indeed less biased results for outliers, the method would miss fast-responders at 3 months, since the slopes determination is then extremely noisy.On the contrary, with a point at 1 year, there is enough lever-arm to determine the slope quite precisely and take advantage from the correlation to let the data "speak for themselves".We point out that the loose-constraint method is very simple and may be used by any clinician without even a computer.First measure the slope before RT to obtain v, then the slope after RT (v − v d ) to obtain v d , and finally use eq.( 12) to predict the regrowth-time.

Discussion
We have proposed a new simple model to describe the evolution of diffuse low-grade gliomas before and after radiotherapy.It is analytical and describes in a satisfactory way the follow-ups of 20 patients with measured tumour radii before and after RT.This model has 4 free parameters, 2 before RT and 2 after, that vary for each patient.From the study of the correlation between all the parameters we proposed a way to include a prior information to any follow-up, which allows to perform predictions for the regrowth-time of the tumour rapidly after RT.From the data we had at our disposal, we showed that including this information allows to predict the regrowth time of the tumour at the very first MRI measurement after RT typically within 6 months.Using virtual patients, we have shown that is is possible to predict reasonably well the regrowth time with only one point 6 months before RT, one around RT and one 3 months after, in 75% of the cases.The remaining 25% for which our prediction is pessimistic, have all large regrowth-time ( 4 years) and may draw benefit from another measurement 1 year after RT, leading to more correct estimates.
These results assume that our database is representative of all LGG evolution and would profit from incorporating more patients' data.Similar profiles are obtained for chemotherapy treatments [22,27] and it would be interesting to redo the analysis in this case.
This work is based on a 4-parameters model which is a simplified version of a biologically motivated model.This choice can be challenged; why not use some non-parametric method that are often efficient?First, the low dataset (43 patients but in practice 20 with a sufficient number of points to inform our model) precludes the possibility of using general purpose Machine Learning techniques like Deep Neural Networks, Random Forests, Boosted Decision Trees (as described for instance in this recent review [28]), as well as Recurrent Networks dedicated to Time Series (e.g.[29]).Second, we could think of using Gaussian Processes (GP) method (e.g.[30]) that can work on small samples with some optimized kernel.We have tried it, with a squared exponential kernel and a white noise.However, by construction, outside the data input region the naive "vanilla" model converges to a constant and cannot describe the regrowth phase.To overcome this failure, one is forced to use a time dependent function of the mean which is exactly the meaning of the 4-parameters model developed in this article.This clarifies why modelling, especially based on physical arguments, is superior to all purely statistical methods.This was the key to the success of making predictions from a restricted dataset and with very few data points.
Here, we have varied the patient's population and shown that the method has the potential to make some predictions among various patients profile.The problem is different for a personalised follow-up (which is the practical clinical case) since the prediction depends on the details of the measurements (times and values).Using a Monte-Carlo Markov Chain technique, one can obtain an individualised probability distribution of the regrowth-time that can help clinicians adapt their treatment and the dates of the next MRIs.We plan to provide such a tool that will be publicly available online.
Fig 3 shows the histograms for each parameter on the 20 patients.

Fig 2 .
Fig 2. Comparison between the measured values of the tumour radius and the bestfit model for 20 patients.The points represent the measured values and the red line our model obtained by minimizing eq.(5).The abscissa represent time in years (with the origin set at RT) and the ordinate the tumour radius (in mm).The error bars on the measurements are of 1 mm.The dashed vertical red line shows the model minimum, i.e. the moment regrowth starts.
(5) corresponding to the fits shown on Fig 2.The first columns represents the patients' ID, (R 0 , v, k, τ) are the estimated parameters of the model, and the regrowth time (t min ) is derived from them.Lengths (R 0 , k) are expressed in mm and times (τ, t min ) in years.

Fig 4 .
Fig 4. Histograms of the transformed decorrelated variables The new variables are linear combinations of the fitted (R 0 , v, k, τ) parameters as described in the text.They are normalized to unit area and the result of the Gaussian fit is shown in black.

Fig 5 .
Fig 5.  Predictions for the regrowth time on real data with the first point after RT.The black dashed curve shows the best fit model using all the data points and the vertical black line shows its minimum (same asFig 2)  .We then only consider the red circled points consisting of all the points before RT and the first one just after, and perform the constrained minimization described in the text.The result for the model is the dashed red curve with the estimated regrowth time shown as the vertical red line.

Fig 6 .
Fig 6.Characterization of the regrowth time estimation with a Monte-Carlo method.The black curve represents a model (which is here the bestfit of patient (18)) with its minimum shown as the vertical black line.One draws some Gaussian noise of σ = 1 mm at the measurement times t mes = [−6, 0, 3] months, and performs the t min estimation described in the text.This is repeated 1000 times which allows to construct the red histogram of all the t min estimates.The red vertical dashed line shows its mean value and the horizontal one the [0.05,0.95]percentile region.

Fig 7 .
Fig 7.  Performances of the regrowth time estimates with 3 measurements at t mes = [−6, 0, 3] months for a set of true parameters corresponding to the bestfits of our 20 patients.Black points represent the mean of the estimates and the bars the 95% confidence-level interval.The red point is the true value associated to each best-fit for the patients labelled on the vertical axis (corresponding to the dashed lines inFig 2).Note that only the best-fit parameters of each patient are being used here.

Fig 8 .
The distributions are much better centred on the true 10

Fig 8 .Fig 9 .
Fig 8. Monte-Carlo estimates of the regrowth time with an extra point at 1 year for the 5 outliers of Fig 7 (t min > 3 years).Black bars corresponds to the 95% confidence-level intervals obtained from the standard constraint (eq.(8)) and the blue ones with the loose constraint described in the text.The black/blue points corresponds to the mean values and the red point is the true value of each model.

Table 1 .
Parameters of the least-square solutions of eq.