A hierarchical Bayesian model for understanding the spatiotemporal dynamics of the intestinal epithelium

Our work addresses two key challenges, one biological and one methodological. First, we aim to understand how proliferation and cell migration rates in the intestinal epithelium are related under healthy, damaged (Ara-C treated) and recovering conditions, and how these relations can be used to identify mechanisms of repair and regeneration. We analyse new data, presented in more detail in a companion paper, in which BrdU/IdU cell-labelling experiments were performed under these respective conditions. Second, in considering how to more rigorously process these data and interpret them using mathematical models, we use a probabilistic, hierarchical approach. This provides a best-practice approach for systematically modelling and understanding the uncertainties that can otherwise undermine the generation of reliable conclusions—uncertainties in experimental measurement and treatment, difficult-to-compare mathematical models of underlying mechanisms, and unknown or unobserved parameters. Both spatially discrete and continuous mechanistic models are considered and related via hierarchical conditional probability assumptions. We perform model checks on both in-sample and out-of-sample datasets and use them to show how to test possible model improvements and assess the robustness of our conclusions. We conclude, for the present set of experiments, that a primarily proliferation-driven model suffices to predict labelled cell dynamics over most time-scales.


Derivation of 'zeroth-order' continuous model
To derive the continuous approximation we first defined the position $x$ as a continuous coordinate passing through the discrete cell indices. For example, $x = 0$ denoted the coordinate of the cell labelled '0' (base of the crypt), while $x = 0.5$ was the location halfway between the cell labelled '0' and that labelled '1'. Sample locations consisting of space-time pairs were denoted by $s = (x_s, t_s)$. Then, for sample locations $(i, t)$ corresponding to cell indices and arbitrary times, we matched the discrete model and continuous model using
$$L(i, t) = p(l_i(t) = 1),$$
i.e. $L(i, t)$ served as the parameter for a single measurement modelled as a Bernoulli trial at that sample location (as in the above Measurement model section).
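As a minimal sketch of the role played by L(i, t), the following Python fragment treats a set of 0/1 label observations at one sample location as independent Bernoulli trials with a common parameter. The function names and the observations are hypothetical and chosen purely for illustration; they are not taken from the experiments.

```python
import math
import random


def bernoulli_log_likelihood(labelled, L_value):
    """Log-likelihood of 0/1 label observations at one sample location (i, t),
    each modelled as a Bernoulli trial with parameter L(i, t) = L_value."""
    if not 0.0 < L_value < 1.0:
        raise ValueError("L_value must lie strictly between 0 and 1")
    n_labelled = sum(labelled)
    n_unlabelled = len(labelled) - n_labelled
    return (n_labelled * math.log(L_value)
            + n_unlabelled * math.log(1.0 - L_value))


def simulate_labels(L_value, n_measurements, seed=0):
    """Simulate hypothetical 0/1 label measurements with success probability L_value."""
    rng = random.Random(seed)
    return [1 if rng.random() < L_value else 0 for _ in range(n_measurements)]


# Hypothetical observations: three labelled cells and one unlabelled cell.
observations = [1, 0, 1, 1]
ll_at_sample_fraction = bernoulli_log_likelihood(observations, 0.75)
ll_elsewhere = bernoulli_log_likelihood(observations, 0.5)
```

In this sketch the log-likelihood is largest when the parameter equals the observed labelled fraction (here 0.75), which is the sense in which the measurements inform L(i, t).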
Next, the discrete dynamics of $p(l_i(t) = 1)$ were 'transferred' to the continuous $L(x, t)$ dynamics. In particular, since $L(x, t)$ was taken to be a smooth function, we made the correspondence
$$p(l_{i-1}(t) = 1) = L(i - 1, t) = L(i, t) - \Delta x\, L_x(i, t) + \frac{(\Delta x)^2}{2} L_{xx}(i, t) - \dots,$$
where $\Delta x = i - (i - 1) = 1$ was the normalised cell length and we also conditioned on knowledge of the spatial derivatives at $i$, $L_x(i, t) = \partial L(i, t)/\partial x$ etc. The continuous spatial field effectively interpolated between (i.e. internal to) points of the discrete grid, making use of local derivative information. Substituting the above Taylor series, and similar expressions, into the discrete Markov equation led to an equation for the continuous model in which, for completeness, we also retained higher-order terms in $\Delta t$. We similarly assumed the existence of smooth functions $k(x, t)$ and $v(x, t)$ that satisfied the corresponding discrete relations. This assumption is discussed further in the Results section.
We obtained 'closure' for the continuous model by keeping only the lowest-order terms in both time and space, and further asserting that the equation structure obtained held for all continuous $x$ and not just discrete $i$ (this could also be motivated by an assumption of grid translation invariance). This led to the advection equation
$$\frac{\partial L}{\partial t} + v \frac{\partial L}{\partial x} = 0.$$
When we incorporated cell death, with discrete rates $d_i$, this led to the same equations with $k$ replaced by $k - d$, where $d(x, t)$ was defined similarly to $k(x, t)$.
Hence we interpreted $k$ in the above as the net cell production rate (which could therefore be negative).
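The link between a grid of cell positions and an advection equation can be illustrated numerically. The sketch below assumes the standard advection form, ∂L/∂t + v ∂L/∂x = 0, with a constant velocity v chosen purely for illustration (in the model above v varies in space and time), and uses a first-order upwind difference with ∆x = 1. This is an illustrative discretisation and not necessarily the discrete Markov equation used in the main manuscript. All names and parameter values are hypothetical.

```python
def step_profile(n_cells, front):
    """Initial labelled fraction: 1 for cell indices <= front, 0 otherwise."""
    return [1.0 if i <= front else 0.0 for i in range(n_cells)]


def upwind_advect(L0, v, t_end, dt=0.01):
    """Integrate dL_i/dt = -v * (L_i - L_{i-1}) with forward Euler (dx = 1).

    The value at index 0 is held fixed and acts as the inflow boundary.
    Assumes a constant, non-negative velocity v (an illustrative simplification).
    """
    if v < 0 or v * dt > 1.0:
        raise ValueError("need 0 <= v * dt <= 1 for a stable upwind step")
    L = list(L0)
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        new_L = L[:]
        for i in range(1, len(L)):
            # Labelled fraction is carried up the grid from cell i-1 to cell i.
            new_L[i] = L[i] - v * dt * (L[i] - L[i - 1])
        L = new_L
    return L


# Hypothetical set-up: 60 cell positions, labelled front initially at index 10.
n_cells, front, v, t_end = 60, 10, 0.5, 20.0
L_start = step_profile(n_cells, front)
L_end = upwind_advect(L_start, v, t_end)

# Total labelled fraction gained, i.e. how far the front has moved on average.
front_shift = sum(L_end) - sum(L_start)
```

Under these assumptions the exact continuous solution is the initial profile translated by v t, so the front should move v * t_end = 10 cell positions. The discrete profile moves the same distance on average but is smeared by the first-order scheme, which is one way to see the effect of the higher-order terms that are dropped at closure.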

Supplementary visualisations of posterior distributions
Figs A, B and C present alternative visualisations of the posterior distributions for proliferation rates under healthy, Ara-C-treated and recovering conditions, respectively. They show the same results as Figs 3-5 in the main manuscript. These plots were produced using the package 'corner.py' described in [1].
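For readers unfamiliar with this style of plot, the following is a minimal sketch of how posterior samples can be passed to the `corner` package. The samples are synthetic and the parameter names are hypothetical; they are not the posterior samples behind Figs A-C, and the plotting options shown are one reasonable choice rather than the settings actually used.

```python
import random


def synthetic_posterior_samples(n_samples=2000, seed=1):
    """Draw made-up 'posterior' samples for three hypothetical rate parameters.

    These are NOT results from the experiments; they only give corner
    something to plot.
    """
    rng = random.Random(seed)
    means = (0.10, 0.25, 0.05)
    sds = (0.02, 0.04, 0.01)
    return [[rng.gauss(m, s) for m, s in zip(means, sds)]
            for _ in range(n_samples)]


def save_corner_plot(samples, labels, path):
    """Plot marginal and pairwise posterior summaries and save them to path."""
    # Imported here so the sample generator runs without plotting packages.
    import numpy as np
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, suitable for scripts
    import corner

    figure = corner.corner(
        np.asarray(samples),
        labels=labels,
        quantiles=[0.16, 0.5, 0.84],  # dashed lines at these marginal quantiles
        show_titles=True,
    )
    figure.savefig(path)
    return figure


samples = synthetic_posterior_samples()
labels = ["k_1", "k_2", "k_3"]  # hypothetical parameter names
# Uncomment if numpy, matplotlib and corner are installed:
# save_corner_plot(samples, labels, "corner_sketch.png")
```

Each diagonal panel of the resulting figure shows a marginal distribution and each off-diagonal panel a pairwise joint distribution, which makes correlations between rate parameters visible at a glance.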

Typical sample from intestinal epithelium
The companion paper [2] contains full details of the experimental procedures.
In Fig D below we reproduce, for reference, a typical section obtained from the intestinal epithelium in these experiments.

Fig D.
Typical section obtained during the experimental procedures described in the main manuscript. These are also detailed more fully in the companion paper [2].

Interpretation of statistical evidence
We have described above how mechanistic or causal assumptions relate to assumptions of structural invariance under different scenarios. In order to interpret the results that follow, however, we also required an interpretation of the 'statistical evidence' that a set of measurements provided about parameter values within a fixed model structure. This proved a surprisingly controversial topic and we encountered continuing debate about fundamental principles and definitions of statistical evidence [3][4][5][6][7].
Following our conditional modelling approach, we decided to adopt the simple (yet quite generally applicable) principle of evidence based on conditional probability: if we observe $b$ and $p(a \mid b) > p(a)$ then we have evidence for $a$. A 'gold-standard' theory of statistical evidence starting from this premise has been developed and defended recently by Evans in a series of papers (summarised in [6]). Besides simplicity, a nice feature of this approach, which we used below, is that it can be applied both to prior and posterior predictive distribution comparisons such as $p(y \mid y_0) \overset{?}{>} p(y)$, and to prior and posterior parameter distribution comparisons such as $p(k \mid y_0) \overset{?}{>} p(k)$. This approach is not without criticism, however (again, see [3][4][5][6][7] for an entry point to the ongoing debates).
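The principle can be made concrete with a toy calculation. The sketch below assumes a single Bernoulli parameter theta with a conjugate Beta prior, which is far simpler than the hierarchical model used here, and compares posterior and prior densities pointwise. The ratio of the two is what Evans calls a relative belief ratio, and values above one indicate evidence in favour of that parameter value. All names and numbers are hypothetical.

```python
import math


def beta_pdf(theta, a, b):
    """Density of a Beta(a, b) distribution at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm
                    + (a - 1) * math.log(theta)
                    + (b - 1) * math.log(1 - theta))


def relative_belief_ratio(theta, a, b, successes, failures):
    """Posterior density divided by prior density at theta.

    Assumes a Beta(a, b) prior and Bernoulli observations, so that the
    posterior is Beta(a + successes, b + failures).
    """
    prior = beta_pdf(theta, a, b)
    posterior = beta_pdf(theta, a + successes, b + failures)
    return posterior / prior


# Hypothetical example: uniform prior, 7 labelled cells out of 10 observed.
a, b = 1.0, 1.0
successes, failures = 7, 3
grid = [i / 100 for i in range(1, 100)]
supported = [theta for theta in grid
             if relative_belief_ratio(theta, a, b, successes, failures) > 1.0]
```

Here `supported` collects the grid values for which the data increased the density, i.e. for which $p(\theta \mid y_0) > p(\theta)$. Plotting the prior and posterior densities over the same grid gives the kind of visual comparison that we relied on in practice.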
Another notable feature of the interpretation of statistical evidence that we adopted below is that we emphasised the visual comparison of various prior and posterior distributions, rather than adopting arbitrary numerical standards ([8] advocates a similar 'movie strategy' for the interpretation of statistical evidence and inference procedures, while [9][10][11][12] similarly emphasise the benefits of graphical visualisation methods in statistics).