Identifying the mechanism for superdiffusivity in mouse fibroblast motility

We seek to characterize the motility of mouse fibroblasts on 2D substrates. Utilizing automated tracking techniques, we find that cell trajectories are super-diffusive, where displacements scale faster than t1/2 in all directions. Two mechanisms have been proposed to explain such statistics in other cell types: run and tumble behavior with Lévy-distributed run times, and ensembles of cells with heterogeneous speed and rotational noise. We develop an automated toolkit that directly compares cell trajectories to the predictions of each model and demonstrate that ensemble-averaged quantities such as the mean-squared displacements and velocity autocorrelation functions are equally well-fit by either model. However, neither model correctly captures the short-timescale behavior quantified by the displacement probability distribution or the turning angle distribution. We develop a hybrid model that includes both run and tumble behavior and heterogeneous noise during the runs, which correctly matches the short-timescale behaviors and indicates that the run times are not Lévy distributed. The analysis tools developed here should be broadly useful for distinguishing between mechanisms for superdiffusivity in other cells types and environments.


Introduction
C ell motility is an integral part of biological processes such as morphogenesis [1], wound healing [2] , and cancer invasion [3].But what are the rules that govern how cells move?Cell migration involves a multitude of organelles and signaling pathways [4] and therefore a fruitful, bottom-up approach studies correlations between cell motion and sub-cellular processes that govern motility, including surface interactions [5], integrin signaling pathways [6], or formation of focal adhesions [7].
An alternate approach with recent successes is to develop simple models at the cellular scale that can help identify a coarse-grained set of rules that govern cell migration in specific cell types.One such class of models, composed of selfpropelled (SPP) or active Brownian particles [8] has been used to make predictions about the motion of biological cells in many contexts, including density fluctuations [9], formation of bacterial colonies [10], and both confined [11], and expanding monolayers [12].
These SPP models represent each cell as a particle that moves by generating active force on a substrate, which acts along a specified vector θ.Therefore, the parameters for the model specify both the magnitude of the force as well as how the orientation of the force changes with time.Given the ubiquity and usefulness of these models, one would like to have a standard framework for extracting these parameters from experimental data for all trajectories.In the past this has often been accomplished by analyzing ensemble-averaged features of cell trajectories.
One such quantity is the time averaged mean-squared displacement (MSD), which is the squared displacement between positions r(t) and r(t + dt) averaged over all starting times t and the ensemble of trajectories.This yields the MSD as a function of timescale, (r(t + dt)) − r(t)) 2 ∝ dt α .Ballistic motion, which corresponds to a cell moving in a straight line at constant speed, corresponds to α = 2. Diffusive motion, where a cell executes a random walk with no time correlation in orientation, corresponds to α = 1.In non-active matter at low densities, thermal fluctuations generically induce diffusive behavior at long timescales.In contrast, many cell types, including T-cells [13], Hydra cells [14], breast carcinoma cells [15], and swarming bacteria [16] display super-diffusive dynamics, defined as trajectories with a MSD exponent between 1 < α < 2.
Several authors have proposed explanations for why superdiffusive migration might be beneficial in biological systems.For example, super-diffusive trajectories are well known for being the optimal search strategy for randomly placed sparse targets [17,18], and have been found in animal foraging and migration patterns in jellyfish [19], albatross, and bumblebees [20].In the context of cell biology, superdiffusive migration implies that cells are covering new areas more quickly than they would if they were executing a simple random walk.
Although super-diffusive dynamics are commonly observed in in vitro experiments, the fundamental mechanism that generates anomalous diffusion in cell trajectories remains unclear.Pinpointing the mechanism would allow biology researchers to better isolate the signaling pathways that govern these processes.
Although one might think that simply including the effects of persistent active forces generated by cells would change the long-time behavior, it turns out that standard self-propelled particle models exhibit a fairly sharp crossover from ballistic to diffusive motion, with no extended superdiffusive regime.Since SPP models are commonly used to model cells and superdiffusive dynamics are commonly observed in experiments, we would like to identify the mechanism generating superdiffusitivity to improve the ability of these models to capture cellular phenomena.
Standard SPP models include smoothly varying persistent random walkers and standard run-and-tumble particles (RTP) [21].Persistent random walkers obey the following equations of motion for the cell center of mass ri and the orientation angle θi: Reserved for Publication Footnotes ).In a standard persistent random walk, the speed v0 and the rotational diffusion coefficient Dr, which controls the strength of fluctuations in orientation, are constant.In a standard run-and-tumble model, particles are ballistic during runs, ∂tθi = 0, followed by tumbling events where large changes in orientation occur.Variations of run-and-tumble models are characterized by the distribution of times particles remain in the run state.Two different classes of modifications to SPP models have been highlighted as being able to generate super-diffusive behavior on long timescales.The first modification is a heterogeneous speed model, which draws rotational diffusion coefficients and particle speeds from distributions [15,22].While persistent random walk models transition from ballistic to diffusive behavior at one characteristic timescale, heterogeneous speed models possess a heterogeneous distribution of crossover timescales, which generates an MSD with a broad superdiffusive regime, though the system becomes diffusive on timescales longer than 1/D min r .The second modification is a Lévy walk model, which is a run-and-tumble model where particles have power law distributed run times: with P (τ ) the distribution of run times with mean < τ > for µ > 1. [23].In contrast to the heterogeneous SPP model, super-diffusivity generated by Lévy walks is not transient, so that the long-time MSD scaling exponent is constant: So which of these models is the "right" one for a given cell type?By analyzing ensemble-averaged statistics such as the MSD and the velocity autocorrelation function (VACF), one group of researchers was able to show that heterogeneous motility models matched data from breast cancer carcinoma cells [15].Another group evaluated a different ensembleaveraged quantity, the probability displacement distribution, and used that data to suggest that T-cells were undergoing Levy walks [13].We would like to better understand whether these ensemble-averaged quantities are in fact a unique identifier of the underlying mechanism for superdiffusivity.Moreover, we also seek to develop a systematic procedure for using experimental data to constrain both the appropriate mechanism and the optimal model parameters for a specific subtype.To this end, we use automated tracking software to analyze over 1000 mouse fibroblast trajectories.We demonstrate that some ensemble-averaged statistics, such as the MSD and VACF, can not distinguish between mechanisms for superdiffusivity.
In order to better distinguish, we begin with a very general model for cell dynamics.Although standard SPP models have only two fit parameters, average cell speed v0 and average rotational noise Dr, in principal a generalized SPP model could have arbitrary distributions for cell speed P (v0) and rotational diffusion P (Dr) with arbitrary correlations between them.The heterogeneity motility model from [15] is the limit of such a model with Gaussian-distributed P (v0) and P (Dr), while a Lévy walk is the limit with a constant v0 and a specialized bimodal P (Dr).Because this is such a large parameter space, we first constrain the functional form of these distributions using specific features of single cell trajectory statistics.We find that the mouse fibroblast data are consistent with runand-tumble dynamics but the run times are not power-law dis-tributed, confirming that in mouse fibroblasts the mechanism for superdiffusivity is heterogeneous dynamics and not Lévy walk statistics.The toolkit we have developed here should be useful for pinpointing the origin of superdiffusivity in many other cell types.

Mouse fibroblast cell culture
Cell motility data was collected from C3H10T1/2 mouse fibroblast cells (ATCC) cultured on a flat, gold-coated polymer substrate, prepared as previously described [24].Cell nuclei were labled with Hoechst dye and cell motility imaged by timelapse microscopy under two different temperature conditions, 4 hrs (48 frames) at 30 • C and then 20 hrs (240 frames) at 37 • C (Supplemental Method 1).The resultant motility image stacks were analyzed using the ACTIV E image analysis package to track nuclei centers-of-mass [25].

Cell trajectory analysis and particle simulations
Cell motility was characterized using statistical analysis of cell nuclei trajectories, including MSD, VACF and displacement probability distributions.Tumbling events were identified with a Canny edge detection algorithm.Additional details on cell trajectory analysis can be found in Supplemental Method 2.

Active particle simulations
This manuscript focuses on two different models for noninteracting active particles.The first model is a Lévy walk with constant particle speed v0 at all timesteps.Particles execute ballistic runs with zero rotational noise for times τ drawn from the distribution in Eq. 3 and a mean run time τ given by Eq. 4.
The generalized SPP model has particles which follow the equations of motion seen in Eqs. 1 and 2, however the parameters for each model are not constant in time.A particle is initialized with a random orientation and assigned an initial speed v0 and rotational diffusion Dr drawn from distributions . We evolve the particle position and orientation for a time τ drawn from P (τ ) = 1 τ 0 e τ /τ 0 , where τ0 is the mean run time deter-   mined by experimental data.The particle then undergoes a tumbling event across one timestep where Dr = 2π, where the value of rotational diffusion is chosen to approximate an event where the orientation is completely randomized.After the tumble a new v0 and Dr are assigned until the next tumbling event.In contrast to a Lévy walk or standard SPP model, motility parameters are varied in time to replicate the variations and changes in a biological environment.For both models, particle trajectories are constructed by numerically integrating the equations of motion using a simple Euler scheme with a timestep dt = 0.1.To compare these results to experimental we equate the simulation time unit to 4 minutes.
Finally, we note the VACF for experimental data shows a sharp dropoff across one frame due to errors in reconstructing the nuclei centers caused by imaging noise and fluctuations in dye intensity.To replicate this feature we incorporate positional noise into both models through small Gaussian fluctutations.After particle trajectories are constructed, each position is changed by a vector δ r = dr φ, where dr is drawn from a Gaussian distribution of variable width ∆ and the direction φ is chosen randomly from the unit circle.This replicates experimental error in reconstructing cell positions, and allows our model trajectories to match the mouse fibroblast data.Mouse fibroblast Lévy walk Generalized SPP Fig. 3. Goodness-of-fit (χ 2 ) as a function of scaling exponent γ.The value of γ that best collapses each data set minimizes the 2 goodness of fit between the P (ρ(t)) calculated at each timescale.Experimental data all collapse at a value of 0.5 < γ < 1, consistent with a superdiffusive MSD.

Results
Experimentally observed ensemble-averaged quantities are well fit by several existing models.Previous reports have compared models to experimental data using ensemble-averaged statistics such at the MSD and the VACF.Therefore, our first goal is to determine whether one of the existing models for explaining superdiffusive cell trajectories is a better fit to the experimental MSD and VACF data, shown by the red lines in Fig. 2.
For comparison, we simulate a Lévy walk model with dynamics given by Eqns 3 and 4, as well as a generalized SPP with no Lévy-walk behavior, described in more detail below.With the best-fit parameters, we find that both models match the data equally well.As shown in Fig. 2 (B), the velocity autocorrelation function exhibits a sharp decrease after the first frame window, due to errors that we make in reconstructing the nuclei center of mass caused by imaging noise and fluctuations in the dye intensity.Therefore, we add an additional term to the model that shifts the particle position by a Gaussian distributed variable with zero-mean and variance ∆ 2 to account for this effect.
While the mean-squared displacement and velocity autocorrelation function are standard metrics for characterizing ensembles of trajectories, they may not be ideal for studying systems with superdiffusion.In an investigation of the Lévy walk properties of T-cells, Harris et al. study a quantity that reveals structures on shorter timescales: the probability for a cell to be at a displacement r(t) at time t [13].They suggest that Levy walks can be distinguished by collapsing these probability distributions with rescaled displacements ρ(t) = r(t) t γ , with γ significantly larger than the value of 1/2 expected for a persistent random walk.As seen in Fig. 4, we find that the mouse-fibroblast data does collapse, with the best fit exponent γ = 0.69 ± 0.02 as shown in Fig. 3.The best-fit Levy walk model collapses at γ = 0.59 ± 0.03, which is above the value expected for a persistent random walk but still lower than γ for mouse fibroblast cells.Importantly, the best-fit generalized SPP model also collapses at a similar value of γ, suggesting that such a collapse is not sufficient to uniquely identify Lévy walks as a mechanism for superdiffusivity.
Moreover, the functional form of the displacement probability distribution P (r(t)) provides additional information.Fig. 4 shows that P (r(t)) for the best-fit Lévy walk model has a significantly different functional form from mouse fibroblast trajectories at small displacements, due to ballistic runs over relatively large distances.In contrast, a non-Lévy version of the generalized SPP model matches the shape of mouse fibrob-  last P (r(t)) very well, providing an indication that a non-Lévy model might be better for describing mouse fibroblast data.

Numerical models are better constrained by single-cell trajectory data
We next study single-cell trajectories.A generalized SPP model with arbitrary distributions for P (v0) and P (Dr) has an infinite number of parameters that we could never hope to constrain.As a first step to simplifying our model we constrain functional form of these distributions using experimental data.
As shown in Fig. 5 (A), we first construct a distribution of cell speeds, determined from the magnitudes of the displacement of nuclei centers-of-mass between image capture events.Our experimental data is consistent with a Gaussian distribution of cell velocities, or equivalently, a distribution of cell speeds of the form , where µv and σv are the mean and standard deviation, respectively.Therefore, we use this functional form in our generalized model.Next we estimate a distribution P (Dr) of rotational diffusion constants (Dr) from the distribution of turning angles, shown in Fig. 5 (B).Simple active Brownian systems with a single value of Dr will generate a Gaussian distribution of turning angles [21].A Gaussian distribution of rotational noise broadens this distri-bution significantly.One can show the expected turning angle distribution in this case is a modified Bessel function of the second kind with an exponential tail, consistent with the numerical simulation data given by the red line in Fig. 5 (B).We were unable to match the mouse fibroblast turning angle distribution, which is given by the blue line in Fig. 5(B) and has significant weight as the largest values of ∆θ, with any Gaussian function for the rotational noise.
This suggests that mouse fibroblast cells may have a strongly bimodal distribution of rotational noises, further supported by intermittent run-and-tumble behavior seen in videos (Supplementary Movie 1.) We choose to capture this bimodal behavior with a noisy run-and-tumble model, where cells have during runs, which are punctuated by tumbling events.We use the Canny algorithm described in the methods section to explicitly identify tumbling events, and the data points in Fig. 5(C) show the distribution of times between such events.The red line in 5(C) shows this is well-fit by an exponential distribution with with τ0 ≈ 1 hour, and so in our model the distribution of run times τ is given by P (τ ) = 1 τ 0 e −τ /τ 0 .We note that this is a strong indication that the mouse fibroblasts are not well-described by a Lévy walk model.The magenta line in Fig. 5(B the distribution of turning angles for a noisy run-and-tumble model with the parameters identified above. To confirm that the model parameters we have identified are robust, and to quantify their sensitivity, we vary model parameters around the microscopically determined values and quantify how much this changes their displacement probability distributions.Specifically, we use the linear regression goodnessof-fit parameter (R 2 ) between P (r(t)) for mouse fibroblast and generalized model trajectories to characterize each parameter configuration and identify a best-fit between our model and mouse fibroblast statistics [26].Using this method we are able to capture the functional form of P (r(t)) very well, as shown in Fig. 4. It should be noted that incorporating a similar distribution of speeds into a Lévy walk model would improve the fit seen in Fig. 4C.However the distribution of run times would remain a power law and not match the distribution of mouse fibroblast run times shown in Fig. 5C, which is fit well by an exponential.
Happily, the configuration of parameters that best matches the macroscopic P(r(t)), located at µD = 0.09, σD = 0.002, µv = 1, σv = 0.8, p = 0, τ0 = 10, is very similar to those identified from microscopic statistics, indicating that the model is consistent with experimental results.A construction of the dynamical matrix around this minima and subsequent analysis of local eigenvectors indicates that our system is most sensitive to perturbations in the mean velocity and mean rotational noise as shown in Fig. 6A

Discussion
Both Lévy walks and heterogeneous SPP models are capable of generating superdiffusive trajectories.Previous studies have focused on one model or the other in order to identify possible mechanisms for superdiffusive cell trajectories.
We show that while both types of model are equally capable of matching the large-scale ensemble averaged statistics of mouse fibroblast cells, an analysis of single cell trajectories demonstrates that Lévy walks are not consistent with this data set, despite a very good scaling collapse of the probability displacement distribution with scaling exponent γ > 1/2.Instead, a careful analysis of turning angle distributions suggests thse mouse fibroblasts exhibit heterogeneous speeds, with noisy run-and-tumble behavior.
Because superdiffusive cells are able to cover distance faster than diffusive counterparts, it would be useful to adapt the tools developed here to study many more cell types.For example, directed cell motion is known to be a signature of invasiveness in cancer cell lines [27], and it would be interesting to know if these cell types are altering the mechanisms or timescales for superdiffusion as they become more malignant.To that end, we have created a MATLAB software package for deploying these analyses on generic data sets [28], which can be used to quantify superdiffusive dynamics and distinguish between different mechanism behavior in cells and active matter.
A natural extension of our current work is interacting SPP models.While a non-interacting model can approximate our mouse fibroblast data where cells are not in constant contact, higher density cell populations, and confluent tissues will require models with steric cell-cell interactions.The effect of super-diffusion, whether generated by a Lévy walk or heterogeneity based model, could potentially alter results obtained with standard SPP models.
For example, recent work suggests that groups of cells [29] and packings of SPPs undergo jamming transitions [11,30,31].Could the addition of superdiffusive dynamics have an effect on these types of transitions?Persistent motility can alter the jamming transition -higher speeds and more persistent trajectories allows particles to explore areas of the energy landscape that were previously inaccessible [31].Similar effects are seen in shape-based models for confluent tissues [29].The inclusion of both run-and-tumble dynamics as well as varying persistence length through broadly distributed rotational diffusion coefficients in a generalized SPP model could offer an interesting mechanism for tuning jamming.
Another emergent feature of self-propelled particle models is motility induced phase separation (MIPS).Persistently moving particles create an inward oriented boundary layer that cage interior particles into a solid phase, while other cells are in a lower density gas phase outside of this boundary [30,32] and this effect has recently been implicated in generating colony formation in bacteria [33].MIPS relies on persistence length to generate this behavior.Our generalized SPP model could reinforce this effect due to relatively persistent run phases, destroy the effect due to tumbling, or perhaps alter the nature of the transition due to enhanced fluctuations, and this is an interesting direction for future study.
Density also plays a critical role in cell interactions.Many cell types exhibit contact inhibition of locomotion (CIL), where contact with another cell will either halt their motion of cause them to immediately recoil and begin moving moving in the opposite direction.It is possible that the tumbling events we see in mouse fibroblast cells are CIL events.There could also be additional interactions such as alignment between neighbors or between cells and the underlying substrate to generate flocking [8].It would be interesting to explore the effect of alignment in a generalized SPP model, to see if heterogeneity causes any significant differences in the flocking transition.
Another benefit of simple SPP models is that they can be relatively easily coarse-grained to predict large scale features of a tissue or colony [21].We have shown that a generalized SPP model is more consistent with superdiffusive mouse fibroblast cell trajectories than a Lévy walk, opening the door to a hydrodynamic coarse-graining approach for this system.
Until now, mouse fibroblasts have not been highlighted as a system with run-and-tumble behavior and therefore the biomolecular mechanisms responsible for this behavior are unknown.To begin to investigate this question, it would be useful to correlate tumbling events with the dynamics of sub-cellular features such as spatio-temporal distributions of focal adhesions [34], Golgi bodies [35], or actin waves [27].This would help us to understand which signaling networks and components of motility machinery are involved generating tumbling behavior or broad distributions of rotational diffusion.Furthermore, it might be useful to study such behavior on structured or controllable substrates [38] , to tease apart the influence of environment vs. internal circuitry on controlling these timescales.

Fig. 1 .
Fig. 1. (A) An example image of nuclei stained with Hoescht dye.The scale bar is 500 µm.These images were processed using the ACTIV E image analysis package to track nuclei centers-of-mass [25], with overlaid best-fit ellipses.(B) A typical cell trajectory with tumbling events indicated by red circles, as identified by the 2D Canny edge detection algorithm.

Fig. 2 .
Fig. 2. All of the proposed superdiffusive models are capable of capturing ensemble-averaged mouse fibroblast statistics.(A)Mean-squared displacements for mouse fibroblast cells, shown in red, as well as a Lévy walk and a generalized SPP model.Both models are able to match the mouse fibroblast MSD within the margin of error.(B) The velocity auto-correlation function Cvv as a function of time dt.There is a sharp decrease in the VACF across the first frame, due to error in reconstructing the nuclei centers-of-mass generated by imaging noise and fluctuations in dye intensity.At the largest timescales, each bin corresponds to fewer events and so error bars become large.In addition, adding positional error to simulation trajectories to match the initial dropoff in the VACF causes significant fluctuations at larger timescales.

Fig. 4 .
Fig. 4. Displacement probability distribution P (ρ), where ρ(t) is the scaled displacement r(t) t γ , for the value of γ that best collapses the data, for (A) Mouse fibroblast cells, (B) generalized SPP model and (C) Lévy walk, with colors representing 4 binned timescales from blue (small) to yellow (large).Mouse fibroblast P (ρ) is shown as a dashed red line in (B,C) for comparison with each model, showing that only the generalized SPP model is consistent with the observed data.

Fig. 5 .,
Fig. 5. (A) Distribution of mouse fibroblast instantaneous speeds calculated from cell nuclei center-of-mass displacement between image capture events (blue).The red line is a fit to the form P (v 0 ) = |v 0 | σ 2 v

2 AFig. 6 .
Fig. 6.Sensitivity analysis examining the goodness of fit of the generalized SPP model to displacement distributions in mouse fibroblasts (1 − R 2 ) as a function of model parameters.(A) Contour plot of log(1 − R 2 ) illustrates the experimental data tightly constrains a linear combination of the mean velocity µv and mean noise µ D .(B,C)The goodness of fit as a function of each model parameter while all others are held fixed.(B) r 0 is the optimal coordinate in parameter space and r − r 0 is the distance of each parameter from its optimal value.(C) A value of τ smaller than ≈ 10 is inconsistent with experimental data, but data does not discriminate between larger values of τ .
, and relatively insensitive to correlations between Dr and v0 parameterized by p (Fig 6B) as well as mean run time τ0 (Fig 6C).