Describing Directional Cell Migration with a Characteristic Directionality Time

Many cell types can bias their direction of locomotion by coupling to external cues. Characteristics such as how fast a cell migrates and the directedness of its migration path can be quantified to provide metrics that determine which biochemical and biomechanical factors affect directional cell migration, and by how much. To be useful, these metrics must be reproducible from one experimental setting to another. However, most are not reproducible because their numerical values depend on technical parameters like sampling interval and measurement error. To address the need for a reproducible metric, we analytically derive a metric called directionality time, the minimum observation time required to identify motion as directionally biased. We show that the corresponding fit function is applicable to a variety of ergodic, directionally biased motions. A motion is ergodic when the underlying dynamical properties such as speed or directional bias do not change over time. Measuring the directionality of nonergodic motion is less straightforward but we also show how this class of motion can be analyzed. Simulations are used to show the robustness of directionality time measurements and its decoupling from measurement errors. As a practical example, we demonstrate the measurement of directionality time, step-by-step, on noisy, nonergodic trajectories of chemotactic neutrophils. Because of its inherent generality, directionality time ought to be useful for characterizing a broad range of motions including intracellular transport, cell motility, and animal migration.


Mathematical Notation
notation usage δt simulation time step ∆t sampling interval t time τ time interval bold variables vectors CAPITALIZED variables random variables ρ T (t) probability density function (PDF), (the probability of measuring random variable T between the values t and t + dt) R random variable for position used in analytical models (corresponding to the PDF, ρ R (r)) Θ random variable for 2D orientation (the polar angle of the instantaneous velocity vector) overline average over time t average over ensemble or analytical PDF c = cos Θ tangent-bias correlation (averaged over ρ Θ (θ)) r experimentally measured position v instantaneous velocity v rms root mean squared speed φ turning angle (not to be confused with polar angle θ) v(t + τ ) ·v(t) tangent-tangent correlation t p persistence time t d directionality time In this appendix, three random walk models are discussed. Positions, R, are specified by a two dimensional (2D) Cartesian coordinate system with unit vectorsê x andê y (Fig. 1 C). The orientation of movement is described by polar angle θ (π < θ ≤ π) defined with respect to the positive x-axis. The random walker begins by moving with speed V 1 in a particular direction Θ 1 for a particular time T 1 , before reorienting and moving with a new speed V 2 , in a new direction Θ 2 , for another time period given by T 2 . This process continues and is altogether described by speeds V n , angles Θ n , and step durations T n (n = 1, 2, ...), which are random values of speed v, polar angle θ, and time t, respectively. Each random variable is associated with a probability density function (PDF), ρ V (v), ρ Θ (θ), and ρ T (t). For each random walk model described in this appendix (SI Appendix A), we define the relevant variables, parameters, and distributions, and use these to determine the ensemble averaged squared displacement (EASD), R 2 (t) , and log-log EASD slope, β(t). From β(t), we derive the model specific meaning of directionality time, t d , the measure that is central to this work. A universal, biased random walk model invariant fit function is derived to measure t d from EASD data.

Model 1: Drift Diffusion (DD)
A walker undergoing a Diffusion in 2D (without drift) obeys the following rules: 1.
Stepwise Position: The position after n steps is given by the random variable where δ(...) is the Dirac delta function, λ is the reorientation frequency, and the other terms are defined in the Appendix A Introduction. The notion for average step duration, λ −1 , is used for consistency with the models below. The addition of drift to a diffusion process adds an additional term to the stepwise position equation. Without loss of generality, one can define a drift velocity along the x-axis, u = uê x .
Subsequently, Rule 1 would be redefined as 1'. Stepwise Position: The position after n steps is given by the random variable One example of DD would be the motion of an object trapped at an air-water interface in the presence of a surface current (air or water). If the object were charged, drift could also be induced by applying a uniform electric field (i.e. carrier drift) [1].
The ensemble averaged squared displacement (EASD) of a DD in d dimensions has been shown to be [2] , that asymptotes to v rms (∞) = u. Differentiation in log-log coordinates gives the log-log EASD slope where time constant, t 0 = 2dD u 2 , incorporates both model parameters D and u. Defining directionality time as the time when β = 3 2 provides a time scale where the migration transitions from diffusive to directionally drifting. This definition gives t d = t 0 or The greater the diffusion constant and/or the lower the drift velocity, the more time is required to observe a directional bias.
Stepwise position: The position after n steps is given by the random variable 2. Random step length: L j is drawn from a distribution with PDF, ρ L (l).

Constant track speed:
It can be shown that the EASD of a 2D-SBRW after n steps is given by [4] for a derivation). With a directional bias PDF that is symmetric about θ = 0 (c.f. rule 3), sin Θ = 0 and the EASD simplifies to Note that the motion is diffusive when n is small ( R 2 n ∼ n) and directional when n is large ( R 2 n n 2 L cos Θ 2 ). The proportionality constant of directional motion, L cos Θ 2 , is nonzero if and only if the combined PDF governing L and Θ generates directional motion. This term will be discussed in detail below. Also note that this step by step random walk has no persistent directionality at low n because the direction of motion changes with every step.
In order to calculate log-log EASD slope, β(t), an average track speed v is defined and the relation n ≈ vt L is used to transform R 2 n → R 2 (t) . In the general time representation, this is a model of a biased random walk (BRW): with an RMS speed that asymptotes to Differentiating in log-log coordinates gives the log-log EASD slope . Note that β BRW has the same functional form as β DD (Eq. A2). Only the mathematical constants have changed. Analogous to the DD treatment, defining directionality time This result simplifies further when the orientation angles {Θ j } are selected independent of the step lengths {L j }. Thus L cos Θ = L c, where c = cos Θ is called tangent-bias correlation because it describes the correlation of the random walk orientation (tangent vectors) to the direction of bias (θ = 0). Squared tangent-bias correlation, c 2 , ranges from 0, corresponding to no orientational bias ). Using this result, the asymptotic RMS speed (Eq. A7) simplifies to v rms = |c|v (A10) and the directionality time becomes This generalized directionality time equation is more easily understood by considering two examples. The first example shows the equivalence between the directionality time equations derived for DD and the 2D-SBRW. Consider the case of a drift diffusion-like process where T 2 = T 2 = δt 2 and the polar angle where u is drift speed and v is stepping speed). This gives a tangent-bias correlation of c = u 2v , and directionality time: where δt is the stepping time. Rewriting the instantaneous speed as v = δl δt where δl is the step length, and taking the continuum limit (δt, δl → 0) gives Taking the limit of the first term is the mathematical definition of the diffusion constant, D [5]. Taking the limit of the second term gives 0, leaving t d ∝ D u 2 . Up to a proportionality factor that appears because we did not model DD with an exact PDF, Eq. A12 shows the equivalence between Eq. A11 and Eq. A3, and that directionality time depends only on the amount of diffusivity (D), and the external parameter (or more generally, the set of external parameters) that drive directional bias (u).
The second example is a case with a variable stepping time. Consider the case where the probability of measuring a step length T between t and t + dt is given by a Poissonian PDF, ρ T (t) = 1 tp e − t tp . Then, Importantly, directionality time is independent of speed v. It depends only on the average step duration t p and the extent to which the orientation is biased when a reorientation occurs (given by c 2 ). Directionality time decreases when either the reorientation rate (1/t p ) or orientational bias (c 2 ) increase. In particular, scales from 1 at maximal bias, to ∞ at no bias. It may appear odd that t d → t p (the average time of one step) when the system is perfectly directional (c → 0). However, this is no more than a subtlety of stepping random walks. Based on the way EASD was converted from the stepping number domain to the continuous time domain, positions are only allowed to change in increments of T = t p .
Therefore, the minimum time to determine movement is directionally biased will always be greater than or equal to t p . No information can be gained from a random walker that has not yet taken any steps.
Since each step is accompanied by reorientation, this model cannot be used to derive a log-log EASD slope equation that accounts for short time scale directionality known as persistence. In order to consider the relation between directionality time and persistence, we consider a continuous time random walk model in the following section.
Stepwise position: The position after n steps is given by the random variable To allow for persistence, a PDF for the position of the PBRW object, ρ X (x, t), is derived in continuous time from a partial differential equation (as opposed to from the stepwise position). It can be shown (see Refs. [2,6]) that PDFs of a 1D-PBRW are solutions of the biased telegrapher equation Step one, multiply both sides of Eq. A14 by x.
Step two, integrate by parts over the entire x-axis. Assuming the position distribution ρ X (x, t) and its first spatial derivative ∂ρ X ∂x tend to zero as x → ±∞, integration by parts gives the ordinary differential equation where parameters λ ± ≡ λ l ± λ r are introduced to simplify the equations. The frequency λ − defines the degree of directional bias: leftward when λ − < 0 and rightward when λ − > 0. The frequency λ + is inversely related to the average step duration, t p = 1 2 λ −1 l + λ −1 r , known as persistence time [2].
The EASD, X 2 (t) = ∞ −∞ x 2 ρ X (x, t)dx, is calculated analogously to the EAD. Multiplying both sides of the biased telegrapher equation (Eq. A14) by x 2 and integrating gives an ordinary differential equation containing X(t) and X 2 (t) terms. Substitution of X(t) using Eq. A16 gives the ordinary differential equation Solving for X 2 (t) using the initial conditions X 2 (t) | t=0 = 0 and d X 2 (t) dt | t=0 = 0 gives the EASD where t ≡ λ + t is nondimensionalized time and c = λ− λ+ is the tangent-bias correlation as described in the previous section (c.f. rule 3). When there is bias (c 2 > 0), X 2 (t) → c 2 v 2 t 2 in the large time scale limit (t → ∞), corresponding to directional motion with an RMS speed asymptote where t p ≡ λ −1 + = 1 2 T (l) = 1 2 T (r) is the so-called persistence time, half the average time the walker moves in one direction before switching to the other.
Differentiating Eq. A18 gives the log-log EASD slope These curves are plotted for multiple values of tangent-bias correlation c in Fig. 2 A (solid curves). Of particular interest is the form of a directionality time that would be calculated from the log-log EASD slope in Eq. A21. Persistence over short time scales causes a β PBRW (0) = 2 and instead of a monotonic increase of β PBRW (t) from 1 to 2 as with β DD (t) and β BRW (t) (Eqs. A2 and A8, respectively), β PBRW dips towards a value of 1 before eventually asymptoting back towards 2 as t → ∞. At sufficiently large time scales, the constant terms and terms with decaying exponentials in Eq. A21 becomes negligible and Fig. 2 A, dashed curves). This is known as the β BRW -model. The convergence time scale above which the β PBRW (t) is approximated by the β BRW -model in Eq. A22, denoted t BRW , is defined as the time above which < 5% (Fig. 2 B, solid black curve). For the purpose of fitting the β BRW -model to PBRW data, choosing to fit data at times above the convergence time scale is important. Fitting at times above 10λ −1 + will produce good fits because 10λ −1 + > t BRW for all but the most unbiased motion, in which case directionality time has little meaning regardless.
Notably, at time scales t > t BRW , β PBRW has the same functional form as the β DD and β BRW (Eqs. A2 and A8, respectively). Analogous to both DD and the 2D-SBRW, defining directionality time as the time In the interpretation that directionality time is a proxy for the observation time necessary to distinguish directionality from randomness, we redefine directionality time to vanish in the 1 2 < c 2 < 1 domain, signifying directional motion at all time scales: The 1D-PBRW (model 3) can be modified to account for measurement errors. Assuming positions along the PBRW are sampled from Normal distribution with standard deviation σ m (see Fig. 3 A for diagram is 2-D), the change to EASD is: X 2 (t) → X 2 (t) + 2σ 2 m , where X 2 (t) is given in Eq. A18. The log-log EASD slope becomes where the term added with respect to Eq. A21 is marked by an underbrace and defined as 2 for brevity.
As with the 1-D PBRW without measurement error, the 1-D PBRWwE also converges to β BRW (t). The convergence times above which the difference between β PBRWwE and β BRW is less than 5% are shown in S2A Fig. (solid curves) for different values of the constant . These convergence times increase with 2 (i.e. increasing centroid measurement error and/or decreasing instantaneous speed). When 2 is sufficiently large, the convergence time is estimated by comparing 2 to c 2 t 2 (see the denominator of Eq. A24), giving vrms(∞) = 4.5 λ+c (S2A Fig., dashed curves), where v rms (∞) = |c|v as derived in Eq. A19. The factor of 4.5 was chosen so the convergence time scale corresponds to the 5% difference threshold. A larger factor can be chosen to lower the percent difference threshold. This convergence time scale can be used to set a minimum fit time (c.f. τ min in Fig. 3 D). Fitting the β BRW -model at times above the minimum fit time significantly decouples directionality time measurements from measurement error. Note that t σm is sampling interval independent.
As the dimensionality of the random walk increases, so does the effect of measurement error. Specifically, m where d is the number of dimensions. Also taking into the account the convergence time t BRW due to persistence over short time scales, the generalized convergence time scale, known as the minimum fit time t min , is the greater of: both of which are readily measured and sampling interval independent. Fitting experimental data to the β BRW -model at times above t min decouples measurements of directionality time, t d , from the effects of persistence and measurement error.
Above we have shown that directionality time can be derived as long as t d is not too small compared to t σm such that information about directionality is hidden beneath large measurement errors. Fitting at times above t min ensures that the directionality time measurement decouples from measurement error and persistence time. One final case to explicitly consider is that when directionality time is large compared to duration of data collection. One potential concern is the possibility that fitting the β BRW -model renders a measure of the transition from pseudo random kinematics caused my measurement noise, to persistent but not necessarily directed motion. This case corresponds to a log-log MSD slope estimated by taking Eq. A24 and setting c = 0 (i.e. t d much larger than the time scale of data collection). The corresponding asymptotic form of log-log MSD slope is which is a β = 0 to β = 1 transition that is not captured by the β BRW -model. When Eq. A26 is fit to the β BRW -model at times above t min , the directionality time measurement approaches infinite, therefore proving that the β BRW -model does not erroneously couple to a time scale other than that associated with directional bias. measurements, EASD is increased with respect to our analytical models where σ 2 (t) ≡ 2dσ 2 m + σ 2 p (t) incorporates all measurement and parameter uncertainties, and it has been assumed that each of these uncertainties is random and does not change R(t) .
Ensemble averaged data is often noisy. As more migration trajectories are tracked, the standard errors of ensemble averaged statistics decrease. However, ensemble averaged statistics such as EASD are often still too noisy to fit with many experimental designs. Statistical noise can be reduced by first time averaging the data before calculating the ensemble average. The time averaged squared displacement (TASD), for example, is given by where i is an integer path index (i.e. path 1, 2, 3,...) and t dur,i is the total duration of the i th path. The difference between TASD and EASD is described by where the factor ξ i is known as ergodicity. Mean squared displacement (MSD) is calculated by ensemble averaging TASD (averaging over all trajectories, i) For β-curves measured from MSD instead of EASD is Rewriting R 2 (τ ) in terms ξ, R 2 BRW , and σ 2 using Eqs. B6 and B3 gives .
The log-log MSD slope can thus be expanded into its final form where β BRW (τ ) is given by Eq. B1, β σ (τ ) is given by and β ξ (τ ) is given by The variance correction β σ (τ ) becomes negligible as time increases unless σ 2 (τ ) is of order τ 2 or greater such that σ 2 (τ ) → ∞ faster than R 2 (τ ) BRW → ∞. At large time scales, the position variance is dominated by parametric variance and not intrinsic variance (c.f. Eq. B2). Therefore, the standard deviations of the mean distance traveled can be used to calculate σ 2 (τ ) in the variance correction term, β σ (t). At large time scales, the variance correction term can be completely determined from the experimental data by using Eq. B3 to substitute R 2 (τ ) BRW for R 2 (τ ) − σ 2 (τ ) into Eq. B10, giving Here, all terms that can be measured experimentally have been moved to the left of the equality, while the fit model has been moved to the right. Eq. B12 is only valid at sufficiently large time intervals when the effects of persistence on β(τ ) are negligible (i.e. β PBRW (τ ) ≈ β BRW (τ ), c.f. Eqs. A21 and A8), and when the effect of implicit variance is small relative to the combined measurement and parametric variances , c.f. Eq. B2). The minimum fitting time interval τ min defined in Eq. A25 can be used to satisfy these conditions.
As a final note, calculating the ergodicity correction term β ξ (τ ) from experimental data is not always helpful because its uncertainty is often relatively large and ergodicity cannot be time averaged to reduce statistical noise. Therefore, the correction term β ξ (τ ) should only be calculated if the system is believed to be nonergodic (i.e. when the parameters that describe an experimentally measured migration change significantly over the duration of that migration). If the data is nonergodic and β ξ (τ ) is unacceptably noisy when calculated from experimental data, sometimes truncating the migration paths to equalize (and/or shorten) their durations will make the data nearly ergodic. This was the procedure used in order to ignore the ergodicity correction term (β ξ (τ ) = 0) in the chemotactic neutrophil example from the main text (Fig. 4).
If the data is still nonergodic after truncation, a quick and dirty first approach of choosing β ξ = a, where a is a constant such that lim τ →∞ [β(τ ) − a] = 2, may reasonably capture the nonergodic correction.
A more precise calculation of β ξ (τ ) can be achieved by simulating biased random walks as follows.
First, make an initial measurement of the persistence time and directionality time from the truncated migration paths. Based on these initial measurements, parameters t p and κ = κ(c 2 ) can be chosen for a corresponding 2D-PBRW simulation that will recapitulate the experimental data. Second, calculate the ensemble averaged instantaneous speed (EAIS) of the truncated migration paths over their entire duration 0 ≤ t ≤ t dur . When the data is nonergodic, there will typically be an overall acceleration or deceleration associated with the EAIS. Third, run 2D-PBRW simulations with instantaneous speeds, V (t), randomly selected from the experimentally measured EAIS distribution each time a reorientation event occurs. Simulate enough trajectories so that a smooth ensemble averaged ergodicity curve as a function of time interval, ξ(τ ), can be calculated. Finally, calculate ergodicity correction term β ξ (τ ) using Eq. B11.