Run and tumble navigation can be very fast

The ability to navigate environmental gradients is often critical for survival. When gradients are shallow or noisy, most organisms move by alternating straight motion (runs) with random reorientations (tumbles). Navigation is achieved by transiently reducing the probability to tumble when attractant signal increases. One drawback of this strategy is that occasional runs in the wrong direction reduce progress up the gradient. Here we discovered a positive feedback regime inherent in this strategy that strongly mitigates this problem. In an attractant field, motion up the gradient reduces tumble probability, which further boosts drift up the gradient. This positive feedback can drive large fluctuations in the internal state of the organism away from its mean, resulting in long runs in favorable directions but short ones otherwise. In this new regime the organism achieves a"ratchet-like"gradient climbing behavior unexpected from mean field theory, and drift speeds much faster than previously believed possible.


Introduction
Navigating gradients requires detecting the direction in which signals of interest increase. When the size of the navigator is small compared to the length scale of the gradient, local directional information becomes unreliable. In this case, cells [1][2][3][4] , worms 5, 6 , larvae 7 , and robots 8 , adopt run-and-tumble strategies similar in many ways to the well characterized chemotactic behavior of Escherichia coli. During runs the organism moves approximately straight at a constant speed, which enables it to collect information about the direction of the gradient. During tumbles the speed is nil while the organism reorients randomly to be able to explore a new direction (Fig. 1a). Rapid transduction of the signal to the motility apparatus is accomplished via an internal state variable, F , which in turn controls the probability to run r(F ). Following an increase in attractant the probability to run increases transiently, then adapts back due to a negative feedback (Fig. 1b). Using this approach an organism can perform a biased "random walk" by lengthening runs when it moves in the desired direction, and progress is made by averaging motion over many runs and tumbles. The efficiency of this process depends on the structure of the environment as perceived by the organism, the memory length of the organism, and the physical constraints on the movement of the organism.
We encapsulate these factors into three timescales to reveal a diversity of dynamical regimes (Fig. 1). The first time scale of interest is that of the change in perceived signal. For simplicity we will consider static environments, in which change in perceived signal is solely due to the motion of the organism (Fig. 1a). We define the timescale of the perceived environment, t E , as the perceived gradient length scale divided by the run speed. We will show that t E varies inversely with the strength of the positive feedback. The second important timescale is the duration of the memory of the organism, t M , that corresponds to the relaxation time of the adaptation mechanism (Fig. 1b). We will show that t M varies inversely with the strength of the negative feedback mediated by adaptation. The interplay between these two feedbacks -positive and negative -will be instrumental in determining the dynamics of gradient tracking. Finally, the direction decorrelation time, t D , quantifies how long the organism can move before losing its original direction due to tumbling or other physical constraints such as rotational Brownian motion (Fig. 1a). In effect, t D limits the usable duration of runs and memory. From these three timescales one can define the ratios τ E = t E /t M and τ D = t D /t M . The first one quantifies the relative strength of the negative and positive feedback. The second one measures how well the memory length t M matches the direction decorrelation time t D .
Classical studies by Keller & Segel and others [9][10][11][12][13][14][15][16][17][18][19] considered what happens when adaptation is fast (τ E 1). In this case the negative feedback dominates the dynamics. The internal state of the organism F exhibits small fluctuations around its mean F , which in turn barely deviates from F 0 , the adapted value of F in the absence of gradient. When τ E 1, the negative feedback is not as strong but still dominates over the positive feedback, keeping the fluctuating value of F close to its mean. However, F now deviates from F 0 because adaptation is not fast enough to fully compensate the average rate of increase of the signal introduced by the average drift up the gradient 20,21 . Related phenomena were beautifully demonstrated by stimulating cells with exponential ramps 22,23 and sinusoidal signals 24 . When the negative feedback dominates, motion resembles a biased random walk, with a moderate drift speed resulting from averaging over many runs of similar length, several of which are oriented in the wrong direction.
In this paper we show that there is another regime of the run-and-tumble gradient ascent strategy wherein much higher drift velocities can be achieved. This regime takes place when the positive feedback dominates over the negative feedback, τ E 1. A significant difference from the classical studies emerges because the distribution of the internal state F develops a long tail, which precludes the use of mean field approaches. The gradient-climbing strategy becomes highly effective so that, in a sense, it resembles a ratchet: the organism cruises along favorable directions and quickly turns away from unfavorable ones. Performance is Figure 1 | Different dynamical regimes of bacterial chemotaxis. (a) Run and tumble trajectory X(t) of a cell climbing up a gradient of attractant ∇C. θ is the angle between the direction of motion and that of the gradient. During runs (tumbles), speed is v 0 (zero) and rotational diffusion is D R (D T ), resulting in the direction decorrelation time t D . (b) Input, internal, and behavioral variables: a sudden step in the external concentration ∆C is amplified by the receptors into a change ∼ N ∆C/C in their free energy F above the adapted level F 0 before the negative feedback mechanism brings it back at rate t −1 M . The motors further amplify the signal by a factor H to cause a change ∼ r(1 − r)N H∆C/C in the probability to run. (c) Drift velocity V D of simulated E. coli cells swimming in static exponential gradients as a function of τ E = t E /t M and τ D0 = t D (r 0 )/t M . Black dashed line: horizontal: τ D0 = 1 fixed while τ E = 0.1 (green dot), τ E = 1 (blue dot), and τ E = 3 (red dot); vertical: τ E = 0.1 fixed while τ D0 = 1 (green dot) and τ D0 = 0.1 (orange dot). White dots: sampling of realistic parameters for 16, 000 individual E. coli cells from a wild type isogenic population 56 near the bottom of the gradient. Black dots: the same cells near the top of the gradient. Black/white contour lines: same as the black/white dots but for D R and D T 8 times smaller, i.e. assuming cells are twice as long (see Methods). (d) Main result: classical (red) vs. rapid climbing (green) trajectories. 5 sample trajectories (thin solid curves) and the mean over 10 4 sample trajectories (thick lines) of position x = X/(v 0 t M ) as a function of time τ = t/t M for cells in a positive-(green τ E = 0.1) and negative-dominated (red τ E = 3) regime. Colors correspond to the green and red dots in a. See Methods for simulation details.
highest when the duration of the memory matches the direction decorrelation time τ D 1. (We consider cases corresponding to the colored dots in Fig. 1c.) Because of phenotyopic heterogeneity [25][26][27] , cell speed, cell length, and signal transduction gain will likely vary from cell-to-cell even in an isogenic population. Using Escherichia coli chemotaxis as an example, we predict that within a wild type population there will be individual cells that operate in the positive feedback regime. Hence, some cells will race up the gradient much faster than the rest of the population.

Results
Exploration of the dynamical regimes of the run-and-tumble gradient ascent. For concreteness we consider the well characterized bacterial chemotaxis system in Escherichia coli 1, 28-31 (see Supplementary Note 1). E. coli swims by rotating flagella [32][33][34] . Switching between runs and tumbles follows Poisson statistics with transition rates λ R (F ) and λ T (F ) 1, 35-38 , causing the state of the cell to rapidly fluctuate around the probability to run r(F ) with characteristic time scale t S = 1/(λ R + λ T ) (Supplementary Note 2). Here F represents the free energy of the receptor clusters (normalized by k B T ) and H is the input-output gain of the flagellar motors 39 . During runs the speed of the cell is constant Ẋ = v 0 and the direction of motion is subject to rotational Brownian motion with diffusion coefficient D R 1 . During tumbles the speed is nil and reorientation is modeled with rotational diffusion D T > D R 40 . Taken together these two processes cause the cell to lose its original direction at the rate t −1 D = (n − 1)(rD R + (1 − rD T ) where n = 2, 3 for two-and three-dimensional motion respectively.
Like many other sensory systems, E. coli responds to relative changes along its trajectory X(t) in the external concentration C(X, t) of an attractant [41][42][43] where N is the gain of the receptor systems 50, 51 , and K i K a are the dissociation constants of the receptors limiting the cell's range of sensitivity 23, 52-55 .
The first term on the right hand side of equation (2) is the negative feedback. Because the probability to run r increases monotonically with F , it too relaxes to its adapted state r 0 = r(F 0 ) with the time scale t M . The second term on the right of equation (2) shows that F , and therefore r, rises whenever C increases, either locally or due to the cell moving up a gradient. In a static gradient, this term is proportional to the cell velocity along the gradient; but is zero when the cell tumbles. Thus, the resulting increase in F (and r) is proportional to the probability to run r, creating a positive feedback.
In this paper we will examine the dynamics of r taking place over multiple runs and tumble events, i.e. we will focus on time scales longer than t S . Taking the limit t S t E , t M , t D , the positive and negative feedback can be encapsulated in the following equation derived from equations (1) and (2) (see Methods and (3) For simplicity we start with a static exponential gradient with length scale L = 1/||∇ ln C|| and concentration within the sensitivity range of the cell K i C K a 42, 43 . Here s = cos θ denotes the direction of motion relative to that of the gradient, and f = HF = − ln (1/r − 1) denotes the rescaled free energy of the receptors. Two timescales naturally appear, t M and t E = L/(N Hv 0 ), which encapsulate the magnitude of the negative feedback (∝ 1/t M ) and positive feedback (∝ 1/t E ). From these definitions, we define the following non-dimensional timescales: Here τ E quantifies the ratio between the negative and positive feedbacks. τ D is the ratio between the direction decorrelation time, which depends on the weighted average of the two rotational Brownian processes, and the length of the memory. One would expect different dynamical regimes depending on whether τ E and τ D0 = τ D (r 0 ) are larger or smaller than one.
We first explored these differences by performing numerical simulations of the full stochastic model of bacterioal chemotaxis for cells in an exponential gradient of MeAsp. We used a previously described stochastic simulator that reproduces well experimental measurements of the behavior of E. coli cells 56 (see Methods).
Calculating the steady state drift velocity from 10 4 sample trajectories for a range of τ E and τ D0 values reveals that cells climb the gradient much faster when the positive feedback (τ E < 1) rather than the negative feedback (τ E > 1) dominates (Fig. 1c). The drift velocity also depends on τ D and peaks when the direction decorrelation time is on the same order as the memory duration (τ D 1).
These differences are reflected in the structure of the cell trajectories. When the negative feedback dominates cell trajectories exhibit runs in both the positive and negative directions with only a slight bias in their durations, resulting in a slow average progress up the gradient (Fig. 1d red). In contrast when the positive feedback dominates runs are long in the right direction and short otherwise, resulting in a much faster average drift up the gradient (Fig. 1d green). In this case matching the memory of the cell with the direction decorrelation time is important. If the cell loses direction too quickly (τ D0 1) it cannot take full advantage of the elongated runs in the correct direction (Supplementary Figure 1).
The positive feedback drives large fluctuations in the internal state of the organism. To better understand the origin of the fast drift velocity when the positive feedback dominates, we examined the relationship between drift velocity V D and the statistics of the internal state of the cell f while it is climbing the gradient at steady state. Non-dimensionalizing time and space, we define P (x, f, s, τ ), the probability that at time τ = t/t M the cell is at position x = X/(v 0 t M ) along the gradient with internal state f = HF and moving in the direction s = cos θ. Starting from the master equation describing the run and tumble dynamics of an individual cell, we derive the following Fokker-Planck equation, valid for time scales longer than an individual run or tumble, by taking the limit t S t M , t D , t E (see Methods): 2 ∂ s is the rotational diffusion operator on the (n − 1)-sphere, and r and τ D are functions of f . Because the cells are log-sensing and we are considering a static exponential gradient, they experience a constant external "field". Thus, at steady state the solution becomes independent of x; integrating over s and f we obtain the constant drift velocity (Supplementary Note 5) Here · represents the average over s and f and the bar indicates steady state. Equation (6) shows that the drift velocity depends linearly on the first moment of the internal state f , i.e. on the shape of the marginal steady state distribution p(f ) = P ds. To calculate p(f ) we solve (5) at steady state by expanding the solution in orthonormal eigenfunctions of the operatorL s with the first two coefficients being the marginal distribution p(f ) and the first order moment p 1 (f )/ √ n = P sds (see Methods and Supplementary Note 6).
When the negative feedback dominates (τ E 1) the distribution of f becomes sharply peaked and nearly Gaussian, with variance σ 2 = τ D0 r 2 0 /nτ 2 E around the first moment; it barely deviates from the adapted state f 0 (Fig. 2a red, see Methods and Supplementary Note 7). Substituting into equation (6) yields known mean field theory results 15,18,20,21 2b). When τ E 1 the negative feedback is not as strong and the distribution of f widens proportionally to the strength of the positive feedback σ ∝ τ −1 E (Fig. 2a blue).
These approximations no longer hold when the positive feedback dominates (τ E 1). p(f ) now exhibits large deviations (Fig. 2a green) bounded below and above by f U and f L respectively. These can be derived from the Fokker-Planck equation (5) (see Methods) As τ E becomes small the lower bound decreases as f L → ln τ E , whereas the upper bound increases as , causing the distribution to become very asymmetric. Thus, mean field theory becomes inadequate.
In this regime, keeping the direction of motion long enough (τ D 1) in the positive direction allows the internal variable to reach the upper limit f → f U , accumulating significant distribution near f U (Fig. 2a  green). A short direction decorrelation time, by comparison, shifts the peak of the distribution back to f 0 ( Fig. 2a orange), so that drift velocity becomes smaller (Supplementary Figure 1).
We verified that p(f ) obtained from equation (5) in the limit t S t E , t M , t D is a good approximation that captures well the relevant dynamics as long as τ D is not too small by plotting it against the distribution of f obtained from the numerical simulations with non-zero t S described earlier (Fig. 2a). Integrating p(f ) according to equation (6) predicts well the drift velocity over the entire range of feedback strengths, including in the region where the positive feedback dominates (τ E < 1). Note: in this regime Keller-Segel and mean field approximations (see Methods) fail (Fig. 2b).
Non-modal dynamics generate long runs uphill but short ones otherwise. In the fast gradient climbing regime (τ E 1) runs are long in the correct direction but short otherwise (Fig. 1b). To gain mechanistic insight into this striking difference we examined the Langevin equations equivalent to the Fokker-Planck equation (5). Defining v = rs as the instantaneous speed along the gradient, we change variables from (f, s, x) to (r, v, x) and obtain (Supplementary Note 8): Fig. 3a,d), intersect only once at the fixed point (r, v) = (r 0 , 0), which is stable because the eigenvalues of the corresponding relaxation matrix are both negative (see Methods). However, fluctuations introduced by rotational diffusion and tumbling prevent the system from settling at the fixed point (heat maps in Fig. 3). Instead, the system constantly  deviates from or relaxes towards it; net drift along the gradient is achieved by spending more time on average in the region of positive velocity v > 0.
The stochastic excursions in the (r, v)-plane away from the fixed point exhibit distinctive trajectories depending on the value of τ E . When the positive feedback dominates (τ E 1; Fig. 3a,b,c) the eigenvectors of the relaxation matrix, τ D0 −1 , 1) T , are highly non-orthogonal. As a result, linear deviations from the fixed point can grow transiently 57 to feed the non-linear positive feedback (v 2 term second line in equation (8)) leading to large deviations. Importantly, this only happens for runs that start in the correct direction. If instead the cell starts running in the wrong direction the linear deviation does not grow (Fig. 3c). This asymmetry arises because the v 2 term is always positive. Similar selective amplification properties are observed in neuronal networks, where non-normal dynamics enables the network to respond to certain signals while ignoring others (including noise) 58,59 . Thus, a cell running in the correct direction is aided by the positive feedback, which pushes its internal dynamics towards the upper right corner of the phase plane where the probability to run r = 1 and the speed v = 1. If, instead, a cell starts running in the wrong direction (v < 0), its internal dynamical state is pushed back into the high noise region near the fixed point where it will rapidly pick a new direction (Fig. 3b).
In contrast, when the negative feedback dominates (τ E 1; Fig. 3d,e,f), the eigenvectors of the relaxation matrix become nearly orthogonal. Linear deviations from the fixed point cannot grow and simply relax to the fixed point regardless of the initial direction of the run. This key difference between the positive and negative feedback regimes are reflected in the flow field (white lines in Fig. 3a,d). Thus, in the negative feedback regime, stochastic excursions from the fixed point are mostly dominated by the decorrelation time τ D with only slight bias in duration between runs up and down the gradient.
Receptor saturation, varying gradient length scales, and trade-offs. So far our analysis assumed a constant exponential gradient at concentrations within the sensitivity range of a log-sensing organism. However, for E. coli equation (2) indicates that the sensitivity to methyl-aspartate is logarithmic only for concentrations between K i C K a . When C < K i the sensitivity becomes linear 60 , whereas when C > K a the receptors saturate 55, 61-63 . Thus, as a cell approaches a high concentration source its sensitivity decreases, which effectively increases the value of τ E . Simulations in an exponential gradient show that this effect results in an eventual slow-down as the cell approaches the source (Fig. 4a-c).
Realistic gradients are typically limited in spatial extent and often are not exponential, in which case L and therefore τ E is different in different regions. L is long near the source in a linear gradient, for example, and decreases linearly with distance from the source. Simulations show that the cell initially climbs the gradient fast but later slows down as the gradient length scale L increases and τ E increases ( Fig. 4d-f). On the contrary, for a static localized source in three dimensions, L is short near the source but increases linearly with distance from it (see Methods). Thus, τ E decreases and the cell accelerates as it approaches the source (Fig. 4g-i). Comparing cells in various dynamical regimes (different values of τ E ) across these different gradients suggests that a lower value of τ E results in a faster gradient ascent.
When entering a food gradient, it is natural to try to climb as fast as possible. However this strategy could create a problem: the longer runs implied by the positive feedback mechanism could propel the accelerating E. coli beyond the nutrient source. This is the case in Fig. 4e, where cells with the lowest τ E (green) reach the source first but overshoot slightly; they settle, on average, at a further distance than those with intermediate τ E (blue). In general, natural environments will be complex , with a variety of different sources and gradients 3,18,24,64,65 , implying different parameter domains will be optimal for E. coli at different times. Such phenotypic diversity may well confer an advantage 26,27,56,66,67 .
Fast chemotactic cells in wild type isogenic population of E. coli. Motivated by our findings we asked whether in a wild type population of E. coli some cells are likely to be in the positive-feedback-dominated regime where τ E = L/ (N Hv 0 t M ) 1. This would happen when the gradient is steep (L short), the gain of the receptor cluster N or of the motors H is large, the velocity of the cell v 0 is large, or the cell memory t M is long. These parameters are likely to vary from cell-to-cell due to non-genetic phenotypic heterogeneity, known to be large in E. coli 25 . Using a recent model of phenotypic diversity in E. coli chemotaxis that reproduces available experimental data on the laboratory strain RP437 56 climbing a linear gradient of methyl-aspartate, we ran simulations of 16,000 wild type-like cells in that experiment. For each cell we calculated t E , t M , and t D based on the parameter input and their locations in the known gradient to show their scatter. Initially, the cells are near the bottom of the gradient and many cells are in the positive feedback dominated region (white dots in Fig. 1a). As the cells climb the gradient τ E becomes larger and the regime shifts to higher τ E (black dots in Fig. 1a). (d-f) Linear gradient. Similar to a-c but for C = C 1 − aR where the source concentration is C 1 = 1 mM and decreases linearly at rate a = 0.0001 mM/µm with distance R from the source. Contours spacing decreases with distance from the source (at the top), illustrating decreasing L. Diagonally filled region indicates increasing τ E over time. (g-i) Localized source. Similar to a-c but for C = C 2 R 0 /R, R > R 0 where the source concentration (C 2 = 1 mM ) is constant within a ball of radius R 0 = 100 µm and decreases with radial distance as 1/R away from the source. Contours spacing increases away from the source (at the origin), illustrating increasing L. Diagonally filled region indicates decreasing τ E over time. See Methods for details.
In these simulations we assumed that all the cells had the same length. However, as cells grow and divide, their length will change on average by a factor 2. Because the rotational diffusion coefficients D R and D T depend non-linearly on cell length 38,68 (see Methods) the cells are likely to move within the parameter space of Fig. 1a during their life time. Repeating the simulation with cells twice as long (black contours in Fig. 1a) reveals that as cells mature they should have a higher chance to be in the faster parameter regime.
It is interesting that not all wild cells are in the fast τ E < 1 regime. A likely reason is that RP437 is a laboratory strain that was selected by swimming in agar plates. In agar, cells can get trapped in the agar mesh, requiring a tumble to escape the trap. As a result, runs that are too long become detrimental 69 . Thus cells likely face a trade-off wherein long memory, t M , is better in liquid but could become detrimental in agar. Again, in alternating environments phenotypic heterogeneity might be advantageous 26,27,56,66,67 .

Discussion
It is common to make simplifying assumptions to facilitate analysis, but we do not believe that these limit our findings. Although we assumed log-sensing receptors and a constant adaptation time for most of our derivations (equation (2)), we showed with simulations that our results hold when the limited range of sensitivity of the receptors is taken into account. Likewise, we verified that the results remain valid over a wide range of the adapted probability to run r 0 , as well as when we include the asymmetric rates of adaptation exhibited by E. coli in response to step up and down in attractant 70 (see Supplementary Figure 2 for details). In this work we considered what happens when the environmental gradients are static (or slowly changing), and revealed the capability for very rapid response. This could be particularly important in rapidly changing gradients such as those generated by turbulent flows and lysing organisms in the sea 3, 64 .
Our findings imply that organisms using a run-and-tumble strategy to climb gradients of interest could do so rapidly and efficiently when their internal state undergoes large asymmetric fluctuations away from the mean. Such fluctuations arise from suppression (facilitation) of tumbles following motions up (down) the gradient, which corresponds to benefits (costs) associated with the gradient ascent. The large asymmetry in the distribution of internal states boosts the drift velocity significantly by dramatically increasing the benefitto-cost ratio. We showed how this is driven by the positive feedback inherent to the run-and-tumble strategy, and how it contrasts sharply with the negative feedback-dominated regime previously described.
Although we used E. coli bacterial chemotaxis as an example, we expect our findings to be generally applicable to other biased "random walk" strategies exhibited by organisms when local directional information is unreliable. In essence, any stochastic navigation strategy requires a memory, t M , to make temporal comparisons, a reorientation mechanism, t D , to sample new directions, and external information, t E , relayed to decision-making circuitry through motion and signal amplification. Our theoretical contribution showed the (surprisingly) diverse behavioral repertoire that is possible by having these work in concert. In retrospect, perhaps this should not be surprising given the diverse environments in which running-and-tumbling organisms can thrive. (3)-(5) in the fast switching limit. Define P R and P T the probability distributions at time t to be running or tumbling at position X along the gradient in direction s = cos θ relative to gradient direction with internal variable f = HF . They satisfy a two-state master equation model (Supplementary Note 4)
If we assume the switching terms with t −1 S in equation (10) dominate, the probabilities to be running and tumbling equilibrate on a much faster timescale than the other ones. Therefore we can let P = P R + P T and can approximate the actual probability to run as its expected value P R /P ≈ r. Adding the two equations above yields the Fokker-Planck equation: This is equivalent to a system of Langevin equations. Considering dr/df = r(1 − r) the internal variable dynamics (the first term) gives equation (3) which defines t E . The angular dynamics (the second term) defines t D .
Using the time scale definitions in (4) and non-dimensionalizing time τ = t/t M and position x = X/(v 0 t M ), we obtain the Fokker-Planck equation (5).
Derivation of the analytical solution to Fokker-Planck equation (5) (2k+n−2)(2k+n) δ k+1,l (summation over l implied) is an operator relating neighboring orders. It comes from the positive feedback term. When n = 3 it isŝ kl = k √ When τ D 1 we neglect the 2nd and higher orders to close the infinite series of moment equations. At steady state the approximation gives the analytical solution where Z is a normalization constant.
Derivation of the distribution p(f ) when the negative feedback dominates. Integrating equation (13) and assuming |f − f 0 | small, we can write the solution as (Supplementary Note 7) where σ 2 = r 2 0 τ D0 /nτ 2 E , C is a normalization factor and G is a polynomial in (f − f 0 ), giving a small correction to the Gaussian distribution. Explicitly integrating the above distribution we recover previous mean field theory results (see details in Supplementary Note 7) where τ D0 = dτ D /df is evaluated at f 0 .
Bounds of the distribution p(f ). The first term in equation (5) says the flux in f -space is non- Thus the upper bound f U of the distribution p(f ) is achieved at equality. Similarly, the lower bound f L is achieved when we take equal signs of noting s ≥ 1. This gives equation (7).
The plus sign gives exp (−f U )) 1 and f U ≈ 1/τ E . The minus sign gives exp (−f L )) 1 and f L = − exp (f L ) /τ E . Taking logarithm, the latter gives Linear response near the fixed point of the Langevin system (8). Consider the region in r-v plane near the fixed point (8) to obtain d dτ

Substitute this into the deterministic component of the Langevin equation
The eigenvalues and corresponding eigenvectors of the matrix describing linear response are: When τ E is large, (1−r0)r0 τ E τ D0 τ D0 −1 1 and the eigenvectors are almost orthogonal. When τ E is small, ( 1 and the eigenvectors are not orthogonal. (2) is, using L = 1/||∇ ln C||,

Receptor saturation. The second term in equation
The factor in the parenthesis is always less than 1, although it is close to 1 when K i C K a . As C increases this factor contributes to a smaller receptor gain and a corresponding larger τ E (Fig. 4b).
Varying gradient length scales. For a linear gradient C = C 1 − aX, we see L = 1/||∇ ln C|| = C/a = C 1 /a − X, i.e. the gradient length scale decreases linearly with distance away from the source.
For a localized source we assume that the concentration profile satisfies a standard diffusion equation ∂ t C = ∇ 2 C without decay. The spherically symmetric solution in 3D is C ∝ 1/R where R is the radial distance. Thus L = 1/||∇ ln C|| = R, i.e. the gradient length scale increases linearly with distance away from the source.
Rotational diffusion coefficient. An E. coli can be modeled as an ellipsoid with major axis parallel to the direction of motion. During runs and tumbles the change in running direction is associated with rotation about its minor axis. With semi-major axis a and semi-minor axis b, the rotational diffusion coefficient is 68 where k B is Boltzmann's constant, T is the temperature, and η is the viscosity of the fluid. For a long cell ln 2a b does not change significantly compared to the cubic dependence on a 3 in the denominator, so the diffusion coefficient is about 8 times smaller when cells are twice as long. We assume a proportionate diffusion coefficient during a tumble and let D T also scale 8 times smaller. In Fig. 1c heat map, 10 4 sample trajectories that were each 650 seconds long -sufficient for the cells to reach steady state -were generated for each parameter condition. The concentrations were assumed to be in the log-sensing range throughout the domain, and in the numerical integration ln C was used instead of ln ((1 + C/K i ) / (1 + C/K a )). The drift velocity V D was calculated by fitting the linear part of the mean trajectory. In Fig. 1d the first 50 s were removed to avoid the start up transient.
In Fig. 1c the black/white dots are generated using experimentally relevant simulation parameters previously described 56 . The time scale values of each cell depend on the cell parameters and the gradient length scale L as in equation (4). Near the bottom of the gradient L = 1500 µm and near the top of the gradient L = 4800 µm.
In Fig. 4 ln ((1 + C/K i ) / (1 + C/K a )) was used. In Fig. 4b,d,f the τ E calculation considered receptor saturation according to (16) as well as the varying gradient length scales, with C and L evaluated at mean positions. Note this is not the average τ E over the population.
In Fig. 3 the Langevin trajectories were generated using Euler's method to integrate equation (8).
Numerical methods and data analysis. Figure 2 used the simulations in Fig. 1c. The steady state p(f ) from agent-based simulation was calculated from the histogram of all the internal values of the 10 4 individual cells between τ = 10 and τ = 20, sampled at regular steps of τ = 0.01. V D was calculated as in Fig. 2a.
Numerical solutions of the Fokker-Planck equation (5) were obtained by expanding the distribution in angles, as in equation (12), and keeping the first 10 orders. The steady state p(f ) was found by solving an initial value problem using the NDSolve function in Mathematica, with 10 4 spatial points and integration time up to τ = 10. Further orders, finer grid, and longer integration times were checked to ensure solution accuracy. V D was calculated from equation (6).
In Fig. 2a inset, the black curves show the approximate Gaussian distribution in equation (14). The explicit expression is in Supplementary Note 7.
In Fig. 2b Keller-Segel refers to equation (15), similarly expressed in previous research 10,11,15,18 , where negative feedback dominates (τ E 1). MFT refers to cases where negative feedback is not as strong (τ E 1) so that the mean internal variable deviates noticeably from f 0 . Thus r, τ D , and τ D in equation (15) were evaluated at the steady state mean f instead of f 0 . This was combined with equation   2-4, 7, 8 in which the cell relays information from the external environment to the flagellar motor through a signalling cascade triggered by ligand-binding receptors. The receptors are described by a two-state model where the activity a is determined by the free energy difference F (in units of k B T ) between the active and inactive states a = 1 1 + e F . (S1) Two terms, F = F m (m) + F C (C), contribute to the free energy. The former depends on the methylation level m of the receptor F m (m) = 0 + 1 m, with constants 0 and 1 , while the latter depends on the ligand concentration C where K a and K i are the ligand dissociation constants of the active and inactive states and N is the gain of the receptor complex.
As the external environment signals changes in F C (C), the receptor adapts via methylation and demethylation to control F m (m) = F − F C (C), trying to maintain an activity level a 0 independent of the environment. The methylation level follows where t M is the adaptation time and m 0 (C) is the methylation level that maintains a 0 . Using equations (S1)-(S3), we see Assuming the signal transduction is fast, the receptor activity determines the concentration of a key protein CheY-P, Y (a) = αa with α constant. CheY-P then binds to the flagellar motor complex, modulating its switching between rotating clockwise (CW) or counterclockwise (CCW), which are Poisson processes with rates from CCW to CW as λ CCW (Y ) and from CW to CCW as λ CW (Y ) where ω, 2 , 3 , and K are constants.
The motor state of each flagellum determines its conformation to be in one of normal, curly-1, and semicoiled. The flagellar conformations in turn determine the cell's motility state (run or tumble). For a singleflagellum cell, the motor states CCW and CW corresponds to the motility states run and tumble, and we can write λ R = λ CCW and λ T = λ CW as the switching rates from run to tumble and from tumble to run, respectively 3 .
Supplementary Note 2: Derivation of equation (1) in the main text. Given constant Poisson switching rates λ R and λ T , the probability to be running is determined by r = λ T / (λ R + λ T ) and the characteristic time scale is t S = 1/ (λ R + λ T ) 9 . When the rates change over time we can say the probability to run tends to follow this value. From equation (S6) and the definition Y = αa = α/ 1 + e F we can write Thus the probability to run r is a monotonically increasing function of the free energy F . Its shape is almost identical to the standard sigmoidal function 1/ (1 + exp (−H (F − δ))) with a scaling H and shift δ. We match them by linearly expanding near δ , To match the zeroth order we must have ∆ = 1. This yields δ to match the first order to get H For the parameters chosen we have δ ≈ 0 (see Supplementary Table). So we can write equation (??) in the main text.
Supplementary Note 3: Derivation of equation (2) in the main text. We substitute the definitions (S2), (S3), and (S5) into the adaptation dynamics equation (S4) and get where d/dt is taken along the path of the cell, so d/dt = ∂/∂t +Ẋ · ∇. Substitute this into the above we obtain equation (??) in the main text.
Supplementary Note 4: Derivation of the master equation model. We define P R (X,û, F, t) and P T (X,û, F, t) as the probability distributions at time t to be running or tumbling at position X in direction u with internal variable F . A two-state stochastic model 9 is given by whereḞ | run,tumble are defined by equation (??) in the main text with constant speed v 0 in a run and zero speed in a tumble. Let s =û ·X = cos θ. In a static exponential gradient with length scale L = 1/||∇ ln C||, the internal dynamics satisfiesḞ (S10) On the other hand, the angular dynamics undergoes rotational diffusion with operator ∇ 2 u . In n-dimensional space we iteratively write down the Laplace-Beltrami operator 10 as Since 0 < θ < π, with change of variable s = cos θ we can write sin θ = √ 1 − s 2 and ∂ θ = − √ 1 − s 2 ∂ s . Then the polar angle part asL Since the gradient varies in one direction only we focus on motion in the gradient direction and integrate the probability over all other directions. Thus ∇ ·û = s∂ X and ∇ 2 u =L s . Using f = HF , r = λ T / (λ R + λ T ) and t S = 1/ (λ R + λ T ), we get (S12) Supplementary Note 5: Derivation of V D , equation (6) in the main text. At steady state we eliminate ∂ t and ∂ x and integrate over s, the Fokker-Planck equation (??) in the main text becomes where the bar indicates steady state. By the boundary conditions that P → 0 at ±∞, we must have From the third term on the right of the Fokker-Planck equation (??) in the main text, the spatial flux is r(f )s and the drift velocity is its average over the distribution. Thus we get the drift velocity as equation (??) in the main text Supplementary Note 6: Derivation of the analytical solution to Fokker-Planck equation (??) by angular moment expansion. The eigenvalue problem of the angular operatorL s , defined in equation (S11), is We identify this as the Gegenbauer differential equation 11 , with eigenfunctions the Gegenbauer polynomials C (n/2−1) k (s) and the corresponding eigenvalues λ k = −k(k + n − 2). When n = 3 they are Legendre polynomials with eigenvalues −k (k + 1). The first few are The Gegenbaer polynomials are orthogonal in the sense that where the normalization constants are N (n/2−1) k = π2 4−n (k+n−3)! k!(2k+n−2)(Γ(n/2−1)) 2 . When n = 3 they are 2 2k+1 , those of Legendre polynomials.
The weight in the integration above is consistent with the geometry on an (n − 1)-sphere S n−1 , whose the volume element are iteratively defined 10 as d S n−1 ω = (sin θ) n−2 dθd S n−2 ω.
After a change of variable s = cos θ and integrating over all remaining dimensions, we see that any integration of s should carry a weight w(s)ds = 1 − s 2 n−3 2 ds. (S16) From the above, we can write any function of s, in particular the probability distribution P , as a series of Gegenbauer polynomials. When n = 3 this is the Fourier-Legendre Series.

(S23)
We notice from equation (S21) that the range of distribution is bounded by f L and f U , defined by √ nτ E , we see that the integration range is much larger than the standard deviation of the Gaussian factor, and thus can be considered from −∞ to ∞. Therefore we get the normalization constant .

(S25)
Finally, noticing that by the definition of τ D in equation (??) in the main text we can get Taking D T D R , we put this back into equation (S25) and get ) .

(S26)
When converted back to real units (t instead of τ = t/t M ), the highest-order term is identical, except for notations, to equation (3) in Dufour et al. 4 obtained from a different approach. It can also be reduced to equation (12) in Si et al. 7 by assuming a high running probability r ≈ 1 and a long memory (t M t 0m ). It agrees with equation (6.24) in Erban & Othmer 12 and equation (16)  To apply the standard result of equivalence between Fokker-Planck equations and Langevin equations, we need to change the measure in s-space to unity. This prompts the definition Q(s, t) = w(s) P (y, f, s, t)dxdf so that the above becomes where we integrated by parts and discarded boundary terms. Since A(s) is an arbitrary function, we can write down the Fokker-Planck equation