Optimal chemotactic responses in stochastic environments

Although the “adaptive” strategy used by Escherichia coli has dominated our understanding of bacterial chemotaxis, the environmental conditions under which this strategy emerged is still poorly understood. In this work, we study the performance of various chemotactic strategies under a range of stochastic time- and space-varying attractant distributions in silico. We describe a novel “speculator” response in which the bacterium compare the current attractant concentration to the long-term average; if it is higher then they tumble persistently, while if it is lower than the average, bacteria swim away in search of more favorable conditions. We demonstrate how this response explains the experimental behavior of aerobically-grown Rhodobacter sphaeroides and that under spatially complex but slowly-changing nutrient conditions the speculator response is as effective as the adaptive strategy of E. coli.


Introduction
Movement of Escherichia coli consists of periods of running punctuated by tumbling events where the bacterium randomly changes direction. This can result in successful chemotaxis when the probability of initiating a tumble per short time interval (the tumbling rate) is a function of the concentration of attractant experienced by the bacterium. In the absence of attractant, an E. coli bacterium has a constant, basal tumbling rate. When the attractant concentration experienced by the bacterium is increasing, tumbling events become less frequent, so the bacterium has longer runs in the direction of increasing attractant. Conversely, when the bacterium detects a decreasing attractant concentration, it tumbles more often, shortening its runs down attractant concentration gradients. In E. coli, this mechanism involves an excitation pathway, inhibiting tumbling, and an adaptation pathway that methylates the receptors, decreasing their sensitivity and thereby attenuating the excitation pathway in the continued presence of attractant. This "adaptive" response allows the bacterium to locate and stay in regions of high attractant concentration. When attractant concentration plateaus, the tumbling rate returns to the basal rate, a phenomenon called "perfect adaptation". A consequence of perfect adaptation is that the response is independent of the absolute concentration of attractant and depends only on differences in concentration experienced by the bacterium. While E. coli has been instrumental for our understanding of chemotaxis, other bacteria show a considerable variety of chemotactic mechanisms and behaviors [1,2]. For example, responses have been identified which show very little adaptation, such as in certain cultures of Rhodobacter sphaeroides [3]. Furthermore, aerotaxis in E. coli is thought to involve the Aer receptor which lacks methylation sites [4], suggesting a lack of adaptation. In particular, some bacteria seem to tumble more in the presence of attractant; this appears to be the case in many mutant strains, such as aerotaxis [5] and redox taxis [6] in mutated E. coli, aerotaxis in mutated Salmonella typhimurium [5], and phototaxis in mutated Halobacteria [7]. Interestingly, this behavior was also found in wild-type, aerobically-grown R. sphaeroides [8]. This response seems paradoxical, as the bacterium would tend to run in the direction of decreasing attractant concentration and tumble more when it detects an increase in attractant concentration, leading to the accumulation of bacteria away from the attractant. In addressing these puzzling results, Goldstein and Soyer demonstrated the chemotactic efficacy of a non-adaptive "inverted" response [9,10]. With this strategy, bacteria respond to the absolute attractant concentration, tumbling more and therefore maintaining their position in regions of higher attractant concentration. Although less effective than the adaptive response, the inverted response requires only low receptor sensitivity, and could function in the absence of effective receptors by coupling to the cell metabolism [11]. As we show, there are discrepancies between the inverted response and the response observed in aerobically-grown R. sphaeroides.
Why do different bacteria exhibit different chemotactic responses? One possible reason is that different bacteria have evolved for different environments. For example, while E. coli might be expected to inhabit a resource-rich environment, marine bacteria experience a harsh environment in which attractant is localized in short-lasting patches [12] with attractant concentration inside patches being 3 to 6 orders of magnitude higher than outside [13]. This has led to a number of marine-specific evolutionary adaptations such as higher running speed in Pseudomonas haloplanktis [14] and run-and-reverse (as opposed to run-and-tumble) chemotaxis in over 70% of marine bacterial species [15]. Unfortunately, most experimental and theoretical studies to date consider chemotaxis in response to step functions or simple gradients [16], or deterministic time-profiles providing limited insights to how chemotaxis would function in different types of environments. In particular, there have been few studies [17,18] analyzing chemotactic strategies in stochastic environments, which are likely to be the most important environment during the evolution of a chemotactic response.
When considering the basis for a particular chemotactic response, most attention has focused on mechanistic aspects, e.g. how the cellular components interact to produce that response [19]. Here, we instead focus on questions of evolutionary strategies, what types of responses would be expected to evolve in different contexts such as complex nutrient conditions. We use this model to study how performance and optimal properties of various chemotactic strategies vary as a function of environmental conditions. In particular, we describe a new type of chemotactic strategy called the speculator response. It differs from the adaptive response in that the tumbling rate increases with increasing attractant concentration; furthermore, the bacterium makes temporal comparisons of attractant concentration which distinguishes the strategy from the inverted response. We demonstrate that the speculator strategy gives a remarkable match to the paradoxical response seen in wild-type, aerobically-grown R. sphaeroides [8] under experimental conditions.

Results
To model chemotaxis, we consider a single bacterium in a one-dimensional space with periodic boundary conditions and a distribution of attractant. The bacterium can run to the left or right, or tumble. α and β denote the rates at which the bacterium starts and stops tumbling. While β is assumed to be constant, the basal rate α 0 is modulated by the chemotactic response of the bacterium to the experienced attractant concentrations, to give a time-varying α(t): where R(t) is the chemotactic response function and c(x B (t), t) is the attractant concentration that the bacterium experiences at position x B (t) at time t [20]. In contrast to [20], we do not assume deviations from α 0 to be small. R(t) is represented as (A/τ + Bt/τ 2 ) exp(−t/τ) where τ controls the memory length, i.e. how far back in the past the bacterium "remembers" attractant concentrations, and A and B together determine the sensitivity and the characteristics of the response: adaptive, inverted, or speculator. In the adaptive response, A and B are constrained such that A < 0 and B = −A. This gives rise to a response function shown in red in Fig 1a that has a positive and a negative lobe. The red curve in Fig 1b illustrates the changes in α due to attractant addition and removal when a response function of this type is used. When c(x B (t), t) is increasing in time, such as when attractant is added, the negative lobe of has a larger area than the positive lobe, making the integral in Eq (1) negative, leading to an α that is smaller than α 0 , resulting in a decrease in tumbling, as the red curve in Fig 1b shows at t = 50. The opposite happens when attractant is removed, resulting in an increase in tumbling at t = 350. The constraint B = −A results in equal areas of the positive and negative lobes of R(t), ensuring perfect adaptation and a basal tumbling rate (α = α 0 ) for approximately 100 < t < 350. In the inverted response (Fig 1a, blue curve), A > 0 and B = 0, leading to a single-lobe response function. This results in a higher tumbling rate in the presence of attractant and a lower rate when attractant is absent, as the blue curve in Fig 1b shows. We also investigate a new type of response which we name the "speculator" response for reasons explained below. In the speculator response, A > 0 and B < 0, leading to a double-lobe response function (Fig 1a, green curve) that looks roughly like the negative of the adaptive response function (Fig 1a, red curve). This causes increased (decreased) tumbling when c(x B (t), t) is increasing (decreasing) in time (Fig  1b, green curve). The constraint of perfect adaptation is relaxed in the speculator response, so the areas of the positive and negative lobes are unequal (Fig 1a, green curve). This allows the steady-state α in the presence of constant attractant concentration to be different from α 0 , as shown for times 100 < t < 350 (Fig 1b, green curve). The double-lobe response functions of adaptive and speculator response cause bacteria to make temporal comparisons of attractant concentration, while the lack of perfect adaptation in the inverted and speculator responses causes bacteria to respond to absolute attractant concentrations. Note that Fig 1a and 1b are purely illustrative; they do not reflect real or optimized responses. The one-dimensional virtual world in which the bacterium moves contains a stochastic attractant distribution which varies in both time and space. Two parameters, T (correlation time) and L (correlation length), determine the dynamics of the attractant distribution. T is the timescale on which attractant concentrations change, while L determines the distance between peaks of attractant concentration; the shorter L, the more numerous and narrow the peaks are and the shorter the distances between them. The average amount of attractant available in the world is independent of T and L. Fig 2 shows how the distribution looks at different combinations of T and L. S1 Video illustrates the distribution dynamics as a function of T and L.
The framework described above allows us to assess the performance of a chemotactic response characterized by the response parameters α 0 , β, A, B and τ at a chosen combination of attractant distribution parameters T and L. Performance, or fitness, of a response, is equal to the average cell division rate, which we approximate as the inverse of the time it takes the bacterium to experience a specified amount of attractant. For any chemotactic strategy in any stochastic environment, we can optimize the response parameters to maximize the bacterial fitness. Performing this optimization for different strategies (by applying appropriate constraints on A and B) under different combinations of T and L allows us to explore the performance of the different strategies and how this performance varies with T and L. Fig 3 shows the optimal fitnesses of adaptive, inverted and speculator responses as a function of T and L. Fitnesses are scaled by the fitness of a non-chemotaxing bacterium, whose fitness is independent of T and L: a bacterium with a relative fitness of 4 therefore takes 4 times less time to experience the same amount of attractant than a non-chemotaxing bacterium. Note that our measure of fitness implicitly includes the effective robustness to nutrient fluctuations on the timescale T and length scale L. In all strategies, fitness increases with increasing T and decreasing L. As T increases, attractant concentrations change more slowly, making it easier for bacteria to track attractant peaks. At shorter L, fitness is higher because there are more attractant peaks and they are closer to one another. This means that if a bacterium loses track of a peak, or a peak diminishes in amplitude over time, the bacterium only needs to travel a short distance to reach another peak.
In addition to the previously studied adaptive and inverted responses, we characterize a novel chemotactic strategy. This "speculator" response, despite its seemingly paradoxical nature, is more fit under all studied conditions than the inverted response, although less fit than the adaptive response. Interestingly, at T = 10 4 , L = 20, the fitness of the speculator response is nearly identical to the fitness of the adaptive response. To understand the mechanism of the speculator response, we consider the optimal values of response parameters ( Table 1). The lack of perfect adaptation (optimal |A| > |B|) means that the bacterium will more often start to tumble when the attractant concentration is high, as shown in green in Fig 1c; the low value of β means that the bacterium will then continue tumbling, remaining in the region of high attractant. Consequently, the speculator response, like the inverted response (Fig 1c, blue curve), results in frequent long tumbles at high attractant concentrations. In contrast to the inverted response, the double-lobe response function of the speculator response results in a tumbling rate sensitive to the rate of change of the attractant. The long memory of the speculator response (large τ) allows sensitivity to long-term trends; this sensitivity, combined with the double-lobe response function, results in two important dynamical properties.  Firstly, the bacterium compares recent attractant concentrations with a long-term average, tumbling more when the recent past is more favorable than the average, and therefore maintaining its position in regions of higher attractant concentration. Secondly, the bacterium is able to sense improving and worsening conditions at its current location. In particular, a decline in the attractant concentration results in a decrease in α, allowing the bacterium to swim away from a peak when conditions are changing for the worse. Swimming away leads to a further decrease in α, setting a feedback loop in motion, resulting in continued swimming until a new optimum is reached. The speculator response is therefore analogous to the behavior of investors in financial markets: when the current performance is lower than the average, or when investment values are falling, speculators seek higher returns by abandoning their current position and investing elsewhere-hence the name "speculator" response. The behavior of the speculator response, compared with the adaptive and inverted responses, is illustrated in S2 Video. Significantly, the time course of α in the speculator response closely matches the time course of probability of tumbling in aerobically-grown R. sphaeroides (see Fig 2A in [8]). Fig 4 shows a curve-fit of our model of the speculator response to the digitized data of [8]. Although the response shown by R. sphaeroides may not be completely representative of its natural response, since a potentially high step concentration of 1 mM is used, it is likely to be qualitatively correct, since these responses are obtained assuming linear response; further, any potential saturation of receptors would tend to introduce non-linear distortions to this basic linear response behaviour, rather than a completely different qualitative behaviour. For these reasons, we suggest that the closeness of the fit provides strong evidence that aerobically-grown R. sphaeroides uses the speculator response to respond to Na-succinate. Experimental results show that aerobically-grown R. sphaeroides performs well in swarm plates [8], demonstrating the efficacy of this response.
As   [8]. The curve-fit is obtained by optimizing the response parameters of the speculator response to minimize the least-squares fit between the model and the digitized data. The data are digitized in MATLAB using the function imfindcircles [21]. and θ = 1250 (the period is 5000) the left Gaussian at position 25 is higher than the right Gaussian at position 75, but is decreasing. In the adaptive response (red curve), the bacterium is close to the top of this Gaussian during this period. The bacterium shows little movement toward the right Gaussian at position 75 until the right Gaussian is significantly higher i.e. θ > 1250. In the speculator response (green curve), the bacterium cannot track the top of the Gaussian as well as in the adaptive response, as evidenced by the large standard deviation around position 25 (green shading). However, the bacterium more quickly adapts to the changing attractant levels, leaving the declining left Gaussian and moving towards the growing right Gaussian sooner.
The strengths of the adaptive and speculator responses therefore lie in exploitation and exploration, respectively. In the adaptive response, the bacterium can track the top of a peak efficiently while in the speculator response, the bacterium is better at leaving the declining peak and finding the increasing peak. The exploitation behavior of the adaptive response is analogous to a hill-climbing algorithm, which efficiently finds, but may get stuck at, a local optimum. The exploration behavior of the speculator response is more analogous to a Monte Carlo search algorithm in that the bacterium may leave a peak in search of a higher peak at the cost of its ability to track the peak top efficiently. This explains the trend in Fig 3: for large L, the number of attractant peaks is small, and exploiting a given peak is more important than exploring new peaks. Under these conditions, the adaptive response is significantly more effective than the speculator response. At short L, there are multiple peaks in the environment, each of which has a different amplitude. Under such conditions, the exploration behavior of the speculator response allows the bacterium to locate higher peaks, while the exploitation behavior of the adaptive response may lead to the bacterium tracking a suboptimal peak. At T = 10 4 , L = 20, the two strategies are approximately equally effective, giving rise to very similar fitnesses (Fig 3).
These simulations were performed in a 1D stochastic environment, but the question remains as to the effectiveness of the speculator strategy in 2D or 3D space, as in a 1D environment a bacterium following this strategy will always encounter another peak eventually. We do not believe this would have a major impact on our findings as an isotropic and randomly distributed attractant in two or three dimensions will, by symmetry, look the same in all directions. To address this we ran simulations of the bacterium performing a random reorientation every time it has traversed a distance 2ℓ in a 3D stochastic nutrient distribution which has the same correlation length in each dimension, where ℓ is the length of the each dimension in the virtual world. The results in Fig 6 show that the effective correlation length, experienced by the bacterium, is essentially unchanged compared to a 1D environment, confirming that our 1D simulations should be representative of simulations in a 3D nutrient environment.
We next consider the optimal values of the response parameters. In the adaptive response, τ (the memory length) is very short (Table 1), allowing the bacterium to quickly adjust to small displacements from attractant optima. β, the rate at which the bacterium stops tumbling, is quite large, corresponding to short-lasting tumbles characteristic of chemotaxis in E. coli [22]. High sensitivity (large |A| and |B|) is necessary for the bacterium to respond to small differences in attractant concentration characteristic of small displacements from the top of an attractant peak. High sensitivity is responsible for the high α when the attractant is removed at t = 350 in Fig 1c (red curve). Optimal α 0 , the tumbling rate in the absence of attractant (or under constant attractant in case of perfect adaptation), is very low in all strategies, as is evident from Fig 1c. Low α 0 enables bacteria to run persistently in order to find regions with more favorable conditions more quickly. The near-zero value of α 0 removes the possibility of α going below α 0 , eliminating the response to increasing attractant in the adaptive response (Fig 1c, red curve, t = 50).
In the inverted response, the bacterium tumbles more at higher concentrations of attractant (Fig 1c, blue curve). τ is longer than for the adaptive response, allowing the bacterium to integrate over short-term fluctuations. In both inverted and speculator responses, β is much lower Optimal chemotactic responses in stochastic environments than in the adaptive response, resulting in significantly longer tumbles. This is central to the strategies, as it is the persistence of position when tumbling that allows bacteria to stay in regions of high attractant concentration. The sensitivity is lower than in the adaptive response, in agreement with simple models showing that the inverted response is optimized by lower sensitivity [10]. Sensitivity needs to be tailored to the range of attractant concentrations the bacterium experiences: if it is too low, the bacterium will run past high concentrations of attractant; if it is too high, the bacterium will tumble at low concentrations of attractant, never reaching higher concentrations.

Discussion
In this work we describe a new chemotactic strategy, termed the speculator response, in which the bacterium compares the current attractant concentration with a long-term average; if the current concentration is higher than the long-term average, the bacterium tumbles persistently to maintain its position. On the other hand, declines in the current concentration will increase the probability that the bacterium will swim away to a higher peak. By considering stochastic attractant distributions, we show that under slowly-changing but spatially complex attractant concentrations (large T, small L), the speculator response is almost as efficient at co-localizing with attractant as the adaptive response of E. coli (Fig 3). While the adaptive response achieves high fitness by accurately tracking the top of an attractant peak, the speculator response enables the bacterium to explore its environment and find higher peaks more efficiently (Fig 5).
The speculator response closely matches the response observed in wild-type, aerobicallygrown R. sphaeroides (Fig 4). The optimized response parameters from our simulations are in arbitrary units, and cannot be directly compared with those obtained by the fit to the wild-type response (Fig 4). Interestingly, however, the ratio of B to A which quantifies the extent of departure from perfect adaptation (B/A = −1 corresponds to perfect adaptation) is similar between the optimized values obtained from the simulations and the response observed in aerobically-grown R. sphaeroides (−0.86 and −0.82, respectively). Furthermore, we can acquire a rough estimate of the ratio of τ S /τ A (where τ S and τ A are the values of τ in the speculator and adaptive responses) by comparing the values of τ in aerobically-grown R. sphaeroides (Fig 4) and wild-type E. coli [23]. This ratio (71/1 = 71) is of similar order of magnitude to the ratio for the optimized simulated responses (43/0.20 = 215), despite the multitude of differences between wild-type R. sphaeroides and E. coli. The small value of τ A is necessary for the adaptive strategy to minimise overshooting of an attractant peak and maintain a close position around the peak in order to maximise attractant uptake. This is true even for large correlation lengths, where it might be expected that a longer memory length would be advantageous in being able to detect and climb shallow attractant gradients, as found in previous studies [24][25][26]. However, our measure of fitness encompasses both the effective velocity up gradients, as well as the bacterium's ability to maximise exploitation of a peak in a changing stochastic fashion, so our findings suggest that overall the latter dominates fitness, resulting in a smaller τ for the adaptive strategy.
The optimized adaptive response possesses high sensitivity (large |A| and |B|; Table 1) consistent with experimental results from E. coli [27]. Furthermore, β, the rate at which the bacterium stops tumbling, is high, which is in line with the short tumbles observed in real bacteria [22]. In contrast to real bacteria, the optimized bacteria have a lower α 0 , and thus tumble less than real bacteria when attractant concentration is increasing (Fig 1c, red curve, t = 50). This may be an artifact of modeling chemotaxis in a one-dimensional environment: in a threedimensional environment, tumbling may assist the bacterium in finding even steeper paths to attractant optima.
Our model does not take into account the motility-associated energy costs of the different chemotactic strategies. For instance, R. sphaeroides does not actively tumble, but rather stops running and lets rotational diffusion generate the re-orientation, reducing the costs of strategies that involve longer tumbles [28]. The speculator response therefore might have emerged partly because R. sphaeroides uses rotational diffusion to achieve tumbling. Alternatively, rotational diffusion might have emerged in response to the bacterium using a strategy that involves long tumbles. In addition, by necessity we are confined to a relatively small range of T and L; other conditions might exist (such as larger T and smaller L) that would favor the speculator response even more.
Our approach differs from that of other studies in that we consider realistic attractant distributions and extended tumbling times. The latter is essential for the speculator response to work as it allows bacteria to maintain their position in regions of high attractant concentration. Previous studies [20,29,30] modeled tumbles as instantaneous after chemotaxis in E. coli [22], however, experimental evidence from other bacterial species shows longer tumbling times [31,32]. Our results add to the growing body of evidence that extended tumbles allow for emergence of other modes of chemotaxis [9,10,33].
Most studies to date considered chemotaxis in response to step functions or simple gradients [16]. While this is important for our understanding of the basic mechanisms of chemotaxis, we should recognize that chemotactic strategies were inevitably shaped by the environments the bacteria inhabited. For example, studies in marine bacteria unearthed specific adaptations to marine environments [14,15,17], highlighting the need to study chemotaxis in the context of realistic attractant distributions. Here, we propose a model of a stochastic attractant distribution which allows us to compare the performance of various chemotactic strategies under different environments and study how optimal properties of chemotactic responses change as a function of environmental conditions. This can also help us characterize the environmental conditions based on the strategies that have evolved. Further characterization of natural environments [12] will allow theorists to construct more detailed attractant distributions and advances in microfluidics technologies will enable these environments to be reconstructed in laboratory settings [16].
Finally, although our results relate to the chemotaxis of bacteria, as search problems in temporally and spatially heterogeneous environments are very common, these results could have very broad application to fields outside of microbiology. For example in ecology, foraging animals face a similar dilemma of whether to exploit a currently available food source or prey or abandon it in hope of a more abundant opportunity in the future [34], or in general in optimisation, where memory of past solutions tried may inform the local search strategy.

Stochastic attractant distribution
We generate our stochastic attractant distribution by summing over cosine and sine modes with different mode numbers p so that the concentration at position x and time t along the virtual world is calculated as cðx; tÞ ¼ max 0; where X p (t) and Y p (t) are stochastic weights, ℓ is the length of the one-dimensional virtual world (ℓ = 100), ξ p (x) = 2πpx/ℓ and p Ã = ℓ/L is the largest mode included in the sum above. X p (0) = Y p (0) = 0 for all ps, and are updated at intervals of Δt c = T/100 according to: where η p (t) is a white noise Gaussian random process (hη p (t)η q (t 0 )i = δ(t − t 0 )δ pq ), generated by a random number sampled from a normal distribution with mean 0 and variance 1. A similar expression is used for Y p (t + Δt c ). By construction, this results in a Markov process with correlation time T and approximate correlation length L = ℓ/p Ã .
To investigate the effect navigation in 3D stochastic environments would have on the ability for a bacterium to find attractant compared to 1D (Fig 6), we create a static attractant environment analogous to Eq 2, but in 3D: cðx; y; zÞ ¼ X p Ã p;q;r fX pqr cos ðx p ðxÞ þ x q ðyÞ þ x r ðzÞÞ þ iY pqr sin ðx p ðxÞ þ x q ðyÞ þ x r ðzÞÞg ð4Þ where ξ p (x) = 2πpx/ℓ, ξ q (y) = 2πqy/ℓ, and ξ r (z) = 2πrz/ℓ, X pqr and Y pqr are independent Gaussian random variables for each combination of the integers p, q, r (hX pqr X p 0 q 0 r 0 i = δ pp 0 δ qq 0 δ rr 0 , hY pqr Y p 0 q 0 r 0 i = δ pp 0 δ qq 0 δ rr 0 and hX pqr Y pqr i = 0) and i ¼ ffiffiffiffiffiffi ffi À 1 p is the unit imaginary number. This gives by construction a random isotropic attractant distribution with the same correlation length in each dimension L x = L y = L z = L = ℓ/p Ã .

Chemotaxis
The attractant distribution is equilibrated for a period of at least T. Before a bacterium is introduced, α is initialized based on the equilibrated attractant distribution. The bacterium is then released and the state of the bacterium (whether it is running or tumbling) is updated every Δt B = min(T, L/v, τ /20) where v is the speed of the bacterium when running (v = 1). A Monte Carlo scheme is used to decide whether the bacterium starts tumbling (running) given that it was running (tumbling) previously, assuming first-order dynamics of a 2-state system. When the bacterium stops tumbling, it starts running left or right with equal probability.

Optimization
Mutagenesis followed by selection constitute one generation of the optimization. In the first generation, all response parameters are initialized randomly from a uniform distribution between 0 and 1 (but see below). In the adaptive response, only B is initialized and mutated, A is set to −B (at T = 10 4 , B is initialized between 1 and 10). In the speculator response, B is initialized randomly between 0 and −1. In every generation, one response parameter is chosen at random and mutated. Parameters are mutated on a log scale by a transformation exp(log e (a) + r) where a is the parameter being mutated and r is a random number sampled from a uniform distribution between −0.2 and 0.2. Further constraints on response parameter values are imposed for reasons of computational tractability: α 0 > 10 −3 in adaptive and inverted response, A > exp(−1) and |B| > exp(−1) in speculator response, τ > 0.01 in adaptive response.
After mutagenesis, the fitnesses of responses described by the wild-type and mutant response parameters are determined. This is achieved by letting 10 identical wild-type and 10 identical mutant bacteria explore the virtual world with the stochastic attractant distribution. Each of the 10 wild-type bacteria is subjected to an attractant distribution initialized with a different random seed; the attractant distributions are then re-used for the 10 mutants. (As the attractant distribution is stochastic, estimates of response fitness are stochastic too. This scheme of competing the wild-type and mutant with the same attractant distributions is thus used to ensure that lucky mutants do not fix.) Each of the bacteria is run until it experiences 50T attractant units. D w,i (D m,i ) denotes the time it took the i-th wild-type (mutant) to experience the specified amount of attractant. Fitness of response k = {w, m}, F k , is then calculated as Th 1 D k;i i i averaged over the bacteria. As mentioned in the main text, the way fitness is characterised, i.e. the time it takes to experience a critical amount of nutrient, despite fluctuations in space and time, includes an effective measure of robustness to these fluctuations over the length scale L and timescale T.
Once fitnesses are determined, the probability of acceptance of the mutation, p m , is calculated using the Metropolis-Hastings algorithm: p m = 1 if F m ! F w and p m = exp((F m − F w )/(UF w )) otherwise. U, the temperature, is constant at 0.005. Simulations are run until the fitness stops increasing and stays constant for at least 600 generations. 3 replicate simulations are run for each chemotactic strategy and combination of T and L.
The software implementing these simulations can be found at https://github.com/kotanyi/ chemotaxis-evolution. In the adaptive response, the bacterium swims up attractant gradients and tumbles when it experiences a decrease in attractant concentration. This leads to an oscillatory behavior around peak maxima. In the inverted response, tumbling rate increases with increasing attractant concentration. Response sensitivity is optimized such that the bacterium tumbles most persistently at attractant concentrations which correspond to typical values at attractant maxima. However, this means that the bacterium can get stuck at sub-optimal concentrations on large peaks. The speculator response compares the current concentration of attractant with a long term average. If the current concentration is greater than this average, the bacterium tumbles more. If the current concentration is lower than the average, or declining, the bacterium swims away, leaving the peak to search for higher attractant concentrations. The bacterium will typically run past peaks if their amplitude is lower than the peak it just left. (MP4) Investigation: MG.