Noise Contributions in an Inducible Genetic Switch: A Whole-Cell Simulation Study

Stochastic expression of genes produces heterogeneity in clonal populations of bacteria under identical conditions. We analyze and compare the behavior of the inducible lac genetic switch using well-stirred and spatially resolved simulations for Escherichia coli cells modeled under fast and slow-growth conditions. Our new kinetic model describing the switching of the lac operon from one phenotype to the other incorporates parameters obtained from recently published in vivo single-molecule fluorescence experiments along with in vitro rate constants. For the well-stirred system, investigation of the intrinsic noise in the circuit as a function of the inducer concentration and in the presence/absence of the feedback mechanism reveals that the noise peaks near the switching threshold. Applying maximum likelihood estimation, we show that the analytic two-state model of gene expression can be used to extract stochastic rates from the simulation data. The simulations also provide mRNA–protein probability landscapes, which demonstrate that switching is the result of crossing both mRNA and protein thresholds. Using cryoelectron tomography of an E. coli cell and data from proteomics studies, we construct spatial in vivo models of cells and quantify the noise contributions and effects on repressor rebinding due to cell structure and crowding in the cytoplasm. Compared to systems without spatial heterogeneity, the model for the fast-growth cells predicts a slight decrease in the overall noise and an increase in the repressors rebinding rate due to anomalous subdiffusion. The tomograms for E. coli grown under slow-growth conditions identify the positions of the ribosomes and the condensed nucleoid. The smaller slow-growth cells have increased mRNA localization and a larger internal inducer concentration, leading to a significant decrease in the lifetime of the repressor–operator complex and an increase in the frequency of transcriptional bursts.


Supporting Text S1
Supporting Methods

Derivation of rate constant relationship for non-cooperative ligand binding
Sequential binding of two ligands L to an enzyme with two binding sites E 2 can be described by the following two equations: (1) From these one can obtain the following mass balance equations: Solving for the steady state, by setting these equations equal zero, and using the expression for the total number of enzyme molecules E 2T = [E 2 ] + [LE 2 ] + [L 2 E 2 ], it can be shown that the fraction of enzyme binding sites occupied by a ligand molecule is given by where the equilibrium dissociation constants K d = k of f kon and K d2 = k 2of f k2on have been used. The concentration of ligand resulting in half of the binding sites being bound is then Cooperativity can be described using the Hill equation, f B = Equating Equations 3 & 4 it becomes apparent that Equations 1 & 2 are non-cooperative only when K d2 = 4 · K d . If the probability of ligand unbinding is independent of the number of inducers bound, the rate of unbinding when two ligands are bound will be twice that when a single ligand is bound:

Lattice microbe reaction operator
Models of spatially inhomogeneous stochastic chemical systems are often described using the reactiondiffusion master equation (RDME). In the formalism of the RDME the system's volume is divided into a set of discrete subvolumes (commonly a lattice in numeric implementations) with the chemical species in the system being distributed amongst the subvolumes. Reactions occur only between species within a subvolume and each subvolume is considered to be well stirred such that the reactions follow standard kinetic theory (constant probability per unit time). Each subvolume is then well-described by the chemical master equation (CME). Diffusion is accounted for by random transitions of species between subvolumes also with constant probability per unit time. The time evolution of the probability P for the system to be in a specific configuration x (counts of every chemical species for each subvolume) starting from a given initial state x 0 at t 0 obeys the RDME: Here, x α v is the number of molecules of species α (α = 1, · · · , N ) in subvolume v (v ∈ V ). R is the number of reactions. a r is the reaction propensity for reaction r in given the state of a subvolume x v . S is the stoichiometry matrix. d α ij is the diffusive propensity for one molecule of species α to diffuse from subvolume i to j.
The RDME is difficult to analytically study for even simple systems, and instead is most often sampled using a Monte Carlo approach. Many independent realizations of the system's path through the probability space are computationally calculated and then combined to reconstruct the probability density function (PDF) of the RDME. Several implementations of RDME samplers have been published [1][2][3][4][5][6]. Our approach is focused on sampling the RDME for systems that approximate the physical aspects of an individual cell, therefore we call it the lattice microbe method. Diffusion and reaction rate constants can be spatially dependent to allow the modeling of cellular features such as the membrane, nucleoid, and other cellular structures.
Simulating spatially resolved stochastic systems is computationally costly because of the large number of calculations associated with tracking the diffusion of many particles. We take a brute-force approach by adopting a parallelizable RDME method involving timesteps and nearest neighbor transitions and perform the diffusion calculations using the GPU [7]. Uniquely, our approach is of sufficient performance to permit the inclusion of in vivo crowding into the model, by constructing an approximation of the crowded cytoplasm using reflective sites.
Computing a single realization of the system's RDME starts with the discretization of space into a threedimensional cubic lattice of spacing λ and time into time steps of length τ . Then, starting from an initial state at t 0 , the state of the system at times t τ , t 2τ , t 3τ , . . . is sequentially calculated. Each time step is broken into reaction and diffusion operations. The reaction operator R calculates any changes in the chemical species present in each subvolume over the time step and the diffusion operator D calculates any redistribution of the species to and from neighboring subvolumes: where N (α, r, t) gives the number of molecules of species α present in lattice site (subvolume) r at time t. The implementation of the diffusion operator has previously been described [7], here we focus on the reaction operator.
Chemical species can react during a reaction operation according to defined stoichiometry and kinetic rates. We assume a well-stirred environment in each subvolume during a timestep and calculate the probability per unit time of a reaction occurring in a Gillespie-like manner [8]. For the current study we constrained the time steps and reaction volumes such that there was a low probability (≤5%) for any particular reaction to occur in a subvolume during a given time step. This allowed us to assume that a maximum of one reac-

Lattice coarse graining technique
The lattice microbe method simulates reaction-diffusion processes on a three-dimensional lattice representing the cellular environment. A key first step in performing a lattice microbe simulation is defining the three-dimensional cellular model, including the membrane, cytoplasmic obstacles, nucleoid, and other cellular features. The initial model is constructed in continuous space and then mapped onto a lattice at a given resolution, typically 2-16 nm. The model building algorithm uses a computational geometry data structure known as a kd-tree [9] to store the three-dimensional model. The kd-tree data structure is a binary search tree that efficiently stores objects in three-dimensional space and allows quick placement of new objects into unoccupied volume. This becomes particularly important when randomly placing the cytoplasmic obstacles, which at 50% crowding by volume amount to ∼1.5 million objects. Features in the cellular environment are mapped to the diffusion and reaction properties of lattice sites, such that particles diffuse and/or react differently based on the cellular architecture. For example, cytoplasmic obstacles are mapped as a group of lattice sites into which diffusion of particles is prohibited.
When the dimensions of a lattice site are larger than the size of the object being mapped onto the lattice, there is no longer a one-to-one or one-to-many mapping of objects to lattice sites. In that case, it is necessary to create a coarse-grained representation of the cell that omits the detailed description of the objects while preserving their overall effect. Many of the cytoplasmic obstacles in the in vivo Escherichia coli model are much smaller than the 16 nm lattice size used in the long-time simulations presented in the main text. The key contribution of these obstacles to the simulations is the effect they have on diffusion of particles: their presence causes diffusion in the cytoplasm to be anomalously subdiffusive. As such, the cell models for the long-time simulations were coarse-grained to 16 nm resolution in such a way as to preserve the anomalous behavior of proteins diffusing in the cytoplasm.
Anomalous diffusion can be phenomenologically described by the equation r 2 = 6Dt α , where r 2 is the mean-square displacement, D is the diffusion coefficient, t is the time, and α is the anomalous exponent.
For normal diffusion α = 1 and for subdiffusion α < 1. As shown in the main text, for a protein diffusing in a crowded environment α is one at short time scales, monotonically decreases to a given minimum value dependent on the packing density, and then monotonically increases back to one at long time scales. A coarse-graining function that preserves anomalous diffusion, then, should minimize the error in the minimum value of the α exponent.
The coarse-graining function used in this study first determines the fraction of each site's three-dimensional volume that is occupied by obstacles in the continuous space model. Then, all of the lattice sites with occupancy greater than a specified cutoff are considered occupied and the remainder empty. To determine the proper cutoff value, simulations are performed using a range of cutoff values and the α diffusion exponent measured for each. The cutoff that minimizes the error in α relative to the expected value becomes the coarse-graining parameter for that particular lattice occupancy and spacing. For 16 nm lattice spacing and 50% packing using the in vivo obstacle distribution reported in the main text, the best cutoff was 0.19, i.e.
all lattice sites with more than 19% volume occupied in the continuous space model were marked as obstructed. This corresponded to 37.6 % of the total number of sites in the coarse-grained model. This coarse graining method did result in a shift of the anomalous minimum to greater time scales, but the agreement with the α value at the minimum was good.

Supporting Video Legends
Video S1: Simulated colony of E. coli cells responding to inducer. Video composite of trajectories from six spatial PFB+IV simulations at 15 µM inducer. Yellow circles are LacY proteins and red circle are mY mRNA molecules. Two cells begin the process of switching to the induced state.
Video S2: Trajectory of a PFB+IV+CET cell responding to inducer. Visualization of a single slow-growth CET modeled cell responding to 15 µM inducer. Gray spheres are ribosomes and the blue region the nucleoid. Yellow circles are LacY proteins and red circles are mY mRNA molecules. The repressor-operator complex is green and the free operator is white.