Sources of Variability in a Synthetic Gene Oscillator

Synthetic gene oscillators are small, engineered genetic circuits that produce periodic variations in target protein expression. Like other gene circuits, synthetic gene oscillators are noisy and exhibit fluctuations in amplitude and period. Understanding the origins of such variability is key to building predictive models that can guide the rational design of synthetic circuits. Here, we developed a method for determining the impact of different sources of noise in genetic oscillators by measuring the variability in oscillation amplitude and correlations between sister cells. We first used a combination of microfluidic devices and time-lapse fluorescence microscopy to track oscillations in cell lineages across many generations. We found that oscillation amplitude exhibited high cell-to-cell variability, while sister cells remained strongly correlated for many minutes after cell division. To understand how such variability arises, we constructed a computational model that identified the impact of various noise sources across the lineage of an initial cell. When each source of noise was appropriately tuned the model reproduced the experimentally observed amplitude variability and correlations, and accurately predicted outcomes under novel experimental conditions. Our combination of computational modeling and time-lapse data analysis provides a general way to examine the sources of variability in dynamic gene circuits.

Synthetic gene oscillators are small, engineered genetic circuits that produce periodic variations in target protein expression. Like other gene circuits, synthetic gene oscillators are noisy and exhibit fluctuations in amplitude and period. Understanding the origins of such variability is key to building predictive models that can guide the rational design of synthetic circuits. Here, we developed a method for determining the impact of different sources of noise in genetic oscillators by measuring the variability in oscillation amplitude and correlations between sister cells. We first used a combination of microfluidic devices and timelapse fluorescence microscopy to track oscillations in cell lineages across many generations. We found that oscillation amplitude exhibited high cell-to-cell variability, while sister cells remained strongly correlated for many minutes after cell division. To understand how such variability arises, we constructed a computational model that identified the impact of various noise sources across the lineage of an initial cell. When each source of noise was appropriately tuned the model reproduced the experimentally observed amplitude variability and correlations, and accurately predicted outcomes under novel experimental conditions. Our combination of computational modeling and time-lapse data analysis provides a general way to examine the sources of variability in dynamic gene circuits.

Author Summary
A goal of synthetic biology is to design genetic circuits using mathematical models that predict circuit function. However, various sources of noise impact gene regulation in different ways. This hinders the development of accurate mathematical models, especially when single-cell accuracy is required. Here, we first experimentally characterize the noisy dynamics of a synthetic gene oscillator at the single-cell level. Then, using measurements obtained from the experiments, we develop a minimal computational model that correctly predicts the statistical behavior of single cells within a growing colony. Our method can be Introduction Random fluctuations in gene networks have a variety of origins: e.g. small molecule numbers within cells [1][2][3][4][5][6], fluctuations in the environment [7,8], spatial heterogeneity [9], or the cell cycle [10]. A number of experimental and theoretical studies have examined the impact of noise on gene networks in equilibrium. Frequently, such studies focus on decomposing fluctuations into an intrinsic component which affects individual genes independently, and an extrinsic component which impacts all reactions within the cell or population [3,8,[11][12][13][14][15]. However, different sources of extrinsic and intrinsic noise can have distinct impacts on network dynamics. For instance, metabolic and environmental fluctuations can affect genetic oscillations differently than variability induced by cell division, although all three can be characterized as extrinsic noise sources. An approach that identifies the effect and origin of different fluctuation sources would allow us to develop more accurate computational models of genetic circuits, and better understand the processes that affect their function.
Here, we present a combined experimental and theoretical approach to identify the sources of noise in a synthetic dual-feedback oscillator [16]. We used microfluidic devices [17] to track the amplitude and period of oscillations in individual cells through multiple cell cycles. By following the entire lineage of progenitor cells, we were able to characterize fluctuations within and co-fluctuations between cells. We found that the period of oscillations varied little, while the amplitude varied considerably in time and across the population. Sister cells were highly correlated upon division, but these correlations decayed in time.
To explain the mechanism behind these observations, we introduced a series of increasingly detailed computational models. We started with a deterministic model to estimate parameters and then used a modified discrete stochastic simulation algorithm that accounted for transcriptional delay to describe fluctuations due to small molecule numbers [18][19][20][21][22]. However, this model was unable to reproduce the amplitude variability and the correlation between sister cells simultaneously. Only by including various other extrinsic noise sources was the model able to explain both the amplitude variability and correlation observed experimentally. By increasing model complexity in steps we obtained a minimal model that explained the data. Importantly, our model predicted the level of amplitude fluctuations and correlations in a set of experiments we did not use in fitting it.
Our analysis showed that different sources of noise must be taken into account in order to explain the observed temporal and cell-to-cell amplitude variability and correlation in our population of synthetic oscillators. This approach to building minimal predictive models is quite general. Gradually increasing complexity by adding physiologically plausible sources of noise leads to a simple model that can explain the data. Subsequent cross-validation on separate experiments ensures that the model was not overfit. We have thus elucidated sources and consequences of variability within genetic oscillators, and provided a framework for constructing predictive models of complex dynamical gene networks.
promoter [23], which is up-regulated by AraC in the presence of arabinose and repressed by LacI in the absence of isopropyl-β-D-1-thiogalactopyranoside (IPTG). In addition, all three genes are tagged with the LAA version of the ssrA degradation sequence [24]. The presence of both the positive and negative feedback loops has been shown to support robust oscillations in the circuit [16,25]. Since all genes are under the control of identical promoters, the concentration of GFP and the resulting fluorescence level provide a measure of the level of transcription in the oscillator.

Amplitude variability and correlation in the oscillator
To measure the time-dependent GFP concentrations in individual cells, we used custom designed microfluidic devices that enable time-lapse fluorescence microscopy [26,27]. We acquired phase contrast and fluorescence images every three minutes for three hours. We next segmented the images and tracked each cell and its fluorescence across time, keeping track of all lineages as cells grew and divided. At cell division, we kept track of each sister cell separately. Starting from a single cell, we thus obtained a branched trajectory: After the first division the trajectory split into two branches, and each successive division increased the number of branches by one. The resulting "lineage fluorescence trajectories" thus contained information from all descendants of the cell or cells initially placed in the trap, and about the relation between all descendants. Fig 1B shows the lineage trajectory for a single initial cell, illustrating the branching of trajectories at each cell division. Oscillations were maintained throughout the cell lineages.
Although all cells within a lineage are clonal copies of the initial cell, we observed large variability in oscillation amplitude (as measured by peak height) and smaller variability in oscillation period (Fig 1C-1D). Variability in period resulted in the divergence in the phase of the traces obtained from sister cell trajectories across the lineage. The average period of the entire population was 41 min, and variability in oscillation period (CV = 0.11) was small compared to the variability in amplitude (CV = 0.47). Computing the statistics for each lineage separately yielded similar results (see Methods). To examine cell-to-cell co-variability in gene expression, we computed the Pearson correlation coefficient, ρ, of fluorescence between daughter cells t minutes after division using all pairs of daughter cells in a lineage (on average 175 pairs per lineage). In the first frame after division, fluorescence of two daughter cells was nearly identical (ρ X 1 ,X 2 (3) = 0.98, Fig 2A), indicating that protein partitioning was highly symmetric. This is consistent with previous results demonstrating that protein numbers after division follow a binomial distribution [28]. As time progressed, the correlation decreased as the trajectories of the daughter cells diverged. For instance, 24 min after cell division the correlation in fluorescence between sister cells decreased to ρ X 1 ,X 2 (24) = 0.67 ( Fig 2B). Fig 2B also shows the correlation function of the 6 recorded lineages and their mean.

Model of the dual-feedback oscillator: Intrinsic noise
To construct a mathematical model of the dual-feedback oscillator, we started with the following set of deterministic delay differential equations describing the dynamics of LacI (r), AraC (a), immature GFP (g), and mature GFP (G) where x τ = x(t − τ) for x in {r, a, g}, and  is the composite Hill function describing the activity of the hybrid promoters as a function of activator and repressor concentrations. Here α r , α a and α g are the maximal production rates; τ r , τ a , and τ g are the transcriptional delay times; γ r , γ a , γ g , γ G , and R 0 are the Michaelis-Menten parameters for ClpXP mediated proteolysis; C a , and C r are the concentrations needed for halfmaximal induction and repression. Subscripts refer to repressor (r), activator (a), and immature GFP (g). In addition, f is a unitless measure of the strength of the activation by a compared to basal production; λ is the maturation rate of GFP; and β is the dilution rate due to cell growth. We fit this deterministic model to experimental data to estimate the parameters. To do so we fit the shape of the solution as well as the period (see Methods).
In experiments the exact number of proteins within each cell is unknown. However, we can tune the protein number in the model by appropriately scaling the system in Eqs 1-4 without affecting the dynamics. More precisely, when we scaled each variable using the transformation x(t)!x(t)/O as well as appropriately changing the parameters (see details in Methods), the protein numbers were changed, but the dynamics of the corresponding deterministic system in Eqs 1-4 had the same form. The parameter O can be interpreted as unitless volume when other parameters are assumed to be fixed, or as a scaling parameter when the volume is fixed. Here we used the second interpretation.
To investigate the impact of noise due to finite size effects, we used a delayed stochastic simulation [18][19][20][21]29] with rates obtained from the deterministic fit as propensity functions in the reactions of the associated birth-death process (see [22] and Methods). The parameter O directly controls the number of proteins in the system; for instance, doubling or tripling O, does the same to the number of proteins. Smaller values of the parameter O correspond to smaller numbers of proteins and hence larger intrinsic noise. For large values of O the model exhibits robust oscillations (e.g. O = 4 in Fig 3A), but amplitude variability is smaller than observed in experimental data. As expected, decreasing O led to increased amplitude variability (Fig 3B), matching the experimentally observed value when O = 0.9 (See Fig 3C). However, at this value of O the coefficient of variation (CV) of the oscillation period was 5 times higher than in experiments. Thus, a model that only included fluctuations due to small protein number could not account for the experimentally observer variability in amplitude and period.

Lineage simulations and cell growth variability
We next asked if variability due to cell growth and division, and the associated random partitioning of proteins between daughter cells, could account for the discrepancies between our computational model and the experiments. To investigate this, we simulated the entire lineage of a progenitor cell (see Methods for details). In the stochastic version of Eqs 1-4 we accounted for cell growth only by incorporating dilution at rate β. Here, we instead accounted for cell growth explicitly by including an equation for cell volume, _ V ¼ bV, and scaling the production and proteolysis parameters to account for the doubling of molecular machinery such as plasmids and proteases ( Fig 4A). As before, we simulated the system using the delayed stochastic simulation algorithm. When the volume of a cell reached the size at which division occurred, we replaced it with two daughter cells with half the volume, and partitioned all proteins (including those already created and the immature protein in the stack of delayed events) according to a binomial distribution [28]. We repeated these steps for each sister cell, thus simulating an entire lineage ( Fig 4B). Furthermore, since we modeled cell growth explicitly, and cell growth and division exhibit variability [30], we allowed β to vary from cell to cell as well as the size required for division, V max . We estimated the distributions of β and V max from experiments and used these estimates in simulations (see Methods). Thus, in addition to intrinsic noise, our lineage simulations included noise due to: 1) variability in growth rates between cells; 2) variability in size at division; and 3) protein partitioning at cell division. Fig 5A and 5D show typical simulations of lineage trajectories for two different values of O. The model matched experimentally observed amplitude variability most closely when O = 0.5 ( Fig 5B). However, despite the close agreement in amplitude variability, lineage trajectories were qualitatively different from those observed experimentally ( Fig 1B). We therefore compared correlations in fluorescence intensity between sister cells after cell division between simulations and experimental data. Simulations with O = 0.5 resulted in correlations that decayed faster than observed experimentally ( Fig 5C). Decreasing intrinsic noise by setting O = 1.0 ( Fig  5F), produced a good match. However, while the model oscillated robustly, intrinsic noise was now too small to match the experimentally observed amplitude variability ( Fig 5E).
We therefore concluded that intrinsic noise and fluctuations in cell growth and protein numbers after cell division could not explain the experimentally observed variability. The level of intrinsic noise required to match the experimentally observed amplitude variability resulted in phase diffusion that was too fast, and correlations between sister cells that decayed too quickly. If we tried to match correlation decay, noise in our simulations was too small to reproduce the experimentally observed variability in amplitude.

Parameter variability
Another source of variability are the random fluctuations in the cellular microenvironment and cellular resources necessary for gene expression [31][32][33][34]. This type of extrinsic noise can be viewed as variability in the parameters that describe protein creation. For instance, partitioning effects can result in fluctuations in protein numbers (as described in the previous section), but also increase variability in the molecular machinery responsible for protein production. If plasmids or enzymes needed for protein production are unevenly distributed between daughter cells upon division, transcription rates within the two daughter cells will differ. Parameter variability within gene circuits has been modeled previously in several ways. In the absence of cell division, Mondragón-Palomino et al. randomly sampled parameters for each realization of an oscillator model [35]. If cell division is explicitly modeled, however, parameters might change significantly only when a cell divides [36], or fluctuate continuously between divisions [14,15].
We modeled parameter variability by sampling the value of parameters at each cell division. For a parameter subject to variability, for example α r , we sampled its new value after cell division from a distribution centered at the value of the parameter of the mother cell. The coefficient of variation of this distribution, Γ = σ α r /hα r i, can then be used to tune parameter variability. Also, we incorporated a homeostatic mechanism in this distribution to ensure parameters did not diverge (see Methods for details). We focused on the variability of α r , α a , and α g (Eqs (1-3)). These parameters were chosen because the activity of molecular machinery such as plasmid copy number, ribosomes, and energy in the cell, are expected to fluctuate more than parameters representing binding affinities, Hill coefficient, and degradation rates. All other parameters were fixed. We also explored alternative models of parameter variability, such as sampling of parameters independently of the values of parameters of the mother cell. We found that lineage dependence was necessary to be able to estimate O and Γ from experiments (see Methods).
In simulations we varied noise due to finite protein numbers by changing O, and parameter variability by changing Γ. We computed the coefficient of variation of the amplitude for a range of O and Γ and compared it to the coefficient of variation seen in experiments. Fig 6A shows the differences in the coefficient of variation of the amplitude between simulations and experiments, |CV sim − CV exp |, as O and Γ were varied in the model. We also computed the correlation function from simulations, corr sim (t) (t = time after cell division), and compared it to the experimentally obtained correlation functions, corr exp (t). Fig 6B shows the differences in correlation between simulations and experiments, |corr exp − corr exp | (using the L 2 norm), as O and Γ were varied in the model. As noted previously, in the absence of parameter variability, the experimentally observed amplitude variability can be matched in simulations with high intrinsic noise (small O). However, this same amplitude variability is observed over a range of Γ. Indeed, a decrease in intrinsic noise (increase in O) can be compensated by increasing parameter variability (Fig 6A, gray curve). Similarly, a good match for correlations in fluorescence can be obtained by appropriately tuning O and Γ (Fig 6B, gray curve). Matching concurrently the experimentally observed amplitude variability and correlation functions allows us to determine both O and Γ.
To explore further the dependence of amplitude variability and correlation on intrinsic and extrinsic noise, we used a simple oscillator model that captures the main features of the full model described above: where r is the amplitude of a realization of the oscillation, r 0 is the mean amplitude of the oscillations (in the absence of intrinsic noise), ρ determines the stability of the oscillator, θ is the phase, T is the period, dξ i 's are independent standard white noise processes with zero mean and unit variance, and O controls the level of intrinsic noise (see Methods). Variability in r 0 was modeled as in the full model using the coefficient of variation Γ as the parameter that controls extrinsic noise. We observed that intrinsic noise had a stronger effect on correlation than on amplitude variability; thus, large intrinsic noise can decorrelate cells faster as well as increase period variability. On the other hand, extrinsic noise has a stronger effect on amplitude variability. The same level of amplitude variability can be attained by decreasing intrinsic noise and increasing extrinsic noise (see Methods for details). By varying intrinsic and extrinsic noise it is thus possible to achieve large amplitude variability and high, persistent correlations between sister cells at the same time.
To validate our methodology experimentally, we compared the experimental lineage trajectories ( Fig 7A) with simulated trajectories (Fig 7B) using the values of O and Γ estimated above. We observed that their qualitative behavior was similar; simulations and experiments exhibited robust oscillations with high amplitude variability. The distribution of the amplitudes and correlations in fluorescence show a close agreement between experiments and simulations (Fig 7C  and 7D). Additionally, the coefficient of variation of the period, which we did not use to fit our model, was 0.10 in simulations (Fig 7E), close to the value observed in experiments (0.11).
To test the predictive power of our computational model we performed another experiment in which the concentration of IPTG was reduced from 2 mM to 0 mM. This reduction is known to decrease the period of the oscillator [16], and hence should change the amplitude variability and correlation between sister cells. The period at 0 mM IPTG was approximately 29 min (compared to *41 min in 2 mM IPTG), and amplitude variability and correlations were markedly different (Fig 8). To determine if our model could predict these changes, we decreased the parameter C r in Eq (5) (corresponding to decreasing IPTG) until the model oscillated with the experimentally observed period. Importantly, only the parameter C r , which represents the amount of LacI needed for half-maximal repression of the promoter, was changed, while all other parameters were fixed at the values determined from our previous fit, including O and Γ. This allowed us to cross-validate our model using the experimentally observed variability and correlations at 0 mM IPTG. In experiments, amplitude variability increased, but the oscillations were still robust ( Fig 8A); this was also observed in simulations (Fig 8B). The coefficient of variation of the amplitude was 0.55 in experiments, very close to the value predicted by simulations (0.56). We also compared the probability distribution of the amplitude in experiments and simulations, as well as the correlation functions (Fig 8C and 8D). The predicted distribution of amplitudes and the predicted correlation function matched the experimental data. Additionally, the coefficient of variation of the period was 0.19 in experiments (Fig 8E), close to the value predicted by simulations (0.17). These observations suggest that we did not overfit our model, and that our approach can be used to predict experimental outcomes. We also used the Kolmogorov-Smirnov test to compare the experimental and simulated amplitude distributions in Figs 7 and 8. In both cases they passed the KS test (within 95% confidence bound). The KS test also identified the two simulated distributions as different.

Discussion
Understanding the sources and consequences of noise is key to designing robust synthetic genetic circuits. Oscillators provide an ideal platform for studying noise, as they fluctuate between large and small numbers of proteins. Hence, the relative impact of various noise sources changes with the oscillator's phase. In experiments, we found that the amplitude exhibited high variability whereas the period did not, while GFP levels in sister cells were highly correlated for some time after cell division. Importantly, we determined that using just the amplitude variability to infer the level of intrinsic noise can result in model parameters inconsistent with other dynamical properties, such as correlation between sister cells after cell division. Instead, using a combination of intrinsic noise, cell cycle variability, and parameter variability we were able to obtain a model that better describes the stochastic dynamics of the oscillator. To estimate the impact of these sources of noise it was essential to track cell lineages across several generations. This allowed us to use long term correlations in addition to amplitude variability to distinguish between intrinsic noise and parameter variability.
We used parameter variability to capture multiple possible sources of noise which are not well understood. Other sources of variability that were not investigated were the genetic instability of synthetic circuits and the health of cells (filamentation, for example). However, these are very rare and difficult to examine due to the small number of occurrences, and unlikely to impact our overall conclusions due to the same reasons.
Since intrinsic noise has a fast time scale [28], it can quickly decorrelate the dynamics of sister cells. Other sources of noise can act on longer time scales [28]. This can result in large amplitude fluctuations from cycle to cycle, but a relatively slow decay of correlations between sister cells. Additionally, we found that it was necessary to implement long term correlations in parameter variability using a slow timescale. When our model did not include this feature, we could not find appropriate noise levels to explain amplitude variability and correlation at the same time (see Methods). Thus, if a gene network shows high amplitude variability but sister cells are correlated for an extended time after cell division, then extrinsic noise sources are likely to be the cause of such variability. We found that modeling the perturbation of parameters at cell division can capture these effects. Importantly, this type of perturbation covers sources of noise that affect protein synthesis, including unequal partitioning of cellular resources upon division or differences in cellular energy due to fluctuations in metabolic enzyme concentrations and local carbon source availability. Our method thus points towards a general strategy to identify the sources of noise in gene networks with complex dynamics.
We have examined the impact of different sources of noise in a system with relatively fast dynamics. These same sources could have a different impact in a slowly evolving system. However, our approach of using multiple statistical measures to characterize different aspects of the systems dynamics will work in such situations. For instance different sources of noise drive transitions between the two states of a genetic toggle switch [37,38]. Variability in these transitions likely reflect the internal and external sources of fluctuations that drive the alternation between the states. Different sources of noise likely have a different effect on the statistics of transitions between states [39,40]. Characterizing GFP variability, along with the first passage time distribution, and the decay of correlations in daughter cells could allow us to disentangle the different sources of fluctuations that contribute to this variability.

Microfluidic devices and microscopy
Devices were manufactured as in Hussain et al. (2014). Briefly, a 4" silicon wafer (Silicon Quest, San Jose, CA) was cleaned with acetone and isopropyl alcohol, and then dried with compressed nitrogen. The wafer was coated with SU-8 series photoresist (MicroChem, Newton, MA), then spun for 30 seconds in a spin coater (Brewer Instruments, Roala, MO) to distribute the resist. The wafer was baked at 95°C and let cool to room temperature. The wafer was mounted to the chuck of a mask aligner (SUSS, Germany). The photomask (CAD/Art Services, Bandon, OR) was mounted to the mask aligner, and the wafer was aligned to the photomask. The resist was exposed to UV light for cross-linking. The wafer was baked at 95°C to finalize cross-linking. Uncross-linked resist was removed with SU-8 developer (MicroChem, Newton, MA). The above steps were repeated until the device was completed. The wafer was hardbaked at 150°C to solidify the resist. To ensure PDMS liftoff, the wafer was coated with release agent (((tridecafluoro-1,1,2,2-tetrahydrooctyl)-1-trichlorosilane), Pfaltz & Bauer, Waterbury, CT) for 5 minutes under vacuum.
Polydimethylsiloxane (PDMS) polymer base and curing agent (Sylgaard 184, Dow Corning, Midland, MI) were mixed in a weigh boat at a 10:1 ratio until completely mixed (*5 min). All bubbles were removed by degassing under vacuum. The mold for the selected device was wrapped in aluminum foil to contain the PDMS. Mixed PDMS was poured onto the wafer, and the degassing process was repeated. The wafer and PDMS were baked at 80°C for 2 hours. The cured PDMS monolith was removed from the wafer, and excess PDMS was trimmed from the monolith. Ports for fluidic connections were punched with a 0.5mm biopsy punch (GE Healthcare, Pittsburgh, PA). Individual chips were cut from the monolith. Chips were sonicated in methanol for 8 minutes. Methanol was poured off, and fresh methanol was added. Another round of sonication was performed for 8 minutes. Chips were baked at 80°C for 30 minutes to remove methanol. Chips were cleaned with tape (3M, St. Paul, MN). #1.5 coverslips (VWR, Radnor, PA) were cleaned with isopropyl alcohol and dried with compressed nitrogen. Chips and coverslips were cleaned in a UV/ozone oven (Jelight Co., Irvine, CA) for 3 minutes. Upon removal, chips were immediately inverted onto the coverslips to bind, and completed devices were baked at 80°C overnight to finalize binding.
Microscope experiments were prepared and performed as described in [27]. The activator (pJS167) and repressor (pZA14) plasmids (obtained from the Hasty lab, [16]) were transformed into ΔaraCΔlacI E. coli cells (JS006, [16]), and plated onto LB/agar plates with 100μg/ ml ampicillin and 50μg/ml kanamycin. Plates were incubated overnight at 37°C and stored at 4°C. 5-10ml overnight cultures were inoculated from transformed cells, and incubated overnight at 37°C with shaking. Prior to experiments, overnight cultures were passed into 50ml fresh LB//100μg/ml ampicillin//50μg/ml kanamycin at a 1:1,000 dilution (i.e. 50μl of overnight culture in 50ml of fresh media). Passed cultures were grown to an OD 600 of 0.1-0.2, and 10ml was pelleted at 1,500×g for 5 minutes. Pelleted cells were resuspended in 10ml of fresh media with 100μg/ml ampicillin and 50μg/ml kanamycin prior to loading into the microfluidic device (Fig 9).

Image analysis
Segmentation of images was done manually. The tracking of cell lineage across images was done using our custom cell-tracking algorithm written in Matlab (available at github.com/alanavc/rodtracker): For each cell, C, in an image we found its position and length, P C = (x, y) and L C , respectively. Then, we found all cells in the next image whose position P next was near P, that is |P C − P next | < d move . The parameter d move equals the maximal movement of a cell from one frame to the next. From the cells satisfying this criterion, we selected cells with length L next similar to L C , that is |L C − L next | < d growth . The parameter d growth equals the maximal expected growth between frames. We also found all pairs of cells whose length L next,1 and L next,2 approximately added up to L C , that is |L C − (L next,1 + L next,2 )| < d growth . With this criteria we created a "lineage graph" where each cell in an image had a set of possible transitions from one image to the next. Each transition corresponded to either movement between frames connecting the cell to a single descendant, or division and movement, connecting a cell to two descendants. This graph was then reduced by removing inconsistent transitions (e.g., a cell can only have one possible location in the next image or two locations if it divided). The reduced graph was further reduced by only selecting transitions that minimized ∑ C (|L C − L next | + |P C − P next |). The final graph consisted of transitions where each cell is associated with a unique cell (if the cell moved) or two cells (if the cell moved and divided). The lineage trajectories were then computed using the lineage graph and the fluorescence data. One resulting lineage trajectory is shown in Fig 1B, and all trajectories are shown in Fig 10.

Fitting parameters of the deterministic model to experimental data
To simplify parameter estimation of the model given in Eq (4) we followed these steps: Since all genes were under the control of identical promoters, and the copy number of the plasmids was approximately 60 for the activator and reporter, and 25 for the repressor [23], we considered α g = 60α (monomer), α a = 60α/2 (dimer), and α r = 25α/4 (tetramer); where α is a parameter that controls the maximal production rate of proteins per plasmid when fully induced. Since the maturation rate does not change the total amount of GFP, but only the smoothness of the oscillations, it was not necessary to estimate this parameter from data. The half-time of maturation of fluorescence proteins ranges from 5 to 40min [41]. We set the maturation rate of GFP to λ = ln(2)/10min −1 (half-time equal to 10min), but as indicated before, our results do not depend on the precise value of λ. The promoter used in the dual-feedback oscillator is a strong promoter [23], and thus had a large maximal transcription rate, for which the delay affects the period but has little effect on its variability [42]. Thus, it was not necessary to estimate the delay parameters from data. We considered τ g = 5min as the transcriptional delay for GFP production. Since araC is a dimer and lacI is a tetramer, we considered larger delays equal to τ a = 5.5min and τ r = 6.0min, respectively. The growth rate was estimated from phase-contrast microscopy to be approximately β = 0.0295min −1 . We also assumed that the degradation of a mature GFP protein is identical to that of immature protein, and set γ G = γ g . Thus, the set of parameters to be estimated is P = (α, γ r , γ a , γ g , f, R 0 , C a , C r ).
To estimate the remaining parameters, we first needed to normalize the data from Fig 10A-10F. This was necessary, as the actual protein levels could not be determined exactly from our fluorescence measurements. First, we rescaled the trajectories so that oscillations with different amplitude are comparable: For the fluorescence trace obtained for each oscillation, G i (t), we found the peak time, t peak i , and consider the trajectory 33 minutes before and after the peak. That is, we considered the time series G i (t) for t peak i − 33 t t peak i + 33. We chose 33 minutes because that includes the previous and next trough in the trajectory completely. We then centered the trajectories at t = 0; that is, we redefined G i (t): = G i (t + t peak ) for −33 t 33 (here "≔" means reassignment, that is the quantity on the left side is assigned the value on the right side). We then normalized the time series using G i ðtÞ≔ G i ðtÞÀm i M i Àm i where m i = min{G i (t): −33 t 33} and M i = max{G i (t): − 33 t 33}. This first normalization ensures that all peaks have height 1 and all troughs have height zero; thus, their magnitudes are now comparable. Finally, to obtain the "mean oscillation", we took the average of all the shifted normalized oscillations:  Table 1, Fig 6), and for validation (Fig 7A-7D). G-H. Lineage trajectories with 0mM IPTG. The coefficient of variation of the amplitude of the two lineages combined is 0.55. The coefficient of variation within each lineage is 0.47 (G) and 0.66 (H), with an average value of 0.56. These two lineages were used for crossvalidation only (Fig 7E-7H), and not for parameter estimation. GðtÞ À minfGðtÞ : À33 t 33g maxfGðtÞ : À33 t 33g À minfGðtÞ : À33 t 33g : For each set of parameters p, we computed the error between simulations and experimental data as where G sim (t, p) is the data obtained from simulations using parameters p (centered at a peak), μ per is the mean period of the experimental data, and sim per is the period of the simulated data. The first term in E(p) measures the error in the shape of the oscillations and the second term measures the error in the period. We then used a gradient descend method to find the set of parameters, p, that minimized E(p). The value of the estimated parameters are given in Table 1.

Stochastic model and intrinsic noise
Based on the deterministic model in Eq (4) Table 1. Parameters used in Eq (4). Parameters marked with * were estimated using experimental data shown in Fig 10A-10F. where ð1 þ a C a Þð1 þ r C r Þ 2 and dðr; a; g; GÞ ¼ To simulate the system we used the delayed stochastic simulation algorithm (delayed SSA). This algorithm is an extension of the standard SSA [18], but for the delayed reactions the production of a protein is kept in a queue for the duration of the delay [19][20][21]29]. In order to control the level of intrinsic noise in Eq (7), we first note that if we rescale all the variables in the deterministic model given in Eq (4) by the same factor, the scale of the system changes, but the dynamics do not. In other words, the levels of each protein can be rescaled so that the dynamics of the system is unchanged. In particular, using the rescaling x ! x/O for each variable transforms the system of Eq (4) into _ r ¼ Oa r h O ½rðt À t r Þ; aðt À t r Þ À Og r r OR 0 þ r þ a þ g þ G À br; _ a ¼ Oa a h O ½rðt À t a Þ; aðt À t a Þ À Og a a OR 0 þ r þ a þ g þ G À ba; Thus, from the parameters in Table 1, we obtain a family of parameters P O ¼ ðOa; Og r ; Og a ; Og g ; f ; OR 0 ; OC a ; OC r Þ; for which the system in Eq (4) exhibits the same dynamics. Note that increasing O increases the number of each protein within the model. This is important, because we cannot infer the number of fluorescent proteins directly from the data. However, the amount of intrinsic noise within a biochemical system depends on the absolute number of the proteins within the circuit. Therefore, O controls the level of intrinsic noise in Eq (7)-increasing O decreases the level of intrinsic noise [43]. The parameter O can be interpreted as a unitless volume when parameters are fixed or as a scaling parameter when the volume is fixed. Here we used the second interpretation. We simulated 500 single-cell trajectories for each value of O in Fig 3B. Incorporating the cell cycle and lineage dependence into the stochastic model To incorporate the cell cycle we account for cell growth explicitly and divide the volume of a cell by two when a cell divides. Thus, a variable for volume, V(t) (satisfying V 0 = βV), was introduced in the model and the cell divides when V(t) = V max , where V max is the volume required for division. At time t = 0, we start with a volume equal to 1, V 0 (0) = 1, and simulate the model More precisely, if y is the state of the system and x is a protein of interest, then a reaction after incorporating cell size in the model [44]. The expression y/V is simply the concentration of proteins. Here V multiplies R because the molecular machinery in charge of production and degradation is being replicated proportionally to cell size (e.g. plasmids). This can also be justified starting from the deterministic model, incorporating cell size explicitly, and then incorporating stochasticity as follows. Consider a variable x that evolves according to where [Á] denotes concentration and β is the parameter corresponding to dilution due to cell growth. Since [x] = x/V and [y] = y/V, we obtain d dt which describes how x evolves incorporating volume explicitly. Finally, the stochastic implementation takes the form x! VRðy=VÞ x þ 1.
At cell division we divide the volume by half and partition all the existing proteins randomly into two groups corresponding to the two daughters using binomial partitioning. We similarly partition the stack of immature proteins in the production queue. We used a binomial distribution for partitioning (each cell is equally likely to receive a protein) [28]. Modeling cell division explicitly not only allows us to model variability caused by cell partitioning (Fig 4A), but it also allows us to simulate lineages (Fig 4B) and compare them directly to lineage trajectories from experimental data.
Modeling cell division explicitly also allows us to incorporate the variability in growth rate (β) and the size required for division (V max ). We estimated the variability from cell to cell of β and V max (Fig 12), and incorporated this information in the simulations. The statistics of V max and β were found from phase-contrast microscopy data. We used a normal distribution for β and a shifted-gamma distribution for V max : β = β 0 (1 + Γ β η 1 ) and V max = V base + η 2 , where Z 1 $ N ð0; 1Þ and η 2 * Gamma(k, θ), where k and θ are the shape and scale of the gamma distribution ( Table 2). We simulated 1000 lineages (8 generations each) for each value of O in Fig 5. We did not include variability in the size ratio of sister cells at division because it was not clear if the differences between sister cells sizes were due to real differences or due to errors in the segmentation of phase-contrast images. We estimated the variability of the size ratio (size of daughter cell/size of mother cell) to have a CV of about 0.09. The true variability between sister cells in cell size would be smaller. We checked the impact of including this overestimated value of size ratio variability in simulations used to obtain

Rate parameter variability
We implemented parameter variability in the model as described in Fig 13. For a reaction x ! Rðy;pÞ x þ 1 where y is the state of the system and p denotes a parameter of interest, the simulation proceeds as follows: When a cell is not dividing, the parameter has a constant value p = p 0 and the reaction rate is x ! Rðy;p 0 Þ x þ 1. When the cell divides, the parameters for the sister cells may be different due to upstream fluctuations or partitioning effects. Thus, sister cell 1 is assigned the parameter p = p 1 and cell 2 the parameter p = p 2 . After cell division the reaction rates are therefore different between the cells. Changes in the parameter p may represent transient changes in plasmid copy number or the availability of enzymes needed for protein production, as well as enzymes needed for protein degradation.
In our simulations for Figs 6 and 7 we modeled the evolution of a parameter of interest p by resampling it upon every division using the recursive relation Here σ p was chosen so that the CV was Γ and hpi denotes the mean value of p. This is equivalent to p i+1 = (qp i + (1 − q)hpi) (1 + Γη), where Z $ N ð0; 1Þ. The parameter q represents the  timescale of a homeostatic mechanism that ensures that the variance of p does not diverge. The initial parameters were themselves sampled from the stationary distribution of Eq (9), and parameters of daughter cells were sampled from a distribution determined by the parameters of the mother cell, p daughter $ N ðqp mother þ ð1 À qÞhpi; s 2 p Þ. The parameters of two sister cells, p daughter 1 and p daughter 2 , were chosen so that their mean was equal to p mother + (1 − q)p mean , that is (p daughter 1 + p daughter 2 )/2 = qp mother + (1 − q)p mean . For example, to model the evolution of the parameter that controls the production rate of the repressor, we initialized the system with a random value of a 0 r , taken from the stationary distribution of the sequence a iþ1 r ¼ ðqa i r þ ð1 À qÞha r iÞð1 þ GZÞ, where Z $ N ð0; 1Þ. Then, after one cell division, the new parameter was a 1 r ¼ ðqa 0 r þ ð1 À qÞha r iÞð1 þ GZÞ. At the next division, we used a 2 r ¼ ðqa 1 r þ ð1 À qÞha r iÞð1 þ GZÞ, and so on. To account for variability in the copy number of plasmids, and enzymes required for protein production, we considered variability in the parameters α x where x 2 {r, a, g} and kept the others fixed. The mean value of the parameters were set to hα g i = 60α, hα a i = 60α/2, and hα r i = 25α/4, where α is given in Table 1. The parameters changed independently at cell division according to the rule a iþ1 x ¼ ðqa i x þ ð1 À qÞha x iÞð1 þ GZÞ, for x 2 {r, a, g}. To model the slow evolution of parameters we used q = e −ln(2)/5 , corresponding to a timescale of 5 cell generations. Fitting amplitude variability and cell-cell correlation yielded the values O = 2.1 and Γ = 0.12 (Table 3).
We also considered simulations where the parameters were taken from the same distribution without changing the mean. That is, we considered p daughter $ N ðhpi; s 2 p Þ, regardless of p mother . Note that this corresponds to using an instantaneous time scale of parameters, q = 0. With this choice, we could not find values of the parameters O and Γ that provided a match with experimental data (Fig 14). On the other hand, using p daughter $ N ðp mother ; s 2 p Þ, still allowed Fig 13. Implementation of parameter variability. A rate parameter p changing at cell division due to partitioning effects. While the cell is not dividing, p has the constant value p 0 and the reaction rate is R(y, p 0 ), where y is the state of the system. At cell division, the parameter p is different for each cell: p 1 for cell 1, and p 2 for cell 2. Cells 1 and 2 then have reaction rates R(y, p 1 ) and R(y, p 2 ). The value of p for the daugther cells will remain unchanged until the next division. doi:10.1371/journal.pcbi.1004674.g013

A toy oscillator model
Here we explore the effects of intrinsic and extrinsic noise on amplitude variability and correlation using the toy oscillator in Eq 6. We also illustrate that our method of estimating the levels of intrinsic and extrinsic noise can be expected to work in general. The stochastic model in Eq 6 can be derived by transforming a 2 dimensional oscillator _ x ¼ f ðx; yÞ þ Z 1 ðtÞ, _ y ¼ gðx; yÞ þ Z 2 ðtÞ to polar coordinates, where η 1 (t) and η 2 (t) are independent noise terms [45]. Using x = rcos(θ), y = rsin(θ) we obtain _ r ¼ f cos ðyÞ þ g sin ðyÞ þ Z 1 cos ðyÞ þ Z 2 sin ðyÞ; _ y ¼ g cos ðyÞ À f sin ðyÞ r þ Z 2 cos ðyÞ À Z 1 sin ðyÞ r : Fig 14. The effect of omitting lineage dependence.
The parameter values at cell division are sampled from the same normal distribution regardless of the value of the parameters in the mother cell. A. Compared to the case of lineage dependence (Fig 6 in main text), the amplitude variability is lower and hence a higher level of parameter variability is needed to match the amplitude variability seen in experimental data. B. The amplitude correlation in simulations can match the experimental data by appropriately tuning the noise parameters Ω and Γ. C. The set of parameters obtained from panels A and B do not overlap, and it is not possible to match simultaneously the amplitude variability and correlation seen in experimental data.
We tested if our method can obtain the true values of O and Γ using only the amplitude variability and correlation. First, we chose a fixed level of intrinsic and extrinsic noise: O 0 = 200 and Γ = 0.15, and then generated simulations from which we computed the amplitude variability and the amplitude correlation: var exp and corr exp (in silico experimental data). Then, using the same analysis done for the full model, we computed the heatmaps of the difference between the amplitude variability from in silico experiments and from simulations (|var sim − var exp |, Fig  15A) as well as the difference in amplitude correlation (|corr sim − corr exp |, Fig 15B). Using these heatmaps, we found the predicted levels of noise to be (O, Γ) % (200, 0.15), precisely the levels of noise used to generate the in silico experimental data (Fig 15C.) We also observe that intrinsic noise has a stronger effect on correlation than on amplitude variability, whereas extrinsic noise has a stronger effect on amplitude variability.
Supporting Information S1 Dataset. Complete dataset. Spreadsheet containing data from each microscope run described in this article. (XLSX)