Figure 1.
A schematic view of the available knobs which can be systematically tuned to change the mRNA and protein distributions. In this work we begin by studying constitutive expression, eliminating the extra layer of complexity associated with transcription factors, and systematically control the RNAP binding affinity through control of the promoter sequence. These results are then generalized to the case in which these same promoters are subjected to regulation by repressor binding, with the level of repressor (i.e. TF copy number) controlled systematically.
Figure 2.
Energy matrix for RNAP binding.
Figure adapted from Kinney et al. [20]. The contribution of each base pair to the total binding energy is represented by color. The total binding energy of a particular sequence can be calculated by summing the contribution from each base pair. Positive values indicate unfavorable contributions to the binding energy. As expected, the most influential base pairs are those in the −35 and −10 regions, which interact directly with the binding domains of RNAP. Numeric matrix entries are available in SI Text S2. The sequence displayed above the energy matrix corresponds to the wild-type lac promoter; the bold bases mark base pair increments. Coordinates are with respect to the transcription start site.
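The summation described in this caption can be sketched in a few lines of Python. This is a minimal illustration only: the function name `binding_energy`, the row order A, C, G, T, and the toy matrix values are assumptions made for the example and are not the entries from SI Text S2.

```python
import numpy as np

BASES = "ACGT"  # assumed row order of the energy matrix


def binding_energy(sequence, energy_matrix):
    """Sum the per-position energy contribution of each base in `sequence`.

    `energy_matrix` has shape (4, L): one row per base (in BASES order)
    and one column per position in the binding site.
    """
    energy_matrix = np.asarray(energy_matrix, dtype=float)
    if energy_matrix.shape != (4, len(sequence)):
        raise ValueError("energy_matrix must have shape (4, len(sequence))")
    rows = [BASES.index(base) for base in sequence.upper()]
    cols = np.arange(len(sequence))
    # Pick one matrix entry per position and add them up.
    return float(energy_matrix[rows, cols].sum())


# Toy 3-position matrix with made-up values (hypothetical, for illustration).
toy_matrix = [
    [0.0, 1.0, 0.5],  # A
    [0.5, 0.0, 1.0],  # C
    [1.0, 0.5, 0.0],  # G
    [0.2, 0.3, 0.4],  # T
]
best_site_energy = binding_energy("ACG", toy_matrix)
weaker_site_energy = binding_energy("TTT", toy_matrix)
```

In this toy matrix the sequence "ACG" picks the lowest entry at every position, so any substitution raises the summed energy, which corresponds to weaker predicted binding under the sign convention of the caption.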
Figure 3.
Schematic of DNA construct inserted in the galK region.
The area between the promoter and the LacZ start codon is shown in more detail below along with a table displaying the specific RNAP binding sites (promoters) listed in order of descending binding affinity. The wild-type binding sequence is shown in red text, the lacUV5 sequence is shown in magenta text, and two additional promoters are marked by blue text and green text. The data points involving these four promoters will maintain this color coding throughout every figure. The −35 and −10 RNAP recognition sequences are highlighted in a green box and a red box, respectively. The bases in these regions carry the most weight in the energy matrix. Sequences are available in text format in SI Text S4.
Figure 4.
States and weights of the unregulated promoter.
In the thermodynamic model, the promoter can be in one of two configurations: unoccupied by RNA polymerase (top) or occupied by RNA polymerase (bottom). The remaining polymerases are bound nonspecifically on the E. coli genome. The total energy is the sum of all the nonspecific binding energies and the specific energy of binding at the promoter (when occupied). The multiplicity factor accounts for the number of different ways of arranging polymerases on the genome.
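As a rough sketch of how two states and their weights turn into an occupancy probability, the snippet below normalizes the weight of the occupied state by the sum of both weights. It assumes the common simplification in which the unoccupied state has relative weight 1 and the occupied state has weight (P/N_NS)·exp(−Δε), with Δε in units of k_BT; the parameter names and the numbers used (1000 polymerases, 5×10^6 nonspecific sites) are illustrative placeholders, not values taken from this work.

```python
import math


def p_bound(delta_eps, n_polymerase, n_nonspecific):
    """Occupancy probability of the promoter in a two-state model.

    delta_eps: promoter binding energy relative to nonspecific binding,
               in units of k_BT (more negative means tighter binding).
    n_polymerase: number of RNAP molecules (assumed parameter).
    n_nonspecific: number of nonspecific genomic sites (assumed parameter).
    """
    # Weight of the occupied state relative to the unoccupied state (weight 1).
    weight_occupied = (n_polymerase / n_nonspecific) * math.exp(-delta_eps)
    return weight_occupied / (1.0 + weight_occupied)


# Illustrative placeholder numbers, not measured values.
N_POLYMERASE = 1000
N_NONSPECIFIC = 5e6

# For a weak promoter the occupancy is close to the Boltzmann factor alone,
# so lowering the energy by 1 k_BT raises occupancy by roughly a factor of e.
ratio = p_bound(-2.0, N_POLYMERASE, N_NONSPECIFIC) / p_bound(
    -1.0, N_POLYMERASE, N_NONSPECIFIC
)
```

The weak-promoter limit shown at the end is what makes expression scale with the Boltzmann factor of the binding energy.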
Figure 5.
Gene expression as a function of RNAP binding energy.
(A) LacZ activity measured in Miller units and (B) average mRNA per cell vs. promoter binding energy in units of $k_BT$ (with the zero of energy set to the average interaction energy between RNAP and the entire E. coli chromosome). To illustrate the reproducibility of our measurements, the translucent points represent individual measurements and the solid points represent the average over repeated experiments. The solid black line in each plot is the Boltzmann factor scaling, in which expression is proportional to the exponential of the negative binding energy in units of $k_BT$. The red data points correspond to the wild-type lac promoter, which was used to calibrate the arbitrary units of our energy matrix to (physical) $k_BT$ units. The magenta, red, blue, and green data points represent promoters which we examine in the context of simple repression.
Figure 6.
Expected relation between predictions and measurement for simple repressor titration.
(A) Three hypothetical promoters for which the prediction of the promoter design is numerically correct, underestimated (▾), or overestimated. (B) The expected result for each promoter as repressors are added in a simple repression architecture. The predicted theory line and the data points differ on average by the same percentage as they do in the absence of repressor.
Figure 7.
Gene expression in the simple repression case.
(A) Solid surface: predicted gene expression from equation 7 as a function of repressor copy number and RNAP binding energy. Data points represent measurements of gene expression in a strain with a given promoter and repressor copy number. (B) Data from part (A) collapsed onto the RNAP binding energy axis. The solid lines are the zero-parameter predictions from the theory in equation 7 using the RNAP binding energy predicted from the position-weight matrix in Figure 2 (numerical values listed in Figure 3 under “model”). There is a systematic deviation between the theory and the experimental data, which is inherited from the imperfect prediction of the binding energy by the RNAP binding strength model (illustrated schematically in Figure 6). (C) The same data after the binding energy has been corrected to fall on the theory fit line based on the constitutive expression (numerical values listed in Figure 3 under “LacZ”). Correcting for the initial uncertainty in the binding energy prediction gives good agreement between the theory and the experimental data, which indicates that our designed promoters function as expected even in a different regulatory context.
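The way an error in the predicted binding energy carries over to the repressed case can be illustrated with a small numerical sketch. Equation 7 is not reproduced in these captions, so the code below assumes a generic three-state simple repression form (promoter empty, RNAP bound, or repressor bound); the function name, the parameter names, and every number used are hypothetical placeholders rather than values from this work.

```python
import math


def p_bound_repressed(eps_rnap, eps_rep, n_rnap, n_rep, n_ns):
    """RNAP occupancy when a repressor competes for the promoter.

    Assumed three-state form: weights 1 (empty), w_p (RNAP bound),
    and w_r (repressor bound), with energies in units of k_BT.
    """
    w_p = (n_rnap / n_ns) * math.exp(-eps_rnap)
    w_r = (n_rep / n_ns) * math.exp(-eps_rep)
    return w_p / (1.0 + w_p + w_r)


# Illustrative placeholder parameters (not measured values).
N_NS = 5e6        # nonspecific sites
N_RNAP = 1000     # RNAP copies
EPS_REP = -14.0   # repressor binding energy, k_BT

# Suppose the model predicts -5.0 k_BT but the true energy is -4.5 k_BT.
PREDICTED_EPS, ACTUAL_EPS = -5.0, -4.5

# Ratio of predicted to actual occupancy at several repressor copy numbers.
ratios = [
    p_bound_repressed(PREDICTED_EPS, EPS_REP, N_RNAP, r, N_NS)
    / p_bound_repressed(ACTUAL_EPS, EPS_REP, N_RNAP, r, N_NS)
    for r in (0, 10, 100, 1000)
]
```

Under these assumptions the ratio stays close to exp(0.5) at every repressor copy number, which is the behavior sketched in Figure 6: a misprediction of the binding energy shifts theory and data apart by roughly the same factor with and without repressor.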