Precision of Readout at the hunchback Gene: Analyzing Short Transcription Time Traces in Living Fly Embryos

doi:10.1371/journal.pcbi.1005256

Fig 1.

Transcription dynamics in the fly embryo.

(A) The three models of transcription dynamics considered in this paper. From left to right: the two state model, the cycle model and the Gamma model (see SI Sections B, D and E). (B) Example of the promoter state dynamics (either ON or OFF) as a function of time. We assume that the polymerase is abundant and every time the promoter is ON and is not flanked by the previous polymerase a new polymerase will start transcribing. The function X(t) in black is non-zero when a polymerase is occupying the transcription initiation site and zero otherwise. (C) In the ON state, the promoter (Pr) is accessible to RNA polymerases (Pol II) that initiate the transcription of the target gene and the 24× MS2 loops. As the target mRNA is elongated, MCP-GFP fluorescent molecules bind the MS2 loops, producing a detectable fluorescence signal. (D) MCP-GFP molecules labeling several mRNAs co-localize at the transcription loci, which appear as green spots under the confocal microscope. The spot intensities are then extracted over time and classified by each nucleus's position in the Drosophila embryo as Anterior, Boundary or Posterior. The spatial resolution of the spots is limited by the Abbe limit, which is ∼200 nm. The ability to identify spots is also limited by the background level of free MCP-GFP. Typical spot sizes are ∼260 nm, giving an upper bound on the size of the transcription site. (E) The gene is divided into r sites of size 150 base pairs, indexed by i. The presence or absence of a polymerase at site i on the gene as a function of time is given by the promoter occupancy in (B) and a delay time that depends on the speed of the polymerase. (F) A cartoon representing the type of experimental signal we analyze (see S1 Fig for real traces): one spot's intensity as a function of time, corresponding to the arrival of RNA polymerases in (E) and the promoter state in (B).

Fig 2.

Autocorrelation analysis of fluorescent traces from cell cycles 12-13.

(A) Autocorrelation functions for traces of different length caused by the variable duration of the cell cycle. Each autocorrelation function is calculated from one embryo and one cell cycle from traces in the anterior region of the embryos. Reading off the autocorrelation time as the time at which the autocorrelation function decays by a factor of e would give different values for each trace. The analysis is restricted to the steady state part of the traces (as defined in the text and S2 Fig). The durations of the steady state windows are given in SI Table I. (B) Autocorrelation functions calculated for the same traces reduced to having equal trace lengths, all equal to the trace length of the shortest trace (101s), show that the differences observed in panel A are due to finite size effects. In the curtailed traces all sequential time points until the 101s time point were used. (C) An example of a signal simulated using the process described in Fig 1 for 300 seconds (blue curve) for a two state model. Taking the whole 300 second interval (red dashed lines) gives a good approximation of the average signal (red line) and the effect of finite size on the autocorrelation function is small (D). Reducing the time window to 60 seconds (green dashed lines) makes the window average much more strongly correlated with the signal, and the effect of the finite size on the autocorrelation is strong (E). The sampling intervals of the four embryos are: 13.1s, 10.2s, 5.1s and 4.3s, respectively. Parameters for the simulation in (C-E) are: k_on = k_off = 0.06s⁻¹, sampling time dt = 4s, for the red curve T = 300s and M = 2000 nuclei, for the green curve T = 60s and M = 10000 nuclei (same total amount of data). These parameters were chosen for illustrative purposes.
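
The finite size effect in panels (C-E) can be reproduced with a minimal sketch. It assumes that the signal is the sampled promoter state itself (the elongation and MS2 loop contribution is ignored), that each trace is centered on its own mean, and that far fewer nuclei are simulated than in the caption to keep the run short. The function names are illustrative and are not taken from the paper.

```python
import math
import random

def simulate_telegraph(k_on, k_off, dt, n_steps, rng):
    """Sample a two-state promoter (1 = ON, 0 = OFF) every dt seconds."""
    k = k_on + k_off
    p_on = k_on / k                # steady-state ON probability
    decay = math.exp(-k * dt)      # memory retained over one sampling step
    state = 1 if rng.random() < p_on else 0
    trace = []
    for _ in range(n_steps):
        trace.append(state)
        # exact transition probability of the two-state Markov process
        p_next_on = p_on + (state - p_on) * decay
        state = 1 if rng.random() < p_next_on else 0
    return trace

def mean_connected_autocorrelation(traces, max_lag):
    """Average over traces of the autocorrelation computed with each trace's own mean."""
    acf = [0.0] * (max_lag + 1)
    for trace in traces:
        mean = sum(trace) / len(trace)  # per-trace mean: source of the finite size bias
        for lag in range(max_lag + 1):
            n = len(trace) - lag
            acf[lag] += sum((trace[i] - mean) * (trace[i + lag] - mean)
                            for i in range(n)) / n
    return [a / acf[0] for a in acf]    # normalize so that lag 0 equals 1

def finite_size_demo(seed=0):
    """Compare T = 300 s and T = 60 s traces at equal total amount of data."""
    rng = random.Random(seed)
    k_on = k_off = 0.06  # s^-1, as in the caption
    dt = 4.0             # s, as in the caption
    # assumption: 400 and 2000 nuclei instead of 2000 and 10000, for speed
    long_traces = [simulate_telegraph(k_on, k_off, dt, 75, rng) for _ in range(400)]
    short_traces = [simulate_telegraph(k_on, k_off, dt, 15, rng) for _ in range(2000)]
    return (mean_connected_autocorrelation(long_traces, 10),
            mean_connected_autocorrelation(short_traces, 10))
```

Under these assumptions the short-trace curve drops below the long-trace curve and becomes negative at intermediate lags, even though both sets of traces come from the same process.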

Fig 3.

The autocorrelation prediction and autocorrelation based inference analysis performed on short trace simulated data for models of various complexity and positioning of the MS2 probe.

A cartoon of the construct with the MS2 cassette placed (A) after the gene (3’) and (B) before the gene (5’). Examples of the autocorrelation function’s analytical predictions compared to ones calculated from simulated traces (according to the Gillespie simulations described in SI Section G) show perfect agreement for 3’ MS2 insertions assuming a two state (telegraph) model, three state model and gamma function bursty model (C), as well as for the 3’ and 5’ constructs in the two state model (D). (E) Comparison between prediction and simulation for the cross-correlation between the signal coming from two different colored fluorescent probes positioned at the 3’ and 5’ ends. (F) The inference procedure for the two state model correctly finds the parameters of transcription initiation in a wide parameter range. The inference range grows with trace length and the number of nuclei. Error bars shown only for T = 240s, N = 50 nuclei (blue line) and T = 600s, N = 200 nuclei (red line) for clarity of presentation. Parameters for the simulations and predictions are: (C) for the two state model k_on = 0.005 s⁻¹, k_off = 0.01 s⁻¹, sampling time dt = 6 s, T = 360 s and number of cells M = 20000, the same parameters for the three state cycle model with k_off = 0.01 s⁻¹, k₁ = 0.01 s⁻¹ and k₂ = 0.02 s⁻¹, the same parameters for the Γ model with k_off = 0.005 s⁻¹ and α = 2 and β = 0.01 s⁻¹; (D) k_on = 0.02 s⁻¹, k_off = 0.01 s⁻¹, sampling time dt = 6 s, T = 600 s and number of cells M = 20000; (E) k_on = 0.01 s⁻¹, k_off = 0.01 s⁻¹, dt = 6 s, T = 480 s and M = 20000. The 5’ construct is modeled by adding a 3000bp non-MS2 binding sequence to the 3′ end of the MS2-binding cassette. (F) P_on = 0.1.

Fig 4.

The gene expression model used in the autocorrelation function calculation.

The autocorrelation inference approach is based on the idea that the stochastic transcriptional dynamics can be deconvoluted from the signal coming from the deterministic fluorescent construct, if we know the gene construct design. (A) A concatenation of snapshots of the gene from r consecutive time steps. A polymerase covers a length on the gene corresponding to its own length in one time step, producing about two MS2 loops. The gene has total length r and at any position i along the gene L_i < 24 loops have been produced. (Top right) The promoter state as a function of time and (center right) an instantaneous snapshot of the gene corresponding to transcription from this promoter. (B) The construct design is encoded in the loop function L_i. As the polymerase moves along the gene it produces MS2 loops. L_i is an average representation in terms of polymerase time steps of how many loops have been produced by a single polymerase. It is based on the experimental design shown on the left of the panel.
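
The role of the loop function can be illustrated with a short sketch. It assumes a cassette at the 5’ end that yields two loops per polymerase time step up to 24 loops, followed by 20 further sites (3000bp at 150bp per site) on which the loop count stays at 24; the fluorescence is then taken, in units of loops, as the sum of L_i over all sites occupied by a polymerase. The names `loop_function` and `fluorescence` are hypothetical.

```python
def loop_function(n_loops=24, loops_per_site=2, tail_sites=20):
    """Loops carried by one polymerase at each site i along the gene."""
    ramp_sites = n_loops // loops_per_site
    # loops accumulate while the polymerase crosses the MS2 cassette
    ramp = [min(loops_per_site * (i + 1), n_loops) for i in range(ramp_sites)]
    # assumption: the count stays at its maximum on the rest of the gene
    return ramp + [n_loops] * tail_sites

def fluorescence(loading, loops):
    """Signal F[t] = sum_i L_i * X[t - i], with X[t] = 1 if a polymerase
    was loaded at time step t (one site traversed per time step)."""
    signal = []
    for t in range(len(loading)):
        total = 0
        for i, n_loops_at_site in enumerate(loops):
            if t - i >= 0:
                total += n_loops_at_site * loading[t - i]
        signal.append(total)
    return signal
```

With a single polymerase loaded at the first time step, the returned signal simply replays the loop function, which makes the deterministic part of the signal explicit.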

Fig 5.

Inference results for fly data.

(A) Inferred values of P_on for different nucleus positions (A-Anterior, B-Boundary) and cell cycles. (B) Example of the mean connected autocorrelation function of the traces in cell cycle 13 from the boundary region of embryo 1 (blue dots, with shaded error region) and of the fitted Poisson-like (red), two-state (green) and cycle (black) promoter models. The fitted curves generated from the two-state and three state cycle models are almost superimposed. See S3 Fig for fits of all autocorrelation functions in both cell cycles and regions. (C) Inferred values of k_on + k_off using the two-state model. In (A) and (C), the standard error bars are calculated by performing the inference on 20 random subsets that take 60% of the original data. (D) Inferred values of k_on and k_off in the Anterior (red) and Boundary (blue) for the two-state model, in cell cycle 12 (circle) and cell cycle 13 (square). For each condition, 4 inferred values for 4 movies are shown. The dotted black line depicts the limit to inference coming from the time τ_block ∼ 6s it takes the polymerase to leave the transcription initiation site (k_on + k_off = 1/(6s)). The shaded areas represent the standard deviational ellipse of k_on, k_off for each cycle and each embryo. The axes of the ellipses are the eigenvalues of the covariance matrix, represented in the directions of the eigenvectors. (E-F) Two simulated trajectories of the promoter state with the inferred parameters in the Anterior (red) and Boundary (blue).
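
The subsampling used for the error bars in (A) and (C) can be sketched as follows. The sketch assumes that subsets are drawn without replacement and that the error bar is the standard deviation of the estimates across subsets; the `estimator` argument stands in for the actual inference procedure, which is not reproduced here.

```python
import random
import statistics

def subsample_error(data, estimator, n_subsets=20, fraction=0.6, seed=0):
    """Apply `estimator` to random subsets of the data and report the spread.

    Returns (mean of the estimates, standard deviation of the estimates).
    """
    rng = random.Random(seed)
    size = max(1, round(fraction * len(data)))
    # each subset takes 60% of the data, drawn without replacement
    estimates = [estimator(rng.sample(data, size)) for _ in range(n_subsets)]
    return statistics.mean(estimates), statistics.stdev(estimates)
```

For example, `subsample_error(values, statistics.mean)` uses the sample mean as a placeholder estimator; any function that maps a list of traces to a single inferred parameter could be passed instead.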

Fig 6.

Longer time traces help distinguish between two state and three state cycle models.

A. Inference from data generated by a two state model, which corresponds to k₁/k₂ = 0, from traces of different lengths T and using different numbers of nuclei N shows that longer traces increase the probability of correctly learning the model type. Increasing the number of nuclei for short traces shows little improvement. The inference is repeated 50 times per condition. The experimental conditions studied in this paper are closest to the T = 240s and N = 50 nuclei panel. B. The same numerical experiment but assuming a three state cycle model, which corresponds to k₁/k₂ = 1. Parameters of the simulations: P_on = 0.1, k_off + 1/(1/k₁ + 1/k₂) = 0.02s⁻¹ and k₁/k₂ = 0 in A, and k₁/k₂ = 1 in B.

Fig 7.

Precision of the hunchback gene transcription readout.

A. Comparison of the relative error in the mRNA produced during the steady state of the interphase estimated empirically from data (abscissa) and from theoretical arguments in Eq 4 for a two state switching promoter (blue symbols) using the inferred parameters in Fig 5C (ordinate), and theoretical arguments in Eq 5 for a Poisson-like static promoter (red symbols) using the inferred parameters in Fig 5A (ordinate), in the anterior (circles and squares) and the boundary (diamonds and triangles) regions. The theoretical prediction for the two state promoter shows very good agreement with the data, whereas the Poisson-like promoter shows poor agreement, especially in the boundary region. B. The relative error in the total mRNA produced in cell cycle 13 directly estimated from the data as the variance over the mean of the steady state mRNA production (red line, same data as in A), sum of the intensity over the whole duration of the interphase (blue line) and the total mRNA produced during cell cycles 11 to 13 (green line) for equal-width bins of 10% embryo length at different positions along the AP axis. Each line describes an average over four embryos (see S9C Fig for the same data plotted separately for each embryo) and the error bars describe the variance. To calculate the total mRNA produced over the cell cycles, we take all the nuclei within a strip at cell cycle 13 and trace back their lineage through cycle 12 to cycle 11. We then sum the total intensity of each nucleus in cell cycle 13, half the total intensity of its mother and 1/4 of that of its grandmother. C. Comparison of the relative error in the mRNA produced during the steady state for a two state, k₁/k₂ = 0, (solid lines) and three state cycle model, k₁/k₂ = 1, (dashed lines) with the same , for different values of and k_off, shows that the three state cycle system allows for greater readout precision. D. A comparison of the theoretical prediction of the steady state relative error rate for the Poisson-like and two state promoter as a function of P_on shows that the Poisson-like promoter is always more accurate. Different values of k_on are considered for the two state model.
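
The lineage sum described in panel B can be written as a short sketch. It assumes that each nucleus has an identifier, that `mother` maps a nucleus to its mother in the previous cell cycle, and that `intensity` holds the total intensity of each nucleus over its own interphase; these data structures are illustrative and do not come from the paper.

```python
def lineage_totals(cc13_ids, mother, intensity):
    """Total mRNA proxy for each cycle 13 nucleus, traced back to cycle 11.

    Each division shares the mother's production between two daughters,
    hence the weights 1, 1/2 and 1/4 for cycles 13, 12 and 11.
    """
    totals = {}
    for nucleus in cc13_ids:
        mom = mother[nucleus]       # cycle 12 nucleus
        grandmother = mother[mom]   # cycle 11 nucleus
        totals[nucleus] = (intensity[nucleus]
                           + 0.5 * intensity[mom]
                           + 0.25 * intensity[grandmother])
    return totals
```

Two sister nuclei in cycle 13 share the same mother and grandmother terms, so their totals differ only through their own cycle 13 intensities.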
