
Bayesian inference and comparison of stochastic transcription elongation models

  • Jordan Douglas ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations School of Biological Sciences, University of Auckland, Auckland, New Zealand, Centre for Computational Evolution, School of Computer Science, University of Auckland, Auckland, New Zealand

  • Richard Kingston,

    Roles Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliation School of Biological Sciences, University of Auckland, Auckland, New Zealand

  • Alexei J. Drummond

    Roles Investigation, Methodology, Project administration, Resources, Supervision, Writing – review & editing

    Affiliations School of Biological Sciences, University of Auckland, Auckland, New Zealand, Centre for Computational Evolution, School of Computer Science, University of Auckland, Auckland, New Zealand


This is an uncorrected proof.


Transcription elongation can be modelled as a three step process, involving polymerase translocation, NTP binding, and nucleotide incorporation into the nascent mRNA. This cycle of events can be simulated at the single-molecule level as a continuous-time Markov process using parameters derived from single-molecule experiments. Previously developed models differ in the way they are parameterised, and in their incorporation of partial equilibrium approximations. We have formulated a hierarchical network comprising 12 sequence-dependent transcription elongation models. The simplest model has two parameters and assumes that both translocation and NTP binding can be modelled as equilibrium processes. The most complex model has six parameters and makes no partial equilibrium assumptions. We systematically compared the ability of these models to explain published force-velocity data, using approximate Bayesian computation. This analysis was performed using data for the RNA polymerase complexes of E. coli, S. cerevisiae and Bacteriophage T7. Our analysis indicates that the polymerases differ significantly in their translocation rates, with the rates in T7 pol being fast compared to E. coli RNAP and S. cerevisiae pol II. Different models are applicable in different cases. We also show that all three RNA polymerases have an energetic preference for the posttranslocated state over the pretranslocated state. A Bayesian inference and model selection framework, like the one presented in this publication, should be routinely applicable to the interrogation of single-molecule datasets.

Author summary

Transcription is a critical biological process which occurs in all living organisms. It involves copying the organism’s genetic material into messenger RNA (mRNA) which directs protein synthesis on the ribosome. Transcription is performed by RNA polymerases which have been extensively studied using both ensemble and single-molecule techniques. Single-molecule data provides unique insights into the molecular behaviour of RNA polymerases. Transcription at the single-molecule level can be computationally simulated as a continuous-time Markov process and the model outputs compared with experimental data. In this study we use Bayesian techniques to perform a systematic comparison of 12 stochastic models of transcriptional elongation. We demonstrate how equilibrium approximations can strengthen or weaken the model, and show how Bayesian techniques can identify necessary or unnecessary model parameters. We describe a framework to a) simulate, b) perform inference on, and c) compare models of transcription elongation.


Transcription is carried out by RNA polymerases: RNAP in Escherichia coli, pol II in Saccharomyces cerevisiae, and T7 pol in Bacteriophage T7. It involves the copying of template double-stranded DNA (dsDNA) into single-stranded messenger RNA (mRNA). RNAP and pol II are comprised of multiple subunits, and their catalytic subunits are homologous [1, 2]. In contrast, T7 pol exists as a monomer with a distinct sequence, and resembles the E. coli DNA polymerase I [3].

Optical trapping experiments have been performed on the transcription elongation complex (TEC) from a variety of organisms [4–10]. In a typical experimental setup, two polystyrene beads (around 600 nm in diameter) are tethered to the system; one attached to the RNA polymerase and the other to the DNA [4]. As transcription elongation progresses, the distance between the two beads increases and the velocity of a single TEC can be computed. Optical tweezers can be used to apply a force F to the system (Fig 1).

Fig 1. Effect of an applied force on elongation velocity.

(A) Optical trapping setup showing dsDNA being transcribed by RNA polymerase (grey ellipse) into mRNA. Two polystyrene beads are tethered to the system allowing the application of force using optical tweezers. An assisting load F > 0 acts in the same direction as transcription (top) while a hindering load F < 0 acts in the opposing direction (bottom). Figure not to scale. (B) Schematic depiction of the effect of applying a force on RNA polymerase. Due to the stochastic nature of transcription at the single-molecule level, each experiment yields a different distance-time trajectory, even under the same applied force.

Single-molecule studies of the TEC have revealed that RNA polymerases progress in a discontinuous fashion [4, 11–14] with step sizes that correspond to the dimensions of a single nucleotide (3.4 Å [15]). Consequently, at the single-molecule level, transcription is best modelled as a discrete process rather than a continuous one.

A single cycle in the main transcription elongation pathway (Fig 2) requires (1) Forward translocation of the RNA polymerase, making the active site accessible; (2) Binding of the complementary nucleoside triphosphate (NTP); (3) Addition of the nucleotide onto the 3′ end of the mRNA. This third step involves NTP hydrolysis. Nucleoside monophosphate is added onto the chain and pyrophosphate is released from the enzyme.

Fig 2. State diagrams of RNA polymerase.

(A) The model of the main transcription elongation pathway, which shows the postulated states; the pathways for interconversion; and the rate constants that govern each part of the reaction. The transcription bubble is the set of β1 + h + β2 bases (see main text for definitions) in the double-stranded DNA which are unpaired. States are denoted by S(l, t) where l is the length of the mRNA and t is the position of the polymerase active site (small grey rectangle) with respect to the 3′ end of the mRNA. Polymerase translocation displaces the polymerase by a distance of δ = 1 bp = 3.4 Å. During polymerisation the chain is extended by one nucleotide. (B) Instantiated posttranslocated state of RNA polymerase transcribing the rpoB gene sequence, with β1 = 2, h = 9, β2 = 1. Forward translocation requires melting two T/A basepairs (right arrows). Backward translocation requires melting two C/G basepairs (left arrows). The mRNA secondary structure would also require reconfiguration [16, 17].

Our study aimed to identify the best model to describe this reaction cycle for RNAP, pol II, and T7 pol, based on analysis of published force-velocity data. As there are three reactions, up to six rate constants may be necessary for a kinetic model of a single nucleotide addition. These describe forward and backward translocation (kfwd and kbck), binding and release of NTP (kbind and krel), and catalysis and its reverse reaction, pyrophosphorolysis (kcat and krev) [18]. However, fewer than six parameters may be required in practice.

First, it is reasonable to assume that polymerisation is effectively irreversible [17, 19–21], as pyrophosphorolysis is a highly exergonic reaction, reducing the number of rate constants to five. Second, translocation between the pretranslocated and posttranslocated states, and/or NTP binding, may occur on timescales significantly more rapid than the other steps, in which case they may be modelled as equilibrium processes. These assumptions simplify the model, as the respective forward and reverse reaction rate constants are subsumed by a single equilibrium constant. Third, thermodynamic models of nucleic acid structure can be used to estimate sequence-dependent translocation rates kfwd(l) and kbck(l), by invoking transition state theory, and this can sometimes result in parameter reduction [16, 17, 21].

Irrespective of equilibrium assumptions and parameterisation, transcription elongation under applied force can be modelled in two fundamentally distinct ways. First, there are the deterministic equations which can be used to calculate the mean pause-free elongation velocity v(F, [NTP]) as a function of force F and NTP concentration [NTP]. This kind of model can be derived from the differential equations describing the time evolution of all species, by application of the steady state approximation. Force effects on the translocation step are incorporated using transition state theory [22, 23].

An example is the following 3-parameter model [4]:

$$v(F, [\mathrm{NTP}]) = \frac{k_{cat}\,[\mathrm{NTP}]}{K_D\left(1 + K_\tau\, e^{-F\delta / k_B T}\right) + [\mathrm{NTP}]} \tag{1}$$

where δ is the distance between adjacent basepairs (3.4 Å [15]), KD is the equilibrium constant of NTP binding, Kτ is the equilibrium constant of translocation, kB is the Boltzmann constant, and T is the absolute temperature. Increasingly complex equations may be used as more parameters or states are added to the model [4, 6, 17]. Such equations describe the velocity averaged across an ensemble of molecules. Parameter inference applied to velocity-force-[NTP] experimental data is straightforward and computationally fast when using these equations. However, these equations do not describe the distribution of velocities, nor do they account for site heterogeneity across the nucleic acid sequence, and they therefore cannot predict local sequence effects.
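As a minimal sketch of how a deterministic equation of this kind is evaluated, the Python function below computes the mean velocity for a 3-parameter equilibrium model. It assumes that the translocation equilibrium constant is the ratio of pretranslocated to posttranslocated occupancy at zero force; the function name and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
import math

# kBT at 310 K in pN·nm (4.28001e-21 J), the value quoted later in the text.
KBT_PN_NM = 4.28001
DELTA_NM = 0.34  # distance between adjacent basepairs (3.4 Å)


def deterministic_velocity(force_pn, ntp_um, k_cat, k_d, k_tau):
    """Mean pause-free velocity (nt/s) under a 3-parameter equilibrium model.

    Assumption: k_tau is the pre/post occupancy ratio at zero force, so an
    assisting force (force_pn > 0) shrinks the pretranslocated fraction.
    """
    tilt = math.exp(-force_pn * DELTA_NM / KBT_PN_NM)
    return k_cat * ntp_um / (k_d * (1.0 + k_tau * tilt) + ntp_um)


# Hypothetical parameter values, for illustration only.
for force in (-10.0, 0.0, 10.0):
    v = deterministic_velocity(force, ntp_um=100.0, k_cat=25.0, k_d=15.0, k_tau=1.0)
    print(f"F = {force:+.0f} pN -> v = {v:.2f} nt/s")
```

Under these assumptions the velocity rises with assisting force and saturates at k_cat when NTP is abundant, which is the qualitative behaviour a force-velocity fit relies on.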

Second, there are the stochastic models, which can be implemented via simulation of single-molecule behaviour using the Gillespie algorithm [24]. The mean velocity can be calculated by averaging velocities over a number of simulations for a given F and [NTP]. This offers not just the mean but a full distribution of velocities and could potentially explain emergent properties unavailable from a deterministic model. Unfortunately, simulating can be very slow and therefore parameter inference can be a problem.

In this study we used a Markov-chain-Monte-Carlo approximate-Bayesian-computation (MCMC-ABC) algorithm [25] to estimate transcription elongation parameters for stochastic models via simulation. The observed pause-free velocities we are fitting to were measured at varying applied force and NTP concentration. For each RNA polymerase under study—E. coli RNAP, S. cerevisiae pol II, and T7 pol—we fit to one respective dataset from the single-molecule literature [4, 26, 27].


Notation and state space

Suppose the TEC is transcribing a gene of length L. Then let S(l, t) denote a TEC state, where the mRNA is currently of length l ≤ L, and t describes the position of the active site with respect to the 3′ end of the mRNA. When t = 0 the polymerase is pretranslocated and cannot bind NTP, and when t = 1 the polymerase is posttranslocated and can bind NTP (Fig 2). This study is focused on the main elongation pathway and the observed velocities being fitted have pauses filtered out. Therefore, although additional backtracked states (t < 0) [4, 28, 29] and hypertranslocated states (t > 1) [30, 31] exist, these are not incorporated in the model.

Let β1 and β2 be the number of unpaired template nucleotides upstream and downstream of RNA polymerase, respectively, and let h be the number of basepairs in the DNA/mRNA hybrid (Fig 2A). Although there are uncertainties in these parameters, they are held constant at h = 9, β1 = 2, and β2 = 1 [17, 32].

Transcription of the gene begins at state S(l0, 0) and ends upon reaching S(L, 0), where l0 = β1 + h + 2.

Parameterisation of the NTP binding step

NTP binding has been modelled as both a kinetic and equilibrium process in the literature [4, 17, 21].

In a kinetic binding model, NTP binding occurs at pseudo-first order rate kbind[NTP], while NTP release occurs at rate krel. In this case, kbind and KD must be estimated.

Under a partial equilibrium approximation, NTP binding and release are assumed to be rapid enough that equilibrium is achieved. In this case, the rate constants kbind and krel are subsumed by the NTP dissociation constant KD = krel/kbind, which becomes the sole binding-related parameter to estimate.

Parameterisation of the translocation step

While inferences about the rate constants associated with NTP binding and catalysis (kbind, KD, and kcat) can be made directly from the data, the translocation step is more complex. Transition state theory is invoked in order to estimate kfwd and kbck. Recasting the problem in this way (1) provides a way of accommodating the effects of applied force on the elongation process, and (2) allows the sequence-dependence of translocation to be incorporated by considering the energetics of basepairing. When allowing for sequence dependence, the total number of translocation rates required to model translocation of the full gene is 2(L − l0).

Thermodynamic models of base pairing energies.

The standard Gibbs free energies ΔrG° (= ΔG) involved in duplex formation are calculated using nearest neighbour models. The standard Gibbs energy of state S, arising from nucleotide basepairing and dangling ends, is calculated as the sum of a DNA/DNA term and a DNA/RNA hybrid term: SantaLucia’s DNA/DNA basepairing parameters [33] are used to calculate the former and Sugimoto’s DNA/RNA parameters [34] are used for the latter. For the latter, dangling end energies are estimated as described by Bai et al. 2004 [21]. Here, and elsewhere, the (bp) superscript is used to denote a model parameter that can be evaluated from the sequence alone. Gibbs energies are expressed on a per molecule basis, relative to the thermal energy of the system, in multiples of kBT, where kBT = 4.28001 × 10^−21 J at T = 310 K.

In order for RNA polymerase to translocate forward (backward), up to two basepairs must be disrupted: (1) the basepair at the downstream (upstream) edge of the transcription bubble, and (2) the basepair at the upstream (downstream) end of the DNA/mRNA hybrid (Fig 2B). Differences in the basepairing energies in these regions confer sequence-dependence on the rate of translocation.

Calculation of translocation rates or translocation equilibrium constant.

The standard Gibbs energies of the pre and posttranslocated states, $\Delta G_{S(l,0)}^{(bp)}$ and $\Delta G_{S(l,1)}^{(bp)}$ respectively, are used with up to four additional terms (ΔGτ1, δ1, $\Delta G_\tau^\ddagger$, and $\Delta G_{T(l,t)}^{(bp)}$) to calculate the translocation rates. The first three are model parameters which must be estimated, while the last is evaluated directly from the sequence.

Let T(l, t) be the translocation transition state between S(l, t) and S(l, t + 1). Then $\Delta G_{T(l,t)}^{\ddagger}$ is the sequence-dependent standard Gibbs energy of activation which must be overcome in order to translocate (Fig 3).

Fig 3. Parameterisation of the translocation step.

(A) Effects of model parameters on state energies. The figure displays a schematic Gibbs energy landscape of translocation, with backtracked states included for visualisation purposes. The solid red lines represent translocation states (t = 0: pretranslocated, t = 1: posttranslocated, and t < 0: backtracked), while the dashed red lines represent transition states. Applying an assisting force F > 0 tilts the landscape in favour of higher values of t. The effect of ΔGτ1 is observed at the posttranslocated state t = 1. In a translocation equilibrium model, the barrier height is assumed to be so small, and translocation therefore so rapid, that the transition states are disregarded. (B) A model for the sequence-dependent transition state between translocation states S(l, 0) and S(l, 1). This is required for estimating the Gibbs energy of basepairing in the transition state. The basepairing energy, added to a baseline term $\Delta G_\tau^\ddagger$, specifies the height of the activation barrier (Eq 10).

Given an applied force F, the translocation rates governing transition between the pre and posttranslocated states (kfwd(l) and kbck(l)) are calculated from barrier height using an Arrhenius type relation: (3) (4)

The derived rates kfwd(l) and kbck(l) are therefore dependent on the local sequence. The pre-exponential factor A is held constant at 10^6 s−1. This term has been arbitrarily set to a variety of values in previous studies (10^6–10^9 s−1 [16, 17, 21]). This has little consequence for model fitting; however, the value of $\Delta G_\tau^\ddagger$ is entangled with the value of the pre-exponential factor A and can only be meaningfully interpreted in light of that value.
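The Arrhenius-type calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the function name is hypothetical, the transition state energy is taken to be on the same scale as the state energies, the posttranslocated energy is assumed to already include any ΔGτ1 bias, and the force is assumed to act over δ1 in the forward direction and δ − δ1 in the backward direction.

```python
import math

KBT_PN_NM = 4.28001    # kBT at 310 K, in pN·nm
PRE_EXPONENTIAL = 1e6  # A, in s^-1, the constant value used in this study


def translocation_rates(g_pre, g_post, g_transition, force_pn=0.0,
                        delta1_nm=0.17, delta_nm=0.34):
    """Return (k_fwd, k_bck) in s^-1; all Gibbs energies are in units of kBT.

    Assumptions: g_transition is the energy of the transition state, g_post
    already includes any ΔGτ1 bias, and the force acts over delta1 (forward)
    or delta - delta1 (backward).
    """
    fwd_barrier = g_transition - g_pre
    bck_barrier = g_transition - g_post
    k_fwd = PRE_EXPONENTIAL * math.exp(-fwd_barrier + force_pn * delta1_nm / KBT_PN_NM)
    k_bck = PRE_EXPONENTIAL * math.exp(
        -bck_barrier - force_pn * (delta_nm - delta1_nm) / KBT_PN_NM)
    return k_fwd, k_bck


# Hypothetical energies (kBT), for illustration only.
k_fwd, k_bck = translocation_rates(g_pre=0.0, g_post=-1.0, g_transition=8.0)
print(k_fwd, k_bck)
```

Written this way, the ratio k_fwd/k_bck depends only on the energy difference between the two states and on F·δ/kBT, so the barrier term and A affect the timescale of translocation but not its equilibrium.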

If the system has time to reach equilibrium, the probabilities of observing the pretranslocated state S(l, 0) and posttranslocated state S(l, 1) are (5) (6)

This is described by equilibrium constant Kτ. (7) (8) (9)
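For intuition, the equilibrium occupancies can be sketched as Boltzmann weights over the two translocation states. The function below is a hedged illustration, not the paper's exact expressions: it assumes that ΔGτ1 is added to the posttranslocated energy, that an assisting force lowers that energy by F·δ/kBT, and that the equilibrium constant is defined as the pre/post occupancy ratio. The function name is hypothetical.

```python
import math

KBT_PN_NM = 4.28001  # kBT at 310 K, in pN·nm


def translocation_equilibrium(g_pre, g_post, dg_tau1=0.0, force_pn=0.0,
                              delta_nm=0.34):
    """Return (p_pre, p_post, k_tau) for one template position.

    Energies are in kBT. Assumptions: ΔGτ1 is added to the posttranslocated
    energy, an assisting force lowers it by F·δ/kBT, and k_tau is the
    pre/post occupancy ratio.
    """
    w_pre = math.exp(-g_pre)
    w_post = math.exp(-(g_post + dg_tau1) + force_pn * delta_nm / KBT_PN_NM)
    p_post = w_post / (w_pre + w_post)
    return 1.0 - p_post, p_post, w_pre / w_post


# Hypothetical case: basepairing alone favours the pretranslocated state by
# 1 kBT, but a bias of -4 kBT shifts occupancy to the posttranslocated state.
print(translocation_equilibrium(g_pre=0.0, g_post=1.0, dg_tau1=-4.0))
```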

The physical meanings of the terms ΔGτ1, δ1, $\Delta G_\tau^\ddagger$, and $\Delta G_{T(l,t)}^{(bp)}$, and the way they are used in the model, are detailed below.

Energetic bias for the posttranslocated states.

ΔGτ1 (units kBT) is a parameter added to the standard Gibbs energy of the posttranslocated state. If ΔGτ1 = 0, then the sequence alone determines the Gibbs energy difference between pre and posttranslocated states. In this case, pretranslocated states are usually favoured over posttranslocated states due to the loss of a single basepair in the hybrid of the latter.

ΔGτ1 has frequently been estimated for T7 pol [35–37] and there has been discussion around whether such a term is necessary for RNAP [6].

Polymerase displacement and formation of the transition state.

δ1 (units Å) is the distance that the polymerase must translocate forward to facilitate the formation of the transition state. The distance between adjacent basepairs is held constant at an experimentally measured value δ = 3.4 Å [15], and 0 < δ1 < δ. The response of the system to an applied force F depends on this term. In general, the application of force F tilts the Gibbs energy landscape, with the Gibbs energy difference between adjacent translocation states being shifted by Fδ/kBT (Fig 3A, [38, 39]).
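To give a sense of scale for the force-induced tilt, the short calculation below converts the work done by a force over a distance into multiples of kBT, using the value of kBT at 310 K quoted in the text. The function name and the 10 pN load are hypothetical, chosen only as an example.

```python
KBT_JOULES = 4.28001e-21  # kBT at T = 310 K, as quoted in the text


def force_tilt_in_kbt(force_pn, distance_angstrom=3.4):
    """Work done by a force over a distance, in multiples of kBT."""
    work_joules = (force_pn * 1e-12) * (distance_angstrom * 1e-10)
    return work_joules / KBT_JOULES


# A hypothetical 10 pN assisting load acting over one full basepair step.
print(round(force_tilt_in_kbt(10.0), 3))
```

A 10 pN load over a full 3.4 Å step therefore corresponds to roughly 0.79 kBT, comparable in size to the basepairing energy differences between neighbouring states.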

It may be necessary to estimate δ1 to model the data adequately [17], or it may be sufficient to simply set δ1 = δ/2 [38].

Energy barrier of translocation.

$\Delta G_\tau^\ddagger$ and $\Delta G_{T(l,t)}^{(bp)}$ (units kBT) together determine the activation barrier height in the translocation step. It is assumed that the sequence-dependent standard Gibbs energy of activation can be written as

$$\Delta G_{T(l,t)}^{\ddagger} = \Delta G_\tau^\ddagger + \Delta G_{T(l,t)}^{(bp)} \tag{10}$$

$\Delta G_\tau^\ddagger$ is therefore a sequence-independent baseline term used to compute the translocation barrier heights. The parameter must be estimated in order to evaluate translocation rates.

In contrast, $\Delta G_{T(l,t)}^{(bp)}$ is a term that is evaluated directly from the sequence, using a model of the transition state (Fig 3B). The term is evaluated as the standard Gibbs energy of a TEC containing all hybrid and gene basepairs found in both S(l, t) and S(l, t + 1), i.e. the intersection between the two sets of basepairs.

Model space

The full transcription elongation model makes use of the following 6 parameters:

  • kcat (units s−1).
  • KD (units μM).
  • kbind (units μM−1 s−1).
  • ΔGτ1 (units kBT).
  • δ1 (units Å).
  • $\Delta G_\tau^\ddagger$ (units kBT).

However fewer than 6 parameters may be needed to adequately describe the data. If it is assumed that the energy differences between pre and posttranslocated states are determined by basepairing energies alone, the parameter ΔGτ1 does not need to be estimated. This is equivalent to holding ΔGτ1 constant at 0. If it is assumed that the displacement required for formation of the translocation intermediate state is half the distance between adjacent basepairs, the parameter δ1 does not need to be estimated. This is equivalent to holding δ1 constant at δ/2.

Partial equilibrium approximations may also simplify the model, as detailed above. If binding is approximated as an equilibrium process, kbind does not need to be estimated. If translocation is approximated as an equilibrium process, $\Delta G_\tau^\ddagger$ and δ1 do not need to be estimated. One, both, or neither of these two steps (binding and translocation) could be assumed to achieve equilibrium, thus yielding four equilibrium model variants (Fig 4A). The introduction of partial equilibrium approximations for both the NTP binding and translocation steps has implications when specifying the prior distributions for the Bayesian analysis (S4 Appendix). The chemical master equations for single nucleotide addition cycles of these models are presented in S2 Appendix.

Fig 4. The space of models to be compared.

(A) The four equilibrium model variants. NTP binding, translocation, both, or neither, could be assumed to achieve equilibrium prior to catalysis. (B) The 12 transcription elongation models. An arrow connects model i to j if augmentation of model i with a single parameter generates model j. The number of parameters to estimate k is shown for each level in the network. Equilibrium approximation colour scheme is the same as in A. ΔGτ1 and δ1 can each be estimated or set to a constant.

Incorporating these simplifications to the model in a combinatorial fashion results in a total of 12 related models, which together constitute the model space. Our objective was to determine which of these 12 models provides the best description of the experimental data. The simplest model (Model 1) contains 2 parameters (kcat and KD). The most complex model (Model 12) contains all 6 parameters. The full model space is displayed in Fig 4B.
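The combinatorial construction of the model space can be checked with a short enumeration. The sketch below is an assumption-laden reading of the rules stated above (binding and translocation each kinetic or at equilibrium; ΔGτ1 and δ1 each estimated or fixed; δ1 and the barrier term only relevant when translocation is kinetic). The parameter labels are hypothetical identifiers, and no attempt is made to reproduce the numbering of Models 1 to 12.

```python
from collections import Counter
from itertools import product


def enumerate_models():
    """List the parameter sets implied by the simplifications described above."""
    models = []
    for binding_eq, transloc_eq, fit_dg_tau1, fit_delta1 in product((True, False), repeat=4):
        if transloc_eq and fit_delta1:
            continue  # delta1 only matters when translocation is kinetic
        params = ["kcat", "KD"]
        if not binding_eq:
            params.append("kbind")
        if fit_dg_tau1:
            params.append("dG_tau1")
        if not transloc_eq:
            params.append("dG_barrier")  # baseline barrier height
            if fit_delta1:
                params.append("delta1")
        models.append(tuple(params))
    return models


models = enumerate_models()
print(len(models), Counter(len(m) for m in models))
```

Under these rules the enumeration yields 12 distinct parameter sets, ranging from 2 parameters (kcat and KD only) to all 6.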

Stochastic modelling

For each model we performed stochastic simulations, appropriate for the modelling of single-molecule force-velocity data. The simulations, performed using the Gillespie algorithm [24, 40], can be used to estimate the mean elongation velocity under a model.

The estimation of mean velocity can be broken down into three steps. First, the system is initialised by placing the RNA polymerase at the 3′ end of the template (state S(l0, 0)), with the transcription bubble open and a DNA/RNA hybrid formed. The force and NTP concentrations are assigned their experimentally set values. Second, a chemical reaction is randomly sampled. The probability that a reaction is selected is proportional to its rate constant k (Fig 2). The amount of time taken for the reaction to occur is sampled from the exponential distribution. States which are subject to a partial equilibrium approximation are coalesced into a single state, which augments the outbound rate constants. The second step is repeated until the RNA polymerase has copied the entire template. Third, the previous two steps are repeated c times. The mean elongation velocity is evaluated as the average of the per-simulation mean velocities across the c simulations. For further information, see S1 Appendix.
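The three steps above can be sketched as a small Gillespie simulation. This is a simplified illustration, not the authors' implementation: it assumes the fully kinetic scheme with sequence-independent rates (the study uses sequence-dependent translocation rates), omits partial equilibrium coalescing, and uses hypothetical function names and rate values.

```python
import random


def simulate_velocity(n_sites, rates, ntp_um, rng):
    """One Gillespie trajectory; returns the mean velocity in nt/s.

    Assumption: rates are sequence-independent here, unlike in the study.
    """
    k_fwd, k_bck, k_bind, k_rel, k_cat = rates
    state, elapsed, added = "pre", 0.0, 0
    while added < n_sites:
        if state == "pre":
            reactions = [("post", k_fwd)]
        elif state == "post":
            reactions = [("pre", k_bck), ("bound", k_bind * ntp_um)]
        else:  # NTP bound: release the NTP or incorporate it
            reactions = [("post", k_rel), ("catalysis", k_cat)]
        total = sum(rate for _, rate in reactions)
        elapsed += rng.expovariate(total)  # exponential waiting time
        pick = rng.random() * total        # choose a reaction in proportion to its rate
        for target, rate in reactions:
            pick -= rate
            if pick < 0.0:
                break
        if target == "catalysis":
            added, state = added + 1, "pre"  # chain extended by one nucleotide
        else:
            state = target
    return n_sites / elapsed


def mean_simulated_velocity(n_sites, rates, ntp_um, c, seed=0):
    """Average the per-trajectory velocities over c independent simulations."""
    rng = random.Random(seed)
    return sum(simulate_velocity(n_sites, rates, ntp_um, rng) for _ in range(c)) / c


# Hypothetical rates (k_fwd, k_bck, k_bind, k_rel, k_cat), for illustration only.
print(mean_simulated_velocity(500, (200.0, 100.0, 1.0, 50.0, 30.0), ntp_um=100.0, c=10))
```

Because each trajectory is a sequence of many exponential waiting times, the cost grows with both template length and c, which is why the number of simulations per likelihood-free evaluation has to be kept small.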

Relation to previous models and stochastic simulations

There is an extensive literature concerned with the kinetic modelling of transcription elongation. Such models may incorporate backtracking, hypertranslocation, and other reactions. Here we are concerned only with the central elongation pathway.

A stochastic and sequence-dependent model was proposed by Bai et al. 2004 [21] for RNAP, with both NTP binding and translocation treated as equilibrium processes. The translocation equilibrium constant was calculated entirely from basepairing energies. Therefore this model is equivalent to Model 1, and the parameters were estimated as kcat = 24.7 s−1 and KD = 15.6 μM from a fit to experimental data. Maoiléidigh et al. 2011 [17] also presented stochastic simulations of RNAP; the elongation component of their model is equivalent to Model 6. We build on this work by providing a systematic Bayesian framework for model comparison and parameter estimation.

While our analysis employed sequence-dependent stochastic models, comparisons can also be made with some deterministic models.

Abbondanzieri et al. 2005 [4], Larson et al. 2012 [41], Schweikhard et al. 2014 [26], and Thomen et al. 2008 [27, 37] described deterministic models (for RNAP, pol II, pol II, and T7 pol respectively) which estimated kcat, KD and the translocation equilibrium constant Kτ. These are most similar to Model 4.

Maoiléidigh et al. 2011 for RNAP, and Dangkulwanich et al. 2013 for pol II, found instead that translocation and catalysis occur on similar timescales, and modelled only NTP binding as an equilibrium process [17, 42]. They also estimated the distance of translocation. These deterministic models are most similar to Model 11.

Finally, Mejia et al. 2015 [43] used a model that is quite different to all the above models, as it does not explicitly treat translocation. Instead elongation is modelled with a two step kinetic scheme, the first step involving NTP binding and conformational change, and the second step involving nucleotide incorporation and product release. This model is most similar to a special case of Model 5 where ΔGτ1 becomes extremely negative, driving the polymerase into the posttranslocated position.

Results and discussion

Model selection with MCMC-ABC

Our aim was to 1) use Bayesian inference to select the best of 12 transcription elongation models for each RNA polymerase; and 2) estimate the parameters for those of the models appearing in the 95% credible set of the posterior distribution. Selecting prior distributions behind each parameter is a critical process in Bayesian inference. A prior distribution should reflect what is known about the parameter before observing the new data. We have explicated our prior assumptions, with justifications, in Table 1.

We performed MCMC-ABC experiments which estimated the parameters and the model indicator Mi for i ∈ {1, …, 12}. Models which appear more often in this posterior distribution are better choices, given the data. The model indicator is a discrete variable which can take 12 values, and is treated identically to the 6 continuous parameters in the Bayesian framework.

The datasets we fit our models to are all from the single-molecule literature and are presented in: Figures 5a and 5b of Abbondanzieri et al. 2005 [4] for E. coli RNAP, Figure 2a of Schweikhard et al. 2014 [26] for S. cerevisiae pol II, and Table 2 of Thomen et al. 2008 [27] for T7 pol. To computationally replicate these experiments as faithfully as we could with the available information and computational limitations, simulations in this study were run on the 4 kb E. coli rpoB gene for RNAP (GenBank: EU274658), the first 4.75 kb of the human rpb1 gene for pol II (NCBI: NG_027747), and the first 10 kb of the Enterobacteria phage λ genome for T7 pol (NCBI: NC_001416). The mean velocities from 32 (for RNAP), 10 (for pol II) and 3 (for T7 pol) simulations of the full respective sequences were used to estimate the mean elongation velocity during MCMC-ABC, given F and [NTP].

For further information about the MCMC-ABC algorithm [25, 44], or the model indicator Mi, see S3 Appendix.
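The general shape of an MCMC-ABC sampler can be illustrated with a toy example. The sketch below is a generic ABC-MCMC scheme over a single continuous parameter with a symmetric proposal; it is not the algorithm of S3 Appendix, does not sample a model indicator, and replaces the elongation simulation with a hypothetical noisy simulator. All names and numerical settings are assumptions made for illustration.

```python
import math
import random


def mcmc_abc(observed, simulate, log_prior, theta0, epsilon, step, n_iter, seed=0):
    """Toy ABC-MCMC over one continuous parameter with a symmetric proposal."""
    rng = random.Random(seed)
    theta, chain = theta0, []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        # Metropolis step on the prior (the proposal is symmetric).
        log_ratio = log_prior(proposal) - log_prior(theta)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            # Accept only if the simulated data land within epsilon of the observation.
            if abs(simulate(proposal, rng) - observed) < epsilon:
                theta = proposal
        chain.append(theta)
    return chain


def uniform_log_prior(theta, low=0.0, high=100.0):
    return 0.0 if low < theta < high else -math.inf


def noisy_simulator(theta, rng):
    # Hypothetical stand-in for a stochastic elongation simulation.
    return theta + rng.gauss(0.0, 0.5)


chain = mcmc_abc(20.0, noisy_simulator, uniform_log_prior, theta0=20.0,
                 epsilon=1.0, step=1.0, n_iter=5000)
print(sum(chain) / len(chain))
```

The likelihood is never evaluated: a proposal is retained only when data simulated under it fall within the tolerance epsilon of the observation, so the cost of each iteration is dominated by the simulation.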

The posterior distributions

The posterior distributions from our MCMC-ABC experiments are presented in Table 2, Figs 5 and 6.

Fig 5. Posterior and prior distribution plots.

Posterior distributions for all models which appear in the 95% credible set are displayed (two models for RNAP, two models for pol II, and one model for T7 pol). Plots show the prior probability density P(θ) of each parameter and posterior probability density of each parameter conditional on the model P(θ|D, Mi). The geometric median point-estimates and highest posterior density (HPD) intervals (calculated with Tracer 1.6 [53]) are displayed above each plot (3 sf).

Fig 6. Posterior distributions of simulated velocities.

Black open circles represent experimentally measured mean velocities reported in the original publication for (A) RNAP, (B) pol II, and (C) T7 pol [4, 26, 27]. Each coloured dot represents a single sample simulated from the posterior distribution of parameters/models for the respective polymerase. 30 samples were generated from each of the three posterior distributions. For RNAP, [NTP]eq is defined as [ATP] = 5 μM, [CTP] = 2.5 μM, [GTP] = 10 μM, and [UTP] = 10 μM.

A large effective sample size (> 100 [53]) and a small potential scale reduction factor R̂ (< 1.1, as defined by Gelman et al. 1992 [54–56]) are essential for making reliable parameter estimates. Table 2 suggests that the parameters in the 95% credible set of models are sufficiently estimated by these criteria.
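As a sketch of the second criterion, the function below computes the classic Gelman-Rubin potential scale reduction factor from several equal-length chains of a single parameter. It assumes this is the diagnostic intended by the < 1.1 threshold; the function name and the toy chains are hypothetical, and refinements such as chain splitting are omitted.

```python
import statistics


def potential_scale_reduction(chains):
    """Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


# Toy chains: the first pair agree, the second pair sit in different regions.
print(potential_scale_reduction([[1, 2, 3, 4], [1, 2, 3, 4]]))
print(potential_scale_reduction([[1, 2, 3, 4], [11, 12, 13, 14]]))
```

Chains that explore the same region give a value close to (or slightly below) 1, while chains stuck in different regions give a value well above the 1.1 threshold.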

These results indicate that the best models for the datasets examined are Models 11 and 12 for both RNAP and pol II, and Model 5 for T7 pol (Fig 4B).

For pol II, Model 12 has the highest posterior probability P(M12|D) = 0.71. This is the most complex model considered, with 6 estimated parameters. In Model 12 translocation, NTP binding and catalysis are all kinetic processes; the displacement required to facilitate formation of the translocation transition state, δ1 < δ, is estimated rather than fixed at δ/2; and the standard Gibbs energy of the posttranslocated state is influenced by parameter ΔGτ1 ≠ 0.

The posterior distribution for RNAP consists of the same set of models as that of pol II. For RNAP, Model 11 has the highest probability P(M11|D) = 0.81. This model is a submodel of Model 12 with one fewer parameter: in Model 11 NTP binding is treated as an equilibrium process while in Model 12 it is not.

The only model in the 95% credible set for T7 pol is Model 5, with P(M5|D) = 0.96. In Model 5 (4 parameters) translocation, but not binding, is treated as an equilibrium process, and ΔGτ1 is estimated. This positions T7 pol in a quite different area of the model space to the other two polymerases.

Translocation rates differ among RNA polymerases

For RNAP and pol II, we estimate that a partial equilibrium approximation for the translocation step is inadequate. The posterior probability that such models are inadequate is 1.00 (see Table 2). For T7 pol, however, translocation is significantly faster than catalysis and is best modelled with a partial equilibrium approximation. Using estimates for $\Delta G_\tau^\ddagger$ and ΔGτ1 under the maximum posterior models (Model 11 for RNAP and Model 12 for pol II) we estimate the mean forward and backward translocation rates averaged across the rpoB sequence as: 230 s−1 and 112 s−1 for RNAP, and 350 s−1 and 12.7 s−1 for pol II, respectively (3 sf). These estimates are within one order of magnitude of the respective estimate for the rate of catalysis (Fig 5), suggesting that translocation and catalysis indeed occur on similar timescales.

For RNAP and pol II, translocation has frequently been modelled as an equilibrium process [4, 21, 26, 41, 43]; however, in some recent analyses this assumption has been rejected [16, 17, 42, 57, 58]. Our Bayesian analysis supports this rejection. In contrast, there is general agreement that translocation in T7 pol is adequately modelled as an equilibrium process [27, 59, 60].

The data does not determine the kinetics of the NTP binding step

It remains unclear how to best model the NTP binding step. Models that describe NTP binding as a kinetic process have posterior probabilities of 0.19 for RNAP, 0.71 for pol II and 0.96 for T7 pol (Table 2). However, in an earlier experiment, where we used a different prior distribution for one of the parameters, the latter probability was 0.21 and P(M4|D) was 0.79. The intermediate magnitude of these posterior probabilities, and sensitivity to the choice of prior, imply that the data contains very little information about which binding model is preferred.

Furthermore, KD and kbind (Models 5 and 12) are unable to be estimated simultaneously. For pol II and for T7 pol, kbind is estimated at around 0.48 and 1.4 μM−1 s−1 respectively with fairly narrow 95% highest posterior density (HPD) intervals (Fig 5). However, the HPD interval of KD spans three orders of magnitude and the value of this parameter was therefore poorly informed by the data. For RNAP, in contrast, neither kbind nor KD were well-informed by the data and both have HPD intervals spanning 1-2 orders of magnitude. This non-identifiability, where two or more parameters are unable to be estimated simultaneously (S4 Appendix), highlights the appeal of an NTP binding equilibrium model where only one parameter needs to be estimated, despite the unrealistic assumptions it may invoke. In the case of each enzyme, the data has taught us nothing about one or two of the binding parameters.
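For readers unfamiliar with HPD intervals, the sketch below shows one simple way to compute one from posterior samples: take the shortest interval that contains the requested fraction of the sorted samples. This is an illustrative approach and is not necessarily the method used by Tracer; the function name and sample values are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Small tolerance guards against floating-point error in mass * n.
    n_inside = max(1, math.ceil(mass * n - 1e-9))
    starts = range(n - n_inside + 1)
    best = min(starts, key=lambda i: ordered[i + n_inside - 1] - ordered[i])
    return ordered[best], ordered[best + n_inside - 1]


# Hypothetical samples with one outlier; a 90% interval excludes it.
lower, upper = hpd_interval([0, 1, 2, 3, 4, 5, 6, 7, 8, 100], mass=0.9)
print(lower, upper)
```

An interval whose upper and lower bounds differ by several orders of magnitude, as described above for the poorly informed binding parameter, indicates that the posterior is barely narrower than the prior.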

The pause-free mean velocities measured during transcription elongation follow Michaelis-Menten kinetics even though the reaction cycle is more complicated than that of a simple enzyme [61]. As such, the inability to resolve the timescale of the substrate binding step is unsurprising [62–64].
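One way to see why the binding parameters can be hard to separate is the textbook Briggs-Haldane scheme, in which reversible substrate binding is followed by an irreversible catalytic step. The sketch below assumes that simplified scheme (translocation is ignored, and the release rate is taken to be KD × kbind); it is not the model used in the analysis. Under these assumptions the steady-state velocity depends on KD and kbind only through Km = KD + kcat/kbind, so different parameter pairs give identical velocity curves. All numerical values are illustrative.

```python
def velocity(ntp, kcat, KD, kbind):
    """Steady-state rate (s^-1) for E + NTP <-> E.NTP -> E + product.

    ntp and KD are in uM, kbind in uM^-1 s^-1, kcat in s^-1.
    The release rate is assumed to be KD * kbind.
    """
    krelease = KD * kbind
    Km = (krelease + kcat) / kbind  # equals KD + kcat / kbind
    return kcat * ntp / (ntp + Km)

kcat = 30.0
pair_a = dict(KD=90.0, kbind=3.0)  # Km = 90 + 30/3   = 100 uM
pair_b = dict(KD=40.0, kbind=0.5)  # Km = 40 + 30/0.5 = 100 uM

concentrations = [1.0, 10.0, 100.0, 1000.0]
curve_a = [velocity(c, kcat, **pair_a) for c in concentrations]
curve_b = [velocity(c, kcat, **pair_b) for c in concentrations]
```

Because the two curves coincide at every concentration, velocity data of this kind cannot distinguish `pair_a` from `pair_b` without additional information, such as an informative prior on one of the parameters.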

In the transcription literature, NTP binding is almost always assumed to achieve equilibrium for RNAP, pol II, and T7 pol [4, 16, 17, 21, 26, 27, 37, 41, 42, 60]. However, Mejia et al. 2015 [43] have shown that NTP binding is indeed rate-limiting, and that mutations in the RNAP trigger loop impair the binding rate, suggesting that the trigger loop is coupled with NTP binding.

RNAP has an energetic preference for the posttranslocated state

In previous stochastic sequence-dependent models [16, 21] the standard Gibbs energies of the pre and posttranslocated states have been based solely on the nucleic acid basepairing energies. Our models include an additional term, ΔGτ1, to account for potential interactions between the protein and the nucleic acid. The marginal posterior probability of a model in which an additional term ΔGτ1 is required is 1.00 in all three polymerases. In each case ΔGτ1 was estimated to be less than 0 kBT and 0 kBT is not included in the 95% HPD interval (Fig 5). We find that is the most significant in pol II and T7 pol: −4.6 kBT and −4.0 kBT respectively, while kBT for RNAP (2 sf).

These results suggest that structural elements within RNA polymerases can energetically favour posttranslocated states over pretranslocated states. We note that the sequence-dependent contribution of the dangling end of the DNA/RNA hybrid is included in the thermodynamic model. The energetic bias for the posttranslocated state is separable from this effect.

To facilitate comparison with previous deterministic models, using our estimates of ΔGτ1 we calculated the equilibrium constant between the pre and posttranslocated states. Geometrically averaged across the rpoB gene, these are (11)

Thus, for all three polymerases, Kτ < 1, indicating that the small energetic preference that the protein has for the posttranslocated state is sufficient to override the loss of basepairing energy, thereby biasing the system towards population of the posttranslocated positions. This is in agreement with estimates made for pol II and T7 pol [26, 27, 35, 36, 41] and with Kireeva et al. 2018 [58] for RNAP: “forward translocation occurs in milliseconds and is poorly reversible”. However, these estimates are inconsistent with some RNAP and pol II studies which place this ratio above 1 [4, 17, 42, 52].
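The conversion from Gibbs energies to a geometrically averaged equilibrium constant can be sketched as follows. The sketch assumes that Kτ is the ratio of pretranslocated to posttranslocated populations (consistent with Kτ < 1 favouring the posttranslocated state), that energies are expressed in units of kBT, and that ΔGτ1 is added to the basepairing energy of the posttranslocated state. The per-site basepairing energies are hypothetical and are not derived from the rpoB sequence.

```python
import math

def log_k_tau(g_pre_bp, g_post_bp, dG_tau1):
    """log of K = [pre]/[post] at one template position (energies in kBT).

    Assumption: the posttranslocated state has energy g_post_bp + dG_tau1.
    """
    g_post = g_post_bp + dG_tau1
    return -(g_pre_bp - g_post)

def geometric_mean_k(sites, dG_tau1):
    """Geometric mean of K across sites, i.e. exp of the mean log K."""
    logs = [log_k_tau(g_pre, g_post, dG_tau1) for g_pre, g_post in sites]
    return math.exp(sum(logs) / len(logs))

# Hypothetical (pre, post) basepairing energies; post loses some basepairing.
sites = [(-20.0, -19.0), (-22.0, -20.5), (-18.0, -17.5)]

k_without_bias = geometric_mean_k(sites, 0.0)   # basepairing energies only
k_with_bias = geometric_mean_k(sites, -4.0)     # illustrative dG_tau1 of -4 kBT
```

With basepairing energies alone the toy constant exceeds 1, whereas a negative ΔGτ1 of a few kBT is enough to bring it below 1, which is the qualitative pattern described above.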

Kinetic modelling can itself suggest no physical mechanism for the stabilisation. Yu et al. 2012 [36] have identified a conserved tyrosine residue near the active site of T7 pol that pushes against the 3′ end of the mRNA, and thus stabilises the posttranslocated state. They propose a similar mechanism for the multi-subunit RNA polymerases.

δ1 may be an important parameter but its physical meaning is unclear

Our results suggest that δ1, the distance that RNA polymerase must translocate forward by to reach the translocation transition state, is a necessary parameter to estimate for RNAP and pol II. Setting δ1 = δ/2 is not sufficient. The marginal posterior probability of models which estimate this term is 1.00. δ1 is irrelevant to the modeling of the T7 pol data because the best models invoke a partial equilibrium approximation for the translocation step.

While our prior distribution restricted δ1 to lie in the range (0, δ), the upper ends of our 95% HPD intervals of δ1 for RNAP and pol II are very close to δ = 3.4 Å. Were it not for this prior distribution, δ1 estimates would have included values higher than δ. Similar results have been observed by Maoiléidigh et al. 2011 [17] for RNAP.

Our interpretation of δ1 implies it should never be greater than δ nor should δ be more than the width of one basepair. The physical meaning of δ1 with values greater than δ is thus unclear. It is noted that δ1 is only used when F ≠ 0.
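The role of δ1 can be illustrated with a generic transition-state (Bell-type) force dependence. The functional form below is a common textbook choice and is an assumption here, as are the sign convention (positive F assists forward translocation), the value kBT ≈ 41.1 pN·Å near room temperature, and the use of the sequence-averaged RNAP rates quoted earlier as zero-force inputs. Under this form δ1 only sets how the work F·δ is split between the forward and backward rates; it has no effect at F = 0 and does not change their ratio.

```python
import math

KBT = 41.1    # pN * Angstrom, approximate value near room temperature
DELTA = 3.4   # Angstrom, distance between adjacent basepairs

def translocation_rates(k_fwd0, k_bck0, force, delta1, delta=DELTA):
    """Force-dependent forward and backward translocation rates (s^-1).

    Assumed form: the transition state lies delta1 ahead of the
    pretranslocated state and (delta - delta1) behind the posttranslocated state.
    """
    k_fwd = k_fwd0 * math.exp(force * delta1 / KBT)
    k_bck = k_bck0 * math.exp(-force * (delta - delta1) / KBT)
    return k_fwd, k_bck

# Zero force: delta1 has no effect on either rate.
at_zero_force = translocation_rates(230.0, 112.0, 0.0, delta1=1.0)

# Assisting force of 10 pN with two different transition state positions.
early = translocation_rates(230.0, 112.0, 10.0, delta1=1.0)
late = translocation_rates(230.0, 112.0, 10.0, delta1=3.0)
ratio_early = early[0] / early[1]
ratio_late = late[0] / late[1]
```

In this picture a value of δ1 above δ would make the exponent of the backward rate positive under an assisting force, so that the same force accelerates both directions, which is one way of seeing why such values are hard to interpret physically.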

Comparing the kinetics of RNA polymerases

The in vivo rate of transcription elongation varies considerably across RNAP, pol II and T7 pol. The prokaryotic and eukaryotic RNA polymerases have a mean rate ranging from 20-120 bp/s [45, 46, 48, 49, 65–67], which may be slowed down in histone-wrapped regions of eukaryotic genomes [7]. In contrast, Bacteriophage T7 pol operates up to an order of magnitude faster (around 200-240 bp/s [49, 68]) and is known to be quite insensitive to transcriptional pause sites [9, 27].

In addition to these differences, we have shown that translocation is very rapid in T7 pol, relative to the rate of NTP incorporation, while the disparity is much less significant in RNAP and pol II. Furthermore, the model does not fit the data for T7 pol as closely as it does for RNAP and pol II (Fig 6). T7 pol therefore seems to operate under quite a different kinetic scheme than that of the cellular polymerases, which is not unexpected given their distant evolutionary relationship [3].

In general, the elongation velocity of RNA polymerase is significantly slower in an optical trap (with estimates ranging from 9.7-22 bp/s for RNAP [11–13, 43, 69]) compared with that of the untethered enzyme (with estimates in vitro or in vivo ranging from 25-118 bp/s for RNAP [45, 49, 70, 71]). This relationship holds for multiple RNA polymerases including E. coli RNAP, S. cerevisiae pol II [41, 42, 52, 72], Bacteriophage T7 pol [9, 27, 49, 51], and Bacteriophage Φ6 P2 [10, 73]. This suggests that optical trapping perturbs the system to a significant extent. Additionally, varying degrees of heterogeneity in elongation rate have been observed across different polymerase complexes even under the same conditions [11, 13, 27].

The velocity perturbations resulting from the optical trapping apparatus will be propagated into the model parameters, especially kcat, and , and some caution is needed when extrapolating these results to untethered systems.

Bayesian inference of transcription elongation

To our knowledge, we are the first to perform Bayesian inference on single-molecule models of transcription elongation. This was achieved by simulation, which necessitated the use of approximate Bayesian computation. An alternative would be to build and use a likelihood function (i.e. the probability of taking exactly t units of time for RNA polymerase to copy the sequence n times). The latter approach can be achieved using chemical master equations, as opposed to (Gillespie) sampling from the distribution. Finding analytical, stable numerical, or approximate solutions to the chemical master equations could provide similar insight in less computational time, but this approach is susceptible to a multitude of analytical and numerical issues associated with the exponentiation of an arbitrary transition rate matrix that grows with the length of the sequence (S2 Appendix) [74]. This problem would be amplified by the introduction of backtracking, hypertranslocation, or NTP misincorporation reactions into the model, for instance. The Bayesian framework we have presented, although computationally intensive due to its simulation requirement, is general and will work on any model of transcription without the need to resolve these issues. The path has been paved for modelling transcriptional pausing, for instance [16, 21, 75]. Nevertheless, likelihood-based Bayesian inference is an approach that should be explored in the future.
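The simulation-based approach can be illustrated with a deliberately small example. The sketch below uses a sequence-independent toy cycle with only two states per template position and a single lumped catalysis rate, and it uses plain rejection ABC rather than the MCMC-ABC algorithm of S3 Appendix. The function names, the uniform prior range, the tolerance, and the fixed translocation rates are all assumptions made for illustration.

```python
import random

def simulate_velocity(kcat, k_fwd, k_bck, n_nt=200, rng=random):
    """Gillespie simulation of a toy elongation cycle; returns velocity in nt/s.

    Pretranslocated state: forward translocation only (rate k_fwd).
    Posttranslocated state: backward translocation (k_bck) competes with
    a lumped binding-plus-catalysis step (kcat) that extends the transcript.
    """
    t = 0.0
    transcribed = 0
    posttranslocated = False
    while transcribed < n_nt:
        if not posttranslocated:
            t += rng.expovariate(k_fwd)          # waiting time for the only reaction
            posttranslocated = True
        else:
            total = k_bck + kcat
            t += rng.expovariate(total)          # waiting time for the next reaction
            if rng.random() < kcat / total:      # choose reaction by its propensity
                transcribed += 1
            posttranslocated = False             # both outcomes lead to a pre state
    return n_nt / t

def abc_rejection(v_obs, n_draws, epsilon, prior=(1.0, 100.0), seed=1):
    """Keep prior draws of kcat whose simulated velocity is within epsilon of v_obs."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        kcat = rng.uniform(*prior)               # draw from the (assumed) uniform prior
        v_sim = simulate_velocity(kcat, 230.0, 112.0, rng=rng)
        if abs(v_sim - v_obs) < epsilon:
            accepted.append(kcat)
    return accepted

# Hypothetical observed velocity of 18.5 nt/s.
accepted = abc_rejection(v_obs=18.5, n_draws=300, epsilon=2.0)
posterior_mean = sum(accepted) / len(accepted)
```

The accepted values approximate the posterior of kcat under the toy model. No likelihood is evaluated at any point, which is why the same scheme carries over unchanged when further reactions are added to the simulator, at the cost of many simulations per accepted sample.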

We have demonstrated that single-molecule data can be usefully analysed using a Bayesian inference and model selection framework. This analysis would have even greater statistical power if applied to the progression of individual RNA polymerase complexes instead of mean velocities averaged across multiple experiments.


In this article we evaluated some simple Brownian ratchet models of transcription elongation (Fig 2). By varying the parameterisation of the translocation step (Fig 3) and incorporating partial equilibrium approximations commonly invoked in the literature (Fig 4A) we enumerated a total of 12 related models (Fig 4B). Using stochastic simulations and approximate Bayesian computation, we then assessed which of these models were capable of describing the force-velocity data previously measured for several RNA polymerases (Table 2 and Fig 5) using single-molecule optical trapping experiments [4, 26, 27].

Our analysis suggests that: 1) different partial equilibrium approximations of the translocation step are appropriate for the multisubunit RNA polymerases versus the single subunit T7 RNA polymerase; 2) treatment of the NTP binding step remains a point of ambiguity, as the existing data does not place strong constraints on the modelling of this step; 3) there is an energetic bias for the posttranslocated state; and 4) the model of force-dependent translocation, which invokes transition state theory, is not physically realistic.

Supporting information

S1 Appendix. Stochastic simulation.

The Gillespie algorithm is described.


S2 Appendix. Chemical master equations.

Master equations for the four equilibrium model variants are presented.


S3 Appendix. MCMC-ABC.

A description of the MCMC-ABC algorithm and how it is used to infer parameters and models from experimental data.


S4 Appendix. Prior distributions.

Simulation-based justifications behind prior distributions for , kbind, and .


S1 Fig. Simulations of the elongation pathway.

Each point is a single simulation of the full rpoB gene (4029 nt). For (A-C), parameters on the x- and z-axis are sampled uniformly at random from the displayed range at the beginning of each trial. The y-axis of each plot (mean elongation velocity) is then measured from the respective simulation. [NTP] and F were held constant at 1000 μM and 0 pN respectively. (A) and (B): Relationship between and kcat for the melting model with binding at equilibrium (Model 8). ΔGτ1 was set to its prior mean (0 for RNAP and pol II, and -3.3 for T7 pol). (C) Relationship between kbind and kcat for the kinetic binding model with translocation at equilibrium (Model 2). (D) Relationship between KD and kbind with translocation held at equilibrium (Model 2). KD and kbind were sampled uniformly from the specified range and the velocity was measured. Samples with simulated velocities outside of the range 1-2 bp/s were discarded. [NTP] = 10 μM and kcat = 100 s−1.



  1. Sweetser D, Nonet M, Young RA. Prokaryotic and eukaryotic RNA polymerases have homologous core subunits. Proceedings of the National Academy of Sciences. 1987;84(5):1192–1196.
  2. Sosunov V, Sosunova E, Mustaev A, Bass I, Nikiforov V, Goldfarb A. Unified two-metal mechanism of RNA synthesis and degradation by RNA polymerase. The EMBO journal. 2003;22(9):2234–2244. pmid:12727889
  3. Sousa R, Chung YJ, Rose JP, Wang BC. Crystal structure of bacteriophage T7 RNA polymerase at 3.3 Å resolution. Nature. 1993;364(6438):593. pmid:7688864
  4. Abbondanzieri EA, Greenleaf WJ, Shaevitz JW, Landick R, Block SM. Direct observation of base-pair stepping by RNA polymerase. Nature. 2005;438(7067):460–465. pmid:16284617
  5. Adelman K, La Porta A, Santangelo TJ, Lis JT, Roberts JW, Wang MD. Single molecule analysis of RNA polymerase elongation reveals uniform kinetic behavior. Proceedings of the National Academy of Sciences. 2002;99(21):13538–13543.
  6. Bai L, Fulbright RM, Wang MD. Mechanochemical kinetics of transcription elongation. Physical review letters. 2007;98(6):068103. pmid:17358986
  7. Hodges C, Bintu L, Lubkowska L, Kashlev M, Bustamante C. Nucleosomal fluctuations govern the transcription dynamics of RNA polymerase II. Science. 2009;325(5940):626–628. pmid:19644123
  8. Galburt EA, Grill SW, Wiedmann A, Lubkowska L, Choy J, Nogales E, et al. Backtracking determines the force sensitivity of RNAP II in a factor-dependent manner. Nature. 2007;446(7137):820–823. pmid:17361130
  9. Skinner GM, Baumann CG, Quinn DM, Molloy JE, Hoggett JG. Promoter binding, initiation, and elongation by bacteriophage T7 RNA polymerase: a single-molecule view of the transcription cycle. Journal of Biological Chemistry. 2004;279(5):3239–3244. pmid:14597619
  10. Dulin D, Vilfan ID, Berghuis BA, Hage S, Bamford DH, Poranen MM, et al. Elongation-competent pauses govern the fidelity of a viral RNA-dependent RNA polymerase. Cell reports. 2015;10(6):983–992. pmid:25683720
  11. Neuman KC, Abbondanzieri EA, Landick R, Gelles J, Block SM. Ubiquitous transcriptional pausing is independent of RNA polymerase backtracking. Cell. 2003;115(4):437–447. pmid:14622598
  12. Davenport RJ, Wuite GJ, Landick R, Bustamante C. Single-molecule study of transcriptional pausing and arrest by E. coli RNA polymerase. Science. 2000;287(5462):2497. pmid:10741971
  13. Tolić-Nørrelykke SF, Engh AM, Landick R, Gelles J. Diversity in the rates of transcript elongation by single RNA polymerase molecules. Journal of Biological Chemistry. 2004;279(5):3292–3299. pmid:14604986
  14. Abbondanzieri EA, Shaevitz JW, Block SM. Picocalorimetry of transcription by RNA polymerase. Biophysical journal. 2005;89(6):L61–L63. pmid:16239336
  15. Watson JD, Crick FH, et al. Molecular structure of nucleic acids. Nature. 1953;171(4356):737–738. pmid:13054692
  16. Tadigotla VR, Maoiléidigh DÓ, Sengupta AM, Epshtein V, Ebright RH, Nudler E, et al. Thermodynamic and kinetic modeling of transcriptional pausing. Proceedings of the National Academy of Sciences of the United States of America. 2006;103(12):4439–4444. pmid:16537373
  17. Maoiléidigh DÓ, Tadigotla VR, Nudler E, Ruckenstein AE. A unified model of transcription elongation: what have we learned from single-molecule experiments? Biophysical journal. 2011;100(5):1157–1166.
  18. Maitra U, Nakata Y, Hurwitz J. The Role of Deoxyribonucleic Acid in Ribonucleic Acid Synthesis XIV. A Study of the Initiation of Ribonucleic Acid Synthesis. Journal of Biological Chemistry. 1967;242(21):4908–4918. pmid:4862425
  19. Erie DA, Yager TD, Von Hippel PH. The single-nucleotide addition cycle in transcription: a biophysical and biochemical perspective. Annual review of biophysics and biomolecular structure. 1992;21(1):379–415. pmid:1381976
  20. Rhodes G, Chamberlin MJ. Ribonucleic acid chain elongation by Escherichia coli ribonucleic acid polymerase I. Isolation of ternary complexes and the kinetics of elongation. Journal of Biological Chemistry. 1974;249(20):6675–6683. pmid:4608711
  21. Bai L, Shundrovsky A, Wang MD. Sequence-dependent kinetic model for transcription elongation by RNA polymerase. Journal of molecular biology. 2004;344(2):335–349. pmid:15522289
  22. Bustamante C, Chemla YR, Forde NR, Izhaky D. Mechanical processes in biochemistry. Annual review of biochemistry. 2004;73(1):705–748. pmid:15189157
  23. Cleland W. Partition analysis and concept of net rate constants as tools in enzyme kinetics. Biochemistry. 1975;14(14):3220–3224. pmid:1148201
  24. Gillespie DT. Exact stochastic simulation of coupled chemical reactions. The journal of physical chemistry. 1977;81(25):2340–2361.
  25. Beaumont MA. Approximate Bayesian computation in evolution and ecology. Annual review of ecology, evolution, and systematics. 2010;41:379–406.
  26. Schweikhard V, Meng C, Murakami K, Kaplan CD, Kornberg RD, Block SM. Transcription factors TFIIF and TFIIS promote transcript elongation by RNA polymerase II by synergistic and independent mechanisms. Proceedings of the National Academy of Sciences. 2014;111(18):6642–6647.
  27. Thomen P, Lopez P, Bockelmann U, Guillerez J, Dreyfus M, Heslot F. T7 RNA polymerase studied by force measurements varying cofactor concentration. Biophysical journal. 2008;95(5):2423–2433. pmid:18708471
  28. Wang MD, Schnitzer MJ, Yin H, Landick R, Gelles J, Block SM. Force and velocity measured for single molecules of RNA polymerase. Science. 1998;282(5390):902–907. pmid:9794753
  29. Shaevitz JW, Abbondanzieri EA, Landick R, Block SM. Backtracking by single RNA polymerase molecules observed at near-base-pair resolution. Nature. 2003;426(6967):684–687. pmid:14634670
  30. Artsimovitch I, Landick R. Pausing by bacterial RNA polymerase is mediated by mechanistically distinct classes of signals. Proceedings of the National Academy of Sciences. 2000;97(13):7090–7095.
  31. Zhou Y, Navaroli DM, Enuameh MS, Martin CT. Dissociation of halted T7 RNA polymerase elongation complexes proceeds via a forward-translocation mechanism. Proceedings of the National Academy of Sciences. 2007;104(25):10352–10357.
  32. Greive SJ, Von Hippel PH. Thinking quantitatively about transcriptional regulation. Nature Reviews Molecular Cell Biology. 2005;6(3):221–232. pmid:15714199
  33. SantaLucia J. A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proceedings of the National Academy of Sciences. 1998;95(4):1460–1465.
  34. Wu P, Nakano Si, Sugimoto N. Temperature dependence of thermodynamic properties for DNA/DNA and RNA/DNA duplex formation. The FEBS Journal. 2002;269(12):2821–2830.
  35. Yin YW, Steitz TA. The structural mechanism of translocation and helicase activity in T7 RNA polymerase. Cell. 2004;116(3):393–404. pmid:15016374
  36. Yu J, Oster G. A small post-translocation energy bias aids nucleotide selection in T7 RNA polymerase transcription. Biophysical journal. 2012;102(3):532–541. pmid:22325276
  37. Thomen P, Lopez PJ, Heslot F. Unravelling the mechanism of RNA-polymerase forward motion by using mechanical force. Physical Review Letters. 2005;94(12):128102. pmid:15903965
  38. Depken M, Galburt EA, Grill SW. The origin of short transcriptional pauses. Biophysical journal. 2009;96(6):2189–2193. pmid:19289045
  39. Herbert KM, Greenleaf WJ, Block SM. Single-molecule studies of RNA polymerase: motoring along. Annu Rev Biochem. 2008;77:149–176. pmid:18410247
  40. Lecca P. Stochastic chemical kinetics. Biophysical reviews. 2013;5(4):323–345. pmid:28510113
  41. Larson MH, Zhou J, Kaplan CD, Palangat M, Kornberg RD, Landick R, et al. Trigger loop dynamics mediate the balance between the transcriptional fidelity and speed of RNA polymerase II. Proceedings of the National Academy of Sciences. 2012;109(17):6555–6560.
  42. Dangkulwanich M, Ishibashi T, Liu S, Kireeva ML, Lubkowska L, Kashlev M, et al. Complete dissection of transcription elongation reveals slow translocation of RNA polymerase II in a linear ratchet mechanism. Elife. 2013;2:e00971. pmid:24066225
  43. Mejia YX, Nudler E, Bustamante C. Trigger loop folding determines transcription rate of Escherichia coli’s RNA polymerase. Proceedings of the National Academy of Sciences. 2015;112(3):743–748.
  44. Csilléry K, Blum MG, Gaggiotti OE, François O. Approximate Bayesian computation (ABC) in practice. Trends in ecology & evolution. 2010;25(7):410–418.
  45. Vogel U, Jensen KF. The RNA chain elongation rate in Escherichia coli depends on the growth rate. Journal of bacteriology. 1994;176(10):2807–2813. pmid:7514589
  46. Ryals J, Little R, Bremer H. Temperature dependence of RNA synthesis parameters in Escherichia coli. Journal of bacteriology. 1982;151(2):879–887. pmid:6178724
  47. Richardson JP, Greenblatt J. Control of RNA chain elongation and termination. Escherichia coli and Salmonella: cellular and molecular biology. 1996;1:822–848.
  48. Mason PB, Struhl K. Distinction and relationship between elongation rate and processivity of RNA polymerase II in vivo. Molecular cell. 2005;17(6):831–840. pmid:15780939
  49. Iost I, Guillerez J, Dreyfus M. Bacteriophage T7 RNA polymerase travels far ahead of ribosomes in vivo. Journal of bacteriology. 1992;174(2):619–622. pmid:1729251
  50. Bonner G, Lafer EM, Sousa R. Characterization of a set of T7 RNA polymerase active site mutants. Journal of Biological Chemistry. 1994;269(40):25120–25128. pmid:7929200
  51. Anand VS, Patel SS. Transient state kinetics of transcription elongation by T7 RNA polymerase. Journal of Biological Chemistry. 2006;281(47):35677–35685. pmid:17005565
  52. Kireeva ML, Nedialkov YA, Cremona GH, Purtov YA, Lubkowska L, Malagon F, et al. Transient reversal of RNA polymerase II active site closing controls fidelity of transcription elongation. Molecular cell. 2008;30(5):557–566. pmid:18538654
  53. Rambaut A, Drummond A. Tracer 1.6. University of Edinburgh, Edinburgh. UK. Technical report; 2013.
  54. Gelman A, Rubin DB, et al. Inference from iterative simulation using multiple sequences. Statistical science. 1992;7(4):457–472.
  55. Brooks SP, Gelman A. General methods for monitoring convergence of iterative simulations. Journal of computational and graphical statistics. 1998;7(4):434–455.
  56. Brooks S, Gelman A, Jones G, Meng XL. Handbook of markov chain monte carlo. CRC press; 2011.
  57. Nedialkov YA, Nudler E, Burton ZF. RNA polymerase stalls in a post-translocated register and can hyper-translocate. Transcription. 2012;3(5):260–269. pmid:23132506
  58. Kireeva M, Trang C, Matevosyan G, Turek-Herman J, Chasov V, Lubkowska L, et al. RNA–DNA and DNA–DNA base-pairing at the upstream edge of the transcription bubble regulate translocation of RNA polymerase and transcription rate. Nucleic acids research. 2018;46(11):5764–5775. pmid:29771376
  59. Guajardo R, Lopez P, Dreyfus M, Sousa R. NTP concentration effects on initial transcription by T7 RNAP indicate that translocation occurs through passive sliding and reveal that divergent promoters have distinct NTP concentration requirements for productive initiation. Journal of molecular biology. 1998;281(5):777–792. pmid:9719634
  60. Arnold S, Siemann M, Scharnweber K, Werner M, Baumann S, Reuss M, et al. Kinetic modeling and simulation of in vitro transcription by phage T7 RNA polymerase. Biotechnology and bioengineering. 2001;72(5):548–561. pmid:11460245
  61. Wong F, Dutta A, Chowdhury D, Gunawardena J. Structural conditions on complex networks for the Michaelis–Menten input–output response. Proceedings of the National Academy of Sciences. 2018;115(39):9738–9743.
  62. Briggs GE, Haldane JBS. A note on the kinetics of enzyme action. Biochemical journal. 1925;19(2):338. pmid:16743508
  63. English BP, Min W, Van Oijen AM, Lee KT, Luo G, Sun H, et al. Ever-fluctuating single enzyme molecules: Michaelis-Menten equation revisited. Nature chemical biology. 2006;2(2):87–94. pmid:16415859
  64. Schnell S. Validity of the Michaelis–Menten equation–steady-state or reactant stationary assumption: that is the question. The FEBS journal. 2014;281(2):464–472. pmid:24245583
  65. Tennyson CN, Klamut HJ, Worton RG. The human dystrophin gene requires 16 hours to be transcribed and is cotranscriptionally spliced. Nature genetics. 1995;9(2):184–190. pmid:7719347
  66. Darzacq X, Shav-Tal Y, De Turris V, Brody Y, Shenoy SM, Phair RD, et al. In vivo dynamics of RNA polymerase II transcription. Nature structural & molecular biology. 2007;14(9):796–806.
  67. Kainov DE, Lísal J, Bamford DH, Tuma R. Packaging motor from double-stranded RNA bacteriophage ϕ12 acts as an obligatory passive conduit during transcription. Nucleic acids research. 2004;32(12):3515–3521. pmid:15247341
  68. Makarova OV, Makarov EM, Sousa R, Dreyfus M. Transcribing of Escherichia coli genes with mutant T7 RNA polymerases: stability of lacZ mRNA inversely correlates with polymerase speed. Proceedings of the National Academy of Sciences. 1995;92(26):12250–12254.
  69. Mejia YX, Mao H, Forde NR, Bustamante C. Thermal probing of E. coli RNA polymerase off-pathway mechanisms. Journal of molecular biology. 2008;382(3):628–637. pmid:18647607
  70. Burns CM, Richardson LV, Richardson JP. Combinatorial effects of NusA and NusG on transcription elongation and rho-dependent termination in Escherichia coli. Journal of molecular biology. 1998;278(2):307–316. pmid:9571053
  71. Kingston R, Nierman W, Chamberlin M. A direct effect of guanosine tetraphosphate on pausing of Escherichia coli RNA polymerase during RNA chain elongation. Journal of Biological Chemistry. 1981;256(6):2787–2797. pmid:7009598
  72. Galburt EA, Grill SW, Bustamante C. Single molecule transcription elongation. Methods. 2009;48(4):323–332. pmid:19426807
  73. Usala SJ, Brownstein BH, Haselkorn R. Displacement of parental RNA strands during in vitro transcription by bacteriophage φ6 nucleocapsids. Cell. 1980;19(4):855–862. pmid:7379123
  74. Moler C, Van Loan C. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM review. 2003;45(1):3–49.
  75. Bai L, Wang MD. Comparison of pause predictions of two sequence-dependent transcription models. Journal of Statistical Mechanics: Theory and Experiment. 2010;2010(12):P12007.