Linking Core Promoter Classes to Circadian Transcription

Circadian rhythms in transcription are generated by rhythmic abundances and DNA binding activities of transcription factors. Propagation of rhythms to transcriptional initiation involves the core promoter, its chromatin state, and the basal transcription machinery. Here, I characterize core promoters and chromatin states of genes transcribed in a circadian manner in mouse liver and in Drosophila. It is shown that the core promoter is a critical determinant of circadian mRNA expression in both species. A distinct core promoter class, strong circadian promoters (SCPs), is identified in mouse liver but not Drosophila. SCPs are defined by specific core promoter features, and are shown to drive circadian transcriptional activities with both high averages and high amplitudes. Data analysis and mathematical modeling further provided evidence for rhythmic regulation of both polymerase II recruitment and pause release at SCPs. The analysis provides a comprehensive and systematic view of core promoters and their link to circadian mRNA expression in mouse and Drosophila, and thus reveals a crucial role for the core promoter in regulated, dynamic transcription.

Here, we ask how various kinetic parameters (such as Pol II recruitment or release rates) influence pausing index (PI) and rhythm propagation. Rhythm propagation is assumed to occur between rhythmic Pol II recruitment and transcriptional initiation, or between rhythmic Pol II release from the paused state and transcriptional initiation, or a combination of both.
From the chip-seq measurements, we obtain three readouts: promoter-proximal paused Pol II density, gene body density, and the ratio of the former to the latter (the PI). Since chip-seq experiments were performed at different times of day, we can estimate relative amplitudes and phases for each of the readouts.
Here, we derive phases of these three readouts for 3 rhythm propagation scenarios: 1. Only oscillating recruitment of Pol II to the promoter. 2. Only rhythmic regulation of release from the paused state. 3. A combination of these two. We do so by analyzing a quantitative model for recruitment of Pol II and transcription initialization from the original publication (Le Martelot et al. PLoS Biol 10 2012). Scenarios 1 and 2 were considered there, where it was shown (as also done more elaborately here below) that scenario 1 will cause promoter-proximal paused Pol II density and gene body density to oscillate with the same phase. Scenario 2 will cause them to oscillate with opposite phases. Since opposite phases were not observed in the chip-seq data, it was concluded that circadian regulation operates at the level of recruitment of Pol II to promoters. Here we show that scenario 3 will lead to in-phase oscillations of promoter-proximal paused Pol II density and gene body density, as observed experimentally and as in scenario 1. We show, however, that scenario 3 will lead to a PI in antiphase to promoter-proximal paused Pol II density and gene body density. Scenario 1 leads to in-phase oscillations in all three readouts. We have found in the experimental data that for a subset of transcripts, the PI is indeed in antiphase to the Pol II densities.
How can the PI be in antiphase to promoter-proximal paused Pol II density and gene body density? Let these two densities have relative amplitudes A prom and A gb , respectively, but the same phase, and form their ratio f(t), which (apart from a constant factor which we here omit) will look like: It follows from elementary analysis of the extrema that this ratio will have the same phases as the two densities except if A prom < A gb , in which case the phases will be opposite. This is illustrated by the plots below. Thus, our objective involves analyzing how the 3 scenarios outlined above lead to different relations between the relative amplitudes of the densities. We will let recruitment of Pol II to the promoter have a relative amplitude A f . The release from the paused state has a relative amplitude A d . We shall analyze how the observables A prom and A gb and the A PI , the relative amplitude of the PI, depend on A f and A d . In this way, we can for instance infer from data whether A d sometimes is greater than zero, which means that Pol II release is regulated in a circadian fashion. Since we observe such inverted phases of the PI in a subset of transcripts, this analysis allows us to infer that circadian regulation of release from the paused state occurs and is relatively widespread, even if it is not the main cause of circadian rhythms in transcriptional activities.
Finally, we show that rhythm propagation from Pol II recruitment generally weakens when the mean recruitment rate increases. However, when the Pol II initiation rate increases, rhythm propagation increases as well (section Analysis of transfer functions: effects of parameters on paused Pol II levels and rhythm propagation). This relates to Figure  4C in the main text.

��������������������������������������������
To analyze the consequences of the 3 scenarios, we consider the mathematical model for the initiation of transcription formulated by Le Martelot et al. 2012. This model considers 4 steps: 1, The recruitment of Pol II to chromatin which may be circadian with relative amplitude A f . 2, Chromatin opening and pre-initiation complex formation of Pol II. 3, Early elongation and formation of promoter-proximal paused Pol II. 4, Release and initiation of transcription from the paused state, which may be circadian with relative amplitude A d . The model consists of 4 variables, three of them x + y + z, represent promoter-associated Pol II, and the fourth, w, represents elongating Pol II. The readout x + y + z will exhibit a relative amplitude A prom , and w a relative amplitude A gb . We define this model in Mathematica code like so: where u1 represents a relative change in Pol II recruitment rate, and where u2 represents a relative change in the rate of transcription initiation from the paused state. These two input signals are set to zero when reproducing the original model and computing steady states, but we let them oscillate around zero with amplitudes between 0 and 1 when analyzing the model's reaction to circadian modulation of these rates. We summarize the model parameters, which we have converted into hours -1 : ��  ��� ��  �� ��  ���� ��  ��� ��  ��� �  ���� and the differential equations:
We need the steady state in order to linearize the system for a comprehensive analysis. We solve for the steady state (u1 = u2 = 0): We define a state space model object around this steady state (which one can easily linearize and analyze in Mathematica) with the 2 inputs u1 and u2, and with 3 outputs: promoter-associated Pol II (x + y + z), transcribing Pol II (w), and pausing index (PI): (x + y + z) / w.

�����������
Here, we let Pol II recruitment oscillate with relative amplitude A f = 0.5, and let the release into productive elongation have a constant rate (A d = 0). The results are not dependent on parameters or the magnitude of A f , as outlined further below.
This is the code to solve the model ODEs numerically. We start by computing 100 days to eliminate transients, then compute another 72 hours.
To the three outputs each, a cosine is fitted using the following code, which yields means, relative amplitudes, and phases of the solutions. Conclusion: Scenario 1 entails in-phase Pol II and PI. The PI oscillates with such a low amplitude, that it will be hard to detect above noise level experimentally. When analyzing Pol II chip-seq data, we found this scenario to be consistent with observations for most, but not all, circadian genes.

�����������
Here, we let Pol II recruitment be constant, A f = 0, and let the release into productive elongation oscillate (A d = 0.5). The results are not again not dependent on parameters, as outlined further below.

�����������
Here, we let both Pol II recruitment and release into productive elongation oscillate. The results are not again not dependent on parameters, as outlined further below.
We plot the corresponding three outputs: Instead letting Pol II release have the greater amplitude leads to the same results as for scenario 2. In general, starting with scenario 2 and then entering scenario 3 by successively increasing A f starts leading to the above "inverted PI" phases at a point which is depending on the parameters, as discussed further below.

Conclusion: Scenario 3 entails promoter-associated Pol II and gene body
Pol II with the same phase. The PI is, however, in antiphase to these Pol II densities. According to the data analyzed, this is widespread although not occurring at the majority of circadian genes.
We consider relative amplitude gains, which are the ratios of relative amplitudes of the variables to the relative amplitude of oscillation in the input signals. A gain of 0.5 means for the variable x means that if an input signal oscillates with relative amplitude 0.6, then the variable oscillates with relative amplitude 0.6×0.5 = 0.3. Gains can be computed as the absolute values of the transfer functions between the inputs and variables, for the system linearized around the steady states. We derive transfer functions manually using Mason's rule (Mason SJ (1956) Feedback theory: further properties of signal flow graphs. Proceedings of the IRE 920-926) based on the linearization above, since using Mathematica's builtin TransferFunctionModel function takes too long (symbolic matrix inversion is very slow). Many of the following details (especially everything between the thin horizontal lines) can be skipped when reading but are shown in order to follow the process of transfer function derivation with Mason's rule.
Transfer function from input u3, the rate of initiation of transcription, before pausing, here we consider only action on z.
We start by looking at all transfer functions from input u1 -that is, recruitment of Pol II. This section is technical, the takehome messages are highlighted in bold text. Material between horizontal lines may be skipped by non-technical readers.
We inspect the transfer functions with expanded expressions for loops L1 and L1, and the inverted single-variable transfer functions (iGx and so on). The latter all have the form s + ∑k, where the k stands for degradation rate constants. We note that in particular the factor (ko xss -kf) appears everywhere and is negative in the steady state. This factor will always be negative: The first differential equation in the steady state

����[{{�����
can be reformulated: We see that since 1 -y is greater than zero, (ko xss -kf) will have to be negative at the steady state.
Since xss and yss are smaller than one, this means that the factor (ko xss -kf) always contributes to a decrease of the gains.
We create transfer function objects and normalize by steady-state values. Note that the input already is formulated as relative change. We thus obtain transfer functions between relative amplitudes. We include a transfer function TFp1 for the sum of all promoter-associated Pol II.
This means that the PI has the same phase as the Pol II densities (see Objective section). This seems to be the case for all parameter combinations: No single counter-example for alternative parameters could be found. To illustrate, we generate 1000 combinations for the 6 model parameters, in log-space, between 1 and 1000 h -1 . We do this by using the Sobol algorithm to cover parameter space in an even fashion.

�
We find that all ratios are greater than one, this is the smallest one:
We justify this observation with an inspection of the analytical formulae: Keeping in mind that x + y < 1 and z < 2, the following comparison between the transfer functions for Pol II and the variable z may provide more insight into why the gain for the sum of the promoter-associated Pol II variables (x + y + z) is greater. Note that the gain for w is always smaller than that for z, since the step from z to w is linear.
The denominator of the difference of the transfer functions is positive: So if we can show that the numerator is positive, we have proven the numerically obtained result. Now, in the expression for the numerator below, only the term ki (2 ko (-1 + yss) (xss + yss) works "against" the expression being positive. However, since we have no closed expressions for the steady-state concentrations (xss and so on), a formal mathematical proof cannot be obtained using these methods.

���������������������������������������������������������������������������� �����������������������������������������������������������
We obtain the transfer functions from release rate to the different Pol II variables in a similar way as above: First, we note that the denominators are all positive, which follows from the fact that x + y < 1 and z < 2, and that (ko xsskf) is always negative, as discussed above. This allows us to immediately conclude that the transfer function to the observable gene body Pol II (w) is positive (meaning that w has the same phase as the release rate) and that the promoter-associated Pol II variables y and z are negative (meaning that they have the opposite phase to the release rate). However, the transfer function for x is positive, so that this variable will be in phase with the release rate. The observable x+y+z (all promoter-associated Pol II), however, has a negative transfer function and will have a phase opposite to that of the release rate. This is hinted at by the form of the transfer function but again difficult to prove algebraically. Performing an analysis of 1000 random parameter combinations, generated by the Sobol algorithm as above, justifies this:

�����
We see that this is between 85%-90%. This includes the default parameter set (interactive simulation below) and we conclude that increased Pol II recruitment rates generally weaken rhythm propagation from recruitment combined with release.
We obtain the transfer functions from release rate to the different Pol II variables in a similar way as above: This transfer function is positive, so that an increase of ki always increases z and therefore the transcription initiation rate, kd × z.
In particular, the user may convince him or herself that the combination of a lowered k d and v together with increased k i may explain the combined increased transcriptional activity, higher promoter-proximal Pol II levels, and lower PI levels of SCPs as compared to general Type II promoters (which may be taken as the default parameter set). Furthermore, the user may convince him or herself that only increasing the Pol II recruitment rate, k f , lowers the net rhythm propagation (Gain sum).
Rhythmic recruitment of Pol II results in rhythms of the same phase in promoter-associated Pol II, gene body Pol II, and PI. The PI phase is the same, since the rhythms are stronger in promoter-associated Pol II than in gene-body Pol II.
Rhythmic release of Pol II results in rhythms with the same phase in gene body Pol II, but with opposite phase in promoter-associated Pol II.
Combined in-phase rhythms in recruitment and release of Pol II result in the recruitment rhythms amplifying the rhythms in gene body Pol II (w) but weakening the rhythms in promoter-associated Pol II (x+y+z). For stronger recruitment rhythms and moderate release rhythms, this will result in all Pol II variables being in phase with the recruitment rate, but the gene body rhythms will be stronger than those of promoter-associated Pol II. This results in a PI in antiphase to Pol II densities, as observed in particular for transcripts with strong circadian promoters, and as shown in the numerical example plots above.
Finally, we note that the recruitment gains for gene body Pol II, i.e. the strength with which rhythms in recruitment rates propagate to transcriptional activities, can be low. For the default parameter values, it is 0.5. This means that transcriptional activities cannot reach a larger relative amplitude than 0.5. This is not what is seen in measurements. Rhythmic regulation of release into elongation may thus be needed to produce very high amplitudes. Indeed, this is what our main analysis suggests for a significant percentage of strongly circadian transcripts.