Figures
Abstract
Enhancers are stretches of regulatory DNA that bind transcription factors (TFs) and regulate the expression of a target gene. Shadow enhancers are two or more enhancers that regulate the same target gene in space and time and are associated with most animal developmental genes. These multi-enhancer systems can drive more consistent transcription than single enhancer systems. Nevertheless, it remains unclear why shadow enhancer TF binding sites are distributed across multiple enhancers rather than within a single large enhancer. Here, we use a computational approach to study systems with varying numbers of TF binding sites and enhancers. We employ chemical reaction networks with stochastic dynamics to determine the trends in transcriptional noise and fidelity, two key performance objectives of enhancers. This reveals that while additive shadow enhancers do not differ in noise and fidelity from their single enhancer counterparts, sub- and superadditive shadow enhancers have noise and fidelity trade-offs not available to single enhancers. We also use our computational approach to compare the duplication and splitting of a single enhancer as mechanisms for the generation of shadow enhancers and find that the duplication of enhancers can decrease noise and increase fidelity, although at the metabolic cost of increased RNA production. A saturation mechanism for enhancer interactions similarly improves on both of these metrics. Taken together, this work highlights that shadow enhancer systems may exist for several reasons: genetic drift or the tuning of key functions of enhancers, including transcription fidelity, noise and output.
Author summary
During development, cells assume different fates based upon signals, including transcription factor proteins that bind to regions of the DNA called enhancers. Enhancers can interact with promoters to control the transcription of a target gene. Many developmental genes have multiple, seemingly redundant enhancers called shadow enhancers.
When each separate enhancer is bound by distinct transcription factors, shadow enhancers can drive less noisy gene expression than single enhancers. This allows for the buffering of perturbations in the transcription factor inputs. However, under this premise, a single large enhancer bound by distinct transcription factors should also be capable of buffering perturbations. Why then are shadow enhancers so prevalent?
Fletcher et al. developed computational models of enhancer-mediated transcription that vary in the numbers of enhancers and transcription factor binding sites. They analyzed transcriptional properties in systems with and without shadow enhancers. The models revealed that shadow enhancers can provide a wider landscape of possible transcriptional properties. This computational approach enabled a broader exploration of shadow enhancer properties than is feasible experimentally and may guide future experimentation. Given their prevalence in developmental gene regulation, investigation of shadow enhancers may lead to a better understanding on the pathogenicity of certain mutations found in developmental enhancers.
Citation: Fletcher A, Wunderlich Z, Enciso G (2023) Shadow enhancers mediate trade-offs between transcriptional noise and fidelity. PLoS Comput Biol 19(5): e1011071. https://doi.org/10.1371/journal.pcbi.1011071
Editor: Attila Csikász-Nagy, Pázmány Péter Catholic University: Pazmany Peter Katolikus Egyetem, HUNGARY
Received: August 4, 2022; Accepted: April 3, 2023; Published: May 19, 2023
Copyright: © 2023 Fletcher et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All calculations and simulations were done using MATLAB 2016b under GCC C/C++ 4.9 in conjunction with the CERENA toolbox. The code is available at https://github.com/WunderlichLab/TheoreticalEnhancerModels.git.
Funding: This research was partially supported by NSF-Simons Center grant DMS1763272, Simons Foundation grant 594598 to GE and ZW, NSF grant DMS1616233 to GE, NIH grant R01HD095246 to ZW, and the UC President’s Dissertation Year Fellowship to AF. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1 Introduction
Enhancers are non-coding regions of the genome that are bound by transcription factors (TFs) and interact with the promoter to regulate transcription. Developmental genes are frequently expressed in multiple tissues or time points and are regulated by multiple enhancers. Traditionally, these enhancers were perceived as modular, with each driving a distinct portion of a gene’s spatiotemporal expression pattern, and together generating the entirety of a gene’s expression pattern. However, work in Drosophila revealed the presence of “shadow enhancers”—sets of two or more enhancers that control the same gene and drive identical or overlapping expression patterns [1]. Similar enhancer groups have been identified in C. elegans, mice, zebrafish, and humans [2–7]. Here, we aim to use theory and computational models to shed light on the regulatory advantages of shadow enhancers, as well as under what conditions a single enhancer and a set of shadow enhancers are interchangeable.
On the surface, shadow enhancers appear redundant—they drive overlapping expression patterns and can often be knocked out without meaningfully affecting phenotype. However, in multiple loci in both flies and mice, shadow enhancers are essential for driving normal development under conditions of stress [8–11]. For instance, under high temperatures but not ideal temperatures, deletion of a shadow enhancer for the Drosophila gene snail leads to abnormal development of the Drosophila embryo. The earliest work describing “shadow enhancers” designated the enhancer farther away from the promoter as the shadow enhancer and the promoter-proximal enhancer as the “primary enhancer.” For the purposes of this work, we use the more recently formulated definition of shadow enhancers, which refers to the entire group of enhancers regulating the same target gene over space and time [12].
Progress in methods for identifying enhancers genome-wide has revealed the pervasiveness of shadow enhancers in developmental gene loci. A study revealed that a majority of genes involved in Drosophila muscle development are controlled by sets of three or more shadow enhancers, and large-scale data analysis on mouse tissue samples from genomic databases showed widespread shadow enhancer activity [11, 13, 14]. Meanwhile, in humans, assays involving enhancer-derived RNAs suggested that approximately 80% of examined genes are under the regulation of two or more shadow enhancers [15]. Overall, these results and others suggest that most developmental genes in multi-cellular organisms are regulated by sets of shadow enhancers.
In addition to ensuring proper development in stressful conditions, shadow enhancers interact in multiple ways to fine-tune gene expression. For example, shadow enhancers might be assumed to behave additively—the sum of their individual mRNA outputs equals that of their combined output. However some shadow enhancers display subadditive or superadditive interactions, in which the combined activity of both enhancers is either less or more than the sum of their individual contributions, e.g. [16, 17]. Shadow enhancers can also repress one another, creating a composite gene expression pattern that is weaker or more restricted that either enhancer produces alone [18, 19]. Lastly, by binding distinct sets of input TFs, shadow enhancers can collectively buffer temporal noise in TF levels, yielding more consistent gene expression as a function of time [20].
Many of the mechanisms, e.g. synergy and repression, that are observed in shadow enhancers may also be achieved in single enhancers. This observation leads to the core question of this work: which properties are specific to shadow enhancers and not possible in single enhancer regulation? To do so, we focus on two important functions of enhancers. The first is their ability to faithfully translate an upstream signal, like a difference in TF concentration, into downstream expression output. That is, the enhancer should be capable of producing different expression levels in response to different TF concentrations. This feature is defined here as fidelity, and it is needed to generate output expression patterns that determine developmental cell fate in response to upstream signals. The second is their ability to buffer stochastic noise either from random fluctuations in the upstream signals or from internal enhancer dynamics. This feature is needed to buffer the noise that inevitably arises from molecular interactions to drive consistent expression patterns. For this purpose, we construct theoretical models of enhancer systems with different properties and analyze the resulting dynamics. In addition to exploring a larger number of configurations and parameter sets than is experimentally practical, we expect this approach to elucidate the selection pressures that can shape the creation of shadow enhancers and opportunities for transcriptional modulation that appear in the presence of shadow enhancers.
Previous work on theoretical models of shadow enhancers has proven to be fruitful in predicting and understanding the behavior of these systems. For example, a simple model of hunchback and knirps shadow enhancers by Bothma et al. [16] was used to show that frequent interactions among enhancers may lead to competition for promoter access and thus give rise to subadditive behavior. Likewise, a model of hunchback regulation of the even-skipped enhancers by Staller et al. [21] suggested that two different enhancers can recreate the same expression patterns with distinct regulatory logic. Grah et al. [22] studied a model related to the Monod-Wyman-Changeux hemoglobin system [23] in the context of enhancer regulation and considered several performance metrics. The work by Nousiainen et al. presented a computational framework for identifying model families that can predict enhancer activation dynamics in a mechanistic fashion [24]. In this work, we focused our approach on a set of minimalist reaction network models in which each of the reactions is stochastic, and the parameters were derived from previous transcriptional data of Kruppel enhancers [20]. Hence, we expect that this fully stochastic approach may capture nuances in the relationship between TF fluctuations and transcription.
In section 2.1, we describe a reaction network model of the Kruppel gene enhancer system. This is a minimal model developed and validated in Waymack et al. [20] that recapitulates the dynamics of enhancer-mediated transcription. We then describe how to use this model to approximate mean transcriptional output and derive the corresponding noise and fidelity. Section 2.2 describes our approach to generate similar models that differ in their numbers of enhancers and TF binding sites. By simulating these models, we can compare the effects of distinct shadow enhancer systems on transcriptional noise and fidelity. Since shadow enhancers can behave sub- or superadditively, we incorporate this behavior into our models in Section 2.3 and once again perform simulations to determine how the noise and fidelity are affected. In Section 2.4 we consider the concepts of transcriptional synergy and saturation. Finally, in Section 2.5, we use our modeling framework to compare the changes in noise and fidelity resulting from an enhancer duplication and an enhancer splitting. We consider these events to be potential mechanisms for the origin of shadow enhancers and study them in this context.
2 Results
2.1 The Kruppel enhancer model
As a case study for our work, we use Kruppel, a gene required for early embryonic patterning in Drosophila. Around two hours post-fertilization, Kruppel is expressed in a stripe around the middle of the embryo, and this expression pattern is generated by a pair of shadow enhancers (Fig 1A). A minimal version of the Kruppel enhancer system was described in Waymack et al. [20] using the model in Fig 1B. Here, A corresponds to the proximal enhancer closest to the promoter, while B corresponds to the other, distal enhancer. The subscripts for A and B alternate between 0 and 1 to denote whether there is a TF bound to them at a given time. In this model, it is assumed that each enhancer interacts with the promoter immediately after a TF binds to it. Waymack, et al. measured the dynamic expression output of Kruppel’s shadow enhancers individually and together. Using these data, we estimated the model parameters either directly or by using simulated annealing, which systematically simulates the model over several parameter sets until the difference between the model output and experimental data falls within an acceptable tolerance (Table 1.) To recapitulate our previously-published experimental data, we stochastically simulated our model to yield the number of mRNA transcripts over time. Using the parameters in Table 1, simulated mRNA bursts resemble those observed during live transcription in Drosophila embryos (Fig 1C) and properties such as the size and duration of bursts were consistent with those from Kruppel experimental data [20]. Based on this work, the model shown in Fig 1B is used as a description of shadow enhancer dynamics.
(A) A computational representation of the Drosophila embryo showing the region of Kruppel expression [20]. (B) Cartoon depicting a reaction network model of Kruppel shadow enhancers [20]. (C) Sample stochastic traces of mRNA from simulations of the model in (B) and their average over time E[R] which estimates the mean mRNA concentration. The values of E[R] and the standard deviation σR can also be approximated by moment closure techniques and be used to estimate the transcriptional noise and fidelity of the modeled enhancer system.
One disadvantage of using stochastic simulations to estimate mean mRNA transcription is the significant computational cost. To do this more efficiently, we use moment closure methods that create ordinary differential equations (ODEs) describing the first and second moments of the chemical species [25, 26]. The solution to the ODE for the first moment approximates the limit to infinity of the mean mRNA denoted by E[R]. For example, the top right of Fig 1C shows E[R] and the standard deviation σR derived from a finite number of stochastic traces while the plot below shows the moment closure approximation of E[R] and σR. Moment closure methods are popular tools to estimate measures of stochastic chemical reaction networks, but their reliability varies on a case by case basis. See [27, 28] for a comparison of different methods, and [29] for a set of sufficient conditions in which various approximations may hold.
Using the moment closure technique, we can now efficiently estimate the transcriptional fidelity and noise of our modeled enhancer system. To capture the transcriptional fidelity, we used the correlation between the TF values and the corresponding levels of gene expression. The correlation Corr(T1, R) quantifies the degree to which the first TF, T1 and mRNA concentrations rise or fall together. It aims to describe whether a large (or small) T1 input has a tendency to lead to a large (or small) mRNA output concentration. To study the transcriptional noise, we used the coefficient of variation (CV) by normalizing the standard deviation of mRNA expression by its mean. We have that while Notice that both of these quantities can be directly calculated from the first and second moments of the different species, which are in turn estimated using moment closure methods.
2.2 Models with varying number of enhancers and total binding sites
Using the Kruppel model as a starting point, we first wanted to explore if transcriptional fidelity and noise are dependent on TF binding sites being arranged into two different enhancers. Our previous experimental and computational work found that having two enhancers with distinct TF binding sites—T1 in the proximal enhancer and T2 in the distal enhancer—drove lower expression noise than two enhancers with identical TF binding sites. From this work, however, it was not clear whether the TF binding sites had to be split among two enhancers, or whether a single enhancer with sites for both T1 and T2 could achieve the same noise reduction. It also was not clear if there might be a trade-off between noise and fidelity.
To explore these questions, we constructed 48 models with up to four enhancers and four total binding sites, as shown in Fig 2A. Each enhancer has a given number of binding sites for T1 and T2, and every diagram in this figure corresponds to a set of chemical reactions. For an example of the specific networks associated with two of these diagrams, see S1 Fig in the Supplementary Material. We then input each network into the CERENA software (ChEmical REaction Network Analyzer) [26], which allows us to calculate the ODE equations for the mean and other moments of all chemical species in the network. CERENA uses an array of approximation methods that allow for a streamlined implementation of these ODE equations, in particular several methods of moment closure. The use of this software is significantly faster as compared with many repeated runs of the Gillespie algorithm [30]. We calculated ODEs up to the second moments in order to calculate the variance, and we used the method of zero cumulants for moment closure [31]. The results were found to be generally consistent with simulations of individual models carried out with the more standard Gillespie algorithm (Fig 1C, right).
To create the reaction equations for all our configurations of interest, we needed to decide how to assign the model parameters in a way that made the comparisons meaningful. Most of the parameters, including the kon and koff rates, were assigned the same values found to be consistent with experimental data in Waymack et al. [20] and used in Fig 1C. See Table 1 for the values of the different parameters in the model. The rates of transcription for each enhancer state are initially defined using the following additive scheme. Any enhancer with n bound T1 and m bound T2 will produce mRNA at a rate nr1 + mr2 where r1 and r2 are the rates of transcription of the Kruppel model in Fig 1B. Similarly, if two or more shadow enhancers are bound to the promoter, the overall mRNA transcription rate is the sum of the transcription rates for each enhancer. This way, we have an additive scenario where a single enhancer system with four fully bound sites of T1 will produce mRNA at the same rate as four independent enhancers that are each bound by a single T1.
We calculated the transcriptional noise and fidelity for all the models in Fig 2A to discern the effect of different configurations of binding sites and enhancers. The results are shown in Fig 2B, left (S2 Fig shows the results for fidelity with respect to T2). As long as the number of binding sites for T1 and T2 remain constant, the noise does not change when altering the number of enhancers. Moreover, even after fixing the number of enhancers, the actual distribution of TF binding sites among the enhancers appears to have no effect on transcriptional noise or fidelity (Fig 2B, right). The fidelity to T1 is also insensitive to the number of enhancers, but increases for increasing numbers of T1 binding sites, and it decreases when additional T2 binding sites are present (Fig 2B, middle). One can conclude that, in this additive regime, there may be negligible selection pressures on the number of enhancers regulating a gene and the distribution of TF binding sites among these enhancers.
(A) Different enhancer models used in the simulations. Each model has different total binding sites for T1, total binding sites for T2, distribution of binding sites, and number of enhancers. (B) Simulations for the models in (A) show that fidelity and noise are independent of the number of enhancers and the distribution of binding sites. The noise broadly decreases as a function of total TF binding sites, while fidelity with respect to T1 increases with the number of T1 binding sites. The table on the right shows the fidelity and noise values for two different configurations of TF binding sites among two enhancers. (C) Noise calculated as functions of the total binding sites for T1 or T2. As the total number of binding sites increases, the noise generally decreases.
Though the noise does not depend on enhancer number, it does vary with binding site number. The plots in Fig 2C show that enhancers with more binding sites generally lead to lower transcriptional noise. This may be because as the number of binding sites increases, there are more inputs in the system, which can serve to average each other out. Each binding site can be thought of as akin to a coin toss. When n independent coin tosses are made, the expected number of heads grows linearly but its standard deviation grows sublinearly, leading to an overall decrease in the noise.
Not only does transcriptional noise depend on binding site number, but it is also dependent on binding site identity. The production reactions of each of the two transcription factors are independent of each other. Shifting the total number of binding sites τ1 + τ2 shows non-monotonic changes in noise (rightmost plot in Fig 2C). The leftmost data points at τ1 = 0 correspond to enhancers that are bound exclusively by T2 while the rightmost points at τ1 = 4 correspond to enhancers that are bound exclusively by T1. As T1 sites increase, the ratio between T1 and T2 binding sites becomes more even and the noise decreases. However, once enhancers adopt too many T1 sites at the expense of T2 sites, noise begins to increase. This agrees with previous observations that having a single kind of TF binding site leads to higher noise in transcription than systems bound by multiple kinds of TF binding sites [20]. In summary, these models demonstrate that when enhancer output combines in an additive fashion, the fidelity does not depend on the number of enhancers. Noise is sensitive to the number and identity of binding sites, but not to their arrangement among one or more enhancers. Thus, the prevalence of shadow enhancers under additive assumptions might be the result of genome dynamics and genetic drift, as opposed to selection.
To verify the accuracy of the chosen moment closure method used for the simulations in this section and sections below, we carried out a comparison with a higher order zero-cumulant method as well as other methods, see the Supplementary Material (S5 Fig). While there is some variability between methods, the noise and fidelity measurements are broadly preserved.
2.3 Subadditivity and superadditivity
The results in the previous section showed that the number of additive enhancers does not affect transcriptional noise and fidelity. In turn, the number of additive enhancers may be under minimal selection pressure and instead originate through stochastic processes such as genetic drift. However, shadow enhancers have been observed to behave subadditively and superadditively—their combined activity either results in a lesser or greater amount of gene expression than the sum of their independent contributions. To investigate whether different levels of additivity could result in different fidelity and noise properties, we modified our models in the previous section to recapitulate this effect.
To capture varying enhancer additivity in our models, we increased or decreased the binding rates kon and koff to approximate the biochemical mechanisms underlying varying levels of additivity. We chose this approach to implementing sub- or superadditivity because it was consistent with previous experimental data [20]. Using this approach, any enhancer model can be made subadditive by decreasing kon and increasing koff. Similarly, enhancer models can be made superadditive by increasing the values of kon and decreasing the values of koff. In addition, in a later section we will briefly consider the cases where the polymerase loading rates are saturated by a single enhancer or remain null until all enhancers bound to the promoter.
For simplicity, we used a linear relation between the number of enhancers and the binding and unbinding parameters. Specifically, in the subadditive case the “new” binding rate is defined as , and similarly (Figs 3A and 4A). We selected values of d1 and d2 to broadly explore the effect of subadditivity while still allowing the moment closure method to accurately estimate noise and fidelity.
(A) Subadditivity is implemented in our model by linearly decreasing kon rates and linearly increasing koff rates. In this case d1, the rate of decrease for kon, was chosen to be 0.04 for T1 and 0.02 for T2. Meanwhile d2, the rate of increase for koff, was chosen to be 0.75 for both T1 and T2. (B) Systems with more subadditive enhancers tend to exhibit higher noise, while the fidelity is broadly independent of enhancer number. Noise and fidelity are also independent of binding site distribution but vary with respect to the number of binding sites. (C) Plots showing the relationship between binding site numbers and transcriptional noise for two subadditive enhancers.
(A) Superadditivity is implemented in our model by linearly increasing kon rates and linearly decreasing koff rates. In this case d2, the rate of decrease for koff, was chosen to be 0.4 for T1 and 0.3 for T2. Meanwhile d1, the rate of increase for kon, was chosen to be 0.01 for both T1 and T2 (B) Unlike in the subadditive case, enhancer numbers decrease transcriptional fidelity and also decrease noise. The distribution of binding sites does not affect either the noise or the fidelity all else being constant. (C) Plots showing the relationship between binding site numbers and transcriptional noise for two superadditive enhancers. Increasing binding site numbers leads to less noise in gene expression.
Unlike the additive case, when enhancers interact subadditively, their numbers affect the noise (Fig 3B and S2 Fig). The fidelity is however mostly unaffected by the number of enhancers. Similar to the additive case, binding site numbers in subadditive enhancers can affect the fidelity as well as the transcriptional noise. Generally, given a fixed number of enhancers, adding TF binding sites will reduce the noise, while adding T1 binding sites will increase the fidelity. Moreover, subadditive systems with binding sites for both T1 and T2 are less noisy than systems with a single kind of binding site (Fig 3C). The distribution of binding sites among enhancers has no effect on the fidelity or the noise. In sum, the subadditive models are distinct from the additive models in that they show that increasing enhancers numbers increases transcriptional noise while preserving fidelity. However, the relationship between the number and type of binding sites and transcriptional noise are similar between the additive and subadditive cases.
Superadditive shadow enhancers have also been observed, so to explore their properties, we linearly increased kon and decreased koff. Systems with superadditive enhancers show decreasing transcriptional noise as well as decreased fidelity with increasing numbers of enhancers. In this way, superadditive systems present a trade off between noise and fidelity—increasing enhancer numbers improve noise at the expense of fidelity (Fig 4B).
Once again, binding site distributions in superadditive enhancers do not affect the fidelity or the noise. However the total number of binding sites does affect the noise and fidelity as in the previous two cases (Fig 4C and S4 Fig).
2.4 Saturation and synergy
Another way that shadow enhancers can interact is through transcriptional saturation or synergy. In a system with n shadow enhancers, the system presents saturation if the transcription rate r is the same if at least one enhancer is bound to the promoter. On the contrary, the system presents synergy if the rate of transcription is 0 unless all n enhancers are bound to the promoter. We show the results of such systems in Fig 5. For simplicity, we use a single type of transcription factor and we vary the number of enhancers.
A saturated system yields mRNA at the same rate for any positive number of enhancers bound. On the other hand, a synergistic system becomes active only when all enhancers are bound to the promoter. The resulting plots of fidelity and noise corresponding to these systems show inverse relationships between noise and fidelity. In the saturating regime, low noise and high fidelity are achieved with higher enhancer numbers, while in the synergistic regime, low noise and high fidelity occur with lower enhancer numbers.
In the saturation case, increasing the number of enhancers will reduce the noise as well as increase fidelity. Both noise and fidelity change about two-fold as the system increases from one to four enhancers. In the synergy case, the noise increases, and the fidelity decreases, as the number of enhancers increases. A significant noise level of almost 30 was reached for four enhancers. In this sense, the saturation model can be considered the more efficient one in terms of the studied metrics.
2.5 Shadow enhancer duplication
While not explicitly stated, the models in Fig 2A correspond to a collection of single enhancers that split into multiple enhancers. Analyzing our results through this lens would suggest that the splitting of an enhancer into subadditive or superadditive enhancers could occur based on a need to modulate noise, fidelity, or absolute levels of gene expression. For additive enhancers, such a split would not face any adverse selection pressures based on the absence of changes to the mRNA output. However, the mechanisms by which shadow enhancers come to being remain unclear and the splitting of an enhancer might not be the only viable route to create shadow enhancers [12]. Hence, to contrast our results with a different mechanism, we apply our analysis to shadow enhancer systems that arise by enhancer duplication.
First, we generated models that are repeated duplications of single enhancer models. Cartoons depicting these models are shown in Fig 6A. In some cases, where each enhancer had multiple binding sites, we had to limit the number of enhancers simulated due to their large number of reactions and the associated computational costs. Given the experimental observations of subadditivity with enhancer duplication, we focused our analysis on the duplication of subadditive enhancers and compared it with the alternative of splitting subadditive enhancers [17, 20].
(A) Single enhancer models and those that result from repeated enhancer duplications. (B) Plots showing the relationship between total binding sites and enhancers for the case of enhancer splitting and enhancer duplication. Splitting does not affect total binding site numbers while it increases enhancer numbers. On the other hand, duplication increases both enhancer numbers and total binding sites at different rates. (C) Plots depicting transcriptional fidelity and noise for subadditive versions of the models in (A). Enhancer duplications increase transcriptional fidelity while the noise generally decreases.
Unlike splitting, enhancer duplication not only increases the number of enhancers but also the number of binding sites (Fig 6B). For this reason, the number of binding sites for each model in the plots of Fig 6C is shown as a product of the binding sites per enhancer times the number of enhancers. These plots show that the duplication of subadditive enhancers leads to a slight decrease in the noise while increasing transcriptional fidelity. The effect of duplication on transcriptional noise is consistent with the experimental measurements of the Kruppel enhancers [20].
Ultimately, enhancer duplication shows the potential for increasing transcriptional fidelity while simultaneously reducing expression noise, albeit with the additional metabolic cost of increasing RNA output. The duplication of subadditive enhancers presents similar trade-offs from those observed in the superadditive splitting scenario and expands the possibilities of transcriptional modulation from those available for single enhancer systems.
3 Discussion
In this work, we sought to determine the effect of varying the number of shadow enhancers as well as the nature of their interactions, and to understand whether a single enhancer can recapitulate their dynamical behavior. To do so, we simulated models with differing numbers of enhancers and TF binding sites and calculated the transcriptional noise and fidelity for each model. Sufficiently high fidelity is required for a gene’s expression to meaningfully reflect changes in upstream signals intended to shape a cell’s fate. On the other hand, sufficiently low noise is needed for transcription to convey consistent signals of expression in the face of unavoidable molecular fluctuations. Consequently, a balance is needed between these properties in order to properly pattern a developing organism.
We began by considering additive enhancers—enhancers which have a combined transcriptional rate equal to the sum of their individual contributions. Our models revealed that the number of additive enhancers has no effect on transcriptional noise and fidelity.
Therefore, in the case of additive shadow enhancers, there would seem to be no particular pressure to have multiple shadow enhancers, as opposed to one large enhancer. How might we explain the preponderance of shadow enhancers in this case? Here, the dynamics of genome evolution may be at play. If we imagine that a genome contains a single, large enhancer, there are processes, like transposable element insertions [32, 33] or DNA polymerase slippage [34, 35] that may split this enhancer into shadow enhancers. If these shadow enhancers can control gene expression similarly to the ancestral single enhancer, as our model suggests, there would be no particular pressure to remove the intervening sequence. In fact, it may be entropically more favorable to split an enhancer into shadow enhancers that to merge shadow enhancers into a single enhancer, given that many more splitting events could lead to two functional enhancers, while deletion or excision events would have to be a great deal more specific to avoid removing functional enhancer sequence.
Many experimental studies have shown that shadow enhancers can interact in a wide range of manners—sub- and superadditively, as well as repressively [16–19, 36–38]. We therefore modified our models to recapitulate this behavior. The resulting simulations revealed that increasing numbers of subadditive enhancers corresponded to lower levels of transcriptional noise while broadly maintaining the levels of fidelity. On the other hand, increasing numbers of superadditive enhancers led to a trade off between a desirable decrease in noise with an undesirable decrease in fidelity. We also found in the transcriptional saturation model that including more enhancers can lead to a significant increase in fidelity with a decrease in the noise, a potentially strong alternative to improve on both metrics. Notice that increasing the number of enhancers increases the metabolic cost of the system [39]. In general, when shadow enhancers interact in more intricate ways, there is a complex landscape of transcriptional noise, fidelity, and output that selection may act upon to determine the number of enhancers controlling a gene.
Though there are many hypotheses about the mechanisms that drive enhancers to act sub or superadditively, like enhancer competition for the promoter or synergy between recruited TFs and co-factors, we are not yet able to predict how two or more enhancers will interact [16, 40]. Superadditivity and subadditivity may be structural properties that may not be tunable, but instead are the result of existing constraints for each individual system. Thus there are likely many mechanisms by which enhancers become non-additive and other evolutionary trade-offs beyond transcriptional noise and fidelity that elude our simplified models.
We also wished to examine the impacts on noise and fidelity of different mechanisms that could result in the creation of shadow enhancers [12, 41]. In particular, we wanted to contrast the case of an enhancer that splits into multiple enhancers and an enhancer that duplicates itself. To study enhancer duplication, we adapted our models to this scenario and noted that repeated duplications of subadditive enhancers induced lower transcriptional noise alongside increases in transcriptional fidelity. It remains unclear whether splitting or duplication of genomic regions are common mechanisms for the origin of shadow enhancers and whether they presents additional trade-offs with respect to genome size. Our analysis was also limited in that it assumes exact duplications of each enhancer, which is inconsistent with the stochasticity typically involved in genome duplication [42].
Naturally, our approach was also limited by the constraints we imposed on our models. The linear approach to modeling sub- and superadditivity has a limitation, in that given a sufficiently high number of enhancers, the kon or koff rates can eventually become negative. We implemented this approach for simplicity, but it could also be done, e.g. with exponentially rather than linearly decreasing values to prevent this effect. While we use the CERENA package to estimate the moments of these systems, other methods such as the Linear Mapping Approximation [43] are available and could potentially be used in such networks in the future. This method in particular can be used for second order reactions in which a protein binds a gene, and it does not rely on assuming a particular distribution for the stochastic variables in the system. In S5 Fig, we present a comparison of the second order zero-cumulant method used for the moment closure with other methods, showing that our methodology is broadly compatible with other forms of moment closure. We also show calculations of the Fano factor for this system.
Throughout this work, we used parameters that were fitted to experimental data derived from studying the Kruppel enhancer system. This parameter space, however, may not be in accordance with the dynamics of other enhancer systems found in Drosophila and other organisms. To address this, in S6 Fig we carry out a basic random variation of the parameter set over several orders of magnitude, showing that the results in our main figures are broadly consistent in these other simulations. Regarding S6 Fig, notice that despite the general trends, it is still possible to find parameter sets for which the trends do not hold.
Our modeling suggests experiments that could prove useful in further exploring the mechanisms of shadow enhancer function. For instance, one could imagine constructing a synthetic system within an embryo to directly test the impact of varying enhancer numbers and TF binding site distribution on the associated noise and fidelity of the transcriptional output [44, 45]. In addition, a bioinformatic analysis comparing enhancer DNA sequences in different organisms could be used to determine the relative prevalence of enhancer splitting, duplication, and other ways of generating new shadow enhancers.
Overall, this work shows that different strategies for shadow enhancer interaction present a variety of trade-offs. In the additive scenario, increasing the number of enhancers appears to face no transcriptional trade-offs or advantages. In the subadditive scenario, additional enhancers increase noise but might be tolerated if additional mRNA output is needed. On the other hand, superadditive enhancers could supersede a single enhancer decreased noise is needed more than fidelity. Shadow enhancers that combine in a saturating manner can give rise to both increased fidelity and decreased noise with increasing enhancer numbers. We also showed that repeated enhancer splittings lead to distinct outcomes in noise and fidelity from repeated duplications. Ultimately, the preponderance of shadow enhancers may be due to a combination of genetic drift and to the variety of transcriptional modulation strategies possible with multiple, but not single enhancers.
4 Methods
4.1 Description of enhancer models and parameters
The model of Kruppel gene enhancers in Fig 1B is described by the following chemical reaction network.
Here, two enhancers denoted as A and B have a single binding site for their respective TFs T1 and T2. The subscripts for each enhancer reflect how many TFs are bound to them. For example, A0 denotes the enhancer A with no TFs bound while A1 denotes the same enhancer with a single TF bound. Our modeling framework assumes that a single TF bound to the enhancer is sufficient for an enhancer-promoter interaction. Once an enhancer is bound to the promoter, the rates of transcription correspond to a linear combination of the TFs bound to that enhancer. In this case, since A and B possess only one binding site, the transcriptional rates are only the single terms r1 and r2. The TFs T1 and T2 appear in clusters of sizes n1 and n2 and dissipate or degrade linearly at rates β−1 and γ−1. These particular properties of TFs were previously noted to be necessary for recapitulating Kruppel transcriptional data. Lastly, mRNA denoted by R degrades at a constant rate α. Details about fitting the reaction rates to experimental data and further justification for the topology of this model can be found in Waymack et al. [20].
We expanded the reasoning above to construct models for any number of enhancers each with an arbitrary number of binding sites. In particular, take a set of n enhancers given by A(1), A(2), …, A(n). We denote an enhancer where I is a vector composed of 0s or 1s with the kth entry being a 1 if there is a TF bound to the kth binding site of A(i) and 0 otherwise. Denote ej as the vector with 0s at all entries except for a 1 at the jth entry and having the same number of entries as I. Any binding of a TF Tm to the (empty) jth binding site of is described by a reaction of the form Similarly, an unbinding of Tm from is described by the reaction Suppose the enhancer is bound by TFs where the binding sites for an arbitrary TF are located between the entries and of I. Then any enhancer with subscript I will initiate transcription at a rate equal to where Id is the dth entry of I. Lastly, any TF Tm appears in clusters of size nm as described by the reaction and degrades at a linear rate. Two concrete examples of this procedure are shown in S1 Fig for enhancer systems that were included in Fig 2A.
4.2 Estimation of noise and fidelity
To estimate the noise in mRNA expression for all the models in Fig 2A, we approximated the moments for R using zero cumulants closure with a second order truncation. In other words, we assumed that all cumulants of the specified output with order larger than 2 were negligible when calculating the moments. However, unlike the simple example shown above, our models may have several dozen reactions which can make the process of calculating the moments extremely laborious. Fortunately, the CERENA toolbox provides a suite of moment closure methods, including zero cumulants closure, that are conveniently arranged to take reaction networks as inputs [26]. We generated these input files for the 48 models in Fig 2A using a Python script that takes as inputs a number of enhancers and TF binding sites and generates the corresponding model file for CERENA. Then, using the moments that were calculated with CERENA, we derived the correlation and CV for each of the models. These measurements were plotted in Figs 2, 3, 4 and 6.
4.3 Implementation of sub and superadditivity
The implementation of varying additivity levels was done by modulation of the binding rates kon and koff as shown in Figs 3A and 4A. For example, to make the Kruppel model in Fig 2A subadditive, we would proceed as follows. Let and be the rates of binding and unbinding for T1 to and from a single enhancer. Since the model in Fig 2A has two enhancers, we would set in this model equal to and to where d1 and d2 are positive constants. Then, repeat the same procedure for the binding and unbinding rates T2 for another set of d1 and d2 values. This way, the TFs will bind to the enhancers less often leading to an overall decrease in mRNA production. This approach was found to be sufficient for consistently recapitulating the subadditivity of Kruppel enhancers observed in the work by Waymack et al [20]. Superadditivity, on the other hand, can be implemented for the model in Fig 2A in a similar way by having equal to and to . The same procedure can also be applied to the binding rates of T2.
In general, any model can be made subadditive according to the following scheme. Take once again the system with n enhancers given by A(1), A(2), …, A(n) as described above. Denote and be the rates of binding and unbinding for some TF Tm to and from a single enhancer. Then, set in the reactions of the form and in the reactions of the form equal to and respectively for some positive constants and . Lastly, repeat these steps for all values of m. The same reasoning can be applied when designing superadditive systems but the signs in the equations of binding rate modulation need to be flipped, that is, use instead and for modifying the corresponding binding rates. These procedures were implemented for all the models in Fig 2A and the resulting transcriptional noise and fidelity for these modified models was plotted in Figs 3 and 4.
Supporting information
S1 Fig. Enhancer network structure.
Two different models and their corresponding reaction networks.
https://doi.org/10.1371/journal.pcbi.1011071.s001
(EPS)
S2 Fig. Transcriptional fidelities with respect to T2 for enhancers that behave additively, subadditively, and superadditively.
The fidelity trends with respect to T2 for all models in Fig 2A do not differ significantly from those corresponding to the fidelities with respect to T1.
https://doi.org/10.1371/journal.pcbi.1011071.s002
(EPS)
S3 Fig. Fitted rates of mRNA transcription for single and duplicated models suggest that a single enhancer is sufficient to saturate polymerase loading rates.
The polymerase loading rates r1 and r2 were fitted for the 4 models shown above according to the same methodology described in Waymack et al. [20]. Parameter fittings were done directly on the raw mRNA transcriptional data of Kruppel and show minimal differences between the single enhancer models and their duplicated counterparts.
https://doi.org/10.1371/journal.pcbi.1011071.s003
(EPS)
S4 Fig. Increasing the binding site numbers in systems with three enhancers leads to decreases in noise.
Noise trends with respect to binding site numbers for systems with three enhancers that operate subadditively and superadditively. Higher binding site numbers lead to lower transcriptional noise.
https://doi.org/10.1371/journal.pcbi.1011071.s004
(EPS)
S5 Fig.
(A) Comparison of different moment closure methods using the CERENA package. From left to right, we use the zero cumulant methods of second and third order, mean field of second order, low dispersion of second and third order, and derivative matching of second order. (B) Calculation of the Fano factor, defined as variance over mean, for the mRNA. The graphs correspond to the calculations of the coefficient of variation in Figs 2B, 3B and 4B.
https://doi.org/10.1371/journal.pcbi.1011071.s005
(EPS)
S6 Fig. Additional parameter sets.
Four new sets of parameters were randomly chosen to recreate several graphs from previous figures. All parameters but r, kon were randomized logarithmically from 0.1 to 100, while r was randomized from 10 to 1000 and kon was randomized from 0.1 to 10. The original parameter set is shown in red, while new parameter sets are shown in blue, green, black, and purple. The figures expand on calculations presented in (A) Fig 2B for the additive case, (B) Fig 3B for the subadditive case, and (C) 4B for the superadditive case, as well as S2 Fig.
https://doi.org/10.1371/journal.pcbi.1011071.s006
(EPS)
References
- 1. Hong JW, Hendrix DA, and Levine MS. Shadow enhancers as a source of evolutionary novelty. Science 2008; 321:1314–4. pmid:18772429
- 2. Stefanakis N, Carrera I, and Hobert O. Regulatory logic of pan-neuronal gene expression in C. elegans. Neuron 2015; 87:733–50. pmid:26291158
- 3. Torbey P, Thierion E, Collombet S, et al. Cooperation, cis-interactions, versatility and evolutionary plasticity of multiple cis-acting elements underlie krox20 hindbrain regulation. PLOS Genetics 2018;14. Publisher: Public Library of Science:e1007581. pmid:30080860
- 4. Jeong Y, El-Jaick K, Roessler E, Muenke M, and Epstein DJ. A functional screen for sonic hedgehog regulatory elements across a 1 Mb interval identifies long-range ventral forebrain enhancers. Development (Cambridge, England) 2006;133:761–72. pmid:16407397
- 5. Degenhardt KR, Milewski RC, Padmanabhan A, et al. Distinct enhancers at the Pax3 locus can function redundantly to regulate neural tube and neural crest expressions. Developmental Biology 2010;339:519–27. pmid:20045680
- 6. Allan CM, Walker D, and Taylor JM. Evolutionary duplication of a hepatic control region in the human apolipoprotein E gene locus. Identification of a second region that confers high level and liver-specific expression of the human apolipoprotein E gene in transgenic mice. The Journal of Biological Chemistry 1995;270:26278–81. pmid:7592836
- 7. Li Q, Peterson KR, Fang X, and Stamatoyannopoulos G. Locus control regions. Blood 2002;100:3077–86. pmid:12384402
- 8. Perry MW, Boettiger AN, Bothma JP, and Levine M. Shadow enhancers foster robustness of Drosophila gastrulation. Current Biology 2010;20:1562–7. pmid:20797865
- 9. Frankel N, Davis GK, Vargas D, Wang S, Payre F, and Stern DL. Phenotypic robustness conferred by apparently redundant transcriptional enhancers. Nature 2010;466:490–3. pmid:20512118
- 10. Antosova B, Smolikova J, Klimova L, et al. The gene regulatory network of lens induction is wired through meis-dependent shadow enhancers of Pax6. PLoS genetics 2016;12:e1006441. pmid:27918583
- 11. Osterwalder M, Barozzi I, Tissières V, et al. Enhancer redundancy provides phenotypic robustness in mammalian development. Nature 2018;554. Number: 7691 Publisher: Nature Publishing Group:239–43. pmid:29420474
- 12. Kvon EZ, Waymack R, Elabd MG, and Wunderlich Z. Enhancer redundancy in development and disease. Nature Reviews Genetics 2021:1–13.
- 13. Cannav‘o E, Khoueiry P, Garfield DA, et al. Shadow enhancers are pervasive features of developmental regulatory networks. Current Biology: CB 2016;26:38–51.
- 14. Gorkin DU, Barozzi I, Zhao Y, et al. An atlas of dynamic chromatin landscapes in mouse fetal development. Nature 2020;583. Number: 7818 Publisher: Nature Publishing Group:744–51. pmid:32728240
- 15. Andersson R, Gebhard C, Miguel-Escalada I, et al. An atlas of active enhancers across human cell types and tissues. Nature 2014;507. Number: 7493 Publisher: Nature Publishing Group:455–61. pmid:24670763
- 16. Bothma JP, Garcia HG, Ng S, Perry MW, Gregor T, and Levine M. Enhancer additivity and non-additivity are determined by enhancer strength in the Drosophila embryo. eLife 2015;4:e07956. pmid:26267217
- 17. Scholes C, Biette KM, Harden TT, and DePace AH. Signal integration by shadow enhancers and enhancer duplications varies across the Drosophila embryo. Cell Reports 2019;26:2407–2418.e5. pmid:30811990
- 18. El-Sherif E and Levine M. Shadow enhancers mediate dynamic shifts of gap gene expression in the Drosophila embryo. Current Biology: CB 2016;26:1164–9. pmid:27112292
- 19. Dunipace L, Akos Z, and Stathopoulos A. Coacting enhancers can have complementary functions within gene regulatory networks and promote canalization. PLOS Genetics 2019;15. Publisher: Public Library of Science:e1008525. pmid:31830033
- 20. Waymack R, Fletcher A, Enciso G, and Wunderlich Z. Shadow enhancers can suppress input transcription factor noise through distinct regulatory logic. eLife 2020;9:e59351. pmid:32804082
- 21. Staller MV, Vincent BJ, Bragdon MDJ, et al. Shadow enhancers enable Hunchback bifunctionality in the Drosophila embryo. Proceedings of the National Academy of Sciences 2015;112. Publisher: Proceedings of the National Academy of Sciences:785–90. pmid:25564665
- 22. Grah R, Zoller B, and Tkacik G. Nonequilibrium models of optimal enhancer function. Proceedings of the National Academy of Sciences 2020;117. Publisher: Proceedings of the National Academy of Sciences:31614–22. pmid:33268497
- 23. Changeux JP. Allostery and the Monod-Wyman-Changeux model after 50 years. Annual Review of Biophysics 2012;41:103–33. pmid:22224598
- 24.
Nousiainen K, Intosalmi J, and Lahdesmaki H. A mathematical model for enhancer activation kinetics during cell differentiation. In: Algorithms for Computational Biology. Ed. by Holmes I, Martín-Vide C, and Vega-Rodríguez MA. Lecture Notes in Computer Science. Cham: Springer International Publishing, 2019:191–202.
- 25.
Hespanha J. Moment closure for biochemical networks. In: 2008 3rd International Symposium on Communications, Control and Signal Processing. 2008 3rd International Symposium on Communications, Control and Signal Processing (ISCCSP). St. Julian’s, Malta: IEEE, 2008:142–7. url: http://ieeexplore.ieee.org/document/4537208/
- 26. Kazeroonian A, Frőhlich F, Raue A, Theis FJ, and Hasenauer J. CERENA: ChEmical REaction Network Analyzer—A toolbox for the simulation and analysis of stochastic chemical kinetics. PLOS ONE 2016;11. Ed. by Salahub D:e0146732. pmid:26807911
- 27. Grima R. A study of the accuracy of moment-closure approximations for stochastic chemical kinetics. The Journal of Chemical Physics 2012;136. Publisher: American Institute of Physics:154105. pmid:22519313
- 28. Schnoerr D, Sanguinetti G, and Grima R. Comparison of different moment closure approximations for stochastic chemical kinetics. The Journal of Chemical Physics 2015;143. Publisher: American Institute of Physics:185101. pmid:26567686
- 29. Schnoerr D, Sanguinetti G, and Grima R. Validity conditions for moment closure approximations in stochastic chemical kinetics. The Journal of Chemical Physics 2014;141. Publisher: American Institute of Physics:084103. pmid:25173001
- 30. Gillespie DT. Stochastic simulation of chemical kinetics. Annual Review of Physical Chemistry 2007;58:35–55. pmid:17037977
- 31.
Singh A and Hespanha JP. Approximate moment dynamics for chemically reacting systems. IEEE Transactions on Automatic Control 2011;56. Conference Name: IEEE Transactions on Automatic Control:414–8.
- 32. Feschotte C. Transposable elements and the evolution of regulatory networks. Nature Reviews Genetics 2008;9. Number: 5 Publisher: Nature Publishing Group:397–405. pmid:18368054
- 33. Jurka J. Conserved eukaryotic transposable elements and the evolution of gene regulation. Cellular and Molecular Life Sciences 2008;65:201–4. pmid:18030428
- 34. Enrique Viguera, Danielle Canceill, and Ehrlich S. Replication slippage involves DNA polymerase pausing and dissociation. The EMBO Journal 2001;20. Publisher: John Wiley & Sons, Ltd:2587–95.
- 35. Canceill D, Viguera E, and Ehrlich SD. Replication slippage of different DNA polymerases is inversely related to their strand displacement efficiency. Journal of Biological Chemistry 1999;274. Publisher: Elsevier:27481–90. pmid:10488082
- 36. Lam DD, Souza FSJd, Nasif S, et al. Partially redundant enhancers cooperatively maintain mammalian Pomc expression above a critical functional threshold. PLOS Genetics 2015;11. Publisher: Public Library of Science:e1004935. pmid:25671638
- 37. Dunipace L, Ozdemir A, and Stathopoulos A. Complex interactions between cis-regulatory modules in native conformation are critical for Drosophila snail expression. Development 2011;138:4075–84. pmid:21813571
- 38. Yan J, Anderson C, Viets K, et al. Regulatory logic driving stable levels of defective proventriculus expression during terminal photoreceptor specification in flies. Development (Cambridge, England) 2017;144:844–55. pmid:28126841
- 39. Lynch M and Marinov GK. The bioenergetic costs of a gene. Proceedings of the National Academy of Sciences of the United States of America 2015;112:15690–5. pmid:26575626
- 40. Spitz F and Furlong EEM. Transcription factors: from enhancer binding to developmental control. Nature Reviews Genetics 2012;13. Number: 9 Publisher: Nature Publishing Group:613–26. pmid:22868264
- 41. Rebeiz M and Tsiantis M. Enhancer evolution and the origins of morphological novelty. Current Opinion in Genetics & Development. Developmental mechanisms, patterning and evolution 2017;45:115–23. pmid:28527813
- 42. Farnoud F, Schwartz M, and Bruck J. Estimation of duplication history under a stochastic model for tandem repeats. BMC Bioinformatics 2019;20:64. pmid:30727948
- 43. Cao Z and Grima R. Linear mapping approximation of gene regulatory networks with stochastic dynamics. Nature Communications 2018;9. Number: 1 Publisher: Nature Publishing Group:3305. pmid:30120244
- 44. Crocker J and Ilsley GR. Using synthetic biology to study gene regulatory evolution. Current Opinion in Genetics & Development. Evolutionary genetics 2017;47:91–101.
- 45. McNamara HM, Ramm B, and Toettcher JE. Synthetic developmental biology: New tools to deconstruct and rebuild developmental systems. Seminars in Cell & Developmental Biology 2022.