Epigenetic cell memory: The gene’s inner chromatin modification circuit

Epigenetic cell memory allows distinct gene expression patterns to persist in different cell types despite a common genotype. Although different patterns can be maintained by the concerted action of transcription factors (TFs), it was proposed that long-term persistence hinges on chromatin state. Here, we study how the dynamics of chromatin state affect memory, and focus on a biologically motivated circuit motif, among histones and DNA modifications, that mediates the action of TFs on gene expression. Memory arises from time-scale separation among three circuit’s constituent processes: basal erasure, auto and cross-catalysis, and recruited erasure of modifications. When the two latter processes are sufficiently faster than the former, the circuit exhibits bistability and hysteresis, allowing active and repressed gene states to coexist and persist after TF stimulus removal. The duration of memory is stochastic with a mean value that increases as time-scale separation increases, but more so for the repressed state. This asymmetry stems from the cross-catalysis between repressive histone modifications and DNA methylation and is enhanced by the relatively slower decay rate of the latter. Nevertheless, TF-mediated positive autoregulation can rebalance this asymmetry and even confers robustness of active states to repressive stimuli. More generally, by wiring positively autoregulated chromatin modification circuits under time scale separation, long-term distinct gene expression patterns arise, which are also robust to failure in the regulatory links.

Here, we let u A =ũ AX withũ A a constant defined in Eq (84) in S1 File.

Qualitative understanding of the impact of the positive autoregulation
If we multiply both sides of the ODEs in (1) by D tot (k A M D tot ), system (1) can be rewritten in a dimensional way: in which all the parameters are defined as done for Eqs (79) in S1 File. In particular, k A W =k A W X. Now, in order to gain a qualitative understanding of the impact of self-activation on the stability of the system, let us approximate first the reactions involving X are fast and then we can set the protein dynamics to the QSS (i.e, X = p x D A with p x = α x /γ x ),the ODEs (2) can be re-written as follows: Comparing these ODEs with the ones related to the chromatin modification circuit alone in Eqs (79) in S1 File, it is possible to see that the introduction of the positive autoregulation leads to an increase of the auto-catalysis rate constant of the activating chromatin marks (that is, in the last ODE, before it was k A M , while now it is (k A W + k A M )). Based on the bifurcation plots realized for the chromatin modification circuit (Fig K in S1 File), we know that a higher k A M can restore the stability of the active state.

Derivation of the stationary probability distribution formula for the positively autoregulated gene
Following the same procedure used to obtain the one-dimensional reduced model for the chromatin dynamics circuit S1 File: reactions (158) (Section 3.2 in S1 File), we obtain the following chemical reaction system: (4) and k A W =k A W p x D A . Now, in order to derive the formula for the stationary distribution, we introduce x = n R 12 . Then, since the reactions (4) has the same form of the ones related to the chromatin modification system (S1 File: reactions (158)), in which k A W =k A W p x D A , we can use the same formula for π ϵ≪1 (x) given by Eq (169) in S1 File, in which we substitute u A with u A =k Dtot . Then, the stationary probability distribution under the condition ϵ ≪ 1 can be written as follows: in which x = n R , u tot = u A 0 +ū R 12 ,ū R 12 = u R 1 + u R 10 + u R 2 + u R 20 andK i defined in Eq (161) in S1 File.

Derivation of time to memory loss formula for the positively autoregulated gene
Now, let us derive the formula for the time to memory loss of the active gene state, τ Dtot 0 . In particular, for the Markov chain related to the S1 File: reactions (158) we obtain the formula Eq (175) in S1 File for τ Dtot 0 . Now, since our current reactions (4) differ from those in S1 File: reactions (158) by k A W =k A W p x (D tot − D R 12 ), we can directly use the formula for τ Dtot 0 previously computed in Eq (175) in S1 File and substitute . We then obtain that τ Dtot 0 can be defined as follows: in which α i and γ i are defined as Then, assuming that ϵ ′ ̸ = 0, it is possible to notice that, for ϵ ≪ 1, the dominant term of τ Dtot 0 is the first addend in (7).
Then, by normalizing the time to memory loss with respect in the regime ϵ ≪ 1 can be re-written as follows: parameters for the two chromatin modification circuits, we let X and Z denote the genes' products, and indicate the species within each of the corresponding chromatin modification circuits by "X" and "Z" superscripts, respectively. Furthermore, we define X := n X /D tot , Z := n Z /D tot . Thus, the ODE model in terms of non-dimensional variables and non-dimensional parameters can be written as follows: . Based on the expression for k 1,ℓ W , k 2,ℓ W and k A,ℓ W given in Eqs (10), we approximate u A,ℓ =ũ A ℓ and, for i ∈ {1, 2}, u R,ℓ i =ũ R i j, with ℓ, j = X, Z and ℓ ̸ = j (Section (2.1), Eqs (11),(13)).

Deterministic analysis
For this deterministic analysis, we exploit the results obtained for the positive autoregulation circuit viewed as an input/output system. In fact, the block diagram in Fig 6B makes it explicit that the mutual repression circuit is the input/output composition of two positively autoregulated genes, in which the output of one gene, n X or n Z , is used as an input to the other gene by increasing k R,Z W or k R,X W , respectively. Defining p x =ᾱ x /γ x and p z =ᾱ z /γ z , at steady stateX = p xD A,X and Z = p zD A,Z . Then for a state of system (14) to be a steady state, the (D A,X ,D A,Z ) pair must lie on the intersection of the input/output steady state characteristics of the X gene and of the Z gene as shown in Fig F. In particular, we assumeũ R 1 =ũ R 2 =ũ R and we consider, for the X gene, the (u R,X ,D A,X ) input/output steady state characteristic, with u R,X = u R,X 1 = u R,X 2 =ũ RZ , that, at steady state, can be written as u R,X =ũ R p zD A,Z . Similarly, for the Z gene, we consider the (u R,Z ,D A,Z ) input/output steady state characteristic, with u R,Z = u R,Z 1 = u R,Z 2 =ũ RX , that, at steady state, can be written as u R,Z =ũ R p xD A,X . It is also possible to show that for each of the intersections shown in Fig F, there is a unique combination of positive values for the remaining variables such that the system is at steady state. We determined stability of these steady states numerically by evaluating the eigenvalues of the Jacobian of the system.
In summary, for low p values, the system has a unique stable steady state about the origin. By increasing p, the system acquires two stable steady states, in which one gene is "on" and the other is "off". That is, one equilibrium withD A,X ≫D A,Z and the other equilibrium withD A,X ≪D A,Z . When ϵ is large, also the steady state about the origin is stable, while when ϵ is small, it is not. Thus, when p is large (high expression rate) and ϵ is small (small basal erasure rate), the system is tri-stable. When µ ′ is increased, the system can have four co-existing stable steady states (Fig F).

Analytical analysis
For this analysis, as we did for the previous system analyzed, we consider the parameter regime ϵ ′ ≪ 1 and we assume that the reactions involving X and Z are fast compared to the other reactions (that is, considering ϵ = cϵ ′ , let us assumeγ x ,γ z ≫ ϵ ′ , ϵ ′ µ, ϵ ′ µ ′ ) so that we can set the protein dynamics to the QSS (X = p x D A,X and Z = p z D A,Z ). Then, by considering the one-dimensional approximation of the chromatin dynamics circuit S1 File: reactions (158), we obtain the following chemical reaction system: with for i, j = X, Z and i ̸ = j. The reduced system (16) can be represented by a two-dimensional Markov chain in which the state is defined as s = (v, w), with v = n R,X 12 and w = n R,Z 12 . In particular, v and w can vary between zero and D tot . Furthermore, assuming p x = p z (i.e.,the production and degradation rate constants are the same for protein X and Z) and defining p = p , the transition rates between the states can be written as follows: The total number of states characterizing the Markov chain is (D tot + 1) 2 . Furthermore, let us introduce the infinitesimal generator of the Markov chain Q, that is a matrix whose (r, l) th entry, for 1 ≤ r ̸ = l ≤ (D tot + 1) 2 is the transition rate of going to the state l, starting from the state r and the (r, r) th entry is the opposite of the rate of leaving the state r [1]. Then, introducing π = [π(0, 0), π(0, 1), π(0, 2), ..., π(D tot , D tot − 1), π(D tot , D tot )], the stationary distribution can be evaluated by solving the system of equations given by For example, the generic r th equation of the system (19) associated with the state r = (v, w) is written as follows: Now, let us write explicitly theq (v,w) for the four extremal states (0, 0), (0, D tot ), (D tot , 0)(D tot , D tot ): (21) Then, if we assume ϵ ′ ̸ = 0 and ϵ ≪ 1, looking at the expressions in (21), it is possible to notice thatq (0,0) ,q (0,Dtot) ,q (Dtot,0) andq (Dtot,Dtot) are very small (q (0,0) ,q (0,Dtot) ,q (Dtot,0) ,q (Dtot,Dtot) ≈ 0) and then, by solving the system of equation in (19), we obtain that π(s) ≈ 0 except for s = (0, 0), (0, D tot ), (D tot , 0), (D tot , D tot ). Since π(s) = 1, we can conclude that, under the condition ϵ, π(0, 0) + π(0, D tot ) + π(D tot , 0) + π(D tot , D tot ) ≈ 1. This implies that, for a sufficiently small ϵ, the peaks of the distribution are concentrated in the four extremal states and the probability of finding the state outside of these states approaches zero. Now, let us determine the effect of p on the distribution: in particular, if we assume a sufficiently high p (p : 1 p ≪ ϵ ≪ 1), it is possible to notice by comparing the expressions in (21) thatq (0,Dtot) ,q (Dtot,0) ≪q (0,0) ,q (Dtot,Dtot) .Then, we can assume thatq (0,Dtot) ,q (Dtot,0) ≈ 0. Now, by solving (19) and using the fact that (DtotDtot+1) 2 s=1 π(s) = 1, we obtain that π(s) ≈ 0 except for s = (0, D tot ), (D tot , 0) and then π(0, D tot ) + π(D tot , 0) ≈ 1. This implies that, for a sufficiently high p, the peaks of the distribution are concentrated only in correspondence of the two states (0, D tot ) and (D tot , 0).

Figures
For all the plots, solid lines represent stable steady states, dotted lines represent unstable steady states and the black circle represents the bifurcation point. In this case we have a saddle-node bifurcation. Furthermore, we set α =ᾱ = α ′ = 1 andγ x = 1. (A) On the y axis we haveD A (green) andD R =D R 1 +D R 2 +D R 12 (red) and on the x axis we have µ ′ (log scale). We realize several bifurcation plots for different values of ϵ (ϵ = 0.1, 1, 10), different values of p x (p x = 0, 1, 10) and set ϵ ′ = 1 and µ = 1. (B) On the y axis we haveD A (green) andD R =D R 1 +D R 2 +D R

12
(red) and on the x axis we have µ (log scale). We realize several bifurcation plots for different values of ϵ (ϵ = 0.1, 1, 10), different values of p x (p x = 0, 1, 10) and set ϵ ′ = 1 and µ ′ = 1.  Table B with the SSA and we indicate with n R the total number of nucleosomes characterized by repressive chromatin modifications, that is n R = n R 1 + n R 2 + n R 12 . We consider two different cases, ϵ ′ ≪ 1 and ϵ ′ ≫ 1, and for each case we determine how varying µ ′ and p x affect the stationary probability distribution of the system. The parameter values of each regime are listed in Table B. In particular, we set ϵ = 0.12, µ = 1 and we consider two values of ϵ ′ (ϵ ′ = 10, 0.2), three values of p x (p x = 0, 0.1, 10) and two values of µ  Table I with the SSA and we indicate with n R the total number of nucleosomes characterized by repressive chromatin modifications, that is n R = n R 1 + n R 2 + n R 12 . We consider two different cases: in the first one (left side) we set µ = 1 and vary µ ′ (i.e.,µ ′ = 1, 0.5) and in the second one (right side) we set µ ′ = 1 and vary µ (i.e.,µ = 1, 0.5). The parameter values of each regime are listed in Table I. In particular, for both cases we set ϵ = 0.12, ϵ ′ = 1 and we consider three values of p x (p x = 0, 0.1, 10).  =ũ RX , that, at steady state, can be written as u R,Z =ũ R p xD A,X and the black line represents the (u R,X ,D A,X ) input/output steady state characteristic of the X gene, with u R,X = u R,X 1 = u R,X 2 =ũ RZ , that, at steady state, can be written as u R,X =ũ R p zD A,Z . We consider four values of ϵ (ϵ = 100, 10, 1, 0.1), three values of µ ′ (µ ′ = 10, 1, 0.1), three values of p (p = 10, 1, 0.1), we set u A 0 = u R 10 = u R 20 = u 0 = 0.1 and all the other parameters are set equal to 1.  Tables K-L with the SSA and we indicate with n A,ℓ with ℓ = X, Z the total number of nucleosomes in each gene characterized by activating chromatin modifications. In particular, we consider p x = p z (i.e.,the production and degradation rate constants are the same for protein X and Z) and we define p as p = p X = p z . Furthermore, we consider three different cases, ϵ ′ ≪ 1 and ϵ ′ = 1 and ϵ ′ ≫ 1, and for each case we determine how decreasing ϵ and increasing p affect the stationary probability distribution of the system. The parameter values of each regime are listed in Tables K-L. In particular, we set µ = 1, µ ′ = 0.6 and we consider three   Tables K-L with the SSA and we indicate with n A,ℓ with ℓ = X, Z the total number of nucleosomes in each gene characterized by activating chromatin modifications. In particular, we consider p x = p z (i.e.,the production and degradation rate constants are the same for protein X and Z) and we define p as p = p x = p z . Furthermore, we consider two different cases, µ ′ = 0.

Tables
It is important to point out that D tot represents the total number of nucleosomes in a gene. Since we can assume about one nucleosome per 200 pb [2](Chapter 4) and we can assume that an average gene spans 10,000-20,000 bp [3], D tot can be considered on average between 50 and 100. In particular, in our computational analysis we consider D tot = 50.
Param. Value

C
a Ω 10 10    Fig 6C and Fig  I, with i, j = X, Z and i ̸ = j.    Fig 6C and in Fig I, with i, j = X, Z and i ̸ = j.