Theory on the Dynamics of Feedforward Loops in the Transcription Factor Networks

Feedforward loops (FFLs) consist of three genes which code for three different transcription factors A, B and C where B regulates C and A regulates both B and C. We develop a detailed model to describe the dynamical behavior of various types of coherent and incoherent FFLs in the transcription factor networks. We consider the deterministic and stochastic dynamics of both promoter-states and synthesis and degradation of mRNAs of various genes associated with FFL motifs. Detailed analysis shows that the response times of FFLs strongly dependent on the ratios (wh = γpc/γph where h = a, b, c corresponding to genes A, B and C) between the lifetimes of mRNAs (1/γmh) of genes A, B and C and the protein of C (1/γpc). Under strong binding conditions we can categorize all the possible types of FFLs into groups I, II and III based on the dependence of the response times of FFLs on wh. Group I that includes C1 and I1 type FFLs seem to be less sensitive to the changes in wh. The coherent C1 type seems to be more robust against changes in other system parameters. We argue that this could be one of the reasons for the abundant nature of C1 type coherent FFLs.


Introduction
Transcription factors (TFs) regulate the quantitative levels of many proteins inside a cell [1][2][3][4]. TF networks consist of several fundamental building blocks such as auto regulatory loops, flipflops, feedback loops, single input modules, cascades, feed-forward loops (FFL) and dense overlapping regulons [5][6][7]. Positive auto regulatory loops play critical roles in the maintenance of cellular memory [3] and reprogramming whereas a negative auto regulatory loop seems to speed up the response time against an external stimulus [8][9]. The response time of a gene/motif is the amount of time that is required to achieve half of the steady-state concentration of resultant protein product which is often referred to as rise-time [3][4].
Feedforward loops consist of three different genes namely A, B and C which code for three different TFs. Here the protein of gene A regulates the transcription of both B and C whereas both the proteins of genes A and B regulate the transcription of C (Fig. 1). As shown in Figure 1, totally there are three such regulatory connections in a FFL network motif and eight such regulatory combinations viz. (PPP, PNP, NNN, NPP, PNN, PPP, NNP, and NPN). The first one in a combination ''FGH'' denotes the type of regulation of the transcription of gene B by the protein of gene A, the second one stands for the type of regulation of C by B and the third one denotes the type of regulation of C by TF protein A.
Here the type of regulation can be either positive or negative where P denotes positive and N denotes negative. Here PPP is classified as a coherent type FFL whereas PNP is classified as incoherent type. A FFL motif is said to be a coherent type if the direct effect of the general transcription factor (A) on the effector operons (C) has the same sign (negative or positive) as its net indirect effect through the specific transcription factor (B). Here PPP, NPN, PNN and NNP (termed as C1, C2, C3 and C4) are coherent types and PNP, NNN, PPN and NPP (termed as I1, I2, I3 and I4) are incoherent types [10][11]. Here A is the general transcription factor that directly regulates the effector operon of gene C and also indirectly regulates gene C through B. For example, in PNN coherent type, gene A positively regulates B which in turn negatively regulates C and therefore the net effect of regulation of C by TF gene A indirectly through B is negative. Since A directly regulates C via negative mode, PNN is called as a coherent type.
FFLs perform several important cellular tasks in various biological systems [11][12][13][14][15]. It has been shown that PPP type FFL can act as a sign sensitive delay [14]. For example when A and B regulate C via an AND type logic, then PPP type FFL shows a delay in the expression of gene C following induction of A by an external signal, but no delay following deactivation of A [14]. It seems that coherent types constitute ,85% of the naturally occurring FFL motifs. Magnan and Alon [10] have comprehensively studied the dynamical and kinetic behavior of various types of FFLs under the condition that the promoters of both genes A and B were triggered by external stimuli/signals. Upon analyzing literature-based databases of experimentally verified direct transcription interactions for E. coli [6] and S. cerevisiae [7], they have discovered that PPP (C1) and PNP (I1) type FFLs are more abundant in nature than others [10]. Although all the FFLs are biologically feasible [10], it is still not clear why these two types were preferably selected by nature against other types in these organisms.
Most of the earlier theoretical and experimental studies on FFLs assumed a quasi-equilibrium condition for the binding-unbinding dynamics of regulatory TF proteins at various promoters [10] and a steady-state condition for the dynamics of synthesis and degradation of mRNAs. These assumptions are valid [9] only when the timescales associated with the synthesis and degradation of TF proteins are much slower (several orders of magnitude) than the timescales associated with the binding-unbinding of regulatory TFs at the respective promoters and synthesis and degradation of mRNAs. Recently the role of mRNA stability in tuning the kinetics of gene induction has been studied in detail by Elkon et.al [16]. It seems that the rapidity of induction negatively correlates with the stability of mRNAs. Further, the dynamics of mRNAs can be approximated to be in a steady state only when the ratio w = c p / c m ( = lifetime of mRNA/lifetime of protein) is closer to zero which is not true for most of the protein coding genes [9]. Here c p and c m are the decay rate constants associated with the respective protein and mRNA. In prokaryotic systems w , 0.1 and in eukaryotic systems such as yeast w seems to vary [17][18][19] approximately from 0.1 to 1 with a median of ,0.3. The response times associated with various TFs in a given network are strongly dependent on w of the respective genes and the rate of protein production will be in turn dependent on the response time [9]. TF regulatory networks whose response times are close to or lesser than the generation time of the cell and also less sensitive to the variation in the values of w corresponding to the component genes are more desirable. In this paper using a combination of theoretical and simulation tools we will (a) formulate a detailed model of various types of FFLs that includes the binding-unbinding dynamics of regulatory TFs at various promoters and synthesis and degradation of mRNAs, (b) investigate the effect of variation in w and other system parameters on the response times and overall dynamics of different type of FFLs using the detailed model and (c) explain why some of the FFLs are more abundant in nature than others.

Theoretical Formulation
Feedforward loops consist of three genes coding for transcription factors (TF) A, B and C (Fig. 1). The corresponding Figure 1. Various types of feedforward loops (FFLs) considerd. FFLs consist of three genes which code for three different transcription factors A, B and C where B regulates C and A regulates both B and C. There are three regulatory connections in FFLs. Since each of these regulatory connections can be either positive or negative, totally there are eight different FFLs. We use three letter codes as ''FGH'' where 'F' denotes the type of regulation of B by A and 'G' denotes the type of regulation of C by B and 'H' denotes the type of regulation of C by A. Here PPP, NPN, PNN and NNP (termed as C1, C2, C3 and C4) are coherent type and PNP, NNN, PPN and NPP (termed as I1, I2, I3 and I4) are incoherent type. When dimer of A and B regulate C through AND-logic, there are four possible FFLs as P-P, P-N, N-N and N-P. doi:10.1371/journal.pone.0041027.g001 concentrations of mRNAs are (m a , m b and m c ) and the concentrations of the protein products are p a , p b and p c all are measured in mol/lit (M). Corresponding steady-state concentrations are denoted as (m as , m bs , m cs , p as , p bs and p cs ). In the absence of any regulation or when the promoters are turned-on completely, these steady state values will be (m hs = k mh /c mh , p hs = k mh k ph /c mh c ph where subscripts h = a, b, c denote the TF genes A, B and C respectively).
Here the transcription rate of TF gene 'h' is k mh (Ms 21 ) and the respective translation rate is k ph (s 21 ). The decay rate constant associated with the mRNA of gene 'h' is c mh (s 21 ) and the decay rate constant corresponding to the protein product is c ph (s 21 ). Here subscripts h, g = a, b, c respectively denote gene A, B and C. The gene associated with the transcription factor A is controlled/ triggered by external signal which may be an arbitrary time dependent pulse function (we denote this as x t ð Þ) or an exponentially decaying one. There is a cis-acting element associated with the promoter of gene B where TF protein of A can bind and hence up/down regulate the expression of B via distal action that is mediated by either tracking or looping modes [20]. There are cis-acting elements associated with promoter of C where the protein products of both genes A and B or the dimer of A-B can bind and hence can up/down regulate C respectively in a ''OR'' or ''AND'' logic mode. In an A-OR-B mode the presence of either protein A or B is enough to up/down regulate the promoter of gene C. The protein products of TF gene A and B can also up/down regulate C in a ''AND'' type logic when the dimer of protein products A-B binds with the promoter of gene C. In this case the presence of both A and B is essential for up/down regulation of gene C. There are eight numbers of regulatory combinations with A-OR-B type logic and four different combinations are possible with A-AND-B type logic (Fig. 1). The fraction occupancy of promoter of TF gene 'h' by the respective regulatory TF protein 'g' is denoted as X hg [ (0, 1) which is the ratio x hg /d hz where x hg is the cellular concentration of the promoter of gene 'h' that is bound with the protein of TF gene 'g' and d hz is the total concentration of the promoter of gene 'h' inside the cellular volume. There are at least two types of inter-molecular interactions [22] involved in the binding of TF proteins at the corresponding cis-regulatory sequences namely (a) a weak nonspecific electrostatic interactions between the negatively charged backbone of DNA and the positively charged side chains of the aminoacids which are present at the DNA binding domains of TF proteins and (b) the specific hydrogen bonding interactions at the site-specifically bound DNA protein interface. The strength of electrostatic interactions will be modulated by the presence of water molecules at the DNA protein interface. In such well hydrated conditions at the DNA-protein interface, the net electrostatic interactions can be either attractive or repulsive owing to the presence of multiple electrical double layers around each charged group [22]. The free energy barrier associated with the fluctuating dynamics of TF proteins within this electrostatic field is comparable with that of the thermal free energy that in turn helps the TF proteins to freely slide along the DNA within this electrostatic capturing domain without physical dissociation [23]. Dissociation or unbinding (X hg = 0) of TF protein from the cisacting site happens when all the specific hydrogen bonds are broken and the TF protein completely escapes from this electrostatic force field or capturing domain. When the TF protein is well within the electrostatic field then depending on the net inter-molecular interactions at the DNA-protein interface we find that X hg [ (0, 1). This is reasonable since there may be a situation where the site-specific hydrogen bonding network present at the interface of cis-regulatory DNA sequence and TF is broken due to thermal induced fluctuations but the TF protein is still present there within the electrostatic force field (partially bound condition). This partially bound promoter state is common in eukaryotic systems where a DNA loop connects the cis-acting module and promoter and holds the transcription initiation components so that they are nearby each other in three dimensional space within the electrostatic capturing domain. Further the binding-unbinding dynamics of TF protein 'g' at the promoter of TF gene 'h' will be observed as a continuous process in the timescale of the synthesis and decay of the corresponding mRNA and protein of TF gene 'h'. This means that at the timescales of synthesis and decay of mRNAs and proteins, the thermally driven local fluctuations in the occupancy of the promoters by the TF proteins well within the electrostatic capturing domain will be averaged out. Under such conditions the averaged promoter state occupancy will be equal to the thermodynamic probability of finding the promoter to be occupied by the regulatory TF protein. Upon considering all these facts, one can conclude that it is appropriate to use a continuous type probability variable (such as X hg ) to denote the promoter state occupancy to account for the promoter of gene 'h' that is partially bound with the TF protein 'g' rather than a discrete Boolean type variable as described earlier [9,[17][18]. The bimolecular collision rates associated with the binding of TF protein 'g' with the promoter of gene 'h' is denoted as k fgh (M 21 s 21 ) and the corresponding off-rates of these bimolecular site-specific DNAprotein complexes are denoted as k rgh (s 21 ). K gh = k rgh /k fgh (M) is the overall dissociation constant corresponding to the site-specific binding of TFs at their respective cognate sites. We have summarized the parameters of our detailed model in Tables 1  and 2. Since there are several system parameters, exploration of the entire parametric hyperspace will be a complicated one. To simplify the calculations further we introduce the following scaling scheme to project the dynamical variables onto a dimensionless space.
t~c pc t; M k~mk =m ks ; m ks~kmk =c mk ; P k~pk =p ks ; p ks~kmk k pk c mk c pk ; k~a,b,c Here we measure the real time t in terms of numbers of lifetimes (1/c pc ) of the protein product of gene C. When protein C is stable over several generations of the cell, then its rise-time that is required to achieve half of the steady state value will be equal to the generation time of the cell upon considering the dilution owing to the doubling of cell volume along the process of cell division [8][9]. In such conditions one can also transform the dimensionless t in terms of numbers of generation times of the cell by dividing as t/ln2. With these definitions we can write the deterministic differential equations associated with temporal evolution of (M h , P h and X hg where h = b, c and g = a, b, c) of FFLs with A-OR/AND-B type regulatory logic imposed on the transcription of gene C as follows. Here x t ð Þ[ 0,1 ð Þ is a time dependent external signal that can turn on/off the expression of A. For a constitutive expression of TF gene A we can set x t ð Þ~G t ð Þ and depending on the type of the signals and their decay properties this function can be modified. To investigate the properties of the response-times of various types of FFLs we can set x t ð Þ as rectangular pulses/dips with predefined widths. Here G t ð Þ is the Heaviside step function such that G t ð Þ~0 when tv0 and G t ð Þ~1 when tw0. Rectangular pulses of signals at a given time point can be constructed with a combination of step functions. To introduce a rectangular pulse at the scaled time t = t p for a width of h, we need to set To generate a series of n numbers of rectangular pulses of signals at time points t i with widths Þ . Further we assume that the basal expression levels of all TF genes A, B and C are zero.
The first one in Eqs 2 describes the binding-unbinding dynamics of TF protein A at the promoter of TF gene B. Second and third equations describe respectively the dynamics of synthesis and degradation of mRNA and protein products associated with TF gene B. The dimensionless perturbation parameters in Eqs 2 are defined as follows.
v ba~cpc k fab p as ; w b~cpc c mb ; s bc~kfbc d cz c pb ; m bc~Kbc =p bs ; r b~cpc c pb The function V + [(0, 1) will vary depending on the type of regulation. For a positive regulation of the promoter of TF gene B by TF protein of A, we find V z~Xba and for a negative type The first set of equations in Eqs 3 describe respectively the binding-unbinding dynamics of TF proteins A and B at the corresponding cis-regulatory elements associated with the promoter of gene C. Second and third equations describe the dynamics of synthesis and degradation of mRNA and protein products of C. The dimensionless perturbation parameters in Eqs 3 are defined as follows.
w c~cpc c mc ; v ca~cpc k fac p as ; v cb~cpc k fbc p bs The function f a+b+ X ca ,X cb ð Þf a+b+ varies depending on the type of regulation. Here the total fraction of promoter of C occupied by the proteins of either A or B is X c = X ca + X cb . There are four different possibilities.
Here the subscript ''a+b+'' indicates the case where both the TF proteins A and B positively regulate C and other combinations are defined in the similar way. In Eqs 4 we have defined X X c~Xca {X cb and depending on the type of regulation imposed on the promoter of gene C we find the following limiting conditions of the promoter-state occupancy of TF gene C.  (corresponding standard notations are  namely I1, I2, I3 and I4) are incoherent types FFLs. There will be an additional step corresponding to the dimerization of TF proteins A and B in case of FFLs with A-AND-B logic type regulation. The dynamics of protein-protein dimerization and dissociation can be described by the following differential equation.
Y ab~yab =p as e ab~cpc l fab p bs ; L ab~lfab l rab p bs ; s y~kfyc d zc l fab p bs ; m yc~Kyc p as Here yab is the cellular concentration (M) of the dimer of the protein products of TF genes A and B, lfab and lrab are the corresponding forward bimolecular (M21s21) and reverse unimolecular (s21) rate constants associated respectively with the dimerization and dissociation reactions. To be consistent with Eqs 6 the equations associated with the dynamics of synthesis and degradation of Pa and Pb will be modified as follows.
Further when the dimer A-B binds with the promoter of gene C and hence up/down regulate then the related rate equations corresponding to the expression of gene C will be modified as follows.
v cy dX cy dt~Y ab 1{X cy À Á {m yc X cy ; v cy~cpc k fyc p as The function Y y+ [ (0, 1) varies depending on the type of regulation of the promoter C by the A-B dimer. For a positive regulation we find Y yz~Xcy and for negative regulation Y y{~1 {X cy . Upon combining Eqs 2 and 8 we find four different types of FFLs with A-AND-B type logic on the promoter of gene C viz. (P-P, P-N, N-P and N-N). Here one should note that P-P is similar to PPP type FFL however with A-AND-B gated logic at the promoter of TF gene C, and P-N corresponds to PNN type, N-P corresponds to NPP, and N-N corresponds to NNN. In a combination ''K-H'', 'K' is the type of regulation of promoter B by protein A and 'H' is the type of regulation of promoter C by A-B dimer.

Steady-state Analysis
When the dynamics of the variables X hk and M k are much faster than the rate of synthesis and degradation of the corresponding proteins P k then we have the following limiting conditions for A-OR-B regulation.
Eqs 9 can be obtained by setting w k = 0 for all k = (a, b, c) and v b = v ca = v cb = 0 in Eqs 1-3. The steady state values of the protein products P hs where h = a, b, c will vary depending on the type of regulation and protein-protein interactions between A and B.
Here we have defined the steady state values of X hk in the coupled Eqs 9 as follows.
X bas~Pa = m ab zP a ð Þ ; X cas~mbc P a = m ac P b zm bc P a zm bc m ac ð Þ ; Similarly one can derive the limiting condition in the presence of A-AND-B type regulation of promoter C by the dimer of the protein products of genes A and B. In such conditions the differential equation associated with P c in Eqs 9 will be modified as follows.
Here one should note that the dynamical variables (Pa, Pb and Pc) also represent the efficiency of TF genes A, B and C in raising their protein levels toward their unregulated steady state values (phs = kmhkph/cmhcph where h = a, b, c) even in the presence of various types of positive/negative type regulation on their promoters. The maximum achievable steady state values of Phs will be Phs = 1. In case of PPP type A-OR-B FFL, Pa and Pb influence the rate of change in Pc in an additive way. Whereas in case of P-P type A-AND-B FFL Pa and Pb influence the rate of change of Pc in a multiplicative way. As a result of this multiplicative effect, P-P type FFL can effectively filter out the short/transient pulses of signals which are introduced at the promoter of TF gene A. When TF proteins A and B decay much faster than TF protein C, then we find that as r a and r b tend toward zero. Under such conditions, the expression for Xcys in Eqn 10 can be written as follows.
Eqn 11 suggests that when the binding parameters (mab, Lab, myc) are much lesser than one and the promoter of gene A is turned on for sufficiently longer time periods then the TF gene C will be turned on to its maximum expression level since Xcs,1 under such conditions. Similar to Eq 11 when r a and r b tend toward zero, then we can write the expression for the occupancy level of the promoter C by TF proteins of genes A or B in case of PPP type A-OR-B FFL as follows.
This equation suggests that when the binding parameters (mbc, mac) are much lesser than one and the promoter of gene A is turned on for sufficiently longer time periods, then the TF gene C will be turned on to its maximum expression level since X cs , 1 under such conditions. Using Eqs 11 and 12 one can derive the steady-state values (Pas, Pbs, Pcs) of scaled protein concentrations Pa, Pb, and Pc associated with Eqs 9 as follows.
P as~x t ð Þ; P bs~V+ X bas ð Þ; To evaluate Pbs one needs to substitute P as~x t ð Þ in the appropriate expression of V + X bas ð Þ that in turn depends on the type of regulation of promoter of the TF gene B by the protein of gene A and subsequently these Pas and Pbs need to be substituted in the appropriate expression of Pcs that depends on the types of regulation of TF gene C by proteins A and B. While deriving Eqs 1-3 related to A-OR-B type FFLs we have assumed that monomeric units of proteins A or B interact with the promoters of B and C. Similar to the regulation of C by the dimer of TF proteins A-B in case of A-AND-B type FFLs, one can generalize Eqs 1-3 corresponding to A-OR-B type FFLs to include the regulation of the promoter of the TF genes B and C by the multimeric form of TF proteins A and B respectively [10]. So for we have set wk = 0 for all k = a, b and c. One can show that the response-time associated with the synthesis of the terminal TF protein C upon induction of the promoter of TF gene A by an external signal is strongly dependent on wc as follows. Consider a quasi-equilibrium situations for the promoter state occupancies and steady state situation for the synthesis and degradation of TF genes A and B so that (wa, wb) = 0 and (vca, vcb, vba) = 0. When (ra, rb) = 0, then from Eqs 9 we can obtain the integral solution for the temporal evolution of the variable Pc for a given arbitrary wc ? 0 and the initial conditions Pc = 0 at t = 0 as follows.
When the TF gene C is turned on toward its maximum expression level then the input function f a+b+ X cas ,X cbs ð Þ, 1 and we can write the integral solution for temporal evolution of Pc under such conditions as follows.
From Eq 15 we find that P c~1 {e {t for wc is zero. Since the inequality e {t ve {t=wc will be true for all the values of wc .0 and P c will tends toward zero as w c tends toward infinity, we can conclude that the response time associated with the expression of TF gene C upon induction of TF gene A will monotonically increase as wc increases. The maximum amount of TF protein C that is synthesized for a given rectangular pulse at the promoter of TF gene A with a width h will be P c h ð Þ and one can measure the filtering efficiency of the FFL under consideration for an arbitrary pulse width h .0 from the ratio P c h ð Þ=P cs . The cutoff pulse width hc can be defined by the inequality P c h c ð Þ=P cs vd where d is proportional to the experimentally detectable limit of the TF protein C under consideration. When wc = 0, then we find that h c ƒ ln 1= 1{d ð Þ ð Þ and when wc = 1 we find that Þ where LambertW(x) function is the solution of equation yey = x for y. Here we set d ,1022 for simulation purposes. Upon substituting this value we find h c ƒ10 {2 for wc = 0 and h c ƒ0:15 for wc = 1. For other values of wc, one needs to numerically solve the inequality P c h c ð Þ=P cs vd for hc. For example, when wc = 0.1 then we find that h c ƒ0:05. These results suggest that the critical cut-off pulse width hc increases along with the parameter wc.

Stochastic Analysis
The set of Chemical Langevin equations (CLE) associated with the expression of TF genes A, B and C within the various types of A-OR-B type FFLs can be written as follows [26][27][28].
TF A: TF C: Eqs 16-18 suggest the presence of non-zero temporal correlations among the set of concentration variables SX ba P a T, SX ca P a T and SX cb P b T. The characteristic correlation times associated with these pairs of variables will be much lesser than that of the timescales associated with the synthesis and decay of the respective protein products. This follows from the fact that the timescale associated with the promoter state fluctuations is well separated from the timescale associated with the synthesis and decay of the TF proteins. Here one should note that r c~1 by definition. In this equation, various types of Z parameters are defined as follows.
In Eqs 16-19, the termCis the dimensionless delta-correlated Gaussian white noise with the following mean and variance properties.
In this equation, various types of Z parameters and the Gaussian noise terms C associated with the dimerization dynamics are defined as follows. parameters v v and s s characterize the temporal coupling between the mRNA/protein dynamics and the binding-unbinding dynamics of protein molecules with the promoter. The ordinary perturbation parameter s s characterizes the strength of temporal coupling between the protein level dynamics with the promoter state occupancies. The variable w w represents the ratios of the lifetimes of various mRNAs to lifetime of protein of TF gene C and reflects the coupling between the mRNA and protein degradation dynamics. The variable m m, the set of dissociation constants is inversely proportional to the TFs/promoters binding affinity and characterizes the strength of various regulatory connections. The parameter set s s does not affect the rise-time of the TF gene C significantly since the change in the number of TF protein molecules due to binding-unbinding at various promoters is negligible. In contrast, v v significantly affects the response times since the effect of varying v v is indirectly amplified through the corresponding mRNA dynamics. Furthermore, an increase in v v would decrease the rate at which promoter state occupancies shift towards saturation (in positive type regulation) or free form (in negative type regulation) as the regulatory TFs level builds up.
The parameter set w w describes how best the rate of degradation of mRNAs of the TF genes A, B and C of FFLs are coupled to the rate of degradation of TF protein C. Lower values of w w~c pc c mk (k = a, b, c) occur only when the decay rate constants of various mRNAs (c mk ) are much higher than the decay rate (c pc ) of TF protein C. One should note that both the transcription and translation of various TF genes of prokaryotes are taking place in the cytoplasm whereas in case of eukaryotes the transcription is taking place inside the nucleus and the synthesized mRNA transcripts need to be spliced and then transported to cytoplasm through nuclear pores for translation. These differences in the cellular architecture warrants higher lifetimes (1/c mk ) for eukaryotic mRNAs than the prokaryotic ones which results in the general observation that the values of w w ( = lifetime of mRNA/lifetime of protein) associated with various genes in prokaryotes are lower than eukaryotes genes. It seems that a spectrum of various values of w w occurs in the protein coding genes of both prokaryotes and eukaryotes. In yeast, the values of w w seems to vary from 0.1 to 1 with a median of ,0.3 [17][18][19]. This means that we cannot ignore the dynamics of mRNAs while describing any type of TF network.
The parameter set m m describes the strengths of various binding events associated with the FFLs under consideration. Higher values of m m represent strong binding condition and lower values represent weak binding condition. Most of the interactions of TF proteins with the cis-acting DNA elements of associated promoters seems to be much stronger and m m will be generally in the order of ,10 23 . When the steady state values of protein numbers in the absence of regulation under in vivo conditions is p ks , 10 3 then the value of m m,10 23 indicates that a single TF protein is enough to occupy the associated promoter for 50% of the observation times.
The parameter set v v represents the strength of coupling between the promoter state occupancies and the rates of synthesis and degradation of TF proteins. The effect of v v on the rate of synthesis and degradation of TFs will be generally mediated through the rate of changes in the respective mRNA levels. The speed at which a regulated promoter reaches its quasi-equilibrium state is inversely proportional to the value of v v. Higher values of v v will slow down the rate at which promoter state occupancy reached its quasi-equilibrium state that in turn can increase the response times of positively regulated promoters and decrease the response times of negatively regulated promoter. Decrease in v v can also lead to overshooting of protein production in negatively auto regulated loops [9]. Small changes in v v can significantly affect the dynamics of the associated TF protein since these changes are subsequently amplified by the dynamics of mRNAs. In most of the prokaryotic cases v v will be in order of ,10 24 and it strongly dependent on the volume of the cell or nucleus in case of eukaryotic systems. Response times are not much influenced [9] by the parameter set s s even in a wide range (10 23 -10 3 ). Here s s represents the direct coupling between the promoter state occupancies and the rate of synthesis and degradation of TF proteins. Variation in s s does not influence the response times much since the number of protein molecules associated with the binding-unbinding events are much less compared to the steady state values.
The dependency of the response times on the parameters w w seems to be strongly influenced by the set of binding parameters m m. Under weak binding conditions such as m m §1, we find that each of FFLs under consideration show different type of variations as w w changes. Figure 2A  As we have pointed out in the introduction section, a FFL will be an efficient one when (1) the associated response times with respect to an input signal at the promoter of TF gene A is reasonably low or close to the generation time of the cell and (2) also it is less sensitive to the changes in w w over the physiological dynamic range (0.1, 1). Those FFLs which satisfy these two criteria will be the efficient ones and naturally selected. Upon applying these two criteria on the subsets of Group I type FFLs, we find that C1 with both A-AND-B and A-OR-B gated logics and I1 are the preferred FFLs on overall range of w w since response times of I3 and C4 type FFLs are higher than the generation time of the cell. We find from Figure 2D that the increase in the response times is ,200% for Group I type FFLs upon increasing w w from 0.1 to 1 whereas it is .400% for Group II and III type FFLs. These results agree well with the earlier observations on the abundance of various FFLs in E. coli [10,13] and yeast [10,14]. Though Group II and III FFLs possess less response times, these FFLs are not selected by nature since the same speeding-up functionality can be achieved through a much simpler negative auto regulatory loops associated with the TF gene C [8][9]. Figures 2E and 2F suggest that the members of Groups I-III are not robust against variation of the parameter set v v under strong binding conditions. Results show that the P-P type FFL will be moved from third subset of Group I to the second one upon an order of increase in v v. Further we find that the FFLs such as NPP, P-P, PPP and N-P behave in a similar way with respect to the changes in w w at higher values of v v. The segregation pattern of various FFLs based on the behavior of the response times over changes in w w under such conditions seems to be {C4, I3, {C1, P-P, N-P, I4}, I1}, {C2, {N-N, I2}, C3}, P-N. Since the value v v is inversely proportional to the speed at which the promoter state reaches the steady state, an increase in v v would decrease the transcription and translational rates. This will in turn increase the response times of positively regulated promoters and decrease the response times of negatively regulated promoters. Upon considering all these results together one can conclude that the FFLs of types PPP, P-P and PNP are robust against changes in both v v and w w. This also could be one of the reasons [14] associated with the observation [10] that these particular coherent C1 type FFLs with both OR and AND-logics are more abundant in nature than the other types. These results are summarized in Table 3.
When the binding parameters (m ab , m yc , L ab ) are much lesser than one, then we find from Eq 11 that the TF gene C will be turned on toward its maximum expression level. This means that strong binding conditions are required for the following types of molecular interactions to achieve the maximum production of TF protein C viz. (1) protein-protein interactions between A and B, (2) binding of the dimer A-B at the promoter of C and (3) binding of protein A at the promoter of B. Conditions (1-3) also suggest that for an efficient filtering activity of the P-P type A-AND-B FFL against transient input signals at the promoter of the TF gene A, the inequality conditions (m ab , m yc , L ab ) & 1 are necessary which in turn will decrease the maximum achievable steady state value of protein C as shown in Figure 3. In other words there exists a critical value of m m = m m c for a given signal with a pulse width h above which the response of gene C for a pulse of signal at the promoter of gene A will be practically zero. In Figure 3, this critical value occurs at m m c~1 for a pulse width of h = 0.3. On the other hand, for a given value of m m there exist a cutoff pulse width h = h c below which the response of TF gene C is practically zero which is demonstrated in Figure 4. From Figure 4 we find that there also exists a transition region in which the TF gene C responds to the width of the input pulse that is given at the promoter of A in a graded manner. From Eqs 10 we find that we can conclude that the TF gene C will respond to the variation of width h of the activation signal in a graded manner in the strong binding limit as (m ab , m yc , L ab ) tend toward zero. The maximum achievable P c will increase proportional to the width of the activation signal. Whereas in the weak binding limit as the binding parameters (m ab , m yc , L ab ) tend toward infinity, the response of TF gene C with respect to the pulse width h seems to be approximately a sharp type. Under such conditions there exists a critical pulse width (h c ) with a sharp transition region above which the maximum achievable P c will be P c , 1 and below which the maximum achievable P c will be P c , 0. It seems that the parameters (h ab , h yc , L ab ) associated with various binding events need to be fine-tuned to achieve both high efficiency in the filtering activity as well as maximum possible steady-state values of protein P c upon inducing the promoter of gene A by a persistent signal. Results from stochastic simulations at various values of v v and w c are shown in Figure 5 and summarized in Table 4. These results suggest that the coefficient of variation in the response times associated with various FFLs under strong generation times and w w was iterated in the interval (0.001, 10) with D w w = 0.001. Under weak binding condition we observe that each of the considered FFLs behaves in a different way from others. B. Dependency of response times of FFLs w w. Settings are same as A with m m = 0.1. C. Dependency of response times of FFLs w w. Settings are same as A with m m = 0.01. D. Dependency of response times of FFLs w w. Settings are same as A however with m m = 0.001. Under this strong binding conditions all the considered FFLs segregate into three Groups namely I, II and III. It seems that PPP (C1) and PNP (I1) type FFLs show less variation with respect to changes in w w and also their response times are comparable with that of the unregulated C. E. Influence of increase in v v on the dependency of response times of FFLs on w w. Here v v = 0.003. This is the physiological value of v v for a typical yeast cell nucleus whose volume is ,10 times higher than a bacterial cell. In this case, P-P type FLL shifts from the third subgroup of Group I to the second subgroup. F. Influence of changes in v v on the dependency of response times of FFLs on w w. Here v v = 0.03. This is the physiological value for a typical human cell nucleus whose volume is ,100 times larger than a bacterial cell. In this case, PNP (I1) type incoherent FFL shifts to Group III. doi:10.1371/journal.pone.0041027.g002 Table 3. Summary of segregation patterns at weak and strong binding conditions. Here P-P (C1 type FFL with AND type logic on TF gene C) and I1 behaves similarly and therefore the advantages of I1 type FFL whose response time is lower than the generation time will be shared by P-P type which will be added up to the C1 type FFL with OR type gated logic to TF gene C. However this pattern seems to be weakly dependent on v v. When v v increases as in case of eukaryotic cell, then P-P behaves similar to C1 type and I1 type FFL will have the entire advantage of having lower response times than other subgroups of Group I. As a result, I1 type FFL will be more abundant in eukaryotes than prokaryotes. All these results are not dependent on changes in Earlier results ( [10], supplementary materials) based on the analysis of literature-based databases of experimentally verified direct transcription interactions for E. coli [6] suggested a distribution pattern of various FFLs as 83% coherent (out of which 80% were C1 type, 11% C3, 6% C2 and 3% C4) and 17% incoherent (out of which 72% were I1, 14% I2 and 14% I4) types. Similar analysis on S. cerevisiae [6,10] showed a distribution of FFLs as 55% coherent (out of which 84% were C1 and 16% C2) and 45% incoherent (out of which 84% were I1, 12% I2 and 4% I3) types. This data suggested an overall distribution of various FFLs in the transcription factor networks of bacteria as (67% C1, 12% I1, 10% C3, 5% C2, 1% C4, 1% I3 and 1% I4). The overall distribution of various FFLs in the TF networks of yeast is (47% C1, 38% I1, 9% C2, 5% I2 and 1% I3). Comparison of the order of preferences of various FFLs in bacteria (C1. I1. C3. C2. {C4, I3, I4}) and yeast (C1. I1. C2. I2. I3) with the Group I, II and III types of FFLs, one can conclude that those FFLs of Group I (C1, I1) whose response times are close to or less than the generation time of the cell apart from the robustness against changes in w w are more preferably selected. The next set of preferable FFLs (C2, C3 and I2) is from Group II. The set of FFLs (I3, I4 and C4) are the least preferable ones. Although I3 and C4 fall within Group I, their response times are much higher than the generation time. It seems that the relative abundance of I1 type FFL is more in eukaryotes than the prokaryotes. Here one should note that the value of the parameter set v v(v hk = c pc /k fhk p hs = 0.0003) that is used in our simulations ( Fig. 2A-D) is derived for a prokaryote organism whose cell volume is ,10 218 m 3 . With this setting we find from our simulations that the subset (P-P and I1) behaves in a similar way within Group I which is evident from the segregation pattern of various FFLs at strong binding conditions  We argue that some of the advantages of I1 such as lesser response time than the generation time apart from the robustness against changes in w w will be shared by P-P type FFL that is in turn added to the total abundances of C1. This follows from the fact that P-P is also a C1 type FFL with AND gated logic type regulation on the promoter of TF gene C. Stochastic simulations results suggested that the coefficient of variation in the response times is lesser in C1 type FFLs than the other FFLs which agrees well with the observation that these types are relatively more abundant in nature. On the other hand I1 type FFLs possess highest coefficient of variation in the response times. Overall results suggest that the coefficient of variation in the response times is relatively higher whenever the type of regulation imparted by TF genes A and B on the TF gene C is negative. This result is in line with the observations on the negatively self-regulated TF networks in which the coefficient of variation in the response times was shown [9] to be .50%. These results suggest that I1 type FFL as well as negative self-regulated motifs can speed up the response time at the cost of increasing the fluctuations in their response times.
Since the volume of eukaryotic nucleus is much larger than a bacterial cell volume over several orders of magnitudes (volume of yeast nucleus is ,10 1 times larger than a bacterial cell volume [25] whereas human cell nucleus is ,10 2 times larger than a bacterial cell volume), the overall bimolecular collision rate associated with the binding of TFs at their cognate sites (k fgh ) inside the nucleus will be much lower in case of eukaryotes than prokaryotes owing to the dilution effects. This means that the physiological values of v v will be higher in eukaryotes than prokaryotes. Upon an increase in v v as in case of eukaryotes such as yeast ( = 0.003), P-P type behaves in a similar way (Fig. 2E-F) as that of C1 whereas I1 type FFL recovers the full advantage over its lower response time within Group I compared to C1 which is evident from the altered segregation pattern {C4, I3, {C1, P-P, N-P, I4}, I1}, {C2, {N-N, I2}, C3}, P-N. This in turn will result in the higher relative abundances of I1 type FFL in eukaryotes than the prokaryotic TF networks. Further increase in v v as in case of nucleus of human cell ( = 0.03) shift the I1 type FFL from Group I to Group III. This result predicts that I1 type FFLs will be much lesser in abundance than the C1 type in the TF network of human and other higher animals and plants.
One also should note that we are still not able to explain the relative scarcity of I4 and N-P type FFLs even though they fall well within Group I. As pointed out by Magnan and Alon in reference [10], the relative scarcity could be due to the reduced functionality of type 3 and type 4 FFLs with AND gated logic (here it is applicable to N-P) since they response at most one of the triggering signals given at the promoters of both the genes A and B.
We have considered the sets of similar parameters ( w w, v v, m m, s s, r r) associated with TF genes A, B and C as variable units rather than individual ones for all these calculations and simulations. The TF network of an organism consists of several FFL motifs and each of these FFLs is constituted with different subsets of a pool of TF genes. Each subset of TFs can be represented as points in the parametric space of our model. Upon considering the entire pool of TF genes, one should note that each of the members of these sets of parameters can take a spectrum of values with a probability distribution with definite mean and variance. There are two types of variability of parameters among the TF genes of FFLs viz. variation of the parameters of TF gene A/B/C across various FFLs found in the entire TF network and variation of the parameters of TF genes A, B, C within a given FFL. Here we have assumed that both of them are approximately the same. The main results of our analysis will not be affected much due to the second type of variation since we have used the mean values of the parameters associated with the entire set of TF genes of an organism to represent genes A, B and C of various types FFLs.

Methods
We use the following Euler type iterative numerical scheme to integrate the deterministic differential Eqs 1-3 associated with various A-OR/AND-B type FFLs in the dimensionless time and concentrations space.   CV was calculated over 10 5 numbers of stochastic trajectories. The coherent C1 type FFL possesses lower CV of response times than other FFLs whereas I1 type possesses highest CV of response times. A. Here the settings are v vƒ0:003 that is applicable to both prokaryotes and eukaryotes such as yeast, w w was iterated from 0.001 to 10, and s s = 4, T = 25 generation times. B. Here the settings are v v §0:03 that is applicable to higher eukaryotes such as human, w w was iterated from 0.001 to 10, and s s = 4, T = 25 generation times. doi:10.1371/journal.pone.0041027.g005 measured in number of molecules) we find that l mh , 10 22 /w h molecule 21 and l ph , 10 23 /r h molecule 21 . Further we also find that l ab , 1 s, l xkh , 10 23 molecule 21 s (k = b, c and h = a, b) and l xyc , 10 23 molecule 21 s. We use the reflecting boundaries (0, 1) for the scaled concentration variables (P h , M h , Y ab , and X nk ) where n = b, c and k = a, b and the absorbing boundary condition P c = K to compute the mean first passage time (response-time) from the stochastic simulations. Averaging was done over 10 5 trajectories at each w c value and the coefficient of variation of response time was computed as CV (response-time) = standard deviation of responsetime/mean of response-time.

Conclusions
Feedforward loops (FFLs) consist of three genes which code for three different transcription factors A, B and C where B regulates C and A regulates both B and C. We have developed a detailed model to describe the dynamical behavior of various types of coherent and incoherent FFLs in the transcription factor networks. We considered the deterministic and stochastic dynamics of both promoter-states and mRNAs of various genes coding for the transcription factors associated with the FFL motifs. Detailed analysis showed that the response times of FFLs strongly dependent on the ratios (w h = c pc /c ph where h = a, b, c) between the lifetimes of mRNAs of A, B and C (1/c mh ) and the protein of C (1/c pc ). When the binding of transcription factors A and B with the cis-acting elements of respective promoters is very strong, then we could categorize all the possible FFLs into Group I, II and III based on the dependency of the response times of FFLs on w h . Though the response times of Group I FFLs were higher than II and III, they seem to be less sensitive to the changes in w h within the naturally occurring dynamic range (0.1, 1). We have further shown that among the members of the Group I FFLs, the coherent C1 type was more robust against changes in other system parameters which could be one of the reasons why C1 type coherent FFLs are more abundant in nature than the others.

Author Contributions
Conceived and designed the experiments: RM. Performed the experiments: RM. Analyzed the data: RM. Contributed reagents/materials/ analysis tools: RM. Wrote the paper: RM.