The Underlying Molecular and Network Level Mechanisms in the Evolution of Robustness in Gene Regulatory Networks

doi:10.1371/journal.pcbi.1002865

Figure 1.

Determination of transcription factor binding sites and relative binding specificities by in-silico molecular modeling.

(A) Example of in-silico model of DNA-protein complex for the transcription factor EGR1 (PDB:1AAY, originally with sequence 5′-GCGTGGGC-3′) bound to the candidate 8-mer 5′-CGTTGTCG-3′. DNA color codes: GUA:green, CYT:pink, ADE:blue, THY:orange. (B) Detailed view of same model complex for protein residues at 3.5 Å distance from DNA, showing residue repositioning upon energy minimization procedure. Here, the crystal structure is shown in blue and the model in red. (C) Distribution of calculated binding strengths, ε, using the Robertson-Varani statistical potential on TF-DNA complexes for all possible 8-mers (4⁸) for the Egr1 structure. (D) Transformation of normalized ε scores into relative binding specificities, κ. Dashed line indicates cutoff level γ, below which all specificities are set to zero, providing a variable separation between binding and non-binding 8-mers. ε′_opt is a particular value of ε′, defining constant numbers of binding sites for each TF (see Materials and Methods). (E) Six in-silico determined TFBS preferences were compared against those available in JASPAR [23], UniProbe [35] and TRANSFAC [36] databases. N indicates the number of sequences used (we used the N lowest energy sequences to obtain in-silico preferences) to produce the information-content sequence logos (WebLogo [60]). *Logos constructed from frequency matrices.
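The cutoff step in panel (D) can be illustrated in a few lines. This is a minimal sketch of the thresholding only: the mapping from ε to κ is not given in this caption and is not reproduced here, and the function name `apply_specificity_gap` and the example values are hypothetical.

```python
def apply_specificity_gap(kappas, gamma):
    """Set every relative specificity below the cutoff gamma to zero.

    `kappas` is a list of relative binding specificities, one per 8-mer.
    Values at or above gamma are kept unchanged (an assumption about the
    boundary case; the caption only says values below gamma are zeroed).
    """
    return [k if k >= gamma else 0.0 for k in kappas]


# Illustrative values only, not taken from the paper.
example_kappas = [0.9, 0.4, 0.1]
thresholded = apply_specificity_gap(example_kappas, gamma=0.3)
```

Raising `gamma` in this sketch turns more candidate 8-mers into non-binding sequences, which is the "variable separation" the caption describes.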


Figure 2.

Schematic representation of the gene-regulatory network model.

(A) Model of development. The expression of each gene is regulated by combinatorial interaction between an explicitly modeled cis-regulatory sequence (black lines) and the gene products (sequence specific transcription factors). Each gene product is represented by a different color. Shapes within the cis-regulatory regions represent sequence determinants of regulatory elements and their colors define the identity of the interacting transcription factor. Within the box, the explicit regulatory sequence representation is illustrated by showing an example of a consensus binding site for a given TF (maximal binding specificity, κ_max) and a mutated site (with a lower κ). The extent of gene regulation is a function of the presence and associated binding specificities of each regulatory element (κ_ix, where i is the input gene and x is a regulatory site on gene j), transcription factor abundances (s_i) and the function of the interacting transcription factors (activator or repressor of transcription, represented as positive and negative s_i values). (B) Population model. Simulations start with a randomly chosen developmentally stable founder. Variation is introduced in two forms: exchange of promoter regions between two randomly chosen parents (without recombination within promoter regions) and single point mutations at the DNA level. Selective pressure is applied to the offspring on two levels: they must develop a stable expression pattern through time (phenotype) and that phenotype must be similar to that of the founder.


Figure 3.

Evolution of robustness depends on URR length and specificity gap.

(A) Change in robustness, measured as the difference of the mean phenotypic distances between unperturbed and perturbed individuals at generations 2000 and 0. The mutation rate used for this measure was 1 mutation per 100 bp per genome. (B) Change in connectivity (comparing generation 2000 to generation 0), measured by the fraction of unique inputs in the network of a given individual. The numbers at the end of each bar represent the connectivity at the end of the simulations. Error bars are the standard error of the mean over 100 independent simulations.


Figure 4.

Classification of events produced by single point mutations on a cis-regulatory segment.

(A) Decision tree defining all possible events on cis-regulatory regions after the introduction of a point mutation. (B) These events can be thought of as “tools” available to the system, since they summarize all the changes the system can potentially make. For clarity, we also arranged them along a continuum according to their impact on the network architecture. Silent mutations lie at the local-sequence level extreme, since they change only the sequence without modifying either network architecture or gene expression levels. Deletion or creation of unique TFBSs lies at the other extreme (network-architecture level), because these events directly alter the network's architecture. A mutation that preserves a TFBS can still change the relative specificity of that binding site.


Figure 5.

Decomposition of robustness.

Robustness due to stable individuals is the sum, over the mutational events described in Fig. 4A, of the product of each event's frequency and its average phenotypic distance. These events can therefore be used to decompose robustness. (A) Relative composition of the frequencies of each mutation type. Frequencies were measured as the differences between final and initial generations in the simulations for each of the classified mutational events (see Materials and Methods). Silent mutations dominate in almost all cases, especially at low specificity gap (γ) and short URR lengths (L). Silent mutations are found at the extreme of local-sequence level changes (Fig. 4B). (B) Fraction of local-sequence and network-architecture level changes. Local changes were calculated as the fraction of the total robustness change assuming constant frequency of silent mutations (see Materials and Methods). The length of the URRs (in base pairs) is indicated on top of each bar in both graphs.
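The decomposition stated at the start of this caption is a frequency-weighted sum, which can be sketched as follows. The event labels and all numbers are hypothetical placeholders chosen for illustration; they are not the categories' exact names from Fig. 4A or values from the paper.

```python
def robustness_from_events(frequencies, mean_distances):
    """Sum over event types of frequency * average phenotypic distance.

    Both arguments are dicts keyed by mutational event type.
    """
    if frequencies.keys() != mean_distances.keys():
        raise ValueError("event types must match in both inputs")
    return sum(frequencies[e] * mean_distances[e] for e in frequencies)


# Hypothetical event labels and illustrative numbers only.
freq = {"silent": 0.5, "preserved_tfbs": 0.25,
        "tfbs_deleted": 0.125, "tfbs_created": 0.125}
dist = {"silent": 0.0, "preserved_tfbs": 0.2,
        "tfbs_deleted": 0.8, "tfbs_created": 0.4}

total = robustness_from_events(freq, dist)

# Per-event contributions, useful for a breakdown like the one in panel (A).
contributions = {e: freq[e] * dist[e] for e in freq}
```

Because silent mutations have zero phenotypic distance in this toy example, increasing their frequency at the expense of the other events lowers the total, which is one way to read the local-sequence contribution discussed in panel (B).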


Figure 6.

Local-sequence level mechanisms.

(A) Resistance to creation of TFBSs within TFBS-free regions, measured as the frequency of silent mutation events (Fig. 4) normalized to the fraction of TFBS-free region in the genome. (B) TFBS conservation or degree of resilience to deletion of TFBSs, measured as the average probability that a TFBS will remain a TFBS following a point mutation. Error bars are computed as the standard error of the mean over 100 independent simulations.
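The conservation measure in panel (B) can be illustrated with a toy calculation. This sketch assumes that every single-base substitution is equally likely and uses a hypothetical set of 3-mer binding sites for brevity; the paper works with 8-mers and measures the quantity in evolved populations, so this shows the idea rather than the authors' procedure.

```python
from statistics import mean

BASES = "ACGT"


def single_point_mutants(site):
    """Yield every sequence that differs from `site` at exactly one position."""
    for i, original in enumerate(site):
        for base in BASES:
            if base != original:
                yield site[:i] + base + site[i + 1:]


def conservation(site, binding_sites):
    """Fraction of single-point mutants of `site` that are still binding sites."""
    mutants = list(single_point_mutants(site))
    return sum(m in binding_sites for m in mutants) / len(mutants)


# Hypothetical toy set of binding sequences (3-mers instead of 8-mers).
binding = {"ACG", "ACT", "TCG"}
per_site = {s: conservation(s, binding) for s in binding}
average = mean(per_site.values())
```

In this toy set, "ACG" has two binding neighbors out of nine possible mutants, so it is more resilient to deletion by point mutation than "ACT" or "TCG", which have one each.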


Figure 7.

Network-architecture level mechanisms.

Correlation between the “other contributions” portion of the change in robustness (Fig. 5B) and the average network rewiring, as a function of URR length (L) and specificity gap (γ). Rewiring, Φ, was computed between individuals at generation 2000 and their respective founders (see Materials and Methods). These values were corrected for the effects of changing connectivity by calculating Φ between two randomly chosen stable individuals, both with the same average connectivity values observed for the individuals at the end of the simulations. The correlation shows that Φ largely explains the “other contributions” component of robustness. The amount of rewiring depends primarily on L and to a lesser extent on γ. Error bars are the standard error of the mean over 100 independent simulations.
