Fig 1.
The Diffusion-Accessible-Domain (DAD) hypothesis.
(A) Classical view of histone transfer at the replication fork, in which parental histones, on being dislodged from the DNA, land exclusively on the leading and lagging daughter strands. (B) Our new DAD hypothesis, which states that, owing to diffusion, some parental histones (for example, on locus 2 DNA) may land on non-daughter strands (locus 1 DNA), provided the loci are in close physical proximity and are replicated at the same time. Thicker arrows indicate higher-probability transitions due to physical proximity. The wiggly shape of the arrows indicates diffusive histone transfer. Note that the undirected nature of diffusion in this model results in a symmetric distribution of parental histones to the daughter DNA strands.
Fig 2.
Differential histone diffusivity between chromatin compartments explains histone dilution kinetics during replication.
(A) A model of histone dilution at the replication fork incorporating the DAD hypothesis (Model 1). Each row of the parent matrix represents a different DNA locus (either on the same or on different chromosomes) and columns represent histone sites. Histones are shown as green spheres, with the j = 1 histones shown larger to indicate diffusion within rows of the same column. The yellow stars represent biotin tags on the histones at one particular locus of interest R*, mimicking the initial conditions of the experimental protocol in Escobar et al. [28]. The arrows show examples of possible movement of histones between loci, from the parent matrix to either of the two daughter matrices. The probabilities of these transitions are based upon the laws of diffusion, which account for the distance between parent and daughter loci xi,i′ = |i − i′| (details in Methods and SI). (B) Flowchart of one run of the simulation for multiple cell cycles. Histones are distributed from the parent to the two daughter matrices based on the laws of diffusion, and the process is repeated many times. Empty spots arising from histone dilution are filled with untagged histones, leading to a decrease in the number of tagged histones at the locus R*. (C) Comparison of model (pink dashed lines) with experimental results [28] (green solid lines) for both active and repressed genes. Results from three active genes (Pou5f1, Nanog, Ccna2) and three repressed genes (Meis2, Hoxc6, Ebf1) are shown here. The model results represent best-fit curves, with the diffusion constant D as the only free parameter. (D) As expected, fitted diffusion constants from panel (C) are higher for active genes than for repressed genes, supporting the plausibility of our model.
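The dilution process in panels (A) and (B) can be illustrated with a minimal mean-field sketch. It is not the simulation used for the figure: the Gaussian kernel exp(−x²/(4Dt)), the unit time step, the neglect of site collisions, and the function names `transition_matrix` and `tagged_fraction_over_cycles` are all assumptions made here for illustration; the actual transition probabilities are given in Methods and SI.

```python
import numpy as np

def transition_matrix(n_rows, D, t=1.0):
    """Row-normalized probability that a histone from locus i lands on locus i'.

    Assumption: a Gaussian diffusion kernel in the distance x = |i - i'|.
    """
    idx = np.arange(n_rows)
    x = np.abs(idx[:, None] - idx[None, :])
    w = np.exp(-x**2 / (4.0 * D * t))
    return w / w.sum(axis=1, keepdims=True)

def tagged_fraction_over_cycles(n_rows, r_star, D, n_cycles):
    """Expected fraction of tagged histones at locus r_star after each cycle."""
    K = transition_matrix(n_rows, D)
    tagged = np.zeros(n_rows)
    tagged[r_star] = 1.0  # all tags start at the locus of interest R*
    history = [tagged[r_star]]
    for _ in range(n_cycles):
        # Each daughter receives half of the parental histones on average,
        # redistributed across loci by diffusion; the empty spots are
        # filled with untagged histones, so they add nothing here.
        tagged = 0.5 * (K.T @ tagged)
        history.append(tagged[r_star])
    return np.array(history)
```

In this sketch a vanishing D reduces to the classical picture, with the tagged fraction at R* halving every cycle, while a larger D dilutes the tags faster because some of them land on neighboring loci.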
Fig 3.
DAD model predictions of diffusion-induced histone mark patterns in the epigenome.
(A) A modified version of the model shown in Fig 2A that incorporates a description of histone marks and mark copying (Model 2). Since any locus will have various types of histone marks, we incorporate a mark of interest (red clover symbol) and other marks (blue clover). Once the parent divides, empty spots are left in the two daughters, which get filled with newly synthesized histones carrying neither blue nor red marks. These sites are then assigned marks by a simple mark-copying rule in which the nearest-neighbor sites determine the mark to be assigned (dashed lines; details in Methods and SI). Rules of diffusion and transition of histones from parent to daughter cells remain identical to Model 1. (B) Results from simulations of Model 2 for various numbers of cell cycles. Histone modification similarity between pairs of loci Q(i1, i2) is plotted as a function of negative distance (−xi,i′). Negative distance was used to qualitatively match the spatial proximity measure in HiC datasets, where physical proximity increases towards the right of the x-axis. The red box plots correspond to Early-Early (EE) segments while the blue box plots correspond to Late-Late (LL) segments. For all simulations performed, the parent and the two daughter matrices have 20 rows and 200 columns, where the first 10 rows correspond to Early segments (faster diffusion, D = 1.2) and the last 10 rows to Late segments (slower diffusion, D = 0.07). Each row was assigned a number of histone marks of interest drawn from an exponential distribution with parameter 130. Statistics for each plot were generated from 100 repetitions. Beginning with no difference in Q(i1, i2) between EE and LL segments at 0 cell cycles, three distinct patterns emerge and stabilize over increasing cell cycles: (i) Q(i1, i2)LL is larger than Q(i1, i2)EE for any value of the x-axis, (ii) there is a distinct distance dependence in the Q(i1, i2) values for both EE and LL loci pairs, and (iii) the slope of the distance dependence is higher for LL pairs. (C) Quantification of the slopes of the distance-dependent Q(i1, i2) patterns using linear regression, demonstrating a steeper slope for the LL loci pairs (blue line) once the patterns have stabilized after 50 cell cycles. (D) Distribution of the difference between the average Early signal (averaged over the 10 Early rows) and the average Late signal after 50 divisions, demonstrating no biases in the simulations. Since these simulations are started with equal numbers of histone marks of interest (on average) in Early and Late rows, no differences emerge after any number of divisions. Yet the Q(i1, i2) values develop the distinct patterns shown in panel (B), highlighting the role of diffusion in generating non-trivial patterns. (E) Simulations of Model 2 in which diffusion is not allowed and parental histones from a particular locus land on the identical locus in one of the two daughter DNA strands. None of the three patterns from panel (B) develops when diffusion is prevented.
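The mark-copying step in panel (A) might look roughly like the sketch below. The exact rule is specified in Methods and SI, so everything here is an assumption for illustration: the integer codes `EMPTY`, `OTHER` and `INTEREST`, the function name `copy_marks`, the restriction to neighbors within the same row, the random tie-breaking, and the synchronous update (every empty site copies from the row as it was before filling).

```python
import numpy as np

# Hypothetical site codes: unmarked new histone, other mark, mark of interest.
EMPTY, OTHER, INTEREST = 0, 1, 2

def copy_marks(row, rng):
    """Assign each EMPTY site the mark of its nearest marked site in the row.

    Ties between equally distant neighbors are broken at random (assumption).
    """
    row = np.asarray(row)
    marked = np.flatnonzero(row != EMPTY)
    filled = row.copy()
    if marked.size == 0:
        return filled  # nothing to copy from
    for site in np.flatnonzero(row == EMPTY):
        dist = np.abs(marked - site)
        nearest = marked[dist == dist.min()]
        filled[site] = row[rng.choice(nearest)]
    return filled
```

For example, `copy_marks([INTEREST, EMPTY, EMPTY, EMPTY, OTHER], np.random.default_rng(0))` assigns the mark of interest to the second site and the other mark to the fourth, while the middle site is equidistant from both and receives either.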
Fig 4.
Histone modifications in the GM06990 and K562 cell lines consistently exhibit the predicted patterns from the DAD model.
(A) An overview of the bioinformatics pipeline for investigating the presence of the three predicted histone modification patterns in public datasets. (B) Classification of 1 Mb genomic segments as early or late, depending on the replication timing score difference (early score − late score) for each segment. A cutoff value of 1000 was used in all plots in this figure. If the score difference was larger than +1000, the segment was classified as early. If the score difference was less than −1000, the segment was classified as late. Results with another cutoff value of 500 are shown in Fig B in S1 Text. (C) Q(i1, i2) vs spatial proximity plots for H3K27ac, H3K27me3 and H3K4me2 modifications from the GM06990 (top row) and K562 (bottom row) cell lines. The dashed lines represent the Median Q(i, j)chance values for early (red) and late (blue) loci. The three predicted patterns from the DAD model are clearly visible here. Spatial proximity is defined as the Pearson correlation between the i1th and i2th rows of the normalized HiC contact matrix. (D) Linear regression of Median Q(i1, i2) against spatial proximity for EE and LL loci pairs, for the H3K27me3 mark in both cell types. The x-axis points are the middle values of the bins from panel (C). The third pattern, a steeper slope for LL segments, is evident in these plots. (E) The difference ΔΔ = ΔMedianQ(i1, i2) − ΔMedianQ(i1, i2)chance plotted as a function of spatial proximity, where ΔMedianQ(i1, i2) = MedianQ(i1, i2)LL − MedianQ(i1, i2)EE. As per the DAD model predictions, ΔMedianQ(i1, i2) is expected to be larger at lower values of spatial proximity (left side of the plot) and smaller at higher values of spatial proximity (right side of the plot). ΔMedianQ(i1, i2)chance was subtracted from ΔMedianQ(i1, i2) to account for the observed baseline differences in average values of histone modification signal between early and late segments for any mark. Positive ΔΔ values at low spatial proximity demonstrate consistency with the DAD model predictions. (F) A negative control histone mark, H3K9me3, which shows none of the three predicted patterns in Q(i1, i2). This is likely a result of H3K9me3 marking constitutive heterochromatin regions, where diffusion is expected to be minimal or even absent.
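The quantities defined in panels (B), (C) and (E) can be written out as a short sketch. The function names and the "unclassified" label for segments between the two cutoffs are illustrative choices made here, not taken from the pipeline in panel (A), and ΔMedianQ(i1, i2)chance is assumed to be the LL minus EE difference of the chance medians, by analogy with ΔMedianQ(i1, i2).

```python
import numpy as np

def classify_segments(early_score, late_score, cutoff=1000.0):
    """Label segments by replication timing score difference (early - late)."""
    diff = np.asarray(early_score, dtype=float) - np.asarray(late_score, dtype=float)
    labels = np.full(diff.shape, "unclassified", dtype=object)
    labels[diff > cutoff] = "early"   # difference larger than +cutoff
    labels[diff < -cutoff] = "late"   # difference less than -cutoff
    return labels

def spatial_proximity(hic_normalized):
    """Pearson correlation between every pair of rows of the HiC matrix."""
    return np.corrcoef(hic_normalized)

def delta_delta(med_ll, med_ee, med_ll_chance, med_ee_chance):
    """(LL - EE difference of median Q) minus the same difference for chance."""
    delta = np.asarray(med_ll, dtype=float) - np.asarray(med_ee, dtype=float)
    delta_chance = (np.asarray(med_ll_chance, dtype=float)
                    - np.asarray(med_ee_chance, dtype=float))
    return delta - delta_chance
```

Passing per-bin medians as arrays to `delta_delta` gives one ΔΔ value per spatial proximity bin, which is the quantity plotted in panel (E).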