Extracting multi-way chromatin contacts from Hi-C data

Lei Liu; Bokai Zhang; Changbong Hyeon

doi:10.1371/journal.pcbi.1009669

Abstract

There is a growing realization that multi-way chromatin contacts formed in chromosome structures are fundamental units of gene regulation. However, due to the paucity and complexity of such contacts, it is challenging to detect and identify them using experiments. Based on an assumption that chromosome structures can be mapped onto a network of Gaussian polymer, here we derive analytic expressions for n-body contact probabilities (n > 2) among chromatin loci based on pairwise genomic contact frequencies available in Hi-C, and show that multi-way contact probability maps can in principle be extracted from Hi-C. The three-body (triplet) contact probabilities, calculated from our theory, are in good correlation with those from measurements including Tri-C, MC-4C and SPRITE. Maps of multi-way chromatin contacts calculated from our analytic expressions can not only complement experimental measurements, but also can offer better understanding of the related issues, such as cell-line dependent assemblies of multiple genes and enhancers to chromatin hubs, competition between long-range and short-range multi-way contacts, and condensates of multiple CTCF anchors.

Author summary

The importance of DNA looping is often mentioned as the initiation step of gene expression. However, there is growing evidence that ‘chromatin hubs’ comprising multiple genes and enhancers play vital roles in gene expression and regulation. A number of experimental techniques to detect and identify multi-way chromosome interactions are currently available; yet detection of such multi-body interactions is statistically challenging. This study proposes a method to predict multi-way chromatin contacts from the pairwise contact frequencies available in Hi-C datasets. Since chromosomes are made of polymer chains, the pairwise contact probabilities are not entirely independent of each other; certain types of correlations are present that reflect the underlying chromosome structure. We extract these correlations hidden in Hi-C datasets by leveraging theoretical arguments based on polymer physics.

Citation: Liu L, Zhang B, Hyeon C (2021) Extracting multi-way chromatin contacts from Hi-C data. PLoS Comput Biol 17(12): e1009669. https://doi.org/10.1371/journal.pcbi.1009669

Editor: Ferhat Ay, La Jolla Institute for Allergy and Immunology, UNITED STATES

Received: June 14, 2021; Accepted: November 19, 2021; Published: December 6, 2021

Copyright: © 2021 Liu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The software package and associated documentation are available online (https://github.com/leiliu2015/HLM-Nbody).

Funding: This work was supported in part by National Natural Science Foundation of China (12104404 to L.L.), ZSTU intramural grant (20062226-Y to L.L.) at Zhejiang Sci-Tech University, and KIAS Individual Grant (CG035003 to C.H.) at Korea Institute for Advanced Study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Recent advances in experimental techniques [1–6] offer unprecedented glimpses into the chromosome structures inside cell nuclei, in the form of pairwise distances and contact frequencies between genomic loci. Gene expressions are, however, realized when multiple genomic loci, e.g., promoters and enhancers separated over large genomic distances, are brought together to form a regulatory element [7–10], which underscores the importance of resolving chromatin interactions beyond pairwise two-body contacts. In particular, the transcriptional regulation is fine-tuned via a complex network of cooperative and competitive interactions among nearby genes mediated by transcription factors [11, 12].

A plethora of experimental methods have recently been developed to detect multi-way contacts in chromosomes [13], which include super-resolution chromatin tracing [14], and both ligation-based (3way-4C [15], COLA [16], Tri-C [17], MC-4C [18, 19], Pore-C [20] and single-cell techniques [21–24]) and ligation-free methods (GAM [25] and SPRITE [26]). Detection of multi-way chromatin contacts using these experimental methods, however, is statistically limited due to the paucity of such contacts. The probability of detecting a particular n-body contact from a genomic region of interest consisting of N statistical segments is approximately $p_n \sim V\langle c\rangle^{n}/\binom{N}{n}$ (n ≥ 2), where ⟨c⟩ ∼ N/V is the effective concentration of the segments in the volume V, which can be approximated using the Flory radius [27] (R_F ∼ N^ν) as $V \sim R_F^{3} \sim N^{3\nu}$, with ν (≥ 1/3) being the Flory exponent [28, 29]. Since $\binom{N}{n} \approx N^{n}/n!$ for N ≫ n, it is expected that the n-body contact probability scales as $p_n \sim n!/N^{3\nu(n-1)}$, which makes experimental detection of n-body contacts with larger n statistically more demanding. To get around the detection problem, one usually either conducts a genome-wide study at low resolution (a small N), or performs a high-resolution experiment by focusing only on the contacts formed at a few prescribed sites [16, 30].
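
The scaling argument can be made concrete with a few lines of Python. This is a minimal sketch of the order-of-magnitude estimate p_n ∼ n!/N^(3ν(n−1)) only; the function name and the example values of N and ν are illustrative assumptions, and all prefactors are ignored.

```python
from math import factorial

def n_body_scaling(n: int, N: int, nu: float) -> float:
    """Order-of-magnitude estimate p_n ~ n! / N**(3*nu*(n-1)); prefactors dropped."""
    return factorial(n) / N ** (3 * nu * (n - 1))

# Illustrative values (assumptions): N = 1000 segments, nu = 1/3 (compact globule).
for n in (2, 3, 4):
    print(n, n_body_scaling(n, N=1000, nu=1 / 3))
```

Under these assumed values each additional body in the contact lowers the estimate by roughly three orders of magnitude, which is the statistical difficulty described above.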

Since the polymer physics idea was first used to explore the physical characteristics of chromosomes as long polymer chains confined in a small nuclear space [28, 31–38], computational strategies of incorporating genomic constraints from experimental measurements and epigenomic information into polymer-based modeling of 3D chromosome structure have recently gained much traction [39–54]. Many studies, which generate an ensemble of 3D chromosome structures, highlight the heterogeneous and probabilistic nature of chromosome structure [45, 47, 50, 51, 54]. Once an ensemble of structural models of chromosomes is obtained from computational approaches, it is straightforward to count the multi-way chromatin contacts directly from them and to quantify the corresponding contact probabilities. Specifically, recent studies, which have generated 3D chromosome structures based on the strings and binders switch (SBS) model, have shown that the probabilities of triplet chromatin contacts for HoxD and α-globin regions calculated from the structures are in good agreement with those from 3way-4C and Tri-C experiments, respectively [55, 56]. The CHROMatin mIXture (CHROMATIX) model was used to address the functional relevance of many-body contacts among genomic elements enriched at transcriptionally active loci [30].

In this study, we calculate n-body contact probabilities from Hi-C data by using analytic expressions derived from the formalism of the Heterogeneous Loop Model (HLM) [50, 51, 57]. Although the original aim of HLM was to reconstruct 3D chromosome structures from Hi-C [50, 51], it is also possible to calculate the map of n-body contact probabilities without explicitly counting those contacts from the generated chromosome structures. Through comparisons between the multi-way chromatin contacts derived from HLM and those from three separate measurements, namely Tri-C, MC-4C, and SPRITE, we will show that the n-body contact probability maps are in good agreement with experimental measurements (Results). We also explore the relation between pairwise and higher-order contacts and discuss how to avoid the most evident false positives in the analysis of multi-way chromatin contact experiments (Discussions). Finally, the numerical details of training HLM based on Hi-C and the mathematical derivations for contact probabilities are provided in the Methods section.

Results

Polymer theory of n-body chromatin contacts

The chromatin fiber in a genomic region of interest is modeled as a coarse-grained polymer chain composed of N monomers (or sites), each representing a chromatin segment of a prescribed genomic length. We assume that the effective energy landscape of the chromatin can be described by a sum of harmonic restraints on the spatial distances between all monomer pairs,

$$U(\mathbf{r}) = \frac{1}{2}\sum_{i<j} k_{ij}\,(\mathbf{r}_i - \mathbf{r}_j)^{2}, \tag{1}$$

where $\mathbf{r}$ specifies the 3D structure of the polymer chain, and $\mathbf{k}$ is an N-by-N symmetric stiffness matrix of elements k_ij. The so-called Kirchhoff matrix is defined by $\mathbf{K} = \mathbf{D} - \mathbf{k}$, where $\mathbf{D}$ is a diagonal matrix with elements $D_{ii} = \sum_{j} k_{ij}$. We assume that the probability of the chromatin to adopt a particular structure is given by

$$P(\mathbf{r}) = \frac{1}{Q}\exp\left[-\frac{U(\mathbf{r})}{k_BT}\right], \tag{2}$$

where k_BT is our energy unit (the Boltzmann constant times the temperature) and Q is a normalization constant such that the integration of Eq 2 over all possible structures equals 1.
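
The relation between the stiffness matrix, the Kirchhoff matrix and the harmonic energy can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names and the toy 4-monomer stiffness matrix are hypothetical, energies are measured in units of k_BT, and the prefactor 1/2 in the energy is assumed so that it matches the expression for γ_ij used below.

```python
import numpy as np

def kirchhoff(k: np.ndarray) -> np.ndarray:
    """K = D - k, with D_ii = sum_j k_ij (k symmetric with zero diagonal)."""
    return np.diag(k.sum(axis=1)) - k

def energy(k: np.ndarray, r: np.ndarray) -> float:
    """U = (1/2) * sum_{i<j} k_ij |r_i - r_j|^2 for coordinates r of shape (N, 3)."""
    diff = r[:, None, :] - r[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)              # squared distances of all pairs
    return 0.5 * np.triu(k * d2, 1).sum()      # keep each pair i < j once

# Toy example (assumption): 4 monomers, unit backbone springs plus one loop (0, 3).
N = 4
k = np.zeros((N, N))
for i in range(N - 1):
    k[i, i + 1] = k[i + 1, i] = 1.0
k[0, 3] = k[3, 0] = 0.5
K = kirchhoff(k)
r = np.random.default_rng(0).normal(size=(N, 3))

# The pair sum equals the quadratic form (1/2) * trace(r^T K r).
print(energy(k, r), 0.5 * np.trace(r.T @ K @ r))
```

The two printed numbers agree, which is why the structural ensemble is a multivariate Gaussian fully characterized by the Kirchhoff matrix.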

The primary assumption of HLM (Eq 1) results in a probability distribution of the physical distance between any (i, j) monomers, r_ij [50, 58],

$$P(r_{ij}) = 4\pi r_{ij}^{2}\left(\frac{\gamma_{ij}}{\pi}\right)^{3/2}\exp\left(-\gamma_{ij} r_{ij}^{2}\right), \tag{3}$$

where γ_ij is a positive-valued function of the K-matrix. More specifically, γ_ij = (σ_ii + σ_jj − 2σ_ij)⁻¹/2 for i > 0 and γ_ij = σ_jj⁻¹/2 for i = 0, in which σ_ij denotes the (i, j)-th element of the inverse matrix of K. To assess our assumption, we compare Eq 3 with the fluorescence in situ hybridization (FISH) data recently reported by Takei et al. [24]. By using the DNA seqFISH+ method, they measured the 3D coordinates of 2,460 loci spaced approximately 1 Mb apart across the whole genome, together with an additional 60 consecutive loci at 25 kb resolution on each chromosome, in 446 mouse embryonic stem cells (Fig 1A).
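
A minimal sketch of how γ_ij could be obtained from a stiffness matrix is given below. It rests on two assumptions that are not spelled out in this section: the energy carries the prefactor 1/2 in units of k_BT, and the translational freedom of the chain is removed by pinning monomer 0 at the origin, so that only the reduced Kirchhoff matrix (without row and column 0) is inverted. The function name and the toy chain are hypothetical.

```python
import numpy as np

def gamma_matrix(k: np.ndarray) -> np.ndarray:
    """gamma_ij = 1 / (2 (sigma_ii + sigma_jj - 2 sigma_ij)), monomer 0 pinned at the origin."""
    N = k.shape[0]
    K = np.diag(k.sum(axis=1)) - k              # Kirchhoff matrix (singular as a whole)
    sigma = np.zeros((N, N))
    sigma[1:, 1:] = np.linalg.inv(K[1:, 1:])    # row/column 0 stay zero (pinned monomer)
    d = np.diag(sigma)
    var = d[:, None] + d[None, :] - 2 * sigma   # per-axis variance of r_i - r_j
    gamma = np.zeros((N, N))
    off = ~np.eye(N, dtype=bool)
    gamma[off] = 0.5 / var[off]
    return gamma

# Ideal chain of 5 monomers with unit springs: the variance grows linearly with |i - j|.
N = 5
k = np.zeros((N, N))
idx = np.arange(N - 1)
k[idx, idx + 1] = k[idx + 1, idx] = 1.0
g = gamma_matrix(k)
print(g[0, 1], g[0, 4], g[1, 3])
```

For this ideal chain γ_ij reduces to 1/(2|i − j|), so distant pairs have broader distance distributions, as expected for a random walk.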

Fig 1. Chain organization of chr3 of mouse ES cells measured in the DNA seqFISH+ experiment [24].

(A) 3D positions of 151 imaged loci spaced about 1 Mb apart (large empty dots) and 60 consecutive loci with an equal space of 25 kb (small solid dots) in one of 446 sample cells. (B) Angular correlation of the chromatin segments at 25 kb resolution. (C) Distribution of the physical distance of four loci pairs at 1 Mb resolution (markers) with their fittings to our theory (lines, Eq 3), which are rescaled by using the fitting parameter (γ_ij) and plotted in the inset. (D) Rescaled pairwise distance distributions of all loci pairs, which are colored by their genomic lengths. (E, F) Similar as (C, D) but at 25 kb resolution.

https://doi.org/10.1371/journal.pcbi.1009669.g001

At 1 Mb resolution (Fig 1C), the distribution of the physical distances between four loci pairs on chromosome 3 (distinguished by markers of different colors) can be well fitted by Eq 3. When we replot the experimental data with a scaled distance ($\tilde{r}_{ij} = \gamma_{ij}^{1/2} r_{ij}$), all rescaled data are distributed around a master curve

$$P(\tilde{r}) = \frac{4}{\sqrt{\pi}}\,\tilde{r}^{2}\exp\left(-\tilde{r}^{2}\right), \tag{4}$$

which is shown as the black solid line in the inset of Fig 1C. Taking all 151 imaged loci on chr3 as a test case, we analyzed the distance distributions of all intra-chromosome loci pairs (i.e., 151 × 150/2 = 11,325 pairs), calculated γ for each pair, and plotted the rescaled data. As shown in Fig 1D, regardless of the genomic distance between these loci (labeled by different colors of the dots), data collected over the whole chromosome lie around the master curve, lending support to the validity of Eq 3. This also holds when analyzing the distance distribution of loci at a finer resolution of 25 kb (Fig 1E and 1F).
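
The collapse onto a single master curve can be illustrated with synthetic data. The sketch below does not use the seqFISH+ measurements; it draws distances from Gaussian separation vectors with three hypothetical γ values, rescales them by γ^(1/2), and compares the pooled histogram with the curve (4/√π) x² exp(−x²).

```python
import numpy as np

def master_curve(x):
    """P(x) = (4/sqrt(pi)) x^2 exp(-x^2), the rescaled form of the distance distribution."""
    return 4 / np.sqrt(np.pi) * x**2 * np.exp(-x**2)

rng = np.random.default_rng(1)
rescaled = []
for gamma in (0.2, 1.0, 5.0):                      # hypothetical gamma values
    # Each Cartesian component of the separation vector has variance 1/(2*gamma).
    vec = rng.normal(scale=np.sqrt(0.5 / gamma), size=(100000, 3))
    r = np.linalg.norm(vec, axis=1)
    rescaled.append(np.sqrt(gamma) * r)            # rescaled distance
rescaled = np.concatenate(rescaled)

hist, edges = np.histogram(rescaled, bins=30, range=(0, 3), density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
max_dev = np.abs(hist - master_curve(centers)).max()
print(f"max deviation from master curve: {max_dev:.3f}")
```

With real data, γ for each pair would instead be fitted to the measured distances, and deviations from the master curve would indicate loci pairs for which the Gaussian assumption is poor.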

Next, the cross-linking probability function F(r) describes how likely two chromatin fragments are to be captured by cross-linking agents if they are separated by a distance r. Based on F(r), one can count the contact frequency for any monomer pair in a given structural ensemble, as well as the higher-order contact frequency among any group of n monomers (2 < n ≤ N). The higher-order n-body contact is formed when n monomers (genomic segments) are simultaneously in spatial proximity and within a capture radius of the cross-linking agent. The n-body contact probability, e.g., the 3-body (n = 3, triplet) contact probability between monomers i, j, and k, p_ijk, can thus be calculated by integrating, over all possible chromatin structures, the probability of a particular chromatin structure (Eq 2) multiplied by the chance to form simultaneous cross-links among the n monomers in that structure. Thanks to the special form of the energy function (Eq 1), HLM yields analytic formulae. Specifically, when the effective capture radius of the cross-linking agent is denoted by r_c, we obtain the following results:

The pairwise contact probability p_ij is given by Eq 5, with γ_ij defined in Eq 3.
The triplet contact probability p_ijk is given by Eq 6, which is expressed through a = σ_ii + σ_jj − 2σ_ij, b = σ_ij + σ_jk − σ_ik − σ_jj, c = σ_jj + σ_kk − 2σ_jk, and z = ac − b².
The 4-body contact probability p_ijkl is given by Eq 7, which involves a 3 × 3 matrix W₄ together with d = σ_jk + σ_kl − σ_jl − σ_kk, e = σ_kk + σ_ll − 2σ_kl, and f = σ_ik + σ_jl − σ_il − σ_jk. The parameters a, b, c are identical to those in Eq 6.

Readers are referred to the Methods section for detailed derivations. As briefly described above, these results are equivalent to reconstructing an ensemble of 3D chromatin structures and then explicitly counting the n-body contact frequencies in the ensemble.
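
To show how a, b, c and z enter a closed-form result, the sketch below evaluates pairwise and triplet contact probabilities for one specific, assumed choice of cross-linking function, F(r) = exp(−κ r²), applied to all three pairs of a triplet. The parameter κ is a hypothetical stand-in for the capture radius (κ plays the role of 1/r_c² up to a numerical factor), monomer 0 is assumed to be pinned at the origin, and the prefactors need not coincide with those of Eqs 5 and 6 in the Methods. Under these assumptions a Gaussian integral gives p_ij = (1 + 2κa)^(−3/2) and p_ijk = [1 + 4κ(a + b + c) + 12κ²z]^(−3/2).

```python
import numpy as np

def covariance(stiffness):
    """Per-axis covariance sigma of monomer positions, monomer 0 pinned (assumption)."""
    N = stiffness.shape[0]
    K = np.diag(stiffness.sum(axis=1)) - stiffness
    sigma = np.zeros((N, N))
    sigma[1:, 1:] = np.linalg.inv(K[1:, 1:])
    return sigma

def p_pair(sigma, i, j, kappa):
    """Pairwise contact probability for the assumed F(r) = exp(-kappa r^2)."""
    a = sigma[i, i] + sigma[j, j] - 2 * sigma[i, j]
    return (1 + 2 * kappa * a) ** -1.5

def p_triplet(sigma, i, j, k, kappa):
    """Triplet contact probability with F applied to the pairs (i,j), (j,k) and (i,k)."""
    a = sigma[i, i] + sigma[j, j] - 2 * sigma[i, j]
    b = sigma[i, j] + sigma[j, k] - sigma[i, k] - sigma[j, j]
    c = sigma[j, j] + sigma[k, k] - 2 * sigma[j, k]
    z = a * c - b * b
    return (1 + 4 * kappa * (a + b + c) + 12 * kappa**2 * z) ** -1.5

# Toy 6-monomer chain (assumption): unit backbone springs plus a loop between 0 and 4.
N = 6
stiffness = np.zeros((N, N))
idx = np.arange(N - 1)
stiffness[idx, idx + 1] = stiffness[idx + 1, idx] = 1.0
stiffness[0, 4] = stiffness[4, 0] = 2.0
sigma = covariance(stiffness)
kappa = 0.5   # hypothetical cross-linking sharpness
print(p_pair(sigma, 0, 4, kappa), p_triplet(sigma, 0, 2, 4, kappa))
```

Here a and c are the per-axis variances of r_j − r_i and r_k − r_j, and b is their covariance, so z is the determinant of the 2 × 2 covariance matrix of the two consecutive separation vectors. No structural ensemble has to be generated to evaluate these expressions.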

The stiffness matrix has to be determined before calculating the n-body contact probabilities and predicting specific multi-way chromatin interactions. As illustrated in Fig 2, the stiffness matrix can be determined from Hi-C data, i.e., by minimizing a cost function that quantifies the difference between the pairwise contact probabilities of the model, which depend on the stiffness matrix, and those from the pairwise contact experiment (see the Methods section for the details). We next compare our predictions of many-body chromatin contacts with those from measurements found in the literature. Detailed information on which genomic regions are modeled, the cell types, the model resolutions and the experiment sources is provided in Table A in S1 Appendix.
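
The fitting idea can be sketched on a toy problem. Everything specific in the example below is an assumption made for illustration: the cost is taken to be the mean squared difference of log10 contact probabilities, the cross-linking function is F(r) = exp(−κ r²), the "Hi-C" map is synthetic, and only a single loop stiffness is scanned on a grid. The actual cost function and optimizer of HLM are described in the Methods and in the software package.

```python
import numpy as np

def model_contacts(stiffness, kappa):
    """Pairwise contact probabilities of the model for the assumed F(r) = exp(-kappa r^2)."""
    N = stiffness.shape[0]
    K = np.diag(stiffness.sum(axis=1)) - stiffness
    sigma = np.zeros((N, N))
    sigma[1:, 1:] = np.linalg.inv(K[1:, 1:])      # monomer 0 pinned (assumption)
    d = np.diag(sigma)
    var = d[:, None] + d[None, :] - 2 * sigma
    return (1 + 2 * kappa * var) ** -1.5

def cost(stiffness, p_hic, kappa):
    """Mean squared difference of log10 contact probabilities over pairs i < j."""
    iu = np.triu_indices_from(p_hic, k=1)
    p_model = model_contacts(stiffness, kappa)
    return np.mean((np.log10(p_model[iu]) - np.log10(p_hic[iu])) ** 2)

def chain_with_loop(strength, N=6):
    """Unit backbone springs plus one loop of the given stiffness between sites 0 and 4."""
    s = np.zeros((N, N))
    idx = np.arange(N - 1)
    s[idx, idx + 1] = s[idx + 1, idx] = 1.0
    s[0, 4] = s[4, 0] = strength
    return s

# Synthetic "Hi-C" map generated with loop stiffness 2.0, then recovered by a grid scan.
p_hic = model_contacts(chain_with_loop(2.0), kappa=0.5)
grid = np.linspace(0.0, 4.0, 41)
costs = [cost(chain_with_loop(s), p_hic, kappa=0.5) for s in grid]
best = grid[int(np.argmin(costs))]
print("recovered loop stiffness:", best)
```

A realistic fit updates all N(N − 1)/2 stiffness elements rather than one, but the principle is the same: the matrix is adjusted until the model's pairwise map matches the measured one.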

Fig 2. Flow chart of HLM that is enclosed by the dashed orange box.

It takes two-body contacts from Hi-C as input, and calculates n-body (n > 2) contact probabilities. The stiffness matrix is updated until the algorithm finds the matrix that minimizes the cost function, which quantifies the difference between the pairwise contact probabilities of the Hi-C data and those determined from the stiffness matrix.

https://doi.org/10.1371/journal.pcbi.1009669.g002

Comparison with Tri-C

Our first case study for our HLM-based theoretical predictions (see Eq 6) concerns the mouse α-globin locus, an extensively studied model system [61], whose three-body contacts are available from Tri-C measurements [17]. The globin genes are flanked by several CTCF-binding sites (the 3rd row in Fig 3A) in both erythroid and embryonic stem (ES) cells. The Capture-C heatmaps of the two cell types (Fig 3B; the top and bottom panels are for the ES and erythroid cells, respectively) show that erythroid cells display more frequent contacts (stronger interactions) between the α-globin genes (Hba-a1/2) and the five upstream enhancer elements (R1, R2, R3, R4 and Rm).

Fig 3. Three-body contacts predicted with HLM versus those from Tri-C.

(A) Gene notation, DNase hypersensitive sites, CTCF-binding sites [59, 60] and genomic regions of interest (sites of interest, SOI) at mouse α-globin locus on chr11. (B) log₁₀(p_ij) from Capture-C (top triangle) and HLM (bottom triangle) at 2-kb resolution. (C, D) log₁₀(p_ijk) from Tri-C (top) and HLM (bottom) at k = R2 and k = HS-39, whose positions are labeled with black strips. The Pearson correlations between Tri-C and HLM are 0.80 (ES) and 0.75 (erythroid) for k = R2, and 0.89 (ES) and 0.85 (erythroid) for k = HS-39. (E) Cell-line difference of the predicted three-body contacts among the SOIs with respect to the viewpoints R2, HS-39 and (F) Hba-a1, HS+44.

https://doi.org/10.1371/journal.pcbi.1009669.g003

The stiffness matrices of HLM, obtained separately for both cell lines based on Hi-C data, are used not only to calculate p_ij, which captures the variation of local pairwise contacts upon ES-to-erythroid cell differentiation (Fig 3B), but also to predict triplet contacts (p_ijk) at different genomic loci in a cell-type specific manner (Fig 3C and 3D). We calculated the two maps of triplet contact probability (i) at the genomic locus of the strongest enhancer R2 (Fig 3C) and (ii) at the upstream CTCF-binding site HS-39 (Fig 3D). The Pearson correlations (PC) between the HLM-predicted triplet contacts and those from Tri-C data are generally high (PC ≈ 0.75–0.89; see the caption of Fig 3 for details). In particular, for the triplet contact probability map calculated at the HS-39 viewpoint, the correlations are comparable to those reported by a simulation study using the SBS model (PC ≈ 0.8) [56]. We further calculated three correlation coefficients specifically designed to compare contact matrices, namely the distance-corrected Pearson correlation [46], the stratum-adjusted correlation [62] and the stratified Pearson correlation (S1 Fig). As summarized in Table B in S1 Appendix, overall, HLM matches the Tri-C experiment better than the SBS model does.
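
A Pearson correlation between a predicted and a measured viewpoint-anchored map can be computed as sketched below. The example uses synthetic maps, and two choices are assumptions rather than documented details of the analysis: the correlation is taken on log10 values, and only entries with i < j that are positive in both maps are compared.

```python
import numpy as np

def map_correlation(pred, meas):
    """Pearson correlation of log10 values over entries (i < j) positive in both maps."""
    iu = np.triu_indices_from(pred, k=1)
    x, y = pred[iu], meas[iu]
    ok = (x > 0) & (y > 0)                    # skip empty bins, common in sparse data
    return np.corrcoef(np.log10(x[ok]), np.log10(y[ok]))[0, 1]

rng = np.random.default_rng(0)
n = 40
sep = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) + 1.0
pred = sep ** -1.5                                    # synthetic "model" map
meas = pred * rng.lognormal(sigma=0.5, size=(n, n))   # synthetic noisy "experiment"
print(round(map_correlation(pred, meas), 2))
```

Because both maps decay strongly with genomic separation, part of any such correlation reflects the distance dependence alone, which motivates the distance-corrected and stratified coefficients mentioned above.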

The comparison between triplet contact probabilities in ES cells over all sites of interest (SOIs) and those in erythroid cells underscores the change in higher-order contacts upon cell differentiation (Fig 3E). Upon the activation of α-globin gene (ES → erythroid cell), R2 simultaneously interacts with the gene promoters and its nearby enhancers especially R1 more frequently, which suggests the formation of a regulatory hub. HS-39, HS-38 and distal downstream CTCF binding sites are not involved in this hub, but form diffuse interactions in between. These findings from the HLM-predicted triplet contacts are consistent with the Tri-C analyses conducted at R2 and HS-39 [17] (see S2 Fig).

We made additional comparisons at the globin gene promoter Hba-a1 and the downstream CTCF boundary element HS+44, whose triplet contact probabilities have not yet been measured (Fig 3F). At the viewpoint Hba-a1, the promoter interacts with upstream enhancers in a cooperative manner, which supports the above analyses. By contrast, in the presence of contact with a downstream boundary element, Hba-a1 becomes less likely to interact with another downstream boundary site in erythroid cells.

Comparison with multi-contact 4C sequencing (MC-4C)

Next, we study the formation of higher-order contacts among CTCF binding sites in cells lacking the cohesin release factor WAPL [63]. According to the loop extrusion model [64–66], the ring-shaped protein complex, cohesin, binds to DNA and progressively enlarges a chromatin loop until the dynamics is hindered by two convergently oriented CTCF-bound sites. Knockout of WAPL promotes chromatin loop extension [63].

We carried out HLM calculation for a 1.28-Mb genomic region on chromosome 8, which contains 13 CTCF-binding sites, for both wild-type (WT) and WAPL-deficient (ΔWAPL) human chronic myeloid leukemia (HAP1) cells, whose triplet contacts were measured with MC-4C [18]. Although the Hi-C dataset (Fig 4A, top) is noisier than that used in the first case study (Fig 3B, top), the HLM-predicted triplet contacts at the CTCF binding sites E (Fig 4B) and K (Fig 4C) show patterns similar to those from the MC-4C measurement [18]. To account for the enrichment of long-range triplet contacts comprised of distal CTCF binding elements (e.g., the triplet A, H and K) in the WAPL-lacking cells, Allahyar et al. [18] put forward a “traffic jam” of CTCF roadblock-trapped cohesin complexes.

Fig 4. Comparing many-body contact predictions with MC-4C.

(A) Comparing log₁₀(p_ij) from Hi-C and HLM at 10-kb resolution of a 1.28 Mb region on chr8, where there are thirteen CTCF-binding sites labeled from A to M. The genomic positions of forward- and backward-oriented CTCF sites are marked with red and blue sticks, respectively. (B, C) Comparing log₁₀(p_ijk) from MC-4C and HLM at a viewpoint of site E and K, respectively. From the viewpoint of E (K), the values of PC between the model and the MC-4C experiment are 0.83 (0.81) and 0.71 (0.73) for the WT and ΔWAPL cells, respectively. (D) Cell-line difference of the predicted three-body contacts among the CTCF-binding sites from the viewpoint of A, E and (E) H, K, respectively. (F, G) Cell-line difference of four- and five-body contacts from double- and triple-anchored viewpoints, whose positions are marked with the black strips. The numbers label the maximum amplitudes of changes.

https://doi.org/10.1371/journal.pcbi.1009669.g004

Examining the WAPL knockout-induced change in triplet contacts among CTCF binding sites, we plot the changes of p_ijk at four viewpoints, k = E, K and A, H in Fig 4D and 4E, respectively. While site A is colocalized more frequently with the distant downstream sites K and L in ΔWAPL cells, its interaction with the nearby loci (sites B-G) is reduced significantly (e.g., p_ABC is reduced by more than 60% from 1.1 × 10⁻³ to 4.2 × 10⁻⁴). Similar patterns of “long-range enrichment and short-range depletion of triplet contacts” are identified for those from other upstream and downstream viewpoints (e.g., the sites E and K). Although the MC-4C measurement [18] did not highlight the short-range depletion of triplet contacts, our finding is consistent with the contact frequencies measured with respect to the sites E and K (S3(A) Fig). In fact, the two Hi-C datasets, one for WT and the other for ΔWAPL cells, show that the pairwise contact frequencies of ΔWAPL cells in the domains flanked by short-range CTCF binding sites are smaller than those of the WT (S3(B) Fig), which directly confirms the short-range depletion of contacts. The observation of long-range enrichment and short-range depletion of triplet contacts does not always hold; the central sites H and I are two exceptions that do not show depletion of short-range triplet contacts.

The aggregation of cohesin molecules has been identified in ΔWAPL cells with super-resolution imaging [18]; however, it is not known whether a similar type of aggregation occurs among CTCF-bound chromatin sites. To explore the possibility of condensation of CTCF-bound chromatin sites, we calculate higher-order chromatin contacts in both WT and ΔWAPL cells. The changes in 4-body and 5-body contacts among CTCF-bound sites from WT to ΔWAPL cells, seen from double- and triple-anchored viewpoints, are all positive (Fig 4F and 4G), which lends support to the picture of clustering of domain boundaries in ΔWAPL cells. Although their probabilities are small, up to 4-fold increases of 4-body and 5-body contacts are identified among the SOIs (see S4 Fig).

Comparison with SPRITE

Lastly, we model a 0.45 Mb genomic region of GM12878 cells (Fig 5A and 5B), which contains the transcriptionally active gene BCL2 and multiple super-enhancers annotated by Perez-Rathke et al. [30], and compare the predicted triplet contacts with the results from SPRITE [26]. The human BCL2 gene encodes a membrane protein which blocks apoptotic death in B-lymphoblastoid cells.

Fig 5. Comparing triplet contact predictions with SPRITE.

(A) Genes, RNA-seq and H3K27ac ChIP-seq signals [60, 67] in a 0.45 Mb region on human chr18 with the positions of three types of SOIs (Enhancers, Promoters and Super-enhancers). (B) log₁₀(p_ij) from Hi-C compared with that from HLM at 5-kb resolution. (C) The triplet contact frequency from SPRITE is compared with log₁₀(p_ijk) from HLM anchored at an active promoter site. (D) SPRITE frequency of triplets which are divided into four quantiles based on ascending order of p_ijk. (E) For three sites, i < j < k, along a polymer chain, the minor and major sections denoted by m and t, respectively, are depicted with dashed lines. (F) The expected three-body contact probability, and (G) the specificity Z-score Z_ijk. (H) The mean specificity Z-score of triplet contacts with respect to all possible combinations of annotations (E: Enhancers, P: Promoters, S: Super-enhancers, #: sites without any annotation). The error bars are the standard deviations.

https://doi.org/10.1371/journal.pcbi.1009669.g005

We expect the similarity between the model and the experiment to be lower than in the foregoing two case studies, because the data from SPRITE, designed for a genome-wide study, are sparse at 5-kb resolution, and the chromatin fragments cocaptured by SPRITE are not necessarily in spatial proximity, since SPRITE does not distinguish between direct and indirect cross-links. Indeed, a weak correlation is found in the triplet contacts anchored at the active BCL2 gene promoter (PC = 0.18; see Fig 5C). To validate our model against the measurements, we sort the triplets in ascending order of their predicted contact probabilities and divide them into four quantiles. The frequency of the quantilized triplets in SPRITE then increases with the predicted probability, p_ijk (Fig 5D).
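
The quantile analysis can be sketched as follows. The data here are synthetic stand-ins (log-normal "predicted" probabilities and sparse Poisson "observed" counts), chosen only to mimic a situation where the per-triplet correlation is weak but the trend across quantiles is clear; the function name is hypothetical.

```python
import numpy as np

def quartile_means(predicted, observed, n_groups=4):
    """Mean observed frequency within groups of triplets ranked by predicted probability."""
    order = np.argsort(predicted)             # ascending predicted probability
    return np.array([observed[g].mean() for g in np.array_split(order, n_groups)])

rng = np.random.default_rng(0)
predicted = rng.lognormal(size=2000)          # synthetic stand-in for p_ijk
observed = rng.poisson(0.5 * predicted)       # sparse synthetic read counts
print(np.corrcoef(predicted, observed)[0, 1]) # per-triplet correlation
print(quartile_means(predicted, observed))    # group means by predicted rank
```

Averaging within quantiles suppresses the counting noise of individual triplets, which is why a monotonic trend can remain visible even when the per-triplet correlation is modest.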

To identify the specificity in a single cell type, we defined an expected triplet contact probability ⟨p_ijk⟩ (i < j < k) as the mean contact probability of all triplets that share the same “sections” along the chain (Fig 5E). The minor and major sections are determined as m = min(|j − i|, |k − j|) and t = |k − i|, respectively. ⟨p_ijk⟩ depends only on m and t (Fig 5F), which reflects the chain connectivity and the global compactness of chromatin. The difference between the observed and expected probabilities, scaled by the standard deviation, is then defined as the specificity Z-score, Z_ijk = (p_ijk − ⟨p_ijk⟩)/SD, for each triplet (Fig 5G). The triplet contacts are deemed specific if Z_ijk > 1.
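
One way to implement this normalization is sketched below. It assumes that both the expected value and the standard deviation are taken over all triplets sharing the same (m, t), which is an interpretation of the description above; the toy probabilities and the artificially boosted triplet are hypothetical.

```python
import numpy as np
from collections import defaultdict
from itertools import combinations

def specificity_z(p, tol=1e-9):
    """p: dict {(i, j, k): probability} with i < j < k. Returns a dict of Z-scores."""
    groups = defaultdict(list)
    for (i, j, k), val in p.items():
        groups[(min(j - i, k - j), k - i)].append(val)      # key = (m, t)
    stats = {key: (np.mean(v), np.std(v)) for key, v in groups.items()}
    z = {}
    for (i, j, k), val in p.items():
        mean, sd = stats[(min(j - i, k - j), k - i)]
        # Guard against a numerically zero spread (all triplets in the group equal).
        z[(i, j, k)] = (val - mean) / sd if sd > tol * abs(mean) else 0.0
    return z

def chain_p(i, j, k):
    """Toy probability that depends on (m, t) only, as for a homogeneous chain."""
    m, t = min(j - i, k - j), k - i
    return (1 + t + 0.75 * m * (t - m)) ** -1.5

p = {trip: chain_p(*trip) for trip in combinations(range(12), 3)}
z_homogeneous = specificity_z(p)
p[(3, 6, 9)] *= 5.0                      # hypothetical specific enrichment of one triplet
z = specificity_z(p)
print(max(abs(v) for v in z_homogeneous.values()), z[(3, 6, 9)])
```

For the homogeneous chain every Z-score vanishes, whereas the boosted triplet stands out with Z > 1, which is the behavior the criterion is designed to detect.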

Z_ijk averaged over all possible combinations of functional annotations in the region of interest is shown in Fig 5H. The concurrent triplet contacts between two super-enhancers and a promoter (PSS) or a third super-enhancer (SSS) are specifically enriched at this locus. This comports well with the finding of Perez-Rathke et al. that multiple super-enhancers are involved in higher-order chromatin contacts more frequently than other elements [30]. We confirm similar results from modeling at another active locus (S5 Fig).

Discussions

Relation between pairwise and triplet contacts

For a given polymer chain, is there any simple relation to associate two-body contacts with three-body contacts? Through both experimental and theoretical studies, there has been much effort to address this question and related issues in the context of concurrent chromatin interactions [14, 15, 18, 19, 68].

Here, we address this issue in a principled way by considering a Gaussian polymer chain without any specific interaction as a reference system. In the heatmaps of n-body contact probability of the Gaussian polymer chain consisting of N monomers, the enrichments of many-body contacts are observed near the diagonal part of the N × N matrix as well as around the viewpoints (Fig 6A–6C). For any triplet ijk (i < j < k), the contact probability p(m, t) is given by Eq 8 (see Eq 26 for the derivation), where m = min(|j − i|, |k − j|) is the minor section, t = |k − i| is the major section of the triplet (Fig 5E), and γ is the stiffness constant which restrains the consecutive sites. For t ≫ m, it follows that p_ijk ∼ t^−3/2, the scaling exponent of which is identical to that of the two-body contact probability [28, 29]. For a fixed t, p(m, t) changes non-monotonically as a function of m, having its minimum and maximum at m = t/2 and m = 1, respectively (Fig 6B). According to Eq 8, the specificity Z-score defined in the subsection Comparison with SPRITE satisfies Z_ijk = 0 for all triplets in a Gaussian chain, as anticipated.
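
These two properties can be checked numerically for one concrete choice of cross-linking function. The sketch assumes F(r) = exp(−κ r²) with a hypothetical sharpness κ, for which the triplet probability of an ideal chain with sections of length s and t − s is [1 + 4κ(a + c) + 12κ²ac]^(−3/2), where a = s/(2γ) and c = (t − s)/(2γ). The prefactors need not coincide with those of Eq 8.

```python
import numpy as np

def p_gauss_chain(s, t, gamma=1.0, kappa=0.5):
    """Triplet probability of an ideal chain with sections s and t - s (assumed Gaussian F)."""
    v = 0.5 / gamma                     # per-axis variance of one bond
    a, c = s * v, (t - s) * v           # variances of the two sections (uncorrelated)
    return (1 + 4 * kappa * (a + c) + 12 * kappa**2 * a * c) ** -1.5

t = 20
s = np.arange(1, t)                     # position of the middle site, j - i
p = p_gauss_chain(s, t)
print("least likely split:", s[np.argmin(p)], "most likely split:", s[np.argmax(p)])

# Apparent scaling exponent in t at a fixed short section of length 1.
t1, t2 = 1e4, 1e5
slope = np.log(p_gauss_chain(1, t2) / p_gauss_chain(1, t1)) / np.log(t2 / t1)
print("exponent:", round(slope, 3))
```

The probability is symmetric under s → t − s, so the minor section m = min(s, t − s) is the relevant variable; the minimum falls at the symmetric split and the large-t exponent approaches −3/2.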

Fig 6. Many-body contacts in a Gaussian polymer chain consisting of 20 sites.

(A) Two-body contact probability log₁₀(p_ij). (B) Three-body contact probability log₁₀(p_ijk) anchored at the 7-th site (k = 7). (C) Four-body contact probability log₁₀(p_ijkl) double anchored at the 6-th and 15-th sites (k = 6, l = 15). (D-F) Boost factor p_ijk/p_ij p_jk, hub-score H_ijk, and association z-score A_ijk from the viewpoint of the 7-th site.

https://doi.org/10.1371/journal.pcbi.1009669.g006

An enhancement of triplet contact compared to two independent pairwise loops was discussed by Polovnikov et al. [68], who used the delta function for the cross-linking probability between genomic sites, such that sites were considered in contact only if they overlapped with each other. It was found that p_ijk/p_ij p_jk ≥ 1 with a maximum at m = t/2. This finding, however, is not universally held if one considers other types of cross-linking probability. When Gaussian and Heaviside step functions are used as the cross-linking probabilities, triplet contact probabilities are always suppressed in comparison with the product of two binary contact probabilities, i.e., p_ijk/p_ij p_jk ≤ 1, and the largest suppression occurs at m = t/2 (Fig 6D and S6(D) Fig).
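
The suppression can be verified explicitly for one concrete case. The following is a sketch under the assumption of a Gaussian cross-linking function F(r) = exp(−κr²), with κ a hypothetical sharpness parameter, applied to an ideal chain in which the two sections of the triplet have per-axis variances a and c and are uncorrelated:

```latex
\begin{aligned}
p_{ij} &= (1+2\kappa a)^{-3/2}, \qquad p_{jk} = (1+2\kappa c)^{-3/2},\\
p_{ijk} &= \left[1+4\kappa(a+c)+12\kappa^{2}ac\right]^{-3/2},\\
\frac{p_{ijk}}{p_{ij}\,p_{jk}} &=
\left[\frac{1+2\kappa(a+c)+4\kappa^{2}ac}{1+4\kappa(a+c)+12\kappa^{2}ac}\right]^{3/2} \le 1 .
\end{aligned}
```

At fixed a + c (that is, fixed t), the bracket decreases monotonically with the product ac, which is largest for a = c, so the strongest suppression falls at m = t/2.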

No clear-cut relation exists between pairwise (p_ij) and triplet contact probabilities (p_ijk), which however was the underlying presumption of the data processing in several experiments. For example, a hub score was calculated to identify synergistic interaction hubs in the analysis of 3way-4C data [15], which was defined by H_ijk = 3p_ijk/(p_ij p_jk + p_ij p_ik + p_jk p_ik) approximately, whereas our Gaussian chain model clarifies that the hub score is not a constant (Fig 6E). In fact, as discussed in the experiment [15], enrichment of the score is always found at the triplets close to any viewpoint (e.g., Figs 3 and 4 in Ref. [15]).
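
The position dependence of the hub score in a structureless chain can be illustrated with a short calculation. As before, the sketch assumes the cross-linking function F(r) = exp(−κ r²); the values of the sharpness κ and of the per-axis bond variance are hypothetical, and the hub score is evaluated with the approximate definition quoted above.

```python
KAPPA, V = 0.5, 0.5   # hypothetical cross-linking sharpness and per-axis bond variance

def p2(s):
    """Pairwise contact probability of an ideal chain at genomic separation s."""
    return (1 + 2 * KAPPA * V * s) ** -1.5

def p3(s1, s2):
    """Triplet contact probability for consecutive sections s1 and s2."""
    a, c = V * s1, V * s2
    return (1 + 4 * KAPPA * (a + c) + 12 * KAPPA**2 * a * c) ** -1.5

def hub_score(i, j, k):
    """H_ijk = 3 p_ijk / (p_ij p_jk + p_ij p_ik + p_jk p_ik) for sites i < j < k."""
    pij, pjk, pik = p2(j - i), p2(k - j), p2(k - i)
    return 3 * p3(j - i, k - j) / (pij * pjk + pij * pik + pjk * pik)

view = 7   # viewpoint site, chosen to mirror Fig 6
for i, j in [(5, 6), (1, 6), (1, 2)]:
    print((i, j, view), round(hub_score(i, j, view), 3))
```

The score differs between triplets even though the chain contains no specific interaction, so a raised hub score near a viewpoint cannot by itself be read as evidence of a cooperative hub.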

To detect cooperative (competitive) contacts, an association Z-score, defined as A_ijk = (n_ijk − μ)/s, was computed in the analyses of MC-4C measurements [18, 19]. Here n_ijk is the number of reads cocapturing the genetic sites i, j and k (the viewpoint), and μ and s are the mean and standard deviation of the number of reads capturing the sites j and k but not i. A positive (negative) A_ijk, which implies that the chance of site j being in association with k and any other third site is (dis)favored when site i is interacting with k, was interpreted as a measure of cooperative (competitive) triplet contacts. However, the calculation based on a Gaussian polymer chain shows that adjacent monomer pairs along the chain display greater association Z-scores A_ijk (Fig 6F). This issue of false positives in the short-range cooperativity among multiple sites has been noticed in Ref. [18].

A recent super-resolution imaging study on chromatin fibers in IMR90 cells has explored whether or not the association between two genetic loci facilitates the contact with a third site [14]. For triplets ijk satisfying i < j < k, two conditional probabilities and ( is the contact probability between the k-th and j-th sites under the condition that i-th and j-th sites are in contact, and is similarly defined) were compared with the unconditioned contact probability p_jk. It was found that regardless of cell types and other chemical treatments, a relation was always satisfied among three probabilities, for most triplets of genomic loci [14]. Our theoretical framework allows us to validate this relation for any triplet in a polymer chain (see S7 Fig). The generic characteristics of multi-body contacts of Gaussian chain mentioned above do not depend on the specific form of cross-linking probability (S6 Fig).

As a further test, we computed the average, maximum, and minimum pairwise contact probabilities of the triplets anchored at R2 in the α-globin locus of mouse erythroid cells (S8(B)–S8(D) Fig). Their patterns differ from the results of the Tri-C experiment (S8(A) Fig), which is also reflected in their overall lower Pearson correlations compared with HLM, calculated at two different viewpoints in two different cell types (S8(E) Fig). The relation between pairwise and multi-way contacts is thus more complicated than intuition suggests.

Miscellaneous

The prediction of HLM is not always consistent with experiments. For example, when scrutinizing the WAPL knockout-induced change of triplet contacts at the viewpoint of the CTCF binding site E, an enrichment involving site G can be found in MC-4C (see the left panel of S3 Fig), which is not in agreement with our model prediction (Fig 4D). Many factors can contribute to this discordance, such as the quality of the Hi-C dataset at the locus of interest. Despite much effort devoted to processing raw Hi-C data, any bias that remains in the Hi-C contact matrix after normalization will propagate to the model and to the subsequent predictions of higher-order contacts.

Another apparent missing component of our approach is the excluded volume interaction, which plays an indisputable role in determining polymer behavior. However, as shown in Fig 1, chromatin segment pairs have non-zero probabilities at small physical distances, in contrast to the expectation that segments are not allowed to overlap with each other. We rationalize the neglect of excluded volume by considering a polymer melt that contains 125 polymer chains, each composed of 641 monomers (S9 Fig). The system was simulated considering only chain connectivity and Weeks-Chandler-Andersen type excluded volume interactions [69]. The latter manifest themselves as the zero-valued probability of the bond length (r_i,i+1) and of the intra-chain pairwise monomer distance (r_ij) when r < 1 in units of the monomer diameter (the circular black dots in S9(C) and S9(D) Fig). However, when the melt is coarse-grained such that 8 or 64 consecutive monomers are represented by a single coarse-grained center, the coarse-grained segments overlap with each other. Together with the results shown in Fig 1B, this justifies ignoring the excluded volume and bending penalty beyond certain length scales.

As a further test of the general applicability of HLM for the prediction of n-body contacts, we modeled the β-globin locus of mouse ES cells (S10 Fig) and the Pcdhα locus of mouse neural progenitor cells (S11 Fig) by using the Hi-C data reported by Bonev et al. [70]. The pairwise and predicted triplet contact probabilities at multiple viewpoints are well correlated with the experiments (see the detailed PC coefficients in the figure captions and Table A in S1 Appendix).

Conclusions

To recapitulate, we have derived analytic expressions of n-body contact probabilities (for any n > 2) based on a recent chromatin polymer model (HLM) [50] and developed a method to predict multi-way chromatin contacts from Hi-C data.

First, the predicted triplet chromatin contacts are in reasonable correlation with the results of two independent measurements, (i) Tri-C [17] and (ii) MC-4C [18], and are also in accordance with the genome-wide study using (iii) SPRITE [26]. Besides confirming the experimental findings, the suggested method was used to explore the multi-way chromatin contacts at any viewpoint of a genomic locus of interest, which allowed us to discover some key features not previously underscored in each measurement. (i) For the mouse α-globin locus, a cell-line dependent interaction pattern is found when the viewpoint is anchored at the gene promoter. (ii) Although the previous study highlighted only the enrichment of long-range triplet contacts among CTCF binding sites in the ΔWAPL cells, we find that depletion of short-range contacts occurs for some triplets to compensate for the enrichment. Our calculations also lend support to the aggregates of CTCF-bound boundary elements by explicitly showing the enrichments of four- and five-way chromatin contacts. (iii) Lastly, our analysis captures the enrichment of triplet contacts involving super-enhancers at transcriptionally active loci.

With an increasing contact order n, the probability of forming multi-way contacts decreases by orders of magnitude (see how the range of scale bars changes with increasing n in Fig 6A–6C). Our theoretical approach can be used to circumvent this statistical limitation inherent to experimental detection of multi-way chromatin contacts. All the computer codes discussed here are provided in https://github.com/leiliu2015/HLM-Nbody, so that multi-way contacts can be calculated from an input Hi-C dataset. The methodology developed here will be of great help to elucidate the regulatory roles played by complex chromatin topology.

Methods

Numerical details for determining the stiffness matrix

Our previous numerical procedure [50, 51] to determine the K-matrix has been improved in this paper by adapting those in the recent modeling studies of chromatin by two other groups [71–73]. The best solution consistent with given Hi-C data was determined by optimizing a cost function, L (see Fig 2). Different forms of the cost function can be conceived (L₁, L₂, and L₃), which are defined in terms of the pairwise contact probability observed in Hi-C experiments and the model contact probability p_ij. Here PC(x, y) stands for the Pearson correlation coefficient between x and y, and p_ij can be derived from K based on either Eq 12 for the Gaussian contact probability or Eq. S1 for the contact probability in the form of the Heaviside step function.

To avoid non-physical negative eigenvalues of the Kirchhoff matrix, the interaction strengths were previously required to be non-negative (k_ij ≥ 0) during the minimization [50, 71]. Shinkai et al. have overcome the issue of negative k_ij by alternately updating the backbone and non-backbone interaction strengths [72]. Even if some pairs have k_ij < 0, as long as the resulting K-matrix is positive-definite [72], the potential of mean force between the i-th and j-th monomers, which is proportional to −ln P(r_ij), has a physically meaningful minimum at a finite distance.
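A positive-definiteness check of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, and because a Kirchhoff matrix built from pairwise strengths always has one zero eigenvalue (uniform translation), the check is applied to the matrix with its 0-th row and column removed.

```python
import numpy as np

def kirchhoff_matrix(k):
    """Build K from symmetric interaction strengths k_ij (diagonal ignored)."""
    k = np.array(k, dtype=float)
    np.fill_diagonal(k, 0.0)
    return np.diag(k.sum(axis=1)) - k

def is_positive_definite_reduced(K):
    """True if K without its 0-th row and column is positive-definite."""
    try:
        np.linalg.cholesky(K[1:, 1:])  # succeeds only for positive-definite input
        return True
    except np.linalg.LinAlgError:
        return False

def chain_with_extra_bond(k_03):
    """Four-monomer chain with unit backbone springs plus one bond 0-3."""
    k = np.zeros((4, 4))
    for i in range(3):
        k[i, i + 1] = k[i + 1, i] = 1.0
    k[0, 3] = k[3, 0] = k_03
    return kirchhoff_matrix(k)

weak_negative_ok = is_positive_definite_reduced(chain_with_extra_bond(-0.1))
strong_negative_ok = is_positive_definite_reduced(chain_with_extra_bond(-2.0))
print(weak_negative_ok, strong_negative_ok)
```

In this toy case a weakly negative non-backbone strength is tolerated, whereas a strongly negative one makes the reduced matrix indefinite.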

We compared four different methods to optimize the cost functions: the first method minimizes the cost by random sampling (RS) of the parameter space [72]; the second one updates the parameters via steepest gradient descent (GD) [71]; the third and fourth methods, RMSprop [74] and ADAM [75], calculate both the gradient and its second moment to accelerate the model training [76].
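For readers unfamiliar with ADAM, the update rule can be sketched on a toy problem. The quadratic cost, the learning rate, and the function name `adam_minimize` are illustrative assumptions; this is not the training code used for the K-matrix.

```python
import numpy as np

def adam_minimize(grad, x0, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a cost with the ADAM update rule, given its gradient function."""
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)  # running average of the gradient (first moment)
    v = np.zeros_like(x)  # running average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Toy cost: sum((x - target)**2), whose gradient is 2 * (x - target).
target = np.array([1.0, -2.0, 0.5])
x_opt = adam_minimize(lambda x: 2.0 * (x - target), x0=np.zeros(3))
print(x_opt)
```

Dividing by the square root of the second moment rescales each parameter's step individually, which is the feature that distinguishes RMSprop and ADAM from plain gradient descent.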

As a case study, the decay of the cost function during the model training of a 2.4-Mb region on chr8 in mouse embryonic stem cells is shown in S12(A) Fig. The ADAM optimizer outperforms the other methods significantly. The quality of the final model was assessed by using the stratum-adjusted correlation coefficient (SCC) [62] and the PC between log₁₀(p_ij) from Hi-C and that from the model. As shown in the accompanying Table B in S1 Appendix, while L₁ and L₃ are both good candidates, L₂ seems to be the best choice by these measures. However, the dependence of the contact probability on the subchain size s, P(s), defined as the average of p_ij over all pairs with |i − j| = s, shows that the optimal model based on minimizing L₂ underestimates the overall contacts (S12(B) Fig), although the pattern of log₁₀(p_ij) from the model is highly correlated with that from Hi-C. Similar conclusions can be drawn from training a model of a 10-Mb region on chr5 in GM12878 cells (S13 Fig), which our previous work used as a test case [50]. The models trained with the new schemes clearly achieve better quality (S13(D) Fig).
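The point that a high correlation of log₁₀(p_ij) does not guarantee the correct overall contact level can be illustrated with a small sketch. It assumes that P(s) is the average of p_ij over all pairs with |i − j| = s; the function names and the toy power-law map are hypothetical.

```python
import numpy as np

def contact_vs_separation(p):
    """P(s): mean of p_ij over all pairs with |i - j| = s, for s = 1..N-1."""
    p = np.asarray(p, dtype=float)
    return np.array([np.diagonal(p, offset=s).mean() for s in range(1, p.shape[0])])

def log_pearson(p_model, p_hic):
    """Pearson correlation between log10(p_ij) of two maps, using pairs i < j."""
    p_model = np.asarray(p_model, dtype=float)
    p_hic = np.asarray(p_hic, dtype=float)
    iu = np.triu_indices_from(p_hic, k=1)
    return np.corrcoef(np.log10(p_model[iu]), np.log10(p_hic[iu]))[0, 1]

# Toy "Hi-C" map with p_ij = |i - j|**(-1.5), and a "model" that is uniformly 2x lower.
idx = np.arange(6)
sep = np.abs(idx[:, None] - idx[None, :])
p_hic = np.where(sep > 0, np.maximum(sep, 1) ** -1.5, 1.0)
p_model = 0.5 * p_hic

pc = log_pearson(p_model, p_hic)
ratio = contact_vs_separation(p_model) / contact_vs_separation(p_hic)
print(pc, ratio)
```

Here the correlation is essentially perfect even though the toy model underestimates P(s) by a factor of two at every separation, which is why P(s) is a useful complementary check.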

Taken together, in this study we calculated the K-matrix based on Hi-C data by minimizing L₁ with F₀, or minimizing L₃ with F₁ using the ADAM optimizer, where F₀ and F₁ are two possible functional forms of the cross-linking probability explained below.

Derivations of the multi-body contact probability

Despite many potential biases in Hi-C [13], the frequency of two chromatin fragments being cross-linked in millions of cells is ideally determined by the probability density of their spatial distance, P(r), and the efficiency of the cross-linking agent. The latter contribution can be included through the r-dependent cross-linking probability of fragments, F(r). One can consider using a Gaussian function, F₀(r), for the cross-linking probability, which makes many-body contacts easier to handle mathematically. Alternatively, the Heaviside step function F₁(r) = Θ(r_c − r) is a natural choice of the cross-linking probability, such that fragment pairs are cross-linked if the spatial distance r is smaller than an effective capture radius of the cross-linking agent (r_c).

The contact probabilities based on the two cross-linking probabilities, F₀(r) and F₁(r) are denoted by “p⁽⁰⁾” and “p⁽¹⁾”, respectively. Unless stated otherwise, the results shown in the main text were calculated with F₀(r), and “p” was used as a shorthand notation of “p⁽⁰⁾” for brevity.

Here we present the derivation of n-body contact probability based on the Gaussian cross-linking probability, F₀(r). The derivation based on the Heaviside step function, F₁(r), is provided in S1 Appendix.

Generic many-body contact probability.

Based on Eqs 1 and 2, if we set , the many-body contact probability among n (≥ 2) monomers in a given set can be directly calculated as (10) where , P(r) is the probability density of the chain configuration (r) and is the enumeration of all the pair-wise combinations of the elements in . Δ is a matrix of elements Δ_uv whose values are given by (11) in which u, v ∈ (0, 1, 2, …, N − 1). The K and Δ with subscript “0” in Eq 10 signifies the matrices whose 0-th row and column are removed from the original ones.

Pairwise contact probability.

To avoid computing determinants of large matrices in the general solution (Eqs 10 and 11), p_n can alternatively be calculated based on n-body correlation function. For example, considering the probability density of the pairwise distance between the i-th and j-th monomer in three-dimensional space (Eq 3), the pairwise contact probability between the i-th and j-th monomers can be formulated as [71, 72] (12) where Eq 3 and is used in the second row.

Three-body contact probability.

For three-body contact probability p_ijk, we first consider the probability density of the distances between the i-th and j-th monomers, and between the j-th and k-th monomers projected on one dimension (13) where k = (k₁, k₂), is a matrix with elements (14) and (15) Remember that σ is the element of K⁻¹, and σ_ij = 0 if i × j = 0. Then, the three-body contact probability in 3D is given by (16)

Four-body contact probability.

Four-body contact probability p_ijkl can be derived in the same manner. By defining u_x = (x_ij, x_jk, x_kl)^T, the one dimensional probability density of three pairwise distances, P(x_ij, x_jk, x_kl), is given by (17) where now k = (k₁, k₂, k₃). W₄ has a matrix form of (18) with elements in Eq 14 and (19) Then, from Eq 17, one gets (20) where (21)

Conditional pairwise contact probability.

With super-resolution tracing, Bintu et al. examined whether the association between two chromatin loci facilitates or prevents the contact with a third locus [14]. More specifically, given three monomers of indices i < j < k, two conditional probabilities p(r_jk ≤ r_c ∣ r_ij ≤ r_c) and p(r_jk ≤ r_c ∣ r_ij > r_c) were compared with the marginal one p(r_jk ≤ r_c) (see Fig 5 in Ref. [14]). They found “cooperativity”, namely p(r_jk ≤ r_c ∣ r_ij > r_c) < p(r_jk ≤ r_c) < p(r_jk ≤ r_c ∣ r_ij ≤ r_c), among more than 80% of the triplets of CTCF-bound and generic loci, as well as little effect of cohesin depletion induced by auxin treatment.

Assuming F₀(r), we next aim at deriving the conditional contact probabilities of the j- and k-th monomers given that the j-th monomer is in contact with the i-th monomer (p_jk∣ij), or given that the j-th monomer is not in contact with the i-th monomer (p_jk\ij), with an ordering of indices i < j < k. The first conditional probability can be calculated based on the relation that p_jk∣ij is the joint probability of the ij and jk contacts divided by p_ij, in which the joint probability is (22) with non-zero elements , or equivalently (23) where we have applied Eqs 13 and 14. By using the relation that p_jk = p_jk∣ij p_ij + p_jk\ij (1 − p_ij), the second conditional pairwise contact probability can be determined as (24) where the marginal contact probabilities p_ij and p_jk are given by Eq 12.

Together with and , it is straightforward to prove that for any triplet p_jk\ij ≤ p_jk ≤ p_jk∣ij (25) The equality signs hold if and only if b = 0 (see Eq 14), which corresponds to the case that x_ij and x_jk are completely uncorrelated.
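The bookkeeping behind the two conditional probabilities follows from the definition of conditional probability and the law of total probability, and can be sketched independently of the polymer model. The function name and the numerical values below are hypothetical; `p_joint` denotes the probability that the ij and jk contacts occur together.

```python
def conditional_contacts(p_ij, p_jk, p_joint):
    """Return (p_jk given ij in contact, p_jk given ij not in contact)."""
    if not 0.0 < p_ij < 1.0:
        raise ValueError("p_ij must lie strictly between 0 and 1")
    p_given_contact = p_joint / p_ij
    # Law of total probability: p_jk = p_given_contact * p_ij + p_given_no_contact * (1 - p_ij)
    p_given_no_contact = (p_jk - p_joint) / (1.0 - p_ij)
    return p_given_contact, p_given_no_contact

# Positively correlated contacts: the joint probability exceeds p_ij * p_jk.
cooperative = conditional_contacts(p_ij=0.2, p_jk=0.3, p_joint=0.10)
# Uncorrelated contacts: the joint probability equals p_ij * p_jk.
independent = conditional_contacts(p_ij=0.2, p_jk=0.3, p_joint=0.06)
print(cooperative, independent)
```

Whenever the joint probability exceeds the product p_ij p_jk, the two conditional probabilities bracket the marginal p_jk; when the joint probability equals the product, all three coincide.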

Gaussian polymer chain.

At last we consider the simplest model, a Gaussian polymer chain, in which k_ij = k₀ if |i − j| = 1 and k_ij = 0 otherwise. It follows that σ_ij = min(i, j)/k₀. With Eq 16, it straightforwardly leads to (26) where s₁ = j − i and s₂ = k − j for any triplet with i < j < k.
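The closed form σ_ij = min(i, j)/k₀ can be checked numerically. The sketch below builds the Kirchhoff matrix of a linear chain, removes its 0-th row and column (so that σ_ij = 0 whenever i × j = 0), and inverts the remainder; the chain length `N` and the value of `k0` are arbitrary illustrative choices.

```python
import numpy as np

N, k0 = 8, 2.0  # illustrative chain length and spring constant

# Interaction strengths of a Gaussian chain: k_ij = k0 if |i - j| = 1, else 0.
k = np.zeros((N, N))
i = np.arange(N - 1)
k[i, i + 1] = k[i + 1, i] = k0

# Kirchhoff matrix: row sums on the diagonal, -k_ij off the diagonal.
K = np.diag(k.sum(axis=1)) - k

# sigma for monomers 1..N-1 is the inverse of K without its 0-th row and column.
sigma = np.linalg.inv(K[1:, 1:])

idx = np.arange(1, N)
expected = np.minimum.outer(idx, idx) / k0
max_deviation = np.abs(sigma - expected).max()
print(max_deviation)
```

A direct consequence is that σ_ii + σ_jj − 2σ_ij = |i − j|/k₀, that is, the spread of the distance between two monomers depends only on their separation along the chain.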

There is a power-law decay of p_ijk with s₁ if s₁ ≫ s₂. It is also easy to prove that the quantity p_ijk/(p_ij p_jk), discussed in Fig 6D, is a nontrivial function of s₁ and s₂. If s₁ + s₂ = constant, it has its minimum at s₁ = s₂, and its maximum at s₁ = 1 or s₂ = 1.

In addition, as a result of b = 0 in Eq 14, we notice that p_jk\ij = p_jk = p_jk∣ij for a Gaussian chain (see Eqs 22–25).
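The absence of cooperativity in a Gaussian chain can also be seen by direct sampling. The sketch below is a Monte Carlo illustration under stated assumptions, not the analytic calculation of this work: it uses the Heaviside criterion F₁ (contact if r ≤ r_c) instead of F₀, bond components drawn from a unit-variance normal distribution, and arbitrary values for the capture radius and the monomer indices (i, j, k) = (0, 3, 6).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, r_c = 200_000, 2.0  # illustrative sample size and capture radius

# Six independent Gaussian bond vectors per configuration connect monomers 0..6.
bonds = rng.normal(size=(n_samples, 6, 3))

# Distances for i = 0, j = 3, k = 6 follow from sums of disjoint sets of bonds.
r_ij = np.linalg.norm(bonds[:, :3].sum(axis=1), axis=1)
r_jk = np.linalg.norm(bonds[:, 3:].sum(axis=1), axis=1)
r_ik = np.linalg.norm(bonds.sum(axis=1), axis=1)

c_ij, c_jk, c_ik = r_ij <= r_c, r_jk <= r_c, r_ik <= r_c

p_jk = c_jk.mean()
p_jk_given_ij = c_jk[c_ij].mean()       # estimate of p_jk|ij
p_jk_given_not_ij = c_jk[~c_ij].mean()  # estimate of p_jk\ij
p_ijk = (c_ij & c_jk & c_ik).mean()     # all three pairs within r_c

print(p_jk, p_jk_given_ij, p_jk_given_not_ij, p_ijk)
```

Because r_ij and r_jk are built from disjoint sets of independent bonds, the three estimates of the jk contact probability agree within sampling error, while the triplet probability is bounded from above by each pairwise one.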

Supporting information

S1 Appendix. Derivation of n-body contact probability based on the cross-linking probability modeled with the Heaviside step function.

Table A. Genomic regions simulated in this work. Table B. Pearson correlation (PC), stratum-adjusted correlation (SCC) and distance-corrected Pearson correlation (DCPC) of the contact probabilities predicted by SBS and HLM compared with Capture-C and Tri-C experiments. Table C. Stratum-adjusted correlation (SCC) and PC coefficients between Hi-C and HLM model.

https://doi.org/10.1371/journal.pcbi.1009669.s001

(PDF)

S1 Fig. Stratified Pearson correlations of the HLM-predicted contact probabilities.

Stratified PC at α-globin locus of mouse (A) ES and (B) erythroid cells compared with Capture-C (2-body) and Tri-C (3-body) experiments.

https://doi.org/10.1371/journal.pcbi.1009669.s002

(TIF)

S2 Fig. Changes of Tri-C triplet contact frequencies in the α-globin region of mouse erythroid cells with respect to ES cells.

The analysis was done at the viewpoints of R2 (top) and HS-39 (bottom). Following the statistical analysis in the experiment [17], we use the symbol * to mark all triplet interactions with significant changes (P < 0.01). The erythroid cell-specific regulatory hub and diffuse interactions among CTCF boundary sites, which are highlighted by dashed triangles, are both captured by Tri-C and our theory (Fig 3E).

https://doi.org/10.1371/journal.pcbi.1009669.s003

(TIF)

S3 Fig. Analysis of the changes in triplet contacts and in pairwise contacts among CTCF binding sites in WAPL-lacking cells.

(A) The triplet contacts from MC-4C data [18] with respect to the viewpoints of E (top) and K (bottom). The triplet contacts predicted by HLM are shown in Fig 4D. (B) The enrichment of pairwise contacts between long-range CTCF binding sites (off-diagonal elements in red, corresponding to positive changes) is counteracted by the depletion of contacts in the domains flanked by short-range CTCF binding sites (matrix elements along the diagonal block in blue, corresponding to negative changes). The data of pairwise contacts are obtained from Hi-C data [63].

https://doi.org/10.1371/journal.pcbi.1009669.s004

(TIF)

S4 Fig. WAPL depletion-induced fold changes of n-body contact probability.

Fold changes of (A) four-body contacts double-anchored at sites E, K and (B) five-body contacts triple-anchored at sites E, H and K. The absolute change of contact probabilities is shown in Fig 4F and 4G, respectively.

https://doi.org/10.1371/journal.pcbi.1009669.s005

(TIF)

S5 Fig. Comparison between HLM and SPRITE.

Comparison between HLM and SPRITE, similar to that in Fig 5, in a 0.52-Mb region on human chr12. (C) From the viewpoint of the promoter of the active ETV6 gene, which encodes an ETS family transcription factor.

https://doi.org/10.1371/journal.pcbi.1009669.s006

(TIF)

S6 Fig. Many-body contacts of a Gaussian polymer chain calculated with step function.

Same as Fig 6 but using F₁(r) (Heaviside step function) as the cross-linking probability.

https://doi.org/10.1371/journal.pcbi.1009669.s007

(TIF)

S7 Fig. Conditional pairwise contact probability with either a Gaussian (left column) or Heaviside step (right column) cross-linking probability.

(A) Heatmaps of log₁₀(p_ij) in a 1.23-Mb region on chr21 of IMR90 cells from Hi-C and log₁₀(p_ij) from HLM. (B) Comparison between the unconditioned contact probability p_jk, the conditional contact probability, p_jk∣ij, and p_jk\ij, calculated for CTCF-site triplets. (C) p_jk, p_jk∣ij, and p_jk\ij calculated for all triplets ijk of i < j < k, which are sorted in an ascending order of p_jk.

https://doi.org/10.1371/journal.pcbi.1009669.s008

(TIF)

S8 Fig. Triplet contacts predicted by simple rules with reference to Fig 3C and 3D.

(A) Three-body contact matrix at α-globin locus of mouse erythroid cells measured by Tri-C experiment or predicted by using three simple rules (B-D). (E) Pearson correlations between the results from 4 methods and the experiment.

https://doi.org/10.1371/journal.pcbi.1009669.s009

(TIF)

S9 Fig. Excluded volume in a polymer melt at different levels of coarse graining.

(A) A typical configuration of a dense polymer melt and (B) one polymer chain in the melt at three levels of coarse graining. The beads in (B) are colored differently along the chain, with a diameter of the most probable bond length at the corresponding scales. (C) Probability of the bond length and (D) intra-chain pairwise monomer distance.

https://doi.org/10.1371/journal.pcbi.1009669.s010

(TIF)

S10 Fig. HLM of mouse β-globin locus on chr7 at resolution of 8kb.

(A) Pairwise contact probability from Hi-C [70] compared with HLM, which has a PC of 0.996. (B) Triplet contact probabilities from Tri-C compared with HLM, which are anchored at 3’HS1 and HS2 with PCs of 0.62 and 0.88, respectively.

https://doi.org/10.1371/journal.pcbi.1009669.s011

(TIF)

S11 Fig. HLM of mouse Pcdhα locus on chr18 at 5 kb resolution (see also the caption of S10 Fig).

Compared with the MC-4C dataset, (A) the pairwise contact probability has a PC of 0.98, and (B) the triplet contact probabilities have PCs of 0.84, 0.76, 0.76, 0.88, and 0.70 at the viewpoint of Pcdhα1, Pcdhα11, Pcdhαc1, HS7, and HS5–1, respectively.

https://doi.org/10.1371/journal.pcbi.1009669.s012

(TIF)

S12 Fig. Comparison between the model trainings with different choices of cost functions and optimizers.

We trained a polymer model of a 2.4-Mb genomic region on chr8 in mouse ES cells at 25-kb resolution [70]. (A) The trajectories of various cost functions L in a log-log scale, minimized by using one of the four methods (RS, GD, RMSprop, and ADAM) with different cross-linking probabilities F_α (α = 0, 1). (B) Comparing P(s) from Hi-C and from three models, which were all trained with ADAM using F₀, but with different forms of the cost functions. (C) Comparison of log₁₀(p_ij) from Hi-C (top) with that from the model trained by minimizing L₁ with ADAM (bottom).

https://doi.org/10.1371/journal.pcbi.1009669.s013

(TIF)

S13 Fig. Comparison between the models trained for a 10-Mb genomic region at 50-kb resolution in GM12878 cells [77] with different choices of cost functions and optimizers.

(A-C) Same as the caption of S12 Fig. (D) Pearson correlations between Hi-C and HLM in our previous work [50] (the black line), and new models trained in this work (the colored lines) as a function of genomic separation, s.

https://doi.org/10.1371/journal.pcbi.1009669.s014

(TIF)

Acknowledgments

We thank Dr. Yonghyun Song for careful reading of the manuscript and helpful comments. We thank the Center for Advanced Computation in KIAS for providing computing resources.

References

1. Wang S, Su JH, Beliveau BJ, Bintu B, Moffitt JR, Wu CT, et al. Spatial organization of chromatin domains and compartments in single chromosomes. Science. 2016;353(6299):598–602. pmid:27445307
2. Gu B, Swigut T, Spencley A, Bauer MR, Chung M, Meyer T, et al. Transcription-coupled changes in nuclear mobility of mammalian cis-regulatory elements. Science. 2018;359(6379):1050–1055. pmid:29371426
3. Cardozo Gizzi AM, Cattoni DI, Fiche JB, Espinola SM, Gurgo J, Messina O, et al. Microscopy-Based Chromosome Conformation Capture Enables Simultaneous Visualization of Genome Organization and Transcription in Intact Organisms. Mol Cell. 2019;74(1):212–222. pmid:30795893
4. Ohno M, Ando T, Priest DG, Kumar V, Yoshida Y, Taniguchi Y. Sub-nucleosomal Genome Structure Reveals Distinct Nucleosome Folding Motifs. Cell. 2019;176(3):520–534. pmid:30661750
5. Hsieh THS, Cattoglio C, Slobodyanyuk E, Hansen AS, Rando OJ, Tjian R, et al. Resolving the 3D Landscape of Transcription-Linked Mammalian Chromatin Folding. Mol Cell. 2020;78(3):539–553. pmid:32213323
6. Krietenstein N, Abraham S, Venev SV, Abdennur N, Gibcus J, Hsieh THS, et al. Ultrastructural Details of Mammalian Chromosome Architecture. Mol Cell. 2020;78(3):554–565. pmid:32213324
7. Shrinivas K, Sabari BR, Coffey EL, Klein IA, Boija A, Zamudio AV, et al. Enhancer features that drive formation of transcriptional condensates. Mol Cell. 2019;75(3):549–561.
8. Oudelaar AM, Harrold CL, Hanssen LLP, Telenius JM, Higgs DR, Hughes JR. A revised model for promoter competition based on multi-way chromatin interactions at the α-globin locus. Nat Commun. 2019;10:5412. pmid:31776347
9. Beagrie RA, Thieme CJ, Annunziatella C, Baugher C, Zhang Y, Schueler M, et al. Multiplex-GAM: genome-wide identification of chromatin contacts yields insights not captured by Hi-C. bioRxiv. 2021.
10. Lim B, Levine MS. Enhancer-promoter communication: hubs or loops? Curr Opin Genet Dev. 2021;67:5–9. pmid:33202367
11. Miller JA, Widom J. Collaborative competition mechanism for gene activation in vivo. Mol Cell Biol. 2003;23(5):1623–1632. pmid:12588982
12. Hermsen R, Tans S, Ten Wolde PR. Transcriptional regulation by competing transcription factor modules. PLoS Comput Biol. 2006;2(12):e164. pmid:17140283
13. Kempfer R, Pombo A. Methods for mapping 3D chromosome architecture. Nat Rev Genet. 2020;21(4):207–226. pmid:31848476
14. Bintu B, Mateo LJ, Su JH, Sinnott-Armstrong NA, Parker M, Kinrot S, et al. Super-resolution chromatin tracing reveals domains and cooperative interactions in single cells. Science. 2018;362(6413):eaau1783. pmid:30361340
15. Olivares-Chauvet P, Mukamel Z, Lifshitz A, Schwartzman O, Elkayam NO, Lubling Y, et al. Capturing pairwise and multi-way chromosomal conformations using chromosomal walks. Nature. 2016;540:296–300. pmid:27919068
16. Darrow EM, Huntley MH, Dudchenko O, Stamenova EK, Durand NC, Sun Z, et al. Deletion of DXZ4 on the human inactive X chromosome alters higher-order genome architecture. Proc Natl Acad Sci U S A. 2016;113(31):E4504–E4512. pmid:27432957
17. Oudelaar AM, Davies JOJ, Hanssen LLP, Telenius JM, Schwessinger R, Liu Y, et al. Single-allele chromatin interactions identify regulatory hubs in dynamic compartmentalized domains. Nat Genet. 2018;50:1744–1751. pmid:30374068
18. Allahyar A, Vermeulen C, Bouwman BAM, Krijger PHL, Verstegen MJAM, Geeven G, et al. Enhancer hubs and loop collisions identified from single-allele topologies. Nat Genet. 2018;50(8):1151–1160. pmid:29988121
19. Vermeulen C, Allahyar A, Bouwman BAM, Krijger PHL, Verstegen MJAM, Geeven G, et al. Multi-contact 4C: long-molecule sequencing of complex proximity ligation products to uncover local cooperative and competitive chromatin topologies. Nat Protoc. 2020;15:364–397. pmid:31932773
20. Ulahannan N, Pendleton M, Deshpande A, Schwenk S, Behr JM, Dai X, et al. Nanopore sequencing of DNA concatemers reveals higher-order features of chromatin structure. bioRxiv. 2019.
21. Nagano T, Lubling Y, Stevens TJ, Schoenfelder S, Yaffe E, Dean W, et al. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature. 2013;502(7469):59–64. pmid:24067610
22. Tan L, Xing D, Chang CH, Li H, Xie XS. Three-dimensional genome structures of single diploid human cells. Science. 2018;361(6405):924–928. pmid:30166492
23. Tan L, Xing D, Daley N, Xie XS. Three-dimensional genome structures of single sensory neurons in mouse visual and olfactory systems. Nat Struct Mol Biol. 2019;26(4):297–307. pmid:30936528
24. Takei Y, Yun J, Zheng S, Ollikainen N, Pierson N, White J, et al. Integrated spatial genomics reveals global architecture of single nuclei. Nature. 2021;590(7845):344–350. pmid:33505024
25. Beagrie RA, Scialdone A, Schueler M, Kraemer DCA, Chotalia M, Xie SQ, Barbieri M, et al. Complex multi-enhancer contacts captured by genome architecture mapping. Nature. 2017;543:519–524. pmid:28273065
26. Quinodoz SA, Ollikainen N, Tabak B, Palla A, Schmidt JM, Detmar E, et al. Higher-Order Inter-chromosomal Hubs Shape 3D Genome Organization in the Nucleus. Cell. 2018;174(3):744–757.e24. pmid:29887377
27. de Gennes PG. Scaling Concepts in Polymer Physics. Ithaca and London: Cornell University Press; 1979.
28. Halverson JD, Smrek J, Kremer K, Grosberg AY. From a melt of rings to chromosome territories: the role of topological constraints in genome folding. Rep Prog Phys. 2014;77:022601. pmid:24472896
29. Liu L, Hyeon C. Contact statistics highlight distinct organizing principles of proteins and RNA. Biophys J. 2016;110(11):2320–2327. pmid:27276250
30. Perez-Rathke A, Sun Q, Wang B, Boeva V, Shao Z, Liang J. CHROMATIX: computing the functional landscape of many-body chromatin interactions in transcriptionally active loci from deconvolved single cells. Genome Biol. 2020;21:13. pmid:31948478
31. Grosberg AY, Nechaev SK, Shakhnovich EI. The role of topological constraints in the kinetics of collapse of macromolecules. J Phys. 1988;49(12):2095–2100.
32. Grosberg A, Rabin Y, Havlin S, Neer A. Crumpled globule model of the three-dimensional structure of DNA. Europhys Lett. 1993;23:373.
33. Grosberg AY, Khokhlov AR. Statistical Physics of Macromolecules. New York: AIP Press; 1994.
34. Rosa A, Everaers R. Structure and Dynamics of Interphase Chromosomes. PLoS Comput Biol. 2008;4:e1000153. pmid:18725929
35. Tjong H, Gong K, Chen L, Alber F. Physical tethering and volume exclusion determine higher-order genome organization in budding yeast. Genome Res. 2012;22(7):1295–1305. pmid:22619363
36. Wong H, Marie-Nelly H, Herbert S, Carrivain P, Blanc H, Koszul R, et al. A Predictive Computational Model of the Dynamic 3D Interphase Yeast Nucleus. Curr Biol. 2012;22(20):1881–1890. pmid:22940469
37. Gürsoy G, Xu Y, Kenter AL, Liang J. Spatial confinement is a major determinant of the folding landscape of human chromosomes. Nucleic Acids Res. 2014;42:8223–8230. pmid:24990374
38. Kang H, Yoon YG, Thirumalai D, Hyeon C. Confinement-induced glassy dynamics in a model for chromosome organization. Phys Rev Lett. 2015;115:198102. pmid:26588418
39. Barbieri M, Chotalia M, Fraser J, Lavitas LM, Dostie J, Pombo A, et al. Complexity of chromatin folding is captured by the strings and binders switch model. Proc Natl Acad Sci U S A. 2012;109(40):16173–16178. pmid:22988072
40. Jost D, Carrivain P, Cavalli G, Vaillant C. Modeling epigenome folding: formation and dynamics of topologically associated chromatin domains. Nucleic Acids Res. 2014;42(15):9553–9561. pmid:25092923
41. Brackley CA, Brown JM, Waithe D, Babbs C, Davies J, Hughes JR, et al. Predicting the three-dimensional folding of cis-regulatory regions in mammalian genomes using bioinformatic data and polymer models. Genome Biol. 2016;17(1):59. pmid:27036497
42. Brackley CA, Johnson J, Kelly S, Cook PR, Marenduzzo D. Simulated binding of transcription factors to active and inactive regions folds human chromosomes into loops, rosettes and topological domains. Nucleic Acids Res. 2016;44(8):3503–3512. pmid:27060145
43. Di Pierro M, Zhang B, Aiden EL, Wolynes PG, Onuchic JN. Transferable model for chromosome architecture. Proc Natl Acad Sci U S A. 2016;113(43):12168–12173. pmid:27688758
44. Gursoy G, Xu Y, Liang J. Spatial organization of the budding yeast genome in the cell nucleus and identification of specific chromatin interactions from multi-chromosome constrained chromatin model. PLoS Comput Biol. 2017;13(7):e1005658. pmid:28704374
45. Shi G, Liu L, Hyeon C, Thirumalai D. Interphase Human Chromosome Exhibits Out of Equilibrium Glassy Dynamics. Nat Commun. 2018;9:3161. pmid:30089831
46. Bianco S, Lupiáñez DG, Chiariello AM, Annunziatella C, Kraft K, Schöpflin R, et al. Polymer physics predicts the effects of structural variants on chromatin architecture. Nat Genet. 2018;50:662–667. pmid:29662163
47. Liu L, Shi G, Thirumalai D, Hyeon C. Chain organization of human interphase chromosome determines the spatiotemporal dynamics of chromatin loci. PLoS Comput Biol. 2018;14(12):e1006617. pmid:30507936
48. Nir G, Farabella I, Estrada CP, Ebeling CG, Beliveau BJ, Sasaki HM, et al. Walking along chromosomes with super-resolution imaging, contact maps, and integrative modeling. PLoS Genet. 2018;14(12):e1007872. pmid:30586358
49. Zhang S, Chen F, Bahar I. Differences in the intrinsic spatial dynamics of the chromatin contribute to cell differentiation. Nucleic Acids Res. 2020;48:1131–1145. pmid:31828312
50. Liu L, Kim MH, Hyeon C. Heterogeneous loop model to infer 3D chromosome structures from Hi-C. Biophys J. 2019;117(3):613–625. pmid:31337548
51. Liu L, Hyeon C. Revisiting the organization of Polycomb-repressed domains: 3D chromatin models from Hi-C compared with super-resolution imaging. Nucleic Acids Res. 2020;48(20):11486–11494. pmid:33095877
52. Chu X, Wang J. Microscopic Chromosomal Structural and Dynamical Origin of Cell Differentiation and Reprogramming. Adv Sci. 2020;7(20):2001572. pmid:33101859
53. Di Stefano M, Nützmann HW, Marti-Renom MA, Jost D. Polymer modelling unveils the roles of heterochromatin and nucleolar organizing regions in shaping 3D genome organization in Arabidopsis thaliana. Nucleic Acids Res. 2021;49(4):1840–1858. pmid:33444439
54. Shi G, Thirumalai D. From Hi-C Contact Map to Three-dimensional Organization of Interphase Human Chromosomes. Phys Rev X. 2021;11(1):011051.
55. Bianco S, Annunziatella C, Andrey G, Chiariello AM, Esposito A, Fiorillo L, et al. Modeling Single-Molecule Conformations of the HoxD Region in Mouse Embryonic Stem and Cortical Neuronal Cells. Cell Rep. 2019;28(6):1574–1583. pmid:31390570
56. Chiariello AM, Bianco S, Oudelaar AM, Esposito A, Annunziatella C, Fiorillo L, et al. A Dynamic Folded Hairpin Conformation Is Associated with alpha-Globin Activation in Erythroid Cells. Cell Rep. 2020;30(7):2125–2135. pmid:32075757
57. Bak JH, Kim MH, Liu L, Hyeon C. A unified framework for inferring the multi-scale organization of chromatin domains from Hi-C. PLoS Comput Biol. 2021;17(3):e1008834. pmid:33724986
58. Bohn M, Heermann DW, van Driel R. Random loop model for long polymers. Phys Rev E. 2007;76(5):051805. pmid:18233679
59. Nitzsche A, Paszkowski-Rogacz M, Matarese F, Janssen-Megens EM, Hubner NC, Schulz H, et al. RAD21 Cooperates with Pluripotency Transcription Factors in the Maintenance of Embryonic Stem Cell Identity. PLoS One. 2011;6(5):e19470. pmid:21589869
60. The ENCODE Project Consortium. An Integrated Encyclopedia of DNA Elements in the Human Genome. Nature. 2012;489(7414):57–74.
61. Gürsoy G, Xu Y, Kenter AL, Liang J. Computational construction of 3D chromatin ensembles and prediction of functional interactions of alpha-globin locus from 5C data. Nucleic Acids Res. 2017;45(20):11547–11558. pmid:28981716
62. Yang T, Zhang F, Yardimci GG, Song F, Hardison RC, Noble WS, et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 2017;27:1939–1949. pmid:28855260
63. Haarhuis JHI, van der Weide RH, Blomen VA, Yáñez-Cuna JO, Amendola M, van Ruiten MS, et al. The Cohesin Release Factor WAPL Restricts Chromatin Loop Extension. Cell. 2017;169(4):693–707. pmid:28475897
64. Sanborn AL, Rao SSP, Huang SC, Durand NC, Huntley MH, Jewett AI, et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc Natl Acad Sci U S A. 2015;112(47):E6456–E6465. pmid:26499245
65. Fudenberg G, Imakaev M, Lu C, Goloborodko A, Abdennur N, Mirny LA. Formation of Chromosomal Domains by Loop Extrusion. Cell Rep. 2016;15(9):2038–2049. pmid:27210764
66. Banigan EJ, Mirny LA. Loop extrusion: theory meets single-molecule experiments. Curr Opin Cell Biol. 2020;64:124–138. pmid:32534241
67. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, et al. Landscape of transcription in human cells. Nature. 2012;489(7414):101–108. pmid:22955620
68. Polovnikov KE, Nechaev S, Tamm MV. Many-body contacts in fractal polymer chains and fractional Brownian trajectories. Phys Rev E. 2019;99(3):032501. pmid:30999417
69. Hsu HP, Kremer K. Static and dynamic properties of large polymer melts in equilibrium. J Chem Phys. 2016;144(15):154907. pmid:27389240
70. Bonev B, Cohen NM, Szabo Q, Fritsch L, Papadopoulos GL, Lubling Y, et al. Multiscale 3D Genome Rewiring during Mouse Neural Development. Cell. 2017;171(3):557–572.e24. pmid:29053968
71. Le Treut G, Képès F, Orland H. A polymer model for the quantitative reconstruction of chromosome architecture from HiC and GAM data. Biophys J. 2018;115(12):2286–2294. pmid:30527448
72. Shinkai S, Nakagawa M, Sugawara T, Togashi Y, Ochiai H, Nakato R, et al. PHi-C: deciphering Hi-C data into polymer dynamics. NAR Genom Bioinf. 2020;2(2):lqaa020. pmid:33575580
73. Shinkai S, Sugawara T, Miura H, Hiratani I, Onami S. Microrheology for Hi-C Data Reveals the Spectrum of the Dynamic 3D Genome Organization. Biophys J. 2020;118(9):2220–2228. pmid:32191860
74. Tieleman T, Hinton G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA. 2012;4(2):26–31.
75. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv:1412.6980. 2014.
76. Mehta P, Bukov M, Wang CH, Day AGR, Richardson C, Fisher CK, et al. A high-bias, low-variance introduction to Machine Learning for physicists. Phys Rep. 2019;810:1–124. pmid:31404441
77. Rao SSP, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D Map of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell. 2014;159(7):1665–1680. pmid:25497547
