Comparative 3D Genome Structure Analysis of the Fission and the Budding Yeast

We studied the 3D structural organization of the fission yeast genome, which emerges from the tethering of heterochromatic regions in otherwise randomly configured chromosomes, represented as flexible polymer chains in a nuclear environment. This model is sufficient to explain, in a statistical manner, many experimentally determined distinctive features of the fission yeast genome, including chromatin interaction patterns from Hi-C experiments and the co-localization of functionally related and co-expressed genes, such as genes transcribed by Pol-III. Our findings demonstrate that some previously described structure-function correlations can be explained as a consequence of random chromatin collisions driven by a few geometric constraints (mainly centromere-SPB and telomere-NE tethering) combined with the specific gene locations in the chromosome sequence. We also performed a comparative analysis between the genome structures of fission yeast and budding yeast, for which we previously detected a similar organizing principle. Because chromosome sizes and numbers differ, the 3D genome organization differs substantially between the two species, most notably in the nuclear locations of orthologous genes and in the extent of nuclear territories for genes and chromosomes. Remarkably, despite those differences, functional similarities are maintained: functionally related genes show similar spatial clustering behavior in both yeasts, even though their nuclear locations differ largely between the species.

The contact frequency map at 96 kb resolution is defined as a 130 × 130 matrix. Each cell in the matrix represents the contact frequency between two 96 kb regions of consecutive genomic sequence. For our structure population, the contact frequency $c_{ij}$ is the total number of observed contacts between any bead in region i and any bead in region j. For the Hi-C experimental contact frequency map, $c_{ij}$ is defined as the sum of all physical proximity values between region i and region j.
After we obtain the contact frequency map $C_{130}^{96\mathrm{kb}}$, it is normalized following the protocol provided by Imakaev et al. [1]. All matrix comparisons are performed on the normalized contact frequency maps.
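The aggregation from bead-level contacts to region-level contact frequencies can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bin_contacts`, the dictionary layout of `bead_contacts`, and the `bead_to_bin` lookup are all assumptions introduced here, and the normalization step of Imakaev et al. [1] is not included.

```python
def bin_contacts(bead_contacts, bead_to_bin, n_bins):
    """Sum bead-level contact counts into a symmetric n_bins x n_bins matrix.

    bead_contacts: dict mapping an unordered bead pair (a, b) to the number
                   of structures in which the two beads are in contact
                   (hypothetical input layout).
    bead_to_bin:   list giving, for each bead index, the index of the
                   genomic region (bin) that contains it.
    """
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for (a, b), count in bead_contacts.items():
        i, j = bead_to_bin[a], bead_to_bin[b]
        matrix[i][j] += count
        if i != j:
            # Mirror off-diagonal entries so the map stays symmetric.
            matrix[j][i] += count
    return matrix
```

Under these assumptions, a call such as `bin_contacts(contacts, bead_to_bin, 130)` would yield the raw 130 × 130 map before normalization.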

Matrix Correlation Measurement
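Let $C_n^{Exp} = (c_{ij}^{Exp})_{n \times n}$ and $C_n^{P} = (c_{ij}^{P})_{n \times n}$ represent the contact frequency matrices for the experiment and the structure population, respectively. The Pearson's correlation coefficient between the two matrices is as follows:

$$ r = \frac{\sum_{i,j} \left(c_{ij}^{Exp} - \bar{c}^{Exp}\right)\left(c_{ij}^{P} - \bar{c}^{P}\right)}{\sqrt{\sum_{i,j} \left(c_{ij}^{Exp} - \bar{c}^{Exp}\right)^2}\,\sqrt{\sum_{i,j} \left(c_{ij}^{P} - \bar{c}^{P}\right)^2}} $$

where $\bar{c}^{Exp}$ and $\bar{c}^{P}$ are the mean values of the entries of the respective matrices.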
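A Pearson correlation between two contact matrices can be computed by flattening both matrices and applying the standard formula. The sketch below assumes that all n × n entries enter the sum; whether the original analysis restricted the sum (for example to the upper triangle) is not stated here, and the function name `matrix_pearson` is hypothetical.

```python
import math


def matrix_pearson(a, b):
    """Pearson correlation between two equally shaped matrices (lists of rows)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("matrices must be non-empty and have the same shape")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant matrix")
    return cov / math.sqrt(var_x * var_y)
```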

Localization Probability Density (LPD) of Gene and Chromosome
To visualize loci localization in the nucleus, we calculate localization probability density (LPD) maps from our structure population. We first collect the positions (x, y, z) of the target beads and project those 3D coordinates into a 2D space [2]. Density grid projection and normalization are done using the same protocol as in the budding yeast analysis [3].
For any given set of genes, we first collect the 3D coordinates (x', y', z') of all genes in all structures of the population, and then project them into a 2D space with coordinates (z, ρ). Next, we perform density grid projection with a grid size of Δ = 10 nm, which results in a 2D grid of 142 × 142 pixels. Once a point is mapped to the pixel centered at $(z_c, \rho_c)$, a Gaussian blur with σ = 30 nm is applied, where $z_c$ and $\rho_c$ denote the center of the pixel along the z and ρ axes and (i, j) denote the neighboring pixels.
After the Gaussian blur, the grid values $G_{ij}$ are normalized, with Δ denoting the grid size. Finally, we divide all $G_{ij}$ by the maximum of $G_{ij}$ so that the maximum value is 1.
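The LPD pipeline (projection, gridding, Gaussian blur, scaling to a maximum of 1) can be sketched as below. Several details are assumptions made only for illustration: the projection is taken to be cylindrical about the z axis, ρ = sqrt(x² + y²); the blur is an isotropic Gaussian between pixel centers, truncated at 3σ; and the intermediate normalization step is omitted because its formula is not given in the text. The names `project_to_z_rho` and `lpd_map` and the grid parameters `n_z`, `n_rho`, `z_min` are hypothetical.

```python
import math


def project_to_z_rho(x, y, z):
    # Assumed cylindrical projection about the z axis.
    return z, math.hypot(x, y)


def lpd_map(points, n_z, n_rho, z_min, delta=10.0, sigma=30.0):
    """Build a blurred 2D density grid from 3D points and scale its maximum to 1."""
    grid = [[0.0] * n_rho for _ in range(n_z)]
    reach = int(math.ceil(3 * sigma / delta))  # kernel truncated at 3 sigma
    for x, y, z in points:
        z_c, rho_c = project_to_z_rho(x, y, z)
        iz = int((z_c - z_min) // delta)
        ir = int(rho_c // delta)
        if not (0 <= iz < n_z and 0 <= ir < n_rho):
            continue  # point falls outside the grid
        # Spread the point over neighboring pixels (i, j) with a Gaussian weight.
        for i in range(max(0, iz - reach), min(n_z, iz + reach + 1)):
            for j in range(max(0, ir - reach), min(n_rho, ir + reach + 1)):
                d2 = ((i - iz) ** 2 + (j - ir) ** 2) * delta ** 2
                grid[i][j] += math.exp(-d2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid)
    if peak > 0:
        grid = [[v / peak for v in row] for row in grid]
    return grid
```

With Δ = 10 nm and σ = 30 nm as in the text, each point contributes to pixels up to 9 grid steps away in this sketch.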

Nucleus Accessibility
To estimate how much of the nucleus a locus can explore, we define its nucleus accessibility. For given genomic regions, we first obtain the 2D space they can explore (the total number of grid pixels with a density value > 0.0001) through LPD analysis. The nucleus accessibility is then calculated as the fraction of the total available nuclear space, excluding the nucleolar region, that is accessible to the regions in 2D space.
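As a rough illustration, the accessibility can be computed by counting pixels above the density threshold within a mask of available nuclear space. The mask `available` (True for pixels inside the nucleus and outside the nucleolus) and the function name are assumptions; how the available space is delineated on the grid is not specified in the text.

```python
def nucleus_accessibility(lpd, available, threshold=1e-4):
    """Fraction of available pixels whose LPD value exceeds the threshold.

    lpd:       2D list of density values scaled to a maximum of 1.
    available: 2D list of booleans of the same shape (hypothetical mask of
               nuclear space excluding the nucleolus).
    """
    accessible = 0
    total = 0
    for lpd_row, mask_row in zip(lpd, available):
        for value, is_available in zip(lpd_row, mask_row):
            if is_available:
                total += 1
                if value > threshold:
                    accessible += 1
    if total == 0:
        raise ValueError("mask contains no available pixels")
    return accessible / total
```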

Interaction Entropy Calculation
Entropy measures the uncertainty of a random variable, in this case the interaction frequency from a domain to its partners. The higher the entropy, the less specific the interactions of the domain with its partners.
The entropy of bin i is defined as $H_i = -\sum_{j} p_{ij} \log p_{ij}$, where $p_{ij} = c_{ij} / \sum_{k} c_{ik}$, and the expected uniform distribution values are simply $p_{ij} = 1/n$.
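A minimal sketch of the entropy calculation for one bin, assuming the Shannon form with probabilities obtained by dividing each contact frequency by the row total. The natural logarithm and the function name `interaction_entropy` are choices made here, not taken from the source.

```python
import math


def interaction_entropy(contacts):
    """Shannon entropy of one bin's contact frequencies to its partners."""
    total = sum(contacts)
    if total <= 0:
        raise ValueError("bin has no contacts")
    entropy = 0.0
    for c in contacts:
        if c > 0:  # zero-frequency partners contribute nothing
            p = c / total
            entropy -= p * math.log(p)
    return entropy
```

In this form, a bin that contacts n partners uniformly reaches the maximum entropy log(n), while a bin with a single partner has entropy 0.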

Functional Correlation Gene Pairs from Genetic Interaction Experiment
The functional correlation between two genes is calculated as the cosine correlation (normalized dot product) between the two genes' genetic interaction score profiles across all query genes.
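The cosine correlation of two score profiles can be sketched as below. This assumes both profiles are complete vectors over the same ordered set of query genes; the handling of missing scores is not described in the text and is not addressed here. The function name is hypothetical.

```python
import math


def cosine_correlation(profile_a, profile_b):
    """Cosine of the angle between two genetic interaction score profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same query genes")
    dot = sum(a * b for a, b in zip(profile_a, profile_b))
    norm_a = math.sqrt(sum(a * a for a in profile_a))
    norm_b = math.sqrt(sum(b * b for b in profile_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for an all-zero profile
    return dot / (norm_a * norm_b)
```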