Differential Chromosome Conformations as Hallmarks of Cellular Identity Revealed by Mathematical Polymer Modeling

Inherently dynamic, chromosomes adopt many different conformations in response to DNA metabolism. Models of chromosome organization in the yeast nucleus obtained from genome-wide chromosome conformation data or biophysical simulations provide important insights into the average behavior but fail to reveal features from dynamic or transient events that are only visible in a fraction of cells at any given moment. We developed a method to determine chromosome conformation from the relative positions of three fluorescently tagged DNA loci in living cells imaged in 3D. Cell type specific chromosome folding properties could be assigned based on positional combinations between three loci on yeast chromosome 3. We determined that the shorter left arm of chromosome 3 is extended in MATα cells, but can be crumpled in MATa cells. Furthermore, we implemented a new mathematical model that provides, for the first time, an estimate of the relative physical constraint of three linked loci related to cellular identity. Variations in this estimate allowed us to predict functional consequences from chromatin structural alterations in asf1 and recombination enhancer deletion mutant cells. The computational method is applicable to identify and characterize dynamic chromosome conformations in any cell type.


Introduction
The three-dimensional organization of the genome was shown to dynamically adapt to nuclear function and complexity [1][2][3]. Chromatin fibers can be represented by a polymer random coil adopting a considerable, largely unappreciated number of states in response to DNA metabolism [4][5][6][7]. How the intrinsic folding of a chromosome within the nucleus and the relative position of loci on specific chromosomes contribute to these processes is not known. Chromosome conformation capture techniques that rely on protein-DNA cross-linking have provided valuable information on the frequency of three-dimensional long-range molecular contacts between genomic DNA segments [4][8][9][10]. Microscopy approaches are, on the other hand, necessary to further our understanding of the dynamics of DNA transactions because they allow analysis of live and single cells [11,12]. Two-color distance measurements between two fluorescently labelled loci in 3D in fixed or living cells have yielded data from which chromatin compaction parameters were inferred by polymer modeling [13][14][15]. These parameters were also used to model 4C and HiC data (for example see [4,8,12]) or to simulate genome-wide chromosome organization [2]. Because the nucleosome fiber is highly flexible, inferring fiber properties from distances measured between two labeled loci requires a number of assumptions about the actual path of the fiber. To obtain spatial, 3D information, at least three points in space are needed. Three points provide geometrical information that can be used to establish physical models. In the nucleus, these points can include reference structures, such as the nuclear envelope or the nucleolus in yeast, or three distinct DNA tags.
Few previous studies used three labeled loci: in fixed mammalian cells, distances and angles within a triangle formed by three probes in the same nucleus were measured to study changes in chromatin domain compaction [16]; in live bacterial cells, three labels were used to determine the position of two chromosomal loci relative to a third, reference locus, in 2D [17]. A geometrical interpretation of changes in chromosome folding was not proposed.
The challenge for the analysis of three points in space is to develop mathematical algorithms allowing comparison of two sets of data, because commonly used tests for comparing distributions (1D, using Wilcoxon or KS tests) are no longer applicable. New computational tools are needed to identify specific features which may not be the most prominent ones, notably in cellular systems evolving over time or in space. To address this need, we developed a new system to fluorescently label three distinct genomic loci in living cells simultaneously and implemented mathematical algorithms to analyze the relative 3D positions of the labelled DNA loci.
We used S. cerevisiae chromosome 3 (Chr3) as a model to study its folding. Chr3 is a short chromosome of only 320 kb with a tripartite organization: two AT-rich domains flank a GC-rich centromere-proximal region, and the chromosome forms a ring-like structure mediated by frequent contacts between heterochromatin loci near the ends [18][19][20]. This chromosome has received particular attention because the study of the mating type loci, and their interconversion, contributed to fundamental knowledge about cell lineage control, silencing and recombination [21]. Numerous genetic data contributed to a better understanding of the directionality of the mating type switch, yet the underlying mechanism is not known [22,23]. A role for chromosome architecture is usually invoked as a possible driving force for donor choice [24][25][26][27]. We can now address this question by studying the position of the three mating type loci in the nucleus simultaneously.
In this study, we adapted a third system, based on the λ operator [28], from bacteria to yeast. In combination with the widely used Tet and Lac operators, it allows us to simultaneously tag three distinct loci in live yeast cells. We then developed a computational approach to demonstrate that the frequency of specific positional combinations between three loci points to folding properties of a small chromosome. We find that folding of the left arm is different in MATa and MATα cells. In strains lacking either a component involved in chromatin compaction, the chaperone Asf1, or the recombination enhancer (RE), the mating type specific conformation of Chr3 is altered. Our results suggest that chromosomal organization brought about by fiber folding and heterochromatic domains contributes to control of the yeast mating type loci by regulating long-range contacts.

Lambda cloning and strain construction
To simultaneously label three distinct loci, we adapted the bacterial λ repressor operator system (λO and λcI) for use in yeast [28,29] and combined it with the existing Lac and Tet systems [30] (Fig 1A). Strains bearing three distinct labels were created by integration of constructs encoding fluorescent repressor fusion proteins followed by the integration of the operator sequences into the yeast genome by transformation. Yeast strains are listed in Table 1. Expression of an operon containing the phage λcI repressor gene fused to the gene encoding YFP was placed under the control of the pURA3 promoter. The λ repressor sequence bears amino acid modifications G48S and Y210H, which strengthen DNA binding and reduce tetramerization, respectively [28]. The λcI sequence was amplified by PCR from pRFG116, kindly provided by Dr D Chattoraj. During the PCR reaction a NotI recognition site was introduced. The pURA3 promoter was amplified from the pCJ97 plasmid with introduction of an NheI site. The two fragments were then digested by NheI and ligated to generate a pURA3-λcI fragment. This fragment was double-digested with EcoRI and NotI, and ligated to the double-digested (EcoRI/NotI) pCJ97 plasmid bearing a YFP sequence.
Following validation, we subcloned pURA3-λcI-YFP into a plasmid bearing pHIS-CFP-lacI-pUra3-TetR-YFP (named pGVH30 [18]), which allows expression of two fusion proteins from a single plasmid and thus requires a single selection marker (ADE2); this yielded plasmid pIL01. The λO repeats were excised from the bacterial pRFB122 plasmid by XhoI digestion and ligated into the pSR6 plasmid [31] digested with XhoI/SalI. Integration of the operator repeats was performed using a cloning-free technique. The method entails insertion, near the locus of interest, of a marker gene generated by PCR using long primers, with the optimal size of the locus-specific primer tails varying from 60 to 80 nt. The marker is then replaced by the operator repeats found in pSR plasmids. The following PCR-amplified genomic fragments (SGD coordinates, in bp) were used for insertion within 0.5 to 4 kb from the respective loci: 15160 to 15773 for HML, 294898 to 295245 for HMR, 90917 to 92521 for LEU, 239254 to 240927 for ARS1413 and 16431 to 17993 for TelVI. The PCR-amplified sequences of HMR and HML were cloned into a pAFS52-lacO plasmid and into the pAFS59-tetO bearing plasmid, respectively [18]. The lambda operator repeats were integrated at 197197-197310 on Chr3 to label the MAT locus. The ASF1 gene (196334 to 197080 on Chr10) or the RE (29083 to 29748 on Chr3) was replaced by the hygromycin resistance gene amplified from pAG32 (1743 bp). Alternatively, the RE (28987-29852 on Chr3) was replaced by a loxP-flanked hygromycin resistance gene amplified from pGI10. Galactose-induced expression of the recombinase from pHS47 induces deletion of the entire cassette (162 bp of plasmid sequences remain). The lambda operator (λO) system comprises a relatively small number of repeated binding sites: 64 repeats compared to the usual 128-256 repeats of TetO and LacO arrays.
The focus formed by binding of multiple repressor fusion proteins to the arrays integrated into the genome is easily detectable using conventional fluorescence microscopy (Fig 1A). The constitutive expression of the λcI repressor fused to YFP and its binding to a short 1.5 kb DNA λO fragment was not toxic to the cell, which displayed growth rates identical to unmodified yeast cultures. λcI-YFP fusion proteins diffuse freely in the cytoplasm, and in the nucleoplasm apart from the vacuole (Fig 1A). The focus formed at a site near the MAT locus (197 kb along right arm of Chr3) tagged using λO was positioned in the center of the nuclear lumen with the same frequency as the same genetic locus tagged with LacO.

Microscopy and image analysis
Live microscopy was performed using an Olympus IX-81 wide-field fluorescence microscope equipped with a CoolSNAPHQ camera (Roper Scientific), a Polychrome V (Till Photonics) and an electric piezo with an accuracy of 10 nm; cells were imaged through an Olympus 100X PLANAPO NA1.4 oil immersion objective. Yeast cells were spread on a concave microscopy slide filled with SD-agarose (YNB + 2% sugar/carbon source + 3% (w/v) agarose). Acquisition of CFP, mRFP and YFP was performed in 3D (21 focal planes at 0.2 μm distance intervals; 500 ms, 300 ms and 500 ms acquisition times, respectively). Fluorescence intensities of the acquired spots were identical in both cell types, demonstrating that the inserted operator arrays maintained the same size. The x, y and z coordinates for each focus were automatically measured using the "Spot distance" plugin for ImageJ (http://bigwww.epfl.ch/sage/soft/spotdistance/; D Sage, EPFL) [29]. This program uses multi-channel z-stack images to localize the center of the nucleus based on the background fluorescence of the CFP-lacI. The position of each focus is assigned to the center of gravity of the fluorescence around the brightest pixel found in this nucleus in a filtered version of the image. The signals are scored on 3D stacks using at least 200 nuclei, monitoring nuclear integrity and cell cycle stage through bud shape and nuclear diameter. Distances between pairs of loci can be determined in 3D using this ImageJ plugin. Statistical analysis was performed using the Wilcoxon test.

Mating type switching assay
Expression of the HO-endonuclease gene from pHO (URA selection) [32] was induced by addition of 2% filtered galactose (SIGMA) to a yeast culture growing exponentially in 2% raffinose. For quantitative PCR reactions, Bio-Rad SybrGreen Supermix was used in the presence of 30 ng of genomic DNA and 0.26 μM of each primer. PCR reactions were run on an ABI 7900. The MATa-specific primers were 5'-GGCATTACTCCACTTCAAGT (P1) and 5'-ATGTGAACCGCATGGGCAGT (P2). The MATα-specific primers were 5'-ATGTGAACCGCATGGGCAGT (P2) and 5'-GCAGCACGGAATATGGGACT (P3). Primers specific for the ARG5,6 locus (5'-CAAGGATCCAGCAAAGTTGGGTGAAGTATGGTA and 5'-GAAGGATCCAAATTTGTCTAGTGTGGGAACG) or for the actin locus were used for normalization. All qPCR assays were accompanied by reactions using dilutions of genomic DNA from the 0 h input sample of wt strains to assess the linearity of the PCR signal and to create calibration curves.
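The calibration step can be illustrated with a minimal sketch, assuming that threshold-cycle (Ct) values are fitted against the log10 of the input dilution by ordinary least squares. The dilution series and Ct values below are invented for illustration and are not data from this study; `fit_line` and `relative_quantity` are hypothetical helper names.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Invented example: ten-fold dilutions of an input sample and their Ct values.
dilutions = [1.0, 0.1, 0.01, 0.001]
ct_values = [20.0, 23.3, 26.7, 30.0]

log_input = [math.log10(d) for d in dilutions]
slope, intercept, r_squared = fit_line(log_input, ct_values)

# Amplification efficiency implied by the slope (1.0 means perfect doubling).
efficiency = 10 ** (-1.0 / slope) - 1.0

def relative_quantity(ct):
    """Quantity relative to the undiluted input, read off the calibration line."""
    return 10 ** ((ct - intercept) / slope)
```

An R² close to 1 indicates that the signal is linear over the dilution range; quantities for the MATa- and MATα-specific products would then be normalized to the reference locus measured in the same sample.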

Modeling experience and problem formulation
We consider a fiber with N nodes. Each state $E_k$ of this fiber, defined in a reference frame $R_k$, is represented by N nodes. On a chromosome represented as a polymer fiber, the center of gravity of each fluorescently labelled locus represents a node, so that N loci are represented by $(x_i, y_i, z_i)_{i=1 \ldots N}$.
We denote by $\theta_i$ the angle around node i. For each local analysis around node i, we make a change of variables from the Cartesian coordinates to the distances between node i and its neighboring nodes and the angle $\theta_i$. The main advantage of these new variables is their invariance under a change of reference frame. To analyze the variations of positions around node i, we introduce the normalized principal component analysis operator $PCA_{norm}$ and apply it to the set of experimental data. We denote by $C_i$ and $P_i$ the correlation matrix of the principal component analysis and the diagonal matrix of eigenvalues, respectively.
For all nuclei, we can write the coordinates of the red, green and blue loci in the referential linked to the microscope as $(x_r, y_r, z_r)$, $(x_g, y_g, z_g)$ and $(x_b, y_b, z_b)$. Taking the red locus as the origin, we can write

$$(x_r, y_r, z_r)_{R_{microscope}} \rightarrow (0, 0, 0)_{R_{microscope-nuclei}}$$
$$(x_g, y_g, z_g)_{R_{microscope}} \rightarrow (x_g - x_r, y_g - y_r, z_g - z_r)_{R_{microscope-nuclei}}$$
$$(x_b, y_b, z_b)_{R_{microscope}} \rightarrow (x_b - x_r, y_b - y_r, z_b - z_r)_{R_{microscope-nuclei}}$$

The referential $R_{microscope-nuclei}$ is called a pseudo-referential because its origin is inside the nucleus and its basis vectors are linked to the microscope. To overcome the orientation problem inherent to the nuclear sphericity, we define new variables simple enough to analyze the position of the three loci (S1 Fig). With $\vec{u} = (x_g - x_r, y_g - y_r, z_g - z_r)$ and $\vec{v} = (x_b - x_r, y_b - y_r, z_b - z_r)$, let us take:

$$d_1 = \|\vec{u}\|, \quad d_2 = \|\vec{v}\| \quad \text{and} \quad \theta = \arccos\left(\frac{\vec{u} \cdot \vec{v}}{d_1 d_2}\right)$$

where $\cdot$ is the scalar product. With these new variables each nucleus is represented by three variables, $(d_1, d_2, \theta)$, instead of nine, $(x_r, y_r, z_r, x_g, y_g, z_g, x_b, y_b, z_b)$.
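The change of variables can be sketched as follows. This is a minimal illustration, assuming that the two distances are measured from the red locus to the green and blue loci and that the angle is taken at the red locus; the function names and the example coordinates are hypothetical. The last lines check numerically that the variables do not change when the nucleus is rotated and translated in the microscope frame.

```python
import math

def sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return tuple(a - b for a, b in zip(p, q))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangle_variables(red, green, blue):
    """Return (d1, d2, theta) with the red locus taken as the origin.

    d1 is the red-green distance, d2 the red-blue distance and theta the
    angle at the red locus, in radians (an assumed labelling of the loci).
    """
    u = sub(green, red)
    v = sub(blue, red)
    d1, d2 = norm(u), norm(v)
    cos_theta = dot(u, v) / (d1 * d2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return d1, d2, math.acos(cos_theta)

def rotate_z(p, angle):
    """Rotate a point around the z axis."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

# Hypothetical coordinates (in micrometres) of the three foci in one nucleus.
red, green, blue = (1.0, 2.0, 0.5), (1.6, 2.0, 0.5), (1.0, 2.8, 0.5)
original = triangle_variables(red, green, blue)

# Same nucleus seen after a rotation and a translation of the reference frame.
moved_points = [tuple(c + 3.0 for c in rotate_z(p, 0.7)) for p in (red, green, blue)]
moved = triangle_variables(*moved_points)
```

Both `original` and `moved` describe the same triangle, which is why nuclei with arbitrary orientations can be pooled in a single analysis.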

Variables and referential
Projection. Each nucleus is represented by the three components $(d_1, d_2, \theta)$. We project all points onto the $(d_1, d_2)$, $(\theta, d_1)$ and $(\theta, d_2)$ planes (Fig 2A and S1 Fig). In each plane, the probability density function is estimated by the Parzen-Rosenblatt method [33].
Kernel density estimation. In the Parzen-Rosenblatt method, the estimation depends on the choice of the kernel K. We use a Gaussian kernel [34]. For (x, y) in $\mathbb{R}^2$, the kernel K is given by:

$$K(x, y) = \frac{1}{2\pi} \exp\left(-\frac{x^2 + y^2}{2}\right)$$

This kernel allows us to compute the probability density function f. The cumulative distribution function F(x, y) is plotted within ten regions, each corresponding to a 10% increment. Each density map $(X_i, X_j)$ is composed of ten levels $U_k = (X_i^{kl}, X_j^{kl})_l$. Let us take two different experiments A and B, with $(U_k)_{k=1 \ldots 10}$ the level intensities associated with A and $(V_k)_{k=1 \ldots 10}$ the level intensities associated with B. We denote by $m_U^k$ and $m_V^k$ the associated mean vectors, and by $s_U^k$ and $s_V^k$ the associated standard deviation vectors, at level k. For two levels, $k_1$ in experiment A and $k_2$ in experiment B, we define two coefficients, $r^{1ij}_{k_1 k_2}$ and $r^{2ij}_{k_1 k_2}$, where $r^{1ij}_{k_1 k_2}$ is a correlation in first approximation and $r^{2ij}_{k_1 k_2}$ is a correlation in second approximation, with $r^{1ij}_{k_1 k_2} \geq r^{2ij}_{k_1 k_2}$.
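The density estimation in one projection plane can be sketched as below. This is a minimal illustration of a Parzen-Rosenblatt estimate with a standard bivariate Gaussian kernel; the bandwidth `h`, the example points and the rule used in `density_levels` to split observations into ten equally populated levels (level 1 holding the densest 10%) are assumptions made for the example, not the settings of this study.

```python
import math

def gaussian_kernel(x, y):
    """Standard bivariate Gaussian kernel."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

def kde(points, x, y, h):
    """Parzen-Rosenblatt density estimate at (x, y); h is an assumed bandwidth."""
    n = len(points)
    total = sum(gaussian_kernel((x - px) / h, (y - py) / h) for px, py in points)
    return total / (n * h * h)

def density_levels(points, h, n_levels=10):
    """Assign each observation to one of n_levels equally populated levels.

    Level 1 holds the observations with the highest estimated density,
    level n_levels those with the lowest (an assumed convention).
    """
    dens = [kde(points, px, py, h) for px, py in points]
    order = sorted(range(len(points)), key=lambda i: dens[i], reverse=True)
    levels = [0] * len(points)
    for rank, i in enumerate(order):
        levels[i] = 1 + (rank * n_levels) // len(points)
    return levels

# Hypothetical projected observations, e.g. (theta, d1) pairs on a line.
points = [(0.1 * i, 0.0) for i in range(20)]
levels = density_levels(points, h=0.5)
```

Comparing two experiments then amounts to comparing, level by level, the sets of observations that fall into each density increment.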

Inverse problem and abstract model
We developed an original abstract model based on the assumption that each point of a fiber (or polymer) moves in a specific area (Fig 3A). Consider a dynamic polymer in the plane on which we can define N points $(x_i, y_i)_{i=1 \ldots N}$. For each point $(x_i, y_i)$, we define the survival zone as the smallest closed set $P_i$ in which $(x_i, y_i)$ takes its values during temporal fluctuations. Let us denote by $Z_{V_1}, Z_{V_2}, \ldots, Z_{V_N}$ the survival zones of $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, defined as closed sets of $\mathbb{R}^2$.
We can define an abstract polymer from the knowledge of the N survival zones. The abstract fiber will be close to the real fiber provided that the survival zones are small and the number of nodes is large enough. We can build several random configurations of our polymer knowing the survival zones, by drawing each node as $(x_i, y_i) = rand_{Z_{V_i}}$, where $rand_{Z_{V_i}}$ denotes the uniform random function in the set $Z_{V_i}$. Drawing all N nodes in this way constructs one abstract configuration. For the given set of abstract configurations, we can define new system variables, where M' denotes the number of abstract configurations and N the number of nodes. For a given set of points, the system of survival zones that best correlates with our experimental data has to be determined. Let us denote by $\Delta C$ the difference between the experimental correlation matrix and the abstract correlation matrix.

Fig 3. [...] each locus (or node), which can predict biologically relevant features. B-C) Survival zones (Z) of HML, MAT and HMR. The initial position of the 3 loci is set based on the estimated conformation for Chr3 (B). Iterations were run using Eq 25 (see Methods) and the statistically most significant zones are represented for wt, asf1 mutants and strains in which the recombination enhancer element was deleted (C).

The optimal abstract model is defined by:

$$\{(x_1, y_1, \varepsilon^1_1, \varepsilon^1_2), (x_2, y_2, \varepsilon^2_1, \varepsilon^2_2), \ldots\} = \arg\min_{D} \|\Delta C\|_{frob} \quad (25)$$

where $\|\cdot\|_{frob}$ denotes the Frobenius norm of a square matrix and D is the base interval.
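The construction of abstract configurations and their comparison with experimental data can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each survival zone is taken to be a rectangle centred on a node with half-widths ε1 and ε2, the system variables are the two distances and the angle at the middle node, and the correlation matrix is the Pearson correlation matrix of these variables. All function names and the example centres are hypothetical.

```python
import math
import random

def sample_zone(center, eps, rng):
    """Draw one point uniformly in [x-e1, x+e1] x [y-e2, y+e2] (assumed zone shape)."""
    (x, y), (e1, e2) = center, eps
    return (rng.uniform(x - e1, x + e1), rng.uniform(y - e2, y + e2))

def variables(p1, p2, p3):
    """(d1, d2, theta) at the middle node p2 of a three-node planar fiber."""
    u = (p1[0] - p2[0], p1[1] - p2[1])
    v = (p3[0] - p2[0], p3[1] - p2[1])
    d1, d2 = math.hypot(*u), math.hypot(*v)
    c = (u[0] * v[0] + u[1] * v[1]) / (d1 * d2)
    return d1, d2, math.acos(max(-1.0, min(1.0, c)))

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / n
            for b in range(k)] for a in range(k)]
    return [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(k)]
            for a in range(k)]

def frobenius_distance(A, B):
    """Frobenius norm of the difference between two matrices of equal shape."""
    return math.sqrt(sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)))

def abstract_correlation(centers, epsilons, n_config, seed=0):
    """Correlation matrix of (d1, d2, theta) over n_config random configurations."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_config):
        pts = [sample_zone(c, e, rng) for c, e in zip(centers, epsilons)]
        rows.append(variables(*pts))
    return correlation_matrix(rows)

# Hypothetical centres and half-widths for three nodes (arbitrary units).
centers = [(0.0, 0.0), (2.0, 3.0), (5.0, 0.0)]
epsilons = [(1.0, 1.0), (1.0, 1.0), (1.5, 1.5)]
C_abstract = abstract_correlation(centers, epsilons, n_config=500)
```

An optimization would repeat `abstract_correlation` over a grid of half-widths and keep the set that minimizes `frobenius_distance` to the experimental correlation matrix.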

Inverse problem solving
Eq (25) is solved iteratively. The center of each survival zone is fixed by conservation of proportions between the three distances. On chromosome 3, the center of the HML positions is taken at (0,0) and the center of HMR is fixed at (5,0). If $d_{HML-MAT}$, $d_{MAT-HMR}$ and $d_{HML-HMR}$ are the experimental mean distances, the lengths $r_0$ and $r_1$ are given by:

$$r_0 = 5 \, \frac{d_{HML-MAT}}{d_{HML-HMR}}, \qquad r_1 = 5 \, \frac{d_{MAT-HMR}}{d_{HML-HMR}}$$

The center of the MAT zone is defined as the intersection of two circles: the first has center (0,0) and radius $r_0$, the second has center (5,0) and radius $r_1$.
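The placement of the MAT zone center can be sketched as a circle-circle intersection. This is a minimal illustration, assuming that the radii are the mean HML-MAT and MAT-HMR distances rescaled so that the HML-HMR distance equals 5, and that the intersection with positive y is retained; the function name and the example distances are hypothetical.

```python
import math

def mat_zone_center(d_hml_mat, d_mat_hmr, d_hml_hmr, span=5.0):
    """Return (r0, r1, (x, y)) for HML fixed at (0, 0) and HMR at (span, 0).

    The mean distances are rescaled so that HML-HMR equals span, which keeps
    the proportions between the three distances (assumed reading of the text).
    """
    scale = span / d_hml_hmr
    r0 = scale * d_hml_mat
    r1 = scale * d_mat_hmr
    # Intersection of the circle (0, 0; r0) with the circle (span, 0; r1).
    x = (r0 ** 2 - r1 ** 2 + span ** 2) / (2.0 * span)
    y_squared = r0 ** 2 - x ** 2
    if y_squared < 0:
        raise ValueError("circles do not intersect: distances violate the triangle inequality")
    # The positive root is kept by convention; the mirror image is equivalent.
    return r0, r1, (x, math.sqrt(y_squared))

# Hypothetical mean distances (arbitrary units) forming a 3-4-5 triangle.
r0, r1, center = mat_zone_center(3.0, 4.0, 5.0)
```

With these example values the MAT center lies at (1.8, 2.4), at distance 3 from (0,0) and distance 4 from (5,0).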
Iterations are made using the variables $\varepsilon^1_1$, $\varepsilon^1_2$, $\varepsilon^2_1$, $\varepsilon^2_2$, $\varepsilon^3_1$ and $\varepsilon^3_2$, in the range [0;5]. Our code has been parallelized on one hundred CPU cards using MPI (Message Passing Interface). The step of subdivision for the iterative resolution is taken at 0.2 to yield $5^{12}$ abstract configurations. Each experiment represents ~100 hours of calculation; computing the survival zones for all conditions tested took 1500 hours of calculation.
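The size of the search grid follows from simple arithmetic, sketched below under the assumption that each of the six variables takes one value per subdivision of the range [0;5] (counting both end points would give 26 values per variable instead of 25).

```python
step = 0.2
lower, upper = 0.0, 5.0
n_variables = 6  # two half-widths for each of the three survival zones

# Number of subdivisions of the range for one variable.
values_per_variable = round((upper - lower) / step)

# Total number of combinations explored by an exhaustive grid search.
n_configurations = values_per_variable ** n_variables
```

Here `values_per_variable` is 25, so the grid holds 25^6 = 5^12, about 2.4 × 10^8, combinations.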

Results and Discussion
Differentiating geometric distribution of three distinct chromosome loci
Combinations of the widely used Tet and Lac repressors, each fused to a fluorophore with a distinct emission spectrum, have proven to be a valuable tool for simultaneous visualization of two loci [18,19,30,35,36,37]. In order to simultaneously label three distinct loci to track their relative position, we adapted a third system based on the bacterial λ repressor operator system (λO and λcI) [19] for use in yeast (Fig 1A). Operator sequences were inserted by homologous recombination near the HML, MAT and HMR or near the HML, LEU and MAT loci on Chr3. Acquisition of fluorescent RFP, YFP and CFP proteins fused to the Lac, Tet and λ repressors, respectively, was performed in 3D.
We introduce a normalized principal components analysis (PCA) operator to convert the automatically measured x, y and z coordinates for each focus into geometrical variables (see Methods). The resulting coordinates of the formed triangle, here d1 (side of the triangle delimited by HML-MAT), d2 (MAT-HMR) and d3 (HMR-HML), were plotted on the same graph for all analyzed cells (Fig 1B). The large variation in positional combinations of the three loci reflects the highly dynamic nature of chromosome folding in yeast [38]. The MAT locus is mobile [18,39,40] but the two silent mating type loci, despite their frequent juxtaposition, also change position relative to each other within 10-30 seconds [19]. 50% of the most frequent positions observed for the three loci were included in a volume whose 3D surface is colored in red for MATa and in green for MATα cells (Fig 1B). A large fraction of the d1_d2_d3 combinations are correlated in both mating types (intersection between the red and green iso-volumes). They represent the most probable conformations of Chr3 and are as such readily detected by other methods. Strikingly, subsets of relative triangular positions of the three mating type loci are specific to MATa or MATα cells. Our goal was to characterize the folding features leading to this variant part of the distribution and to correlate them with donor preference.

Folding of the left arm of chromosome 3 is mating type specific
The distribution of angles within the triangle formed by MAT, HML and HMR differed significantly between MATa and MATα cells (S1 and S2A Figs), again suggesting that certain conformations of Chr3 could be mating type specific. To extract folding features from the distribution of the three loci, we generated 2D projections of geometrical coordinates recorded for all nuclei. In the resulting density maps (Fig 2 and S2-S7 Figs), data are grouped within 10% color-coded increments of occurrence. For each dataset, nine distinct maps are generated and compared to the equivalent maps of another dataset using PCA (see Methods). Correlation coefficients (c) between duplicate experiments using the same strain were greater than 0.8 (S2C Fig; four independent experiments, 223<n<559). As a control, density maps resulting from a simulation of data obtained using random positioning of three loci within a sphere of 2 μm diameter (one example is shown in Fig 2A) were significantly different from experimental data (c<0.1). Furthermore, the distribution of three independent loci (MAT on Chr3, the right telomere of Chr5 and ARS1412 on Chr14) did not correlate with the ones on Chr3 (c<0.6; n = 405). Thus, density maps obtained for the distribution of the three mating type loci on Chr3 are non-random. Strikingly, the maps obtained in a-cells were distinct from those in α-cells. This was surprising because the nuclear positions of individually labelled mating type loci were previously shown not to be statistically different in a- and α-cells [26]. Density maps differed at the 10-50% contour levels around HML and MAT (S2C Fig). The angles formed at HML were significantly smaller in a fraction (~30%) of α-cells compared to a-cells (Fig 2B), suggesting that, in α-cells, loci on the left arm of Chr3 are more confined with respect to those on the right arm.
Also, in about 20% of cells, the angle at MAT was much smaller in a-cells than in α-cells (80°-140°) for a similar distribution of HML-MAT distances. These data reveal, for the first time, that positioning of HML relative to MAT and HMR differs between MATa and MATα cells.
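The random-positioning control mentioned above can be sketched as follows. This is a minimal illustration, assuming that the three loci are drawn independently and uniformly inside a sphere of 2 μm diameter; the rejection-sampling scheme and the function names are choices made for this example, and the way the original simulation was generated is not specified here.

```python
import math
import random

def random_point_in_sphere(radius, rng):
    """Uniform point in a sphere, by rejection sampling from the bounding cube."""
    while True:
        p = tuple(rng.uniform(-radius, radius) for _ in range(3))
        if sum(c * c for c in p) <= radius * radius:
            return p

def simulate_random_triangles(n_nuclei, diameter_um=2.0, seed=0):
    """Return one (d1, d2, d3) triple of inter-locus distances per simulated nucleus."""
    rng = random.Random(seed)
    radius = diameter_um / 2.0
    triangles = []
    for _ in range(n_nuclei):
        a, b, c = (random_point_in_sphere(radius, rng) for _ in range(3))
        triangles.append((math.dist(a, b), math.dist(b, c), math.dist(c, a)))
    return triangles

triangles = simulate_random_triangles(2000)
mean_distance = sum(sum(t) for t in triangles) / (3 * len(triangles))
```

The simulated triples can be passed through the same projection and density-map steps as the experimental data to obtain the reference maps for unlinked, unconstrained loci.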
Geometric analysis of another combination of three loci provides additional detail (Fig 2C and S3 Fig). We labeled two loci, HML and LEU2, on the left arm and one, MAT, on the right. Density maps confirm that a portion of the left arm of Chr3 is largely compressed in MATa cells. For example, the angle at LEU2 formed with the vector pointing towards HML was, in the 30% most representative cells, significantly smaller in MATa than in MATα cells (c = 0.27). This suggests that HML and LEU2 roam in a similar volume with respect to MAT in MATa but not in MATα cells. Increased constraint of the LEU2 locus supports the view that the left arm, or at least a large region of it, can crumple and shorten in MATa cells, dynamically changing between an extended conformation and a transiently more compressed one.

Relative physical constraint characterizes differential chromosome folding
Our goal was to determine whether the linkage to a chromosome fiber contributes to the relative positions of three distinct loci. We developed an abstract polymer model to define the physical constraint imposed by the fiber (Eq (25) in Methods). Each nucleus can be taken as a state of the chromosome fiber. Hence, we have up to five hundred states of our system to describe our data. In an optimization approach, we assume a known configuration of Chr3 as the initial configuration (Fig 3A) to reduce the number of parameters to be estimated. The abstract model is based on the assumption that each point of a fiber (or polymer) moves uniformly in a specific area or survival zone (Z). To solve Eq (25) we used an iterative method based on the variables $\varepsilon^1_1$, $\varepsilon^1_2$, $\varepsilon^2_1$, $\varepsilon^2_2$, $\varepsilon^3_1$ and $\varepsilon^3_2$, in the range [0;5]. Our code has been parallelized on one hundred CPU cards. The step of subdivision for the iterative resolution is taken at 0.2 to yield $5^{12}$ abstract configurations. We can thus determine interactions between the survival zones Z MAT, Z HML and Z HMR as the extent to which two points roam in the same space while being under the physical constraint of the fiber. Z MAT and Z HMR partially overlap in a-cells (Fig 3), in agreement with the fact that chromatin in yeast is highly flexible [38], notably between these two loci 100 kb apart on the same chromosome arm. In contrast, Z HML is excluded from the two other survival zones. In a-cells, Z HML is significantly greater than in α-cells, consistent with our finding that the left arm is more crumpled and flexible in a subset of nuclei (Fig 2). In addition, we asked whether our abstract mathematical model could inform on physical properties of the chromatin fiber in yeast. First simulations (S4 Fig) suggest that the determined survival zones Z of the three mating type loci correspond to rather flexible, moderately constrained polymer fibers.
For example, the survival zones of three linked loci corresponding to Z 1 = [-1;1]×[-1;1], Z 2 = [-1;5]×[-1;5] and Z 3 = [3.5;6.5]×[-1.5;1.5] are closer to simulation B than to A or C (compare experimental data in Fig 3 to S4 Fig). The simulated survival zones Z2 and Z3 in our example are separated by less than 0.5. If we assign a value of Z = 5 to the ~100 kb contour length that corresponds to the separation between MAT (Z2) and HMR (Z3), the intervening fiber's flexibility is <10 kb. In future work, using different sets of multiple labels at varying distances along chromosomes, the abstract mathematical model presented in this study and polymer modeling will allow chromatin fiber properties to be better defined.
Chromatin structural properties control mating type specific folding of chromosome 3
We then tested whether our abstract model could be applied to predict functional consequences in mutant cells. We deleted the ASF1 gene, coding for a histone chaperone. Asf1 was previously shown to regulate juxtaposition of the HM loci [19]. An increase in the random kinetics of the loci is expected due to general chromatin decompaction in the absence of Asf1 [41]. All distances measured between the three loci on Chr3 in asf1 strains increased by nearly 20% (S5 and S6 Figs; p<0.007). Interestingly, the distribution of angles formed around each locus was the same in wt and asf1 mutant MATa cells but varied in MATα cells. This suggests that decompaction of chromatin in the absence of Asf1 leads to an overall extension of the chromatin fiber without compromising the folding properties of Chr3 in a-cells. In α-cells, however, density maps change to resemble those of wt MATa cells (Fig 4 and S5 Fig). Computing the survival zones of HML relative to MAT and HMR in α-cells showed that HML is less constrained in asf1 mutants than in wild type (Fig 3). Z MAT and Z HMR partially overlap and the zone of HML expands, again similar to the situation determined in wt a-cells. Thus, the extended, stiffer conformation of the left arm of Chr3 in α-cells seems to be dependent on chromatin structural features brought by Asf1. The lesser impact of chromatin structure in a-cells suggests that the folding of the left arm of Chr3 in MATa is intrinsic and that in α-cells certain conformations are excluded due to chromatin structural properties. To test whether a greater contact probability between HML and the right arm of Chr3 would favor recombination, we determined donor preference in asf1 mutant cells. We found that in α-cells, usage of HML increased nearly four-fold in the absence of Asf1.
Hence, reduced physical constraint of the left arm of Chr3 in a subset of cells correlates with improved recombination competence.
Finally, we asked whether the recombination enhancer (RE), a <1 kb DNA element located 17 kb to the right of HMLα and shown to be required for recombinational competence of a large region (40 kb) near the left end of Chr3 in MATa [22], is involved in folding of Chr3. Deletion of the RE region reduces the use of HML to repair MATa from >80% to <10% [22,23]. In MATα, the RE is in a heterochromatin configuration and non-functional, leaving the left arm incompetent for recombination [42]. We found that the distribution of the three loci was altered in both mating types in the absence of the RE, although the differences in density maps are more pronounced in α- than in a-cells (Fig 4, S6 and S7 Figs). In a-cells, only the most frequently observed wt conformations of the three loci remained. The relative positioning of HML and MAT varied and this was more noticeable in α-cells (S7 and S8 Figs). To verify that this effect was due to the RE element rather than the insertion of the hygromycin resistance gene, we deleted the RE region using the cre/lox method. The generated density maps of the relative positions of the three loci, although in a different manner, also show differences to the wt maps. Although we cannot formally rule out that replacing the RE with heterologous sequences (the deletion of 803 bp or the addition of 1078 bp within the left arm, 29 kb from the telomere) might affect chromosome folding, RE specific sequences appear to mediate some of the detected cell type dependent differences.
It was previously debated whether the RE may be involved in changing the localization or the higher-order organization of the entire left arm of Chr3 to make it more flexible in pairing with the recipient site in MATa cells [24,25]. Our data suggest that folding of the left arm is such that at a given moment it can be in the proximity of the MAT locus without being pre-folded or permanently in a recombination favorable position (see model Fig 5). Thus, in MATa cells, donor preference seems to only be imposed at commitment to recombination following cleavage of the MAT locus through recruitment of repair and recombination factors to RE elements or even synthesis of non-coding transcripts [23,43]. Strikingly, in α-cells, the repressed RE element appears, at least in part, to be responsible for the extended conformation of the left arm of Chr3. Hence, the heterochromatin complex at the RE may sequester part of the chromosome, an organization that could counteract looping and may limit recombination aptitude of this chromosome arm.

Conclusion
We present a new system for labelling and visualizing a specific chromosomal site by fluorescence microscopy in living cells. We show that a small sequence element can influence the folding of an entire chromosome. Our new methodology allows three individual chromosomal sites to be imaged simultaneously in living yeast. Quantitative data can be obtained because all cells are labeled identically and permanently. The computational strategy used to evaluate the relative distribution of three objects simultaneously in 3D represents a powerful tool for studying chromosome biology and is applicable to the analysis of any three simultaneously labelled sites in any cell type. It allows the identification of transient and unstable conformations of chromosomes which are statistically not the most frequently detected ones, yet may be relevant for regulating DNA processes. This view is also supported by recent studies using polymer modeling of chromatin which revealed that fluctuations in transcriptional activity correlated with probabilistic organization of the Tsix gene domain [44] and with enhancer-promoter communication via modulatory chromatin looping [45]. Our method is highly complementary to genome-wide chromosome conformation capture approaches and necessary to validate models from simulations. It is also amenable to investigation of chromosomal rearrangements governing changes in DNA-related processes in higher eukaryotes.
Supporting Information
S1 Fig. Variables and referential determination. A) To change the 3D referential, geometric variables are calculated from the microscope coordinates. A single data point is represented for each nucleus by the three components d1, d2, d3 and their associated angles. 3D data are projected onto a unique 2D plane (e.g. d1/q); 9 projections can be generated from each data set. Density maps (warm colors for high density and cold colors for low density, in 10% increments) are generated for each projection in 2D. Three examples are shown. B) Kernel density estimation to compute the probability of the density function f.

Boxes represent the interquartile range (IQR) between lower and upper 25% quartiles, red bands correspond to the median distance or angle, and outliers are indicated by crosses. A median distribution Wilcoxon test was used to obtain p values. B) All generated maps are represented, and correspond to the triangulation between the position of TetR-mRFP (HML), CFP-LacI (HMR) and YFP-λcI (MAT) foci relative to one distance or angle at each locus. Correlation coefficients (c) obtained from the comparison between two strains at each of the ten incremental combination frequency levels are shown in Table 1. Red, dark pink and pink shades correspond to c<0.5, c<0.6 and c<0.7, respectively. (EPS)

Boxes represent the interquartile range (IQR) between lower and upper 25% quartiles, red bands correspond to the median distance or angle, and outliers are indicated by crosses. A median distribution Wilcoxon test was used to obtain p values. B) All generated density maps are represented, and correspond to the triangulation between the position of the three foci relative to one distance or angle at each locus. Correlation coefficients (c) obtained from the comparison between two strains at each of the ten incremental combination frequency levels are shown in Table 1. Red, dark pink and pink shades correspond to c<0.5, c<0.6 and c<0.7, respectively. (EPS)

S4 Fig. Simulation of the relative physical constraint of three linked loci using our abstract mathematical model to assess the properties of the chromatin fiber. Survival zones Z of three linked loci whose parameters were flexible (A), moderately constrained (B) or constrained (C) were simulated using the same iterative algorithm as for

Boxplots represent the distances measured between the three tagged loci in wild-type and asf1 MATa (n = 242) and MATα (n = 116) cells. Boxplots of pairwise distances between TetR-mRFP (HML), CFP-LacI (HMR) and YFP-λcI (MAT) foci (left panel) and the angles formed at each locus (right panel) in G1. Boxes represent the interquartile range (IQR) between lower and upper 25% quartiles, red bands correspond to the median distance or angle, and outliers are indicated by crosses. A median distribution Wilcoxon test was used to obtain p values. B) Correlation coefficients (c) obtained from the comparison between two strains at each of the ten incremental combination frequency levels are shown in Table 1. Red, dark pink and pink shades correspond to c<0.5, c<0.6 and c<0.7, respectively. (EPS)

A) Boxplots representing the distances between the three tagged loci and their comparison between wild-type and re deleted in MATa (n = 500) and MATα (n = 324). Boxplots of pairwise distances between TetR-mRFP (HML), CFP-LacI (HMR) and YFP-λcI (MAT) foci (left panel) and the angles formed at each locus (right panel) in G1. Boxes represent the interquartile range (IQR) between lower and upper 25% quartiles, red bands correspond to the median distance or angle, and outliers are indicated by crosses. A median distribution Wilcoxon test was used to obtain p values. B) Correlation coefficients (c) obtained from the comparison between two strains at each of the ten incremental combination frequency levels are shown in Table 1. Red, dark pink and pink shades correspond to c<0.5, c<0.6 and c<0.7, respectively. (EPS)
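
The density maps and correlation coefficients mentioned in these legends can be illustrated with a minimal sketch. It assumes a fixed-bandwidth Gaussian kernel evaluated on a regular grid and a plain Pearson correlation between two whole maps; the bandwidths, grid, data values, and function names are hypothetical, and the sketch does not reproduce the comparison at ten incremental frequency levels described above.

```python
import math


def density_map(points, xs, ys, bw_x, bw_y):
    """Gaussian kernel density estimate of 2D points on the grid xs by ys.

    bw_x and bw_y are kernel standard deviations along each axis (hypothetical,
    fixed values). The map is rescaled to sum to 1, so the kernel's constant
    prefactor is omitted.
    """
    grid = [
        [
            sum(
                math.exp(-0.5 * (((x - px) / bw_x) ** 2 + ((y - py) / bw_y) ** 2))
                for px, py in points
            )
            for x in xs
        ]
        for y in ys
    ]
    total = sum(sum(row) for row in grid)
    return [[value / total for value in row] for row in grid]


def map_correlation(map_a, map_b):
    """Pearson correlation between two density maps defined on the same grid."""
    a = [v for row in map_a for v in row]
    b = [v for row in map_b for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Made-up (distance, angle in degrees) pairs standing in for two strains.
strain_a = [(0.4, 50.0), (0.5, 60.0), (0.6, 70.0)]
strain_b = [(0.9, 110.0), (1.0, 120.0), (1.1, 130.0)]
xs = [0.1 * i for i in range(16)]   # distance axis, 0 to 1.5
ys = [10.0 * i for i in range(19)]  # angle axis, 0 to 180 degrees

map_a = density_map(strain_a, xs, ys, bw_x=0.1, bw_y=10.0)
map_b = density_map(strain_b, xs, ys, bw_x=0.1, bw_y=10.0)
c = map_correlation(map_a, map_b)
```

Separate bandwidths per axis are used here because distances and angles are on very different numerical scales; a data-driven bandwidth rule could be substituted.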