A genome-wide comprehensive analysis of nucleosome positioning in yeast

doi:10.1371/journal.pcbi.1011799

Fig 1.

Pearson clusters and fPCA considering all protein-coding genes.

(A) The silhouette plot clearly indicates that the data can be best divided into two clusters, and creating more groups would only decrease the difference between the clusters. (B) and (C) display the profiles for each cluster. Large values are given in copper, low values are black, and the colour gradient in between is uniformly scaled. It is therefore a perceptually uniform representation. Both heatmaps are normalised independently, such that their respective largest value is displayed in the strongest copper hue and their lowest value in black. Unfortunately, it is difficult to judge visually why these clusters were established. This is particularly true because the Pearson index measures only general trends in the profile, and it does not take the scaling into account. Each row represents a gene, and the x-axis shows the position along the coding region, with the +1 nucleosome defined to be at position 0 bp. The colour code represents MNase-seq amplitude, i.e. copper values show large MNase-seq signal values, whereas dark areas indicate a low amplitude. (D) The cartoon presents the hypothesised differences that could occur between the Pearson clusters. Due to the well-positioned nucleosomes and the wave-like structure of MNase-seq data, we presume that the Pearson correlation measures coordinated nucleosome positioning along the gene. If two profiles (orange and blue) are in two different clusters, this could indicate either a shift in the exact nucleosome positions (left) or a general trend in the MNase-seq signal amplitude, i.e. either increasing or decreasing (right). (E) Pearson clusters considering all genes are linearly separable with respect to their fPC scores. This indicates that two fPCs are sufficient to interpret the gene groups. We use the symmetric Jensen-Shannon (JS) distance to describe separability between the clusters along fPC 1 and fPC 2. The JS distance between the cluster distributions is much larger for fPC 2 than for fPC 1. 
Orange and blue each indicate one group; the dashed line symbolises the best linear separation using an SVM. The x-axis represents the score of the first fPC ζ¹, the y-axis gives the score for the second fPC ζ². Both axes are scaled to the same range; points outside the range (29) were included in the analysis but not plotted. (F) The major fPCs describe predominantly position-dependent scaling (transparent black lines, fPC 1) and collective nucleosome phasing (transparent black arrows, fPC 2). The second fPC in WT indicates an increasing or decreasing signal magnitude as a function of distance from the TSS, suggesting stronger or weaker presence (corresponding to panel D right). The mean is given as a dashed black line; a positive contribution (i.e. adding the fPC to the mean) is displayed in magenta, and a negative contribution (i.e. subtracting it from the mean) is shown in green. Trends over the entire array are indicated by grey arrows. When exact positions were seemingly not affected by the fPC, we marked the positions with a grey vertical bar. See Methods for more information about how the plots were produced.
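The separability measure in panel E can be illustrated with a minimal sketch. It assumes that the JS distance is computed between histograms of the two clusters' scores along one fPC axis; the bin count, the function names, and the synthetic scores are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def js_distance(p, q, base=2.0):
    """Jensen-Shannon distance: square root of the JS divergence of p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Kullback-Leibler divergence; bins with a == 0 contribute nothing.
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) / np.log(base)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return float(np.sqrt(max(divergence, 0.0)))


def cluster_js_distance(scores, labels, n_bins=30):
    """JS distance between the score histograms of cluster 0 and cluster 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Shared bin edges so that both histograms live on the same support.
    edges = np.histogram_bin_edges(scores, bins=n_bins)
    hist_a, _ = np.histogram(scores[labels == 0], bins=edges)
    hist_b, _ = np.histogram(scores[labels == 1], bins=edges)
    return js_distance(hist_a, hist_b)


# Synthetic scores (assumption): the clusters overlap along fPC 1
# and are shifted against each other along fPC 2.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 500)
fpc1 = rng.normal(0.0, 1.0, size=1000)
fpc2 = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])

d1 = cluster_js_distance(fpc1, labels)
d2 = cluster_js_distance(fpc2, labels)
print(f"JS distance along fPC 1: {d1:.3f}, along fPC 2: {d2:.3f}")
```

With base-2 logarithms the distance lies between 0 (identical distributions) and 1 (disjoint support), so a larger value along one axis means that this axis separates the clusters better.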

Fig 2.

Nucleosome phasing is strictly limited to the gene body, which is maintained by Rsc8 but antagonised by Chd1.

The cluster distribution plots in panels A-C show the distribution of both gene groups with respect to the small-gene fPCs of WT, rsc8-depleted cells, and chd1Δ strains. Orange and blue indicate the two clusters, and the black dashed line shows the separating boundary determined by a linear SVM. The histograms present the cluster distribution with respect to each axis. Panels D-F display the transformation of the average small-gene nucleosome profile by the two major fPCs for WT, rsc8 depletion, and chd1Δ, respectively. The dashed black line as well as the solid lines in magenta and green display the mean, a positive contribution of the fPC, and a negative contribution, respectively. Turquoise arrows indicate the effect on the +1, dark blue arrows on the +4, and orange arrows on the +6 position. (A) When plotting the cluster distribution with respect to small-gene fPCs in WT, the linear separability is lost. (B) The fPCs of the rsc8-depleted strain maintain the linear separability, despite the fact that the groups were established for all genes. As we interpret the Pearson clusters as similarity in positioning between genes of 1000 bp mediated by chromatin remodelers, this possibly suggests that positioning outside coding regions influences nucleosomes inside and vice versa. (C) Whilst most mutants that were rsc8 depleted could discriminate between the all-gene clusters using small-gene fPCs, this separability is lost again in rsc8-depleted chd1Δ, revealing partly antagonistic roles for Rsc8 and Chd1 in maintaining gene-specific phasing. (D) The effect of the two fPCs sheds light on why the Pearson groups are not linearly separable in WT using small-gene fPCs. The second fPC changes its regular wave-like form to much broader peaks and valleys after the +2 nucleosome, which corresponds to approximately the size of the smallest genes in budding yeast. 
(E) Nucleosome positioning in rsc8-depleted conditions is clearly visible along the entire considered region, despite the included genes being smaller. This suggests that gene-specific nucleosome arrangement cannot be maintained. It is of note that the phasing also changes for the +1 nucleosome, and the NDR can seemingly not be conserved. (F) On the other hand, rsc8-depleted chd1Δ loses the regular wave-like shape of its second fPC after the +2 nucleosome to form broader peaks, indicating the presence of gene-specific nucleosome profiles as in WT conditions. All axes are scaled to the same size for each strain; shapes and amplitudes are therefore comparable (see Methods for more details).
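How a component transforms the mean profile (magenta for adding it, green for subtracting it) can be sketched as follows. This is a simplified stand-in and not the authors' fPCA: it applies ordinary PCA to profiles sampled on a grid. The scaling of the perturbation by one standard deviation, the function names, and the toy profiles with a 165 bp repeat are assumptions.

```python
import numpy as np


def principal_components(profiles, n_components=2):
    """Discretised PCA via SVD: mean, components, scores, and variances."""
    profiles = np.asarray(profiles, dtype=float)
    mean = profiles.mean(axis=0)
    centred = profiles - mean
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]            # one component per row
    scores = centred @ components.T           # per-gene score for each component
    variances = singular_values[:n_components] ** 2 / (profiles.shape[0] - 1)
    return mean, components, scores, variances


def perturbed_means(mean, component, variance, scale=1.0):
    """Mean plus and minus a scaled component (positive, negative contribution)."""
    delta = scale * np.sqrt(variance) * component
    return mean + delta, mean - delta


# Toy wave-like profiles (assumption): amplitude and phase vary between genes.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1000.0, 200)
amplitude = rng.normal(1.0, 0.2, size=(300, 1))
shift = rng.normal(0.0, 10.0, size=(300, 1))
profiles = amplitude * (1.0 + np.cos(2.0 * np.pi * (x - shift) / 165.0))

mean, components, scores, variances = principal_components(profiles)
plus, minus = perturbed_means(mean, components[0], variances[0])
```

Plotting `mean`, `plus`, and `minus` against `x` gives the same kind of display as the panels described above: a component that mostly rescales the peaks moves both curves vertically, whereas a component that encodes phasing moves the peaks sideways.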

Fig 3.

The fPCs, their gene specific scores, and the discriminating boundary explain collective phasing and how this changes in chd1Δ with respect to WT conditions.

The figure shows the cluster distribution with respect to the fPC scores, the impact of the determined fPCs, and the location-specific impact of the separating boundary for WT (i.e. panels A-C) and chd1Δ conditions (i.e. panels D-F). Panels A and D show the fPC scores of WT and chd1Δ strains, respectively. For the latter, the boundary slope changed notably (black dashed line). As indicated by the fPCs in panels B and E for WT and chd1Δ, respectively, the functional description of the data changes. Indeed, the second fPC of chd1Δ abates quickly after the +1, with a strong effect on the +2 (grey arrows). The dashed black line as well as the solid lines in magenta and green indicate the mean, a positive contribution of the fPC, and a negative contribution, respectively. When exact positions were seemingly not affected by the fPC, we marked the positions with a grey vertical bar. General trends are given in grey arrows along the gene. The location-specific impact of the separating boundary is given in panel C for WT and panel F for chd1Δ strains. Interestingly, although the median profiles of the clusters (blue and orange) are clearly different with respect to the +1 and +2 in WT conditions, later positions are much more important for allocating a profile to a particular group (grey areas, mean in black). Whilst this is also true for chd1Δ, the importance of later nucleosomes is even more accentuated, whereas the influence of the +1 and +2 positions is further decreased. All axes are scaled to the same size for each strain; shapes and amplitudes are therefore comparable (see Methods for more details).

Table 1.

SVM boundary slopes for both replicates.

The first two rows give the boundary slope for replicates A and B, respectively. μ is the mean slope over both replicates. The s-value represents our significance measurement defined in Eq 7. Noteworthy changes of the boundary slope are marked in green (bold), all others are red. The s-value in WT is by definition equal to 0.
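For orientation, the slope of a linear boundary in the plane of fPC scores can be written out as below. The weights w_1, w_2, the offset b, and the symbol m for the slope are notation assumed here for illustration, and reading the tabulated slope as the change of ζ² per unit ζ¹ is likewise an assumption; the s-value of Eq 7 is not reproduced.

```latex
% Linear SVM boundary in the (\zeta^{1}, \zeta^{2}) plane; w_1, w_2, b are assumed symbols
\[
  w_1 \zeta^{1} + w_2 \zeta^{2} + b = 0
  \quad\Longrightarrow\quad
  \zeta^{2} = -\frac{w_1}{w_2}\,\zeta^{1} - \frac{b}{w_2},
  \qquad
  m = -\frac{w_1}{w_2}
\]
% Mean slope over replicates A and B
\[
  \mu = \frac{m_A + m_B}{2}
\]
```

Under this reading, a slope close to zero means that the boundary runs almost parallel to the ζ¹ axis, so cluster membership is decided almost entirely by the second fPC.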

Fig 4.

The fPCs, their gene specific scores, and the discriminating boundary explain changing collective phasing in double mutants.

The figure shows the cluster distribution with respect to the fPC scores, the impact of the determined fPCs, and the location-specific impact of the separating boundary for all double mutants, in particular isw2Δchd1Δ (i.e. panels A-C), rsc8chd1Δ (i.e. panels D-F), and isw1Δisw2Δ (i.e. panels G-I). The linear separation of the cluster distribution with respect to the fPC scores indicates a notable gene-specific change for the three mutants in panels A, D, and G. The two clusters are given in orange and blue, and the SVM boundary is depicted by the black dashed line. Whilst isw2Δchd1Δ and isw1Δisw2Δ require both fPCs to linearly separate the Pearson clusters, rsc8chd1Δ is almost exclusively dependent on the second fPC, meaning that the boundary slope is decreased in this mutant. This can be better understood when analysing the two fPCs and their effect on the mean ((B) for isw2Δchd1Δ, (E) for rsc8chd1Δ, and (H) for isw1Δisw2Δ). The solid lines in magenta and green in these plots indicate a positive contribution of the fPC and a negative contribution, respectively, whereas the black dashed line depicts the mean. Grey arrows along the gene suggest general trends. Grey vertical bars mark positions that remain largely unperturbed by the fPC. Grey arrows pointing to a single peak highlight remarkable properties. Interestingly, whilst the first fPC of the isw2Δchd1Δ and isw1Δisw2Δ strains shows a similar transformation of the mean, the second fPC indicates a different behaviour, particularly with respect to the +2 nucleosome. As suggested by the fact that clusters in the rsc8chd1Δ mutant are exclusively dependent on the second fPC, the first fPC explains only the average profile amplitude and does not contain any information about collective phasing. The location-specific effect of the linear separator for each mutant is given in (C), (F), and (I). The grey areas indicate the importance of each position to determine the clusters, whose median profile is shown as a blue and orange dashed line. 
The mean is depicted in black. Although the impact on the grouping of the +1 and +2 position in isw2Δchd1Δ conditions is similar to the isw1Δisw2Δ strain, the latter is seemingly particularly dependent on the +3 and +4 nucleosome. Positions thereafter become less important in isw1Δisw2Δ, whereas they keep having a strong impact on clustering in isw2Δchd1Δ. As expected, rsc8chd1Δ is exclusively dependent on the second fPC. Interestingly, the entire profile seems to be influential for classifying genes, with the largest impact allocated to the first two nucleosomes. All axes are scaled to the same size for each strain; shapes and amplitudes are therefore comparable (see Methods for more details).

Fig 5.

The fPCs, their gene specific scores, and the discriminating boundary explain changing collective phasing in isw2Δrsc8chd1Δ.

The two clusters are given in orange and blue. The figure shows the fPC scores ζ of the isw2Δchd1Δrsc8 mutant and their separating boundary (black dashed line, A). The slope decreases with respect to the WT, making the gene groups almost solely dependent on the second fPC. Both fPCs transform the mean in a similar way to the double mutant rsc8chd1Δ (compare panel B with Fig 4E). The dashed black line as well as the solid lines in magenta and green indicate the mean, a positive contribution of the fPC, and a negative contribution, respectively. As expected, the separating boundary discriminates between the two clusters largely following the second fPC (C). The grey areas show the importance of each position to determine the clusters, whose median profile is shown as a blue and orange dashed line. The mean is depicted in black. All axes are scaled to the same size for each strain; shapes and amplitudes are therefore comparable.

Fig 6.

Remodeler deletions have varying effects on the interdependence with other genomic properties.

The orange bars in panels A, C, and E show the ratio of correct predictions, and blue bars show the ratio of wrong guesses. As we distinguish between two clusters, the dashed black line at 0.5 indicates random guessing. The dashed grey line with black edging in panels B, D, and F displays the linear boundary for the Pearson clusters. Panels A and B: WT conditions are seemingly correlated with the sequence composition. However, the results are different for the B replicate, and therefore inconclusive. All possible correlations are surprisingly low. Panels C and D: chd1Δ mutants particularly increase their dependence on Pol II and other transcription-coupled properties, such as Mediator presence. Surprisingly, the mutant also showed an increased interdependence on NDR length. Panels E and F: despite the Rsc8-mediated gene limits, there is no correlation between coordinated nucleosome phasing and the size of transcribed regions or NDR length. Although Sth1 indicates a slightly increased interdependence, this cannot be confirmed when plotting the groups with respect to the Pearson clusters. This is in line with our hypothesis that positioning in different regions interferes, and therefore, nucleosome localisation becomes increasingly independent from region-specific factors.

Fig 7.

Chromatin remodelers maintain nucleosome organisation on a local and far-reaching scale.

Top: RSC (green ellipse) establishes independent nucleosome phasing on each gene (two vertical dashed lines) by maintaining the NDR through positioning the +1 (cornered arrow) and -1 nucleosome. The ATP-dependent positioning is symbolised by black arrows pointing away from RSC. The local remodeling effect of Chd1 (blue ellipse) allows chromatin arrangement independent of Pol II transcription (yellow ellipse). Bottom: in rsc8 strains, the NDR can no longer be maintained, and phasing inside and outside a gene interfere with each other (single dashed line). We propose that this should equally lead to an increased interdependence with other nuclear processes such as transcription. If chd1 is deleted, nucleosome arrangement is more sensitive to the presence of other large complexes, such as Pol II. During transcription, Pol II affects the local positioning (black arrows from Pol II).

Fig 8.

Representing the MNase data array as a composition of B-spline base functions in WT conditions.

Left shows the raw data, each colour depicting one profile over a gene. Center gives the smoothed profiles after representing the data as B-splines. Right displays the average profile using the functional composition.
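The three panels (raw data, smoothed profiles, average profile) can be mimicked with a minimal NumPy sketch. It assumes clamped cubic B-splines with equally spaced interior knots and an ordinary least-squares fit; the number of basis functions, the helper names, and the noisy toy profiles are illustrative assumptions and not the settings used for the figure.

```python
import numpy as np


def clamped_knots(start, stop, n_interior, degree=3):
    """Equally spaced knots with repeated end knots (clamped spline)."""
    inner = np.linspace(start, stop, n_interior + 2)
    return np.concatenate([np.full(degree, start), inner, np.full(degree, stop)])


def bspline_basis(x, knots, degree=3):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    n_knots = len(knots)
    # Degree 0: indicator functions of the half-open knot intervals.
    basis = np.zeros((len(x), n_knots - 1))
    for i in range(n_knots - 1):
        basis[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Assign the right end point to the last non-empty interval.
    last = np.flatnonzero(knots[:-1] < knots[1:])[-1]
    basis[x == knots[-1], last] = 1.0
    for d in range(1, degree + 1):
        raised = np.zeros((len(x), n_knots - d - 1))
        for i in range(n_knots - d - 1):
            left = knots[i + d] - knots[i]
            right = knots[i + d + 1] - knots[i + 1]
            if left > 0:
                raised[:, i] += (x - knots[i]) / left * basis[:, i]
            if right > 0:
                raised[:, i] += (knots[i + d + 1] - x) / right * basis[:, i + 1]
        basis = raised
    return basis


# Noisy toy profiles (assumption): one row per gene, one column per position.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1000.0, 200)
amplitude = rng.normal(1.0, 0.2, size=(50, 1))
truth = amplitude * (1.0 + np.cos(2.0 * np.pi * x / 165.0))
profiles = truth + rng.normal(0.0, 0.3, size=truth.shape)

knots = clamped_knots(x[0], x[-1], n_interior=26)
basis = bspline_basis(x, knots)                      # shape (positions, basis functions)
coef, *_ = np.linalg.lstsq(basis, profiles.T, rcond=None)
smoothed = (basis @ coef).T                          # smoothed profile per gene
mean_profile = basis @ coef.mean(axis=1)             # average in the functional basis
```

Because the fit is linear in the coefficients, averaging the coefficients and evaluating the basis gives the same curve as averaging the smoothed profiles, which is why the average can be taken directly in the functional representation.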
