Blurring of High-Resolution Data Shows that the Effect of Intrinsic Nucleosome Occupancy on Transcription Factor Binding is Mostly Regional, Not Local
Figure 6
The blurring of intrinsic nucleosome occupancy data accentuates the effects of nucleosomes in a computational model of TF binding.
(A) Illustration of how nucleosome occupancies are used to weight the predicted binding affinities of sequence motifs. (top panel): Two 2.4kb genomic regions (CAN1/NPR2 and YPL137C/ISU1) showing normalized nucleosome tag counts from in vitro reconstituted chromatin, averaged over 15bp windows (gray line) or 600bp windows (black line). Red dots indicate the location of a perfect Gcn4 consensus site in each region. (middle panel): Same as the top panel, except that the lines show the conversion of normalized tag counts into weights that can be applied to position weight matrix (PWM)-based estimates of TF binding affinity. Note that the weights are plotted on a log scale. Details of the weighting scheme are given in Methods. (bottom panel): Predicted equilibrium binding constants for the two sequence-identical Gcn4 sites (relative unweighted Ka = 1; white histogram bars). High-resolution nucleosome data (15bp window; gray bars) increase the effective Ka of the two sites by about the same amount, because the local nucleosome occupancy at both sites is about the same and lower than average. Averaged over 600bp, the CAN1/NPR2 site lies in a region of much lower-than-average nucleosome occupancy, while the YPL137C/ISU1 site lies in a higher-than-average region. As a result, the predicted effective binding affinities of these two sites under low-resolution nucleosome occupancy (black bars) are very different. (B) Effect of nucleosome-based weighting on the prediction of TF-bound promoters. Each dot is a TF. The value along the x-axis shows how well the PWM, used in a computational model of TF binding, predicts which promoters are bound. This is quantified as the area under an ROC curve (ROC AUC). Plotted against this value is the change in ROC AUC that is obtained by weighting genomic loci on the basis of high-resolution nucleosome data (orange; 15bp window) or on the same data averaged over 600bp windows.
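The logic of both panels can be illustrated with a minimal Python sketch on synthetic data. It is not the weighting scheme from Methods: the function names (`blur`, `occupancy_weight`, `roc_auc`), the inverse-occupancy weight, and the toy occupancy track are all assumptions chosen only to reproduce the qualitative behaviour described in the caption, in which two sites with the same local occupancy diverge once the data are averaged over 600bp.

```python
import numpy as np

def blur(counts, window):
    """Moving average of per-base normalized tag counts over `window` bp.
    mode="same" keeps the track length; values near the ends are biased
    low because the track is implicitly zero-padded."""
    kernel = np.ones(window) / window
    return np.convolve(counts, kernel, mode="same")

def occupancy_weight(occ, eps=1e-6):
    """Hypothetical weight (assumption, not the Methods scheme):
    average occupancy divided by local occupancy, so that
    below-average occupancy gives a weight above 1."""
    return occ.mean() / (occ + eps)

def roc_auc(scores, labels):
    """ROC AUC as the probability that a bound promoter outscores an
    unbound one (ties count one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

# Synthetic 2.4kb occupancy track (made-up values, not real data).
counts = np.ones(2400)
counts[300:900] = 0.5      # broad depleted region around site A
counts[1500:2100] = 1.5    # broad enriched region around site B
counts[1793:1808] = 0.5    # narrow 15bp dip exactly at site B

sites = np.array([600, 1800])   # two sequence-identical sites
ka_unweighted = np.ones(2)      # relative unweighted Ka = 1

# Effective Ka under high-resolution (15bp) and blurred (600bp) weighting.
ka_15 = ka_unweighted * occupancy_weight(blur(counts, 15))[sites]
ka_600 = ka_unweighted * occupancy_weight(blur(counts, 600))[sites]

print("15bp window :", ka_15)
print("600bp window:", ka_600)
```

With the 15bp window both sites receive the same weight, whereas with the 600bp window the site in the depleted region is up-weighted and the site in the enriched region is down-weighted. For panel B, the change in ROC AUC would be `roc_auc(weighted_scores, bound) - roc_auc(pwm_scores, bound)`, where `weighted_scores`, `pwm_scores` and `bound` are hypothetical per-promoter arrays.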