A Quantitative Model of Transcriptional Regulation Reveals the Influence of Binding Location on Expression

Binding site position, but not sequence conservation, is strongly associated with gene expression level.

(A) The mean log expression of bound genes is shown in each tissue as a function of both the distance between the transcription start site (TSS) and the nearest regulatory region identified by ChIP, and the maximum conservation score of any regulatory region within 5 kb of that gene's TSS. Error bars indicate ± s.e.m. Also shown are the Spearman correlation, and the associated p-value from a right-tailed t-test, between log expression and each of the distance and conservation measures. (B) The upper plot shows the mean log expression of genes in liver and 3T3-L1 cells as a function of the location of the nearest binding site over a 200 kb window. Error bars indicate ± s.e.m. The lower plot shows the influence function, which measures a binding event's predicted effect on expression as a function of position, obtained by fitting our predictive model to 1,000 bootstrapped samples of ChIP and expression data in each tissue. Shaded regions show the empirical 99% confidence intervals obtained from the bootstrap iterations.
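
The empirical bootstrap intervals described in (B) can be illustrated with a minimal sketch. It is not the predictive model used here: as a stand-in for the fitted influence function, it resamples genes with replacement and recomputes the mean log expression in each position bin, then reads off percentile bounds. The function name `bootstrap_binned_mean`, its parameters, and the binning scheme are assumptions made for illustration only.

```python
import random


def bootstrap_binned_mean(positions, log_expr, bin_edges,
                          n_boot=1000, alpha=0.01, seed=0):
    """Percentile-bootstrap interval for mean log expression per position bin.

    Hypothetical helper: a binned mean stands in for the fitted influence
    function, which this sketch does not attempt to reproduce.
    positions  : binding-site position relative to the TSS, one per gene
    log_expr   : log expression, one per gene
    bin_edges  : increasing edges; bin b covers [edges[b], edges[b + 1])
    alpha=0.01 : gives an empirical 99% interval
    """
    rng = random.Random(seed)
    n = len(positions)
    n_bins = len(bin_edges) - 1

    def binned_means(indices):
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for i in indices:
            for b in range(n_bins):
                if bin_edges[b] <= positions[i] < bin_edges[b + 1]:
                    sums[b] += log_expr[i]
                    counts[b] += 1
                    break
        # None marks a bin that received no genes in this sample.
        return [s / c if c else None for s, c in zip(sums, counts)]

    point = binned_means(range(n))

    # Resample genes with replacement and collect the per-bin means.
    samples = [[] for _ in range(n_bins)]
    for _ in range(n_boot):
        resample = [rng.randrange(n) for _ in range(n)]
        for b, m in enumerate(binned_means(resample)):
            if m is not None:
                samples[b].append(m)

    # Read off the alpha/2 and 1 - alpha/2 order statistics
    # (index truncated toward zero, a simple rank approximation).
    intervals = []
    for vals in samples:
        if not vals:
            intervals.append((None, None))
            continue
        vals.sort()
        lo = vals[int((alpha / 2) * (len(vals) - 1))]
        hi = vals[int((1 - alpha / 2) * (len(vals) - 1))]
        intervals.append((lo, hi))
    return point, intervals
```

Calling `bootstrap_binned_mean(positions, log_expr, bin_edges)` returns the per-bin means on the full data together with one `(lower, upper)` pair per bin. Resampling whole genes keeps each gene's position and expression paired, which is the usual choice when both are measured on the same unit.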

