Fig 1.
Using PBS to identify signal enrichment in ChIP-seq datasets.
(A) Schematic of the PBS method. (B) Signal tracks (i), called peaks (ii) and PBS values (iii) for several histone modifications at a sample locus in tissue from esophagus. This example was chosen for demonstration as the locus has signal from both narrow and broad histone marks in this tissue. (C) Heatmap of a collection of 28 H3K27ac datasets from four different tissue categories showing signal at the CDKN2A locus. Highlighted boxes demonstrate enrichment of H3K27ac signal proximal to IFNA8 in central nervous system tissues (i), CDKN2A in all tissue categories (ii), and LINC01239 in gastrointestinal and cardiovascular tissues (iii).
Fig 2.
Comparing PBS and peak calling in H3K27ac and H3K27me3 datasets.
(A) Histogram of read counts and PBS (inset) for a representative H3K27ac dataset. (B) Venn diagram describing overlap between bins with high PBS and bins with peaks in (A). (C) Histogram of read counts and PBS (inset) for a representative H3K27me3 dataset. (D) Venn diagram describing overlap between bins with high PBS and bins with peaks in (C). In both (A) and (C), the black curve corresponds to the background distribution estimated from a fit to the bottom fiftieth percentile of the data. The heavier tail and greater number of bins with PBS > 0 in (C) compared to (A) indicate elevated genome-wide enrichment of H3K27me3 relative to H3K27ac. Venn diagrams to the right of the histograms indicate the overlap between bins containing peaks and those with PBS > 0.9 for each histone mark.
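The overlap summarized in the Venn diagrams can be computed directly from per-bin values. The following is a minimal sketch, not the authors' code: the function name `venn_counts`, the toy inputs, and the representation of peaks as one boolean flag per bin are assumptions made for illustration. Only the PBS > 0.9 threshold comes from the caption.

```python
def venn_counts(pbs, has_peak, threshold=0.9):
    """Count bins in each Venn region: high-PBS bins vs. peak-containing bins.

    pbs      -- sequence of per-bin PBS values (hypothetical input)
    has_peak -- sequence of booleans, True if the bin contains a called peak
    """
    if len(pbs) != len(has_peak):
        raise ValueError("pbs and has_peak must describe the same bins")
    high = {i for i, value in enumerate(pbs) if value > threshold}
    peaks = {i for i, flag in enumerate(has_peak) if flag}
    return {
        "pbs_only": len(high - peaks),   # high PBS, no peak called
        "both": len(high & peaks),       # high PBS and a peak
        "peak_only": len(peaks - high),  # peak called, PBS at or below threshold
    }


# Toy example with six bins (illustrative values, not real data).
toy_pbs = [0.0, 0.95, 1.0, 0.5, 0.92, 0.1]
toy_peaks = [False, True, False, True, True, False]
counts = venn_counts(toy_pbs, toy_peaks)
```

With real data, the per-bin peak flags would come from intersecting the peak caller's output with the same genomic bins used for PBS.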
Fig 3.
Using PBS to detect changes in H3K27me3 levels during cell differentiation.
(A) Summary of genome-wide PBS across H3K27me3 datasets of iPSCs, myeloid progenitors, or neutrophils. The number of bins with high PBS (> 0.8) increases with the degree of cell differentiation. Additionally, moderate levels of PBS (between 0.2 and 0.8), corresponding to low, broad H3K27me3 signal, appear exclusively in the more differentiated cell types (myeloid progenitor cells and neutrophils). (B) Comparison of iPSCs, myeloid progenitor cells and neutrophils, including signal tracks (top), MACS2 peaks (middle), and PBS (bottom). Highlighted are regions with no signal in all three datasets (i); no signal in iPSCs but moderate signal in myeloid progenitors and moderate-high signal in neutrophils (ii); and high signal in all three cell types (iii). While both peak calling and PBS identify the consistent sharper peak in (iii), only PBS meaningfully detects the spreading enrichment in (ii) (14 peaks spanning 3.8% of the region, vs. 74% of the region with PBS equal to 0.8 or higher).
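The two coverage figures quoted for region (ii) are different kinds of fraction: base pairs covered by peaks versus bins at or above a PBS threshold. A minimal sketch of both calculations follows. The function names, the half-open interval convention, and the toy inputs are assumptions for illustration, and this is not the authors' implementation.

```python
def peak_coverage(peaks, region_start, region_end):
    """Fraction of [region_start, region_end) covered by the union of peaks.

    peaks -- iterable of (start, end) half-open intervals (hypothetical input)
    """
    length = region_end - region_start
    if length <= 0:
        raise ValueError("region must have positive length")
    covered = 0
    current_end = region_start
    for start, end in sorted(peaks):
        # Clip each peak to the region and skip any part already counted,
        # so overlapping peaks are not counted twice.
        start = max(start, current_end)
        end = min(end, region_end)
        if end > start:
            covered += end - start
            current_end = end
    return covered / length


def high_pbs_fraction(pbs, threshold=0.8):
    """Fraction of bins in the region with PBS at or above the threshold."""
    if not pbs:
        raise ValueError("pbs must be non-empty")
    return sum(1 for value in pbs if value >= threshold) / len(pbs)


# Toy example (illustrative values, not real data).
toy_peak_fraction = peak_coverage([(100, 200), (150, 250), (900, 950)], 0, 1000)
toy_pbs_fraction = high_pbs_fraction([0.9, 0.85, 0.3, 0.8])
```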
Fig 4.
Detecting elevated H3K27me3 in colon tumor samples.
(A) Comparing distributions of counts for H3K27me3 in tumor (left) and normal (right) colon tissue. Tumor tissue appears to have globally elevated H3K27me3, as shown by the heavier tail in the histogram of counts, corresponding to a greater number of non-zero PBS values (inset). (B) Schematic describing a PBS-based differential analysis of two datasets. The output of the differential analysis is a histogram of the differences in PBS between the two datasets, as shown at the bottom. (C) Histograms summarizing a PBS-based differential analysis of H3K27me3 between tumor and normal colon tissue. Plots are faceted based on overlap with a region of hypomethylated DNA (hypomethylated block). (D) Histograms describing a PBS-based differential analysis of H3K27me3 between tumor and normal colon tissue in hypomethylated blocks (right-hand facet in (C)). Plots are faceted based on overlap with euchromatin or heterochromatin, as defined by Hi-C. The increase in H3K27me3 in tumor tissue is more pronounced in regions of euchromatin that overlap hypomethylated tumor DNA. Hypomethylated tumor DNA is defined as a 5 kb bin with methylation levels lower than those of normal tissue [24].
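The differential analysis in (B) through (D) amounts to taking per-bin PBS differences, histogramming them, and splitting the bins by an annotation. The sketch below illustrates that idea under stated assumptions: PBS values lie in [0, 1], the difference is taken as first dataset minus second, and the function names, histogram bin count, and toy data are all hypothetical. It is not the authors' implementation.

```python
def pbs_difference_histogram(pbs_a, pbs_b, n_bins=4):
    """Histogram of per-bin PBS differences (a minus b) over [-1, 1].

    Returns a list of n_bins counts. Assumes PBS values lie in [0, 1].
    """
    if len(pbs_a) != len(pbs_b):
        raise ValueError("both datasets must cover the same genomic bins")
    counts = [0] * n_bins
    for a, b in zip(pbs_a, pbs_b):
        diff = a - b
        # Map diff in [-1, 1] to a histogram bin index; clamp so that
        # diff == 1 falls in the last bin rather than past it.
        index = int((diff + 1.0) * n_bins / 2.0)
        counts[max(0, min(index, n_bins - 1))] += 1
    return counts


def faceted_histograms(pbs_a, pbs_b, in_block, n_bins=4):
    """Split genomic bins by a boolean annotation, then histogram each group."""
    groups = {True: ([], []), False: ([], [])}
    for a, b, flag in zip(pbs_a, pbs_b, in_block):
        groups[bool(flag)][0].append(a)
        groups[bool(flag)][1].append(b)
    return {
        flag: pbs_difference_histogram(xs, ys, n_bins)
        for flag, (xs, ys) in groups.items()
    }


# Toy example with four genomic bins (illustrative values, not real data).
tumor = [1.0, 0.5, 0.0, 0.75]
normal = [0.0, 0.5, 0.25, 0.25]
in_hypomethylated_block = [True, False, False, True]

overall = pbs_difference_histogram(tumor, normal)
facets = faceted_histograms(tumor, normal, in_hypomethylated_block)
```

A second level of faceting, such as euchromatin versus heterochromatin within hypomethylated blocks, could reuse `faceted_histograms` on the subset of bins where the first annotation is true.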
Fig 5.
Combining DiffBind with a PBS-based differential analysis.
(A) Volcano plot showing significantly differentially bound regions in tissue samples from two ENTEx donors (gray). Yellow highlighted points indicate regions enriched in donor 2 that also overlap with bins that have a difference in PBS > 0.9. Values have been randomly downsampled to 10000 points to facilitate visualizing trends. (B) Comparison between top GO terms for a DiffBind-based differential analysis (left) and a combined DiffBind and PBS-based analysis (right). Terms explicitly relating to immune system processes are highlighted in orange. The FDR-adjusted p values in the right panel are greater (less significant) than those in the left panel due to the smaller number of input genes.
Fig 6.
S-LDSC using PBS-based annotations.
S-LDSC results describing phenotypic enrichments of H3K27ac annotations for a representative immune cell type (CD25+ T cell), two small intestine samples from two different ENTEx donors, and a representative gastrointestinal sample (transverse colon, or colon TV). The enrichment pattern for phenotypes in donor 2 H3K27ac more closely matches that of the CD25+ T cells, whereas the enrichment pattern for donor 1 more closely matches that of transverse colon.