Identification of gene specific cis-regulatory elements during differentiation of mouse embryonic stem cells: An integrative approach using high-throughput datasets
Fig 2
Predictive model for an example gene, Runx1.
(A) A network representation of the model, in which the gene for which the model is built (here Runx1; red octagon), the chosen CREs (blue hexagons) and the TFs bound to the chosen CREs are represented as nodes. Black arrows indicate regulation of the gene by the CRE/coCRE, and coloured arrows represent the binding of TFs to the CREs in different cell types. The colours corresponding to the cell types are given below the network. The TFBP of a CRE in a specific cell type is represented as a circular histogram; for coCREs, the histogram represents the frequency of occurrence of each TF in the regions of that community (here the community comprises 4 regions). The p-value of observing a combinatorial binding profile in that cell type is provided for each TFBP node; the methodology is described in the Methods section. The abbreviations for the TFs in the circular histogram are: Esrrb (EB), Nanog (NG), Pou5f1 (O4), Sox2 (S2), Cebpb (CB), Elk4 (E4), Gata2 (G2), Lmo2 (L2), Tal1 (T1), Fli1 (F1), Tead4 (T4), Meis1 (M1), Gata1 (GA1), Gfi1 (G1), Gfi1b (GB), Runx1 (R1), Spi1 (P1). Note that not all TFs in the circular histogram have supporting ChIP-seq data in all cell types (Table 1). Where ChIP-seq data are absent for a TF in a specific cell type, the bar for that TF in the histogram of that cell type is zero. (B) The gene expression profile (GEP) of Runx1, with cell types along the horizontal axis and FPKM on the vertical axis. (C) The best linear fit between the actual (X) and predicted (Y) GEP for Runx1. The Spearman correlation coefficient is also provided. (D) The tag density profile, normalised as coverage per million aligned reads, for the 10 cell types. The Runx1 gene structure is shown in blue below the coverage tracks. The predictor CREs used in the lasso model are shown as grey boxes, and the chosen CRE and the coCRE are shown in red and yellow, respectively.
The super enhancers (SE) identified by Whyte et al. [53] are shown as green bars and the enhancers given by SEA are shown in blue. The experimental enhancers identified by Schütte et al. and Dogan et al. are provided as well. In the case of Runx1, there is no overlap with the Dogan et al. dataset, hence the absence of any bars for that dataset. Note that the coCRE enhancer is represented as a composite of red boxes of its member CREs.
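The two quantities named in panels (C) and (D), the Spearman correlation between actual and predicted GEP and the per-million normalisation of coverage, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names and the example FPKM values are hypothetical, and the handling of ties by average ranks is an assumption, since the caption does not specify it.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks (assumed tie rule)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def coverage_per_million(raw_coverage, aligned_reads):
    """Scale raw per-position coverage to coverage per million aligned reads."""
    scale = 1_000_000 / aligned_reads
    return [c * scale for c in raw_coverage]


# Hypothetical expression values (FPKM) across five cell types, for illustration only.
actual_fpkm = [0.5, 2.0, 8.0, 15.0, 30.0]
predicted_fpkm = [1.0, 0.8, 9.0, 20.0, 25.0]
rho = spearman(actual_fpkm, predicted_fpkm)

# Hypothetical raw coverage at two positions in a library of 2 million aligned reads.
normalised = coverage_per_million([10, 20], 2_000_000)
```

Because the Spearman coefficient depends only on ranks, it measures whether the predicted GEP orders the cell types in the same way as the actual GEP, regardless of the absolute FPKM scale.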