Fig 1.
a. Schematic representation of the inducible ROSA-TET system. The mES SIM2 clones A6, B8 and C4 contain a Flag-tagged version of the mouse Sim2 gene under the control of a modified human CMV promoter (hCMV*-1). In the presence of tetracycline in the culture medium (+Tet), the tetracycline-regulatable transactivator (tTA) is trapped and cannot bind the hCMV*-1 promoter. Upon removal of tetracycline (-Tet), the tTA binds the hCMV*-1 promoter, inducing the expression of the Sim2-Flag-IRES-Venus construct. The mES EB3 parental line does not contain the Sim2-Flag transgene. The puromycin-resistance (PuroR) and hygromycin-resistance (HygroR) cassettes are used for the clone selection process. SA: Splice Acceptor; pA: poly-adenylation site; IRES: Internal Ribosome Entry Site; orange triangles represent loxP sites. Modified from [25]. b. Fluorescence-activated cell sorting (FACS) analysis of Sim2-expressing and non-expressing clones. Cells were grown in the presence (+Tet) or absence (-Tet) of tetracycline for 26 hours. The y-axis represents the number of cells and the x-axis the fluorescence intensity. c and d. Agarose gel electrophoresis of reverse-transcription PCR (RT-PCR) products. Total RNA from +Tet cells was reverse transcribed and amplified using primers specific to Sim2 (c) or Arnt (d) in the presence (RT+) or absence (RT-) of reverse transcriptase. L: loading marker; H2O: PCR negative control.
Fig 2.
Identification and characterization of the SIM2 DNA binding sites by ChIP-seq.
a. Venn diagram of the number of SIM2 binding sites identified by ChIP-seq in each SIM2 clone (A6, B8, C4) and in the EB3 line. The bold numbers sum to the 1229 SIM2 DNA binding sites found in at least two SIM2 clones. b. Pie chart showing the genomic distribution of these 1229 sites. c. Distribution of the distances between the SIM2 DNA binding sites and the closest transcription start site (TSS). d. Selection of gene ontology terms significantly over-represented in the list of genes associated with a SIM2 DNA binding site.
Fig 3.
mRNA-sequencing and ChIA-PET analyses.
Sim2 (a) and Arnt (b) mRNA levels (RPKM) in the A6, B8 and C4 SIM2 clones and in three EB3 replicates. c. Comparison of gene expression levels (mean log2 RPKM across the 3 replicates) between Sim2-expressing cells (y-axis) and EB3 control cells (x-axis). Each blue dot represents a gene; differentially expressed genes (edgeR FDR<0.05) are shown in red. The diagonal line represents the expected distribution of genes equally expressed between Sim2-expressing and non-expressing cells. d. Enrichment of Sim2 targets among the genes upregulated in Sim2-expressing clones, as revealed by GSEA. Genes were sorted according to their expression fold change between Sim2-expressing and non-expressing cells (x-axis, 0 showing the most upregulated gene). Black vertical bars show the position of the SIM2 targets in the ranked list. The enrichment score (ES, in green) deviates significantly from zero at the beginning of the distribution, showing that the SIM2 targets are not randomly distributed in the ranked list but enriched among the most upregulated genes. p: FDR-corrected p-value. e. ChIA-PET interactions occurring between SIM2 DNA binding sites and promoters of genes differentially expressed in Sim2-expressing cells compared to EB3 cells (FDR<0.05). Blue lines show inter-chromosomal interactions and red lines intra-chromosomal interactions.
Table 1.
List of Sim2 target genes identified using the ChIA-PET data.
Fig 4.
Motifs enriched in Sim2 DNA binding sites.
Fig 5.
Overlap of SIM2 occupancy with master transcription factor binding sites.
a. Frequency distribution of OCT4, SOX2 and NANOG DNA binding sites in a 40kb window centered on the newly identified SIM2 DNA binding sites. Plots show a significant enrichment of the OSN binding sites at the SIM2 peak positions in Sim2-expressing A6 cells. Pie charts show the proportion of SIM2 DNA binding sites overlapping with the OCT4, SOX2 or NANOG binding sites (in grey) (100bp window). p = Fisher’s exact test p-value; F score: measure of the significance of the association (1 = perfect match). b. Protein co-immunoprecipitation experiments of SIM2-FLAG with endogenous OCT4, SOX2, KLF4 (left panel) and NANOG (right panel). Cellular protein extracts from Sim2-expressing cells (A6) or EB3 cells were immunoprecipitated using antibodies directed against each of the pluripotency factors (N-terminal and C-terminal parts of NANOG) or IgG as a negative control for co-immunoprecipitation. Associated proteins were immunoblotted using an anti-FLAG antibody. The red star marks the SIM2-FLAG protein and the blue star the signal produced by recognition of the IgG heavy chains. Ø: beads only; kDa: kilodaltons; protein lysate: protein lysate loaded as an input control for the immunoblot.
Fig 6.
SIM2 DNA binding sites colocalize with known enhancer marks.
Distribution of chromatin modification marks in a 40kb window centered on the SIM2 DNA binding sites: DNaseI hypersensitivity signal (a), RNA polymerase II (b), H3K4me3 (c), P300 (d), H3K4me1 (e) and H3K27ac (f). Pie charts show the proportion of SIM2 peaks overlapping each of these marks (in grey) (100bp window). p = Fisher’s exact test p-value; F score: measure of the significance of the association (1 = perfect match). Data were taken from the mouse ENCODE project in the UCSC genome browser mm9 build (http://genome.ucsc.edu/).
Fig 7.
The Mediator complex colocalizes with the SIM2 DNA binding sites.
Frequency distribution of MED1 (a) and MED12 (b) DNA binding sites in a 40kb window centered on the SIM2 peaks. Pie charts show the proportion of SIM2 DNA binding sites overlapping MED1 or MED12 DNA binding sites (in grey) (100bp window). p = Fisher’s exact test p-value; F score: measure of the significance of the association (1 = perfect match).
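Note on the overlap proportions shown in the pie charts of Figs 5 to 7. The legends do not specify how the 100bp window was applied, so the following is only a minimal sketch under stated assumptions: each binding site is reduced to a single coordinate (for example a peak midpoint), and a SIM2 site counts as overlapping when a reference site lies within 100 bp of it on the same chromosome. The function name `overlap_fraction` and the (chromosome, position) input format are hypothetical and are not taken from the original analysis.

```python
from bisect import bisect_left


def overlap_fraction(query_sites, reference_sites, window=100):
    """Return the fraction of query sites that have at least one reference
    site within `window` bp on the same chromosome.

    Both inputs are lists of (chromosome, position) tuples. Reducing each
    site to a single position is an assumption made for illustration.
    """
    # Group the reference positions by chromosome and sort them so that
    # each query can be answered with a binary search.
    ref_by_chrom = {}
    for chrom, pos in reference_sites:
        ref_by_chrom.setdefault(chrom, []).append(pos)
    for positions in ref_by_chrom.values():
        positions.sort()

    hits = 0
    for chrom, pos in query_sites:
        positions = ref_by_chrom.get(chrom, [])
        # Index of the first reference position >= pos - window.
        i = bisect_left(positions, pos - window)
        # It is a hit if that position does not exceed pos + window.
        if i < len(positions) and positions[i] <= pos + window:
            hits += 1

    return hits / len(query_sites) if query_sites else 0.0


# Toy coordinates, invented purely to show the call; not real data.
sim2_sites = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
other_sites = [("chr1", 1050), ("chr1", 5101), ("chr2", 1000)]
fraction = overlap_fraction(sim2_sites, other_sites)
```

Sorting the reference positions once and using a binary search keeps the lookup fast for thousands of sites; an interval-based overlap on full peak coordinates would be the natural alternative if peak boundaries rather than single positions were compared.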