MGA, L3MBTL2 and E2F6 determine genomic binding of the non-canonical Polycomb repressive complex PRC1.6

Diverse Polycomb repressive complexes 1 (PRC1) play essential roles in gene regulation, differentiation and development. Six major groups of PRC1 complexes that differ in their subunit composition have been identified in mammals. How the different PRC1 complexes are recruited to specific genomic sites is poorly understood. The Polycomb Ring finger protein PCGF6, the transcription factors MGA and E2F6, and the histone-binding protein L3MBTL2 are specific components of the non-canonical PRC1.6 complex. In this study, we have investigated their role in genomic targeting of PRC1.6. ChIP-seq analysis revealed colocalization of MGA, L3MBTL2, E2F6 and PCGF6 genome-wide. Ablation of MGA in a human cell line by CRISPR/Cas resulted in complete loss of PRC1.6 binding. Rescue experiments revealed that MGA recruits PRC1.6 to specific loci both by DNA binding-dependent and by DNA binding-independent mechanisms. Depletion of L3MBTL2 and E2F6 but not of PCGF6 resulted in differential, locus-specific loss of PRC1.6 binding illustrating that different subunits mediate PRC1.6 loading to distinct sets of promoters. Mga, L3mbtl2 and Pcgf6 colocalize also in mouse embryonic stem cells, where PRC1.6 has been linked to repression of germ cell-related genes. Our findings unveil strikingly different genomic recruitment mechanisms of the non-canonical PRC1.6 complex, which specify its cell type- and context-specific regulatory functions.


Introduction
Polycomb group (PcG) protein complexes play crucial roles in many physiological processes, including stem cell maintenance, differentiation, cell cycle control and cancer [1][2][3][4]. PcG complexes repress transcription through various mechanisms including changes in histone modification, polynucleosome compaction and direct interaction with the transcription machinery [1,5]. Two major complexes exist in mammals, the Polycomb repressive complexes 1 and 2 (PRC1 and PRC2), which differ in their enzymatic activity. PRC1 contains the E3 ligase RING1/2, which catalyzes ubiquitination of histone H2A at lysine 119 (H2AK119ub1), while PRC2 contains the methyltransferase EZH2 (Enhancer of Zeste Homolog 2), which catalyzes trimethylation of histone H3 at lysine 27 (H3K27me3). It has long been considered that H3K27me3 is required for PRC1 binding to chromatin. However, this view was challenged when a number of PRC1 complexes were found that lack H3K27me3-binding CBX (Chromo Box) subunits [6][7][8][9].
Our knowledge of the targeting of non-canonical PRC1 (ncPRC1) complexes to their genomic sites is limited. The PCGF1-containing PRC1.1 variant is recruited to non-methylated CpG islands via the histone demethylase Kdm2b (Lysine (K)-specific demethylase 2b), which binds to non-methylated CpG islands [6,9]. Recruitment of the PCGF3/5-containing ncPRC1s to the inactive X chromosome is mediated by the Xist RNA [11].
The subunit composition of the different ncPRC1s is specific and potentially revealing. While PRC1.6 (also known as E2F6-PRC1 and PCGF6-PRC1) is similar if not identical to L3MBTL2 (Lethal(3)Malignant Brain Tumor-Like 2)-containing complexes [12,13] and the E2F6 repression complex [14], it is specifically associated with several proteins that are not found in other ncPRC1s (Fig 1A) [7,15]. MGA (MAX Gene-Associated protein, also abbreviated as MGAP by UniProt) contains two DNA-binding domains, a T-box domain and a bHLH (basic helix-loop-helix) domain. MGA interacts with MAX (Myc-associated Factor X), and E2F6 interacts with DP-1 or DP-2 (transcription factor DP-1 or DP-2). Heterodimeric MGA/MAX binds E-boxes, and heterodimeric E2F6/DP-1/2 binds to E2F recognition sequences in vitro [16][17][18]. L3MBTL2 contains four MBT domains that bind to mono- and di-methylated histone H3 and H4 tails in vitro [19][20][21]. Full-length L3MBTL2 can also interact with histones independent of their lysine methylation state [12,20]. The association of PRC1.6 with the sequence-specific DNA binding proteins MGA/MAX and E2F6/DP1, and with the histone-interacting protein L3MBTL2, suggests that these proteins could play a role in locus-specific recruitment of PRC1.6. Crucially, this notion has not been addressed experimentally.
In mouse embryonic stem cells (ESCs), an essential role for the corresponding PRC1.6 subunits in specification and proliferation has been demonstrated. Mga and Pcgf6 were identified as essential self-renewal genes in ESCs by a genome-wide RNAi screen [22]. A more recent knockout study revealed that Mga is essential for survival of mouse pluripotent cells during peri-implantation development and for growth of ESC cultures [23]. L3mbtl2-deficient ESCs retain characteristics of pluripotent cells but are severely impaired in proliferation [13]. Finally, the defining subunit of PRC1.6, Pcgf6, is expressed at high levels in mouse ES cells, where it is required for ESC identity [24,25]. The mechanism by which this occurs remains controversial. Two reports suggested a repressive function of Pcgf6 on mesodermal-specific [24] and on endodermal lineage genes [26], while Yang et al. suggested a PRC1.6-independent direct activator function of Pcgf6 on core ESC regulators such as Oct4, Sox2 and Nanog [25].
Here we describe the targeting mechanism of PRC1.6, an exemplar of the non-canonical PRC1 class, by detailing the role of MGA, L3MBTL2, E2F6 and PCGF6 in genomic binding site selection. We show that MGA, L3MBTL2, E2F6 and PCGF6 colocalize genome-wide in the context of PRC1.6. Taking advantage of CRISPR/Cas-mediated genetic ablation in HEK293 cells, we demonstrate that MGA is absolutely essential for binding of PRC1.6. By expression of MGA mutants in MGAko cells, we found that the bona fide T-box and bHLH DNA-binding domains of MGA mediate binding to a subset of loci but are dispensable for others. We further demonstrate that L3MBTL2 and E2F6 determine differential binding of PRC1.6 to distinct promoters. Finally, we demonstrate that Mga, L3mbtl2 and Pcgf6 colocalize also in mouse ESCs. In particular, we found enrichment at promoters of meiosis- and germ-line-specific genes that were shown to be de-repressed on Max-, L3mbtl2- or Pcgf6-depletion. Together, our findings unveil strikingly different genomic recruitment mechanisms for a non-canonical Polycomb repressive complex, which specify its cell type- and context-specific regulatory functions.

Genomic colocalization of MGA, L3MBTL2, E2F6 and PCGF6 in HEK293 cells
To identify the genomic binding sites of PRC1.6 and to gain mechanistic insights into its targeting, we focused on the roles of MGA, L3MBTL2, E2F6 and PCGF6 as these factors are specific to PRC1.6 (Fig 1A) and were not found in other ncPRC1s. We established HEK293 cell clones in which each of these four proteins was depleted individually using the CRISPR/Cas9-sgRNA system (S1 Fig) as controls at key steps in the analysis. By Western blotting we confirmed successful depletion of MGA, L3MBTL2, E2F6 and PCGF6 in several clones (Fig 1B).
Fig 1 (legend, partial). [...] binding to selected promoters. The region 2 kb upstream of the CDC7 promoter served as a negative control. Percent of input values represent the mean of at least three independent experiments ± SD. (H) Sequence motifs enriched in PRC1.6 binding regions. Logos were obtained by running MEME-ChIP with 300 bp summits of the top 600 union MGA-L3MBTL2-E2F6-PCGF6 ChIP-seq peaks. The numbers next to the logos indicate the occurrence of the motifs, the statistical significance (E-value) and the transcription factors that bind to the motif. Right panel, local motif enrichment analysis (CentriMo) showing central enrichment of the MGA/MAX bHLH and the E2F6/DP1 binding motifs within the 300 bp peak regions. The NRF1 binding motif was not centrally enriched. https://doi.org/10.1371/journal.pgen.1007193.g001
Next, we determined MGA, L3MBTL2, E2F6 and PCGF6 occupancy by ChIP-seq using chromatin of the corresponding knockout cell lines as a reference for peak selection. Thereby, we were able to remove a number of perfectly shaped false positive ChIP-seq signals (S2A Fig) from the classified lists of binding sites. We obtained different peak strengths and different numbers of peaks for the different factors (L3MBTL2 > E2F6 > MGA > PCGF6), possibly due to the different performance of the antibodies resulting in different ChIP efficiencies. Stringent filtering of uniquely mapped reads (≥30 tags and ≥3-fold enrichment over the corresponding knockout control) yielded lists of high-confidence binding sites for each factor.
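The filtering criteria described above (at least 30 tags and at least 3-fold enrichment over the knockout control) can be illustrated with a minimal Python sketch. The `Peak` record, its field names, the pseudocount and the example values are assumptions made for illustration; they are not taken from the analysis pipeline used in this study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Peak:
    chrom: str
    start: int
    end: int
    tags: float          # tag count in the wild type ChIP (hypothetical field)
    control_tags: float  # tag count at the same region in the knockout ChIP

def passes_filter(peak, min_tags=30, min_fold=3.0, pseudocount=1.0):
    """Return True if the peak has enough tags and enough enrichment over control.

    The pseudocount is an assumption; it only avoids division by zero
    when the knockout control has no tags at a region.
    """
    fold = (peak.tags + pseudocount) / (peak.control_tags + pseudocount)
    return peak.tags >= min_tags and fold >= min_fold

def filter_peaks(peaks, **kwargs):
    """Keep only the peaks that pass both thresholds."""
    return [p for p in peaks if passes_filter(p, **kwargs)]

# Invented example peaks
peaks = [
    Peak("chr1", 1000, 1300, tags=120, control_tags=10),   # strong and specific
    Peak("chr1", 5000, 5300, tags=200, control_tags=180),  # also present in the knockout
    Peak("chr2", 700, 1000, tags=12, control_tags=0),      # too few tags
]
kept = filter_peaks(peaks)
```

In this toy input only the first peak is retained. The second peak shows why a knockout reference is useful: a strong, well-shaped signal that persists in cells lacking the protein is discarded as a false positive.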
Comparison of the MGA, L3MBTL2, E2F6 and PCGF6 data sets revealed a very high degree of overlap (Fig 1C and 1D) reflecting colocalization (Fig 1E). Consistent with the role of PRC1 in regulating gene expression, the large majority of these sites were located close to the 5´-end of annotated transcripts ( Fig 1F). We also confirmed colocalization of MGA, L3MBTL2, E2F6 and PCGF6 to a set of selected target promoters by conventional ChIP-qPCR analysis ( Fig 1G).
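ChIP-qPCR enrichment in this study is reported as percent of input. The exact calculation is not spelled out in the text, so the following is only a sketch of a commonly used formula, assuming 100% amplification efficiency and a hypothetical input fraction of 1%. The function name, the parameter names and the Ct values are invented for illustration.

```python
import math
from statistics import mean, stdev

def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Percent of input from qPCR Ct values, assuming perfect doubling per cycle."""
    # Adjust the input Ct as if 100% of the chromatin had been used as input
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)

# Three invented replicate measurements (Ct of the IP, Ct of the 1% input)
replicates = [
    percent_input(ct_ip, ct_in)
    for ct_ip, ct_in in [(26.0, 24.0), (26.2, 24.1), (25.9, 24.0)]
]
summary = (mean(replicates), stdev(replicates))  # mean and SD across replicates
```

With a 1% input, an IP sample that amplifies two cycles later than the input corresponds to 0.25% of input, and an IP sample with the same Ct as the input corresponds to 1%.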
The overlap of the MGA, L3MBTL2, E2F6 and PCGF6 ChIP-seq peaks shown in Fig 1C  also suggests the existence of some genomic sites bound by only one of the four factors. However, the majority of the potential factor-specific sites was removed when we compared filtered peaks with unfiltered MACS peaks (S2B- S2E Fig). Moreover, visual genome browser inspection of the remaining potential subunit-specific peaks indicated the shared presence of MGA, E2F6, L3MBTL2 and PCGF6 at all examined sites (S2B- S2E Fig). Hence, our ChIP-seq results indicate that all four factors bind to the same genomic loci in vivo. This conclusion is strongly supported by the complete absence of genomic L3MBTL2, E2F6 and PCGF6 binding events in MGA-depleted cells (see below).
A de novo sequence motif analysis of the top 600 ranked MGA, L3MBTL2, E2F6 and PCGF6 binding sites revealed centrally enriched motifs that match in vitro recognition sequences for MGA/MAX (the E-box, CACGTG) [17] and for E2F6/DP1 (GCGGGA) [18] (Fig 1H). The abundant occurrence of the E-box and the E2F6 binding motif indicated that both MGA and E2F6 could be important for recruitment of PRC1.6 to its specific sites in chromatin.
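As a simple illustration of how occurrences of the E-box (CACGTG) and of the core E2F hexamer (GCGGGA) could be counted in peak sequences, the sketch below scans both DNA strands with exact string matching. This is not a substitute for the de novo MEME-ChIP and CentriMo analyses reported here; the function names and the test sequence are hypothetical.

```python
E_BOX = "CACGTG"     # MGA/MAX recognition sequence (palindromic)
E2F_SITE = "GCGGGA"  # core E2F recognition hexamer

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def count_exact(seq, motif):
    """Count exact matches of motif in seq, including overlapping ones."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)

def count_motif(seq, motif):
    """Count motif occurrences on both strands of seq."""
    seq = seq.upper()
    total = count_exact(seq, motif)
    rc = reverse_complement(motif)
    if rc != motif:  # a palindromic motif matches both strands at the same position
        total += count_exact(seq, rc)
    return total

# Invented sequence with one E-box and two E2F sites (one per strand)
example = "TTCACGTGAAGCGGGAATTTCCCGCTT"
e_box_hits = count_motif(example, E_BOX)
e2f_hits = count_motif(example, E2F_SITE)
```

Because CACGTG is its own reverse complement, it is counted once per site, whereas the E2F hexamer is searched as GCGGGA and as TCCCGC.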

MGA plays a crucial role in genomic targeting of PRC1.6
MGA and E2F6 are sequence-specific DNA binding factors, and L3MBTL2 is a histone-interacting protein. Having found that they colocalize genome-wide, we set out to investigate their interdependence in genomic targeting of PRC1.6. At first we focused on the role of MGA and examined whether binding of other PRC1.6 subunits was affected in MGA-depleted cells. ChIP-seq analysis revealed that MGAko cells lack genome-wide binding of both L3MBTL2 and E2F6 (Fig 2A and 2B), indicating that MGA is crucial for genomic targeting of L3MBTL2 and E2F6 and potentially for the entire PRC1.6 complex. This finding was unexpected since the E2F6/DP2 heterodimer binds E2F recognition sequences readily in vitro [18]. Western blot analysis revealed that MGA-depleted cells contained markedly less E2F6 as well as PCGF6 (Fig 2C). The reduced protein levels of E2F6 and PCGF6 in MGAko cells were likely due to impaired protein stability, as the transcript levels of E2F6 and PCGF6 were not reduced in MGAko cells (Fig 2D). The protein level of L3MBTL2 in MGAko cells was similar to that in wild type cells. However, the fraction of SUMO-modified L3MBTL2 [20] was strongly reduced (Fig 2C), which may indicate that SUMOylation of L3MBTL2 in wild type cells takes place at the level of chromatin. Finally, the level of RING2 protein was unchanged in MGA-deficient cells.
To exclude that the lack of any E2F6 and L3MBTL2 binding in MGAko cells was the result of inefficient ChIPs, we also probed a panel of selected target promoters by ChIP-qPCR. These experiments validated the lack of genomic L3MBTL2 and E2F6 binding in two different MGAko clones (Fig 2E). We also analyzed for the presence of other PRC1.6 components including PCGF6, MAX, RING2, RYBP, HP1γ and the H2AK119ub1 mark. All factors as well as the H2AK119ub1 mark were present at the MGA target sites in wild type cells but were absent or, in the case of the H2AK119ub1 mark, markedly reduced in both MGA-depleted cell clones ( Fig  2E). The global H2AK119ub1 levels were similar in wild type, MGAko, L3MBTL2ko, E2F6ko, and PCGF6ko cells (S3 Fig) showing that the observed reduction of the H2AK119ub1 mark at the PRC1.6 target regions is due to changes in local RING2 deposition. Collectively, these results demonstrate that MGA is absolutely crucial for genomic loading of the entire PRC1.6 complex. Importantly, these results also indicate that E2F6, L3MBTL2 and PCGF6 bind to their genomic sites exclusively in the context of the PRC1.6 complex and are not recruited to chromatin independently of PRC1.6.
Since previous studies reported that H2AK119ub1 plays a critical role for recruitment of PRC2 followed by downstream deposition of H3K27me3 [10], we also tested for the presence of the catalytic PRC2 component EZH2 and for H3K27me3 ( Fig 2F). Neither EZH2 nor H3K27me3 were enriched at the selected PRC1.6 loci suggesting that PRC1.6 binding is not generally interconnected with PRC2 binding. Importantly, we found considerable enrichment of EZH2 and H3K27me3 at known PRC2-dependent canonical PRC1 target sites. These canonical PRC1 binding sites were not bound by MGA, and the levels of EZH2 and H3K27me3 at these sites remained unchanged in MGAko cells ( Fig 2F). The absence of MGA at canonical PRC1 binding regions is consistent with genome-wide data that revealed only a low level of overlap between PCGF6 and other PCGFs in HEK293 cells [7].

MGA promotes the genomic localisation of PRC1.6 through different mechanisms
Given that MGA is essential for targeting of PRC1.6, it would be expected that re-expression of MGA would restore not only genomic binding of MGA but also of the other PRC1.6 components. To test this prediction, we expressed full-length MGA in MGAko cells, and subsequently analyzed a panel of PRC1.6 target promoters for binding of exogenous MGA and of endogenous L3MBTL2, E2F6, PCGF6, MAX and RING2. Indeed, re-expression of MGA in MGAko cells not only restored specific binding of MGA but also of the other PRC1.6 subunits ( Fig 3A). We did not observe an increase of H2AK119ub1 levels at these promoters. Potentially, the short time span of transient MGA expression was not sufficient for the H2AK119ub1 mark to be deposited efficiently.
MGA contains two different DNA binding domains, a T-box domain close to the N-terminus and a bHLH domain in its C-terminal part (Fig 3B). To test whether these DNA binding  domains account for genomic loading of PRC1.6, we generated two different types of DNA binding-deficient MGA mutants by deleting the entire T-box domain (MGA-ΔT, aa 79 -aa 264 deleted), and by replacing several critical amino acids in the bHLH domain [27,28] by alanine residues (MGA-bHLHmut, MGA-H2477A_ER2481/2482AA_R2485A). Compared with wild type MGA, binding of the MGA-bHLH mutant to several target promoters (AEBP2, ZFR, CDIP, CCND2 and TFAP4) was strongly reduced (Fig 3D). We also observed reduced binding of the MGA-T-box deletion mutant to the SPOP promoter. However, both MGA mutants still bound to the RFC1, PHF20, RPA2, RNF130 and CDC7 promoters as efficiently as wild type MGA ( Fig  3D). We also tested binding of an MGA double mutant in which both DNA binding domains were mutated simultaneously (MGA-ΔT-bHLHmut). Remarkably, the MGA-ΔT-bHLHmut double mutant still bound to these promoters as efficiently as wild type MGA. Importantly, the DNA binding-deficient MGA mutants also rescued binding of endogenous L3MBTL2 to the RFC1, PHF20, RPA2, RNF130 and CDC7 promoters but not to the MGA-bHLH-dependent AEBP2, ZFR, CDIP, CCND2 and TFAP4 promoters and to the MGA-T-Box-dependent SPOP promoter ( Fig 3D, right panel). These results suggest that MGA can recruit PRC1.6 to specific target sites by DNA-binding-dependent and by DNA-binding-independent mechanisms.

L3MBTL2 and E2F6 contribute to genomic binding of PRC1.6
As MGA is able to bind a subset of PRC1.6 loci independent of its DNA-binding activity, we investigated the potential contribution of E2F6, L3MBTL2 or PCGF6 to the recruitment of PRC1.6 to its target sites. To address this issue, we profiled binding of MGA, L3MBTL2 and E2F6 in cells lacking L3MBTL2, E2F6 or PCGF6 (L3MBTL2ko, E2F6ko or PCGF6ko cells). Importantly, the level of MGA and of other PRC1.6 subunits in L3MBTL2ko, E2F6ko and PCGF6ko cells was unaffected (Fig 4A and S4 Fig). Analysis of the ChIP-seq data sets revealed that the overall genomic positions of the PRC1.6 binding sites in E2F6ko, L3MBTL2ko and PCGF6ko cells are similar to those in wild type cells (Fig 4B and 4C). However, the signal strengths of the MGA and L3MBTL2 peaks in E2F6ko cells and of the MGA and E2F6 peaks in L3MBTL2ko cells were significantly reduced, whereas the peaks were only slightly affected in PCGF6ko cells (Fig 4C and 4D). Notably, the extent of reduction of MGA binding in E2F6ko cells correlated well with the extent of reduction of L3MBTL2 binding in E2F6ko cells (Fig 4E, left panel). Equally, the extent of reduction of MGA binding in L3MBTL2ko cells correlated well with the extent of reduction of E2F6 binding in L3MBTL2ko cells (Fig 4E, right panel). These results indicate that L3MBTL2 and E2F6, but not PCGF6, contribute to the genomic localization of PRC1.6, and that MGA, L3MBTL2 and E2F6 bind to their target sites together as part of a single complex.

L3MBTL2 and E2F6 recruit PRC1.6 differently to distinct sets of genes
Binding of MGA to the majority of its genomic sites was greatly reduced in L3MBTL2ko as well as in E2F6ko cells, indicating that both L3MBTL2 and E2F6 contributed to genomic binding of PRC1.6. Importantly, however, the extent of reduction of MGA and E2F6 binding in L3MBTL2ko cells, and the extent of reduction of MGA and L3MBTL2 binding in E2F6ko cells, did not correlate (Fig 5A). Rather, the shape of these plots revealed three distinct types of PRC1.6 binding sites: (i) loci where binding of MGA was reduced in both L3MBTL2ko and E2F6ko cells; (ii) loci where binding of MGA was reduced in L3MBTL2ko cells but not in E2F6ko cells; (iii) loci where binding of MGA was reduced in E2F6ko cells but not in L3MBTL2ko cells. Thus, we were able to identify L3MBTL2-dependent and E2F6-dependent PRC1.6 binding sites (Fig 5B and S5A Fig).
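The three classes of binding sites can be made concrete with a small sketch that assigns each locus to a class from the MGA ChIP signal in wild type, L3MBTL2ko and E2F6ko cells. The log2 threshold of -1 (a 2-fold reduction), the pseudocount, the locus names and the signal values are all assumptions chosen for illustration; the text does not state the cut-offs that were used.

```python
import math

def log2_ratio(ko_signal, wt_signal, pseudocount=1.0):
    """log2 of knockout over wild type signal; negative values mean reduced binding."""
    return math.log2((ko_signal + pseudocount) / (wt_signal + pseudocount))

def classify_locus(mga_wt, mga_l3mbtl2ko, mga_e2f6ko, threshold=-1.0):
    """Assign a locus to a dependence class from MGA binding in the two knockouts."""
    reduced_in_l3 = log2_ratio(mga_l3mbtl2ko, mga_wt) <= threshold
    reduced_in_e2f6 = log2_ratio(mga_e2f6ko, mga_wt) <= threshold
    if reduced_in_l3 and reduced_in_e2f6:
        return "L3MBTL2- and E2F6-dependent"   # class (i)
    if reduced_in_l3:
        return "L3MBTL2-dependent"             # class (ii)
    if reduced_in_e2f6:
        return "E2F6-dependent"                # class (iii)
    return "unaffected"

# Invented MGA signals: (wild type, L3MBTL2ko, E2F6ko)
loci = {
    "locus_A": (100, 20, 25),
    "locus_B": (100, 15, 95),
    "locus_C": (100, 90, 10),
    "locus_D": (100, 98, 97),
}
classes = {name: classify_locus(*signals) for name, signals in loci.items()}
```

A fixed threshold is the simplest possible choice. It turns the continuous distribution seen in a scatter plot into discrete classes, so loci near the cut-off should be interpreted with caution.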
We also probed a panel of PRC1.6 target sites in two different L3MBTL2ko and E2F6ko cell clones by conventional ChIP-qPCR. We tested for the presence of MGA, L3MBTL2, E2F6, PCGF6, MAX, RING2 and H2AK119ub1 (Fig 5C and S5B Fig). This analysis confirmed L3MBTL2- and E2F6-dependent binding of PRC1.6 to the RFC1, PHF20 and SPOP promoters; L3MBTL2-dependent but E2F6-independent binding to the AEBP2 and ZFR promoters; and E2F6-dependent but L3MBTL2-independent binding to the ALDOA, RNF130 and CDC7 promoters (S5B Fig). In all cases the levels of H2AK119ub1 correlated with PRC1.6 binding. Finally, we performed rescue experiments in which we found that expression of L3MBTL2 in L3MBTL2ko cells not only restored binding of ectopically expressed L3MBTL2 but also of endogenous MGA, E2F6 and PCGF6 to the L3MBTL2-dependent RFC1, PHF20, SPOP, AEBP2 and ZFR promoters (Fig 5D). Of note, the enrichment levels observed in rescued L3MBTL2ko cells were approximately 1.5- to 3-fold lower than in wild type cells (compare Fig 5D with Fig 1G). This is not surprising given that not all cells in the population express L3MBTL2 after transient transfection. Importantly, L3MBTL2 also re-occupied the E2F6-dependent ALDOA, RNF130 and CDC7 promoters; however, binding of MGA, E2F6 and PCGF6 to these promoters was not enhanced. This result strongly supports our model in which L3MBTL2 is only essential for recruitment of PRC1.6 to a subset of loci despite its presence at all PRC1.6 binding sites.
To gain further insight into the E2F6-dependent recruitment of PRC1.6, we examined whether the DNA-binding activity of E2F6 is necessary for PRC1.6 binding. Wild type E2F6 expressed in E2F6ko cells re-occupied all tested PRC1.6 target loci, and also resulted in slightly increased binding of endogenous MGA and L3MBTL2 to E2F6-dependent promoters but not to the L3MBTL2-dependent promoters ( Fig 5E). The DNA binding-deficient E2F6 mutant (E2F6-L68E,V69F) did not bind to the E2F6-dependent promoters, and did not re-occupy the L3MBTL2-dependent promoters. This observation indicates that the DNA binding domain of E2F6 is not only necessary for DNA recognition but also for association with PRC1.6.
By ChIP-qPCR analysis of selected promoters we also validated that PCGF6 has a limited role in the recruitment of MGA, L3MBTL2 and E2F6 (S6 Fig). However, it is important to note that binding of RING2 in PCGF6ko cells was reduced nearly to the level observed at a negative control region. Consistently, H2AK119ub1 levels were also reduced at these promoters (S6 Fig). Thus, PCGF6, albeit not essential for binding site selection by PRC1.6, is required to recruit RING2 to these loci. This observation is in line with a recent study that revealed recruitment of RING2 by a PCGF6-TET repressor fusion protein tethered to a Tet operator array in vivo [29].
We surveyed the DNA sequences of the L3MBTL2- and the E2F6-dependent PRC1.6 loci and found specific enrichment of the E2F binding motif (GCGGGA) in the E2F6-dependent PRC1.6 binding sites, and specific enrichment of the E-box (CACGTG) and T-box (AGGC/TGC/TGAGG) binding motifs in the L3MBTL2-dependent PRC1.6 binding sites (Fig 5F). The strong association of E-box and T-box motifs with L3MBTL2-dependent PRC1.6 binding sites points to an important role of L3MBTL2 in facilitating or stabilizing an interaction of MGA/MAX with DNA. Finally, we examined whether there are specific functional features shared amongst E2F6-dependent and L3MBTL2-dependent PRC1.6-bound genes. E2F6-dependent PRC1.6 target genes but not L3MBTL2-dependent PRC1.6 target genes were highly enriched in Gene Ontology (GO) terms related to cell cycle control (Fig 5G). This finding is in line with the role of E2F6 as an RB-independent transcriptional repressor during cell cycle progression [30]. GO terms associated with L3MBTL2-dependent PRC1.6 target genes included quite different biological processes such as "positive regulation of neurotransmitter secretion", meiotic "synaptonemal complex assembly" and "ribosome assembly". Altogether these results suggest that E2F6 and L3MBTL2 recruit PRC1.6 to distinct gene sets that regulate different biological processes.

Role of PRC1.6 in HEK293 cell function
We went on to investigate the role of PRC1.6 in cell growth and gene expression in HEK293 cells by comparing the proliferation potential of wild type, MGAko, L3MBTL2ko, E2F6ko and PCGF6ko cells. The growth rates of wild type cells and PCGF6ko cells were similar; however, we observed reduced proliferation of MGAko, L3MBTL2ko and E2F6ko cells (Fig 6A). Next, we examined the transcriptional impact of PRC1.6. RNA-seq of three independent wild type cell cultures and three independent MGAko clones identified 587 genes with ≥2-fold altered expression levels in MGAko cells. Expression of 485 genes was reduced in the MGAko cells, while expression of 102 genes was increased (Fig 6B). Comparison of the set of de-regulated genes with the gene set bound by MGA revealed that MGA was not bound to the majority of the down-regulated genes (434/485, 89%), suggesting an indirect role of MGA in the regulation of these genes. In contrast, MGA was bound to the majority of the up-regulated genes (71/102, 70%), suggesting that PRC1.6 acts as a direct repressor on these genes. Representative ChIP-seq and RNA-seq genome browser screenshots of de-repressed genes bound by PRC1.6 are shown in Fig 6C. Interestingly, the top up-regulated genes in MGAko cells included several critical regulators and effectors of meiosis such as CNTD1, SMC1B, SYCE2, YBX2, MEIOC (C17orf104), RAD9B, TAF7L, STAG3, CPEB1 and ALDH1A2, as well as several testis-enriched genes such as PRSS50, TRIM71, C19orf57, ZCWPW1, ZNF239, RIBC2 and NEUROG2 (http://www.proteinatlas.org/).
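The comparison of de-regulated genes with MGA-bound genes amounts to a set intersection. The sketch below shows the calculation on invented gene identifiers and then re-derives the percentages from the counts reported above. The helper name `fraction_bound` and the toy gene sets are hypothetical.

```python
def fraction_bound(deregulated, bound):
    """Fraction of de-regulated genes that are also in the bound gene set."""
    deregulated = set(deregulated)
    if not deregulated:
        return 0.0
    return len(deregulated & set(bound)) / len(deregulated)

# Toy gene sets with invented identifiers
down_regulated = {"g1", "g2", "g3", "g4"}
up_regulated = {"g5", "g6", "g7"}
mga_bound = {"g1", "g5", "g6", "g9"}

down_bound = fraction_bound(down_regulated, mga_bound)  # 1 of 4 genes
up_bound = fraction_bound(up_regulated, mga_bound)      # 2 of 3 genes

# Consistency check of the counts reported in the text
total_deregulated = 485 + 102
pct_down_not_bound = round(100 * 434 / 485)
pct_up_bound = round(100 * 71 / 102)
```

With the reported counts, the two gene groups add up to 587, and the rounded percentages are 89% (down-regulated, not bound) and 70% (up-regulated, bound).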
To test whether L3MBTL2, E2F6 and PCGF6 contribute to PRC1.6 target gene repression, we examined the expression of a panel of fourteen genes in L3MBTL2ko, E2F6ko and PCGF6ko cells by locus-specific RT-PCR assays. We found increased transcript levels of these genes also in L3MBTL2ko, E2F6ko or PCGF6ko cells, yet to different degrees (Fig 6D). For example, compared with wild type cells, transcript levels of CNTD1 were also increased in L3MBTL2ko and in E2F6ko but not in PCGF6ko cells. Transcript levels of SMC1B, however, were increased in E2F6ko cells but not in L3MBTL2ko and PCGF6ko cells. Conversely, transcript levels of STAG3 were increased in L3MBTL2ko and PCGF6ko cells but not in E2F6ko cells (Fig 6D). Thus, L3MBTL2, E2F6 and PCGF6 contributed to repression of these genes differentially in a gene-specific manner. Interestingly, specific de-repression of these genes in L3MBTL2ko or E2F6ko cells, respectively, correlated well with the contribution of L3MBTL2 and E2F6 to PRC1.6 binding. Binding of PRC1.6 to the CNTD1 promoter was diminished in L3MBTL2ko as well as in E2F6ko cells. Binding of PRC1.6 to the SMC1B promoter was lost in E2F6ko cells but remained in L3MBTL2ko cells, and binding of PRC1.6 to the STAG3 promoter was lost in L3MBTL2ko cells but remained in E2F6ko cells (S7 Fig).

Mga, L3mbtl2 and Pcgf6 colocalize in mouse ESCs and repress genes involved in differentiation
Several recent studies revealed critical roles for Mga, L3mbtl2 and Pcgf6 in the regulation of mouse ES cell pluripotency, proliferation and differentiation [13,23,24,26,29]. Therefore, we investigated the genomic localization of PRC1.6 components also in mouse ES cells. We focused on Mga, L3mbtl2 and Pcgf6 as no antibody is available that efficiently recognizes murine E2f6. Also of note is that we failed to generate Mga-deficient ESC clones, which is in line with a previous study suggesting that Mga plays an essential role in ESCs [23]. Therefore, we used an IgG control ChIP-seq dataset as a reference for peak selection. Similar to the ChIP-seq results with chromatin of HEK293 cells, we obtained different numbers of filtered (≥30 tags and ≥3-fold enrichment over IgG) peaks for Mga (14,183), L3mbtl2 (17,007) and Pcgf6 (4,817) (Fig 7A). The vast majority of the Pcgf6 peaks (90%) overlapped with Mga and L3mbtl2 peaks, and the majority of the Mga peaks (82%) overlapped with the L3mbtl2 peaks. Genome browser track and heatmap views of binding densities also revealed clear colocalization of Mga, L3mbtl2 and Pcgf6 (Fig 7B and 7C). Moreover, visual inspection of genome browser tracks did not confirm any Mga-, L3mbtl2- or Pcgf6-specific binding site (S8A Fig). Collectively, our ChIP-seq data sets reveal that Mga, L3mbtl2 and Pcgf6 colocalize in mouse ESCs, suggesting strongly that the function of PRC1.6 is conserved in murine and human cells. This conclusion was further supported by a de novo sequence motif analysis of the top 600 ranked Mga/L3mbtl2/Pcgf6 peak regions, which revealed the presence of centrally enriched E-box as well as T-box and E2F6/DP1 recognition sequences as prevalent motifs (Fig 7D). Finally, as in HEK293 cells, the majority of the Mga/L3mbtl2/Pcgf6 binding sites were located close to transcriptional start sites (Fig 7E).
However, we observed that the genomic distribution of the Mga/L3mbtl2/Pcgf6 peaks in mouse ESCs differs to some extent from the distribution in HEK293 cells. In many instances, we observed multiple Mga/L3mbtl2/Pcgf6 peaks within a gene locus including the promoter, exons, and the 3´-end (S8B Fig). Potentially, the peaks within gene bodies were not direct PRC1.6 binding sites but reflect local intragenic loops within genes that fold exons close to cognate promoters. The capture of such structural features by ChIP-seq has been reported previously [31]. It is also possible that these intragenic peaks reflect discrete compacted chromatin structures similar to those generated by canonical PRC1 (cPRC1) [32].
Fig 7F (legend, partial). [...] [26] and in L3mbtl2ko cells [13]. Left panel, GO analyses of biological functions of PRC1.6-bound genes that were de-repressed ≥2-fold in Pcgf6ko cells. Right panel, GO analyses of biological functions of PRC1.6-bound genes that were de-repressed ≥2-fold in L3mbtl2ko cells. Enriched GO terms were retrieved using DAVID 6.
To examine the impact of the Mga/L3mbtl2/Pcgf6 binding sites on gene expression in mouse ESCs, we compared our ChIP-seq data sets with genes that were deregulated in L3mbtl2-depleted [13] or in Pcgf6-depleted ESCs [26]. We found that two-thirds of the genes (882 out of 1354) that were up-regulated in Pcgf6-depleted ES cells, and 71% of the genes that were up-regulated in L3mbtl2-depleted cells (421 out of 587) by ≥2-fold were bound by Mga, L3mbtl2 and Pcgf6 (Fig 7F). Interestingly, genes aberrantly expressed in Pcgf6ko and in L3mbtl2ko ESCs largely do not overlap (Fig 7F). Nevertheless, a GO analysis revealed that Pcgf6-dependent as well as L3mbtl2-dependent repressed PRC1.6 target genes were strongly associated with germ-line development (meiosis and spermatogenesis).
The small group of 78 direct PRC1.6 target genes that were up-regulated in Pcgf6ko as well as in L3mbtl2ko ES cells (Fig 7F) also includes several meiotic genes such as Syce3, Stk31, Slc22a, Mei1 and Tdrkh. Remarkably, promoters of the meiotic genes were within the top ranked 200 Mga, L3mbtl2 and Pcgf6 peaks. Genes specifically de-repressed in Pcgf6-depleted ES cells but not in L3mbtl2-depleted ES cells were related to the Wnt signaling pathway and to neuron differentiation. Conversely, specific L3mbtl2-repressed genes were associated with angiogenesis and positive regulation of cell migration (Fig 7F). This finding suggests that Pcgf6 and L3mbtl2 repress common as well as different sets of genes despite the presence of both factors at all target genes within the PRC1.6 complex.
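The peak overlap percentages reported in this section (for example, the fraction of Pcgf6 peaks that overlap Mga peaks) rest on interval intersection. The following sketch computes such a fraction for peaks given as (chromosome, start, end) tuples with half-open coordinates. The coordinates are invented, a 1 bp overlap criterion is assumed, and the quadratic comparison is only suitable for small examples; genome-scale peak sets would normally be compared with sorted sweeps or dedicated interval tools.

```python
from collections import defaultdict

def fraction_overlapping(query, reference):
    """Fraction of query peaks overlapping at least one reference peak by >= 1 bp."""
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap if each one starts before the other ends
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits += 1
    return hits / len(query) if query else 0.0

# Invented peak coordinates
pcgf6_peaks = [("chr1", 100, 400), ("chr1", 1000, 1300),
               ("chr2", 50, 350), ("chr3", 10, 200)]
mga_peaks = [("chr1", 350, 700), ("chr1", 1300, 1600), ("chr2", 0, 100)]

overlap_fraction = fraction_overlapping(pcgf6_peaks, mga_peaks)
```

In this toy example two of the four Pcgf6 peaks overlap an Mga peak. The peak at chr1:1000-1300 only abuts the neighbouring Mga peak and is therefore not counted under the half-open convention.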

Discussion
In this study, we provide insights into the genomic targeting mechanism of the non-canonical PRC1 complex PRC1.6. We find that MGA, L3MBTL2, E2F6 and PCGF6 colocalize genome-wide in the context of PRC1.6. MGA is absolutely crucial for binding of the complete PRC1.6 since genome-wide binding of E2F6, L3MBTL2 and PCGF6 is lost in MGA-depleted cells (Fig 2). Mechanistically, we provide strong evidence that MGA executes recruitment of PRC1.6 to its target sites through two distinct functions (Fig 9). On the one hand, MGA acts as a sequence-specific DNA-binding factor mediating recruitment of PRC1.6 to E-box and T-box containing promoters. On the other hand, MGA has a scaffolding function, which is independent of its DNA binding capacity (Fig 3B and 3C). The scaffolding function of MGA may protect E2F6 and PCGF6 against degradation (Fig 2C).
The other components of PRC1.6 have distinct functional roles. L3MBTL2 is also involved in genomic targeting of PRC1.6 since in L3MBTL2ko cells, MGA, E2F6 and PCGF6 fail to bind to a large fraction of promoters (Figs 4 and 5). These L3MBTL2-dependent PRC1.6 binding sites are enriched for the bHLH E-box motif but not for the E2F6-binding motif (Fig 5). The MBT domains of L3MBTL2 are known to bind preferentially to mono- and di-methylated histone H3 and H4 marks [19][20][21], and full-length L3MBTL2 interacts with histone tails independent of their lysine methylation state [12,20]. Thus, we propose that L3MBTL2 promotes binding site selection of PRC1.6 by facilitating and stabilising the interaction of MGA/MAX with E- or T-box-containing promoters.
A large fraction of PRC1.6 binds to promoters that regulate mitotic cell cycle genes. Since binding is largely unaffected in L3MBTL2-depleted cells, targeting of PRC1.6 to this class of genes is more likely mediated by E2F6 (Fig 5F and 5G). This finding is consistent with a previous report showing that E2F6 functions as an RB-independent transcriptional repressor by controlling E2F1-3-dependent transcription during cell cycle progression, particularly by counteracting the activating E2Fs during S phase [30]. These cell cycle-regulated genes are not upregulated in MGA-depleted cells. Likely, E2F4, another repressive E2F family member, compensates for the loss of E2F6-mediated PRC1.6 binding. Indeed, it has been shown that only simultaneous inhibition of both E2F6 and E2F4 activity results in de-repression of these PRC1.6 target genes [30].
Mga, L3mbtl2 and Pcgf6 also colocalize in mouse ESCs (Fig 7), strongly suggesting that the core components of PRC1.6 are evolutionarily and functionally conserved. In addition, knockdown of Mga and Max in mouse ESCs leads to the loss of Pcgf6 binding at several promoters of the genes that are up-regulated in Pcgf6ko cells [29], indicating that the recruitment mechanisms in ESCs are also similar to those observed in human cells. Notably, Pcgf6 is the most highly expressed Pcgf paralog in undifferentiated ESCs [24], and Pcgf6 is the predominant Ring1b-interactor in ESCs [37]. These observations indicate that PRC1.6 is a major PRC1 complex in ESCs. PRC1.6 components play essential roles in ESCs including regulation of ESC pluripotency, proliferation and differentiation. Most significantly, Mga depletion leads to the death of proliferating pluripotent ICM cells in vivo and in vitro, and the death of ESCs in vitro [23]. Pcgf6ko, L3mbtl2ko and Maxko ESCs also have defects in proliferation and differentiation [13,26,38], but these are less severe than those of Mgako ESCs. The most severe phenotype of Mgako ESCs is in line with the crucial importance of Mga for genomic PRC1.6 binding.
Consistent with published reports, we have found that in ESCs, PRC1.6 is involved in the repression of meiotic genes. Ablation of Max, the dimerization partner of Mga, activates meiotic genes in ESCs and induces cytological changes, which are reminiscent of germ cells at the leptotene and zygotene stages of meiosis [39,40]. Meiotic and germ-line-specific genes are also activated in Pcgf6ko and L3mbtl2ko cells [13,26]. Mga, L3mbtl2 and Pcgf6 bind to the promoters of these meiosis-specific genes (Fig 7), strongly suggesting that PRC1.6 directly represses these genes in ESCs, thereby preventing meiosis. Interestingly, several meiotic genes are also de-repressed in MGA-deficient 293 cells (Fig 6). De-repression of a limited number of meiotic and germ cell-specific genes is also observed in E2F6-deficient MEFs, indicating that the repressive function also operates in somatic cells [41][42][43].
Knockdown of Pcgf6 also results in strongly increased expression levels of several mesodermal genes including T (Brachyury), the Runx transcription factor Mlf1 and the vascular endothelial growth factor receptor 2 (Vegfr-2, Flk) encoded by the Kdr gene [24]. Our ChIP-seq data revealed binding of Mga, L3mbtl2 and Pcgf6 to these genes, suggesting that PRC1.6 also directly represses mesodermal lineage genes in mouse ESCs.
Apart from meiosis-specific and germ-line-specific genes, quite different gene sets are derepressed upon Pcgf6- and L3mbtl2-depletion in ESCs. Based on this observation, it was suggested that Pcgf6 acts independently of L3mbtl2 [24]. However, we provide strong evidence that all Pcgf6 binding sites are also bound by L3mbtl2. We speculate that L3mbtl2 facilitates binding of PRC1.6 to specific loci, while Pcgf6 acts through recruitment of Ring1b and downstream H2AK119ub1 [29]. Since L3mbtl2 associates with the methyltransferases G9A and GLP [13,14], it may also facilitate H3K9 dimethylation. Indeed, G9A and GLP are required for repression of germ cell-specific genes [44]. It is also possible that L3mbtl2, known to compact nucleosomal arrays in vitro [12], represses transcription directly by chromatin compaction, making promoters inaccessible to the transcription machinery.

Antibodies
Rabbit polyclonal antibodies against MGA for use in ChIP experiments and immunoblotting were generated by immunizing with a bacterially expressed GST fusion protein carrying the 300 C-terminal amino acids of human MGA. Immunization was carried out by Eurogentec (Seraing, Belgium) using the 28-day Speedy immunization protocol. Antisera were affinity-purified according to a protocol described in [45] using the matrix-coupled GST-MGA fusion protein. The commercially available antibodies used in this study are shown in Table 1.
The empty pX459 vector was transfected as a negative control. Puromycin selection (3 μg/ml) was carried out 48 hours after transfection for 3 to 6 days. Individual colonies were isolated and the targeted loci were genotyped by PCR (see S1 Fig) and sequenced. Cell clones with indels in the targeted locus were further analyzed by Western blotting.

Construction of expression vectors
The expression vector for 3xFLAG-L3MBTL2 has been described in [20], and the expression vectors for HA-tagged wild type E2F6 and the DNA-binding-deficient E2F6 mutant in [18]. The HA-tag was removed by BamHI/HindIII digestion and blunt-end re-ligation. For expression of 3xFLAG-tagged MGA under control of the CMV promoter, several MGA cDNA fragments were amplified from poly(A)- and random-primed HEK293 cell cDNA libraries, and placed stepwise into pN3-3xFLAG using conventional restriction cloning procedures. The sequence of the cloned MGA cDNA is identical to the NCBI reference sequence XM_005254246.2 and encodes the 3115 amino acid full-length MGA isoform XP_005254303.1. Mutations of the MGA T-box and bHLH domains were introduced into the wild type MGA construct by replacing appropriate wild type fragments with corresponding mutant gBlock DNA fragments (IDT, Leuven, Belgium) using internal restriction sites of the MGA cDNA.

Expression of MGA, L3MBTL2 and E2F6
For expression of MGA, L3MBTL2 or E2F6, the respective knockout clones were transiently transfected with the corresponding expression plasmid using the FugeneHD transfection reagent (Promega). Five million cells on a 15-cm dish were transfected with 20 μg of plasmid DNA and harvested 48 hours after transfection, and cross-linked chromatin was prepared. Expression of the proteins was monitored by Western blotting.

Cell growth conditions and growth curves
HEK293 cells were cultured in DMEM/F-12 + GlutaMax medium (Gibco, Thermo Fisher, Waltham, MA) supplemented with 10% fetal bovine serum (Sigma Aldrich, St. Louis, MO) and 1% Penicillin/Streptomycin (Sigma Aldrich). Mouse J1 ES cells were cultivated feeder-cell free on gelatin-coated plates in DMEM + GlutaMax (Gibco, Thermo Fisher), supplemented with 15% fetal bovine serum (Biochrom, Berlin, Germany), 1% non-essential amino acids (Gibco, Thermo Fisher), 1% Penicillin/Streptomycin (Sigma Aldrich), 50 mM β-Mercaptoethanol and 1000 U/mL ESGRO leukemia inhibitory factor (Merck Millipore, Billerica, MA). For determination of growth rates of wild type and corresponding knockout HEK293 cell lines, 3 x 10^5 cells were plated on a 6-well dish and counted at two- or three-day intervals as indicated in Fig 6A. Cumulative cell numbers were calculated by multiplying the initial cell number by the fold-increase in cell numbers in each interval.
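The cumulative cell number calculation described above can be sketched as follows. This is a minimal illustration, assuming that cells are re-plated at a known density at the start of each interval; the function name and the example values are hypothetical and are not taken from the study.

```python
from itertools import accumulate
from operator import mul


def cumulative_cell_numbers(initial_cells, plated, counted):
    """Return the cumulative cell number after each counting interval.

    initial_cells: number of cells plated at the start of the experiment.
    plated[i]:     cells seeded at the start of interval i (assumed input).
    counted[i]:    cells counted at the end of interval i (assumed input).
    """
    if len(plated) != len(counted):
        raise ValueError("plated and counted must have the same length")
    # Fold-increase within each interval.
    fold_increases = [c / p for p, c in zip(plated, counted)]
    # Running product of the fold-increases, scaled by the initial cell number.
    return [initial_cells * f for f in accumulate(fold_increases, mul)]


# Hypothetical example: 3 x 10^5 cells plated, tripling and then doubling.
growth = cumulative_cell_numbers(300_000, [300_000, 300_000], [900_000, 600_000])
```

Multiplying the fold-increases keeps the curve continuous even though the cultures are diluted at each passage.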

ChIP-qPCR
ChIP experiments were performed with the One Day ChIP kit (Diagenode, Seraing, Belgium). ChIP-qPCRs with gene-specific primers (Table 2) were performed using the ImmoMix PCR reagent (Bioline, Luckenwalde, Germany) in the presence of 0.1x SYBR Green (Molecular Probes, Thermo Fisher, Waltham, MA). Enrichment was calculated relative to input.
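The text states only that enrichment was calculated relative to input. One common way to do this is the percent-of-input calculation sketched below. It is an assumption that this is the formula used here, and the sketch further assumes 100% PCR efficiency (a doubling per cycle); the function name and parameters are hypothetical.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input chromatin recovered in a ChIP sample.

    ct_ip:          Ct value of the immunoprecipitated sample.
    ct_input:       Ct value of the input sample.
    input_fraction: fraction of the chromatin used as input (e.g. 0.01 for 1%).
    Assumes 100% PCR efficiency, i.e. a two-fold increase per cycle.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in the interval (0, 1]")
    # Adjust the input Ct to the value expected for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)
```

For example, an IP sample with the same Ct as a 1% input corresponds to a recovery of about 1% of input.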

ChIP-seq and data analysis
Three to four individual ChIPs were pooled and purified on QIAquick columns (Qiagen, Hilden, Germany). Five nanograms of precipitated DNA were used for indexed sequencing library preparation using the Microplex library preparation kit v2 (Diagenode). Libraries were purified on AMPure magnetic beads (Beckman Coulter, Brea, CA) and quantified on a Bioanalyzer (Agilent Technologies, Santa Clara, CA). Pooled libraries were sequenced on an Illumina HiSeq1500 platform (Illumina Inc., San Diego, CA), rapid-run mode, single-read 50 bp (HiSeq SR Rapid Cluster Kit v2, HiSeq Rapid SBS Kit v2-50 cycles) according to the manufacturer's instructions. Raw ChIP-seq reads were aligned using Subread [47] version 1.4.3-p1. Reads matching multiple locations were discarded during alignment. Peaks were called with MACS [48] version 1.4.0rc2 against the respective knockout control or against IgG for mouse ES cell data. Filtered peaks were required to have at least 30 tags and a sequencing depth-corrected ratio over control of at least 3x. Published mESC datasets (Fig 8) were retrieved from GEO and processed as above using Subread and MACS, but were not filtered. Unions and overlaps were calculated on an 'at least 1 bp overlap' basis. For motif search and heatmaps, peaks were centred at their summits and fixed-size regions were extracted. Summits were defined as the point of highest read overlap after extending the reads to 200 bp. Heatmaps show the number of reads extended to 200 bp, normalized for sequencing depth. The signal distribution was truncated at the 99th percentile in each sample in order to increase contrast. Regions for heatmaps were ordered by the sum of signal in the first sample depicted. ChIP-seq signal plots shown in Figs 1F and 7E are also based on reads extended to 200 bp. Genes were associated with a peak if the peak was located within the region from -2.5 kb relative to the TSS to the TES.
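The peak filtering and overlap criteria described above can be illustrated with the following minimal sketch. It is not the analysis code used in this study: the `Peak` record, the function names and the handling of peaks without control tags are assumptions made for illustration, and coordinates are assumed to be half-open (start inclusive, end exclusive).

```python
from typing import NamedTuple


class Peak(NamedTuple):
    chrom: str
    start: int         # 0-based, inclusive
    end: int           # exclusive
    chip_tags: int     # tags in the ChIP sample within the peak
    control_tags: int  # tags in the control sample within the peak


def depth_corrected_ratio(peak, chip_depth, control_depth):
    """ChIP/control tag ratio after scaling both samples to equal depth."""
    if peak.control_tags == 0:
        # Assumption: a peak without any control tags always passes the ratio.
        return float("inf")
    return (peak.chip_tags * control_depth) / (peak.control_tags * chip_depth)


def passes_filter(peak, chip_depth, control_depth, min_tags=30, min_ratio=3.0):
    """Require at least min_tags tags and a depth-corrected ratio >= min_ratio."""
    return (peak.chip_tags >= min_tags
            and depth_corrected_ratio(peak, chip_depth, control_depth) >= min_ratio)


def overlaps(a, b):
    """True if two peaks on the same chromosome share at least 1 bp."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end
```

With half-open coordinates, two peaks that merely touch (one ends where the other starts) share no base pair and are therefore not counted as overlapping.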

Expression analysis
For RNA-seq, total RNA was extracted from HEK293 cells stably transfected with the empty pX459 vector and three different MGAko clones by using the RNeasy Mini system (Qiagen) including an on-column DNaseI digestion. RNA integrity was assessed on an Experion (Bio-Rad Laboratories, Hercules, CA). Sequencing libraries were generated using the TruSeq stranded mRNA Library Preparation Kit (Illumina Inc.). Libraries were quantified on a Bioanalyzer (Agilent Technologies) and subsequently sequenced on an Illumina HiSeq1500 platform (Illumina Inc.), rapid-run mode, single-read 50 bp (HiSeq SR Rapid Cluster Kit v2, HiSeq Rapid SBS Kit v2-50 cycles) according to the manufacturer's instructions. RT-qPCR was performed essentially as described in [20]. cDNA was synthesized with the Tetro reverse transcriptase (Bioline) using one to two micrograms of total RNA. Quantitative PCR was performed in triplicate by using the ImmoMix PCR reagent (Bioline) with gene-specific primers (Table 3). Values were normalized to GAPDH and/or B2M mRNA content.
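Normalization of RT-qPCR values to reference genes can be sketched as shown below. The exact procedure is described in [20] and is not reproduced here; the sketch assumes the widely used 2^-ΔCt calculation with 100% PCR efficiency and, where both GAPDH and B2M are measured, averages their Ct values. The function name and the example Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_cts, reference_cts_by_gene):
    """Expression of a target gene relative to one or more reference genes.

    target_cts:            replicate Ct values of the target gene.
    reference_cts_by_gene: dict mapping a reference gene name (e.g. "GAPDH")
                           to its replicate Ct values.
    Assumes 100% PCR efficiency (2^-dCt).
    """
    if not reference_cts_by_gene:
        raise ValueError("at least one reference gene is required")
    # Average the replicates per reference gene, then across reference genes.
    reference_ct = mean([mean(cts) for cts in reference_cts_by_gene.values()])
    delta_ct = mean(target_cts) - reference_ct
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate measurements.
value = relative_expression(
    [24.0, 24.0, 24.0],
    {"GAPDH": [20.0, 20.0, 20.0], "B2M": [22.0, 22.0, 22.0]},
)
```

Averaging Ct values across reference genes corresponds to taking the geometric mean of their relative quantities.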