Formation of a Polycomb-Domain in the Absence of Strong Polycomb Response Elements

Polycomb group response elements (PREs) in Drosophila are DNA elements that recruit Polycomb group (PcG) proteins to chromatin and regulate gene expression. PREs are easily recognizable in the Drosophila genome as strong peaks of PcG-protein binding over discrete DNA fragments; many small but statistically significant PcG peaks are also observed in PcG domains. Surprisingly, in vivo deletion of the four characterized strong PREs from the PcG-regulated invected-engrailed (inv-en) gene complex did not disrupt the formation of the H3K27me3 domain and did not affect inv-en expression in embryos or larvae, suggesting the presence of redundant PcG recruitment mechanisms. Further, the 3D structure of the inv-en domain was only minimally altered by the deletion of the strong PREs. A reporter construct containing a 7.5kb en fragment that contains three weak peaks but no large PcG peaks forms an H3K27me3 domain and is PcG-regulated. Our data suggest a model for the recruitment of PcG complexes to Drosophila genes via interactions with multiple, weak PREs spread throughout an H3K27me3 domain.


Author Summary
Polycomb group proteins (PcGs) regulate growth and development of multi-cellular organisms by modifying histones and inhibiting chromatin remodeling resulting in gene silencing. PcGs act via binding to cis-regulatory DNA sequences known as Polycomb response elements (PREs). In genome wide studies, PREs are recognized as strong binding sites for PcG proteins, often near the promoter but many, weaker PcG binding sites are often spread throughout a PcG-domain. It was not known if these weak peaks were functional. We deleted the strong PcG peaks from the PcG-domain containing the invected and engrailed genes and show that a PcG-domain still forms. Further analysis showed these weak sites can act as PREs by recruiting PcGs and modifying histone H3. These weak peaks interact with each other and the target genes to maintain the three-dimensional structure of the PcG domain in the absence of strong peaks. We propose a model,

Introduction
Polycomb group (PcG) proteins were first identified in Drosophila as repressors of homeotic genes [1], but genome-wide experiments over the last decade in Drosophila and mammals have identified hundreds perhaps thousands of other genes that are regulated by PcG proteins. PcG proteins work in complexes to modify chromatin [2,3]. Two of the PcG complexes are PRC1 and PRC2. PRC1 contains the proteins Ph, Psc, Sce/Ring, and Pc and acts in part by inhibiting chromatin remodeling [4]. PRC2 contains Esc, Su(z)12, p55/CAF, and the histone methyltransferase E(z) that tri-methylates histone H3 on lysine 27 (H3K27me3). In Drosophila, PcG protein complexes are brought to the DNA by Polycomb group response elements (PREs) [2,5]. In mammals, apart from three prominent cases, canonical PREs are largely absent [6][7][8] and in embryonic stem cells (ESCs), PcG complexes are recruited to clusters of unmethylated CpG-rich DNA sequences [9,10]. In addition, a number of sequence-specific DNA binding proteins have been reported to recruit PcG proteins to specific genes in mammals [11][12][13].
PREs were discovered in transgenic assays in Drosophila and defined as DNA fragments that cause maintenance of silenced expression of a transgene [14][15][16][17][18]. Another assay for PRE activity is "pairing-sensitive silencing" (PSS) of the reporter gene mini-white in the Drosophila eye; repression of mini-white is much stronger when two copies of the PRE-reporter gene are present in the genome, either in cis or in trans [19,20]. This ability of PRE-reporter transgenes to interact in order to increase silencing was the first indication of the ability of PREs to facilitate interactions between DNA fragments. PREs are now known to be involved in intra- and inter-chromosomal interactions, and to participate in setting up higher order chromatin structure [21][22][23][24], although insulators also play an important role in this [25].
Many different DNA binding proteins bind PREs and act together to recruit PcG proteins, but how this occurs is poorly understood [26]. Pho was the first PRE DNA binding protein discovered and has been the most extensively studied [27]. In genome-wide chromatin-immunoprecipitation (ChIP) studies, PcG-targets are identified by the presence of H3K27me3-domains along with strong PcG-protein binding peaks at discrete sites within the target region [28]. These peaks of PcG-protein binding are known or presumed to be the PREs. Only a small fraction of PcG-binding peaks have been tested for PRE activity in transgenic assays, and only four PREs have been deleted or mutated in the Drosophila genome [29][30][31]. In all cases PRE deletion in situ led to unexpectedly weak phenotypes, suggesting that other PREs or other mechanisms of PcG recruitment might be able to compensate for PRE loss, but no conclusive study argues strongly against the popular yet poorly supported idea that, in Drosophila, PcG target genes are regulated by a few strong PcG peaks.
In this study, we tested the functional importance of the well-characterized PREs of the PcG target genes invected (inv) and engrailed (en). inv and en are located next to each other in the genome, share regulatory DNA, and encompass a 113kb H3K27me3 domain [32]. Two PREs are upstream of en and were discovered by their ability to mediate PSS [19,20]. Two other strong PcG peaks were identified upstream of inv based on ChIP-chip studies [33] and then later shown to have PRE activity in transgenic assays [34]. These 4 PREs are bound by PcG proteins and are the strongest PcG protein peaks in ChIP experiments in the inv-en domain in all cell types and developmental stages; hence, we hypothesized they would be absolutely required for PcG-repression in vivo. Contrary to this expectation, flies that had a deletion of all four strong inv-en PcG peaks maintain the wild type H3K27me3 domain, do not mis-express En in embryos or larvae, and survive to become fertile adults. These surprising results focused our attention on the many sites within the region that bind PcG proteins but at relatively low levels. We therefore performed several experiments to see if these were functional PREs. Chromosome conformation capture-sequencing (4C-seq) experiments in the inv-en domain suggest that these sites interact with each other and the en transcription unit. Further, a transgene that contains three weak PcG-protein binding sites, but no known PREs, establishes an H3K27me3 domain, recruits the PRC2 component E(z) to the transgene, is repressed in a pairing-sensitive manner, and is de-repressed in pho and ph mutants. Together our data suggest that Drosophila PcG-target genes are not just regulated by a small number of strong PREs, but rather, many DNA fragments within a target PcG domain can act as PREs. We suggest these PREs act together to recruit PcG proteins, and to spread the H3K27me3 mark.

Results
Characterized PREs from inv-en domain are not required for viability or formation of the H3K27me3 domain
The inv-en genes are regulated by shared enhancers [32,35] and comprise a ~113kb H3K27me3 domain (Fig 1). This domain extends from the 3' end of the ubiquitously transcribed enhancer of Polycomb [e(Pc)] gene to the 3' end of the ubiquitously transcribed toutatis (tou) gene. In S2 cells and adults there are four strong PcG-peaks in the inv-en domain [33,36] and these correspond to the four PREs we have previously characterized [34,37]. In order to identify the PREs within the inv-en region in larval tissues, we performed ChIP followed by next generation sequencing (ChIP-seq) with Ph and Pho antibodies (Fig 1). In agreement with previous studies on embryos and S2 cells [38], we find two large Ph and Pho peaks located just upstream of the en promoter (Fig 1). These two peaks are within DNA fragments that have PRE activity in reporter assays in embryos and pairing-sensitive silencing activity in the eye [19,26,37,39]. In both embryos and larvae, there is a strong Ph peak at the inv promoter in addition to a weaker peak about 6kb further upstream. Pho is also present at these sites but the intensity of the Pho and Ph peaks varies at the sites, i.e. where the Pho peak is large, the Ph peak is small and vice versa (Fig 1). Similarly, DNA fragments that contain these two peaks have PRE activity in embryos and pairing-sensitive silencing of mini-white in the eye [34].
In addition to these strong peaks, there are a number of smaller, statistically significant peaks for both Ph and Pho, and many of these peaks overlap (Fig 1, blue arrow heads). To further investigate these binding sites of varying intensity, we analyzed the genome-wide Pho data and identified 3727 peaks at a p-value < 1e-5. Next, for the purposes of identifying bona fide Pho binding sites, these peaks were further refined using Peaksplitter [40] and the top ~20% (2324 peaks) were further analyzed (S1 Fig). We categorized these peaks into three subclasses according to their heights: the top 10% of the peaks (heights ≥ 400) were defined as strong Pho peaks, the next 35% (100 ≤ peak heights < 400) were termed weak Pho peaks, and even smaller peaks (heights < 50) were labeled as null Pho peaks (S1C Fig). The known PREs of the inv-en region coincided with 3 strong Pho peaks, and one in the second category, which is associated with a very strong Ph peak (Fig 1 and S1 Fig). In addition, there are 6 smaller, yet statistically significant Pho peaks that overlap with Ph peaks upstream of the en transcription start site, and they belong to the second category (S1 Fig). Interestingly, Ph ChIP-seq data from embryos [38] shows a similar pattern with strong peaks upstream of en and at the inv promoter, and several weak peaks, some that coincide with peaks observed in larvae (Fig 1A, green stars).
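The height-based classes described above can be sketched as a small function. This is only an illustration, not the analysis pipeline used in the study: the cut-offs 400, 100 and 50 are taken from the text, whether each boundary is inclusive is an assumption, and the example heights are hypothetical. Heights between 50 and 100 are not assigned to any class in the text, so the sketch reports them as unclassified.

```python
from collections import Counter

# Height cut-offs quoted in the text. Whether each boundary is
# inclusive or exclusive is an assumption made for this sketch.
STRONG_MIN = 400
WEAK_MIN = 100
NULL_MAX = 50


def classify_peak(height):
    """Assign a Pho peak to a class from its height alone."""
    if height >= STRONG_MIN:
        return "strong"
    if height >= WEAK_MIN:
        return "weak"
    if height < NULL_MAX:
        return "null"
    # 50 <= height < 100 is not assigned to a class in the text.
    return "unclassified"


def summarize(heights):
    """Count how many peaks fall into each class."""
    return Counter(classify_peak(h) for h in heights)


# Hypothetical peak heights, for illustration only.
example_heights = [620, 410, 250, 120, 75, 30]
print(summarize(example_heights))
```

Keeping the thresholds as named constants makes it easy to test how sensitive the class sizes are to the exact cut-offs.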
We wondered if the weaker peaks were dependent on the presence of the known PREs and if the weaker peaks were sufficient to render the inv-en domain PcG-regulated. To test this we deleted the known PREs from the endogenous inv-en domain hypothesizing that if these PREs were essential for PcG regulation of the locus, we would see disruption of the H3K27me3 domain and mis-expression (or loss of repression) of en and inv genes, most likely leading to severe developmental defects.
Three different mutants were used in our analysis. The first mutant, called en Δ1.5, carries a 1.5kb deletion upstream of the en promoter that removes the two known en PREs ([42]; Fig 2A). These flies are homozygous viable and fertile, indicating that repression of inv-en by PcG is largely normal. In fact, we directly examined expression in imaginal discs and did not detect any differences with wild type larvae (Fig 2B). The phenotypes we did observe in en Δ1.5 flies were relatively minor. We initially observed a weak loss-of-function defect in wing veins in en Δ1.5 flies [42]; however, that defect is no longer present in the stock and is now only revealed when en Δ1.5 is put over a deficiency for the region (see below). Another phenotype, a defect in fusion of the cuticle at the midline in abdominal segments of the adult, arose in multiple copies of this stock after about 3 years and is present in 45% of en Δ1.5 flies (S2A Fig). Thus, the two PREs upstream of en are dispensable for viability and fertility, and there is no mis-expression of En in en Δ1.5 imaginal discs.
The second mutation we analyzed, inv Δ24, is a 24kb deletion that takes out both the characterized inv PREs and a large part of the inv transcription unit so that inv Δ24 flies make no Inv protein. Since the inv gene acts redundantly with en and is dispensable in the laboratory, this fact did not interfere with our analyses. inv Δ24 flies are viable and fertile and are phenotypically normal (S2A Fig), consistent with the idea that these two PREs are also not essential for PcG mediated repression of en.
[Fig 1 caption, partial] ChIP-chip experiments in embryos (Ph; [33]) and larvae (Pho; [41]) also showed the presence of smaller Ph and Pho peaks in the inv-en domain. doi:10.1371/journal.pgen.1006200.g001
Finally, we used P-element-mediated male recombination to generate a mutant with all four well-characterized PREs deleted (inv Δ24 en Δ1.5 ). To our surprise, the double deletion flies are also homozygous viable and fertile, showing that the characterized PREs are dispensable in the laboratory (S2A Fig). Further, the expression pattern of En is normal in inv Δ24 en Δ1.5 embryos and imaginal discs, suggesting that PcG-regulation of en is largely intact (Fig 2B and 2C).
We assayed the H3K27me3 levels over the inv-en domain by ChIP-qPCR in en Δ1.5, inv Δ24 and inv Δ24 en Δ1.5 larval tissues and compared them to WT levels (Fig 2D). We used 11 primer sets that spanned the domain (Fig 2A). We also carried out ChIP-seq with anti-H3K27me3 antibody in wild type and in inv Δ24 en Δ1.5 larvae (Fig 2E). In both analyses we saw that an H3K27me3 domain was established in mutant animals with H3K27me3 levels comparable to WT levels across almost the entire domain. We did see a modest reduction in H3K27me3 signal (about 50%) in the region just upstream of the inv Δ24 deletion (see primer pairs 2 and 3 in Fig 2D). This may be a result of our mutagenesis protocol which leaves behind a PBac[WH] element (S2B Fig). PBac[WH] contains a Su(Hw) insulator element [43]; we suggest this insulator partially blocks the spreading of the H3K27me3 mark toward e(Pc) by PRC2. Altogether, these data show that deletion of the 4 characterized PREs from the inv-en domain does not disrupt the formation of the H3K27me3 inv-en domain, consistent with our data showing that En is accurately expressed in larval tissues.

The size of the small Ph and Pho peaks does not change when the large peaks are deleted
We carried out ChIP-seq experiments with anti-Pho and anti-Ph antibodies on WT, en Δ1.5, inv Δ24, and inv Δ24 en Δ1.5 larvae. Strikingly, the smaller Pho and Ph peaks were still present after deletion of the large peaks (Fig 3A, blue arrowheads). ChIP-qPCR with Pho antibody confirmed these results and showed that there is only a minimal reduction in Pho binding even in inv Δ24 en Δ1.5 larvae (Fig 3B). These data show that the small peaks are independent binding sites, and do not result from cross-linking of these DNA fragments with the proteins bound to the large peaks. We therefore hypothesized that these small peaks might be true PREs and searched for the presence of consensus binding sites for PRE-DNA binding proteins and for the GAGA and GTGT motifs found in PREs within 1kb regions encompassing the peaks (S1 Table). All of the fragments contain consensus Pho sites and most contain consensus binding sites for a collection of PRE-DNA binding proteins; however, we note that peaks 2, 3, 4, and 6 contain no GAGA sites, a core component of PREs, and thus are not canonical PREs [5].
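A motif search of the kind described above can be sketched as a simple scan of each 1kb window on both DNA strands. This is a minimal illustration under stated assumptions, not the procedure used for S1 Table: it treats "GAGA" and "GTGT" as exact literal strings, whereas the actual consensus definitions (and those for Pho and other PRE-binding proteins) are not given in the text, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def count_motif(seq, motif):
    """Count overlapping exact matches of motif on the given strand."""
    seq = seq.upper()
    motif = motif.upper()
    width = len(motif)
    return sum(
        1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif
    )


def count_both_strands(seq, motif):
    """Count matches on the forward strand plus the reverse strand."""
    forward = count_motif(seq, motif)
    rc_motif = reverse_complement(motif)
    if rc_motif == motif.upper():
        # Palindromic motif: both strands give the same sites.
        return forward
    return forward + count_motif(seq, rc_motif)


def scan_window(seq, motifs=("GAGA", "GTGT")):
    """Report, for one peak window, how often each motif occurs."""
    return {motif: count_both_strands(seq, motif) for motif in motifs}


# Hypothetical short sequence standing in for a 1kb peak window.
print(scan_window("ACACGAGA"))
```

A window with a count of zero for GAGA would correspond to the peaks described in the text as lacking GAGA sites, provided the literal-string assumption holds.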
Most Ph peaks co-localize with Pho peaks in the inv-en region; however, there are some exceptions. There is a strong Ph-binding site near the middle of the inv gene where Pho is not bound; for an unknown reason, the intensity of this peak was increased in the inv Δ24 and inv Δ24 en Δ1.5 mutants (Fig 3A, green star). Examination of Pho and Ph binding upstream of en is interesting; the en promoter-proximal peak of Ph is much wider than that of Pho, and extends over the en promoter (Fig 3C). Promoter-proximal Pho binding is completely lost in en Δ1.5 and inv Δ24 en Δ1.5 larvae, but Ph was still present at the en promoter (Fig 3C), albeit at a lower level. We suggest that Ph is recruited to the en promoter via another DNA binding protein, via interaction with cohesin [44], or via paused RNA Polymerase bound to the en promoter.
Phenotypes of inv Δ24 en Δ1.5 flies
The fact that even inv Δ24 en Δ1.5 flies are viable and fertile indicates that PcG repression of en is fundamentally normal. However, Trithorax group protein complexes (TrxG, a transcriptional activator) also bind to PREs and it has therefore been proposed that PREs are also necessary for gene activation. We therefore examined more fully the phenotypes we did note in mutant flies to look for evidence of loss of en expression. Adult inv Δ24 en Δ1.5 flies have three phenotypes: 1) they hold their wings out, 2) 90% of the flies have a very subtle defect in the posterior crossvein of the wing [WD minor, Fig 4B top right], and 3) there is a defect in the fusion of left and right dorsal hemisegments (Fig 4C). The first two phenotypes are recessive; the third is dominant and much stronger in the homozygote. En is required for anteroposterior patterning in the adult abdominal segments [45], but a role for En in hemisegment fusion has not been described; thus, the reason for this phenotype is unknown. We generated a ~110kb deletion (inv en Δ110) that almost entirely removes the inv-en region (Fig 4A) and crossed it to en Δ1.5, inv Δ24, and inv Δ24 en Δ1.5 flies. Our data strongly suggest that the wings held out phenotype is caused by a loss of inv function and the defective wing vein phenotype is caused by a loss of en function. inv Δ24 flies have normal wings; this shows that en can fully substitute for inv in an otherwise WT fly. However, 60% of inv Δ24 /inv en Δ110 flies hold their wings out, suggesting that loss of inv function strongly contributes to this phenotype (Fig 4D). In contrast, 100% of en Δ1.5 /inv en Δ110 flies have wing defects in the posterior compartment, while none hold their wings out (Fig 4D).
We examined the phenotype of the wing hinge in inv Δ24 en Δ1.5 mutants and could not see any abnormalities; thus it is likely that the wings held out phenotype is caused by a defect in muscles or in the innervation of the wings (Fig 4E).
The wing vein phenotypes we observe could be due to a decreased level of En expression. Therefore, we measured the level of inv, en, tou, and e(Pc) mRNA and quantified the En signal in wing imaginal discs of WT and inv Δ24 en Δ1.5 and saw no significant differences (S2C-S2E Fig). The inv and en promoters are both associated with paused RNA Polymerase II (PolII), a characteristic of Polycomb-regulated genes [36,46]. We tested whether deletion of the PREs had any effect on the level of PolII at the promoter of these genes [inv and en] and also on the flanking genes [E(Pc) and tou] in larvae. We found that a low amount of PolII is present at the promoters of these genes and this level did not change in the absence of characterized PREs (S2F Fig). Thus, while our genetic data suggest that en Δ1.5 causes a decrease in en function, we could not detect any changes in the level or pattern of En expression in inv Δ24 en Δ1.5 wing discs. These data suggest that deletion of the characterized PREs causes only a subtle decrease in the level of En expression. Since a promoter-tethering element, necessary for interaction with imaginal disc enhancers [47], is also present within the en Δ1.5 deletion, we cannot attribute this subtle decrease to a loss of the PRE.
Deletion of strong PREs does not affect local 3D structure of inv-en domain
As opposed to a hierarchical model of recruitment of Polycomb proteins to PREs [48], recent articles proposed a combinatorial recruitment of Pho at noncanonical, low-affinity Pho binding sites present within a Polycomb domain [49,50]. Through their analysis of Hi-C data, Schuettengruber et al. (2014) proposed that, preferentially within a Polycomb domain, different Pho sites contact each other forming a distinct compact structure [49]. To test whether the weak peaks interact with the en transcription unit, we carried out 4C-seq experiments in our WT and deletion mutants using a probe from the en transcription unit. We note that our 4C experiment was done with larval brains and discs, and contains a mixture of cells with En 'ON' and those with En 'OFF'. Thus, differences detected could indicate differences in either the 'ON' or 'OFF' state.
The bait is ~2.9kb downstream of the en promoter (Fig 5, red arrowhead). In concordance with previous reports [21,44,51], we observed an interaction of known inv PREs with the en transcription unit (Fig 5, indicated with blue arrowheads); the known en PREs are too close to the bait to analyze. In addition, we found interactions of the en transcription unit with several different fragments distributed within the inv-en domain, with the strength of these interactions varying amongst the fragments; of these, some were interactions between weak Pho/Ph peaks and the en transcription unit (Fig 5, yellow shaded boxes). Some of the weak Pho/Ph peaks have higher levels of interaction with the en transcription unit than the known inv PREs. This suggests that these weak Pho/Ph peaks may play an important role in the establishment of the topology of this domain. We note that, in agreement with data on the inv-en domain in embryos [51], we did not observe any significant interaction with any other Polycomb domain.
Establishment of an H3K27me3 domain over a transgene that contains weak Pho/Ph peaks
We next examined whether a ~7.5kb fragment (2R:7446390..7453923) containing three weak Pho/Ph peaks (Fig 6A, region outlined with green line, Fragment R from Cheng et al., 2014 [32]) could cause accumulation of H3K27me3 over the lacZ gene when cloned into the P-element vector P[en-lacZ] (Fig 6B, [32]). This vector contains 400bp upstream of the en transcription start site and 139bp of the untranslated en leader, but no PREs. We examined the accumulation of H3K27me3 over the lacZ gene in larvae with P[7.5en-lacZ] inserted at two different chromosomal insertion sites (2R:17047933, X:11744845). Our data show that the 7.5kb fragment is required to create a small H3K27me3 domain over the lacZ gene (see below); H3K27me3 is highest immediately adjacent to the en fragment and drops off rapidly within the lacZ gene (Fig 6C). The mark does not spread to regions flanking the transgene. We also carried out ChIP with anti-E(z) antibody in P[7.5en-lacZ]-2R larvae to determine whether the fragment with weak peaks could recruit this PRC2 subunit; our data show that there is an accumulation of E(z) at the junction of the promoter and the weak peak fragment and at the 5' end of lacZ (S3A Fig). We also examined the ß-galactosidase expression pattern in P[7.5en-lacZ] transgenic larvae from 2 independent insertion sites. ß-galactosidase was expressed in a variegated pattern in discs heterozygous for the insert (Fig 6D), and this variegation was extremely variable for the insert on the X. There was a dramatic increase in silencing in imaginal discs homozygous for P[7.5en-lacZ] inserted on the 2nd chromosome (Fig 6E). The extreme variegation of discs from P[7.5en-lacZ]-X was also present in females homozygous for the transgene, and in this case, it was difficult to assess whether there was more silencing in homozygotes.
Variegated expression and increased silencing of lacZ in wing discs homozygous for the transgene are both reminiscent of the effect of PREs on expression of the mini-white gene in the eye. In fact, P[7.5en-lacZ]-X homozygotes have a lighter eye color than heterozygotes, and thus show pairing-sensitive silencing. The eye color of P[7.5en-lacZ]-2 flies is variegated, but is darker in homozygotes. We tested the effect of pho and ph mutations on ß-gal expression in P[7.5en-lacZ]-2 wing imaginal discs. Silencing of lacZ from P[7.5en-lacZ]-2 is greatly reduced in a pho homozygous mutant (Fig 6F) suggesting that Pho plays a role in recruiting PcG complexes to this transgene. lacZ silencing is also relieved in a ph mutant (Fig 6G). These data show that the small Pho/Ph peaks can act as PREs in this transgene and suggest they also act as PREs within the inv-en domain.
We examined H3K27me3 accumulation over a transgene that had a 2.6kb fragment of en DNA, from -2.4kb to +139bp (including the known PREs and the en promoter) fused to lacZ and cloned into pCaSpeR with the mini-white reporter (S3B Fig, [37]). A small H3K27me3 domain formed over the lacZ gene in this transgene (S3C Fig). Like the signal over P[7.5en-lacZ] (Fig 6C), the highest level of H3K27me3 was found closest to the en fragment, and its intensity rapidly dropped about 50% over the ~2.5kb region of the lacZ gene. When the fragment containing the known PREs was removed from this transgene (via FRT and loxP sites) [37], H3K27me3 was completely lost, even though the en promoter remained (S3C Fig) [52]. The en sequences remaining in this transgene included 400bp upstream of the promoter and 139bp of the untranslated leader, analogous to the en promoter remaining in en Δ1.5 (that deletes sequences from -1956 to -412bp). Thus, the en promoter alone is not sufficient to establish an H3K27me3 domain. We also examined the effect of an individual weak Pho/Ph peak (2R:7429281..7432610) on H3K27me3 accumulation in P[en-lacZ] (S3D Fig). This fragment was able to incorporate H3K27me3 over lacZ in two different insertion sites in the genome (S3D Fig). This construct contains no embryo or imaginal disc enhancer [32], thus we could not assess lacZ expression in these lines. We do note, however, that three out of four P[3.3en-lacZ] lines exhibit pairing-sensitive silencing, strongly suggesting this transgene contains a PRE. Finally, to address the question of the relative strength of these presumed PREs, we inserted fragments with either strong or weak Pho/Ph peaks into a ΦC31-white reporter gene construct and integrated them in the attP40 site; we also integrated vector alone at the attP40 site. The vector alone inserted at this site gave transgenic flies with red eyes.
Transgenic flies containing constructs with PREs from en or Ubx had an orange eye color, consistent with repression of white by the PRE. In contrast, transgenic flies with constructs with weak Pho/Ph peaks had a red eye color, similar to the vector alone (S3E Fig). We further cloned three weak peak fragments together in the reporter construct; even this was not able to repress the white gene (S3E Fig, middle picture). We note that at this chromosomal location we did not observe any PSS. In addition, the en and Ubx PREs were able to suppress the yellow gene present at the ΦC31 landing site, while constructs with the weak peaks were not. These data show that the known PREs with the large Ph/Pho peaks are stronger silencers than the weak Ph/Pho binding sites in this assay.
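The fragment sizes quoted in this section can be checked directly from the genomic coordinates given in the text. The sketch below is a small worked check, assuming the coordinates are 1-based and inclusive at both ends (an assumption, since the text does not state the convention); the function names are hypothetical.

```python
import re


def parse_interval(text):
    """Parse a 'chrom:start..end' string into (chrom, start, end)."""
    match = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)", text)
    if match is None:
        raise ValueError(f"not a chrom:start..end interval: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if end < start:
        raise ValueError(f"end precedes start in {text!r}")
    return chrom, start, end


def interval_length(text):
    """Length in bp, assuming 1-based coordinates inclusive at both ends."""
    _, start, end = parse_interval(text)
    return end - start + 1


# Coordinates as quoted in the text for the two en fragments.
fragments = {
    "7.5kb fragment (three weak Pho/Ph peaks)": "2R:7446390..7453923",
    "single weak Pho/Ph peak fragment": "2R:7429281..7432610",
}
for label, coords in fragments.items():
    print(label, interval_length(coords), "bp")
```

Under the inclusive-coordinate assumption the two fragments come out at 7534 bp and 3330 bp, consistent with the ~7.5kb and 3.3kb sizes used in the construct names.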

Discussion
Global identification of genomic sites to which PcG complexes bind and understanding the mechanism that controls expression of target genes by these complexes is essential to fathom how the PcG system regulates transcription and genome organization. The key conclusions that can be drawn from our experiments are: 1. Strong PREs of the inv-en domain are non-essential in the laboratory. 2. The size of the weak Pho and Ph peaks is not changed in the absence of the known PREs. 3. Weak Pho and Ph peaks interact with the en transcription unit. 4. Known PREs are dispensable for maintenance of 3D structure of the local PcG domain. 5. A DNA fragment with weak Ph/Pho peaks can create a PcG domain in a transgene. These data change our view of PcG-recruitment to the inv-en domain; rather than having just a few strong PREs, there are many DNA fragments that can act together to recruit PcG proteins. As explained below, it is likely that our results can be generalized to many PcG targets.

Evidence for weak PREs from other studies
PREs were identified as DNA fragments that could recruit PcG proteins to transgenes [15,17], maintain expression of a reporter gene throughout development, and cause the transgenes to be derepressed in a PcG-mutant. Early chromatin-immunoprecipitation experiments showed that PREs are binding sites for PcG proteins [53], and genome-wide studies identified the largest PcG peaks at known PREs. Most PcG-target genes have only a few large PcG-protein peaks, often located near the promoter. However, there were indications from early transgenic studies that more than just a few PREs exist in PcG targets. For example, when testing fragments of Ubx DNA for regulatory activity in embryos, Müller and Bienz (1991) reported that two discrete DNA fragments could confer PcG-dependent restricted expression to a bxd-Ubx-lacZ reporter gene; one they called PBX and another called ABX. The bxd enhancer causes bxd-Ubx-lacZ to be expressed from the head to the tail of the embryo [parasegments (ps) 2-14], but is silenced in the anterior part of the embryo in the presence of either the PBX or the ABX fragment due to repression by the PcG-proteins. Notably the strength of repression by the PBX fragment was stronger than that by the ABX fragment suggesting the presence of strong and weak PREs.
The bxd enhancer is 'primed' to be PcG-regulated. In 1995, Jürg Müller used the UAS-Gal4 system to show that transient expression of Gal4-Pc could silence a UAS-bxd-lacZ transgene throughout embryogenesis, and showed that this silencing was dependent on endogenous PcG-genes [54]. These data suggest that once the bxd-fragment encounters PcG-proteins, it can maintain repression, essentially acting as a PRE. In contrast, a UAS-NP6-lacZ construct, containing the synthetic enhancer NP6, was only transiently silenced by Gal4-Pc; once Gal4-Pc was gone, the transgene was expressed again. Thus, on the UAS-bxd-lacZ transgene there was a kind of "handing off" from the Gal4-Pc to the endogenous PcG-proteins and this did not happen on the UAS-NP6-lacZ construct. Interestingly, the bxd enhancer contains a weak PcG-protein peak but on its own does not act as a PRE in transgenes (S4 Fig). We suggest that regulatory DNA from PcG-target genes is primed to be PcG-regulated and facilitates PcG recruitment and H3K27me3 spreading.

PcG-recruitment to inv-en
Examination of H3K27me3 levels in early embryos showed that H3K27me3 accumulates first over the inv-en known PREs and subsequently spreads to the rest of inv-en region [55]. It is therefore surprising that deletion of the known PREs does not lead to de-repression of En/Inv expression and therefore to major developmental defects and lethality. Perhaps even more surprising is that, apart from the DNA flanking the known en PREs, H3K27me3 levels are not decreased in the inv Δ24 en Δ1.5 larvae and the 3D structure of the inv-en domain is not disrupted by the loss of the known PREs. Clearly, recruitment of PcG proteins is highly redundant at the inv-en domain. Another surprising result in our experiments is the presence of the Ph protein at the en promoter in the absence of the promoter-proximal en PREs. We suggest that Ph, or other PRC1 components are recruited to the en promoter either by direct interaction with other transcription factors (i.e. Adf1, Cg) [56,57] or via interaction with paused polymerase. It is interesting to note that in early studies, general transcription factors co-purified with PRC1 components [58].

Why so many PREs in a PcG domain?
PcG recruitment and repression is a dynamic process; PREs recruit PcG complexes and they are anchored to PREs, but interactions between PRC1 and PRC2 components and the chromatin modifications imparted by them reinforce each other to stabilize and propagate the complexes [50,59]. ChIP-experiments show that the enzymatic component of PRC2, E(z) is concentrated over the PREs, and the levels of H3K27me3 are very high flanking the strong PREs of en (Fig 2), suggesting that E(z) acts best on flanking nucleosomes and that the level of activity decreases with distance from the PRE. How far does the H3K27me3 domain spread from a PRE? Interestingly, the size of the H3K27me3 domain on our en-lacZ transgenes was relatively small, dropping rapidly throughout the 2.5kb of lacZ gene, suggesting that E(z) acts on a relatively local area. Given previous results showing the extreme sensitivity of PRE-transgenes to the position of insertion in the genome, we suspect that the degree of spreading will be dependent on the exact construct tested and the chromosomal insertion site. Nevertheless, our studies showed a remarkable consistency between the spreading of H3K27me3 through the lacZ gene for three different en-lacZ transgenes inserted at 5 different chromosomal locations. Thus we suggest that spreading from a single PRE is likely limited and that the weak Ph/Pho peaks facilitate spreading of the H3K27me3 mark both by direct recruitment of PcG proteins and by facilitating interactions with PcG proteins bound to other PREs (Fig 7). We note that the strong PREs are present in all cell types and tissues whereas weak peaks may be either stage or tissue specific, although at present we do not know what imparts this specificity on a PRE (Fig 7).
While PREs have been known for many years in Drosophila, identification of PREs in mammalian PcG-targets has been more difficult, in part because very high PcG-binding peaks are not present in mammalian PcG-target genes, although many small PcG-binding peaks do exist. Interestingly, a detailed analysis of PcG recruitment to the HoxD locus in mice suggests a large number of binding sites cooperate to recruit PRC2 [60]. These authors show that deletions within the 300kb HoxD locus do not disrupt the H3K27me3 domain, suggesting the presence of multiple PRC2-recruitment sites. Interestingly, previous results on single PRE deletions in the endogenous Ubx and Abd-B genes showed only minimal mis-expression of these genes and weak phenotypes, suggesting at least partial redundancy in PcG recruitment at these loci [29-31,61]. The consequences of these PRE deletions on the H3K27me3 domain were not examined. Our data suggest that multiple, weak PREs can recruit PcG proteins to the inv-en domain in the absence of the strong PREs. We suggest that PcG recruitment in Drosophila is similar to that in mammals, except that in mammals, PcG-target genes lack strong PREs.

Fly strains
Generation of the en Δ1.5 and inv Δ24 mutants has been described [32,42]. The double mutant inv Δ24 en Δ1.5 was obtained using P-element mediated male recombination [62]. The deletions were validated both by PCR and by the absence of those sequences in ChIP-seq data. All flies were kept at 25°C. Generation of constructs and transgenic lines has been described previously [32].

qRT-PCR
Total RNA was collected from 20 third instar larval wing imaginal disks using Trizol (Invitrogen) according to the manufacturer's protocol. One-step RT-qPCR was performed with the SensiFAST SYBR No-ROX One-Step kit (Bioline) on a Roche Lightcycler 480 according to the manufacturer's instructions [63]. Primers used are listed in S2 Table.

Staining of imaginal disks
For antibody staining, imaginal disks and brains were collected from third instar larvae and fixed for 23 min [0.1 M PIPES (pH 6.9), 1 mM EGTA (pH 6.9), 1.0% Triton X-100, 2 mM MgSO4, 4% formaldehyde (Ted Pella, Inc.)]. Fixed tissues were blocked for 2 hrs at 4°C with blocking buffer (1X PBS, 0.1% Tween-20 and 5% normal goat serum). Tissues were incubated overnight at 4°C with primary antibody diluted in blocking buffer (rabbit anti-En 1:500 dilution, Santa Cruz Biotechnology, Inc.; mouse anti-β-gal 1:500 dilution, Invitrogen). After removal of primary antibody, tissues were washed (1X PBS, 0.1% Tween-20) four times for 15 min each. Washed tissues were incubated in fluorescently tagged secondary antibody diluted in blocking buffer for 2 hrs at 4°C in the dark, washed 3 times and mounted with DAPI-Vectashield (Vector Laboratories). All wing imaginal discs were quantified using ImageJ.

ChIP and ChIP-qPCR
Protocol for carrying out ChIP in larval tissue has been described previously [63], with minor changes: fixed brains and imaginal discs were dissected from 10 third instar larvae, and before incubating the sonicated chromatin with antibodies, 3.3% of each sample was saved for input reactions. ChIP was performed with 1:100 dilutions of anti-Pho [64] and anti-Ph antibodies (a   antibodies. Quantitative PCR on ChIP samples was performed as described previously [26]. Lightcycler 480 Real-Time PCR System (Roche Applied Science) and Lightcycler 480 DNA SYBR Green I Master Mix (Roche Applied Science) were used for performing ChIP-qPCR.
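As a rough illustration of how the 3.3% input fraction can enter ChIP-qPCR quantification, the Python sketch below computes a percent-of-input value from Ct values. It is an example under stated assumptions, not the procedure of [26]: the function name `percent_input`, the example Ct values, and the assumption of perfect doubling per PCR cycle are all hypothetical, and the fold enrichment over background reported in the figures would require a further normalization step that is not shown here.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.033) -> float:
    """Percent of input recovered in the IP (hypothetical helper).

    Assumes 100% PCR efficiency (signal doubles each cycle) and that
    `input_fraction` of the chromatin was set aside as input (3.3% here).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what it would be if 100% of the chromatin were measured.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


# Illustrative (made-up) Ct values: an IP that amplifies two cycles later
# than the 3.3% input contains one quarter as much target DNA.
print(round(percent_input(ct_ip=27.0, ct_input=25.0), 3))  # 0.825
```

With equal Ct values for IP and input, the function returns 3.3, that is, the IP recovered as much target DNA as the saved input aliquot.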

ChIP-seq and data analysis
Following purification of immunoprecipitated DNA, Illumina libraries were prepared using the TruSeq DNA Sample Prep Kit V2 as described (http://ethanomics.wordpress.com/chip-seqlibrary-construction-using-the-illumina-truseq-adapters/). All ChIP-seq data sets were aligned using Casava (version 1.8) to the Drosophila reference genome (release 5.22/6.02). All ChIP-seq experiments were performed with 2 biological replicates yielding similar results. Data were normalized as reads per million, except for the embryo Ph and H3K27me3 data in Fig 1; normalization of these data is explained in Bowman et al., 2014 [38]. The same H3K27me3 data were used in several figures (1, 2D, 3A, 5A). Peak calling for Pho was conducted using MACS v1.4.2 [65]. Shifting model building for prediction of fragment length was turned off (--nomodel) and a fixed standard background model for peak calling was used to increase sensitivity (--nolambda). In addition, we specified a p-value cutoff of 0.00001 for the output peaks. For in-depth analysis of the binding regions, output peaks were split with Peaksplitter v0.1 [40] and ~20% (peak height ≥ 50) of the split peaks were considered for further analysis to rank them into null, weak and strong binding sites. Sequencing data have been deposited to the Gene Expression Omnibus.
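The reads-per-million normalization mentioned above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `reads_per_million`, the per-bin counts, and the library size are hypothetical, and the source does not state whether total, mapped, or uniquely mapped reads were used as the denominator.

```python
from typing import List, Sequence


def reads_per_million(counts: Sequence[float], total_mapped_reads: int) -> List[float]:
    """Scale raw per-bin read counts to reads per million (RPM).

    `total_mapped_reads` is assumed to be the number of mapped reads in the
    library; the choice of denominator is an assumption of this sketch.
    """
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return [c * 1_000_000 / total_mapped_reads for c in counts]


# Made-up example: three genomic bins from a library of 5 million mapped reads.
print(reads_per_million([10, 0, 25], 5_000_000))  # [2.0, 0.0, 5.0]
```

Scaling by library size in this way makes coverage tracks from libraries of different sequencing depth directly comparable.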

4C-seq
The chromosome conformation capture (3C) protocol described by Tolhuis et al. [66] was followed with minor modifications. To generate the 4C library, we followed the protocol described by Tolhuis et al. [67]. Brains and disks from 10 third instar larvae were fixed using freshly prepared fixing solution containing 2% formaldehyde and 1X PBS prepared from 16% formaldehyde (Ted Pella, Inc.). The fixation reaction was quenched using 125 mM glycine. After fine dissection and nuclei isolation, the nuclei pellet was dissolved in 1X NEB buffer 3. After the SDS treatment, the chromatin was digested overnight with 5 μl EcoRI (200 U, NEB). The next day, after inactivating EcoRI, the digested chromatin was ligated for 6 hrs with 0.5 μl ligase (200 U, NEB) in a 2 ml ligation reaction. The ligated library was purified as described in the reference mentioned above. Finally, the library was dissolved in 400 μl ultrapure water or 10 mM Tris-HCl (pH 8.0).
100 μl of the 3C library was digested with 5 μl of DpnII (50 U, NEB). After the enzyme was inactivated, the digested chromatin was re-ligated using 5 μl of ligase (200 U, NEB) in a total volume of 2 ml to promote self-ligation events, and the library was then purified. Inverse PCR was carried out using 8 different custom-made reverse primers (each containing a unique 5 nucleotide bar code) and one universal forward primer to produce Illumina 4C-seq libraries. The primer sequences used for 4C-seq library amplification are listed in S2 Table. PCR amplification was carried out using Kapa HiFi HotStart Ready Mix (KAPA Biosystems, cat no. KK2601) with 90 ng of DNA, 2.5 μl of forward primer (10 mM) and 1.7 μl of reverse primer (10 mM). The amplification program in the thermal cycler was: denaturation at 95°C for 30 sec; amplification with 25 cycles of 98°C for 10 sec, 64°C for 30 sec and 72°C for 30 sec; final elongation at 72°C for 1 min. Six PCR reactions were pooled and purified using the High Pure PCR Product Purification kit (Roche, cat no. 11732676001). Finally, 4C-PCR products from WT and mutant lines were combined in preferred ratios for sequencing.

4C-data analysis
The procedure to detect significant interactions (peaks) in 4C was adapted from Tolhuis et al. and Ghavi-Helm et al. [51,67]. Read counts were transformed using the variance stabilizing transformation from the DESeq2 package in R [68], and the transformed counts from chr2R (the chromosome containing the bait fragment) were then smoothed using loess. The region immediately surrounding the bait fragment was excluded from the fit (chr2R:7410000..7416000). Interactions with p-value < 0.01 in both replicates and FDR < 0.1 in at least one replicate were selected as bona fide interactions.
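The final selection rule (p-value < 0.01 in both replicates and FDR < 0.1 in at least one replicate) can be expressed as a short filter. The Python sketch below illustrates only that rule; it assumes per-fragment p-values and FDRs have already been computed upstream, and the function names, fragment labels and numbers are hypothetical rather than taken from the study.

```python
from typing import Dict, List, Sequence

P_CUTOFF = 0.01    # p-value threshold required in every replicate
FDR_CUTOFF = 0.1   # FDR threshold required in at least one replicate


def is_bona_fide(p_values: Sequence[float], fdrs: Sequence[float]) -> bool:
    """Apply the two-part rule to one fragment (one value per replicate)."""
    return all(p < P_CUTOFF for p in p_values) and any(q < FDR_CUTOFF for q in fdrs)


def select_interactions(fragments: Dict[str, dict]) -> List[str]:
    """Return names of fragments passing the rule, in input order."""
    return [name for name, s in fragments.items() if is_bona_fide(s["p"], s["fdr"])]


# Made-up statistics for three fragments, two replicates each.
example = {
    "frag_A": {"p": (0.001, 0.005), "fdr": (0.05, 0.30)},  # passes both criteria
    "frag_B": {"p": (0.001, 0.020), "fdr": (0.05, 0.05)},  # fails p-value in replicate 2
    "frag_C": {"p": (0.002, 0.004), "fdr": (0.20, 0.30)},  # fails FDR in both replicates
}
print(select_interactions(example))  # ['frag_A']
```

Requiring the p-value criterion in both replicates while relaxing the FDR criterion to one replicate keeps reproducible interactions without demanding that each survives multiple-testing correction twice.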
Reads from all the NGS data were deposited in the NCBI-GEO under accession number GSE77342.

[WH]f06870 present at the inv Δ24 deletion site. inv Δ24 was made by recombination between two P[WH] elements, leaving an intact element at the site of the deletion [32].
(C) Quantification of total transcripts of E(Pc), inv, en and tou in WT and inv Δ24 en Δ1.5 relative to total RpL32 transcript in larval wing imaginal disks; data presented are the average of two independent biological samples with three replicates each (mean±SEM).
(D) Quantification of the En-expressing area over the total wing disc area; data were collected from 20 imaginal discs of WT and inv Δ24 en Δ1.5 (mean±SEM). n.s., not significant by Student's t-test.
(E) Quantification of fluorescent pixel intensity in the En-expressing region of the wing disc in the indicated genotypes; data were collected from 15 imaginal discs of WT and inv Δ24 en Δ1.5 (mean±SEM).
(F) Total PolII accumulation quantified on promoters and gene bodies of E(Pc), inv, en and tou. The RpL32 gene is used as a positive control. Results are shown as fold enrichment over background signal and are the average of two independent biological samples with three replicates each (mean±SEM).

Results are shown as fold enrichment over control signal and are the average of two independent biological samples with three replicates each (mean±SEM).
(B) Diagram of the P-element based transgenic construct with and without the characterized en PREs [37].
(C) Quantification of H3K27me3 over the lacZ gene. Regions amplified by qPCR are indicated over the lacZ gene in the diagram; fragments from E(Pc) and tou were used as negative controls. Results are shown as fold enrichment over background signal and are the average of two independent biological samples with three replicates each (mean±SEM). Statistical analysis of differential H3K27me3 accumulation was performed using Student's t-test at a P-value threshold of 0.05. Only the significant differences with WT are shown with '*'.
(D) A schematic of a transgene containing a fragment associated with a single weak peak (2R:7429281..7432610, fragment K from Cheng et al., 2014 [32]) is shown in the upper panel; quantification of H3K27me3 over lacZ in two transgenic lines containing this transgene is shown below. Results are shown as fold enrichment over control signal and are the average of two independent biological samples with three replicates each (mean±SEM).
(E) A schematic of the attP40 site after transgene insertion is shown in the top panel; the inserted vector is shown in green. Pictures of fly eyes from lines containing either the strong or weak PcG peaks are shown in the bottom panel. Fragment coordinates used in this assay are also shown. Construct 'attB-P[acman]-Ap R ' [69] was modified by the addition of an eye enhancer from the white gene [32]. Note also that the body color of the enPREs and PRE D lines is yellow whereas the other lines have a darker body color. This lighter body color is the result of repression of the y+ transgene present at the attP40 site.

Table. List of Pho, Zeste, Spps, Dsp1 and Grh consensus binding sites and the GTGT and GAGA motifs found within the 1 kb region encompassing the small peaks. The center of each peak was identified and 500 bp upstream and downstream were chosen for this analysis. The coordinates of the small peaks are shown in the left column. (TIFF)
S2 Table. List of primers used. (XLSX)