A Complex Genetic Switch Involving Overlapping Divergent Promoters and DNA Looping Regulates Expression of Conjugation Genes of a Gram-positive Plasmid

Plasmid conjugation plays a significant role in the dissemination of antibiotic resistance and pathogenicity determinants. Understanding how conjugation is regulated is important to gain insights into these features. Little is known about regulation of conjugation systems present on plasmids from Gram-positive bacteria. pLS20 is a native conjugative plasmid from the Gram-positive bacterium Bacillus subtilis. Recently the key players that repress and activate pLS20 conjugation have been identified. Here we studied in detail the molecular mechanism regulating the pLS20 conjugation genes using both in vivo and in vitro approaches. Our results show that conjugation is subject to the control of a complex genetic switch where at least three levels of regulation are integrated. The first of the three layers involves overlapping divergent promoters of different strengths regulating expression of the conjugation genes and the key transcriptional regulator RcoLS20. The second layer involves a triple function of RcoLS20 being a repressor of the main conjugation promoter and an activator and repressor of its own promoter at low and high concentrations, respectively. The third level of regulation concerns formation of a DNA loop mediated by simultaneous binding of tetrameric RcoLS20 to two operators, one of which overlaps with the divergent promoters. The combination of these three layers of regulation in the same switch allows the main conjugation promoter to be tightly repressed during conditions unfavorable to conjugation while maintaining the sensitivity to accurately switch on the conjugation genes when appropriate conditions occur. The implications of the regulatory switch and comparison with other genetic switches involving DNA looping are discussed.


Introduction
Bacteria exchange genetic material at high rates by different processes, which are collectively named Horizontal Gene Transfer (HGT). HGT can be beneficial for bacteria because the newly acquired DNA may endow them with novel features that enable them to adapt rapidly to changing environmental conditions. On the other hand, HGT is notorious for its role in the dissemination of virulence/pathogenicity determinants and antibiotic resistance. The main mechanisms responsible for HGT are transformation mediated by natural competence, transduction and conjugation [1][2][3][4][5][6]. The last of these, conjugation, concerns the transfer of a DNA element from a donor to a recipient cell. Conjugative elements containing all the information required for DNA transfer from a donor to a recipient cell are often found on plasmids, but they can also be embedded within a bacterial chromosome. These latter forms are generally named integrative and conjugative elements (ICE).
Some basic features of the conjugation process are conserved among plasmids [for review see [7][8][9][10]]. In most cases, a single-stranded DNA (ssDNA), which is generated by a rolling circle-like mode of DNA replication, is transferred into the recipient cell through a membrane-associated intercellular mating channel, named transferosome, which is a form of type IV secretion system. Conjugative plasmids can be exploited for the construction of tools to genetically modify bacteria of clinical or industrial relevance that are recalcitrant to genetic manipulation by other means. Besides its intrinsic scientific interest, a detailed understanding of how conjugation genes are regulated is crucial for designing strategies to interfere with the rapid spread of antibiotic resistance, and for the construction of genetic tools based upon conjugative plasmids.
Various conjugative plasmids have been studied in considerable detail [for review see [7][8][9][10]]. Although most of the well-studied conjugative plasmids replicate in Gram-negative bacteria, an increasing interest in conjugative plasmids of Gram-positive bacteria has resulted in the recent analysis of conjugative plasmids from, for instance, streptococci, enterococci, staphylococci and clostridia [11][12][13][14][15][16][17]. However, conjugation systems present in the Gram-positive soil bacterium Bacillus subtilis had not been reported until recently. This is remarkable given that (i) it is one of the best-studied Gram-positive bacteria; (ii) it has important industrial applications; and (iii) it is closely related to pathogenic and fastidious bacilli [for review see 18,19]. Moreover, several B. subtilis strains are gut commensals in animals including humans [20]. B. subtilis plasmids may therefore play an important role in HGT in different environments. We chose the B. subtilis plasmid pLS20 for our studies. Originally, this 65 kb plasmid was identified in the Bacillus subtilis natto strain IFO3335 that is used in the fermentation of soybeans to produce "natto", a dish that is popular in South Asia [21]. Previous studies on pLS20 have shown that it is conjugative in liquid media as well as on solid media [22,23]. The presence of pLS20 has a broad impact on the physiology of the host, and the localization of some components of the conjugation machinery has been determined [24,25]. The replication region of pLS20 has been characterized, and it has been demonstrated that it uses a dedicated segregation mechanism involving the actin-like Alp7A protein [26,27]. pLS20 encodes a protein, Rok LS20 , that suppresses the development of natural competence of B. subtilis [28].
Recently, we have reported a global view of the regulatory circuitry of the pLS20 conjugation genes. A conjugation operon encompassing more than 40 genes is located next to a divergently oriented single gene, rco LS20 , which encodes the master regulator of conjugation responsible for keeping conjugation in the default "OFF" state. Activation of conjugation requires an anti-repressor, Rap LS20 , that belongs to the family of Rap proteins. Inactivation of the rap LS20 gene on pLS20 severely compromises conjugation, and conjugation was enhanced when rap LS20 was expressed from an ectopic locus. The activity of Rap LS20 , in turn, is regulated by a signaling peptide, Phr* LS20 . The small phr LS20 gene, located immediately downstream of rap LS20 , encodes a pre-protein. After being secreted, Phr LS20 can be processed by a second proteolytic cleavage, resulting in generation of the functional pentapeptide, Phr* LS20 , corresponding to the five C-terminal residues of Phr LS20 . When (re)imported, this peptide inactivates Rap LS20 . Therefore, activation of conjugation is ultimately regulated by the Phr* LS20 signaling peptide. The Phr* LS20 concentration will be relatively high or low when donor cells are predominantly surrounded by donor or recipient cells, respectively. Hence, conjugation will become activated particularly under conditions in which recipient cells are potentially present. In addition, Phr* LS20 has a crucial role in returning conjugation to the default "OFF" state [29].
Despite identification of the players involved in regulation of the conjugation genes, our knowledge on regulation of the genetic switch responsible for activating conjugation is still very limited. Using a combination of various in vitro and in vivo approaches, we show that the genetic switch controlling pLS20 conjugation involves at least three layers of regulation. Together, they tightly repress the main conjugation promoter under conditions that do not favor conjugation, while maintaining the ability to accurately switch on the conjugation genes when appropriate conditions occur. The three layers involve coinciding or overlapping divergent promoters of different strengths, autoregulated expression of Rco LS20 , which turns out to be a tri-functional transcriptional regulator, and formation of Rco LS20 -mediated DNA looping. The sophisticated regulatory mechanism that combines three layers of control into a single switch is novel for plasmids of Gram-positive bacteria. The implications of the uncovered regulation mechanisms for conjugation are discussed in the context of regulatory systems present on other HGT elements and with other regulatory systems involving DNA looping.

Results
Promoters P c and P r
The rco LS20 -gene 28 intergenic region contains the strong main conjugation promoter, P c , which is under the negative control of the master regulator of conjugation Rco LS20 . According to our standard presentation (Fig. 1A), the transcription of pLS20cat gene 27, encoding the main repressor of conjugation genes, Rco LS20 , reads leftwards. Flanking genes 28 to 74, which are all transcribed in the opposite direction, probably constitute a large conjugation operon [29]. To test whether a promoter that would drive expression of this operon is located upstream of gene 28 we cloned the ~600 bp intergenic rco LS20 -gene 28 region in the appropriate orientation in front of a promoterless lacZ reporter, and subsequently placed a single copy of this cassette at the B. subtilis chromosomal thrC locus (strain PKS3). Transcriptional fusions to several sub-fragments of this region were also constructed. In addition, all the fragments were cloned in the opposite orientation to analyze the divergent promoter of the rco LS20 gene (see below). For simplicity, the cloned fragments are indicated with Roman numerals. Fragments cloned in the orientation to analyze the conjugation or the rco LS20 promoter are indicated with the extension "c" or "r", respectively. The entire intergenic region is referred to as Fragment I (or F_I). A schematic representation of the different strains and fusions described in this work is given in Figs. 1B-C.

Author Summary
Plasmids are extrachromosomal, autonomously replicating units that are harbored by many bacteria. Many plasmids encode transfer functions allowing them to be transferred into plasmid-free bacteria by a process named conjugation. Since many of them also carry antibiotic resistance genes, plasmid-mediated conjugation is a major mechanism in the dissemination of antibiotic resistance. In-depth knowledge of the regulation of conjugation genes is a prerequisite for designing measures that interfere with the spread of antibiotic resistance. pLS20 is a conjugative plasmid of the soil bacterium Bacillus subtilis, which is also a gut commensal in animals and humans. Here we describe in detail the molecular mechanism by which the key transcriptional regulator tightly represses the conjugation genes during conditions unfavorable to conjugation without compromising the ability to accurately switch on the conjugation genes when appropriate. We found that conjugation is subject to the control of a unique genetic switch where at least three levels of regulation are integrated. The first level involves overlapping divergent promoters of different strengths. The second level involves a triple function of the transcriptional regulator. The third level concerns formation of a DNA loop mediated by the transcriptional regulator.
Colonies of strain PKS3 containing the F_I c -lacZ fusion were blue when grown overnight on Luria-Bertani (LB) agar plates supplemented with the chromogenic substrate 5-bromo-4-chloroindolyl-β-D-galactopyranoside (Xgal; Fig. S1), demonstrating that the rco LS20 -gene 28 intergenic region contains a promoter, which we named P c . Analysis of PKS3 samples gave relatively high levels of β-galactosidase (βG) activity, approximately 300 and 500 Miller Units (MU) during mid-exponential and stationary phase, respectively. These results indicate that P c is a rather strong promoter that does not seem to be regulated by host-encoded factors when grown under these conditions. Under our laboratory conditions, efficient conjugation is limited to a narrow time window near the end of the exponential growth phase [29]. If P c is the main conjugation promoter it is expected that (i) its activity would generally be lower in the presence of pLS20cat and (ii) there would be a correlation between promoter P c activity and the efficiency of conjugation. The following results show that this is indeed the case. First, we introduced pLS20cat into strain PKS3, and colonies of the resulting strain, PKS8, were white after overnight growth on Xgal-containing plates (Fig. S1).
In addition, when we used PKS8 as donor strain and simultaneously determined the kinetics of conjugation and promoter P c activity we found that promoter P c is only active during a rather short window of time near the end of the exponential growth phase, which coincides with the period of high conjugation efficiency (Fig. 2).
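The promoter activities above are reported in Miller Units. The text does not spell out how the values were computed, so the following is only a minimal sketch that assumes the classic Miller formula, MU = 1000 × (A420 − 1.75 × A550) / (t × v × OD600); the function name, parameter names and the readings in the usage line are hypothetical and are not taken from the paper.

```python
def miller_units(a420, a550, od600, minutes, volume_ml):
    """Beta-galactosidase activity by the classic Miller formula.

    a420      -- absorbance of the reaction at 420 nm (o-nitrophenol)
    a550      -- absorbance at 550 nm (light scattering by cell debris)
    od600     -- optical density of the culture at 600 nm
    minutes   -- reaction time in minutes
    volume_ml -- volume of culture used in the assay, in ml
    """
    if od600 <= 0 or minutes <= 0 or volume_ml <= 0:
        raise ValueError("od600, minutes and volume_ml must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * od600)


# Hypothetical readings, for illustration only (not data from the paper).
print(miller_units(a420=0.45, a550=0.02, od600=0.6, minutes=15, volume_ml=0.1))
```

Normalizing by culture density and reaction time is what makes values from mid-exponential and stationary phase samples comparable.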
Next we tested whether Rco LS20 , encoded by the pLS20cat gene 27 (rco LS20 ), is responsible for repression of the P c promoter. To this end, we placed the rco LS20 gene under the control of the isopropyl β-D-1-thiogalactopyranoside (IPTG)-inducible P spank promoter and introduced this cassette at the amyE locus of strain PKS3 (harboring the F_I c -lacZ fusion at the thrC locus). Colonies of this strain, PKS5, were blue when grown on Xgal-supplemented LB agar plates, but white when the plates contained IPTG (Fig. S1). Together, these results indicate that promoter P c located upstream of gene 28 constitutes the main conjugation promoter that is negatively controlled by Rco LS20 .
Interestingly, PKS5 colonies were white when plates contained as little as 10 µM of IPTG (colonies shown in Fig. S1). Taking also into account that P spank is a relatively weak promoter, these results indicate that the P c promoter is very sensitive to Rco LS20 .
Figure 1. A. Map of the conjugation region of plasmid pLS20cat. Positions and directions of the genes and the positions of the predicted transcriptional terminators are indicated with arrows and lollipop symbols, respectively. Panels B and C show a blow-up of the 600 bp rco LS20 -gene 28 intergenic region and the different fragments fused to lacZ. Fusions in (B) and (C) were used to study activities of the promoters P c and P r , respectively. Features of the intergenic region are given on the top line. Numbers correspond to the bp position in this region. Names of the cloned fragments are indicated. The small triangles indicate the position of an inverted repeat sequence (see text). Strains containing P c -lacZ fusions in combination with the P spank -rco LS20 cassette were grown on plates containing 10 µM or 1 mM IPTG. The symbols "+++", "+", and "−" reflect intense blue, pale blue, and white colonies after growth on X-gal containing plates. Colors of the colonies were observed after 16 and 48 hours of incubation at 37°C for strains containing pLS20cat or the P spank -rco LS20 cassette, respectively. doi:10.1371/journal.pgen.1004733.g001
Promoter P c is located at an unusually large distance upstream of the first gene of the conjugation operon. We next set out to delineate the position of the P c promoter. As a first approach, we constructed strains containing lacZ fused to different subregions of Fragment I c . Surprisingly, whereas no significant promoter activity was obtained with the strain having lacZ fused to Fragment II c (strain GR10), the βG activities obtained with strains harboring lacZ fused to Fragment III c , IV c , V c or VI c were very similar to those obtained when lacZ was fused to Fragment I c . These results show that promoter P c is located at an unusually large distance of at least 350 bp upstream of gene 28.
Analyses of strains containing lacZ fused to Fragment VII c (GR68) or VIII c (GR70) revealed that promoter activity was sustained only by Fragment VII c (Fig. 1B), showing that the 5′- […] Additional evidence that this sequence constitutes the P c promoter was obtained by primer extension analysis to determine the transcription start site. The detected extension product is shown in Fig. 3B. The deduced transcription start site is located 6 bp downstream of the P c core promoter sequences mentioned above (see Fig. 3A). This position agrees with our RNAseq data, which provide a good estimate of the position of the transcriptional start site. For these experiments, total RNA isolated from pLS20cat-harboring cells was processed as described in Materials and Methods, after which it was employed to generate cDNA libraries using a "directional RNA-seq" procedure that preserves information about the transcript's direction. The schematic representation of the distribution and directionality of the reads presented in Figure 3C shows that the rightward-oriented transcripts, driving expression of gene 28 and downstream genes (shown in green), start close to the divergently oriented rco LS20 gene (shown in red).
The rco LS20 -28 intergenic region contains the weak P r promoter that is activated and repressed at low or high Rco LS20 concentrations, respectively. As for P c , we constructed lacZ fusion strains to characterize the divergently oriented P r promoter responsible for expression of Rco LS20 . Surprisingly, no promoter activity was observed when lacZ was fused to the 570 bp Fragment I r (strain GR25, Fig. 1C). One possibility could be that promoter P r is located even further upstream. This does not seem to be the case however, because Fragment IA r , corresponding to the 1,014 bp region upstream of rco LS20 (strain GR62), also did not provide detectable levels of promoter activity. We then introduced pLS20cat into these strains to study whether it encoded a protein that might be required to activate promoter P r . Colonies of the resulting pLS20cat-harboring strains GR39 (F_I r -lacZ) and GR66 (F_IA r -lacZ) turned pale blue when grown on Xgal-containing plates (Fig. 1C), consistent with pLS20cat providing a protein that activates the P r promoter. In addition, the results show that the P r promoter is located on Fragment I r .
Rco LS20 might be responsible for activating its own promoter. To test this possibility, we engineered strain GR92 that contains the F_I r -lacZ fusion combined with the cassette in which expression of rco LS20 is under the control of the IPTG-inducible P spank promoter. Colonies of strain GR92 were white when grown on agar plates containing only Xgal, but turned pale blue when the plates also contained low levels of IPTG. These results demonstrate that Rco LS20 activates its own promoter. In addition, the fact that colonies only developed a pale blue color suggests that promoter P r is weaker than P c . To test this more directly, we measured P r promoter activities at late-exponential growth phase using strain GR92 grown at different levels of Rco LS20 induction (Table 1). Interestingly, maximum P r promoter activity was obtained when cells were grown in the presence of 50 µM IPTG. Promoter P r activity decreased at higher IPTG concentrations and equaled background levels in the presence of 1 mM of IPTG, indicating that Rco LS20 represses its own promoter at higher concentrations.
Figure 2. Correlation between the kinetics of P c promoter activity and conjugation efficiencies of pLS20cat. Overnight cultures of the strain PKS8 (F_I c -lacZ, pLS20cat) and recipient strain PS110 were diluted to an OD 600 of 0.05. Next, samples taken at different times were used to determine conjugation efficiency of pLS20cat by a standard conjugation protocol (continuous line), and the promoter P c activity by measuring β-galactosidase activity (broken line). T = 0 corresponds to the end of the exponential growth phase. The presented graph corresponds to a representative experiment. The experiment was carried out three times and the corresponding values differed by less than 10%. doi:10.1371/journal.pgen.1004733.g002
Together, these results show that P r is a weak promoter, several hundred-fold weaker than P c . The results also show that Rco LS20 has a triple function. First, low levels of Rco LS20 are required to activate its own promoter P r ; second, at higher concentrations Rco LS20 represses its own promoter; and third, Rco LS20 is responsible for repression of the oppositely oriented P c promoter. This triple function of Rco LS20 is likely to have important consequences for regulation of the conjugation genes (see discussion). It is worth mentioning that whereas maximum activation of the P r promoter was achieved when rco LS20 was induced from the P spank promoter at 50 µM IPTG, efficient repression of the P c promoter was observed by inducing rco LS20 with as little as 10 µM IPTG. Finally, the results obtained show that Rco LS20 is the only pLS20cat protein required for activation and repression of the P r and P c promoters.
Figure 3. Promoter P c is located 461 bp upstream of the start codon of gene 28 and overlaps with the divergently oriented P r promoter. A. Determination of promoter P c and P r sequences by deletion analysis and primer extension. pLS20cat-containing cells harvested at the end of the exponential growth phase were processed to isolate their total RNA, which was used in primer extension assays as described in Materials and Methods. Features of the promoter P c are shown above the sequence. The dotted vertical lines and black straight lines indicate the 5′ end points of the transcriptional lacZ fusions present in strains GR68 and GR70, displaying and not displaying promoter activity, respectively. The core promoter and the putative upstream UP element are indicated by a light blue box; the −35 and −10 hexamers, and the extended −10 motif are indicated with dark blue and green boxes, respectively. The transcription start site determined by primer extension and the direction of transcription are indicated with the corresponding encircled base and a black bent arrow. The thin grey bent arrow corresponds to the 3′ end point of the smaller extension product that coincides with the start of an inverted repeat, which is marked with a pair of thin blue arrows above the sequence. Features of promoter P r are shown below the sequence. The dotted vertical lines and the black straight lines indicate the 3′ end points of the transcriptional fusions with the lacZ reporter present in strains GR82 and GR116, displaying and not displaying promoter activity, respectively. The deduced position of the P r core promoter, and the −35 and −10 boxes are indicated with orange and red boxes, respectively (see text). The transcription start site determined by primer extension and the direction of transcription are indicated with the corresponding encircled base and a black bent arrow. B. Primer extension to determine the transcription start sites of promoters P c (left panel) and P r (right panel). The cDNA products of the primer extension reactions are indicated with bent arrows (lane P). An empty lane in which no sample was run is indicated with "−". Lanes M, M1 and M2 correspond to [G+A] chemical sequencing reactions of a short 230 bp DNA fragment corresponding to the studied pLS20cat region obtained by PCR amplification as described in Materials and Methods. In the case of the P c promoter, a smaller extension product with a relatively strong signal was observed 37 bp downstream of the extension product shown. The longer extension product most likely reflects the correct transcription start site based on the following arguments. First, it is known that AMV reverse transcriptase prematurely terminates cDNA synthesis when reaching a stem loop in the RNA, and that the prematurely terminated molecules map at the bottom of the secondary structure [71,72]. The position of the strong signal coincides with the 3′ end of an inverted repeat (indicated in Fig. 3A). Second, no putative core promoter sequences are evident upstream of the 5′ position of the shorter extension product. Third, if the stronger signal corresponded to the transcription start site, the responsible promoter would be present on Fragment VIII c used for the transcriptional lacZ fusion in strain GR70. However, no promoter activity was observed with this strain (see text). Fourth, the transcription start site based on the longer extension product agrees with the RNAseq data. C. Schematic overview of RNAseq expression data of pLS20cat genes rco LS20 and 28 under conditions with (top panel) and without overexpression of rco LS20 (lower panel). The numbers of rightward and leftward "reads", given in green and red, respectively, are presented on a log2 scale. The positions of the divergently oriented genes rco LS20 and 28 are indicated on the top with a red and green arrow, respectively. Dotted lines and black arrows indicate the approximate start sites of the divergent transcripts driven by the P c and P r promoters. doi:10.1371/journal.pgen.1004733.g003
The divergent P r and P c promoters overlap. As a first approach to determine the position of the P r promoter we constructed strains containing the lacZ gene preceded by different subregions of Fragment I r combined with pLS20cat to provide Rco LS20 in trans. The transcriptional regulator Rco LS20 is a DNA binding protein (see below). Therefore, a lack of P r promoter activity in the reporter assay can be due to the absence of (part of) the P r promoter or the Rco LS20 binding sites required for activation of P r . Since activator proteins generally bind upstream of promoters, we tested constructs having deletions at the 3′ end of Fragment I r (i.e. flanking the rco LS20 gene). Promoter P r activity was detected when lacZ was fused to Fragment VII r (strain GR82), but not when it was fused to Fragment VIII r (strain GR116) (Figs. 1C and 3A). Interestingly, these results suggested that promoter P r would be (partially) located on the 63 bp 5′ region of Fragment VII on which the divergently oriented P c promoter is also located (see above, Fig. 3A). In a complementary approach, we determined the transcriptional start site of promoter P r by primer extension (Fig. 3B). The determined transcription start site of promoter P r is positioned 6 bp upstream of the −35 box of the P c promoter (see Fig. 3A). This implies that promoter P r overlaps with the P c promoter. P r is a weak promoter whose activity requires Rco LS20 (see above). It is therefore unlikely that the −35 and −10 boxes will be very similar to the consensus sequences. The following two sequences that may constitute a σA-dependent promoter are located upstream of the determined P r transcription start site: (i) [5′-aaGAtA-17 bp-TgTAAa-3′] and (ii) [5′-aTaACA-18 bp-aAgtAT-3′] (mismatches with respect to the consensus −35 (5′-TTGACA-3′) and −10 (5′-TATAAT-3′) boxes are given in lower case, see Fig. 3A). The position of the determined transcription start is optimally spaced with respect to the first but not the second possible promoter sequence. Therefore, we favor the first sequence to correspond to the P r promoter. Interestingly, this would imply that the positions of the −10 and −35 boxes of P r correspond exactly to the −35 and −10 boxes, respectively, of the divergently oriented P c promoter.
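The comparison of the two candidate P r sequences with the consensus hexamers can be checked mechanically. The sketch below is an illustration only: it uses the hexamers and spacer lengths exactly as printed in the text, while the function and variable names are hypothetical. It counts mismatches against the consensus and confirms that the lower-case annotation marks the mismatched positions.

```python
# Consensus hexamers quoted in the text for sigma-A-dependent promoters.
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"

# Candidate P r promoter sequences as printed in the text:
# (-35 hexamer, spacer length in bp, -10 hexamer).
# Lower case marks positions flagged as mismatches by the authors.
CANDIDATES = {
    "i": ("aaGAtA", 17, "TgTAAa"),
    "ii": ("aTaACA", 18, "aAgtAT"),
}


def mismatch_positions(hexamer, consensus):
    """Return the 0-based positions where hexamer differs from consensus."""
    return [i for i, (h, c) in enumerate(zip(hexamer.upper(), consensus)) if h != c]


def lowercase_positions(hexamer):
    """Return the 0-based positions printed in lower case."""
    return [i for i, base in enumerate(hexamer) if base.islower()]


def summarize(name):
    """Mismatch counts for one candidate, plus a check of the annotation."""
    box35, spacer, box10 = CANDIDATES[name]
    mm35 = mismatch_positions(box35, CONSENSUS_35)
    mm10 = mismatch_positions(box10, CONSENSUS_10)
    return {
        "spacer_bp": spacer,
        "mismatches_35": len(mm35),
        "mismatches_10": len(mm10),
        "annotation_consistent": (
            mm35 == lowercase_positions(box35)
            and mm10 == lowercase_positions(box10)
        ),
    }


for name in CANDIDATES:
    print(name, summarize(name))
```

Both candidates differ from the consensus at five positions in total, so the mismatch count alone does not separate them; this is consistent with the choice between them resting on spacing relative to the mapped transcription start site.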
The results of the RNAseq experiments presented in Figure 3C (see also above) support the conclusion that the P r and P c promoters overlap. RNA transcripts mapped against the entire intergenic region except for a small region that is located near the start of the rco LS20 gene. The divergent promoters P r and P c , responsible for the left-(red) and rightward (green) oriented transcripts, respectively, must both be located in the small nontranscribed region which corresponds to the position of the P r /P c promoters according to their transcriptional start sites determined by primer extension.
In summary, results obtained by a combination of different approaches demonstrate that divergent P c and P r promoters overlap, if not coincide.

Rco LS20 operator sites
In vivo evidence that Rco LS20 binds to two operator sites; one of them, located more than 85 bp downstream of P c , is required for efficient regulation of promoters P c and P r . Rco LS20 belongs to the Xre-family of transcriptional regulators and is predicted to contain a Helix-Turn-Helix (HTH) DNA binding motif in its N-terminal region [29]. It is therefore likely that Rco LS20 exerts its transcriptional regulatory effects on P r and P c by binding to DNA sequences. We employed the following in vivo approach to gain insights into the location of the Rco LS20 binding sites. Either pLS20cat or the P spank -rco LS20 cassette was introduced into the various lacZ fusion strains (see Fig. 1). The resulting strains were then grown on Xgal-containing LB plates (with or without 10 µM of IPTG for strains containing the P spank -rco LS20 cassette), and expression of the different lacZ fusions in response to Rco LS20 was screened by the color of their colonies.
A schematic summary of the results obtained for promoter P c is given in Figure 1B. In agreement with results presented above, the strain harboring lacZ fused to Fragment I c (PKS3) displayed high P c promoter activity, but no promoter activity was detected when Rco LS20 was provided in trans (strains PKS5 and PKS8). Efficient Rco LS20 -mediated repression of the P c promoter was lost, however, when lacZ was fused to Fragment III c (strains GR16 and PKS32). This strongly indicates that an Rco LS20 operator site is located on the 368 bp Fragment II c and that this operator, which would be located at least 85 bp downstream of the P c promoter, is crucial for efficient repression of the P c promoter. Fragment II c contains an inverted repeat sequence (5′-ATCAAAATCAtgctgcaactTGGTTTTGAT-3′). To test whether this region constitutes an Rco LS20 operator site we constructed lacZ fusions to Fragments IV c or V c , and also engineered derivatives of these two strains containing pLS20cat or P spank -rco LS20 . The 5′ ends of these Fragments are located upstream or downstream of the inverted repeat (see Fig. 1B). The P c promoter present on Fragment IV c and V c was efficiently repressed by Rco LS20 , indicating that the Rco LS20 operator site is present on Fragment V and not on the 212 bp region immediately upstream of gene 28 containing the mentioned inverted repeat. Efficient Rco LS20 -mediated repression of promoter P c was not observed for the lacZ fusion based on Fragment III c (see above). Together these in vivo results strongly indicate that an ~160 bp region, located 85 bp downstream of P c , contains an Rco LS20 operator site that is required for efficient repression of this promoter. We name this operator site O I . Results described above show that promoter P c was not repressed by Rco LS20 when the lacZ fusion was based on Fragment III c (strain GR16) and cells were grown in the presence of 10 µM IPTG. Interestingly though, promoter P c in strain GR16 was efficiently repressed when the concentration of IPTG was increased to 1 mM (see Fig. 1B). This indicates that another Rco LS20 operator site is present on the 201 bp Fragment III c . We name this operator site O II .
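The inverted repeat quoted for Fragment II c can be inspected with a few lines of code. The sketch below assumes that the upper-case stretches are the two 10 bp arms, that the lower-case stretch is the spacer, and that the hyphen printed inside the spacer is a typesetting artifact; the helper names are hypothetical. It compares the left arm with the reverse complement of the right arm.

```python
# Translation table for complementing an upper-case DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (case-insensitive)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Arms and spacer as read from the sequence printed in the text
# (assumption: upper case = arms, lower case = spacer).
left_arm = "ATCAAAATCA"
spacer = "tgctgcaact"
right_arm = "TGGTTTTGAT"

# A perfect inverted repeat would have left_arm == reverse_complement(right_arm).
rc_right = reverse_complement(right_arm)
arm_mismatches = [i for i, (a, b) in enumerate(zip(left_arm, rc_right)) if a != b]

print("left arm:        ", left_arm)
print("rev-comp of right:", rc_right)
print("mismatched positions (0-based):", arm_mismatches)
```

Under these assumptions the arms match at 9 of 10 positions, so the repeat is near-perfect rather than perfect. The in vivo data in the text nevertheless indicate that this repeat is not the Rco LS20 operator.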
Next, we used the same strategy to delineate the regions required for activation of the divergent P r promoter. The results of these analyses are summarized in Figure 1C. Interestingly, the region required for efficient repression of P c by Rco LS20 is also required for Rco LS20 -mediated activation of promoter P r . Thus, Rco LS20 activated the P r promoter when lacZ was fused to Fragments IV r or V r (strains GR97/GR33 and GR102/GR35, respectively) but not when it was fused to Fragment III r (strains GR14/GR9, Fig. 1C). In summary, the in vivo results obtained provide strong evidence that one Rco LS20 operator, O I , is located in an ~160 bp region located 85 bp downstream of promoter P c , and that this operator is crucial for proper repression and activation of promoters P c and P r , respectively. In addition, the results indicate the presence of another Rco LS20 operator, O II , which would be located near promoters P c and P r .
In vitro approaches show that Rco LS20 binds cooperatively to multiple binding sites present in operators O I and O II . To study the position of the Rco LS20 binding sites in more detail we purified Rco LS20 and used it in Electrophoretic Mobility Shift Assays (EMSA). To facilitate purification, we constructed an E. coli strain that expresses an Rco LS20 -His (6) tagged fusion protein. The his (6) -tag was placed at the C-terminus because Rco LS20 contains a predicted Helix-Turn-Helix DNA binding motif close to its N-terminus. The following result demonstrates that the Rco LS20 -His (6) protein is functional in vivo. We constructed B. subtilis strain GR90 in which the expression of rco LS20 -his (6) gene is placed under the control of the inducible P spank promoter, and which also contains the F_I c -lacZ reporter fusion. The activity of promoter P c in strain GR90 was repressed in an IPTG-dependent manner similar to that observed for strain PKS5 containing an inducible copy of native rco LS20 (not shown).
The in vivo transcriptional fusion results presented above indicated the presence of two operators: operator O II , located near promoters P c /P r , and operator O I , present in an ~160 bp region about 85 bp downstream of P c . In addition, this analysis indicated that the ~200 bp region immediately upstream of gene 28 does not contain Rco LS20 binding sites. Accordingly, we began by analyzing binding of Rco LS20 to Fragments X (200 bp region upstream of gene 28), III (expected to contain operator O II ) and XII (expected to contain operator O I ) (see Fig. 4). Independent of the concentrations used, Rco LS20 did not bind to Fragment X (Fig. 4B). Together with the in vivo data presented above, this provides strong evidence that this region does not contain Rco LS20 binding sites. Also in agreement with the in vivo data, Rco LS20 bound to both Fragment III and Fragment XII (Fig. 4B). Interestingly, the retardation patterns obtained for these fragments were similar, and resulted in the appearance of a maximum of two retarded species. The observation that the two retarded species were already present at low Rco LS20 concentrations, when the majority of the DNA migrated to the position of unbound DNA, strongly indicates that Rco LS20 binds cooperatively to at least two binding sites present in each operator. In addition, the observation that DNA fragments entered the gel even at very high protein concentrations indicates that Rco LS20 binds to specific sites and that it does not spread along the DNA. To delineate the O I and O II regions further we used overlapping fragments and subregions of Fragments III and XII as probes. Fragment IIIA (130 bp containing promoters P c /P r ) and Fragment XIIA (125 bp) both produced up to two shifts, and Rco LS20 did not bind to the 46 bp region that separates these two fragments. This latter conclusion is based on comparison of gel retardations obtained with fragments XI and XII.
We next analyzed binding of Rco LS20 to Fragments I, IV and V that encompass both operators. These fragments gave similar retardation patterns. Interestingly, in these cases, Rco LS20 binding resulted in the appearance of four retarded species. All four of these retarded species could be detected at low Rco LS20 concentrations when most of the fragment had not bound Rco LS20 , indicating that Rco LS20 binds cooperatively to multiple sites on these fragments.
To search for the presence of conserved motifs in the two Rco LS20 operators we used the motif-identification programs MEME [30] and BIOPROSPECTOR [31]. These analyses identified an 8 bp conserved motif that is present seven times in O II (Fragment III_A), and four times in O I (Fragment XII_A). We named the seven motifs identified in the O II operator a1-a7, and the four motifs in the O I operator b1-b4 (see Fig. 5). Whereas motifs b1 to b4 are all located on the lower strand, motifs a1-a7 are located on the upper strand, except motif a3. It is worth mentioning some characteristics of motifs a1 to a7. First, motif a5 overlaps with the P c /P r core promoter sequences, and motifs a1-a4 and a6-a7 flank them. Second, motifs a1 and a7 form part of a 13 bp direct repeat. Third, motifs a1 and a3 form an inverted repeat. Fourth, the oppositely oriented motifs a3 and a4 overlap in a region that contains an inverted repeat (5'-TTTCAgTGAAA-3').
Evidence that the identified motif constitutes (part of) the binding site for Rco LS20 was obtained by DNase I footprinting (see below) and mutational analysis. Thus, gel retardation assays showed that binding of Rco LS20 is affected in probes containing alterations in one or two motifs in either operator. For instance, Rco LS20 did not bind to a derivative of Fragment III_A containing mutations in both motifs a1 and a7; and binding was affected when only motif a7 was mutated. Similarly, mutation of motif b1 or b4 resulted in the appearance of only one retarded species instead of the two observed for corresponding fragments without mutations (Fig. 4B). In summary, the results obtained show that the intergenic rco LS20 -gene 28 region contains two Rco LS20 operators that are separated by 75 bp. Operator O II overlaps with promoters P r /P c and the other region is located 75 bp towards the direction of gene 28. Each region contains repeats of a motif whose consensus sequence is 5'-CAGTGAAA-3' and which forms (part of) the binding site of Rco LS20 . Motifs in O I are located on the lower strand, and except for one, motifs in O II are located on the upper strand.
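As an illustration of what locating such motif copies on either strand involves, the following minimal Python sketch scans a sequence for the 8 bp consensus CAGTGAAA reported above. It is not the MEME or BIOPROSPECTOR procedure used in the study; it is a plain sliding-window comparison, and the function names and the `max_mismatches` parameter are hypothetical choices made for this example. The test sequence is the short inverted repeat quoted in the text for the a3/a4 region.

```python
# Minimal sketch: scan both strands of a DNA sequence for an 8 bp consensus.
# Not the MEME/BioProspector method; names and parameters are illustrative.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="CAGTGAAA", max_mismatches=0):
    """List (position, strand, mismatches) for every window matching the motif.

    Positions are 0-based on the upper strand. A lower-strand motif is
    detected as the reverse complement of the consensus on the upper strand.
    """
    seq = seq.upper()
    width = len(motif)
    targets = (("upper", motif), ("lower", reverse_complement(motif)))
    hits = []
    for strand, target in targets:
        for start in range(len(seq) - width + 1):
            window = seq[start:start + width]
            mismatches = sum(a != b for a, b in zip(window, target))
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return sorted(hits)


if __name__ == "__main__":
    repeat = "TTTCAgTGAAA"  # inverted repeat quoted in the text
    print(find_motif(repeat))                    # exact matches only
    print(find_motif(repeat, max_mismatches=1))  # allow one mismatch
```

With one mismatch allowed, the scan reports overlapping hits on opposite strands within this 11 bp repeat, which is the arrangement described for the oppositely oriented motifs a3 and a4.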
Binding of Rco LS20 to operators O I and O II was confirmed by DNase I footprinting. The results presented in Figure 6 confirm that Rco LS20 binds to a region that overlaps with the P r /P c promoters and to another region located about 75 bp downstream of the P c promoter. The combined in vitro results are in line with the in vivo results presented above.
Evidence that proper regulation of the P r /P c promoters involves DNA looping. Operator O I , located at a distance of more than 75 bp from P r /P c , is needed for proper regulation of these promoters. This and other data presented above suggest that proper regulation of the P r /P c promoters involves DNA looping mediated by Rco LS20 bound to operators O I and O II . Due to the intrinsic stiffness of DNA, loops are generally longer than 90 bp because the curvature energy required to make smaller loops is too large, unless the DNA region separating the two operator sites is bent [32]. Operators O I and O II are separated by only 75 bp. Several periodically spaced A/T tracts can result in formation of a static bend [33]. The spacer region contains periodically spaced A/T tracts, and computer-assisted analysis predicts that the spacer region forms a static curve (see Fig. S2). These data prompted us to perform circular permutation assays. Thus, three overlapping fragments of identical size (314 bp) were generated in which the predicted static curve is located at different positions (see Fig. 7A). As expected, these fragments migrated to identical positions when run on a 2% agarose gel (Fig. 7B). However, when run on a native 8% PAA gel the fragments migrated differently and all of them ran more slowly than expected for their size, with the fragment containing the predicted bend in the middle of the fragment migrating slowest (Fig. 7C). These results show that the 75 bp spacer contains a static bend.
If Rco LS20 -mediated DNA looping occurs then it is expected (i) that Rco LS20 will form oligomers thereby creating a DNA binding unit able to bind simultaneously to O I and O II , and (ii) that the two operators are in phase such that the Rco LS20 binding sites have a spatial orientation that is optimal for Rco LS20 binding. We tested both predictions. The oligomerization state of Rco LS20 was studied by two complementary analytical ultracentrifugation approaches (Fig. 8). In sedimentation velocity experiments, Rco LS20 was observed as a single species with an experimental sedimentation coefficient of 3.8 S. This value corrected to standard conditions (s 20,w = 4.1 S) was compatible with an elongated protein tetramer (Fig. 8A). To confirm this result, sedimentation equilibrium experiments were carried out within the concentration range from 10 to 30 µM. The calculated average molecular mass obtained was 85,200 ± 1,700 Da, which corresponds to the tetrameric form of Rco LS20 (Fig. 8B).
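The arithmetic behind assigning an oligomeric state from an equilibrium mass can be sketched as follows. The snippet takes the average mass reported above (85,200 Da, with an uncertainty of 1,700 Da) and computes the subunit mass each candidate state compared in Fig. 8B (dimer, tetramer, octamer) would imply. The monomer mass of Rco LS20 is not stated in this section, so the sketch deliberately stops at the implied values; comparing them with the sequence-derived mass of the tagged protein is left to the reader. The function name is hypothetical.

```python
# Minimal sketch: subunit mass implied by a measured oligomer mass.
# Only the measured mass and its uncertainty come from the text.

def implied_monomer_mass(measured_da, n_subunits, error_da=0.0):
    """Return (subunit mass, subunit uncertainty) in Da for an n-mer."""
    if n_subunits < 1:
        raise ValueError("n_subunits must be at least 1")
    return measured_da / n_subunits, error_da / n_subunits


MEASURED_DA = 85_200.0  # sedimentation equilibrium average mass (text)
ERROR_DA = 1_700.0      # reported uncertainty (text)

if __name__ == "__main__":
    for name, n in (("dimer", 2), ("tetramer", 4), ("octamer", 8)):
        mass, err = implied_monomer_mass(MEASURED_DA, n, ERROR_DA)
        print(f"{name:9s} implies a subunit of {mass:8.0f} +/- {err:.0f} Da")
```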
To test if a specific phasing between O I and O II is important for Rco LS20 to carry out its regulatory role we constructed a derivative of Fragment I, I+5, in which we enlarged the spacer half a helical turn by inserting 5 bp and cloned this fragment in front of lacZ (see Fig. 1B). Next, we tested the responsiveness of promoter P c to Rco LS20 using strains containing either Fragment F_I c or F_I c +5 fused to lacZ. As expected, Rco LS20 , which was provided in trans by pLS20cat, efficiently repressed promoter P c when lacZ was fused to Fragment I c (strain PKS8). Promoter P c was not efficiently repressed by Rco LS20 however, when lacZ was fused to Fragment I c +5 (strain GR191). Thus, colonies of pLS20cat-harboring cells were blue when grown on Xgal-containing plates (see Fig. S3). These results show that enlarging the distance between O I and O II with half a helical turn destroys proper regulation of promoter P c by Rco LS20 . Besides affecting the phasing, the 5 bp insertion might also affect the static curvature of the spacer region. Regardless whether the loss of Rco LS20 -mediated regulation is due to phasing and/or altered curvature, the results provide compelling evidence that Rco LS20 mediates its regulatory effect through DNA looping.
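The statement that 5 bp corresponds to half a helical turn can be made explicit with a short calculation. The sketch below assumes a helical repeat of 10.5 bp per turn, a commonly used textbook value for B-form DNA that is not given in the text, and ignores any change in curvature caused by the insertion. The function name is hypothetical.

```python
# Minimal sketch: rotational offset between two sites after a spacer insertion.
# BP_PER_TURN is an assumed textbook value for B-form DNA, not from the text.

BP_PER_TURN = 10.5


def phase_shift_deg(inserted_bp, bp_per_turn=BP_PER_TURN):
    """Rotation (degrees, 0-360) of one site relative to the other."""
    return (inserted_bp / bp_per_turn * 360.0) % 360.0


if __name__ == "__main__":
    for bp in (0, 5, 10, 21):
        print(f"+{bp:2d} bp -> {phase_shift_deg(bp):6.1f} degrees out of phase")
```

Under this assumption a 5 bp insertion rotates O I by roughly 171 degrees relative to O II, placing the two operators on nearly opposite faces of the helix.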
Next, we analyzed by EMSA if the 5 bp insertion between operators O I and O II affects Rco LS20 binding. As described above, even in the presence of the highest Rco LS20 concentration applied, DNA fragments F_I, F_IV and F_V containing operators O I and O II entered the gel migrating to distinct positions, indicating multiple intramolecular Rco LS20 binding events (Fig. 4B, right column, first, third and fourth panel). Interestingly, however, whereas Fragment I+5 entered the gel at low Rco LS20 concentrations, most of the DNA did not enter the gel at medium or high Rco LS20 concentrations (Fig. 4B, right column, second panel). One possible explanation is that dephasing between the two operators allows Rco LS20 to bind intermolecularly resulting in the formation of high molecular weight nucleoprotein complexes that do not enter the gel. Together, these results support the view that the phasing between O I and O II is crucial for proper Rco LS20 -mediated regulation of transcription.

Discussion
Conjugation is a complex and energy consuming process, involving the generation and transfer of ssDNA, synthesis and assembly of a sophisticated type IV secretion system, and establishment of specific contacts with the recipient cell. Hence, the process of conjugation and expression of the genes involved are strictly controlled. Analysis of the regulation of conjugation genes present on ICEs in bacteria and those on plasmids of Gram-negative bacteria indeed indicates that this is the case [for review see, 5,7]. In our previous studies, we have sequenced and annotated plasmid pLS20cat of the Gram-positive bacterium B. subtilis and identified a large conjugation operon. We have also identified rco LS20 as the gene encoding the master regulator of conjugation, Rap LS20 as the anti-repressor required to activate the conjugation genes, and we showed that the activity of Rap LS20 is in turn regulated by the signaling peptide Phr* LS20 . In this study, we analyzed the underlying molecular mechanism of how the pLS20 conjugation genes are regulated. The results obtained provide compelling evidence that the conjugation genes of pLS20 are controlled by a complex genetic switch, which is composed of at least three intertwined layers. A scheme of the three layers is shown in Figure 9. One of the levels results from the relative positioning of the main conjugation promoter, P c , and the divergently oriented promoter P r , driving expression of the rco LS20 gene (Fig. 9A). The presence of divergently oriented promoters is a common form of gene organization in bacteria, and the (likely) role of this organization in transcriptional regulation has long been recognized [34]. Nevertheless, direct proof for and detailed analysis of the implications on transcriptional regulation are restricted to only a minor fraction of the divergently oriented transcriptional units detected. 
Here, we identified the conjugation promoter P c and showed that it is a relatively strong promoter, which is repressed by the master regulator of conjugation Rco LS20 . Importantly, the position of promoter P c coincides, or at least partially overlaps, with the divergently oriented weak P r promoter. It has been demonstrated that an RNA polymerase can bind only to one of two overlapping promoters [35,36]. Thus, in the special configuration of overlapping promoters the RNA polymerase may itself act as a transcriptional regulator. Recently, Bendtsen et al. [37] described theoretical scenarios, backed up by experimental data, showing that overlapping promoters can indeed result in a transcriptional switch, provided that they have different activities in the absence of the regulatory protein, combined with a regulator that has a strong differential effect on the regulation of both promoters. This is exactly the case for the P c /P r promoter pair; in the absence of the regulator promoter P c is several hundred-fold stronger than P r , and the presence of the regulator strongly represses the P c promoter while activating the P r promoter.
Figure 6. Footprint analyses of the binding of Rco LS20 to the rco LS20 -gene 28 intergenic region. Fragment V, end-labeled at the P c template strand, was analyzed for binding of Rco LS20 -His. First lane (-) was not incubated with protein. Concentrations of Rco LS20 -His, increasing in two-fold steps, ranged from 0.11 to 7.04 µM. The positions of the P c and P r promoters are indicated on the left. Bars on the right reflect the regions covered by Fragments IIIA (F_IIIA) and XIIA (F_XIIA). Positions of motifs a1-a7 and b1-b4 are indicated with green or purple arrows at the right. doi:10.1371/journal.pgen.1004733.g006
The second level of regulation contributing to the genetic switch concerns the multiple roles that Rco LS20 plays in the P c /P r regulation (Fig. 9B). We showed that, on the one hand, Rco LS20 activates transcription of its own weak promoter, P r , thereby generating a self-sustaining positive feedback loop. On the other hand, Rco LS20 functions simultaneously as an efficient repressor of the P c promoter. The dual effect that Rco LS20 has on P c and P r maintains conjugation effectively in the ''OFF'' state. We also showed that the level of rco LS20 induction from an inducible promoter required for efficient repression of the P c promoter was about ten-fold lower than that required for maximum autoactivation of the P r promoter. These differential effects of Rco LS20 on repressing and activating the P c /P r promoters will also contribute towards maintaining conjugation stably in the ''OFF'' state under conditions when conjugation should not be activated. Interestingly, we found that at elevated concentrations Rco LS20 inhibits its own transcription. This negative autoregulation probably functions to keep Rco LS20 within a low concentration range in order to respond accurately to the anti-repressor Rap LS20 to activate the conjugation genes. 
The triple effects Rco LS20 has on the regulation of the P c /P r promoters will also play an important role when Rap LS20 induces the system to switch to the ''ON'' state. In addition to relieving repression of the strong conjugation P c promoter, this will simultaneously annihilate autostimulation of the P r promoter, preventing further synthesis of Rco LS20 , which in turn will contribute in pushing and maintaining conjugation in the ''ON'' state.
Figure 7 (legend, beginning not recovered). [...] the DNA fragments amplified by PCR that were subjected to 2% agarose (B) and 8% native PAA (C). Position of the P c /P r promoters and the Rco LS20 binding motifs within the operators are indicated with grey rectangles and arrows, respectively. The 75 bp region separating O I and O II is shown as an interrupted line and the position of the unique EcoRI site is given. The predicted curvature in this region is represented by the blue arc above the top line, and by a blue shading in the equivalent region in fragments A-C. Fragments A-C were run on 2% agarose or on 8% native PAA gel followed by ethidium bromide staining. doi:10.1371/journal.pgen.1004733.g007
A third level contributing to the genetic switch to activate the conjugation genes involves the DNA looping mediated by simultaneous binding of Rco LS20 to operators O I and O II (Fig. 9C). DNA looping mediated by a transcriptional regulator has been reported for several other regulatory systems in prokaryotes and their analyses have revealed that several features are conserved and necessary for DNA looping to occur [for review see, 38]. Our results showed that the properties of Rco LS20 and the DNA in the P c /P r region comply with the necessary features for Rco LS20 -mediated loop formation. First, using different techniques, we show that Rco LS20 (predicted to contain a helix-turn-helix DNA binding motif in its N-terminal region [29]) is a DNA binding protein and that it binds specifically to two operators, O I and O II . Second, operator O I , which is located more than 85 bp away from promoters P c and P r , is required for efficient regulation of both promoters. Third, Rco LS20 binds cooperatively to both operators. Fourth, dephasing the positions of the two operators by inserting 5 bp in the spacer region destroys proper regulation of the conjugation genes. And fifth, we showed that Rco LS20 forms tetramers in solution. 
This will create a unit containing multiple DNA binding motifs, facilitating cooperative binding to multiple sites within the two operators.
The DNA loop in the P c /P r region of pLS20 is characterized by a small spacer region that separates Rco LS20 operators O I and O II . The spacer length can be used to classify DNA loops into two categories: short or energetic loops, and long or entropic ones. Due to intrinsic stiffness and torsional rigidity of the DNA, loop formation is normally unfavorable for those with spacer lengths shorter than the DNA persistence length (approximately 150 bp), because the curvature energy required for forming such small loops becomes too great. For such short loops to occur, specific features like intrinsic static bending or binding of an additional protein inducing bending are required. In the case of pLS20, in which the operators O I and O II are separated by only 75 bp, we show that the spacer region contains a static bend.
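To give a feel for why short loops are called energetic, the sketch below estimates the elastic bending cost of a loop using the standard worm-like chain expression for a uniformly bent rod, E/kT = Lp·θ²/(2L), with the persistence length Lp of about 150 bp quoted above. Treating the loop as a full circle (θ = 2π) is a simplifying assumption made only for this illustration; real protein-mediated loops are not circular, and the intrinsic bend in the pLS20 spacer, torsional strain and protein flexibility are all ignored, so the numbers are order-of-magnitude guides at best.

```python
# Minimal sketch: worm-like chain bending energy of a uniformly bent DNA loop.
# Circular geometry is an illustrative assumption; intrinsic curvature,
# twist and protein contributions are ignored.
import math

PERSISTENCE_LENGTH_BP = 150.0  # approximate value quoted in the text


def bending_energy_kT(loop_bp, bend_angle_rad=2 * math.pi,
                      persistence_bp=PERSISTENCE_LENGTH_BP):
    """Bending energy in units of kT for a loop of loop_bp base pairs."""
    if loop_bp <= 0:
        raise ValueError("loop_bp must be positive")
    return persistence_bp * bend_angle_rad ** 2 / (2.0 * loop_bp)


if __name__ == "__main__":
    # 75 bp: pLS20 spacer; 2300 bp: lambda loop length cited in the Discussion.
    for length in (75, 150, 2300):
        print(f"{length:5d} bp loop: ~{bending_energy_kT(length):5.1f} kT")
```

Because the cost scales as 1/L, halving the loop length doubles the bending energy, which is why a pre-existing static bend matters most for spacers well below the persistence length.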
The first experimental demonstration that a DNA loop can play a crucial role in transcriptional regulation was reported for the E. coli ara operon in 1984 [39]. Since then, some other operons have also been shown to be regulated by transcriptional regulator-mediated DNA looping [for reviews see, 38,40-43], though the actual number of transcriptional systems for which DNA looping has been conclusively demonstrated is remarkably low. In the case of plasmids, reports demonstrating DNA looping systems are limited to only a few cases. One of these includes regulation of initiation of DNA replication at the beta origin of the E. coli R6K plasmid [44]; and in the case of Enterococcus faecalis plasmid pCF10 it has been proposed that regulation of its conjugation system involves DNA looping mediated by the pheromone-responsive transcriptional regulator PrgX [for review see, 45]. Bio-informatic analyses suggest that DNA looping mediated regulation of transcription is likely to be more common than the few cases for which this has been demonstrated so far. For instance, Cournac and Plumbridge [38] have screened the E. coli genome for the presence of putative ''simple DNA looping systems'' in which looping would involve a single regulator (i.e., this analysis included only transcriptional regulators for which the operator sequence is known, and did not take into account the putative loops that would involve heterologous proteins and/or global transcriptional regulators). Under these restrictive settings, this survey identified 48 genes/operons in which DNA looping mediated regulation is likely to play a role. Interestingly, fourteen of them involve divergently oriented promoters. In the context of our studies, it is worth mentioning the regulation of the conjugation genes located on the integrative and conjugative element ICEBs1 that is present in several B. subtilis strains. 
The gene encoding the transcriptional regulator ImmR, and the excision and conjugation genes are expressed from two divergently oriented promoters that are separated by ~130 bp. At low concentrations, the ImmR protein can bind to six regions, three being proximal to each promoter. It has been suggested that repression of the immR promoter might involve cooperative interactions between ImmR molecules bound to binding sites proximal to both promoters, i.e. DNA looping [46]. Based on the distribution of the operator sites, DNA looping could also be involved in the transcriptional regulation of the Gram-negative plasmids Ti or IncP-plasmids, where divergent promoters have been shown to be involved in controlling both the replication and transfer functions [47,48].
Figure 8. Rco LS20 forms tetramers in solution. The oligomerization state of Rco LS20 protein in solution was studied by two complementary analytical ultracentrifugation assays. A. Sedimentation velocity assay. Sedimentation coefficient distribution c(s) profile corresponding to 10 µM purified Rco LS20 . B. Sedimentation equilibrium assay. Upper part: Sedimentation equilibrium data for Rco LS20 (empty circles) are presented together with best-fit analysis assuming protein dimer (dashed line), tetramer (black line), or octamer (dotted line) species. The data indicate that Rco LS20 is a tetramer at 10 µM. Lower part: The difference between estimated values and experimental data for protein tetramers (residuals). doi:10.1371/journal.pgen.1004733.g008
What are the benefits of DNA looping in general and for the regulation of the conjugation genes of pLS20 in particular? A major consequence of DNA looping is that it results in a high local concentration of the transcriptional regulator at the right place, which would increase its specificity and affinity [for recent review see, 49]. Often, and Rco LS20 is no exception, transcriptional regulators are produced in limited amounts per cell. Low numbers of regulators enhance the possibility of transcriptional fluctuations between individual cells within a population. In addition, the intrinsic stochasticity of transcription (also referred to as noise) affects the temporal effectiveness of transcriptional regulation; again this is especially prominent when the number of regulatory proteins involved is low. Recent evidence indicates that DNA looping contributes importantly to controlling temporal transcriptional noise, as well as dampening transcriptional fluctuations between cells within a population [50,51]. Thus, DNA looping contributes to the tight regulation of promoters, especially when levels of transcriptional regulators are low, by diminishing stochastic fluctuations in transcription.
For some differentiation processes, cell-to-cell or stochastic variability in levels of transcriptional regulators forms the basis for activation of these processes, resulting in different behavior of genetically identical cells within a population [52][53][54]. Examples of these processes are the formation of persister cells, development of natural genetic competence, spore formation and swimming/chaining. It is believed that such a bet-hedging strategy is beneficial for the fitness of the species because there will always be some cells that are prepared to cope with a deteriorating environmental condition that may arise in the near future. However, for other processes, there may not be such an advantage and it would then be important to tightly repress the process at times when conditions for that process are not apt. Conjugation probably is such a process because there is no benefit in activating the conjugation genes when there is no recipient present to receive the plasmid. The fact that the efficiency of pLS20 transfer during growth conditions antithetic to conjugation is below the detection limit (at least six orders of magnitude lower than those observed during optimal conjugation conditions) strongly indicates that conjugation genes are tightly repressed under such conditions. However, the tight repression of conjugation should not compromise the ability of rapidly switching to high expression of the conjugation genes when appropriate conditions occur. In pLS20 this is achieved by the constellation of DNA looping combined with autoregulated expression of Rco LS20 and overlapping divergent promoters of different strength.
Figure 9. Model of the different layers contributing to the genetic switch controlling expression of the pLS20 conjugation genes. A. RNA polymerase acts itself as a switch because it is unable to bind simultaneously to both of the two overlapping and divergently oriented promoters. Consequently, RNA polymerase (the brown ellipse shaped form) binds only one promoter at a time resulting in transcription of only the gene(s) controlled by this promoter. B. Rco LS20 generates a self-sustaining positive feedback loop by activating transcription from its own promoter (P r ) (left panel). This, combined with the simultaneous repression of the divergent conjugation promoter (P c ), results in conjugation being maintained effectively in the ''OFF'' state. Relief of Rco LS20 -mediated repression of the P c promoter results in activation of the conjugation genes (right panel). In addition, this interrupts the auto-stimulation of the P r promoter, preventing further synthesis of Rco LS20 , which in turn will contribute in pushing and maintaining conjugation in the ''ON'' state. The negative auto-regulatory loop of Rco LS20 that probably functions to keep Rco LS20 within a low concentration range (see text) is not presented. C. DNA looping results in a high local concentration of Rco LS20 , increasing specificity and affinity that dampens transcriptional fluctuations between and within individual cells (left panel). This would contribute to tight repression of the P c promoter, keeping conjugation in the ''OFF'' state under conditions antithetic to conjugation without compromising the ability to switch rapidly to a high expression state (i.e. ''ON'', right panel) of the conjugation genes when appropriate conditions occur. rco LS20 and gene 28 (the first gene of the conjugation operon) are indicated with large red and blue arrows, respectively. The same coloring scheme is used for the corresponding promoters (rectangular) and transcripts (thin broken arrows). Activation and repression of transcription are indicated with continuous black lines ending in an arrow and a ''T'' shape, respectively. The red cylindrical structures, which may reflect one or two Rco LS20 tetramers, represent the Rco LS20 oligomer mediating DNA looping. doi:10.1371/journal.pgen.1004733.g009
A well-studied genetic switch involving DNA looping is the one that governs the switch from the lysogenic to the lytic state of the Escherichia coli phage λ [for review see, 55,56]. In the lysogenic or prophage state, phage λ replicates passively with the host while the lytic genes are repressed. This prophage state is extremely stable and can be maintained for many generations. Upon induction of the SOS response, however, a switch is made to the lytic cycle resulting in excision of the phage genome, followed by its amplification and eventually lysis of the cell and release of phage progeny. The early lytic phage λ genes are located in two divergently oriented operons, which are controlled by the lytic promoters P R and P L . A third operon, which encodes amongst others the CI transcriptional regulator, is located in between the two early lytic operons such that the promoter of gene cI, P RM , flanks the divergently oriented P R promoter driving expression of one of the two early operons. In several aspects, functional analogies exist between CI and Rco LS20 although they share only 16% identity at the primary protein sequence level. Both Rco LS20 and CI stimulate and repress their own promoter at low and high concentrations, respectively, resulting in a self-sustaining positive feedback loop while keeping the transcriptional regulator in a low concentration range. 
Above, arguments have been given that for pLS20 this situation, together with the effects of the DNA loop, is important for the tight repression of the P c promoter during conditions in which conjugation is not favourable, while maintaining the sensitivity to be able to respond rapidly to switch on the conjugation genes when appropriate conditions occur. The transcriptional regulation of λ appears to serve a similar purpose. Thus, on the one hand the lytic genes are tightly repressed since spontaneous switching to the lytic cycle occurs less than once every 10^8 generations [57]. On the other hand, mutations that specifically eliminate the negative autoregulation of cI expression impair prophage induction [58,59]. Another analogy between the pLS20 and λ systems is that both regulators, Rco LS20 and CI, can form higher order oligomers, permitting them to bind cooperatively to multiple sites distributed in two operators, effectively resulting in DNA looping which plays an important role in the genetic regulation of the conjugation and the lytic operon, respectively. Taking the analogy further, it is interesting to note that these regulatory systems both control a process of horizontal gene transfer.
However, there are also several differences between the two systems. For instance, whereas regulation of pLS20 conjugation genes involves a short loop of 75 bp, regulation of the λ lytic genes involves a long loop of 2.3 kb. A second difference is that CI protein forms dimers in solution. A pair of CI dimers tetramerizes when binding to the binding sites in one operator and another dimer pair does the same when binding to the other operator. Upon DNA looping, interaction between the two tetramers constitutes a functional octamer. In addition, when a loop is formed another pair of dimers may bind to additional binding sites present in both operators, and this additional bridge is responsible for repressing the P RM promoter. At present, we do not have such detailed insights into transcriptional regulation at the molecular level for Rco LS20 . However, instead of dimers, Rco LS20 forms tetramers in solution, which probably means that the molecular mechanism by which the pLS20 promoters P r and P c are regulated is distinct from the way CI regulates λ promoters P R and P RM . Another argument supporting this assumption is the different configuration of the divergent promoters and the binding sites for the regulator protein. In pLS20, the positions of promoters P c /P r overlap and the Rco LS20 binding sites in O II overlap and flank these core promoters. In λ the binding sites for the CI regulator in one operator overlap the P R promoter and are located upstream of the P RM core promoter sequences. Finally, a major difference between the DNA looping systems of pLS20 and λ is how the switches are induced. In λ, the switch is induced by an SOS response which results in RecA-mediated CI autocleavage. In the case of pLS20, the switch is dictated ultimately by intercellular quorum sensing signaling involving the signaling peptide Phr* LS20 that regulates the activity of Rap LS20 , the anti-repressor of Rco LS20 [29]. 
This quorum sensing system will lead to activation of the conjugation genes when donor cells are surrounded by recipient cells. However, high levels of Phr* LS20 will build up when the majority of the cells that surround a donor cell already contain pLS20, and this will inactivate Rap LS20 and hence block activation of the conjugation genes.
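The qualitative logic of this signaling cascade can be written out as a minimal sketch. It is not a quantitative model: the function name, the numerical Phr* LS20 levels and the threshold are hypothetical placeholders chosen only for illustration, and the all-or-none behaviour is a simplifying assumption.

```python
def conjugation_genes_active(phr_level: float, phr_threshold: float = 1.0) -> bool:
    """Qualitative on/off logic of the pLS20 quorum-sensing cascade.

    phr_level and phr_threshold are arbitrary, hypothetical units;
    only the direction of each regulatory step is taken from the text.
    """
    # High Phr* LS20 inactivates the anti-repressor Rap LS20.
    rap_active = phr_level < phr_threshold
    # Without active Rap LS20, Rco LS20 keeps the conjugation promoter repressed.
    rco_repressing = not rap_active
    return not rco_repressing


# Hypothetical scenarios: few versus many neighbouring cells that carry pLS20.
for label, phr in [("donor surrounded by recipients", 0.1),
                   ("donor surrounded by donors", 5.0)]:
    print(f"{label}: conjugation genes active = {conjugation_genes_active(phr)}")
```

In this simplified picture the conjugation genes are switched on only when little Phr* LS20 has accumulated, that is, when most neighbouring cells lack the plasmid.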
Besides those described here, it is possible that the pLS20 conjugation genes are regulated by additional mechanism(s). For example, the conspicuously long 5' untranslated region upstream of gene 28 is predicted to form complex secondary structures, which might modulate expression of the downstream genes in a variety of scenarios. Currently, a study to elucidate a possible role of this long 5' untranslated region is being carried out in our laboratory.
In summary, in this work we have provided evidence that regulation of the conjugation genes present on pLS20 is based on a unique genetic switch that combines at least three levels of control. These include (i) overlapping divergent promoters of different strengths, (ii) auto-stimulation and repression of the weak P r promoter by the transcriptional regulator at low and elevated concentrations, respectively, combined with simultaneous repression of the divergent strong conjugation promoter, and (iii) DNA looping mediated by binding of Rco LS20 regulator to two operators separated by a short loop. Most likely, the combination of these different layers causes tight repression of the main conjugation promoter P c when conditions for conjugation are not optimal, while allowing the system to switch rapidly to high expression of the conjugation genes when appropriate conditions occur.

Bacterial strains, plasmids, media and oligonucleotides
Bacterial strains were grown in LB liquid medium or on 1.5% LB agar plates [60]. When appropriate, the following antibiotics were added to media or plates: ampicillin (100 µg/ml), erythromycin (1 and 150 µg/ml in B. subtilis and E. coli, respectively), chloramphenicol (5 µg/ml), spectinomycin (100 µg/ml), and kanamycin (10 µg/ml). Table S1 lists the B. subtilis strains used. All of them are isogenic with B. subtilis strain 168. Plasmids and oligonucleotides used are listed in Tables S2 and S3, respectively. All oligos were purchased from Isogen Life Science, The Netherlands.

Transformation
E. coli cells were transformed using standardized methods as described by Singh et al. [61]. For standard B. subtilis transformations, competent cells were prepared as described by Bron [62]. Transformants were selected on LB agar plates with appropriate antibiotics.

Construction of plasmids and strains
Standard molecular methods were used to manipulate DNA [60]. Sequence analysis was used to verify the correctness of all constructs. The same strategy was used to construct B. subtilis strains containing a copy of lacZ fused to the entire or part of the rco LS20 -gene 28 intergenic DNA region. First, the region of DNA to be cloned was amplified using appropriate primers (see Table S3), purified, and digested with the appropriate restriction enzymes. Next, the fragment was used to prepare a ligation mixture together with the integration vector pDG1663 digested with the same enzymes. The ligation mixture was transformed into E. coli XL1-blue cells. The plasmid content of several ampicillin-resistant transformants was checked, and clones containing the insert with appropriate size and orientation were subjected to DNA sequencing to verify the absence of mutations. The names of the pDG1663 derivatives and their characteristics are listed in Table S2. Plasmid DNA of each pDG1663 derivative was used to transform competent B. subtilis 168 cells. Transformants were initially selected for resistance to erythromycin. Next, double cross-over events were distinguished from single cross-over events by selecting transformants sensitive to spectinomycin. The resulting B. subtilis strains containing a single copy of lacZ preceded by different regions of the rco LS20 -gene 28 region at the thrC locus of the B. subtilis chromosome are listed in Table S1. Next, plasmid pLS20cat was introduced into the different lacZ fusion strains by conjugation. B. subtilis strain PKS9 contains a single copy of the rco LS20 gene under the control of the IPTG-inducible P spank promoter at its amyE locus, and this cassette is linked to the spectinomycin resistance gene. Chromosomal DNA of strain PKS9 was used to transform competent cells of the various lacZ fusion strains in order to construct derivatives of the lacZ fusion strains containing the P spank -rco LS20 cassette.
The following strategy was used to construct a translational fusion of rco LS20 with his (6) . The rco LS20 gene was amplified from pLS20cat by PCR using primers oPKS14N and oPKS8. The purified PCR product was digested with NcoI and SalI and cloned into the vector pET28b+ digested with the same restriction enzymes to produce plasmid pRco LS20 -His. B. subtilis strain GR90 contains the rco LS20 -his (6) under the control of the P spank promoter at the amyE locus. To construct this strain rco LS20 -his (6) was amplified from pRco LS20 -His by PCR using primers oGR3 and oGR4. The PCR product was digested with NheI and SphI and cloned into the vector pDR110 digested with the same enzymes to generate pP spank rco LS20 -His. This plasmid was used to transform competent B. subtilis cells selecting for spectinomycin resistance. Double cross-over events were selected by loss of amylase gene.

β-Galactosidase activity assays
β-Galactosidase activities were determined as described previously [63]. Overnight grown cultures were diluted 100 times into fresh prewarmed medium and samples were taken every 45 min.

Conjugation assays
Conjugation was carried out in liquid medium as described previously [29]. The effect on conjugation of ectopic expression of a gene controlled by the IPTG-inducible P spank promoter was studied as follows. Overnight cultures were diluted in prewarmed LB supplemented with IPTG at the indicated concentrations to an OD 600 of ~0.05. Next, samples were taken at regular intervals to determine OD 600 and were subjected to matings with appropriate recipient cells.

RNA isolation and RNA sequencing
Preparation of total RNA samples, RNA sequencing and bioinformatic analysis of RNAseq data were done as described previously [29].

Rco LS20 -His (6) purification
E. coli BL21 (DE3) cells carrying plasmid pRco LS20 -His were used to inoculate 1 litre of fresh LB medium supplemented with 30 µg/ml kanamycin and grown at 37 °C with shaking. At an OD 600 of 0.4, expression of rco LS20 -his (6) was induced by adding IPTG to a final concentration of 1 mM and growth was continued for 2 h. Cells were further processed as described previously [28]. Purified protein (>95% pure) was dialysed against buffer B (20 mM Tris-HCl pH 8.0, 1 mM EDTA, 250 mM NaCl, 10 mM MgCl 2 , 7 mM β-mercaptoethanol, 50% v/v glycerol) and stored in aliquots at −80 °C. The Bradford assay was used to determine protein concentrations.

Gel retardation
In essence, the gel retardation assays were carried out as described before [28]. Thus, different fragments of the intergenic region between gene 28 and rco LS20 were amplified by PCR using pLS20cat as template. The resulting PCR fragments were purified and equal concentrations (300 nM) were incubated on ice in binding buffer [20 mM Tris-HCl pH 8, 1 mM EDTA, 5 mM MgCl 2 , 0.5 mM DTT, 100 mM KCl, 10% (v/v) glycerol, 0.05 mg ml^-1 BSA] without and with increasing amounts of purified Rco LS20 -His (6) in a total volume of 16 µl. After careful mixing, samples were incubated for 20 min at 30 °C, placed back on ice for 10 min, then loaded onto a 2% agarose gel in 0.5× TBE. Electrophoresis was carried out in 0.5× TBE at 50 V at 4 °C. Finally, the gel was stained with ethidium bromide, destained in 0.5× TBE and photographed with UV illumination.
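As a worked example of the quantities involved, the amount of DNA fragment per binding reaction follows directly from the stated concentration and volume. The sketch below assumes the reaction volume is 16 microlitres and uses the identity 1 nM = 1 fmol per microlitre; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def dna_amount_fmol(conc_nM: float, volume_ul: float) -> float:
    """Amount of DNA in femtomoles; 1 nM equals 1 fmol per microlitre."""
    return conc_nM * volume_ul


fmol = dna_amount_fmol(conc_nM=300, volume_ul=16)  # assumed 16 microlitre reaction
pmol = fmol / 1000.0
molecules = fmol * 1e-15 * AVOGADRO

print(f"{fmol:.0f} fmol = {pmol:.1f} pmol of fragment per reaction")
print(f"about {molecules:.2e} DNA molecules")
```

The same helper can be used to work out how much purified protein corresponds to a given protein:DNA molar ratio in a titration series.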

Primer extension experiments
Determination of the transcription start sites by primer extension was performed essentially as described [64]. In brief, total RNA (30 µg) was mixed with 4 pmol of end-labeled oligonucleotide that served as primer; the mixture was heated at 70 °C for 5 min and allowed to anneal for 5 min at 23 °C. The annealed RNA was ethanol precipitated and resuspended, and primer extension was performed with 30 U of AMV reverse transcriptase (Promega) at 42 °C, as recommended by the supplier. The extended cDNA products were analysed by electrophoresis on a denaturing 6% urea-polyacrylamide gel, in parallel with a DNA sequence ladder obtained by chemical sequencing [65] of a DNA fragment encompassing the mapped promoters (see below). The primer used to map promoter P c was 5'-ttctagttctttttacac, while that used for promoter P r was 5'-tctctattgcccacttat. Oligonucleotides were end-labeled with [γ-32P]-ATP and T4 polynucleotide kinase as recommended by the supplier (New England Biolabs). The 186 bp DNA fragment that served as sequence ladder was PCR amplified with primers 5'-acggtctagcgcttacaat and 5'-ttctagttctttttacac, the last one labeled at its 5' end.

DNase I footprinting
The DNase I footprinting assay was carried out as described [66]. The region encompassing the P c /P r promoters was amplified by PCR using primers p28_D16 and Prom28UpBam, and pLS20cat as template. One of the ends was radio-labeled by digesting the fragment with BamHI and subsequently filling in the end with exo− Klenow fragment in the presence of [α-32P]-ATP.

Computer-assisted analysis
Conserved motifs were searched for using the motif-identification programs MEME [30] and BIOPROSPECTOR [31]. Prediction of the static bending properties of DNA sequences was carried out by calculating the global 3D structure according to the dinucleotide wedge model [67]. All graphics work was done using Adobe Photoshop CS2 and Adobe Illustrator. Graphs were plotted using Excel.

Ultracentrifugation
Sedimentation velocity assay. Samples in 20 mM Tris-HCl, 250 mM NaCl, 10 mM MgCl 2 , 1 mM EDTA and 100 mM glycerol, pH 7.4, were loaded (320 µL) into analytical ultracentrifugation cells. The experiments were carried out at 43-48 krpm in an XL-I analytical ultracentrifuge (Beckman-Coulter Inc.) equipped with UV-VIS absorbance and Rayleigh interference detection systems. Sedimentation profiles were recorded at 280 nm. Sedimentation coefficient distributions were calculated by least-squares boundary modelling of sedimentation velocity data using the continuous distribution c(s) Lamm equation model as implemented by SEDFIT 14.1 [68]. Experimental s values were corrected to standard conditions (water, 20 °C, and infinite dilution) using the program SEDNTERP [69] to obtain the corresponding standard s values (s 20,w ).
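For orientation, the solvent part of the correction to standard conditions can be sketched with the textbook relation s 20,w = s exp × (η buffer / η water,20) × (1 − v̄ρ water,20) / (1 − v̄ρ buffer). The sketch below is not the SEDNTERP implementation; the function name is hypothetical, and the buffer density, buffer viscosity, partial specific volume and experimental s value are made-up placeholders rather than values from this study. Extrapolation to infinite dilution is not covered.

```python
def s20w(s_exp: float, eta_buffer: float, vbar: float, rho_buffer: float,
         eta_water20: float = 1.002, rho_water20: float = 0.99823) -> float:
    """Correct an experimental sedimentation coefficient to water at 20 degrees C.

    s_exp       experimental sedimentation coefficient (Svedberg units)
    eta_buffer  buffer viscosity at the run temperature (cP)
    vbar        partial specific volume of the protein (ml/g)
    rho_buffer  buffer density at the run temperature (g/ml)
    """
    buoyancy_std = 1.0 - vbar * rho_water20   # buoyancy term in water, 20 degrees C
    buoyancy_exp = 1.0 - vbar * rho_buffer    # buoyancy term in the actual buffer
    return s_exp * (eta_buffer / eta_water20) * (buoyancy_std / buoyancy_exp)


# Hypothetical inputs, for illustration only.
corrected = s20w(s_exp=4.0, eta_buffer=1.05, vbar=0.73, rho_buffer=1.012)
print(f"s20,w = {corrected:.2f} S")
```

A buffer that is denser and more viscous than water slows sedimentation, so the corrected value is larger than the experimental one.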
Sedimentation equilibrium assay. Using the same experimental conditions as in the SV experiments, short-column (90 µL) SE experiments were carried out at speeds ranging from 7,000 to 10,000 rpm and at 280 nm. After the last equilibrium scan, a high-speed centrifugation run (48,000 rpm) was done to estimate the corresponding baseline offsets. Weight-average buoyant molecular weights of protein were determined by fitting a single-species model to the experimental data using the HeteroAnalysis program [70], and corrected for solvent composition and temperature with the program SEDNTERP [69].

Figure S1 The rco LS20 -gene 28 intergenic region contains a strong promoter that is inhibited by the pLS20cat encoded protein Rco LS20 . Strains were streaked on Xgal-containing LB plates and incubated for 16 hours at 37 °C. When indicated, plates were also supplemented with 10 mM IPTG in the case of PKS5. Strain PKS3 contains a cassette at the thrC locus in which the lacZ gene is preceded by the 570 bp rco LS20 -gene 28 intergenic region (sequences in between the ribosomal binding sites of the divergently oriented genes 28 and rco LS20 ). PKS8 is a derivative of PKS3 harboring pLS20cat. PKS5 is a derivative of PKS3 containing the P spank -rco LS20 cassette at amyE. The negative control strain PKS7 contains a promoterless version of lacZ at the thrC locus. (TIF)

Figure S2 The 75 bp region separating operators O I and O II is predicted to contain a static bend. The global 3D structure of a 256 bp DNA region encompassing operators O I and O II was predicted according to the dinucleotide wedge model using the online webpage http://www.lfd.uci.edu/~gohlke/dnacurve/. For clarity, sequences corresponding to promoters P c /P r and motifs in operators O I and O II are presented as space filling. Positions of the promoters and Rco LS20 binding motifs are given in blue and purple, respectively. (TIF)

Figure S3 Enlarging the distance between operators O I and O II by half a helical turn affects Rco LS20 -mediated inhibition of promoter P c . Strains containing F_I c and F_I c +5 fused to lacZ (PKS3 and GR189, respectively) and their derivatives harboring pLS20cat (PKS8 and GR191, respectively) were spread on Xgal-containing LB agar plates and photographed after 24 hours incubation at 37 °C. (TIF)
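
The "half a helical turn" in Figure S3 can be made explicit with a short calculation. This is a minimal sketch that assumes a B-DNA helical repeat of about 10.5 bp per turn (an assumed textbook value, not a measurement from this study); the function names are hypothetical. The 75 bp figure is the spacer between the operators, so the numbers illustrate the change in phasing caused by the 5 bp insertion rather than the absolute phasing of the individual binding sites.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-DNA in solution


def helical_turns(spacing_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Number of helical turns spanned by a spacer of the given length."""
    return spacing_bp / bp_per_turn


def phase_shift_degrees(extra_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Rotation of one operator relative to the other caused by inserting extra_bp."""
    return (extra_bp / bp_per_turn * 360.0) % 360.0


for spacer in (75, 80):  # wild-type spacer and the +5 bp variant
    print(f"{spacer} bp spacer spans {helical_turns(spacer):.2f} helical turns")

print(f"+5 bp rotates the operators by about {phase_shift_degrees(5):.0f} degrees")
```

Under this assumption a 5 bp insertion rotates one operator by roughly 170 degrees relative to the other, placing the two sets of binding sites on nearly opposite faces of the helix, which is the usual rationale for testing loop formation with half-turn insertions.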