Eubacterial SpoVG Homologs Constitute a New Family of Site-Specific DNA-Binding Proteins

A site-specific DNA-binding protein was purified from Borrelia burgdorferi cytoplasmic extracts, and determined to be a member of the highly conserved SpoVG family. This is the first time a function has been attributed to any of these ubiquitous bacterial proteins. Further investigations into SpoVG orthologues indicated that the Staphylococcus aureus protein also binds DNA, but interacts preferentially with a distinct nucleic acid sequence. Site-directed mutagenesis and domain swapping between the S. aureus and B. burgdorferi proteins showed that a 6-residue stretch of the SpoVG α-helix contributes to DNA sequence specificity. Two additional, highly conserved amino acid residues on an adjacent β-sheet are essential for DNA binding, apparently through contacts with the DNA phosphate backbone. Results of these studies thus identified a novel family of bacterial DNA-binding proteins, developed a model of SpoVG-DNA interactions, and provided direction for future functional studies on these widespread proteins.


Introduction
To be successful, single-celled organisms must efficiently and rapidly adapt to changing conditions. This is often accomplished through exquisite regulatory networks involving numerous, dynamic trans-acting factors. Prokaryotic proteins that bind to nucleic acids govern virtually every cellular process, including nucleoid organization, transcription, translation, and DNA replication, modification, repair, and recombination. Remarkably, most DNA-binding proteins are poorly characterized, and it is likely that many more await discovery.
In our studies of the VlsE antigenic variation system of Borrelia burgdorferi, the causative agent of Lyme disease [1,2], we discovered that these bacteria produce a cytoplasmic protein which specifically binds a DNA site within the vlsE open reading frame. Using a powerful, unbiased approach, we identified that protein as the borrelial SpoVG. A broad range of Eubacteria, including many important human pathogens, encode homologs of SpoVG. The name derives from observations that Bacillus spp. spoVG mutants are unable to complete stage five of sporulation [3][4][5]. Bacillus spp. mutants exhibit additional defects, such as abnormal cell cycle and division [4,6,7]. Staphylococcus aureus spoVG mutants are less virulent than wild-type bacteria, and produce significantly lower levels of several pathogenesis-related factors [8][9][10]. In many organisms, production of SpoVG is developmentally regulated and often utilizes alternative sigma factors [11][12][13][14][15][16][17][18][19][20][21][22][23]. The three-dimensional structures have been determined for SpoVG from S. aureus and other species, and found to be very highly conserved ([24], and Protein Data Bank [PDB] accession numbers 2IA9, 2IA9X, 2IA9Z). However, until our discovery, the biochemistry of SpoVG remained a mystery.
Here we demonstrate that the SpoVG homologues of Borrelia burgdorferi, Staphylococcus aureus, and Listeria monocytogenes all bind to DNA. Further investigations determined that, while SpoVG members are highly similar, they have evolved to bind different consensus sequences. Alanine mutagenesis and domain shuffling revealed residues and microdomains required for generalized DNA binding and for nucleotide sequence specificity.

Identification of B. burgdorferi SpoVG as a Site-specific DNA-binding Protein
As part of our studies of the vlsE system, we postulated that B. burgdorferi expresses a cytoplasmic factor(s) that binds near the recombination site, to help control genetic rearrangement. Addressing that hypothesis, we observed that incubating cell-free B. burgdorferi cytoplasmic extract with vlsE DNA retarded the electrophoretic mobility of DNA, consistent with a DNA-protein complex (Fig. 1). This complex was not evident when cytoplasmic extracts were heat denatured or treated with proteinase K, indicating the need for a properly folded, intact protein (data not shown). Additional EMSAs narrowed the protein-binding site even further. The 70 bp labeled probe F27B-R10 bound the cytoplasmic protein, and binding was competed by the unlabeled version of that DNA sequence, fragment F27-R10 (Fig. 1A and B, lane 4). DNAs flanking those 70 bp did not compete for protein binding (Fig. 1B, lanes 3 and 5). These results indicate that the borrelial protein binds a DNA sequence of approximately 70 bp (X on Fig. 1A), and that neither of the repeat regions flanking the recombination site is involved in protein binding.
To identify the unknown factor, we took advantage of a DNA affinity chromatography method developed in our laboratory, which has identified several other novel sequence-specific DNA-binding proteins [25][26][27]. Using a segment of vlsE that included the high-affinity binding site as bait, a protein of approximately 12 kDa was purified. Buffers containing at least 500 mM NaCl were required to elute the protein off the DNA, indicating that the trans-acting factor had a high affinity for vlsE bait DNA (Fig. 2). Matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) MS/MS analysis identified the protein as being encoded by open reading frame BB0785, a hypothetical protein of unknown function, with a corresponding Mascot score of 212. Control reactions that used the same cell-free extracts and different DNA baits did not pull down this protein (data not shown).
Due to its homology with the SpoVG proteins of many bacterial species, we have retained that name for the borrelial gene and protein (Fig. 3). To confirm that this protein was responsible for the protein-DNA complex formed by cytoplasmic extracts, we purified recombinant B. burgdorferi SpoVG (SpoVG Bb) and repeated the EMSAs. Indeed, recombinant SpoVG Bb bound to probe F27B-R10 (Fig. 4A). This 70-mer was dissected and one 18 bp fragment was found to be required and sufficient for SpoVG binding (Figs. 4B and C). SpoVG Bb bound to its high-affinity target DNA with an apparent dissociation constant (K D) of 308 (±31) nM. Further controls incorporated erp Operator DNA, a region of DNA known to be bound by other B. burgdorferi DNA-binding proteins [25,28,29]; SpoVG Bb failed to bind this sequence, confirming its specificity for vlsE DNA (Fig. 4D). The identified SpoVG Bb-binding sequence does not occur anywhere else in the B. burgdorferi genome, although it is possible that this protein may bind sequences that differ slightly from the site within vlsE. These studies were the first to demonstrate a function for a SpoVG orthologue. The role(s) of SpoVG Bb in vlsE rearrangement is still under investigation, and is beyond the scope of this communication on the biochemistry of SpoVG-DNA interactions.

S. aureus SpoVG is also a Site-specific DNA-binding Protein
Bioinformatics indicate that many spore and non-spore forming bacteria, Gram positive and Gram negative, encode a SpoVG protein (Fig. 3). Given the high degree of sequence conservation, we hypothesized that these orthologues also bind DNA. Compared to wild-type bacteria, S. aureus spoVG mutants express significantly less capA-H mRNA and synthesize reduced levels of capsule [9,17]. We hypothesized that S. aureus SpoVG (SpoVG Sa) might bind DNA near the cap operon promoter. This was confirmed by EMSA, which demonstrated that recombinant SpoVG Sa bound to S. aureus (Newman) cap5 5′ non-coding DNA in a dose-dependent fashion (Fig. 5, lanes 2-4). Heat denaturation or proteinase K treatment eliminated the shifted EMSA band, confirming that this complex contained functional protein (Fig. 5, lanes 11 and 12). In order to determine the relative affinity of the SpoVG Sa-DNA interaction, three independent protein preparations and multiple EMSAs with labeled cap probe were performed with saturating [...]
In a whole-transcriptome screen of an S. aureus spoVG mutant, significant alterations in several other virulence-related loci were documented, including fmtB, esxA, and lukED [9]. The ability of SpoVG Sa to bind near the promoters of those genes was evaluated using each DNA as an unlabeled EMSA competitor against labeled cap5 DNA. The fmtB, esxA, and lukED 5′ non-coding DNAs each competed with labeled cap5 probe for binding of SpoVG Sa (Fig. 5, lanes 5-7). Control studies using unlabeled competitors derived from the esxA or cap5A open reading frames had substantially lesser effects on SpoVG Sa binding to the labeled cap5 probe (Fig. 5, lanes 9 and 10). These results indicate that the 5′ non-coding regions of cap5, fmtB, esxA, and lukED all contain a unique sequence(s) to which SpoVG Sa binds with high affinity and specificity.
Additional EMSAs using a smaller probe and unlabeled competitors narrowed down the high-affinity SpoVG Sa-binding sequence in cap5 promoter-proximal DNA. Probe cap41 contains a SpoVG-binding site (Fig. 6, lane 2). Three unlabeled 28 bp DNAs, which span the 62 bp sequence of probe cap41, were included in EMSAs at molar excesses over probe cap41. This type of analysis prevents a possible bias towards probe and/or competitor length, while controlling for potential high-affinity interactions at the ends of the probe. At a constant concentration of SpoVG Sa, addition of competitor A decreased the amount of bound probe and increased the amount of free DNA (Fig. 6, lanes 3-5). In contrast, 5-fold greater concentrations of competitors B or C did not detectably affect SpoVG Sa binding to probe cap41 (Fig. 6, lanes 6 and 7). These data indicate that the high-affinity binding site is contained within the 28 nucleotides of competitor A. MEME (Multiple EM for Motif Elicitation) analyses of the DNAs bound by SpoVG Sa indicated that all contain at least two 5′-TAATT(T/A)-3′ sequences (Fig. 7A). Competitor A contains two copies of that motif. To evaluate whether this motif is involved with SpoVG binding, a competitor with mutated motifs was incorporated into subsequent EMSAs (Fig. 7C). SpoVG Sa exhibited greater than five-fold higher affinity for the wild-type competitor over the mutant (Fig. 7B). Taken together, these results demonstrate that the S. aureus SpoVG protein preferentially binds to DNA containing a TAATT(T/A) motif. Whether SpoVG Sa will bind to any such sequence, or whether surrounding DNA sequences or structures contribute to protein binding, remains to be determined.
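The motif search described above can be illustrated with a short script. This is only a minimal sketch and not the MEME analysis used in the study: it scans a single strand for the consensus TAATT(T/A) with a regular expression. The example sequence is the Spo41-F oligonucleotide as listed in Table 2, and the function name `find_motif` is a hypothetical helper introduced for illustration.

```python
import re

# Consensus reported for SpoVG Sa targets: TAATT followed by T or A.
# A lookahead is used so that overlapping occurrences would also be reported.
MOTIF = re.compile(r"(?=(TAATT[TA]))")


def find_motif(seq):
    """Return 0-based start positions of TAATT(T/A) on the given strand."""
    return [m.start() for m in MOTIF.finditer(seq.upper())]


# Spo41-F oligonucleotide (Table 2), typed in triplets as in the table
# and joined into one continuous string.
spo41_f = "".join(
    "GAG TAT AAT TAT TTT TAA TTT ACA TAT AAA TAA AAA GGC GAA "
    "AAT AAT GCG GTT TAA AAG TAA TTA AT".split()
)

hits = find_motif(spo41_f)
print(hits)
```

A scan of this kind only reports exact matches to the consensus on one strand; it says nothing about affinity, which in the study was assessed by competition EMSA.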

Different SpoVG Homologues Bind to Different DNA Sequences
The vlsE probe, to which SpoVG Bb binds with high affinity and specificity, does not possess the SpoVG Sa consensus binding motif. These observations suggested that SpoVG homologues might bind to divergent, distinct DNA sequences. With this in mind, we incubated equal concentrations of SpoVG Sa or SpoVG Bb with labeled vlsE and cap41 probes in independent EMSAs. SpoVG Bb bound to the vlsE probe, but not cap41 (Fig. 8A). Likewise, SpoVG Sa bound to only the cap41 probe (Fig. 8B).
To further address our hypothesis that SpoVG homologues act in a similar fashion, but interact with different sequences, we purified the SpoVG homologue from another firmicute, Listeria monocytogenes (SpoVG Lm). SpoVG Lm bound S. aureus cap41 promoter DNA but not B. burgdorferi vlsE DNA (Fig. 8C).

Chimeric SpoVG Proteins Identify Residues Involved with Sequence Specificity
Orthologous proteins are under selective pressure to maintain function, but can diverge in amino acid composition to accommodate the needs of the individual species. Protein structural predictions indicated that SpoVG homologues possess a hypervariable alpha helix at the carboxy terminus (Fig. 3). We suspected that it was this variable domain that contributed to the above-described DNA sequence specificity. To address this hypothesis, we created two different chimeric SpoVG proteins. Residues S66 through E71 of the staphylococcal SpoVG protein were changed to the corresponding borrelial SpoVG residues, creating the chimeric variant SpoVG Sa-Bb. We reciprocated this strategy by exchanging residues Q69 through A74 of SpoVG Bb with those of the S. aureus protein, generating the chimeric protein SpoVG Bb-Sa (Fig. 8A and B, and Fig. 3). For both chimeras, exchanging 6 residues was sufficient to permit binding to the alternative consensus sequence. SpoVG Bb-Sa bound to the cap41 probe, but could no longer bind to the vlsE probe. The SpoVG Sa-Bb protein now bound vlsE DNA. That chimera retained a slight ability to interact with the cap41 DNA, albeit at a dramatically reduced affinity for which a K D could not be calculated (Fig. 8B).
Figure 3 (legend, partial). Note that these analyses grouped the SpoVG protein of the opportunistic oral pathogen Prevotella dentalis with the Gram-positive Bacilli class, although it is currently considered to be a member of the Bacteroides. Consistent with these results, P. dentalis has morphological and biochemical features which differ from other species in the genus Prevotella and class Bacteroides [51]. Red arrows indicate residues demonstrated to be involved in SpoVG-DNA interactions. Green asterisks denote conserved residues that were found to be not required for binding DNA. The magenta box indicates residues of SpoVG Bb and SpoVG Sa involved in DNA sequence specificity. doi:10.1371/journal.pone.0066683.g003
Taken together, these results demonstrate that sequence divergence within the alpha helix contributes to DNA sequence specificity.

Conserved Residues Essential for DNA-protein Complexes
Bacterial proteins that perform analogous functions often retain similar biochemical and structural features in order to interact with their respective ligands [30]. We reasoned that, since three different SpoVG proteins interact with DNA, conserved residues common to all SpoVG orthologues might be required for nonspecific substrate binding. Recombinant SpoVG Sa and SpoVG Bb proteins were produced that included single or double amino acid substitutions at conserved positions (Fig. 3 and Table 1). These mutant proteins were tested for their abilities to interact with their respective high-affinity DNA sequences.
Initial investigations targeted a doublet of positively charged residues (R and K), which were conserved in all SpoVG homologues (Fig. 3). The two charged residues are predicted to project inward from an abbreviated β-sheet, toward the carboxy-terminal alpha helix. Alanine substitutions at positions R53-R54 of SpoVG Bb or K50-R51 of SpoVG Sa impaired DNA binding. Addition of mutant proteins at five-fold excess over the dissociation constant of the wild-type protein still did not produce a detectable EMSA shift (Figs. 3 and 9). To assay residues independently, SpoVG Sa K50A and SpoVG Sa R51A were created. These variants exhibited the same deficiency in DNA binding as the double mutant. Mutations to other conserved, positively charged residues did not have any significant effects on DNA binding (Fig. 3, Table 1, and data not shown). Additionally, none of the other mutant proteins exhibited altered sequence preference (data not shown).
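To put the "five-fold excess over the dissociation constant" in context, the sketch below computes the fraction of probe expected to be bound under a simple one-site equilibrium model. The model itself is an assumption introduced here for illustration (protein in large excess over probe, no cooperativity); the text does not state how binding curves were fitted, and `fraction_bound` is a hypothetical helper.

```python
def fraction_bound(protein_nM, kd_nM):
    """Fraction of DNA probe bound under a simple one-site model.

    Assumes protein is in large excess over probe, so free protein is
    approximately total protein (an illustrative assumption).
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("protein must be non-negative and K_D positive")
    return protein_nM / (kd_nM + protein_nM)


KD_WT = 308.0  # nM, apparent K_D reported for SpoVG Bb on its vlsE target

for fold in (0.5, 1, 5):
    theta = fraction_bound(fold * KD_WT, KD_WT)
    print(f"{fold} x K_D: {theta:.2f} of probe bound")
```

Under this model the bound fraction at a given multiple of K D does not depend on the value of K D itself: a wild-type protein at five times its K D would be expected to shift roughly five-sixths of the probe, which is why the absence of any detectable shift for the mutants is informative.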

Site-directed Mutagenesis did not Affect Multimerization
Replacing charged or polar residues with a small, non-polar, uncharged alanine can interfere with protein-protein interactions, or cause protein misfolding [31]. To that end, sizing chromatography and tandem native/denaturing PAGE analysis were used to examine the native state of SpoVG Sa. The recombinant protein has a molecular mass of 14.6 kDa. By two independent methods, our data indicate that SpoVG Sa forms a 55-60 kDa complex in solution, consistent with a tetramer (Fig. 10). The complexes disappeared when samples were denatured, demonstrating that these bands were not the results of contamination (Fig. 10C). None of the SpoVG Sa mutants exhibited diminished multimer formation, suggesting that the mutants which were impaired for DNA binding still retained their ability to fold correctly and form higher-order species in solution.
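The tetramer assignment follows from simple arithmetic on the masses given above, sketched here. The function name `compatible_oligomers` and the upper limit of eight subunits are illustrative choices, not part of the study.

```python
MONOMER_KDA = 14.6                 # recombinant SpoVG Sa monomer mass (from the text)
OBSERVED_RANGE_KDA = (55.0, 60.0)  # size of the complex observed in solution


def compatible_oligomers(monomer_kda, low_kda, high_kda, max_n=8):
    """Return subunit counts n whose predicted mass n * monomer lies in range."""
    return [n for n in range(1, max_n + 1)
            if low_kda <= n * monomer_kda <= high_kda]


print(compatible_oligomers(MONOMER_KDA, *OBSERVED_RANGE_KDA))
```

A trimer (43.8 kDa) and a pentamer (73.0 kDa) both fall outside the observed range, whereas four subunits predict 58.4 kDa.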

Discussion
The current studies yielded several novel findings that impact a broad range of Eubacterial species. First, SpoVG orthologues from three distinct bacteria bound DNA. For several bacterial species, it is known that these small proteins play key roles in critical cellular processes, which we can now hypothesize are due to SpoVG-DNA interactions. Second, these discoveries help explain why SpoVG was found in association with the S. aureus nucleoid, and the involvement of the Bacillus subtilis orthologue with nucleoid organization [4,32]. Third, while SpoVG proteins are highly conserved overall, the S. aureus and B. burgdorferi proteins interact preferentially with distinct DNA sequences. Given the amino acid divergence among different orthologs' carboxy-terminal alpha helices, we speculate that this feature may also be true for other SpoVG homologues. Finally, we identified two residues, whose biochemical properties are conserved among SpoVGs, that are essential for DNA interactions.
Residues involved with maintaining SpoVG secondary structure model well between species, suggesting that the solved crystal structures are likely to be representative of all orthologous proteins. Merging all of these data, we propose a model for SpoVG binding (Fig. 11). Solvent-accessible, positively charged residues are located adjacent to the alpha helix and can stabilize duplex binding through electrostatic interactions with the phosphate backbone of DNA. These are residues R53 and R54 of SpoVG Bb, and K50 and R51 of SpoVG Sa. Those residues extend into a pocket, while the alpha helix is arranged perpendicularly to provide base-edge specificity through interactions by residues extending into the pocket (Figs. 10 and 11). Notably, the B. burgdorferi and S. aureus alpha helices are out of phase by approximately one turn of the alpha helix, presenting residues with dissimilar hydrogen-donating and hydrogen-accepting capabilities on the upper helical face ([31], and Fig. 11). Independent evolution of the two studied SpoVG proteins resulted in different nucleic acid binding specificity. Our data suggest that SpoVG homologs of different bacterial species may bind to distinct DNA sequences, and possibly exert different effects on physiology. Similar phenomena have been documented that alter the specificity, diversify the signal, and eliminate unwanted cross-talk between sensor histidine kinases and response regulators in two-component signal transduction systems [33][34][35].
The mechanisms by which S. aureus controls virulence-associated genes are poorly understood. The identification of a SpoVG Sa-binding site adjacent to the cap promoter suggests that SpoVG Sa may play a direct role in controlling capsule production. Indeed, cap transcription is significantly reduced in spoVG mutant S. aureus [9], and S. aureus lacking a SpoVG Sa-binding site in the cap promoter exhibit reduced cap transcription [36]. However, expression of the cap operon has been reported to be controlled by at least 12 other regulatory factors [9,17,36-40]. Studies are currently under way to define binding sites of the other regulatory factors, and to understand the ways in which these many regulators interact with each other and with RNA polymerase to control cap expression.
The role of SpoVG Bb in B. burgdorferi vlsE genetic rearrangement remains to be determined. The specialized recombination processes involved are complex and highly regulated, occurring only during mammalian infection but never during tick colonization or in culture [2,41]. Recombination of vlsE is RecA-independent, requires Holliday junction resolvases, and may involve G-quadruplex DNA [42][43][44]. Our preliminary studies suggest that SpoVG may interact with other, as-yet unidentified factors. We are continuing studies to identify other players in the vlsE variation system in order to define the complicated mechanism of borrelial vlsE recombination. It is also possible that SpoVG Bb controls gene expression in B. burgdorferi as do the S. aureus and B. subtilis orthologs.
In conclusion, our data suggest that all SpoVG orthologues are DNA-binding proteins. The two SpoVG homologs characterized in these studies, those of the firmicute S. aureus and the spirochete B. burgdorferi, each bound with high affinity to distinct DNA sequences. Those data suggest that, despite the overall similarities between SpoVG homologs of different species, each may preferentially bind to a different DNA sequence. The amino acid sequence of the SpoVG α-helix was found to be critical for DNA sequence specificity. Two additional, invariant residues are essential for DNA binding, probably through contacts with the negatively charged DNA backbone. These results provide a framework upon which to define the roles of the ubiquitous SpoVG proteins in bacterial pathogenesis and cellular physiology.

Bacterial Strains
S. aureus strain Newman was cultured at 37°C in Luria Bertani (LB) broth with agitation. B. burgdorferi strain B31 was propagated in Barbour-Stoenner-Kelly (BSK)-II broth at 34°C [45]. Whole genomic DNAs from B. burgdorferi and S. aureus were purified using Qiagen genomic DNA extraction kits (Valencia, CA), following the manufacturer's recommended procedure. Purified L. monocytogenes strain EGD-e genomic DNA was a gift from Dr. Sarah D'Orazio.

DNA-affinity Chromatography
A protein was purified from B. burgdorferi cytoplasmic extract based on its affinity for vlsE DNA bait, using previously-described procedures [25,27]. Bait DNA was generated by PCR of the B. burgdorferi vlsE coding region using one 5′-biotin-modified and one unmodified oligonucleotide (Table 2). A single band that eluted in buffer containing 750 mM NaCl was excised and MALDI-TOF [...]
Table 2 (fragment). Fields: name; sequence; target; modification.
BioSpo41; GAG TAT AAT TAT TTT TAA TTT ACA TAT AAA TAA AAA GGC GAA AAT AAT GCG GTT TAA AAG TAA TTA AT; capA5 5′UTR; 5′ biotin
Spo41-F; GAG TAT AAT TAT TTT TAA TTT ACA TAT AAA TAA AAA GGC GAA AAT AAT GCG GTT TAA AAG TAA TTA AT; capA5 5′UTR; none
Spo42-R; ATT AAT TAC TTT TAA ACC GCA TTA TTT TCG CCT TTT TAT TTA TAT GTA AAT TAA AAA TAA TTA ATA TAC

Recombinant Proteins
Purified B. burgdorferi B31 DNA was used as template to clone the borrelial spoVG gene into pET101, creating pBLJ132. Similarly, the S. aureus and L. monocytogenes spoVG genes were individually cloned into pET101 (Invitrogen, Grand Island, NY), producing pBLJ505 and pBLJ340, respectively. Each cloned insert was completely sequenced to confirm that the spoVG gene was free of mutations and in-frame with the hexa-histidine tag. Escherichia coli Rosetta-2 (Novagen, EMD Millipore, Billerica, MA) was independently transformed with pBLJ132, pBLJ340, or pBLJ505. Recombinant proteins were induced by the addition of 1 mM IPTG, and purified using MagnaHis Ni-Particles (Promega, Madison, WI). In order to create conditions conducive to protein-DNA interactions, each SpoVG protein was dialyzed against a buffer containing 100 nM dithiothreitol, 50 mM Tris-HCl, 25 mM KCl, 10% glycerol vol/vol, 0.01% Tween-20, and 1 mM phenylmethanesulfonyl fluoride [25,26,29,48]. Protein purities and concentrations were assessed via SDS-PAGE and Bradford analyses (Bio-Rad, Hercules, CA), respectively. Protein aliquots were snap frozen in liquid nitrogen and stored at -80°C.
To generate mutant SpoVG proteins, site-directed mutagenesis was performed on wild-type plasmid clones, as previously described [49]. Each plasmid was sequenced to confirm accuracy of mutations. All proteins were expressed, purified, and otherwise handled in the same manner. At least two independent protein preparations were used to evaluate each mutant protein that had a phenotypic difference from the wild-type protein. Tables 1 and 2, which describe all probes, competitors, and mutant SpoVG proteins produced in this study, follow the text.

Electromobility Gel Shift Assays (EMSA)
Sequences of oligonucleotides used in this study are listed in Table 2. Oligonucleotide primers specific for the B. burgdorferi vlsE coding region or S. aureus cap5 5′ non-coding region were used to produce labeled probes, with one primer modified to include a 5′ biotin moiety that allowed for chemiluminescent detection. PCR-synthesized probes were purified by gel electrophoresis. Smaller, labeled DNA fragments were annealed by an initial high-temperature melting step, followed by incremental decreases in temperature using a thermocycler [48].
Unlabeled competitor DNAs were also generated via PCR or by annealing oligonucleotides. Larger competitors, consisting of S. aureus capA, fmtB, esxA, and lukED 5′ non-coding DNAs, were PCR amplified, and cloned into pCR2.1 (Invitrogen, Grand Island, NY, USA), generating pBLJ506, 507, 508, and 509, respectively. Each plasmid was sequenced to ensure that the clones were free of mutations. These constructs were then used as templates for PCR generation of specific competitors (Table 2). Amplicons were separated by agarose gel electrophoresis and purified using Wizard DNA Clean-up Systems (Promega, Madison, WI) before use as EMSA competitors.
All probe and competitor DNA concentrations were determined spectrophotometrically. When appropriate, competitor concentrations and oligonucleotide annealing efficiencies were also confirmed using relative ethidium bromide-stained band intensity following electrophoresis through native 20% polyacrylamide gels (Invitrogen, Grand Island, NY).
[...] phase was sterile filtered to 0.22 µm. The flow rate was set to 0.10 ml/minute and elution was monitored at A280. The elution of proteins was calibrated using standards of known molecular weight from GE Healthcare LMW and HMW Gel Filtration Calibration Kits (catalog nos. 28-4038-41 and 28-4038-42). First, the void volume (V 0) of the column was determined by injection of 100 µl of 1 mg/ml blue dextran 2000 (2,000 kDa) in elution buffer with 5% glycerol. Protein standards consisting of thyroglobulin (669 kDa), ferritin (440 kDa), aldolase (158 kDa), conalbumin (75 kDa), ovalbumin (43 kDa), carbonic anhydrase (29 kDa), ribonuclease A (13.7 kDa), and bovine lung aprotinin (6.5 kDa) were individually prepared in elution buffer with 5% glycerol at 10 mg/ml. These standards were then diluted such that each individual protein had a concentration of 2.0 mg/ml and injected in 100 µl aliquots. The log of the molecular masses of these standards was then graphed against resulting elution volumes (V E) as V E/V 0 to produce a linear calibration. Individual experimental protein samples were then run and compared to this calibration curve to estimate molecular mass. Two independent protein preparations were used for each analysis.
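The calibration procedure described above can be sketched as a least-squares fit of log10(molecular mass) against V E/V 0, followed by inversion of the fitted line for an unknown sample. The standard masses below are taken from the text, but the V E/V 0 ratios are hypothetical placeholders, since no elution volumes are reported here; `fit_line` and `estimate_mass_kda` are illustrative helper names.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(ve_over_v0, slope, intercept):
    """Invert the calibration: log10(mass) = slope * (V_E/V_0) + intercept."""
    return 10 ** (slope * ve_over_v0 + intercept)


# Standard masses (kDa) are from the text. The V_E/V_0 ratios are
# HYPOTHETICAL placeholders, not measured values.
standards_kda = [669, 440, 158, 75, 43, 29, 13.7, 6.5]
ratios = [1.10, 1.20, 1.45, 1.63, 1.77, 1.87, 2.05, 2.24]

slope, intercept = fit_line(ratios, [math.log10(m) for m in standards_kda])
print(round(estimate_mass_kda(1.70, slope, intercept), 1))
```

With real elution volumes in place of the placeholders, the same two functions would return the apparent mass of an experimental sample from its measured V E/V 0.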