Transcriptomic insights on the virulence-controlling CsrA, BadR, RpoN, and RpoS regulatory networks in the Lyme disease spirochete

Borrelia burgdorferi, the causative agent of Lyme disease, survives in nature through a cycle that alternates between ticks and vertebrates. To facilitate this defined lifestyle, B. burgdorferi has evolved a gene regulatory network that ensures transmission between those hosts, along with specific adaptations to niches within each host. Several regulatory proteins are known to be essential for the bacterium to complete these critical tasks, but interactions between regulators had not previously been investigated in detail, due to experimental uses of different strain backgrounds and growth conditions. To address that deficit in knowledge, the transcriptomic impacts of four critical regulatory proteins were examined in a uniform strain background. Pairs of mutants and their wild-type parent were grown simultaneously under a single, specific culture condition, permitting direct comparisons between the mutant strains. Transcriptomic analyses were strand-specific, and assayed both coding and noncoding RNAs. Intersection analyses identified regulatory overlaps between regulons, including transcripts involved in carbohydrate and polyamine metabolism. In addition, it was found that transcriptional units such as ospC and dbpBA, which were previously observed to be affected by alternative sigma factors, are transcribed by RNA polymerase using the housekeeping sigma factor, RpoD.


Introduction
Borrelia burgdorferi, the bacterium that causes Lyme disease, is an obligately parasitic spirochete whose enzootic cycle alternates between vertebrates and Ixodes spp. ticks. Survival of B.

Bacteria and culture conditions
All studies described were performed using the B. burgdorferi strain B31-A3 and direct derivatives. B31-A3 is a clonal derivative of the type strain B31 [29,30]. B31-A3 contains the full complement of naturally-occurring DNA elements identified in the sequenced culture of strain B31 with the exception of cp9 [18,31]. Absence of cp9 does not have any detectable effects on infectivity or gene expression [29,32,33]. Generation and validation of each of the four mutations in the B31-A3 background has been described previously [8,16,27,29]. Prior to RNA-Seq analyses, all strains were assessed for the presence of the full repertoire of natural DNA elements by multiplex PCR [34]. The badR and rpoN mutants had apparently lost lp21 during production or subsequent cultivation. lp21 contains a long stretch of untranscribed, repetitive DNA along with ORFs that are involved in maintenance and partitioning, and lp21 is not known to play a role in infection processes [18,30,32]. All other naturally-occurring plasmids were retained in all cultures of the strains. Cultures and harvesting of bacteria were performed essentially as described previously [30]. B. burgdorferi were cultured in Barbour-Stoenner-Kelly II (BSK-II) liquid medium [35]. All strains were grown as at least three distinct cultures. Briefly, 5 ml of medium was inoculated with a 1:100 dilution of bacteria from frozen glycerol stocks, then incubated at 34˚C. Previous studies have demonstrated that the inoculation from -80˚C to warmer media conditions induces substantial changes in transcript and protein levels [36] which can confound studies of gene regulation. To avoid those effects, the initial 34˚C cultures were grown until cell densities reached mid exponential phase (~1x10 7 bacteria/ml). Cultures were then diluted into 10 ml of fresh BSK-II to a final density of 1x10 5 bacteria/ml, and again incubated at 34˚C. All cultures grew with essentially identical division rates. When cultures reached mid-exponential phase (1x10 7 bacteria/ml), bacteria were harvested by centrifugation at 8200xG for 30 minutes at 4˚C. Supernatants were removed and the cell pellets immediately resuspended in 1 ml of pre-warmed (60˚C) TRIzol (Thermo-Fisher, Waltham, MA). Cell suspensions were stored until use at -80˚C.
Direct-Zol miniprep kit (Zymo, CA USA). RNA was eluted from the column with 35 μl RNase-free water and stored at -80C. Yield and integrity were examined using a Bioanalyzer with the RNA 6000 Nano kit (Agilent, CA USA). Electropherograms were examined to ensure that RNA was intact and all samples used for library construction had RIN scores >9. RNA concentration was further determined using a Nanodrop 2000 spectrophotometer (Thermo-Fisher, Waltham, MA).
Illumina cDNA libraries were generated using the RNAtag-seq protocol as described previously [30,37]. Briefly, 840 ng of total RNA was fragmented, dephosphorylated, and ligated to DNA adapters carrying 5'-AN8-3' barcodes with a 5' phosphate and a 3' blocking group. Samples bearing unique barcoded RNAs were pooled and depleted of rRNA using the RiboZero Bacterial Gold rRNA depletion kit (Illumina, CA USA). These pools of barcoded RNAs were converted to Illumina cDNA libraries and sequenced in paired end mode for 75 cycles on the Illumina Nextseq 500 platform (Illumina, San Diego, CA).

RNA-Seq data analysis
As previously described [30,38], reads corresponding to each particular sample were deconvoluted, based on their associated barcode. Up to 1 mismatch in the barcode was allowed, with the caveat that it did not result in assignment to multiple barcodes. Multiplexing barcodes were trimmed using in house scripts [30]. The expected read length following removal of indexing barcodes was 33bp and we attained an average read length of 32.5 bp. Quality of reads was assessed using FastQC (v0.11.5) (http://www.bioinformatics.babraham.ac.uk/ projects/fastqc). De-convoluted reads were trimmed using Trimmomatic v0.36 [39] to remove low quality reads, trim low quality bases from the ends of reads, and to trim any reads in which both pairs did not have a length of at least 25 bases. A custom transcriptome (multi-FASTA) was created by merging the curated B. burgdorferi B31 coding sequences (NCBI Assembly ASM868v2_CDS as of 5/1/17) FASTA with a multi-FASTA of all ribosomal and tRNAs and a multi-FASTA containing a set of recently identified putative ncRNAs [30]. The custom index is available on Figshare (see below). The transcriptome was indexed using the Salmon-index function set for quasi mapping with default settings and auto library detection (v0.8.2) [40]. Mapping and counting was conducted using Salmon (v.0.8.2) in quasi mode with seqBias and GCbias flags activated. The B. burgdorferi genome contains several regions of high similarity encoded on the plasmids that have confounded both transcript quantification and genome assembly in the past [18,31,41,42]. Salmon utilizes a probabilistic model to estimate the true mapping location for ambiguously-mapped reads, providing increased accuracy of estimation of both identical sequences in different locations and of paralogous gene clusters [40,43]. For the examination of read abundance surrounding the rpoS locus reads from each sample were aligned to the B. burgdorferi B31 genome sequence using BWA [18,31,44], and read abundance was examined using Artemis (Release 16.0) [45].
For logistical reasons, cultures of csrA, badR, and the wild-type parent were grown simultaneously, and rpoS, rpoN, and additional cultures of the wild-type parent were simultaneously grown at a later date. Batch effects are a well-known confounding variable in RNA-Seq experiments, and can often account for as much or more variability than the biological effect in question [46]. To account for this, data from each mutant were compared with its simultaneouslygrown wild-type and other mutant strain. Results from each set of cultures clustered well by principal component analysis, whereas the two batches of wild-type bacteria were separate from each other, supporting our decision to compare mutants to wild-type samples only within batch (S1 Fig). Results from the csrA, badR, and the wild-type parent were compared with each other, and the rpoS, rpoN, and wild-type were compared separately.
Downstream data analysis (differential expression testing, plotting, significance filtering, and intersection identification) was performed in RStudio (1.0.143) (http://www.rstudio.com). Differential expression analyses were conducted using DESeq2 (v1.41.1) both with and without a Benjamini-Hochberg FDR correction set to alpha = .05 [47]. Thirty-nine transcripts had less than three total reads summed across all samples and were not tested. PCA and MA plots were generated using DESeq functions plotMA and plotPCA. Count data was extracted using the DESeq plotCounts function and replotting the data using ggplot2 [48]. Significance filtering was set at a padj value less than 0.05 and a log2FoldChange of greater than one. The availability of all code and reference data utilized in these studies is openly available and is described below in the Data Availability section.
These analyses used the ncRNA list and nomenclature of the first comprehensive analysis of the B. burgdorferi noncoding transcriptome [30]. A later study by other researchers used different criteria for calling putative ncRNAs, resulting in a somewhat different list [49,50]. Although the later list was not used in the current analyses, our raw data are readily accessible to anyone who wishes to analyze them against those or other transcript sets (see Data Availability, below).

Quantitative reverse transcription-PCR (qRT-PCR)
Purified RNAs from each of the above-described cultures was also assayed by qRT-PCR for comparison with RNA-Seq results. Approximately 1 μg of isolated RNA was treated with Turbo DNase I for 45 minutes to remove contaminating genomic DNA (Thermo-Fisher, Waltham, MA). Normalized amounts of RNA were converted to cDNA using SuperScript (BioRad, Hercules, CA). cDNAs were diluted 1:20 for use in qPCR. SYBER-Green based qPCR was performed essentially as described previously [30,51] using a CFX96 Touch (BioRad, Hercules, CA). Briefly, 2 μl of cDNA was combined with 5 μl 2X iTaq qPCR Supermix (BioRad, Hercules, CA), 300 μM of appropriate oligonucleotide primer pairs (S1 Table), and nuclease free H 2 O to a final volume of 10 μl. Reactions were performed in technical triplicate. Cycling conditions consisted of an initial melt at 95˚C for 2 minutes followed by 40 cycles of PCR with a 15 second melt at 95˚C, a 15 second extension at 60˚C and fluorescence detection. Melt curves were performed by increasing reaction temperatures in 0.5˚C increments from 65˚C to 95˚C. Melt curves confirmed that each particular set of primers and template generated a single specific product. Transcripts were targeted that do not have associated antisense RNAs. Data from qRT-PCR were analyzed by the ΔΔCt method [52] normalized to ftsK, which has previously been shown to be stably expressed during a variety of different culture conditions [30].

Quantitative PCR analyses of native plasmid lp28-4
Total DNA was isolated from all five strains. For each, qPCR was performed, targeting lp28-4 (primer pair qlp28-4F and qlp28-4R) and the dnaA gene at the chromosome's center (primer pair qDnaAF and qDnaAR) (S1 Table). The relative abundance of each strain's lp28-4 was normalized to its chromosome, using the ΔCt method.
proteins [23][24][25][26]53]. To further investigate these hypotheses, RNA-Seq analyses were performed on a csrA null mutant, the first such global analysis of B. burgdorferi CsrA. We observed that 239 transcripts were significantly different between the csrA mutant and the wild-type parent (13.4% of the transcriptome) ( Fig 1A, Table 1 and S2 and S5 Tables). Of the affected transcripts, 153 had reduced abundance and 86 had increased abundance in the mutant. Approximately two thirds (158 transcripts or 66%) of the differentially expressed (DE) transcripts consisted of ORF mRNAs [30]. The majority of DE transcripts (171/239 or 71.5%) were plasmid-encoded, and the majority of these were reduced in the mutant (116/169 or 68.6%). Importantly, deletion of csrA did not have significant effects on any of the other three regulatory proteins being studied, indicating that the observed effects were not due to CsrA working through BadR, RpoS, or RpoN. The RNA-Seq results were validated by performing qRT-PCR analyses of cdaA, glpF, glpK, glpD, bosR, spoVG, bbk32, dbpA, and sodA transcripts (Fig 2).
The known or proposed functions of the DE ORFs support the hypothesis that CsrA controls a diverse regulon. Deletion of csrA negatively affected transcripts for several outer surface Red points indicate transcripts which met the criteria of a log2 fold change >1 and an adjusted p-value (padj) < 0.05. The X-axis is given as mean normalized count across compared samples and the Y-axis as log2 fold change between conditions. https://doi.org/10.1371/journal.pone.0203286.g001 Table 1. Differentially expressed transcripts when comparing the csrA mutant to wild-type, listed in order of genome reference number. The included transcripts met the criteria of >1 log2 fold-change and an adjusted p-value (padj) when comparing the csrA mutant to wild-type. A total of 239 transcripts were differentially regulated, not including the mutated gene, by the mutation. The first column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The second column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are identified according to the names or numbers assigned to genes and transcripts by the initial genome sequencing of B. burgdorferi strain B31 [18,31] or from our previous analyses of that strain's ncRNA transcriptome [30]. proteins that are involved with transmission from ticks and with survival within the vertebrate host, including ospC (down 2.7-fold), dbpB (down 6.3-fold), vlsE (down 5.3-fold), and arp (down 4.0-fold) [54][55][56][57][58][59][60][61][62][63]. In the remainder of this section, we focus on examples of transcripts that were affected only by the csrA mutation, while subsequent sections present information on transcripts that were impacted by csrA and one or more other mutations (i.e. regulome overlaps). As noted above, previous studies of csrA mutant B. burgdorferi produced varied results, with some showing significant impacts on rpoS or RpoS-affected transcripts, while others did not observe such effects [24,28]. Under the growth conditions employed in our studies, deletion of csrA did not significantly change levels of either the rpoN or rpoS mRNAs (Fig 3). However, the mutant did demonstrate significant changes in levels of several transcripts that were previously seen to be altered in some rpoN or rpoS mutants [5,6]. For example, both ospC and dbpB were expressed at significantly lower levels in the csrA mutant than in the wild-type (Table 1). Those results suggest that at least some members of the previously-described RpoS regulon are also controlled through RpoS-independent mechanisms (see below). Although deletion of csrA did not have any significant effects on either the rpoN or rpoS mRNAs, there was a significant, 25.7-fold decrease in the level of antisense RNA ncRNA0071, which is transcribed within the rpoN ORF (Fig 3) [30]. Please note that this report uses the ncRNA nomenclature of Arnold et al., 2016 [30], which was the first published description of the B. burgdorferi ncRNA transcriptome. Among other notable observations, numerous transcripts that encode outer-surface lipoproteins that play roles in vertebrate infection were present at lower amounts in the mutant. These included the fibronectin-binding protein-encoding bbk32 and several members of the erp family [64][65][66].

RefSeq CDS/Custom
Multiple transcripts encoding key nutrient scavenging proteins were present at higher levels in the mutant, including guaA, BB_B23, and BB_B29. GuaA and the BB_B23-encoded protein are involved in purine salvage and the uptake of hypoxanthine, respectively. GuaA is essential for the borrelial infectious cycle [67], and BB_B23 mutants are defective in vertebrate infection [68]. ORF BB_B29 encodes a putative glucose transporter and, while not absolutely essential for vertebrate infection, mutants are significantly defective for growth on certain carbohydrate sources [69]. Together, these data support the hypothesis that CsrA controls a range of systems that are important for survival in both the tick and vertebrate hosts.

BadR controls transcripts of genes associated with catabolite uptake and utilization
Consistent with the previous array-based study of BadR [16], levels of a large number of transcripts were altered by deletion of badR. A number of these transcripts encode proteins involved in the uptake of catabolites from the extracellular milieu. Under the studied growth conditions, a total of 234 transcripts were DE in the badR mutant: 134 decreased and 100  Table 2 and S3 and S6 Tables). Similar to what was observed for the csrA mutant, approximately one third of DE transcripts were putative ncRNAs (69/234 or 29.5%). Of the 100 transcripts that increased in the badR mutant, only 14 (14%) were putative ncRNAs, whereas 55/134 (41.0%) of the reduced transcripts were putative ncRNAs. The ratio of affected transcripts was slightly skewed towards the plasmids, with 129 (55.1%) transcripts originating from the small native replicons. Transcripts in elevated abundance reflected a plasmid vs. chromosome bias consistent with the overall trend, with 65/100 (65%), whereas the bias was not maintained for the transcripts of reduced abundance 64/134 (47.8%). RNA-Seq results were validated by qRT-PCR analyses of select transcripts (Fig 2). A number of these transcripts were affected only in the badR mutant, while many were also altered in the csrA mutant. Examples of transcripts affected only by ΔbadR are presented below, and those that overlapped with ΔcsrA are presented in the subsequent section. We note, however, that deletion of badR did not affect levels of csrA transcript, indicating that regulon overlaps were due to convergence, rather than one protein operating through the other.
Glycerol is a key nutrient for B. burgdorferi during tick colonization, and mutants unable to metabolize that carbohydrate are significantly impaired [70]. The operon encoding import and catabolism of glycerol consists of four genes glpF, glpK, an ORF of unknown function  Table 2. Differentially expressed transcripts when comparing the badR mutant to wild-type, listed in order of genome reference number. The included transcripts met the criteria of >1 log2 fold-change and an adjusted p-value (padj) when comparing the badR mutant to wild-type. A total of 234 transcripts were differentially regulated, not including the mutated gene, by the mutation. The first column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The second column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are identified according to the names or numbers assigned to genes and transcripts by the initial genome sequencing of B. burgdorferi strain B31 [18,31] or from our previous analyses of that strain's ncRNA transcriptome [30].     (ORF BB_0242), and glpD [18,70,71]. The glpFKD operon appears to be under complex control, being reported to be impacted by several regulatory factors, including RpoS, SpoVG, cyclic-di-GMP, and ppGpp [70,[72][73][74]. In addition, an antisense RNA that is transcribed within the glpF ORF, ncRNA0042, was recently identified in two independent studies [30,50]. Upon deletion of badR, glpFKD operon transcript levels increased 2-to 4-fold, and ncRNA0042 was reduced 3.6-fold (Fig 4). Thus, both BadR and ncRNA0042 need to be considered along with the other proteins and small molecules in future studies on the mechanism controlling borrelial glycerol utilization. A previous study found that badR deletion had a significant effect upon RpoS expression [16]. In contrast, the conditions utilized in our studies did not show any significant differences in rpoS mRNA levels between the badR mutant and its wild-type parent. Transcripts of rpoS were readily detected in both strains. Additionally, there were no significant changes in the expression levels of any of the known regulators of rpoS (dsrA, hfq, bosR or rpoN) or RpoSaffected transcripts such as ospC (Fig 5). Possible reasons for these results are discussed below.

The CsrA and BadR regulons share substantial overlap
To better understand how these two regulatory factors interact, we analyzed their differentially-expressed transcript sets for intersection and divergence (Figs 6 and 7 and S4 Table). As noted above, deletion of either regulatory factor did not have any detectable impact on expression of the other. Compared with the wild-type parent, the ΔcsrA and ΔbadR mutants showed significant changes in the same 150 transcripts (S4 Table). Of these, 80 transcripts were reduced in both mutants, 66 were increased in both mutants, 1 was reduced in the rpoN, csrA, and badR mutants, and only 3 were affected in opposite directions (i.e. reduced in the badR mutant while increased in the csrA mutant, or vice versa), 1 of which was also at reduced abundance in the rpoN mutant (S4 Table). In contrast, the set of transcripts affected by CsrA alone consisted of 19 increased members and 68 reduced transcripts (Tables 1 and 2 and S4 Table), and, of the transcripts impacted by BadR alone, 53 were reduced and 31 were increased (Tables  3 and 4 and S4 Table).
Of the 80 transcripts that were decreased in both mutants, there was a slight bias towards the plasmids, with 56.3% of transcripts being plasmid encoded. Approximately half of the dually reduced transcripts (52.5%) were putative ncRNAs. Conversely, the vast majority of the 67 dually increased transcripts were transcripts encoding ORFs, with only 10 putative ncRNAs (14.9%). The bias for plasmid/chromosomal origin was similar to the ratio for reduced transcripts, with 62.6% of increased transcripts originating from the plasmids.
BadR and CsrA are known nucleic acid-binding proteins and could mediate all of the DE transcript changes directly. However, both mutants exhibited altered expression of spoVG, which encodes a regulatory protein with site-specific DNA-/RNA-binding activity [74][75][76]. The mRNA encoding SpoVG was significantly increased by 3-fold in the badR mutant and 2.3-fold in the csrA mutant. It is possible that some of the BadR-and/or CsrA-affected ORFs of unknown function also encode nucleic acid-binding proteins.
Transcripts encoding proteins that are involved in chitobiose uptake (chbCAB) were significantly enhanced by 35-to 65-fold in the badR mutant, as was previously observed [16]. Moreover, those transcripts were among the most highly increased transcripts in that data set (Fig 8). Deletion of csrA also increased the expression of chbCAB, by 2.8-to 5.2-fold. Chitobiose is a dimer of N-acetylglucosamine that can be used as an energy source. It is also required for peptidoglycan synthesis [77,78]. Transcripts encoding other proteins involved with cell wall synthesis were also affected by deletion of either csrA or badR. The nanE transcript, encoding the epimerase that converts N-acetylmannosamine-6-phosphate into GlcNAc-6P, was increased 2.1-fold by the absence of badR. MurG is a key enzyme involved in the formation of the peptidoglycan cell wall, transferring GlcNAc moieties from lipid intermediate I to lipid intermediate II; murG transcript levels were increased in both the badR and csrA mutants, by 3-fold and 2.6-fold, respectively.
Other carbohydrate-utilization pathways affected by both the badR and csrA mutants include increased levels of transcripts encoding a putative hexose transporter IIABC component (BB_0408) (2.5-fold and 2.8-fold, respectively) and a subunit of another putative hexose/ pentose ABC transporter (BB_0678) (2.2-fold and 2.9-fold, respectively).
Several transcripts involved in the uptake and catabolism of polyamines were significantly increased by deletion of either badR or csrA. Polyamines are cationic organic bases that are present in significant levels within vertebrate hosts, can affect a wide variety of biological processes, and are often involved in stress and osmotic responses. The polyamine uptake system  The differentially expressed transcripts from either the csrA or badR mutant vs. wild-type parent were compared for any overlap in identity using the UpsetR package. Affected transcript sets were first filtered by higher or lower abundance to create 8 possible sets (Four regulators, at higher or lower abundance). The horizontal bars on the lower left side indicate the number of transcripts in each set (e.g. 153 transcripts were significantly reduced in the csrA mutant). Dark dots indicate each set, and dark dots connected by a line indicate paired sets (e.g. the leftmost dots and line refer to the paired set of transcripts that were reduced in both in B. burgdorferi, and the polyamines spermine and spermidine, are important for control of bacterial growth and expression of infection-associated proteins [79,80]. Polyamines can be imported through the PotABCD transporter or produced de novo from arginine. B. the csrA and badR mutants). The vertical bars indicate the number of transcripts in each set or paired set (e.g. 80 transcripts were reduced in both the csrA and badR mutants, 68 transcripts were reduced only in the badR mutant, 66 transcripts were elevated in both the csrA and badR mutants, etc.).
https://doi.org/10.1371/journal.pone.0203286.g006 Fig 7. Interaction network of four essential regulatory factors. An interaction network was generated using Adobe Illustrator using the data in S5-S8 Tables. Transcripts that were found in higher abundance by the deletion of a particular factor are indicated by a blocked line stretching from the regulator to the transcript as the regulatory factor would be expected to naturally lower transcript abundance. Those that were found in lower abundance follow the same scheme but are indicated by arrows as the regulatory factor would be expected to naturally increase transcript abundance.
Most transcripts from lp28-4 appeared to be decreased in both the badR and csrA mutants. To further examine these observations, we isolated genomic DNA from all 5 strains, then used qPCR to examine the copy numbers of lp28-4 relative to the chromosome. We found that the ratio of lp28-4 to chromosome was reduced in the csrA (0.45:1) and badR (0.43:1) mutants, but that relative abundance also fluctuated in the rpoN (1.37:1) and rpoS (0.81:1) mutants (Fig 9). B. burgdorferi has one of the most complex known bacterial genomes, and smaller replicons/ plasmids may occasionally be lost during cultivation. As was performed for all of the strains used in the current studies, it is common practice to examine the replicon profile of B. burgdorferi strains by use of plasmid-specific PCR. At least three hypotheses could explain the results on lp28-4 transcript levels: either those initially-clonal cultures now contain a mixture of bacteria with and without the plasmid, many transcripts from lp28-4 are under the control of BadR and CsrA, or BadR and CsrA have an influence on lp28-4 copy number. Considering the apparent variation in plasmid copy number we opted not to make any further inferences regarding transcripts from genes on lp28-4. Plasmid lp28-4 is not necessary for mammalian infection, although it has some role in tick colonization, so further investigation of these results is warranted [32,81,82]. No other borrelial replicon exhibited such a broad expanse of DE Table 3. Differentially expressed transcript when comparing the rpoS mutant to wild-type. The included transcript met the criteria of >1 log2 fold-change and an adjusted p-value (padj) when comparing the rpoS mutant to wild-type. One transcript was impacted, not including the mutated gene, by the mutation. The first column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The second column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). The ncRNA is listed according to the nomenclature the previous analyses of the strain B31 ncRNA transcriptome [30].  Table 4. Differentially expressed transcripts when comparing the rpoN mutant to wild-type. The included transcripts met the criteria of >1 log2 fold-change and an adjusted p-value (padj) when comparing the rpoN mutant to wild-type. A total of 6 transcripts were differentially regulated, not including the mutated gene, by the mutation. The first column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The second column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are identified according to the names or numbers assigned to genes and transcripts by the initial genome sequencing of B. burgdorferi strain B31 [18,31] or from our previous analyses of that strain's ncRNA transcriptome [30]. transcripts, so the current study did not evaluate copy numbers of the other naturally-occurring plasmids.

RefSeq CDS/Custom Transcript ID
ospC and dbpBA can be transcribed independently of either alternative sigma factor RpoN and RpoS are the only alternative sigma factors of B. burgdorferi, and both are essential for transmission from ticks to vertebrates and for establishment of vertebrate infection [6,9]. For this reason, we chose deletion mutants in these two sigma factors as an early step in dissecting the gene regulatory networks important for pathogenesis. It is well established that many regulatory factors of B. burgdorferi are controlled by environmental stimuli [3,4,36,[83][84][85][86][87][88]. In order to control for such potential effects, we performed transcriptome analysis of a single, specific culture condition: mid-exponential phase at 34˚C. Both rpoS and rpoN mRNAs were readily detected at significant levels in the wild-type parental strain under these growth conditions. Previous immunoblot-or array-based analyses of rpoS mutants, cultured under conditions that otherwise induce high-level expression of RpoS, observed that levels of numerous transcripts, such as ospC and dbpBA, were substantially affected by the rpoS mutation [5,6,8]. This led to hypotheses that RpoS-RNA polymerase holoenzyme might directly transcribe all of those genes. However, under the current culture conditions, ospC, dbpBA, and other transcripts were produced at detectable levels in the rpoS mutant. Moreover, all of those transcripts were expressed at approximately equivalent amounts in wild-type and rpoS mutant samples. These results, combined with those presented below for the rpoN mutant, indicate that the ospC, dbpB, dbpA, and many other genes can be transcribed by RNA polymerase using the "housekeeping" sigma, RpoD. We validated these studies with qRT-PCR on samples harvested from independent cultures, which also readily detected ospC and dbpBA transcripts in the rpoS mutant.
Overall, almost no transcripts were detectably impacted by deletion of rpoS under the studied culture conditions. Only 1 transcript, ncRNA0317, was detected as DE (Fig 1C and  Table 3). This transcript is a putative small RNA that is encoded downstream of the ospB ORF, in an intergenic location. While RpoS protein production is also controlled post-transcriptionally, none of the transcripts of known regulators DsrA, Hfq, or BBD18, were observed to be affected in either the rpoS deletion mutant or any of the other examined mutants.
Deletion of rpoN resulted in very few significant changes ( Fig 1D and Table 4). Transcript levels of BB_0449, the gene that is divergently-transcribed from rpoN, were approximately twice as high in the mutant, possibly due to transcriptional read-through from the inserted antibiotic resistance gene. The only other significant difference was the reduced expression of several transcripts on lp28-1, including BB_F23, which encodes a putative partition protein.
Considering these data, it is possible that rpoN could affect the replication of lp28-1. Notably, none of the transcripts previously hypothesized to require RpoS, such as ospC or dbpBA, were affected by deletion of rpoN.
The rpoS locus has previously been shown to have both an RpoN-and an RpoD-dependent promoter [1,11,89], leading us to investigate read coverage plots for the region around rpoS in the wild-type and rpoN mutants (Fig 10). These analyses highlighted that transcription initiating from the flgI and flgJ genes that are directly 5' of rpoS appears to continue into the rpoS ORF [30,90]. Consistent with that observation, the previously-mapped RpoD-dependent promoter of rpoS is located within the flgJ ORF [11]. Altogether, these results indicate that essentially all transcription of rpoS under the assayed culture conditions resulted from RNA polymerase-RpoD holoenzyme using the previously-mapped promoter within flgJ and/or by read-through from the promoter 5' of flgIJ. The read coverage plots indicated that substantial amounts of transcripts from flgJ were terminated in the space between the flgJ and rpoS ORFs (Fig 10). No intrinsic (Rho-independent) terminators are present in this region [30], implying that some other type of transcriptional regulatory element exists immediately 5' of the rpoS ORF.

Discussion
In an effort to determine whether there are overlaps between borrelial regulons, we simultaneously examined wild-type and pairs of mutant B. burgdorferi that were grown under a single, specific condition, so that each batch of RNA-Seq data sets could be compared with each other. Several important conclusions can be drawn from these results: Both CsrA and BadR affect levels of numerous transcripts, evidently independently of RpoS; the CsrA and BadR regulons include transcripts affected by only one of those proteins, and also include a substantial number of transcripts that were affected by both nucleic acid-binding proteins; neither CsrA Read coverage histograms of rpoS locus and upstream genomic region in wild-type and ΔrpoN strains. Abundance plots represent the merged normalized expression from three independent biological replicates. Blue lines indicate relative transcript abundance from left to right (the coding strand of flgI, flgJ and rpoS) and red lines indicate relative transcript abundance from the opposite strand (right to left), and reside above (+) and below (-) the central axis. Open reading frames are indicated below coverage plots and direction of transcription is given by arrows at the ends of genes. Transcriptional start sites of the two previously mapped promoters are indicated by "TSS" and arrows. Normalized read coverage of each strand is given as RPKMO (reads per kb of gene per million reads aligning to annotated ORFs) is given on Y-axis on the left. The RpoN-and RpoD-dependent transcriptional start sites were previously identified [1,11,90]. nor BadR affected each other's transcript levels; CsrA can alter levels of transcripts such as ospC and dbpBA without affecting rpoS; transcription from the promoters of ospC and dbpBA do not require RpoS, but can be transcribed by the housekeeping sigma, RpoD; and, under the conditions examined by these studies, rpoS was transcribed from only its RpoD-dependent promoters.
None of the regulons defined by the current studies are likely to be complete. Had we chosen another growth condition, those stimuli could affect regulatory networks to the extent that additional members of these regulons might have been detected, while other transcripts might have been obscured by the effects of competing factors. For example, variations in culture conditions, such as acid stress, temperature, or osmolarity, can have significant effects on cellular levels of RpoS, explaining the substantial differences previously found between different analyses of rpoS mutant B. burgdorferi [6,8,10,[91][92][93][94][95][96]. We note also that the majority of previous studies on borrelial RpoS function have focused on bacteria cultured under conditions that caused high-level RpoS expression. We opted not to replicate such analyses because the large differences in RpoS content between wild-type bacteria that express high levels of the protein and an rpoS mutant can lead one to overlook subtleties. For example, prior array-based studies reported that levels of ospC and dbpBA were found to be greatly diminished in rpoS mutants in some instances, indicating that RpoS plays a positive role in their expression, but the low levels of ospC, dbpBA, etc. in rpoS mutants led to assumptions that RpoS is essential for their transcription [5,7]. However, under the studied culture conditions, rpoS, ospC, and dbpBA were produced at significantly detectable levels in all strains, and neither ospC nor dbpBA were affected by deletion of rpoS. This demonstrates that both ospC and dbpBA are transcribed by RpoD-directed RNA polymerase. The previously-reported effects of RpoS enhancing expression of ospC and dbpBA indicate that either their promoters can also be recognized by RpoScontaining holoenzyme, or RpoS controls production of one or more factors that affect ospC and dbpBA transcript levels (e.g. DNA-binding proteins that stimulate transcription) [5,97]. Regarding the first hypothesis, while elevated RpoS content can correlate with increased transcription of ospC, there is no strong evidence that RpoS-RNA polymerase holoenzyme directly transcribes the ospC promoter. Studies have been performed on the ospC promoter in the unrelated bacterium E. coli, but those studies determined that the two species' RNA polymerases recognize different DNA sequences, so firm conclusions cannot be drawn from that report [97]. In support of the latter hypothesis, DNA sequences adjacent to the ospC promoter are required for maximal transcription [98][99][100][101], and a recent study provided evidence for at least one RpoS-controlled DNA-binding protein [73]. In addition, this can explain how ospC is repressed early during mammalian infection, while dbpBA and rpoS continue to be expressed [59,102,103]. Clearly, much remains to be learned about the mechanisms by which B. burgdorferi controls transcription of ospC and other virulence factors.
In addition, only a portion of the badR-affected transcripts observed in the current RNA--Seq study were also observed to be affected by badR in a previous, array-based study [16]. As a caveat, arrays measure only transcripts that hybridize with a probe derived from a segment of each gene, and hybridization efficiency is sensitive to temperature, pH, salt concentrations, and other experimental conditions. In addition, all prior array-based analyses studied only mRNAs, without considering intergenic or antisense ncRNAs. Even with those caveats, the data suggest that some of the transcripts affected by badR in the current study were influenced by additional regulatory factors that had little-to-no effect on them under the conditions of the previous transcriptome analysis. These might include other regulatory proteins, or, since BadR function is dependent upon cellular carbohydrate contents, differences in metabolic status or nutrient composition between batches of culture media might have contributed to results. These variations reinforce the hypothesis that B. burgdorferi uses multiple factors in a complex network of overlapping regulons, such that fluctuations in levels of a single regulatory factor may have significant impacts on some targets but not on others. Indeed, all intensively studied operons of B. burgdorferi are controlled by multiple factors. For example, transcription of the erp operons is directly regulated by the BpaB repressor, BpuR co-repressor, and EbfC antirepressor proteins, each of which also regulates other transcripts in various ways [104,105]. Data from the current study further aids dissection of the regulatory interplay of B. burgdorferi.
Prior studies detected two promoters that drive transcription of rpoS, one of which is dependent upon RpoN, and the other upon RpoD [11,15,16,89] (Fig 10). The current study demonstrates that transcription initiating from the RpoD-dependent promoter(s) 5' of the upstream flgI and flgJ ORFs [90] probably also contributes to the expression of rpoS (Fig 10). The previously-identified RpoD-dependent promoter of rpoS lies within the flgJ ORF [11], so it is unlikely that a feature at the end of flgJ could terminate transcription that arose from one RpoD-dependent promoter but not the other. Further studies are required to determine whether the upstream promoters are regulated by B. burgdorferi, and how use of each promoter affects the others. It is also notable that all rpoS transcripts in the rpoN mutant originated from the RpoD-dependent promoters, serving as a reminder that the RpoD promoters must always be considered when studying conditions and regulatory factors that affect borrelial RpoS levels. These analyses also indicated considerable diminishment of transcription between flgJ and rpoS, suggestive of a regulatory mechanism operating in that area. The sequence does not contain an obvious intrinsic terminator [30]. Several proteins are known to bind DNA in this region, including BadR [15,17]. While prior research on those factors has focused on the RpoN-dependent promoter, it would be worthwhile to examine their effects on transcription from the RpoD-dependent promoters.
In conclusion, these studies expanded knowledge of the B. burgdorferi CsrA, BadR, RpoS, and RpoN regulons. Under the examined growth conditions, none of these regulatory proteins were observed to have impacts on any of the other three, indicating that effects of two proteins on a single transcript were due to converging regulatory pathways. This lack of impact on one another was true when considering both adjusted and non-adjusted p-values. Substantial convergence was observed between the csrA and badR mutant transcriptomes, as well as evidence that each regulates a distinct set of transcripts. CsrA exerted significant impacts upon numerous transcripts, such as ospC and dbpBA, through mechanisms that appear to be independent of RpoS, further advancing understanding of these infection-associated regulons. "WT-1" and "WT-2" indicate data from the two sets of wild-type cultures. (PDF) S1 Table. Primers used in these studies. Contains all primers used within these studies for qPCR and qRT-PCR. Name and nucleotide sequence (5'-3') is given for each. (DOCX) S2 Table. Differentially expressed transcripts when comparing the csrA mutant to wildtype, listed in order of fold change. The included transcripts met the criteria of >1 log2 foldchange and an adjusted p-value (padj) when comparing the csrA mutant to wild-type sorted by fold change. A total of 239 transcripts were differentially regulated, not including the mutated gene, by the mutation. The first column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The second column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are identified according to the names or numbers assigned to genes and transcripts by the initial genome sequencing of B. burgdorferi strain B31 [18,31] or from our previous analyses of that strain's ncRNA transcriptome [30]. (DOCX) S3 Table. Differentially expressed transcripts when comparing the badR mutant to wildtype, listed in order of fold change. The included transcripts met the criteria of >1 log2 foldchange and an adjusted p-value (padj) when comparing the badR mutant to wild-type sorted by fold change. A total of 234 transcripts were differentially regulated, not including the mutated gene, by the mutation. The first column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The second column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are identified according to the names or numbers assigned to genes and transcripts by the initial genome sequencing of B. burgdorferi strain B31 [18,31] or from our previous analyses of that strain's ncRNA transcriptome [30]. (DOCX) S4 Table. Intersection table of all transcripts that differentially expressed across all mutants. Contains the entire set of transcripts that were differentially expressed under any condition and what condition they were impacted by. A total of 331 transcripts, including those mutated, were differentially expressed across our total data set. The first column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The second column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The following 8 columns are given as each mutant and increased abundance or decreased abundance. If a transcript was differentially expressed in a condition it's cell value is given as TRUE. Empty cells indicate that a transcript was not impacted by a given condition. ORFs and ncRNAs are listed according to the numerical order assigned to genes and replicons by the initial genome sequencing of B. burgdorferi strain B31 (13,69). (XLSX) S5 Table. Total differential expression testing table for the csrA mutant. Contains the differential expression testing results for the csrA mutant compared to wild-type. The first column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The second column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are listed according to the numerical order assigned to genes and replicons by the initial genome sequencing of B. burgdorferi strain B31 (13,69). (XLSX) S6 Table. Total differential expression testing table for the badR mutant. Contains the differential expression testing results for the badR mutant compared to wild-type. The first column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The second column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are listed according to the numerical order assigned to genes and replicons by the initial genome sequencing of B. burgdorferi strain B31 (13,69). (XLSX) S7 Table. Total differential expression testing table for the rpoS mutant. Contains the differential expression testing results for the rpoS mutant compared to wild-type. The first column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The second column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are listed according to the numerical order assigned to genes and replicons by the initial genome sequencing of B. burgdorferi strain B31 (13,69). (XLSX) S8 Table. Total differential expression testing table for the rpoNmutant. Contains the differential expression testing results for the rpoN mutant compared to wild-type. The first column is the gene information, for the ncRNAs it contains the location relative to other genes and for predicted or known genes it contains gene name. The second column contains the CDS/custom transcript ID which is the transcript ID for all coding sequences obtained from the NCBI Gene file format file or the transcript ID given to ncRNAs. RefSeq entries are further separated by the character "_". The first portion gives the genetic element from which it is derived, the second describes the type of element (CDS), the third provides RefSeq ID, and the fourth provides a number indicating the particular entries ordered number in the RefSeq entry. The remaining columns describe the various metrics of expression of each impacted transcript including, base mean (average library size normalized counts across all samples), log2FC (Fold change estimate), lfcSE (uncertainty of the log fold change estimate), stat (Wald statistic), pvalue, padj (pvalue following Benjamini-Hochberg adjustment). ORFs and ncRNAs are listed according to the numerical order assigned to genes and replicons by the initial genome sequencing of B. burgdorferi strain B31 (13,69). (XLSX)