Comparison of transcriptional profiles of Treponema pallidum during experimental infection of rabbits and in vitro culture: Highly similar, yet different

Treponema pallidum ssp. pallidum, the causative agent of syphilis, can now be cultured continuously in vitro utilizing a tissue culture system, and the multiplication rates are similar to those obtained in experimental infection of rabbits. In this study, the RNA transcript profiles of the T. pallidum Nichols during in vitro culture and rabbit infection were compared to examine whether gene expression patterns differed in these two environments. To this end, RNA preparations were converted to cDNA and subjected to RNA-seq using high throughput Illumina sequencing; reverse transcriptase quantitative PCR was also performed on selected genes for validation of results. The transcript profiles in the in vivo and in vitro environments were remarkably similar, exhibiting a high degree of concordance overall. However, transcript levels of 94 genes (9%) out of the 1,063 predicted genes in the T. pallidum genome were significantly different during rabbit infection versus in vitro culture, varying by up to 8-fold in the two environments. Genes that exhibited significantly higher transcript levels during rabbit infection included those encoding multiple ribosomal proteins, several prominent membrane proteins, glycolysis-associated enzymes, replication initiator DnaA, rubredoxin, thioredoxin, two putative regulatory proteins, and proteins associated with solute transport. In vitro cultured T. pallidum had higher transcript levels of DNA repair proteins, cofactor synthesis enzymes, and several hypothetical proteins. The overall concordance of the transcript profiles may indicate that these environments are highly similar in terms of their effects on T. pallidum physiology and growth, and may also reflect a relatively low level of transcriptional regulation in this reduced genome organism.

Introduction
Treponema pallidum subsp. pallidum (hereafter called T. pallidum) is the causative agent of syphilis [1][2][3]. This highly motile spirochete is closely related to the other subspecies of T. pallidum that cause the non-venereal diseases yaws (subsp. pertenue) and bejel (subsp. endemicum) [4]. Worldwide, there are an estimated 6 million new cases of syphilis in adults each year, with an additional 300,000 fetal and infant deaths due to congenital syphilis [5,6]. In 2016, the World Health Organization developed a program to decrease the transmission of syphilis by 90% by 2030, with a focus on eliminating congenital syphilis; however, there has recently been a significant increase in new cases of syphilis in North America, Europe, and Asia [6][7][8][9]. In the United States alone, there were 35,063 new primary and secondary syphilis cases reported in 2018, representing a 71% increase from 2014 [10]. Coinciding with this increase in primary and secondary syphilis cases, there were 1,306 cases of congenital syphilis in 2018, with 78 resulting in stillbirth and 16 in infant death, reflecting a 185% increase in reported cases from 2014 [7].
The Nichols strain of T. pallidum was first isolated from a patient from Washington, D.C. with syphilis in 1912 and has been propagated in rabbits since that time [11]. This strain has served as the principal laboratory strain of T. pallidum; its complete genome was first sequenced in 1998, followed by resequencing in 2013 [12,13]. The genome consists of a single circular chromosome 1.14 Mbp in length with 1,063 predicted protein-encoding open reading frames [13]. Of the predicted open reading frames, only about 55% have predicted functions. Based on sequence information, T. pallidum lacks the ability to synthesize nucleosides, fatty acids, and most amino acids, as well as proteins necessary for metabolic processes including the Krebs cycle and oxidative phosphorylation [2,12]. Related to its reduced metabolic capabilities, T. pallidum has numerous genes involved in the transport and utilization of needed molecules from the host [1,14,15].
The genomic sequences of many additional strains of T. pallidum subsp. pallidum, subsp. pertenue, subsp. endemicum, and Treponema paraluiscuniculi (venereal spirochetosis of rabbits) reveal that these organisms are all closely related, with ~99.2% sequence identity among the T. pallidum subspecies and 98.1% identity between T. pallidum and T. paraluiscuniculi strains (reviewed in [16]). Moreover, the gene content in this group of organisms is virtually identical, with heterogeneities consisting primarily of single nucleotide polymorphisms and duplications of T. pallidum repeat (tpr) genes. Within T. pallidum subsp. pallidum, the strains are subdivided into at least two genetic clusters, consisting of isolates more closely related to the Nichols strain and others related to the SS14 strain [17][18][19]. This observation suggests a relatively recent divergence among syphilis-causing organisms.
Although the Nichols strain of T. pallidum has been propagated in rabbits and other animals for over a century, it has only recently been successfully cultured in vitro [20,21]. Co-culture of T. pallidum strains in vitro with Sf1Ep cottontail rabbit epithelial cells in growth media based on Eagle's Minimal Essential Medium (MEM) under microaerobic conditions (1.5% O2, 5% CO2) at 34°C resulted in up to 100-fold increase of T. pallidum, but serial passage of T. pallidum remained unsuccessful and cultures generally survived for less than 18 days [14]. In 2018, the first successful long-term in vitro cultivation of T. pallidum was reported [22]. This culture system uses a modified culture medium (TpCM-2, containing CMRL 1066 medium as its basal medium) in combination with Sf1Ep cells grown under microaerobic conditions to successfully cultivate T. pallidum continuously in vitro, with retention of infectivity in the rabbit model [22][23][24][25].
The recent advancement in the ability to cultivate T. pallidum in long-term in vitro culture has opened up the possibility of studying the biology of this enigmatic organism in greater detail. Although gene expression has been examined previously to some extent in rabbit-propagated T. pallidum [2,26,27], it is uncertain how gene expression is affected by in vitro culture. This study compares the gene transcript levels of T. pallidum propagated by intratesticular infection of rabbits with those of T. pallidum grown in vitro utilizing global RNA-seq analysis and quantitative reverse transcriptase PCR (qRT-PCR) of a small subset of genes. Overall, expression of 90% of the T. pallidum genes was not significantly different between these two culture conditions. These results indicate that in vitro cultivation of T. pallidum is a useful alternative to rabbit infection for studying gene expression patterns and other biological properties of this important human pathogen. In addition, the data support the concept that T. pallidum may have a limited capability to alter gene expression in response to varying environmental conditions.

Global transcriptional profiles of T. pallidum propagated in rabbits and in in vitro cultures
All of the experiments in this study utilized the Nichols strain of T. pallidum subsp. pallidum, hereafter referred to as T. pallidum. Two sets of in vitro samples, each consisting of T. pallidum RNA collected from three 75-cm² flask cultures, were compared to T. pallidum RNA collected from two rabbits (Fig 1). RNA sequences obtained using an RNA-seq approach were mapped against the published T. pallidum genome (NC_021490). An average of 46,823,419 read pairs were obtained from each of the six in vitro samples, in comparison to an average of 44,265,095 read pairs from each rabbit sample (Table 1). The overall percentage of read pairs that mapped to the T. pallidum genome differed between in vitro and rabbit samples, with an average of 12.5% of the in vitro read pairs mapping to the T. pallidum genome in comparison to 71.4% of the rabbit sample read pairs (Table 1). Both the in vitro and rabbit samples contained residual mammalian cells prior to RNA extraction.
Read pairs that mapped to the T. pallidum genome were assigned to individual accession numbers by HTSeq with the minimum alignment quality set to 0 (Tables 2 and S1). The majority of the assigned read pairs corresponded to rRNA sequences, with an average of 84.0% of the assigned in vitro culture read pairs and 91.4% of the rabbit infection-derived read pairs corresponding to rRNA. The average percent read pairs per in vitro sample that mapped to protein-encoding genes (16.0%) was almost twice as high as the average percent per rabbit sample (8.6%). Similarly, the average percent of tRNA read pairs per sample in the in vitro samples (0.08%) was also higher than in the rabbit samples (0.02%).

https://doi.org/10.1371/journal.ppat.1009949.g001

Table 1. Read pairs generated from the in vitro culture and rabbit-propagated samples by RNA-seq analysis. The total number of read pairs per sample and the number of read pairs that map to the T. pallidum genome (NC_021490) were generated by HTSeq.

PLOS PATHOGENS
Treponema pallidum comparative RNAseq

Consistency of RNA-seq read profiles
RNA-seq is considered a valuable measure of global RNA transcript levels. However, RNA-seq read coverage profiles exhibit a surprising degree of unevenness of transcript levels within genes and operons. To examine whether differences in read frequencies between T. pallidum propagated by in vitro culture or rabbit infection could be due in part to differential RNA stability, we scanned the coverage profile throughout the genome using the Integrative Genomics Viewer (IGV) [28] and its appended Sashimi program [29]. The coverage profiles of the eight RNA preparations from in vitro- and rabbit-propagated T. pallidum were found to be remarkably similar; this observation is exemplified by the near identical patterns found in the vicinity of the large ribosomal protein gene operon (Fig 2). To quantitatively compare coverage profiles between samples, the read distributions within the length of all the protein-encoding genes were determined and averaged for each RNA preparation. Highly similar read distribution profiles were obtained from RNA samples obtained from infected rabbits or in vitro cultures. Coverage profiles were highest at the 5' regions and lowest, at about 80-85% of maximum, in the 3' regions. Overall, the similar read distribution profiles in all samples indicate that it is unlikely that differences in read distribution significantly contributed to gene expression differences. Thus the observed differences in normalized average counts are unlikely to be the result of differences in RNA degradation patterns (e.g. altered expression or activity of RNases) or related effects.
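As an illustration of the kind of within-gene read distribution comparison described above, the following is a minimal Python sketch. It assumes that per-base coverage values are already available for each gene, oriented 5' to 3'; the function names, the number of bins, and the scaling to the highest bin are assumptions made for this example and are not taken from the study's actual pipeline.

```python
from typing import List


def relative_coverage_profile(per_base_coverage: List[float], n_bins: int = 10) -> List[float]:
    """Average per-base coverage in n_bins equal-width bins along a gene
    (5' to 3'), scaled so that the highest bin equals 1.0."""
    length = len(per_base_coverage)
    if length < n_bins:
        raise ValueError("gene is shorter than the number of bins")
    bins = []
    for b in range(n_bins):
        start = b * length // n_bins
        end = (b + 1) * length // n_bins
        segment = per_base_coverage[start:end]
        bins.append(sum(segment) / len(segment))
    peak = max(bins)
    if peak == 0:
        return [0.0] * n_bins  # gene with no coverage at all
    return [value / peak for value in bins]


def mean_profile(profiles: List[List[float]]) -> List[float]:
    """Average the binned profiles of many genes, position by position."""
    n_genes = len(profiles)
    return [sum(column) / n_genes for column in zip(*profiles)]


# Hypothetical gene whose 3' half has 80% of the coverage of its 5' half.
example = relative_coverage_profile([100.0] * 50 + [80.0] * 50, n_bins=2)
print(example)
```

Computing one averaged profile per RNA preparation and comparing the resulting curves is one simple way to check that samples share the same 5' to 3' coverage shape.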

Comparison of the most highly expressed T. pallidum genes during rabbit infection and in vitro growth
The fifty T. pallidum protein-encoding genes with the highest transcript levels during rabbit infection and long-term in vitro culture (Table 3) were determined by calculating FPKM (fragments per thousand bases per million reads) for each gene using the counts generated by HTSeq. Genes encoding RNA products (rRNAs and tRNAs) were excluded from this analysis. The functional group corresponding to each of these genes was assigned based on the predicted functions of T. pallidum genes [12], and the percentage of each functional group for the top fifty most highly expressed genes was determined (Fig 3). The overall functional group percentages for both rabbit infection and in vitro culture derived T. pallidum were highly similar. The highest frequency functional group for both rabbit (Fig 3A) and in vitro culture (Fig 3B) was cell envelope proteins, comprising 26% and 34% of the top 50 most highly expressed genes in rabbit and in vitro cultivation, respectively; this group includes flagellar proteins, membrane lipoproteins, and other membrane-associated proteins. Other functional groups with high transcript levels (>2% of total) included those encoding proteins involved in translation, cellular processes (including chaperones and proteins involved in oxidative/reduction reactions), energy metabolism, transport and substrate binding, and unknown functions (hypothetical proteins) (Fig 3). Among these groups, the categories exhibiting the highest and lowest ratio of
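The FPKM calculation used to rank genes can be sketched in a few lines of Python. This is a minimal illustration assuming per-gene fragment counts (for example, from HTSeq) with rRNA and tRNA entries already removed; the function names, gene identifiers, and numbers are hypothetical and do not come from the study.

```python
from typing import Dict, List


def fpkm(fragment_counts: Dict[str, int], gene_lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """FPKM = fragments * 1e9 / (gene length in bp * total counted fragments)."""
    total = sum(fragment_counts.values())
    if total == 0:
        raise ValueError("no fragments were counted")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total)
        for gene, count in fragment_counts.items()
    }


def top_genes(fpkm_values: Dict[str, float], n: int = 50) -> List[str]:
    """Gene identifiers sorted from highest to lowest FPKM, truncated to n."""
    return sorted(fpkm_values, key=fpkm_values.get, reverse=True)[:n]


# Hypothetical counts and lengths, for illustration only.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 2000}
values = fpkm(counts, lengths)
print(values)
print(top_genes(values, n=1))
```

Dividing by gene length keeps long genes from ranking highly simply because they collect more fragments, and dividing by the total corrects for differences in sequencing depth between samples.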

Comparison to previous T. pallidum RNA and protein expression data
The 50 most highly expressed genes from this experiment were then compared to the 50 most highly expressed T. pallidum genes during rabbit infection previously reported by Šmajs et al. [26] in a microarray study. After accounting for newly annotated genes, 42% (18/43) and 30% (13/44) of the most highly expressed genes during rabbit infection and in vitro culture in this study were also among the most highly expressed in the previous microarray transcriptome study (S2 Table) [26]. The most common functional groups for the top 50 most highly expressed T. pallidum genes in rabbits as determined by Šmajs et al. [26] were hypothetical proteins (34%), cell envelope (26%), translation (14%), energy metabolism (12%), and cellular processes (8%). These results were similar to the data obtained for this study, with the most common functional groups for both the rabbit infection and in vitro cultivation samples being cell envelope, translation, cellular processes, and hypothetical proteins (Fig 3). As in the previous work, we found that the four genes encoding flagellar filament proteins (flaB1-3, flaA) were among the most highly expressed genes in rabbit infection and in vitro cultivation, but unlike in the previous work the cytoplasmic filament protein cfpA was not one of the most highly expressed genes in our study (Table 3). Of the genes encoding outer membrane proteins or lipoproteins, two (TPANIC_0663 and tmpA) were among the most highly expressed genes in our results as well as in the previous work, while five additional membrane components were detected in the top 50 most highly expressed genes in the previous rabbit transcriptome data (tp34, tmpC, tpp15, tmpB, and tpp17). Although three genes encoding chaperonins were among the most highly expressed in the previous work (groEL, groES, and dnaK), they were not among the most highly expressed genes in this study. Genes responsible for the maintenance of redox potential (ahpC, flavodoxin, thioredoxin), a V-type ATPase component (TPANIC_0424), and glyceraldehyde-3-phosphate dehydrogenase (TPANIC_0844) were all highly expressed in this study as well as in the previous transcriptome study.

Table 3. T. pallidum protein-encoding genes with the highest average FPKM during in vitro culture or rabbit infection. The fifty genes with the highest gene expression based on average FPKM during in vitro culture and rabbit infection are listed in order from highest expression to lowest in the rabbit-derived specimens. ORF numbers in black indicate that the gene is one of the fifty most highly-expressed genes during both rabbit infection and in vitro culture. ORF numbers in blue indicate that the gene is one of the fifty most highly-expressed genes in rabbits but not in vitro culture, while ORF numbers in orange indicate that the gene is one of the fifty most highly-expressed during in vitro culture, but not in rabbit infection. Average FPKM was calculated based on counts determined by HTSeq with the minimum alignment quality set to 0 and excluding rRNA and tRNA read pairs. Unless otherwise indicated, functional categories are based on [26].

To further compare the data generated by RNA-seq to the previously reported microarray data, scatter plots were generated based on the reported cDNA/DNA ratio values from the microarray work and the FPKM values generated in this study. Overall, there was a low concordance between the prior microarray data and the data generated by RNA-seq; the microarray data were somewhat more similar to the rabbit infection RNA-seq results (R² = 0.28) than to the in vitro culture results (R² = 0.19).

Osbak et al. [30] analyzed the proteome of T. pallidum subsp. pallidum DAL-1 in a semiquantitative manner using mass spectrometry. A total of 557 proteins (corresponding to 54% of the predicted protein-encoding genes) were identified by this means. In their study, the abundance of these proteins as measured by the normalized spectral abundance factor (NSAF) did not correlate with transcript abundance as determined previously in the Šmajs et al. [26] microarray analysis. Similarly, we found that the protein NSAF values did not correlate well with the RNA transcript levels determined for T. pallidum propagated in infected rabbits or in vitro cultures, yielding R² values of 0.006 and 0.0028, respectively.

Differential gene expression of T. pallidum cultured in rabbits and in vitro
A scatter plot comparing the log2-transformed average FPKM values for rabbit infection and in vitro culture was created to compare the similarity between these two growth conditions (Fig 4A). There was a high concordance between the RNA-seq data generated for rabbit infection and in vitro culture (R² = 0.90), indicating that there is not a large difference in T. pallidum gene expression between these two culture conditions. A Poisson distance matrix was calculated from rlog-transformed read counts to compare gene expression of the two rabbit samples to the six in vitro samples. The gene expression profiles of the two rabbit samples were most similar to each other, and the six in vitro samples also clustered together (Fig 4B).
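A concordance value of this kind can be computed as the squared Pearson correlation of the log2-transformed FPKM values for the two conditions. The sketch below is a minimal, standard-library Python illustration; the function names and the pseudocount added before the log transform are assumptions for this example, and the source does not state how zero values were handled.

```python
import math
from typing import List, Sequence


def log2_fpkm(values: Sequence[float], pseudocount: float = 1.0) -> List[float]:
    """log2(FPKM + pseudocount); the pseudocount is an assumption of this sketch."""
    return [math.log2(v + pseudocount) for v in values]


def pearson_r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy * sxy / (sxx * syy)


# Hypothetical per-gene FPKM values for two conditions, in the same gene order.
rabbit = log2_fpkm([10.0, 200.0, 3500.0, 40.0])
in_vitro = log2_fpkm([12.0, 150.0, 4100.0, 55.0])
print(pearson_r_squared(rabbit, in_vitro))
```

Working on the log2 scale keeps a handful of very highly expressed genes from dominating the correlation.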
Differential expression analysis was then used to compare the individual gene transcript levels from the combined rabbit samples to the combined in vitro samples (S2 Table), omitting tRNA transcripts. Genes were considered to be significantly differentially expressed if the |log2-fold difference| was ≥ 1 (equivalent to a 2-fold difference in gene expression) and the false discovery rate (FDR) adjusted p-values were ≤ 0.05. Of the 1,063 genes from the T. pallidum genome that were represented by the RNA sequencing data, 94 (9%) were differentially expressed (Table 4). To verify these results, a subset of significantly differentially expressed genes was subjected to qRT-PCR. All of the genes examined by qRT-PCR were differentially expressed (p ≤ 0.05) between T. pallidum grown in vitro and in rabbits, in agreement with the RNA-seq results (Table 5). The RNA-seq and qRT-PCR differential expression values in Table 5 exhibited a high degree of correlation (r = 0.95).
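The significance criteria above amount to a simple two-part filter on each gene's log2-fold difference and FDR-adjusted p-value. The following Python sketch shows that filter applied to a results table; the record type, function names, and example values are hypothetical and are not data from the study. The statistical testing itself (performed with DESeq2 in the study) is not reproduced here.

```python
from typing import List, NamedTuple, Tuple


class GeneResult(NamedTuple):
    gene_id: str
    log2_fold_change: float  # log2(rabbit / in vitro)
    padj: float              # FDR-adjusted p-value


def is_significant(result: GeneResult,
                   min_abs_log2fc: float = 1.0,
                   max_padj: float = 0.05) -> bool:
    """True if |log2 fold change| >= 1 (at least 2-fold) and adjusted p <= 0.05."""
    return abs(result.log2_fold_change) >= min_abs_log2fc and result.padj <= max_padj


def split_by_direction(results: List[GeneResult]) -> Tuple[List[str], List[str]]:
    """Return (higher in rabbit, higher in vitro) lists of significant gene IDs."""
    higher_in_rabbit = [r.gene_id for r in results
                        if is_significant(r) and r.log2_fold_change > 0]
    higher_in_vitro = [r.gene_id for r in results
                       if is_significant(r) and r.log2_fold_change < 0]
    return higher_in_rabbit, higher_in_vitro


# Hypothetical results, for illustration only.
table = [
    GeneResult("gene_a", 1.70, 1e-12),   # passes both criteria, higher in rabbit
    GeneResult("gene_b", -1.17, 0.001),  # passes both criteria, higher in vitro
    GeneResult("gene_c", 0.60, 0.0001),  # significant p-value but less than 2-fold
    GeneResult("gene_d", 2.10, 0.20),    # large difference but not significant
]
print(split_by_direction(table))
```

Requiring both conditions excludes genes with statistically significant but small differences as well as genes with large but unreliable differences.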

Pathway analysis
To identify potential enrichment of differentially expressed genes in specific biological pathways, protein-coding genes were first annotated with Gene Ontology (GO) terms by homology. Overall 64% (637/1003) of protein-coding genes were successfully annotated with one or more GO terms. Gene set enrichment analyses were then performed using TopGO, ClusterProfiler, and GoSeq. All three analyses identified GO terms representing ribosomal proteins as significantly upregulated in rabbits in comparison to in vitro cultures with adjusted p-values of < 0.001. Among GO terms with ten or more members, TopGO also identified ATP metabolic process proteins (p-adj < 0.05) as weakly upregulated in rabbits, whereas DNA repair proteins (p-adj < 0.05), transmembrane transporter activity proteins (p-adj < 0.01), and membrane proteins (p-adj < 0.01) were weakly downregulated in rabbits. Likewise, ClusterProfiler also identified DNA repair proteins (p-adj < 0.05) as downregulated in rabbit cultures. GoSeq did not identify any additional enriched terms. Enrichment was also assessed for a collection of previously identified putative virulence genes [12]; however, no significant difference was identified between the rabbit and in vitro culture conditions.
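The basic idea behind over-representation testing of a GO term can be illustrated with a one-sided hypergeometric test. The sketch below is a generic Python illustration (Python 3.8+ for `math.comb`) and is not a reimplementation of the packages named above, each of which applies its own algorithm and corrections; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes: int, term_genes: int,
                           de_genes: int, overlap: int) -> float:
    """One-sided hypergeometric p-value: the probability of observing at least
    `overlap` term-annotated genes among `de_genes` differentially expressed
    genes drawn from `total_genes`, of which `term_genes` carry the term."""
    upper = min(term_genes, de_genes)
    denominator = comb(total_genes, de_genes)
    numerator = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, de_genes - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / denominator


# Hypothetical numbers: 600 annotated genes, 50 carrying the term,
# 60 differentially expressed genes, 12 of which carry the term.
print(hypergeom_enrichment_p(600, 50, 60, 12))
```

In practice, p-values obtained this way for many GO terms would still need adjustment for multiple testing before being compared to thresholds such as those reported above.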

Transport of nutrients
T. pallidum acquires many protein, nucleotide, and lipid precursors from the environment, and must also maintain appropriate intracellular concentrations of electrolytes and other solutes through transport proteins. However, only a few transporters exhibited differential transcript levels in the rabbit infection and in vitro culture models. TPANIC_0163, encoding the ABC transporter periplasmic binding protein TroA that binds iron, zinc, and manganese ions, had one of the highest differential transcription values (log2 1.70, p < 1.07 × 10^−12) between rabbit infection and in vitro culture [31][32][33][34]. In contrast, transcripts for the magnesium/cobalt efflux proteins TPANIC_0027 and TPANIC_0028 were significantly higher in the in vitro samples compared to the rabbit (log2 −1.17 and −1.08, respectively) [35]. Four additional transport-related genes had significantly higher transcription in the in vitro environment: TPANIC_0140 (K+ transport protein NtpJ), TPANIC_0840 (major facilitator subfamily [MFS] transporter protein), TPANIC_0786 (ABC transporter ATP binding protein), and TPANIC_0301 (ABC transporter permease).

Differences in gene transcripts related to metabolism in T. pallidum cultured in rabbits versus in vitro
The transcript levels of ppdK, encoding a pyruvate phosphate dikinase, were higher in rabbits than in vitro, suggesting that pyruvate metabolism is elevated during infection. Transcript levels of the pyruvate-flavodoxin oxidoreductase NifJ (TPANIC_0939), which is thought to be involved in maintenance of a proton gradient across the cytoplasmic membrane, were also elevated during rabbit infection [2]. Related to the redox environment and antioxidant defense, trxA (thioredoxin) transcript levels were increased in rabbits compared to in vitro culture. The alkyl hydroperoxidase AhpC, also involved in antioxidant defense, has one of the highest transcript levels in both the in vivo and in vitro environments (Table 3) [36]. Transcripts elevated in T. pallidum cultured in vitro include a gene involved in folic acid biosynthesis (folC).

Varied ribosomal protein gene transcript levels in T. pallidum cultured in rabbits versus in vitro
A total of 14 of 56 ribosomal protein genes had significant differential expression between the long-term in vitro cultures and T. pallidum grown in rabbits, using the criteria of |log2-fold difference| ≥ 1 in transcript levels with a p value <0.05 (Table 4). Most of these (11 of 14) had higher transcript levels during rabbit infection as compared to in vitro culture. To provide a more comprehensive view, the relative transcript levels for all of the ribosomal protein genes in the large ribosomal protein operon (TPANIC_0188 through TPANIC_0213) and in additional loci were examined (Fig 6). Within the large operon, all 27 genes (including two that do not encode ribosomal proteins) were expressed at a higher level in infected rabbits than in the in vitro cultures, with 10 (highlighted in green) of these fulfilling both the 2-fold increase and p<0.05 significance criteria (Fig 6A). Ten additional genes had p-values less than 0.05 but less than a 2-fold increase (highlighted in yellow). Ribosomal protein genes at other loci (including potential operons of 2-4 genes) had more varied results (Fig 6B). Two additional genes (ssb1 and prp) 'embedded' in potential ribosomal protein operons were also included in the results. Only 3 of 31 genes in Fig 6B had >1 log2-fold differences and p<0.05, with an additional 11 with smaller differences but p<0.05. Of the 11 ribosomal protein genes with >1 log2-fold differences and p<0.05, 7 encode proteins associated with the 30S ribosomal subunit.

Table 4. Genes with significantly different transcript levels in T. pallidum from in vitro cultures and infected rabbits. Average normalized counts, log2-fold difference and false discovery rate (FDR) adjusted p-values were determined using DESeq2. Gene function was based on [9]. Functional categories were based on Gene Ontology (GO) terms (QuickGO). Log2 values represent the ratio of rabbit counts/in vitro counts. Genes are listed in the order of ascending log2-fold difference. Differences in tRNA expression were excluded from this analysis.

Membrane and flagellar protein gene transcript levels
Higher transcript levels of putative OmpA-OmpF porin family proteins TPANIC_RS05190 and TPANIC_RS00645 were present in vitro than in rabbits. Likewise, transcripts encoding multiple predicted proteins involved in lipoprotein (CoaD and Ddl) and peptidoglycan biosynthesis (MltG), and trans-membrane lipoprotein transport (LolA) were elevated in the in vitro samples in comparison to in rabbits. In contrast, levels of transcripts encoding the lactoferrin binding periplasmic lipoprotein Tp34 (TpD) [37], the carboxypeptidase lipoprotein Tp47 [38], and the purine nucleoside-binding lipoprotein [39] PnrA/TmpC were higher in rabbits (Table 4). Only a few of the genes involved in motility appeared to be differentially expressed in rabbits and in vitro. Flagellar assembly is accomplished with 26 known proteins, including three flagellar filament core proteins (FlaB1, FlaB2, and FlaB3), a flagellar filament sheath protein (FlaA1), motor proteins (MotA and MotB), and multiple motor switch proteins (FliG1, FliG2, FliM, and FliN) [40]. In general, there were no consistent differences between flagellar gene transcript profiles in the in vivo and in vitro environments (S2 Table). Exceptions include the genes encoding FlaB3, FliL1, and FliG1, which exhibited significantly higher transcript levels during infection of rabbits as compared to in vitro culture (Table 4).

Expression of genes involved in DNA replication, transcription and mismatch repair in T. pallidum cultured in rabbits versus in vitro
Transcripts for the chromosome replication initiation protein dnaA were more highly expressed during rabbit infection than during in vitro culture (Table 4). In contrast, transcripts for the mismatch repair protein mutS, the DNA repair and recombination restart protein recO, and ruvB, a Holliday junction DNA helicase that is also involved in DNA repair, were higher in vitro than in rabbits, suggesting that DNA repair processes may be upregulated in vitro.

Regulatory protein gene transcript levels
TPANIC_0474 is highly homologous (51% identical, 73% similar) to the Borrelia burgdorferi YebC/PmpR family DNA binding transcriptional regulator (BB0025) that affects the expression levels of VlsE, a B. burgdorferi surface lipoprotein involved in immune evasion [41]. In our studies, more TPANIC_0474 transcripts were detected in T. pallidum during rabbit infection as compared to in vitro, potentially indicating a regulatory response to the environment in infected rabbits. Similarly, significantly higher transcript levels during rabbit infection were observed with TP_0461, which is predicted to encode a xenobiotic response element (XRE) family regulatory protein with a helix-turn-helix binding motif. None of the five predicted sigma factor genes of T. pallidum (TPANIC numbers _0493, _0092, _0111, _0709, and _1012) exhibited significantly different transcript levels in the rabbit infection and in vitro culture environments (S2 Table).

Procedural observations
In this study, we compared the transcriptomes of T. pallidum grown in rabbit testes versus in long-term in vitro culture to determine if expression patterns between the two culture conditions are similar. Comparison of the log2-transformed FPKM values for rabbit infection and in vitro culture showed that RNA transcript levels for these two culture conditions were highly similar. Subsequent differential expression analysis conducted using DESeq2 also indicated that the two culture conditions result in highly similar RNA transcriptional profiles, but significant differences were observed for 94 genes. A subset of these were verified by qRT-PCR. The overall similarity of the transcription profiles during infection of rabbits and in vitro culture leads to two possible conclusions. The first is that T. pallidum is well adapted to its natural, relatively homeostatic environment in tissue and, unlike Borrelia species [42], has evolved toward near constant expression of its gene repertoire with few mechanisms of gene regulation (reviewed in [2]). The second possible conclusion is that the conditions in rabbit testicular tissue and those present in the in vitro culture system (which, like tissue, includes mammalian cells, a rich source of nutrients, and exposure to microaerobic oxygen levels) are very similar and thus result in closely related transcript patterns. The observed differences in transcript levels may provide insight into genes that are regulated to some degree. It is important to note that in this study, T. pallidum were obtained from the inoculated, inflamed testes of infected rabbits. It is possible that, related to the systemic nature of syphilis infections, T. pallidum obtained from other rabbit tissue (such as skin lesions or blood) may exhibit slightly different transcription patterns. Additionally, exposure to more extreme, stressful conditions during in vitro culture (e.g. lack of mammalian cells or changes in temperature, oxygen concentration, or medium composition) may also lead to greater differences in gene expression and hence reveal additional regulatory networks.
All of the 1,063 predicted genes were represented in both the rabbit infection and in vitro culture transcriptomes, and the majority of the assigned reads were rRNAs. A significant portion of the sequences in all specimens examined did not map to the T. pallidum genome (29% to 88%); most, if not all, of these populations likely represent rabbit RNA sequences from the infected New Zealand white rabbits or the Sf1Ep cottontail rabbit epithelial cells present in the in vitro cultures. In addition, the majority of mapped T. pallidum RNA sequences corresponded to rRNAs (84% of the assigned in vitro sequences and 91% of the assigned rabbit sequences). This result indicates that the RNA preparation kit used for the transcriptome library was not sufficiently effective in enriching for T. pallidum mRNA; this method uses selective oligonucleotides based on 50 different prokaryotic species to hybridize with and remove prokaryotic rRNA [43,44] and may not work well with T. pallidum rRNA species. More efficient removal of rabbit cells from the samples prior to RNA extraction, as well as more efficient T. pallidum mRNA enrichment procedures, would be expected to increase the proportion of T. pallidum sequences recovered in RNA preparations. Similarly, tRNA expression levels were omitted from analysis because the RNA purification, reverse transcription, and cDNA sequencing procedures utilized in this study were not optimal for tRNA recovery and quantitation [45,46].
Comparison of our RNA-seq results with a prior transcriptome analysis utilizing a hybridization procedure [26] showed a low degree of correlation, although there was a general trend with regard to increasing transcript concentration values. The reasons for the relatively poor concordance are unknown, but may be related to differences in RNA preparation procedures or the inherently lower dynamic range and sensitivity of hybridization methods [26,47,48]. The T. pallidum protein abundance values previously reported by Osbak et al. [30] did not correlate well with our FPKM values obtained by RNA-seq, similar to the poor correspondence that they observed with the previous RNA abundance data obtained by hybridization [26]. In studies with other organisms, R² values between protein and mRNA levels were typically only ~0.4, indicating that post-transcriptional effects may play a major role in the relative abundance of proteins [49].
The RNA-seq results were validated by qRT-PCR of 10 genes that exhibited differential expression (Table 5), and these results had a Pearson R² value of 0.91. The magnitude of the expression differences for the 10 genes examined by qRT-PCR was higher than that determined by RNA-seq, possibly indicating that RNA-seq underestimates differences in transcript levels, but for each gene tested the direction of the difference between rabbit infection and in vitro culture was the same. Although the number of genes in this analysis is limited, the data indicate that the RNA-seq information is useful in comparing transcript levels during rabbit infection and in vitro culture conditions.
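The kind of comparison described above can be illustrated with a short Python sketch that computes a Pearson R² between two sets of log₂ fold changes and checks that the direction of change agrees for every gene. The fold-change values below are hypothetical and are not taken from Table 5; they are chosen so that the qRT-PCR differences are larger in magnitude than the RNA-seq differences while remaining fully correlated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes (rabbit vs. in vitro) for four genes.
rnaseq_log2fc = [1.0, -1.0, 2.0, -2.0]
qpcr_log2fc = [2.0, -2.0, 4.0, -4.0]  # twice the magnitude, same sign

r_squared = pearson_r(rnaseq_log2fc, qpcr_log2fc) ** 2

# True when every gene changes in the same direction in both assays.
same_direction = all(
    (a > 0) == (b > 0) for a, b in zip(rnaseq_log2fc, qpcr_log2fc)
)
```

A high R² together with consistent direction is compatible with a systematic difference in magnitude between the two methods, because correlation is insensitive to a constant scaling factor.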

Implications for the use of the in vitro culture system as a substitute for the rabbit model
Growth and multiplication of T. pallidum require acquisition of nutrients, catabolic and anabolic activities, recycling and modification of components (e.g., lipids and nucleotides), synthesis of macromolecules (nucleic acids, proteins, and peptidoglycans), assembly of structures (such as membranes, ribosomes, and flagella), and cell division processes. In addition, T. pallidum has specialized mechanisms to protect it against the host's immune system, including antigenic variation, limitation of surface immune targets, and adherence and penetration of tissue [2,50]. All of these activities must be regulated to some extent, although the T. pallidum genome contains only a few genes encoding predicted regulatory factors. Based on the RNA-seq data, transcripts of most genes are present at similar levels during rabbit infection and in vitro culture. The relative similarity in transcript levels in the two environments supports previous work showing that T. pallidum metabolism and growth are similar during experimental rabbit infection and in the in vitro culture system [22,23]. However, the observed transcriptional differences may indicate important effects of these two environments on the organisms.

Membrane transport and lipoprotein enrichment in vitro
T. pallidum grown in vitro demonstrated a weak enrichment of transcripts with GO terms associated with membrane transport. For example, transcripts for genes encoding proteins with predicted involvement in potassium uptake (ntpJ) and magnesium/cobalt efflux (TPANIC_0027, TPANIC_0028) [35] were elevated in vitro (Table 4), potentially indicating an increased need for balance of these ions in the in vitro environment. Conversely, the gene encoding TroA (the periplasmic binding protein of the Fe/Mg/Zn ABC transporter operon troABCDR [31][32][33][34]) had significantly higher transcript levels during rabbit infection than in in vitro cultures. In studies in the related organism Treponema denticola [51,52], TroA and the cognate regulator protein TroR were found to be important in ion transport. Therefore, the tro operon may play an important role in T. pallidum metalloregulation and gene expression during infection. Differential expression of genes involved in membrane transport could be due to differing concentrations of important nutrients between TpCM-2 medium and the rabbit model. Interestingly, the testes of 10-month-old rabbits have only about 12% of the zinc concentration found in serum, possibly explaining why troA is expressed at higher levels in rabbit testes than in vitro, and potentially providing a means of nutritional immunity against syphilis infection in the rabbit host [53]. These data showing an enrichment of genes involved in membrane transport will therefore be useful for designing future studies aimed at optimizing in vitro growth.
Although pathway analysis did not detect a significant difference (p-adj > 0.05) in the expression of genes with GO terms associated with membrane proteins, transcripts were significantly higher in vitro for multiple lipoproteins (Oop protein TPANIC_RS05190, and several other predicted membrane lipoproteins) and enzymes involved in lipoprotein and peptidoglycan synthesis (CoaD, Ddl, MltG). Membrane protein genes with significantly higher transcript levels in the rabbit environment included those encoding the Tp47 peptidoglycan carboxypeptidase, lactoferrin-binding lipoprotein Tp34 (TpD), OmpW protein TPANIC_0126, lipoprotein TmpC, Tpr domain-containing protein TPANIC_0017, and the heterogeneous fibronectin-binding lipoprotein TPANIC_0136. These predicted gene products could be involved in the adaptation to the infected rabbit and the in vitro culture environments.

Insights into the metabolism of T. pallidum
Pathway analysis indicated an enrichment of transcripts for genes associated with ATP metabolic processes in rabbits. For example, elevated expression of ppdk in rabbits suggests that pyruvate and phosphoenolpyruvate (PEP) metabolism are elevated in comparison to in vitro culture. PEP is thought to be important for the ability of T. pallidum to respond to differences in amino acid and glucose levels in the environment [2], so increased transcript levels of ppdk may allow organisms grown in rabbits to switch more efficiently between different sources of carbon. As sodium pyruvate is a component of TpCM-2, these data suggest that manipulation of pyruvate levels may be important for T. pallidum's growth and survival. In contrast, folic acid biosynthesis or interconversion may be elevated in vitro due to an increase in expression of folC in comparison to rabbit infection, perhaps indicating that the addition of more folic acid to TpCM-2 may be beneficial. Enrichment of other GO terms involved in metabolic processes was not detected, perhaps indicating that T. pallidum does not undergo large swings in metabolism when subjected to different culture conditions, which is not surprising given its highly reduced genome.

Differences in RNA levels for DNA repair, transcription regulation, and translation machinery genes
Genes with GO terms associated with DNA repair had slight but significant elevations in transcript levels in vitro. For example, transcripts for the mismatch repair protein mutS were higher in vitro than in rabbits. In the spirochete B. burgdorferi, MutS is important for repairing the oxidative DNA damage caused by reactive oxygen species (ROS) produced by the infected host [54]. The elevated levels of mutS transcripts suggest that T. pallidum grown in vitro may be subject to greater levels of DNA damaging agents (such as ROS) than organisms grown in rabbits [22,54]. Transcript levels of the Holliday junction DNA helicase ruvB were also significantly elevated in vitro. This protein is activated by the global SOS response to DNA damage in other bacteria; it is possible that the in vitro culture system induces higher levels of DNA damage than occur in T. pallidum grown in rabbits [55].
The T. pallidum genome encodes very few recognizable regulators; for example, it does not contain any identifiable two-component regulatory systems [2]. One gene that had significantly higher transcript levels in infected rabbits was the transcriptional regulator yebC. Zhang et al. [41] found that mutation of yebC in B. burgdorferi resulted in differences in transcript levels in 32 genes, with the largest decrease occurring in the antigenic variation protein gene vlsE. The yebC mutant was unable to cause long-term infection in immunocompetent mice, most likely due to a deficiency in immune evasion. The YebC ortholog in T. pallidum (TPANIC_0474) may also affect gene expression, allowing the spirochete to adapt to changing conditions during syphilitic infection, such as increased immune pressure. In addition, TPANIC_0461 is predicted to encode an Xre family [56][57][58][59] regulatory protein homolog, and this gene has elevated transcript levels during rabbit infection as compared to in vitro culture (Table 4). Thus it is possible that TPANIC_0461 is capable of altering expression of other genes and aiding in adaptation, even in the relatively homeostatic environment of human tissue.
In terms of macromolecular synthesis, pathway analysis found that transcripts for genes with GO terms associated with ribosomes were significantly higher in rabbits; rRNA species were excluded from this analysis, because rRNA was selectively depleted in these preparations to increase the proportion of mRNA reads. Although transcript levels of some of the ribosomal protein genes were significantly elevated, others were not (Fig 5). It is of interest that ribosomal protein transcripts were not consistently upregulated in one growth condition in comparison to the other. There is increasing evidence that ribosome composition can vary between growth conditions or tissues as an additional level of translational control [60][61][62]; it is possible that T. pallidum has retained this mechanism as a way of adapting to varied conditions within host tissue. With regard to DNA replication, the chromosome replication initiation protein gene dnaA had significantly higher transcript levels in rabbits than in vitro, but transcript levels for other genes necessary for replication (such as the DNA polymerase I gene, polA) were not significantly different. This result suggests that overall transcript levels for genes involved in DNA replication did not differ between the two culture conditions, as supported by our pathway analysis results.

Implications for future syphilis research
The availability of long-term in vitro culture of T. pallidum [22][23][24][25] has opened up the possibility to study the growth, motility, and antimicrobial susceptibility of T. pallidum without the need for a rabbit host. The similarity of the transcriptomes generated from different culture environments suggests that T. pallidum does not globally shift its expression levels based on environmental conditions, which is not unexpected for an obligate pathogen with a reduced genome size. Further analysis of the observed differences in gene expression between these two systems may provide insights into adaptive mechanisms that T. pallidum has retained during genome reduction. One approach is to examine gene expression under different in vitro culture conditions, such as varied temperature, pH, medium composition, ROS concentrations, or axenic culture. A future goal for the field is to develop the ability to systematically mutate T. pallidum and thereby provide a more definitive view of the genetic basis of its unique biology and pathogenesis.

Ethics statement
Rabbit procedures were reviewed and approved by the Animal Welfare Committee of the University of Texas Health Science Center at Houston.

Bacteria
T. pallidum subsp. pallidum Nichols was originally obtained from J.N. Miller at the UCLA Geffen School of Medicine and cultured in vitro in TpCM-2 medium with Sf1Ep cottontail rabbit epithelial cells as previously described [22].
Two male New Zealand White rabbits were inoculated via intratesticular injection with 2-5 x 10⁷ T. pallidum per testis. Ten days after infection, rabbits were euthanized and the testes were aseptically removed and rinsed in phosphate buffered saline (PBS). Testes extracts were prepared by finely mincing the testes and stirring in extraction buffer (PBS with 20% heat-inactivated rabbit serum and 1 mM DTT, pre-equilibrated with 95% N₂:5% CO₂) for 10 min at room temperature, followed by centrifugation at 1000 x g for 2 x 5 min to remove rabbit tissue. The resulting supernatant containing T. pallidum was treated with RNAprotect Bacteria reagent (Qiagen) to stabilize the RNA and incubated at room temperature for 5 minutes. Bacteria were pelleted by centrifugation for 10 minutes at 10,000 x g and immediately used for RNA extraction. Approximately 1.6 x 10¹⁰ T. pallidum were isolated from each rabbit.
Two sets of three T75 flasks containing Sf1Ep cells and TpCM-2 medium were inoculated with T. pallidum grown continuously in long-term culture. Organisms were harvested from the flasks after 7 days of in vitro growth by removing the TpCM-2 medium and placing it into a 50 mL conical tube, then washing the flask with 2.5 mL trypsin-EDTA and adding the trypsin-EDTA wash to the conical tube containing the reserved TpCM-2 medium. An additional 2.5 mL of trypsin-EDTA was added to the flask, followed by incubation at 37˚C for 5 min to dissociate attached T. pallidum from the Sf1Ep cells. After trypsinization, the reserved TpCM-2 medium was added back to the flask, pipetted to resuspend the Sf1Ep-T. pallidum mixture, and returned to the conical tube. The tube was then centrifuged at 100 x g for 7 min to remove the Sf1Ep cells, and the resulting supernatant containing T. pallidum was immediately treated with RNAprotect Bacteria reagent (Qiagen) for 5 minutes at room temperature. Bacteria were pelleted by centrifugation for 10 minutes at 10,000 x g and immediately processed for RNA extraction as described below. Each flask yielded ~1 x 10⁹ T. pallidum and was processed separately to provide biological replicates.

RNA sequencing and data analysis
T. pallidum RNA was extracted from the rabbit and in vitro samples using a Qiagen RNeasy kit as per the manufacturer's instructions. To remove prokaryotic and eukaryotic DNA, on-column DNase digestion was performed using the Qiagen RNase-free DNase set. cDNA libraries were prepared with an Ovation Complete Prokaryotic RNAseq Library Preparation kit and sequenced on an Illumina NovaSeq6000 S4 system (150 bp paired-end reads) by Psomagen Inc. (Seoul, South Korea).
High throughput RNA sequencing reads were preprocessed using Cutadapt v2.3 with parameters set to remove standard Illumina sequencing adapters and enforce a minimum read length of 18 nt. Bowtie2 v2.3.4.1 was used to align the paired-end reads to NCBI RefSeq NC_021490 for T. pallidum using default parameters with seed substring lengths set to 18 nt [63]. Samtools was used to convert the resulting SAM files to BAM files and to sort the BAM files [64]. The name-sorted BAM files were used to create count tables using HTSeq with the minimum alignment quality filter set to 0 [65]. DESeq2 and R were used to perform differential expression analysis and to determine statistical significance [66]. GO term annotation was performed using InterProScan v5.36-75.0 [67]. GSEA was performed using adjusted p-value <0.05 as the cutoff for significance and the background gene set as all genes that received adjusted p-values. Default parameters were used with the following exceptions: TopGO v2.40 [68] was run with the weight01 algorithm, ClusterProfiler v3.16 [69] was run with 10,000 permutations and a maximum gene set size of 100, and GOseq v1.40 [70] was run using Benjamini-Hochberg probability corrections. Gene body coverage was calculated in R using RCoverage [71]. These analyses were performed in part using high-performance computing resources of the Texas Advanced Computing Center (TACC) at The University of Texas at Austin.
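The read-processing steps described above can be sketched as a sequence of command lines. The Python helper below only assembles the argument lists and does not run anything; it is an illustrative reconstruction under stated assumptions, not the authors' actual script. The adapter sequence, file names, index prefix, and annotation file name are all hypothetical, and any options not mentioned in the text are left at tool defaults.

```python
# Assumption: a commonly used Illumina adapter prefix; the exact adapter
# sequences trimmed in the study are not given in the text.
ADAPTER = "AGATCGGAAGAGC"


def build_commands(sample, reads_1, reads_2, index_prefix, annotation):
    """Return the four command lines (as argv lists) for one sample."""
    trimmed_1 = f"{sample}_R1.trimmed.fastq.gz"
    trimmed_2 = f"{sample}_R2.trimmed.fastq.gz"
    sam = f"{sample}.sam"
    bam = f"{sample}.namesorted.bam"
    return [
        # Trim adapters from both mates; discard reads shorter than 18 nt.
        ["cutadapt", "-a", ADAPTER, "-A", ADAPTER, "-m", "18",
         "-o", trimmed_1, "-p", trimmed_2, reads_1, reads_2],
        # Paired-end alignment with the seed length set to 18 nt.
        ["bowtie2", "-L", "18", "-x", index_prefix,
         "-1", trimmed_1, "-2", trimmed_2, "-S", sam],
        # Convert SAM to BAM and sort by read name.
        ["samtools", "sort", "-n", "-o", bam, sam],
        # Count reads per feature with minimum alignment quality 0.
        # Feature-type and ID-attribute options may also be needed,
        # depending on the annotation file; they are omitted here.
        ["htseq-count", "-f", "bam", "-r", "name", "-a", "0",
         bam, annotation],
    ]


# Hypothetical sample and file names.
commands = build_commands(
    "sample1", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz",
    "NC_021490_index", "NC_021490.gff",
)
```

Each list could be passed to `subprocess.run(cmd, check=True)`, with the output of `htseq-count` redirected to a file to produce the per-sample count table used for differential expression analysis.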

Validation of differential expression with qRT-PCR
The same RNA preparations used for RNA sequencing were used for qRT-PCR validation. Primer sets were designed using the Realtime PCR Tool (Integrated DNA Technologies; https://www.idtdna.com/scitools/Applications/RealTimePCR/), setting the PCR product length between 150 and 200 base pairs (S3 Table). An Invitrogen SuperScript First-Strand cDNA Synthesis Kit was used to synthesize cDNA from the T. pallidum RNA samples following the manufacturer's directions. qRT-PCR reactions were assembled using 20 ng cDNA and 10 pmol of each primer with an iQ SYBR Green Supermix (Bio-Rad). Reactions were performed on a Bio-Rad C1000 Touch Thermal Cycler using a program of 95˚C for 2 min followed by 39 cycles of 95˚C for 5 s and 60˚C for 30 s. All eight biological samples were run with three technical replicates, no-template controls, and no-RT controls. The resulting data were analyzed by the relative quantification method, where the average ΔΔCT values from the three technical replicates of the gene of interest were normalized to the values of the three technical replicates of the control gene, TPANIC_0426, for each of the 8 biological replicates [26].
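As an illustration of relative quantification, the sketch below implements the widely used 2^(-ΔΔCT) calculation in Python. Whether the authors used exactly this form is an assumption, as is the 100% amplification efficiency that the formula implies. The CT values are hypothetical; "reference" stands for the control gene (TPANIC_0426 in this study).

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean CT of the gene of interest minus mean CT of the control gene,
    each averaged over its technical replicates."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(dct_condition, dct_baseline):
    """Relative expression of condition vs. baseline as 2^(-ddCT).

    Assumption: amplification efficiency is 100% (doubling per cycle).
    """
    return 2.0 ** -(dct_condition - dct_baseline)


# Hypothetical CT values for one rabbit sample and one in vitro sample,
# three technical replicates each.
rabbit_dct = delta_ct([22.0, 22.0, 22.0], [20.0, 20.0, 20.0])
in_vitro_dct = delta_ct([24.0, 24.0, 24.0], [20.0, 20.0, 20.0])

# The target crosses threshold two cycles earlier in the rabbit sample
# after normalization, which corresponds to a four-fold higher level.
rabbit_vs_in_vitro = fold_change(rabbit_dct, in_vitro_dct)
```

Normalizing to a control gene within each sample corrects for differences in the amount of T. pallidum cDNA between samples, which matters here because the proportion of rabbit-derived RNA varied widely among preparations.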

Supporting information
S1 Table. Read pairs generated from the in vitro and rabbit samples during RNA sequencing. The total number of read pairs per sample and the number of read pairs that map to the T. pallidum genome were generated by HTSeq with the minimum alignment quality set to 0. (PDF)
S2 Table. Differential expression analysis of the in vitro and rabbit samples. Average normalized RNA-seq counts, log₂-fold difference values (combined rabbit/in vitro), and adjusted p-values (combined rabbit vs. in vitro) obtained for T. pallidum subsp. pallidum Nichols during in vitro culture and rabbit infection. T. pallidum ORF numbers, gene IDs, and coordinates are from NCBI RefSeq entry NC_021490. ORF numbers in blue indicate genes with significantly higher transcript levels in infected rabbits, while ORF numbers in orange indicate genes with significantly higher transcript levels in vitro. Functional roles based on Gene Ontology (GO) terms (QuickGO). Unless otherwise indicated, functional categories based on [26]. (XLSX)
S3 Table. Primer sets used for qRT-PCR. (PDF)