The genus Borrelia reloaded

The genus Borrelia, originally described by Swellengrebel in 1907, contains tick- or louse-transmitted spirochetes belonging to the relapsing fever (RF) group of spirochetes, the Lyme borreliosis (LB) group of spirochetes and spirochetes that form intermittent clades. In 2014 it was proposed that the genus Borrelia should be separated into two genera: Borrelia Swellengrebel 1907 emend. Adeolu and Gupta 2014, containing RF spirochetes, and Borreliella Adeolu and Gupta 2014, containing the LB group of spirochetes. In this study we conducted an analysis based on a method that is suitable for bacterial genus demarcation, the percentage of conserved proteins (POCP). We included RF group species, LB group species and two species belonging to intermittent clades, Borrelia turcica Güner et al. 2004 and Candidatus Borrelia tachyglossi Loh et al. 2017. These analyses convincingly showed that all groups of spirochetes belong to one genus, and we propose to emend the genus Borrelia and re-unite all groups in it.


Introduction
The spirochete genus Borrelia, named after the French biologist Amédée Borrel, was originally described in 1907 by Swellengrebel [1], with B. anserina (Sakharoff 1891) Bergey et al. 1925 designated as the type species. Since then numerous species and strains have been described, and members of this genus are well recognized as the aetiological agents of Lyme borreliosis (LB) and relapsing fever (RF) in humans. Lyme borreliosis and RF genospecies have long been recognized to have different clinical, biological, and epidemiological characteristics, and phylogenetic data are concordant with this, demonstrating that these two groups are genetically similar yet distinct, and form independent monophyletic sister clades that share a common ancestor [2].
Nevertheless, LB and RF Borrelia share a common set of genetic and biological characteristics that unify these organisms as a group compared to other related spirochetes. Namely, all LB and RF Borrelia species are spirochetes with an obligate parasitic lifestyle, are transmitted between vertebrate hosts by arthropod vectors (ticks and lice), and can be transstadially transmitted within their arthropod vectors. Various vector associations of Borrelia have been found in nature, with the genus Ixodes mainly vectoring LB species while argasid ticks often vector the RF group. However, some members of the RF group are associated with hard ticks [...] genomes (where possible), including two new Borrelia genomes from the novel reptile- and echidna-associated clade, B. turcica and 'Candidatus Borrelia tachyglossi', which have yet to be analyzed in this context. We also re-examine the conserved signature indels (CSIs) previously used to support the delineation of LB and RF Borrelia in the genomes of B. turcica and 'Candidatus Borrelia tachyglossi' to establish whether these molecular markers are useful for establishing the relationships of these two species. Our analyses indicate that insufficient genomic divergence exists between LB and RF Borrelia to consider them separate genera, and that Borrelia CSIs are limited in their ability to unambiguously distinguish the taxonomic identity of B. turcica and 'Candidatus Borrelia tachyglossi'.

Strains included in this study
In order to accurately assess Borrelia intra-genus POCP, the proteomes of 30 Borrelia species strains, including n = 17 strains from the LB group, n = 11 from the RF group, and n = 2 from the reptile and echidna-associated group, were retrieved from GenBank (National Center for Biotechnology Information (NCBI), Bethesda (MD), https://www.ncbi.nlm.nih.gov/) or sequenced and assembled from low passage type cultures except 'Candidatus B. tachyglossi' which was sequenced from a single tick [12,19]. A tree summarizing the phylogenetic relationship based on 791 homologous proteins is shown in Fig 1. A full summary of strains used is presented in Table 1. To determine the levels of inter-genera POCP within the order Spirochaetales, an additional 54 proteomes, including n = 8 Brachyspira, n = 21 Leptospira, n = 5 Spirochaeta, and n = 20 Treponema species, were retrieved from GenBank (S1 Table) and included in the POCP analysis.

Sequence analyses
POCP analysis was performed according to Qin et al. [18] and as described in [20]. Briefly, reciprocal BLASTP [21] searches were used to identify homologous proteins in each genome pair. Proteins were considered to be conserved if the BLAST matches had an E-value of < 1e-5, >40% sequence identity and an alignment covering >50% of the query sequence in each of the reciprocal searches. The POCP value for a genome pair was then determined as $[(C_1 + C_2)/(T_1 + T_2)] \times 100$, where $C_1$ and $C_2$ are the numbers of conserved proteins in the two genomes of the pair, and $T_1$ and $T_2$ are the total numbers of proteins in each genome being compared [18]. Scripts used for these analyses are available upon request.
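The POCP calculation can be illustrated with a minimal Python sketch. This is not the authors' script (those are available on request); the `Hit` record, its field names and the toy hits are assumptions made for illustration, and the filter simply applies the three cut-offs stated above to already-parsed BLASTP results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One parsed BLASTP match (hypothetical record layout)."""
    query: str          # identifier of the query protein
    evalue: float       # BLASTP E-value
    identity: float     # percent sequence identity of the alignment
    query_cover: float  # percent of the query sequence covered by the alignment


def conserved_count(hits, evalue_max=1e-5, identity_min=40.0, cover_min=50.0):
    """Count distinct query proteins with at least one hit passing all cut-offs."""
    passed = {
        h.query
        for h in hits
        if h.evalue < evalue_max
        and h.identity > identity_min
        and h.query_cover > cover_min
    }
    return len(passed)


def pocp(c1, c2, t1, t2):
    """POCP = (C1 + C2) / (T1 + T2) * 100 for one genome pair."""
    if t1 + t2 == 0:
        raise ValueError("both genomes have zero proteins")
    return (c1 + c2) / (t1 + t2) * 100.0


# Toy data (invented): genome 1 has 4 proteins, genome 2 has 2 proteins.
hits_1_vs_2 = [
    Hit("p1", 1e-30, 85.0, 95.0),  # passes
    Hit("p1", 1e-10, 60.0, 70.0),  # passes, but p1 is only counted once
    Hit("p2", 1e-8, 45.0, 80.0),   # passes
    Hit("p3", 1e-3, 90.0, 90.0),   # fails the E-value cut-off
]
hits_2_vs_1 = [
    Hit("q1", 1e-25, 85.0, 90.0),  # passes
    Hit("q2", 1e-20, 35.0, 90.0),  # fails the identity cut-off
]

c1 = conserved_count(hits_1_vs_2)
c2 = conserved_count(hits_2_vs_1)
pocp_value = pocp(c1, c2, t1=4, t2=2)
```

In this toy case three of the six proteins are conserved, so the pair sits exactly on the 50% value used as the genus threshold.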
The CSIs presented in Adeolu and Gupta [14] that are differentially present in LB and RF genomes were reinvestigated in the genomes of B. turcica and 'Candidatus Borrelia tachyglossi' to establish whether these molecular markers are useful for classifying their taxonomic relationships. To identify CSIs, the conserved amino acid sequences flanking the CSIs were searched against the proteomes of all 30 Borrelia genomes used here using BLASTP [21]. Hits from the matching protein in all 30 Borrelia proteomes were aligned with MUSCLE [22], and visually inspected for the presence of CSIs. The presence of previously defined conserved signature proteins (CSPs) in the genomes of B. turcica and 'Candidatus Borrelia tachyglossi' was determined using BLASTP searches as described in Adeolu and Gupta [14].
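In this study the alignments were inspected visually. As a purely illustrative alternative, the following Python sketch shows how the form of a CSI could be scored programmatically once an alignment is available. The function names, the column coordinates and the toy sequences are hypothetical, and the sketch assumes that the indel region and one reference sequence per genogroup are already known.

```python
def residues_in_region(aligned_seq, start, end):
    """Number of non-gap residues in alignment columns [start, end)."""
    region = aligned_seq[start:end]
    return len(region) - region.count("-")


def classify_csi(alignment, start, end, lb_ref, rf_ref, query):
    """Report whether the query carries the LB or the RF form of an indel.

    alignment maps sequence names to gapped strings of equal length.
    """
    lb = residues_in_region(alignment[lb_ref], start, end)
    rf = residues_in_region(alignment[rf_ref], start, end)
    q = residues_in_region(alignment[query], start, end)
    if lb == rf:
        return "uninformative"  # references do not differ in this region
    if q == lb:
        return "LB-form"
    if q == rf:
        return "RF-form"
    return "other"


# Toy alignment (invented sequences): the RF reference has a
# two-residue insertion in columns 3-4 that the LB reference lacks.
alignment = {
    "LB_ref": "MKT--LVA",
    "RF_ref": "MKTGSLVA",
    "query1": "MKTGSLIA",
    "query2": "MRT--LVA",
}

form_query1 = classify_csi(alignment, 3, 5, "LB_ref", "RF_ref", "query1")
form_query2 = classify_csi(alignment, 3, 5, "LB_ref", "RF_ref", "query2")
```

Comparing only the indel length is a simplification; a visual check, as performed here, also takes the flanking conserved residues into account.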

Results and discussion
In order to determine whether the 50% POCP threshold for genus delineation was appropriate for spirochete taxa, we used pairwise POCP analysis to determine the inter-genera POCP values for 84 spirochete genomes from the genera Borrelia, Brachyspira, Leptospira, Spirochaeta, and Treponema. Among all spirochete genomes investigated, inter-genera POCP values ranged between 4.8% and 36.8% (mean 10.1%), indicating that a low degree of protein conservation occurs between spirochete genera (S1 Table). These spirochete inter-genera POCP values are at least 13.2 percentage points lower than the 50% value determined by Qin et al. [18], suggesting this value is an appropriate and highly conservative threshold for spirochete genera delineation. Compared to the low level of protein conservation measured between spirochete genera, POCP values were considerably higher among Borrelia species. Borrelia POCP values were highest among the LB genospecies, which ranged between 81.1% and 94.4% (mean 90.2%), while POCP values between RF species were generally lower and more variable, ranging between 65.3% and 93.1% (mean 81.1%) (Figs 2 and 3).
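The range and mean figures reported above are summaries over all genome pairs. A minimal Python sketch of such a summary is given below; the genome names, genus labels and POCP values are invented for illustration and are not taken from S1 Table.

```python
from statistics import mean

GENUS_THRESHOLD = 50.0  # POCP genus boundary proposed by Qin et al. [18]


def summarise(pocp_by_pair, genus_of):
    """Return (min, max, mean) of POCP for intra- and inter-genus pairs."""
    groups = {"intra": [], "inter": []}
    for (a, b), value in pocp_by_pair.items():
        key = "intra" if genus_of[a] == genus_of[b] else "inter"
        groups[key].append(value)
    return {k: (min(v), max(v), mean(v)) for k, v in groups.items() if v}


# Toy input (invented): two genomes of genus A and one of genus B.
genus_of = {"A1": "A", "A2": "A", "B1": "B"}
pocp_by_pair = {
    ("A1", "A2"): 80.0,
    ("A1", "B1"): 10.0,
    ("A2", "B1"): 20.0,
}

summary = summarise(pocp_by_pair, genus_of)
# True when every inter-genus pair lies below the genus threshold.
inter_below_threshold = summary["inter"][1] < GENUS_THRESHOLD
```

Applied to the values reported here, the largest inter-genera value (36.8%) lies below the 50% threshold, whereas the smallest value among RF species (65.3%) lies above it.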
The original proposal to delineate LB and RF Borrelia was largely based on the occurrence of 53 CSIs that have different forms in LB and RF genogroups [14]. It was subsequently defended by suggesting that novel Borrelia species that group with, or as an outgroup to, RF Borrelia would be expected to contain RF-specific CSIs and generally none specific to the LB group [23]. An examination of these 53 CSIs in B. turcica and 'Candidatus Borrelia tachyglossi' genomes shows that, although the majority of CSIs present in these genomes correspond to the RF-specific form of the indels, 9/53 (17.0%) and 11/53 (20.8%) of the CSIs in B. turcica and 'Candidatus Borrelia tachyglossi', respectively, correspond to LB-specific forms (Fig 4; Table 2).
Thus, the results of our analysis using genospecies that were originally defined as belonging to the genus Borrelia showed a very clear pattern. The results demonstrate that LB and RF Borrelia genogroups lack sufficient proteomic differentiation to be classified as different genera according to the POCP threshold determined by Qin et al. [18]; the analysis of inter-genus POCP supported the classification of the five closely related Spirochaetales genera. Therefore, we propose to formally reestablish the genus Borrelia in its original form, including species of the LB, RF, and reptile- and echidna-associated genogroups. Additionally, up to 20.8% of the CSIs identified as having genogroup-specific forms were not concordant with the phylogenetic position of B. turcica and 'Candidatus Borrelia tachyglossi' as predicted previously [23].
Although categorical molecular markers such as these have been previously used in clarifying prokaryotic taxonomy, here these markers appear to have limited utility in resolving the taxonomic classification of novel Borrelia species. The reptile- and echidna-associated Borrelia clade to which B. turcica and 'Candidatus B. tachyglossi' belong is a very recently described group of Borrelia for which several novel variants have been described based on single- or multi-gene phylogenetic analyses. Although this group clearly shares a more recent common ancestor with RF Borrelia, the presence of LB-specific CSIs and high protein conservation with LB species suggests that these Borrelia may share common genetic and biological characteristics with LB species. Both the POCP and the CSI analyses supported a continuum of Borrelia species between LB and RF, which now includes B. turcica and 'Candidatus B. tachyglossi'. These data suggest that the genus Borrelia, in the form in which it was originally described and is proposed here, represents a continuum with RF and LB group species at the extreme ends of the genus, and with reptile- and echidna-associated and other Borrelia species (perhaps still to be discovered) sharing a unique mixture of features from both RF and LB groups.
In our study we included as many type strains as possible, as type strains are the representatives of the species and can be obtained from microbial culture collections. However, for two of the species belonging to the LB group of spirochetes, genomic data of the type strains were not available to us. As a surrogate we used genomic data available for closely related strains of these species, i.e. PKo for B. afzelii and A14S for B. spielmanii. Previous data on multilocus sequence typing have shown that these two isolates are closely related to the type strain of the respective genospecies and fall into the same phylogenetic cluster [24,25].

Fig 2. https://doi.org/10.1371/journal.pone.0208432.g002

Fig 3. Percentage of conserved proteins (POCP) matrix generated by the method described in [18]. POCP values of species belonging to the LB group, RF group of spirochetes, the reptile-associated species B. turcica and echidna-associated species B. tachyglossi are above the genus threshold of 50%, indicating that all belong to one bacterial genus, Borrelia.
https://doi.org/10.1371/journal.pone.0208432.g003

Conclusion
The data presented in this study very clearly demonstrate that all groups investigated, i.e. RF group spirochetes, LB group spirochetes, and reptile- and echidna-associated Borrelia species, belong to the same genus, as values for POCP were consistently above the proposed threshold for genus delimitation. We propose to re-establish the genus Borrelia in its original form.

[...] [26,27]. B. afzelii strains are also distinguishable from all other LB species by using multilocus sequence analysis [24].

Supporting information
S1 Table. Additional proteomes from the order Spirochaetales included in this study. A distance matrix of POCP values is given for genera included and shows that the within-genus percentages are generally higher than 50% except for Treponema. (XLSX)