Table 1.
Treponemal strains reported in this study.
Figure 1.
tpr alleles among treponemal species, subspecies and strains.
tprD2: A tprD allele that contains a 330-bp unique central region and three smaller heterogeneous regions at the 3′ end. tprC-like and tprD-like: similar to tprC or tprD, respectively, with small sequence differences in discrete variable regions (DVRs). tprGJ: A chimera where the 3′ end contains tprJ signatures. tprGI: A chimera where the 5′ end is homologous to tprG, and the central and 3′ regions are homologous to the corresponding regions of tprI. truncated: Predicted truncated proteins due to a frameshift. tprA-like, tprI-like and tprL-like contain small sequence differences. tprE-like and tprH-like contain small sequence differences that segregate syphilis from non-syphilis treponemes. tprL1: A unique tprL allele in T. p. pertenue and the Fribourg-Blanc strain. * indicates that tprC and tprD are identical in the Nichols strain, and that tprI-like sequences in the tprF and tprI loci are also identical in the pertenue subspecies and the Fribourg-Blanc treponeme.
Figure 2.
Allelic variants at the tprC and tprD loci.
Four allelic combinations are found at these two loci: 1) identical tprC and tprD, 2) tprC-like and tprD-like, 3) tprC-like and tprD2, and 4) tprD2 truncated and tprD2 truncated. Same background color indicates sequence identity. Vertical blue lines indicate discrete variable regions (DVRs), in which the great majority of mutations are non-synonymous. Green color indicates unique tprD2 signatures. Light blue background indicates regions of the ORFs in the tprD2 alleles that are predicted to be untranslated because a single-nucleotide insertion causes a frameshift and a premature stop.
Figure 3.
Structural models of TprC/D and TprI.
Non-templated 3D models generated for the mature Nichols TprC/D and Nichols TprI peptides using the TMBpro algorithm [61] suggest a typical β-barrel structure. DVR, discrete variable region. EL, external loop. Variable regions, DVR1–DVR7 for TprC/D and DVR1–DVR9 for TprI, as defined by protein sequence alignments (Figures S1 and S2), are indicated in red (loops, font and arrows). Note that each DVR co-localizes with a predicted EL. Orientation of the structure was determined as specified by Randall et al. [61]. Proposed conserved and variable surface-exposed loops are highlighted in blue and red, respectively, and proposed periplasm-exposed regions of the proteins are in purple.
Figure 4.
Top: proteins encoded in the tprG locus. Bottom: proteins encoded in the tprJ locus. Regions of the same color indicate sequence identity among gene products. Five different variants can be identified in the tprG locus among the 12 strains analyzed. Nichols, Bal3 and Street 14 encode the Nichols reference TprG. Sea81-4 encodes a truncated TprG. The TprGJ chimera is found in Mexico A, Samoa D, Gauthier and CDC2; the TprGI chimera is found in Iraq B and Bosnia A, and a truncated TprGI chimera is found in Cuniculi A and Fribourg-Blanc. At the tprJ locus, Nichols, Bal3, Mexico A and Street 14 contain the Nichols TprJ. Sea81-4 and all non-syphilis strains carry the TprGJ hybrid. Green, signal peptide.
Figure 5.
Encoded variants at the tprA locus.
The Nichols, Bal3, Mexico A, and Street 14 isolates carry a gene encoding a truncated protein as a result of the presence of only 3 CT dinucleotide repeats. A gene coding for a full-length protein, which contains 4 CT dinucleotide repeats, is found in Sea81-4, Gauthier, Samoa D, CDC2, Iraq B, Bosnia A, Fribourg-Blanc and Cuniculi A. Blue color, unique sequence as a result of frameshifting. Green, signal peptide.
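The effect of the CT repeat number on the reading frame can be illustrated with a minimal sketch. The sequence below is invented for illustration and is not the tprA sequence; the helper names `build_gene` and `orf_length_codons` are hypothetical. The sketch only shows why losing one CT dinucleotide (2 bp, not a multiple of 3) shifts the downstream frame and can expose a premature stop codon.

```python
# Toy illustration only: the flanking sequences are made up, NOT real tprA sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}

UPSTREAM = "ATGGCA"                   # hypothetical in-frame 5' segment (2 codons)
DOWNSTREAM = "TGACCGGAAAGCTTCGTAAG"   # hypothetical 3' segment


def build_gene(ct_repeats: int) -> str:
    """Assemble the toy gene with the given number of CT dinucleotide repeats."""
    return UPSTREAM + "CT" * ct_repeats + DOWNSTREAM


def orf_length_codons(seq: str) -> int:
    """Count codons from position 0 up to (not including) the first in-frame stop."""
    count = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


if __name__ == "__main__":
    for repeats in (4, 3):
        gene = build_gene(repeats)
        print(repeats, "CT repeats:", orf_length_codons(gene), "codons before stop")
```

In this toy construct, 4 CT repeats keep the longer frame open, while 3 CT repeats bring a TGA triplet into frame right after the repeat tract, mimicking the truncation described in the caption.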
Figure 6.
Encoded variants at the tprL (tp1031) locus.
Coding sequences: Three different coding sequences have been identified for treponemal species and subspecies: the proposed tprL ORFs in the Nichols and Street 14 genome sequences; an extended tprL for pallidum, endemicum, and paraluiscuniculi strains; and a fused tprL (called tprL1) for pertenue and the Fribourg-Blanc strains. The Nichols ORF was predicted to be 1542 bp, although it lacks identifiable promoter elements upstream. In this study, an extended tprL of 1806 bp has been identified in the Nichols and other pallidum strains, as well as in endemicum and paraluiscuniculi strains. The initially shorter Nichols tprL was the result of sequencing errors in the reported Nichols genome sequence [27]. Typical promoter elements are shown for the extended tprL ORF (SC, start codon. RBS, ribosomal binding site. +1, transcriptional start site (TSS). −10 and −35, σ70 signatures). A deletion of 278 bp (274 bp of the 5′ end of tp1030, whose coding sequence is located on the minus strand, and 4 bp of the 5′ end of the genome-derived tprL) creates an alternative start site in tp1030 for pertenue and Fribourg-Blanc tprL1, resulting in a shorter ORF of 1668 bp. This ORF, however, lacks recognizable promoter elements. Encoded proteins: Differences in coding sequences result in two different proteins: 1) a shorter pertenue/Fribourg-Blanc variant with a unique amino terminus of 44 amino acids and 2) a longer TprL in the remaining species/subspecies. A predicted signal peptide 25 amino acids long (green) is present in the longer product but is not identifiable in the pertenue/Fribourg-Blanc gene product. Blue color, region unique to pertenue and Fribourg-Blanc strains (132 nucleotides or 44 amino acids). Red color, region unique to the pallidum, endemicum and paraluiscuniculi species/subspecies (65 amino acids).
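The length figures quoted in this legend can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the numbers stated above. The names `nt_to_codons` and `ORF_LENGTHS_BP` are hypothetical, and the legend does not say whether the stop codon is counted in the ORF lengths, so the codon counts are not converted to protein lengths.

```python
# Consistency check on the lengths quoted in the Figure 6 legend.
# Assumption: nothing here is derived from sequence data, only from the stated numbers.

ORF_LENGTHS_BP = {
    "Nichols genome-derived tprL": 1542,
    "extended tprL": 1806,
    "tprL1 (pertenue / Fribourg-Blanc)": 1668,
}


def nt_to_codons(nucleotides: int) -> int:
    """Convert a nucleotide length to codons, rejecting lengths that are not a multiple of 3."""
    if nucleotides % 3 != 0:
        raise ValueError(f"{nucleotides} nt is not a whole number of codons")
    return nucleotides // 3


# The region unique to pertenue and Fribourg-Blanc: 132 nt should equal 44 amino acids.
unique_region_aa = nt_to_codons(132)

# The 278-bp deletion is described as 274 bp of tp1030 plus 4 bp of tprL.
deletion_bp = 274 + 4

if __name__ == "__main__":
    print("unique region:", unique_region_aa, "amino acids")
    print("deletion:", deletion_bp, "bp")
    for name, length in ORF_LENGTHS_BP.items():
        print(name, "=", nt_to_codons(length), "codons")
```

All three ORF lengths are whole multiples of 3, as expected for open reading frames, and the two component lengths of the deletion sum to the stated 278 bp.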