Figure 1.
Chaperone-usher fimbrial gene clusters (FGCs) of Salmonella.
A phylogenetic tree was built for 35 types of FGCs from the amino acid sequences of the combined 950 usher proteins from 90 genomes (MEGA 5.0, as described in Methods). The FGCs were divided into five clades. The scale indicates the number of substitutions per amino acid. The bottom box lists the different protein domain families. The asterisk indicates that some subunits were not detected by CDD or InterProScan, but (i) showed sequence similarity with other subunit(s) in the same gene cluster and (ii) were typically β-sheet-rich, as are all fimbrial subunits. Framed arrows indicate known or predicted adhesins (as described in the text). C, V, VV and VVV define the level of amino acid sequence variability for each subunit. C indicates subunits for which only one sequence was available, or subunits lacking variants; V, VV and VVV indicate ≤1, 1–10 and >10 detected variations per 100 amino acids, respectively.
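The variability codes in this legend amount to a simple threshold rule, which the following minimal sketch illustrates. The function name and its parameters are hypothetical. How variations were counted is not stated in the legend, so the sketch assumes a count of variable positions over the subunit length, and it assumes the boundary values 1 and 10 fall in the lower class.

```python
def variability_class(n_variations: int, length: int, n_sequences: int = 2) -> str:
    """Return C, V, VV or VVV for one fimbrial subunit.

    n_variations: detected variations (assumed: variable positions)
    length:       subunit length in amino acids
    n_sequences:  number of sequences available for the subunit
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # C: only one sequence available, or no variants detected
    if n_sequences <= 1 or n_variations == 0:
        return "C"
    per_100 = 100.0 * n_variations / length  # variations per 100 amino acids
    if per_100 <= 1:
        return "V"
    if per_100 <= 10:   # assumption: exactly 10 is still VV
        return "VV"
    return "VVV"


if __name__ == "__main__":
    # Hypothetical subunit of 200 residues with 10 variable positions
    print(variability_class(10, 200))
```

For a hypothetical 200-residue subunit with 10 variable positions, the rate is 5 per 100 amino acids, which falls in the VV class.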
Figure 2.
Salmonella and FGCs co-evolution model.
Proposed tree that includes E. coli and the two Salmonella species, S. bongori and S. enterica, the latter being divided into seven subspecies (monophasic IIIa, IV, VII and diphasic I, VI, II, and IIIb; subsp. V is now S. bongori). FGCs shown in red are suggested to have been acquired by HGT. FGCs shown in blue have diverged from orthologous E. coli or other Salmonella FGCs. FGCs that were lost are shown in purple, and FGCs that were duplicated are shown in green. A dotted line separates the subspecies based on the presence of one or two flagellin genes. A dotted frame encloses all S. enterica subsp. I. The five clades correspond to the ones shown in Figure S3. The asterisks indicate that sdi and sbb were found in the integrative and conjugative element ICESe3 region of Salmonella enterica subsp. VII strain SARC16, suggesting independent acquisitions of these FGCs.
Figure 3.
Correlation of Salmonella phylogenomic groups with specific collections of FGCs.
On the left, phylogenomic tree of 90 Salmonella and two E. coli control strains, based on 45 highly conserved housekeeping genes totaling ∼43 kb. Clades 1 to 4 correspond to the clades shown in Figure 2, and clade 5 includes the few sequenced genomes from strains that were not S. enterica subsp. I. The scale indicates the number of substitutions per nucleotide. At the top and in the heat map, hierarchical clustering support tree for the FGCs (MeV, complete linkage method with a Euclidean distance threshold of 9.525, http://www.tm4.org). FGCs with or without pseudogenes are shown as green or red rectangles, respectively. On the right, Salmonella serovars (somatic O and flagellar H antigens).
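To make the clustering named in this legend concrete, here is a minimal pure-Python sketch of complete-linkage agglomerative clustering with a Euclidean distance cutoff. It is not the MeV implementation. The FGC names and presence/absence profiles are hypothetical toy data, and treating the threshold as the distance above which clusters are no longer merged is an assumption about how the cutoff was applied.

```python
from math import dist  # Euclidean distance, Python 3.8+


def complete_linkage(profiles, threshold):
    """Merge clusters until the closest pair is farther apart than `threshold`.

    profiles:  dict mapping a name to a numeric vector (e.g. presence/absence
               of an FGC across genomes)
    threshold: Euclidean distance cutoff (assumed semantics, see lead-in)
    """
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Complete linkage: distance between two clusters is the largest
        # pairwise distance between their members.
        return max(dist(profiles[x], profiles[y]) for x in a for y in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical presence (1) / absence (0) profiles across four genomes
profiles = {
    "fgcA": (1, 1, 1, 1),
    "fgcB": (1, 1, 1, 0),
    "fgcC": (0, 0, 0, 1),
    "fgcD": (0, 0, 0, 0),
}

if __name__ == "__main__":
    print(complete_linkage(profiles, 1.5))
```

With this toy data and a cutoff of 1.5, the two FGCs present in most genomes group together and the two that are mostly absent form a second group. The cutoff of 9.525 quoted in the legend applies to the full 90-genome data set and would merge everything in this small example.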
Figure 4.
Predicted structural model of the Salmonella FimH fimbrial adhesin.
The structure of the Salmonella FimH protein was modeled on the template structure 1klf (Protein Data Bank) of the E. coli FimH adhesin [119]. On the left, ribbon model of the predicted structure of Salmonella FimH with its lectin and pilin domains, each with one disulfide bond. β-barrels are shown in yellow and α-helices are shown in pink. On the right, the variable amino acid positions are shown in a tube-rendering model of the FimH backbone structure, with a color gradation from blue (most conserved residues) to red (most variable positions). None of the naturally variable positions was located in the predicted binding pocket, shown as green circles on both models.