The whole genome analysis of two strains of the first intermediately pathogenic leptospiral species to be sequenced (Leptospira licerasiae strains VAR010 and MMD0835) provides insight into their pathogenic potential and deepens our understanding of leptospiral evolution. Comparative analysis of eight leptospiral genomes shows the existence of a core leptospiral genome comprising 1547 genes and 452 conserved genes restricted to infectious species (including L. licerasiae) that are likely to be pathogenicity-related. Comparisons of the functional content of the genomes suggest that L. licerasiae retains several proteins related to nitrogen, amino acid and carbohydrate metabolism, which might help to explain why these Leptospira grow well in artificial media compared with pathogenic species. L. licerasiae strains VAR010T and MMD0835 possess two prophage elements. While one element is circular and shares homology with LE1 of L. biflexa, the second is cryptic and homologous to a previously identified but unnamed region in L. interrogans serovars Copenhageni and Lai. We also report a unique O-antigen locus in L. licerasiae comprising a 6-gene cluster that is unexpectedly short compared with L. interrogans, in which analogous regions may include >90 such genes. Sequence homology searches suggest that these genes were acquired by lateral gene transfer (LGT). Furthermore, seven putative genomic islands ranging in size from 5 to 36 kb are present, also suggestive of antecedent LGT. How Leptospira become naturally competent remains to be determined, but considering the phylogenetic origins of the genes comprising the O-antigen cluster and other putative laterally transferred genes, L. licerasiae must be able to exchange genetic material with non-invasive environmental bacteria. The data presented here demonstrate that L. licerasiae is genetically more closely related to pathogenic than to saprophytic Leptospira and provide insight into the genomic bases for its infectiousness and its unique antigenic characteristics.
Leptospirosis is one of the most common diseases transmitted by animals worldwide. It is a major cause of febrile illness in tropical areas and also occurs in epidemic form associated with natural disasters and flooding. The mechanisms through which Leptospira cause disease are not well understood. In this study we have sequenced the genomes of two strains of Leptospira licerasiae isolated from a person and a marsupial in the Peruvian Amazon. These strains were thought to be able to cause only mild disease in humans. We have compared these genomes with those of other leptospires that can cause severe illness and death and with that of another leptospire that does not infect humans or animals. These comparisons have allowed us to demonstrate similarities among the disease-causing Leptospira. Studying genes that are common among infectious strains will allow us to identify genetic factors necessary for infecting, causing disease and determining the severity of disease. We have also found that L. licerasiae seems to be able to take up and incorporate genetic information from other bacteria found in the environment. This information will allow us to begin to understand how Leptospira species have evolved.
Citation: Ricaldi JN, Fouts DE, Selengut JD, Harkins DM, Patra KP, Moreno A, et al. (2012) Whole Genome Analysis of Leptospira licerasiae Provides Insight into Leptospiral Evolution and Pathogenicity. PLoS Negl Trop Dis 6(10): e1853. doi:10.1371/journal.pntd.0001853
Editor: Mathieu Picardeau, Institut Pasteur, France
Received: April 9, 2012; Accepted: August 25, 2012; Published: October 25, 2012
Copyright: © 2012 Ricaldi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by the following grants from the U.S. Public Health Service, National Institutes of Health: K24AI068903, D43TW007120, and RO1TW05860. This project was also funded in part with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under contract number HHSN272200900007C and an NIH/NIAID Genome Sequencing Center Contract. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Leptospirosis is a globally important infectious disease that takes a disproportionate toll in tropical regions. It is caused by more than 250 serovars of spirochetes distributed among nine species of pathogenic Leptospira and at least five known species of intermediate Leptospira, and its burden falls predominantly on people living in poverty and under inadequate sanitary conditions. Yet, pathogenic mechanisms in leptospirosis remain poorly understood. Reasons for the varying potentials of different varieties of Leptospira to cause human disease have not been explored. Mechanisms of leptospiral tropisms for different mammalian reservoir hosts are unknown. Lateral transfer of DNA has been observed in Leptospira, but mechanisms for such transfer have yet to be defined. The present study was designed to gain insight into the evolution of intermediate Leptospira with the highest degree of resolution currently possible (comparative whole genome analysis) and to explore the degree to which evidence might link this leptospiral clade to an evolutionary position between pathogenic and saprophytic Leptospira clades, as suggested by phylogenetic analysis of 16S rRNA gene sequences.
DNA-relatedness and phylogenetic analyses have resolved the genus Leptospira into three distinct lineages comprising 20 species: nine pathogens, five intermediates and six saprophytes. Pathogenic Leptospira are capable of infecting and causing disease in humans and animals; intermediate Leptospira are able to infect humans and animals and cause a variety of clinical manifestations, although less frequently; saprophytic Leptospira are environmental bacteria that do not infect mammals at all. Genome sequencing efforts have so far focused on pathogenic (L. interrogans and L. borgpetersenii) and saprophytic (L. biflexa) species. Genomic comparisons indicate that while the L. biflexa genome is relatively stable, the genomes of pathogenic species have undergone considerable insertion sequence-mediated rearrangement. Considerable genomic plasticity exists even within the same species. For example, an ~54 kb genomic island and a large inversion in Chromosome I differentiate the L. interrogans sv. Lai and Copenhageni genomes, whose coding sequences are ~99% similar at the amino acid level. A comparison of in vitro growth characteristics also indicates that the third lineage of Leptospira, which includes L. licerasiae, occupies an intermediate position between the pathogenic and saprophytic species. Despite reference to intermediate Leptospira as “saprophytic intermediates,” convincing clinical data confirm the pathogenicity of these Leptospira. Knowledge of the genomic content of these intermediate species is necessary to complete our understanding of leptospiral evolution.
In this study, we sequenced and annotated genomes of L. licerasiae sv. Varillal strains VAR010 and MMD0835, the first intermediate species to be sequenced. In view of the range of stresses encountered by pathogenic bacteria during the course of infection, it is becoming apparent that in addition to virulence factors such as hemolysins there are additional proteins or contributory (pathogenicity-associated) factors involved in stress management strategies that are essential for successful infection. Genomic comparisons of the infectious species L. licerasiae, L. interrogans, L. borgpetersenii and the non-infectious saprophyte L. biflexa have provided much needed insight into these contributory factors, leptospiral virulence and pathogenicity.
Bacterial strain and genomic DNA extraction
L. licerasiae sv. Varillal type strain VAR010T (human isolate) and strain MMD0835 (Philander isolate) were originally isolated in Iquitos, Peru. The type strain has been deposited in the American Type Culture Collection (ATCC BAA-1110T). L. licerasiae sv. Varillal str. MMD0835 is available through BEI Resources (http://www.beiresources.org/). Both strains were grown in liquid Ellinghausen-McCullough-Johnson-Harris (EMJH) medium under standard culture conditions to a density of ~10^8 organisms/mL. Cells were harvested from 10 mL of culture (~10^9 Leptospira) by centrifugation and genomic DNA (gDNA) was extracted using TRIzol (Invitrogen Life Technologies, USA) following the manufacturer's directions. To remove RNA, extracted gDNA was then treated with an RNase cocktail (Roche, USA) containing RNase A and H.
Genome sequencing and assembly
The genome of L. licerasiae sv. Varillal type strain VAR010T was sequenced using a combination of 454 FLX Titanium and Illumina Solexa Genome Analyzer IIX. Paired-end libraries were constructed with fragment sizes ranging from 2000 to 4000 bp for 454 and 200 to 300 bp for Illumina. A total of 2,272,294 reads (454:Illumina ratio of 1:4.26) were assembled using the Celera Assembler version 7.0beta. The genome assembled into 14 contigs (4 scaffolds) at 58-fold sequence coverage, with 99.93% of the genome at more than 19-fold coverage. L. licerasiae sv. Varillal str. MMD0835 was sequenced using only the Illumina Genome Analyzer II platform. A single paired-end library with a fragment size of 300–500 bp was constructed. A total of 1,112,438 reads were used by the CLC bio de novo assembler (CLC NGS Cell v. 3.20.50819, http://www.clcbio.com) to generate 48 contigs at 25-fold sequence coverage, with 78.0% of the genome above 19-fold coverage (99.9% above 4-fold coverage).
Deposition of Genome Sequence Data
The nucleotide sequences and the corresponding automated annotations for the genomes of L. licerasiae str. VAR010T and MMD0835 were submitted to GenBank, with accession numbers AHOO01000000 and NZ_AFLO00000000, respectively.
Genomes were run through the JCVI automated annotation pipeline v10.0. Ab initio gene predictions were generated using Glimmer3  in an iterative fashion. The initial set of gene predictions was then used to train a second round of Glimmer3 analysis to produce the final set of gene predictions. All predicted genes were subsequently translated into all six reading frames and searched against a non-redundant amino-acid database using BLASTP. Each query protein-coding region was extended by 300 nucleotides in an attempt to extend the alignment through regions of low similarity and through different frames and stop codons using Blast-Extend-Repraze (BER, http://ber.sourceforge.net/). All putative protein coding sequences (CDS) were then searched against Pfam  and TIGRFAM  protein family models with HMMER3 . Coding sequences that scored well to these models were assumed to share the function modeled by the HMM. All predicted proteins were then searched against the NCBI Protein Clusters Database (PRK) . The remaining evidence types used in the automated functional annotation of gene products were SignalP , which detects the presence of putative signal sequences and TmHMM  to predict membrane-spanning regions.
The autoAnnotate program weighed the evidence obtained from the searches from a ranked list of evidence types to make a preliminary annotation, including name, gene symbol, Enzyme Commission (EC)  number, JCVI role category , and Gene Ontology (GO)  terms to each protein in the genome. Each protein was assigned a descriptive common name coming from an HMM name, a JCVI database of experimentally characterized proteins (CharProtDB) , or from a best BER match protein. Proteins predicted to encode enzymes were assigned EC numbers, JCVI role categories, GO terms and gene symbols (e.g., “recA”) as appropriate. The autoAnnotate program also employed the Protein Naming Utility (PNU)  to standardize protein nomenclature. Functional assignments were further enhanced with the Genome Properties  system, which records or predicts the presence or absence of metabolic pathways (e.g., biotin biosynthesis), protein complexes (e.g., ATP synthase), cellular structures (e.g., outer membrane) and certain genome traits (e.g., optimal growth temperature, cell shape, etc.). Additional structural features such as tRNAs were identified with the tRNAscanSE . 16S and 23S ribosomal RNA genes were identified directly from BLAST search results. Other structural RNAs were identified from matches to Rfam, a database of non-coding RNA families  and Aragorn . Insertion sequence elements were identified using the online tool ISsaga (http://issaga.biotoul.fr/ISsaga) with default settings . Genomic islands (GIs) were identified using the online tool IslandViewer (http://www.pathogenomics.sfu.ca/islandviewer) , which integrates three different genomic island prediction methods: IslandPick , IslandPath-DIMOB , and SIGI-HMM ; we report putative GIs predicted by multiple tools.
Regions of pairwise synteny between the Leptospira genomes were identified by first finding the maximum unique matches with a minimum length of five amino acids using PROmer, followed by visualization of the data using MUMmerplot (http://mummer.sourceforge.net/) and Gnuplot 4.0 (http://www.gnuplot.info/) as previously described. QuartetS was used to identify orthologous protein sequences among the eight Leptospira genomes used in this study. QuartetS uses an approximate phylogenetic analysis of quartet gene trees to infer the occurrence of duplication events and discriminate paralogous from orthologous genes. The QuartetS pipeline was run with default parameters. To be considered orthologs, the bi-directional best hit pairs had to satisfy the following conditions: (i) the alignment region had to cover at least 50% of the length of each sequence and (ii) the e-value of the pair-wise alignment had to be no greater than 1e−5.
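The two acceptance conditions for bi-directional best hit pairs can be sketched in a few lines. This is a minimal illustration only, not the QuartetS implementation: the record layout (`Hit`) and function name are hypothetical, and the E-value condition is read in the conventional sense that a hit must score at or below the 1e−5 cutoff.

```python
from dataclasses import dataclass

MIN_COVERAGE = 0.50  # alignment must span at least 50% of each sequence
MAX_EVALUE = 1e-5    # assumed reading: E-value at or below this cutoff


@dataclass
class Hit:
    """One pairwise protein alignment (hypothetical record layout)."""
    query_len: int        # length of the query protein, in residues
    subject_len: int      # length of the subject protein, in residues
    query_aligned: int    # aligned residues on the query
    subject_aligned: int  # aligned residues on the subject
    evalue: float         # E-value reported for the alignment


def passes_ortholog_filter(hit: Hit) -> bool:
    """Apply conditions (i) and (ii) to a single best-hit alignment."""
    query_cov = hit.query_aligned / hit.query_len
    subject_cov = hit.subject_aligned / hit.subject_len
    return (
        query_cov >= MIN_COVERAGE
        and subject_cov >= MIN_COVERAGE
        and hit.evalue <= MAX_EVALUE
    )
```

In a real pipeline this check would be applied to both directions of each reciprocal best hit before the quartet-tree analysis.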
To better understand the functional differences between pathogenic, intermediate and saprophytic Leptospira, each of the annotated genomes was uploaded to the RAST (Rapid Annotation using Subsystem Technology) server  retaining the original gene calls. Subsystems predicted to be active within each genome were then compared. A subsystem is a generalization of the concept of a biochemical pathway, extended to include ancillary components and alternative reactions reflecting functional variants found in various species.
Prophages were identified using Phage_Finder version 2.0, which now utilizes HMMER3, drastically improving the speed of the HMM searches. Predicted prophage regions were identified using default settings and under strict (-S) mode. To facilitate identification of prophages in Leptospira genomes, bacteriophage LE1 was added to the BLAST database used for prophage identification. Phage_Finder version 2.0 is available at http://sourceforge.net/projects/phage-finder/files/phage_finder_v2.0/ under the GNU General Public License.
Lipopolysaccharide preparation and gas chromatography mass spectrometry (GC-MS) analysis
A three-day culture of L. licerasiae str. VAR010 (~10^8 cells/mL) was harvested by centrifugation at 4000 rpm for 90 min at room temperature. Cells were washed three times with 1× PBS, then treated with 50% aqueous phenol for 30 min at 65°C with continuous stirring. The cells were immediately immersed in an ice-water bath to reduce the temperature to 10°C, then centrifuged at 4000 rpm for 40 min at 10°C. The top layer (phenol-saturated aqueous layer) and bottom layer (water-saturated phenol layer) were removed and dialyzed extensively against ddH2O to remove phenol (three days, with the water changed twice per day); the phenol layer was analyzed by GC-MS and polyacrylamide gel electrophoresis. The dialyzed lipopolysaccharide (LPS) was lyophilized, then re-suspended in 500 µL of water; 200 µL was used for sugar composition analysis. For GC-MS, samples were converted to trimethylsilyl (TMS) derivatives. First, samples were methanolyzed using 1 M MeOH-HCl at 80°C for 16 h, followed by re-N-acetylation and TMS derivatization using Tri-Sil TP reagent (Thermo Scientific) according to the manufacturer's directions. The derivatives were subjected to GC-MS analysis and the data quantified using an internal inositol standard. LPS isolation and GC-MS analysis were done by the Glycotechnology Core Resource at the University of California, San Diego.
Results and Discussion
Assembly and annotation details of two draft L. licerasiae genomes
454 pyrosequencing and Illumina sequencing of str. VAR010T yielded 2,272,294 reads that were assembled into 14 contigs (4 scaffolds) with 4,211,147 high-quality, mostly contiguous bases. These contigs had an average length of 300.8 kb, an N50 of 522.9 kb and a maximum length of 1.67 Mb. The str. MMD0835 genome was assembled into 48 contigs with 4,198,811 contiguous bases (N50 of 463.5 kb; max. length of 1.07 Mb). The overall characteristics of the draft L. licerasiae genomes are summarized in Table 1. Gaps in genome coverage were not filled in with manual sequencing given resource constraints. This approach is consistent with de novo sequencing and publication of other pathogen genomes, given that the length of the draft genomes was consistent with other sequenced leptospiral genomes (Table 1) and that the two strains whose genome sequences are reported here are highly similar. Gaps are typically caused by large fragments (greater than the library “insert” size), which tend to be rRNA operons, large mobile elements or duplicated regions, and likely do not materially detract from the quality of the data analysis presented here.
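For readers unfamiliar with the assembly statistics quoted above, the following minimal sketch shows how an N50 and a mean contig length are computed. The contig lengths in the usage note are toy values, since the individual contig lengths are not listed in the text; only the total of 4,211,147 bases in 14 contigs is taken from the article.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")


def mean_length(total_bases, n_contigs):
    """Average contig length in bases."""
    return total_bases / n_contigs
```

With toy contigs of 50, 40, 30, 20 and 10 kb, the two largest contigs already hold 90 of 150 kb, so the N50 is 40 kb. Applying `mean_length(4211147, 14)` gives about 300.8 kb, in agreement with the average reported for VAR010T.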
General genome features of L. licerasiae str. VAR010 and MMD0835
Non-coding RNA (ncRNA) genes and regulatory elements.
The L. licerasiae genomes were examined for the presence of riboswitches and other ncRNA regulatory elements. Riboswitch predictions in the finished leptospiral genomes were confirmed by an online search of the Rfam database. Only candidates passing Rfam trusted cutoffs, and therefore very likely to be true ncRNAs, are presented. All infectious Leptospira contain at least two copies of the cobalamin (vitamin B12) riboswitch. As in other bacteria, both riboswitches appear to regulate expression of genes necessary for transport and biosynthesis of vitamin B12. The first, LEP1GSC185_0331, is immediately upstream of a gene encoding a TonB-dependent ligand-gated channel with similarity to the outer membrane cobalamin transport protein, BtuB, and the second, LEP1GSC185_3336, is immediately upstream of two genes encoding a putative cobalt transporter (cbtA: LEP1GSC185_3338, LlicsVM_010100017167; and cbtB: LEP1GSC185_3337, LlicsVM_010100017162) and the adjacent cobalamin biosynthesis (cob) operon. The lack of a cobalamin riboswitch and an incomplete cob operon in the saprophyte L. biflexa (see below) suggest that the ability to respond to cobalamin levels and synthesize B12 de novo from simpler metabolites is restricted to infectious Leptospira. Interestingly, the L. licerasiae genes encoding CbtA and CbtB, which share homology with Pseudomonas syringae proteins, may have been acquired via lateral gene transfer (LGT) since these genes are uncommon in Leptospira; homologs of both proteins are also present in L. broomii, L. inadai and L. kmetyi. All of the genomes studied here possess a single thiamine pyrophosphate (TPP; LEP1GSC185_0557) riboswitch (Table 1) that in L. licerasiae is directly upstream of thiC (LEP1GSC185_0556). The thiamine biosynthesis protein, ThiC, converts 5′-phosphoribosyl-5-aminoimidazole to 4-amino-5-hydroxymethyl-2-methylpyrimidine, an important intermediate in the synthesis of TPP.
A putative cis-regulatory element unique to L. licerasiae was also identified, ydaO-yuaA (LEP1GSC185_1591). This element is thought to be triggered during osmotic shock leading to activation of ydaO, a predicted amino acid transporter gene, and members of yuaA-yubG operon, which encode KtrA and KtrB K+ transporters . While a role in L. licerasiae is yet to be established, this element is found immediately upstream of a universal stress family protein (LEP1GSC185_1590; LlicsVM_010100003660), which has homology at the C-terminus to a family of universal stress proteins (USPs) and Na+/H+ exchangers (NHEs). USPs are small cytoplasmic bacterial proteins whose expression increases when the cell is exposed to stress agents such as DNA-damaging agents . These proteins are thought to enhance survival during prolonged exposure to such conditions . Indeed, one such protein UspA is up regulated in Leptospira at physiological temperature implying a role during in vivo growth . NHEs are found in both prokaryotes and eukaryotes and are believed to be crucial for cell volume homeostasis . Thus, it is possible that in L. licerasiae, the ydaO-yuaA element responds to and permits survival during periods of osmotic stress. This mechanism could allow for survival in environmental waters.
Prophages can be important drivers of microbial evolution by providing fitness factors for their host , , by facilitating movement of DNA through transduction of the host chromosome or packaging of pathogenicity islands  and altering serotype through lysogenic conversion , . To explore any of these possibilities in any of the available Leptospira genomes, Phage_Finder  was run under strict (-S) mode to identify prophage regions. Phage_Finder identified two prophage regions in the genomes of both L. licerasiae strains.
The first region in each strain was located on ~103 kb contigs (AHOO02000007 in VAR010 and NZ_AFLO01000023 in MMD0835) with best BLASTP matches to bacteriophage LE1 of L. biflexa. LE1 was previously shown to be of circular topology, to form intracellular particles consistent with phage, and to replicate like a plasmid. Given this information, and that a large portion of each contig was predicted to be prophage, it seemed likely that these phage-like contigs in L. licerasiae also represented linear forms of circular phage genomes like LE1. There was significant overlap in the sequence of the ends, also suggesting a circular form. To demonstrate circular topology, outward-facing primers were designed and used in PCR reactions. PCR yielded 300 bp products, indicating that both LE1-like phages are indeed circular in L. licerasiae strains VAR010 and MMD0835 (Figure 1). Comparisons between LE1 and these prophages at the protein level indicated that the latter three quarters of the L. licerasiae prophage proteins match some portion of LE1 (Figure 1), albeit at a low percent identity (average ~30% identity). A comparison between the two LE1-like prophages revealed that they are identical at the amino acid level (Figure 1). We propose naming the L. licerasiae LE1-like prophages vB-LliZ_VAR010-LE1 and vB-LliZ_MMD0835-LE1 using a previously suggested systematic bacteriophage nomenclature. vB-LliZ_VAR010-LE1 encodes 102 predicted proteins and has a G+C content of 37.8%, which is lower than the average for the entire L. licerasiae genome (41.6%). These L. licerasiae prophage elements possess ~22 kb of unique sequence that LE1 lacks as well as several unique predicted open reading frames interspersed among the LE1 homologs. A comparison of this ~22 kb region to other Leptospira genomes identified multiple efflux pumps in the infectious L. licerasiae that may function in adaptation to the mammalian host. 
Further, this amino acid similarity to bacterial efflux pumps suggests phage-mediated gene transfer between the L. licerasiae chromosome and LE1-like prophage. While the presence of these efflux pumps in the genomes of other infectious species would also suggest a role in pathogenicity, BLASTP searches against the non-redundant protein database (nr) indicate that these proteins have homologs in the non-pathogen Leptonema illini DSM 21528. Why L. licerasiae and not L. biflexa have maintained copies of these genes is unclear. Also within this region, the predicted L. licerasiae protein LEP1GSC185_3887 is notable in that it shares homology with a TolC/IS1533 transposase fusion protein. It has been suggested that the mobile genetic element (MGE) IS1533, has mediated LGT resulting in the antigenic switch of sv. Copenhageni to sv. Hardjo .
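The G+C comparison used above as a hint of foreign origin (37.8% for the prophage versus 41.6% genome-wide) rests on a simple base-composition calculation. The sketch below is illustrative only; the function name is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption rather than something stated in the article.

```python
def gc_percent(seq: str) -> float:
    """Percent G+C of a nucleotide sequence, counting only A, C, G and T.

    Ambiguous symbols (for example N) are ignored; this is an assumed
    convention, not one taken from the article.
    """
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / unambiguous
```

Computing this value for a candidate region and for the whole genome, then comparing the two, is one common first check for laterally acquired DNA.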
(A) Linear representations of CDSs found in each L. licerasiae genome with similarity to bacteriophage LE1 and to non-prophage regions of L. interrogans and L. borgpetersenii encoding efflux pumps. CDSs are labeled by locus identifier and colored by functional role categories as noted in the boxed key. BLASTP matches between CDSs are colored by protein percent identity (see key). (B) Circular depiction of vB-LliZ_VAR010-LE1.
The second prophage region was only partially detected in the L. licerasiae genomes by Phage_Finder (Figure 2), but is adjacent to a cryptic prophage region expressed in L. interrogans sv. Lai and is presumably associated with pathogenicity . The region detected by Phage_Finder is located at nucleotide position 210203..191954 of VAR010 and 71814..108770 of MMD0835, but after comparison to the above mentioned unnamed prophage element in L. interrogans sv. Lai, could be extended to include coordinates 210203..171583 of VAR010 and 71814..110434 of MMD0835 (Figure 2). Presumably the reason this region was truncated by Phage_Finder was due to a lack of sufficient homology in the BLAST database used and/or due to the lack of a head morphogenesis region, which is required by Phage_Finder to label a region as “prophage” under strict mode. Since this region lacks an identifiable head morphogenesis region yet retains tail-like proteins, it may be functionally analogous to phage tail-type bacteriocins, called pyocins in Pseudomonas aeruginosa  and monocins in Listeria .
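The coordinate ranges quoted above can be turned into region lengths with a one-line helper. This is a sketch that assumes the coordinates are 1-based and inclusive, as in GenBank feature locations; the function name is hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Inclusive length of a region written as start..end.

    Works for either orientation, so a reverse-strand range such as
    210203..171583 is handled the same way as a forward one.
    """
    return abs(end - start) + 1


# Coordinates as given in the text for the second prophage region.
regions = {
    "VAR010 detected": (210203, 191954),
    "VAR010 extended": (210203, 171583),
    "MMD0835 detected": (71814, 108770),
    "MMD0835 extended": (71814, 110434),
}

for name, (start, end) in regions.items():
    print(f"{name}: {span_length(start, end):,} bp")
```

Under that assumption the extended regions come out at the same length, 38,621 bp, in both strains.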
Depicted are linear representations of CDSs found in each genome with similarity to the cryptic prophage identified by Qin et al. in L. interrogans sv. Lai . CDSs are labeled by locus identifier and are colored by functional role categories as noted in the boxed key. BLASTP matches between CDSs are colored by protein percent identity (see key). CDSs colored blue in L. borgpetersenii and L. interrogans genomes denote CDSs flanking the regions identified by Qin et al. (yellow highlighted box).
Comparison of the pathogenic, intermediately pathogenic and saprophytic leptospiral genomes
L. licerasiae str. VAR010 causes mild disease in humans and has been isolated from peridomestic and wild rodents and marsupials in Peru . Although phenotypic differences between VAR010 and MMD0835 have yet to be described, VAR010 (3931 total CDS) has 185 non-orthologous CDS relative to strain MMD0835 (3885 total CDS), whereas strain MMD0835 has 140 non-orthologous CDS relative to strain VAR010 reminiscent of another environmental pathogen with a plastic genome, Burkholderia pseudomallei . The majority of these non-orthologous genes encode hypothetical proteins. Both strains share 3,745 CDS with an average pair-wise amino acid similarity of 99.98%. Of these, 1211 have no orthologs in the other genomes used in this study. A putative function could be assigned to 632 with the remainder comprising hypothetical (579) proteins (Table S1).
Considering only those genes common to both strains of each species, L. licerasiae shares 2,237 (~57%) with L. interrogans, 2,077 (~53%) with L. borgpetersenii and 1,898 (~48%) with L. biflexa. 1,547 orthologs (~39% of the predicted L. licerasiae CDS) were present in all genomes compared (Figure 3) and likely represent a substantial proportion of the core genome of Leptospira. As shown in Figure 4, the gene order is more conserved in the intermediate and pathogenic branches. Surprisingly, L. licerasiae had the highest average protein identity with L. interrogans sv. Lai (2,278 proteins with an average pairwise identity of ~67%). Taken together these observations suggest that L. licerasiae is more closely related to the pathogenic branch of infectious Leptospira than to the saprophyte, L. biflexa. This was unexpected since 16S rRNA phylogeny suggests that L. licerasiae occupies an intermediate position between the pathogens and saprophytes . Table 2 shows the subsystem distribution of predicted CDS in L. licerasiae, L. interrogans, L. borgpetersenii and L. biflexa. Based on these data it would seem that intermediate Leptospira retain several proteins related to nitrogen, amino acid and carbohydrate metabolism that have likely been lost by the pathogenic sub-branch. For example, L. licerasiae (LEP1GSC185_2652) and L. biflexa (LEPBI_I1590) both possess ilvA, which encodes threonine ammonia-lyase an enzyme that catalyzes the conversion of threonine to 2-oxobutanoate; while neither L. borgpetersenii nor L. interrogans appears to do so. That L. licerasiae and perhaps the other intermediates do well in artificial culture media might be related to the retention of these metabolic functions.
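The approximate percentages quoted above can be reproduced with simple arithmetic. The denominator is not stated explicitly in the text; the sketch below assumes it is the 3,931 total CDS of strain VAR010, which is the value that matches all four rounded figures.

```python
# Assumed denominator: total CDS in strain VAR010 (inferred, not stated).
VAR010_TOTAL_CDS = 3931

# Ortholog counts as reported in the text.
SHARED_ORTHOLOGS = {
    "L. interrogans": 2237,
    "L. borgpetersenii": 2077,
    "L. biflexa": 1898,
    "all genomes (core)": 1547,
}


def percent_shared(shared: int, total: int = VAR010_TOTAL_CDS) -> float:
    """Share of the L. licerasiae CDS set that has an ortholog elsewhere."""
    return 100.0 * shared / total


for label, count in SHARED_ORTHOLOGS.items():
    print(f"{label}: {count} orthologs, ~{round(percent_shared(count))}%")
```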
Orthologs were predicted using QuartetS  run with default parameters. Only proteins present in both strains of a given species are shown.
Line figures depict the results of PROmer analysis. Colored lines denote percent identity of protein translations (see key) and are plotted according to the location in the leptospiral reference genomes (x-axis) and the query genome L. licerasiae strain VAR010 (y-axis). Chromosome I results are in the left column while chromosome II comparisons are in the right column.
It is a commonly accepted concept that genes unique to pathogenic microorganisms are likely to be necessary for infection (pathogenesis). To identify potentially pathogenicity-associated genes, we compared the genome content of three infectious leptospiral species, L. licerasiae (2 strains), L. interrogans (2 strains) and L. borgpetersenii (2 strains) with that of the non-infectious saprophyte, L. biflexa (2 strains). These comparisons identified 452 conserved pathogen-specific proteins (Figure 3). Based on domain homology searches, 315 were assigned a putative function (Table S2). Infectious Leptospira species share a number of proteins predicted to participate in environmental signaling and processing and metabolism (Table S2).
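The logic of this comparison is a pair of set operations on ortholog presence/absence data. The following minimal sketch illustrates it; the cluster identifiers and genome labels are invented for the example, and the function is not part of QuartetS or any tool named in the article.

```python
def core_and_group_specific(presence, ingroup, outgroup):
    """Split ortholog clusters into core and ingroup-specific sets.

    presence: dict mapping a cluster id to the set of genomes containing it.
    core: clusters found in every genome of both groups.
    specific: clusters found in every ingroup genome and in no outgroup genome.
    """
    ingroup, outgroup = set(ingroup), set(outgroup)
    all_genomes = ingroup | outgroup
    core = {c for c, genomes in presence.items() if all_genomes <= genomes}
    specific = {
        c
        for c, genomes in presence.items()
        if ingroup <= genomes and not (genomes & outgroup)
    }
    return core, specific


# Hypothetical labels: six infectious genomes and two saprophyte genomes.
INFECTIOUS = ["lic_1", "lic_2", "int_1", "int_2", "bor_1", "bor_2"]
SAPROPHYTE = ["bif_1", "bif_2"]
```

In the study's terms, the core set corresponds to the 1,547 orthologs present in all eight genomes and the ingroup-specific set to the 452 proteins conserved only among the infectious species.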
That the infectious species studied here appear to possess a complete vitamin B12 biosynthesis operon and a novel regulatory mechanism is perhaps the most notable metabolic difference between infectious and non-infectious Leptospira. Indeed, the absence of these genes from the L. biflexa and recently sequenced Leptonema genomes would indicate that the ability to synthesize B12 was acquired after the speciation event giving rise to the infectious branch of Leptospira, predating the separation of the intermediate and pathogenic sub-branches. The genomes of over 100 infectious strains searched so far, including the intermediate species L. inadai and L. broomii, possess at least two copies of the B12 riboswitch (M. Matthias and J. Vinetz manuscript in preparation), supporting the belief that these elements are essential for pathogenicity. As in other bacteria, the availability of different nutrients inside and outside the mammalian host requires changes in the metabolic capacity of Leptospira. Published data have firmly established that Leptospira have an absolute requirement for B12 for growth at 37°C. Much like iron, B12 is sequestered in vivo. Hence, for survival in vivo, leptospiral pathogens need to synthesize B12 de novo or scavenge B12 from the host. Whether leptospires are fully capable of synthesizing the highly complex B12 molecule from simpler precursors de novo is not known. However, cobI (LEP1GSC185_3345; LIC20129), which encodes an enzyme involved in cobalamin biosynthesis, is ~30-fold up-regulated during mammalian infection, consistent with a role in vivo in replication and/or pathogenicity (J. Lehmann, J. Vinetz, and M. Matthias manuscript in preparation). In addition, although all leptospiral genomes sequenced to date, including L. biflexa, encode the enzyme cob(I)yrinic acid a,c-diamide adenosyltransferase, which catalyzes the first step in the conversion of cobinamide to B12, all infectious Leptospira, including L. licerasiae, L. interrogans, L. borgpetersenii, L. santarosai, L. noguchii and L. weilii, encode at least one additional homolog. The reason for this is unclear, but it may be that these pathogen-specific homologs are required for B12 biosynthesis in vivo. While L. interrogans, L. borgpetersenii and L. licerasiae appear to be able to use either l-glutamate or cobinamide to synthesize B12, it would seem that this is not a universal feature of infectious Leptospira.
Leptospira encode four essential B12-dependent enzymes: B12-dependent methionine synthase, two B12-dependent methylmalonyl-CoA mutase-related proteins and a B12-dependent ribonucleotide reductase. Methionine synthase transfers a methyl group from methyl-tetrahydrofolate to homocysteine as the final step in the synthesis of methionine; ribonucleotide reductases generate the deoxyribonucleotides needed for DNA synthesis and allow the production of DNA in the absence of oxygen; methylmalonyl-CoA mutase interconverts (R)-methylmalonyl-CoA and succinyl-CoA in the terminal step of β-oxidation of fatty acids/catabolism of cholesterol. A role for B12 in leptospiral pathogenicity has yet to be established. However, B12 synthesis has been linked to fatty acid metabolism and survival of the intracellular pathogen Mycobacterium tuberculosis in vivo. As humans do not synthesize B12, these genes may represent novel drug targets.
Putative pathogenicity-associated genes
The L. licerasiae VAR010 and MMD0835 genomes encode 196 and 198 putative lipoproteins, respectively, consistent with the number found in other leptospiral species (L. interrogans – 184; L. borgpetersenii – 130 and L. biflexa – 164). Of these, infectious species share LipL31 (LEP1GSC185_3242, LlicsVM_010100016712), LipL32 (LEP1GSC185_2633, LlicsVM_010100013757), LipL40 (LEP1GSC185_1670, LlicsVM_010100003275), LipL41 (LEP1GSC185_1838, LlicsVM_010100002470), LipL46 (LEP1GSC185_3176, LlicsVM_010100016407), LigB (LEP1GSC185_1828; LlicsVM_010100002515), LruA/LipL71 (LEP1GSC185_0209, LlicsVM_010100006058) and LruB (LEP1GSC185_0754, LlicsVM_010100019404). That these genes are absent from the L. biflexa genome suggests a potential role in pathogenicity. The function of LruB is unknown, but serology suggests this protein is expressed in vivo.
Much recent work has demonstrated the importance of fibronectin and plasminogen binding proteins in Leptospira. Fibronectin binding proteins are adhesins that play an important role in certain bacterial infections. Putative pathogenicity factors LigA and LigB, specific to pathogenic Leptospira, are induced at physiological osmolarity and are involved in leptospiral adhesion to extracellular matrix proteins and plasma proteins including collagens I and IV, laminin, fibronectin and fibrinogen. The above-mentioned LipL32 and LipL40 are putative plasminogen binding proteins. Apart from LigB, at least three other conserved pathogen-specific outer membrane proteins are predicted to mediate attachment to host cells: a putative fibronectin binding protein, Lfb1 (LEP1GSC185_0134; LlicsVM_010100012092); Lsa66, a leptospiral surface adhesin of 66 kDa (LEP1GSC185_1758; LlicsVM_010100002865) shown to bind laminin and plasma fibronectin extracellular matrix molecules; and a protein believed to mediate attachment to host cells (LEP1GSC185_2102; LlicsVM_010100001165).
Unique genomic features of the L. licerasiae O-antigen locus
Previously published immunological data from Peru indicate that the L. licerasiae O-antigen is antigenically unique. Comparative analysis of all extant Leptospira spp. genomic data, including the new data presented here, explains this antigenic uniqueness at a genomic level. In contrast to the complex LPS O-antigen biosynthetic loci found in the published L. interrogans, L. borgpetersenii and L. biflexa genomes, which contain 91, 76 and 56 genes respectively, the L. licerasiae O-antigen locus we propose comprises a modest 6-gene operon, LEP1GSC185_2122–2127 (Figure 5, Table 3). The genes in this cluster have no apparent orthologs in the already sequenced L. interrogans, L. borgpetersenii and L. biflexa genomes. We are confident that this operon is the true L. licerasiae O-antigen locus based on the following observations: 1) There are only two wzx O-antigen transporter homologs in the genome. One of these (LEP1GSC185_0029) is not in an operon with any other genes of types typically associated with O-antigen biosynthesis. The other, LEP1GSC185_2124, is part of the proposed O-antigen locus. 2) Of the 29 putative polysaccharide glycosyltransferases we could identify in the L. licerasiae genome, none are orthologs of genes in the O-antigen regions of the other sequenced Leptospira genomes, and 22 are bidirectional best hits (that is, candidate orthologs) with non-O-antigen related genes from one or more of these genomes. Of the remaining 7 genes, one (LEP1GSC185_3401) is associated with a glycogen-related operon, two (LEP1GSC185_1696 and _2304) are part of short operons with genes of unknown function, and one (LEP1GSC185_2985) is proximal to flagellin genes. The remaining three, LEP1GSC185_2122, 2123 and 2126, are clustered together in the proposed O-antigen locus. 3) The remaining two genes in the proposed O-antigen locus, LEP1GSC185_2125 and 2127, encode functions commonly associated with sugar modification in O-antigens, pyruvoylation and acetylation, respectively. Homologs of the latter gene are specifically annotated as O-antigen related. 4) This O-antigen region is fully within a single contig.
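Observation 2 relies on bidirectional best hits as a proxy for orthology. The criterion can be sketched as follows, assuming the single best-scoring hit of each protein in the other genome has already been extracted from BLASTP output. The function name and the gene labels in the toy data are hypothetical.

```python
def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b AND b's best hit is a.

    Each argument maps a query protein to its single best hit in the other genome.
    """
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a  # reciprocity check
    )


# Toy data: geneA2 and geneA3 both hit geneB2, but geneB2 hits back only geneA3
a_to_b = {"geneA1": "geneB1", "geneA2": "geneB2", "geneA3": "geneB2"}
b_to_a = {"geneB1": "geneA1", "geneB2": "geneA3"}

pairs = bidirectional_best_hits(a_to_b, b_to_a)
print(pairs)
```

A one-directional hit such as `geneA2` to `geneB2` is dropped, which is why the criterion yields candidate orthologs rather than all homologs.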
The O-antigen region and flanking CDSs of L. licerasiae strain VAR010 are compared to regions of homology in L. interrogans Copenhageni (A). Likewise, the O-antigen and flanking CDSs of L. interrogans Copenhageni are compared to the homologous region in L. licerasiae strain VAR010 (B). Yellow shaded boxes mark the locations of the O-antigen regions. CDSs are labeled by locus identifier and colored by functional role categories as noted in the boxed key. Gene symbols, when present, are noted above their respective genes in bold italics. BLASTP matches between CDSs are colored by protein percent identity (see key).
The boundaries of the proposed L. licerasiae O-antigen biosynthesis operon map to three different syntenic regions in L. interrogans, which suggests a complex history of differential genome rearrangement and LGT events in these two species. Indeed, the six genes of the operon do not correspond to any syntenic blocks in any sequenced genome – the most similar genes to each are present in entirely disjoint sets of bacterial and archaeal strains (Table 3). This would seem to imply either 1) that potential en bloc LGT source genomes with similarly constructed O-antigens have yet to be sequenced, or 2) that a series of LGT events from different sources have accumulated these genes in the L. licerasiae lineage to create a novel O-antigen cluster. Since the O-antigen cluster is not predicted to reside on a genomic island (GI), the latter seems more likely.
By analogy to extant knowledge of how E. coli O-antigen operons are assembled, we can hypothesize that the L. licerasiae O-antigen consists of a repeating unit with at least 4 sugars (corresponding to the primer sugar and the products of the three glycosyltransferase enzymes), that these sugars are bioavailable from the core Leptospira metabolome (i.e. glucose, galactose, mannose, etc., since no blocks of sugar biosynthesis or sugar modification genes are present in the operon) and at least one of them is modified by (probably) pyruvoyl and/or acetyl groups.
The chemical composition of the polysaccharide component of leptospiral LPS has been examined in a few serovars. The proportion of the major component sugars rhamnose, galactose, arabinose and xylose was shown to vary between strains. The composition of the LPS derived from L. licerasiae sv. Varillal is consistent with previously published data. However, our GC-MS analysis indicates that L. licerasiae LPS (Figure 6) is composed primarily of arabinose (~61.6%), with xylose (~12.8%), mannose (~11.5%), rhamnose (~9.3%), galactose (~4.0%) and glucose (<1%). The relative proportion of arabinose and rhamnose (6:1) in the LPS of L. licerasiae is significantly different from that (1:3) reported in L. interrogans sv. Copenhageni, which might help to explain why there is absolutely no serological cross-reactivity between sv. Varillal and Copenhageni. The presence of rhamnose in the purified VAR010 LPS is surprising since the genome does not appear to encode a complete pathway for the synthesis of dTDP-l-rhamnose shown to be present in L. interrogans and L. borgpetersenii; the enzymes that catalyze the final two steps of the pathway, rmlC and rmlD, are absent. Because both L. licerasiae genomes are unfinished, it is possible that these genes reside in unsequenced regions of the genome. However, since the other intermediate strains sequenced to date, L. broomii and L. inadai, and the saprophyte L. biflexa also seem to lack rmlC and rmlD homologs, it is also biologically plausible that L. licerasiae truly lacks both genes. A TBLASTN search against the L. licerasiae genomes failed to produce any significant alignments, so it does not seem that the genes were missed during annotation. L. licerasiae does possess the enzymes necessary to synthesize GDP-d-rhamnose from GDP-d-mannose, gmd (LEP1GSC185_1627; LlicsVM_010100003480) and rmd (LEP1GSC185_1109; LlicsVM_010100011485). Although rare, other pathogens such as Pseudomonas aeruginosa have been shown to produce LPS containing d-rhamnose; therefore, it is possible that L. licerasiae produces d-rhamnose, but this needs to be confirmed experimentally.
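The arabinose-to-rhamnose ratio quoted above follows from the reported GC-MS percentages. As a quick arithmetic check using only the approximate values given in the text (glucose, reported only as <1%, is omitted; the variable names are illustrative):

```python
# Approximate sugar composition of L. licerasiae LPS, in percent, as reported in the text
composition = {
    "arabinose": 61.6,
    "xylose": 12.8,
    "mannose": 11.5,
    "rhamnose": 9.3,
    "galactose": 4.0,
}

# Ratio of arabinose to rhamnose
ratio = composition["arabinose"] / composition["rhamnose"]
# Sum of the listed sugars; the remainder is accounted for by glucose (<1%)
total = sum(composition.values())

print(f"arabinose:rhamnose = {ratio:.1f}:1")
print(f"sum of listed sugars = {total:.1f}%")
```

The computed ratio is about 6.6:1, in line with the roughly 6:1 proportion stated in the text, and the listed sugars sum to about 99.2%, leaving less than 1% for glucose.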
LPS was purified by the hot-phenol water method. E. coli LPS (Lane 1), 1/32 Dilution of VAR010 Phenol phase LPS (Lane 2), 1/16 Dilution of VAR010 Phenol phase LPS (Lane 3), 1/8 Dilution of VAR010 Phenol phase LPS (Lane 4), 1/4 Dilution of VAR010 Phenol phase LPS (Lane 5), 1/2 Dilution of VAR010 Phenol phase LPS (Lane 6), Undiluted VAR010 Phenol phase LPS (Lane 7), Molecular weight marker (M).
The polymerase (wzy) and chain length determinant (wzz) genes are not observed in the proposed O-antigen locus, but may be located elsewhere in the L. licerasiae genome. These genes may be difficult to identify by homology due to their membrane protein nature. There are two identified wzy homologs in L. licerasiae with candidate orthologs in L. interrogans and L. borgpetersenii. There are no obvious wzz homologs in the L. licerasiae genome. The formal possibility exists that this O-antigen consists of only a single repeat, obviating the need for wzy and wzz genes, but this would be unprecedented if true.
Evidence for lateral gene transfer in L. licerasiae
Seven putative genomic islands in L. licerasiae ranging in size from 5 kb to ~36 kb (Table 4) were identified, the longest of which coincides with the previously mentioned cryptic prophage in sv. Lai and Copenhageni. In addition, we found 28 putative type II toxin-antitoxin systems (TASs) in the VAR010 genome (Table 5). TASs belong to the prokaryotic mobilome as they are extensively, if not preferentially, spread via plasmid-mediated LGT. Like many, if not most, mobilome members, TASs are not simply mobile, but appear to behave like selfish elements. If a mobile genetic element encoding a TAS is lost during cell division, the concentration of the labile antitoxin rapidly decreases, allowing the toxin, which is more stable, to kill the cell. Thus, TASs contribute to the stable maintenance and dissemination of plasmids and genomic islands in bacterial populations despite the associated fitness cost. In M. tuberculosis, 37% of these systems are located on genomic islands. In L. licerasiae, 36% (10/28) of the putative type II TASs reside on putative genomic islands, and thus appear to have been acquired by LGT. Of the L. licerasiae type II TASs, chpK/chpI (Table 5) has been confirmed in L. interrogans and appears to be unique to infectious species. L. interrogans encodes another four TASs. By contrast, L. biflexa str. Ames and Paris possess several TASs (22 and 20 TASs, respectively), much like L. licerasiae.
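The proportion of TASs residing on genomic islands is a simple interval-containment count. The sketch below illustrates the idea with made-up coordinates on a single replicon; the `fraction_on_islands` helper and all coordinates are hypothetical and do not correspond to the loci in Tables 4 and 5.

```python
def fraction_on_islands(loci, islands):
    """Count loci lying entirely within any island; return (count, fraction).

    Loci and islands are (start, end) tuples on the same replicon.
    """
    def inside(locus):
        start, end = locus
        return any(i_start <= start and end <= i_end for i_start, i_end in islands)

    on_island = sum(inside(locus) for locus in loci)
    return on_island, on_island / len(loci)


# Hypothetical coordinates, for illustration only
islands = [(10_000, 46_000), (120_000, 125_000)]
tas_loci = [
    (12_000, 12_700),    # inside the first island
    (45_500, 46_500),    # straddles an island boundary, so not counted
    (121_000, 121_600),  # inside the second island
    (300_000, 300_650),  # outside any island
]

count, frac = fraction_on_islands(tas_loci, islands)
print(count, frac)

# The proportion reported in the text: 10 of 28 type II TASs
print(f"{10 / 28:.0%}")
```

Requiring full containment is a conservative design choice; a partial-overlap criterion would count the boundary-straddling locus as well.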
As additional independent evidence of lateral transfer, more than half of the L. licerasiae-specific CDSs have no or poor homology with other leptospiral proteins. These include phosphate, chromium and molybdate transport systems. Of these proteins, most have homology with proteins of non-invasive environmental bacteria including Sorangium cellulosum [6 proteins], Bdellovibrio bacteriovorus [6 proteins] and Haliscomenobacter hydrossis [5 proteins]. While insertion sequence (IS) elements appear to be major contributors to genomic diversification in pathogenic Leptospira, which may possess more than 20 IS elements, the relative lack of IS elements in the L. licerasiae and L. biflexa genomes would suggest that genomic diversity, where it exists, is a result of different mechanisms. The phylogenetic origins of the laterally transferred genes suggest that L. licerasiae is able to exchange genetic material with non-invasive environmental bacteria; whether this species can become naturally competent remains to be determined.
This study bridges a major gap in our knowledge of leptospiral biology and addresses a key question in the field regarding the pathogenic potential of the intermediate clade of Leptospira. The data presented here 1) demonstrate that L. licerasiae is more closely related to pathogenic than to saprophytic Leptospira; 2) provide insight into the genomic bases for its infectiousness and unique antigenic characteristics; and 3) support the denomination of the intermediate clade as ‘intermediately pathogenic’ and its consideration as a transitional group between saprophytes and pathogens.
Future comparative genomic analysis of the complete set of Leptospira species will provide deeper large-scale insights into the evolution, biology and evolution of virulence of this genus of spirochetes, and guide new experimental directions.
Identification of 1211 conserved Leptospira licerasiae proteins with no orthologs in L. interrogans, L. borgpetersenii or L. biflexa.
Identification of 452 conserved pathogen-specific proteins with 315 assigned a putative function by domain homology searches.
We thank Paula Maguina, Staff Research Associate, UC San Diego, for her important and key scientific and logistical contributions to the work reported here.
Conceived and designed the experiments: JNR DEF KPP NJW JMV MAM. Performed the experiments: JNR DEF JDS DMH KPP AM JSL JP RS MT NJW MAM. Analyzed the data: JNR DEF JDS DMH KPP AM JSL JP RS MT NJW JMV MAM. Contributed reagents/materials/analysis tools: DEF JDS DMH KPP. Wrote the paper: JNR DEF JDS DMH KPP AM JSL JP RS MT NJW JMV MAM.
- 1. Bharti AR, Nally JE, Ricaldi JN, Matthias MA, Diaz MM, et al. (2003) Leptospirosis: a zoonotic disease of global importance. Lancet Infect Dis 3: 757–771. doi: 10.1016/s1473-3099(03)00830-2
- 2. Ko AI, Goarant C, Picardeau M (2009) Leptospira: the dawn of the molecular genetics era for an emerging zoonotic pathogen. Nat Rev Microbiol 7: 736–747. doi: 10.1038/nrmicro2208
- 3. Ko AI, Galvao Reis M, Ribeiro Dourado CM, Johnson WD Jr, Riley LW (1999) Urban epidemic of severe leptospirosis in Brazil. Salvador Leptospirosis Study Group. Lancet 354: 820–825. doi: 10.1016/s0140-6736(99)80012-9
- 4. Ralph D, McClelland M (1994) Phylogenetic evidence for horizontal transfer of an intervening sequence between species in a spirochete genus. J Bacteriol 176: 5982–5987.
- 5. Zuerner R, Haake D, Adler B, Segers R (2000) Technological advances in the molecular biology of Leptospira. J Mol Microbiol Biotechnol 2: 455–462.
- 6. Haake DA, Suchard MA, Kelley MM, Dundoo M, Alt DP, et al. (2004) Molecular evolution and mosaicism of leptospiral outer membrane proteins involves horizontal DNA transfer. J Bacteriol 186: 2818–2828. doi: 10.1128/jb.186.9.2818-2828.2004
- 7. Morey RE, Galloway RL, Bragg SL, Steigerwalt AG, Mayer LW, et al. (2006) Species-specific identification of Leptospiraceae by 16S rRNA gene sequencing. J Clin Microbiol 44: 3510–3516. doi: 10.1128/jcm.00670-06
- 8. Matthias MA, Ricaldi JN, Cespedes M, Diaz MM, Galloway RL, et al. (2008) Human leptospirosis caused by a new, antigenically unique Leptospira associated with a Rattus species reservoir in the Peruvian Amazon. PLoS Negl Trop Dis 2: e213. doi: 10.1371/journal.pntd.0000213
- 9. Victoria B, Ahmed A, Zuerner RL, Ahmed N, Bulach DM, et al. (2008) Conservation of the S10-spc-alpha locus within otherwise highly plastic genomes provides phylogenetic insight into the genus Leptospira. PLoS One 3: e2752. doi: 10.1371/journal.pone.0002752
- 10. Zakeri S, Khorami N, Ganji ZF, Sepahian N, Malmasi AA, et al. (2010) Leptospira wolffii, a potential new pathogenic Leptospira species detected in human, sheep and dog. Infect Genet Evol 10: 273–277. doi: 10.1016/j.meegid.2010.01.001
- 11. Slack AT, Khairani-Bejo S, Symonds ML, Dohnt MF, Galloway RL, et al. (2009) Leptospira kmetyi sp. nov., isolated from an environmental source in Malaysia. Int J Syst Evol Microbiol 59: 705–708. doi: 10.1099/ijs.0.002766-0
- 12. Slack AT, Kalambaheti T, Symonds ML, Dohnt MF, Galloway RL, et al. (2008) Leptospira wolffii sp. nov., isolated from a human with suspected leptospirosis in Thailand. Int J Syst Evol Microbiol 58: 2305–2308. doi: 10.1099/ijs.0.64947-0
- 13. Levett PN, Morey RE, Galloway RL, Steigerwalt AG (2006) Leptospira broomii sp. nov., isolated from humans with leptospirosis. Int J Syst Evol Microbiol 56: 671–673. doi: 10.1099/ijs.0.63783-0
- 14. Brenner DJ, Kaufmann AF, Sulzer KR, Steigerwalt AG, Rogers FC, et al. (1999) Further determination of DNA relatedness between serogroups and serovars in the family Leptospiraceae with a proposal for Leptospira alexanderi sp. nov. and four new Leptospira genomospecies. Int J Syst Bacteriol 49 Pt 2: 839–858. doi: 10.1099/00207713-49-2-839
- 15. Segura ER, Ganoza CA, Campos K, Ricaldi JN, Torres S, et al. (2005) Clinical spectrum of pulmonary involvement in leptospirosis in a region of endemicity, with quantification of leptospiral burden. Clin Infect Dis 40: 343–351. doi: 10.1086/427110
- 16. Petersen AM, Boye K, Blom J, Schlichting P, Krogfelt KA (2001) First isolation of Leptospira fainei serovar Hurstbridge from two human patients with Weil's syndrome. J Med Microbiol 50: 96–100.
- 17. Ren SX, Fu G, Jiang XG, Zeng R, Miao YG, et al. (2003) Unique physiological and pathogenic features of Leptospira interrogans revealed by whole-genome sequencing. Nature 422: 888–893. doi: 10.1038/nature01597
- 18. Nascimento AL, Ko AI, Martins EA, Monteiro-Vitorello CB, Ho PL, et al. (2004) Comparative genomics of two Leptospira interrogans serovars reveals novel insights into physiology and pathogenesis. J Bacteriol 186: 2164–2172. doi: 10.1128/jb.186.7.2164-2172.2004
- 19. Bulach DM, Zuerner RL, Wilson P, Seemann T, McGrath A, et al. (2006) Genome reduction in Leptospira borgpetersenii reflects limited transmission potential. Proc Natl Acad Sci U S A 103: 14560–14565. doi: 10.1073/pnas.0603979103
- 20. Picardeau M, Bulach DM, Bouchier C, Zuerner RL, Zidane N, et al. (2008) Genome sequence of the saprophyte Leptospira biflexa provides insights into the evolution of Leptospira and the pathogenesis of leptospirosis. PLoS One 3: e1607. doi: 10.1371/journal.pone.0001607
- 21. Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, et al. (2000) A whole-genome assembly of Drosophila. Science 287: 2196–2204. doi: 10.1126/science.287.5461.2196
- 22. Delcher AL, Phillippy A, Carlton J, Salzberg SL (2002) Fast algorithms for large-scale genome alignment and comparison. Nucleic Acids Res 30: 2478–2483. doi: 10.1093/nar/30.11.2478
- 23. Finn RD, Mistry J, Tate J, Coggill P, Heger A, et al. (2010) The Pfam protein families database. Nucleic Acids Res 38: D211–222. doi: 10.1093/nar/gkp985
- 24. Haft DH, Loftus BJ, Richardson DL, Yang F, Eisen JA, et al. (2001) TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res 29: 41–43. doi: 10.1093/nar/29.1.41
- 25. Finn RD, Clements J, Eddy SR (2011) HMMER web server: interactive sequence similarity searching. Nucleic Acids Res 39: W29–37. doi: 10.1093/nar/gkr367
- 26. Klimke W, Agarwala R, Badretdin A, Chetvernin S, Ciufo S, et al. (2009) The National Center for Biotechnology Information's Protein Clusters Database. Nucleic Acids Res 37: D216–223. doi: 10.1093/nar/gkn734
- 27. Petersen TN, Brunak S, von Heijne G, Nielsen H (2011) SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 8: 785–786. doi: 10.1038/nmeth.1701
- 28. Krogh A, Larsson B, von Heijne G, Sonnhammer EL (2001) Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305: 567–580. doi: 10.1006/jmbi.2000.4315
- 29. Kanehisa M, Goto S (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28: 27–30. doi: 10.1093/nar/28.1.27
- 30. Riley M (1993) Functions of the gene products of Escherichia coli. Microbiol Rev 57: 862–952.
- 31. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29.
- 32. Madupu R, Richter A, Dodson RJ, Brinkac L, Harkins D, et al. (2012) CharProtDB: a database of experimentally characterized protein annotations. Nucleic Acids Res 40: D237–241. doi: 10.1093/nar/gkr1133
- 33. Goll J, Montgomery R, Brinkac LM, Schobel S, Harkins DM, et al. (2010) The Protein Naming Utility: a rules database for protein nomenclature. Nucleic Acids Res 38: D336–339. doi: 10.1093/nar/gkp958
- 34. Haft DH, Selengut JD, Brinkac LM, Zafar N, White O (2005) Genome Properties: a system for the investigation of prokaryotic genetic content for microbiology, genome annotation and comparative genomics. Bioinformatics 21: 293–306. doi: 10.1093/bioinformatics/bti015
- 35. Lowe TM, Eddy SR (1997) tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25: 955–964. doi: 10.1093/nar/25.5.955
- 36. Gardner PP, Daub J, Tate JG, Nawrocki EP, Kolbe DL, et al. (2009) Rfam: updates to the RNA families database. Nucleic Acids Res 37: D136–140. doi: 10.1093/nar/gkn766
- 37. Laslett D, Canback B (2004) ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32: 11–16. doi: 10.1093/nar/gkh152
- 38. Varani AM, Siguier P, Gourbeyre E, Charneau V, Chandler M (2011) ISsaga is an ensemble of web-based methods for high throughput identification and semi-automatic annotation of insertion sequences in prokaryotic genomes. Genome Biol 12: R30. doi: 10.1186/gb-2011-12-3-r30
- 39. Langille MG, Brinkman FS (2009) IslandViewer: an integrated interface for computational identification and visualization of genomic islands. Bioinformatics 25: 664–665. doi: 10.1093/bioinformatics/btp030
- 40. Langille MG, Hsiao WW, Brinkman FS (2008) Evaluation of genomic island predictors using a comparative genomics approach. BMC Bioinformatics 9: 329. doi: 10.1186/1471-2105-9-329
- 41. Hsiao W, Wan I, Jones SJ, Brinkman FS (2003) IslandPath: aiding detection of genomic islands in prokaryotes. Bioinformatics 19: 418–420. doi: 10.1093/bioinformatics/btg004
- 42. Waack S, Keller O, Asper R, Brodag T, Damm C, et al. (2006) Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models. BMC Bioinformatics 7: 142.
- 43. Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, et al. (2004) Versatile and open software for comparing large genomes. Genome Biol 5: R12. doi: 10.1186/gb-2004-5-2-r12
- 44. Fouts DE, Mongodin EF, Mandrell RE, Miller WG, Rasko DA, et al. (2005) Major structural differences and novel potential virulence mechanisms from the genomes of multiple campylobacter species. PLoS Biol 3: e15. doi: 10.1371/journal.pbio.0030015
- 45. Yu C, Zavaljevski N, Desai V, Reifman J (2011) QuartetS: a fast and accurate algorithm for large-scale orthology detection. Nucleic Acids Res 39: e88. doi: 10.1093/nar/gkr308
- 46. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, et al. (2008) The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9: 75. doi: 10.1186/1471-2164-9-75
- 47. Fouts DE (2006) Phage_Finder: automated identification and classification of prophage regions in complete bacterial genome sequences. Nucleic Acids Res 34: 5839–5851. doi: 10.1093/nar/gkl732
- 48. Eddy SR (2009) A new generation of homology search tools based on probabilistic inference. Genome Inform 23: 205–211. doi: 10.1142/9781848165632_0019
- 49. Bourhy P, Frangeul L, Couve E, Glaser P, Saint Girons I, et al. (2005) Complete nucleotide sequence of the LE1 prophage from the spirochete Leptospira biflexa and characterization of its replication and partition functions. J Bacteriol 187: 3931–3940. doi: 10.1128/jb.187.12.3931-3940.2005
- 50. Saint Girons I, Margarita D, Amouriaux P, Baranton G (1990) First isolation of bacteriophages for a spirochaete: potential genetic tools for Leptospira. Res Microbiol 141: 1131–1138. doi: 10.1016/0923-2508(90)90086-6
- 51. Saint Girons I, Bourhy P, Ottone C, Picardeau M, Yelton D, et al. (2000) The LE1 bacteriophage replicates as a plasmid within Leptospira biflexa: construction of an L. biflexa-Escherichia coli shuttle vector. J Bacteriol 182: 5700–5705. doi: 10.1128/jb.182.20.5700-5705.2000
- 52. Breaker RR (2009) Riboswitches: from ancient gene-control systems to modern drug targets. Future Microbiol 4: 771–773. doi: 10.2217/fmb.09.46
- 53. Barrick JE, Corbino KA, Winkler WC, Nahvi A, Mandal M, et al. (2004) New RNA motifs suggest an expanded scope for riboswitches in bacterial genetic control. Proc Natl Acad Sci U S A 101: 6421–6426. doi: 10.1073/pnas.0308014101
- 54. Diez A, Gustavsson N, Nystrom T (2000) The universal stress protein A of Escherichia coli is required for resistance to DNA damaging agents and is regulated by a RecA/FtsK-dependent regulatory pathway. Mol Microbiol 36: 1494–1503. doi: 10.1046/j.1365-2958.2000.01979.x
- 55. Lo M, Bulach DM, Powell DR, Haake DA, Matsunaga J, et al. (2006) Effects of temperature on gene expression patterns in Leptospira interrogans serovar Lai as assessed by whole-genome microarrays. Infect Immun 74: 5848–5859. doi: 10.1128/iai.00755-06
- 56. Lang F, Busch GL, Ritter M, Volkl H, Waldegger S, et al. (1998) Functional significance of cell volume regulatory mechanisms. Physiol Rev 78: 247–306.
- 57. Desiere F, McShan WM, van Sinderen D, Ferretti JJ, Brussow H (2001) Comparative genomics reveals close genetic relationships between phages from dairy bacteria and pathogenic Streptococci: evolutionary implications for prophage-host interactions. Virology 288: 325–341. doi: 10.1006/viro.2001.1085
- 58. Hendrix RW, Lawrence JG, Hatfull GF, Casjens S (2000) The origins and ongoing evolution of viruses. Trends Microbiol 8: 504–508. doi: 10.1016/s0966-842x(00)01863-1
- 59. Tormo MA, Ferrer MD, Maiques E, Ubeda C, Selva L, et al. (2008) Staphylococcus aureus pathogenicity island DNA is packaged in particles composed of phage proteins. J Bacteriol 190: 2434–2440. doi: 10.1128/jb.01349-07
- 60. Guan S, Bastin DA, Verma NK (1999) Functional analysis of the O antigen glucosylation gene cluster of Shigella flexneri bacteriophage SfX. Microbiology 145(Pt 5):1263–1273. doi: 10.1099/13500872-145-5-1263
- 61. Wright A (1971) Mechanism of conversion of the salmonella O antigen by bacteriophageepsilon 34. J Bacteriol 105: 927–936.
- 62. Kropinski AM, Prangishvili D, Lavigne R (2009) Position paper: the creation of a rational scheme for the nomenclature of viruses of Bacteria and Archaea. Environ Microbiol 11: 2775–2777. doi: 10.1111/j.1462-2920.2009.01970.x
- 63. Bulach DM, Kalambaheti T, de la Pena-Moctezuma A, Adler B (2000) Lipopolysaccharide biosynthesis in Leptospira. J Mol Microbiol Biotechnol 2: 375–380. doi: 10.1128/iai.68.7.3793-3798.2000
- 64. Qin JH, Zhang Q, Zhang ZM, Zhong Y, Yang Y, et al. (2008) Identification of a novel prophage-like gene cluster actively expressed in both virulent and avirulent strains of Leptospira interrogans serovar Lai. Infect Immun 76: 2411–2419. doi: 10.1128/iai.01730-07
- 65. Nakayama K, Takashima K, Ishihara H, Shinomiya T, Kageyama M, et al. (2000) The R-type pyocin of Pseudomonas aeruginosa is related to P2 phage, and the F-type is related to lambda phage. Mol Microbiol 38: 213–231. doi: 10.1046/j.1365-2958.2000.02135.x
- 66. Daw MA, Falkiner FR (1996) Bacteriocins: nature, function and structure. Micron 27: 467–479. doi: 10.1016/s0968-4328(96)00028-5
- 67. Sim SH, Yu Y, Lin CH, Karuturi RK, Wuthiekanun V, et al. (2008) The core and accessory genomes of Burkholderia pseudomallei: implications for human melioidosis. PLoS Pathog 4: e1000178. doi: 10.1371/journal.ppat.1000178
- 68. Stalheim OH, Wilson JB (1964) Cultivation of Leptospirae. I. Nutrition of Leptospira Canicola. J Bacteriol 88: 48–54.
- 69. Savvi S, Warner DF, Kana BD, McKinney JD, Mizrahi V, et al. (2008) Functional characterization of a vitamin B12-dependent methylmalonyl pathway in Mycobacterium tuberculosis: implications for propionate metabolism during growth on fatty acids. J Bacteriol 190: 3886–3895. doi: 10.1128/jb.01767-07
- 70. Verma A, Rathinam SR, Priya CG, Muthukkaruppan VR, Stevenson B, et al. (2008) LruA and LruB antibodies in sera of humans with leptospiral uveitis. Clin Vaccine Immunol 15: 1019–1023. doi: 10.1128/cvi.00203-07
- 71. Lin YP, McDonough SP, Sharma Y, Chang YF (2010) The terminal immunoglobulin-like repeats of LigA and LigB of Leptospira enhance their binding to gelatin binding domain of fibronectin and host cells. PLoS One 5: e11301. doi: 10.1371/journal.pone.0011301
- 72. Oliveira R, de Morais ZM, Goncales AP, Romero EC, Vasconcellos SA, et al. (2011) Characterization of novel OmpA-like protein of Leptospira interrogans that binds extracellular matrix molecules and plasminogen. PLoS One 6: e21962. doi: 10.1371/journal.pone.0021962
- 73. Atzingen MV, Barbosa AS, De Brito T, Vasconcellos SA, de Morais ZM, et al. (2008) Lsa21, a novel leptospiral protein binding adhesive matrix molecules and present during human infection. BMC Microbiol 8: 70. doi: 10.1186/1471-2180-8-70
- 74. Atzingen MV, Gomez RM, Schattner M, Pretre G, Goncales AP, et al. (2009) Lp95, a novel leptospiral protein that binds extracellular matrix components and activates e-selectin on endothelial cells. J Infect 59: 264–276. doi: 10.1016/j.jinf.2009.07.010
- 75. Barbosa AS, Abreu PA, Neves FO, Atzingen MV, Watanabe MM, et al. (2006) A newly identified leptospiral adhesin mediates attachment to laminin. Infect Immun 74: 6356–6364. doi: 10.1128/iai.00460-06
- 76. Carvalho E, Barbosa AS, Gomez RM, Cianciarullo AM, Hauk P, et al. (2009) Leptospiral TlyC is an extracellular matrix-binding protein and does not present hemolysin activity. FEBS Lett 583: 1381–1385. doi: 10.1016/j.febslet.2009.03.050
- 77. Lin YP, Chang YF (2007) A domain of the Leptospira LigB contributes to high affinity binding of fibronectin. Biochem Biophys Res Commun 362: 443–448. doi: 10.1016/j.bbrc.2007.07.196
- 78. Lin YP, Chang YF (2008) The C-terminal variable domain of LigB from Leptospira mediates binding to fibronectin. J Vet Sci 9: 133–144. doi: 10.4142/jvs.2008.9.2.133
- 79. Lin YP, Greenwood A, Nicholson LK, Sharma Y, McDonough SP, et al. (2009) Fibronectin binds to and induces conformational change in a disordered region of leptospiral immunoglobulin-like protein B. J Biol Chem 284: 23547–23557. doi: 10.1074/jbc.m109.031369
- 80. Lin YP, Greenwood A, Yan W, Nicholson LK, Sharma Y, et al. (2009) A novel fibronectin type III module binding motif identified on C-terminus of Leptospira immunoglobulin-like protein, LigB. Biochem Biophys Res Commun 389: 57–62. doi: 10.1016/j.bbrc.2009.08.089
- 81. Vieira ML, Atzingen MV, Oliveira TR, Oliveira R, Andrade DM, et al. (2010) In vitro identification of novel plasminogen-binding receptors of the pathogen Leptospira interrogans. PLoS One 5: e11259. doi: 10.1371/journal.pone.0011259
- 82. Vieira ML, Vasconcellos SA, Goncales AP, de Morais ZM, Nascimento AL (2009) Plasminogen acquisition and activation at the surface of Leptospira species lead to fibronectin degradation. Infect Immun 77: 4092–4101. doi: 10.1128/iai.00353-09
- 83. O'Neill E, Pozzi C, Houston P, Humphreys H, Robinson DA, et al. (2008) A novel Staphylococcus aureus biofilm phenotype mediated by the fibronectin-binding proteins, FnBPA and FnBPB. J Bacteriol 190: 3835–3850. doi: 10.1128/jb.00167-08
- 84. Moser I, Schroeder W, Salnikow J (1997) Campylobacter jejuni major outer membrane protein and a 59-kDa protein are involved in binding to fibronectin and INT 407 cell membranes. FEMS Microbiol Lett 157: 233–238. doi: 10.1111/j.1574-6968.1997.tb12778.x
- 85. Choy HA, Kelley MM, Chen TL, Moller AK, Matsunaga J, et al. (2007) Physiological osmotic induction of Leptospira interrogans adhesion: LigA and LigB bind extracellular matrix proteins and fibrinogen. Infect Immun 75: 2441–2450. doi: 10.1128/iai.01635-06
- 86. Perez J, Goarant C (2010) Rapid Leptospira identification by direct sequencing of the diagnostic PCR products in New Caledonia. BMC Microbiol 10: 325. doi: 10.1186/1471-2180-10-325
- 87. Lundborg M, Modhukur V, Widmalm G (2010) Glycosyltransferase functions of E. coli O-antigens. Glycobiology 20: 366–368. doi: 10.1093/glycob/cwp185
- 88. Faine S, Adler B, Palit A (1974) Chemical, serological and biological properties of a serotype-specific polysaccharide antigen in Leptospira. Aust J Exp Biol Med Sci 52: 311–319. doi: 10.1038/icb.1974.29
- 89. Vinh T, Adler B, Faine S (1986) Ultrastructure and chemical composition of lipopolysaccharide extracted from Leptospira interrogans serovar copenhageni. J Gen Microbiol 132: 103–109. doi: 10.1099/00221287-132-1-103
- 90. Vinh TU, Shi MH, Adler B, Faine S (1989) Characterization and taxonomic significance of lipopolysaccharides of Leptospira interrogans serovar hardjo. J Gen Microbiol 135: 2663–2673. doi: 10.1099/00221287-135-10-2663
- 91. Arsenault TL, Hughes DW, Maclean DB, Szarek WA, Kropinski AMB, et al. (1991) Structural studies on the polysaccharide portion of A-band lipopolysaccharide from a mutant (AK1401) of Pseudomonas aeruginosa strain PAO1. Can J Chem 69: 1273–1280. doi: 10.1139/v91-190
- 92. Van Melderen L (2010) Toxin-antitoxin systems: why so many, what for? Curr Opin Microbiol 13: 781–785. doi: 10.1016/j.mib.2010.10.006
- 93. Ramage HR, Connolly LE, Cox JS (2009) Comprehensive functional analysis of Mycobacterium tuberculosis toxin-antitoxin systems: implications for pathogenesis, stress responses, and evolution. PLoS Genet 5: e1000767. doi: 10.1371/journal.pgen.1000767
- 94. Picardeau M, Ren S, Saint Girons I (2001) Killing effect and antitoxic activity of the Leptospira interrogans toxin-antitoxin system in Escherichia coli. J Bacteriol 183: 6494–6497. doi: 10.1128/jb.183.21.6494-6497.2001
- 95. Picardeau M, Le Dantec C, Richard GF, Saint Girons I (2003) The spirochetal chpK-chromosomal toxin-antitoxin locus induces growth inhibition of yeast and mycobacteria. FEMS Microbiol Lett 229: 277–281. doi: 10.1016/s0378-1097(03)00848-6
- 96. Pandey DP, Gerdes K (2005) Toxin-antitoxin loci are highly abundant in free-living but lost from host-associated prokaryotes. Nucleic Acids Res 33: 966–976. doi: 10.1093/nar/gki201
- 97. Makarova KS, Wolf YI, Koonin EV (2009) Comprehensive comparative-genomic analysis of type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes. Biol Direct 4: 19. doi: 10.1186/1745-6150-4-19