Genotypic Homogeneity of Multidrug Resistant S. Typhimurium Infecting Distinct Adult and Childhood Susceptibility Groups in Blantyre, Malawi

Nontyphoidal Salmonella (NTS) serovars are a common cause of bacteraemia in young children and HIV-infected adults in Malawi and elsewhere in sub-Saharan Africa. These patient populations provide diverse host-immune environments that have the potential to drive bacterial adaptation and evolution. We therefore investigated the diversity of 27 multidrug resistant (MDR) Salmonella Typhimurium strains isolated over 6 years (2002–2008) from HIV-infected adults and children and HIV-uninfected children. Sequence reads from whole-genome sequencing of these isolates using the Illumina GA platform were mapped to the genome of the laboratory strain S. Typhimurium SL1344, excluding homoplastic regions that contained prophage and insertion elements. A phylogenetic tree generated from single nucleotide polymorphisms (SNPs) showed that all 27 strains clustered with the prototypical MDR strain D23580. There was no clustering of strains based on host HIV status or age, suggesting that these susceptible populations acquire S. Typhimurium from common sources or that isolates are transmitted freely between these populations. However, 7/14 of the most recent isolates (2006/2008) formed a distinct clade that branched off 22 SNPs away from the cluster containing earlier isolates. These data suggest that the MDR bacterial population is not static, but is undergoing microevolution which might result in further epidemiological change.


Introduction
Nontyphoidal Salmonella serovars (NTS) have been a prominent cause of potentially fatal bacteraemia in Malawi and elsewhere in sub-Saharan Africa for more than fifteen years [1,2,3,4]. The majority of isolates in this epidemic are multidrug resistant (MDR) (ampicillin, chloramphenicol and cotrimoxazole), leaving limited therapeutic options for case management [5,6]. Reports of ceftriaxone and ciprofloxacin resistant Salmonella Typhimurium are now emerging from South Africa, emphasising the paramount importance of developing appropriate public health prevention strategies [7]. A vaccine approach to the prevention of NTS bacteraemia will require both an improved understanding of the mechanisms of protective immunity [8,9,10,11] and of the genetic diversity of the circulating invasive strains [12].
In sub-Saharan Africa, invasive NTS particularly affects young children, frequently in association with malnutrition, malaria, severe anaemia and/or HIV infection; and adults almost exclusively with marked HIV-associated immunosuppression [4,13,14]. Acquisition of Salmonella antibody in children is thought to be protective against bacteraemia [10], while in HIV-infected adults there appears to be dysregulated humoral immunity due to production of antibodies that block bactericidal killing of NTS [9], and dysregulated cellular immunity [15,16,17]. Until recently, our understanding of the diversity of Salmonella strains causing bacteraemia in children and HIV-infected adults in Malawi was based on serovar and antibiotic resistance differences [5]. We have recently determined that the S. Typhimurium strains in Malawi and Kenya belong to sequence type ST313, which is rarely isolated outside sub-Saharan Africa. Whole genome sequencing of epidemic ST313 strains identified a distinct prophage repertoire and a composite element encoding MDR genes located on a virulence-associated plasmid (pSLT-BT, EMBL accession number FN432031) [18]. Evidence of genome degradation, including pseudogene formation and chromosomal deletions found in these strains, suggests that these S. Typhimurium strains may have become human adapted [19,20,21]. Similar genomic degradation has been identified in host-restricted Salmonella serovars such as S. Typhi, S. Paratyphi A and S. Gallinarum [22,23,24].
Experimental studies with E. coli raise the possibility that the environment within a host can personalise infecting pathogens [25,26,27,28]. A recent study of recurrent invasive ST313 S. Typhimurium disease in HIV-infected Malawian adults has, however, shown no evidence of within-individual microevolution across multiple episodes of recurrent disease over time [29]. In view of the distinct at-risk populations for invasive NTS in sub-Saharan Africa and the potential impact on prevention strategies, we have investigated the hypothesis that host differences imprint on the genotype of ST313 MDR S. Typhimurium isolates from HIV-infected adults and children and from HIV-uninfected children.

Results
Salmonella Typhimurium isolates that underwent Illumina sequencing comprised representative isolates from HIV-infected adults, HIV-infected children and HIV-uninfected children, from two periods: 2002/2003 and 2006/2008. All 27 isolates from both time periods clustered with the MDR epidemic strain D23580 (Figure 1). Isolates from patients with the same HIV status or age were no more related to each other than isolates from other susceptibility groups (Figure 2).
There was, however, distinct clustering within the D23580 clade based on year of isolation. A cluster of seven isolates contained exclusively isolates from 2006/2008. This small cluster was separated by 22 SNPs from the main group of 20 isolates and contained isolates from each of the three susceptibility groups: two from HIV-infected adults, one from an HIV-infected child and four from HIV-uninfected children (Figure 2). The topology of the tree indicates that the small cluster containing some of the most recent isolates shares a common ancestor with the larger cluster at the root of the tree, and does not branch off from within the large cluster containing other recent and all the older isolates. The maximum SNP distance within the main group of 20 isolates was 15 SNPs, to the 2008 blood isolate A50063. The tree pattern suggests limited accumulation of SNPs over time in the sampled S. Typhimurium.
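The SNP distances quoted above can be illustrated with a minimal sketch that counts pairwise differences between isolates across an alignment of variable sites. The isolate names and sequences below are hypothetical toy data, not the study's alignment, and pairwise counts of this kind only approximate the branch-based distances read from a phylogenetic tree.

```python
from itertools import combinations

AMBIGUOUS = {"N", "-"}  # assumption: uncalled bases and gaps are not counted as differences


def snp_distance(a: str, b: str) -> int:
    """Count sites at which two aligned SNP-site strings carry different called bases."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in AMBIGUOUS and y not in AMBIGUOUS
    )


def pairwise_distances(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """Return the SNP distance for every unordered pair of isolates (names sorted)."""
    return {
        (p, q): snp_distance(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment of eight variable sites.
toy_alignment = {
    "isolate_A": "ACGTACGT",
    "isolate_B": "ACGTACGA",
    "isolate_C": "TCGAACNA",
}
distances = pairwise_distances(toy_alignment)
```

In this toy example the ambiguous site in isolate_C is skipped rather than scored, which is one of several reasonable conventions.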
In order to determine whether isolates from distinct susceptibility groups or dates of isolation had distinct genome features we carried out de novo assembly of short read sequence and ordered the contiguous sequence using the S. Typhimurium D23580 genome sequence and plasmid pSLT-BT from S. Typhimurium D23580. All 27 draft genomes were assembled and compared using SNPsFinder [30] and Artemis comparison tool (ACT) [31]. Three SNPs located within intergenic space/noncoding regions, 12 non-synonymous and 2 synonymous SNPs distinguished the 2006/2008 subgroup of strains from the large cluster that includes D23580 (Table 1). Differences between the populations include amino acid substitutions in an arginine exporter protein and in the enzymes pyruvate dehydrogenase, fumarate hydratase and nitrate reductase (Table 1). Arginine is essential in the induction of inducible nitric oxide synthase (iNOS) but also promotes the growth of bacterial pathogens [32,33]. Nitrate reductase contributes to the intracellular survival of Salmonella. Pyruvate dehydrogenase and fumarate hydratase are important in the production of energy for bacterial cell activities [34,35]. There was no evidence of further genome degradation in the seemingly diverging sub-population except for one nonsense mutation in an open reading frame coding for xylanase/chitin deacetylase (Table 1).
The multidrug resistance locus encoded within a Tn21-like element on plasmid pSLT-BT from strains in the 2006/2008 subgroup is similar to the integron earlier characterised in S. Typhimurium D23580. Sequence reads of all 27 genomes were mapped to the sequence of the 117 kb plasmid pSLT-BT. Average percent coverage of plasmid pSLT-BT was 95%, excluding sequence reads from one strain with a low percent coverage of 12% (Table S1). Assembled plasmid sequences showed that the 5% of the plasmid sequence that was not mapped is probably due to repeat regions within the Tn21-like element where sequence gaps exist. Mapping of sequence reads to pSLT-BT generated two SNP loci distributed between two isolates, D36435 from the 2006/2008 subgroup and D36807 from the large cluster of 20 isolates. The two SNPs lie within a putative resolvase coding sequence flanking multidrug resistance loci in both strains. Comparison of assembled plasmid sequences from the 2006/2008 subgroup to pSLT-BT using ACT demonstrated colinearity and synteny (Figure S1).
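The distinction drawn above between synonymous, non-synonymous and nonsense SNPs can be sketched as follows. This is an illustrative classifier only, assuming the standard genetic code and a SNP described by its reference codon, its 0-based position within that codon and the alternative base; the function name and interface are hypothetical and do not reproduce SNPsFinder or Artemis.

```python
# Standard genetic code in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_snp(ref_codon: str, offset: int, alt_base: str) -> str:
    """Classify a coding SNP by comparing the translated reference and variant codons."""
    if offset not in (0, 1, 2):
        raise ValueError("offset must be 0, 1 or 2")
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "non-synonymous"


# Hypothetical examples, not SNPs reported in this study.
examples = {
    "CGT>CGC": classify_snp("CGT", 2, "C"),  # Arg to Arg
    "GCT>ACT": classify_snp("GCT", 0, "A"),  # Ala to Thr
    "TGG>TAG": classify_snp("TGG", 1, "A"),  # Trp to stop
}
```

SNPs in intergenic or noncoding regions fall outside this scheme, since they have no codon context to translate.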
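As a rough illustration of how a percent-coverage figure and an average that excludes an outlying strain might be derived, the sketch below computes coverage from per-base read depths and then averages across strains. The strain labels, depth values and the choice of "at least one read" as the covered threshold are assumptions made for the example, not values from Table S1.

```python
def percent_coverage(depths: list[int], min_depth: int = 1) -> float:
    """Percentage of reference positions covered by at least `min_depth` reads."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def mean_excluding(values: dict[str, float], excluded: set[str]) -> float:
    """Average the values of all strains that are not in the excluded set."""
    kept = [v for name, v in values.items() if name not in excluded]
    if not kept:
        raise ValueError("no strains left after exclusion")
    return sum(kept) / len(kept)


# Hypothetical per-strain coverage of the plasmid reference (percent).
toy_coverage = {"strain_1": 96.0, "strain_2": 94.0, "strain_3": 12.0}
average = mean_excluding(toy_coverage, excluded={"strain_3"})
```

Reporting the excluded strain separately, as the text does, keeps a single poorly covered sample from distorting the summary figure.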

Discussion
We addressed the hypothesis that the genotypic composition of populations of multidrug resistant S. Typhimurium strains in Malawi will be influenced by factors which are responsible for the host's susceptibility to invasive NTS infection, the principal factors being HIV infection and young age. Phylogenetic analysis revealed a high degree of genetic relatedness of MDR isolates (Figure 2). Clustering of S. Typhimurium strains in the phylogenetic tree did not show host preference between HIV-uninfected children and HIV-infected adults. It appears therefore that these strains have not undergone selection or adaptation to the different susceptible host groups. Selection pressures in the affected children and HIV-infected adults may be similar. Adaptive mutations have been identified in experimental models where mutations in E. coli isolates passaged through human volunteers could be identified with a particular infected host [25]. This was not the case in our different NTS affected groups. The absence of genotypes associated with a susceptible group does not entirely rule out the possibility of phenotypic adaptation of S. Typhimurium strains to the different hosts. Maestroni et al. have recently demonstrated regulation of fitness genes in S. Typhimurium which enhance virulence after passage through mice without any apparent mutations [36]. It is possible that over a longer period of time such adaptations to different human hosts will emerge.
The high level of genetic relatedness among S. Typhimurium isolates from children and HIV-infected adults has implications for strategies to determine the source and mode of transmission and for development of interventions such as a vaccine. MDR S. Typhimurium in Malawi account for 90% of all NTS bacteraemia isolates but the source and modes of transmission have not been identified. Our findings suggest that both children and HIV-infected adults are infected by strains emanating from similar sources. Previous studies in Kenya and the Gambia have suggested that human-to-human transmission occurs in invasive NTS disease, since similar isolates could not be found in zoonotic and environmental sources [37,38]. A majority of the strains in the phylogenetic tree clustered with the fully sequenced and annotated invasive S. Typhimurium strain D23580. We have previously shown a reduced genome in D23580 in comparison to gastroenteritis strains, a finding consistent with what we might expect of strains undergoing host adaptation. If S. Typhimurium strains in Malawi are circulating between HIV-infected adults and children, it will be important to determine which one of the susceptible groups is the main reservoir.
Multidrug resistant S. Typhimurium strains and other NTS serovars are not limited to Malawi; they are spread across sub-Saharan Africa [1,3,5,7]. Understanding the genetic diversity of these strains is key to developing a much needed vaccine as there are limited antibiotic choices against NTS infections. We have found that S. Typhimurium strains causing infection in adults and children are genetically similar, suggesting that suitable antigenic targets for a vaccine providing protection to both children and HIV-infected adults might be identified. This type of study must be expanded to characterise diversity of other serovar populations in Malawi, including S. Enteritidis (the second most common NTS bacteraemia isolate), and serovars in other parts of sub-Saharan Africa to ensure adequate coverage by candidate vaccines.
We have determined that 7/27 closely related strains of S. Typhimurium from the more recent period between 2006 and 2008 formed a separate clade, suggesting a complex epidemiology involving microevolution over time. This divergence has some similarities to the epidemic increase of multidrug resistant S. Typhimurium strains between 2001 and 2002 [5] but the 2006/2008 divergent strains are closely related to the predominant type strain. The seemingly diverging subcluster of strains could be a reflection of much wider genetic variation that has not been fully captured in the characterised sample size in this study. The longest SNP distance within the large cluster of strains is 15 SNPs to a tip with a single isolate from 2008. This genetic distance is seven SNPs fewer than the distance distinguishing the 2006/2008 group of seven strains. Therefore it will be necessary to investigate a larger sample size than the present 27 strains to ascertain further the level of genetic diversity in the population of multidrug resistant S. Typhimurium [39]. Further investigations to understand what these single nucleotide polymorphisms mean to the clinical presentation in both HIV-infected adults and children, and to the epidemiology of S. Typhimurium strains over time are warranted. It will also be important to determine the selection pressures driving these changes.
In conclusion we have demonstrated the genotypic homogeneity of MDR S. Typhimurium strains isolated from HIV-infected adults and children and from HIV-uninfected children, indicating that similar strains are circulating between the distinct susceptibility groups. Homogeneity of the strains suggests that adults and children may have the same sources of infection. We have shown that microevolution of MDR S. Typhimurium strains has occurred in Malawi within the past decade. Molecular tools are now available to study the biology of these pathogens and to explore mechanisms that influence clinical presentation and epidemiology.

Table 1 legend (beginning not recovered): (19) and the corresponding amino acid changes. STM and STM_Mal are systematic identifiers for coding sequences in LT2 and D23580. Trp, tryptophan; Gln, glutamine; Arg, arginine; Ala, alanine; Thr, threonine; Leu, leucine; Ile, isoleucine; His, histidine; Cys, cysteine; Asn, asparagine; Gly, glycine. doi:10.1371/journal.pone.0042085.t001

Bacterial Isolates
We investigated 27 invasive multidrug resistant S. Typhimurium strains isolated from blood or cerebrospinal fluid during the period 2002 to 2008 (Table 2). As described previously, these isolates were obtained as part of routine surveillance for invasive bacterial infection amongst children and adults presenting with a febrile illness to Queen Elizabeth Central Hospital (QECH), Blantyre, Malawi [5]. Bacterial strains were selected randomly from 234 S. Typhimurium isolates from blood or cerebrospinal fluid of children and adults who participated in three studies in 2002, 2006 and 2008 at QECH and whose HIV serostatus was known. Twenty-one strains were from children: 10 selected from 65 isolates from HIV-infected children and 11 from 72 isolates from HIV-uninfected children. Six strains were selected from a group of 97 isolates from HIV-infected adults. Bacterial culture and antibiotic sensitivity testing were carried out using standard previously described protocols [5].

DNA preparation and Sequence alignment
Genomic DNA was extracted using the Wizard Genome DNA purification kit (Promega, USA) according to the manufacturer's instructions. Genomic DNA sequencing was carried out using the Illumina GA platform (Illumina, UK) in groups of 12 index-tagged pools in 54-cycle runs at the Wellcome Trust Sanger Institute, UK. Sequence reads were mapped to plasmid pSLT-BT sequence and to the gastroenteritis S. Typhimurium strain SL1344 [40] genome sequence excluding highly variable prophage elements. Prophage elements and repeat sites constituting ~7% of the reference SL1344 genome were excluded using the repeat-finding programmes nucmer, REPuter and repeatmatch [41,42,43,44]. Reads were mapped using the Burrows-Wheeler Aligner software (BWA) with a minimum read depth of 4 [45]. SNPs were identified using mpileup and samtools and filtered with a minimum mapping quality to call a SNP of 30 and a SNP/mapping quality ratio cut-off of 0.75 [29,46,47]. A total of 1159 single nucleotide polymorphism (SNP) sites were identified across the chromosome with an average percent coverage of 94% of the SL1344 sequence (Table S2). The sequence data for all 27 S. Typhimurium were submitted to the European Read Archive under submission accession number ERA015722 (http://www.ebi.ac.uk/ena/data/view/ERA015722) and run accession numbers listed in Table S3.
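The filtering thresholds described above can be expressed as a short sketch. The record structure, field names and toy values are hypothetical, and the 0.75 cut-off is interpreted here as the fraction of reads supporting the variant base, which is an assumption about the pipeline rather than a confirmed detail. The sketch does not call samtools or BWA; it only applies the stated thresholds to already-extracted candidate sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSnp:
    position: int        # 1-based coordinate on the reference (hypothetical field)
    depth: int           # number of reads covering the site
    quality: float       # quality score attached to the call
    alt_fraction: float  # assumed meaning of the 0.75 ratio: share of reads with the variant


MIN_DEPTH = 4            # minimum read depth stated in the text
MIN_QUALITY = 30         # minimum quality to call a SNP stated in the text
MIN_ALT_FRACTION = 0.75  # ratio cut-off stated in the text


def in_excluded_region(position: int, regions: list[tuple[int, int]]) -> bool:
    """True if the position lies in a masked prophage or repeat interval (inclusive)."""
    return any(start <= position <= end for start, end in regions)


def filter_snps(
    candidates: list[CandidateSnp], excluded_regions: list[tuple[int, int]]
) -> list[CandidateSnp]:
    """Keep candidates outside masked regions that meet all three thresholds."""
    return [
        snp
        for snp in candidates
        if not in_excluded_region(snp.position, excluded_regions)
        and snp.depth >= MIN_DEPTH
        and snp.quality >= MIN_QUALITY
        and snp.alt_fraction >= MIN_ALT_FRACTION
    ]


# Hypothetical candidates: only the first should pass every filter.
toy_candidates = [
    CandidateSnp(100, 10, 60, 0.95),
    CandidateSnp(250, 3, 60, 1.00),    # depth too low
    CandidateSnp(400, 12, 20, 0.90),   # quality too low
    CandidateSnp(550, 15, 50, 0.60),   # mixed base calls
    CandidateSnp(1200, 20, 60, 1.00),  # inside a masked region
]
kept = filter_snps(toy_candidates, excluded_regions=[(1000, 2000)])
```

Masking repeats before applying the thresholds reflects the order described in the text, where prophage and repeat sites are removed from consideration regardless of call quality.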

Phylogeny
A maximum likelihood phylogram was generated based on 1159 single nucleotide polymorphism (SNP) sites across the chromosome using RAxML v7.0.4 [48] with the GTRGAMMA model of evolution and 100 bootstrap replicates. The phylogram was viewed in FigTree (available at http://tree.bio.ed.ac.uk/software/figtree/). The phylogram included the already sequenced and annotated epidemic multidrug resistant strain D23580 (EMBL accession number FN424405) and the draft genome of the pre-epidemic and chloramphenicol susceptible S. Typhimurium strain A130 (accession number ERA000075) [18]. A130 was isolated from blood in an adult patient in 1997 at QECH, Blantyre, Malawi.
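To illustrate what a bootstrap replicate is in this context, the sketch below resamples alignment columns with replacement to build pseudo-replicate alignments. It shows only the resampling step, using hypothetical toy sequences and an arbitrary seed; it does not infer trees and is not a reproduction of RAxML's own bootstrap procedure.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Resample alignment columns with replacement, keeping rows aligned to each other."""
    n_sites = len(next(iter(alignment.values())))
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("all sequences must have the same length")
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_replicates(
    alignment: dict[str, str], n_replicates: int = 100, seed: int = 0
) -> list[dict[str, str]]:
    """Generate `n_replicates` pseudo-replicate alignments from one seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment of six SNP sites.
toy_alignment = {
    "isolate_A": "ACGTAC",
    "isolate_B": "ACGTAT",
    "isolate_C": "TCGAAT",
}
replicates = bootstrap_replicates(toy_alignment, n_replicates=100, seed=42)
```

A tree would then be inferred from each replicate, and the proportion of replicate trees containing a given clade gives that clade's bootstrap support.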

Genome assembly and comparison of plasmid and chromosomal sequences
Genome assembly was conducted using VelvetOptimiser.pl [49] to assemble short read sequences and abacas.pl to order the contigs [50].
A file of SNPs identified by mapping using the Burrows-Wheeler Aligner software was opened in the Artemis file of the reference strain SL1344 under comparison to assembled sequences and the annotated strain D23580, to identify any changes in amino acids. Assembled sequences were compared using the Artemis comparison tool (ACT) [31] and also uploaded on to SNPSFinder together with the fully-sequenced and annotated S. Typhimurium strains LT2 (EMBL accession number AE006468) and D23580 [30].

Figure S1. Similarity of Tn21-like sequence from pSLT-BT to assembled sequences. Artemis comparison tool generated figures A and B depicting sequence homology of the Tn21-like sequence (green feature) from virulence plasmid pSLT-BT to assembled plasmid sequences from the seven strains in the 2006/2008 cluster. Antibiotic resistance genes (cat, chloramphenicol; blaT, beta-lactamase; SulI and SulI, sulphonamide; dhfrI, trimethoprim; aadA, aminoglycoside; StrA and StrB, streptomycin) and the quaternary ammonium compound resistance gene (qacE) are present in all six strains. BLASTN matches are shown as red bands. White spaces are gaps in these draft sequences. (TIF)