Characterization and Genetic Variation of Vibrio cholerae Isolated from Clinical and Environmental Sources in Thailand

Cholera is still an important public health problem in several countries, including Thailand. In this study, a collection of clinical and environmental V. cholerae serogroup O1, O139, and non-O1/non-O139 strains originating from Thailand (1983 to 2013) was characterized to determine phenotypic and genotypic traits and to investigate the genetic relatedness. Using a combination of conventional methods and whole genome sequencing (WGS), 78 V. cholerae strains were identified. WGS was used to determine the serogroup, biotype, virulence, mobile genetic elements, and antimicrobial resistance genes using online bioinformatics tools. In addition, phenotypic antimicrobial resistance was determined by the minimal inhibitory concentration (MIC) test. The 78 V. cholerae strains belonged to the following serogroups O1: (n = 44), O139 (n = 16) and non-O1/non-O139 (n = 18). Interestingly, we found that the typical El Tor O1 strains were the major cause of clinical cholera during 1983–2000 with two Classical O1 strains detected in 2000. In 2004–2010, the El Tor variant strains revealed genotypes of the Classical biotype possessing either only ctxB or both ctxB and rstR while they harbored tcpA of the El Tor biotype. Thirty O1 and eleven O139 clinical strains carried CTXϕ (Cholera toxin) and tcpA as well four different pathogenic islands (PAIs). Beside non-O1/non-O139, the O1 environmental strains also presented chxA and Type Three Secretion System (TTSS). The in silico MultiLocus Sequence Typing (MLST) discriminated the O1 and O139 clinical strains from other serogroups and environmental strains. ST69 was dominant in the clinical strains belonging to the 7th pandemic clone. Non-O1/non-O139 and environmental strains showed various novel STs indicating genetic variation. Multidrug-resistant (MDR) strains were observed and conferred resistance to ampicillin, azithromycin, nalidixic acid, sulfamethoxazole, tetracycline, and trimethoprim and harboured variants of the SXT elements. For the first time since 1986, the presence of V. cholerae O1 Classical was reported causing cholera outbreaks in Thailand. In addition, we found that V. cholerae O1 El Tor variant and O139 were pre-dominating the pathogenic strains in Thailand. Using WGS and bioinformatic tools to analyze both historical and contemporary V. cholerae circulating in Thailand provided a more detailed understanding of the V. cholerae epidemiology, which ultimately could be applied for control measures and management of cholera in Thailand.

Introduction Vibrio cholerae is the causative agent of the severe, watery diarrheal disease cholera. V. cholerae is classified into approximately 206 serogroups of which O1 and O139 have the potential to cause cholera outbreaks and are associated with cholera pandemics. The remaining serogroups; determined non-O1/non-O139 are often referred to as environmental cholera [1][2][3] and part of the normal flora of aquatic ecosystems [4]. Nonetheless, some non-O1/non-O139 strains have the potential to cause mild diarrhea, and outbreaks have been observed in several countries including Thailand [5][6][7]. The serogroup O1 is divided into two biotypes: Classical and El Tor, based on phenotypic differences [2].
Since 1817, cholera has spread from the Indian sub-continent and seven pandemics have been observed, the seventh of which is still ongoing. The first six pandemics were associated with the O1 Classical biotype and ceased around 1923 [8,9]. In 1961, the 7 th pandemic began in Southeast Asia, caused by the O1 El Tor biotype [3,[10][11][12][13]. Whole genome sequence (WGS) analysis has identified eight distinct phylogenetic lineages: L1-L8 with L1 and L3-L6 representing the former pandemics and L2 the present 7 th El Tor pandemic. Lineages L7 and L8 are formed by unique isolates [12]. The lineage L2 of the 7 th pandemic has further been subdivided into three waves; I, II and III, of which, wave III seems to consist of several clusters [3,12]. In general, the clusters separate isolates from Africa and India from those isolated in Haiti, Nepal, and Southeast Asia [12,14]. In 1992, V. cholerae O139 emerged and caused epidemic cholera [15] followed in 2002 by the emergence of V. cholerae O1 variants; a genetic mixture of the Classical and El Tor biotypes. The V. cholerae O1 variants were later reported in several countries in Africa and Asia [16][17][18][19]. Since 2013, after the containment of the cholera outbreak in Haiti, the number of reported cholera cases has decreased globally. In Asia however, the incidence of cholera has increased and continues to pose a serious public health concern [20].
V. cholerae consists of two chromosomes and the hallmark of pathogenic V. cholerae is the major virulence factors; cholera toxin (CT) and toxin co-regulated pilus (TCP). The two virulence factors are clustered within two regions; the Vibrio pathogenicity island I (VPI-1) encoded by TCP [21] and the CTX genetic element comprised by a core region in CTXϕ. The latter contains not only the genes of the cholera toxin, ctxAB, but also carries the zonular occludens toxin (zot) and accessory colonization enterotoxin (ace) [22]. In addition, other virulence genes encoding hemolysin (hlyA), heat stable enterotoxin (stn), mannose-sensitive hemagglutin pilus (mshA), repeats-in-toxin A toxin (rtxA), and a ToxR regulatory protein (toxR) have been associated with diarrheal disease [23,24]. Recently, the type III secretion system (TTSS) has been known as a key virulence factor and appears to be an important virulence factor for pathogenicity of non-O1/non-O139 [25].
Since 1997, endemic or sporadic cholera cases have been linked every year to contaminated seafood or potable water in Thailand [26]. Antimicrobial treatments have been recommended for only severe dehydration cases. Nonetheless, the occurrence of resistant strains has dramatically increased [27]. The presence of the SXT element and class I integron have been reported to contribute to the spread of antimicrobial resistance genes among V. cholerae and other bacteria [28].
The objective of this study was to provide more knowledge of the genotypic variation in V. cholerae observed during the past three decades in Thailand. A collection of clinical and environmental V. cholerae serogroup O1, O139, and non-O1/non-O139 strains collected between 1983 and 2013 in Thailand were characterized by a combination of conventional microbiological tests, molecular methods, next generation sequencing, and bioinformatics tools to determine the pheno-and genotypes. In addition, the distribution of virulence-associated genes and the occurrence of antimicrobial resistance and corresponding resistance genes including the class 1 integron and SXT element among V. cholerae strains were subsequently analyzed to elucidate the emerging antimicrobial resistance and virulence properties.

Bacterial strains
A total of 78 V. cholerae strains were selected for this study based on the serogroups O1, O139, and non-O1/non-O139, the sources for these strains were the clinic and environment, and date (1983-2013) from the culture collection of the Department of Microbiology, Faculty of Public Health, Mahidol University, Thailand (Table A in S1 File). The clinical strains were previously isolated from stools and rectal swabs of patients suffering from sporadic cases or outbreaks of cholera in central Thailand and the environmental strains were isolated from seafood, water, and hand swabs.

Characterization of V. cholerae
The purity of all V. cholerae strains were assessed on Thiosulfate-citrate-bile salts-sucrose (TCBS) agar prior to confirmation using a combination of biochemical, serological, and molecular methods as previously described [29,30]. Serogroups and serotypes were determined by slide agglutination utilizing specific polyvalent antisera against V. cholerae O1 and O139, and monovalent specific to Inaba and Ogawa antisera (S & A Reagents Lab, Bangkok, Thailand) and by touchdown-multiplex polymerase chain reaction (TMPCR) using speciesspecific primers for V. cholerae (ompW gene) and serogroup-specific for O1 (rfbV gene) and O139 (wbfZ gene) [30].
Whole genome sequencing V. choleare genomic DNA was extracted using the Invitrogen Easy-DNA TM Kit (Invitrogen, Carlsbad, CA, USA). The concentrations of the extracted DNA were determined using a Qubit dsDNA BR assay kit (Invitrogen). The genomic DNA was prepared for Illumina pairedend sequencing using the Illumina (Illumina, Inc., San Diego, CA) NexteraXT1 Guide 150319425031942 following protocol revision C. A sample of pooled NexteraXT Libraries was loaded onto an Illumina MiSeq reagent cartridge using MiSeq Reagent Kit v2 and 500 cycles with a Standard Flow Cell. The libraries were sequenced using the MiSeq Illumina platform and MiSeq Control Software 2.3.0.3. All strains were paired-end sequenced.
Raw sequence data were submitted to the European Nucleotide Archive (http://www.ebi. ac.uk/ena) under study accession no.: PRJEB14630 (http://www.ebi.ac.uk/ena/data/view/ PRJEB14630). The raw reads were assembled using the Assemble pipeline (version 1.0) available from the Center for Genomic Epidemiology (CGE; http://cge.cbs.dtu.dk/services/all.php) based on the Velvet algorithms for de novo short reads assembly. A complete list of genomic sequence data is available in Table B in S1 File.

The use of bioinformatics tools
Identification of V. cholerae and determination of associated virulence genes and pathogenicity islands. MyDbFinder is a BLAST-based search-engine that was developed as "an empty database" in the same format as the ResFinder tool [34] to identify user-defined genes (https://cge.cbs.dtu.dk/services/MyDbFinder/). The users populate their own database by including DNA sequences of interest in FASTA format into a pure text file. MyDbFinder query raw reads or assembled genome data and outputs the best matching genes from the user's database.
Determination of antimicrobial resistance genes, SXT element, and class 1 integron. In all V. cholerae strains, antimicrobial resistance genes were detected based on the assembled sequences using the ResFinder tool (version 2.1, 80% threshold for %ID/ 60% minimum length) available from CGE [34]. The SXT element, class 1 integron, and presence of mutation in the DNA gyrase gene (gyrA), and the DNA topoisomerase IV genes (parC and parE) were determined using MyDbFinder as previously described [32]. The nucleotide sequence of integrase gene of the SXT element (int SXT ), the class 1 integron (intI), gyrA, parC, and parE genes of the quinolone-resistant V. cholerae strains from GenBank were used as references (Table C in S1 File). ICEVcHai1 (JN648379) and dfrA18 gene of SXT MO10 (AY034138) were used as templates in MyDBFinder (threshold, 95% identity) to determine which V. cholerae strains contained an int SXT gene.
Multilocus sequence type. The assembled sequences were analyzed to identify the MLST, sequence type (ST) for V. cholerae using the MLST tool (version 1.7) available from CGE [35]. The seven housekeeping genes: adk, gyrB, metE, mdh, pntA, purM, and pyrC as previously described by Octavia et al. (2013) [36], were extracted from 78 V. cholerae genomes in this study and 6 V. cholerae genomes from the NCBI database (M66-2, O395, N16961, MO45, MS6, 2010EL-1786). Concatenation of the housekeeping gene sequences was performed with an in-house python script. A multiple alignment was created from the concatenated sequences using MUSCLE via MEGA5 [37]. The final phylogenetic MLST tree was constructed by MEGA5 using the maximum likelihood method of 1,000 bootstrap replicates using Tamura-Nei model for inference [38]. Figtree (http://tree.bio.ed.ac.uk/software/figtree/) was used to visualize the tree. The confidence of the nodes in the tree is estimated by bootstrap values, calculated by sampling with replacements from the multiple sequence alignment. New STs were confirmed by PCR as previously described Octavia et al. (2013) [36].
Genomic islands in the chromosomes of V. cholerae. Variation of genomic islands including CTX, VPI-1, VPI-2, VSP-1, VSP-2, and the super-integron were visualized and determined based on chromosome I and II of the reference genome V. cholerae N16961 (accession no. AE003852 and AE003853) using a BLAST atlas. All protein sequences from the reference genome were aligned against other V. cholerae genomes using BLASTP. The presence and absence of genes were visualized in a circle, with greater similarity represented by higher intensity of color [39].  Table D in S1 File).

Characterization of V. cholerae strains
The biotype classification of the 44 V. cholerae O1 strains revealed 15 strains determined as being typical El Tor similar to the phenotype of El Tor strain N16961 (CCA + HSE + PB r VP + ). The 15 strains all carried according to MyDbFinder identical genes; ctxB, rstR, and tcpA with the exception of three environmental strains (TC22, MK14, and 4T5) and one clinical strain (TC183). Two strains (VC O1-8 and VC O1-10) belonged to the biotype Classical, exhibiting the phenotype CCA -HSE -PB s VPand genotypically similar to O395 strain (Classical). Furthermore, 26 V. cholerae O1 strains tested phenotypically El Tor but revealed using MyDbFinder mixed Classical and El Tor genotypes and determined as an El Tor variant. Finally, one V. cholerae O1 strain (MK14) expressed phenotypically both biotypes (CCA + HSE + PB s VP + ) and was determined as belonging to the hybrid biotype (Fig 1, Table D in S1 File).
The MLST types of the 78 V. cholerae and 6 reference genomes were analyzed and assigned to 26 different STs (Fig 1). The analysis showed that 50 strains were represented by ST69, making this the most common ST and all 50 of these strains related to clinical strains. Among clinical strains, 38 O1 El Tor and 12 serogroup O139 belonged to the same cluster with the pandemic strains (N16961 and MO45) and the Haitian strain (2010EL1786). The strains harbored the 7 th pandemic-specific gene (VC2346) according to MyDbFinder, suggesting that they belong to the same clonal linage. The cluster is also linked to the pre-6 th pandemic strain (M66-2) and the endemic strain from Thailand (MS6), which was closely related to the cluster of the O1 Classical strains (ST73) including the strains related to the 6 th pandemic (Table E in S1 File). All of the non-O1/non-O139 strains and the environmental strains, except for four O139 strains belonging to ST187, were assigned to different novel STs, suggesting a high degree of genetic diversity.
The whole genome sequence of the strains harboring the SXT element revealed a structure organized similar to ICEVchHai1 and SXT MO10 in the GenBank (Fig 3). Most strains except for 4053024303, 4053024306, and 22138 shared the similar structures of SXT element with common known deletions in loci VC1786ICE6 and VC1786ICE14. The variations in the SXT structures separated the individual serogroup O1, O139, and non-O1/non-O139 into distinct branches of the phylogenetic tree (Fig 3). The SXT elements of O1 strains were divided into two clades (GI and GII). The SXT structure of GI was highly similar to the structure of ICEVchHai1. Nineteen loci including dfrA18 and floR were absent in GII. The SXT structures among the O139 strains harbored loci similar to SXT MO10 and ICEVchHai1 but lacked 25 loci including dfrA1. For non-O1/non-O139 strains, four strains harbored the SXT element and their SXT structures were similar to those of O139 strains. Only one resistant strain, VHS1-22I, harbored floR, strA, strB, and sul2 genes. Two susceptible strains and one NAL-resistant strain did not contain these antimicrobial resistance genes including dfrA18 and dfrA1.

Discussion
Since 1982, V. cholerae has been present and emerging in Thailand [40]. In the last decade, sporadic cholera cases have been observed in Thailand caused primarily by V. cholerae O1 and O139. In this study, we found that the phenotypic results characterizing V. cholerae were all in concordance with the in silico genotypic data revealed by WGS targeting the following genes: ompW, rfbV, wbfZ, ctxB, rstR, and tcpA. These genes have previously been used to classify V. cholerae strains [27, 30, [41][42][43]. The tested strains were classified into serogroups O1, O139, and non-O1/non-O139 showing that both V. cholerae serogroup O1 and O139 are present in Thailand and have potentially caused cholera.
In Thailand, several studies have reported the emergence of V. cholerae however, the biotype V. cholerae O1 classical has not been detected since 1986 [27, [44][45][46]. Interestingly, this Year:   study revealed that two strains obtained from stool samples in 2000 were identified as the Classical biotype and were genetically similar to the strains related to the 6 th cholera pandemic (Table E in S1 File). This indicated that the Classical biotype might have re-emerged, causing cholera outbreaks in Thailand after having been absent for several years during the 6 th cholera pandemic. The decline of typical El Tor strains coincided with the first reports from Bangladesh of the emergence of the El Tor variant strain [16]. Furthermore, the El Tor variants possessing both the Classical and El Tor biotypes were recovered from clinical strains during 2004-2010. Detection of the El Tor variant was previously reported in Cameroon, India, and Thailand [18,32,41]. The variant of the Classical and El Tor biotypes increases the severity of the disease and may result in higher morbidity and mortality [47,48]. Kim et al. suggested that the El Tor variant possessing the Classical biotype originated through recombination between the Classical and El Tor types of CTXϕ [49]. One hybrid strain of this study, MK14, originating from a river water sample, lacked the biotype-specific genes as well as the main virulence genes (ctxAB and tcpA), suggesting it to be a non-toxigenic strain and in agreement with previous reports [50,51]. The non-toxigenic strains, however, have been responsible for causing mild to moderate diarrhea in human volunteers in clinical trials [2]. These El Tor variant strains clustered together with the clinical strains including typical El Tor biotype and O139 serogroup. Moreover, the in silico MLST analysis showed that the clinical strains had a highly genetic relationship with the pandemic and outbreak strains. The majority of the clinical strains O1 and O139 belonged to ST69 and showed genetic similarity to the 7 th pandemic strain (N16961), the Haitian outbreak strain (2010EL-1786), and the Cameroon outbreak strains [32]. In addition, all of the clinical strains harbor the specific gene marker of the 7 th pandemic clone. These findings suggest that the clinical strains  in Thailand might originate from a common ancestor of the 7 th pandemic strain. The STs of the clinical strains showed that they were closely related to the pre 6 th pandemic strain (M66-2) and a previous outbreak strain in Thailand (MS6) [52]. The clinical strains of O1 and O139 were highly conserved with regard to MLST (ST69) but contained different virulence genes, particularly ctxAB and tcpA. These findings have previously been reported and might be a result of horizontal gene transfer [36,53,54]. The in silico MLST analysis clearly showed discrimination amongst the different sources (clinical and environmental) and serogroups O1 and O139 as compared with non-O1/non-O139 strains. The clinical strains of O1 and O139 were highly conserved with regard to MLST (ST69), while the environmental strains of O1, O139, and non-O1/non-O139 and the clinical strains of non-O1/non-O139 revealed different and novel STs. This indicates that the environmental strains including non-O1/non-O139 were highly diverse; however, these results might be caused by gene recombination and/or mutation [36]. Furthermore, the environmental strains could be distinguished from the clinical strains using in silico MLST based on the difference in the virulence gene profiles. The environmental strains of O1 and O139 lacked the CTXϕ and tcpA genes, especially. However, these strains harbored other virulence genes similar to non-O1/non-O139. Both chxA and TTSS genes were frequently found among non-O1/non-O139 pathogenic strains and associated with diarrhea [36,51,55]. However, the environmental O1 strains in this study harbored chxA gene (TC22) and TTSS (MK14), indicating virulence potential to cause disease.

1983-1990
Our study showed that the antimicrobial resistance profiles SMX-TMP and NAL-SMX-TMP were predominant among the clinical strains of serogroup O139 and O1, respectively. In addition, other clinical strains exhibited resistance to TET, AZM, and AMP in contrast to the environmental strains which were mostly resistant to NAL followed by AMP, TMP, and TET. Previous reports have described different antimicrobial resistant profiles compared with those from Thailand, such as resistance to furazolidone, NAL, sulfisoxazole, streptomycin, and trimethoprim/sulfamethoxazole in Haiti [56] as well as TET, streptomycin, sulfisoxazole and trimethoprim in China [57]. During 2003-2011, V. cholerae O1 has been reported as being resistant to erythromycin, TET, trimethoprim/sulfamethoxazole, and AMP in Thailand [27].
Our study showed a similar concordance between the antimicrobial susceptibility testing data and the in silico-detected corresponding resistance genes in the V. cholerae strains using the ResFinder bioinformatics tool [34]. A few disagreements were observed and confirmed by re-testing the MIC determination. These discrepancies related to TET-resistant strains in which no conferring resistance genes or other resistance mechanisms could be detected. This phenomenon is well-known and has previously been reported related to potential efflux pumps [58]. In contrast, we observed some strains that harbored both floR and catB9 but displayed a susceptible phenotypic resistance profile. This observation has also been described in a recent publication describing the cholera in Haiti [56]. Similarly, susceptible non-O1/non-O139 strains harboring the qnrVC5 gene did not express resistance to quinolone. Normally, one would anticipate isolates that harbor the genes floR and catB9 would be associated with reduced susceptibility to CHL [59] and those that harbor the gene qnrVC5 would be associated with quinolone resistance. These abnormalities are most likely linked to incorrect interpretative criterion.
According to World Health Organization (WHO) recommendations, TET and CIP are the drugs of choice for the treatment of cholera. Unfortunately, there is a lack of prudent usage in Thailand because these antimicrobials are being misused/overused in the agricultural section [60]. During 2003-2011, the endemic cholera strains in Thailand were resistant to TET, whereas cholera was still susceptible to CIP as proven by Chomvarin et al., 2013 [27] and in this study. Amino acid substitutions in gyrA and parC are the main mechanism responsible for quinolone resistance in V. cholerae [56,58,61]. In this study, the same point mutations in gyrA (S83I) and parC (S85L) were detected among NAL-resistant strains found in both clinical and environmental sources.
The SXT element is an ICE that translocates a panel of antimicrobial resistance genes via horizontal gene transfer [62]. The first SXT, SXT MO10 , was discovered in V. cholerae O139 strain MO10. It harbored resistant determinants to trimethoprim (dfrA18), streptomycin (strA, strB), sulfamethoxazole (sul2), and chloramphenicol (floR) [63]. Other ICEs identified in O1 and non-O1/non-O139 harbor a similar set of resistance genes as the SXT MO10 strain [28,64]. Recently, WGS has been used to identify a variant of SXT in a Haitian O1 strain, ICEVch-Hai1 harboring dfrA1, strA, strB, sul2, and floR [56]. We analyzed the genetic variation in SXT elements by comparing with gene loci in ICEVchHai1 and dfrA18 in SXT MO10 . ICEVchHai1 has previously been used as the reference for comparison with the SXT element in India [64]. In this study, we found that the SXT in each of the different serogroups O1, O139, and non-O1/non-O139 were distinctly different. The SXT structures of the O1 strains showed a higher genetic similarity with ICEVchHai1 than the SXT structures of O139 and non-O1/nonO139 strains. This indicated that the acquired SXT element in the O1 Thai strains were similar to those of the Haitian and Indian strains. These findings are consistent with a previous study that showed identity of SXT within the same serogroup of V. cholerae [28].
In this study, we found that the re-occurrence of classical toxigenic strains have been persisted since 2000 in Thailand. The variation of phenotypic and genotypic characteristics shows that the V. cholerae O1 biotype El Tor variant has caused the majority of the outbreaks since 2004. The V. cholerae O1 and O139 obtained from clinical source commonly harboured CTXϕ and tcpA. Conversely, their environmental strains lacking those virulence genes could be detected. Moreover, the occurrence of SXT element and resistance genes conferring antimicrobial resistance was encountered among Thai strains. These findings suggest that lysogenicity of V. cholerae O1 for CTXϕ and other genetic markers including resistance genes should be further intensively surveillance and control. Future application of WGS combined with bioinformatic tools, such as MLST [35], MyDbFinder, ResFinder [34], and VcTypeFinder (in development), have in this study proven the power and are highly discriminatory methods in understanding the epidemiology of V. cholerae.

Conclusions
In this study, we used WGS and bioinformatic tools to analyze both historical and contemporary V. cholerae circulating in Thailand. To our knowledge, this is the first time since 1986 that the presence of V. cholerae O1 classical has been reported causing cholera outbreaks in Thailand. We found that the majority of the pathogenic strains belonged to V. cholerae O1 El Tor variant and O139. In silico analysis showed that the clinical strains shared common genetic background as well as harbored virulence genes, PAIs and mobile genetic elements associated with antimicrobial resistance while environmental strains were highly diverse. This study contributed to understanding the epidemiology of V. cholerae in Thailand that ultimately can be applied for control measures and management of the disease in Thailand.