Genomic characterization of endemic diarrheagenic Escherichia coli and Escherichia albertii from infants with diarrhea in Vietnam

Background Diarrheagenic Escherichia coli (DEC) is a group of bacterial pathogens that causes life-threatening diarrhea in children in developing countries. However, there is limited information on the characteristics of DEC isolated from patients in these countries. A detailed genomic analysis of 61 DEC-like isolates from infants with diarrhea was performed to clarify and share the characteristics of DEC prevalent in Vietnam. Principal findings The 61 isolates were classified into 57 DEC strains, comprising 33 enteroaggregative E. coli (EAEC) (54.1%), 20 enteropathogenic E. coli (EPEC) (32.8%), two enteroinvasive E. coli (EIEC) (3.3%), one enterotoxigenic E. coli (ETEC), and one ETEC/EIEC hybrid (1.6% each), and, surprisingly, four Escherichia albertii strains (6.6%). Furthermore, several epidemic DEC clones showed an uncommon combination of pathotypes and serotypes, such as EAEC Og130:Hg27, EAEC OgGp9:Hg18, EAEC OgX13:Hg27, EPEC OgGp7:Hg16, and E. albertii EAOg1:HgUT. Genomic analysis also revealed the presence of various genes and mutations associated with antibiotic resistance in many isolates. Strains with potential resistance to ciprofloxacin and ceftriaxone, the drugs recommended for treating childhood diarrhea, accounted for 65.6% and 41% of the isolates, respectively. Significance Our findings indicate that the routine use of these antibiotics has selected for resistant DEC, resulting in a situation where these drugs do not provide therapeutic effects for some patients. Bridging this gap requires continuous investigations and information sharing regarding the type and distribution of endemic DEC and E. albertii and their antibiotic resistance in different countries.


Introduction
Diarrhea is a leading cause of morbidity and mortality in children under five years of age in developing countries, especially in South Asia and Africa, where access to safe water, good nutrition, adequate sanitation, and proper healthcare is restricted [1]. Vietnam, located in Southeast Asia, has achieved rapid economic growth in recent years. The infrastructure, medical systems, and food production and marketing systems that contribute to improving public health are being developed. However, childhood diarrhea remains a public health problem in Vietnam [2]. The enteric pathogens that cause diarrhea include various viruses, bacteria, and parasites. Among them, diarrheagenic Escherichia coli (DEC) is a group of bacterial pathogens that causes a wide variety of intestinal diseases. DEC is generally classified into at least five pathotypes, including Shiga toxin-producing E. coli (STEC), enteropathogenic E. coli (EPEC), enterotoxigenic E. coli (ETEC), enteroaggregative E. coli (EAEC), and enteroinvasive E. coli (EIEC), based on specific virulence markers [3]. Among the DEC pathotypes, EAEC, ETEC, and EPEC are generally associated with childhood diarrhea in developing countries [4,5].
Several studies of DEC in Vietnamese children have been conducted so far [6][7][8]. Nguyen et al. [6] reported that one or more marker genes of DEC were detected in 22.5% of fecal samples (132/587) collected from young children with diarrhea from 2001 to 2002. Meanwhile, Hien et al. [7] isolated DEC strains from 25.7% (64/249) of stool samples of diarrhea patients who were less than five years of age from 2001 to 2002 and classified them into pathotypes. Recently, Duong et al. [8] developed new multiplex real-time polymerase chain reactions (PCRs), which revealed that 34.7% and 41.2% of stool samples from children with and without diarrhea from 2014 to 2016, respectively, were positive for the DEC pathotypes. Thus, various DEC strains possibly associated with diarrhea are widespread among Vietnamese children. However, there is little information about the characteristics of these DEC strains that would be useful for future epidemiological research and infection control.
Genome analysis of DEC reveals the overall picture of prevalence and the repertoires of virulence-related and antibiotic-resistance genes. It can also clarify the phylogenetic relationships between strains, and these can then be used for public health investigations. We performed a detailed genomic analysis of 61 DEC-like isolates from children with diarrhea in Northern Vietnam to clarify and share the characteristics of the DEC strains prevalent in Vietnam.

Ethics statement
Research approval was obtained from the Ethical Committee of the Institute of Tropical Medicine, Nagasaki University, in Japan (approval number: 150917144) and the Institutional Review Board of the National Institute of Hygiene and Epidemiology (NIHE) in Vietnam (approval number: IRB-VN1059-19). Written informed consent was obtained from the parents of all participating children.

Isolation of DEC
Nine-hundred and ninety fecal samples were obtained from children less than five years old with diarrhea who visited the Nam Dinh Children's Hospital in Nam Dinh province, Northern Vietnam, from 2012 to 2015. Diarrhea duration before the examination was � 4 days, and the diarrhea frequency ranged from 1 to 3 times a day. Symptoms included watery diarrhea, and no patient had bloody stools. The samples were inoculated onto MacConkey agar plates and incubated for 18 to 24 hours at 35˚C. Single colonies (1 to 5 colonies in each sample) grown on the plate were screened via PCR, targeting seven E. coli pathotype marker genes; stx1 and stx2 (encoding Shiga toxin 1 and Shiga toxin 2, respectively) for STEC [9], eae (encoding intimin) for EPEC [10], elt and est (encoding heat-labile enterotoxin and heat-stable enterotoxin, respectively) for ETEC [11], aggR (encoding a transcriptional activator of several EAEC virulence genes) for EAEC [12], and ipaH (encoding invasion plasmid antigen H) for EIEC [13]. When multiple strains from the same sample possessed the same marker gene(s), one positive strain was randomly selected, and stored at -80˚C. The stored strains were re-cultured in LB broth for genome analysis, and the presence or absence of the corresponding genes was reconfirmed using the same PCR method.
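The marker-gene screen described above amounts to a lookup from PCR-positive genes to a presumptive pathotype. The following is a minimal sketch of that logic, not the authors' procedure: the function name `presumptive_pathotype`, the "A/B" label for isolates positive for markers of two pathotypes, and the "non-DEC" fallback are all assumptions made for illustration.

```python
# Marker genes and the pathotype each one indicates, as listed in the text.
MARKER_TO_PATHOTYPE = {
    "stx1": "STEC",
    "stx2": "STEC",
    "eae": "EPEC",
    "elt": "ETEC",
    "est": "ETEC",
    "aggR": "EAEC",
    "ipaH": "EIEC",
}


def presumptive_pathotype(positive_markers):
    """Return a presumptive pathotype label from PCR-positive marker genes.

    Hypothetical helper: the "A/B" hybrid label and the "non-DEC" fallback
    are illustrative conventions, not rules stated in the paper.
    """
    pathotypes = sorted({MARKER_TO_PATHOTYPE[m] for m in positive_markers})
    if not pathotypes:
        return "non-DEC"
    return "/".join(pathotypes)


print(presumptive_pathotype(["aggR"]))         # EAEC
print(presumptive_pathotype(["elt", "est"]))   # ETEC
print(presumptive_pathotype(["est", "ipaH"]))  # EIEC/ETEC
```

A label produced this way is only presumptive. In this study, four eae-positive isolates initially regarded as EPEC were identified as E. albertii once their genomes were analyzed.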

Genome sequencing, assembly, and annotation
Draft genome sequences were determined using a MiSeq sequencer (Illumina, San Diego, CA, USA). Illumina short-read libraries were prepared from 100 ng of extracted DNA using the Nextera DNA Library Prep Kit, and paired-end reads were generated using the MiSeq Reagent Kit (v3-600) and MiSeq (Illumina). Raw reads were trimmed using Platanus_trim [14] with default parameters, and genome assembly was performed using the Platanus_B assembler ver. 1.2.2 [14]. Annotation was carried out using Prokka ver. 1.13 with the default settings [15]. The sequence data have been deposited in NCBI under BioProject accession number PRJDB14289.

Statistical analysis
Fisher's exact test was used to compare two groups and was performed with EZR (ver. 1.55) and the R statistical software package (ver. 4.1.3) [36]. A P-value < 0.05 was considered statistically significant.
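To make the test concrete, the sketch below computes a two-sided Fisher's exact P-value for a 2x2 table using only the Python standard library. It is an illustration, not the EZR/R workflow used in the study; the function name `fisher_exact_two_sided` and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x, with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))


# Hypothetical counts: resistant vs. susceptible strains in two groups.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p:.4f}")  # p = 0.4857
```

With the threshold stated above, this hypothetical table would not be considered significant (P > 0.05).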

DEC strains
DEC was isolated from 122 of the 990 samples (12.3%), including 64 aggR-positive (6.5%), 8 ipaH-positive (0.8%), 47 eae-positive (4.7%), 2 est-positive (0.2%), and 1 elt-positive (0.1%) strains. Neither stx1- nor stx2-positive strains were detected in any of the samples. No strains harboring different marker genes were isolated from the same sample. Of the 122 strains, 8, 65, 38, and 11 were isolated in 2012, 2013, 2014, and 2015, respectively (S1 Table). The strains, which had been stored at -80°C for a long period after isolation, were re-cultured, and the virulence genes were reconfirmed via PCR. Unfortunately, 29 strains failed to grow in an appropriate medium, and 16 strains no longer carried the virulence genes identified at the time of isolation (S1 Table). When the genome sequences of the remaining 77 strains were determined, the data for 16 strains were of insufficient quality for genome analysis (S1 Table). Finally, after excluding these strains, the draft genomes of 61 strains from 61 patients (1 to 55 months old) were used for detailed genome analysis (S1 and S2 Tables).
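The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the variable names are chosen for illustration.

```python
SAMPLES = 990

# PCR-positive strains per marker gene, as reported in the text.
marker_counts = {"aggR": 64, "ipaH": 8, "eae": 47, "est": 2, "elt": 1}

total_dec = sum(marker_counts.values())
print(f"DEC-positive: {total_dec}/{SAMPLES} ({100 * total_dec / SAMPLES:.1f}%)")
for gene, count in marker_counts.items():
    print(f"  {gene}: {count} ({100 * count / SAMPLES:.1f}%)")

# Attrition between isolation and genome analysis.
no_growth = 29    # did not grow on re-culture
marker_lost = 16  # marker gene not reconfirmed by PCR
low_quality = 16  # draft genome of insufficient quality

sequenced = total_dec - no_growth - marker_lost
analyzed = sequenced - low_quality
print(f"sequenced: {sequenced}, analyzed: {analyzed}")  # sequenced: 77, analyzed: 61
```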

Other virulence-related genes
In addition to the major virulence markers, other virulence-related genes were identified in the genomes. Among some representative genes (Fig 1), hlyA, encoding α-hemolysin, and invX, related to invasion, were carried by two strains and one strain, respectively. The astA gene, encoding enteroaggregative stable toxin 1 (EAST1), and set1A, encoding Shigella enterotoxin 1 (ShET1), were found in 32.8% (n = 20) and 45.9% (n = 28) of the strains, respectively. The cdtB gene, encoding cytolethal distending toxin (Cdt), was identified only in the four E. albertii strains. The bfp gene, a marker of typical EPEC, was not found in any strain.

Discussion
This study has clarified the prevalence and repertoires of virulence-related and antibiotic-resistance genes, as well as the phylogenetic features, of DEC and E. albertii strains isolated from infants with diarrhea in Vietnam. Nguyen et al. [6] investigated the prevalence of DEC in Vietnam using 587 fecal samples from children with diarrhea: EAEC and EPEC were detected in 11.6% and 6.6% of the samples, respectively, whereas EIEC, ETEC, and STEC were detected in only 2%, 2.2%, and 0%, respectively. Our study also showed that EAEC (6.5%) and EPEC (4.7%) were the main pathotypes, suggesting that EAEC and EPEC are mainly involved in common infant diarrhea in Vietnam. Among the various types of DEC identified, the presence of several epidemic clones was confirmed. EAEC Og130:Hg27 (ST31, n = 10) was the most abundant type, followed by EAEC OgGp9:Hg18 (ST414, n = 4), EAEC OgX13:Hg27 (ST3570, n = 4), EPEC OgGp7:Hg16 (ST10, n = 4), EPEC Og51:Hg21 (ST40, n = 3), and eae-positive E. albertii EAOg1:HgUT (n = 3). All of these were isolated from different patients. Among the 503 EAEC strains registered in EnteroBase (https://enterobase.warwick.ac.uk/), 9 belonged to ST31 (isolated in the UK, Germany, Nigeria, Peru, Thailand, and Bangladesh), 2 to ST414 (UK), and 1 to ST3570 (Germany). Among the 218 registered EPEC strains, ST10 was confirmed in 8 strains (Norway, UK, Germany, and Brazil) and ST40 in 7 strains (Norway and USA). Although there have been some reports of EAEC O130:H27 being isolated from sporadic diarrheal cases in Thailand, Peru, and England [37][38][39], no epidemics due to EAEC O130:H27 have been reported worldwide. Because OX13 is an atypical O serogroup that does not fall under the defined E. coli O serogroups O1 to O188, little shareable information on the characteristics of OX13 strains has accumulated so far. Detailed genomic analysis of strains isolated from infants in Vietnam suggests that some DEC clones, which have not yet become a public health problem worldwide, may be prevalent in Vietnam.
E. albertii is an emerging diarrheagenic pathogen that was first isolated from the feces of infants with diarrhea in Bangladesh in 1991 [40]. As in EPEC, the basic pathogenicity of E. albertii is the formation of attaching and effacing (A/E) lesions mediated by the locus of enterocyte effacement (LEE). Initially, E. albertii was classified as Hafnia alvei, but in 2003 it was proposed as a new species, "E. albertii", belonging to the genus Escherichia, based on several genetic and phenotypic analyses [41]. Since then, E. albertii has been implicated in several diarrheal cases, including outbreaks [42][43][44][45]. The biochemical properties of E. albertii are very similar to those of E. coli except for a few properties, such as motility and the ability to ferment some sugars. Therefore, E. coli and E. albertii are sometimes indistinguishable by basic biochemical tests. Furthermore, because E. albertii usually carries eae, which is located on the LEE, it is presumed that E. albertii is often misidentified as EPEC. In this study as well, four eae-carrying strains that were thought to be EPEC were identified as E. albertii via genomic analysis. Like the DEC strains, they have also acquired resistance to various antimicrobial agents. These results suggest that E. albertii is one of the causes of infant diarrhea in Vietnam. This is the first report of E. albertii isolation from patients in Vietnam. Further information needs to be collected to understand the distribution and epidemiology of this emerging diarrheagenic pathogen.
The spread of ESBL-producing Enterobacteriaceae showing resistance to broad-spectrum beta-lactam antibiotics poses a threat worldwide [46]. In Vietnam, many investigations have reported the presence of ESBL-producing E. coli in animals, foods, the environment, and healthy humans [47][48][49][50][51][52]. Truong et al. [50] investigated E. coli strains isolated from workers and pigs at Vietnamese pig farms and confirmed that 74% (43/58) and 90% (78/87) of isolates from workers and pigs, respectively, were ESBL-producing E. coli. Nakayama et al. [49] investigated chickens in Vietnam and isolated ESBL-producing E. coli from 90% (54/60) of samples. Lien et al. [51] revealed that 43% (76/158) of E. coli strains isolated from hospital wastewater in Vietnam were identified as ESBL-producing. In contrast, there are no reports of ESBL-producing E. coli isolated from patients, except for a few extraintestinal infections [52]. In this study, we found that 41% of DEC from patients with diarrhea in Vietnam were potentially ESBL-producing and, in particular, that two blaCTX-M types contributing to ESBL production, CTX-M-27 and CTX-M-55, were widely distributed in DEC. Recently, Robert et al. [53] reported the genomic analysis of 721 E. coli strains isolated from patients and environments in ICUs at two hospitals in Hanoi, Vietnam. It is unknown whether the E. coli isolated from patient specimens, including stool, urine, and sputum, were pathogenic; however, 85%, 29%, and 48% of them harbored blaCTX, blaCMY, and blaTEM, respectively. The major types were CTX-M-15 (36%), CTX-M-27 (30%), and CTX-M-55 (17%), which is consistent with the findings of this study. However, the ICU strains belonged to 80 sequence types, including ST410, ST617, ST131, ST648, and ST1193, and no major overlap with the DEC strains used in this study was identified.
The widespread presence of antibiotic-resistant bacteria is a severe and growing public health issue. According to the pediatric diarrhea treatment guideline in Vietnam updated in 2016 (https://kcb.vn/), two antibacterial agents, ciprofloxacin (a second-generation fluoroquinolone) and ceftriaxone (a third-generation cephalosporin), are recommended for the treatment of infantile diarrhea. However, 65.6% (quinolone resistance) and 41% (ESBL production) of the DEC and E. albertii strains in this study were expected to be resistant to these antibiotics, respectively. The evolution of ciprofloxacin resistance in E. coli involves the accumulation of point mutations in gyrA and parC. By combining experimental data and mathematical modeling, Huseby et al. [54] showed that the first step in the evolution of clinical ciprofloxacin resistance was the selection of gyrA mutations and that S83L lay on one main trajectory leading to clinically relevant resistance. A strain carrying a single S83L mutation had a ciprofloxacin MIC of about 0.25 mg/L (10 to 20 times that of the wild-type strain), and additional point mutations at different sites in these two genes made it more resistant [54]. A large-scale genome analysis of Shigella sonnei showed that the sequential accumulation of mutations in parC and gyrA, including gyrA-S83L, led to the emergence of fluoroquinolone-resistant S. sonnei in approximately 2007 and that this population then spread worldwide [55]. Ciprofloxacin and ceftriaxone were already in use under the standard Vietnamese treatment guidelines for diarrhea patients before the guideline was updated in 2016 [56]. Most of the dominant DEC, including EAEC Og130:Hg27 and EPEC OgGp7:Hg16, and all E. albertii carried genes conferring resistance to these two antibiotics, suggesting that the routine use of antibiotics according to the guideline has facilitated the selection of resistant strains. The result is a situation in which these drugs do not provide therapeutic effects for some patients. Bridging this gap requires continuous investigations and information sharing regarding the type and distribution of antibiotic resistance in each pathogen, including DEC and E. albertii, from each region and country.
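Point mutations such as gyrA S83L are typically reported by comparing a strain's protein sequence with a wild-type reference at positions of interest. The sketch below illustrates that comparison only; it is not the resistance-prediction method used in this study. The function name `call_substitutions` is hypothetical, the sequences are toy strings rather than real GyrA, and position 87 is included merely as a second illustrative site.

```python
def call_substitutions(reference, query, positions):
    """List amino-acid substitutions at 1-based positions, e.g. 'S83L'.

    Assumes both protein sequences are already aligned without gaps,
    so that position i in one corresponds to position i in the other.
    """
    calls = []
    for pos in positions:
        ref_aa, query_aa = reference[pos - 1], query[pos - 1]
        if ref_aa != query_aa:
            calls.append(f"{ref_aa}{pos}{query_aa}")
    return calls


# Toy sequences, NOT real GyrA: only residues 83 and 87 are meaningful here.
toy_reference = "A" * 82 + "S" + "AAA" + "D" + "A" * 10
toy_mutant = "A" * 82 + "L" + "AAA" + "D" + "A" * 10

print(call_substitutions(toy_reference, toy_mutant, [83, 87]))  # ['S83L']
```

Real analyses would first align the query against a curated reference, since insertions or deletions shift residue numbering.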

Conclusion
In summary, this study reports the results of a genomic analysis of DEC and E. albertii isolated from infants with diarrhea in Vietnam. We found several DEC clones, including EAEC Og130:Hg27, EAEC OgGp9:Hg18, EAEC OgX13:Hg27, and EPEC OgGp7:Hg16, as well as E. albertii EAOg1:HgUT, circulating in the area. Most strains possessed several antibiotic resistance genes and mutations, suggesting that they have acquired the capacity for multidrug resistance. Furthermore, many were predicted to be resistant or potentially resistant to ciprofloxacin and ceftriaxone, which are widely recommended drugs for diarrhea treatment in children. The divergence between the distribution of antibiotic-resistant strains and the types of antibiotics recommended makes it challenging to improve the effectiveness of antibiotic treatment against infections. We believe that continuous investigations and information sharing on antibiotic resistance in pathogenic bacteria, including DEC, in Vietnam are necessary to improve the therapeutic effects of antibiotics in patients.
Supporting information S1