Genome-based characterization of Escherichia coli causing bloodstream infection through next-generation sequencing

Escherichia coli is one of the most common bacteria causing bloodstream infection (BSI). The aim of this research was to identify the serotypes, MLST (multilocus sequence type), virulence genes, and antimicrobial resistance (AMR) genes of E. coli isolated from hospitalized patients with bloodstream infection at Cipto Mangunkusumo National Hospital, Jakarta. We used whole genome sequencing rather than conventional methods to characterize these features. The composition of E. coli sequence types (ST) was as follows: ST131 (n = 5), ST38 (n = 3), ST405 (n = 3), ST69 (n = 3), and other STs (ST1057, ST127, ST167, ST3033, ST349, ST40, ST58, ST6630). Enteroaggregative E. coli (EAEC) and extra-intestinal pathogenic E. coli (ExPEC) were the dominant groups in our samples. Twenty isolates carried virulence genes for host cell adherence and 15 carried genes that promote E. coli immune evasion by enhancing survival in serum. ESBL genes were present in 17 E. coli isolates. Other AMR genes encoded resistance to aminoglycosides, quinolones, chloramphenicol, macrolides, and trimethoprim. Phylogenetic analysis showed that phylogroup D was dominant, followed by phylogroup B2. The E. coli isolated from 22 patients in Cipto Mangunkusumo National Hospital, Jakarta, showed high diversity in serotypes, sequence types, virulence genes, and AMR genes. Based on this finding, routine screening of all bacterial isolates in health care facilities can provide clinically significant information. Whole genome sequencing for laboratory-based surveillance can be a valuable early warning system for emerging pathogens and resistance mechanisms.


Introduction
The prevalence of infections caused by multidrug-resistant (MDR) bacteria is currently increasing worldwide. Gram-negative bacteria have become a critical threat to global public health because they are associated with high mortality and morbidity, driven by the lack of effective antibiotics [1][2][3]. The production of extended-spectrum beta-lactamase (ESBL) has become the most significant determinant of AMR and spreads rapidly among Enterobacteriaceae [4,5].
Escherichia coli naturally inhabits the human gastrointestinal tract (GIT) and is classified as a Gram-negative commensal bacterium. Two main classes of pathogenic E. coli are recognized, diarrheagenic E. coli (DEC) and extra-intestinal pathogenic E. coli (ExPEC), which differ in their associated clinical syndromes and virulence genes. ExPEC can cause a variety of infections, including sepsis, neonatal meningitis, and urinary tract infections (UTI). The urinary tract is the site most commonly infected by these bacteria and is, in turn, a common source of bloodstream infections [6]. Through horizontal gene transfer and other mechanisms, commensal E. coli regularly acquires the pathogenicity, virulence, and multidrug resistance characteristics of pathogenic E. coli. Virulent E. coli strains share pathogenicity, virulence, and resistance genes with avirulent or less virulent strains, allowing pathogenesis to emerge beyond their natural characteristics [7].
Six well-recognized pathotypes of DEC, namely enterotoxigenic E. coli [ETEC], enteropathogenic E. coli [EPEC], enteroaggregative E. coli [EAEC], Shiga toxin (Stx)-producing E. coli [STEC], enteroinvasive E. coli [EIEC], and diffusely adherent E. coli [DAEC], are identified as the leading etiological agents of childhood and travellers' diarrhoea. EAEC has long been considered an intestinal pathogen and therefore unlikely to cause disease outside the intestinal tract in otherwise healthy patients [8]. However, recent studies have linked EAEC strains to fatal haemolytic uremic syndrome outbreaks and urinary tract infection [8,9].
Whole genome sequencing (WGS), as a sophisticated molecular diagnostic method, has revealed the emergence of E. coli strains that cause fatal diarrhoea due to particular combinations of virulence genes [10]. Virulence genes are important for understanding the pathogenic properties of the causative organisms. A virulence gene encodes a factor that helps the organism establish itself on or within its host and increases its ability to cause disease effectively and efficiently. Virulence genes in certain E. coli strains have been associated with serious outbreaks in Denmark in 2014-2015 [11]; in Italy in 2016-2017, with a 30-day mortality rate of 7.1% [12]; in China in 1999, with 177 deaths [13]; and in Germany in 2011, with 54 deaths [14]. Molecular genomic studies of human E. coli infections in Indonesia, especially in Jakarta, are scarce, and no report is available regarding the specific subtypes causing infections in Jakarta. The purpose of this study was to characterize the serotypes, MLST (multilocus sequence type), virulence genes, and antimicrobial resistance of E. coli isolated from bloodstream infection patients in Cipto Mangunkusumo National Hospital, Jakarta.
In this research, the 24 E. coli genomes were analyzed and compared with 26 diverse E. coli genomes (S1 Table) using the PhaME pipeline [22], which inferred a maximum likelihood phylogeny with RAxML v7.2.8 using the GTR model of nucleotide substitution, the GAMMA model of rate heterogeneity, and 100 bootstrap replicates. The phylogeny was midpoint-rooted and visualized using the interactive Tree of Life software (iTOL v.3).
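The tree-building settings described above can be illustrated as a standalone RAxML call. This is a minimal sketch, not the invocation PhaME issues internally: the binary name `raxmlHPC`, the alignment file `core_snps.phy`, the run name, the seed, and the choice of the `-f a` mode (rapid bootstrapping combined with a maximum likelihood search) are all assumptions made for illustration. Only the GTR+GAMMA model and the 100 bootstrap replicates come from the text.

```python
import shlex


def build_raxml_command(alignment_path, run_name, bootstraps=100, seed=12345):
    """Assemble a RAxML command line for an ML tree with bootstrap support.

    The command is only built, not executed, so the sketch runs without
    RAxML installed.
    """
    return [
        "raxmlHPC",               # assumed binary name; varies by build
        "-f", "a",                # assumed mode: rapid bootstrap + best ML tree
        "-m", "GTRGAMMA",         # GTR substitution model with GAMMA rate heterogeneity
        "-p", str(seed),          # random seed for the parsimony starting trees
        "-x", str(seed),          # random seed for rapid bootstrapping
        "-#", str(bootstraps),    # number of bootstrap replicates
        "-s", alignment_path,     # input alignment (hypothetical file name)
        "-n", run_name,           # suffix used for the output files
    ]


cmd = build_raxml_command("core_snps.phy", "ecoli_bsi")
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` once the paths point at real files; midpoint rooting and drawing would then be done separately, for example in iTOL as described above.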

Distribution of virulence genes
Isolates that harboured the virulence genes of EAEC were classified as typical EAEC by genomic criteria. The EAEC isolates were highly heterogeneous, carrying a substantial number of genes typically associated with other E. coli pathotypes, i.e. extraintestinal pathogenic E. coli (ExPEC) and EPEC. ExPEC genes found among the EAEC strains included i) the increased serum survival-encoding gene, iss (63.6%); ii) secreted autotransporter toxin, sat (18.2%); iii) plasmid-encoded enterotoxin, senB (9.1%); iv) temperature-sensitive hemagglutinin, tsh (9.1%); and v) salmochelin receptor, iroN (18.2%). LpfA was the only virulence gene to differ in frequency between the phylogenetic groups, as it was found in all B1 isolates as well as in isolates of phylogroups B2, F, and D.

PLOS ONE
Characterization of serotype, virulence and antimicrobial resistance genes of E. coli through WGS
These results are consistent with results from other regions that described the distribution and prevalence of these endemic clones in health facilities [30][31][32]. The variety of clones discovered in the hospital may suggest sporadic introductions of various strains into the hospital from the community. To investigate whether isolates of identical STs were clonally related, we examined SNP differences among isolates and the phylogeny, which suggested the presence of multiple clones of E. coli in this setting. In general, the levels of antimicrobial resistance in the E. coli isolates were observed to be high. Based on the assessment of resistance genes, all 22 E. coli isolates carried genes encoding macrolide resistance, with mdf(A) being the most prevalent (present in 100% of the 22 isolates), followed by mph(A), ere(B), erm(B), and lnu(F). The mdfA gene encodes a putative membrane protein (MdfA) of 410 amino acid residues that belongs to the major facilitator superfamily of transport proteins. Cells that express mdfA from multicopy plasmids are considerably more resistant to various groups of zwitterionic lipophilic or cationic compounds such as benzalkonium, rhodamine, tetracycline, daunomycin, ethidium bromide, puromycin, rifampin, and tetraphenylphosphonium. In addition, mdfA confers resistance to chemically unrelated, clinically essential antibiotics such as fluoroquinolones, erythromycin, chloramphenicol, and aminoglycosides [33]. The mph(A) gene, which confers resistance to macrolides including azithromycin and erythromycin, was also found at a high frequency. The prevalence of mph(A) in E. coli isolates in the current research was 40.9%, which is higher than the 13% identified in E. coli from 5 countries on 4 continents by Nguyen et al. [34] and lower than the 50% found in E. coli from a tertiary care hospital in Moshi, Tanzania, by Sonda et al. [7].
High exposure to azithromycin and erythromycin may be one plausible reason for the emergence of resistance to macrolides [35].
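The prevalence comparison for mph(A) is simple arithmetic and can be reproduced as follows. This is a minimal sketch: the carrier count of 9 is an assumption inferred from the reported 40.9% of 22 isolates (it is not stated directly), and the helper name `prevalence_percent` is hypothetical. The comparison values of 13% and 50% are the figures quoted from Nguyen et al. [34] and Sonda et al. [7].

```python
def prevalence_percent(carriers, total):
    """Return the share of isolates carrying a gene, in percent (1 decimal)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * carriers / total, 1)


TOTAL_ISOLATES = 22   # number of isolates reported in this study
MPHA_CARRIERS = 9     # assumption: inferred from 40.9% of 22 isolates

ours = prevalence_percent(MPHA_CARRIERS, TOTAL_ISOLATES)

# Prevalence values quoted in the text for the comparison studies.
reported = {"Nguyen et al. [34]": 13.0, "Sonda et al. [7]": 50.0}

for study, pct in reported.items():
    difference = round(ours - pct, 1)  # positive means higher in this study
    print(f"{study}: {pct}% vs {ours}% here (difference {difference} points)")
```

With only 22 isolates, a single additional carrier shifts the prevalence by about 4.5 percentage points, so differences of this size between studies should be read with caution.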
In this study, blaTEM-1B was the predominant gene encoding beta-lactamase, followed by blaCTX-M-15 and blaCTX-M-27. In contrast, Manyahi et al. [36] found blaCTX-M-15 to be the most predominant gene (90.6%) in a tertiary hospital in Dar es Salaam, and Sonda et al. [7] found blaOXA-1 to be the most predominant gene in a tertiary hospital in Moshi, Tanzania. In previous research, O25:H4-ST131 was reported to be a main cause of MDR E. coli infections [7,37,38]. The aac(6')Ib-cr gene by itself confers only low-level ciprofloxacin resistance (as well as aminoglycoside resistance), and further mutations (e.g. in the chromosomal gyrA or parC genes) are generally required for high-level resistance. Other studies in Korea [32], the US [39], and Brazil [40] have reported results consistent with this research regarding the co-carriage of aac(6')Ib-cr and ESBL genes and the presence of ciprofloxacin resistance in E. coli ST131. This plausibly explains the observed relationship between carriage of aac(6')Ib-cr and CTX-M ESBL and ciprofloxacin resistance [41].
This research also characterized virulence genes in all E. coli isolates. We found virulence genes specific to the ExPEC and EAEC groups. The adherence protein gene iha was dominant among the ExPEC adherence virulence genes. We also found several toxin genes that belong to the ExPEC group, such as cnf1, ireA, sat, vat, senB, pic, and tsh. Another prevalent group of virulence genes was responsible for E. coli immune evasion by increasing serum survival (iss). The iss gene is recognized for its role in ExPEC in enhancing bacterial survival in serum. The iss gene is located on plasmid ColV, a large virulence plasmid typical of avian pathogenic E. coli strains, which suggests that exchange of plasmids, and consequently of these virulence genes, is feasible between human and avian pathogenic E. coli strains [25].
EAEC-specific virulence genes present in our 12 isolates included aggregative adherence fimbria (AAF) variant I; AggR, a global regulator of EAEC virulence; the AggR-activated regulator, aar; dispersin, required for proper dispersal of AAFs on the bacterial surface; and the aat transporter system, which mediates dispersin secretion [42]. The aggR gene was found in the ST38 (H30) sample, and both aggR and aar were found in the ST69 (O15:H18) sample. Strains harbouring the aggR regulon or its constituents have been termed typical EAEC. We also found two samples harbouring the EAEC toxin gene astA, both of which belonged to ST38 (O86:H18). These virulence genes have been described as indicative of virulent serotypes and may be used as reliable markers for the detection of pathogenic E. coli [43].
A potential clinical implication of these results is that caution must be taken when interpreting and using microbiology results. A study by Havt et al. [44] showed that malnutrition in children aged 6-24 months was associated with the virulence genes aatA and aar. This research underscores that E. coli should not be considered non-pathogenic until determinants of pathogenicity and antimicrobial resistance have been verified to be absent. In addition, the WGS-based results suggested that nosocomial transmission occurs in the hospital, which encourages the formulation of pragmatic antimicrobial management, infection prevention, and control initiatives.
We acknowledge the limitations of this study. The assessment was carried out on a limited number of E. coli isolates, which may limit the generalizability of the results. Because of the limited number of isolates, a further limitation is the lack of deeper statistical analysis associating the resistance and virulence results of the isolates with patient characteristics such as room, age, ward, and gender. Also, the detection of resistance and virulence genes only indicates that these genes are present in the isolates.

Conclusions
The E. coli isolated from 22 patients in Cipto Mangunkusumo National Hospital, Jakarta, showed high diversity in serotypes, sequence types, virulence genes, and AMR genes. Based on this finding, routine screening of all bacterial isolates in health care facilities can provide clinically significant information. Whole genome sequencing for laboratory-based surveillance can be a valuable early warning system for emerging pathogens and resistance mechanisms.
Supporting information S1