Escherichia coli ST131 clones harbouring AggR and AAF/V fimbriae causing bacteremia in Mozambican children: Emergence of new variant of fimH27 subclone.

Multidrug-resistant Escherichia coli ST131 fimH30 responsible for extra-intestinal pathogenic (ExPEC) infections is globally distributed. However, the occurrence of a fimH27 subclone of ST131 harboring both ExPEC and enteroaggregative E. coli (EAEC) related genes, belonging to the commonly reported O25:H4 and other serotypes, and causing bacteremia in African children remains unknown. We characterized 325 E. coli isolates causing bacteremia in Mozambican children between 2001 and 2014 by conventional multiplex polymerase chain reaction and whole genome sequencing. The incidence rate of EAEC bacteremia was calculated among cases from the demographic surveillance study area. Approximately 17.5% (57/325) of isolates were EAEC, yielding an incidence rate of 45.3 episodes per 100,000 child-years-at-risk among infants; 44 of these isolates were sequenced. Of the sequenced strains, 72.7% (32/44) simultaneously contained genes associated with ExPEC (iutA, fyuA and traT), and 88.6% (39/44) harbored the aggregative adherence fimbriae type V variant (AAF/V). Sequence type ST131 accounted for 84.1% (37/44), predominantly belonging to serotype O25:H4 (59% of the 37); 95.6% (35/44) harbored fimH27. Approximately 15% (6/41) of the children died, and five of the six (83.3%) yielded ST131 strains, mostly (60%; 3/5) of serotypes other than O25:H4. We report the emergence of a new subclone of ST131 E. coli strains belonging to O25:H4 and other serotypes and harboring both ExPEC and EAEC virulence genes, including agg5A, associated with poor outcome in bacteremic Mozambican children, suggesting the need for prompt recognition for appropriate management.

Introduction
Escherichia coli is a common cause of community and hospital-acquired bacterial infection, causing a wide range of clinical diseases and associated with high morbidity and mortality worldwide [1,2]. Two major groups of pathogenic E. coli, diarrheagenic E. coli (DEC) and extra-intestinal pathogenic E. coli (ExPEC), are recognized, differing in their virulence factors and associated clinical syndromes [2]. ExPEC strains are among the major causes of urinary tract infections (UTI) and/or hospital/community-acquired bacteremia in both industrialized [3,4] and low-income countries [4-6]. These strains differ from DEC or commensal E. coli strains with respect to their virulence factors, with the former requiring specific attributes to cause invasive disease, e.g. the ability to survive in serum, efficient iron uptake mechanisms and internalization by the host [7].
Enteroaggregative E. coli (EAEC) has long been regarded as an intestinal pathogen and therefore unlikely to cause disease outside the intestinal tract in otherwise healthy patients [17]. However, recent reports have linked EAEC strains with outbreaks of urinary tract infection [18] and fatal hemolytic uremic syndrome [19]. The role of EAEC in extra-intestinal infections and associated outcomes among African children admitted to hospitals remains unknown. Through our ongoing invasive bacterial disease surveillance, we previously documented E. coli as among the top five pathogens associated with community-acquired bacteremia in Mozambican children, with an associated case fatality ratio of nearly 10% [6]. Thus, assessing the molecular virulence markers of circulating E. coli strains is an important first step in understanding the molecular epidemiology of this entity in Mozambique, with the hope of informing appropriate control or prevention strategies. Herein, we aim to characterize the molecular epidemiology of E. coli causing childhood bacteremia in a rural hospital in Mozambique between 2001 and 2014.

Ethics statement
Clinical data were routinely collected from an ongoing morbidity surveillance system in Manhiça district health facilities, established as part of CISM's HDSS and approved by the Mozambican Ministry of Health. All residents of the district of Manhiça have signed an individual informed consent to become part of the ongoing HDSS established in the area.

Study population
The study was conducted by the "Centro de Investigação em Saúde de Manhiça (CISM)" at the Manhiça District Hospital (MDH), the main referral health facility for the Manhiça district, a rural area located 80 km north of Maputo, southern Mozambique. The district has an estimated population of 183,000 inhabitants, and in this area, CISM has been running a continuous health and demographic surveillance system (HDSS) since 1996, currently covering the entire district's population. A full description of the geographical and socio-demographic characteristics of the study community has been presented elsewhere [20]. Of importance, HIV sero-prevalence in the area is among the highest in the world (40% of the general adult population) [21]. CISM is adjacent to the MDH, and since 1997, the hospital and CISM have jointly operated a 24-hour surveillance of all pediatric (<15 years of age) visits to the outpatient department and admissions to the wards including surveillance of invasive bacterial disease as described previously [6].

Sample collection and laboratory procedures
As part of routine clinical practice at MDH, a single venous blood specimen for bacterial culture was systematically collected upon hospital admission from all children <2 years of age, and from children aged 2 to <15 years with axillary temperature ≥39.0 °C or with signs of severe illness as judged by the admitting clinician. Bacterial isolation was performed as described in detail elsewhere [6].

Detection of diarrheagenic E. coli pathotypes, phylogenetic groups and virulence factors
Three hundred and twenty-five frozen E. coli isolates recovered from blood cultures were retrieved, sub-cultured on MacConkey agar, and screened for the presence of EAEC, ETEC, EPEC, and STEC markers by multiplex polymerase chain reaction (PCR) [22]; phylogenetic groups were determined as described elsewhere [23]. Additionally, we tested the isolates for various ExPEC and DEC virulence genes by conventional multiplex PCRs, targeting 44 genes including those commonly prevalent in EAEC (such as aggR, aatA, aap, aaiC and the recently discovered aar gene) [24,25]. Positive samples were confirmed by sequencing five isolates for each gene of interest in the Sequencing Core Facility of the University of Virginia School of Medicine, Charlottesville, VA, USA.

Antimicrobial resistance and mechanisms of resistance
The antimicrobial susceptibility phenotype for ampicillin, amoxicillin-clavulanic acid, cefuroxime, ceftriaxone, cefotaxime, aztreonam, ertapenem, imipenem, meropenem, nalidixic acid, ciprofloxacin, chloramphenicol, amikacin, tobramycin, gentamicin, tetracycline and trimethoprim-sulfamethoxazole (SxT) was determined by a conventional disk diffusion method on Mueller Hinton agar [26] using commercially available disks (Oxoid, Basingstoke, Hampshire, UK). The interpretative category of resistance was determined according to the Clinical and Laboratory Standards Institute (CLSI) guidelines [27]. Multidrug resistance was defined as resistance to three or more unrelated antibiotic families, and we considered as non-susceptible those isolates with an intermediate or fully resistant profile [28].

Serotyping and whole genome sequencing (WGS)
Somatic (O) and flagellar (H) antigens were phenotypically identified using commercially available antisera as described elsewhere [29,30]. The following designations were included: "O rough" when the boiled culture auto-agglutinated, suggesting absence of O antigen; "O?" when it could not be determined whether the strain produces an O antigen (precipitation with Cetavlon indicates an acidic polysaccharide that could represent capsular K antigen); and "ONT" when the O antigen was found to be present but could not be typed. Serotyping was performed on all bloodstream isolates positive for EAEC markers and a subset of other E. coli bacteremic strains at the International Escherichia and Klebsiella Centre, Department of Bacteria, Parasites and Fungi, Statens Serum Institut (SSI), Copenhagen, Denmark. Additionally, serotypes were assessed by WGS and compared to conventional serotyping.
Forty-four EAEC isolates (all those available that were positive by PCR) and 22 randomly selected non-EAEC isolates (as controls) were sequenced using the Illumina MiSeq platform (Illumina, San Diego, CA, USA). Briefly, genomic DNA from isolates was purified using the Qiagen DNeasy Blood and Tissue kit (Qiagen, Valencia, USA) according to the kit protocol. Initial DNA concentrations were quantified using a Qubit Fluorometer and the dsDNA BR/HR Assay Kit (Thermo Fisher Scientific). Sample and library preparation were performed using the Nextera XT v2 DNA Library Preparation Kit (Illumina, San Diego, USA). Libraries were finally purified with the Agencourt AMPure XP System (Beckman Coulter, Indianapolis, USA). WGS data were pre-processed with a QC pipeline (available at https://github.com/ssidk/SerumQC), in which isolate sequences were removed if contaminated with more than 5% of another genus, as were sequences representing EAEC isolates with genome sizes outside the range of 4.64 Mbp-5.56 Mbp. Isolate sequences were also removed from the dataset if their assemblies comprised more than 350 contigs. De novo assemblies were carried out using CLC Genomics Workbench 10 with a minimum contig length of 200 bp. The genome size, N50, and number of contigs are presented in S1 Table. Sequence type, in silico serotype and virulence genes were determined from the de novo assembled genomes using the webtools available at https://cge.cbs.dtu.dk/services/ [31]. The least ambiguous phenotypic or in silico serotype was used in the final analysis: non-motile strains (H-) were given the in silico determined fliC H type, the in silico O type was used for O rough and O? strains, and the phenotypic O group was used if the in silico O type was ambiguous or non-typeable.
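The exclusion criteria and the serotype reconciliation rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SerumQC pipeline itself: the class, field and function names are hypothetical, an in silico call that is ambiguous or non-typeable is represented as None, and the choice to keep the phenotypic O group when both methods give a typed result is an assumption that the text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds quoted in the text; all names below are illustrative assumptions.
MAX_FOREIGN_GENUS_FRACTION = 0.05   # more than 5% of another genus -> removed
MIN_GENOME_BP = 4_640_000           # 4.64 Mbp
MAX_GENOME_BP = 5_560_000           # 5.56 Mbp
MAX_CONTIGS = 350
UNRESOLVED_PHENOTYPIC_O = {"O rough", "O?"}


@dataclass
class Assembly:
    isolate_id: str
    foreign_genus_fraction: float   # share of sequence assigned to another genus
    genome_size_bp: int
    n_contigs: int


def passes_qc(a: Assembly) -> bool:
    """Return True only if none of the three exclusion criteria applies."""
    return (
        a.foreign_genus_fraction <= MAX_FOREIGN_GENUS_FRACTION
        and MIN_GENOME_BP <= a.genome_size_bp <= MAX_GENOME_BP
        and a.n_contigs <= MAX_CONTIGS
    )


def resolve_serotype(pheno_o: str, pheno_h: str,
                     insilico_o: Optional[str],
                     insilico_h: Optional[str]) -> str:
    """Pick the least ambiguous O and H designation for one isolate."""
    # In silico O type replaces "O rough" / "O?" when it is available.
    if pheno_o in UNRESOLVED_PHENOTYPIC_O and insilico_o is not None:
        o_type = insilico_o
    else:
        # Covers an ambiguous in silico O type; also (assumption) the case
        # where both methods return a typed O group.
        o_type = pheno_o
    # Non-motile strains (H-) receive the in silico fliC H type.
    if pheno_h == "H-" and insilico_h is not None:
        h_type = insilico_h
    else:
        h_type = pheno_h
    return f"{o_type}:{h_type}"
```

For example, an isolate typed phenotypically as O rough and H- with in silico calls O25 and H4 would be reported as O25:H4 under these rules.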

Definitions
Bacteremia was defined as the isolation of at least one non-contaminant bacterium from the blood culture collected on admission. Bacteremic EAEC strains were defined as isolates from blood culture testing positive by multiplex PCR for at least one of the following genes: aggR, aaiC or aatA. Other bacteremic E. coli were defined as E. coli strains from blood culture excluding EAEC, EPEC, ETEC and STEC.
Case fatality ratios (CFR) represent in-hospital mortality due to bacteremia, calculated for admitted children with known outcome (i.e. discharged alive, or dead), excluding patients who left the hospital without medical permission or were transferred to Maputo Central Hospital, as previously described [6].
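As a minimal sketch of how these two definitions translate into computation, the Python below flags an isolate as EAEC from its PCR-positive genes and computes a CFR restricted to children with known outcome. The function names and the outcome labels ("alive", "dead", "absconded", "transferred") are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Set

EAEC_MARKERS = ("aggR", "aaiC", "aatA")   # marker genes named in the definition
KNOWN_OUTCOMES = ("alive", "dead")        # outcomes that enter the CFR


def is_eaec(pcr_positive_genes: Set[str]) -> bool:
    """An isolate is EAEC if it carries at least one of the marker genes."""
    return any(gene in pcr_positive_genes for gene in EAEC_MARKERS)


def case_fatality_ratio(outcomes: Iterable[str]) -> float:
    """Deaths divided by children with known outcome.

    Children who absconded or were transferred are dropped from both the
    numerator and the denominator.
    """
    known = [o for o in outcomes if o in KNOWN_OUTCOMES]
    if not known:
        raise ValueError("no children with known outcome")
    return known.count("dead") / len(known)


# Illustrative data only: 6 deaths and 35 survivors among 41 known outcomes,
# plus 3 children with unknown outcome (the split of the 3 is hypothetical).
example = ["dead"] * 6 + ["alive"] * 35 + ["absconded"] * 2 + ["transferred"]
print(f"CFR = {case_fatality_ratio(example):.1%}")
```

With six deaths among 41 children with known outcome, the ratio is 6/41, which is the 14.6% reported for the sequenced EAEC strains.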

Statistical analysis
Statistical analyses were performed using STATA software, version 13.0 (Stata Corp., College Station, TX, USA). The proportion of virulence factors found among children infected with EAEC and other bacteremic E. coli was compared using Chi-squared or Fisher's exact tests as appropriate. Minimum community-based incidence rates (MCBIRs) for E. coli bacteremia (EAEC and other bacteremic E. coli excluding ETEC and EPEC) were calculated referring cases to population denominators establishing time at risk (child years at risk [CYAR]) inferred from the HDSS census information. Children did not contribute to the numerator or denominator for a period of 15 days after each episode or when they were outside the study area.
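The incidence calculation can be illustrated with a short Python sketch, assuming the usual form of a rate (episodes divided by child-years at risk, scaled to 100,000). The function names are hypothetical, and two simplifications are assumptions rather than details from the text: the 15-day window is counted from the last episode that was itself counted, and exclusion windows are assumed not to overlap or to run past the end of follow-up.

```python
from typing import Iterable

DAYS_PER_YEAR = 365.25
EXCLUSION_DAYS = 15   # days not at risk after each episode, as stated above


def count_episodes(episode_days: Iterable[float]) -> int:
    """Count episodes, skipping any that fall within 15 days of a counted one."""
    counted = 0
    last_counted = None
    for day in sorted(episode_days):
        if last_counted is None or day - last_counted > EXCLUSION_DAYS:
            counted += 1
            last_counted = day
    return counted


def child_years_at_risk(days_in_study_area: float, n_episodes: int) -> float:
    """Time at risk for one child, minus 15 days after each counted episode."""
    days = max(days_in_study_area - EXCLUSION_DAYS * n_episodes, 0.0)
    return days / DAYS_PER_YEAR


def incidence_per_100k(episodes: int, cyar: float) -> float:
    """Episodes per 100,000 child-years at risk (CYAR)."""
    if cyar <= 0:
        raise ValueError("child-years at risk must be positive")
    return episodes / cyar * 100_000
```

Days spent outside the study area would simply be left out of `days_in_study_area` before calling `child_years_at_risk`.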
Classification and regression tree (CART) analysis was also performed, as previously described [24] (CART Pro Version 7.0; Salford Systems). We included the collective number of virulence genes present (virulence factor score, VFS), inputting 48 factors of interest as binary (present/absent) independent predictive variables along with a continuous "factor total" that was the sum of all factors, including the presence of malaria. Alive/death was the binary dependent outcome variable for isolates causing bacteremia. Furthermore, CART analyses for bacteremic EAEC were also run on 66 genes assessed by WGS.
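To make the inputs to the tree concrete, the sketch below computes a virulence factor score from binary factors and evaluates a single CART-style split using Gini impurity. It is a simplified stand-in written for illustration, not the Salford Systems CART Pro implementation used in the study; the function names and the "died" outcome key are assumptions.

```python
from typing import Dict, List, Tuple


def virulence_factor_score(factors: Dict[str, bool]) -> int:
    """Sum of the binary (present/absent) factors for one isolate."""
    return sum(1 for present in factors.values() if present)


def gini(labels: List[bool]) -> float:
    """Gini impurity of a binary outcome (True = died)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_binary_split(records: List[Dict[str, bool]],
                      outcome_key: str = "died") -> Tuple[str, float]:
    """Return the factor whose present/absent split gives the lowest
    size-weighted Gini impurity, together with that impurity."""
    factors = [k for k in records[0] if k != outcome_key]
    best = None
    for factor in factors:
        present = [r[outcome_key] for r in records if r[factor]]
        absent = [r[outcome_key] for r in records if not r[factor]]
        weighted = (len(present) * gini(present)
                    + len(absent) * gini(absent)) / len(records)
        if best is None or weighted < best[1]:
            best = (factor, weighted)
    return best
```

A full tree repeats this search recursively within each branch until a stopping rule is met, which is what produces the terminal nodes described in the Results.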

Screening for EAEC markers, serotypes and phylogenetic groups distribution
From January 2001 to December 2014, 37,536 blood cultures were collected from children younger than 3 years of age admitted to the MDH; and 325 (0.8%) were positive for E. coli. Of these, 57 (17.5%) met the definition of EAEC, 6 (1.8%) ETEC and 2 (0.6%) EPEC; while the remaining 260 (80.0%) were classified as other bacteremic E. coli. Children with EAEC bacteremia appeared to be younger than those with bacteremia secondary to other E. coli [mean: 10.1 months (SD = 7.2) vs. 12.4 (SD = 8.3), p = 0.057], although this difference was not statistically significant.

Virulence factors detected by conventional multiplex PCR
Of the 57 EAEC, detailed virulence factors were assessed in the 44 isolates serotyped and sequenced. Of the 44 EAEC strains serotyped from blood, 41 (93.2%) met the definition of typical EAEC (presence of the master regulator gene, aggR) and were positive for aar (AggR-activated regulator). Only one typical EAEC (aggR+) strain lacked aatA, while 37 showed the combined presence of virulence genes aggR, aap, aatA and ORF3; five and two isolates lacked the aap or ORF3 genes, respectively. The prevalence of virulence factors is shown in Tables 2 and 3.

PLOS NEGLECTED TROPICAL DISEASES
newly discovered aggR repressor aar [32], and the aggregative adherence fimbriae (AAF) gene cluster. The AAF/V variant was found in 88.6% (39/44) of the EAEC strains, followed by AAF/I (2/44) and AAF/IV (1/44). The sequence analysis confirmed the PCR analysis and the presence of several ExPEC-associated virulence genes (iutA, fyuA, traT, hlyA, hlyC, cnf1, and papGII). Furthermore, we found the increased serum survival gene iss in 95.5% (42/44) of the EAEC strains positive for aggR. Lastly, the molecular serotype confirmed the conventional serotype except in seven strains (marked with • in Table 1). Additionally, virotype classification according to Dahbi et al. [33] did not clearly discriminate our strains, suggesting the possible occurrence of a virotype E sub-type that requires further characterization of the isolates.

Antimicrobial susceptibility and associated mechanisms
We also assessed the antimicrobial resistance of the 44 EAEC strains, documenting a high prevalence of resistance to the antibiotics most commonly available and used for empirical treatment in our community (ampicillin, gentamicin and chloramphenicol), including multidrug resistance (MDR) in 97% of strains, as shown in Table 4. WGS also identified genes conferring resistance to three or more groups of antibiotics: aminoglycosides, macrolides, phenicols, quinolones, sulphonamides, tetracyclines, trimethoprim, and/or β-lactams (Table 5).
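The multidrug resistance definition given in the Methods (resistance to three or more unrelated antibiotic families) can be expressed as a small Python check. This is a sketch under assumptions: the drug-to-family mapping is an illustrative grouping rather than one taken from the paper, the S/I/R codes and function names are hypothetical, and intermediate results are counted as non-susceptible, following the statement that intermediate and fully resistant profiles were both treated as non-susceptible.

```python
from typing import Dict, Set

NON_SUSCEPTIBLE = {"I", "R"}   # intermediate or resistant

# Illustrative grouping of some of the tested drugs into families (assumption).
DRUG_FAMILY = {
    "ampicillin": "penicillins",
    "ceftriaxone": "cephalosporins",
    "gentamicin": "aminoglycosides",
    "amikacin": "aminoglycosides",
    "chloramphenicol": "phenicols",
    "nalidixic acid": "quinolones",
    "ciprofloxacin": "quinolones",
    "tetracycline": "tetracyclines",
    "trimethoprim-sulfamethoxazole": "folate pathway inhibitors",
}


def non_susceptible_families(ast_result: Dict[str, str]) -> Set[str]:
    """Families with at least one non-susceptible (I or R) drug result."""
    return {
        DRUG_FAMILY[drug]
        for drug, category in ast_result.items()
        if category in NON_SUSCEPTIBLE and drug in DRUG_FAMILY
    }


def is_mdr(ast_result: Dict[str, str], min_families: int = 3) -> bool:
    """MDR: non-susceptible to drugs from three or more unrelated families."""
    return len(non_susceptible_families(ast_result)) >= min_families
```

Counting families rather than individual drugs is the point of the definition: an isolate resistant to gentamicin and amikacin has lost only one family, the aminoglycosides.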
The Classification and Regression Tree (CART) analysis suggested the presence of two clusters associated with poor outcome in the absence of malaria: cluster 1 comprised strains testing positive for papGII and hra in the absence of sfaS (Node 1), and cluster 2 comprised strains harboring cnf1 in the absence of hra and afa_dr (Node 2) (Fig 2). In addition, among children infected by EAEC, fatal strains either harbored the hlyC and orf3 genes in the absence of agn43 (Node 1) or belonged to the ST131 clone harboring hlyA and aer and lacking the astA toxin (Node 2) (Fig 3). The case fatality ratio among children infected with sequenced EAEC strains was 14.6% (6/41), mostly related to ST131 strains (83.3%; 5/6); 60% (3/5) of these children with poor outcome were infected by serotypes other than O25:H4, namely O86:H4 (n = 2) and O127:H4 (n = 1).
Classification and regression tree (CART) topology reveals the combinations of factors most strongly associated with death in the absence of malaria (Fig 2) or for EAEC strains (Fig 3). We considered all genotypic and phenotypic assays performed: aatA, aggR, aaiC, aap, ORF3, sat, sepA, pic, sigA, pet, astA, aafC, agg3/4C, aafA, agg3A, aggA, agg4A, air, capU, eilA, ORF61. Each branch of the CART tree ends in a terminal "node" (blue boxes), and each terminal node is uniquely defined by the presence or absence of a predictive factor such as a gene. The tree is hierarchical in nature.

Discussion
This is the first study conducted in Mozambique characterizing E. coli strains causing childhood bacteremia, documenting a novel subclone of ST131 harboring EAEC genes that causes bacteremia in children, with incidence peaking during infancy. In recent years, evidence has been reported of the involvement of non-fimH30 ST131 isolates and of fimH30 subclone isolates fulfilling molecular criteria for EAEC in extra-intestinal infections [18,34]. However, to our knowledge, this is the first report analyzing incidence trends over time of EAEC-associated bacteremia in African children, showing a magnitude similar to that caused by Staphylococcus aureus or Hib (before vaccine introduction) in our population [6], and suggesting that EAEC currently plays an important role as a cause of childhood E. coli bacteremia in this setting. The high incidence of EAEC reported here could be related either to pathogen or to host factors, including malnutrition or HIV, both highly prevalent in our study area and known to enhance translocation of commensal bacteria to the bloodstream [2]. Despite the limited HIV data in our study population, we believe that the EAEC incidence reported here is possibly due to properties of the pathogen. If it were favored by HIV co-infection or malnutrition, we would also expect to find a high prevalence of the other pathotypes (e.g. EPEC or ETEC) that are also prevalent among healthy children [35].
Because EAEC is a common enteric isolate, we hypothesize that extra-intestinal EAEC may arise via transfer of the pAA plasmid to more classical invasive pathogens, thereby conferring additional virulence traits. This is supported by the high prevalence of classical ExPEC virulence genes within our EAEC strains, such as hlyA, which is known to induce oxidative stress in blood [25] and is also associated with polymorphonuclear lysis/necrosis and lung injury in vivo in a rat model of E. coli pneumonia [36]. The low prevalence of the chromosomal aaiC gene in our strains compared to aggR and aatA, which are carried on the pAA plasmid, may provide additional support for transfer of the pAA plasmid. It also suggests that aaiC is not a good marker for EAEC bacteremic strains in our community.
The high prevalence of the AAF/V adhesin in our strains is noteworthy, suggesting a high degree of phylogenetic relatedness among them. This is underscored by the presence of papC or type 1 fimbriae (fimH) in more than 90% of isolates, suggesting that these strains may derive from urinary tract infection (UTI) strains, although clinical information on UTI diagnosis was lacking. The change in fimH alleles might improve the colonization abilities of the different clades (global dissemination of a multidrug resistant E. coli clone) [33]; however, to our knowledge this is the second report of the fimH27 subclade, which has recently been reported in ST405, accounting for 13% of ExPEC strains isolated from clinical isolates in Nigeria [12]. In contrast to the fimH30 subclade, which is characterized by extended-spectrum β-lactamase production or fluoroquinolone resistance [15,37,38], our strains were susceptible to third generation cephalosporins and fluoroquinolones. However, the resistance profile of our fimH27 strains was similar to that reported by Roer et al. from bloodstream infections in Denmark [39], despite the high serotype and ST diversity in the latter. Interestingly, the fact that the isolates circulating in our community do not fit the classical classification of virotypes [33] supports the hypothesis of the presence of a new entity that requires further characterization. Plasmid analysis of our strains compared to globally disseminated fimH30 and fimH27 of ST405 is underway and will be published elsewhere.
Fig 2. We included the collective number of virulence genes present (virulence factor score, VFS), inputting 48 factors of interest as binary (present/absent) independent predictive variables along with a continuous "factor total" that was the sum of all factors, including the presence of malaria. We identified 2 clusters associated with poor outcome in the absence of malaria: i) comprising strains testing positive for papGII and hra in the absence of sfaS (node 1); and ii) comprising strains harboring cnf1 in the absence of hra and afa_dr (node 2). https://doi.org/10.1371/journal.pntd.0008274.g002
Also notable is the fact that WGS identified the presence of the iss gene, recognized for its role in ExPEC virulence and considered a distinguishing trait of avian ExPEC but not of human ExPEC [40], suggesting that some strains may fit in the classification of avian pathogenic E. coli (APEC).
In addition, both serotyping and WGS data support the presence of serotype O25:H4 clone ST131, a clone significantly associated with urinary tract infections and bacteremia [37]. More interesting is the finding that 15 out of 37 (40.5%) ST131 strains belonged to serotypes distinct from the traditional O25:H4 and, to our knowledge, never previously reported: O127:H4 (5 strains), O51:H4 (4), O86:H4 (4), O18ac:H4 and O15:H4 (1 each) [41,42]. EPEC has been shown to cluster in related groups that share the H antigen (H2 and H6) and differ only in the O antigen, which might suggest that the LPS operon is located in a phage region and can be transferred by transduction among EPEC [43]. Here, we find that these strains share H4 but differ in their O antigen, yet belong to the same MLST type ST131, a finding that certainly warrants further investigation. Importantly, E. coli phylogenetic analyses have generally attached greater significance to the H antigen as a marker of shared genetic ancestry, suggesting that the high preponderance of H4 strains in Mozambique may indeed signal the existence of a highly virulent and longstanding pathogen. Further epidemiologic studies should address the importance of H4 flagellar clones.
The WGS analysis showed that the recently discovered Aar, which has been hypothesized to act directly or indirectly as a virulence suppressor, modulating virulence because of selection towards clinical attenuation, was present in almost all isolates. These data may also support an important role for the aar gene in E. coli epidemiology, similar to what was found in previous studies in Mali and Brazil, where aar-negative AAF/IV variant strains showed increased pathogenicity [24,44].
Interestingly, the CART analysis showed that deaths are likely to occur in the absence of malaria; nearly 20% of such children died, 12 of whom were infected with strains harboring papGII and eight with strains testing positive for hra (Fig 2, node 1). This finding reinforces the need for routine screening for bacterial pathogens among children admitted to hospitals in developing countries, where most deaths are attributed to malaria because of limited microbiology infrastructure. Indeed, fatal EAEC strains are also related to the presence or absence of specific virulence factors found in ST131 strains (Fig 3), with an attributable case fatality greater than that caused by invasive non-typhoidal Salmonella [45] or S. aureus. In addition, despite the small number of isolates, the poor outcome of ST131 non-O25:H4 serotypes may suggest that these are more virulent than the classical serotype O25:H4; further in vitro or in vivo testing is required to establish their potential virulence [13]. Unfortunately, because data on appropriate empirical treatment (number of doses and days) were inadequate or incomplete, antimicrobial resistance and HIV treatment were not included in the CART analysis; these variables might have helped to elucidate the relationship between the virulence profile of the strains and poor outcome.
Fig 3. CART analyses for bacteremic EAEC assessing 66 genes by WGS. Regardless of age or serotype, we demonstrated the presence of fatal strains harboring hlyC and orf61 genes in the absence of agn43 (node 1) or belonging to the ST131 clone harboring hlyA and aer and lacking the astA toxin (node 2). https://doi.org/10.1371/journal.pntd.0008274.g003
Our study sheds light on the etiology of bacteremia events, suggesting that not only O25:H4 EAEC but also other previously undescribed EAEC serotypes of the ST131 clone can cause clinically severe invasive bacteremia in neonates and young children, resulting in hospitalization and death in Southern Mozambique and requiring prompt recognition for appropriate management.
Supporting information S1