Molecular Hazard Identification of Non-O157 Shiga Toxin-Producing Escherichia coli (STEC)

The complexity regarding Shiga toxin-producing Escherichia coli (STEC) in food safety enforcement as well as clinical care primarily relates to the current inability to assess the risk of individual strains accurately, owing to the large variety in serotype and genetic content associated with (severe) disease. In order to classify the clinical and/or epidemic potential of a STEC isolate at an early stage, it is crucial to identify virulence characteristics of putative pathogens from genomic information, which is referred to as 'predictive hazard identification'. This study aimed at identifying associations between virulence factors, phylogenetic groups, isolation sources and seropathotypes. Most non-O157 STEC in the Netherlands belong to phylogroup B1 and are characterized by the presence of ehxA, iha and stx 2, but the absence of eae. The large variability in the number of virulence factors present among serogroups and seropathotypes demonstrated that the seropathotype is merely indicative of the virulence potential. Although all virulence gene associations were worked out, no specific pattern emerged that would unambiguously enable hazard identification for an STEC strain. However, the strong correlations between virulence factors indicate that these arrays are not a random collection but are rather specific sets. In particular, the presence of eae was strongly correlated with the presence of many of the other virulence genes, including all non-LEE encoded effectors. Different stx-subtypes were associated with different virulence profiles. The factors ehxA and ureC were significantly associated with HUS-associated strains (HAS) and not correlated with the presence of eae. This indicates their candidacy as important pathogenicity markers next to eae and stx 2a.


Introduction
Shiga toxin-producing Escherichia coli (STEC) are potentially lethal zoonotic pathogens with a clinical spectrum including diarrhea, hemorrhagic colitis, and the hemolytic uremic syndrome (HUS) [1]. STEC is of significant public health concern given the potential for foodborne outbreaks and their strong association with HUS, which is the leading cause of acute renal failure in children. The most common STEC serotype associated with human disease is O157:H7, but there is a growing recognition of over a hundred non-O157 STEC serotypes that also may result in human illness [2][3][4]. Some of these non-O157 STEC strains cause outbreaks and severe disease, whereas others are associated with only mild sequelae or with no human disease at all [5]. This observation resulted in the development of the STEC seropathotype (SPT) classification, which is based on the serotype association with human epidemics and HUS [6]. Serotypes responsible for hemorrhagic colitis (HC) and HUS, O157:H7 and O157:NM, were assigned to seropathotype A. Seropathotype B strains (O26:H11, O103:H2, O111:NM, O121:H19 and O145:NM) have a strong association with outbreaks and HUS, but less commonly than those of seropathotype A. Seropathotype C serotypes (O91:H21, O104:H21, O113:H21, O5:NM, O121:NM and O165:H25) are associated with sporadic HUS cases but not with epidemics. Seropathotype D serotypes are associated with diarrhea but not with HUS and/or outbreaks. Seropathotype E serotypes comprise STEC that had never been associated with human disease and had been isolated only from animals. Following the scientific opinion of the European Food Safety Authority (EFSA) [7], an alternative grouping (modified SPT) was used in which all serotypes associated with severe disease (HUS) were categorized as seropathotype group "haemolytic uraemic syndrome (HUS)-associated serotype(s)" or HAS (Karmali groups A, B and C). Isolates not associated with HUS were grouped as SPT-D or SPT-E following the same criteria as the Karmali classification.
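The grouping described above amounts to a lookup from serotype to seropathotype, followed by a collapse of groups A to C into HAS. The following is a minimal sketch of that logic, using only the serotype lists quoted in this Introduction. The function names and the example serotype O146:H21 are hypothetical, and the sketch does not reproduce the ranking actually used in this study, which also drew on clinical outcome data.

```python
# Serotype lists as quoted in the Introduction (Karmali seropathotypes A to C).
KARMALI_SPT = {
    "A": {"O157:H7", "O157:NM"},
    "B": {"O26:H11", "O103:H2", "O111:NM", "O121:H19", "O145:NM"},
    "C": {"O91:H21", "O104:H21", "O113:H21", "O5:NM", "O121:NM", "O165:H25"},
}


def karmali_group(serotype):
    """Return 'A', 'B' or 'C' for a listed serotype, otherwise None."""
    for group, serotypes in KARMALI_SPT.items():
        if serotype in serotypes:
            return group
    return None


def modified_spt(serotype):
    """Collapse Karmali groups A to C into HAS (modified SPT grouping)."""
    if karmali_group(serotype) is not None:
        return "HAS"
    # Telling SPT-D from SPT-E needs disease-association data
    # that is not encoded in this sketch.
    return "D or E"


# O146:H21 is a hypothetical example of a serotype absent from the lists.
for serotype in ("O26:H11", "O113:H21", "O146:H21"):
    print(serotype, karmali_group(serotype), modified_spt(serotype))
```

A static table of this kind can only classify serotypes that have already been linked to disease, which is the limitation of the seropathotype concept discussed later in the text.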
The EFSA network recently concluded that the seropathotype classification does not define pathogenic STEC nor does it provide an exhaustive list of pathogenic serotypes [7]. This relates primarily to the complexity of designating individual strains as pathogens due to the large variety in serotype and genetic content associated with (severe) disease. There is mounting evidence suggesting that the pathogenesis of STEC infection involves many additional virulence factors besides the well-known Shiga toxins and the locus of enterocyte effacement (LEE), including effector molecules encoded on pathogenicity islands (PAIs) outside the LEE [8,9]. In addition, there are considerable differences in geographic distribution of human pathogenic STEC serogroups [10]. Finally, although informative as an ex post facto determinant of virulence potential, the dynamic nature of STEC virulence in time and place exposes a limitation of SPT classification as a predictive indicator of microbial risk [5].
Phylogenetic analyses have shown that most E. coli strains belonged to four main phylogenetic groups, A, B1, B2, and D [11]. Whereas most commensal and diarrheogenic strains assemble in groups A and B1, extraintestinal E. coli strains belong mainly to group B2 and group D. STEC strains were found to fall into phylogenetic groups A, B1, and D [12]. However, there is a lack of knowledge on the phylogenetic distribution of the virulence factors of STEC isolates.
The goal of this study was to investigate the distribution of known virulence factors among clinical, food and animal STEC isolates from the Netherlands. More specifically, the research aimed at identifying associations between virulence factors and phylogenetic groups, isolation sources, seropathotypes, serogroups, intimin presence/absence, type of Shiga toxin, and the rpoS genotype. The results are discussed in relation to the epidemiology of STEC in the Netherlands.

Isolates and growth conditions
A set of 209 non-O157 STEC isolates (23 animal, 57 meat and 129 human clinical isolates) was obtained from the collections of the Food and Consumer Product Safety Authority (Wageningen, the Netherlands) and the National Institute for Public Health and The Environment (Bilthoven, the Netherlands). The animal isolates (all bovine) were isolated from 2002 to 2009 during national surveys and were maintained in Microbank vials (bioTRADING, Mijdrecht, the Netherlands). The strains from meat (different types of meat from various food animals) were isolated during national surveys by the Food and Consumer Product Safety Authority in the Netherlands from 2005 to 2010. The clinical human isolates were strains isolated from patients with STEC symptoms and sent in by hospitals for confirmation in the period 2006 to 2010. All clinical human isolates were maintained at room temperature in Mueller-Hinton agar.
Isolates were propagated on blood agar or nutrient agar (Oxoid) and DNA was extracted using a Chelex-100 resin-based technique (Bio-Rad Laboratories B.V., Veenendaal, the Netherlands). One colony of each isolate was transferred into 300 μl 10% Chelex-100 solution, which was subsequently heated for 5 min at 56°C to resuspend the cells. The tubes were briefly cooled at room temperature and mixed for 15 s before heating for 15 min at 98-99°C to lyse the bacteria. After cooling at room temperature, the lysates were centrifuged for 5 min at 13,000 rpm and up to 200 μl of the supernatant was transferred to a clean tube and stored at -20°C.

Seropathotype grouping and genetic profiling
Isolates were grouped according to the Karmali seropathotype classification [6]. Ranking was done based on the clinical symptoms caused by the Dutch patient isolates, the German HUS-associated EHEC (HUSEC) reference strain collection [14], and data on clinical outcome of confirmed STEC cases in humans in the EU from the European Surveillance System (TESSy) (2007-2010) as provided by the European Centre for Disease Prevention and Control (ECDC) [7].
PCR was used to screen isolates for the presence of 40 virulence markers and to determine the phylogenetic group. Primers and probes used in this study are displayed in Table 1. Conventional PCR tests for adfO, astA, ckf, efa1, ent/espL2, etpD, iha, iha_homologue, saa, stx 2a, stx 2b, stx 2c, stx 2d, stx 2dact, stx 2e, stx 2f, stx 2g, subA, toxB and ureC were performed on a Thermo Hybaid PCR Express Thermal Cycler (Hybaid Limited, Ashford, Middlesex, UK) using iQ Supermix (Bio-Rad Laboratories B.V., Veenendaal, the Netherlands) and 0.2 μM of each primer at an annealing temperature as indicated in Table 1. PCR products were visualized on a 1.5% or 2% agarose gel, depending on the length of the fragment (see Table 1). Real-time PCR tests for eae, ehxA, stx 1, stx 2 and terB were performed on an iQ5 Thermal Cycler using iQ Supermix (Bio-Rad Laboratories B.V., Veenendaal, the Netherlands), 0.2 μM of each forward and reverse primer and 0.4 μM of the probe at the temperature indicated in Table 1.
PCRs for the non-LEE encoded effectors (nle) and ent/espL2 were performed as described in Coombes et al. [8]. Except for nleA, the forward primer of each primer set was fluorescently labelled with FAM, VIC, NED, or PET. Amplicons were generated essentially as described and pooled in five sets, resulting in distinctive combinations of fragment size and fluorescent label. Fluorescently labelled fragments were analyzed on a capillary sequencer (3130 Genetic Analyzer; Applied Biosystems) in the presence of an internal marker (GeneScan size standard; Applied Biosystems). The GeneScan 600 LIZ size standard was used for pooled amplicons smaller than 600 bp (pool 1: ent/espL2, nleG2-1, nleB2, nleH1-2; pool 2: nleE, nleG9, nleH1-1, nleG2-3; pool 3: nleB, nleG6-2, nleG5-2; pool 4: nleD and nleF). The GeneScan 1200 LIZ size standard was used for pooled PCR products between 600 and 1,200 bp (nleC and nleG). Fragments larger than 1,200 bp (nleA; 1,296 bp) were analyzed by agarose gel electrophoresis. Raw data were analyzed using BioNumerics 6.1 (Applied Maths) to determine fragment sizes.
PCR amplification and sequencing of the rpoS gene were performed as described earlier [15], but for several isolates an alternative reverse primer was used to obtain the complete open reading frame (Table 1). The phylogenetic group PCR amplifying parts of chuA, TspE4.C2 and yjaA was carried out in a multiplex format using the Qiagen multiplex PCR mix and 0.2 μM of each primer at an annealing temperature of 60°C (Table 1) [16].
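The presence or absence of the three phylogenetic markers can be translated into a phylogroup with a small decision tree. The sketch below assumes the widely used triplex scheme based on chuA, yjaA and TspE4.C2, which is presumed here to be the scheme of reference [16]; the function name and argument names are hypothetical.

```python
def assign_phylogroup(chuA, yjaA, tspE4C2):
    """Assign phylogroup A, B1, B2 or D from three PCR results (True = amplicon present).

    Assumed decision tree: chuA splits B2/D from A/B1; within chuA-positive
    strains yjaA separates B2 from D; within chuA-negative strains
    TspE4.C2 separates B1 from A.
    """
    if chuA:
        return "B2" if yjaA else "D"
    return "B1" if tspE4C2 else "A"


# Hypothetical marker profiles for illustration.
profiles = {
    "isolate_1": (False, False, True),
    "isolate_2": (True, True, False),
    "isolate_3": (True, False, True),
    "isolate_4": (False, True, False),
}
for name, markers in profiles.items():
    print(name, assign_phylogroup(*markers))
```

Under this scheme the yjaA result is ignored for chuA-negative strains and the TspE4.C2 result is ignored for chuA-positive strains, so each isolate maps to exactly one of the four groups.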

Data analysis
Differences in frequencies of genetic markers (denoted in binary values 0 and 1) between groups and associations between markers were evaluated using the Chi-square test with a significance level of 0.05 (IBM SPSS Statistics version 19).
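For a single binary marker compared between two groups, the test reduces to a 2x2 contingency table. The sketch below is a stand-in for the SPSS analysis, not the authors' procedure: it assumes a Pearson chi-square without continuity correction, and the function name is hypothetical. As input it uses the rpoS variant counts given in the Results (7 of 80 non-human and 31 of 129 human isolates); this particular comparison is computed for illustration only and is not a statistic reported in the text.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: non-human, human; columns: rpoS variant, rpoS wild type.
chi2, p_value = chi_square_2x2(7, 80 - 7, 31, 129 - 31)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```

For tables with small expected counts, an exact test would usually be preferred over the chi-square approximation.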

Results
Isolate characteristics and the PCR results of all strains used can be accessed in S1 Table.

Association between stx-type and other virulence genes
On average, isolates with stx 2a, stx 2c and stx 2f showed a higher number of virulence genes compared to isolates with the other stx-subtypes (Fig. 3). Subtypes stx 2a and stx 2f were significantly (Chi-square P<0.05) associated with eae (Table 2). In particular, stx 2a was associated with a large number of nle-genes. In contrast, stx 2d and stx 2dact were negatively associated with eae, and stx 2e showed no specific positively associated virulence factors. The stx 2f isolates showed significant (P<0.001) positive associations with (in alphabetic order): adfO, astA, ckf, eae, nleB2, nleD, and nleF. Negative associations were observed with ehxA, iha, nleG2-1, stx 1, subA and terB (Table 2).

Phylogenetic distribution of STEC and association with virulence genes
The majority (63.2%) of the STEC isolates included in this study was characterized as phylogenetic group (phylotype) B1, followed by A (20.1%), D (9.1%) and B2 (7.7%). The distribution of phylotypes was not significantly different among animal, meat and human isolates (χ2 = 3.87, P = 0.424). However, a trend was observed with relatively more A and B1 isolates among non-human isolates (90.0%) compared to human isolates (79.1%). In contrast, relatively more B2 and D isolates were observed among human isolates (20.9%) compared to non-human isolates (10.0%). A significant association between phylogenetic group and HAS was observed (χ2 = 10.68, P = 0.014), with no HAS among phylogenetic group A (n = 42) and B2 (n = 16). In contrast, 85.2% of the HAS belonged to phylogenetic group B1, and 14.8% to phylogenetic group D.
No difference was observed in the number of virulence genes present in isolates of different phylogenetic groups (P = 0.515). However, some genes differed significantly in frequency between different phylogenetic groups (Table 3). The eae gene was more likely to be associated with B2 and D isolates compared to A and B1 isolates. Shiga toxin subtype 2f showed a significant association with phylogenetic group B2. Isolates with stx 2e and stx 2g were significantly associated with phylogenetic group A. Similarly, stx 2f, adfO, and nleB2 were more likely to be associated with B2 and D isolates. In contrast, stx 2, iha, and ehxA were more likely to be associated with A and B1 isolates.

Virulence factors in relation to STEC seropathotype
The total number of markers present decreased progressively from the Karmali SPT-B to SPT-E (Fig. 4). SPT-B showed a significantly higher number of virulence genes (mean 18.9) than SPT-C (mean 8.3) (P<0.001), SPT-D (mean 6.0) (P<0.001) and SPT-E (mean 6.2) (P<0.001). SPT-C showed a significantly higher mean number compared to SPT-D (P = 0.025) and SPT-E (P = 0.012). No difference was observed between SPT-D and SPT-E (P = 0.999). When considering the modified SPT classification, HAS isolates showed a significantly higher number of markers (mean 9.3, median 7) (P<0.001) compared to non-HAS isolates (mean 6.1, median 6) (Fig. 5). The decrease in mean number of virulence genes from Karmali SPT-B to SPT-E and from HAS to SPT-E was primarily due to the decrease in the number of nle-genes.

Virulence factors differentiating human versus non-human isolates
When considering all genetic markers investigated in this study, there was no significant difference in the number of genes present between non-human (mean 6.1) and human isolates (mean 7.0) (P = 0.142). Irrespective of serogroup and seropathotype, some genetic targets were found at a significantly different frequency among isolates of human and non-human origin (Table 4). Highly significant (P<0.01) associations with isolates of human origin were observed for eae, stx 2f and ckf. Highly significant (P<0.01) associations with isolates of non-human origin were observed for stx 2dact and iha. Other targets occurring at a marginally significantly (0.01<P<0.05) higher frequency among isolates of human origin compared to isolates of non-human origin included ent/espL2, nleA, nleG9, efa1, adfO, and nleH1-2 (Table 4).

Frequency of mutations in rpoS
Sequencing the rpoS gene revealed that 7/80 (8.8%) non-human isolates and 31/129 (24.0%) of the clinical isolates were characterized by mutations, including deletions, insertions and single nucleotide polymorphisms in the open reading frame (ORF). Surprisingly, the mutation found in the animal/meat isolates was nearly the same for all, i.e. A967G (N323D) (6/7). This

Discussion
The complexity regarding STEC in food safety enforcement as well as clinical care primarily relates to the current inability to designate individual strains as pathogens, owing to the large variety in serotype and genetic content associated with (severe) disease. Consequently, pathogenicity can neither be excluded nor confirmed for a given STEC isolate based on the seropathotype concept or analysis of the public health surveillance data [7]. To classify the clinical and/or epidemic potential of a STEC isolate at an early stage, it is crucial to identify virulence characteristics of putative pathogens from genomic information, which is referred to as 'predictive hazard identification' [7]. This study aimed at identifying associations between virulence factors, phylogenetic groups, isolation sources and seropathotypes in order to gain an increased understanding of the complex epidemiology of STEC.

Most non-O157 STEC in the Netherlands are phylogroup B1 and associated with ehxA, iha and stx 2, but not with eae
Consistent with previous studies [12,17], phylogenetic analysis shows that STEC are distributed over all four major phylogenetic groups but segregate mainly in phylogenetic group B1 and (to a lesser extent) A. This also reflects earlier observations concerning the broader host range, the more acute nature of infections, and the generally higher environmental persistence of B1 (and A) isolates compared to B2 and D isolates [18][19][20][21]. However, there is a relative paucity of information regarding the phylogenetic distribution of the virulence factors of STEC strains belonging to different phylogenetic groups. The observation by Girardeau et al. [12] that STEC isolates belonging to phylogroup A were exclusively eae-negative (and therefore "non-virulent") could not be confirmed in the present study: 23.8% of the A isolates were eae-positive compared to only 12.9% of the B1 isolates. However, only phylogenetic groups B1 and D contained HUS-associated strains (HAS). Possibly, these associations differ with respect to isolation sources and geographical regions. In contrast to intimin, the typical STEC virulence factors stx 2 and ehxA were significantly associated with A and B1 isolates.

The seropathotype is merely indicative for the virulence potential
Earlier studies showed a clear progressive decline in the number of nle-genes from SPT-A to SPT-E strains [8,22]. In the present study, the relation between the SPT and the number of virulence factors was particularly evident for the classical SPT classification and the nle-genes as compared to the modified EFSA classification [7] and the total number of virulence factors. The large variability in the number of virulence factors present among HAS indicates that the seropathotype is merely indicative of the virulence potential. This is also evident from the variation within priority STEC serotypes, where the number of virulence factors present ranges from 7 to 25. Similarly, high numbers of virulence factors were observed among modified SPT-D and -E isolates, which strikingly involved relatively many H- strains (O5:H-, O80:H-, O85:H-, O165:H-, O177:H-). The major problem with the SPT concept is that serogroups are retrospectively placed into risk classes. Given the number of serogroups and the large variation in genetic content, this does not provide a proactive hazard identification system.

The variation among STEC is characterized by correlated sets of virulence markers
STEC containing the LEE-island are characterized by their ability to express the attaching and effacing (A/E) phenotype, leading to substantial cytoskeletal rearrangements within the enterocyte [23]. Given the strong correlations between eae and other virulence markers, the disease mechanism employed by LEE-positive strains seems (unlike that of LEE-negative strains) strongly related to other virulence factors like terB, toxB, etpD, adfO, ckf, efa1 (in random order) and almost all nle-factors. Primarily the isolates belonging to the EU top-5 serogroups possess this array of correlated virulence genes.
Although the LEE-island is considered a hallmark virulence factor for STEC pathogenesis, it appears not to be essential, since sporadic cases and small outbreaks (including HC and HUS) have been caused by LEE-negative STEC [14,24]. Although mostly associated with less severe disease, 46% of all clinical human non-O157 STEC isolates in the Netherlands were eae-negative (Friesema, pers. comm.). In the present study, the percentage of eae-negative human isolates was almost twice as high (80.9%). It has been postulated that, in the absence of the LEE-island, mechanisms are emerging by which LEE-negative STEC interact with the host mucosa and induce disease [24]. The STEC autoagglutinating adhesin (saa), the iron-regulated gene homolog adhesin (iha) and the subtilase cytotoxin (subAB) have been reported as alternative adhesins [25][26][27]. In the present study, these three virulence factors indeed occurred at a significantly higher frequency among LEE-negative strains. If and how the functions encoded by the virulence factors that are present in LEE-positive strains but lacking in LEE-negative strains are fulfilled should be a focus of further study.
Specific sets of virulence factors were also associated with different Stx-subtypes. In particular, stx 2a was positively associated with an array of additional virulence factors (including eae), while stx 2b, stx 2d, stx 2e and stx 2g showed very little positive association with additional (known) virulence factors. Recently, the emergence of stx 2f-STEC in the Netherlands was described; these strains are generally associated with milder disease [28]. This might be explained by the relatively low potency of Stx2f [29], but also by the general absence of ehxA and terB, both of which showed a significant association with HAS in the current study and in general with EHEC/HUS [30,31]. These results highlight that differentiation in disease severity among different STEC is not likely linked to the presence or absence of a particular gene but to specific arrays of virulence factors (i.e. virulence profiles). The strong correlations between virulence factors indicate that these arrays are not a random collection but are rather specific sets. Comparative genomics of large sets of non-O157 STEC should reveal common genetic backbones and evolutionary processes leading to the acquisition of such sets of virulence factors [32].

A large proportion of STEC isolates in the Netherlands are characterized by a relatively low risk virulence profile
In the Netherlands, the EU top-5 serogroups (O26, O103, O111, O145 and O157) represent approximately half of all clinical STEC isolates [13]. The other half belongs to serogroups O63 (10%), O91 (9%), O113 (6%), O146 (4%) and others. All evidence provided in the current study points to the conclusion that isolates belonging to these serogroups are generally characterized by a low prevalence of virulence genes found to be associated with HAS in this study (like adfO, ckf, eae, efa1, nle-genes, stx 2a, terB, toxB, and ureC; in alphabetic order). This coincides with the observation that most disease caused by these serogroups is relatively mild [13].

Additional risk markers are needed to distinguish high-risk from low-risk STEC
In line with the results described here, Ju et al. [22] demonstrated that many of the non-LEE encoded effectors were primarily associated with eae-positive STEC strains. This is also supported by recent comparative genomics, which revealed the absence of all known phage-encoded non-LEE effector genes in eae-negative STEC [33]. Several 'molecular risk assessment' studies designated specific virulence profiles as strong signatures of high-risk STEC. Bugarel et al. [34] concluded that the presence in the same strain of a core of virulence determinants of eae, ent/espL2, nleB, nleE, and nleH1-2 is a strong signature of a human-pathogenic EHEC. A Belgian study presented the combined presence of efa, nleE and stx 2 as a high-risk virulence profile [35]. Bosilevac et al. [36] reported the combination of eae, nle and subA genes as a high-risk profile. However, all these markers were strongly correlated with eae in the present study, questioning the added value of using the nle-genes as additional markers. Consequently, all eae-negative virulent STEC strains, including HAS [14], would be categorized as harmless STEC, while other serotypes which have not been reported to be associated with severe disease or outbreaks but carry non-LEE-encoded virulence effectors similar to those of O157 EHEC would be considered outbreak- and severe disease-associated serotypes. Therefore, we support the conclusion of Ju et al. [22] that additional markers or methods of assessment are needed to accurately distinguish highly pathogenic STEC from low-virulence or harmless STEC. This especially applies to eae-negative STEC. The factors ehxA and ureC were, independently of eae, associated with HAS. This indicates their candidacy as important pathogenicity markers next to eae and stx 2a. The ureC gene was earlier identified as a suitable marker for pathogenicity [37]. Mutation of this gene resulted in reduced adherence of E. coli O157:H7 in ligated pig intestine [38], and strains with nonfunctional urease were less likely to survive stomach passage and colonize the mouse intestinal tract compared to urease-positive strains [39]. However, in other studies ureC was strongly correlated with intimin [22,37]. Probably, different associations between virulence markers reflect a different STEC population composition in different geographical regions. Enterohemolysin (ehxA) is also known for its association with HUS [30,31].
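The limitation of eae-centred signatures can be illustrated with a simple set comparison. The sketch below encodes only the core profile attributed to Bugarel et al. [34] in the text; the two isolate gene profiles are hypothetical and chosen for illustration, and the function names are not taken from any published tool.

```python
# Core signature attributed to Bugarel et al. [34] in the text.
BUGAREL_CORE = {"eae", "ent/espL2", "nleB", "nleE", "nleH1-2"}


def matches_signature(genes, signature=BUGAREL_CORE):
    """True if every marker of the signature is present in the isolate."""
    return signature <= set(genes)


def missing_markers(genes, signature=BUGAREL_CORE):
    """Signature markers that are absent from the isolate, sorted by name."""
    return sorted(signature - set(genes))


# Hypothetical isolates: one LEE-positive, one eae-negative carrying
# the eae-independent markers ehxA and ureC highlighted in this study.
lee_positive = {"eae", "ent/espL2", "nleB", "nleE", "nleH1-2", "stx2a", "ehxA"}
eae_negative = {"stx2a", "ehxA", "ureC", "iha", "saa", "subA"}

for name, genes in (("LEE-positive", lee_positive), ("eae-negative", eae_negative)):
    print(name, matches_signature(genes), missing_markers(genes))
```

Because every marker in such a signature co-occurs with eae, an eae-negative isolate fails the check regardless of whether it carries stx 2a, ehxA or ureC, which is why markers that are independent of eae are of interest.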

rpoS variants are over-represented among human clinical isolates
Stationary phase and almost any environmental stress that slows the growth rate of E. coli induce the rpoS-controlled general stress response [40]. In this study, the frequency of isolates with rpoS variants (i.e. deviating from the E. coli O157:H7 Sakai strain as wild-type (WT) reference) was about three times higher among human clinical isolates compared to animal and food isolates. A similarly skewed distribution of WT and variants was demonstrated for STEC O157 isolates [15]. The variants were negatively associated with survival in soil and resistance to acid shock. The hypothesis postulated by van Hoek et al. [15], that the human gut is an environment that gives rise to rpoS variants, is strengthened by the current results on non-O157 STEC. There is evidence that a functional WT rpoS is advantageous for bovine colonization [41,42]. In contrast, rpoS negatively regulates the expression of the LEE-island [43] and negatively affects the colonization of mice [44]. Non-bovine enteric systems could select for rpoS variants, as these are characterized by increased nutrient scavenging abilities at the expense of stress resistance [45]. In this way, direct competition with commensal E. coli could be avoided by the establishment of an STEC-specific metabolic niche [46].