Inconsistent PCR detection of Shiga toxin-producing Escherichia coli: Insights from whole genome sequence analyses

Shiga toxin-producing Escherichia coli (STEC) have been linked to food-borne disease outbreaks. As PCR is routinely used to screen foods for STEC, it is important that factors leading to inconsistent detection of STEC by PCR are understood. This study used whole genome sequencing (WGS) to investigate causes of inconsistent PCR detection of stx1, stx2, and serogroup-specific genes. Fifty strains isolated from Alberta feedlot cattle in three different studies were selected based on inconsistent or consistent detection of stx and serogroup by PCR. All isolates were initially classified as STEC by PCR. Sequencing was performed using Illumina MiSeq® with sample libraries prepared using Nextera XT. Virtual PCRs were performed using Geneious and bacteriophage content was determined using PHASTER. Sequencing coverage ranged from 47 to 102x, averaging 74x, with sequences deposited in the NCBI database. Eleven strains were confirmed by WGS as STEC having complete stxA and stxB subunits. However, truncated stx fragments occurred in twenty-two other isolates, some having multiple stx fragments in the genome. Isolates with complete stx by WGS had consistent stx1 and stx2 detection by PCR, although one that also had a stx2 fragment had inconsistent stx2 PCR. For all STEC and 18/39 non-STEC, serogroups determined by PCR agreed with those determined by WGS. An additional three WGS serotypes were inconclusive and two isolates were Citrobacter spp. Results demonstrate that stx fragments associated with stx-carrying bacteriophages in the E. coli genome may contribute to inconsistent detection of stx1 and stx2 by PCR. Fourteen isolates had integrated stx bacteriophage but lacked complete or fragmentary stx, possibly due to partial bacteriophage excision after sub-cultivation or other unclear mechanisms. The majority of STEC isolates (7/11) did not have identifiable bacteriophage DNA in the contig(s) where stx was located, likely increasing the stability of stx in the bacterial genome and its detection by PCR.


Introduction
Shiga toxin-producing Escherichia coli (STEC) is one of the most important pathogens in foodborne illness. Currently, STEC includes more than 400 strains, with O157 and the non-O157 "big six" (O26, O45, O103, O111, O121, and O145) serogroups being most frequently linked to hemorrhagic colitis in humans [1]. However, because low cell numbers are sufficient to trigger an infection and STEC are highly diverse, it can be challenging to isolate or identify specific serogroups associated with contaminated foods.
Several methodologies have been used to identify or isolate STEC, including immunomagnetic separation (IMS), selective and enrichment media, PCR, and qPCR [2][3][4][5][6][7]. However, there is still no gold standard methodology for isolating STEC [8]. Also, the development of specific methods according to the sample matrix could increase sensitivity and lower the threshold of detection of STEC strains. To further these aims, antimicrobials are commonly added to STEC media to prevent plate overgrowth [9], but this practice does not guarantee that only STEC will be isolated, nor does it discriminate among STEC serogroups.
For identification of STEC strains, PCR reactions are commonly based on the presence of Shiga toxin genes and can also be applied to determine bacterial serogroup through the amplification of genes responsible for the synthesis of O-antigens (wzx and wzy; [10][11][12]). Factors such as DNA purity and the presence of bacteriophage (phage) that are not incorporated into the bacterial genome can influence the accuracy and sensitivity of detecting STEC using PCR [13,14]. Furthermore, repeated subculturing of STEC can result in the loss of stx-coding phage [15], even with the first subculture [16]. Moreover, in a recent study Macori et al. [17] observed that qPCR amplified free phages encoding stx in samples collected from the rectal anal junction of sheep. Accordingly, there is growing consensus that more investigation is needed to evaluate the impact of stx-carrying free phages, or of integration and loss of stx-phages from bacterial genomes, on the detection and confirmation of STEC, as false-positive (PCR-positive but no stx integrated into the genome) or false-negative (PCR-negative but with stx present) results have consequences for food safety.
This study used whole genome sequencing (WGS) of E. coli isolated from feces of western-Canadian cattle to: (i) compare whole genome sequences with previous PCR detection of Shiga toxins and serogroup; (ii) investigate the presence and heterogeneity of stx-encoding phages; and (iii) determine the presence of other virulence factors and antimicrobial resistance of isolates.

Bacterial strains and culture
A total of fifty E. coli previously isolated from cattle feces in three different studies were used for WGS, and all strains were given the prefix CAP in recognition of financial support from the Canadian Agricultural Partnership. Forty-eight strains were isolated from feces of western-Canadian slaughter cattle collected from the floor of transport trailers [18], one strain was isolated from the pen floor of an Alberta feedlot [19], and one was isolated from feedlot cattle feces in 2017 [20]. Isolates were selected for WGS based on consistent or inconsistent PCR detection of stx1 and/or stx2 and/or serogroup from 750 strains analysed by Zhang et al. [21] and belong to a larger pool of approximately 15,000 isolates [20].
60˚C for 45 s, 72˚C for 90 s, and a final extension of 72˚C for 5 min. Conrad et al. [10] primers were also used for detection of serogroups (O26, O45, O103, O121, O145, O157; Table 1). PCRs had a final volume of 25 μL and contained 0.2 μM each primer, 1x HotStar Taq Plus Master-Mix (Qiagen, Hilden, Germany), 1x Coral Load PCR buffer, 2 μL DNA template, and nuclease-free water. The reactions were performed in a Veriti™ Dx Thermal Cycler (Applied Biosystems). To ensure that the PCR primers used were not responsible for inconsistent stx1 and stx2 results, virtual PCR was performed for the 50 isolates using Geneious 10.  Table 1). Also, up to two base pair (bp) mismatches between primer and target sequence were allowed for both stx and serogroup primers, so that imperfect matches that can still lead to amplification were considered. Default parameters were used for all other settings.
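The mismatch-tolerant virtual PCR described above can be illustrated with a minimal sketch. This is not the Geneious implementation; the function names, the primer sequences in the usage note, and the `max_product` cut-off are hypothetical and chosen only for illustration. The sketch scans one strand of a template for forward-primer sites and for the reverse complement of the reverse primer, allowing up to two mismatches per primer, and reports predicted amplicon lengths.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_sites(template, primer, max_mismatches=2):
    """Start positions where the primer matches with <= max_mismatches."""
    template, primer = template.upper(), primer.upper()
    hits = []
    for i in range(len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def virtual_pcr(template, forward, reverse, max_mismatches=2, max_product=3000):
    """Predicted amplicon lengths on the given strand of the template.

    max_product is a hypothetical upper limit on amplicon size.
    """
    fwd_sites = find_sites(template, forward, max_mismatches)
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is searched for on the given strand.
    rev_sites = find_sites(template, revcomp(reverse), max_mismatches)
    products = []
    for f in fwd_sites:
        for r in rev_sites:
            length = r + len(reverse) - f
            # Require the reverse site to lie downstream of the forward primer.
            if r >= f + len(forward) and length <= max_product:
                products.append(length)
    return sorted(products)
```

For example, with the made-up primers `ATGCCGTAGCTAGGCTTAAC` and `CGTTAGGCATCCGATAGGTC` flanking a 30 bp spacer, `virtual_pcr` predicts a 70 bp product; if two bases of the forward binding site are substituted, the product is predicted only when `max_mismatches` is at least 2.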

DNA extraction and WGS
Genomic DNA was extracted from overnight bacterial cultures prepared in Luria-Bertani broth (Merck, Darmstadt, Germany) using the ZR Fungal/Bacterial DNA MiniPrep™ kit (Epigenetics Company, Irvine, CA, USA) according to the manufacturer's instructions. DNA was quality checked and quantified using a Qubit fluorimeter (ThermoFisher, Waltham, MA, USA) and a TapeStation 4200 system (Agilent, Santa Clara, CA, USA). Sample libraries were prepared using the Nextera XT library preparation kit protocol (Illumina, Inc., San Diego, CA, USA). Sequencing was performed on the Illumina MiSeq platform using the MiSeq Reagent Kit V2 to produce 251 bp paired-end reads. Sequencing was performed at the Agri-Food Laboratories (Alberta Agriculture and Forestry, Edmonton, AB, Canada).

Sequencing analysis
Sequencing reads were de novo assembled into contigs using the Shovill pipeline (https://github.com/tseemann/shovill). Shovill included trimming, which was performed with Trimmomatic 0.39, and de novo assembly was performed with SPAdes version 3. 13 Presence of phage sequences in bacterial genomes was assessed using PHASTER (phaster.ca) [30,31]. Phage sequences were compared with reference stx genes (NC_004913.3; NC_049944.1; NC_008464.1) using the Blastn platform (NCBI) and to our WGS strains using Geneious Prime (Biomatters, Auckland, NZ). The MAFFT 7.450 tool [32] was used to align stx sequence data with that of stx-encoding phages obtained from the NCBI database using a scoring matrix of 200PAM / K = 2, a gap open penalty of 1.53, an offset value of 0.123, and automatic determination of sequence direction. The integrity of stx (%) was then calculated automatically from the aligned sequences, counting only bases that agreed between the NCBI phage and strain sequences. A heatmap illustrating the presence of phages in bacterial sequences was prepared using GraphPad Prism 5.01 (GraphPad Software, San Diego, CA).
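As an illustration of the integrity calculation, the following minimal sketch computes the percentage of reference bases that are matched by an identical base in the strain sequence within a pairwise alignment. The choice of the ungapped reference length as the denominator is an assumption made here for illustration, as is the function name; the exact calculation performed by Geneious is not described in the text.

```python
def integrity_percent(ref_aligned, query_aligned):
    """Percent of reference bases matched by an identical query base.

    Both inputs are rows of the same alignment, with '-' marking gaps.
    Assumption: the denominator is the ungapped reference length.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref = ref_aligned.upper()
    query = query_aligned.upper()
    ref_len = sum(1 for base in ref if base != "-")
    if ref_len == 0:
        raise ValueError("reference contains no bases")
    # Count columns where both rows carry the same base (gaps never count).
    matches = sum(1 for r, q in zip(ref, query) if r == q and r != "-")
    return 100.0 * matches / ref_len
```

For a 10-base reference row `ATGAAAAAAC` aligned to the query row `ATG----AAC`, six bases agree and the function returns 60.0.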

Overall concordance of PCR and WGS
After WGS of the 50 isolates, forty-eight were confirmed as Escherichia coli and two (CAP 7, CAP 50) were identified as Citrobacter spp. and were removed from further analysis. Among the forty-eight E. coli isolates, only eleven were classified as STEC by WGS [20], as they had contiguous stxA and stxB subunits forming complete sequences for stx1 or stx2, even though stx1 or stx2 was detected by PCR at least once in all isolates (Table 2). All isolates confirmed as STEC by WGS were also consistently classified as STEC by PCR. To evaluate the effectiveness of the PCR primers, a virtual PCR and a Blastn search using the NCBI platform were performed to compare binding of stx1 and stx2 primers to generic E. coli (without stx presence as determined by WGS) and STEC. Importantly, all STEC confirmed by WGS were positive for stx1, and complete stx2 sequences were found in two STEC (Table 2).
Blastn results showed no stx1 or stx2 primer binding in strains classified as generic E. coli by WGS (Table 2). Blastn results also ruled out amplification from other genome sequences: for isolates not confirmed to be STEC by WGS, the highest score (correspondence between bases of the sequence and the primer) was 28.2 for stx1 (14 bases of DNA matching the 25-base forward primer) and 30.2 for stx2 (15 bases of DNA matching the 24-base forward primer). Moreover, virtual PCR using the Conrad et al. [10] and Scheutz et al. [22] primers for stx1 and stx2 also indicated amplification only in STEC strains confirmed by WGS.
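The relationship between the number of matching bases and the reported scores can be sketched with the standard conversion from a raw alignment score S to a bit score, S' = (λS − ln K) / ln 2. The scoring scheme is not stated in the text; the sketch below assumes a reward of +1 per matching base and the ungapped Karlin-Altschul parameters λ = 1.374 and K = 0.711 that are commonly associated with +1/−3 nucleotide scoring. Under these assumptions, 14 and 15 contiguous matches correspond to the reported scores.

```python
import math


def bit_score(raw_score, lam=1.374, k=0.711):
    """Convert a raw alignment score to a bit score.

    lam and k are assumed Karlin-Altschul parameters (not given in the
    source); raw_score is the number of matching bases at +1 per match.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


stx1_best = round(bit_score(14), 1)  # 14 matching bases
stx2_best = round(bit_score(15), 1)  # 15 matching bases
```

With these parameter values, `stx1_best` evaluates to 28.2 and `stx2_best` to 30.2, consistent with partial primer matches of 14 and 15 bases rather than full-length primer binding.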

Primers and phages
Of 48 strains confirmed as Escherichia coli by WGS, 10 STEC and 22 non-STEC had up to six stx-encoding phages integrated within their bacterial genome (Table 3). For these thirty-two isolates, up to three fragments of stx (truncated stxA and stxB subunits) were associated with phage DNA insertions (Table 3 and Fig 1). However, presence of stx-phages did not guarantee presence of even fragmentary stx, and fourteen of the integrated stx-phages lacked stx coding sequences. Only one STEC strain confirmed by WGS (CAP 19) did not contain sequences attributed to a stx-encoding phage. One STEC strain with inconsistent PCR detection of stx2 (CAP 32) was found to have a fragment of stx2 integrated in the genome (Table 2). Twenty-two strains classified as generic E. coli by WGS had phage fragments of stx1, and in one case stx1 and stx2, which may have contributed to inconsistent PCR detection of these genes. However, 15 strains previously PCR-positive for stx1 or stx2 lacked stx fragments in their genome and were not confirmed as STEC by WGS. As well, for two isolates, even though stx2 fragments were present, stx2 was never detected by PCR. Integrity of stx present in fragments varied from 3 to 38.7% (Table 3).
Stx phage fragments present in our isolates were compared to phage reference sequences from NCBI, and we also performed virtual PCRs using primers designed by Conrad et al. [10] and Scheutz et al. [22]. Virtual PCR results indicated that all lysogenic phages had insertion locations corresponding to reference sequences that would have been amplified by both sets of primers. However, no phage sequences were complete as compared to reference sequences, with phage integrity ranging from 1 to 60% (Table 3). Additionally, there was no difference between the two primer sets [10,22] in detection of stx1 or stx2 in reference phages.
For seven STEC strains confirmed by WGS, stx was not located in regions where there were fragments of stx-encoding phage as determined by the PHASTER pipeline (Table 3 and Fig 2). For five WGS-confirmed STEC strains, stx was in the contig where stx-phage fragments were detected, with CAP 18 having both stx1 and stx2, but only stx1 associated with phage DNA (Fig 3). The presence of stx was verified near the insertion site of NinF and NinG genes in seven of the eleven STEC strains. However, stx was located adjacent (within ten genes prior to stx in the genome) to the Lar family of genes in six of the STEC (Figs 2 and 3). A heatmap divided strains into 3 groups: (A) fifteen STEC-negative strains by WGS lacking stx-phage insertions; (B) twenty-two STEC-negative strains by WGS with stx-encoding phage insertions; and (C) eleven STEC-positive strains by WGS (Fig 4).

Subtypes of stx and biofilm genes
All WGS-confirmed STEC strains possessed stx1a and stx1b, with CAP03 and CAP18 also possessing stx2a and stx2b (Table 2). In all cases, if stx1 and/or stx2 were confirmed by WGS, both a and b subtypes were present and by extension two or four bacteriophages would have initially inserted stx into these bacterial genomes. Biofilm genes detected by WGS included csgB, csgD, csgE, csgF, and csgG in all STEC, and in 47/48 of all strains sequenced. Other genes including cheY, entABCEFS, espX4, espX5, fepABCG, flgG, and ompA were present in all 48 Escherichia coli strains, reinforcing the quality of the sequencing coverage of the isolates (S1 Table). Finally, other genes that regulate cell surface adhesins were verified, such as FimA and FimB (S1 Table).

Serogroup and serotype
For serogroup determination, PCR and WGS were in agreement for 29/50 isolates (Table 4). PCR and WGS fully agreed with the assignment of the 11 STEC strains to their O-groups. Therefore, all mismatches between PCR and WGS serogroup (21/50) were in generic E. coli isolates (non-STEC by WGS). In summary, generic E. coli strains showed false-positive amplification for serogroups: O26 (n = 6), O45 (n = 2), O103 (n = 6), O145 (n = 2), and O157 (n = 5). The exceptions were O121, which had stable serogroup detection (Table 4), and O111, which was not included in this study due to previously noted stable serogroup and stx1 detection [33].
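The tallies reported above can be checked with a short calculation. The sketch below simply re-derives the number of mismatches and agreements from the per-serogroup counts given in the text; the variable names are illustrative only.

```python
# False-positive serogroup calls in generic E. coli, as reported in the text.
false_positive_by_serogroup = {"O26": 6, "O45": 2, "O103": 6, "O145": 2, "O157": 5}
total_isolates = 50

mismatches = sum(false_positive_by_serogroup.values())
agreements = total_isolates - mismatches
percent_agreement = 100.0 * agreements / total_isolates
```

The per-serogroup counts sum to 21 mismatches, leaving 29 agreements out of 50 isolates (58% agreement between PCR and WGS serogroup).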

Resistome and plasmids
The arsB-mob gene, which encodes resistance to arsenic, was present in 7/11 STEC isolates, and BlaEC, which encodes a beta-lactamase, was present in all E. coli (Table 5). Genes conferring resistance to other antimicrobials were occasionally identified, including resistance to aminoglycosides, diaminopyrimidines, sulfonamides, quaternary amines, tetracycline and phenols. Six generic E. coli strains (CAP 5, 21, 24, 29, 34, 39) carried three or more AMR genes. Almost all STEC isolates harbored at least one plasmid, with IncFIB (AP001918)1 being the most common, and CAP47 the only STEC strain that lacked plasmids.

MLST and Phylogenomic relationship between strains
For all E. coli isolates, 29 sequence types (ST) were detected, but for STEC strains, only six STs were identified (11, 21, 32, 343, 723, and 5082; Table 6). For O157:H7, ST11 strains were detected, similar to that of the reference strain used (Escherichia coli O157:H7 str. Sakai DNA, sequence BA000007), emphasizing the potential pathogenicity of our strains. Based on ST results, O103:H11 may be more closely related to O26:H11 than to O103:H25. In addition, O145:H28 was closely related to O157:H7 as they both had the same subtypes of stx (stx1a, stx1b, stx2a, stx2b). A phylogenomic tree with 0.055 relatedness was developed using a single copy of each isolate plus the reference genome using multi-locus sequence types (Fig 5).

Isolation of Citrobacter spp
Citrobacter spp. is part of the Enterobacteriaceae family and can grow in the enrichment medium of Escherichia coli, with morphology very similar to that of STEC colonies [34]. Using IMS may have also led to this misidentification, since some strains of Citrobacter spp. express an antigen similar to that of O157 [35]. Moreover, Citrobacter spp. strains positive for stx have been previously described [36]. A possible solution to prevent misidentification of Citrobacter spp. would be additional PCR assays to detect the uidA gene, responsible for the activity of beta-glucuronidase (mainly for O157), or housekeeping genes for E. coli, such as arcA, gapA, mdh, rfbA, and rpoS [37].
Possibly, amplification of a free stx-encoding phage may have occurred at initial isolation, as the two Citrobacter isolated in the present study did not have stx-encoding phage fragments in their genomes. Other PCR-based studies of E. coli have also either detected free stx-encoding phages or hypothesized the loss of stx after sub-cultivation [13,16,38]. Free stx-phages have been found in Citrobacter spp. [36] and other species such as Escherichia albertii [39]. Additional complicating factors which increase the difficulty of isolating STEC include adaptability (e.g. change in the expression of some genes) and difficulty in establishing a culture medium that can promote uniform growth between STEC strains [8]. Immunomagnetic separation was used to overcome some of the difficulties in isolating STEC in the present study. However, it is worth mentioning that as IMS is serogroup-based, it has a small spectrum of detection due to the large number of existing STEC serogroups [1]. Also, some cross reactions among serogroups have occurred, decreasing the discriminatory power of IMS [40,41]. Other challenges in isolation of STEC were addressed by our group in previous studies [42,43]. Competition during culture, differences in detection across laboratories, and the lack of selectivity of IMS highlight the need to improve methodologies for detection and isolation of STEC [8]. Consequently, the use of different culture media which would be selective for all STEC and/or the development of new IMS beads with increased selectivity would simplify STEC detection and isolation. WGS also has weaknesses, some of them inherent to the Illumina platform, including decreased quality toward the ends of reads, non-uniform amplification of target regions, and difficulties in assembly due to the short length of sequences [44]. Nevertheless, a combination of phenotypic approaches aligned with genotypic tools can better guarantee effective STEC isolation in future studies.

Concordance of Shiga toxin genes by PCR and WGS and phage influence in PCR
Although there may be difficulties in isolation of STEC, and it has been established that PCR assays across laboratories can produce variable results for detection of Shiga toxin genes due to use of different equipment and methods [22,33], it was expected that the use of the same assay by the same laboratory staff with the same equipment and conditions would produce consistent results. However, on re-growth of isolates collected in previous studies, and repeated PCR, detection of stx1 and/or stx2 showed variation for some isolates. Fourteen isolates which were positive for stx in the first PCR were negative in the second assay, matching WGS results, although twenty-five isolates continued to show false-positive PCR results in the second assay (positive in PCR but negative in WGS; Table 2). Loss of stx genes after re-culture has been previously described [16] and may also be attributed to mixed cultures (containing multiple strains of E. coli either possessing or lacking stx, resulting in variability depending on which colonies are selected) or loss of free stx-carrying phage [14].
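The categories used above to compare repeated PCR results with WGS can be expressed as a simple decision rule. The sketch below is illustrative only: the function name and category labels are hypothetical, and it assumes that each isolate has a chronological list of PCR calls for stx and a single WGS call for complete stx.

```python
def classify_stx_calls(pcr_calls, wgs_complete_stx):
    """Label an isolate from repeated stx PCR calls and its WGS result.

    pcr_calls: list of booleans, one per PCR run, in chronological order.
    wgs_complete_stx: True if WGS found complete stxA and stxB subunits.
    """
    if not pcr_calls:
        raise ValueError("at least one PCR result is required")
    if wgs_complete_stx:
        if all(pcr_calls):
            return "confirmed STEC, consistent PCR"
        return "confirmed STEC, inconsistent PCR"
    if not any(pcr_calls):
        return "concordant negative"
    if pcr_calls[-1]:
        # Still PCR-positive in the latest assay but negative by WGS.
        return "persistent false positive"
    # PCR-positive earlier, negative on repeat, matching WGS.
    return "false positive resolved on repeat"
```

Under this rule, an isolate that was stx-positive in the first PCR and negative in the second, with no complete stx by WGS, is labelled "false positive resolved on repeat", whereas one that remained PCR-positive is labelled "persistent false positive".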
Stx is carried by phages that may be free within the cell at the start of the lysogenic cycle prior to phage DNA insertion into the bacterial chromosome [13,45]. Although there is great heterogeneity of phages encoding Shiga toxins, the location of phage insertion in the bacterial genome has been reported to be close to wrbA or yecE in the Q terminator region [46]. However, based on results of the present study, seven STEC strains instead had stx inserted close to NinF and NinG.
The adjacent gene relationship between stx-phage insertion and NinG has been previously reported in O157:H7, with NinG thought to act as a controller of stx expression [47]. As seven STEC strains had the insertion of stx near to NinF and NinG, it is possible that these strains had a greater stx stability in the genome and less likelihood of undergoing a phage excision process. Also, seven of eleven strains confirmed as STEC by WGS lacked phage DNA flanking stx insertion sites (Table 3 and Fig 2). The lack of detection of phage DNA may represent cryptic phage which have lost the ability to excise from the bacterial genome, similar to those carrying stx1 in E. coli O111 [48]. Stx is typically a single transcriptional unit consisting of A and B subunits [49], but multiple insertion, mutation and excision events may have led to defective stx-prophages, and these occurrences can be considered as pathoadaptive mutations, although it is not known what advantage the cell obtains from immobilizing stx [48]. Of interest, Creuzburg et al. [48] also obtained variable stx PCR results which were attributed to a lack of primer-binding sites, missing fragments of the target genes, or the presence of other mobile genetic elements causing PCR amplification. Environments with a high bacterial density promote transfer of phages, with phages being both gained and lost by bacterial members within this dynamic environment [50]. In addition, the presence of multiple fragments of stx-coding phages may be related to the loss of phages by sub-cultivation, which has already been demonstrated [15,16,38,51]. Based on our results, we would agree with Senthakumaran et al. [38] who concluded that STEC with intact prophages may be uncommon and difficult to detect. Also, using WGS these authors observed the existence of a stx-negative "in vivo" strain O145:H28 with characteristics similar to another STEC strain of the same serotype [38].
Moreover, studies evaluating stx loss suggest that STEC O157:H7 strains are more "stx stable" when compared to non-O157 serogroups [16,38,51], although our study also included O157 strains selected for stx instability (n = 7). However, a difference between the present study and other studies that evaluated the loss of stx phage is that in our results the loss of stx1 was more common, likely due to its increased prevalence, while other studies investigated the loss of stx2 [16,38,51].
A significant finding of the present study was that intermittent false stx positives could in twenty-two cases possibly be related to the presence of fragments of stx-encoding phages (Table 2), especially as genomes of the majority of strains possessed multiple fragments of identifiable stx-encoding phages (Table 3). The Conrad et al. [10] primers used at initial isolation have had positive amplification of stx even with one or two base-pair mismatches [33], but the possible intermittent binding of primers to stx fragments has not been previously reported, likely as only a subset of stx fragments may have influenced PCR results. Larger fragments with highest stx sequence integrity would be the most likely to intermittently bind to PCR primers, although it was not possible in the present study to verify which if any of the stx fragments led to false-positive PCRs. However, it is likely more than coincidence that all isolates with fragments having at least 23% stx1 or stx2 integrity (n = 9) had intermittent PCR detection of that gene unless they also had an intact stx of the same type enabling consistent PCR detection. The stx1 present in CAP 32 is interesting and possibly intermediate between a fragment and a complete stx, as it only had 56% stx1 integrity in Geneious analyses due to base substitutions, but was classified as STEC by WGS. Accordingly, the demarcation between STEC and non-STEC may be more complicated than previously supposed and investigating expression of Shiga toxins would provide further clarity.
Three types of insertion of stx-encoding phage in the bacterial genome were verified (Fig 1). The CAP 47 strain confirmed as STEC showed homology with stx-encoding phage BP 4795, while two other non-STEC strains had multiple insertions between the bases of the stx-phage encoding region (CAP 5, CAP 33). In contrast, CAP 14 and CAP 15 each had a conserved stx-carrying phage in their genome but lacked a stx coding region. Similar to CAP 14 and CAP 15 strains, Senthakumaran et al. [38] noted the absence of stx in strains

Subtypes of stx, biofilm genes
In a study of 444 isolates of O157 from human disease outbreaks, multiple copies of stx1 and/or stx2 occurred in 68% of isolates [53]. However, it is notable that all STEC isolates in the present study, which were selected for WGS due to consistent stx PCR results, carried multiple copies of stx1 or stx2. Accordingly, we hypothesize that multiple bacteriophage insertions may increase stx stability within the E. coli genome. Similarly, the two isolates with the highest numbers of integrated stx-phage (five and six, respectively) were both STEC. Almost all biofilm genes identified were members of the csg family (the sole exception was CAP 34; S1 Table). Genes from the csg family play an important role in regulating biofilm genes in E. coli [54]. These genes are responsible for the formation of curli, an extracellular proteinaceous fiber which is involved in binding to surfaces and cell-to-cell contact, also influencing host colonization [55]. Strains of O157 that express curli are thought to have an exacerbated production linked to a high capacity for biofilm formation [56]. Potentially, STEC expressing curli may be linked to the phenomenon of super-shedding (>10^4 cells/g of feces), which has been theorized to be due to formation of an intestinal biofilm that when periodically sloughed leads to high numbers of fecal STEC [57]. However, presence of csg genes does not guarantee biofilm formation by STEC [58] and evaluation of biofilm forming phenotypes would require further study.

Serogroup and serotype
O-antigen serogroups represent the outermost part of the lipopolysaccharide layer and currently for Escherichia coli there are 184 O-serogroups [59]. Recently, some studies have standardized PCR assays to determine both O-antigen polysaccharide [59] and H-flagellum [60], as serological tests are laborious and may cross-react with other serogroups [61]. In the present study we found that in generic E. coli strains (without stx presence by WGS) there were 18 strains mistakenly amplified as belonging to the "Top Seven" (Table 4). There were also three strains which could not be O-serogrouped by WGS, illustrating that WGS also has limitations. For generic E. coli strains where serogroup determined by PCR did not match WGS, we evaluated whether there was a lack of primer specificity via virtual PCRs, and all primers evaluated only aligned with target regions. Also, all phages detected were evaluated by virtual PCR and did not affect possible amplification during serogroup determination. Therefore, our results emphasize that although the PCR for the determination of serogroup in STEC strains confirmed by WGS obtained 100% specificity, reasons for serogroup mismatches in some generic E. coli strains could not be determined. Mixed cultures are a possibility but unlikely to be wholly responsible. Additional study of unstable serogroup determination by PCR is required.

Resistome and plasmids
Information about antimicrobial resistance is important as antimicrobials are often included in media to improve the specificity of isolation methodologies. A number of antimicrobials including cefixime, cefsulodin, and vancomycin are used in enrichment broth for isolation of serogroup O157 [62]. Although arsenic and β-lactam resistance genes were present in most STEC strains in the present study, their use in culture media would not completely differentiate STEC from other E. coli strains due to the presence of these genes also in generic E. coli strains. However as selective media encompassing all STEC do not currently exist, the utility of β-lactam supplemented media is worthy of future exploration. The toxicity of arsenic would likely limit its practical application in culture media.
In relation to plasmid presence, IncF plasmids have been reported to confer resistance to different antimicrobials including β-lactams, aminoglycosides, tetracyclines, chloramphenicol, and quinolones [63,64]. This plasmid belongs to the Inc class, members of which are responsible for producing TEM-1 or inhibitor-resistant TEM [65]. Moreover, IncF plasmids are widely distributed in the Enterobacteriaceae family and contribute to the spread of antimicrobial multi-resistance among E. coli [66]. However, this plasmid class does not carry stx genes and would not have influenced PCR detection of stx1 or stx2.

Analysis of MLST profiles of strains
The multilocus sequence type of CAP03, ST11, has also been detected in cases of diarrhea as described by Ferdous et al. [67], in the database of the Food and Drug Administration from 2010 to 2017 [68], and confirmed in asymptomatic food handlers and from fecal sources of patients in Japan [69]. An important point is that O157:H7 is considered the serotype with the highest risk to humans, due to the large outbreaks that occurred in the USA in 1993 [70,71], in Japan in 1996 [72], and in Canada [73]. For this reason, the presence of O157:H7, and the ST11 profile, represents a direct risk of sporadic cases or a foodborne outbreak. Additionally, four isolates of ST32 (O145:H28) were detected. That ST is related to cases of hemolytic uremic syndrome. Furthermore, Shridhar et al. [74] analyzed 89 isolates of STEC serogroup O145 from several origins and all were ST32 with stx1a and stx2a. However, in the present study, CAP18 also showed the presence of stx1b and stx2b, which is evidence that supports the potential pathogenicity of this strain.
For serogroup O26:H11, two ST21 isolates were detected. This ST was detected in contamination from cattle feces [68], and hospitalized patients [75]. In addition, this ST was related to an outbreak occurring in Romania in 2016 where ST21 strains were isolated from 10 hemolytic uremic syndrome patients and five diarrhea cases [76]. Also, in a study by Chase-Topping et al. [77] which evaluated E. coli O26 isolated from Scottish cattle, ST21 was the most prevalent but, unlike in our strains, stx2 was most common, whereas only stx1 was found in our study.
The presence of ST343 (O103:H25) was described by Iguchi et al. [78] in sporadic cases and an outbreak with bloody diarrhea, vomiting and fever in Japan, and similar to the present study, stx1 was detected. In addition, this ST was isolated in areas of fish slaughter and watersheds [79]. As the strains isolated in our study were present in feces and animal hides, it is possible that they could also be present in water [80,81].
For O103:H11, ST723 was detected. Iguchi et al. [78] observed that serogroup O103 can be present in four ST groups (ST17, ST343, ST21, and ST723). The ST depends on the evolutionary line of each O103 strain. For example, ST723 is closely related to ST21, which in the present study was associated with O26:H11. However, Eichhorn et al. [82] found that ST723 was related to isolates from humans, while ST21 was most often found in isolates from cattle. ST343 has a low similarity with ST21 and ST723, indicating a different evolution from the other two O103 sequence types.
Another ST, 5082, was detected for O121:H7. ST5082 is not common but was related to one bovine isolate and one of unknown origin in California [83]. In this same study, 85% of O121 serogroup isolates were ST655 and only 5% were ST5082 but, unlike in the present study, these carried stx1d and stx2a or stx1d and stx2c, while our strain carried stx1a and stx1b. This divergence highlights the complexity and the ability for genetic rearrangement between strains of E. coli.

Conclusions
Generally, PCR is a reliable technique for classifying STEC and the few exceptions from our culture collection which had variable detection of stx and/or serogroup were investigated using WGS. In some cases, PCR primers used to determine stx genes may have been influenced by free phage encoding a Shiga toxin, since 29.2% of isolates (14/48) had concordant WGS and PCR results only in a second PCR after re-culture of the isolates. Conserved stx-encoding phages remaining in the genome without stx corroborates the possibility of loss in the region that encoded the stx gene, either by sub-cultivation or other unclear function. The presence of fragments of stx remaining in the genome may in some cases, particularly with larger fragments, have led to intermittent amplification by PCR primers. Comparing serogroup among E. coli isolates as determined by PCR and WGS, both techniques agreed for STEC and in 18 generic E. coli, but in another 21 generic E. coli reasons for this incongruence could not be determined. It is unlikely that any technique may perfectly characterize STEC, but it is most important that Shiga toxin genes be reliably detected by PCR due to their potential human health risks. Having up to six integrated stx-phages per isolate, including some lacking stx-coding regions, and an average phage integrity of <10% points to the extreme plasticity and impermanence of stx-carrying phage in the E. coli genome. Conversely, the majority of STEC lacked phage sequences in the same contig as stx, likely increasing stability of stx in the genome and its detection by PCR.
All STEC strains showed genes related to virulence, antimicrobial resistance, and adhesion to surfaces (biofilm formation), and when we analyzed the differences between the STEC isolates it was possible to verify that the main differences among isolates of the same serogroup were linked to the host cell-binding system. Strains showed a diversity of antimicrobial resistance genes, but all strains had a resistance gene for β-lactams. Consequently, β-lactams could be useful to improve isolation of STEC by inhibiting non-resistant background microflora. Regardless of difficulties in PCR classification, results of ST show a relation to other ST strains involved in food-borne outbreaks in other regions of the world, emphasizing the importance of accurate prediction of food safety risks.
Supporting information S1