Table 1.
PacBio assembly statistics.
Figure 1.
Prophage tail fibre allele switching in EC958.
A. Alignment of the Phi1 alternative contig that contains the inversion of the tail fibre region to the genome of EC958. Phage tail fibre genes are coloured from dark green to light green. Phage DNA invertase genes are coloured orange. 26 bp crossover sites are indicated by black arrows. Red shading indicates nucleotide identity in the same orientation. Blue shading indicates nucleotide identity in the opposite orientation, highlighting the inversion in the phage tail fibre region. B. Genetic loci map of the tail fibre gene region of EC958 phages (Phi1, Phi2 and Phi4) and the location of recombination sites for DNA invertase. The major tail fibre gene is formed by a fusion of the stable 5′ region (dark green), encoding a series of Phage_fibre_2 tandem repeats (PF03406), with the invertible 3′ region (green) that encodes a Phage Tail Collar domain (PF07484). Downstream of, and presumably co-transcribed with, the major tail fibre gene is a minor tail fibre gene (green). The alternate alleles form a mirror image of this arrangement, immediately downstream of the functional phage tail genes (lime green), enabling a new major tail fibre gene (and cognate minor tail fibre gene) to be formed by inversion of a 2–3 kb DNA segment. DNA invertase genes are coloured orange. The Phi4 prophage encodes a truncated DNA invertase (EC958_1582) that lacks the characteristic helix-turn-helix resolvase domain (PF02796). Invertible regions are highlighted in yellow. Figure prepared using Easyfig [27].
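The allele switch described in this legend amounts to replacing the segment between two crossover sites with its reverse complement. The Python sketch below illustrates that operation on a toy sequence only. The function names, the sequence and the coordinates are hypothetical and are not taken from the EC958 genome or from the methods used to prepare the figure.

```python
# Illustrative sketch only: the sequence and coordinates are invented,
# not EC958 data, and this is not the method used to prepare the figure.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_segment(genome: str, start: int, end: int) -> str:
    """Return a copy of genome with the half-open interval [start, end) inverted.

    start and end stand in for the positions of the two crossover sites.
    """
    if not 0 <= start <= end <= len(genome):
        raise ValueError("segment coordinates out of range")
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


# Toy example: a 7 bp "invertible region" flanked by fixed sequence.
toy = "AAAA" + "GATTACA" + "CCCC"
inverted = invert_segment(toy, 4, 11)
restored = invert_segment(inverted, 4, 11)
```

Applying the same inversion twice returns the original sequence, which mirrors the reversible switching between the two tail fibre alleles.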
Table 2.
Comparison of complete PacBio EC958 genome with draft 454 EC958 genome.
Table 3.
Sites of DNA inversion within EC958 prophage genomes as determined by PacBio assembly of alternate alleles.
Figure 2.
Maximum likelihood phylogenetic comparison of 4 ST131 and 16 representative E. coli isolates.
The tree is rooted using the out-group species E. fergusonii ATCC35469. The phylogenetic relationships were inferred with the use of 261,214 SNPs identified between the genomes of the 21 Escherichia strains and 1000 bootstrap replicates. The major E. coli phylogroups are coloured as follows: phylogroup B2-ST131: SE15, NA114, JJ1886, EC958 (red); other phylogroup B2: APEC-01, S88, 536, UTI89, CFT073, ED1A (orange); phylogroup D: UMN026 (yellow); phylogroup F: IAI39 (yellow); phylogroup A: BW2952, MG1655, W3110, HS (green); phylogroup B1: SE11, IAI1 (aquamarine); phylogroup E: O157 EDL933, O157 Sakai (blue). Red nodes have 100% bootstrap support from 1000 replicates.
Figure 3.
Distribution of EC958 mobile genetic elements in E. coli.
A. Visualisation of the EC958 genome compared with three E. coli ST131 genomes and 16 other E. coli genomes using BLASTn. EC958 prophages (Phi1–Phi7) and genomic islands (GI-thrW, GI-pheV, GI-selC, GI-leuX) are represented by black boxes in the outermost circle. The innermost circles represent the GC content (black) and GC skew (green/purple) of EC958. The remaining circles display BLASTn searches against the genome of EC958. B. A BRIG visualisation of the EC958 mobile elements compared with the 19 E. coli genomes. BLASTn searches of the 19 genomes against the EC958 prophages and genomic islands show that the EC958 GIs and prophages are well conserved in the ST131 clade C genomes but largely absent from the genomes of SE15 and the other 16 E. coli genomes, which are arranged inner to outer as follows: group E strains O157 EDL933, O157 Sakai (blue); group B1 strains SE11, IAI1 (aquamarine); group A strains BW2952, MG1655, W3110, HS (green); group D strain UMN026 (yellow); group F strain IAI39 (yellow); group B2 strains APEC-01, S88, 536, UTI89, CFT073, ED1A (orange); group B2 ST131 strains SE15, NA114, JJ1886, EC958 (red). Figure prepared using BRIG [28].
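The GC content and GC skew rings mentioned in this legend are conventionally computed over windows along the chromosome, with GC content as (G + C) / window length and GC skew as (G − C) / (G + C). The Python sketch below shows that calculation under these standard definitions. The function names, window size and step are assumptions chosen for illustration and are not the settings used to prepare the figure.

```python
# Minimal sketch of windowed GC content and GC skew.
# Window and step sizes are illustrative assumptions, not the figure's settings.
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full-length window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, seq[start:start + window]


# Toy example on a short invented sequence.
toy = "GGGCATATGCCC"
profile = [(start, gc_content(w), gc_skew(w)) for start, w in sliding_windows(toy, 4, 4)]
```

Each tuple in `profile` holds a window start position with its GC content and GC skew, which is the kind of per-window value a circular plot draws as a ring.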
Figure 4.
Nucleotide pairwise comparison of four E. coli ST131 chromosomes showing extensive variation in the structure and location of EC958 prophage elements (blue) and genomic islands (green).
An additional prophage element present in JJ1886 has also been annotated here as Phi8 for clarity. ST131 genomes are arranged from top to bottom as follows: JJ1886, EC958, NA114, SE15. Grey shading indicates nucleotide identity between sequences according to BLASTn (62%–100%). Figure prepared using Easyfig [27].
Figure 5.
Nucleotide pairwise comparison of a 200 kb region (thrA to degP) from the genomes of the four ST131 and 16 other representative E. coli strains.
Grey shading indicates nucleotide identity between sequences according to BLASTn (62%–100%). Coding regions immediately upstream of dnaJ are highlighted in purple. This region is well conserved in 19 of 20 E. coli genomes examined. However, a large insertion in the genome of NA114 located immediately upstream of dnaJ is clearly evident (white). E. coli genomes are arranged from top to bottom as follows: group B2 ST131 strains JJ1886, EC958, NA114, SE15 (red); group B2 strains ED1A, CFT073, UTI89, 536, S88, APEC-01 (orange); group F strain IAI39 (yellow); group D strain UMN026 (yellow); group A strains HS, W3110, MG1655, BW2952 (green); group B1 strains IAI1, SE11 (aquamarine); group E strains O157 Sakai, O157 EDL933 (blue). Figure prepared using Easyfig [27].
Figure 6.
Nucleotide pairwise comparison between EC958, a simulated EC958 Illumina assembly and NA114.
A. Nucleotide pairwise comparison of the EC958 chromosome (top) and a simulated EC958 chromosome assembly (EC958-sim, bottom). Linear alignments revealed extensive variations in the location and structure of mobile elements in EC958-sim when compared to EC958. Grey shading indicates nucleotide identity between sequences according to BLASTn (62%–100%). Prophage regions are annotated as blue boxes and genomic islands as green boxes. B. Nucleotide pairwise comparison of EC958 chromosome (top) and NA114 chromosome (bottom). C. Nucleotide pairwise comparison of EC958 (top), EC958-sim (centre) and NA114 (bottom) chromosomes. EC958 prophage and genomic islands misassembled in EC958-sim are similarly misassembled in the genome of NA114 (red boxes). Red boxes indicate positions in EC958-sim and NA114 where mobile genetic elements are present in EC958. The dnaJ gene is shown as a black triangle on each chromosome. Figure prepared using Easyfig [27].