Genomic analyses of multidrug-resistant Salmonella Indiana, Typhimurium, and Enteritidis isolates using MinION and MiSeq sequencing technologies

We sequenced 25 isolates of phenotypically multidrug-resistant Salmonella Indiana (n = 11), Typhimurium (n = 8), and Enteritidis (n = 6) using both MinION long-read [SQK-LSK109 and flow cell (R9.4.1)] and MiSeq short-read (Nextera XT and MiSeq Reagent Kit v2) sequencing technologies to determine the advantages of each approach in terms of the characteristics of genome structure, antimicrobial resistance (AMR), virulence potential, whole-genome phylogeny, and pan-genome. The MinION reads were base-called in real-time using MinKNOW 3.4.8 integrated with Guppy 3.0.7. The long-read-only assembly, Illumina-only assembly, and hybrid assembly pipelines of Unicycler 0.4.8 were used to generate the MinION, MiSeq, and hybrid assemblies, respectively. The MinION assemblies were highly contiguous compared to the MiSeq assemblies but lacked accuracy, a deficiency that was mitigated by adding the MiSeq short reads through the Unicycler hybrid assembly, which corrected erroneous single nucleotide polymorphisms (SNPs). The MinION assemblies provided similar predictions of AMR and virulence potential compared to the MiSeq and hybrid assemblies, although they produced more total false negatives of AMR genotypes, primarily due to failure in identifying tetracycline resistance genes in 11 of the 19 MinION assemblies of tetracycline-resistant isolates. The MinION assemblies displayed a large genetic distance from their corresponding MiSeq and hybrid assemblies on the whole-genome phylogenetic tree, indicating that the lower read accuracy of MinION sequencing caused incorrect clustering. The pan-genome of the MinION assemblies contained significantly more accessory genes and fewer core genes compared to the MiSeq and hybrid assemblies, suggesting that although these assemblies were more contiguous, their sequencing errors reduced the accuracy of genome annotation.
Our research demonstrates that MinION sequencing by itself provides an efficient assessment of the genome structure, antimicrobial resistance, and virulence potential of Salmonella; however, it is not sufficient for whole-genome phylogenetic and pan-genome analyses. MinION in combination with MiSeq facilitated the most accurate genomic analyses.

Introduction
Whole-genome sequencing (WGS) has been widely employed in foodborne outbreak investigations and pathogen surveillance [1]. In addition to the rapid identification of pathogens from contaminated sources of outbreaks, more detailed information about the pathogens, such as antimicrobial resistance (AMR), virulence, and inference of possible links between the sources of contamination, can also be obtained [2]. Illumina short-read sequencing technology has proven to be robust for characterizing pathogens that may have caused foodborne outbreaks and identifying those that could pose potential threats to public health [3]. However, this technology is unable to resolve repetitive and GC-rich regions, thus producing unresolvable loops in the underlying genome assembly that are fragmented into independent contigs [4]. The gaps between fragments can lead to an inability to obtain the complete whole-genome structure, which is critical in determining if some genes are co-regulated or co-transmissible and if they are located on the chromosome or on plasmids [5]. Moreover, the possibility of failing to identify key virulence genes during an outbreak investigation can also have negative impacts on public health assessment.
Nanopore sequencing technology that generates long reads can facilitate the completion of bacterial genome assemblies that are either lacking in sequencing depth at some repetitive regions or have areas that are missing reads completely using short-read sequencing technology [6]. Nanopore long reads can span the wide repetitive regions and also resolve GC-rich regions. Nanopore sequencing technology for full-length genome sequencing could allow the low-cost access of information necessary for making critical public health decisions.
However, Nanopore sequencing technology exhibits lower read accuracy, which may produce systematic errors, and for this reason, it has previously only been applied as a complement to short-read sequencing [7]. Since the release of the MinION platform by Oxford Nanopore Technologies, nanopore chemistry, basecalling, and bioinformatic tools have been steadily evolving, with the objective of using raw Nanopore long reads alone to obtain accurate bacterial genomes without relying on other sequencing technologies [8]. In addition, closed whole-genome assemblies can also be accomplished with a combination of both short reads for base-calling accuracy and long reads for structural integrity using hybrid assembly approaches such as those found in the Unicycler and SPAdes pipelines [9,10]. Unicycler was developed as an assembly pipeline for bacterial genomes that can conduct a hybrid assembly using both short and long reads [9]. It produces a short-read assembly graph and then uses long reads to build bridges to resolve all repeats in the genome and performs multiple rounds of short-read polishing, ultimately resulting in a complete genome assembly. The use of the Unicycler hybrid assembly with Illumina short reads and Nanopore long reads to complete bacterial genomes has been previously reported [11-13].
Salmonella enterica subsp. enterica includes more than 2,500 different serotypes, and is considered a primary pathogen for both humans and animals worldwide [14]. The majority of the infections in humans are associated with the consumption of foods that have been contaminated by Salmonella [15]. By providing definitive genotypic information, WGS is ideal for investigating the emergence and dissemination of antimicrobial resistance genes (ARGs) and chromosomal point mutations that predict AMR profiles, including compounds not routinely tested phenotypically [16]. Bacteria showing identical phenotypic resistance regulated by different mechanisms can also be differentiated by WGS. An in silico approach to predict AMR patterns based on WGS data requires comprehensive and accurate ARG databases, as well as bioinformatic tools that can reliably detect ARGs. Here, AMRFinder (https://github.com/ncbi/amr) and the Bacterial Antimicrobial Resistance Reference Gene Database (https://www.ncbi.nlm.nih.gov/pathogens/isolates#/refgene/) are publicly available for rapid identification of AMR-related genotypes.
In this study, we sequenced 25 phenotypically multidrug-resistant isolates of S. Indiana, Typhimurium, and Enteritidis using both MinION and MiSeq sequencing technologies. The MinION, MiSeq, and hybrid assemblies were then compared in terms of the characteristics of genome structure, antimicrobial resistance profile, virulence potential, whole-genome phylogeny, and pan-genome. A customized, reproducible bioinformatic workflow that employs publicly available tools was developed to obtain a complete circular bacterial chromosome and its associated plasmids. These closed genomes can provide valuable information on the genome structure of Salmonella and complement existing characterization data from other sequencing technologies such as MiSeq. This work represents a data-driven methodology comparison through elucidating the differences, as well as similarities, between genome assemblies of bacterial foodborne pathogens obtained using MinION and MiSeq sequencing technologies.

Genomic DNA extraction
Salmonella isolates were grown in 20 ml of TSB with 0.6% yeast extract (YE; Fisher Scientific Inc.) overnight at 37˚C with agitation at 150 rpm. For MiSeq sequencing, genomic DNA was extracted using the DNeasy Blood and Tissue Kit (QIAGEN Inc., Valencia, CA) on a QIAcube robotic workstation (QIAGEN Inc.). For MinION sequencing, genomic DNA was extracted using the Blood & Cell Culture DNA Maxi Kit (QIAGEN Inc.) following the manufacturer's instructions. The mixture of bacterial cells, lysozyme, RNase A, and QIAGEN Protease was incubated at 37˚C for an extended period (1 h) to ensure complete cell lysis, as well as complete degradation of RNA and protein. DNA was precipitated by inverting the tube containing Buffer QF and isopropanol 10-20 times and spooled using an inoculating needle. The spooled DNA was immediately transferred to a microcentrifuge tube containing 0.2 ml of Tris-EDTA (TE) buffer, pH 8.0 (Fisher Scientific Inc.) and then dissolved at 55˚C for 2 h. Genomic DNA was stored at 4˚C until use. DNA concentrations were measured using the Qubit dsDNA HS Assay Kit (Fisher Scientific Inc.) on a Qubit 3.0 fluorometer (Fisher Scientific Inc.).

Library preparation and WGS
MiSeq libraries were prepared with 1 ng of genomic DNA input using the Nextera XT DNA Library Preparation Kit (Illumina Inc., San Diego, CA) following the manufacturer's instructions. Afterwards, libraries were sequenced using the MiSeq Reagent Kit v2 (500-cycles) (Illumina Inc.) on a MiSeq System using the 2×250 bp paired-end chemistry. The adapter trimming option in the Illumina FASTQ file generation pipeline was used to remove adapter sequences from the 3' ends of the reads.
For MinION sequencing, libraries were prepared using the Ligation Sequencing Kit (Oxford Nanopore Technologies Inc., Oxford, UK) using the 1D Genomic DNA by Ligation protocol (SQK-LSK109), with a minor modification that 4 μg of genomic DNA input was

Bioinformatic workflow
The quality of the raw short reads of MiSeq was checked using FastQC 0.11.9 (https://github.com/s-andrews/FastQC). Q-score was used to predict the probability of an error in base-calling. Over 75% of bases >Q30 averaged across the entire run was considered acceptable for MiSeq Reagent Kit v2 (2×250 bp). The bioinformatic workflow of the hybrid, MinION, and MiSeq assemblies is shown in Fig 1. Raw reads were trimmed using Trimmomatic 0.36.4 [18], following the SLIDINGWINDOW operation with four bases to average across and 20 as the average quality required. Trimmed reads were then de novo assembled with the Illumina-only assembly method in the Unicycler 0.4.8 pipeline [9], which functions mainly as an optimizer of SPAdes 3.13.1 [19].
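The sliding-window trimming step described above can be illustrated with a short Python sketch. This is a simplified stand-in for the idea behind the SLIDINGWINDOW operation (window of four bases, required average quality of 20), not Trimmomatic's actual implementation; the function names and the example quality values are hypothetical.

```python
def phred33_to_scores(qual_string):
    """Decode a FASTQ quality string (Phred+33 encoding) into integer scores."""
    return [ord(c) - 33 for c in qual_string]


def sliding_window_trim(quals, window=4, min_avg=20):
    """Return how many leading bases to keep.

    Simplified rule (an assumption, not Trimmomatic's exact logic):
    scan from the 5' end and cut the read at the start of the first
    window whose average quality falls below min_avg.
    """
    if len(quals) < window:
        # Read shorter than one window: judge it as a whole.
        if quals and sum(quals) / len(quals) >= min_avg:
            return len(quals)
        return 0
    for start in range(len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return len(quals)


# Hypothetical read whose quality decays toward the 3' end.
example_quals = [38, 38, 37, 36, 30, 22, 12, 8, 5, 2]
keep = sliding_window_trim(example_quals)
print(f"keep the first {keep} of {len(example_quals)} bases")
```

In this example the window starting at the fifth base is the first one whose average drops below 20, so the read is cut there.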
The mean read quality of the raw long reads of MinION was scored using NanoPlot 1.0.0 [20]. The adapters on the ends of the raw reads were trimmed with Porechop 0.2.4 (https://github.com/rrwick/Porechop). When a read has an adapter in its middle, it is regarded as chimeric and cleaved into two separate reads with the adapter subsequently removed. Trimmed reads were then subsampled for subsequent assembly using Filtlong 0.2.0 (https://github.com/rrwick/Filtlong). Filtlong subsampling was not random; instead, more weight was given to read quality. The selections of minimum length and minimum window quality were relatively conservative, as this was necessary to ensure a sufficient coverage of small plasmids. All read lengths were retained and 50 was designated as the minimum window quality. The worst 10% of the MinION long reads, as measured by bases, were discarded to further increase read quality. To determine when to stop the MinION sequencing process, a long-read-only assembly of the trimmed, filtered reads was conducted with Miniasm 0.3 (https://github.com/lh3/miniasm), followed by multiple rounds of polishing with Racon 1.4.3 [21], in the Unicycler pipeline to see if sufficient data were gathered to generate a complete genome assembly. The trimmed, filtered reads were also assembled using the hybrid assembly method (normal mode) in the Unicycler pipeline, which can produce an assembly graph with the MiSeq short reads and then use the MinION long reads to build bridges to resolve all repeats. Multiple rounds of polishing were performed with the MiSeq short reads using Bowtie2 2.3.5.1 [22], Samtools 1.9 [23], and Pilon 1.23 [24] in the Unicycler pipeline to correct small errors. Finally, circularized contigs were rotated to begin at a starting gene of dnaA or repA if one could be detected with BLAST+. Bandage 0.8.1 [25] was used to visually assess the quality of de novo assemblies by loading the assembly graphs in GFA format after the Unicycler assembly.
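The idea of discarding the worst 10% of long reads as measured by bases can be sketched as follows. This is only an illustration of quality-weighted, non-random subsampling: the single per-read score used here is a hypothetical placeholder, whereas Filtlong computes its own score from several read properties, and the read names and values are invented.

```python
def keep_best_fraction(reads, keep_fraction=0.90):
    """Keep the best-scoring reads until keep_fraction of all bases is reached.

    reads: list of (read_id, length_in_bases, score) tuples, where a higher
    score means a better read. The score is a hypothetical stand-in for the
    quality weighting applied by a real read filter.
    """
    total_bases = sum(length for _, length, _ in reads)
    target_bases = keep_fraction * total_bases
    kept_ids, kept_bases = [], 0
    for read_id, length, _score in sorted(reads, key=lambda r: r[2], reverse=True):
        if kept_bases >= target_bases:
            break  # the remaining (worst) reads are discarded
        kept_ids.append(read_id)
        kept_bases += length
    return kept_ids


# Hypothetical reads: (id, length, score)
example_reads = [
    ("r1", 20000, 12.0),
    ("r2", 15000, 11.0),
    ("r3", 10000, 10.5),
    ("r4", 4000, 7.0),
]
print(keep_best_fraction(example_reads))
```

Because the cut-off is expressed in bases rather than in read counts, a few long low-quality reads can account for the whole discarded fraction.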
The raw MinION and MiSeq reads of the 25 Salmonella isolates (BioSample accession numbers: SAMN14450150-SAMN14450174) were deposited into the Sequence Read Archive (SRA) database under the BioProject accession number PRJNA615288. The complete genomes based on the hybrid assemblies were submitted to the GenBank database under the accession numbers CP050706-CP050785.

Identifications of plasmids, ARGs, chromosomal point mutations, virulence genes, and Salmonella pathogenicity islands (SPIs)
Plasmids were detected and typed using staramr 0.6.0 (https://github.com/phac-nml/staramr) against the PlasmidFinder database [26]; sequences were compared by BLAST against the known plasmid types in the database with 98% minimum identity and 60% minimum coverage. AMRFinder 3.0 alpha using the Bacterial Antimicrobial Resistance Reference Gene Database and staramr 0.6.0 using the PointFinder database [27] were implemented to identify ARGs and chromosomal point mutations, respectively, with 90% minimum identity and 60% minimum coverage compared with known reference sequences. Mass screening of sequences for virulence genes was performed using ABRicate 0.8.7 (https://github.com/tseemann/abricate) integrated with the Virulence Factors Database (VFDB) [28] for bacterial pathogens, with 90% minimum identity and 60% minimum coverage compared with known reference sequences. SPIFinder 1.0 (https://cge.cbs.dtu.dk/services/SPIFinder/) was used to identify SPIs with 90% minimum identity and 60% minimum coverage compared with known reference sequences.
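The identity and coverage cut-offs above amount to a simple two-threshold filter on each database hit. The sketch below shows that logic only; the `Hit` record and `passes_thresholds` function are hypothetical and do not reproduce the output format or internals of any of the tools named. The rck and shdA values are the ones reported later in this paper, and the last hit is invented to show a rejection.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str
    identity: float  # percent identity to the reference sequence
    coverage: float  # percent of the reference sequence covered


def passes_thresholds(hit, min_identity=90.0, min_coverage=60.0):
    """Accept a hit only if it meets both the identity and coverage minimums."""
    return hit.identity >= min_identity and hit.coverage >= min_coverage


hits = [
    Hit("rck (MinION, isolate 53)", identity=100.0, coverage=94.1),
    Hit("shdA (MinION, isolate 95)", identity=96.0, coverage=99.0),
    Hit("shdA (MiSeq, isolate 95)", identity=94.6, coverage=63.1),
    Hit("hypothetical partial hit", identity=97.0, coverage=40.0),
]
for hit in hits:
    status = "kept" if passes_thresholds(hit) else "rejected"
    print(f"{hit.gene}: {status}")
```

A hit such as the MiSeq shdA call passes only narrowly on coverage, which shows how a fragmented assembly can push a gene close to the detection limit.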

Whole-genome phylogenetic analysis
CSI Phylogeny 1.4 [29] was used to call single nucleotide polymorphisms (SNPs) of the MinION, MiSeq, and hybrid assemblies and then infer phylogeny based on the concatenated alignment of the high-quality SNPs. Default settings were used, with 10× as the minimum depth at SNP positions, 10% as the minimum relative depth at SNP positions, 10 bp as the minimum distance between SNPs, 30 as the minimum SNP quality, 25 as the minimum read mapping quality, and 1.96 as the minimum Z-score. S. Typhimurium LT2 (RefSeq assembly accession: GCF_000006945.2) served as the reference genome for SNP calling. The inferred whole-genome phylogeny in Newick format was visualized as a rectangular tree layout with Geneious Prime 2020.1.1 (Biomatters, Ltd., Auckland, New Zealand).

Pan-genome analysis
Sequences were annotated with Prokka 1.14.0 [30] to generate annotated assemblies in GFF3 format containing both sequences and annotations for subsequent pan-genome analysis. Pangenomes were analyzed and calculated using Roary 3.12.0 [31]. The results were visualized using the Roary plots module to generate a matrix with the presence and absence of core and accessory genes against the core-genome phylogenetic tree and a pan-genome pie chart that breaks down into the core, soft-core, shell, and cloud genes. The core-genome SNP alignment was conducted using Parsnp 1.2 [32], allowing for automatic recruitment of the reference sequence and requiring that all genomes be included for the analysis.
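The breakdown into core, soft-core, shell, and cloud genes depends on the fraction of genomes in which each gene is found. The sketch below uses cut-offs of 99%, 95%, and 15%, which are assumed here to match Roary's default summary categories and should be checked against the Roary documentation for the version used; the function name and the example counts are hypothetical.

```python
def classify_gene(n_present, n_genomes):
    """Assign a pan-genome category from the number of genomes carrying a gene.

    Assumed cut-offs (to be verified against the Roary documentation):
    core >= 99%, soft-core 95% to <99%, shell 15% to <95%, cloud < 15%.
    """
    if n_genomes <= 0 or not 0 <= n_present <= n_genomes:
        raise ValueError("counts must satisfy 0 <= n_present <= n_genomes, n_genomes > 0")
    fraction = n_present / n_genomes
    if fraction >= 0.99:
        return "core"
    if fraction >= 0.95:
        return "soft-core"
    if fraction >= 0.15:
        return "shell"
    return "cloud"


# Hypothetical presence counts for a set of 25 genomes.
for count in (25, 24, 10, 3):
    print(count, classify_gene(count, 25))
```

With 25 genomes, a gene missing from a single assembly already falls out of the core category, so annotation errors in a few assemblies can shift genes from the core into the accessory genome.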

Statistical comparisons among assemblies
To evaluate whether differences among the hybrid, MinION, and MiSeq assemblies were significant (P<0.05), the non-parametric Wilcoxon signed-rank test was used to compare values for four characters: total length, GC content, numbers of false positives of AMR genotypes, and numbers of false negatives of AMR genotypes. The test compared paired values, allowing for contrast of two treatments of the hybrid, MinION, and MiSeq assemblies (MinION assembly versus MiSeq assembly, MinION assembly versus hybrid assembly, and MiSeq assembly versus hybrid assembly). All three combinations of contrast were evaluated for each of the four characters. The test was conducted in R using the wilcox.test function.
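The paired comparison described above can be sketched in plain Python. This is an illustrative implementation of the Wilcoxon signed-rank statistic with a two-sided normal-approximation P value, without tie or continuity corrections, so it will not necessarily reproduce the exact P values returned by R; the function name and the example genome lengths are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, approximate two-sided P) for paired samples.

    Zero differences are dropped and tied absolute differences receive
    average ranks. The P value uses a plain normal approximation
    (an assumption made for brevity; no tie or continuity correction).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


# Hypothetical total assembly lengths (bp) for the same five isolates.
miseq_lengths = [4712000, 4890500, 5010200, 4950300, 5102000]
hybrid_lengths = [4731000, 4903100, 5030900, 4949800, 5125500]
print(wilcoxon_signed_rank(miseq_lengths, hybrid_lengths))
```

Each of the three contrasts (MinION versus MiSeq, MinION versus hybrid, MiSeq versus hybrid) would be run as a separate call on the paired values for the 25 isolates.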

Genome assemblies
For MiSeq sequencing, more than 85% of the paired-end short reads received scores of >Q30 for each Salmonella isolate. For MinION sequencing, the average of the mean read length of all isolates was 20,849. The mean quality scores of the MinION long reads ranged from 10.2 to 11.2 with an average of 10.6, which corresponded to an approximate accuracy of over 90%. The de novo assembly using only MiSeq sequence data generated assemblies with genome sizes that ranged from 4.7 to 5.2 Mb (Table 1). Genome sizes of the MinION assemblies ranged from 4.8 to 5.2 Mb. To achieve the best possible assemblies, the hybrid assembly was carried out with the Unicycler hybrid assembly method using both short and long reads [9], which generated assemblies with genome sizes varying from 4.8 to 5.2 Mb. Total length for each of the 25 isolates was compared as paired values for three contrasts. Only the MiSeq assembly versus hybrid assembly contrast was significant (P = 0.02382).
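The link between the mean quality scores and the stated accuracy of over 90% follows from the Phred definition, Q = -10 log10(P_error). A minimal sketch of the conversion is given below; the function name is hypothetical, and treating a mean read quality score as a single Phred value is an approximation.

```python
def phred_to_accuracy(q):
    """Convert a Phred quality score to the implied per-base accuracy."""
    error_probability = 10 ** (-q / 10.0)
    return 1.0 - error_probability


# Minimum, average, and maximum mean read quality reported for the MinION reads,
# followed by the Q30 threshold used for the MiSeq reads.
for q in (10.2, 10.6, 11.2, 30):
    print(f"Q{q}: accuracy of about {phred_to_accuracy(q):.1%}")
```

Under this conversion, the reported range of 10.2 to 11.2 corresponds to roughly 90% to 92% accuracy, whereas Q30 corresponds to 99.9%.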
Salmonella genomes were assembled into 44-166 contigs using MiSeq sequencing (Table 1). Notably, although there were variations in the number of contigs among the MinION assemblies, a closed bacterial genome was obtained for each isolate, including a circular chromosome and plasmid(s) (Table 1). Based on the de novo assembly, all isolates contained 1 to 5 plasmids except isolates 43 and 174. The sizes of the plasmids identified in those genomes ranged widely from 2 to 260 kb. The largest plasmid (260,432 bp) was detected in isolate 45. As detected by PlasmidFinder, the MinION, MiSeq, and hybrid assemblies showed consistent plasmid profiles for all isolates except isolate 102 (S1 and S2 Tables). IncX1 was detected in the MiSeq assembly of isolate 102 but not in its MinION assembly. A discrepancy was also observed in the number of contigs between the MinION and hybrid assemblies of isolates 102, 111, 45, 56, 81, and 95. As predicted with the MinION and hybrid assemblies, although isolate 43 had only one contig, we observed three plasmids (IncHI2, IncHI2A, and IncQ1) integrated into its chromosome (Fig 2), demonstrating an advantage of MinION in plasmid analysis over MiSeq. No plasmids were detected in isolate 174, which also had one contig. Using PlasmidFinder for plasmid analysis on the MiSeq assemblies has some major limitations. The PlasmidFinder database was developed solely based on unique short sequences (200-800 bp) of plasmid replicons for the Enterobacteriaceae family. Our data suggest that this approach may not be reliable because the entire structure of a plasmid cannot be fully revealed, since recombination, insertion, or deletion events frequently occur among plasmids [35]. Therefore, it is important to acquire the full sequence of a plasmid using Nanopore sequencing technology to study its type, structure, and evolution.

PLOS ONE
Assemblies using three different methods produced GC contents of 51-52%. It should be noted that the MinION assemblies always had higher GC contents than the MiSeq assemblies, although the differences between these two methods were only 0.03-0.19%. All three contrasts were significantly different in GC content from each other (P<0.00001). Genome assembly using only short-read sequence data is complicated by biases that may occur during library preparation and cause some genetic regions to be excluded from the final library [36]. Common short-read library preparation methods (e.g. Illumina Nextera XT DNA Library Preparation protocol) include PCR amplification steps that are biased against regions with extreme GC contents. Library preparation methods using transposases to fragment DNA may also shear genomes, causing further biases that limit the capability of short-read sequencing [37]. Our hybrid assembly demonstrated that the MinION sequence data improved the contiguity of the MiSeq assembly when running the Unicycler hybrid assembly method (Table 1). As implemented in this mode, the MinION reads can scaffold contigs generated by short reads to build bridges over regions of the assembly graph that cannot be resolved by MiSeq sequencing alone [9]. This highlights the ability of the MinION long reads to resolve genomic repeats and reconstruct complete genomic assemblies that were otherwise fragmented when assembling using short reads only.
Phenicol resistance. All of the chloramphenicol-resistant isolates contained at least one phenicol resistance gene. Among chloramphenicol-resistant isolates, eight phenicol resistance genes were identified in the MiSeq assemblies, including catB3 (63%), cmlA1 (38%), floR (88%), catA1 (19%), catA2 (25%), oqxA (25%), oqxA2 (13%), and oqxB (38%). Phenicol resistance genes present in the MinION assemblies of chloramphenicol-resistant isolates included catB3 (50%), cmlA1 (31%), floR (69%), catA1 (19%), catA2 (25%), and oqxA (19%). The hybrid assemblies of chloramphenicol-resistant isolates carried catB3 (63%), cmlA1 (31%), floR (81%), catA1 (19%), catA2 (25%), oqxA (31%), oqxA2 (31%), and oqxB (63%).

ARGs and chromosomal point mutations
Quinolone resistance. Quinolone resistance is typically conferred by chromosomal point mutations of the quinolone resistance-determining regions (QRDRs) of gyrA, gyrB, parC, and parE [38] and/or the acquisition of plasmid-mediated quinolone resistance (PMQR) genes [39]. The MinION, MiSeq, and hybrid assemblies of quinolone-resistant Salmonella isolates carried either the QRDR mutations or the PMQR genes. There was a high level of concordance in quinolone genotypes among the MinION, MiSeq, and hybrid assemblies of S. Indiana isolates, as both the gyrA and parC mutations were present, with only one exception that the gyrA mutation was not detected in the MinION assembly of isolate 115. We observed that the gyrA mutation was absent only in one MiSeq and one hybrid assembly of quinolone-resistant isolates, while five MinION assemblies of quinolone-resistant isolates did not possess the gyrA mutation. The parC mutation was present in 58% of the MiSeq assemblies and 63% of the hybrid assemblies of quinolone-resistant isolates. It is worth noting that as many as 85% of the MinION assemblies of quinolone-resistant isolates contained the parC mutation, suggesting that MinION was more effective in detecting the parC mutation.
Tetracycline resistance. Phenotypic resistance to tetracycline correlated highly with the presence of known resistance determinants predicted by the MiSeq assemblies. Of the 46 tet genes described to date [44], three were identified in the MiSeq assemblies of tetracycline-resistant isolates, the most prevalent being the tetracycline efflux transporters encoded by tet(A) (79%) and tet(B) (32%). Relatively rare was tet(M) (16%), which encodes a ribosomal protection protein. According to the genotypic predictions by the MinION assemblies, tet(A) and tet(B) were only present in seven and one tetracycline-resistant isolates, respectively. In all but one case, genotypic predictions by the hybrid assemblies for tetracycline resistance were consistent with phenotypic susceptibility data. Only one isolate (isolate 85) whose MiSeq and hybrid assemblies both carried tet(A) was not phenotypically resistant to tetracycline.
Sulfonamide and trimethoprim resistance. As detected in the MiSeq assemblies, sulfafurazole resistance was predominantly encoded by sul1 and sul2, which were present in 76% and 90% of sulfafurazole-resistant isolates, respectively, with 29% of these isolates containing sul3. The prevalence of sul1, sul2, and sul3 in the MinION assemblies of sulfafurazole-resistant isolates was 76%, 62%, and 10%, respectively, while these three genes were identified in 76%, 86%, and 24% of the hybrid assemblies of sulfafurazole-resistant isolates, respectively.
Among the detected genes responsible for synthesizing trimethoprim-resistant dihydrofolate reductase, dfrA12 and dfrA7 were present in 57% and 29% of the MiSeq assemblies of trimethoprim-resistant isolates, respectively, while they were detected in 64% and 36% of the MinION assemblies of these isolates, respectively. These two genes were identified in 57% and 36% of the hybrid assemblies of trimethoprim-resistant isolates, respectively.
Correlations between AMR phenotype and genotype. Theoretically, any phenotypic feature of a microorganism can be derived from its genome sequence. However, both false positives (phenotypically susceptible, genotypically resistant) and false negatives (phenotypically resistant, genotypically susceptible) of AMR genotyping may occasionally occur and have some adverse consequences [16]. In the present study, we observed instances of false positives for the MinION, MiSeq, and hybrid assemblies, indicating the presence of AMR determinants even if the phenotypic susceptibilities were below the MICs (Table 5). For example, although the MinION, MiSeq, and hybrid assemblies of isolate 96 all harbored amikacin and kanamycin resistance genes, it was not phenotypically resistant to amikacin or kanamycin. Noticeably, the MinION assemblies had similar false positives compared to the MiSeq and hybrid assemblies, although amoxicillin/clavulanic acid resistance genes were present in 16 of the 21 MinION assemblies of amoxicillin/clavulanic acid-sensitive isolates. No significant differences in the
numbers of false positives of AMR genotypes were observed among the MinION, MiSeq, and hybrid assemblies (P>0.05). False negatives were also observed for the MinION, MiSeq, and hybrid assemblies (Table 5), implying that some isolates phenotypically resistant to certain antibiotics were annotated as genotypically susceptible. No significant differences in the numbers of false negatives of AMR genotypes were observed among the MinION, MiSeq, and hybrid assemblies (P>0.05). For 24 ampicillin-resistant isolates, the corresponding resistance genes were present only in 20 MinION assemblies. These genes were absent in up to 12 MiSeq and 12 hybrid assemblies of ampicillin-resistant isolates, suggesting that the MiSeq and hybrid assemblies were not effective in detecting genes associated with ampicillin resistance. For five isolates resistant to amikacin, the corresponding resistance genes were not identified in one MinION assembly, while the MiSeq and hybrid assemblies successfully detected amikacin resistance genes. Genes related to kanamycin resistance were not detected in three MinION assemblies, two MiSeq assemblies, and two hybrid assemblies of 19 kanamycin-resistant isolates. For 16 gentamicin-resistant isolates, the corresponding resistance genes were absent in four MinION assemblies, one MiSeq assembly, and two hybrid assemblies. Nineteen isolates were observed to display phenotypic resistance to tetracycline. Tetracycline resistance genes were not identified in 11 of the 19 MinION assemblies of tetracycline-resistant isolates. In contrast, tet genes were identified in all MiSeq assemblies of these isolates, while they were absent only in one hybrid assembly.
Among large-scale studies investigating the correlation between phenotypes and genotypes, Feldgarden et al. [45] examined the consistency between AMR genotypes predicted using AMRFinder and resistance phenotypes of 5,425 Salmonella isolates from the National Antimicrobial Resistance Monitoring System. They indicated that overall, the presence or absence of kanamycin and gentamicin resistance genes was a good predictor of phenotypic susceptibility. Nonetheless, 67 out of 3,883 isolates (2%) that carried no kanamycin resistance genes still displayed phenotypic resistance to kanamycin. Similarly, 1% of isolates (53/5,419) were phenotypically resistant to gentamicin despite the absence of corresponding resistance genes. Other studies on Salmonella have also demonstrated that phenotypic breakpoints do not always correspond to the presence or absence of ARGs [46,47]. Tyson et al. [47] also reported that 10 out of 1,028 Salmonella isolates (1%) devoid of tetracycline resistance genes were still phenotypically resistant to tetracycline. These false negatives require closer attention because they may result in inadequate treatment of infections by resistant strains. It is generally preferable to minimize false negatives at the expense of increasing the false-positive rate, although false positives can lead to antibiotic misuse, potentially increasing the risk of resistance to last-line antibiotics.
Overall, the MinION assemblies provided similar predictions of AMR compared to the MiSeq and hybrid assemblies, although they created more total false negatives of AMR genotypes mainly due to no identified tetracycline resistance genes in 11 of the 19 MinION assemblies of tetracycline-resistant isolates.
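The definitions of false positives and false negatives used in this section can be made concrete with a small tally. The sketch below is illustrative only: the function names are hypothetical, and the example encodes just the tetracycline result reported above for the MinION assemblies (resistance genes missed in 11 of the 19 phenotypically resistant isolates).

```python
from collections import Counter


def classify_call(phenotype_resistant, genotype_resistant):
    """Label one isolate-by-antibiotic comparison.

    False positive: phenotypically susceptible, genotypically resistant.
    False negative: phenotypically resistant, genotypically susceptible.
    """
    if phenotype_resistant and genotype_resistant:
        return "true positive"
    if phenotype_resistant and not genotype_resistant:
        return "false negative"
    if not phenotype_resistant and genotype_resistant:
        return "false positive"
    return "true negative"


def tally(calls):
    """Count outcomes over (phenotype_resistant, genotype_resistant) pairs."""
    return Counter(classify_call(p, g) for p, g in calls)


# Tetracycline-resistant isolates, MinION assemblies: 11 missed, 8 detected.
minion_tet_calls = [(True, False)] * 11 + [(True, True)] * 8
counts = tally(minion_tet_calls)
sensitivity = counts["true positive"] / (counts["true positive"] + counts["false negative"])
print(dict(counts), f"sensitivity = {sensitivity:.2f}")
```

Repeating the tally for each antibiotic and each assembly type gives the per-isolate counts that the Wilcoxon signed-rank test then compares across assemblies.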
Correlations between AMR and CRISPRs, cas genes, and prophage regions. Bacteria are able to meet the evolutionary challenge of combating antimicrobial chemotherapy, often by acquiring preexisting AMR determinants from the bacterial gene pool through the concerted activities of mobile genetic elements, including insertion sequences, transposons, gene cassettes/integrons, plasmids, and integrative conjugative elements [48]. Together, these elements can facilitate horizontal genetic exchange, therefore promoting the acquisition and spread of ARGs. The Bacterial Antimicrobial Resistance Reference Gene Database used in this study contains the total complement of known ARGs, not just those in Salmonella. Meanwhile, this approach permits the identification of ARGs that are able to cross species and ecological barriers. Interestingly, for isolate 43 with 3 plasmids integrated into the chromosome, its ARG-rich region was flanked by several CRISPRs, cas genes, and prophage regions (Fig 2). Multiple reports have demonstrated that CRISPR-Cas systems may play a major role in controlling horizontal gene transfer through mobile genetic elements such as plasmids and bacteriophages, which consequently protect the dynamics of ARG acquisition in bacteria [49]. DiMarzio et al. [50] observed an association between CRISPR-multi-virulence-locus sequence typing and AMR in S. Typhimurium isolates, but exceptions also existed under some conditions. Previous studies have also demonstrated that antibiotics such as carbadox and fluoroquinolones induced prophages that were integrated into the chromosome of S. Typhimurium, and also facilitated horizontal gene transfer from multidrug-resistant S. Typhimurium to a susceptible bacterial host strain [51,52]. 
With the advent of Nanopore sequencing technology which enables the sequencing and assembly of complete bacterial genomes, it is becoming increasingly feasible to further explore the correlations between AMR and CRISPRs, cas genes, and mobile genetic elements. Nanopore long reads enable the characterization of mobile genetic elements on which key AMR determinants are located and also identify the combination of different AMR determinants co-located on the same mobile genetic element.
Despite some remaining limitations, the information provided by MinION and its combination with MiSeq will largely enhance the monitoring of ARGs circulating among humans, animals, foods, and environments. Nanopore sequencing technology has particular potential for rapid AMR genotyping because sequence data become readily available within minutes of starting the sequencing run. Previous studies have demonstrated the "streaming" genome-based prediction of bacterial AMR phenotype using MinION, where the AMR profile was acquired in real-time as the sequence data were produced by the device [53,54]. Correspondingly, MinION can help to identify emerging AMR hazards more quickly and to implement timely control strategies designed to mitigate potential risks to public health.

Virulence genes and SPIs
In most cases, virulence genes in the MinION, MiSeq, and hybrid assemblies of each isolate were consistent (Table 6). The MinION assemblies included all virulence genes that were present in the MiSeq and hybrid assemblies, confirming that MinION can facilitate a rapid and accurate assessment of virulence potential by detecting specific virulence genes. Most notably, the MinION assembly of isolate 53 harbored the rck gene (94.1% coverage, 100% identity), which is responsible for resistance to complement killing, whereas this gene was absent in the corresponding MiSeq assembly. The rck gene was also detected in the hybrid assembly of this isolate, demonstrating the utility of the MinION long reads when running the Unicycler hybrid assembly method. Similarly, when González-Escalona et al. [55] compared MiSeq, MinION, and PacBio assemblies of three clinical and environmental isolates of Shiga toxin-producing Escherichia coli (STEC), the MinION assemblies provided sufficient data to cover all virulence genes, consistent with the PacBio assemblies. However, several virulence genes were not detected in some MiSeq assemblies, pointing to the library preparation as being responsible for the loss of these genetic regions. Interestingly, we noticed that the shdA gene, which is involved in synthesizing an AIDA autotransporter-like protein, was identified in both the MinION (99.0% coverage, 96.0% identity) and MiSeq (63.1% coverage, 94.6% identity) assemblies of isolate 95 but was absent in its hybrid assembly. This discrepancy could be attributed to frameshifts introduced during the Unicycler hybrid assembly.
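Whether a gene is reported as present depends on the coverage and identity cutoffs applied to each database hit. The following Python sketch illustrates this using the three hits quoted above. The cutoff values passed to `called_genes` are hypothetical and are not the settings used in this study, and `GeneHit`, `passes`, and `called_genes` are illustrative names rather than part of any published tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneHit:
    """One database hit for a virulence gene in one assembly."""
    gene: str
    assembly: str
    coverage_pct: float
    identity_pct: float


# Hits quoted in the text for isolates 53 and 95.
HITS = [
    GeneHit("rck", "isolate 53 MinION", 94.1, 100.0),
    GeneHit("shdA", "isolate 95 MinION", 99.0, 96.0),
    GeneHit("shdA", "isolate 95 MiSeq", 63.1, 94.6),
]


def passes(hit: GeneHit, min_coverage: float, min_identity: float) -> bool:
    """Return True if the hit meets both cutoffs (inclusive)."""
    return hit.coverage_pct >= min_coverage and hit.identity_pct >= min_identity


def called_genes(hits, min_coverage: float, min_identity: float):
    """List (gene, assembly) pairs that would be reported as present."""
    return [(h.gene, h.assembly) for h in hits
            if passes(h, min_coverage, min_identity)]


if __name__ == "__main__":
    # Hypothetical cutoffs: a permissive and a stricter coverage threshold.
    for min_cov in (60.0, 80.0):
        print(min_cov, called_genes(HITS, min_coverage=min_cov, min_identity=90.0))
```

Under the permissive hypothetical cutoff all three hits are reported, whereas raising the coverage cutoff to 80% drops the partial shdA hit in the MiSeq assembly of isolate 95, showing how threshold choice can contribute to apparent discrepancies between assemblies.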
Overall, the MinION, MiSeq, and hybrid assemblies possessed similar profiles of SPIs for each isolate (Table 5 and S6 Table). Nevertheless, some discrepancies were observed between the MinION and MiSeq assemblies. SPI-4 and SPI-12 were absent in the MiSeq assemblies of S. Typhimurium and Enteritidis, respectively, but identified in most MinION assemblies. These SPIs may be located at particularly repetitive or bias-prone regions such that they were omitted from the MiSeq assemblies, while they were present in the MinION assemblies that are less sensitive to these issues. SPI-1 was absent in some MinION assemblies but was detected in the MiSeq assemblies. The Unicycler hybrid assembly took into consideration both long-read and short-read data. For example, SPI-4 and SPI-1 were identified in the MinION and MiSeq assemblies respectively for isolate 45, whereas both of these SPIs were present in its hybrid assembly. Although SPI-1 was present in either the MinION or MiSeq assemblies of several S. Indiana isolates, it was absent in their hybrid assemblies. For isolate 106, SPI-4 was not identified in either the MinION or MiSeq assembly but was present in its hybrid assembly. Technological improvements in genome assemblers are therefore necessary to ensure raw sequence data are correctly assembled. Furthermore, enhanced genome assembly is also essential to fully understand and exploit complex genomic features such as SPIs. As more WGS data of Salmonella become publicly available, genomic analysis can provide a more comprehensive insight into the distribution, diversity, and host specificity of SPIs. Such information not only allows us to identify highly virulent strains and serotypes but also helps us to understand the evolution of Salmonella pathogenicity.

Whole-genome phylogenetic analysis
As shown in Fig 3, the distance from the MinION assembly clade to the other two clades was greater than the distance between the MiSeq and hybrid assembly clades. The lower read accuracy of MinION sequencing may have negatively affected the correct clustering of Salmonella isolates. The genetic relationships between the MiSeq and hybrid assemblies of each isolate were more concordant on the whole-genome phylogenetic tree, suggesting that the Unicycler hybrid assembly can be an effective strategy to generate genome assemblies that are both accurate and contiguous. Our overall finding was consistent with the work of González-Escalona et al. [55], which reported that the MinION assemblies of STEC had many errors relative to high-quality MiSeq and PacBio assemblies. Without polishing with the MiSeq short reads of STEC, the MinION assemblies could not be correctly placed on the whole-genome phylogenetic tree. S. Typhimurium LT2 ASM694v2 served as the reference genome for single nucleotide polymorphism (SNP) calling.
Because high-quality reference genomes are not always available, the MiSeq assemblies were considered the "gold standard" for assessing the accuracy of the MinION and hybrid assemblies in our study [55,56]. Similarly, in the study of González-Escalona et al. [55] on the MinION assemblies of three E. coli O26:H11 strains, the MiSeq or HiSeq assemblies of 155 E. coli O26:H11 strains were considered the "gold standard" for accurate genome sequence determination and SNP analyses. Taylor et al. [56] also used the MiSeq assembly as the reference genome for an E. coli O157:H7 isolate lacking a published reference genome when evaluating Nanopore sequencing technology for rapid phylogenetic inference. Although most MinION assemblies were more contiguous than the MiSeq assemblies, SNPs remained problematic in de novo assemblies generated from our MinION long reads. As indicated by our SNP analysis (S7 Table), 0.54-1.38 SNPs per kbp were detected in the MinION assemblies, a level that can prevent accurate phylogenetic analysis because of errors in gene structure prediction. By combining short and long reads, the Unicycler hybrid assembly reduced the number of SNPs to a much lower level (<0.1 SNPs per kbp). All genome assemblies of the same serotype clustered together, with S. Typhimurium and Enteritidis being genetically closer to each other and distinct from S. Indiana, implying that even the higher error rates of the MinION assemblies did not obscure serotype-level phylogenetic differences. Continued improvements in nanopore chemistry, as well as in downstream base-calling and assembly, may mitigate the high numbers of SNPs. The potential application of MinION for epidemiological tracing during foodborne outbreaks remains to be validated using more Salmonella isolates from diverse sources.
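To put the reported SNP densities in perspective, the Python sketch below converts between a SNP count and a density in SNPs per kbp. The 4.8 Mbp assembly length is an assumed round figure chosen only for illustration and is not a measured value from this study; the function names are likewise hypothetical.

```python
def snps_per_kbp(snp_count: int, assembly_length_bp: int) -> float:
    """SNP density: number of SNPs per 1,000 bp of assembly."""
    if assembly_length_bp <= 0:
        raise ValueError("assembly length must be positive")
    return snp_count / (assembly_length_bp / 1000)


def implied_snp_count(density_per_kbp: float, assembly_length_bp: int) -> int:
    """Approximate SNP count implied by a density for a given assembly length."""
    return round(density_per_kbp * assembly_length_bp / 1000)


if __name__ == "__main__":
    ASSUMED_LENGTH_BP = 4_800_000  # assumption: illustrative genome size only
    # Lower and upper MinION densities from the text, plus the hybrid bound.
    for density in (0.54, 1.38, 0.1):
        print(density, implied_snp_count(density, ASSUMED_LENGTH_BP))
```

Under this assumed length, the MinION range of 0.54-1.38 SNPs per kbp corresponds to thousands of SNPs per assembly, whereas the hybrid bound of <0.1 SNPs per kbp corresponds to fewer than a few hundred.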

Pan-genome analysis
The pan-genomes based on the genome annotations of the MiSeq and hybrid assemblies had similar numbers of core and accessory genes, as well as similar presence/absence matrices of core and accessory genes, and both differed significantly from the pan-genome of the MinION assemblies (Fig 4). The pan-genome of the MiSeq assemblies of the 25 Salmonella isolates consisted of 7,341 genes, with 3,729 core genes (50.8%) and 3,612 accessory genes (49.2%), and that of the hybrid assemblies consisted of 7,606 genes, with 3,762 core genes (49.5%) and 3,844 accessory genes (50.5%). In contrast, the total number of genes in the pan-genome of the MinION assemblies was significantly higher (40,299), with as many as 39,815 accessory genes (98.8%) and only 484 core genes (1.2%). Based on our core-genome phylogenetic analyses, the MinION assemblies on the core-genome phylogenetic tree were more genetically isolated from one another than the MiSeq and hybrid assemblies were on their corresponding trees (trees are displayed on the left side of each matrix) (Fig 4), which was congruent with our pan-genome analyses. This is likely because MinION sequencing errors can introduce a premature stop codon, followed by a spurious start codon, which splits a single gene into truncated fragments and artificially increases gene counts in these assemblies [36]. High numbers of errors may, to some extent, interfere with high-quality genome annotation by reducing the accuracy of gene prediction and producing a large number of misannotated gene structures. Nonetheless, it should be noted that the hybrid assemblies possessed higher numbers of core and accessory genes than the MiSeq assemblies, suggesting that the MinION long reads contributed these genes to the hybrid assemblies. The MinION assemblies may contain some genes that were unique to those particular genomes but were not annotated in the highly fragmented MiSeq assemblies because of the deficiencies that significantly lower the informational value of draft-quality genomes generated using short reads (S1-S3 Files). In a study by Jacobsen et al. [57], a comparative genomic analysis of 35 Salmonella genomes revealed that the addition of a fragmented genome can affect the size of the core and pan-genome proportionally more than the addition of a completed genome. Furthermore, since incomplete genomes may not always contain the full sequences of genes otherwise present, such truncated genes might erroneously be identified as novel gene families. Our finding in the current study thus further demonstrated that there was a tradeoff between assembly contiguity and annotation accuracy during the Unicycler hybrid assembly. Future research is necessary to specify the genes that were present and absent in the pan-genome of the MinION assemblies relative to the MiSeq and hybrid assemblies.
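The core and accessory proportions quoted above follow directly from the reported gene counts. The short Python sketch below recomputes them as a consistency check; the dictionary layout and the function name `proportions` are illustrative choices, not part of the pan-genome software used in this study.

```python
# Core and accessory gene counts reported in the text for each assembly type.
PAN_GENOMES = {
    "MiSeq": {"core": 3729, "accessory": 3612},
    "hybrid": {"core": 3762, "accessory": 3844},
    "MinION": {"core": 484, "accessory": 39815},
}


def proportions(core: int, accessory: int):
    """Return (total genes, core %, accessory %), percentages to 1 decimal."""
    total = core + accessory
    if total == 0:
        raise ValueError("pan-genome contains no genes")
    return total, round(100 * core / total, 1), round(100 * accessory / total, 1)


if __name__ == "__main__":
    for name, counts in PAN_GENOMES.items():
        total, core_pct, acc_pct = proportions(counts["core"], counts["accessory"])
        print(f"{name}: {total} genes, {core_pct}% core, {acc_pct}% accessory")
```

The recomputed totals and percentages agree with the values reported for all three pan-genomes and make the contrast explicit: the MinION pan-genome contains more than five times as many genes as either of the other two, almost all of them classified as accessory.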

Conclusions
We used MiSeq and MinION sequencing technologies, both individually and in combination, for the genomic analyses of 25 phenotypically multidrug-resistant isolates of S. Indiana, Typhimurium, and Enteritidis. A series of bioinformatic tools was used to translate raw sequence data into comprehensive genetic information. The MiSeq assemblies struggled to resolve genomic repeats and GC-rich regions, preventing assembly into complete genomes. Tradeoffs existed between the high contiguity of the MinION assemblies and their high numbers of errors, as highlighted by our whole-genome phylogenetic and pan-genome analyses. Minimizing such errors when using Nanopore sequencing technology is thus warranted. Our study validated a framework to overcome these limitations by combining the MinION long reads with the high-accuracy MiSeq short reads. The hybrid approach typically generated assemblies that were both contiguous and amenable to accurate annotation of complex genomic features. MinION enabled rapid, genome-based detection and characterization of ARGs and virulence factors in Salmonella, although notable false negatives of tetracycline resistance were observed in some MinION assemblies. As nanopore chemistry and its associated bioinformatic tools continue to evolve and improve, this long-read WGS technology, coupled with its increasing cost-effectiveness, shows promise for providing sufficient data to complement current WGS technologies for epidemiological inference and foodborne outbreak tracing.
Supporting information S1