Molecular and Genomic Characterization of Vibrio mimicus Isolated from a Frozen Shrimp Processing Facility in Mexico

Vibrio mimicus is a gram-negative bacterium responsible for diseases in humans. Three strains of V. mimicus, identified as V. mimicus 87, V. mimicus 92 and V. mimicus 93, were isolated from a shrimp processing facility in Guaymas, Sonora, Mexico. The strains were analyzed using several molecular techniques, and according to the cluster analysis they were different, with similarities ranging between 51.3% and 71.6%. ERIC-PCR and RAPD (vmh390R) were the most discriminatory molecular techniques for differentiating these strains. The complete genomes of two strains (V. mimicus 87, renamed CAIM 1882, and V. mimicus 92, renamed CAIM 1883) were sequenced. The genome size was 3.9 Mb in both strains, with 2.8 Mb in ChI and 1.1 Mb in ChII. A 12.7% difference was found in the proteome content (BLAST matrix). Several virulence genes were detected (e.g., capsular polysaccharide, an accessory colonization factor and genes involved in quorum sensing) and were classified into 16 categories. Variations in gene content between these genomes were observed, mainly in proteins and virulence genes (e.g., hemagglutinin, mobile elements and membrane proteins). According to these results, the two strains were different even though they came from the same source, giving an insight into the diversity of V. mimicus. The identification of various virulence genes, including a gene not previously reported in V. mimicus (acfD) that was found in ChI of all sequenced strains, supports the pathogenic potential of this species. Further analysis will help to fully understand their potential virulence, environmental impact and evolution.


Introduction
Generally, Vibrio species are accountable for the diseases associated with the natural bacterial flora of aquatic environments or seafood [1]. Several species of this genus can produce illness in humans, mainly Vibrio cholerae, Vibrio parahaemolyticus and Vibrio vulnificus.
V. mimicus isolates were confirmed by the amplification of the hemolysin (vmh) gene [27,28] using the following primers: vmh390F (GGTAGCCATCAGTCTTATCACG) and vmh390R (ATCGTGTCCCAATACTTCACCG). PCR amplification was performed in a thermal cycler (Perkin Elmer 480) with the following temperature profile: an initial denaturation at 95°C for 5 min; 35 cycles of 30 s at 95°C, 30 s at 53°C and 30 s at 72°C; and a final extension at 72°C for 5 min. Amplified DNA was visualized using agarose gel electrophoresis (1.8%).

Molecular analysis
PCR-confirmed V. mimicus isolates were analyzed using the following molecular methodologies.
Amplified ribosomal DNA restriction analysis (ARDRA). PCR amplification of the 16S rRNA gene using the universal primers 27-F (AGAGTTTGATCCTGGCTCAG) and 1492-R (GGTTACCTTGTTACGACTT) was performed. PCR amplification was done in a thermal cycler (Perkin Elmer 480) with the following temperature profile: an initial denaturation at 96°C for 5 min; 36 cycles of 1 min at 94°C, 1 min at 50°C and 1 min at 72°C; and a final extension at 72°C for 5 min. PCR products were visualized by electrophoresis in a 1.2% agarose gel. Afterwards, the amplified products were treated with the following restriction enzymes: CfoI, HaeIII, HinfI, HapII and AluI. Ten microliters of PCR product were digested with restriction enzymes (1 μL of restriction enzyme, 2 μL of specific buffer 10X, 1 μL of albumin and 9 μL of sterile water) and incubated at 37°C for 18 h according to the manufacturer's instructions. The digested products were visualized by electrophoresis in an agarose gel (1.8%) at 95 V for 90 min.
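To illustrate how ARDRA produces a band pattern, the following is a minimal in silico sketch of a restriction digest. It is not part of the reported workflow. The recognition sites and cut positions are the commonly cited ones for these enzymes (CfoI and HapII being isoschizomers of HhaI and HpaII) and should be checked against the supplier's documentation; the example sequence is hypothetical and far shorter than a real 16S rRNA amplicon.

```python
import re

# Recognition site (5'->3') and cut offset within the site on the top strand.
# Assumed from standard enzyme references; verify before relying on them.
ENZYMES = {
    "CfoI": ("GCGC", 3),    # GCG^C
    "HaeIII": ("GGCC", 2),  # GG^CC
    "HinfI": ("GANTC", 1),  # G^ANTC
    "HapII": ("CCGG", 1),   # C^CGG
    "AluI": ("AGCT", 2),    # AG^CT
}


def fragment_lengths(seq, enzyme):
    """Return fragment lengths (5'->3' order) after a complete digest of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile("(?=" + site.replace("N", "[ACGT]") + ")")
    cuts = [m.start() + offset for m in pattern.finditer(seq.upper())]
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]


# Hypothetical 16-nt sequence containing one HaeIII site and one AluI site.
example = "AAGGCCTTTTAGCTAA"
haeiii_fragments = fragment_lengths(example, "HaeIII")
alui_fragments = fragment_lengths(example, "AluI")
cfoi_fragments = fragment_lengths(example, "CfoI")
```

All five sites are palindromic, so scanning a single strand is sufficient. Differences in the fragment-length lists between isolates correspond to the differences in band patterns seen on the gel.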
Random Amplified Polymorphic DNA PCR (RAPD). Two different primers were used: OPI-3 (5'CAGAAGCCCA3') and vmh390R (ATCGTGTCCCAATACTTCACCG). The PCR amplifications were performed in a thermal cycler (Perkin Elmer 480). With the OPI-3 primer, the following temperature profile was used: an initial denaturation at 94°C for 4 min; 30 cycles of 1 min at 94°C, 1 min at 37°C and 1 min at 72°C; and a final extension at 72°C for 6 min. For the vmh390R primer, the same program described by Bi et al. (2000) was used. PCR products were visualized by electrophoresis in an agarose gel (1.8%) at 95 V for 90 min.
Fingerprint analysis. Gel digital images were captured with Image Lab software (BIO-RAD, Molecular imager: Gel Doc TM XR+) and were analyzed with BioNumerics software (Applied Maths, Inc.). A cluster analysis was performed by calculating a similarity/distance matrix with DICE, a similarity coefficient based on band presence or absence. The resulting similarity matrix was then converted into a dendrogram with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) clustering algorithm, with optimization set at 1%.
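The fingerprint comparison can be illustrated with a minimal sketch of the Dice coefficient followed by UPGMA (average-linkage) clustering. This is not the BioNumerics implementation: band matching here is by exact size rather than with a position tolerance or optimization setting, and the band sizes below are hypothetical values chosen for illustration, not the bands observed in this study.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity between two sets of band sizes: 2*shared / (n_a + n_b)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def upgma(names, sim):
    """Average-linkage clustering on pairwise similarities.

    sim maps frozenset({x, y}) -> similarity. Returns the merges as
    (cluster_a, cluster_b, similarity) tuples in the order they were joined.
    """
    clusters = [(n,) for n in names]
    merges = []

    def average(c1, c2):
        # Mean similarity over all member pairs, which is the UPGMA criterion.
        total = sum(sim[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: average(*pair))
        merges.append((a, b, average(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical band sizes (bp) for three isolates.
bands = {
    "Vm87": {300, 500, 800, 1200},
    "Vm92": {300, 700, 1500},
    "Vm93": {300, 500, 800, 2000},
}
similarity = {
    frozenset((x, y)): dice(bands[x], bands[y])
    for x, y in combinations(bands, 2)
}
merges = upgma(list(bands), similarity)
for a, b, s in merges:
    print(a, b, round(100 * s, 1))
```

Each merge corresponds to one node of the dendrogram, placed at the similarity at which the two clusters join.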

Genome Sequence
Based on the molecular characterization, two different V. mimicus strains were sequenced using a semiconductor NGS platform (Ion Torrent Personal Genome Machine sequencer, Life Technologies) with a 316 chip at CIAD Mazatlán. The reads were assembled with the program Newbler ver. 2.3, mapping against CAIM 602T [31]. A genome-wide alignment and contig synteny were constructed with Mauve Genome Alignment software ver. 2.3.1 [32] using V. mimicus 451 as a reference strain [16]. The contigs were further reassembled with Geneious R6 ver. 6.0.3 (Biomatters Ltd) to obtain two chromosomes (ChI and ChII). Original contigs were annotated by RAST [33] (http://rast.nmpdr.org/) and by NCBI (http://www.ncbi.nlm.nih.gov). A database was constructed with the annotation information to study the differences and similarities between the genomes.

Comparative Microbial Genomics (CMG)
The genomes of V. mimicus were analyzed with CMG-Biotools [35] to obtain a BLAST matrix. A BLAST hit was considered significant if at least 50% of the alignment consisted of identical matches and the length of the alignment was at least 50% of the longest gene.
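The significance criterion can be written as a short predicate. This is a sketch, not CMG-Biotools code: the function and parameter names are hypothetical, and the two 50% thresholds are interpreted as minimum values.

```python
def is_significant_hit(identical_matches, alignment_length,
                       query_length, subject_length,
                       min_identity=0.5, min_coverage=0.5):
    """Apply the two-threshold rule to a single BLAST alignment.

    identity = identical matches / alignment length
    coverage = alignment length / length of the longer of the two genes
    Both must reach their threshold (assumed here to be inclusive).
    """
    if alignment_length <= 0:
        return False
    identity = identical_matches / alignment_length
    coverage = alignment_length / max(query_length, subject_length)
    return identity >= min_identity and coverage >= min_coverage


# Hypothetical alignments (all lengths in residues):
print(is_significant_hit(120, 200, 300, 350))  # 60% identity, 57% coverage
print(is_significant_hit(90, 200, 300, 350))   # 45% identity: fails
print(is_significant_hit(150, 160, 300, 400))  # 40% coverage: fails
```

Requiring coverage relative to the longer gene prevents a short, highly conserved domain from counting as a homologous protein on its own.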

Results
All three isolates showed the 390 bp fragment corresponding to the vmh gene, and were thus confirmed as V. mimicus. The gels obtained from the different molecular fingerprinting methodologies (ERIC-PCR, RAPD, GTG-5 and ARDRA) showed different patterns between isolates and the type strain (CAIM 602T). The dendrograms from the cluster analysis of the gels from each methodology were obtained (Fig 1).
With ERIC-PCR, the strains yielded 5 to 13 amplified products ranging in size from approximately 0.1 kb to 4.0 kb, and the similarities ranged between 30% and 66% among the isolates. With RAPD-vmh390R, the strains produced 3 to 5 bands ranging in size from approximately 0.3 kb to 2.0 kb, and the similarities ranged between 36% and 50%. With RAPD-OPI-3, the strains yielded 4 to 6 amplified products ranging in size from approximately 0.2 kb to 3.0 kb, and the similarities ranged between 67% and 80% among the isolates. With ARDRA, CfoI and HapII were the most discriminatory enzymes; with both enzymes, the strains generated 3 to 4 fragments, ranging in size from approximately 0.2 kb to 0.9 kb for CfoI and from 0.1 kb to 0.6 kb for HapII, and the similarity among isolates ranged between 0% and 35% for CfoI and between 10% and 26% for HapII. With GTG5-PCR, no differences among these isolates were found. In addition, a composite dendrogram and a similarity matrix were obtained by a cluster analysis of all the band patterns obtained from the different methodologies used (Fig 2). From these data, Vm 92 was the most different from the others, sharing only 54.5% similarity with Vm 87. This molecular characterization confirmed that the three isolates were indeed different strains of V. mimicus. Two of them (i.e., Vm 87 and Vm 92) were selected for sequencing so that their pathogenic potential could be estimated. These strains were deposited at the Collection of Aquatic Important Microorganisms (CAIM: www.ciad.mx/caim) and were registered as CAIM 1882 (Vm 87) and CAIM 1883 (Vm 92). Both genomes were deposited in GenBank under accession numbers PRJNA219179 and PRJNA219181, respectively.
The sequencing yielded a total of 2,932,276 reads (mean length 183 bp) for a total of 540 Mb and an average coverage of 125.6X for CAIM 1882. For CAIM 1883, a total of 2,773,308 reads were obtained (mean length 193 bp) for a total of 536 Mb and an average coverage of 124.5X. The reads were assembled, and 434 contigs for CAIM 1882 (N50 40,957 bp, 92.58% reads mapped) and 455 contigs for CAIM 1883 (N50 41,061 bp, 92.76% reads mapped) were obtained. A summary of the general features of the assembled genomes of CAIM 1882 and CAIM 1883, by chromosome, is shown in Table 1. A genome atlas was generated with the GeneWiz browser, in which the differences between the genomes of V. mimicus CAIM 602, CAIM 1882, CAIM 1883 and V. mimicus 451 can be visualized (Fig 3). Greater variability was observed in ChII than in ChI. Some of the differences found in ChI were in genes such as type IV fimbrial assembly (PilB), zona occludens toxin, accessory cholera enterotoxin, type II restriction enzymes, phage integrase, polysaccharide biosynthesis proteins, transcriptional regulators, and hypothetical proteins. In ChII, differences were found in genes such as membrane proteins, transcriptional regulators, flagellar proteins, integron integrase, transport proteins, mobile elements, multidrug resistance efflux pumps, and hypothetical proteins.
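For readers unfamiliar with the N50 statistic reported for the assemblies, the following minimal sketch shows how it is computed from a list of contig lengths. The contig lengths in the example are hypothetical and are not those of CAIM 1882 or CAIM 1883.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Hypothetical assembly of six contigs (bp); total = 290, half = 145.
example_contigs = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_contigs)  # 80 + 70 = 150 >= 145, so N50 = 70
```

A higher N50 for a similar total assembly size indicates a less fragmented assembly.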
A proteome comparison was obtained with the four genomes of V. mimicus (> 50% homology) and represented in a BLAST matrix (Fig 4). The homology between the proteomes ranged from 75.8% to 87.3%, with differences observed between clinical and environmental strains, and even between the environmental strains. The homology within proteomes (number of proteins that have homologous hits within the proteome itself) ranged between 1.5% and 2.4%, with the environmental strains (CAIM 1882 and CAIM 1883) showing the lowest percentages.
A summary of the virulence genes found by category in each strain, according to the classification used by Kimes et al. [36], is shown in Table 2. CAIM 1882 showed slightly more virulence genes than CAIM 1883. In chromosome I (ChI), the category with the highest number of genes was type I secretion proteins, followed by flagellar proteins, extracellular components, potential regulators and type IV secretion proteins. In chromosome II (ChII), the main categories were type I secretion proteins, chemotaxis proteins, antibiotic resistance proteins and iron uptake.
Moreover, examples of genes encoding virulence factors detected in ChI in both strains were hemolysins, proteases, outer membrane proteins (OmpU, OmpT, OmpK and OmpV), a type IV and MSHA pilus, an aerobactin siderophore, a capsular polysaccharide, an accessory colonization factor (acfD), the transmembrane regulatory protein ToxS, the transcriptional activator ToxR and a quorum-sensing regulation system (LuxS, LuxO, LuxR). In addition, some proteins and phage shock proteins were found. In ChII of both strains, examples of genes coding for virulence factors were metalloproteases, chemotaxis proteins, various hemolysins (e.g., cytolysin and hemolysin HlyA, thermolabile hemolysin precursor and thermostable hemolysin delta-VPH), a putative phosphatase, an adhesin, a chitinase, and type II, type IV and type VI secretion system proteins. Additionally, genes involved in quorum sensing (autoinducer 2 (AI-2), CAI-1 autoinducer synthase, sensor histidine kinase CqsS) and various hypothetical proteins were found. Both strains also possessed an integron and two prophages in ChII, but no plasmids were detected in either of them.
Based on the virulence genes detected for V. mimicus (CAIM 1882 and 1883), 28 core genome genes of clinical importance in both chromosomes were selected for a phylogenetic study (Fig 5). As expected, a similar cluster pattern was observed for the maximum likelihood (ML) and neighbor-joining (NJ) trees, and both environmental strains were grouped together.
Furthermore, the complete list of gene variations in each strain is shown in Table 3. In ChI few differences were found, which included differences in virulence genes and hypothetical proteins; while in ChII, more genes were detected in CAIM 1882 than in CAIM 1883, and the differences were mainly in virulence genes, transcriptional regulators and hypothetical proteins.

Discussion
In recent years, molecular techniques have been used to study bacteria and to establish phylogenetic relationships among them. This study employed various molecular methodologies to determine the differences between three strains of V. mimicus isolated from the washing step in a shrimp processing facility in Sonora, Mexico. The vmh gene is common to V. mimicus and is a useful marker for species identification [27,28]. Different studies have used some of these methods, but the simultaneous use of all these techniques to study a single species and establish which method is most useful for detecting genetic variation has not been reported before [37][38][39][40]. In this study, ERIC-PCR and RAPD (vmh390R) were the most discriminatory techniques, yielding the lowest similarity percentages between the analyzed strains, while GTG5-PCR was the least discriminatory method. ERIC-PCR has been reported for the study of several bacteria, such as V. cholerae, where different patterns were detected, with up to 8 amplification products [38,41]. This is the first report in which ERIC-PCR was used for the analysis of V. mimicus strains; by yielding up to 13 amplified products, this method proved to be a powerful tool for the study of this species and can be used for the detection of genetic variability. Bi et al. [37] used the arbitrarily primed polymerase chain reaction (AP-PCR) to study several clinical strains of V. mimicus, and when comparing their results with those of this study, a similar number of band patterns was found, except that their amplified products were larger. It is worth mentioning that the strains used in the previous study were clinical, while the strains in this study were environmental.
Additionally, it is known that there are differences between the genomes of clinical and environmental strains of this species [16], so these differences may explain the differences in the number of amplified products. Urakawa et al. [39] performed an RFLP analysis of the 16S rRNA genes of several species of Vibrio, including the V. mimicus type strain (CAIM 602T). Their study used several restriction enzymes, including HaeIII, HinfI and AluI, but the results obtained with these enzymes were omitted because they were ambiguous. In this study, these same enzymes also showed a low level of differentiation between strains compared with CfoI and HapII, which better differentiated the strains isolated from the frozen shrimp processing facility. This is the first report in which the enzymes CfoI and HapII were used for the study of strains of V. mimicus. However, it was not possible to differentiate between V. mimicus CAIM 602T and V. mimicus 93 with these enzymes. In addition, the cluster analysis that combined all the methodologies (ERIC-PCR, RAPD, GTG-5 and ARDRA) confirmed that the three analyzed strains were different even though they were isolated from the same source. It is also important to consider that each method provides a different level of resolution, and the choice of method should depend on the resolution desired.
Genome sequencing is opening a door to the understanding of bacterial pathogenesis and the study of evolutionary history [42]. This study found differences in gene content between the two environmental strains of V. mimicus (CAIM 1882 and CAIM 1883). Those differences were mostly in ChII and involved proteins and virulence genes. Furthermore, the proteome matrix points out differences between these genomes, even though the strains were obtained from the same niche at the same time. These findings give an outlook on the genomic diversity of V. mimicus.
These environmental genomes presented several virulence genes that were classified into 16 categories, with type I secretion system proteins as the main category. Additionally, genes for type IV pilus proteins (MSHA pilus) and extracellular components such as the capsular polysaccharide (CPS) were present, which have been reported in few studies of V. mimicus [13,16]. Furthermore, this is the first report of an accessory colonization factor (acfD gene) in V. mimicus. This factor is also present in other Vibrio species, such as V. cholerae, V. fischeri, V. vulnificus and V. parahaemolyticus; the acf genes have been documented to be required for efficient intestinal colonization, and disruption of any of the four acf genes (acfA, B, C, D) can reduce the ability to colonize [43][44][45][46][47][48]. Sultan et al. [49] found AI-2 activity and the luxS, luxO and luxR genes in clinical strains of V. mimicus, and in this study the same genes were found in environmental strains of V. mimicus. However, no assays were performed, so their exact roles remain to be determined. Also, various genes found in these strains, such as the aerobactin siderophore, the ferric aerobactin receptor, ToxR, ToxS and several others, have been previously reported in other strains of V. mimicus [27,50,51]. Recently, the adhesion function of the OmpU protein of V. mimicus was reported [52]. Moreover, various virulence genes found in this study (e.g. OmpU, acfD, MSHA pilus, δ- this food can be consumed raw. In addition, the identification of several virulence genes in these strains that are also present in other Vibrio species supports previous studies suggesting that V. mimicus could be a gene reservoir in the environment for Vibrio species [19,20].
In conclusion, the information described in this study substantiates the importance and effectiveness of molecular and genomic methodologies for studying and characterizing microorganisms. Differences between the analyzed environmental strains were found, even though they came from the same source, offering a perspective on the genomic diversity of V. mimicus. Several virulence genes were identified, supporting the pathogenic potential of this species, and a phylogenetic relationship between some core genome virulence genes was established. However, more studies are required to determine whether these virulence genes affect the health of shrimp consumers. Further studies of genes of clinical importance will help to better understand their potential virulence, environmental impact and evolution.