Comparative Genomic Characterization of Three Streptococcus parauberis Strains in Fish Pathogen, as Assessed by Wide-Genome Analyses

Streptococcus parauberis, the main causative agent of streptococcosis among olive flounder (Paralichthys olivaceus) in northeast Asia, can be divided into two distinct groups (type I and type II) by an agglutination test. Here, the whole genome sequences of two Japanese strains (KRS-02083 and KRS-02109) were determined and compared with the previously determined genome of a Korean strain (KCTC 11537). The genomes of S. parauberis are intermediate in size and have lower GC contents than those of other streptococci. We annotated 2,236 and 2,048 genes in KRS-02083 and KRS-02109, respectively. Our results revealed that the three S. parauberis strains contain different genomic insertions and deletions. In particular, the genomes of the Korean and Japanese strains encode different factors for sugar utilization: the former encodes the phosphotransferase system (PTS) for sorbose, whereas the latter encode proteins for lactose hydrolysis. In addition, KRS-02109, the type II strain, appears able to resist phage infection through the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas system, which might contribute to its serological distribution. Thus, our genome-wide association study shows that polymorphisms can affect pathogen responses, providing insight into biological/biochemical pathways and phylogenetic diversity.


Introduction
Streptococcosis of cultured fish contributes to major economic losses in the aquaculture industries of Israel [1], Italy [2], Korea [3], Japan [4], and the United States [5]. S. parauberis was first reported from turbot (Scophthalmus maximus) cultured in Spain [6]; since the turn of the present century, it has become an important pathogen in the aquaculture industries of Northeast Asia (Korea, Japan and China), especially among olive flounder aquaculture farms. Recently, Nho et al. [7] reported that S. parauberis is the dominant etiological agent of streptococcosis, which is characterized by clinical symptoms such as chronic wasting syndrome, hemorrhagic septicemia, exophthalmia and meningitis with abnormal swimming.
S. parauberis exhibits serological, genetic and biochemical variations within the species. Nho et al. [7] classified S. parauberis into three serotypes based on specific antigenic bands seen on Western blot analysis using anti-S. parauberis chicken IgY. In contrast, Kanai et al. [8] differentiated S. parauberis into two serotypes based on the chemical composition of a surface capsular polysaccharide. These variations may reflect an evolutionary trend, or may suggest that there is a relationship between serotype and virulence.
Comparative genome sequence analysis of bacteria can detect sequence diversity among distinct yet closely related populations. Such diversity among strains may be important for adaptation, and strain-specific genes are thought to reflect the physiological and virulence properties of an organism, with implications for disease epidemiology and for understanding evolutionary relationships [9,10]. These genetic variations can arise from gene mutations, insertions, deletions and/or genetic noise. Such analyses have been performed for various pathogenic bacteria, including Helicobacter pylori [11], Mycobacterium species [12], Escherichia coli [13], S. mutans [14] and the fish pathogen Edwardsiella tarda [15]. Moreover, the recent determination of the complete genomic sequence for S. parauberis revealed important information on bacterial diversity, functional characteristics, environmental adaptation, and virulence components [16].
Bacteria utilize diverse carbohydrate sources for various biosynthetic processes, particularly those involved in maintenance and reproduction. Many catabolic operons are subject to carbon catabolite repression (CCR) by rapidly metabolizable carbon sources, especially glucose [17]. In Gram-positive bacteria with low GC contents, however, a very different CCR pattern is used: for PTS sugar transport, HPr is phosphorylated at a histidine residue by EI-P, and HPr can also be phosphorylated at serine 46 (Ser-P) by a specific ATP-dependent protein kinase [18]. HPr (Ser-P) then interacts with CcpA, which binds to a conserved sequence in the promoter region called the catabolite-responsive element, thereby negatively controlling the expression of several carbohydrate catabolism-related enzymes [19,20].
Bacteria have evolved various mechanisms to defend themselves against viral predation. One such strategy involves the CRISPR/Cas systems, which are small RNA-based defense systems that provide adaptive, heritable immunity against viruses, plasmids and other mobile elements in archaea and bacteria [21]. The RNA and protein components of these immune systems arise from the CRISPR locus and the cas genes. The six cas genes (cas1-6) are present in a wide array of bacteria [21-24]. Most bacteria have some but not all of the cas genes; however, cas1 and cas2 appear to be universal and can therefore be used as genomic markers for the CRISPR/Cas system.
In this context, we determined the whole genome sequences of two strains of S. parauberis (KRS-02083 and KRS-02109, representing serotypes I and II, respectively) that were isolated in Japan in 2002 and classified using a rabbit antiserum-based agglutination test [8]. We compared the genome sequences, genomic structures, and gene variations of these strains with those of a reference strain of serotype I (KCTC11537) isolated from South Korea in 2006. Our results show that carbon source utilization and phage resistance differ according to geographical origin and serotype. These findings provide useful information on evolutionary events in the three S. parauberis strains and offer new insights into streptococcal species-specific survival and potential prophylactic strategies.

Preparation of genomic DNA
The two S. parauberis strains were cultured in THB at 25°C for 24 h, and genomic DNA was extracted using the Qiagen Genomic-tip 500/G kit (Qiagen, Hilden, Germany) and a genomic DNA buffer set (Qiagen), according to the manufacturer's instructions.

Whole-genome sequencing
The draft genomes of S. parauberis KRS-02083 and KRS-02109 were sequenced by the National Research Institute of Fisheries Science (Yokohama, Japan) using a Roche/454 GS-FLX™ system [25], and by Takara Bio, Inc. (Otsu, Japan) using an Illumina Genome Analyzer (GA) [26]. Large contigs were assembled using the Newbler de novo assembler package in the 454 GS-FLX™ system, and paired-end reads were assembled with respect to the reference genome using BWA-SAMtools [27] and Edena [28] for the Illumina GA data.

Annotation and comparative multiple alignment analyses
The KRS-02083 and KRS-02109 genome sequences were annotated using the automated prokaryotic annotation server, Rapid Annotations using Subsystems Technology (RAST; rast.nmpdr.org) [29], and identified by manual NCBI BLAST searches. We evaluated the annotation accuracy by comparing the RAST results from our large scaffolds and paired-end reads to the coding sequences (CDSs) of the published reference genome (KCTC11537). Comparative circular maps of the genomes were constructed using BRIG (BLAST Ring Image Generator) [30]. Homologies in the gene contents of the two genomes were identified using the comparative tools of the RAST algorithm, to allow for potential genome-to-genome annotation. CRISPR elements were identified and CRISPR-associated genes were annotated using CRISPRFinder [31] and our CDS results. Multiple alignments were performed using CLUSTALW [32], and various other comparisons and investigations were performed using the in silico molecular cloning tool, Genomic Edition Version 4.2.21 (In Silico Biology, Yokohama, Japan).
glucose utilization. PCR detection of the cas1 gene was also performed for identification of CRISPR/Cas. The 24 stored S. parauberis strains utilized in this PCR-based assay had been isolated from diseased olive flounder in Korea (six strains each of type I and type II) between 2006 and 2009, and in Japan (six strains each of type I and type II) between 2002 and 2004. The isolates were cultured in TSB (for the Korean strains) and THB (for the Japanese strains) at 25°C for 24 h, and DNA was extracted using an AccuPrep® Genomic DNA Extraction Kit (Bioneer, Korea) according to the manufacturer's instructions. PCR was performed in 20 μL reaction mixtures containing 1 μL template DNA, 0.05 μM of each primer (Bioneer), and the AccuPower® PCR premix (Bioneer). Amplification was performed using a C-1000™ thermal cycler (Bio-Rad).
The PCR conditions included an initial denaturation at 94°C for 3 minutes; 30 cycles of denaturation at 94°C for 30 seconds, annealing at 54°C for 30 seconds, and extension at 72°C for 30 seconds; and a final extension step at 72°C for 5 minutes. The resulting PCR products were analyzed on a 1.5% (w/v) agarose gel and stained with 1% (w/v) ethidium bromide. DNA bands were visualized using a Gel Dock system (ATTO E-Graph, AE-9000; Takara, Japan) equipped with a CS analyzer program.
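The cycling program described above can be written down as data to check the total programmed time. The following Python sketch does only that; the variable names are illustrative, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# Thermal cycling program as described in the text: (temperature in deg C, seconds).
initial_denaturation = (94, 3 * 60)
cycle_steps = [(94, 30), (54, 30), (72, 30)]  # denaturation, annealing, extension
n_cycles = 30
final_extension = (72, 5 * 60)

# Sum the programmed hold times only; instrument ramp times are not included.
cycle_seconds = sum(seconds for _, seconds in cycle_steps)
total_seconds = initial_denaturation[1] + n_cycles * cycle_seconds + final_extension[1]
total_minutes = total_seconds / 60

print(f"Programmed hold time: {total_minutes:.0f} min")
```

Under these conditions the programmed hold time is 53 minutes, excluding ramping.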

Analysis of growth curves and sugar utilization
KRS-02083, KRS-02109 and KCTC11537 were maintained as previously described [8,16]. Bacterial cells were cultivated in TYE broth (tryptone, 10 g; yeast extract, 5 g; and K2HPO4, 2 g per liter) [33] supplemented with 20 mM glucose and 25 mM of either lactose or L-sorbose, for the two Japanese strains and the Korean strain. The growth phenotypes of the strains were monitored using an xMark Microplate Absorbance Spectrophotometer (Bio-Rad, Hercules, CA, USA) at 37°C, with the optical density at 600 nm recorded every hour.
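Hourly OD600 readings of this kind are often summarized as a maximum specific growth rate. The sketch below is one simple way to do so and is not the analysis used in this study; the function name and the example readings are hypothetical, and the rate is taken as the largest change in ln(OD) between consecutive readings.

```python
import math

def max_specific_growth_rate(od_readings, interval_h=1.0):
    """Return the largest per-hour growth rate between consecutive OD600 readings."""
    rates = []
    for earlier, later in zip(od_readings, od_readings[1:]):
        if earlier > 0 and later > 0:
            rates.append((math.log(later) - math.log(earlier)) / interval_h)
    if not rates:
        raise ValueError("need at least two positive OD readings")
    return max(rates)

# Hypothetical hourly OD600 values, not measurements from this study.
example_od = [0.05, 0.06, 0.10, 0.20, 0.38, 0.55, 0.60]
mu_max = max_specific_growth_rate(example_od)
doubling_time_h = math.log(2) / mu_max
print(f"mu_max = {mu_max:.3f} per h, doubling time = {doubling_time_h:.2f} h")
```

Comparing this value between supplemented and unsupplemented broth gives a single number per strain and sugar, although a real analysis would usually smooth the curve or fit a growth model first.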

Nucleotide sequence accession numbers
The whole genome sequences of KRS-02083 and KRS-02109 have been deposited in the GenBank database under accession numbers ALYM00000000 and ALWR00000000, respectively.

General genomic features
De novo assemblies of the KRS-02083 and KRS-02109 genomes were generated using two next-generation sequencing technologies: the Illumina GA and GS-FLX pyrosequencing. The BRIG program [30] was used to compare the genomes of the S. parauberis strains via genomic alignment (Figure 1).
Reads of 650-750 bp and 580-660 bp were the dominant lengths in the GS-FLX results for KRS-02083 and KRS-02109, respectively. The Illumina GA results comprised more than 10,000 contigs, including numerous small contigs (≤ 500 bp) and various contigs of different sizes (≥ 500 bp, ≥ 1 kb, ≥ 2 kb and ≥ 4 kb) in both KRS-02083 and KRS-02109. The assembled sequence data for KRS-02083 and KRS-02109 yielded coverages of 65.5 and 71.6 times, respectively, from the paired-end mapped reads from GS-FLX, and 500 and 600 times, respectively, from the single reads produced using the Illumina GA. In terms of the raw data, the KRS-02083 and KRS-02109 genome sequences consisted of 55 and 31 contigs, respectively, generated by GS-FLX sequencing, and 265 and 192 contigs, respectively, generated using Illumina GA. The KRS-02083 genome was estimated to be 2,126,607 bp in length, with 88.61% nucleotide similarity to the reference genome and a GC content of 35.6%, while the KRS-02109 genome was estimated to be 2,084,517 bp in length, with 87.35% nucleotide similarity to the reference genome and a GC content of 35.5%. RAST annotation identified 2,236 and 2,048 CDSs in the KRS-02083 and KRS-02109 genomes, respectively. The KRS-02083 and KRS-02109 genome sequences had 13 and 6 gaps, respectively, ranging in length from 500 bp to 40 kb (Table 1).
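For readers unfamiliar with the summary statistics quoted above, the following Python sketch shows how GC content and mean coverage are conventionally computed. It is a minimal illustration, not the pipeline used in this study; the function names and the toy sequence are assumptions, and only the genome length and the 65.5-fold figure are taken from the text.

```python
def gc_content(sequence):
    """Percentage of G and C bases among the unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)

def mean_coverage(total_sequenced_bases, genome_length_bp):
    """Average depth: sequenced bases divided by genome length."""
    return total_sequenced_bases / genome_length_bp

# Toy sequence for illustration only; N (a gap placeholder) is ignored.
toy = "ATGCNNATTA"
print(f"GC content: {gc_content(toy):.1f}%")

# Bases needed for roughly 65.5-fold depth of the 2,126,607-bp KRS-02083 estimate.
genome_length = 2_126_607
bases_needed = 65.5 * genome_length
print(f"{bases_needed / 1e6:.1f} Mb of reads")
```

Ignoring ambiguous bases in the denominator is a design choice; it keeps gaps in a draft assembly from lowering the apparent GC content.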
Approximately 70.7% and 72.3% of the S. parauberis KRS-02083 and KRS-02109 proteins, respectively, were grouped into 27 functional groups, and the remaining 29.3% and 27.7%, respectively, were assigned to the "unknown function" group, which contained the highest proportion of annotated genes. An overview of the functional annotation in the two S. parauberis strains is shown in Table 2. The category of carbohydrate-related proteins contains the highest percentage of known genes from the RAST database, at 14.8% and 15.6% for KRS-02083 and KRS-02109, respectively. These proteins are important to the ability of the bacteria to survive in their host's complex environmental niche. The annotated genes of KRS-02083 and KRS-02109 also included genes involved in RNA metabolism (3.6% and 3.8%), protein metabolism (6.7% and 6.3%) and DNA metabolism (5.2% and 4.8%) for carbohydrate metabolism and processing. These findings may reflect that the organism is well adapted to aquatic ecosystems containing a wealth of carbohydrates and nucleic acids. We also observed relatively high levels of genes involved in regulation and cell signaling, perhaps reflecting the capacity of these strains to cope with various growth conditions and stresses [34]. In contrast, we did not identify genes involved in photosynthesis, iron acquisition and metabolism, secondary metabolism or nitrogen metabolism.

Confirmation of SNPs and INDEL genes
A major goal for sequencing is to identify large and small changes in the genome that could explain the divergence of a strain or have implications for gene function. Therefore, we conducted a detailed analysis of single nucleotide polymorphisms (SNPs), variations in repeat numbers, and gene-scale gains or losses. Analysis of Illumina GA data allowed us to identify 5,360 and 4,241 single mutations in the genomic sequence of KRS-02083 and KRS-02109, respectively, compared with the reference genome.
Gene insertion-deletion (INDEL) events were abundant in the KRS-02083 and KRS-02109 genomes, which contained 346 and 232 newly annotated genes and lacked 336 and 357 genes, respectively, with respect to the KCTC 11537 genome. Most of the INDEL genes were hypothetical genes or genes of unknown function (Figure 2). INDEL genes contribute to species diversity and might encode supplementary biochemical pathways and functions that are not essential for bacterial growth but may confer selective advantages, such as adaptation to different niches and/or antimicrobial resistance. Large INDELs of gene regions can also create structural variations of the chromosome; these may be used as genotype-specific markers for epidemiological studies, and their study can offer a novel approach to understanding genetic diversity [35]. As such, our comparative analysis of the three S. parauberis genomes may provide important new insights into the evolutionary history of microbial pathogens and heterogenetic specificity in similar species.
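Conceptually, counting genes gained or lacking relative to a reference is a set comparison. The Python sketch below illustrates the idea with a deliberately tiny gene list drawn from operon genes named in this paper; it assumes genes can be matched by name, which is a simplification of the sequence-based comparison performed with RAST, and the function name is hypothetical.

```python
def compare_gene_sets(reference_genes, strain_genes):
    """Split gene names into shared, strain-only (gained) and reference-only (lacking)."""
    reference, strain = set(reference_genes), set(strain_genes)
    return {
        "shared": reference & strain,
        "gained": strain - reference,
        "lacking": reference - strain,
    }

# Simplified illustration, not the full annotation of either genome.
reference = {"ptsG", "sorE", "sorD", "sorC", "sorB", "sorR"}
strain = {"ptsG", "lacR", "lacA", "lacB", "lacC", "lacD", "lacE", "lacF", "lacG"}
result = compare_gene_sets(reference, strain)

for label in ("shared", "gained", "lacking"):
    print(label, sorted(result[label]))
```

In practice, orthologs rarely share identical names across annotations, so a real comparison would match genes by sequence similarity before applying the same set logic.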

Lactose (lac) operon
The KRS-02083 and KRS-02109 genomes contained lac operons homologous to those in S. bovis (NZ_AEEL01000000) and S. pyogenes (NC_002737.1) (Table 3). The KRS-02083 genome encoded lacRABCDTGF and E in scaffold 11, while the KRS-02109 genome encoded lacRABCDFE and G in scaffold 1 (Figure 3). In both genomes, the lacR determinant extended to the first ORF of the lac operon and contained a DeoR-like helix-turn-helix domain and a DeoR terminal sensor domain that detects diverse sugar derivatives in multiple genera, including tagatose phosphate in S. mutans [36]. In this regard, galactose-6-phosphate and tagatose-6-phosphate have been suggested to bind to the lacR cistron and facilitate its dissociation from the binding site, AGGAG. The adjacent genes, in order, encode the galactose-6-phosphate isomerase A/B subunits, tagatose-6-phosphate kinase and tagatose-1,6-diphosphate aldolase, and are designated lacABC and D. The lacA/B enzymes convert galactose-6-phosphate to tagatose-6-phosphate, have a broad specificity for various sugars, and can be used to produce rare sugars. Previously, Zeng et al. [36] showed that a mutant strain of S. mutans lacking lacA/B failed to grow on lactose-supplemented medium, indicating that these genes are essential for the intracellular catabolism of lactose. However, further research is required to clarify the underlying mechanism. lacC and lacD are preceded by a non-translated region containing the promoter along with numerous direct and inverted repeats that are involved in regulating the lac-PTS operon.
The next sets of genes in the lac-PTS operons of the two tested strains were lacTGFE in KRS-02083 and lacFEG in KRS-02109, which are similar in structure to those in Lactobacillus rhamnosus [37] and S. mutans [36], respectively. The lacF and lacE genes were clearly related to the hydrophilic PTS IIA and PTS IIB components. LacE in particular was found to have the predominantly hydrophobic N-terminal region typical of integral membrane proteins, followed by a hydrophilic domain that resembled the cytoplasmically oriented PTS lactose-specific IICB component [38]. The enzyme II components of the lactose PTS in the studied Japanese S. parauberis strains, IICB (lacE) and free cytosolic IIA (lacF), respond to intracellular inducers; these domains may be part of a single polypeptide or may exist as separate interacting proteins [39]. The two Japanese S. parauberis genomes also contain 6-phospho-beta-galactosidase (lacG), an enzyme essential for the catabolism of lactose phosphate into glucose and galactose-6-phosphate, which is important for lactose hydrolysis (Figure S1). It has been suggested that, in some cases, lactose 6-phosphate can be hydrolyzed by beta-glycosidases that are specific for beta-glucoside sugars. The absence of lacG in KCTC11537 explains the previous finding that it is unable to grow on lactose [36]. Thus, in the two Japanese S. parauberis genomes, the lacABCD genes comprise the tagatose-6-phosphate pathway and are co-transcribed with the genes lacFE and G, which specify proteins for the transport and cleavage of lactose. Our findings support the previous reports that KRS-02083 and KRS-02109 showed positive reactions to lactose acidification in biochemical tests, whereas KCTC11537 did not [7,8].

Sorbose operon
Sorbose is a known carbon and energy source in E. coli [40]. KCTC11537 showed a positive reaction in the sorbose acidification test [7,8], and encodes L-sorbose utilization factors (sorEDCBF and R) within a 5.7-kb DNA fragment (Figure 4A). Intracellularly, L-sorbose-1-phosphate is reduced to D-sorbitol-6-phosphate by an L-sorbose-1-phosphate reductase (sorE). Then, in an NAD+-dependent step, sorbitol-6-phosphate dehydrogenase (sorF) catalyzes the oxidation of D-sorbitol-6-phosphate to D-fructose-6-phosphate, which is then further catabolized by glycolytic enzymes (Figure S1). Sequence analysis revealed the presence of six sor-utilization ORFs, similar to the sor operon of Yersinia enterocolitica [41] except for the absence of sorF in KCTC11537 (Figure 4B). The predicted gene products also showed homologies to various PTS proteins, including the IIB, IIC and IID components of the sorbose-specific PTS, and the catalytic enzyme L-sorbose 1-phosphate reductase from Y. enterocolitica (39%, 71%, 71% and 66%, respectively). The last gene, sorR, showed 43% homology with the corresponding gene of Y. enterocolitica. In KCTC11537, the L-sorbose PTS EII consists of two membrane-bound proteins, IIC (sorC) and IID (sorD), and two soluble components, IIA/B (sorB); BLAST searches suggested that these are fused into a single polypeptide chain, as in E. coli [42]. When the predicted amino acid sequence of sorR was used in a BLAST search, we found that its closest homologs included a transcriptional regulator from Y. enterocolitica [43], a DeoR family transcriptional regulator from Sebaldella termitidis [44] and a sor operon regulator from E. coli [45]. These results suggest that the helix-turn-helix is a major structural motif capable of binding DNA; composed of two alpha helices joined by a short strand of amino acids, this motif is found in many proteins that regulate sor gene expression [46].
KCTC11537 encodes two sorbitol 6-phosphate 2-dehydrogenase genes, sorF and srlD, which were found to be 78% and 85% similar to those in S. equi [35]. This suggests that cross-talk occurs between the sorF and srlD regulatory metabolic pathways and the redox state of the cell, and further indicates that the genes involved in L-sorbose and D-sorbitol catabolism interact through common effectors to reduce sorbitol-6-phosphate by converting it to fructose-6-phosphate.

PCR-based survey of carbon utilization genes
The utilization genes for glucose, GCK and ptsG, were detected as 480- and 875-bp bands, respectively (Table 4). Our results revealed that the major pathways of carbohydrate uptake in 24 S. parauberis isolates all involved the PTS. The nucleotide sequences of the GCK and ptsG genes in the three genomes examined herein were highly homologous (over 99%), further supporting this contention. Glucose is the preferred carbon source in many bacteria, and the presence of glucose often prevents the use of other, secondary carbon sources (e.g., sorbose, lactose and many others) [47]. The lactose-utilizing genes, lacC, lacE and lacG, were amplified from 10 of the 24 tested isolates, including the Korean strain, J27, and the Japanese strains, KRS-02083, KRS-02030, KRS-02067, KRS-02109, KRS-02068, KRS-02090, KRS-02087, KRS-02091 and KRS-02102. Eleven of the 24 isolates were found to have genes for sorbose utilization, such as sorC and sorE; these isolates included the Korean strains, KCTC11537, KCTC11538, J19, J20, J24, J25 and J21, and the Japanese strains, KRS-02032, KRS-04024, KRS-02067 and KRS-04037. We failed to detect any sorbose- or lactose-associated genes in Korean strains J22, J23, J28 and J30 (Table 4). The KCTC11537 genome was found to include a sorbose-utilization locus that contained the genes for the PTS EII domains that are involved in the transport of extracellular sorbose and the phosphorylation of sorbitol. The KRS-02083 and KRS-02109 genomes, in contrast, both encoded lac operons, but showed different gene sequences.
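The isolate lists given above can be cross-checked with simple set operations. The Python sketch below uses only the strain names stated in the text; the variable names are illustrative. It confirms that the three groups together account for all 24 isolates and identifies any isolate that appears in both the lactose-positive and the sorbose-positive lists.

```python
# Isolates from which lacC, lacE and lacG were amplified (as listed in the text).
lactose_positive = {"J27", "KRS-02083", "KRS-02030", "KRS-02067", "KRS-02109",
                    "KRS-02068", "KRS-02090", "KRS-02087", "KRS-02091", "KRS-02102"}
# Isolates carrying sorbose utilization genes such as sorC and sorE.
sorbose_positive = {"KCTC11537", "KCTC11538", "J19", "J20", "J24", "J25", "J21",
                    "KRS-02032", "KRS-04024", "KRS-02067", "KRS-04037"}
# Isolates with neither sorbose- nor lactose-associated genes detected.
neither = {"J22", "J23", "J28", "J30"}

all_isolates = lactose_positive | sorbose_positive | neither
both = lactose_positive & sorbose_positive

print(len(all_isolates), sorted(both))
```

With the names as listed, the union contains 24 isolates, and KRS-02067 is the only isolate present in both the lactose and the sorbose lists.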

Sugar utilization
The difference in the encoded PTS components was confirmed when the bacterial strains were grown in TYE supplemented with D-glucose and with either lactose or L-sorbose. The two Japanese strains encoding the PTS components and tagatose-associated enzymes involved in lactose metabolism grew in TYE supplemented with D-glucose and lactose. In contrast, the Korean strain encoding the PTS components and enzymes involved in sorbose metabolism grew well in TYE supplemented with D-glucose and L-sorbose; all strains showed similar growth patterns in TYE broth without an added carbon source (Figure 5). These results indicate that the PTS elements mediate the expression of the lac and sor operons within the cellular membrane.
In many bacteria, the amount of carbohydrates flowing through glycolysis is the primary trigger for binding of CcpA, a CCR element that is crucial for sporulation [48,49], antibiotic resistance [50], and expression of virulence genes for pathogenicity [51]. It is important to keep in mind that the aim of pathogenic bacteria is to gain access to nutrients rather than to cause damage to the host. A number of papers have described the effects of carbon sources on virulence, particularly in terms of capsule production to avoid phagocytosis and efficient attachment to host cells or tissues [52,53]. Differential nutrient availability within diverse host niches impacts the ability of S. parauberis cells to counteract local stresses and resist pharmacological intervention. To better understand how the catabolism of lactose and sorbose affects the virulence mechanisms of S. parauberis, future studies should examine the properties of various mutant strains lacking certain PTS components, regulatory enzymes and other constituents of the catabolic pathways.

CRISPR/Cas systems
In S. pyogenes, phage integration, as shown by genomic rearrangement of the prophage regions, is an important source of new virulence factors [54]. Mechanisms of phage resistance have not been previously described in S. parauberis, but genomic analyses have suggested that the CRISPR/Cas system helps provide adaptive immunity against foreign genetic elements such as phages. Of the genomes examined herein, only KRS-02109 was found to possess a characteristic CRISPR/Cas system; it contains genes highly similar to cas1, cas2, and two csn family genes of S. pseudoporcinus (Figure 6). The cas1 gene encodes a metal-dependent DNA-specific endonuclease that may play a role in the recognition, cleavage, and/or integration of foreign nucleic acids into CRISPRs [55]. The biological significance of the CRISPR/Cas system against phage infection has been examined in S. thermophilus [54], where the presence of a CRISPR spacer identical to a phage sequence adjacent to cas provides resistance against phages containing that particular sequence (direct sequence consensus: GTT TTG GAA TCA TTC AAA ATA ACA TAG CTC TAA AAC). The spacer sequences found in the CRISPR loci of KRS-02109 showed sequence similarity to a lytic phage, suggesting that S. parauberis KRS-02109 might resist phage infection via its CRISPR/Cas system. KCTC11537 and KRS-02083 each encode four phage-associated loci in their genomes (Figure 1), suggesting that these four regions of the KCTC11537 and KRS-02083 genomes may be involved in the acquisition of foreign genes via natural transformation or bacteriophage activity. In contrast, no such phage-associated gene was found in the KRS-02109 genome, suggesting that this strain may resist phage infection through its CRISPR/Cas system.
Table 4 (fragment; column headings for the PCR results are missing). Strain, year, serotype, PCR results: (ref), 2006, I, - - - - - + + + + -; KCTC11538, 2008, I, - - - - - + + + + -; J19, 2009, I, - - - - - + + + + -; J20, 2009, I, - - - - - + + + + -; J24, 2009, I, - - - - - + + + + -; J25, 2009, I, - - - - - + + + + -; J21, 2009, II, - - - - - (row truncated).
Interestingly, the cas1 gene was amplified from type II strains obtained from both Korea and Japan (Table 4), further suggesting that the CRISPR/Cas system might contribute to serology.
In the genus Streptococcus, specific species may have evolved particular functions by acquiring foreign genes via natural transformation or bacteriophages. However, the acquisition of new foreign genes via phage infection may not have always favored their lifestyles. In the future, elucidation of the mechanism by which S. parauberis acquires new genes could help clarify the species-specific evolutionary strategies among the streptococci.
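The comparison of CRISPR spacers with phage sequences discussed in this section can be illustrated with a short Python sketch. It is a simplified stand-in for what dedicated tools such as CRISPRFinder do: it assumes exact copies of the repeat quoted above, searches the forward strand only, and uses hypothetical spacers, flanks and phage fragment rather than sequences from KRS-02109.

```python
REPEAT = "GTTTTGGAATCATTCAAAATAACATAGCTCTAAAAC"  # consensus quoted in the text

def extract_spacers(locus, repeat=REPEAT):
    """Return the sequences lying between consecutive exact copies of the repeat."""
    parts = locus.upper().split(repeat)
    # parts[0] and parts[-1] are the flanks outside the array, not spacers.
    return parts[1:-1]

def shared_spacers(spacers, phage_sequence):
    """Spacers that occur verbatim in a phage sequence (forward strand only)."""
    phage = phage_sequence.upper()
    return [s for s in spacers if s in phage]

# Hypothetical spacers, flanks and phage fragment, for illustration only.
spacer_a = "ACGTACGTACGTACGTACGTACGTACGTAC"
spacer_b = "TTGACCGGTAATCGGCTAAGCTAGGCTAAC"
locus = "CCCC" + REPEAT + spacer_a + REPEAT + spacer_b + REPEAT + "GGGG"

spacers = extract_spacers(locus)
hits = shared_spacers(spacers, "AAAA" + spacer_b + "TTTT")
print(len(spacers), hits == [spacer_b])
```

A real analysis would tolerate mismatches in both repeats and spacers and would also search the reverse complement, so this sketch only conveys the logic of matching spacers to phage sequences.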

Conclusion
Our genomic comparison of three S. parauberis strains revealed numerous biological, virulence, and pathogenetic factors, reflecting the organism's adaptation as an obligate and versatile fish pathogen. The genomic features of the two Japanese strains (KRS-02083 and KRS-02109) were overall similar to those of the reference genome (KCTC11537). The S. parauberis genome appears to encode a number of carbohydrate related proteins (comprising the highest percentage of the annotated genes) and seems to confer the ability to synthesize all of the amino acids and regulatory factors required to survive in the host's complex environmental niche.
The genomes of the two Japanese strains each encoded a lac operon, which was found to be homologous in both type I and type II strains, and the loci were structurally similar to those of L. rhamnosus and S. mutans. In contrast, KCTC11537 exhibited a positive reaction to sorbose and was found to encode L-sorbose utilization factors. Its genome includes a 5.7-kb chromosomal DNA fragment that harbors six sor ORFs, which encode intracellular inducers and proteins capable of responding to external sorbose. Of the three examined strains, only KRS-02109 appears to preferentially use a CRISPR/Cas system to defend against phage integration. The cas1 gene, however, was amplified from all type II strains, regardless of geographic distribution.
Our genome sequence analyses showed that sequence diversities can exist among closely related, but distinct populations. We observed distinct variations, most notably in carbohydrate utilization, between similar species from different geographic regions, suggesting that the bacteria may have adapted to using certain carbohydrate sources that are abundant in a particular area and/or utilizing secondary carbohydrate sources in the absence or lack of a primary source. Our findings may also be useful for the development of new prophylactic and therapeutic strategies to counter fish streptococcal infection.
Table S1. PCR primers for this study.