Molecular epidemiology of Mycoplasma hyorhinis porcine field isolates in the United States

Mycoplasma hyorhinis is one of the causative agents of polyserositis and arthritis in post-weaning pigs. Here we describe the development of a multi-locus sequence typing (MLST) protocol for the characterization of M. hyorhinis field isolates. A total of 104 field isolates from different geographical locations, swine production systems, and clinical backgrounds, were analyzed. Twenty-seven genes, including housekeeping and those encoding surface proteins, were evaluated to index diversity. Genes encoding surface proteins were included to increase the discriminatory power of the MLST. Four target gene fragments were selected to be included in the final MLST-s (surface) protocol: pdhB, p95, mtlD and ung. Within each locus the nucleotide variation ranged from 1.4% to 20%. The 104 field isolates were classified into 39 distinct sequence types (STs). Multiple STs were found within the same production system and within the same pig. The majority of STs grouped strains from the same production system; however, cases existed where multiple systems shared a ST, indicating potential relationships between pig flows. The majority of the nucleotide changes observed in these genes generated synonymous changes, while non-synonymous changes were exclusively in the mtlD gene fragment, suggesting that this protein is undergoing selection. Molecular typing of M. hyorhinis will primarily aid swine practitioners with pig flow management and identifying sources of infection during outbreaks.


Introduction
In recent years, Mycoplasma hyorhinis has been recognized as an important cause of mortality in nursery pigs [1,2]. This pathogen colonizes the nasal and oropharyngeal epithelial surface of pigs. Piglets presumably become colonized through contact with sows, and the bacterium is transmitted via nose-to-nose contact among pigs following colonization and infection [3]. PLOS  submitted to the UMN Veterinary Diagnostic Laboratory from 2010 to 2012. All isolates were confirmed as M. hyorhinis utilizing a qPCR assay based on the 16S rRNA gene [25]. The isolates originated from multiple farms (nurseries, finishers, and gilt development sites) belonging to 21 different swine production systems, denoted as numbers 1-22 (no system 3) in 10 states of the US. Three isolates originated from a production system in Mexico (labeled MEX).
Isolates were stored at -80˚C until further testing, for which they were re-grown in 3 mL aliquots of modified Hayflick's medium for 2 to 14 days [26]. The suspension was centrifuged, and M. hyorhinis genomic DNA was extracted using the DNeasy Blood & Tissue kit following the manufacturer's protocol (Qiagen, Germantown, MD).

Loci selection and primer design
The whole genomes of four M. hyorhinis isolates available in GenBank (HUB-1 (NC_0144 48.1), GDL-1 (NC_016829.1) MCLD (NC_017519.1) and SK76 (NC_019552.1)) were utilized to identify potential target genes [27][28][29][30]. Identification of variable regions within the M. hyorhinis species was accomplished through a Clustal W progressive alignment of all 4 M. hyorhinis genomes using Mauve 2.3.1 [31]. Further identification of areas of variation within different housekeeping and surface protein encoding genes, such as previously identified variable lipoproteins and adhesins [32], was performed using Geneious 7.0.6 (Biomatters, Auckland, New Zealand, http://www.geneious.com). Primers were designed using PrimerQuest1 (IDT, Coralville, USA) software. In silico analysis of primer specificity was carried out using the Primer-BLAST tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast). Genes with observed genetic variation between the available M. hyorhinis genomes were further tested with a subset of 6 distinct isolates originating in pigs from different production systems, ages, and geographical locations. Candidate genes were discarded if no nucleotide sequence variation was observed between these 6 isolates.

PCR conditions and sequencing
The loci selected were amplified in an Eppendorf thermal cycler (Eppendorf Mastercycler 5345, Hamburg, Germany) with a total volume of 25 μl. The reaction mixture contained 12 μl of HotStarTaq Master Mix (Qiagen, Germantown, MD), 0.8 μl of each primer (1 μM), 9.4 μl of H2O, and 2 μl of template DNA. PCR conditions were: 95˚C for 15 min, followed by 35 cycles of 95˚C for 1 min, 43.5˚C for 1:30 mins, and 72˚C for 1 min, and a final extension of 72˚C for 10 min. PCR products were analyzed by electrophoresis using an 1.5% agarose gels stained with ethidium bromide, run at 120 V for 40 min, and visualized using a gel imaging system. PCR products were purified using the QIAquick PCR Purification Kit (Qiagen, Germantown, MD) and bi-directionally sequenced by standard Sanger sequencing on an ABI 3730xl BigDye Terminator v 3.1 (Life Technologies, Grand Island, NY).

Data analysis
Resulting sequencing data was quality assessed, and sequences were aligned utilizing ClustalW within Geneious 7.0.6 and trimmed to equal sizes. MEGA 5.2.1 (version 5; www.megasoftware. net) was utilized to identify parsimony-informative sites. Nucleotide sequence alignments for each gene fragment were translated to assess the quality of change (synonymous or non-synonymous substitutions). Individual and concatenated dendrograms were computed using the Hasegawa-Kishino-Yano distance model and the UPGMA (unweighted pair group method with arithmetic mean) tree building method. For each targeted locus, each distinct sequence variant was arbitrarily assigned a consecutive number starting with 1, with no weight given to the amount of nucleotide sequence variation among alleles [23]. A distinct sequence type (ST) or allelic profile number was defined for each combination of the allele numbers for each locus (i.e. 1-5-2-3). The STs were assigned consecutive numbers in order of description (i.e. ST1). Relationships among strains were evaluated using minimum spanning trees (MST) analysis, after the introduction of allelic profiles into Bionumerics software V7.1 (Applied Math, Sint Maartens-Latem, Belgium). The strains were grouped into clonal complexes, defined here "as a group of MLST genotypes in which every genotype shares at least 3 out of 4 loci in common with at least one other member of the group" (pubmlst.org/analysis/burst/burst.shtml).

Selection and evaluation of M. hyorhinis MLST-s marker genes
A total of 27 genes were evaluated as potential target genes (S2 Table). All housekeeping genes except pdhB and ung were nearly identical (~99.5%) in all 4 annotated genomes and were therefore not included for further testing. While variation was observed within the nrdf and lspA surface protein genes when comparing the 4 annotated genomes, testing of the initial 6 M. hyorhinis isolates revealed no nucleotide sequence variation and were therefore not included for further testing. The gene sequences for vlpF and B genes were found to be highly divergent among the 4 available genomes, with no conserved flanking regions, hindering primer placement. Primers for the remaining vlp genes were designed; however, an amplicon was not obtained for vlpA, C, D and E genes. Although an amplicon was generated for the vlpG gene, the generated product had low quality and was not reproducible.
Minimal nucleotide sequence variation was observed within the cls, hexo and p37 genes and were therefore excluded from the subsequent testing of the remaining isolates. The discriminatory power of the technique was not affected after the removal of the 3 genes. Poor sequencing quality was obtained for the p3 gene and thus was excluded from further analysis. The final MLST-s protocol included: pdhB, p95, mtlD and ung ( Table 1). The selected genes were dispersed throughout the M. hyorhinis genome. In silico analysis revealed no cross-reactivity between primer sequences and swine related non-targeted taxa.

Relatedness of M. hyorhinis isolates
The dendrogram constructed based upon concatenated gene sequences (1441bp) revealed genetic polymorphisms among the examined isolates (S1 Fig). The nucleotide sequence variation (percent informative sites) within each gene ranged from 1.4% to 21% (Table 2), with mtlD showing the highest degree of variation and pdhB the lowest. Although nucleotide polymorphisms were observed, the majority resulted in synonymous changes in the pdhB, p95 and ung genes. In contrast, non-synonymous changes were seen within the mtlD gene (S2 Fig). Furthermore, in some instances, changes involved the switch from amino acids belonging to different groups, based on the characteristics of the side chain, polarity, or acid-base properties  (Tables 2 and 3).

Frequency and distribution of M. hyorhinis sequence types (ST)
The most frequent ST, ST37, was observed 13 times. Eighteen STs contained between 2 and 9 isolates, and 21 STs contained only one isolate. Minimum spanning tree analysis showed the grouping of 39 STs into 3 clonal complexes (CC), with CC1 encompassing more STs (n = 25) and isolates (n = 66) than any other clonal complex. CC2 and CC3 contained 2 STs each and a total of 2, and 6 isolates, respectively. Lastly, singletons, defined as isolates which share less than 3 out of 4 loci with another member of a clonal complex, were identified 10 times and included 30 isolates (Fig 1). ST37, the most frequent subtype, and ST19 contained isolates from only one management system (system 7 and 2, respectively), where specimens were collected the same day during an M. hyorhinis outbreak investigation (Table 3). Furthermore, two isolates from the upper respiratory tract (nasal cavity or bronchus) and pericardium were obtained from the same pig, evidencing that isolates recovered from systemic sites of diseased pigs can be also found in the nasal cavity or bronchi. ST19 was represented by isolates of healthy pigs from the nasal cavity or bronchi and diseased pigs from joints originating from system 2 in 2010 through 2012. Isolates with different STs (ST18 and ST35), however, were also obtained from pigs from that system during the same time frame (Table 3). Furthermore, in system 2, isolates from the pleura and bronchus of the same pig had different STs, differing by only one allele (mltD). In system 5, isolates in different U.S. states shared the same ST, suggesting a point source introduction. In contrast, 5 different STs were identified within the same system (Figs 1,2 and 3).
The isolates from ST17 (n = 9) originated from systemic sites of pigs of wide age distribution (7-50 weeks), and from 5 different systems in 4 different states collected in 2010-2012 (Table 3). It is important to note that information gathered for each isolate included "system," which refers to the ownership of the pigs but lacks information on the pig flow or site. Therefore, the epidemiological interpretation of the data presented here may be limited since one system can be comprised of multiple producers who submit cases independently but share the same origin of pigs.
The reference strain HUB-1, originating from China, as well as the strains originating from Mexico showed unique STs. CC2 was comprised of isolates from the same system in Mexico (Table 3), while CC3 included isolates from different systems. Finally, no evident relationships were observed with respect to the age of the pig, lesion type, year, or US state of origin (Figs  1,2 and 3). However, strains originating from different countries (US, China and Mexico) clustered in distinct clonal complexes or as singletons. Additionally, there was no evidence of a specific ST being associated with virulence.

Discussion
The objective of this study was to develop an MLST-s typing scheme for epidemiological and genetic characterization of M. hyorhinis field isolates. The novel scheme was utilized on a set of 104 well-characterized isolates from various tissue types, swine production systems, ages and geographic locations. Twenty-seven loci were evaluated as potential gene targets; however, many of these loci were almost identical in 4 previously published M. hyorhinis genomes and therefore not pursued further. A previously published and later optimized MLST scheme, showed an overall low percentage of polymorphic sites within each of the included housekeeping genes [21,22]. In view of the limited genetic variation observed in housekeeping genes, more variable genes, such as surface protein encoding genes, were incorporated. This modification could increase the possibility of detecting more genetic variation, making it useful for typing during localized disease outbreaks and studying the pathogenesis of M. hyorhinis [24]. In contrast, variable lipoprotein encoding genes revealed an immense amount of variation between the available genomes, lacking conserved areas for primer placing. These findings were somewhat anticipated given the fact that the genes coding for these lipoproteins are under constant change, either through recombination or high-frequency mutations, which affects their expression and size [33,34]. The final MLST-s protocol contained 2 core housekeeping genes, pdhB and ung, with an observed variation between 1.4-2% resulting in 6 and 7 different allele types, respectively. Similar amounts of variation have also been observed within housekeeping genes in previously published MLST schemes for M. hyopneumoniae, M. agalactiae, M. bovis and M. hyorhinis [20-22, 35, 36]. One unanticipated finding was the little to no variation observed within widely recognized adhesins, such as p37 and p95. The recent sequencing of the complete p37 gene in 33 unrelated isolates from France showed 18 different nucleotide sequences. [21] Although variation was observed, the previous study concluded that the discriminatory power was relatively poor and that single-gene typing methods can fail at differentiating isolates that have a reduced variation. One possible explanation for the lack of variation is that the proteins encoded by these genes may be essential for host colonization and survivability [37]. In fact, it has been shown that truncated versions of certain adhesins within another swine mycoplasma species, M. hyopneumoniae, contribute to loss of virulence [38].
Even though variation was observed within pdhB, p95 and the ung loci, these nucleotide changes resulted in mostly synonymous changes. These findings are in accordance with previous results, where the genetic variation observed in housekeeping genes generally led to synonymous changes [21]. On the other hand, the surface protein-encoding gene mtlD showed great variation (~21%) within the analyzed sequences and more than 20 different alleles. The predicted amino acid sequence alignment of mtlD revealed multiple non-synonymous changes. In some cases, the changes were between amino acids of different chemical groups. Future studies should evaluate if these changes influence the antigenic variation of this species due to the protein location on the bacterial surface or if the changes contribute to virulence.
Modified MLST protocols are not novel. Hyper-variable genes have been incorporated into typing schemes for M. agalactiae, M. capricolum and B. pertussis [35,39,40]. In all cases, an increase in discriminatory power was obtained; in some cases, this modification led to the identification of MLST profiles associated with epidemics and globally distributed strains [40]. The increased ability to discriminate between isolates during an epidemic or outbreak investigation might be at the cost of the interpretation of observed evolutionary relationships [24]. In this study, most systems had at least two STs within a 2-year period, and most frequently the strains differed at the mtlD loci, evidencing that the addition of hyper-variable genes might be a useful tool for local epidemiological investigations.
The present study showed high genetic variation within the M. hyorhinis species, with multiple STs circulating within US swine herds. The MLST-s scheme presented here was revealed to be highly discriminatory, capable of subdividing 104 isolates into 39 distinct STs. Multiple STs were found within a single system and an individual pig during the same time frame. Yet, frequently, isolates from the same farm shared the same ST. ST17 included isolates from 5 different production systems. This result could indicate the dissemination of a clone that may be a more successful pathogen, the existence of an organism with a longer persistence and survivability, or a common gilt source between the systems. It could also be due to the convenient nature of the sampling performed here [35]. In contrast to the observed sharing of ST17 between systems, the majority of STs with more than one isolate originated from the same system.
No correlation between STs and the age of the pig, year of isolation, sample type, pathogenicity, and state of US origin was observed. These observations are in accordance with others [22,41]. However, in this study three distinct clonal complexes, two of which included strains from the US and one that exclusively contained strains from Mexico, were identified. This is in contrast with the findings generated with the previously published MLST scheme, where only one clonal complex was identified in each data set, that included isolates from three different countries [21,22]. This finding suggests that the novel MLST-s presented here could have a higher discriminatory power compared to previously published schemes, possibly due to the addition of more variable genes. However, this study lacked a side by side comparison and therefore further research is required to test such hypothesis. Finally, it is possible that the over-representation of isolates in this dataset from diseased pigs could have hindered the identification of virulence-associated STs.
Currently, swine practitioners are faced with the challenge of controlling M. hyorhinis disease in affected post-weaning pigs. The application of the MLST-s protocol described here has resulted in better understanding of the diversity of M. hyorhinis field isolates circulating in US swine herds. This tool could help to elucidate further features of the epidemiology and dynamics of infection for this pathogen, understand disease outbreaks, assist with selection of isolates for vaccine production, and identify the potential origin of a specific isolate. Arrows indicate areas of nonsynonymous changes between aligned sequences where the amino acid change was between different amino acid groups, based on polarity, pH level or side chain. Arrows from left to right: lysine (basic) vs. glutamic acid (acid), alanine (non-polar) vs. serine (polar) phenylalanine (aliphatic group) vs. leucine (aromatic). (PNG) S1 Data . Concatenated gene (pdhB, p95, mtlD and ung) sequences of all 104 M. hyorhinis isolates.