Characterization of Chinese Haemophilus parasuis Isolates by Traditional Serotyping and Molecular Serotyping Methods

Haemophilus parasuis is classified mainly through serotyping, but traditional serotyping often yields non-typable (NT) strains and unreliable results caused by cross-reactions. Here, we surveyed the serotype prevalence of Chinese H. parasuis isolates using traditional serotyping (gel immuno-diffusion test, GID) and molecular serotyping (multiplex PCR, mPCR). We also investigated why the two methods gave discrepant results, and examined the causes of mPCR failure through whole-genome sequencing. Of the 100 isolates tested, 73 (73%) and 93 (93%) were serotyped by the GID test and mPCR, respectively, with a concordance rate of 66% (66/100). Additionally, mPCR reduced the number of NT isolates from 27 (27%) by GID testing to seven (7%). Eleven isolates were sequenced, including nine serotype-discrepant isolates from mPCR and GID typing (excluding strains that were NT by GID only) and two isolates that were NT by both methods, and their in silico serotypes were obtained from genome sequencing based on their capsule loci. The mPCR results were supported by the in silico serotyping for seven of the nine serotype-discrepant isolates. The discrepant results and the NT isolates determined by mPCR were attributed to deletions and unknown sequences in the serotype-specific region of each capsule locus. Compared with previous investigations, this study found a similar predominant serotype profile but a different prevalence frequency for H. parasuis; the five most prevalent serotypes or strain groups were serotypes 5, 4, NT, 7 and 13 for mPCR, and serotypes 5, NT, 4, 7 and 13/10/14 for GID. Additionally, serotype 7 was recognized as a principal serotype in this work.

Among the serotyping protocols available for H. parasuis, the capsular polysaccharide is assumed to be the dominant component of the serotyping antigen [9, 23-25]. The capsule loci for the 15 H. parasuis serotype reference strains have been annotated, and a strong correlation between the capsule locus type/in silico serotype and serotyping result was observed [26,27]. Surprisingly, the capsule locus was also found in NT strains [27]. Therefore, the capsule locus offers a potential target for molecular serotyping of H. parasuis. Based on the concept that the capsule locus is responsible for the capsule phenotype, a multiplex PCR (mPCR) was developed by Howell et al. [28] for rapid molecular serotyping of H. parasuis. All the isolates tested in that study were typed by this method, and a high concordance was obtained between the mPCR and indirect hemagglutination (IHA) results [28].
The aim of this study was to investigate H. parasuis serotype prevalence in Chinese pig herds. For this purpose, both mPCR and GID tests were performed, and whole-genome sequencing was used to validate the discrepant results between the mPCR and GID tests and to survey the cause of the mPCR serotyping failure. We found that the mPCR serotyping detection rate was superior to that of GID typing, and where discrepancies existed in the mPCR serotyping they were attributable to deletions and unknown sequences in the serotype-specific capsule locus region.

Ethics statement
This study was approved by the Animal Ethics Committee of Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences (Permit No. LVRIAEC2007-003). All the experimental protocols in this study were conducted in strict accordance with the requirements of the Animal Ethics Procedures and Guidelines of the People's Republic of China. All animals were humanely sacrificed under sodium pentobarbital anesthesia, and all efforts were made to minimize any suffering.

GID test
The serotypes of all the field isolates were identified using the GID test, as originally described by Morozumi and Nicolet [23]. Reference strain antisera were prepared as described previously [17] using cells grown overnight on tryptic soy agar (TSA, Becton, Dickinson and Company, Sparks, USA) supplemented with 5% horse serum and 10 μg/ml NAD. The serotyping antigens (heat-stable antigens) from the field isolates were prepared by autoclaving at 121 °C for 2 h as described by Morozumi and Nicolet [23]. The serotyping procedure was performed as described previously [19,21]. The test was repeated once if no definitive serotype was obtained for an isolate. NT strains were defined as isolates whose antigens did not react with antisera against any of the 15 serotype reference strains.

Multiplex PCR assay
The one-step mPCR was performed as previously described [28] with some modifications. Briefly, a loopful of bacteria from a pure culture plate was suspended in 30 μl of UltraPure H2O. The suspension was boiled for 30 min, and the supernatant was collected for each isolate after centrifugation at 4,000 × g for 1 min. A 1 μl aliquot of this genomic DNA preparation was added to the mPCR mixture for each sample; the 25 μl total volume also contained 12.5 μl of premix Taq (Ex Taq version 2.0 plus dye), 0.5 μl of primer mix (50 μM), 0.25 μl of DMSO (1% of the total reaction volume), and 10.75 μl of UltraPure H2O. All the samples were examined according to the serotype order, and RNase-free ddH2O and genomic DNA of the corresponding serotype reference strain were used as negative and positive controls, respectively. The dominant serotype 4 reference strain was used as the positive control for the NT strains. The thermocycling program consisted of 94 °C for 5 min, followed by 30 cycles of 94 °C for 30 s, 58 °C for 30 s and 68 °C for 60 s, and a final extension at 68 °C for 5 min. The amplified products were electrophoresed in 2.0% agarose gels run in 1× Tris-borate buffer with a Quick-Load® 100 bp DNA Ladder (New England BioLabs) as the molecular size standard. The procedure was repeated twice for each isolate.
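As a quick consistency check on the reaction recipe above, the listed component volumes can be summed and the DMSO fraction recomputed. The following is a minimal sketch; the dictionary keys and function names are illustrative and are not part of the published protocol.

```python
# Component volumes in microlitres, as listed in the text.
# The key names are illustrative labels, not reagent catalogue names.
components_ul = {
    "template DNA": 1.0,
    "premix Taq": 12.5,
    "primer mix": 0.5,
    "DMSO": 0.25,
    "UltraPure H2O": 10.75,
}


def total_volume(components):
    """Sum all component volumes (microlitres)."""
    return sum(components.values())


def percent_of_total(components, name):
    """Volume of one component as a percentage of the whole reaction."""
    return 100.0 * components[name] / total_volume(components)


total = total_volume(components_ul)
dmso_pct = percent_of_total(components_ul, "DMSO")
print(f"total reaction volume: {total} ul")
print(f"DMSO fraction: {dmso_pct}%")
```

The sum agrees with the stated 25 μl reaction volume, and 0.25 μl of DMSO corresponds to the stated 1% of the total.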

DNA extraction and genome sequencing
Eleven isolates were sequenced, including the 9 isolates with serotype discrepancies by mPCR and GID (excluding strains that were NT by GID only) and the 2 isolates that were NT by both methods. Genomic DNA was extracted from overnight cultures grown in supplemented TSB using a DNeasy Blood & Tissue Kit (Qiagen, Valencia, CA) according to the manufacturer's instructions. The concentrations of the extracted genomic DNAs were measured using the Nanodrop 2000/2000C system (Thermo Scientific Company, Waltham, MA, USA). All the isolates were sequenced on an Illumina HiSeq 2000 platform with a paired-end (PE) strategy. Before assembly, the PE reads were quality filtered (Q20), the first 5 nucleotides at the 5'-end were trimmed, and adapter sequences were removed; the processed reads were then assembled with SOAPdenovo. The average effective sequencing depth for all the isolates was 120-fold.

Capsule locus identification and in silico serotype analysis
The in silico serotypes of the serotype-discrepant isolates and common NT isolates were determined by comparing their capsule contents and compositions with those of the reference strains. The capsule locus was identified for the 9 isolates with serotype discrepancies by mPCR and GID (excluding strains that were NT by GID only) and the 2 isolates that were NT by both methods according to a previous description [26] with some modifications. Briefly, the locus sequences were acquired by using the first gene (funA) and last gene (iscR) of the H. parasuis SH0165 capsule locus (GenBank accession No. CP001321.1) [32] as the query sequences. The gene name was determined for each coding sequence within the capsule locus by a nucleotide Basic Local Alignment Search Tool (BLASTn) search of the NCBI database (https://www.ncbi.nlm.nih.gov/). The predicted gene names were recorded according to the highest matched nucleotide identity score. When more than one significant BLAST match was found, within a single isolate or across isolates, the matched sequences were aligned against each other by BLASTn to determine whether they represented the same sequence. Identical sequences were defined as described previously [27,33], using a threshold of >80% nucleotide identity over >80% of coverage length. For simplicity, the capsule locus genes from each isolate were ordered from funA to iscR.
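The identity and coverage rule described above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, and coverage is assumed here to mean aligned length divided by query length, which the text does not spell out.

```python
def is_same_sequence(percent_identity, aligned_length, query_length,
                     min_identity=80.0, min_coverage=80.0):
    """Apply the >80% identity over >80% coverage rule to one BLASTn hit.

    Assumption: coverage = aligned_length / query_length, as a percentage.
    Both comparisons are strict, matching the ">" used in the text.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    coverage = 100.0 * aligned_length / query_length
    return percent_identity > min_identity and coverage > min_coverage


# Illustrative hits (values are made up for demonstration only).
print(is_same_sequence(95.0, 900, 1000))   # high identity, 90% coverage
print(is_same_sequence(95.0, 700, 1000))   # high identity, only 70% coverage
print(is_same_sequence(80.0, 1000, 1000))  # identity exactly at the threshold
```

With strict comparisons, a hit sitting exactly at 80% identity or 80% coverage is not treated as the same sequence; a different reading of the threshold would only require changing the two comparison operators.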

Comparing the detection performances of mPCR and GID
The performances of mPCR and GID were evaluated using the results from both assays together with the in silico serotyping results. Because in silico serotyping was 100% concordant with the mPCR results in a previous study and was considered a potential new gold standard for replacing traditional serotyping methods [28], it was used to validate the results of the above-mentioned nine serotype-discrepant isolates. For the serotype-discrepant isolates from the GID and mPCR tests, the serotype category that agreed with the in silico serotyping was considered to be correct. The detection rate for the typable strains was calculated for the GID test and the mPCR assay, and the concordance between both methods was analyzed using the above information.
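The adjudication rule for discrepant isolates can be expressed as a short function. The following is a minimal sketch; the function name and the string labels are illustrative, and the GID value in the second call is a placeholder chosen for demonstration rather than a result reported in this section.

```python
def adjudicate(gid, mpcr, in_silico):
    """Return the method(s) whose call agrees with the in silico serotype.

    Serotypes are plain strings such as "4" or "NT" (illustrative labels).
    An empty list means neither method is supported.
    """
    supported = []
    if gid == in_silico:
        supported.append("GID")
    if mpcr == in_silico:
        supported.append("mPCR")
    return supported


# GID serotype 7, mPCR NT, in silico NT: the mPCR call is taken as correct.
print(adjudicate(gid="7", mpcr="NT", in_silico="NT"))

# Hypothetical case where neither call matches the in silico serotype.
# The GID value "10" is an assumed placeholder, not a reported result.
print(adjudicate(gid="10", mpcr="NT", in_silico="8"))
```

Returning a list rather than a single label keeps the case where neither method matches explicit instead of forcing a choice.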
Although the distribution frequency of each serotype varied considerably between mPCR and GID, almost identical serotype profiles were identified by both methods. Additionally, mPCR and GID showed nearly identical prevalences of the dominant serotypes or isolate groups, and the five most predominant groups, covering 90% of the total number of isolates, were identical for the two methods.
In silico serotype analysis based on the capsule locus
Because nine isolates displayed different serotype results for GID and mPCR and two isolates were NT by both methods, the molecular basis of the inconsistent results and mPCR failures was investigated by whole-genome sequencing and analysis of the in silico serotypes obtained. Genome sequencing and assembly were completed (Table 2), and the capsule loci of the above-mentioned serotype-discrepant isolates and NT isolates were identified from the genomes (Table 3). In silico serotypes were obtained from the content and composition of each capsule locus. Compared with the capsule loci of the reference strains, these isolates displayed obvious deletions and/or unknown sequences (no significant similarity sequence in BLASTn search, NSSS) in their capsule loci. Deletions and NSSSs both occurred in the serotype-specific region of each capsule locus of these isolates. Furthermore, these deletions or NSSSs covered not only the serotype-specific gene position of the mPCR scheme, but also the signal regions of the in silico serotype analysis.
H12, H35, and H36 share some common capsule locus features with the serotype 4 reference strain, SW124, and their capsule loci obviously differ from those of the serotype 14 reference strain 22113 and the serotype 15 reference strain 15995 (Table 3). Compared with SW124, the three isolates only lacked the lstB gene in their capsule loci, so they were defined as belonging to capsule locus type 4 or in silico serotype 4.
H38, H39, and K3 share similar capsule loci with the C5 serotype 8 reference strain but these loci differed markedly from the H555 serotype 10 reference strain locus. Compared with C5, these three isolates lacked the scdA gene, but H38 and H39 share NSSS10 in this position (Table 3). Based on the capsule composition analysis, H38, H39 and K3 can be defined as in silico serotype 8.
Compared with the serotype 1 reference strain No.4, the capsule locus of HPS6 is more closely related to that of the serotype 11 reference strain H465 (Table 3). HPS6 shares gltO, bstA and amtA with H465; therefore, HPS6 is related to in silico serotype 11 based on its capsule composition. HPS4, 16, and YT were identified as serotype 7 by GID and as NT strains by mPCR; they share four NSSSs (NSSS6-NSSS9, with 100% nucleotide identity) and a sequence encoding a hypothetical protein named fun, as described previously [26] (Table 3). Moreover, HPS4 and 16 share NSSS5 with 100% nucleotide identity. Another gene, amtA, which is common in the capsule locus of the serotype 11 reference strain H465 [26], also appears in the capsule loci of YT and 16. In these two isolates the funP-funQ-gltJ-cap5E-ndeA-naeA-gltA gene cluster, which is part of the serotype-specific region for serotype 7 [26], is replaced by NSSSs; among these missing genes, funQ is the serotype-specific target gene for serotype 7 in the mPCR scheme [28]. Based on the capsule composition, HPS4, 16 and YT cannot be assigned to a definite serotype in the in silico serotype analysis, so we defined them here as NT strains.
Compared with the six other NT strains (YT, 16, HPS4, H38, H39 and K3) from the mPCR, H47 differs markedly in its capsule locus, which contains four continuous and totally different NSSSs (NSSS1-NSSS4) (Table 3). These NSSSs are also distinct from those of YT, 16, HPS4, H38 and H39. Although H47 contained the gene composition of funA-neuA-wzx and lstA-wza-wzb-wzs-iscR, it is still assumed to be an NT strain by the in silico analysis in this study because of the continuous NSSSs within its capsule locus.

Comparison of the detection performance of mPCR and GID
The mPCR and GID detection rates were calculated using the data generated from GID, mPCR and the in silico serotype analysis. Of the 100 isolates we tested, 93 and 73 were identified as single serotypes by mPCR and GID, respectively. The mPCR detection rate for serotyping H. parasuis isolates (93%) was therefore higher than that of GID (73%).
Of the typable isolates tested by GID, nine showed discrepant results with mPCR. Taking the in silico serotype as the standard, the mPCR serotypes of seven serotype-discrepant isolates (H12, H35, H36, 16, YT, HPS4 and HPS6) agreed well with the results of the in silico serotype analysis (Table 4). However, the in silico serotype analysis supported neither the mPCR nor the GID result for the remaining two serotype-discrepant isolates, H38 and H39 (Table 4). The concordance between the mPCR and GID tests was 66% (66/100), comprising 64 typable isolates with matching serotypes and two isolates that were NT by both methods.
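The detection rates and the concordance figure follow directly from the counts reported above. The arithmetic can be retraced with a minimal sketch; the variable names are illustrative, and all counts are taken from the text.

```python
# Counts reported in the text.
N_ISOLATES = 100
GID_TYPABLE = 73             # isolates assigned a serotype by GID
MPCR_TYPABLE = 93            # isolates assigned a serotype by mPCR
GID_TYPABLE_DISCREPANT = 9   # GID-typable isolates that disagree with mPCR
NT_BY_BOTH = 2               # isolates non-typable by both methods


def percent(count, total):
    """Express a count as a percentage of the total."""
    return 100.0 * count / total


gid_rate = percent(GID_TYPABLE, N_ISOLATES)
mpcr_rate = percent(MPCR_TYPABLE, N_ISOLATES)

# Typable isolates on which both methods agree: 73 - 9.
concordant_typable = GID_TYPABLE - GID_TYPABLE_DISCREPANT
# Agreement also includes isolates that both methods call NT.
concordant_total = concordant_typable + NT_BY_BOTH
concordance = percent(concordant_total, N_ISOLATES)

print(f"GID detection rate:  {gid_rate}%")
print(f"mPCR detection rate: {mpcr_rate}%")
print(f"concordant isolates: {concordant_total} ({concordance}%)")
```

Counting the two isolates that were NT by both methods as agreement is what brings the concordant total from 64 to 66.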

Discussion
In this research, the mPCR and GID analyses produced nearly identical serotype profiles for the isolates. Considering serotypes 5 and 12 as the same serotype, serotypes 5, 4, 7 and 13 are the most frequently detected serotypes in China, but the prevalence frequency for each serotype differed markedly between the mPCR and GID tests. The results showed identical serotype profiles to those from studies in Denmark [11] and Canada [12], and an investigation of multinational samples [28]. Furthermore, studies performed in Germany [9], Spain [10], USA/Canada [17], Australia [34], China [15,16] and Brazil [14] also had similar results. In all cases, serotypes 4, 5 and 13 were collectively the predominant serotypes. Moreover, in contrast to previous reports from China, serotype 7 was a dominant serotype in this work. High NT isolate rates were reported in all the previously described studies, and the two most prevalent serotypes were 5 and 4. However, if the NT strains are included, they become the third most dominant H. parasuis isolate group in Germany [9], USA/Canada [17] and Denmark [11], and even exceed the number of serotype 4 and 5 isolates in some cases [10,14]. Compared with serotyping by GID, mPCR substantially reduced the proportion of NT isolates from 27% to 7% in this work. The second most dominant H. parasuis isolate group changed from NT strains in the GID test to serotype 4 in the mPCR test. The number of NT isolates therefore played a key role in the serotype profile and prevalence order for H. parasuis. It is also worth noting that a defect in capsule expression can still exist even though an entire capsule locus may be present in an isolate. Consequently, it is very important to determine whether NT strains are capsulated or non-capsulated.
Capsule locus analysis of the NT strains from mPCR revealed the presence of a single deletion or NSSS at the position of a serotype-specific gene adopted by the mPCR for K3, H38, and H39, and multiple NSSSs at the serotype-specific region within this locus for YT, 16, HPS4, and H47. It is clear that the deletion of the serotype-specific target and emergence of NSSS within the capsule locus produced NT isolates in the mPCR test and in the accompanying in silico serotype analysis. In previous research [28], the deletions and NSSSs identified also resulted in a lower concordance between the mPCR and in silico serotype analysis. Similarly, insertions and deletions also caused discrepancies between the phenotypic and genotypic serotyping of Shigella flexneri [33]. To some extent, finding deletions and NSSSs in H. parasuis DNA probably indicates unstable serotype-specific regions within its capsule locus.
Overall, the principal serotype profile from mPCR and GID in our research was the same as or similar to the profiles in most previous reports. As a genotypic serotyping method, mPCR is superior to phenotypic serotyping based on GID. In terms of the improved detection rate for typable isolates, the stability of the clinical test, and the compatibility of the results between different laboratories, mPCR will be a valuable alternative to the traditional serotyping methods used for typing field isolates of H. parasuis. Investigation of capsule expression and capsule structures is now required to explore the origins of NT strains. Additionally, future efforts should be directed towards finding more stable serotype-specific genes to remove the adverse impact of deletions and NSSSs on the mPCR test.