Molecular Surveillance of True Nontypeable Haemophilus influenzae: An Evaluation of PCR Screening Assays

Background: Unambiguous identification of nontypeable Haemophilus influenzae (NTHi) is not possible by conventional microbiology. Molecular characterisation of phenotypically defined NTHi isolates suggests that up to 40% are Haemophilus haemolyticus (Hh); however, the genetic similarity of NTHi and Hh limits the power of simple molecular techniques such as PCR for species discrimination.
Methodology/Principal Findings: Here we assess the ability of previously published and novel PCR-based assays to identify true NTHi. Sixty phenotypic NTHi isolates, classified by a dual 16S rRNA gene PCR algorithm as NTHi (n = 22), Hh (n = 27) or equivocal (n = 11), were further characterised by sequencing of the 16S rRNA and recA genes then interrogated by PCR-based assays targeting the omp P2, omp P6, lgtC, hpd, 16S rRNA, fucK and iga genes. The sequencing data and PCR results were used to define NTHi for this study. Two hpd real time PCR assays (hpd#1 and hpd#3) and the conventional iga PCR assay were equally efficient at differentiating study-defined NTHi from Hh, each with a receiver operator characteristic curve area of 0.90 [0.83; 0.98]. The hpd#1 and hpd#3 assays were completely specific against a panel of common respiratory bacteria, unlike the iga PCR, and the hpd#3 assay was able to detect below 10 copies per reaction.
Conclusions/Significance: Our data suggest an evolutionary continuum between NTHi and Hh, and therefore no single gene target could completely differentiate NTHi from Hh. The hpd#3 real time PCR assay proved to be the superior method for discrimination of NTHi from closely related Haemophilus species, with the added potential for quantification of H. influenzae directly from specimens. We suggest the hpd#3 assay would be suitable for routine NTHi surveillance and for assessing the impact of antibiotics and vaccines on H. influenzae carriage rates, carriage density, and disease.


Introduction
Globally, respiratory diseases cause an estimated 1.9 million deaths per year [1]. One of the most important aetiological organisms of both adult and childhood respiratory disease is nontypeable Haemophilus influenzae (NTHi) [2]. NTHi is frequently isolated from the respiratory tract during episodes of sinusitis, otitis media and pneumonia and is the most common cause of chronic obstructive pulmonary disease and bronchiectasis exacerbations [2,3,4].
Traditionally, Haemophilus species have been differentiated by growth requirements, including X (haemin) and V (nicotinamide adenine dinucleotide) factors, and phenotypic traits such as hydrogen sulphide production, ornithine decarboxylase production and haemolytic activity [5]. Serological methods such as latex and slide agglutination, or PCR assays targeting genes involved with capsule production such as the bex genes [6], are used to identify encapsulated strains of H. influenzae. Unfortunately, conventional microbiology does not readily distinguish NTHi from its close relative Haemophilus haemolyticus (Hh). Haemolysis on horse or rabbit blood agar plates provides the simplest phenotypic difference, although the use of these blood plates for NTHi identification is not commonplace. Furthermore, with the discovery of the nonhaemolytic Haemophilus haemolyticus phenotype [5], accurate identification of NTHi has become increasingly difficult.
Outside whole genome sequencing, multilocus sequence analysis provides the most accurate identification of true NTHi [7,8]. However, this technique is both expensive and labour intensive and is not practical for routine screening. The identification of a single gene target for simple consistent identification of NTHi from Hh and variant species would be useful for surveillance and intervention studies to determine the true burden of disease caused by NTHi; however, absolute discrimination of these species may not be possible with this simple strategy.
Several gene targets have been explored for speciation of NTHi, Hh and closely related variants, with varying results. These include the lipo-oligosaccharide gene lgtC [7], the IgA protease gene iga [7,8], the fuculose kinase gene fucK [9], the pilus gene pilA [10] and the 16S rRNA gene [11]. Recently, Wang et al. exploited the species heterogeneity of the protein D gene (hpd) to develop two probe-based real time PCR assays (hpd#1 and hpd#3). These hpd#1 and hpd#3 assays demonstrated excellent sensitivity against a clinically diverse collection of 102 NTHi isolates (96% and 98% respectively) [12]. Two outer membrane protein genes of NTHi (omp P6 and omp P2) are well conserved and have also been used as PCR targets. However, a real time PCR assay targeting the omp P6 gene was unable to differentiate NTHi from Hh [13], and a recent study demonstrated limited NTHi sensitivity of an omp P2 real time PCR [12].
Molecular analysis of NTHi collections has revealed a significant presence of Hh in some studies. Among phenotypic NTHi isolates obtained from adult chronic obstructive pulmonary disease patients in the United States, 27% (n = 44) from the nasopharynx and 40% (n = 258) from sputum were found to be Hh using 16S rRNA gene PCR [11]. The same method was used to scrutinise 266 phenotypically-defined NTHi isolates obtained from the nasopharynx of otitis-prone and control children (aged 6 to 36 months) in Western Australia. In this study, 79% were designated true NTHi, 12% were Hh and 9% were indeterminate [14]. In contrast, a Danish study re-examined 480 predominantly non-invasive but clinically-associated H. influenzae isolates identified by dominant microbiological growth, using probe hybridisation with fucK, the adherence and penetration protein gene hap and the superoxide dismutase gene sodC, and reclassified only 0.4% (2/480) as Hh [9]. Of the 250 middle ear isolates cultured in the aforementioned studies, none were demonstrated to be Hh or variant strains [9,11,14].
In remote Indigenous communities of the Northern Territory, otitis media affects approximately 90% of children less than 2 years of age [15], and 20% of children are hospitalised with an acute lower respiratory infection in their first year of life [16]. In this population NTHi is a more dominant pathogen in terms of prevalence than either Streptococcus pneumoniae or Moraxella catarrhalis [17,18] with nasopharyngeal isolation in up to 80% of Indigenous children 3-7 years of age [19], and ear discharge isolation in 21% of Indigenous children with chronic suppurative otitis media (mean age 8 years) [20]. Where there is a high burden of respiratory disease, it is of particular interest to be able to determine the relative contribution of true NTHi and Hh.
Although numerous potential gene targets have been evaluated, definitive identification of NTHi from Hh and variant species with a single gene target has not been demonstrated. Furthermore, there has been no direct comparison of many of the targets investigated to date. In this paper, we assess and compare the ability of a selection of existing and novel PCR-based assays to identify true NTHi from a genetically diverse selection of phenotypic NTHi isolates.

Results
Sixty phenotypic NTHi isolates, classified by 16S rRNA gene PCR [11] as NTHi (n = 22), Hh (n = 27) or equivocal (n = 11), were further characterised by sequencing of the 16S rRNA and recombinase A (recA) genes then interrogated with a selection of PCR-based assays designed to exclusively identify NTHi.

Sequence phylogeny
Sequencing of the 16S rRNA and recA genes was performed on the 60 study isolates and 2 reference isolates (H. influenzae ATCC 19418 and H. haemolyticus ATCC 33390), yielding 598 and 543 unambiguous bases respectively. Accession numbers are provided in Table S1. Using the Neighbour-Joining algorithm, radial phylogenetic trees (Figure 1) were constructed from the individual and concatenated 16S rRNA and recA gene sequences. Six GenBank sequences (Table 1) were included for reference and all trees were rooted by the GenBank reference sequence of Haemophilus parainfluenzae T3T1.
For each tree, distally discrete clusters of NTHi and Hh were evident, interspersed by less well defined isolates as shown in Figure 1. The sequenced H. influenzae reference strain (ATCC 19418) grouped with the strict cluster of NTHi isolates by recA sequence but was less well defined by the 16S rRNA gene sequence; however, its sequence similarity across both 16S rRNA and recA gene sequences was sufficient to place it with the strict NTHi isolates on the concatenated tree. Also on the concatenated tree, two study isolates (40 and 41) diverged toward the H. parainfluenzae (T3T1) root isolate, and one isolate (38) grouped with the T3T1 root isolate (Figure 1). The diversity revealed in the phylogeny precluded complete dichotomous speciation.

PCR of study isolates
The 60 phenotypic NTHi study isolates were subsequently screened with 6 conventional PCR assays targeting the omp P2, omp P6, lgtC, 16S rRNA, fucK and iga genes, 2 real time PCR assays targeting the hpd gene, and a single PCR high resolution melt (PCR-HRM) assay designed to exploit a single nucleotide polymorphism (SNP) in the omp P6 gene (omp P6-HRM) as shown in Figure 2. Overall the PCR assays revealed positivity ranging from 47% (16S rRNA) to 72% (omp P6). The omp P6-HRM separated the isolates into two distinct melt curves (Figure 2). The upper group of curves included the H. influenzae reference strain ATCC 19418 and the lower group of curves included the Hh reference strain ATCC 33390. The inclusion of standards in the real time PCR assays allowed an estimation of the limit of detection (LOD). For the hpd#3 assay the LOD was below 10 copies per reaction at approximately cycle 36. The PCR results are displayed in conjunction with the 16S rRNA and recA concatenated phylogeny in Figure 3.

Strict definition of NTHi
When assessed by the strict phylogenetic definition of NTHi shown in Figure 3 (Strict NTHi cluster), most PCR assays returned positive results. Only lgtC (4 negatives), iga (2 negatives), hpd#1 and hpd#3 (1 negative each) had less than 100% sensitivity. Against this strict definition, the 16S rRNA gene PCR assay demonstrated the greatest combination of sensitivity and specificity with a receiver operator characteristic (ROC) curve area of 91%, whereas all other assays had poor specificity (44%-70%). The inability of the 16S rRNA gene and recA sequence data to separate isolates into distinct species, and the detection of multiple NTHi target genes outside the strict NTHi group, led to the designation of several isolates as "fuzzy species" (Figure 3).

Study-defined NTHi
For pragmatic assessment of the PCR assays, we enforced a study definition of NTHi. Isolates were interpreted as NTHi if they demonstrated a distinct phylogenetic grouping (16S rRNA and recA concatenated sequences), had approximately 97% DNA similarity or greater with the strict NTHi isolate 86-028NP (16S rRNA and recA concatenated sequences) and possessed most of the target genes (PCR results). Figure 3 displays our study-defined NTHi isolates, which include both the strict NTHi cluster and the fuzzy species.
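The three criteria above can be read as a simple decision rule. The following Python sketch is illustrative only and is not the authors' procedure: the function names are hypothetical, "most of the target genes" is assumed to mean more than half of the assays, the approximate 97% cutoff is treated as a hard threshold, and phylogenetic grouping is passed in as a flag because it was judged from the trees rather than computed.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def is_study_defined_nthi(groups_with_nthi: bool,
                          identity_to_reference: float,
                          pcr_results: dict,
                          identity_cutoff: float = 97.0) -> bool:
    """Apply the three study criteria as a single rule.

    groups_with_nthi      -- phylogenetic grouping, judged from the tree
    identity_to_reference -- % identity to the strict NTHi isolate 86-028NP
    pcr_results           -- assay name -> True (positive) / False (negative)
    identity_cutoff       -- assumed hard threshold for "approximately 97%"
    """
    positives = sum(1 for result in pcr_results.values() if result)
    # Assumption: "most" means strictly more than half of the assays.
    has_most_targets = positives > len(pcr_results) / 2
    return bool(groups_with_nthi
                and identity_to_reference >= identity_cutoff
                and has_most_targets)


# Toy example: 598 + 543 = 1141 concatenated bases, 34 mismatches.
reference = "A" * 1141
isolate = "C" * 34 + "A" * 1107
toy_pcr = {"omp P2": True, "lgtC": True, "hpd#3": True, "16S rRNA": False}
print(is_study_defined_nthi(True, percent_identity(reference, isolate), toy_pcr))
```

On the 1141 concatenated bases (598 + 543), a 97% threshold corresponds to at most 34 mismatches with the reference sequence.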

Assessment of PCR assays against study-defined NTHi
Sensitivity, specificity and ROC curve areas were calculated for each assay against study-defined NTHi. The hpd#1, hpd#3 and iga PCR assays were equally most accurate, each returning a sensitivity, specificity and ROC curve area of 89%, 92% and 90% respectively (Table 2). The omp P2 and lgtC assays had ROC curve areas of 86% and 88% respectively. The remainder of assays had ROC curve areas below 81%.
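For a binary (positive/negative) assay the empirical ROC curve has a single operating point, and the trapezoidal area under it reduces to the mean of sensitivity (Se) and specificity (Sp). Assuming the areas were obtained this way, which the text does not spell out, the rounded figures for the hpd#1, hpd#3 and iga assays are mutually consistent:

```latex
\mathrm{AUC} \;=\; \frac{\mathrm{Se} + \mathrm{Sp}}{2}
\;=\; \frac{0.89 + 0.92}{2}
\;=\; 0.905
```

Because the inputs are themselves rounded percentages, this agrees with the reported 90% to within rounding.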

Overview and significance
Accurate identification of NTHi is important to establish the relationship of this pathogen with carriage and infection. In the Northern Territory of Australia, where NTHi is a major cause of respiratory disease and otitis media, several randomised controlled trials are underway to assess the effect of antibiotics such as azithromycin, and vaccines including the pneumococcal H. influenzae protein D conjugate vaccine (PHiD-10CV; Synflorix®), on NTHi carriage. These studies rely on conventional microbiology. Failure to discriminate NTHi from its non-pathogenic relatives can result in a mismatch of cause (NTHi) and effect (respiratory disease and otitis media). We recommend the hpd#3 assay for confirmation and future assessment of NTHi in carriage and disease.

Sequence phylogeny
To challenge the PCR assays we selected phenotypic NTHi isolates with significant 16S rRNA gene variability as indicated by Murphy's 16S rRNA PCR [11], and this diversity was confirmed by the 16S rRNA and recA sequencing. The Neighbour-Joining radial phylogenetic trees in Figure 1 demonstrate an evolutionary continuum between NTHi and Hh that was unchanged using the alternative algorithms, Minimum Evolution or Maximum Parsimony. This "fuzziness" between species suggests that it may be impossible to differentiate NTHi from Hh without broader genetic interrogation. Other studies have used multilocus sequence analysis for improved Haemophilus species delineation [7,8].
Considering NTHi and Hh are close evolutionary relatives that continue to inhabit an overlapping niche, maintaining their potential for genetic recombination, the lack of genetic distinction between these species is not surprising. The only obvious feature of the phylogeny is the strict cluster of NTHi isolates, which might indicate that NTHi has a successful genetic formula for causing infection that Hh does not. This is supported by the finding that isolates collected as a consequence of clinical examination or from sterile sites are predominantly NTHi [8,11,14]. In this clinical context current microbiological techniques are generally adequate; however, there have been recent reports of invasive disease caused by Hh [21].

Novel PCR assays
The two novel assays that were designed for this study targeted omp P6 and omp P2. The omp P6-HRM was set up to speciate NTHi from Hh via a SNP in the omp P6 gene. Unfortunately, like the omp P6 PCR, the omp P6-HRM over-represented NTHi. The apparent bimodal nature of the chosen omp P6 SNP was not consistent with the overall genetic diversity that defined NTHi and Hh in this study. PCR-HRM is a powerful and cost effective method of identifying amplicon heterogeneity and careful target selection can allow successful species resolution [22]; a combination of several PCR-HRM assays might provide enhanced discrimination. The omp P2 PCR assay was more accurate for detecting our study-defined NTHi, with 81% sensitivity and 92% specificity, and was among the best assays tested. This is comparable with a previous study, where omp P2 PCR demonstrated 85% sensitivity for NTHi [12].

PCR results
Six of the 9 PCR assays gave comparable results; the omp P2, lgtC, hpd#1, hpd#3, fucK and iga assays were positive for 50%-57% of isolates. The omp P6 and omp P6-HRM assays tended to over-represent NTHi (positive for 70% and 72% respectively), while the 16S rRNA gene assay detected only 47% of isolates, producing results more aligned to the strict NTHi phylogeny. Most of the assays were positive for isolates beyond the strict NTHi cluster, suggesting these fuzzy isolates were genetically more similar to NTHi than Hh. Thus, the study definition of NTHi considered the PCR results in addition to the phylogeny and similarity of the concatenated sequences (Figure 3).
Assessment of each PCR assay against the study definition of NTHi demonstrated that the 3 PCR assays, hpd#1, hpd#3 and iga, were equally sensitive and specific against the 60 study isolates; however, the iga PCR assay lacked specificity against the panel of reference organisms (Table 3). It should be noted that significant diversity exists among isolates of Haemophilus species and the results from the reference panel, comprising only 2 strains of NTHi and single strains of the other Haemophilus species, should be interpreted with caution.
In the recent publication by Wang et al. [12], the hpd#3 assay was demonstrated to be highly specific but did detect 1 of 2 H. aphrophilus isolates among a reference panel of 61 respiratory organisms representing 21 different species. The reported LOD was 70 copies per PCR reaction when limited to 35 cycles. In our study, the hpd#3 PCR assay did not detect any of the 12 non-H. influenzae reference organisms tested, including 1 H. aphrophilus isolate, and we consistently achieved a LOD of 10 copies per PCR reaction. When applied to predominantly invasive NTHi isolates, Wang et al. showed that the hpd#3 assay had a sensitivity of 98% (100/102). For our genetically diverse selection of study-defined NTHi isolates, the hpd#3 assay was positive for 89% (33/37).

Conclusion
In summary, no single gene target tested was able to unequivocally differentiate NTHi and Hh. Comparative genomic studies are required to identify the genetic determinants that enable NTHi to successfully invade sterile sites and cause disease.
The hpd#3 probe-based real time PCR assay was the best assay tested, having a superior combination of sensitivity, specificity and LOD for NTHi. Furthermore, this assay has applicability to both clinical isolates and clinical specimens and can be used to quantify bacterial density making it a valuable tool for more accurate monitoring of NTHi in nasopharyngeal carriage, otitis media and respiratory infection.

Study isolates
This study tested DNA extracted from stored isolates obtained during a hospital-based surveillance study conducted in Western Australia from 2007 to 2009 [23]. Nasopharyngeal swabs were collected from children 6-36 months of age who had been anaesthetised for either tympanostomy tube insertion or general surgery. NTHi was identified by colony morphology and dependence on X and V growth factors. Isolates were defined as non-typeable based on a lack of agglutination with typing sera (Bactus AB). DNA was extracted using the Wizard SV gDNA kit according to manufacturer's instructions (Promega). The 60 isolates included in this study were defined by Murphy's dual 16S rRNA gene PCR algorithm [11] as NTHi (n = 22), Hh (n = 27), or equivocal (n = 11). This dual assay generates equivocal results in approximately 10% of clinical and surveillance isolates [11,14].

Sequence phylogeny
Partial sequencing of the 16S rRNA and recA genes, both commonly used for bacterial classification [7,8,24], was performed on the 60 clinical isolates and 2 reference strains (H. influenzae ATCC 19418 and H. haemolyticus ATCC 33390) to assist speciation. A complete 16S rRNA gene amplicon of approximately 1500 bp and a partial recA amplicon of approximately 600 bp were generated using PCR primers and methodology described elsewhere [11,24]. Each amplicon was sequenced in singlicate using the reverse primer from the PCR. All sequencing was carried out by Macrogen, Korea. Six reference sequences were downloaded from GenBank [25] (www.ncbi.nlm.nih.gov, December 2010) to facilitate species identification (Table 1). Sequencing traces were assessed, cropped and exported for further analysis using the Lasergene software (DNASTAR, USA). Sequence alignments (ClustalW) and phylogenetic analyses (Neighbour-Joining, Minimum Evolution and Maximum Parsimony) were conducted using MEGA version 4 [26].

Established PCR assays
A literature search was conducted to identify gene target candidates from established PCR-based assays with a high degree of accuracy for identification of NTHi, including discrimination from Hh. Seven targets were chosen on the basis of their demonstrated sensitivity and specificity for NTHi, and the PCR assays were conducted based on the published methods (Table 4). DNA for standards was extracted from the reference strain ATCC 19418 and a dilution series ranging from 10 to 100000 genome copies per reaction was used to estimate the LOD of real time PCR assays.
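A dilution series like the one described is commonly analysed by fitting a standard curve of threshold cycle (Ct) against log10 copy number. The sketch below shows that generic approach and should not be read as the analysis performed in this study: the Ct values are invented for illustration (chosen so that 10 copies falls near cycle 36), and the function names are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 means perfect doubling (slope near -3.32)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of an unknown from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Standards spanning 10 to 100000 genome copies per reaction.
standards = [10, 100, 1_000, 10_000, 100_000]
# Invented Ct values, for illustration only.
cts = [36.0, 32.7, 29.4, 26.1, 22.8]

slope, intercept = fit_standard_curve(standards, cts)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"efficiency={amplification_efficiency(slope):.1%}")
```

The same fitted line is what would allow Ct values from clinical specimens to be converted into copy numbers when the assay is used quantitatively.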

Novel PCR assays
Publicly available sequence data [25] (www.ncbi.nlm.nih.gov, September 2010) were utilised to design novel assays targeting the two outer membrane protein genes, omp P2 and omp P6.

omp P2 PCR
Primers (Table 4) were selected from the conserved regions of omp P2 to generate a 186 base pair amplicon corresponding to nucleotide region 154223-154408 of the H. influenzae Rd KW20 complete genome, accession number L42023. PCR was performed using standard Taq PCR Core Kit reagents (Qiagen) with 0.5 µM of each primer and 1 µl of DNA extraction template in each 25 µl reaction. The annealing temperature was 55°C and cycling was repeated 35 times. PCR products were isolated by agarose gel electrophoresis, stained using SYBR® Safe DNA gel stain (Invitrogen), and visualised with the Gel Doc XR system in conjunction with Quantity One software (Bio-Rad).
Table 4 notes: All product sizes are based on the NTHi Rd KW20 genome. *Product size varies considerably across strains. **hpd probes were labelled with Hex at the 5′-end, SpC3 at the 3′-end, and Black Hole Quencher (BHQ) at the internal "T". F (forward primer); R (reverse primer); P (probe); HRM (high resolution melt). doi:10.1371/journal.pone.0034083.t004

omp P6-HRM
The omp P6 primers (Table 4) for PCR-HRM were selected around a conserved SNP (G for NTHi and T for Hh; as shown in Figure 2) corresponding to nucleotide 402465 (amplicon 402446-402485) of the H. influenzae Rd KW20 complete genome, accession number L42023. Real time PCR was performed using 5 µl of the 2× SensiMix™ SYBR® Green No-ROX reaction buffer (Quantace), 0.5 µM of each primer and 1 µl of DNA extraction template in each 10 µl reaction. Annealing was set at 58°C. Following 35 cycles of PCR, HRM was performed between 66°C and 76°C in 0.1°C increments for 2 seconds each. All thermocycling was done with the Rotor-Gene 6000 real time platform (Qiagen).
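The genome coordinates and melt settings given above can be cross-checked with a few lines of arithmetic. This Python sketch uses hypothetical helper names and assumes that coordinates are 1-based and inclusive, and that the melt includes a reading at both the 66°C and 76°C endpoints; neither assumption is confirmed by the text.

```python
def amplicon_length(start: int, end: int) -> int:
    """Length of an amplicon given inclusive genome coordinates."""
    return end - start + 1


def position_in_amplicon(genome_pos: int, start: int) -> int:
    """1-based position of a genome coordinate within the amplicon."""
    return genome_pos - start + 1


def hrm_steps(t_start: float, t_end: float, increment: float) -> int:
    """Number of melt readings, assuming both endpoints are read."""
    return round((t_end - t_start) / increment) + 1


# omp P2 amplicon on Rd KW20 (accession L42023)
print(amplicon_length(154223, 154408))        # 186, matching the stated size

# omp P6-HRM amplicon and the position of the G/T SNP within it
print(amplicon_length(402446, 402485))        # 40
print(position_in_amplicon(402465, 402446))   # 20

# Melt from 66 to 76 degrees C in 0.1 degree steps, 2 seconds per step
steps = hrm_steps(66.0, 76.0, 0.1)
print(steps, "readings,", steps * 2, "seconds of hold time")
```

Under these assumptions the omp P2 coordinates reproduce the stated 186 bp product, and the omp P6 SNP sits at the midpoint of a 40 bp amplicon.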

Reference isolates
The PCR assays were also conducted on a panel of reference isolates of common respiratory bacterial species to determine their broader specificity. The following species were included: H. influenzae, H. haemolyticus, H. parahaemolyticus, H. parainfluenzae, H. aphrophilus, Pasteurella multocida, Neisseria meningitidis, Pseudomonas aeruginosa, Streptococcus pneumoniae, Moraxella catarrhalis, Streptococcus pyogenes, Klebsiella pneumoniae and Staphylococcus aureus. Reference strains were sourced from either Microbiologics (Minnesota, USA) or locally from the culture collection at the Menzies School of Health Research (Northern Territory, Australia) as shown in Table 3.

Statistical Analysis
Sequence phylogeny, PCR results, and DNA similarity were used to define NTHi for this study. The sensitivity and specificity of each PCR method for study-defined NTHi were calculated and assessed by the area under the ROC curve. Analyses were conducted using STATA IC version 11 (StataCorp, Texas, USA).
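As a minimal sketch of these calculations (the study itself used STATA; this Python version, its function name and the toy data are illustrative assumptions), sensitivity and specificity can be tallied from paired assay and reference calls. For a binary assay, the ROC area computed by the trapezoidal rule equals the mean of the two.

```python
def diagnostic_accuracy(assay_positive, reference_positive):
    """Return (sensitivity, specificity, roc_area) for a binary assay.

    assay_positive     -- list of bools, one per isolate (PCR result)
    reference_positive -- list of bools, one per isolate (study-defined NTHi)
    """
    if len(assay_positive) != len(reference_positive):
        raise ValueError("both lists must describe the same isolates")
    tp = fp = tn = fn = 0
    for assay, reference in zip(assay_positive, reference_positive):
        if reference and assay:
            tp += 1          # NTHi correctly detected
        elif reference and not assay:
            fn += 1          # NTHi missed by the assay
        elif not reference and assay:
            fp += 1          # non-NTHi wrongly called positive
        else:
            tn += 1          # non-NTHi correctly negative
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Single operating point: trapezoidal ROC area is the mean of the two.
    roc_area = (sensitivity + specificity) / 2
    return sensitivity, specificity, roc_area


# Toy data, not study results: 5 reference-positive and 5 reference-negative.
reference = [True] * 5 + [False] * 5
assay = [True, True, True, True, False, False, False, False, False, True]
print(diagnostic_accuracy(assay, reference))
```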

Supporting Information
Table S1 Partial 16S and recA sequence accession numbers. (DOC)