A New Oligonucleotide Microarray for Detection of Pathogenic and Non-Pathogenic Legionella spp.

Legionella pneumophila has been recognized as the major cause of legionellosis since the disease was discovered. Legionella spp. other than L. pneumophila were later found to be responsible for many non-pneumophila infections. These non-L. pneumophila infections are likely under-detected because effective diagnostic tools are lacking. In this report, we sequenced the 16S-23S rRNA gene internal transcribed spacer (ITS) of 10 Legionella species and subspecies, namely L. anisa, L. bozemanii, L. dumoffii, L. fairfieldensis, L. gormanii, L. jordanis, L. maceachernii, L. micdadei, L. pneumophila subspp. fraseri, and L. pneumophila subspp. pascullei, and used these sequences to develop a rapid oligonucleotide microarray that identifies the 12 most common Legionella spp. and subspp. These comprise 11 pathogenic species and subspecies (L. anisa, L. bozemanii, L. dumoffii, L. gormanii, L. jordanis, L. longbeachae, L. maceachernii, L. micdadei, and L. pneumophila, including subspp. pneumophila, subspp. fraseri, and subspp. pascullei) and one non-pathogenic species, L. fairfieldensis. Twenty-nine probes that reproducibly detected multiple Legionella species with high specificity were included in the array. A total of 52 strains, including 30 target pathogens and 22 non-target bacteria, were used to verify the oligonucleotide microarray assay. The detection sensitivity was 1.0 ng of genomic DNA or 13 CFU/100 mL of Legionella culture. The microarray correctly classified all seven samples of air conditioner-condensed water tested (100% accuracy), supporting the technique as a promising method for applications in basic microbiology, clinical diagnosis, food safety, and epidemiological surveillance. The phylogenetic study based on the ITS also revealed that the non-pathogenic L. fairfieldensis is more closely related to L. pneumophila than are the nine other pathogenic Legionella spp.


Introduction
Legionella acquired its name after an outbreak of a then-unknown "mystery disease" among people attending a convention of the American Legion in July 1976; the outbreak affected 221 persons and eventually caused 34 deaths. This epidemic, which occurred within days of the 200th anniversary of the signing of the Declaration of Independence, was widely publicized and raised great concern in the United States [1]. A few months later, the causative agent was identified as a previously unknown bacterium, which was subsequently named Legionella. This gram-negative genus includes species responsible for legionellosis, or Legionnaires' disease, with Legionella pneumophila as the most notable species [2,3]. Since then, more than 52 Legionella spp. have been identified (http://www.bacterio.cict.fr/l/legionella.html) [4,5]. Although L. pneumophila remains the major cause of legionellosis, non-pneumophila infections have been reported to be caused by Legionella micdadei (60%), Legionella bozemanii (15%), Legionella dumoffii (10%), Legionella longbeachae (5%), and other species (10%) [6].
Infections due to species other than L. pneumophila are likely to be underestimated because of a lack of appropriate diagnostic tests [6]. Since Legionella was first identified in 1977, various diagnostic tools for Legionella have been developed, including cell culture, antigen detection, serological typing, polymerase chain reaction (PCR), and microarray methods. The culture method is time-consuming because Legionella spp. grow slowly, and it fails to distinguish Legionella spp. at the species level [7]. The detection of Legionella antigen in urine by enzyme immunoassays is highly specific, but commercially available systems using this approach can detect only L. pneumophila serogroup O1 and not other serogroups [8]. Serological typing methods with monoclonal and polyclonal antibodies can be used to detect L. pneumophila only with the aid of laborious pre-culture [9,10].
Currently, most PCR methods target 5S rRNA, 16S rRNA, 23S-5S ribosomal RNA intergenic spacer, mip, rpoB, and gyrB genes [11-17]. However, the 5S, 16S, and 23S-5S rRNA genes are too conserved to differentiate L. pneumophila from other Legionella spp. [18]. While the mip gene was initially used as an L. pneumophila-specific marker [19], other Legionella spp. were later found to harbor this gene as well [20,21]. A previous study conducted a multilocus sequence analysis of 16S rRNA, mip, and rpoB [22], and found that 16S rRNA was useful for initial identification because it could recognize isolates robustly at the genus level, while mip, rpoB, and the mip-rpoB concatenation could be used to distinguish between different Legionella spp. However, multiplex PCR and sequencing are required for the identification, which renders this method cumbersome and time-consuming. A gyrB gene-based single PCR method was developed for the differentiation of L. pneumophila subspp. pneumophila and L. pneumophila subspp. fraseri, but not for other Legionella spp. [16]. An oligonucleotide array based on mip gene sequences and digoxigenin-labeled PCR products was developed to identify 18 species of Legionella that have been reported to cause human infections, but the results are not reliable because some of the species produced only weak hybridization signals [13]. Another oligonucleotide microarray, based on the wzm and wzt gene sequences and Cy3-labeled PCR products, was developed to serotype all 15 distinct O-antigen forms within L. pneumophila [23].
In this study, we report the establishment of an oligonucleotide microarray method for the simultaneous detection of 11 pathogenic Legionella spp. and subspp., namely L. anisa, L. bozemanii, L. dumoffii, L. gormanii, L. jordanis, L. longbeachae, L. maceachernii, L. micdadei, and L. pneumophila (including subspp. pneumophila, subspp. fraseri, and subspp. pascullei), and one non-pathogenic species, L. fairfieldensis, based on the 16S-23S rRNA gene internal transcribed spacer (ITS) regions. The microarray method described here is specific, sensitive, and reliable, and can be used as a better alternative to the traditional serotyping procedure, which is laborious and frequently cross-reactive.

Bacterial strains
The following standard Legionella spp. strains were used for ITS sequencing: L.  Table 1, and included 30 strains of the Legionella target species and 22 other non-target bacterial species. Of these 52 strains, 41 were reference strains and 11 were clinical or environmental isolates. Legionella strains were cultured onto buffered charcoal yeast extract (BCYE) agar plates (Hope Bio-technology Co., Ltd, Qingdao, China) and incubated in a 5% CO2 incubator at 37°C for 2-4 days.

Genomic DNA preparation
Genomic DNA was extracted from pure cultures using a bacterial genomic DNA purification kit (Tiangen Biotech Co., Ltd., Beijing, China).

Amplification of Legionella spp. ITS regions
The primer pair wl-5793 and wl-5794 was designed on the basis of the 16S rRNA gene and 23S rRNA gene sequences, respectively, using Primer Premier 5.0 software (Premier Biosoft International, CA) [24,25]. These primers were used to amplify the ITS region of all Legionella spp. The primer sequences and

Cloning and sequencing of the ITS regions of Legionella spp. and subspp.
PCR amplicons were cloned into the pGEM-T Easy vector (Promega, MA) and transformed into E. coli DH5α. Transformants (observed as white colonies grown on ampicillin plates containing isopropyl-beta-D-thiogalactopyranoside and X-gal) were selected randomly. Plasmid DNA was isolated using the conventional alkaline lysis method, digested with EcoRI, and visualized on agarose gels to confirm the presence of the corresponding inserts. Sequences were verified using

Sequence analysis
Multiple sequence alignment of ITS sequences was carried out with the ClustalW program (http://www.ebi.ac.uk/clustalw/). The identity level was calculated using BioEdit software (http://www.mbio.ncsu.edu/BioEdit/page2.html). Phylogenetic trees were constructed using the neighbor-joining method and plotted with the molecular evolutionary genetics analysis (MEGA) 3.1 software package (http://www.megasoftware.net). Bootstrap analysis was carried out based on 1,000 replicates.
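As a rough illustration of the bootstrap step, the sketch below generates replicates by resampling alignment columns with replacement, which is the usual way a bootstrap data set is produced before a tree is rebuilt from each replicate. The toy alignment, the sequence names, and the function name are hypothetical and used only for illustration; this is not the MEGA implementation, and the tree-building step itself is omitted.

```python
import random

def bootstrap_alignment(aligned, rng):
    """Return one bootstrap replicate of a multiple sequence alignment.

    aligned: dict mapping sequence name -> aligned sequence (equal lengths).
    Columns are drawn with replacement, so the replicate has the same
    number of columns as the input alignment.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Pick n_cols column indices at random, with replacement.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in aligned.items()}

# Toy alignment (hypothetical sequences, not real ITS data).
toy = {"sp1": "ACGT-ACGTA", "sp2": "ACGTTACGAA", "sp3": "ACCT-ACGTA"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
```

Each replicate would then be passed to the same neighbor-joining procedure, and the support for a branch is the fraction of the 1,000 replicate trees that contain it.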

Target DNA amplification and labeling
Primer concentrations were optimized according to the final intensity of the microarray hybridization signals. The PCR mixtures contained 1× PCR buffer (50 mM KCl, 10 mM Tris-HCl; pH 8.3), 2.5 mM MgCl2, 400 µM dNTP, 0.2 µM of each ITS primer, 2.5 U Taq DNA polymerase, and 50-100 ng of DNA template in a final volume of 25 µL. The following PCR parameters were employed: initial denaturation at 95°C for 5 min; followed by 35 cycles of 95°C for 30 s, 50°C for 30 s, and 72°C for 1 min; and a final extension at 72°C for 5 min. The amplified DNA was analyzed by agarose gel electrophoresis of a 2-µL aliquot of the PCR product (Fig. S1). To label the PCR products, 10 µL of the PCR products generated from the first run, the reverse primer, and 0.3 µL of 25 nM Cy3-dUTP were added to the PCR mixture, and PCR was carried out using the same conditions described above.
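The cycling program above can be written down as data, which makes it easy to check the programmed run time. The short sketch below does only that arithmetic; the variable and function names are illustrative, and ramp times between temperatures (which depend on the thermocycler) are deliberately ignored.

```python
# Cycling program as stated in the text: (temperature in °C, hold time in s).
INITIAL_DENATURATION = (95, 300)
CYCLE_STEPS = [(95, 30), (50, 30), (72, 60)]  # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = (72, 300)

def total_hold_seconds():
    """Sum of programmed hold times, excluding instrument ramp times."""
    per_cycle = sum(duration for _, duration in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]

print(total_hold_seconds() / 60)  # 80.0 minutes of programmed hold time
```

Because the labeling reaction reuses the same conditions, the two PCR rounds together account for roughly 160 min of programmed hold time.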

Oligonucleotide probe design
The conserved and variable regions of the ITS sequences were defined by aligning multiple ITS sequences using ClustalW. For each type of pathogen, two to four probes were designed using MEGA 3.1 based on the sequences from the GenBank database or from our lab data, and checked with Primer Premier 5.0. One probe based on the 16S rRNA gene was designed as the positive control (OA-1993). A probe containing 40 poly(T) oligonucleotides was used as the negative control (WL-4006). A probe containing 40 poly(T) oligonucleotides labeled with 3′-Cy3 was used as the positional reference and printing control (Cy3). Each probe comprised a 5′-amino modification followed by a spacer of 10 to 15 poly(T)s and a stretch of specific sequence (synthesized by AuGCT Biotechnology Corporation, Beijing, China). All the oligonucleotide probes used in this study are listed in Table 3.
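The probe layout described above (poly(T) spacer followed by the species-specific stretch) can be sketched as a small helper. The function name and the example sequence are hypothetical; the actual probe sequences are those listed in Table 3, and the 5′ modification is a synthesis detail that is only noted in a comment here.

```python
def build_probe(specific_seq, spacer_len=12):
    """Return the probe sequence (5'->3'): poly(T) spacer + specific stretch.

    The 5' modification used for slide attachment is added at synthesis and
    is not part of the returned nucleotide string.
    """
    if not 10 <= spacer_len <= 15:
        raise ValueError("spacer length must be between 10 and 15 thymines")
    seq = specific_seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("specific sequence must be non-empty and contain only A, C, G, T")
    return "T" * spacer_len + seq

# Negative control as described in the text: 40 thymines.
NEGATIVE_CONTROL = "T" * 40

# Hypothetical specific stretch, for illustration only.
example_probe = build_probe("acgtacgtacgtacgtacgt", spacer_len=12)
```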

Microarray preparation
The probes were dissolved in 50% dimethyl sulfoxide (DMSO) to a final concentration of 1 mg/mL and coated onto aldehyde group-modified glass slides (CapitalBio Corporation, Beijing, China) using SpotArray 72 (Perkin-Elmer Corporation, CA, USA). Each probe was spotted in triplicate. Coated slides were dried and stored at room temperature in the dark. Each glass slide contained eight individual arrays framed with an 8-sample cover slip containing individual reaction chambers. A schematic diagram of the probe positions on the microarray is shown in Fig. 1.

Microarray hybridization and data analysis
All labeled PCR products were precipitated using 100% cold ethanol, centrifuged at 13,000 × g for 10 min, washed with 75% ethanol, and dried at room temperature. The dried, labeled DNA was diluted in 16 µL of hybridization buffer (50% formamide, 6× SSC, 5× Denhardt's solution, and 0.5% SDS) and then hybridized with the prepared microarray at 45°C for 12 h. After hybridization, the slide was washed with solution A (1× SSC and 0.1% SDS) for 3 min, solution B (0.05× SSC) for 3 min, and solution C (95% ethanol) for 1.5 min. The microarray was dried under a gentle air stream and scanned with a laser beam of 532 nm using the GenePix biochip scanner 4100A (Axon Instruments, CA, USA) set to the following parameters: photomultiplier tube gain, 600, and pixel size, 5 µm. The signal-to-noise ratio (SNR) was calculated for each spot using the built-in software, GenePix Pro 6.0, with the threshold set at 3.0. A signal was considered positive when 70% of the probes to a respective target gene generated hybridization signals above the SNR threshold.
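The positive-call rule can be expressed in a few lines. The sketch below is one possible reading of it, under two stated assumptions: "above the threshold" is taken as strictly greater than 3.0, and "70% of the probes" is taken as at least 70% of the SNR values supplied for a target. The function name and the example SNR values are hypothetical.

```python
SNR_THRESHOLD = 3.0   # threshold stated in the text
MIN_FRACTION = 0.70   # fraction of probes that must exceed the threshold

def target_is_positive(probe_snrs, snr_threshold=SNR_THRESHOLD,
                       min_fraction=MIN_FRACTION):
    """Call a target positive if enough of its probes exceed the SNR threshold.

    probe_snrs: SNR values, one per probe designed for the target.
    Assumption: 'at least min_fraction' and a strict '>' comparison.
    """
    if not probe_snrs:
        return False
    above = sum(1 for snr in probe_snrs if snr > snr_threshold)
    return above / len(probe_snrs) >= min_fraction

# Hypothetical SNR readings for a target with three probes.
print(target_is_positive([5.2, 4.1, 3.5]))  # True: 3 of 3 above threshold
print(target_is_positive([5.2, 4.1, 1.0]))  # False: 2 of 3 is below 70%
```

With two to four probes per target, this reading means a two- or three-probe target needs every probe above the threshold, while a four-probe target tolerates one failure.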

Test of mock samples
BCYE medium was used for proliferation. Pure cultures of L. bozemanii, L. dumoffii, and L. gormanii were serially diluted from 10^1 to 10^6 CFU/mL, and 1 mL of the diluent was mixed with 100 mL of fresh tap water from the laboratory and vacuum filtered through a 0.22-µm membrane. The membrane was treated with 500 µL of diluted HCl (pH 3.0) for 1 min, placed face-down on BCYE agar plates, and incubated in a 5% CO2 incubator at 37°C for 3-5 days. The genomic DNA was then extracted from the cells for microarray hybridization.

Test of air conditioner-condensed water samples
A filter-enriched air conditioner-condensed water sample (800 µL) was plated onto a GVPC agar plate (Hope Bio-Technology Co., Ltd., Qingdao, China) and incubated in a 5% CO2 incubator at 37°C for 48 h. Then, the culture was collected and genomic DNA was extracted for use in the downstream PCR and hybridization process.

Nucleotide sequence and microarray accession numbers
The ITS sequences of L. anisa, L. bozemanii, L. dumoffii, L. fairfieldensis, L. gormanii, L. jordanis, L. maceachernii, L. micdadei, L. pneumophila subspp. fraseri, and L. pneumophila subspp. pascullei were deposited into the GenBank database under the accession numbers KM609984-KM610004. The microarray dataset was deposited into the Gene Expression Omnibus database under the accession number GSE61962.

Legionella spp. ITS regions reveal interspecies variations
We sequenced the ITS regions of 10 Legionella spp. and subspp., including L. anisa, L. bozemanii, L. dumoffii, L. fairfieldensis, L. gormanii, L. jordanis, L. maceachernii, L. micdadei, L. pneumophila subspp. fraseri, and L. pneumophila subspp. pascullei. Next, we analyzed the ITS regions of the 12 Legionella spp. and subspp., namely the above 10 plus L. longbeachae and L. pneumophila subspp. pneumophila, whose sequences were previously published (NC_013861, CP0005672), using tRNAscan-SE software (http://lowelab.ucsc.edu/tRNAscan-SE/). The data indicated that, except for L. jordanis, which has three ITS types, ITS-tRNA Ala (with tRNA Ala gene), ITS-tRNA Ile (with tRNA Ile gene), and ITS-tRNA none (without tRNA gene), the 11 other Legionella spp. and subspp. all contain two distinct ITS types: ITS-tRNA Ala (with tRNA Ala gene) and ITS-tRNA Ile (with tRNA Ile gene). Alignments of the above ITS sequences revealed substantial interspecies variation (identity levels of 0.266-0.772 for ITS-tRNA Ala and 0.280-0.774 for ITS-tRNA Ile) but low intraspecies polymorphism (identity levels of 0.967-0.993 for L. pneumophila ITS-tRNA Ala and 0.943-0.996 for L. pneumophila ITS-tRNA Ile), suggesting that the ITS is a good target for species-specific identification.
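To make the identity figures concrete, the sketch below computes a pairwise identity level from two rows of an alignment. It assumes one common convention (columns where both sequences have a gap are skipped, and a gap against a base counts as a mismatch); BioEdit's exact gap handling may differ, and the example sequences are hypothetical.

```python
def pairwise_identity(a, b):
    """Fraction of identical positions between two aligned sequences.

    Assumption: columns that are gaps in both sequences are ignored;
    a gap opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0

# Hypothetical aligned fragments, not real ITS data.
print(pairwise_identity("AC--GT", "AC--GA"))  # 0.75
```

Values near 1.0, like the intraspecies range reported for L. pneumophila, indicate nearly identical sequences, whereas the interspecies values leave enough divergent positions to place species-specific probes.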

Phylogenetic analysis
We constructed two phylogenetic trees of the 12 Legionella spp. and subspp. based on the ITS-tRNA Ala and ITS-tRNA Ile gene sequences, with Staphylococcus aureus and Enterococcus faecium as the outgroup references for the two gene sequences (Fig. 2A and 2B). In the ITS-tRNA Ala tree, there are two subgroups: the first subgroup consists of L. pneumophila subspp. pneumophila, L. pneumophila subspp. fraseri, L. pneumophila subspp. pascullei, L. fairfieldensis, L. jordanis, L. maceachernii, and L. micdadei, while the second subgroup contains L. gormanii, L. anisa, L. bozemanii, L. dumoffii, and L. longbeachae. In the ITS-tRNA Ile tree, there are three subgroups: the first subgroup includes L. pneumophila subspp. pneumophila, L. pneumophila subspp. fraseri, L. pneumophila subspp. pascullei, and L. fairfieldensis; the second subgroup includes L. jordanis, L. maceachernii, and L. micdadei; and the third subgroup includes L. dumoffii, L. longbeachae, L. gormanii, L. anisa, and L. bozemanii. In both phylogenetic trees, L. pneumophila is most closely related to L. fairfieldensis; L. maceachernii is closest to L. micdadei; L. jordanis is in the neighborhood of L. fairfieldensis and L. maceachernii; and L. anisa is closest to L. bozemanii.

Probe specificity
A total of 52 strains were used to test the specificity of the designed probes. Probes that cross-hybridized or did not produce signals were eliminated from the test panel. After the screening, 32 probes (including 29 species-specific probes, one positive control probe, one negative control probe, and one positional and printing control probe) were selected (Table 3).
The microarray specifically identified the 30 target strains. For example, L. pneumophila (including subspp. pneumophila, subspp. fraseri, and subspp. pascullei) produced positive signals with its specific probes OA-3815, OA-3816, and OA-3817, as well as with the positive control probe OA-1993 and the positional and printing control probe Cy3, but not with the other probes [Fig. 3(1)]. Likewise, L. anisa produced positive signals with its specific probes OA-3819 and OA-3820 [ Fig. 3

Microarray sensitivity
The sensitivity of the microarray analysis was tested by hybridization with serially diluted genomic template DNA in amounts of 0.1, 1.0, 10, and 100 ng. Based on the positive signals generated, the sensitivity of the assay using genomic DNA was 1.0 ng DNA for L. pneumophila, L. longbeachae, and L. micdadei, and 0.1 ng DNA for L. dumoffii (Fig. S2).

Simultaneous detection of multiple pathogens
A detection method is more useful if it can detect multiple pathogens simultaneously. Genomic DNA from two pairs of pathogens, L. anisa with L. dumoffii and L. anisa with L. longbeachae, was mixed and used as template for testing. The results revealed that the probes hybridized specifically to the target regions of these pathogens, demonstrating that the designed probes are able to detect multiple pathogens in a single sample [Fig. 3(11) and 3(12)]. Next, genomic DNA from two groups of three pathogens, L. gormanii, L. jordanis, and L. fairfieldensis, or L. maceachernii, L. micdadei, and L. longbeachae, was mixed, and again the microarray probes successfully identified the multiple pathogens simultaneously [Fig. 3(13) and 3(14)].

Blind test
The specificity and sensitivity of the microarray detection system described were tested in a blind manner. Coded DNA samples from 22 species (Table 1)   (n = 1), Staphylococcus aureus (n = 1), and Streptococcus pneumoniae (n = 1). The results matched exactly those of the conventional detection methods (data not shown).

Test of mock samples
The mock samples containing L. bozemanii, L. dumoffii, and L. gormanii at various concentrations were tested, and the detection limits were found to be 18, 15, and 5 CFU/100 mL, respectively [Fig. 3(3), 3(4), and 3(5)]. On average, Legionella spp. could be detected at concentrations as low as 13 CFU/100 mL after filtering and culture enrichment.
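The average figure follows directly from the three per-species detection limits reported above; the short check below reproduces it (the variable names are illustrative).

```python
# Detection limits reported in the text, in CFU per 100 mL.
detection_limits = {"L. bozemanii": 18, "L. dumoffii": 15, "L. gormanii": 5}

mean_limit = sum(detection_limits.values()) / len(detection_limits)
print(round(mean_limit))  # 13, i.e. 38 / 3 = 12.67 rounded to the nearest CFU
```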

Test of real water samples and its confirmation by sequencing
Seven samples of condensed water from air conditioners, collected and provided by the Center for Disease Control and Prevention, Shanghai, China, were subjected to the microarray analysis. The hybridization profiles of the samples revealed that three of the seven samples were contaminated by L. pneumophila [Fig. 3(1)]. The remaining four samples generated signals with the positive control probe, indicating the presence of bacteria other than the ten Legionella spp. (Fig. S3). The presence of L. pneumophila in the three water samples was confirmed by PCR amplification and DNA sequencing of the L. pneumophila wzt genes [23].

Discussion
Bacterial species have at least one copy of the 16S rRNA gene, and the 16S-23S rDNA ITS region contains both highly conserved and hypervariable regions, which makes it a useful molecular marker for bacterial identification at the species [26] and subspecies levels [27,28], for typing [29,30], and for evolutionary studies [31,32]. In this report, we describe a microarray method for the determination of pathogenic and non-pathogenic Legionella spp. on the basis of their ITS regions.
A number of techniques have been adopted to improve the reproducibility and sensitivity of the microarray. First, we used a two-step PCR to amplify and label the samples. In the first step, the forward and reverse primers were used to amplify the target genes; in the second step, single-stranded DNA was labeled using the reverse primer. The two-step PCR scheme enhances the amplification efficiency and also generates intensely labeled products.
The microarray is sensitive, and as little as 1.0 ng DNA or 13 CFU/100 mL can be reliably detected. This is achieved using a two-step procedure of vacuum filtering and culture to enrich the target Legionella. After all the bacteria in a sample were collected by vacuum filtering, the acid-resistant Legionella were treated with HCl and selected on BCYE or GVPC agar [17]. The process described here allowed the detection of Legionella within 2-3 days, which is 9-10 days faster than the existing methods of identification (ISO 11731:1998).
We have developed a rapid oligonucleotide microarray to identify the above 12 Legionella spp. and subspp. based on the polymorphism of 16S-23S rRNA ITS sequences. A total of 52 strains were used to test the microarray assay, including 30 target pathogens and 22 closely related bacteria. The 29 probes selected reproducibly detected multiple pathogens with high specificity and with a sensitivity of 1.0 ng genomic DNA or 13 CFU/100 mL following filtering and culture enrichment. The correct classification of all seven air conditioner-condensed water samples validated the microarray. Our findings indicate that the oligonucleotide microarray technique presented in this study is a promising method for basic microbiology, clinical diagnosis, food safety, and epidemiological surveillance.
In conclusion, this study presents a new PCR-based microarray assay for the comprehensive and simultaneous detection and identification of ten Legionella spp. This new method provides an accurate and reliable approach to differentiate among Legionella isolates at the species level, contributes significantly to large-scale epidemiology studies, and can be used to monitor local, regional, and national trends in human legionellosis.