Identification of Clinically Relevant Fungi and Prototheca Species by rRNA Gene Sequencing and Multilocus PCR Coupled with Electrospray Ionization Mass Spectrometry

Background Multilocus PCR coupled with electrospray ionization mass spectrometry (PCR/ESI-MS) is a new strategy for pathogen identification, but information about its application in fungal identification remains sparse. Methods One hundred and twelve strains and isolates of clinically important fungi and Prototheca species were subjected to both rRNA gene sequencing and PCR/ESI-MS. Three regions of the rRNA gene were used as targets for sequencing: the 5′ end of the large subunit rRNA gene (D1/D2 region), and the internal transcribed spacers 1 and 2 (ITS1 and ITS2 regions). Microbial identification (Micro ID), acquired by combining results of phenotypic methods and rRNA gene sequencing, was used to evaluate the results of PCR/ESI-MS. Results For identification of yeasts and filamentous fungi, combined sequencing of the three regions had the best performance (species-level identification rates of 93.8% and 81.8%, respectively). Among single regions, the highest species-level identification rate was achieved by sequencing of D1/D2 for yeasts (92.2%) and ITS2 for filamentous fungi (75.8%). The two Prototheca species could be identified to species level by D1/D2 sequencing but not by ITS1 or ITS2. Of the 102 strains and isolates within the coverage of PCR/ESI-MS identification, 87.3% (89/102) achieved species-level identification, 100% (89/89) of which were concordant with Micro ID at species/complex level. The species-level identification rates for yeasts and filamentous fungi were 93.9% (62/66) and 75% (27/36), respectively. Conclusions rRNA gene sequencing provides accurate identification information, with the best results obtained by a combination of ITS1, ITS2 and D1/D2 sequencing. Our preliminary data indicate that the PCR/ESI-MS method also provides rapid and accurate identification for many clinically relevant fungi.


Introduction
Over the past few decades, invasive fungal diseases (IFDs) have become a major challenge to public health, especially in hospitalized patients who are critically ill or immunocompromised. Despite considerable progress in antifungal drug development over this period, the mortality of IFDs remains high [1,2]. Because many pathogenic fungi show species-specific differences in susceptibility to antifungal agents, rapid and accurate identification of organisms to species level is essential for the early initiation and optimal choice of antifungal therapy, and optimal treatment is important for reducing mortality [3,4]. Most routine diagnostic laboratories use phenotypic identification based on morphological and biochemical analysis. Morphological identification requires technical expertise and a great deal of time, especially for less common species. Phenotypic methods have also long been criticized for low accuracy and database limitations [5,6].
The development of molecular approaches has greatly facilitated the identification of fungal pathogens. Many molecular identification methods have been established, such as PCR-based sequencing, restriction fragment length polymorphism (RFLP), pyrosequencing, matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS), and multilocus PCR coupled with electrospray ionization mass spectrometry (PCR/ESI-MS) [7][8][9]. Among these methods, PCR amplification followed by sequencing and pairwise alignment of amplicons has been widely accepted as the "gold standard" for fungal identification. PCR-based sequencing demonstrates better specificity and accuracy than traditional phenotypic methods [10,11]. Because of sufficient intraspecies conservation and interspecies specificity, the 5′ end of the large subunit rRNA gene (D1/D2 region) and the internal transcribed spacers 1 and 2 (ITS1 and ITS2 regions) between the 18S and 28S rRNA genes are the most frequently used sequencing targets [12][13][14]. Nonetheless, this method is limited by its requirement for DNA samples of high quality and quantity and by its poor performance in characterizing mixtures of organisms [15].
PCR/ESI-MS, which is performed on the commercial PLEX-ID system (Abbott Molecular Inc., Des Plaines, IL, USA), couples multilocus PCR amplification with automated electrospray ionization mass spectrometry analysis. It accurately weighs the amplicons, or mixtures of amplicons, to deduce the composition of A, G, C and T for each amplicon. The base compositions are then compared with a database of calculated base compositions derived from sequences of reference strains to provide pathogen identification [16,17]. The ability to characterize mixtures of organisms, together with a short turnaround time and a low DNA concentration requirement, makes it a promising method for fungal identification. Several studies have demonstrated its use in the identification of microbes and the diagnosis of bloodstream infections [17,18]. PCR/ESI-MS has been shown to perform well in identifying clinical Candida isolates to species level [19,20]. However, data on the ability of PCR/ESI-MS to identify other clinically relevant fungal isolates are still insufficient [21,22].
In the present study, we used rRNA gene sequencing and PCR/ESI-MS, as well as traditional phenotypic methods, to identify a series of clinically relevant fungi, including yeasts and filamentous fungi, and two Prototheca species. The identification results were compared among the different methods, in an effort to provide further evidence on the performance of these methods in the identification of pathogenic fungi in clinical settings.
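As a rough illustration of how a measured amplicon mass constrains base composition, the sketch below enumerates the A/C/G/T counts whose computed strand mass lies within a tolerance of a target mass. The average residue masses, the terminal correction, the tolerance, and the function names are assumptions chosen for illustration; they are not taken from this study or from the PLEX-ID software, whose signature database and scoring are not reproduced here.

```python
# Approximate average masses (Da) of deoxynucleotide residues within a DNA
# strand. Illustrative values only, not taken from the paper.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_WATER = 18.02  # assumed correction for the strand termini


def strand_mass(comp):
    """Approximate mass of a single strand from its base counts."""
    return sum(RESIDUE_MASS[b] * n for b, n in comp.items()) + TERMINAL_WATER


def candidate_compositions(target_mass, length, tol=0.5):
    """List base compositions of the given length within tol Da of target."""
    hits = []
    for a in range(length + 1):
        for c in range(length + 1 - a):
            for g in range(length + 1 - a - c):
                t = length - a - c - g
                comp = {"A": a, "C": c, "G": g, "T": t}
                if abs(strand_mass(comp) - target_mass) <= tol:
                    hits.append(comp)
    return hits


# Hypothetical 40-nt strand used only to exercise the functions.
true_comp = {"A": 10, "C": 8, "G": 12, "T": 10}
measured = strand_mass(true_comp)
candidates = candidate_compositions(measured, length=40)
```

More than one composition may fall within a given tolerance of a single mass, which is one reason mass accuracy matters for this kind of inference.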

Phenotypic identifications
Phenotypic identification of all the isolates was performed in the clinical mycology laboratory of Huashan Hospital, Fudan University. Yeast species were identified by their colony morphology and microscopic morphology on Sabouraud dextrose agar (BD Biosciences, Cockeysville, MD, USA) and CHROMagar Candida plates (BD Biosciences). Biochemical testing with the API 20C AUX system (bioMérieux, Marcy l'Étoile, France) was used as a supplement for yeast identification. L-canavanine glycine bromothymol blue (CGB) agar was used to differentiate C. neoformans from C. gattii [23,24]. Filamentous fungi were identified by their colony morphology and microscopic morphology on Sabouraud dextrose agar (BD Biosciences, Cockeysville, MD, USA) and Czapek-Dox agar (Difco Laboratories, Detroit, MI, USA).

Amplification and sequencing of the rRNA gene regions
The QIAamp DNA Mini Kit (QIAGEN, Valencia, CA, USA) was used for extraction of fungal, bacterial and human genomic DNA according to the protocol in the Qiagen handbook. Three pairs of fungus-specific universal primers were used for amplification of the three rRNA gene regions (ITS1, ITS2 and D1/D2) (Table 1) [25][26][27]. PCR was performed in a total reaction volume of 25 μL containing 1.5 mM MgCl2, 0.1 mM of each dNTP, 1.25 U of Taq DNA polymerase, 0.2 μM of each primer, 1× PCR buffer and 3 μL of DNA template. The PCR was run for 35 cycles of 30 s denaturation at 94°C, 30 s annealing at 61°C, and 1 min extension at 72°C, followed by a final extension for 6 min at 72°C. The amplicons were sequenced using the ABI 3730 XL DNA Analyzer (Applied Biosystems/Hitachi, Foster City, CA). Pairwise sequence alignment was performed against the GenBank database (www.ncbi.nlm.nih.gov/BLAST/).
The criteria for judging the results of rRNA gene sequencing and alignment were adopted from previous studies with modification as follows [14,28]: A strain or isolate was assigned to species level if the best matching reference species showed ≥98% homology and the next best matching reference species showed at least 0.8% less sequence homology. A strain or isolate was assigned to genus level when there was 95% to 98% homology to the best matching species, or when more than one sequence entry of several species from the same genus showed ≥98% homology. "No identification (NI)" was defined as <95% homology with the best matching reference species, or homology of >95% with multiple sequences of various genera. For each strain and isolate, a multiple rRNA gene sequencing identification result was integrated from the alignment results of ITS1, ITS2 and D1/D2 as follows: species-level identification was defined when sequencing of any of the three regions yielded species-level identification and the results from the other regions were the same species, the same genus, or NI; genus-level identification was defined when the three regions indicated different species of the same genus, or when one of them resulted in genus-level identification while the remaining two gave NI results; NI was defined when all three regions yielded NI results or when the results disagreed at genus level.
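The thresholds and the integration rule above can be sketched as a small decision procedure. This is one possible reading of the stated criteria, not the authors' implementation: the function names, the tuple layout, the rule precedence, and the handling of cases the text does not spell out (for example two genus-level results plus one NI) are assumptions made for illustration.

```python
NI = ("NI", None, None)


def classify_region(hits):
    """Classify one region from (genus, species, percent_homology) hits."""
    if not hits:
        return NI
    ranked = sorted(hits, key=lambda h: h[2], reverse=True)
    genus, species, best = ranked[0]
    if best < 95.0:
        return NI
    # Hits that belong to a different species than the best match.
    others = [h for h in ranked[1:] if (h[0], h[1]) != (genus, species)]
    if best >= 98.0:
        # Species level: next best species is at least 0.8% lower.
        if not others or best - others[0][2] >= 0.8:
            return ("species", genus, species)
        rivals = [h for h in others if best - h[2] < 0.8]
    else:
        rivals = [h for h in others if h[2] >= 95.0]
    # Assumed: competing hits from one genus give genus level, else NI.
    if all(h[0] == genus for h in rivals):
        return ("genus", genus, None)
    return NI


def integrate_regions(results):
    """Combine the ITS1, ITS2 and D1/D2 results into one identification."""
    informative = [r for r in results if r[0] != "NI"]
    if not informative:
        return NI
    genera = {r[1] for r in informative}
    if len(genera) > 1:  # regions disagree at genus level
        return NI
    genus = genera.pop()
    species = {r[2] for r in informative if r[0] == "species"}
    if len(species) == 1:
        return ("species", genus, species.pop())
    return ("genus", genus, None)
```

For example, a region whose two best hits are C. neoformans at 99.6% and C. gattii at 99.4% would be reported at genus level under this reading, because the two species are less than 0.8% apart.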

Assignment of Micro ID
In order to provide a relative standard for further evaluation of results from PCR/ESI-MS, a microbial identification (Micro ID) for each strain and isolate in this study was predefined on the basis of results from both the phenotypic identification and rRNA gene sequencing [29]. For strains or isolates with consistent results between the phenotypic identification and rRNA gene sequencing, the Micro ID was the consensus result. For strains or isolates with inconsistent results between them, the Micro ID was defined primarily based on rRNA gene sequencing. Phenotype was considered supplemental to rRNA gene sequencing in cases where related species could not be differentiated by ITS or D1/D2 sequencing.
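The assignment rule can be summarized in a few lines of code. The sketch below is an assumption-laden simplification: it represents each result as a hypothetical `(level, genus, species)` tuple and treats "phenotype as supplement" as meaning that a species-level phenotypic result refines a genus-level sequencing result within the same genus. The study applied this judgment case by case, so the function should be read as an illustration rather than the procedure actually used.

```python
def assign_micro_id(phenotype, sequencing):
    """Return the Micro ID from phenotypic and sequencing results.

    Both arguments are (level, genus, species) tuples, where level is
    "species", "genus" or "NI" (hypothetical representation).
    """
    if phenotype == sequencing:
        return sequencing  # consensus result
    p_level, p_genus, _ = phenotype
    s_level, s_genus, _ = sequencing
    # Sequencing could not separate related species: let the phenotype
    # refine the result, but only within the genus sequencing indicated.
    if s_level == "genus" and p_level == "species" and p_genus == s_genus:
        return phenotype
    # Otherwise the sequencing result takes priority.
    return sequencing
```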

PCR/ESI-MS analysis
Multilocus PCR for ESI-MS was performed with the broad fungal assay plate (Abbott Molecular Inc, Des Plaines, IL, USA). The kit contained 16 primer pairs with large subunit (LSU) rRNA, small subunit (SSU) rRNA, mitochondrial DNA (mtDNA) SSU, mtDNA cytochrome B (cyt B), and β-tubulin genes as molecular targets, of which eight targeted broad-range fungi, seven were species or genus specific, and one served as extraction control (Table 2). The following conditions were used for PCR on a 96-well thermal cycler (Eppendorf, Hamburg, Germany): 95°C for 10 min, followed by 8 cycles of 95°C for 30 s, 48°C for 30 s (increasing by 0.9°C each cycle), and 72°C for 30 s. The PCR was then continued for 37 additional cycles of 95°C for 15 s, 56°C for 20 s, and 72°C for 20 s. Cycling ended with a final extension of 2 min at 72°C, then 99°C for 20 min, followed by a 4°C hold. Amplicons and mixtures of amplicons (47 samples were prepared in mixtures of 2-4 samples per reaction) were analyzed using the Abbott PLEX-ID universal biosensor platform (Abbott Molecular Inc, Des Plaines, IL, USA), which performed automated post-PCR desalting, ESI-MS signal acquisition, spectral analysis, and data reporting. Information such as the number of copies of DNA per well, the number of primers making the identification, and the base composition similarity between the reference and unknown isolates was taken into account to determine the correlation. The criteria for judging identification results were adopted from previous studies with minor modification as follows [21,30]: A strain or isolate was assigned to species level when the retrieved loci showed ≥60% match with database signatures, the best matching reference species showed ≥0.90 correlation, and the next best matching reference species showed at least 0.1% less correlation. A strain or isolate was assigned to genus level when several matching species from the same genus met the criteria above. "Fungi detected-no identification can be provided" was defined as a 50% to 60% database match. "No detection (ND)" was reported for specimens whose best database match was below 50% or when no fungus was detected.
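The reporting criteria above can be expressed as a short decision function. This is a sketch under stated assumptions, not the PLEX-ID software logic: the function name, the input layout, and the reading of "0.1% less correlation" as an absolute gap of 0.001 (exposed as the `min_gap` parameter) are hypothetical, as is the fallback to "no identification" when the best correlation is below 0.90 or when tied candidates come from different genera, cases the text does not address.

```python
def classify_esi_ms(match_pct, candidates, min_gap=0.001):
    """Classify a PCR/ESI-MS result.

    match_pct:  percentage of retrieved loci matching database signatures.
    candidates: list of (genus, species, correlation) database matches.
    min_gap:    assumed minimum correlation gap to the next best species.
    """
    if match_pct < 50 or not candidates:
        return ("ND", None)
    if match_pct < 60:
        return ("no identification", None)
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    genus, species, best = ranked[0]
    if best < 0.90:
        return ("no identification", None)  # assumed fallback
    # Candidates too close to the best match to be separated from it.
    tied = [c for c in ranked[1:] if best - c[2] < min_gap]
    if not tied:
        return ("species", f"{genus} {species}")
    if all(c[0] == genus for c in tied):
        return ("genus", genus)
    return ("no identification", None)  # assumed fallback
```

Under this reading, two Aspergillus species with identical correlation would be reported at genus level, which matches how closely related species within a complex are described in the results.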

Statistical analysis
Accuracy of identification assays was calculated as the proportion of samples correctly identified by each of the methods. Between-group comparisons were performed using the chi-squared test. A p-value (P) of <0.05 was considered statistically significant. SPSS 19.0 (SPSS Inc., Chicago, IL, USA) was used for statistical analysis.
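To make the between-group comparison concrete, the sketch below computes an uncorrected Pearson chi-squared statistic for a 2×2 table using only the Python standard library. The counts are the species-level PCR/ESI-MS results given in the abstract (62 of 66 yeasts, 27 of 36 filamentous fungi); whether the authors tested this particular contrast, and whether SPSS was run with or without a continuity correction, is not stated, so the calculation is illustrative only.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the survival function reduces to erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: yeasts, filamentous fungi. Columns: species-level ID, no species-level ID.
chi2, p_value = chi_squared_2x2(62, 4, 27, 9)
significant = p_value < 0.05
```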

Identification of reference strains by phenotypic methods and rRNA gene sequencing
Identification results of the 13 reference strains are shown in Table 3. Phenotypic methods yielded species-level identification for most of the strains, except one C. dubliniensis strain that was identified as its closely related species C. albicans, and five Aspergillus spp. strains that were identified to genus level. Sequencing of the three rRNA gene regions yielded species-level identification for most of the strains with a few exceptions. Two C. gattii strains were identified to genus level because of high homology between C. neoformans and C. gattii by sequencing of all the three regions. In addition, one A. ustus strain was correctly identified to species level by D1/D2 sequencing, but misidentified as other Aspergillus spp. that belonged to the A. ustus complex by either ITS1 or ITS2 sequencing.
Among the 99 isolates, concordant species-level identification results between rRNA gene sequencing and phenotypic methods were achieved for 62.6% (62/99). Inconsistent results occurred for 37.4% (37/99), including 32 isolates with inconsistent species-level results but consistent complex/genus level results, and 5 isolates with inconsistent results at genus level (Table S1).

Micro IDs of reference strains and clinical/environmental isolates
When assigning Micro IDs for strains and isolates, both rRNA gene sequencing results and phenotypes were taken into consideration. The sequencing results were prioritized for most of the strains and isolates, with a few exceptions as follows: two strains indistinguishable between C. neoformans and C. gattii by rRNA gene sequencing were assigned to C. gattii by CGB culture, and two isolates indistinguishable between Aspergillus oryzae and Aspergillus flavus by rRNA gene sequencing were assigned to A. flavus by morphological and cultural characteristics [31]. Species-level Micro IDs were defined for most of the strains and isolates, except that Micro IDs were defined on complex or genus level for

Assessment of PCR/ESI-MS analysis
Based on Micro IDs, 91.1% (102/112) of the strains and isolates, including 66 yeasts and 36 filamentous fungi, were within the coverage of PCR/ESI-MS identification according to the database of the broad fungal assay kit (Table S2). PCR/ESI-MS identified 87.3% (89/102) of them to species level, with 100% (89/89) consistency with their Micro IDs at species or species complex level (C. parapsilosis, C. metapsilosis, and C. orthopsilosis in the C. parapsilosis complex; P. guilliermondii and P. caribbica in the P. guilliermondii complex; A. versicolor and A. sydowii in the A. versicolor complex). The species-level identification rate for yeasts was 93.9% (62/66), higher than the 75% (27/36) for filamentous fungi. Genus-level identification was achieved for seven isolates of filamentous fungi, four of which were identified as A. flavus/A. oryzae within the A. flavus complex.
For the remaining 10 strains and isolates out of the coverage of the broad fungal assay kit, six were identified to species or genus level by PCR/ESI-MS, including two C. gattii strains as C. neoformans, one Colletotrichum capsici strain as Colletotrichum boninense, one Mucor irregularis isolate as Mucor racemosus, and one Pichia anomala isolate as Pichia spp., all of which showed genus concordance to their Micro IDs.
PCR amplifications of the three regions for negative control samples all yielded negative results and PCR/ESI-MS analysis detected no fungus for each of them (data not shown).

Discussion
The traditional phenotypic identification methods for pathogenic fungal species are based on the morphological and biochemical characteristics of different species. Molecular approaches, especially the increasingly widely used sequence-based strategies, show the potential to be a powerful supplement to phenotypic identification [11,17,32]. The ITS region is one of the most frequently used targets for sequence-based identification. It is located between the 18S and 28S rRNA genes and is divided into the ITS1 and ITS2 regions by the 5.8S rRNA gene. Both ITS1 and ITS2 have important applications in sequence-based identification. Several studies have shown that ITS2 displayed sufficient interspecies variability to identify 99.7% of yeasts and 100% of molds to species level, while ITS1 produced identification accuracy of 96.8%-100% for yeasts and 100% for molds [13,33]. Although ITS2 has been more frequently used for fungal identification, ITS1 showed greater interspecies variability than ITS2 in another study [25]. A database specialized for ITS1 based on pairwise alignment searches (http://itsonedb.ba.itb.cnr.it:8080/ITS1/) has also been established. In our study, ITS2 showed higher discrimination ability than ITS1 for yeasts, and similar performance for filamentous fungi. Furthermore, a combination of ITS1 and ITS2 sequencing performed better than either single locus alone. ITS1 plus ITS2 is also regarded as an appropriate region for DNA barcoding of fungi, one of the taxonomic methods [34][35][36].
The D1/D2 region within the 26/28S large subunit rRNA gene is another commonly used target for identification [27,37]. In our study, D1/D2 sequencing yielded higher identification accuracy than either ITS1 or ITS2 for yeasts, in accordance with others [12]. Another study has shown that for species-level identification of some Mucor spp. such as Mucor ramosissimus, D1/D2 sequence analysis achieved higher identity than either ITS1 or ITS2 [13]. Unlike DNA barcoding, which prefers using one standardized 500 to 800 bp sequence to identify species of all fungal pathogens, our results revealed the highest identification accuracy with the combination of ITS1, ITS2 and D1/D2 sequencing for all the clinically relevant fungal pathogens we studied [36].
Though it is generally accepted that PCR-based sequencing provides quicker and more accurate identification of fungi, traditional phenotype-based identification remains indispensable. Phenotypes serve as essential supplements for some species that share high homology in rRNA gene sequences. In our study, for example, C. neoformans and C. gattii were indistinguishable by rRNA gene sequencing but were identified to species by CGB culture. Likewise, two Aspergillus spp. isolates that rRNA gene sequencing could not resolve between A. oryzae and A. flavus were identified as A. flavus by morphological and cultural characteristics. Therefore, a combination of phenotypic and rRNA gene sequencing identification results would be better than either alone.
Table 4. Identification results of clinical and environmental isolates by phenotypic and different rRNA gene sequencing methods (n = 99).

PCR/ESI-MS is a new strategy coupling broad-range PCR amplification to automated electrospray ionization mass spectrometry analysis for identification and detection of pathogens. Unlike MALDI-TOF MS, which performs the identification based on species-specific protein masses, PCR/ESI-MS relies on the measured masses of PCR amplicons, from which their base composition is deduced. In this study, we assessed the identification value of PCR/ESI-MS for various clinically relevant fungi and Prototheca spp. Among the species within the coverage of the broad fungal assay kit, 93.9% (62/66) of yeasts and 75% (27/36) of filamentous fungi were identified to species level by PCR/ESI-MS, and these results were 100% concordant with Micro ID at species/species complex level. One C. dubliniensis strain, which is difficult to differentiate from C. albicans by conventional phenotypic methods, was accurately identified by PCR/ESI-MS. For the identification of less common fungi by PCR/ESI-MS, a recent study showed that most Rhodotorula spp., Rhizopus spp., Rhizomucor spp., Fusarium spp., and Scedosporium spp. could be identified to genus level [21]. In our study, by contrast, two Rhodotorula mucilaginosa isolates, four isolates of Rhizopus spp., two isolates of Rhizomucor spp., two isolates of Fusarium spp., and two isolates of Scedosporium spp. were successfully identified to species level. It was also noteworthy that in our study six out of ten strains and isolates outside the coverage of the kit could be identified to genus level concordant with Micro ID, indicating the potential usefulness of the kit for detection of emergent or unknown fungal pathogens.
In our study, PCR/ESI-MS identified filamentous fungi less successfully than yeasts. One possible explanation is that primers for the ITS1 and ITS2 regions were not included in this kit [36,38]. The ITS regions seem to have particular value in the identification of filamentous fungi, as shown by our results that either ITS1 or ITS2 yielded higher identification accuracy than D1/D2. Therefore, the inclusion of primers targeting the ITS1 and ITS2 regions might improve the performance of PCR/ESI-MS in the identification of filamentous fungi.
Although the 16 primer pairs in the broad fungal assay kit were designed to target different gene regions in order to discriminate between different genera and/or species, PCR/ESI-MS is still limited in its ability to distinguish among phylogenetically closely related species. As revealed by our results, PCR/ESI-MS could not distinguish species within the C. parapsilosis complex (C. parapsilosis, C. metapsilosis, and C. orthopsilosis) or the A. flavus complex (A. flavus and A. oryzae). Moreover, P. guilliermondii, C. neoformans, and A. versicolor are included in the coverage of the kit while their closely related species such as P. caribbica, C. gattii, and A. sydowii are not, which makes it difficult to distinguish these morphologically similar species. A previous study showed that for ascomycetous yeasts, D1/D2 LSU sequences were not specific enough to identify closely related taxa, and the actin gene was a better marker in these cases [39]. Another study recommended the β-tubulin locus for identification of individual species within various Aspergillus complexes, and the ITS regions for identification of Aspergillus spp. at the inter-complex level [40]. Intergenic spacer (IGS) sequence analysis could differentiate between C. neoformans and C. gattii in another study [41]. For distinguishing P. guilliermondii from P. caribbica, sequencing of either the ITS or D1/D2 regions is a good tool, as shown by results from our study and by others [42]. Introducing additional target gene regions, such as ITS and IGS, as well as different regions of the LSU rRNA gene and β-tubulin gene, may help improve the ability of PCR/ESI-MS to differentiate among these closely related species, but would probably also increase the cost.