Multilocus Phylogenetic Study of the Scheffersomyces Yeast Clade and Characterization of the N-Terminal Region of Xylose Reductase Gene

Many of the known xylose-fermenting (X-F) yeasts are placed in the Scheffersomyces clade, a group of ascomycete yeasts that have been isolated from plant tissues and in association with lignicolous insects. We formally recognize fourteen species in this clade based on a maximum likelihood (ML) phylogenetic analysis using a multilocus dataset. This clade is divided into three subclades, each of which exhibits the biochemical ability to ferment cellobiose or xylose. New combinations are made for seven species of Candida in the clade, and three X-F taxa associated with rotted hardwood are described: Scheffersomyces illinoinensis (type strain NRRL Y-48827T  =  CBS 12624), Scheffersomyces quercinus (type strain NRRL Y-48825T  =  CBS 12625), and Scheffersomyces virginianus (type strain NRRL Y-48822T  =  CBS 12626). The new X-F species are distinctive based on their position in the multilocus phylogenetic analysis and biochemical and morphological characters. The molecular characterization of xylose reductase (XR) indicates that the regions surrounding the conserved domain contain mutations that may enhance the performance of the enzyme in X-F yeasts. The phylogenetic reconstruction using XYL1 or RPB1 was identical to the multilocus analysis, and these loci have potential for rapid identification of cryptic species in this clade.


Introduction
D-xylose is a five-carbon backbone molecule of the hemicellulose component of plant cell walls and is one of the most abundant renewable carbon resources on Earth. Some bacteria and certain fungi, including fewer than twenty-five species of the more than 1500 described ascomycete yeasts, share the ability to produce ethanol by the fermentation of D-xylose [1]. In order to ferment D-xylose, yeasts express xylose reductase (XR), xylitol dehydrogenase (XDH), and xylulose kinase (XK) to convert D-xylose to D-xylulose-5-phosphate; D-xylulose-5-phosphate then enters the pentose phosphate pathway and is ultimately converted to ethanol [2][3][4].
Xylose fermentation has been the focus of several studies in order to identify differences in the catabolic rate between strains [2][5][6][7][8][9][10][11][12][13][14]. Overexpression, homologous and heterologous expression, and direct mutagenesis of genes involved in D-xylose assimilation and fermentation have only modestly enhanced the quantity of ethanol production by yeasts due to several metabolic constraints; these include rate of regeneration of the cofactor NADP(H) required by XR and XDH, repression by glucose, and anaerobic respiration regulatory control [15][16][17]. These studies have resulted in the present understanding of the biochemical pathway, but the main goal of bioengineering yeasts capable of fermenting D-xylose at a high rate to be used at industrial scales has not yet been achieved. Consequently, recent research has been focused on the discovery of new X-F yeasts, e.g. Spathaspora passalidarum and Candida jeffriesii [18], and Spathaspora arborariae and other taxa [19,20], from the guts of lignicolous beetles and rotted wood, niches from which a number of X-F yeasts have been isolated [21][22][23].
Although X-F yeasts appear scattered throughout the Saccharomycotina, the yeasts that have been reported to exhibit the highest rate of xylose fermentation under certain conditions are members of the Scheffersomyces clade [9,21,24], and for this reason fermentative ability has been intensively studied in this clade [9,. Few studies, however, have undertaken clarification of the phylogenetic relationships among the yeasts of the clade [1][52][53][54][55][56][57][58]. Therefore, a robust, well-supported phylogeny including as many taxa as possible is necessary to clarify the phylogenetic relationships among the Scheffersomyces clade members, along with a comparison of the nucleotide mutations and enzymatic activity of the XR, to understand the importance of this biochemical ability in the speciation process of this yeast clade.
In order to distinguish the species in the Scheffersomyces clade we used BLAST searches, biochemical and morphological characterization, and a multilocus phylogenetic analysis that included the traditional SSU and LSU markers, the orthologous RPB1, and the recently proposed ITS barcoding region for fungi [59]. We present a taxonomic revision of the Scheffersomyces clade, trace the nucleotide differences in XYL1, and propose three new species of X-F yeasts, Scheffersomyces illinoinensis, Scheffersomyces quercinus, and Scheffersomyces virginianus, associated with rotted hardwoods: Carya illinoinensis (pecan), Quercus nigra (water oak), and Quercus virginiana (live oak). We also performed a molecular study of XYL1, which encodes XR, in certain members of the Scheffersomyces clade.
Figure 1. ML tree based on the D1/D2 LSU region using a 606-character matrix for yeast species isolated from the wood samples (in bold). Schizosaccharomyces pombe was used as an outgroup taxon (in grey). X-F, xylose-fermenting yeasts. Numbers above each branch refer to bootstrap values out of 1000 repetitions. ML score -11353.90. doi:10.1371/journal.pone.0039128.g001

Yeast Isolation and Culture
Partially decayed logs and fallen branches of Carya illinoinensis (pecan, 30 cm diam), Quercus nigra (water oak, approximately 10 cm diam), and Quercus virginiana (live oak, approximately 10 cm diam) were collected from Pecan Drive, Saint Gabriel, Ascension Parish, Louisiana, and LSU Burden Center and the corner of Highland Road and S. Stadium Drive on the LSU campus, Baton Rouge, East Baton Rouge, Louisiana, respectively, between Sep and Oct 2007. The wood samples were divided into approximately 1 cm² samples, and each sample was placed in a 1.5 mL microcentrifuge tube with 1 mL of sterile water. Tubes were vortexed for 30 s and a 100 µL aliquot was plated on acidified yeast medium agar [60]. Plates were incubated for 3 d at 25 °C, and single colonies were isolated and streaked 4 times to obtain pure cultures.

Molecular Identification
Genomic DNA was extracted using a Wizard® Genomic DNA purification kit (Promega). The concentration, integrity, and purity of total DNA extracted were confirmed by gel electrophoresis in 0.8% agarose in 0.5× Tris-Borate-EDTA (TBE) buffer. Initial rapid identification was carried out by PCR amplification and sequencing of the LSU (D1/D2 region, ~600 bp) rRNA gene for use in BLAST searches [1,54]. In order to increase the robustness of the phylogenetic analyses, PCR amplifications of the small subunit (SSU, ~1.6 Kbp) and internal transcribed spacers 1 and 2 (ITS, ~500 bp) of the rRNA marker were carried out in addition to the D1/D2 region [61,62]. The SSU rRNA gene was amplified using the combination of primers NS1 (forward) (5′-GTAGTCATATGCTTGTCTC-3′) and NS8 (reverse) (5′-TCCGCAGGTTCACCTACGGA-3′); ITS1-LSU markers were amplified using the combination of primers ITS1 (forward) (5′-TCCGTAGGTGAACCTGCGG-3′) and LR3 (reverse) in a PCR reaction with 20 mg of total DNA, 0.5 mM dNTPs, 2.5 mM MgSO4, 0.8 mM of each
In addition, another nuclear locus, RNA polymerase II, was used in the phylogenetic analysis. A fragment of ~700 bp of the subunit I (RPB1) gene was amplified by the primer pair RPB1-Af (forward) 5′-GARTGYCCDGGDCAYTTYGG-3′ and RPB1-Cr (reverse) 5′-CCNGCDATNTCRTTRTCCATRTA-3′ [63,64]. The PCR reaction was performed using 100 mg of total DNA, 0.6 mM dNTPs, 2.5 mM MgSO4, 1 mM of each primer, 1× PCR buffer, and 1.5 U of Taq polymerase (Promega) in 35 µL total final volume of reaction. The PCR amplification program included 5 min of DNA pre-denaturation at 95 °C followed by 35 cycles of 1 min of DNA denaturation at 95 °C, 45 s of primer annealing at 55 °C, and 2 min of extension at 72 °C, and 10 min final PCR extension.
The purified PCR products were sequenced in both directions by Beckman Coulter Genomics (Danvers, MA). Each molecular marker was sequenced on three independent occasions in order to avoid nucleotide differences due to sequencing errors.
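As a rough planning aid, the RPB1 cycling program described above can be written down as data and its programmed hold time summed. This is only a sketch: the class and field names are illustrative assumptions rather than part of the original protocol, and ramping time between temperatures, which depends on the thermocycler, is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """Hold times of a three-step PCR program, in seconds (names are illustrative)."""
    initial_denaturation: int
    denaturation: int
    annealing: int
    extension: int
    cycles: int
    final_extension: int

    def total_hold_seconds(self) -> int:
        # Ramp times between temperatures are instrument-specific and ignored here.
        per_cycle = self.denaturation + self.annealing + self.extension
        return self.initial_denaturation + self.cycles * per_cycle + self.final_extension


# Values taken from the RPB1 program described in the text.
RPB1_PROGRAM = PcrProgram(
    initial_denaturation=5 * 60,
    denaturation=60,
    annealing=45,
    extension=2 * 60,
    cycles=35,
    final_extension=10 * 60,
)

if __name__ == "__main__":
    minutes = RPB1_PROGRAM.total_hold_seconds() / 60
    print(f"RPB1 program hold time: {minutes:.2f} min")
```

Under these assumptions the programmed holds add up to 8775 s, a little under two and a half hours before ramping is counted.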

Morphological, Biochemical and Physiological Characteristics
Standard yeast descriptions based on phenotypic characters were carried out following standardized protocols [1,69,70].

Xylose Reductase (XR) Molecular Studies
The ~600 bp fragment of XYL1 was amplified using the following degenerate primers: XYL1-forward (5′-GGTYTTYGGMTGYTGGAARSTC-3′) and XYL1-reverse (5′-AAWGATTGWGGWCCRAAWGAWGA-3′) designed in this study, in a PCR reaction with 100 mg of total DNA, 0.4 mM dNTPs, 4 mM MgSO4, 1 mM of each primer, 1× PCR buffer, and 1 U of Taq polymerase (Promega) in a final volume of 25 µL. The PCR amplification program included 5 min of DNA pre-denaturation at 95 °C followed by 35 cycles of 1 min of denaturation at 95 °C, 1 min primer annealing at 57 °C, 1 min extension at 72 °C, and 10 min final PCR extension. The purified PCR products were sequenced as described above.
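Because the XYL1 primers are degenerate, each one stands for a pool of distinct oligonucleotides. The sketch below counts and enumerates that pool using the standard IUPAC nucleotide ambiguity codes; the function names are illustrative, and the primer sequences are typed here as continuous strings.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every non-degenerate sequence encoded by the primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


XYL1_FORWARD = "GGTYTTYGGMTGYTGGAARSTC"
XYL1_REVERSE = "AAWGATTGWGGWCCRAAWGAWGA"

if __name__ == "__main__":
    for name, seq in (("XYL1-forward", XYL1_FORWARD), ("XYL1-reverse", XYL1_REVERSE)):
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Each XYL1 primer contains six two-fold degenerate positions, so each corresponds to a pool of 64 sequences.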

Diversity of Yeasts Isolated from Rotted Wood
The initial rapid molecular identification of the 29 yeast strains isolated from the wood samples using the D1/D2 LSU region confirmed the presence of several closely related species classified in the Scheffersomyces (11 isolates), Sugiyamaella (10 isolates), Trichomonascus (5 isolates), Meyerozyma (2 isolates), and Candida tanzawaensis (1 isolate) clades (Fig. 1). Initial biochemical characterization of fermentation abilities performed on all the isolates showed that only strains classified in the Scheffersomyces clade had the ability to ferment xylose.

Species Description of Scheffersomyces quercinus H. Urbina & M. Blackw. sp. nov.
MycoBank accession number MB 563719 (Fig. 4, a-b). After 7 d growth in YM broth at 25 °C, cells are subglobose (5-8 × 5-7.5 µm), and occur singly, in pairs, or in chains. Pseudohyphae are present; true hyphae are absent. After 7 d on YM agar at 25 °C, colonies are cream-colored with pale-pinkish perimeter on some older colonies, smooth, flat, and/or with scattered filaments at the margin. After 10 d of Dalmau plate culture on corn meal agar at 25 °C, true hyphae are present. Aerobic growth is white, shiny, and smooth with filamentous margin. Asci and ascospores are not observed on YM or V8 agar. Diazonium blue B reaction is negative. See Table 4
MycoBank accession number MB 563720 (Fig. 4, c-d).
After 7 d growth in YM broth at 25 °C, cells are globose to ellipsoidal (5.5-10 × 4-6.5 µm), and occur singly, in pairs, in short chains, or in small clusters. Pseudohyphae are present; true hyphae are absent. After 7 d on YM agar at 25 °C, colonies are cream-colored to light pink with abundant filaments at margin. After 10 d Dalmau plate culture on corn meal agar at 25 °C, pseudohyphae are present; septate hyphae are absent. Aerobic growth is white, shiny, and smooth with filaments at margin. Asci and ascospores are not observed on YM or V8 agar. Diazonium blue B reaction is negative. See Table 4 for physiological characterization.
See Table 4 for physiological characterization. Etymology. The species name illinoinensis (N.L. gen. n.) refers to the species of the substrate, Carya illinoinensis, from which this species was isolated.

Multilocus Phylogenetic Study
The phylogenetic placement of S. quercinus, S. virginianus, and S. illinoinensis was based on ML analysis results of a concatenated nucleotide dataset containing 3488 characters (Fig. 2). The Scheffersomyces clade was divided into three subclades: 1) the early diverging S. spartinae and S. gosingicus subclade, 2) the cellobiose-fermenting S. ergatensis subclade, and 3) the largest, xylose-fermenting S. stipitis subclade to which the three new species belong.
Addition of more taxa and more molecular data in phylogenetic analyses has helped to define monophyletic clades among a number of genera now recognized as polyphyletic. One problematic taxon, Pichia, was based primarily on ascospore shape. Hat-shaped ascospores, however, are found among several distant clades of yeasts as well as distantly related members of the Pezizomycotina. Scheffersomyces was proposed recently for species in the Pichia stipitis clade [55]. The genus included the type species, S. stipitis, and S. ergatensis and S. spartinae [55]. We propose additional new combinations in the genus Scheffersomyces by including clade members that previously were described as asexual species of the polyphyletic genus Candida [77].
We amplified a ~600 bp PCR product of XYL1 from the X-F and non X-F yeasts tested (Fig. 5). The translated protein sequences at the N-terminal region have the conserved amino acids 49-D, 51-A, and 54-Y, described as part of the catalytic GX3DXAX2Y domain; the LX8DX4H and the GX3GXG domains; and the amino acids 83-K, 132-P, and 167-K that form the xylose-binding pocket previously reported [78,79] (Fig. 6). , and 204 (S = P) and the percentage of conserved substitutions usually were higher than 50% at all sites compared with the non X-F fermenting yeast S. ergatensis (Fig. 6).
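The conserved-domain notation used above (GX3DXAX2Y, LX8DX4H, GX3GXG, where X is any residue and the digit gives the repeat count) translates directly into regular expressions. The following sketch scans a translated sequence for the three motifs and reports 1-based start positions. The function name is illustrative, and the toy sequence is synthetic, built only so that each motif occurs once; it is not an XR sequence from this study.

```python
import re

# Motif notation from the text, written as regular expressions
# ("X" = any residue, "Xn" = n arbitrary residues).
MOTIFS = {
    "GX3DXAX2Y": re.compile(r"G.{3}D.A.{2}Y"),
    "LX8DX4H": re.compile(r"L.{8}D.{4}H"),
    "GX3GXG": re.compile(r"G.{3}G.G"),
}


def find_motifs(protein: str) -> dict[str, list[tuple[int, str]]]:
    """Return, per motif, a list of (1-based start position, matched residues)."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start() + 1, m.group()) for m in pattern.finditer(protein)]
    return hits


# Synthetic toy sequence (hypothetical, for illustration only).
TOY_PROTEIN = "MTQGWKVDNATCYPEQLRSTPEQNKDVWMTHPEGSIRGVGKA"

if __name__ == "__main__":
    for motif, matches in find_motifs(TOY_PROTEIN).items():
        print(motif, matches)
```

Note that `finditer` reports non-overlapping matches only, which is sufficient when each domain is expected once per sequence.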

Yeasts Sampled from Rotted Wood
Most of the isolates from live oak and water oak were members of the Sugiyamaella clade: Candida boreocaroliniensis, Candida lignohabitans, and Sugiyamaella smithiae. The cosmopolitan genus Sugiyamaella is comprised of yeasts reported primarily from wood and frass of lignicolous beetles. It is of interest that, unlike other yeasts from these habitats, they are unable to ferment D-xylose [80,81]. We also isolated other non X-F species: Candida athensensis, Trichomonascus petasosporus, and a close relative of Candida anneliseae, ascomycete yeasts that previously were found associated with fungus-feeding beetles collected in Panama and the USA [56].
The yeast strains isolated from pecan wood, in contrast, were dominated by the new X-F yeast species, S. illinoinensis, a close relative of S. stipitis. Only this wood was inhabited by Odontotaenius disjunctus (Passalidae, Coleoptera) in our study. Zhang et al. [82] recognized the relationship between S. stipitis and this lignicolous beetle, which is commonly found inhabiting decayed hardwoods in the southeastern US, so the phylogenetic placement of these closely related species is consistent with the previous findings.
Other members of the Scheffersomyces clade (S. ergatensis, S. shehatae, and S. stipitis) have been reported frequently from associations with the gut of lignicolous beetles, including not only the passalid beetle O. disjunctus but also beetles in the families Cerambycidae, Lucanidae, Buprestidae, and Tenebrionidae [22,83]. It is likely that the common gut yeasts are efficient at digesting components of the host diet, resisting toxic secondary metabolites, and adapting to the gut physiological environment (low oxygen and high carbon dioxide concentrations and extreme pH variation), characteristics that give them a greater chance to be horizontally transmitted to progeny. Consequently, in each host generation the symbiotic yeasts may be exposed to bottlenecks and positive selection driven by the host beetles, and these selective pressures increase when changes in the host diet occur. These evolutionary processes could have favored rapid speciation with morphological and other traditional characters lagging behind molecular changes in the Scheffersomyces yeast members. Evidence that supports our hypothesis includes: 1) S. shehatae, S. lignicola, and S. insectosa, often found in association with insects, are indistinguishable by morphology and some molecular markers (e.g. SSU and D1/D2 LSU); 2) branch lengths are constrained in the phylogenetic tree (Fig. 2); and 3) gut morphology is modified to enhance the horizontal transmission of gut yeasts across generations, e.g. the posterior hindgut region of O. disjunctus is colonized mainly by filamentous yeasts attached by a holdfast [84]; in addition, mycetomes occur in lignicolous cerambycid beetles colonized exclusively by closely related species of S. shehatae [22].
Kurtzman [58] recognized Candida shehatae var. shehatae, var. insectosa, and var. lignicola based on biochemical assays, a single nucleotide difference in the D1/D2 LSU region, and identical SSU rRNA among the varieties. He suggested that, in order to understand the phylogenetic relationship between varieties of C. shehatae, analyses including ITS rRNA should be performed. More recently, these varieties were raised to species level based on their distinctive electrokaryotype profiles and the reinterpretation of the D1/D2 LSU locus [55,57]. Yeast species often have been underestimated on the basis of only LSU and SSU rRNA data [85][86][87]. Therefore, the addition of more molecular markers in phylogenetic analyses has been used to increase the power of species recognition (see next section).

Phylogenetic Study of the Scheffersomyces Clade
Lachance et al. [88] proposed a method for species delimitation in yeasts based on parsimony networks. In our experience, the results obtained by using this method are difficult to interpret for several reasons: 1) the species are plotted as isolated entities with little information on phylogenetic relationship among members; 2) the method does not allow inclusion of a model of nucleotide selection for the analysis; and 3) node support values are lacking.
Because of these disadvantages, we implemented, instead, a multilocus phylogenetic analysis based on ML, commonly applied to fungi. We incorporated additional loci (ITS and RPB1) in order to increase the robustness of the phylogenetic analysis in the study of the Scheffersomyces members. We followed the recommendations of the Assembling the Fungal Tree of Life (AFTOL, http://aftol.org/) research group in searching for orthologous genes in fungal genomes [89]. In addition, results obtained by Schoch et al. [90] showed that RPB1, RPB2, and TEF1 (elongation factor 1 alpha) are more phylogenetically informative compared to rRNA genes in Ascomycota. More recently, Schoch et al. [59] proposed the use of ITS as a barcode gene for fungi, although RPB1 had higher species discriminatory power than ITS in Saccharomycotina. The authors [59] also pointed out that datasets containing combinations of at least three molecular markers (SSU, LSU, ITS or RPB1) showed the highest probability of correct identification for all Fungi.
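A multilocus dataset of this kind is usually assembled by concatenating the per-locus alignments into a single supermatrix and recording where each locus starts and ends. The sketch below shows one simple way to do that; it is not the pipeline used in this study. The function name, the taxon labels, the toy sequences, and the choice of "?" for a locus missing from a taxon are all assumptions made for illustration.

```python
def concatenate_loci(loci: dict[str, dict[str, str]], missing: str = "?"):
    """Concatenate aligned loci into one supermatrix.

    loci maps locus name -> {taxon: aligned sequence}. Taxa absent from a
    locus are padded with the `missing` symbol. Returns (matrix, partitions),
    where partitions lists (locus, first column, last column), 1-based.
    """
    taxa = sorted({taxon for alignment in loci.values() for taxon in alignment})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for locus, alignment in loci.items():
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in locus {locus!r} are not aligned")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(alignment.get(taxon, missing * length))
        partitions.append((locus, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy alignments (hypothetical data); taxon_C lacks an ITS sequence.
TOY_LOCI = {
    "SSU": {"taxon_A": "ACGT", "taxon_B": "ACGA", "taxon_C": "ACGG"},
    "LSU": {"taxon_A": "GGC", "taxon_B": "GGT", "taxon_C": "GAT"},
    "ITS": {"taxon_A": "TT-A", "taxon_B": "TTCA"},
    "RPB1": {"taxon_A": "ATGGA", "taxon_B": "ATGGC", "taxon_C": "ATGAC"},
}

if __name__ == "__main__":
    matrix, partitions = concatenate_loci(TOY_LOCI)
    for taxon, seq in matrix.items():
        print(taxon, seq)
    print(partitions)
```

The partition list is what allows a separate substitution model to be assigned to each locus in a downstream ML analysis.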
The phylogenetic outcome also suggests that the common ancestor of the Scheffersomyces clade may have shown the ability to ferment D-xylose and cellobiose. This hypothesis is supported by the phylogenetic analyses based on a multilocus dataset that places S. gosingicus, a cellobiose-fermenting (C-F) species, in the earliest derived subclade, and the results of the phylogenetic study showing XR presence in all Scheffersomyces clade members studied (Fig. 5). Moreover, the same tree topology of the multilocus analysis was recovered by using either XYL1 or XR, suggesting that X-F ability might have played a fundamental role in the speciation process of the X-F subclade (Figs. 2, 5). We did not include the single copy gene XYL1 in the multilocus phylogenetic analysis because the orthology of this locus has not been confirmed across several yeast taxa.
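Whether two analyses recover "the same tree topology" can be checked mechanically by comparing the sets of clades each rooted tree contains. The sketch below does this for trees written as nested tuples; the representation, the function names, and the placeholder taxon labels A-D are assumptions for illustration, and the comparison ignores branch lengths and support values.

```python
def clades(tree) -> set[frozenset[str]]:
    """Collect the leaf set under every internal node of a rooted tree.

    A tree is a nested tuple whose leaves are taxon-name strings.
    """
    found = set()

    def walk(node) -> frozenset[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def same_topology(tree_a, tree_b) -> bool:
    """True when both rooted trees contain exactly the same clades."""
    return clades(tree_a) == clades(tree_b)


# Placeholder trees (hypothetical): the first two differ only in branch order.
TREE_1 = ((("A", "B"), "C"), "D")
TREE_2 = ("D", ("C", ("B", "A")))
TREE_3 = ((("A", "C"), "B"), "D")

if __name__ == "__main__":
    print(same_topology(TREE_1, TREE_2))
    print(same_topology(TREE_1, TREE_3))
```

Rotating branches around a node does not change the clade sets, so the first comparison reports identical topologies while the second, in which the sister pair differs, does not.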
The ability to ferment both wood components (D-xylose and cellobiose) is exhibited by only a few yeasts: Ogataea polymorpha [92], Brettanomyces naardenensis [1], and Spathaspora passalidarum [18], and in these species fermentative abilities are weak or delayed. In contrast, Scheffersomyces clade members exhibit only one or the other fermentative ability. The loss of fermentation capability could be a consequence of becoming more efficient in carrying out the fermentation of fewer sugars. In particular, the fermentation of cellobiose has an antagonistic effect against the fermentation of D-xylose, because during the extracellular fermentation of cellobiose, units of glucose are released by β-glucosidase, and glucose has a higher affinity than D-xylose for the pentose membrane transporter. Consequently, glucose is first incorporated into the cells to be fermented [93].

XR in the Scheffersomyces Clade Members
The regeneration of the cofactor (NAD+ to NADH) has been described as a major constraint in the fermentation of D-xylose. Most studies therefore have focused on the molecular characterization of the conserved domains involved in the uptake of this cofactor, which are present in the C-terminal region of the XR, and few studies have characterized the N-terminal region.
In our study of the N-terminal region of the XR of the X-F members of the S. stipitis subclade, we were able to identify all of the conserved domains and amino acids in both X-F and non X-F yeasts. These findings indicate that the ability to ferment xylose does not rely solely on the presence of these conserved regions. We also found several biased nucleotide mutations that maintain the same polarity as the encoded amino acid, and only the mutations at residues 174, 179, 184, 188, and 204 showed a change in amino acid polarity. These mutations were found mainly surrounding conserved domains in comparison with the amino acid sequence of the non X-F yeast S. ergatensis (Fig. 6). The mutations could generate structural modifications that allow the fermentation of xylose in the Scheffersomyces clade members, a possibility that could be tested by performing direct mutagenesis studies on the amino acids to characterize their individual roles in the performance of XR.
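The polarity comparison described above can be sketched as a position-by-position scan of two aligned protein sequences. The four-class grouping used below (nonpolar, polar uncharged, acidic, basic) is one common textbook scheme and is an assumption, since the text does not state which classification was applied; the function name and the two toy sequences are likewise hypothetical.

```python
# One common side-chain grouping (assumed here; other schemes exist).
_CLASSES = {
    "nonpolar": "GAVLIMFWP",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
POLARITY = {aa: cls for cls, residues in _CLASSES.items() for aa in residues}


def polarity_changes(reference: str, query: str) -> list[tuple[int, str, str, bool]]:
    """List substitutions between two aligned proteins.

    Each entry is (1-based position, reference residue, query residue,
    whether the polarity class changed). Gap positions ("-") are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for position, (ref, alt) in enumerate(zip(reference, query), start=1):
        if ref == alt or "-" in (ref, alt):
            continue
        changes.append((position, ref, alt, POLARITY[ref] != POLARITY[alt]))
    return changes


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not sequences from this study).
    for change in polarity_changes("ASKLD", "APKID"):
        print(change)
```

In this scheme a serine/proline difference, such as the S/P pair reported at residue 204, counts as a change of polarity class, whereas a leucine/isoleucine difference does not.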
Although the X-F yeasts are dispersed throughout the Saccharomycotina, as we mentioned in the introduction, the Debaryomycetaceae includes the largest number of X-F yeasts, such as species of Scheffersomyces and Spathaspora found in association with lignicolous insects [21,24]. This phylogenetic placement of the X-F yeasts supports two alternatives: the X-F ability was the result of convergent evolution in ascomycete yeasts, or the X-F ability was present in the earliest common ancestor in the Saccharomycotina and has been retained mainly in the yeasts associated with lignicolous habitats. Several independent lines of evidence favor the second premise: 1) classical biochemical studies have determined that the ability to assimilate D-xylose and xylitol is common throughout the yeasts; 2) several relatively early diverging yeasts, B. naardenensis, O. polymorpha, and Pachysolen tannophilus, ferment xylose [1,92]; 3) many studies characterizing yeast diversity indicate that most X-F yeasts are associated with wood and the gut of lignicolous insects [21]; and 4) more recently, a report of the genome sequences of a diverse group of X-F and non X-F yeasts confirmed the presence of xylose genes in all of them [45].
Yeasts, as a group, are known for their biochemical versatility in utilizing a wide variety of carbon sources. Individual strains, however, may be characterized by specific physiological profiles depending on their life style and environment. Wood substrates and the gut of lignicolous insects previously were unexplored environments for the isolation of X-F yeasts. The findings of this study further support the hypothesis that X-F yeasts and yeasts in the Sugiyamaella clade are common inhabitants of the wood substrates. The Scheffersomyces clade is comprised mainly of cellobiose- and D-xylose-fermenting yeasts isolated from distant geographical regions and associated with wood and insects that feed on plant tissues. The amino acid modifications present in the putative XR of X-F yeasts in the S. stipitis subclade could be responsible for the enhanced rate of fermentation shown by the members of this clade. The addition of ITS and RPB1 loci in the phylogenetic studies on the Scheffersomyces clade dramatically increased the support of the phylogenetic relationships of the members. We have used the primers for XYL1 designed for this study across several yeast species (data not shown), and they could be used to help understand how these genes have evolved in the members of Saccharomycotina. The phylogenetic reconstruction using only XYL1 or RPB1 was similar to the multilocus analysis, and these loci have potential for rapid identification of cryptic species in this clade.