Analysis of the Phlebiopsis gigantea Genome, Transcriptome and Secretome Provides Insight into Its Pioneer Colonization Strategies of Wood

Collectively classified as white-rot fungi, certain basidiomycetes efficiently degrade the major structural polymers of wood cell walls. A small subset of these Agaricomycetes, exemplified by Phlebiopsis gigantea, is capable of colonizing freshly exposed conifer sapwood despite its high content of extractives, which retards the establishment of other fungal species. The mechanism(s) by which P. gigantea tolerates and metabolizes resinous compounds have not been explored. Here, we report the annotated P. gigantea genome and compare profiles of its transcriptome and secretome when cultured on fresh-cut versus solvent-extracted loblolly pine wood. The P. gigantea genome contains a conventional repertoire of hydrolase genes involved in cellulose/hemicellulose degradation, whose patterns of expression were relatively unperturbed by the absence of extractives. The expression of genes typically ascribed to lignin degradation was also largely unaffected. In contrast, genes likely involved in the transformation and detoxification of wood extractives were highly induced in their presence. Their products included an ABC transporter, lipases, cytochrome P450s, glutathione S-transferase and aldehyde dehydrogenase. Other regulated genes of unknown function and several constitutively expressed genes are also likely involved in P. gigantea's extractives metabolism. These results contribute to our fundamental understanding of pioneer colonization of conifer wood and provide insight into the diverse chemistries employed by fungi in carbon cycling processes.


Introduction
The most abundant source of terrestrial carbon is plant biomass, composed primarily of cellulose, hemicellulose, and lignin. Numerous microbes utilize cellulose and hemicellulose, but a much smaller group of filamentous fungi has the capacity to degrade lignin, the most recalcitrant component of plant cell walls. Uniquely, such 'white-rot' fungi efficiently depolymerize lignin to access cell wall carbohydrates for carbon and energy sources. As such, white-rot fungi play a key role in the carbon cycle.
White-rot basidiomycetes may differ in their substrate preference and morphological patterns of decay (for review see [1,2]). The majority of lignin-degrading fungi, including Phanerochaete chrysosporium and Ceriporiopsis subvermispora, are unable to colonize freshly cut wood unless inhibitory compounds (extractives) are removed or transformed [2][3][4][5]. A few basidiomycetes, including Phlebiopsis gigantea, are pioneer colonizers of softwood because they tolerate and utilize resinous extractives (e.g., resin acids, triglycerides, long chain fatty acids, see Figure 1) which cause pitch deposits in paper pulp manufacturing [6]. It is this unusual capability that also led to the development of P. gigantea as a biocontrol agent against subsequent colonization of cut stumps by the root rot pathogen Heterobasidion annosum sensu lato (now considered several species) [7,8] and of harvested wood by blue stain fungi [9,10]. It seems likely that when applied to freshly cut wood, P. gigantea is able to rapidly metabolize accessible extractives and hemicellulose. As the hyphae continue to invade tracheids and ray parenchyma cells, the more recalcitrant cell wall polymers (cellulose, lignin; Figure 1) are eroded. Little is known of how some white-rot fungi degrade conifer extractives [11,12] or interact with other fungi such as H. annosum [13].
In addition to peroxidases, laccases have been implicated in lignin degradation [24][25][26]. To date, multiple laccase isozymes and/or the corresponding genes have been characterized from most white-rot fungi except P. chrysosporium, an efficient lignocellulose degrader that lacks such enzymes [27][28][29]. The mechanism(s) by which laccases might degrade lignin remain unclear as the enzyme lacks sufficient oxidation potential to cleave non-phenolic linkages within the polymer. Interestingly, laccase activity has not been reported in P. gigantea.
Additional 'auxiliary activities' [30] commonly ascribed to ligninolytic systems include extracellular enzymes capable of generating H2O2. These enzymes may be physiologically coupled to peroxidases. Among them, aryl-alcohol oxidase (AAO), methanol oxidase (MOX), pyranose 2-oxidase (P2O), and copper radical oxidases (such as glyoxal oxidase, GLX) have been extensively studied. With the exception of P2O [31], none of these activities have been reported in P. gigantea cultures. In short, the repertoire of extracellular enzymes produced by P. gigantea is largely unknown, and its mechanism(s) for cell wall degradation remain unexplored.
Beyond extracellular systems, the complete degradation of lignin requires many intracellular enzymes for the complete mineralization of monomers to CO2 and H2O. Examples of enzymes that have been characterized from P. chrysosporium include cytochromes P450 (CYPs) [32][33][34], glutathione transferases [35], and aryl alcohol dehydrogenase (AAD) [36]. The role of such enzymes in P. gigantea, if any, is unknown.
Herein, we report analysis of the P. gigantea draft genome. Gene annotation, transcriptome analyses and secretome profiles identified numerous genes involved in lignocellulose degradation and in the metabolism of conifer extractives.

Genome assembly and annotation
Following an assessment of wood decay properties (Figure 2), P. gigantea single basidiospore strain 5-6 was selected for sequencing using Illumina reads assembled with AllPathsLG. Genome size was estimated to be approximately 30 Mbp (Text S1), somewhat smaller than the genomes of closely related members of the 'Phlebia clade' [23,37] such as C. subvermispora (39 Mbp) and P. chrysosporium (35 Mbp) [22,27]. Aided by 17,915 mapped EST clusters, the JGI annotation pipeline predicted 11,891 genes. Proteins were assigned to 6412, 5615, 6932 and 2253 KOG categories, GO terms, Pfam domains and EC numbers, respectively. Significant synteny with P. chrysosporium was observed (Figure S1). Detailed information on the assembly and annotations is available via the JGI portal MycoCosm [38].

Author Summary
The wood decay fungus Phlebiopsis gigantea degrades all components of plant cell walls and is uniquely able to rapidly colonize freshly exposed conifer sapwood. However, mechanisms underlying its conversion of lignocellulose and resinous extractives have not been explored. We report here analyses of the genetic repertoire, transcriptome and secretome of P. gigantea. Numerous highly expressed hydrolases, together with lytic polysaccharide monooxygenases, were implicated in P. gigantea's attack on cellulose, and an array of ligninolytic peroxidases and auxiliary enzymes were also identified. Comparisons of woody substrates with and without extractives revealed differentially expressed genes predicted to be involved in the transformation of resin. These expression patterns are likely key to the pioneer colonization of conifers by P. gigantea.

Gene families
Principal component analysis (PCA), based on 73 and 12 families of carbohydrate active enzymes (CAZys; [16]) and auxiliary activities (AAs; [30]), respectively, clustered P. gigantea with other efficient lignin degraders ([39]; Figures 3A and S2). Gene numbers were extracted from 21 fungal genomes and excluded genes encoding putative GMC oxidases such as methanol oxidase, alcohol oxidase and glucose oxidase (Dataset S1). The families contributing most to PC1 (50% of variance, separating white-rot and brown-rot fungi) and PC2 (13.0% of variance) were those associated with degradation of plant cell wall polysaccharides and lignin, respectively (Text S1). Hierarchical clustering analysis with this dataset also categorized P. gigantea into a clade of white-rot fungi that included the polypore P. chrysosporium. The precise number and distribution of P. gigantea genes likely involved in lignocellulose degradation were similar, but not identical, to other polypores such as P. chrysosporium and C. subvermispora (Figure 4). Like P. chrysosporium and Phanerochaete flavido-alba, P. gigantea had no laccase sensu stricto genes. Interestingly, while both P. gigantea and the white-rot Russulales H. annosum are adapted to colonization of conifers, significant numbers of laccase sensu stricto genes were only observed in H. annosum (Figure 4). This important conifer pathogen also lacked GLX, LiP and representatives of GH5 subfamilies 15 and 31.
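The PCA of gene family counts described in this section can be illustrated with a minimal sketch. It assumes a genomes-by-families count matrix that is centered but not scaled (the text does not state the preprocessing used), and the small matrix below is hypothetical rather than taken from Dataset S1. Squared loadings are reported per component, analogous to the squared rotation values mentioned for Figures S2 and S3.

```python
import numpy as np

def pca_gene_families(counts):
    """PCA of a genomes x gene-families count matrix via SVD.

    Returns (scores, explained_variance_ratio, squared_loadings).
    Assumption: columns are centered but not scaled to unit variance.
    """
    X = np.asarray(counts, dtype=float)
    X = X - X.mean(axis=0)                      # center each family (column)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U * s                              # genome coordinates on each PC
    variances = s ** 2
    explained = variances / variances.sum()     # fraction of variance per PC
    squared_loadings = Vt ** 2                  # rows: PCs, columns: families
    return scores, explained, squared_loadings

# Hypothetical counts: 4 genomes (rows) x 3 enzyme families (columns)
counts = [
    [10, 2, 5],
    [12, 1, 6],
    [3, 8, 5],
    [2, 9, 4],
]
scores, explained, squared_loadings = pca_gene_families(counts)
```

Families with the largest squared loading on a component contribute most to the separation of genomes along that component.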
With regard to hemicellulose degradation, the genomes of conifer-adapted P. gigantea and H. annosum revealed increased numbers of genes involved in pectin degradation such as GH28 polygalacturonase, CE8 pectin methylesterase and CE12 rhamnogalacturonan acetylesterase (Figure 4). The major hemicellulose of conifers is galactoglucomannan ([40]; Figure 1) but, in the case of mannan degradation, no significant increase in genes encoding GH2 β-mannosidase, GH5_7 endo-mannanase and GH27 α-galactosidase was observed relative to other wood decay fungi (Figure 4). Similarly, no significant differences in the number of genes involved in arabinoglucuronoxylan hydrolysis were identified, except for two transcriptionally convergent GH11 genes present in P. gigantea (Text S1). Wood decay fungi typically harbor one or no GH11 genes, which encode putative endo-1,4-β-xylanases. Auricularia delicata is another exception, with three of these endoxylanases. Also unusual among white-rot fungi, none of the P. gigantea protein models were assigned to GH95 (Dataset S1). This family includes 1,2-α-fucosidases that hydrolyze the α-Fuc-1,2-Gal linkages in plant xyloglucans.
The P. gigantea genome includes representatives for all the peroxidase families reported in basidiomycetes, including LiP, MnP, heme-thiolate peroxidases, and dye-decolorizing type peroxidases (DyP), with the only exception of VP (Text S1; Figures S8-S13). MnP gene expansion is similar to that found in the C. subvermispora and H. annosum genomes. Beyond class II peroxidases and multicopper oxidases (MCOs), genes encoding auxiliary enzymes involved in ligninolysis were also found, such as GMC oxidoreductases (Figures S14-S19; Table S5) and copper radical oxidases (CRO, Figure 4; Table S4). Among the latter group, GLX is coupled to P. chrysosporium LiPs via extracellular H2O2 generation [41]. Consistent with this physiological connection, the P. gigantea genome features both GLX- and LiP-encoding genes. GMC genes encoding putative AAO, MOX and glucose oxidase (GOX) may also be involved in H2O2 production by oxidation of low molecular weight aliphatic and aromatic alcohols. The P2O gene (protein model Phlgi1_130349) lies immediately adjacent to a putative pyranosone dehydratase (Phlgi1_16096) gene. This arrangement is conserved in several wood decay fungi and, in addition to peroxide generation, suggests a route for conversion of glucose to the pyrone antibiotic, cortalcerone [42,43]. Genes encoding AAD, members of the zinc-type alcohol dehydrogenase superfamily [44], are also abundant in P. gigantea. Relatively few genes were predicted to encode CYPs, which are generally considered important in the intracellular metabolism of lignin derivatives and related aromatic compounds (Figure S19; Dataset S2).
Figure 1 (partial caption). The extractives (long chain fatty acids, triglycerides, resin acids and terpenes) are found primarily in the resin ducts, but damage to pine wood causes the release of these compounds across wounded areas. Panel B: In tracheid cell walls, the amorphous, phenylpropanoid polymer lignin (brown) forms a matrix around the more structured carbohydrate polymers, hemicellulose (yellow and green) and cellulose (blue). doi:10.1371/journal.pgen.1004759.g001
In contrast to analysis of genes involved in lignocellulose degradation (Figure 3A), white-rot and brown-rot fungi were not clearly separated by principal component analysis of 14 enzymes involved in lipid metabolism (Figures 3B and S3). However, P. gigantea was grouped near B. adusta and P. carnosa. These associations seem in line with the preferential colonization of softwood substrates by P. carnosa [49] and with the efficient degradation of conifer extractives by B. adusta culture supernatants [50]. The highest contributions to PC1 (26.0% variance) and PC2 (6.8% variance) came from aldehyde dehydrogenase and long chain fatty acid CoA ligase, respectively (Figures 3A and S3, Text S1). Also potentially involved in intracellular lipid metabolism, CYP52 and CYP505 clans of cytochrome P450s are associated with degradation of fatty acids and alkanes. Relative to other white-rot fungi, P. gigantea had a slightly greater number of CYP52-encoding genes whereas CYP505 gene numbers were similar (Figure 4; Dataset S1; Figures S31, S32; Tables S13-S15).
Figure 2 (partial caption). Single basidiospore strain 5-6 also aggressively decayed birch and spruce (Text S1) and was selected for sequencing. Upper panels show scanning electron microscopy [68] of radial (left) and transverse (right) sections of pine wood tracheids that were substantially eroded or completely degraded by P. gigantea.
Excluding genes with relatively low transcript levels (RPKM values <10) in LP-containing media, transcripts of 187 genes were increased >2-fold (p<0.05) in NELP or ELP relative to Glc. Of those Glc-derived transcripts with RPKM values >10, 146 genes had higher transcripts in Glc relative to NELP or ELP (Figure 5; Dataset S2).
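The selection criteria above (an RPKM floor, a fold-change cutoff and a significance cutoff) can be sketched as a simple filter. This is an illustration only: the gene names and values are hypothetical, the p-values are assumed to come from a separate statistical test, and a zero glucose RPKM is treated here as an unbounded fold change, which is a simplifying assumption.

```python
def is_upregulated(rpkm_lp, rpkm_glc, p_value,
                   min_rpkm=10.0, min_fold=2.0, max_p=0.05):
    """True if a gene passes the RPKM, fold-change and p-value cutoffs."""
    if rpkm_lp < min_rpkm:
        return False                      # low transcript level in wood medium
    # Assumption: no glucose signal counts as an unbounded fold change.
    fold = rpkm_lp / rpkm_glc if rpkm_glc > 0 else float("inf")
    return fold > min_fold and p_value < max_p

# Hypothetical records: (gene, RPKM on wood, RPKM on glucose, p-value)
records = [
    ("gene_A", 250.0, 20.0, 0.001),   # 12.5-fold, significant
    ("gene_B", 8.0, 1.0, 0.001),      # below the RPKM floor
    ("gene_C", 40.0, 30.0, 0.010),    # about 1.3-fold only
    ("gene_D", 90.0, 30.0, 0.200),    # 3-fold but not significant
]
upregulated = [name for name, lp, glc, p in records
               if is_upregulated(lp, glc, p)]
```

Tightening `min_fold` to 4.0 and `max_p` to 0.01 gives the stricter style of cutoff used later for the NELP versus ELP comparison.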
Mass spectrometry (nanoLC-MS/MS) identified extracellular peptides corresponding to a total of 319 gene products in NELP and ELP cultures (Dataset S2). Most proteins were observed in both NELP and ELP culture filtrates, which contained 294 and 268 proteins, respectively. Approximate protein abundance, expressed as the exponentially modified protein abundance index (emPAI) [52], varied substantially within samples. As expected, gene products with predicted secretion signals and high transcript levels were often detected. Other detected proteins (e.g. MOX model Phlgi1_120749; [53]) may be loosely associated with cell walls and/or secreted via 'non-classical' mechanisms ([54]; www.cbs.dtu.dk/services/SecretomeP). Still other peptides correspond to true intracellular proteins released by cell lysis, e.g. ribosomal proteins (Dataset S2).
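For readers unfamiliar with emPAI, the index is commonly defined as 10 raised to the ratio of observed to observable peptides, minus one. The sketch below assumes that definition; how observable peptides are counted depends on the search software and is not specified here, and the peptide counts are hypothetical.

```python
def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index.

    emPAI = 10 ** (observed peptides / observable peptides) - 1
    """
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1

def relative_abundance(empai_values):
    """Share of each protein in the summed emPAI of a sample (percent)."""
    total = sum(empai_values)
    return [100.0 * v / total for v in empai_values]

# Hypothetical proteins: (observed, observable) peptide counts
values = [empai(10, 10), empai(5, 10), empai(1, 10)]
shares = relative_abundance(values)
```

Because the index grows exponentially with peptide coverage, modest differences in detected peptides translate into large differences in estimated abundance.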
Glycoside hydrolase gene expression was heavily influenced by media composition, with transcripts corresponding to 76 genes increasing >2-fold in NELP- or ELP-containing media relative to glucose medium (Figure 6). Some of these genes were highly expressed, with RPKM values well over 100. For example, transcript and peptide levels matching GH7 cellobiohydrolase (CBH1; model Phlgi1_34136) were among the ten most highly expressed genes (Table 1). Indicative of a complete cellulolytic system, this CBH1 was accompanied by upregulated transcripts and extracellular proteins corresponding to another CBH1 (Phlgi1_13298), a GH6 family member CBH2 (Phlgi1_17701) and GH5_5 β-1,4 endoglucanases (EGs; Phlgi1_86144, Phlgi1_84111), all of which feature a family 1 carbohydrate binding module (CBM1). Also highly expressed were putative β-glucosidases (Phlgi1_127564, Phlgi1_18210) and a GH12 (Phlgi1_34479). Other glycoside hydrolases likely involved in degradation of cell wall hemicelluloses include GH5_7 endomannanases (Phlgi1_97727, Phlgi1_110296), a GH74 xyloglucanase (Phlgi1_98770), a GH27 α-galactosidase (Phlgi1_72848) and a GH10 endoxylanase (Phlgi1_85016).
Figure 3 (partial caption). GMC oxidoreductases methanol oxidase, glucose oxidase and aryl alcohol oxidase were excluded because confident functional assignments could not be made and/or their inclusion did not contribute to separation of white- and brown-rot species. (B) PCA of 21 fungi using genes encoding 14 enzymes involved in lipid metabolism (KEGG reference pathway 00071, Dataset S1). There is no significant segregation of white-rot and brown-rot fungi although P. gigantea was positioned in the third quadrant with B. adusta and P. carnosa. Symbols for white rot and brown rot fungi appear in blue and red, respectively. Tremella mesenterica is a mycoparasite. For raw data and contributions of the top 20 families see Dataset S1 and Text S1.
Expression of oxidative enzymes implicated in lignocellulose degradation was also influenced by growth on LP-media (NELP or ELP) relative to Glc-containing media. Transcripts corresponding to five LPMO-encoding genes showed significant regulation (p<0.01) in LP-medium, and three LPMO proteins were detected (Phlgi1_227588, Phlgi1_227560, Phlgi1_37310). An AAD-like oxidoreductase (Phlgi1_30343), possibly involved in the transformation of lignin metabolites, was also upregulated. However, we did not observe high expression of class II peroxidases under the conditions tested (Dataset S2). On the other hand, a DyP (Phlgi1_85295) was significantly upregulated in certain LP-containing media (Table 1). The importance of these peroxidases is further supported by the high protein levels of another DyP, Phlgi1_122124. Specifically, the latter protein showed emPAI values >17 after 5 days growth on LP media and, relative to Glc medium, its transcript ratios were >5-fold higher (p<0.04) (Dataset S2). High DyP gene expression has been observed in white-rot fungi Trametes versicolor and Dichomitus squalens [21], but no genes for these proteins are present in P. chrysosporium and C. subvermispora (Figure 4). The P. gigantea DyP (Phlgi1_122124) was also abundant in media containing microcrystalline cellulose (Avicel) as the sole carbon source (Dataset S2).
To identify enzymes involved in tolerance to and/or degradation of extractives, comparisons were made of gene expression in ground loblolly pine wood that had been extensively extracted with acetone (ELP) versus non-extracted loblolly pine wood (NELP) (Figure 7A). In general, this treatment had little impact on gene expression. For example, glycoside hydrolase transcript and protein patterns showed only minor differences (Figure 8). Nevertheless, transcripts corresponding to 22 genes showed significantly increased levels (>4-fold; p<0.01) in NELP relative to ELP (Figure 7B; Table 2). Of particular interest were genes potentially involved in metabolism of resin acids (e.g. CYPs; [55]), in altering the accessibility of cell wall components (e.g., an endoxylanase), and in regulating gene expression (e.g. proteins containing putative Zn finger domains or HMG-Box transcription factors). Integration of transcript profiles of genes involved in intracellular lipid and oxalate metabolism, together with TCA and glyoxylate cycles, strongly supports a central role for β-oxidation in triglyceride and terpenoid transformation by P. gigantea (Figure 9).
Figure 4. Number of genes identified in white rot fungi P. gigantea (Phlgi), P. chrysosporium (Phach) [27], C. subvermispora (Cersu) [22], and H. annosum (Hetan) [75], and the brown rot fungus P. placenta (Pospl) [45]. CROs were distinguished as previously described [76]. Lytic polysaccharide monooxygenases were formerly classified as GH61 within the CAZy system (http://www.cazy.org/; [16]). Glycoside hydrolase family GH5 was subdivided as described [77] (Figure S22). doi:10.1371/journal.pgen.1004759.g004
Relaxing the transcript fold-change threshold (>2-fold; p<0.01) and focusing on mass spectrometry-identified proteins revealed 14 additional genes potentially involved in metabolism and/or tolerance to loblolly pine wood extractives (Table 3). Among these extract-induced genes, lipases Phlgi1_19028 and Phlgi1_36659 likely hydrolyze the significant levels of triglycerides. The substrate specificity of aldehyde dehydrogenases such as Phlgi1_115040 is difficult to assess based on sequence, although several have been implicated in the degradation of pine wood resins by bacteria [56]. Secretome patterns in media containing microcrystalline cellulose (Avicel) as sole carbon source generally supported the importance of the same proteins in the metabolism of pine wood extractives (Table 3, Dataset S2). Specifically, lipases Phlgi1_19028 and Phlgi1_36659 and aldehyde dehydrogenase Phlgi1_115040 were more abundant in loblolly pine wood and in Avicel media relative to the same media without extractives. The roles of peroxiredoxin (Phlgi1_95619) and glutathione S-transferase (Phlgi1_113065) are less clear, but transformations involving H2O2 reduction and glutathione conjugation are possible.
Figure 6 (partial caption). The number of genes encoding mass spectrometry-identified proteins was limited to those matching ≥2 unique peptides after 5-9 days growth in media containing NELP or ELP. RPKM values >100 for RNA derived from these cultures were arbitrarily selected as the threshold for high transcript levels. Genes designated as 'regulated' showed significant accumulation (p<0.05; >2-fold) in NELP or ELP relative to glucose containing media. Methods and complete data are presented in Text S1 and Dataset S2. doi:10.1371/journal.pgen.1004759.g006
Table 1. Differentially regulated genes in media containing non-extracted loblolly pine wood (NELP), solvent extracted loblolly pine wood (ELP), or glucose (Glc) as sole carbon source.
A single MCO (Phlgi1_129839) and its corresponding transcripts were observed to be upregulated in ELP relative to NELP. Although lacking the L2 signature common to laccases [57], the MCO4 protein may have iron oxidase activity provided that an imperfectly aligned Glu residue serves in catalysis (Text S1; Figures S20 and S21; Table S6).

Discussion
The distinctive repertoire and regulation of P. gigantea genes underlie a unique and efficient system for degrading all components of conifer sapwood. Transcriptome and proteome analyses demonstrate an active system of hydrolases and LPMOs involved in the complete deconstruction of cellulose and hemicellulose. The overall enzymatic strategy is therefore similar to most cellulolytic microbes, but unlike closely related brown-rot decay Agaricomycetes such as P. placenta.
With regard to ligninolysis, key genes were identified including LiPs, MnPs, CROs and GMC oxidoreductases. As in P. chrysosporium, the presence of both LiP- and GLX-encoding genes is consistent with a close physiological connection involving peroxide generation [41]. We also annotated the non-class II peroxidases HTPs and DyPs, some of which have been implicated in metabolism of lignin derivatives [58,59]. The distribution and expression of DyP-encoding genes are notable, with no genes present in P. chrysosporium and C. subvermispora but several highly expressed genes in T. versicolor, D. squalens [21] and P. gigantea (Table 2). Physiological roles of DyP are likely diverse, but oxidation of lignin-related aromatic compounds has been demonstrated [59].
In addition to lignin, oxidative mechanisms likely play a central role in P. gigantea cellulose attack. Of 15 LPMO-encoding genes, transcripts of six genes were regulated (>2-fold; p<0.01) and peptides corresponding to three were unambiguously identified in NELP- or ELP-containing media. Our inability to detect any LPMO proteins in Avicel media (Dataset S2) suggests induction by substrates other than cellulose [60]. Beyond this, the CDH gene was highly expressed (transcripts and protein) in LP media. The observed coordinate expression of CDH and LPMO may reflect oxidative 'boosting' as recently demonstrated [19,20,47,61]. However, we did not detect elevated transcripts or peptides corresponding to the two P. gigantea aldose 1-epimerase genes even though these were previously observed in culture filtrates of various white-rot fungi [21,62], including Bjerkandera adusta, Ganoderma sp., and Phlebia brevispora [17]. Thus, it seems unlikely that enzymatic conversion of oligosaccharides to their β-anomers is necessary for efficient CDH catalysis.
Softwood hemicellulose composition typically includes 15-20% galactoglucomannan while hardwoods contain 15-30% glucuronoxylan [40]. Consistent with an adaptation to conifer hemicellulose, GH5_7 β-mannanases were highly expressed in both NELP and ELP cultures, together with a GH27 α-galactosidase (Table 1). GH11 endoxylanase and CE carbohydrate esterase peptides were also detected in the pine wood-containing media, but not in Avicel cultures (Dataset S2). In aggregate, these results demonstrate P. gigantea adaptation to conifer hemicellulose degradation.
P. gigantea's gene expression patterns reveal multiple strategies for overcoming the challenging composition of resinous sapwood. Tolerance to monoterpenes may be mediated in part by a putative ABC efflux transporter (Phlbi1_130987, Figure 9). Of the 51 ABC proteins of P. gigantea, this protein is most closely related to the product of the GcABC-G1 gene of the ascomycete Grosmannia clavigera, a pathogen of Pinus contorta [63]. The GcABC-G1 gene is upregulated in response to various terpenes and appears to be a key element against the host defenses. Consistent with a similar function, our analysis showed the P. gigantea homolog to be upregulated >4.9-fold (p = 0.02) in NELP relative to ELP media (Dataset S2). Other transcripts accumulating in NELP-derived mycelia included three CYPs (Table 2) potentially involved in the hydroxylation of diterpenoids and related resin acids [55]. Differential regulation also implicates glutathione S-transferase, aldehyde dehydrogenase and peroxiredoxin in the transformation and detoxification of extractives (Table 2). Three putative transcription regulators were similarly regulated (Table 3). Aldehyde dehydrogenase- and AAD-encoding genes, some of which are upregulated in P. gigantea LP cultures relative to Glc cultures (Table 1), are induced by aromatic compounds in P. chrysosporium [64,65].
Predicted to degrade triglycerides, a total of nine lipase-encoding genes were identified in the P. gigantea genome and four of these were upregulated >2-fold (p<0.01) in LP media compared to Glc medium (Dataset S2). Two lipases displayed similar patterns of transcript and protein upregulation on NELP relative to ELP (Table 3), and the pine wood extractive also induced accumulation of these lipases in Avicel media (Table 3). Further metabolism of triglycerides is uncertain, although a putative glycerol uptake facilitator (Phlbi1_99331) was highly expressed (RPKM value of 2532) and significantly (p<0.02) upregulated (2.1-fold) in NELP compared to ELP (Dataset S2). Fatty acid activation and β-oxidation can be inferred by the expression of fatty acid CoA ligase (Phlgi1_107548, Phlgi1_126556, Phlgi1_89325), β-ketothiolase (Phlgi1_27649, Phlgi1_130767), and fatty acid desaturase (Phlgi1_100083, Phlgi1_115799). Upregulation of a mitochondrial malate dehydrogenase (Phlgi1_22176, Table 3), together with relatively high transcript levels of other TCA cycle components (citrate synthases Phlgi1_126205, Phlgi1_100215; 2-oxoglutarate dehydrogenase, Phlgi1_126652) may complete fatty acid oxidation.
Figure 8. Glycoside hydrolase encoding genes show similar patterns of expression in media containing freshly ground and non-extracted loblolly pine wood (NELP) relative to the same substrate but extracted with acetone (ELP) to remove pitch and resins. Proteins (upper panel) and transcripts (lower panel) were identified by LC-MS/MS and RNA-seq, respectively. Protein identification was limited to those with >2 unique peptides after five days incubation. Transcript upregulation was limited to significant accumulation (p<0.05; >2-fold) on NELP or ELP relative to glucose-containing medium. Secretome and transcriptome experimental details and complete data are presented in Text S1 and Dataset S2. doi:10.1371/journal.pgen.1004759.g008
In this connection, we also observed high expression of isocitrate lyase (Phlgi1_21482, Phlgi1_93159) and malate synthase (Phlgi1_27815), which partially explain oxalate accumulation [66] and strongly support an active glyoxylate shunt [45,67] (Figure 9). Upregulation of glycoside hydrolases, transcription factors, cyclophilins, ATP synthase and ribonuclease may also reflect broad shifts in metabolism or reduced accessibility of the unextracted substrate (Tables 2 and 3).
Beyond genetic regulation, certain constitutively expressed genes are also likely involved in the degradation of all plant cell wall components, including complex resins and triglycerides. For example, MOX (Phlgi1_120749) is among the most abundant transcripts in both NELP and ELP (Dataset S2), suggesting an important role in H2O2 production associated with lignin demethylation [53]. Extracellular peroxide generation is key to peroxidase activity, and MOX fulfills this role along with CRO, AAO, and P2O. Along these lines, we also observed high extracellular protein levels of DyP (Phlgi1_122124) under all culture conditions. Most problematic, many P. gigantea genes and proteins exhibited little or no homology to NCBI NR or Swiss-Prot entries. Some of these 'hypothetical' or 'uncharacterized' proteins are undoubtedly important, particularly those that are highly expressed, regulated and/or secreted. For example, of 92 genes upregulated (>2-fold; p<0.01) in NELP relative to ELP, 51 were designated as hypothetical (Table 2; Dataset S2). Three of these featured predicted secretion signals and peptides were detected in one case. In the absence of biochemical characterization and/or genetic evidence, assigning function to these genes represents a major undertaking. Nevertheless, high throughput transcript and secretome profiling substantially filtered the number of potential targets from a genome-wide estimate of 4744 'hypothetical' genes to the more manageable numbers reported here. More broadly, the results advance understanding of the early and exclusive colonization of coniferous wood by P. gigantea and also provide a framework for developing effective wood protection strategies, improving biocontrol agents and identifying useful enzymes [6,9,10].

Wood colonization assays
Wood wafers (1 cm by 1 cm by 2 mm) were cut from the sapwood of aspen (Populus tremuloides), pine (Pinus taeda) and spruce (Picea glauca) and sterilized by autoclaving. Wafers were inoculated by contact with mycelium growing on malt extract agar (15 g malt extract [Difco, Detroit, MI] and 15 g agar per liter of water) in Petri dishes. Noninoculated wood wafers placed on the same media in Petri dishes served as controls. Wafers were removed 30, 60 or 90 days later and weighed, and percent weight loss was determined. Additional wafers were removed at the same time points, immediately frozen at −20°C and prepared for scanning electron microscopy as previously described [68].
Table footnotes (partial). Truncated gene model predicts incomplete protein (117 aa). 3 Nineteen of 22 accumulating in NELP relative to ELP as illustrated in Figure 4B. Three additional upregulated genes were associated with LC-MS/MS-detected proteins and listed in Table 3. Proteins with asterisks are also listed in Table 2.

Sequencing and annotation
The genome was sequenced using Illumina and annotated using the JGI Annotation Pipeline [69]. Assembly and annotations are available from the JGI portal MycoCosm [38] and have been deposited in DDBJ/EMBL/GenBank under accession AZAG00000000. The version described in this paper is version AZAG01000000. The completeness of the P. gigantea genome was assessed by finding 99.1% of CEGMA proteins conserved across sequenced genomes of eukaryotes [70] (Text S1; Tables S1, S2).

RNA-seq
Mycelium was derived from triplicate cultures of 250 ml basal salts containing: i. 1.25 g freshly-harvested, ground (1 mm mesh) loblolly pine wood that had been 'spiked' with acetone and thoroughly dried (NELP); or ii. the same material following extended acetone extraction in a Soxhlet apparatus and drying (ELP). The composition of the extract (Text S1) was determined by GC-MS [51]. Duplicate cultures of basal salts medium with glucose as sole carbon source served as a reference. After 5 days incubation, total RNA was purified from frozen mycelium as described [22,71]. Multiplexed libraries were constructed and sequenced on an Illumina HiSeq2000. DNAStar Inc (Madison, WI) modules SeqNGen and Qseq were used for mapping reads and statistical analysis. Transcriptome data were deposited in the NCBI Gene Expression Omnibus (GEO) database and assigned accession GSE53112 (Reviewer access via http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=ilovmswixtajjez&acc=GSE53112). Experimental details are provided in Text S1 and all transcriptome analyses are summarized in Dataset S2.
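Transcript levels throughout this paper are reported as RPKM (reads per kilobase of transcript per million mapped reads). The sketch below uses the standard definition of that unit. It is not a description of the DNAStar implementation, whose normalization details are not given here, and the read counts are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)

# Hypothetical gene: 1,000 reads on a 2 kb transcript, 10 million mapped reads
value = rpkm(1000, 2000, 10_000_000)

# Fold change between two hypothetical libraries of different depth
fold = rpkm(4000, 2000, 20_000_000) / rpkm(1000, 2000, 10_000_000)
```

Normalizing by both gene length and library size is what allows values from cultures sequenced to different depths to be compared directly.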

Secretome analysis
Extracellular proteins in culture filtrates were identified by NanoLC-MS/MS as described [22,72], with minor modifications. For each of the two woody substrates (i.e., NELP and ELP), cultures were harvested after 5, 7 and 9 days. Mass spectrometric protein identifications were accepted if they could be established at greater than 95.0% probability within a 0.9% False Discovery Rate and contained at least two identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm [73]. To verify the effects of pine wood extractives in a well-defined substrate, media containing microcrystalline cellulose (Avicel) were also employed [22,45,74]. Filtrates from these cultures, with or without addition of loblolly pine wood acetone extract, were collected after 5 days and analyzed. Approximate protein abundance in each of the cultures was expressed as the number of unique peptides and the exponentially modified protein abundance index (emPAI) value [52] (see Text S1 for detailed methods).

Supporting Information
Figure S1 Vista dot plot illustrating syntenic relationship between 12 longest scaffolds of P. gigantea and P. chrysosporium. (EPS)

Figure S2 Top 20 families contributing to PC1 and PC2 values in Figure 3A. The x-axis designates each enzyme family and the y-axis indicates the squared rotation values for PC1 and PC2. As shown in Figure 2, the PC2 value mainly separated the white- and brown-rot fungi. (EPS)

Figure S3 Ten genes encoding enzymes potentially involved in lipid metabolism contributing to PC1 and PC2 values in Figure 3B. The x-axis designates each enzyme family and the y-axis indicates the squared rotation values for PC1 and PC2. (EPS)

Figure S4 Phylogenetic analysis of opsin genes from P. gigantea (Phlgi), C. subvermispora (Cersu), P. placenta (Pospl), A. nidulans (AN), Sordaria macrospora (SM) and Neurospora crassa (NCU). The evolutionary history was inferred using the Minimum Evolution method [78].
The bootstrap consensus tree inferred from 500 replicates (MEGA4) is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% of bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method [79] and are in the units of the number of amino acid substitutions per site. The ME tree was searched using the Close-Neighbor-Interchange (CNI) algorithm [4] at a search level of 1. The Neighbor-joining algorithm [80] was used to generate the initial tree. All positions containing gaps and missing data were eliminated from the dataset (Complete deletion option). There were a total of 183 positions in the final dataset. Phylogenetic analyses were conducted in MEGA4 [81].

Figure S8 Homology models for the molecular structures of class II heme peroxidases from the P. gigantea genome. Ligninolytic peroxidases, including LiP models - A) 150531 peroxidase, B) 121662 peroxidase and C) 30372 peroxidase - harboring an exposed tryptophan potentially involved in oxidation of high redox-potential substrates, and MnP models - D) 75566 peroxidase, E) 75572 peroxidase, F) 115591 peroxidase, G) 115592 peroxidase and H) 117668 peroxidase - harboring a putative Mn2+ oxidation site (formed by two glutamates and one aspartate); and I) manually curated GP (32509). Note that alanine and asparagine residues in the LiP models occupy the positions of the catalytic glutamate and aspartate involved in Mn2+ oxidation by MnP, and a serine residue in the MnP models occupies the position of the putative catalytic tryptophan characterizing LiP.
The amino acid numbering refers to putative mature sequences, after manual processing of their peptide sequences. (EPS)

Figure S9 Dendrogram showing evolutionary relationships among 478 basidiomycete heme peroxidases, including structural-functional classification based on Ruiz-Dueñas et al. (2) (GenBank and JGI references in parentheses and P. gigantea genome references on yellow background). Amino-acid sequence comparisons as Poisson distances and clustering based on UPGMA and "pairwise deletion" option of MEGA5 [81].

Figure S12 Dendrogram focused on class II heme peroxidases (a total of 219) showing evolutionary relationships and structural-functional classification. A) Short, long and extralong MnPs have a Mn2+-oxidation site formed by two glutamic and one aspartic residues, and differ in the length of the C-terminal tail; B) LiPs contain a catalytic tryptophan, with the only exception of TRACE-LiP being the "unique" ligninolytic peroxidase with a catalytic tyrosine (3); C) VPs harbor the catalytic sites described above for both MnPs and LiPs; D) GPs do not contain either of the above two catalytic sites; and E) atypical MnPs and VPs lack one of the three acidic residues forming the Mn2+-oxidation site. The analysis is described in Figure S9.

Figure S14 Multiple alignments of aryl alcohol oxidase (AAO) sequences from P. gigantea (Phlgi128071, Phlgi121514) and P. eryngii (AAOpe) [82]. Highly conserved histidine active site residues [83] are in red. Bottom lines show conserved motifs in GMC family [84]. Obtained using Clustal W [85]. The Phlgi128071 and Phlgi121514 models conserve the substrate-binding pocket reported for AAO from P. eryngii and the motifs in GMC family. (EPS)

Figure S15 Multiple alignments of methanol oxidase (MOX) sequences from P. gigantea (Phlgi120749, Phlgi108516 and Phlgi72751) and G. trabeum (ABI14440.1) [53]. Highly conserved histidine/asparagine active site residues [84] are in red.
Bottom lines show conserved motifs in GMC family [84]. Obtained using Clustal W [85]. The MOX models conserve motifs in GMC family. (EPS)

Figure S16 Multiple alignments of cellobiose dehydrogenase (CDH) sequence from P. gigantea (model Phlgi99876) and Gelatoporia subvermispora (ACF60617). Highly conserved active site residues are in red [86]. Obtained using Clustal W [85]. (EPS)

Figure S17 Multiple alignments of pyranose oxidase (POX) sequences from P. gigantea (Phlgi130349), Peniophora sp. (AAO13382.1) and Trametes ochracea (AAP40332.1). Obtained using Clustal W [85]. (EPS)

Figure S18 Multiple alignments of glucose oxidase (GOX) sequences from P. gigantea (Phlgi128108), B. fuckeliana (CDA88590.1), C. immitis (EAS27606) and A. niger (AAF59929.2) [1]. Highly conserved histidine active site residues [84] are in red. Bottom lines show conserved motifs in GMC family [84]. Obtained using Clustal W [85]. (EPS)

Figure S19 Multiple alignments of eight putative aryl alcohol dehydrogenase (AAD) sequences from P. gigantea (Phlgi1) and P. chrysosporium (AAA61931.1) [36]. Obtained using Clustal W [85]. (EPS)

Figure S20 Phylogenetic tree of multicopper oxidases from Acremonium sp. (Acr), Aspergillus nidulans (Ani), Cryptococcus neoformans (Cne), Coprinopsis cinerea (Cci), Pleurotus ostreatus (Pos), Phanerochaete carnosa (Pca), Phanerochaete chrysosporium (Pch), Phanerochaete flavido-alba (Pfa), Phlebiopsis gigantea (Pgi), Postia placenta (Ppl), Saccharomyces cerevisiae (Sce), Schizophyllum commune (Sco), Serpula lacrymans (Sla), Sporobolomyces roseus (Sro), Tremella mesenterica (Tme), and Ustilago maydis (Uma). Alignments were produced with ClustalW, manually adjusted in Genedoc and used in MEGA4.0 to produce the phylogenetic tree (neighbour-joining, bootstrap values 500).
ID numbers refer to protein models in the JGI MycoCosm (http://genome.jgi-psf.org/programs/fungi/index.jsf), other accession numbers to the NCBI database (http://www.ncbi.nlm.nih.gov/), and specific names to proteins specified with accession numbers in Hoegger et al. [29] and Lettera et al. [87]. Blue coloring marks enzymes with laccase activity experimentally shown, light brown enzymes with proven ferroxidase activity, purple enzymes with ascorbate activity, olive enzymes acting in fungal pigment synthesis, and two colors dual enzymatic activities with the left color marking the respective major performance ([88,89] and references in the review of Kües and Rühl 2011 [90]). Proteins from P. gigantea are highlighted in red. (EPS)

Figure S21 Sequence alignment for four regions of S. cerevisiae ferroxidase Fet3 with corresponding regions of enzymes of P. gigantea and Phanerochaete species. Marked in yellow are residues that in Fet3 of S. cerevisiae are critical for Fe2+ binding and the electron-transfer pathway (for references see [90]). Three groups of enzymes become obvious: i) the Fet3-type ferroxidases; ii) within the cluster of ferroxidases/laccases, one subgroup that is more similar to the Fet3-type ferroxidases and to which the ferroxidase Mco1 of P. chrysosporium belongs; and iii) one more distinct subgroup that lacks three amino acids in the second region of importance for binding pocket formation and to which the P. flavido-alba bona fide laccase PfaL belongs. (EPS)

Figure S22 Phylogenetic analysis and subfamily assignments of GH5 protein models of P. gigantea (Phlgi), H. annosum (Hetan) and Stereum hirsutum (Stehi). (EPS)

Figure S23 Phylogenetic analysis of LPMO proteins of P. gigantea, C. subvermispora, and P. chrysosporium. (EPS)

Figure S24 Phylogeny and differential expression of carbohydrate esterase family 1 (CE1) genes. CDS sequences were obtained from each genome database according to assigned protein IDs.
Incomplete CDS sequences (partial fragments) were eliminated from the analysis. For each CE family, a multiple alignment was performed using MegAlign version 10 software. The phylogenetic tree was then constructed from the multiple alignment using Clustal W [85].

Figure S25 Phylogeny and differential expression of CE4 genes. Analysis and abbreviations as in Figure S24. (PDF)

Figure S26 Phylogeny and differential expression of CE8 genes. Analysis and abbreviations as in Figure S24. Possible CE8 gene Phlgi_132681 was excluded as the model was severely truncated (79 residues). (PDF)

Figure S27 Phylogeny and differential expression of CE9 genes. Analysis and abbreviations as in Figure S24. (PDF)

Figure S28 Phylogeny and differential expression of CE12 genes. Analysis and abbreviations as in Figure S24. (PDF)

Figure S29 Phylogeny and differential expression of CE15 genes. Analysis and abbreviations as in Figure S24. (PDF)

Figure S30 Phylogeny and differential expression of CE16 genes. Analysis and abbreviations as in Figure S24. Possible CE16 gene Phlgi_73119 was excluded as the model was severely truncated (95 residues). (PDF)

Figure S31 Phylogenetic tree of the cytochrome P450 proteins (P450ome) in P. gigantea. The tree was constructed using 124 P450 sequences (which were full-length or near full-length) and evolutionary history was inferred using the bootstrap Neighbor-Joining method. Phylogenetic analyses were conducted using MEGA4 [81]. The P450 listing in the tree is based on the corresponding protein ID with the CYP name in parentheses. P450s that belong to a new subfamily are indicated with the abbreviation NS. (EPS)

Figure S32 Comparative evolutionary analysis of the P450omes of P. gigantea and Phanerochaete species (P. chrysosporium and P. carnosa). Clan level comparison was made for this analysis. (EPS)

Figure S33 Multiple alignment of representative protein sequences of P. gigantea hydrophobins.
Sequences were aligned using the MUSCLE alignment tool implemented in Molecular Evolutionary Genetic Analysis software (MEGA 5.0). MUSCLE was chosen because it is computationally more suitable for multiple sequence alignments and gives better accuracy than the conventional CLUSTAL alignment [91]. The aligned protein sequences were viewed with the Biological sequence alignment editor (BioEdit) for Windows 95/98/NT/2K/XP. Identical amino acid residues are marked in black; conserved cysteine residues are marked with asterisks. (EPS)

Figure S34 The evolutionary history of P. gigantea hydrophobins with a selected set of closely related basidiomycetes and Acremonium alcalophilum, an ascomycete. Evolutionary relatedness was inferred using the Neighbor-Joining method [80]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches. Branches in blue represent hydrophobin sequences from P. gigantea, and branches without support values have less than 50% support. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Poisson correction method [79] and are in the units of the number of amino acid substitutions per site. The analysis involved 174 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 155 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [92].
Fungal species IDs: |Cersu| (Ceriporiopsis subvermispora), |Phlgi| (Phlebiopsis gigantea), |Phchr| (Phanerochaete chrysosporium), |Gansp| (Ganoderma sp.), |Phlbr| (Phlebia brevispora), |Serla_varsha| (Serpula lacrymans), |Wolco| (Wolfiporia cocos), |Hetan| (Heterobasidion annosum), |Schco| (Schizophyllum commune), |Copci| (Coprinopsis cinerea), |Lacbi| (Laccaria bicolor), |Ustma| (Ustilago maydis), |Acral| (Acremonium alcalophilum). The tree is rooted at Acremonium alcalophilum, representing class II of hydrophobins.

Table S13 Overview of the P. gigantea P450ome and comparison with P. chrysosporium and P. carnosa. (DOCX)

Table S14 Clan-, family-, and subfamily-level classification of the P450ome of P. gigantea and comparison with P. chrysosporium and P. carnosa. (DOCX)

Text S1 Detailed description of methods and annotated gene families. (DOCX)

Dataset S1 Number and distribution of genes used for PCA. (XLSX)

Dataset S2 Complete listing of P. gigantea protein models and expression data. (XLSX)

Author Contributions