Whole Genome and Global Gene Expression Analyses of the Model Mushroom Flammulina velutipes Reveal a High Capacity for Lignocellulose Degradation

Flammulina velutipes is a fungus with health and medicinal benefits that has been used for consumption and cultivation in East Asia. F. velutipes is also known to degrade lignocellulose and produce ethanol. The overlapping interests of mushroom production and wood bioconversion make F. velutipes an attractive new model for fungal wood related studies. Here, we present the complete sequence of the F. velutipes genome. This is the first sequenced genome for a commercially produced edible mushroom that also degrades wood. The 35.6-Mb genome contained 12,218 predicted protein-encoding genes and 287 tRNA genes assembled into 11 scaffolds corresponding with the 11 chromosomes of strain KACC42780. The 88.4-kb mitochondrial genome contained 35 genes. Well-developed wood degrading machinery with strong potential for lignin degradation (69 auxiliary activities, formerly FOLymes) and carbohydrate degradation (392 CAZymes), along with 58 alcohol dehydrogenase genes were highly expressed in the mycelium, demonstrating the potential application of this organism to bioethanol production. Thus, the newly uncovered wood degrading capacity and sequential nature of this process in F. velutipes, offer interesting possibilities for more detailed studies on either lignin or (hemi-) cellulose degradation in complex wood substrates. The mutual interest in wood degradation by the mushroom industry and (ligno-)cellulose biomass related industries further increase the significance of F. velutipes as a new model.


Introduction
Flammulina velutipes, also known as winter mushroom or Enokitake, belongs to the Marasmioid clade [1] (part of the order Agaricales), which includes the recently sequenced model mushroom Schizophyllum commune [2] (split gill fungus) and the edible mushroom Lentinula edodes (Shiitake). F. velutipes is further classified in the family Physalacriaceae, together with Armillaria species (honey fungus), which encompass edible fungi and plant pathogens. As such, the genome of F. velutipes KACC42780 is significant within Flammulina species and in studies of other edible and environmentally important mushrooms.
F. velutipes is one of the 6 most actively cultivated mushroom species in the world; over 300,000 tons of this mushroom are produced per year [3]. As a typical white-rot fungus, F. velutipes can be found on dead wood. Its distribution is limited to the temperate zones of the world because a cold period is required for fruiting [4]. This behavior is reflected in commercial cultivation, which uses media mimicking wood (sawdust-based or corncob) and cold treatment. Reduction of the crop cycles (45-55 days), cold treatment, and the amount of waste material (1 kg substrate/ 300 g mushrooms) are important topics for modern cultivation.
Bioethanol production has been attracting worldwide interest due to global warming and the need for global energy security. Lignocellulose from plant cell walls, the most abundant biomass source in nature, is expected to have applications in future bioethanol production. Moreover, the edible mushroom F. velutipes has been shown to be an efficient ethanol producer, and its properties of ethanol fermentation from various sugars and whole crop sorghums have been characterized in several studies [5,6].
However, the production of ethanol from lignocellulosic biomass is expensive; thus, many researchers are focusing on low-cost consolidated bioprocessing (CBP) for bioethanol production. Bioethanol could be produced in one reactor by CBP, which includes the following 3 biological events: production of lignocellulose-degrading enzymes, hydrolysis of polysaccharides present in pretreated biomass, and fermentation of hexose and pentose sugars [7]. Basidiomycetes can degrade lignin [8], and some basidiomycetes produce alcohol dehydrogenase, allowing the production of ethanol using mushrooms instead of S. cerevisiae [9,10]. These properties of basidiomycetes appear suitable for use in new strategies of bioethanol production. However, the use of basidiomycetes in bioethanol production is not common, and the ethanol fermentation abilities of basidiomycetes are not well characterized. Therefore, better insight into the substrate degrading machinery of F. velutipes will be helpful.
Although F. velutipes can be cultured on defined media and can be genetically modified by transformation methods, such as Agrobacterium-mediated transformation (AMT), the calcium-PEG method [5], research identifying the genetic sources involved in these various functions is restricted because only few genes are known. In order to successfully apply multifaceted approaches (combining mycological, biochemical, pharmacological, and metabolic technique studies), identification of the whole genome sequence of F. velutipes, in addition to subsequent gene expression profiling, will be necessary to understand the genetic basis of mushroom formation and regulation of developmental stages and tissue growth.
Here, we report the genome sequence of F. velutipes. This information, along with predicted genes of F. velutipes, will facilitate our understanding of the fundamental and specific cellular processes of mushroom-forming basidiomycete fungi for commercial production by molecular breeding and industrial use, such as bioethanol production from biomass.

Strains and Culture Conditions
Flammulina velutipes was obtained from the Korean Agricultural Culture Collection (KACC; Rural Development Administration, Korea; http://www.genebank.go.kr/) and was grown at 26uC on MCM agar (0.2% peptone, 2% glucose, 0.2% yeast extract, 0.05% MgSO 4 , 0.046% KH 2 PO 4 , 0.1% K 2 HPO 4 , and 1.5% agar) for two weeks. F. velutipes KACC42780 monokaryotic strain and F. velutipes KACC43778 dikaryotic strain were used for genome sequencing and transcriptome analysis, respectively. The F. velutipes dikaryotic strain KACC43778 resulted from a cross between strains KACC42780 and KACC43777. For genomic DNA and total RNA isolation from mycelia, a 300-mL Erlenmeyer flask containing 50 mL MCM medium was inoculated with fresh plugs from the plate (five mycelial plugs/flask) and incubated at 26uC for two weeks without agitation. To obtain primordia and fruiting bodies, mycelial plugs were inoculated on a growth medium consisting of 80% poplar sawdust, 20% rice bran, and 65% water in a 1,000 cm 3 disposable bottle after sterilization [11]. Inoculated bottles were incubated at 20uC under dark conditions for 25 days, and then transferred to fruiting conditions, maintained at 15uC, at 90% humidity, and with continuous light. Primordia, which measured 2-3 mm in diameter, budded at 10 days after physical stimulation and were collected. After budding of primordia, fruiting was stimulated by cold shock at 4uC for 3 days. At 7 days after cold stimulation, fruiting bodies were matured in conditions of 9uC and 75% humidity and collected.

Genome Sequencing and Assembly
Genome Sequencer FLX system (GS-FLX Titanium, Life Sciences) libraries were generated using genomic DNA from F. velutipes strain KACC42780 in accordance with the manufacturer's instructions. For GS FLX Titanium paired-end sequencing, 15 ug of genomic DNA was sheared into DNA fragments ranging from 400 to 800 bp by nebulization. After both ends of the DNA fragments were repaired and phosphorylated, 2 types of adaptors (A and B) were ligated to the DNA fragments. DNA fragments carrying the 59-biotin of adaptor B from the ligation mixture were immobilized onto magnetic streptavidin-coated beads. Singlestranded template DNA (ssDNA) molecules carrying adaptor A at the 59-end and adaptor B at the 39-end were isolated by alkaline denaturation. These purified ssDNAs were then hybridized to DNA capture beads and clonally amplified by an emulsion polymerase chain reaction (PCR) method. After denaturation of the amplified double-stranded DNAs on the capture beads, beads with single-stranded molecules were spread onto each well of a pico titer plate. For GS FLX Titanium mate-paired library sequencing, 15 ug of genomic DNA was sheared into DNA fragments (3 and 5 kb). After the ends of the DNA fragments were repaired and internal adaptors were ligated to the DNA fragments, each DNA fragment was circularized and self-ligated. The circular DNA was sheared into DNA fragments ranging from 400 to 800 bp by nebulization. DNA fragments carrying the 59-biotin of the internal adaptor from the ligation mixture were immobilized onto magnetic streptavidin-coated beads. Mate-paired sequencing was carried out in a method similar to that described above.

BAC Library Construction and Physical Mapping
To generate a physical map, the BAC library was prepared from a partial HindIII and EcoRI digest of high-molecular-weight genomic DNA from F. velutipes in the vector pBeloBAC11 using standard methods [12]. BAC clone DNA was extracted from a single colony by a standard alkaline lysis extraction method. Inserts of 7,680 BAC clones were sequenced from both ends using universal primers, an ABI 37306l DNA analyzer, and an ABI BigDye Terminator Cycle Sequencing Kit (Applied Biosystems, Foster City, CA, USA). A physical map of F. velutipes, based on BAC clones, was constructed by high information content fingerprinting (HICF) analysis according to the manufacturer's standard protocol (ABI SNaPShot Multiplex System). BAC clones were analyzed in terms of band sizes following restriction enzyme digestion by GeneMapper and Genoprofiler, which were then assembled into an FPC map by FPC V8. The summed length of 13 FPC contigs was 39,612 consensus bands (CB), very close to the sum of contig sequence lengths [13].

Gap Filling and Scaffolding
The 272 BAC clones, spanning the minimal tiling path in the FPC map but not covered by contig sequences, were selected and fully sequenced. The 256,356 GS-FLX shotgun reads were assembled by Newbler Assembler (v 2.5.3) software into corresponding draft BAC sequences for 161 BAC clones, while 86,126Sanger shotgun reads were assembled by Phrap software for 111 BAC clones. The sequencing gaps in the draft assemblies of each BAC clone were filled by primer walking. The resulting 272 finished BAC clone sequences were assembled by Phrap together with the contigs from the 271 initial scaffolds, producing 11 final scaffolds, with N50 being 3.9 Mb (500 scaffold contigs spanning a total of 35.3 Mb). Each of the 11 scaffolds was confirmed to be the corresponding chromosome.

Scaffold Assignment
We previously reported the electrophoretic karyotype of the F. velutipes KACC42780 monokaryotic strain using pulse field gel electrophoresis (PFGE) [14]. Therefore, PFGE and Southern hybridization were used to assign the 11 scaffolds to chromosomal bands. The scaffold primers used to make DNA probes were designed to generate unique DNA fragments of 300 bp by PCR amplification. A DNA probe was generated for each of the 11 scaffolds representing chromosome pieces. PFGE and Southern hybridization analysis for assignment of scaffolds to chromosomes were carried out according to methods described by Park et al. [14].
In order to remove redundancy of the genes, the following rules were used. Firstly, Fgenesh+ models with a score of 100 or greater were collected, and the ones from C. cinerea were kept when there were any overlaps between C. cinereaand L. bicolor-based models (gene set 1, containing 6,283 genes). The gene models from Cufflinks were then selected and included in gene set 1, forming gene set 2 (8,082 genes) when they did not overlap with the models in gene set 1 and their protein coding region covered at least 50% of their transcripts. The reason for this complicated rule was that Cufflinks produced many transcripts with frameshift mutations, resulting in truncated protein products. Finally, the additional 3,204 gene models from the Fgenesh prediction were included when they did not overlap with the models in gene set 2, resulting in the final gene set 3 (12,218 genes). The overlaps between gene models were checked by bedtools [15]. For functional annotation of the predicted genes, the genes were compared using BLAST (version 2.2.17) software with a series of protein and nucleotide databases, including the NCBI nucleotide (Nt; http://blast.ncbi. nlm.nih.gov/Blast.cgi), nonredundant set (Nr; http://blast.ncbi. nlm.nih.gov/Blast.cgi), UniProt/Swissprot (http://www.ebi.ac. uk/swissprot/sptr_stats/index.html and http://www.expasy.org/ sprot/relnotes/relstat.html), Gene Ontology (GO) [16], eukaryotic orthologous groups (KOGs) [17], CDD, and Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.genome.jp/kegg/) protein databases. The E-value cutoff for BLAST comparison was 0.001. The genome contained 287 tRNAs, as predicted by tRNAscan-SE (version 1.23) [18] and 10 pseudogenes.

Repeat Content Identification
RepeatMasker (A.F.A. Smit, R. Hubley, & P. Green, unpublished data; current version: open-3.3.0 at http://repeatmasker. org) was used to generate de novo repeat sequence predictions for F. velutipes with default parameters. Transposable elements (TEs) and repetitive elements were further identified by comparison to known repeats with Censor (http://www.girinst.org/censor/ index.php) with default parameters.

Total RNA Preparation
Samples from three biological replicates of each developmental stage and tissue were ground to a fine powder under liquid nitrogen using a mortar and pestle and stored at 280uC. Total RNA was prepared from tissue samples (100 mg) using TRIzol reagent in accordance with the manufacturer's instructions (Invitrogen Life Technologies, USA). Total RNA (10 mg) was treated for 30 min at 37uC with 1 U of RQ1 RNase-free DNase (Promega, USA).

Transcriptome Analysis
For comparative transcriptome analysis, total RNA was isolated from 3 different developmental stages of the F. velutipes KACC43778 dikaryotic strain as described [30]. Total RNA was further processed using an Agilent Technologies 2100 Bioanalyzer for RNA integrity and quality (value greater than or equal to 8). Library preparation and sequencing were performed with 1 mg of each total RNA using a HiSeq2000 sequencing system under the manufacturer's standard protocol. Short reads of the F. velutipes KACC43778 dikaryotic strain (a total of 11,324,134 reads) were assembled using Cufflinks (http://cufflinks.cbcb.umd. edu/manual.html). The 10,528 transcripts were used in the following steps. Short-reads (a total of 11,324,134 reads) were directly mapped to the 12,218 predicted genes of the F. velutipes KACC42780 genome using BWA (Burrows-Wheeler Aligner) [31] with the parameter set at -q 20 (-q: quality threshold for read trimming down to 35 bp), and the Genome Analysis Toolkit (GATK) [32] was applied based on quality score recalibration using standard hard filtering parameters. Mapping results were manipulated alignments and counting reads of mapping by samtools [33] (http://samtools.sourceforge.net/), and the number of aligned reads was interpreted as the expression level (reads per kilobase of exon model per million mapped reads [RPKM]). In addition, we performed quantile normalization using the R package 'preprocesscore' [34] within each sample.

RT-PCR Analysis
For RT-PCR analysis, the reverse transcription of RNA (1 mg) in a 20-mL reaction volume was performed using oligo-dT18 and ImProm-II reverse transcriptase (Promega, USA). Reactions were incubated at 25uC for 5 min, at 42uC for 60 min, and then at 70uC for 10 min to inactivate the reverse transcriptase. The PCR reaction was conducted in a 50-mL reaction mixture containing 10 mM dNTP mixture, 10 pmol of each specific primer, 1 unit Taq

Data Access
This whole-genome sequencing project has been deposited at DDBJ/EMBL/GenBank (http://www.ncbi.nlm.nih.gov/) under the accession number AQHU00000000 (the project accession number PRJNA191921). The version described in this paper is the first version. Sequence reads have been deposited in the short-read archive at GenBank under the following accession numbers: SRA068777 contains genomic reads by the Genome Sequencer FLX system (GS-FLX Titanium) and SRA068291 contains RNA-Seq reads by the Illumina HiSeq2000. All of the data generated in this project, including those related to gene prediction, gene functional annotations, and transcriptomic data, are available on our interactive web DB (http://112.220.192.2/fve/).

General Features of the Genome
The genome sequence of the F. velutipes monokaryotic strain KACC42780 was analyzed by the Genome Sequencer FLX system (GS-FLX Titanium) with 37.76 (paired-end) and 606 (mate-paired; 3 and 5 kb) sequence coverage ( Figure S1). The Newbler Assembler (v 2.5.3) program was used to assemble the draft genome sequence. The resulting assembly consisted of 500 sequence contigs with a total length of 35.3 Mb and an N50 length of 252.7 kb (that is, 50% of all bases were contained in contigs of at least 252.7 kb). To cover the gaps in the draft genome assemblies resulting from the Newbler Assembler, 272 BAC clones (minimal tiling paths) were selected and mapped on the draft genome assemblies. After anchoring the BACs, contigs were assembled into 11 largest scaffolds, representing the 11 chromosomes [14], with a total length of 35.6 Mb (including gaps between contigs) and an N50 length of 3.9 Mb (Figure 1). PFGE and Southern hybridization analysis allowed 11 scaffolds to be assigned to each corresponding chromosome of F. velutipes ( Figure 2). The general features of the F. velutipes genome, including assembly and gene model statistics, are presented in Table 1 and Table S1. The genome contained 12,218 predicted protein-encoding genes, of which 86% were supported by expressed sequenced tags (ESTs) and 91.5% by RNA-seq (Tables S2 and S3). However, 1,451 out of the 12,218 models were shorter than 500 nt in size, and therefore, most of these regions were unlikely to encode proteins. The total number of genes in F. velutipes was comparable to that of its nearest sequenced species, S. commune [2], as well as to other basidiomycetes of similar genome size ( Table 2). In addition, 287 tRNA genes in the F. velutipes genome were identified by tRNAscan-SE [17] ( Table S4). The average exon and intron lengths were 245.7 and 180.4 nucleotides, respectively. The average gene density, similar among the larger scaffolds, was 1 gene per 2.9 kb (Table S1). Of the 12,218 predicted genes, 74% (9,051) had significant sequence similarity to documented proteins in BLAST-NR (Table S5). BLASTP searches against the NCBI-NR fungal protein database revealed that 7,169 (58.6%) of the predicted proteins shared sequence similarity (BLAST-NR) to documented fungal sequences (data not shown). Cluster analysis with other sequenced fungi identified 5,748 groups containing at least 1 F. velutipes protein ( Table S6). Analysis of these clusters suggested that 34% of F. velutipes proteins had orthologs amongst the Dikarya and were thus conserved in Basidiomycota and Ascomycota ( Figure 1). No less than 47% (5,751) of genes were unique to F. velutipes, according to OrthoMCL analysis, of which 46.7% (2,688) of genes had inparalog(s) in F. velutipes. Of the 12,218 predicted genes, 50.8% (6,214) and 69.9% (8,543) matched with unknown and known functions (cut-off e-value#0.001) and could be annotated with a gene ontology (GO) [16] term and functional catalog (FunCat), respectively (data not shown). In addition, 9,589 (78.48%) and 9,049 (74%) proteins showed sequence similarity to sequenced organisms in the KOGs [16] (eukaryotic orthologous groups) and KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.genome.jp/kegg/) databases, respectively (data not shown). In addition, the F. velutipes genome contained genes encoding 681 putative transcription factors (Table S7).

Global Gene Expression
NGS-based RNA-Seq was carried out to assess gene expression in 3 key developmental stages in F. velutipes: mycelium, primordium, and mature fruiting bodies, the latter which was subdivided into the 2 major tissues that make up mature fruiting bodies, i.e., stipe and pileus ( Figure 3 and Table S3). This revealed high numbers of expressed genes in all stages of F. velutipes. A total of 11,188 genes were expressed at least once, with 10,121 genes expressed at all stages ( Figure 3b). Only 1,030 genes (8.4%) were never expressed under the examined conditions. Our data further showed the first global expression analysis linked to the complete genome of F. velutipes. These data revealed that fruiting bodies of F. velutipes were highly specialized, with 2% of the genes being stage specific, compared to 0.6% and 0.5% for primordia and mycelia. Primordia and complete fruiting bodies (stipe and pileus) also showed more specific genes in common (613) than fruiting bodies and mycelia (96 genes) or primordia with mycelia (18 genes), indicating more overlap in developmental processes taking place in primordia and mature fruiting bodies. Within fruiting bodies, a high level of differentiation between the distinct tissues also became apparent. The differences between the stipe and pileus (122 and 329 tissue-specific genes) implied a higher level of specification in pilei, but demonstrated that stipes, though superficially simple, are subjected to complex regulation as well. This is illustrated by the activity of mitochondria, which was reduced in stipe tissue (Table S8), but not in pilei. Two hundred four of the 222 fruiting body-specific genes were unique to F. velutipes. Eighteen showed orthology with Agaricales or more distant fungi, and 54 shared no homology with any current database ( Table S9), suggesting that many genes unique to F. velutipes were specifically expressed in fruiting bodies. This implied that many specific traits (species or clade specific) exclusively develop in fruiting bodies. A similar trend was observed in S. commune [2] and Agaricus bisporus [35].

Lignocellulolysis
The F. velutipes genome contains a vast array of genes coding for the initial lignin degradation (Auxiliary Activities, AA; formerly FOLymes) [36,37]. Among 69 AA genes in the F. velutipes genome, AA3 (glucose-methanol-choline oxidoreductase including cellobiose dehydrogenase, aryl-alcohol oxidase/glucose oxidase, alcohol oxidase, pyranose oxidase), AA7 (glucooligosaccharide oxidase), and AA9 (lytic polysaccharide monooxygenase;GH61) genes were prominently present, with 23, 15, and 16 genes, respectively (Table S10). These numbers were significantly higher than those in the closest known white-rot basidiomycete S. commune and Phanerochaete chrysosporium, with the exception of AA9 in S. commune   (Table S10, log2 values). Thus, the stipe and pileus, belonging to the same mature fruiting body, exhibited differential regulation of these enzyme groups, with overall expression being higher in the stipe than in the pileus ( Table S10). The expression levels of glucose-methanol-choline (GMC) oxidoreductase (AA3) differed between the mycelium and fruiting body developmental stages (primordia, stipe, pileus). Additionally, because 4 out of the 11 genes upregulated by 2-fold or more exhibited peaked expression ($5.5 log2 values) during the mycelium stage, AA3 appeared to have an important role in this stage. In addition, 6 AA genes (1 AA1, 1 AA2, 2 AA3 and 2 AA5) showed higher expression levels than the overall average expression level of the other AA genes during all stages and in all tissues (Figure 3c and Table S10). Comparison of stipe tissue, pilei, and primordia further indicated a certain level of coregulation in stipes and pilei, i.e., expression in both the pileus and stipe was considerably lower or higher than in primordia. Expression analysis showed that all enzymes except for 1 AA3 were expressed in multiple stages, some with high expression connected to 1 specific stage. This, together with the sequential nature of degradation in F. velutipes, offers attractive possibilities for detailed studies on lignin degradation. The dbCAN CAZyme annotation of the predicted amino acid sequences of F. velutipes genes against the CAZy revealed 432 CAZymes, including 193 GHs, 85 GTs, 23 PLs, 91 CEs, and 40 CBMs, as well as multiple domain proteins, in the F. velutipes genome ( Table 3, S11, and S12) [38]. Comparisons with the seven other basidiomycetes showed that F. velutipes had a higher number of all CAZymes (GHs, GTs, PLs, and CEs) than the average number for this lineage (162, 71, 8, and 30, respectively) [39]. In addition, F. velutipes had the highest number of genes encoding PLs (23) and CEs (91) among the seven other basidiomycetes. Comparisons with the 17 major CAZyme families reported in Floudas et al. [40] showed 69 genes in 14 of the 17 gene families. Absence of GH51 (endoglucanase) and CE9 (Nacetylglucosamine 6-phosphate deacetylase) is unusual in other white-rot fungi, but does not seem to impair the wood degrading ability of F. velutipes [41]. The distribution of CAZymes, with multiple copies in GH5, GH16, and GH18 (cellulose, b-glucan, and chitin degradation, respectively) families was consistent with other white-rot fungi, as was the presence of GH11 (xylanase) and CE12 (acetylesterase) gene families, which are generally absent in brown-rot fungi [40]. Cellulose is the main polymeric component of the plant cell wall and contains both highly crystalline regions and less-ordered amorphous regions. Glycoside hydrolase (GH) families GH6 and GH7 include cellobiohydrolases that are involved in the attack of crystalline cellulose (insoluble substrate) [42]. F. velutipes also contained 2 and 3 putative cellobiohydrolases assigned to family GH6 (absence in the 7 other basidiomycetes) and GH7, respectively (Table S11) [38].
RNA-seq analysis suggested stage-or tissue-restricted expression for some of these genes, since 13 and 2 out of 392 genes showed no expression in the mycelia stage and stipe, respectively (Table  S12). Among the 14 genes showing no expression in the mycelia  stage, GH71 and GH43 showed the highest expression levels during fruit body and primordial stage, respectively. The signal peptide prediction of amino acid sequences of F. velutipes against the SignalP 4.1 server (http://www.cbs.dtu.dk/services/SignalP/) [43] revealed 956 genes, including 38 AAs and 172 CAZymes, comprising the signal peptides (Tables S10, S12, and S13). Among the 172 CAZyme genes containing signal peptides, 11 genes (3 PL14, 1 PL3, 1 CE4, 1 CE15, 1 GH12, and 1 GH16, 2 GH43 and 1 GH71) and 3 genes (2 PL1 and 1 CBM) were not expressed at the mycelia stage (or fruit body) and both the fruit body and primordial stages (or mycelia), respectively. In addition, GH71 and CBM13 genes showed the highest expression levels during the fruit body and mycelia stages, respectively. Likewise, among the 38 AAs, one AA9 was not expressed at the mycelia stage. However, five AAs (1 AA1, 1 AA3, 2 AA5 and 1 AA9) showed higher expression levels at all stages than the other AA genes, including 33 AAs comprising signal peptides (Tables S10 and S12). The key step for conversion of lignocellulosic biomass into fermentable sugars is represented by the hydrolysis of polysaccharides, resulting from biomass pretreatment, by cellulases and hemicellulases. Filamentous fungi are the major source of cellulases and hemicellulases. The mechanisms regulating cellulase and hemicellulase genes have been studied in filamentous fungi (mainly in Aspergillus and Trichoderma) [44,45,46], and research on the regulation of cellulase and hemicellulase gene expression may be very useful for increasing production of these enzymes. In the hydrolysis process of cellulolytic residues by T. reesei, cellobiose is produced and accumulates, which inhibits further cellulases production. T. reesei b-glucosidase, which hydrolyzes cellobiose to glucose, has been described as having transglycosylation activity in the presence of insoluble substrates (crystalline cellulose), and is therefore possibly involved in cellulase gene induction [47]. F. velutipes also contained 6 putative b-glucosidase assigned to family GH1 (2) and GH3 (4) ( Table S5). Among the 6 genes, 2 GH3 were gradually increased their expression levels through the developmental stages (from mycelia to fruit body) of F. velutipes. The genomes of both fungi (A. niger and T. reesei) encode many carbohydrate active enzymes (CAZy) which, through synergism, catalyse the deconstruction of cellulose and hemicelluloses, the main polysaccharides of lignocellulose. Regulation of the genes encoding these enzymes is governed by several transcription factors, including the activator XlnR/XYR1 (A. nidulans/T. reesei) and the carbon catabolite repressor CreA/CRE1 (A. nidulans/T. reesei) [48]. Foreman et al. [49] performed investigations on regulation of cellulase and hemicellulase genes expression in T. reesei by microarrays, five of which have been far identified to date: the positive regulators XYR1, ACE2, and the HAP2/3/5 complex; the repressor ACE1; and the carbon catabolite repressor CRE1 [47]. In addition, it has been reported that creA, creB, and creC genes products are involved in the regulatory mechanism of carbon catabolite repression in Aspergillus spp. [50,51,52,53,54]. F. velutipes was found to possess 681 putative transcription factors, including 7 homologs for regulation of (hemi)cellulolytic genes expression (1 HAP3, 3 HAP5, 1 PacC, 1 ACE1, and 1 CREC; Table S7).   Alcohol Dehydrogenase Genes F. velutipes has been reported to produce ethanol from complex substrates like whole crop sorghums and high-cellulose substrates as well as from defined monosugars, disugars, and oligosaccharides with high theoretical recovery rates up to 88% (similar to S. cerevisiae) [6,55]. The F. velutipes genome indeed showed a strong potential for alcohol conversion, encoding 58 homologs to alcohol dehydrogenase genes covering a broad range of possible substrates (Table S14). Expression analysis showed that a large proportion of those enzymes were active in the mycelium. This, together with preferred degradation of lignin followed by (hemi)cellulose, makes F. velutipes a valuable candidate for CBP, allowing the combination of lignocellulose conversion to polysaccharides, hydrolysis of polysaccharides to sugars, and fermentation of (hexose and pentose) sugars to alcohol using a single organism. The identified alcohol dehydrogenases can now be easily characterized.

Mushroom Formation
Mushroom formation was proposed to start with the formation of small aggregates from mycelium, formation of primordia, and outgrowth to mature fruiting bodies [56]. Studies on the development of F. velutipes have led to the identification of multiple environmental factors that trigger the formation of aggregates and primordia and influence the subsequent development of the mushroom. Comparative studies between different developmental steps, tissues, or environmental conditions on protein and gene expression [30,57,58], as well as studies on individual genes [59,60] have provided several insights on mushroom development, but have not elucidated their complex global mechanisms and interactions. The genome sequence of F. velutipes revealed a series of genes associated with mushroom formation. Mating type genes [61], new hydrophobins (in addition to FvHyd1) [59,60] (Table  S15), and fruiting body-specific genes (FDS, FVFD16, and FVFD30), with 1, 5, and 3 putative homologs, respectively, were identified, and RNA-seq analysis suggested stage-or tissuerestricted expression for some of these genes ( Figure 4, Table  S15, and Figure S2). Interestingly, only 3 mitochondrial genes (ATP9, SSU, and LSU) were strongly expressed in F. velutipes, with ATP9 lacking strong expression in the mycelium, suggesting that mitochondrial genes could be independently regulated in coordination with certain stages (Table S8).

Mitochondrial DNA
Early during assembly, an additional scaffold harboring high AT content (.70%) was acquired. This scaffold was identified to be the mitochondrial DNA sequence (Figure 5), which was recently published for this strain (KACC42780 is 4019-20 as used in Yoon et al. [62]). Since mitochondrial DNA experiences a faster rate of mutation than nuclear DNA [63], mitochondrial sequences are of interest for strain identification and barcoding [64]. The sequence reported here could serve as a reference for F. velutipes strains. As expected, analysis showed little difference between our mtDNA (accession no. JF799107, 88,395 bp, and 83.46% AT content) and the previously published sequence (88,508 bp and 83.50% AT content). The same genes and tRNAs (26) were identified in a similar order, although 18 putative ORFs were predicted in our scaffold, while only 16 were reported by in Yoon et al. [62], likely resulting from differences in the interpretation of ORFs. Cytochrome b (COB) and COX1 are always present in mitochondrial genomes [65], whereas most other genes are either in the mitochondrial DNA or in the organism's genome. ATP9 has been reported to be present in both the mitochondrial and fungal genome of several species [66], but has clearly remained unique to the mitochondrial DNA in F. velutipes. As reported by Yoon et al. [62], the mtDNA of F. velutipes has a relatively low GC content (16.54% in our scaffold) in comparison to other basidiomycetes (21%-31%) [67,68], and no origin of replication (ORI) could be detected. Being the first reported mtDNA within the Physalacriaceae, it will be interesting to see if these features are characteristic for this group or specific for F. velutipes. The absence of an ORI in mtDNA has also been reported in Trametes cingulata [69].

Repeat Contents
Repeat sequences are thought to be important for generating genetic diversity and a consequence of evolution. RepeatMasker (http://repeatmasker.org) and Censor [70] were used to determine the copy numbers and chromosomal distributions of repetitive DNA of the F. velutipes genome. When we analyzed the repetitive elements by using RepeatMasker, various multicopy repetitive DNA regions (total of 93.5 kb) were identified in the F. velutipes genome. However, relatively low repeat DNA regions (0.263% of the genome: simple repeat, 0.1%; short interspersed nucleotide elements [SINEs], 0.003%; long interspersed nucleotide elements [LINEs], 0.011%; low complexity sequences, 0.085%; unclassified, 0.001%; LTRs, 0.001%, small RNAs, 0.034%) were found in the 35.6-Mb assembled genome of F. velutipes (Table S16).
Since retrotransposons are able to transcribe themselves via reverse transcription, they are often the major contributors to the repetitive contents in the genome [71]. The assembled genome of F. velutipes consisted of 10% (3,567,850 bp) repeat content, with the largest proportion comprising LTR retrotransposons (18%; 640,707 bp), including LTR Copia (4.29%; 153,361 bp) and Gypsy transposons (11.8%; 423,698 bp; Table S17).

Discussion
F. velutipes is consumed for its culinary qualities and is also known as a wood rotting fungus that can degrade lignocellulose. Recently, F. velutipes was found to be a good producer of ethanol. Thus, F. velutipes is a highly attractive model/source for studying enzymes capable of efficient degradation of lignocellulosic biomass for bioethanol production. Here, we reported the sequencing of the F. velutipes genome, contributing valuable information for the development of genetic tools for this important organism and stimulating overall research on mushrooms and their various applications. Gene modeling revealed that the total number of genes in F. velutipes was comparable to that of other sequenced basidiomycetes with similar genome sizes ( Table 2). However, the average intron length of F. velutipes was relatively larger (180.4 bp) than those of other sequenced species (75-127 bp). We also performed gene prediction using the AUGUSTUS tool [72] with default parameters trained in C. cinerea (Table S1). When we used the AUGUSTUS tool for gene prediction, the average intron length of F. velutipes was similar (91.5 bp) to those of other sequenced species. In addition, gene prediction using the AUGUSTUS tool revealed the total number of genes in F. velutipes was relatively fewer (10,645, excluding 5,032 isoforms) than that of other sequenced basidiomycetes of similar genome size (Tables 2 and S1). This is why we used several methods for gene prediction, including ab initio gene structure prediction (Fgenesh), a homology-based approach (Fgenesh+), and transcriptome-based gene identification (Cufflinks). However, further studies are needed for evaluating the differences in average intron length between F. velutipes and other sequenced fugal species.
Wood exhibits high resistance to chemical and biological degradation due to lignin [26], and wood degradation by whiterot fungi usually starts with the depolymerization of this compound. This results in highly reactive lignin radicals that in turn attack any nearby wood polymers, causing further degradation via diverse reactions [8,73]. The removal of the lignin matrix renders cellulose and hemicellulose compounds susceptible to enzymes classified as carbohydrate active enzymes (CAZymes) [74] and facilitates the further degradation of wood. F. velutipes has been indicated to follow a pattern of sequential wood degradation [41], implying that most lignin is degraded prior to degradation of other wood components, in contrast to simultaneous degradation [74]. The F. velutipes genome and NGS-based RNA-Seq revealed a vast array of genes associated with lignin and carbohydrate degradation common to white rot fungi, which helped us characterize its substrate metabolism in more detail. In addition, expression analysis revealed that several genes, annotated as AAs, were upregulated. Interestingly, the expression levels of these genes were specific to each developmental stage of F. velutipes. Notably, we observed substantial differences in the total number of genes between F. velutipes and the close model fungus S. commune, especially in lignin degrading enzymes. The newly uncovered wood degrading capacity of F. velutipes offers interesting possibilities for more detailed studies on either lignin or (hemi-) cellulose degradation in complex wood substrates. The mutual interest in wood degradation by the mushroom industry and (ligno-) cellulose biomass-related industries further increases the significance of F. velutipes as a new model. Consequently, this approach allows the identification of common and specific genes of the F. velutipes genome during development. Furthermore, the analysis of genes specifically expressed at 3 different developmental stages would help to understand the mechanisms of fruiting at the molecular level and could improve artificial cultivation of mushrooms of various kinds of commercially important basidiomycetes.
According to a previous study [6,75], F. velutipes efficiently converts D-glucose to ethanol with a theoretical recovery rate of 88% (similar to the ability of S. cerevisiae). In addition, F. velutipes is able to convert not only sucrose, but also maltose, cellobiose, cellotriose, and cellotetraose to ethanol, with almost the same recovery rate as that of D-glucose. Interestingly, the highest average expression level of 58 alcohol dehydrogenase genes was observed during the mycelia stage. Thus, the expression levels of these genes were 2 and 3 times more than those observed in the fruit body and at the primordial stage, respectively. Alcohol dehydrogenases (ADHs) constitute a large family of enzymes responsible for the reversible oxidation of alcohols to aldehydes with the concomitant reduction of NAD + or NADP + . These enzymes have been identified not only in yeasts, but also in several other eukaryotes and even prokaryotes. The ADHs of S. cerevisiae have been studied intensively for over half a century [76]. The genes encoding classical ADHs in S. cerevisiae include ADH1, ADH2, ADH3, ADH4, and ADH5 [77,78,79,80]. Other genes include SFA1, BDH1 [81], BDH2 (YAL061W), SOR1 (YAL246C), and the recently proposed ADH6 (YMR218C) and ADH7 (YCR105W) [81,82,83,84,85]. F. velutipes was found to possess five alcohol dehydrogenase genes, which showed sequence similarity to ADHs of S. ceraviae (Table S14). Interestingly, all of these homologs showed high expression levels during all stages and all tissues; thus, we focused on these candidate genes for bioethanol production. Our results demonstrating the extensive range of alcohol converting enzymes in the F. velutipes genome support the strong potential of this wood degrader for (consolidated) bioethanol production. Moreover, the highest expression level of alcohol dehydrogenase genes at the mycelial stage suggested that bioethanol could be efficiently produced by mycelia with low-cost biomass processing.
One of the major difficulties for the improvement of production methods is a poor understanding of mushroom formation. Very few genes have been identified that control initiation, succeeding stages, and proliferation. Mating type genes are of the few genes that are fundamental in mushroom formation. Often referred to as master regulatory genes, they control the establishment of the dikaryotic stage [56]. In tetrapolar species like F. velutipes, the A locus encodes homeodomain proteins that regulate clamp cell formation, synchronized nuclear division, and septation, while the B locus encodes pheromone receptors and pheromones that direct clamp cell fusion [86]. More importantly, mushroom formation can only be induced after both the A and B pathways have been activated. In our previous report [61], the mat A locus was part of a ,350 kb region that was highly syntenic within Agaricales and contained conserved recombination hot spots linked to separation events of the matA subloci in F. velutipes and other fungi. The homeodomain 2 genes FvHD2-1 and FvHD2-2 were specific for the matA3 locus, whereas the homeodomain 1 gene FvHD1-1 was detected in matA3 and matA4 mating types. EST libraries of monokaryotic and dikaryotic mycelia (unpublished data) of strains KACC42780 (matA3matB3), KACC43777 (matA4matB4), and KACC43778 (dikaryon, matA3A4matB3B4) confirmed these respective linkages.
The cell walls of filamentous fungi have been investigated in several species of basidiomycota, and studies have reported that the major components of the cell wall are chitin and b-1,3-glucan with b-1,6-linked branches [87,88,89,90]. During the filamentous fungal life cycle, the cell walls are synthesized, re-orientated, and lysed [91,92]. Cell wall lysis and changes in the constituent polysaccharides are essential processes during fruiting body development in basidiomycota and in autolysis of pileus of fruiting bodies. Yamada et al. [93] reported that Fv-pda, a gene coding for chitin deacetylase (CDA, deacetylates chitin to chitosan in the fungal cell wall) is specifically expressed during fruiting body development of F. velutipes. In addition, RT-PCR revealed that the transcript could be detected from all tissues, including the pileus and stipe, but the transcript levels were higher in stipes than in the pileus. The findings of the current study are consistent with those of Yamada et al. [93]. F. velutipes was found to possess five putative CDA genes, including one homolog, which showed sequence similarity to Fv-pda (Table S5). RNA-seq revealed that one CDA gene (homolog to Fv-pda) was expressed at all stages and tissues, but the expression level was higher in the stipe than in the pileus. In addition, the other CDA genes were also specifically expressed at different tissues and stages of F. velutipes (Table S3). These results are consistent with a previous study by Yamada et al. [93] and suggested that CDA would play an important role in the formation of fungal cell walls of fruiting bodies through conversion of chitin to partially deacetylated chitosan.
In conclusion, this study aims to advance understanding of the fundamental and cellular processes of mushroom-forming basidiomycete fungi for commercial production and industrial use. Although the complexity of the respective culture media indicates a possible correlation between complexity and the number of expressed genes (high to low: F. velutipes, MCM and wheatbransawdust; L. bicolor, outside in association with trees; A. bisporus, compost; S. commune, defined minimal medium) there is currently no clear explanation for the exceptionally high expression levels in F. velutipes. This, combined with the knowledge that F. velutipes responds to many environmental stimuli that are present (in reduced) combinations in other mushrooms, makes F. velutipes an effective model for studies on mushroom developmental processes and sensory pathways.