Composition and Expression of Genes Encoding Carbohydrate-Active Enzymes in the Straw-Degrading Mushroom Volvariella volvacea

Volvariella volvacea is one of a few commercial cultivated mushrooms mainly using straw as carbon source. In this study, the genome of V. volcacea was sequenced and assembled. A total of 285 genes encoding carbohydrate-active enzymes (CAZymes) in V. volvacea were identified and annotated. Among 15 fungi with sequenced genomes, V. volvacea ranks seventh in the number of genes encoding CAZymes. In addition, the composition of glycoside hydrolases in V. volcacea is dramatically different from other basidiomycetes: it is particularly rich in members of the glycoside hydrolase families GH10 (hemicellulose degradation) and GH43 (hemicellulose and pectin degradation), and the lyase families PL1, PL3 and PL4 (pectin degradation) but lacks families GH5b, GH11, GH26, GH62, GH93, GH115, GH105, GH9, GH53, GH32, GH74 and CE12. Analysis of genome-wide gene expression profiles of 3 strains using 3′-tag digital gene expression (DGE) reveals that 239 CAZyme genes were expressed even in potato destrose broth medium. Our data also showed that the formation of a heterokaryotic strain could dramatically increase the expression of a number of genes which were poorly expressed in its parental homokaryotic strains.


Introduction
Volvariella volvacea (Bull.: Fr.) Singer, commonly known as the straw mushroom and the Chinese mushroom, is an important, edible, straw-degrading basidomycete fungus of the tropics and subtropics. The annual yield of the fungus was about 437,200 tons in 2008 [1]. The cultivation of V. volvacea uses about 4 million tons of straw per year in China, accounting for approximately 2% of the annual straw yield in China and its cultivation residues are a source of organic fertilizer with high quality for crops. Despite the economic importance of this fungus, relatively little is known about how it degrades straw and obtains nutrients, and how many enzymes take part in lignocellulose degradation. A clear grasp of the mechanism of lignocellulose degradation is important for breeding for increased straw degradation efficiency and biological efficiency, and generally to expand their immense biotechnological potential. These objectives have given impetus to the quest to sequence its genome and to characterize its gene expression. Although the number of fungal genome-sequencing projects has dramatically increased over the last few years, there is a surprising lack of information of mushroom genomes. Genome sequence information of mushrooms will help to increase the understanding of the biology, evolution, and biomedical implications of the entire fungal kingdom.
Carbohydrate-active enzymes (CAZymes), especially cellulases and hemicellulases, are involved in the hydrolysis of plant cell wall polysaccharides, and play an important role in substrate degradation processes. In recent years, the CAZymes of several biomassdegrading fungi, such as Trichoderma reesei [2], Schizophyllum commune [3], Ganderma lucidum [4], have been reported with their genome sequences. We recently obtained the genome sequence and transcriptome of V. volvacea using next-generation high-throughput sequencing technology. Prior to our study, other studies concerning cellulases,hemicellulases, b-Glucosidase and laccase of V. volvacea were published [5][6][7][8][9][10]. However, analysis of CAZymes on the basis of whole genome sequence has not been carried out on V. volvacea.
In this study, we annotated CAZymes encoding genes of this fungus based on de novo sequencing and assembly of the genome of V. volvacea. By comparison with 14 other filamentous fungi, we discovered some features in the composition of CAZymes in this fungus. Using the 39-tag digital gene expression (DGE) [11], we compared the gene expression profiles among 2 homokaryotic V. volvacea strains PYd15 and PYd21, and one heterokaryotic strain H1521, which is a hybrid strain of PYd15 and PYd21. DGE allows millions of short RNAs and differentially expressed genes to be identified in a sample without the need for prior annotations. DGE has many advantages including greater sequencing depth, de-tection of unknown transcripts, practical implementation of digital tags, generation of absolute rather than relative gene expression measurements, detection of high levels of differential polyadenylation and detection of low-abundance transcripts and small changes in gene expression, that makes it particularly attractive for measuring mRNA expression and for identifying differentially expressed genes [12,13].

Strains and Culture Conditions
Strains used in this study include 2 homokaryotic strains PYd21 and PYd15 and one heterokaryotic strain H1521 which was generated by a cross between PYd21 and PYd15. All V. volvacea strains used in this study were kindly supplied by the Mycological Research Center, Fujian Agriculture and Forestry University, Fujian, China.

Genome Sequencing and Assembly
DNA of strain PYd21 was used for genome-wide de novo sequencing. Genomic DNA was isolated using a modified CTAB (cetyltrimethylammonium bromide) approach [14] and sequenced on the Illumina Cluster Station and Illumina GAII platform at BGI-Shenzhen (Shenzhen, China). Three libraries of DNA fragments with insert sizes of 500 bp, 2000 bp and 5000 bp were generated and all of the reads produced from the three libraries were assembled using the SOAPdenovo [15] assembler (http:// soap.genomics.org.cn/). Sample preparation and analytical processing (e.g., base calling) were performed by BGI-Shenzhen (http://www.genomics.cn).

DGE Library Construction and Sequencing
Mycelia from strains H1521, PYd15 and PYd21 grown in potato dextrose broth were collected. Total RNA was extracted from mycelia using pBIOZOL Plant Total RNA Extraction Reagent, according to the manufacturer's protocol (BioFlux). The isolated RNA was treated with RNeasy plant mini kit to remove potential genomic DNA contamination, according to the manufacturer's protocol (QIAGEN, Germany). RNA integrity and concentration were evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). RNA (6 mg) from each of the three strains was submitted to BGI-Shenzhen (Shenzhen, China) for library construction and sequencing. Sequence tag preparation was carried out using the Illumina Gene Expression Sample Prep Kit according to the manufacturer's protocol. The mRNAs were separately isolated with oligo(dT) ligated beads, and were then reverse transcribed into double-stranded cDNA. The ds cDNAs were digested by the restriction enzymes NlaIII and MmeI. After ligation with sequencing adapters, the molecules were amplified by PCR. The three mRNA tag libraries were sequenced on the Illumina Cluster Station and Illumina HiSeq TM 2000. Image recognition and base calling were performed using Illumina Pipeline.

Nonredundant Gene Set Preparation and Illumina DGE Tag Annotation
After filtering out tags with low quality (containing Ns) and adaptor sequences, DGE tags were mapped to the predicted genes, during which one base pair of mismatch was allowed. Ambiguous tags with multihits were excluded. The expression abundance of transcripts was measured by the number of unambiguous clean tags for each gene and then normalized to TPM (number of transcripts per million clean tags) [13,16].

Detection of Differentially Expressed Genes
To identify global transcriptional changes between different samples, a previously described method [24], with some modifications by BGI-Shenzhen company [25], was utilized to identify differentially expressed genes from normalized DGE data via pairwise comparisons between differential strains (PYd15 vs H1521, PYd21 vs H1521, and PYd15 vs PYd21). Genes were defined as significantly differentially expressed if they had a Pvalue ,0.005, a false discovery rate (FDR)#0.001 [25], and an estimated absolute log2-fold change .2 in sequence counts across libraries. To increase the reliability, genes with Raw Intensity#40 in both of two compared strains were filtered off.

Formation of Heterokaryon can Rescue Growth Defects in Two Homokaryotic Strains
Like other basidiomyces fungi, vegative mycelia of V. volvacea can grow in either homokaryotic or heterokaryotic form [26]. PY-1 is a widely cultivated V. volvacea strain in Fujian Province of China. To see the phenotypes of its homokaryotic progenies, 38 single basidiospores were isolated from PY-1. Twenty five of them displayed abnormality in colony formation (Table S1). Two single spore isolates (PYd15 and PYd21) with growth defects were chosen to cross and generated a heterokaryotic strain H1521. The growth rates of PYd15, PYd21 and the heterokaryotic strain H1521 were 0.7060.05, 0.4360.01 and 3.1160.16 cm/d on PDA medium, respectively. In addition, the heterokaryotic strain H1521 had much more aerial hyphae than the two homokaryotic strains ( Figure 1).

Analysis of Genome Sequence Identified 285 CAZyme Genes in V. volvacea
To establish a basis for understanding molecular mechanisms that regulate V. volvacea growth and development, the genome of the homokaryotic strain PYd21 was sequenced by whole genome shotgun strategy and yielded 3,420 Mb clean data. A draft genome sequence of 37.2-megabase with 4.13% repeat content was assembled and deposited at DDBJ/EMBL/GenBank(http:// www.ncbi.nlm.nih.gov/) under the accession no. ANCH00000000. The assembly is contained on 302 scaffolds with N50 of 499,697 base pairs (bp). A total of 11,534 ORFs were predicted using the software Eukaryotic GeneMark-ES(ver-sion2.3).
By searching the predicted amino acid sequences of V. volvacea genes against the Carbohydrate-Active Enzyme database (CAZy), we identified totally 285 CAZymes encoded by the genome of V. volvacea, including 191 glycoside hydrolases, 44 glycosyltransferases, 19 polysaccharide lyases and 31 carbohydrate esterases (the CAZymes were listed in Table S2; the predicted amino acid sequences of CAZymes are presented in Data S1).

Transcriptomic Profiles of Two Homokaryotic Strains with Growth Defect can be Largely Alterred by Formation of Heterokaryon
To gain an insight into mechanisms of differences in growth and development among different strains, DGE libraries of the vegetative mycelia of two homokaryotic strains PYd21 and PYd15, and their hybrid strain H1521, were constructed. The major characteristics of these libraries are summarized in Table 2. An average of 5.9 million sequence tags per library was obtained, with 182,937 distinct tag sequences. Prior to mapping these tag sequences to reference sequences, adaptor tags, low quality tags and tags of copy number = 1 were filtered out, producing an average of 5.8 million clean sequence tags per library, with 92,883 distinct clean tag sequences. By mapping DGE tags to the genome, 57,011 of 57,869 tags were found to be unambiguous tags. We filtered the clean tags mapped to multiple genes. The remaining clean tags were designated as unambiguous clean tags. The number of unambiguous clean tags for each gene was calculated as the gene expression value and then normalized to the number of transcripts per million clean tags (TPM) [13,2]. The raw sequencing data of digital gene expression (DGE) of mycelia were submitted to Gene Expression Omnibus (GEO) database with As shown in Figure 3, the majority of genes were either expressed in all three strains (6,188 genes) or not expressed in any of them (2747 genes). Of the 11,534 reference genes, 76.18% were expressed in at least one strain.
Transcriptomic profiles between two homokaryotic strains were dramatically different. As shown in Table S4, transcriptional levels of 470 genes in PYd15 were at least 3 times higher than in PYd21 while transcriptional levels of 422 genes in PYd15 were at least 3 times lower than in PYd21. Remarkablly, many genes with very high transcriptional levels in one homokaryotic strain had no detected transcript in another homokaryotic strain. Because V. volvacea is a heterothallic fungus and PYd21 and PYd15 have different homeodomain transcription factor genes at mating-type loci (unpublished data), different expression patterns in some genes might be attributed to different transcriptional regulation by mating types. In addition, since 35,389 SNP sites present between PYd15 and PYd21 genomes (unpublished data), it is not surprising that some of these SNP sites might also result in differences in gene expression.
The transcriptomic profile of the heterokaryotic strain H1521 was also significantly different from its parental homokaryotic strains. As shown in Table S5, transcriptional levels of 549 genes in H1521 were at least 3 times higher than the homokaryotic strain PYd21 while transcriptional levels of 366 genes in the heterokaryotic strain H1521 were at least 3 times lower than PYd21. Transcriptional levels of 200 genes in H1521 were at least 3 times higher than the homokaryotic strain PYd15 while transcriptional levels of 142 genes in H1521 were at least 3 times lower than PYd21 (Table S6). To be noted, transcriptional levels for 113 genes in H1521 displayed at least 3 times higher than in both homokaryotic strains. Many genes had no detectable expression in both homokaryotic strain but displayed high expression in the heterokaryotic strain (see Table 3 for the representative data), indicating the formation of heterokaryon can promote the expression of a number of genes. In contrast, 18 genes in both homokaryotic strains showed at least 3 times higher than the heterokaryotic strain H1521, suggesting the formation of heterokaryon can suppress the expression of these genes. Moreover, for 470 genes with lower transcriptional levels in PYd21 than PYd15, transcriptional levels for 266 of them in the heterokaryotic strain H1521was at least 3 times higher than those in PYd21. For 422 genes with lower transcriptional levels in PYd15 than PYd21, transcriptional levels for 53 of them in H1521 was at least 3 times higher than those in PYd15. For many genes, which had high expression in one homokaryotic strain but had no expression in another homokaryotic strain, their expression can be detected in the heterokaryotic strain (see Table 3 for the representative data). All these indicate the formation of heterokaryotic strain can alter the gene expression in a genome-wide scale, which might be the basis of the growth difference between the homokaryotic strains and the heterokaryotic strain.
For CAZyme genes, 239 among 285 predicted genes were expressed in at least one strain (Table S7). Of these genes, 164 were expressed in all the three strains. As shown in Table S8, 15 CAZyme genes in PYd15 displayed higher transcription levels than in PYd21 while 17 CAZyme genes in PYd15 displayed lower transcription levels than in PYd21. In general, the transcriptomic profile of CAZyme genes in the heterokaryotic strain H1521 was similar to that of PYd15. Only 3 CAZyme genes showed the transcriptional differences over 3 times between H1521 and PYd15 (Table S9). In contrast, 17 CAZyme genes in H1521 displayed higher transcriptional levels than in PYd21 and 12 CAZyme genes in H1521 displayed lower transcriptional levels than in PYd21 (Table S10). Thus, transcriptional difference in CAZyme genes might not the cause to the growth difference between the heterokaryotic strain and the homokaryotic strains in PDA medium. As shown above, in addition to CAZyme genes, transcriptional levels of many other genes in the homokaryotic strains were different from those in the heterokaryotic strain. Transcriptional abnormalities in some genes might cause growth defect in the homokaryotic strains. Since formation of heterokaryon can rescue growth defects in two homokaryotic strains, their growth defects should have different genetic basis. When the homokaryon is formed, transcriptional abnormalities in one homokaryotic strain might be complemented by the other, making the heterokaryotic strain growth much better than the homokaryotic strains.

Discussion
Based on the genome sequence, we generated a relatively complete list of CAZyme enzymes of the straw mushroom V. volvacea and discovered some features of this fungus in the composition of CAZymes. In addition, we showed that the transcriptomic profile of a heterokaryotic strain was completely different from those of its parental homokaryotic strains. Thus, this study is not only important for better understanding of biomass degradation of fungi but also provide an important database useful for genetic modification of the fungus in the future. V. volvacea has strong activity in straw degradation but is weak for degrading the lignin component of lignocelluloses [27]. Genome sequencing and annotation of CAZymes demonstrate that, V. volvacea has 285 CAZymes, ranking the seventh in the number of CAZymes among 15 biomass-degrading fungi with sequenced genomes. In previous reports [3,28,29,30,31], V. volvacea, S. commune, P. chrysosporium, and C. cinerea were classified as white rot fungi, while P. placenta, L. bicolor, C. neoformans, U. maydis were known as brown rot fungi. The double clustering of the carbohydrate-cleaving families of these 8 basidiomycete genomes demonstrated that V. volvacea has a close relationship with S. commune, P. chrysosporium, and C. cinerea, which provide a molecular basis to support the previous classification in white rot fungi and brown rot fungi. Among 285 predicted CAZymes, transcripts for 239 of them were detected even in potato dextrose broth medium, which does not contain the best carbon source for expression of most of CAZymes. All these indicate that V. volvacea is genetically well equipped for biomass degradation and most of CAZyme genes are actively expressed. For many CAZyme genes, expression may be suppressed by dextrose but induced by polysaccarides. Therefore, if provided with straw-based carbon sources, more CAZyme genes should be expressed and many expressed CAZyme genes should have higher expression levels. In addition, the composition of glycoside hydrolases in V. volvacea is different from other basidiomycetes. Among 7 compared basiomycetes, S. commune has the closest phylogenetic relationship with V. volvacea. However, the compositions of glycoside hydrolases between S. commune and V. volvacea are obviously different. This might be a genetic basis of their different nutrient preferences: S. commune commonly grows on wood while V. volvacea grows on straw-based media. The composition of glycoside hydrolases might have close correlation with carbon utilization.
V. volvacea exists in either homokaryon or heterokaryon form and heterokaryotic strains usually grow better than homokaryotic strains but the molecular basis is unknown. Using DGE, we compared genome-wide transcriptional profiles among two homokaryotic strains PYd21 and PYd15 and their hybrid heterokaryotic strain H1521and found that a number of genes were differentially expressed between the homokaryotic strains. Some genes displayed very high expression levels in one homokaryotic strain but had no expression in another homokaryotic strain. However, when two genetic different homokaryotic strains fuse into one heterokaryotic strain, some defects in gene expression in each homokaryotic strain might be complemented by each other. In addition, our data also indicate that, for many genes showing low expression in both homokaryotic strains, the formation of heterokaryotic strain can significantly increase their expression. The strain-dependent gene expression reflects a very interesting regulatory phenomenon: when two genetically different homokaryotic strains are crossed to generate a heterokaryotic strain, the gene expression profile could be completely different from either of its parental strains. This further suggests that hybridization breeding can be a powerful strategy to generate V. volvacea strains with high yield.
To be noted, the DGE data is only based on the mycelia grown in PDA medium. If different carbon sources, such as the strawbased media commonly used for commercial production of this mushroom, are provided, the transcriptional profiles in 3 strains   should be changed. Some important CAZyme genes for straw degradation might have more differences in their expression levels and provide better clues to the linkage between gene expression and growth phenotype. Although the linkage between CAZyme expression and growth remains unknown because the roles of these differentially expressed CAZyme genes in biomass degradation are not clear, it is the first report of genome-wide expression of CAZyme genes in the straw mushroom. Further deep investigation is required to clarify the detailed mechanisms of strain-dependent difference in the growth and gene expression in the future.

Conclusion
Genome sequencing of V. volvacea identified 285 CAZyme genes, ranking the seventh in the number of CAZymes among 15 biomass-degrading fungi with sequenced genome. DGE analysis showed 239 CAZyme genes were expressed. A number of genes, which are poorly expressed in two genetic different homokaryotic strains, can reach to high expression levels in heterokaryotic strain.