The genus Corylus is an important woody species in Northeast China. Its products, hazelnuts, constitute one of the most important raw materials for the pastry and chocolate industry. However, limited genetic research has focused on Corylus because of the lack of genomic resources. The advent of high-throughput sequencing technologies provides a turning point for Corylus research. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive database for the Corylus heterophylla Fisch floral buds.
The C. heterophylla Fisch floral buds transcriptome was sequenced using the Illumina paired-end sequencing technology. We produced 28,930,890 raw reads and assembled them into 82,684 contigs. A total of 40,941 unigenes were identified, among which 30,549 were annotated in the NCBI Non-redundant (Nr) protein database and 18,581 were annotated in the Swiss-Prot database. Of these annotated unigenes, 25,311 and 10,514 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. We could map 17,207 unigenes onto 128 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) database. Additionally, based on the transcriptome, we constructed a candidate cold tolerance gene set of C. heterophylla Fisch floral buds. The expression patterns of selected genes during four stages of cold acclimation suggested that these genes might be involved in different cold responsive stages in C. heterophylla Fisch floral buds.
The transcriptome of C. heterophylla Fisch floral buds was deep sequenced, de novo assembled, and annotated, providing abundant data to better understand the C. heterophylla Fisch floral buds transcriptome. Candidate genes potentially involved in cold tolerance were identified, providing a material basis for future molecular mechanism analysis of C. heterophylla Fisch floral buds tolerant to cold stress.
Citation: Chen X, Zhang J, Liu Q, Guo W, Zhao T, Ma Q, et al. (2014) Transcriptome Sequencing and Identification of Cold Tolerance Genes in Hardy Corylus Species (C. heterophylla Fisch) Floral Buds. PLoS ONE 9(9): e108604. https://doi.org/10.1371/journal.pone.0108604
Editor: Zongbin Cui, Institute of Hydrobiology, Chinese Academy of Sciences, China
Received: June 9, 2014; Accepted: September 1, 2014; Published: September 30, 2014
Copyright: © 2014 Chen et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. The high quality reads produced in this study have been deposited in the National Center for Biotechnology Information (NCBI) SRA database (accession number: SRX529300).
Funding: This work was supported by the Special Project for Scientific Research of Forestry Commonweal Industry of National Forestry Bureau (201304710) to GW and the China Postdoctoral Science Foundation (2014M550104) to JZ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
The genus Corylus consists of deciduous species that occur naturally in temperate forest areas of Europe, the Middle East, Asia, and North America. Currently over 4,000,000 t of nuts are commercially produced throughout the world, of which 700,000 t are hazelnuts. Hazelnuts are important woody species in Northeast China. Their cultivation area is approximately one million hectares, ranking first in the world. As the most widely distributed Corylus plant in China, Corylus heterophylla Fisch accounts for more than 70% of the total output in the domestic market of China. Hazelnuts, due to their organoleptic characteristics, constitute one of the most important raw materials for the pastry and chocolate industry. In addition to providing a desirable flavor to different foods, they play an important role in human nutrition and health due to their protein, oil, vitamin, and mineral content. Hazelnuts are rich in both monounsaturated and polyunsaturated fatty acids, as well as in vitamin E. Corylus species are also important sources of taxol (paclitaxel), which is an effective yet relatively expensive medicine for the treatment of breast, ovarian, and lung cancer.
Cold stress is one of the most severe abiotic stresses and adversely affects plants by causing tissue injury and delayed growth. Cold stress can be classified as chilling (<20°C) and freezing (<0°C) stress. Not all plants are ready to tolerate freezing temperatures at all times. However, many plants become tolerant of freezing temperatures after exposure to non-freezing low temperatures, a phenomenon called cold acclimation (CA). Arabidopsis gains only about 5–7°C of freezing tolerance through cold acclimation, which allows for brief exposures to freezing temperatures, whereas woody perennials can withstand extremely low subzero temperatures for extended periods of time. In addition, overwintering floral buds display both enhanced freezing tolerance and dormancy/relief of dormancy. In this process, various physiological and biochemical changes occur in plant cells, which may confer subsequent acquired chilling and freezing tolerance to plants. The signaling pathways used by plants in responding to cold stress and the key genes for modifying the response are of interest. The best characterized regulon of cold-stress responses in plants contains the transcription factor C-repeat binding factor/dehydration-responsive element-binding protein (CBF/DREB) and its cold-inducible target genes, known as COR (cold-regulated gene), KIN (cold-induced gene), RD (responsive gene to dehydration), or LTI (low-temperature-induced gene). A large number of studies demonstrate that gene expression changes occur in a wide range of plant species in response to cold, but the precise hierarchical organization of the global network has not been defined.
Hazelnuts grow in a wide range of soil types, from acidic mountain soils to basic soils derived from limestone. The plants grow best in a mild, humid climate without extremes of heat or cold. Nevertheless, buds, leaves, catkins, and female flowers are tolerant of frosts down to −7°C, although little is known about the tolerance mechanisms. During the last decades, large amounts of transcriptomic and genomic sequences have become available for many model organisms. For Corylus heterophylla, only 90 nucleotide sequences have been deposited in the GenBank database (as of May 25, 2014). Therefore, extensive genomic or transcriptomic sequences are badly needed for Corylus heterophylla; these can be used for new gene discovery, gene localization, comparative genomics, and so on.
Transcriptome is the complete collection of transcripts in a cell at a specific developmental stage, which provides valuable and comprehensive information on gene expression, gene regulation and amino acid content of proteins . The development of sequencing technology has provided a novel method for the analyses of transcriptome . In plants, RNA-seq has accelerated the investigation of the complexity of gene transcription patterns, functional analyses and gene regulation networks . In the present work, an RNA-seq project for C. heterophylla Fisch was initiated. Four C. heterophylla Fisch floral buds samples, including floral buds in non-cold acclimation (NA), cold acclimation (CA), midwinter (MW), and deacclimation (DA) stages were sequenced using the high-throughput Illumina deep sequencing technique. In addition, we estimated the expression profiles of key genes involved in cold acclimation. The transcriptome sequencing from C. heterophylla Fisch may help improve future genetic and genomic studies on the molecular mechanisms behind the cold tolerance of the C. heterophylla Fisch floral buds.
Results and Discussion
mRNA-seq and de novo transcriptome assembly
To obtain a global overview of the C. heterophylla Fisch floral buds transcriptome, a cDNA library was generated from an equal mixture of RNA isolated from floral buds in the four stages during winter (including NA, CA, MW, and DA), and paired-end sequenced using the Illumina platform. After stringent quality assessment and data filtering, 25,221,054 of 75-bp reads (∼1.85 G) with 97.38% Q20 bases (those with a base quality greater than 20) were selected as high quality reads for further analysis. An overview of the sequencing is presented in Table 1. The high quality reads produced in this study have been deposited in the National Center for Biotechnology Information (NCBI) SRA database (accession number: SRX529300).
Using the Trinity de novo assembly program, next-generation short read sequences were assembled into 82,684 contigs, with an N50 length of 642 bp and a mean length of 325 bp (Table 2). The distribution of contigs is shown in Fig. 1A. In total, there were 6,208 contigs coding for transcripts longer than 1 kb and 1,323 contigs coding for transcripts longer than 2 kb. The contigs were subjected to cluster and assembly analyses. A total of 40,941 unigenes were obtained, among which 9,323 (22.8%) were longer than 1 kb. The length distribution of unigenes is shown in Fig. 1B, revealing that 17,757 unigenes (43.4%) are longer than 500 bp. An overview of the assembled contigs and unigenes is presented in Table 2. These results demonstrate the effectiveness of Illumina sequencing in rapidly capturing a large portion of the transcriptome. As expected for a randomly fragmented transcriptome, there was a positive relationship between the length of a given unigene and the number of reads assembled into it (Fig. 1C).
(A) Length distribution of C. heterophylla Fisch contigs. (B) Size distribution of C. heterophylla Fisch unigenes. (C) Log-log plot showing the dependence of unigene length on the number of reads assembled into each unigene.
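The N50 and mean length statistics reported above for the contigs can be reproduced from a list of sequence lengths. The following is a minimal sketch, assuming the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases); the function names and the toy lengths are illustrative and are not taken from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembled bases is the N50.
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    toy_contigs = [100, 200, 300, 400, 500]  # invented lengths
    print(n50(toy_contigs), mean_length(toy_contigs))
```

Because N50 is weighted by sequence length, it is typically larger than the mean length when the length distribution is skewed toward many short contigs, as is the case for the values reported here (642 bp versus 325 bp).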
To determine protein-coding transcripts, we screened the C. heterophylla Fisch floral buds transcriptome against the NCBI Non-redundant (Nr) peptide database and the Swiss-Prot protein database using BLASTx with a cutoff E-value of 10⁻⁵. Mapping of 30,549 (74.6%) of the unigenes to the Nr library suggests that most of the unigenes can be translated into proteins. Furthermore, 18,581 (45.4%) unigenes had significant matches in the Swiss-Prot database (Table 3). Distribution analysis based on BLASTx searches showed that the unigenes of C. heterophylla Fisch have homologs in numerous plant species (Fig. 2). Among the various plant species that have protein sequence information in GenBank, the unigenes of C. heterophylla Fisch had the highest number of hits to sequences from Amygdalus persica (26.4%), followed by Vitis vinifera (24.6%), Ricinus communis (10.4%), Populus trichocarpa (9.5%), Fragaria vesca (7.6%), Glycine max (6.2%), and Cucumis sativus (5.1%) (Fig. 2). The high similarity of C. heterophylla Fisch unigenes to genes from Amygdalus persica and Vitis vinifera suggests the possibility of using the genome of Amygdalus persica or Vitis vinifera as a reference for identifying different gene expression patterns in mRNA-seq data.
Homology analysis of C. heterophylla Fisch unigenes with multiple species.
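The annotation step described above (keep hits below an E-value cutoff, then report the fraction of unigenes with a hit) can be sketched as follows. This is an illustration only: it assumes the BLAST results are available in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score), which the paper does not state, and the function names and example rows are hypothetical.

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Keep, for each query, the highest-scoring hit at or below the E-value cutoff.

    Assumes the 12-column tabular BLAST layout; this is an assumption,
    not a format reported by the authors.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


def annotated_fraction(best, total_unigenes):
    """Fraction of all unigenes that have at least one significant hit."""
    return len(best) / total_unigenes


if __name__ == "__main__":
    # Invented rows for illustration only.
    rows = [
        "u1\tpA\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-30\t200",
        "u1\tpB\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-40\t250",
        "u2\tpC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30",
    ]
    hits = best_hits(rows)
    print(hits, annotated_fraction(hits, total_unigenes=2))
```

With the counts reported in the text, the same ratio gives 30,549 / 40,941 ≈ 74.6% for Nr and 18,581 / 40,941 ≈ 45.4% for Swiss-Prot.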
Functional classification by GO
To assign functional information to transcripts, Gene Ontology (GO) analysis was carried out. GO provides a dynamic, controlled vocabulary and hierarchical relationships for representing information on biological process, molecular function, and cellular component, allowing a coherent annotation of gene products. Of the 30,549 unigenes annotated in the Nr database, 25,311 were assigned one or more GO terms, with 49.0% for biological process, 40.8% for molecular function, and 10.1% for cellular component (Fig. 3 and Fig. 4A). For biological process, metabolic process (GO:0008152) was the most represented GO term, followed by cellular process (GO:0009987). Regarding molecular function, unigenes with binding activity (GO:0005488) and catalytic activity (GO:0003824) were highly represented. For cellular component, the most represented categories were cell (GO:0005623) and cell part (GO:0044464) (Fig. 3).
Results are summarized for three main GO categories: Biological Process, Molecular Function, and Cellular Component. Detailed information on the GO terms for all unigenes is listed in Table S2.
(A) Summary of GO terms in Biological Process (BP), Molecular Function (MF), and Cellular Component (CC). (B) GO terms related to signal transduction. (C) GO terms related to stress response.
Hardy plants develop essential tolerance for cold survival through multiple levels of biochemical and cell biological changes. These responses are due to reprogramming of gene expression, which results in adjusted metabolic alterations. The first step in switching on such molecular responses is to perceive the stress as it occurs and to relay information about it through a signal transduction pathway. To explore the unigenes that might be involved in signal transduction and stress responses, we compared the GO terms related to signal transduction (Fig. 4B) and stress responses (Fig. 4C). The top 5 represented signal-related GO terms were response to ABA stimulus (GO:0009737, 440 unigenes), response to high light intensity (GO:0009644, 243 unigenes), signal transduction (GO:0007165, 242 unigenes), response to auxin stimulus (GO:0009733, 227 unigenes), and response to ethylene stimulus (GO:0009723, 193 unigenes) (Fig. 4B). The top 5 represented stress-related GO terms were response to salt stress (GO:0009651, 762 unigenes), response to cadmium ion (GO:0046686, 742 unigenes), response to cold (GO:0009409, 427 unigenes), defense response to bacterium (GO:0042742, 365 unigenes), and response to wounding (GO:0009611, 310 unigenes) (Fig. 4C).
Functional classification by COG and KEGG
In addition, all unigenes were subjected to a search against the COG database for functional prediction and classification. Overall, 10,514 of the 40,941 sequences showing a hit with the Nr database could be assigned to COG classifications (Fig. 5). COG-annotated putative proteins were functionally classified into at least 25 protein families involved in cellular structure, biochemical metabolism, molecular processing, signal transduction, and so on (Fig. 5). The cluster for general function prediction (3,296; 31.35%) represented the largest group, followed by transcription (1,741; 16.56%), posttranslational modification, protein turnover, chaperones (1,540; 14.65%), translation, ribosomal structure and biogenesis (1,478; 14.06%), replication, recombination and repair (1,367; 13.00%), signal transduction mechanisms (1,208; 11.49%), carbohydrate transport and metabolism (1,054; 10.02%), amino acid transport and metabolism (740; 7.04%), cell wall/membrane/envelope biogenesis (663; 6.31%), energy production and conversion (636; 6.05%), and cell cycle control, cell division, chromosome partitioning (634; 6.03%), whereas only a few unigenes were assigned to nuclear structure and extracellular structure (7 and 1 unigenes, respectively). In addition, 565 unigenes were assigned to inorganic ion transport and metabolism and 494 unigenes were assigned to intracellular trafficking, secretion and vesicular transport (Fig. 5).
In total, 10,514 sequences were grouped into 25 COG classifications.
To further demonstrate the usefulness of the C. heterophylla Fisch unigenes generated in the present study, we identified the biochemical pathways represented by the unigene collection. Annotations of C. heterophylla Fisch unigenes were fed into the KEGG Pathway Tools, an alternative approach for categorizing gene functions with an emphasis on biochemical pathways. This process predicted a total of 128 pathways represented by a total of 17,207 unigenes. A summary of the sequences involved in these pathways is included in Table S3. The top 3 pathways were plant hormone signal transduction (756 unigenes), RNA transport (727 unigenes), and spliceosome (713 unigenes) (Fig. 6). As shown in Fig. 7, multiple C. heterophylla Fisch unigenes (red rectangles) were involved in the process of spliceosome assembly. Some made up the key components of spliceosome assembly, including U1, U2, U4, U5, and U6. Other unigenes, such as Prp5, UAP56, Prp2, Prp16, Prp17, Prp18, Prp22, Slu7, and Prp43, participate directly in the process of spliceosome assembly. Because C. heterophylla Fisch floral buds undergo cold acclimation during winter and prepare for flower organ differentiation, this result suggests that diverse alternative splicing events may occur in the floral buds, consistent with alternative splicing regulation being a general mechanism that affects complex plant biological processes, including plant development, disease resistance, and stress responses. Altogether, 31,844 (77.8%) unigenes were successfully annotated in the Nr, Swiss-Prot, COG, KEGG, and GO databases, as listed in Table S1.
The top 30 most highly represented pathways are shown. Analysis was performed using Blast2GO and the KEGG database.
Some C. heterophylla Fisch unigenes, such as Prp5, Prp2, Prp16, Prp12, and Prp43, participate directly in the process of spliceosome assembly, while others make up the key components of spliceosome assembly, including U1, U2, U4, U5, and U6.
Candidate genes involved in cold tolerance in C. heterophylla Fisch floral buds
As is known, the response of plants to any environmental signal is mediated by a series of reactions, collectively referred to as signal transduction. After perception of the cold signal, the downstream transcription factors and response genes are reprogrammed, which results in adjusted metabolism. To identify the genes likely to be involved in cold tolerance in C. heterophylla Fisch floral buds, we constructed a candidate cold tolerance gene set based on the GO term representation. A large number of studies indicate that homologs of the selected candidate genes are involved in plant cold tolerance (Table 4). The selected unigene IDs and annotations are listed in Table 4.
To validate the responsiveness of genes in the candidate cold tolerance gene set to cold, we selected ten genes from the set and measured their expression patterns under cold stress and during winter by qRT-PCR. In woody perennials of the temperate zone, cold acclimation is triggered by several environmental cues, not only low temperatures, and is generally considered a two-step process. The first stage is induced by short photoperiod, and the timing and speed of acclimation can be affected by other factors such as available moisture. The second stage is induced by low temperature. To exclude the influence of other factors, we first measured the time course expression patterns of these genes under cold stress (4°C). As shown in Fig. 8A, in C. heterophylla Fisch leaves, NIR and ADF were rapidly induced at 2 h after cold treatment, while bZIP78, CIP, GPAT, and COR413 were dramatically induced at 4 h after cold treatment. On the whole, all ten cold tolerance candidate genes were induced to different degrees in C. heterophylla Fisch leaves exposed to cold stress, which suggests that these genes are responsive to cold stress. We then analyzed their expression patterns during overwintering in C. heterophylla Fisch floral buds. During winter, the C. heterophylla Fisch floral buds undergo four stages: NA, CA, MW, and DA (see methods). The selected genes can be classified into three types according to their expression patterns during the process of cold acclimation (Fig. 8B). (1) Type I: the expression of bZIP78, CIP, NIR, GPAT, Hsp22, COR413, and BAM was induced immediately in the CA stage and was further induced in the following MW stage, indicating that these genes could respond to both early and later cold acclimation. (2) Type II: ERD7 had higher expression in the CA stage than in the MW stage, indicating that ERD7 might be mainly involved in early cold acclimation. (3) Type III: HD-ZIP and ADF were significantly induced in the MW stage but not in the CA stage, implying their involvement in the later response to cold stress.
(A) Time course expression pattern of 10 cold tolerance candidate unigenes under 4°C cold stress (0 h, 2 h, 4 h, 8 h, 24 h) in C. heterophylla Fisch leaves. (B) Expression pattern of 10 cold tolerance candidate unigenes during four stages (NA, CA, MW, and DA) of overwintering in C. heterophylla Fisch floral buds. The gene names and the primers used for qRT-PCR analysis are shown in Table S4. Standard error of the mean for three biological replicates (nested with three technical replicates) is represented by the error bars.
In the type I class, the mRNA levels of a membrane protein (COR413) and a transcription factor (bZIP78) were induced immediately when the C. heterophylla Fisch floral buds entered the CA stage and were further induced in the MW stage. The cold-regulated (COR)413-plasma membrane and COR413-thylakoid membrane groups are potentially targeted to the plasma membrane and thylakoid membrane, respectively. It is known that the plasma membrane is the primary site of freezing injury. As an integral membrane protein, COR413-PM could play a structural role by stabilizing the plasma membrane lipid bilayer. Many studies have shown that bZIP transcription factors play an important role in the ability of plants to withstand various stresses. In soybean plants, GmbZIP44, GmbZIP62, and GmbZIP78 can bind to GLM (GTGAGTCAT), ABRE (CCACGTGG), and PB-like (TGAAAA) elements with differential affinity. They may function in ABA signaling through upregulation of ABI1 and ABI2 and play roles in salt and freezing tolerance through regulation of various stress-responsive genes. The early response of these transcription factors could rapidly enhance the expression of a series of downstream stress-responsive genes and thereby enhance the cold tolerance of C. heterophylla Fisch floral buds. The type II class gene ERD7 was induced mainly in the CA stage but not in the MW stage. Early responsive to dehydration (ERD) genes are defined as genes that are rapidly activated during drought stress. ERD15 from Arabidopsis has been functionally characterized as a common regulator of the abscisic acid (ABA) response and the salicylic acid (SA)-dependent defense pathway. The intense induction of ERD7 in the CA stage implies that hormone signaling is involved in early cold acclimation in C. heterophylla Fisch floral buds. In the type III class, the actin depolymerizing factors (ADF) are part of the ADF/cofilin group, a family of small proteins (15–22 kD) that includes cofilin, destrin, depactin, and actophorin. The members of this family can be described as stimulus-responsive modulators of actin cytoskeleton dynamics in the cell. Using Arabidopsis ADF1, Carlier et al. have suggested that one of the main functions of ADF is to increase the turnover rate of actin filaments. In wheat, the induction of an active ADF during cold acclimation and its correlation with increased freezing tolerance suggest that the protein may be required for the cytoskeletal rearrangements that may occur upon low temperature exposure. Moreover, the kinetics of TaADF protein expression differ from the mRNA expression pattern. During the acclimation period, the TaADF mRNA level is maximal after 1 or 2 days and slowly decreases afterward, whereas protein accumulation increases and peaks at 49 days in the hardy cultivars. In this study, the expression of ADF was induced in the MW stage in C. heterophylla Fisch floral buds, suggesting that important changes in the actin cytoskeletal architecture may occur during later cold acclimation.
In this study, de novo transcriptome sequencing of the C. heterophylla Fisch floral buds using Illumina platform was performed for the first time. 28,930,890 raw reads were de novo assembled into 40,941 unigenes. All unigenes were then evaluated and functionally annotated by comparing with the existing protein databases, such as NCBI Nr database, Swiss-Prot database, COG database, and KEGG database. A large number of candidate genes potentially involved in growth, development, and stress tolerance were identified, and are worthy of further investigation. To our knowledge, this is the first application of Illumina paired-end sequencing technology to investigate the transcriptome of C. heterophylla Fisch floral buds and moreover the assembly of the reads was conducted without reference genome information. The database will improve our understanding of the molecular mechanism of cold tolerance in C. heterophylla Fisch floral buds. This resource should lay an important foundation for future genetic or genomic studies on Corylus genus.
Materials and Methods
Plant materials and RNA extraction
C. heterophylla Fisch was obtained from Mulan Paddock, Hebei, China (116°32′–118°14′E, 41°35′–42°40′N). Plant sample collection was permitted by the State Forestry Administration and the Forestry Bureau of Hebei province. For the field experiment, buds were collected starting in the fall of 2011 and through the winter of 2011–2012. Buds collected at the first time point in the fall (on Sep 29, 2011) were used as the NA control. They had received 0 chill units (number of hours between 0 and 7°C). For subsequent time points, buds were collected later in the fall when they had accumulated 198 chill units (CA stage, Nov 2, 2011), during midwinter when they had accumulated 682 chill units and had reached a maximum bud cold hardiness level of −29°C (MW stage, Dec 29, 2011), and during spring when they had accumulated 1,056 chill units and had partially deacclimated to a bud cold hardiness level of 15°C (DA stage, Apr 24, 2012). Floral buds were dissected from the hazelnut plants, immediately frozen, and stored in liquid nitrogen until use.
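The chill unit definition used above (number of hours between 0 and 7°C) amounts to a simple count over an hourly temperature record. The sketch below assumes one temperature reading per hour and treats both bounds as inclusive; the paper does not specify how boundary values were handled, and the function name and sample readings are illustrative.

```python
def chill_units(hourly_temps_c, low=0.0, high=7.0):
    """Count the hours whose temperature lies between `low` and `high` (°C).

    Assumes one reading per hour and inclusive bounds; both are
    assumptions made for this illustration.
    """
    return sum(1 for t in hourly_temps_c if low <= t <= high)


if __name__ == "__main__":
    # Invented hourly readings in °C.
    readings = [-2.0, 0.0, 3.5, 7.0, 7.1, 12.0]
    print(chill_units(readings))
```

Accumulating this count from the first sampling date onward would give the running totals that define the CA, MW, and DA sampling points.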
For the short-term cold treatment, the seedlings were grown in a growth chamber at 25°C/21°C (day/night) with a 16 h light/8 h dark cycle. For exposure to cold stress, four-week-old seedlings were transferred to a cold chamber at 4°C. The leaves were collected at 0, 2, 4, 8, and 24 h of cold stress. All of the experiments were repeated at least three times.
Total RNA was extracted from floral buds using the RNeasy Plant Mini kit (Qiagen, Valencia, CA, USA). DNA contamination was removed with RNase-free DNase I (Qiagen). RNA was concentrated and purified with an RNA MinElute kit (Qiagen). RNA quality and quantity were assessed by absorption at 260 nm/280 nm, gel electrophoresis, and via the Agilent 2100 Bioanalyzer (Agilent Technologies, USA).
mRNA-seq library construction for Illumina sequencing
The mRNA-seq library was constructed following the manufacturer's instructions for the mRNA-seq Sample Preparation Kit (Cat# RS-930-1001, Illumina Inc, San Diego, CA). Briefly, mRNA was purified from 20 µg of total RNA using oligo (dT) magnetic beads. Following purification, the mRNA was fragmented into small pieces using divalent cations under elevated temperature. Taking these short fragments as templates, first-strand cDNA was synthesized using reverse transcriptase and random primers. Second-strand cDNA was then synthesized using DNA polymerase I and RNase H. After purification with the QiaQuick PCR extraction kit, the short fragments were ligated to sequencing adapters, which were used to distinguish different sequencing samples. Fragments with lengths ranging from 200 to 500 bp were then separated by agarose gel electrophoresis and selected for PCR amplification as sequencing templates. The final cDNA library was sequenced using the Illumina GAIIx system according to the manufacturer's protocols, with a paired-end (PE) read length of 75 bp. The transcriptome datasets are available at the NCBI Sequence Read Archive (SRA) under the accession number SRX529300.
Sequence data analysis and de novo assembly
The raw reads were cleaned by removing adaptor sequences, empty reads and low quality sequences, which included the reads with N percentage (i.e., the percentage of nucleotides in read which could not be sequenced) over 10% and ones containing more than 50% nucleotides in read with Q-value≤5. Transcriptome de novo assembly was performed separately with the short reads assembling programs SOAPdenovo  and Trinity . It has been demonstrated that Trinity is a more efficient de novo transcriptome assembler, especially in the absence of a reference genome . First, Trinity combined the reads with a certain overlap length to form longer fragments, which were called contigs. Next, these reads were mapped back to contigs; with paired-end reads, Trinity was able to detect contigs from the same transcript and determine the distances between these contigs. Finally, Trinity connected these contigs into sequences that could not be extended on their end. Such sequences were defined as unigenes.
As the Trinity assembler discards low coverage k-mers, no quality trimming of the reads was performed prior to the assembly. Trinity was run on the paired-end sequences with the fixed k-mer size of 25, minimum contig length of 100.
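The read-cleaning criteria described in this section (discard empty reads, reads with more than 10% N, and reads in which more than 50% of the bases have a Q-value ≤ 5) can be expressed as a small filter. This is a sketch under stated assumptions, not the authors' script: adaptor removal is omitted, the function names are hypothetical, and the Phred+33 quality encoding used in the helper is an assumption (the encoding of the original data is not given in the text).

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    The offset of 33 is an assumption; older Illumina pipelines used 64.
    """
    return [ord(ch) - offset for ch in quality_string]


def keep_read(seq, quals, max_n_frac=0.10, low_q=5, max_low_q_frac=0.50):
    """Return True if a read passes the N-content and low-quality filters."""
    if not seq:
        return False  # empty reads are discarded
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    n_frac = seq.upper().count("N") / len(seq)
    low_q_frac = sum(1 for q in quals if q <= low_q) / len(seq)
    # Reads are removed only when a fraction exceeds its threshold.
    return n_frac <= max_n_frac and low_q_frac <= max_low_q_frac


if __name__ == "__main__":
    print(keep_read("ACGTACGTAC", [30] * 10))            # clean read
    print(keep_read("NNACGTACGT", [30] * 10))            # 20% N
    print(keep_read("ACGTACGTAC", [2] * 6 + [30] * 4))   # 60% low quality
```

For paired-end data, a natural extension would be to drop a pair when either mate fails the filter, although the text does not say how pairs were treated.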
Gene annotation and classifications
The optimal assembly results were chosen according to the assembly evaluation. The assembled sequences were compared against the NCBI Nr database and the Swiss-Prot database using BLASTx with an E-value cutoff of 10⁻⁵. Gene names were assigned to each assembled sequence based on the best BLAST hit (highest score). To increase computational speed, the search was limited to the first 10 significant hits for each query. The ORFs were identified as the nucleotide sequence or as the protein translation provided by the “GetORF” program from the EMBOSS software package. The longest ORF was extracted for each unigene. We quantified transcript levels in Reads Per Kilobase of exon model per Million mapped reads (RPKM). The RPKM measure of read density reflects the molar concentration of a transcript in the starting sample by normalizing for RNA length and for the total read number in the measurement. Genes with high expression levels were screened and listed.
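The RPKM normalization mentioned above divides the read count of a transcript by its length in kilobases and by the total number of mapped reads in millions. A minimal sketch of that arithmetic follows; the function name and the example numbers are illustrative and do not come from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of exon model per Million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Invented example: 500 reads on a 2 kb unigene, 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings, so two transcripts with the same RPKM have the same read density regardless of their lengths.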
To annotate the assembled sequences with GO terms describing biological processes, molecular functions and cellular components, the Swiss-Prot BLAST results were imported into Blast2GO , a software package that retrieves GO terms, allowing gene functions to be determined and compared. These GO terms are assigned to query sequences, producing a broad overview of groups of genes catalogued in the transcriptome for each of three ontology vocabularies, biological processes (BP), molecular functions (MF) and cellular components (CC). The unigenes sequences were also aligned to the COG database to predict and classify functions. KEGG pathways were assigned to the assembled sequences using the online KEGG web server (http://www.genome.jp/kegg/) . The output of KEGG analysis includes KO assignments and KEGG pathways that are populated with the KO assignments.
Total RNA was isolated from C. heterophylla Fisch floral buds during four stages (NA, Non-cold Acclimation; CA, Cold Acclimation; MW, Midwinter; DA, Deacclimation) with the RNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA). cDNA synthesis was performed with 1 µg of total RNA using the Superscript III First Strand Synthesis system followed by the RNase H step (Invitrogen, Carlsbad, USA), according to the manufacturer's protocol. Primer pairs were designed using Primer3 (http://frodo.wi.mit.edu/primer3/) with the following parameters: Tm of approximately 60°C, product size range of 100–260 base pairs, primer sequences with a length of approximately 20 nucleotides, and a GC content of 45–55%. The gene names and primers used for qRT-PCR are listed in Table S4. To quantify the expression level of the selected genes, the C. heterophylla Fisch Actin gene was used as an internal control. qRT-PCR was performed using a 7500 Real-time PCR System (Applied Biosystems, CA, USA) and a SYBR Premix Ex Taq Kit (TaKaRa, Dalian, China). The relative quantitative method (ΔΔCT) was used to calculate the fold change of the target genes. Three biological replicates (nested with three technical replicates) per sample were carried out.
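The ΔΔCT calculation referred to above can be illustrated with a short function. This sketch assumes the standard form of the method, in which amplification efficiency is close to 100% so that fold change equals 2 raised to the power of −ΔΔCT; the function name and the CT values in the example are invented, with Actin standing in as the reference gene as described in the text.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the ΔΔCT method.

    ΔCT  = CT(target) - CT(reference), computed per condition
    ΔΔCT = ΔCT(sample) - ΔCT(control)
    fold = 2 ** (-ΔΔCT), assuming roughly 100% PCR efficiency
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Invented CT values: target gene vs. Actin, treated stage vs. NA control.
    print(fold_change_ddct(22.0, 18.0, 25.0, 18.0))
```

In practice the CT values entering this formula would be means over the technical replicates of each biological replicate, with the resulting fold changes then summarized across biological replicates.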
Sequences with significant BLASTn matches against the Nr, Swiss-Prot, COG, KEGG, and GO databases.
GO terms for all unigenes of C. heterophylla Fisch.
KEGG biochemical pathways of C. heterophylla Fisch. In order to better understand the biological functions of C. heterophylla Fisch unigenes, a total of 17,207 unigenes were assigned to 128 KEGG biochemical pathways.
Conceived and designed the experiments: XC GW. Performed the experiments: XC JZ TZ QM. Analyzed the data: XC JZ WG. Contributed reagents/materials/analysis tools: QL TZ QM GW. Wrote the paper: XC JZ GW. Deposited the sequences in databases: XC JZ.
- 1. Erdogan V, Mehlenbacher SA (2000) Phylogenetic relationships of Corylus species (Betulaceae) based on nuclear ribosomal DNA ITS region and chloroplast matK gene sequences. Syst Bot 25: 727–737.
- 2. Ozdemir F, Akinci I (2004) Physical and nutritional properties of four major commercial Turkish hazelnut varieties. J Food Eng 63: 341–347.
- 3. Liu J, Cheng Y, Liu C, Zhang C, Wang Z (2013) Temporal changes of disodium fluorescein transport in hazelnut during fruit development stage. Sci Hortic-Amsterdam 150: 348–353.
- 4. Fallico B, Arena E, Zappala M (2003) Roasting of hazelnuts. Role of oil in colour development and hydroxymethylfurfural formation. Food Chem 81: 569–573.
- 5. Özdemir M, Seyhan F, Bakan A, Ilter S, Özay G, et al. (2001) Analysis of internal browning of roasted hazelnuts. Food Chem 73: 191–196.
- 6. Plosker GL, Hurst M (2001) Paclitaxel: a pharmacoeconomic review of its use in non-small cell lung cancer. Pharmacoeconomics 19: 1111–1134.
- 7. Kumar S, Mahdi H, Bryant C, Shah JP, Garg G, et al. (2010) Clinical trials and progress with paclitaxel in ovarian cancer. Inter J Women's Health 2: 411.
- 8. Gradishar W (2012) Taxanes for the treatment of metastatic breast cancer. Bre Can: Basic Clin Res 6: 159.
- 9. Mahajan S, Tuteja N (2005) Cold, salinity and drought stresses: an overview. Arch Biochem Biophys 444: 139–158.
- 10. Lourenço T, Sapeta H, Figueiredo DD, Rodrigues M, Cordeiro A, et al. (2013) Isolation and characterization of rice (Oryza sativa L.) E3-ubiquitin ligase OsHOS1 gene in the modulation of cold stress response. Plant Mol Biol 83: 351–363.
- 11. Chinnusamy V, Zhu J-K, Sunkar R (2010) Gene regulation during cold stress acclimation in plants. Plant Stress Tolerance: Springer. 39–55.
- 12. Guy CL (1990) Cold acclimation and freezing stress tolerance: role of protein metabolism. Annu Rev Plant Biol 41: 187–223.
- 13. Thomashow MF (1999) Plant cold acclimation: freezing tolerance genes and regulatory mechanisms. Annu Rev Plant Biol 50: 571–599.
- 14. Dhanaraj AL, Alkharouf NW, Beard HS, Chouikha IB, Matthews BF, et al. (2007) Major differences observed in transcript profiles of blueberry during cold acclimation under field and cold room conditions. Planta 225: 735–751.
- 15. Zhao Z, Tan L, Dang C, Zhang H, Wu Q, et al. (2012) Deep-sequencing transcriptome analysis of chilling tolerance mechanisms of a subnival alpine plant, Chorispora bungeana. BMC Plant Biol 12: 222.
- 16. Lv D-K, Bai X, Li Y, Ding X-D, Ge Y, et al. (2010) Profiling of cold-stress-responsive miRNAs in rice by microarrays. Gene 459: 39–47.
- 17. Baker SS, Wilhelm KS, Thomashow MF (1994) The 5′-region of Arabidopsis thaliana cor15a has cis-acting elements that confer cold-, drought- and ABA-regulated gene expression. Plant Mol Biol 24: 701–713.
- 18. Dubouzet JG, Sakuma Y, Ito Y, Kasuga M, Dubouzet EG, et al. (2003) OsDREB genes in rice, Oryza sativa L., encode transcription activators that function in drought-, high-salt- and cold-responsive gene expression. Plant J 33: 751–763.
- 19. Morsy MR, Almutairi AM, Gibbons J, Yun SJ, de los Reyes BG (2005) The OsLti6 genes encoding low-molecular-weight membrane proteins are differentially expressed in rice cultivars with contrasting sensitivity to low temperature. Gene 344: 171–180.
- 20. Agarwal M, Hao Y, Kapoor A, Dong C-H, Fujii H, et al. (2006) A R2R3 type MYB transcription factor is involved in the cold regulation of CBF genes and in acquired freezing tolerance. J Biol Chem 281: 37636–37645.
- 21. Benedict C, Geisler M, Trygg J, Huner N, Hurry V (2006) Consensus by democracy. Using meta-analyses of microarray and genomic data to model the cold acclimation signaling pathway in Arabidopsis. Plant Physiol 141: 1219–1232.
- 22. Ito Y, Katsura K, Maruyama K, Taji T, Kobayashi M, et al. (2006) Functional analysis of rice DREB1/CBF-type transcription factors involved in cold-responsive gene expression in transgenic rice. Plant Cell Physiol 47: 141–153.
- 23. Mehlenbacher SA (1991) Hazelnuts. In: Moore, J N; Ballington, J R (eds). Genetic resources in temperate fruit and nut crops. Acta Horticulturae 290: 789–836.
- 24. Jiang B, Xie D, Liu W, Peng Q, He X (2013) De novo assembly and characterization of the transcriptome, and development of SSR markers in wax gourd (Benicasa hispida). PLoS One 8: e71054.
- 25. Morozova O, Marra MA (2008) Applications of next-generation sequencing technologies in functional genomics. Genomics 92: 255–264.
- 26. Wang Z, Fang B, Chen J, Zhang X, Luo Z, et al. (2010) De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweetpotato (Ipomoea batatas). BMC Genomics 11: 726.
- 27. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29: 644–652.
- 28. Heidarvand L, Amiri RM (2010) What happens in plant molecular responses to cold stress? Acta Physiol Plant 32: 419–431.
- 29. Weiser C (1970) Cold resistance and injury in woody plants. Science 169: 1269–1278.
- 30. Steponkus PL (1984) Role of the plasma membrane in freezing injury and cold acclimation. Annu Rev Plant Physiol 35: 543–584.
- 31. Breton G, Danyluk J, Charron J-BtF, Sarhan F (2003) Expression profiling and bioinformatic analyses of a novel stress-regulated multispanning transmembrane protein family from cereals and Arabidopsis. Plant Physiol 132: 64–74.
- 32. Liao Y, Zou H-F, Wei W, Hao Y-J, Tian A-G, et al. (2008) Soybean GmbZIP44, GmbZIP62 and GmbZIP78 genes function as negative regulator of ABA signaling and confer salt and freezing tolerance in transgenic Arabidopsis. Planta 228: 225–240.
- 33. Kariola T, Brader G, Helenius E, Li J, Heino P, et al. (2006) EARLY RESPONSIVE TO DEHYDRATION 15, a negative regulator of abscisic acid responses in Arabidopsis. Plant Physiol 142: 1559–1573.
- 34. Carlier M-F, Laurent V, Santolini J, Melki R, Didry D, et al. (1997) Actin depolymerizing factor (ADF/cofilin) enhances the rate of filament turnover: implication in actin-based motility. J Cell Biol 136: 1307–1322.
- 35. Ouellet F, Carpentier É, Cope MJT, Monroy AF, Sarhan F (2001) Regulation of a wheat actin-depolymerizing factor during cold acclimation. Plant Physiol 125: 360–368.
- 36. Danyluk J, Carpentier E, Sarhan F (1996) Identification and characterization of a low temperature regulated gene encoding an actin-binding protein from wheat. FEBS Lett 389: 324–327.
- 37. Li R, Zhu H, Ruan J, Qian W, Fang X, et al. (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 20: 265–272.
- 38. Rice P, Longden I, Bleasby A (2000) EMBOSS: the European molecular biology open software suite. Trends Genet 16: 276–277.
- 39. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat methods 5: 621–628.
- 40. Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, et al. (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21: 3674–3676.
- 41. Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M (2004) The KEGG resource for deciphering the genome. Nucleic Acids Res 32: D277–D280.
- 42. Livak KJ, Schmittgen TD (2001) Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2−ΔΔCT Method. Methods 25: 402–408.
- 43. Kaplan F, Guy CL (2005) RNA interference of Arabidopsis beta-amylase 8 prevents maltose accumulation upon cold shock and increases sensitivity of PSII photochemical efficiency to freezing stress. Plant J 44: 730–743.
- 44. Zhang Y, Wang Z, Zhang L, Cao Y, Huang D, et al. (2006) Molecular cloning and stress-dependent regulation of potassium channel gene in Chinese cabbage (Brassica rapa ssp. Pekinensis). J Plant Physiol 163: 968–978.
- 45. Janská A, Maršík P, Zelenková S, Ovesná J (2010) Cold stress and acclimation-what is important for metabolic adjustment? Plant Biology 12: 395–405.
- 46. Xing W, Rajashekar C (2001) Glycine betaine involvement in freezing tolerance and water stress in Arabidopsis thaliana. Environ Exp Bot 46: 21.
- 47. Guy C, Kaplan F, Kopka J, Selbig J, Hincha DK (2008) Metabolomics of temperature stress. Physiol Plantarum 132: 220–235.
- 48. Krasensky J, Jonak C (2012) Drought, salt, and temperature stress-induced metabolic rearrangements and regulatory networks. J Exp Bot 63: 1593–1608.
- 49. Wathugala DL, Richards SA, Knight H, Knight MR (2011) OsSFR6 is a functional rice orthologue of SENSITIVE TO FREEZING-6 and can act as a regulator of COR gene expression, osmotic stress and freezing tolerance in Arabidopsis. New Phytol 191: 984–995.
- 50. Huang C, Ding S, Zhang H, Du H, An L (2011) CIPK7 is involved in cold response by interacting with CBL1 in Arabidopsis thaliana. Plant Sci 181: 57–64.
- 51. Wang J, Rajakulendran N, Amirsadeghi S, Vanlerberghe GC (2011) Impact of mitochondrial alternative oxidase expression on the response of Nicotiana tabacum to cold temperature. Physiol Plantarum 142: 339–351.
- 52. Kosová K, Vítámvás P, Prášil IT, Renaut J (2011) Plant proteome changes under abiotic stress-contribution of proteomics studies to understanding plant stress response. J Proteomics 74: 1301–1322.
- 53. Teige M, Scheikl E, Eulgem T, Dóczi R, Ichimura K, et al. (2004) The MKK2 Pathway Mediates Cold and Salt Stress Signaling in Arabidopsis. Mol Cell 15: 141–152.
- 54. Kargiotidou A, Deli D, Galanopoulou D, Tsaftaris A, Farmaki T (2008) Low temperature and light regulate delta 12 fatty acid desaturases (FAD2) at a transcriptional level in cotton (Gossypium hirsutum). J Exp Bot 59: 2043–2056.
- 55. Saijo Y, Hata S, Kyozuka J, Shimamoto K, Izui K (2000) Over-expression of a single Ca2+-dependent protein kinase confers both cold and salt/drought tolerance on rice plants. Plant J 23: 319–327.
- 56. Vergnolle C, Vaultier M-N, Taconnat L, Renou J-P, Kader J-C, et al. (2005) The cold-induced early activation of phospholipase C and D pathways determines the response of two distinct clusters of genes in Arabidopsis cell suspensions. Plant Physiol 139: 1217–1233.
- 57. Zhang J, Li Y, Chen W, Du G-C, Chen J (2012) Glutathione improves the cold resistance of Lactobacillus sanfranciscensis by physiological regulation. Food Microbiol 31: 285–292.
- 58. Kasukabe Y, He L, Nada K, Misawa S, Ihara I, et al. (2004) Overexpression of spermidine synthase enhances tolerance to multiple environmental stresses and up-regulates the expression of various stress-regulated genes in transgenic Arabidopsis thaliana. Plant Cell Physiol 45: 712–722.
- 59. Tähtiharju S, Palva T (2001) Antisense inhibition of protein phosphatase 2C accelerates cold acclimation in Arabidopsis thaliana. Plant J 26: 461–470.
- 60. Sui N, Li M, Zhao S-J, Li F, Liang H, et al. (2007) Overexpression of glycerol-3-phosphate acyltransferase gene improves chilling tolerance in tomato. Planta 226: 1097–1108.
- 61. Goñi O, Sanchez-Ballesta MT, Merodio C, Escribano MI (2013) Two cold-induced family 19 glycosyl hydrolases from cherimoya (Annona cherimola) fruit: An antifungal chitinase and a cold-adapted chitinase. Phytochemistry 95: 94–104.
- 62. Zhao M-G, Chen L, Zhang L-L, Zhang W-H (2009) Nitric reductase-dependent nitric oxide production is involved in cold acclimation and freezing tolerance in Arabidopsis. Plant Physiol 151: 755–767.
- 63. Baek K-H, Skinner DZ (2012) Production of reactive oxygen species by freezing stress and the protective roles of antioxidant enzymes in plants. J Agr Chem Environ 1: 34.
- 64. Taji T, Ohsumi C, Iuchi S, Seki M, Kasuga M, et al. (2002) Important roles of drought- and cold-inducible genes for galactinol synthase in stress tolerance in Arabidopsis thaliana. Plant J 29: 417–426.
- 65. Sun W, Van Montagu M, Verbruggen N (2002) Small heat shock proteins and stress tolerance in plants. BBA-Gene Struct Expr 1577: 1–9.
- 66. Chelysheva VV, Smolenskaya IN, Trofimova MC, Babakov AV, Muromtsev GS (1999) Role of the 14-3-3 proteins in the regulation of H+-ATPase activity in the plasma membrane of suspension-cultured sugar beet cells under cold stress. FEBS Lett 456: 22–26.
- 67. Alves MS, Fontes EP, Fietto LG (2011) EARLY RESPONSIVE to DEHYDRATION 15, a new transcription factor that integrates stress signaling pathways. Plant Signal Behav 6: 1993–1996.
- 68. Nakashima K, Yamaguchi-Shinozaki K (2006) Regulons involved in osmotic stress-responsive and cold stress-responsive gene expression in plants. Physiol Plantarum 126: 62–71.
- 69. Polashock JJ, Arora R, Peng Y, Naik D, Rowland LJ (2010) Functional identification of a C-repeat binding factor transcriptional activator from blueberry associated with cold acclimation and freezing tolerance. J Am Soc Hortic Sci 135: 40–48.
- 70. Dong MA, Farré EM, Thomashow MF (2011) Circadian clock-associated 1 and late elongated hypocotyl regulate expression of the C-repeat binding factor (CBF) pathway in Arabidopsis. P Natl Acad Sci USA 108: 7241–7246.
- 71. Román Á, Andreu V, Hernández ML, Lagunas B, Picorel R, et al. (2012) Contribution of the different omega-3 fatty acid desaturase genes to the cold response in soybean. J Exp Bot 63: 4973–4982.
- 72. Cabello JV, Arce AL, Chan RL (2012) The homologous HD-Zip I transcription factors HaHB1 and AtHB13 confer cold tolerance via the induction of pathogenesis-related and glucanase proteins. Plant J 69: 141–153.
- 73. Liu L, Cao X-L, Bai R, Yao N, Li L-B, et al. (2012) Isolation and characterization of the cold-induced Phyllostachys edulis AP2/ERF family transcription factor, peDREB1. Plant Mol Biol Rep 30: 679–689.
- 74. Kalinina E, Keith B, Kern A, Dyer W (2012) Salt- and osmotic stress-induced choline monooxygenase expression in Kochia scoparia is ABA-independent. Biol Plantarum 56: 699–704.
- 75. Teixeira MC, Carvalho IS, Brodelius M (2010) ω-3 fatty acid desaturase genes isolated from purslane (Portulaca oleracea L.): expression in different tissues and response to cold and wound stress. J Agr Food Chem 58: 1870–1877.
- 76. Garnier M, Matamoros S, Chevret D, Pilet M-F, Leroi F, et al. (2010) Adaptation to cold and proteomic responses of the psychrotrophic biopreservative Lactococcus piscium strain CNCM I-4031. Appl Environ Microb 76: 8011–8018.
- 77. Zarka DG, Vogel JT, Cook D, Thomashow MF (2003) Cold induction of Arabidopsis CBF genes involves multiple ICE (inducer of CBF expression) promoter elements and a cold-regulatory circuit that is desensitized by low temperature. Plant Physiol 133: 910–918.
- 78. Fan L, Wang A, Wu Y (2013) Comparative proteomic identification of the hemocyte response to cold stress in white shrimp, Litopenaeus vannamei. J Proteomics 80: 196–206.
- 79. Guo L, Yang H, Zhang X, Yang S (2013) Lipid transfer protein 3 as a target of MYB96 mediates freezing and drought stress in Arabidopsis. J Exp Bot 64: 1755–1767.
- 80. Faccioli P, Pecchioni N, Cattivelli L, Stanca A, Terzi V (2001) Expressed sequence tags from cold-acclimatized barley can identify novel plant genes. Plant Breeding 120: 497–502.