De novo Assembly and Transcriptomic Profiling of the Grazing Response in Stipa grandis

Background Stipa grandis (Poaceae) is one of the dominant species in a typical steppe of the Inner Mongolian Plateau. However, primarily due to heavy grazing, the grasslands have become seriously degraded, and S. grandis has developed a special growth-inhibition phenotype against the stressful habitat. Because of the lack of transcriptomic and genomic information, the understanding of the molecular mechanisms underlying the grazing response of S. grandis has been prohibited. Results Using the Illumina HiSeq 2000 platform, two libraries prepared from non-grazing (FS) and overgrazing samples (OS) were sequenced. De novo assembly produced 94,674 unigenes, of which 65,047 unigenes had BLAST hits in the National Center for Biotechnology Information (NCBI) non-redundant (nr) database (E-value < 10-5). In total, 47,747, 26,156 and 40,842 unigenes were assigned to the Gene Ontology (GO), Clusters of Orthologous Group (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. A total of 13,221 unigenes showed significant differences in expression under the overgrazing condition, with a threshold false discovery rate ≤ 0.001 and an absolute value of log2Ratio ≥ 1. These differentially expressed genes (DEGs) were assigned to 43,257 GO terms and were significantly enriched in 32 KEGG pathways (q-value ≤ 0.05). The alterations in the wound-, drought- and defense-related genes indicate that stressors have an additive effect on the growth inhibition of this species. Conclusions This first large-scale transcriptome study will provide important information for further gene expression and functional genomics studies, and it facilitated our investigation of the molecular mechanisms of the S. grandis grazing response and the associated morphological and physiological characteristics.


Introduction
In China, the grasslands are the dominant landscape, covering 40% of the national land area, of which approximately 78% is located in the northern temperate zone [1]. In the Inner Mongolia Autonomous Region, the steppe grasslands cover 68% of the total land area (up to 791,000 km 2 ) [2], but 30-50% of this region is affected by deterioration and desertification [3], and overgrazing is considered to be the major factor contributing to this grassland degradation [2][3][4].
In the grasslands, grazing is normally linked with plant morphological and physiological responses, such as reduced shoot internodes or altered water utilization rates, to adapt to defoliation disturbance and more stressful habitats [5]. Under grazing conditions, plants are expected to develop a resistance strategy (such as small size, being short-lived and fast-growing or having low palatability) to avoid grazing or a tolerance strategy to re-grow / reproduce after damage; decreased plant height is considered to be a positive response to grazing [6,7]. The accumulating data have revealed that grazing leads to growth-inhibition in plants; for example, herbivory by the white-tailed deer on Anticosti Island (Quebec) resulted in stunted bonsai-like plants [8]. Additionally, overgrazing by sheep has significantly reduced plant height when compared with non-grazing grasslands [9].
Stipa grandis (Poaceae, 2n = 44) is a C3 perennial bunch grass and is one of the dominant species in a typical steppe of the Inner Mongolian Plateau [10]. It is a wind-pollinated grass, flowering in mid to late July, with the seeds ripening in late August or early September [10,11]. The mature plants have dense tussocks that are approximately 30 cm high, with long, thin leaves [11]. The S. grandis steppe, which represents the major pasture type in Inner Mongolia, primarily spreads from the Xilingole Plateau and the middle-eastern region of the Hulun Buir Plateau [4] and acts as a natural green barrier to protect the vast area from sandstorms [11]. However, as one of the typical dominant steppes in the Xilin River Basin, the S. grandis steppe has degraded to different degrees, mainly due to overgrazing [4,12]. As a result, the plants not only showed a smaller size but also reduced plant circumference and sexual reproduction [10].
Because there is no sequencing information for S. grandis in the public databases, further research on this steppe plant at the molecular level in response to grazing has been limited. Nextgeneration sequencing (NGS) technologies, such as the Roche / 454, Solexa / Illumina and AB SOLiD platforms [13,14], have been rapidly developed in recent years, providing more efficient and less costly sequencing than ever before [13]. In the current study, we performed a de novo assembly of the leaf transcriptome of S. grandis in response to grazing using the Illumina HiSeq 2000 sequencing platform and characterized the transcriptional changes by comparing the transcriptomes of overgrazing and non-grazing plants. This information will facilitate further functional genomics studies in S. grandis and aid in the understanding of the molecular mechanisms behind the grazing response in plants and the associated morphological changes. non-grazing area has been fenced since 1983 for grazing-free and is considered to be restored, while overgrazing region right next to the fenced area was freely grazed by sheep and seriously degraded [15]. More details about the non-grazing and overgrazing region can be found in previously study [16]. The climate of this area was described previously [2].
For measuring plant height, ten individual plants were randomly chosen and sampled for each region in the morning (over 12 hours after the last grazing happened), and the plants height were statistically analyzed in S1 Fig. For sequencing, three plants were taken from the non-grazing and overgrazing fields each, and fresh leaves of S. grandis were collected and immediately frozen in liquid nitrogen and then stored at -80°C. 0.1 g leaves of S. grandis were taken from each plant for total RNA extraction using Trizol regent. The total RNA from each of the three plants was pooled to obtain at least 20 μg of RNA, which was further treated with RNase-free DNase I. The RNA integrity was examined using an Agilent 2100 Bioanalyzer. The poly(A) mRNA was isolated using Oligo(dT) Beads.

cDNA library construction and sequencing
Following the isolation, the mRNA was fragmented into short fragments using fragmentation buffer. The short fragments were used as templates, and the first-strand cDNA was synthesized using random hexamer primers and reverse transcriptase. Following the second-strand cDNA synthesis, the fragments were end repaired, and poly(A) and sequencing adapters were ligated to the fragments. Following the selection of suitable template fragments, enrichment was performed using PCR amplification to create the final cDNA library. The libraries were sequenced using an Illumina HiSeq 2000.

Data filtering and de novo assembly
The raw reads were filtered by removing the reads containing adaptor sequences, "N" (unknown nucleotides) percentages that were greater than 5%, and low-quality reads (the rate of reads which quality value 10 is more than 20%) with software "filter_fq". Then, the transcriptome de novo assembly was performed using the short-read assembly program Trinity [17]. Trinity performs analysis in three steps using different software modules: Inchworm, Chrysalis, and Butterfly. Briefly, Inchworm assembles the cleaning reads into the linear transcript contigs (unique sequences of transcripts) using greedy K-mer extension (k = 25). Then Chrysalis clusters minimally overlapping Inchworm contigs into clusters and builds complete de Bruijn graphs for each cluster, representing the full transcriptional complexity for a given gene. In the last step, Butterfly processed the individual graphs in parallel, compacted the graphs and report full-length transcripts for alternatively spliced isoforms and paralogous transcripts. The sequences generated from Trinity were defined as unigenes. The assembled unigenes from FS and OS were used for further sequence splicing and redundancy removing with clustering software TGICL [18] to get non-redundant unigenes as long as possible. After gene family clustering, the final obtained unigenes were divided into either clusters (shared more than 70% similarity) or singletons. Finally, a Blastx [19] alignment (E-value < 10 -5 ) was performed between the unigenes and various protein databases, such as the non-redundant protein (nr) database (http://www.ncbi.nlm.nih.gov), the Swiss-Prot protein database (http:// www.expasy.ch/sprot), the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (http://www.genome.jp/kegg) and the Cluster of Orthologous Groups (COG) database (http://www.ncbi.nlm.nih.gov/COG). The best aligning results were used to decide the unigene's sequence orientation. A priority order of nr, Swiss-Prot, KEGG and COG was used when the outcome from the different databases conflicted. If a unigene did not align to any of the above databases, ESTScan [20] was used to predict its sequence orientation.

Unigene annotation and classification
For the annotation, the unigenes were first aligned using Blastx against the protein databases, including nr, Swiss-Prot, KEGG and COG (E-value < 10 -5 ). The protein with the highest sequence similarity was retrieved using the given unigenes and was annotated to each unigene. The GO (gene ontology; http://www.geneontology.org) annotations for the unigenes were performed using the Blast2GO [21] program, based on the best Blastx hits from the nr database (E-value < 10 -5 ), and the GO functional classification for All-unigenes was performed using WEGO [22]. The COG database was also used for possible functional prediction and classification of the unigenes. The pathway assignments were generated using Blastall software against the KEGG database.

Estimation of the expression levels and the differential expression analyses
To calculate the unigene expression levels, the FPKM method (Fragments Per kb per Million fragments) was employed, as previously described [23]. The fold changes of the transcripts were calculated using the log 2 formula of OS_ FPKM / FS_FPKM, and 0.01 was used (instead of 0) for the fold change calculation once the value of either OS_ FPKM or FS_FPKM was zero. The differentially expressed genes (DEGs) between the two samples were analyzed with a rigorous algorithm based on Audic's [24] method. The false discovery rate (FDR) method [25] was used to correct for the P-value in the multiple hypothesis testing. An FDR 0.001 and an absolute value of log 2 Ratio 1 were used as the thresholds to judge the significance of the differential gene expression. For the functional and pathway enrichment analysis, the DEGs were then mapped into GO terms (P-value 0.05) and the KEGG database (q-value 0.05).

Quantitative real-time PCR analysis (qPCR)
The qPCR was performed using a Roche LightCycler 480 Real-Time PCR system, as previously described [26]. Actin was used as the internal control gene, and the 2 -ΔΔCT method [27] was used to evaluate the relative quantities of each amplified product in the samples. For each qPCR analysis, three technical replicates were performed. The primers used for the qPCR were provided in S1 Table. Data deposition The clean raw reads data were deposited in NCBI Sequence Read Archive (SRA, http://www. ncbi.nlm.nih.gov/Traces/sra) under the accession number SRP051667.

Illumina sequencing and reads assembly
To investigate the transcriptomic response of S. grandis to overgrazing, the leaves of S. grandis from both FS and OS areas were sampled. When compared with the non-grazing plants, the S. grandis under the overgrazing conditions were significantly reduced in height (S1 Fig). In total, 77,859,274 raw reads for FS and 75,960,480 raw reads for OS were generated using the Illumina HiSeq 2000 sequencing platform. After filtering out the dirty raw reads, 65,954,618 and 64,545,856 clean reads (with an average length of 90 bp) were obtained from the FS and OS samples (Table 1), respectively.
The Trinity [17] method was adopted to assemble all of the high-quality clean reads into contigs and unigenes. As a result, 193,974 FS contigs and 184,321 OS contigs were generated

Functional annotation of the assembled unigenes
For the functional annotation, the unigene sequences were first blasted against the NCBI nr database using Blastx (E-value < 10 -5 ). Of the 94,674 unigenes, 65,047 (68.71%) were annotated (Table 2), whereas 29,627 unigenes (31.29%) were not matched to any known proteins in the nr database. The E-value frequency distribution analysis revealed that 46% of the sequences shared strong homologies, with E-values 1.0E-60, while the remaining 54% fell into the range of 1.0E-60-1.0E-5 (Fig 2A). Furthermore, we also observed that 56.9% of the sequences had a similar distribution range between 80% and 100%, but only 5.8% had similarity values less than 40% (Fig 2B). Based on the homologous species identified among the annotated unigenes, 49.0% of the unigene sequences matched to Brachypodium distachyon, followed by Hordeum vulgare subsp. vulgare (17.2%) and Oryza sativa Japonica (12.3%) ( Fig 2C). In addition to the nr annotation, 42,233 unigenes (44.61%) were aligned to known proteins in the Swiss-Prot database (E-value < 10 -5 ) ( Table 2).

Differential expression and pathway analyses in the FS and OS populations
To reveal the differential expression profiles between FS and OS, the potential DEGs were analyzed. The FPKM method [23] was used to calculate the expression levels. When a threshold of FDR 0.001 and an absolute value of log 2 Ratio 1 were used, a total of 13,221 unigenes showed significant differences in response to grazing, of which 6283 unigenes were upregulated and 6938 unigenes were downregulated (Fig 5).
Base on the nr annotation, a total of 6526 DEGs were assigned 43,257 GO terms and classified into 51 functional categories, which belongs to three main ontologies: biological process    Table). By analyzing the "biological process" categories, 18 subcategories, such as "response to wounding" (GO:0009611), "response to water deprivation" (GO:0009414), "response to reactive oxygen species" (GO:0000302), "defense response" (GO:0006952), "proline biosynthetic process" (GO:0006561) and "chitin catabolic process" (GO:0006032), which is related to resistant responses to abiotic and biotic stressors, were observed (Fig 6).

Validation of the transcriptome results via qPCR analysis
To validate the expression levels in the transcriptome, 30 genes exhibiting different expression patterns (10 upregulated, 10 downregulated, and 10 unigenes with no obvious changes in transcript abundance, with a |log 2 Ratio| < 1 between FS and OS) were randomly selected and examined using qPCR. As a result, the relative expression patterns of the 30 unigenes tested via qPCR correspond well with the transcriptome data that was determined using the FPKM method, exhibiting clear up-or down-regulation or no change in response to grazing (Fig 7), suggesting that the transcriptomic profiling data were reliable. However, some quantitative differences in the relative expression levels were observed, which may be due to the differential sensitivities of the two techniques.

Genes related to stress and defense
Grazing is a process containing multiple stressors, involving both biotic and abiotic stressors and including trampling / wounding, water stress and pathogen infection.
Overall, we identified a total of 115 defense-response related DEGs among the 675 DEGs in the plant-pathogen interaction pathway (Table 4 and S3 Table), including the pathogenesis-related gene PR1 (Unigene6840), which is considered to be one of the SAR (systemic acquired resistance) marker genes [40], and one PRB1-2-like gene (Unigene2461), which had accumulated in the OS sample, indicating that disease resistance was activated. Six leucine-rich repeat receptor kinase (LRR-RK) transcripts, encoding the FLS2 (flagellin sensitive 2) genes [41], and 11 RPM1 (resistance to Pseudomonas maculicula protein 1) [42] were increased by overgrazing. In contrast, 4 homologs of RIN4 (encoding an RPM1-interacting protein) [42], showed attenuated expression. Additionally, the transcripts of the SGT1 (suppressor of G-two allele of Skp1) [43] and Hsp90 (heat shock protein 90) genes were significantly enhanced by grazing. Notably, a set of WRKY genes, including WRKY-1, -7, -11, -55, and -70, also exhibited altered expression due to overgrazing. These results indicate that the immune response of S. grandis is enhanced and that disease resistance is activated by overgrazing.

Sequence assembly
Since the Trinity method was developed [17], NGS has become widely used for de novo transcriptome analysis [14], especially in non-model plants lacking genome sequence information [17], such as Reaumuria trigyna [44], Camelina sativa [45] and Hevea brasiliensis [46]. In this study, we used Illumina RNA-Seq technology sequenced two libraries prepared from the FS and OS samples and obtained high quality of de novo assembly data. Further analysis indicated that the unigenes of S. grandis that were annotated and the expression patterns of S. grandis in response to grazing were calculated correctly.  This report is the first to comprehensively analyze the transcriptome and to identify the differentially expressed genes in S. grandis under grazing conditions without prior genome information. Prior to this study, Chen et al. (2009) used monocot rice plants, which have been entirely sequenced, as a model and simulated grazing by cutting and dabbing cow saliva to study the corresponding gene expression in response to grazing defoliation; however, that experiment only simulated grazing using rice but not wild plants in the actual habitat. Recently, this research group also reported defoliation treatment response in Leymus chinensis and sorghum bicolor plants [47][48][49]. In the present study, we selected S. grandis, a typical species found in the grasslands of Inner Mongolia [10], as our starting material, and using transcriptomic RNA-Seq analysis, we provided a complete gene expression profile for grazing, thus facilitating further studies on the complex responses of plants to grazing at the molecular level.

The wound response in plants
Wounding is an inevitable threat for the survival of all organisms [50] that occurred more frequently under overgrazing condition. Many genes, such as those related to wounds and resistance to potential pathogen attacks, were activated in a grazing simulation study [51]. As a typical wound hormone [52], JA accumulates and modulates the expression of woundassociated genes in response to the plant's wounds [53]. Reymond et al. (2000) analyzed the expression dynamics of 150 genes in mechanically wounded Arabidopsis leaves using a cDNA microarray technique and found that the genes involved in the synthesis or metabolism of members of the JA family are induced, including LOX, FAD, AOS and OPR. Chen et al. (2014) using the Illumina / Solexa platform sequenced seven cDNA libraries prepared from control, wounded (2, 6 and 24 h) and defoliated (2, 6 and 24 h) L. chinensis plants, by comparing the transcriptomic data, 1836 and 3238 genes were significantly differential expressed between wounding and defoliation treatment within one day. Among these genes, such as LOX, AOS and OPR were commonly activated in response of wounding and defoliation. Consistent with this observation, we also found that a number of the above genes, along with 2 PLDs, were upregulated by grazing (Table 3).
Wounding also induced JAZ10 / TIFY9. In the core module (COI1-JAZs-MYC2) of the JA signaling pathway, the JAZ / TIFY genes are transcriptional targets of MYC2, and JA treatment quickly induces their transcripts, followed by the degradation of the SCF COI1 -dependent proteasome and activation of the JA response [34,35]. The activation of these genes indicates that The prefix "CL" represents clusters, and "unigene" represents singletons. FPKM indicates the FPKM values of the unigenes in FS or OS. "Fold change" is equal to log 2 (OS-FPKM / FS-FPKM). "+" indicates upregulated transcription, and "-" represents downregulated transcription.
doi:10.1371/journal.pone.0122641.t003 the JA-dependent wounding response was likely activated. Repeatedly wounding the leaves of Arabidopsis resulted in stunted growth and increased endogenous JA content; however, these treatments did not stunt the growth of mutants that were deficient in JA synthesis, such as the aos and opr3 mutants and the fad3-2fad7-2fad8 triple mutant, indicating that wound-induced JA significantly suppresses plant growth [28]. Based on the literature and our own research, it was hypothesized that the wound response is enhanced by overgrazing, and wound-induced JAs are likely to participate in overgrazing-inhibited plant growth. Additionally, PLD is involved in the response to drought stress [54], and JAs confer plants with the capacity to counter multiple biotic stimuli, such as pathogens [55]. These observations suggest that the complex interplay of gene expression patterns most likely also occurs under overgrazing conditions. The prefix "CL" represents clusters, and "unigene" represents singletons. FPKM indicates the FPKM values of the unigenes in FS or OS. "Fold change" is equal to log 2 (OS-FPKM / FS-FPKM). "+" indicates upregulated transcription, and "-" represents downregulated transcription. doi:10.1371/journal.pone.0122641.t004 The drought response in plants In addition to damage the plant, grazing also reduces the soil's water content [5]. Furthermore, water stress is an important participant in the plant's response to mechanical wounding [50]; therefore, plants also suffer from drought stress under grazing conditions. In plants, CRT / DRE (C-repeat / dehydration-responsive element) is a cis-acting DNA regulatory element that initiates transcription in response to water deficiency [39]. In Arabidopsis, the expression levels of all of the CRT / DRE binding factors (CBFs / DREB1s) were low under normal growth conditions, but the transcripts were immediately enhanced following drought stress [56,57]. DREB1 overexpression in Arabidopsis not only strengthened the plant's tolerance to drought but also resulted in plant growth retardation [58]. The constitutive expression of Arabidopsis CBF1 in Brassica napus elevated drought tolerance [57]. The LEA (late-embryogenesis abundant) proteins, expressed by many prokaryotes and eukaryotes, are hydrophilic proteins associated with tolerance to dehydration [36,37]. Many of the genes encoding the LEA proteins in Arabidopsis contain ABRE (ABA responsive element) and the DRE / CRT / LTRE element [37]. The expression of the LEA proteins are highly induced by water stress in peas (Pisum sativum) [36]. We also found that the transcripts of CBF / DREB and LEA were highly induced by overgrazing ( Table 3), indicating that grazing involve in plant's tolerance to dehydration. The plant hormone ABA (abscisic acid) plays a critical role in the plant's adaptation to abiotic environmental stressors (such as drought). During vegetative growth, ABA accumulates in the plant's cells and regulates the expression of many genes during drought stress [59]. NCED (9-cis-epoxycarotenoid dioxygenase) is a key enzyme in ABA biosynthesis. NCED expression is induced, coinciding with an increased level of endogenous ABA in Arabidopsis, when exposed to drought stress [60]. AtNCED3 overexpression in Arabidopsis confers improved drought tolerance via an increase in the ABA level, while AtNCED3 antisense plants and the T-DNA insertion mutant show a drought-sensitive phenotype with lower ABA levels when compared to wild-type plants [60]. We observed that NCED and 6 dehydrin transcripts were increased by overgrazing (Table 3). Therefore, we hypothesize that the drought response in S. grandis may be activated and elevated drought tolerance when the plant is subjected to grazing.

Plant immunity
Plants under grazing conditions are more likely to be attacked by pathogenic organisms when compared to non-grazing plants. On the one hand, the feces and urine deposition of livestock contain various microorganisms, and an increasing richness and abundance of microorganisms have been identified in soil following grazing due to trampling and fecal and urine deposition [61,62]. On the other hand, the open wound damage to plant tissues caused by mechanical wounding provides a potential infection site for pathogen invasion [50,52].
Plants use two innate immune system modes to protect against microbial pathogen attacks: pattern-triggered immunity (PTI), which is triggered by the perception of microbe-associated or pathogen-associated molecular patterns (MAMPs or PAMPs) through pattern recognition receptors (PRRs); or effector-triggered immunity (ETI), which is triggered by the recognition of pathogen effectors [63]. The perception of microbial pathogens by plants can be described in four phases using the "zigzag" model [64]. In the present study, six FLS2 genes, which function as PRRs in plants to perceive bacterial flagella [41], showed upregulated expression following overgrazing (Table 4).
Many pathogen virulence effectors are secreted through the type III secretion system (TTSS) and are recognized by plants with corresponding R genes [55]. In Arabidopsis, the nucleotide-binding leucine-rich repeat (NB-LRR) classes of the R protein RPM1 confer resistance to Pseudomonas syringae strains expressing the avirulence genes avrB and avrRpm1 [42,65]; loss-of-function of RPM1 plants show sensitive symptoms to P. syringae [66]. However, the RPM1-interacting protein RIN4, which is the target of the type III virulence effector and required for RPM1 activation [42], acts as a negative regulator of the plant's basal defense response [67]. Consistent with this observation, we found that the transcripts of 11 RPM1 genes were enhanced, whereas 4 RIN4 genes were compromised by overgrazing (Table 4). We also observed that the genes required for R gene activities were induced by grazing. For example, SGT1, a component of SCF (Skp1-Cullin-F-box protein) ubiquitin ligases and a regulator, was active early in the plant's R gene-mediated defense; the SGT1b mutation of Arabidopsis was defective in the plant's early defenses [68]. Additionally, the molecular chaperone HSP90, which is critical for disease resistance in Arabidopsis due to its interaction with SGT1 [69], showed increased gene expression under the grazing condition. Therefore, we hypothesize that grazing elevates the resistance of S. grandis to pathogenic organisms.
Several transcription factors (TFs), including WRKYs, also show altered expression during ETI and PTI [70]. Accumulating data implicate the WRKY TFs in the plant immune response, acting as both positive and negative regulators of disease resistance [70]. For example, overexpression of the pathogen-inducible OsWRKY31, also known as WRKY55 [71], enhances disease resistance in transgenic rice plants [72]. In contrast, WRKY7 and WRKY11 function as negative regulators in the plant defense response [73,74]. In Arabidopsis, the loss-of-function WRKY70 mutant showed enhanced disease susceptibility to Erysiphe cichoracearum, while WRKY70-overexpressing transgenic plants increased resistance to E. cichoracearum and suppressed the JA-induced defense response [75]. The altered expression of WRKY genes in present study (Table 4 and S3 Table) suggest that WRKY TFs may be an important component in the resistance of S. grandis to pathogenic organisms to provide grazing tolerance. However, the real roles of the WRKY TFs in S. grandis still need to be further identified. Based on the genes discussed above, we conclude that grazing not only subjects plants to biotic stressors and activates the plant's immune response, it also confers resistance to pathogenic organisms.
In conclusion, to cope with the stressful occurrences that are associated with grazing, plants develop an avoidance / resistance strategy for optimal growth, such as size reduction. It is well documented that pathogens and abiotic stressors, such as wounding and drought stress, severely impact plant performance and productivity [28,76,77]. Dwarf plants under these stressors always show enhanced tolerance and altered gene expression [28,58,78]. Therefore, our observation supports the notion that the dwarf phenotypes of S. grandis are induced by overgrazing, which is at least partially caused by the additive effects of multiple stressors, including wounding, drought and immunity signals.