Unigene-based RNA-seq provides insights on drought stress responses in Marsdenia tenacissima

Marsdenia tenacissima is a well-known anti-cancer medicinal plant used in traditional Chinese medicine, which often grows on the karst landform and the water conservation capacity of land is very poorly and drought occurrences frequently. We found M. tenacissima has strong drought resistance because of continuousdrought16 d, the leaves of M. tenacissima were fully curly and dying. But the leaves were fully almost recovering after re-watering 24h. The activity of SOD and POD were almost doubled under drought stress. The content of osmotic regulating substance proline and soluble sugar were three times than control group. But after re-watering, these indexes were declined rapidly. Three cDNA libraries of control, drought stress, and re-watering treatments were constructed. There were 43,129,228, 47,116,844, and 42,815,454 clean reads with Q20 values of 98.06, 98.04, and 97.88respectively.SRA accession number of raw data was PRJNA498187 on NCBI. A total of 8672, 6043, and 6537 differentially expressed genes (DEGs) were identified in control vs drought stress, control vs re-watering, and drought stress vs re-watering, respectively. In addition, 1039, 1016, and 980 transcription factors (TFs) were identified, respectively. Among them, 363, 267, and 299 TFs were identified as DEGs in drought stress, re-watering, and drought stress and re-watering, respectively. These differentially expressed TFs mainly belonged to the bHLH, bZIP, C2H2, ERF, MYB, MYB-related, and NAC families. A comparative analysis found that 1174 genes were up-regulated and 2344 were down-regulated under drought stress and this pattern was the opposite to that found after re-watering. Among the up-regulated genes, 64 genes were homologous to known functional genes that directly protect plants against drought stress. Furthermore, 44 protein kinases and 38 TFs with opposite expression patterns under drought stress and re-watering were identified, which are possibly candidate regulators for drought stress resistance in M. tenacissima. Our study is the first to characterize the M. tenacissima transcriptome in response to drought stress, and will serve as a useful resource for future studies on the functions of candidate protein kinases and TFs involved in M. tenacissima drought stress resistance.


Plant material, growth conditions, and drought stress treatments
The M. tenacissima "Yunnan" was used in this study, which was supplied by Yunnan Xintong Plant Pharmaceutical Co., Ltd. (Mengzi,Yunnan,China). The M. tenacissima seeds were surface-sterilized in 0.5% (w/v) NaClO for 15 min. Then they were sown in pots filled with peat and vermiculite (v/v = 3:1), and left to germinate in a greenhouse at 25˚C. The two-week-old M. tenacissima seedlings were individually transferred to a small flowerpot containing 1 kg soil (humus soil:garden soil = 1:1) and grown in an artificial climate incubator under natural drought stress treatment(12 h/12 day/night, light 4000 lx,temperature:23˚C/16˚C day/night, air relative humidity: 75%/55% day/night).
In the drought treatment, 10-15-cm high plants were split into three groups with ten plants in each group. The control group of plants was supplied water every two days. The drought stress group of plants was not supplied water until all leaves were curing (about 16 days) ( Fig  1B). The re-watering group of plants was not supplied water until the plant drought phenotype was the same as the drought stress group, then sample were taken when all curing leaves fully expanded (about 24 hours after watering) ( Fig 1C).

The determination of stress physiological indexes of M. tenacissima
The test materials were leaves of each group, using random sampling method. Each process was repeated three times. The contents of soluble sugar, malondialdehyde (MDA) and superoxide dismutase (SOD) were determined by anthrone method, thiobarbital colorimetry and nitroblue tetrazole photochemical reduction method respectively. The guaiacol method was used to measure the content of peroxidase (POD) [30].The experimental data were analyzed by single factor anova method of SPSS 21.0.

Total RNA extraction and cDNA synthesis and sequencing
The roots, stems, and leaves from three randomly selected plants in each group were collected and stored at -80˚C for RNA extraction. Total RNA was extracted from each sample using Triazol reagent (TaKaRa, Dalian, China) according to the manufacturer's instructions. The samples were then treated with DNase I to remove any contaminated genomic DNA. The integrity and purity of the RNA was verified by an ultraviolet spectrophotometer (OD260/ OD280 ratios of 1.89 to 2.08) and 1.2% agarose gel electrophoresis. The RNA from the roots, stems, and leaves of each group of plants was pooled. The cDNA libraries were then constructed according to Huang et al. [31]. The cDNA libraries were sequenced on a HiSeq2000 (Illumina, San Diego, CA, USA) according to the manufacturer's standard protocols to generate 100-bp paired-end reads.

Acquisition of clean reads and mapping
Raw reads from the cDNA library were filtered to remove low-quality reads and adaptors using the program FASTX-Tool kit (http://hannonlab.cshl.edu/fastx_toolkit/) to produce the clean reads. The clean reads were mapped to the reference transcriptome dataset (NCBISRA140234) using SOAP aligner/soap2 software ). The total mapped reads were kept for further analysis.

Identification of differentially expressed genes (DEGs)
The DEGs between treatments were identified based on the Reads Per Kilobase per Million (RPKM) value calibrated by DEGseq [32][33]. Genes with a "q value < 0.005" and a "fold change |log2| > 1" were deemed to be significantly differentially expressed between the two samples.

Functional annotation and classification
The DEGs were annotated using the following databases: the NR protein database (NCBI), Swiss Prot, Gene Ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, and the Clusters of Orthologous Groups database (COG) according to the methods of described by Zhou et al [34]. Pathways and GO function enrichment analyses were performed as previously described [35]. The transcription factor (TF) responses to drought stress were identified according to the method described by Zhao et al [36].

Quantitative real-time PCR (qRT-PCR) verification of DEGs
To evaluate the accuracy and validity of the transcriptome sequencing data, 24 genes with differential expressions were selected to carry out the qRT-PCR analysis. Primers were designed using the BioXM 2.6 software, and the primer sequences are listed in S1 Table. The GAPDH (glyceraldehyde-3-phosphate dehydrogenase) gene was used as a reference gene. The qRT-PCR analysis of each gene was performed with three biological replicates according to the SYBR Premix ExTaq TM protocol (TaKaRa) on a Light Cycler 480 Real-Time PCR machine (Roche Diagnostics Ltd., Switzerland). The relative expression level of each gene was calculated using the 2 -(ΔΔCt) method. The expression value of each gene from qRT-PCR and RNA-seq was log2 transformed so that the qRT-PCR data could be compared with the RNA-seq results.

The analysis of physiological indexes in M. tenacissima of three groups
We measured some response indexes of abiotic stress, such as SOD, POD, MDA, etc. We found that the content of 5 indexes was significantly increasing under drought stress and 5 indexes were significantly decreasing after re-watering 24 hours. The re-watering group was significant higher than control group except MDA, which indicated that although morphologically restored, physiological stress has not completely relieved (Fig 2).

Transcriptome sequencing, data statistics and evaluation, and reads mapping
To understand the drought-response molecular mechanism in M. tenacissima and identify potential candidate genes involved in drought tolerance, deep RNA sequencing of M. tenacissima seedlings subjected to drought and subsequent re-watering was performed using the Illumina sequencing platform. A total of 43,983,844, 48,059,552, and 43,744,500 raw reads were obtained from the control, drought stress, and re-watering cDNA libraries, respectively (Table 1).And SRA accession number was PRJNA498187 on NCBI. After removing the lowquality reads and adaptors, 43,129,228, 47,116,844, and 42,815,454 clean reads were produced, which accounted for 98.06%, 98.04%, and 97.88% of the raw reads, respectively. SRA accession number is PRJNA498187 on NCBI (Table 1)

Identification of DEGs responding to drought stress
The genes from each treatment group were subjected to a pairwise comparison to identify the DEGs after using a blast algorithm with the preset cutoffs. As a result, a total of 21,254 DEGs were identified. A comparison between control and drought stress showed that 1855 genes were up-regulated, and 6817 genes were down-regulated. Between control and re-watering, 1612 genes were up-regulated and 4431 genes down-regulated. A further 3982 genes were found to be up-regulated, and 2555 genes were down-regulated between drought stress and re-watering, and 78 up-regulated genes and 144 down-regulated genes were identified in all three comparison groups (Fig 3). Notably, 1174 of the 1855 induced genes and 2344 of the 6817 repressed genes under drought stress were down-regulated, but were up-regulated after re-watering (Fig 4, S1 and S2 Tables). These genes probably play important roles in tolerance to drought stress.

GO annotation of DEGs from the three comparison groups
GO classification was performed to investigate the functions of the DEGs in the three comparison groups. After comparing control with drought stress, 2628 DEGs (704 up-regulated genes  Drought stress response in Marsdenia tenacissima by unigene-based RNA-seq and 1924 down-regulated genes) were assigned to 67 main functional groups in the "biological processes", "cellular components", and "molecular functions" categories. When control was compared to re-watering, 1699 DEGs (559 up-regulated genes and 1140 down-regulated genes) could be functionally assigned to the relevant terms. The drought stress versus re-watering comparison functionally assigned 2161 DEGs (1296 up-regulated genes and 865 down-regulated genes) to the relevant terms. The top three significantly enriched GO functional annotation categories were "metabolic process", "cell", and "catalytic activity" (Fig 5).

Transcription factor responses to drought stress and water stimulus
Transcription factors are known to play vital roles in plant abiotic stress tolerance because they can regulate the expression of numerous downstream genes. A total number of 1039, 1016, and 980 TFs were identified in control, drought stress, and re-watering, respectively ( Table 2).
The number of TFs identified in the drought stress and re-watering library was slightly less than in control. In addition, 363, 267, and 299 TFs were identified as DEGs in control vs. drought stress, control vs. re-watering, and drought stress vs. re-watering, respectively. Further analysis revealed that the 363 DEGs from control vs. drought stress could be grouped into 42 families, and the top five families were C2H2 (35), bHLH (33), ERF (26), NAC (24), and MYB (21) (Fig 6A, S6 Table). Similarly, the 267 DEGs from control vs re-watering were grouped into 37 families, and most of the DEGs (103) belonged to the C2H2, ERF, bHLH, and bZIP families ( Fig 6B, S7 Table). Furthermore, in the comparison between drought stress and rewatering, 299 DEGs were involved in a total of 43 TF families. Among these TF families, ERF (37), bHLH (31), MYB (25), C2H2 (19), and MYB-related (18) were the top five families with the most genes ( Fig 6C, S8 Table).
A further analysis of the quantity relationship between up-and down-regulated TFs in the three comparison groups found that the number of up-regulated genes was significantly lower than the number of down-regulated genes in both control vs. drought stress and control vs. rewatering, but was the reverse in drought stress vs. re-watering ( Table 2). In the control vs. drought stress comparison, the number of down-regulated TFs was 3-fold more than the upregulated ones. The largest number of down-regulated TFs was found in the C2H2 family, while the ERF family contained the largest number of up-regulated TFs (Fig 6A). Similarly, in control vs. re-watering, the number of down-regulated TFs was 2-fold higher than the up-regulated ones. The ERF and bHLH families contained the largest number of down-regulated and up-regulated TFs, respectively ( Fig 6B). In drought stress vs. re-watering, the number of down-regulated TFs was lower than the number of up-regulated TFs.
The C2H2 family contained the largest number of down-regulated TFs, whereas the ERF family had the largest number of up-regulated TFs ( Fig 6C).

Identification of candidate genes for drought stress resistance
To identify the candidate genes for drought stress resistance in M. tenacissima, 1174 genes that were induced by drought stress and repressed by re-watering were screened. A blast analysis showed that 855 of the 1174 genes had a functional description (S9 Table). Further analysis found that 64 genes were homologous to the known functional genes that directly protects plants against drought stress, which include aquaporin, late embryogenesis abundant protein, chaperone, dehydration responsive protein, pleiotropic drug resistance protein, alcohol dehydrogenase, peroxidase, proline metabolism genes, trehalose synthesis-related genes, flavonoid synthesis-related genes, mannitol transporter, sugar transporter, peptide transporter, MATE efflux protein, and ABC transporter genes (Table 3). In addition, histone, histone deacetylase, and methyltransferase were also found, which suggested that epigenetic regulation was involved in the M. tenacissima drought stress resistance mechanism (Table 3).
To further identify the crucial regulatory genes, we investigated the protein kinases and the transcription factors in the 855 genes with functional descriptions. A total of 44 protein kinases were identified, which could be classified into 13 class-types. The top four classes were receptor-like protein kinase (11), L-type lectin-domain containing receptor kinase (6), LRR receptor-like serine/threonine-protein kinase (5), and leucine-rich repeat receptor-like protein kinase (5) ( Table 4). A total of 38 transcription factors were identified, which could be categorized into eight TF families. Among these TF families, ERF (10), WRKY (8), and NAC (5) were the top three families with the most genes ( Table 5).

Validation of the RNA-seq data by qRT-PCR
To verify the validity of the RNA-seq data, we analyzed 24 genes using qRT-PCR (S10 Table). The correlation coefficients of the gene expression trends after qRT-PCR, and the sequencing data from control vs. drought stress and control vs. re-watering were 0.6315 and 0.5735, respectively (Fig 7), which confirmed the validity of the RNA-seq data.

Discussion
In this study, we determined 5 physiological indexes of three groups. The results indicated that SOD, POD, proline, soluble sugar and MDA were increasing under drought stress. Then after re-watering all indexes were lower than drought stress, but they were significant higher than control. We performed a comparative analysis of the transcriptome changes of M. tenacissima undergoing drought stress and re-watering treatment. A total of 8672, 6043, 6537 DEGs were identified in drought stress vs control, re-watering vs control, and re-watering vs drought stress, respectively. Interestingly, there were 1174 up-regulated and 2344 down-regulated genes under drought stress presenting an opposite expression pattern when after re-watering, which were most possibly play important roles in tolerance to drought stress in M. tenacissima.
We found two aquaporin genes were up-regulated by drought stress and down-regulated after re-watering. This indicated that these aquaporin(s) may play a similar regulatory role in M. tenacissima under drought stress. The results have been confirmed in chickpea, foxtail millet, maize, and potato [37][38][39][40]. Heat shock responsive genes are functional genes that facilitate Drought stress response in Marsdenia tenacissima by unigene-based RNA-seq protein refolding and stabilize polypeptides and membranes. They have also been reported to respond to drought stress in barley, rice, and potato [32,[39][40][41]. In this study, we found that the expression of two heat shock responsive genes were induced by drought stress and repressed by re-watering treatment, which suggested that they directly participated in regulating drought-stress responses in M. tenacissima. Generally, drought stress induces the accumulation of LEA proteins and this accumulation enhances the survival rate of plants under drought conditions. The role of LEA proteins was to facilitate the correct folding of both structural and functional proteins and prevent lipid peroxidation [36,[42][43][44][45][46]. Previous research had showed that overexpression IbLEA14 in sweet potato transgenic calli enhanced tolerance to drought and salt stress [47].In this study, the expression of three LEA encoding genes was significantly up-regulated by drought stress and down-regulated by re-watering treatment.
In this study, we found 44 putative kinases, which were up-regulated by drought stress and down-regulated after re-watering, that include receptor-like kinases(RLKs), L-type lectindomain containing, serine/threonine-protein kinase, LRR receptor-like, MAPKK et al. RLKs play an important role in plant response to drought stress [48]. Overexpression of RLKs gene in transgenic rice that can enhance the tolerance to drought stress and salt [49].In our study, we found 11 RLKs genes up-regulated under drought stress. Mitogen-activated protein kinases (MAPK) pathways are known to be activated by numerous abiotic stresses such as cold, salt, heat, drought, ozone, or heavy metal intoxication [49]. Previous study had demonstrated that PtrMAPK acted as a positive regulator in dehydration/drought stress responses [49].
Trehalose has a protective role against various abiotic stresses, including drought stress in bacteria, fungi, and some plants. It helps maintain cellular membrane integrity and prevent protein degradation [50][51]. In plants, trehalose-6-phosphate synthase (TPS) and trehalose-6-phosphate phosphatase catalyze the biosynthesis of trehalose, and their expressions are induced by drought stress [36,[52][53][54]. Furthermore, overexpression AtTPS1 or OsTTPS1 improved the stress resistance of transgenic plants [52][53][54]. In this study, one TPP gene and two TPS encoding genes were up-regulated by drought stress and down-regulated after rewatering. Furthermore, one proline synthesis-related gene had a similar expression pattern to the TPP and TPS genes. These results indicate that synthesizing compatible solutes is a conserved drought resistance mechanism in different plants.
It's well-known that drought stress produces reactive oxygen species (ROS) and excessive ROS can cause the irreversible oxidization of lipids and proteins, which leads to membrane injury [55]. To overcome ROS injury, plants utilize ROS-scavenging enzymes, such as peroxidase (POD), superoxide dismutase (SOD), and catalase (CAT), to scavenge the excessive ROS [56]. Our physiological results showed the activity of SOD and POD were doubled under drought stress. We also found six POD-encoding genes were significantly up-regulated by drought stress and down-regulated by re-watering treatment in the transcriptome data, which indicated that ROS-scavenging via POD is an important mechanism in the overall resistance of M. tenacissima to drought stress.
Some studies have shown that alcohol dehydrogenase, MATE efflux family protein, and ABC transporters are involved in protecting plants against drought stress [57][58][59][60][61]. In this study, three alcohol dehydrogenase-, five MATE-, and two ABC-encoding genes were identified, and their expressions were strongly induced by drought stress, but significantly repressed by re-watering treatment, which suggested that these genes may also play important roles in drought stress resistance. Drought stress response in Marsdenia tenacissima by unigene-based RNA-seq We identified 38 TFs that were significantly up-regulated by drought stress and down-regulated after re-watering. The 38 TFs could be classified into the AP2/ERF, bHLH, BES1, ERF, MYB, MYB-related, NAC, WRKY, and Trihelix families. Among them, the ERF (10), WRKY (8), NAC (5), and AP2/ERF (4) families accounted for 70% of the genes, which acted as key regulators and play crucial roles in plant resistance to drought stress [12][13][14][15][16][17].AP2/EREBPs, WRKYs, and NACs were key regulators of ABA-mediated stomatal closure [62][63][64]. bHLHs play an important role in the JA-mediated regulatory network of the abiotic stress response [65].This indicated that ABA and JA play a central role in regulating drought stress tolerance in M. tenacissima.

Conclusion
The activity of SOD and POD were doubled under drought stress. In this study, we performed a comparative analysis of the transcriptome changes in M. tenacissima undergoing drought stress and re-watering treatment. A total of 8672, 6043, and 6537 DEGs, including 363, 267, and 299 TFs, were identified in the control vs. drought stress, control vs. re-watering, and drought stress vs. re-watering comparisons, respectively. The DEGs from these three comparative groups were classified into 67, 58, and 66 GO categories and were involved in 120, 114, and 115 KEGG pathways, respectively. Interestingly, 1174 up-regulated and 2344 down-regulated genes under drought stress had the opposite expression pattern after re-watering. Analysis of the 1174 up-regulated genes induced by drought stress and repressed by re-watering showed that many genes were homologous to known functional genes that directly protect plants against drought stress. Furthermore, 44 protein kinases and 38 TFs with opposite Drought stress response in Marsdenia tenacissima by unigene-based RNA-seq expression patterns under drought stress and re-watering were identified as crucial candidate regulators of drought stress resistance in M. tenacissima. In summary, our study is the first to characterize the M. tenacissima transcriptome in response to drought stress, and has identified the key candidate drought stress resistant genes in M. tenacissima. Our results will help unravel the mechanism controlling M. tenacissima drought stress resistance.  Drought stress response in Marsdenia tenacissima by unigene-based RNA-seq S9 Table. Genes that were up-regulated under drought stress and down-regulated after rewatering treatment.