Detecting early signs of heat and drought stress in Phoenix dactylifera (date palm)

Plants adapt to the environment by either long-term genome evolution or by acclimatization processes where the cellular processes and metabolism of the plant are adjusted within the existing potential in the genome. Here we studied the adaptation strategies in date palm, Phoenix dactylifera, under mild heat, drought and combined heat and drought by transcriptomic and metabolomic profiling. In transcriptomics data, combined heat and drought resembled heat response, whereas in metabolomics data it was more similar to drought. In both conditions, soluble carbohydrates, such as fucose, and glucose derivatives, were increased, suggesting a switch to carbohydrate metabolism and cell wall biogenesis. This result is consistent with the evidence from transcriptomics and cis-motif analysis. In addition, transcriptomics data showed transcriptional activation of genes related to reactive oxygen species in all three conditions (drought, heat, and combined heat and drought), suggesting increased activity of enzymatic antioxidant systems in cytosol, chloroplast and peroxisome. Finally, the genes that were differentially expressed in heat and combined heat and drought stresses were significantly enriched for circadian and diurnal rhythm motifs, suggesting new stress avoidance strategies.


Introduction
As a result of their sessile lifestyle, plants have developed multiple strategies for coping with environmental challenges, such as biotic (for example infection or herbivore attack) or abiotic (such as heat, drought, or high light) stresses. In nature, plants experience several stresses simultaneously, resulting in limited crop yields and growth. The impact of different stress types on plants has been under extensive research, but studies looking at their combinations are still a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 maintained under greenhouse conditions (15-25˚C, 60-70% relative humidity (rH)) two months prior to the start of the experiments. Plants were irrigated every second day towards the end of the light period. Experiments were performed in growth chambers (Heraeus, Vötsch, Germany) with 16/8 h photoperiod and 20/15˚C (70 ± 3% rH) or 35/15˚C (60 ± 8% rH at day and 70 ± 3% at night) temperature and rH under 200-300 mmol photons m -2 s -1 at leaf level.
Experiments were carried out in two batches. In the first set, plants were irrigated every second day ("well-watered", heat and control conditions). Plants were first acclimatized for two weeks in the chambers and then exposed to different growth temperatures (20˚C for control, and 35˚C for heat) for two weeks, followed by harvesting six hours after the onset of light. In the second batch, watering was stopped after the two-week acclimation period. Plants grown at 35˚C were harvested 4-5 days after termination of irrigation (combined heat and drought), and plants grown at 20˚C were collected after 7-8 days (drought conditions). The duration of water deprivation was 3 days longer for 20˚C-grown plants because of lower relative humidity in the 35˚C chamber. In experimental conditions involving drought the leaf water content (fresh weight/dry weight ratio) was monitored, showing that the treatments caused a significant decline in leaf water content, see [11] for more details. Four plants (biological replicates) from each experimental condition were harvested for total RNA and metabolite isolation. After collection, plant material was frozen in liquid N 2 and stored at -80˚C.

Metabolomics data
Extraction and derivatization of plant samples (same plant samples which were used for RNA extraction, with four replicates in each experiment) were performed as described in [10,12]. Briefly, leaf and root material of about 50 mg fresh weight was homogenized and extracted in 700mL 100% methanol at 70˚C for 15 min and centrifuged at 14,000 rpm for 5 min. The supernatants were transferred to new tubes and 1 mL double-distilled water/chloroform were added, tubes were vigorously shaken and centrifuged at 14,000 rpm for 5 min. Aliquots of 200mL of the chloroform phase were dried in a speed-vac (RVC 2-25, Christ, Osterode, Germany) and derivatized. For derivatization 50mL methyl-N-(trimethylsilyl) trifluoroacetamide (MSTFA; Sigma, Munich, Germany) with 20mL pyridine were added and samples were incubated at 37˚C for 30 min. Subsequently, sample reaction solutions were transferred to glass vials suitable for the Gerstel MultiPurpose Sampler (MPS2-XL, Gerstel, Mülheim, Germany). 1mL aliquots were injected into the system and run on a capillary column (HP-5MS, length: 30 m, diameter: 0.25 mm, film thickness: 0.25mm; Agilent Technologies, Palo Alto, CA, USA) at a helium flow of 1mLmin -1 .
Metabolite abundance was determined by GC-EIMSD (Agilent 7890A GC coupled to an Agilent 5975C MS; Agilent Technologies, Frankfurt, Germany) with the GC/MS settings described in [13]. For metabolite identification and quantification, the Golm metabolome database [14] and available authentic external standards of known concentration were used. Peak identification and deconvolution of chromatograms were performed using AMDIS 2.71 ("Automated Mass Spectral Deconvolution and Identification System" freely available from http://www.amdis.net) and the web-based platform "SpectConnect" (http://spectconnect.mit. edu/) [15].
Differential abundance of the metabolite data was analyzed using EdgeR (v 3.14.0) [16]. Data was normalized using the default Trimmed Mean of M-values (TMM). The glmLRT method was used to fit the statistical model. Benjamini-Hochberg false discovery rate correction (FDR) of p-values was used to adjust for multiple testing, and FDR 0.05 was used as a significance threshold.

RNA sequencing and RNAseq data processing
RNA was extracted from a single pinna leaf and cDNA preparation was carried out as described in [17]. NEBNext Ultra™ RNA Library Prep protocol (New England Biolabs, Ipswich, MA, USA), Illumina HiSeq 1000 System User Guide (Illumina, Inc., San Diego, CA, USA) and KAPA Library Quantification Kit-Illumina/ABI Prism User Guide (Kapa Biosystems, Inc., Woburn, MA, USA) were used in library preparation and RNA sequencing. To this end, 500 ng of total RNA were used for library preparation (NEBNext Poly(A) mRNA Magnetic Isolation Module). Later on, purified RNAs were reverse transcribed using random primers to construct forward and reverse cDNA strands. Amplified cDNAs were treated for endrepair process by addition of a single 'A' base, and ligation of the barcode-containing adapters (NEBNext Multiplex Oligos, New England Biolabs). Finally, treated cDNAs were purified for DNA library preparation, quantified by KAPA SYBR FAST ABI Prism Library Quantification Kit. To construct the cBot (TruSeq PE Cluster Kit v3) cluster, equimolar amounts were pooled. A 2x100 bp paired-end sequencing run was performed on a HiSeq 1000 instrument, using TruSeq SBS v3 Reagents. The output (.bcl files) were converted into.fastq files with CASAVA 1.8.2 software. Library preparation and sequencing were conducted at the Genomics Core Facility "KFB-Center of Excellence for Fluorescent Bioanalytics" at University of Regensburg, Germany.
The quality of the raw reads was checked with FastQC software (www.bioinformatics. babraham.ac.uk/projects/fastqc/). This was followed by removal of adapter sequences and trimming and cropping of the reads using Trimmomatic-0.33 [18] in paired-end mode. The bases with a Phred quality score less than 20 were trimmed from the ends of the reads, and the reads shorter than 30 bases were removed from the analysis (-phred33, TRAILING:20 and MINLEN:30).

Differential gene expression analysis
Filtered reads were mapped to the P. dactylifera gene models (with ribosomal genes removed) [19] using Kallisto V-0.43.0 (CMD:quant) [20] with 4000 bootstrap sets. The final count table for each biological replicate was obtained as the mean of the bootstrap runs. The count table was used as input to EdgeR (v 3.14.0) [16] to carry out differential gene expression analysis. Genes with no expression were removed and the filtered count table was normalized using the default Trimmed Mean of M-values (TMM). The glmLRT method was used to fit the statistical model. Benjamini-Hochberg false discovery rate correction of p-values was used to adjust for multiple testing, with FDR 0.05 as significance threshold.

Ortholog inference and orthologs associated with ROS responses
Protein sequences of Arabidopsis thaliana, Ananas comosus, Oryza sativa (japonicus), Zea mays and Sorghum bicolor were downloaded from Phytozome v.11 [21] and used for orthology analysis by running Orthofinder [22] with default parameters.
ROS signature gene sets were collected from [23,24] and annotated according to gene family and the localization of the proteins in subcellular compartments in Arabidopsis. These candidate gene sets were mapped to their putative orthologs (custom bash shell script) using the orthologous gene clusters from Orthofinder.

Gene annotation and Gene Ontology (GO) enrichment analysis
In order to annotate the genes in date palm, the predicted protein sequences [19] were mapped to Arabidopsis thaliana proteome using BLASTP with a default E-value threshold of 10 [25].
The Gene Ontology categories and the functional description of the best hit in Arabidopsis was chosen to represent the annotation of the respective P. dactylifera gene.
Lists of differentially expressed genes were selected by using the absolute value of log 2 fold change (FC) cutoffs 1 and 4, with FDR adjusted p-value threshold of <0.05.
Gene ontology enrichment analysis was carried out with GOATOOLS [26] using Fisher exact test. Multiple testing correction with Benjamini-Hochberg false discovery rate adjustment was applied for the p-values, using significance threshold FDR 0.05. A second GO enrichment analysis was carried out with threshold-free gene set enrichment analysis (GSA) using R package Piano (v 1.12.1) [27] to detect coordinated changes in gene expression (significance threshold for FDR adjusted p-values 0.05, minimum gene set size 2 with no upper bound, using maxmean as gene set statistic, and using 10,000 bootstrap sets) [27]. Enriched GOs with significantly increased mean transcript levels were grouped using the treemap package in R [28] with default parameters.

Motif enrichment
A total of 163 known motifs were collected from AGRIS [29] and PLACE [30] databases, and various other publications [31][32][33] for motif enrichment analysis. Motif enrichment was tested in the promoter sequences 1 kbp upstream of P. dactylifera genes. The motif search was conducted separately for the genes on positive and negative strands (custom R script). Motif counts were tabulated, and one-sided (alternative hypothesis: greater) Fisher exact test was applied to carry out enrichment analysis. The p-values were corrected using Bonferroni correction with 0.05 as significance threshold. In addition, GO enrichment analysis was carried out for genes with circadian and circadian-related motifs, light responsive, and sugar biosynthesis-related motifs.

Gene expression
Roughly 95% of all reads were retained after trimming. Information of raw and trimmed read counts per sample are collected in S1 Table. Principal component analysis (PCA) of transcriptome (Fig 1) illustrates that heat was associated with the principal component 1 (PC1), describing 26% of the total variance. In total, drought (D), heat (H) and combined heat and drought (HD) experiments resulted in 24,504 expressed genes (S5 Table).
Setting the threshold of significance to abs(log 2 FC) ! 1 and FDR 0.05 resulted in 68, 1240, and 3168 differentially expressed genes in drought, heat, and combined heat and drought stresses, respectively. A common factor for all treatments was the expression of genes encoding heat shock proteins (HSPs). Altogether 38 genes were differentially expressed in all three experiments (S5 Table). From this set, 19 genes were related to heat stress responses, such as heat shock proteins, chaperone proteins and heat stress transcription factors, illustrating both heat and drought induced heat-related transcriptional responses in date palm [4]. The two significantly expressed heat stress transcription factors (PDACT_KE332562.1_G003770 and PDACT_KE332562.1_G003780) were putative orthologs of Arabidopsis heat stress transcription factor A-2 (HsfA2), known to regulate the expression of a number of heat shock proteins (HSPs) in Arabidopsis and to induce the expression of L-ascorbate peroxidase 2 (APX2), Inositol-3-phosphate synthase isozyme 2 (IPS2) and galactinol synthases 1 and 2 (GolS1) [34]. Additionally, two negative regulators of cell death, orthologs of Bax inhibitor-1 family protein and Fuzzy Onions Like [35,36], were highly expressed in all three conditions suggesting suppression of cell death, as well as one positive regulator, putative ortholog of Arabidopsis BAG6 chaperone regulator [37]. In Arabidopsis, BAG6 is a positive regulator of heat shock factors and cell death. Possibly the two negative regulators are needed to prevent the cell death initiation.
Threshold-free gene set enrichment analysis [27] revealed ten enriched GOs common to heat, drought and combined heat and drought. Analysis of the enriched GO terms using Treemap [28] shows "cellular response to unfolded protein" and "induction of programmed cell death" categories to be enriched in drought and combined heat and drought (S1 and S3 Figs). A majority of the 38 genes which were differentially expressed in all conditions belonged to these two GO categories. In addition to these, pathways related to phytohormones (in H, D and HD), wax and secondary metabolites (in H and HD), fatty acid biosynthesis (in H and HD) and plant cell wall (in H, D and HD) were enriched.

Metabolomics
In total, concentrations of 91 amino acids, sugars, organic acids and ten unknown metabolites were quantified (S7 Table). In contrast to transcriptomics data, the PCA plot of metabolomics data showed drought as the main descriptor of variance (37% explained by PC1, Fig 2).
Overall, drought and combined heat and drought had more similar effect on metabolites. The abundance of arginine, glucuronic acid, DL-glutamine, ornithine, fucose, galactose, lactose and proline increased significantly in at least one of the conditions (Fig 3 and S7 Table). Altogether, this suggests increased activity in carbohydrate metabolism.

Reactive oxygen species scavenging and anti-oxidative system
We next analyzed the expression of the genes related to ROS scavenging or redox-related processes, as identified in A. thaliana [23,24]. The signature sets were first mapped to sets of putative orthologs in P. dactylifera using OrthoFinder, and then these sets were tested for enrichment among the differentially expressed genes using Fisher exact test (S10 Table). The putative orthologs were subdivided by gene family and their localization in plant cell, according to information from A. thaliana. The key components of the antioxidative system known to control ROS production in plants are superoxide dismutase [ ) were included. Genes encoding ROS scavenging-related proteins targeted to chloroplast and cytosol had increased expression, whereas the expression of genes encoding proteins targeted to mitochondria was decreased (Fig 4 and S8 Table).
Earlier analysis of the same experiment [10] showed that ROS homeostasis or cellular redox-state did not change in the experiment. The gene expression data however displayed changes at the transcriptional level for these processes. This suggests that ROS scavenging systems function effectively in different subcellular compartments, along with non-enzymatic ROS scavengers maintaining the ROS homeostasis. More detailed analysis on protein levels and enzyme activities will be required to analyze whether the increased transcript abundance for genes encoding antioxidative enzymes was translated into increased enzyme activities that would reflect increased ROS production by the treatments.

Activation of pathways associated with stress tolerance
In order to detect pathways contributing to stress tolerance, gene set analysis was carried out for the transcriptomics data (S2 Table), and the results were interpreted together with the metabolite data.
Drought. In total, 31 pathways (out of 70 significantly enriched pathways) were identified with mean log 2 FC>0 in drought stress (Fig 5). Among them were pathways related to protein folding and unfolding, heat acclimation, water deprivation, response to ABA, hydrogen peroxide and viruses (S1 Fig). Protein folding is vital for any organism to function and survive under stress (for example heat and drought stresses) [38]. HSPs and their chaperon activities are the main driving force behind protein folding, assembly, translocation and degradation upon normal or stressed environmental conditions [39]. "Cellular response to unfolded protein" was the major biological process enriched in response to drought (S1 Fig). It is well known that stresses (in this study H, D and HD) cause the protein folding to slow down in endoplasmic reticulum (ER), resulting in an abundance of folded, unfolded or misfolded proteins in ER [40]. The accumulation of folded, unfolded or misfolded proteins in ER may trigger a set of biological pathways such as "response to endoplasmic reticulum stress", which was significantly enriched together with protein folding and unfolding processes in all three treatments.
Heat. In heat stress, 88 out of 136 significantly enriched GO categories were expressed at higher levels than in control (Fig 6 and S2 Table). A majority of these categories were related to plant immune responses or fatty acid biogenesis (S2 Fig).
Plant cell wall synthesis is an important factor in plants for coping with different stresses. The dynamic nature of cell wall synthesis ensures the plant resistance to stresses, and supports the growth and development of the plant [41][42][43]. Four processes related to cell wall synthesis were enriched in all three conditions (H, D and HD), with mean expression increase by 2.1 fold (S2 Table). For more detailed analysis, we inspected 48 genes with a putative ortholog reported to take part in cell wall biogenesis [43] (S9 Table). Expression patterns of these genes show the active nature of cell wall biogenesis in response to heat, drought and combined heat and drought conditions. The abundance of glucuronic acid increased significantly in all conditions, by three fold in heat and drought conditions and 8.6 fold in combined heat and drought. Glucuronic acid is utilized by different pathways such as biosynthesis of arabinose, xylose, galacturonic acid, and apiose residues found in cell wall constituents such as pectin and hemicellulose [44], suggesting that cell wall-related processes were activated (S7 Table). In transcriptomic data, "polygalacturonase activity" (pectin biosynthesis) was enriched in P. dactylifera in heat stress with Columns show metabolites and rows show treatments. Each metabolite was manually annotated to be associated with biosynthesis of an organic compound, illustrated by an annotation matrix over the heatmap. Black cell shows the annotation assigned to each metabolite. Red: log 2 FC ! 1, blue: log 2 FC -1 with respect to control, gray: non-significant fold change (S7 Table). Abbreviations: H (heat experiment), D (drought experiment), and HD (combined heat and drought). mean fold change of 2.6. Altogether, these data sets together suggest active remodeling of the cell wall.
Combined heat and drought. In the combined heat and drought stress, 195 significantly enriched pathways were identified from transcriptomic data. Out of these, 57 GO categories were positively enriched with mean log 2 FC>0 (Fig 7). The associated pathways were protein folding and unfolding, plant immune responses, heat acclimation, response to phytohormones, hydrogen peroxide, cell wall biogenesis, programmed cell death and secondary metabolism (S3 Fig). Altogether 99 GO categories were unique to combined heat and drought (S2 Table).
The expression of "wax biosynthetic process" was increased by two fold in combined heat and drought. P. dactylifera has 30 genes annotated to be involved in wax biosynthesis (S2 Table). Intra or epicuticular wax production is an effective physical barrier against abiotic and even biotic stresses. In case of abiotic stresses, wax is known to have beneficial effects by preventing water loss (drought stress) and reflecting the excess radiation (heat stress). This protective shield creates an isolation coat over the leaf surface in harsh arid climates [45].
In metabolomics data, galactose content was increased eight fold in combined heat and drought. Soluble carbohydrates such as galactose are known to accumulate during drought stress [46]. UDP glucose 4-epimerase is an important enzyme of galactose metabolic pathway which facilitates the reversible conversion between UDP galactose and UDP glucose. In A. thaliana, overexpression of rice UDP glucose 4-epimerase in a transgenic line resulted in tolerance to salt, drought, and freezing stress [47]. UDP glucose 4-epimerase regulates the monosaccharide pool available for pectin production [48], but also plays an important role in galactinol, stachyose, and raffinose biosynthesis. There are 14 paralogous UDP glucose Differentially expressed genes encoding ROS and redox-related enzymes in heat, drought, or combined heat and drought treatments in P. dactylifera. Genes encoding proteins for ROS scavenging and redox-related processes were assigned to a subcellular compartment according to the protein localization in Arabidopsis, and the expression profile of the genes was used to construct the heatmaps. Genes are in rows, the number following the gene name is arbitrary and used to simplify the identification. The experiments (D: drought, H: heat and HD: combined heat and drought) are in columns. Heatmap shows log 2 FC of the expressed genes. Red: log 2 FC ! 1, blue: log 2 FC -1, gray: -1 < log 2 FC < 1. SOD: superoxide dismutase, AOX: alternative oxidase, APX: ascorbate peroxidase, GPX: glutathione peroxidase, CAT: catalase, GRX: glutaredoxin, Trx: thioredoxins, PrxR: peroxiredoxin and DHAR: dehydroascorbate reductase (S8 Table).

Motif enrichment
Motif enrichment analysis revealed a higher amount of enriched motifs in combined heat and drought and heat conditions, and to a lesser degree in drought (Fig 8, S3 Table). Interestingly, all cis-acting elements involved in sugar regulation [33] were significantly enriched in the heat and combined heat and drought conditions, suggesting that sugar signaling has an eminent role in the date palm gene regulation related to heat stress.
A significant enrichment of circadian motif and morning element-containing genes was detected in heat and combined heat and drought. Association between the stress-induced genes and circadian rhythm and diurnal cycle could be evidence of a stress avoidance strategy in date palm; stress reactions could be triggered in the morning to improve the palm tolerance to the harsh conditions during daytime. Further evidence of this is that from the 57 HSP genes with significantly increased expression in the experimental conditions (S6 Table), 27 genes contained the circadian clock associated 1 (CCA1, seq: AAAAATCT) promoter motif (FDR 0.05). In addition, two out of these 27 HSPs contained also a morning element (CCA CAC), 10 genes had an evening element (TGGATA), and one gene had both morning and evening elements (S6 Table). The CCA1-motif containing genes are involved in adjustment to  Table). environmental conditions, such as light and temperature, and they regulate specific biological processes synchronized to a particular time of the day. We found no CCA1 nor evening element motifs in A. thaliana HSP promoters, but 14 HSP orthologues contained morning element motifs. A complete set of enriched GOs for the genes containing circadian and light responsive cis-motifs is in S4 Table. Proline is involved in several stress responses as an osmolyte [49], a signaling molecule [50], a metal chelator [51] through chaperone activities, and in direct ROS scavenging [52]. Proline was significantly increased by 9.5 fold in the drought experiment; in combined heat and drought the increase was 4.3, but marginally below the level of statistical significance (FDR = 0.0576, S7 Table). The PRE motif (ACTCAT, Proline and hypoosmolarity responsive) was enriched among differentially expressed genes in the heat experiment (Fig 8, S3 Table), and in transcriptomics, GSA analysis showed enrichment of "proline transport" with a two fold increase in the heat experiment (S2 Table). Interestingly, the proline levels were not altered in the heat stress, which could suggest that proline act as a signaling molecule in heat stress.
To dissect the processes regulated by diurnal cycle in the different stress conditions, further enrichment analysis of these genes was carried out. Overall, GO enrichment analysis of all genes containing circadian and light responsive, sugar, drought and ABA cis-regulatory elements showed seven common enriched GOs: vacuole, plasma membrane, response to heat, water deprivation, high light, endoplasmic reticulum stress and hydrogen peroxide. Out of these, 5 GOs (response to heat, water deprivation, high light, endoplasmic reticulum stress and hydrogen peroxide) were significantly enriched in the set of differentially expressed genes in the D, H, or HD experiments.  Table).
Furthermore, ABA regulatory elements were enriched in both heat and combined heat and drought, but not in drought. ABA is generally induced under osmotic stresses and is known to be a positive regulator of LEA class proteins, heat shock proteins and protein phosphatase 2C (PP2Cs) [53]. No enrichment in drought could result from the fact that the conditions were not severe enough for the date palm to elicit a drought response. We further detected the WRKY W-box element to be significantly enriched in the combined heat and drought and heat. WRKY transcription factor family is a key regulator in many processes such as response to biotic and abiotic stresses, and senescence [54]. The drought responsive element was not enriched in any condition.

Conclusion
We have studied the stress adaptation strategies initiated in date palm, Phoenix dactylifera, under mild heat, drought, and combined heat and drought conditions by transcriptomic and metabolomic profiling. Overall the transcriptional profile showed that combined heat and drought and heat had more similar response, whereas drought resembled control. Other monocot species have shown a similar balance tipped towards heat (T. aestivum and S. bicolor) [2]. In the metabolomics data the balance was opposite, with the combined heat and drought resembling the drought condition. This could be due to a more immediate response to water deficiency at the metabolomic level, which leads the plant to switch more towards carbohydrate metabolism. Based on the transcriptomic data and the overall levels of hydrogen peroxide, the date palm did not appear to be very stressed, since ROS was effectively compartmentalized and scavenged. Therefore it could be that many of the observed responses at the transcriptional level do not necessarily lead into translation of the proteins.
Overall, it appears that date palm reacts to drought and heat conditions in a similar manner to other plants. The hallmarks of heat stress were visible in the transcriptomics data, such as protein misfolding, response to hydrogen peroxide and cell wall modification, as well as ABA signaling in the case of drought. Since the plants were exposed to the stress for several days before harvesting, the early signs of heat stress such as calcium and NO signaling were not detected anymore. Interestingly, two orthologs of heat shock factor HSFA2 showed increased expression in all three conditions. They are known to be the most highly heat-induced heat shock factor in Arabidopsis, and also to play an important role in recovery from heat shock [55] and extension of thermotolerance in Arabidopsis [56].
Finally, we detected a significant enrichment of circadian rhythm motifs in the differentially expressed genes in heat and combined heat and drought stresses, suggesting new stress avoidance strategies. For example, some heat shock factors contain motifs for circadian regulation in the promoter regions, making it possible to synchronize HSP production to the time of the day when they are needed. A similar connection has been reported in Arabidopsis in the case of negative regulators of freezing tolerance, COR27 and COR28 [54].
Supporting information S1  Table. Genes putatively associated with cell wall biogenesis and their differential expression.