Differential Gene Expression Reveals Candidate Genes for Drought Stress Response in Abies alba (Pinaceae)

Increasing drought periods as a result of global climate change pose a threat to many tree species by possibly outpacing their adaptive capabilities. Revealing the genetic basis of drought stress response is therefore implemental for future conservation strategies and risk assessment. Access to informative genomic regions is however challenging, especially for conifers, partially due to their large genomes, which puts constraints on the feasibility of whole genome scans. Candidate genes offer a valuable tool to reduce the complexity of the analysis and the amount of sequencing work and costs. For this study we combined an improved drought stress phenotyping of needles via a novel terahertz water monitoring technique with Massive Analysis of cDNA Ends to identify candidate genes for drought stress response in European silver fir (Abies alba Mill.). A pooled cDNA library was constructed from the cotyledons of six drought stressed and six well-watered silver fir seedlings, respectively. Differential expression analyses of these libraries revealed 296 candidate genes for drought stress response in silver fir (247 up- and 49 down-regulated) of which a subset was validated by RT-qPCR of the twelve individual cotyledons. A majority of these genes code for currently uncharacterized proteins and hint on new genomic resources to be explored in conifers. Furthermore, we could show that some traditional reference genes from model plant species (GAPDH and eIF4A2) are not suitable for differential analysis and we propose a new reference gene, TPC1, for drought stress expression profiling in needles of conifer seedlings.


Introduction
Smaller amounts of precipitation and an increase in the occurrence of drought events are being predicted for the Mediterranean region and parts of Central Europe, especially during summer periods [1]. Drought stress poses a major threat to trees by possibly causing hydraulic failure. Facing low water availability, trees react with stomatal closure and reduced photosynthesis while maintaining metabolism. Thus, severe drought periods can ultimately cause forest diebacks by xylem cavitation, carbon starvation and damage by pathogens and insects [2]. While single trees might reveal sufficient plasticity, tree populations can cope with changing climatic conditions either by migration and/or adaptation [3]. Migration via seed dispersal might be increasingly constrained by fragmented landscapes and is generally limited by natural barriers, especially for mountainous species [4]. Adaptation only works via selection on standing genetic variation or newly arisen mutations [5]. Standing variation offers the best chance for rapid adaptation since potentially beneficial alleles might already be numerously present in the population [5].
To assess the adaptive potential of tree populations it is therefore necessary to identify genes that are, among other stress-related traits, involved in the drought stress response and to study their variation in natural populations. Many studies on tree populations today use a candidate gene approach for the following reasons: While genomic resources are readily available for model species such as Arabidopsis, rice or maize, for most tree species, and especially conifers, such resources are scarce [6]. Furthermore, the identification of drought stress related genes is a big challenge when working with conifers, due to their large genome sizes [7] which make genome-wide association studies very resource-intensive [8]. Candidate genes are thus used as an alternative for detecting selective signals [9,10]. This approach traditionally involves F st outlier analysis of single-nucleotide polymorphisms (SNPs) within those candidate genes ("bottomup" approach) [11]. Alternatively, candidate genes allow for a reasonable selection of target genes for association studies ("top-down" approach), especially when handling large genomes. Moreover, drought stress is hard to assess in conifers and the response is a highly quantitative trait.
However, using a novel terahertz spectroscopy setup, it is now possible to continuously measure the water content of multiple plants and thereby precisely monitor the drought stress response [12]. This allows sampling leaves from multiple plants with identical drought stress response and analyzing them with next generation sequencing methods. Thus it is possible to identify those specific genes that are underlying an accurately assessed drought stress phenotype. This approach provides a unique opportunity for detecting and exploring novel candidate genes in non-model species which may not be found annotated from traditional model species. We chose European silver fir (Abies alba Mill.) as the model species for our study. A. alba is an ecologically and economically valuable coniferous tree, which has its main area of distribution in mountainous regions of Central and Southern Europe [13]. Effects of drought were shown to manifest in a reduced growth rate [14][15][16], reduced photosynthetic activity and stomatal conductance [17,18], crown-damage [19,20] and an increasing susceptibility to damage caused by pathogens or insects [21,22]. A dieback as a response to frequent and severe water shortages can already be observed, e.g. at Mont Ventoux in Southern France [23].
The major goals of our study were (I) the identification of candidate genes for drought stress response in A. alba, (II) the comparison of drought stress related genes between A. alba and model organisms to identify conifer-specific genes, (III) the validation of the expression profiles by reverse-transcription quantitative real-time PCR (RT-qPCR) and (IV) the identification of reference genes for RT-qPCR data normalization.

Plant material and drought stress monitoring
Silver fir seedlings were propagated from seeds of female cones of a single tree in a forest stand near Hagenbach, a Black Forest region of South-Western Germany (the seeds were provided with permission by Hans Lehman from the forestry office Oberharmersbach). Thus, all seedlings used in the experiment were either half-siblings or full-siblings. To establish groups of plants with highly controlled levels of drought stress, a novel terahertz time-domain spectroscopy setup was used in a preliminary study conducted by Born et al. [12]. This allowed the manipulation and monitoring of the individual water status of multiple seedlings by continuously measuring the cotyledons, without inducing other forms of stress. Twelve seedlings were measured this way. While six of them were well-watered, the other six seedlings were not watered until they reached comparable levels of considerable drought stress (for a more detailed account of the drought stress monitoring and its results see Born et al. [12]). At this point, two cotyledons were cut off from each seedling for RNA extraction and immediately stored in liquid nitrogen. Cotyledons were also harvested from the control group of well-watered seedlings at corresponding times of the day.

RNA extraction
For sequencing, total RNA from every individual needle was extracted using the InviTrap Spin Plant RNA Mini Kit (STRATEC Molecular GmbH, Berlin, Germany). The cotyledons were ground in liquid nitrogen with mortar and pestle in lysis buffer RP and β-Mercaptoethanol. Half of each lysate was used for RNA extraction by GenXPro GmbH (Frankfurt am Main, Germany) while the rest was stored at -80°C for RT-qPCR validation. To remove genomic DNA contaminants the samples were treated "off-column" with Baseline-ZeroTM DNase (Epicentre/Biozym, Hessisch Oldendorf Germany) and subsequently purified using RNA Clean & ConcentratorTM-5 Kit (Zymo Research Europe, Freiburg Germany). RNA samples for RT-qPCR validation were immediately stored in a deep freezer at -80°C. RNA concentration and purity were measured via ratios of optical density (OD 260/280 , OD 260/230 ) using NanoDrop 1000 spectrophotometer (PEQLAB Biotechnologie GmbH, Erlangen Germany). The absence of DNA contamination was confirmed after performing a PCR using a primer pair which targets the nuclear microsatellite marker NFH15 (GenBank Accession Number: AY966492, [24]) at an annealing temperature of 57°C. Integrity was assessed using gel-electrophoresis. Complementary DNA (cDNA) was synthesized using the Maxima First Strand cDNA Synthesis Kit for RT-qPCR (ThermoScientific, Schwerte Germany). The cDNA samples were immediately stored in aliquots at -80°C. All kits were applied according to the manufacturer's protocol. Any modifications are explicitly described.

Transcriptome sequencing
Prior to synthesizing cDNA, the extracted mRNA from the drought stressed and the well-watered seedlings was pooled, respectively. From each pool a cDNA library was constructed targeting sequences near the cDNA 3'-ends. This Massive Analysis of cDNA Ends (MACE) was conducted by GenXPro GmbH as described in Kahl et al. [25]. The 5'-ends of 50-500 bp long fragments were sequenced (single-read) using the Illumina HiSeq 2000 platform (Illumina Inc., San Diego, CA, USA), generating 100 bp long tags. Illumina's HiSeq Control Software v. 2.0.5 was used for sequencing, RTA v. 1.17.20.0 for real time analysis and CASAVA v. 1.8.2 (Consensus Assessment of Sequence and Variation) for base calling and demultiplexing. To prevent PCR-biased quantification, GenXPro's "TrueQuant" method was applied, thereby eliminating PCR-based copies from the dataset. For this purpose, unique oligonucleotides were ligated to each tag prior to PCR, making it possible to identify and eliminate PCR copies with identical barcode-tag-combinations [26,27].

Assembly, annotation and gene expression profiling
After sequencing, the tags were assembled and annotated using the TIGR Plant Transcript Assemblies database (http://plantta.jcvi.org/). Not annotated tags were assembled and subsequently blasted (BLASTx) against the Swiss-Prot and TrEMBLE databases (http://www.uniprot.org/). A differential expression analysis was conducted using the MA-plot based method with random sampling model (MARS) of the DEGseq R package [28]. Prior to this analysis the libraries were normalized according to their respective size by dividing each tag frequency through the sum of the total tags and multiplied by 10 6 (tags per million). For multiple testing corrections a p-valuethreshold of 1e-10 for significantly differentially expressed (DE) transcripts was set. Following, the enrichment of each gene ontology (GO) term was tested using Fisher's exact test (two-tailed) [29]. Additionally, we analyzed the MACE results using the R packages DESeq [30] with a single estimated dispersion condition, a size factor normalization and an FDR (false discovery rate) threshold of q < 0.1 as well as NOISeq [31] with simulated technical replicates (NOISeq-sim), a trimmed mean of M-values normalization and a threshold of q = 0.9.

RT-qPCR validation
The gene expression of a small number of genes was assessed in each individual seedling by RT-qPCR using relative quantification according to the MIQE criteria (Minimum Information for the Publication of Quantitative Real-Time PCR Experiments) [32].
The MACE dataset was first searched for DE transcripts with log 2 fold changes higher than 3 (for up-regulated transcripts) or lower than -3 (for down-regulated transcripts) since they were most likely responsive to dehydration. To minimize the rate of false positives introduced by rare transcripts, a threshold of at least 50 different tags with match in sense orientation (5'-3') to a database entry was set. Genes for validation were selected from this filtered subset of DE transcripts that were significantly assigned (enrichment-p-value < 1e-10) to the GO terms response to water stimulus (GO:0009415), response to water deprivation (GO:0009414) and response to osmotic stress (GO:0006970). Furthermore, genes were selected from the subset of filtered DE transcripts with the ten highest and ten lowest fold changes that were significantly assigned to the GO domain biological process (GO:0008150). Primer pairs (Metabion, Martinsried, Germany) were designed based on the assembled MACE tag sequences for each selected gene using Primer3 v. 4.0.0 (http://bioinfo.ut.ee/primer3/) with default parameters for a product size of 60 bp to 150 bp and an optimum annealing temperature of 60°C. Primer pairs were considered specific when (1) there was no amplicon present in genomic DNA samples, (2) the first derivative of the corresponding melting curves resulted in a single peak, (3) gelelectrophoresis showed one product with the expected size and (4) the amplicon-sequence was identical with the target sequence which was verified by re-sequencing of the PCR-products using the Macrogen Europe Laboratory sequencing service (Amsterdam, The Netherlands).
To select adequate reference genes for the normalization of the RT-qPCR data two different approaches were used. First, by searching the literature for conifer gene expression studies, traditionally used reference genes were identified. Second, the MACE-dataset was searched for sequence tags which were neither up-nor down-regulated (p-value > 0.99, log 2 fold change: -0.005 to 0.005), and had a minimum amount of ten sequence tags. The potential reference genes were tested for their expression stability among the drought stressed and well-watered seedlings using geNorm [33] and Normfinder [34]. Both algorithms were implemented in GenEx v. 5.4.4.119 (MultiD Analyses AB, Göteborg Sweden) which also provided the expected accumulated standard deviation to assess the optimal number of reference genes to be included for the most precise data normalization [35]. Real-time PCR was performed on the Roche LightCycler 480 II System (Roche Diagnostics, Mannheim, Germany) using the sample maximization method with samples in triplicates at an optimized and standardized temperature and cycle program (Table S1 in S1 File) in which only the annealing temperatures were varied according to the optimum of the primer pairs (Table S2 in S1 File). Each PCR reaction was performed with KAPA SYBR Fast Universal Master Mix (Peqlab, Erlangen, Germany).
After the RT-qPCR the quantification cycles (Cq) were determined using the second derivative maximum method implemented in the Roche LightCycler 480 Instrument Software v. 1.5.0. The gene expression ratio was calculated using the Pair Wise Fixed Reallocation Randomization Test implemented in the Relative Expression Software Tool-384 v. 1 (REST) using 5000 iterations [36]. The ratio was corrected for the amplification efficiencies which were calculated according to Liu & Saint [37]. The intra-and inter-assay variations were assessed by calculating the coefficient of variance as the standard deviation relative to the mean of the Cq-values. Therefore, thirty replicates of the same cDNA sample were used to amplify GAPDH in three separate qPCR runs (each with ten of the replicates) on three different days. The coefficient of variance was not supposed to exceed four percent on the Cq basis [38].

MACE libraries
After sequencing, the MACE method yielded two libraries containing, in total, 15.4 million tags with 6.2 million tags for the drought stressed pool and 9.2 million tags for the well-watered pool ( Table 1).
Annotation of the tags resulted in a total of 65,535 transcripts, which were assigned to the three main gene ontology (GO) domains: molecular function (GO:0003674) contained 38,745 transcripts, cellular component (GO:0005575) 39,776 and biological process (GO:0008150) 37,140. Since this analysis aimed to find candidate genes associated with drought stress response, transcripts assigned to the GO domain biological process were most interesting. Within this domain, GO terms associated with metabolic processes were most enriched. In response to drought stress these GO terms were mostly down-regulated, as was methylation (GO:0032259) and photosynthesis (GO:0015979) (Fig 1 and Table S3 & S4 in S1 File). In contrast, GO terms associated with stimuli and stress were generally up-regulated, especially terms most obviously linked to drought stress, namely response to water stimulus, response to water deprivation and response to osmotic stress.

Differential gene expression analyses
The DEGseq analysis resulted in a total of 3,407 significantly DE transcripts (p < 1e-10) between the drought stressed and the well-watered pool (Fig 2A). The NOISeq (q = 0.9, Fig 2B) and DESeq (q < 0.1, Fig 2C) analyses yielded 2,694 and 342 DE transcripts, respectively. DEGseq uniquely identified 1,726 transcript and NOISeq 1009 transcripts, while DESeq shared all identified transcripts with either NOISeq or both DEGseq and NOISeq (Fig 3). Shown is the amount of total tags analyzed for the library, the amount of unique tags and the amount of tags for each treatment pool (drought stressed and well-watered).
S: sense direction; AS: antisense direction.

RT-qPCR validation and candidate gene selection
After filtering by fold change and the minimum number of different sense tags ( 50), out of the 3,407 DE transcripts identified by DEGseq, 832 transcripts could be listed ( Table 2, FASTA files of all 832 transcripts available in S2 File). From the subset of filtered DE transcripts with a significant assignment to the GO terms response to water stimulus, response to water deprivation and response to osmotic stress and the top ten up-and down-regulated transcripts significantly assigned to biological process, 29 different up-regulated (Table S5, S6 & S8 in S1 File) and 14 different down-regulated (Table S7 & S8 in S1 File) genes for validation could be  (Table 3). All these selected genes were represented in the NOISeq results, while DESeq did not identify eight up-regulated and seven down-regulated genes as DE transcripts. The literature search revealed twelve reference genes from published conifer studies (Table S9 in S1 File). Profiling the MACE dataset revealed three potential reference genes: the   global transcription factor group E8 (GTE8), the transcription initiation factor 2A (TIF2A) and the two pore calcium channel protein 1 (TPC1). All three exhibited medium abundance levels and were therefore optimal candidates as reference genes. Out of the 15 potential reference genes, four (GAPDH, TPC1, 18S rRNA and elF4A2) specifically amplified their target in the RT-qPCR. The expression stability of these genes was tested using Normfinder [34] and geNorm [33]. Both algorithms identified TPC1 and 18S rRNA as the most stable reference genes (Table S10 in S1 File). Furthermore, their combination exhibited the lowest accumulated standard deviation. Thus, TPC1 and 18S rRNA were both applied for normalization in REST. Out of the 50 tested primer pairs for validation (Table S2 in S1 File), four could meet our conservative criteria and were specifically amplifying their targets during the qPCR: ERD10, BAM1(b), XTH7 and PUP1(a) (Table S11 in S1 File). The calculated gene expression ratios were similar to those obtained by the MACE method ( Table 4). The assay precision was assessed by calculating the intra-assay variation (repeatability) and inter-assay variation (reproducibility) based on Cq-values. The intra-assay coefficient of variance ranged between 0.8% and 1.08% while the inter-assay coefficient of variance was 0.92%.
From the 338 transcripts identified unanimously by all Seq-analyses (Fig 1), 296 remained after filtering by minimum different sense tags ( Table 5, FASTA files of all 296 transcripts available in S3 File). Almost half of these transcripts (~45%) were unknown and many of the transcripts with a database hit (~38% of the remaining~55%) were not yet properly assigned.

Discussion
Analyzing the adaptive potential of plant species to drought stress is of crucial importance in the context of rapid climate change. For species with large genomes, such as conifers, where genomic resources are scarce, it is often necessary to reduce the pool of target genes to an affordable size. Combining a new water monitoring setup with the MACE technique we were able to link a standardized phenotypic response to specific genes and thereby identify novel candidate genes for drought stress response in A. alba. For our approach, we pooled RNA from individual seedlings for each treatment group and subsequently analyzed both pools via transcriptome sequencing. To exclude bias by individual expression patterns we applied RT-qPCR to validate the differential expression for a subset of genes known to be involved in the drought stress response of model plant species. Among these genes, only UGT74E2 displayed a gene-regulation pattern that clearly differed from expectation according to the respective literature. In Arabidopsis, an ectopic over-expression of UGT74E2 increased the tolerance to salinity and drought stress and reduced the plants' water loss [39]. Therefore, one would assume an up-regulation in response to drought stress. However, in silver fir the opposite was the case. Hence, UGT74E2 might play a different physiological role in the phylogenetically distant silver fir, compared to Arabidopsis. Further research regarding the function of UGT74E2 in other conifer taxa are necessary to offer an explanation for the different expression in response to drought stress.
The up-regulation of PUP1(a), BAM1(b) and ERD10 based on the MACE technique and subsequent DEGseq analysis could be verified by the RT-qPCR. Individual differences among the seedlings were expected but proved to be low for BAM1(b) and ERD10 and moderate in the case of PUP1(a). Though varying to a larger extent, the overall down-regulation of XTH7 could be affirmed. However, DEGseq predicted a much higher gene expression ratio than observed with RT-qPCR. Individual differences were more pronounced for XTH7 and may partially be attributed to PCR inhibition and/or stochastic cDNA template variation.
In order to select adequate reference genes for the RT-qPCR, we tested traditional reference genes which were previously used in other conifer gene expression studies. Depending on the treatment, ontogenetic stage or the tissue under investigation some of them showed expression stability [40][41][42], even though in other studies [33,40,42,43] these genes showed significant variability in expression patterns. Here we tested GAPDH, elF4A2 and 18S rRNA, with only 18S rRNA showing expression stability across all individual seedlings. Since the expression of GAPDH and eIF4A2 varied among the drought stressed and well-watered seedlings, these genes were not suitable as internal controls. However, 18S rRNA should not be used as the only reference gene for normalization due to the possible mismatch of rRNA and mRNA abundance [44]. Fortunately, the selection of reference genes based on the MACE results proved to be successful. We could verify the expression stability of TPC1 for all individual seedlings using RT-qPCR. In Arabidopsis and rice TPC1 is known to be ubiquitously expressed across several tissues [45,46]. However, Wang et al. [47] identified a TPC1 homologous gene which was induced in Triticum aestivum as a response to high salinity, polyethylene glycol (PEG), low temperature and abscisic acid treatment. They suggested an important role of TPC1 in the stomatal closure and abiotic stress response of T. aestivum. On the one hand this implies the necessity for further investigation on the expression stability of TPC1 as a potential reference gene when investigating drought stress response. On the other hand the TPC1 homologue might be involved in the osmotic rather than the drought stress response. Since the expression stability of TPC1 was only tested in cotyledons, future studies need to address other tissues such as roots, as well as different age stages of needles.
Our study design lacked biological replicates, which is a serious limitation but was dampened by pooling the samples for the two libraries. Pooling samples for RNA-Seq analysis has proven to be a reliable method for estimating gene expression, especially for genes exhibiting high expression levels [48]. Furthermore, the THz measurements were highly precise which ensured that the pools had very homogenous stress levels [12]. It is also notable that the THz approach enabled us to measure a stress response solely induced by water deprivation. Other approaches, such as PEG treatment, might largely lead to the differential expression of genes involved in osmotic stress response rather than in the response to water shortage. Since the individuals in our study were pooled in two treatment groups we could not estimate the biological variation within those groups. As stated in other studies facing the same problem [49][50][51], the results must be taken cautiously and the candidates should be further examined. Nonetheless, in order to analyze our data, we chose a conservative approach. Therefore, apart from DEGseq, we additionally employed DEseq and NOISeq. All three analyses showed different results as was expected according to comparative studies of the used methods [52,53]. DESeq is generally more conservative, while DEGseq and NOISeq are more aggressive but prone to false positives. However, DESeq did not identify three of the four genes verified by RT-qPCR as DE transcripts (Fig 2). Since the genes for validation were selected by very strict criteria and the RT-qPCR was conducted on the individual seedlings and not on pooled plant material, we conclude that the DEGseq and NOISeq results likely include a relatively high amount of false positives, while the 43 genes for validation (Table 3) should be correctly defined as DE transcripts. Since these genes were representatives of the group of 832 filtered DE transcripts, we define the whole set as potential candidate genes for drought stress response for further studies. However, the most conservative selection would only include the 296 filtered consensus transcripts. Some of these genes or close variants were previously identified or used as candidates in other studies regarding drought stress response in conifers. For example, xyloglucan endotransglycosylase/hydrolase (AoXET1) was down-regulated in needles and stems of Pinus pinaster seedlings in response to drought stress [54]. A glutathione S-transferase and chitinases (cht1 and cht2) were up-regulated in response to drought-and pathogen-related-stress in roots and shoots of six-week-old seedlings of Picea abies [55]. Velasco-Conde et al. [56] measured the expression pattern of several dehydrins (dhn1, dhn2, dhn3, dhn7, dhn9 and a dhn-like protein) in needles of 3-year-old cuttings of drought-sensitive and drought-resistant genotypes of P. pinaster. Only dhn3 and dhn4 showed an involvement in drought resistance between genotypes, while dhn2 was consistently down-regulated in response to drought stress. Our results suggest that dhn2 might play a different role in drought stress response in A. alba, since it was significantly up-regulated. Dehydrins (dhn1 and dhn2), aquaporin (aquaMIP) and early responsive to dehydration 3 (erd3) were used as candidate genes for drought stress response in megagametophytes of Pinus taeda [57]. Similarly, a putative glucan-endo-1,3-beta-glucosidase precursor, dehydrins (dhn1 and dhn2) and erd3 were used as candidates for outlier analyses in megagametophytes of P. pinaster [58]. Chitinase 4 and a putative LEA protein were identified as drought stress responsive in needles, stems and roots of P. pinaster [59] and a glutathione Stransferase in needles of adult Pinus halepensis trees [50].
To our knowledge, no studies exist, which aimed to identify adaptive genes for drought stress response in any member of the genus Abies. However, such an analysis would surely benefit from genus-specific candidate genes. Here we present a selection of 296 genes that contains previously identified candidates but predominantly adds to the possible selection of candidates for future studies. Many of these genes are linked by their biological function to a specific response to water deprivation. For example, 3-ketoacyl-CoA synthase 11 belongs to the group of "very long chain fatty acids", which are required for wax synthesis [60]. The leaf cuticle is protected by the wax layer against non-stomatal water loss, which could explain its up-regulation. Also involved in water management are aquaporins, which are expressed very variably in response to drought stress [61]. Tonoplast intrinsic proteins (TIP1s) are usually found in the lytic vacuole membrane [62]. Hence, TIP1-1 is probably up-regulated during drought periods to allow better access to the water stored in the vacuole. Chloroplastic beta-amylase 1 is involved in the breakdown of leaf starch [63]. During daytime, plants store glucose as starch in chloroplasts and access this energy during nighttimes via starch breakdown. Up-regulation of BAM1 is most likely a response to the down-regulation of metabolic processes and especially photosynthesis during drought periods (Fig 1). Analogous to "regular" nighttimes, metabolism with reduced photosynthesis can only be maintained by breaking down the energy storage, e.g. starch. Accordingly, sugar transport proteins, which transport hexoses through cellular membranes, are necessary for metabolism but may also play a role in distributing osmolytes throughout the plant. The down-regulation of XTH6 and XTH7 indicates limited cell growth during drought periods, since xyloglucan endotransglucosylase/hydrolases are involved in cell enlargement and restructuring [64]. Protein kinases add phosphate groups to a substrate, while protein phosphatases remove them, thus either activating or deactivating enzymes [65]. Both protein kinases and phosphatases are key factors in signal transduction as response to drought stress, by regulating enzyme activity. E3 ubiquitin-protein ligase RHA2A functions in an ABAmediated signaling pathway during early seedling development, positively regulating the plants response to osmotic stress [66]. The fact that RHA2A is part of the subset of both up-and down-regulated candidate genes highlights the necessity for a distinction between possibly different protein isoforms for all genes. Late embryogenesis abundant (LEA) proteins play a protective role against desiccation-damage during drought periods, presumably by suppressing protein aggregation [67]. Dehydrins were initially categorized as "Group II LEA proteins" and indeed protect plant cells from desiccation-damage but are also involved in pathogen resistance [68]. Galactinol synthases are induced by drought, cold and ABA [69] and are interesting from the perspective of adaptation. Taji et al. [70] found that genes encoding galactinol synthase in transgenic Arabidopsis improved drought stress tolerance, which might be attributable to the role of galactinol synthase in the biosynthesis of "raffinose family oligosaccharides". The resulting accumulation of galactinol and raffinose may enhance drought stress tolerance via osmoprotection. Chitinase 8, the acidic isoform of glucan endo-1,3-beta-glucosidase and zeamatin are all involved in pathogen defense response, mainly against fungi [71][72][73] and in the case of probable disease resistance protein At4g33300 against bacteria [74]. Reason for the up-regulation of pathogen-resistance genes in silver fir in response to drought might be that forest trees are especially prone to drought-disease interactions, mainly involving fungi [21]. Glutathione S-transferases are most notably detoxification enzymes with many other, still unknown, involvements hypothesized in plant stress response [75]. As such, they are classified as early responsive to dehydration (ERD), i.e. genes that are activated swiftly in response to drought stress, a group that also contains dehydrins [76].
A group of gene products that stands out are the putative uncharacterized proteins (PUPs) inferred from the transcriptome sequencing of Picea sitchensis [77]. PUP1 belongs to the dehydrin family, PUP2 to the NAC domain and PUP4 as well as PUP5 to the family of UDP glycosyltransferases (UGT), according to the UniProt Protein Knowledgebase (http://www.uniprot. org/). All PUPs identified in this study seem to play an important role in the drought stress response of silver fir and should be focused on in further studies. Correspondingly, further research should focus on the DE transcripts that did not have a database hit, since these genes are not yet described for any plant species and are most likely involved in drought stress response. As such, they might possibly be specific to the Pinaceae family or conifers in general.
In conclusion our study provides first insights into the drought stress response of A. alba at the transcriptome level and offers a set of candidate genes for use in future studies. The majority of these candidates are yet unknown or lack a proper assignment and add to the growing genomic resources available for non-model conifer species. Such resources will be increasingly important for investigating the adaptive potential of long-lived organisms such as trees in the face of rapid climate change.