Sequence and Expression Analyses of Ethylene Response Factors Highly Expressed in Latex Cells from Hevea brasiliensis

The AP2/ERF superfamily encodes transcription factors that play a key role in plant development and responses to abiotic and biotic stress. In Hevea brasiliensis, ERF genes have been identified by RNA sequencing. This study set out to validate the number of HbERF genes, and identify ERF genes involved in the regulation of latex cell metabolism. A comprehensive Hevea transcriptome was improved using additional RNA reads from reproductive tissues. Newly assembled contigs were annotated in the Gene Ontology database and were assigned to 3 main categories. The AP2/ERF superfamily is the third most represented compared with other transcription factor families. A comparison with genomic scaffolds led to an estimation of 114 AP2/ERF genes and 1 soloist in Hevea brasiliensis. Based on a phylogenetic analysis, functions were predicted for 26 HbERF genes. A relative transcript abundance analysis was performed by real-time RT-PCR in various tissues. Transcripts of ERFs from group I and VIII were very abundant in all tissues while those of group VII were highly accumulated in latex cells. Seven of the thirty-five ERF expression marker genes were highly expressed in latex. Subcellular localization and transactivation analyses suggested that HbERF-VII candidate genes encoded functional transcription factors.


Introduction
Transcription factors (TFs) activate or repress the transcription of genes. The regulation of gene expression can be constitutive, tissue-specific or induced in response to environmental stimuli [1]. Plants are sessile organisms, which develop different mechanisms to protect themselves against aggressors, but also to adapt to various environments. In plants, this involves exogenous and endogenous signals, such as hormones. The gaseous plant hormone ethylene is reported to play an active role in a wide range of developmental and adaptation processes [2][3][4]. Studies on ethylene signalling have revealed a linear transduction pathway that leads to the activation of transcriptional regulators belonging to the Ethylene Response Factor (ERF) type. The Ethylene Response Factor (ERF) is one of the most important families of transcription factors and plays a key role in hormone, sugar and redox signalling in a context of abiotic and biotic stress [5]. ERFs have one AP2 domain, which is involved in DNA binding. This domain is about 60 amino acid residues which recognize GCC or DRE boxes in the promoter sequence of their target genes. Based on this conserved domain, ERFs were classed in ten groups by Nakano [6] or several subclasses by Sakuma [7]. The ERF family belongs to the AP2/ERF superfamily, which has been described for several species. Of the few woody plant species studied [8][9][10], AP2/ERF genes have been identified for only one subtropical crop: Hevea brasiliensis [11].
Hevea is the only commercial source of natural rubber. Natural rubber is synthesized in the cytoplasm of laticifers, which are periodically emitted from the cambium [12]. A laticifer is a cellular network created by anastomosis of latex cells, which is embedded in phloem tissues. The latex is collected by tapping the soft bark tissues. Ethephon, an ethylene releaser, is applied to the surface of the tapping panel to stimulate latex production. Ethephon induces several biochemical processes in laticifers [13], such as sucrose loading, water uptake, nitrogen assimilation or synthesis of defence proteins, involving a large number of ethylene-response genes [14][15][16][17][18][19][20], whereas its direct role in rubber biosynthesis is controversial [21] (Figure 1). For that reason, ethylene biosynthesis and signalling pathways have been intensively studied in Hevea [14,15,17,[22][23][24]. Based on NGS sequencing of five tissue-type libraries (latex, bark, leaf, root, somatic embryogenic tissues), 173 AP2 domain-containing transcripts have been identified in Hevea, of which 142 have a full-length AP2 domain [11]. In Hevea, the ERF transcription factor family consists of 115 members divided into ten main groups. The three groups VII, VIII and IX account for more than 50% of HbERF. The expression of some HbERF genes has been associated with somatic embryogenesis [25], jasmonic acid-induced laticifer differentiation [26], and abiotic stress [27,28]. According to Duan, the three HbERF genes induced upon laticifer differentiation correspond to three members of group VII (HbERF-VIIa3, HbERF-VIIa17 and HbERF-VIIa1) [11].
This paper first set out to more effectively estimate the number of HbAP2/ERF genes by completing the Hevea clone PB 260 transcriptome by sequencing additional tissues and annotating functions, particularly transcription factors, and comparing these RNA sequences with available genomic sequences. A reproductive tissue-type library (immature and mature male and female flowers, zygotic embryos) was sequenced and re-assembled with reads from previously sequenced libraries (leaf, bark, latex, root, somatic embryogenesis tissues) using the same bioinformatics pipeline published by Duan [11]. Then, a gene expression analysis was performed in various tissues using real-time RT-PCR in order to identify AP2/ERF genes potentially involved in the regulation of latex production. The function of members from group HbERF-VII, which are highly expressed in latex, was tested by subcellular localization and transactivation of an artificial GCC-box containing promoter.

Plant material
For the Hevea transcriptome analysis, plant material of clone PB 260 was grown according to the conditions described in Duan and coll. [11]. Latex, leaf, bark, and zygotic embryo RNA samples were collected and prepared at the Sembawa Centre, IRRI, P.O Box 1127, Palembang 30001, Indonesia (2 degrees 55 minutes 39.15 seconds South, 104 degrees 32 minutes 18.9 seconds East). Reproductive (immature and mature male and female flowers, zygotic embryos) RNA samples were collected and prepared at the CRRC, RRIT, Sanam Chaikhet District, Chachoengsao 24160, Thailand (13.39uN latitude and 101.26uE longitude). These locations and our activities did not require any specific permission. The field studies did not involve endangered or protected species.
For the relative transcript abundance analysis, the plant material was the same as for the transcriptome analysis, except latex and bark, which were harvested from 5-year-old trees without ethylene treatment and collected at IRRI's Sembawa Centre.

Total RNA isolation
All samples were frozen in liquid nitrogen and stored in the freezer at 280uC pending total RNA extraction. Total RNAs were isolated using the caesium chloride cushion method adapted from Sambrook and coll. [29] by Duan and coll. [30]. One gram of fresh matter was ground and transferred to a tube containing 30 mL of extraction buffer consisting of 4 M guanidium isothiocyanate, 1% sarcosine, 1% polyvinylpyrrolidone and 1% bmercapto-ethanol. After homogenization, tubes were kept on ice and then centrifuged at 10,000 g at 4uC for 30 min. The supernatant was transferred to a new tube containing 8 mL of 5.7 M CsCl. Ultracentrifugation in a swinging bucket was carried out at 89,705 g at 20uC for 20 h. The supernatant and caesium cushion were discarded whilst the RNA pellet was washed with 70% ethanol. After 30 min of air drying, the pellet was dissolved in 200 mL of sterile water. Although DNA could not cross the caesium cushion for this centrifugation condition, DNA contamination was checked by PCR amplification using primers of the Actin gene including the intron sequence. RNAs were stored at 280uC.

Sequencing technique and contig assembly
Total RNA samples of each reproductive tissue were pooled together. Single-stranded cDNA was synthesised from pooled RNA samples. Pyrosequencing was carried out using a GS FLX (Roche/454) Titanium run (Roche Applied Science) by the GATC-Biotech company in Germany. A 454 sequencing halfrun (200 Mbp) generated more than 500,000 reads with an average read length of 400 bp for each library according to the manufacturer. Reads were analysed using the ESTtik tool (Expressed Sequence Tag Treatment and investigation kit) [31] available on the Southgreen bioinformatics platform (www. southgreen.fr) and modified for the analysis of 454 data items. Reads were first cleaned to avoid mis-assembly by discarding sequences that were both lower than 120 bp and of low complexity. We then discarded non-coding reads by comparing the reads against the fRNAdb database using the Megablast algorithm with an e-value cutoff of 1e-20 [32]. More than 400,000 cleaned reads were obtained for this library. Reads from this library (Accession: PRJNA236464, ID: 236464) and those published by Duan and coll. (Accession: PRJNA235297 ID: 235297) were then assembled in contigs using the TGICL program [33], integrated in the ESTtik pipeline. The removal of poor end regions of reads and the computation of overlaps between reads was done using the default parameters of the CAP3 program (best hit cut-off for difference -b = 20; best hit for clipping -c = 12) [34]. Clustering was carried out for reads with an overlap of at least 60 bp and 94% identity between reads. The second step was an assembly of reads from each cluster with greater stringency: the length of sequence overlap was then 60 bp with 95% identity between reads. The transcript sequence database consisted of contigs.
The annotation of contigs was processed using Orfpredictor to obtain the longest ORFs leading to an Interproscan annotation. Then, all contigs were aligned to the NCBI non-redundant protein database using BLASTX to obtain an xml output format. Lastly, we treated the Interproscan and BLASTX results as the input files of the Blast2Go program and generated the output file listing the GO number for each annotated contig. By using this file, combining their GO databases (component.ontology, function.ontology, and process.ontology) downloaded from the GO website, we generated the figure using our own perl program.
Primer design and analysis of transcript abundances by real-time RT-PCR Several rules were applied in order to reduce the risk of error in relative gene expression data. The integrity of total RNA was checked by electrophoresis. Primers were designed at the 39 side of each sequence in order to reduce the risk of error due to short cDNA synthesis using the Primer 3 module of Geneious (Biomatters Ltd., New Zealand). Real-time PCR amplification and the fusion curve were carried out using a mix of cDNAs in order to check the specificity of each pair of primers. Primer sequences are listed in Table S1 in File S2. cDNAs were synthesized from 2 mg of total RNA to the final 20 mL reaction mixture using a RevertAidTM M-MuLV Reverse Transcriptase (RT) kit according to the manufacturer's instructions (MBI, Fermentas, Canada). Full-length cDNA synthesis was checked on each cDNA sample by PCR amplification of the Actin cDNA using primers at the cDNA ends. Quantitative gene expression analysis was finally carried out by real-time RT-PCR using a Light Cycler 480 (Roche, Switzerland). Real-time PCR reaction mixtures consisted of 2 mL RT product cDNA, 0.6 mL of 5 mM of each primer, and 3 mL 26 SYBR green PCR master mix (LightCyclerH 480 SYBR Green I Master, Roche Applied Sciences) in a 6-mL volume. PCR cycling conditions comprised one denaturation cycle at 95uC for 5 min, followed by 45 amplification cycles (95uC for 20 s, 60uC for 15 s, and 72uC for 20s). Expression analysis was carried out in a 384-well plate. Samples were loaded using an automation workstation (Biomek NX, Beckman Coulter).
Real-time PCR was carried out for eleven housekeeping genes in order to select the most stable gene as the internal control for all the compared tissues (HbelF1Aa, HbUBC4, HbUBC2b, HbYLS8, HbRH2b, HbRH8, HbUBC2a, HbalphaTub, Hb40S, HbUbi, HbActin). HbRH2b was selected as the best reference gene due to its stability in tissues from immature and mature male and female flowers, zygotic embryos, latex and bark. The homogeneity of the HbRH2b gene Cp confirmed that it could be used as an internal reference gene (Table S2 in File S2). The HbRH2b gene was amplified in each reaction plate in parallel with target genes. The transcript abundance level for each gene was relatively quantified by normalization with the transcript abundance of the reference HbRH2b gene. Relative transcript abundance took into account primer efficiencies. All the normalized ratios corresponding to transcript accumulation were calculated automatically by Light Cycler Software version 1.5.0 provided by the manufacturer using the following calculation: Normalized Ratio = Efficiency 2D(Cp target-Cp RH2b) .

Statistical data analyses
Real-time PCR reactions were set up with three biological replications. Statistical analysis was performed with an ANOVA after logarithmic transformation of raw data. The ANOVA was followed by a Student Newman-Keuls test when the values of relative transcript abundances were compared for immature and mature flowers, zygotic embryos, latex and bark. Values with the same letter did not differ significantly at the 0.05 probability level.

Phylogenetic analysis of the AP2 domain from ERF marker genes
The full AP2-domain sequences derived from Hevea, Arabidopsis, Populus, Vitis, Malus6domestica, Oryza and Zea mays AP2-domain proteins of around 60 amino acids were then aligned using MUSCLE software [35,36] which uses a progressive multiple alignment method. The alignment was curated by Gblocks software [37], searching for at least 10-amino-acid-long conserved blocks, and the block with 57 amino acids was extracted. This block of 57 amino acids was used to construct the phylogenetic tree using PhyML software [38], which implements a maximum likelihood tree reconstruction method, using the LG+gamma model, starting from a BioNJ tree [39]. A RAP-Green analysis was performed on the initial PhyML tree to improve gene function inferences and predict gene duplications in phylogenetic trees [40]. The final tree was drawn and displayed with the Archaeoptheryx program, and rooted on the branch separating the RAV family from the rest of the tree. Branch supports were computed using the aLRT-SHlike method [41].

Cis-acting element analysis and determination of conserved motifs
In silico promoter analysis was searched again in the PLACE database [42] and The PlantCARE database [43]. The Hevea genomic scaffold was provided for the HbERF-VIIa7, HbERF-VIIa17 and HbERF-VIIa20 genes by the CATAS-BIG Hevea Genome Project coordinated by Prof Chaorong Tang and Prof Songnian Hu. A 2000 bp sequence upstream from the start codon was scanned for the presence of putative cis-acting regulatory elements using the database associated search tools. The number of copies for each cis-acting element was then counted. Conserved motifs were investigated by multiple alignment analyses using ClustalW and MEME version 3.0 [44].

Subcellular localization and transcriptional activity tests by transient expression in a single cell system
Tobacco protoplasts were used in the subcellular localization because Hevea protoplasts have a short viability [45]. GFP Cterminal fusions were obtained with HbERF-VIIa7, HbERF-VIIa17 and HbERF-VIIa20 and used for tobacco protoplast BY-2 transfection according to Chaabouni and coll. [46]. The subcellular location of the fluorescence was determined after 20 hours using a DM4500 microscope (Leica).
A transactivation experiment was carried out according to the procedure published by both Chaabouni and Pirrello [46,47]. A synthetic reporter construct (4XGCC-GFP) was used [47]. Effector constructs were generated by fusing the 35S promoter to the CDS of the genes (HbERF-VIIa7, HbERF-VIIa17 and HbERF-VIIa20). For transient assays, tobacco (Nicotiana tabacum) BY-2 protoplasts were co-transformed with reporter and effector constructs [48]. Transformation assays were performed in three independent replicates. After 16 h, GFP expression was analysed and quantified by flow cytometry (FACS Calibur II instrument, BD Biosciences) on a flow cytometry platform (MRI). Data were analysed using Flowing software. For each sample, 100-400 protoplasts were gated on forward light scatter (FSC) and side light scatter (SSC) to check the size and the structure of protoplasts and eliminate debris from the analysis. The GFP fluorescent protoplasts used for the calculation of GFP activity were selected within a gate. This gate was previously defined by the comparison between transformed protoplasts (co-transformation with the effector plasmid lacking the HbERF coding sequence) and nontransformed protoplasts. The green fluorescence detected in the defined gate showed a marked hook. The GFP fluorescence per population of cells corresponded to the average fluorescence intensity of the cell population after subtraction of autofluorescence determined with non-transformed BY-2 protoplasts. The data were normalized using an experiment with protoplasts transformed with the reporter vector in combination with the vector used as the effector plasmid, but lacking the HbERF coding sequence.

Functional annotation and classification of a comprehensive Hevea transcriptome
Transcript sequences were previously produced from several tissue-type libraries (somatic embryogenic tissues, leaf, bark, latex and root) by the pyrosequencing GS-FLX 454 technique in order to have the most complete representation of the transcriptome of the Hevea clone PB 260 [11]. We supplemented this reference transcriptome with RNA sequences from a reproductive tissuetype library (immature and mature male and female flowers, zygotic embryos). All reads generated for these six libraries were re-assembled using the automatic pipeline described by Duan and coll. [11]. A total of 3,525,203 reads were cleaned and assembled in 86,941 contigs (Table S3 in File S2). From these contigs, 30,342 were then annotated in the Gene Ontology (GO) database, and 30,312 were classed using BLASTX (Table S4 in File S2). All contigs were categorized in 42 functional groups and classed in three ontologies: cellular component, molecular function and biological process ( Figure 2). The largest number of contigs was assigned to the molecular function (26,747 contigs), which contained two major subcategories (binding and catalytic activity). This was followed by the biological process (19,191 contigs), which contained six major subcategories (biological regulation, cellular process, establishment of localization, localization, metabolic process, and regulation of biological process). The minority category was the cellular component (9,508 contigs), which contained five subcategories (cell, cell part, macromolecular complex, organelle and organelle part). Given that the elements of the three GO terminologies are not independent, a Venn diagram was drawn in order to illustrate their relationship ( Figure 3).

Relative transcript abundance in various tissues
The relative transcript abundance of 84 AP2/ERF genes was analysed in various tissues (both immature and mature male and female flowers, zygotic embryos, leaf, bark and latex) ( Figure 5). For the 66 ERF genes, a high relative transcript abundance (value higher than one for the HbERF/HbRH2b ratio) was observed in all tissues for ERF members from groups I (4/7 genes), II (4/5 genes), VII (5/6 genes) and VIII (6/10 genes) ( Figure 5A). In addition, a high relative transcript accumulation has been found for a few members of other ERF groups (ERF-IIIe01, HbERF-VI05, HbERF-IXa02, HbERF-IXb02, HbERF-IXb03 and HbERF-IXc01). For the AP2 and RAV families, only AP2-6, AP2-16 and AP2-18 showed a similar or higher relative transcript abundance than the HbRH2 internal control.
Transcripts of ERFs from group VII were highly accumulated in latex. Several members of this group were identified as key regulators of hypoxia-responsive genes in Arabidopsis but also in Oryza species. For that reason, an additional phylogenetic analysis was carried out using the 6 Hevea members, and 15 Oryza sativa and 2 Oryza nivara protein sequences, which have specific functions related to the response to submergence ( Figure S11 in File S1). This phylogenetic analysis showed that the HbERF-VIIa07 and HbERF-VIIa12 genes were found to be orthologs to the OsEREBP1. Finally, a phylogenetic analysis was carried out by comparing AP2 domain-deduced amino acid sequences from five dicots (Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Mal-us6domestica and Hevea brasiliensis) and two monocots (Oryza sativa and Zea mays) for group VII (Figure 6). The differentiation of members from ERF group VII clearly occurred before the speciation between dicot and monocot species because ERF-VIIs from all species were distributed in several clades [49]. Only one clade was early differentiated to generate genes involved in the tolerance of submergence in rice.

Gene structure and in silico promoter analysis of HbERF-VIIs
Highly expressed in latex and putatively key regulators of hypoxia response, which is a phenomenon occurring in latex from TPD-affected trees, this study focused on gene structure and in silico promoter analysis of HbERF-VIIs. The gene structure of the 6 HbERF-VIIs was analysed using genomic scaffold sequences from the Hevea clone CATAS 7-33-97. These scaffolds revealed that the ERF-VII genes possessed 2 exons and at least 1 intron (Figure 7). The first exon was shorter (174-287 bp) than the second exon (568-968 bp), which contained the AP2 domain. Two genes (HbERF-VIIa07 and HbERF-VIIa12) contained 2 introns, the second intron being located in the 39UTR sequence. These two genes showed a large first intron (1472 to 1848 bp) compared with other genes (90-142 bp), and a short second intron (129-135 bp).
In order to better understand the regulation of these ERFs, an in silico analysis of the 2000 bp upstream sequence ATG codon of these scaffolds was carried out using PLACE and PlantCARE ( [42,43]; Table S9 in File S2). Forty-one cis-acting elements were selected for their putative role in the response to tissue specificity, hormones, sugar/starch and stress (Table 3). According to the selected regulatory groups, 11 cis-acting elements were observed. E-box, POLLEN1 and RAV1-A were found for all of the studied genes. Interestingly, CAT-box, meristem specific cis-acting elements only had a high frequency (27 copies) in HbERF-VIIa1. L1 box (L1 layer specific) and XYL (regulating secondary xylem development cis-acting element) only existed in HbERF-VIIa17. Thirteen cis-acting elements of six hormone responses (jasmonic acid, ethylene, abscisic acid, auxin, gibberellin and salicylic acid) were observed. Three cis-acting elements were found individually, TCA-element in HbERF-VIIa20, GCC-box in HbERF-VIIa1and JERE in HbERF-VIIa17. For sugar and starch responses, eight cisacting elements (A-box, amylase box, CGACG element, CMSRE-1, OsBP-5, SRE, TATCCA element and TATCCAY motif) were found. The cis-acting elements in this regulatory function were distributed over the studied genes except HbERF-VIIa7. The majority of the upstream elements were categorized in stressrelated responses. Eight cis-acting elements (ARE, DRE/CRT, LTRE, CNGTTR-motif, CANNTG-motif, TC-rich repeats, Wbox and WUN-motif) were discovered. CANNTG-motif and TCrich repeats were found for all of the studied genes, while the WUN-motif was specifically present in HbERF-VIIa04.   The ERF family has been classed according to Nakano's method. ND: not determined. doi:10.1371/journal.pone.0099367.t001
In order to validate the function of HbERF-VIIs as transcription factors, a subcellular localization experiment was carried out using HbERF coding sequences/GFP translational fusion into pMDC83 (Figure 9). Transient expression into BY-2 tobacco  Table 2. Identification of putative functions for expression marker genes based on a phylogenetic tree analysis with Arabidopsis thaliana using the deduced amino acid sequences of the AP2 domain for each gene ( Figure S1-S10 in File S1; Tables S7-S8 in File S2).

Gene
Transcript accumulation Phylogenetic analysis Orthologous gene Accession No.

HbERF-Xa06
Low in leaf ERF110 At5g50080-Xa Regulation of bolting time [108] doi:10.1371/journal.pone.0099367.t002 protoplasts revealed GFP activity of the fusion protein in the nucleus for each tested HbERF, in contrast with a pMDC83 empty control plasmid. As previously demonstrated, transient transformation of protoplasts is a powerful tool for analysing the activity of transcription factors [47,50]. The GFP reporter gene under the control of a synthetic promoter harbouring the GCC box cis-acting element was transactivated by the three HbERF-VII candidates ( Figure 10). The effector constructs and the reporter constructs driven either by a GCC-rich synthetic promoter (GCC) or a synthetic promoter containing a mutated GCC motif (mGCC::TCCTCC) were co-transformed with 35S::HbERF-VII constructs into BY2 tobacco protoplasts. GFP activity was  Table 3. Cis-acting regulatory elements identified in the HbERF-VIIs promoter region using 2000 bp region upstream ATG (start codon) by PLACE and PlantCARE.

Regulatory group
Cis-acting element

Towards a comprehensive Hevea transcriptome
Over the last decade, several studies were conducted on Hevea brasiliensis to identify ESTs from latex [51,52], especially those involved in rubber biosynthesis [53,54], and also TPD-related genes [55]. More recently, several high-throughput analyses have led to the identification of microRNAs [56,57], transcriptomes [11,[58][59][60], and genomes [61] using NGS technologies. All the transcriptome analyses have provided a partial overview of the Hevea transcriptome, as RNAs from only a few specific tissues were sequenced, such as shoot apex [58] and leaf [61] from clone RRIM 600, along with leaf and latex from CATAS 7-33-97 [59]. This latter work by Xia and coll. also revealed incomplete sequences for the ERF family, with partial sequences unfit for classification [11]. Duan and coll. reported having developed a comprehensive transcriptome using several tissues (root, bark, latex, leaf, and tissues from somatic embryogenesis) from plants growing under different environmental conditions [11]. Our study improved this previous transcriptome by adding RNA sequences from reproductive tissues (immature and mature male and female flowers, zygotic embryos) and by performing gene annotation. Fifty-eight transcription factor families were identified accounting for 2448 contigs (8.07% of predicted genes). Another study based on the Hevea genome sequence showed 5978 transcription factors (8.5% of gene models) classed in 50 families [61]. This number of transcription factors is very high (2.5 to 6 times) compared with other species and even with our own annotation. This large number could be explained by an overestimation. For instance, based on repeat motif identification, Rahman and coll. found 139 AP2 genes, 246 ERF genes and 25 RAV genes. The presence of the AP2 domain in three transcription factor families (AP2, ERF, RAV), and the B3 domains in two other families (RAV, B3) probably led to one contig being annotated in more than one family carrying the same specific domain. The careful annotation carried out in our work is likely to provide a more accurate estimation of transcription factors and the ERF family in particular, as in our annotation 58 transcription factor families are commonly found in 49 species [62]. The number of transcription factors ranged from 1291 to 3546 genes with, for instance, 2201 genes for Manihot esculenta, a Euphorbiaceae close to Hevea brasiliensis, and 2585 genes in Populus trichocarpa in woody species.
This study led to a better estimation of AP2/ERF genes for the Hevea brasiliensis clone PB 260. The present transcriptome database, supplemented with RNA sequences from reproductive tissues, contains a smaller number of AP2/ERF genes (114+1 soloist instead of 142 found in Duan's paper) closed to the genomic information provided by the Chinese Hevea Genome Project on the Hevea clone CATAS 7-33-97. Although the first assembly of RNA reads was carried out with 2,387,930 reads from GS-FLX 454 sequencing, the addition of 643,113 reads from the reproductive tissues in this study led to a significant improvement in sequence quality comparable with genomic data (Table S3 in File S2). The decrease in gene number between Duan and coll. and the present report is mostly due to the reduction in gene number for the ERF Table 3. Cont.

Regulatory group
Cis-acting element

TATCCA element
Mediates sugar and hormone regulation groups VII (from 23 to 6), VIII (from 15 to 11) and IX (from 19 to 12). This reveals that transcriptome analyses can be used to identify genes harboured in a plant genome but require RNA sequencing of large numbers of tissues from plants at different stages growing under various environmental conditions in order to cover all types of expression patterns. Some HbAP2/ERFs could play an important role in latex Several members of three ERF groups (I, VII, VIII) are highly expressed in latex. In total, sixteen latex expression marker genes with higher or lower transcript abundance compared with other tissues were found in all ERF groups except group V. Seven highly expressed ERFs in latex, not significantly different from other tissues, belonged to groups II, IV, VII and VIII. HbERF-II and HbERF-VIIIa deduced proteins contained the repressor EAR motif. Their high expression under normal conditions suggests a certain negative control of ethylene response, especially in latex, by HbERF-IIa03 and HbERF-IIb04, and in leaves by HbERF-IIb01 and HbERF-IIb02. Members of group VIII are known to drive the response to jasmonate. This hormone is assumed to play a crucial role in the response to tapping and, in particular, to mechanical wounding of bark. The latex expression marker gene HbERF-VIIIa04 had AtERF3, a repressor of abiotic stress response and an inducer of cell death, as an ortholog.
The HbERF-IVs correspond to DREB2 in Sakuma's classification. These genes are generally involved in the response to dehydration. The role of HbERF-IVa03 in the response to water stress is consistent since laticifers are subjected to recurrent osmotic stress after tapping and consequent latex flow. Group VII was described as a regulator of hypoxia-responsive genes, which will be developed in the next part of the discussion. For group X, one member was described as a master regulator of redox potential (RRTF1) [63]. In Hevea, HbERF-Xb1 is ortholog to RRTF1but is not an expression marker gene. By contrast, the marker HbERF-Xa04 did not have any known function.
With regard to the nine latex expression marker ERF genes with very low relative transcript abundance, two belonged to group III. This group corresponds to CBF/DREB1 in Sakuma's classification. In addition to low expression of the HbERF-III genes, this group III included a small number of genes in Hevea compared with other species. This could explain the low adaptation of this subtropical species to cold. Indeed, genetic analysis showed that a duplication of CBF genes was related to low temperature tolerance in Arabidopsis, tomato [64], maize [65], and Eucalyptus [66]. In wheat and barley, the Frost resistance-2 (Fr-2) locus is coincident with a cluster of more than 12 CBF genes [67]. In Eucalyptus, 14 of the 17 CBF genes are located in a cluster on chromosome 1 [68]. QTLs of cold tolerance have been linked to CBF genes in Arabidopsis [69], and Eucalyptus nitens [70]. Nevertheless, a general correlation between level of expression of CBFs and latitudinal position has been demonstrated [71] it has also been shown that the correlation between level of expression and level of cold tolerance is not so strict and in some accessions it is clearly dissociated [72]. Three other latex expression marker genes belong to ERF groups VI and VI-like, which are involved in the response to cytokinin [73]. Lastly, two latex expression marker genes belonging to group IX had orthologs with functions related to vascular cell division (HbERF-IXa03/AtERF99) [74], and programmed cell death (HbERF-IXb03/AtERF102) [75].
The last latex expression marker gene HbAP2-6 encodes an ortholog of WRINKLED1 (WRI1). WRI transcription factors orchestrate tissue-specific regulation of fatty acid biosynthesis and in particular WRI1 is involved in seed storage metabolism in Arabidopsis [76,77]. Glycolipids represented 27-37% of total lipids in Hevea latex [78]. This unusual glycolipid composition was suggested to be linked to the peculiar nature of this specialized cytoplasm expelled from the laticiferous system during tapping. The high relative transcript abundance of HbAP2-6 could thus be related to the induction of genes involved in fatty acid biosynthesis in Hevea.

HbERF-VII genes could play a role in hypoxia response
The ERFs from group VII are known to regulate hypoxiaresponsive genes. ERF-VII proteins govern the response to low oxygen in plants by post-translational regulation. The regulation of these transcription factors has been characterized in depth for AtERF75/RAP2.2, AtEBP/RAP2.3, AtERF74/RAP2.12, AtERF71/HRE2 and AtERF73/HRE1 in Arabidopsis [79][80][81][82]. AtEBP also confers resistance to hydrogen peroxide and heat treatments [83], and this protein has been shown to interact with ACBP4 [84]. The stability of all these proteins is enabled by binding to the Acyl CoA binding proteins (ACBP). Under hypoxia, ERF-VII proteins are dissociated and then activate hypoxiaresponsive genes. When normoxia is recovered, the N-end rule pathway is involved in the proteosomal degradation of ERF-VII [85]. The N-terminal MCGGAII motif of ERF-VII proteins is targeted by the N-end rule pathway associated with the proteosomal degradation pathway [85,86]. The presence of the N-terminal MCGGAII motif in all the identified HbERF-VII deduced proteins, as in other species, suggests similar posttranslational regulation in Hevea.
Most species have 3 to 8 ERF-VII genes: 5 for Arabidopsis thaliana [6], 6 for Populus trichocarpa [9], 3 for Vitis vinifera [8], 8 for Malus6domestica [87], 4 for Solanum lycopersicon [47], and 6 for Hevea brasiliensis (present work). Only rice has a larger number (15) of genes [6]. Interestingly, phylogenetic analysis of ERF group VII proteins revealed that Oryza sativa had one additional clade with Figure 10. Transactivation of the synthetic GCC-box and mGCC containing promoters by HbERF-VIIa7, HbERF-VIIa17 and HbERF-VIIa20 proteins. Transient expression in BY-2 tobacco protoplasts co-transformed by pMDC32 harbouring the effector construct (ERF candidate genes under the control of the CaMV 35S promoter) and the reporter constructs, 46 GCC::GFP or 46 mGCC::GFP, respectively. Fluorescence activity was measured by flow cytometry for three independent biological replicates. The ratio of fluorescence between the constructs with functional GCC and mutated mGCC boxes revealed that ERF candidates are activators (ratio .1) or repressors (ratio ,1). (*) indicates significant difference for the Student test (p,0.01). doi:10.1371/journal.pone.0099367.g010 three genes (OsSUB1A, OsSUB1B, OsSUB1C) [88]. Interestingly, OsSUB1A is involved in submergence tolerance [89,90]. Thus, ERF-VIIs are pivotal regulators of responses to flooding and low oxygen [91]. These ERF-VII proteins have also been shown to be regulated by oxidative stress [92,93]. The divergence of these genes clearly led to a specific adaptation of rice to submergence by orchestrating various acclimatization responses [94], by creating a rice-specific group consisting of Sub1 and Snorkel proteins [49]. By contrast, the high expression of ERF-VIIs in Hevea brasiliensis with a small number of genes suggests a transcriptional-based adaptation in latex cells.
Some hypoxia-responsive genes targeted by ERF-VIIs identified in Arabidopsis were also present in latex. An analysis of the gene expression pattern revealed that Arabidopsis RAP2.2 controls the induction of genes involved in sugar metabolism and fermentation pathway enzymes [95]. Alcohol dehydrogenase (ADH), nonsymbiotic haemoglobin (HB1), SUCROSE SYNTHASE1 (SUS1) and SUCROSE SYNTHASE4 (SUS4) are considered as good markers of anaerobic response [49]. In order to overcome insufficient production of ATP by mitochondrial respiration, the catabolism of soluble sugars and starch can lead to an adjustment of the energy crisis by maintaining ATP production and NAD + regeneration [79,91,95]. Fermentation of pyruvate to ethanol by pyruvate decarboxylase and ADH plays a central role in hypoxia response. ADH expression is actively induced by RAP2.12 in Arabidopsis [96]. In Hevea, ADH, SUS1, SUS4 and HB1 were transcribed in all tissues and especially in latex, with the ADH presence of two contigs (CL1Contig11114, CL1536Contig2), HB1 presence of one contig (CL2139Contig4), and SUS1 presence of four contigs (CL1Contig20718, CL33Contig19, CL33Contig3 and CL33Contig8) (Table S11 in File S2). Their corresponding enzyme activities had also long been recorded in latex [97]. The presence of cis-acting elements involved in anaerobic response (ANAERO) and the sugar or starch content (amylase box, CGACG, BP-5, TATCCAY) in the promoter sequence of four HbERF-VII genes also revealed potential transcriptional regulation by the metabolism.

Response to low oxygen concentration in latex and regulation of rubber production
Sucrose is the source of carbon and energy for the biosynthesis of natural rubber. AcetylCoA generated by glycolysis is used by the mevalonate pathway to produce isopentenyl pyrophosphate (IPP). IPP is the precursor for elongation of the polyisoprene chain in rubber particles. Latex regeneration after tapping requires a lot of energy from glycolysis. A hypoxic condition was suggested in laticifers since the fermentative pyruvate metabolism is the main route of sugar degradation in latex cytosol [98]. The intermediate product, pyruvate, can be used by both aerobic and anaerobic pathways to generate acetyl CoA [97]. In addition, oxygen consumption has been observed for two latex-specific organelles, in the lutoids by peroxidase and NADH-quinone reductase, and the Frey-Wyssling particles by o-diphenoloxidase [97]. Lutoidic NAD(P)H oxidase generates toxic forms of oxygen such as superoxide anions, which are involved in lipid peroxidation of lutoids. Coagulant factors are released from the damaged lutoids and lead to the aggregation of rubber particles [99]. This in situ coagulation of rubber particles reduces latex flow after tapping. This physiological syndrome is called Tapping Panel Dryness (TPD). In TPD-affected trees, the consumption of oxygen by NADH-Cytochrome-c-oxidoreductase from lutoids is particularly high [100]. According to these authors, NADH-dependent consumption of oxygen is inhibited by superoxide dismutase activity [101].
The involvement of oxygen in the latex metabolism and TPD syndrome and the tolerance of rubber trees to wounding suggest that some HbERF-VIIs might play an important role in latex production. Indeed, three (HbERF-VIIa04, HbERF-VIIa07, HbERF-VIIa12) of the six HbERF-VIIs identified in this study as expression marker genes are highly regulated in latex and are orthologs to AtEBP/RAP2.3 and AtERF74/RAP2.12. Another HbERF-VII gene, HbERF-VIIa17 ortholog genes to AtEBP, might play a role in the response to the accumulation of reactive oxygen species generated during latex regeneration. In addition, three HbERF genes induced upon laticifer differentiation [26], correspond to three members of group VII (HbERF-VIIa3, HbERF-VIIa17 and HbERF-VIIa1) according to Duan and coll. [11]. In the present work, we showed that HbERF-VIIa3 corresponds to the new HbERF-VIIa4. Further characterization of the HbERF-VII genes and their target genes should lead to the identification of new functions in Hevea brasiliensis.

Supporting Information
File S1 Contains Figure S1, Phylogenetic tree of ERF group I. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S2, Phylogenetic tree of ERF group II. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S3, Phylogenetic tree of ERF group III. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S4, Phylogenetic tree of ERF group IV. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S5, Phylogenetic tree of ERF group VI. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S6, Phylogenetic tree of ERF group VI-L. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S7, Phylogenetic tree of ERF group VII. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S8, Phylogenetic tree of ERF group VIII. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S9, Phylogenetic tree of ERF group IX. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S10, Phylogenetic tree of ERF group X. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Arabidopsis (blue letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. Expression marker genes are indicated in bold letters. Figure S11, Phylogenetic analysis of ERF group VII using the 6 Hevea brasiliensis, 15 Oryza sativa and 2 Oryza nivara members. The deduced amino acid sequences of the AP2 domain from Hevea (black letter) and Oryza sativa and Oryza nivara (green letter) were aligned using Muscle, and the phylogenetic tree was constructed using PhyML with an LG+T model. (PDF) File S2 Contains Table S1, List of primer sequences for 84 AP2/ ERF genes and 11 housekeeping genes, and expected length of amplicons after amplification by real-time PCR in H. brasiliensis clone PB 260. Table S2, Comparison of Cp values, standard deviation and coefficient of variance for gene expression analysis by real-time RT-PCR of 11 housekeeping genes in 11 tissues from male and female immature and mature flowers, zygotic embryos, latex and bark. Table S3, Summary of 454 sequencing and clustering using automatic pipeline for various tissue-type libraries (embryogenic tissues, leaf, bark, latex, roots and reproductive tissue) and a global library combining reads from all tissues. Table  S4, Annotation and classification of gene function according to GO analysis. Table S5, Classification of transcription factor family in Hevea. Table S6, Comparison between contigs generated by RNAseq (Duan et al. 2013 and this report) and genomic scaffold. Table S7, RAP-Green analysis for ortholog and paralog inferences in phylogenetic comparison between Arabidopsis and Hevea. Table  S8, Identification of putative orthologs of Hevea brasiliensis AP2/ ERFs using Arabidopsis TAIR database. Table S9, Cis-acting regulatory elements identified in HbERF-VIIs promoter region using 22000 bp region upstream ATG (start codon) by PLACE and PlantCARE. Table S10, Comparison of conserved motifs between Arabidopsis and Hevea. Table S11, Read distribution list corresponding to anaerobic-responsive genes. (XLSX)