Genome-Wide Characterization and Expression Profiling of the AUXIN RESPONSE FACTOR (ARF) Gene Family in Eucalyptus grandis

Auxin is a central hormone involved in a wide range of developmental processes including the specification of vascular stem cells. Auxin Response Factors (ARF) are important actors of the auxin signalling pathway, regulating the transcription of auxin-responsive genes through direct binding to their promoters. The recent availability of the Eucalyptus grandis genome sequence allowed us to examine the characteristics and evolutionary history of this gene family in a woody plant of high economic importance. With 17 members, the E. grandis ARF gene family is slightly contracted, as compared to those of most angiosperms studied hitherto, lacking traces of duplication events. In silico analysis of alternative transcripts and gene truncation suggested that these two mechanisms were preeminent in shaping the functional diversity of the ARF family in Eucalyptus. Comparative phylogenetic analyses with genomes of other taxonomic lineages revealed the presence of a new ARF clade found preferentially in woody and/or perennial plants. High-throughput expression profiling among different organs and tissues and in response to environmental cues highlighted genes expressed in vascular cambium and/or developing xylem, responding dynamically to various environmental stimuli. Finally, this study allowed identification of three ARF candidates potentially involved in the auxin-regulated transcriptional program underlying wood formation.


Introduction
The plant hormone auxin plays a prominent role in the regulation of plant growth in response to diverse developmental and environmental cues such as organogenesis, tropic movement, root growth, fruit development, tissue and organ patterning and vascular development [1]. Auxin plays a crucial role in the specification of vascular stem cells (procambium) and in cambial activity [2]. Analysis of auxin distribution across the cambial region in hybrid aspen trees showed a radial auxin gradient reaching a peak level in the cambial zone or at the border between the cambial zone and the expansion zone towards developing wood cells [3,4]. The auxin gradient was indeed shown to overlap with the sequential and numerous auxin-regulated genes responding dynamically to the change in auxin levels in wood forming cells [5].
As trees are long-lived organisms with a sessile lifestyle, they have to adapt to changing environmental conditions throughout their lifetimes, which may span decades and, in some cases, centuries. In particular, vascular stem cell activity shows plasticity in response to mechanical stress, which affects wood formation and quality. In angiosperm woody species, a local increase in cambial cell division induces the formation of tension wood on the upper side of leaning tree stems [6,7]. Auxin has been proposed to be implicated in the tension response, and application of either exogenous auxin or auxin transport inhibitors was shown to induce the gelatinous G-fibres characteristic of tension wood [8]. Although measurements of endogenous auxin failed to reveal significant changes in auxin balance in the cambial region tissues, a rather large set of auxin-related genes was found to be differentially expressed in developing poplar tension wood [9]. A recent study indicated that the auxin signalling pathway is significantly disrupted during cambial dormancy in hybrid aspen [10]. Although auxin has long been proposed as a primary regulator of cambial activity and wood formation [4,11], the auxin-regulated transcriptional programs underlying wood formation remain largely underinvestigated.
Auxin exerts its function by modulating the expression of numerous genes, among which is a set of transcriptional regulators. Auxin Response Factors (ARFs) and Aux/IAAs are two well-known mediators that regulate auxin-responsive gene expression [12,13]. Most ARF proteins contain a highly conserved N-terminal B3-like DNA binding domain that recognizes an auxin-response element (AuxRE: TGTCTC) present in the promoters of auxin-responsive genes. The C-terminal domain contains two motifs, called III and IV, also found in Aux/IAA proteins and shown to enable the formation of homo- and heterodimers among ARFs and Aux/IAAs [14,15]. The middle region, whose sequence is less conserved, confers transcription activation or repression depending on its amino acid composition [13]. Biochemical and genetic studies in Arabidopsis and other plants have led to a working model of the mediation of auxin response by ARF proteins [14,16]. In the absence of auxin, Aux/IAAs bind to ARFs and recruit co-repressors of the TOPLESS (TPL) family, preventing the ARFs from regulating target genes [17]. The presence of auxin induces Aux/IAA protein degradation via the 26S proteasome through the SCF-TIR1 ubiquitin ligase complex, thus liberating the trapped ARF proteins and allowing them to modulate the transcription of target auxin-responsive genes (for review, see Guilfoyle and Hagen [12]). This model, based on a limited number of ARF-Aux/IAA interaction studies, provides a framework for understanding how members of these families may function. More recently, a large-scale analysis of the Aux/IAA-ARF interactions in the shoot apex of Arabidopsis showed that the vast majority of Aux/IAAs interact with all ARF activators, suggesting that most Aux/IAAs may repress the transcriptional activity of ARF activators [18]. In contrast, Aux/IAAs have limited interactions with ARF repressors, suggesting that the role of the latter is essentially auxin-independent and that they might simply compete with the ARF activators for binding to the promoters of auxin-inducible genes [18]. This finding is particularly important taking into account that auxin predominantly activates transcription [19-21] and that a large complement of the ARF family acts as transcriptional repressors [12]. Whereas the scenario proposed above applies to the shoot apical meristem, it is likely that specific interactions between Aux/IAAs and ARFs also affect the dynamics of the ARF-Aux/IAA signalling pathway in other developmental processes such as cambial development.
The recent availability of the Eucalyptus grandis genome [40], the second hardwood forest tree genome fully sequenced, offers new opportunities to gain insights into the regulation of secondary growth and cambial activity by ARFs, especially because Eucalyptus is an evergreen tree whose cambial activity does not undergo dormancy, in sharp contrast with deciduous trees like Populus. Eucalyptus is also the most planted hardwood in the world, mainly for pulp and paper production, and is also foreseen as a dedicated energy crop for lignocellulosic biofuel production. Thus, understanding the mechanisms underlying auxin regulation in Eucalyptus wood formation is of interest both in the context of plant development and as a path to improve lignocellulosic biomass production and quality.
In the present paper, we report a genome-wide identification and characterization of the ARF family in Eucalyptus grandis. We analyzed gene structure, protein motif architecture, and chromosomal location of the members of the E. grandis ARF family. We also performed comparative phylogenetic analyses and large-scale transcript profiling, with a special focus on vascular tissues, to gain insights into their evolution, expression characteristics and possible functions.

Identification of ARF gene family in Eucalyptus grandis and chromosomal location
The identification procedure is illustrated in Fig. S1. First, we used Arabidopsis ARF proteins as queries in BLASTP searches against the predicted proteins of the Eucalyptus genome (JGI assembly v1.0, annotation v1.1, http://www.phytozome.net/eucalyptus). A total of 64 Eucalyptus proteins identified in this initial search were examined by manual curation of protein motif scans using Pfam domain IDs (http://pfam.wustl.edu) and the NCBI conserved domain database (http://www.ncbi.nlm.nih.gov/cdd). Redundant and invalid gene models were eliminated based on gene structure, intactness of conserved motifs and EST support. Three incomplete gene models were identified and completed by FGeneSH (http://linux1.softberry.com). To complete the partial sequence of Eucgr.K02197.1, we cloned the corresponding genomic fragment using the forward primer 5'-AATTGACCGCGGTTGGATA-3' and the reverse primer 5'-GAGCAGGCCAACATCCTCA-3', located upstream and downstream, respectively, of the undetermined sequence (N). Based on the sequencing result, we completed the missing part (1156 bp), corresponding to a part of the promoter region and a part of the 5' end of the CDS of Eucgr.K02197.1 (submitted to the GenBank data library under the accession number KC480258). All these manual curations enabled us to obtain 17 complete Eucalyptus ARF protein sequences. We then used them as queries in two additional searches: 1) BLASTP against the Eucalyptus proteome for exhaustive identification of divergent Eucalyptus gene family members, and 2) tBLASTn against the Eucalyptus genome to seek any possible non-predicted genes. For validation, we also used poplar ARF proteins as queries in the search procedure described above, and we obtained exactly the same result.
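The initial screening step of such a search can be sketched in Python as follows. This is only an illustration, not the authors' pipeline: it assumes the BLASTP results were exported in the standard 12-column tabular format (subject ID in column 2, E-value in column 11), and the function name `select_candidates`, the example identifiers and the E-value threshold of 1e-5 are hypothetical, since the text does not state which cutoff was applied at this stage.

```python
from typing import Iterable, Set


def select_candidates(blast_lines: Iterable[str], max_evalue: float = 1e-5) -> Set[str]:
    """Collect unique subject IDs whose E-value passes the cutoff.

    Assumes the standard 12-column BLAST tabular layout; the default
    cutoff is a hypothetical placeholder, not a value from the paper.
    """
    candidates = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            candidates.add(subject_id)
    return candidates


# Hypothetical hits for illustration only (not real gene models).
rows = [
    ["query_ARF_1", "subject_A", "62.5", "600", "200", "5", "1", "600", "1", "610", "1e-120", "430"],
    ["query_ARF_2", "subject_A", "58.0", "580", "230", "6", "1", "580", "5", "590", "3e-98", "360"],
    ["query_ARF_1", "subject_B", "31.0", "90", "60", "2", "10", "99", "40", "130", "0.4", "32.1"],
]
example_lines = ["\t".join(row) for row in rows]
print(sorted(select_candidates(example_lines)))
```

Collapsing hits into a set removes the redundancy that arises when several queries match the same gene model; the surviving candidates would still need the manual domain and gene-structure curation described above.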
In the course of the above identification process we completed and expertly re-annotated three partial sequences (accession numbers Eucgr.F02090.1, Eucgr.F04380.1, and Eucgr.K03433.1 in the Phytozome database) initially annotated in the Eucalyptus genome-sequencing project (Table 1). In addition, we found one gene (accession number Eucgr.K02197.1) that corresponded to a partial sequence for which the 5' end was not determined (1240 N as sequencing results). Information on chromosomal location was retrieved from the Eucalyptus genome browser (http://www.phytozome.net/eucalyptus). EgrARF genes were mapped to their loci using MapChart 2.2 [41].

Sequence, phylogenetic, gene structure analysis
Conserved protein motifs were determined by Pfam [42]. Multiple protein sequence alignment was performed using the Clustal X program (Version 2.0.11). Using the full-length sequences of all predicted proteins, phylogenetic trees were constructed with the MEGA5 program by the neighbor-joining method with 1000 bootstrap replicates. Exon-intron structures were extracted from Phytozome (http://www.phytozome.net/eucalyptus) and visualized in Fancy Gene V1.4 (http://bio.ieo.eu/fancygene/). The prediction of small RNA target sites in EgrARF genes was performed through the web application psRNATarget (http://plantgrn.noble.org/psRNATarget/). The stem-loop structures were predicted using the RNAfold web server (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi) and visualized by the RNAstructure 5.3 program.

Plant material
The provenance and preparation of the plant materials are described in Cassan-Wang et al. [43]. Hormone treatments were performed in an in vitro culture system. 10 µM NAA (1-naphthaleneacetic acid, for auxin), 20 µM gibberellic acid or 100 µM ACC (1-aminocyclopropane-1-carboxylic acid, for ethylene) was added to the medium of 65-d-old young trees, and trees were sampled 14 days after treatment.
Total RNA extraction, cDNA synthesis, quality controls and high throughput quantitative RT-PCR
All the procedures used for the qRT-PCR, from the RNA extraction to the calculation of transcript abundance, are described in Cassan-Wang et al. [43]. Only samples with an RNA integrity number >7 (assessed by Agilent 2100 Bioanalyzer) were retained for reverse transcription. cDNA quality was assessed as described by Udvardi et al. [44] using the housekeeping genes IDH and PP2A3 (primers listed in Table S1). Primer pairs were designed using the software QuantPrime (http://www.quantprime.de) [45] and are listed in Table S1. qRT-PCR was performed by the Genotoul service in Toulouse (http://genomique.genotoul.fr/) using the BioMark 96:96 Dynamic Array integrated fluidic circuits (Fluidigm Corporation, San Francisco, USA) described in Cassan-Wang et al. [43]. The specificity of the PCR products was confirmed by analysing melting curves. Only primers that produced a linear amplification and qPCR products with single-peak melting curves were used for further analysis. The efficiency of each primer pair was determined from the plot of amplification Ct values against a serial dilution of pooled cDNA, using the equation $E = 10^{(-1/\mathrm{slope})} - 1$. The $E^{-\Delta\Delta Ct}$ method was used to calculate the relative mRNA fold change compared to the control sample using the formula $(E_{\mathrm{target}})^{\Delta Ct_{\mathrm{target}}(\mathrm{control} - \mathrm{sample})} / (E_{\mathrm{reference}})^{\Delta Ct_{\mathrm{reference}}(\mathrm{control} - \mathrm{sample})}$ [46], and five reference genes (IDH, PP2A1, PP2A3, EF-1a and SAND, Table S1) were used for data normalization. We chose in vitro plantlets as the control sample because they contain the main organs and tissues of our study, such as stem, leaves, shoot tips, xylem, phloem and cambium, and because they are a relatively stable and less variable sample, being grown under the same in vitro culture conditions from one experiment to another.
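The efficiency and fold-change calculations described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than the authors' actual analysis code: the function names are hypothetical, a single reference gene is used instead of the five listed, and the base of each power is taken to be the amplification factor 1 + E (equal to 10^(-1/slope)), which is how efficiency-corrected ratios are commonly computed when E is defined with the "- 1" term.

```python
import math


def primer_efficiency(slope: float) -> float:
    """Efficiency E from the slope of Ct versus log10(template dilution)."""
    return 10 ** (-1.0 / slope) - 1.0


def relative_fold_change(e_target: float, ct_target_control: float, ct_target_sample: float,
                         e_reference: float, ct_reference_control: float,
                         ct_reference_sample: float) -> float:
    """Efficiency-corrected expression ratio of a sample relative to the control.

    Assumption: each power uses the amplification factor (1 + E) as its base.
    """
    delta_target = ct_target_control - ct_target_sample
    delta_reference = ct_reference_control - ct_reference_sample
    return ((1.0 + e_target) ** delta_target) / ((1.0 + e_reference) ** delta_reference)


# Hypothetical values for illustration only.
perfect_slope = -1.0 / math.log10(2.0)  # slope corresponding to one doubling per cycle
efficiency = primer_efficiency(perfect_slope)
fold = relative_fold_change(efficiency, 25.0, 23.0, efficiency, 20.0, 20.0)
print(round(efficiency, 3), round(fold, 3))
```

In this toy case the target crosses the threshold two cycles earlier in the sample than in the control while the reference is unchanged, which corresponds to a roughly fourfold increase at full efficiency. With several reference genes, one common option (an assumption here, not stated in the text) is to combine the reference terms through their geometric mean.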

Transactivation analysis in single cell system
To test the ability of ARF transcription factors to up- or down-regulate the expression of the auxin-responsive promoter DR5, the full-length cDNAs of the ARF transcription factors were cloned in the pGreen vector under the 35S CaMV promoter to create the effector constructs. The reporter constructs use the synthetic auxin-responsive promoter DR5 fused to the GFP reporter gene. Tobacco BY-2 protoplasts were co-transfected with the reporter and effector constructs as described in Audran-Delalande et al. [47]. After 16 h of incubation, GFP expression was quantified by flow cytometry (LSR Fortessa, BD Biosciences). Data were analysed using BD FacsDiva software. Transfection assays were performed in three independent replicates and 3000-4000 protoplasts were gated for each sample. GFP fluorescence corresponds to the average fluorescence intensity of the protoplast population after subtraction of the auto-fluorescence determined with non-transformed protoplasts. 50 µM 2,4-D was used for auxin treatment. We tested two independent protoplast preparations and, for each of them, performed three independent transformation replicates. Similar results were obtained with the independent protoplast preparations, and the data shown are from one of the preparations. For normalization, protoplasts were transformed with the reporter vector and the effector plasmid lacking the ARF gene.
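The fluorescence quantification described above can be sketched in Python as follows. This is a simplified illustration with hypothetical function names and made-up intensities; in particular, expressing the effector signal as a ratio to the empty-effector control is an assumption, since the text only states that this control was used for normalization.

```python
from statistics import mean
from typing import Sequence


def corrected_fluorescence(gated_intensities: Sequence[float], autofluorescence: float) -> float:
    """Mean GFP intensity of the gated protoplasts minus the auto-fluorescence
    measured on non-transformed protoplasts."""
    return mean(gated_intensities) - autofluorescence


def relative_activity(effector_signal: float, empty_vector_signal: float) -> float:
    """Reporter signal relative to the control lacking the ARF gene
    (ratio normalization is assumed here, not stated in the text)."""
    return effector_signal / empty_vector_signal


# Hypothetical intensities for illustration only.
background = 20.0
control_signal = corrected_fluorescence([120.0, 130.0, 110.0], background)
arf_signal = corrected_fluorescence([70.0, 75.0, 65.0], background)
print(relative_activity(arf_signal, control_signal))
```

Under this convention, a value below 1 would suggest that the tested ARF represses DR5-driven expression and a value above 1 would suggest activation.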

Identification and chromosomal distribution of Eucalyptus ARF genes
The procedure to identify all members of the ARF family in the E. grandis genome (JGI assembly v1.0, annotation v1.1; http://www.phytozome.net/cgi-bin/gbrowse/eucalyptus/) included expert manual curation, as illustrated in Fig. S1. It allowed the identification of 17 genes encoding full-length Eucalyptus ARF proteins (henceforth referred to as EgrARF genes). We named these genes according to their potential orthologs in Arabidopsis (Table 1). In silico chromosomal mapping of the gene loci revealed that the 17 EgrARF genes are scattered over nine of the eleven chromosomes, with one to three EgrARF genes per chromosome and with chromosomes 8 and 9 being devoid of ARF genes (Fig. S2).
The predicted proteins encoded by the EgrARF genes ranged from 593 to 1119 amino acid residues (Table 1), with isoelectric points (pI) in the range of 5.43-8.32, suggesting that they can work in very different subcellular environments. Sequence analyses of the predicted proteins and Pfam protein motif analysis showed that most of them (14 of the 17 predicted proteins) harbour the typical ARF protein structure comprising a highly conserved DNA-binding domain (DBD) in the N-terminal region composed of a plant-specific B3-type subdomain and an ARF subdomain, a variable middle region (MR) that functions as an activation or repression domain, and a carboxy-terminal dimerization (CTD) domain consisting of two highly conserved dimerization subdomains III and IV, similar to those found in Aux/IAAs (Fig. 1). We analysed and aligned the predicted amino acid sequences of the EgrARFs (Fig. 1 and Fig. S3). Four out of the 17 EgrARFs (10, 16A, 16B and 17) exhibited an additional short segment of amino acids (between 15 and 43 amino acids) in their DBD, between the B3 and ARF subdomains (Fig. 1 and Fig. S3). Such a feature has already been reported in Arabidopsis and soybean [36]. At the end of the DBD, all of the EgrARFs except EgrARF6A, 6B and 19A contain a conserved putative mono-partite nuclear localization signal (NLS) (Fig. S3) shown to direct the proteins into the nucleus [36,48].
The predicted protein structures of EgrARF3 and EgrARF17 lack dimerization domains III and IV, like their potential orthologs in Arabidopsis (Fig. 1 and Fig. S3). EgrARF24, which has no ortholog in Arabidopsis, has a truncated CTD since only Aux/IAA subdomain III is present. The percentage of EgrARFs displaying a truncated CTD (17.6%) is similar to that in Arabidopsis (17.4%), but lower than in rapeseed (22.6%) or tomato (28.6%) [24,37,50]. These truncated EgrARFs are predicted to be unable to interact with Aux/IAAs, a sequestration mechanism which may regulate their activity, and hence they are likely to be insensitive to auxin. However, ARF repressors seem to display very limited interactions with Aux/IAA proteins [18]; therefore, the lack of domains III and/or IV could also have consequences for the interaction of ARFs with other transcriptional regulators [49].
Compared to Arabidopsis, the ARF family in Eucalyptus is slightly contracted, with 17 versus 23 members. It is worth noting that we found exactly the same number of ARF genes in another Eucalyptus species, E. camaldulensis (http://www.kazusa.or.jp/eucaly/). Indeed, when compared to other species in which the ARF family has been characterized (Table 2), Eucalyptus and grapevine appeared to have the smallest families, with 17 and 19 members respectively, whereas poplar and soybean had the largest families, with 39 and 51 members, respectively. We did not find evidence that any of the 17 EgrARF genes arose by tandem, segmental, or whole-genome duplication, or even by the more ancient hexaploidization in the E. grandis genome [40], and it appears that any such duplicates have been lost in Eucalyptus, as is the case for 95% of whole-genome duplicates. This contrasts sharply with the intensive tandem duplication events found for Arabidopsis ARF members [14,51], the segmental duplication found in Populus [38], and the whole-genome duplication events in soybean [36].
As duplication and alternative splicing are the two main mechanisms involved in diversification of function within gene families, sometimes viewed as opposite trends in gene family evolution, we performed an in silico survey of the alternative transcripts predicted in the E. grandis genome JGI assembly v1.0, annotation v1.1 (http://www.phytozome.net/eucalyptus), and compared them to those in Arabidopsis (Table 1 and Fig. S4). More than half of the Eucalyptus ARF family members (10 out of 17) show evidence of alternative splicing (Fig. S4). Taking into account the number of possible alternative transcripts in Eucalyptus (17) and in Arabidopsis (15), the total numbers of possible transcripts in the two species become very similar, 34 and 38, respectively. Some of the transcripts resulted in truncated versions of the genes, like EgrARF1.4, 4.3 and 9B.2 lacking the Aux/IAA interaction domain and EgrARF2B.2 lacking the B3/DBD domain. We further compared the in silico predicted ARF alternative transcripts from E. grandis to those expressed in a dataset of in-house RNA-Seq data from E. globulus (Table S4, Fig. S5, File S1). Remarkably, the vast majority of the alternative transcripts predicted in E. grandis were found to be expressed in E. globulus, providing strong experimental support for their occurrence and conservation in the two Eucalyptus species. The importance of alternative splicing in the ARF family has been highlighted recently by Finet et al. [49], who have shown that two Arabidopsis alternative transcripts of AtARF4 have very different functions in flower development, and by Zouine et al. [35], who have shown that in tomato, one third of the ARF members display alternative splicing as a mode of regulation during the flower to fruit transition. In Arabidopsis and in many other species, not only domain rearrangement through alternative splicing but also extensive gene duplication played a significant role in ARF functional diversification [49], whereas in Eucalyptus the first mechanism appeared to be preeminent.

Comparative Phylogenetic analysis of the ARF family
To assess the relationship of Eucalyptus ARF family members to their potential orthologs in other landmark genomes, we constructed a comparative phylogenetic tree using all predicted ARF protein sequences from genomes of relevant taxonomic lineages. The core rosids were represented by Arabidopsis and Populus (Malvids), while the Myrtales, the Vitales and the Asterids were represented by Eucalyptus grandis, Vitis vinifera and Solanum lycopersicum, respectively. The monocots were represented by the Oryza sativa genome (Fig. S6). A simplified tree with only Arabidopsis, Populus and Eucalyptus (Fig. 2A) showed that ARFs are distributed into four major groups I, II, III, and IV. Eucalyptus (and also grapevine), which harbour the smallest numbers of ARF genes as compared to all other species (Table 2), have the fewest ARF proteins in each of the four groups. The positions and phases of the introns were well conserved within each group (Fig. 1 and Fig. 2), whereas their sizes were poorly conserved even within the same group. All five predicted Eucalyptus ARF transcriptional activators fell within group II, like their potential orthologs from Arabidopsis and other species; the remaining EgrARFs were distributed among the three other groups.
Some lineage-specific clades were found in the Solanaceae ARF family [35] as well as in the Arabidopsis ARF family [24]. In Arabidopsis, group I was substantially expanded, with a subgroup containing seven tandem duplicated genes (encoding proteins AtARF12 to 15 and AtARF20 to 22) and the sister pair AtARF11-AtARF18, for which orthologs were found only in Brassicaceae.
In group I, an isolated clade (highlighted in green in Fig. 2) contained EgrARF24 clustering with PtrARF2.5 and PtrARF2.6 and did not contain any obvious Arabidopsis ortholog. This clade was absent from the herbaceous annual plants (Arabidopsis, tomato and rice), but present in woody perennial plants (Eucalyptus, Populus and Vitis; Fig. S6). To verify whether this clade could be more specific to woody perennial plants, we performed a BLAST similarity search in 33 plant genomes available in Phytozome and found potential orthologs of EgrARF24 in 13 plant species out of 33 (Table S5), which are presented in a phylogenetic tree (Fig. 2B). Among these 13 plant species, 11 are trees such as M. domestica, C. sinensis, C. clementina, P. persica, or tree-like plants and shrubs such as C. papaya, T. cacao, G. raimondii, although the latter is often grown as an annual plant. A. coerulea and F. vesca are perennial herbaceous plants. The two notable exceptions are two members of the Fabaceae family (G. max and P. vulgaris), which are annual herbaceous plants. We thus considered this clade as woody-preferential. Regarding Group III, there was no evidence of large expansion of ARF3 and ARF4 genes in any of the three species, with only ARF3 duplicated in Populus. Group IV contained four members from Eucalyptus, i.e. one more than in Arabidopsis. All of the EgrARFs belonging to this group have in common an additional fragment (between 15 and 43 amino acid residues) within their DBD (Fig. 1 and Fig. 2) and, notably, alternative splicing was not detected for any of these genes in Eucalyptus and Arabidopsis (Fig. S4).

Prediction of small regulatory RNAs and their potential ARF targets
In Arabidopsis, several ARF genes are targets of microRNAs miR160 and miR167, or of a trans-acting short interfering RNA (tasiRNA) TAS3 [29,52-54]. Since these small RNAs and their targets are very often conserved across plant species [32,55,56], we searched for their potential orthologs in the Eucalyptus genome. Their chromosomal locations, genomic sequences and the sequences of their mature forms are presented in Table S6. We identified three potential Eucalyptus miR160 loci and three potential miR167 loci; all predicted gene products formed typical microRNA stem-loop structures (Fig. S7). The three EgrmiR160 genes encode a mature RNA identical to that in Arabidopsis. The three miR167 genes produce two different mature RNA forms (Table S6), whereas in Arabidopsis three different mature miR167 forms were detected. We also identified a potential TAS3 locus in the Eucalyptus genome (Table S6).
We used these newly identified Eucalyptus small RNAs as probes to search in silico for their target sites in EgrARF genes. Ten of the 17 EgrARF genes were found to be potential targets of these three small RNAs (Table S7). We identified highly conserved target sites for EgrmiR160 in EgrARF10, 16A, 16B and 17, for EgrmiR167 in EgrARF6A and 6B, and for EgrTAS3 in EgrARF2A, 2B, 3 and 4 (Table S7). The targeting of these three small RNAs to their corresponding target genes was highly conserved between Arabidopsis and Eucalyptus, suggesting common regulation of plant growth and development. For example, miR160, a highly conserved miRNA group across the plant kingdom, is known to target ARF10, ARF16 and ARF17 to regulate various aspects of plant development [30,52,53]. In Arabidopsis, miR167 regulates lateral root outgrowth [57], adventitious rooting [58], ovule and anther growth, flower maturation [20,29] and jasmonic acid homeostasis [59] by targeting both AtARF6 and AtARF8. Very recently, it has also been shown that miR167 regulates flower development and female sterility in tomato [60]. Because Eucalyptus is a woody perennial plant, one could expect that some small RNAs (for instance miR160 and TAS3) could be involved in the regulation of wood formation through targeting of ARF genes preferentially expressed in cambial cells or developing xylem.

Expression of EgrARFs in different Eucalyptus organs and tissues and in response to environmental cues
To start investigating the functions of the EgrARF genes, we assessed their transcript expression levels in various Eucalyptus organs and tissues by qRT-PCR, with special attention to vascular tissues (Fig. 3, Fig. S8 and Fig. S9). Transcript accumulation was detected for 16 EgrARFs in all 13 organs and tissues tested (Fig. 3), whereas EgrARF24 was detected only in shoot tips and young leaves (Fig. S8A). The very restricted expression profile of EgrARF24 is surprising, first because this gene belongs to a woody-preferential clade and second because its poplar orthologs PtrARF2.5 and PtrARF2.6 could be detected in xylem based on microarray expression data [38], PtrARF2.6 being highly expressed in developing wood (http://popgenie.org/). It should be noted, however, that this gene is truncated in E. grandis: it has lost domain IV, whereas PtrARF2.6 and their grapevine ortholog still have domains III and IV.

Figure 2 (B). Phylogenetic relationships between the orthologs of EgrARF24 in other species. EgrARF24 protein was used to blast 33 species genomes in Phytozome. An E-value of 1.0E-50 was used as a cutoff to select the potential ARF orthologs from each species. A phylogenetic tree was constructed using the same procedure as in (A), with AtARF2 as an outgroup. The species containing putative orthologs of EgrARF24 were the following: 1 Aquilegia coerulea, 2 Glycine max, 1 Phaseolus vulgaris, 1 Carica papaya, 2 Malus domestica, 1 Prunus persica, 1 Fragaria vesca, 1 Vitis vinifera, 2 Populus trichocarpa, 1 Citrus sinensis, 1 Citrus clementina, 2 Gossypium raimondii, 1 Theobroma cacao. doi:10.1371/journal.pone.0108906.g002
Heatmap representation (Fig. 3) indicated that EgrARF genes were expressed across various tissues and organs, but different members displayed a preference for particular tissues and/or organs and could therefore be clustered into three main expression groups. Group A is the smallest, with only two members, EgrARF10 (predicted repressor) and EgrARF19A (predicted activator), showing a relatively higher expression in vascular cambium as compared to other tissues and/or organs. EgrARF10 was expressed at a higher level in cambium (both mature and juvenile) than in differentiating xylem and/or phloem (Fig. 3 and Fig. S9). Its ortholog in Populus, PtrARF10.1, is highly expressed in developing xylem tissues [38], suggesting that AtARF10 orthologs in trees might be involved in wood cell differentiation, having a different or supplementary role as compared to that of the Arabidopsis sister pair AtARF10-AtARF16, whose mutants exhibit root cap defects and abnormal root gravitropism [30].
EgrARF19A was expressed at similar levels in the three vascular tissues (Fig. 3 and Fig. S9). Group B is the largest, with eight genes (EgrARF4, 6B, 6A, 3, 1, 9A, 9B, 17) expressed in all tissues, including vascular and non-vascular tissues (Fig. 3). The expression of EgrARF3 and EgrARF4 is highest in root, stem and phloem and differs from the specific expression of their Arabidopsis orthologs AtARF3 and AtARF4, which is associated with developing reproductive and vegetative tissues. This suggests that they might be involved in processes other than the control of the abaxial identity of the gynoecium and lateral organs shown in Arabidopsis [26]. Group C includes six genes (EgrARF2A, 2B, 5, 16A, 16B, 19B) preferentially expressed in leaves, floral buds and fruits and virtually absent from vascular tissues, particularly from cambium and xylem (Fig. 3 and Fig. S9). Like its Arabidopsis ortholog, EgrARF19B was highly expressed in root [24]. It should be noted that the activator EgrARF5 is highly expressed in all samples analysed, with the highest expression in in vitro plantlets. Because in vitro plantlets were used to normalize the expression data in the heatmap, the expression of EgrARF5 appeared in green in all other samples (Fig. 3). Its expression profile normalized using a different sample is given in Fig. S8.
Thirteen of the sixteen EgrARF genes examined (Fig. 3 and Fig. S9) exhibited higher expression in phloem than in xylem and/or cambium, suggesting that in Eucalyptus more EgrARF genes are involved in phloem than in xylem differentiation and/or function. EgrARF5 was equally expressed in phloem and xylem. In Arabidopsis, ARF5/MONOPTEROS (MP) is known to play a critical role in the specification of vascular stem cells [27], but its role in secondary growth driven by vascular cambium activity has not been explored hitherto. EgrARF10 and EgrARF19A were the only two genes more highly expressed in cambium and/or xylem than in other organs or tissues, supporting their possible involvement in the differentiation of meristematic cambium cells into xylem cells. No obvious difference in transcript levels was observed between juvenile and mature stages, either in cambium or in differentiating xylem (Fig. 3 and Fig. 4).
We further examined the responsiveness to bending stress of the eight EgrARF genes which showed moderate to high expression in vascular tissues (Fig. 4). Half of these EgrARFs were down-regulated in tension wood as compared to the control upright xylem, including three predicted repressors (EgrARF3, 4, and 9A) and one predicted activator (EgrARF6A). Conversely, in opposite xylem, four genes were up-regulated, including three predicted activators (EgrARF6A, 6B, 19A) and one repressor (EgrARF10). Only one gene (EgrARF4) was repressed. In general, EgrARF gene expression was repressed in tension xylem and induced in opposite xylem, except in the case of EgrARF4, which was down-regulated in both tension and opposite xylem (Fig. 4). These results are consistent with a study performed in Populus where seven ARF genes were detected in a poplar tension wood EST database and the majority of genes were down-regulated in tension wood as compared to opposite wood [61].
Recent studies indicated that high nitrogen fertilization affects xylem development and alters fibre structure and composition in Populus [62,63], and induces some effects on xylem cell walls that overlap with those of tension wood. Interestingly, EgrARF4 and EgrARF6A were down-regulated in tension wood and were also down-regulated when nitrogen was in excess (Fig. 4).

Effects of long-term hormone treatments on EgrARF transcript levels
Several hormones are known to regulate cambium activity and xylem formation [31,64 and references therein]. For instance, application of exogenous ethylene (ACC) to young poplar trees for 12 days was shown to stimulate cambial activity, while xylem cell size was decreased [65]. We performed similar long-term hormonal treatments (15 days) by growing young Eucalyptus trees on medium supplemented with either auxin, gibberellin or ethylene in order to evaluate the consequences on the transcript levels of the EgrARF genes in stems (organs enriched in xylem). The phenotypes of the Eucalyptus trees after hormonal treatments were typical of each hormone: gibberellin stimulated plant growth, resulting in longer stems; ethylene reduced plant growth and led to epinastic leaves; and auxin induced shortened and bolded roots (Fig. S10). All EgrARF transcripts except EgrARF24 were detected in young tree stems, and the expression levels of 13 were altered, mainly down-regulated, by long-term hormonal treatments (Fig. 5). Although long-term hormonal treatments are likely to have both direct and indirect effects on ARF expression, it is interesting to note distinct and differential behaviours. Five ARFs exhibited a kind of "hormonal preference" response, since their transcript levels were altered in stems treated with only one of the three hormones. For instance, EgrARF3 was up-regulated only in auxin-treated samples and EgrARF5 only in ethylene-treated samples, whereas EgrARF6A, EgrARF16A and EgrARF19B were altered only in gibberellin-treated samples. Most of the other ARFs were modulated to different degrees by the direct and/or indirect actions of each of the three hormones, with the notable exception of EgrARF4, which was down-regulated in stems treated with ethylene and gibberellic acid but not affected in those treated with auxin.

Transcriptional activities of EgrARF4, EgrARF10 and EgrARF19A
We decided to characterize the transcriptional activity of three ARF members: EgrARF10 and EgrARF19A, which were preferentially expressed in cambium/xylem, and EgrARF4, whose expression was modulated in xylem in response to mechanical stress and to nitrogen fertilization. For this purpose, tobacco protoplasts were co-transfected with an effector construct expressing the full-length coding sequence of the ARF under the 35SCaMV promoter and a reporter construct carrying the auxin-responsive DR5 promoter fused to the GFP coding sequence (Fig. 6A). DR5 is a synthetic auxin-responsive promoter made of nine inverted repeats of the conserved Auxin-Responsive Element (TGTCTC box) fused to a 35SCaMV minimal promoter. This reporter construct has been widely used to assess auxin-responsive transcriptional activation or repression in vivo and in planta [15,47]. The DR5-driven GFP showed low basal activity, which was induced up to 4-fold by exogenous auxin treatment (Fig. 6B). Co-transfection with the effector genes EgrARF4 and EgrARF10 resulted in a very significant (p<0.001) repression of auxin-induced reporter gene expression of 80% and 38%, respectively, thereby confirming their predicted repressor roles. On the other hand, the values obtained for EgrARF19A suggested that it could be an activator, as predicted by its sequence analysis, but this tendency was not strongly supported by Student's t-test.
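The fold induction and percentage repression reported above can be expressed as simple ratios of mean GFP fluorescence. The sketch below is one plausible way to compute them. It assumes that repression is measured relative to the auxin-treated empty-vector control, which the text does not state explicitly. The function names are hypothetical and the fluorescence values are invented for illustration; they are not data from the study.

```python
from statistics import mean

def fold_induction(basal, induced):
    """Ratio of mean auxin-induced to mean basal reporter fluorescence."""
    return mean(induced) / mean(basal)

def percent_repression(control_induced, effector_induced):
    """Percentage drop in auxin-induced fluorescence caused by an effector,
    relative to the empty-vector control (assumed definition)."""
    return 100.0 * (1.0 - mean(effector_induced) / mean(control_induced))

# Hypothetical triplicate fluorescence readings (arbitrary units).
empty_vector_no_auxin = [10.0, 10.0, 10.0]
empty_vector_auxin = [40.0, 40.0, 40.0]
repressor_auxin = [8.0, 8.0, 8.0]

induction = fold_induction(empty_vector_no_auxin, empty_vector_auxin)
repression = percent_repression(empty_vector_auxin, repressor_auxin)
```

With these invented readings the reporter is induced 4-fold by auxin and the effector removes 80% of the induced signal. Under the same definition, an activator would give a negative value.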

Conclusions
The ARF family in E. grandis contains 17 members (5 activators and 12 repressors) and is slightly contracted as compared to most angiosperm ARF families studied hitherto. In contrast to these species, it is characterized by the absence of whole-genome, segmental and/or tandem duplication events. Indeed, whole-genome duplication in Eucalyptus occurred 109.9 Mya, considerably earlier than those detected in other rosids, and 95% of the paralogs were lost [40]. The absence of tandem duplication is remarkable, especially because E. grandis has the largest number of genes in tandem repeats (34% of the total number of genes) reported among sequenced plant genomes. Indeed, tandem duplication shaped functional diversity in many gene families in Eucalyptus. The ARF family thus evolved in a very different way. Our data suggest that genomic truncation and alternative splicing were preeminent mechanisms leading to the diversity of domain architecture, shaping and increasing the functional diversity of the ARF family in Eucalyptus and thereby compensating for the lack of the extensive gene duplication found in other species. Comparative phylogenetic studies pointed out the presence of a new clade, maintained preferentially in woody and perennial plants. Finally, large-scale expression profiling allowed the identification of candidates potentially involved in the auxin-regulated transcriptional programs underlying wood formation.

*Using FGeneSH to complete the sequence. **Using specific primers to amplify the genomic DNA to complete the sequence. a Gene model of Eucalyptus (version 1.1) in Phytozome v8.0. b The best hit of EgrARF in Arabidopsis by using blastp. c The amino acid identity percentage between EgrARF and the corresponding AtARF. d Designation related to the Arabidopsis best hit. e The number of predicted alternative transcripts of EgrARF in Phytozome. f,g Location of the EgrARF genes on the chromosomes. h Length of the open reading frame in base pairs. i The number of amino acids, molecular weight (kilodaltons), and isoelectric point (pI) of the deduced polypeptides. doi:10.1371/journal.pone.0108906.t001

Figure 1. Gene structure of the EgrARF family. The information on exon-intron structure was extracted from the Phytozome database and visualized by using the FancyGene software (http://bio.ieo.eu/fancygene/). The sizes of exons and introns are indicated by the scale at the top. The domains of the EgrARF genes were predicted by Pfam (http://pfam.xfam.org/) and are indicated by different colours. The B3 and ARF subdomains together constitute the DNA binding domain (DBD). The CTD contains two sub-domains, III and IV. The TAS3 and microRNA target sites are marked on the corresponding target genes. The triangles indicate the insertion sites of additional short amino-acid segments between the B3 and ARF subdomains. doi:10.1371/journal.pone.0108906.g001

Figure 2. Phylogenetic relationships of ARF proteins between Eucalyptus and other species. (A) Phylogenetic relationships between ARF proteins from Arabidopsis, Populus and Eucalyptus. Full-length protein sequences were aligned by using the Clustal_X program. The phylogenetic tree was constructed by using the MEGA5 program and the neighbor-joining method with predicted ARF proteins. Bootstrap support is indicated at each node. The blue shade highlights the activators, and the green shade indicates the distinct, likely woody-preferential clade containing EgrARF24. (B) Phylogenetic relationships between the orthologs of EgrARF24 in other species. The EgrARF24 protein sequence was used as a BLAST query against the genomes of 33 species in Phytozome. An E-value of 1.0E-50 was used as a cut-off to select the potential ARF orthologs from each species. A phylogenetic tree was constructed using the same procedure as in (A), with AtARF2 as an outgroup. The species containing putative orthologs of EgrARF24 were as follows: 1 Aquilegia coerulea, 2 Glycine max, 1 Phaseolus vulgaris, 1 Carica papaya, 2 Malus domestica, 1 Prunus persica, 1 Fragaria vesca, 1 Vitis vinifera, 2 Populus trichocarpa, 1 Citrus sinensis, 1 Citrus clementina, 2 Gossypium raimondii, 1 Theobroma cacao. doi:10.1371/journal.pone.0108906.g002
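The E-value selection step mentioned in the legend of Figure 2B can be sketched as a simple filter over BLAST hits. The sketch assumes the standard 12-column BLAST tabular output, in which the subject identifier is the second column and the E-value the eleventh, and it treats the cut-off as inclusive. The function name, hit identifiers and scores are hypothetical; this is not the pipeline used in the study.

```python
E_VALUE_CUTOFF = 1.0e-50  # cut-off stated in the figure legend

def select_candidates(blast_lines, cutoff=E_VALUE_CUTOFF):
    """Return subject IDs of hits whose E-value is at or below `cutoff`.

    Assumes tab-separated BLAST tabular output with 12 columns:
    column 2 is the subject ID and column 11 is the E-value.
    """
    selected = []
    for line in blast_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        subject_id, e_value = fields[1], float(fields[10])
        if e_value <= cutoff and subject_id not in selected:
            selected.append(subject_id)
    return selected

# Hypothetical hits; identifiers and scores are invented for illustration.
hits = [
    "EgrARF24\tspeciesA_gene1\t62.1\t640\t220\t8\t1\t630\t5\t640\t3e-120\t410",
    "EgrARF24\tspeciesB_gene7\t35.4\t310\t180\t9\t20\t320\t15\t330\t2e-31\t120",
    "EgrARF24\tspeciesC_gene3\t58.0\t600\t230\t7\t1\t600\t1\t598\t1e-50\t190",
]
candidates = select_candidates(hits)
```

Only hits that pass the filter would then be aligned and placed in the tree. Whether a hit exactly at the cut-off is retained is a design choice made here, not something the legend specifies.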

Figure 3. Expression profiles of 16 EgrARF genes in various organs and tissues. The heat map was constructed by using the relative expression values determined by qRT-PCR of 16 EgrARF genes (indicated on the right) in 13 tissues and organs (indicated at the top) normalized with a control sample (in vitro plantlets). In the heat map, red and green indicate relatively higher and lower expression (log2 ratios) than in the control, respectively. Each measurement is the mean of three independent samples. The heat map and the hierarchical clustering were generated by MultiExperiment Viewer (MEV). doi:10.1371/journal.pone.0108906.g003

Figure 4. Effect of environmental cues and developmental stages on EgrARF expression. The heat map was constructed by using the relative expression values determined by qRT-PCR of EgrARF genes (indicated on the right) in various tissues and conditions (indicated at the top) normalized with a control sample (in vitro plantlets). In the heat map, red and green indicate relatively higher and lower expression (log2 ratios) than in the control, respectively. The heat map and the hierarchical clustering were generated by MultiExperiment Viewer (MEV). doi:10.1371/journal.pone.0108906.g004

Figure 6. EgrARF transcriptional activities in tobacco protoplasts. (A) Schemes of the effector and reporter constructs used to analyse the function of EgrARFs in auxin-responsive gene expression. The effector constructs express the EgrARF of interest driven by the 35S promoter. The reporter construct consists of a reporter gene expressing GFP driven by the auxin-responsive promoter DR5 (DR5::GFP). (B) Effector and reporter constructs were co-expressed in tobacco protoplasts in the presence or absence of a synthetic auxin (50 µM 2,4-D). GFP fluorescence was quantified 16 h after transfection by flow cytometry. A mock effector construct (empty vector) was used as a control. In each experiment, protoplast transformations were performed in independent biological triplicates. Two independent experiments were performed and similar results were obtained; the figure shows the data from one experiment. Error bars represent the SE of mean fluorescence. Statistically significant differences from the control (Student's t-test, P<0.001) are marked with **. doi:10.1371/journal.pone.0108906.g006

Table 1. ARF gene family in Eucalyptus.
). Where two EgrARFs matched the same potential Arabidopsis ortholog AtARFx, they were named EgrARFxA and EgrARFxB, with xA being the closest to the Arabidopsis ortholog, e.g. EgrARF2A and EgrARF2B. The percentages of identity between the Arabidopsis and the Eucalyptus predicted ARF protein sequences, and among the Eucalyptus ARFs themselves, are given in Tables S2 and S3, respectively. Eight Arabidopsis genes have no corresponding Eucalyptus orthologs (AtARF12 to 15 and 20 to 23), while only one EgrARF gene, EgrARF24, has no ortholog in Arabidopsis (Table

Table 2. Summary of ARF gene content in angiosperm species.