Cloning and functional characterization of seed-specific LEC1A promoter from peanut (Arachis hypogaea L.)

LEAFY COTYLEDON1 (LEC1) is a HAP3 subunit of the CCAAT-binding transcription factor that controls several aspects of embryonic and post-embryonic development, including embryo morphogenesis, storage reserve accumulation and skotomorphogenesis. Here, using genome walking, we isolated from peanut a 2707 bp sequence upstream of the ATG initiation codon of AhLEC1A, a homolog of Arabidopsis LEC1. Its transcription start site, confirmed by 5' RACE, is located 82 nt upstream of the ATG. Bioinformatic analysis revealed many tissue-specific elements and light-responsive motifs in the promoter. To identify the functional regions of the AhLEC1A promoter, seven plant expression vectors carrying the GUS (β-glucuronidase) gene, driven by a series of 5'-deleted fragments of the AhLEC1A promoter, were constructed and transformed into Arabidopsis. GUS histochemical staining showed that the regulatory region containing the 82 bp 5' UTR and 2228 bp of promoter drove GUS expression preferentially in Arabidopsis embryos at different developmental stages. Taken together, these results suggest that AhLEC1A expression during peanut seed development might be controlled positively by several seed-specific regulatory elements and negatively by other regulatory elements that inhibit its expression in other organs. Moreover, the GUS expression pattern of transgenic seedlings in darkness and in light was related to the light-responsive elements scattered across the AhLEC1A promoter, implying that these elements contribute to the regulation of skotomorphogenesis of peanut seeds and that AhLEC1A expression is inhibited after germinated seedlings are transferred from darkness to light.


Introduction
Seed development is a complex process in the life cycle of flowering plants and can conceptually be divided into two distinct phases: embryo morphogenesis and seed maturation.

Cloning of the 5' flanking region of AhLEC1A
Peanut genomic DNA was isolated from Luhua 14 leaves using the CTAB method [28]. Genome walking was performed to isolate the 5' flanking regulatory region. Following the manufacturer's instructions for the BD GenomeWalker Universal Kit (Clontech, USA), 2.5 μg aliquots of genomic DNA were digested separately with each of four restriction enzymes (DraI, EcoRV, PvuII and StuI), and the digested samples were ligated to the BD GenomeWalker adaptor, yielding four libraries (LD, LE, LP and LS, respectively). Based on the AhLEC1A genomic DNA sequence, two nested gene-specific primers (GSPs), LEC1AGSP1-2 and LEC1AGSP2-2, were designed. The first round of PCR was carried out in a 25 μL reaction using the adaptor primer AP1 provided with the kit and LEC1AGSP1-2 as the 5' and 3' primers, respectively, with 1 μL of each library as template. The nested PCR was performed in the same volume and under the same conditions with primers AP2 and LEC1AGSP2-2, using 1 μL of the 10-fold diluted primary PCR product as template. The specific PCR fragments from the second round were isolated and inserted into the vector pEASY-T3. Recombinants harboring the target fragment were validated by bidirectional sequencing on an ABI 3730 DNA sequencer. The primer and adaptor sequences used in this assay are listed in Table 1.

Precise identification of transcription start site in AhLEC1A
The transcription start site of the AhLEC1A gene was identified by 5' RACE (rapid amplification of cDNA ends) using the GeneRacer™ Kit (Invitrogen) following the manufacturer's instructions. Total RNA was extracted from developing seeds of peanut Luhua 14 using the improved CTAB method [29]. Double-stranded cDNA (ds-cDNA) was synthesized using full-length mRNA with the RNA Oligo as template and was cloned into the vector pCR4-TOPO to establish a full-length cDNA library. Based on the AhLEC1A cDNA sequence, two 3' gene-specific primers, TSS LEC1AGSP1-1 and TSS LEC1AGSP2-2, were designed for nested PCR. The 5' general primers for the two rounds of PCR were the GeneRacer™ Primer and the 5' Nested Primer, respectively. The templates for the first and second rounds were 1 μL of the full-length cDNA library and a 50-fold dilution of the primary PCR product, respectively. The nested PCR products were collected and sequenced on an ABI 3730 DNA sequencer. The primer sequences used in the assay are listed in Table 1.

Expression analysis of AhLEC1A gene in various organs
Expression analysis was performed by qRT-PCR on an ABI 7500 instrument. Gene-specific primers were designed according to the AhLEC1A cDNA sequence (Table 1). First-strand cDNAs of AhLEC1A were amplified using SYBR Premix Ex Taq polymerase (Takara). Relative expression levels were calculated by the 2^(-ΔΔCT) method [30], with AhACTIN7 as the reference gene. Three replicate samples, each measured in technical triplicate, were included in the experiment.
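The 2^(-ΔΔCT) calculation can be illustrated with a minimal Python sketch. The function name and all Ct values below are hypothetical and chosen only for illustration; they are not data from this study. The sketch also assumes near-100% amplification efficiency for both the target gene and the reference gene, which the method requires.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change of a target gene by the 2^(-ddCt) method.

    ct_target / ct_reference: technical-replicate Ct values of the target gene
    and the reference gene in the sample of interest.
    ct_target_cal / ct_reference_cal: the same for the calibrator sample.
    """
    d_ct_sample = mean(ct_target) - mean(ct_reference)              # dCt, sample
    d_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)  # dCt, calibrator
    dd_ct = d_ct_sample - d_ct_calibrator                           # ddCt
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: target vs. reference gene in seed, with leaf as calibrator.
fold_seed = relative_expression(
    ct_target=[22.0, 22.5, 21.5],
    ct_reference=[18.0, 18.25, 17.75],
    ct_target_cal=[27.0, 27.5, 26.5],
    ct_reference_cal=[18.0, 18.0, 18.0],
)
print(f"Relative expression in seed vs. calibrator: {fold_seed:.1f}-fold")
```

With these illustrative numbers, ΔCT is 4.0 in the sample and 9.0 in the calibrator, so ΔΔCT is -5 and the target is 2^5-fold more abundant in the sample.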

Plasmid construction and Arabidopsis transformation
A series of 5'-truncated promoter sequences was obtained by PCR using a single reverse primer located in the 5' UTR of AhLEC1A and different forward primers located at different sites in the AhLEC1A promoter (primer sequences are listed in Table 1). To construct the vectors, appropriate restriction sites were introduced into the PCR-amplified promoter fragments (HindIII at the 5' end; NcoI at the 3' end). Each PCR-amplified promoter fragment was then inserted into HindIII/NcoI-digested pCAMBIA3301, replacing the cauliflower mosaic virus (CaMV) 35S promoter and producing seven deletion constructs containing the following fragments: -2228~+82 (Q7), -1254~+82 (Q6), -935~+82 (Q5), -721~+82 (Q4), -617~+82 (Q3), -354~+82 (Q2) and -105~+82 (Q1). The constructs Q1-Q7 and the control pCAMBIA3301 were introduced into Agrobacterium tumefaciens strain GV3101 using a freeze-thaw method. Transgenic Arabidopsis plants were generated by the floral dip method. Seeds of the T0-T2 generations were germinated on 1/2 MS0 agar medium containing 10 μg/L Basta. The copy number in transgenic plants was determined from the segregation ratio of Basta-resistant to Basta-sensitive plants: T1 transgenic lines carrying a single-copy insertion show a 3:1 ratio of resistant to non-resistant plants. Homozygous T2 lines were screened on 1/2 MS0 medium containing Basta. More than eight homozygous single-copy lines were obtained for each of Q1-Q7 and the control pCAMBIA3301. The identified transgenic plants were transferred to soil and grown under 120 μmol·m⁻²·s⁻¹ light in a growth room at 22-25 °C. All Arabidopsis plants were grown under a 16 h light/8 h dark photoperiod at 65% relative humidity.
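One common way to judge whether an observed segregation ratio is compatible with the 3:1 ratio expected for a single-copy insertion is a chi-square goodness-of-fit test. The Python sketch below is an illustration under that assumption; the text above does not state which statistical test, if any, was applied, and the seedling counts are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1_ALPHA05 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_locus(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1 (alpha = 0.05)."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_DF1_ALPHA05


# Hypothetical seedling counts on Basta-containing medium.
for resistant, sensitive in [(152, 48), (188, 12)]:
    stat = chi_square_3_to_1(resistant, sensitive)
    verdict = "consistent with 3:1" if fits_single_locus(resistant, sensitive) else "deviates from 3:1"
    print(f"{resistant} resistant : {sensitive} sensitive -> chi2 = {stat:.2f} ({verdict})")
```

In this sketch, a line with 152 resistant and 48 sensitive seedlings would be kept as a candidate single-copy line, whereas 188:12 would be rejected because such a ratio points to more than one insertion locus.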

Histochemical GUS staining
The GUS assay was performed as described by Jefferson [31]. For each AhLEC1A promoter-GUS construct, at least thirty T2 plants from five transgenic events were used for GUS histochemical staining. Roots and leaves at the 4-leaf stage, stems at the bolting stage, flowers, immature embryos at 6-10 days after pollination, and 3-5-day-old etiolated and de-etiolated seedlings of transgenic T2 lines were incubated at 37 °C overnight in GUS assay buffer containing 50 mM sodium phosphate (pH 7.0), 0.5 mM K3Fe(CN)6, 0.5 mM K4Fe(CN)6·3H2O, 0.5% Triton X-100 and 1 mM X-Gluc, and then cleared with 70% ethanol. The samples were examined by stereomicroscopy.

Isolation of the promoter of AhLEC1A and localization of TSS
A 2739 bp DNA fragment was amplified by two rounds of PCR using genome walking. Sequence analysis showed that this fragment comprises 2707 bp of 5' flanking region upstream of the ATG and 32 bp of coding sequence (Fig 1). To determine the transcription start site (TSS) of the AhLEC1A gene, nested 5' RACE was performed to amplify the 5' end of its cDNA. A 140 bp cDNA fragment, comprising 58 bp of coding region starting from the ATG and an 82 bp 5' UTR, was isolated (Fig 2). The 82 bp 5' UTR sequence was identical to the corresponding 5' upstream sequence of the AhLEC1A gDNA, suggesting that the "A" located 82 nucleotides (nt) upstream of the ATG is the TSS of the AhLEC1A gene.

Analysis of cis-regulatory elements in AhLEC1A promoter sequence
In silico analysis of the 2707 bp 5' flanking region revealed a number of putative cis-elements in the 2625 bp promoter region and the 82 bp 5' UTR (Fig 3). Specifically, the basic promoter elements, a TATA box (TATATAT) and a CAAT box (CAAAT), were located at -36~-30 nt and -143~-148 nt, respectively. Many crucial elements required for embryo- or endosperm-specific expression and for the accumulation of seed storage compounds were scattered over the promoter, including two SKn-1 motifs (GTCAT) at -84~-80 nt and -475~-479 nt, two CANBNNAPA elements (CNAACAC) at -442~-448 nt and -1597~-1603 nt, and three binding sites for AGL15 (CWWWWWWWWG) at -1245~-1236 nt, -2079~-2070 nt and -2272~-2263 nt. In addition, four DPBFCOREDCDC3 elements (ACACNNG), previously reported to be involved in embryo-specific expression and in the response to ABA, were found at -111~-117 nt, -445~-451 nt, -1321~-1327 nt and -1330~-1324 nt. Many other regulatory elements associated with the accumulation of seed storage compounds and with embryogenesis were also detected, for instance eight EBOX BNNAPA (CANNTG), two 2S SEED PROT BANAPA (CAAACAC) and one SEF3 MOTIF GM (AACCCA). The promoter also contained elements associated with the regulation of vegetative organ development, such as the mesophyll-specific element CACTFTPPCA1 (YACT) and the root-specific element ROOTMOTIFTAPOX1 (ATATT). There also exist some elements involved in light responsiveness including more than a dozen I BOX (
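A motif search of this kind, in which consensus sequences written with IUPAC degenerate codes (for example CANNTG or ACACNNG) are located on both strands, can be sketched in a few lines of Python. This is only an illustration and not the tool used for the analysis above; the toy sequence, the function names and the coordinate convention (TSS = +1, first upstream base = -1, no position 0) are assumptions made for the example.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile a degenerate motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def scan(seq, motif):
    """Return sorted (start_index, strand) hits; start_index is 0-based on the forward strand."""
    seq = seq.upper()
    pattern = motif_regex(motif)
    n, m = len(seq), len(motif)
    hits = [(h.start(), "+") for h in pattern.finditer(seq)]
    # A hit at index j on the reverse complement covers forward indices n-j-m .. n-j-1.
    hits += [(n - h.start() - m, "-") for h in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)


def to_tss_coordinate(index, tss_index):
    """Convert a 0-based index to a TSS-relative position (assumed convention: TSS = +1)."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical toy sequence, not the AhLEC1A promoter.
TOY_SEQ = "GGCATATGTTACACTTGAAGTCATCC"
for name, motif in [("E-box", "CANNTG"), ("DPBF core", "ACACNNG"), ("SKn-1", "GTCAT")]:
    print(name, motif, scan(TOY_SEQ, motif))
```

Palindromic motifs such as CANNTG are reported once per strand at the same position, so counts for such elements should be interpreted accordingly.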

Functional analysis of the regulatory regions of the AhLEC1A promoter
To validate the role of the crucial regulatory regions in the AhLEC1A promoter, a series of GUS expression vectors (Q1~Q7) (Fig 4), driven by promoter fragments of different lengths with truncated 5' ends, were constructed, and the GUS expression patterns in stably transformed Arabidopsis plants were investigated. In the histochemical assay, GUS expression was observed specifically in the developing embryos of transgenic plants containing the Q7 construct (2228 bp promoter region and 82 bp 5' UTR; Fig 5). Consistently, qRT-PCR analysis of AhLEC1A expression showed that its transcripts were abundant in seeds but low or barely detectable in roots, stems, leaves and flowers (S1 Fig). In contrast, GUS staining was observed in all examined organs of transgenic plants carrying the Q3, Q4, Q5 or Q6 construct (Fig 5). These four promoter segments are 617 bp, 721 bp, 935 bp and 1254 bp in size, respectively, corresponding to 5' deletions of 1611 bp to 974 bp relative to Q7. This suggests that the promoter region between -2228 bp and -1255 bp contains key motifs that inhibit expression in organs other than developing seeds. Moreover, the further-truncated 354 bp promoter fragment Q2 drove GUS expression only in embryos and rosette leaves. The shortest fragment, Q1, containing a 105 bp promoter region and the 82 bp 5' UTR, could not drive GUS expression in any examined organ of transgenic Arabidopsis (Fig 5), implying that components necessary for gene expression had been deleted.
To explore the role of AhLEC1A in seedling establishment, transgenic lines carrying the Q7, Q5, Q3 and Q2 constructs were chosen for further analysis. Transgenic Arabidopsis seeds were kept in the dark until germination. GUS staining showed that the unexpanded cotyledons and apical hook of seedlings harboring the Q7 or Q5 construct were stained dark blue, and the hypocotyls light blue. However, after the etiolated transgenic seedlings had been moved to the light for 2 days, plants with the Q7 construct were hardly stained, and in plants with the Q5 construct only the expanded cotyledons were stained blue. By contrast, whole seedlings with the Q3 or Q2 construct were stained dark blue under both growth conditions (Fig 6). These results suggest that the -2228 bp~-618 bp region of the AhLEC1A promoter contains negative regulatory elements that control AhLEC1A expression in hypocotyls and radicles during seedling establishment, and that some of them might be associated with the light response.
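The deletion sizes and deleted intervals quoted in this section follow directly from the promoter lengths of the seven constructs. The short Python sketch below recomputes them from the construct coordinates given in the Methods; the helper names are hypothetical, and the interval convention (positions counted upstream of the TSS, with Q7 as the full-length reference) is an assumption of the example.

```python
# Promoter length (bp upstream of the TSS) of each construct, as given in the Methods.
PROMOTER_LENGTH = {"Q7": 2228, "Q6": 1254, "Q5": 935, "Q4": 721, "Q3": 617, "Q2": 354, "Q1": 105}
UTR_LENGTH = 82                    # 5' UTR shared by all constructs
FULL_LENGTH = PROMOTER_LENGTH["Q7"]


def deletion_size(construct):
    """Number of bp removed from the 5' end relative to Q7."""
    return FULL_LENGTH - PROMOTER_LENGTH[construct]


def deleted_region(construct):
    """Deleted interval as (distal, proximal) positions, or None for Q7 itself."""
    if construct == "Q7":
        return None
    return (-FULL_LENGTH, -(PROMOTER_LENGTH[construct] + 1))


def insert_length(construct):
    """Promoter plus 5' UTR, i.e. the fragment cloned upstream of GUS."""
    return PROMOTER_LENGTH[construct] + UTR_LENGTH


for name in ["Q7", "Q6", "Q5", "Q4", "Q3", "Q2", "Q1"]:
    print(name, insert_length(name), deletion_size(name), deleted_region(name))
```

For example, Q6 lacks 974 bp (positions -2228 to -1255) and Q3 lacks 1611 bp (positions -2228 to -618), matching the intervals discussed above.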

Discussion
Identifying and characterizing the 5' flanking region of a gene helps reveal its temporal and spatial expression pattern and facilitates its use in plant genetic engineering [32]. In the present study, we cloned and analyzed the 5' flanking regulatory sequence of AhLEC1A. Several cis-elements previously shown to be required for seed development and storage reserve accumulation were identified in the AhLEC1A promoter, including SKn-1, CANBNNAPA ((CA)n), AGL15, DOF core and the SEF3 motif. The SKn-1 motif and the (CA)n element were reported to play a vital role in determining seed-specific expression; deletion of the SKn-1 motif or the (CA)n element in the glutelin and napin promoters decreased their transcription in seeds [33,34]. The DOF core element is considered to confer endosperm-specific expression in Zea mays [35,36]. The SEF3 motif is the binding site of Soybean Embryo Factor 3, which regulates transcription of the β-conglycinin (a storage protein) gene and participates in seed development [37,38]. Our GUS staining assay revealed that the GUS gene driven by the longest AhLEC1A promoter fragment (Q7) was specifically expressed in the embryos of transgenic Arabidopsis, in good agreement with the gene expression results obtained by qRT-PCR (S1 Fig). These data indicate that AhLEC1A functions in a seed-specific manner. In contrast, transgenic lines carrying constructs with 5' deletions of 1611 bp to 974 bp relative to the Q7 promoter showed constitutive expression, with higher GUS levels in roots, rosettes, stems, flowers and seeds.
Meanwhile, in silico analysis of the AhLEC1A promoter revealed several tissue-specific elements, such as the mesophyll-specific element CACTFTPPCA1, the root-specific element ROOTMOTIFTAPOX1 and the pollen-specific element POLLEN1LELAT52, distributed across its upstream regulatory region, as well as many negative regulatory elements, including four WRKY71OS and five WBOXATNPR1 elements, concentrated in the -2228~-1255 bp fragment that was deleted in Q3~Q6. These results suggest that the seed-specific expression of AhLEC1A might be attributed to negative regulation of its transcription in vegetative organs by cis-elements in the distal region of its promoter, together with positive control of its expression in seeds by seed-specific elements in the proximal region. A similar regulatory model has been reported for the AtLEC1 promoter of Arabidopsis [27], the D540 promoter of rice [39] and the C-hordein promoter of barley [40].
Beyond embryogenesis and embryo development, LEC1 also regulates skotomorphogenesis of seedlings at the post-germination stage. The unexpanded cotyledons and apical hook of Q7 seedlings germinated in the dark were clearly stained blue, whereas whole seedlings were scarcely stained after transfer to light for 2 days. This suggests that light might repress AhLEC1A expression by recruiting proteins that bind particular elements in its promoter. We found a total of 23 light-responsive I BOX CORE/GATA BOX (GATA) elements scattered across the Q7 segment of the AhLEC1A promoter, of which 9, 15 and 21 were deleted in the Q5, Q3 and Q2 promoters, respectively. Correspondingly, in the GUS assay the staining patterns of Q2 and Q3 transgenic plants in darkness were similar to those in light, whereas the hypocotyls of Q5 transgenic plants were stained blue in darkness but were not stained after transfer to light. The core sequence of the I BOX, and the GATA BOX with a similar function, have been shown to be essential for light-regulated transcriptional activation [41-43]. Furthermore, the I BOX has been shown to act as a negative cis-element that inhibits the expression of GalUR in strawberry, and this inhibition is strictly dependent on light [44]. Yamagata et al. also found that the I BOX, as a negative regulatory element, is necessary for down-regulating the expression of the cucumisin gene by binding a fruit nuclear protein in musk melon (Cucumis melo L.) [45]. A previous study found that the AtLEC1 promoter contains several I BOX CORE elements, and that deletion of some of them, located in the 5' segment upstream of -436 nt, in the tnp mutant restrains hypocotyl elongation of etiolated seedlings in darkness [27]. These data suggest that some I BOX elements function as negative regulators in response to illumination.
In our GUS histochemical assay, the intensity and extent of staining changed with the number of I BOX elements, suggesting that some of them might be involved in negatively regulating AhLEC1A expression as seedlings transition from dark to light conditions.
Comparison of the cis-elements in the AhLEC1A and AhLEC1B promoters showed that many similar elements are dispersed in both promoters, but their numbers and positions differ considerably (Table 2). The AhLEC1A promoter contains a number of distinct seed development-related elements, such as 2S SEED PROT BANAPA, SP8BFIBSP8BIB and CANBNNAPA. By contrast, the AhLEC1B promoter contains numerous specific elements involved in responses to abiotic stress or hormones, including GCCCORE and ASF1MOTIFCAMV, as well as several regulatory elements known to enhance transcription levels in different plant species [46]. The similarities and differences between the two AhLEC1 promoters imply that their functions might be partially overlapping and redundant, while AhLEC1A and AhLEC1B might also play distinct roles during particular periods of peanut growth and development. This view is consistent with previous work proposing that LEC1 genes originated from a common ancestor and that neofunctionalization and/or subfunctionalization were responsible for the emergence of different roles for LEC1 genes in seed plants [47]. Moreover, during the evolution of cultivated peanut, the A and B subgenomes were subjected to asymmetric homoeologous exchanges and homoeolog expression bias. Yin et al. considered that the A subgenome was significantly affected by domestication, whereas natural selection favored the B subgenome [48]. We speculate that during genome evolution, to meet the demands of seed growth and development, the orthologous genes AhLEC1A and AhLEC1B were subjected to different selection pressures at different life stages, leading to their functional divergence.
In summary, we identified and characterized the promoter of AhLEC1A. During seed development and maturation, its expression in the embryo was regulated by positive cis-elements acting in a seed-specific mode and by negative elements restricting its expression in other organs. Moreover, AhLEC1A was also involved in skotomorphogenesis

Table 2. Comparison of regulatory elements in AhLEC1A promoter and AhLEC1B promoter.