Deep Sequencing of Maize Small RNAs Reveals a Diverse Set of MicroRNA in Dry and Imbibed Seeds

Seed germination plays a pivotal role in the life cycle of plants. As dry seeds imbibe water, energy metabolism and cellular repair resume, and miRNA-mediated regulation of gene expression is involved in these reactivation events. This research aimed to understand the role of miRNAs in molecular control during the seed imbibition process. Small RNA libraries constructed from dry and imbibed maize seed embryos were sequenced using the Illumina platform. Twenty-four conserved miRNA families were identified in both libraries. Sixteen of them showed significant expression differences between dry and imbibed seeds. Twelve miRNA families, miR156, miR159, miR164, miR166, miR167, miR168, miR169, miR172, miR319, miR393, miR394 and miR397, were significantly down-regulated, while four families, miR398, miR408, miR528 and miR529, were significantly up-regulated in imbibed seeds compared with dry seeds. Furthermore, putative novel maize miRNAs and their target genes were predicted. Target gene GO analysis was performed for novel miRNAs that were sequenced more than 50 times in the normalized libraries. The results showed that carbohydrate catabolism-related genes were specifically enriched in the dry seed, while in imbibed seed target gene enrichment covered a broad range of functional categories including genes in amino acid biosynthesis, isomerase activity, ligase activity and others. The sequencing results were partially validated by quantitative RT-PCR for both conserved and novel miRNAs and the predicted target genes. Our data suggest that diverse and complex miRNAs are involved in the seed imbibition process. In particular, miRNAs involved in plant hormone regulation may play important roles during the dry-to-imbibed seed transition.


Introduction
Plant gene expression is highly regulated to ensure proper development and function of tissues and adequate responses to abiotic and biotic stresses. Gene expression is often a multistep process and can be regulated at several levels. One of the most recently discovered regulatory mechanisms is post-transcriptional and involves 21-24 nt small RNA molecules [1]. The small RNA content of plant cells is surprisingly complex, suggesting an extensive regulatory role for these molecules [2]. The best-characterized class of plant small RNA is microRNA (miRNA) [3]. MiRNAs were first identified in Caenorhabditis elegans through genetic screens for aberrant development and were later found in almost all multicellular eukaryotes examined [4]. Mature miRNAs are single-stranded ~21 nt small RNAs that are generated from a single-stranded primary transcript by a series of enzymatic activities. Mature miRNAs down-regulate their target genes through the cleavage of mRNAs [5], translational repression [6], or transcriptional inhibition [7]. However, mRNA cleavage seems to be the predominant mechanism of miRNA-mediated regulation in plants [8]. Many miRNA targets in plants are transcription factors [9,10]. Transcription factors play key regulatory roles in plant development [10]; however, they are no longer needed after they have functioned and may even be harmful for the next developmental stage. To maintain normal plant development, miRNAs play crucial roles in the elimination of those unwanted factors [11].
As dry seeds imbibe water, energy metabolism and cellular repair resume. Potential miRNA-mediated gene expression regulation has been suggested in seed development, dormancy and germination [12,13]. For example, SPL13 (SQUAMOSA PROMOTER BINDING PROTEIN-LIKE 13) is a miR156 target gene, and down-regulation of SPL13 by miR156 appears to be essential for the transition to the vegetative-leaf stages. Mutated SPL13 that is resistant to miR156 over-accumulates at the post-germination stages and causes a delay in vegetative leaf development from the cotyledon-stage seedlings [14,15]. The transcription factor ARF10 (AUXIN RESPONSE FACTOR10) is the target of miR160. The mutant expressing a miR160-resistant ARF10 (mARF10) exhibited altered hormone sensitivity during seed germination and seedling growth in addition to a serrated leaf phenotype [16]. The balance between ABA and gibberellin (GA) is important for determining the dormancy status of seeds. ABA is abundant in dormant seeds and generally decreases during imbibition when seed dormancy is released [17], whereas GA increases during the transition to germination [18]. MiR159 was reported to target the MYB101 and MYB33 transcription factors, which are positive regulators of ABA signalling during Arabidopsis seed germination, suggesting that miR159 may play a role in seed germination [19].
In recent years, the development of high-throughput pyrosequencing technology has revolutionized the discovery of miRNA species. High-throughput technologies have allowed the identification of several non-conserved or lowly expressed miRNAs through deep sequencing, e.g. in Arabidopsis, wheat and tomato [20][21][22]. The discovery of non-conserved miRNAs suggests that plant species/families with specific developmental features may contain non-conserved miRNAs that are involved in the regulation of gene expression specific to those features. Characterizing the stage- and tissue-specific expression of miRNAs is becoming more important for understanding the regulatory mechanisms of critical events during plant development.
Maize is one of the most important crops worldwide as well as a model plant for biological research. A number of miRNAs with specific function have been reported in maize. The expression of APETALA2 floral homeotic transcription factor, which is required for spikelet meristem determination, is regulated by miR172 [23]. MiR172 also regulates the APETALA2-like gene glossy15 to promote vegetative phase transition [24]. Teosinte glume architecture 1 (tga1), which plays an important role in maize domestication, has been identified as a target of miR156 [25]. The miR166 functions in acting on the asymmetry development of leaves in maize by regulating a class III homeodomain leucine zipper (HD-ZIPIII) protein [26]. However, the small RNA population in maize is highly complicated and only limited numbers of miRNAs and their target genes have been analysed in seed research to date.
Plants go through distinct phases during early stages of development. Dry seeds imbibe water and re-initiate active physiology [27]. Reactivation events, such as the activation of genes encoding enzymes involved in starch degradation and in protein and DNA/RNA synthesis, play important roles in the decision as to whether a seed will germinate or not. The shift from the seed development/maturation mode to the germination mode is a critical change in the developmental program of seeds. Regulation of transcription factors targeted by miRNAs is involved at this critical stage in plant development [28]. To investigate the roles of miRNAs in both dry and imbibed maize seeds and to identify potential seed-specific miRNAs, we sequenced small RNA populations of dry and imbibed maize seeds using the Illumina platform. Our results indicated that diverse miRNAs were involved in the imbibition process of maize seed.

Deep Sequencing of Maize Small RNAs
In order to identify the miRNAs involved in both dry and imbibed seeds, small RNA libraries from dry and imbibed seed embryos (imbibed 24 hours) were sequenced side by side using the Illumina platform. The statistics of the small RNA sequences from the two libraries were summarized in the tables (Table 1 and  Table 2). A total of 16,991,216 and 11,321,391 raw reads were obtained from dry seeds and imbibed seeds, respectively. After removing the low quality sequences, adapter sequences, and sequences smaller than 18 nt, the remaining clean reads from the two libraries were aligned to the maize genome. A total of 12,341,357 and 9,735,651 clean reads were perfectly matched to the B73 genome (B73 RefGen_v2 (Release 5a.57 in May, 2010)). The two libraries generated 3,454,632 unique small RNA reads indicating a highly complex small RNA population in maize seed. For the total small RNAs, 75.89% were found in both libraries.
For the unique small RNAs, however, only 14.33% were shared by both libraries, 18.55% were dry-seed-specific and 67.12% were imbibed-seed-specific, indicating that a dynamic small RNA population was expressed in the imbibed seeds (Table 3). Figure 1 showed that some small RNAs were up-regulated in imbibed seeds, some were down-regulated, and some were roughly equally expressed in both dry and imbibed seeds. The sizes of the small RNAs were not evenly distributed in either library; however, the overall size distributions of the sequenced reads from the two sequencing efforts were very similar. The majority (>75%) of the small RNAs from both libraries were 20-24 nt, with the 24 nt class being the most abundant, followed by the 22 and 21 nt classes (Figure 2). This result was consistent with a recent report in maize [29] and with those for Medicago truncatula [30], rice [31], peanut [32] and Arabidopsis [33].
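The contrast between the share of total reads and the share of unique sequences common to both libraries can be illustrated with a minimal Python sketch. The function name `partition_unique` and the toy read counts below are hypothetical and are not taken from the study; the sketch only assumes that each library has been collapsed to a mapping from sequence to read count.

```python
from collections import Counter

def partition_unique(dry: Counter, imbibed: Counter) -> dict:
    """Summarize shared and library-specific sequences for two collapsed libraries."""
    dry_set, imb_set = set(dry), set(imbibed)
    shared = dry_set & imb_set
    total_unique = len(dry_set | imb_set)
    total_reads = sum(dry.values()) + sum(imbibed.values())
    # Reads (from either library) that belong to sequences seen in both libraries.
    shared_reads = sum(dry[s] + imbibed[s] for s in shared)
    return {
        "unique_shared_pct": 100 * len(shared) / total_unique,
        "unique_dry_only_pct": 100 * len(dry_set - imb_set) / total_unique,
        "unique_imbibed_only_pct": 100 * len(imb_set - dry_set) / total_unique,
        "reads_shared_pct": 100 * shared_reads / total_reads,
    }

# Toy example (hypothetical sequences and counts, not data from the paper):
# one abundant shared sequence and several rare library-specific ones.
dry = Counter({"AAA": 90, "CCC": 5})
imbibed = Counter({"AAA": 80, "GGG": 3, "TTT": 2})
stats = partition_unique(dry, imbibed)
print(stats)
```

In this toy case only a quarter of the unique sequences are shared, yet they account for most of the reads, which is the same qualitative pattern as a few highly expressed shared small RNAs alongside many lowly expressed library-specific ones.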

Conserved miRNAs
Conserved miRNA families were found in many plant species and had important functions in plant development. To identify conserved miRNAs in our dataset, all small RNA sequences were Blastn searched against the known mature miRNAs and their precursors in the miRNA database miRBase. There were currently 29 families containing 172 known miRNAs of maize in miRBase [34]. Aligning small RNA sequences to known miRNAs resulted in 1,461,286 and 1,394,428 matches for dry seed and imbibed seed, respectively. The statistics of the conserved miRNA families in dry and imbibed seeds was listed in the table (Table 4). Conserved miRNAs from dry and imbibed seeds had the following features in common. First, both libraries identified 24 conserved miRNA families consisting of 110 individual miRNAs (Additional File S1). The conserved miRNA families in both libraries showed similar abundance distribution. The most abundant families were miR156/miR157, miR166, miR168 and miR528, and the least abundant families were miR162 and miR399 ( Figure 3). The conserved miRNA families in both libraries had identical family member distribution. The largest miRNA family size identified was miR166 that consisted of 14 members and miR156/157, miR167 and miR169 possessed 12, 10 and 9 members, respectively; whereas miR162, miR529, miR827 and miR1432 had only one member detected in the maize seeds ( Figure 4). Second, the two libraries showed similar relative family member abundance pattern within a given miRNA family (Additional File S1). For instance, the sequencing frequencies for miR156d were 517,422 and 436,396, respectively, whereas the sequencing frequencies for miR156j were only 929 and 181, respectively. Third, 5 conserved miRNA families, miR395, miR444, miR482, miR2118 and miR2275 were not detected in both libraries. 
Fourth, the abundance of the originally annotated miRNA* of zma-miR397b was much higher (7,736 and 10,727 reads in the two sequenced libraries) than that of its annotated miRNA (5,240 and 893 reads in the two sequenced libraries). This may suggest that a small fraction of miRNA* strands do not degrade as fast as others [29]. Although the conserved miRNA families identified in dry and imbibed seeds showed similarities in several aspects, significant differences existed between them. Sixteen of the 24 conserved miRNA families identified showed significant expression differences between dry and imbibed seeds at the P = 0.01 level, with 12 significantly down-regulated and 4 significantly up-regulated in the imbibed seed compared with the dry seed. The 12 down-regulated miRNA families were miR156, miR159, miR164, miR166, miR167, miR168, miR169, miR172, miR319, miR393, miR394 and miR397. The 4 up-regulated families were miR398, miR408, miR528 and miR529. For both up- and down-regulated conserved miRNA families, abundances differed greatly. For example, miR159, miR393 and miR394 were down-regulated 7.7-, 5.3- and 5.5-fold, respectively, in imbibed seeds compared with dry seeds. However, their basal expression levels in dry seeds were only 2,027, 296 and 33 reads, respectively. In contrast, although miR156 and miR166 were both down-regulated only 1.5-fold in imbibed seed, their basal frequencies in dry seeds were 525,736 and 432,679, respectively, much higher than those of miR159, miR393 and miR394. Even though they were significantly down-regulated in the imbibed seeds, there were still 342,738 and 282,021 miRNA transcripts of miR156 and miR166, respectively, in the imbibed seeds. The same was true for the up-regulated families. MiR528 and miR408 were both up-regulated 2.6-fold, but the basal expression frequency of miR528 was 214,196 whereas the basal expression frequency of miR408 was only 2,571.
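As a worked check of the fold changes quoted above, the following Python sketch recomputes the down-regulation of miR156 and miR166 from the read frequencies given in the text. It assumes those frequencies are already normalized between libraries; the helper name `fold_down` is illustrative only.

```python
def fold_down(dry_reads: float, imbibed_reads: float) -> float:
    """Fold reduction from dry to imbibed seed (values above 1 mean down-regulation)."""
    return dry_reads / imbibed_reads

# Read frequencies (dry, imbibed) as quoted in the text for two abundant families.
frequencies = {
    "miR156": (525_736, 342_738),
    "miR166": (432_679, 282_021),
}

folds = {name: fold_down(dry, imb) for name, (dry, imb) in frequencies.items()}
for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold lower in imbibed seed")
```

Both families come out at about 1.5-fold, matching the text, even though the absolute read counts involved are two to four orders of magnitude larger than those of miR159, miR393 or miR394.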

Novel Maize miRNAs
It is always challenging to identify novel plant miRNAs because of their low level of expression and abundance. Some of the common features of miRNAs have been explored and different miRNA prediction programs have been developed to predict novel plant miRNAs. For example, one of the characteristics of miRNA precursors is their hairpin structure. However, a hairpin structure is not unique to pre-miRNAs; many other coding or non-coding RNAs, such as rRNA, tRNA and mRNA, also have similar hairpin structures [35]. Several studies observed that miRNA precursors have low folding free energy, and considered low free energy to be one important characteristic of miRNAs [36]. We previously developed a very stringent miRNA prediction program to predict novel maize miRNAs [37]. Briefly, a small RNA is considered a potential miRNA candidate only if it meets all of the following strict criteria: 1) the sequence can fold into an appropriate stem-loop hairpin secondary structure, 2) the small RNA sits in one arm of the hairpin structure, 3) there are no more than 6 mismatches between the predicted mature miRNA sequence and its opposite miRNA* sequence in the secondary structure, 4) there is no loop or break in the miRNA or miRNA* sequences, and 5) the predicted secondary structure has a higher minimal folding free energy index and negative minimal folding free energy. Based on the miRNA prediction criteria mentioned above, 357 and 580 novel miRNAs were predicted from dry and imbibed seeds, respectively (Additional File S2, S3, S4). They were mapped to all 10 chromosomes in maize. The lengths of the vast majority of the novel miRNAs were 20, 21 and 22 nt, and none of them were 24 nt small RNAs. More than half of the novel miRNAs began with a 5' uridine, which is a characteristic feature of miRNAs [35,36]. The predicted hairpin structures of the miRNA precursors were in the range of 64-371 nt, which was similar to those observed in rice [38].
The average minimal folding free energy of these miRNA precursors was -61.15 kcal mol⁻¹ according to Mfold3.2, which was similar to the computational values for Arabidopsis thaliana miRNA precursors (-57 kcal mol⁻¹) and much lower than the folding free energies of tRNA (-27.5 kcal mol⁻¹) or rRNA (-33 kcal mol⁻¹) [36]. Compared with conserved miRNA families, the read counts of novel miRNAs were very low, and the majority of them were sequenced fewer than 50 times. The most abundant novel miRNAs were miRgs297, which was sequenced 3,019 times in imbibed seed, and miRds40, which was sequenced 381 times in dry seed. We also looked for sequenced miRNA* sequences; only 46 complementary sequences were found in our combined dry and imbibed seed data sets. Most miRNA* sequences showed weak expression (sequencing frequency <10) and their expression levels were much lower than those of their corresponding miRNAs, consistent with the idea that miRNA* strands are degraded rapidly during the biogenesis of mature miRNAs [33]. In contrast to the conserved miRNAs, different sets of novel miRNAs were expressed in dry and imbibed seeds. Of the novel miRNAs identified, 280 were dry-seed-specific, 503 were imbibed-seed-specific, and only 77 were shared by both dry and imbibed seeds. Large numbers of novel miRNAs were specifically expressed in imbibed seeds (Figure 5).

Target Prediction of Maize Novel miRNAs and GO Analysis
To assess the function of the novel miRNAs from dry and imbibed maize seeds, putative targets of novel miRNAs were predicted using a stringent bioinformatic program. Targets of 301 novel miRNAs were successfully predicted. No target genes were found for the rest of 559 novel miRNAs. To gain a better understanding of the functional roles of the predicted miRNA target genes in maize, we did Gene Ontology (GO) analysis on putative targets. The targets were annotated by using the GO annotations available from B73 RefGen_v2. GO terms are commonly used to describe the functions of genes and gene products and to facilitate queries among genes from different organisms. Since the expression levels of the majority of the predicted novel miRNAs were low, targets of miRNAs with sequencing frequency greater than 50 reads were GO analysed. There were 65 miRNA families consisting of 99 miRNAs whose  reads were more than 50. Figure 6A showed target genes specifically enriched in the dry seed. Novel miRNA targets of carbohydrate catabolic related genes were specifically enriched in dry seed. Figure 6B showed target genes specifically enriched in the imbibed seed. Contrary to the dry seed, novel miRNA targets covered a broad range of functional categories in the imbibed seed. They included genes in amino acid biosynthesis, isomerase activity, ligase activity and others.

Discussion
Small RNA libraries from maize dry and imbibed seed embryos were sequenced in this study. The two libraries generated a total of 25,797,782 clean reads and 85.58% of them were perfectly matched to the B73 genome, indicating the overall good quality of the sequence data. The total clean reads consisted of 3,454,632 unique small RNA reads. These sequencing data suggested a highly complex small RNA population in maize seed. The two libraries shared 75.89% of the total clean reads; they shared, however, only 14.33% of the unique small RNA reads. These results indicated that the shared reads comprise fewer small RNA members, each with, on average, a high expression level, while the dry-seed-specific and imbibed-seed-specific small RNAs comprise more members, each with a low expression level. The low expression level of these specific unique small RNAs suggested the possibility that they act upstream in complex regulatory pathways. As to the unique small RNA reads, 18.55% were dry-seed-specific and 67.12% were imbibed-seed-specific, indicating that more unique small RNAs were expressed in imbibed seed. This result likely reflects the complicated physiological, biochemical and molecular changes that take place in imbibed seed and the diverse small RNAs needed to regulate gene expression there.
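The genome-matching percentage can be reproduced from the read counts reported in the Results. The short Python sketch below is only an arithmetic check; the variable names are illustrative.

```python
# Read counts as reported in the text.
total_clean_reads = 25_797_782   # clean reads from both libraries combined
matched_dry = 12_341_357         # dry-seed reads perfectly matched to B73
matched_imbibed = 9_735_651      # imbibed-seed reads perfectly matched to B73

matched_total = matched_dry + matched_imbibed
matched_pct = 100 * matched_total / total_clean_reads

print(f"{matched_total:,} of {total_clean_reads:,} clean reads matched "
      f"({matched_pct:.2f}%)")
```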
MicroRNAs are among the major players of the total small RNA population in regulating gene expression. Twenty-four conserved miRNA families were identified and 860 putative novel miRNAs were predicted in this study. Sixteen conserved miRNA families showed significant expression differences between dry and imbibed seeds, with 12 families down-regulated and 4 families up-regulated in imbibed seeds. Once seeds imbibe water, germination begins. Plant hormones, especially the balance between abscisic acid (ABA) and gibberellin (GA), play important roles in seed development, maturation, dormancy and germination. ABA is abundant in dormant seeds and generally decreases during imbibition when seed dormancy is released, whereas GA increases during the transition to germination [17]. The endogenous ABA content in dry seeds rapidly declines upon imbibition during the early phase of germination (within 6-12 hours) [39]. Many ABA signal transduction proteins are involved in seed germination [40][41][42][43]. In Arabidopsis, miR159, which targets MYB33 and MYB101, two positive regulators of ABA signalling, plays a vital role in seed germination [19]. In this study, GRMZM2G423833 and GRMZM2G070523, the homologs of AtMYB33 and AtMYB101, were predicted as target genes of miR159, and GRMZM2G070523 was validated by degradome sequencing (data not shown). MiR159 and miR319 show high similarity in their mature miRNA sequences, their target motifs are highly conserved in cereal and Arabidopsis GAMYB genes, and they often share the same target genes. GRMZM2G139688 and GRMZM2G028054 were predicted as target genes of both miR159 and miR319 in this study. GRMZM2G139688 was identified in our degradome dataset (data not shown). GRMZM2G028054 was validated as a target gene of miR159/319 by a 5' RACE cleavage assay (Figure 9). Their expression levels changed about 6- and 2-fold, respectively, in imbibed seed as measured by qRT-PCR (Figure 8).
These two target genes are members of the MYB transcription factor family and are homologous to AtMYB65 in Arabidopsis. AtMYB33, AtMYB65 and AtMYB101 are members of the GAMYB-like family and are post-transcriptionally regulated by miR159 [44]. In the absence of miR159, the deregulation of MYB33 and MYB65 up-regulated genes that are GA-induced during seed germination. These genes participate in aleurone vacuolation, a GA-mediated programmed cell death (PCD) process required for seed germination [45]. Thus, GAMYB-like genes participate in GA-induced pathways via miR159-mediated regulation during seed germination. The predicted target gene of miR169 was nuclear factor YA. Nuclear factor Y (NF-Y) is a highly conserved transcription factor present in all eukaryotic organisms, and is a heterotrimer consisting of three subunits (NF-YA, NF-YB and NF-YC) [46]. In our degradome dataset, NF-YA5 was the most abundant target gene of miR169. In Arabidopsis, the expression of NF-YA5 is regulated by miR169, and overexpression of NF-YA5 caused hypersensitivity to ABA during seed germination [47,48]. In rice, a cis-acting ABA responsive element (ABRE) was found in the upstream region of miR169n, suggesting that it may be ABA-regulated [49]. Upon seed germination, the ABA level decreases, and so do the abundances of miR159, miR169 and miR319. In this report, miR159, miR169 and miR319 were all down-regulated, with miR159 being down-regulated 7.7-fold, the most down-regulated miRNA family in imbibed seed.
Figure 6. Gene ontology classification of novel miRNA targets in dry seed and imbibed seed. A represents dry seed; B represents imbibed seed. The X-axis shows the categories of target genes. The Y-axis shows the percentage of genes mapped to each category and represents the abundance of the GO categories. The AgriGO web-based tool was used to analyze GO categories of genes showing changes in transcription levels. doi:10.1371/journal.pone.0055107.g006
When ABA level decreased, the expression of miR159 would be decreased. Its targets, MYB33 and MYB101, the two ABA positive regulators, would be up-regulated, which, in turn, would increase the level of ABA in imbibed seed. This is somewhat counter-intuitive. However, it was thought that this pathway played a pivotal role in resetting ABA responses in order for seeds to sense a decrease in ABA level. Probably, cells need to degrade positive factors of ABA signalling to reset the developmental program. Thus, miR159 may function in allowing a fast recovery from high ABA levels by regulating the genes of positive factors continually when the signal disappears [19]. This negative feedback of ABA signalling by miR159 may be important for seed in the shift from dormancy to germination [11].
MiR164, miR167 and miR393 were also significantly decreased in imbibed maize seed. Their predicted target genes were mainly involved in the auxin signal transduction pathway and downstream transcription factors. In Arabidopsis, miR393 has been shown to function via the auxin pathway by post-transcriptional regulation of the auxin receptors TIR1, AFB2 and AFB3 [50]. In rice, OsmiR393 was recently found to play an important role in the response to salt and drought stress by targeting two rice auxin receptor genes, OsTIR1 and OsAFB2 [51]. AUXIN RESPONSE FACTORs (ARFs) are the executors of auxin-dependent transcription and form the pivotal point in translating auxin signals into transcriptional responses [52]. MiR167 regulates ARF6 and ARF8, which are positive regulators of adventitious rooting in Arabidopsis [53]. In cultured rice cells, miR167 was shown to cleave auxin responsive factor 8 (ARF8) mRNA [54]. MiR164 has also been shown to play a role in plant development. In Arabidopsis, miR164 targets the transcription factor NAC1 to down-regulate auxin signals for lateral root development [55]. We propose that the down-regulation of miR393, miR167 and miR164 in imbibed seed might play important roles in the dry-to-germinating seed transition by regulating auxin perception, transduction and function.
Plants undergo several developmental transitions, including the transition from an embryonic to post-embryonic mode of growth, the juvenile-to-adult vegetative transition, and the vegetative-to-reproductive transition. The juvenile phase is an important and critical step during plant development to ensure maximum growth and productivity. MiR156 plays a crucial role in the control of the juvenile-to-adult transition in plants by targeting the SQUAMOSA PROMOTER BINDING PROTEIN LIKE (SPL) plant-specific transcription factors [56]. SPLs affect diverse developmental processes such as leaf development, shoot maturation, phase change and flowering in plants. In Arabidopsis thaliana, there are 17 members of the SPL family of transcription factors, and 11 of them are miR156 targets [57]. The rice genome contains 19 SPL genes, with 11 of those SPLs containing the target sites of OsmiR156. Overexpression of miR156 in Arabidopsis, rice and maize repressed the transcript abundance of related SPL genes and reduced apical dominance, delayed flowering time, caused dwarfism and increased total leaf numbers and biomass [58,57,23]. Teosinte glume architecture1 (tga1), which encodes an SBP-domain family protein as does Arabidopsis SPL13 and plays an important role in maize domestication, has been identified as a target of miR156 [25]. In maize, Corngrass1 (Cg1) mutants overexpress two tandem miR156 genes [23], which target teosinte glume architecture1 (tga1). The reduction of tga1 expression in Cg1 mutants affects the juvenile-to-adult transition, an important phase transition in plant development [25].
It was reported that 7 out of 8 switchgrass SPL genes exhibited higher transcript abundance in inflorescences than in other tissues at the reproductive stage [59]. A decline in miR156 abundance provides a permissive environment for flowering and is paralleled by a rise in SPL levels [60]. MiR156 seems to play the same role in the transition from dry seed (embryonic mode) to imbibed seed (post-embryonic mode) as in the juvenile-to-adult transition. MiR156 showed the highest abundance (525,736) in dry seed, possibly keeping SPL factors low to prevent germination. After 24 hours of imbibition, the transcript count was reduced to 342,738, a 1.5-fold reduction. The balance between miR156 and SPL appears to be important to maintain proper plant development and phase transition. Previous studies showed that miRNA gene regulation cascades exist and that the miR156 pathway acts upstream of the miR172 pathway. Moreover, targets of miR156 and miR172 exert positive feedback on the expression of miRNA genes that suppress themselves [61]. Compared with dry seed, miR172 was also significantly reduced in imbibed seed (2.4-fold). MiR166 was the second most abundant miRNA in the dry seed and showed a similar reduction in imbibed seed (from 432,679 to 282,021, 1.5-fold). Previous studies indicated that miR166 regulates plant leaf morphogenesis, vascular development, lateral organ polarity and meristem formation by targeting class III HOMEODOMAIN-LEUCINE ZIPPER (HD-ZIP III) transcription factors [62,63]. In dry seed, abundant miR166 may likewise be needed to keep the activities of HD-ZIP III transcription factors low.
Figure 8. Quantitative RT-PCR analyses of the relative expression levels of various predicted target genes. The maize housekeeping gene actin was used as the internal control. The values presented are means of three technical replicates ± SD. A-K represent the expression profiles of some predicted target genes of miR156, miR164, miR166, miR167, miR168, miR169, miR319, miR393, miR408, miR528 and zma-miRn6 in dry and imbibed seeds, respectively. doi:10.1371/journal.pone.0055107.g008
Not only were the miR156 and miR166 families abundant in dry and imbibed seeds, but they also had more family members than any other miRNA family, suggesting the importance of these two miRNA families in dry and imbibed seeds. MiR168 was also significantly decreased in imbibed seed. In Arabidopsis, the exclusive target of miR168 is AGO1, which is the core component of the RNA-induced silencing complex (RISC) [64]. AGO1 was also predicted as the target of zma-miR168a/b in maize root [65]. RISC associates with miRNA and inhibits target genes by mRNA cleavage or translational repression [66][67][68]. It is well documented that miRNAs play crucial roles in controlling a variety of developmental processes such as organ identity establishment, organ separation, hormone signalling, flowering time control and regulation of the plant stress response. ARGONAUTE (AGO) proteins are considered to be integral players in all known small RNA-targeted regulatory pathways [67]. The miR168-mediated feedback regulatory loop regulates ARGONAUTE1 (AGO1) homeostasis, which is crucial for gene expression modulation and plant development. MiR528 was up-regulated upon imbibition and was also the most abundant miRNA in the imbibed seed. MiR528 was found to be repressed in response to drought stress in leaves of Triticum dicoccoides [69]. Rice miR528 had been shown to be downregulated during the early submergence phase and induced after 24 h of submergence in maize roots [70]. There was a stable, strong repression of miR528 in both roots and shoots of maize under the low N condition [71]. Copper proteins, cupredoxin, multicopper oxidase and laccase genes have been predicted as targets of miR528, but only a Cu2+-binding domain-containing protein (CBP) was experimentally validated, in sugarcane [72]. MiR528 has not been found in Arabidopsis and is believed to be monocot-specific. MiR408 was also up-regulated in imbibed seed.
The Arabidopsis miR408 gene family has a single member, which regulates a subset of the laccase gene family (AtLAC) and plantacyanin transcripts [73,74]. Both LACs and plantacyanin are Cu-containing proteins. Plant laccases play a putative role in lignin biosynthesis [75], while plantacyanin functions in reproduction of Arabidopsis [76]. It is widely accepted that during seed germination, reactive oxygen species (ROS) such as superoxide radicals, hydroxyl radicals and hydrogen peroxide (H2O2) are produced in excess. ROS are efficiently scavenged by superoxide dismutase (SOD) [77] to minimize cell damage and to promote high germination capacity and vigorous seedling development. The potential target genes of miR408 and miR528 are predominantly involved in energy metabolism and scavenging of the oxidative species produced during stress. The distinct roles of miR528 and miR408 in seed germination need further investigation. Based on the information discussed above, we propose possible roles that miRNAs play in the very early stage of germination (Figure 10).
Novel miRNAs were predicted using a very stringent bioinformatic program and the majority of them showed a low level of expression. We chose those that showed more than 50 reads in the normalized libraries for target GO analysis. Interestingly, carbohydrate catabolism-related genes were specifically enriched in the dry seed library, suggesting that dry-seed-specific miRNAs suppress carbohydrate catabolism-related genes to keep the dry seed in the dry state. Alternatively, it may reflect the fact that dry seeds contain mRNAs stored during maturation (also called long-lived transcripts, to indicate that they survived desiccation) [78]. Over 10,000 different stored mRNAs have been identified in transcriptome analyses of Arabidopsis [79][80][81][82]. Similar numbers were found in barley and rice [83,84]. During maize seed maturation, carbohydrates are accumulated for storage, so it is not surprising that carbohydrate catabolism-related genes are specifically enriched in dry seed. However, target gene enrichment in imbibed seed was totally different from that in dry seed. The enriched targets covered a broad range of functional categories including genes in amino acid biosynthesis, isomerase activity, ligase activity and others. This may suggest that lowly expressed novel imbibed-seed-specific miRNAs regulate transcription, amino acid biosynthesis and translation so that they proceed at the proper speed.

RNA Isolation and Cloning of Maize Small RNAs
Maize (Zea mays) inbred line 87-1 was used in this study. Dry seeds were soaked in distilled water, wrapped in paper towels and incubated at 25°C for 24 hours in the dark. Embryos of imbibed seeds were used for RNA extraction. Embryos of dry seeds without soaking were used as the control. Briefly, total RNA was extracted from the two samples using the Trizol kit (Invitrogen, USA). The small RNAs of 16-28 nt were gel-purified by polyethylene glycol precipitation, separated on 15% denaturing PAGE, and visualized by SYBR Gold staining. Small RNAs were ligated sequentially to a 5' adaptor and a 3' adaptor and amplified by reverse-transcription polymerase chain reaction (RT-PCR), and the PCR products were subjected to sequencing [85]. The sequencing was performed on the Illumina platform (Beijing Genomics Institute, China).

Identification of Conserved and Novel miRNAs
Raw sequence processing and vector removal were carried out using the PHRED and CROSS_MATCH programs as previously reported [86,87]. Trimmed sequences of >18 nt were used for further analyses. First, the small RNA sequences were mapped to the maize ncRNAs deposited in the NCBI GenBank and Rfam databases, and known non-coding RNAs, including rRNA, tRNA, snRNA and snoRNA, as well as sequences containing a polyA tail, were removed. Then, the unique small RNA sequences were analyzed by BLAST search against the miRNA database (miRBase 17.0). Only small RNAs whose mature and precursor sequences perfectly matched known maize miRNAs were regarded as conserved miRNAs.
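The filtering order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_reads` and its arguments are hypothetical, exact string lookup stands in for the BLAST and database mapping steps, and polyA-containing reads are not handled because no threshold is given in the text.

```python
def classify_reads(reads, ncrna_seqs, known_mirnas, min_len_exclusive=18):
    """Toy version of the read filter: length cut-off, ncRNA removal,
    then perfect match to known miRNAs.

    reads        : iterable of trimmed read sequences (DNA or RNA alphabet)
    ncrna_seqs   : iterable of known ncRNA sequences (assumption: exact match)
    known_mirnas : dict mapping miRNA name -> mature sequence
    """
    def to_rna(seq):
        return seq.upper().replace("T", "U")

    ncrna = {to_rna(s) for s in ncrna_seqs}
    lookup = {to_rna(seq): name for name, seq in known_mirnas.items()}

    conserved = {}    # miRNA name -> read count
    unannotated = []  # reads left for novel miRNA prediction

    for read in reads:
        seq = to_rna(read)
        if len(seq) <= min_len_exclusive:   # keep only reads of >18 nt
            continue
        if seq in ncrna:                    # drop rRNA, tRNA, snRNA, snoRNA hits
            continue
        name = lookup.get(seq)
        if name is not None:                # perfect match to a known miRNA
            conserved[name] = conserved.get(name, 0) + 1
        else:
            unannotated.append(seq)
    return conserved, unannotated
```

In this sketch the reads that fall through every filter are simply collected; in the actual workflow they would go on to precursor prediction.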
To discover potential novel miRNA precursor sequences in our dataset, we used the identified mature miRNA sequences to run BLASTN searches against the maize genomic sequence. Sequences that met previously described criteria were then considered to be miRNA precursors [88]. A maximum of six unpaired nucleotides and a distance ranging from 5 to 240 nt between the miRNA and miRNA* were allowed. The selected sequences were then folded into a secondary structure using the RNA-folding program mFold 3.2 [89] (see Additional Files S5, S6 and S7). Specifically, dominant mature sequences residing in the stem region of the stem-loop structure, ranging between 20-22 nt and with a maximum folding free energy of −20 kcal mol⁻¹, were considered as potential novel maize miRNA candidates.
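As a minimal sketch, the numeric thresholds in this paragraph can be written as a single predicate. The `HairpinCandidate` record and its field names are hypothetical, the values are assumed to have been extracted beforehand from the folding output, and "maximum folding free energy" is read here as "at most −20 kcal/mol".

```python
from dataclasses import dataclass


@dataclass
class HairpinCandidate:
    """Hypothetical summary of one folded precursor candidate."""
    mature_seq: str        # dominant mature sequence in the stem
    unpaired_nt: int       # unpaired nucleotides between miRNA and miRNA*
    star_distance_nt: int  # distance between miRNA and miRNA* (nt)
    folding_energy: float  # folding free energy in kcal/mol


def passes_precursor_filters(c: HairpinCandidate) -> bool:
    """Apply the length, pairing, distance and energy thresholds."""
    return (
        20 <= len(c.mature_seq) <= 22          # mature length 20-22 nt
        and c.unpaired_nt <= 6                 # at most six unpaired nt
        and 5 <= c.star_distance_nt <= 240     # miRNA to miRNA* distance
        and c.folding_energy <= -20.0          # assumed reading of the energy cut-off
    )
```

Keeping the thresholds in one function makes it easy to see which criterion rejects a given candidate when the list is tuned.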

Target Gene Prediction and Analysis
The major steps and parameter settings for predicting target genes of miRNAs followed previous studies [58,90,91]. Briefly, the criteria were as follows: 1) no more than four mismatches between miRNA and target (G-U pairs count as 0.5 mismatches), 2) no more than two adjacent mismatches in the miRNA/target duplex, 3) no adjacent mismatches in positions 2-12 of the miRNA/target duplex (5' of miRNA), 4) no mismatches in positions 10-11 of the miRNA/target duplex, 5) no more than 2.5 mismatches in positions 1-12 of the miRNA/target duplex (5' of miRNA). Target genes of the identified novel miRNAs were predicted following these criteria (see Additional Files S8, S9 and S10). These targets were grouped by the biological function of the proteins they encode, as described by UniProt (http://www.uniprot.org/). The agriGO web service (http://bioinfo.cau.edu.cn/agriGO/analysis.php) was used for the gene ontology term enrichment test [92].
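The five criteria above can be illustrated with a small scoring function. This is a sketch under assumptions, not the pipeline used in the cited studies: the names are hypothetical, the duplex is assumed to be gap-free, the target site is supplied already oriented so that its i-th base faces the i-th base of the miRNA (counted from the miRNA 5' end), and a G-U pair is treated as a mismatch for the positional rules (2 to 4) while counting 0.5 toward the totals.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_penalty(m, t):
    """0 for a Watson-Crick pair, 0.5 for a G-U pair, 1 for any other pair."""
    if (m, t) in WATSON_CRICK:
        return 0.0
    if (m, t) in WOBBLE:
        return 0.5
    return 1.0


def passes_target_rules(mirna, target_paired):
    """Check criteria 1-5 for an ungapped miRNA/target duplex."""
    mirna = mirna.upper().replace("T", "U")
    target_paired = target_paired.upper().replace("T", "U")
    if len(mirna) != len(target_paired):
        raise ValueError("miRNA and paired target site must have equal length")

    pen = [pair_penalty(m, t) for m, t in zip(mirna, target_paired)]
    mism = [p > 0 for p in pen]  # assumption: G-U counts for positional rules

    # Longest run of consecutive mismatched positions (criterion 2).
    longest = run = 0
    for flag in mism:
        run = run + 1 if flag else 0
        longest = max(longest, run)

    # Adjacent mismatches with both positions inside 2-12 (criterion 3).
    seed_adjacent = any(
        mism[i] and mism[i + 1] for i in range(1, min(11, len(mism) - 1))
    )

    return (
        sum(pen) <= 4.0            # 1) total mismatch score
        and longest <= 2           # 2) at most two adjacent mismatches
        and not seed_adjacent      # 3) none adjacent in positions 2-12
        and not any(mism[9:11])    # 4) none at positions 10-11
        and sum(pen[:12]) <= 2.5   # 5) score within positions 1-12
    )
```

Real alignments may contain gaps or bulges, which this position-by-position comparison does not model.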

Quantitative RT-PCR for Mature miRNAs
Small RNAs were isolated from dry and imbibed maize seed embryos using the miRcute miRNA isolation kit (Tiangen, Beijing, China). A poly(A) tail was then added to the 3' end of the mature miRNAs by E. coli poly(A) polymerase, and the tailed products were used to initiate reverse transcription according to the supplier's manual for the miRcute miRNA first-strand cDNA synthesis kit (Tiangen, Beijing, China). The reverse transcription product was amplified using a miRNA-specific forward primer and a universal reverse primer.
The specific primers for quantitative RT-PCR (qRT-PCR) on the mature miRNAs were designed with the software Primer Premier 5.0 (PREMIER Biosoft Int., Palo Alto, CA, USA) (Additional File S11). For mature miRNAs, qRT-PCR with SYBR Green was performed on a CFX96™ Real-Time System (Bio-Rad, Hercules, California, USA). Briefly, each 20 µl PCR reaction contained about 5 ng of miRNA first-strand cDNA, 10 µl of 2× miRcute miRNA premix and 200 nM of each primer. The reactions were mixed gently and incubated at 94°C for 2 min, followed by 45 cycles of 94°C for 15 s and 61.5°C for 30 s. Finally, a melting stage from 65°C to 95°C was performed to confirm the absence of multiple products and primer dimers. U6 was used as an endogenous control for each sample. All samples were run in at least three technical replicates. Data were analyzed using Bio-Rad CFX Manager software (Bio-Rad, Hercules, California, USA).
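The text names U6 as the endogenous control but does not state how relative expression was calculated. Purely as an illustration, a commonly used approach is the 2^(-ΔΔCt) calculation sketched below; its use here is an assumption, the function name is hypothetical, and the formula presumes roughly 100% amplification efficiency for both the miRNA and U6.

```python
from statistics import mean


def relative_expression(ct_mirna_sample, ct_u6_sample,
                        ct_mirna_control, ct_u6_control):
    """Fold change of a miRNA in a sample versus a control by 2^(-ddCt).

    Each argument is a list of Ct values from technical replicates.
    """
    # Normalize the miRNA Ct to the U6 reference within each condition.
    d_ct_sample = mean(ct_mirna_sample) - mean(ct_u6_sample)
    d_ct_control = mean(ct_mirna_control) - mean(ct_u6_control)
    # Compare conditions; a lower Ct means a higher expression level.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)
```

With imbibed seed as the sample and dry seed as the control, a value above 1 would correspond to up-regulation during imbibition and a value below 1 to down-regulation.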

Validation of the miRNA Predicted Target Gene Expression Profiles by Quantitative RT-PCR
Total RNA was isolated separately from dry and imbibed seed embryos (24 hours imbibed) using the Trizol kit (Invitrogen, USA).