Elucidation of miRNAs-Mediated Responses to Low Nitrogen Stress by Deep Sequencing of Two Soybean Genotypes

Nitrogen (N) is a major limiting factor in crop production, and plant adaptive responses to low N are involved in many post-transcriptional regulation. Recent studies indicate that miRNAs play important roles in adaptive responses. However, miRNAs in soybean adaptive responses to N limitation have been not reported. We constructed sixteen libraries to identify low N-responsive miRNAs on a genome-wide scale using samples from 2 different genotypes (low N sensitive and low N tolerant) subjected to various periods of low nitrogen stress. Using high-throughput sequencing technology (Illumina-Solexa), we identified 362 known miRNAs variants belonging to 158 families and 90 new miRNAs belonging to 55 families. Among these known miRNAs variants, almost 50% were not different from annotated miRNAs in miRBase. Analyses of their expression patterns showed 150 known miRNAs variants as well as 2 novel miRNAs with differential expressions. These differentially expressed miRNAs between the two soybean genotypes were compared and classified into three groups based on their expression patterns. Predicted targets of these miRNAs were involved in various metabolic and regulatory pathways such as protein degradation, carbohydrate metabolism, hormone signaling pathway, and cellular transport. These findings suggest that miRNAs play important roles in soybean response to low N and contribute to the understanding of the genetic basis of differences in adaptive responses to N limitation between the two soybean genotypes. Our study provides basis for expounding the complex gene regulatory network of these miRNAs.


Introduction
Nitrogen (N) is an essential macronutrient of plants and its availability markedly affects crop growth and development [1]. The production of high-yielding crops always demands application of substantial N fertilizer [2]. However, due to incomplete capture and poor conversion of N fertilizer, 50-70% of N fertilizer is lost, which results in serious environmental pollution [3].Thus, lowering fertilizer input and improving N use efficiency of crops is urgent for improvement of current agricultural practice [4]. Studying the biological basis of the response of crops to low N is an essential step towards improving their N use efficiency. Several studies have been undertaken to decipher the response of plants to low N, and these studies elucidate the physiological and biochemical changes that are specifically involved in the response. These include the reduction of growth and photosynthesis, remobilization of N from old mature organs to actively growing ones, and accumulation of abundant anthocyanins [5][6][7][8][9][10][11]. Furthermore, the expression of many plant genes, such as those involved in N absorption and assimilation, carbon metabolism, photosynthesis anthocyanin synthesis, and protein degradation were found to be regulated by N limitation [12]. These studies had provided valuable insights into the plants response to N limitation; however, the mechanisms of responses are far from being completely understood. Recently, some miRNAs have been associated with nutrients limitation in plant, which further facilitate in understanding the adaptability of plants to N limitation [13][14][15][16][17].
miRNAs were initially identified through direct cloning and computational analysis [31][32][33][34], and most of them are of high abundance and highly conserved [35]. The advent of highthroughput sequencing technology has provided opportunity for the large-scale identification of low abundance miRNAs, thus rapidly increasing the total number of identified plant miRNAs [36][37]. Moreover, due to its reproducibility and quantitative feature, high-throughput sequencing can also be used to study the differential expression of miRNAs [38]. Up to now, the technology has been used to identify miRNAs in a large variety of plants such as Arabidopsis, rice, poplar, wheat and tomato [38,[39][40][41].
Soybean (Glycine max (L.) Merrill), the major legume crop worldwide, is an important protein source and economic crop. Although soybean can acquire N via its N-fixing symbiosis with rhizobacteria, exogenous N fertilizer is still applied to meet the demand of soybean growth, especially in high-yield production [42], so improving the N use efficiency of soybean is very important. Many miRNAs have been identified in soybean by both computational analysis and high-throughput sequencing [43][44][45][46][47][48]. However, none of these miRNAs were found to be associated with low N stress.
To identify low N-responsive miRNAs in soybean on a genomewide scale, we constructed 16 small RNAs libraries using samples from 2 different genotypes (low N sensitive and low N tolerant) subjected to various periods of low nitrogen stress. Through highthroughput sequencing and analysis, a total of 362 known miRNAs variants of 158 families and 90 new miRNAs of 55 families were obtained. Moreover, we also found that some soybean miRNAs showed differential expression patterns in response to low N stress. Some potential targets of these miRNAs were predicted to be involved in different biological functions. To the best of our knowledge, this is the first report of systematic investigation of low N -regulated miRNAs and their targets in soybean.

Plant Materials, Stress Treatments and Sampling
Two soybean cultivar, No.116 (low-N-tolerant soybean variety) and No.84-70 (low-N-sensitive soybean variety) were selected for this study [49]. Seeds of the two varieties were germinated and grown hydroponically with half-strength modified Hoagland solution (7.5 mM/L N) in the greenhouse as previously described [49], The nutrient solution was replaced with fresh solution every 2 days. After they had grown for 10 days until the first trifoliate leaves fully developed, seedlings were transferred to 1/10 N concentrations (0.75 mM/L N) half-strength Hoagland solution for different term (short-term: 0.5 h, 2 h, 6 h,12 h and long-term: 3 d, 6 d, 9 d, 12 d) low N stress treatment. Ca and K were compensated with CaCl 2 and K 2 SO 4 respectively at equivalent concentration in low-N Hoagland solution. Control treatment seedlings were maintained in normal N level half-strength Hoagland solution. Each treatment was replicated for thrice.
Roots and shoots of total fifteen seedlings with different time point treats were sampled separately, immediately frozen in liquid nitrogen frozen in liquid N and stored at 280uC for RNA extraction. Control treatment samples were collected at the same time.

Analysis of Sequencing Data
The raw data from Solexa sequencing were preprocessed to remove contaminant reads and clip adapter sequences, and the adaptertrimmed reads longer than 18 nt were used for further analysis as clean reads. The identical clean reads were grouped as unique sequences with associated counts of the individual reads, and each unique sequence was then mapped to the soybean genome (http://www. phytozome.net/search.php?show = text&method = Org_Gmax) using SOAP v1.11 and no mismatches were allowed [51]. The unique RNA sequences that perfectly matched soybean genome were retained for subsequent analysis.
To identify known miRNAs, those matched-genome unique RNA sequences were aligned with soybean stem-loop miRNA precursors from miRBase 19 (http://www.mirbase.org). Generally, in Solexa sequencing, many variants can also map on miRNA precursors besides annotated miRNA sequences, and the reads of variants are less abundant than those of annotated miRNA sequences. However, some research found that variants are more abundant in some cases, indicating that they could be utilized to refine the annotated miRNA sequences in the miRBase [52][53][54]. In our study, owing to the construction of 16 small RNA libraries, variants could be compared among libraries. if a variant was more abundant than the annotated miRNA sequences in most libraries, the variant could be convincingly used to substitute for the annotated miRNA sequences. Thus, the most abundant variants in each library were determined from the comparative analysis.
For novel miRNA predictions, these unique RNA sequences matched to known miRNAs precursors, rRNA etc deposited at GenBank (http://www.ncbi.nih.gov/GenBank/) and Rfam (http://rfam.sanger.ac.uk/) databases and genomic exon sequences in the sense strand were removed. 100 nt upstream and 100 nt downstream genome sequences flanking the remaining sequences were extracted to predict secondary structures using RNAfold [55], the resulting potential loci with good hairpin-like structures were then analyzed to predict novel miRNAs by Mireap (http:// sourceforge.net/projects/mireap/). Parameters were set based on authentic criteria for annotation of plant miRNAs [56].
Solexa data have been deposited into the NCBI database with accession number SRP021551.

Identifying miRNAs Responsive to low N Stress
In order to identify low N stress responsive miRNAs, the differentially expressed miRNAs(Known miRNAs and new miRNAs)between low-N stress library and corresponding control treatment library need to be investigated. The frequency of miRNA read counts was first normalized as transcripts per million (TPM), then the normalized expression levels of miRNA between the low-N stress and corresponding control samples was carried out to calculate fold change based on the following formula: Foldchange = log 2 (stress/control).If the normalized expression of a miRNA was zero in samples, this data was modified as 0.01; if the normalized expression of a miRNA in both compared samples was less than 1, the miRNA was not used in the analysis of differential expression. Next, Poisson distribution model for P-value calculation was used for estimating the statistical significance of miRNA expression changes under low N stress,and the formula shown below: P-value formula: Finally, the miRNAs meeting the following criteria were considered as differentially expressed miRNAs: (i) p-value should be less than 0.05; (ii) fold change or log 2 ratio of normalized counts between low-N stress and corresponding control library was greater than 1 or less than 21.

Quantitative Real-time PCR Analysis
RT-qPCR is widely used to measure the gene expression variation, and the choice of suitable genes to use as reference genes is a crucial factor for interpretation of RT-qPCR results. The expression of reference genes is known to vary considerably under different experimental conditions, therefore, these reference genes should be evaluated for their stability of expression. In the present study, two protein-coding genes (TUA5, ACT) and three miRNAs (gma-miR1520d-3p, gma-miR156b-5p, gma-miR166a-5p) were selected to analyze their expression stability in all eight No. 116 libraries. All primers of the nine candidate reference genes are listed (Table S1). Primer sequences for the two mRNA housekeeping genes were chosen based on current literature [57,58], and the stem-loop primers, used for miRNAs candidate reference genes, were designed according to Chen et al [59], which consisted of a self-looped 44 bp sequence (59-GTCGTATCCAGTGCAGGGTCCGAGGTATT-CGCACT-GGATACGAC-39) and 6 variable nucleotides that were specific to the 39 end of the miRNA sequence. RT-qPCR was performed as previously described [59,60]. Briefly, the total RNA was reverse-transcribed using miRNA specific stem-loop primers or Oligo(dT) with M-MLV (Takara, China), then cDNA products were used as template for qPCR with gene-specific primers and universal reverse primer (59GTGCAGGGTCCGAGGT-39). The qPCR reactions were performed in triplicates on IQ TM 5 and MyiQ TM Real-Time PCR Detection Systems (Bio-Rad) using SYBR-Green. Each PCR reaction was performed in a final volume of 20 ml containing containing 10 ml 26SYBR Premix Ex Taq II (TaKaRa, Japan), 0.25 mM of each primers and cDNA from reverse-transcribed from 100 pg total RNA using the following protocol: 95uC for 5 min, 40 cycles of 95uC for 10 sec, 60uC for 10 sec and 72 uC for 15 sec. At the end of the cycling protocol, a melting-curve analysis from 55 to 95uC was performed to determine specificity of the amplified products. All RT-qPCR reactions were performed with three biological replicates. RT-qPCR data was analyzed with geNorm software (V3.50) to determine the stability of candidate reference genes expression [61].
After determining the appropriate reference genes, 12 miRNAs were randomly selected for RT-qPCR assays to validate the reliability of Solexa/Illumina sequencing technology. These RT-qPCR assays were performed as described above. All the primers used are listed (Table S1). The relative expression levels of these miRNAs were calculated by delta-Ct method [62], which first transformed the Ct values of interest miRNA and reference genes to quantities using delta-Ct, then dividing the quantities of interest miRNA by the geometric mean of the reference genes. The mean and SD are calculated from the triplicate RT-qPCR assays. Student's t-test was used for statistical analysis of RT-qPCR data.

Prediction of Potential Target Genes for Differentially Expressed miRNAs
Target prediction of differentially expressed miRNAs was performed based on methods described by Allen et al. [63]. Mature miRNA sequences were used as query to search against the Glycine max unigene database by the psRNATarget server using the following stringent parameters : (i) No more than two mismatches between miRNA and its target (G-U bases count as 0.5 mismatches), (ii) No mismatches in positions 10-11 of miRNA and its target duplex. The functional annotation of identified putative miRNA targets were inspected on the Phytozome using the Blast2Go (B2G) software suite v2.3.1 with the default parameters.

An Overview of Small RNA Libraries Data Sets by Highthroughput Sequencing
In our previous study, the exposure of No.116 (low N tolerant variety) and No.84-70 (low N-sensitive variety) to short-term (0.5 h, 2 h, 6 h, 12 h) and long-term (3 d, 6 d, 9 d, 12 d) low N stress resulted in different morphological and physiological changes [49]. Furthermore, 3231 differentially expressed genes involved in 22 metabolism and signal transduction pathways were identified through digital gene-expression [49]. In this study, to identify miRNAs in response to low N stress, sixteen small RNA libraries were constructed and sequenced using Illumina HiSeq 2000, yielding a total of 361,296,585 sRNA raw reads (more than twenty million for each library). After removing low quality reads, adapters, poly-A sequences and short RNA reads smaller than 18 nucleotides, 348,651,354 (96.50%) clean reads including 52,351,387 unique sequences were obtained from all these libraries. For clean reads, 116RS library produced the least clean reads (19,583,940) and 116RL library yielded the most clean reads (23,859,206), while for unique sequences, 84SLC library generated the least unique sequences (955,962) and 116RL library showed the most abundant unique sequences (4,853,221). Although the average sequenced frequency of a unique sequence was from 4.8 (116RLC library) to 20.7 (84SLC library), over 74% of the unique sequences were only sequenced once in these libraries, indicating that sequencing was far from saturated (data not shown). These unique sequences were then perfectly mapped to the soybean genome using SOAP2 software, and the results showed that over 57.4% of unique sequences matched the soybean genome in these libraries. Among these libraries, the highest proportion of unique sequences mapped to the soybean genome came from the 84SS library (78.26%) and the lowest proportion come from the 116RLC library (57.45%) ( Table 1).
Size distribution of sRNAs based on both total sRNAs reads and unique sRNAs reads were analyzed ( Figure 1).The majority of the total sRNAs reads were found to be in the range of 21 nt to 24 nt in length in all 16 libraries. Three major peaks at 21 nt, 22 nt and 24 nt were observed in eight root libraries (Figure 1a), while 21-nt small RNAs in total sRNA reads were dominant in eight shoot libraries ( Figure 1b). The length distribution of unique sRNA reads revealed that the 24 nt sRNAs were the most abundant class in all libraries and it was followed by 21, 22 nt sRNAs ( Figure 1c and Figure 1d). Overall, although these small RNAs were unevenly distributed according to their length in all libraries, a proportion of small RNAs of a certain length was found to be similar among all libraries. To further compare the average abundance of sRNAs with different lengths, the ratio of raw and unique sequences was calculated, which found that the ratio of sRNAs varied along with length, and the 21 nt sRNAs showed the highest redundancy. The result was consistent with previous reports from other plant species [36,[64][65].

The Most Abundant miRNA Variants Corresponded to Soybean known miRNAs
In the process of detecting known miRNAs in soybean, we found that in some cases, the most abundant sequences among all unique sequences mapped to the known pre-miRNAs of soybean were not annotated miRNA sequences in miRBase. Thus, we turned to search for the most abundant variants and compared them among all 16 libraries to determine if these variants were more abundant than annotated miRNA sequences in most libraries. In sixteen libraries, a total number of 349 known pre-miRNAs belonging to 158 miRNA families were analyzed to determine the most abundant variants by investigating the distribution of unique RNA sequences on known pre-miRNAs. In some libraries,some of the known pre-miRNAs could not be well-supported by unique RNA sequences, therefore, numbers of known pre-miRNAs analyzed were different in every library, and number of pre-miRNAs (348) analyzed in 84RLC library was the most ( Table 2 and Table S2).
When the most abundant variants located on their corresponding pre-miRNA 59 arm or 39 arm were searched and compared in all 16 libraries, we found that although some of the most abundant variants were same in all 16 libraries, some of the most abundant variants did not coincide among 16 libraries. In order to further study the differential expression analysis, these most abundant variants that showed differences among 16 libraries needed be unified. Therefore, we followed the rule that if a unique RNA  Figure S1). Interestingly, for gma-miR4361-3p and gma-miR4368b-3p, their most abundant variants on respective miRNA precursors were almost the same in 8 libraries from the same variety, however, they were different between libraries from the two varieties (Table S2, Figure S1). This implied that the most abundant variants might be varietyspecific.  In classical plant miRNA biogenesis pathway, a pre-miRNA is cleaved into miRNA::miRNA*duplex, the miRNA of which joins with Argonaute (AGO) to form the RNA-induced silencing complex (RISC) to regulate gene expression. In most cases, miRNA*is quickly degraded, so it is found at a much lower frequency than miRNAs [66]. We assumed the most abundant variant on the whole stem of a pre-miRNA (59 arm or 39 arm) as miRNA variant, while the most abundant variant on the opposite arm of miRNA variant as miRNA* variant. The analysis of the most abundant variant showed that miRNA variants for over 50% of known pre-miRNAs were the same in these 16 libraries, while the miRNA* variants were less consistent (Table S2, Figure S1). For example, their corresponding most abundant variants of gma-miR167g, gma-miR169f and gma-miR394b were found to be located in their pre-miRNAs 59 arm (as miRNA variants) and were consistent in 16 libraries. They also had more sequence reads than the corresponding miRNA* variants located in the pre-miRNAs 39 arm, and the miRNA* variants were different in these 16 libraries. Certainly, miRNA variants could come from their pre-miRNAs 39 arm and were slightly more prevalent than miRNA variants from the pre-miRNAs 59 arm (Table S2). In addition, we found that for some miRNAs, such as gma-miR169l, gma-miR171b and gma-miR4414, their corresponding miRNA variants in some libraries were located on the pre-miRNAs 59 arm, while in other libraries located on the pre-miRNAs 39 arm (Table S2, Figure S1). These results revealed the alternative use of the pre-miRNAs 59 and pre-miRNAs 39 arms as well as the complexity of the mature miRNA variants generating processes in different libraries.
For analysis of these unified miRNA variants' sequenced reads, these unique RNA sequences located between +2 and 22 nt away from unified miRNA variants on their corresponding pre-miRNAs were included in our calculations. We found that gma-miR3522 5p was the most abundantly expressed and was sequenced 79410810 in 16 libraries, followed by some species of gma-miR1507a/b-3p, gma-miR156d/g/i/j/l/m-5p and gma-miR166u-3p. We also calculated sequenced reads of miRNA* variants, and the miRNA* variants were considered to be identified even if miRNA* variants were sequenced once in one library of 16 libraries, thus a total of 267 miRNA* variants were identified in 16 libraries (Table 2 and Table S2).
Previous studies showed that a single miRNA precursor could produce two or more distinct miRNAs [53]. In this study, we also found that some pre-miRNAs, such as pre-miR159a, pre-miR319a, pre-miR394a, and pre-miR2118a could generate distinct abundant sequences on their 59 arm or 39 arms, the positions of which didn't overlap and the sequence frequencies were high enough. Most importantly, we found most of their corresponding miRNA*s sequences, indicating they were likely to be the products of DCL1 processing. These distinct abundant sequences from the same precursor could be annotated as different miRNA variants of the same precursor, amplifying analyzed known miRNA variants to 362 in 16 libraries (Table 2, Table S2, Figure S1).
To further characterize these miRNA variants, we compared them with annotated mature soybean miRNAs in miRBase (Table  S2, Figure S1, Figure 2). Though many annotated miRNAs were the most abundant sequences on their corresponding pre-miRNAs in all libraries, almost 50% of the miRNA variants were not consistent with annotated miRNAs in miRBase. For example, for gma-miR160d, gma-miR2109, and gma-miR482b, although the most abundant variants on both arms of respective pre-miRNAs were different with annotated miRNAs, they were the same in these 16 libraries and could form miRNA::miRNA* duplex, supporting that they were likely authentic miRNAs or miRNA*s and could substitute for the annotated miRNAs. In addition, we found that some miRNA variants were located on the opposite arms of the annotated miRNAs on their corresponding pre-miRNAs. For example, for gma-miR171e, gma-miR159d and gma-miR390c, their corresponding miRNA variants were located on the corresponding pre-miRNAs 59 arms, whereas their annotated miRNA were located on the corresponding pre-miRNAs 39 arms.

Putative Novel miRNAs in Soybean
Besides the unique RNA sequences aligned known miRNA precursors, there were still numerous unclassified small RNAs in these 16 libraries, some of which might be novel miRNAs. The Mireap software was used to predict novel miRNAs with adjusted parameters, which were suitable for plant miRNAs identification [67]. Approximately, 3000 loci were predicted with miRNA precursor-like stem-loop structures in the 16 libraries. Since the most abundant sequences on these new loci weren't always consistent among the libraries, we adopted the above rule of known miRNA variants to unify them to count sequenced frequency. To improve accuracy of a new miRNA prediction, we adopted the following critical criterion: (1) miRNA* was detected in at least one library, (2) total sequenced frequency of a novel miRNA in 16 libraries were over 100, as the use of low expression miRNAs is prone to high false positive. Thus only 90 loci belonging to 55 miRNA families were annotated as new miRNAs, and each library had varied novel miRNAs (Table 3 and  Table S3). Generally, these new miRNAs were named temporarily in the form of novel-soy-number-3p/5p,. For some paralogous loci of newly identified miRNAs that could be classified into the same families, they were designated as novel-soy-number-letters-3p/5p. For example, novel-soy0055 have nine paralogous loci that were named novel-soy0055a,novel-soy0055i, respectively (Table S3). According to the sequencing reads, novel-soy0001-3p was found to be the most abundantly expressed novel miRNA, followed by novel-soy0045-5p, novel-soy0055i-3p and novel-soy0027-3p (Table S3).
In accordance with the known miRNAs, these newly identified miRNAs derived from predicted hairpin structures ranged from 80 to 376 nt, and the minimum folding free energy (MFE) for these structures hairpin structures was found to be less than 20 kcal mol 21 . Almost 50% of these newly identified miRNAs were 21 nt miRNAs beginning with a U, and the pre-miRNAs 59 and pre-miRNAs 39 arms were alternatively used as sources of miRNA in different libraries (Table S3).
To confirm whether these new miRNAs were homologous with known miRNAs, we compared them with the known miRNAs from all plant species deposited in miRBase. Our results showed that eight new miRNAs or miRNAs*(novel-soy0001, novel-soy0006, novel-soy0013, novel-soy0020, novel-soy0025, novel-soy0026, novel-soy0044 and novel-soy0045) were orthologues of known miRNAs identified in different plant species (Table S3). In addition, although most new miRNAs were independently transcribed from intergenic regions of the genome, we found that a few new miRNAs loci were located in the introns or exons of the protein-coding genes. The positions of these new miRNAs on protein-coding genes as well as the functions of protein-coding genes have been summarized (Table 4). These genes are involved in distinct plant physiological processes except that the functions of some genes were unclear.

The miRNAs Responsive to Short-term and Long-term N Limitation in Soybean Roots
To identify low N-responsive miRNAs (known miRNAs and new miRNAs) in soybean roots, we compared miRNA expression profiling between short-term or long-term low N stress and corresponding control libraries in both genotypes. Specifically, the sequenced amount of a specific miRNA was first normalized as transcripts per million (TPM), then the log 2 ratios between low N stress and corresponding control libraries and P-value based on Poisson distribution model were calculated. To minimize noise and improve accuracy, we only selected the miRNAs with sequence reads over 100 in at least one library for comparison. P-value #0.05 and the absolute value of log 2 ratio $1.0 as a threshold were used to judge the statistical significance of miRNA expression. The miRNAs with log 2 ratio $1.0 were designated as 'up-regulated', while the miRNAs with log 2 ratio #21.0 as 'downregulated' (Figure 3, Figure 4 and Table S4). Results showed that in No.116 variety (tolerant genotype), 14 known miRNAs belonging to 5 miRNA families were found to be significantly differentially expressed in response to short-term low N stress, and these 14 miRNAs were all down-regulated and reduction in gma-miR2109-3p expression was the highest. In No.84-70 variety (sensitive genotype), 13 known miRNAs belonging to 8 miRNA families were identified to be significantly differentially expressed. Among 13 known miRNAs, 5 known miRNAs (i.e. gma-miR1510a-5p,gma-miR396b/d/g-3p) were up-regulated while the other known miRNAs (i.e. gma-miR1512a-5p, gma-miR5372-5p) were down-regulated. Among the significantly  differentially expressed miRNAs responsive to short-term N limitation from both genotypes, 3 known miRNAs (gma-miR408a/b/c-5p) were found to be common and down-regulated. Under long-term low N stress condition, 36 known miRNAs belonging to 12 miRNA families as well as one novel miRNA (novel-soy0006-3p) were identified to be significantly differentially expressed in No.116 variety. Among these 36 miRNAs, 15 known miRNAs (i.e.gma-miR396b/d/g-3p, gma-miR482a/c-3p, gma-miR2109-3p) and the novel miRNA were found to be downregulated while the other known miRNAs were up-regulated. In No.84-70 variety, 34 known miRNAs belonging to 15 miRNA families were identified to be significantly differentially expressed. Among 34 known miRNAs, 12 known miRNAs (i.e. gma-miR1512b -5p, gma-miR171c/e-5p, gma-miR482a-5p) of which were up-regulated while the other known miRNAs (i.e. gma-miR156b/f-5p, gma-miR169c/p/s-5p) were down-regulated. Among these significantly differentially expressed miRNAs responsive to long-term N limitation from both genotypes, 16 known miRNAs in 7 miRNA families (i.e.gma-miR1512c-5p, gma-miR159a-3p-1, gma-miR159b/f-5p-2, gma-miR169c/e/h/ p/s-5p, and gma-miR862a/b -5p) were common. To identify the common significantly differentially expressed miRNAs under both short-term and long-term low N stress conditions, miRNAs significantly differentially expressed in response to short-term or long-term low N stress were further compared. The results showed that in No.116 variety roots, 3 significantly differentially expressed known miRNAs belonging to 2 miRNA families(gma-miR159a/e-3p-1, gma-miR2109-3p) were common under short-term and long-term low N stress conditions, while in No.84-70 variety roots, 2 known miRNAs belonging to 2 miRNA families (gma-miR1510a-5p, gma-miR5559-5p) were common under both short-term and long-term low N stress conditions. Among both short-term and long-term low N-responsive miRNAs from roots of both genotypes, no significantly differentially expressed miRNA was found to be common.

Comparison of the Soybean Shoots miRNAs with Roots miRNAs Responsive to Short-term and Long-term N Limitation
The miRNAs that were found significantly differentially expressed in response to short-term or long-term low N stress in soybean shoots were further compared with significantly differentially expressed miRNAs in soybean roots (Table S4). The results showed that in No.116 variety, 9 known miRNAs belonging to 3 miRNA families (gma-miR159a/e-3p-1, gma-miR160/b/c/d/e-5p, gma-miR2109-3p, gma-miR2109-5p) were significantly differentially expressed in response to short-term low N stress in both shoots and roots of No.116 variety. Under long-term low N stress condition, 5 known miRNAs belonging to 4 miRNA families(gma-miR159a/e-3p-1, gma-miR2109-3p, gma-miR397b-3p, gma-miR408d-5p) were significantly differentially expressed in both shoots and roots of 116 variety. The significantly differentially expressed miRNAs in both shoots and roots of No.116 variety responsive to short-term were further compared with those responsive to long-term low N stress. The results showed that 3 known miRNAs in 2 miRNA families (gma-miR159a/e-3p-1, gma-miR2109-3p,) were common. In No.84-70 variety, 2 known miRNAs from 2 miRNA families (gma-miR5372-5p, gma-miR5559-5p) were found significantly differentially expressed in response to short-term low N stress in both shoots and roots of No.84-70 variety, while under long-term low N stress condition, 4 known miRNAs belonging to 3 miRNA families (gma-miR1510a/ b-5p, gma-miR171e-5p, gma-miR482a-5p) were significantly differentially expressed in both shoots and roots of No.84-70 variety. The significantly differentially expressed miRNAs in both shoots and roots of No.84-70 variety responsive to short-term were further compared with those responsive to long-term low N stress. The results showed that no known miRNA was common.

Confirmation of miRNAs by qRT-PCR
It is essential to evaluate the stability of candidate reference genes expression before RT-qPCR is used to determine the differential expression of genes. Two protein-coding genes (TUA5, ACT) and three miRNAs (gma-miR1520d-3p, gma-miR156b-5p, gma-miR166a-5p ) were selected for the analysis of their expression stability in all eight No. 116 libraries by geNorm software. The results showed that the average expression stability values (M) of miR156b-5p and miR1520d-3p were lower than those of the other genes in eight No. 116 libraries analyzed, which was consistent with previous reports [68]. To further determine the optimal number of reference genes for normalization, we calculated the pair-wise variation of these candidate reference genes. The combination of the two most stable genes (gma-miR1520d-3p, miR156b) was found to be sufficient for normalization purposes because the V2/3 value was lower than 0.15. Thus, the two miRNA genes (gma-miR1520d-3p, gma-miR156b-5p) were selected to normalize the level of gene expression( Figure  S2).
After determining reference genes, 12 miRNAs were randomly selected for RT-qPCR assays to validate the reliability of Solexa/ Illumina sequencing technology. Student's t-test was used for statistical analysis of RT-qPCR data, and 12 miRNAs randomly selected did not display differential expression in response to shortterm low N stress in No. 116 root.Most of RT-qPCR results were found to be consistent with the deep sequencing data ( Figure 6). However, there were some differences between the deep sequencing results and RT-qPCR data. For example, deep sequencing results showed that the gma-miR172l-3p was downregulated exposed to short-term low N stress in No. 116 root and log 2 ratio was 20.95, while the RT-qPCR indicated that it showed negligible expression differences and log2 ratio was 0.06.

The Targets of Low N-responsive miRNAs
To understand the functions of low N-responsive miRNAs, the identification of their targets is an important step. Since plant miRNAs are highly complementary to their targets [34], bioinformatics methods based on the homology between miRNAs and target genes are used to predict target genes and have been applied in a number of studies [69][70][71][72]. In this study, potential target genes of all low N-responsive miRNAs were predicted, along with the description of the function of these genes (Tables S5). We predicted a total of 223 targets genes for 53 out of all 68 low N-responsive known miRNAs as well as 14 genes for one new miRNAs in roots from the two varieties, with the remaining known miRNAs having no target genes. However, in shoots, we identified a total of 399 target genes for 93 out of all 124 low N-responsive known miRNAs, with the remaining 31 known miRNAs and the new miRNA having no target genes. While a few miRNAs were predicted to target only one gene, the great majority of low Nresponsive miRNAs had multiple potential target genes. For instance, gma-miR2109-5p had the most 48 target genes followed by gma-miR156b/f-5p and gma-miR397a/b-5p with 32 and 31 target genes respectively. In general, multiple members of some miRNAs families can target the same gene, however, species from different miRNAs families might also target the same gene and thus have similar functions. For example, Glyma13g04030.1 was predicted to be the target of both gma-miR159a/e-3p-1 and gma-miR319g/l-3p, and Glyma16g05900.1 was regulated by both gma-miR156b/f-5p and gma-miR169o/r-5p. However, it was not clear how these miRNAs regulate the same genes.
To elucidate the functions of low N-responsive miRNAs, target genes with functional annotations were analyzed. We found that the predicted targets were involved in a broad range of plant physiological and biochemical processes, including regulation of protein degradation (26S proteasome regulatory complex, Apoptotic ATPase, Vesicle coat complex COPI, Mitochondrial import inner membrane translocase etc), cellular Transport (multicopper oxidase, Amino acid transporters), hormone signaling pathways (AUX/IAA family, Auxin response factor, gibberellin 2-oxidase), Carbohydrate metabolism(Long-chain acyl-CoA synthetases, acetyl-CoA carboxylase carboxyl transferase subunit, pyruvate dehydrogenase E1 component), nucleic acid metabolism(Asparaginyl-tRNA synthetase, CCAAT-binding factor, tRNA delta(2)isopentenylpyrophosphate transferase, RNA recognition motif, Chromatin assembly factor-I). Interestingly, some target genes are also important transcription factors (Myb superfamily, TCP family transcription factor, GRAS family transcription factor, AP2 domain-containing transcription factor).

Discussion
Extensive studies on the molecular basis underlying adaptive responses to low N stress have been conducted and many genes have been identified to be responsible for low N adaptability. For example, 10,422 genes were found to be involved in early stage responses to low N stress in rice seedling by Lian et al. [73]. However, most studies were about gene expression regulation at Figure 6. Comparison between qRT-PCR and the deep sequencing in No.116 root exposed to short-term low nitrogen stress. The yaxis indicate the relative expression levels of twelve selected miRNA in qRT-PCR and in Solexa sequencing analysis. The x-axis indicates twelve selected miRNAs, which are respectively as follows: 1. gma-miR1511-3p; 2. gma-miR390a/c-3p; 3. gma-miR396b/c/d/f/g-5p; 4. gma-miR396b/d/g-3p; 5. gma-miR398a/b-3p; 6. gma-miR4348-5p; 7. gma-miR4413a-5p; 8. gma-miR482a-3p; 9. gma-miR156p-5p; 10. gma-miR171n-3p; 11. gma-miR171o-5p; 12. gma-miR172l-3p. gma-miR1520d-3p and gma-miR156b-5p was chosen as endogenous reference genes. doi:10.1371/journal.pone.0067423.g006 the transcriptional levels. With regard to post-transcriptional regulation, miRNAs associated with low N stress response in some plants have been fragmentarily reported, but little information on the post-transcriptional regulation of N limitation in soybean was available. Particularly, no miRNA in soybean has been identified to be involved in response to low N stress. In the present study, we constructed sixteen libraries for the genome-wide identification of miRNAs in soybean shoots and roots in two different genotypes exposed to long-term and short-term low N-stress using the highthroughput sequencing technology (Illumina-Solexa). We obtained 348,651,354 total reads and 52,351,387 unique reads. These data were much more than those reported in previous studies and allowed us to analyze low abundance miRNAs and identify more new miRNAs [43][44][45][46][47][48]. Although 508 soybean miRNAs have been well registered in miRBase database [74], 90 novel miRNAs were detected. Furthermore, sixteen constructed libraries could be compared to improve the reliability of identified miRNAs. For example, the most abundant variants of some miRNAs on their corresponding pre-miRNA were found to be different from their registered miRNA sequences, however, they were the same in all or most libraries and could be utilized to refine miRBase annotations of soybean miRNAs. In addition, the use of latest Glycine max genomic database and miRBase in the present study contributed to the identification of more miRNAs.
Some documents reported that miRNAs were involved in plants responses to N availability [28][29][30]. In our research, a total of 150 known miRNAs variants as well as 2 novel miRNAs were identified to be responsive to low N stress. We further analyzed the differences between the miRNAs determined in our study and the ones previously reported. For example, Gifford ML et al. found that high N repressed miR167a and resulted in the ARF8 transcript to accumulate in the pericycle to regulate root architecture [28]. In our research, all species of gma-miR167 family did not show significantly differential expression in the two varieties under long-term and short-term N limitation. Another study indicated that miR156 was up-regulated by low N in Arabidopsis, whereas miR169, miR395 and miR398 were downregulated [29]. We found that multiple members of the gma-miR169 family were repressed in both roots and shoots of these two soybean varieties under low N stress, and some members of gma-miR398 family were down-regulated only in soybean shoots, while gma-miR395 family were not responsive to low N stress. Moreover, we also found that different species of the gma-miR156 family showed different response patterns. For example, gma-miR156b-5p and gma-miR156f-5p were repressed in roots of No.84-70 variety under long-term N limitation, and gma-miR1560-3p was up-regulated in its shoots under long-term N limitation. Recently, Xu et al. studied detailed response of miRNAs to low N availability in maize shoots and roots at the whole genome level and found that under long-term low N condition, miR167, miR169, miR395, miR399, miR408, and miR528 were down-regulated in maize roots, and in maize leaves miR164, miR172, and miR827 were up-regulated while miR169, miR397, miR398, miR399, miR408, and miR528 were downregulated. Under short-term low N condition, miR160, miR168, miR169, miR319, miR395, and miR399 were up-regulated in roots, while in maize leaves miR172 were up-regulated and miR397, miR398, and miR827 were down-regulated. Interestingly, different species in the miR169 family showed different expression patterns, such as miR169e/f/g/h were down-regulated, while miR169i/j/k/p were up-regulated [30]. We found that in soybean roots, gma-miR408 family were up-regulated in response to long-term low N, and some species of gma-miR160 and gma-miR319 family were down-regulated in response to short-term low N. However, members of gma-miR167 and gma-miR168 were not responsive, which were contrary to the results obtained from research of Xu et al. [30]. In soybean shoots, some species of gma-miR397, gma-miR398 and gma-miR408-5p family were found to be down-regulated in response to long -term low N, and gma-miR398c-5p was found to be down-regulated in response to short -term low N.These results were consistent with the results obtained from research of Xu et al. [30], but gma-miR164, gma-miR172, gma-miR528 and gma-miR827 did not show significantly differential expression under long-term and short-term N limitation, which were different from the results obtained from research of Xu et al. [30]. Overall, Several possible explanations may account for the differences between our findings and those reported by Xu et al. [30]. The miRNAs were identified by microarray system in the study of Xu et al., which is not as sensitive and sufficient as high-throughput sequencing technology used in our study. Furthermore, the statistical methods for differentially expressed miRNA and material in our research were different from that of Xu et al [30].
One purpose of our study was to identify those miRNAs that showed different expression patterns in the two soybean varieties. Many differentially expressed miRNAs were discovered in our study and could be divided into three types. The first type was the miRNAs that showed differential expression under low N stress only in No.116 variety or No.84-70 variety. For example, gma-miR2606a/b-3p was repressed only in No.116 variety roots under short-term low N, while gma-miR1512a-5p was repressed only in No.84-70 variety roots under short-term low N. The second type of miRNAs exhibited differential expression in both soybean varieties under same low N stress, but the directions of differential expression were different. For example, gma-miR396b/c/d/f/g-5p was down-regulated in No.116 variety shoots and up-regulated in No.84-70 variety shoots under short-term low N stress. The third type of miRNAs included the miRNAs that exhibited specific differential expression under different-term low N stress in the same organ of the two varieties. For example, gma-miR319g/l-3p were down-regulated in No.84-70 variety roots under short-term low N stress while they were up-regulated in No.116 variety roots under long-term low N stress. This implied that those genotypespecific regulated miRNAs might be responsible for differences in the response of the two varieties to low N stress.
By comparing the target genes of differentially expressed miRNAs with previously discovered genes involved in low N stress response in other plants, we found that these responsive genes were indeed involved in various metabolic and regulatory pathways. For example, Peng found 21 genes involved in protein degradation through autophagy and ubiquitin-proteosome pathways were up-regulated by N limitation [75]. Some of our predicted target genes were determined to play roles in protein degradation, including Glyma07g31580 (target of gma-miR156b/ 6f-5p), as well as Glyma05g20930 and Glyma06g18790 (target of gma-miR396b,g-5p), which were predicted to encode separately E3 ubiquitin ligase and Cathepsin L1. Metabolism of N is closely related to carbon metabolism, because nitrate assimilation and biosynthesis of nitrogenous macromolecules require abundant energy, reducing equivalents and organic carbon intermediates provided by carbon metabolism. Some studies found that genes involved in carbon metabolism were responsive to low N stress [75][76]. In the present study, we found that some targets of gma-miR159d-3p, gma-miR396b,g-5p, such as Glyma05g23280, Glyma07g05550, Glyma16g02090, Glyma17g16750, Gly-ma19g44930, Glyma15g08010 and Glyma19g01200 were related to carbon metabolism. Gene expression alterations caused by low N stress correspondingly activate signal transduction and gene transcription regulatory networks to coordinate all of these changes [73,[75][76]. We observed that some target genes were transcription factors and participated in signal transduction, such as MYB, AP2, ARF, SPB, and zinc finger. Futhermore, MYB, ARF and AP2 families have been known to be involved in plant stress responses [77][78][79]. In addition, we noticed that gma-miR5372-5p showed distinct expression patterns between the shoots of the two genotypes under the short-term low N condition, and its target gene, Glyma09g02600, encodes peroxidase protein which could eliminate excess concentrations of reactive oxygen species produced under stress conditions. This phenomenon also was observed by Kulcheski et al who found that MIR-Seq11 had different expression behavior between the two contrasting soybean genotypes under the drought stress, and MIR-Seq11 was also predicted to target peroxidase protein [80].
In summary, our study has identified 362 known miRNAs belonging to 158 families and 90 new miRNAs belonging to 55 families from two soybean genotypes, and analyzed their expression patterns during short-term and long-term N limitation. 150 known miRNAs variants as well as 2 novel miRNAs with with significantly differential expression were discovered and the putative targets of these miRNAs were predicted. This work can contribute to a better understanding of the genetic basis of the phenotypic differences between the two soybean genotypes under N-limiting conditions and significantly contribute to future research. Our further research plans include the characterization of these differentially expressed miRNAs and their targets and understanding the complex gene regulatory network of these miRNAs.