Genome-wide identification and characterization of microRNAs by small RNA sequencing for low nitrogen stress in potato

Nitrogen is an important nutrient for plant growth and tuber quality of potato. Since potato crop requires high dose of N, improving nitrogen use efficiency (NUE) of plant is an inevitable approach to minimize N fertilization. The aim of this study was to identify and characterize microRNAs (miRNAs) by small RNA sequencing in potato plants grown in aeroponic under two contrasting N (high and low) regimes. A total of 119 conserved miRNAs belonging to 41 miRNAs families, and 1002 putative novel miRNAs were identified. From total, 52 and 54 conserved miRNAs, and 404 and 628 putative novel miRNAs were differentially expressed in roots and shoots, respectively under low N stress. Of total 34,135 predicted targets, the gene ontology (GO) analysis indicated that maximum targets belong to biological process followed by molecular function and cellular component. Eexpression levels of the selected miRNAs and targets were validated by real time-quantitative polymerase chain reaction (RT-qPCR) analysis. Two predicted targets of potential miRNAs (miR397 and miR398) were validated by 5’ RLM-RACE (RNA ligase mediated rapid amplification of cDNA ends). In general, predicted targets are associated with stress-related, kinase, transporters and transcription factors such as universal stress protein, heat shock protein, salt-tolerance protein, calmodulin binding protein, serine-threonine protein kinsae, Cdk10/11- cyclin dependent kinase, amino acid transporter, nitrate transporter, sugar transporter, transcription factor, F-box family protein, and zinc finger protein etc. Our study highlights that miR397 and miR398 play crucial role in potato during low N stress management. Moreover, study provides insights to modulate miRNAs and their predicted targets to develop N-use efficient potato using transgenic/genome-editing tools in future.


Introduction
Potato is the most important non-grain food crop in the world. In India, potato crop requires high dose of nitrogen (180-240 kg ha -1 ) to produce high tuber yield (30-40 t ha -1 ) [1]. An excessive use of nitrogen causes environmental pollution and degradation of soil health and water quality [2]. Hence, improving nitrogen use efficiency (NUE) of plant is an important approach, which can utilize less nitrogen (N) to produce equivalent tuber yield. This approach is more important in developing countries where resources are limited like India [2]. To achieve this, although agronomic and soil management techniques have been applied to a great extent to improve NUE in crops, this study aimed to identify microRNAs (miRNAs) involved in regulation of gene expression under N stress metabolism in potato [3,4]. Plant growth and development is controlled by genes and non-coding regulatory molecules like miRNAs [4]. In fact miRNAs are 20-24 nucleotides molecules and they are involved in many biological processes such as cell signaling and development, biotic and abiotic stresses, DNA methylation, auxin response and so on [5]. Mostly miRNAs originate from non-coding region of gene, however a few miRNAs are reported from retrotransposons [6]. Mature miR-NAs are processed from primary-miRNAs, which are further processed to precursor-miRNAs [6]. Several miRNAs families have been found in plants and animals and many of them are conserved across organisms [7].
With the advancement in next-generation sequencing technologies and availability of the potato genome sequence [21], it is possible to discover miRNAs and target genes in potato. Very recently, we have identified genes under low N stress in potato through RNA-sequencing [22]. Genome-wide identification and characterization of miRNAs in potato is essential to widen our knowledge on gene regulation under low N stress. Hence, in this study we sequenced small RNA to identify conserved and novel miRNAs in response to low N stress versus high N (control) in potato plants grown in aeroponic under controlled conditions. Further, we also performed expression profiles of selected miRNAs and their targets, and 5'RLM-RACE to validate the predicted targets. Our results would enhance knowledge on function of miRNAs in response to adaptation to low N stress in potato.

Small RNA sequencing and analysis
Total RNA was isolated from root and shoot tissues of potato cv. Kufri Jyoti grown in aeroponic under two N (low N and high N-control) regimes (Fig 1). Single-end (1 x 75) small RNA libraries were prepared directly from the QC (quality control) passed total RNA. Size of the libraries was ranged between 149-156 bp and sequencing was performed using NextSeq500 (Illumina, San Diego, CA, USA). Nearly 11-21 millions raw reads per sample were generated (S1A Table), of which nearly 9-18 million reads were recovered after adapter trimming (S1B Table), from which high quality reads (8-18 million reads per sample) were used for further analysis (S1C Table). High quality reads with an average read length 23-24 nt and unique tags (2.6-3.9 million per sample) (S1D Table) were mapped to the Rfam database, and mapped reads, unmapped reads and unique tags were filtered (S1E Table). Further, unmapped reads and unique tags were mapped to the potato genome sequence repeat database and then filtered (S1F Table). Reads and unique tags, which were unmapped to the repeat database, were filtered for length distribution (15-34 nt) (S1G Table). Finally, reads having length more than 34 nt were excluded from analysis and reads size of 15-34 nt were retained for miRNAs identification. Bioinformatics analysis flow is depicted in S1 Fig

Identification of conserved and novel miRNAs
Clean reads (15-34 nt) were used for mapping with the potato genome sequence database and alignment summary is outlined in Table 1 and shown in Fig 2. Precursor sequences were identified by the tool miRCat (The UEA small RNA workbench, v 3.2) and only miRNAs observed on stable precursor sequences are reported. After identifying mature sequences, candidate miRNAs were compared with the miRBase ID to assign miR families and then conserved and putative novel miRNAs were segregated ( Table 2). A total of 119 conserved miNRAs and 1002 putative novel miRNAs were identified in root and shoot tissues in response to low N versus high N (control) (S2 Table and S3 Table). Number of conserved miRNAs were 110, 105, 108 and 103 in KJ_HN_Root, KJ_LN_Root, KJ_HN_Shoot and KJ_LN_Shoot, respectively (S4 Table; KJ: Kufri Jyoti, HN: high N; LN: low N); whereas 615, 381, 890 and 708 putative novel   Table). Maximum 20 conserved miRNAs were identified in potato chromosome 3 followed by chromosome 6 (18), whereas maximum putative novel miNRAs were identified in chromosome 4 (103) followed by chromosome 1 (91) ( Table 3).

Differential expression analysis of miRNAs
Read counts were used for expression profiling of conserved and putative novel miRNAs. Raw counts were normalized by using TPM (Transcripts Per Million) normalization method and further used to analyze differential expression analysis of miRNAs. Only significantly (p < 0.05) expressed miRNAs are reported and summarised in Table 4. Of total 119 conserved miRNAs belonging to 41 miRNAs families, 52 miRNAs were differentially expressed in roots, of which 38 were up-regulated, 9 were down-regulated and 5 miRNAs were tissue-specific (KJ_HN_Root) (S8 Table). Whereas, 54 conserved miRNAs were differentially expressed in shoots, 32 were up-regulated, 13 were down-regulated and 9 were tissue-specific (7: KJ_HN_Shoot, and 2: KJ_LN_Shoot) (S9 Table). Of total 1002 putative novel miNRAs, 404 were differentially expressed in roots, 92 were up-regulated, 128 were down-regulated and 184 were tissue-specific (171: KJ_HN_root, and 13 KJ_LN_root) (S10 Table). Whereas, 628 putative novel miRNAs were differentially expressed in shoots, of which 252 were up-regulated, 156 were down-regulated and 220 were tissue-specific (177: KJ_HN_Shoot, and 43: KJ_LN_Shoot) (S11 Table).

Target prediction of miRNAs and GO characterization
The psRNATarget finder tool was used to predict targets of miRNAs. A total of 34,135 targets were predicted by 723 miRNAs (61 conserved and 662 putative novel miRNAs) (S12 Table).
In-silico targets finding showed that most miRNAs regulate gene expression by cleavage mechanism followed by inhibition of translation process. Predicted targets were characterized by total 4399 GO terms, of which 2327 were categorised to biological process, 1583 were belonged to molecular function, 488 were under cellular component, and one was of 'eco' category (S13 Table). Based on the GO analysis, we predicted several target genes such as Cdk10/11-cyclin dependent kinase (49 miRNAs), amino acid transporter (29 miRNAs), receptor protein kinase (28 miRNAs), F-box domain containing protein (27 miRNAs), cytochrome P450 (27 miR-NAs), serine/threonine-protein kinase PBS1 (23 miRNAs), F-box family protein (21 miRNAs), ankyrin repeat-containing protein (21 miNAs) and many unknown genes. In addition, many unknown transcripts were targeted by miRNAs like PGSC0003DMT400044989 (50 miRNAs) and PGSC0003DMT400004014, PGSC0003DMT400044986 and PGSC0003DMT400044988 (each 49 miRNAs). Based on hits value, top five GO numbers were belonged to cellular component (GO:0016020, hits: 16638, membrane; and GO:0016021, hits: 16073, integral component of membrane) followed by molecular function (GO:0005515, hits: 9268, protein binding; and GO:0005524, hits: 7118, ATP binding) and biological process (GO:0006468, hits: 4585, protein phosphorylation). Selected top ten GO terms of the predicted targets of miRNAs is depicted in Fig 5.

PLOS ONE
miRNAs identification under low nitrogen stress in potato F-box family protein for stu-known-mir14 (miR5303); transcription factor for stu-novel-mir963; zinc finger protein for stu-novel-mir712; amino acid transporter for stu-novel mir1052; and sugar transporter for stu-novel-mir858 (S14 Table).

Target validation of conserved miRNAs by 5' RLM-RACE
Two targets (one each) of conserved miRNAs (miR397 and miR398) were validated by 5'RLM-RACE. The predicted targets PGSC0003DMT400001824 (Laccase) of miR397 and PGSC0003DMT400026824 (Serine-threonine protein kinase, plant-type) of miR398 were amplified with distinct band (Fig 7). Further, the amplified 5' RLM-RACE products were eluted, cloned and sequenced to analyse cleavage pattern based on sequence complementarity between miRNAs and mRNA (target) (Fig 7).

Discussion
Improving NUE of potato plant is essential to reduce amount of N fertilizers applied to the crop, while maintaining tuber yield. This approach could reduce production cost and save environment from excessive use of N [2]. We aimed to discover miRNAs regulating gene networks in potato under N stress. Very recently we have shown precision phenotyping of potato  , however genome-wide identification of miRNAs is yet to be investigated.
In this study, we identified 119 conserved miRNAs (20-24 nt) belonging to 41 miRNAs families, and 1002 putative novel miRNAs (15-25 nt) responsive to low N stress in potato. The differential expression analysis of conserved miRNAs shows that miR397 and miR398 play crucial role in potato under low N and their targets were also validated in the study. The GO analysis indicates that maximum target genes belong to biological process followed by molecular function and cellular component. However, the highest number of hits of miRNAs to targets were observed with cellular component (membrane, and integral component of membrane), indicating their role during adaptation to N stress in potato. Our findings on miRNAs identification in potato corroborate with earlier research. A few selected targets of miR398 were heat shock protein binding protein, calmodulin-binding heat shock protein, zinc finger protein, cytochrome P450, and serine-threonine protein kinase plant-type. Earlier study shows that miR398 is directly associated with stress regulatory network genes and responses to various stresses like high salt, water deficit, oxidative, abscisic acid, phosphorpus and copper deficiency, high sucrose and bacterial infection [23]. Trindade et al.
[24] revealed up-regulation of miR398 and miR408 in response to water deficit in Medicago truncatula. Manipulation of microRNA expression has been investigated in plants, such as miR397 and miR398 were down-regulated, and miR156/157 were up-regulated under N starvation [15]. Similarly, Shahzad et al. [16] reviewed role of miRNAs in nutrient acquisition [29] identified miR397 were found under N starvation in wheat at grain filling stage. Thus, these studies illustrate importance of miR397 and miR398 in plant metabolism, and our study observed their role in potato adaptation to low N stress.
In this study, miR156/157 encoded several targets such as nitrate transporter and salt-tolerance protein; whereas amino acid transporter and serine-threonine protein kinase plant-type are some predicted targets of miR156. MiR156 is one of the first microRNAs described in Arabidopsis by small RNA sequencing [30]. MiR156 is a graft-transmissible miRNA that modulates plant architecture and tuberization in Solanum tuberosum ssp. andigena [31]. Zeng et al. [14] reviewed role of various microRNAs in plant development in response to N deficiency. A few miRNAs identified in our study corroborate with earlier finding such as miR156 and miR319 (shoot development), miR393 (root development, defense response), miR397 (copper homoeostasis, lignin synthesis), and miR398 (copper homeostasis, oxidative stress) [14]. Song et al.  [42] investigated miRNAs like miR156, miR397, miR398, miR5303 etc. and their target genes in the fruit and shoot tip of Lycium chinense, a traditional Chinese medicinal plant. Hou et al. [43] suggest the involvement of miR156, miR319, miR482, miR5303 etc. families in development of long non-coding RNAs during tuber sprouting process of potato. In addition to the above conserved miRNAs families, miR5303 contained 15 members followed by miR169 contained 11 members; miR166 and miR319 contained 7 members; and miR172, miR7981 and miR8039 contained 6 members, and rest 34 conserved miRNAs contained 1-3 members, depending upon the tissues used in the study. These miRNAs were also differentially expressed in potato under low N stress. MiR169 targets genes like Aspartate aminotransferase and transcription factors (YA1, YA6, YA3, YA4, and BZIP transcription factor BZI-2). MiR169 isoforms regulate root architecture in Arabidopsis and their targets are linked to nutrient signalling in plants [44]. Role of grafttransmissible miR172 in induction of potato tuberization has been proven by Martin et al. [45]. Thus, researchers have evidenced role of miRNAs in plant metabolism and manifested their relevance under low N stress in potato.
Taken together, our study substantiates that several miRNAs play a key role in adaptation to low N stress in potato. Of which, we have confirmed that miR397 and miR398 play a critical role during N stress metabolism in potato. A schematic diagram of selected miRNAs and their potential target genes involved in N metabolism in potato are depicted (Fig 8). Further investigation is required to modulate expression of miRNAs and their targets to develop N-use efficient potatoes which would adapt to low N stress without compromising yield applying modern genomics tools like genome editing.

Plant materials and N treatments
An Indian potato variety Kufri Jyoti (KJ) was used in this study, which is N inefficient [46].  (Fig 1). Ammonium form of N was limited to 0.5 mM in both the treatments. Shoot and root samples (three replicates) were collected from 45 days old plants of both high and low N treatments and processed for small RNA sequencing for samples namely 'KJ_LN_Shoot' versus 'KJ_HN_Shoot' (control); and 'KJ_LN_Root' versus 'KJ_HN_Root' (control).

RNA isolation, library preparation and sequencing
Total RNA was isolated from root and shoot samples by modified c-TAB and lithium chloride method [47]. Quality and quantity of isolated RNA samples were checked on 1% denaturing RNA agarose gel and NanoDrop, respectively. Small RNA sequencing libraries (single end, read length: 1 x 75 bp) were prepared directly from the isolated QC passed total RNA using TruSeq Small RNA Library Preparation Kit as per the manufacturer's instruction (Illumina, San Diego, CA, USA). The gel size selected and purified libraries were analyzed on 4200 Tape Station system (Agilent Technologies, Santa Clara, CA, USA) using High sensitivity D1000 Screen Tape and then libraries were used for sequencing in NextSeq 500 following manufacturer's instruction (Illumina, San Diego, CA, USA). Details of analysis with parameters setting are described in S1 File.

Data processing and analysis
Raw reads were screened for adapter contamination using the Cutadapt tool (v 1.16) to retain only putative miRNAs (Phred quality score: QV 20) [48]. Then quality filtering analysis was performed to remove any low quality read or trim terminal low quality bases using the Trimmomatic tool (v 0.38) [49]. Adapter trimmed and quality filtered clean reads were formatted into a non-redundant fasta format. Occurrence of each unique sequence read was counted as sequence tag. Number of reads for each unique tag reflects relative expression level of miRNA. Considering tags instead of reads allows a high loss-less compression of original data and simplifies computational analysis.
Since, small RNA sequencing data is known to include reads from other category of small non-coding RNAs, read tags were blasted against customized Rfam database of Rfam (v13) [50]. Read tags, which were not mapped to the Rfam, were mapped to known repeat sequences i.e. Repbase (v 22) using NCBI Blast and any tag matched with above mentioned stringent criteria was excluded from further analysis. Finally, reads having length more than 34 nt (nucleotide) were also excluded from analysis. Adapter trimmed, quality filtered and length filtered reads were considered as cleaned putative miRNAs population and were used for miR-NAs identification.

Reads mapping to the reference potato genome
Clean reads were mapped to the reference potato genome. Precursor sequences were identified by the tool miRCat, The UEA small RNA workbench (v 3.2) [51,52]. The miRcat analyses reads mapping pattern along with few upstream and downstream bases. It uses RNAfold to estimate the precursor stability by analyzing MFE value. Only miRNAs observed on stable precursor sequences were reported.

Identification of conserved and novel miRNAs
On identifying mature miRNAs, candidate mature sequences were compared with miRBase (a conserved miRNA database) and assigned with miR families [53]. Thus, conserved miRNAs were identified based on the assigned miRBase ID, and rest others without miRbase ID were categorized as putative novel miRNAs. Unique mature miRNAs sequences found in each sample were tabulated. In-house custom wrapper scripts based on RNAfold was used to generate secondary structure of precursor sequences of miRNAs. Venn diagram was prepared for the identified conserved and putative novel miNRAs using Venny tool [54] with the default setting to identify common and tissue specific miRNAs.

Differential expression analysis of miRNAs
After aligning with the potato genome, reads count were used for expression profiling of conserved and putative novel miRNAs. miRNAs with at least of 10 reads support in any of the four samples were considered for differential analysis. TPM normalization was applied to normalize reads count of miRNAs population per sample. TPM values were then log (base 2 ) transformed to get fold expression values and finally log 2 FC values were calculated by subtracting respective fold expression values. FC value of above zero is considered as up-regulated, whereas below zero is considered as down-regulated.

Prediction of miRNAs targets and GO characterization
The psRNATarget finder tool was used to predict potential targets of miRNAs based on unpaired energy (UPE) using default parameters as latest version of the tool (v 2017). Predicted targets were then characterized by the GO terms.

Validation of miRNA expression by RT-qPCR
RT-qPCR analysis was performed to confirm the expression level of selected miRNAs. Total RNA of same stage (45 days) plants was used to generate cDNA by reverse transcription using Mir-X™ miRNA First-Strand Synthesis and TB Green TM RT-qPCR kit following manufacturer's instructions (Takara Bio USA Inc., CA, USA) in ABI PRISM HT7900 (Applied Biosystems, Warrington, UK). The entire sequence of mature miRNAs was used as miRNA-specific 5' primer, whereas 3' primer (mRQ 3' Primer) was supplied with the kit. Thermal cycles profile of RT-qPCR was followed as: denaturation at 95˚C for 10 s; 40 cycles of 95˚C for 5 s and 60˚C for 20 s; and dissociation at 95˚C for 60 s, 55˚C for 30 s and 95˚C for 30 s. A normalization standard gene U6 snRNA was used to measure relative expression of miRNA by the deltadelta Ct method ( ΔΔ Ct) [55]. It measures relative expression of miRNAs by comparing with normalization standard U6 snRNA. miRNAs and U6 snRNA were amplified in each sample to determine the Ct for each, by which relative expression was determined using the ΔΔ Ct calculation. RT-qPCR data of miRNAs were analysed in three replications (both biological and technical) and calculated FC using ΔΔ Ct method [55]. Standard errors were calculated from the replicated data and expression result is shown in bar diagram (mean ± standard error). Replicated expression data was analysed with one way analysis of variance (ANOVA) using XLSTAT2018.5 (www.xlstat.com) and means were compared using Tukey's test at least significance level (p < 0.05) [46]. Means of up-regulated and down-regulated miRNAs were analysed separately.

Expression analysis of targets by RT-qPCR
Total RNA of all four samples of same stage (45 days) was reverse transcribed using TaqMan1 Reverse Transcription Reagent kit (Applied Biosystems, New Jersey, USA). RT-qPCR primers were designed using Primer Express 2.0 software (Applied Biosystems, CA, USA), as given in S14 Table. RT-qPCR was performed using Power SYBR Green PCR Master Mix (Applied Biosystems, Warrington, UK) in the ABI PRISM HT7900 following thermal cycler profiles 50˚C for 2 min; 95˚C for 10 min; and 40 cycles of 95˚C for 15 s, 60˚C for 1 min, 72˚C for 30 s using internal standard of mitochondrial cytochrome oxidase gene (coxI) (X83206.1) as described earlier in potato [56,57]. RT-qPCR expression data of target genes was analysed as above like miRNAs [46].

Validation of targets of conserved miRNAs by 5' RLM-RACE
Two predicted targets (one each) of miRNAs (miR397 and miR398) were validated by 5' RLM-RACE technique using the SMARTer RACE 5'/3' Kit following manufacturer's instruction (Cat. No. 634858; Takara Bio USA, Inc., CA, USA). In brief, poly A + RNA was isolated from same stage (45 days) plant tissues (roots and leaves) using NucleoSpin RNA Plant kit (Cat. No. 740949.50; Takara Bio USA, Inc., CA, USA). Poly A + RNA (1 μg) was processed for first-strand cDNA synthesis, and then 5' RACE-ready cDNA was used for 5' RACE amplification using 5' gene-specific primer and Universal Primer A Mix, as provided with the kit. The PCR cycles used for the RACE amplification were 30 cycles of 94˚C for 30 s, 67˚C for 30 s and 72˚C for 5 min. The PCR products were eluted, cloned in pRACE vector (a pUC19-based vector) and sequenced using M13 universal primer sequences to identify cleavage site. The genespecific primers were designed using IDT (Integrated DNA technologies, USA) tool and only 5' primers were used for 5' RLM-RACE amplification (S15 Table). Illumina nextseq small RNA library preparation, adapter trimming analysis, quality filtering analysis, uniquification of read tags, reads filtering (removal of known non-coding RNA mapping reads), reads filtering (removal of repeat database mapping reads), reads length filtration, reads mapping to the reference genome, known and novel miRNAs indemnification analysis, differential analysis, precursor sequences secondary structures, miRNAs targets identification analysis. (DOCX) S1 Table. Details of data generation by small RNA sequencing for N stress in potato. a) raw reads data statistics, b) adapter trimming analysis summary statistics, c) quality trimming analysis summary statistics, d) read tags uniquification summary statistics, e) Rfam database mapped read filtering statistics, f) repeat database mapped read filtering statistics, and g) read length filtration summary analysis.  Table. Prediction of target genes of both conserved and putative novel miRNAs identified in potato under N stress condition. A list total 79303 potential target genes of both conserved and putative novel miRNAs in potato are summarized in sheets. (XLSX) S13 Table. Gene Ontology (GO) characterization of target genes in potato. Summary of GO characterization of target genes of miRNAs identified in potato under N stress are summarized in sheets. (XLSX) S14 Table. RT-qPCR analysis. Summary of RT-qPCR primers used for gene expression analysis of miRNAs and their corresponding targets. (DOCX) S15