Identification and annotation of newly conserved microRNAs and their targets in wheat (Triticum aestivum L.)

MicroRNAs (miRNAs) are small, non-coding regulatory RNAs produced endogenously by cells. They are 18–26 nucleotides in length and play important roles at the post-transcriptional stage of gene regulation. miRNAs are evolutionarily conserved, and this conservation is widely used to predict new miRNAs in different plants. Wheat (Triticum aestivum L.) is an important dietary staple and the second major crop consumed in the world. In this study, a comparative genomics-based approach was applied to this cereal crop to identify new conserved miRNAs and their target genes. This resulted in a total of 212 new conserved precursor miRNAs (pre-miRNAs) belonging to 185 miRNA families. These newly profiled wheat miRNAs were also annotated for stem-loop secondary structure, length distribution, organ of expression, sense/antisense orientation and characterization from their expressed sequence tags (ESTs). Moreover, fifteen miRNAs were randomly selected and, along with a housekeeping gene, subjected to RT-PCR expression validation. A total of 32927 targets were also predicted and annotated for these newly profiled wheat miRNAs. These targets were found to be involved in 50 gene ontology (GO) enrichment terms and significant processes. Some of the significant target annotations are RNA-dependent DNA replication (GO:0006278), RNA binding (GO:0003723), nucleic acid binding (GO:0003676), DNA-directed RNA polymerase activity (GO:0003899), magnesium ion transmembrane transporter activity (GO:0015095), antiporter activity (GO:0015297), solute:hydrogen antiporter activity (GO:0015299), protein kinase activity (GO:0004672), ATP binding (GO:0005524), regulation of Rab GTPase activity (GO:0032313), Rab GTPase activator activity (GO:0005097), regulation of signal transduction (GO:0009966) and phosphoprotein phosphatase inhibitor activity (GO:0004864). These findings will be helpful for managing this economically important grain plant for desirable traits through miRNA regulation.


Introduction
MicroRNAs (miRNAs) are an abundant class of regulatory RNAs that are non-coding, endogenous in nature and short, ranging from 18 to 26 nucleotides (nt) in length. These small RNAs, called mature miRNAs, are generated in plants by the Dicer-like 1 (DCL1) enzyme from long precursor miRNAs (pre-miRNAs), which range from 70 to 500 nt in length and form self-folded stem-loop secondary structures [1]. Mature miRNAs regulate gene expression at the post-transcriptional level by either targeting mRNAs for degradation or hindering protein translation. Both processes are accomplished by complementary base pairing of miRNAs to their target mRNA sequences [2]. In plants, in the majority of cases, miRNAs interact with their targets through perfect or near-perfect hybridization, leading to target mRNA degradation [3]. Growing evidence has revealed that miRNAs play a significant role in an extensive range of developmental processes in plants, including cell proliferation, stress response, metabolism, inflammation, and signal transduction [2][3][4]. To date, more than 28,645 miRNAs have been reported from 223 species of plants and animals and are available in the public database miRBase (Release 21) [5]. The majority of plant miRNAs have been identified in species with fully sequenced genomes, such as 713 from Oryza sativa, 401 from Populus trichocarpa, 384 from Arabidopsis thaliana, 343 from Solanum tuberosum, 321 from Zea mays, and 241 from Sorghum bicolor [5]. miRNA-related research is constantly increasing, and miRNAs, along with their functions, are being profiled and annotated through various computational tools and experimental methods such as direct cloning, deep sequencing, and other approaches.
Comparison of miRNAs across several plant species has shown that some miRNAs are highly evolutionarily conserved from species to species, for example from mosses to higher flowering eudicots in the plant kingdom [4]. The conserved nature of miRNAs provides a valid approach for profiling new miRNAs in other species. Comparative genome-based approaches have been used to profile conserved miRNAs in many plant species, such as cotton [6], switchgrass [7,8], soybean [9], tomato [10], chilli [11], roses [12], helianthus [13], cherry [14], red alga [15] and cowpea [16]. Wheat (T. aestivum L.), also known as bread wheat, belongs to the family Poaceae. Based on production, wheat is the world's second most-produced cereal crop after maize [17]. It is an important source of carbohydrates, plant protein, multiple nutrients and dietary fiber [18]. In miRBase, a database of miRNAs (http://www.mirbase.org/, Release 21: June 2014) [5], only 119 mature miRNAs are reported for this significant staple food. Although progress has been made on the identification and profiling of miRNAs in wheat [19][20][21], there is still a need to profile more conserved miRNAs for this significant cereal crop. In this study, a well-defined comparative genome-based homolog search was employed to profile new wheat miRNAs and their targets.

Retrieval of candidate miRNAs
To mine new conserved wheat miRNAs through a comparative homology-based search, a total of 1,286,372 wheat ESTs were downloaded from the EST database (dbEST; release 130101, 1 January 2013), available at https://www.ncbi.nlm.nih.gov/genbank/dbest/dbest_summary. The reference miRNAs and wheat ESTs were subjected to the BLASTn and BLASTx algorithms to profile potential conserved miRNAs and to remove repeated and protein-coding sequences, respectively [22]. Candidate wheat miRNA sequences in FASTA format that had at most four mismatches with the reference miRNAs and were non-coding in nature were selected, saved and subjected to downstream analyses.
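The four-mismatch criterion can be illustrated with a minimal Python sketch. It assumes an ungapped, equal-length comparison between a reference miRNA and a candidate region, whereas the study itself relied on BLASTn alignments. The function names and example sequences are hypothetical and are not taken from the study.

```python
def count_mismatches(reference: str, candidate: str) -> int:
    """Count position-wise mismatches between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("ungapped comparison needs equal-length sequences")
    # Treat T and U as equivalent so EST (DNA) and miRNA (RNA) strings compare cleanly.
    ref = reference.upper().replace("T", "U")
    cand = candidate.upper().replace("T", "U")
    return sum(1 for a, b in zip(ref, cand) if a != b)


def passes_mismatch_filter(reference: str, candidate: str,
                           max_mismatches: int = 4) -> bool:
    """Keep a candidate only if it has at most `max_mismatches` mismatches."""
    return count_mismatches(reference, candidate) <= max_mismatches


# Hypothetical sequences, for illustration only.
reference = "UGACAGAAGAGAGUGAGCAC"
candidate = "TGACAGAAGAGAGTGAGCTC"  # differs from the reference at one position

print(count_mismatches(reference, candidate))
print(passes_mismatch_filter(reference, candidate))
```

In practice the mismatch count would be read from the alignment report rather than recomputed, but the threshold logic is the same.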

Prediction of wheat miRNA secondary structures
Prediction of the stem-loop secondary structures of the initial candidate sequences is an important criterion for profiling and characterizing new conserved miRNAs in wheat [2]. MFOLD (version 3.6) [23], a secondary structure prediction tool, was employed to produce stem-loop structures for the initially identified potential wheat miRNA sequences. All initial candidate sequences that failed to form stable secondary structures were discarded. Only candidate miRNA sequences with stable stem-loop structures that showed the mature sequence in the stem region, had at least 12 nucleotides involved in Watson-Crick or G/U base pairing with the opposite strand, and met the minimum free energy (MFE) threshold of -10 kcal mol⁻¹ were saved and subjected to physical scrutiny.
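The structural criteria above can be sketched as a simple filter in Python. This is an illustration under stated assumptions, not the study's procedure: it assumes the predicted structure has already been converted to dot-bracket notation (MFOLD's native output formats differ), that the mature miRNA position within the precursor is known, and that the MFE threshold means -10 kcal/mol or lower (more negative being more stable). All names and the example hairpin are hypothetical.

```python
def is_balanced(structure: str) -> bool:
    """Check that a dot-bracket string has properly nested parentheses."""
    depth = 0
    for ch in structure:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def paired_in_region(structure: str, start: int, end: int) -> int:
    """Count base-paired positions in structure[start:end] (0-based, end exclusive)."""
    return sum(1 for ch in structure[start:end] if ch in "()")


def keep_candidate(structure: str, mature_start: int, mature_end: int,
                   mfe: float, min_paired: int = 12,
                   mfe_cutoff: float = -10.0) -> bool:
    """Apply the pairing and MFE criteria to one candidate precursor.

    Assumption: a candidate passes when its MFE is <= mfe_cutoff.
    """
    if not is_balanced(structure):
        return False
    paired = paired_in_region(structure, mature_start, mature_end)
    return paired >= min_paired and mfe <= mfe_cutoff


# Hypothetical perfect hairpin: 20 paired bases, a 4-nt loop, 20 paired bases.
hairpin = "(" * 20 + "." * 4 + ")" * 20

print(keep_candidate(hairpin, 0, 21, mfe=-25.3))
print(keep_candidate(hairpin, 0, 21, mfe=-5.0))
```

A filter of this kind only narrows the candidate list; large bulges and similar features still need the manual inspection described in the next section.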

Physical scrutiny
Physical scrutiny of the candidate miRNAs is an essential step to exclude false-positive miRNAs. All potential candidate miRNAs obtained from the wheat ESTs that had at most 4 mismatches with the reference miRNAs, were non-coding in nature, formed a stable stem-loop secondary structure and were single-tone in nature were therefore subjected to physical scrutiny to remove sequences with large bulges, with mature sequences outside the stem region, or with higher MFEs. The organ of expression for each newly profiled wheat miRNA was also noted from its EST.

RT-PCR validation
From the newly profiled wheat miRNAs, fifteen were randomly selected and subjected to expression analysis by reverse transcription PCR (RT-PCR), along with Ta54227 (cell division control protein, AAA-superfamily of ATPases) as a housekeeping gene [24]. The Primer-3 algorithm (http://bioinfo.ut.ee/primer3-0.4.0) was used to design stem-loop primers (S1 Table) from the ESTs of the 15 randomly selected miRNAs. Total RNA was extracted from wheat leaves using the Qiagen plant RNA kit. cDNA was then synthesized using the RevertAid™ First Strand cDNA Synthesis Kit (Fermentas), according to the supplier's protocol. A 60 μg quantity of cDNA was used as template, and the PCR was programmed as follows: initial denaturation at 95˚C for 3 min; 30 cycles of denaturation at 94˚C for 35 sec, annealing at 60˚C for 35 sec, and extension at 72˚C for 30 sec; and a final elongation step at 72˚C for 10 min. The PCR products were separated on a 1.5% (w/v) agarose gel with a 100 base pair DNA ladder.

Targets prediction
To predict putative targets for the newly profiled wheat miRNAs, psRNATarget, a plant small RNA target analysis server available at http://plantgrn.noble.org/psRNATarget/ [25], was used. The wheat library (Triticum aestivum (wheat), cDNA, EnsemblPlants, release-31) was selected as the target library, with the modified 2017-updated psRNATarget parameters set as follows: Max Expectation cutoff: 5, HSP length for scoring: 19, Penalty for GU pair: 0.5, Penalty for other mismatch: 1.0, Allowing bulge on target: Yes, Penalty for opening gap: 2.0, Penalty for extending gap: 0.5, Weight for seed region: 1.5, Seed region: 2-13, Number of mismatches allowed in seed region: 2 and Calculating UPE: No. The predicted putative wheat miRNA targets were subjected to Gene Ontology functional and enrichment analyses through agriGO [26].
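To make the role of these parameters more concrete, the following Python sketch scores an ungapped miRNA/target pairing using the listed penalties. It is a simplified illustration and not psRNATarget's actual implementation: gaps, bulges and the HSP length are ignored, and counting only non-wobble mismatches toward the seed limit is an assumption made here. All function names and the example sequence are hypothetical.

```python
# Parameter values taken from the settings listed above.
PARAMS = {
    "max_expectation": 5.0,
    "gu_penalty": 0.5,
    "mismatch_penalty": 1.0,
    "seed_weight": 1.5,
    "seed_region": (2, 13),      # 1-based, inclusive, from the miRNA 5' end
    "max_seed_mismatches": 2,
}

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[b] for b in reversed(rna))


def score_site(mirna: str, target_site: str, params=PARAMS):
    """Score an ungapped pairing; both sequences are given 5'->3'.

    Returns (penalty score, number of non-wobble mismatches in the seed).
    """
    mirna = mirna.upper().replace("T", "U")
    # Reverse the target so that index i pairs with miRNA position i.
    target = target_site.upper().replace("T", "U")[::-1]
    if len(mirna) != len(target):
        raise ValueError("ungapped scoring needs equal-length sequences")
    seed_lo, seed_hi = params["seed_region"]
    score, seed_mismatches = 0.0, 0
    for pos, pair in enumerate(zip(mirna, target), start=1):
        if pair in WATSON_CRICK:
            continue
        is_wobble = pair in WOBBLE
        penalty = params["gu_penalty"] if is_wobble else params["mismatch_penalty"]
        if seed_lo <= pos <= seed_hi:
            penalty *= params["seed_weight"]
            if not is_wobble:  # assumption: G/U pairs do not count as seed mismatches
                seed_mismatches += 1
        score += penalty
    return score, seed_mismatches


def is_predicted_target(mirna: str, target_site: str, params=PARAMS) -> bool:
    score, seed_mismatches = score_site(mirna, target_site, params)
    return (score <= params["max_expectation"]
            and seed_mismatches <= params["max_seed_mismatches"])


# Hypothetical miRNA and a perfectly complementary site, for illustration only.
mirna = "UGACAGAAGAGAGUGAGCAC"
perfect_site = reverse_complement(mirna)
print(score_site(mirna, perfect_site))
```

Lower scores indicate closer complementarity, which is why the expectation cutoff acts as an upper bound on the accumulated penalty.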

Wheat miRNAs characterization
The newly profiled wheat miRNAs were characterized and annotated in terms of pre-miRNA length, MFE of pre-miRNAs, mature miRNA sequences with mismatches, number of mismatches, mature sequence length, ESTs, strand orientation, mature sequence arm, GC percentage and organ of expression (for details, see S2 Table). All the mature sequences of the new conserved wheat miRNAs were observed in the stem regions of the stem-loop structures (some are shown in Fig 1). The predicted stem-loop structures showed that between 11 and 21 nucleotides are engaged in Watson-Crick or G/U base pairing between the mature miRNA and the opposite arm of the pre-miRNA in the stem region, and that the hairpin precursors do not contain large internal loops or bulges. Many researchers have reported similar results for miRNAs in other plants and animals [6,11,12,32].
The mature wheat miRNAs ranged from a minimum of 18 nt to a maximum of 25 nt in length, with an average of 22 nt. By length class, from shortest to longest, 18 nt accounted for 2 out of 212 (1% of the total), 19 nt for 7 out of 212 (3%), 20 nt for 30 out of 212 (14%), 21 nt for 88 out of 212 (42%), 22 nt for 36 out of 212 (17%), 23 nt for 17 out of 212 (8%), 24 nt for 30 out of 212 (14%) and 25 nt for 2 out of 212 (1%). This length range is in agreement with other known plant miRNAs [16,32]. In terms of strand orientation, 100 of the 212 newly profiled miRNAs (47%) were found in the sense orientation, while 112 out of 212 (53%) were found in the antisense orientation. Of the mature sequences, 113 out of 212 (53%) were observed on the 5' arm of the stem-loop secondary structures, whereas 99 out of 212 (47%) were found on the 3' arm. GC percentage is an important parameter for characterizing a nucleotide sequence. The GC content of the newly predicted wheat miRNAs ranged from a minimum of 14% to a maximum of 90%, with an average of 47%. By class, a GC content of 10% to 40% was found in 77 out of 212 (36%), 41% to 60% in 98 out of 212 (46%), 61% to 80% in 27 out of 212 (13%) and 81% to 95% in 10 out of 212 (5%). The newly profiled wheat miRNAs were also characterized for their organ of expression based on their ESTs. The largest number of miRNAs was found in root (56 out of 212, 26% of the total), followed by leaf 17%, crown 13%, anther 8%, seedling 6.6%, shoot 6%, seed 5.6%, spike 5%, endosperm 2%, pistil 2%, kernel 1.4%, embryo 1.4%, florets 1%, ovary 1%, stem 1%, cultured 1%, callus 0.5%, and egg cell 0.5%. The organ-based expression of wheat miRNAs will be helpful in devising better approaches to plant organ development and regulation.
The organ-specific expression of miRNAs identified through comparative genomics approaches is in agreement with previous reports in other plant species [9][10][11][12][13][14][15][16][32].
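The summary statistics for mature miRNA length can be rechecked from the per-class counts reported above. The Python sketch below does this and adds a small GC-content helper of the kind used for such characterization. The counts are taken from the text, while the helper name and the example sequence are hypothetical.

```python
# Counts of mature miRNAs per length class (nt), as reported in the text.
length_counts = {18: 2, 19: 7, 20: 30, 21: 88, 22: 36, 23: 17, 24: 30, 25: 2}

total = sum(length_counts.values())
mean_length = sum(length * n for length, n in length_counts.items()) / total
percentages = {length: 100 * n / total for length, n in length_counts.items()}


def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100 * sum(1 for base in seq if base in "GC") / len(seq)


print(f"total miRNAs: {total}")
print(f"mean mature length: {mean_length:.2f} nt")
for length, pct in percentages.items():
    print(f"{length} nt: {pct:.1f}%")

# Hypothetical sequence, for illustration only.
print(f"GC content: {gc_percent('GGCCAAUU'):.1f}%")
```

The same tallying approach applies to the strand orientation, arm and organ classes, given the per-class counts in S2 Table.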

Wheat miRNAs putative targets prediction
Target prediction is an important step in the annotation and characterization of the newly profiled wheat miRNAs. A total of 32927 target genes were identified for the 212 new conserved wheat miRNAs (S3 Table) using the stringent approach described above. Based on gene ontology annotation, these targets comprise 50 GO terms (Table 1). A previous study of the Arabidopsis thaliana AtCOX17 genes, which contribute to the transfer of copper for COX assembly, found that these genes are induced by several stress conditions and abscisic acid. This suggests that the newly identified miRNAs targeting wheat COX genes could help in understanding and managing wheat under several stress conditions.
In the reproduction process (GO:0000003), recognition of pollen (GO:0048544) and oogenesis (GO:0048477) were identified and annotated as potential targets of tae-miR1535, tae-miR3476, tae-miR5386, tae-miR5783, tae-miR8154, tae-miR6276, tae-miR6249, tae-miR6111, tae-miR6202, tae-miR3627 and tae-miR8044b. Male and female gametes are specialized structures developed by flowering plants, and they have a crucial role in seed and fruit production. An understanding of male fertility through pollen recognition and production genes can be utilized for the development of novel hybrid seed production systems in wheat [36]. The wheat miRNAs identified in this study that target genes involved in male and female gamete formation and reproduction would be a good resource for enhancing seed production and regulating male fertility in wheat.
Magnesium (Mg) is the second most abundant cation in plants. It plays a significant role in many physiological and biochemical processes such as photosynthesis, enzyme activation, and the synthesis of nucleic acids and proteins [40]. Mg also serves as a regulator of cation-anion balance in cells and, together with potassium (K), as an osmotically active ion regulating cell turgor [41]. To maintain an optimal Mg level in various tissues, plants have evolved efficient transport and regulation machinery for Mg²⁺ distribution throughout the whole plant [42]. The newly predicted wheat miRNAs tae-miR5167a, tae-miR5543, tae-miR5568c, tae-miR6035, tae-miR6224a, tae-miR7768b and tae-miR8659 were found to target Mg transport-associated genes annotated with magnesium ion transport (GO:0015693), magnesium ion transmembrane transporter activity (GO:0015095) and magnesium ion binding (GO:0000287). These tae-miRNAs would be useful for regulating Mg accumulation in the plant for better management of photosynthesis, enzyme activation, and nucleic acid and protein synthesis.
Soil salinity is one of the main abiotic stresses that limit agricultural yields globally, and at least 50% of total agricultural land is at risk of salinization. The most comprehensively studied gene class in relation to salinity stress physiology is the cation/proton antiporter 1 family [43]. Here, the wheat miRNAs tae-miR435, tae-miR827, tae-miR1522, tae-miR5167a, tae-miR5490, tae-miR6180, tae-miR6191b, tae-miR6275, tae-miR7714 and tae-miR7768b were found to target genes in two antiporter categories, antiporter activity (GO:0015297) and solute:hydrogen antiporter activity (GO:0015299). These tae-miRNAs could be used to reprogram cation/proton antiporter activities and enhance salinity tolerance in wheat.
Plant growth, development, and biotic and abiotic stress responses are also regulated by MAP kinase phosphatases (MKPs). They are major regulators of MAPK signaling pathways and play vital roles in plant survival and sustainability. Various phosphorylation and kinase-associated genes are prominent players in MAPK signaling pathways [44]. A number of such gene categories, including protein amino acid phosphorylation (GO:0006468), protein kinase activity (GO:0004672), ATP binding (GO:0005524), regulation of Rab GTPase activity (GO:0032313), Rab GTPase activator activity (GO:0005097), regulation of signal transduction (GO:0009966), phosphoprotein phosphatase inhibitor activity (GO:0004864) and regulation of phosphoprotein phosphatase activity (GO:0043666), were identified as potential targets of newly profiled wheat miRNAs, including tae-miR477b, tae-miR827, tae-miR1435a, tae-miR1438, tae-miR1522, tae-miR1861b, tae-miR3476, tae-miR5641, tae-miR8135 and tae-miR8154. Thus, wheat growth, development, and biotic and abiotic stress resistance could be improved by managing these cell signaling players through the newly predicted wheat miRNAs.
The MYB transcription factor plays a vital role in abiotic stress responses. In Arabidopsis, AtMYB96-overexpressing plants have shown dehydration tolerance through participation in the ABA and auxin signaling pathways, as well as improved freezing and drought tolerance through regulation of lipid transfer protein 3 [45,46]. Similarly, AtMYB44 and AtMYB60 are involved in plant responses to dehydration stress by controlling stomatal opening [47,48]. Wei et al. [49] cloned wheat TaODORANT1, an R2R3-type MYB transcription factor gene, into tobacco and found that TaODORANT1 was up-regulated under high salinity, PEG6000, H₂O₂, and ABA treatments. Overexpression of TaODORANT1 improved drought and salt tolerance in transgenic tobacco plants. The newly identified wheat miRNA tae-miR858 was found to target the MYB transcription factor. This could serve as a potential resource for managing wheat under biotic and abiotic stresses and regulating it for better crop production.
Another important transcription factor family is WRKY, which plays vital roles in plant resistance responses to pathogens. Wang et al. [50] reported that two WRKY genes, TaWRKY49 and TaWRKY62, are associated with high-temperature seedling-plant resistance to wheat stripe rust, caused by the fungal pathogen Puccinia striiformis f. sp. tritici (Pst). The wheat miRNA tae-miR5082 identified in this study was found to target the WRKY transcription factor. With the help of such wheat miRNAs, fungal rust resistance could be better managed, which would ultimately increase crop production.
Another significant transcription factor family, the zinc finger proteins, is reported to perform crucial roles in several plant processes, including regulation of growth and development, signaling networks, and responses to environmental stresses. Recently, Agarwal and Khurana [51] identified and explored the involvement of a wheat zinc finger (TaZnF) in plant stress response, mainly heat stress. They reported that overexpression of TaZnF in transgenic Arabidopsis conferred considerable tolerance to cold and oxidative stress. Based on these observations, they suggested that TaZnF acts as a positive regulator of thermal stress and can thus be of great significance in understanding and improving temperature stress tolerance in plants. As the zinc finger transcription factor is predicted as a putative target gene of two newly profiled wheat miRNAs (tae-miR5183 and tae-miR5562), these miRNAs could be useful for improving wheat resistance under heat, cold and oxidative stress.

Conclusions
A total of 212 new conserved miRNAs belonging to 185 families were identified from wheat EST sequences by applying comparative genomics approaches. All of these miRNAs are reported for the first time in wheat. In addition, 32927 targets were predicted for these 212 wheat miRNAs, with roles in 50 GO enrichment terms. The targets were found to be involved in different processes, such as metabolism, transcription factors, transporters, cell signaling, structural proteins, stress responses, and growth and development. Some randomly selected wheat miRNAs were also validated by RT-PCR. Detailed characterization and annotation of the newly profiled miRNAs and their targets were also carried out. These results will contribute to stress-resistance breeding in wheat as well as to a better understanding of yield traits.

Supporting information
S1 Table. The wheat pre-miRNA primer sequences for RT-PCR experimental validation. Fifteen randomly selected wheat miRNAs subjected to expression analysis through RT-PCR are given here with primers, melting temperature (Tm), product size (bp) and source EST.