The Sequence and Structure Determine the Function of Mature Human miRNAs

MicroRNAs (miRNAs), 19–25 nucleotides in length, belong to the group of non-coding RNAs and are the most abundant group of posttranscriptional regulators in multicellular organisms. They affect gene expression by binding, through fully or partially complementary sequences, to the 3'-UTR of the target mRNA. Furthermore, miRNAs provide a mechanism by which genes with diverse functions in multiple pathways can be simultaneously regulated at the post-transcriptional level. However, little is known about the specific pathways through which miRNAs with specific sequence or structural motifs regulate cellular processes. In this paper we present a broad and detailed characterization of mature miRNAs according to their sequence and structural motifs. We investigated distinct groups of miRNAs characterized by the presence of specific sequence motifs, such as UGUGU, GU-repeats and high purine/pyrimidine content. Using computational function and pathway analysis of their target genes, we observed a relationship between the miRNA sequence and the type of targeted mRNAs. As a result of the sequence analysis, we provide a comprehensive description of the pathways, biological processes and proteins associated with each distinct group of characterized miRNAs. We found that the specific group of miRNAs containing UGUGU can activate targets associated with the interferon induction pathway or with pathways prominently observed during carcinogenesis. GU-rich miRNAs tend to regulate mostly processes in neurogenesis, whereas purine/pyrimidine-rich miRNAs are more likely to be involved in the transport and/or degradation of RNAs. Additionally, we analyzed simple sequence repeats (SSRs). Their variation within mature miRNAs might be critical for normal miRNA regulatory activity. Expansion or contraction of SSRs in a mature miRNA might directly affect its mRNA interaction or even change the function of that miRNA.
Our results show that, owing to their specific sequence features, these molecules can also be involved in well-defined cellular processes that depend on their sequence content. The pathway mapping and theoretical gene target identification allowed us to create a biological framework that shows the relevance of specific miRNAs in the regulation of distinct types of targets.

Introduction
potency could affect the functionality of miRNA-mediated regulation of gene expression, a full understanding of the various biological functions of miRNAs requires knowledge about the sequence and structure not only of mRNA targets and miRNA precursors, but also of mature miRNAs.
In this paper, we systematically studied, across mature human miRNAs: (i) the length distribution; (ii) the nucleotide content, the positional occupancy of nucleotides, and sequence-based motifs such as UGUGU, GU-repeats and purine/pyrimidine content; (iii) simple sequence repeats (SSRs). Additionally, we examined the structural tetranucleotide motifs (UUCG, GAAA, GCAA, GAGA, GUGA, GGAA, CUUG, UUUG) involved in stable hairpin formation and their propensity to regulate specific groups of targets. Using the ModeRNA software, we modeled, for the first time, the tertiary structure of mature miRNAs, which suggests that miRNA-target recognition may be not only sequence-related but also structure-related. Thus, in this paper we investigate the sequence features of mature miRNAs, in the conviction that such analyses can reveal the complexity of miRNA molecular structures and provide hints for genetic research, including cancer research. To predict the biological functions behind our sequence analysis results, we searched for the pathways corresponding to the distinct groups of miRNAs using DNA Intelligent Analysis (DIANA) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. Accordingly, with the DIANA-miRPath software we performed computational target prediction combined with pathway analysis. Apart from the pathway analysis, gene ontology, including biological processes, was also taken into account. The validity of the KEGG pathway clustering approach is further supported by an additional analysis with the PANTHER classification system, which also provided the list of protein classes involved in the distinct processes and pathways. Based on these analyses, we found a link between sequence content and the pathways, biological processes and proteins associated with the distinct groups of characterized miRNAs.

Library construction and clustering
All metazoan and human miRNA sequences available on 17th October 2012 were downloaded from miRBase version 19 in FASTA format (ftp://ftp.sanger.ac.uk/pub/mirbase/19.0/). The data were scanned for unique sequences of mature miRNAs. Molecules with identical sequences that were annotated under different names in miRBase were removed. The curated library contained 2042 human miRNA sequences.
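The library construction step can be illustrated with a minimal Python 3 sketch. It is not the authors' script (those are in the S1 File). The function names, the `hsa-` prefix filter and the toy records below are assumptions made for illustration only.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def unique_human_mature(text, prefix="hsa-"):
    """Map each distinct human sequence to the first name it appears under."""
    seen = {}
    for header, seq in parse_fasta(text):
        name = header.split()[0]
        if name.startswith(prefix) and seq not in seen:
            seen[seq] = name
    return seen


# Toy records (hypothetical names and sequences, not miRBase entries).
example = """>hsa-toy-1 first copy
UGAGGUAGUAGGUUGUAUAGUU
>hsa-toy-2 same sequence under another name
UGAGGUAGUAGGUUGUAUAGUU
>mmu-toy-1 non-human record
UGAGGUAGUAGUUUGUACAGUU
"""
library = unique_human_mature(example)
print(len(library))  # number of unique human sequences in the toy input
```

Keying the dictionary on the sequence rather than the name is what removes identical molecules annotated under different names.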

Analysis of miRNA nucleotide sequences and SSR analysis
The analysis was done with the use of the Python language, version 2.7.3. Separate scripts were prepared for each of the analyses (S1 File). Whole-sequence analysis was conducted on a defined number of sequences, with single-sequence accuracy.
For the calculation of monorepeats in single miRNA sequences, regular expressions were used.
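As an illustration of how monorepeats can be found with regular expressions, the following Python 3 sketch uses a back-reference to match runs of one base. The function name and the default minimum of three repeats are assumptions for this example. The original scripts are in the S1 File and may differ.

```python
import re


def mono_repeats(seq, min_repeats=3):
    """Return (base, run_length, start_index) for each mononucleotide run.

    A run is the same base repeated at least `min_repeats` times in a row.
    """
    pattern = re.compile(r"([ACGU])\1{%d,}" % (min_repeats - 1))
    return [(m.group(1), len(m.group(0)), m.start()) for m in pattern.finditer(seq)]


# Toy sequence (hypothetical): one G run of length 4 and one U run of length 3.
print(mono_repeats("UGGGGAUUUC"))
```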

Calculation of SSR relative count
The SSR relative count is the average number of repeats per miRNA. The calculated number of repeats was divided by the number of miRNAs in each analyzed group (e.g. the relative count for mononucleotide repeats is 2326/2042 = 1.14).
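The arithmetic of the relative count can be written as a one-line function. The function name is hypothetical. The two numbers are the ones quoted in the text for mononucleotide repeats.

```python
def relative_count(total_repeats, n_mirnas):
    """Average number of repeats per miRNA in the analyzed group."""
    if n_mirnas <= 0:
        raise ValueError("the group must contain at least one miRNA")
    return total_repeats / n_mirnas


# Mononucleotide repeats: 2326 repeats found in 2042 mature miRNAs.
print(round(relative_count(2326, 2042), 2))  # 1.14
```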

Functional analysis with DIANA miRPath
For the identification of the networks and pathways of the selected miRNAs we used the DIANA miRPath (v2.0) software (http://diana.imis.athena-innovation.gr) [30]. The DNA Intelligent Analysis (DIANA) prediction algorithm DIANA-microT-CDS (v5.0) was used for the identification of potential target RNAs from each cluster. A core analysis was performed to recognize the most relevant miRNA targets, canonical pathways, biological functions and physiological processes from the interactions provided by the DIANA database. All pathways significantly targeted by the selected miRNAs were identified in the "Union of pathways" mode. The enrichment analysis and the calculation of significance levels (p-values) were performed for each selected miRNA individually. Fisher's meta-analysis method was applied to calculate a merged p-value for each pathway. An a posteriori approach was used for the analysis, in which the statistical results show the probability that the surveyed pathway is significantly enriched with gene targets of at least one selected miRNA. The significance levels for all the miRNA-mRNA pairs in a pathway were calculated and then combined into a merged p-value for each pathway. The results are reported and visualized as heat maps, and the pathways are clustered based on significance levels.
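Fisher's method for merging p-values can be sketched in a few lines of Python 3. This is a generic textbook version and not the DIANA miRPath implementation. The function name and the example p-values are assumptions. The statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, and for an even number of degrees of freedom its upper tail has a closed form, so no external library is needed.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method."""
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half = -sum(math.log(p) for p in p_values)  # X / 2, with X = -2 * sum(ln p)
    # Upper tail of chi-square with 2k d.o.f.: exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical per-miRNA p-values for one pathway.
print(fisher_combined_p([0.01, 0.20, 0.03]))
```

With a single p-value the function returns that value unchanged, which is a convenient sanity check.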

Classification and pathway analysis with PANTHER System
The lists of miRNAs with distinct sequence motifs were first subjected to analysis with the TargetScanHuman program, release 7.0 (August 2015), to obtain the full list of predicted targets for each specific group of miRNAs. The lists of targets were then uploaded to the PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System (http://pantherdb.org). Using the overrepresentation test, the provided data set was compared with a reference list in the PANTHER database. Over- and underrepresentation of ontology categories was determined statistically with the binomial distribution test.
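The idea behind a binomial overrepresentation test can be sketched as follows. This is a generic one-sided (upper-tail) binomial probability written for illustration, assuming Python 3.8 or later for `math.comb`. It is not PANTHER's code, and the function name and all counts are hypothetical. The expected fraction of genes in a category is estimated from the reference list, and the p-value is the probability of seeing at least the observed number of category genes in the input list.

```python
from math import comb


def overrepresentation_p(k, n, ref_in_category, ref_total):
    """P(X >= k) for X ~ Binomial(n, p), with p = ref_in_category / ref_total.

    k: input-list genes annotated to the category
    n: size of the input list
    """
    p = ref_in_category / ref_total
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: 12 of 200 input genes fall in a category that holds
# 300 of 20000 reference genes (expected about 3 by chance).
print(overrepresentation_p(12, 200, 300, 20000))
```

An underrepresentation p-value would use the lower tail, P(X <= k), in the same way.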

miRNA sequence alignment
The command-line version of ClustalW, a multiplatform sequence alignment program, was used to compare distinct miRNA sequences with other RNAs whose hairpin structures had previously been confirmed experimentally.

Secondary structure analysis
The Mfold program, version 3.5, was used to calculate the folding free energy of hairpin structure formation for all the sequences, under the conditions given by the program (http://mfold.bioinfo.rpi.edu).

miRNA structure modeling
The ModeRNA program (http://iimcb.genesilico.pl/moderna/) was used for 3D modeling of miRNAs based on experimentally confirmed structures (templates) [31]. The RNA structures were downloaded from the PDB database. A sequence similarity of at least 80% in the alignment was taken as the threshold. The PyMOL program was used to generate the figures.

Results
The nucleotide composition in mature human miRNAs
The analysis of the length heterogeneity of 2042 sequences of human mature miRNAs showed that the number of nucleotides is a discrete random variable that ranges from 16 to 27. miRNA sequences with 21 (12%) and 23 (14%) nucleotides are abundant, but miRNAs 22 nucleotides in length are predominant (47%) (S1 Fig). These results corroborate recently published observations [32]. The sequence length analysis of human mature miRNAs showed that the length heterogeneity is related to biological factors such as evolutionary conservation or the miRNA's regulatory mechanism [32]. It has also been established that miRNAs that regulate cancer-associated targets (oncogenes/tumor suppressors) show stable lengths of 22 nucleotides, whereas longer miRNAs tend to regulate more genes [32].
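A length distribution of this kind can be tabulated with a short Python 3 sketch. The function name and the toy sequences are assumptions for illustration and do not reproduce the percentages reported above.

```python
from collections import Counter


def length_distribution(sequences):
    """Map each sequence length to (count, percentage of all sequences)."""
    counts = Counter(len(s) for s in sequences)
    total = len(sequences)
    return {length: (n, 100.0 * n / total) for length, n in sorted(counts.items())}


# Toy input (hypothetical sequences of 21, 22, 22 and 23 nucleotides).
toy = ["G" * 21, "A" * 22, "C" * 22, "U" * 23]
for length, (n, pct) in length_distribution(toy).items():
    print(length, n, "%.0f%%" % pct)
```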
Furthermore, we analyzed the nucleotide content of human mature miRNAs, which showed an unequal representation of bases: guanosine and uridine prevail at 29% and 26%, respectively, with lower levels of adenosine (23%) and cytidine (22%). This observation is also consistent with previously reported results [32]. Although the most abundant nucleotide in miRNAs is guanosine, the first position at the 5' end is more frequently occupied by uridine (35%) (Fig 1a and 1b). However, one can notice that the A nucleotide at the 2nd and 3rd positions is only slightly more represented than G (Fig 1a and 1b). It has also been observed that the presence of an A residue in the targeted mRNA, from position 11 of the 5' end of a miRNA, greatly improves the binding of the miRNA to the specific target [20,33]. This strengthening of the interaction is observed even though exact base-pairing between the A residue and the nucleotide at position 1 of the 5' end of the miRNA was shown not to be necessary [20,22]. The strongest representation of the U residue at the first and the last position of miRNA sequences was observed not only for human but also for five other mammalian species: Pongo pygmaeus, Pan troglodytes, Macaca mulatta, Mus musculus and Rattus norvegicus. This observation suggests that both ends, rather than just the 5' end, are involved in target recognition [34].
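The per-position nucleotide occupancy discussed here can be computed as in the following Python 3 sketch. The function name and the toy sequences are hypothetical. Positions are 0-based, and a negative index counts from the 3' end, so `-1` is the last nucleotide.

```python
from collections import Counter


def positional_frequencies(sequences, position):
    """Fraction of A, C, G and U at one position across all sequences.

    Sequences too short to have the requested position are skipped.
    """
    counts = Counter(
        s[position] for s in sequences if -len(s) <= position < len(s)
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: counts[base] / total for base in "ACGU"}


# Toy sequences (hypothetical).
toy = ["UAGC", "UGGA", "AAGU", "UCCU"]
print(positional_frequencies(toy, 0))   # occupancy of the first 5' position
print(positional_frequencies(toy, -1))  # occupancy of the last 3' position
```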
Based on a single-position analysis of the whole sequences, we also looked at purine and pyrimidine tracts in mature miRNAs. We set the threshold for purine-rich (G, A) and pyrimidine-rich (C, U) sequences at over 70% of the respective nucleotides. Based on that criterion we found 141 CU-rich and 187 AG-rich sequences. We found miR-4290, miR-1281, miR-4716-5p, miR-483-3p and miR-877-3p to contain almost only U and C nucleotides, and miR-6124, miR-483-5p, miR-4271, miR-4644, miR-4716-3p and miR-1234-5p to be purine-rich. Table 1 lists only the highly purine/pyrimidine-rich sequences, in which the level of the investigated nucleotides is over 90%.
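The 70% criterion can be expressed as a small classifier. This Python 3 sketch is illustrative only. The function name, the labels and the toy sequences are assumptions, and "over 70%" is read here as strictly greater than 0.70.

```python
def classify_purine_pyrimidine(seq, threshold=0.70):
    """Return "AG-rich", "CU-rich" or None for an RNA sequence."""
    if not seq:
        return None
    purine = (seq.count("A") + seq.count("G")) / len(seq)
    pyrimidine = (seq.count("C") + seq.count("U")) / len(seq)
    if purine > threshold:
        return "AG-rich"
    if pyrimidine > threshold:
        return "CU-rich"
    return None


# Toy sequences (hypothetical).
for s in ("GGAGGAGGAU", "CUCUCUCUAG", "ACGUACGU"):
    print(s, classify_purine_pyrimidine(s))
```

Raising `threshold` to 0.90 would select only the highly enriched sequences of the kind listed in Table 1.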
The functional analysis of the cellular pathways showed that the purine/pyrimidine-rich sequences are involved mostly in lysine degradation, RNA transport and the spliceosome (Fig 2a, S1 Table). The search generated a rank-ordered list of KEGG and PANTHER pathways, with statistical significance based on p-values. Examining the pathways affected by the distinct group of miRNAs, we highlighted pathways associated with different physiological processes connected to the specified group of miRNAs. The heat map indicates high enrichment of, e.g., miR-483 in RNA degradation, miR-4271 in lysine degradation or miR-4716 in the spliceosome (Fig 2b). The list of proteins involved in the pathways and processes associated with the purine/pyrimidine-rich miRNAs also corroborates these results (S1 Table). Membrane trafficking regulatory, membrane traffic and nucleic acid binding proteins are the most common proteins and the main players in binding and RNA transport processes (S1b Table). Apart from targeting the distinct group of mRNAs, polypyrimidine tracts could possibly allow miRNAs to interact with other proteins, such as PTB (polypyrimidine tract-binding protein), preventing the typical regulatory action of miRNAs [35]. On the other hand, such miRNAs can interact with the Kozak sequence, a purine-rich motif similar to the prokaryotic Shine-Dalgarno sequence [36][37]. Recently, a novel mechanism for post-transcriptional control of gene expression was shown [38]. It involves the formation of an intermolecular G-quadruplex between a small RNA and an mRNA, in which a four-stranded RNA structure is formed through direct RNA-RNA dimerization. This suggests that G-rich miRNAs could also regulate gene expression via this kind of mechanism.

Simple sequence repeats (SSR) analysis
Despite extensive research, there are only limited data about nucleotide composition, especially in terms of nucleotide repeats, in mature miRNAs. Such an analysis could be beneficial, since SSRs can potentially be used for the construction of genetic maps and as genetic markers, but also in linkage association research, phylogenetics and population genetics studies [39][40][41][42][43][44]. It was also shown that SSRs could be used to target genes involved in diseases, for research on evolutionary history and, finally, in paternity tests [45][46][47][48].
Based on the nucleotide sequence analysis, we performed SSR monitoring. We searched human miRNAs for the occurrence and nature of SSRs. The most important criterion for the identification of SSRs in a given sequence is the definition of the minimum repeat number: a tract with at least this number of repeat units is annotated as a valid SSR tract. In order to detect various repeats in mature miRNAs, and taking into account the length of the molecules, we considered a minimum of three repeats. Consistently, a minimum of three repeats was also used for the survey of SSRs in pre-miRNA studies [49]. Among the analyzed mature miRNA sequences we were able to define mono-, di-, tri-, tetra- and pentanucleotide repeats.
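A search for SSRs with a minimum of three repeats can be sketched with a regular-expression back-reference. The following Python 3 code is an illustration under stated assumptions and not the authors' script. The function names are hypothetical, and the choice to skip units that are themselves repeats of a shorter unit (so that GGGGGG is not reported as a GG dinucleotide repeat) is a design decision of this sketch.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit."""
    return unit not in (unit + unit)[1:-1]


def find_ssrs(seq, unit_len, min_repeats=3):
    """Return (unit, repeat_count, start_index) for tandem repeats of one unit size."""
    pattern = re.compile(r"([ACGU]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if is_primitive(unit):
            hits.append((unit, len(m.group(0)) // unit_len, m.start()))
    return hits


# Toy sequences (hypothetical): a (GU)3 tract and a (UGGGC)4 tract.
print(find_ssrs("AGUGUGUC", 2))
print(find_ssrs("UGGGC" * 4, 5))
```

Calling `find_ssrs` once per unit length from 1 to 5 would cover the mono- to pentanucleotide classes named in the text.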
Mononucleotide repeats poly(U) and poly(G) were the predominant repeats in all of the analyzed mature miRNAs (Table 2). The longest mononucleotide repeat observed was the 18-nucleotide poly(G) SSR. For poly(U), poly(A) and poly(C) repeats, the maximum lengths we found were 10 nucleotides for both the "U" and "A" SSRs and 7 for "C". Dinucleotides were the second most common repeats in mature miRNAs (Table 2). We found that GU/UG repeats were also predominant in human miRNAs (Table 2). To get a comprehensive view of the specified miRNAs, we established the list of GU/UG-rich miRNAs (Table 3). Less frequent, with almost the same relative count, were the AG/GA and AU/UA repeats (Table 2). CG/GC repeats in mature miRNAs are relatively rare (Table 2). The most frequently repeated dinucleotide motif, found within the sequence of hsa-miR-1277-5p-1, was (AU), repeated 7 times. Interestingly, most of the trinucleotide repeats contain the bases "U" and "G", although this type of SSR is less represented. AGU, GGU and UGU SSRs were not observed in human mature miRNA sequences, whereas the GUG motif was observed 6 times. Detailed data about trinucleotide repeats are given in S2 Table. The mature human miRNA with the longest trinucleotide repeat, (AGG)4, is hsa-miR-4298-1.

Fig 2. The KEGG pathway analysis for pyrimidine/purine (P/P)-rich miRNAs. A. The table lists: the P/P-rich miRNAs (first column); the IDs and names of the KEGG pathways (second and third columns); the number of genes and P/P-rich miRNAs associated with the pathways (fourth and sixth columns). The p-value resulting from the statistical analysis is given in the fifth column. The p-value threshold is 0.05. B. P/P-rich miRNAs in the predicted pathway heat map. Significant miRNA-pathway interactions: p<0.001. doi:10.1371/journal.pone.0151246.g002
Although tetranucleotide and pentanucleotide repeats were less expected in mature miRNAs, we were still able to find such SSR types (Table 2). An interesting example of a mature miRNA with pentanucleotide repeats is hsa-miR-3620-5p-1, in which we identified (UGGGC)4 (Table 2). Until now, there has been only a single study focused on SSRs in short sequences, namely pre-miRNAs [49]. However, since miRNAs are functional molecules, we decided to monitor the SSRs in mature miRNAs. For this group of molecules, we noticed the same tendency that had already been observed for pre-miRNAs. Among the majority of the analyzed miRNAs, poly(A/U) repeats were more frequent than poly(G/C) repeats. Translated to the genome level, this observation, both in the previous pre-miRNA analysis and in our mature miRNA study, is similar to the primate genome [50]. Mononucleotide and dinucleotide repeats were significantly predominant, which is similar to pre-miRNAs and to introns, in which the majority of SSRs are also mono- and dinucleotides, whereas tri-, tetra-, penta- and hexanucleotide repeats are relatively rare. It has been reported that (GT)n is the most predominant dinucleotide repeat motif in animals and invertebrates. Similarly to pre-miRNA sequences, the most abundant repeats were (GU)n/(UG)n as well. Thus, one could assume that SSRs, both in pre-miRNAs and miRNAs, might have a functional meaning. They may provide a molecular basis for the organization of pre-miRNAs in vivo or for the rapid maturation of miRNAs. Changes of SSRs within mature miRNA sequences, similarly to pre-miRNAs, might have a critical impact on normal miRNA regulatory activity, and variations of SSRs in mature miRNAs can influence their direct mRNA target interaction or even alter the function of that miRNA.
Additionally, our analysis showed that miRNAs enriched with a particular sequence motif are predicted to target different pathways. The most significant pathways associated with GU-rich miRNAs are "dopaminergic synapse", "lysine degradation" and "long-term potentiation". A more intense red color in the heat map indicates a higher probability that a specific pathway is significantly enriched with target genes for a certain miRNA (Fig 3). This analysis also revealed that the GU-rich miRNAs are more likely to be involved in neurological processes such as "Opioid proopiomelanocortin pathway" or "Axon guidance mediated by netrin" (Fig 3, S3 Table). The list of the most important proteins involved in the processes in which GU-rich miRNAs participate includes, among others, voltage-gated sodium channel proteins, sodium channel proteins and SNARE proteins, whose primary role is to mediate lysosome formation, particularly in the presynaptic membrane of neurons (S3 Table). Although AU-rich elements (AREs) are very abundant in the 3'UTRs of many different mammalian mRNAs with unstable structure, the presence and function of GU-rich elements (GREs) are still poorly understood. Genome-wide analysis found that at least 5% of human genes contain GREs in their 3'UTR. These genes are functionally over-represented among those involved in transcription, nucleic acid metabolism, developmental processes and neurogenesis [51]. Until now, there has been no report showing the importance of GU-rich miRNAs in specific processes. However, our global sequence analysis of miRNAs demonstrates that this type of miRNA also targets mRNAs involved in biological processes such as mRNA 3'-end processing, mRNA transcription or nervous system development (S3 Table).
On the other hand, it was shown that GREs are the targets of at least one RNA-binding protein: CUG-binding protein 1 (CUG-BP1), a member of the highly conserved CELF family of RNA-binding proteins that are post-transcriptional regulators of deadenylation, mRNA decay, translation and pre-mRNA processing [52][53][54][55][56][57]. It is possible that CUG-BP1 could also directly bind GU-rich miRNAs and affect protein expression. This, however, needs further investigation.

The identification of interferon induction motif (IIM)
Going deeper into the sequence analysis of mature miRNAs, we identified the 5'-UGUGU-3' motif, which is known to stimulate the innate immune response. We found 50 mature miRNAs containing that motif (Table 4). This feature of the UGUGU sequence was first shown to be extremely important for siRNAs, as it may cause a variety of non-specific side effects, including stimulation of interferon and cytokine production, global shutdown of protein synthesis or nonspecific degradation of mRNAs [58][59]. A miRNA containing interferon stimulation motifs can itself induce one or more undesirable effects, such as blockage of proliferation, differentiation or apoptosis of cancer cells, or it can even serve as a potent immunomodulatory agent [58][59].
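Screening sequences for the 5'-UGUGU-3' motif is a simple substring search. The Python 3 sketch below uses a regular-expression lookahead so that overlapping occurrences are also reported. The function names and the toy sequences are assumptions for illustration.

```python
import re


def motif_positions(seq, motif="UGUGU"):
    """Return 0-based start positions of all (possibly overlapping) motif hits."""
    return [m.start() for m in re.finditer("(?=%s)" % re.escape(motif), seq)]


def sequences_with_motif(named_sequences, motif="UGUGU"):
    """Select the names of sequences that contain the motif at least once."""
    return [name for name, seq in named_sequences.items() if motif in seq]


# Toy sequences (hypothetical names and sequences).
toy = {"toy-1": "AUGUGUGUC", "toy-2": "ACGGCAUUCC"}
print(motif_positions(toy["toy-1"]))
print(sequences_with_motif(toy))
```

The start position is useful because the text notes that the location of the motif, especially at the 5' end, affects the strength of interferon induction in siRNAs.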
We found that, apart from the potential of this group of miRNAs to directly activate the interferon response, they also target particular mRNAs involved in specific pathways and cellular processes. The analysis for UGUGU-rich miRNAs ranked the following pathways at the top of the list: pathways in cancer, PI3K-Akt signaling, MAPK signaling and hedgehog signaling pathways (Fig 4, S4 Table). Our analysis showed that this type of miRNA is involved, among others, in processes such as induction of apoptosis, immune system process, immune response or macrophage activation (S4 Table). Based on the KEGG and PANTHER analyses of miRNA involvement in this type of biological process, it appears that for IIM-rich miRNAs we could also expect direct recognition of the mRNAs involved in the PI3K-Akt signaling and MAPK signaling pathways (Fig 4a, S4 Table). The potential involvement of IIM-containing miRNAs in these pathways is also supported by the list of protein classes provided by the PANTHER analysis (S4 Table). It was shown that positioning of the 5'-UGUGU-3' motif, especially at the 5'-end of the sense strand of siRNAs, results in a rapid and enhanced induction of type I IFN. Rapid production of IFN-β thus involves activation of signaling cascades governed by effectors that are intermediates in the JAK/STAT, mitogen-activated protein kinase (MAPK) and phosphatidylinositol 3-kinase (PI3K) pathways [60][61]. This result indicates that IIM-rich miRNAs, similarly to the cancer-associated miRNAs (CA-miRNAs), are closely connected to cancer [62]. One of the most prominently observed pathways for this class of IIM-miRNAs, pathways in cancer, is highly represented by the gene targets of IIM-miRNAs.
This kind of pathway clearly demonstrates involvement in biological capabilities such as evasion of apoptosis, block of differentiation, unlimited replication, increased angiogenesis, and sustained ability for invasion and metastasis in malignant transformation [62]. The mitogen-activated protein kinase (MAPK) pathway also functions to integrate signals that affect proliferation, differentiation, survival and migration. The results indicate that the IIM-rich miRNAs could also be promising clinical targets for cancer through the MAPK pathway. The phosphatidylinositol 3-kinase (PI3K)-Akt signaling pathway is activated by many types of cellular stimuli or toxic agents and is involved in the regulation of basic physiological cellular functions such as transcription, translation, proliferation, growth or survival [63]. It was shown that the serine/threonine kinase Akt/PKB plays a significant role in this pathway. Moreover, impaired activation of the PI3K-Akt pathway has been associated with the development of different types of diseases, such as diabetes mellitus, autoimmunity and, finally, cancer [64].

The characteristics of the stem-loop hairpin structure of mature human miRNAs
It is already known that over 50% of the hairpins predicted for the human 16S and 23S rRNA have tetranucleotide loops [65]. The tetraloops are thought to fulfill a variety of functions, including serving as recognition elements for interactions with proteins and other RNAs. They can regulate the activity of a biological system by shifting the equilibrium between alternate structures [66]. It was observed with NMR methods that small stem-loops can exist in solution in equilibrium with duplex forms [67]. About 70% of these tetraloops have the consensus loop sequences GNRA or UNCG (where N = A, C, G or U; and R = A or G). It has been shown that RNA hairpins with these sequences form unusually stable loop conformations [65][66][67]. With the Mfold algorithm, we found 1431 (70%) sequences of human miRNAs able to fold into a hairpin structure, whereas 588 sequences did not show such a propensity. The minimal free energy (ΔG) of these structures falls in the range from -0.1 to -11.1 kcal/mol; however, the most widely represented structures have energy levels from -0.4 to -3.3 kcal/mol [68]. In order to gain better insight into the potential secondary structure of mature miRNAs, we also analyzed the possibility of hairpin stem-loop formation. Our calculation revealed that the loop occurring most frequently in miRNAs is the four-nucleotide loop (tetraloop), found in 433 sequences of mature miRNAs. The other hairpin loops, of three, five and six nucleotides, were found in 413, 283 and 114 miRNA sequences, respectively.

Table 5. Occurrence of tetranucleotide motifs connected with hairpin formation in human miRNA sequences. All miRNAs with the ability to form a secondary structure were subjected to the analysis. The search was performed for: first, motifs present in the loop of the predicted hairpins; second, motifs present in the loop where at least 3 base pairs were predicted in the stem; and finally, motifs present in the loop where at least 3 base pairs and, additionally, C-G as the closing pair in the stem were predicted.
Among tetraloops, we specified the sequences UUCG, GAAA, GCAA, GAGA, GUGA, GGAA, CUUG and UUUG as involved in hairpin loop formation. Many sequences of mature miRNAs can fold exactly into a hairpin loop, with the nucleotides of interest located in the loop and with a minimum of 3 base pairs in the stem (Table 5).
We found that 179 miRNAs can form a stable hairpin structure with the 4-nt motifs (Table 5). The most represented motif in human miRNA sequences is GAGA, although the GGAA, GAAA and UUUU motifs are also abundant. The UUUU motif in the loop, together with CG as the closing pair in the stem, is the most widely represented among mature human miRNAs (Table 5).
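The three search criteria (motif in the loop, at least 3 base pairs in the stem, C-G closing pair) can be checked on a predicted structure as sketched below. This Python 3 example assumes that the secondary structure is available in dot-bracket notation, which is an assumption of this sketch: Mfold output would first have to be converted to that form. The function names and the toy hairpin are hypothetical.

```python
import re

TETRALOOP_MOTIFS = {"UUCG", "GAAA", "GCAA", "GAGA", "GUGA", "GGAA", "CUUG", "UUUG"}


def hairpin_loops(seq, structure):
    """Return (loop_sequence, stem_length, closing_pair) for each hairpin loop.

    stem_length counts the consecutive stacked base pairs that close the loop.
    """
    results = []
    for m in re.finditer(r"\((\.+)\)", structure):
        i, j = m.start(), m.end() - 1  # innermost (closing) base pair
        closing_pair = (seq[i], seq[j])
        stem = 0
        while i >= 0 and j < len(structure) and structure[i] == "(" and structure[j] == ")":
            stem += 1
            i -= 1
            j += 1
        results.append((seq[m.start(1):m.end(1)], stem, closing_pair))
    return results


def meets_criteria(loop, stem, closing_pair, min_stem=3, require_cg=False):
    """Apply the motif, stem-length and optional C-G closing-pair criteria."""
    if loop not in TETRALOOP_MOTIFS or stem < min_stem:
        return False
    return closing_pair == ("C", "G") or not require_cg


# Toy hairpin (hypothetical): 3-bp stem, GAGA loop, C-G closing pair.
loops = hairpin_loops("GGCGAGAGCC", "(((....)))")
print(loops)
print(meets_criteria(*loops[0], require_cg=True))
```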
The observation that miRNAs can fold into a secondary structure is also consistent with previous reports that plant miRNAs can form a secondary structure with a free energy of -9.3 to +1.5 kcal/mol [69]. It directly supports the observation that individual miRNAs differ dramatically in terms of half-life. Thus, the knotty secondary structures of miRNAs make them nuclease-resistant, implying their long survival in the cell. The higher-order structure of a miRNA can play a crucial role in conformational changes during miRNA-mRNA interactions. This could modulate the pairing and also explain the different degrees of genetic regulation by specific miRNAs. The secondary structure could also be extremely important in the mechanism by which sequences of some miRNAs are selected, which can modulate their affinity for their mRNA targets. The presence of miRNA secondary structure can also provide the specificity of the miRNA-mRNA interaction by precluding the binding of other miRNAs and genomic RNAs with complementary sequences.
Although mature miRNAs are generally considered single-stranded, a few reports suggest self-complementarity in mature miRNAs, which has already been observed in more than 50% of mature miRNAs. They are prone to form hairpins and/or homo-duplexes in solution [38][57][58]. NMR studies have shown that the hsa-miR-520h mature strand can fold into a hairpin structure or, at higher concentration, into a self-complementary homo-duplex [70].
Next, we used the ModeRNA program to model miRNA 3D structure based on templates of related molecules. To find tertiary structures, we searched the PDB database for experimentally confirmed (X-ray, NMR or CD) RNAs about 24 nucleotides long. We found 168 sequences, which were aligned to human mature miRNA sequences. For further analysis, we only took into account sequences with at least 80% similarity, since only that level provides reliable homology modeling. The alignment showed 35 miRNAs that were further processed with the ModeRNA server (Fig 5a). Positions with identical nucleotides were fixed, whereas the remaining positions were modeled by the program. The best-matched results we obtained include miR-381-3p with the 1R4H RNA, miR-4649-5p with 2KYE and miR-3677-5p with the 1Q8N RNA (Fig 5b). Owing to the homology modeling, the ModeRNA analysis makes our calculation and secondary structure prediction more accurate and gives strong support to the miRNA hairpin loop formation hypothesis.
Our CD, NMR and enzymatic probing data for some miRNAs also show that miRNAs have the intrinsic potential to form secondary structure and that the hairpin is possibly the prevailing form of miRNA in the cell [68].

The association of structured miRNAs and the cellular pathways
The pathway analysis revealed that miRNAs containing different sequences in the tetraloop can also cluster into specific groups. This clustering approach revealed that miRNAs enriched in the CUUG or GUGA sequence in the tetraloop are regulators of targets from the Wnt signaling pathways. The neurotrophin signaling pathway appears to be more associated with miRNAs containing the UUUG or UUUU motifs, whereas dopaminergic synapse pathways are targeted more by miRNAs with the GUGA and GGAA motifs, and the PI3K-Akt signaling pathway by UUUG- and GAGA-enriched miRNAs (Fig 6, S5 Table).

Conclusions
For a very long time, RNA was considered to be exclusively the carrier of genetic information, but now the group of RNAs with regulatory function has grown to include miRNAs, long non-coding RNAs (lncRNAs), circular RNAs and miRNA sponges [71].
Although one miRNA can potentially regulate hundreds of different mRNAs, the majority of transcripts are still actively expressed and translated, which supports the existence of other mechanisms that counteract miRNA regulation to achieve homeostasis.
The expression of gene families or of several components of a particular signaling pathway is frequently regulated by miRNAs. Pathway analysis of potential targets based on the sequence, and thus the structure, of miRNAs could therefore enhance the probability of identifying and verifying the relevant miRNA-target interactions.
Our observations strongly suggest that miRNA persistence is related to biological function; thus, better characterization of miRNA structure, stability and the associated regulatory mechanisms should provide new avenues for the characterization of their biological function. We postulate that, owing to their specific sequence features, these molecules can also be involved in very well-defined cellular processes that depend on their sequence content. Moreover, the unique features encoded in the sequence and in the structure of mature miRNAs can be a key to understanding mRNA target recognition.

Supporting Information
S1 Table. The results from the analysis of pyrimidine/purine-rich miRNAs using the PANTHER classification system. The table presents: the top 10 biological processes related to the input miRNAs; the most significant pathways derived from the overrepresentation test; and the top 10 protein classes related to the input miRNAs. +/- shows over- or underrepresentation. The second and third columns contain the number of genes in the reference and input lists, respectively. The p-value threshold is 0.05. (DOC)
S2 Table. Occurrence and relative count of trinucleotide repeats in mature miRNAs. (DOC)
S3 Table. The results from the analysis of GU-rich miRNAs using the PANTHER classification system. The table presents: the top 10 biological processes related to GU-rich miRNAs; the most significant pathways derived from the overrepresentation test; and the top 10 protein classes related to GU-rich miRNAs. +/- shows over- or underrepresentation. The second and third columns contain the number of genes in the reference and input lists, respectively. The p-value threshold is 0.05. (DOC)
S4 Table. The results from the analysis of interferon induction motif (IIM)-containing miRNAs using the PANTHER classification system. The table presents: the top 10 biological processes related to IIM-containing miRNAs; the most significant pathways derived from the overrepresentation test; and the top 10 protein classes related to IIM-containing miRNAs. +/- shows over- or underrepresentation. The second and third columns contain the number of genes in the reference and input lists, respectively. The p-value threshold is 0.05. (DOC)
S5 Table. Molecular KEGG pathway analysis for miRNAs with defined tetraloops in the secondary structure. (DOC)