A Comprehensive Analysis of miRNA/isomiR Expression with Gender Difference

Although microRNAs (miRNAs) have been widely studied as epigenetic regulation molecules, fewer studies focus on the gender difference at the miRNA and isomiR expression levels. In this study, we aim to understand the potential relationships between gender difference and miRNA/isomiR expression through a comprehensive analysis of small RNA-sequencing datasets based on different human diseases and tissues. Based on specific samples from males and females, we determined that some miRNAs may be diversely expressed between different tissues and genders. Thus, these miRNAs may exhibit inconsistent and even opposite expression between males and females. According to deregulated miRNA expression profiles, some dominantly expressed miRNA loci were selected to analyze isomiR expression patterns using rates of dominant isomiRs. In some miRNA loci, isomiRs showed statistical significance between tumor and normal samples and between males and females samples, suggesting that isomiR expression patterns are not always invariable but may vary between males and females, as well as among different tissues, tumors, and normal samples. The divergence implicates the fluctuation in the expression of miRNA and its detailed expression at the isomiR levels. The divergence also indicates that gender difference may be an important factor that affects the screening of disease-associated miRNAs and isomiRs. This study suggests that miRNA/isomiR expression and gender difference may be more complex than previously assumed and should be further studied according to specific samples from males or females.


Introduction
A class of small non-coding RNAs (ncRNAs), microRNAs (miRNAs), has been widely studied as endogenous negative regulatory molecules to control post-transcriptional gene expression [1]. miRNAs play a pivotal role in coding-non-coding RNA regulatory networks, and many abnormally expressed miRNAs were proven to be associated with human diseases, particularly in promoting carcinogenesis in some cancers [2,3]. Although a single miRNA sequence is believed to come from a miRNA locus, a series of heterogeneous sequences have been widely detected and identified in both human and plant miRNAs by comprehensively analyzing highthroughput sequencing datasets [4][5][6][7][8][9]. Some isomiRs have been proven as functional small RNAs by associating with target mRNAs [6,[10][11][12][13] and influencing miRNA stability [14] or effectiveness [15,16]. More studies focus on these multiple miRNA variants because of the versatile roles of isomiRs, particularly those dominantly expressed isomiRs, and isomiRs with 5' variation and/or 3' additions. Simultaneously, the analysis of miRNA-miRNA or isomiR-iso-miR interaction combined with homologous and/or clustered miRNAs is necessary from the miRNA/isomiR levels [17], and miRNA may be thoroughly studied from the detailed isomiR expression.
Some relevant biological molecules, including proteomes, mRNA, and miRNA, were reported to be differently expressed or enriched between samples from males and females [18][19][20][21]. Specifically, some miRNAs were determined to have various expressions between genders [22,23], and the promoter methylation of miR-137 was validated as a female-associated molecule [24]. Loher et al. [25] found that expressions of many isomiRs diverged between males and females. These findings demonstrate that miRNA or isomiR expression may be related with gender difference, and gender-associated miRNA or isomiR expression and function should be not ignored. Typical analyses and studies always disregard the relationship of miRNA or isomiR expression and gender difference, and some gender-related miRNAs may be ignored or balanced using the typical analysis. However, fewer systematic studies are carried out, particularly those that are based on the miRNA or isomiR biogenesis and the expression between males and females across different diseases and tissues.
In this study, based on the potential importance and relationships of miRNA/isomiR expression and gender difference, a comprehensive analysis was performed using specific and common diseases in males and females. The study aimed to explore the divergence of miRNA/ isomiR expression profiles and gender difference at the miRNA and isomiR levels, respectively. Specifically, the potential expression divergence was analyzed between different tissues, tumor and normal groups, and isomiR expression patterns in different tissues and genders. According to miRNA/isomiR characteristics and gender difference, the study may provide more information and implications for studies on miRNA and isomiR, particularly specific miRNA and iso-miR expression profiles in specific human diseases.

Flowchart
A flowchart of a comprehensive analysis of miRNA and isomiR expression was indicated in Fig  1, which would contribute in determining the relationship of miRNA/isomiR expression and gender difference.

Source data
The small RNA deep sequencing datasets used in this work were obtained from The Cancer Genome Atlas (TCGA) pilot project (https://tcga-data.nci.nih.gov/tcga/), and all the data were sequenced by the Illumina HiSeq sequencing platform. These datasets included small RNA sequencing of tumor and control samples from females (a specific disease in females, uterine corpus endometrial carcinoma (UCEC)), males (specific disease in males, prostate adenocarcinoma (PRAD)), and females and males (common diseases in males and females, lung squamous cell carcinoma (LUSC) and thyroid carcinoma (THCA)). Equal sample sizes were randomly obtained from the corresponding tumor samples based on the same and similar characteristics because of the divergence of sample sizes between tumor and normal samples (S1 Table).

Sequence and expression analysis
Small RNA expression profiles were first collected from TCGA (the original sequencing data were analyzed by mapping analysis), and these profiles were further analyzed at the miRNA and isomiR levels. Differentially expressed miRNAs were obtained using DESeq package [26] according to miRNA expression profiles if the profiles were statistically diverged and had relative abundant enrichment levels. Based on the obtained abnormal miRNA expression profiles, some abundantly expressed miRNA loci were selected to further screen the isomiR expression profiles (Fig 1).
According to all isomiRs from a specific miRNA locus, the relative expression level of each isomiR type was estimated using the percentage in each individual. The average percentages and standard deviations were used to describe isomiR expression according to the multiple individuals in a sample. The rates of the top four dominant isomiRs were selected to estimate the isomiR expression patterns based on several dominant isomiRs in the miRNA locus [4,9]. Specifically, the rates of dominant isomiRs were employed, including the rate of the most dominant and secondary dominant isomiRs, the rate of the most dominant and the third dominant isomiRs, and the rate of the most dominant and the fourth dominant isomiRs.

Statistical analysis
Venny's distribution of deregulated expressed miRNAs was constructed using Venny web server 2.0 (http://bioinfogp.cnb.csic.es/tools/venny/index.html). The expression data at the miRNA/isomiR levels were described as mean ± standard deviation ( x ± SD). Differentially expressed miRNA species were collected by utilizing a DESeq package that contained statistical analysis, including multiple comparison correction using false discovery rate (FDR). The rates of dominant isomiRs were adopted to estimate isomiR expression, and the distribution of each rate followed the normal distribution. A t test was used to estimate the difference between tumor and relevant normal tissues, and between samples from males and females in each  disease. Furthermore, ANOVA analysis was employed to estimate the isomiR expression difference among different groups. If the P or P adj values (the associated FDR) were less than 0.05, the difference would be considered a statistical difference.

Expression analysis at the miRNA level
Based on relative expression levels at the miRNA level (the average sequence counts were more than 50 in tumor or normal tissues) (Fig 1 and S1 Table), a series of miRNAs were collected as significant abnormally expressed species in tumor tissues (|Log 2 (Fold-change)| 2.0 and P adj < 0.05), and the up-regulated miRNAs in these tumor samples were dominant (Fig 2). A total of 38 deregulated miRNAs were obtained in PRAD and followed by 35 miRNAs in UCEC, whereas less deregulated expressed miRNAs were detected in PRAD (9) and LUSC (15) according to the standard (S1A Fig). Among these deregulated miRNAs, less common species were detected across different tissues (S1B Fig).
Based on deregulated miRNAs and their relative expression levels (average sequence counts were more than 5000 in tumor or normal tissues, which could ensure further analysis at the isomiR levels), some typical miRNA loci were presented in S2 Table. Further analysis was performed according to the female and male samples in the same tissues (Table 1 and  Interestingly, the specific miRNA, miR-375, was identified as up-regulated miRNA in femalespecific UCEC (log 2 FC = 2.61, P = 0.0262), whereas it was also determined as up-regulated miRNA without statistical significance in male-specific PRAD (log 2 FC = 2.15, P = 6.93E-10) (S2 Table).

Expression analysis at the isomiR level in different diseases
Deregulated miRNAs were analyzed based on the sum of all varied sequences, which were also called isomiR sequences in the corresponding miRNA gene loci, whereas the detailed sequences were not further analyzed. According to versatile biological roles and potential function, these isomiR sequences should be further involved in the in-depth analysis. Therefore, some common deregulated miRNA loci were selected to perform isomiR expression analysis based on gender-specific diseases and the common diseases, including down-regulated miR-143 and upregulated miR-182 (S2 Table). The rates of dominant isomiRs from the two miRNA loci exhibited significant difference across the eight groups (Fig 3). Based on all the normal and tumor groups across different tissues, the rates of dominant isomiRs showed diverse expression except for miR-143 in normal groups. The rate of the most dominant and secondary dominant iso-miRs showed a statistical difference among normal and tumor groups (Fig 3).
For each pair of tumor, the corresponding normal samples, and male and female samples, a relevant t test was used to estimate the difference. As expected, isomiR expression may show a significant difference between tumor and control samples, particularly for miR-143 in malespecific PRAD disease and miR-182 in LUSC disease (S3 Table). The two miRNA loci showed diverse divergence between other pairwise samples, and they may also exhibit significant difference between various tissues, although the difference was more common between males and females. miRNA locus may generate inconsistent isomiR expression patterns between different gender-specific diseases, between different common diseases, and between males and females (S3 Table).

Expression analysis at the isomiR level between males and females
We further selected the common dominantly expressed miRNA loci to track isomiR expression patterns between different genders based on the same human diseases between males and females, including the diseases LUSC and THCA. Some miRNAs were collected to understand the isomiR expression in LUSC, including down-regulated miRNAs (miR-451 and miR-30a) and up-regulated miRNAs (miR-205, miR-210, miR-183, and miR-9). In THCA, only one upregulated miR-221 was collected, and the further extension of down-regulated miR-451 was selected although it was identified as down-regulated miRNA without statistical significance in the male-THCA group (log 2 FC = -1.83, P adj = 0.6557). Expression analysis showed that most of these deregulated miRNAs had similar isomiR expression patterns across the four groups with pairwise male and female groups. However, miR-30a loci exhibited statistical difference in all the most dominant isomiRs (P < 0.0001), and other miRNAs may show significant difference in some certain rates of dominant isomiR species (Fig 4). For example, in the miR-183 locus, the rate of the most dominant and secondary dominant isomiRs was similar across different groups, but other rates showed significant difference (P = 0.0018 and P < 0.0001, respectively, Fig 4). In the miR-451 locus, except for the rate of most dominant and fourth dominant isomiRs, other rates differed, which also demonstrated that the miRNA locus could generate different isomiR expression profiles. Further statistical analysis based on pairwise male and female samples and tumor and normal samples in specific disease were performed using an unpaired t test. The isomiR expression of the miR-30a locus showed statistical difference. The rate of the most dominant and secondary dominant isomiRs in miR-451 differed between normal and tumor samples in males and females, whereas other miRNA loci exhibited similar isomiR expression between normal and tumor samples ( Table 2). For the expression between males and females, miR-30a showed significant expression divergence in normal tissues, whereas no significant difference was detected in tumor samples. The rate of the most dominant and secondary dominant isomiRs varied between tumors from males and females (t = 2.4642, P = 0.0254), whereas other miRNA loci showed similar isomiR expression (Table 2).
Other deregulated miRNAs were only analyzed at the isomiR levels in tumor or normal tissues of males and females because these miRNAs were rarely detected in the corresponding The results shown here are in whole based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/.

Discussion
The crucial roles of miRNAs in various diseases have attracted many researches, and diseaserelated miRNAs, including their target mRNAs, can be predicted based on biological interaction network [27][28][29][30][31][32][33][34][35]. Systematic analysis and study of roles and interactions of miRNAs in   Table 2. Statistical analysis between paired tumor and normal samples or male and female samples using t test in Fig 4. miRNA Tissues The first/second The first/third The first/fourth miR-30a Female: LUSC-nt & LUSC-tn specific biological process, such as cell death [36][37][38], cell proliferation [39,40], have been performed in diverse human diseases. The phenomenon of multiple isomiRs further enriches the miRNA/isomiR study, and disease-related isomiRs may be next crucial markers to study occurrence and development of diseases. Although the phenomenon of isomiRs was first detected by analyzing deep-sequencing small RNA datasets, these miRNA variants with various sequences and expressions gained the attention of researchers based on their versatile biological functions [6,[10][11][12][13][14][15][16]. In the specific miRNA locus, different isomiRs may demonstrate diverse deregulated expression patterns and adverse expression, although these sequence-related isomiRs may have close functional relationships, including co-regulating target mRNAs. The typical analysis of miRNAs without considering the isomiR level may be a partial solution, and the comprehensive analysis from the isomiR level is quite important, particularly because some deregulated miRNAs are prone to detect abnormal isomiR expression patterns [41]. Loher et al. reported that isomiR expression profiles may be population-dependent and gender-dependent [25], particular for the diversion of gene expression and protein between males and females [18][19][20][21]. For small non-coding RNAs, more studies focused on the gender-specific miRNAs or miRNA expression with gender difference [42][43][44][45], but systematic analysis of isomiR expression based on miRNA loci and studies based on deregulated miRNA loci are rare. Is the gender-dependent isomiR expression profile highly significant? This study may provide more implications for miRNA and isomiR studies in gender-relevant samples. Based on these results and our previous studies, we aimed to explore the potential isomiR expression in different human diseases and genders, including male-specific and female-specific diseases and diseases for both genders. Further analysis was performed between males and females using the common tissues and diseases. Diversely deregulated miRNA and isomiR expression profiles can be found in specific diseases from males and females (Fig 2, S2 Table, and S1 Fig). Although some miRNAs are identified as common deregulated species in gender-specific diseases, such as deregulated miR-182 and miR-183, these miRNAs always exhibit different levels of up-or down-regulated expression (S2 Table). Most of these miRNAs were studied as crucial miRNAs in the occurrence and development of some diseases. For example, miR-182 and miR-183 are identified as oncogenic miRNAs and contribute to early breast cancer development [46]. Specific miRNAs may show opposite expression patterns between females and males, whereas they may be identified as normally expressed miRNAs using the mixed samples. Thus, some abnormally expressed miR-NAs may be disregarded or balanced in typical analysis using mixed genders where gender may be an important factor in examining miRNA. Disease-associated miRNAs are always selected and identified from these deregulated miRNAs, and inconsistent and abnormal expression profiles reveal that the small RNA expression may be influenced by gender difference. Diverse miRNA expression profiles lead to diverse isomiR profiles, which complicate and interrupt further experimental validation of disease-associated miRNAs.
IsomiR expressions in selected common deregulated miRNA loci, including miR-143 and miR-182, differ across the four groups with different diseases (P < 0.01, Fig 3). Analyses of iso-miRs in normal and tumor samples showed that rates of dominant isomiRs may be similar or diverged among different genders and tissues (from different diseases), thus suggesting that isomiR expressions may diverge between genders and tissues (Fig 3). The analysis also focuses gender difference and tissue difference, which may contribute to various isomiR expressions. After further pairwise comparisons of isomiRs between normal and tumor samples, two miR-NAs may be significantly diverged between some tumor and normal samples (P < 0.05) or have similar expression between tumor and normal samples (P > 0.05, S3 Table). The three rates of dominant isomiRs also have an inconsistent expression. However, the rate of the topdominant isomiRs may be most important mark to estimate the isomiR expression (the top two dominant isomiRs may possess nearly 80% expression in certain miRNA loci). These findings indicate that the selected deregulated miRNA loci may generate inconsistent isomiR expression between genders, normal and tumor samples, and different tissues. Although analysis at the miRNA level indicates that these miRNAs are abnormally expressed in tumor samples, the detailed and real expression at the isomiR levels may vary. The divergence of isomiR expression patterns implies that isomiRs may be diverged between different tissues and genders, which may affect the selection of deregulated miRNAs for further experimental validation. More importantly, in these multiple isomiRs, some isomiRs with varied or shifted "seed sequences" may also have higher enrichment levels. However, the changed functional regions may result in new target mRNAs or binding sites with target mRNAs. These isomiRs may enrich the function in the miRNA locus, and analysis at the isomiR levels may be particularly considered in studies on miRNAs.
To further understand the potential expression divergence between males and females in the same tissues, two common diseases in males and females (LUSC and THCA) were selected to analyze isomiRs in some miRNA loci between males and females. These selected miRNAs were identified as deregulated species. However, not all of these miRNA loci showed diverged isomiR expression across different groups (such as miR-183 and miR-221, Fig 4). Similarly, pairwise comparisons of isomiRs demonstrate that miR-30a and miR-221 diverged between genders; and miR-30a and miR-451 are diverged between normal and tumor samples ( Table 2). These findings suggest that isomiRs from a specific miRNA locus may vary between males and females and between normal and tumor samples in the specific samples. However, typical analysis at the miRNA level could disregard this divergence, particularly when mixed samples from females and males are adopted. The final screening of deregulated miRNAs would not reflect the real expression and change at the isomiR levels. Analysis at the isomiR level can lead to inconsistent expression patterns although these isomiRs are generated from the specific miRNA locus. These multiple isomiRs always have close functional relationships (the same seed sequences or shifted seed sequence) [41]. A comprehensive analysis at the miRNA and isomiR levels is necessary to further reveal the complex small RNAs.
Taken together, this study determined that miRNA and isomiR expressions may diverge between normal and tumor samples and between females and males. These expression divergences suggest that gender may be an important factor in miRNA and isomiR expression. A canonical analysis of miRNA/isomiR without considering gender difference may disregard some deregulated miRNAs, and isomiRs in miRNA loci, which would increase the false positive rate. Candidate disease-associated miRNAs should be screened and examined better based on gender-difference at the isomiR levels.

Author Contributions
Conceived and designed the experiments: LG QZ. Performed the experiments: LG JFY TML. Analyzed the data: LG JFY QZ. Contributed reagents/materials/analysis tools: TML QZ. Wrote the paper: LG.