PBMDA: A novel and effective path-based computational model for miRNA-disease association prediction

  • Zhu-Hong You, 
  • Zhi-An Huang, 
  • Zexuan Zhu, 
  • Gui-Ying Yan, 
  • Zheng-Wei Li, 
  • Zhenkun Wen, 
  • Xing Chen
PLOS

Abstract

In recent years, an increasing number of studies have shown that microRNAs (miRNAs) play critical roles in many fundamental biological processes. However, the molecular mechanisms underlying complex human diseases are still not completely understood from the perspective of miRNAs, which are one class of pathogenic factors. Predicting potential miRNA-disease associations contributes to understanding the pathogenesis of diseases, developing new drugs, and formulating individualized diagnosis and treatment for diverse complex human diseases. Rather than depending only on expensive and time-consuming biological experiments, computational prediction models can predict potential miRNA-disease associations, prioritize candidate miRNAs for the investigated diseases, and select those miRNAs with higher association probabilities for further experimental validation. In this study, the Path-Based MiRNA-Disease Association (PBMDA) prediction model was proposed by integrating known human miRNA-disease associations, miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity for miRNAs and diseases. The model constructs a heterogeneous graph consisting of three interlinked sub-graphs and adopts a depth-first search algorithm to infer potential miRNA-disease associations. PBMDA achieved reliable performance in the frameworks of both local and global LOOCV (AUCs of 0.8341 and 0.9169, respectively) and 5-fold cross validation (average AUC of 0.9172). In the case studies of three important human diseases, 88% (Esophageal Neoplasms), 88% (Kidney Neoplasms) and 90% (Colon Neoplasms) of the top-50 predicted miRNAs were manually confirmed by previously published experimental reports. The comparison between PBMDA and previous models in the case studies further demonstrates that PBMDA could serve as a powerful computational tool to accelerate the identification of disease-miRNA associations.

Author summary

Identifying miRNA-disease associations is considered a key step toward advances in pathology, diagnosis and therapy. Computational prediction models help discover disease-related miRNAs on a large scale. Based on the assumption that functionally related miRNAs tend to be involved in phenotypically similar diseases and vice versa, PBMDA was developed to prioritize potential miRNA-disease associations by applying a depth-first search algorithm to a heterogeneous graph composed of the known miRNA-disease association network, the miRNA similarity network, and the disease similarity network. The promising results obtained through leave-one-out cross validation and 5-fold cross validation demonstrated the effectiveness of the proposed model. We further implemented case studies of three important complex human diseases, in which 88%, 88% and 90% of the top-50 predicted miRNA-disease associations were manually confirmed based on recent experimental reports. It is anticipated that PBMDA could prioritize the most promising miRNA-disease associations on a large scale to advance biological experimental validation, which could further contribute to the understanding of complex disease mechanisms.

Introduction

MicroRNAs (miRNAs) are an abundant class of small (20~25 nucleotides) endogenous non-coding RNAs, which are normally regarded as negative gene regulators that suppress the expression of messenger RNAs (mRNAs) in a sequence-specific manner and repress the protein translation of their target genes [1–4]. Since the first two members of the miRNA family (i.e., lin-4 and let-7) were discovered [5–7], mounting biological observations and studies have indicated that miRNAs play important roles in many biological processes. To date, 2588 miRNAs have been discovered in the human genome [8]. With the advances in molecular biology and biotechnology, miRNAs have been shown to influence many important physiological processes such as cell growth [9], immune reaction [10], cell differentiation [11], cell development [9], cell cycle regulation [12], inflammation [13], cell apoptosis [14], stress response [9,15], and tumor invasion [16]. In addition, emerging evidence implies strong links between miRNAs and diseases. For example, miR-129, miR-142-5p, and miR-25 were found to be differentially expressed in all pediatric brain tumor types [17]. MiR-145 was observed to target insulin receptor substrate-1 and restrain the growth of colon cancer cells [18]. The miR-23/27/24 clusters are involved in angiogenesis and endothelial apoptosis, which has potential therapeutic applications in both vascular disorders and ischemic heart disease [19]. Therefore, the accumulating miRNA-disease associations could be utilized for pathological classification, individualized diagnosis, and disease treatment [20].

Some databases (e.g. HMDD [21] and miR2Disease [22]) have been constructed to provide accumulating insights into the relationship between miRNAs and diseases. To date, HMDD has collected 10368 entries including 572 miRNAs and 378 diseases, which shed some light on the molecular mechanisms of diseases. The existing knowledge of miRNA-disease associations mainly comes from previous biological experiments. However, depending only on traditional biological experiments is time-consuming and costly, and therefore impractical for detecting miRNA-disease associations on a large scale in a short time. To address these challenges, increasing attention has been devoted to developing computational models that predict potential miRNA-disease associations by integrating various experimentally confirmed and heterogeneous datasets [23–29].

For predicting or prioritizing disease-related miRNAs, there have been some computational methods proposed, which are mostly based on the assumption that functionally related miRNAs tend to be involved in phenotypically similar diseases and vice versa. For examples, Jiang et al. [30] constructed functionally related miRNA network and human phenome-microRNAome network and prioritized the potential miRNA-disease associations according to the cumulative hypergeometric distribution. However, this method excessively depended on the predicted miRNA-target associations which include a high rate of false-positive and high false-negative results. Xuan et al. [31] presented the prediction model of HDMP based on the most weighted similar neighbors to predict potential disease-associated miRNAs. This method integrated the information content of disease terms and phenotype similarity between diseases to infer the functional similarity of miRNA pairs. However, it becomes invalid when no known associated miRNA is available for diseases which are less investigated. Mørk et al. [32] developed a protein-driven method called miRPD for inferring miRNA-disease associations by calculating the product of the two scoring functions, which were calculated based on the miRNA-protein and protein-disease associations. These three aforementioned methods only considered miRNA neighbor information in their ranking system, which can be summarized as traditional local network similarity measure-based computational models. Some methods were mainly proposed based on the more effective global network similarity measure rather than traditional local network similarity measures. For example, Chen et al. [33] proposed the computational model of Random Walk with Restart for MiRNA-Disease Association (RWRMDA) to identify novel disease-related miRNAs by implementing random walks on the miRNA-miRNA functional similarity network. 
However, this model cannot work for new diseases without any known related miRNAs. Similarly, Xuan et al. [34] developed a computational model of MIRNAs associated with Diseases Prediction (MIDP) and its extended version, MIDPE, for diseases with known related miRNAs and diseases without any known related miRNAs, respectively. They established transition matrices between the labeled and unlabeled nodes to exploit the prior information of nodes and the different ranges of topologies. Shi et al. [35] also performed random walk analysis to explore the relationships between miRNAs and diseases by calculating the functional associations between miRNA targets and disease genes in protein-protein interaction networks. Later, Chen et al. [36] developed the method of Within and Between Score for MiRNA-Disease Association prediction (WBSMDA) to uncover potential miRNA-disease associations by integrating several heterogeneous biological datasets, which improved on the prediction accuracy of previous classical computational models. By calculating and combining Within-Score and Between-Score, WBSMDA can work for new diseases without any known related miRNAs and new miRNAs without associated diseases. Some machine learning-based models have also demonstrated their power in this field. Based on the assumption that aberrant regulations of target mRNAs occur because their miRNAs are implicated in a specific disease, Xu et al. [37] prioritized novel disease-related miRNAs by constructing the MiRNA Target-Dysregulated Network (MTDN). In addition, they built a support vector machine (SVM) based supervised classifier to distinguish prostate cancer related miRNAs from non-prostate cancer related miRNAs by combining four topological features extracted from MTDN. However, the prediction performance suffered from the lack of verified negative miRNA-disease association samples.
A semi-supervised method called Regularized Least Squares for MiRNA-Disease Association (RLSMDA) was developed by Chen et al. [38]. Considering obtaining verified negative samples (i.e. those disease-miRNA pairs without any known association evidences) is difficult and even impossible, cost functions were defined and minimized in the framework of Regularized Least Squares (RLS). Then the optimal classifiers in the disease and miRNA spaces were yielded and combined to obtain the final predictive results. Chen et al. [39] also developed the model of Restricted Boltzmann Machine (RBM) for multiple types of miRNA-disease association prediction (RBMMMDA) for predicting various types of miRNA-disease associations. Based on the known miRNA-disease association network, RBMs were constructed and trained by Contrastive Divergence (CD) algorithm. Finally, this model implemented prediction by calculating conditional probabilities.

The knowledge on miRNAs provides valuable information for the prevention, diagnosis and treatment of human diseases. There is an urgent need to accelerate the identification of disease-miRNA associations for further studies on pathology as well as drug development. We here proposed a novel Path-Based MiRNA-Disease Association (PBMDA) prediction method by constructing a heterogeneous graph consisting of three interlinked sub-graphs (i.e., miRNA-miRNA similarity network, disease-disease similarity network and known miRNA-disease association network). Integrating different types of heterogeneous biological datasets allows that PBMDA could be applied to the new diseases with no known associated miRNAs and the new miRNAs with no known associated diseases. In addition, the proposed method can simultaneously prioritize all unknown miRNAs for all the investigated diseases.

In this work, three evaluation frameworks, namely global leave-one-out cross validation (global LOOCV), local leave-one-out cross validation (local LOOCV) and 5-fold cross validation (5-fold CV), were implemented to evaluate the prediction performance of PBMDA. When applied to the HMDD database, PBMDA obtained the best performance in the frameworks of both local and global LOOCV (AUCs of 0.8341 and 0.9169, respectively) and 5-fold cross validation (average AUC of 0.9172) compared with several state-of-the-art computational models [31,33,36]. To further evaluate the performance of PBMDA, we implemented case studies of three important human diseases. As a result, most of the top-50 predicted disease-related miRNAs (44/50 for Esophageal Neoplasms; 44/50 for Kidney Neoplasms; 45/50 for Colon Neoplasms) were verified by previously published literature. Besides, 9 of the top-10 predicted obesity-related miRNAs were manually validated based on the published literature. The comparison between PBMDA and previous representative models in the case studies also indicates an improvement in prediction accuracy. A simulation experiment further supports the applicability of our model to a new disease (one with no known associated miRNAs).

Materials and methods

Human miRNA-disease associations

The HMDD database (http://www.cuilab.cn/hmdd) has collected 5430 experimentally verified human miRNA-disease associations (see S1 Table), involving 495 miRNAs and 383 diseases (see S2 and S3 Tables). The adjacency matrix Y is constructed to describe the confirmed associations between miRNAs and diseases: if miRNA m(i) is recorded as associated with disease d(j), the entry Y(i,j) equals 1, and otherwise 0. The numbers of miRNAs and diseases investigated in our study are denoted by nm and nd, respectively. To evaluate the prediction lists in the case studies, another two independent databases (i.e. dbDEMC [40] and miR2Disease [22]) are utilized for validation.
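The construction of Y described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_adjacency` and the toy miRNA and disease names are hypothetical and are not taken from HMDD or from the released PBMDA code.

```python
import numpy as np

def build_adjacency(associations, mirnas, diseases):
    """Return the nm x nd binary matrix Y with Y[i, j] = 1 for known pairs."""
    m_index = {m: i for i, m in enumerate(mirnas)}
    d_index = {d: j for j, d in enumerate(diseases)}
    Y = np.zeros((len(mirnas), len(diseases)), dtype=int)
    for m, d in associations:
        Y[m_index[m], d_index[d]] = 1
    return Y

# Toy data (hypothetical, not taken from HMDD).
mirnas = ["miR-a", "miR-b", "miR-c"]
diseases = ["disease-x", "disease-y"]
pairs = [("miR-a", "disease-x"), ("miR-c", "disease-y"), ("miR-a", "disease-y")]
Y = build_adjacency(pairs, mirnas, diseases)
```

Rows of Y correspond to miRNAs and columns to diseases, so a column of Y is the interaction profile of one disease.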

MiRNA functional similarity

According to previous literature [41], miRNAs with similar functions are more likely to be associated with similar diseases. Under this assumption, miRNA functional similarity scores were calculated (http://www.cuilab.cn/files/images/cuilab/misim.zip). We utilized these data to construct the symmetric miRNA functional similarity matrix FS (see S4 Table), in which the entry FS(m(i),m(j)) indicates how functionally similar miRNA m(i) is to miRNA m(j).

Disease semantic similarity

The MeSH database (http://www.ncbi.nlm.nih.gov/), a strict system for disease classification, supports the study of relationships between different diseases. Each disease D can be represented as a Directed Acyclic Graph (DAG), DAG(D) = (T(D),E(D)), where T(D) is the node set including node D and its ancestor nodes, and E(D) is the set of direct links from a parent node to a child node, which represent the relationships between different diseases [41]. Based on the disease DAG, the contribution D_D(d) of disease term d to the semantic value of disease D, and the semantic value DV(D) of disease D itself, can be formulated by the following two equations, respectively:

$$D_D(d) = \begin{cases} 1 & \text{if } d = D \\ \max\{\Delta \cdot D_D(d') \mid d' \in \text{children of } d\} & \text{if } d \neq D \end{cases} \tag{1}$$

$$DV(D) = \sum_{d \in T(D)} D_D(d) \tag{2}$$

where Δ is the semantic contribution decay factor: as the distance between disease D and an ancestor disease increases, the ancestor's contribution to the semantic value of disease D progressively decreases. Disease D is located in the 0th layer, so its contribution to its own semantic value is defined as 1, and the contribution of each ancestor disease is multiplied by the decay factor. Therefore, Δ should take a value between 0 and 1; it was set to 0.5 here according to previous literature [42,43]. Under this measure, two diseases that share a larger part of their DAGs should obtain a higher semantic similarity. Accordingly, the semantic similarity between two diseases d(i) and d(j) can be calculated as:

$$SS(d(i),d(j)) = \frac{\sum_{t \in T(d(i)) \cap T(d(j))} \big(D_{d(i)}(t) + D_{d(j)}(t)\big)}{DV(d(i)) + DV(d(j))} \tag{3}$$

where the entry SS(d(i),d(j)) in row i and column j represents the disease semantic similarity between d(i) and d(j).
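The semantic similarity computation can be illustrated with a minimal Python sketch. It assumes that an ancestor's contribution equals the decay factor raised to its shortest distance from D (the maximum over all downward paths), and that the similarity is the sum of shared contributions divided by the sum of both semantic values. The toy DAG, the `parents` mapping and the function names are hypothetical.

```python
from collections import deque

DELTA = 0.5  # semantic contribution decay factor used in the text

def semantic_contributions(disease, parents, delta=DELTA):
    """Map each term in T(disease) to its contribution to `disease`."""
    contrib = {disease: 1.0}
    queue = deque([disease])
    while queue:
        term = queue.popleft()
        for parent in parents.get(term, ()):
            # BFS reaches each ancestor first along its shortest path, which
            # gives the largest contribution because 0 < delta < 1.
            if parent not in contrib:
                contrib[parent] = delta * contrib[term]
                queue.append(parent)
    return contrib

def semantic_similarity(d_i, d_j, parents, delta=DELTA):
    """Shared contributions divided by the sum of both semantic values."""
    c_i = semantic_contributions(d_i, parents, delta)
    c_j = semantic_contributions(d_j, parents, delta)
    shared = c_i.keys() & c_j.keys()
    numerator = sum(c_i[t] + c_j[t] for t in shared)
    return numerator / (sum(c_i.values()) + sum(c_j.values()))

# Toy DAG (hypothetical): A and B share parent P, whose parent is R.
parents = {"A": ["P"], "B": ["P"], "P": ["R"]}
sim_ab = semantic_similarity("A", "B", parents)
sim_aa = semantic_similarity("A", "A", parents)
```

In this toy DAG each of A and B has semantic value 1 + 0.5 + 0.25 = 1.75, the shared terms P and R contribute 1.5 in total, and the similarity is 1.5 / 3.5.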

Gaussian interaction profile kernel similarity for diseases

According to the basic assumption that functionally similar miRNAs tend to be associated with similar diseases, the topological information of the known miRNA-disease association network can be used to measure disease similarity. We therefore introduce the Gaussian interaction profile kernel to calculate the network topological similarity between diseases. The binary vector IP(d(i)), i.e. the ith column of matrix Y, is the interaction profile of disease d(i) and represents the associations between d(i) and each miRNA. We then utilized Eq (4) to compute the Gaussian kernel similarity between disease d(i) and disease d(j) based on their interaction profiles:

$$KD(d(i),d(j)) = \exp\big(-\gamma_d \lVert IP(d(i)) - IP(d(j)) \rVert^2\big) \tag{4}$$

where γd is a parameter that regulates the kernel bandwidth. The symmetric matrix KD represents the Gaussian interaction profile kernel similarity for all investigated diseases. Parameter γd is obtained by dividing a new bandwidth parameter γ'd by the average number of associations with miRNAs over all diseases. Following previous successful research on lncRNA-disease association prediction [44], γ'd is set to 1. So γd can be formulated as:

$$\gamma_d = \frac{\gamma'_d}{\frac{1}{nd} \sum_{i=1}^{nd} \lVert IP(d(i)) \rVert^2} \tag{5}$$
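A minimal NumPy sketch of the Gaussian interaction profile kernel follows. It assumes the standard form of the kernel, exp(-γ‖IP(a) - IP(b)‖²), with γ equal to γ' divided by the mean squared norm of the profiles. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def gaussian_ip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    # Normalize the bandwidth by the average number of associations per profile.
    gamma = gamma_prime / sq_norms.mean()
    # Squared Euclidean distance between every pair of rows.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.clip(sq_dist, 0.0, None))

# Toy association matrix (hypothetical): 4 miRNAs x 3 diseases.
Y = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])
KD = gaussian_ip_kernel(Y.T)  # disease profiles are the columns of Y
KM = gaussian_ip_kernel(Y)    # miRNA profiles are the rows of Y
```

The same function yields the miRNA kernel when applied to the rows of Y instead of the columns. Note that the sketch does not guard against an all-zero association matrix, for which the bandwidth is undefined.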

Gaussian interaction profile kernel similarity for miRNAs

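Similarly, we calculated the Gaussian interaction profile kernel similarity for miRNAs by Eqs (6) and (7), where IP(m(i)) is the ith row of matrix Y:

$$KM(m(i),m(j)) = \exp\big(-\gamma_m \lVert IP(m(i)) - IP(m(j)) \rVert^2\big) \tag{6}$$

$$\gamma_m = \frac{\gamma'_m}{\frac{1}{nm} \sum_{i=1}^{nm} \lVert IP(m(i)) \rVert^2} \tag{7}$$

where γ'm = 1 and KM is a symmetric matrix whose entry KM(m(i),m(j)) denotes the Gaussian interaction profile kernel similarity between miRNA m(i) and miRNA m(j).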

Integrated similarity for miRNA and disease

MiRNA functional similarity and disease semantic similarity are the primary data for constructing the miRNA and disease similarity matrices. However, these matrices are sparse, so we used the Gaussian interaction profile kernel similarity, calculated from the known miRNA-disease associations, for those miRNA-miRNA or disease-disease pairs without corresponding functional or semantic similarity. To construct the two integrated similarity matrices (i.e., miRNA similarity matrix Sm and disease similarity matrix Sd), we combined miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity for miRNAs and diseases according to whether miRNA m(i)/disease d(i) has functional/semantic similarity with another miRNA m(j)/disease d(j):

$$S_m(m(i),m(j)) = \begin{cases} FS(m(i),m(j)) & \text{if } m(i) \text{ and } m(j) \text{ have functional similarity} \\ KM(m(i),m(j)) & \text{otherwise} \end{cases} \tag{8}$$

$$S_d(d(i),d(j)) = \begin{cases} SS(d(i),d(j)) & \text{if } d(i) \text{ and } d(j) \text{ have semantic similarity} \\ KD(d(i),d(j)) & \text{otherwise} \end{cases} \tag{9}$$
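The integration rule can be sketched with `numpy.where`: take the primary similarity (functional or semantic) where it is available and fall back to the Gaussian kernel similarity elsewhere. How "available" is encoded is an assumption here: the sketch treats a zero entry as a missing value, and all matrix values are hypothetical.

```python
import numpy as np

def integrate_similarity(primary, kernel, has_primary):
    """Use `primary` where it is available and the Gaussian kernel elsewhere."""
    return np.where(has_primary, primary, kernel)

# Toy 3-miRNA example (all values hypothetical).
FS = np.array([[1.0, 0.6, 0.0],
               [0.6, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
KM = np.array([[1.0, 0.4, 0.2],
               [0.4, 1.0, 0.7],
               [0.2, 0.7, 1.0]])
# Assumption: a zero entry in FS marks a pair with no functional similarity.
has_fs = FS > 0
Sm = integrate_similarity(FS, KM, has_fs)
```

The disease matrix Sd would be obtained the same way from SS and KD.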

PBMDA

To prioritize the most probable miRNA-disease associations, we devised a novel Path-Based MiRNA-Disease Association (PBMDA) prediction method (see Fig 1). To eliminate node pairs with weak interactions, we set the threshold T to 0.5 based on previous literature [45], which means that a link was not taken into consideration if the similarity between its two nodes was less than 0.5. The three weighted matrices can then be represented as:

$$W_{mm}(i,j) = \begin{cases} S_m(m(i),m(j)) & \text{if } S_m(m(i),m(j)) \geq T \\ 0 & \text{otherwise} \end{cases} \qquad W_{dd}(i,j) = \begin{cases} S_d(d(i),d(j)) & \text{if } S_d(d(i),d(j)) \geq T \\ 0 & \text{otherwise} \end{cases} \qquad W_{md}(i,j) = \begin{cases} 1 & \text{if } Y(i,j) = 1 \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

Fig 1. Flowchart of PBMDA.

Three networks were integrated to construct a heterogeneous graph and the potential miRNA-disease associations could be effectively inferred by the scoring system.

https://doi.org/10.1371/journal.pcbi.1005455.g001

In this way, we constructed a heterogeneous graph from these three weighted matrices. A path is defined as a connection between a miRNA and a disease, and every path must be acyclic so that visited nodes are not traversed repeatedly. A depth-first search algorithm, which is easy to implement recursively, was adopted to traverse all paths in the graph. To save time, we set a parameter L to limit the maximum path length. According to previous literature [45] and comparison experiments with values from 2 to 4, we found it suitable to set L = 3, i.e. a path may not include more than three edges. Based on the assumption that a miRNA and a disease are more likely to be associated if more paths connect them, the accumulated contributions from all paths between a miRNA-disease pair are integrated into a final score. Accordingly, the scoring formula is defined as Eq (11) with the exponential decay function Fdecay(p), which depends on the path between a specific miRNA mi and a specific disease dj:

$$score(m_i, d_j) = \sum_{w=1}^{n} \Big(\prod p_w\Big)^{F_{decay}(p_w)} \tag{11}$$

where p = {p1,p2,…,pn} is the set of paths linking miRNA mi and disease dj, and ∏pw represents the product of the weights of all the edges in path pw obtained from Eq (10).

Generally, longer paths between a miRNA and a disease provide less confidence about their relationship, i.e. the contributions from longer paths should be cut down more sharply. So the decay function Fdecay(p) can be calculated as follows:

$$F_{decay}(p) = \alpha \cdot len(p) \tag{12}$$

where parameter α is a decay factor, which was set to 2.26 according to previous literature [45], and len(p) is the length of path p. After traversing all paths in the graph, each miRNA-disease pair obtains a final score representing the association confidence between this miRNA and disease, i.e. the higher the score, the more closely related they should be. As an example in Fig 1, the score of miRNA1 (m1) and disease1 (d1) is calculated as:

$$\underbrace{1.0^{2.26 \times 1}}_{\langle m_1 d_1 \rangle} + \underbrace{(1.0 \times 0.7)^{2.26 \times 2}}_{\langle m_1 d_2 d_1 \rangle} + \underbrace{(0.8 \times 1.0)^{2.26 \times 2}}_{\langle m_1 m_3 d_1 \rangle} + \underbrace{(0.7 \times 1.0 \times 0.9)^{2.26 \times 3}}_{\langle m_1 m_2 d_3 d_1 \rangle} \approx 1.6078$$

The edge between d4 and d1 is not taken into consideration, because its weight is less than the threshold T. The code and data of PBMDA are freely available at http://www.escience.cn/system/file?fileId=84394.
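The worked example can be reproduced with a short recursive depth-first search. This is a minimal sketch, not the released PBMDA code. The graph contains only the edges named in the worked example (Fig 1 may contain others), the weight 0.3 for the d4–d1 edge is a hypothetical value below T, and the names `build_graph` and `path_score` are illustrative.

```python
ALPHA = 2.26     # decay factor
MAX_LEN = 3      # maximum path length L
THRESHOLD = 0.5  # weight threshold T

def build_graph(edges, threshold=THRESHOLD):
    """Undirected weighted graph that keeps only edges with weight >= threshold."""
    graph = {}
    for u, v, w in edges:
        if w >= threshold:
            graph.setdefault(u, {})[v] = w
            graph.setdefault(v, {})[u] = w
    return graph

def path_score(graph, source, target, max_len=MAX_LEN, alpha=ALPHA):
    """Sum (product of edge weights) ** (alpha * length) over acyclic paths."""
    total = 0.0

    def dfs(node, visited, product, length):
        nonlocal total
        if node == target:
            total += product ** (alpha * length)
            return
        if length == max_len:
            return
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                dfs(neighbor, visited | {neighbor}, product * weight, length + 1)

    dfs(source, {source}, 1.0, 0)
    return total

edges = [
    ("m1", "d1", 1.0), ("m1", "d2", 1.0), ("d2", "d1", 0.7),
    ("m1", "m3", 0.8), ("m3", "d1", 1.0), ("m1", "m2", 0.7),
    ("m2", "d3", 1.0), ("d3", "d1", 0.9),
    ("d4", "d1", 0.3),  # hypothetical weight below T, so the edge is dropped
]
score = path_score(build_graph(edges), "m1", "d1")
print(round(score, 4))
```

Passing a fresh `visited` set down each branch keeps paths acyclic without explicit backtracking, and the search stops extending a path once it has `max_len` edges.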

LOOCV and 5-fold CV

To evaluate the predictive performance of PBMDA, we implemented LOOCV and 5-fold CV based on the known miRNA-disease associations downloaded from the HMDD database [21]. LOOCV can be divided into two evaluation frameworks according to the ranking scope: global LOOCV considers all investigated diseases, while local LOOCV only includes a given disease. Both follow the common framework of LOOCV, i.e. each known miRNA-disease association is left out in turn to serve as the test sample, and the other known miRNA-disease associations are regarded as training samples. The test sample is ranked among the candidate miRNA-disease associations without any known association evidence. Test samples ranked higher than a specific threshold are considered successful predictions. In the framework of 5-fold CV, all known verified miRNA-disease associations were randomly divided into five disjoint groups, each of which was used in turn as the testing samples while the other four were used for training. We repeated the random division of all known verified miRNA-disease associations 100 times to reduce the bias brought by sample divisions. Receiver operating characteristic (ROC) curves were drawn for performance evaluation by calculating the true positive rate (TPR, sensitivity) and false positive rate (FPR, 1-specificity) at varying thresholds. Sensitivity is the percentage of positive test samples ranked higher than the given threshold; specificity is the percentage of candidate samples ranked lower than the given threshold. The areas under the ROC curves (AUCs) were also calculated for a numerical evaluation of model performance. AUC = 0.5 denotes a purely random prediction while AUC = 1 denotes a perfect prediction. As a result, PBMDA obtained AUCs of 0.9169 and 0.8341 in the frameworks of global and local LOOCV, respectively. Furthermore, the average and the standard deviation of AUC in the framework of 5-fold CV are 0.9172 and 0.0007, respectively. These results suggest that PBMDA could serve as an effective and robust computational prediction model.
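As an illustration of the evaluation metric, the AUC can be computed directly from scores, because the area under the curve of TPR against FPR equals the fraction of (test sample, candidate sample) pairs in which the test sample is ranked higher, with ties counted as one half. The sketch below assumes the scores of left-out associations and of candidate pairs have already been pooled into two lists. The function name and toy scores are hypothetical, and the exact per-fold ranking procedure used in the paper may differ.

```python
def roc_auc(test_scores, candidate_scores):
    """AUC as the fraction of (test, candidate) pairs ranked correctly."""
    wins = 0.0
    for t in test_scores:
        for c in candidate_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(test_scores) * len(candidate_scores))

# Toy scores (hypothetical): two left-out known associations, four candidates.
test_scores = [0.9, 0.4]
candidate_scores = [0.8, 0.3, 0.1, 0.4]
auc = roc_auc(test_scores, candidate_scores)
```

The pairwise loop is quadratic in the number of samples; it is meant to make the definition explicit, not to be efficient on the full HMDD data.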

Results

Performance comparison with other methods

We further compared the prediction performance of PBMDA with four state-of-the-art computational prediction models (i.e., WBSMDA [36], RLSMDA [38], HDMP [31] and RWRMDA [33]). RWRMDA and HDMP are representative methods in this domain and were often chosen as benchmarks to validate later methods, such as MIDP [34] and Shi's method [35]. RLSMDA is a semi-supervised learning method based on the framework of Regularized Least Squares (RLS) and represents a good attempt at applying machine learning. WBSMDA is a recently published method representing the current level of computational prediction models in this domain. The performance comparisons in the frameworks of global and local LOOCV are shown in Fig 2. PBMDA, WBSMDA, RLSMDA and HDMP achieved AUCs of 0.9169, 0.8030, 0.8426 and 0.8366 in the framework of global LOOCV, respectively. In local LOOCV, PBMDA also obtained the best prediction performance with an AUC of 0.8341, while the other methods (WBSMDA, RWRMDA, HDMP, and RLSMDA) obtained AUCs of 0.8031, 0.7891, 0.7702 and 0.6953, respectively. In addition, 5-fold CV was implemented on PBMDA, WBSMDA, RLSMDA and HDMP, giving average AUCs of 0.9172+/-0.0007, 0.8185+/-0.0009, 0.8569+/-0.0020 and 0.8342+/-0.0010, respectively, so PBMDA also obtained the best performance in 5-fold CV. In conclusion, PBMDA improves on the prediction performance of previous computational models and shows reliable and robust performance across these evaluation frameworks.

Fig 2. The comparison results between PBMDA and other four computational models in terms of global LOOCV and local LOOCV.

https://doi.org/10.1371/journal.pcbi.1005455.g002

Effects of parameters

For simplicity, the maximum path length L and weight threshold T were set to 3 and 0.5, respectively, for the prediction. It may be possible to obtain better prediction performance by adjusting these parameter values, so 5-fold CV was repeated 100 times for further evaluation. Given T = 0.5, the average AUC equals 0.7726+/-0.0016 when L = 2 and 0.9172+/-0.0007 when L = 3. It seems that the performance of our model is restrained when the path length is too short. Because the program took too long to run, we did not complete the experiment for L = 4; the computational cost appears to grow exponentially as L increases. However, we also repeated 5-fold CV 100 times on a smaller dataset (HMDD v1.0, 1395 known miRNA-disease associations involving 271 miRNAs and 137 diseases) to test performance with increasing L (L = 2~4) and fixed T (T = 0.5). As shown in Table 1, the prediction accuracy of the proposed model decreases with increasing L on this smaller dataset. We assume that increasing L tends to cause overfitting for a small network dataset. Therefore, increasing the path length L does not necessarily improve prediction accuracy, and the size of the network dataset should be taken into consideration when selecting L. It seems that L should generally be increased as the network dataset becomes larger, but it should not be set too high, in order to avoid overfitting. Considering the reliable performance obtained with a maximum path length of three (i.e., L = 3), there is no need to select L = 4 in this study. This setting also reduces the run-time needed to traverse all possible paths.

Table 1. Five-fold CV experiment results of changing a parameter L when T = 0.5 based on a smaller dataset (HMDD v1.0).

https://doi.org/10.1371/journal.pcbi.1005455.t001

We also implemented a series of 5-fold CV experiments with increasing T values to find the optimal setting (see Table 2). The results show that the choice of T does not greatly affect the accuracy of PBMDA, which demonstrates strong robustness to this parameter. We therefore set T = 0.5 as a reasonable trade-off.

Table 2. Five-fold CV experiment results of changing a parameter T when L = 3.

https://doi.org/10.1371/journal.pcbi.1005455.t002

Case studies

Many top-ranked miRNAs were predicted to have associations with diseases of the digestive system and urinary system. It seems that miRNA expression is closely related to the dysfunction of both systems, so we attempted to explore their potential relationships and the role that miRNAs play in the disease mechanisms of the digestive and urinary systems. The esophagus and colon belong to the digestive system, while the kidney belongs to the urinary system. Therefore, to further evaluate the prediction performance of PBMDA, Esophageal Neoplasms, Kidney Neoplasms and Colon Neoplasms were investigated to infer their underlying associated miRNAs. Two independent databases (i.e., dbDEMC [40] and miR2Disease [22]) were used as benchmark datasets to verify the predictive results. The quantitative statistics demonstrate the reasonableness of this benchmarking method (see Table 3). Besides, miR2Disease and dbDEMC are commonly used as benchmark datasets in this domain, for example by the HDMP and MIDP models. It is worth noting that the predicted miRNA-disease associations, including the highly ranked disease-related miRNAs listed in Tables 4–7, were not included in the HMDD database. For the sake of space, this article only discusses these three cancers, although our model can make successful predictions for almost all of the important diseases considered. We also present the verification of the top-50 prediction lists for important diseases investigated by previous computational models, such as prostate neoplasms, breast neoplasms and lung neoplasms in HDMP (see Table 8 and S5 Table). Although the HMDD database includes plenty of known miRNA-disease associations, many other miRNA-disease associations exist in the two independent benchmark databases that do not overlap with HMDD. We therefore used the known miRNA-disease associations (included in HMDD) to prioritize novel miRNA-disease associations (not included in HMDD), and evaluated the prediction performance of PBMDA by counting how many of these novel associations were matched in the two independent benchmark databases.

Table 3. The quantitative statistics between test dataset (HMDD) and benchmark datasets (miR2Disease and dbDEMC).

https://doi.org/10.1371/journal.pcbi.1005455.t003

Table 4. PBMDA was applied to Esophageal Neoplasms to predict the potential disease-related miRNAs, and 44 of top-50 predicted miRNAs have been confirmed according to recent experimental literatures.

https://doi.org/10.1371/journal.pcbi.1005455.t004

Table 5. PBMDA was applied to Kidney Neoplasms to predict the potential disease-related miRNAs, and 44 of top-50 predicted miRNAs have been confirmed according to recent experimental literatures.

https://doi.org/10.1371/journal.pcbi.1005455.t005

Table 6. PBMDA was applied to Colon Neoplasms to predict the potential disease-related miRNAs, and 45 of top-50 predicted miRNAs have been confirmed according to recent experimental literatures.

https://doi.org/10.1371/journal.pcbi.1005455.t006

Table 7. Nine out of top-10 predicted obesity-related miRNAs have been manually validated by the published literatures.

https://doi.org/10.1371/journal.pcbi.1005455.t007

Table 8. Verification of top-50 prediction list for several important human diseases.

https://doi.org/10.1371/journal.pcbi.1005455.t008

Esophageal Neoplasms is one of the most common digestive carcinomas and has a poor prognosis. As the tumor grows, patients may develop symptoms such as difficult or painful swallowing, weight loss and coughing up blood. Cisplatin-based chemotherapy is the main approach for the treatment of Esophageal Neoplasms, but the chemotherapy response is difficult to detect. Some studies suggest that miRNAs could be considered effective prognostic biomarkers for Esophageal Neoplasms. For example, hsa-let-7 can be considered a prognostic biomarker for measuring the response to chemotherapy. In addition, when the disease recurs, patients have relatively higher expression of mature hsa-miR-143 and mature hsa-miR-145 than healthy people. A case study of Esophageal Neoplasms was implemented with PBMDA to yield the most probable related miRNAs (see Table 4). As a result, 9 of the top-10 and 44 of the top-50 candidates were confirmed to have associations with Esophageal Neoplasms based on previous experimental literature. For example, overexpression of the hsa-miR-17 (1st in the prediction list) cluster can accelerate cellular growth in Esophageal Neoplasms [46]. Previous research showed that hsa-miR-125b (2nd in the prediction list) can promote cell proliferation in Esophageal Neoplasms by influencing the target transcripts CYP24, ERBB2 and ERBB3 [47]. Moreover, hsa-miR-221 (3rd in the prediction list) can be regarded as a useful diagnostic marker for measuring the sensitivity to the treatment of Esophageal Neoplasms [48].

Kidney Neoplasms is the tumor type with the most rapidly increasing incidence rate, especially among black persons. More than 80 percent of patients are found to have renal-cell carcinoma (RCC). Recent studies found that patients with RCC usually show overexpression of miR-34a, which plays a critical role in slowing the growth of RCC. In addition, miR-141 and miR-200c were reported to be the most down-regulated miRNAs in RCC and to target ZEB2, a transcriptional repressor [49,50]. To identify potential disease-miRNA associations, we carried out a case study of Kidney Neoplasms. Existing experimental literature confirmed that 9 of the top-10 and 44 of the top-50 potential miRNA candidates are associated with this important human disease (see Table 5). For example, miR-155 (1st in the prediction list), miR-126 (4th in the prediction list) and miR-20a (5th in the prediction list) were found to be upregulated in clear-cell renal cell carcinoma (ccRCC) relative to normal kidney samples [51,52]. Furthermore, overexpressed miR-145 (2nd in the prediction list) and miR-146a (3rd in the prediction list) were found to suppress the mRNA and protein expression of their targets in the STAT-1 pathway in kidney tissues [53].

Colon Neoplasms remains the second leading cause of cancer-related death in the United States [54]. Although chemotherapy has important therapeutic value, surgery is still the only curative treatment for Colon Neoplasms. There is an urgent need to find potential biomarkers that correspond closely to clinical observations. Using in situ hybridization, researchers have confirmed that miR-21 is highly expressed in colonic carcinoma cells [55]. Moreover, let-7 functions as a potential growth suppressor in human colon cancer tumors and cell lines [56]. Identifying more miRNAs associated with Colon Neoplasms helps to evaluate clinical outcomes accurately. Therefore, we carried out a case study of Colon Neoplasms with PBMDA. In the prediction list, 9 of the top-10 and 45 of the top-50 predicted miRNAs were confirmed to be associated with Colon Neoplasms by recent experimental literature (see Table 6). For example, preclinical research showed that the expression of miR-21 (1st in the prediction list) is related to clinicopathologic features of colorectal cancer [57]. Experimental studies also found that miR-20a (2nd in the prediction list) shows significantly higher expression in colon cancer tissues than in normal tissues [58]. MiR-18a (3rd in the prediction list) is considered a colon tumor suppressor that targets K-Ras (mRNA) to influence cell proliferation and anchorage-independent growth [59]. In addition, miR-34a (6th in the prediction list) has considerable potential as a diagnostic and prognostic biomarker based on its expression at different stages of Colon Neoplasms [60].

Because most cancers are characterized by some degree of genetic and genomic modification, their association with miRNA dysregulation is perhaps an obvious form of validation. This prompted us to ask whether PBMDA achieves similar effectiveness for another disease type, such as obesity. Because there is only one entry for a miRNA-obesity association in HMDD and in the two independent benchmark databases (i.e., dbDEMC [40] and miR2Disease [22]), we manually validated the top-10 predicted obesity-related miRNAs against the published literature. As shown in Table 7, 9 of the top-10 predicted miRNAs have been demonstrated to be associated with obesity, which indicates that PBMDA also works effectively for diseases other than cancer.

Because only WBSMDA and PBMDA used the latest version of HMDD for prediction, we compared the two models by counting how many of the top-50 predicted miRNA-disease associations were confirmed by the dbDEMC and miR2Disease databases for these three important diseases. The validation results are compared in Table 9. As the table shows, PBMDA generally performs better than WBSMDA.

Table 9. Performance comparison between PBMDA and WBSMDA in case studies of three important diseases in top-50 prediction list based on the latest version of HMDD (v2.0).

https://doi.org/10.1371/journal.pcbi.1005455.t009
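The counting step behind such a comparison can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `count_confirmed`, the miRNA identifiers and the contents of the two evidence sets are hypothetical, and a prediction is treated as confirmed if it appears in at least one of the supplied databases.

```python
def count_confirmed(ranked_mirnas, evidence_sets, k=50):
    """Count how many of the top-k ranked miRNAs occur in at least one evidence set."""
    confirmed = set().union(*evidence_sets)
    return sum(1 for mirna in ranked_mirnas[:k] if mirna in confirmed)


# Hypothetical toy data; real lists would come from a model's ranking for one
# disease and from the dbDEMC and miR2Disease records for that disease.
ranked = ["hsa-mir-21", "hsa-mir-20a", "hsa-mir-18a", "hsa-mir-999"]
dbdemc = {"hsa-mir-21", "hsa-mir-18a"}
mir2disease = {"hsa-mir-20a"}

print(count_confirmed(ranked, [dbdemc, mir2disease], k=4))
```

Applying the same function to the rankings of two models, with the same evidence sets and the same k, gives directly comparable counts.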

In addition, we ran PBMDA on the older version of HMDD (v1.0) to compare it with three other computational models, i.e. HDMP, RWRMDA and RLSMDA. For a fair comparison, we counted only those top-50 predicted associations verified by the three benchmark databases (i.e. HMDD v2.0, dbDEMC and miR2Disease); predicted associations verified only by additional literature were not included. Moreover, we could compare PBMDA with the other models only for diseases whose verification results were published in the corresponding articles, e.g. prostatic, breast and lung neoplasms for HDMP. Based on these comparisons (see Table 10), our model generally improves prediction performance for the selected diseases relative to the other computational models.

Table 10. We ran the PBMDA model on the older version of HMDD (v1.0) for further comparison between PBMDA and three other representative computational models, i.e. HDMP, RWRMDA and RLSMDA.

For these given diseases, their top-50 prediction lists have been verified by three benchmark databases (i.e. HMDD v2.0, dbDEMC and miR2Disease).

https://doi.org/10.1371/journal.pcbi.1005455.t010

To test the applicability of our model to a new disease (one with no known associated miRNAs), we selected Glioblastoma for further verification. We removed all records about Glioblastoma from the known miRNA-disease association network derived from HMDD, so that Glioblastoma could be regarded as a new disease. We then ran our model to obtain predictions for Glioblastoma and, as before, verified the top-50 predicted miRNA-disease associations against HMDD, dbDEMC and miR2Disease (see Table 11). Fifteen of the top-20 and 37 of the top-50 predicted miRNAs were verified to be associated with Glioblastoma by these databases. These results indicate that PBMDA can still achieve reliable prediction performance for a new disease and is therefore applicable to diseases without any known associated miRNAs.

Table 11. We removed all records about Glioblastoma from the known miRNA-disease association network derived from HMDD and implemented our model for prediction.

Fifteen out of top-20 and 37 out of top-50 predicted miRNAs have been verified to be associated with Glioblastoma by HMDD, miR2Disease and dbDEMC databases.

https://doi.org/10.1371/journal.pcbi.1005455.t011
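The masking step used to simulate a new disease can be sketched as below. This is an assumed, simplified data layout (a set of (miRNA, disease) pairs) with hypothetical records and a made-up ranking standing in for the model output; it only illustrates how the known associations of one disease are hidden before prediction and consulted afterwards.

```python
def hide_disease(known_pairs, disease):
    """Split known (miRNA, disease) pairs into a training set without `disease`
    and the set of miRNAs whose association with it was hidden."""
    training = {(m, d) for (m, d) in known_pairs if d != disease}
    hidden = {m for (m, d) in known_pairs if d == disease}
    return training, hidden


# Hypothetical toy records; real input would be the HMDD association list.
known = {
    ("hsa-mir-21", "Glioblastoma"),
    ("hsa-mir-10b", "Glioblastoma"),
    ("hsa-mir-21", "Colon Neoplasms"),
    ("hsa-mir-145", "Colon Neoplasms"),
}
training, hidden = hide_disease(known, "Glioblastoma")

# A made-up ranking standing in for predictions made on the masked network.
ranking = ["hsa-mir-21", "hsa-mir-145", "hsa-mir-10b"]
recovered_in_top2 = sum(1 for m in ranking[:2] if m in hidden)
print(len(training), sorted(hidden), recovered_in_top2)
```

Because the training set no longer contains any pair for the masked disease, any score it receives has to come from its similarity to other diseases.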

As a global computational model, PBMDA was also implemented to simultaneously prioritize the potential miRNAs for all investigated diseases. Owing to the limited prior knowledge, some promising disease-related miRNAs have not been validated yet. We therefore listed the top-100 potential associations in S6 Table.

Discussions

A large body of research has shown that miRNAs play increasingly significant roles in many physiological processes and in complex human diseases. Researchers attempt to identify disease-related miRNAs as valuable biomarkers for clinical measurement, diagnosis, prognosis and treatment. Verification by biological experiments is both time-consuming and expensive, which has driven the development of computational predictive models. Here we proposed a novel Path-Based MiRNA-Disease Association (PBMDA) computational prediction model that integrates heterogeneous biological networks. PBMDA constructs a heterogeneous graph from three kinds of connections: miRNA-miRNA similarity, disease-disease similarity and known miRNA-disease associations. MiRNA-miRNA similarity and disease-disease similarity are inferred from the Gaussian interaction profile kernel similarity for miRNAs and diseases, the miRNA functional similarity network, and the disease semantic similarity network. Compared with four state-of-the-art computational models, PBMDA achieved the highest AUCs of 0.9169, 0.8341 and 0.9172±0.0007 in the evaluation frameworks of global LOOCV, local LOOCV and 5-fold CV, respectively, demonstrating the most reliable prediction performance. In the case studies of three important human complex diseases, 44, 44 and 45 of the top-50 predicted miRNAs for Esophageal Neoplasms, Kidney Neoplasms and Colon Neoplasms, respectively, were supported by previous experimental literature. Manual validation against the published literature showed that 9 of the top-10 predicted obesity-related miRNAs are associated with obesity. Furthermore, given its performance relative to previous models in the case studies, PBMDA is expected to significantly accelerate the identification of miRNA-disease associations. We plan to provide a standalone tool or web server for users in the future; this study first proposes the computational model as the basis for that next step.

Several major factors contribute to the high prediction performance of PBMDA. First, reliable biological datasets were used to establish an integrated similarity network that represents three relationships (i.e., miRNA-miRNA similarity, disease-disease similarity, and miRNA-disease associations). Second, as a path-based model, PBMDA can effectively take advantage of the topological information contained in the integrated heterogeneous network. Third, PBMDA can be applied to new diseases (no known associated miRNAs) and new miRNAs (no known associated diseases), which greatly improves its practicability and reliability. Using the disease semantic similarity and the miRNA functional similarity, we can construct the disease-disease and miRNA-miRNA similarity networks. The depth-first search algorithm can then assign scores to paths such as disease_new↔disease↔miRNA and miRNA_new↔miRNA↔disease. In this way, unverified disease-miRNA associations involving new diseases and/or new miRNAs can also be prioritized based on their aggregated scores. Fourth, PBMDA can easily incorporate other biological information (e.g. various miRNA-related interactions and disease phenotypic similarity [61,62]) to further improve the quality of the integrated heterogeneous network. Last but not least, PBMDA can simultaneously prioritize candidate miRNAs for all investigated diseases.
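The path-based scoring of a new disease can be illustrated with a small sketch. It is not the authors' implementation: the adjacency-dictionary layout, the node names, the edge weights, the maximum path length of 3, and in particular the decay form (the product of edge weights raised to alpha times the path length) with alpha = 2.26 are assumptions made for illustration. The sketch enumerates simple paths by depth-first search and sums their decayed scores.

```python
def path_score(graph, source, target, max_len=3, alpha=2.26):
    """Sum decayed edge-weight products over simple paths from source to target.

    `graph` maps each node to a dict {neighbor: weight} with weights in (0, 1].
    The decay form (product ** (alpha * length)) is an assumed choice.
    """
    total = 0.0

    def dfs(node, visited, product, length):
        nonlocal total
        if node == target and length > 0:
            total += product ** (alpha * length)  # longer paths are penalized
            return
        if length == max_len:
            return  # do not extend paths beyond the length limit
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                dfs(neighbor, visited | {neighbor}, product * weight, length + 1)

    dfs(source, {source}, 1.0, 0)
    return total


# Hypothetical toy network: d_new has no known miRNA, but is similar to d1.
graph = {
    "d_new": {"d1": 0.8},                 # disease-disease similarity
    "d1": {"d_new": 0.8, "m1": 1.0},      # known association d1-m1
    "m1": {"d1": 1.0, "m2": 0.5},         # miRNA-miRNA similarity
    "m2": {"m1": 0.5},
}

for mirna in ("m1", "m2"):
    print(mirna, path_score(graph, "d_new", mirna))
```

In this toy network m1 is reached through the shorter, higher-weight path d_new↔d1↔m1 and therefore ranks above m2, which is only reachable through a path of length 3.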

PBMDA still has some limitations, and there remains considerable room to improve its prediction performance. For example, the miRNA-disease associations obtained from the HMDD database are far from complete, which greatly influences the performance of our approach. The disease semantic similarity and miRNA functional similarity suffer from sparsity, which was remedied by integrating the Gaussian interaction profile kernel similarity inferred from the known miRNA-disease associations; this inevitably introduces prediction error into the constructed heterogeneous graph. Finally, the distance-decay function in our approach is relatively simple and could be redesigned using machine learning methods.
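The way a Gaussian interaction profile kernel can fill gaps in a sparse similarity matrix is sketched below. The kernel follows the general form of van Laarhoven et al. [44], with the bandwidth normalized by the mean squared norm of the profiles; the fallback rule (use the semantic or functional similarity where it is defined, otherwise the kernel value), the use of None for missing entries, the function names and the toy data are assumptions for illustration only.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 matrix.

    Each row is the interaction profile of one entity (e.g. one disease),
    i.e. its known associations with every entity of the other type.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq_norm  # bandwidth normalized by profile size

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(a, b)) for b in profiles] for a in profiles]


def fill_sparse(primary, fallback):
    """Keep the primary similarity where defined; use the fallback where it is None."""
    return [
        [f if p is None else p for p, f in zip(p_row, f_row)]
        for p_row, f_row in zip(primary, fallback)
    ]


# Hypothetical toy data: 3 diseases x 3 miRNAs and a sparse semantic similarity.
disease_profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
semantic = [[1.0, None, 0.2], [None, 1.0, None], [0.2, None, 1.0]]

gip = gip_kernel(disease_profiles)
integrated = fill_sparse(semantic, gip)
print(integrated[0])
```

Because the kernel is computed from the known associations themselves, any missing or incorrect association propagates into the filled-in similarities, which is the source of error noted above.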

Supporting information

S1 Table. Known human miRNA-disease associations obtained from HMDD database.

https://doi.org/10.1371/journal.pcbi.1005455.s001

(XLSX)

S2 Table. Names of 495 miRNAs involved in known human miRNA-disease associations obtained from HMDD database.

https://doi.org/10.1371/journal.pcbi.1005455.s002

(XLSX)

S3 Table. Names of 383 diseases involved in known human miRNA-disease associations obtained from HMDD database.

https://doi.org/10.1371/journal.pcbi.1005455.s003

(XLSX)

S4 Table. The constructed miRNA functional similarity score matrix.

https://doi.org/10.1371/journal.pcbi.1005455.s004

(XLSX)

S5 Table. PBMDA’s top-50 prediction list verified by dbDEMC and miR2Disease databases.

https://doi.org/10.1371/journal.pcbi.1005455.s005

(XLSX)

S6 Table. As a global measure model, PBMDA can simultaneously prioritize all unknown potential related miRNAs for all investigated diseases.

We here publicly released the top-100 potential miRNA-disease associations predicted by PBMDA.

https://doi.org/10.1371/journal.pcbi.1005455.s006

(XLSX)

Author Contributions

  1. Conceptualization: XC.
  2. Data curation: XC ZAH.
  3. Formal analysis: ZHY XC GYY.
  4. Funding acquisition: XC ZZ GYY ZHY ZW.
  5. Investigation: XC GYY ZWL.
  6. Methodology: XC.
  7. Project administration: XC ZZ.
  8. Resources: XC.
  9. Software: ZAH XC.
  10. Supervision: XC ZZ.
  11. Validation: XC ZAH ZHY.
  12. Visualization: ZAH XC ZHY ZZ GYY ZWL.
  13. Writing – original draft: ZAH XC ZHY.
  14. Writing – review & editing: XC ZAH ZZ.

References

  1. Meister G, Tuschl T (2004) Mechanisms of gene silencing by double-stranded RNA. Nature 431: 343–349. pmid:15372041
  2. Bartel DP (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116: 281–297. pmid:14744438
  3. Ambros V (2001) microRNAs: tiny regulators with great potential. Cell 107: 823–826. pmid:11779458
  4. Ambros V (2004) The functions of animal microRNAs. Nature 431: 350–355. pmid:15372042
  5. Lee RC, Feinbaum RL, Ambros V (1993) The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75: 843–854. pmid:8252621
  6. Reinhart BJ, Slack FJ, Basson M, Pasquinelli AE, Bettinger JC, et al. (2000) The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403: 901–906. pmid:10706289
  7. Wightman B, Ha I, Ruvkun G (1993) Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell 75: 855–862. pmid:8252622
  8. Kozomara A, Griffiths-Jones S (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39: D152–D157. pmid:21037258
  9. Ambros V (2003) MicroRNA pathways in flies and worms: growth, death, fat, stress, and timing. Cell 113: 673–676. pmid:12809598
  10. Taganov KD, Boldin MP, Chang K-J, Baltimore D (2006) NF-κB-dependent induction of microRNA miR-146, an inhibitor targeted to signaling proteins of innate immune responses. Proceedings of the National Academy of Sciences 103: 12481–12486.
  11. Chen C-Z, Li L, Lodish HF, Bartel DP (2004) MicroRNAs modulate hematopoietic lineage differentiation. Science 303: 83–86. pmid:14657504
  12. Carleton M, Cleary MA, Linsley PS (2007) MicroRNAs and cell cycle regulation. Cell cycle 6: 2127–2132. pmid:17786041
  13. Urbich C, Kuehbacher A, Dimmeler S (2008) Role of microRNAs in vascular diseases, inflammation, and angiogenesis. Cardiovascular Research 79: 581–588. pmid:18550634
  14. Petrocca F, Visone R, Onelli MR, Shah MH, Nicoloso MS, et al. (2008) E2F1-regulated microRNAs impair TGFβ-dependent cell-cycle arrest and apoptosis in gastric cancer. Cancer cell 13: 272–286. pmid:18328430
  15. Leung AK, Sharp PA (2010) MicroRNA functions in stress responses. Molecular cell 40: 205–215. pmid:20965416
  16. Ma L, Teruya-Feldstein J, Weinberg RA (2007) Tumour invasion and metastasis initiated by microRNA-10b in breast cancer. Nature 449: 682–688. pmid:17898713
  17. Sredni ST, Huang CC, Bonaldo MdF, Tomita T (2011) MicroRNA expression profiling for molecular classification of pediatric brain tumors. Pediatric blood & cancer 57: 183–184.
  18. Shi B, Sepp-Lorenzino L, Prisco M, Linsley P, Baserga R (2007) Micro RNA 145 targets the insulin receptor substrate-1 and inhibits the growth of colon cancer cells. Journal of Biological Chemistry 282: 32582–32590. pmid:17827156
  19. Bang C, Fiedler J, Thum T (2012) Cardiovascular Importance of the MicroRNA‐23/27/24 Family. Microcirculation 19: 208–214. pmid:22136461
  20. Weidhaas J (2010) Using microRNAs to understand cancer biology. The Lancet Oncology 11: 106–107. pmid:20022811
  21. Li Y, Qiu C, Tu J, Geng B, Yang J, et al. (2014) HMDD v2.0: a database for experimentally supported human microRNA and disease associations. Nucleic acids research 42: D1070–1074. pmid:24194601
  22. Jiang Q, Wang Y, Hao Y, Juan L, Teng M, et al. (2009) miR2Disease: a manually curated database for microRNA deregulation in human disease. Nucleic acids research 37: D98–D104. pmid:18927107
  23. Chen X (2016) miREFRWR: a novel disease-related microRNA-environmental factor interactions prediction method. Mol Biosyst 12: 624–633. pmid:26689259
  24. Chen X, Yan CC, Zhang X, You Z-H (2016) Long non-coding RNAs and complex diseases: from experimental results to computational models. Briefings in Bioinformatics: bbw060. pmid:27345524
  25. Chen X (2015) KATZLDA: KATZ measure for the lncRNA-disease association prediction. Sci Rep 5: 16840. pmid:26577439
  26. Chen X, Huang Y, Wang X, You Z, Chan K (2016) FMLNCSIM: fuzzy measure-based lncRNA functional similarity calculation model. Oncotarget 7: 45948–45958. pmid:27322210
  27. Chen X, Liu MX, Cui QH, Yan GY (2012) Prediction of Disease-Related Interactions between MicroRNAs and Environmental Factors Based on a Semi-Supervised Classifier. PLoS One 7: e43425. pmid:22937049
  28. Chen X (2015) Predicting lncRNA-disease associations and constructing lncRNA functional similarity network based on the information of miRNA. Sci Rep 5: 13186. pmid:26278472
  29. Chen X, Yan G-Y (2013) Novel human lncRNA–disease association inference based on lncRNA expression profiles. Bioinformatics 29: 2617–2624. pmid:24002109
  30. Jiang Q, Hao Y, Wang G, Juan L, Zhang T, et al. (2010) Prioritization of disease microRNAs through a human phenome-microRNAome network. BMC systems biology 4: 1.
  31. Xuan P, Han K, Guo M, Guo Y, Li J, et al. (2013) Prediction of microRNAs associated with human diseases based on weighted k most similar neighbors. PloS one 8: e70204. pmid:23950912
  32. Mork S, Pletscher-Frankild S, Palleja Caro A, Gorodkin J, Jensen LJ (2014) Protein-driven inference of miRNA-disease associations. Bioinformatics 30: 392–397. pmid:24273243
  33. Chen X, Liu M-X, Yan G-Y (2012) RWRMDA: predicting novel human microRNA–disease associations. Molecular BioSystems 8: 2792–2798. pmid:22875290
  34. Xuan P, Han K, Guo Y, Li J, Li X, et al. (2015) Prediction of potential disease-associated microRNAs based on random walk. Bioinformatics 31: 1805–1815. pmid:25618864
  35. Shi H, Xu J, Zhang G, Xu L, Li C, et al. (2013) Walking the interactome to identify human miRNA-disease associations through the functional link between miRNA targets and disease genes. BMC systems biology 7: 101. pmid:24103777
  36. Chen X, Yan CC, Zhang X, You ZH, Deng L, et al. (2016) WBSMDA: Within and Between Score for MiRNA-Disease Association prediction. Sci Rep 6: 21106. pmid:26880032
  37. Xu J, Li C-X, Lv J-Y, Li Y-S, Xiao Y, et al. (2011) Prioritizing candidate disease miRNAs by topological features in the mirna target–dysregulated network: Case study of prostate cancer. Molecular cancer therapeutics 10: 1857–1866. pmid:21768329
  38. Chen X, Yan GY (2014) Semi-supervised learning for potential human microRNA-disease associations inference. Sci Rep 4: 5501. pmid:24975600
  39. Chen X, Yan CC, Zhang X, Li Z, Deng L, et al. (2015) RBMMMDA: predicting multiple types of disease-microRNA associations. Sci Rep 5: 13877. pmid:26347258
  40. Yang Z, Ren F, Liu C, He S, Sun G, et al. (2010) dbDEMC: a database of differentially expressed miRNAs in human cancers. BMC genomics 11: S5.
  41. Wang D, Wang J, Lu M, Song F, Cui Q (2010) Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases. Bioinformatics 26: 1644–1650. pmid:20439255
  42. Chen X, You Z, Yan G, Gong D (2016) IRWRLDA: Improved Random Walk with Restart for LncRNA-Disease Association prediction. Oncotarget 7: 57919–57931. pmid:27517318
  43. Huang Y, Chen X, You Z, Huang D, Chan K (2016) ILNCSIM: improved lncRNA functional similarity calculation model. Oncotarget 7: 25902–25914. pmid:27028993
  44. van Laarhoven T, Nabuurs SB, Marchiori E (2011) Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics 27: 3036–3043. pmid:21893517
  45. Ba-Alawi W, Soufan O, Essack M, Kalnis P, Bajic VB (2016) DASPfind: new efficient method to predict drug–target interactions. Journal of Cheminformatics 8: 1–9.
  46. Xu X-L, Jiang Y-H, Feng J-G, Su D, Chen P-C, et al. (2014) MicroRNA-17, microRNA-18a, and microRNA-19a are prognostic indicators in esophageal squamous cell carcinoma. The Annals of thoracic surgery 97: 1037–1045. pmid:24360091
  47. Kan T, Meltzer SJ (2009) MicroRNAs in Barrett's esophagus and esophageal adenocarcinoma. Current opinion in pharmacology 9: 727–732. pmid:19773200
  48. Okamoto K, Miyoshi K, Murawaki Y (2013) miR-29b, miR-205 and miR-221 enhance chemosensitivity to gemcitabine in HuH28 human cholangiocarcinoma cells. PLoS One 8: e77623. pmid:24147037
  49. Chow W-H, Gridley G, Fraumeni JF Jr, Järvholm B (2000) Obesity, hypertension, and the risk of kidney cancer in men. New England Journal of Medicine 343: 1305–1311. pmid:11058675
  50. Catto JW, Alcaraz A, Bjartell AS, White RDV, Evans CP, et al. (2011) MicroRNA in prostate, bladder, and kidney cancer: a systematic review. European urology 59: 671–681. pmid:21296484
  51. Juan D, Alexe G, Antes T, Liu H, Madabhushi A, et al. (2010) Identification of a microRNA panel for clear-cell kidney cancer. Urology 75: 835–841. pmid:20035975
  52. Tsz-fung FC, Youssef YM, Lianidou E, Romaschin AD, Honey RJ, et al. (2010) Differential expression profiling of microRNAs and their potential involvement in renal cell carcinoma pathogenesis. Clinical biochemistry 43: 150–158. pmid:19646430
  53. Lu MC, Lai NS, Chen HC, Yu HC, Huang KY, et al. (2013) Decreased microRNA (miR)‐145 and increased miR‐224 expression in T cells from patients with systemic lupus erythematosus involved in lupus immunopathogenesis. Clinical & Experimental Immunology 171: 91–99.
  54. O’Brien CA, Pollett A, Gallinger S, Dick JE (2007) A human colon cancer cell capable of initiating tumour growth in immunodeficient mice. Nature 445: 106–110. pmid:17122772
  55. Schetter AJ, Leung SY, Sohn JJ, Zanetti KA, Bowman ED, et al. (2008) MicroRNA expression profiles associated with prognosis and therapeutic outcome in colon adenocarcinoma. Jama 299: 425–436. pmid:18230780
  56. Akao Y, Nakagawa Y, Naoe T (2006) let-7 microRNA functions as a potential growth suppressor in human colon cancer cells. Biological and Pharmaceutical Bulletin 29: 903–906. pmid:16651716
  57. Slaby O, Svoboda M, Fabian P, Smerdova T, Knoflickova D, et al. (2008) Altered expression of miR-21, miR-31, miR-143 and miR-145 is related to clinicopathologic features of colorectal cancer. Oncology 72: 397–402.
  58. Motoyama K, Inoue H, Takatsuno Y, Tanaka F, Mimori K, et al. (2009) Over-and under-expressed microRNAs in human colorectal cancer. International journal of oncology 34: 1069. pmid:19287964
  59. Tsang WP, Kwok TT (2009) The miR-18a* microRNA functions as a potential tumor suppressor by targeting on K-Ras. Carcinogenesis 30: 953–959. pmid:19372139
  60. Arndt GM, Dossey L, Cullen LM, Lai A, Druker R, et al. (2009) Characterization of global microRNA expression reveals oncogenic potential of miR-145 in metastatic colorectal cancer. BMC cancer 9: 1.
  61. Huang Y-A, You Z-H, Chen X, Chan K, Luo X (2016) Sequence-based prediction of protein-protein interactions using weighted sparse representation model combined with global encoding. BMC bioinformatics 17: 184. pmid:27112932
  62. Wong L, You Z-H, Ming Z, Li J, Chen X, et al. (2016) Detection of interactions between proteins through rotation forest and local phase quantization descriptors. International journal of molecular sciences 17: 21.