Crinet: A computational tool to infer genome-wide competing endogenous RNA (ceRNA) interactions

  • Ziynet Nesibe Kesimoglu,

    Roles Conceptualization, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Department of Computer Science and Engineering, University of North Texas, Denton, Texas, United States of America, Department of Computer Science, Marquette University, Milwaukee, Wisconsin, United States of America

  • Serdar Bozdag

    Roles Conceptualization, Project administration, Supervision, Writing – review & editing

    Serdar.Bozdag@unt.edu

    Affiliations Department of Computer Science and Engineering, University of North Texas, Denton, Texas, United States of America, Department of Computer Science, Marquette University, Milwaukee, Wisconsin, United States of America

Abstract

To understand the biological factors that drive complex diseases such as cancer, the regulatory circuitry of genes needs to be discovered. Recently, a new gene regulation mechanism called competing endogenous RNA (ceRNA) interactions has been discovered. Certain genes targeted by common microRNAs (miRNAs) “compete” for these miRNAs, thereby regulating each other by freeing one another from miRNA regulation. Several computational tools have been published to infer ceRNA networks. In most existing tools, however, expression abundance sufficiency, collective regulation, and the groupwise effect of ceRNAs are not considered. In this study, we developed a computational tool named Crinet to infer genome-wide ceRNA networks while addressing these critical drawbacks. Crinet considers all mRNAs, lncRNAs, and pseudogenes as potential ceRNAs and incorporates a network deconvolution method to exclude spurious ceRNA pairs. We tested Crinet on breast cancer data in TCGA. Crinet inferred reproducible ceRNA interactions and groups, which were significantly enriched in cancer-related genes and processes. We validated the selected miRNA-target interactions with protein expression-based benchmarks and also evaluated how well the inferred ceRNA interactions predicted gene expression change in knockdown assays. The hub genes in the inferred ceRNA network included lncRNAs that are known suppressors or oncogenes in breast cancer, showing the importance of including non-coding RNAs in ceRNA inference. Crinet-inferred ceRNA groups that were consistently involved in immune system related processes could be important assets in light of studies confirming the relation between immunotherapy and cancer. The source code of Crinet is in R and available at https://github.com/bozdaglab/crinet.

Introduction

MicroRNAs (miRNAs) are small RNAs that bind to other RNAs such as mRNA, long non-coding RNA (lncRNA), and circular RNA to regulate their expression post-transcriptionally. Recently, a new regulatory layer related to miRNAs has been discovered [1]: certain RNAs targeted by common miRNAs “compete” for these miRNAs and thereby regulate each other indirectly by freeing the other RNA(s) from miRNA regulation. Such indirect interactions between RNAs are called competing endogenous RNA (ceRNA) interactions, which have important roles in diseases including cancer [2–5]. Regulation between miRNAs and RNAs is many-to-many: a miRNA could have multiple RNA targets, and an RNA could be targeted by multiple miRNAs. Given the enormous number of RNAs and the difficulty of deciphering miRNA binding targets accurately, identifying ceRNA interactions experimentally is cost- and labor-prohibitive. Therefore, computational tools are crucial to infer ceRNA interactions in complex genomes such as the human genome. For the rest of the paper, “genes” refers to mRNAs, lncRNAs, and pseudogenes in our analysis.

Despite existing computational tools [6–9], crucial drawbacks remain. Current tools compute only pairwise ceRNA interactions or ceRNA modules; however, groupwise ceRNA interactions should also be inferred, since several ceRNAs could work together to sequester miRNA(s) targeting key ceRNA(s). Also, in the existing tools, a miRNA/gene could be assigned to many genes/miRNAs without considering whether the miRNA/gene expression abundance is sufficient. Furthermore, since ceRNAs positively regulate each other, an interaction between two genes with many common ceRNA partners might be inferred just because of the amplifying effect of regulation by those common partners. Thus, excluding these false positive ceRNA interactions is important.

In this study, we developed a computational tool named Crinet (CeRna Interaction NETwork) to infer genome-wide ceRNA interactions and groups and to address the aforementioned drawbacks. To build our ceRNA network on a proper miRNA-target interaction set, we integrated expression datasets and binding scores for miRNA-target pairs while considering expression abundance sufficiency. We computed ceRNA pairs considering strong regulation jointly. To cope with spurious interactions, we excluded ceRNA pairs that were potentially inferred only because they shared a significant number of common ceRNA partners. We inferred ceRNA groups and integrated these groups into the ceRNA network. By leveraging multiple biological datasets and including non-coding RNAs, this approach facilitates a better understanding of ceRNA regulatory mechanisms and could shed light on the underlying complex regulatory circuitry in disease conditions.

Crinet was applied to a breast cancer dataset to infer ceRNA interactions and groups. We evaluated Crinet-selected miRNA-target interactions with protein expression-based benchmarks, on which performance increased after the filtering steps. Expression of the inferred ceRNAs changed markedly following knockdown of their ceRNA partners. Inferred ceRNAs, hub ceRNAs, and some ceRNA groups were significantly associated with cancer-related genes and processes and were consistently involved in immune system related processes; thus, Crinet-inferred ceRNAs could be important assets in view of the studies validating the relation between immunotherapy and cancer.

Materials and methods

Crinet is a computational tool to infer genome-wide ceRNA interactions and groups (Fig 1). Briefly, the first step is the data preparation step. In the second step, miRNA-target interactions are computed by incorporating expression datasets and considering expression abundance sufficiency. Starting with final miRNA-target interactions, ceRNA interactions are inferred in the third step. In the last step, ceRNA groups are inferred from the ceRNA network and integrated into the network. In the following, each step of Crinet is explained in more detail.

Data preparation

Crinet incorporates miRNA-target interactions with binding scores, gene-centric copy number aberration (CNA), and expression datasets. If binding scores are not available, the same score for all interactions could be used.

To collect datasets for Crinet, we used the TCGAbiolinks R package [10] and obtained the datasets from The Cancer Genome Atlas (TCGA) project, including gene expression, miRNA expression, and CNA for a total of 1107 breast cancer (BRCA) tumor samples (available at https://www.cancer.gov/tcga). We preprocessed each data type separately to obtain normalized expression values (gene expression as FPKM and miRNA expression as RPM) and filtered out lowly expressed genes and miRNAs (FPKM < 1 for genes and RPM = 0 for miRNAs in at least 15% of samples). To get gene-centric CNA from a segmented dataset, we ran the CNTools R package [11].
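The low-expression filter can be illustrated with a short Python sketch. This is not Crinet's R code: the function name `keep_expressed`, the dictionary layout, and the reading that a feature is dropped once it is lowly expressed in at least 15% of samples are assumptions made for the example.

```python
def keep_expressed(expr, is_low, max_low_fraction=0.15):
    """Drop features that are lowly expressed in too many samples.

    expr: dict mapping a feature name to its expression values across samples.
    is_low: predicate deciding whether one value counts as "low"
            (e.g. FPKM < 1 for genes, RPM == 0 for miRNAs).
    A feature is removed when the fraction of low samples reaches
    max_low_fraction (assumed reading of "at least 15% of samples").
    """
    kept = {}
    for name, values in expr.items():
        low = sum(1 for v in values if is_low(v))
        if low / len(values) < max_low_fraction:
            kept[name] = values
    return kept


# Toy data with 20 samples per gene (hypothetical values).
genes = {
    "G1": [5.0] * 20,              # never low: kept
    "G2": [0.5] * 3 + [5.0] * 17,  # low in 15% of samples: removed
    "G3": [0.5] * 2 + [5.0] * 18,  # low in 10% of samples: kept
}
filtered = keep_expressed(genes, lambda fpkm: fpkm < 1)
print(sorted(filtered))
```

The same helper could be reused for miRNAs by passing `lambda rpm: rpm == 0` as the predicate.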

We obtained all conserved and nonconserved miRNA-target interactions with weighted context++ scores from TargetScan [12]. To compute pseudogene and lncRNA targets of miRNAs, we ran TargetScan separately for lncRNAs and pseudogenes, providing a constant dummy ORF length and supplying the entire transcript sequence as the 3’UTR under the assumption that the entire sequence could be bound by miRNAs. If multiple scores existed for one pair because of multiple transcripts, we used the strongest score.

Since there is a corresponding weighted context++ score in TargetScan results even for weak miRNA-target interactions, we kept the top 40% of ranked interactions with respect to weighted context++ scores. To fit different distributions of scores for mRNAs, lncRNAs, and pseudogenes, we combined z-normalized scores from each and applied min-max normalization after having all the scores in the range of -1 and 1. We used these normalized scores as the weight for each miRNA-target interaction assuming that these scores show the binding strength between miRNA and its gene target.
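One possible reading of this score preparation is sketched below in Python (Crinet itself is written in R). Several details are assumptions made for the example, not statements of the authors' procedure: scores are taken as already oriented so that larger means stronger binding, the top 40% cut is applied to the pooled interactions, z-scores are clipped to [−1, 1], and the final min-max step maps them to [0, 1]. The function name `normalize_scores` is hypothetical.

```python
import statistics


def normalize_scores(scores_by_class, keep_fraction=0.4):
    """scores_by_class: {"mRNA": {(mirna, gene): strength, ...}, "lncRNA": {...}, ...}
    Larger strength = stronger predicted binding (assumed orientation)."""
    # 1) Pool all interactions and keep the strongest fraction.
    pooled = sorted(
        ((s, cls, pair) for cls, d in scores_by_class.items() for pair, s in d.items()),
        reverse=True,
    )
    kept = pooled[: max(1, int(len(pooled) * keep_fraction))]

    # 2) z-normalize within each gene class so their distributions are comparable.
    by_class = {}
    for s, cls, pair in kept:
        by_class.setdefault(cls, []).append((pair, s))
    z = {}
    for items in by_class.values():
        vals = [s for _, s in items]
        mu = statistics.fmean(vals)
        sd = statistics.pstdev(vals) or 1.0  # avoid division by zero
        for pair, s in items:
            # 3) Clip z-scores to [-1, 1] (assumed meaning of "range of -1 and 1").
            z[pair] = max(-1.0, min(1.0, (s - mu) / sd))

    # 4) Min-max normalize the combined scores to [0, 1].
    lo, hi = min(z.values()), max(z.values())
    span = (hi - lo) or 1.0
    return {pair: (v - lo) / span for pair, v in z.items()}


toy = {
    "mRNA": {("m1", "g1"): 0.9, ("m1", "g2"): 0.5, ("m2", "g3"): 0.1,
             ("m2", "g4"): 0.05, ("m3", "g5"): 0.02},
    "lncRNA": {("m1", "l1"): 0.8, ("m2", "l2"): 0.3, ("m3", "l3"): 0.2,
               ("m3", "l4"): 0.1, ("m1", "l5"): 0.01},
}
weights = normalize_scores(toy)
print(weights)
```

The resulting values play the role of the per-interaction weights used in later steps.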

Computing miRNA-target interactions

In this step, we computed final miRNA-target interactions leveraging expression datasets and considering expression abundance sufficiency.

Integrating interactions with expression datasets (correlation filtering).

Since miRNAs are known to repress their target genes, we kept only miRNA-target pairs with a negative correlation between their expression levels. By default, we used a correlation coefficient threshold of −0.1, which was the median correlation coefficient of all miRNA-target pairs. We also applied random sampling with replacement to compute the correlation coefficient of each miRNA-target pair 1000 times and required that the threshold be satisfied in ≥99% of the samplings.
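The resampling criterion can be sketched in Python as follows (an illustration, not Crinet's R implementation). The function names, the use of Pearson correlation, and the fixed random seed are assumptions made for the example; the threshold of −0.1, the 1000 resamplings, and the 99% requirement come from the text.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def passes_bootstrap(mirna, target, threshold=-0.1, n_boot=1000,
                     min_fraction=0.99, seed=0):
    """Keep a miRNA-target pair only if its correlation is below the
    threshold in at least min_fraction of resamplings with replacement."""
    rng = random.Random(seed)
    n = len(mirna)
    hits = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample indices with replacement
        r = pearson([mirna[i] for i in idx], [target[i] for i in idx])
        if r < threshold:
            hits += 1
    return hits / n_boot >= min_fraction


mirna_expr = [float(v) for v in range(50)]
repressed = [100.0 - v for v in mirna_expr]   # perfectly anticorrelated toy target
coexpressed = [v + 1.0 for v in mirna_expr]   # positively correlated toy target
print(passes_bootstrap(mirna_expr, repressed), passes_bootstrap(mirna_expr, coexpressed))
```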

Getting interactions with sufficient abundance and binding probabilities (abundance filtering).

In the miRNA-target interaction sets, a miRNA could be assigned to regulate thousands of RNAs, and similarly an RNA could be assigned to many miRNAs as a potential target. To quantify the expression sufficiency for our putative interactions, we introduced Interaction Regulation (IR) formulated as: (1) where IR(r, t) is the IR of the regulator r and the target t across samples, Exp(.) is the expression vector across samples, and score_rt is the normalized binding score for the interaction between the regulator r and the target t. Using this formula, we kept the final miRNA-target interactions having high IRs (i.e., 80th percentile of log of IR(r, t) > −4.89, which was the third quartile of all IRs across samples).

Keeping genes with proper effective regulation.

To exclude genes from analysis if they were not under strong miRNA regulation based on the final miRNA-target interactions, we introduced Effective Regulation (ER) formulated as: (2)

To keep genes with proper effective regulation by miRNAs, we filtered out genes without a strong negative correlation (< −0.01) between their expression and ER, assuming that they did not have strong miRNA regulation in our specific dataset. We also applied random sampling with replacement to compute the correlation 1000 times and required that the threshold be satisfied in ≥99% of the samplings. We used the remaining genes (called candidate genes) for further analysis.

Inferring ceRNA interactions

To infer ceRNA interactions, we generated all possible gene-gene combinations using candidate genes and filtered them based on the following criteria:

Checking for a significant number of common regulators.

Since ceRNAs should have common miRNAs to compete for, we kept gene pairs having a significant number of common miRNA regulators (hypergeometric p-value <0.01).
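A hypergeometric test of this kind can be sketched with the Python standard library (illustrative only, not Crinet's R code). The choice of background, here the total number of miRNAs in the interaction set, and the function name are assumptions made for the example.

```python
from math import comb


def shared_regulator_pvalue(n_mirnas, n_a, n_b, n_common):
    """Upper-tail hypergeometric p-value P(X >= n_common).

    n_mirnas: size of the miRNA background (assumed: all miRNAs considered).
    n_a, n_b: number of miRNAs regulating gene A and gene B.
    n_common: number of miRNAs regulating both genes.
    """
    upper = min(n_a, n_b)
    total = comb(n_mirnas, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_mirnas - n_a, n_b - k)
        for k in range(n_common, upper + 1)
    )
    return tail / total


# Toy example: 10 miRNAs in total, each gene has 3 regulators, all 3 shared.
p = shared_regulator_pvalue(10, 3, 3, 3)
print(p, p < 0.01)
```

In this toy case the p-value is 1/120, so the pair would pass the 0.01 cutoff.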

Checking significant expression correlation.

Since the ceRNA pairs indirectly positively regulate each other and CNA considerably affects the expression values, we kept the gene pairs having significant partial correlation when excluding the CNA effect from the gene expression. By default, we used the correlation coefficient threshold of 0.55 (p-value <0.01), which was the third quartile of the positive correlation (S3 Fig in S1 File).
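One plausible way to remove the CNA effect before correlating two genes is to regress each gene's expression on its own copy number and correlate the residuals. The Python sketch below illustrates that idea only; whether Crinet adjusts for CNA in exactly this way is an assumption, and the function names are hypothetical.

```python
import math


def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = 0.0 if sxx == 0 else sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)


def cna_adjusted_correlation(exp_a, cna_a, exp_b, cna_b):
    """Correlate two genes after removing each gene's own CNA effect."""
    return pearson(residuals(exp_a, cna_a), residuals(exp_b, cna_b))


# Toy data: both genes follow a shared signal plus a gene-specific CNA effect.
shared = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
cna_a = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
cna_b = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
exp_a = [s + 3.0 * c for s, c in zip(shared, cna_a)]
exp_b = [s + 5.0 * c for s, c in zip(shared, cna_b)]
r_adjusted = cna_adjusted_correlation(exp_a, cna_a, exp_b, cna_b)
print(round(r_adjusted, 3), r_adjusted > 0.55)
```

A pair would then be retained if the adjusted coefficient exceeds the default threshold of 0.55 and is significant.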

Checking collective regulation.

If there exists ceRNA regulation between two genes then both genes compete for common miRNAs and those miRNAs affect both genes simultaneously. We called this regulation Collective Regulation (CR) formulated as: (3) where S is a set of genes having ceRNA interactions and corr() is the Pearson correlation function. We kept the ceRNA pairs if they had a CR < −0.01.

We applied random sampling with replacement for both partial correlation and CR measurements (last two steps) 100 times separately and kept the interactions when the threshold was satisfied for ≥99% of the samplings.

Excluding amplified interactions.

Having common ceRNA partners between any two genes will increase the correlation between their expression. If a gene pair has too many common ceRNA partners, then the ceRNA interaction between them could appear stronger than it is simply because of the high number of common ceRNA partners. To exclude such spurious interactions from our network, we employed a network deconvolution algorithm [13] and kept the top one-third of ranked interactions as our pairwise ceRNA network.
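The intuition behind network deconvolution can be shown on a toy network. The Python sketch below assumes the usual formulation in which the observed matrix accumulates direct effects over paths of every length, G_obs = G_dir + G_dir^2 + ..., so that G_dir = G_obs (I + G_obs)^-1, and it evaluates that inverse as the series G_obs − G_obs^2 + G_obs^3 − .... This series only converges when the spectral radius of G_obs is below 1; a practical implementation would rescale the matrix and use an eigendecomposition. It is not the code of reference [13] or of Crinet, and all function names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def observed_from_direct(g_dir, n_terms=200):
    """Toy forward model: G_obs = G_dir + G_dir^2 + ... (truncated)."""
    total = [row[:] for row in g_dir]
    power = [row[:] for row in g_dir]
    for _ in range(n_terms - 1):
        power = matmul(power, g_dir)
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
    return total


def deconvolve(g_obs, n_terms=200):
    """Approximate G_dir = G_obs - G_obs^2 + G_obs^3 - ... (truncated)."""
    direct = [row[:] for row in g_obs]
    power = [row[:] for row in g_obs]
    sign = -1.0
    for _ in range(n_terms - 1):
        power = matmul(power, g_obs)
        direct = [[d + sign * p for d, p in zip(dr, pr)] for dr, pr in zip(direct, power)]
        sign = -sign
    return direct


def top_third_edges(direct, names):
    """Rank undirected edges by deconvolved weight and keep the top one-third."""
    n = len(names)
    edges = sorted(
        ((direct[i][j], names[i], names[j]) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    keep = max(1, len(edges) // 3)
    return [(a, b) for _, a, b in edges[:keep]]


# Chain A - B - C: A and C are linked only indirectly, through B.
g_dir = [[0.0, 0.3, 0.0], [0.3, 0.0, 0.3], [0.0, 0.3, 0.0]]
g_obs = observed_from_direct(g_dir)
recovered = deconvolve(g_obs)
print(round(g_obs[0][2], 3), round(recovered[0][2], 6), top_third_edges(recovered, ["A", "B", "C"]))
```

In the toy chain, the observed A–C weight is clearly non-zero even though no direct edge exists, and deconvolution drives it back to approximately zero, so the A–C pair drops out of the retained top third.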

Inferring ceRNA groups

In ceRNA regulation, each member of a ceRNA pair competes for common miRNA(s) and acts as a decoy to free the other RNA from miRNA regulation. However, this competition could occur among more than two RNAs, or between two groups of RNAs. Based on this premise, Crinet inferred ceRNA groups in addition to ceRNA interactions.

To obtain ceRNA groups, we applied Walktrap [14], a popular community detection algorithm, to the weighted ceRNA network, where the weights were normalized partial correlation coefficients. We kept the groups that satisfied all the group conditions and iteratively split those that did not. The three group requirements of Crinet are as follows:

Common miRNA regulator.

To be able to compete, all group members were required to have at least one common miRNA regulator.

Strong regulation effect.

CeRNAs in a group were expected to have a stronger miRNA regulation effect as a group than as individual ceRNAs. Thus, we required that CR of a group must be stronger (i.e. reduced) than the correlation between expression and ER of ≥90% of the group members. Moreover, to ensure that most of the group members would be under strong collective regulation effect, we required that an average difference between CR of the group and corr(Exp(g), ER(g)) for each gene g in the group was >0.

Compatibility with the network.

To maintain the inference consistency of the ceRNA network, for a given group, we called each ceRNA partner of the group members a neighbor and expected the group to satisfy any two of the three conditions for at least 90% of its neighbors. The conditions are i) at least one common miRNA regulator between the group and the neighbor, ii) a strong Pearson correlation based on expression, and iii) strong collective regulation between the group and the neighbor.
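The two-of-three rule can be written compactly. The Python sketch below is illustrative rather than Crinet's R code: it assumes the three conditions have already been evaluated for each neighbor, and the choice to accept a group that has no neighbors at all is an assumption of the example.

```python
def group_is_compatible(neighbor_checks, min_fraction=0.9):
    """neighbor_checks: one tuple per neighbor with three booleans:
    (shares_a_mirna, strong_expression_correlation, strong_collective_regulation).
    The group passes if at least min_fraction of its neighbors satisfy
    any two of the three conditions."""
    if not neighbor_checks:
        return True  # assumed: nothing to contradict when there are no neighbors
    satisfied = sum(1 for checks in neighbor_checks if sum(checks) >= 2)
    return satisfied / len(neighbor_checks) >= min_fraction


# Ten hypothetical neighbors: nine satisfy two or more conditions, one does not.
checks = [(True, True, False)] * 5 + [(True, True, True)] * 4 + [(True, False, False)]
print(group_is_compatible(checks))
```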

Results

We tested Crinet on breast tumor samples from TCGA (Section Data preparation for details). We computed miRNA-target interactions (Table 1) and used them to infer 17,443 pairwise ceRNA interactions (Table 2). Using this pairwise ceRNA network, we obtained 81 ceRNA groups after applying 1508 iterations of Walktrap. Thirty-five of these groups were connected to at least one node in the final network, while the others had interactions only within the group. After this step, we had our grouped ceRNA network with 4352 nodes (4317 individual genes and 35 groups of genes) and 17,274 edges between inferred nodes.

Table 1. Number of miRNA-target interactions after each miRNA-target interaction filtering step in Crinet.

https://doi.org/10.1371/journal.pone.0251399.t001

Table 2. Number of remaining ceRNA pairs after each ceRNA interaction filtering step in Crinet.

https://doi.org/10.1371/journal.pone.0251399.t002

To check the scale-free property and specificity of Crinet, we examined the inferred network (A.2 Section in S1 File for details). Since biological networks generally exhibit the scale-free property [15–17], we computed the degree probability distribution function of the inferred network. Its log-log fit had a negative slope with a high goodness of fit (R² = 0.93), indicating that the inferred ceRNA network was scale-free (S2 Fig in S1 File). To evaluate the specificity of Crinet, we checked whether our inferred ceRNA pairs existed in different regulatory layers, namely protein-protein interactions (PPIs) and transcription factor (TF)-gene interactions. We collected 1,663,810 TF-target interactions from the TRRUST v2 [18] database and the ENCODE Transcription Factor Targets dataset [19], and 1,847,774 PPIs from BIOGRID v3.5.186 [20]. Among all inferred ceRNA interactions, very few were TF-gene interactions (0.46%) or PPIs (0.51%), indicating that the regulatory relationships in the inferred ceRNA interactions were not due to TF or PPI effects.
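A scale-free check of this kind is commonly done by fitting a straight line to the degree distribution on log-log axes and reporting the slope and R². The Python sketch below illustrates that generic procedure; the exact fitting choices used for S2 Fig are not stated here, so the details (unbinned degrees, base-10 logarithms, ordinary least squares) are assumptions.

```python
import math
from collections import Counter


def power_law_fit(degrees):
    """Least-squares fit of log10 P(k) against log10 k.

    Returns (slope, r_squared). A clearly negative slope with R^2 close
    to 1 is the usual informal signature of a scale-free network."""
    counts = Counter(d for d in degrees if d > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct degrees")
    total = sum(counts.values())
    ks = sorted(counts)
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(counts[k] / total) for k in ks]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Toy degree sequence following P(k) proportional to k^-2.
toy_degrees = [1] * 10000 + [10] * 100 + [100] * 1
slope, r2 = power_law_fit(toy_degrees)
print(round(slope, 3), round(r2, 3))
```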

To check the reproducibility based on different datasets, the robustness to different hyperparameters, and the effect of each individual step in ceRNA inference, we conducted a more detailed analysis of Crinet results (A.3 and A.4 Sections in S1 File for details). To check the reproducibility of Crinet based on different datasets, we ran Crinet on two equal-sized random samplings of the breast cancer dataset multiple times. To avoid bias in the comparisons, we ensured that both samplings had a similar subtype distribution (namely Basal-like, Normal-like, Luminal-A, Luminal-B, and Her2-enriched). We observed highly overlapping interactions and ceRNAs among different runs (S4 Table in S1 File). We checked the degree distribution of consistently overlapping ceRNAs and observed that the mean degree of these ceRNAs was much higher than the overall mean degree (p-value < 2 × 10^−16), suggesting that consistently inferred ceRNAs were the hub ceRNAs that were highly involved in our inferred ceRNA network. Moreover, to examine the effect of each step in Crinet, we disabled major steps in ceRNA inference and evaluated the results. Disabling individual steps made a substantial difference in the inferred results (S1 Table in S1 File). However, when we modified the hyperparameters in each of these steps, we observed highly overlapping interactions (S5 and S6 Tables in S1 File), suggesting that Crinet is robust to different choices of hyperparameters.

miRNA-target interaction filtering showed increasing performance on protein expression-based benchmarks

Since we built a ceRNA network relying on miRNA-target interactions, proper selection of these interactions is important; therefore, we evaluated each filtering step of miRNA-target interactions using protein expression-based benchmarks.

Transfection analysis.

We utilized a Reverse Phase Protein Array (RPPA) dataset for the MDA-MB-231 breast cancer cell line from The Cancer Proteome Atlas (TCPA) database (accession number: TCPA00000001) [21] to assess our miRNA-target interactions as in [9]. We used 104 antibodies, their fold-change for 141 transfected miRNAs, and mock controls. For each miRNA-target interaction, we measured the expression fold ratio of each antibody of the target for the miRNA transfection relative to the average of the mock transfections. Table 3 confirms the preferential down-regulation of predicted miRNA targets, which became stronger after each consecutive filtering step, showing the positive effect of filtering for each independent interaction. We also checked the average of all targeting miRNAs per gene relative to the average of the mock transfections and observed a similar down-regulation tendency. Although the ratio did not increase for the last step, this was due to a few genes. ERCC1, BAK1, CTNNA1, PXN, MSH2, XIAP, MAPK3, EEF2K, CAV1, IGFBP2, PRKAA1, PCNA, CASP9, IGF1R, SMAD4, COL6A1, PIK3R1, CHEK1, EIF4E, PTK2, CDK1, SMAD1, BCL2L1, BCL2, LCK, DIABLO, NF2, and EIF4EBP1, some of which are known to be important in breast cancer, were consistently down-regulated by their predicted miRNA regulators. As a negative control, we used non-inferred interactions and did not observe any strong down-regulation tendency in any of the filtering steps for both phases (S2 Table in S1 File).

Table 3. Evaluation of miRNA-target interaction filtering steps for the computed miRNA-target interactions using miRNA transfection data.

https://doi.org/10.1371/journal.pone.0251399.t003

Protein expression anticorrelation analysis.

Using protein expression dataset from TCPA matching with breast tumor samples in our analysis, we analyzed the negative correlation between miRNA expression and protein expression of their targets for each applicable miRNA-target interaction.

While the anticorrelation tendency decreased slightly after selecting the top 40% of miRNA-target interactions, our filtering steps substantially increased the ratios, showing the anticorrelation tendency in our selected interactions (Table 4). Also, when considering the average over miRNAs per target, we had a slight anticorrelation tendency for the top 40% of interactions; however, our filtering steps increased the anticorrelation tendency much more. As a negative control, non-inferred interactions did not show a strong tendency, and for the gene phase the tendency was even toward up-regulation (S3 Table in S1 File).

Table 4. Evaluation of miRNA-target interaction filtering steps for the computed miRNA-target interactions based on miRNA-protein expression anticorrelation.

https://doi.org/10.1371/journal.pone.0251399.t004

Protein expression analysis with ESR1.

To evaluate predicted miRNA-target interactions in [9], the authors focused on the ESR1 protein, showing that ESR1 protein expression in TCGA breast cancer tumors (profiled by RPPA using the antibody ER.alpha.R.V_GBL.9014870) had a strong negative correlation with the expression of predicted miRNA regulators. Ranking samples based on miRNA expression, the top 10% and bottom 10% of samples were compared based on ESR1 protein expression. Similarly, we generated a heatmap showing protein expression for Crinet-selected miRNAs regulating ESR1. Fig 2 shows nine Crinet-selected and 12 Cupid-selected miRNAs regulating ESR1, with five miRNAs in common. We quantified the anticorrelation between miRNA and protein expression by measuring the fold-change of mean protein expression for the top 10% of samples with respect to the bottom 10%. Our results indicated that the expression of Crinet-selected miRNAs for ESR1 was consistently anticorrelated with protein expression with a high fold-change, while some Cupid-selected miRNAs, such as hsa-mir-381, had a low fold-change.

Fig 2. Heatmap showing protein expression of ESR1 for the top and bottom 10% ranked samples with respect to miRNA expression.

Protein expression is shown for the top and bottom 10% samples ranked with respect to miRNA expression regulating ESR1 by Cupid-selected, Crinet-selected, Crinet-eliminated, and negative control along with the mean difference of log fold-change of protein expression for the bottom 10% with respect to the top 10% samples. Each row is independently ranked by miRNA expression. (*Common miRNA regulators with Cupid).

https://doi.org/10.1371/journal.pone.0251399.g002

Fig 2 also illustrates nine regulators eliminated by Crinet following the expression correlation and sufficient abundance filtering steps of miRNA-target selection. For the majority of these miRNAs, the interactions did not show strong anticorrelation between miRNA and protein expression with respect to fold-change, showing the strength of Crinet's filtering approach. As a negative control, we added the average of 100 random miRNAs that were not selected as regulators of ESR1 by Cupid or Crinet, and they exhibited a low fold-change.
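The comparison of the extreme deciles can be sketched as follows in Python (illustrative, not the authors' code). It assumes protein levels are on a log scale, so that the log fold-change is a difference of means, and it reports the bottom 10% relative to the top 10% of samples ranked by miRNA expression; the function name is hypothetical.

```python
def extreme_decile_difference(mirna_expr, protein_expr, fraction=0.1):
    """Mean protein level in the samples with the lowest miRNA expression
    minus the mean in the samples with the highest miRNA expression.
    A large positive value indicates anticorrelation."""
    order = sorted(range(len(mirna_expr)), key=lambda i: mirna_expr[i])
    k = max(1, int(len(order) * fraction))
    bottom, top = order[:k], order[-k:]

    def mean_protein(indices):
        return sum(protein_expr[i] for i in indices) / len(indices)

    return mean_protein(bottom) - mean_protein(top)


# Toy data: 20 samples in which protein level falls as miRNA expression rises.
mirna = [float(v) for v in range(20)]
protein = [20.0 - v for v in mirna]
diff = extreme_decile_difference(mirna, protein)
print(diff)
```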

Inferred ceRNA interactions were able to predict gene expression change

To assess the accuracy of ceRNA inference, we used the Library of Integrated Network-based Cellular Signatures (LINCS) [22] L1000 shRNA-mediated gene knockdown experiment in a breast cancer cell line as in [6] and checked whether ceRNA interactions can predict the effects of RNAi-mediated gene silencing perturbations in MCF7 cells. Since Crinet starts with a high number of genes, it was not computationally feasible to run many tools with our dataset. However, Hermes [6] evaluates any given ceRNA pair independently; therefore, we ran Hermes for the genes in the knockdown assays using the same expression datasets and Crinet-selected miRNA-target interactions. The LINCS database is a rich resource that provides the expression change of nearly 1000 genes in response to a silenced gene. When a gene is silenced, its ceRNA partner will be affected, since more miRNA regulators will be available to suppress the ceRNA partner. Thus, given a ceRNA pair, the expression level of a gene should be lower in response to the silencing of its ceRNA partner than in response to the silencing of genes that are not its ceRNA partners. Based on this assumption, we evaluated the Crinet- and Hermes-inferred networks. The accuracy of this assessment is shown in Table 5. Since Hermes was not selective and inferred many significant ceRNA interactions, we evaluated several networks from Hermes until reaching a number of genes similar to that of Crinet in the knockdown assessment. Based on these results, Crinet outperformed Hermes at predicting the gene expression change of ceRNA partners for each timepoint and for overall accuracy.

Table 5. Evaluation of the accuracy of Crinet- and Hermes-inferred ceRNA interactions based on the shRNA-mediated gene knockdown experiment.

https://doi.org/10.1371/journal.pone.0251399.t005

Inferred ceRNAs were significantly associated with the known cancer genes and cancer-related processes

To analyze the biological significance of the inferred ceRNA network, we applied enrichment analysis to the inferred ceRNAs. We used the ClusterProfiler R package [23] for all enrichment analyses. The inferred ceRNAs were significantly enriched in 398 GO terms from the biological process ontology and 39 KEGG pathways. To associate enriched terms with broader categories, we analyzed GO Slim terms (S8 Table in S1 File). Inferred ceRNAs were mostly involved in biological processes including immune system process, cell differentiation, cell death, cell cycle, response to stress, and cell-cell signaling. These results suggest that ceRNA interactions could have important roles in biological processes in cancer.

To check if the inferred ceRNAs were associated with cancer-related genes, we collected 3078 known cancer genes from the Cancer Gene Census in COSMIC v91 [24], Bushman’s cancer gene list v3 [25], human oncogenes from ONGene [26], Network of Cancer Genes 6.0 [27], and the LncRNADisease database from the Cui Lab [28]. We found a significant overlap (hypergeometric test p-value 9 × 10^−105) between the known cancer genes and the inferred ceRNAs, with 789 out of 3078 known cancer genes in our inferred network. When we repeated the same analysis for non-inferred genes, the overlap was not significant (p-value almost 1). We also had 54 breast cancer-related genes from the LncRNADisease and Network of Cancer Genes 6.0 databases; 14 of these 54 genes were inferred in our network. These results indicated that inferred ceRNAs were significantly associated with known cancer genes.

Moreover, we analyzed the hub ceRNAs in our network. The degree distribution of our network had a median of three, a maximum of 152, and a minimum of one. We selected as hub genes the 81 ceRNAs with a degree of 50 or more in our network. The hub genes were significantly enriched in 58 GO terms and eight KEGG pathways. We investigated the GO Slim terms from the biological process ontology for hub genes, and enriched terms included immune system process, cell death, cell differentiation, cell motility, and cell cycle. Also, these hub genes overlapped significantly with the known cancer genes (hypergeometric p-value of 0.0009). Specifically, 15 out of 81 hub genes were known cancer genes, and three of them were breast cancer-related. These results suggest that hub ceRNAs in the inferred network were involved in important biological processes in cancer.

Among the hub genes were some lncRNAs with known involvement in cancer. For instance, MAGI2-AS3 had a degree of 89, making it highly connected for ceRNA regulation in our network, and it is known as a suppressor involved in cell growth [29]. MALAT1, which had a degree of 51, contributes significantly to cancer initiation and progression in breast cancer [30]. Some other lncRNAs also had important functions: MIR100HG as an oncogene involved in proliferation [31], ITGB2-AS1 as an oncogene involved in migration and invasion [32], and MEG3 as a suppressor involved in proliferation and EMT [33].

Inferred ceRNA groups included known cancer-related genes and were enriched in cancer-related processes

To evaluate the inferred ceRNA groups, we performed enrichment analysis for the four of the 81 inferred ceRNA groups that had more than three members (S4 Fig in S1 File). CeRNA groups had a significant overlap with the cancer-related genes (hypergeometric p-value <0.0006). Group 1 genes were enriched in 82 GO terms from the biological process ontology and 5 KEGG pathways, while group 3 was enriched in 35 GO terms and group 4 in 17 GO terms. Moreover, group 1 was highly enriched in GO Slim terms including cell cycle, response to stress, DNA metabolic process, chromosome segregation, and cell division. Although group 2 had limited mRNAs, all the remaining groups (groups 2, 3, and 4) were consistently enriched in the GO Slim term immune system process. Additionally, group 3 had GO Slim terms including cell death, cell adhesion, and cell motility, while group 4 included response to stress (S7 Table in S1 File for details). CeRNA groups overlapped significantly with known cancer-related genes and were significantly enriched in biological processes, suggesting that ceRNA groups could have important roles as a group in cancer, including in the immune system and cell repair.

Discussion

In this study, we developed a computational tool named Crinet to infer pairwise and groupwise ceRNA interactions and applied it to the breast tumor samples. Leveraging multiple types of biological datasets, considering expression abundance between miRNA and their targets, and excluding amplifying effect of ceRNA regulation, we inferred a ceRNA network including 17,274 ceRNA interactions between 4352 ceRNAs/ceRNA groups.

Unlike the existing tools, we filtered the miRNA-target interactions considering abundance sufficiency and binding scores. We introduced Interaction Regulation (IR) score, and confirmed that the miRNAs with the highest number of targets and low expression levels were successfully filtered out (A.1 Section and S1 Fig in S1 File). Since ceRNAs positively regulate each other, expecting positive correlation between expression of ceRNAs is a common approach [1]. While some studies use additional metrics in addition to simply calculating correlation [7, 34], we measured the partial correlation excluding the CNA effect since expression values are highly affected by copy number amplification and deletion.

Different from the studies that analyze only differentially expressed (DE) genes in order to have a computationally manageable number, we included all lncRNAs and pseudogenes in addition to all coding RNAs (including non-DE mRNAs) to understand comprehensive ceRNA regulation. We had the known lncRNA oncogenes and suppressors as highly connected in the inferred ceRNA network indicating that the lncRNAs could have an important role in ceRNA regulation. When the expression profiling of other data types such as circular RNAs is available, including them into the pipeline could be worthy of further investigation to improve ceRNA inference [35].

We also started with a large set of miRNA-target interactions since we utilized the weighted context++ score as a proxy for binding probability. This enabled us to include all possible gene-gene interactions based on common regulators, especially for measurements like our collective and effective regulation. Our carefully designed miRNA-target interaction filtering steps were able to eliminate likely false-positive interactions, as confirmed by the protein expression-based benchmarks. As a negative control, we analyzed non-inferred interactions in addition to the inferred ones in our evaluations based on miRNA transfection and protein expression data. Especially for the gene phase, we observed that the ratio of down-regulated to up-regulated non-inferred interactions was less than 1 (S2 and S3 Tables in S1 File).

There are alternatives to the network deconvolution approach for eliminating indirect interactions, such as network enhancement [36]. When we applied the network enhancement method to our network, many new edges appeared at the top of the final network ranking, above the existing ones. Therefore, we preferred the network deconvolution method [13], which removes the contribution of indirect interactions from the direct ones for the existing edges. By applying the network deconvolution method to ceRNA inference, we eliminated the amplifying effect of ceRNA pairs, which was not addressed by previous studies.
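The core idea of network deconvolution [13] can be sketched as follows. This is a simplified illustration under stated assumptions, not Crinet's code: it treats the observed symmetric network as the sum of direct effects and all indirect paths, G_obs = G_dir (I - G_dir)^-1, and inverts that relation through an eigendecomposition. The eigenvalue rescaling step of the published method is omitted, and the function name `deconvolve` and the toy matrix are hypothetical.

```python
import numpy as np

def deconvolve(observed):
    """Recover direct weights from a symmetric observed network.

    Assumes observed = direct @ inv(I - direct), so each eigenvalue maps as
    lambda_dir = lambda_obs / (1 + lambda_obs). No eigenvalue rescaling is
    applied, so 1 + lambda_obs must be nonzero for every eigenvalue.
    """
    g = np.asarray(observed, dtype=float)
    g = (g + g.T) / 2.0                      # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(g)
    direct_vals = eigvals / (1.0 + eigvals)
    # Rebuild U diag(direct_vals) U^T.
    return (eigvecs * direct_vals) @ eigvecs.T

# Toy chain A-B-C: A and C are linked only indirectly through B.
direct = np.array([[0.0, 0.4, 0.0],
                   [0.4, 0.0, 0.4],
                   [0.0, 0.4, 0.0]])
# Observed network accumulates indirect paths of every length.
observed = direct @ np.linalg.inv(np.eye(3) - direct)
recovered = deconvolve(observed)
```

In the toy chain, the observed matrix contains a nonzero A-C weight that arises purely from the path through B, and deconvolution returns it to approximately zero while restoring the direct A-B and B-C weights. Restricting the output to edges that already exist in the input, as described above, would amount to masking `recovered` with the nonzero pattern of the original network.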

Crinet inferred ceRNA groups while preserving inference consistency in the ceRNA network. We defined a ceRNA group as a group of two or more ceRNAs that have strong and collective relationships, and we replaced the individual pairwise edges of the group members with group edges that affect the whole group collectively. In that way, we obtained not only groups of closely related genes but also a ceRNA network in which the groups and individual ceRNAs had ceRNA interactions, showing the comprehensive regulation. Computational inference of ceRNA interactions is based on local topological information because the inference starts with a miRNA-RNA interaction set. However, we grouped ceRNAs even when they did not have an inferred pairwise interaction; thus, we were able to add global signals to the inferred network.

Regarding biological significance, the inferred ceRNAs, ceRNA groups, and hub ceRNAs were significantly enriched in known cancer-related genes and processes, suggesting that ceRNAs could serve important processes in cancer. The immune system process was consistently a significantly enriched GO term for the inferred ceRNAs, some ceRNA groups, and the hub ceRNAs. Studies have confirmed that weakness in immune system function is closely related to tumorigenesis [37]. In [38], the authors investigated ceRNA networks in papillary thyroid carcinoma and disclosed the combined regulation of immune responses from these networks. In [39], the authors unraveled the prognostic significance of ceRNA interactions among immune response genes in glioblastoma multiforme. Considering that immunotherapy is an emerging field in cancer with the potential to provide a strong response in cancer patients, novel Crinet-inferred ceRNA interactions and groups significantly enriched in immune-related processes could be important assets.

Supporting information

S1 File. Supplementary file of Crinet.

This file includes supplementary methods, tables, and figures of Crinet.

https://doi.org/10.1371/journal.pone.0251399.s001

(PDF)

References

  1. Salmena L, Poliseno L, Tay Y, Kats L, Pandolfi PP. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell. 2011;146(3):353–358. pmid:21802130
  2. Yang J, Li T, Gao C, Lv X, Liu K, Song H, et al. FOXO1 3’ UTR functions as a ceRNA in repressing the metastases of breast cancer cells via regulating miRNA activity. FEBS letters. 2014;588(17):3218–3224. pmid:25017439
  3. Zhou M, Wang X, Shi H, Cheng L, Wang Z, Zhao H, et al. Characterization of long non-coding RNA-associated ceRNA network to reveal potential prognostic lncRNA biomarkers in human ovarian cancer. Oncotarget. 2016;7(11):12598. pmid:26863568
  4. Kumar MS, Armenteros-Monterroso E, East P, Chakravorty P, Matthews N, Winslow MM, et al. HMGA2 functions as a competing endogenous RNA to promote lung cancer progression. Nature. 2014;505(7482):212–217. pmid:24305048
  5. Qi X, Zhang DH, Wu N, Xiao JH, Wang X, Ma W. ceRNA in cancer: possible functions and clinical implications. Journal of Medical Genetics. 2015;52(10):710–718. pmid:26358722
  6. Chiu HS, Martínez MR, Bansal M, Subramanian A, Golub TR, Yang X, et al. High-throughput validation of ceRNA regulatory networks. BMC genomics. 2017;18(1):418. pmid:28558729
  7. Do D, Bozdag S. Cancerin: A computational pipeline to infer cancer-associated ceRNA interaction networks. PLoS computational biology. 2018;14(7):e1006318. pmid:30011266
  8. Wen X, Gao L, Hu Y. LAceModule: Identification of Competing Endogenous RNA Modules by Integrating Dynamic Correlation. Frontiers in genetics. 2020;11:235. pmid:32256525
  9. Chiu HS, Llobet-Navas D, Yang X, Chung WJ, Ambesi-Impiombato A, Iyer A, et al. Cupid: simultaneous reconstruction of microRNA-target and ceRNA networks. Genome research. 2015;25(2):257–267. pmid:25378249
  10. Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic acids research. 2016;44(8):e71–e71. pmid:26704973
  11. Zhang J. CNTools: Convert segment data into a region by sample matrix to allow for other high level computational analyses. R package (Version 16 0). 2016.
  12. Agarwal V, Bell GW, Nam JW, Bartel DP. Predicting effective microRNA target sites in mammalian mRNAs. elife. 2015;4:e05005. pmid:26267216
  13. Feizi S, Marbach D, Médard M, Kellis M. Network deconvolution as a general method to distinguish direct dependencies in networks. Nature biotechnology. 2013;31(8):726. pmid:23851448
  14. Pons P, Latapy M. Computing communities in large networks using random walks. In: International symposium on computer and information sciences. Springer; 2005. p. 284–293.
  15. Arita M. Scale-freeness and biological networks. Journal of biochemistry. 2005 Jul 1;138(1):1–4. pmid:16046441
  16. Eguiluz VM, Chialvo DR, Cecchi GA, Baliki M, Apkarian AV. Scale-free brain functional networks. Physical review letters. 2005 Jan 6;94(1):018102. pmid:15698136
  17. Barabási AL, Albert R. Emergence of scaling in random networks. science. 1999 Oct 15;286(5439):509–12. pmid:10521342
  18. Han H, Cho JW, Lee S, Yun A, Kim H, Bae D, et al. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic acids research. 2018;46(D1):D380–D386. pmid:29087512
  19. Consortium EP, et al. A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS biology. 2011;9(4).
  20. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M. BioGRID: a general repository for interaction datasets. Nucleic acids research. 2006;34(suppl_1):D535–D539. pmid:16381927
  21. Akbani R, Ng PKS, Werner HM, Shahmoradgoli M, Zhang F, Ju Z, et al. A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nature communications. 2014;5(1):1–15. pmid:24871328
  22. Liu C, Su J, Yang F, Wei K, Ma J, Zhou X. Compound signature detection on LINCS L1000 big data. Molecular BioSystems. 2015;11(3):714–722. pmid:25609570
  23. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics: a journal of integrative biology. 2012;16(5):284–287. pmid:22455463
  24. Tate JG, Bamford S, Jubb HC, Sondka Z, Beare DM, Bindal N, et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Research. 2018;47(D1):D941–D947.
  25. Lab B. Bushman Lab: Cancer Gene List (version 4); 2018. Available from: http://www.bushmanlab.org/links/genelists.
  26. Liu Y, Sun J, Zhao M. ONGene: A literature-based database for human oncogenes. J Genet Genomics. 2017;44(2):119–121. pmid:28162959
  27. Repana D, Nulsen J, Dressler L, Bortolomeazzi M, Kuppili Venkata S, Tourna A, et al. The Network of Cancer Genes (NCG): a comprehensive catalogue of known and candidate cancer genes from cancer sequencing screens. Genome Biology. 2019;20. pmid:30606230
  28. Chen G, Wang Z, Wang D, Qiu C, Liu M, Chen X, et al. LncRNADisease: a database for long-non-coding RNA-associated diseases. Nucleic acids research. 2012;41(D1):D983–D986. pmid:23175614
  29. Yang Y, Yang H, Xu M, Zhang H, Sun M, Mu P, et al. Long non-coding RNA (lncRNA) MAGI2-AS3 inhibits breast cancer cell growth by targeting the Fas/FasL signalling pathway. Human cell. 2018;31(3):232–241. pmid:29679339
  30. Kim J, Piao HL, Kim BJ, Yao F, Han Z, Wang Y, et al. Long noncoding RNA MALAT1 suppresses breast cancer metastasis. Nature genetics. 2018;50(12):1705–1715. pmid:30349115
  31. Wang S, Ke H, Zhang H, Ma Y, Ao L, Zou L, et al. LncRNA MIR100HG promotes cell proliferation in triple-negative breast cancer through triplex formation with p27 loci. Cell death & disease. 2018;9(8):1–11. pmid:30042378
  32. Liu M, Gou L, Xia J, Wan Q, Jiang Y, Sun S, et al. LncRNA ITGB2-AS1 could promote the migration and invasion of breast cancer cells through up-regulating ITGB2. International journal of molecular sciences. 2018;19(7):1866. pmid:29941860
  33. Zhang W, Shi S, Jiang J, Li X, Lu H, Ren F. LncRNA MEG3 inhibits cell epithelial-mesenchymal transition by sponging miR-421 targeting E-cadherin in breast cancer. Biomedicine & Pharmacotherapy. 2017;91:312–319. pmid:28463794
  34. Paci P, Colombo T, Farina L. Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC systems biology. 2014;8(1):83. pmid:25033876
  35. Huang M, Zhong Z, Lv M, Shu J, Tian Q, Chen J. Comprehensive analysis of differentially expressed profiles of lncRNAs and circRNAs with associated co-expression and ceRNA networks in bladder carcinoma. Oncotarget. 2016;7(30):47186. pmid:27363013
  36. Wang B, Pourshafeie A, Zitnik M, Zhu J, Bustamante CD, Batzoglou S, Leskovec J. Network enhancement as a general method to denoise weighted biological networks. Nature communications. 2018 Aug 6;9(1):1–8. pmid:30082777
  37. Bigley AB, Spielmann G, LaVoy EC, Simpson RJ. Can exercise-related improvements in immunity influence cancer prevention and prognosis in the elderly? Maturitas. 2013;76(1):51–56. pmid:23870832
  38. Huang CT, Oyang YJ, Huang HC, Juan HF. MicroRNA-mediated networks underlie immune response regulation in papillary thyroid carcinoma. Scientific reports. 2014;4:6495. pmid:25263162
  39. Chiu YC, Wang LJ, Lu TP, Hsiao TH, Chuang EY, Chen Y. Differential correlation analysis of glioblastoma reveals immune ceRNA interactions predictive of patient survival. BMC bioinformatics. 2017;18(1):132. pmid:28241741