Identification of a microRNA signature associated with survivability in cervical squamous cell carcinoma

Background The aim of this study is to find the potential miRNA expression signature capable of predicting survival time for cervical squamous cell carcinoma (CSCC) patients. Methods The expression of 332 miRNAs was measured in 131 (Training cohort) and 130 (Validation cohort) patients with CSCC in the Cancer Genome Atlas (TCGA) data portal. The miRNA expression signature was identified by Cox Proportion Hazard regression model to the Training data set, and subsequently validated in an independent Validation set. Kaplan-Meier curves and the receiver operating characteristic analyses of 5 years were used to access the overall survival of miRNA signature. MiRNA signature-gene target analysis was performed, followed by the construction of the regulatory network. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis were used to explore the function of target genes of miRNA signature. Results A 2-miRNA expression signature of hsa-mir-642a and hsa-mir-378c associated with survivability was identified in CSCC. Both of them had a significant diagnostic and prognostic value of patients with CSCC. A total of 345 miRNA signature-target pairs were obtained in the miRNA signature-gene target regulatory network, in which 316 genes were targets of has-mir-378c and has-mir-642a. Functional analysis of target genes showed that MAPK signaling pathway, VEGF signaling pathway and endocytosis were the significantly enriched signal pathways that covered most genes. Conclusions The 2-miRNA signature adds to the prognostic value of CSCC. In-depth interrogation of the 2-miRNAs will provide important biological insights that finding and developing novel molecularly prediction to improve prognosis for CSCC patients.


Introduction
Cervical squamous cell carcinoma (CSCC), accounting for about 75-80% of all cervical cancers, is one of the most common gynecological malignancy and leads to the cancer death in women [1,2]. Walboomers JM and Castellsague X et al found that CSCC was closely associated with high-risk human papillomavirus (HPV) infection [3,4]. In addition, lymph node metastasis is one of diffusion routes that influence survival and prognosis of CSCC [5,6]. Once lymph node metastasis occurs, the overall 5-year survival rate for early stage carcinoma of the uterine cervix is reduced to 53%, which lead to the high recurrence rate and poor prognosis of patients with CSCC [7-10]. As there are no valid diagnostic and therapeutic methods for CSCC, it is urgent to understand the pathological mechanism and find potential biological markers for diagnosis, therapy and prognosis of patients with CSCC. [11,12]. Several genes have been identified as the diagnostic and prognostic biomarkers for CSCC. It has been demonstrated that the kinase family member 20a (KIF20A) protein is one potential biomarker for CSCC [13]. Additionally, Liu DQ et al suggested that receptor interacting serine/threonine kinase 4 (RIPK4) might act as a potential diagnostic and independent prognostic biomarker for patients with CSCC [14].
MicroRNAs (miRNAs) are small non-coding RNAs that are approximately 22 nt in size. They can modulate growth, proliferation, differentiation and apoptosis of cells by regulating target genes expression at the post-transcriptional level. As microRNAs stably present in almost all body fluids, they constitute a new class of non-invasive biomarkers [15][16][17][18][19]. It has been reported that the deregulation of miRNAs leads to the occurrence of a number of diseases, such as cancers in cervical [20]. MiR-23b/uPA is involved in the HPV-16 E6-associated cervical cancer development [21]. MiR-372 is down-regulated in cervical cancer tissues compared with normal cervical tissues [22]. The down-regulation of miR-143 is associated with lymph node metastasis and poor prognosis in cervical cancer [23,24]. It is reported that 6 serum microRNAs including miR-1246, miR-20a, miR-2392, miR-3147, miR-3162-5p and miR-4484 has been identified in predicting lymph node metastasis of CSCC patients [25]. In addition, it is found that serum miR-206 is a powerful tool to predict chemoradiotherapy sensitivity in advanced-stage CSCC patients [26]. It is noteworthy that the identification of potential miRNAs that participate in survival prediction is essential for establishing novel prognosis strategies for CSCC. Recently, miRNA expression signatures related to prognosis have been found in number of malignancy [27]. Hence, we undertook to identify and validate a miRNA expression signature capable of predicting for survivability in CSCC patients.

TCGA data retrieval and analysis
The BCGSC__IlluminaHiSeq_miRNASeq data were acquired from Firebrowse (http:// firebrowse.org/?cohort=LIHC&download_dialog=true, 2016-01-28). Level 3 (Reads-per-kilobase-million; RPKM) miRNA-Seq and Level 1 clinical data were downloaded from the TCGA data portal (http://tcga-data.nci.nih.gov/tcga) dataset. At the time of analysis, there were 307 clinical histories. Only those clinical histories with miRNA-seq values and sufficient follow-up data (261 cases) were used for further survival analysis. All these cases were randomly divided into Training cohort (131 cases) and Validation cohort (130 cases). There was no significant difference in gender, race, family history, tumor stage, vascular invasion, follow-up time and follow-up result between two cohorts. Clinical characteristics for two cohorts were shown in Table 1.

Identification and survival analysis of miRNA signature
In order to identify the survival time related miRNAs in CSCC, the single factor Cox proportional hazard (CoxPH) regression model was fitted to the Training cohort data. The statistical significance was set at p<0.05. After further adjustment, the multi-factor CoxPH regression model was used for identification of miRNA signature in the survival evaluation model of CSCC. A risk score (RS) was calculated using the coefficients from the model, and high vs. low risk patients were then compared in the Training cohort and Validation cohort using the logrank test. Kaplan-Meier curves were used to plot overall survival with miRNA signature expression using Cutoff Finder (http://molpath.charite.de/cutoff). In addition, the receiver operating characteristic (ROC) analyses were performed to assess the 5 years' survival rate of miRNA signature of CSCC by using pROC package in R language. The area under the curve (AUC) under binomial exact confidence interval was calculated and the ROC curve was generated.

Network construction of miRNA signature-targets
Identifying target genes is an important step in studying the function of miRNA in tissues. In this study, target genes of miRNA signature were obtained by miRWalk (http://www.umm.uniheidelberg.de/apps/zmf/mirwalk/). According to the miRNA-target pairs, miRNA signature-targets interaction network was established by Cytoscape software (http://www.cytoscape.org/).)

Functional annotation of miRNA signature targets
In order to study the biological function of target genes of miRNA signature, the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed by using the online software GeneCodis3 (http://genecodis.cnb.csic.es/analysis). The threshold of false discovery rate (FDR) < 0.05 was set as the criteria of statistical significance.

Generation and validation of miRNA signature
The single factor CoxPH regression model fitted to the Training cohort yielded 47 miRNAs (Table 2). A group of miRNA signatures including hsa-mir-642a and hsa-mir-378c (Table 3) was identified that were most strongly associated with survival after multi-factor CoxPH regression model analysis. The miRNA signature was combined with their coefficients within the penalized model to yield the following equation: The RS was calculated for each patient in the Training cohort, in which the patients were dichotomized into either the "low risk" (< median), or the "high risk" (! median) group. A highly significant difference was observed between the high risk and the low risk group (p < 0.001), that was shown in Fig 1. When the same miRNA signature equation was applied to the Validation cohort, a similar significant difference was also observed between the high risk and the low risk group (p = 0.007), that was shown in Fig 2. Additionally, we performed 5 years' survival analysis of miRNA signature by ROC and calculated the AUC to assess the discriminatory ability of miRNA signature (Fig 3). The AUC of the miRNA signature was 0.7221. Our result suggested that the miRNA signature could be the prognosis model for predicting the survival situation of CSCC.

MiRNA signature-targets network
A total of 345 miRNA signature-target pairs were obtained by miRWalk, followed by the construction of the interaction network (Fig 4). In the network, 316 genes were targets of has-mir-642a and has-mir-378c. The red rhombus and blue-green ellipse represented the miRNA and target genes, respectively.

Discussion
CSCC is one of the most common gynecological cancers that affect the health of women [1, 2]. In addition, the 5-year overall survival rate is about 80% [7, 28]. Even so, it is needed to understand the pathological mechanism and find potential survival related genes in the development of CSCC. In this study, we found a miRNA signature including hsa-mir-642a and hsa-mir-378c in CSCC, which could be a valuable tool in guiding treatment decisions for CSCC.
Hsa-mir-642a, a primate-specific miRNA, is a tumor suppressor. It is reported that hsamir-642a is differentially expressed in lung cancer cells [29]. Interaction of hsa-mir-642a-5p and Linc00974 can increase the expression of keratin 19 and activate Notch and TGF-β signaling pathways, which will increase the proliferation and invasion of hepatocellular carcinoma [30]. It is found that hsa-mir-642a is over expressed in the pediatric embryonal central nervous system neoplasm that is regarded as a prognostic parameter of patients [31]. In addition, the abnormal expression of hsa-mir-642a in myeloma cell lines significantly decreased protein levels of DEP domain containing MTOR interacting, which caused dedifferentiation of myeloma cells [32]. It is noteworthy that hsa-mir-642a is associated with cervical cancer prognosis [33]. Herein, we also found that hsa-mir-642a was related to survival time of patients with CSCC. Furthermore, cryptochrome circadian clock 2 (CRY2) was one of the target genes of hsa-mir-642a. Cryptochrome 2 is circadian clock gene and the hypermethylation of CRY2 is involved in DNA recombination and repair in long-term shift-workers [34]. It has been demonstrated that the genetic variation of CRY2 is related to metabolic characteristics of type 2 diabetes [35,36]. In addition, CRY2 has been suggested to act as a modulator in the development of cancer [37]. The expression level of CRY2 in ovarian cancer is remarkably lower than those in normal ovary [38]. The polymorphism in CRY2 gene has been frequently found associated with increased risk or recurrence of breast and endometrial cancers [39,40]. Our result showed that hsa-mir-642a was significantly associated with survival time of CSCC and could be a diagnostic and prognostic marker of CSCC. It is reported that the expression of hsa-mir-378c may enhance cell survival and tumor growth [41]. It has been found that hsa-mir-378c is associated with Stage I and Stage II colon cancer compared with normal controls [42]. Additionally, the expression of hsa-mir-378c is significantly down-regulated in osteosarcoma, intrahepatic cholangiocarcinoma and advanced stage gastric cancer [43][44][45]. It is worth mentioning that hsa-mir-378c is the member of protective miRNA signatures and correlated with cervical cancer prognosis [33]. In this study, we found that hsa-mir-378c was one of the members of miRNA signatures in the CSCC survival analysis. Moreover, myelin regulatory factor (MYRF) was one of the target genes of hsa-mir-378c. MYRF is a myelin-associated gene and acts as a key transcription factor for oligodendrocyte differentiation and central nervous system myelination. [46][47][48]. In addition, it is the target gene of hsa-mir-423-5p and involved in the immune response or injury in the retina [49].

Fig 2. Kaplan-Meier curves showing CSCC patients dichotomized based on risk score in the Validation cohort.
High risk is defined as a RS ! the median in the training cohort, and low risk is defined as a RS < the median in the Validation cohort. MYRF may be involved in the nervous and immune of CSCC. In a word, hsa-mir-378c played a crucial role in the CSCC and could be a diagnostic and prognostic marker in the development of CSCC.
According to the functional annotation analysis of miRNA signature targets, MAPK signaling pathway (FDR = 4.14E-05), VEGF signaling pathway (FDR = 8.89E-05) and endocytosis (FDR = 9.18E-05) were significantly enriched signal pathways that covered most genes. It has been shown that p38 MAPK is involved in a number of cellular processes, including cell survival and death [50,51]. It is found that human papillomavirus (HPV) 16 E2 can induce apoptosis by inhibiting p38 MAPK/JNK signal pathway in CSCC, which is important for the in vitro growth and migration of cervical squamous carcinoma cells in response to HPV 16 E2 treatment [52]. VEGF has been identified as angiogenesis regulator and may be important to restrict tumor growth, progression and metastasis. Vascular proliferation is a characteristic of cervical cancer and high density of microvessels indicates a worse prognosis of the disease [53]. Tjalma W et al found that the expression level of VEGF was high in cervical cancers [54]. It is suggested that VEGF could stimulate tumor cell proliferation in the early stages and may be responsible for tumorigenesis of cervical cancer [55]. In addition, it has been demonstrated that VEGF could be the predictive biomarker for monitoring the recurrence of cervical cancer [56]. The endocytosis process is involved in regulating various biological process including cell cycle and apoptosis in cancer cells [57,58]. It has been reported that endocytosis is associated with CSCC-specific alternative splicing events [59]. This suggested that MAPK, VEGF and endocytosis signal pathways may play an important role in CSCC. Inhibition of these signal pathways might be a useful therapeutic strategy for CSCC.

Conclusions
In summary, we have identified and successfully validated a 2-miRNA signature of hsa-mir-642a and hsa-mir-378c in patients with CSCC. The signature adds to the potential predictive role in the survival time of CSCC patients. Therefore, we can detect the expression of hsa-mir-642a and hsa-mir-378c in the blood to predict the survival time of patients with CSCC, which will improve the clinical outcome for patients with CSCC.