A Seven-microRNA Expression Signature Predicts Survival in Hepatocellular Carcinoma

Hepatocellular carcinoma (HCC) is the fifth common cancer. The differential expression of microRNAs (miRNAs) has been associated with the prognosis of various cancers. However, limited information is available regarding genome-wide miRNA expression profiles in HCC to generate a tumor-specific miRNA signature of prognostic values. In this study, the miRNA profiles in 327 HCC patients, including 327 tumor and 43 adjacent non-tumor tissues, from The Cancer Genome Atlas (TCGA) Liver hepatocellular carcinoma (LIHC) were analyzed. The associations of the differentially expressed miRNAs with patient survival and other clinical characteristics were examined with t-test and Cox proportional regression model. Finally, a tumor-specific miRNA signature was generated and examined with Kaplan–Meier survival, univariate\multivariate Cox regression analyses and KEGG pathway analysis. Results showed that a total of 207 miRNAs were found differentially expressed between tumor and adjacent non-tumor HCC tissues. 78 of them were also discriminatively expressed with gender, race, tumor grade and AJCC tumor stage. Seven miRNAs were significantly associated with survival (P value <0.001). Among the seven significant miRNAs, six (hsa-mir-326, hsa-mir-3677, hsa-mir-511-1, hsa-mir-511-2, hsa-mir-9-1, and hsa-mir-9-2) were negatively associated with overall survival (OS), while the remaining one (hsa-mir-30d) was positively correlated. A tumor-specific 7-miRNAs signature was generated and validated as an independent prognostic predictor. Collectively, we have identified and validated an independent prognostic model based on the expression of seven miRNAs, which can be used to assess patients’ survival. Additional work is needed to translate our model into clinical practice.


Introduction
Liver cancer is the fifth most common cancer in men and the second cause of cancer death worldwide (cause 745 000 deaths in 2012) [1]. Hepatocellular carcinoma (HCC) is the most common type of liver cancers, accounting for about 75% of all liver cancer cases [2]. The majority of HCC patients are diagnosed at an advanced stage. It is important to discover tumor-specific prognostic factors for HCC to foresee outcome and improve treatments. To date, microRNAs (miRNAs) have been proposed as a promising prognostic predictor in HCC [3]. miRNAs play vital roles in mediating the expression of proteins by regulating the transcription or degradation of target mRNAs [4]. In HCC, a number of miRNAs have been associated with survival or response to chemotherapy such as sorafenib or doxorubicin [5][6][7]. However, the sample size, the numbers of candidate miRNAs or miRNA detection method in previous studies were relatively limited [8][9][10].
TCGA project provides a collection of clinical data, RNA sequence, DNA methylation, DNA copy number variations, and miRNA sequence profiles for LIHC. Here, we designed a study using the dataset from TCGA project: 1) to find out the differential miRNAs expression profiles between HCC tumors and adjacent non-tumor tissues, 2) to determine the miRNAs with prognostic potential from the differential expression profiles, and 3) to understanding the biological pathway of the prognostic miRNA signature.

Patients and samples
All 360 patients of LIHC were retrieved from the TCGA data portal. The full clinical dataset were downloaded (up to Feb 24, 2015) and checked for the assessment of the eligibility. The exclusion criteria were set as follows: 1) histologic diagnosis is not HCC, including fibrolamellar carcinoma (2 cases) and mixed hepatocholangiocarcinoma (7 cases); 2) samples with clinical data but without miRNA sequence data (4 cases); and 3) Overall survival more than 2000 days or missing important clinical or biological data (20 cases). Overall, a total of 327 HCC patients were included in our study with the corresponding clinical data including age, gender, race, the American Joint Committee on Cancer staging system (also be called AJCC TNM staging system), tumor status, vital status, risk factors, vascular invasion, Child-Pugh classification and surgical types. Among the 327 patients (Cohort T), the adjacent non-tumor tissues were retrieved from 43 subjects (Cohort N). As the data were obtained from TCGA, further approval by an ethics committee was not required. This study meets the publication guidelines provided by TCGA (http://cancergenome.nih.gov/publications/publicationguidelines).

MiRNA expression data procession
The miRNA expression data (level 3) of the corresponding patients (tumor and/or adjacent non-tumor tissues) were downloaded from TCGA data portal (up to Feb 24, 2015). The miRNA expression profiling was performed using the Illumina HiSeq 2000 miRNA sequencing platforms (Illumina Inc, San Diego, CA). The miRNA expression level was demonstrated as reads per million miRNA mapped data. The miRNA expression analyses were performed using BRB-ArrayTools (version 4.4) developed by Dr. Richard Simon and the BRB-ArrayTools Development Team [11]. In brief, the miRNAs with missing data exceeded 10% of all subjects were excluded from the dataset and the expression level of each individual miRNA was log2-transformed for further analysis.

Statistical analysis
The continuous variables were presented as mean ± SD. The differences of clinical characteristics (age, gender, risk factors, AJCC stage, tumor size, lymph node involvement, Metastasis status, and Tumor grade) between two cohorts (Cohort N and Cohort T) were evaluated using Chi-square tests.
The miRNAs expression levels between tumor and adjacent non-tumor tissues were ascertained with two-sample t test, and the significance level was set as 0.001 as default to control the false discovery rate (FDR). Subsequently, the univariate Cox proportional hazards regression was conducted to find out the miRNAs correlated with overall survival [13]. After selecting the miRNAs, the principal component model was applied to compute a prognostic index for each patient.

Patient characteristics
All 327 patients enrolled in our study were pathologically diagnosed with HCC. The mean ± SD age for all 327 patients was 60.37 ± 16.71 years, and the mean ± SD follow-up time was 17.16 ± 17.75 months. Among the 327 participants (Cohort T), a total of 43 patients with adjacent non-tumor tissues were included in Cohort N for the analysis of the differential expression of miRNAs between tumor and adjacent non-tumor tissues. As summarized in Table 1, no significant difference was observed in the distribution of age (P = 0.857), gender (P = 0.402), risk factors (P = 0.694), AJCC pathological stage (P = 0.235), tumor size (P = 0.901), lymph node involvement (P = 0.480), Metastasis status (P = 0.461), and Tumor grade (P = 0.168) between the two cohorts.

Differentially expressed miRNAs (DEmiRNA) in tumor and adjacent non-tumor tissues
The miRNA expression in the 327 HCC tumor tissues and 43 adjacent non-tumor tissues (Cohort T and N) were investigated, and a total of 207 miRNAs (DEmiRNA) were found to be expressed differentially with p value and FDR less than 0.001 (S1 Table). As outlined in Fig 1, the unsupervised hierarchical clustering with the 207 miRNA expression data clearly discriminated the tumor and adjacent non-tumor samples. Among these 207 miRNAs, 77 miRNAs (31.2%) were up-regulated, while the remaining 130 miRNAs (62.8%) were down-regulated. With regard to the fold change in expression levels, 50 differentially expressed miRNAs showed a greater than three-fold or less than 0.33-fold change in expression levels. Among the 50 miR-NAs, 13 miRNAs (26.0%) were up-regulated and 37 miRNAs (74.0%) were down-regulated (Fig 2).  The correlation between differentially expressed miRNAs from Tumor/ Non-tumor and clinical parameters The 207 DEmiRNA sets from Tumor and Non-tumor HCC tissues were further analyzed according to clinical parameters including gender, race, tumor status, risk factors, tumor grade, AJCC pathologic (TNM stage), AJCC tumor stage, vascular invasion, Child-Pugh classification, and age at diagnosis. A total of 78 miRNAs were found to be expressed differentially with P value less than 0.001 (S2 Table). Most of these miRNAs are unevenly expressed in gender, race, tumor grade and AJCC pathological stage; No significant miRNA was found between different tumor status, AJCC TNM staging system (lymph node), vascular invasion and Child-Pugh classification ( Table 2).

Establishment of miRNAs signature associated with HCC patient survival
To identify the potential miRNAs with prognostic characteristics, the 207 DEmiRNA expression in the 327 HCC tumor tissues (Cohort T) was profiled using the univariate Cox proportional hazards regression model and seven miRNAs were found to be significantly associated with survival (P <0.001, S3 Table). Among the seven significant miRNAs, six (hsa-mir-326, hsa-mir-3677, hsa-mir-511-1, hsa-mir-511-2, hsa-mir-9-1, and hsa-mir-9-2) were negatively associated with OS, while the remaining one (hsa-mir-30d) was positively correlated (Fig 3). Then, a principal component model was used to compute a prognostic index for each patient by Cox proportional hazards models and Leave-One-Out-Cross-Validation. The prognostic index can be computed by the simple formula Siwi xi-1.880964 where wi and xi are the weight and logged miRNA expression for the i-th miRNA. A patient was predicted as high (low) risk if its prognostic index is larger than (smaller than or equal to) -0.05488. Accordingly, cohort T can be divided into two groups: high-risk (n = 161) and low-risk (n = 166) groups.

Associations of the 7-miRNA signature model with clinical parameters
Univariate and multivariate Cox regression analyses were used to test the effect of the 7-miRNA signature (high versus low risk) on overall survival taking into account the following demographic and clinical parameters: (i) gender (Female vs. Male); (ii) tumor status (With tumor vs Tumor free); (iii) tumor grade (G3+G4 vs. G1+G2); (iv) AJCC TNM staging system (T) (T3+T4 vs. T1+T2); (v) and AJCC pathological stage (III-IV versus I-II).
As summarized in Table 3 and Fig 4, in univariate analysis, 7-miRNA signature, tumor status, AJCC T stage, and AJCC pathological stage rather than gender and tumor grade were associated with patients' OS. In multivariate analysis, the risk established by the 7-miRNA model and tumor status resulted to be as an independent prognostic factor after the final stepwise analysis (P < 0.01).

KEGG pathway analysis for 7-miRNA signature target genes
The miRNAs selected in our study were correlated with several cancers and diseases by previous studies [14][15][16]. However, we check whether any KEGG pathways were enriched with the seven miRNAs target genes to reveal the biological relevance. The seven miRNAs target genes participated in both cancer-related and non-cancer related pathways (Fig 5).
We listed the target-gene enrichment of KEGG pathways in our study (Table 4 and S4  Table). Ten cancer-related pathways including pathways for prostate cancer, colorectal cancer, endometrial cancer, acute myeloid leukemia, melanoma, and renal cell carcinoma, were enriched with the seven miRNA target genes. Another 25 non-cancer related pathways involved in prion diseases, cardiomyopathy, Hepatitis B, drug metabolism, endocytosis, P53 signaling pathway, and PI3K-Akt signaling pathway et al were also enriched with the seven miRNA target genes. Thus, KEEG pathway analysis of signature microRNAs has identified potential targets and biological processes known to be involved in cancer which provided the biological relevance of the 7-microRNA signature. Discussion Computational analysis of TCGA dataset has been demonstrated to be a powerful approach in identifying genetic and transcriptional changes linked to clinical outcomes, pointing the way to new prognostic markers and potentially novel therapeutic targets [14,17,18]. The identification and validation of novel biomarkers such as miRNA signatures have been a research focus in cancer research [3]. In the present study, we identified a tumor-specific miRNA signature consisting of seven miRNAs that were significantly associated with survival in patients with HCC, on the basis of genome-wide miRNA profiling of 327 HCC patients. Furthermore, we confirmed the 7-miRNA signature as an independent prognostic factor after adjusting to the various variables including gender, tumor status, tumor grade, AJCC T stage and AJCC pathological stage. The performance of our generated prognostic model was assessed in Leave-oneout cross-validation (LOOCV) model to refine its accuracy. As a subset of non-coding RNAs, mounting evidences have indicated that the miRNA function as an important regulator in the development and progression of HCC, including cell migration [19], apoptosis [20], and response to antitumor therapies [21]. A number of miRNAs have been correlated with the survival of HCC patients [22] and thus this is not the first study aimed to establish a prognostic signature in HCC. In fact, Budhu et al. aiming at identifying biological meaningful metastasis-related miRNAs, reported a 20-miRNA metastasis signature that predicted HCC with venous metastases and was associated with survival [8]. Another report, Jiang et al. found a 19-miRNA signature that impacted HCC outcome [9]. These two reports found some miRNAs that also be used in our analysis. However, the numbers of candidate miRNAs, miRNA detection methods and goals of previous studies are different from this study; in particular, our approach has two major advantages: 1) We use novel stepwise approach that integrating TCGA dataset with 327 HCC tumor tissues and 43 adjacent nontumor tissues, which is by far the largest research population for HCC survival prediction; 2) instead of using miRNA arrays (about 500 candidate miRNAs) or real-time PCR (about 100 candidate miRNAs), TCGA dataset was collected from high-thorough miRNA expression profiling using the Illumina HiSeq 2000 miRNA Sequencing platforms (1246 candidate miRNAs) which makes it possible to include as many miRNAs as we can.
Hence, we conducted the study to identify a tumor-specific miRNA signature in a large cohort of HCC patients with Next Generation Sequencing technology. First of all, a summary of 207 miRNAs were found to be expressed differentially between tumor and adjacent non-tumor tissues. Seventy-eight of the 207 miRNAs were also differentially expressed according to gender, race, tumor grade, AJCC TNM stage, age at diagnosis and AJCC tumor stage. Subsequently, the expression levels of seven miRNAs were demonstrated to be significantly associated with the prognosis of HCC patients, and a 7-miRNA signature was generated and confirmed as an independent prognostic factor. Finally, the KEGG pathway analysis proved that the 7-miRNA signature is biologically meaningful. With the largest cohort of HCC patients and candidate miRNAs, this is by far the most integrated study to elucidate the miRNA expression profiles in HCC and evaluate their prognostic values.

Prognosis of 7 miRNAs in HCC
With respect to the associations between their expression levels and patient survival, the seven miRNAs in the signature model were divided into two groups: six risky miRNAs that were negatively associated with patient survival and the remaining one protective miRNAs positively associated. Among the six risky miRNAs, the methylation of hsa-mir-9-1 was observed in liver fibrolamellar carcinoma [23], Burkitt lymphoma [24], pancreatic adenocarcinoma [25], and breast cancer [26]. Aberrant hypermethylation-mediated inactivation of has-mir-9-1 is known to prolong the overall survival in metastasized renal cell cancer [27]. The methylation of has-mir-9-2 is also a frequent event in human HCC [28]. Expression of hsa-miR-326 was correlated with poor prognosis in esophageal cancer and glioma patients [29,30]. Hsa-miR-511 impeded HCC cell proliferation, migration, and invasion by targeting PIK3R3 [31]. Meanwhile, the functions of mir-3677 in cancer was poorly understood. The biological and clinical studies of these miRNAs have partially provided clues for the prognostic value, but we need well-designed studies to validate the functions of these miRNAs in HCC.

Prognosis of 7 miRNAs in HCC
Regarding the protective miRNA, Yao et al. [32] reported that hsa-mir-30d was up-regulated in HCC and promoted HCC cell invasion and metastasis, which is consistent with our finding, but the prognostic characteristics of hsa-mir-30d in HCC remains to be clarified. Higher expression of hsa-miR-30d was associated with better overall survival or inhibition of tumor cell in renal carcinoma [33], ovarian carcinoma [34] and malignant peripheral nerve sheath tumor [35]. Hsa-miR-30d was involved in cell autophagy [36], apoptosis [37] and platinum sensitivity in tumor cells [38]. The biological studies certified that hsa-mir-30d had tumor-suppressive character, which was also consistent with our finding.
There are some limitations should be acknowledged in interpreting the above results. First, the miRNA expression profiling was detected from liver tissues, which may not reflect the Table 4. Cancer related and non-cancer related KEGG pathways enriched in seven miRNA target genes. (The full list of target genes see S4 Table) KEGG Prognosis of 7 miRNAs in HCC miRNA in serum or stool [39]. We may need to validate the miRNA signature in these samples as they are conveniently available for monitoring. Second, we strictly selected the subjects and use the LOOCV model to cross-validate the 7-miRNA prognostic model, we still need external validation to reduce the FDR. In conclusion, by employing a large independent HCC patient cohort, our study identified a tumor-specific miRNA signature consisting of seven miRNAs, which can be served as a novel biomarker for HCC prognostic prediction and improve treatment outcome.
Supporting Information S1