A Serum MicroRNA Panel as Potential Biomarkers for Hepatocellular Carcinoma Related with Hepatitis B Virus

Background The identification of new high-sensitivity and high-specificity markers for hepatocellular carcinoma (HCC) is essential. We aimed to identify serum microRNAs (miRNAs) as biomarkers for diagnosing hepatitis B virus (HBV)-related HCC. Methods We investigated serum miRNA expression in 261 HCC patients, 233 cirrhosis patients, and 173 healthy controls recruited between August 2010 and June 2013. An initial screening of miRNA expression by Illumina sequencing was performed using serum samples pooled from HCC patients and controls. Quantitative reverse-transcriptase polymerase chain reaction (qRT-PCR) was used to evaluate the expression of selected miRNAs. A logistic regression model was constructed using a training cohort (n = 357) and then validated using an independent cohort (n = 241). The area under the receiver operating characteristic curve (AUC) was used to evaluate the diagnostic accuracy of the biomarkers. Results We identified 8 miRNAs (hsa-miR-206, hsa-miR-141-3p, hsa-miR-433-3p, hsa-miR-1228-5p, hsa-miR-199a-5p, hsa-miR-122-5p, hsa-miR-192-5p, and hsa-miR-26a-5p) and constructed a miRNA panel that provided high diagnostic accuracy for HCC (AUC = 0.887 and 0.879 for the training and validation sets, respectively). The panel could also differentiate HCC patients from healthy controls (AUC = 0.893) and from cirrhosis patients (AUC = 0.892). Conclusions We identified a serum miRNA panel that has considerable clinical value in HCC diagnosis.


Introduction
Hepatocellular carcinoma (HCC) is currently the third leading cause of cancer-related deaths in the world, accounting for up to 500,000 deaths per year. Patients with HCC show the shortest survival time among patients with different forms of cancer, with most patients dying within 12 months of developing the tumour [1]. A previous study has suggested that early diagnosis of HCC and effective treatment are likely to prolong the survival of liver cancer patients [2]. Current methods for the diagnosis of HCC fall into two main categories: imaging and biomarker tests. However, the diagnostic performance of these methods is unsatisfactory, particularly for the diagnosis of early-stage HCC. Currently, only 30% to 40% of patients with HCC are found eligible for potentially curative intervention at diagnosis, owing to late clinical presentation and the lack of effective early-detection measures. Therefore, new markers with high sensitivity and specificity for HCC are urgently needed.
MicroRNAs (miRNAs) are an emerging class of highly conserved, non-coding small RNAs that regulate gene expression at the post-transcriptional level. It is now clear that miRNAs can potentially regulate every aspect of cellular activity, including differentiation and development, metabolism, and proliferation; they also play a role in regulating apoptotic cell death, cellular responses to viral infection, and tumorigenesis [3]. Recent studies provide clear evidence that miRNAs are abundant in the liver and modulate a diverse spectrum of liver functions [4]. Circulating miRNAs are extremely stable and protected from RNase-mediated degradation in body fluids; they have therefore emerged as candidate biomarkers for many diseases [5,6,7]. The use of miRNAs as noninvasive biomarkers is of particular interest in the diagnosis of liver diseases [8,9,10].
Many studies have demonstrated that miRNA expression profiles in HCC and non-tumor tissue are significantly different [11,12,13,14,15]. In fact, differential expression of several miRNAs in the serum, including miR-16, miR-122, miR-21, miR-223, miR-25, miR-375, and let-7f, among patients with HCC, patients with hepatitis B, and healthy individuals has been reported recently [16,17]. However, those studies had one or more of the following limitations: a limited number of screened miRNAs, a small sample size, failure to differentiate HCC from hepatitis B virus (HBV) infection, and lack of independent validation.
In our study, we investigated miRNA expression profiles with independent validation in a large cohort of participants, in order to identify a set of miRNAs for the diagnosis of HCC. The cohort included healthy individuals and patients with HBV-related cirrhosis and HCC.

Ethics statement
The study was approved by the Medical Ethics Committee of The First Affiliated Hospital of Soochow University and The Third Hospital Affiliated to Jiangsu University (No. 2012046 and No. 272), and written informed consent was obtained from each patient prior to participation. The study was conducted in accordance with the Declaration of Helsinki.

Study design, patients, and healthy controls
A multistage, case-control study was designed to identify a serum miRNA profile as a surrogate marker for HCC (Fig. 1). A total of 261 HCC patients, 233 cirrhosis patients, and 173 healthy controls were enrolled in our study. In the biomarker discovery stage, 9 serum samples, pooled from 3 healthy control donors, 3 HCC patients, and 3 cirrhosis patients treated at The First Affiliated Hospital of Soochow University, were subjected to Illumina HiSeq 2000 deep sequencing to identify the miRNAs that were significantly differentially expressed. In the biomarker selection stage, differentially expressed miRNAs were validated by qRT-PCR in 20 HCC patients, 20 cirrhosis patients, and 20 healthy controls. Subsequently, 135 HCC patients, 132 cirrhosis patients, and 90 healthy controls (from The First Affiliated Hospital of Soochow University and The Third Hospital of Zhenjiang Affiliated Jiangsu University) formed a training set. Sequential validation was performed using a hydrolysis probe-based qRT-PCR assay to refine the number of serum miRNAs as an HCC signature.
Serum samples from an additional 103 HCC patients, 78 cirrhosis patients, and 60 healthy controls (from The Third Hospital of Zhenjiang Affiliated Jiangsu University) formed an independent validation set. All patients were diagnosed with HCC or cirrhosis between August 2010 and June 2013, and blood samples were collected prior to any therapeutic procedure.
Chronic HBV infection was defined as positivity for HBV surface antigen (HBsAg) for at least 6 months, positivity for HBV DNA by PCR analysis, and HBV infection-compatible results in a liver biopsy. All patients were positive for HBsAg and did not have any other type of liver disease, such as chronic hepatitis C, alcoholic liver disease, autoimmune liver disease, or metabolic liver disease. The diagnosis of HCC and cirrhosis was histopathologically confirmed. Data on all subjects were obtained from medical records, pathology reports, and personal interviews with the subjects. Tumor-free healthy control subjects were recruited from a large pool of individuals seeking a routine health check-up at the Healthy Physical Examination Centre of The First Affiliated Hospital of Soochow University who showed no evidence of disease. The demographics and clinical features of the patients are listed in Table 1.

RNA isolation and library preparation
About 5 mL of venous blood was collected from each participant. The whole blood was separated into serum and cellular fractions by centrifugation at 4,000 rpm for 10 min, followed by centrifugation at 13,000 rpm for 5 min for complete removal of cell debris. The supernatant serum was stored at −80 °C until analysis. Total RNA was isolated using the LCS TRK1001 miRNeasy kit (LC Sciences, Hangzhou, China). The libraries were constructed from total RNA using the Illumina TruSeq Small RNA Sample Preparation Kit (Illumina, San Diego, CA, USA) according to the manufacturer's protocol. Briefly, RNA 3′ (P-UCGUAUGCCGUCUUCUGCUUG-UidT) and 5′ (GUUCAGAGUUCUACAGUCCGACGAUC) adapters were ligated to target miRNAs in two separate steps. A reverse transcription reaction was applied to the ligation products to create single-stranded cDNA. The cDNA was amplified by PCR using a common primer and a primer containing the index sequence (CAAGCAGAAGACGGCATACGA). The quantity and purity of total RNA were monitored using a NanoDrop spectrophotometer.

Table 1. Characteristics of study subjects in the three datasets.

Illumina sequencing and data analysis
The raw sequences were processed using the Illumina pipeline program. After masking of adaptor sequences and removal of contaminated reads, the clean reads were filtered for miRNA prediction with the software package ACGT101-miR-v4.2 (LC Sciences, Houston, Texas, USA) and subsequently analyzed as previously reported [18]. Secondary structure prediction of individual miRNAs was performed with Mfold software (Version 2.38; http://mfold.rna.albany.edu/?q=mfold/RNA-Folding-Form) using the default folding conditions. The raw data were reduced to clean sequences by removal of the following sequences: (1) 3′ adapter (3ADT) and length filter: reads in which the 3ADT was not found, and reads with length <18 or >26 nt, were removed. (2) Junk reads: ≥2 N, ≥7 A, ≥8 C, ≥6 G, ≥7 T, ≥10 dimer, ≥6 trimer, or ≥5 tetramer. (3) Rfam: collection of many common non-coding RNA families except miRNAs (http://rfam.janelia.org). (4) Repeats: prototypic sequences representing repetitive DNA from different eukaryotic species (http://www.girinst.org/repbase). (5) Notes: there was overlap in the mapping of reads to mRNA, rRNA, tRNA, snRNA, snoRNA, and repeats. (6) mRNA database (http://www.ncbi.nlm.nih.gov/). The clean sequence reads were mapped to miRBase 20.0, allowing a mismatch of one or two nucleotide bases. A more detailed description of the computational pipeline employed for data handling is given in a flow-chart outline of the study procedures (Figure S1). All data were transformed to log base 2. Differences between the samples were calculated using the chi-square test and Fisher's exact test. Only miRNAs with a fold difference >2.0 and P<0.05 were considered statistically significant.
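The length and junk-read filters described above can be illustrated with a short Python sketch. This is not the ACGT101-miR implementation; the function name `is_clean_read` and the example reads are hypothetical. The sketch assumes that the per-base junk thresholds refer to runs of consecutive identical bases and that the N threshold refers to the total number of N calls in a read; the dimer, trimer, and tetramer criteria and the database filters are omitted.

```python
import re

MIN_LEN, MAX_LEN = 18, 26  # reads shorter than 18 nt or longer than 26 nt are dropped

# Assumed interpretation: a read is junk if it contains a run of at least
# this many consecutive identical bases (>=7 A, >=8 C, >=6 G, >=7 T).
HOMOPOLYMER_LIMITS = {"A": 7, "C": 8, "G": 6, "T": 7}
MAX_N = 1  # assumed interpretation: reads with 2 or more N calls are junk


def is_clean_read(seq: str) -> bool:
    """Return True if an adapter-trimmed read passes the length and junk filters."""
    seq = seq.upper()
    if not (MIN_LEN <= len(seq) <= MAX_LEN):
        return False
    if seq.count("N") > MAX_N:
        return False
    for base, limit in HOMOPOLYMER_LIMITS.items():
        if re.search(base + "{%d,}" % limit, seq):
            return False
    return True


# Hypothetical adapter-trimmed reads
reads = [
    "TGGAGTGTGACAATGGTGTTTG",  # 22 nt, passes
    "TGGAGTGTGACAATGG",        # 16 nt, too short
    "AAAAAAAAGTGACAATGGTGTT",  # contains a run of 8 A
    "TGNAGTGTGACNATGGTGTTTG",  # contains two N calls
]
clean = [r for r in reads if is_clean_read(r)]
```

In this toy input only the first read is retained; in practice the retained reads would then go through the Rfam, repeat, and mRNA filters before mapping to miRBase.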

qRT-PCR validation study and data analysis
qRT-PCR-based relative quantification of miRNAs (300 μL of serum from each participant) was performed with SYBR Premix Ex Taq (TaKaRa) according to the manufacturer's instructions using a Rotor-Gene 3000 Real-time PCR machine (Corbett Life Science, Sydney, Australia). miRNA-24 has been reported to be consistently present in human serum [19,20]. Moreover, in our previous experience miRNA-24 maintains stable expression; it was therefore used as the internal control for relative quantification of serum miRNAs. The specificity of each PCR product was validated by melting curve analysis at the end of the PCR cycles. All samples were analyzed in triplicate, and the cycle threshold (Ct) value was defined as the number of cycles required for the fluorescent signal to reach the threshold. The relative expression levels of miRNAs in serum were calculated using the formula $2^{-\Delta\Delta Ct}$. All primers were obtained from Invitrogen (Shanghai, China).
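As an illustration of the relative quantification step, the following Python sketch computes a relative expression value from triplicate Ct measurements. The definitions used here are the conventional ones and are an assumption, since the text does not spell them out: ΔCt is the mean Ct of the target miRNA minus the mean Ct of the internal control (miRNA-24), and ΔΔCt is the ΔCt of the sample minus the ΔCt of a calibrator (for example, a control-group sample). All Ct values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, control_cts):
    """Delta-Ct: mean Ct of the target miRNA minus mean Ct of the internal control."""
    return mean(target_cts) - mean(control_cts)


def relative_expression(sample_dct, calibrator_dct):
    """Relative expression as 2 to the power of minus delta-delta-Ct."""
    ddct = sample_dct - calibrator_dct
    return 2 ** (-ddct)


# Hypothetical triplicate Ct values (target miRNA, then miRNA-24)
patient_dct = delta_ct([27.0, 27.2, 26.8], [24.0, 24.1, 23.9])   # about 3.0
control_dct = delta_ct([25.0, 25.1, 24.9], [24.0, 24.0, 24.0])   # about 1.0
fold = relative_expression(patient_dct, control_dct)             # about 0.25
```

A value below 1 indicates lower expression in the sample than in the calibrator; a value above 1 indicates higher expression.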

Statistical analysis
All Illumina sequencing data were transformed to log base 2. Differences between the samples were calculated using the chi-square test and Fisher's exact test. Only miRNAs with a fold difference >2.0 and P<0.05 were considered statistically significant. Data are presented as median ± SD. The demographic and clinical data of the HCC patients and healthy controls were analyzed using the Statistical Package for the Social Sciences (SPSS) version

Description and clinical features of patients
The characteristics of the study participants are presented in Table 1. There was no significant difference in the distribution of age, sex, and alanine aminotransferase (ALT) level among the three groups (healthy, cirrhosis, and HCC).

Global analysis of miRNAs by deep sequencing
Illumina HiSeq 2000 sequencing of the miRNAs obtained from the sera of the healthy control, cirrhosis, and HCC groups produced 9,364,754, 10,491,694, and 7,896,608 raw reads, respectively, which, after extensive preprocessing and quality control, were reduced to 459,890, 859,216, and 494,523 clean reads (Fig. 2A-C, Table S1). The distribution of reads of 16-30 nt in length is presented in Fig. 2D. In our study, we found that the length of miRNAs was generally 20 to 22 nt. Clean reads were mapped to the human miRNA (miRs) database v20.0 (ftp://mirbase.org/pub/mirbase/CURRENT/), the pre-miRNA (mirs) database v20.0 (ftp://mirbase.org/pub/mirbase/CURRENT/), and the genome database (ftp.ncbi.nih.gov/genomes/H sapiens/Assembled chromosomes/seq/). A total of 2,754 unique reads mapped to human miRNAs or pre-miRNAs in miRBase, and the pre-miRNAs further mapped to the human genome and expressed sequence tags.

Analysis of differentially expressed miRNAs
We normalized the miRNA count data: the number of reads for each miRNA was standardized to a total of 1,000,000 reads per sample. Between the HCC and healthy control groups, 143 miRNAs were significantly differentially expressed. Among them, 6 miRNAs were up-regulated in the control group (fold change >4-fold, P<0.05) and 9 were down-regulated (fold change >2-fold, P<0.05), as shown in Table 2. Between the HCC and cirrhosis groups, 84 miRNAs were significantly differentially expressed. Among them, 5 miRNAs were up-regulated in the cirrhosis group (fold change >4-fold, P<0.05) and 12 were down-regulated (fold change >2-fold, P<0.05), as shown in Table 3.
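The normalization and fold-change calculation can be sketched as follows in Python. The read counts and miRNA names are hypothetical, and the pseudocount added before taking the ratio is an assumption introduced here to avoid division by zero; the text does not say how zero counts were handled.

```python
import math


def reads_per_million(counts):
    """Scale raw miRNA read counts so that each sample totals 1,000,000 reads."""
    total = sum(counts.values())
    return {name: c * 1_000_000 / total for name, c in counts.items()}


def log2_fold_change(rpm_a, rpm_b, name, pseudocount=1.0):
    """log2 ratio of normalized abundance in sample A versus sample B."""
    a = rpm_a.get(name, 0.0) + pseudocount
    b = rpm_b.get(name, 0.0) + pseudocount
    return math.log2(a / b)


# Hypothetical raw counts for two pooled libraries
hcc_counts = {"miR-a": 300, "miR-b": 700}
control_counts = {"miR-a": 100, "miR-b": 900}

hcc_rpm = reads_per_million(hcc_counts)
control_rpm = reads_per_million(control_counts)
lfc = log2_fold_change(hcc_rpm, control_rpm, "miR-a")  # about 1.58, i.e. roughly 3-fold
```

Under the stated significance rule, a miRNA would be retained only if its fold difference exceeded 2.0 (an absolute log2 fold change above 1) and its P value was below 0.05.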

Differential Expression Profile of Eight Selected miRNAs
We used a qRT-PCR assay to confirm the expression of the 32 candidate miRNAs selected in the previous step, using an independent cohort of 60 serum samples. The inclusion thresholds were a miRNA Ct <35 and a detection rate >75%. We determined the $2^{-\Delta\Delta Ct}$ values of the 32 candidate miRNAs in the three groups, and the Mann-Whitney unpaired test was used to compare HCC patients with controls. Eight of the 32 miRNAs had significantly different expression levels between the HCC and control groups (healthy + cirrhosis), as shown in Table 4. These were hsa-miR-206, hsa-miR-141-3p, hsa-miR-433-3p, hsa-miR-1228-5p, hsa-miR-199a-5p, hsa-miR-122-5p, hsa-miR-192-5p, and hsa-miR-26a-5p.
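The detection thresholds can be expressed as a simple filter. The Python sketch below assumes that the detection rate is the fraction of samples in which the miRNA has a Ct below 35, and that an undetermined reaction is recorded as `None`; both are assumptions, and the function name and Ct values are hypothetical.

```python
def passes_detection_filter(ct_values, ct_cutoff=35.0, min_rate=0.75):
    """Keep a miRNA only if more than min_rate of samples have Ct below ct_cutoff."""
    detected = [ct for ct in ct_values if ct is not None and ct < ct_cutoff]
    return len(detected) / len(ct_values) > min_rate


# Hypothetical Ct values for two candidate miRNAs across five samples
well_detected = [30.1, 31.4, 32.0, 33.2, 36.5]      # 4 of 5 samples below 35
poorly_detected = [30.1, None, 36.0, 37.2, 33.0]    # 2 of 5 samples below 35

keep_first = passes_detection_filter(well_detected)
keep_second = passes_detection_filter(poorly_detected)
```

Candidates that pass this filter would then be compared between groups with the Mann-Whitney test, as described above.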

MiRNA expression profile for HCC patients versus control patients in the training cohort
We used a qRT-PCR assay to confirm the expression of the 8 candidate miRNAs selected in the previous step. Six miRNAs showed significantly different expression between the HCC and healthy groups (Fig. 3A and Table 5), and 6 miRNAs showed significantly different expression between the HCC and cirrhosis groups (Fig. 3B and Table 5). All 8 miRNAs showed significantly different expression between the HCC and control groups (Fig. 3C and Table 5) and were therefore selected for further validation. These were hsa-miR-206, hsa-miR-141-3p, hsa-miR-433-3p, hsa-miR-1228-5p, hsa-miR-199a-5p, hsa-miR-122-5p, hsa-miR-192-5p, and hsa-miR-26a-5p. Relative to their levels in the control samples, the diagnostic accuracy of these miRNAs, as measured by AUC, was 0.665, 0.68, 0.607, … (Fig. 4A-H).

Establishing the predictive miRNA panel for HCC versus control
A stepwise logistic regression model was applied to the training data set to estimate the probability of being diagnosed with HCC. All 8 miRNAs turned out to be significant predictors. The predicted probability of being diagnosed with HCC from the logit model based on the 8-miRNA panel (Table S5), $\mathrm{logit}(P) = -11.8472 + 0.52147 \times \mathrm{miR122} - 0.22949 \times \mathrm{miR1228} - 0.27621 \times \mathrm{miR141} + 0.34063 \times \mathrm{miR192} + 0.33325 \times \mathrm{miR199a} - 0.30556 \times \mathrm{miR206} + 0.40777 \times \mathrm{miR26a} - 0.38006 \times \mathrm{miR433}$, was used to construct the ROC curve. The diagnostic performance of the established miRNA panel was evaluated using ROC analysis.
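To make the use of the logit model concrete, the Python sketch below converts the linear predictor into a predicted probability with the logistic function, P = 1 / (1 + exp(−logit)). The coefficients are taken from the equation above, with the intercept read as −11.8472 and the signs as given there. The scale of the miRNA inputs is not stated in the text, so the input values are purely hypothetical, as are the function and variable names.

```python
import math

INTERCEPT = -11.8472
COEFFICIENTS = {
    "miR122": 0.52147,
    "miR1228": -0.22949,
    "miR141": -0.27621,
    "miR192": 0.34063,
    "miR199a": 0.33325,
    "miR206": -0.30556,
    "miR26a": 0.40777,
    "miR433": -0.38006,
}


def predicted_probability(expression):
    """Predicted probability of HCC from the 8-miRNA logit model."""
    logit = INTERCEPT + sum(
        coef * expression[name] for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical inputs: the measurement scale is an assumption, not from the text
baseline = {name: 0.0 for name in COEFFICIENTS}
raised_mir122 = dict(baseline, miR122=10.0)

p_baseline = predicted_probability(baseline)
p_raised = predicted_probability(raised_mir122)
```

Because the coefficient of miR122 is positive, raising that input increases the predicted probability, while raising an input with a negative coefficient (for example miR433) lowers it.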

Validating the miRNA panel
The parameters estimated from the training data set were used to predict the probability of being diagnosed with HCC for the independent validation data set (241 serum samples). Similarly, the predicted probability was used to construct the ROC curve. The AUC of the miRNA panel was 0.879 (95% CI = 0.842-0.941; sensitivity = 90.3%, specificity = 76.2%, Fig. 5B).
The performance of the miRNA panel in differentiating the HCC group from the healthy group and from the cirrhosis group was also evaluated. The analysis demonstrated that the panel had a high accuracy in distinguishing HCC patients from healthy controls (AUC = 0.893; 95% CI, 0.849 to 0.94; sensitivity 82.8%; specificity 83.3%, Fig. 5C) and from cirrhosis patients (AUC = 0.892; 95% CI, 0.844 to 0.939; sensitivity 81.6%; specificity 84.6%, Fig. 5D).
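The ROC summary statistics reported above can be illustrated with a minimal Python sketch. The AUC is computed here through its rank interpretation (the probability that a randomly chosen HCC case receives a higher predicted probability than a randomly chosen control, with ties counted as one half), and sensitivity and specificity are computed at a fixed cutoff. The scores and the cutoff of 0.5 are hypothetical; the text does not state how the operating point was chosen.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Sensitivity and specificity when scores at or above the cutoff are called positive."""
    true_pos = sum(s >= cutoff for s in case_scores)
    true_neg = sum(s < cutoff for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical predicted probabilities
hcc_scores = [0.9, 0.8, 0.6, 0.4]
control_scores = [0.7, 0.3, 0.2, 0.1]

panel_auc = auc(hcc_scores, control_scores)
sens, spec = sensitivity_specificity(hcc_scores, control_scores, cutoff=0.5)
```

For these toy scores, 14 of the 16 case/control pairs are ranked correctly, giving an AUC of 0.875, and the cutoff of 0.5 yields a sensitivity and specificity of 0.75 each.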

Comparison of the AUC of the miRNA panel with that of AFP in the validation set
Using the same serum samples, we evaluated the AUC of AFP in the different groups. The analysis demonstrated that AFP also had a high accuracy in distinguishing HCC patients from healthy controls (AUC = 0.844; 95% CI, 0.785 to 0.902; sensitivity 60.2%; specificity 100%), and a lower accuracy in distinguishing them from cirrhosis patients (AUC = 0.708; 95% CI, 0.632 to 0.783; sensitivity 57.3%; specificity 79.5%) and from control subjects (AUC = 0.766; 95% CI, 0.703 to 0.829; sensitivity 59.2%; specificity 87%).
We also compared the AUC of the miRNA panel with that of AFP. There was no significant difference between the AUC values of the miRNA panel and those of AFP in the healthy group (difference between areas = 0.0735, 95% CI = 0.000145 to 0.148, P = 0.514, Fig. 6A). However, there were significant differences between the AUC values of the miRNA panel and those of AFP in the cirrhosis group (difference between areas = 0.184, 95% CI = 0.0925 to 0.276, P = 0.0001, Fig. 6B) and the control group (difference between areas = 0.113, 95% CI = 0.0344 to 0.192, P = 0.0049, Fig. 6C).

Discussion
Sensitive and specific cancer biomarkers are essential for early detection and diagnosis of HCC, as well as for developing preventive screening. However, current methods are insufficient to detect HCC in the early stages. Advances in magnetic resonance imaging and computed tomography have greatly improved imaging of focal hypervascular masses consistent with HCC, but these procedures are costly and not readily available in developing countries. Laboratory data including serum alpha-fetoprotein (AFP) and des-gamma carboxyprothrombin (DCP) levels have been used as HCC biomarkers for a long time. However, the accuracy of AFP is modest (sensitivity: 39-65%; specificity: 76-94%). One-third of cases of early-stage HCC (tumors <3 cm) are missed using AFP analysis [21], and serum AFP levels are also elevated in patients with benign liver diseases, such as hepatitis and cirrhosis [22,23].
Many miRNAs are dysregulated in HCC; thus, it is to be expected that circulating miRNA levels are also affected by HCC progression. The high stability of miRNAs in circulation makes them attractive biomarkers, especially for the detection of early-stage, presymptomatic disease [24]. Notably, circulating miR-21 [16,25], miR-222 [25], and miR-223 [26] were found to be upregulated in the serum/plasma of patients with HBV- or HCV-associated HCC.
Downregulation of subsets of miRNAs is a common finding in HCC, suggesting that some of these miRNAs may act as putative tumor suppressor genes. Restoration of tumor suppressive miRNAs leads to cell cycle block, increased apoptosis, and reduced tumor angiogenesis and metastasis by inhibiting migration and invasion. Of these miRNAs, miR-122 and miR-199 appear to be particularly important in HCC [27,28,29]. Liver-specific miR-122 is the most abundant miRNA in the liver and it plays an important role in regulating hepatocyte development and differentiation [30,31]. The expression of miR-122 is downregulated in HCC tumor tissues and cancer cell lines, while its overexpression has been found to induce apoptosis and suppress proliferation in HepG2 and Hep3B cells [32]. The role of miR-122 in liver cancer has been demonstrated directly by the generation of miR-122 knockout mice [33,34].
At the circulating blood level, the diagnostic performance of miR-21, miR-122, and miR-223 in discriminating patients with HCC from the healthy group was reported by Xu et al. [16]. However, their study failed to distinguish HCC from chronic hepatitis. Qu et al. [37] found miR-16 to have moderate diagnostic accuracy for HCC, with a sensitivity of 72.1% and a specificity of 88.8%. In our study, miR-16 did show significant down-regulation in HCC as compared to control, but it did not meet our candidate miRNA selection criteria at the microarray level. Li et al. [13] reported an extraordinarily high diagnostic accuracy of serum miRNA profiles for the diagnosis of HCC (AUC = 0.97-1.00) with miRNAs 10a, 125b, 223, 23a, 23b, 342-3p, 375, 423, 92a, and 99a. However, the need for different markers with different critical values for different group comparisons in their study (HCC versus healthy, HCC versus HBV, healthy versus HBV, healthy versus HCV, and HBV versus HCV) raised concerns about the robustness of these markers. Furthermore, these results were not validated either internally or externally.
AFP is the most widely used tumor biomarker currently available for the early detection of HCC. A previous clinical study demonstrated that serum AFP had a sensitivity of 41-65% and a specificity of 80-100% [38]. We found that AFP showed high accuracy in discriminating HCC patients from healthy subjects. At present, AFP measurement and ultrasound at 6-month intervals are the standard tools to screen for HCC in China. AFP is considered to be a useful and feasible tool for screening and early diagnosis in China because of its convenience, especially given that more than 60% of patients with HCC have an AFP level of >400 ng/mL [39]. However, the widely used marker AFP does not yield satisfactory results for the early diagnosis of HCC, particularly AFP-negative HCC. AFP results are positive during pregnancy, as well as in active liver disease, embryonic tumors, and certain gastrointestinal tumors. Furthermore, false-negative results and differences in sensitivity between detection methods add to the limitations of this biomarker. For example, a small hepatic tumor results in AFP expression being lower than the limit of detection, whereas AFP expression is delayed or higher than the limit of detection when the tumor is large, yielding AFP-negative HCC. We compared the AUC of the miRNA panel with that of AFP. There were significant differences between the AUC values of the miRNA panel and those of AFP in the cirrhosis group. Compared with other studies of circulating miRNAs in HCC diagnosis [13,17,23,26], our study is unique for the following reasons. First, we screened a large number of serum miRNAs via the Illumina HiSeq 2000 sequencing method, which gave us a better chance of identifying potential diagnostic markers. Second, we included not only HCC and healthy groups, but a cirrhosis group as well.
It is well known that the pathogenesis of HCC is heterogeneous and that multiple mechanisms of tumorigenesis could be involved (tumor suppressor genes, oncogenes, viral effects, angiogenesis, etc.). Nonetheless, we hypothesized that, similar to the adenoma-carcinoma sequence in colorectal cancer, the clinical pathway of most HBV-related HCC may follow the four stages of healthy, hepatitis, cirrhosis, and HCC. Because of the long incubation time, miRNA disturbance might occur during any of these stages (hepatitis, cirrhosis, or HCC) before the clinical/pathophysiological manifestation of HCC. Thus, all the representative differential miRNAs, namely those for HCC versus healthy, HCC versus hepatitis, and HCC versus cirrhosis, should be considered. Failure to do so might be the source of the unsatisfactory differentiation of HCC from hepatitis or cirrhosis in other studies. Finally, the miRNA panel identified in our study was validated in a large, independent cohort from two independent medical centers. In summary, we identified a serum miRNA panel that differentiates HCC from healthy and cirrhosis groups with a high degree of accuracy and validated it in a large number of subjects. Our study demonstrates that this serum miRNA panel has considerable clinical value for the early diagnosis of HCC, so that more patients who would otherwise have missed the curative treatment window can benefit from therapy.

Figure S1. A flow-chart of study procedures. (TIF)