Molecular Signatures Associated with HCV-Induced Hepatocellular Carcinoma and Liver Metastasis

Hepatocellular carcinomas (HCCs) are a heterogeneous group of tumors that differ in risk factors and genetic alterations. In Italy, particularly Southern Italy, chronic hepatitis C virus (HCV) infection is the main cause of HCC. Using high-density oligoarrays, we identified consistent differences in gene expression between HCC and normal liver tissue. Expression patterns in HCC were also readily distinguishable from those associated with liver metastases. To characterize molecular events relevant to hepatocarcinogenesis and to identify biomarkers for early HCC detection, gene expression profiles of 71 liver biopsies from HCV-related primary HCC and corresponding HCV-positive non-HCC hepatic tissue, as well as gastrointestinal liver metastases paired with the apparently normal peri-tumoral liver tissue, were compared to 6 liver biopsies from healthy individuals. Characteristic gene signatures were identified when normal tissue was compared with HCV-related primary HCC, corresponding HCV-positive non-HCC tissue, and gastrointestinal liver metastases. Pathway analysis classified the cellular and biological functions of the differentially expressed genes as related to regulation of gene expression and post-translational modification in HCV-related primary HCC; cellular growth and proliferation and cell-to-cell signaling and interaction in HCV-related non-HCC samples; and cellular growth and proliferation and cell cycle in metastases. Gene signatures characteristic of HCV-HCC progression, of potential use for early HCC diagnosis, were also identified. Conclusions: A diagnostic molecular signature complementing conventional pathologic assessment was identified.


Introduction
Hepatocellular carcinoma (HCC) is the third leading cause of cancer death in the world [1][2][3]. As with other cancers, the etiology of HCC is multifactorial and the disease progresses through multiple stages [4]. This multistep process may be divided into chronic liver injury, inflammation, cell death, cirrhosis, regeneration, DNA damage, dysplasia and finally HCC. Different lesions have been considered pre-neoplastic with regard to the development of HCC. For instance, cirrhotic liver contains regenerative nodules and, like HCC, may contain dysplastic nodules [5,6]. The principal risk factor for the development of HCC is hepatitis B virus (HBV) infection [7,8], followed by hepatitis C virus (HCV) infection [9]. Non-viral causes are less frequent and include toxins and drugs (e.g., alcohol, aflatoxins, microcystin, anabolic steroids), metabolic liver diseases (e.g., hereditary haemochromatosis, α1-antitrypsin deficiency), steatosis [10] and non-alcoholic fatty liver disease [11,12]. In general, HCCs are more prevalent in men than in women and the incidence increases with age.
The molecular mechanism underlying HCC is currently unknown. Activation of cellular oncogenes, inactivation of tumor suppressor genes, over-expression of growth factors, and possibly telomerase activation and DNA mismatch repair defects may contribute to the development of HCC. Alterations in gene expression patterns accompanying different stages of growth, disease initiation, cell cycle progression, and responses to environmental stimuli provide important clues to these complex processes [13,14]. In addition to primary HCC, metastatic liver disease often occurs. Metastases most often derive from gastrointestinal organs, primarily the colon and rectum, though they can arise from primaries throughout the body [15]. These cancers can be treated using routine therapies relevant to the primary, such as chemotherapy, radiotherapy, surgical resection, liver transplantation, chemo-embolization, cryosurgery or combination therapy [16]. The characterization of genes that are differentially expressed during tumorigenesis is an important step toward the identification of the biological steps involved in the transformation process. Studies examining the gene expression of metastatic liver tumors and HCC in parallel with paired non-cancerous liver tissues might yield important insights by identifying genes that are not expressed in normal liver but are switched on in tumors, and vice versa. Such studies should also lead to the identification of genes that are expressed in tumors at different stages and never in non-cancerous liver tissue.
The present study used high-density oligonucleotide arrays to assess the expression profile of 18 HCV-related primary HCCs and their corresponding HCV-positive non-HCC counterparts, 1 HCV-positive liver sample without the corresponding HCC tissue, 14 gastrointestinal liver metastases and their corresponding non-cancerous tissue, and 6 liver biopsies from patients with benign pathologies and normal liver. This study is independent of a previous study performed by our group [17]. An HCC-specific molecular signature set was identified that may enhance conventional pathologic assessment and may provide a tool for prognostic purposes, as well as identify targets for new therapeutic strategies.

Patient and Tissue Samples
A total of 102 human liver samples were analyzed. Thirty-one samples, a subset of the samples from 19 patients profiled and reported in a previous study of the molecular classification of HCV-related hepatocellular carcinoma [17], were used to define the signature genes in the first group of samples. An independent set of 71 liver biopsies was used to define and evaluate the identified liver cancer signature (Figure 1).
Liver biopsies from 19 HCV-positive HCCs and 14 metastases from distant primaries, together with 6 HCV-negative control samples taken from healthy donors during laparoscopic cholecystectomy, were obtained with informed consent at the liver unit of the INT "Pascale", Naples. In addition, from each of the HCV-positive HCC and metastatic patients a paired liver biopsy from non-adjacent, non-tumor-containing liver was obtained. All liver biopsies were stored in RNA Later at -80°C (Ambion, Austin, TX). The histopathological nature of the biopsies was confirmed by the Pathology lab at INT before the samples were processed for RNA extraction. The non-HCC tissues from HCV-positive patients represented a heterogeneous sample consistent with the prevalent liver condition of each subject (ranging from persistent HCV infection to cirrhotic lesions). One HCC sample was shown to be mainly cirrhotic tissue and was removed from the analysis. Furthermore, laboratory analysis confirmed that the 6 controls were seronegative for HCV antibodies.
Samples were homogenized in disposable tissue grinders (Kendall, Precision). Total RNA was extracted with TRIzol solution (Life Technologies, Rockville, MD), and the purity of the RNA preparation was verified by evaluating the 260:280 nm ratio of the spectrophotometric reading with a NanoDrop (Thermo Fisher Scientific, Waltham, MA). Moreover, the integrity of the extracted RNA was evaluated with an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA) by analyzing the presence of 28S and 18S ribosomal RNA bands and verifying that the 28S/18S rRNA intensity ratio was equal or close to 1.5. In addition, phenol contamination was evaluated, with a 260:230 nm OD ratio within the 2.0-2.2 range considered acceptable.
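The RNA acceptance criteria described above can be written as a simple check. The following is a minimal sketch, not the authors' procedure: the function and parameter names are illustrative, and the tolerance of 0.2 around the 28S/18S target is an assumption, since the text states only that the ratio should be "equal or close to 1.5".

```python
def rna_passes_qc(ratio_28s_18s, od_260_230,
                  target_28s_18s=1.5, tolerance=0.2,
                  od_260_230_range=(2.0, 2.2)):
    """Return True if an RNA preparation meets the integrity and purity criteria.

    tolerance is an assumed value; the source says only "equal or close to 1.5".
    """
    # Integrity: 28S/18S rRNA intensity ratio near the target value.
    integrity_ok = abs(ratio_28s_18s - target_28s_18s) <= tolerance
    # Phenol contamination: 260:230 nm OD ratio inside the accepted range.
    low, high = od_260_230_range
    phenol_ok = low <= od_260_230 <= high
    return integrity_ok and phenol_ok
```

A sample with a 28S/18S ratio of 1.6 and a 260:230 ratio of 2.1 would pass under these assumed settings, while a 260:230 ratio of 1.7 would be flagged.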
Double-stranded complementary DNA (cDNA) was prepared from 3 µg of total RNA (T-RNA) in 9 µl DEPC-treated H2O using the SuperScript II Kit (Invitrogen) with a T7-(dT15) oligonucleotide primer. cDNA synthesis was completed at 42°C for 1 h. Full-length dsDNA was synthesized by incubating the cDNA with 2 U of RNase-H (Promega) and 3 µl of Advantage cDNA Polymerase Mix (Clontech) in Advantage PCR buffer (Clontech), in the presence of 10 mM dNTP and DNase-free water. The dsDNA was extracted with phenol-chloroform-isoamyl alcohol and precipitated with ethanol in the presence of 1 µl linear acrylamide (0.1 µg/µl, Ambion, Austin, TX), and aRNA (amplified RNA) was synthesized using Ambion's T7 MEGAscript In Vitro Transcription Kit (Ambion, Austin, TX). aRNA recovery and removal of template dsDNA were achieved by TRIzol purification. For the second round of amplification, aliquots of 1 µg of the aRNA were reverse transcribed into cDNA using 1 µl of random hexamer under the conditions used in the first round. Second-strand cDNA synthesis was initiated with 1 µg oligo-dT-T7 primer, and the resulting dsDNA was used as template for in vitro transcription of aRNA under the same experimental conditions as in the first round [18]. For probe preparation, 6 µg of this aRNA was used: test samples were labeled with USL-Cy5 (Kreatech) and pooled with the same amount of reference sample (control donor peripheral blood mononuclear cells, PBMC, seronegative for anti-hepatitis C virus (HCV) antibodies) labeled with USL-Cy3 (Kreatech). The two labeled aRNA probes were separated from unincorporated nucleotides by filtration, fragmented, mixed and co-hybridized to a custom-made 36K oligoarray at 42°C for 24 h. The oligo-chips were printed at the Immunogenetics Section, Department of Transfusion Medicine, Clinical Center, National Institutes of Health (Bethesda, MD).
After hybridization the slides were washed with 2×SSC/0.1% SDS for 1 min, 1×SSC for 1 min, 0.2×SSC for 1 min and 0.05×SSC for 10 s, and dried by centrifugation at 800 g for 3 min at RT.

Data Analysis
Hybridized arrays were scanned at 10-µm resolution with a GenePix 4000 scanner (Axon Instruments) at variable photomultiplier tube (PMT) voltage to obtain maximal signal intensities with less than 1% probe saturation. Image and data files were deposited in the microarray database (mAdb) at http://madb.nci.nih.gov and retrieved after median centering, intensity filtering (>200) and spot elimination. Data were further analyzed using Cluster and TreeView software (Stanford University, Stanford, CA).
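The retrieval steps (intensity filtering and median centering) can be illustrated for a single two-color array. This is a minimal sketch rather than the mAdb implementation: the function name is hypothetical, and requiring both channels to exceed the cutoff is an assumption, since the text gives only the cutoff value of 200.

```python
import math
from statistics import median

def median_centered_log_ratios(cy5, cy3, min_intensity=200):
    """Compute log2(Cy5/Cy3) per spot, drop weak spots, and median-center the array.

    A spot is kept only when both channels exceed min_intensity (an assumed
    rule); dropped spots are returned as None.
    """
    ratios = []
    for test, ref in zip(cy5, cy3):
        if test > min_intensity and ref > min_intensity:
            ratios.append(math.log2(test / ref))
        else:
            ratios.append(None)
    kept = [r for r in ratios if r is not None]
    if not kept:
        return ratios
    # Subtract the array median so that the typical spot has a log ratio of 0.
    center = median(kept)
    return [None if r is None else r - center for r in ratios]
```

Centering on the median makes log ratios comparable across arrays scanned at different PMT voltages, which is why it precedes any between-sample comparison.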

Statistical Analysis
Unsupervised Analysis. For this analysis, low-stringency filtering was applied, selecting the genes present in 80% of all experiments with a >3-fold change ratio in at least one experiment. In total, 8,210 genes were selected for the analysis, which included the five groups of samples: HCV-related HCC, their non-HCC counterparts, metastases, their non-metastatic counterparts, and normal controls. Hierarchical cluster analysis was conducted on these genes according to Eisen et al. [19]; differentially expressed genes were visualized with TreeView and displayed according to the central method [20]. Principal component analysis (PCA), based on the complete dataset, was applied for visualization when relevant.
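The low-stringency filter used for the unsupervised analysis can be sketched as follows. This is an illustration under stated assumptions, not the software actually used: the function name and input layout are hypothetical, and a fold change is treated as exceeding the threshold in either direction (|log2 ratio| > log2 3).

```python
import math

def filter_genes(log2_ratios, min_presence=0.8, min_fold=3.0):
    """Select genes measured in enough arrays and changed strongly in at least one.

    log2_ratios maps a gene name to its per-array log2 ratios, with None for
    missing spots. Counting changes in both directions is an assumption.
    """
    threshold = math.log2(min_fold)
    selected = []
    for gene, values in log2_ratios.items():
        present = [v for v in values if v is not None]
        # Presence criterion: measured in at least min_presence of the arrays.
        if not values or len(present) / len(values) < min_presence:
            continue
        # Variation criterion: beyond the fold threshold in at least one array.
        if any(abs(v) > threshold for v in present):
            selected.append(gene)
    return selected
```

Applied to a matrix of median-centered log ratios, a filter of this kind removes genes that are mostly missing or essentially invariant before clustering.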
Supervised Analysis. Supervised class comparison was performed using BRB-ArrayTools, developed at the NCI, Biometric Research Branch, Division of Cancer Treatment and Diagnosis. Three subsets of genes were explored. The first subset included genes up-regulated in HCV-related HCC compared with normal control samples, the second included genes up-regulated in the HCV-related non-HCC counterpart compared with normal control samples, and the third included genes up-regulated in metastasis compared with normal control samples. Paired samples were analyzed using a two-tailed paired Student's t-test. Unpaired samples were tested with a two-tailed unpaired Student's t-test assuming unequal variance or with an F-test, as appropriate. All analyses used a univariate significance threshold of p<0.01. Gene clusters identified by the univariate t-test were challenged with two additional tests, a univariate permutation test (PT) and a global multivariate PT. The multivariate PT was calibrated to restrict the false discovery rate to 10%. Genes identified by the univariate t-test as differentially expressed (p<0.01) and with a PT significance <0.05 were considered truly differentially expressed. Gene function was …
Ingenuity pathway analysis (IPA). Pathway analysis was performed using the gene set expression comparison kit implemented in BRB-ArrayTools. The human pathway lists determined by the "Ingenuity System Database" were selected. The significance threshold of the t-test was set at 0.01. IPA is a system that transforms a large dataset into a group of relevant networks containing direct and indirect relationships between genes, based on known interactions in the literature. The significance of each network was estimated by the scoring system provided by Ingenuity. The scores are determined by the number of differentially expressed genes within each of the networks and the strength of the associations among network members.
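The combination of an unequal-variance t-test with a univariate permutation test, described under Supervised Analysis, can be illustrated for a single gene. This is a minimal sketch and not the BRB-ArrayTools implementation: the function names are hypothetical, and the permutation test here enumerates all relabelings exactly, which is an assumption made for reproducibility and is practical only for small groups.

```python
import math
from itertools import combinations
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (each with n >= 2)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # No within-group spread: the statistic is 0 or infinite.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

def permutation_p_value(a, b):
    """Exact two-sided permutation p-value based on |t|.

    Enumerates every way of relabeling the pooled values into groups of the
    original sizes and counts relabelings at least as extreme as observed.
    """
    pooled = list(a) + list(b)
    observed = abs(welch_t(a, b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(welch_t(group_a, group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With three samples per group the smallest attainable two-sided p-value is 2/20 = 0.1, which shows why group size limits the significance a permutation test can report.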

Genes Differentially Expressed among Distinct Tissues
The gene expression profiles of tissue samples from the various groups (HCV-related HCC, non-HCC counterpart, metastases, non-metastatic counterpart and controls from healthy donors) were compared by unsupervised analysis. Genes were filtered according to the following criteria: presence in 80% of all experiments and a >3-fold change ratio in at least one experiment. This filter yielded 8,210 genes that were used for clustering. The HCC and the metastatic samples prevalently clustered into distinct groups, based on differences in their patterns of gene expression (Figure 2A). PCA segregated the different sample types into four to five groups according to their pathological status. Statistical and functional analysis of the profiles identified a set of genes whose expression was differentially altered between the groups (Figure 2B). The expression pattern of gastrointestinal liver metastases was clearly distinct from that of HCV-related primary HCC, allowing a definite molecular characterization of the two diseases.

Differential Gene Expression Patterns between HCV-Positive Liver Tissue with and without HCC, Metastasis and Normal Control Liver Tissue
An unpaired Student's t-test with a cut-off of p<0.01 comparing HCV-related HCC to normal controls identified 1864 differentially expressed genes. Among them, 993 were up-regulated and 871 down-regulated in HCV-related HCCs (Figure 3A).
In total, 198 up-regulated genes were found in common with our previous study [17]; the results are presented as a two-way Venn diagram in Figure S1A and in Supplemental Table S1. The common genes that were at least 2.0-fold up-regulated (ranked by name) are listed in Table 1.
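The overlap between the up-regulated gene lists of two studies, as summarized in a two-way Venn diagram, reduces to set operations. The following is a minimal sketch with a hypothetical function name; the gene identifiers in any usage would come from the respective class comparison outputs.

```python
def venn_counts(first_study, second_study):
    """Return the region sizes of a two-way Venn diagram for two gene lists."""
    first, second = set(first_study), set(second_study)
    shared = first & second
    return {
        "first_only": len(first - second),
        "second_only": len(second - first),
        "shared": len(shared),
        # Shared genes are sorted by name, matching how the tables are ranked.
        "shared_genes": sorted(shared),
    }
```

Converting the lists to sets also removes duplicate probes for the same gene, so the counts refer to unique gene names.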
Comparison between liver tissues from HCV-related non-HCC and normal controls (p<0.01) identified 1526 differentially expressed genes. Among them, 618 were up-regulated and 918 down-regulated in HCV-related non-HCC liver tissues (Figure 3B). In total, 59 up-regulated genes were found in common with our previous study [17]; the results are presented as a two-way Venn diagram in Figure S1B and in Supplemental Table S1. The common genes that were at least 2.0-fold up-regulated (ranked by name) are listed in Table 2.
Comparison between liver tissues from HCV-related HCC and parental HCV-related non-HCC (p<0.01) identified 1020 differentially expressed genes. Among them, 468 were up-regulated and 552 down-regulated in HCV-related HCC liver tissues (Figure 3C). In total, 10 up-regulated genes were found in common with our previous study [17]; the results are presented as a two-way Venn diagram in Figure S1C and in Supplemental Table S1. The common genes that were at least 2.0-fold up-regulated (ranked by name) are listed in Table 3.
Comparison of liver tissues from metastases and normal controls (p<0.01) identified 1,780 genes. Among them, 760 were up-regulated and 860 down-regulated in metastatic liver tissues (Figure 3D and Supplemental Table S2). The genes showing the highest fold up-regulation are listed in Table 4.

Gene Signatures Involved in HCC progression
Analysis of the progression of differences in gene expression across tissue types, from normal (n = 6) to HCV-related non-HCC (n = 19) to HCV-related HCC (n = 18), identified 450 genes with a decreasing and 136 genes with an increasing trend in expression (Figure 3E). Genes with a significantly increasing trend in expression values were considered possible diagnostic and prognostic markers. The genes showing the highest fold up-regulation that were also consistent with our previous findings [17] are reported in Table 5.
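The idea of an increasing or decreasing trend across the three ordered tissue types can be illustrated with a simple comparison of group means. This is only a sketch: the text does not state which trend test was used, so the rule below is an assumed simplification with a hypothetical function name, and it applies no significance test.

```python
from statistics import mean

def expression_trend(normal, non_hcc, hcc):
    """Classify a gene by the direction of its group means across disease stages.

    Returns "increasing", "decreasing", or "none". Only means are compared;
    a real analysis would also require a test of significance.
    """
    stages = [mean(normal), mean(non_hcc), mean(hcc)]
    if stages[0] < stages[1] < stages[2]:
        return "increasing"
    if stages[0] > stages[1] > stages[2]:
        return "decreasing"
    return "none"
```

A gene whose mean log ratio rises from normal to non-HCC to HCC tissue would be labeled "increasing" and, subject to significance testing, would fall among the candidate progression markers.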

Canonical Pathways and Molecular and Cellular Functions
To explore the biological significance of the genes characterizing different pathological or normal conditions, we investigated their interactions by IPA, mapping their molecular/cellular functions and canonical pathways. The most important molecular and cellular functions (ranked according to lowest p value) of genes up-regulated in HCV-related HCC samples were related to regulation of gene expression (1.12E-17 to 3.41E-03), cellular growth and proliferation (2.00E-14 to 3.84E-03) and post-translational modification (1.53E-09 to 2.45E-03). The top canonical pathways included Protein Ubiquitination (p = 2.88E-03), 14-3-3-mediated Signaling (p = 1.13E-02) and Aryl Hydrocarbon Receptor Signaling (p = 3.09E-02) (Figure 4A). The most important molecular and cellular functions (ranked according to lowest p value) of genes up-regulated in HCV-related non-HCC samples were related to Cellular Growth and Proliferation (1.04E-22 to 4.61E-04), Gene Expression (2.07E-22 to 4.58E-04) and Cell-To-Cell Signaling and Interaction (9.05E-14 to 4.65E-04). The top canonical pathways included Interferon Signaling (p = 9.78E-04), Antigen Presentation Pathway (p = 1.58E-03) and Protein Ubiquitination Pathway (p = 2.44E-02) (Figure 4B). The most important molecular and cellular functions (ranked according to lowest p value) of genes up-regulated in metastases were related to Gene Expression (6.08E-26 to 2.00E-04), Cellular Growth and Proliferation (1.86E-25 to 8.64E-05) and Cell Cycle (5.67E-21 to 1.33E-04). The top canonical pathways included Arginine and Proline Metabolism (p = 1.77E-09), Coagulation System (p = 9.68E-09) and Acute Phase Response Signaling (p = 2.08E-08) (Figure 4C). Table 6 summarizes the most important findings for each of the comparison analyses described.
Among the three class comparison analyses (HCV-related HCC, HCV-related non-HCC and metastatic liver tissue vs. normal control), we found a gene set that distinguishes the different forms of liver disease; in particular, time course analysis identified genes that are candidate progression markers (Figure 5).

Discussion
HCC is a common and aggressive malignant tumor worldwide with a dismal outcome. Early detection and resection may offer an opportunity to improve long-term survival for HCC patients. Unfortunately, with current diagnostic approaches, only about 10% to 20% of HCC patients are eligible for resection [21]. In our first study, microarray profiles of liver biopsies from HCC nodules and paired non-adjacent non-HCC liver tissue of the same HCV-positive patients were compared with those of biopsies from HCV-negative control subjects. The class comparison analysis used in that study successfully identified a set of significantly differentially expressed genes. Moreover, the up-regulated genes identified within each class comparison analysis were evaluated and classified by pathway analysis according to the "Ingenuity System Database". The genes up-regulated in samples from HCV-related HCC were classified in metabolic pathways, the most represented being the Aryl Hydrocarbon receptor signaling (AHR) and protein ubiquitination pathways, which have previously been reported to be involved in the progression of cancer, and of HCC in particular. The genes up-regulated in samples from HCV-related non-HCC tissue were classified in several pathways prevalently associated with inflammation and native/adaptive immunity, and most of the over-expressed genes belong to the Antigen Presentation pathway. In this new study we performed the same statistical analysis under the same conditions to confirm our previous data. To elucidate the genes and molecular pathways involved in HCV-related HCC, class comparison analyses were performed on a new sample set. This analysis allowed us to identify the unique probe sets characterizing the pathological status; as expected, the gene expression patterns were found to vary significantly between the HCC and normal control liver samples.
Genes associated with cell death and with cell-to-cell signaling and interaction were found to have increased expression in HCC samples. The molecular events linked to the development and progression of HCC are not well known. Malignant hepatocytes are the result of sequential changes accumulated in mature hepatocytes or can derive from stem cells. The most accepted hypothesis [22,23] describes a step-by-step process in which external stimuli induce genetic alterations in mature hepatocytes, leading to cell death, cellular proliferation, and the production of monoclonal populations. These populations harbor dysplastic hepatocytes that evolve into dysplastic nodules [24].
Canonical pathways prevalently associated with HCV-related HCC included protein ubiquitination, antigen presentation and Aryl Hydrocarbon receptor signaling, confirming our previous data. Cellular growth and proliferation and antigen presentation were the most important cellular and molecular functions when HCV-related non-HCC samples were compared with normal control liver tissue. These data agree with the numerous regulatory roles reported for the HCV core protein, which affects signal transduction, expression of viral and cellular genes, and cell growth and proliferation [25,26].
Several viruses target specific components of the MHC class I pathway, leading to diminished cell surface expression of MHC class I molecules. Other viruses block the transport of MHC class I molecules through the endoplasmic reticulum (ER), inhibit TAP-mediated translocation of cytoplasmic peptides into the ER, or interfere with proteasomal degradation of their own proteins [27]. Other viruses, like human cytomegalovirus, escape CD8+ T-cell recognition by down-regulating cellular MHC class I molecules [28] and simultaneously inducing the expression of virus-encoded MHC class I homologues capable of engaging inhibitory receptors that give a negative signal blocking NK cell function. Flaviviruses can up-regulate MHC class I cell surface expression by increasing peptide supply to the ER [29,30]. Viruses may use these strategies to evade and counteract a potential NK cell attack. Some studies have demonstrated that HCV core protein induces the up-regulation of antigen presentation and immune response mechanisms [31].
Canonical pathways mainly associated with HCV-related non-HCC tissue included the Interferon Signaling, SAPK/JNK Signaling and NF-κB Activation by Viruses pathways. These pathways are prevalently associated with inflammation and native/adaptive immunity.
Traditional HCC diagnosis has relied on a single-biomarker approach (e.g., AFP).
Our study aimed to identify the minimal set of genes sufficient to define a molecular signature and to develop a chip able to complement or replace pathological diagnosis and to provide a prognostic indication of progression risk, as well as of responsiveness to pharmacological treatment of HCV-associated hepatitis and its progression to cirrhosis/HCC.
Among the four class comparison analyses (HCV-related HCC, HCV-related non-HCC and metastatic liver tissue vs. normal control; HCV-related HCC vs. autologous HCV-related non-HCC liver tissue), we found a gene set that distinguishes the different forms of liver disease; in particular, time course analysis identified genes that are candidate progression markers (e.g., GPC3, CXCL12, SPINK1, GLUL, UBD, TM4SF5, DPT, SCD, MAL2, TRIM55, COL4A2). Taken together, these data suggest developing a specific gene chip that combines these markers with the genes showing the highest fold up-regulation in common with our previous work, representing the different stages of disease. The identification of lesions and the evaluation of their neoplastic progression will be based on the gene expression pattern on the gene chip (Figure 5, Tables 1-5).
In conclusion, we identified a set of genes that are strong candidates as gene signatures, to be validated on a larger clinical sample of liver tissue biopsies in order to evaluate the consistency and universality of the results, to verify their effective power in distinguishing different pathological stages of liver disease, and to assess their value as progression markers for early HCC diagnosis in HCV-positive patients. Furthermore, identification of specific alterations in key metabolic pathways could provide the opportunity to identify new therapeutic targets for innovative, personalized treatments.
Moreover, the gene-expression pattern will be correlated with additional clinical parameters (besides disease stage and tumor histopathology) such as the frequency with which patients show the identified profile, patient age and gender, concurrent diseases and pharmacological treatments.
In parallel, all samples in our liver collection will undergo further molecular analyses (including miRNA, aCGH and proteomic analyses) to develop increasingly sophisticated gene expression indicators of specific types or stages of liver disease, as well as of responsiveness to pharmacological treatment of HCV-associated hepatitis and its progression to cirrhosis/HCC.

Figure S1 Venn diagram illustrating the number of up-regulated genes in common between the first (green circle) and second (blue circle) data sets. Genes in common are in the red circle. A) Comparison between HCV-related HCC and normal liver. B) Comparison between HCV-related non-HCC and normal liver. C) Comparison between HCV-related HCC and autologous HCV-related non-HCC. (TIF)
Table S1 List of the 198 genes upregulated in HCC samples of the two independent datasets, in comparison to liver control biopsies. (XLS)