A Prognostic Score for Patients with Intermediate-Stage Hepatocellular Carcinoma Treated with Transarterial Chemoembolization

Background Intermediate-stage hepatocellular carcinoma (HCC), defined according to the Barcelona Clinic Liver Cancer (BCLC) staging system, is a heterogeneous condition with variable clinical benefits from transarterial chemoembolization (TACE). This study aimed to develop a simple validated prognostic score based on the predictive factors for survival in patients with intermediate-stage HCC treated with TACE. Methods Three hundred and fifty patients with intermediate-stage HCC undergoing initial TACE at Chiba University Hospital (training cohort; n = 187) and two affiliated hospitals (validation cohort; n = 163) were included. The following variables were entered into univariate and multivariate Cox regression models to develop a points-based clinical scoring system: gender, age, etiology, pretreatment, Child–Pugh score, aspartate aminotransferase, creatinine, C-reactive protein, alpha-fetoprotein, size of the largest lesion, and number and location of lesions. Results HCV-RNA positivity, the Child–Pugh score, and the number of lesions were identified as independent prognostic factors in the training cohort. A 0–7-point prognostic score, named the Chiba HCC in intermediate-stage prognostic (CHIP) score, was developed as the sum of three subscale scores (Child–Pugh score: 0, 1, 2, or 3 points; number of lesions: 0, 2, or 3 points; HCV-RNA positivity: 0 or 1 point). The resulting scores were then stratified into five groups (0–2 points, 3 points, 4 points, 5 points, and 6–7 points) with median survival times of 65.2, 29.2, 24.3, 13.1, and 8.4 months, respectively (p < 0.0001). These results were confirmed in the external validation cohort (p < 0.0001). Conclusions The CHIP score is easy to use and may assist in finding an appropriate treatment strategy for intermediate-stage HCC.


Introduction
Hepatocellular carcinoma (HCC) is the sixth most common cancer worldwide and the third most common cause of cancer mortality [1]. Transarterial chemoembolization (TACE) is a widely recommended treatment strategy for patients with asymptomatic large or multinodular HCC without macrovascular invasion or extrahepatic metastasis (intermediate-stage HCC) [2][3][4][5][6]. However, because patients with intermediate-stage HCC comprise a heterogeneous population, with differences in tumor size, tumor number, liver function, and possibly other factors, the clinical benefits of TACE are variable. Therefore, there is a need to construct a treatment strategy for patients with intermediate-stage HCC that is based on prognosis. Furthermore, we are unaware of any prognostic scores based on the statistical analysis of patients with intermediate-stage HCC who received TACE. Using Cox regression analysis, this study aimed to identify the predictive factors for survival in patients with intermediate-stage HCC who received TACE. In addition, we sought to develop a validated prognostic score for those patients.

Ethics statement
This study was approved by the Research Ethics Committees of the Graduate School of Medicine, Chiba University (approval number 1,807), Kimitsu Chuo Hospital (approval number 219) and Numazu City Hospital. Informed consent was not obtained because of the retrospective design. Patient records and information were anonymized and de-identified prior to analysis.

Patient eligibility
Patients with HCC who underwent initial TACE were retrospectively identified from databases at Chiba University Hospital (between September 2001 and August 2011) and two affiliated hospitals (Kimitsu Chuo Hospital and Numazu City Hospital; between July 2003 and August 2011). We identified all patients histologically or radiologically diagnosed with HCC according to the diagnostic criteria of the American Association for the Study of Liver Diseases [2,3]. We included patients with intermediate-stage HCC [Barcelona-Clinic Liver Cancer (BCLC) stage B; defined as >3 tumors of any size, 2-3 tumors exceeding a 3-cm diameter, or a single unresectable tumor >5 cm, without macrovascular invasion or extrahepatic spread, and Eastern Cooperative Oncology Group Performance Status = 0]. We excluded patients with HCC at BCLC stages A or C, and patients at BCLC stage B who converted to systemic therapy (including sorafenib and hepatic arterial infusion chemotherapy) during treatment. We identified the predictive factors and developed the prognostic score using the dataset from Chiba University Hospital (training dataset). The prognostic score was validated using an external and independent cohort from the Kimitsu Chuo Hospital and Numazu City Hospital datasets (validation dataset).
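The inclusion rule above can be written as a small predicate. The following Python sketch is illustrative only: the function name and arguments are hypothetical, and the phrase "2-3 tumors exceeding a 3-cm diameter" is interpreted here as "the largest of 2-3 tumors exceeds 3 cm", which is an assumption rather than something the text specifies.

```python
def is_bclc_stage_b(n_tumors, largest_cm, resectable,
                    macrovascular_invasion, extrahepatic_spread, ecog_ps):
    """Return True if a patient meets the intermediate-stage (BCLC B)
    definition quoted in the text. Hypothetical helper for illustration."""
    # Any of these moves the patient out of the intermediate stage.
    if macrovascular_invasion or extrahepatic_spread or ecog_ps != 0:
        return False
    if n_tumors > 3:
        return True                      # >3 tumors of any size
    if 2 <= n_tumors <= 3:
        # Assumption: judged on the largest tumor only.
        return largest_cm > 3.0
    if n_tumors == 1:
        return largest_cm > 5.0 and not resectable
    return False


# Example: a single unresectable 10-cm tumor, no invasion or spread, PS 0.
print(is_bclc_stage_b(1, 10.0, False, False, False, 0))
```

The exclusion of patients who converted to systemic therapy depends on the treatment course rather than on baseline findings, so it is left out of this baseline check.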

Indication and strategy for TACE
The treatment policy for patients with intermediate-stage HCC followed our standard practice. Initially, we consider whether definitive treatment can be accomplished with either surgical resection or local ablation. All remaining cases are considered for TACE, the first-line non-curative treatment for intermediate-stage HCC [7]. TACE procedures are performed on demand, with repeated TACE performed if a viable tumor is identified or if there is local or distant intrahepatic recurrence (conventional TACE). For the procedure, a microcatheter is inserted coaxially via a 4-6-French gauge catheter through the femoral artery, and TACE is performed using a superselective technique [8]. A mixture of ethiodized oil (Lipiodol) and an anticancer agent (epirubicin, cisplatin, or miriplatin) is injected through the tumor-feeding branch. After injection of the Lipiodol and anticancer agent mixture, gelatin sponge particles are injected to obstruct the tumor-feeding branch completely. Because high-quality evidence to guide the selection of a TACE agent is lacking, the anticancer agent selection strategy at our institution was as follows: (1) epirubicin was selected as first-line treatment before cisplatin approval; (2) cisplatin was selected as first-line treatment between cisplatin approval and miriplatin approval; (3) miriplatin was selected as first-line treatment after miriplatin approval; and (4) changing anticancer agents was allowed in cases where the previous TACE had shown an insufficient therapeutic effect. A computed tomography scan is performed within 3 months of TACE to evaluate the radiological response of the tumor. Follow-up computed tomography or magnetic resonance imaging is performed every 3-4 months.

Statistical analysis
Overall survival (OS) was defined as the time from the first TACE to either death from any cause or the date of last follow-up. Univariate and multivariate Cox proportional-hazards regression models were used to estimate the hazard ratios for the risk factors in relation to OS. To develop the prognostic score, we compared models with all possible variable combinations based on the Akaike Information Criterion (AIC). The model with the smallest AIC value was selected as the final model. The prognostic score was then established based on the rounded values of the estimated coefficients of the final model. Between-group differences in patient characteristics were analyzed with the Fisher exact test for categorical variables. Survival estimates were derived by the Kaplan-Meier method. All statistical tests were two-sided, and 95% confidence intervals were calculated.
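The all-subsets comparison described above can be sketched as follows. This is a minimal Python illustration, not the SAS code used in the study: `fit_model` is a hypothetical callback that would fit a Cox model on a given subset of variables and return its maximized log partial likelihood together with the number of estimated coefficients, and the toy callback below uses invented numbers purely to make the sketch runnable.

```python
from itertools import combinations


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_subset_by_aic(candidates, fit_model):
    """Evaluate every non-empty subset of `candidates` and return
    (smallest AIC, subset). `fit_model(subset)` must return
    (log_likelihood, n_params); it is an assumed interface."""
    best = None
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            log_lik, n_params = fit_model(subset)
            value = aic(log_lik, n_params)
            if best is None or value < best[0]:
                best = (value, subset)
    return best


# Toy stand-in for a real model fit (invented log-likelihood gains).
TOY_GAIN = {"a": 10.0, "b": 0.5, "c": 3.0}


def toy_fit(subset):
    return -100.0 + sum(TOY_GAIN[v] for v in subset), len(subset)


print(best_subset_by_aic(["a", "b", "c"], toy_fit))
```

With one coefficient per variable, adding a variable lowers the AIC only if it raises the log-likelihood by more than 1, which is why the weak toy variable "b" is dropped. With 12 candidate variables, an exhaustive search of this kind evaluates 4095 models.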
All statistical analyses were performed using SAS software (version 9.3; SAS Institute, Inc., Cary, NC, USA). Proportions for categorical variables were analyzed with the FREQ procedure. The Kaplan-Meier estimates were calculated with the LIFETEST procedure. The Cox proportional-hazards regression models were fitted with the PHREG procedure.
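For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate and the median survival time derived from it can be sketched in a few lines of Python. This is an illustrative re-implementation, not the LIFETEST procedure, and the follow-up times in the example are invented. An event flag of 1 denotes death from any cause and 0 denotes censoring at the date of last follow-up, matching the OS definition above.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. Returns a list of (time, survival)
    pairs, one for each distinct time at which a death occurred."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is at or below 50%;
    None if the curve never reaches 50%."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented follow-up times in months (1 = died, 0 = censored).
months = [6, 12, 12, 20, 30]
died = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, died)
print(curve, median_survival(curve))
```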

Comparison of different sub-classification models to predict mortality
We compared the developed prognostic score with the following three sub-classification models: the Bolondi model [9], the hepatoma arterial-embolisation prognostic (HAP) score [10] and the Yamakado models [11]. We then compared the concordance index to evaluate the discriminatory ability of the prognostic scores to predict OS [12]. The concordance indexes were calculated with the R survC1 package.
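To make the idea of a concordance index concrete, the sketch below computes Harrell's C for right-censored data in plain Python. Note the substitution: this is Harrell's pairwise form, not the estimator provided by the survC1 package used in the study, so it is not expected to reproduce the values in Tables 4 and 5. The function name and the toy data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index. A pair (i, j) is usable when
    patient i died and did so strictly before patient j's follow-up
    ended; it is concordant when i has the higher risk score.
    Ties in the score count as half-concordant."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Invented example: higher score should mean shorter survival.
print(harrell_c([5, 10, 15, 20], [1, 1, 0, 1], [6, 4, 4, 2]))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of patients by risk.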

Population characteristics
Of the 805 patients with HCC who received TACE at the three institutions, we included 350 patients with intermediate-stage HCC undergoing initial TACE, who formed the training cohort (n = 187) and the validation cohort (n = 163) (Fig 1 and S1 Table). The training cohort had the following characteristics: 74% were men; the median age was 70 years; most patients (70%) had positive serum hepatitis C virus (HCV)-RNA tests; Child-Pugh scores were 5, 6, 7, and 8-9 in 46%, 37%, 13%, and 5% of patients, respectively; and 107 patients (57%) received pre-treatment. The validation cohort had the following characteristics: most patients were men (71%); the median age was 71 years; 77% of patients had positive HCV-RNA tests; Child-Pugh scores were 5, 6, 7, and 8-9 in 38%, 31%, 15%, and 15% of patients, respectively; and 87 patients (53%) received pre-treatment. The training and validation cohorts were similar with regard to most variables, although the validation cohort had a larger proportion of patients with Child-Pugh scores of 8-9 (Table 1).

Univariate and multivariate survival analyses
During the study period, 117 of 187 patients in the training cohort and 136 of 163 patients in the validation cohort died. The median OS was 27.8 (95% CI: 23.5-32.2) and 20.5 (95% CI: 17.9-23.1) months for the training and validation cohorts, respectively. The median follow-up periods were 22.8 and 19.2 months for the training and validation cohorts, respectively. No significant differences in OS were observed according to the timing of initial TACE (S2 Table).
Next, the log-relative risk of death in relation to the size of the largest lesion and the number of lesions was examined (S1 Fig). We divided the size of the largest lesion into ≤30 mm, >30 mm and ≤50 mm, >50 mm (multi-nodule), and >50 mm (single nodule) categories; the data suggest that patients with a single nodule were at low risk of death. However, there was no clear difference in log-relative risk between tumor sizes in patients with multi-nodule intermediate-stage HCC. Consequently, we defined 50 mm as the cut-off value for the size of the largest lesion in univariate analysis. We then divided the number of lesions into 4 categories (1, 2-4, 5-7, and ≥8 lesions). This revealed clear differences in log-relative risk between patients with a single lesion, 2-7 lesions, and ≥8 lesions.
In univariate analysis, HCV-RNA positivity, Child-Pugh score, maximum tumor size, number of liver tumors, location of liver tumors, and pre-treatment were identified as statistically significant (Table 2). We subsequently identified independent variables by comparing all possible combinations of variables based on the Akaike Information Criterion (AIC). The model with the smallest AIC value was selected as the final model. In the multivariate survival analysis, HCV-RNA positivity, Child-Pugh score and number of liver tumors remained significant predictive factors.

The prognostic score as a predictor of OS
The prognostic score was developed based on the rounded values of the estimated coefficients of the final model. As a result, the three predictive factors of OS in the multivariate survival analysis, namely HCV-RNA positivity, Child-Pugh score and number of lesions, were used for the score calculation (Table 3). To generate a simple and easy-to-use score model, we used an additive formula based on logarithmically transformed hazard ratios. Scores were calculated using the following formula: score = round [1.5 × ln(HR)] (S3 Table). The final score, named the Chiba HCC in intermediate-stage prognostic (CHIP) score, was defined as the sum of the three sub-scores, with a minimum score of 0 and a maximum score of 7. In the training cohort, 39, 59, 56, 17, and 16 patients had scores of 0-2, 3, 4, 5, and 6-7, respectively. Similarly, in the validation cohort, 16, 49, 52, 25, and 21 patients had scores of 0-2, 3, 4, 5, and 6-7, respectively. The observed cumulative survival of patients grouped by score was calculated using the Kaplan-Meier method for both cohorts (Fig 2). The CHIP score successfully identified five subgroups with distinct prognoses (training cohort: p < 0.0001, validation cohort: p < 0.0001).
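The score calculation can be illustrated with a short Python sketch. The mapping from categories to points is inferred from the abstract and the Discussion (points of 0, 1, 2, 3 for Child-Pugh scores of 5, 6, 7, 8-9; 0, 2, 3 for 1, 2-7, and 8 or more lesions; 0 or 1 for HCV-RNA negative or positive); Table 3 is not reproduced in this text, so the mapping should be read as an assumption. Function names are hypothetical, and Python's built-in `round` is used as a stand-in for the rounding rule, which the text does not specify for exact half values.

```python
import math


def points_from_hazard_ratio(hr):
    """Points for one category: round[1.5 x ln(HR)]."""
    return round(1.5 * math.log(hr))


def chip_score(child_pugh, n_lesions, hcv_rna_positive):
    """Sum of the three sub-scores (0 to 7). Category-to-points
    mapping is an assumption inferred from the text."""
    if not 5 <= child_pugh <= 9:
        raise ValueError("Child-Pugh score expected in the range 5-9")
    if n_lesions < 1:
        raise ValueError("at least one lesion expected")
    liver_points = min(child_pugh - 5, 3)   # 5->0, 6->1, 7->2, 8-9->3
    if n_lesions == 1:
        lesion_points = 0
    elif n_lesions <= 7:
        lesion_points = 2
    else:
        lesion_points = 3
    return liver_points + lesion_points + (1 if hcv_rna_positive else 0)


def chip_group(score):
    """Map a 0-7 score to one of the five reported strata."""
    if score <= 2:
        return "0-2"
    if score >= 6:
        return "6-7"
    return str(score)


# Median survival (months) by stratum, as reported in the abstract.
MEDIAN_SURVIVAL_MONTHS = {"0-2": 65.2, "3": 29.2, "4": 24.3,
                          "5": 13.1, "6-7": 8.4}

score = chip_score(child_pugh=6, n_lesions=3, hcv_rna_positive=True)
print(score, chip_group(score), MEDIAN_SURVIVAL_MONTHS[chip_group(score)])
```

As a check on the formula, the univariate hazard ratios quoted in the Discussion for 2-4 and 5-7 tumors (4.182 and 3.969) both give 2 points, which agrees with the 2 points assigned to 2-7 lesions; the published points themselves derive from the multivariate model in S3 Table.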

[Table 3: prognostic factor, points]

Comparison of different sub-classification models to predict mortality
Kaplan-Meier curves were also analyzed according to the Bolondi model, the HAP score and the Yamakado method (Fig 3). All of these sub-classification models were found to be significant in the log-rank test using the training and validation datasets (S4 Table). The discriminatory value of the various prognostic scores to predict mortality was evaluated separately in the training and validation datasets using the concordance index. In the training dataset, the concordance index of the CHIP score was the highest and was significantly better than those of the Bolondi model and the Yamakado model (Table 4). However, there were no significant differences between the CHIP score and the other three scoring models in the validation dataset (Table 5).

Discussion
This study aimed to establish a simple scoring model that could [...] patients in this study received antiviral therapy, it is difficult to determine the utility of our prognostic score regarding previous antiviral therapy. Further analyses in a larger number of patients would be needed to clarify this issue. Subsequently, we divided the Child-Pugh score, an internationally recognized index of liver function, into four categories in the CHIP score (5, 6, 7, and 8-9). Additionally, we found that the number of tumors was best divided into three categories (1, 2-7, and ≥8 tumors), which is consistent with current knowledge. For example, the number of tumors is known to be associated with intrahepatic spread of malignant cells, and is consistently shown to influence survival [13,14]. Additionally, patients with a single unresectable nodule have good prognoses regardless of whether they receive TACE [15,16]. We made a single tumor an independent category because our data also suggested that patients with such tumors had good prognoses. Furthermore, when comparing patients with ≥8 tumors and those with 2-7 tumors, we found that prognoses were poorer in the former. As 140 patients in the training set (75%) had between 2 and 7 lesions, we attempted to further stratify these patients. We performed univariate survival analysis for overall survival according to the number of lesions (2-4 tumors vs. 5-7 tumors). However, the hazard ratio in patients with 2-4 tumors was almost identical to that in patients with 5-7 tumors (4.182 vs. 3.969). Further analyses would be necessary to examine whether HCC patients with 2-7 tumors could be divided into subgroups.
Bolondi et al. divided BCLC stage B into B1-B4 sub-classifications based on existing reports, trials, and expert opinion [9]. Their method was validated in several reports in BCLC stage B patients who received TACE [17,18]. Importantly, tumor burden according to the Bolondi model is determined by the up-to-7 criteria, that is, the sum of the size of the largest tumor (in cm) and the number of tumors. In contrast, the CHIP score takes into account tumor number, but not tumor size. For example, an HCV-negative Child-Pugh A patient with a single HCC with a diameter of 10 cm was classified into the group with the most favorable prognosis according to our scoring system (score 0) but not according to the Bolondi model (class B2). Moreover, the up-to-7 criteria themselves were established based on a large body of data from liver transplantations performed outside the Milan criteria [19]. Although Bolondi's sub-classification method is not a prognostic tool specific to BCLC stage B patients who received TACE, it does allow sub-division of treatment options and better prediction of the associated patient outcomes. The HAP score is a simple scoring index requiring the measurement of two tumor variables (alpha-fetoprotein and the size of the largest tumor) and two liver variables (albumin and bilirubin) and can predict outcomes in transcatheter arterial embolization/TACE [10]. [...], which is greater than the number of categories in existing methods. The prognostic value for our datasets was comparable to existing criteria. However, the CHIP score may also be able to predict survival. Recently, Sieghart et al. described a meaningful scoring system, designated the ART score. The score comprises an increase in the Child-Pugh score, an increase in aspartate aminotransferase (AST), and the radiological response to the first TACE in patients with HCC. This scoring system may assist in decision making regarding TACE retreatment by estimating prognosis following a second TACE [20]. Unlike the CHIP score, the purpose of the ART score was not sub-classification of intermediate-stage HCC.
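The up-to-7 rule mentioned above is simple arithmetic, and the 10-cm single-tumor example can be checked directly. The Python sketch below is illustrative and the function name is hypothetical; it covers only the tumor-burden component, not the other elements of the Bolondi sub-classification.

```python
def within_up_to_seven(largest_cm, n_tumors):
    """Up-to-7 criterion as described in the text: the size of the
    largest tumor (cm) plus the number of tumors must not exceed 7."""
    return largest_cm + n_tumors <= 7


# Single 10-cm tumor from the example: 10 + 1 = 11, outside up-to-7.
print(within_up_to_seven(10.0, 1))
# Three tumors with a 3.5-cm largest lesion: 3.5 + 3 = 6.5, within.
print(within_up_to_seven(3.5, 3))
```

This shows why the two systems can disagree: a single large tumor exceeds the up-to-7 threshold on size alone, whereas the CHIP score assigns it the lowest lesion sub-score because only the number of lesions is counted.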
The techniques and procedures used to deliver TACE are highly variable between institutions [21]. The CHIP score was equally discriminatory in the validation dataset despite differences in technique and patient characteristics. However, in both the training and validation datasets, all cases employed conventional TACE with a super-selective technique. Although most TACE in Asia (particularly Japan) is performed using this technique, it is not the main practice in Western countries where drug-eluting bead TACE is widely used. The CHIP score should therefore be validated against another external dataset treated with drug-eluting bead TACE.
In summary, we developed a novel prognostic score specifically for patients with intermediate-stage HCC undergoing TACE. The prognostic score is simple to employ and is based on just three variables: HCV-RNA positivity, the Child-Pugh score, and the number of tumors. Although the score was validated in an independent dataset, a large prospective cohort will be necessary to confirm our results.