Continued value of the serum alpha-fetoprotein test in surveilling at-risk populations for hepatocellular carcinoma

Backgrounds and aims Because of the known limitations of ultrasonography (US) alone, we re-evaluated whether complementary testing for serum alpha-fetoprotein (AFP) is helpful in surveilling for hepatocellular carcinoma (HCC) in high-risk populations. Methods We included, from a hospital-based cancer registry, 1,776 asymptomatic adults who were surveilled biannually with the AFP test and US and eventually diagnosed with HCC between 2007 and 2015. Based on the screening results, these patients were divided into three groups: AFP (positive for AFP only; n = 298 [16.8%]), US (positive for US only; n = 978 [55.0%]), and AFP+US (positive for both; n = 500 [28.2%]). We compared the outcomes of the three groups, calculating the survival of the AFP group both as observed survival and as survival corrected for lead-time. Results In terms of tumor-related factors, the separate AFP and US groups were more likely to have early stage HCC and to receive curative treatments than the combined AFP+US group (Ps<0.05). The AFP group had significantly better overall and cancer-specific survival than the AFP+US group after adjusting for covariates (adjusted hazard ratios [HRs] 0.68 and 0.62, respectively). In analyses correcting for lead-time in the AFP group (doubling time 120 days), the respective adjusted HRs for the AFP group were unchanged (0.74 and 0.67), but they were no longer significant after additional adjustment for tumor stage and curative treatment (0.87 and 0.81). Conclusions HCC cases detected by the AFP test without abnormal ultrasonic findings appear to have better survival, possibly as a result of stage migration and the resulting cures. Complementary AFP surveillance, together with US, could be helpful for at-risk patients.

Introduction
Hepatocellular carcinoma (HCC) is one of the fastest growing causes of cancer death globally. [1] It has a reputation as a rapidly progressive cancer that is almost invariably fatal, with 3-year survival of less than 30%. [2] The high case-fatality ratio of HCC may be attributed in part to its vague and nonspecific symptoms, which usually appear when the disease has reached an advanced stage. However, a considerable improvement in survival has been observed in patients whose HCC is detected early and who thus receive potentially curative treatment. [3] Surveillance for HCC is thought to be a way to detect lesions at an earlier stage and improve clinical outcomes in asymptomatic at-risk populations. [4,5] Currently, ultrasonography (US) is regarded as the backbone of screening for HCC, but it has practical limitations in terms of high operator-dependency and variable sensitivity. A recent meta-analysis highlights suboptimal (<50%) sensitivity of US for detection of HCC at an early stage, although the sensitivity and specificity of US in detecting HCC of any stage exceed 90%. [6] The serum alpha-fetoprotein (AFP) assay is the only serological screening tool for early detection of HCC that reliably meets the final 5-phases criterion for possibly reducing the population disease burden listed by the Early Detection Research Network (EDRN), an initiative of the National Cancer Institute (NCI). [7] However, the use of AFP in surveillance is subject to ongoing debate, even as an adjunct to US, due to concerns about cost-effectiveness. [8][9][10] False-positive results occur because AFP can be elevated in chronic liver disease, and false-negative results occur in HCCs that do not secrete AFP. [11] Unfortunately, there are no robust or promising next-generation biomarkers available for clinical use in screening or diagnostic systems.
Both the American Association for the Study of Liver Diseases (AASLD) and the National Comprehensive Cancer Network (NCCN) guidelines for HCC, newly updated in 2017, reintroduce serum AFP testing as an additional surveillance method because of its potential benefit for patients with HCCs secreting AFP. [12,13] Recent interesting efforts in the U.S. to develop a computer model for detecting HCC also included serum AFP measurement in the prediction calculator. [14,15] We performed the present study to reappraise the methodological role of serum AFP, the oldest, and only available, oncomarker for HCC, for early detection of tumors. To this end, we examined the serum AFP values and ultrasonic results obtained at the time of HCC detection in a hospital-based cohort of patients undergoing regular surveillance in the absence of cancer-related symptoms, and then assessed survival outcomes according to diagnostic status.

Study subjects and design of the experimental groups
A study population consisting of 9,615 patients aged ≥20 years with HCC primarily diagnosed and treated between January 2007 and December 2015 was retrospectively constructed from a prospective hospital-based registry maintaining data on all new cases of cancer in Asan Medical Center, which is a part of the National Cancer Registration Program. This study was approved by the institutional review board (IRB) of the Asan Medical Center, Seoul, Republic of Korea (IRB No. 2017-0029), and informed consent was waived by the IRB.
Since the objective of the study was to evaluate the surveillance effect of the serum AFP test on a background of US examination, 7,766 patients were excluded as follows: (1) 1,904 patients who initially had cancer-related symptoms such as ascites, abdominal pain, fever, jaundice and constitutional syndromes (i.e., weight loss, malaise, and anorexia) [16]; (2) 5,786 who did not undergo biannual regular surveillance tests for HCC comprising both abdominal US and serum AFP, as formally recommended by Korean practice guidelines [17]; and (3) 76 who had concurrent non-HCC malignancies.
Among the 1,849 asymptomatic patients who underwent surveillance tests for HCC at 6-month intervals, we further excluded (1) 47 patients for whom there was a time interval exceeding a month between the two screening tests and the subsequent diagnosis of HCC; (2) 14 who had abnormal surveillance results that were not evaluated by further diagnostic tests; and (3) 12 whose tumor was incidentally identified by other test modalities such as computed tomography.
Accordingly, a total of 1,776 patients whose HCC was detected during biannual concurrent ultrasound and AFP surveillance were included in the final analysis. The patients were classified into three groups according to the results of screening prior to confirmation of HCC as follows: an AFP group: patients with high serum AFP test results (≥20 ng/ml) but no focal lesions on ultrasonography; a US group: patients with suspected malignant lesion(s) on ultrasonography but normal AFP levels; and an AFP+US group: patients with positive findings in both tests (Fig 1).
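The cohort flow and the three-group classification described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `classify_detection` and its arguments are hypothetical, and only the 20 ng/ml cutoff and the patient counts come from the text.

```python
from typing import Optional

AFP_CUTOFF_NG_ML = 20.0  # screen-positive threshold stated in the text


def classify_detection(afp_ng_ml: float, us_positive: bool) -> Optional[str]:
    """Assign a screening group from the two surveillance results.

    Hypothetical helper: returns "AFP", "US", "AFP+US", or None when
    neither test is positive (such a patient would not enter the cohort).
    """
    afp_positive = afp_ng_ml >= AFP_CUTOFF_NG_ML
    if afp_positive and us_positive:
        return "AFP+US"
    if afp_positive:
        return "AFP"
    if us_positive:
        return "US"
    return None


# Cohort flow, using the counts reported in the text.
registry_total = 9615
first_exclusions = 1904 + 5786 + 76   # symptomatic, no biannual surveillance, other cancers
surveilled = registry_total - first_exclusions
second_exclusions = 47 + 14 + 12      # delayed diagnosis, unevaluated results, incidental detection
final_cohort = surveilled - second_exclusions

# Group sizes reported in the text.
group_sizes = {"AFP": 298, "US": 978, "AFP+US": 500}
```

The reported group sizes sum to the final cohort size, which is a quick consistency check on the exclusion steps.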

Screening and diagnostic approaches
Serum AFP was measured using an immunoradiometric assay (RIA-gnost AFP, Cis-Bio International, Schering, Switzerland) based on the principle of the sandwich assay with 125I-labeled anti-AFP monoclonal antibody, or a commercial enzyme immunoassay (Abbott AFP-EIA; Abbott Laboratories, North Chicago, IL). There is known to be good agreement between the results obtained by the two methods. [18] Ultrasonography was performed by licensed radiologists experienced in hepatobiliary ultrasound using real-time scanners. The diagnostic workup for HCC was initiated when AFP levels were elevated and/or a suspicious lesion(s) was observed on US. [19,20] The blood concentration of AFP used to identify a "screen-positive" result was ≥20 ng/ml, the typical criterion used in many surveillance studies and proven to be optimal for HCC screening. [19][20][21]
A [n = 268], respectively) in accordance with the international guidelines. [10,12] HCC stage at diagnosis was classified by the American Joint Committee on Cancer (AJCC) and Barcelona Clinic Liver Cancer (BCLC) staging systems. [10,22] For each patient, the therapeutic modality for HCC was decided in a hierarchical manner, according to its efficacy in lengthening life for the given tumor stage, when this was feasible.
Detailed information on demographics, clinical data, tumors and survival outcomes was extracted from inpatient and outpatient medical records using the anonymized clinical database system of our institution (ABLE) [23] and the database of the National Population Registry of the Korea National Statistical Office using the unique personal identification numbers of the patients.

Statistical analysis of observational data
For comparisons of the three groups, continuous variables were analyzed by one-way analysis of variance (ANOVA) and non-parametric testing, including the Kruskal-Wallis test, whilst categorical variables were assessed using the chi-square test and Fisher's exact test. Clinical variables associated with the "AFP group" were analyzed by logistic regression. Kaplan-Meier analysis was used to illustrate and compare across the groups overall survival, defined as the interval between the date of HCC diagnosis and death from any cause. Death, survival, and follow-up data were fully accessible through the registry of our institution and were collected up to December 31, 2016. Cancer-specific mortality, in which deaths due to HCC progression were regarded as outcomes, was also measured. A Cox proportional hazard model with backward elimination was used to identify the independent characteristics of the groups associated with overall survival and HCC-specific death. Potential confounders with P<0.10 among the demographic and hepatic variables in the univariate model, including age, sex, cause of chronic liver disease, method of HCC detection, body mass index, family history of HCC, positive history of alcohol and smoking, diabetes, hypertension, liver cirrhosis, ascites, model for end-stage liver disease (MELD) score, platelet counts, and infiltrative type of HCC, were used as input variables in the multivariate analysis. We hypothesized that any potential effect of the surveillance results on outcome would be due to stage migration and/or receipt of more curative treatment. Therefore, we examined changes in the parameter estimate of surveillance in the full model before (Model 1) and after (Model 2) adding these two potential explanatory variables.
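As an illustration of the Kaplan-Meier step, the following is a minimal pure-Python sketch of the product-limit estimator. It is not the analysis code used in the study (which was run in R); the function name `kaplan_meier` and the toy follow-up data are hypothetical and chosen only to show how censored observations enter the estimate.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    Returns (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: follow-up in months, 1 = death, 0 = censored.
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Patients censored at the same time as a death are counted as still at risk at that time, which is the usual convention.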
To reduce the impact of potential confounding between the AFP group and the AFP+US group, significant differences in baseline characteristics in Model 1 and Model 2 were further adjusted for by propensity score-based matching. The two groups were matched on propensity scores within a difference of ±0.05. Differences in overall mortality and cancer-specific mortality between the matched groups were compared using Cox regression models, with robust standard errors that accounted for the clustering of matched pairs. [24] Matching was performed using the R package MatchIt. [25]
Because survival in the AFP group may be related to diagnoses made before the lesions were detected by US, we calculated lead times for the AFP group using Schwartz's formula. [26] Tumor volume doubling time was based on the value given in previous reports. [27][28][29][30][31] The estimated lead time for the AFP group was subtracted from their observed survival. If the value became negative, we attributed a survival (deceased patients) or a follow-up (surviving patients) of 1 day. The survival of the AFP group (corrected for the estimated lead time) was compared with the observed survival of the US and AFP+US groups. The adjusted hazard ratios (HRs) for corrected survival by the detection methods were also calculated using Cox multivariate stepwise regression analysis. Length-time bias was also addressed by repeating the lead-time calculation with tumor volume doubling times ranging from 90 to 150 days, which might represent tumors with various growth rates. [27] A two-tailed P-value <0.05 was considered statistically significant. Statistical analyses were performed with R software version 3.1.1.
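The lead-time correction can be sketched as follows. Schwartz's formula relates the tumor volume doubling time DT to the growth in diameter over an interval t as DT = t × log 2 / (3 × log(D_t/D_0)); solving for t gives the lead time. The sketch below assumes this rearrangement. The two diameters are hypothetical inputs, since the text does not state which diameters were used; only the doubling times (90 to 150 days, 120 days in the main analysis) and the 1-day floor come from the text.

```python
import math


def lead_time_days(doubling_time_days: float,
                   diameter_at_afp_cm: float,
                   diameter_at_us_cm: float) -> float:
    """Lead time from Schwartz's formula rearranged for time.

    t = DT * 3 * log(D_us / D_afp) / log(2)
    Both diameters are assumed inputs (not given in the text). Returns 0
    when the tumor is no smaller at AFP detection than at US detection.
    """
    if diameter_at_us_cm <= diameter_at_afp_cm:
        return 0.0
    ratio = diameter_at_us_cm / diameter_at_afp_cm
    return doubling_time_days * 3.0 * math.log(ratio) / math.log(2.0)


def corrected_survival_days(observed_days: float, lead_days: float) -> float:
    """Subtract the lead time; values that would be non-positive are set to 1 day."""
    return max(observed_days - lead_days, 1.0)


# Sensitivity analysis over the range of doubling times named in the text,
# with hypothetical diameters of 2.0 cm (AFP detection) and 3.0 cm (US detection).
sensitivity = {
    dt: lead_time_days(dt, 2.0, 3.0)
    for dt in (90, 120, 150)
}
```

Because the lead time scales linearly with the assumed doubling time, slower-growing tumors receive a larger correction, which is why varying the doubling time serves as a sensitivity analysis.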

Demographic and tumor characteristics of the patients
The median age of the 1,776 patients was 57 years (interquartile range [IQR], 51-64 years), and most of the patients were male (77.5%). Liver cirrhosis was present in 86.5% of the patients (1,536 out of 1,776), and 1,615 (90.9%) were in Child-Pugh class A. There were 46 (2.6%) patients with preclinical ascites and no related symptoms. When the patients were stratified based on the results of the screening tests, 298 (16.8%), 978 (55.1%), and 500 (28.2%) were assigned to the AFP, US, and AFP+US groups, respectively. Table 1 presents the baseline characteristics of the three groups in the screening period. The largest proportions of females and never drinkers were observed in the AFP group (Ps<0.001 for both). In terms of cause of liver disease, hepatitis B virus (HBV) infection was the most common risk factor for HCC in the entire cohort (82.1%). In terms of laboratory findings, mean hepatic inflammatory parameters such as aspartate transaminase and alanine transaminase levels were comparable across the groups (P = 0.162 and P = 0.378, respectively). However, liver cirrhosis was more common in the AFP group, as was more advanced liver dysfunction (i.e., Child-Pugh class B) (Ps<0.05). Mean platelet count, representing underlying portal hypertension, decreased from the AFP+US group through the US and AFP groups (Ps<0.001).
The median interval between screening test and diagnosis of HCC was 2.0 weeks (IQR 1.7-2.2 weeks) with similar data across the three groups (P = 0.445; Table 2). The initial findings for tumor stage and therapeutic modality in the three screening groups are shown in Table 2. The median maximal tumor diameter was 2.4 cm (IQR, 1.7-3.5 cm), and 7.3% of the patients had three or more tumors. The infiltrative subtype of HCC that is difficult to recognize on early imaging was least common in the US group (1.6%), and present in 5.0% of the AFP group (P<0.001). Multiple tumors and larger tumors were most frequent in the AFP+US group, with similar numbers in the AFP and US groups (Ps<0.001 for all). While the proportion of patients with early HCC (i.e., very early and early stage HCC based on the BCLC classification) was highest in the AFP group (83.2% vs. 67.6%, P<0.001), more advanced tumors were most frequent in the AFP+US group.
Surgical resection and chemoembolization were the most frequent initial methods used for treating HCC in all the groups. Primary liver transplantation was performed in 15 (5.0%), 22 (2.3%), and 10 (2.0%) patients, respectively, in the AFP, US, and AFP+US groups. Curative treatments such as resection, transplantation and local ablative therapies were more often performed in the AFP group and US group than in the AFP+US group (60.1% vs. 63.1% vs. 56.4%, P = 0.044; Table 2).
Table 3 shows the relationships between demographic and liver disease-related parameters and the AFP group. In univariate analysis, female gender, habitus of alcohol drinking and  In terms of tumor factors, about two thirds of the HCC tumors (68.5%) in the AFP group were in previously established blind spots of the US test, including the hepatic dome, the area under the rib, the caudate lobe, and the tip of the lateral segment of the left lobe.
Univariate analyses showed that the HRs of overall and cancer-specific survival relative to the AFP+US group were 0.65 and 0.59 for the AFP group, and 0.54 and 0.46 for the US group, respectively (all Ps≤0.001). Other significant variables related to time-dependent outcomes are presented in Table 4. These factors were subsequently assessed in a multivariate proportional hazard regression model (Model 1; Table 4). To evaluate the potential effect of AFP screening due to stage migration and subsequent receipt of more-curative treatment, BCLC stage and primary treatment were added to this model (Model 2; Table 4). In Model 1, the AFP and US groups had multivariate HRs of 0.60 (CI 0.47-0.78) and 0.53 (CI 0.43-0.64), respectively, for overall death relative to the AFP+US group (both Ps<0.001). In Model 2, the adjusted HRs of the AFP and US groups compared with the AFP+US group for all-cause death were 0.68 (CI 0.52-0.88; P = 0.003) and 0.57 (CI 0.47-0.69; P<0.001), respectively. In terms of HCC-related mortality, the lower HRs in the AFP group than in the AFP+US group were also significant, as shown in  and HR 0.62 [0.46-0.84], P = 0.002 in Model 2). The same trends were observed when using the AJCC staging system, as shown in S1 Table. In addition, these results were reproduced after PS matching between the AFP and AFP+US groups (S2 Table and S1 and S2 Figs).

Lead-time correction model
Because the earlier recognition in the preclinical course of disease due to AFP screening appeared to contribute to the longer survival observed in the AFP group, we corrected for lead-time in that group. When we used 120 days as the assumed tumor doubling time to correct for lead-time bias, overall survival and cancer-specific mortality outcomes were significantly better in the AFP group than in the AFP+US group ( Table, and similar results were obtained when the analysis was based on the AJCC staging system (S4 Table).

Analysis of the cirrhotic subset
Analysis of the cirrhotic cohort (n = 1,536) showed that the significantly lower overall and cancer-specific mortality risks of the AFP group relative to the AFP+US group were maintained, with adjusted HRs of 0.67 (0.51-0.87; P = 0.003) and 0.59 (0.43-0.82; P = 0.002), respectively, after adjustment for BCLC stage and curative treatment options (S4 Fig and S5 Table). Survival after correction for lead-time in the AFP group also revealed trends in patients with liver cirrhosis similar to those observed in the entire cohort (S6 Table).

Discussion
In the present study, retrospective evaluation of data for preclinical HCC revealed that approximately 17% of asymptomatic patients with HCC were diagnosed on the basis of a preceding elevation of AFP, with no suspected lesions on ultrasonic images. We conclude that the serum AFP test can play a complementary role in the early detection of HCC even when there is regular US-based surveillance. This route to diagnosis was more often observed in cirrhotic patients, and, importantly, such patients had better survival outcomes than those whose HCC was detected by both the serological and radiological tests. The consistency of our findings across the two endpoints, all-cause and cancer-specific mortality, adds to the evidence that the AFP test helps early detection of HCC. Because the difference in survival between the patients identified by AFP with and without abnormal US findings was maintained after correcting for lead time combined with confounding adjustment, we suggest that the observed survival benefit was due to downward stage migration followed by more frequent receipt of potentially curative treatment.
The clinical effectiveness of an HCC screening program in at-risk individuals relies on early diagnosis, provided that effective treatments are available. [4,5] Consequently, an increased proportion of successful treatments and a reduction in the mortality of surveilled patients should be the final results. While US is a well-accepted modality for HCC surveillance, the interpretation of US can be difficult due to the increased liver nodularity and parenchymal heterogeneity as cirrhosis progresses, which hampers the identification of nodules. [19,32] There is currently no concrete evidence supporting the use of CT or MRI as part of a routine surveillance strategy for detecting HCC. In addition, the high costs and the potential harm from contrast-related injury and radiation exposure associated with these tests make them poor candidates for surveillance tests in most clinical settings. [33] In our regular surveillance series, 83.2% of the cancers positive for AFP screening alone were detected at early, potentially curable, stages, as were the HCC nodules primarily identified on US that did not produce AFP and had more indolent phenotypes. In contrast, 68.4% of the cases that gave screen-positive results in both tests were beyond the early stages and produced more AFP.
There has been controversy about the complementary use of AFP testing as a biomarker for HCC surveillance. [8,10] A stage-specific meta-analysis of 13 prospective studies found only a minimal effect of adding AFP measurement on the sensitivity of detection of early HCC. [9] However, inter-study heterogeneity in the included individuals, in the AFP cut-offs, and in the reference imaging makes these findings unreliable. In addition to the survival benefit of AFP testing in our present work, recent investigations of the accuracy of HCC surveillance techniques have reported greater sensitivity of US and AFP for early stage HCC in cirrhotics compared to US alone, [6] and a lower false-positive rate for AFP than for US. [34] Indeed, the most recently updated oncology and hepatology guidelines (re)introduce AFP. [12,13] In order to establish mandatory indications of AFP for HCC detection, we need to determine the potential predictors of HCC as detected by "elevated AFP data without pathologic ultrasonic information". In our series, liver cirrhosis, which is usually accompanied by portal hypertension, was the major predictor requiring complementary AFP screening. Among the AFP-secreting HCCs in our cirrhotic subset, outcomes were better when the tumors were detectable only serologically, outside the ultrasonic detection range, a situation found more often in cirrhotic patients, who are also more susceptible to cancer. Our data support the statement by the American College of Gastroenterology that diagnostic examination should be done in cirrhotics with an elevated or rising AFP, even in the absence of abnormal findings on US. [35] Our observations could provide a rationale for the clinical use of the AFP serotest coupled with US screening, particularly in cirrhotic subjects, in order to provide life-saving or life-extending opportunities to patients with impending HCC.
There are potential limitations to our study. First, a complementary role of AFP testing in HCC surveillance would ideally be evaluated by a randomized trial in a high-risk population, especially to assess the consequences of falsely elevated AFP. Although a formal economic analysis will be needed to justify offering AFP testing as a concrete complementary option, we believe that developed societies could afford the cost of the false-positive AFP results added to the 90% true positives, at least in cirrhotic patients in whom the sensitivity of US for early tumors is disappointing and the risk of HCC is highest, [36] if doing so improved survival, as in our findings. Second, length-time bias is generally recognized as important in cancer screening. [37] Because HCC is rarely indolent, with doubling times of 3-6 months in prior studies, [28,31] and only subjects asymptomatic at the time of HCC diagnosis were included in the analysis, this kind of bias is not likely to have been introduced in our study. Third, although most patients included in this study were infected by HBV, HBV itself would not have a negative effect on the ability of AFP to detect HCC as long as the HBV infection was not exacerbated, as was the case in our cohort. [38]
In conclusion, we have demonstrated that measuring AFP to detect HCC can improve overall and cancer-specific survival in patients with chronic liver disorders, even after correcting for lead times due to AFP screening. This improvement may be due to detecting tumors at an earlier stage, thereby increasing the chance of curative treatment. Given the inherent limitations of ultrasound and the lack of other reliable markers, our data suggest that continuing to use the AFP assay together with established US examination in at-risk patients is clinically desirable and practically recommendable, supporting the relevant updated guidelines.
Supporting information
S1 Fig. (A) Overall mortality and (B) cancer-specific mortality of the AFP and AFP+US groups after propensity score matching in Model 1. Both overall and cancer-specific mortalities were significantly different in the two groups (both Ps<0.001 by the log-rank test).