Discordance between Liver Biopsy and FibroScan® in Assessing Liver Fibrosis in Chronic Hepatitis B: Risk Factors and Influence of Necroinflammation

Background Few studies have investigated predictors of discordance between liver biopsy (LB) and liver stiffness measurement (LSM) using FibroScan®. We assessed predictors of discordance between LB and LSM in chronic hepatitis B (CHB) and investigated the effects of necroinflammatory activity. Methods In total, 150 patients (107 men, 43 women) were prospectively enrolled. Only LSM with ≥10 valid measurements was considered reliable. Liver fibrosis was evaluated using the Laennec system. LB specimens <15 mm in length were considered ineligible. Reference cutoff LSM values to determine discordance were calculated from our cohort (6.0 kPa for ≥F2, 7.5 kPa for ≥F3, and 9.4 kPa for F4). Results A discordance, defined as a discordance of at least two stages between LB and LSM, was identified in 21 (14.0%) patients. In multivariate analyses, fibrosis stages F3–4 and F4 showed independent negative associations with discordance (P = 0.002; hazard ratio [HR], 0.073; 95% confidence interval [CI], 0.014–0.390 for F3–4 and P = 0.014; HR, 0.067; 95% CI, 0.008–0.574 for F4). LSM values were not significantly different between maximal activity grades 1–2 and 3–4 in F1 and F2 fibrosis stages, whereas LSM values were significantly higher in maximal activity grade 3–4 than 1–2 in F3 and F4 fibrosis stage (median 8.6 vs. 11.3 kPa in F3, P = 0.049; median 11.9 vs. 19.2 kPa in F4, P = 0.009). Conclusion Advanced fibrosis stage (F3–4) or cirrhosis (F4) showed a negative correlation with discordance between LB and LSM in patients with CHB, and maximal activity grade 3–4 significantly influenced LSM values in F3 and F4.


Introduction
Because the prognosis of and management strategies for patients with chronic liver diseases depend strongly on the severity of liver fibrosis, early detection of significant fibrosis is key [1]. To date, liver biopsy (LB) has been the gold standard for assessing liver fibrosis. However, its invasiveness, potential adverse events [2], sampling errors [3], and interpretational variability [4] have encouraged clinicians to seek more accurate and noninvasive tools for assessing liver fibrosis.
Recently, liver stiffness measurement (LSM) using FibroScan® was introduced as a noninvasive method to accurately assess liver fibrosis [5]. In view of the results achieved so far, LSM can help physicians decide treatment strategies, predict prognosis, and monitor disease progression or regression in patients with chronic liver disease. Despite the clinical usefulness of LSM, several confounding factors that can diminish the accuracy of LSM have been identified, such as necroinflammatory activity, reflected by a high alanine aminotransferase (ALT) level, cholestasis, or heart failure [6][7][8][9][10][11][12]. In addition to these extrinsic factors, LSM should satisfy the intrinsic prerequisites for preserving its validity: ≥10 valid measurements, a success rate ≥60%, and an interquartile range (IQR)/median LSM value among valid measurements (IQR/M) <0.3. However, because these criteria are not based on scientific evidence, several studies have tried to demonstrate the clinical relevance of these criteria by identifying factors that predict discordant results between LB and LSM in estimating liver fibrosis. Results from these studies have identified elevated ALT, high IQR/M, high body mass index (BMI), and fibrosis stage at the time of LB as predictors of discordance [13][14][15]. Although elevated ALT has been considered to be the single most important confounder of LSM, the effects of necroinflammatory activity, which is closely related to ALT level, on discordance between LB and LSM have not been determined.
Thus, in this study, we examined predictors of discordance between LB and LSM in patients with chronic hepatitis B (CHB) and investigated the effects of necroinflammatory activity on LSM.

Patients
Between January 2007 and December 2009, 196 consecutive patients with CHB, defined by detectable hepatitis B virus surface antigen (HBsAg) for more than 6 months and positive hepatitis B virus (HBV) DNA by polymerase chain reaction assay, underwent both LB and LSM before starting antiviral treatment. Of them, 184 (93.9%) patients received LSMs on the same day as LB. The remaining LSMs were conducted at a median of 6 (range, 1-24) days before LB in 12 (6.1%) patients.
No patient had evidence of decompensated liver cirrhosis, such as a history of variceal bleeding, ascitic decompensation, hepatic encephalopathy, or Child-Pugh class B or C at the time of LB and LSM. Exclusion criteria were as follows: (1) previous antiviral treatment before LB, (2) evidence of liver cancer or another malignancy, (3) coinfection with hepatitis C virus, hepatitis D virus, or human immunodeficiency virus, (4) alcohol consumption in excess of 40 g/day for more than 5 years, (5) LB specimens shorter than 15 mm in length or unknown LB length, (6) right-sided heart failure, (7) LSM failure, or (8) unreliable LSM (<10 valid measurements).
This study cohort includes a subset of a previous multicenter Korean study [14]. The study protocol was consistent with the ethical guidelines of the 1975 Declaration of Helsinki. Written informed consent was obtained from each participant or responsible family members after the possible complications of LB had been fully explained. This study was approved by the independent institutional review boards of Severance Hospital, Yonsei University College of Medicine.

Clinical data
Demographic details and BMI were collected. The following laboratory parameters were also collected from all patients at the time of LSM: ALT, gamma-glutamyltranspeptidase (GGT), and platelet count. HBsAg was measured using standard enzyme-linked immunosorbent assays (Abbott Diagnostics, Abbott Park, IL, USA). The upper limit of normal (ULN) for ALT was defined as 40 IU/L.

Liver stiffness measurement
LSM was obtained according to the instructions provided by the manufacturer. Details of the technical background and examination procedure have been described previously [5,16,17,18]. The success rate was calculated as the number of valid measurements divided by the total number of measurements. Results are expressed in kilopascals (kPa). IQR was defined as an index of LSM intrinsic variability corresponding to the interval between the 25th and 75th percentiles around the LSM result, containing 50% of the valid measurements. The median value was considered representative of the elastic modulus of the liver. Only procedures with ≥10 valid measurements were considered reliable, regardless of success rate and IQR/M. The same experienced operator (>3,000 LSM examinations), blinded to LB results and the clinical data of the study population, performed all LSM examinations.
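The quantities defined above (success rate, median, IQR, and IQR/M) can be illustrated with a minimal Python sketch. It is not the manufacturer's algorithm: the function name `summarize_lsm`, the returned field names, and the choice of the "inclusive" quartile method are assumptions made for illustration, and the exact percentile convention used by the device is not stated in the text. The `reliable` flag applies only the criterion used in this study (≥10 valid measurements).

```python
from statistics import median, quantiles


def summarize_lsm(valid_kpa, total_shots):
    """Summarize one LSM examination from its valid measurements (in kPa).

    valid_kpa   -- list of valid stiffness readings
    total_shots -- total number of measurements attempted
    """
    if not valid_kpa or total_shots < len(valid_kpa):
        raise ValueError("need >=1 valid measurement and total_shots >= valid count")

    med = median(valid_kpa)

    # Assumption: quartiles computed with the 'inclusive' method;
    # the device's own percentile convention may differ.
    if len(valid_kpa) >= 2:
        q1, _, q3 = quantiles(valid_kpa, n=4, method="inclusive")
        iqr = q3 - q1
    else:
        iqr = 0.0

    return {
        "n_valid": len(valid_kpa),
        "success_rate": len(valid_kpa) / total_shots,
        "median_kpa": med,
        "iqr_kpa": iqr,
        "iqr_over_median": iqr / med,
        # Study criterion: >=10 valid measurements, regardless of
        # success rate and IQR/M.
        "reliable": len(valid_kpa) >= 10,
    }


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    shots = [6.0, 6.2, 6.4, 6.6, 6.8, 7.0, 7.2, 7.4, 7.6, 7.8]
    print(summarize_lsm(shots, total_shots=12))
```

The manufacturer's additional recommendations (success rate ≥60% and IQR/M <0.3) can be checked from the same returned fields if desired.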

Liver biopsy and histological evaluation
LB specimens were fixed in formalin and embedded in paraffin. Sections (4 μm) were stained with hematoxylin and eosin and Masson's trichrome. All liver tissue samples were evaluated by an experienced hepatopathologist (YN Park) who was blinded to the clinical data of the study population, including LSM results. Liver fibrosis and necroinflammation were evaluated semiquantitatively according to the Laennec system [19]. Fibrosis was first scored in five grades: 0, no definite fibrosis; 1, minimal fibrosis (no septa or rare thin septum; may have portal expansion or mild sinusoidal fibrosis); 2, mild fibrosis (occasional thin septa); 3, moderate fibrosis (moderate thin septa; up to incomplete cirrhosis); and 4, liver cirrhosis. Then, liver cirrhosis was sub-classified into three groups: 4A, mild cirrhosis, definite or probable; 4B, moderate cirrhosis (at least two broad septa); and 4C, severe cirrhosis (at least one very broad septum or many minute nodules). The activity grade referred to the degree of necroinflammatory activity in the lobule and periportal area and was scored in five grades: A0, no activity; A1, minimal; A2, mild; A3, moderate; and A4, severe activity. Maximal activity grade was defined as the higher of the lobular and periportal activity grades. Steatosis in the liver specimen was graded on a four-point scale: S0 (insignificant, <5%), S1 (mild, 5-33%), S2 (moderate, 34-66%), and S3 (severe, ≥66% of hepatocytes with fat deposits) [20,21].

Statistical analysis
Patient characteristics are reported as means ± standard deviations, medians (ranges), or n (%), as appropriate. Continuous variables of patients with discordance and those without were compared with independent t-tests or Mann-Whitney U tests. The chi-squared or Fisher's exact test was used for categorical variables. Discordance was defined as a difference of at least two stages between LB and LSM [13]. Cutoff LSM values for determining discordance were derived from our cohort as the values that maximized the sum of sensitivity (Se) and specificity (Sp). Positive and negative predictive values (PPV and NPV) were also computed. Spearman's analysis was used to investigate correlations between variables. Univariate and subsequent multivariate binary logistic regression analyses were performed to identify independent factors related to discordance between LB and LSM. Hazard ratios (HRs) and corresponding 95% confidence intervals (CIs) are also indicated. A two-sided P value of <0.05 was considered significant. All statistical analyses were performed with the SPSS software (ver. 12.0; SPSS Inc., Chicago, IL, USA).
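As an illustration of the cutoff selection described above, the following Python sketch scans candidate LSM thresholds and keeps the one that maximizes Se + Sp, then reports PPV and NPV at that threshold. It is only a sketch under stated assumptions: the analyses in this study were run in SPSS, the function name `best_cutoff` is hypothetical, the decision rule "positive if LSM ≥ cutoff" is assumed, and the data in the usage example are invented.

```python
def best_cutoff(lsm_kpa, has_condition):
    """Return the cutoff (positive if LSM >= cutoff) that maximizes Se + Sp.

    lsm_kpa       -- LSM values in kPa
    has_condition -- matching truthy/falsy flags from biopsy (e.g. stage >= F2)
    """
    pairs = list(zip(lsm_kpa, has_condition))
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative cases")

    best = None
    # Every observed value is tried as a candidate threshold.
    for cutoff in sorted(set(lsm_kpa)):
        tp = sum(1 for x, y in pairs if y and x >= cutoff)
        fp = sum(1 for x, y in pairs if not y and x >= cutoff)
        fn = n_pos - tp
        tn = n_neg - fp
        se, sp = tp / n_pos, tn / n_neg
        # Ties are resolved in favor of the lowest cutoff (an arbitrary choice).
        if best is None or se + sp > best["se"] + best["sp"]:
            best = {
                "cutoff": cutoff,
                "se": se,
                "sp": sp,
                "ppv": tp / (tp + fp) if tp + fp else float("nan"),
                "npv": tn / (tn + fn) if tn + fn else float("nan"),
            }
    return best


if __name__ == "__main__":
    # Invented example data, not taken from the study cohort.
    values = [4.0, 5.0, 6.5, 6.0, 8.0, 9.0, 7.0]
    labels = [0, 0, 0, 1, 1, 1, 1]
    print(best_cutoff(values, labels))
```

Maximizing Se + Sp is equivalent to maximizing the Youden index (Se + Sp − 1), so the selected threshold is the same under either formulation.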

Liver histology and corresponding LSM values
The median length of LB samples was 17 (range, 15-25) mm. The fibrosis stage and maximal activity grade are summarized in Table 2.
The median and range of LSM values according to maximal activity grade in each fibrosis stage are listed in Table 2. The median LSM values increased significantly as fibrosis stage increased (6.4 kPa for F1, 9.1 kPa for F2, 10.0 kPa for F3, and 12.0 kPa for F4; all P<0.05 between each fibrosis stage).

Influence of necroinflammatory activity on LSM according to each fibrosis stage
LSM values were compared between maximal activity grades 1-2 and 3-4 in each fibrosis stage (Figure 2). The median LSM values were not significantly different between maximal activity grades 1-2 and 3-4 in the F1 and F2 fibrosis stages (P = 0.676 and 0.139, respectively), whereas they were significantly higher in maximal activity grade 3-4 than in grade 1-2 in the F3 and F4 fibrosis stages (8.6 vs. 11.3 kPa in F3, P = 0.049; 11.9 vs. 19.2 kPa in F4, P = 0.009).
The ALT levels in F4 were 40.8±15.6 IU/L in the cases with maximal activity grade 1-2 and 60.7±23.3 IU/L in those with maximal activity grade 3-4, which were significantly different (P = 0.004). In contrast, the ALT levels in F1, F2, and F3 showed no significant difference between maximal activity grade 1-2 versus 3-4 (50.6±38.

Correlations between variables
Among the study variables, the highest correlation was noted between ALT level and maximal activity grade (correlation coefficient, 0.497; P<0.001), followed by a correlation between ALT and age (correlation coefficient, −0.290; P<0.001) and another between ALT and gender (correlation coefficient, −0.170; P = 0.038).

Patients with discordance between LB and LSM
When we stratified these 21 patients with discordance into two groups (LB high group, defined as patients with higher fibrosis stage based on LB, and LSM high group, defined as those with higher fibrosis stage based on LSM), only 2 (9.5%) patients were stratified into the LB high group and 19 (90.5%) into the LSM high group. The distribution of fibrosis stages based on LB and LSM is indicated in Table 3.
The mean maximal activity grade and ALT levels of the two patients with discordance in the LB high group showed a trend to be lower than those of the 129 patients with non-discordance (1.5±0.7 vs. 2.4±0.8, P = 0.150 and 25.0±12.7 vs. 72.9±103.1 IU/L, P = 0.514, respectively). The mean maximal activity grade of the 19 patients with discordance in the LSM high group was higher than that of the 129 patients with non-discordance, with borderline statistical significance (2.7±0.9 vs. 2.4±0.8, P = 0.074), whereas ALT levels only showed a trend to be higher in the LSM high group than in the 129 patients with non-discordance (87.6±62.5 vs. 72.9±103.1 IU/L, P = 0.547).
Of the two cases in the LB high group, one was classified as F4A fibrosis stage, with a maximal activity grade of A1, and the other was classified as stage F3, with a maximal activity grade of A2. The histology of the former patient with F4A showed thin fibrous septa with minimal necroinflammatory activity (Figure 3(A)). Among the 19 cases in the LSM high group, 7 and 12 cases showed F1 and F2 fibrosis stage, respectively, and their maximal activity grade was A1-2 in 8, A3 in 8, and A4 in 3 cases. One histological example of the LSM high group showed periportal fibrosis with bridging necrosis (Figure 3(B)).
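The stratification into LB high and LSM high groups can be sketched in Python using the cohort-derived cutoffs quoted in the abstract (6.0 kPa for ≥F2, 7.5 kPa for ≥F3, and 9.4 kPa for F4). Several details are assumptions made for illustration and are not stated in the text: the function names, the use of an inclusive (≥) comparison at each cutoff, and the coding of LSM values below 6.0 kPa as stage 1 (F0-1).

```python
# Cohort-derived cutoffs from the abstract, highest first: (kPa, LSM-based stage).
CUTOFFS_KPA = ((9.4, 4), (7.5, 3), (6.0, 2))


def lsm_stage(lsm_kpa):
    """Map an LSM value (kPa) to an LSM-based fibrosis stage."""
    for cutoff, stage in CUTOFFS_KPA:
        if lsm_kpa >= cutoff:  # assumption: cutoffs are inclusive
            return stage
    return 1  # assumption: below 6.0 kPa is treated as F0-1 and coded as 1


def classify(lb_stage, lsm_kpa):
    """Classify one patient by comparing biopsy stage with LSM-based stage.

    Discordance is a difference of at least two stages; its direction
    decides between the 'LSM high' and 'LB high' groups.
    """
    diff = lsm_stage(lsm_kpa) - lb_stage
    if diff >= 2:
        return "LSM high"
    if diff <= -2:
        return "LB high"
    return "non-discordant"


if __name__ == "__main__":
    # Invented (biopsy stage, LSM in kPa) pairs, for illustration only.
    for lb, kpa in [(1, 8.0), (4, 6.5), (2, 7.0)]:
        print(lb, kpa, classify(lb, kpa))
```

Because the LSM-based stage has no upper bound above the F4 cutoff, a patient with biopsy-proven F4 can only be discordant in the LB high direction under this scheme.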

Discussion
Although LSM is an accurate method that evaluates the degree of liver fibrosis [5], many extrinsic factors have significant influences on LSM [6][7][8][9][10]. Additionally, LSM should satisfy the three intrinsic prerequisites of ≥10 valid measurements, success rate ≥60%, and IQR/M <0.3 to maintain the validity needed to reflect the real fibrotic state of the liver [5]. However, these intrinsic prerequisites are only the manufacturer's recommendation.
As a result, several studies have investigated the influence of IQR/M on the accuracy of LSM [13][14][15]. Two studies with chronic hepatitis C (CHC) only [13] or mostly CHC [15] have proposed optimal cutoff IQR/M values of 0.21 and 0.17, respectively. In contrast, the other study with CHB did not identify IQR/M as a significant predictor of accuracy [14]. Indeed, in our study with CHB, IQR/M was not selected as a significant predictor of discordance. Reasons for this remain unclear, although IQR/M in the previous CHB study [14] and ours (both mean IQR/M 0.14) was much lower than in CHC studies (0.23 [13] and 0.16 [15]), indicating that LSM performed more accurately. We believe that IQR/M may not be a sensitive marker to predict discordance in CHB because of the influence of inhomogeneous histological features or necroinflammation that overwhelm the influence of IQR/M on LSM [22]. Consistent with the literature [13][14][15], success rate was not a significant predictor of discordance in our study, which might mean that 'high quality,' reflected by lower IQR/M, matters more for the accurate interpretation of LSM than a 'high percentage' of successful shots. In addition to IQR/M, fibrosis stage (F0-2 vs. F3-4) and elevated ALT (>1.5-2× ULN) were proposed as significant extrinsic predictors of discordance [13][14][15]. However, controversy remains regarding fibrosis stage. Lucidarme et al. [13] concluded that advanced fibrosis (F3-4) was correlated with discordance, while Kim et al. [14] and Myers et al. [15] proposed minimal fibrosis (F0-2) as a significant predictor of discordance between LB and LSM. In our study, F3-4 and F4 showed a negative correlation with discordance. The significantly lower rate of discordance in F4 may be explained by the absence of an upper cutoff LSM value for cirrhosis, up to 75 kPa, in this study design. Accordingly, F3-4, which included a high proportion of F4 (80.9%), also showed a negative correlation with discordance.
From these results, we suggest that different predictors of discordance may emerge depending on the distribution of F4. Furthermore, this hypothesis may explain why ALT level was not selected as a significant predictor of discordance in our study, which had a higher prevalence of F3-4 (56.0%) than a previous study (20.3%) that proposed ALT as a significant predictor of discordance [15]. The confounding effect of elevated ALT may have been attenuated in our study due to a higher proportion of F4, which is free from misdiagnosis due to LSM overestimation by elevated ALT, despite similar ALT levels between the two studies (mean 61 IU/L [15] vs. 74 IU/L in the present study).
Thus, the potentially masked influence of necroinflammation in F4, the consistent reports on the overestimating effects of elevated ALT, and the significant correlation between lobular activity grade 3-4 and discordance in our univariate analysis prompted us to investigate further the effects of necroinflammatory activity grade on LSM. The maximal activity grade 3-4 significantly influenced LSM values in F3 and F4, but not in F1 or F2, which may be explained in several ways. First, the mean ALT level was not significantly different in A1-2 versus A3-4 in F1 and F2 in our cohort. Thus, the ALT effect could not be revealed. Second, extrinsic factors, such as liver congestion [7] and respiration [23] resulting in a change of portal flow concurrent with necroinflammation, may have influenced the performance of LSM when liver fibrosis was insufficient (≤F2) to be detected by LSM. These combined effects of several confounders might have concealed the effects of necroinflammation on LSM, consistent with a recent meta-analysis on LSM reporting the relatively lower performance of LSM to predict significant fibrosis (≥F2) [24]. Most patients (90.5%) with discordance were stratified into the LSM high group, indicating that LSM values were subject to overestimation due to necroinflammation or high ALT. Although statistical significance for necroinflammation and high ALT was not seen when we compared patients with non-discordance with those in the LSM high group, the clinical implications should be further investigated in future studies, considering the borderline statistical significance of maximal activity grade (P = 0.074) and the small sample size of the LB high group.
Indeed, clinical variables, such as ALT level, which showed the best correlation with necroinflammatory activity in our study, are needed to predict the discordance 'before LB,' because histological variables, such as fibrosis stage or activity grade, which are only available 'after LB,' are not helpful for the prediction of the discordance between LB and LSM. The concept of excluding subjects with high ALT to enhance the accuracy of LSM has already been proposed [17,25]. Although ALT was not a significant predictor, the optimal cutoff ALT level to predict discordance was 55 IU/L (data not shown). Furthermore, the mean ALTs of maximal activity grade 3-4 in F3 and F4 that raised LSM values significantly were 60.7 and 112.3 IU/L, respectively. Because the ALT cutoff seemed to be around 1.5-3× ULN in our study and 1.5-2× ULN in previous ones [14,15,26], ALT level ≤3× ULN may be optimal to enhance LSM performance.
Interestingly, LSM values showed a significant stepwise increment according to F4A, B, and C without a significant difference in ALT levels in our study, indicating that LSM can further stratify patients with cirrhosis. Thus, mild liver cirrhosis (F4A) might have a higher chance of being underestimated by LSM than moderate or severe cirrhosis (F4B or F4C), especially when necroinflammatory activity or ALT level is low. This could be a reason why one patient with F4A and A1 activity grade belonged to the LB high group. Although the histological sub-classification of cirrhosis is gaining clinical relevance [27], stratification of cirrhosis according to the Laennec system or LSM should be further validated via long-term follow-up studies using solid clinical end-points, such as liver-related death or development of hepatocellular carcinoma.
In conclusion, advanced fibrosis stage (F3-4) or cirrhosis (F4) showed a negative correlation with discordance between LB and LSM in patients with CHB, and maximal activity grade 3-4 significantly influenced LSM values in F3-4 and F4. Thus, future studies should investigate how to control for the clinical marker of ALT, which may bridge histological information to enhance the accuracy of LSM.