Validation of Ten Noninvasive Diagnostic Models for Prediction of Liver Fibrosis in Patients with Chronic Hepatitis B

Background and Aims: Noninvasive models have been developed for fibrosis assessment in patients with chronic hepatitis B (CHB). However, the sensitivity, specificity and diagnostic accuracy of these methods in evaluating liver fibrosis have not been validated and compared in the same group of patients. The aim of this study was to verify the diagnostic performance and reproducibility of ten reported noninvasive models in a large cohort of Asian CHB patients.
Methods: The diagnostic performance of ten noninvasive models (HALF index, FibroScan, S index, Zeng model, Youyi model, Hui model, APAG, APRI, FIB-4 and FibroTest) was assessed against liver histology by ROC curve analysis in CHB patients. The reproducibility of the ten models was evaluated by recalculating the diagnostic values at the cut-off values defined by the original studies.
Results: Six models (HALF index, FibroScan, Zeng model, Youyi model, S index and FibroTest) had AUROCs higher than 0.70 in predicting any fibrosis stage, and two of them performed best: the AUROCs for predicting F≥2, F≥3 and F4 were 0.83, 0.89 and 0.89 for the HALF index and 0.82, 0.87 and 0.87 for FibroScan, respectively. Four models (HALF index, FibroScan, Zeng model and Youyi model) showed good diagnostic values at the given cut-offs.
Conclusions: The HALF index, FibroScan, Zeng model, Youyi model, S index and FibroTest show good diagnostic performance, and all of them except the S index and FibroTest have good reproducibility for evaluating liver fibrosis in CHB patients.
Registration Number: ChiCTR-DCS-07000039.


Introduction
Chronic hepatitis B (CHB) is a major global health problem that can lead to cirrhosis, decompensation and hepatocellular carcinoma (HCC). Recent guidelines [1] on the management of CHB propose that the presence of significant fibrosis or cirrhosis is an indication for treatment and for close monitoring for complications of portal hypertension and the development of HCC. Therefore, assessment of liver fibrosis in patients with CHB is of paramount importance for predicting disease progression, determining the optimal timing of antiviral therapy and evaluating its efficacy.
At present, liver biopsy remains the gold standard for assessing liver fibrosis. However, liver biopsy is an invasive procedure with a potential risk of complications, especially in those with advanced fibrosis and cirrhosis, and its diagnostic accuracy is compromised by sampling error as well as interobserver variations [2][3][4].
Therefore, noninvasive methods for assessing liver fibrosis have been a focus of translational research. Noninvasive models such as the aspartate aminotransferase (AST)-to-platelet ratio index (APRI) [5], FibroTest [6], FibroScan [7] and the Zeng model [8], which comprise various biochemical and clinical parameters, have been derived from patients with CHB, chronic hepatitis C (CHC) and alcoholic liver disease. Most of the original studies reported good AUROC values, but the models have not been externally validated and compared in the same group of patients. In the present study we therefore validated the diagnostic performance and evaluated the reproducibility of these models against liver histology in a large cohort of Chinese patients with CHB.
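For readers unfamiliar with the simpler serum indices, the sketch below computes APRI and FIB-4 from their commonly published definitions. This section of the paper does not restate the formulas, so the function names, argument names and the patient values are illustrative assumptions; the upper limit of normal (ULN) for AST in particular depends on the laboratory.

```python
import math


def apri(ast_u_l: float, ast_uln_u_l: float, platelets_1e9_l: float) -> float:
    """APRI = (AST / upper limit of normal for AST) / platelets (10^9/L) * 100."""
    return (ast_u_l / ast_uln_u_l) / platelets_1e9_l * 100.0


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_1e9_l: float) -> float:
    """FIB-4 = (age * AST) / (platelets (10^9/L) * sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only (not study data).
    print(apri(ast_u_l=80, ast_uln_u_l=40, platelets_1e9_l=100))
    print(fib4(age_years=50, ast_u_l=80, alt_u_l=64, platelets_1e9_l=100))
```

Both indices rely only on indirect serum markers, which is the category of models that the Discussion contrasts with imaging-based and direct-marker models.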

Methods

Patients
Between September 2007 and April 2009, patients with CHB who underwent percutaneous liver biopsy at seven hospitals (Beijing Friendship Hospital, Beijing; Beijing Youan Hospital, Beijing; 302 Hospital of the Chinese People's Liberation Army, Beijing; Nanfang Hospital, Guangzhou; Ruijin Hospital, Shanghai; Renji Hospital, Shanghai; Southwest Hospital, Chongqing) were recruited into this study if they met the following criteria: (I) age between 18 and 65 years; (II) hepatitis B surface antigen (HBsAg) positive for longer than 6 months; (III) at least two weeks off therapy with bifendate or bicyclol before enrollment; (IV) written informed consent. Exclusion criteria were: (I) white blood cell count <3.5×10⁹/L, platelet count <80×10⁹/L, or prothrombin index <60%; (II) evidence of co-infection with hepatitis C; (III) evidence of any other acquired or inherited liver disease; (IV) a history of decompensated cirrhosis, defined as jaundice in the presence of cirrhosis, ascites, bleeding gastric or esophageal varices, or encephalopathy; (V) a history of malignancy, including hepatocellular carcinoma; (VI) lactation; (VII) body mass index (BMI) >28 kg/m²; (VIII) cardiac pacemaker or defibrillator carrier; (IX) unhealed wound in the right upper quadrant.

Liver Histology
Needle liver biopsy specimens were obtained with a 16-gauge needle under ultrasound guidance. To be considered adequate for scoring, a biopsy specimen had to measure at least 10 mm in length and contain at least 8 portal tracts. All specimens were routinely processed by formalin fixation and paraffin embedding, sectioned at 5 μm thickness, and stained with Masson trichrome and reticulin stains for histological assessment. All specimens were assessed by two independent pathologists blinded to the patients' clinical and laboratory characteristics. Fibrosis was scored on a 5-point scale according to the METAVIR scoring system [2,9]: F0, no fibrosis; F1, portal fibrosis without septa; F2, portal fibrosis with few septa; F3, numerous septa without cirrhosis; F4, cirrhosis. Discordant cases were reviewed by both pathologists together to reach consensus.
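The ROC analyses reported later treat the METAVIR stage as three binary endpoints. A minimal sketch of that mapping follows, assuming the thresholds used in the abstract (significant fibrosis F≥2, advanced fibrosis F≥3, cirrhosis F4); the function name and dictionary keys are hypothetical.

```python
def fibrosis_endpoints(metavir_stage: int) -> dict:
    """Map a METAVIR stage (0-4) to the three binary endpoints used for ROC analysis."""
    if metavir_stage not in (0, 1, 2, 3, 4):
        raise ValueError("METAVIR fibrosis stage must be an integer from 0 to 4")
    return {
        "significant_fibrosis": metavir_stage >= 2,  # F>=2
        "advanced_fibrosis": metavir_stage >= 3,     # F>=3
        "cirrhosis": metavir_stage == 4,             # F4
    }


if __name__ == "__main__":
    for stage in range(5):
        print(f"F{stage}", fibrosis_endpoints(stage))
```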

Liver Stiffness Measurement
Liver stiffness measurement (LSM) was performed using the FibroScan® medical device (Echosens, Paris, France) by well-trained operators on the same day as liver biopsy. The procedure was based on at least ten validated measurements. The success rate was calculated as the number of validated measurements divided by the total number of measurements. The median value, expressed in kilopascals (kPa), was considered representative of the liver stiffness value. The liver stiffness value was considered reliable only if at least 10 successful acquisitions were obtained, the overall success rate was ≥60%, and the ratio of the interquartile range to the median liver stiffness value was ≤30%.
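The reliability rule above can be sketched as a small check. This is an illustration rather than the device's or the authors' software: it reads the two percentages as the conventional thresholds (success rate of at least 60%, interquartile range over median of at most 30%), and the function name, the quartile method and the example readings are assumptions.

```python
from statistics import median, quantiles


def assess_lsm(valid_kpa, total_attempts):
    """Summarise one FibroScan examination and apply the reliability criteria.

    valid_kpa: validated stiffness readings in kPa.
    total_attempts: all acquisitions attempted, valid or not.
    """
    n_valid = len(valid_kpa)
    if total_attempts < max(n_valid, 1):
        raise ValueError("total_attempts must be >= 1 and >= number of valid readings")
    success_rate = n_valid / total_attempts
    if n_valid < 10:
        # Fewer than 10 successful acquisitions: unreliable by definition.
        return {"stiffness_kpa": None, "success_rate": success_rate,
                "iqr_over_median": None, "reliable": False}
    # Quartile convention is an assumption; other conventions shift the IQR slightly.
    q1, _, q3 = quantiles(valid_kpa, n=4, method="inclusive")
    med = median(valid_kpa)
    ratio = (q3 - q1) / med
    return {"stiffness_kpa": med, "success_rate": success_rate,
            "iqr_over_median": ratio,
            "reliable": success_rate >= 0.60 and ratio <= 0.30}


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    readings = [6.0, 6.2, 6.4, 6.6, 6.8, 7.0, 7.2, 7.4, 7.6, 7.8]
    print(assess_lsm(readings, total_attempts=12))
```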

Sample collection
Blood samples were collected in evacuated tubes (Monovette 02.1063, Sarstedt, Germany), allowed to clot for 30 min at room temperature and centrifuged at 1600g for 15 min at 4°C. Sera were frozen at -80°C within 2 h after collection.

Statistical Analysis
The diagnostic performance of each noninvasive model was assessed using receiver operating characteristic (ROC) curves. The areas under the ROC curves (AUROCs) and their 95% confidence intervals (CIs) were calculated. The AUROCs of different models obtained from the same data set were compared with the DeLong method [22], using MedCalc software version 12.2.1.0 (MedCalc, Mariakerke, Belgium). A P value < 0.05 was considered statistically significant.
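To make the AUROC concrete, the sketch below computes it as the probability that a randomly chosen patient with the endpoint has a higher score than one without it (ties count one half). The analysis in this study was run in MedCalc; the percentile bootstrap interval shown here is a swapped-in technique for illustration, not the interval method used by the authors, and the DeLong comparison is not implemented. All names and example values are hypothetical.

```python
import random


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (illustrative substitute method)."""
    rng = random.Random(seed)
    indices = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(indices) for _ in indices]
        ys = [labels[i] for i in sample]
        if all(ys) or not any(ys):
            continue  # resample lacks one of the two classes; draw again
        stats.append(auroc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical stiffness values (kPa) and F>=2 labels, for illustration only.
    kpa = [4.1, 5.0, 6.3, 7.2, 8.8, 9.5, 12.0, 14.3]
    f2_or_more = [0, 0, 0, 1, 0, 1, 1, 1]
    print(auroc(kpa, f2_or_more))
    print(bootstrap_ci(kpa, f2_or_more))
```

The pairwise definition is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based formulation gives the same value more efficiently.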

Diagnostic performance of noninvasive models in comparison with liver biopsy
The diagnostic performance of the noninvasive models was evaluated by the AUROCs determined for the whole population according to histological fibrosis stage (Table 4). For significant fibrosis (F≥2), the AUROCs of the HALF index, Zeng model, FibroScan, Youyi model, S index and FibroTest ranged from 0.70 to 0.90, while the AUROCs of the Hui model, APAG, APRI and FIB-4 were less than 0.70 (Fig 2A). For advanced fibrosis (Fig 2B) and cirrhosis (Fig 2C), the AUROCs of all models except APRI were higher than 0.70. The ROC curves of the respective noninvasive models were compared with each other using the DeLong method (S1 Fig and S1 Table). The HALF index and FibroScan performed significantly better than any of the other serum models for the diagnosis of significant fibrosis, advanced fibrosis and cirrhosis (all P<0.05). The accuracy of FibroScan and the HALF index for prediction of F≥2, F≥3 and F4 did not differ significantly (P = 0.25 for F≥2, P = 0.30 for F≥3, P = 0.45 for F4).
The AUROCs of the HALF index, FibroScan, Zeng model, Youyi model, S index and FibroTest for F≥2, F≥3 and F4 were consistently higher than 0.70, so these six noninvasive models were selected for further evaluation.

Evaluation of the reproducibility of the six noninvasive models
The cutoff values for each model, taken from the original articles [6][7][8][12][13][14], are listed in Table 5. For FibroScan, the diagnostic accuracy for F≥2, F≥3 and F4 was 73%, 84% and 88%, respectively, similar to the values in the original article [7] (76%, 90% and 94%), indicating that FibroScan has good reproducibility for predicting fibrosis stage in CHB patients. The diagnostic accuracy of the Youyi model at the given cutoff values was 65%, 71% and 77% in this study, compared with 79%, 82% and 77% for F≥2, F≥3 and F4 in the original article [14]. Because low cutoffs were originally described to rule out significant fibrosis, the relevant measure at a low cutoff is the NPV. The NPV of the HALF index was 100% at a cutoff value < 2.22, higher than in the original article [12] (NPV 94.7%). Because high cutoffs were described to confirm significant fibrosis, the relevant measure at a high cutoff is the PPV. The PPV of the HALF index was 95%, similar to that in the original article [12] (PPV 100%). These results demonstrate that the HALF index has good reproducibility. Compared with the original article [8] (NPV 86.1% at a value <3.0, PPV 91.1% at a value >8.7), the Zeng model also showed good reproducibility. In contrast, the NPV of the S index at the low cutoff value (less than 0.1) was only 56% in this study, lower than in the original article [13] (NPV 65.57%). The NPV of FibroTest at the cutoff value of 0.1 was 59% in this study compared with 100% in the original article [6], which indicates that the reproducibility of FibroTest is not satisfactory.
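The reproducibility check described above amounts to recomputing accuracy, NPV and PPV at fixed published cutoffs. The sketch below shows both the single-cutoff case and the dual-cutoff case; the cutoffs 3.0 and 8.7 are the Zeng model values quoted above, while the scores, labels, the single cutoff of 7.0 and all function names are hypothetical.

```python
def accuracy_at_cutoff(scores, has_endpoint, cutoff):
    """Fraction of patients correctly classified when score >= cutoff means positive."""
    correct = sum((s >= cutoff) == bool(y) for s, y in zip(scores, has_endpoint))
    return correct / len(scores)


def npv_below(scores, has_endpoint, low_cutoff):
    """NPV for ruling out: share of patients with score < low_cutoff lacking the endpoint."""
    below = [y for s, y in zip(scores, has_endpoint) if s < low_cutoff]
    if not below:
        return None  # nobody falls under the low cutoff
    return sum(1 for y in below if not y) / len(below)


def ppv_above(scores, has_endpoint, high_cutoff):
    """PPV for ruling in: share of patients with score > high_cutoff having the endpoint."""
    above = [y for s, y in zip(scores, has_endpoint) if s > high_cutoff]
    if not above:
        return None  # nobody exceeds the high cutoff
    return sum(1 for y in above if y) / len(above)


if __name__ == "__main__":
    # Hypothetical model scores and F>=2 labels, for illustration only.
    scores = [1.2, 2.5, 2.9, 4.0, 6.5, 8.9, 9.4, 10.1]
    f2_or_more = [0, 0, 1, 0, 1, 1, 1, 0]
    print("accuracy at 7.0:", accuracy_at_cutoff(scores, f2_or_more, 7.0))
    print("NPV below 3.0:", npv_below(scores, f2_or_more, 3.0))
    print("PPV above 8.7:", ppv_above(scores, f2_or_more, 8.7))
```

Because NPV and PPV depend on the prevalence of the endpoint, differences from the original articles can reflect a different case mix as well as the behaviour of the model itself.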

Discussion
In this study, we compared and verified the diagnostic performance and reproducibility of ten noninvasive fibrosis models in the same group of CHB patients. Among the ten models, six (HALF index, FibroScan, Zeng model, Youyi model, S index and FibroTest) consistently showed an AUROC over 0.70 in diagnosing significant fibrosis, advanced fibrosis and cirrhosis; the HALF index and FibroScan in particular were more accurate than the other models at every fibrosis stage. Four models (HALF index, FibroScan, Zeng model and Youyi model) showed better reproducibility than the other two models assessed at the published cutoffs (S index and FibroTest).
The first important finding of this study was that models incorporating imaging techniques or direct serum markers (the HALF index, FibroScan, Zeng model and Youyi model) had better diagnostic values in CHB patients than those containing only indirect serum markers (the S index, Hui model, APAG, APRI and FIB-4), which is consistent with previous validation studies [13,23]. Secondly, noninvasive models derived from CHC are not useful tools for assessing liver fibrosis in patients with CHB, because the pathogenesis and histological progression of fibrosis differ between CHB and CHC. Thirdly, LSM alone showed diagnostic performance similar to that of complex models that combine LSM with serum parameters (such as the HALF index).
This study had two unique features. First, most noninvasive models derived from CHB patients have been validated only in an internal validation cohort drawn from the same population as the training cohort, or in a single-center external cohort [23], whereas our study validated ten reported noninvasive models (seven derived from CHB patients and three from CHC patients) in a large multicenter cohort from different parts of China. The large number and geographic diversity of patients played a key role in validating the reproducibility. Second, we recruited patients who underwent not only blood tests but also LSM, which allowed us to compare the diagnostic performance of these two major categories of noninvasive diagnostic modalities in Asian populations with CHB [24][25][26].
This study did have some limitations. Firstly, we recruited only patients with liver biopsy specimens at least 1 cm in length and containing 8 complete portal tracts, which is sufficient for a primary assessment of liver fibrosis but does not fulfill the more stringent criterion recommended by the AASLD guideline [3]. Secondly, 39% of patients in our study had fibrosis stage < F2 and fewer patients (15%) had cirrhosis, which may have led to a selection bias. Thirdly, because patients were enrolled from 7 different hospitals, FibroScan was performed by different operators and was potentially subject to interobserver variability. To reduce this variability, we set up standard operating procedures for FibroScan when designing the study, and all the operators were trained together before the study began. Finally, this is a cross-sectional study. Whether these models can be used to assess treatment response and long-term clinical outcome in CHB patients requires prospective cohort studies.
In conclusion, in a large cohort of CHB patients we found that six noninvasive models (HALF index, FibroScan, Zeng model, Youyi model, S index and FibroTest) had good diagnostic accuracy, and four of them (HALF index, FibroScan, Zeng model and Youyi model) had good reproducibility for evaluating liver fibrosis in CHB patients.

Table. The AUROCs of the respective models were compared with each other, and the P values are listed.