Prospective comparison among transient elastography, supersonic shear imaging, and ARFI imaging for predicting fibrosis in nonalcoholic fatty liver disease

The diagnostic performance of supersonic shear imaging (SSI) in comparison with those of transient elastography (TE) and acoustic radiation force impulse imaging (ARFI) for staging fibrosis in nonalcoholic fatty liver disease (NAFLD) patients has not been fully assessed, especially in Asian populations with relatively lean NAFLD compared to white populations. Thus, we focused on comparing the diagnostic performances of TE, ARFI, and SSI for staging fibrosis in a head-to-head manner, and identifying the clinical, anthropometric, biochemical, and histological features which might affect liver stiffness measurement (LSM) in our prospective biopsy-proven NAFLD cohort. In this study, ninety-four patients with biopsy-proven NAFLD were included prospectively. Liver stiffness was measured using TE, SSI, and ARFI within 1 month of liver biopsy. The diagnostic performance for staging fibrosis was assessed using receiver operating characteristic (ROC) analysis. Anthropometric data were evaluated as covariates influencing LSM by regression analyses. Liver stiffness correlated with fibrosis stage (p < 0.05); the area under the ROC curve of TE (kPa), SSI (kPa), and ARFI (m/s) were as follows: 0.757, 0.759, and 0.657 for significant fibrosis and 0.870, 0.809, and 0.873 for advanced fibrosis. Anthropometric traits were significant confounders affecting SSI, while serum liver injury markers significantly confounded TE and ARFI. In conclusion, the LSM methods had similar diagnostic performance for staging fibrosis in patients with NAFLD. Pre-LSM anthropometric evaluation may help predict the reliability of SSI.

Introduction
Nonalcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease with an estimated 30% prevalence for NAFLD and 3-5% for nonalcoholic steatohepatitis (NASH) in the United States [1]. Moreover, NASH may eventually lead to hepatic complications including liver cirrhosis and hepatocellular carcinoma as well as extrahepatic comorbidities such as diabetes mellitus (DM) and cardiovascular disease. According to a recent report on the third National Health and Nutritional Examination Survey, advanced fibrosis diagnosed on the basis of noninvasive markers in NAFLD was associated with increased overall mortality, including deaths due to cardiovascular, malignancy-related, and hepatic complications [2]. Thus, the accurate assessment of liver fibrosis is critical to predict long-term outcomes and to determine treatment strategy in subjects with NAFLD [3][4][5].
Sonoelastography, an alternative tool for assessing liver fibrosis, is noninvasive, easily accessible, and available at the point of care, which allows physicians to predict the risk of progression and decide on treatment plans [6]. Transient elastography (TE) is the most widely used technique for the evaluation of chronic liver disease and associated fibrosis. However, TE does not produce a real-time sonographic image of the liver. Moreover, TE is not feasible in some patients with morbid obesity or ascites, such as those with severely obese NAFLD or decompensated NASH cirrhosis. Acoustic radiation force impulse imaging (ARFI) and supersonic shear imaging (SSI) have several advantages over TE; they are fully integrated into a conventional ultrasound system and, thus, can be performed during routine liver sonographic examination. Both ARFI and SSI can also be used to select the examination points in the liver and to evaluate heterogeneous liver fibrosis or focal liver lesions such as liver tumors [7]. Moreover, we have already evaluated the intra- and inter-observer variations of ARFI and SSI, which indicated excellent agreement [8].
While the diagnostic performances of TE and ARFI for staging fibrosis in NAFLD patients have been studied [9], SSI has rarely been investigated. A previous study evaluated the diagnostic performance of SSI in comparison with those of TE and ARFI for staging fibrosis in chronic liver disease patients with heterogeneous etiologies [10], and another study compared the diagnostic performances between TE, ARFI, and SSI in white populations who had generally obese years old, (ii) bright echogenic liver on ultrasound scanning (increased liver/kidney echogenicity and posterior attenuation), and (iii) unexplained high alanine aminotransferase (ALT) levels above the reference range within the past 6 months. The exclusion criteria were as follows: (i) history of hepatitis B or C virus infection, (ii) history of autoimmune hepatitis, (iii) history of drug-induced liver injury or steatosis, (iv) Wilson disease or hemochromatosis, (v) habitual excessive alcohol consumption (male > 30 g/day, female > 20 g/day) assessed using the validated Korean version of the Alcohol Use Disorder Identification Test (AUDIT-K) questionnaire during the study period, and (vi) a diagnosis of malignancy within the past year. Of the eligible subjects, those with at least two of the following risk factors underwent liver biopsy: DM, central obesity (waist circumference ≥ 90 cm for men or ≥ 80 cm for women), a high level of triglyceride (≥ 150 mg/dL), a low level of high-density lipoprotein (HDL)-cholesterol (< 40 mg/dL for men or < 50 mg/dL for women), presence of insulin resistance, hypertension, or clinically suspected NASH or fibrosis [14]. A total of 112 subjects with radiologic evidence of hepatic steatosis were initially approached.
Among them, six participants were excluded due to concurrent hepatitis B (n = 5) or a history of heavy alcohol consumption (n = 1), eight subjects did not undergo liver biopsy due to failures to fulfill the eligibility criteria for liver biopsy, and four subjects did not meet the adequate biopsy specimen criteria, yielding a drop-out rate of 16%. Thus, a total of 94 subjects with biopsy-proven NAFLD were finally included in this prospective cohort study (Fig 1). This study was carried out in accordance with the 1975 Helsinki Declaration and was approved by Seoul Metropolitan Government-Seoul National University (SMG-SNU) Boramae Medical Center Institutional Review Board (26-2015-52). Written informed consent was obtained from each study participant in the study cohort.

Biochemical and anthropometric evaluation
The clinical, anthropometric, and biochemical data of the study population were obtained on the same day as the liver biopsy. The plasma levels of aspartate aminotransferase (AST), ALT, gamma-glutamyltransferase (GGT), total cholesterol, triglyceride, HDL-cholesterol, low-density lipoprotein (LDL)-cholesterol, free fatty acid, total bilirubin, albumin, glucose, glycosylated hemoglobin (HbA1c), insulin, C-peptide, and high sensitivity C-reactive protein (hs-CRP) were measured using 12-hour overnight fasting blood samples. Platelet count and international normalized ratio of prothrombin time (PT-INR) were tested using whole blood. Insulin resistance was determined using the homeostasis model assessment of insulin resistance (HOMA-IR). The presence of insulin resistance was defined as HOMA-IR ≥ 2 as described elsewhere [15,16]. The noninvasive serum fibrosis tests, such as AST-to-ALT ratio (AAR), AST-to-platelet ratio index (APRI), and fibrosis 4 index (FIB-4), were calculated from baseline demographic and biochemical data as described elsewhere [17][18][19]. Anthropometric data including total body muscle and fat mass (kg) were collected using the InBody 330 body composition analyzer (InBody, Seoul, Korea). Waist circumference (WC) was also measured. Abdominal total adipose tissue (TAT), visceral adipose tissue (VAT), and subcutaneous adipose tissue (SAT) areas were measured using non-contrast computed tomography (CT) within 1 month of percutaneous liver biopsy. The subjects were examined with a 128-detector CT scanner (Ingenuity CT, Philips Medical Systems, Cleveland, OH, USA) in the supine position. The TAT, VAT, and SAT areas were measured at the level of the umbilicus with commercially available CT software (Rapidia 2.8; INFINITT, Seoul, Korea) that electronically determined the adipose tissue area by setting the attenuation values for a region of interest (ROI) within a range of −250 to −50 Hounsfield units.
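The serum indices named above can be illustrated with a minimal sketch. It assumes the commonly published formulas for HOMA-IR, AAR, APRI, and FIB-4; the text only cites them as "described elsewhere", so the exact variants used in the study are not confirmed here. The function names, the unit conventions, and the AST upper limit of normal passed to `apri` are illustrative assumptions.

```python
import math


def homa_ir(fasting_insulin_uU_ml, fasting_glucose_mg_dl):
    """HOMA-IR with glucose in mg/dL (assumed units): insulin x glucose / 405."""
    return fasting_insulin_uU_ml * fasting_glucose_mg_dl / 405.0


def is_insulin_resistant(homa, threshold=2.0):
    """Insulin resistance as defined in the text: HOMA-IR >= 2."""
    return homa >= threshold


def aar(ast_u_l, alt_u_l):
    """AST-to-ALT ratio."""
    return ast_u_l / alt_u_l


def apri(ast_u_l, ast_upper_limit_u_l, platelets_10e9_l):
    """AST-to-platelet ratio index; the AST upper limit is laboratory-specific."""
    return (ast_u_l / ast_upper_limit_u_l) / platelets_10e9_l * 100.0


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 index: age x AST / (platelets x sqrt(ALT))."""
    return age_years * ast_u_l / (platelets_10e9_l * math.sqrt(alt_u_l))


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    print(round(homa_ir(10, 100), 2))
    print(apri(40, 40, 200))
    print(fib4(50, 40, 64, 200))
```

Keeping each index as a separate pure function makes the unit assumptions explicit at the call site, which matters because HOMA-IR uses a different constant when glucose is reported in mmol/L.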

Liver stiffness measurement
Simultaneously, TE, ARFI, and SSI were conducted for LSM after at least 2-hour fast and within 1 month of percutaneous liver biopsy.
TE. Transient elastography (TE, Echosens, Paris, France) was performed with the FibroScan® system using the M probe. The examinations were carried out by a well-trained radiologic technician (with an experience of more than 1,000 cases of TE LSM) blinded to ARFI and SSI results and histological data. As previously described [11,20], poorly reliable or unreliable data were defined as an interquartile range (IQR) per median of LSM (IQR/M) > 0.3 with a median LSM ≥ 7.1 kPa, and those unreliable results were excluded from analysis.
ARFI. The ARFI imaging was conducted by two experienced radiologists (H.W. with 13 years of experience in abdominal ultrasound; M.S.L. with 10 years of experience in abdominal ultrasound) blinded to TE and SSI results and histological data, using the Acuson S2000 (Siemens AG, Erlangen, Germany). Since the inter-observer agreement of ARFI imaging proved highly reliable (0.927-0.958) in our center [15], repeat measurements between the two radiologists were not performed. The patients were required to be supine with their right arms raised overhead to widen the intercostal acoustic window. An ARFI-integrated convex probe (5C1) was positioned in the intercostal space perpendicular to the liver capsule to properly visualize the right lobe of the liver in the optimal acoustic window. The 10 × 5 mm ROI cursor was positioned in an area of liver parenchyma deeper than 2 cm from the liver capsule and free from large blood vessels, reverberation artifacts, and acoustic shadowing. Ten valid measurements at the same area were obtained from each patient during breath-hold in the late expiratory phase, and the median value expressed in meters per second (m/s) was regarded as the representative value of liver stiffness. A measurement was considered unreliable when the IQR/M was > 0.3 with a median LSM > 1.5 m/s [21].
SSI. The SSI system (AiXplorer, Aix-en-Provence, France) with an SSI-integrated convex probe (SC6-1) was used for measuring shear-wave speed. SSI was performed by one of the authors (W.K., with 12 years of experience in liver ultrasound), who was not involved in the ARFI measurements and was blinded to TE and ARFI results and histological data. Patient position, probe location, number of measurements, breath-holding, general rules of ROI cursor (Q-Box™) positioning, and the definition of unreliable measurement for SSI were similar to those for ARFI [21]. To acquire valid measurements, the operator placed a 25 × 20 mm shear-wave imaging (SWI) box over vessel- and reverberation-free liver parenchyma, waited at least three seconds for the elastogram to stabilize, and finally placed the 15 mm diameter Q-Box in an area of relatively uniform elasticity, which was seen as a uniformly colored area in the SWI box. For each patient, ten consecutive LSMs were obtained within one Q-Box [22].
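The reliability criteria described above (IQR/M > 0.3 combined with a median above a method-specific threshold) can be sketched as follows. This is only an illustration: the quartile convention (`method="inclusive"`) is an assumption, since the text does not state how the IQR was computed, and the function names are hypothetical.

```python
import statistics


def iqr_over_median(measurements):
    """Interquartile range divided by the median of a set of valid LSMs."""
    # Quartile interpolation method is an assumption; devices may differ.
    q1, _, q3 = statistics.quantiles(measurements, n=4, method="inclusive")
    return (q3 - q1) / statistics.median(measurements)


def is_unreliable(measurements, median_threshold, inclusive=True, ratio_limit=0.3):
    """True when IQR/M exceeds ratio_limit and the median passes the threshold.

    TE in the text:   IQR/M > 0.3 with median >= 7.1 kPa (inclusive=True).
    ARFI in the text: IQR/M > 0.3 with median >  1.5 m/s (inclusive=False).
    """
    median = statistics.median(measurements)
    if inclusive:
        high_median = median >= median_threshold
    else:
        high_median = median > median_threshold
    return high_median and iqr_over_median(measurements) > ratio_limit


if __name__ == "__main__":
    # Hypothetical TE readings in kPa with wide dispersion.
    te_readings = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
    print(is_unreliable(te_readings, median_threshold=7.1))
```

Note that under this definition a widely dispersed series is still accepted when the median is low, which is why the threshold on the median is passed explicitly rather than hard-coded.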

Liver histology
Liver biopsy as a reference standard for suspected NAFLD was performed for all study participants. Every biopsy was performed under ultrasound guidance, in the same or a similar location as the elastography measurements. The adequate liver specimen criteria were as follows: (i) ≥ 20 mm in length and (ii) ≥ eight portal tracts [23]. Liver specimens were fixed with 4% formalin and embedded in paraffin. All liver specimens stained with hematoxylin-eosin and Masson's trichrome were analyzed by an experienced pathologist who was blinded to clinical data. Biopsy-proven NAFLD was defined as the presence of ≥ 5% steatosis [24]. Fibrosis was staged according to a 5-point scale: F0, no fibrosis; F1, perisinusoidal or portal; F2, perisinusoidal and portal/periportal; F3, septal or bridging fibrosis; and F4, cirrhosis [25]. Significant fibrosis was defined as ≥ F2 and advanced fibrosis as ≥ F3. The NAFLD activity score (NAS), ranging from 0 to 8, was calculated as the sum of the grades of steatosis (0-3), lobular inflammation (0-3), and hepatocellular ballooning (0-2) [25], graded according to Brunt's criteria [24,26].
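The histological definitions above translate into simple rules. The sketch below, with hypothetical function names, encodes the NAS as the unweighted sum of its three component grades and derives the binary fibrosis endpoints used in the diagnostic analyses (≥ F2, ≥ F3, and F4).

```python
def nafld_activity_score(steatosis, lobular_inflammation, ballooning):
    """NAS (0-8): steatosis (0-3) + lobular inflammation (0-3) + ballooning (0-2)."""
    if not 0 <= steatosis <= 3:
        raise ValueError("steatosis grade must be 0-3")
    if not 0 <= lobular_inflammation <= 3:
        raise ValueError("lobular inflammation grade must be 0-3")
    if not 0 <= ballooning <= 2:
        raise ValueError("ballooning grade must be 0-2")
    return steatosis + lobular_inflammation + ballooning


def fibrosis_endpoints(stage):
    """Binary endpoints from a fibrosis stage F0-F4 given as an integer 0-4."""
    if stage not in (0, 1, 2, 3, 4):
        raise ValueError("fibrosis stage must be an integer from 0 to 4")
    return {
        "significant": stage >= 2,  # >= F2
        "advanced": stage >= 3,     # >= F3
        "cirrhosis": stage == 4,    # F4
    }


if __name__ == "__main__":
    print(nafld_activity_score(2, 1, 1))
    print(fibrosis_endpoints(3))
```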

Statistical analysis
Quantitative data were compared using the paired t-test or Wilcoxon signed rank test according to the data distribution. Proportions were compared using the chi-square test. The clinical, anthropometric, and biochemical parameters that affected the failure or unreliability of LSM were analyzed using logistic regression analysis. Spearman correlation analysis was used to evaluate the relationship between individual LSM values and the clinical, anthropometric, biochemical, and histological parameters. Successive multiple linear regression analysis was conducted to identify the independent confounders among the multiple parameters that were significantly associated with each LSM on correlation analysis (p < 0.05).
Receiver operating characteristic (ROC) curve analysis was performed to evaluate the diagnostic performances of TE, ARFI, and SSI for staging fibrosis. The areas under the ROC curves (AUROCs) and the 95% confidence intervals (CI) of the AUROCs were calculated for the detection of significant fibrosis, advanced fibrosis, and cirrhosis. Sensitivity, specificity, positive predictive values (PPVs), and negative predictive values (NPVs) were calculated from the ROC curves. The optimal cut-off of each LSM method was based on the highest Youden's index. The AUROCs for staging fibrosis were compared among TE, ARFI, and SSI using the DeLong test. These statistical analyses were performed using the IBM SPSS Statistics Ver. 20.0 (IBM Inc., Armonk, NY, USA) and MedCalc software ver. 16
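To make the ROC-based quantities concrete, the following is a minimal pure-Python sketch, not the SPSS or MedCalc procedure used in the study. It assumes that a higher stiffness value indicates disease, computes the AUROC as the Mann-Whitney probability, and selects the cut-off that maximizes Youden's index (sensitivity + specificity − 1). The function names and the toy data are hypothetical, and the DeLong comparison and confidence intervals are omitted.

```python
def auroc(scores, labels):
    """AUROC as P(score of a positive > score of a negative), ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a predictive value is undefined.
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, PPV, and NPV for the rule 'score >= cutoff'."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def youden_cutoff(scores, labels):
    """Cut-off with the highest Youden's index; the lowest such cut-off on ties."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        m = diagnostic_metrics(scores, labels, cutoff)
        j = m["sensitivity"] + m["specificity"] - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Toy stiffness values (kPa) and biopsy labels (1 = fibrosis endpoint present).
    stiffness = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    endpoint = [0, 0, 1, 0, 1, 1]
    print(auroc(stiffness, endpoint))
    print(youden_cutoff(stiffness, endpoint))
```

Because PPV and NPV depend on the prevalence of the endpoint in the sample, values obtained this way are specific to the cohort in which the cut-off was derived.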

Baseline clinical and histological characteristics
The mean body mass index (BMI) of 94 included patients was 27

Parameters affecting the failure or unreliability of LSM using TE, ARFI, and SSI
The proportions of the unreliable or failed data as assessed by all the LSM methods are presented in Table 2. The unreliability or failure rate of ARFI (11.7%) was significantly lower than that of SSI (26.6%) (p = 0.01). However, there was no significant difference in the unreliability or failure rate of LSM between TE and ARFI or between TE and SSI. Regarding the reliability of LSM in each stage of fibrosis, ARFI showed a significantly higher reliability rate than TE and SSI in NAFLD patients with no or mild fibrosis (F0-1), and than SSI in those with F2. Otherwise, all the LSM methods showed similar reliability rates in patients with advanced fibrosis or cirrhosis.
The logistic regression analysis demonstrated that the reliability of TE was affected by diverse clinical and biochemical parameters including age, hypertension (especially systolic blood pressure), LDL-cholesterol, liver function-related markers (AST, albumin, and platelet count), and DM-related markers (insulin and HbA1c). The reliability of ARFI was influenced by age and WC. The reliability of SSI was subject to anthropometric traits such as body fat mass, BMI, TAT, SAT, and the VAT-to-SAT ratio (VSR), while it was not significantly affected by age or liver function-related serum markers (Table 3).

Correlation between the two shear-wave-based elastographies
There was a weak positive correlation (Spearman's rho = 0.370, p < 0.001) between ARFI and SSI. A scatter diagram (Fig 2) also showed that the shear-wave velocity of SSI was generally slightly higher than that of ARFI, as reported in the previous study [8].

Diagnostic performance of LSM using TE, ARFI, and SSI for staging fibrosis
The diagnostic performances presented as AUROC of TE (kPa), SSI (both m/s and kPa), and ARFI (m/s) for staging significant fibrosis, advanced fibrosis, and cirrhosis are presented in Table 4. In terms of the AUROC analysis, all the LSM methods showed good to excellent performances for staging advanced fibrosis and cirrhosis, whereas they exhibited fair performance for significant fibrosis. The optimal cut-offs according to the highest Youden's index and 90% fixed specificity, and associated sensitivity, specificity, PPV, and NPV are presented in Table 4.
Of the individual LSM methods, SSI showed the best diagnostic performance for significant fibrosis (Table 4, Fig 3A). ARFI and TE showed better diagnostic performances than SSI for advanced fibrosis (Table 4, Fig 3B). ARFI showed the best diagnostic performance for cirrhosis (Table 4, Fig 3C). However, the diagnostic performances for staging fibrosis were not significantly different between TE, SSI, and ARFI (Table 5 and Fig 3).
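Because SSI results are reported in both m/s and kPa, it may help to show how the two units are commonly related. The sketch below assumes the standard elastography conversion E = 3ρc² for incompressible, purely elastic, isotropic soft tissue with a density of 1,000 kg/m³; the text does not state the conversion applied by the scanner, so this relation and the function names are assumptions.

```python
import math


def shear_speed_to_kpa(speed_m_s, density_kg_m3=1000.0):
    """Young's modulus in kPa from shear-wave speed, assuming E = 3 * rho * c^2."""
    return 3.0 * density_kg_m3 * speed_m_s ** 2 / 1000.0


def kpa_to_shear_speed(young_modulus_kpa, density_kg_m3=1000.0):
    """Inverse conversion: shear-wave speed in m/s from Young's modulus in kPa."""
    return math.sqrt(young_modulus_kpa * 1000.0 / (3.0 * density_kg_m3))


if __name__ == "__main__":
    print(shear_speed_to_kpa(1.5))
    print(kpa_to_shear_speed(6.75))
```

Under this assumption the conversion is monotonic, so ranking patients by m/s or by kPa gives the same ordering.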

Comparison of LSM across fibrosis stage between TE, SSI, and ARFI
We compared LSM across the different fibrosis stages between TE, SSI, and ARFI. Liver stiffness as measured by TE, ARFI, and SSI tended to increase with fibrosis stage and showed a significantly positive correlation with fibrosis stage on Spearman correlation analysis (Table 6).

Confounding factors influencing LSM
Fibrosis stage was significantly associated with LSM, irrespective of the LSM methods (r = 0.416-0.586, p < 0.001; Table 7). In multivariate analysis using variables with p < 0.05 on Spearman's correlation analysis as independent variables, TE LSM had a possible causal relationship with fibrosis stage and serum albumin (R² = 0.262, p < 0.001); ARFI LSM showed a relationship with fibrosis stage and platelet count (R² = 0.502, p < 0.001); and SSI LSM had a relationship with WC and fibrosis stage (R² = 0.368, p < 0.001) (Table 7).
Serum fibrosis indices (HOMA-IR, FIB-4, and APRI) were significantly associated with liver stiffness as measured by all the LSM methods; however, AAR did not correlate with liver stiffness as measured by any LSM method (Table 7). The grade of steatosis had no impact on LSM, regardless of the LSM methods.

Discussion
In this prospective biopsy-proven NAFLD cohort study, all three LSM methods (TE, SSI, and ARFI) were compared. Additionally, clinical, anthropometric, and biochemical confounders, which might affect the reliability of LSM or be correlated with LSM, were identified separately according to the different LSM methods on regression analyses. Regarding the diagnostic performance for staging fibrosis, our results were similar to those from previous studies on TE LSM [27][28][29] as well as SSI and ARFI LSM [9,11,30]. Moreover, according to recent meta-analyses, the cut-offs for differentiating significant fibrosis, advanced fibrosis, and cirrhosis in the current study were also similar to those in the previous studies: TE ( [9]. According to recent EASL-ALEH Clinical Practice Guidelines, the cut-offs of the TE exam for differentiating ≥ F2 and F4 in NAFLD patients were as follows: 6.6-7.8 kPa for ≥ F2, and 10.3-22.3 kPa for F4 [32]. Although no cut-off was presented specifically for NAFLD, the guidelines gave the cut-offs of ARFI for differentiating ≥ F2 and F4 in chronic liver disease patients as follows: 1.22-1.63 m/s for ≥ F2, and 1.71-4.24 m/s for F4 [32]. The aforementioned cut-offs were also similar to those in our study. For SSI, our study showed higher cut-offs than the previous studies (6.3 kPa for ≥ F2, 8.3 kPa for ≥ F3, and 10.4 kPa for F4 [11]; 7.1 kPa for ≥ F2, 9.2 kPa for ≥ F3, and 11.5 kPa for F4 [12]). This might be partially attributable to the relatively low proportion of advanced fibrosis in our study compared to the previous study (27.7% vs. 43.3%) [11]. Other characteristics of our study cohort, such as an exclusively Asian population and a relatively lower proportion of obesity, could also have contributed to the slightly higher cut-off values for SSI.
However, specificities for differentiating each stage of fibrosis using SSI in the current study were not significantly different from those in the previous study (55.3%, 61.2%, and 78.0% in the current study versus 50%, 71%, and 72% in the previous study, respectively) [11].
From a clinical perspective, NAFLD patients with suspected advanced fibrosis usually require liver biopsy for accurate diagnostic and therapeutic decisions. NAFLD patients with insulin resistance and/or metabolic risk factors are at risk for advanced liver disease. Thus, sonoelastography as a tool for differentiating fibrosis stage should be sensitive for detecting advanced fibrosis to minimize undetected cases of high-risk patients in whom liver biopsy is indicated; additionally, it should be specific for detecting significant fibrosis to obviate unnecessary biopsy and its related morbidity. Unfortunately, in contrast to advanced fibrosis and cirrhosis, no LSM method achieved sufficient diagnostic performance for detecting mild to significant fibrosis in the current and previous studies [9,11,29]. However, based on the current study, TE may occasionally be preferred to ARFI or SSI depending on the clinical situation, such as the degree of liver injury, liver function, and anthropometric features.
The aforementioned guide for the adoption of sonoelastography was also supported by the variable reliability rates of the LSM methods according to fibrosis stage. ARFI showed a significantly higher reliability rate of LSM in patients with no to mild fibrosis (F0-1) compared to SSI and TE, and in those with F2 compared to SSI. However, the reliability rate of ARFI dropped to 57.1% in patients with cirrhosis, which was lower than the reliability rates of TE (85.7%) and SSI (71.4%), although the difference was not statistically significant. The smaller ROI box for ARFI than for TE or SSI and the heterogeneous distribution pattern of collagen deposition in the cirrhotic liver may be plausible reasons for this decline in the reliability rate of ARFI in cirrhotic patients. Regarding the shear-wave detection methods, TE generates shear wave in a direction perpendicular to skin, and detects it in the same longitudinal direction. However, in the case of both ARFI and SSI, shear wave is generated from focused ultrasound and proceeds transversely. ARFI uses multiple detection pulses perpendicular to shear wave, while SSI captures shear wave on 2D imaging using ultrafast imaging techniques. Thus, as the depth and the tissue property between skin and the target lesion may directly affect the shear wave of TE, TE seems to have lower reliability even in the mild fibrosis stage. The reason ARFI showed more reliable data in the mild fibrosis stage than SSI is unclear; however, it might be attributable to the high frame rate (350-4,000 Hz) of ultrafast imaging with SSI, which is rather insufficient to visualize the subtle change of target liver parenchyma with the slow shear wave from the mild fibrotic tissue.
The logistic regression analysis demonstrated that the reliability of TE might be easily affected by various parameters. DM-related serum markers (insulin and HbA1c) were significantly associated with the reliability of TE, similar to the previous study [33]. Although the previous study showed that the diagnosis of DM was associated with the failure or unreliability rate of TE, our study demonstrated that DM-related serum markers, rather than DM itself, were more significantly associated with the failure or unreliability rate of TE.
On the other hand, the reliability of SSI was significantly influenced by diverse anthropometric parameters. Mean BMI and SAT in the reliable SSI LSM group (26.0 kg/m² and 180.94 cm³) were significantly lower than those in the unreliable or failure group (28.1 kg/m² and 230.55 cm³). For TE and ARFI, WC was the only anthropometric trait that was significantly associated with the reliability of LSM. Therefore, the failure or unreliability of LSM was significantly associated with anthropometric data in all the LSM methods, as evidenced by previous studies [9-11,29,30,33]. Anthropometric evaluation prior to LSM may reduce the failure or unreliability rate of all the LSM methods.
Among the confounders influencing LSM, anthropometric parameters were the most significant confounders affecting SSI LSM, while serum markers of liver injury could confound TE and ARFI LSM. However, serum insulin levels and the homeostasis model assessment of insulin resistance, which are implicated in the main pathogenesis of NAFLD, might invariably influence all the LSM methods.
In the current study, lobular inflammation on histological examination was associated with TE and ARFI LSM, while no association was observed with SSI LSM, differing from the previous study [11]. That study also showed a weak correlation of steatosis grade and NAS with TE LSM but no correlation with ARFI and SSI LSM. In the current study, steatosis severity and NAS were not significantly associated with liver stiffness as measured by any LSM method. Both TE and ARFI use the separate pulse-echo ultrasound acquisition to detect the similar shear-wave speed, while SSI uses ultrafast ultrasound imaging to detect and image shear wave. Consequently, TE and ARFI may share similar confounders affecting LSM such as liver inflammation and liver function tests.
Our study had the following inherent limitations: a relatively small sample size leading to the risk of beta error, the relative paucity of severely obese NAFLD patients in terms of BMI, and the cross-sectional study design. Only the M probe was used for the TE exam, regardless of patient habitus. According to a recent study [34], fibrosis may occasionally be misclassified when the fasting period before performing SSI and TE is less than three hours. Thus, the 2-hour fasting time could also be a substantial limitation of our study.
In conclusion, the diagnostic performances of TE, ARFI, and SSI for staging fibrosis in NAFLD patients were not statistically different in any fibrosis stage. Pre-LSM anthropometry can aid in predicting the failure or unreliability of SSI LSM.

Supporting information
S1 Dataset. All database of the included study population. (XLS)