Incremental Value of MR Cholangiopancreatography in Diagnosis of Biliary Atresia

Purpose: To evaluate the incremental value of a combination of magnetic resonance cholangiopancreatography (MRCP) and ultrasonography (US), compared to US alone, for diagnosing biliary atresia (BA) in neonates and young infants with cholestasis.
Materials and Methods: The institutional review board approved this retrospective study. Both US and MRCP were performed in 64 neonates and young infants with BA (n = 41) or without BA (non-BA) (n = 23). Two observers independently reviewed the US alone set and the combined US and MRCP set, and graded them using a five-point scale. Diagnostic performance was compared using pairwise comparison of the receiver operating characteristic (ROC) curves. The sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value were assessed.
Results: The diagnostic performance (the area under the ROC curve [Az]) for diagnosing BA improved significantly after additional review of MRCP images; Az improved from 0.688 to 0.901 (P = 0.015) for observer 1 and from 0.676 to 0.901 (P = 0.011) for observer 2. The accuracy of MRCP combined with US (observer 1, 95% [61/64]; observer 2, 92% [59/64]) and the PPV (observer 1, 95% [40/42]; observer 2, 91% [40/44]) were significantly higher than those of US alone for both observers (accuracy: observer 1, 73% [47/64], P = 0.003; observer 2, 72% [46/64], P = 0.004; PPV: observer 1, 76% [35/46], P = 0.016; observer 2, 76% [34/45], P = 0.013). Interobserver agreement of confidence levels was good for US alone (κ = 0.658, P < 0.001) and excellent for the combined set of US and MRCP (κ = 0.929, P < 0.001).
Conclusion: Better diagnostic performance was achieved with the combination of US and MRCP than with US alone for the evaluation of BA in neonates and young infants with cholestasis.


Introduction
Early diagnosis of biliary atresia (BA) is of great clinical importance because timely surgical intervention can restore bile flow and prevent worsening of liver disease [1]. Avoiding unnecessary laparotomy in patients with other causes of neonatal cholestasis is also essential, because laparotomy may contribute to morbidity and a considerable proportion of these patients may not have BA [2,3].
Many investigators have endeavored to distinguish BA from non-BA patients without the use of laparotomy. The preoperative diagnosis of BA using multiple ultrasonography (US) parameters has met with variable success [4][5][6][7]. The triangular cord sign, though helpful in the diagnosis, is not present in every BA patient [4,8]. Moreover, biochemical and histopathologic results may overlap between BA and other causes of neonatal cholestasis [2]. Magnetic resonance cholangiopancreatography (MRCP) is another useful and non-invasive examination for biliary disease, and it offers visualization of the extrahepatic biliary tree, including the confluence of the right and left hepatic ducts. A literature review on the performance of MRCP for the diagnosis of BA indicates an accuracy of 71-100% [9][10][11] and a sensitivity of 90-100% [9,11,12]. Nevertheless, no single radiologic investigation allows BA to be diagnosed with certainty [3,13]. A multidisciplinary approach is required to discriminate BA from non-BA in neonates and young infants.
The aim of this study was to evaluate the incremental value of a combination of MRCP and US, compared to US alone, for diagnosing BA in neonates and young infants with cholestasis.

Study population
This retrospective study was approved by the Samsung Medical Center institutional review board, and the requirement for informed consent was waived. Patient information was anonymized and de-identified prior to analysis.
We searched our institutional database for abdominal US and MRCP examinations performed between January 2000 and January 2016 using the search terms "BA," "neonatal hepatitis," "cholestasis," "jaundice," and "clay-colored stool." The search yielded 117 neonates and young infants. Of these, 51 patients with choledochal cysts and 5 patients with Alagille syndrome were excluded, leaving 61 consecutive neonates and young infants. The BA group comprised 41 patients (male-to-female ratio, 12:29) in whom BA was confirmed by surgical cholangiography or histologic analysis of the biliary remnants. Twenty children were diagnosed with clinically suspected neonatal hepatitis on the basis of clinical and laboratory improvement during the follow-up period (n = 19) or surgical cholangiography (n = 1). Another 3 patients diagnosed with neonatal hepatitis were recruited from Seoul National University Hospital between 2010 and 2015. In total, 23 children (male-to-female ratio, 10:13) were included as controls (non-BA group).
The charts were reviewed for demographic information, clinical profiles, and laboratory data.

Image analysis
Two pediatric radiologists, who were blinded to the final clinical diagnosis, reviewed the two image sets (the US alone set and the combined US and MRCP set) separated by an interval of one month.
The parameters assessed by US included the length of the gallbladder (GB) along the longitudinal axis, with a description of its wall regularity. The GB was considered atretic when it was not visualized or when it was less than 1.9 cm long [4,5]. Positivity for the triangular cord sign was defined as a thickness of the echogenic tissue anterior to the right portal vein greater than 4 mm [14]. The caliber of the hepatic artery was measured at the level of the right proximal hepatic artery running parallel to the right portal vein. Hepatic artery enlargement was defined as a caliber of the hepatic artery greater than 1.5 mm [6]. The visibility of the common bile duct at the porta hepatis was recorded by US [4]. The visibility of the extrahepatic biliary tree was evaluated by MRCP. Our use of the term "extrahepatic biliary tree" refers to the common hepatic duct and the common bile duct, as well as the confluence of the right and left hepatic ducts [11].
The two observers first independently interpreted the US images alone and rated their confidence in the diagnosis of BA versus non-BA using a five-point scale (1 = definitely non-BA, 2 = probably non-BA, 3 = equivocal, 4 = probably BA, 5 = definitely BA). The two endpoints of the five-point confidence level scale were predefined. GB abnormalities (either atretic GB or GB wall irregularity), the presence of the triangular cord sign, hepatic artery enlargement, and a non-visualized common bile duct at the porta hepatis were regarded as definite US findings of BA. Conversely, definite US findings of non-BA were the absence of GB abnormalities (a normal GB length and a regular GB wall), absence of the triangular cord sign, absence of hepatic artery enlargement, and visualization of the common bile duct at the porta hepatis.
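The US criteria described above can be written down as a small set of threshold checks. The following Python sketch is illustrative only: the class and function names (`UsMeasurements`, `us_findings`) are hypothetical and not part of the study, and the sketch deliberately does not model how the observers translated these findings into a confidence level, since only the two endpoints of the scale were predefined.

```python
from dataclasses import dataclass
from typing import Optional

# Cut-offs quoted in the text.
GB_ATRETIC_MAX_CM = 1.9       # GB shorter than this, or not visualized, is atretic
TRIANGULAR_CORD_MIN_MM = 4.0  # thickness greater than this is a positive sign
HEPATIC_ARTERY_MIN_MM = 1.5   # caliber greater than this counts as enlargement


@dataclass
class UsMeasurements:
    """Hypothetical container for one patient's US measurements."""
    gb_length_cm: Optional[float]  # None when the GB is not visualized
    gb_wall_irregular: bool
    cord_thickness_mm: float
    hepatic_artery_mm: float
    cbd_visible: bool              # common bile duct seen at the porta hepatis


def us_findings(m: UsMeasurements) -> dict:
    """Flag each US finding that the text regards as pointing toward BA."""
    atretic = m.gb_length_cm is None or m.gb_length_cm < GB_ATRETIC_MAX_CM
    return {
        # GB abnormality = atretic GB or irregular GB wall
        "gb_abnormal": atretic or m.gb_wall_irregular,
        "triangular_cord_sign": m.cord_thickness_mm > TRIANGULAR_CORD_MIN_MM,
        "hepatic_artery_enlarged": m.hepatic_artery_mm > HEPATIC_ARTERY_MIN_MM,
        "cbd_not_visualized": not m.cbd_visible,
    }


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    example = UsMeasurements(None, False, 4.5, 1.2, False)
    print(us_findings(example))
```

In this sketch a patient with all four flags set corresponds to the predefined "definite BA" endpoint, and a patient with none set corresponds to the "definite non-BA" endpoint; intermediate patterns were left to observer judgment in the study.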
One month later, the two observers independently interpreted the combined set of US and MRCP images and graded their confidence levels using the same five-point scale. The diagnosis of BA was assigned according to the non-visualization of the extrahepatic biliary tree by MRCP. A study finding was considered to be non-BA when the entire extrahepatic biliary tree was visualized. When the MRCP diagnosis differed from that made on US images, the observers were requested to give priority to the MRCP data.

Statistical analysis
Statistical analyses were performed using SPSS software (SPSS, version 23.0; SPSS, Chicago, Ill). Results were considered significant if the P-value was less than 0.05. Descriptive data were expressed as the mean ± standard deviation (SD) or frequencies. Statistical differences were tested by applying a two-sample t-test, the Mann-Whitney test, or Fisher's exact test. The diagnostic performance of the two observers was compared by receiver operating characteristic (ROC) analysis with the area under the ROC curve (Az) and pairwise comparison. The sensitivity, specificity, accuracy, and positive and negative predictive values of each observer were calculated. McNemar's test was used to compare the sensitivity, specificity, and accuracy, and Bennett's test was used to compare the positive and negative predictive values of each observer [15].
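To make two of these steps concrete, the following Python sketch computes an empirical area under the ROC curve from five-point confidence ratings and an exact two-sided McNemar P-value from discordant pair counts. It is a minimal illustration under stated assumptions, not the study's procedure: the analyses were run in SPSS, whose Az estimate and McNemar variant may differ from the empirical (trapezoidal) AUC and the exact binomial form used here. The function names and the example ratings are hypothetical.

```python
from math import comb


def empirical_auc(pos_scores, neg_scores):
    """Empirical (trapezoidal) area under the ROC curve for ordinal ratings.

    Equals the probability that a randomly chosen positive case is rated
    higher than a randomly chosen negative case, with ties counted as 1/2.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def mcnemar_exact(b, c):
    """Exact two-sided McNemar P-value.

    b and c are the discordant pair counts, e.g. cases classified correctly
    by one reading only (b) and by the other reading only (c).
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Made-up five-point ratings, for illustration only.
    ba_ratings = [5, 5, 4, 3, 2]
    non_ba_ratings = [1, 2, 2, 3, 4]
    print(empirical_auc(ba_ratings, non_ba_ratings))  # 0.8
    print(mcnemar_exact(1, 8))                        # 0.0390625
```

The pairwise comparison of two correlated ROC curves and Bennett's test for predictive values require the paired per-patient ratings and are not sketched here.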

Clinical profiles and laboratory data
The demographic information and laboratory data for patients with BA and non-BA are summarized in Table 1. The mean interval between US and MRCP was 3.5 days (range, 0-30 days) in BA patients. All of these patients underwent Kasai portoenterostomy (mean age 81 days, range 32 days-7 months). In non-BA patients, the mean interval between US and MRCP was 2.9 days (range, 0-19 days). The BA patients showed a female predominance. No statistically significant differences were noted between the two groups in the frequency of clay-colored stool or in the mean levels of biochemical hepatic function tests.

US and MRCP imaging findings
The US and MRCP findings in the BA and non-BA patients are summarized in Table 2. The US findings of GB abnormalities (observer 1, P < 0.001; observer 2, P < 0.001) and hepatic artery enlargement (observer 1, P = 0.002; observer 2, P < 0.001) were significantly more frequent in the BA patients than in the non-BA patients for both observers. The GB was visualized in 58 patients (BA, n = 36; non-BA, n = 22) and was not seen in 6 patients (BA, n = 5; non-BA, n = 1). The proportions of the triangular cord sign and of non-visualization of the common bile duct were not significantly different between the BA and non-BA patients for either observer. Non-visualization of the extrahepatic biliary tree by MRCP was significantly more frequent in the BA patients than in the non-BA patients (observer 1, P < 0.001; observer 2, P < 0.001).

Incremental value of MRCP for diagnostic performance
The diagnostic performance of the two observers for the evaluation of BA increased significantly after additional review of the MRCP images: Az increased from 0.688 to 0.901 (P = 0.015) for observer 1 and from 0.676 to 0.901 for observer 2 (P = 0.011) (Table 3). Table 3 demonstrates the diagnostic predictive values to diagnose non-BA for each observer and by each technique. Interobserver agreement of confidence levels was good for US alone (κ = 0.658) and excellent for the combined set of US and MRCP (κ = 0.929). The diagnostic accuracy (observer 1, P = 0.003; observer 2, P = 0.004) and positive predictive value (observer 1, P = 0.016; observer 2, P = 0.013) were significantly higher for both observers when both US and MRCP images were reviewed than when US images alone were reviewed. The sensitivity was higher when analyzing both US and MRCP images than when analyzing US images alone, but the differences did not reach statistical significance for either observer. The specificity and negative predictive value tended to improve after additional review of the MRCP images, but statistical significance differed between the observers. Additional review of the MRCP data permitted the observers to correct some diagnostic errors occurring with US images only (observer 1, n = 17; observer 2, n = 16) (Table 4) (Figs 1 and 2). Five cases (BA, n = 1; non-BA, n = 4) were not correctly interpreted by either observer even after additional review of the MRCP findings.
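As an arithmetic cross-check, the 2 × 2 table behind each reading can be reconstructed from the positive predictive value fractions and accuracies reported in the abstract, together with the cohort sizes (41 BA, 23 non-BA). The Python sketch below does this and derives the remaining metrics. It assumes that each reported PPV fraction is an unreduced count (for example, 40/42 meaning 42 positive calls, of which 40 were BA); the sensitivity and specificity it prints are derived under that assumption rather than quoted from Table 3, which is not reproduced in this text. All function names are hypothetical.

```python
# Cohort sizes stated in the text.
N_BA, N_NON_BA = 41, 23

# Reported PPV and accuracy as (numerator, denominator), keyed by
# (reading, observer).
REPORTED = {
    ("US alone", 1): {"ppv": (35, 46), "accuracy": (47, 64)},
    ("US alone", 2): {"ppv": (34, 45), "accuracy": (46, 64)},
    ("US + MRCP", 1): {"ppv": (40, 42), "accuracy": (61, 64)},
    ("US + MRCP", 2): {"ppv": (40, 44), "accuracy": (59, 64)},
}


def reconstruct_counts(ppv):
    """Infer (tp, fp, fn, tn) from a PPV fraction and the cohort sizes.

    Assumption: the PPV denominator is the number of positive calls.
    """
    tp, called_positive = ppv
    fp = called_positive - tp
    return tp, fp, N_BA - tp, N_NON_BA - fp


def diagnostic_metrics(tp, fp, fn, tn):
    """Return each metric as a (numerator, denominator) pair."""
    return {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "accuracy": (tp + tn, tp + fp + fn + tn),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }


def derive_all():
    """Derive metrics for every reading and check them against the report."""
    out = {}
    for key, reported in REPORTED.items():
        m = diagnostic_metrics(*reconstruct_counts(reported["ppv"]))
        # The reconstructed table must reproduce the reported accuracy.
        assert m["accuracy"] == reported["accuracy"], key
        out[key] = m
    return out


if __name__ == "__main__":
    for (reading, observer), m in derive_all().items():
        summary = ", ".join(
            f"{name} {a}/{b} ({100 * a / b:.0f}%)" for name, (a, b) in m.items()
        )
        print(f"{reading}, observer {observer}: {summary}")
```

Under this reconstruction, the combined reading yields negative predictive values of 21/22 for observer 1 and 19/20 for observer 2, and all four reconstructed tables reproduce the reported accuracies.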

Discussion
We evaluated the added value of MRCP to US images for distinguishing BA from non-BA in neonates and young infants with cholestasis. Our results showed that the Az, diagnostic accuracy, and positive predictive value significantly increased for both observers when both MRCP and US images were reviewed, compared with US images alone. In addition, adding MRCP to US images decreased the diagnostic errors made by both observers.
US features such as GB abnormalities, the triangular cord sign, and vascular alterations have been reported as significant predictors for discriminating BA from non-BA. Wide ranges of sensitivity and specificity have been reported in the literature for individual US parameters, such as abnormal GB (61-97% and 69-100%) [3-5, 7, 16], triangular cord sign (23-100% and 87-100%) [3,4,7,16,17], enlarged hepatic artery (72-92% and 49-87%) [4,6,18], and non-visualized common bile duct (83-95% and 48-92%) [4,6,7,19]. Humphrey and Stringer reported that a combination of several US parameters distinguished BA from other causes of neonatal cholestasis with an overall accuracy of 98% [7]. In contrast to this high diagnostic performance of US, Giannattasio et al. [8] found that 68% (17/25) of infants with BA showed an identifiable GB at US, with a regular wall in one-fifth of the infants, and the triangular cord sign was seen in 24% (6/25) of cases. The lower sensitivity of US for the diagnosis of BA could reflect several potential pitfalls of US diagnosis itself (such as detection of a small non-distended GB and the difficulty in visualizing the common bile duct in healthy infants) or an obscured triangular cord sign in patients with early-stage BA [4,8]. Therefore, the assessment of BA is challenging, and no single radiologic examination appears to be clearly superior, although several efforts have been made to improve the preoperative investigation for BA.
A BA diagnostic scoring system consisting of clinical, laboratory, and histopathological variables, as well as US parameters, has recently been developed [3]. Subsequent validation in 75 consecutive patients revealed a high overall diagnostic accuracy of 99% (74/75) [3]. However, the indications for liver biopsy were debated by Sciveres et al. [20] based on their experience. When liver biopsy was proposed in only 25% (16/64) of cases, they found a diagnostic performance (three diagnostic errors out of 64 cases) similar to that of the new scoring system introduced by El-Guindi et al. [3].
MRCP can be a useful and non-invasive method for the evaluation of hepatobiliary disease, even in pediatric patients. Previous studies have reported that BA can be reliably diagnosed by non-visualization of the extrahepatic biliary tree, especially when accompanied by the appearance of an atrophic GB on MRCP [9][10][11][12]. Jaw et al. [10] reported that 16 jaundiced infants were correctly diagnosed as BA (n = 6) or non-BA (n = 10) with an overall accuracy of 100%, based on non-visualization of the extrahepatic biliary tree on MRCP. However, Norton et al. [11] reported three false-positive cases, indicating that the criteria established by Jaw et al. [10] could not be relied on alone to achieve 100% diagnostic accuracy. They found that MRCP was 82% (19/23) accurate in depicting BA when using their expanded criteria, which included delineation of the confluence, common hepatic duct, and common bile duct, in agreement with the criteria used in the present study.
Both observers in our study made erroneous diagnoses in five patients, even after additional review of the MRCP images. Among these errors, a false-positive diagnosis of BA was made in four patients. In one infant, the extrahepatic biliary tree, including the confluence, was not visualized by MRCP. In the other three infants, the common hepatic and common bile ducts were observed, but the confluence was not visualized clearly. These cases signify a potential pitfall in the diagnosis of BA by MRCP, because ductal visualization depends on adequate production and secretion of bile. A prospective pilot study by Siles et al. [21] reported that MRCP visualization of the normal biliary system was only possible in 62.5% (10/16) of normal infants younger than three months. The MRCP findings may simulate BA when insufficient bile is present in a hypoplastic bile duct of extremely small diameter [9,11]. Except in one case, delineation of the extrahepatic biliary tree including the confluence ruled out the possibility of BA, with a negative predictive value of 95% (21/22) for observer 1 and 95% (19/20) for observer 2. The one false-negative case was diagnosed as type II BA; in this patient, the common bile duct was preserved, but the common hepatic duct and confluence were not visualized by surgical cholangiography. High signal intensity from the intestinal tract may masquerade as a normal extrahepatic biliary tree.
Our study has several limitations. First, although we recruited consecutive patients who met our inclusion criteria, the possibility of selection bias should be considered because of the retrospective study design. Second, the increased diagnostic accuracy observed for the combined image sets may be related to the reviewers' experience with the rating system. Recall bias may have resulted from the order of image evaluation, although we tried to prevent this by using a one-month interval between reviews and by rearranging the order of the cases. Third, the number of non-BA patients was small, and our statistical results could have been affected by this. A large-scale study may be needed to validate the value of MRCP for differentiating BA from non-BA in neonates and young infants.
In conclusion, the addition of MRCP yields better diagnostic accuracy and positive predictive value than the use of US alone in the diagnosis of BA in neonates and young infants with cholestasis. Therefore, when US is inconclusive in differentiating BA from non-BA, additional MRCP imaging may provide valuable information regarding the patency of the extrahepatic biliary tree and can reduce or even obviate unnecessary laparotomy.