Quantitative magnetic resonance imaging predicts individual future liver performance after liver resection for cancer

The risk of poor post-operative outcome and the benefits of surgical resection as a curative therapy require careful assessment by the clinical care team for patients with primary and secondary liver cancer. Advances in surgical techniques have improved patient outcomes but identifying which individual patients are at greatest risk of poor post-operative liver performance remains a challenge. Here we report results from a multicentre observational clinical trial (ClinicalTrials.gov NCT03213314) which aimed to inform personalised pre-operative risk assessment in liver cancer surgery by evaluating liver health using quantitative multiparametric magnetic resonance imaging (MRI). We combined estimation of future liver remnant (FLR) volume with corrected T1 (cT1) of the liver parenchyma as a representation of liver health in 143 patients prior to treatment. Patients with an elevated preoperative liver cT1, indicative of fibroinflammation, had a longer post-operative hospital stay compared to those with a cT1 within the normal range (6.5 vs 5 days; p = 0.0053). A composite score combining FLR and cT1 predicted poor liver performance in the 5 days immediately following surgery (AUROC = 0.78). Furthermore, this composite score correlated with the regenerative performance of the liver in the 3 months following resection. This study highlights the utility of quantitative MRI for identifying patients at increased risk of poor post-operative liver performance and a longer stay in hospital. This approach has the potential to inform the assessment of individualised patient risk as part of the clinical decision-making process for liver cancer surgery.

Introduction
Liver resection for primary and secondary malignant tumours is a fundamental component of the multimodal treatment of cancer [1,2]. When an individual patient and multidisciplinary surgical team are considering liver resection, a major challenge is knowing in advance whether liver surgery will be survivable and how that individual's liver will recover function after surgery [3]. We term this future liver performance (FLP), and misjudging FLP can result in postsurgical liver failure and death [4].
Future liver performance is a function of future liver remnant (FLR) volume, FLR function and liver regenerative capacity. FLR volume is typically straightforward to calculate from routine pre-operative imaging with computed tomography (CT) or magnetic resonance imaging (MRI) [5]. However, this requires substantial time input from specialised radiologists [5] using dedicated software tools to accurately delineate the planned FLR.
Estimates of FLR function are based on clinical risk factors, with older age or presence of liver disease, obesity [6] or diabetes mellitus/metabolic syndrome [7] all indicative of poor FLP [8]. When combined with laboratory indices of liver synthetic function (e.g., serum albumin concentration or prothrombin time), these risk factors give a reasonable estimate of FLP to an experienced liver surgeon [8]. More objective direct estimates of hepatocellular function can be obtained using 99mTc-mebrofenin scintigraphy [9], 13C-methacetin metabolism [10], or indocyanine green (ICG) clearance [11]. However, these methods require tracer injection and specialised detection devices and have not been widely adopted by most centres offering liver resection, with the exception of ICG clearance in Asia [12]. Various blood-based composite risk scores have been created to estimate survival after liver resection, including the 50-50 criteria of prothrombin time and bilirubin [13], peak bilirubin >7 mg/dL [14] and the Hyder-Pawlik [4] composite score of creatinine, bilirubin, INR and the Clavien-Dindo grade; these scores are all calculated from measurements taken during post-operative recovery and are less sensitive to mild morbidities where changes in clinical management may still be required.
Poor liver regenerative capacity is difficult to predict accurately and to detect pre-operatively when interventions intended to increase the FLR volume (FLRV), such as portal vein embolization [15] or combined portal and hepatic vein occlusion, fail to induce hypertrophy [16]. For an otherwise normal healthy liver affected by cancer, an FLRV of at least 5 cm³ per kg body weight, or 25% of pre-operative size, is taken as a safe threshold below which an individual undergoing liver resection is generally considered to have an unacceptable risk of postoperative liver failure. For livers with evidence of parenchymal disease (e.g., cirrhosis) a higher threshold of 40% is commonly used [17]. These thresholds have been derived from cohort studies and are based on cumulative experience of patients undergoing liver resection and do not take into account individual variations in liver health. Patients with very healthy livers could tolerate smaller FLRVs and conversely, post-operative liver failure can occur in patients with FLRV above the usual thresholds, often unexpectedly. Furthermore, some patients receiving systemic anti-cancer therapy may also develop hepatic injury, including chemotherapy-associated steatohepatitis, further increasing the risks associated with liver resection [18]. There is therefore a great and unmet need to develop a non-invasive imaging technology that can accurately and reliably quantify FLP when used pre-operatively.
LiverMultiScan is a non-invasive quantitative multiparametric MRI (mpMRI) technology [19] that has shown high diagnostic accuracy for liver fibroinflammation [20], steatosis and haemosiderosis [21], as well as predicting liver-related outcomes [22] in patients with chronic liver disease. It has also been demonstrated to be cost-effective in stratifying patients with non-alcoholic fatty liver disease [23] and sensitive to changes in non-alcoholic steatohepatitis [24].
The aim of this study was to evaluate, for the first time, the utility of combining these quantitative mpMRI metrics of pre-operative liver health with estimated FLR to forecast individualised risk of poor liver performance in patients undergoing surgical resection for cancer in the liver.

Methods
This study was conducted in accordance with the principles of the Declaration of Helsinki 2013 and approved by the institutional research departments, the South East Scotland Research Ethics Committee 02 (Reference: 17/SS/0049), NHS Scotland R&D and NHS England HRA. The study was registered prospectively on www.clinicaltrials.gov (NCT03213314) and the study design published [25]. Patients being considered for liver resection for cancer were recruited between 7th September 2017 and 31st January 2019 at The Royal Infirmary of Edinburgh or Hampshire Hospitals NHS Foundation Trust with the last patient to enter the study completing the protocol on 31st March 2019. There was no change in treatment, intervention or randomisation in the study. Study data were collected and managed using REDCap electronic data capture tools hosted at the Edinburgh Clinical Trials Unit [26,27].

Study assessments
In this observational clinical trial, patients were recruited following clinical diagnosis and treatment planning. 419 individuals were identified at the multidisciplinary cancer meeting and screened, 305 people were approached to take part, of whom 149 participated in the study requiring a pre-operative and a 3-month post-operative study visit at an imaging centre. Relevant clinical data were obtained by trained members of the Clinical Research Facility at each site. Peripheral venous blood samples were obtained from each study participant on each visit and at the time of surgery for multiple routine laboratory analyses and specialist assays. MR images were acquired a mean of 3 days before surgery and 102 days following surgery at either the Candover Clinic Hampshire Hospitals NHS Foundation Trust on a Siemens Aera 1.5T or at the Queen's Medical Research Institute, The University of Edinburgh on a Siemens Skyra 3T MRI scanner. Appropriate 3-plane localiser images were acquired of the abdomen and mpMRI data were collected using non-contrast enhanced multi-slice acquisition protocols along with a 3D image of the liver, totalling approximately 18 minutes.
Clinical care teams were blinded to research MRI scan reports and surgical procedures were performed as standard of care with operative information collected as described in the eCRF including details of the performed operation. Tissue samples were taken from liver resection specimens with strict care being taken not to compromise the direct clinical care pathology reporting. Fresh Tru-cut biopsies of the tumour and a 1 cm³ wedge biopsy of the non-lesion liver parenchyma were snap frozen in liquid nitrogen in the operating theatre immediately after the resection specimen was removed from the patient. Formalin-fixed paraffin-embedded blocks of background liver and tumour tissue were obtained specifically for the study at the time of specimen processing for direct clinical care pathology reporting using the NAFLD activity score (NAS) [28], a summation of the semi-quantitative histological scores for liver fat, inflammation and ballooning. The ISHAK score was used to assess fibrosis [29].

Image analysis
The liver volume was delineated from three-dimensional T1-weighted VIBE images using a semi-automatic approach, comprising an automatic deep learning based initialisation followed by manual editing. The deep learning approach used a 3D U-net architecture as previously described [30]. Six cases were omitted as more than 5% of the liver volume was outside of the field of view of the image acquisition. Tumour locations were defined and guided by the surgical notes and were manually delineated using ITK-SNAP [31] by tracing the tumour in three orthogonal planes and interpolating into a 3D consolidated segmentation. Finally, the delineated liver volume was subdivided into the anatomical Couinaud segments [32], based on manually-placed 3D landmarks (inferior/superior vena cava, right/middle/left hepatic veins, gall bladder fossa, umbilical fissure, left/right portal vein) following the guidelines of Germain et al [33], to facilitate modelling of the planned segmentectomy planes. For atypical wedge resections, a conical frustum was modelled based on the tumour delineation and the resection path from the liver surface. Surgical notes from the operation plan were used to inform the estimation of FLR volume by removing anatomical segments and wedges as calculated above (Fig 1).
Multislice quantitative cT1, proton density fat fraction (PDFF) and T2* maps were generated using proprietary LiverMultiScan software (Perspectum, UK) as previously described [21] with operators blinded to patient status. The quantitative MRI maps were overlaid onto the Couinaud segmented volumetric images to allow characterisation of liver tissue in the FLR.
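The volumetric step described above can be illustrated with a minimal sketch. It is not the study's software: the function names, the per-segment volume dictionary and the choice to express FLR as a fraction of the total delineated liver volume are assumptions made for illustration, and the text does not state whether tumour volume is excluded from the denominator.

```python
import math
from typing import Dict, Iterable


def frustum_volume(r_top_cm: float, r_bottom_cm: float, height_cm: float) -> float:
    """Volume (cm^3) of a conical frustum with the given end radii and height."""
    return (
        math.pi
        * height_cm
        * (r_top_cm ** 2 + r_top_cm * r_bottom_cm + r_bottom_cm ** 2)
        / 3.0
    )


def flr_fraction(
    segment_volumes_ml: Dict[str, float],
    resected_segments: Iterable[str],
    wedge_volumes_ml: Iterable[float] = (),
) -> float:
    """Estimated future liver remnant as a fraction of total liver volume.

    Assumption: the total is the sum of all Couinaud segment volumes, and
    wedge volumes lie in segments that are not otherwise resected.
    """
    total = sum(segment_volumes_ml.values())
    removed = sum(segment_volumes_ml[s] for s in resected_segments)
    removed += sum(wedge_volumes_ml)
    if total <= 0 or removed > total:
        raise ValueError("inconsistent volumes")
    return (total - removed) / total
```

For example, removing one hypothetical 500 mL segment from a 1000 mL liver gives an FLR fraction of 0.5; a wedge modelled with `frustum_volume` (1 cm³ = 1 mL) can be passed in `wedge_volumes_ml`.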

Morbidity outcome
Post-operative mortality has previously been modelled by Hyder and Pawlik [4] by combining three weighted blood scores (INR, creatinine, bilirubin) with the Clavien-Dindo score on the third day following surgery, which predicted 90-day mortality following resection for liver cancer. In our study, excellent long-term post-operative patient outcomes were achieved and only five patients exhibited a Hyder-Pawlik score >9.0 (S1 Fig), indicating that this study was underpowered to reach the primary endpoint. However, measures of liver dysfunction in the immediate days following surgery from blood-derived biomarkers (INR, creatinine, bilirubin with Hyder-Pawlik weightings) can serve as an alternative measure of morbidity. We measured and summed these blood-derived biomarkers for each of the five days following surgery and devised the modified Hyder-Pawlik score to evaluate the risk of morbidity directly after the operation.
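A minimal sketch of how such a five-day sum could be assembled is given below. The Hyder-Pawlik weightings themselves are not reproduced in this text, so the per-day scoring rule is passed in as a caller-supplied function (`day_score`, a hypothetical name); missing days are filled from the previous day's value, as set out in the statistical analysis.

```python
from typing import Callable, List, Optional, Sequence


def carry_forward(values: Sequence[Optional[float]]) -> List[float]:
    """Fill missing daily values (None) with the most recent earlier value."""
    filled: List[float] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        if last is None:
            raise ValueError("no earlier value to carry forward")
        filled.append(last)
    return filled


def modified_hyder_pawlik(
    bilirubin: Sequence[Optional[float]],
    creatinine: Sequence[Optional[float]],
    inr: Sequence[Optional[float]],
    day_score: Callable[[float, float, float], float],
) -> float:
    """Sum a per-day blood-marker score over the post-operative days.

    day_score(bilirubin, creatinine, inr) stands in for the published
    Hyder-Pawlik weightings, which are NOT defined here.
    """
    if not (len(bilirubin) == len(creatinine) == len(inr)):
        raise ValueError("all markers need one entry per day")
    days = zip(carry_forward(bilirubin), carry_forward(creatinine), carry_forward(inr))
    return sum(day_score(b, c, i) for b, c, i in days)
```

Keeping the weighting outside the function means the same summation and imputation logic can be reused once the published weightings are supplied.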

Statistical analysis
Interim analyses were planned in advance. Any errors in data entry were reviewed and corrected with the Chief Investigator's permission. When longitudinal blood sampling was incomplete for the anticipated 5 days (owing to early discharge from hospital), the missing bilirubin, creatinine or INR value was imputed from the value of the previous day. The Hyder-Pawlik score was calculated for each patient based on the values at day 3 post-surgery of bilirubin (mg/dL), creatinine (mg/dL), INR and maximum Clavien-Dindo grade, weighted as previously described [4]. The composite biomarker was generated after a stepwise logistic regression using a modified Hyder-Pawlik score of over 22 as a discriminator, and the optimal weighted predictors were selected by Akaike's information criterion. The Youden index was used to identify the optimal cut-off on the receiver operating characteristic curve.
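The Youden index is J = sensitivity + specificity - 1, and the optimal cut-off is the threshold that maximises J. The sketch below illustrates that selection in plain Python; it is not the study's analysis code, and the convention that a case is called positive when its score is at or above the cut-off is an assumption (a score for which lower values indicate higher risk would need to be negated first).

```python
from typing import Sequence, Tuple


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    labels are 1 for the outcome of interest and 0 otherwise; a case is
    classified positive when score >= cutoff (assumed orientation).
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

Each observed score is tried as a candidate threshold, which is sufficient because sensitivity and specificity only change at observed values.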

Results
Of the 149 patients enrolled in the study, 135 patients underwent liver resection, 7 had portal vein embolization to induce hypertrophy of the FLR and two individuals had transarterial chemoembolisation (Fig 2a). A range of surgical procedures were planned (Fig 2b) resulting in a median [IQR] FLR of 83% [23%-95%]. Demographics and clinicopathologic data are described in Table 1. The large majority (n = 114, 84% of participants) had liver metastases from colorectal cancer; the remainder had hepatocellular carcinoma (n = 6), cholangiocarcinoma (n = 1) or other metastases (n = 14) including breast cancer metastasis and ovarian cancer metastasis. 73 patients had received systemic anticancer chemotherapy that was ended a median of 56 days prior to surgery. 12 patients presented with a diagnosed underlying liver disease. MpMRI was performed in all patients and 21% of individuals (29/135) had pre-operative liver cT1 values greater than the upper limit of normal (795 ms) defined for the general population [34,35].

Pre-operative multiparametric MRI correlates with liver histology
Patients displayed a range of tumour and parenchymal liver cT1 values; six examples are shown in Fig 3a-3f. The case shown in Fig 3c depicts a parenchymal cT1 within the normal range [34], and histological analysis of liver tissue adjacent to the tumour explant (white box, Fig 3g) appears typical of a healthy liver showing no inflammation, ballooning or steatosis. The case shown in Fig 3f shows a parenchymal cT1 above the normal healthy range and the associated histology (Fig 3h) indicates ballooned hepatocytes, infiltration of inflammatory cells and grade 3 steatosis. cT1 correlated with histopathological scoring of ballooning (Spearman's rho = 0.279, P = 0.001) and inflammation (Spearman's rho = 0.276, P<0.001) and PDFF correlated with steatosis (Spearman's rho = 0.702, P<0.0001). cT1 was significantly elevated in patients with a ballooning score of 2 (*P = 0.0057) or an inflammation score of 2 (*P = 0.011). PDFF showed significant stepwise increases with increasing steatosis scores (***P<0.001) (Fig 3i-3k).

Pre-operative multiparametric MRI predicts post-operative morbidity
The primary endpoint of this study was to determine the ability of LiverMultiScan to predict risk of post-operative morbidity and mortality by measuring the correlation between the preoperative liver health assessment score and the post-operative liver function composite integer-based risk (Hyder-Pawlik) score; the modified Hyder-Pawlik score (Fig 5a) was also evaluated. Patients with high cT1 had a slightly higher Hyder-Pawlik score at day 3 than patients with a normal cT1, although the difference was not significant (P = 0.11), and had a significantly higher modified Hyder-Pawlik score (Fig 5b) than patients with a normal cT1 (P = 0.0076). Stepwise logistic regression was performed on the following nine pre-operative variables to determine their diagnostic accuracy in discriminating the patients with a 5-day sum of modified Hyder-Pawlik scores in the upper quartile: age, BMI, creatinine, bilirubin, liver volume, lesion volume, FLR, liver fat and liver cT1. The model was trained using a subset of data (n = 62) in whom at least 10% of the liver volume was removed to exclude patients with very small resections and for whom complete data were available. Non-collinearity of predictors was confirmed using linear regression models. A combination of FLR and pre-operative cT1 had the best performance with an AUROC of 0.78 (95% confidence interval: 0.66-0.90) (Fig 5c). The resulting composite biomarker was termed the 'Hepatica score', with a lower FLR and a higher pre-operative cT1 predictive of an increased risk of poor post-operative outcome. This composite biomarker showed a significantly higher diagnostic performance (DeLong's test for two correlated ROC curves P = 0.000104) than FLR alone with an AUROC of 0.70 (95% CI 0.55-0.84).
$$\text{Hepatica score} = e^{-\left(1.24 + (\log \mathrm{FLR} \times -2.26) + (\mathrm{cT1} \times 0.013)\right)}$$

Hepatica predicts liver regeneration
96 patients returned for a post-operative follow-up mpMRI scan at a mean of 102 ± 4.7 days after their operation. The volume of the liver at this timepoint was measured semi-automatically as described above, with 42% of patients regenerating at least 90% of the resected liver volume. The Hepatica score, measured pre-operatively, correlated with the achieved regeneration (Fig 5d, R² = 0.48, P<0.0001). One example of liver regeneration is depicted; the pre-operative mpMRI indicated a lesion on the right lobe of the liver (Fig 5e), with the resection plan and FLR shown in red (Fig 5f). The follow-up scan reveals left lobe hypertrophy (Fig 5g) with a measured increase in total liver volume of 798 mL (Fig 5h).
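As an illustration only, the Hepatica score expression given earlier in this section can be evaluated as follows, reading it as exp(-(1.24 + (log FLR × -2.26) + (cT1 × 0.013))). Several points are assumptions rather than statements from the text: that "log" is the natural logarithm, that cT1 is in milliseconds, and the units of FLR (percentage or fraction), which are not specified alongside the expression. The function name and argument names are hypothetical.

```python
import math


def hepatica_score(flr: float, ct1_ms: float) -> float:
    """Evaluate exp(-(1.24 + (log(FLR) * -2.26) + (cT1 * 0.013))).

    Assumptions: natural logarithm; cT1 in ms; FLR in whatever unit the
    original model was fitted with (not stated in the text).
    """
    if flr <= 0:
        raise ValueError("FLR must be positive to take its logarithm")
    linear_term = 1.24 + (math.log(flr) * -2.26) + (ct1_ms * 0.013)
    return math.exp(-linear_term)
```

Under this reading the value falls as cT1 rises and as FLR falls, so the two factors that the text associates with increased risk move the score in the same direction.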

Discussion
This study aimed to augment current methods for estimating FLP by incorporating non-invasive, contrast-agent free quantitative mpMRI to inform the risk assessment for patients indicated for liver surgery for cancer in the liver. We have shown here that patients with pre-operative liver cT1, a biomarker of fibroinflammation, above the upper limit of normal spent on average 1.5 days longer in hospital than those with cT1 in the normal range. Factors affecting length of stay have been investigated previously for liver resection [36,37] but this is the first report of a non-invasive imaging approach identifying a patient group with a higher length of stay. This is of particular importance given the significant healthcare costs associated with in-patient care (~£10,000 total post-operative costs [38]). The correlations observed for cT1 and PDFF with histological measures of ballooning, inflammation and steatosis in this surgical population reinforce previously published associations in patients with chronic liver disease [19,21]. Taken together, this evidence highlights a role for mpMRI in characterising liver disease in place of an invasive liver biopsy, which is typically only performed in selected patients with chronic liver disease as part of the pre-operative work-up, owing to the risk of haemorrhage and tumour seeding [39,40].
The primary endpoint in this study was to determine the correlation between a composite of FLR and pre-operative liver cT1 and a measure of post-operative morbidity, the Hyder-Pawlik score [4]. The weak correlation seen in this study (multiple R-squared = 0.14) is likely due to the low rate of substantial morbidity observed in comparison to that reported by Hyder and colleagues [4]. As a result, our study was insufficiently powered to meet the primary endpoint. The low rate of morbidity is likely due to a combination of factors including liver cancer indication; 18.2% of cases were reported to present with HCC by Hyder and colleagues, while only 6 of 143 (4.2%) cases in our study presented with HCC. Additionally, 9.1% of cases investigated by Hyder and colleagues presented with underlying liver cirrhosis whereas no patients in our study were found to have histological cirrhosis (modified ISHAK fibrosis score 6).
The Hyder-Pawlik score identifies an increased risk of 90-day mortality when measured as >9 points at day 3 post-surgery. Prediction of 90-day mortality using patient information collected after the surgical procedure has utility in enabling clinical decisions regarding the level of post-operative care required for the patient. However, additional patient outcome and management benefits may be gained if such analysis was carried out pre-operatively. In this study, we opted to investigate imaging-based pre-surgical variables that would permit the evaluation of the risk of the surgical procedure before it was carried out. The composite biomarker 'Hepatica score' generated in our study performed well at classifying the quartile of patients with poor liver function as measured on each of the five days following surgery (modified Hyder-Pawlik score). We propose that this approach better captures the transient changes in serum liver enzymes that are indicative of limited liver function [41] and are used in clinical monitoring to guide changes in patient management during post-operative recovery.
Another key element in the recovery of patients following liver resection is the capacity of the liver to regenerate. In this study, the pre-operative Hepatica score showed excellent correlation with the volume of liver regenerated at the 3-month post-operative MRI scan. This highlights the utility of augmenting FLR with a measure of liver health, in this case the fibroinflammatory biomarker, cT1. Our understanding of the processes regulating liver regeneration is evolving, but inflammation [42] and recent chemotherapy prior to resection [43] have previously been reported as important modifiers.
A limitation of this study is the relatively small number of patients with poor post-operative liver performance. In order to build suitably robust and transferable predictive models, a sufficiently wide distribution of patients is required. As this study was prespecified to enrol all-comers at tertiary referral centres in the UK, the patients recruited reflected the relatively low prevalence of primary liver cancer in this region with the majority having metastatic liver cancer on a background of relatively healthy liver tissue. A second limitation is our use of a new measure of post-operative liver performance, the modified Hyder-Pawlik score. This approach was derived to provide increased sensitivity to more subtle derangements in post-operative liver function and to allow an understanding of the acute care requirements and level of observation required for patients still under the care of the surgeon and associated multidisciplinary team. This scoring system and its clinical validity require further evaluation in future studies.
Fig 5 (partial caption). ... correlated with the pre-operative Hepatica score (R² = 0.48). (e, f) Exemplar pre-surgical cT1 map and T1-weighted image with FLR overlaid (red) of a patient presenting with an HCC lesion in the right lobe of the liver. (g, h) Post-surgical cT1 map and T1-weighted image of the same patient 3 months following surgery. https://doi.org/10.1371/journal.pone.0238568.g005
In summary, the results presented in this study highlight the utility of pre-operative mpMRI for predicting post-operative liver performance and associated length of stay in hospital. This has the potential to transform surgical decision-making and improve personalised risk assessment for patients undergoing liver resection for cancer. Further studies are needed to fully evaluate the clinical utility of this technology in a broader patient population.