An Artificial Neural Network Stratifies the Risks of Reintervention and Mortality after Endovascular Aneurysm Repair; a Retrospective Observational study

Background: Lifelong surveillance after endovascular repair (EVAR) of abdominal aortic aneurysms (AAA) is considered mandatory to detect potentially life-threatening endograft complications. A minority of patients require reintervention but cannot be predictively identified by existing methods. This study aimed to improve the prediction of endograft complications and mortality through the application of machine-learning techniques.
Methods: Patients undergoing EVAR at 2 centres were studied from 2004 to 2010. Pre-operative aneurysm morphology was quantified and endograft complications were recorded up to 5 years following surgery. An artificial neural network (ANN) approach was used to predict whether patients would be at low or high risk of endograft complications (aortic/limb) or mortality. Centre 1 data were used for training and centre 2 data for validation. ANN performance was assessed by Kaplan-Meier analysis to compare the incidence of aortic complications, limb complications, and mortality in patients predicted to be low-risk versus those predicted to be high-risk.
Results: 761 patients aged 75 +/- 7 years underwent EVAR. Mean follow-up was 36 +/- 20 months. An ANN was created from morphological features including angulation/length/areas/diameters/volume/tortuosity of the aneurysm neck/sac/iliac segments. ANN models predicted endograft complications and mortality with excellent discrimination between a low-risk and a high-risk group. In external validation, the 5-year rates of freedom from aortic complications, limb complications and mortality were 95.9% vs 67.9%; 99.3% vs 92.0%; and 87.9% vs 79.3% respectively (p<0.001).
Conclusion: This study presents ANN models that stratify the 5-year risk of endograft complications or mortality using routinely available pre-operative data.


Introduction
Endovascular aneurysm repair (EVAR) is the most frequently employed treatment for patients with large abdominal aortic aneurysms (AAA). The key challenge for EVAR is to ensure the durability of repair and detect endograft-related aortic complications requiring reintervention. These complications comprise a group of well-defined entities, each of which predisposes to aneurysm rupture if left untreated (type 1 or 3 endoleak, sac expansion, or device migration). Limb complications (stenosis or occlusion) do not predispose to rupture but are also a frequent cause of reintervention. Clinical trials have suggested that aortic or limb complications may affect up to 1 in 5 patients in the first 5 years after EVAR [1][2][3][4].
In order to detect these complications before aortic rupture, lifelong surveillance imaging is currently considered mandatory [5]. There is widespread debate regarding optimal surveillance intervals and the preferred imaging modality employed in endograft surveillance [6]. Despite the importance of detecting endograft-related complications, only 30-50% of patients are compliant with aortic surveillance after EVAR [7]. In addition, a number of studies have found that the majority of aortic complications develop in the interval between normal surveillance scans [8][9][10][11]. The majority of endograft complications are not identified by surveillance, and over 90% of reinterventions after EVAR are prompted by the onset of symptoms between apparently normal surveillance scans [1][9][10][11]. Therefore, it can be argued that over 90% of patients do not directly benefit from surveillance imaging, but remain exposed to unnecessary nephrotoxic contrast [12] and radiation [13], and incur economic cost [14]. Consequently, defining the role for endograft surveillance has been identified as a priority [15,16], particularly in light of the significant contribution of surveillance to the cost-effectiveness of EVAR [15]. The need for surveillance is predicated entirely on the risk of endograft complications, highlighting the need for a clinical tool to predict these events.
The risk of endograft-related complications is heterogeneous [8] and a minority of patients fall within a high-risk cohort [8]. Endograft complications and reinterventions are related to pre-operative aneurysm morphology [17,18], but current statistical models have proved insufficiently discriminatory for clinical use. Furthermore, existing risk models have not proved capable of predicting limb complications such as stenosis or occlusion, which remain a source of morbidity.
It is plausible that by using advanced machine learning techniques, such as artificial neural networks (ANNs), the prediction of endograft complications and long-term mortality after EVAR might be improved. Machine learning techniques have the potential to exploit complex and subtle relationships between pre-operative variables in order to predict the risk of postoperative events, and have gained popularity for modelling long-term outcomes in a variety of surgical settings [19][20][21][22][23].
The aim of the present study was to use ANN analysis to develop and externally validate a predictive tool for determining the risk of endograft complications and mortality after EVAR.

Methods
Prospective databases were maintained for 761 patients undergoing EVAR at 2 regional vascular units in the UK, from 2004 to 2010. Details of this cohort have been published previously [17]. Data from both centres were combined to create a single database. The dataset contained details of pre-operative demographics, comorbidity and aortic morphology; peri-operative technical details including operative procedure, endograft configuration and operative adjuncts; and follow-up information including aortic complications or mortality.
The primary outcome measure was the development of endograft complications. Endograft complications were classified as aortic or limb. Aortic complications were defined on an intention-to-treat basis as a group of conditions comprising any of: aortic rupture, type 1 endoleak, type 2 endoleak with sac expansion > 5mm on CT, type 3 endoleak, sac expansion of any cause > 5mm on CT and graft migration > 5mm on CT. Limb complications were defined on an intention-to-treat basis as limb stenosis requiring reintervention; or occlusion. Stenosis was treated symptomatically and the minimal criterion for haemodynamic significance was defined as a 2.5-fold increase in peak systolic velocity on duplex ultrasound [24]. The secondary outcome measure was all-cause mortality.
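The outcome definitions above can be read as a small set of rules. The following is a minimal sketch of those rules only; the record layout and all field names are hypothetical and are not taken from the study database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowUp:
    """Hypothetical follow-up record; field names are illustrative only."""
    rupture: bool = False
    endoleak_type: Optional[int] = None   # 1, 2 or 3; None if no endoleak
    sac_expansion_mm: float = 0.0         # sac growth measured on CT
    migration_mm: float = 0.0             # graft migration measured on CT
    limb_occlusion: bool = False
    limb_stenosis_reintervention: bool = False


def has_aortic_complication(f: FollowUp) -> bool:
    """Aortic complication as defined in the text (thresholds are > 5 mm)."""
    if f.rupture or f.endoleak_type in (1, 3):
        return True
    # A type 2 endoleak counts only with sac expansion > 5 mm, and sac
    # expansion > 5 mm of any cause also counts, so one check covers both.
    if f.sac_expansion_mm > 5.0:
        return True
    return f.migration_mm > 5.0


def has_limb_complication(f: FollowUp) -> bool:
    """Limb complication: occlusion, or stenosis requiring reintervention."""
    return f.limb_occlusion or f.limb_stenosis_reintervention
```

Under these rules an isolated type 2 endoleak without sac growth is not counted as an aortic complication, which matches the definition given in the text.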

Inclusion Criteria
The study included all cases of EVAR for non-ruptured infrarenal AAA (both elective and non-elective admissions) presenting between January 2004 and June 2010 at St George's Vascular Institute (centre 1) and Leicester Vascular Unit (centre 2). Patients undergoing open aneurysm repair, fenestrated or branch stent-grafts, or those with juxta-renal/supra-renal/thoraco-abdominal aneurysms were excluded. All data were analysed retrospectively and were anonymised and de-identified prior to analysis.

Completion Imaging, Endograft Surveillance and Reintervention Policy
Biplanar angiography was performed at the completion of EVAR. In both centres, the role of surveillance CT imaging changed during the course of the study, although the role of duplex ultrasound remained constant. Before September 2007, postoperative imaging comprised CT and duplex prior to discharge from hospital. This group of patients underwent contrast-enhanced CT at 3 months and 1 year postoperatively. Following September 2007, follow-up comprised duplex ultrasound only, with patients undergoing duplex ultrasound and plain abdominal radiograph prior to discharge from hospital. Throughout the study, all patients underwent duplex ultrasound and plain radiography at 6 weeks, 3 months, 6 months, 9 months, 12 months, 18 months and annually thereafter.
In all cases where surveillance detected a clinically significant complication, or when patients presented symptomatically between surveillance scans, a contrast-enhanced CT was performed to direct reintervention. All patients underwent clinical evaluation at 6 weeks and 12 months after the index procedure and annually thereafter. An aggressive reintervention policy was followed for type 1 endoleak, type 2 endoleak with sac expansion > 5mm on CT, type 3 endoleak, and graft migration > 5mm.

Data Collection: Comorbidity, Morphology and Freedom from Endograft Complications
3D morphological assessment of preoperative imaging was performed on 3Mensio Vascular software (3surgery; 3Mensio Medical Imaging B.V., Bilthoven, The Netherlands), following a validated and published protocol [25]. The minimum CT slice thickness used for the 3D reconstructions was 2.5mm. Comorbidity was categorised in binary variables specified by the Royal College of Surgeons' Charlson index for administrative data [26].
Freedom from endograft complications was reported using Kaplan-Meier analysis in both centres. Patients undergoing EVAR at centre 1 (St George's Vascular Institute) were treated as a "model development" cohort and used exclusively for network training. Patients at centre 2 (Leicester Vascular Unit) were treated as a "model validation" cohort, and used as an independent data source, to test the predictive accuracy of the network trained using data from centre 1 only.
Validation was performed by plotting the Kaplan-Meier freedom from endograft complications in patients predicted to be at high-risk versus those predicted to be at low-risk, with comparison by the log-rank test.
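As an illustration of this validation step, the sketch below computes a Kaplan-Meier curve and a two-group log-rank statistic in plain Python. It is not the software used in the study; the function names and the toy follow-up data are assumptions made for the example.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in months; events: 1 = complication, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0      # events at time t
        n_t = 0    # all subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            n_t += 1
            i += 1
        if d > 0:
            surv *= 1.0 - d / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_t
    return curve


def log_rank(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 d.f.) and p-value."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Toy data (months of follow-up, event flag); not study data.
low_t, low_e = [60, 60, 60, 48, 60], [0, 0, 0, 1, 0]
high_t, high_e = [6, 12, 24, 60, 30], [1, 1, 1, 0, 1]
chi2, p = log_rank(low_t, low_e, high_t, high_e)
```

In practice the two groups would be the centre 2 patients predicted to be low-risk and high-risk, and the value of the curve at 60 months would give the 5-year freedom from complications.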

Construction of Bayesian ANNs
The censoring time of patients treated at centre 1 was used to classify patients into three groups: those who developed an endograft complication within 5 years (high-risk group), those who completed 5 years of observation without a recorded endograft complication (low-risk group), and those who died within 5 years without a recorded endograft complication (unknown risk). The low- and high-risk groups for endograft complication were used to build two separate Bayesian networks, called B_low and B_high respectively, after applying a standard upsampling technique to balance the datasets [27]. Each censored event was compared with the inherent distribution of the high-risk group, p_high, and the inherent distribution of the low-risk group, p_low, by calculating the likelihood that the event was sampled from either model. Equations used to derive these likelihoods and specify the ANN are detailed in the accompanying appendix (Appendix A in S1 File).
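The general idea of balancing the two groups and then assigning an unknown-risk patient to whichever model makes their data more likely can be sketched as follows. This is only an illustration: the study's Bayesian networks are specified in Appendix A, which is not reproduced here, so an independent Gaussian per feature is swapped in as a stand-in model, and the feature values are invented.

```python
import math
import random


def upsample(minority, target_size, seed=0):
    """Random upsampling with replacement (one common balancing technique)."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return minority + extra


def fit_model(rows):
    """Per-feature mean and variance: a stand-in for a fitted network."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    variances = [max(sum((x - m) ** 2 for x in c) / n, 1e-6)
                 for c, m in zip(cols, means)]
    return means, variances


def log_likelihood(row, model):
    """Log-likelihood of one patient under the stand-in model."""
    means, variances = model
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(row, means, variances))


def assign(row, model_low, model_high):
    """Assign the patient to the group whose model gives higher likelihood."""
    ll_low = log_likelihood(row, model_low)
    ll_high = log_likelihood(row, model_high)
    return "high" if ll_high > ll_low else "low"


# Invented two-feature example (e.g. neck diameter, neck angulation).
low_rows = [[20.0, 30.0], [22.0, 28.0], [21.0, 32.0]]
high_rows = upsample([[28.0, 60.0], [30.0, 65.0]], len(low_rows))
model_low, model_high = fit_model(low_rows), fit_model(high_rows)
```

Here `assign([29.0, 62.0], model_low, model_high)` compares the two log-likelihoods in the same way that p_low and p_high are compared in the text, although the study's own likelihood equations may differ from this stand-in.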
A chi-squared test feature selection method was applied to select 19 morphological features for ANN construction for endograft complications; conventional univariate analyses were therefore not performed for feature selection and model inclusion [28]. Three-layer backpropagation ANNs were employed to predict aortic complications for centre 2, comprising 19 input, 4 hidden and 1 output neurons, respectively. Full descriptive details of the ANN structure and the ANN dependency graph are provided in Appendix B in S1 File. Clinical data regarding patients' comorbidity were added as inputs for an ANN to predict all-cause mortality at 5 years.
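To make the 19-4-1 architecture concrete, the sketch below shows a forward pass through a network of that shape. The sigmoid activations, the bias terms and the random weights are assumptions for illustration; the actual structure is described in Appendix B, and training by backpropagation is omitted here.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 19, 4, 1   # layer sizes stated in the text


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(seed=0):
    """Random starting weights; index 0 of each row is an assumed bias term."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUT + 1)]
                for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN + 1)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Map 19 morphological inputs to a single risk score in (0, 1)."""
    hidden = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
              for w in w_hidden]
    return sigmoid(w_out[0] + sum(wi * hi for wi, hi in zip(w_out[1:], hidden)))


def n_parameters():
    """Weights plus biases for a 19-4-1 network with bias terms."""
    return N_HIDDEN * (N_INPUT + 1) + N_OUTPUT * (N_HIDDEN + 1)


w_hidden, w_out = init_weights()
score = forward([0.5] * N_INPUT, w_hidden, w_out)
risk_group = "high" if score >= 0.5 else "low"   # 0.5 cut-off is an assumption
```

With bias terms, a network of this shape has 85 trainable parameters, which is useful context when weighing the number of inputs against the number of observed events.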
Validation of ANN performance was performed by plotting the Kaplan-Meier freedom from aortic complications in patients predicted to be at high-risk versus those predicted to be at low-risk, with comparison of actual outcomes in these predicted groups by the log-rank test.

Results
A chi-squared filter feature selection method ranked the importance of various aspects of aortic morphology as shown in Table 2, with higher-ranking features demonstrating greater predictive power for determining the risk of endograft complications. The 19 highest-ranking features were included in ANNs to predict aortic complications and limb complications (Table 2, Figs 1, 2 and 3). Binary data regarding patient comorbidity and demographics (Table 1) were added as inputs for an ANN to predict mortality (Fig 4).
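A chi-squared filter of this kind can be sketched as below. The way continuous measurements were discretised in the study is not stated here, so splitting each feature at its median is an assumption, as are the feature names and values.

```python
def chi2_2x2(feature_bits, outcome_bits):
    """Pearson chi-squared statistic for a 2x2 table of feature vs outcome."""
    n = len(feature_bits)
    obs = [[0, 0], [0, 0]]
    for f, y in zip(feature_bits, outcome_bits):
        obs[f][y] += 1
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            row_total = obs[i][0] + obs[i][1]
            col_total = obs[0][j] + obs[1][j]
            expected = row_total * col_total / n
            if expected > 0:
                stat += (obs[i][j] - expected) ** 2 / expected
    return stat


def binarise_at_median(values):
    """Assumed discretisation: 1 if at or above the (upper) median, else 0."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    return [1 if v >= median else 0 for v in values]


def rank_features(columns, outcome, k):
    """Return the k feature names with the largest chi-squared statistic."""
    scores = {name: chi2_2x2(binarise_at_median(vals), outcome)
              for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented example: one informative feature and one uninformative feature.
columns = {
    "neck_angulation": [10, 20, 30, 40, 50, 60, 70, 80],
    "noise": [5, 1, 7, 3, 6, 2, 8, 4],
}
outcome = [0, 0, 0, 0, 1, 1, 1, 1]   # 1 = endograft complication
top = rank_features(columns, outcome, k=1)
```

In the study the same kind of ranking was applied across all morphological measurements and the 19 highest-ranking features were retained.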

Prediction of Endograft Aortic Complications
In the training dataset (centre 1), the 19-feature Bayesian ANN allocated 45.8% of patients to the low-risk group. The 5-year freedom from aortic complications was 98.3% in the low-risk group, and 41.3% in the high-risk group (p<0.001, log-rank test; Fig 1A; c-statistic 0.763). In the validation dataset (centre 2), the ANN allocated 44.4% of patients to the low-risk group. The 5-year freedom from aortic complications was 95.9% in the low-risk group, and 67.9% in the high-risk group (p<0.001, log-rank test; Fig 1B; c-statistic 0.759).
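The c-statistic quoted alongside each result summarises discrimination. How it was derived in the study is not described here, so the sketch below makes a simplifying assumption: the outcome is treated as a binary event status and censoring is ignored, in which case the c-statistic is the proportion of event/non-event pairs in which the patient with the event received the higher predicted risk, with ties counted as one half.

```python
def c_statistic(scores, events):
    """Concordance for a binary outcome (equivalent to the area under the ROC curve).

    scores: predicted risk (e.g. 1 = high-risk, 0 = low-risk, or a probability).
    events: 1 if the patient had a complication, 0 otherwise.
    """
    with_event = [s for s, e in zip(scores, events) if e == 1]
    without_event = [s for s, e in zip(scores, events) if e == 0]
    concordant = 0.0
    for p in with_event:
        for q in without_event:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Invented example with a dichotomous high/low prediction.
predicted = [1, 1, 0, 0, 1, 0]
observed = [1, 1, 0, 0, 0, 1]
c = c_statistic(predicted, observed)
```

A value of 0.5 indicates no discrimination and 1.0 indicates perfect discrimination; a time-to-event version such as Harrell's C would additionally account for censored follow-up.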

Prediction of Endograft Limb Complications
In the training dataset (centre 1), the 19-feature Bayesian ANN allocated 52.8% of patients to the low-risk group. The 5-year freedom from endograft limb complications was 97.6% in the low-risk group, and 64.0% in the high-risk group (p<0.001, log-rank test; Fig 2A; c-statistic 0.834). In the validation dataset (centre 2), the ANN allocated 51.0% of patients to the low-risk group. The 5-year freedom from endograft limb complications was 99.3% in the low-risk group, and 92.0% in the high-risk group (p<0.001, log-rank test; Fig 2B; c-statistic 0.767).

Combined Prediction of All Endograft Complications
In the training dataset (centre 1), the 19-feature Bayesian ANN allocated 71.8% of patients to the low-risk group. The 5-year freedom from all endograft complications was 88.5% in the low-risk group, and 32.0% in the high-risk group (p<0.001, log-rank test; Fig 3A; c-statistic 0.779). In the validation dataset (centre 2), the ANN allocated 40.6% of patients to the low-risk group. The 5-year freedom from all endograft complications was 96.5% in the low-risk group, and 85.6% in the high-risk group (p<0.001, log-rank test; Fig 3B; c-statistic 0.776).

Prediction of Mortality after EVAR
An ANN combining 19 morphological features (Table 2) with patient demographics and comorbidity (Table 1) classified 59.3% of the training set patients (centre 1) as low-risk. In the training dataset, the 5-year freedom from mortality was 95% in patients predicted to be low-risk, and 13.7% in those predicted to be high-risk (p<0.001, log-rank test; Fig 4A; c-statistic 0.741). In the validation dataset, 57.3% were classified as low-risk and the 5-year freedom from mortality was 87.9% in the low-risk group vs. 79.3% in the high-risk group (p<0.001, log-rank test; Fig 4B; c-statistic 0.699).

Discussion
Although the short-term benefit of EVAR for AAA is clear, challenges remain regarding its longer-term clinical success, especially endograft-related complications and all-cause mortality [2][3][4]. The present study demonstrated that these long-term outcomes could be risk-stratified before surgery, using routinely available pre-operative data. A machine-learning technique reproducibly categorised patients' 5-year risk for both endograft complications and all-cause mortality. This adds considerably to previously published work aimed at providing patients with individualised risk profiles (IRP) for EVAR [17,29]. Risk stratification of endograft complications and mortality has considerable relevance because the need for lifelong surveillance after EVAR is predicated entirely on the incidence of endograft complications, while better prediction of long-term mortality informs patient selection for surgery. All-cause mortality was reported preferentially to aneurysm-related mortality, in order to maximise internal validity and the reproducibility of our findings. This is because autopsy is rare in the UK and aneurysm-related mortality may be confounded by differences in endograft surveillance practice, whereas all-cause mortality and endograft reinterventions can be reported with confidence, and reproducibly studied. Patients identified by an ANN to be at very high risk of all-cause mortality might also be potentially excluded from consideration for AAA repair. The ANN models developed in this study outperformed a validated IRP, the SGVI (St George's Vascular Institute) score, which was derived using Cox Proportional Hazards modelling techniques [17] (see Appendix C Figs 1, 2, 3 and 4 in S1 File). The majority of patients are low-risk, both in clinical practice and in classification by ANN or SGVI score models. 
The ANN provided a more powerful prediction of the low-risk group than has previously been possible with the SGVI score, which resulted in superior performance in simulated surveillance studies (see Appendix D in S1 File). This finding is not unexpected given the construction of ANNs, which enables them to outperform logistic or survival models by capturing multiple interacting and complex covariate effects for event prediction. To date, ANNs have successfully been used in the clinical environment as event predictors in a variety of settings including the diagnosis of myocardial infarction [30] and survival after acute coronary syndrome [31]. In other clinical arenas, studies have demonstrated that ANNs outperform logistic regression models in the prediction of autonomic dysfunction [32], or mortality following hip fracture [33]. These quoted studies compared ANNs with models derived from logistic regression; the SGVI score was derived using Cox Proportional Hazards modelling techniques, and similar comparisons in this context have also been carried out previously [34].
The impact of aortic morphology on long-term outcome of EVAR is complex and well-suited to ANN analysis, with considerable potential for interaction between aortic volume, shape, diameter and angulation. Existing models have repeatedly demonstrated that aneurysm diameter predicts reintervention after EVAR [35,36], but evidence also suggests that other aspects of aneurysm morphology contribute to long-term clinical success [23][37][38][39]. More complex considerations such as endograft configuration and deployment, or intermediate markers of patients' cardiovascular risk phenotype, could potentially be "learned" by future iterations of the ANN in prospective studies. The addition of further operative factors (length of operation, graft size, and endoleak at completion) or post-operative factors (endoleak at early surveillance scans) might further improve the discriminatory power of ANNs.
In contrast to conventional statistical models, the ANN utilised for the present study was designed as a binary classifier of a dichotomous event: endograft complications, or all-cause mortality. A limitation of this technique is that clinicians are not provided with a greater number of predicted risk strata, or a group at predicted intermediate risk. A multinomial modification of the ANN analysis technique can allow for generation of intermediate risk strata, but there were too few patients to allow adequate discrimination in this study (Appendix E in S1 File); for endograft surveillance, a dichotomous model arguably provides sufficient information for clinicians to identify those in whom surveillance could either be curtailed or intensified. A disadvantage of the ANN approach was that the best model required the measurement and input of 19 features of aortic morphology for maximum accuracy, compromising ease of use for clinicians. A more parsimonious ANN, incorporating 8 predictive features, would allow faster use by clinicians, but performed with inferior discriminatory power (see Appendix C Figs 1, 2, 3 and 4 in S1 File). A web portal for entry of morphology data, relaying information to an ANN hosted on a remote server, would mitigate this difficulty and improve the usability of the ANN solution for clinicians. A single software package was used for the assessment of aortic morphology in the present study to minimise bias and maximise reproducibility. However, for clinical practice, any locally available software allowing 3D reconstruction of CT aortograms would be suitable, further improving the acceptability of this solution. Furthermore, the model was developed and validated on a cohort of patients from two institutions in a single country. It was notable that the proportion of patients classified as high- or low-risk varied between centres, although the absolute difference in event rates between groups within each centre was clinically significant.
This might be attributable to local differences in patients' preferences for endograft limb reinterventions, and further prospective evaluation of the ANN models will be essential to inform clinicians regarding its generalizability, and improve understanding of the potential health economic implications for stratified endograft surveillance. A further potential limitation of the ANN technique was the number of input variables required compared to the number of events detected; this might challenge the generalizability of the model and further evidence is required of its performance characteristics in other groups of patients.
It has previously been suggested that surveillance should be adjusted according to patients' AAA diameter as a surrogate for the future risk of endograft complication [35]. Unfortunately, existing IRPs have not been able to define the lowest-risk patients with sufficient accuracy to suggest safe termination of surveillance in such a cohort; for example, the lowest-risk patients classified by the SGVI score continued to demonstrate a 12% 5-year risk of aortic complications [17]. Furthermore, existing risk scores have not been able to predict endograft limb complications (stenosis or occlusion), which remain a source of morbidity [24].
The ANN models developed in the present study have the potential to inform surveillance practice by stratifying the incidence of all endograft complications requiring reintervention. Existing controversies around surveillance after EVAR encompass diagnostic inaccuracy, patient attendance rates, interval presentations and adverse events caused through surveillance including ultimately unrequired reinterventions, nephrotoxicity, radiation exposure and cost. Due to these issues, surveillance protocols tend to be applied uniformly, and are empirical rather than evidence-based. There is widespread variation in practice in the UK [40,41] and internationally [42], both in the timing of scans and the imaging modality used. Many centres utilise CT for first-line imaging amid concerns regarding the sensitivity of duplex, but duplex ultrasound is proven to offer sufficient diagnostic accuracy for detecting key complications [6], is safe [43,44] and is preferred to CT by institutions with greater experience in performing EVAR [42].
Several studies have reported that the majority of complications requiring reintervention after EVAR are detected by the onset of symptoms in the interval between apparently normal surveillance scans [1][9][10][11], so that just 1.4-9% of patients undergo reintervention after EVAR directly as a result of surveillance findings rather than the development of symptoms between surveillance scans. Therefore, it could be argued that over 90% of all patients receive no benefit from post-EVAR surveillance. In addition, 15% of post-EVAR aneurysm ruptures are described in patients with no detectable stent abnormality on surveillance imaging [45]. Finally, the attendance rate at surveillance is globally less than 50% of those who originally underwent EVAR, with some reports suggesting that the figure is as low as 30% [7,46]. Surveillance can also give rise to false positive findings; in one series, 3/553 (0.5%) patients underwent unnecessary diagnostic angiography due to apparent endoleak on initial surveillance, exposing these patients to unnecessary procedural risk [8]. The present study suggests that an ANN technique for analysing aortic morphology and basic co-morbidity data might inform a more evidence-based approach to patient selection and post-operative surveillance, in which the frequency of scans can be targeted to the risk of endograft failure.

Conclusion
This study has demonstrated that it is possible to stratify the risk of key long-term outcomes after EVAR, based on routinely available pre-operative data combined with accurate assessment of aortic morphology. This might impact both patient selection and surveillance after EVAR, which remain subject to controversy in practice. The development of a user interface, further validation of the model and a feasibility study are required to enhance the acceptability of this proposal.

Supporting Information
S1 File. Appendices. Equations for ANN Specification; Description of ANN Structure; Comparison of 19-feature ANN performance with the SGVI Score and a more parsimonious 8-feature ANN; Simulated Surveillance Protocols; and Performance of a 3-group ANN for classification of high-risk, medium-risk, and low-risk patients for limb or aortic endograft complications after EVAR.