Development of prediction models of spontaneous ureteral stone passage through machine learning: Comparison with conventional statistical analysis

Objectives To develop a prediction model of spontaneous ureteral stone passage (SSP) using machine learning and logistic regression and compare the performance of the two models. Indications for management of ureteral stones are unclear, and the clinician determines whether to wait for SSP or perform active treatment, especially in well-controlled patients, to avoid unwanted complications. Therefore, suggesting the possibility of SSP would help make a clinical decision regarding ureteral stones.
Methods Patients diagnosed with unilateral ureteral stones at our emergency department between August 2014 and September 2018 were included and underwent non-contrast-enhanced computed tomography 4 weeks from the first stone episode. Predictors of SSP were applied to build and validate the prediction model using multilayer perceptron (MLP) with the Keras framework.
Results Of 833 patients, SSP was observed in 606 (72.7%). SSP rates were 68.2% and 75.6% for stone sizes 5–10 mm and <5 mm, respectively. Stone opacity, location, and whether it was the first ureteral stone episode were significant predictors of SSP. Areas under the curve (AUCs) for receiver operating characteristic (ROC) curves for MLP and logistic regression were 0.859 and 0.847, respectively, for stones <5 mm, and 0.881 and 0.817, respectively, for 5–10 mm stones.
Conclusion SSP prediction models were developed in patients with well-controlled unilateral ureteral stones; the performance of the models was good, especially in identifying SSP for 5–10-mm ureteral stones without definite treatment guidelines. To further improve the performance of these models, future studies should focus on using machine learning techniques in image analysis.


Introduction
Ureteral stones are the most common urologic emergency. They are associated with severe pain, renal obstruction, and urinary tract infection [1]. Indications for the management of ureteral stones are not clearly defined. The clinician's choice determines whether to wait for spontaneous ureteral stone passage (SSP) or perform active treatment, including extracorporeal shock wave lithotripsy (ESWL), ureteroscopy, laparoscopic removal, or percutaneous treatment. However, in well-controlled patients, active treatment is sometimes provided without waiting for SSP in order to avoid unwanted complications, including recurrent attacks of renal colic, urinary tract infection, and deterioration of renal function; such treatment might be considered over-treatment. Therefore, suggesting the possibility of SSP would help make a clinical decision regarding ureteral stones.
Predictive modelling of patient outcomes involves the development of mathematical models to predict individual patient outcomes [2]. Models can be based on traditional statistical techniques or artificial intelligence (AI) techniques [2]. AI approaches are capable not only of processing imprecise and uncertain data, which are common in clinical and biological settings, but also of processing big data that are too large or complex for conventional statistical techniques [3]. In the field of ureteral stones, AI has been used to predict the stone-free rate; however, no study has used AI to predict SSP [2].
Intervention for ureterolithiasis is recommended after a 4-week observation period for patients with a ureteral stone who visit the outpatient department through the emergency department. As recommended by the European Association of Urology (EAU) and American Urological Association (AUA) joint guidelines [4], our team determines the SSP rate of ureteral stones among patients with pain control and without complications during the observation period. Therefore, this study aimed to identify predictive prognostic factors for SSP. In addition, using those factors, we developed and compared prediction models of SSP using machine learning and logistic regression.

Materials and methods
The Institutional Review Board of the Yonsei University Health system (project no: 2019-0959-001) approved the study. The data were analyzed anonymously and retrospectively, so the requirement for consent was waived. Requests for the original data used in this analysis should be addressed to the corresponding author or the Institutional Review Board of the Yonsei University Health system (irb@yuhs.ac). Upon such a request, the Institutional Review Board of the Yonsei University Health system will review whether the researchers meet the criteria for access to confidential data.
Patients diagnosed with unilateral ureteral stones at our emergency department between August 2014 and September 2018 were included. Of the initial 868 patients, 35 were excluded for the following reasons: (I) 25 had uncontrolled pain, (II) four experienced spiking fever owing to renal obstruction, and (III) six were employed as airline pilots or soldiers, in whom an episode of intractable renal colic could be dangerous.

Institutional protocol for patients with ureteral stone episode
According to the renal colic management protocol of our emergency department, all patients underwent a detailed medical evaluation, including history taking; physical examination; urinalysis; complete blood count; routine serum chemistry measurements; kidney, ureter, and bladder radiography (KUB); and non-contrast-enhanced computed tomography (NCCT). Neither fluoroscopy nor ultrasonography was performed.
In the urology outpatient department, patients were questioned about pain severity, complications, ureteral stone history, and sensation or observation of stone fragments during urination. Fluid intake of >2 L per day was recommended during the observation period. We performed NCCT 4 weeks from the first stone episode if the stone was not spontaneously expelled. For patients who did not experience SSP, the decision to continue follow-up after 2 weeks or perform intervention was based on the physician's discretion and patient preference.

Definition of variables related to stone characteristics
The diagnosis of ureteral stones was based on the presence of an unequivocal finding of a stone on NCCT. Stone size was defined according to its largest diameter and was stratified into groups: those measuring <5 mm, 5-10 mm, and >10 mm. The location of the stone was classified into two groups based on its anatomical position in the upper or lower ureter. Plain radiographic characteristics were used to classify stones as radiopaque or radiolucent [4]. In patients with a history of ureteral stones, a present episode on the contralateral side was considered a first ureteral stone episode. The estimated glomerular filtration rate (eGFR) was determined using the Modification of Diet in Renal Disease formula [5].

Assessment of predictors of spontaneous stone passage at 4 weeks from the stone episode
Predictors of SSP were analyzed based on 1) laboratory investigations, including urinalysis, complete blood count, and routine serum chemistry measurements; 2) radiographic results, such as stone size, stone location, radiographic characteristics, and presence of hydronephrosis; 3) medical expulsive therapy (MET) using α-blockers; and 4) a history of stones and interventions.

Machine learning
Prior to formulating machine learning models, the data set was randomly divided into two mutually exclusive sets, with random allocation of SSP and non-SSP cases: training (80%) and validation (20%) [6]. The training set was used to construct the prediction model, while the validation set was used to validate the performance of each model. The training set was in turn randomly divided into two subsets: training (80%) and testing (20%). A concise description of each machine learning algorithm is provided below. All machine learning models were implemented using the Keras framework with the R programming language (R Core Team, Vienna, Austria, 2016) [7].
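The two-stage partition described above can be illustrated with a short sketch. It is written in Python with the standard library only, not in the R/Keras environment used in the study. The helper name split_indices, the fixed seed, and the use of a simple unstratified shuffle are assumptions made for illustration.

```python
import random

def split_indices(n_items, first_fraction, rng):
    """Shuffle the indices 0..n_items-1 and cut them into two disjoint lists."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(n_items * first_fraction)
    return indices[:cut], indices[cut:]

rng = random.Random(42)  # fixed seed (assumption) so the split is reproducible
n_patients = 833         # cohort size reported in the Results

# Stage 1: hold out 20% of the cohort as the validation set.
development_idx, validation_idx = split_indices(n_patients, 0.8, rng)

# Stage 2: split the remaining 80% again into training (80%) and testing (20%).
train_pos, test_pos = split_indices(len(development_idx), 0.8, rng)
train_idx = [development_idx[i] for i in train_pos]
test_idx = [development_idx[i] for i in test_pos]

print(len(train_idx), len(test_idx), len(validation_idx))
```

Because the three index lists are disjoint and together cover the whole cohort, no patient used for fitting can reappear in the validation set.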

Multilayer perceptron (MLP)
MLP is a class of artificial neural network (ANN) with multiple, or deep, layers of nodes. Each node is an artificial neuron, modelled on the connectivity patterns of animal neurons, and uses a non-linear activation function that makes it possible to separate data that are not linearly separable. The Keras framework, a recent deep learning interface, was employed to construct an MLP model in this study [7].
The architecture of the MLP model used in this study was composed of an input layer, three fully connected hidden layers with 128 and 16 nodes, and an output layer. The input layer referred to input data, such as predictors of SSP. The hidden layers were layers where computing of the input features occurred. The node in the output layer represented the computed prediction [8]. The learning process of this MLP is visualized in Fig 1.
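To make the flow from input layer to output node concrete, the following is a minimal sketch of a forward pass through a fully connected network, written in plain Python rather than Keras. Several details are assumptions and are not taken from the study: the text gives two widths (128 and 16) for three hidden layers, so the sketch assumes widths of 128, 16, and 16; it assumes ReLU activations in the hidden layers and a sigmoid output; the number of input features and the example values are made up; and the weights are random, so the output is not a trained prediction.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, weights, biases):
    """Weighted sum of the inputs plus a bias, computed for every node in the layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Non-linear activation assumed for the hidden layers."""
    return [max(0.0, v) for v in values]

def sigmoid(value):
    """Squash the output node into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-value))

def forward(features, layers):
    """Pass one patient's features through the hidden layers and the output node."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activations, weights, biases)[0])

rng = random.Random(0)
n_features = 6                              # hypothetical number of encoded predictors
layer_sizes = [n_features, 128, 16, 16, 1]  # hidden widths 128, 16, 16 are an assumption
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

example_patient = [0.5, 1.0, 0.0, 1.0, 0.0, 1.0]  # made-up encoded predictors
probability = forward(example_patient, layers)
print(f"Untrained output for the example patient: {probability:.3f}")
```

Training would adjust the weights and biases so that this output approaches the observed SSP status; that step is omitted here.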

Statistical analysis
Results were reported as mean ± standard deviation for continuous variables and as percentages for categorical variables. For univariate analyses, the t-test was used for continuous variables and the chi-square test or Fisher's exact test for categorical variables. Univariate logistic regression analyses were performed, and multivariate logistic regression analyses were performed on significant factors associated with SSP in the univariate analyses. The area under the receiver operating characteristic (ROC) curve (AUC) was used to compare the performance of models. All reported p-values were two-sided, and statistical significance was set at p<0.05. Statistical analyses were performed using the Statistical Package for Social Sciences, version 23.0, for Windows (SPSS, Chicago, IL, USA).
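As an illustration of the AUC used to compare models, the sketch below computes it as the proportion of (SSP, non-SSP) pairs in which the SSP case receives the higher predicted score, with ties counted as one half. This pairwise formulation is one standard way to obtain the area under the ROC curve and is not necessarily the procedure implemented in SPSS. The function name roc_auc and all labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up example: 1 = SSP observed, 0 = no SSP; scores are predicted probabilities.
labels = [1, 1, 1, 0, 1, 0, 0, 1]
mlp_scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.85]
logit_scores = [0.80, 0.60, 0.70, 0.65, 0.50, 0.45, 0.20, 0.90]

print(f"AUC, first model:  {roc_auc(labels, mlp_scores):.3f}")
print(f"AUC, second model: {roc_auc(labels, logit_scores):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.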

Results
Patient characteristics are shown in Table 1. Of the 833 patients included for analysis, SSP was observed in 606 (72.7%). eGFR, the first episode of ureteral stone, the spontaneous passage of the previous stone, location, opacity of the stone, and the presence of hydronephrosis were significantly different between the groups with and without SSP.
The predictive performance of each prediction model is shown in Table 4. For stone size <5 mm, AUCs for MLP and logistic regression were 0.859 and 0.847 (p = 0.410), respectively (Fig 2), and for stone size 5-10 mm, these were 0.881 and 0.817 (p = 0.170), respectively (Fig 2). The sensitivity and specificity of each model are listed in Table 4.
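For readers less familiar with these measures, the sketch below shows how sensitivity and specificity follow from predicted probabilities once a decision threshold is fixed. SSP is treated as the positive class. The 0.5 threshold, the function name, and the example data are assumptions for illustration and do not reproduce the values in Table 4.

```python
def sensitivity_specificity(labels, probabilities, threshold=0.5):
    """Return (sensitivity, specificity) with label 1 (SSP) as the positive class."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1  # SSP correctly predicted
        elif y == 1:
            fn += 1  # SSP missed
        elif predicted == 0:
            tn += 1  # non-SSP correctly predicted
        else:
            fp += 1  # non-SSP predicted as SSP
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 1 = SSP observed, 0 = no SSP.
labels = [1, 1, 1, 1, 0, 0, 0, 1]
probabilities = [0.90, 0.70, 0.40, 0.80, 0.30, 0.60, 0.10, 0.55]

sensitivity, specificity = sensitivity_specificity(labels, probabilities)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Raising the threshold generally trades sensitivity for specificity, which is why the AUC, computed over all thresholds, is used for the overall comparison.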

Discussion
The greatest dilemma the urologist faces today is deciding whether to wait for SSP or immediately perform intervention for ureterolithiasis. Of the parameters for diagnostic evaluation recommended by the guidelines [9], in addition to the well-known predictors of SSP, location and opacity of the stones and the first ureteral stone episode were identified as predictors of SSP in this study. In addition to analysis using logistic regression models, this study investigated the predictive performance of machine learning based prediction models using MLP, a class of deep learning method. To the best of our knowledge, no report has compared the predictive performance of deep learning or machine learning with conventional approaches.
According to the current literature, the size and location of the stone, serum C-reactive protein (CRP) concentration, neutrophil to lymphocyte ratio, pyuria, hydronephrosis, helical CT findings of perinephric fat stranding and the tissue-rim sign related to inflammatory changes, and Hounsfield unit of the stone are predictors associated with SSP [10][11][12][13][14]. However, if no intervention is planned, an examination of sodium, potassium, CRP, and blood coagulation time may be omitted [9]. Measuring Hounsfield units was recommended to decide whether to consider ESWL. There were no definite criteria for classifying CT findings such as perinephric fat stranding and the tissue-rim sign. Therefore, we did not include these variables as potential predictors in our analysis. Additionally, our team is currently investigating predictors of SSP that include image analysis of Hounsfield units, anatomical abnormalities, and malformation.
Stone size and location are generally considered the most important factors associated with the possibility of SSP. Limited data were found on SSP according to stone size. Ueno et al. analyzed 520 ureteral calculi based on size and reported that 286 (55.0%) calculi passed spontaneously, and the mean length was 6.3 mm [15]. In our previous study, we reported an SSP rate of 88.2% for stones up to 5 mm, and 62.2% for 5-10 mm stones [10]. However, a meta-analysis of five patient groups (224 patients) estimated that 68% of stones ≤5 mm would pass spontaneously (95% CI: 46% to 85%). For stones >5 mm and ≤10 mm, a meta-analysis of three groups (104 patients) estimated that 47% would pass spontaneously (95% CI: 36% to 59%). This study showed that the SSP rates within 4 weeks for patients with stones <5 mm, and those with stones 5-10 mm, were 75.6% and 68.2%, respectively [4]. Our findings, which are from the largest sample of 833 patients, were in accordance with the previous studies.
A history of ureteral stones, α-blocker usage (MET), and hydronephrosis have been presented as predictive factors for SSP in previous studies. Ozcan et al. presented a history of SSP as a positive predictive factor for SSP in 251 patients with 4-10 mm distal ureteral stones [16]. However, a previous SSP might have caused permanent changes in the ureter owing to inflammation. In our cohort, a history of SSP was not found to be a significant parameter in the univariate analysis. The first ureteral stone episode was found to be a positive predictor of SSP in the multivariate analysis. Patients using α-blockers (MET), calcium channel blockers, and phosphodiesterase type 5 inhibitors were more likely to pass stones with fewer colic episodes than those not undergoing such therapy [17][18][19]. The EAU guideline strongly recommends α-blockers as MET for (distal) ureteral stones >5 mm [9]. However, in our study population, MET was not a significant predictor of SSP. Further, in Korea, medication for MET is not reimbursed for patients with ureteral stones; thus, only a limited number of patients received medication for MET, based on the physician's preference. The presence of hydronephrosis was a negative predictor of SSP in a previous study [20]. However, in our study, hydronephrosis was not found to be a significant parameter in the multivariate analysis. We presume that this is because the degree of hydronephrosis could be affected by the time at which CT was performed and by other physical parameters related to stones, such as shape, size, and anatomical features of the ureter.
According to the EAU guidelines, an exact cut-off size for stones that may pass spontaneously cannot be provided; however, ureteral stones <6 mm can pass spontaneously in well-controlled patients [9]. For stones measuring 5-10 mm, in contrast, the optimal treatment strategy remains unclear. Therefore, clinicians may prefer interventions, such as ESWL or ureteroscopic ureterolithotomy, to conservative management.
Although each medical evaluation and parameter has its own meaning and role in evaluating a patient's disease status, on their own they cannot accurately predict patient outcome. Predicting patient outcome in medicine, including in ureterolithiasis, requires the incorporation of several parameters and the handling of complex and big data. We believe that this is why models are being developed to predict patient outcome, including in ureterolithiasis. We attempted to develop prediction models using artificial intelligence (AI) technology. Indeed, the number of articles applying machine learning to medical research has been growing rapidly in recent years [21,22]. In particular, deep learning, a class of machine learning, is increasingly being applied to diagnosis and prediction related to medical imaging, yielding impressive results [8]. It will benefit both patients and clinicians who are willing to adopt modern technologies.
In the field of ureterolithiasis, AI has been widely used to predict the stone-free rate; for example, a recent study by Shabaniyan et al. reported 94.8% accuracy in predicting the postoperative outcome of percutaneous nephrolithotomy [23]. However, to the best of our knowledge, no study has reported the prediction of SSP by AI techniques [2]. While this study implemented a deep learning approach for constructing the SSP prediction model, the "black box" nature of such models, that is, the inability to identify the reason for each decision, is a limitation of AI based diagnosis [22]. Doctors will rarely follow the advice of a machine if they cannot see the reasoning underlying that advice, especially when the responsibility for the patient remains with the clinician [22,24]. Studies on interpretability are ongoing, and some of them are showing progress.
Our study had several strengths. Most previous studies only described the predictive factors of SSP [11,14,25,26]. To our knowledge, this is the first study to provide prediction models of SSP based on the diagnostic parameters recommended by the guidelines, with the largest cohort of its kind. Moreover, our study compared a deep learning method, MLP based on the Keras framework, with a conventional statistical method, logistic regression, in predicting SSP. Although MLP had a higher AUC and accuracy than logistic regression, there was no statistically significant difference in predictive power between the two models. However, MLP showed a higher specificity than logistic regression, especially for ureteral stones 5-10 mm in size. This is the grey zone in which clinicians have difficulty deciding between intervention for ureterolithiasis and waiting for SSP. Since MLP showed 100% specificity in predicting SSP for ureteral stones 5-10 mm in size, we believe that MLP could be used in real clinical settings to suggest waiting for SSP in patients with 5-10 mm ureteral stones.
However, there were several limitations. First, the level of patient compliance in terms of fluid intake would have differed among patients; hence, this could not be included as a parameter. Second, although radiopaque stone was identified as a significant predictor of SSP, we did not analyze image parameters, including the Hounsfield unit. Third, this study was not originally intended to find a minimal-optimal subset of features; in a future study, we could investigate the minimal-optimal features of SSP using techniques such as the maximum relevance-minimum redundancy algorithm for feature selection or linear discriminant analysis for dimensionality reduction.
In this study, predictive models of SSP were developed in patients with unilateral ureteral stones. In particular, our findings can be useful in identifying a low likelihood of SSP for 5-10 mm ureteral stones, for which there are no definite treatment guidelines. Although the deep learning method, MLP based on the Keras framework, did not show superior performance in predicting SSP compared to the conventional statistical method, logistic regression, future studies should attempt to improve the predictive power using image analysis.