An efficient machine learning framework to identify important clinical features associated with pulmonary embolism

  • Baiming Zou,

    Roles Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    bzou@email.unc.edu

    Affiliations Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States of America, School of Nursing, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States of America

  • Fei Zou,

    Roles Methodology, Writing – review & editing

    Affiliation Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States of America

  • Jianwen Cai

    Roles Methodology, Writing – review & editing

    Affiliation Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States of America

Abstract

A misdiagnosis of pulmonary embolism (PE) can have severe consequences such as disability or death. In clinical practice, accurately identifying the key clinical features of PE is crucial both for promptly recognizing potential PE patients who may be asymptomatic and for avoiding misdiagnosis of PE as asthma exacerbation in patients with symptoms such as dyspnea or chest pain. However, reliably identifying these important features is challenging because many factors influence the likelihood of developing PE in complex ways (e.g., through interactions among these factors). To address this difficulty, we present an effective framework, PermFIT-DNN, that combines the deep neural network (DNN) model with the permutation-based feature importance test (PermFIT) procedure. We applied the PermFIT-DNN framework to data from a PE study of asthma exacerbation patients. Our results show that the PermFIT-DNN framework can robustly identify key features for classifying PE status, and that the identified features can also aid in accurately predicting PE risk.

1 Introduction

Pulmonary embolism (PE) is a blockage in one of the arteries of the lung, which can occur in an artery at the centre or near the edge of the lung. Frequently, PE results from a blood clot that forms in the legs or another part of the body and travels to the lung [1–3]. PE can cause pulmonary hypertension [4–8], and it is associated with the risk of major bleeding [9–12]. Untreated PE can lead to fatal conditions such as chronic pulmonary hypertension, acute right heart failure, permanent damage to the lung, disability, or even death. A false diagnosis thus exposes patients to unnecessary risk of complications from PE [13–16]. Clinically, PE may present with conventional symptoms such as shortness of breath (dyspnea), coughing up blood, and pleuritic chest pain, but also with other symptoms, for example, insidious onset of breathlessness over days or weeks, or syncope with relatively few respiratory symptoms. On the other hand, many people who have pulmonary embolism do not show any symptoms. The signs and symptoms of PE vary greatly depending on the size of the clot, how much of the lung is involved, and whether the patient has an underlying medical condition [17–20].

In addition to the heterogeneity of PE symptoms and clinical presentations, many factors are associated with the risk of developing PE. Studies have shown that the major factors contributing to an increased risk of pulmonary embolism include heart disease, certain types of cancer, obesity, acute paraplegia, accidental and operative trauma, air travel, inactivity, and smoking [21–26]. Even more challenging, these clinical features and symptoms interact with each other and jointly influence the propensity of developing PE in a complex fashion. Identifying important clinical features associated with PE status helps to identify potential PE patients who do not show any symptoms. This can raise prompt warnings for physicians and patients so that the exact PE status can be confirmed in a timely manner by computed tomographic pulmonary angiography (CTA) and appropriate treatment decisions can be made. Furthermore, identifying important clinical features of PE can help to avoid diagnosing PE as asthma exacerbation in cases with symptoms such as dyspnea or chest pain [27]. However, it is challenging to determine the key clinical features because of the complex functional relationship between risk features and the PE phenotype. On the other hand, many machine learning methods allow robust modeling of the complex relationship between disease outcomes and clinical features. In particular, deep learning methods, e.g., the deep neural network (DNN) [28, 29], are powerful machine learning tools that can accurately approximate a complex functional relationship between disease outcomes and clinical features [30]. Indeed, some machine learning methods, such as support vector machine and random forest, have been used for PE prediction [31], but not for identifying critical features associated with PE status.

Though machine learning models can offer strong prediction power under complex functional structures, they lack transparency: the abstract algorithms involved make it difficult to interpret each feature’s role in predicting the disease outcome. Robustly identifying the important clinical features associated with PE risk is therefore critical yet not trivial. An existing procedure for this purpose is to adopt a LASSO method to identify the important clinical features, and to use them for PE prediction via a logistic regression model [32]. However, this framework requires a (generalized) linear additive assumption between the clinical features and the propensity of developing PE, which may not hold and is unverifiable in practice. On the other hand, a newly proposed permutation-based feature importance test (PermFIT) provides a universal framework for various machine learning models to identify important features, even when these features are potentially highly correlated [33]. Motivated by the challenge of identifying important clinical features associated with the risk of PE under a complex functional relationship, in this paper we present a permutation-based feature importance test for the DNN model (PermFIT-DNN). The PermFIT-DNN framework combines the appealing features of the PermFIT procedure with the power of our recently proposed scoring algorithm, which improves the stability of the conventional DNN model [34].

Overall, the aim of this paper is to present an efficient framework for identifying important clinical features related to the risk of pulmonary embolism and for providing early warnings to potential patients who may be asymptomatic. Our analysis reveals that the PermFIT-DNN method outperforms commonly used machine learning techniques in accurately detecting the key features related to the risk of PE. Furthermore, using these identified important features leads to notably accurate predictions of PE risk. The paper is structured as follows. In the Methods section, we outline the process of using machine learning methods to model the complex relationship between risk factors and PE status, and the PermFIT procedure for identifying the important features associated with PE status under this complex relationship. In the Results section, PermFIT-DNN is applied to identify the important clinical features related to PE risk based on data from a clinical study [35], and the results are compared with those of other machine learning models. Finally, the paper concludes with a brief discussion.

2 Methods

2.1 Machine learning methods for modeling complex association relationship between clinical features and pulmonary embolism risk

Let X = (X1, …, Xp) be a p-dimensional vector of clinical features (e.g., age, body mass index, hypertension, history of PE, etc.), and let Y be the binary PE status (Y = 1 and 0 for PE positive and negative, respectively) with π(X) = E(Y|X) = Pr(Y = 1|X), i.e., the conditional probability of being PE positive given the clinical features X. To predict PE status, we need to estimate π(X = x). Traditionally, it is estimated via a logistic regression model. This parametric modeling strategy makes a strong assumption about the relationship between the clinical features and the probability of developing PE, which may not hold and is difficult to verify in practice. To relax this restrictive assumption, machine learning methods are often adopted. Here, we investigate three frequently used machine learning models, i.e., deep neural network (DNN) [28, 29], random forest (RF) [36, 37], and support vector machine (SVM) [38, 39], for their performance in classifying the binary PE status using the clinical features. Of these, SVM and RF, but not DNN, have previously been used to predict PE status [31]. Unlike the conventional DNN method, we adopt the stable DNN procedure, which can improve prediction performance [34, 40]. Specifically, the stable DNN introduces two extra procedures. First, bootstrap aggregating (bagging), a machine learning ensemble meta-algorithm [41], is adopted to increase the stability and accuracy of a single DNN [42]. However, bagging alone does not guarantee that each individual DNN predicts stably, because of random parameter initialization. Second, to further boost DNN performance, a filtering algorithm is adopted to remove poorly performing bagged DNNs according to the principle that “many could be better than all” [40, 43].
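The bagging-plus-filtering idea can be sketched as follows. This is a minimal illustration only, not the authors' implementation (their R package is deepTL): a small logistic model with random initialization stands in for a single DNN, the out-of-bag log-loss is an assumed filtering criterion, and keep_fraction is a hypothetical tuning parameter.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_one(w, xi):
    """Estimated Pr(Y = 1 | xi); the last entry of w is the intercept."""
    return sigmoid(sum(a * b for a, b in zip(w, xi)) + w[-1])


def fit_base_learner(X, y, rng, epochs=200, lr=0.5):
    """Stand-in for one DNN: logistic model with random initial weights."""
    p, n = len(X[0]), len(X)
    w = [rng.gauss(0.0, 1.0) for _ in range(p + 1)]
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_one(w, xi) - yi
            for j in range(p):
                grad[j] += err * xi[j]
            grad[p] += err
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def log_loss(w, X, y):
    """Mean binary cross-entropy of model w on (X, y)."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_one(w, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def bagged_filtered_ensemble(X, y, n_models=20, keep_fraction=0.5, seed=0):
    """Fit bootstrap replicates, then keep only the best out-of-bag performers."""
    rng = random.Random(seed)
    n = len(X)
    scored = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:  # no held-out rows to score this replicate
            continue
        w = fit_base_learner([X[i] for i in idx], [y[i] for i in idx], rng)
        loss = log_loss(w, [X[i] for i in oob], [y[i] for i in oob])
        scored.append((loss, w))
    scored.sort(key=lambda t: t[0])  # lowest out-of-bag loss first
    n_keep = max(1, int(len(scored) * keep_fraction))
    return [w for _, w in scored[:n_keep]]


def ensemble_predict(models, xi):
    """Average the predicted probabilities of the retained models."""
    return sum(predict_one(w, xi) for w in models) / len(models)
```

Averaging only the retained models is what separates this from plain bagging: replicates that landed in a poor solution because of their random starting weights are dropped rather than averaged in.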

2.2 Identification of important clinical features of PE risk via permutation feature importance test (PermFIT)

Though machine learning methods relax the restrictive assumptions made in traditional parametric methods (e.g., linear or logistic regression) and improve prediction accuracy, they lack transparency regarding the role of each feature in the disease outcome. To identify each feature’s effect on the disease outcome under a complex functional relationship, we need to establish valid statistical inference for machine learning models. Herein, we adopt the permutation-based feature importance test (PermFIT) procedure, which is applicable to various machine learning models [33]. Based on it, we present a powerful framework to select important clinical features associated with the risk of developing PE. Although permutation-based feature importance assessment methods have been proposed for the random forest and DNN models, these methods either do not conduct any statistical inference or cannot provide valid inference regarding feature importance [44, 45].

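We define the feature importance score $M_j$ of $X_j$ (i.e., the $j$th feature in $X$, $j = 1, \ldots, p$) as the expected squared difference between $\pi(X)$ and $\pi(X^{(j)})$, where $X^{(j)} = (X_1, \ldots, X_{j-1}, X_j', X_{j+1}, \ldots, X_p)$ is a rearranged $X$ with its $j$th feature replaced by $X_j'$, a random permutation of the elements of $X_j$. The importance score $M_j$ can be re-expressed as

$$M_j = E\left[Y - \pi(X^{(j)})\right]^2 - E\left[Y - \pi(X)\right]^2,$$

which is zero only when $\pi(X) \equiv \pi(X^{(j)})$, implying no contribution of $X_j$ to $Y$ conditional on the other covariates. The stronger the impact of $X_j$ on $Y$, the larger $M_j$ is expected to be. Furthermore, $M_j$ can be estimated empirically. Let $N$ be the sample size, let $X_j' = (X_{1j}', \ldots, X_{Nj}')$ be a random sample of the elements in $X_j$ without replacement, and let the empirical permutation importance score be

$$\tilde{M}_j = \frac{1}{N} \sum_{i=1}^{N} M_{ij}, \quad \text{where} \quad M_{ij} = \left[Y_i - \pi(X_{i\cdot}^{(j)})\right]^2 - \left[Y_i - \pi(X_{i\cdot})\right]^2,$$

with $X_{i\cdot} = (X_{i1}, \ldots, X_{ip})$ and $X_{i\cdot}^{(j)} = (X_{i1}, \ldots, X_{i,j-1}, X_{ij}', X_{i,j+1}, \ldots, X_{ip})$.

Note that $E(\tilde{M}_j) = M_j$. The estimate of $\pi(\cdot)$, i.e., $\hat{\pi}(\cdot)$, can be obtained using the aforementioned machine learning models and the parametric logistic regression (LOG). In particular, the DNN method we used is the stable DNN model [34, 40]. We estimate $M_{ij}$ as

$$\hat{M}_{ij} = \left[Y_i - \hat{\pi}(X_{i\cdot}^{(j)})\right]^2 - \left[Y_i - \hat{\pi}(X_{i\cdot})\right]^2.$$

Under finite sample size, to avoid potential overfitting of the approximator obtained from the machine learning method, we employ a cross-fitting strategy that separates the input data into training and validation sets, with the training set used for generating $\hat{\pi}$ and the validation set for estimating $M_j$. Let $\hat{\pi}_T$ be the estimate of $\pi(\cdot)$ from the training set, and let $D_V$ be the validation set. With $\hat{M}_{ij}$ computed from $\hat{\pi}_T$, we obtain the feature importance score estimate $\hat{M}_j$ and the variance estimate of $\hat{M}_j$ as:

$$\hat{M}_j = \frac{1}{|D_V|} \sum_{i \in D_V} \hat{M}_{ij}, \qquad \widehat{\mathrm{Var}}(\hat{M}_j) = \frac{1}{|D_V|\,(|D_V| - 1)} \sum_{i \in D_V} \left(\hat{M}_{ij} - \hat{M}_j\right)^2.$$

Based on these, we construct the test statistic for the importance hypothesis test of feature $X_j$ as follows:

$$\delta_j = \frac{\hat{M}_j}{\sqrt{\widehat{\mathrm{Var}}(\hat{M}_j)}}. \qquad (1)$$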

The PermFIT framework can be summarized in the following Algorithm 1, and the implemented R package is available at https://github.com/SkadiEye/deepTL.

Algorithm 1 Important Feature Identification Procedure for Machine Learning Models

1: Pre-specify a significance cutoff p-value and randomly split the data as training set and testing set.

2: Fit the machine learning model on the training set, then use the testing set to evaluate the test statistic δ via Eq (1) and the corresponding p-value for each feature.

3: Identify the important features by comparing the evaluated p-values with the pre-specified cutoff.
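The steps of Algorithm 1 can be sketched in Python as below. This is a hedged illustration, not the deepTL implementation. The following are assumptions made here: fit_logistic is a hypothetical stand-in for any learner, the importance score is the increase in squared error after permuting a feature, per-sample scores are averaged over n_perm permutations, and the p-value is one-sided from a standard normal reference.

```python
import math
import random
from statistics import NormalDist, mean, variance


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Hypothetical stand-in learner; returns a function x -> Pr(Y = 1 | x)."""
    p, n = len(X[0]), len(X)
    w = [0.0] * (p + 1)  # last entry is the intercept
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(a * b for a, b in zip(w, xi)) + w[p]) - yi
            for j in range(p):
                grad[j] += err * xi[j]
            grad[p] += err
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return lambda xi: sigmoid(sum(a * b for a, b in zip(w, xi)) + w[p])


def permfit(X, y, fit, n_valid, n_perm=100, seed=0):
    """Return one (importance, std_error, p_value) tuple per feature."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    valid, train = idx[:n_valid], idx[n_valid:]
    # Step 1-2: fit on the training split only (cross-fitting).
    predict = fit([X[i] for i in train], [y[i] for i in train])
    Xv = [list(X[i]) for i in valid]
    yv = [y[i] for i in valid]
    base_loss = [(yi - predict(xi)) ** 2 for xi, yi in zip(Xv, yv)]
    results = []
    for j in range(len(X[0])):
        col = [xi[j] for xi in Xv]
        diff = [0.0] * n_valid  # per-sample importance, averaged over permutations
        for _ in range(n_perm):
            perm = col[:]
            rng.shuffle(perm)
            for i, (xi, yi) in enumerate(zip(Xv, yv)):
                x_perm = xi[:j] + [perm[i]] + xi[j + 1:]
                diff[i] += ((yi - predict(x_perm)) ** 2 - base_loss[i]) / n_perm
        m_hat = mean(diff)
        se = math.sqrt(variance(diff) / n_valid)  # sample variance of the mean
        # Step 2-3: test statistic and one-sided p-value (assumed reference: N(0, 1)).
        p_value = 1.0 if se == 0 else 1.0 - NormalDist().cdf(m_hat / se)
        results.append((m_hat, se, p_value))
    return results
```

A feature would then be flagged as important when its p-value falls below the pre-specified cutoff, for example `[j for j, r in enumerate(results) if r[2] < 0.05]`.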

2.3 Data source

In this paper, we applied the PermFIT framework to a cohort of asthma exacerbation patients from our earlier retrospective clinical study (with permission granted by the institutional review board (IRB) of the University of Florida (UF), Gainesville, Florida (IRB #: 201802508)) [35], to identify the important features associated with acute PE. The raw data were extracted from the patients’ electronic health records in fully anonymized format, with the requirement for informed consent waived by the IRB. This led to a total of 3,660 samples being extracted.

Among the total of 3,660 asthma exacerbation patients, the final study sample for our analysis included the 758 patients who underwent CTA. Among these patients, a total of 145 were confirmed as PE positive via CTA. Under the PermFIT framework, we adopt the aforementioned machine learning models, i.e., stable DNN, random forest, and support vector machine, together with a parametric logistic regression model, to identify the important features associated with PE status among the 22 collected clinical features. Summary statistics for 16 major clinical features out of the 22 are given in Table 1. The remaining 6 clinical features are: atrial fibrillation (97 for yes and 661 for no), ED visit in previous year (130 for yes and 628 for no), coronary artery disease or peripheral vascular disease (178 for yes and 580 for no), use of contraceptives (13 for yes and 745 for no), fractures or general anesthesia in prior month (4 for yes and 754 for no), and hemoptysis (7 for yes and 751 for no).
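As a quick arithmetic check on the cohort described above, the short Python snippet below confirms that each listed yes/no count sums to the 758 CTA patients and computes the proportion of PE-positive patients. The variable names are illustrative; the numbers are those reported in the text.

```python
# Counts as reported in the text for the 758 patients who underwent CTA.
n_cta = 758
n_pe_positive = 145

# (yes, no) counts for the six features not shown in Table 1.
binary_features = {
    "atrial fibrillation": (97, 661),
    "ED visit in previous year": (130, 628),
    "coronary artery disease or peripheral vascular disease": (178, 580),
    "use of contraceptives": (13, 745),
    "fractures or general anesthesia in prior month": (4, 754),
    "hemoptysis": (7, 751),
}

# Every feature should account for all 758 patients.
totals_ok = all(yes + no == n_cta for yes, no in binary_features.values())

# Proportion of CTA-confirmed PE among the analysed patients.
pe_prevalence = n_pe_positive / n_cta

# Share of "yes" responses per feature, to show how sparse some features are.
yes_share = {name: yes / n_cta for name, (yes, _) in binary_features.items()}

print(totals_ok, round(100 * pe_prevalence, 1))
```

The class imbalance visible here (roughly one positive in five) is what motivates a precision-recall based metric in the evaluation.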

Table 1. Major clinical feature summary statistics.

https://doi.org/10.1371/journal.pone.0292185.t001

3 Results

We adopted a cross-validation with an approximately 10:1 ratio, randomly selecting 65 samples for testing and using the rest for training. In the analysis, we set 4 hidden layers with 50, 40, 30, and 20 hidden nodes, respectively, for the stable DNN method. For the RF method, we grew 1,000 trees, and the hyper-parameter settings for RF and SVM were searched via cross-validation. Under the PermFIT framework (with 100 permutations), we compared the DNN, RF, and SVM methods (referred to as PermFIT-DNN, PermFIT-RF, and PermFIT-SVM, respectively) and logistic regression (referred to as PermFIT-LOG) for identifying the important clinical features at the significance level 0.05. The important features identified by each model and the corresponding p-values are presented in Table 2. The results in Table 2 indicate that different methods identify different sets of important features. At the significance level of 0.05, the logistic regression, support vector machine, random forest, and stable DNN methods identified 3, 2, 2, and 3 clinical features, respectively, as important among the total of 22 features. Among the identified important features, we notice that PE history is unanimously identified by all models as a highly important feature.

Table 2. Identified important clinical features.

https://doi.org/10.1371/journal.pone.0292185.t002

With the selected important features, we evaluated the performance for predicting PE on the testing samples and compared it with that of the corresponding method using all 22 clinical features. In evaluating the classification performance, besides the accuracy (Accuracy) and the area under the receiver operating characteristic curve (AUC), we also adopted the precision-recall AUC (PR-AUC), since this data set is imbalanced, i.e., about 19.1% are PE positive. PR-AUC is a useful cutoff-independent metric for evaluating the performance of a classifier on positive samples. For the accuracy evaluation, we used the cutoff probability 0.5. That is, the PE status $y_i$ for a patient $i$ with clinical features $X = x_i$ is predicted as $\hat{y}_i = 1$ if $\hat{\pi}(x_i) > 0.5$ and $\hat{y}_i = 0$ otherwise.
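The three metrics can be computed as in the following pure-Python sketch. Several choices here are assumptions rather than details given in the text: predictions are treated as positive when the probability is strictly above the cutoff, AUC is computed through pairwise comparisons of positive and negative scores, and PR-AUC is approximated by average precision. The labels and probabilities are toy values, not study data.

```python
def accuracy(y_true, prob, cutoff=0.5):
    """Fraction of correct labels when predicting 1 for prob > cutoff."""
    hits = sum(int(p > cutoff) == y for y, p in zip(y_true, prob))
    return hits / len(y_true)


def roc_auc(y_true, prob):
    """Probability that a random positive scores above a random negative."""
    pos = [p for y, p in zip(y_true, prob) if y == 1]
    neg = [p for y, p in zip(y_true, prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, prob):
    """Mean precision at the rank of each positive (tied scores not grouped)."""
    ranked = sorted(zip(prob, y_true), key=lambda t: -t[0])
    true_pos, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank
    return total / true_pos


# Toy example: two positives among five patients.
y_toy = [1, 0, 1, 0, 0]
p_toy = [0.9, 0.6, 0.4, 0.3, 0.1]
print(accuracy(y_toy, p_toy), roc_auc(y_toy, p_toy), average_precision(y_toy, p_toy))
```

Unlike Accuracy, the last two metrics depend only on the ranking of the predicted probabilities, which is why they are described as cutoff-independent.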

The results in Table 3 show that the three machine learning models, i.e., the stable DNN, random forest, and SVM, noticeably outperform the parametric logistic regression model in terms of Accuracy, AUC, and PR-AUC when all 22 features are included, achieving (0.862, 0.774, 0.956), (0.892, 0.775, 0.956), and (0.892, 0.700, 0.935) for (Accuracy, AUC, PR-AUC), respectively. In contrast, the values of (Accuracy, AUC, PR-AUC) for the logistic regression are (0.108, 0.252, 0.771). This implies that there exists a complex functional relationship between the clinical features and PE status, which the logistic regression model severely misspecifies. Furthermore, the stable DNN and random forest methods have almost identical performance in terms of AUC and PR-AUC, and both are superior to the SVM method. This observation indicates that the stable DNN and random forest can better model the complex functional relationship using all 22 features. However, when we examine the difference in Accuracy, AUC, and PR-AUC between using all 22 features and using only the identified important features, there is minor or no decrease for the stable DNN method, i.e., DNN vs PermFIT-DNN as (0.862, 0.774, 0.956) vs (0.892, 0.759, 0.940). In contrast, for the random forest method, the same comparison of AUC and PR-AUC shows a notable decrease, i.e., RF vs PermFIT-RF as (0.775, 0.956) vs (0.710, 0.916). This observation indicates that the important features determined by the random forest method are not very reliable, and that using them alone does not classify PE status as accurately.
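The size of these changes can be made explicit with a few lines of Python. The snippet uses only the AUC and PR-AUC values quoted in the text; the dictionary layout and names are illustrative.

```python
# (AUC, PR-AUC) as quoted in the text: all 22 features vs selected features only.
reported = {
    "DNN": {"all": (0.774, 0.956), "selected": (0.759, 0.940)},
    "RF": {"all": (0.775, 0.956), "selected": (0.710, 0.916)},
}

# Change when restricting each model to its own selected features
# (negative values mean a loss in performance).
change = {
    model: tuple(round(s - a, 3) for a, s in zip(v["all"], v["selected"]))
    for model, v in reported.items()
}

for model, (d_auc, d_prauc) in change.items():
    print(f"{model}: AUC change {d_auc:+.3f}, PR-AUC change {d_prauc:+.3f}")
```

On these figures, restricting the random forest to its selected features costs roughly four times as much AUC as the same restriction costs the stable DNN. The quoted Accuracy for the stable DNN moves in the other direction, from 0.862 to 0.892.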

Table 3. PE prediction performance comparison.

https://doi.org/10.1371/journal.pone.0292185.t003

Taken together, the results in Tables 2 and 3 show that, when the stable DNN model is used, the identified important features achieve predictions almost as accurate as those obtained using all 22 features. This finding suggests that the important features identified by the stable DNN method can effectively identify PE patients. In particular, they yield a PR-AUC of 0.940, implying that the identified clinical features can accurately characterize PE patients. In clinical practice, this means that history of PE, chronic prednisone use, and deep vein thrombosis can be used to identify potential PE patients, who can then be confirmed by CTA. It is worth noting that although PermFIT-DNN outperforms the two other machine learning models considered in this paper in robustly identifying important features, its accuracy and AUC are not very high (see the confusion matrix in Table 4). This suggests that some important clinical features have not been collected. However, this does not diminish the usefulness of the PermFIT-DNN method for identifying features associated with PE status.

4 Discussion

In this paper, we used a permutation-based feature importance test procedure with various machine learning models and a parametric logistic regression model to identify the important clinical features related to PE status. Our results indicate that the PermFIT-DNN framework, which combines PermFIT with the stable DNN model, can effectively identify important clinical features. Additionally, using these important features, the prediction performance was non-inferior in terms of all metrics considered, affirming the reliability of the identified features. These results demonstrate the advantages of the permutation feature importance test procedure with the stable DNN model (i.e., the PermFIT-DNN framework) for identifying important features associated with PE risk in clinical practice. However, this study has some limitations: it is a small single-center study with a limited set of collected clinical features. Larger multi-center studies with more complete clinical features should be conducted in the future. Furthermore, the variance of the accuracy estimate can be underestimated because of the overlap of training samples in cross-validation [46]. However, this should not change the superiority of the PermFIT-DNN method over competing methods such as PermFIT-RF, PermFIT-SVM, and PermFIT-LOG, since the performance comparisons were based on the same training and testing samples. Nor should it undermine the usefulness of the PermFIT-DNN framework for identifying important features associated with PE risk in clinical practice. With larger sample sizes and more complete clinical features collected, we expect that the PermFIT-DNN framework will identify a robust set of important clinical features to further improve PE prediction accuracy.

References

  1. Heit J.A., et al. (2000). Predictors of recurrence after deep vein thrombosis and pulmonary embolism a population-based cohort study. Arch Intern Med. 160(6), 761–768. pmid:10737275
  2. Konstantinides S.V. et al. (2020). 2019 ESC Guidelines for the diagnosis and management of acute pulmonary embolism developed in collaboration with the European Respiratory Society (ERS). European Heart Journal. 41, 543–603. pmid:31504429
  3. Giri J, et al. (2019). Interventional therapies for acute pulmonary embolism: current status and principles for the development of novel evidence: a scientific statement from the american heart association. Circulation. 140(20), e774–e801. pmid:31585051
  4. Goldhaber S.Z. (1998). Pulmonary embolism. N Engl J Med. 339, 93–104. pmid:9654541
  5. Konstantinides S, et al. (2002). Heparin plus alteplase compared with heparin alone in patients with submassive pulmonary embolism. N Engl J Med. 347, 1143–1150. pmid:12374874
  6. Goldhaber S.Z. (2004). Pulmonary embolism. Lancet. 363(9417), 1295–1305. pmid:15094276
  7. Simonneau G, et al. (2019). Haemodynamic definitions and updated clinical classification of pulmonary hypertension. Eur Respir J. 53, 1801913. pmid:30545968
  8. Couturaud F, et al. (2021). Prevalence of pulmonary embolism among patients with COPD hospitalized with acutely worsening respiratory symptoms. JAMA. 325(1), 59–68. pmid:33399840
  9. Perlroth D.J. & Sanders G.D. & Gould M.K. (2007). Effectiveness and cost-effectiveness of thrombolysis in submassive pulmonary embolism. Arch Intern Med. 167(1), 74–80. pmid:17210881
  10. Chatterjee S & Chakraborty A & Weinberg I, et al. (2014). Thrombolysis for pulmonary embolism and risk of all-cause mortality, major bleeding, and intracranial hemorrhage: a meta-analysis. JAMA. 311(23), 2414–2421. pmid:24938564
  11. Couturaud F & Sanchez O & Pernod G, et al. (2015). Six months vs extended oral anticoagulation after a first episode of pulmonary embolism: the PADIS-PE randomized clinical trial. JAMA. 314(1), 31–40. pmid:26151264
  12. Raslan I.A. & Chong J & Gallix B & Lee T.C. & McDonald E.G. (2018). Rates of overtreatment and treatment-related adverse effects among patients with subsegmental pulmonary embolism. JAMA Internal Medicine. 178(9), 1272–1274. pmid:30073241
  13. Pulmonary Embolism Prevention (PEP) Trial Collaborative Group. (2000). Prevention of pulmonary embolism and deep vein thrombosis with low dose aspirin: Pulmonary Embolism Prevention (PEP) trial. Lancet. 355(9212), 1295–1302. pmid:10776741
  14. Meyer G. (2014). Effective diagnosis and treatment of pulmonary embolism: Improving patient outcomes. Arch Cardiovasc Dis. 107(6-7), 406–414. pmid:25023859
  15. Braekkan S.K. & Grosse S.D. et al. (2016). Venous thromboembolism and subsequent permanent work-related disability. J Thromb Haemost. 14(10), 1978–1987. pmid:27411161
  16. Van der Pol & Liselotte M, et al. (2019). Pregnancy-Adapted YEARS Algorithm for Diagnosis of Suspected Pulmonary Embolism. N Engl J Med. 380(12), 1139–1149. pmid:30893534
  17. Tapson V.F. (2008). Acute pulmonary embolism. N Engl J Med. 358, 1037–1052. pmid:18322285
  18. Prandoni P & Lensing A.W. et al. (2016). Prevalence of pulmonary embolism among patients hospitalized for syncope. N Engl J Med. 375, 1524–1531. pmid:27797317
  19. Righini M & Robert-Ebadi H & Legal G (2017). Diagnosis of acute pulmonary embolism. Journal of Thrombosis and Haemostasis. 15, 1251–1261. pmid:28671347
  20. Howard L (2019). Acute pulmonary embolism. Clinical Medicine. 19(3), 243–247. pmid:31092519
  21. Nordstrom M & Lindblad B & Anderson H & Bergqvist D & Kjellstrom T (1994). Deep venous thrombosis and occult malignancy: an epidemiological study. BMJ. 308, 891–894. pmid:8173368
  22. Sørensen H.T. et al. (1998). The risk of a diagnosis of cancer after primary deep venous thrombosis or pulmonary embolism. N Engl J Med. 338(17), 1169–1173. pmid:9554856
  23. Lapostolle F et al. (2001). Severe pulmonary embolism associated with air travel. N Engl J Med. 345(11), 779–783. pmid:11556296
  24. Miller E.J. & Marques M.B. & Simmons G.T. (2003). Etiology of pulmonary thromboembolism in the absence of commonly recognized risk factors. Am J Forensic Med Pathol. 24(4), 329–333. pmid:14634470
  25. Kyrle P.A. et al. (2004). The risk of recurrent venous thromboembolism in men and women. N Engl J Med. 350(25), 2558–2563. pmid:15201412
  26. Piazza G. & Goldhaber S.Z. (2006). Acute pulmonary embolism part I: epidemiology and diagnosis. Circulation. 114, e28–e32. pmid:16831989
  27. Hashimoto T. et al. (2019). Asthma Exacerbation Coincident with Saddle Pulmonary Embolism and Paradoxical Embolism. Tohoku J. Exp. Med. 248(2), 137–141. pmid:31243182
  28. Bengio Y (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning. 2(1), 1–127.
  29. LeCun Y & Bengio Y & Hinton G (2015). Deep learning. Nature. 521(7553), 436–444. pmid:26017442
  30. Gelenbe E & Mao Z.H. & Li Y.D. (1999). Pulmonary embolism in acute asthma exacerbation: clinical characteristics, prediction model and hospital outcomes. IEEE Transactions on Neural Networks. 10(1), 3–9.
  31. Hou L & Hu L et al. (2021). Construction of a risk prediction model for hospital-acquired pulmonary embolism in hospitalized patients. Clin Appl Thromb Hemost. 27, 1–9. pmid:34558325
  32. Li Y & He Y et al. (2022). Development and validation of a prediction model to estimate risk of acute pulmonary embolism in deep vein thrombosis patients. Sci Rep. 12(649), 303–314. pmid:35027609
  33. Mi X & Zou B & Zou F & Hu J (2021). Permutation-based identification of important biomarkers for complex diseases via machine learning models. Nature Communications. 12(1), 3008. pmid:34021151
  34. Mi X & Tighe P & Zou F & Zou B (2021). A deep learning semiparametric regression for adjusting complex confounding structures. Annals of Applied Statistics. 15(3), 1086–1100.
  35. Alzghoul B.N. et al. (2020). Pulmonary embolism in acute asthma exacerbation: clinical characteristics, prediction model and hospital outcomes. Lung. 198(4), 661–669. pmid:32424799
  36. Amit Y & Geman D (1997). Shape quantization and recognition with randomized trees. Neural Computation. 9(7), 1545–1588.
  37. Breiman L (2001). Random forests. Machine Learning. 45, 5–32.
  38. Cortes C & Vapnik V.N. (1995). Support-vector networks. Machine Learning. 20(3), 273–297.
  39. Drucker H & Burges C.C. & Kaufman L & Smola A.J. & Vapnik V.N. (1997). Support Vector Regression Machines. Advances in Neural Information Processing Systems. 155–161.
  40. Mi X & Zou F & Zhu R (2019). Bagging and deep learning in optimal individualized treatment rules. Biometrics. 75, 674–684. pmid:30365175
  41. Breiman L (1996). Bagging predictors. Machine Learning. 24(2), 123–140.
  42. Hansen LK & Salamon P (1990). Neural network ensembles. IEEE Transactions on Pattern Analysis & Machine Intelligence. 10(12), 993–1001.
  43. Zhou Z.H. & Wu J.X. & Tang W (2002). Ensembling neural networks: Many could be better than all. Artificial Intelligence. 137, 239–263.
  44. Altmann A et al. (2010). Permutation importance: a corrected feature importance measure. Bioinformatics. 26(10), 1340–1347. pmid:20385727
  45. Putin E et al. (2016). Deep biomarkers of human aging: application of deep neural networks to biomarker development. Aging (Albany NY). 8(5), 1021–1030. pmid:27191382
  46. Wong T.T. & Yeh P.Y. (2020). Reliable Accuracy Estimates from k-Fold Cross Validation. IEEE Transactions on Knowledge and Data Engineering. 32(8), 1586–1594.