Peer Review History

Original Submission: January 28, 2024
Decision Letter - Nagarajan Raju, Editor

PONE-D-24-03825
Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data
PLOS ONE

Dear Dr. Uddin,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Apr 07 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Nagarajan Raju

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf.

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information.

Additional Editor Comments (if provided):

I suggest the authors go through the reviewers' comments and address them properly in the revised manuscript.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #3: No

Reviewer #4: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: 1. The study relies on tabular datasets from the UCI Machine Learning Repository and Kaggle, which are popular repositories for machine learning datasets. However, there's a risk of selection bias as these datasets may not be representative of real-world data or may have biases inherent in their collection process.

2. The study chooses five machine learning algorithms, but the rationale for selecting these specific algorithms is not thoroughly justified. While tree-based and non-tree-based algorithms are commonly used, there should be a discussion on why these particular algorithms were chosen over others and how they complement each other in addressing the research question.

3. While the study uses common performance metrics such as accuracy, precision, recall, and F1 score, there's limited discussion on why these metrics were chosen and how they align with the research objectives. Additionally, there's no mention of other important metrics such as area under the receiver operating characteristic curve (AUC-ROC) or specificity, which are crucial for evaluating classification models.

4. The study mentions using the Scikit-learn library for implementing machine learning algorithms and IBM SPSS Statistics for conducting paired-sample t-tests. While these are widely used tools, the lack of detailed information on specific parameter settings and preprocessing techniques could hinder reproducibility. Providing a clear and detailed description of the experimental setup would enhance the study's transparency and reproducibility.

5. The study splits the data into training and test sets using an 80:20 ratio and performs five-fold cross-validation during model development. While cross-validation helps assess the model's performance, there's no external validation using independent datasets.

6. The study focuses solely on classical tree-based and non-tree-based supervised ML algorithms, neglecting other important techniques such as deep learning algorithms or unsupervised learning algorithms.

7. Measurement metrics (i.e., accuracy, recall, etc.) are well-known and have been used in previous biomedical studies such as PMID: 36642410, PMID: 28155651. Therefore, the authors are suggested to refer to more works in this description to attract a broader readership.

8. The study does not consider ensemble approaches beyond Random Forest (RF), such as AdaBoost or XGBoost, which have shown significant performance improvements in various classification tasks.

9. While the study demonstrates the superiority of tree-based ML algorithms, it fails to explore the underlying reasons behind this dominance.

10. The study suggests that tree-based algorithms consistently outperform non-tree-based algorithms across all datasets, without considering potential dataset-specific factors that may influence algorithm performance.

11. While the study briefly mentions future research opportunities, such as exploring ensemble tree-based algorithms and investigating the underlying reasons for algorithmic performance, it lacks depth in discussing specific research avenues and methodologies.

Reviewer #2: My comments are as follows:

1) The study focuses on classical algorithms and may not include recent advancements in machine learning, such as deep learning techniques that have shown promise in handling tabular data. It is highly recommended to include these for a more contemporary perspective.

2) The abstract does not highlight the novelty of the proposed work. It is better to add more specific details of your work.

3) The introduction is not focused, and the literature can be reorganised to strengthen the literature review; following the contributions, a few relevant works should be discussed, i.e.,

a) A Benchmark Dataset and Learning High-level Semantic Embeddings of Multimedia for Cross-media Retrieval

b) Unsupervised pre-trained filter learning approach for efficient convolution neural network

c) CSFL: A novel unsupervised convolution neural network approach for visual pattern classification

d) Optimization of CNN through novel training strategy for visual classification problems

e) Face recognition: A novel un-supervised convolutional neural network method

f) ModPSO-CNN: an evolutionary convolution neural network with application to visual recognition

g) Two-stage domain adaptation for infrared ship target segmentation

4) The work does not delve deeply into the impact of feature engineering and data preprocessing steps, which are crucial for the performance of machine learning algorithms. Add a detailed discussion of them.

5) While the proposed work effectively compares tree-based algorithms with non-tree-based counterparts, it might lack a deeper analysis of why certain algorithms perform better than others. A more thorough investigation into the intrinsic properties of the datasets that favour tree-based methods is needed.

Reviewer #3: The paper is not scientifically sound to be published in this form.

Reviewer #4: The study aims to investigate the statistical significance of the performance of decision tree-based algorithms over other classical machine learning algorithms. Some points need modification in a final version. The manuscript's idea is interesting, since it seems inappropriate for articles on machine learning algorithms not to conduct statistical comparisons between the accuracies obtained by these algorithms in classification tasks.

Abstract and Introduction

-"no study has shown such supremacy through a statistical significance test." and "However, none shows such supremacy by employing any statistical significance comparison, such as a t-test." It's not true; below I can indicate an example that used statistics to compare the accuracy of machine learning algorithms, and it is possible that others have proceeded similarly. I suggest the authors rewrite the sentence and indicate that it is not usual to find statistical comparisons between the classification performance of machine learning algorithms.

Farias, F. M., Salomão, R. C., Rocha Santos, E. G., Sousa Caires, A., Sampaio, G. S. A., Rosa, A. A. M., Costa, M. F., & Silva Souza, G. (2023). Sex-related difference in the retinal structure of young adults: a machine learning approach. Frontiers in medicine, 10, 1275308. https://doi.org/10.3389/fmed.2023.1275308.

Methods

-Figure 1: Use a dot instead of a comma for decimal numbers. Include the label name for the X-axis.

-It would be important to provide more information about the type of data used. Time series for subsequent feature extraction? Was feature extraction performed? If yes, how many and which ones were extracted? Were they the same for all comparisons? How many groups were used in different datasets?

-Why was the t-test chosen over an analysis of variance? I think it would be more appropriate to use an analysis of variance or Kruskal-Wallis or perform a Bonferroni correction for the t-test results.

-I suggest performing at least a 10-fold cross-validation.

-Was there data preprocessing? Any normalization? I think it would be important.

-Does it make sense to compare the performance of random forest and decision tree?

Results

- Indicate the standard deviation of the mean values in Table 1 and Table 3.

- Table 3 shows an accuracy of 1. Does it imply overfitting? Or do the groups exhibit very large differences, leading to easier classification? This could be debated in the Discussion section.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Reviewer #4: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Revision 1

Reviewer response

Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data

We sincerely thank the reviewers and editor for their insightful suggestions and comments. Here is our response to the corrections suggested by each of them. Changes are marked in red colour in the revised main manuscript file.

Suggestions from the Editor

Comment 1:

Please ensure that your manuscript meets PLOS ONE’s style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Our response: Thank you very much for this suggestion. We paid particular attention to meeting the PLOS ONE style requirements while revising this manuscript.

Comment 2:

I suggest authors go through the reviewers' comments and address them adequately in the revised manuscript.

Our response: Thank you for this suggestion. We gave our best effort to address all comments by the reviewers adequately.

Reviewer: 01

Comment 1: The study relies on tabular datasets from the UCI Machine Learning Repository and Kaggle, which are popular repositories for machine learning datasets. However, there is a risk of selection bias as these datasets may not represent real-world data or may have inherent biases in their collection process.

Our response: Thank you for pointing out this selection bias issue. We have taken steps to avoid this kind of bias as much as possible. The selected datasets considered in our study are from a wide range of contexts, as outlined in Figure 1 (page 6). In addition, we followed different statistical approaches, such as mean-based imputation [1] for handling the missing data problem and the synthetic minority oversampling technique [2] to make an unbalanced dataset a balanced one. Please see lines 140-147 on page 5 for further information.
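To make the imputation step concrete, a minimal NumPy sketch of mean-based imputation is shown below (illustrative only; the study's actual preprocessing pipeline is described in the manuscript, and oversampling via SMOTE would typically use the imbalanced-learn package):

```python
import numpy as np

def impute_mean(X):
    """Replace NaN entries with the mean of the corresponding column.

    X is a 2-D numeric array; a copy is returned, the input is untouched.
    """
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)   # per-feature means, ignoring NaNs
    rows, cols = np.where(np.isnan(X))  # locations of missing values
    X[rows, cols] = col_means[cols]     # fill each gap with its column mean
    return X

X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [np.nan, 8.0]])
# Missing values become the column means: 2.0 in column 0, 6.0 in column 1.
print(impute_mean(X))
```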

Comment 2: The study chooses five machine learning algorithms, but the rationale for selecting these specific algorithms is not thoroughly justified. While tree-based and non-tree-based algorithms are commonly used, there should be a discussion on why these particular algorithms were chosen over others and how they complement each other in addressing the research question.

Our response: We appreciate this comment. Accordingly, we have revised Section 3.2 to clarify our rationale for selecting specific tree-based and non-tree-based algorithms, highlighting their complementary strengths in addressing our research question. This revision ensures a balanced exploration of machine learning strategies, enhancing the methodological rigour of our study and the relevance of our findings. Please see lines 161-162 (page 6), 177-179 (page 6), and 194-201 (page 7) for further details.

Comment 3: While the study uses standard performance metrics such as accuracy, precision, recall, and F1 score, there is limited discussion on why these metrics were chosen and how they align with the research objectives. Additionally, there is no mention of other important metrics, such as area under the receiver operating characteristic curve (AUC-ROC) or specificity, which are crucial for evaluating classification models.

Our response: Thank you for highlighting the importance of diverse evaluation metrics in addition to the commonly used four we considered (i.e., accuracy, precision, recall and F1 score). In response, we have acknowledged this limitation in our manuscript and emphasised our intention to explore additional metrics such as AUC-ROC and specificity in future research to provide a more holistic evaluation of model performance. We added these to the limitations and future study scope of our study. Please see lines 368 – 372 on page 14.
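As an aside on the additional metrics mentioned here: specificity can be computed directly from a binary confusion matrix, as in the minimal NumPy sketch below (illustrative only, not part of the study's reported pipeline; AUC-ROC would typically be obtained from a library routine such as Scikit-learn's `roc_auc_score`):

```python
import numpy as np

def specificity(y_true, y_pred):
    """Specificity (true-negative rate) = TN / (TN + FP) for 0/1 labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tn = np.sum((y_true == 0) & (y_pred == 0))  # correctly rejected negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # negatives flagged as positive
    return tn / (tn + fp)

# Three true negatives among y_true, one of which is misclassified:
print(specificity([0, 0, 0, 1, 1], [0, 1, 0, 1, 0]))  # 2 TN, 1 FP
```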

Comment 4: The study mentions using the Scikit-learn library to implement machine learning algorithms and IBM SPSS Statistics to conduct paired-sample t-tests. While these are widely used, the lack of detailed information on specific parameter settings and preprocessing techniques hinders reproducibility. A clear and precise description of the experimental setup would enhance transparency and reproducibility.

Our response: Thank you for emphasising the importance of detailing our experimental setup. We have clarified in the manuscript that we used default Scikit-learn and IBM SPSS settings for all algorithms and statistical tests to ensure straightforward reproducibility. For the t-test, we used the SPSS default parameter setting for this test. Our approach aims for maximum transparency and ease of replication. Please check lines 238 – 243 on page 8 for further details.

Comment 5: The study splits the data into training and test sets using an 80:20 ratio and performs five-fold cross-validation during model development. While cross-validation helps assess the model performance, there is no external validation using independent datasets.

Our response: We acknowledge your feedback. In this revised edition, we have incorporated external validation. To achieve this, we applied the five ML algorithms considered in our study to the test dataset, which was not previously encountered during the training phase. The results we obtained were consistent with those from the training phase (Table 1). Additionally, we have introduced a new table (Table 4 on page 12) illustrating specific outcomes from the new t-tests, focusing solely on the accuracy measure. The other three measures (precision, recall and F1 score) showed similar superiority for the tree-based ML algorithms. For further details, please refer to lines 306-310 on page 12.
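The workflow described in this response (an 80:20 split, 5-fold cross-validation on the training portion, then scoring the untouched test portion as an external check) can be sketched with Scikit-learn as follows; the synthetic dataset and the choice of classifier here are illustrative stand-ins, not the study's actual data or full algorithm set:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification stand-in for one tabular dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 80:20 split; the 20% test portion is never seen during training.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = DecisionTreeClassifier(random_state=0)  # Scikit-learn defaults otherwise

# 5-fold cross-validation on the training portion only.
cv_scores = cross_val_score(clf, X_tr, y_tr, cv=5)

# External check: refit on the full training set, score the held-out test set.
clf.fit(X_tr, y_tr)
test_acc = clf.score(X_te, y_te)
print(cv_scores.mean(), test_acc)
```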

Comment 6: The study focuses solely on classical tree-based and non-tree-based supervised ML algorithms, neglecting other essential techniques such as deep learning algorithms or unsupervised learning algorithms.

Our response: Thank you for highlighting the exclusive focus of our study on classical tree-based and non-tree-based supervised machine learning algorithms. The decision to concentrate on these algorithms was deliberate, rooted in our research's specific scope and objectives, which aimed to investigate and compare the effectiveness of traditional ML approaches in our domain. While we acknowledge the potential of deep learning, ensemble ML algorithms and unsupervised ML algorithms in advancing the field, we intentionally leave them as a potential future scope. Please see lines 382-387 on pages 14-15 for more information.

Comment 7:

Measurement metrics (i.e., accuracy, recall, etc.) are well-known and have been used in previous biomedical studies, such as in PMID: 36642410 and PMID: 28155651. Therefore, the authors are suggested to refer to more works in this description to attract a broader readership.

Our response: Thank you for suggesting additional seminal works on measurement metrics. We included references to critical studies to underscore the relevance of our chosen metrics in biomedical research and broaden the manuscript's appeal. Please see line 209 on page 7 for details.

Comment 8: The study does not consider ensemble approaches beyond Random Forest (RF), such as AdaBoost or XGBoost, which have significantly improved performance in various classification tasks.

Our response: We appreciate this comment. The second reviewer also made a similar comment (R2C1). In the revised manuscript, we have discussed this. We also mentioned that deep learning could have a chance of showing such superior performance. We leave them as potential future research scopes in alignment with our study. Please see lines 359-360 on page 13 for details. We also noted that further studies could delve into these methods. Please see lines 382-387 for details.

Comment 9: While the study demonstrates the superiority of tree-based ML algorithms, it fails to explore the underlying reasons behind this dominance.

Our response: We have discussed the possible underlying reasons behind the superiority of tree-based ML algorithms compared to their counterparts. The superior performance of tree-based ML algorithms compared to their counterparts can be attributed to various factors. A prominent reason is their capability to effectively map non-linear relations, providing excellent prediction accuracy and greater stability, especially compared to linear models [3]. Additionally, these algorithms exhibit better categorical and numerical data accommodation than other models [4]. Described as a set of if-else statements, tree-based ML algorithms excel at incorporating non-linear and categorical data during the learning process, contributing to enhanced predictive accuracy. Please see lines 326-331 on page 13 for our detailed discussion.

Comment 10: The study suggests that tree-based algorithms consistently outperform non-tree-based algorithms across all datasets without considering potential dataset-specific factors that may influence algorithm performance.

Our response: Thank you for pointing out the importance of considering dataset-specific factors in evaluating algorithm performance. Acknowledging this, we have updated our discussion to more carefully examine how each dataset's unique attributes could influence the effectiveness of tree-based versus non-tree-based algorithms. Please refer to lines 375-379 in the revised manuscript for an expanded analysis, ensuring a more nuanced and balanced comparison.

Comment 11: While the study briefly mentions future research opportunities, such as exploring ensemble tree-based algorithms and investigating the underlying reasons for algorithmic performance, it lacks depth in discussing specific research avenues and methodologies.

Our response: We acknowledge the reviewer's feedback on the need for a more detailed exploration of future research directions within our study. Future work will explore ensemble tree-based algorithms and the reasons behind varying algorithmic performances, employing comparative analyses and feature importance studies for deeper insights. We have also suggested these as directions for further study. For further details, please see lines 382-387 on pages 14-15.

Reviewer: 02

Comment 1: The study focuses on classical algorithms and may not include recent advancements in machine learning, such as deep learning techniques that have shown promise in handling tabular data. It is highly recommended to include these for a more contemporary perspective.

Our response: Thank you for this suggestion. The first reviewer also made a similar comment (R1C8). In the revised manuscript, we have discussed this. We also mentioned that ensemble approaches could have a chance of showing such superior performance. We leave them as potential future research scopes in alignment with our study. Please see lines 359-360 on page 13 for details. We also noted that further studies could delve into these methods. Please see lines 382-387 for more information.

Comment 2: The abstract does not highlight the novelty of the proposed work. It is better to add more specific details to your work.

Our response: We have revised the abstract considering our study aims and objectives. Please see page 2 for details.

Comment 3:

The introduction is not focused, and the literature can be reorganised to strengthen the literature review following contributions and discuss a few relevant works, i.e.,

(a) A Benchmark Dataset and Learning High-level Semantic Embeddings of Multimedia for Cross-media Retrieval

(b) Unsupervised pre-trained filter learning approach for efficient convolution neural network

(c) CSFL: A novel unsupervised convolution neural network approach for visual pattern classification

(d) Optimisation of CNN through novel training strategy for visual classification problems

(e) Face recognition: A novel un-supervised convolutional neural network method

(f) ModPSO-CNN: an evolutionary convolution neural network with application to visual recognition

(g) Two-stage domain adaptation for infrared ship target segmentation

Our response: Thank you for your constructive feedback. We have focused our introduction, reorganised the literature review to highlight significant contributions in CNN and machine learning advancements, and meticulously incorporated the suggested references, enhancing the clarity and depth of our study. Please see lines 54 – 56 on page 3.

Comment 4: The work does not delve deeply into the impact of feature engineering and data preprocessing steps, which are crucial for the performance of machine learning algorithms. Add a detailed discussion on it.

Our response: Thank you for raising these issues related to feature engineering and data preprocessing. In this revised version, we briefly outlined the preprocessing steps followed for data analysis in this study. Please see lines 238-243 on page 8 for more information. To promote reproducibility, we adhered to Scikit-learn default parameters for all algorithms, ensuring a transparent and standardised experimental framework.

While our findings suggest tree-based algorithms outperform non-tree-based ones across multiple datasets, we recognise the importance of considering dataset-specific characteristics, such as feature distribution and complexity, that could influence algorithm performance. Uddin and Lu [5] discovered that ML algorithms exhibit varying performances when applied to datasets with distinct meta-level and statistical attributes.

Comment 5: While the proposed work effectively compares tree-based algorithms with non-tree-based counterparts, it might lack a deeper analysis of why certain algorithms perform better than others. A more thorough investigation into the intrinsic properties of the datasets that favour tree-based methods is needed.

Our response: We appreciate your comment, which echoed the tenth comment from the first reviewer (R1C10), highlighting the necessity of acknowledging dataset-specific factors when assessing algorithm performance. In response, we have refined our discussion to more thoroughly explore the impact of each dataset's distinct characteristics on the performance of tree-based versus non-tree-based algorithms. For a detailed expansion of this analysis, which aims to offer a more nuanced and balanced comparison, please see lines 368-372 in the revised manuscript.

Reviewer: 03

Comment 1:

The paper is not scientifically sound to be published in this form.

Our response: We believe that the comments from the other three reviewers and our corresponding responses have significantly improved the scientific merit of this study. Incorporating these changes would make the revised manuscript scientifically rigorous for publication. Please see the revised manuscript for our responses concerning the comments made by the first, second and fourth reviewers.

Reviewer: 04

Comment 1:

Abstract and Introduction

-"no study has shown such supremacy through a statistical significance test." and "However, none shows such supremacy by employing any statistical significance comparison, such as a t-test." It's not true; below I can indicate an example that used statistics to compare the accuracy of machine learning algorithms, and it is possible that others have proceeded similarly. I suggest the authors rewrite the sentence and indicate that it is not usual to find statistical comparisons between the classification performance of machine learning algorithms.

Farias, F. M., Salomão, R. C., Rocha Santos, E. G., Sousa Caires, A., Sampaio, G. S. A., Rosa, A. A. M., Costa, M. F., & Silva Souza, G. (2023). Sex-related difference in the retinal structure of young adults: a machine learning approach. Frontiers in medicine, 10, 1275308. https://doi.org/10.3389/fmed.2023.1275308.

Our response: We appreciate your pointing out this issue. We have reviewed the mentioned article, which primarily followed descriptive statistics for ML performance comparison. We also found that many studies in the current literature empirically demonstrated the superiority of tree-based ML algorithms. They primarily used one or more datasets for descriptive statistical comparisons [e.g., 18]. Yet, employing statistical significance comparisons like t-tests to demonstrate such supremacy is not widespread. Please see lines 123-125 on page 5 for more information.

Comment 2:

Methods

-Figure 1: Use a dot instead of a comma for decimal numbers. Include the label name for the X-axis.

-It would be important to provide more information about the type of data used. Time series for subsequent feature extraction? Was feature extraction performed? If yes, how many and which ones were extracted? Were they the same for all comparisons? How many groups were used in different datasets?

-Why was the t-test chosen over an analysis of variance? I think it would be more appropriate to use an analysis of variance or Kruskal-Wallis or perform a Bonferroni correction for the t-test results.

-I suggest performing at least a 10-fold cross-validation.

-Was there data preprocessing? Any normalisation? I think it would be important.

-Does it make sense to compare the performance of random forest and decision tree?

Our response: Please see below for our responses to each point.

- Figure 1: We originally used a comma to show each value both as a raw count and as its corresponding percentage. We have updated this figure in the revised submission, using brackets instead of commas, and have updated the figure caption accordingly. Please see page 5 for more details.

- Our datasets come from a wide range of contexts, with attribute counts ranging from 2 to 2,548. All datasets have a binary (two-group) target variable.

- Since all datasets have only two groups, we used the independent-samples t-test. ANOVA and the Kruskal-Wallis test are more suitable for datasets with more than two groups [6].
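As an illustrative sketch only (not the code used in the study, and with hypothetical accuracy values), an independent-samples t-test comparing two groups of accuracy scores can be run with SciPy's `ttest_ind`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-dataset accuracy scores for two algorithm families;
# the means and spreads here are made up for illustration.
tree_acc = rng.normal(loc=0.85, scale=0.05, size=30)
other_acc = rng.normal(loc=0.80, scale=0.05, size=30)

# Independent-samples t-test: are the two mean accuracies significantly different?
t_stat, p_value = stats.ttest_ind(tree_acc, other_acc)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

A small p-value (e.g., below 0.05) would indicate a statistically significant difference between the two groups of scores; with real, possibly non-normal accuracy distributions, a non-parametric alternative such as the Mann-Whitney U test could be substituted.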

- We explored the size distribution of all 200 datasets before finalising the choice of 5-fold cross-validation. Some datasets are small, and 10-fold cross-validation would produce folds too small to yield reliable results.

- The second reviewer also raised this point (R2C4). In this revised version, we briefly outlined the preprocessing steps followed for data analysis in this study. Please see lines 238-243 on page 8 for more information. To promote reproducibility, we adhered to Scikit-learn default parameters for all algorithms, ensuring a transparent and standardised experimental framework. While our findings suggest tree-based algorithms outperform non-tree-based ones across multiple datasets, we recognise the importance of considering dataset-specific characteristics, such as feature distribution and complexity, that could influence algorithm performance. Uddin and Lu [5] discovered that ML algorithms exhibit varying performances when applied to datasets with distinct meta-level and statistical attributes.
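The evaluation setup described above (Scikit-learn classifiers with default parameters, scored by 5-fold cross-validation on binary tabular data) can be sketched as follows; this is an assumed minimal reconstruction on synthetic data, not the authors' actual pipeline:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic binary-classification data standing in for one tabular dataset.
X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Stratified 5-fold CV preserves the class ratio in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Default-parameter classifiers, as in the study's standardised framework.
for clf in (RandomForestClassifier(random_state=42),
            LogisticRegression(max_iter=1000)):
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{type(clf).__name__}: mean={scores.mean():.3f}, sd={scores.std():.3f}")
```

Reporting the per-fold mean alongside the standard deviation mirrors the revised tables; repeating this over many datasets yields the score samples that feed the significance test.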

- We considered all classic supervised ML algorithms. Although random forest is an ensemble approach built on decision trees, we included both in our study, in alignment with numerous studies in the literature.

Comment 3:

Results

- Indicate the standard deviation of the mean values in Table 1 and Table 3.

- Table 3 shows an accuracy of 1. Does it imply overfitting? Or do the groups exhibit very large differences, leading to easier classification? This could be discussed in the Discussion section.

Our response: All relevant tables, including Tables 1 and 3, have been updated with standard deviation values. We have also updated Supplementary Table 1 accordingly. Please see the individual tables for details. We have discussed the presence of such high accuracy in lines 339-347 on page 14.

Reference

1. Wei, R., Wang, J., Su, M., Jia, E., Chen, S., Chen, T., and Ni, Y., Missing value imputation approach for mass spectrometry-based metabolomics data. Scientific Reports, 2018. 8(1): p. 663.

2. Ishaq, A., Sadiq, S., Umer, M., Ullah, S., Mirjalili, S., Rupapara, V., and Nappi, M., Improving the prediction of heart failure patients' survival using SMOTE and effective data mining techniques. IEEE Access, 2021. 9: p. 39707-39716.

3. Dumitrescu, E., Hué, S., Hurlin, C., and Tokpavi, S., Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects. European Journal of Operational Research, 2022. 297(3): p. 1178-1192.

4. Song, Y.-Y. and Ying, L., Decision tree methods: applications for classification and prediction. Shanghai Archives of Psychiatry, 2015. 27(2): p. 130.

5. Uddin, S. and Lu, H., Dataset meta-level and statistical features affect machine learning performance. Scientific Reports, 2024. 14(1): p. 1670.

6. Field, A., Discovering statistics using SPSS. 2013, London: Sage Publications Ltd.

Attachments
Attachment
Submitted filename: Reviewer response letter (Confirmation PONE) v02.docx
Decision Letter - Nagarajan Raju, Editor

Confirming the statistically significant superiority of tree-based machine learning algorithms over their counterparts for tabular data

PONE-D-24-03825R1

Dear Dr. Uddin,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager at http://www.editorialmanager.com/pone/ and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Nagarajan Raju

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #4: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #4: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: Yes

Reviewer #4: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: My previous comments have been addressed, therefore, the manuscript can be accepted in this current form.

Reviewer #2: All my comments are successfully answered. Please take a good look to the grammar and typos while submitting the final version of the manuscript.

Reviewer #4: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #4: No

**********

Formally Accepted
Acceptance Letter - Nagarajan Raju, Editor

PONE-D-24-03825R1

PLOS ONE

Dear Dr. Uddin,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Nagarajan Raju

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.