Peer Review History
| Original Submission January 7, 2025 |
|---|
|
PONE-D-25-00468 Is There a Competitive Advantage to Using Multivariate Statistical or Machine Learning Methods Over the Bross Formula in the hdPS Framework for Bias and Variance Estimation? PLOS ONE Dear Dr. Karim, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by May 08 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Hossein Ali Adineh, Ph.D Academic Editor PLOS ONE Journal requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse. 3. 
Thank you for stating the following financial disclosure: [This work was supported by MEK’s Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (PG#: 20R01603) and Discovery Launch Supplement (PG#: 20R12709).]. Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed. Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf. 4. Thank you for stating the following in the Competing Interests section: [MEK is currently supported by grants from Canadian Institutes of Health Research and MS Canada. MEK has previously received consulting fees from Biogen Inc. for consulting unrelated to this current work. MEK was also previously supported by the Michael Smith Foundation for Health Research Scholar award.]. Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials." (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared. Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf. 5. Thank you for uploading your study's underlying data set. Unfortunately, the repository you have noted in your Data Availability statement does not qualify as an acceptable data repository according to PLOS's standards. 
At this time, please upload the minimal data set necessary to replicate your study's findings to a stable, public repository (such as figshare or Dryad) and provide us with the relevant URLs, DOIs, or accession numbers that may be used to access these data. For a list of recommended repositories and additional information on PLOS standards for data deposition, please see https://journals.plos.org/plosone/s/recommended-repositories. 6. Your ethics statement should only appear in the Methods section of your manuscript. If your ethics statement is written in any section besides the Methods, please move it to the Methods section and delete it from any other section. Please ensure that your ethics statement is included in your manuscript, as the ethics statement entered into the online submission form will not be published alongside your manuscript. Additional Editor Comments: Reviewer 1 The manuscript titled "Is There a Competitive Advantage to Using Multivariate Statistical or Machine Learning Methods Over the Bross Formula in the hdPS Framework for Bias and Variance Estimation?" offers a comprehensive evaluation of various proxy selection methods within the high-dimensional propensity score (hdPS) framework. It compares traditional statistical methods with machine learning approaches, and it effectively assesses their performance across various epidemiological scenarios. The use of a plasmode simulation with NHANES data adds significant value to the study's methodology and conclusions. However, I believe that the manuscript requires major revisions to improve its quality and robustness. The following points outline the areas that need attention: Introduction and Literature Review: The introduction could be enhanced by providing more context on the critical role of hdPS in modern epidemiology. 
Specifically, expanding on how hdPS addresses the limitations of traditional propensity score methods and directly impacts causal inference in healthcare data would strengthen the manuscript. The literature review could also benefit from a clearer connection to prior research, particularly focusing on how previous studies have applied machine learning to hdPS and how this study brings new insights to the field. Methods Section: While the methodology is generally well-detailed, there is a need for further clarification regarding the choice of machine learning models (e.g., Genetic Algorithm, XGBoost) and statistical methods. A better explanation of why these particular models were selected and their relevance to hdPS would make the study's approach more transparent. The "kitchen sink model" is also mentioned briefly but needs further elaboration on its role in the analysis. Furthermore, providing a clearer explanation of the traditional variable selection methods, such as "forward selection" and "backward elimination," would benefit readers who may not be familiar with these techniques. Results and Discussion: The simulation results are well-presented, but additional discussion on the interpretation of bias and MSE would help contextualize the trade-offs involved in using machine learning methods like XGBoost. While XGBoost shows lower MSE, the increase in bias should be explored in more detail to help readers understand the model's limitations. Similarly, the poor performance of the Genetic Algorithm should be explained with greater depth, especially regarding its struggles with high-dimensional data compared to other methods. Tables and Figures: The readability of Table 2 and Table 3 could be improved by adding brief summaries or interpretations of the key results in their captions. This would help readers quickly grasp the significance of the findings and how they support the broader argument. 
Conclusions: The conclusions are generally strong, but they could be expanded to provide more practical guidance for future research and real-world applications. Specifically, the authors could discuss which methods might be more appropriate for different types of epidemiological studies, such as those focused on rare diseases versus common conditions. These revisions are essential to enhancing the manuscript’s clarity, depth, and impact. A more robust discussion of the trade-offs between methods, along with clearer explanations of model choices, will help solidify the paper’s contribution to the field. Recommendation: Given the importance of the topic, I recommend that the authors address the major revisions outlined above. Once these revisions are made, I believe the manuscript could make a valuable contribution to the literature on hdPS and its applications in health data science. Thank you for considering my review. Reviewer 2 The manuscript presents a highly relevant comparison of hdPS variable selection methods, including both traditional statistical and machine learning approaches. The findings are useful for epidemiologists and health data scientists. Key suggestions for revision: - Clarify the rationale behind selecting 100 proxy variables in hdPS methods. - Expand the discussion on bias correction for coverage probability. - Address concerns regarding MSE interpretation when variance estimation is unstable. - Discuss the practical implications of computational efficiency for real-world applications. - Improve clarity and conciseness in the Results and Discussion sections. Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. 
Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The manuscript presents a highly relevant comparison of hdPS variable selection methods, including both traditional statistical and machine learning approaches. 
The findings are useful for epidemiologists and health data scientists. Key suggestions for revision: - Clarify the rationale behind selecting 100 proxy variables in hdPS methods. - Expand the discussion on bias correction for coverage probability. - Address concerns regarding MSE interpretation when variance estimation is unstable. - Discuss the practical implications of computational efficiency for real-world applications. - Improve clarity and conciseness in the Results and Discussion sections. Reviewer #2: The manuscript titled "Is There a Competitive Advantage to Using Multivariate Statistical or Machine Learning Methods Over the Bross Formula in the hdPS Framework for Bias and Variance Estimation?" provides a comprehensive evaluation of various proxy selection methods within the high-dimensional propensity score (hdPS) framework. It compares traditional statistical methods with machine learning approaches, and it effectively assesses their performance in terms of bias, mean squared error (MSE), coverage, and standard error (SE) across different epidemiological scenarios. The study’s use of a plasmode simulation based on NHANES data provides a valuable contribution to the understanding of these methods' efficacy. While the overall analysis is robust, there are several areas where the manuscript could be improved to increase its clarity, depth, and contribution to the literature. Please refer to the comments document as attached for more specifics. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. 
Reviewer #1: No Reviewer #2: No ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
|
| Revision 1 |
|
Is There a Competitive Advantage to Using Multivariate Statistical or Machine Learning Methods Over the Bross Formula in the hdPS Framework for Bias and Variance Estimation? PONE-D-25-00468R1 Dear Dr. Karim, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Hossein Ali Adineh, Ph.D Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #2: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #2: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #2: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #2: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #2: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #2: The authors have thoughtfully and thoroughly addressed all the major concerns raised during the previous review. Clear explanations were added for the kitchen sink model, the challenges observed with the Genetic Algorithm (GA), and the role of forward and backward selection methods. The Introduction and Literature Review sections have been strengthened, offering a clearer connection between traditional limitations in propensity score methods, the emergence of high-dimensional propensity score (hdPS) techniques, and the growing role of machine learning approaches in this area. The Methods section has been clarified, with a well-justified rationale for the choice of machine learning models. In the Results section, the authors provide a more nuanced interpretation of the trade-offs between bias and mean squared error (MSE), and offer a thoughtful explanation for the poorer performance of the GA. Improvements to Table 2 and Table 3, particularly the addition of interpretative summaries in the captions, have enhanced the accessibility and meaning of the results. The Conclusion and Future Directions sections now provide valuable practical guidance for researchers considering the application of these methods to real-world epidemiological studies, as well as thoughtful suggestions for how machine learning models could be refined for better bias reduction. Overall, the revisions have substantially improved the clarity, depth, and relevance of the manuscript. 
I commend the authors for their careful and considered revisions, and I have no further concerns. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #2: No **********
|
| Formally Accepted |
|
PONE-D-25-00468R1 PLOS ONE Dear Dr. Karim, I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team. At this stage, our production department will prepare your paper for publication. This includes ensuring the following: * All references, tables, and figures are properly cited * All relevant supporting information is included in the manuscript submission, * There are no issues that prevent the paper from being properly typeset You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps. Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. If we can help with anything else, please email us at customercare@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Hossein Ali Adineh Academic Editor PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.