Peer Review History

Original Submission
August 19, 2024
Decision Letter - Salman Sadullah Usmani, Editor

PONE-D-24-31032

Machine Learning Driven Dashboard for Chronic Myeloid Leukemia Prediction using Protein Sequences

PLOS ONE

Dear Dr. Alahmadi,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 21 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Salman Sadullah Usmani, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, we expect all author-generated code to be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. We suggest you thoroughly copyedit your manuscript for language usage, spelling, and grammar. If you do not know anyone who can help you do this, you may wish to consider employing a professional scientific editing service.  

The American Journal Experts (AJE) (https://www.aje.com/) is one such service that has extensive experience helping authors meet PLOS guidelines and can provide language editing, translation, manuscript formatting, and figure formatting to ensure your manuscript meets our submission guidelines. Please note that having the manuscript copyedited by AJE or any other editing services does not guarantee selection for peer review or acceptance for publication. 

Upon resubmission, please provide the following: 

● The name of the colleague or the details of the professional service that edited your manuscript

● A copy of your manuscript showing your changes by either highlighting them or using track changes (uploaded as a *supporting information* file)

● A clean copy of the edited manuscript (uploaded as the new *manuscript* file)

4. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections does not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

5. Please provide a complete Data Availability Statement in the submission form, ensuring you include all necessary access information or a reason for why you are unable to make your data freely accessible. If your research concerns only data provided within your submission, please write "All data are in the manuscript and/or supporting information files" as your Data Availability Statement.

6. We are unable to open your Supporting Information files bibliography.bib and plos2015.bst. Please kindly revise as necessary and re-upload.

7. Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: I Don't Know

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Comment 1: The Materials & Methods section at line 140 appears to be incomplete. Additionally, please include the details of the genes in the introduction. In the Materials & Methods section, simply mention the database used for the dataset collection, the keyword search conducted on UniProt, and the number of sequences obtained from UniProt.

Comment 2: I’m unclear on the need to explain the FASTA file format, as it is a widely known format commonly used in sequencing. Do not create a separate section for this; just add it to the dataset collection section.

Comment 3: In the "Sample of Protein Sequence (HSP90)" section, the total number of sequences before and after filtering is missing. Please include this information in the Materials & Methods section.

Comment 4: Please specify the training and validation datasets, including the number of sequences for each set and each protein. If possible, consider using a table to clearly present the total number of sequences used for training and validation for the BCL2, HSP90, PARP, and RB proteins.

Comment 5: It is recommended to perform 5-fold or 10-fold cross-validation on your internal dataset (training dataset) to enhance the reliability of your results.

Comment 6: In the results section, please include the performance metrics for the training and validation models for each protein. If the dataset is too small to make predictions for individual proteins, please explain the rationale behind merging different datasets.

Comment 7: It is recommended to separate the results and discussion sections. This would allow you to include other methods that perform similar analyses in the discussion section. Additionally, if you identify other methods that create similar dashboards, it would be valuable to include a comparison.

Comment 8: The quality of the images is very poor and needs to be improved.

Comment 9: The link to the dashboard app is missing. Additionally, please include a section in the Materials & Methods that outlines the architecture for creating this app.

Please include a link to the preprint.

Reviewer #2: Your work could greatly improve the early diagnosis and treatment of CML, particularly in areas where specialized healthcare is hard to access. Here are some key points and suggestions to enhance your work:

Regarding the dashboard, highlight its definition, importance, applications, design principles, and evaluation methods.

Compare the models by highlighting their strengths and weaknesses in various scenarios.

Highlight the novelty and significance of using protein sequences for CML prediction.

Although most references are recent, some older ones (e.g., from 2004 and 2011) should be updated with more current studies to reflect the latest advancements in the field.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Anjali Dhall

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment
Submitted filename: Comments_PONE-D-24-31032.docx

Revision 1

Authors’ response (#PONE-D-24-31032)

Original Article Title: Machine Learning Driven Dashboard for Chronic Myeloid Leukemia Prediction using Protein Sequences

Dear Editors and Reviewers,

We are very grateful for the opportunity provided by the Editors to improve our manuscript (PONE-D-24-31032) and for the valuable suggestions and insightful comments from the anonymous reviewers. Following these constructive suggestions and detailed feedback, we have carefully revised the manuscript and implemented several necessary modifications. Below, we provide a detailed response to the comments and suggestions from the Editor and reviewers:

Review 1:

The Materials & Methods section at line 140 appears to be incomplete. Additionally, please include the details of the genes in the introduction. In the Materials & Methods section, simply mention the database used for the dataset collection, the keyword search conducted on UniProt, and the number of sequences obtained from UniProt.

Answer:

Thank you for your valuable feedback. We have made the following revisions in response to your comment:

1. Introduction: We have included the details of the genes associated with Chronic Myeloid Leukemia (CML) in the introduction, specifically mentioning BCL2, HSP90, PARP, and RB as relevant genes involved in the disease. This additional information provides a clearer context for the study and the protein sequences used.

2. Materials & Methods: The Materials & Methods section has been updated to address the completeness of the description. We have now explicitly mentioned that the dataset was collected from the UniProt database. We also outlined the keyword search terms used for data retrieval and specified the number of sequences obtained.

We believe these updates enhance the clarity of the manuscript and provide the necessary details regarding the dataset collection process.

Review 2:

I’m unclear on the need to explain the FASTA file format, as it is a widely known format commonly used in sequencing. Do not create a separate section for this; just add it to the dataset collection section.

Answer:

Thank you for your feedback. We have removed the separate section explaining the FASTA format. Instead, we’ve incorporated a concise mention of the FASTA format directly in the Dataset Collection section, where it is most relevant. This change simplifies the manuscript while providing the necessary context.

Review 3:

In the "Sample of Protein Sequence (HSP90)" section, the total number of sequences before and after filtering is missing. Please include this information in the Materials & Methods section.

Answer:

Thank you for pointing that out. We have added the total number of sequences both before and after filtering in the Dataset Collection section. Specifically, we now state that 2248 sequences were initially obtained from UniProt and that 2144 sequences remained in the dataset after redundancy removal with CD-HIT. This additional information helps clarify the data processing steps.
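As a quick arithmetic check on the two counts quoted in this answer, the following minimal sketch computes how many sequences the redundancy filter removed. It uses only the figures stated above; the variable names are illustrative and do not come from the manuscript.

```python
# Sequence counts quoted in the authors' answer.
initial_sequences = 2248   # obtained from UniProt
retained_sequences = 2144  # remaining after CD-HIT redundancy removal

# Number and share of sequences dropped by the redundancy filter.
removed = initial_sequences - retained_sequences
removed_fraction = removed / initial_sequences

print(f"removed: {removed} sequences ({removed_fraction:.1%})")
# prints "removed: 104 sequences (4.6%)"
```

In other words, the filter discarded 104 sequences, under 5% of the initial set.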

Review 4:

Please specify the training and validation datasets, including the number of sequences for each set and each protein. If possible, consider using a table to clearly present the total number of sequences used for training and validation for the BCL2, HSP90, PARP, and RB proteins.

Answer:

Thank you for your suggestion. We have now specified the number of sequences used for training and validation for each protein (BCL2, HSP90, PARP, and RB) in the Dataset Collection section, which enhances clarity and supports the reproducibility of the dataset.

Review 5:

It is recommended to perform 5-fold or 10-fold cross-validation on your internal dataset (training dataset) to enhance the reliability of your results.

Answer:

Thank you for the suggestion. We appreciate the recommendation to perform 5-fold or 10-fold cross-validation to enhance the reliability of the results. We have already implemented 5-fold cross-validation on the internal training dataset. This step was included to ensure robust model evaluation and minimize potential overfitting, further validating the effectiveness of the model.
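For readers unfamiliar with the procedure, 5-fold cross-validation can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the response does not say which library, shuffling, or stratification was used, and the sample count of 100 is a placeholder because the size of the internal training set is not given here. The function name `k_fold_indices` is hypothetical.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible folds
    # Spread samples as evenly as possible: the first (n_samples % k)
    # folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

# Placeholder dataset size (assumption; not taken from the manuscript).
folds = list(k_fold_indices(n_samples=100, k=5))
for i, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {i}: train={len(train_idx)}, validation={len(val_idx)}")
```

Each sample appears in exactly one validation fold, so every sequence contributes once to the validation estimate and four times to training.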

Review 6:

In the results section, please include the performance metrics for the training and validation models for each protein. If the dataset is too small to make predictions for individual proteins, please explain the rationale behind merging different datasets.

Answer:

Thank you for your insightful comment. Regarding the merging of datasets, we formulated the dataset based on the most frequently mutated genes responsible for Chronic Myelogenous Leukemia (CML). This approach allowed us to create a more comprehensive and robust dataset, which is crucial for improving model performance. By merging different protein datasets, we were able to leverage a larger pool of data, enhancing the generalization of the model and improving the reliability of the predictions. The performance metrics cover accuracy, precision, recall, F1-score, and AUC, offering a comprehensive evaluation of the models.
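The five metrics named in this answer can be computed as in the sketch below. It assumes a binary labelling (1 = positive, 0 = negative), which may not match the manuscript's actual task setup, and it uses made-up toy labels and scores rather than any results from the study. The function names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data for illustration only (not results from the manuscript).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 assumed

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.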

Review 7:

It is recommended to separate the results and discussion sections. This would allow you to include other methods that perform similar analyses in the discussion section. Additionally, if you identify other methods that create similar dashboards, it would be valuable to include a comparison.

Answer:

Thank you for your suggestion. While we understand the benefits of separating the Results and Discussion sections, we have chosen to keep them integrated to maintain a cohesive narrative, as this aligns well with the structure and focus of our study. The methods used in this study are novel and specifically tailored to address the challenges of CML prediction. As such, there are no direct alternatives to compare in this context.

Review 8:

The quality of the images is very poor and needs to be improved.

Answer:

Thank you for your feedback. We apologize for the poor quality of the images in the original submission. The issue likely arose during the conversion of the original images into TIFF format, which may have affected their quality. In the revised manuscript, we have updated the images to higher-resolution versions to ensure improved clarity and readability. We appreciate your understanding and will ensure that all images meet the required quality standards.

Review 9:

The link to the dashboard app is missing. Additionally, please include a section in the Materials & Methods that outlines the architecture for creating this app.

Answer:

Thank you for your suggestion. We will include the link to the dashboard app in the revised manuscript. Additionally, we will add a section in the Materials & Methods outlining the architecture used for creating the app, providing a clearer understanding of its design and implementation.

https://cmlapp-k9xhmtb7tthequv47farry.streamlit.app/

Attachment
Submitted filename: Rebuttal Letter (PONE-D-24-31032) ).docx

Decision Letter - Salman Sadullah Usmani, Editor

Machine Learning Driven Dashboard for Chronic Myeloid Leukemia Prediction using Protein Sequences

PONE-D-24-31032R1

Dear Dr. Alahmadi,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information’ link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Salman Sadullah Usmani, Ph.D.

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Anjali Dhall

Reviewer #2: No

**********

Formally Accepted
Acceptance Letter - Salman Sadullah Usmani, Editor

PONE-D-24-31032R1

PLOS ONE

Dear Dr. Alahmadi,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Salman Sadullah Usmani

Academic Editor

PLOS ONE

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.