Peer Review History
| Original Submission: April 24, 2020 |
|---|
|
PONE-D-20-11896
Assessing the low complexity of protein sequences via the low complexity triangle
PLOS ONE
Dear Dr. Mier,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Jul 20 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols
We look forward to receiving your revised manuscript.
Kind regards,
Alexandre G. de Brevern, Ph.D.
Academic Editor
PLOS ONE
Journal Requirements:
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
2. Thank you for stating the following in the Acknowledgments Section of your manuscript: 'This work benefited from the Marie Skłodowska-Curie Research and Innovation Staff Exchange project “Repeat protein Function Refinement, Annotation and Classification of Topologies” (REFRACT), which received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 823886.' We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement.
Currently, your Funding Statement reads as follows: 'This work was supported by Deutsche Forschungsgemeinschaft [AN735/4-1 to M.A.A.N.]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.' Please include your amended statements within your cover letter; we will change the online submission form on your behalf.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Partly
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: N/A
**********
3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: No
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: This is an interesting and important study that adds significantly to the field and will have a noticeable impact. The manuscript is well-written and concise. The developed tool will be a useful addition to the repertoire of modern computational biology. However, there are several points to address:
1) I was surprised by not finding any references to the work conducted by Dr. Kajava, who developed several databases of protein repeats (e.g., RepeatsDB: a database of tandem repeat protein structures; PRDB: Protein Repeat DataBase) and several computational tools for ab initio identification of the tandem repeats (e.g., T-REKS: Tandem REpeats in sequences with a K-meanS based algorithm; TAPO: A combined method for the identification of tandem repeats in protein structures; Tally: a scoring tool for boundary determination between repetitive and non-repetitive protein sequences; and Tally-2.0: upgraded validator of tandem repeat detection in protein sequences).
2) A brief discussion should be added of the fact that the presence of protein repeats is often correlated with the presence of intrinsic disorder.
3) The authors should clearly define the meaning of the “Low complexity triangle”. A succinct explanation should be provided of this type of diagram and why it is called a triangle.
Reviewer #2: The manuscript details a new method for assessing low complexity regions via the low complexity triangle. The method is based on two metrics, repeatability and the fraction of the most frequent amino acid, computed for proteins divided into sliding windows of a specific length. The proposed method is applied to five different proteomes and to proteins grouped according to different features. The manuscript also describes a tool developed for presenting the low complexity triangle based on protein repeatability.
The manuscript does appear to have some scientific merit, although it is hard to discern. It presents the idea of using the developed tool to assess which proteins (or parts of proteins) may be of low complexity, but it lacks a detailed description of the method. Some parts of the main section (Results & discussion) are also confusing. A comparison of the results obtained with the proposed method against those of other related tools is missing.
The following is a list of issues that need to be addressed before a full conclusion can be made:
1) Because the core of the assessment of low complexity regions (LCRs) is the low complexity triangle and the LCT tool, the manuscript needs a detailed description of other available tools for locating LCRs: their differences, capabilities, etc. This is currently covered by only one sentence. It would be better to add a separate section listing some tools and their characteristics. This could then be used to compare the results of the proposed LCT server with those of other tools; it is necessary to show that the new method gives results comparable to or better than those obtained with existing tools and methods.
2) Section 2 (Methods) includes very few sentences about the method used to assess protein low complexity. On the other hand, the LCT server description is placed in the Results section, which is inappropriate. The LCT server cannot be a RESULT of "assessing the low complexity....", the topic mentioned in the manuscript title. When authors describe a method, the description of the server that implements it or visualizes its results is normally placed alongside it.
3) In the current version of the manuscript, the larger part of section 2 is occupied by text about the data used in the examples. Such a section is usually titled "Material and methods" or similar. The data retrieved from UniProtKB should be characterized more precisely. For example, there are many Escherichia coli genomes. Which one was used (K12, and which strain; O127H6; CFT073; ...)? It is necessary to specify the organism identification. It is also necessary to include information about the positional annotations of the proteins in the extracted groups (domains, transmembrane regions, ...). This is important if a reader wants to repeat the experiment and possibly compare the results with those from another tool or method.
4) The method must be described in more detail. For example:
- The authors mention that the windows overlap, but do not state the step size. Do the windows simply overlap without sliding, or does a window of the selected length slide across the sequence?
- If the input file includes more than one protein, how is the calculation performed? Do the final results represent some kind of average of the values of the single proteins (related to overlapping or sliding windows), or are all sequences concatenated into one large sequence that acts as the input? If the user puts all the proteins in one file (placed consecutively, one by one, in FASTA format), how is such a file processed?
- Different window lengths produce different results. What are the differences among these results? Is there any benefit to the default length of 20? And why is the maximum length restricted to 30?
5) In the Methods section, part 2.2 (Low complexity assessment), the last sentence of the paragraph reads: "To ease the comparison between the distribution of the windows’ results in the different low complexity triangles, we draw three overlapping boxes in different colors in coordinates: (x > 8, y < 0.3), in red; (x > 7, y < 0.35), in green; and (x > 6, y <0.4), in orange." There is no explanation of why exactly those boundaries are used. Also, in the figures the orange, green and red lines are drawn at different coordinates: red (x>8.5, y>2.75), green (x>7.5, y>3.25) and orange (x>6.5, y>3.75)?!
6) About the LCT server: the authors offer the possibility of downloading an offline version, which is commendable. However, a file with a short description of the package and a list of installation prerequisites is missing (for example, installation on Linux requires biopython, ggplot, etc.). The LCT server also has undocumented restrictions. For example, if the protein length is 21 and the default window length of 20 is used, the server refuses the input with the error message "The query sequence needs to be at least 5 amino acids longer than the selected window length. Try again." This message appears even if such a protein is in a file together with many other proteins (for example, when a proteome is taken as input). All such restrictions should be documented.
7) The "Results & Discussion" section is confusing in some parts. Whether it is sound to apply the method to a complete proteome when the results are presented as average values is open to discussion, because each proteome consists of proteins of (very) different lengths and characteristics. What kind of normalization is taken into account? Do larger proteins influence the final results with a higher weight? Or do proteins with a higher degree of low complexity "mask" proteins with a lower level? Etc.
> "The numerical comparison of the previous low complexity triangles was done by studying how many windows cluster in the bottom-right corner of each plot."
This cannot be seen precisely from the figures. The LCT web server produces figures with numerical values (percentage of windows), which do not appear in the figures in the manuscript (why?). The missing numbers prevent checking the values presented in the manuscript.
From the sentence "On the other hand, 99% of the windows of E. coli are within the orange region and therefore are mostly globular." one can conclude that if a large percentage of windows fall into the orange region, then this protein (proteome, ...) is globular. But in figure 2.d (coiled coils) it can be seen visually that most of the windows also fall into the orange region, although coiled coil proteins do not tend to be globular. Also, the material is taken from UniProtKB, which contains mainly globular proteins (as mentioned in the manuscript), so the discussion that the method discovered that they are mainly globular is strange. As a proof of concept, one protein or group of proteins that is non-globular should be taken, and the results for the globular and non-globular groups then compared (orange, green, red lines, etc.). Also, and more importantly, it is necessary to take some proteins that cannot be characterized as low complexity and analyze the results and graphics. The manuscript proposes a method for assessing low complexity proteins, so it is natural to show how the method works on both low complexity proteins and the opposite group.
8) A comparison with the results obtained with other tools that locate LCRs should be included.
9) The conclusion should include concrete statements about the possible use of this method and site: who would use it, why, and what the advantages of using this method for locating LCRs are.
**********
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Vladimir N. Uversky Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. |
| Revision 1 |
|
Assessing the low complexity of protein sequences via the low complexity triangle
PONE-D-20-11896R1
Dear Dr. Mier,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
Alexandre G. de Brevern, Ph.D.
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed
**********
2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
**********
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: N/A
**********
4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: In my view, all issues pointed out by the reviewers were adequately addressed and the manuscript was revised accordingly.
Reviewer #2: The authors responded to all of the reviewers' comments. Comment on the sentence: "Originally, we left the numbers out of the figures because they are too small to be read; we thought the color code would be enough. The revised Figures 2 and 3 now contain these numbers." The suggestion is to implement the possibility of saving figures generated on the server in other formats (currently they can be saved only as .png). Although the current format can be zoomed to read the numbers clearly, it would be better (also for possible future users) to implement a save option that lets the user save the figure in another format (for example jpg, eps, ...), which is more convenient for various purposes.
**********
7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: Yes: Vladimir N. Uversky
Reviewer #2: No |
| Formally Accepted |
|
PONE-D-20-11896R1 Assessing the low complexity of protein sequences via the low complexity triangle Dear Dr. Mier: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Alexandre G. de Brevern Academic Editor PLOS ONE |
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.