Peer Review History
Original Submission (October 14, 2021)
PONE-D-21-32424

On taming the effect of transcript level intra-condition count variation during differential expression analysis: a story of dogs, foxes and wolves

PLOS ONE

Dear Dr. Archer,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

You will see that the reviewers have raised some concerns regarding the methodology, specifically the choice of data for evaluation and the possibility of batch effects in these data, which will need to be addressed. In addition, several aspects of the methodology require further clarification. Please note that PLOS publication criteria only require a study to be rigorous, robust and described in sufficient detail for replication. Therefore, while I agree with reviewer 1 that an R package may improve user uptake of your tool, this is not required for acceptance of your manuscript, since both reviewers have confirmed you have made usable code available.

Please submit your revised manuscript by Jul 04 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Katherine James, Ph.D.
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf.

2. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.
Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access. We will update your Data Availability statement to reflect the information you provide in your cover letter.

3. Please upload a new copy of Figures S1 and S2 as the detail is not clear. Please follow the link for more information: https://blogs.plos.org/plos/2019/06/looking-good-tips-for-creating-your-plos-figures-graphics/

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly
Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Authors outline that filtering expression data based on intra-group variation is recommended for maximising the number of identified DE genes. However, the goal of DE analysis is not to maximise the number of DEGs but to identify the truly correct DEGs: those that are likely to be replicated if the experiment were conducted again, or confirmed with another technique.
In addition, genes that have high variability may actually be true DE genes, and there is no valid reason to discard them. To build a better justification for such filtering, a more comprehensive analysis is required to show that accuracy in DE classification is improved. Analysis of RNA-seq datasets with large numbers of replicates would be useful (eg: PMID: 27022035).

I downloaded the package and tried the example. It seemed to work fine using the directions in the manual.

Are the genes which are hypervariable in expression markers of different brain regions? I ask because dissection and sampling can be a major source of variation.

P9: Regarding the way the variance is calculated: is it calculated for each sample group separately and then the average of the two groups used, or is this done in a different way?

Typically, in order to avoid violating FDR correction assumptions, it is not permissible to filter any genes after the sample labels have been revealed, as this equates to cherry picking, a form of p-hacking. In microarray analysis it is customary to discard probes with low overall variance, but this is acceptable as the procedure does not peek at the sample labels before filtering (eg: PMID: 19133141). Some analysts filter lowly expressed RNA-seq data using a threshold of 1 TPM or an average of 10 reads per sample, which is also fine.

DESeq2 and other differential expression tools are written in R, so it makes sense that this tool would also be written in R. Exporting the R data objects as TSV, running tvscript and then reading the data back into R is clumsy and may lead to poor uptake of this tool. I’d recommend a Bioconductor package, which has the added benefit of being able to generate charts so that the user can better understand the intra-condition variability, like how edgeR generates a BCV chart (https://rdrr.io/bioc/edgeR/man/plotBCV.html).
Another informative diagnostic chart could be PCA plots of (1) all transcripts, (2) hypervariable discarded transcripts, and (3) retained transcripts.

Bowtie is not recommended for transcriptome mapping. Because reads that map equally well to multiple transcripts are discarded in such approaches, it is preferable to use Kallisto or Salmon, which deal accurately with multi-mapped reads. This may explain the low mapping rate of the wolf, dog and fox reads. Does this approach work equally well for gene-based analysis using counts generated with STAR or featureCounts?

This does not sound right: “The number of transcripts removed was higher for the fox samples than for those of wolf and dog, reflecting the higher intra-condition variability present.” If percentiles of transcripts are being discarded, shouldn’t the proportion of detected transcripts discarded be the same for both studies? This is not explained clearly.

The figures should be explained in sequence. Eg: Figure 4C and the mini-table in Fig 4 should be explained in the text before Fig 5a.

Reviewer #2: Lobo et al present an evaluation of their software TVscript, which evaluates intra-condition variability in the counts that have been mapped to a transcriptome in an RNA-Seq experiment and removes the transcripts associated with the highest level of this variability, up to a user-specified percentile threshold. They test the software by applying it to two pairs of datasets from wild and tame animals: wolves vs dogs, and aggressive vs tame foxes. The greatest fraction of differentially expressed transcripts (DETs) is obtained by removing 3 to 5% of transcripts, and the authors describe some interesting features of the gene families of the corresponding differentially expressed genes, including common changes upon taming.
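The filtering step described above, removing the transcripts with the highest intra-condition count variability up to a user-specified percentile, can be sketched roughly as follows. This is an illustrative reimplementation only: the variability score used (mean per-condition coefficient of variation) and the function interface are assumptions for the sketch, not TVscript's actual metric or API.

```python
from statistics import mean, pstdev

def variability_filter(counts, groups, percentile=5.0):
    """Keep transcripts outside the top `percentile` of intra-condition variability.

    counts: dict mapping transcript id -> list of counts, one per sample
    groups: list of condition labels, one per sample (same order as each row)
    The score used here, the mean per-condition coefficient of variation,
    is an assumption for illustration; TVscript may use a different metric.
    """
    conditions = sorted(set(groups))
    score = {}
    for tx, row in counts.items():
        cvs = []
        for c in conditions:
            vals = [v for v, g in zip(row, groups) if g == c]
            m = mean(vals)
            cvs.append(pstdev(vals) / m if m > 0 else 0.0)
        score[tx] = mean(cvs)
    ranked = sorted(counts, key=score.get)          # least to most variable
    n_drop = int(len(ranked) * percentile / 100.0)  # top slice is discarded
    return set(ranked[:len(ranked) - n_drop])

# Toy data: t3 is far more variable within each condition than the others.
counts = {"t1": [10, 10, 11, 10], "t2": [9, 10, 10, 9],
          "t3": [100, 1, 50, 200], "t4": [5, 5, 6, 5]}
kept = variability_filter(counts, ["A", "A", "B", "B"], percentile=25.0)
# kept is {"t1", "t2", "t4"}; the hypervariable t3 is removed.
```

Note that this sketch computes variability within each condition separately, after the sample labels are known, which is precisely why the reviewer's point about post-hoc filtering and FDR assumptions applies to it.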
The approach to RNA-Seq analysis is a potentially interesting one, representing another approach similar in some ways to the “orthogonal filtering” of low-expressed transcripts that is commonly used to increase power in the analysis of RNA-Seq experiments. Unfortunately, there are a number of aspects of the methodology that make it hard to recommend publication in the present form.

1. Most importantly, I don’t think the data sets analysed are appropriate for the main intention of the paper. It is hard to tell whether the alterations made to the transcriptome improve the results rather than inducing false positives. The data sets used for testing come from multiple batches and two organisms, one of which is appreciably divergent from the transcriptome to which it is aligned. In short, there are too many other uncontrolled factors in the analysis to tell whether the results are reliable. In testing TVscript, it would be better to use an approach like that taken in Rapaport et al 2013 (https://doi.org/10.1186/gb-2013-14-9-r95), which uses data sets where batch effects are better controlled, including one (GEO GSE49712) where External RNA Controls Consortium (ERCC) spike-ins were used to produce known true-positive DEGs.

2. More detail is necessary about how the differential expression was performed; figure 2c seems to show that the dogs do separate by batch (1-5; 6; 7; 8-9), and one would normally use a design formula that took account of the different sources of the data, something like ~ batch + tameness (although the brain tissue and instrument used are also similar for dogs 6-9, so one could also try ~ tissue + tameness). The authors should state whether they used a formula like this, and justify why not if they did not. (Ideally, the R script used for differential expression analysis could be made available.)
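To make the design formula suggested above concrete: ~ batch + tameness corresponds to a design matrix with an intercept, dummy indicators for all but one (reference) batch, and a tameness indicator, whose coefficient is the effect of interest. The sketch below builds such a matrix in plain Python; the sample metadata is invented for illustration and is not the study's actual metadata.

```python
def design_matrix(meta):
    """Dummy-coded design matrix for the formula ~ batch + tameness.

    meta: list of dicts with keys "batch" (str) and "tame" (0 or 1).
    Columns: intercept, one indicator per non-reference batch, tameness.
    """
    batches = sorted({m["batch"] for m in meta})  # first level is the reference
    rows = []
    for m in meta:
        row = [1.0]                                                    # intercept
        row += [1.0 if m["batch"] == b else 0.0 for b in batches[1:]]  # batch dummies
        row.append(float(m["tame"]))                                   # effect of interest
        rows.append(row)
    return rows

# Hypothetical metadata: wolves (tame=0) and dogs (tame=1) over three batches.
meta = [{"batch": "A", "tame": 0}, {"batch": "A", "tame": 1},
        {"batch": "B", "tame": 0}, {"batch": "B", "tame": 1},
        {"batch": "C", "tame": 1}]
X = design_matrix(meta)
# 5 samples x 4 columns: intercept, batch B, batch C, tameness
```

In DESeq2 this corresponds to supplying design = ~ batch + tameness: the tameness coefficient is then tested while batch-to-batch shifts are absorbed by the batch columns.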
I would also remark that, since the authors emphasise that the data come from different sources, it was not immediately clear, until I looked at supplementary table 1, that the wolves and dogs 1-5 all come from one study, and similarly all the foxes, aggressive and tame, come from one study. This should be brought out more in the text, as otherwise the reader is left to wonder how any difference between wild and tame will be detectable that is not confounded with batch effects.

3. It is unclear to me why the authors used the C. familiaris transcriptome for their work on the fox as well as the dog/wolf, when the fox and wolf lineages diverged 10 Myr ago. A genome and transcriptome are available for the fox (https://www.ensembl.org/Vulpes_vulpes/Info/Index?db=core), and even though it is of lower quality than the dog, a higher mapping rate might have been expected. I appreciate that it makes the assessment of TVscript, and to some extent the comparison of dog and fox DET results, more straightforward (though information on orthology is also available). The low mapping rate and slightly strange clustering of the points in the PCA plot fig 2d are indications that there may be some problems with the fox data that might in part come from the choice of transcriptome, and this casts some doubt on the DET results for me.

My recommendation would be to split the work into two papers: one comparing the wild and tame animals, which to me was the most interesting part of the manuscript, and one assessing TVscript. It appears to me from comparing Tables 1 and 2 in the paper (filtered transcriptome) with supplementary table 6 (unfiltered transcriptome) that the filtering did not make a very big difference here. Hence the first paper could use the results from the more standard methods of supplementary table 6, and the interesting overlaps between the changes on domestication in the two pairs of animals would still be largely maintained.
The second paper would really need to use different test data sets, as suggested above, to establish whether TVscript is genuinely increasing sensitivity without introducing type I errors.

I would like to thank the authors for preparing the manuscript carefully, providing detailed results and supplementary material, and providing access to the code of TVScript along with links to other useful material on SourceForge.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
Revision 1
PONE-D-21-32424R1

On taming the effect of transcript level intra-condition count variation during differential expression analysis: a story of dogs, foxes and wolves

PLOS ONE

Dear Dr. Archer,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Both reviewers are overall very happy with your revised version. However, they both have some minor comments that require final clarification. I don't anticipate these points will take too much time to address and look forward to reading the final version.

Please submit your revised manuscript by Sep 24 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Katherine James, Ph.D.
Academic Editor
PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: (No Response)

**********

2.
Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics.
(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I commend the authors for the comprehensive amendments and explanations. I think the article is in great shape. The provision of scripts and data on Zenodo is appreciated. Please consider the following points as optional suggestions.

1. This new simulation is a welcome addition that supports the perceived need for a tool like TVscript. In figure 2, I would recommend putting the legend for the light and dark grey boxes on the plot itself.
2. OK
3. OK
4. OK
5. OK, but the passage on line 723 should be written in a clearer, more straightforward way.
6. OK.
7. OK, but "high r2 correlation values" should be qualified with a specific range (eg: 85-98%), so that the reader can understand what "high" means.
8. OK
9. OK.
10. OK

Typo: "we also had an interested in understanding whether"

Reviewer #2: I am grateful to Lobo et al for their efforts in addressing my criticisms of the first version of their manuscript.

1. Appropriateness of the data sets used to assess TVscript: I think that by using extensive simulated data the assessment of the behaviour of TVscript is much improved.

2. More detail has been given on how the DE analysis was performed, as requested. I am not totally convinced by the lengthy discussion of unknown effects (though there is no reason to remove it); clearly there are always unknown factors, but that does not affect the apparent effect of batch in the PCA plots. But in any case, the important test, including batch in the statistical model for DE in the dogs, has been done, and the results (fig S6) seem to confirm that it does not have a very large effect.

3. I thank the authors for checking the mapping rate of the fox data to the fox transcriptome (and, incidentally, for mapping with kallisto as well). Even though it turned out not to affect the mapping rate very much, I feel this was an important check to perform.

4.
I appreciate that factors outside the authors’ control can affect the way that a project ends up being carried out and written up. Although my suggestion was to divide the manuscript, I do not insist on it; I am content with the manuscript’s current form. The criticisms that I made in my first review, at least, are allayed.

However, the first reviewer raised other serious points, particularly point 5 about filtering where the sample group information is used (see Bourgon, Gentleman, and Huber 2010). I did not spot this in my own review, and it is for the first reviewer to assess whether the additional simulations have addressed this satisfactorily. I did notice that in the discussion of the matter in the DESeq2 vignette (http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#independent-filtering-and-multiple-testing) a histogram of the p-values of the filtered genes is provided, showing that it is approximately uniform. I would tentatively suggest (again, I defer to the first reviewer here) that, done for transcripts rather than genes, a p-value histogram could provide an empirical way of demonstrating that the filtering is independent of the test statistic under the null hypothesis, if required.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments".
If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
Revision 2
On taming the effect of transcript level intra-condition count variation during differential expression analysis: a story of dogs, foxes and wolves

PONE-D-21-32424R2

Dear Dr. Archer,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.

To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Katherine James, Ph.D.
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:
Formally Accepted
PONE-D-21-32424R2

On taming the effect of transcript level intra-condition count variation during differential expression analysis: a story of dogs, foxes and wolves

Dear Dr. Archer:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of Dr. Katherine James
Academic Editor
PLOS ONE
Open letter on the publication of peer review reports
PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.
We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.
Learn more at ASAPbio.