Peer Review History

Original Submission
December 19, 2019
Decision Letter - Michael P. Epstein, Editor, Hua Tang, Editor

* Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. *

Dear Dr Wallace,

Thank you very much for submitting your Research Article entitled 'Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some aspects of the manuscript that should be improved.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Michael P. Epstein

Associate Editor

PLOS Genetics

Hua Tang

Section Editor: Natural Variation

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Colocalization is an increasingly important aspect of genetic fine mapping efforts (>60 papers in 2018) but, unusually in statistical genetics, the most popular software (“coloc”) implements a Bayesian analysis with subjective priors. This paper demonstrates the potential sensitivity of coloc to the prior probability of colocalization, examines a huge amount of data to elicit suggestions for setting reasonable values, and provides software for performing sensitivity analysis. In addition, the assumption of one causal variant per region per trait is examined and a new approach, called masking, is suggested for situations in which current methods cannot be applied. Overall this paper gives useful guidance to users of coloc and provides insights into the method that should be of value.

Minor comments

1. P4 L52 “ubiquity of genetic effects … concordant with an omnigenic model” – suggests that such ubiquity has been established when it remains a conjecture. Some rewording needed.

2. The Introduction starts off by introducing MR and appears to motivate colocalization primarily as a way to validate instruments in MR studies. But, as seen elsewhere in the paper, most of the applications are in delineating molecular pathways to disease. I’d suggest reworking the opening paragraph to better reflect the broader motivations for colocalization.

3. P7 L101 full text could be accessed for only 25 of 60 papers. Was this due to limitations of institutional subscriptions? Could not the corresponding authors provide manuscripts for research purposes?

4. P7 L104 it would be interesting to know also how many papers used eCaviar or some other method to deal with multiple causal variants. Also, how often did the original discovery studies perform conditional analyses and rule out additional causal variants, so that the single causal variant assumption can be justified to some extent when going to colocalization?

5. P7 L107 “prior probability … will depend” – should say “may depend” since at this point we haven’t established this, and anyway since priors are subjective the user is free to believe that there is no dependence on the traits (but may then draw the wrong conclusion).

6. P8 L124 “more likely” should be “relatively more likely”, otherwise this sentence is confusing. Initially I found this sentence counter-intuitive, since it seems that by looking at fewer SNPs we are more likely to find colocalization, but the point is that the prior probability of colocalization is higher relative to distinct variants when fewer SNPs are considered. However, the lower number of SNPs would provide less evidence for colocalization, so this is a false economy. In any case, some interpretation should be added to this and the previous paragraph, as it is unclear what one should conclude from the observations.

7. P8 L132 note that all the estimates of p’s and q’s are based on statistically significant SNPs, and the number of truly associated variants must be larger. So the elicited priors must be lower bounds. What implications does this have for the final inferences?

8. P9 L145 not clear how to get a posterior probability of association from just a prior and a p-value.

9. P10 L163, 165 the Appendix was not available to review.

10. P12 L208 “unlinked” -> “not in linkage disequilibrium”. There is a difference between linkage and LD.

11. P12 The masking method still needs an LD matrix, so the only real advantage over CoJo is that there is no need to align the alleles.

12. P12 The masking method looks a lot like “clumping” as often used, for example, in constructing polygenic risk scores. Please clarify the difference, or use the same term to prevent jargon creep.

13. P13 Figure 6 caption, “setting to 1 the Bayes factor” – the main text suggests setting the log Bayes factor to -3. Log in what base?

14. P14 L244 is it feasible to make the sensitivity analysis a default action in coloc, with the results being returned in the same object as the posteriors?

15. P16 the Discussion would benefit from a summary take-home message, such as that the default values of p1 and p2 are OK but p12 needs more thought (and a summary of how to do this would also help).
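The point in minor comment 6, that colocalization becomes relatively more likely a priori when fewer SNPs are considered, can be checked with a short calculation. The sketch below is illustrative only: it assumes the usual coloc mapping from per-SNP priors to per-hypothesis priors for a region of n SNPs (H1: n·p1, H2: n·p2, H3: n(n-1)·p1·p2, H4: n·p12), and the default values p1 = p2 = 1e-4 and p12 = 1e-5 are assumptions made for the example. The function names are hypothetical and are not part of the coloc package.

```python
def hypothesis_priors(n_snps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Per-hypothesis prior probabilities for a region of n_snps SNPs.

    Assumes at most one causal variant per trait in the region.
    """
    h1 = n_snps * p1                      # trait 1 only
    h2 = n_snps * p2                      # trait 2 only
    h3 = n_snps * (n_snps - 1) * p1 * p2  # both traits, distinct variants
    h4 = n_snps * p12                     # both traits, shared variant
    h0 = 1.0 - (h1 + h2 + h3 + h4)        # neither trait
    if h0 < 0:
        raise ValueError("per-SNP priors are too large for this many SNPs")
    return {"H0": h0, "H1": h1, "H2": h2, "H3": h3, "H4": h4}


def h4_to_h3_prior_ratio(n_snps, p1=1e-4, p2=1e-4, p12=1e-5):
    """Prior odds of a shared variant (H4) versus distinct variants (H3)."""
    return p12 / (p1 * p2 * (n_snps - 1))


# The prior odds H4:H3 shrink as the region grows, for fixed per-SNP priors.
for n in (101, 1001, 5001):
    print(n, h4_to_h3_prior_ratio(n))
```

Because the H4:H3 prior odds scale as 1/(n-1), restricting the region to fewer SNPs raises them without adding any evidence, which is the "false economy" described in the comment.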

Typos etc

1. P3 L32 “underly” -> “underlie”

2. P4 L59 “For example…” – the sentence has no active verb.

3. P7 L117 delete “a”; change final “,” to “.”

4. P8 L140 the double “-” is confusing, suggest just saying “to” or writing as an interval.

5. P9 L150 “One” -> “On”

6. P11 L178 in the equation below, can delete the intersection with A2 in the third expression.

7. P11 L180 spelling of “asymmetric”

8. P12 Figure 5 caption line 2, “belief” -> “beliefs”. What does the dotted line marked “results” mean?

9. P12 L212 “is” -> “are”

10. P14 L234 “are” -> “is”

11. P16 L288 “interpretable” -> “interpretation”

12. References are a bit sloppy, eg page numbers for refs 11 and 14.

13. Supplement P1, footnote 4 “P=1105” etc looks incorrect.

Reviewer #2: This paper considers two important extensions to the currently most popular and influential colocalization method/software "coloc": a more suitable prior specification (than the current default) and relaxing the assumption of only one causal SNP. In particular, the first problem has been largely ignored in practice while its implication is significant, as the author has clearly shown in the paper. Although the proposed methods are not technically sophisticated, they can be tremendously useful as implemented in the "coloc" software. The paper was well written. I only have two very minor comments.

Minor comments:

1. Prior elicitation is a well known and general problem in Bayesian statistics, both important and challenging. I agree with the author on all her points, and commend the author for providing a useful online tool "coloc explorer". However, without a "default" prior, I am not sure how useful it would be to a "typical" biologist without deep understanding of Bayesian statistics or the "coloc" method; in fact, I would be a bit worried that someone might do "prior mining" to try to get more significant results. Some comments or guidelines might be helpful to a typical user.

2. I completely agree with the author on both the advantages and limitations of the conditioning approach as compared to the proposed "masking" approach. However, if I understand correctly, with a typical small genomic region of interest, one would potentially mask out ALL SNPs in the region that are in LD with the lead SNP; in other words, is the new assumption simply that there is at most one causal SNP in EACH LD block? If true, it is still like doing coloc analysis under the single causal SNP assumption for each LD block, which can be too restrictive given that there are only about two thousand (approximately independent) LD blocks in the human genome. Some clarifications and comments would be helpful.
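The concern in Reviewer #2's second comment can be made concrete with a toy example. The sketch below is not the manuscript's masking procedure. It only illustrates how many SNPs would be flagged if every SNP in LD with the lead SNP were masked. The r² matrix, the threshold of 0.01, and the function name are all hypothetical and chosen for illustration.

```python
def snps_masked_by_lead(r2, lead, threshold=0.01):
    """Return indices of SNPs whose r^2 with the lead SNP exceeds threshold.

    r2 is a square list-of-lists of pairwise r^2 values; the lead SNP
    itself is not included in the returned list.
    """
    return [j for j, value in enumerate(r2[lead])
            if j != lead and value > threshold]


# Hypothetical pairwise r^2 values for five SNPs; SNP 0 is the lead SNP.
# SNPs 0-2 form one LD block and SNPs 3-4 form another.
r2 = [
    [1.000, 0.800, 0.300, 0.005, 0.000],
    [0.800, 1.000, 0.250, 0.004, 0.001],
    [0.300, 0.250, 1.000, 0.002, 0.000],
    [0.005, 0.004, 0.002, 1.000, 0.600],
    [0.000, 0.001, 0.000, 0.600, 1.000],
]

lead = 0
masked = snps_masked_by_lead(r2, lead)
remaining = [j for j in range(len(r2)) if j != lead and j not in masked]
print("masked:", masked)
print("remaining:", remaining)
```

With a low threshold, every SNP in the lead SNP's block is flagged and only SNPs in the other block remain, which is the "one causal SNP per LD block" reading the reviewer asks the author to confirm or correct.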

Reviewer #3: This manuscript investigated how to derive data-driven priors for best power of COLOC, provided a sensitivity analysis framework to assess the robustness of priors, and proposed a new masking approach for dealing with scenarios with multiple signals per region. It is very useful to provide guidelines for users of COLOC about how to set up priors to achieve the best power. However, this paper does not provide a clear guideline to readers. I have the following comments:

1) It would help refresh the reader’s mind if a brief description of the statistical procedure of the COLOC tool could be provided, either in the Introduction section along with the five stated hypotheses or at the beginning of the Results section.

2) I think it would be helpful to make a clear guideline table for readers, e.g., suggested p1, p2, p12 prior values for a few different combinations of the number of SNPs in the test region, the total number of trait signals, and whether multiple signals exist in the test region. Alternatively, such a table could be provided for GTEx expression traits of different tissue types, which would give readers a concrete example.

3) It would be helpful if the authors could provide some descriptions about “coloc explorer” and “condmask coloc” and how to implement these two tools in the supplementary text.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Revision 1

Attachments
Submitted filename: coloc response.pdf
Decision Letter - Michael P. Epstein, Editor, Hua Tang, Editor

Dear Dr Wallace,

We are pleased to inform you that your manuscript entitled "Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional accept, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to "Update My Information" (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about one way to make your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Michael P. Epstein

Associate Editor

PLOS Genetics

Hua Tang

Section Editor: Natural Variation

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-19-02090R1

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Formally Accepted
Acceptance Letter - Michael P. Epstein, Editor, Hua Tang, Editor

PGENETICS-D-19-02090R1

Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses

Dear Dr Wallace,

We are pleased to inform you that your manuscript entitled "Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Kaitlin Butler

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.