Peer Review History

Original Submission: January 20, 2021
Decision Letter - Scott M. Williams, Editor, Ron Do, Editor

Dear Dr Julienne,

Thank you very much for submitting your Research Article entitled 'Multitrait GWAS to connect disease variants and biological mechanisms' to PLOS Genetics.

The manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important topic but identified some concerns that we ask you to address in a revised manuscript.

We therefore ask you to modify the manuscript according to the review recommendations. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Ron Do

Guest Editor

PLOS Genetics

Scott Williams

Section Editor: Natural Variation

PLOS Genetics

Please address all reviewers' comments, including Reviewer 1's comments on clarifying whether the biological and clinical insights obtained from this work are meant to be suggestive or whether the authors believe they are strong enough on their own to guide future action / decision making.

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: This paper includes a lot of analyses of summary-statistic GWAS results for multiple traits, and partitions SNP associations into clusters. There are interesting methodological details, and fairly detailed analyses and interpretations of several sets of phenotypes, with extensive supplemental material. The manuscript is written to focus on the phenotypic results with the methodological work largely in the supplement. It might be better to include more of the methods details in the main text.

The Introduction is not well written and needs more clarity and rationale. In fact the paper is comparing a set of multivariate methods, including at least one new method, and evaluating the performance and the identified results. But this message does not come through clearly at all in the current introduction.

Also, the introduction has a lot of space devoted to methods & results. I know this is a common practice in many articles, but in this case, the sentences’ meaning is unclear. The introduction should spend more time on the motivation. For example:

“First, characterizing and comparing the relative performances of alternative multitrait association models, we found strong specificity of the signal identified by each approach,…”

These “alternative…models” are not mentioned. Different kinds of models? Different assumptions? Why were they chosen? Then you say “….identified by each approach” but we don’t know what the approaches are.

Results

The performance of the different multivariate methods is discussed in the main text – when they agree, when they disagree, what kinds of features they agree on. But it is done superficially rather than exploring the specific assumptions behind each model. Presumably, a more in-depth comparison can be achieved through the simulation studies, but none of this appears in the main text. For example, are each SNP's effects on the phenotypes correlated? Or are the phenotypes correlated because the SNP acts on a shared latent factor that increases risk of each phenotype? Or are the phenotypes correlated due to residual (non-genetic) correlations?

Many interesting findings are buried in one sentence: e.g. “We also developed corrections for several critical real data issues related to model misspecification (Figs S7 to S12) and missing data (Fig.S13)”. Some of these results should be in the main body of the paper. Similarly, in Methods, you say briefly “we implemented additional tools to estimate the per SNPs sample size…”. In the relevant supplemental material, the formulas and Figures relating to sample size are shown without sufficient explanation. It is difficult to decipher what is the problem being addressed in these supplemental sample size sections.

Existing multivariate methods were cited, but not used. Is there a reason for this?

Clustering: I am sure the authors are well aware of the challenges involved in interpreting clusters. Here, only SNPs that showed significance at 10^-8 with at least one method are included in the clustering. Bootstrapping was used to look at cluster stability w.r.t. the number of clusters and the BIC/Silhouette criteria, but these would all be different if another threshold were used. Some clusters would have to be found given the thresholding effects. In fact, the authors acknowledge (in some way) the ad hoc nature of their analyses when they say:

“These distinct multitrait association profiles might arise because their variants belong to distinct genetic functional groups. Understanding whether those genetic functional groups are only statistical construction or correspond to meaningful biologically mechanism is critical. In the latter, it means that data-driven approach, such as the one proposed in the present study, can be used to dissect the genetic contribution of many complex human phenotypes.”

But this crucial text is in the middle of results, whereas I think this needs to be highlighted in the introduction.

An important message that needs to come through better is whether the conclusions that can be obtained through this kind of multivariate analysis of summary GWAS data are comparable enough to multivariate analyses of individual-level data to enable making decisions about potential drugs or treatments. Or are they just suggestive?

Could some UK Biobank data be used to do some structural equation modelling including individual-level data, and to confirm some of the clustering relationships seen? I.e., could directed graphical relationships be estimated in the individual-level data?

Minor

Careful editing would be helpful to clean up prepositions and articles, as well as general spelling and grammar here and there. A few examples are provided here:

• “relevance of multitrait association tests, there have” : probably should be “they”

• “We performed a series of analyses”

• “To understand further the relative performance of those three tests (omnibus, sumZica, sumZg) along with the univariate test”

And general wording would also benefit from good editing. For example:

• “we explored which multitrait signal was associated with the largest increase in detection per test. For that aim,…” The use of “For that aim,…” here is awkward.

• “…median chi-squared were elevated for the any …” should be “median chi-squared tests were elevated for the any…”

Supplement, section on “Theoretical comparison with MANOVA”:

• K>>N should be K<<N

Clustering: Fig S22 should indicate the number of clusters chosen.

Reviewer #2: This study tackles the problem of finding pleiotropic loci that affect multiple traits (multi-trait analysis) and interpreting the complex genetic effect patterns. This is an important problem in the current genetics field, I think. The paper is extremely well written and most of the parts can be easily understood. The analyses and figures are fascinating.

I only have a few minor comments.

The authors are quite modest and do not argue that they developed these methods: the various sums of squares and omnibus. (They all look quite straightforward, and leave no room for controversy.) Although omnibus is quite widely used, I think there’s some novelty in using PCs (derived from various sources) as weights for FE-meta. It’s a pretty simple idea, but it’s important to note that it’s certainly not something a typical geneticist can come up with instantly. I mean, it’s a good idea. I couldn’t find any citation for it. So, if using PCs as the weight direction is what the authors have developed, I think it’s OK to advertise it as such (including the independent component analysis part).

A brief discussion of the possible multiple-testing correction issue when applying Z-r, Z-ica, and omnibus in combination would be beneficial.
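To illustrate the concern, here is a minimal sketch, assuming the simplest conservative remedy: a Bonferroni correction across the tests applied to each SNP. The function names, the default threshold, and the example p-values are hypothetical and are not taken from the manuscript; because the tests are computed on the same data and are correlated, this correction will tend to be conservative, and the authors may reasonably prefer another approach.

```python
def bonferroni_min_p(p_values):
    """Combine p-values from several tests on one SNP.

    Returns the smallest p-value multiplied by the number of tests,
    capped at 1.0 (Bonferroni correction for taking the best test).
    """
    n_tests = len(p_values)
    return min(1.0, n_tests * min(p_values))


def is_significant(p_values, alpha=5e-8):
    """True if the corrected minimum p-value is at or below alpha."""
    return bonferroni_min_p(p_values) <= alpha


# Hypothetical p-values for one SNP from three multitrait tests.
print(is_significant([1e-9, 0.2, 0.5]))
print(is_significant([2e-8, 0.5, 0.5]))
```

In the second call, the smallest p-value passes the uncorrected threshold on its own but not after accounting for having tried three tests, which is the situation the discussion should cover.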

The details of GMM / MGMM are not described; only a bioRxiv paper is cited. It would be better to briefly reiterate how these frameworks work here, so that this paper is self-contained.

I would be really interested in code that can automatically perform (1) some kinds of standard preprocessing, including summary imputation, (2) the various tests, and (3) visualization of Figures 2, 3, and 4 (or generation of formatted input data for Figures 2, 3, and 4 along with a standard script). That would really help the community.

Reviewer #3: Review: Multitrait GWAS to connect disease variants and biological mechanisms

Overview

Julienne et al. integrate GWAS summary statistics from multiple phenotypes to provide insights into shared genetic architecture and biological mechanisms. They leverage previously described multi-trait association methods to identify GWAS signals shared across 36 phenotypes and apply a novel clustering approach to group underlying association signals into broad biological categories. They highlight results from immune- and metabolism-related phenotypes to shed light on shared pathways. The methods and analyses are rigorous, and the takeaways should be of broad interest to the statistical genetics community. I found the manuscript to be well written, the underlying assumptions thoroughly tested with simulations, and the results generally well supported. With that said, I have a few comments.

Major Comments

1. Integrating multiple GWAS data from varying cohorts and studies requires great care in handling allele coding, varying sample size, and missingness. The authors have clearly spent a good deal of time considering the practical challenges and I was impressed with their systematic approach to each issue with rigorous statistical modeling and accompanying simulations. Similarly, I appreciated the thorough step-by-step derivations in the supplementary material. It was clear this was a considerable effort on the part of the authors. It sets a standard for how supplementary methods should be presented, and for that they should be commended.

2. The authors note the need to prune SNPs that lacked a clear cluster assignment using large values of entropy as a metric. Can the authors perform analyses to provide some indication as to why these SNPs were unable to be assigned to any cluster with high confidence? Is this the result of similar functionality across clusters [and thus the method cannot assign to any with certainty], or is it more likely due to non-shared functionality and lack of a representative cluster?

3. With clusters acting as proxies for biological pathways, it would be interesting to see if tissue enrichment varies across clusters for a specific phenotype group.
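For readers unfamiliar with the entropy criterion raised in comment 2, a minimal sketch follows. It assumes each SNP comes with a vector of cluster membership probabilities (as a mixture model would provide) and that pruning applies a fixed cutoff. The function names, SNP identifiers, probabilities, and cutoff value are hypothetical and are not taken from the manuscript.

```python
import math


def assignment_entropy(posteriors):
    """Shannon entropy (in nats) of one SNP's cluster membership probabilities.

    Entropy is 0 when all probability sits on a single cluster and reaches
    log(number of clusters) when the probabilities are uniform.
    """
    return -sum(p * math.log(p) for p in posteriors if p > 0)


def prune_ambiguous(snp_posteriors, max_entropy):
    """Keep only SNPs whose assignment entropy is at or below max_entropy."""
    return {
        snp: probs
        for snp, probs in snp_posteriors.items()
        if assignment_entropy(probs) <= max_entropy
    }


# Hypothetical membership probabilities over three clusters.
example = {
    "snp_a": [0.98, 0.01, 0.01],  # confidently assigned to one cluster
    "snp_b": [0.34, 0.33, 0.33],  # nearly uniform, hence ambiguous
}
kept = prune_ambiguous(example, max_entropy=0.5)
print(sorted(kept))
```

Note that a high entropy value alone does not distinguish the two explanations raised in the comment: a SNP split evenly between two similar clusters and a SNP that fits no cluster well can both produce near-uniform probabilities, so additional diagnostics would be needed.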

Minor Comments

1. Figure 3 caption has a formatting or word-to-pdf conversion error, “ -cell function”.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Revision 1

Attachment
Submitted filename: Review1_v0.4.docx
Decision Letter - Scott M. Williams, Editor, Ron Do, Editor

Dear Dr Julienne,

We are pleased to inform you that your manuscript entitled "Multitrait GWAS to connect disease variants and biological mechanisms" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional acceptance, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to "Update My Information" (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about making your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Ron Do

Guest Editor

PLOS Genetics

Scott Williams

Section Editor: Natural Variation

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have responded adequately to all comments.

Reviewer #2: The authors have successfully addressed my comments and I don't have further comments. I hope that the implementation will be used widely.

Reviewer #3: The authors have addressed all of my initial comments.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: None

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-21-00084R1

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Formally Accepted
Acceptance Letter - Scott M. Williams, Editor, Ron Do, Editor

PGENETICS-D-21-00084R1

Multitrait GWAS to connect disease variants and biological mechanisms

Dear Dr Julienne,

We are pleased to inform you that your manuscript entitled "Multitrait GWAS to connect disease variants and biological mechanisms" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Andrea Szabo

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.