Neural network models for sequence-based TCR and HLA association prediction

Si Liu; Philip Bradley; Wei Sun

doi:10.1371/journal.pcbi.1011664

Peer Review History

Original Submission June 21, 2023
11 Aug 2023 Decision Letter - Jinyan Li, Editor, Sushmita Roy, Editor

Dear Dr. Sun,

Thank you very much for submitting your manuscript "Neural network models for sequence-based TCR and HLA association prediction" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly revised version that takes into account the reviewers' comments.

The three reviewers have raised many important comments and suggestions, which are expected to be addressed in a revised version of the manuscript.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.
Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,
Jinyan Li
Academic Editor
PLOS Computational Biology

Sushmita Roy
Section Editor
PLOS Computational Biology

*********************

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: Liu et al. proposed to use neural network models to predict the association between T cell receptors (TCR) and human leukocyte antigens (HLA) from their amino acid sequences. After constructing the training data of positive and negative pairs, they selected the tuning parameters in the deep learning model to predict associations. Benchmarking shows better performance than the existing method CLAIRE. The proposed DePTH differs from CLAIRE in its training data, and DePTH allows HLA alleles unseen in the training data since it directly takes the sequences as input. In the application to immune checkpoint blockade, the authors showed that DePTH-predicted TCR-HLA associations could be useful as biomarkers associated with clinical outcomes. The paper is well-written, the analyses are comprehensive, and the scientific contribution is strong. I list some minor comments below that may further improve the manuscript.

"Table C: Performance of DePTH on leave-one-out experiments" The AUC is above 0.64, specificity is above 0.92, but recall/sensitivity is below 0.31. This suggests that a cutoff other than 0.5 may better balance specificity and sensitivity in leave-one-out experiments.
This may help predict new TCRs and HLAs unseen during training.

“The AUC to distinguish these 54 TCR-HLA pairs from the negative pairs from Emerson test data is 0.87” The proposed DePTH with Emerson data performs well for the solved TCR-pMHC-I structures. I wonder how CLAIRE or DePTH with McPAS data performs. This can be a potential comparison as well.

“The validation AUC became stable after ensemble size increased to around 20 (S1 Appendix Fig B).” Based on Fig B, shuffles #1 and #3 have stable AUCs at around 20 models, but the other two shuffles have AUCs that keep increasing beyond 20 models, which suggests more models can be used in the ensemble if computation is not a concern.

Reviewer #2: This manuscript developed a neural network method named DePTH to predict TCR-HLA associations based on their amino acid sequences. The method outperforms the existing method for a similar task on less common HLA alleles. The manuscript shows that DePTH can be used to quantify the functional similarities of HLA alleles. These similarities are shown to be associated with the survival outcomes of cancer patients who received immune checkpoint blockade treatment.

Major:
1. The comparison with random forest would be better combined with the section “DePTH models achieve AUC around 0.8 on test TCR-HLA pairs” to give readers a reference for how good an AUC around 0.8 is.
2. (a) Please explain this sentence more clearly: Line 223: “CLAIRE treats HLA as a categorical variable, which limits its ability to generalize to unseen HLAs”. For example, how does CLAIRE treat HLA as a categorical variable? (b) Based on this statement, CLAIRE would have very low predictive power when used to predict associations in a different dataset. But the results seem to show that the performance of DePTH is similar to CLAIRE. What is the explanation?
3. In the section “DePTH outperforms CLAIRE on most HLAs when trained on the same data”, please provide the frequencies for all 10 HLA-I alleles.
4.
The section on HLA similarity metrics is not easy to follow. It would help if the rationale of this analysis is explained more clearly at the beginning of the section. In particular, what are the clinical implications of HLA similarities (both between individuals and individual-level heterogeneity)? How do TCR-HLA pairs relate to HLA similarities? Also, when reporting the results, please explain how to interpret the results of this analysis and their clinical implications.
5. What are the covariates in Fig 4? Please provide a more complete description of the survival analyses.
6. Some parts of the results section involve a long description of the analysis procedure. It may be helpful to accentuate the main message by moving some description of the procedure to the methods section and providing more rationale.

Reviewer #3: The authors propose a method called DePTH for predicting TCR-HLA pairing that uses the sequence information of HLA alleles (in contrast to the competing method, which uses HLA as a categorical variable), thus allowing the prediction to generalize to rare and unseen HLAs. They used the Emerson data to train the model, with selected positive associations (co-occurrences of TCR and HLA) and sampled negative pairs as controls. Specifically, they selected 6,423 associated TCR-HLA pairs out of 742,832,595 TCR-HLA pairs involving HLA-I alleles and 11,037 associated TCR-HLA pairs out of 1,136,096,910 TCR-HLA pairs involving HLA-II alleles. They show that DePTH outperformed the competing methods in prediction accuracy and further validated it using an independent dataset by Szeto et al. The method has the potential to study rare HLAs. They show the method has moderate prediction accuracy for unseen pairs. The study could be improved if examples of clinical utility can be demonstrated more convincingly. For example, in the Crowell dataset, are there any novel findings of predicted TCR-HLA pairs associated with patient survival beyond what was reported in the original paper?
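
Reviewer #1's remark about the 0.5 cutoff can be made concrete with a small sketch. The code below is not DePTH code and does not use the authors' data; the scores and labels are hypothetical toy values, and Youden's J (sensitivity + specificity - 1) is only one of several possible criteria for picking a cutoff, chosen here as an assumption for illustration.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[int], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when predicting positive for score >= cutoff."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= cutoff
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def best_cutoff_youden(
    scores: List[float], labels: List[int]
) -> Tuple[float, float, float]:
    """Scan every observed score as a candidate cutoff and keep the one that
    maximizes Youden's J. Ties keep the lowest cutoff."""
    best_cutoff, best_j = None, float("-inf")
    best_sens = best_spec = 0.0
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        j = sens + spec - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
            best_sens, best_spec = sens, spec
    return best_cutoff, best_sens, best_spec


# Hypothetical toy scores: six negative pairs followed by six positive pairs.
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.45,
              0.30, 0.35, 0.40, 0.55, 0.70, 0.90]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

if __name__ == "__main__":
    print("cutoff 0.5:", sensitivity_specificity(toy_scores, toy_labels, 0.5))
    print("Youden cutoff:", best_cutoff_youden(toy_scores, toy_labels))
```

On this toy data the 0.5 cutoff gives perfect specificity but misses half of the positives, while the lower cutoff found by the scan trades a little specificity for much higher sensitivity. In practice the cutoff would need to be chosen on validation data rather than on the test pairs themselves.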
******

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

******

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Jiebiao Wang
Reviewer #2: No
Reviewer #3: No

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.
Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

https://doi.org/10.1371/journal.pcbi.1011664.r001
Revision 1
9 Oct 2023 Author Response
Attachment: Submitted filename: PLOS_CB_response.pdf
https://doi.org/10.1371/journal.pcbi.1011664.r002
6 Nov 2023 Decision Letter - Jinyan Li, Editor, Sushmita Roy, Editor

Dear Dr. Sun,

We are pleased to inform you that your manuscript 'Neural network models for sequence-based TCR and HLA association prediction' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,
Jinyan Li
Academic Editor
PLOS Computational Biology

Sushmita Roy
Section Editor
PLOS Computational Biology

*********************************************************

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: My comments have been fully addressed. Thank you!
Reviewer #2: Authors have addressed all my comments satisfactorily
Reviewer #3: The authors have addressed my critique.
******

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1: None
Reviewer #2: Yes
Reviewer #3: Yes

******

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

https://doi.org/10.1371/journal.pcbi.1011664.r003
Formally Accepted
15 Nov 2023 Acceptance Letter - Jinyan Li, Editor, Sushmita Roy, Editor

PCOMPBIOL-D-23-00978R1
Neural network models for sequence-based TCR and HLA association prediction

Dear Dr Sun,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,
Zsofi Zombor
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

https://doi.org/10.1371/journal.pcbi.1011664.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.