Candidate correlates of protection in the HVTN505 HIV-1 vaccine efficacy trial identified by positive-unlabeled learning

Shiwei Xu; Aaron Hudson; Holly E. Janes; Georgia D. Tomaras; Margaret E. Ackerman

doi:10.1371/journal.pcbi.1013705

Peer Review History

Original Submission November 5, 2024
9 Mar 2025 Decision Letter - Denise Kühnert, Editor PCOMPBIOL-D-24-01909 Expanded Insights into Correlates of Protection in the HVTN505 HIV-1 Vaccine Efficacy Trial Afforded by Positive-Unlabeled Learning PLOS Computational Biology Dear Dr. Ackerman, Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript within 60 days May 09 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: * A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below. * A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. * An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. 
Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. We look forward to receiving your revised manuscript. Kind regards, Jessica M. Conway Academic Editor PLOS Computational Biology Denise Kühnert Section Editor PLOS Computational Biology Journal Requirements: 1) Please ensure that the CRediT author contributions listed for every co-author are completed accurately and in full. At this stage, the following authors require contributions: Shiwei Xu, Holly E Janes, Aaron Hudson, Georgia D Tomaras, and Margaret E Ackerman. Please ensure that the full contributions of each author are acknowledged in the "Add/Edit/Remove Authors" section of our submission form. The list of CRediT author contributions may be found here: https://journals.plos.org/ploscompbiol/s/authorship#loc-author-contributions 2) Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150-200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists. Sample summaries can be found on our website under Submission Guidelines: https://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-parts-of-a-submission 3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: https://journals.plos.org/ploscompbiol/s/figures 4) We have noticed that you have uploaded Supporting Information files, but you have not included a list of legends. Please add a full list of legends for your Supporting Information files after the references list. 5) Some material included in your submission may be copyrighted. 
According to PLOS's copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS's copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form. Please respond directly to this email and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. Once you have responded and addressed all other outstanding technical requirements, you may resubmit your manuscript within Editorial Manager. Potential Copyright Issues: i) Figure 1. Please confirm whether you drew the images / clip-art within the figure panels by hand. If you did not draw the images, please provide (a) a link to the source of the images or icons and their license / terms of use; or (b) written permission from the copyright holder to publish the images or icons under our CC BY 4.0 license. Alternatively, you may replace the images with open source alternatives. See these open source resources you may use to replace images / clip-art: - https://commons.wikimedia.org - https://openclipart.org/. 6) Thank you for stating that "Code will be made available upon publication." Please note that, though access restrictions are acceptable now, your entire data will need to be made freely accessible if your manuscript is accepted for publication. This policy applies to all data except where public deposition would breach compliance with the protocol approved by your research ethics board. 
If you are unable to adhere to our open data policy, please kindly revise your statement to explain your reasoning and we will seek the editor's input on an exemption. Please be assured that, once you have provided your new statement, the assessment of your exemption will not hold up the peer review process. 7) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published. 1) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)." 2) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.". Reviewers' comments: Reviewer's Responses to Questions Comments to the Authors: Please note that one of the reviews is uploaded as an attachment. Reviewer #1: In the present article, Xu et al. employ a Positive-Unlabeled (PU) inference method on immunogenicity data collected in the HVTN 505 study to infer vaccine-mediated protection scores against HIV acquisition by comparing patient immunogenicity profiles with those who acquired HIV. The authors then regressed these protection scores against the immunogenicity data to identify correlates of protection (CoP) against HIV infection. To test the robustness of their scores, the authors employed permutation testing, finding that their protection scores and CoPs were not artifacts of chance. These efforts confirmed previously reported correlates of risk (CoR) as well as identified novel CoPs, such as Env gp140-specific IgA and ADCP, that were not found through HIV acquisition-based models. 
Overall, this study is a well-reasoned application of novel machine learning methods to improve recognition of biological determinants in PU data that may also progress our understanding of CoPs in HIV vaccination. However, further efforts to verify the biological validity of the protection scores would greatly bolster confidence in PU inference methods and in the CoPs identified in this article. There are also several opportunities to improve the clarity of the figures and text. Major Concerns Figure 3A is missing the prediction lines. It is claimed that there is some clustering in Figure 2A among positive samples, but is this really supported? Adding quantification of clustering separation, such as through the Silhouette score, would improve confidence in this claim. The authors use permutation testing to validate their Positive-Unlabeled (PU) model captured immunological patterns different from those in randomized measurements. However, it remains unclear if the patients identified as protected from HIV infection were truly protected from infection or if they were never exposed to HIV. To their credit, the authors acknowledge that their permutation efforts cannot verify if the inferred protection statuses are biologically correct, and that such verification is not possible as patient HIV exposures are unknown. In the supplement, the authors conclude there is no significant association between behavioral vulnerability and inferred protection status, but that there is an association between behavioral vulnerability and HIV acquisition. As such, behavioral vulnerability could be used as a proxy for HIV exposure. Perhaps the biological validity of the PU-inferred protection may be bolstered by comparing inferred protection between acquisition and non-acquisition cases in patients with high behavioral vulnerability. 
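The silhouette score that Reviewer #1 suggests for quantifying cluster separation can be sketched as follows. This is a minimal, standard-library illustration and not the authors' analysis: the function name, the toy two-dimensional coordinates (standing in for principal-component scores), and the group labels are all hypothetical. For each point, a is the mean distance to the other members of its own group, b is the smallest mean distance to any other group, and the point's score is (b - a) / max(a, b); values near 1 indicate well-separated groups, and values near 0 or below indicate overlap.

```python
import math
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself adds nothing to the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(math.dist(p, q) for q in other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


# Hypothetical coordinates, for illustration only (not trial data).
group_1 = [(0, 0), (0, 1), (1, 0), (1, 1)]
group_2 = [(10, 10), (10, 11), (11, 10), (11, 11)]
points = group_1 + group_2

separated = silhouette_score(points, ["pos"] * 4 + ["unl"] * 4)
shuffled = silhouette_score(points, ["pos", "unl"] * 4)
print(f"separated labels: {separated:.2f}, interleaved labels: {shuffled:.2f}")
```

Comparing the score under the real labels with the score under shuffled labels, as in the last lines, gives a rough sense of whether apparent clustering exceeds what label assignment alone would produce.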
Minor Concerns In line 68, the statement “While over-representation of controls is a commonly used means to improve upon the ability to detect CoR…” should include a citation to support its claim that oversampling is a common method in CoR. In line 148, missing a “were” between “that” and “observed”. Line 164: The authors allude to two-sample independent t-tests in Figure 2D, though the table in Figure 2D does not include t-test results. Similarly, it is unclear which statistical test was used to define significance differences in EPR and MBS means for Figure 2C. The color for the lighter colored data points, such as yellow, should be adjusted so they can be seen more easily. Figures 2A-B should be reorganized for improved clarity. Reorganizing into two columns, with each row representing a unique combination of principal components and each column corresponding to HIV acquisition or inferred protection, may better illustrate relationships between acquisition and inferred protection. Figure 2B contains “nan” as a feature of importance. Is this a real feature? Can the authors include a dictionary of the features? The triangles along the x-axis in Figure 3F should have their intersection at 0 log fold change, not the center of the plot. In Figure 3B, the x-axis should be labeled, and a legend should be included to define what types of features each color represents; if these correspond to the legend in 3F, it would be beneficial to put the legend earlier with Figure 3B. Similarly, it is unclear which classification task each feature importance plot corresponds to. Presumably, these match the classification task labels in Figure 3A, but increasing the spacing between rows or adding some background shading across all figures corresponding to a classification task may improve clarity in this distinction. 
Throughout Figure 4, the authors should report Mann-Whitney U test statistics alongside p-values when comparing acquisition and protection models; p-values should be used to define if a difference in means is significant, whereas the test statistics should be used to compare deviations in means across models. Reviewer #2: This study analyzes data from HVTN 505, a Phase IIb HIV vaccine trial that failed to meet its efficacy criteria, using a novel machine learning approach called Positive-Unlabeled (PU) learning to gain new insights about vaccine-mediated protection. While the trial showed no overall efficacy, previous analyses suggested that some vaccine recipients may have been protected from certain viral strains. The researchers applied PU learning to infer protection status among vaccine recipients who didn't acquire HIV, allowing for improved detection of potential correlates of immunity. Using this approach, the study confirmed previously identified correlates of risk, such as vaccine-elicited anti-HIV-1 Env glycoprotein IgG3 antibodies and antibody-dependent phagocytosis, while also revealing new observations including a strong inverse correlation between vaccine-mediated protection and virus-specific IgA responses. The findings demonstrate the value of advanced analytical methods for extracting meaningful insights from failed vaccine trials, particularly in cases where traditional analysis methods may lack statistical power due to low efficacy and exposure rates. This analytical framework offers a new way to use case-control datasets to identify markers of effective immune responses even in the context of low overall vaccine efficacy. Major Comments: 1. Circular Logic and Validation Concerns - The paper uses the same immunological data both to infer protection status and then to identify correlates of protection, creating problematic circular logic. 
- The validation approach using permutation testing is insufficient to prove the biological relevance of the inferred protection classifications. 2. Statistical and Methodological Issues - The PU learning approach makes strong assumptions about the mixture of protected/unprotected individuals that are not well justified. - Multiple testing corrections appear inadequate given the large number of features examined. - The robustness of the protection status inferences across different modeling choices is not thoroughly evaluated. - Inconsistent choice of k-fold from 3 (SVM), 5 (RF), to 10 (Bagging)! - Inconsistent use of softwares, Mann Whitney U tests were performed using SciPy library on Python 3.11[47]; the Logistic regression model was implemented in R (version 4.2.2). Why not using Wilcoxon test in R instead of Mann Whitney U test in SciPy? Or if Python is a preferred programming language, why not using SciKit Learn to perform Logistic Regression? 3. Overstatement of Findings - Claims about "discovering" new correlates of protection are overstated given the circular nature of the analysis. - The biological plausibility and mechanistic relevance of the identified features is not adequately discussed. - The limitations of inferring protection status without exposure data are downplayed. 4. Structural and Presentation Issues - The methods section lacks sufficient detail about key modeling choices and parameters. - Figures are not well presented and do not effectively communicate the key findings. - The discussion does not adequately address alternative interpretations of the results. - The practical utility for vaccine development is unclear given the methodological limitations Additional Points: - The manuscript requires significant editing for clarity and conciseness. - Key controls and sensitivity analyses are missing. - The statistical significance of many findings appears marginal. - Important caveats and limitations are buried in the discussion. 
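As background to the Mann-Whitney U versus Wilcoxon question raised above, and to Reviewer #1's request to report U statistics alongside p-values, the sketch below illustrates a standard fact: the two-sample Wilcoxon rank-sum statistic W and the Mann-Whitney U statistic differ only by a constant, U = W - n1(n1 + 1)/2, so the two names refer to the same test. The helper names and sample values are hypothetical, and the code uses only the Python standard library rather than the SciPy or R routines mentioned in the review.

```python
def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_and_u(x, y):
    """Return (W, U) for sample x: Wilcoxon rank sum and Mann-Whitney U."""
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[: len(x)])
    u = w - len(x) * (len(x) + 1) / 2
    return w, u


def u_by_counting(x, y):
    """Mann-Whitney U from its definition: pairs with x > y, ties count 1/2."""
    return sum((xi > yj) + 0.5 * (xi == yj) for xi in x for yj in y)


# Hypothetical measurements, for illustration only (not trial data).
x = [1.2, 3.4, 2.2, 5.0]
y = [0.7, 2.2, 1.9]
w, u = rank_sum_and_u(x, y)
print(w, u, u_by_counting(x, y))
```

Because the statistics are interchangeable, the choice between the two names is mostly a matter of which library is used; p-values can still differ slightly between implementations depending on whether an exact or a normal-approximation method is applied and how ties are handled.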
The core idea of using machine learning to gain additional insights from vaccine trial data is interesting, but the current manuscript has fundamental flaws in its approach and overreaches in its conclusions. To be suitable for publication, the authors would need to: 1. Develop independent validation approaches that don't rely on circular logic. 2. More thoroughly evaluate the robustness of their findings. 3. Provide more rigorous statistical analyses. 4. Substantially improve the clarity of presentation. Reviewer #3: See the attachment ******** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: No: I didn't see the raw data or a dictionary of the variable names. It's possible I just missed this. Reviewer #2: No: Only the datasets used in this analysis are available at https://atlas.scharp.org/project/HVTN%20Public%20Data/HVTN%20505/begin.view. Reviewer #3: None ****** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? 
For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] Figure resubmission: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions. Attachment: Submitted filename: PCOMPBIOL-D-24-01909_R3viewer 3.pdf https://doi.org/10.1371/journal.pcbi.1013705.r001
Revision 1
29 Aug 2025 Author Response Attachment: Submitted filename: response to review_R1.pdf https://doi.org/10.1371/journal.pcbi.1013705.r002
4 Nov 2025 Decision Letter - Denise Kühnert, Editor Dear Dr Ackerman, We are pleased to inform you that your manuscript 'Candidate Correlates of Protection in the HVTN505 HIV-1 Vaccine Efficacy Trial Identified by Positive-Unlabeled Learning' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. Best regards, Jessica M. Conway Academic Editor PLOS Computational Biology Denise Kühnert Section Editor PLOS Computational Biology ********************************************************* To improve clarity, please point to the data dictionary clearly somewhere in your MS. The link provided takes the reader to a top-level list of folders associated with other publications, and it takes some hunting to find what's needed. Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. 
Reviewer #1: The authors have resolved my concerns. Reviewer #2: The authors have answered my comments and the paper is now ready for publication. ****** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified. Reviewer #1: Yes Reviewer #2: No: ****** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: Yes: Aaron S Meyer Reviewer #2: No https://doi.org/10.1371/journal.pcbi.1013705.r003
Formally Accepted
Acceptance Letter - Denise Kühnert, Editor PCOMPBIOL-D-24-01909R1 Candidate Correlates of Protection in the HVTN505 HIV-1 Vaccine Efficacy Trial Identified by Positive-Unlabeled Learning Dear Dr Ackerman, I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. For Research, Software, and Methods articles, you will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing. Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work! With kind regards, Judit Kozma PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol https://doi.org/10.1371/journal.pcbi.1013705.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.