Integrating multiple spatial transcriptomics data using community-enhanced graph contrastive learning

Wenqian Tu; Lihua Zhang

doi:10.1371/journal.pcbi.1012948

Peer Review History

Original Submission November 10, 2024
10 Nov 2024 Author Response https://doi.org/10.1371/journal.pcbi.1012948.r001
14 Jan 2025 Decision Letter - Wei Li, Editor

PCOMPBIOL-D-24-01959
Integrating multiple spatial transcriptomics data using community-enhanced graph contrastive learning
PLOS Computational Biology

Dear Dr. Zhang,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by Mar 16 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.
* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Wei Li, Ph.D.
Academic Editor
PLOS Computational Biology

Jian Ma
Section Editor
PLOS Computational Biology

Journal Requirements:

1) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type ‘LaTeX Source File’ and leave your .pdf version as the item type ‘Manuscript’.

2) Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150-200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists. Sample summaries can be found on our website under Submission Guidelines: https://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-parts-of-a-submission

3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: https://journals.plos.org/ploscompbiol/s/figures

4) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published. Please ensure that the funders and grant numbers match between the Financial Disclosure field and the Funding Information tab in your submission form. Note that the funders must be provided in the same order in both places as well.
- State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."
- State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: Tu et al., in their research article titled “Integrating multiple spatial transcriptomics data using community-enhanced graph contrastive learning,” showcase the use of a graph learning method to integrate multiple spatial transcriptomics datasets from different platforms and under different scenarios. The method development contributes to easing the complexities posed by spatial transcriptome data, especially batch effects and the impact of noise. At its current stage, the article addresses technical defects of spatial transcriptomics rather than providing biological insights. The tool presented in the article is limited to integrating datasets from the same or different platforms, making the article in its present form unsuitable for publication. However, the article could be significantly improved if the authors addressed the points discussed below:

1) In the Overview of Tacos sub-section of the Results section, the authors present a figure showing the schema of the graph-based algorithm. However, it is not mentioned why graph contrastive learning is the better choice. What were the criteria for selecting this algorithm? A better alternative would be to train different algorithms and select the one best suited to the dataset. This would validate the choice and give a better rationale for the tool development.

2) The authors show the smoothness of integration but fail to explain how non-computational biologists or bioinformaticians could utilize the present tool. Could the tool in its present form be integrated into well-known spatial transcriptomics analysis workflows? If so, how?

3) The tool's computational efficiency is not extensively discussed. Moreover, if the tool has extensive memory utilization, a section or supplementary note should discuss how multithreading or multiprocessing could speed up execution.

4) Both scRNA-seq and spatial transcriptomics are used to find novel biological markers or insights. However, the present tool does not provide a better alternative to existing tools.

5) The GitHub page appears to be well maintained; however, the readme and license file need to be updated.

Reviewer #2: The manuscript introduces Tacos, a novel community-enhanced graph contrastive learning-based method designed to integrate multiple spatial transcriptomics datasets across varying platforms and conditions. The method offers a promising tool for integrating spatial transcriptomics data, with potential applications in studying tissue architecture and disease progression. However, several areas could benefit from further clarification to improve its overall impact and scientific rigor.

1. Given that the aligned output is in a low-dimensional embedding space, in addition to the comparison with the Scanpy integration without batch effect removal, it would be preferable to also compare it with Scanpy's batch-corrected integration methods, such as Harmony. Additionally, SPIRAL has the same design for spatial data integration across various experiments and technologies, and SPIRAL was also tested on the DLPFC and MOA datasets. It would be beneficial to include SPIRAL in the comparisons as well.

2. How is the spatial gene expression data denoised? A detailed explanation of the approach used to denoise spatial gene expression data is required.

3. Compared to ground truth annotations, Tacos had lower scores in preserving annotated layers. Could you clarify how these scores are defined? Additionally, does this imply that Tacos may not effectively preserve the biological structures represented by the annotated layers?

4. The trajectory tree analysis appears unconventional. Could you clarify how the hierarchical organization of the dorsolateral prefrontal cortex tissue was interpreted?

5. All the test datasets in the manuscript appear to involve two-sample integration. It would be valuable to assess the performance of Tacos in integrating multiple samples simultaneously.

6. Although spatial domains are not the direct output of Tacos, it is important to provide a detailed description of how spatial domains were determined. Additionally, clarify the criteria and methods used to compare spatial domains across different approaches.

Reviewer #3: In this study, the authors present TACOS, an advanced method for spatial transcriptomics integration that combines graph convolutional neural networks (GCNs) with innovative data preprocessing and augmentation strategies. By constructing cell graphs that integrate spatial and gene expression information, the method provides a robust framework for capturing the intricate relationships within spatial transcriptomics datasets. Furthermore, TACOS leverages contrastive learning to enhance feature representations, enabling the alignment of spatial datasets with unprecedented accuracy. This approach not only achieves state-of-the-art performance in spatial data integration but also addresses key challenges in preserving spatial context and biological relevance. Importantly, the authors have developed TACOS as a Python package, ensuring its accessibility and utility for the research community. This work represents a significant advancement in the field of spatial transcriptomics, offering both methodological innovations and practical tools for researchers. While I recommend the paper for publication, addressing the following relatively minor comments will improve the manuscript's clarity and presentation.

1. Figure 1 and Caption: The figure does not provide adequate details to explain the model's training process. Specifically, the manuscript does not clarify how the deep learning model generates the aligned dataset or what specific steps are taken to achieve this alignment. Moreover, the caption does not describe the training objectives, such as the loss functions used in the model, or explain their significance in aligning the datasets. Adding these details would greatly enhance the clarity and accessibility of this figure, ensuring readers fully understand the underlying methodology.

2. Methods Section – "Datasets and Data Preprocessing": There is a typographical error in the term "pseud-count," which should be corrected to "pseudocount." Additionally, the manuscript does not adequately discuss the choice of the number of highly variable genes, a parameter that is crucial for the analysis. This choice can significantly affect the results, and explicitly presenting the criteria used to select this parameter is necessary for reproducibility and transparency. A brief justification for the chosen number would strengthen the methodological rigor of the study.

3. Methods Section – Graph Construction and Embeddings: The subsections "Building Spatial Graph for Spatial Transcriptomic Data" and "Extracting Low-Dimensional Embeddings with Community-Enhanced Encoder" require additional clarity in explaining the equations and symbols introduced. For example, the meaning of symbols such as "s" and "x" in the "1-skeleton" equation is not explained, leaving their interpretation ambiguous. Similarly, the rationale for using "V(s)" instead of directly constructing a k-nearest neighbor (KNN) graph based on spatial distances is not adequately articulated. Providing a clear explanation of how these techniques contribute to solving the spatial integration task would strengthen the section. Additionally, numbering the equations for easier reference and avoiding redundancy in their presentation would improve the overall organization and readability of the manuscript.

4. Model Architecture and Training Details: The manuscript omits critical information about the structure and optimization of the deep learning model. Details such as the architecture of each module during training, including the number of neurons, are not provided. Similarly, essential optimization parameters, such as the choice of optimizer, learning rate, and the number of training epochs, are missing. Furthermore, there is no discussion of runtime or computational efficiency, particularly concerning graph construction and graph convolutional operations. Since the method involves graph-based spot-level interactions, it is important to clarify whether full-batch mode is used for graph convolutions and, if so, discuss its implications for runtime and memory requirements. Including these details would provide a more comprehensive understanding of the computational aspects of the method.

5. Figure 2 and Baseline Comparisons: The use of Leiden clustering on embeddings derived from PCA in Scanpy is not appropriate. Leiden clustering is primarily designed for identifying cell types based on gene expression, not for analyzing spatial slices, making its relevance to this context questionable. Additionally, the comparison with the ground truth trajectory is unclear. The manuscript mentions performing PAGA using ground truth labels, but this approach does not seem to provide a fair baseline for evaluating spatial integration tasks. A more suitable baseline comparison that directly addresses the spatial integration objectives of the study would better validate the effectiveness of TACOS and bolster the credibility of the results.

********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility: To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

https://doi.org/10.1371/journal.pcbi.1012948.r002
Revision 1
20 Feb 2025 Author Response Attachment: response_letterR1.pdf https://doi.org/10.1371/journal.pcbi.1012948.r003
10 Mar 2025 Decision Letter - Wei Li, Editor

Dear Dr Zhang,

We are pleased to inform you that your manuscript 'Integrating multiple spatial transcriptomics data using community-enhanced graph contrastive learning' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Wei Li, Ph.D.
Academic Editor
PLOS Computational Biology

Jian Ma
Section Editor
PLOS Computational Biology

********

Reviewer's Responses to Questions

Comments to the Authors: Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have addressed my concerns.

Reviewer #2: The authors have addressed all of my comments.

Reviewer #3: I appreciate the authors’ thorough and thoughtful responses to my comments. Their clarifications and additional analyses have effectively addressed my concerns, and I am satisfied with their revisions. I also commend the authors for their rigorous approach and the high quality of their work. This study represents an exciting advancement in the field, and I look forward to seeing its impact on the community.

********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

https://doi.org/10.1371/journal.pcbi.1012948.r004
Formally Accepted
Acceptance Letter - Wei Li, Editor

PCOMPBIOL-D-24-01959R1
Integrating multiple spatial transcriptomics data using community-enhanced graph contrastive learning

Dear Dr Zhang,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofia Freund
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

https://doi.org/10.1371/journal.pcbi.1012948.r005

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.