Peer Review History

Original Submission: November 5, 2024
Decision Letter - Jason A. Papin, Editor

PCOMPBIOL-D-24-01938

RETINA: Reconstruction-based Pre-Trained Enhanced TransUNet for Electron Microscopy Segmentation on the CEM500K Dataset

PLOS Computational Biology

Dear Dr. Xing,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please note that both reviewers expressed enthusiasm for the topic, but both had significant reservations about the rigor of the work. A revision would need to do a great deal to address the reviewers' concerns, but the work required for each point does seem feasible.

Please submit your revised manuscript within 60 days, by Apr 14, 2025, 11:59 PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Jason Papin

Editor-in-Chief

PLOS Computational Biology

Additional Editor Comments (if provided):

Journal Requirements:


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Xing et al. describe RETINA, a machine-learning method for segmenting ultrastructural features in electron microscopy (EM) images. Segmenting ultrastructures such as mitochondria, lysosomes, and nucleoli in EM datasets is critical for their analysis, since manual segmentation does not scale to large datasets. The authors' main contribution is to incorporate pretraining on a related dataset into their model training to improve performance. Pretraining can leverage large amounts of unlabeled data, avoiding the higher cost of label generation. The authors show that pretraining improves model performance in comparison to a baseline that does not use pretraining, as well as a second baseline.

The paper is written clearly, and the data are presented well. However, there are several shortcomings in the model chosen for the benchmark datasets, the experiments performed and presented, the comparison models, and the metrics used.

The authors benchmark their model on several datasets that are 3d, in that they extend in all three dimensions. It has long been established that leveraging this 3d context improves performance; yet the authors have chosen a model that uses 2d convolutions. This part is not entirely clear, though, because the model is not fully presented in the paper; I referred to the original manuscript in which the TransUNet architecture was presented (Chen et al., 2021). The baselines the authors compare their model to appear to be 2d models as well, which do not represent the state of the art. Just as one point of reference, the CREMI leaderboard (https://cremi.org/leaderboard/) lists multiple methods applied to synapse detection. CleftNet, one of the high-scoring methods, uses a 3d residual UNet (https://arxiv.org/pdf/2101.04266). This is just one example of a stronger baseline the authors could have considered.

The authors chose to use IoU for all their benchmarks. This is a limited metric because it does not treat voxels that separate individual objects differently from those that are merely at an object's edge, yet merging two distinct objects has a much higher impact on downstream analyses. For instance, the CREMI challenge uses an object-based metric (https://cremi.org/metrics/#clefts). A side effect of only using IoU is that the presented performance cannot be compared with the performances published in the CREMI challenge. If a challenge like CREMI publishes metrics, they should be used.
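The IoU limitation described above can be illustrated with a minimal sketch (toy masks invented for illustration, not taken from the manuscript): a prediction that merges two ground-truth objects can score roughly the same voxel-wise IoU as one that merely misses a single boundary voxel, even though the merge is far more damaging to object-level analyses.

```python
import numpy as np

def iou(pred, gt):
    """Voxel-wise intersection-over-union of two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

# Ground truth: two objects separated by a one-voxel gap.
gt = np.zeros((1, 9), dtype=bool)
gt[0, 0:4] = True   # object A
gt[0, 5:9] = True   # object B

# Error type 1: the gap is filled, merging A and B into one object.
merged = np.ones((1, 9), dtype=bool)

# Error type 2: one boundary voxel of object A is missed.
eroded = gt.copy()
eroded[0, 3] = False

print(round(iou(merged, gt), 3))  # 0.889 -- merge error
print(round(iou(eroded, gt), 3))  # 0.875 -- trivial boundary error
```

IoU rates the catastrophic merge slightly *higher* than the harmless boundary error, which is why object-aware metrics such as those used by CREMI are preferable for this kind of benchmark.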

Still, the pretraining as implemented by the authors improves performance according to one metric (when compared to its non-pretrained self). To make the impact clear, however, more evaluations and metrics should have been presented. For instance, it is not clear whether the performance gain outweighs the additional cost of pretraining. Overall, the impact of this work on the science that uses segmentations like the ones created by RETINA is not made clear enough. The introduction claims that “deep learning-based segmentation has shown promise, its accuracy to automatically segment cellular structures in EM data remains insufficient compared to expert manual results.” There are many publications that show this is not true: many automated methods, going back almost 10 years, have demonstrated performance similar to that of human annotators for ultrastructural segmentation. That is not to say that there are no improvements to be had, especially when it comes to efficiency along the lines of what RETINA tries to achieve.

Reviewer #2: Xing et al. presented a hybrid approach, RETINA, for EM segmentation. By combining the widely used TransUNet backbone with pre-trained features from the CEM500K dataset, the authors suggested that the hybrid method outperformed the other approaches in terms of speed and accuracy. This is timely work given the rapid expansion of volume EM datasets in recent years.

However, I recommend addressing the overall accuracy of the manuscript, particularly in the following areas:

1. Could you provide the iteration time for each model? This information is essential for accurately estimating the computational resources required to achieve a certain IoU.

2. There are discrepancies among the numbers reported in Table 1, Table S5, and Figure 3. For example, in line 145 it is stated that “The UroCell benchmark demonstrates the most significant IoU increase, over 150%”; I was not able to derive this number from Table 1. The Guay numbers in Table 1 do not match those in Table S5 and Figure 3. The UroCell Random Init. value of 0.203 in Table 1 appears to be incorrect. The iteration numbers of Gray, Perez, and Kasthuri++ in Figure 3 do not match those in Table 1. The spelling of “Kathuri” in Figure 3 should be "Kasthuri".

3. The names of benchmarks in Table 1 should be better defined to ensure consistency with the main text. For example, “CEM500K moco” in Table 1 should be explicitly linked to “UNet-ResNet50 model pre-trained with MoCoV2” in line 126 and the “MoCoV2 pre-trained model” in line 158. “Random Init.” should be clearly defined as the “randomly initialized UNet-ResNet50 model” in line 131. The “non-pre-trained UNet-ResNet50 model” in line 172 is unclear and requires clarification.

4. It would be better for the order of benchmarks to be consistent between Figure 3 and Figure S5.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Revision 1

Attachments
Attachment
Submitted filename: response_to_reviewer.pdf
Decision Letter - Jason A. Papin, Editor

Dear Dr. Xing,

We are pleased to inform you that your manuscript 'RETINA: Reconstruction-based Pre-Trained Enhanced TransUNet for Electron Microscopy Segmentation on the CEM500K Dataset' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Jason Papin

Editor-in-Chief

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors sufficiently addressed my points by adding 3d approaches to the evaluation, adding metrics, and clarifying the manuscript in several ways. I appreciate the extensive responses and the added work that improved the manuscript.

While I understand that the use of CEM500K is the reason for using a 2d approach in the first place, there are plenty of 3d datasets available to pretrain 3d models. Since the demonstrated performance of RETINA exceeds that of the (reasonably) selected 3d approaches, the main point of the manuscript holds, and this is left to future work.

Reviewer #2: Thank you for addressing my questions. The revised version looks good.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Formally Accepted
Acceptance Letter - Jason A. Papin, Editor

PCOMPBIOL-D-24-01938R1

RETINA: reconstruction-based pre-trained enhanced TransUNet for electron microscopy segmentation on the CEM500K dataset

Dear Dr Xing,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.