Histo-Miner: Deep learning based tissue features extraction pipeline from H&E whole slide images of cutaneous squamous cell carcinoma

Lucas Sancéré; Carina Lorenz; Doris Helbig; Oana-Diana Persa; Sonja Dengler; Alexander Kreuter; Martim Laimer; Roland Lang; Anne Fröhlich; Jennifer Landsberg; Johannes Brägelmann; Katarzyna Bozek

doi:10.1371/journal.pcbi.1013907

Peer Review History

Original Submission August 6, 2025
28 Oct 2025 Decision Letter - Stacey D. Finley, Editor PCOMPBIOL-D-25-01576 Histo-Miner: Deep Learning based Tissue Features Extraction Pipeline from H&E Whole Slide Images of Cutaneous Squamous Cell Carcinoma PLOS Computational Biology Dear Dr. Sancéré, Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript within 60 days Dec 28 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: * A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below. * A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. * An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. 
Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. We look forward to receiving your revised manuscript. Kind regards, Stacey D. Finley, Ph.D. Section Editor PLOS Computational Biology Additional Editor Comments: The reviewers acknowledge the utility of the approach and that the manuscript is well written. However, some details and explanation are needed to improve clarity and allow for reproducibility. These points should be addressed in a revised manuscript. Journal Requirements: 1) Please ensure that the CRediT author contributions listed for every co-author are completed accurately and in full. At this stage, the following Authors require contributions: Lucas Sancéré, Carina Lorenz, Doris Helbig, Oana-Diana Persa, Sonja Dengler, Alexander Kreuter, Martim Laimer, Anne Fröhlich, Jennifer Landsberg, Johannes Brägelmann, and Katarzyna Bozek. Please ensure that the full contributions of each author are acknowledged in the "Add/Edit/Remove Authors" section of our submission form. The list of CRediT author contributions may be found here: https://journals.plos.org/ploscompbiol/s/authorship#loc-author-contributions 2) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type 'LaTeX Source File' and leave your .pdf version as the item type 'Manuscript'. 3) Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150-200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists.
Sample summaries can be found on our website under Submission Guidelines: https://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-parts-of-a-submission 4) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: https://journals.plos.org/ploscompbiol/s/figures 5) We have noticed that you have uploaded Supporting Information files, but you have not included a list of legends. Please add a full list of legends for your Supporting Information files after the references list. 6) Some material included in your submission may be copyrighted. According to PLOS's copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS's copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form. Please respond directly to this email and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. Once you have responded and addressed all other outstanding technical requirements, you may resubmit your manuscript within Editorial Manager. Potential Copyright Issues: i) Figures 1d and 5a. Please confirm whether you drew the images / clip-art within the figure panels by hand. If you did not draw the images, please provide (a) a link to the source of the images or icons and their license / terms of use; or (b) written permission from the copyright holder to publish the images or icons under our CC BY 4.0 license.
Alternatively, you may replace the images with open source alternatives. See these open source resources you may use to replace images / clip-art: - https://commons.wikimedia.org - https://openclipart.org/. 7) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published. 1) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." 8) Please ensure that the funders and grant numbers match between the Financial Disclosure field and the Funding Information tab in your submission form. Note that the funders must be provided in the same order in both places as well. Currently, this information "ODP received funding from the Bavarian Cancer Research Center(BZKF) and Deutsche Stiftung fur Dermatologie" is missing from the Funding Information tab. 9) Please amend your 'Competing Interests' statement, and declare all competing interests beginning with the statement "I have read the journal's policy and the authors of this manuscript have the following competing interests:" Note: If there are no competing interests to declare, please state "The authors have declared that no competing interests exist". Note: If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise. 
Reviewers' comments: Reviewer's Responses to Questions Reviewer #1: Histo-Miner is a novel deep learning-based pipeline designed for the analysis of Whole-Slide Images (WSIs) of skin, specifically targeting the challenge of non-melanoma tumor cell classification, where data scarcity and high morphological similarities between tumor and healthy epithelial cells pose difficulties. The pipeline integrates convolutional neural networks and vision transformers for nucleus segmentation and classification, as well as tumor region segmentation. A key refinement step is implemented where nuclei predicted as tumor cells but located outside predicted tumor regions are reclassified as healthy epithelial, leveraging broader tissue context to overcome these morphological ambiguities. Histo-Miner introduces two new cSCC datasets and demonstrates state-of-the-art performance across multiple tasks. Importantly, it shows potential in predicting cSCC patient response to immunotherapy, highlighting key predictive cellular features such as lymphocyte percentages and granulocyte-lymphocyte ratios, and is designed for generalizability to other cancer types and datasets. Overall, the study presents a promising and well-structured pipeline with clear potential for advancing the analysis of WSIs in cSCC and beyond. However, several details regarding the methodology and evaluation remain unclear and should be clarified to ensure the reproducibility of the study and to allow a proper assessment of the robustness and generalizability of the approach. 2.1 Histo-Miner pipeline description. The pipeline description mentions two types of preprocessing involving downsampling, tiling, and color normalization. However, important details are missing to ensure reproducibility. Could the authors clarify whether tiles are extracted only from tissue regions or from the entire WSI? If tiles are extracted from the whole slide, are patches without tissue filtered out afterwards? 
The method for color normalization is not fully described, and it would be helpful to clarify how it addresses variability in staining across samples. 2.2 NucSeg and TumSeg datasets descriptions. In the dataset generation section, it is stated that class labels, segmentation masks, and tumor regions were manually annotated by two pathology experts. However, it remains unclear how the annotations from the two experts were combined, and what measures were taken to ensure consistency and reliability of the annotations across the dataset. 2.4 Training SCC Hovernet. Regarding the reported split of 5,968 patches for training and 848 patches for validation from the NucSeg dataset, could the authors clarify whether the patch division was performed at the patient level, ensuring that patches from the same patient were kept within the same split? This is important to avoid potential data leakage and overestimation of model performance. Could the authors clarify how the validation set was used for model selection? In particular, was early stopping applied, and if so, how was it configured? Additionally, was cross-validation considered to better assess the robustness and generalizability of the results? This section reports that performance of the resulting models was highest when first pre-trained on ImageNet-21k and subsequently on the non-curated H&E nucleus dataset before fine-tuning. However, these details seem more appropriate for the sections dedicated to Results and/or Discussion rather than the methodological description. 2.5 Training SCC Segmenter. The description states that data augmentation was applied, consisting of random resizing, random left–right flipping, and normalization of the images based on the mean and standard deviation of ImageNet-1k pixel values. However, it seems that the explanation mixes data augmentation with normalization, and this should be clarified.
In addition, it would be important to specify how the data augmentation was applied (e.g., whether augmentations were always applied or with a certain probability, and whether all transformations were used simultaneously or independently). The model was trained on 115 randomly chosen slides from the TumSeg dataset, with the remaining 29 slides used for validation. Could the authors clarify whether this split was performed at the patient level, ensuring that slides from the same patient were not included in both sets? The accuracy estimation was performed on the validation set due to the limited number of slides and the absence of an independent test set. Could the authors clarify whether the validation set was also used to select the best model? Given the small dataset size, cross-validation is recommended to provide a more reliable estimation of model performance and to ensure that the results are not specific to a particular data split, but consistent across different splits. 2.6 Tissue Analyser. Regarding the cell classification refinement, it is mentioned that nuclei predicted as tumor cells outside predicted tumor regions are reclassified as healthy epithelial. This step is crucial to address the morphological similarity between tumor and healthy cells. Could the authors clarify how the robustness of this refinement is ensured against potential inaccuracies in tumor region predictions by the SCC Segmenter? For instance, if a region is incorrectly classified as non-tumor, how does this affect the reclassification of genuinely tumor nuclei within that region? Are confidence thresholds or probability scores used to mitigate such errors? 2.7 Feature selection. The subsection on feature selection in the case study appears under Methods but mixes methodological description with reporting of results. It would be helpful if the authors clearly separate the description of the feature selection procedure from the results obtained, to improve clarity and reproducibility.
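Reviewer #1's questions about patient-level splitting and cross-validation can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `patient_level_folds`, the slide and patient identifiers, and the fold count are all hypothetical. The sketch only shows the general idea of assigning whole patients, rather than individual slides or patches, to folds so that no patient contributes data to both training and validation.

```python
import random
from collections import defaultdict


def patient_level_folds(sample_to_patient, n_folds=5, seed=0):
    """Group samples into folds so that each patient appears in exactly one fold.

    sample_to_patient: dict mapping a sample id (slide or patch) to a patient id.
    Returns a list of n_folds lists of sample ids.
    """
    # Shuffle patients (not samples) so the split unit is the patient.
    patients = sorted(set(sample_to_patient.values()))
    rng = random.Random(seed)
    rng.shuffle(patients)

    # Round-robin assignment of patients to folds.
    fold_of_patient = {p: i % n_folds for i, p in enumerate(patients)}

    folds = defaultdict(list)
    for sample, patient in sample_to_patient.items():
        folds[fold_of_patient[patient]].append(sample)
    return [sorted(folds[i]) for i in range(n_folds)]


# Hypothetical toy data: 12 slides from 6 patients, two slides per patient.
slides = {f"slide_{i}": f"patient_{i // 2}" for i in range(12)}
folds = patient_level_folds(slides, n_folds=3, seed=0)

# Each fold serves once as the validation set; the rest form the training set.
for k, validation in enumerate(folds):
    training = [s for j, f in enumerate(folds) if j != k for s in f]
    print(k, len(training), len(validation))
```

Splitting by patient rather than by slide is the design choice that addresses the leakage concern: two slides from the same patient can never end up on opposite sides of a split.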
Reviewer #2: This manuscript presents Histo-Miner, a deep learning-enabled pipeline for analyzing whole-slide images (WSIs) of cutaneous Squamous Cell Carcinoma (cSCC). The pipeline incorporates a trained convolutional neural network for the segmentation and classification of nuclei and a trained vision transformer for the segmentation of tumor areas. After validating the accuracy of Histo-Miner with manually annotated class labels and segmentation masks, the framework creates a compact feature vector summarizing tissue morphology and cellular interactions. To demonstrate its clinical applicability, the authors utilized this pipeline to predict immunotherapy response in 45 cSCC patients based on their pre-treatment WSIs. Overall, the manuscript is well-written. However, I would like to suggest the following major revisions before getting it published: - On Page 2, line 70, it would be beneficial if the authors could provide a few references for this claim. - Zenodo repository links in supplementary materials are not functional. I believe they should be formatted as https://zenodo.org/records/13986860 - On Page 6, lines 163, 179, and 180, the term "resolution" was used instead of "magnification" when referring to microscope magnification settings. It would be helpful if the authors either clarify the actual resolutions (not magnifications) or consistently use the term "magnification" throughout these lines. - I could not find evidence that the developed method was compared against blind test samples (WSIs). If this comparison was not performed, would it be possible to analyze the accuracy of the technique using blind test samples? - On Page 4, line 123, the authors mention training and testing the models using manually annotated datasets. It would be valuable if the authors could elaborate on the potential problems of manual labeling and their impact on model accuracy in the discussion section. 
Reviewer #3: Summary The paper introduces a deep learning-based pipeline for the analysis of skin Whole-Slide Images (WSIs), called Histo-Miner. To this aim, the Authors generated new datasets of cutaneous Squamous Cell Carcinoma (cSCC). Histo-Miner is presented as a good predictor for nucleus segmentation, nucleus classification, and tumor region segmentation, achieving results comparable to the state of the art, with improvements in the classification of healthy tissue. Moreover, the work also introduces the use of the tool with a predictive objective for immunotherapy response. Overall Quality Statement The work appears to be of very high quality. It presents an interesting tool with strong potential both in image recognition and in its integration with mathematical models, particularly regarding response prediction. The material is clearly presented, with a good balance between text and supplementary material. The flow of the exposition is clear and well documented, including information on the pipeline, practical examples, validation of results, and presentation of possible further applications. Although my expertise is more on the side of application than on the technical aspects of the employed methodologies, I consider this manuscript to be of high value, both for its technical quality and for the usefulness of its content. Minor Observations R133: “After determining tumor areas using SCC Segmenter, the results of the SCC Hovernet cell nuclei classification are updated to add a new cell class as follows: all the nuclei predicted as tumor cells outside of the predicted tumor regions are reclassified as healthy epithelial.” Q. Could you please discuss the potential risk of omitting information related to invasive tissues or metastatic initiation? Have you checked whether there is any correspondence, when analyzing unresponsive tissues, with tumors showing a large number of cells outside predicted tumor regions? 
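The refinement rule quoted in R133 can be sketched as follows. This is a simplified illustration under stated assumptions, not the Histo-Miner implementation: the data layout (a list of nucleus records and a binary tumor mask), the label strings, and the use of the nucleus centroid to decide whether a nucleus lies inside a tumor region are all hypothetical choices made for the example.

```python
def refine_nucleus_classes(nuclei, tumor_mask):
    """Reclassify tumor-labelled nuclei that fall outside predicted tumor regions.

    nuclei: list of dicts with a 'centroid' (row, col) and a 'label' (hypothetical layout).
    tumor_mask: 2D list of 0/1 values, 1 marking predicted tumor regions.
    Returns a new list; the input records are not modified.
    """
    refined = []
    for nucleus in nuclei:
        row, col = nucleus["centroid"]
        inside_tumor = bool(tumor_mask[row][col])
        label = nucleus["label"]
        # Only tumor predictions outside the tumor mask change class.
        if label == "tumor" and not inside_tumor:
            label = "healthy_epithelial"
        refined.append({**nucleus, "label": label})
    return refined


# Hypothetical 4x4 mask with a 2x2 tumor region in the centre.
tumor_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
nuclei = [
    {"centroid": (1, 1), "label": "tumor"},       # inside: stays tumor
    {"centroid": (0, 3), "label": "tumor"},       # outside: reclassified
    {"centroid": (3, 0), "label": "lymphocyte"},  # other classes are untouched
]
refined = refine_nucleus_classes(nuclei, tumor_mask)
print([n["label"] for n in refined])
```

The sketch makes the reviewers' concern concrete: the rule is a hard decision on the mask, so a tumor region missed by the segmenter silently converts every tumor nucleus inside it to the healthy epithelial class.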
R144: “For every pair of cell classes X and Y, we also calculate the average distance of the closest cell of class Y to a cell of class X inside the tumor regions.” Q. Since later in the manuscript the Authors state that they evaluate local cell densities, I would find it more natural to suggest considering the average density instead (i.e., the average number of cells of class Y around cells of class X within a given sensing radius). This approach would also help address boundary effects, as density can be evaluated relative to the proportion of the sphere lying within the tissue, thereby avoiding singularities. Could you please provide an explanation for this choice? R176: “To build a tumor segmenter algorithm, we additionally assembled 144 WSIs of 125 cSCC patients from 3 medical centers – Bonn, Cologne, and Munich.” Q. It would be useful to include information about data uniformity, since the dataset originates from different centers. R220: “We list loss functions in Supplementary Eq. 1–4.” Q. While it is functional to place technical information in the supplementary material, a short descriptive summary in the main text would be valuable. For instance, a brief explanation of which elements the loss functions account for, and whether these are considered as absolute values or in terms of slope. R253: “The accuracy estimation was performed on the validation set due to the limited number of slides and lack of an independent test set.” Since accuracy was evaluated on the validation set rather than an independent test set, the reported performance may be overestimated. The limited number of slides further reduces the reliability of this estimation. I would suggest either cross-validation or the use of an external validation cohort to better assess generalizability. Could you clarify whether the lack of independence arises from the nature of the dataset itself, or from reliance on data augmentation? 
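The two spatial statistics contrasted in the R144 comment can be compared in a short sketch. The function names, coordinates, and radius below are hypothetical, and the boundary correction the reviewer mentions (scaling by the fraction of the neighbourhood lying within the tissue) is deliberately left out to keep the example small.

```python
import math


def mean_nearest_distance(cells_x, cells_y):
    """Average, over cells of class X, of the distance to the closest cell of class Y.

    Returns None when either class is empty, since the statistic is undefined.
    """
    if not cells_x or not cells_y:
        return None
    nearest = [min(math.dist(x, y) for y in cells_y) for x in cells_x]
    return sum(nearest) / len(nearest)


def mean_neighbor_count(cells_x, cells_y, radius):
    """Average number of class-Y cells within `radius` of each class-X cell.

    Stays defined (as 0.0) when class Y is empty; no boundary correction applied.
    """
    if not cells_x:
        return None
    counts = [sum(1 for y in cells_y if math.dist(x, y) <= radius) for x in cells_x]
    return sum(counts) / len(counts)


# Hypothetical centroid coordinates (arbitrary units).
cells_x = [(0.0, 0.0), (10.0, 0.0)]
cells_y = [(3.0, 4.0), (10.0, 2.0), (10.0, 30.0)]

print(mean_nearest_distance(cells_x, cells_y))        # mean of 5.0 and 2.0
print(mean_neighbor_count(cells_x, cells_y, radius=6.0))
```

The nearest-distance statistic needs no radius parameter but is undefined when a class is absent, whereas the count within a radius remains defined at the cost of choosing a radius.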
******** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ****** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] Figure resubmission: While revising your submission, we strongly recommend that you use PLOS’s NAAS tool (https://ngplosjournals.pagemajik.ai/artanalysis) to test your figure files. 
NAAS can convert your figure files to the TIFF file type and meet basic requirements (such as print size, resolution), or provide you with a report on issues that do not meet our requirements and that NAAS cannot fix. After uploading your figures to PLOS’s NAAS tool - https://ngplosjournals.pagemajik.ai/artanalysis, NAAS will process the files provided and display the results in the "Uploaded Files" section of the page as the processing is complete. If the uploaded figures meet our requirements (or NAAS is able to fix the files to meet our requirements), the figure will be marked as "fixed" above. If NAAS is unable to fix the files, a red "failed" label will appear above. When NAAS has confirmed that the figure files meet our requirements, please download the file via the download option, and include these NAAS processed figure files when submitting your revised manuscript. Reproducibility: To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols https://doi.org/10.1371/journal.pcbi.1013907.r001
Revision 1
10 Dec 2025 Author Response Attachments Attachment Submitted filename: Response_to_Reviewers.pdf https://doi.org/10.1371/journal.pcbi.1013907.r002
9 Jan 2026 Decision Letter - Stacey D. Finley, Editor Dear Mr. Sancéré, We are pleased to inform you that your manuscript 'Histo-Miner: deep learning based tissue features extraction pipeline from H&E whole slide images of cutaneous squamous cell carcinoma' has been provisionally accepted for publication in PLOS Computational Biology. Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated. IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS. Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. Best regards, Stacey D. Finley, Ph.D. Section Editor PLOS Computational Biology ********************************************************* Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: The authors have satisfactorily clarified the issues raised in the review and followed the suggested recommendations. Reviewer #2: The manuscript was already well-written and now it is ready-to-publish after these major revisions. 
Reviewer #3: I would like to thank the Authors for the work carried out during the revision of the manuscript and for the careful attention given to the Reviewers’ comments. I have reviewed your responses to the Reviewers’ comments and the revisions made to the manuscript. I consider the corrections implemented in response to the other reviewers’ suggestions, as well as the changes introduced in line with my own remarks, to be appropriate and well justified. In light of this, I am satisfied with both the revisions made to the manuscript and the responses provided. ****** Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes Reviewer #3: Yes ****** PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: Yes: Cagatay Isil Reviewer #3: No https://doi.org/10.1371/journal.pcbi.1013907.r003
Formally Accepted
Acceptance Letter - Stacey D. Finley, Editor PCOMPBIOL-D-25-01576R1 Histo-Miner: deep learning based tissue features extraction pipeline from H&E whole slide images of cutaneous squamous cell carcinoma Dear Dr Sancéré, I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript. Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers. For Research, Software, and Methods articles, you will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing. Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work! With kind regards, Anita Estes PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol https://doi.org/10.1371/journal.pcbi.1013907.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.