Peer Review History

Original Submission: May 2, 2025
Decision Letter - Marcel Schulz, Editor

Library Size-Stabilized Metacells Construction Enhances Co-Expression Network Analysis in Single-Cell Data

PLOS Computational Biology

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the editorial assessment process.

Please submit your revised manuscript within 60 days, by Aug 18 2025 11:59PM. If you need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor. You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Marcel Holger Schulz, Ph.D.

Academic Editor

PLOS Computational Biology

Ilya Ioshikhes

Section Editor

PLOS Computational Biology

Additional Editor Comments:

I have read your paper and there are a few things that I would like you to address before I can send it out for peer review:

1. Cite and/or compare your approach for construction of metacells with existing methods that have been designed for that purpose using the same benchmarking you do already in the paper:

- MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions. Baran et al. Genome Biol 2019

- Building and analyzing metacells in single-cell genomics data. Bilous et al. Mol Syst Biol 2024

- MetaQ: fast, scalable and accurate metacell inference via single-cell quantization. Li et al. Nature Communications 2025

2. Your argument is that the library size of the metacells is an important factor to control when creating metacells. In your approach, LSMetacell, you do this as described in Algorithm 1. To support this argument at the level of the benchmark you use, it would be useful to create an alternative implementation of LSMetacell in which you do not control for library size at the time of aggregation but simply control, for example, for the number of metacells. You should then check whether module preservation is negatively affected by leaving out library size control.

Journal Requirements:

1) Please ensure that the CRediT author contributions listed for every co-author are completed accurately and in full.

At this stage, the following author requires contributions: Tianjiao Zhang. Please ensure that the full contributions of each author are acknowledged in the "Add/Edit/Remove Authors" section of our submission form.

The list of CRediT author contributions may be found here: https://journals.plos.org/ploscompbiol/s/authorship#loc-author-contributions

2) We note that your Manuscript files are duplicated on your submission. Please remove any unnecessary files from your revision, and make sure that only those relevant to the current version of the manuscript are included.

3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: 

https://journals.plos.org/ploscompbiol/s/figures

4) We have noticed that you have uploaded Supporting Information files, but you have not included a list of legends. Please add a full list of legends for your Supporting Information files after the references list.

5) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

1) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

2) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

3) If any authors received a salary from any of your funders, please state which authors and which funders.

6) Thank you for stating "The authors declare no competing interests." If you have no competing interests to declare, please state "The authors have declared that no competing interests exist."

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Revision 1

Attachments
Submitted filename: response to reviewers .docx
Decision Letter - Marcel Schulz, Editor

Library Size-Stabilized Metacells Construction Enhances Co-Expression Network Analysis in Single-Cell Data

PLOS Computational Biology

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by Oct 17 2025 11:59PM. If you need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Marcel Holger Schulz, Ph.D.

Academic Editor

PLOS Computational Biology

Ilya Ioshikhes

Section Editor

PLOS Computational Biology

Journal Requirements:

1) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type 'LaTeX Source File' and leave your .pdf version as the item type 'Manuscript'.

2) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.

1) State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

2) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."

3) If any authors received a salary from any of your funders, please state which authors and which funders.

Note: If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

Reviewers' comments:

Reviewer's Responses to Questions

Reviewer #1: In the context of scRNA-seq, the authors address the well-known issue that normalizing library sizes through linear scaling introduces bias into estimated gene-gene correlations across single cells. As a remedy, they propose combining meta-cell formation with explicit control for library size variation. Specifically, they introduce a heuristic greedy algorithm that aggregates similar cells into meta-cells until the total count within each meta-cell reaches a predefined target library size. Through extensive validation experiments, they demonstrate that this approach effectively reduces bias in gene-gene correlation estimates, while preserving biological signal.
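The greedy aggregation summarized in this paragraph can be pictured with a minimal sketch. This is an illustration written for this review history, not the manuscript's Algorithm 1: the function name `greedy_size_targeted_groups`, the use of Euclidean distance in a two-dimensional embedding, the seed-selection rule, and the simulated library sizes are all assumptions.

```python
import math
import random


def greedy_size_targeted_groups(embedding, library_sizes, target):
    """Group cells greedily until each group's summed library size reaches `target`.

    Hypothetical illustration only; this is not the manuscript's Algorithm 1.
    embedding: list of coordinate tuples, one per cell.
    library_sizes: total counts per cell.
    Returns a list of groups, each a list of cell indices.
    """
    unassigned = set(range(len(embedding)))
    groups = []
    while unassigned:
        seed = min(unassigned)  # assumed, deterministic seed choice
        unassigned.remove(seed)
        group, total = [seed], library_sizes[seed]
        # Visit the remaining cells from nearest to farthest from the seed.
        neighbours = sorted(
            unassigned, key=lambda j: math.dist(embedding[seed], embedding[j])
        )
        for j in neighbours:
            if total >= target:
                break
            group.append(j)
            unassigned.remove(j)
            total += library_sizes[j]
        groups.append(group)
    return groups


# Simulated input: 200 cells in a 2-D embedding with variable library sizes.
rng = random.Random(0)
cells = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
sizes = [rng.randint(500, 5000) for _ in cells]

groups = greedy_size_targeted_groups(cells, sizes, target=20000)
group_totals = [sum(sizes[i] for i in g) for g in groups]
```

In this sketch every group except possibly the last reaches the target, and no group overshoots it by more than one cell's library size. Returning index lists also keeps the link between each group and the single cells it contains.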

The idea is simple yet convincing. The validation experiments are generally convincing. Nevertheless, I would like to raise several points that could help further improve the manuscript.

Major comments:

The authors emphasize that their method controls for library size, unlike other approaches. However, it is important to acknowledge that standard library size normalization by scaling also “controls” library size, just in a different way. In my view, the key distinction lies in how variability across genes is handled. Scaling adjusts all gene counts within a (meta-)cell by the same factor. When different cells have different scaling factors, this uniform scaling across genes artificially inflates gene-gene correlations. In contrast, meta-cell formation aggregates counts from multiple cells, preserving more gene-specific variability: some genes may increase their counts by, say, 10%, while others only by 5%. This heterogeneity reduces (or even eliminates) spurious correlations while still approximately equalizing library sizes across meta-cells. I suggest that the authors make this conceptual point more explicit in the manuscript.

I believe it is also important to note that the generated meta-cell data cannot fully eliminate false correlations, as the data remain approximately compositional. One would still expect to observe the characteristic negative correlations across genes that arise in compositional data: if one gene is more highly expressed in a meta-cell, other genes must have correspondingly lower counts to maintain the fixed total library size. Were such negative correlations still observed in the meta-cell data?
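The compositional effect raised here can be checked numerically under a simple assumption: if counts in each unit are multinomial with a fixed total, the correlation between genes i and j is -sqrt(p_i p_j / ((1 - p_i)(1 - p_j))), which is always negative. The sketch below uses three hypothetical genes with assumed proportions and is not taken from the manuscript.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
proportions = [0.5, 0.3, 0.2]  # assumed expression proportions of three genes
total = 200                    # fixed library size per unit
n_units = 2000

# Draw multinomial counts: each of `total` reads is assigned to one gene.
counts = []
for _ in range(n_units):
    draws = rng.choices(range(3), weights=proportions, k=total)
    counts.append([draws.count(g) for g in range(3)])

gene0 = [c[0] for c in counts]
gene1 = [c[1] for c in counts]
empirical = pearson(gene0, gene1)

p0, p1 = proportions[0], proportions[1]
expected = -math.sqrt(p0 * p1 / ((1 - p0) * (1 - p1)))  # about -0.65
```

With only three genes at large proportions the negative correlation is strong. With thousands of genes that each account for a small fraction of the total, the same formula gives values close to zero, so the size of the effect in real metacell data would have to be checked empirically.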

1. Figure 1: The main conclusion, that increasing variance in library size leads to inflated gene-gene correlations, is clearly supported. However, I am curious why the authors chose to present the bias as a function of mean gene expression, and how they interpret the specific shape of the resulting curves. Furthermore, I would appreciate an explanation for the small but consistent positive bias observed even under constant library sizes. What is the origin of this residual bias?

2. Figure 2: The authors write that “LSMetacell demonstrated higher and stable module preservation.” I believe a more nuanced discussion is needed here, as this claim does not appear to hold uniformly across all cell types shown in the figure.

The manuscript refers to a dual “optimization” strategy underlying their algorithm. Since no formal optimization problem is actually solved, I recommend rewording this to avoid overstating the methodological rigor.

Minor comments:

On line 179, the authors write: “We found that the coefficient of variation of meta-cells obtained by different methods was in the order of hdWGCNA > metaQ > primary > metacell2 > LSMetacell should be interpreted...” Although this is part of the results section, a brief comment on the scientific significance of this observation would be helpful for the reader.

In Algorithm 1, I believe that the $x'$ in the denominator of the formula on line 21 should be $w_{x'}$.

Reviewer #2: Review: Library Size-Stabilized Metacells Construction Enhances Co-Expression Network Analysis in Single-Cell Data

While multiple methods have been developed to build metacells, few have explored the influence of variations in library size before and after metacell construction. This paper presents a tool called LSMetacell that builds metacells while stabilizing library sizes across metacells. The method has been benchmarked against other tools designed to build metacells (hdWGCNA, MC2 and MetaQ) as well as against a version of LSMetacell that does not consider library size in the metacell-building step. While LSMetacell addresses an important bias in (sc)RNA-Seq analyses, further analyses are needed to evaluate more comprehensively the quality of the metacells built with LSMetacell and the performance of this tool against standard approaches.

Major comments:

1. The authors generated null datasets in which genes are not co-expressed and evaluated the level of gene co-expression at the single-cell level, for different library size variations across cells. The authors should show that their method performs well on the null datasets, i.e. minimal biases should be observed after building metacells on the null datasets.

2. Also, the analysis based on the null datasets shows an increase in average gene correlations as the variation in library size increases. However, the correlations are very small (Pearson coefficient < 0.02). It is difficult to evaluate the impact of such correlations on downstream analyses, and no one would seriously consider such low (spurious) correlations.

3. The authors should evaluate metacell quality using standard metrics from the metacell field (e.g. compactness, separation, size distribution, purity, etc.). It would also be useful to project the metacells into the single-cell space to see whether they are well distributed over it.

4. In Table 1: the number of gene pairs with significant PPI enrichment for LSMetacell and the primary method are relatively close. The authors should test whether these differences are significant or not. Statistics indicating the significance of the differences observed between the methods benchmarked should also be added to Figure 2.

5. The benchmark of LSMetacell against published methods is not comprehensive: well-known methods such as SEACells (Persad et al. Nature Biotechnology, 2023) and SuperCell (Bilous et al. BMC Bioinformatics, 2022) need to be included.

6. In the implementation of LSMetacell (R package on GitHub), there seems to be no way to link a metacell to the single cells it contains. This is critical for evaluating metacell quality (including the metrics mentioned in comment 3).

7. This algorithm might be computationally heavy for large datasets. The authors should report the resources needed to build metacells with LSMetacell on datasets of various sizes.

8. The authors applied LSMetacell to a brain dataset and identified biologically meaningful cell-type-specific co-expression networks. However, they do not show whether these networks could also be found using algorithms that do not consider library size in metacell construction (or directly from the single-cell data without metacells). Here again, a much more extensive benchmark is needed.

9. In the discussion, the authors mention that one limitation of the method is that cell types have to be defined before LSMetacell is used to build metacells. This point should be emphasized in the methods section; it was not clear to me that metacells should be built within each cell type.

10. On GitHub, the example in the README file cannot be run because the “dpfc” object is not defined. The authors should consider using a publicly available dataset such as the Seurat pbmc data.

11. The quality of the figures needs to be substantially improved. Many of them are simply unreadable due to catastrophic choices of font sizes, axis names, etc. As reviewers, we spend a lot of time evaluating papers, so a minimal effort to make the figures readable and consistent would be much appreciated.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Revision 2

Attachments
Submitted filename: Response to reviewers_zhb.pdf
Decision Letter - Marcel Schulz, Editor

PCOMPBIOL-D-25-00872R2

Library Size-Stabilized Metacells Construction Enhances Co-Expression Network Analysis in Single-Cell Data

PLOS Computational Biology

Dear Dr. Zhu,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 10 days, by Dec 19 2025 11:59PM. If you need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Marcel Holger Schulz, Ph.D.

Academic Editor

PLOS Computational Biology

Ilya Ioshikhes

Section Editor

PLOS Computational Biology

Additional Editor Comments:

The reviewers have no conceptual problems after the last revision. However, there were concerns about the quality of the figures. Please make sure that all main and supplemental figures are vector graphics and are of high enough resolution. Also make sure that font sizes are comparable between subfigures. Currently there is quite a difference in font size between Figures 1, 2 and 3. Please also rename the y-axis of Figure 2 to something easier to understand, such as Z-summary statistic.

For Figure 3B, why is the CREAD score abbreviated as Ceradsc and not as Creadsc? Is this a typo?

Also, in Figures 3B and 3C the names for the modules differ, because B shows the eigengenes (MEyellow) and C the modules (MC_yellow). Please add a description of this difference to the figure caption so that readers who do not know WGCNA well can follow.

**********

Note: If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.

**********

Reviewers' comments:

Reviewer's Responses to Questions

Reviewer #2: Most of my comments have been addressed, but I still have some remaining issues.

1) The figures are still of very low quality, with many plots highly pixelated (not only in the PDF but also in the TIF files). No effort has been made to avoid large empty spaces, points being cut off by axes (e.g., Figure 3c), axis names reflecting internal variables in the code (“Zsummary.pres”), missing axes, etc. I was very clear in my previous report that the figures needed to be improved, and it looks like the authors could not care less about this point. This is utterly disappointing.

2) Lines 186-188: The authors mention that “LSMetacell achieved the lowest coefficient of variation, indicating superior stability of the generated meta-cells”. I would clarify in the main text that the coefficient of variation was computed on library size (as described in the supplementary file) and that the stability statement therefore refers to library size stability.

3) In the discussion, lines 386-388, the authors mention that “in addition to library size control, other factors such as high compactness and separation of metacells may also play important roles in obtaining biologically meaningful co-expression networks”. Metacells of good quality should have low compactness and high separation, not high compactness and separation.

4) In response to my 4th comment in the first round of revision, the authors say that they completed Table 1 by performing a “paired Wilcoxon test to compare the number of significantly enriched PPI pairs between LSMetacell and each benchmark method across multiple cell types”. However, I see only one p-value in the legend, not a p-value for each comparison.
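For readers following point 2, the coefficient of variation of library size can be computed as in the short sketch below. The use of the sample standard deviation and the example totals are assumptions for illustration; the manuscript's supplementary file defines the computation actually used.

```python
import statistics


def library_size_cv(library_sizes):
    """Coefficient of variation (sample SD / mean) of per-unit total counts."""
    return statistics.stdev(library_sizes) / statistics.mean(library_sizes)


# Hypothetical totals for five single cells and five aggregated metacells.
single_cell_sizes = [800, 2500, 1200, 4100, 600]
metacell_sizes = [20100, 19800, 20500, 20050, 19900]

cv_cells = library_size_cv(single_cell_sizes)
cv_metacells = library_size_cv(metacell_sizes)
```

A lower value means that total counts are more uniform across units. On its own, this describes library size stability only and says nothing about other aspects of metacell quality.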

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Revision 3

Attachments
Submitted filename: Response to reviewers.docx
Decision Letter - Marcel Schulz, Editor

Dear Dr. Zhu,

We are pleased to inform you that your manuscript 'Library Size-Stabilized Metacells Construction Enhances Co-Expression Network Analysis in Single-Cell Data' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Marcel Holger Schulz, Ph.D.

Academic Editor

PLOS Computational Biology

Ilya Ioshikhes

Section Editor

PLOS Computational Biology

***********************************************************

Formally Accepted
Acceptance Letter - Marcel Schulz, Editor

PCOMPBIOL-D-25-00872R3

Library Size-Stabilized Metacells Construction Enhances Co-Expression Network Analysis in Single-Cell Data

Dear Dr Zhu,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

For Research, Software, and Methods articles, you will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
