Putting BASIL in a BLT: A Bayesian filtering method for estimating the fitness effects of nascent adaptive mutations

Huan-Yu Kuo; Sergey Kryazhimskiy

doi:10.1371/journal.pcbi.1013946

Peer Review History

Original Submission: March 29, 2025
10 Jul 2025

Decision Letter - Tobias Bollenbach, Editor

A Bayesian Filtering Method for Estimating the Fitness Effects of Nascent Adaptive Mutations

PLOS Computational Biology

Dear Dr. Kryazhimskiy,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by Sep 09 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.
* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission.
Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Sergei Maslov
Academic Editor
PLOS Computational Biology

Tobias Bollenbach
Section Editor
PLOS Computational Biology

Additional Editor Comments:

I hope the attachment with the review by Reviewer 1 is included. If not, write to me and I can send it directly. The editorial manager here is very clunky.

Journal Requirements:

1) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type 'LaTeX Source File' and leave your .pdf version as the item type 'Manuscript'.

2) Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150-200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists. Sample summaries can be found on our website under Submission Guidelines: https://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-parts-of-a-submission

3) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines: https://journals.plos.org/ploscompbiol/s/figures

4) We notice that your supplementary Figures, and Tables are included in the manuscript file. Please remove them and upload them with the file type 'Supporting Information'. Please ensure that each Supporting Information file has a legend listed in the manuscript after the references list.

5) Please amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.
1) State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript."
2) If any authors received a salary from any of your funders, please state which authors and which funders. If you did not receive any funding for this study, please simply state: "The authors received no specific funding for this work."

6) Please send a completed 'Competing Interests' statement, including any COIs declared by your co-authors. If you have no competing interests to declare, please state "The authors have declared that no competing interests exist". Otherwise please declare all competing interests beginning with the statement "I have read the journal's policy and the authors of this manuscript have the following competing interests"

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: Please see the attachment.

Reviewer #2:

# Summary

Kuo and Kryazhimskiy critically evaluate an existing method (neutral decline or Levy-Blundell) for inferring fitness effects of spontaneous mutants from barcoded lineage tracking data, using simulated data and previously-published experimental data (consistently applying the same method across these data sets). Besides showing how the results depend on the average strength of selection and the crucial choice of putatively neutral lineages as a reference, they find that a major limitation of this method is errors in the inferred mean fitness, which leads to errors in all individual mutant fitnesses as well. They also make an interesting point that a barcode's read count is a biased estimator of its true frequency. Then they develop a new method called BASIL which uses an iterative Bayesian algorithm to infer mutant fitness from these data sets.
The major advantage of BASIL seems to be that reference lineages for the mean fitness calculation do not need to be manually chosen. They apply BASIL to the same simulated and experimental data sets and find that it generally works much better than the older methods, especially for capturing the mean fitness under both weak and strong selection. Altogether the paper is solid. The previous literature on these methods is highly technical and often confusing in my opinion, and having a well-written paper that critically and clearly addresses these methods (besides introducing a new and better one) is valuable. I only have a few minor comments and questions (some of which are my own curiosity). I separated these into "major" and "minor" just to highlight two comments that I think are a little more substantial, but none of these should be impediments to publication.

# Major comments

1. I understand that the authors tried to keep the main text streamlined by putting a lot of details in the supplement. But this being a technical methods paper, most readers are probably going to be people who want to implement these methods themselves, and thus will want to know those details, requiring them to flip back and forth a lot between the main text and the supplement. I would therefore suggest that the authors consider just making the main text longer and more technical so at least it can be read more linearly by its main readership. For example, Secs. 4 and 5 in the supplement are critical for anyone who wants to understand this paper, so some of that material could be moved to the main text.

2. Could the authors say a little more about how to apply their method to other data sets? In particular, I am wondering what aspects of their method need to be adjusted for each new data set or experimental system and protocols, and if there are any experimental design choices that could be made in advance to accommodate them.
(I realize this is not a best-practices paper for BLT experiments, but the data analysis method inevitably informs such considerations.) For example, the authors calibrate a model of how variance in read counts scales with mean read counts by sequencing a test sample with known barcode frequencies (Fig. 4). Is that something everyone needs to do first for their own system and protocols, or do they think their specific model and parameter values will hold across most cases? What about the prior distribution (with a normal distribution of fitness with standard deviation 0.1) and the hyperparameter beta? Is there some procedure we need to follow to calibrate them for our own data/systems?

# Minor comments

1. Line 9 "recently developed": I quibble with this characterization of BLT; the Levy et al. paper that started all this is over 10 years old now.

2. Line 15 "currently standard": Is the Levy-Blundell method actually that standard? I am not familiar with every study on BLT from the last 10 years, but I didn't think that everyone who does these experiments uses the same approach.

3. Lines 43-44 "this strategy preferentially captures mutations with the strongest fitness benefits because they are least likely to be lost by genetic drift or clonal interference": How do the authors reconcile this statement with the argument in the first paragraph that clonal interference means one must know the whole distribution of beneficial mutants? I understand the authors' points here, but this latter statement seems to undercut the former. That is, if the same few mutants with strongest fitness effects consistently establish in an evolution experiment, then it seems like clonal interference isn't all that important and one doesn't need to know the whole DFE, at least if one just wants to predict evolutionary dynamics.
On the other hand, if an evolution experiment yields many different established mutants, then clonal interference is significant but then one can get a decent sample of beneficial mutants just from established mutants in an evolution experiment. So doesn't that mean that high-resolution lineage tracking isn't crucial in either case? I'm playing devil's advocate here, but I'd like to know what the authors think about this.

4. Lines 131-132 "Low-abundance lineages are those represented by 20 to 40 reads, which corresponds to 20 to 40 cells": Doesn't this depend on the read depth and the number of cells in the sample used for sequencing (i.e., it assumes they are approximately the same)? Is that actually true for most of these experiments?

5. Figure 1A: I noticed that the inferred mean fitness in this case is actually negative; can the authors comment on why that happens and what it means?

6. Line 170: I think it would be clearer to denote this rescaled average read count as being for the time point $k - 1$ rather than $k$ (i.e., denote as $\tilde{r}_{k - 1}$ rather than $\tilde{r}_k$), since the idea is that it estimates the expected read count at time $k - 1$ (with true value already denoted as $r_{k - 1}$, plotted in Fig. 2).

7. Figure 2E: One thing I noticed throughout this paper was how much cleaner the Levy et al. data seems to be compared to all the more recent data sets (see also Figs. 6 and 7). Do the authors know why that is?

8. Lines 224-225 "it relies on the assumption that the measured barcode frequency is an unbiased estimator of the true lineage frequency in the population, which is in general incorrect": Can the authors say more about this? There was a very interesting section on the supplement about this (Sec. 4.3.2), but I didn't follow how it is (or isn't) related to the other issues with bias in the mean fitness and the shift in expected read counts for very low reads.
In general some more explanation about to what extent these effects are all caused by the same underlying assumption, or are caused by different assumptions, would be helpful. My understanding is that there are two distinct issues, both arising from the use of Eq. 1. First is the choice of putatively neutral lineages as a reference, which is potentially arbitrary and can lead to biases. Second is the point mentioned here, which is that Eq. 1 assumes that a barcode's read frequency is an unbiased estimator of its true frequency.

9. Lines 414-415 "on the basis of identified adapted lineages": What does this mean? Does it mean that some individual lineages were determined to be adapted independently of the BLT data (e.g., by isolating clones for them and performing separate competition experiments against the ancestor)?

10. Line 473 "for the inferring": Extraneous "the"

11. Supplement Sec. 2.3: This says there were 2.3e5 total reads across all 10 replicate sequencing samples (which seems quite small to me), but in the main text (line 285) it says 2e5 reads per replicate. Is one of these a typo, or do I misunderstand?

12. Supplement Sec. 2.4: The text here talks about 26 clones (unique barcodes?), but I don't understand how those were allocated across the target frequencies in the experimental setup from Sec. 2.2. That says there are supposed to be barcodes at six distinct frequencies in the test culture, so I was expecting six unique barcodes (or an integer multiple of that).

13. Supplement Sec. 2.4, four lines above Eq. S2 "has its an unknown": Extraneous "its"

14. Supplement Sec. 4.1, Eq. S6: This model from Levy et al. always struck me as overkill, and I'd like to know what the authors of this paper think. To me it seems like the stochastic birth and death during a batch growth cycle is likely to be negligible compared to the stochasticity of the dilution step, and therefore unnecessary to model stochastically (rather than just as deterministic exponential growth).
Indeed, that seems to be how the authors simulate it themselves here (Sec. 3), with deterministic growth between dilutions and Poisson sampling at the dilution. Is there something I'm missing?

15. Supplement Sec. 4.3.2, line above Eq. S11 "can also be re-write": Typo

16. Supplement Sec. 4.3.2, Eq. S15: Shouldn't there be a $\int dx x$ in the numerator of the right-hand side? The left side is an expectation value of $x$, so I think there should be an average over it on the right side.

17. Supplement Sec. 5.1, fifth line: Doesn't the model assumption $\langle r \rangle = nR/N$ contradict the point the authors just made in Sec. 4.3.2, that the read frequency $\langle r \rangle/N$ is not an unbiased estimator of the actual frequency $n/N$?

18. Figure S10: Why not plot this data as scatter points of individual mutants, with inferred fitness from one replicate vs. inferred fitness from the other replicate?

19. Supplement references: There is some inconsistent formatting here (capitalization in titles, e.g., "dna," and abbreviation/capitalization of journal titles), which the journal probably won't fix themselves.

20. Finally, I appreciated the food-themed acronyms and wondered if the authors would consider the title "Putting BASIL on a BLT."

********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code, e.g.
participant privacy or use of data from a third party, those must be specified.

Reviewer #1: Yes

Reviewer #2: None

******

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mikhail Tikhonov

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.

Reproducibility:

To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future.
Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Attachments

Attachment
Submitted filename: PCOMPBIOL-D-25-00612_review.pdf

https://doi.org/10.1371/journal.pcbi.1013946.r001
Revision 1
7 Nov 2025

Author Response

Attachments

Attachment
Submitted filename: Reviewer Response.pdf

https://doi.org/10.1371/journal.pcbi.1013946.r002
26 Jan 2026

Decision Letter - Tobias Bollenbach, Editor

Dear Dr Kryazhimskiy,

We are pleased to inform you that your manuscript 'Putting BASIL in a BLT: A Bayesian Filtering Method for Estimating the Fitness Effects of Nascent Adaptive Mutations' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Tobias Bollenbach
Section Editor
PLOS Computational Biology

*********************************************************

Congratulations on a great paper! Please make sure to address the remaining issue pointed out by Reviewer #2.

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors’ revisions addressed all my comments and concerns. Thank you & happy holidays!
Reviewer #2: The authors have made valuable revisions to this paper, and overall I support publication. However, in response to my second major comment on providing more explicit guidelines for how to apply their method to other data sets, they said they added more details on this in "Sec. 5.5 of the SI," but as far as I can tell there is no Sec. 5 in the SI. So I'm not sure where the material they are referring to is, which I do still think is important to include so that others can easily adapt the method to their own experiments. So the editor should confirm with the authors that this has been included.

******

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

******

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Mikhail Tikhonov

Reviewer #2: No

https://doi.org/10.1371/journal.pcbi.1013946.r003
Formally Accepted
Acceptance Letter - Tobias Bollenbach, Editor

PCOMPBIOL-D-25-00612R1

Putting BASIL in a BLT: A Bayesian Filtering Method for Estimating the Fitness Effects of Nascent Adaptive Mutations

Dear Dr Kryazhimskiy,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

For Research, Software, and Methods articles, you will receive an invoice from PLOS for your publication fee after your manuscript has reached the completed accept phase. If you receive an email requesting payment before acceptance or for any other service, this may be a phishing scheme. Learn how to identify phishing emails and protect your accounts at https://explore.plos.org/phishing.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Anita Estes
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

https://doi.org/10.1371/journal.pcbi.1013946.r004

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.