A generalized Bayesian framework for maximizing information gain and model selection

Prem Jagadeesan; Karthik Raman; Arun K. Tangirala

doi:10.1371/journal.pcsy.0000082

Peer Review History

Original Submission: June 30, 2025
13 Aug 2025 Decision Letter - Réka Albert, Editor
PCSY-D-25-00067
A Generalized Bayesian Framework for Maximizing Information Gain and Model Selection
PLOS Complex Systems
Dear Dr. Tangirala,
Thank you for submitting your manuscript to PLOS Complex Systems. After careful consideration, we feel that it has merit but does not fully meet PLOS Complex Systems's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript within 60 days, by Oct 12 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at complexsystems@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcsy/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
* A rebuttal letter that responds to each point raised by the reviewers. You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to any formatting updates and technical items listed in the 'Journal Requirements' section below.
* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
We look forward to receiving your revised manuscript.
Kind regards,
Réka Albert
Section Editor
PLOS Complex Systems
Hocine Cherifi
Editor-in-Chief
PLOS Complex Systems
Journal Requirements:
1. Please provide a/amend your detailed Financial Disclosure statement. This is published with the article. It must therefore be completed in full sentences and contain the exact wording you wish to be published.
   1. Please clarify all sources of funding (financial or material support) for your study. List the grants (with grant number) or organizations (with url) that supported your study, including funding received from your institution.
   2. State the initials, alongside each funding source, of each author to receive each grant.
   3. State what role the funders took in the study. If the funders had no role in your study, please state: “The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.”
   4. If any authors received a salary from any of your funders, please state which authors and which funders.
   If you did not receive any funding for this study, please simply state: “The authors received no specific funding for this work.”
2. Please send a completed 'Competing Interests' statement, including any COIs declared by your co-authors. If you have no competing interests to declare, please state "The authors have declared that no competing interests exist". Otherwise please declare all competing interests beginning with the statement "I have read the journal's policy and the authors of this manuscript have the following competing interests:"
3. Please note that your Data Availability Statement is currently missing the repository name and/or the DOI/accession number of each dataset OR a direct link to access each database. If your manuscript is accepted for publication, you will be asked to provide these details on a very short timeline.
We therefore suggest that you provide this information now, though we will not hold up the peer review process if you are unable.
4. We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex.
5. Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150–200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists. Sample summaries can be found on our website under Submission Guidelines: https://journals.plos.org/complexsystems/s/submission-guidelines#loc-parts-of-a-submission
6. Please provide separate figure files in .tif or .eps format. For more information about figure files please see our guidelines: https://journals.plos.org/complexsystems/s/figures https://journals.plos.org/complexsystems/s/figures#loc-file-requirements
7. We have noticed that you have uploaded Supporting Information files, but you have not included a list of legends. Please add a full list of legends for your Supporting Information files after the references list.
8. If the reviewer comments include a recommendation to cite specific previously published works, please review and evaluate these publications to determine whether they are relevant and should be cited. There is no requirement to cite these works unless the editor has indicated otherwise.
Reviewers' Comments:
Reviewer's Responses to Questions
Comments to the Author
1. Does this manuscript meet PLOS Complex Systems’s publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.
Reviewer #1: Yes Reviewer #2: Yes
********
2.
Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes Reviewer #2: Yes
******
3. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes Reviewer #2: Yes
******
4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS Complex Systems does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes Reviewer #2: Yes
******
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: This paper seeks to extend the authors' previously proposed β-information gain criterion within the Bayesian Optimal Experimental Design (OED) framework to the case of discrete parameter distributions, and further explores its utility in both experiment selection and model selection tasks.
The authors present a pipeline that integrates MCMC-ABC–based posterior sampling, discretization of parameter spaces, and estimation of β-information gain, and apply the method to three biological case studies: the Hes1 regulatory network, the HIV 2-LTR model, and a two-compartment system. These examples are used to demonstrate the method’s potential in selecting measurement modalities, optimizing sampling schedules, and distinguishing among competing model structures. The manuscript is well-organized and implements a reproducible pipeline that integrates inference, estimation, and evaluation. And the proposed approach may be practically useful for non-analytical Bayesian models. However, I find that the paper currently lacks the level of innovation and persuasiveness required for publication. Below are my comments.
Major comments.
1. The main methodological claim of the paper, extending β-information gain to discrete distributions, amounts to rephrasing a continuous expression into its discrete counterpart. Mathematically, this is a straightforward and expected reformulation rather than a substantive theoretical innovation. Furthermore, Theorem 1, which is used to illustrate the advantage of the extended β-information gain, relies on the assumption that the prior is uniform and the posterior is a Kronecker-delta distribution. But for the typical case where both posterior distributions are not delta-like, can the proposed criterion still provide meaningful comparisons? The paper lacks theoretical support for this situation, which is crucial for any metric proposed for OED.
2. β-information gain should be compared quantitatively with established alternatives, particularly Kullback–Leibler divergence. Yet, the paper exclusively uses β as both the evaluation metric and the objective function, which creates a risk of circular validation.
The absence of comparative experiments makes it difficult to assess whether the proposed criterion offers real improvements in terms of generalizability, computational efficiency, or identification accuracy.
Minor comments
1. The proposed estimation procedure for β involves discretizing the parameter space via K-means clustering. However, the paper does not investigate how this estimator performs in higher-dimensional settings, nor does it assess its sensitivity to sample size or clustering resolution. I suggest the authors report computational cost (e.g., runtime), convergence diagnostics, and sensitivity of β to sample size $N$, ideally via visualization or statistical summaries.
2. The paper includes only three small-scale case studies, all involving low-dimensional models with relatively simple dynamics. While illustrative, these examples fall short of demonstrating the claimed general applicability of the method. There are no experiments involving high-dimensional parameters, multi-modal posteriors, or strong correlations among variables (scenarios where discretized estimators often face difficulties). The method’s robustness across such settings remains untested.
3. Some elements of the figures and tables require clarification or correction. For instance, (a) in Table 1, the "Variance" column for the prior $p(\theta)$ includes a formula despite the support of the distribution not being specified. It seems the authors assume a discrete uniform distribution on $\{1, 2, \ldots, n\}$, which should be clearly stated. (b) In Fig. 6, the labels "u1" and "u2" should be corrected to "u_1" and "u_2". (c) In Fig. 2, the label "P(θ)|z" should be corrected to "P(θ|z)".
4. The mathematical expressions in the manuscript also need careful proofreading. For example, (a) in Eq. (14), the upper limit of the summation index should be changed from "$t=n$" to "$n$". (b) In Eqs. (7)–(10), the dimension of the parameter space is alternately denoted by $n$ and $N$, which is confusing.
Several similar inconsistencies occur throughout the manuscript and should be carefully corrected to improve clarity and professionalism.
Reviewer #2: SPECIFIC COMMENTS: The paper is an interesting paper about A Generalized Bayesian Framework for Maximizing Information Gain and Model Selection. Specific comments are below.
-- Figure 1 is a bit strange in the way the arrows and plots are organized (upper and lower schemes...). This requires better plotting and explanation.
-- What is the analytical and numerical relationship or similarity between Bhattacharyya distance and KL distance? Not clear. Previous work has emphasized the relationship of KL divergence and Value of Information for model selection (see Convertino et al. 2015).
-- What is the applicability of Bhattacharyya distance for non-canonical multimodal distributions?
-- Fig 3 seems to suggest that only ONE parameter value is optimal, but in many cases multiple values may be optimal for highly non-linear systems. Therefore, it is more about matching a distribution of "true optimal" values.
-- Fig 5 shows simple time-dependent dynamics, but stochastic dynamics can be highly fluctuating and complex. How does the Bhattacharyya distance work in that case?
-- Please make the figure captions self-standing so one can understand the figures much better without finding the meaning across the manuscript. I also propose to put together related figures, and make bigger figures with higher quality.
-- What is the generalizability of your results and consistency with other results? Are these findings only found on your model type/dynamics or on others? What about the patterns of other models and other features considered (e.g., statistical moments) compared to any other methods like KL divergence? Generic comments are provided below to perhaps try to find some sort of generalizability and depth of results, if possible.
Further quantification may be done in the future, but the statement of limitations and possibilities is an important aspect of scientific publication, including deep-uncertainty considerations about the unpredictability of true pdfs.
GENERAL COMMENTS:
(1) Eco-STOCHASTICITY/VARIABILITY of PATTERNS as spatio-temporal probability distributions (eco-variability attribution via and systemic uncertainty decomposition for causal attribution): To address the model/data Uncertainty-Sensitivity coupling, global sensitivity and uncertainty analysis (GSUA, aka systemic information decomposition) should be done to identify key determinants of model/data variability (OUTPUTS=f(INPUT)) and universal determinants across geographies. You do not quite perform a one-factor-at-a-time sensitivity analysis, nor a non-linear sensitivity analysis to capture the variables' interactions (high-order interactions) that can be predominant in defining patterns' variability (OVER SPACE AND TIME). See Pianosi et al. (2016) for an extensive discussion about this topic and how data should be used for GSUA using a simple variance-based approach. Entropy approaches of GSUA (Li and Convertino, 2021) are also available when the pdfs are too complex to make the variance meaningful. The attribution of uncertainty can lead to the quantification of ecological stress (as change into the systemic function considered, e.g., OUTPUTS features' change) attributable to different socio-environmental causes or unknown factors (characterized as "deep uncertainty"). I also think the paper should pinpoint which sites have the highest and lowest uncertainty (uncertainty sources and sinks), uncertainty/information baseline and thresholds. OUTPUT gradients should have the lowest uncertainty.
(2) eco-STABILITY and eco-STATE CONTROLS via OPTIMAL Network CORE (prediction of optimal causal controls): How indicators/predicted variables (i.e.
OUTPUT pattern/network indicators or values) change over space and/or time, conditional to optimal or desired outcomes (e.g. related to OUTPUT optimal ranges, that are not identified in your paper), is critical for mapping site-/time-specific and universal patterns and shifts (Sugihara G et al (2012)), and more importantly environment-ecological controls that are Pareto-optimal (Shoval, O. et al, 2012 and ParTi model), such as FLOWS (socio-ecological or economical) and NETWORKS (even functional dependencies of parameters and variables). The stability (and universality) of ecological patterns over predictors' gradients and their critical change should be quantified because that can define potential stable states over which the predictands (causal factors) are relatively stable or approaching a transition (ecoshifts and optimal eco-switch). These calculations can be done by doing inverse modeling via Monte Carlo filtering over the pdfs of OUTPUTS.
RECOMMENDATION: I suggest accepting the paper after Major Revisions. The paper is very interesting, but I think the findings are quite dependent on the model considered, and uniqueness, limitations, and uncertainty should be stated or addressed more explicitly. I suggest that the authors focus on what patterns are universal and what are model-specific, as well as to leverage other datasets (space-time). Also, please revise the English and try to make a probabilistic analysis that is essential in defining the sources of uncertainty and how we can attribute OUTCOMES/OUTPUTS to CAUSES for CONTROLS.
REFERENCES:
Information-based fitness and the emergence of criticality in living systems Jorge Hidalgo, Jacopo Grilli, Samir Suweis, ..., and Amos Maritan June 30, 2014 111 (28) 10095-10100 https://doi.org/10.1073/pnas.1319166111
Matteo Convertino et al.
Design of optimal ecosystem monitoring networks: hotspot detection and biodiversity patterns Volume 29, pages 1085–1101, (2015) https://link.springer.com/article/10.1007/s00477-014-0999-8
Inferring ecosystem networks as information flows Jie Li & Matteo Convertino Scientific Reports volume 11, Article number: 7094 (2021)
Campo-Bescós MA, Muñoz-Carpena R, Kaplan DA, Southworth J, Zhu L, Waylen PR (2013) Beyond Precipitation: Physiographic Gradients Dictate the Relative Importance of Environmental Drivers on Savanna Vegetation. PLoS ONE 8(8): e72348. https://doi.org/10.1371/journal.pone.0072348
Pianosi et al. (2016) Sensitivity analysis of environmental models: A systematic review with practical workflow Environmental Modelling & Software Volume 79, May 2016, Pages 214-232
Packages for GSUA https://safetoolbox.github.io/Documentation.html
Sugihara G et al (2012) Detecting Causality in Complex Ecosystems https://www.science.org/doi/10.1126/science.1227079
Shoval, O. et al. Evolutionary trade-offs, Pareto optimality, and the geometry of phenotype space. Science 336(6085), 1157–1160 (2012).
Pareto Task Inference (ParTI) https://www.weizmann.ac.il/mcb/alon/download/pareto-task-inference-parti-method
--- as for Deep Uncertainty papers see here
https://www.nature.com/articles/s41467-025-57897-1
https://www.science.org/doi/10.1126/sciadv.add7082
https://www.nature.com/articles/s44304-025-00072-9
https://agupubs.onlinelibrary.wiley.com/doi/am-pdf/10.1029/2021EF002322
******
6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public. For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No Reviewer #2: No
******
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
Figure resubmission: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.
Reproducibility: To enhance the reproducibility of your results, we recommend that authors of applicable studies deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols
https://doi.org/10.1371/journal.pcsy.0000082.r001
Revision 1
12 Oct 2025 Author Response
Attachment (submitted filename): Response to Reviewers.pdf
https://doi.org/10.1371/journal.pcsy.0000082.r002
23 Nov 2025 Decision Letter - Réka Albert, Editor
A Generalized Bayesian Framework for Maximizing Information Gain and Model Selection
PCSY-D-25-00067R1
Dear Dr. Tangirala,
We're pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you'll receive an e-mail detailing the required amendments. When these have been addressed, you'll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.
To ensure an efficient process, please log into Editorial Manager (https://www.editorialmanager.com/pcsy/), click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. For questions related to billing, please contact billing support at https://plos.my.site.com/s/.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact complexsystems@plos.org.
Kind regards,
Réka Albert
Section Editor
PLOS Complex Systems
Additional Editor Comments (optional):
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed Reviewer #2: All comments have been addressed
--------------------
2. Does this manuscript meet PLOS Complex Systems's publication criteria? Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe methodologically and ethically rigorous research with conclusions that are appropriately drawn based on the data presented.
Reviewer #1: Yes Reviewer #2: Yes
--------------------
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes Reviewer #2: Yes
--------------------
4. Have the authors made all data underlying the findings in their manuscript fully available (please refer to the Data Availability Statement at the start of the manuscript PDF file)? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception. The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes Reviewer #2: Yes
--------------------
5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS Complex Systems does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes Reviewer #2: Yes
--------------------
6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above.
You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: (No Response)
Reviewer #2: I think the authors have revised the manuscript. I still believe that global sensitivity and uncertainty analyses were not done fully, but that can be the matter of another paper.
--------------------
7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. Do you want your identity to be public for this peer review? If you choose “no”, your identity will remain anonymous but your review may still be made public. For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No Reviewer #2: No
--------------------
https://doi.org/10.1371/journal.pcsy.0000082.r003

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.