Peer Review History

Original Submission: November 6, 2024
Decision Letter - Denise Kühnert, Editor

PCOMPBIOL-D-24-01946

iPAR: A framework for modelling and inferring information about disease spread when the populations at risk are unknown

PLOS Computational Biology

Dear Dr. Catterall,

Thank you for submitting your manuscript to PLOS Computational Biology. After careful consideration, we feel that it has merit but does not fully meet PLOS Computational Biology's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript within 60 days, by Mar 23 2025 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at ploscompbiol@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pcompbiol/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

* A rebuttal letter that responds to each point raised by the editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. This file does not need to include responses to formatting updates and technical items listed in the 'Journal Requirements' section below.

* A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

* An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, competing interests statement, or data availability statement, please make these updates within the submission form at the time of resubmission. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

We look forward to receiving your revised manuscript.

Kind regards,

Nicholas Geard

Academic Editor

PLOS Computational Biology

Denise Kühnert

Section Editor

PLOS Computational Biology

Additional Editor Comments:

The reviewers appreciated the importance of the problem you are addressing, and the innovative nature of your proposed solution. They also provided detailed suggestions for improving the clarity and utility of your manuscript by including additional discussion of:

  • factors such as data availability, selection of covariates, computational cost, that may be relevant to use of the proposed methods by other researchers.
  • the applicability of the proposed methods to populations and diseases with different characteristics (eg, human populations, recovery/immunity)
  • how your approach compares to existing modelling methods.

While the provision of code was appreciated, clearer documentation would make your methods more readily usable by others.

Journal Requirements:

1) Please ensure that the CRediT author contributions listed for every co-author are completed accurately and in full.

At this stage, the following Authors require contributions: Stephen Catterall, Thibaud Porphyre, and Glenn Marion. Please ensure that the full contributions of each author are acknowledged in the "Add/Edit/Remove Authors" section of our submission form.

The list of CRediT author contributions may be found here: https://journals.plos.org/ploscompbiol/s/authorship#loc-author-contributions

2) We ask that a manuscript source file is provided at Revision. Please upload your manuscript file as a .doc, .docx, .rtf or .tex. If you are providing a .tex file, please upload it under the item type 'LaTeX Source File' and leave your .pdf version as the item type 'Manuscript'.

3) Please provide an Author Summary. This should appear in your manuscript between the Abstract (if applicable) and the Introduction, and should be 150-200 words long. The aim should be to make your findings accessible to a wide audience that includes both scientists and non-scientists. Sample summaries can be found on our website under Submission Guidelines:

https://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-parts-of-a-submission

4) Please upload all main figures as separate Figure files in .tif or .eps format. For more information about how to convert and format your figure files please see our guidelines:

https://journals.plos.org/ploscompbiol/s/figures

5) We notice that your supplementary Figures, and Tables are included in the manuscript file. Please remove them and upload them with the file type 'Supporting Information'. Please ensure that each Supporting Information file has a legend listed in the manuscript after the references list.

6) Some material included in your submission may be copyrighted. According to PLOS's copyright policy, authors who use figures or other material (e.g., graphics, clipart, maps) from another author or copyright holder must demonstrate or obtain permission to publish this material under the Creative Commons Attribution 4.0 International (CC BY 4.0) License used by PLOS journals. Please closely review the details of PLOS's copyright requirements here: PLOS Licenses and Copyright. If you need to request permissions from a copyright holder, you may use PLOS's Copyright Content Permission form.

Please respond directly to this email and provide any known details concerning your material's license terms and permissions required for reuse, even if you have not yet obtained copyright permissions or are unsure of your material's copyright compatibility. Once you have responded and addressed all other outstanding technical requirements, you may resubmit your manuscript within Editorial Manager.

Potential Copyright Issues:

- Figures 6, 8, A10, A11, and A12. Please provide a direct link to the base layer of the map (i.e., the country or region border shape) and ensure this is also included in the figure legend; and provide a link to the terms of use / license information for the base layer image or shapefile. We cannot publish proprietary or copyrighted maps (e.g. Google Maps, Mapquest) and the terms of use for your map base layer must be compatible with our CC BY 4.0 license.

Note: if you created the map in a software program like R or ArcGIS, please locate and indicate the source of the basemap shapefile onto which data has been plotted.

If your map was obtained from a copyrighted source please amend the figure so that the base map used is from an openly available source. Alternatively, please provide explicit written permission from the copyright holder granting you the right to publish the material under our CC BY 4.0 license.

If you are unsure whether you can use a map or not, please do reach out and we will be able to help you. The following websites are good examples of where you can source open access or public domain maps:

* U.S. Geological Survey (USGS) - All maps are in the public domain. (http://www.usgs.gov)

* PlaniGlobe - All maps are published under a Creative Commons license so please cite "PlaniGlobe, http://www.planiglobe.com, CC BY 2.0" in the image credit after the caption. (http://www.planiglobe.com/?lang=enl)

* Natural Earth - All maps are public domain. (http://www.naturalearthdata.com/about/terms-of-use/).

7) Please ensure that the funders and grant numbers match between the Financial Disclosure field and the Funding Information tab in your submission form. Note that the funders must be provided in the same order in both places as well.

- State the initials, alongside each funding source, of each author to receive each grant. For example: "This work was supported by the National Institutes of Health (####### to AM; ###### to CJ) and the National Science Foundation (###### to AM)."

- State what role the funders took in the study. If the funders had no role in your study, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.".

If you did not receive any funding for this study, please simply state: "The authors received no specific funding for this work."

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: My comments are uploaded as an attachment

Reviewer #2: The manuscript introduces the iPAR (inference for populations at risk) framework, a novel modeling and inference approach for spatial infectious disease dynamics when the population at risk is uncertain or poorly quantified. This is a non-trivial and important problem in epidemiology and population health, where precise host distribution data are frequently unavailable. The proposed approach is both conceptually interesting and methodologically innovative. It attempts to extract critical epidemiological information—such as susceptibility, infectivity, and rate of spread—from case-only data, supplemented by spatial covariates. The manuscript includes a theoretical and methodological development, as well as simulation-based evaluations and a real-world case study on African Swine Fever (ASF) in Estonian wild boar populations.

# Strengths:

The introduction of the iPAR framework addresses a critical gap in spatial disease modelling, especially when populations at risk are poorly quantified. This framework is highly relevant for wildlife outbreaks and under-documented livestock settings.

The step-by-step development of the model, including Bayesian inference, patch-based structure, and the use of covariates, is thoroughly detailed.

The authors show that the method can incorporate spatial covariates—land use, climate factors, or other relevant environmental proxies—to estimate susceptibility and infectivity surfaces. The framework is not restricted to a specific host-pathogen system and can be extended, at least in principle, to more complicated compartmental structures or multiple host species.

# Potential limitations and suggestions (major comments):

While the paper acknowledges the difficulty of recovering infectivity parameters compared to susceptibility, it may be beneficial to discuss conditions under which identifiability issues arise more explicitly. For instance, what levels of data availability (in terms of number of cases, spatial resolution, and outbreak duration) are needed for robust inference of both susceptibility and infectivity?

The authors note that the method can use a wide range of covariates. However, more guidance on how to select appropriate covariates, how to handle spatio-temporal variation in these covariates, and how to choose the patch size would strengthen the applicability. While patch-based models are common, best practices or decision criteria for resolution selection would be useful, especially in more heterogeneous landscapes or when data resolution differs between outbreak reports and covariates.

The current work focuses on an SI-type model with persistence. Many real-world diseases involve recovery, waning immunity, or multiple host types. A clearer roadmap on how to incorporate these complexities within the iPAR framework would strengthen its applicability. While mentioned as a future direction, more concrete guidance or at least a detailed conceptual outline would improve the paper’s practical utility.

In addition, clearly state the implications of using the SI model, especially for diseases like ASF that have a latent phase, or others that would have a recovery phase.

Moreover, since inference is based on a likelihood-based approach, the use of a more complex model seems likely to severely hamper the use of the proposed framework. It is essential at the start of the article to clearly define the conditions of application of the framework.

The methods may be computationally intensive for large-scale problems with many patches and long observation periods. Although the paper mentions computational tractability as a motivation for patch aggregation, a discussion of computational scaling or potential performance optimizations (e.g., parallelization, approximation methods) would be valuable. No indication is given of the language used or the computing resources consumed (time, memory), which would give a better idea of the framework's performance and potential for improvement.

In "2.2.3 Uncertainty in model predictions": "...initial conditions typically set to the infection status of the patches at the final time point of the data used to fit the model ( = ). ...". In this case, it is assumed that the system is perfectly observed, but it is highly unlikely, and often impossible, to observe this kind of system exhaustively. Why not use as initial conditions the state of the system at t=T obtained in the simulations carried out? If the data do not contain an observation of all the patches, then it is very likely that the assumption made about the initial conditions tends to minimize the number of infected patches.

In "2.2.4 Testing the iPAR modelling approach" : This type of approach (the use of synthetic/simulated data) also has its limitations and drawbacks, which should be mentioned and discussed (choice of synthetic trajectory, etc.).

In "3.1 Estimation and prediction for an illustrative simulated outbreak" and "3.2 Estimation of key epidemiological parameters" : In these sections, no mention is made of how the simulated trajectory used as an observation was chosen, even though this can have a significant impact on the results. It should be described and discussed.

The approach infers “effective” susceptibility and infectivity surfaces that combine unknown population density and other factors. This layered interpretation may cause confusion for end-users. More explicit statements about the interpretative limits of these inferred surfaces, and about how to use them alongside or in the absence of independent population density estimates, would help ensure that the results are not misinterpreted as direct host density maps.

While the paper’s accompanying code is appreciated, it appears incomplete. It lacks clear guidance on compilation and execution (notably due to the C code), and there are no provided examples or instructions for the necessary input files. More comprehensive documentation is mandatory.

# Minor comments:

- Define "external transmission" explicitly to ensure clarity for readers less familiar with disease modelling.

- Figures are informative, but their descriptions could be more concise. Ensure that captions focus on the key message of each figure (e.g., “parameter recovery demonstrates…”). For a few figures, the addition of a legend would make reading easier.

- Part of the penultimate paragraph of the introduction (first half) seems to belong to the second part, which focuses on method.

- Tables : In the definition of scenarios, the "Susceptibility" variable always seems to be the same, and therefore not very useful to have it in the tables.

# Summary:

It's an interesting paper with valuable insights and well worth reading, but it is quite lengthy and dense, which can make it challenging for the reader to maintain a clear grasp of the material and easy to occasionally lose track of the main thread.

This manuscript makes a valuable contribution to spatial disease modelling, particularly for scenarios where host population data are limited or unavailable.

Addressing the suggested clarifications on model limitations, assumptions, guidance, and extended discussion on model complexity would further enhance the manuscript.

Reviewer #3: iPAR: A framework for modeling and inferring information about disease spread when the populations at risk are unknown

comments:

The authors developed the iPAR framework, which enables modeling and estimation of spatial disease dynamics when the populations at risk are unknown. The authors presented rigorous simulation analysis with different spatiotemporal scenarios and real-world data analysis results. It is very interesting and relevant work, since there are many situations in which we need to make strong assumptions about susceptible populations. This work can contribute to the infectious disease literature by providing a more accurate and flexible modeling framework for when the population distribution is incompletely known. However, I have a few questions regarding the iPAR framework described in the manuscript.

1. I had the impression that the iPAR framework targets infectious diseases of animals, even though the manuscript did not specifically mention it. Can it be applied to human infectious disease transmission modeling? Is there any appropriate scenario of human infectious disease to which the iPAR framework is well suited? If it can be, I think it is necessary to show i) how iPAR model terms are changed or similarly used in human infectious disease transmission modeling, ii) which covariates can be applied to estimate the infectivity and susceptibility surfaces in the case of human transmission, iii) how it can be used to model and predict human infectious disease transmission in simulation analysis. Currently, the model covariates (e.g. land use), simulation scenarios and real-world applications all focus on animal infectious disease modeling. If not, the manuscript needs to specifically mention that this paper mainly targets animal infectious disease transmission, to help the audience's understanding.

2. External Validation: This is a novel framework to model infectious disease transmission when the population at risk is unknown. As the authors wrote in section 1, there exist previous modeling methods based on transmission trees, contact-distribution models, and approaches based on hypothesized population distributions. What will be the benefits of using the iPAR framework compared to existing approaches? How will model performance (e.g. DIC, TPR and FPR) differ among the approaches? In which simulation scenarios does iPAR perform better than other existing approaches? The authors can show it with different simulation analyses.

3. The iPAR framework uses case reports of the region, but it only estimates and predicts whether the patch itself is infected or not. Is a small or large number of cases in each patch associated with the performance of the iPAR framework? Or is it only the binary decision that is important here (patch is infected at time t, or not)? It would be a relevant question whether the performance of the iPAR framework will be the same for regions (multiple patches combined) with only a small number of cases and with a large number of cases, similar to the small area estimation problem in aggregated spatial entities. Will the performance metrics or the width of the 95% credible intervals of the estimators be the same for areas with few and many cases? Simulation or real-world data analysis can show this difference.

Reviewer #4: While the subject of the study is of great interest and would benefit readers, more work is required on the presentation of the manuscript. Overall, the manuscript provides a mathematically robust and flexible framework for dealing with missing population data in spatial epidemiology.

However, my initial impression is that the manuscript is too lengthy (25 pages of the manuscript and an additional 25 pages containing 10 appendices), which may divert readers' attention. The manuscript would greatly benefit from an effort by the authors to shorten the text, summarize the appendices within the main text, and integrate the explanation of the model’s application with case study and parameter estimation, rather than presenting them in separate sections.

The text is logically coherent and maintains a consistent framework throughout, but the exposition could be streamlined for clarity. Most sections are dense and may be challenging for readers without a strong mathematical background. Simplification would benefit the overall readability. For example, the likelihood expression and Bayesian updates, while technically accurate, are dense and could overwhelm readers unfamiliar with Bayesian modeling or epidemic processes.

The paper would benefit from merging the explanation of the model with its application, both in the case study and the estimation of key parameters, rather than separating them into distinct sections. This would improve the flow and make it easier to understand the model’s real-world implications.

The manuscript demonstrates a flexible approach to addressing uncertainties in the spatial distribution of populations at risk. By relying on covariates, the model accommodates varying levels of data availability, making it adaptable to scenarios where detailed population data is missing. However, the effectiveness of this flexibility depends on the quality and relevance of the covariates used. For this study, only land use was employed as a covariate, but further discussion on the selection and justification of covariates could enhance the model's applicability.

The methodology is mathematically specified with clear equations, but reproducibility in practice could depend on the following aspects:

Details on the implementation and parameter estimation, including the software used.

Explanation of how covariates are pre-processed, standardized, or selected (i.e., defining what constitutes a "suitable covariate").

Simplification or rephrasing of the explanation and interpretation of susceptibility, infectivity, covariate-based functions, and parameters to make them more approachable.

While the incorporation of absolute and relative covariates is mathematically valid and aligns with common practices in modeling spatial heterogeneity, clarification is needed. For instance, what defines temperature as absolute and land cover as relative? If only land cover is used in the application, the entire explanation of absolute and relative covariates could be removed, keeping only the relevant covariate used in the model.

Clear explanations of what each parameter represents, the rationale behind the values or algorithms used to estimate them, and the references used to assign these values.

The choice of non-informative priors requires further justification. Informative priors could be incorporated based on relevant epidemiological, spatial, and temporal factors. For example, proximity to known sources (infection time may depend on the distance from other infected patches), wild boar or farm density (higher densities may lead to earlier infections), patch-level biosecurity (better biosecurity could delay infection), seasonality (ASF transmission may exhibit seasonal trends), outbreak start time (infection times should logically follow the initial outbreak in nearby patches), and ASF's tendency to spread via contact (e.g., wild boar migration, human-mediated transmission). Additionally, models for spatial spread (e.g., diffusion or gravity models) could inform the selection of priors.

Other minor comments:

Acronyms should be spelled out the first time they are mentioned in the manuscript (e.g., MCMC). The term "ESM" can be removed and simply refer to the appendix number.

Incorporating line numbers would be a useful practice for facilitating specific revision comments.

Ensure that references are cited correctly in the text. For example, on page 6, it should read: O'Neill and Roberts (1999), Jewell et al. (2006), and Stockdale et al. (2017). Revise all reference citations accordingly.

When citing other authors, summarize their arguments clearly to provide context to the reader.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

Reviewer #4: No:  The data was downloaded from public websites. No code was provided, nor details of software used.

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Reviewer #4: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

Figure resubmission:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. If there are other versions of figure files still present in your submission file inventory at resubmission, please replace them with the PACE-processed versions.


Attachments
Submitted filename: IPAR_framework_review.docx
Revision 1

Attachments
Submitted filename: Response to reviewers.docx
Decision Letter - Denise Kühnert, Editor

Dear Dr Catterall,

We are pleased to inform you that your manuscript 'iPAR: A framework for modelling and inferring information about disease spread when the populations at risk are unknown' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Reviewer 2 has made two further suggestions. I leave to your discretion whether you wish to incorporate these in your manuscript.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Nicholas Geard

Academic Editor

PLOS Computational Biology

Denise Kühnert

Section Editor

PLOS Computational Biology

***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The modifications made by the authors in response to all reviewers' comments look satisfactory to me, and importantly, the guide produced to re-use the code is very complete.

Reviewer #2: Major comments have been addressed; however, two points remain to be addressed:

- Manuscript length and organization: the main text remains dense; consider merging parts of results and case study to improve narrative flow.

- The perfect-observation assumption (p 6, l 22–24) may be unrealistic. Even if perfect observation is sometimes assumed, imperfect observation is very often taken into account because of the potentially significant bias it can introduce.

Reviewer #3: The authors provided reasonable explanations for my previous comments.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Formally Accepted
Acceptance Letter - Denise Kühnert, Editor

PCOMPBIOL-D-24-01946R1

iPAR: A framework for modelling and inferring information about disease spread when the populations at risk are unknown

Dear Dr Catterall,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsofia Freund

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.