The authors have declared that no competing interests exist.

Individual-based models (IBMs) informing public health policy should be calibrated to data and provide estimates of uncertainty. Two main components of model-calibration methods are the parameter-search strategy and the goodness-of-fit (GOF) measure; many options exist for each of these. This review provides an overview of calibration methods used in IBMs modelling infectious disease spread. We identified articles on PubMed employing simulation-based methods to calibrate IBMs informing public health policy in HIV, tuberculosis, and malaria epidemiology published between 1 January 2013 and 31 December 2018. Articles were included if models stored individual-specific information, and calibration involved comparing model output to population-level targets. We extracted information on parameter-search strategies, GOF measures, and model validation. The PubMed search identified 653 candidate articles, of which 84 met the review criteria. Of the included articles, 40 (48%) combined a quantitative GOF measure with an algorithmic parameter-search strategy: either an optimisation algorithm (14/40) or a sampling algorithm (26/40). These 40 articles varied widely in their choices of parameter-search strategies and GOF measures. For the remaining 44 (52%) articles, the parameter-search strategy could either not be identified (32/44) or was described as an informal, non-reproducible method (12/44). Of these 44 articles, the majority (25/44) were unclear about the GOF measure used; of the rest, only five quantitatively evaluated GOF. Only a minority of the included articles, 14 (17%), provided a rationale for their choice of model-calibration method. Model validation was reported in 31 (37%) articles. Reporting on calibration methods is far from optimal in epidemiological modelling studies of HIV, malaria and TB transmission dynamics.
The adoption of better documented, algorithmic calibration methods could improve both reproducibility and the quality of inference in model-based epidemiology. There is a need for research comparing the performance of calibration methods to inform decisions about the parameter-search strategies and GOF measures.

Calibration, that is, “fitting” the model to data, is a crucial part of using mathematical models to better forecast and control the population-level spread of infectious diseases. Evidence that the mathematical model is well-calibrated improves confidence that the model provides a realistic picture of the consequences of health policy decisions. To make informed decisions, policymakers need information about uncertainty, i.e., the range of likely outcomes rather than just a single prediction. Thus, modellers should also strive to provide accurate measurements of uncertainty, both for their model parameters and for their predictions. This systematic review provides an overview of the methods used to calibrate individual-based models (IBMs) of the spread of HIV, malaria, and tuberculosis. We found that fewer than half of the reviewed articles used reproducible, non-subjective calibration methods. For the remaining articles, the method could either not be identified or was described as an informal, non-reproducible method. Only one-third of the articles obtained estimates of parameter uncertainty. We conclude that the adoption of better-documented, algorithmic calibration methods could improve both reproducibility and the quality of inference in model-based epidemiology.

Individual-based models (IBMs) intended to inform public health policy should be calibrated to real-world data and provide valid estimates of uncertainty [

Parameter values with accompanying confidence intervals used in IBMs are obtained from the literature and are often obtained through statistical estimation. When researchers cannot estimate parameters from empirical data, they obtain their likely values through calibration [

In this review, we pay particular attention to the parameter-search strategy and GOF measure used. Algorithmic parameter-search strategies can be divided into

Previous research in the context of IBMs of HIV transmission found that 22 (69%) out of 32 included articles described the process through which the model was calibrated to data [

We conducted a systematic review of epidemiological studies using IBMs of the HIV, malaria and tuberculosis (TB) epidemics, as these have been among the most investigated epidemics with the highest global burden of disease [

The PubMed search resulted in 653 publications, of which 84 articles were included for review; 388 were excluded based on title and abstract, and another 181 were excluded based on a full-text review (see

Most articles, namely 56 (67%), investigated the effect of an intervention; 17 articles looked at behavioural or biological explanations for the observed epidemic, and 17 pursued other goals (e.g. parameter estimation, model development). In total, six (7%) articles had two objectives. For most of these (5/6), one of the objectives was investigating the effect of an intervention (see

Of the included articles, 40 (48%) combined a quantitative measure of GOF with an algorithmic parameter-search strategy, which was an optimisation algorithm (14/40) or a sampling algorithm (26/40) (see

Detailed information on calibration methods for the 14 (17%) articles using optimisation algorithms is reported in

| Authors | Year | Pathogen | Parameter search strategy algorithm | GOF |
|---|---|---|---|---|
| | 2018 | HIV | Grid search | Absolute distance |
| | 2013 | HIV | Grid search | Kolmogorov-Smirnov |
| | 2018 | HIV | Grid search | R-squared |
| | 2018 | HIV | Grid search | R-squared and Manhattan distance of parameters |
| | 2014 | HIV | Grid search | Squared distance |
| | 2014 | TB | Grid search | Number of model outputs within the confidence intervals around the targets |
| | 2015 | TB | Grid search | Number of model outputs within the confidence intervals around the targets |
| | 2013 | HIV | Iterative, descent-guided optimisation algorithm ( | Squared distance |
| | 2015 | HIV | Iterative, descent-guided optimisation algorithm ( | Squared distance |
| | 2015 | Malaria | Iterative, descent-guided optimisation algorithm ( | Squared distance |
| | 2015 | TB, HIV | Iterative, descent-guided optimisation algorithm ( | Squared distance |
| | 2018 | HIV | Iterative, descent-guided optimisation algorithm ( | Absolute distance |
| | 2017 | TB | Latin hypercube sampling | Surrogate likelihood |
| | 2015 | HIV | Sampling from tolerable range | Squared distance |
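Several of the articles listed above paired a grid search with a squared-distance GOF. The following is a minimal sketch of that combination, not a reconstruction of any reviewed model: the SIS-type `toy_ibm`, its parameters `beta` and `gamma`, the grids, and the target time steps are all hypothetical, and the targets are synthetic values generated from the toy model itself.

```python
import itertools
import random

TARGET_STEPS = (9, 19, 29)  # time steps at which prevalence is compared (assumed)


def toy_ibm(beta, gamma, n_people=500, n_steps=30, seed=1):
    """Hypothetical SIS-type IBM: returns prevalence after each time step."""
    rng = random.Random(seed)
    infected = [i < 10 for i in range(n_people)]  # 10 initial infections
    prevalence = []
    for _ in range(n_steps):
        frac = sum(infected) / n_people
        # Infected individuals recover with probability gamma;
        # susceptible individuals are infected with probability beta * frac.
        infected = [
            (rng.random() >= gamma) if is_inf else (rng.random() < beta * frac)
            for is_inf in infected
        ]
        prevalence.append(sum(infected) / n_people)
    return prevalence


def squared_distance(outputs, targets):
    """GOF: sum of squared differences between model outputs and targets."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))


def grid_search(targets, beta_grid, gamma_grid):
    """Evaluate every grid point and return (best_gof, best_beta, best_gamma)."""
    best = None
    for beta, gamma in itertools.product(beta_grid, gamma_grid):
        run = toy_ibm(beta, gamma)
        outputs = [run[t] for t in TARGET_STEPS]
        gof = squared_distance(outputs, targets)
        if best is None or gof < best[0]:
            best = (gof, beta, gamma)
    return best


# Synthetic targets generated from known parameter values with the same seed,
# so the grid contains a point that reproduces them exactly.
true_run = toy_ibm(0.4, 0.1)
targets = [true_run[t] for t in TARGET_STEPS]
best_gof, best_beta, best_gamma = grid_search(
    targets, beta_grid=[0.2, 0.3, 0.4, 0.5], gamma_grid=[0.05, 0.1, 0.15, 0.2]
)
print(best_gof, best_beta, best_gamma)
```

A grid search returns a single best-fitting parameter combination and therefore does not, on its own, quantify parameter uncertainty. Reusing one seed for every run is a simplification; in practice each grid point would typically be evaluated over several stochastic replicates.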

| Authors | Year | Pathogen | Parameter search strategy algorithm | GOF |
|---|---|---|---|---|
| | 2015 | Malaria | Bayesian calibration ( | Surrogate likelihood |
| | 2015 | TB | Bayesian calibration ( | Surrogate likelihood |
| | 2018 | TB | Bayesian calibration ( | Surrogate likelihood |
| | 2015 | Malaria | Bayesian calibration ( | Surrogate likelihood |
| | 2015 | Malaria | Bayesian calibration ( | Surrogate likelihood |
| | 2018 | Malaria | Bayesian calibration ( | Surrogate likelihood |
| | 2018 | HIV | Bayesian calibration ( | Surrogate likelihood |
| | 2016 | HIV | Bayesian melding | Squared distance |
| | 2014 | HIV | Bayesian melding | Surrogate likelihood |
| | 2017 | HIV | Bayesian melding | Surrogate likelihood |
| | 2013 | HIV | Grid search, step-wise acceptance of parameter sets resulting in GOF < cut-off | Absolute distance |
| | 2017 | HIV | History matching with model emulation | Implausibility measure |
| | 2017 | HIV | History matching with model emulation | Implausibility measure |
| | 2018 | HIV | History matching with model emulation | Implausibility measure |
| | 2018 | Malaria | Markov chain Monte Carlo | Absolute distance |
| | 2016 | HIV | Random draw from prior with selection of best 500 parameter combinations | Surrogate likelihood |
| | 2015 | Malaria | Random draw from prior, stepwise calibration | Absolute distance |
| | 2018 | Malaria | Random draw from prior, stepwise calibration | Squared distance |
| | 2016 | HIV | Rejection ABC ( | Relative distance |
| | 2017 | HIV | Rejection ABC ( | Chi-square |
| | 2018 | HIV | Rejection ABC ( | Relative distance |
| | 2013 | HIV | Rejection ABC ( | Squared distance |
| | 2013 | HIV | Rejection ABC ( | Relative distance |
| | 2015 | HIV | Rejection ABC ( | Relative distance |
| | 2017 | HIV | Rejection ABC ( | Absolute distance |
| | 2017 | TB | Rejection ABC ( | Squared distance |

IMIS, Incremental-mixture importance sampling; SIR, Sampling importance resampling; MCMC, Markov chain Monte Carlo.
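To illustrate how a sampling algorithm can be combined with a surrogate likelihood, the sketch below implements a basic sampling importance resampling (SIR) scheme. It is an assumption-laden toy, not the procedure of any reviewed article: `toy_model`, the uniform prior on `beta`, the Gaussian form of the surrogate likelihood, and the target value are all hypothetical.

```python
import math
import random


def toy_model(beta, rng, n_people=200):
    """Hypothetical stochastic model: simulated prevalence for a given beta."""
    p_infected = 1.0 - math.exp(-beta)  # assumed link between beta and risk
    return sum(rng.random() < p_infected for _ in range(n_people)) / n_people


def surrogate_log_likelihood(output, target, target_sd):
    """Gaussian surrogate log-likelihood on one summary statistic."""
    return -0.5 * ((output - target) / target_sd) ** 2


def sir(target, target_sd, n_prior=1000, n_post=300, seed=42):
    """Sampling importance resampling: returns n_post posterior draws of beta."""
    rng = random.Random(seed)
    # 1. Sample parameter values from the prior (here: uniform on [0, 2]).
    prior_draws = [rng.uniform(0.0, 2.0) for _ in range(n_prior)]
    # 2. Run the model once per draw and compute importance weights.
    log_w = [
        surrogate_log_likelihood(toy_model(b, rng), target, target_sd)
        for b in prior_draws
    ]
    max_log_w = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    # 3. Resample with replacement, proportionally to the weights.
    return rng.choices(prior_draws, weights=weights, k=n_post)


posterior = sir(target=0.5, target_sd=0.05)
posterior_mean = sum(posterior) / len(posterior)
print(round(posterior_mean, 2))
```

Unlike an optimisation algorithm, the resampled draws approximate a posterior distribution, so their spread can be reported as parameter uncertainty.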

From the 44 (52%) articles with unidentifiable or informal parameter-search strategies, the majority (25/44) are also unclear about the GOF used, while the rest either relied on visual inspection as a GOF (14/44) or used a quantitative GOF (5/44).

Only 14 (17%) of the 84 included articles provided a rationale for their choice of model-calibration method. For example, McCreesh

Ten (12%) of the 84 included articles used a weighted calculation of GOF. Four articles weighted the GOF by the amount of data behind each target summary statistic, for example by using the inverse of the width of the confidence interval around the data as the weight. In contrast, one article increased the weight for a data source for which fewer data were available. Other strategies included weighting based on a subjective assessment of the quality of the data, or weighting based on which data the authors wanted the model to fit best. One article down-weighted particular data to improve fit. Others stressed the importance of determining weights a priori since weights are chosen subjectively.
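As a concrete illustration of weighting by the inverse of the confidence-interval width, the sketch below turns each interval into a single number (one over its width), normalises the weights, and applies them to a squared-distance GOF. The targets, intervals, and model outputs are made-up values, and the normalisation step is our own assumption rather than something reported in the reviewed articles.

```python
def inverse_ci_width_weights(ci_bounds):
    """One weight per target: 1 / (upper - lower), normalised to sum to 1."""
    raw = [1.0 / (upper - lower) for lower, upper in ci_bounds]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_squared_distance(outputs, targets, weights):
    """Weighted GOF: sum of w_i * (output_i - target_i)^2."""
    return sum(w * (o - t) ** 2 for o, t, w in zip(outputs, targets, weights))


# Hypothetical targets: the first is precisely measured, the second is not.
targets = [0.10, 0.20]
ci_bounds = [(0.08, 0.12), (0.10, 0.30)]
outputs = [0.12, 0.25]  # hypothetical model outputs

weights = inverse_ci_width_weights(ci_bounds)
gof = weighted_squared_distance(outputs, targets, weights)
print(weights, gof)
```

Here the narrow interval (width 0.04) receives five times the weight of the wide one (width 0.20), so misfit to the better-measured target dominates the GOF.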

None (0/14) of the articles applying optimisation algorithms mentioned the acceptance criteria or stopping rules. Acceptance criteria and stopping rules applied in studies using sampling algorithms can be summarised as running the model until obtaining an arbitrary number of accepted parameter combinations.
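The stopping rule described for sampling algorithms, running the model until a chosen number of parameter combinations has been accepted, can be sketched with rejection ABC. Everything specific here is an assumption for illustration: the `toy_model`, the uniform prior, the absolute-distance GOF, the tolerance, and the cap on the number of runs.

```python
import random


def toy_model(beta, rng, n_people=200):
    """Hypothetical stochastic model: simulated prevalence for a given beta."""
    return sum(rng.random() < beta for _ in range(n_people)) / n_people


def rejection_abc(target, tolerance, n_accept=100, max_runs=100000, seed=7):
    """Rejection ABC: keep prior draws whose output lies within the tolerance.

    Stops once n_accept parameter values are accepted (or max_runs is hit).
    Returns the accepted values and the number of model runs used.
    """
    rng = random.Random(seed)
    accepted = []
    runs = 0
    while len(accepted) < n_accept and runs < max_runs:
        beta = rng.uniform(0.0, 1.0)   # draw from the prior
        output = toy_model(beta, rng)  # one stochastic model run
        runs += 1
        if abs(output - target) <= tolerance:  # absolute-distance GOF
            accepted.append(beta)
    return accepted, runs


accepted, runs = rejection_abc(target=0.3, tolerance=0.02)
acceptance_rate = len(accepted) / runs
print(len(accepted), runs, round(acceptance_rate, 3))
```

Reporting the tolerance, the number of accepted combinations, and the resulting acceptance rate would make such a stopping rule explicit and reproducible.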

The number of target statistics was explicitly mentioned in only three (3%) of the 84 included articles; for another 62 (74%) articles, we had enough information to attempt to deduce this number from either text or figures. The remaining 19 (23%) articles either provided incomplete information (11/19) or no information (8/19). Some (4/65) of the articles for which we were able to obtain the number of target statistics had different numbers of target statistics for calibration in different locations or calibration to different diseases. The 61 (73%) articles for which we were able to obtain a single count had a median number of target statistics of 23 (range 1–321). A histogram of the number of target statistics is provided in figure A in

The number of calibrated parameters was explicitly mentioned in 11 (13%) of the 84 included articles; for another 53 (63%) articles, it was possible to deduce this number from either text or figures. The remaining 20 (24%) articles either provided incomplete information (10/20) or no information at all (10/20). The 64 (76%) articles for which we were able to obtain a count had a median number of calibrated parameters of 10 (range 1–96). A histogram of the number of calibrated parameters is provided in figure B in

(A) Boxplots of the number of calibrated parameters for different parameter search strategies. (B) Boxplots of the number of target statistics for different parameter search strategies.

For 55 (66%) articles, we obtained counts for both the number of target statistics and the number of calibrated parameters. For many of these articles (17/55), the number of calibrated parameters appeared to exceed the number of target statistics. A plot of the number of target statistics against the number of calibrated parameters is provided in figure C in

The size of the simulated population was explicitly mentioned in 54 (64%) of the 84 included articles; for another 9 (11%) articles, it was possible to deduce this number from either text or figures. The remaining 21 (25%) articles either provided incomplete information (3/21) or no information at all (18/21). For the 63 (75%) articles for which we obtained a number, the median population size was 78,000 (range: 250–47,000,000). A histogram of the log10 of the size of the simulated population is provided in figure D in

The software used to build the IBM was not reported in 33 (39%) of the articles. Sixteen articles (19%) used the low-level programming language C++, six (7%) used MATLAB, and another six (7%) used Python. Various other computing platforms were used in the remaining 23 (28%) articles. A high-performance computing facility was used in 16 (19%) articles.

Several simulation tools (i.e. CEPAC [

Only 31 (37%) articles mentioned that a validation of the model had been performed.

More than half of the IBMs we studied used non-reproducible or subjective calibration methods. Articles that reported the use of formal calibration methods used a wide range of parameter-search strategies and GOF measures. Only one-third of articles used calibration methods that quantify parameter uncertainty. These findings are important because choices concerning the calibration method can have substantial effects on model results and policy implications [

We encourage authors to use the standardised Calibration Reporting Checklist of Stout

There are several methodological challenges in the calibration of individual-based models, including the choice of calibration method–i.e. the combination of algorithmic parameter-search strategy and GOF measure. The findings of the current review and previous research suggest that there is no consensus on which calibration method to use [

Another methodological challenge in the calibration of IBMs is determining a priori whether the target statistics provide sufficient information to calibrate the parameters [

The last methodological aspect of IBMs we would like to draw attention to is the size of the simulated population [

Our results in the setting of HIV, TB and malaria IBMs indicate that the use of formal calibration methods (48% of articles) is higher than in previous research on simulation models in general–not IBMs specifically. Previously, only one-fifth to one-third of articles reporting on epidemiological models used a quantitative GOF [

To our knowledge, this is the first detailed review of methods used to calibrate IBMs of HIV, malaria and TB epidemics. A limitation of our study is that we are unsure to what extent the results are generalisable to other infectious diseases. We encourage future research on other diseases to confirm or refute our current findings on the use of and reporting on methods in the calibration of IBMs in epidemiological research. Similarly, since our PubMed search excluded articles matching “molecular”, we may have missed relevant articles. However, we do not believe this selection is likely to bias the findings of this review. Another possible concern is that we do not control for overlaps in authorship; thus, we effectively treat articles that come from a given “research group” as independent observations, even though the calibration method used by a particular group is often the same, as we show in Tables

In conclusion, it appears that calibrating individual-based models in epidemiological studies of HIV, malaria and TB transmission dynamics remains more of an art than a science. Besides limited reproducibility for a majority of the modelling studies in our review, our findings raise concerns over the correctness of model inference (e.g., estimated impact of past or future interventions) for models that are poorly calibrated. The quality of inference and reproducibility in model-based epidemiology could benefit from the adoption of algorithmic parameter-search strategies and better-documented calibration and validation methods. We recommend the use of sampling algorithms to obtain valid estimates of parameter uncertainty and correlations between parameters. There is a need for simulation-based studies that compare the performance, strengths and limitations of different methods for calibrating IBMs to epidemiological data.
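To make the recommendation about sampling algorithms concrete, the sketch below shows how a set of accepted parameter combinations can be summarised as interval estimates and a between-parameter correlation. The accepted `(beta, gamma)` pairs are invented for illustration, and the linear-interpolation percentile is one of several reasonable conventions.

```python
import math


def percentile(values, q):
    """Percentile for q in [0, 1], using linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical accepted parameter combinations (beta, gamma) from a sampler.
accepted = [(0.30, 0.10), (0.35, 0.12), (0.40, 0.13), (0.45, 0.15), (0.50, 0.17)]
betas = [b for b, _ in accepted]
gammas = [g for _, g in accepted]

beta_interval = (percentile(betas, 0.025), percentile(betas, 0.975))
beta_gamma_corr = pearson(betas, gammas)
print(beta_interval, round(beta_gamma_corr, 3))
```

A strong correlation such as the one in this toy example signals that the two parameters are not separately well identified by the targets, which a single best-fitting parameter set cannot reveal.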

This review was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [

We identified articles on PubMed that employed simulation-based methods to calibrate IBMs of HIV, malaria and tuberculosis, and that were published between 1 January 2013 and 31 December 2018. Six years seemed to be long enough to yield a sizeable amount of information and to observe recent time trends, and short enough to be feasible and to speak to recent practices in model calibration in epidemiological modelling studies. The following search query was performed on 31 January 2019:

Eligibility criteria were agreed upon by WD, JD and CMH before screening. Articles were included if models stored individual-specific information and calibration involved running the model and comparing model output to population-level targets expressed as summary statistics. We excluded review articles, statistical simulation studies, and studies that focused on molecular biology and immunology because we were primarily interested in studies informing public health policy.

Titles and abstracts were screened for eligibility by CMH, and difficult cases were discussed with WD. If the title and abstract did not provide sufficient information for exclusion, a full-text examination was performed. Full-text inclusion was performed by two independent researchers (CMH and either ZM or ED) for a subset of 100 articles. CMH included 28 articles, of which ZM and ED did not include six; these six articles were double-checked by WD and consequently included for review. ZM included four articles that CMH did not include; these four articles were double-checked by WD and consequently not included for review. After that, full-text inclusion was performed by CMH in consultation with WD.

For each article, we extracted information on the objective of the study (i.e. estimating the effect of an intervention, investigating a behavioural or biological explanation for the observed infectious disease outbreak or other goals including estimation of parameters or model development), the parameter-search strategy and the GOF measure, the rationale for choosing this calibration strategy over alternatives, and model validation. Acceptance criteria and stopping rules are only relevant for articles applying algorithmic parameter-search strategies and were therefore collected only for that subset of articles. For readability purposes, we say “used” to mean “reported the use of” throughout this review.

Information was collected independently by two reviewers (CMH and either ZM or ED) for each article included using a prospectively developed form. This form was based on the Calibration Reporting Checklist of Stout

Information on calibration methods was extracted verbatim, allowing for later classification. Articles on which there was disagreement in the classification were discussed by WD, JD and CMH until an agreement was reached. We classified articles reporting both algorithmic and informal calibration as informal since doing part of the calibration informally makes the entire calibration irreproducible.

R 3.5.0 (

The authors gratefully acknowledge the help of all SACEMA students and researchers, specifically the fruitful conversations and helpful comments on the manuscript by Prof. Alex Welte, Mrs Cari van Schalkwyk, Dr Florian Marx, Prof. Juliet Pulliam and Dr Larisse Bolton. We would also like to acknowledge Mrs Marisa Honey and Mrs Susan Lotz from the Stellenbosch writing lab, who copy-edited a first version of the manuscript.

Dear Dr Hazelbag,

Thank you very much for submitting your manuscript 'Fitting individual-based models to data in HIV, tuberculosis and malaria: a systematic review' for review by PLOS Computational Biology. Your manuscript has been fully evaluated by the PLOS Computational Biology editorial team and in this case also by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the manuscript as it currently stands. While your manuscript cannot be accepted in its present form, we are willing to consider a revised version in which the issues raised by the reviewers have been adequately addressed. We cannot, of course, promise publication at that time.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

Your revisions should address the specific points made by each reviewer. Please return the revised version within the next 60 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at

In addition, when you are ready to resubmit, please be prepared to provide the following:

(1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors.

(2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible, to show clearly where changes have been made to their manuscript, e.g. by highlighting text.

(3) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution.

Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology:

- Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).

- Supporting Information uploaded as separate files, titled Dataset, Figure, Table, Text, Protocol, Audio, or Video.

- Funding information in the 'Financial Disclosure' box in the online system.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see

We are sorry that we cannot be more positive about your manuscript at this stage, but if you have any concerns or questions, please do not hesitate to contact us.

Sincerely,

Roger Dimitri Kouyos

Associate Editor

PLOS Computational Biology

Virginia Pitzer

Deputy Editor

PLOS Computational Biology

Reviewer's Responses to Questions

Reviewer #1: Dear authors,

Please find the attached review.

Reviewer #2: This systematic review used HIV, Malaria, and Tuberculosis studies to provide an overview of the fitting methods used in IBMs modelling infectious disease spread. This is critical as the usage of IBMs becomes more common nowadays than the deterministic models. The calibration of IBMs to data is often a challenge. Overall, this is an interesting and well-written paper. I have some comments that need to be clarified.

1. Line 98: In addition to the stochasticity, the number of parameters to estimate in IBM indicates the complexity of the parameter-search strategy (i.e. estimating only one parameter is the search on one-dimensional real space (1D), estimating two parameters is the search on two-dimensional real space (2D), and so on). The authors may comment in the introduction to show how the complexity of calibration varies between IBMs.

2. Line 144: It is better to provide a reference for the most investigated epidemics.

3. Line 155: Is there any reasons why the authors did the search from 2013?

4. Line 179: “the goal” need to be clarified along this line.

5. Fig 2 lists the different methods of parameter search strategy found, but without minimal explanations. It is better to include in the appendix a table that explains briefly these methods for non-modelers.

6. Line 258: “…while the rest either relied on visual inspection as a GOF (14 articles) or used a quantitative GOF (five articles)...” The authors should discuss why some studies use manual fitting instead of formal parameter-search strategies. This manual fitting is impossible in case of multi-fitting parameters. Another issue of manual fitting is not reproducible due to the stochasticity in IBM.

7. Line 300: A reference along this line is very useful for the simulation tools.

8. This systematic review provides interesting data about the methods of calibration of IBMs, but it did not provide a comparison, which method is best, or what is the strength and limitations of each method. The authors should clarify this point in the limitation section.

9. Minor comment: There is an issue in a cell in Table 1.

Reviewer #3: The authors did a tedious and very valuable work in order to identify articles that used an individual-based model (IBM) to fit to data in HIV, tuberculosis (TB) and malaria, and assess the proportion of them that reported the parameter-search strategy and the type of parameter-search strategy used. This work is particularly important as one aims for more transparency on the optimization methods used in modelling works. However, the authors could go a little bit further in order to answer questions such as:

• Does the proportion of articles reporting the parameter search strategy used vary according to the disease studied (HIV, TB, malaria)?

• What about uncertainty? Is the uncertainty related to the parameter-search strategy (e.g. confidence interval or credibility interval) reported? Does the proportion of article reporting uncertainty depend on the field (i.e. HIV, TB, malaria) or on the parameter-search strategy used (sampling or optimization strategy)?

• How many parameters are estimated in each study? Does this number depend on the parameter search strategy? Are the most complex models (i.e. the ones with the highest number of parameter to estimate) the ones that do not report uncertainty/search strategy?

Answering these questions could help better identify the articles that report searching method less frequently (e.g. assess whether it is related to the field studied). Reporting information about the number of parameters estimated and whether uncertainty is reported could also provide a wider understanding of the issue of lack of transparency that we face in some fields.

In addition, I have a few minor comments:

• Lines 49-54: This is mainly repetition of the results already presented in the previous paragraph. Consider removing this paragraph and replacing by a discussion of the results, e.g. something that looks like the second part of “Author summary” (lines 65-72).

• Line 98-99: To me, it is not clear why a greater complexity could make exact likelihood calculation impossible. I would rather say that greater complexity prevent from identifying the exact maximum likelihood estimator, but it should not prevent the model to calculate the exact likelihood.

• Line 129: What kind of stochasticity do you mention here? Would it not be relevant to report stochasticity in your systematic review as well?

• Line 155: Why only from 2013 and not before?

• Line 166: Maybe you could mention before (in abstract?) that you will focus on studies informing public health policy.

• Line 241: In the “GOF” column of Table 1, there is some formatting problem, as we can not read the whole text in the “Suen et al.” column.

• Line 271: It is not clear how you could obtain a weight (i.e. a number) with the ”inverse of the confidence interval” which are two numbers.

• Lines 309-323: Consider rewriting this paragraph, as, as it is now, it repeats what you already said in the “Results” section. I see the point of summarizing the results, but this could be in a more concise way.

• Line 377: Was it not 48% (40/84) before, instead of 49%?

**********

Large-scale datasets should be made available via a public repository as described in the

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Reviewer #1: No

Reviewer #2: Yes: H.H. Ayoub

Reviewer #3: No

Dear Dr. Hazelbag,

Thank you very much for submitting your manuscript "Calibration of individual-based models to epidemiological data: a systematic review" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers raised only one minor issue that still requires attention. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Roger Dimitri Kouyos

Associate Editor

PLOS Computational Biology

Virginia Pitzer

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Reviewer #1: Good job. I have no additional comments.

Reviewer #2: The authors have satisfactorily responded to all my questions and made the necessary changes to the manuscript.

Reviewer #3: All the previously reported issues have been addressed by the authors. The authors might however check the following issue. In Fig3, title (and legend) mentioned that the figure reports the number of target statistics, while the y-axis label mentions the number of calibrated parameters. This must be corrected. Additionally, the authors could present both figures (number of targets and number of calibrated parameters according to the parameter search strategy) side by side in the main manuscript.

**********

Large-scale datasets should be made available via a public repository as described in the

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool,

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see


Dear Dr. Hazelbag,

We are pleased to inform you that your manuscript 'Calibration of individual-based models to epidemiological data: a systematic review' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted, you will need to complete some formatting changes, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Roger Dimitri Kouyos

Associate Editor

PLOS Computational Biology

Virginia Pitzer

Deputy Editor

PLOS Computational Biology

***********************************************************

PCOMPBIOL-D-19-01519R2

Calibration of individual-based models to epidemiological data: a systematic review

Dear Dr Hazelbag,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom