¶ Membership of the MMED Organizing Committee is provided in the Acknowledgments.

The Education Series features noteworthy, innovative open-education programs to enhance understanding of biology.

The authors have declared that no competing interests exist.

In this fun, interactive exercise, which conceptually integrates two historically distinct fields in epidemiology, students simulate an infectious disease outbreak among themselves.

Modern infectious disease epidemiology builds on two independently developed fields: classical epidemiology and dynamical epidemiology. Over the past decade, integration of the two fields has increased in research practice, but training options within the fields remain distinct with few opportunities for integration in the classroom. The annual Clinic on the Meaningful Modeling of Epidemiological Data (MMED) at the African Institute for Mathematical Sciences has begun to address this gap. MMED offers participants exposure to a broad range of concepts and techniques from both epidemiological traditions. During MMED 2010 we developed a pedagogical approach that bridges the traditional distinction between classical and dynamical epidemiology and can be used at multiple educational levels, from high school to graduate level courses. The approach is hands-on, consisting of a real-time simulation of a stochastic outbreak in course participants, including realistic data reporting, followed by a variety of mathematical and statistical analyses, stemming from both epidemiological traditions. During the exercise, dynamical epidemiologists developed empirical skills such as study design and learned concepts of bias while classical epidemiologists were trained in systems thinking and began to understand epidemics as dynamic nonlinear processes. We believe this type of integrated educational tool will prove extremely valuable in the training of future infectious disease epidemiologists. We also believe that such interdisciplinary training will be critical for local capacity building in analytical epidemiology as Africa continues to produce new cohorts of well-trained mathematicians, statisticians, and scientists. 
And because the lessons draw on skills and concepts from many fields in biology—from pathogen biology, evolutionary dynamics of host–pathogen interactions, and the ecology of infectious disease to bioinformatics, computational biology, and statistics—this exercise can be incorporated into a broad array of life sciences courses.

The goal of epidemiology is to identify the biological, behavioral, and environmental causes of health outcomes or diseases and apply this knowledge to the development of effective disease interventions to improve public health

Thus, population-level epidemiological patterns emerge from the complex, interacting processes governing pathogen biology, host biology, host behavior, the environment, and their interactions

Firmly rooted in empirical research, classical epidemiology takes a phenomenalistic approach that focuses more on whether a causal relationship exists

Both approaches have played successful roles in public health

Despite such successes, training options within the fields remain distinct. Furthermore, the very few scientists with training in both disciplines seldom receive formal instruction explaining how they complement each other. Reflecting the severity of this rift, a recent published symposium on the future of epidemiological education lacked any discussion of dynamical approaches

The annual Clinic on the Meaningful Modeling of Epidemiological Data (MMED) at the African Institute for Mathematical Sciences in Muizenberg, South Africa, is working to address this gap. African capacity in epidemiology is in short supply despite the continent's disproportionate burden of disease and burgeoning new cohorts of African students trained in biomathematics, biology, and public health. MMED offers participants (ranging from undergraduate students to professors) exposure to a broad range of concepts and techniques from both classical and dynamical epidemiology, while explicitly highlighting how the two approaches complement each other and fit into a larger context. Selected lecture material from MMED is provided in the

During MMED we designed the following pedagogical tool to address the rift between classical and dynamical epidemiology. In this exercise, participants not only design studies and collect data prior to analysis, but also observe pathogen transmission in a way that reveals how epidemics are inherently nonlinear stochastic (i.e., random) processes. The approach is hands-on, consisting of the real-time simulation of a stochastic outbreak among course participants, including realistic data reporting, followed by analyses from both epidemiological traditions. We believe this type of integrated educational tool can stimulate the training of a cohort of infectious disease epidemiologists who are well acquainted with both the dynamical systems nature of epidemiological processes and the empirical design and analytical issues associated with investigating causal relationships between risk factors and disease.

At the 2010 and 2011 MMED Clinics we instigated outbreaks of a novel infectious agent, Muizenberg Mathematical Fever (MMF), in our course participants. The infectious agent was a paper form (

MMED lecture slides (

An example document that provides instructions that students are to carry out upon being exposed (this document doubles as the infectious agent; being handed this document constitutes exposure to MMF)

Data obtained (by the authors) from prior MMF outbreaks

An example survey used to identify risk factors associated with contracting MMF and corresponding data.

R code that organizers may experiment with to assist them in determining initial parameter values (for more information on the R computing language and to download the latest version of R, please see

Additional R code that provides illustrative exercises that can be performed using MMF data (e.g., identification of risk factors, calculating measures of effect, confidence intervals, etc.)

Additional online resources containing introductory materials on both epidemiological traditions

Information on infectious diseases and epidemics to share with students (e.g., ProMED, the Centers for Disease Control and Prevention website, and other such resources)

Before implementation, organizers should set values for epidemiological parameters (initial number of infections, potential protective factors, R_{0}, the proportion symptomatic) that take into account the number of participants and the types of analyses that students will perform; we suggest doing this by simulating outbreak dynamics and have provided example R code

If a protective factor is induced (e.g., “vaccination” or other form of immunity), ensure that the number of immune individuals will be sufficient for an effect to be detected and that the epidemic is likely to take off when accounting for immunity; also, attempt to make the factor something that will be readily detectable via an appropriately designed questionnaire

Prepare a full roster of participating individuals (i.e., those that could be exposed) for tracking purposes

Provide some brief training for individuals so that they are able to generate the appropriate random numbers (note: while we have had success with R, any software package that can generate Poisson and Bernoulli random variables could be used)

To ensure timely and full participation, we suggest using incentives (e.g., linking participation to student evaluation, course participation credit, etc.).

Do not initiate the outbreak until the course list has been finalized (i.e., after the end of the drop/add period)

If possible, provide a link to the infectious agent (i.e., the instruction document) that is only accessible to course participants (e.g., through a Blackboard or Sakai site) and cannot be found by searching; this will help ensure that the epidemic is confined to the closed population of course participants

Make sure participants know whom they may and may not infect (e.g., provide a course roster to each student or refer them to a course management website).

Once the outbreak is underway, discreetly remind participants to follow through accordingly (e.g., “I was expecting to receive emails from some of you. Don't forget to follow up.”)
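The suggestion above to choose parameter values by simulating outbreak dynamics can be sketched as follows. This is a minimal illustration written in Python rather than the R scripts the text refers to, and it is not the authors' code. It assumes that each infected participant passes the agent to a Poisson-distributed number of other participants chosen uniformly at random from the roster, and that only susceptible recipients become infected. The function names, the `min_size` cutoff used to call an outbreak a "take-off", and the number of runs are all hypothetical; the default inputs in the usage note mirror the 2011 values reported in the figure captions (4 initial infections, 14 immune participants, R_{0} = 1.82, 25% initially immune, which implies 56 participants).

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_outbreak(n_participants, n_initial, n_immune, r0, rng):
    """Simulate one outbreak generation by generation; return the number ever infected."""
    # Roster states: "I" = infected at some point, "R" = immune at the start, "S" = susceptible.
    roster = ["I"] * n_initial + ["R"] * n_immune
    roster += ["S"] * (n_participants - len(roster))
    current = list(range(n_initial))
    while current:
        next_generation = []
        for case in current:
            # Assumption: each case makes Poisson(r0) contacts with other participants.
            for _ in range(poisson(rng, r0)):
                contact = rng.randrange(n_participants - 1)
                if contact >= case:
                    contact += 1  # shift the index so a case never contacts itself
                if roster[contact] == "S":
                    roster[contact] = "I"
                    next_generation.append(contact)
        current = next_generation
    return roster.count("I")


def takeoff_probability(n_runs, min_size, n_participants, n_initial, n_immune, r0, seed=0):
    """Fraction of simulated outbreaks whose final size reaches min_size (a hypothetical cutoff)."""
    rng = random.Random(seed)
    hits = sum(
        simulate_outbreak(n_participants, n_initial, n_immune, r0, rng) >= min_size
        for _ in range(n_runs)
    )
    return hits / n_runs
```

Calling, for example, `takeoff_probability(1000, 15, 56, 4, 14, 1.82)` and then repeating the call with a different number of initial infections or immune participants gives a rough sense of how likely a planned outbreak is to spread far enough to support the intended analyses.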

The outbreaks percolated through participants before burning out, much like epidemics of biological pathogens. While we determined R_{0} (the average number of susceptible individuals infected by an index case introduced into an entirely susceptible population; see glossary in

Initiation of the MMF outbreaks resulted in a complex epidemiological process (

Epidemic time series for the 2010 (A) and 2011 (B) outbreaks. The two outbreaks differ in their basic reproductive numbers (defined as the average number of people an infectious individual infects if the rest of the population is susceptible; R_{0} = 1.23 and 1.82, respectively), the initial number of infectious individuals in the population (2 and 4, respectively), and the number of individuals immune at the start of the outbreak (0 and 14, respectively). (C and D) demonstrate how the effective reproductive number (R_{eff}; average number of individuals each infected person infects) changes during the course of the outbreak as the number of susceptibles decreases, and that the epidemic begins to burn out when R_{eff} decreases below 1 and infectious individuals no longer replace themselves with new infections. The script for production of and further detail on this figure are given in

(A) Muizenberg Mathematical Fever 2011 outbreak data illustrate how using a case definition with imperfect sensitivity (symptomatic disease) can cause nondifferential misclassification bias (the category of information bias where exposed and unexposed individuals are equally likely to be misclassified). Nondifferential misclassification biases the association between a risk factor and a disease outcome towards the null hypothesis of no association (odds ratio = 1). While attendance at the prior year's clinic was actually protective (black square and 95% CI), this bias was sufficient to cause the confidence interval for the odds ratio of this very protective variable to overlap the null value of 1 (gray). (B) illustrates how a risk factor (arrival a day or more early to the clinic) that has no real association with a disease outcome can appear associated through confounding. Individuals who had attended the clinic in prior years were less likely to come to the clinic early and were also protected (i.e., A). Consequently, early attendance appeared associated with a higher risk of disease in a univariate analysis (gray), though the CI contains the null hypothesis of no association in a multivariate analysis that adjusts for prior attendance (black). The script for production of and further detail on this figure are given in

(A) Five stochastic simulations of Muizenberg Mathematical Fever outbreaks using transmission parameters fit from the 2011 outbreak data, with only one infected individual initiating the outbreak (instead of four) but the same proportion initially immune (25%). In comparison to

Field | Study | Analysis | Data | Goal | Pedagogical Value |

Risk factor | Case-control or cohort | Logistic or Poisson regression (generalized linear model) | Disease outcomes, risk factors | Understand what variables increase (or decrease) risk of disease/infection | Exposure to survey design, data entry and cleaning, and statistical methods; understand effects of confounding and bias in data analysis |

Mathematical | Estimation of R_{0} and infectious period | Probability distribution fit to infectious contact data and infectious periods | Contact tracing data and observed infectious periods | Characterize individual heterogeneity in infectiousness and overall pathogen contagiousness | Understand utility of contact tracing data in characterizing disease dynamics, understand R_{0} as an epidemic threshold, understand how individual variation can affect disease dynamics |

Mathematical | Outbreak simulation | Stochastic SIR individual-based model with Gillespie algorithm | Estimates of R_{0}, incubation and infectious periods | Understand how outbreak size is affected by immune proportion | Awareness of effects of stochasticity in outbreaks of small sizes, gain intuition for how simulation can be used to answer applied questions |
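The outbreak simulation row of the table can be illustrated with a short stochastic SIR model driven by the Gillespie algorithm. The sketch below is a simplified, hedged illustration in Python, not the authors' individual-based R code. It tracks only the counts of susceptible and infectious individuals, omits the incubation period, and assumes homogeneous mixing with transmission rate `beta = r0 / infectious_period`. The function names and the one-day infectious period in the usage note are hypothetical.

```python
import random


def gillespie_sir(n, initial_infected, initial_immune, r0, infectious_period, rng):
    """Run one stochastic SIR outbreak; return a list of (time, susceptible, infectious)."""
    s = n - initial_infected - initial_immune
    i = initial_infected
    t = 0.0
    gamma = 1.0 / infectious_period  # per-capita recovery rate
    beta = r0 * gamma                # transmission rate implied by r0 (assumption)
    history = [(t, s, i)]
    while i > 0:
        infection_rate = beta * s * i / n
        recovery_rate = gamma * i
        total_rate = infection_rate + recovery_rate
        # Waiting time to the next event is exponential with the total event rate.
        t += rng.expovariate(total_rate)
        # Pick which event occurs in proportion to its share of the total rate.
        if rng.random() < infection_rate / total_rate:
            s -= 1
            i += 1
        else:
            i -= 1
        history.append((t, s, i))
    return history


def mean_final_size(n_runs, n, initial_infected, initial_immune, r0, infectious_period, seed=0):
    """Average number ever infected (including initial cases) across repeated runs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_runs):
        history = gillespie_sir(n, initial_infected, initial_immune, r0, infectious_period, rng)
        total += initial_infected + history[0][1] - history[-1][1]
    return total / n_runs
```

Comparing, say, `mean_final_size(500, 56, 4, 0, 1.82, 1.0)` with `mean_final_size(500, 56, 4, 14, 1.82, 1.0)` lets students see how the immune proportion changes the expected outbreak size, and plotting several individual `history` lists shows how much single runs vary in a small population.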

Introduction to the concepts of incubation, latency, infectiousness, being a/symptomatic, virulence, pathogenicity, immunity, transmissibility, pathogen evolution

Epidemiological study designs (e.g., case-control and cohort studies)

Outbreak investigation methodology (case definition, contact tracing, epidemic curves)

Measures of effect (e.g., odds ratios, relative risk)

Confounding, bias, and interaction

Introduction to a simple Susceptible-Infected-Recovered (SIR) model

Introduction to concepts of the basic and effective reproduction numbers, attack rate, and herd immunity

Using dynamic models to answer public health questions (e.g., using models as a means to explore counterfactual instances of disease occurrence)

Probability distributions and generation of random variables

Regression, confidence intervals, and hypothesis testing

Parameter estimation (e.g., maximum likelihood estimation)

Questionnaire design

Data collection, cleaning, visualization, and analysis

Verbal communication skills (i.e., presentation of results)
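Two of the concepts listed above, measures of effect and bias, lend themselves to a small worked example. The sketch below computes an odds ratio with a Wald (Woolf) 95% confidence interval from a 2×2 table, then shows the expected table when a case definition detects only a fraction of true cases equally in both exposure groups (nondifferential misclassification with perfect specificity). The counts are hypothetical and chosen only for illustration; they are not MMF data, and the function names are likewise assumptions.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI. a, b: exposed cases, non-cases; c, d: unexposed cases, non-cases."""
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def misclassify(a, b, c, d, sensitivity):
    """Expected 2x2 table when only a fraction `sensitivity` of true cases is detected.

    Undetected cases are counted as non-cases in both exposure groups alike,
    which is what makes the misclassification nondifferential.
    """
    return (
        a * sensitivity,
        b + a * (1 - sensitivity),
        c * sensitivity,
        d + c * (1 - sensitivity),
    )


# Hypothetical true infection status by exposure to a protective factor.
true_table = (4, 10, 30, 12)
# Hypothetical case definition that detects half of all true infections.
observed_table = misclassify(*true_table, sensitivity=0.5)

true_or, true_lower, true_upper = odds_ratio_ci(*true_table)
obs_or, obs_lower, obs_upper = odds_ratio_ci(*observed_table)
print(f"true:     OR={true_or:.2f}, 95% CI=({true_lower:.2f}, {true_upper:.2f})")
print(f"observed: OR={obs_or:.2f}, 95% CI=({obs_lower:.2f}, {obs_upper:.2f})")
```

With these hypothetical counts the observed odds ratio moves towards 1 and its interval widens enough to include 1, which is the pattern the misclassification figure describes for the symptomatic case definition.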

What is the difference between an infectious disease and a communicable disease?

How might you collect information on a disease outbreak?

What might cause an outbreak to end?

What determines how many cases occur in an outbreak?

What determines how long an outbreak lasts and when the peak occurs?

Why aren't data a perfect representation of reality?

Is it possible to predict whether an epidemic will occur when a pathogen is introduced into a population?

How and why might an individual's risk of infection change over the course of an outbreak? When is average individual risk the highest?

What individuals are most likely to be infected in an outbreak of communicable disease?

Why do some pathogens cause epidemics while others do not?

Why don't all individuals in a population have to be vaccinated to prevent an epidemic?
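The last few questions above can be made concrete with a short calculation. The sketch below assumes homogeneous mixing, under which the effective reproductive number is the basic reproductive number multiplied by the susceptible fraction, and the critical immune fraction is 1 - 1/R_{0}. The function names are hypothetical; the inputs in the usage note are the 2010 and 2011 values reported in the figure captions.

```python
def effective_r(r0, susceptible_fraction):
    """Effective reproductive number under homogeneous mixing (an assumption)."""
    return r0 * susceptible_fraction


def herd_immunity_threshold(r0):
    """Immune fraction above which effective_r falls below 1; zero if r0 is already at or below 1."""
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


for label, r0, immune_fraction in [("2010", 1.23, 0.0), ("2011", 1.82, 0.25)]:
    r_eff = effective_r(r0, 1.0 - immune_fraction)
    threshold = herd_immunity_threshold(r0)
    print(f"{label}: R_eff at start = {r_eff:.3f}, critical immune fraction = {threshold:.3f}")
```

Under these assumptions the 2011 outbreak starts with an effective reproductive number above 1 even though a quarter of participants are immune, because 25% is below the critical immune fraction for R_{0} = 1.82. Immunizing more than that critical fraction, rather than everyone, would be enough to keep the effective reproductive number below 1.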

Have students describe the life cycle of MMF by matching infectious disease terms (such as “latent period” and “transmission event”) to aspects of the exercise and discussing in relation to a real pathogen

Have students plot and explain data (e.g., the epidemic curve, the cumulative incidence through time, the distribution of infectious contacts, latent periods, and infectious periods)

Have students describe the epidemic curve and explain differences in data collection that might influence aspects of the observed curve (e.g., a case definition that relies on symptoms or reporting)

Have students estimate parameters (such as those describing the latent and infectious period distributions) on a dataset from another source

Have students pick a real immunizing infection with available estimates of the latent period, infectious period, and transmissibility (R_{0}

Have students discuss what aspects of the epidemic dynamics and data collection determine the ease with which a classical epidemiology study can detect which individual-level risk factors are associated with a higher probability of infection.

Have students conduct exercises/analyses as described and present on their projects, including data collection, cleaning, and analysis as well as any unexpected difficulties they encountered
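The parameter estimation activities above can start from something as small as the following sketch, which fits a Poisson distribution to the number of secondary infections per case by maximum likelihood. The contact counts are hypothetical, not MMF data, and the function names are assumptions. Note also that a mean taken over all cases in an outbreak reflects susceptible depletion, so it is closer to an average effective reproductive number than to R_{0}; restricting the data to early cases is one common way to approximate R_{0}.

```python
import math


def poisson_mle(counts):
    """Maximum likelihood estimate of a Poisson mean, which is the sample mean."""
    return sum(counts) / len(counts)


def poisson_log_likelihood(counts, mean):
    """Log-likelihood of the counts under a Poisson distribution with the given mean."""
    return sum(k * math.log(mean) - mean - math.lgamma(k + 1) for k in counts)


def dispersion_index(counts):
    """Sample variance divided by the mean; values well above 1 suggest overdispersion."""
    mean = poisson_mle(counts)
    variance = sum((k - mean) ** 2 for k in counts) / (len(counts) - 1)
    return variance / mean


# Hypothetical numbers of people infected by each of ten traced cases.
secondary_infections = [0, 1, 3, 0, 2, 1, 0, 4, 1, 2]

estimate = poisson_mle(secondary_infections)
print(f"estimated mean secondary infections: {estimate:.2f}")
print(f"log-likelihood at the estimate: {poisson_log_likelihood(secondary_infections, estimate):.3f}")
print(f"variance-to-mean ratio: {dispersion_index(secondary_infections):.2f}")
```

Students can check the estimate by evaluating `poisson_log_likelihood` over a grid of candidate means and confirming that it peaks at the sample mean. A variance-to-mean ratio well above 1 would motivate fitting a distribution that allows individual heterogeneity in infectiousness, such as the negative binomial.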

This educational approach provides a much-needed conceptual integration of risk factor and mathematical approaches in epidemiological training, illuminating their strengths, weaknesses, and how they complement each other. Further, this exercise is of interest and understandable to students in other fields. Much of the exercise is simple enough to be performed by adequately trained high school students and could therefore even serve as an early introduction to infectious disease epidemiology. The exercise is also perfectly suited to undergraduate or graduate courses in epidemiology, infectious diseases, public health, biomathematics, computational biology, statistics, and nonlinear dynamics. Such tools will stimulate training of epidemiologists able to think from both the classical and dynamical perspectives. Particularly in developing countries where training is only slowly becoming more interdisciplinary, exercises such as MMF teach participants how subfields of public health complement each other and produce professionals more able to collaborate across disciplines.

The most recent updates to the supplementary material, as well as a webpage for running the epidemic without the use of R, are available at


The MMED Organizing Committee includes S.E.B., J.R.C.P., J.C.S., J.D., Travis C. Porco (The Francis I Proctor Foundation for Ophthalmic Research and Department of Epidemiology and Biostatistics, University of California, San Francisco, California, US), Brian G. Williams (South African Centre for Epidemiological Modelling and Analysis, University of Stellenbosch, Stellenbosch, Republic of South Africa), and John W. Hargrove (South African Centre for Epidemiological Modelling and Analysis, University of Stellenbosch, Stellenbosch, Republic of South Africa).

We would like to thank Travis C. Porco, Brian G. Williams, John W. Hargrove, Alex Welte, Wim Delva, and Gavin Hitchcock for their continuing contribution to organization, teaching, and mentoring at the Meaningful Modeling of Epidemiological Data (MMED) clinic. We would also like to thank all the MMED clinic participants and, in particular, the project groups who enthusiastically analyzed Muizenberg Mathematical Fever outbreak data: Bibi Adams, Linsay Blows, Dario Fanucchi, Jacob Ismail Irunde, Jessica Nezar Gennrich, Piet Jones, Eric Maluta, Geoffrey Marutla, Cynthia Mazinu, Edinah Mudimu, Juliet Nakakawa, Nthatheni Norman Nelufule, Olina Ngwenya, Dany Pascal, Tarylee Reddy, Wilcan Sekgobela, Valrie Mabu Serumula, Milaine Seuneu, Patrick Shabangu, Marinel Janse Van Rensburg, and Ben Wilson. We would especially like to express our gratitude to Igsaan Kamalie, Jan Groenwald, Barry Green, and the rest of the staff at the African Institute for Mathematical Sciences for their superb logistical support and housing of the Meaningful Modeling of Epidemiological Data clinics.

MMED: Meaningful Modeling of Epidemiological Data

MMF: Muizenberg Mathematical Fever