Limits to Causal Inference with State-Space Reconstruction for Infectious Disease

  • Sarah Cobey,

    Contributed equally to this work with: Sarah Cobey, Edward B. Baskerville

    cobey@uchicago.edu

    Affiliation Ecology & Evolution, University of Chicago, Chicago, IL, United States of America

  • Edward B. Baskerville

    Contributed equally to this work with: Sarah Cobey, Edward B. Baskerville

    Affiliation Ecology & Evolution, University of Chicago, Chicago, IL, United States of America

Abstract

Infectious diseases are notorious for their complex dynamics, which make it difficult to fit models to test hypotheses. Methods based on state-space reconstruction have been proposed to infer causal interactions in noisy, nonlinear dynamical systems. These “model-free” methods are collectively known as convergent cross-mapping (CCM). Although CCM has theoretical support, natural systems routinely violate its assumptions. To identify the practical limits of causal inference under CCM, we simulated the dynamics of two pathogen strains with varying interaction strengths. The original method of CCM is extremely sensitive to periodic fluctuations, inferring interactions between independent strains that oscillate with similar frequencies. This sensitivity vanishes with alternative criteria for inferring causality. However, CCM remains sensitive to high levels of process noise and changes to the deterministic attractor. This sensitivity is problematic because it remains challenging to gauge noise and dynamical changes in natural systems, including the quality of reconstructed attractors that underlie cross-mapping. We illustrate these challenges by analyzing time series of reportable childhood infections in New York City and Chicago during the pre-vaccine era. We comment on the statistical and conceptual challenges that currently limit the use of state-space reconstruction in causal inference.

Introduction

Identifying the forces driving change in natural systems is a major goal in ecology. Because experiments are often impractical and come at the cost of generalizability, a common approach is to fit mechanistic models to observations. Testing hypotheses through mechanistic models has a particularly strong tradition in infectious disease ecology [1–4]. Models that incorporate both rainfall and host immunity, for example, better explain patterns of malaria than models with only rainfall [5]; models with school terms fit the historic periodicity of measles in England and Wales [6, 7]. The ability of fitted mechanistic models to predict observations outside the training data strongly suggests that biological insight can be gained. There is nonetheless a pervasive risk that predictive variables merely correlate with the true, hidden variables, or that the model’s functional relationships create spurious resemblances to the true dynamics. This structural uncertainty in the models themselves limits inference [8–12].

An alternative approach to inferring causality is to examine the time series of potentially interacting variables without invoking a model. These methods face a similar challenge: they must distinguish correlated independent variables sharing a mutual driver from correlations arising from direct or indirect interactions. Many of these methods, including Granger causality [13] and other related methods [14–16], infer interactions in terms of information flow in a probabilistic framework and cannot detect bidirectional causality. A recent suite of methods based on dynamical systems theory proposes to infer interactions, both unidirectional and bidirectional, in systems that are nonlinear, noisy, and potentially high-dimensional [17–19]. The basic idea is that if X drives Y, information about X is embedded in the time series of Y. Examining the relationships between delay-embeddings of the time series of X and Y can reveal whether X drives Y, Y drives X, both, or neither. These approaches, which we refer to collectively as convergent cross-mapping (CCM), have been offered as general tools to analyze causation in nonlinear dynamical systems [17–20].

The mathematical foundations of CCM, and therefore its assumptions, lie in deterministic nonlinear systems theory. After sufficient time, the states of a deterministic dynamical system reach an attractor, which may be a point equilibrium, a limit cycle, or a higher-dimensional chaotic attractor. By Takens’ theorem, a one-dimensional time series X(t) from the system can be mapped perfectly to the attractor in the full state space of the system by constructing a delay embedding, in which states of the full system are mapped to delay vectors, x(t) = {X(t), X(t − τ_1), X(t − τ_2), …, X(t − τ_{E−1})}, for delays τ_i and an embedding dimension E, which must be at least as large as the dimensionality of the attractor [21]. This mapping provides the basis for causal inference under CCM: if Y drives (causes) X, then a newly observed x(t) can perfectly reconstruct the corresponding Ŷ(t) from past observations of the mapping x(t) → Y(t) (Fig 1A). As the number of observed delay vectors x(t) increases, the reconstruction converges to small error, as observed points on the reconstructed attractor become close together [17].
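As a concrete illustration of the delay vectors defined above, the following minimal sketch builds them from a toy series. The function name `delay_embed` and the toy data are hypothetical, and the delays are taken to be the offsets τ_i measured from time t.

```python
def delay_embed(series, delays):
    """Build delay vectors x(t) = (X(t), X(t - tau_1), ..., X(t - tau_{E-1})).

    `delays` holds the offsets tau_i (assumed positive integers), so the
    embedding dimension is E = len(delays) + 1. Only times t with a full
    history are returned.
    """
    first_t = max(delays) if delays else 0
    times = list(range(first_t, len(series)))
    vectors = [tuple([series[t]] + [series[t - d] for d in delays]) for t in times]
    return times, vectors


# Toy series (hypothetical); E = 3 with delays tau_1 = 2 and tau_2 = 4.
X = [0, 1, 2, 3, 4, 5, 6, 7]
times, vectors = delay_embed(X, delays=[2, 4])
print(times[0], vectors[0])
```

Passing unevenly spaced offsets to `delays` gives a nonuniform embedding with no change to the function.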

Fig 1. Summary of criteria for detecting causality.

(A) Schematic of cross-map algorithm for testing Y → X. Delay vectors in X, mapped to values in Y with lag ℓ, are bootstrap-sampled to construct a prediction library. For each delay vector in X, reconstructed values Ŷ are calculated from a distance-weighted sum of Y values from nearest neighbors in the library. Many sampled libraries yield a distribution of cross-map correlations between actual Y and reconstructed Ŷ. (B) Criterion 1 (cross-map increase). Bootstrap distributions of cross-map correlation are calculated at minimum and maximum library sizes with ℓ = 0; causality is inferred if the correlation at Lmax is significantly greater than the correlation at Lmin. (C) Criterion 2 (negative cross-map lag). Cross-map correlations are calculated across different values of ℓ. Causality is inferred if the highest cross-map correlation for negative ℓ is positive and significantly greater than the highest value for nonnegative ℓ.

https://doi.org/10.1371/journal.pone.0169050.g001
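The cross-map step in panel A can be sketched as follows. This is a simplified illustration, not the implementation used for the analyses: the exponential weighting, the number of neighbors, and the toy data are assumptions, and the whole series serves as the library in place of bootstrap-sampled libraries.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def cross_map(x_vectors, y_values, library, n_neighbors):
    """Reconstruct Y at every index from the delay vectors of X.

    `library` lists the indices whose (delay vector, Y value) pairs may
    serve as neighbors; the target point itself is always excluded.
    """
    estimates = []
    for i, target in enumerate(x_vectors):
        neighbors = sorted(
            (math.dist(target, x_vectors[j]), j) for j in library if j != i
        )[:n_neighbors]
        # Assumed weighting: exponential in distance, scaled by the nearest distance.
        scale = max(neighbors[0][0], 1e-12)
        weights = [math.exp(-d / scale) for d, _ in neighbors]
        weighted_sum = sum(w * y_values[j] for w, (_, j) in zip(weights, neighbors))
        estimates.append(weighted_sum / sum(weights))
    return estimates


# Toy data (hypothetical): Y is a lagged copy of X, so X's delay vectors determine Y.
X = [math.sin(0.3 * t) for t in range(200)]
x_vectors = [(X[t], X[t - 1]) for t in range(1, 200)]  # E = 2, delay of one step
y_values = [X[t - 1] for t in range(1, 200)]
y_hat = cross_map(x_vectors, y_values, library=range(len(x_vectors)), n_neighbors=3)
rho = pearson(y_values, y_hat)  # cross-map correlation
print(round(rho, 3))
```

Passing resampled index lists as `library` and repeating the call would give the distribution of correlations described in the caption.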

With finite, noisy real data, the reconstruction is necessarily imperfect, and two operational criteria have been used to detect causality. The first criterion (Fig 1B) is based simply on this improvement in reconstruction quality with the number of observations. This approach is known to produce false positives in the case of strongly driven variables, where the system becomes synchronized to the driver [17, 22]. This failure is logically consistent with the theory: the theory implies that, with perfect data, causal drivers will produce good reconstructions, but not that non-causal drivers will not produce good reconstructions. The second criterion (Fig 1C) tries to correct this problem by additionally considering the directionality of information flow in time [18]. If one variable drives another, the best predictions of current states of the driven variable should come from past, not current or future, states of the driver.

Many ecological systems undergo synchronized diurnal or annual fluctuations and thus raise doubts about the first criterion. Transient dynamics, demographic and environmental noise, and observation error, all ubiquitous in nature, raise general concerns, since they violate the theory’s assumption that variables are perfectly observed in a deterministic system. Variations of CCM have nonetheless been applied to such systems to test hypotheses about who interacts with whom [17–19, 23, 24].

We investigated whether the frequently periodic, noisy, and transient dynamics of ecological systems are a current obstacle to causal inference based on state-space reconstruction. These factors have been addressed to varying degrees in different contexts [17–19] but not systematically. Specifically, we examined whether the two criteria for causal inference are robust to inevitable uncertainties about the dynamics underlying the data. With little prior knowledge of a system’s complexity, including the influences of transient dynamics and noise, can we reach statistically rigorous conclusions about who interacts with whom? Infectious diseases provide a useful test case because their dynamics have been extensively studied, long time series are available, and pathogens display diverse immune-mediated interactions [25]. Their dynamics are also influenced by seasonal variation in transmission rates, host population structure, and pathogen evolution. The ability to test directly for the presence of interactions would save considerable effort over fitting semi-mechanistic models that incorporate these complexities. We find that although CCM appears to work beautifully in some instances, it does not in others. Noise and transient dynamics contribute to poor outcomes, as do statistical ambiguities in the methodology itself. We propose that except in extreme circumstances, the current method cannot reliably reveal causality in natural systems.

Results

To assess the reliability of CCM, we began by simulating the dynamics of two strains with stochastic, seasonally varying transmission rates (Methods). In large systems, many factors might influence these rates. In low-dimensional models, these factors are typically represented as process noise. We consequently varied the level of process noise in our simulations by changing its standard deviation, η. We also varied the strength of competition from strain 2 on strain 1 (σ12); strain 1, in contrast, never affected strain 2 (σ21 = 0). For each level of competition and process noise, we simulated 100 replicates from random initial conditions to stochastic fluctuations around a deterministic attractor. One thousand years of error-free monthly incidence were output to give CCM the best chance to work. For each combination of parameters (competition strength σ12 and process noise η), we examined whether strain interactions were correctly inferred. When σ12 > 0, strain 2 should be inferred to “drive” (influence) strain 1. Because σ21 = 0, strain 1 should never be inferred to drive strain 2.

To detect interactions, for each individual time series, we identified the delay-embeddings (Fig 1A) and applied one of two causality criteria using the reconstructed attractors (Fig 1B and 1C and Methods). Both criteria are based on the cross-map correlation ρ, which is the correlation between reconstructed values Ŷ and actual values of Y, given the reconstructed attractor of X. We use p < 0.05 to identify significant differences in these correlations because we are interested in situations in which the null hypothesis of no change in correlation, and thus no interaction, is rejected. Criterion 1 [17, 19] measures whether the cross-map correlation increases as the number of observations of the putatively driven variable grows (Fig 1B). We refer to this as the cross-map increase criterion. Criterion 2 [18] infers a causal interaction if the maximum cross-map correlation of the putative driver is positive and occurs in the past (i.e., at a negative temporal lag; Fig 1C). We refer to this as the negative cross-map lag criterion.
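The two decision rules can be sketched as functions of bootstrap distributions of ρ. The significance calculation here (the fraction of bootstrap pairs in which the candidate correlation does not exceed the baseline) and the use of the median to pick the best lag are assumptions made for illustration; the function names and toy values are hypothetical.

```python
from statistics import median


def prob_not_greater(candidate, baseline):
    """Fraction of (candidate, baseline) pairs in which candidate <= baseline."""
    pairs = [(c, b) for c in candidate for b in baseline]
    return sum(c <= b for c, b in pairs) / len(pairs)


def criterion_1(rho_at_lmin, rho_at_lmax, alpha=0.05):
    """Cross-map increase: rho at Lmax is significantly greater than at Lmin."""
    return prob_not_greater(rho_at_lmax, rho_at_lmin) < alpha


def criterion_2(rho_by_lag, alpha=0.05):
    """Negative cross-map lag: the best rho at a negative lag is positive and
    significantly greater than the best rho at any nonnegative lag."""
    negative = [r for lag, r in rho_by_lag.items() if lag < 0]
    nonnegative = [r for lag, r in rho_by_lag.items() if lag >= 0]
    best_negative = max(negative, key=median)
    best_nonnegative = max(nonnegative, key=median)
    return (median(best_negative) > 0
            and prob_not_greater(best_negative, best_nonnegative) < alpha)


# Toy bootstrap distributions of rho (hypothetical values).
increase = criterion_1([0.10, 0.12, 0.15, 0.11], [0.60, 0.62, 0.58, 0.65])
no_increase = criterion_1([0.50, 0.60, 0.55], [0.50, 0.60, 0.55])
negative_lag = criterion_2({
    -2: [0.70, 0.72, 0.74],
    -1: [0.50, 0.52, 0.55],
    0: [0.40, 0.42, 0.45],
    1: [0.30, 0.31, 0.33],
})
print(increase, no_increase, negative_lag)
```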

Sensitivity to periodicity

Criterion 1, which requires a significant increase in cross-map correlation ρ with observation library size L, frequently detected interactions that did not exist. In all cases where strain 2 had no effect on strain 1, CCM incorrectly inferred an influence (Fig 2A). Although strain 1 never influenced strain 2, it was often inferred to do so (Fig 2A). Sample time series suggested a strong correlation between synchronous oscillations and the appearance of bidirectional interactions (Fig 2B). In contrast, when strain 2 appeared to drive strain 1 but not vice-versa (σ12 = 0 and η = 0.05), strain 1 often oscillated with a period that was an integer multiple of the other strain’s (Fig 2C). Thus, as expected, strongly synchronized dynamics prevented separation of the variables. Additionally, the resemblance of strain 2 to the seasonal driver led to false positives even when the strains were independent and strain 1 oscillated at a different frequency.

Fig 2. Interactions detected as a function of process noise and the strength of interaction (C2 → C1) and representative time series.

(A) Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A significant increase in cross-map correlation ρ with library length L indicated a causal interaction. The time series consisted of 1000 years of monthly data. (B) Representative 25-year sample of the time series for which mutual interactions were inferred (σ12 = 0.25, η = 0.01). (C) Representative sample of the time series for which C2 is inferred to drive C1 but not vice-versa (σ12 = 0.25, η = 0.05).

https://doi.org/10.1371/journal.pone.0169050.g002

The sensitivity of the method to periodicity persisted despite transformations of the data and changes to the driver. One possible solution to reducing seasonal effects, sampling annual rather than monthly incidence, reduced the overall rate of false positives but also failed to detect some interactions (S1A Fig). Furthermore, when the effects of strain 2 on 1 were strongest, the reverse interaction was more often inferred. Sampling the prevalence at annual intervals gave similar results (S1B Fig), and first-differencing the data did not qualitatively change outcomes (S1C Fig). The method yielded incorrect results even without seasonal forcing (ϵ = 0) because of noise-induced oscillations (S1D Fig). In all of these cases, the presence of shared periods between the strains correlated strongly and significantly with the rate of detecting a false interaction (Fig 3).
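The data transformations mentioned above are simple to state in code. The sketch below assumes that annual incidence is the sum of the twelve monthly values and that annual sampling keeps every twelfth point; the function names and toy series are hypothetical.

```python
def annual_totals(monthly):
    """Sum consecutive blocks of 12 monthly values; a trailing partial year is dropped."""
    n_years = len(monthly) // 12
    return [sum(monthly[12 * y: 12 * (y + 1)]) for y in range(n_years)]


def annual_samples(monthly, month=0):
    """Keep one value per year (every 12th point), e.g. for prevalence."""
    return monthly[month::12]


def first_difference(series):
    """Change between consecutive observations."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Two toy "years" of monthly values: 1..12 and 13..24.
monthly = list(range(1, 25))
print(annual_totals(monthly))
print(annual_samples(monthly))
print(first_difference(annual_totals(monthly)))
```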

Fig 3. Shared frequency spectra predict probability of inferred interaction.

Points show the maximum cross-spectral densities of strains 1 and 2 plotted against the p-values for C1 → C2 for 1000 years of annual data. In all replicates, C1 never actually drives C2. Point color indicates the strength of C2 → C1 (σ12), and point size indicates the standard deviation of the process noise (η) on transmission rates.

https://doi.org/10.1371/journal.pone.0169050.g003
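One way to quantify shared periodicity between two series is the peak of their cross-spectrum. The sketch below uses a raw periodogram estimate computed with a direct Fourier transform; the estimator, its normalization, and the toy signals are assumptions for illustration and may differ from the calculation behind the figure.

```python
import cmath
import math


def dft(series):
    """Discrete Fourier transform of the mean-centered series at frequencies 0..n/2."""
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    return [
        sum(c * cmath.exp(-2j * math.pi * k * t / n) for t, c in enumerate(centered))
        for k in range(n // 2 + 1)
    ]


def peak_cross_spectrum(x, y):
    """Return (frequency, magnitude) at the peak of the cross-periodogram of x and y."""
    n = len(x)
    fx, fy = dft(x), dft(y)
    cross = [abs(a * b.conjugate()) / n for a, b in zip(fx, fy)]
    k = max(range(1, len(cross)), key=cross.__getitem__)  # skip the zero frequency
    return k / n, cross[k]


# Toy monthly series (hypothetical): two share a 12-month period, one has a 30-month period.
months = range(120)
a = [math.sin(2 * math.pi * t / 12) for t in months]
b = [math.cos(2 * math.pi * t / 12 + 0.5) for t in months]
c = [math.sin(2 * math.pi * t / 30) for t in months]
shared_freq, shared_power = peak_cross_spectrum(a, b)
_, distinct_power = peak_cross_spectrum(a, c)
print(shared_freq, shared_power, distinct_power)
```

Series that oscillate at a common frequency give a large peak regardless of their phase difference, whereas series with distinct frequencies give a peak near zero.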

Because cross-map skill should depend on the quality of the reconstructed attractor, we investigated performance under other methods of constructing the attractors of the two strains (Methods). Nonuniform embedding methods allow the time delays to occur at irregular intervals, τ_1, τ_2, …, τ_{E−1}, which may provide a more accurate reconstruction. Alternative reconstruction methods, including nonuniform embedding [26, 27], random projection [23], and maximizing the cross-map (rather than univariate) correlation, failed to fix the problem (S2 Fig).

We also tested a method that infers causality if the cross-map correlation is significantly greater than a null correlation distribution from surrogate time series with randomized seasonal anomalies [20]. This method resulted in a high false positive rate (S6 Fig). When surrogates were used for both the putative cause and the putative effect (S6A Fig), more true positives (σ12 = 0.25) and false positives (σ12 = 0) were detected than when surrogates were used only for the putative cause (S6B Fig).

Criterion 2, which infers that Y drives X if there is a positive cross-map correlation that is maximized at a negative cross-map lag, performed relatively well (Fig 4). Fewer false positives were detected, although the method missed some weak extant interactions (σ12 = 0.25) and interactions in noisy systems (η = 0.05, 0.1). Results for annual data were similar (S3A Fig). Requiring that ρ be not only positive but also increasing barely affected performance (S3B Fig).

Fig 4. Interactions detected as a function of process noise and the strength of interaction (C2 → C1) and representative time series.

Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A maximum, positive cross-map correlation ρ at a negative lag indicated a causal interaction. Each replicate used 100 years of monthly incidence.

https://doi.org/10.1371/journal.pone.0169050.g004

Limits to identifiability

If two variables X and Y share the same driver but do not interact, and the driving is strong enough, X may resemble the driver so closely that X appears to drive Y. In a similar vein, when the two strains in our system have identical transmission rates (β1 = β2) and one strongly drives the other (σ12 = 1), the direction of the interaction cannot be detected when the dynamics are nearly deterministic (η = 10^−6) (S3C Fig). Causal inference in such cases becomes difficult.

To investigate the limits to distinguishing strains that are ecologically similar and do not interact, we varied the correlation of the strain-specific process noise while applying the more conservative of the two criteria for inferring causality (Criterion 2), that the cross-map correlation ρ be positive and peak at a negative lag [18]. Process noise can be thought of as a hidden environmental driver that affects both strains simultaneously, and thus the strength of correlation indicates the relative contribution of shared versus strain-specific noise. With two identical, independent strains, no seasonal forcing, and low process noise (η = 0.01), the false positive rate depended on correlation strength and the quantity of data. When using 100 years of monthly incidence, the false positive rate varied non-monotonically with correlation strength, with a minimum (5%-6%) at a correlation of 0.75 and its highest values, near 24%, at correlations of 0 and 1 (S4A Fig). Using 1000 years of annual incidence reduced false positive rates to 5%-9% for imperfectly correlated noise (S4B Fig). The best performance occurred with 100-year monthly data when cross-map correlation was required to increase with library length (S4C Fig). Thus, the independence of two strains will generally be detected as long as they experience imperfectly correlated noise.
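The idea of noise with a tunable shared component can be made concrete with a short sketch. Mixing one shared Gaussian draw with a strain-specific draw is one common construction and is an assumption here, as are the function names; how the noise enters the transmission rates is not covered.

```python
import math
import random


def correlated_noise(n_steps, eta, correlation, rng):
    """Two Gaussian noise series with standard deviation eta whose correlation
    equals `correlation` (between 0 and 1): the shared share of the variance."""
    shared_weight = math.sqrt(correlation)
    own_weight = math.sqrt(1.0 - correlation)
    noise_1, noise_2 = [], []
    for _ in range(n_steps):
        shared = rng.gauss(0.0, 1.0)
        noise_1.append(eta * (shared_weight * shared + own_weight * rng.gauss(0.0, 1.0)))
        noise_2.append(eta * (shared_weight * shared + own_weight * rng.gauss(0.0, 1.0)))
    return noise_1, noise_2


def sample_correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
noise_1, noise_2 = correlated_noise(20000, eta=0.01, correlation=0.75, rng=rng)
print(round(sample_correlation(noise_1, noise_2), 2))
```

A correlation of 1 makes the two series identical (purely shared noise), and a correlation of 0 makes them independent.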

We next considered the problem of identifying two ecologically distinct strains (β1 ≠ β2) when one strain strongly drives the other (σ12 = 1) and its dynamics resemble the seasonal driver. In this case, even with perfectly correlated process noise, correct interactions are consistently inferred (S5 Fig). Thus, we conclude that the presence of noise, even highly correlated noise, can help distinguish causality between coupled, synchronized variables [14]. It is more difficult to distinguish non-interacting, dynamically equivalent variables. In the latter case, noise has inconsistent effects on causal inference, although Criterion 2 may perform much better than Criterion 1. These results at least hold for “modest” noise (η = 0.01): as shown earlier, higher levels hurt performance (Fig 4).

Transient dynamics

CCM is optimized for dynamics that have converged to a deterministic attractor. Directional parameter changes in time and large perturbations can prevent effective cross-mapping because the method requires a consistent mapping between system states as well as sufficient coverage of state space by the data. We evaluated the impact of both of these types of transient dynamics on causal inference, using a simple example of each as proof of principle. We again used very long time series to give the method the best chance to work.

In the first test, we identified two sets of parameter values where CCM was successful under Criterion 2 (intermediate interaction strength, σ21 = 0.5; seasonal forcing, ϵ = 0.1; process noise, η = 0.01; and transmission rates β1 of 0.30 (Fig 5A) and 0.32 (Fig 5B)). We tested CCM on simulations with the parameter values fixed and then with the transmission rate β1 varying linearly over time between the two values. All three tests used 100 years of monthly incidence. Of 100 replicates, with β1 fixed at 0.30, CCM failed to detect an interaction 5 times, and never falsely detected an absent interaction. With β1 fixed at 0.32, there were 12 false negatives and 1 false positive. When β1 varied from 0.30 to 0.32, error rates increased: there were 29 false negatives and 44 false positives. Transient dynamics due to a linear change in a system parameter can thus lead to incorrect causal inference even when causal inference is successful before and after the change.
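The linearly changing transmission rate in the third test can be written as a simple ramp over the 1200 monthly time points. The function name is hypothetical, and how the rate enters the simulation at each step is not shown.

```python
def linear_ramp(start, stop, n_steps):
    """Value at each of n_steps time points, moving linearly from start to stop."""
    step = (stop - start) / (n_steps - 1)
    return [start + step * t for t in range(n_steps)]


n_months = 100 * 12  # 100 years of monthly time points
beta_1 = linear_ramp(0.30, 0.32, n_months)
print(beta_1[0], beta_1[n_months // 2], beta_1[-1])
```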

Fig 5. Incorrect inference with changing transmission rate.

Example time series for testing transient dynamics. Each time series contained 100 years of monthly incidence data. The transmission rate β1 for the driven strain C1 was fixed at β1 = 0.30 (A) and β1 = 0.32 (B), and varied linearly over time between the two values (C). The transient time series yields high false positive and false negative rates under CCM. Interaction strength was σ21 = 0.5, process noise was η = 0.01, and seasonal forcing was ϵ = 0.

https://doi.org/10.1371/journal.pone.0169050.g005

In the second test, we began simulations at random initial conditions far from equilibrium and applied CCM to the first 100 years of monthly incidence. When strain 2 weakly drives strain 1 (σ12 = 0.5), causal inference is compromised, even when process noise is low (η = 0.01; S7 Fig). In 100 simulations of this scenario, the correct interaction (strain 2 driving strain 1) was always detected after transients had passed, but it was detected in only 19 of 100 simulations that included transients. Furthermore, a reverse interaction (strain 1 driving 2) was incorrectly detected in 21 of 100 simulations. The method thus performed worse than chance in identifying interactions that were present, and it also regularly predicted nonexistent interactions.

Application to childhood infections

Given the apparent success of CCM under Criterion 2 (negative cross-map lag) with two strains, little noise near the attractor, and 1000 years of observations, we investigated whether the method might shed light on the historic dynamics of childhood infections in the pre-vaccine era. Time series analyses have suggested that historically common childhood pathogens may have competed with or facilitated one another [28, 29]. We obtained the weekly incidence of six reportable infections in New York City from intermittent periods spanning 1906 to 1953 [30] (Fig 6A). Six of 30 pairwise interactions were significant at the p < 0.05 level, not correcting for multiple tests (Fig 6C). Polio drove mumps and varicella, scarlet fever drove mumps and polio, and varicella and pertussis drove measles. Typical cross-map lags occurred at one to three years (S8 Fig). The inferred interactions were identical if we required that the cross-map correlation ρ be increasing and not merely positive.
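To make the scale of the multiple-testing problem concrete, the sketch below enumerates the directed pairs among the six infections and computes two standard quantities: the number of tests expected to be significant by chance if no interactions existed, and a Bonferroni-adjusted threshold. The Bonferroni correction is shown only as an example of a correction; it was not applied in the analysis above.

```python
from itertools import permutations

infections = ["measles", "mumps", "pertussis", "polio", "scarlet fever", "varicella"]

# Each ordered pair (driver, driven) is one test.
directed_pairs = list(permutations(infections, 2))

alpha = 0.05
n_tests = len(directed_pairs)
expected_by_chance = alpha * n_tests  # assumes every null hypothesis is true
bonferroni_alpha = alpha / n_tests    # per-test threshold under Bonferroni

print(n_tests, expected_by_chance, bonferroni_alpha)
```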

Fig 6. Historical childhood infections in New York City and Chicago and inferred interactions from two reconstruction methods.

Time series show weekly incidence of infections per 1000 inhabitants of New York City (A) and Chicago (B). Delay-embeddings were constructed by maximizing the univariate correlation (C) or through a random projection method (D). Arrows indicate the inferred interactions from the New York (blue) and Chicago (red) time series under Criterion 2 (negative cross-map lag).

https://doi.org/10.1371/journal.pone.0169050.g006

Although we specifically chose infectious diseases not subject to major public health interventions in the sampling period, it is possible that the New York data contain noise and transient dynamics. To check the robustness of the conclusions, we analyzed analogous time series from Chicago from the same period (Fig 6B). Completely different interactions appeared (Fig 6C). Not correcting for multiple tests, pertussis drove scarlet fever and varicella; accepting marginally significant negative lags (p = 0.055), polio drove measles. In these cases, the maximum cross-map correlation ρ was not only positive but also increased at negative lag. Requiring that ρ only be positive at negative lag, polio also drove pertussis, measles drove mumps and varicella, and mumps drove scarlet fever. Except in one case, all negative lags occurred at more than one year (S9 Fig). Thus, no consistent interactions appeared in epidemiological time series of two major, and possibly dynamically coupled, cities.

To investigate the possibility that our method of attractor reconstruction might be unduly sensitive to noise and transient dynamics, we repeated the procedure with a method based on random projections [23]. Once again, no interactions were common to both cities (Fig 6D). Furthermore, only one of the original eight interactions from the first reconstruction method reappeared with random projection (two of eight reappeared if disregarding the city), and two interactions changed direction (three if disregarding the city). Both reconstruction methods selected similar lags (S10 and S11 Figs).

Discussion

CCM is, in theory, a computationally efficient alternative to mechanistic modeling for causal inference in systems that are deterministic, unchanging, and perfectly observed. By evaluating properties of reconstructed dynamics in state space, it sidesteps any need to formulate and fit what are often inaccurate mathematical models. In current practice, CCM appears an unstable basis for inference in natural systems. We simulated two interacting strains and found that the original CCM (Criterion 1) can lead to erroneous conclusions whenever strains fluctuated at similar frequencies. A related approach intended to control for periodic behavior also fared poorly [20]. Applying a criterion for causality that considers the temporal lag at which the cross-map correlation is maximized [18], rather than the change in the cross-map correlation with time series length L [17], avoids this problem. Inference with Criterion 2 is somewhat robust to process noise, which can improve performance in some cases. But the method has two problems, even with perfect and unrealistically abundant observations. First, it remains susceptible to deviations from its core dynamical assumptions. “High” process noise and transient dynamics each diminish performance, leading to false positives and negatives. Although some observed systems may follow deterministic dynamics that do not themselves change in time, this assumption is often dubious in ecology. Second, even when the dynamical assumptions are upheld, seemingly equally justifiable methods of attractor reconstruction yield different results. If the aim is to test hypotheses statistically, these problems raise doubts about the suitability of methods based on state-space reconstruction in ecology.

Oscillations are common in nature, especially in infectious diseases, and suggest that one of the criteria for causal inference (Criterion 1), including a method that tries to control for periodic behavior, could routinely mislead. Climatic and seasonal cycles, driven by such factors as school terms, El Niño, and absolute humidity, pervade the dynamics of many pathogens and influence the timing of epidemics [5, 6, 31–33]. Infectious diseases can also exhibit fluctuations in the absence of external forcing. These fluctuations arise from transient damped oscillations or from noise, which induces fluctuations on characteristic time scales and can interact with seasonal drivers to generate complex patterns [34–37]. Consumer-resource interactions [38–40] and patchy populations [41, 42] demonstrate similar behavior. In systems with synchronized dynamics, the only demonstrated reliable criterion for causal inference is a negative cross-map lag [18].

Assuming the stronger criterion for causality [18], under what conditions might we consider this method “safe”? We have shown that departures from a fixed attractor are a problem. These departures constitute different forms of transient dynamics. From a modeling perspective, we could describe them as arising from initial conditions, process noise, demographic stochasticity, or a change in the underlying attractor due to a secular change in a parameter. In our system, a ≥5% standard deviation in the transmission rate generated appreciable false positives. Is this high or low? In small populations, such variation could arise from the direct effects of demographic stochasticity, and in large populations, it could arise indirectly from the interaction of demographic stochasticity with nonlinear components [34, 43]. Although deterministic skeletons can help estimate the amount of noise present, if the true skeleton is unknown, estimates are sensitive to the approximating statistical functions [44]. More importantly, the existence of transient dynamics in a time series indicates insufficient observations. There is furthermore no guarantee any natural system will reach an attractor before going extinct or that the system’s dynamics themselves do not evolve [40].

If an ecologist were confident that observed dynamics reflected dynamics near a fixed, deterministic attractor (e.g., in a simple, closed system), uncertainties in the methodology of attractor reconstruction still suggest caution. We tested four different methods of selecting the lag-embedding. Even near an attractor, they gave different results (S2 Fig). Decades of research on methods of attractor reconstruction show the continued difficulty of justifying a particular approach [23, 26, 27, 45, 46]. Reconstructions from unknown systems thus currently run the risk of being ad hoc and compromising causal inference. The statistics for evaluating cross-map correlations also deserve attention. We bootstrapped and attempted to validate approaches empirically with simulated data, but the methods are not rigorously grounded in a probabilistic framework such as those common to mechanistic modeling [47]. Extending the approach to explicitly link nonlinear dynamics with process and observation noise in a probabilistic framework has the potential to put the method on a sounder footing.

Of the many factors that might explain the contrasting results for childhood infections in two cities, biological explanations thus seem the least likely. Although there is evidence that measles increases susceptibility to other pathogens [28], and that measles and pertussis compete for susceptible hosts [29], the CCM analyses did not consistently support either hypothesis. It is difficult to imagine a parsimonious mechanism by which the inferred interactions might be plausible. Several of the putative “driving” pathogens in fact infected children at older ages than the “driven” pathogens during this period (e.g., varicella and measles infected children 6–8 y old and 5–6 y old, respectively; polio and mumps infected children 12–17 y old and 6–7 y old, respectively) [48]. Different rates or modes of transmission for each disease in each city might lead to varying patterns of infection in different subpopulations, which would affect interactions. We know of no support for this hypothesis. In contrast, we cannot rule out transient dynamics, which could arise from changes in birth rates, mobility, and behavior during this period [49]. Process noise, implying the omission of important state variables and poor resolution of the underlying deterministic attractor, could also affect performance. Errors in attractor reconstruction are another possibility. Except for pertussis, different delay-embeddings were selected for each pathogen in each city, and an alternative method of attractor reconstruction yielded even more divergent results. Finally, we cannot account for the effects of short time series and measurement error. We conclude that the inferred interactions are untrustworthy.

Detecting causality remains challenging in the face of real data from a complex world. With limited data and complex dynamics, mechanistic models are always misspecified to some extent, and the use of other lines of evidence to motivate the choice of model structure is necessary for good inference [8–12]. But even an accurate mechanistic model that reproduces observed patterns well cannot prove causality. Controlled manipulative experiments, which are notoriously hard to conduct in large complex systems, are necessary. Global systems can never sustain this high standard, but randomization and replication are sometimes possible on smaller scales [50–52]. With diseases like the ones we investigate here, manipulations (e.g., vaccination) are seldom feasible. This has led epidemiologists and disease ecologists to resort to a mishmash of heuristics, frequently based on observational data, for causal inference [53].

Prediction, in contrast, is epistemologically straightforward and useful without knowledge of the true underlying system. It does not require deciding a priori what the best method is (model-based, model-free, or hybrid): the proof is in the prediction. Model-free prediction methods, including those based on state-space reconstruction [20, 54, 55], nonetheless share limitations with CCM. The power of these methods is limited by the dynamical coverage of the data. If past observations cover only a small part of state space (a subset of the attractor), model-free methods have no way to anticipate qualitative changes in dynamics. Secular changes in parameters that change the shape of the attractor pose a similar problem. These situations might cause previously excellent predictive models to fail without warning. Good mechanistic models, however, not only predict novel dynamics under these circumstances but also use them to inform biology. Models that include immune boosting and waning, for instance, can extrapolate contrasting patterns of pertussis activity in different locations and periods [56, 57]. Models that calculate the fraction of the population susceptible to measles can explain seemingly sudden changes in disease dynamics from changes in birth rates [6] and persistently chaotic dynamics, which paradoxically suggest intrinsic limitations to predictability [43]. It will be interesting to see how well mechanistic models can infer correct interactions in complex nonlinear dynamical systems, including our examples involving transient dynamics and childhood infections in New York and Chicago. In theory, predictive and mechanistic models can converge if the predictive factors mimic the hypothesized state variables over time and the data include a large range of possible dynamics.

Beyond its statistical practicalities, the prospect of applying state-space reconstruction to causal inference touches on unsettled questions in ecology. Are systems approximately deterministic and settled on static attractors, and how can we tell? Although CCM does not require that dynamics follow an identifiable model, it does require sufficient coverage of a fixed state-space [58]. We propose that this position is justifiable only in systems that are already well-understood (e.g., closed, non-evolving microcosms at steady state), but in these cases, causality is typically known.

Methods

Dynamical model

We modeled the dynamics of two pathogen strains under variable amounts of competition and process noise (Fig 7). The state variables in the system are the hosts’ statuses with respect to each strain [59]. Hosts can be susceptible (Si), infected (Ii), or recovered and immune (Ri) to each strain i. The deterministic model has the form: (1) (2) (3) (4) (5)

Fig 7. Compartmental representation of strain-competition model.

Hosts are susceptible (S), infected/infective (I), or recovered (R) with respect to each strain. Hosts move from S to I based on a seasonally varying transmission rate, and from I to R at a constant recovery rate. Competition takes place through cross-immunity, which is implemented by having hosts skip the infected state for one strain with some probability if they are already infected with another strain.

https://doi.org/10.1371/journal.pone.0169050.g007

Hosts enter the susceptible class for strain i through the birth (and death) rate μ. They leave through infection with strain i (Si → Ii), infection with strain j that elicits cross-immunity to i (Si → Ri), or death. The per capita transmission rate, βi(t), depends on a mean strain-specific rate, βi, and a forcing function that is shared by all strains. This function has a sinusoidal form and represents a common driver, such as seasonal changes in susceptibility or transmission from school-term forcing. The forcing function is defined by a shared period ψ and amplitude ϵ. Infected hosts recover at rate νi (Ii → Ri). The immune host class grows through these recoveries and also from the fraction of susceptible hosts, Si, contacting infected hosts, Ij, who develop cross-immunity, σij (0 ≤ σij ≤ 1). Immunity of this form has been described as “polarizing” because a fraction σij of hosts Si contacting infecteds Ij become completely immune (non-susceptible) to strain i, while the remaining 1 − σij stay completely susceptible. This cross-immunity is a form of competition that determines the directions of interaction between strains: when σij > 0, strain j drives strain i. We assume σii = 1: hosts acquire perfect immunity to a strain from which they have recovered (Table 1).
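The displayed equations are not legible in this copy of the text. As a reading aid, the verbal description above is consistent with a system of roughly the following form. This is a sketch reconstructed from the prose, not the authors' published equations: the phase of the sinusoid and the normalization of the state variables to fractions of the host population are assumptions.

```latex
\[
\begin{aligned}
\frac{dS_i}{dt} &= \mu - S_i \sum_{j} \sigma_{ij}\,\beta_j(t)\,I_j - \mu S_i \\
\frac{dI_i}{dt} &= \beta_i(t)\,S_i I_i - (\nu_i + \mu)\,I_i \\
\frac{dR_i}{dt} &= \nu_i I_i + S_i \sum_{j \neq i} \sigma_{ij}\,\beta_j(t)\,I_j - \mu R_i \\
\beta_i(t) &= \beta_i \left[ 1 + \epsilon \sin\!\left( \frac{2\pi t}{\psi} \right) \right]
\end{aligned}
\]
```

In this form the sum in the first line includes j = i with σii = 1, so infection with strain i removes hosts from Si at rate βi(t) Si Ii, and Si + Ii + Ri stays constant for each strain.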

Process noise on the per capita transmission rate produces stochastic differential equations in Ito form: (6) (7) (8) where the Wi are independent Wiener processes, one for each pathogen i, and η represents the standard deviation of the noise as a fraction of the deterministic transmission rate.

The observations consist of the number of new cases or incidence over some interval. Cumulative cases ci at time t were obtained by summing the Si → Ii transitions from the start of the simulation through time t. The incidence over times t − Δtobs to t, written as C(t) for convenience, is given by the difference in cumulative cases: (9) (10)
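The differencing step can be sketched in a few lines of Python. The function name and the `steps_per_obs` argument are illustrative, and the sketch assumes cumulative cases were recorded at every simulation step with a whole number of steps per observation interval.

```python
def incidence_from_cumulative(cum_cases, steps_per_obs):
    """Incidence per observation interval from a cumulative case count.

    cum_cases[k] is the cumulative number of S -> I transitions after k
    simulation steps; one observation interval spans steps_per_obs steps.
    """
    sampled = cum_cases[::steps_per_obs]  # c(t) at the observation times
    # C(t) = c(t) - c(t - dt_obs)
    return [later - earlier for earlier, later in zip(sampled, sampled[1:])]


# Hypothetical cumulative counts, observed every 2 steps.
print(incidence_from_cumulative([0, 1, 3, 6, 10, 15, 21], steps_per_obs=2))
```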

Simulation

The equations were solved numerically using the Euler-Maruyama method with a fixed step size. The step size was chosen to be less than the smallest within-run harmonic mean step size across deterministic, adaptive-step size pilot runs performed across the range of parameter space being studied. When numerical errors arose during transients, the step size was reduced further until the numerical issues disappeared.
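A minimal fixed-step Euler-Maruyama sketch for a model of this kind is shown below. It is not the authors' implementation. It assumes that state variables are population fractions, that the forcing is an unshifted sinusoid, and that process noise enters by replacing βj(t) dt with βj(t)(dt + η dWj). All parameter values in the example call are hypothetical.

```python
import math
import random


def euler_maruyama(beta, nu, sigma, mu, eps, psi, eta, dt, n_steps, S, I, R, seed=0):
    """Integrate the strain-status model with a fixed step size.

    sigma[i][j] is the cross-immunity to strain i elicited by strain j.
    Returns the final S, I, R lists and cumulative cases per strain.
    """
    rng = random.Random(seed)
    n = len(beta)
    S, I, R = list(S), list(I), list(R)
    cum_cases = [[0.0] for _ in range(n)]
    for step in range(n_steps):
        t = step * dt
        forcing = 1.0 + eps * math.sin(2.0 * math.pi * t / psi)
        # Noisy transmission increment for each strain j (assumed form):
        # beta_j(t) * I_j * (dt + eta * dW_j), with dW_j ~ Normal(0, dt).
        pressure = [
            beta[j] * forcing * I[j] * (dt + eta * rng.gauss(0.0, math.sqrt(dt)))
            for j in range(n)
        ]
        new_S, new_I, new_R = [], [], []
        for i in range(n):
            infected = S[i] * pressure[i]  # S_i -> I_i
            cross_immune = S[i] * sum(     # S_i -> R_i via other strains
                sigma[i][j] * pressure[j] for j in range(n) if j != i
            )
            new_S.append(S[i] + mu * (1.0 - S[i]) * dt - infected - cross_immune)
            new_I.append(I[i] + infected - (nu[i] + mu) * I[i] * dt)
            new_R.append(R[i] + nu[i] * I[i] * dt + cross_immune - mu * R[i] * dt)
            cum_cases[i].append(cum_cases[i][-1] + infected)
        S, I, R = new_S, new_I, new_R
    return S, I, R, cum_cases


# Hypothetical parameters: strain 2 drives strain 1 (sigma[0][1] > 0).
S, I, R, cum = euler_maruyama(
    beta=[0.3, 0.25], nu=[0.2, 0.2], sigma=[[1.0, 0.5], [0.0, 1.0]],
    mu=1e-4, eps=0.1, psi=365.0, eta=0.01, dt=0.1, n_steps=1000,
    S=[0.9, 0.9], I=[0.001, 0.001], R=[0.099, 0.099],
)
```

The sketch does not guard against state variables leaving [0, 1], which can happen with large noise or a coarse step; reducing the step size, as described above, is one remedy.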

Except where noted, the model was simulated with random initial conditions, and 1000 years of monthly observations were obtained from stochastic fluctuations around the deterministic attractor. The use of random initial conditions minimizes arbitrary bias in the simulated dynamics. From visual inspection of dynamics, the transient phase lasted much less than 1000 years. Time series were obtained from years 2000–3000.

Cross-mapping

Convergent cross-mapping (CCM) is a method for inferring causality in deterministic systems via delay embedding [17]. Takens’ theorem holds that, for an E-dimensional system, the attractor for the state space represented by delay vectors in a single variable X, x(t) = {X(t), X(t − τ1), X(t − τ2), …, X(t − τE−1)}, is topologically equivalent to the E-dimensional attractor for variables X1, …, XE. In the limit of infinite data, the full E-dimensional attractor can be reconstructed perfectly from a one-dimensional time series. Therefore, because x(t) contains complete information about the system’s dynamics, if Y is part of the same system and thus causally drives X, observations of x(t) → Y(t − ℓ), for a fixed lag ℓ, can be used to reconstruct unobserved values of Y(t) from new observations of x(t) (Fig 1).
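As an illustration, delay vectors for a uniform embedding (τk = kτ) can be built as follows. The function name is hypothetical, and a nonuniform embedding would replace the multiples of τ with an explicit list of delays.

```python
def delay_vectors(series, E, tau):
    """Delay vectors x(t) = [X(t), X(t - tau), ..., X(t - (E - 1) * tau)].

    Returns the times t for which a complete vector exists, and the vectors.
    """
    first = (E - 1) * tau  # earliest time with E lagged values available
    times = list(range(first, len(series)))
    vectors = [[series[t - k * tau] for k in range(E)] for t in times]
    return times, vectors


times, vectors = delay_vectors([0, 1, 2, 3, 4, 5], E=3, tau=2)
print(times)    # [4, 5]
print(vectors)  # [[4, 2, 0], [5, 3, 1]]
```

A series of length N yields N − (E − 1)τ delay vectors, which bounds the size of any library built from it.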

To evaluate whether Y drives X, we construct “libraries” of observations of x(t) → y(t − ℓ). For a particular library, we treat each value of Y(t) as unobserved, and reconstruct its value by identifying the E + 1 nearest neighbors to x(t) in the library, x(ti), for t1, …, tE + 1, and calculating the weighted average Ŷ(t − ℓ) = Σi wi Y(ti − ℓ), where the sum runs over the E + 1 neighbors. In order to avoid predictability due to system autocorrelation rather than dynamical coupling, neighbors are restricted to be separated in time by at least three times the delay at which the autocorrelation drops below 1/e. Weights are calculated from the Euclidean distances di between x(t) and x(ti), with wi proportional to exp(−di/d0), where d0 is the distance to the nearest neighbor [17].

The cross-map correlation ρ measures how well values of Y can be reconstructed from values of X, and is defined as the Pearson correlation coefficient between reconstructed values and actual values Y(t) across the entire time series [18]. Given library size L and lag ℓ, we generate a distribution of cross-map correlations ρ by bootstrap-sampling libraries mapping delay vectors x(t) to values Y(t − ℓ) and then computing the cross-map correlation for each sampled library. We use the bootstrap distribution of cross-map correlation as the basis for statistical criteria for causality.
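The cross-mapping and bootstrap steps can be sketched as below. This is a simplified illustration rather than the pyembedding code. It assumes a uniform embedding, exponential weights exp(−di/d0), a temporal exclusion applied between the target point and its neighbors (`min_sep`), and libraries drawn with replacement. The argument `shift` is the offset at which Y is read relative to x(t), so a negative `shift` refers to past values of Y; how that sign maps onto ℓ in the text is an assumption.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def valid_times(n, E, tau, shift):
    """Times t with a complete delay vector x(t) and an observed Y(t + shift)."""
    return [t for t in range((E - 1) * tau, n) if 0 <= t + shift < n]


def cross_map_rho(x, y, E, tau, shift, library, min_sep=1):
    """Correlation between observed Y(t + shift) and its cross-map estimate."""
    times = valid_times(len(x), E, tau, shift)
    vec = {t: [x[t - k * tau] for k in range(E)] for t in times}
    estimates, observed = [], []
    for t in times:
        # E + 1 nearest library neighbors, skipping points too close in time.
        nearest = sorted(
            (math.dist(vec[t], vec[s]), y[s + shift])
            for s in library if abs(s - t) >= min_sep
        )[:E + 1]
        if not nearest:
            continue
        d0 = nearest[0][0]
        if d0 > 0:
            weights = [math.exp(-d / d0) for d, _ in nearest]
        else:  # exact match: all weight on zero-distance neighbors
            weights = [1.0 if d == 0 else 0.0 for d, _ in nearest]
        total = sum(weights)
        estimates.append(sum(w * v for w, (_, v) in zip(weights, nearest)) / total)
        observed.append(y[t + shift])
    return pearson(estimates, observed)


def bootstrap_rho(x, y, E, tau, shift, L, n_boot, seed=0):
    """Bootstrap distribution of rho over libraries of size L."""
    rng = random.Random(seed)
    times = valid_times(len(x), E, tau, shift)
    return [
        cross_map_rho(x, y, E, tau, shift, rng.choices(times, k=L))
        for _ in range(n_boot)
    ]


# Toy data only: a chaotic logistic map X and a Y that is a function of X.
x = [0.4]
for _ in range(299):
    x.append(3.9 * x[-1] * (1.0 - x[-1]))
y = [2.0 * v + 1.0 for v in x]
rhos = bootstrap_rho(x, y, E=2, tau=1, shift=0, L=100, n_boot=10)
```

The toy series checks only that the machinery runs; it says nothing about causal direction, because Y here is a deterministic function of X.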

Criteria for causality

We infer causality using two primary criteria involving the cross-map correlation ρ [17, 18]: (1) whether ρ increases with L for a fixed lag ℓ; (2) whether ρ is positive and maximized at a negative temporal lag ℓ. We also consider a weaker alternative to the first criterion, testing simply whether ρ is positive, and a method intended to control for seasonal behavior based on randomizing seasonal anomalies [20].

Criterion 1.

If Y drives X, then increasing the library size L should improve predictions of Y(t) from x(t), as measured by ρ [17], for fixed lag ℓ = 0. The first criterion tests for this increase in ρ with L. We calculate ρ at Lmin = E + 2, the smallest library that will contain E + 1 nearest neighbors for most delay vectors x(t), and at Lmax, the total number of delay vectors x(t) in the time series, which is the largest possible library given the time-series length and the delay-embedding parameters E and τ. An increase in ρ is indicated by a lack of overlap between the distributions of ρ at Lmin and Lmax.

Criterion 2.

If Y strongly drives X, cross-map correlation at ℓ = 0 may yield a false positive when testing for X driving Y, but because information is transferred forwards in time from Y to X, the cross-map correlation should be maximized at a negative lag ℓ [18]. The second criterion simply requires that, to infer that Y drives X, the cross-map correlation ρ be maximized at a negative cross-map lag ℓ and be positive. In other words, not only must X contain information about Y, but this information must be greatest for past states of Y, reflecting the correct temporal direction for causality.

Statistical tests for causality criteria

The theory underlying CCM assumes completely deterministic interactions and infinite data. If Y drives X in the absence of noise, the correlation ρ between the reconstructed and observed states of Y should converge to one with infinite samples of X. In practice, if X and Y share a complex (e.g., chaotic) attractor, time series of X may not be long enough to see convergence [17].

The presence of observation and/or process noise violates the deterministic assumptions and prevents ρ from ever reaching 1. Nonetheless, a detectable increase in the correlation ρ with the library length L (for Criterion 1), or a maximum and positive correlation at negative lag (for Criterion 2), may suffice to demonstrate that Y drives X in natural systems. It is important to note that we have no formal theoretical justification for such statistical heuristics.

Our statistics are based on the distributions obtained from bootstrapping. For Criterion 1, which tests for an increase in ρ(L), we perform a nonparametric test of whether ρ(Lmax), obtained at the largest library length, is greater than ρ(Lmin), obtained at the smallest library length. The p-value for this test is the probability that ρ(Lmax) is not greater than ρ(Lmin), which we calculate directly from the sampled distributions (the fraction of bootstraps in which ρ(Lmax) < ρ(Lmin)). We also consider a weaker alternative, testing simply whether ρ is significantly positive.
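One way to compute such a p-value from two bootstrap samples is sketched below. Comparing every pair of draws, rather than pairing replicates by index, is an assumption of this sketch, and the correlation values are made up for illustration.

```python
def prob_not_greater(a_samples, b_samples):
    """Fraction of (a, b) pairs, one draw from each sample, with a <= b."""
    count = sum(1 for a in a_samples for b in b_samples if a <= b)
    return count / (len(a_samples) * len(b_samples))


# Hypothetical bootstrap correlations at the largest and smallest libraries.
rho_Lmax = [0.62, 0.58, 0.65, 0.60]
rho_Lmin = [0.10, 0.25, 0.61, 0.05]
p_value = prob_not_greater(rho_Lmax, rho_Lmin)
print(p_value)  # 0.125
```

The weaker alternative, testing whether ρ is positive, corresponds to comparing the bootstrap sample against zero, for example `prob_not_greater(rho_Lmax, [0.0])`.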

We also test a proposed method for controlling for periodic behavior by comparing ρ to a null distribution based on randomized seasonal anomalies [20]. Specifically, we calculate the mean for each time point within a year (the forcing period) across all years, and the difference (anomaly) from that mean at each time point. We construct surrogate time series by randomizing the sequence of anomalies across all time points and adding them to the seasonal means. The p-value is calculated as the probability that the cross-map correlation ρ for the original time series is less than ρi for a random surrogate time series i.
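The surrogate construction described here can be sketched as follows. The function names are illustrative, and the sketch assumes the forcing period is a whole number of observations.

```python
import random


def seasonal_surrogate(series, period, rng):
    """Surrogate series: seasonal means plus randomly reordered anomalies."""
    n = len(series)
    # Mean for each time point within the forcing period, across all periods.
    seasonal_mean = []
    for phase in range(period):
        values = series[phase::period]
        seasonal_mean.append(sum(values) / len(values))
    # Anomaly: difference from the seasonal mean at each time point.
    anomalies = [series[t] - seasonal_mean[t % period] for t in range(n)]
    shuffled = anomalies[:]
    rng.shuffle(shuffled)
    return [seasonal_mean[t % period] + shuffled[t] for t in range(n)]


def surrogate_p_value(rho_observed, rho_surrogates):
    """Fraction of surrogates whose correlation exceeds the observed one."""
    return sum(1 for r in rho_surrogates if r > rho_observed) / len(rho_surrogates)


rng = random.Random(0)
monthly = [1.0, 2.5, 3.0, 0.5, 2.0, 3.5]  # hypothetical data, period of 3
surrogate = seasonal_surrogate(monthly, period=3, rng=rng)
```

Each surrogate keeps the seasonal cycle of the original series and destroys the ordering of its anomalies, so any cross-map skill remaining in the surrogates reflects shared seasonality alone.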

For Criterion 2, which tests whether the best cross-map lag is negative and thus indicates the correct causal direction in time, we perform a similar nonparametric test. We identify the negative cross-map lag ℓ(−) with the highest median correlation, ρ(ℓ(−)) as well as the nonnegative cross-map lag ℓ(0+) with the highest median correlation. The p-value for this test is calculated as the probability that ρ(ℓ(−)) is not greater than ρ(ℓ(0+)).
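A sketch of this lag comparison follows. It assumes the bootstrap correlations are stored per lag in a dictionary and, as an assumption of the sketch, compares every pair of draws from the two selected lags. The example values are made up.

```python
from statistics import median


def criterion2_test(rho_by_lag):
    """Compare the best negative lag with the best nonnegative lag.

    rho_by_lag maps each cross-map lag to its bootstrap sample of rho.
    Returns (best negative lag, best nonnegative lag, p-value).
    """
    negative = [lag for lag in rho_by_lag if lag < 0]
    nonnegative = [lag for lag in rho_by_lag if lag >= 0]
    best_neg = max(negative, key=lambda lag: median(rho_by_lag[lag]))
    best_nonneg = max(nonnegative, key=lambda lag: median(rho_by_lag[lag]))
    a, b = rho_by_lag[best_neg], rho_by_lag[best_nonneg]
    # Probability that rho at the best negative lag is not greater.
    p_value = sum(1 for u in a for v in b if u <= v) / (len(a) * len(b))
    return best_neg, best_nonneg, p_value


rho_by_lag = {
    -2: [0.5, 0.6, 0.7],
    -1: [0.7, 0.8, 0.9],
    0: [0.6, 0.65, 0.75],
    1: [0.2, 0.3, 0.4],
}
best_neg, best_nonneg, p_value = criterion2_test(rho_by_lag)
```

Criterion 2 also requires the correlation at the selected negative lag to be positive, which would be checked separately against zero.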

We use a significance threshold of p < 0.05 for all tests.

Choice of delay and embedding dimension

The theory underlying attractor reconstruction works with any E-dimensional projection of a one-dimensional time series, which can be generated in many ways from lags of the time series. In simulated, deterministic models, E can be known perfectly, but the best projection may be system-dependent. In systems with process noise, unknown dynamics, and/or finite observations, there is no clearly superior method to select the appropriate projection [26, 27, 45, 60–62].

We accommodated this uncertainty by using four different methods. Two methods infer the best delay-embedding for each interaction by maximizing the ability of one variable, the driven variable, to predict itself (akin to nonlinear forecasting [46, 63]). The third method instead uses the delay-embedding that maximizes the cross-mapping correlation ρ for each interaction. Three of the four methods use uniform embeddings, identifying E and a fixed delay τ, and the other uses a nonuniform embedding, identifying a series of specific delays τ1, τ2, etc., whose length determines E.

  1. Univariate prediction method: By default, for each causal interaction (Ci → Cj), E and τ are chosen to maximize the one-step-ahead univariate prediction ρ at Lmax for the driven variable (Cj) based on its own time series.
  2. Maximum cross-correlation method: As an alternative, E and τ are chosen to maximize the mean cross-map correlation ρ at Lmax for each causal interaction being tested, for each time series.
  3. Random projection method: A recently proposed method based on random projection of delay coordinates sidesteps the problem of choosing optimal delays [23]. Instead, for a given E, all delays up to a maximum delay τmax are projected onto an E-dimensional vector via multiplication by a random projection matrix. E is chosen to maximize the cross-map correlation ρ.
  4. Nonuniform method: For each driven variable Cj, starting with τ0 = 0, additional delays τ1, τ2, … are chosen iteratively to maximize the directional derivative to nearest neighbors when the new delay is added [26]. The delays are bounded by the optimal uniform embedding based on a cost function that penalizes irrelevant information [27]. This method can be seen as a nonuniform extension of the method of false nearest neighbors [64].

Code

Code implementing the state-space reconstruction methods is publicly available at https://github.com/cobeylab/pyembedding. The complete code for the analysis and figures is publicly available at https://github.com/cobeylab/causality_manuscript; individual analyses include references to the Git commit version identifier in the ‘pyembedding’ repository. The simulated time series on which the analyses were performed are available from the authors on request.

Data on childhood infections

Time series were obtained from L2-level data maintained by Project Tycho [30]. All available cases of measles, mumps, pertussis, polio, scarlet fever, and varicella were obtained from the first week of 1906 through the last week of 1953 for New York City and Chicago. Pertussis data were terminated in the 26th week of 1948 to limit the influence of the recently introduced pertussis vaccine. Incidence was calculated by dividing weekly cases by a spline fit to each city’s population size, as reported by the U.S. Census.

Supporting Information

S1 Fig. Interactions detected as a function of process noise and the strength of interaction (C2 → C1) for different types of data.

Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A significant increase in cross-map correlation ρ with library length L indicated a causal interaction. Each analysis is based on 1000 years of data. (A) Annual incidence, (B) prevalence strobed annually, (C) first-differenced annual incidence, and (D) monthly incidence without seasonal forcing.

https://doi.org/10.1371/journal.pone.0169050.s001

(PDF)

S2 Fig. Interactions detected as a function of process noise and the strength of interaction (C2 → C1) for different delay-embedding methods.

Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A significant increase in cross-map correlation ρ with library length L indicated a causal interaction. Each analysis is based on 100 years of monthly data. Delay-embeddings were chosen by (A) nonuniform embedding, (B) random projection, or (C) maximizing the cross-map correlation ρ.

https://doi.org/10.1371/journal.pone.0169050.s002

(PDF)

S3 Fig. Interactions detected for different types of data.

Heat maps show the fraction of 100 replicates significant for each inferred interaction for different parameter combinations. A maximum cross-map correlation ρ at a negative lag was required for inferring causal interaction. (A) 1000 years of annual incidence, requiring that the maximum ρ be positive. (B) 100 years of monthly incidence, requiring that the maximum ρ be increasing. (C) 100 years of monthly incidence with identical strains (β1 = β2 = 0.3), requiring that maximum ρ be positive.

https://doi.org/10.1371/journal.pone.0169050.s003

(PDF)

S4 Fig. Interactions detected between identical strains with correlated process noise.

Heat maps show the fraction of 100 replicates significant for each inferred interaction. A maximum cross-map correlation ρ at a negative lag was required for inferring causal interaction. 100 years of monthly (A) and 1000 years of annual (B) incidence, requiring that the maximum ρ be positive. (C) 100 years of monthly incidence, requiring that maximum ρ be increasing.

https://doi.org/10.1371/journal.pone.0169050.s004

(PDF)

S5 Fig. Interactions detected between distinct strains with correlated process noise.

Heat maps show the fraction of 100 replicates significant for each inferred interaction. A maximum cross-map correlation ρ at a negative lag and ρ > 0 were required for inferring causal interaction. Results are shown for 5, 10, 25, 50, and 100 years of monthly incidence.

https://doi.org/10.1371/journal.pone.0169050.s005

(PDF)

S6 Fig. Interactions detected using null surrogates with randomized seasonal anomalies.

Heat maps show the fraction of 100 replicates significant for each inferred interaction in simulations using distinct strains and seasonal forcing. (A) Surrogate time series generated for putative cause and putative effect. (B) Surrogate time series generated for putative cause only.

https://doi.org/10.1371/journal.pone.0169050.s006

(PDF)

S7 Fig. Incorrect inference with far-from-attractor dynamics.

Cross-map correlations at different lags for a sample 100-year time series with monthly sampling (inset). Lines represent bootstrap medians; gray ribbons represent the middle 95% of the bootstrap distribution. Although C2 drives C1 (σ12 = 0.5, σ21 = 0), the maximum cross-correlation ρ for C1 cross-mapped to C2 occurs at a positive lag, and the reverse at a negative lag, leading to the conclusion that C1 drives C2, and C2 does not drive C1. Sample dynamics include process noise (η = 0.01) but no seasonal forcing (ϵ = 0).

https://doi.org/10.1371/journal.pone.0169050.s007

(PDF)

S8 Fig. Cross-map lags for New York with default (univariate) embedding.

https://doi.org/10.1371/journal.pone.0169050.s008

(PDF)

S9 Fig. Cross-map lags for Chicago with default (univariate) embedding.

https://doi.org/10.1371/journal.pone.0169050.s009

(PDF)

S10 Fig. Cross-map lags for New York with embedding based on random projection.

https://doi.org/10.1371/journal.pone.0169050.s010

(PDF)

S11 Fig. Cross-map lags for Chicago with embedding based on random projection.

https://doi.org/10.1371/journal.pone.0169050.s011

(PDF)

Acknowledgments

We thank Mercedes Pascual, Lauren Childs, and Greg Dwyer for helpful comments.

Author Contributions

  1. Conceptualization: SC EB.
  2. Formal analysis: SC EB.
  3. Software: EB.
  4. Writing – review & editing: SC EB.

References

  1. Keeling MJ, Rohani P. Modeling Infectious Diseases in Humans and Animals. Princeton University Press; 2011.
  2. Anderson RM, May RM. Infectious Diseases of Humans. Dynamics and Control. Oxford University Press; 1995.
  3. Kermack WO, McKendrick AG. A Contribution to the Mathematical Theory of Epidemics. Proc R Soc A. 1927;115(772):700–721.
  4. Ross R. The Prevention of Malaria. J. Murray; 1910. Available from: https://books.google.com/books?id=0jRaWNX--s0C.
  5. Laneri K, Bhadra A, Ionides EL, Bouma M, Dhiman RC, Yadav RS, et al. Forcing Versus Feedback: Epidemic Malaria and Monsoon Rains in Northwest India. PLOS Comput Biol. 2010;6(9):e1000898. pmid:20824122
  6. Finkenstädt BF, Grenfell BT. Time series modelling of childhood diseases: a dynamical systems approach. J R Stat Soc C. 2000;49(2):187–205.
  7. Fine PE, Clarkson JA. Measles in England and Wales–I: An analysis of factors underlying seasonal patterns. Int J Epidemiol. 1982;11(1):5–14. pmid:7085179
  8. Burnham KP, Anderson DR. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach. Springer; 2003.
  9. He D, Ionides EL, King AA. Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J R Soc Interface. 2009;7(43):271–283. pmid:19535416
  10. Yodzis P. The Indeterminacy of Ecological Interactions as Perceived Through Perturbation Experiments. Ecology. 1988;69(2):508–515.
  11. Wood SN, Thomas MB. Super-Sensitivity to Structure in Biological Models. Proc R Soc B. 1999;266(1419):565–570.
  12. Grad YH, Miller JC, Lipsitch M. Cholera Modeling. Epidemiology. 2012;23(4):523–530. pmid:22659546
  13. Granger CWJ. Investigating Causal Relations by Econometric Models and Cross-spectral Methods. Econometrica. 1969;37(3):424.
  14. Schumacher J, Wunderle T, Fries P, Jäkel F, Pipa G. A Statistical Framework to Infer Delay and Direction of Information Flow from Measurements of Complex Systems. Neural Comput. 2015;27(8):1555–1608. pmid:26079751
  15. Mooij JM, Peters J, Janzing D, Zscheischler J, Schölkopf B. Distinguishing cause from effect using observational data: methods and benchmarks. J Mach Learn Res. 2016;17(32).
  16. Stegle O, Janzing D, Zhang K, Mooij JM, Schölkopf B. Probabilistic latent variable models for distinguishing between cause and effect. In: Lafferty JD, Williams CKI, Shawe-Taylor J, Zemel RS, Culotta A, editors. Advances in Neural Information Processing Systems 23. Curran Associates, Inc.; 2010. p. 1687–1695. Available from: http://papers.nips.cc/paper/4173-probabilistic-latent-variable-models-for-distinguishing-between-cause-and-effect.pdf.
  17. Sugihara G, May R, Ye H, Hsieh CH, Deyle E, Fogarty M, et al. Detecting Causality in Complex Ecosystems. Science. 2012;338(6106):496–500. pmid:22997134
  18. Ye H, Deyle ER, Gilarranz LJ, Sugihara G. Distinguishing time-delayed causal interactions using convergent cross mapping. Sci Rep. 2015;5:14750.
  19. Clark AT, Ye H, Isbell F, Deyle ER, Cowles J, Tilman GD, et al. Spatial convergent cross mapping to detect causal relationships from short time series. Ecology. 2015;96(5):1174–1181. pmid:26236832
  20. Deyle ER, Maher MC, Hernandez RD, Basu S, Sugihara G. Global environmental drivers of influenza. Proceedings of the National Academy of Sciences. 2016;113(46):13081–13086.
  21. Takens F. Detecting strange attractors in turbulence. In: Rand D, Young LS, editors. Dynamical Systems and Turbulence, Warwick 1980. vol. 898 of Lecture Notes in Mathematics. Springer Berlin Heidelberg; 1981. p. 366–381. Available from: http://dx.doi.org/10.1007/BFb0091924.
  22. Kocarev L, Parlitz U. Generalized Synchronization, Predictability, and Equivalence of Unidirectionally Coupled Dynamical Systems. Phys Rev Lett. 1996;76(11):1816–1819. pmid:10060528
  23. Tajima S, Yanagawa T, Fujii N, Toyoizumi T. Untangling Brain-Wide Dynamics in Consciousness by Cross-Embedding. PLOS Comput Biol. 2015;11(11):e1004537. pmid:26584045
  24. Tsonis AA, Deyle ER, May RM, Sugihara G, Swanson K, Verbeten JD, et al. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. PNAS. 2015;112(11):3253–3256. pmid:25733877
  25. Cobey S. Pathogen evolution and the immunological niche. Ann NY Acad Sci. 2014;1320(1):1–15. pmid:25040161
  26. Nichkawde C. Optimal state-space reconstruction using derivatives on projected manifold. Phys Rev E. 2013;87:022905.
  27. Uzal LC, Grinblat GL, Verdes PF. Optimal reconstruction of dynamical systems: A noise amplification approach. Phys Rev E. 2011;84(1).
  28. Mina MJ, Metcalf CJE, de Swart RL, Osterhaus ADME, Grenfell BT. Long-term measles-induced immunomodulation increases overall childhood infectious disease mortality. Science. 2015;348(6235):694–699. pmid:25954009
  29. Rohani P, Green CJ, Mantilla-Beniers NB, Grenfell BT. Ecological interference between fatal diseases. Nature. 2003;422(6934):885–888. pmid:12712203
  30. van Panhuis WG, Grefenstette J, Jung SY, Chok NS, Cross A, Eng H, et al. Contagious Diseases in the United States from 1888 to the Present. New Engl J Med. 2013;369(22):2152–2158. pmid:24283231
  31. Shaman J, Pitzer VE, Viboud C, Grenfell BT, Lipsitch M. Absolute Humidity and the Seasonal Onset of Influenza in the Continental United States. PLOS Biol. 2010;8(2):e1000316. pmid:20186267
  32. Altizer S, Dobson A, Hosseini P, Hudson P, Pascual M, Rohani P. Seasonality and the dynamics of infectious diseases. Ecol Lett. 2006;9(4):467–484. pmid:16623732
  33. Metcalf CJE, Bjornstad ON, Grenfell BT, Andreasen V. Seasonality and comparative dynamics of six childhood infections in pre-vaccination Copenhagen. Proc R Soc B. 2009;276(1676):4111–4118. pmid:19740885
  34. Alonso D, McKane AJ, Pascual M. Stochastic amplification in epidemics. J R Soc Interface. 2006;4(14):575–582.
  35. Nguyen HTH, Rohani P. Noise, nonlinearity and seasonality: the epidemics of whooping cough revisited. J R Soc Interface. 2008;5(21):403–413.
  36. Rand DA, Wilson HB. Chaotic Stochasticity: A Ubiquitous Source of Unpredictability in Epidemics. Proc R Soc B. 1991;246(1316):179–184.
  37. Rohani P, Keeling MJ, Grenfell BT. The interplay between determinism and stochasticity in childhood diseases. Am Nat. 2002;159(5):469–481. pmid:18707430
  38. Boland RP, Galla T, McKane AJ. Limit cycles, complex Floquet multipliers, and intrinsic noise. Phys Rev E. 2009;79(5).
  39. McKane AJ, Newman TJ. Predator-Prey Cycles from Resonant Amplification of Demographic Stochasticity. Phys Rev Lett. 2005;94(21). pmid:16090353
  40. Turchin P. Complex Population Dynamics: A Theoretical/Empirical Synthesis (MPB-35) (Monographs in Population Biology). Princeton University Press; 2003.
  41. Gurney WSC, Nisbet RM. Single-Species Population Fluctuations in Patchy Environments. Am Nat. 1978;112(988):1075–1090.
  42. Durrett R, Levin S. The Importance of Being Discrete (and Spatial). Theor Popul Biol. 1994;46(3):363–394.
  43. Dalziel BD, Bjørnstad ON, van Panhuis WG, Burke DS, Metcalf CJE, Grenfell BT. Persistent Chaos of Measles Epidemics in the Prevaccination United States Caused by a Small Change in Seasonal Transmission Patterns. PLOS Computational Biology. 2016;12(2):e1004655. pmid:26845437
  44. Ellner S, Turchin P. Chaos in a Noisy World: New Methods and Evidence from Time-Series Analysis. Am Nat. 1995;145(3):343–375.
  45. Casdagli M, Eubank S, Farmer JD, Gibson J. State space reconstruction in the presence of noise. Physica D. 1991;51(1–3):52–98.
  46. Sugihara G, May RM. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature. 1990;344(6268):734–741. pmid:2330029
  47. Hilborn R, Mangel M. The Ecological Detective: Confronting Models with Data. Princeton University Press; 1997.
  48. Anderson RM, May RM. Vaccination and herd immunity to infectious diseases. Nature. 1985;318(6044):323–329. pmid:3906406
  49. Earn DJ. A Simple Model for Complex Dynamical Transitions in Epidemics. Science. 2000;287(5453):667–670. pmid:10650003
  50. Simberloff DS, Wilson EO. Experimental Zoogeography of Islands: The Colonization of Empty Islands. Ecology. 1969;50(2):278–296.
  51. Hurlbert SH. Pseudoreplication and the Design of Ecological Field Experiments. Ecol Monogr. 1984;54(2):187–211.
  52. Tilman D. Ecological experimentation: Strengths and conceptual problems. In: Likens GE, editor. Long-term Studies in Ecology: Approaches and Alternatives. New York: Springer-Verlag; 1989. p. 136–157.
  53. Plowright RK, Sokolow SH, Gorman ME, Daszak P, Foley JE. Causal inference in disease ecology: investigating ecological drivers of disease emergence. Front Ecol Environ. 2008;6(8):420–429.
  54. Ye H, Beamish RJ, Glaser SM, Grant SCH, Hsieh CH, Richards LJ, et al. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. PNAS. 2015;112(13):E1569–E1576. pmid:25733874
  55. Deyle ER, May RM, Munch SB, Sugihara G. Tracking and forecasting ecosystem interactions in real time. Proceedings of the Royal Society B: Biological Sciences. 2016;283(1822):20152258. pmid:26763700
  56. de Cellès MD, Magpantay FMG, King AA, Rohani P. The pertussis enigma: reconciling epidemiology, immunology and evolution. Proceedings of the Royal Society B: Biological Sciences. 2016;283(1822):20152309.
  57. Lavine JS, King AA, Andreasen V, Bjørnstad ON. Immune Boosting Explains Regime-Shifts in Prevaccine-Era Pertussis Dynamics. PLoS ONE. 2013;8(8):e72086. pmid:23991047
  58. Hastings A. Transients: the key to long-term ecological understanding? Trends Ecol Evol. 2004;19(1):39–45. pmid:16701224
  59. Gog JR, Swinton J. A status-based approach to multiple strain dynamics. J Math Biol. 2002;44(2):169–184. pmid:11942531
  60. Pecora LM, Moniz L, Nichols J, Carroll TL. A unified approach to attractor reconstruction. Chaos. 2007;17(1). http://dx.doi.org/10.1063/1.2430294. pmid:17411246
  61. Cao L. Practical Method for Determining the Minimum Embedding Dimension of a Scalar Time Series. Phys D. 1997;110(1–2):43–50.
  62. Small M, Tse CK. Optimal embedding parameters: a modelling paradigm. Physica D. 2004;194(3–4):283–296.
  63. Sugihara G. Nonlinear Forecasting for the Classification of Natural Time Series. Phil Trans R Soc A. 1994;348(1688):477–495.
  64. Kennel MB, Brown R, Abarbanel HDI. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys Rev A. 1992;45(6):3403–3411. pmid:9907388