Deep neural networks for endemic measles dynamics: Comparative analysis and integration with mechanistic models

Wyatt G. Madden; Wei Jin; Benjamin Lopman; Andreas Zufle; Benjamin Dalziel; C. Jessica E. Metcalf; Bryan T. Grenfell; Max S. Y. Lau

doi:10.1371/journal.pcbi.1012616

Abstract

Measles is an important infectious disease system both for its burden on public health and as an opportunity for studying nonlinear spatio-temporal disease dynamics. Traditional mechanistic models often struggle to fully capture the complex nonlinear spatio-temporal dynamics inherent in measles outbreaks. In this paper, we first develop a high-dimensional feed-forward neural network model with spatial features (SFNN) to forecast endemic measles outbreaks and systematically compare its predictive power with that of a classical mechanistic model (TSIR). We illustrate the utility of our model using England and Wales measles data from 1944-1965. These data present multiple modeling challenges due to the interplay between metapopulations, seasonal trends, and nonlinear dynamics related to demographic changes. Our results show that while the TSIR model yields similarly performant short-term (1 to 2 biweeks ahead) forecasts for highly populous cities, our neural network model (SFNN) consistently achieves lower root mean squared error (RMSE) across other forecasting windows. Furthermore, we show that our spatial-feature neural network model, without imposing mechanistic assumptions a priori, can uncover gravity-model-like spatial hierarchy of measles spread in which major cities play an important role in driving regional outbreaks. We then turn our attention to integrative approaches that combine mechanistic and machine learning models. Specifically, we investigate how the TSIR can be utilized to improve a state-of-the-art approach known as Physics-Informed-Neural-Networks (PINN) which explicitly combines compartmental models and neural networks. Our results show that the TSIR can facilitate the reconstruction of latent susceptible dynamics, thereby enhancing both forecasts in terms of mean absolute error (MAE) and parameter inference of measles dynamics within the PINN. 
In summary, our results show that appropriately designed neural network-based models can outperform traditional mechanistic models for short to long-term forecasts, while simultaneously providing mechanistic interpretability. Our work also provides valuable insights into more effectively integrating machine learning models with mechanistic models to enhance public health responses to measles and similar infectious disease systems.

Author summary

Mechanistic models have been foundational in developing an understanding of the transmission dynamics of infectious diseases including measles. In contrast to their mechanistic counterparts, machine learning techniques including neural networks have primarily focused on improving forecasting accuracy without explicitly inferring transmission dynamics. Effectively integrating these two modeling approaches remains a central challenge. In this paper, we first develop a high-dimensional neural network model to forecast spatiotemporal endemic measles outbreaks and systematically compare its predictive power with that of a classical mechanistic model (TSIR). We illustrate the utility of our model using a detailed dataset describing measles outbreaks in England and Wales from 1944–1965, one of the best-documented and most-studied nonlinear infectious disease systems. Our results show that overall, our neural network model outperforms the TSIR in all forecasting windows. Furthermore, we show that our neural network model can uncover the mechanism of hierarchical spread of measles where major cities drive regional outbreaks. We then develop an integrative approach that explicitly and effectively combines mechanistic and machine learning models, improving simultaneously both forecasting and inference. In summary, our work offers valuable insights into the effective utilization of machine learning models, and integration with mechanistic models, for enhancing outbreak responses to measles and similar infectious disease systems.

Citation: Madden WG, Jin W, Lopman B, Zufle A, Dalziel B, E. Metcalf CJ, et al. (2024) Deep neural networks for endemic measles dynamics: Comparative analysis and integration with mechanistic models. PLoS Comput Biol 20(11): e1012616. https://doi.org/10.1371/journal.pcbi.1012616

Editor: Benjamin Althouse, University of Washington, UNITED STATES OF AMERICA

Received: May 27, 2024; Accepted: November 4, 2024; Published: November 21, 2024

Copyright: © 2024 Madden et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All code and data are available on the GitHub repo: https://github.com/WyattGMadden/deep_measles_dynamics.

Funding: BL, ML, AZ, WM are supported by the cooperative agreement CDC-RFA-FT-23-0069 from the CDC’s Center for Forecasting and Outbreak Analytics. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Following the COVID-19 pandemic, there has been a marked increase in machine learning research focused on enhancing the forecasting of infectious diseases. This body of work primarily sought to develop highly predictive models for real-time application during the peak of the health crisis [1, 2]. A portion of these studies has endeavored to meld classical mechanistic approaches to infectious disease with machine learning, either through the post-hoc analysis of machine learning outputs in light of established disease dynamics [3, 4], or by directly integrating mechanistic insights into the machine learning models [5–7]. Our research advances these efforts by developing neural-network-based models tailored to the complex spatiotemporal multi-year transmission dynamics of endemic measles, leveraging a well-characterized infectious disease system and a rich historical dataset describing outbreaks in pre-vaccination England and Wales.

Measles is one of the most highly transmissible and strongly immunizing pathogens. Spatiotemporal patterns of pre- and post-vaccination measles incidence are among the most well-documented, and well-studied, nonlinear infectious disease systems. Measles exhibits complex spatiotemporal dynamics driven by the interplay between seasonal forcing, susceptible recruitment due to births, and spatial coupling between populations. These dynamics range from regular multiannual infection patterns in large populations [8] to coexisting attractors [9]. By contrast, measles dynamics in small, highly vaccinated populations are dominated by chaotic patterns driven by stochastic extinction [10].

For example, before widespread vaccination in the late 1960s, measles epidemics in England and Wales were dominated by highly regular periodic (often biennial) cycles in large cities whose populations are at or above the Critical Community Size (CCS)—the population size required to maintain endemic transmission—of approximately 300,000 individuals [11]. Following the widespread vaccination in the late 1960s, the epidemics shifted from highly regular cycles to largely irregular dynamics [12]. Due to its simple natural history and long time series of data, measles incidence in England and Wales has provided a fruitful testing ground for better understanding spatiotemporal nonlinear epidemiological dynamics, and developing semi-mechanistic statistical modeling approaches more broadly [13–18].

A suite of previous analyses has demonstrated the utility of deterministic and stochastic (semi-) mechanistic models, notably the time-series-SIR (TSIR) model [13], a discrete approximation of the S-I-R model, and other successful inferential approaches including particle filtering [19], in characterizing the dynamics in large urban populations. However, in general, these models have not primarily focused on generating long-term forecasting accuracy. While machine learning models, including a recent work leveraging the Least Absolute Shrinkage and Selection Operator (LASSO) [15], have shown improved forecasting skills for endemic measles dynamics, they generally lack deep mechanistic interpretability.

These models also do not explicitly consider spatial interactions between locations which is a known driver for measles transmission, particularly between less populous locations (e.g., small towns) and population centers (e.g., core cities) [16]. To this end, we first train a high-dimensional neural network explicitly incorporating both spatial and temporal features (SFNN) to forecast measles incidence over 1,452 cities and towns from 1944 to 1965 and assess forecast performance over a range of different forecast steps. We also employ explainability (XAI) methods to shed light on how the neural network reveals mechanistic relationships when making predictions.

Following this, we turn our attention to integrative approaches that have the potential to simultaneously provide high forecasting performance and mechanistic interpretability. Specifically, we focus on the so-called physics-informed neural network (PINN) methods, a class of integrative neural networks that incorporate physics differential equations into the model fitting procedure [20, 21]. PINN methods are able to preserve high predictive performance while incorporating and inferring scientific parameters, and have only recently been extended from physics differential equations to infectious disease mechanistic equations [6, 22]. While previous pioneering work [6, 22–24] has demonstrated the ability of PINN methods to improve disease incidence forecasts, the applications have not focused on long-term prediction and inference of the transmission dynamics in the context of endemic childhood infections. We build a PINN model which integrates a machine learning model directly with a mechanistic S-I-R model, and is able to address these shortcomings by augmenting the measles transmission dynamics with reconstructed latent susceptible dynamics from the TSIR model.

Our results demonstrate that appropriately designed machine learning models can outmatch more traditional mechanistic modeling approaches with respect to forecasting accuracy while effectively uncovering mechanistic infectious disease dynamics, in both a post-hoc and an integrative fashion. First, the high-dimensional neural network (SFNN), overall, outperforms the TSIR model for all forecast windows and in the majority of towns and cities in E&W, but with the most notable improvement for long-term predictions. The explainability (XAI) methods applied to our SFNN uncover the mechanism of hierarchical spread from large core cities to less populous towns without imposing such a mechanism a priori. Specifically, our results suggest that the relative role of spatial hierarchical spread increases as the population size of towns decreases, which is consistent with previous findings leveraging gravity model formulations [25]. Second, we compare the performance of a PINN model augmented with TSIR-reconstructed latent susceptible dynamics (referred to as TSIR-PINN) to a PINN model with naively constrained susceptible dynamics (referred to as Naive-PINN). We demonstrate that inclusion of the TSIR-reconstructed susceptible dynamics (in the TSIR-PINN model) improves the inference of disease parameters while simultaneously providing high forecasting accuracy. Together these findings illustrate the potential for a new suite of methods to provide improved integration between mechanistic models and machine learning approaches for infectious disease modeling, achieving high predictive performance while simultaneously ensuring accurate scientific inference of the spatiotemporal dynamics of measles and similar infectious disease systems.

Results

Neural network model (SFNN) outperforms TSIR model in forecasting endemic measles dynamics

The TSIR model estimates measles dynamics by leveraging incidence data and birth data [13, 18, 26] (see Materials and methods for full model specification). It provides a computationally inexpensive and highly tractable alternative approach to the continuous-time S-I-R model and has been shown to excel in short-term forecasting for measles incidence in large populous cities [17]. However, the TSIR model generally does not perform well for long-term forecasts. Furthermore, incorporating spatial interaction among multiple locations into the TSIR is a steep statistical challenge [27], which limits its utility for characterizing and forecasting the (typically less regular) outbreaks in less populous towns whose population size is less than the CCS.

With these deficiencies in mind we employ a neural network explicitly incorporating spatial and temporal features (SFNN). Specifically, our SFNN considers not only measles incidence lags as features, but also potentially important spatial features including the measles incidence lags in, and distances to, the nearest (ten) towns/cities and the (seven) highest population cities (see Materials and methods for full model specification). The seven highest population cities were chosen because these were identified as having populations greater than that of the critical community size (CCS) of 300,000, an empirical threshold at which chains of infections are locally sustained [28].

Our results show that our spatiotemporally featured neural network (Fig 1) generally outperforms the TSIR for both short- and long-term predictions, across different population sizes (Fig 2) and train/test year cutoffs (S1 Table). For very short-term predictions (e.g., when k = 1, where k is the number of biweeks ahead of the targeted prediction), our neural network model SFNN notably outperforms the TSIR in less populous towns, where the TSIR model has traditionally struggled. As the population size of the prediction target increases, the performance of SFNN gradually approaches that of the TSIR (Fig 2B). As the forecasting time window widens (e.g., k > 4), the added predictive accuracy of the SFNN, in comparison with the TSIR, becomes more pronounced, but with an opposite trend: the performance of the SFNN now improves with the population size.

Fig 1. SFNN architecture.

The SFNN architecture with input features grouped according to feature type, (maximum) 3 hidden layers of (maximum) dimension 1201, and 1 output layer of dimension 1 for incidence forecasts. Number of hidden layers and hidden layer dimension differ by forecasting window.

https://doi.org/10.1371/journal.pcbi.1012616.g001

Fig 2. SFNN vs. TSIR model performance measured by Root Mean Squared Error (RMSE) of within-city-standardized log(incidence + 1).

(A) Within-city SFNN RMSE versus TSIR RMSE, colored by log(population), faceted by k-step ahead forecast. (B) Difference between the within-city-standardized RMSE for TSIR and the within-city-standardized RMSE for SFNN; loess regression curves are fitted.

https://doi.org/10.1371/journal.pcbi.1012616.g002
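
The error measure named in the Fig 2 caption can be illustrated with a minimal sketch. It assumes that "within-city-standardized" means z-scoring log(incidence + 1) with each city's own mean and standard deviation, and that predictions are rescaled with the statistics of the observed series; the function names are hypothetical and the exact standardization used in the study may differ.

```python
import math
from statistics import mean, pstdev


def standardize_within_city(counts):
    """Return (mean, sd, z-scores) of log(count + 1) for one city's series."""
    logged = [math.log(c + 1) for c in counts]
    mu, sd = mean(logged), pstdev(logged)
    if sd == 0:
        raise ValueError("a constant series cannot be standardized")
    return mu, sd, [(x - mu) / sd for x in logged]


def within_city_rmse(observed, predicted):
    """RMSE on the standardized log scale for a single city.

    Assumption: predictions are rescaled with the observed series'
    mean and standard deviation.
    """
    mu, sd, z_obs = standardize_within_city(observed)
    z_pred = [(math.log(p + 1) - mu) / sd for p in predicted]
    squared_errors = [(o - p) ** 2 for o, p in zip(z_obs, z_pred)]
    return math.sqrt(mean(squared_errors))
```

Standardizing within each city puts errors for small towns and large cities on a comparable scale, which is what allows the per-city RMSE values to be plotted against population size.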

We also test whether our SFNN can capture annual to biennial bifurcation of measles epidemics in E&W caused by susceptible response to the late 1940s baby boom [18]. While our results (S1 Fig) suggest that our SFNN trained on the limited data prior to 1948 has limited medium- and long-term predictive accuracy for the outbreak size, it largely captures the bifurcation of the seasonal pattern for smaller forecasting windows (k ≤ 4).

Neural network model (SFNN) can uncover mechanism of spatial hierarchical spread

Previous work has demonstrated the presence of gravity-like dynamics in measles outbreaks [16, 25, 29]. For instance, dynamics in small towns are shown to be driven by the mechanism of spatial hierarchical spread in which infections in large cities can serve as reservoirs for seeding infections in less populous regions [25]. To assess if the neural network is learning such a mechanism, we employ feature importance methods that estimate how predictions rely on information from certain features and groups of features. We specifically use SHAP values [30] to investigate the relative importance of a core city to the measles spread in locations with different population sizes (see Materials and methods for more details).

Our results (Fig 3) show that the (lagged) incidence in large cities is relatively more important for less populous cities/towns. This suggests that our neural network model is able to reveal the mechanism of spatial hierarchical spread in the endemic measles spatio-temporal dynamics. This is notable both for the indication that our neural network SFNN is able to employ spatial features in a complex manner that reveals mechanistic dynamics, without explicitly imposing spatial hierarchy in the model a priori, and as an example of a post-hoc XAI method that reaffirms a theorized dynamic in a disease system.

Fig 3. SHAP values uncover mechanism of spatial hierarchical spread.

The SHAP value measures the relative importance of the incidence of a core city (e.g., London) for making incidence prediction among cities/towns with different population sizes, which can be heuristically treated as the relative importance to the local transmission of measles in a particular city/town. Core city incidence lag features are shown to be more important when predicting incidence for less populous cities/towns. Specifically, the mean relative absolute SHAP value for each of the core city incidence lag features has an inverse relationship with log population. Cities and towns are categorized (on the x-axis) into 10 groups according to the quantiles of their population sizes.

https://doi.org/10.1371/journal.pcbi.1012616.g003

Latent susceptible dynamics reconstruction using TSIR improves inference and forecasts of the integrative PINN framework

Next we turn our attention to integrative approaches that combine mechanistic and machine learning models. We consider the general conceptual framework of Physics-Informed Neural Network (PINN), a class of integrative neural networks that incorporate physics differential equations [20]. PINNs regularize a neural network by including a loss term which matches differential equations with observed gradient approximations garnered during the fitting process (typically using automatic differentiation methods). They hold the promise of preserving the high predictive capabilities and expressibility of neural networks while integrating scientific relationships directly into the model. Though PINNs are classically employed as a surrogate model for computationally intensive differential equation solvers [20], they also enable parameter inference and let (physics) dynamics partially drive predictions in an integrative fashion. These latter aims have been the primary impetus of existing methods to extend PINNs to spread of infectious disease [22] and are also the motive for us improving the PINN framework.
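
For orientation, a generic PINN objective for an S-I-R system can be written schematically as follows. This is an illustration under stated assumptions, not the loss used for the TSIR-PINN or Naive-PINN models: the network outputs \hat S_\theta and \hat I_\theta, the weight \lambda, the collocation points \tau_j, the transmission rate \beta, the recovery rate \gamma, and the omission of births are all choices made here for the example.

```latex
\mathcal{L}(\theta, \beta, \gamma)
  = \underbrace{\frac{1}{n}\sum_{i=1}^{n}
      \bigl(\hat I_\theta(t_i) - I_{t_i}\bigr)^2}_{\text{data loss}}
  + \lambda\,
    \underbrace{\frac{1}{m}\sum_{j=1}^{m}
      \Bigl[r_S(\tau_j)^2 + r_I(\tau_j)^2\Bigr]}_{\text{physics loss}},
\qquad
r_S = \frac{d\hat S_\theta}{dt} + \beta\,\frac{\hat S_\theta \hat I_\theta}{N},
\quad
r_I = \frac{d\hat I_\theta}{dt} - \beta\,\frac{\hat S_\theta \hat I_\theta}{N}
      + \gamma\,\hat I_\theta
```

The residuals r_S and r_I vanish when the network outputs satisfy the S-I-R equations, so minimizing the second term pulls the fitted curves toward mechanistically plausible trajectories while the mechanistic parameters are estimated jointly with the network weights \theta.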

Here, we investigate the utility of a customized PINN model augmented by the reconstruction of the latent susceptible dynamics leveraging the TSIR model, referred to as TSIR-PINN (see Materials and methods for full model specification). We compare the added benefit of our approach to a naive PINN model without such augmentation of the latent dynamics (referred to as Naive-PINN). We apply both models to London measles incidence data and assess a two-year-ahead forecast window, demonstrating that the Naive-PINN fails to make accurate predictions and parameter inference, while our TSIR-PINN model utilizing TSIR-reconstructed susceptible dynamics is able to capture and predict the transmission dynamics reasonably accurately (Fig 4). In particular, the TSIR-PINN model estimates an R₀ value (Fig 4) which is largely consistent with previous estimates [18]. The TSIR-PINN model also outperforms the Naive-PINN with respect to test-set Mean Absolute Error (MAE) and correlation (Table 1).

Fig 4. Test-set 52-step-ahead incidence predictions over time (A) and Inference of seasonal transmission rate (B) in London.

(A) TSIR-PINN test-set 52-step-ahead incidence predictions for London more closely match true incidence, when compared to those for Naive-PINN. (B) PINN parameter values are notably different between TSIR-PINN and Naive-PINN models over 2,500 epochs. The parameter v (black lines) corresponds to R₀. Convergence is rapidly achieved when fitting the TSIR-PINN model, while convergence is less clear for the Naive-PINN model. More importantly, the TSIR-PINN model estimates an R₀ of 26.8, which is broadly consistent with the literature, while the Naive-PINN estimates an R₀ of 5.7.

https://doi.org/10.1371/journal.pcbi.1012616.g004

Table 1. 52-biweek-ahead London measles incidence forecasting performance of TSIR-PINN and Naive-PINN measured by test-MAE and test-correlation.

TSIR-PINN outperforms Naive-PINN by both measures.

https://doi.org/10.1371/journal.pcbi.1012616.t001

Our results suggest that including the TSIR-reconstructed (latent) susceptible dynamics (in our TSIR-PINN) can improve parameter inference while maintaining the predictive capabilities of a PINN modeling framework. These results provide important insights into more rigorously incorporating partially observed epidemic data into a PINN model, which may facilitate future developments and applications of PINN-based epidemic models.

Discussion

Measles is one of the most well-documented infectious disease systems and is known for its complex spatio-temporal dynamics. Spatiotemporal dynamics of measles infection, driven by interplay between seasonal forcing and susceptible recruitment dynamics [17], range from simple limit cycles to chaos, with the domination of stochastic extinction in small, highly vaccinated populations [18, 31, 32]. As such, measles serves as an excellent test bed for developing modeling techniques aimed at understanding similar nonlinear infectious disease systems.

Flexible machine learning approaches hold much potential for forecasting measles dynamics. Deep neural networks in particular are known to be highly flexible in incorporating various types of data structures and capturing highly nonlinear relationships, and can efficiently handle large datasets and numerous features. However, interpretation and inference are often difficult due to high-dimensional model parameterizations and a lack of scientific knowledge integration. Two broad classes of methods are suitable for improving mechanistic interpretability of machine learning models for infectious disease dynamics: post-hoc explainability (XAI) methods, which conduct post-hoc analysis on model outputs to understand underlying drivers of predictions, and direct integration of mechanistic models or other scientific priors into machine learning models. Here we detail one example of each of these classes and demonstrate their effectiveness in accurately characterizing measles spatio-temporal dynamics while preserving high forecasting performance.

Our high-dimensional feed-forward SFNN overall performs well for all forecasting windows and the majority of cities. More noteworthy is its ability to outperform TSIR for the difficult forecasting scenarios of long forecast windows (ranging from six months to two years) and less populous towns with sparse, less regular outbreaks. Neural networks are known as a “black box” method, indicating that the way in which the model uses specific covariates to arrive at a forecast is not readily apparent from parameter inspection. This is the primary downside of employing such machine learning methods when compared to mechanistic and semi-mechanistic methods such as the TSIR, which provide ample opportunities for parameter inference and assessment in relation to scientific knowledge and hypotheses. To surmount this limitation, there is a collection of post-hoc methods that allow methodical interrogation of machine learning output.

Our application of one such method, the SHAP value XAI calculation [30], is able to provide insights into how our SFNN predictions are being driven by a combination of input variables that has a scientifically meaningful interpretation. Specifically, we show that our SFNN uncovers the mechanism by which outbreaks in large cities may influence measles transmission in smaller towns/cities. This is consistent with previously theorized mechanisms, which suggest a similar dynamic of hierarchical spread of infections from large cities to smaller towns [14].

While this post-hoc method is insightful and relatively straightforward to apply due to its lack of interference with model-training, we push the neural-network inferential capability further with a fully integrative PINN-based model (TSIR-PINN) that incorporates reconstructed latent susceptible dynamics from the seminal semi-mechanistic TSIR model. Previous work combining neural networks with compartmental models often requires separate ad-hoc steps for model estimation or prediction [33, 34]. Here, by fusing mechanistic compartmental models with the neural network in the loss function used during model training, we are able to jointly regularize all parameters with respect to the SIR constraints while conducting parameter inference and maintaining forecast performance. We show that by including the reconstructed latent susceptible population in our TSIR-PINN, both forecasting performance and parameter estimation are improved when compared to the Naive-PINN model (which does not utilize the augmented latent susceptible dynamics using TSIR). While PINN-based models have previously been applied to infectious disease data, our work is a step forward in terms of more rigorous inference of and integration with latent aspects of the transmission dynamics, which is crucial in enabling long-term forecast windows and mechanistic interpretability. Our results provide key insights into incorporating partially observed epidemic data into a PINN-based modeling framework, which may facilitate future developments and applications of PINN-based epidemic models. There are several potential future directions we can explore.

The PINN-based formulation introduced here provides solely point estimates of disease dynamic parameters, and one area ripe for further development is the incorporation of rigorous statistical uncertainty quantification into these methods that might enable probabilistic statements about both model parameters and predictive output. There is also potential for this work to be extended to other disease systems, incorporating mechanisms and assessing hypotheses that are specific to these areas of study. We demonstrate in this work that accurately augmenting the latent aspects of the underlying transmission dynamics enables the PINN to perform well in both prediction and estimation. The augmentation scheme employed here, the TSIR model, has been successfully applied to other disease systems, including HFMD [35], COVID-19 [36], and RSV [37], which would allow straightforward application of our TSIR-PINN method. It is worth noting that other latent dynamic augmentation schemes, such as methods incorporating Approximate Bayesian Computation [38], can be considered to enable models similar to TSIR-PINN. Furthermore, we have only applied the TSIR-PINN model to London measles incidence data. London measles incidence is arguably the “gold-standard” dataset for understanding and assessing measles dynamics due to having a strong mechanistic signal. Our work follows many previous studies that also focus exclusively or primarily on London when studying measles incidence data and dynamics [8, 9, 11, 13, 15, 18]. Thus London is a natural test bed to compare the performance of TSIR-PINN relative to the Naive-PINN, showing the added value of integrating a mechanistic model (TSIR) within the PINN architecture. That said, there is potential to extend the TSIR-PINN model with gravity model mechanisms that incorporate distance and population size into the unsupervised portion of the loss, which would allow us to include additional cities (or all cities, such as with the SFNN).
Also, we have focused on comparing the relative performance of the TSIR-PINN to the Naive-PINN, and thus there remains room to further explore model formulations that may result in even higher prediction accuracy, such as shared embeddings or explicitly spatial or temporal architectures such as Convolutional or Long Short-Term Memory Neural Networks (CNNs and LSTMs, respectively) [39, 40]. Finally, there is potential for application of machine learning methodology that, instead of imposing compartmental model structures a priori and inferring parameter values, focuses on hypothesis generation and model structure discovery in the context of infectious disease [41–44]. This could automate some of the fine-tuning of model structure required for these highly bespoke models and aid modellers at earlier stages in their research.

In summary, our results show that appropriately designed neural network-based models can outperform traditional mechanistic models in forecasting, while simultaneously providing mechanistic interpretability. Our work also offers valuable insights into more effectively integrating machine learning models with mechanistic models to enhance public health responses to measles and similar infectious disease systems.

Materials and methods

Study data

We train and assess our models on biweekly measles incidence counts across 1,452 cities/towns in England & Wales during the pre-vaccination period from 1944 to 1965 (Fig 5). Separate models are fitted for different k-step ahead forecasts, ranging from 1 to 52 biweekly time steps ahead.

Fig 5. Measles cases in England and Wales.

(A) Cities/towns are colored by log measles incidence on the first biweek of 1961. The England and Wales map is made with Natural Earth vector map data. (B) The seasonal measles trend is apparent across the four most populous cities in England and Wales from 1944 to 1965.

https://doi.org/10.1371/journal.pcbi.1012616.g005

Feed-forward spatial feature neural network model (SFNN)

For each k-step ahead we fit a separate feed-forward spatial feature neural network (SFNN) with 1–3 hidden linear layers of dimension 240–1201, linear input/output layers, and ReLU [45] activation functions (Fig 1), where the number of hidden layers and hidden layer dimension differ by forecasting window (S2 Table).
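
The layer layout described above can be sketched as a forward pass in plain Python. This is a dependency-free toy, not the study's PyTorch model: it assumes the input layer maps the features to the hidden dimension, each hidden layer keeps that dimension, ReLU follows every layer except the output, and the weights are random placeholders. The names `build_sfnn` and `forward` are hypothetical.

```python
import random


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    """Affine map; `weights` holds one row of coefficients per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights and zero biases for one linear layer."""
    scale = (1.0 / n_in) ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_sfnn(n_features, hidden_dim, n_hidden, seed=0):
    """Input layer, `n_hidden` hidden layers, and an output layer of dimension 1."""
    rng = random.Random(seed)
    dims = [n_features] + [hidden_dim] * (n_hidden + 1) + [1]
    return [init_layer(a, b, rng) for a, b in zip(dims, dims[1:])]


def forward(layers, features):
    """Apply ReLU after every layer except the final output layer."""
    v = features
    for weights, bias in layers[:-1]:
        v = relu(linear(weights, bias, v))
    weights, bias = layers[-1]
    return linear(weights, bias, v)[0]


# Toy sizes only; the study uses hidden dimensions between 240 and 1201.
toy_network = build_sfnn(n_features=6, hidden_dim=8, n_hidden=2)
toy_forecast = forward(toy_network, [0.1] * 6)
```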

We include a range of features, including birth counts, population size, lagged incidence counts, and lagged incidence counts and distances for the seven cities with a population higher than the critical community size of 300,000, which have previously been identified as “core cities” that drive epidemics in connected cities/towns [28]. We also incorporate spatial features, including the lagged incidence counts of the nearest ten cities and their distances. This potentially enables the neural network to learn spatial dynamics that the TSIR model does not capture. Birth and population features are from the nearest time step less than or equal to t − k while still sharing the same biweek of the year. Lagged features range from t − k to t − T_lag, where t is the target time step and T_lag ranges from 26 to 130, depending on the forecasting window (S2 Table). Neural networks are fitted in PyTorch [46] using the Adam [47] optimizer with Mean Squared Error (MSE) loss, and are trained on incidence data ranging from 1949 to 1961 with incidence data ranging from 1961 to 1965 held out for testing. Hyperparameters (including T_lag, number of hidden layers, hidden layer dimension, and Adam optimizer weight decay) are chosen using grid search with the Ray Tune [48] Python library. This procedure selects the optimal combination of hyperparameters based on minimum test-MSE after 10 epochs for all combinations of T_lag ∈ {26, 52, 78, 104, 130}, hidden dimension ∈ {240, 721, 1201}, number of hidden layers ∈ {1, 2, 3}, and weight decay ∈ {0.0001, …, 0.1}, for each k forecasting window (S2 Table).
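
The lag window described above (features from t − T_lag through t − k for a target at time t) can be made concrete with a small sketch. It handles a single incidence series and ignores the spatial, birth, and population features; the helper names `lag_window` and `make_examples` are hypothetical.

```python
def lag_window(series, t, k, t_lag):
    """Values at time steps t - t_lag, ..., t - k (inclusive), oldest first."""
    if not 1 <= k <= t_lag:
        raise ValueError("need 1 <= k <= t_lag")
    if t - t_lag < 0 or t >= len(series):
        raise IndexError("window or target falls outside the series")
    return series[t - t_lag : t - k + 1]


def make_examples(series, k, t_lag):
    """Pair every available lag window with its k-step-ahead target series[t]."""
    return [
        (lag_window(series, t, k, t_lag), series[t])
        for t in range(t_lag, len(series))
    ]


# Toy series: with k = 2 and T_lag = 5, the target at t = 10 sees steps 5..8.
toy_series = list(range(20))
toy_features = lag_window(toy_series, t=10, k=2, t_lag=5)
```

Because the most recent k − 1 observations are excluded from the window, each feature vector contains only information that would be available k biweeks before the target, which is what makes the comparison across forecasting windows fair.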

Time-series SIR (TSIR) model

We compare the neural networks to the TSIR (time-series susceptible-infected-recovered) model, a popular semi-mechanistic technique that approximates the continuous-time SIR model and has been shown to accurately capture the dynamics of measles outbreaks in major cities [13]. TSIR provides a computationally inexpensive and highly tractable alternative to the classic SIR compartmental model and, in its standard form [13, 26], is described by the following equations:

$$I_{t+1} = \frac{\beta_{t+1} S_t I_t^{\alpha}}{N_t} \tag{1}$$

$$S_{t+1} = S_t + B_t - I_{t+1} \tag{2}$$

where S_t is reconstructed as S_t = S̄ + Z_t at each time step, with S̄ the average number of susceptible individuals in the population. Z_t is estimated from Eq 2 by regressing the cumulative births against the cumulative incidence as follows,

$$\sum_{k=0}^{t-1} B_k = \frac{1}{\rho} \sum_{k=1}^{t} I_k + Z_t - Z_0 \tag{3}$$

where ρ is the reporting rate of the observed incidence, and the log-linearized Eq 1.

For each k-step-ahead window, target time step t, and city, a separate TSIR model is fit on time steps t − 130 to t − k. One-step-ahead forecasts are then made recursively with Eqs 1 and 2 until the forecast for time t is reached. We employ the tsiR R package for TSIR model fitting, and refer the reader to the package documentation for details not specified here [26].
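The recursive forecasting step can be sketched as follows. This is a deterministic illustration only, assuming the standard TSIR updates I_{t+1} = β_{t+1} S_t I_t^α / N_t and S_{t+1} = S_t + B_t − I_{t+1} from [13, 26] and 26 seasonal transmission rates. It is not the tsiR package's implementation, and the function name `tsir_forecast` and its arguments are hypothetical.

```python
def tsir_forecast(i_last, s_last, births, pops, beta, alpha, start_biweek, k):
    """Iterate deterministic TSIR updates k steps ahead.

    i_last, s_last: incidence and susceptibles at the last fitted step.
    births[j], pops[j]: B and N at the j-th step of the forecast horizon.
    beta: 26 seasonal transmission rates indexed by biweek of the year.
    start_biweek: biweek-of-year index (0-25) of the last fitted step.
    """
    i, s = float(i_last), float(s_last)
    forecasts = []
    for j in range(k):
        b = beta[(start_biweek + j + 1) % 26]
        i_next = b * s * i ** alpha / pops[j]  # expected new infections
        s = s + births[j] - i_next             # susceptible balance
        i = i_next
        forecasts.append(i)
    return forecasts


# Toy numbers, chosen only to make the arithmetic easy to follow.
path = tsir_forecast(
    i_last=10, s_last=100, births=[5, 5], pops=[1000, 1000],
    beta=[20.0] * 26, alpha=1.0, start_biweek=0, k=2,
)
```

Each forecast feeds the next step, so errors in early steps propagate through the horizon. That is one reason a long-window recursive forecast can degrade faster than a model fitted directly for that window.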

Neural network interpretability methods (SHAP)

We use the SHAP (SHapley Additive exPlanations) method [30] to assess neural network feature importance, specifically relying on sampling-based approximation methods [49, 50] from the Captum [51] Python library. SHAP values are estimated by randomly permuting (input) feature groups, calculating the change in model output due to a particular permutation and finally averaging across all permutations. Features are grouped according to lag type; that is, incidence lags are grouped together, high-population-city incidence lags are grouped together, etc.

In our analysis, we first estimate the normalized absolute SHAP value associated with a particular feature group for each observation. We then, within a particular city or town, calculate the average of all the normalized absolute SHAP values associated with a particular feature group of interest, across all the observations of that city or town. Together these provide a measure of the relative importance of a particular feature group for the predictions made for a city/town.
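A minimal pure-Python sketch of permutation-sampling Shapley estimation over feature groups is given below. It does not use Captum. The baseline input, the normalization (dividing by the sum of absolute group values within an observation), and all function names are assumptions made for illustration.

```python
import random


def grouped_shapley(model, x, baseline, groups, n_perm=200, seed=0):
    """Estimate Shapley values per feature group by sampling group orderings.

    groups maps a group name to the list of feature indices it contains.
    """
    rng = random.Random(seed)
    names = list(groups)
    totals = {g: 0.0 for g in names}
    for _ in range(n_perm):
        order = names[:]
        rng.shuffle(order)
        current = list(baseline)
        prev = model(current)
        for g in order:
            for j in groups[g]:  # switch the whole group to its observed values
                current[j] = x[j]
            out = model(current)
            totals[g] += out - prev  # marginal contribution of this group
            prev = out
    return {g: totals[g] / n_perm for g in names}


def normalized_abs(values):
    """Absolute group values rescaled to sum to one within an observation."""
    total = sum(abs(v) for v in values.values())
    return {g: (abs(v) / total if total else 0.0) for g, v in values.items()}


def mean_importance(per_observation):
    """Average normalized importances over the observations of one city."""
    names = per_observation[0].keys()
    n = len(per_observation)
    return {g: sum(obs[g] for obs in per_observation) / n for g in names}


# Toy model with two feature groups ("own_lags", "core_city_lags" are made up).
toy_model = lambda v: 2 * v[0] + 3 * v[1] - v[2] * v[3]
toy_groups = {"own_lags": [0, 1], "core_city_lags": [2, 3]}
phi = grouped_shapley(toy_model, [1, 1, 2, 2], [0, 0, 0, 0], toy_groups)
share = normalized_abs(phi)
city = mean_importance([share, share])
```

Because the toy model is additive across the two groups, every ordering gives the same marginal contributions. For a real network the estimate depends on the number of sampled permutations.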

Physics-Informed-Neural-Network model

The neural network architecture for the PINN models differs from that of the previously described neural network because the compartmental S-I-R equations and parameters are incorporated in the model’s loss function. We start with a feed-forward neural network with 2 hidden layers of dimension 128, a linear input layer, and a two-dimensional output layer for the TSIR-reconstructed susceptibles (S^TSIR) and observed incidence (I). GELU [52] activation functions are used on the hidden layers and a softplus activation function is used on the output layer. Features include time, lagged incidence counts, and lagged TSIR-reconstructed susceptibles, with the time feature transformed with Gaussian random Fourier feature mappings [53]. Neural networks are again fit in PyTorch using the Adam optimizer with the same train/test split as previously, though here we employ a Mean Absolute Error (MAE) loss composed of the following components: (4) where λ_FF and λ_PINN are tunable hyperparameters, and (5) (6) where and are the FF predictions at time t.

Here and are the output of the following compartmental SIR equations at time t: (7) (8) where B_t is the number of births at time t, β_t is a seasonal transmission rate at time t, γ is the recovery rate, N_t is the population at time t, and and are the approximations of the relevant gradients, which are calculated at each epoch using autograd in PyTorch [46].

We parameterize β_t as follows: (9) where Eq 9 implies a seasonal transmission rate with three free parameters: ν, α₁, and α₂. ν is the baseline transmission rate, while α₁ and α₂ are seasonal parameters controlling sinusoidal annual fluctuations.

We assume γ = 1 due to the measles recovery period being approximately equal to the biweekly scale of the data [54], thus the parameters employed in β_t are the sole learnable parameters for this MAE_PINN component of the loss. By matching to , we are providing an unsupervised soft constraint on the neural network to adhere to compartmental equation dynamics and vice versa.
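One plausible way to combine a data term and a compartmental term in a single weighted MAE loss is sketched below. Everything specific in it is an assumption rather than the authors' formulation: the sinusoidal form of β_t, the residual-style physics term (network gradients compared with SIR right-hand sides), and all function names. In the paper the gradients come from PyTorch autograd. Here they are passed in as plain numbers so that the sketch runs with the standard library alone.

```python
import math

BIWEEKS_PER_YEAR = 26  # assumption: biweekly time steps


def seasonal_beta(t, nu, alpha1, alpha2):
    """Assumed seasonal form: baseline nu with one annual sine and cosine term."""
    phase = 2.0 * math.pi * t / BIWEEKS_PER_YEAR
    return nu * (1.0 + alpha1 * math.sin(phase) + alpha2 * math.cos(phase))


def sir_rhs(s, i, births, pop, beta, gamma=1.0):
    """Right-hand sides (dS/dt, dI/dt) of an SIR model with births."""
    new_infections = beta * s * i / pop
    return births - new_infections, new_infections - gamma * i


def pinn_style_loss(t_idx, s_ff, i_ff, ds_dt, di_dt, s_tsir, i_obs, births, pop,
                    nu, alpha1, alpha2, lam_ff=1.0, lam_pinn=1.0, gamma=1.0):
    """lam_ff * (data MAE) + lam_pinn * (MAE of the SIR residuals)."""
    n = len(t_idx)
    data = (sum(abs(a - b) for a, b in zip(s_ff, s_tsir))
            + sum(abs(a - b) for a, b in zip(i_ff, i_obs))) / n
    physics = 0.0
    for j, t in enumerate(t_idx):
        beta = seasonal_beta(t, nu, alpha1, alpha2)
        rhs_s, rhs_i = sir_rhs(s_ff[j], i_ff[j], births[j], pop[j], beta, gamma)
        physics += abs(ds_dt[j] - rhs_s) + abs(di_dt[j] - rhs_i)
    return lam_ff * data + lam_pinn * physics / n


# Toy check: gradients that satisfy the SIR equations exactly give zero loss.
ts, s, i = [0, 1], [100.0, 90.0], [10.0, 12.0]
b, n_pop = [5.0, 5.0], [1000.0, 1000.0]
rhs = [sir_rhs(s[j], i[j], b[j], n_pop[j], seasonal_beta(ts[j], 20.0, 0.1, 0.0))
       for j in range(2)]
ds, di = [r[0] for r in rhs], [r[1] for r in rhs]
zero_loss = pinn_style_loss(ts, s, i, ds, di, s, i, b, n_pop, 20.0, 0.1, 0.0)
data_only = pinn_style_loss(ts, s, i, ds, di, s, [11.0, 13.0], b, n_pop,
                            20.0, 0.1, 0.0, lam_ff=2.0)
```

With γ fixed at 1, only ν, α₁ and α₂ enter the compartmental term. This mirrors the statement that the parameters of β_t are the sole learnable parameters of that component.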

One explanation for why incorporating the TSIR-reconstructed susceptibles in the PINN improves prediction and estimation is as follows. The predicted incidence, I_t, determines (part of) the loss, and the incidence is structurally related to the susceptible dynamics of measles (see Eqs 1 and 2). As such, incorporating the TSIR-reconstructed susceptibles imposes more reasonable constraints on the (predicted) incidence, ultimately resulting in more stable fitting and improved predictions. To assess this impact, we also fit versions of the above models with naively constrained latent susceptibles, such that all S^TSIR components are replaced with S^Naive, which are fit as unconstrained parameters. To aid model stability, we fit the TSIR-PINN and Naive-PINN models 100 times each and take the final predictions as the mean of the predictions over all model runs.

Supporting information

S1 Fig. London SFNN bifurcation assessment.

Our SFNN model, trained on the limited data prior to 1948, predicts the change of seasonality (i.e., the annual-to-biennial bifurcation in the late 1940s) in London for steps-ahead ranging from 1–4. Note that, due to the lack of training data in this case, our SFNN does not in general perform well in capturing the magnitude of the incidence.

https://doi.org/10.1371/journal.pcbi.1012616.s001

(TIFF)

S1 Table. Forecasting year train/test cutoff sensitivity analysis.

The average within-city standardized test-set RMSE for TSIR and SFNN for all combinations of k forecasting windows ∈ {1, 4, 12, 20, 34, 52} and train/test cutoff years ∈ {1960, 1961, 1962} demonstrates that the improvement of the SFNN model over the TSIR is stable across train/test cutoff points.

https://doi.org/10.1371/journal.pcbi.1012616.s002

(PDF)

S2 Table. Optimal SFNN hyperparameters.

Tuned hyperparameter values, as determined by Ray Tune grid search, indicate the optimal number of incidence feature time lags, hidden dimension, and weight decay value, for each forecasting window.

https://doi.org/10.1371/journal.pcbi.1012616.s003

(PDF)

References

1. Rustam F, Reshi AA, Mehmood A, Ullah S, On BW, Aslam W, et al. COVID-19 Future Forecasting Using Supervised Machine Learning Models. IEEE Access. 2020;8:101489–101499.
2. Du H, Dong E, Badr H, Petrone M, Grubaugh N, Gardner L. Incorporating variant frequencies data into short-term forecasting for COVID-19 cases and deaths in the USA: a deep learning approach. EBioMedicine. 2023;89:104482. pmid:36821889
3. Rodriguez A, Tabassum A, Cui J, Xie J, Ho J, Agarwal P, et al. DeepCOVID: An Operational Deep Learning-driven Framework for Explainable Real-time COVID-19 Forecasting. Proceedings of the AAAI Conference on Artificial Intelligence. 2021;35:15393–15400.
4. Temenos A, Tzortzis IN, Kaselimi M, Rallis I, Doulamis A, Doulamis N. Novel Insights in Spatial Epidemiology Utilizing Explainable AI (XAI) and Remote Sensing. Remote Sensing. 2022;14(13).
5. Arik S, Li CL, Yoon J, Sinha R, Epshteyn A, Le L, et al. Interpretable Sequence Learning for Covid-19 Forecasting. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, editors. Advances in Neural Information Processing Systems. vol. 33. Curran Associates, Inc.; 2020. p. 18807–18818.
6. Rodríguez A, Cui J, Ramakrishnan N, Adhikari B, Prakash BA. EINNs: epidemiologically-informed neural networks. In: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence. AAAI’23/IAAI’23/EAAI’23. AAAI Press; 2023.
7. Nguyen DQ, Vo NQ, Nguyen TT, et al. BeCaked: An Explainable Artificial Intelligence Model for COVID-19 Forecasting. Sci Rep. 2022;12:7969. pmid:35562369
8. Brownlee J. An investigation into the periodicity of measles epidemics in London from 1703 to the present day by the method of the periodogram. Phil Trans R Soc Lond B. 1918;208:225–250.
9. Becker A, Zhou S, Wesolowski A, Grenfell B. Coexisting attractors in the context of cross-scale population dynamics: Measles in London as a case study. Proceedings Biological sciences. 2020;287:20191510. pmid:32315586
10. Finkenstädt B, Bjornstad O, Grenfell B. A stochastic model for extinction and recurrence of epidemics: Estimation and inference for measles outbreaks. Biostatistics (Oxford, England). 2003;3:493–510.
11. Bartlett MS. Measles Periodicity and Community Size. Journal of the Royal Statistical Society Series A (General). 1957;120(1):48–70.
12. Bolker BM, Grenfell BT. Impact of vaccination on the spatial correlation and persistence of measles dynamics. Proceedings of the National Academy of Sciences. 1996;93(22):12648–12653. pmid:8901637
13. Finkenstadt BF, Grenfell BT. Time Series Modelling of Childhood Diseases: A Dynamical Systems Approach. Journal of the Royal Statistical Society Series C (Applied Statistics). 2000;49(2):187–205.
14. Xia Y, Bjørnstad O, Grenfell B. Measles Metapopulation Dynamics: A Gravity Model for Epidemiological Coupling and Dynamics. The American Naturalist. 2004;164(2):267–281. pmid:15278849
15. Lau MSY, Becker A, Madden W, Waller LA, Metcalf CJE, Grenfell BT. Comparing and linking machine learning and semi-mechanistic models for the predictability of endemic measles dynamics. PLOS Computational Biology. 2022;18(9):1–14. pmid:36074763
16. Grenfell B, Bjørnstad O, Kappey J. Travelling waves and spatial hierarchies in measles epidemics. Nature. 2001;414:716–723. pmid:11742391
17. Grenfell BT, Bjørnstad ON, Finkenstädt BF. Dynamics of Measles Epidemics: Scaling Noise, Determinism, and Predictability with the TSIR Model. Ecological Monographs. 2002;72(2):185–202.
18. Becker AD, Wesolowski A, Bjørnstad ON, Grenfell BT. Long-term dynamics of measles in London: Titrating the impact of wars, the 1918 pandemic, and vaccination. PLOS Computational Biology. 2019;15(9):1–14. pmid:31513578
19. Endo A, van Leeuwen E, Baguelin M. Introduction to particle Markov-chain Monte Carlo for disease dynamics modellers. Epidemics. 2019;29:100363. pmid:31587877
20. Raissi M, Perdikaris P, Karniadakis GE. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics. 2019;378:686–707.
21. Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L. Physics-informed machine learning. Nature Reviews Physics. 2021;3(6):422–440.
22. Shaier S, Raissi M, Seshaiyer P. Data-driven approaches for predicting spread of infectious diseases through DINNs: Disease Informed Neural Networks; 2022.
23. Berkhahn S, Ehrhardt M. A physics-informed neural network to model COVID-19 infection and hospitalization scenarios. Adv Contin Discret Model. 2022;2022(1):61. pmid:36320680
24. Ning X, Guan J, Li X, Wei Y, Chen F. Physics-Informed Neural Networks Integrating Compartmental Model for Analyzing COVID-19 Transmission Dynamics. Viruses. 2023;15(8):1749. pmid:37632091
25. Lau M, Becker A, Korevaar H, Caudron Q, Shaw D, Metcalf CJ, et al. A competing-risks model explains hierarchical spatial coupling of measles epidemics en route to national elimination. Nature Ecology & Evolution. 2020;4:1–6. pmid:32341514
26. Becker A, Grenfell B. tsiR: An R package for time-series Susceptible-Infected-Recovered models of epidemics. PLoS One. 2017;12(9):e0185528. pmid:28957408
27. Jandarov R, Haran M, Bjørnstad O, Grenfell B. Emulating a Gravity Model to Infer the Spatiotemporal Dynamics of an Infectious Disease. Journal of the Royal Statistical Society Series C: Applied Statistics. 2013;63(3):423–444.
28. Keeling MJ, Grenfell BT. Disease Extinction and Community Size: Modeling the Persistence of Measles. Science. 1997;275(5296):65–67. pmid:8974392
29. Bharti N, Xia Y, Bjornstad ON, Grenfell BT. Measles on the Edge: Coastal Heterogeneities and Infection Dynamics. PLOS ONE. 2008;3(4):1–7. pmid:18398467
30. Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17. Red Hook, NY, USA: Curran Associates Inc.; 2017. p. 4768–4777.
31. Ferrari MJ, Grais RF, Bharti N, Conlan AJK, Bjørnstad ON, Wolfson LJ, et al. The dynamics of measles in sub-Saharan Africa. Nature. 2008;451(7179):679–684. pmid:18256664
32. Dalziel BD, Bjørnstad ON, van Panhuis WG, Burke DS, Metcalf CJE, Grenfell BT. Persistent Chaos of Measles Epidemics in the Prevaccination United States Caused by a Small Change in Seasonal Transmission Patterns. PLOS Computational Biology. 2016;12(2):1–12. pmid:26845437
33. Bousquet A, Conrad WH, Sadat SO, et al. Deep learning forecasting using time-varying parameters of the SIRD model for Covid-19. Scientific Reports. 2022;12:3030. pmid:35194090
34. Nadler P, Arcucci R, Guo Y. A Neural SIR Model for Global Forecasting. In: Alsentzer E, McDermott MBA, Falck F, Sarkar SK, Roy S, Hyland SL, editors. Proceedings of the Machine Learning for Health NeurIPS Workshop. vol. 136 of Proceedings of Machine Learning Research. PMLR; 2020. p. 254–266. Available from: https://proceedings.mlr.press/v136/nadler20a.html.
35. Takahashi S, Liao Q, Van Boeckel TP, Xing W, Sun J, Hsiao VY, et al. Hand, Foot, and Mouth Disease in China: Modeling Epidemic Dynamics of Enterovirus Serotypes and Implications for Vaccination. PLoS Medicine. 2016;13(2):e1001958. pmid:26882540
36. Baker RE, Park SW, Yang W, Vecchi GA, Metcalf CJE, Grenfell BT. The impact of COVID-19 nonpharmaceutical interventions on the future dynamics of endemic infections. Proceedings of the National Academy of Sciences. 2020;117(48):30547–30553. pmid:33168723
37. Wambua J, Munywoki PK, Coletti P, Nyawanda BO, Murunga N, Nokes DJ, et al. Drivers of respiratory syncytial virus seasonal epidemics in children under 5 years in Kilifi, coastal Kenya. PLOS ONE. 2022;17(11):1–13. pmid:36441757
38. Minter A, Retkute R. Approximate Bayesian Computation for infectious disease modelling. Epidemics. 2019;29:100368. pmid:31563466
39. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–444. pmid:26017442
40. Hochreiter S, Schmidhuber J. Long Short-Term Memory. Neural Computation. 1997;9(8):1735–1780. pmid:9377276
41. Hao Z, Liu S, Zhang Y, Ying C, Feng Y, Su H, et al. Physics-Informed Machine Learning: A Survey on Problems, Methods and Applications; 2023.
42. Brunton SL, Proctor JL, Kutz JN. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences. 2016;113(15):3932–3937. pmid:27035946
43. Schmidt M, Lipson H. Distilling Free-Form Natural Laws from Experimental Data. Science. 2009;324(5923):81–85. pmid:19342586
44. Chen Z, Liu Y, Sun H. Physics-informed learning of governing equations from scarce data. Nature Communications. 2021;12:6136. pmid:34675223
45. Nair V, Hinton GE. Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th International Conference on International Conference on Machine Learning. ICML’10. Madison, WI, USA: Omnipress; 2010. p. 807–814.
46. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Red Hook, NY, USA: Curran Associates Inc.; 2019.
47. Kingma D, Ba J. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations. 2014.
48. Liaw R, Liang E, Nishihara R, Moritz P, Gonzalez JE, Stoica I. Tune: A Research Platform for Distributed Model Selection and Training. arXiv preprint arXiv:1807.05118. 2018.
49. Štrumbelj E, Kononenko I. An Efficient Explanation of Individual Classifications using Game Theory. J Mach Learn Res. 2010;11:1–18.
50. Castro J, Gómez D, Tejada J. Polynomial calculation of the Shapley value based on sampling. Computers & Operations Research. 2009;36(5):1726–1730.
51. Kokhlikyan N, Miglani V, Martin M, Wang E, Alsallakh B, Reynolds J, et al. Captum: A unified and generic model interpretability library for PyTorch; 2020.
52. Hendrycks D, Gimpel K. Gaussian Error Linear Units (GELUs); 2023.
53. Tancik M, Srinivasan P, Mildenhall B, Fridovich-Keil S, Raghavan N, Singhal U, et al. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, editors. Advances in Neural Information Processing Systems. vol. 33. Curran Associates, Inc.; 2020. p. 7537–7547.
54. Black FL. Measles. In: Evans AS, editor. Viral Infections of Humans: Epidemiology and Control. New York: Plenum; 1984. p. 397–418.

[ref1] 1. Rustam F, Reshi AA, Mehmood A, Ullah S, On BW, Aslam W, et al. COVID-19 Future Forecasting Using Supervised Machine Learning Models. IEEE Access. 2020;8:101489–101499.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Du H, Dong E, Badr H, Petrone M, Grubaugh N, Gardner L. Incorporating variant frequencies data into short-term forecasting for COVID-19 cases and deaths in the USA: a deep learning approach. EBioMedicine. 2023;89:104482. pmid:36821889
View Article
PubMed/NCBI
Google Scholar

[5] View Article

[6] PubMed/NCBI

[7] Google Scholar

[ref3] 3. Rodriguez A, Tabassum A, Cui J, Xie J, Ho J, Agarwal P, et al. DeepCOVID: An Operational Deep Learning-driven Framework for Explainable Real-time COVID-19 Forecasting. Proceedings of the AAAI Conference on Artificial Intelligence. 2021;35:15393–15400.

[ref4] 4. Temenos A, Tzortzis IN, Kaselimi M, Rallis I, Doulamis A, Doulamis N. Novel Insights in Spatial Epidemiology Utilizing Explainable AI (XAI) and Remote Sensing. Remote Sensing. 2022;14(13).
View Article
Google Scholar

[10] View Article

[11] Google Scholar

[ref5] 5. Arik S, Li CL, Yoon J, Sinha R, Epshteyn A, Le L, et al. Interpretable Sequence Learning for Covid-19 Forecasting. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, editors. Advances in Neural Information Processing Systems. vol. 33. Curran Associates, Inc.; 2020. p. 18807–18818.

[ref6] 6. Rodríguez A, Cui J, Ramakrishnan N, Adhikari B, Prakash BA. EINNs: epidemiologically-informed neural networks. In: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence. AAAI’23/IAAI’23/EAAI’23. AAAI Press; 2023.

[ref7] 7. Nguyen DQ, Vo NQ, Nguyen TT, et al. BeCaked: An Explainable Artificial Intelligence Model for COVID-19 Forecasting. Sci Rep. 2022;12:7969. pmid:35562369
View Article
PubMed/NCBI
Google Scholar

[15] View Article

[16] PubMed/NCBI

[17] Google Scholar

[ref8] 8. Brownlee J. An investigation into the periodicity of measles epidemics in London from 1703 to the present day by the method of the periodogram. Phil Trans R Soc Lond B. 1918;208:225–250.
View Article
Google Scholar

[19] View Article

[20] Google Scholar

[ref9] 9. Becker A, Zhou S, Wesolowski A, Grenfell B. Coexisting attractors in the context of cross-scale population dynamics: Measles in London as a case study. Proceedings Biological sciences. 2020;287:20191510. pmid:32315586
View Article
PubMed/NCBI
Google Scholar

[22] View Article

[23] PubMed/NCBI

[24] Google Scholar

[ref10] 10. Finkenstädt B, Bjornstad O, Grenfell B. A stochastic model for extinction and recurrence of epidemics: Estimation and inference for measles outbreaks. Biostatistics (Oxford, England). 2003;3:493–510.
View Article
Google Scholar

[26] View Article

[27] Google Scholar

[ref11] 11. Bartlett MS. Measles Periodicity and Community Size. Journal of the Royal Statistical Society Series A (General). 1957;120(1):48–70.
View Article
Google Scholar

[29] View Article

[30] Google Scholar

[ref12] 12. Bolker BM, Grenfell BT. Impact of vaccination on the spatial correlation and persistence of measles dynamics. Proceedings of the National Academy of Sciences. 1996;93(22):12648–12653. pmid:8901637
View Article
PubMed/NCBI
Google Scholar

[32] View Article

[33] PubMed/NCBI

[34] Google Scholar

[ref13] 13. Finkenstadt BF, Grenfell BT. Time Series Modelling of Childhood Diseases: A Dynamical Systems Approach. Journal of the Royal Statistical Society Series C (Applied Statistics). 2000;49(2):187–205.
View Article
Google Scholar

[36] View Article

[37] Google Scholar

[ref14] 14. Xia Y, Bjørnstad O, Grenfell B, DeAngelis AEDL. Measles Metapopulation Dynamics: A Gravity Model for Epidemiological Coupling and Dynamics. The American Naturalist. 2004;164(2):267–281. pmid:15278849
View Article
PubMed/NCBI
Google Scholar

[39] View Article

[40] PubMed/NCBI

[41] Google Scholar

[ref15] 15. Lau MSY, Becker A, Madden W, Waller LA, Metcalf CJE, Grenfell BT. Comparing and linking machine learning and semi-mechanistic models for the predictability of endemic measles dynamics. PLOS Computational Biology. 2022;18(9):1–14. pmid:36074763
View Article
PubMed/NCBI
Google Scholar

[43] View Article

[44] PubMed/NCBI

[45] Google Scholar

[ref16] 16. Grenfell B, Bjørnstad O, Kappey J. Travelling waves and spatial hierarchies in measles epidemics. Nature. 2001;414:716–723. pmid:11742391
View Article
PubMed/NCBI
Google Scholar

[47] View Article

[48] PubMed/NCBI

[49] Google Scholar

[ref17] 17. Grenfell BT, Bjørnstad ON, Finkenstädt BF. Dynamics of Measles Epidemics: Scaling Noise, Determinism, and Predictability with the TSIR Model. Ecological Monographs. 2002;72(2):185–202.
View Article
Google Scholar

[51] View Article

[52] Google Scholar

[ref18] 18. Becker AD, Wesolowski A, Bjørnstad ON, Grenfell BT. Long-term dynamics of measles in London: Titrating the impact of wars, the 1918 pandemic, and vaccination. PLOS Computational Biology. 2019;15(9):1–14. pmid:31513578
View Article
PubMed/NCBI
Google Scholar

[54] View Article

[55] PubMed/NCBI

[56] Google Scholar

[ref19] 19. Endo A, van Leeuwen E, Baguelin M. Introduction to particle Markov-chain Monte Carlo for disease dynamics modellers. Epidemics. 2019;29:100363. pmid:31587877
View Article
PubMed/NCBI
Google Scholar

[58] View Article

[59] PubMed/NCBI

[60] Google Scholar

[ref20] 20. Raissi M, Perdikaris P, Karniadakis GE. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics. 2019;378:686–707.
View Article
Google Scholar

[62] View Article

[63] Google Scholar

[ref21] 21. Karniadakis GE, Kevrekidis IG, Lu L, Perdikaris P, Wang S, Yang L. Physics-informed machine learning. Nature Reviews Physics. 2021;3(6):422–440.
View Article
Google Scholar

[65] View Article

[66] Google Scholar

[ref22] 22. Shaier S, Raissi M, Seshaiyer P. Data-driven approaches for predicting spread of infectious diseases through DINNs: Disease Informed Neural Networks; 2022.
View Article
Google Scholar

[68] View Article

[69] Google Scholar

[ref23] 23. Berkhahn S, Ehrhardt M. A physics-informed neural network to model COVID-19 infection and hospitalization scenarios. Adv Contin Discret Model. 2022;2022(1):61. pmid:36320680
View Article
PubMed/NCBI
Google Scholar

[71] View Article

[72] PubMed/NCBI

[73] Google Scholar

[ref24] 24. Ning X, Guan J, Li X, Wei Y, Chen F. Physics-Informed Neural Networks Integrating Compartmental Model for Analyzing COVID-19 Transmission Dynamics. Viruses. 2023;15(8):1749. pmid:37632091
View Article
PubMed/NCBI
Google Scholar

[75] View Article

[76] PubMed/NCBI

[77] Google Scholar

[ref25] 25. Lau M, Becker A, Korevaar H, Caudron Q, Shaw D, Metcalf CJ, et al. A competing-risks model explains hierarchical spatial coupling of measles epidemics en route to national elimination. Nature Ecology & Evolution. 2020;4:1–6. pmid:32341514
View Article
PubMed/NCBI
Google Scholar

[79] View Article

[80] PubMed/NCBI

[81] Google Scholar

[ref26] 26. Becker A, Grenfell B. tsiR: An R package for time-series Susceptible-Infected-Recovered models of epidemics. PLoS One. 2017;12(9):e0185528. pmid:28957408
View Article
PubMed/NCBI
Google Scholar

[83] View Article

[84] PubMed/NCBI

[85] Google Scholar

[ref27] 27. Jandarov R, Haran M, Bjørnstad O, Grenfell B. Emulating a Gravity Model to Infer the Spatiotemporal Dynamics of an Infectious Disease. Journal of the Royal Statistical Society Series C: Applied Statistics. 2013;63(3):423–444.
View Article
Google Scholar

[87] View Article

[88] Google Scholar

[ref28] 28. Keeling MJ, Grenfell BT. Disease Extinction and Community Size: Modeling the Persistence of Measles. Science. 1997;275(5296):65–67. pmid:8974392
View Article
PubMed/NCBI
Google Scholar

[90] View Article

[91] PubMed/NCBI

[92] Google Scholar

[ref29] 29. Bharti N, Xia Y, Bjornstad ON, Grenfell BT. Measles on the Edge: Coastal Heterogeneities and Infection Dynamics. PLOS ONE. 2008;3(4):1–7. pmid:18398467
View Article
PubMed/NCBI
Google Scholar

[94] View Article

[95] PubMed/NCBI

[96] Google Scholar

[ref30] 30. Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17. Red Hook, NY, USA: Curran Associates Inc.; 2017. p. 4768–4777.

[ref31] 31. Ferrari MJ, Grais RF, Bharti N, Conlan AJK, Bjørnstad ON, Wolfson LJ, et al. The dynamics of measles in sub-Saharan Africa. Nature. 2008;451(7179):679–684. pmid:18256664
View Article
PubMed/NCBI
Google Scholar

[99] View Article

[100] PubMed/NCBI

[101] Google Scholar

[ref32] 32. Dalziel BD, Bjørnstad ON, van Panhuis WG, Burke DS, Metcalf CJE, Grenfell BT. Persistent Chaos of Measles Epidemics in the Prevaccination United States Caused by a Small Change in Seasonal Transmission Patterns. PLOS Computational Biology. 2016;12(2):1–12. pmid:26845437
View Article
PubMed/NCBI
Google Scholar

[103] View Article

[104] PubMed/NCBI

[105] Google Scholar

[ref33] 33. Bousquet A, Conrad WH, Sadat SO, et al. Deep learning forecasting using time-varying parameters of the SIRD model for Covid-19. Scientific Reports. 2022;12:3030. pmid:35194090
View Article
PubMed/NCBI
Google Scholar

[107] View Article

[108] PubMed/NCBI

[109] Google Scholar

[ref34] 34. Nadler P, Arcucci R, Guo Y. A Neural SIR Model for Global Forecasting. In: Alsentzer E, McDermott MBA, Falck F, Sarkar SK, Roy S, Hyland SL, editors. Proceedings of the Machine Learning for Health NeurIPS Workshop. vol. 136 of Proceedings of Machine Learning Research. PMLR; 2020. p. 254–266. Available from: https://proceedings.mlr.press/v136/nadler20a.html.

[ref35] 35. Takahashi S, Liao Q, Van Boeckel TP, Xing W, Sun J, Hsiao VY, et al. Hand, Foot, and Mouth Disease in China: Modeling Epidemic Dynamics of Enterovirus Serotypes and Implications for Vaccination. PLoS Medicine. 2016;13(2):e1001958. pmid:26882540
View Article
PubMed/NCBI
Google Scholar

[112] View Article

[113] PubMed/NCBI

[114] Google Scholar

[ref36] 36. Baker RE, Park SW, Yang W, Vecchi GA, Metcalf CJE, Grenfell BT. The impact of COVID-19 nonpharmaceutical interventions on the future dynamics of endemic infections. Proceedings of the National Academy of Sciences. 2020;117(48):30547–30553. pmid:33168723
View Article
PubMed/NCBI
Google Scholar

[116] View Article

[117] PubMed/NCBI

[118] Google Scholar

[ref37] 37. Wambua J, Munywoki PK, Coletti P, Nyawanda BO, Murunga N, Nokes DJ, et al. Drivers of respiratory syncytial virus seasonal epidemics in children under 5 years in Kilifi, coastal Kenya. PLOS ONE. 2022;17(11):1–13. pmid:36441757
View Article
PubMed/NCBI
Google Scholar

[120] View Article

[121] PubMed/NCBI

[122] Google Scholar

[ref38] 38. Minter A, Retkute R. Approximate Bayesian Computation for infectious disease modelling. Epidemics. 2019;29:100368. pmid:31563466
View Article
PubMed/NCBI
Google Scholar

[124] View Article

[125] PubMed/NCBI

[126] Google Scholar

[ref39] 39. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–444. pmid:26017442
View Article
PubMed/NCBI
Google Scholar

[128] View Article

[129] PubMed/NCBI

[130] Google Scholar

[ref40] 40. Hochreiter S, Schmidhuber J. Long Short-Term Memory. Neural Computation. 1997;9(8):1735–1780. pmid:9377276
View Article
PubMed/NCBI
Google Scholar

[132] View Article

[133] PubMed/NCBI

[134] Google Scholar

[ref41] 41. Hao Z, Liu S, Zhang Y, Ying C, Feng Y, Su H, et al. Physics-Informed Machine Learning: A Survey on Problems, Methods and Applications; 2023.
View Article
Google Scholar

[136] View Article

[137] Google Scholar

[ref42] 42. Brunton SL, Proctor JL, Kutz JN. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences. 2016;113(15):3932–3937. pmid:27035946
View Article
PubMed/NCBI
Google Scholar

[139] View Article

[140] PubMed/NCBI

[141] Google Scholar

[ref43] 43. Schmidt M, Lipson H. Distilling Free-Form Natural Laws from Experimental Data. Science. 2009;324(5923):81–85. pmid:19342586
View Article
PubMed/NCBI
Google Scholar

[143] View Article

[144] PubMed/NCBI

[145] Google Scholar

[ref44] 44. Chen Z, Liu Y, Sun H. Physics-informed learning of governing equations from scarce data. Nature Communications. 2021;12:6136. pmid:34675223
View Article
PubMed/NCBI
Google Scholar

[147] View Article

[148] PubMed/NCBI

[149] Google Scholar

[ref45] 45. Nair V, Hinton GE. Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th International Conference on International Conference on Machine Learning. ICML’10. Madison, WI, USA: Omnipress; 2010. p. 807–814.

[ref46] 46. Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, et al. In: PyTorch: An Imperative Style, High-Performance Deep Learning Library. Red Hook, NY, USA: Curran Associates Inc.; 2019.

[ref47] 47. Kingma D, Ba J. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations. 2014;.

[ref48] 48. Liaw R, Liang E, Nishihara R, Moritz P, Gonzalez JE, Stoica I. Tune: A Research Platform for Distributed Model Selection and Training. arXiv preprint arXiv:180705118. 2018;.

[ref49] 49. Štrumbelj E, Kononenko I. An Efficient Explanation of Individual Classifications using Game Theory. J Mach Learn Res. 2010;11:1–18.

[ref50] 50. Castro J, Gómez D, Tejada J. Polynomial calculation of the Shapley value based on sampling. Computers & Operations Research. 2009;36(5):1726–1730.

[ref51] 51. Kokhlikyan N, Miglani V, Martin M, Wang E, Alsallakh B, Reynolds J, et al. Captum: A unified and generic model interpretability library for PyTorch; 2020.

[ref52] 52. Hendrycks D, Gimpel K. Gaussian Error Linear Units (GELUs); 2023.

[ref53] 53. Tancik M, Srinivasan P, Mildenhall B, Fridovich-Keil S, Raghavan N, Singhal U, et al. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. In: Larochelle H, Ranzato M, Hadsell R, Balcan MF, Lin H, editors. Advances in Neural Information Processing Systems. vol. 33. Curran Associates, Inc.; 2020. p. 7537–7547.

[ref54] 54. Black FL. Measles. In: Evans AS, editor. Viral Infections of Humans: Epidemiology and Control. New York: Plenum; 1984. p. 397–418.
