Enhancing epidemic forecasting with a physics-informed spatial identity neural network

Abstract

Forecasting the future number of confirmed cases in each region is a critical challenge in controlling the spread of infectious diseases. Accurate predictions enable the proactive development of optimal containment strategies. Recently, deep learning-based models have increasingly leveraged graph structures to capture the spatial dynamics of epidemic spread. While intuitive, this approach often increases model complexity, and the resulting performance gains may not justify the added burden. In some cases, it may even lead to overfitting. Moreover, infectious disease data is typically noisy, making it difficult to extract infectious disease-specific dynamics from data without guidance based on epidemiological domain knowledge. To address these issues, we propose a simple yet effective hybrid model for multi-region epidemic forecasting, termed Physics-Informed Spatial IDentity neural network (PISID). This model integrates a spatio-temporal identity (STID)-based neural network module, which encodes spatio-temporal information without relying on graph structures, with an SIR module grounded in classical epidemiological dynamics. Regional characteristics are incorporated via a spatial embedding matrix, and epidemiological parameters are inferred through a fully connected neural network. These parameters are then used to govern the dynamics of the SIR model for forecasting purposes. Experiments on real-world datasets demonstrate that the proposed PISID model achieves stable and superior predictive performance compared to baseline models, with approximately 27K parameters and an average training time of 0.45 seconds per epoch. Additionally, ablation studies validate the effectiveness of the neural network’s encoding architecture, and analysis of the decoded epidemiological parameters highlights the model’s interpretability. Overall, PISID contributes to reliable epidemic forecasting by integrating data-driven learning with epidemiological domain knowledge.

Introduction

Infectious diseases have long been intertwined with daily human life, with outbreaks historically causing significant disruptions to public health, society, and the economy. For instance, the novel coronavirus disease (COVID-19) has triggered a global pandemic since 2019, resulting in widespread infections and fatalities, and severely impairing social functions [1]. Addressing the threat of such diseases requires accurate epidemic forecasting to enable policymakers to implement timely preventive measures and allocate medical resources effectively.

Many mathematical models for epidemic forecasting have been proposed. In recent years, deep learning-based approaches have gained attention due to their strong representational power and predictive accuracy. In particular, because infectious diseases like COVID-19 spread across regions primarily through human mobility, spatio-temporal models incorporating graph neural networks (GNNs) have been developed to capture the spatial dynamics of epidemics. These models extract useful features by modeling dynamic interactions between regions over time, thereby enhancing prediction accuracy. However, learning graph structures, which is a common component of these models, is inherently challenging [2] and increases model complexity. The increased complexity often leads to reduced computational efficiency and, in some cases, even diminished predictive performance. Moreover, some models rely on auxiliary data, such as population mobility [3] or social connectivity [4], to learn graph structures. Such data are often difficult to obtain and may introduce unintended biases. In addition to the challenges of learning graph structures, the inherent complexity of epidemic dynamics (characterized by exponential transmission and influenced by diverse factors such as public awareness, climate, and drug availability) exposes deep learning models to the risk of overfitting, which is the price of their flexibility in adapting to historical data. On the other hand, classical compartmental models such as the SIR model [5] and its variants, which describe epidemic processes using differential equations, are often employed due to their simplicity and interpretability. These models typically adjust their parameters to best fit historical data. However, this approach cannot adequately account for the inherent uncertainties in future epidemic trends.

Recently, several studies [3,6–8] have attempted to incorporate epidemiological domain knowledge into deep learning frameworks to enhance forecasting accuracy, specifically by building in compartmental models unique to infectious diseases, such as the SIR model, in a physics-informed manner. By incorporating deterministic epidemic dynamics into model architectures or loss functions, these approaches guide neural networks in accordance with the underlying principles of disease transmission. Similar efforts to embed physical laws into neural networks have gained attention in other disciplines, including the natural sciences [9]. However, these epidemiology-informed models often require the number of individuals in the infectious state at each time point as input, which is typically estimated from the number of newly recovered cases. Such data are generally more difficult to track than the number of newly confirmed cases and are often unavailable. To address scenarios where such detailed data are lacking, we propose a simple and practical physics-informed deep learning model for forecasting the future number of confirmed cases, relying solely on historical confirmed case data and population data. Our model, named the Physics-Informed Spatial IDentity neural network (PISID), integrates the SIR model into a deep learning framework based on STID [10], a spatio-temporal identity model that avoids the complexity of graph structure learning. Epidemiological parameters are estimated using simple Multi-Layer Perceptron (MLP) layers, incorporating spatial characteristics through a spatial embedding matrix. Based on these parameters and the number of confirmed cases, the number of infectious individuals required for applying the SIR model is inferred. The future number of confirmed cases is then predicted using update equations derived from the infection dynamics in the SIR model. This approach enables interpretable forecasting grounded in epidemiological principles, an aspect often lacking in conventional deep learning models.
In summary, the contributions of our study include the following:

  • We propose a novel multi-region epidemic forecasting model that leverages epidemiological domain knowledge by combining a classical dynamical system in epidemiology with simple neural networks incorporating region-specific embeddings, without relying on graph structure learning.
  • By estimating and utilizing epidemiological parameters, our model enhances interpretability and can describe epidemiological dynamics without requiring additional data on the number of infectious individuals.
  • We conduct extensive experiments using real-world COVID-19 data, demonstrating the model’s stable predictive performance and interpretability.

The remainder of this paper is organized as follows: the “Related Works” section reviews related works, the “Methodology” section details the proposed model structure, the “Experimental Study” section presents the experimental results, and finally, the “Conclusion” section summarizes the work and discusses directions for future research.

Related works

Numerous mathematical models have been developed for infectious disease epidemic forecasting, which can broadly be categorized into two groups: traditional mathematical models and deep learning models. Among traditional models, classical compartmental models and their variants are particularly prevalent. In these models, the population is divided into homogeneous subgroups representing different states, and the transitions between these states are typically described by differential equations. The SIR model [5], which classifies individuals as susceptible, infectious, or recovered, is the most fundamental. Variants such as the SEIR model [11], which includes an exposed state, and the SIS model [12], which assumes that reinfection is possible, have also been widely studied. These models are often used to gain insights into disease characteristics and to explore future prevention strategies through simulation and parameter estimation. Batistela et al. [13] proposed a compartmental model that accounts for temporary immunity due to infection or vaccination, as well as unreported infections, and evaluated the effects of vaccination and social isolation. Fudolig and Howard [14] developed an SIR model incorporating multiple virus strains to explore the conditions under which endemic equilibrium can occur. Typically, future epidemic dynamics are simulated using parameters either optimized from historical data or manually set. However, this approach cannot account for changes in epidemic characteristics during the forecast period, raising concerns about cumulative errors over multiple time steps. Beyond compartmental models, other traditional mathematical models have also been employed for epidemic forecasting. Achrekar et al. [15] used an Autoregressive Moving Average (ARMA) model to predict future influenza-like illness (ILI) cases based on Twitter message trends. Wang et al. [16] developed a generalized Vector Autoregressive (VAR) model to forecast COVID-19 cases in the United States. The spread of infectious diseases exhibits non-stationary characteristics, influenced by various factors such as changes in viral properties, shifts in human behavior, and advancements in medical care. Therefore, the data distribution may evolve over time. In traditional models such as those mentioned above, which assume strong stationarity, it is particularly important to detect the points at which the distributional properties change. While known events such as lockdowns can be used to define these change points, there have also been efforts to identify unknown change points using a Bayesian approach [17], a genetic algorithm [18], and other techniques [19,20]. In other fields of natural science, a method for handling non-stationarity has been proposed using Bayesian compressive sensing [21].

To address the complex nonlinear relationships that traditional models struggle with, more flexible machine learning models have also been explored for prediction. Battineni et al. [22] conducted COVID-19 outbreak forecasting based on Fb-Prophet, a time series prediction model developed by Facebook that accounts for seasonality and holidays. Sadig et al. [23], on the other hand, employed LightGBM and XGBoost—representative gradient boosting algorithms based on decision trees—to predict the number of COVID-19 cases in real-time scenarios. Deep learning-based approaches that adaptively learn feature representations have also attracted significant attention. ArunKumar et al. [24] applied a Recurrent Neural Network (RNN) to forecast COVID-19 cases, while Lee et al. [25] used a Convolutional Neural Network (CNN) for ILI prediction. Transformer-based models such as Autoformer [26] have also been employed to capture temporal dependencies in time series data. While these models effectively process sequential data, it is important to note that infectious diseases inherently spread through spatial interactions. Consequently, graph-based deep learning methods have attracted attention for modeling spatial dependencies between regions. By representing each region as a node and connecting related regions with edges, Graph Neural Networks (GNNs), such as Graph Convolutional Networks (GCNs) [27], can efficiently capture spatial relationships. Basic graph construction methods often rely on prior knowledge, such as geographic distance or adjacency. For example, Panagopoulos et al. [28] constructed a graph based on human mobility data and analyzed the correlation between population movement and COVID-19 spread across countries. However, such predefined graph structures may not accurately reflect true dependencies. To address this, graph representation learning methods that adaptively learn graph structures using trainable node embeddings have been proposed [29]. 
ColaGNN [30] extracts inter-regional correlations from temporal latent representations using attention mechanisms, while EpiGNN [31] adaptively learns non-bidirectional spatial correlations that consider both geographical and temporal dependencies, as well as local and global transmission risks. Dual-Topo-STGCN [8] incorporates correlations between geographically distant regions by introducing functional topology that accounts for socio-economic interrelationships, in addition to geographical topology. M-Graphormer [32] learns dynamic graph representations primarily from human mobility data, employing three encoding strategies that focus on centrality, spatial characteristics, and edge features. Recently, spatio-temporal GNN models equipped with graph representation learning and incorporating epidemiological domain models such as the SIR model have been proposed [3,6–8]. These models enhance predictive performance by grounding predictions in the physical laws governing disease transmission. However, they require as input the number of infectious individuals at each time point to accurately model infection dynamics. In other words, it is necessary to track not only the number of newly infected individuals who have become infectious, but also those who have ceased to be infectious (i.e., recovered cases) at each time point. Compared to data on new infections, data on recovered cases are often more difficult to follow up on and may be unavailable, which limits the applicability of these models. From a practical standpoint, it is therefore necessary to develop an epidemiologically informed model that can operate solely based on the number of new infections. Furthermore, the inherent complexity of the graph representation learning adopted by the aforementioned spatio-temporal GNN models may hinder performance improvements commensurate with the complexity of the neural architectures themselves [2].
As alternatives that do not rely on graph representation learning, STNorm [33] distinguishes dynamics by normalizing raw data separately in the temporal and spatial dimensions through factorization, while STID [10] ensures spatio-temporal identifiability by embedding temporal identities shared across similar cycles and spatial identities shared within the same region. Despite not utilizing graph representation learning, these models achieve predictive performance comparable to or better than more complex spatio-temporal GNN models. Motivated by these studies, we propose a simple yet effective neural network model that integrates the SIR model into STID, enabling it to capture spatial distinctions without relying on graph representation while also leveraging the underlying epidemiological dynamics. Furthermore, by incorporating a mechanism to infer the number of infectious individuals at each time point based on a simple equation rewrite, our model overcomes the limitation of previous epidemiology-based neural models that required this information as auxiliary input.

Methodology

In this section, we define the problem setting addressed in this study and present the framework of the proposed model.

Problem setting

In this study, we address the problem of forecasting the number of new confirmed cases in multiple regions based on historical data. Let X_T = [x_{1,T}, x_{2,T}, …, x_{M,T}] ∈ ℝ^M denote the number of new confirmed cases in M regions at time step T, and let X_{T−T_in+1:T} = [X_{T−T_in+1}, X_{T−T_in+2}, …, X_T] ∈ ℝ^{M×T_in} represent the historical data covering the T_in time steps up to and including T. The objective is to forecast the number of new confirmed cases T_out steps into the future, denoted Y_{T+T_out} ∈ ℝ^M, which can be formulated using a mapping function F as follows:

\[ Y_{T+T_{out}} = F\left(X_{T-T_{in}+1:T}\right) \tag{1} \]
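The input/target pairing behind Equation (1) can be illustrated with a sliding window. The following is a minimal sketch rather than the authors' data loader; the function name `make_windows` and the list-of-lists layout (one row of M regional counts per day) are assumptions made for illustration.

```python
def make_windows(series, t_in, t_out):
    """Build (history, future) pairs from a daily multi-region series.

    series: list of days, each day a list of M regional case counts
            (layout assumed for this sketch).
    Returns a list of (history, future) tuples, where history holds the
    t_in days up to and including T and future holds days T+1 .. T+t_out.
    """
    samples = []
    # `end` is the index of the first future day, i.e. T + 1.
    for end in range(t_in, len(series) - t_out + 1):
        history = series[end - t_in:end]
        future = series[end:end + t_out]
        samples.append((history, future))
    return samples


if __name__ == "__main__":
    days = [[d, 10 * d] for d in range(10)]  # 10 days, M = 2 regions
    pairs = make_windows(days, t_in=3, t_out=2)
    print(len(pairs), pairs[0])
```

The last row of each `future` block corresponds to the target Y_{T+T_out}; the intermediate rows are the earlier horizons that the objective function also uses.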

Model structure

The overall structure of the proposed model is illustrated in Fig 1 and consists of two modules: a spatio-temporal neural network module and an SIR module. The spatio-temporal neural network module encodes temporal and spatial information based on the historical data of each region and predicts parameters that characterize the underlying epidemiological dynamics. Subsequently, the SIR module forecasts the future number of new confirmed cases by iteratively executing a discrete SIR model using the predicted parameters, leveraging epidemiological domain knowledge.

Spatio-temporal neural network module

We design a neural network to estimate epidemiological feature parameters. To avoid the potential introduction of erroneous biases caused by overly complex graph representation learning, we construct our framework based on STID [10], a simple yet effective spatio-temporal model. First, the historical input data X_{T−T_in+1:T} ∈ ℝ^{M×T_in} is embedded into a latent space H_T ∈ ℝ^{M×D} from a temporal perspective using a fully connected layer FC(·) as follows:

\[ H_T = \mathrm{FC}\left(X_{T-T_{in}+1:T}\right) \tag{2} \]

where D represents the hidden dimension. Next, spatial information is embedded using spatial identities E ∈ ℝ^{M×D}, a randomly initialized learnable matrix that captures region-specific features without relying on graph representation learning. The concatenated embeddings Z^1_T ∈ ℝ^{M×2D}, which incorporate both spatial and temporal information, are then used as input to the encoder:

\[ Z^{1}_{T} = H_T \,\Vert\, E \tag{3} \]

The encoder consists of L layers of basic MLP with residual connections:

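\[ Z^{l+1}_{T} = \mathrm{FC}^{l}_{2}\left(\mathrm{ReLU}\left(\mathrm{FC}^{l}_{1}\left(Z^{l}_{T}\right)\right)\right) + Z^{l}_{T} \tag{4} \]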

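where FC^l_1 and FC^l_2, with l ∈ [1, L], denote the first and second fully connected layers of the l-th encoder layer, respectively, and ReLU represents the Rectified Linear Unit activation function, applied with dropout. Then, the epidemiological parameters β = [β_1, β_2, …, β_M] ∈ ℝ^M and γ = [γ_1, γ_2, …, γ_M] ∈ ℝ^M are output through FC layers and passed to the SIR module:

\[ \beta = \mathrm{Sigmoid}\left(\mathrm{FC}_{\beta}\left(Z^{L+1}_{T}\right)\right), \qquad \gamma = \mathrm{Sigmoid}\left(\mathrm{FC}_{\gamma}\left(Z^{L+1}_{T}\right)\right) \tag{5} \]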

where FCβ and FCγ denote the fully connected layers used to estimate β and γ, respectively, and Sigmoid refers to the sigmoid activation function.
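The data flow through this module (temporal embedding, concatenation with the spatial identity, residual MLP layers, sigmoid heads) can be sketched in plain Python. This is an illustrative forward pass only, not the released PyTorch implementation: the helper names, the initialization scheme, the hidden width of 2D inside each encoder layer, and the omission of dropout are all assumptions.

```python
import math
import random


def init_linear(rng, n_in, n_out):
    """Randomly initialised affine layer (initialisation scheme assumed)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, layer):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def build_params(rng, m, t_in, d, n_layers):
    return {
        "fc_in": init_linear(rng, t_in, d),                      # temporal embedding
        "E": [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)],
        "encoder": [(init_linear(rng, 2 * d, 2 * d),
                     init_linear(rng, 2 * d, 2 * d))
                    for _ in range(n_layers)],
        "fc_beta": init_linear(rng, 2 * d, 1),
        "fc_gamma": init_linear(rng, 2 * d, 1),
    }


def forward(params, x):
    """x: M rows of length t_in (normalised history). Returns (beta, gamma)."""
    betas, gammas = [], []
    for region, history in enumerate(x):
        h = linear(history, params["fc_in"])
        z = h + params["E"][region]              # list concatenation: H || E
        for fc1, fc2 in params["encoder"]:
            hidden = [max(0.0, v) for v in linear(z, fc1)]       # ReLU
            z = [a + b for a, b in zip(linear(hidden, fc2), z)]  # residual
        betas.append(sigmoid(linear(z, params["fc_beta"])[0]))
        gammas.append(sigmoid(linear(z, params["fc_gamma"])[0]))
    return betas, gammas


if __name__ == "__main__":
    rng = random.Random(0)
    params = build_params(rng, m=3, t_in=14, d=4, n_layers=2)
    x = [[rng.random() for _ in range(14)] for _ in range(3)]
    print(forward(params, x))
```

Because the same weights are applied to every region, the only region-specific quantity in this sketch is the row of E, which is what allows the network to produce different β_i and γ_i for regions with similar histories.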

SIR module

The SIR module outputs the target forecast values of the number of new confirmed cases in the future, based on the dynamics of the SIR model. The SIR model is described by the following system of differential equations:

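\[
\begin{aligned}
\frac{dS_i(t)}{dt} &= -\frac{\beta_i S_i(t) I_i(t)}{N_i}, \\
\frac{dI_i(t)}{dt} &= \frac{\beta_i S_i(t) I_i(t)}{N_i} - \gamma_i I_i(t), \\
\frac{dR_i(t)}{dt} &= \gamma_i I_i(t),
\end{aligned}
\tag{6}
\]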

where S_i, I_i, and R_i represent the number of susceptible, infectious, and recovered individuals in region i, respectively, and N_i = S_i(t) + I_i(t) + R_i(t) denotes the total population in region i. In Equation (6), the infection rate β_i and the recovery rate γ_i are key parameters that govern the dynamics of disease transmission. The quantity R_e(t) := (β_i / γ_i) · (S_i(t) / N_i) can be interpreted as the effective reproduction number, which represents the expected number of new infections caused by a single infectious individual in a partially immune population at time step t. This metric is often used as a timely indicator of the extent of disease transmission.
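As a small worked illustration of these definitions, the sketch below evaluates the right-hand side of the SIR system and the effective reproduction number for a single region. The function names and the example values are chosen here for illustration and do not come from the paper.

```python
def sir_derivatives(s, i, r, beta, gamma):
    """Right-hand side of the SIR system for one region."""
    n = s + i + r                      # total population N_i
    new_infections = beta * s * i / n
    recoveries = gamma * i
    return -new_infections, new_infections - recoveries, recoveries


def effective_reproduction_number(beta, gamma, s, n):
    """R_e(t) = (beta / gamma) * (S(t) / N)."""
    return (beta / gamma) * (s / n)


if __name__ == "__main__":
    # Half of a population of 1000 is still susceptible.
    print(effective_reproduction_number(0.3, 0.1, 500.0, 1000.0))
    print(sir_derivatives(500.0, 50.0, 450.0, 0.3, 0.1))
```

The three derivatives sum to zero, which is why the total population N_i stays constant, and R_e(t) above 1 corresponds to a growing infectious compartment (dI/dt > 0).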

We now explain how the aforementioned SIR model is adapted for the current task, which involves forecasting new confirmed cases. These cases are typically assumed to be isolated or treated at the time of reporting and are therefore no longer capable of transmitting the infection. Accordingly, based on the discretized version of Equation (6), the number of new confirmed cases xi,t in region i at time step t can be interpreted as the number of new transitions into the recovered state:

\[ x_{i,t} = R_i(t) - R_i(t-1) = \gamma_i I_i(t-1) \tag{7} \]

Therefore, our goal is to estimate γ_i I_i(T + T_out − 1). To achieve this, we iteratively update the number of individuals in each compartment over the time interval from T to T + T_out based on the dynamics defined by the discretized SIR model. As a first step, we need to determine the initial values of these iterations in each compartment using the available historical data x_{i,T−T_in+1}, …, x_{i,T}. The number of recovered individuals R_i(t) can be computed by accumulating the number of new confirmed cases up to time step t. The number of susceptible individuals S_i(t) follows from the relation S_i(t) = N_i − I_i(t) − R_i(t). Therefore, once I_i(t) is estimated, the number of individuals in each compartment can be determined, allowing the initial values for the iterations to be set. The differential equation for I_i(t) in Equation (6) can be reformulated as follows [34]:

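\[ I_i(t) = I_i(0)\, e^{-\gamma_i t} + \int_{0}^{t} e^{-\gamma_i (t-u)}\, \frac{\beta_i S_i(u) I_i(u)}{N_i}\, du \tag{8} \]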

By approximating the integral in the second term with a discrete summation and treating unavailable data points prior to time step T - Tin as negligible, we estimate Ii(t) for t ≥ T-1 as follows:

(9)

where ΔI_i(u + 1) represents the new infections at time step u + 1, β_i S_i(u) I_i(u) / N_i. Based on the relationships derived from Equations (6) and (7), ΔI_i(u + 1) for u ∈ [T − T_in, T − 2] can be calculated as (x_{i,u+2} − x_{i,u+1}) / γ_i + x_{i,u+1}. In light of the above findings, at each time step t ∈ [T, T + T_out] in region i, we update the states according to the following:

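\[
\begin{aligned}
S_i(t) &= S_i(t-1) - \frac{\beta_i S_i(t-1) I_i(t-1)}{N_i}, \\
I_i(t) &= I_i(t-1) + \frac{\beta_i S_i(t-1) I_i(t-1)}{N_i} - \gamma_i I_i(t-1), \\
R_i(t) &= R_i(t-1) + \gamma_i I_i(t-1),
\end{aligned}
\tag{10}
\]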

where the initial values at time step T − 1 are I_i(T − 1) as estimated by Equation (9), R_i(T − 1) as the number of confirmed cases accumulated up to time step T − 1, and S_i(T − 1) = N_i − I_i(T − 1) − R_i(T − 1). By substituting the estimates of β and γ obtained from the spatio-temporal neural network module into Equation (10), and iteratively updating each state, we obtain the forecasted number of new confirmed cases Ŷ_{i,t} ∈ ℝ at time step t ∈ [T + 1, T + T_out] in region i as follows:

\[ \hat{Y}_{i,t} = \gamma_i I_i(t-1) \tag{11} \]
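The initialization step described above can be sketched for a single region as follows. The new-infection formula is the one stated in the text; the exponential weighting of past infections is an assumption of this sketch (one natural discretization of the integral form of the I_i equation) and may differ from the exact summation used by the authors. The function name and the restriction of R_i to cases inside the input window are likewise illustrative.

```python
import math


def estimate_initial_state(cases, gamma, population):
    """Estimate (S, I, R) at time T-1 from a window of confirmed cases.

    cases: [x_{T-T_in+1}, ..., x_T] for one region.
    """
    # New infections from the text: delta_I(u+1) = (x_{u+2} - x_{u+1}) / gamma + x_{u+1}.
    new_infections = [(cases[k + 1] - cases[k]) / gamma + cases[k]
                      for k in range(len(cases) - 1)]
    last = len(new_infections) - 1
    # ASSUMPTION: an infection that occurred j steps before T-1 is still
    # infectious with weight exp(-gamma * j); infections before the window
    # are treated as negligible.
    infectious = sum(math.exp(-gamma * (last - k)) * d
                     for k, d in enumerate(new_infections))
    # ASSUMPTION: only cases inside the window are accumulated here; a full
    # implementation would use the cumulative count since the first report.
    recovered = sum(cases[:-1])
    susceptible = population - infectious - recovered
    return susceptible, infectious, recovered


if __name__ == "__main__":
    print(estimate_initial_state([10.0, 12.0, 15.0, 19.0], 0.2, 1_000_000.0))
```

Whatever weighting is chosen, the three compartments are constructed so that they sum to the regional population, which the discrete updates then preserve.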

Algorithm 1 presents the pseudocode illustrating the flow leading to the prediction output.

Algorithm 1. PISID algorithm.

Spatio-temporal neural network module:

1.

2.

3.

4.

SIR module:

5.

6.

7.

8.

9.

10.

11. return
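The SIR-module part of this flow can be sketched as a short rollout for one region. This is a minimal illustration assuming a forward-Euler discretization with a unit time step, in which the confirmed cases at step t equal γ times the infectious count at step t − 1; the function name and return format are choices made here, not taken from the released code.

```python
def sir_forecast(s, i, r, beta, gamma, t_out):
    """Roll a discrete SIR model forward from the state at time T-1.

    Returns the forecasts for T+1, ..., T+t_out and the final state.
    """
    n = s + i + r  # total population, preserved by the updates below

    def step(s, i, r):
        new_infections = beta * s * i / n
        new_recoveries = gamma * i
        return (s - new_infections,
                i + new_infections - new_recoveries,
                r + new_recoveries)

    s, i, r = step(s, i, r)          # advance from T-1 to T
    forecasts = []
    for _ in range(t_out):
        forecasts.append(gamma * i)  # confirmed cases at t = gamma * I(t-1)
        s, i, r = step(s, i, r)
    return forecasts, (s, i, r)


if __name__ == "__main__":
    preds, final_state = sir_forecast(9000.0, 100.0, 900.0, 0.3, 0.1, 14)
    print([round(p, 2) for p in preds])
```

In the full model every operation in this loop is differentiable with respect to β and γ, so the forecast error can be propagated back to the network that produced them.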

Objective function

We employ the Mean Absolute Error (MAE) as the loss function and train the model to capture the epidemic dynamics up to the target time step T + T_out by minimizing the difference between the forecasted values Ŷ_i = [Ŷ_{i,T+1}, …, Ŷ_{i,T+T_out}] ∈ ℝ^{T_out} and the ground truth values Y_i = [Y_{i,T+1}, …, Y_{i,T+T_out}] ∈ ℝ^{T_out} for each region i. The objective function to be minimized is defined as:

\[ \mathcal{L}(\Theta) = \frac{1}{M\, T_{out}} \sum_{i=1}^{M} \sum_{t=T+1}^{T+T_{out}} \left| \hat{Y}_{i,t} - Y_{i,t} \right| \tag{12} \]

where Θ denotes all learnable parameters, which are contained entirely within the spatio-temporal neural network module.
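To give a sense of the size of Θ, the sketch below counts the parameters of the neural network module under assumed layer shapes: an input layer mapping T_in to D, a spatial embedding of size M×D, L encoder layers each holding two 2D-to-2D fully connected layers, and two output heads mapping 2D to a scalar, all with biases. These shapes are an assumption based on the description in this section, and the values D = 32 and L = 3 are those reported in the experimental setting.

```python
def count_parameters(m, t_in, d, n_layers):
    """Parameter count for the assumed layer shapes of the neural module."""
    fc_in = t_in * d + d                      # temporal embedding FC
    spatial = m * d                           # spatial identity matrix E
    width = 2 * d                             # width after concatenation
    encoder = n_layers * 2 * (width * width + width)
    heads = 2 * (width + 1)                   # FC_beta and FC_gamma
    return fc_in + spatial + encoder + heads


if __name__ == "__main__":
    print(count_parameters(m=47, t_in=14, d=32, n_layers=3))  # Japan, 14-day input
    print(count_parameters(m=51, t_in=28, d=32, n_layers=3))  # US, 28-day input
```

Under these assumptions the count is 27,074 for the Japan setting with a 14-day input and 27,650 for the US setting with a 28-day input, in line with the figure of approximately 27K parameters reported for the model.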

Experimental study

Datasets

To conduct our computational experiments, we use two publicly available COVID-19 datasets from Japan and the US, each recording the number of daily new confirmed cases:

  • Japan: This dataset is collected from the Ministry of Health, Labour and Welfare [35] and contains the number of daily new confirmed cases for each of the 47 prefectures from January 16, 2020, to May 8, 2023. Population data for each prefecture are obtained from the Japan LIVE Dashboard [36].
  • US: This dataset is sourced from the Johns Hopkins Coronavirus Resource Center [37] and includes the number of daily new confirmed cases for each of the 51 states from January 22, 2020, to March 9, 2023.

Baselines

We compare the proposed PISID model with both traditional mathematical models (SIR, ARMA, GAR) and deep learning models (RNN, DCRNN, LSTNet, STGCN, GWNet, ColaGNN, FourierGNN, STID).

  • SIR: The SIR model [5] is a classical compartmental model based on differential equations, widely used in epidemiology. We optimize the model parameters directly using historical data for each region.
  • ARMA: ARMA is a fundamental statistical model for time series forecasting, which makes linear predictions based on past values and stochastic noise.
  • GAR: GAR is an autoregressive model that incorporates inter-regional influence structures and is commonly used to model global economic systems.
  • RNN: RNN [38] is a basic neural architecture for sequence modeling, which propagates information recursively from one time step to the next.
  • DCRNN: DCRNN [39] is a spatio-temporal deep learning model that captures spatial dependencies via diffusion convolution and temporal dynamics via gated recurrent units.
  • LSTNet: LSTNet [40] is a multivariate time series forecasting model that captures both short-term and long-term dependencies using a combination of CNN and RNN, and incorporates an autoregressive component to handle input scale variations.
  • STGCN: STGCN [41] extracts spatial features using graph convolution and temporal features using gated temporal convolution.
  • GWNet: GWNet [29] is a spatio-temporal deep learning model that adaptively learns the graph structure and captures spatio-temporal dependencies by combining graph convolution with dilated causal convolution.
  • ColaGNN: ColaGNN [30] is an epidemic forecasting model that dynamically models spatial influence using a location-aware attention mechanism and captures local temporal patterns at multiple granularities using dilated convolution.
  • FourierGNN: FourierGNN [42] is an architecture for multivariate time series forecasting that models spatio-temporal dynamics in a unified framework using matrix multiplication of space-time fully connected graphs with Fourier Graph Operators in Fourier space.
  • STID: STID [10] is a multivariate time series forecasting model that addresses indistinguishability in spatial and temporal dimensions by embedding spatial and temporal identities through learnable matrices.

Experimental setting

We evaluated our model under two forecasting scenarios: short-term and long-term. Both the input history length T_in and the prediction horizon T_out were set to either 14 or 28. This means the model predicts the number of new confirmed cases 14 or 28 days ahead using the past 14 or 28 days of data. The original daily case counts are heavily influenced by weekly seasonality, primarily due to the reporting practices of local governments and medical institutions; for example, fewer cases are reported on weekends, when many medical facilities are closed. To remove this seasonality, which is unrelated to actual infection trends, we applied a 7-day moving average as a preprocessing step. Additionally, because the dynamics of infection spread vary significantly depending on the dominant virus strain, we divided each dataset into two periods: one during which the Delta variant was dominant (Japan: 2020/01/22 ~ 2021/12/31, US: 2020/01/29 ~ 2021/11/30), and another during which the Omicron variant was dominant (Japan: 2022/01/01 ~ 2023/05/08, US: 2021/12/01 ~ 2023/03/09). Each resulting dataset was split into training, validation, and test sets in a 6:1:3 ratio. Input data were normalized using the mean and standard deviation of the training set. The embedding dimension D was set to 32, and the number of MLP layers in the encoder L was set to 3. The number of model parameters to be trained was approximately 27K. We used a batch size of 32 and trained the model for up to 300 epochs, with early stopping triggered if validation performance did not improve for 20 consecutive epochs. Curriculum learning [43] was employed, gradually increasing the prediction horizon from 1 to T_out by one time step every two epochs. We used the Adam optimizer with an initial learning rate of 0.001 and a weight decay of 1e-8. All experiments were conducted using PyTorch on a server with an NVIDIA A100 GPU. The code for PISID is available at https://github.com/satoki-fujita/PISID.
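The preprocessing and scheduling steps in this setting can be sketched with a few helper functions. This is an illustrative reading of the description rather than the released pipeline: the function names are hypothetical, the moving average is assumed to be trailing (a centered window is equally plausible), the split is assumed to be chronological, and epochs are assumed to be counted from zero.

```python
def moving_average(values, window=7):
    """Trailing moving average; the first window-1 days are dropped (assumed)."""
    return [sum(values[k - window + 1:k + 1]) / window
            for k in range(window - 1, len(values))]


def chronological_split(n, ratios=(6, 1, 3)):
    """Index ranges for train/validation/test in time order (assumed)."""
    total = sum(ratios)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return range(0, train_end), range(train_end, val_end), range(val_end, n)


def zscore(values, train_values):
    """Normalise with the mean and standard deviation of the training set."""
    mean = sum(train_values) / len(train_values)
    var = sum((v - mean) ** 2 for v in train_values) / len(train_values)
    std = var ** 0.5 or 1.0   # guard against a constant training series
    return [(v - mean) / std for v in values]


def curriculum_horizon(epoch, t_out):
    """Horizon starts at 1 and grows by one step every two epochs."""
    return min(t_out, 1 + epoch // 2)


if __name__ == "__main__":
    print(moving_average([1, 1, 1, 1, 1, 1, 1, 8]))
    print([curriculum_horizon(e, 14) for e in range(0, 30, 2)])
```

With this schedule a 14-step horizon is reached after 26 epochs and a 28-step horizon after 54 epochs, well inside the 300-epoch budget.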

To evaluate predictive performance, we used the following metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Relative Absolute Error (RAE), Pearson Correlation Coefficient (PCC), and Concordance Correlation Coefficient (CCC). Lower values of MAE, RMSE, MAPE, and RAE, and higher values of PCC and CCC indicate better performance. These metrics are defined as follows:

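\[ \mathrm{MAE} = \frac{1}{M} \sum_{i=1}^{M} \left| y_i - \hat{y}_i \right| \tag{13} \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \left( y_i - \hat{y}_i \right)^2} \tag{14} \]
\[ \mathrm{MAPE} = \frac{1}{M} \sum_{i=1}^{M} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{15} \]
\[ \mathrm{RAE} = \frac{\sum_{i=1}^{M} \left| y_i - \hat{y}_i \right|}{\sum_{i=1}^{M} \left| y_i - \bar{y} \right|} \tag{16} \]
\[ \mathrm{PCC} = \frac{\sum_{i=1}^{M} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{M} (y_i - \bar{y})^2}\, \sqrt{\sum_{i=1}^{M} (\hat{y}_i - \bar{\hat{y}})^2}} \tag{17} \]
\[ \mathrm{CCC} = \frac{2 \rho\, \sigma_{y} \sigma_{\hat{y}}}{\sigma_{y}^2 + \sigma_{\hat{y}}^2 + (\bar{y} - \bar{\hat{y}})^2} \tag{18} \]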

where y_i denotes the observed value in region i, ŷ_i is the predicted value in region i, ȳ and \bar{ŷ} are the means of the observations and predictions, σ_y and σ_ŷ are their standard deviations, and ρ is the correlation coefficient between the observations and predictions.
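For reference, the six metrics can be computed with the standard textbook definitions as in the sketch below. The function name is hypothetical, population (not sample) variances are assumed, MAPE is returned as a fraction rather than a percentage, and all observations are assumed to be nonzero; the exact normalization used in the paper may differ.

```python
import math


def evaluate(y_true, y_pred):
    """Standard forecasting metrics for two equal-length sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    var_t = sum((t - mean_t) ** 2 for t in y_true) / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p)
              for t, p in zip(y_true, y_pred)) / n
    return {
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(e * e for e in abs_err) / n),
        # MAPE assumes every observation is nonzero.
        "MAPE": sum(e / abs(t) for e, t in zip(abs_err, y_true)) / n,
        "RAE": sum(abs_err) / sum(abs(t - mean_t) for t in y_true),
        "PCC": cov / math.sqrt(var_t * var_p),
        # cov equals rho * sigma_y * sigma_yhat.
        "CCC": 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2),
    }


if __name__ == "__main__":
    print(evaluate([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

The example illustrates the difference between the two correlation measures: a forecast that is shifted by a constant keeps a PCC of 1 but is penalized by CCC, which also accounts for the gap between the means.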

Prediction performance

We evaluated the predictive performance of each model on the test set. Each model was trained five times with different random initializations, and we report the mean and standard deviation of the evaluation metrics. The performance results for all models on the Japan dataset are presented in Table 1, and those for the US dataset are shown in Table 2. Across all datasets, PISID demonstrates consistent and competitive performance. In fact, in every case, it achieves either the best or the second-best MAE compared to other baseline models. On the Japan dataset (2020/01/22 ~ 2021/12/31), GWNet shows relatively strong performance, while on the Japan dataset (2022/01/01 ~ 2023/05/08), SIR performs comparatively well. However, the models do not exhibit notable performance when tested on the opposite time period, suggesting limited generalizability. These findings imply that the effectiveness of the models may be contingent upon the characteristics of the dominant epidemic dynamics in the time and place of application. During the Delta variant dominant period in Japan (2020/01/22–2021/12/31), government interventions such as mobility restrictions and limitations on restaurant operating hours were implemented periodically, which led to frequent changes in infection dynamics, resulting in relatively strong non-stationarity. Under such conditions, the adaptive nature of GWNet, which flexibly captures spatiotemporal dependencies, likely contributed to its effective performance. Meanwhile, during the Omicron variant dominant period in Japan (2022/01/01–2023/05/08), fewer abrupt interventions aimed at controlling human contact were implemented, allowing the epidemic dynamics to more closely follow the inherent epidemiological characteristics of the disease. Accordingly, the SIR model, grounded in epidemiological domain knowledge, is considered to have performed relatively well. 
GWNet did not exhibit distinctly superior performance during this period, potentially because the complexity introduced by its graph structure learning mechanism caused the model to overfit to spurious trends. In contrast, STID, which utilizes a straightforward neural network architecture devoid of graph-based components, achieved more favorable results. PISID, which integrates the STID architecture with the SIR model, handled both scenarios, those where complex spatiotemporal patterns predominate and those where epidemiologically specific dynamics are dominant, without significant performance degradation. Furthermore, PISID consistently maintained its performance regardless of the forecast horizon. Other neural network models search a vast representational space for epidemic dynamics that best fit the training data without any guidelines based on epidemiological knowledge, which increases the risk of overfitting and can lead to a more pronounced performance degradation when transitioning from short-term to long-term forecasts. For example, on the US dataset (2021/12/01 ~ 2023/03/09), GWNet performs well for 14-day-ahead forecasts but suffers a more significant drop in accuracy for 28-day-ahead forecasts compared to PISID. Even STID, which demonstrates competitive performance on other datasets, exhibits a similarly significant deterioration. Given the sparse and noisy nature of infectious disease data, incorporating epidemiological domain knowledge, as done in PISID, contributes to more stable and reliable predictions.

Fig 2 visualizes the 28-day-ahead forecasts produced by PISID and representative baseline models on the test set for Tokyo and New York, alongside the actual observed values. The right column of the figure reveals that ColaGNN’s forecasts are markedly unstable, with pronounced divergence from the ground truth values. This is likely caused by overfitting: the attention mechanism used for graph structure learning leads to excessive model size. In contrast, PISID produces relatively stable forecasts; however, like other models, it struggles to capture the real-time dynamics of infection spread. A 28-day forecasting horizon is long enough for the epidemic distribution to change, and the model may lack external data capable of capturing such changes. In particular, abrupt outbreaks are likely driven by external factors, making it challenging to predict their onset accurately based solely on historical confirmed case data.

Fig 2. Plot of the predicted confirmed cases 28-day-ahead in Tokyo and New York.

https://doi.org/10.1371/journal.pone.0331611.g002

In addition to the predictive performance, we also evaluated the training efficiency of the neural network models. Table 3 presents the training time per epoch for each model on each dataset. While PISID requires more training time than STID due to the inclusion of the SIR module that performs iterative processing, it is more computationally efficient than more complex models that adaptively learn graph structures, such as GWNet and ColaGNN.
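The iterative processing that adds to PISID's training time can be illustrated with a minimal sketch of an SIR rollout. The forward-Euler update with a one-day step, the function name `sir_rollout`, and all numeric values are assumptions made for illustration; the discretization actually used in PISID is not specified in this section.

```python
def sir_rollout(s0, i0, r0, beta, gamma, steps):
    """Forward-Euler rollout of the SIR equations with a one-day step.

    Returns a list of (S, I, R, new_infections) tuples, one per day.
    The update scheme is an illustrative assumption, not the paper's code.
    """
    n = s0 + i0 + r0  # total population, conserved by the update below
    s, i, r = float(s0), float(i0), float(r0)
    trajectory = []
    for _ in range(steps):
        new_infections = beta * s * i / n  # S -> I flow for this day
        new_recoveries = gamma * i         # I -> R flow for this day
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r, new_infections))
    return trajectory


# Hypothetical values, not taken from the paper.
traj = sir_rollout(s0=990_000, i0=10_000, r0=0, beta=0.3, gamma=0.1, steps=28)
```

Each forecast day depends on the state of the previous day, so the loop cannot be collapsed into a single layer evaluation. This sequential dependence is one plausible reason for the extra per-epoch cost relative to STID.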

To verify the effectiveness of the fully connected neural network-based encoder with a spatial embedding matrix in the neural network module, we also compared performance when replacing the encoder with alternative architectures. We employed several models commonly used for time-series tasks as encoders: RNN [38], which uses recurrent architectures to process time-series information; TCN [44], which employs convolutional architectures for sequence modeling; and Transformer [45], which uses an attention mechanism to capture long-range dependencies in sequences. In addition, we evaluated GWNet [29], a spatio-temporal model that adaptively learns graph structures, as the encoder, and also assessed a variant of our model without the spatial embedding matrix to investigate the contribution of spatial embeddings to performance. In all cases, the encoded information was decoded into epidemiological parameters via a fully connected layer and passed to the subsequent SIR module. Table 4 presents the MAE and RMSE scores for predictions made by models using each encoder architecture across the datasets. Among the encoder architectures that do not explicitly incorporate spatial information (RNN, TCN, Transformer, and MLP w/o SID), MLP w/o SID performs competitively with the sequence-specialized encoders, suggesting that MLP-based architectures can effectively capture temporal dependencies. Furthermore, MLP w/ SID, which incorporates a spatial embedding matrix, achieves the best performance among all encoder architectures, including GWNet, which performs adaptive graph structure learning. This underscores the efficacy of handling spatial dependencies with a simple embedding-based approach.

Table 4. Prediction performance of models with different backbone encoder architectures across each dataset (Tin, Tout = 28).

https://doi.org/10.1371/journal.pone.0331611.t004
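For reference, the two error metrics reported in Table 4 can be computed as in the following minimal sketch. The helper names and the sample values are hypothetical, and how the paper aggregates errors across regions and forecast days is not described in this section, so the sketch covers a single series only.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical daily case counts, not taken from the paper.
observed = [100, 120, 150, 130]
forecast = [110, 115, 140, 150]

print(mae(observed, forecast))   # 11.25
print(rmse(observed, forecast))  # 12.5
```

RMSE penalizes the single 20-case miss more heavily than MAE does, which is why the two metrics can rank models differently when forecasts are occasionally far off.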

We also examined the sensitivity of the model to the hidden dimension D, which corresponds to the dimensionality of the temporal feature HT and the spatial feature E embedded by the encoder. D was set to each value in {8, 16, 32, 64, 128}, and the evaluation results for each dataset are presented in Fig 3. When D is too small, the embedded spatio-temporal representation is limited, resulting in degraded predictive performance. On the other hand, as observed in the results for the Delta strain epidemic data (Fig 3, left), an excessively large D may lead to inferior performance due to overfitting. Therefore, selecting a well-balanced value for D is recommended.

Fig 3. Sensitivity analysis results of the hidden dimension D across each dataset (Tin, Tout = 28).

https://doi.org/10.1371/journal.pone.0331611.g003

Interpretability

Since the PISID model incorporates an SIR module, its predictions can be interpreted through the parameters that govern the underlying dynamical system. We focus on the effective reproduction number Re(T), defined as (β/γ)·S(T)/N, a widely used indicator of infectious disease transmissibility. To assess the interpretability of the model, we conducted a case study using this metric. In Fig 4, we use the PISID model configured for 28-day-ahead forecasting (Tout = 28) and plot the estimated Re values over time T for specific periods of the training dataset in which significant COVID-19 response measures were implemented in Tokyo (Japan) and New York (US). The plot also includes major event labels along the timeline and the actual daily confirmed cases. In Tokyo, Re begins to decline sharply after each declaration of a state of emergency on 2021/01/08, 2021/04/25, and 2021/07/12. During each of these periods, residents were urged to stay home, and customer-facing establishments were requested to shorten operating hours. The behavior of Re appears to reflect the reduction in infection risk resulting from these externally enforced measures. Conversely, Re increases again around 2021/03/22 and 2021/06/21, when case numbers had declined and restrictions were partially lifted, suggesting a potential resurgence of infections following deregulation. Indeed, the number of newly confirmed cases began to increase after each of these dates. In New York, Re drops significantly after 2020/11/13, when new restrictions were imposed on restaurants, bars, gyms, and private gatherings, and falls below 1, a threshold often interpreted as indicating that the epidemic is under control. A decline in the number of newly confirmed cases is also observed, mirroring this trend. Re begins to rise again on 2021/01/28, possibly reflecting the gradual easing of restrictions and the lifting of most “color zone” regulations.
Between 2021/07 and 2021/08, just prior to the resurgence driven by the Delta variant, a notable increase in Re is also observed. This trend may be associated with the full lifting of restrictions on 2021/06/15. These results suggest that Re, as estimated by PISID, reflects real-world fluctuations in transmission dynamics in a relatively timely and interpretable manner. It can thus serve as a meaningful indicator for assessing the epidemic situation based on the model’s internal epidemiological reasoning.
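The definition of Re used in this case study, (β/γ)·S(T)/N, can be written as a short sketch. The function name and the numeric values are hypothetical and serve only to show how the threshold of 1 is read; they are not estimates from the paper.

```python
def effective_reproduction_number(beta, gamma, susceptible, population):
    """Return Re = (beta / gamma) * (S / N) for one region at one time point."""
    if gamma <= 0 or population <= 0:
        raise ValueError("gamma and population must be positive")
    return (beta / gamma) * (susceptible / population)


# Hypothetical parameters: beta / gamma = 3, with 30% of the population susceptible.
re_value = effective_reproduction_number(
    beta=0.3, gamma=0.1, susceptible=300_000, population=1_000_000
)

# Re below 1 means each case produces fewer than one new case on average.
status = "declining" if re_value < 1 else "growing"
```

Because Re scales with the susceptible fraction S(T)/N, it can fall below 1 either through a lower β (for example, under restrictions) or through depletion of susceptibles, even when β/γ itself stays above 1.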

Fig 4. Plot of the derived effective reproduction number with event labels in Tokyo and New York.

https://doi.org/10.1371/journal.pone.0331611.g004

Discussion

We proposed PISID, a simple infectious disease forecasting model that combines a fully connected neural network with an SIR module, and evaluated its performance using real-world COVID-19 case data from Japan and the US. Although PISID’s predictions are grounded in the deterministic dynamics of the SIR model, it demonstrates competitive predictive performance compared to well-established neural network baselines. This highlights the importance of incorporating domain knowledge in infectious disease modeling. While neural networks can flexibly approximate complex functions through a large number of parameters, relying solely on noisy data—such as epidemic time series—can lead to overfitting and poor generalization. Embedding prior knowledge of epidemic dynamics into the model architecture, especially in scenarios where large-scale training data or external features are limited, can enhance generalization and robustness. The SIR module in PISID, though a simplified dynamical system representing average epidemic behavior in a population, maintains strong empirical performance without compromising the validity of its underlying principles. Moreover, it contributes to addressing the interpretability challenges often associated with neural networks. The parameters estimated by the SIR module can be interpreted as indicators of future transmissibility, offering practical value for outbreak risk management. For instance, an increase in the parameter value can serve as an early warning signal, enabling timely interventions such as contact tracing or resource allocation. Conversely, a decrease in the parameter may indicate a suitable time to relax restrictions. This level of interpretability is particularly important in the context of infectious diseases, which can have far-reaching societal, economic, and healthcare impacts, thereby enhancing the model’s practical utility.

It is also noteworthy that the neural network component of PISID primarily consists of basic fully connected layers, without relying on graph structure learning. While recurrent or convolutional architectures are commonly used for sequence modeling due to their memory capabilities, our experimental results show that MLP-based structures are equally effective in capturing temporal dynamics. In fact, recent studies have reported that simple linear layers can outperform more complex architectures such as Transformers [46], which also serve as the foundational architecture for large language models (LLMs), suggesting that simplicity should not be underestimated. The lightweight nature of the MLP also enables efficient training without excessive computational overhead. Spatial dependencies are captured using a spatial embedding matrix, avoiding the complexity of the graph structure learning employed in many spatio-temporal models. Overall, the architecture of PISID is straightforward and interpretable, yet it effectively encodes both spatial and temporal information, achieving performance comparable to more complex models.

There are, however, several limitations and directions for future work. First, this study focuses on forecasting future confirmed cases using only past case data as input. Since epidemic dynamics are influenced by various external factors—such as cluster outbreaks, viral mutations, new treatments or vaccines, and government interventions—incorporating additional external data could improve predictive accuracy.

Second, because the proposed method generates forecasts based on the SIR equations, it performs well in predicting stationary epidemic patterns but struggles to respond sensitively to sudden trend shifts. In infectious diseases such as COVID-19, distribution characteristics can change abruptly due to mutations in virus strains or shifts in human behavior. When such non-stationarity is pronounced, predictive performance becomes limited. In our experiments, the dataset was pre-divided at a known major change point (the shift from the Delta to the Omicron variant), allowing the method to be evaluated under relatively stationary conditions. Addressing prediction under broader, potentially more non-stationary scenarios remains an important future challenge. This requires either detecting the change points at which the distribution shifts and forecasting within each stationary interval, or developing a model with new mechanisms that respond sensitively to non-stationary epidemic patterns. In addition, the external data mentioned above, which are associated with the dynamics of non-stationary epidemic patterns, are expected to be essential for detecting such patterns.

Third, our experiments are limited to COVID-19 data. Further research is needed to assess the model’s applicability to other infectious diseases, such as influenza. Depending on the disease, alternative compartmental models (e.g., SIS) may better represent the transmission process. Extending PISID to support such model variants could further enhance its generalizability. Additionally, for infectious diseases with strong periodicity, it may be necessary to develop models that account for periodic patterns, such as C-GRU [47].
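To make the SIS alternative concrete, the sketch below contrasts one daily update of the standard SIR and SIS models. It is a hypothetical illustration of the textbook equations with a forward-Euler step; the function name and values are assumptions, and it does not represent an existing PISID extension.

```python
def compartment_step(s, i, r, beta, gamma, model="SIR"):
    """Advance (S, I, R) by one day under the SIR or SIS model.

    In SIR, recovered individuals move to R and stay immune.
    In SIS, they return to S and can be infected again, so R is unused.
    """
    n = s + i + r
    new_infections = beta * s * i / n
    recoveries = gamma * i
    if model == "SIR":
        return s - new_infections, i + new_infections - recoveries, r + recoveries
    if model == "SIS":
        return s - new_infections + recoveries, i + new_infections - recoveries, r
    raise ValueError(f"unknown model: {model}")


# Hypothetical state and parameters, not taken from the paper.
sir_next = compartment_step(900.0, 100.0, 0.0, beta=0.2, gamma=0.1, model="SIR")
sis_next = compartment_step(900.0, 100.0, 0.0, beta=0.2, gamma=0.1, model="SIS")
```

The two updates differ only in where the recovery flow goes, so a decoder that outputs β and γ could in principle drive either variant without changes to the encoder.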

Conclusion

In this paper, we proposed PISID, a novel model for epidemic forecasting across multiple regions. PISID combines an SIR module—based on an infectious disease-specific dynamical system—with a simple neural network module composed of fully connected layers. The model requires only historical confirmed case data as input, making it broadly applicable. While complex models that incorporate graph structure learning can sometimes suffer from overfitting and limited interpretability, PISID is designed to follow an exponential trajectory grounded in epidemiological domain knowledge. This design contributes to both the interpretability and stability of its predictions. The effectiveness of the model was validated through experiments on real-world COVID-19 datasets, where it demonstrated competitive predictive performance compared to established benchmark models for multivariate time series forecasting. Although not always the top performer, PISID consistently ranked among the top two models across all forecasting scenarios—despite variations in regional scope, prevalent strains, and forecast horizons—demonstrating stable and reliable forecasting capabilities. We also conducted a comparative analysis of different encoder architectures and confirmed that information related to future epidemic dynamics can be effectively captured by modeling temporal dependencies using fully connected layers with residual connections, and spatial dependencies using a spatial embedding matrix. This architecture achieved an average improvement of 7.4% in MAE and 5.8% in RMSE compared to the best-performing baseline encoders, highlighting its effectiveness in epidemic forecasting. Furthermore, we demonstrated the interpretability of the model through a case study, highlighting how the explicit trajectory representation of the SIR module can provide meaningful insights into epidemic dynamics. 
In future work, we plan to incorporate external data related to epidemics—such as mobility patterns, distribution of viral strains, and vaccination rates—to more effectively capture shifts in epidemic dynamics in a timely manner. Regarding graph structure learning, this study raised concerns about performance degradation due to its complexity, but we believe that pursuing this direction remains valuable, given the spatial nature of infectious disease spread. To gain a deeper understanding of the underlying trends in epidemic propagation, we aim to incorporate external data with richer spatio-temporal correlations and apply dynamic graph structure learning to explore transmission routes and delay patterns. We also plan to extend the SIR module to its variant forms by incorporating external data and introducing finer-grained compartments capable of disentangling and explaining individual contributing factors. This will enable a deeper integration of epidemiological knowledge into the neural network framework, ultimately supporting the development of public health and medical strategies.

Acknowledgments

We thank all individuals who contributed indirectly to this research through discussions and insights.

References

  1. Haleem A, Javaid M, Vaishya R. Effects of COVID-19 pandemic in daily life. Curr Med Res Pract. 2020;10(2):78–9. pmid:32292804
  2. Shao Z, Wang F, Xu Y, Wei W, Yu C, Zhang Z, et al. Exploring progress in multivariate time series forecasting: Comprehensive benchmarking and heterogeneity analysis. IEEE Trans Knowl Data Eng. 2024.
  3. Cao Q, Jiang R, Yang C, Fan Z, Song X, Shibasaki R. MepoGNN: Metapopulation epidemic forecasting with graph neural networks. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Cham: Springer Nature Switzerland; 2022. pp. 453–68.
  4. Zhang H, Xu Y, Liu L, Lu X, Lin X, Yan Z, et al. Multi-modal information fusion-powered regional covid-19 epidemic forecasting. In: IEEE Int Conf Bioinform Biomed (BIBM). IEEE; 2021. pp. 779–84.
  5. Kermack WO, McKendrick AG. A contribution to the mathematical theory of epidemics. Proc R Soc Lond A. 1927;115(772):700–21.
  6. Mao J, Han Y, Tanaka G, Wang B. Backbone-based dynamic spatio-temporal graph neural network for epidemic forecasting. Knowl-Based Syst. 2024;296:111952.
  7. Wang L, Adiga A, Chen J, Sadilek A, Venkatramanan S, Marathe M. Causalgnn: Causal-based graph neural networks for spatio-temporal epidemic forecasting. In: Proc AAAI Conf Artif Intell. 2022. pp. 12191–9.
  8. Luo J, Wang X, Fan X, He Y, Du X, Chen Y-Q, et al. A novel graph neural network based approach for influenza-like illness nowcasting: exploring the interplay of temporal, geographical, and functional spatial features. BMC Public Health. 2025;25(1):408. pmid:39893390
  9. Yang H-Q, Shi C, Zhang L. Ensemble learning of soil–water characteristic curve for unsaturated seepage using physics-informed neural networks. Soils and Found. 2025;65(1):101556.
  10. Shao Z, Zhang Z, Wang F, Wei W, Xu Y. Spatial-temporal identity: A simple yet effective baseline for multivariate time series forecasting. In: Proc 31st ACM Int Conf Inf Knowl Manag. 2022. pp. 4454–8.
  11. Kuznetsov YA, Piccardi C. Bifurcation analysis of periodic SEIR and SIR epidemic models. J Math Biol. 1994;32(2):109–21. pmid:8145028
  12. van den Driessche P, Watmough J. A simple SIS epidemic model with a backward bifurcation. J Math Biol. 2000;40(6):525–40. pmid:10945647
  13. Batistela CM, Correa DPF, Bueno ÁM, Piqueira JRC. SIRSi-vaccine dynamical model for the Covid-19 pandemic. ISA Trans. 2023;139:391–405. pmid:37217378
  14. Fudolig M, Howard R. The local stability of a modified multi-strain SIR model for emerging viral strains. PLoS One. 2020;15(12):e0243408. pmid:33296417
  15. Achrekar H, Gandhe A, Lazarus R, Yu SH, Liu B. Predicting flu trends using twitter data. In: IEEE Conf Comput Commun Workshops (INFOCOM WKSHPS). IEEE; 2011. pp. 702–7.
  16. Wang Q, Zhou Y, Chen X. A vector autoregression prediction model for covid-19 outbreak. arXiv preprint arXiv:2102.04843, 2021.
  17. Dehning J, Zierenberg J, Spitzner FP, Wibral M, Neto JP, Wilczek M, et al. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science. 2020;369(6500):eabb9789. pmid:32414780
  18. Dey T, Lee J, Chakraborty S, Chandra J, Bhaskar A, Zhang K, et al. Lag time between state-level policy interventions and change points in COVID-19 outcomes in the United States. Patterns (N Y). 2021;2(8):100306. pmid:34308391
  19. Jiang F, Zhao Z, Shao X. Time series analysis of COVID-19 infection curve: A change-point perspective. J Econom. 2023;232(1):1–17. pmid:32836681
  20. Bai Y, Safikhani A, Michailidis G. Non-stationary spatio-temporal modeling of COVID-19 progression in the US. medRxiv. 2020;2020.09.14.20194548.
  21. Li P, Wang Y. Interpretation of spatio-temporal variation of precipitation from spatially sparse measurements using Bayesian compressive sensing (BCS). Georisk: Assess Manage Risk Eng Syst Geohazards. 2023;17(3):554–71.
  22. Battineni G, Chintalapudi N, Amenta F. Forecasting of COVID-19 epidemic size in four high hitting nations (USA, Brazil, India and Russia) by Fb-Prophet machine learning model. Appl Comput Inform. 2025;21(1–2):2–11.
  23. Sadig HE, Kamal M, ur Rehman M, Habadi MI, Alnagar DK, Yusuf M. Advanced time complexity analysis for real-time COVID-19 prediction in Saudi Arabia using LightGBM and XGBoost. J Radiat Res Appl Sci. 2025;18(2):101364.
  24. ArunKumar KE, Kalaga DV, Kumar CMS, Kawaji M, Brenza TM. Forecasting of COVID-19 using deep layer Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) cells. Chaos Solitons Fractals. 2021;146:110861. pmid:33746373
  25. Lee K, Ray J, Safta C. The predictive skill of convolutional neural networks models for disease forecasting. PLoS One. 2021;16(7):e0254319. pmid:34242349
  26. Wu H, Xu J, Wang J, Long M. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Adv Neural Inf Process Syst. 2021;34:22419–30.
  27. Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional Networks. In: Proc 5th Int Conf Learn Representations (ICLR). 2017.
  28. Panagopoulos G, Nikolentzos G, Vazirgiannis M. Transfer Graph Neural Networks for Pandemic Forecasting. Proc AAAI Conf Artif Intell. 2021;35(6):4838–45.
  29. Wu Z, Pan S, Long G, Jiang J, Zhang C. Graph WaveNet for deep spatial-temporal graph modeling. In: Int Joint Conf Artif Intell, 2019.
  30. Deng S, Wang S, Rangwala H, Wang L, Ning Y. Cola-GNN: Cross-location Attention based Graph Neural Networks for Long-term ILI Prediction. In: Proc 29th ACM Int Conf Inf Knowl Manag. ACM; 2020. pp. 245–54. Available from: https://dl.acm.org/doi/10.1145/3340531.3411975
  31. Xie F, Zhang Z, Li L, Zhou B, Tan Y. EpiGNN: Exploring spatial transmission with graph neural network for regional epidemic forecasting. In: Joint European Conf Mach Learn Knowl Discov Databases. Springer; 2022. pp. 469–85.
  32. Wang X, Jin Z. Multi-region infectious disease prediction modeling based on spatio-temporal graph neural network and the dynamic model. PLoS Comput Biol. 2025;21(1):e1012738. pmid:39787070
  33. Deng J, Chen X, Jiang R, Song X, Tsang IW. St-norm: Spatial and temporal normalization for multi-variate time series forecasting. In: Proc 27th ACM SIGKDD Conf Knowl Discov Data Min. 2021. pp. 269–78.
  34. Kharazmi E, Cai M, Zheng X, Zhang Z, Lin G, Karniadakis GE. Identifiability and predictability of integer- and fractional-order epidemiological models using physics-informed neural networks. Nat Comput Sci. 2021;1(11):744–53. pmid:38217142
  35. Ministry of Health, Labour and Welfare. Visualizing the data: information on COVID-19 infections. [cited 2024 Dec 22]. Available from: https://covid19.mhlw.go.jp/extensions/public/en/index.html
  36. Su W, Fu W, Kato K, Wong ZSY. “Japan LIVE dashboard” for COVID-19: A scalable solution to monitor real-time and regional-level epidemic case data. In: Context Sensitive Health Informatics: The Role of Informatics in Global Pandemics. IOS Press; 2021. pp. 21–5.
  37. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533–4. pmid:32087114
  38. Werbos PJ. Backpropagation through time: what it does and how to do it. Proc IEEE. 1990;78(10):1550–60.
  39. Li Y, Yu R, Shahabi C, Liu Y. Diffusion convolutional recurrent neural network: data-driven traffic forecasting. In: Proc ICLR. 2018.
  40. Lai G, Chang WC, Yang Y, Liu H. Modeling long- and short-term temporal patterns with deep neural networks. ACM; 2018.
  41. Yu B, Yin H, Zhu Z. Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting. IJCAI; 2018.
  42. Yi K, Zhang Q, Fan W, He H, Hu L, Wang P. FourierGNN: Rethinking multivariate time series forecasting from a pure graph perspective. Adv Neural Inf Process Syst. 2023;36:69638–60.
  43. Wu Z, Pan S, Long G, Jiang J, Chang X, Zhang C. Connecting the dots: Multivariate time series forecasting with graph neural networks. In: Proc 26th ACM SIGKDD Int Conf Knowl Discov Data Min. 2020.
  44. Bai S, Kolter JZ, Koltun V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint. 2018.
  45. Nie Y, Nguyen NH, Sinthong P, Kalagnanam J. A time series is worth 64 words: Long-term forecasting with transformers. arXiv preprint. 2022.
  46. Zeng A, Chen M, Zhang L, Xu Q. Are Transformers Effective for Time Series Forecasting? Proc AAAI Conf Artif Intell. 2023;37(9):11121–8.
  47. Wu L, Zhou JT, Zhang H, Wang SR, Ma T, Yan H, et al. Time series analysis and gated recurrent neural network model for predicting landslide displacements. Georisk: Assess Manage Risk Eng Syst Geohazards. 2022;18(1):172–85.