A new Seasonal Difference Space-Time Autoregressive Integrated Moving Average (SD-STARIMA) model and spatiotemporal trend prediction analysis for Hemorrhagic Fever with Renal Syndrome (HFRS)

  • Youlin Zhao ,

    Roles Conceptualization, Data curation, Visualization, Writing – original draft, Writing – review & editing

    sobzyl@hhu.edu.cn (YZ); geliang0021@126.com (LG)

    Affiliation Business School of Hohai University, Nanjing city, Jiangsu Province, PR China

  • Liang Ge ,

    Roles Data curation, Formal analysis, Investigation, Methodology

    sobzyl@hhu.edu.cn (YZ); geliang0021@126.com (LG)

    Affiliation Tianjin Institute of Surveying and Mapping, Tianjin city, PR China

  • Yijun Zhou,

    Roles Data curation, Methodology, Visualization, Writing – original draft

    Affiliation Tianjin Institute of Surveying and Mapping, Tianjin city, PR China

  • Zhongfang Sun,

    Roles Investigation, Methodology, Resources

    Affiliation Tianjin Institute of Surveying and Mapping, Tianjin city, PR China

  • Erlong Zheng,

    Roles Data curation, Writing – review & editing

    Affiliation Tianjin Institute of Surveying and Mapping, Tianjin city, PR China

  • Xingmeng Wang,

    Roles Visualization, Writing – original draft

    Affiliation Tianjin Institute of Surveying and Mapping, Tianjin city, PR China

  • Yongchun Huang,

    Roles Writing – review & editing

    Affiliation Business School of Hohai University, Nanjing city, Jiangsu Province, PR China

  • Huiping Cheng

    Roles Formal analysis, Resources

    Affiliation School of Economics and Management, Hubei University of Technology, Wuhan,Hubei Province, PR China


Abstract

Hemorrhagic fever with renal syndrome (HFRS) is a naturally-occurring, fecally transmitted disease caused by a Hantavirus (HV). It is extremely damaging to human health and results in many deaths annually, especially in Hubei Province, China. One of the primary characteristics of HFRS is the spatiotemporal heterogeneity of its occurrence, with notable seasonal differences. In view of this heterogeneity, the present study suggests that there is a need to focus on trend simulation and the spatiotemporal prediction of HFRS outbreaks. To facilitate this, we constructed a new Seasonal Difference Space-Time Autoregressive Integrated Moving Average (SD-STARIMA) model. The SD-STARIMA model is based on the spatial and temporal characteristics of the Space-Time Autoregressive Integrated Moving Average (STARIMA) model first developed by Cliff and Ord in 1974, which has proven useful in modelling the temporal aspects of spatially located data. This model can simulate the trends in HFRS epidemics, taking into consideration both spatial and temporal variations. The SD-STARIMA model also applies seasonal differencing to remove the temporal non-stationarity present in the HFRS data. Experiments have demonstrated that the proposed SD-STARIMA model offers notably better prediction accuracy, especially for spatiotemporal series data with seasonal distribution characteristics.

Introduction

Hemorrhagic Fever with Renal Syndrome (HFRS) is a serious infectious disease that is mainly caused by the Hantaan virus (HTNV) and the Seoul virus (SEOV) [1–4]. The clinical symptoms of HFRS are fever, hemorrhaging and renal dysfunction, and it can result in long-term kidney damage, hypotension and even death. HFRS is distributed across a number of countries. China is the most seriously affected, accounting for more than 90% of the world's cases of HFRS [5–9]. Within China, one province in particular, Hubei, has become the most seriously affected area of all in recent years. Since the first case of HFRS was reported in Hubei in 1957, HFRS epidemics have expanded, reaching a high point in 1983 with 23,943 cases. From 1980 to 2009, the number of HFRS cases in Hubei Province totaled 104,467. The spread of HFRS has had a significant impact on social stability and human health [10–12].

Spatial and temporal statistical methods have been used to discover the spatial and temporal distribution and clustering characteristics of HFRS across a number of different locations [13], including Buenos Aires in Argentina [14], Germany [15] and Brussels in Belgium [16]. In China, a Kulldorff spatial scan statistic has been used to identify the clustering of HFRS, drawing upon data spanning the period 1980 to 2009 [17]. A Gaussian GWR model has also been used to identify the factors influencing HFRS transmission (such as meteorological factors, rodent density, surface mean elevation, water area and human population density), drawing upon data from Hubei collected between 2011 and 2015 [18]. Moran’s I index was adopted for a global spatial autocorrelation analysis that sought to identify the overall spatiotemporal pattern of HFRS outbreaks in Hubei between 2005 and 2014, and Spearman's rank correlation analysis was used at the same time to explore the possible factors influencing the epidemics, such as the weather and the area’s geography [19]. Cross-correlation analysis has also been used to assess a possible association with meteorological variables, and a time-series Poisson regression model was adopted to examine the independent contribution of meteorological variables to HFRS transmission in both Elunchun and Molidawahaner counties in Northeastern China between 1997 and 2007 [20]. Alongside this, a generalized additive model with penalized smoothing splines has been used to examine the effect of meteorological factors on the occurrence of HFRS in Jiaonan between 2006 and 2011 [21].

Identifying the spatial and temporal distribution of HFRS can help with analyzing and evaluating the trends in HFRS outbreaks, thus leading to the adoption of more effective measures for the prevention and control of the disease. HFRS, however, has a frustrating degree of spatiotemporal heterogeneity and seasonal variation [19]. So, in order to conduct a better analysis of HFRS distribution and to acquire a more accurate means of prediction, the construction of a space-time model seems to be called for. Space-time modeling refers to the process of finding an analytical method to model and predict the value of an unrecorded space-time position based on given spatiotemporal data [22]. Space-time modeling is a spatial expansion of time series modelling and the factors influencing the attribute values of unobserved space-time positions bring together the spatial and temporal factors associated with single time series modeling, single spatial modeling and spatiotemporal modeling.

The most representative single time series model is the Autoregressive Integrated Moving Average (ARIMA) model. This analyzes the time series of historical data and obtains the model with the optimal fit for predicting events that will occur in the short term [23,24]. An ARIMA model relates time series data to both sequentially lagged variables and their errors. ARIMA models have been used several times for the prediction of HFRS outbreaks [25,23,26–28], which indicates that this model is a good fit for the forecasting of outbreaks here as well.

For single spatial modeling, there are space autoregressive models and space moving average models. Based on a spatial weight matrix, these models quantify the relationship between neighboring spatial units [29].

Drawing upon time and space series modeling, the geographer A.D. Cliff and the statistician J.K. Ord proposed a space-time series modeling framework in 1974 [22] that is essentially a spatial expansion of the time series model. It combines Spatial Autoregression (SAR), a Spatial Moving Average (SMA) and Spatial Regression (SR). A large number of studies have shown that, whilst the ARIMA model provides better fitting results for data with a relatively stable temporal distribution and no strong spatial autocorrelation, its effectiveness for prediction relating to spatiotemporally heterogeneous sample data is much weaker [30–32]. Cliff and Ord’s Spatiotemporal Autoregressive Integrated Moving Average (STARIMA) model extended beyond the ARIMA model [33]. The STARIMA model provides a space-time autocorrelation function (ST-ACF) and a space-time partial autocorrelation function (ST-PACF) to address the problem of measuring spatiotemporal correlations. It also introduced a spatiotemporal lag operator that makes it capable of simultaneously extrapolating and predicting multiple spatial units [34]. The STARIMA model was subsequently shown to offer high estimation performance when applied, using non-linear estimators, to a case study of the regional deposits of commercial banks operating in Turkey [35]. The STARIMA model has also been applied to rainfall and waterlogging process simulation and to short-term forecasting. Here, it offers improved prediction accuracy and reliability when compared to traditional hydrological model simulation and prediction [36]. Beyond this, STARIMA models have been applied to traffic prediction, environmental variable prediction and social and economic analyses [37–41].

Research has indicated that HFRS has a characteristic seasonal or cyclic time series-based occurrence [42,11]. In our previous work, a Seasonal Difference-Geographically and Temporally Weighted Regression (SD-GTWR) model was developed as an extension of the GTWR model that sought to use seasonal difference to obtain stabilized data [43]. Seasonal difference was used to deal with a non-stationary time series with seasonal distribution characteristics. Following on from this research, we constructed a Seasonal Difference Space-Time Autoregressive Integrated Moving Average (SD-STARIMA) model that is based on STARIMA. Time series analysis and autocorrelation analysis were conducted to ensure the feasibility of using a seasonal difference approach. The STARIMA model is a prerequisite for advanced seasonal difference modeling and analysis. In our previous research, we found that from 1980 to 2000 [17] and from 2005 to 2014 [19] the HFRS cases in Hubei Province displayed a bimodal seasonal distribution pattern rather than a linear distribution. Seasonal difference calculations for HFRS incidence in Hubei using SD-STARIMA offer the prospect of improving the accuracy of previous space-time series models. The main contribution of this paper is the development of a new SD-STARIMA model that uses seasonal difference calculations to eliminate the temporal non-stationarity found in HFRS data. Estimation results from the SD-STARIMA model show it to be more accurate than other models such as ARIMA and STARIMA. This supports its potential to contribute to the prevention and control of HFRS.

Study data and analysis

Study data

The area focused on in this study is Hubei Province in central-southern China. The data covers the period from 2005 to 2014. In the past 30 years, the data during this decade is the most representative and 2014 is the most recent year for which detailed data is available. Basic geographic data about Hubei Province was collected from the Chinese National Administrator of Surveying, Mapping and Geo-Information. HFRS case data was provided by the Hubei Province Center for Disease Control and Prevention and the Chinese Center for Disease Control and Prevention. The HFRS case data contains the monthly case values for each county. Meteorological data was obtained from the National Center for Environmental Prediction and the Hubei Meteorological Bureau. Human population density data was extracted from the Hubei Statistical Yearbook, which includes the annual population for each county.

Seasonal characteristic analysis

The monthly distribution pattern of HFRS in Hubei Province from 2005 to 2014 is shown in Fig 1. It can be seen that HFRS epidemics appear to have a bimodal distribution for each year (12 months), occurring around March and September. As a result, the time frame for the range of seasonal differences for each year has been narrowed down to 6 months for this study [44].

Fig 1. Monthly HFRS incidence from 2005 to 2014.

(A) Average monthly HFRS incidence from 2005 to 2009. (B) Average monthly HFRS incidence from 2010 to 2014.

https://doi.org/10.1371/journal.pone.0207518.g001

Stationarity analysis of the HFRS incidence data

To arrive at a more effective time series analysis, it is necessary to characterize the spatial and temporal series of the HFRS case data. Figs 2 and 3 show that HFRS incidence in Hubei is clustered and does not follow a normal distribution. To look for significant correlations in the HFRS outbreak distribution across the time series, an autocorrelation graph and a partial autocorrelation graph were plotted from the autocorrelation and partial autocorrelation coefficients of the HFRS incidence data. In Fig 4(A), the abscissa is the number of lags and the ordinate is the ACF (autocorrelation function) value. The two lines in this figure mark the 95% confidence interval for the autocorrelation coefficient. If there is no autocorrelation, the ACF values should be randomly distributed within the 95% confidence interval, without any fixed pattern, and should gradually tend to zero as the lag k increases. However, it can be seen from Fig 4(A) that the autocorrelation coefficient r_k does not do this. At the same time, it can be seen from Fig 4(B) that the partial autocorrelation function value is larger at the 1st, 4th, 5th, 7th, 8th and 12th order lags. This indicates that there is periodicity in the time series. That being so, the HFRS case series in Hubei is not stationary.
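The autocorrelation check described above can be illustrated with a minimal sketch. It assumes the usual sample autocorrelation estimator and the large-sample approximation ±1.96/√n for the 95% band; the function names and the demonstration series are hypothetical and are not the HFRS data.

```python
import math


def sample_acf(series, max_lag):
    """Return the sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    acf = []
    for k in range(max_lag + 1):
        num = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(num / denom)
    return acf


def white_noise_bound(n, z=1.96):
    """Approximate 95% band for r_k when the series has no autocorrelation."""
    return z / math.sqrt(n)


# Hypothetical monthly series that repeats every 6 months (60 observations).
pattern = [5, 9, 5, 2, 1, 2]
demo_series = pattern * 10
demo_acf = sample_acf(demo_series, max_lag=12)
demo_bound = white_noise_bound(len(demo_series))

# Lags whose autocorrelation falls outside the band suggest structure
# (here, periodicity) rather than a purely random series.
significant_lags = [k for k, r in enumerate(demo_acf) if k > 0 and abs(r) > demo_bound]
```

For a series like this one, the coefficients at multiples of the period stay well outside the band instead of decaying to zero, which is the kind of pattern the text reads from Fig 4 as evidence of periodicity.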

Fig 2. Scatter distribution of the HFRS incidence in Hubei Province from 2005 to 2014.

Each plot shows the incidence of HFRS for a unique date.

https://doi.org/10.1371/journal.pone.0207518.g002

Fig 3. Normal distribution of the HFRS incidence data in Hubei Province from 2005 to 2014.

Each column is an estimate of the probability distribution of the HFRS incidence.

https://doi.org/10.1371/journal.pone.0207518.g003

Fig 4. Correlation function values for the HFRS incidence data in Hubei Province from 2005 to 2014.

(a) Autocorrelation (b) Partial autocorrelation. The ACF (autocorrelation function) values for HFRS incidence in each lag.

https://doi.org/10.1371/journal.pone.0207518.g004

Thus, according to the seasonal characteristics and stationarity analysis of the HFRS outbreaks presented above, the series for HFRS incidence distribution in Hubei Province is temporally non-stationary. As previously mentioned, a large number of studies have shown that ARIMA models are better able to fit data with a relatively stable time distribution and no strong spatial autocorrelation, but they are not so effective when there is spatiotemporal heterogeneity in the sample data [30–32]. This was the original reason for the development of Cliff and Ord’s Spatiotemporal Autoregressive Integrated Moving Average (STARIMA) model [33]. However, the accuracy of this model is still limited for non-stationary series. There is therefore a need for a new spatiotemporal series model that is capable of analyzing the seasonal characteristics and stationary distribution of the HFRS outbreaks in Hubei to improve the precision of the predictions.

Construction of a seasonal difference Spatio-temporal autoregressive integrated moving average (SD-STARIMA) Model

By building upon both the ARIMA model and the STARIMA model, the SD-STARIMA model not only inherits the functions of STARIMA, but also has its own particular advantages. In this paper, the ARIMA analysis was conducted using SPSS 22 and the STARIMA analysis was conducted using an R package. Construction and analysis of the SD-STARIMA model was conducted using MATLAB.

Principles of the ARIMA model

ARIMA models are able to take into account changing trends, periodic changes, and random disturbances in a time series, so they are very useful for modeling the time dependence structure of a time series. In epidemiology, ARIMA models have been successfully applied to predict the incidence of a number of infectious diseases, such as influenza [45] and malaria [46], to mention but a few [47,48]. ARIMA (p,d,q) modeling of time series originated with the work of Box and Jenkins [24]. The model-building process was designed to take advantage of associations in the sequentially-lagged relationships that usually exist in periodically collected data [49]. The following parameters were selected when fitting the ARIMA model: p, the order of autoregression; d, the order of differencing (integration); and q, the order of the moving average. Autocorrelation function (ACF) and partial autocorrelation function (PACF) graphs were used to identify the order of the moving average (MA) and the autoregressive (AR) terms included in the ARIMA model.

Fig 5 and Table 1 indicate the spatial autocorrelation results for Moran’s Index I. From this it can be concluded that the distribution of HFRS incidence in Hubei has spatial autocorrelation characteristics, so the trends for HFRS cannot be simulated using just time.
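As an illustration of the spatial autocorrelation measure referred to here, the following sketch computes the global Moran's I in its standard form, I = (N / S0) · Σ_i Σ_j w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where S0 is the sum of all weights. The four-region chain and its values are hypothetical, and the sketch does not reproduce the significance tests reported in Table 1.

```python
def morans_i(values, weights):
    """Global Moran's I for one cross-section of regional values.

    values:  list of N regional observations (for example, annual incidence).
    weights: N x N spatial weight matrix as nested lists, zero on the diagonal.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical example: four regions in a row, each adjacent to the next.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_gradient = morans_i([1, 2, 3, 4], chain_weights)   # similar neighbors
checkerboard = morans_i([1, 4, 1, 4], chain_weights)      # dissimilar neighbors
```

Positive values indicate that neighboring regions tend to have similar incidence, and negative values indicate the opposite. This is the property that motivates adding spatial terms to a purely temporal model.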

Fig 5. Distribution pattern of HFRS according to Moran's Index from 2005–2014.

Each point represents the Moran’s I value for a specific year. All of the points are joined to indicate the trend of Moran’s I for the HFRS incidence in Hubei Province.

https://doi.org/10.1371/journal.pone.0207518.g005

Table 1. Spatial autocorrelation results for the HFRS average annual incidence rate for each year in Hubei Province.

https://doi.org/10.1371/journal.pone.0207518.t001

Construction of the STARIMA model

The Space-Time Autoregressive Integrated Moving Average model, STARIMA for short, is an extension of the ARIMA model. The STARIMA model class expresses $z_i(t)$, the observation of the random variable at site i (i = 1, 2, …, N) and time t, as a weighted linear combination of past observations and errors, which may be lagged across both space and time. The basic mechanism for this representation is a hierarchical ordering of the neighbors of each site and a sequence of N×N weighting matrices, $W^{(l)}$. Matrix $W^{(l)}$ has elements $w_{ij}^{(l)}$ that are nonzero if and only if sites i and j are lth-order neighbors, and $W^{(0)}$ is defined to be the N×N identity matrix. Specifically, if $z(t)$ is the N×1 vector of observations at time t, the STARIMA model class can be expressed as follows [50]:

$$z(t) = \sum_{k=1}^{p} \sum_{l=0}^{\lambda_k} \phi_{kl} W^{(l)} z(t-k) - \sum_{k=1}^{q} \sum_{l=0}^{m_k} \theta_{kl} W^{(l)} \varepsilon(t-k) + \varepsilon(t) \qquad (1)$$

where p is the autoregressive order; q is the moving average order; $\lambda_k$ is the spatial order of the kth autoregressive term; $m_k$ is the spatial order of the kth moving average term; $\phi_{kl}$ is the autoregressive parameter at temporal lag k and spatial lag l; $\theta_{kl}$ is the moving average parameter at temporal lag k and spatial lag l; $W^{(l)}$ is the N×N matrix of weights for spatial order l; and $\varepsilon(t)$ is the random normally distributed error vector at time t [51].

(2)

This specific model is referred to as the STARIMA ($p_{\lambda_1,\lambda_2,\ldots,\lambda_p}$, $q_{m_1,m_2,\ldots,m_q}$) model. Two special subclasses of the STARIMA model are of note. When q = 0, only autoregressive terms remain, in which case the model is called a space-time autoregressive or STAR model. Models that contain no autoregressive terms (p = 0) are referred to as STMA models.
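To make the role of the spatial lag concrete, the sketch below computes a one-step forecast from the autoregressive part of the model class described above (the STAR special case, q = 0). The three-site weight matrix, the coefficient values and the function names are hypothetical illustrations, not estimates from the HFRS data.

```python
def mat_vec(matrix, vector):
    """Multiply an N x N nested-list matrix by a length-N vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def star_forecast(history, weights, phi):
    """One-step forecast using only the autoregressive terms.

    history: list of past N-vectors z(t-1), z(t-2), ... with the most recent LAST.
    weights: [W0, W1, ...] where W0 is the identity matrix.
    phi:     phi[k-1][l] is the coefficient at temporal lag k and spatial lag l.
    """
    n = len(history[-1])
    forecast = [0.0] * n
    for k, phi_k in enumerate(phi, start=1):
        z_lag = history[-k]
        for l, coef in enumerate(phi_k):
            spatially_lagged = mat_vec(weights[l], z_lag)
            forecast = [f + coef * x for f, x in zip(forecast, spatially_lagged)]
    return forecast


# Hypothetical three-site chain with row-standardized first-order weights.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
first_order = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
last_observation = [2.0, 4.0, 6.0]

# z(t) = 0.5 * z(t-1) + 0.25 * W1 z(t-1): one temporal lag, spatial lags 0 and 1.
demo_forecast = star_forecast([last_observation], [identity, first_order], [[0.5, 0.25]])
```

Each site's forecast combines its own previous value with the weighted average of its neighbors' previous values, which is what distinguishes the space-time model from fitting a separate ARIMA model per county.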

Construction of the SD-STARIMA model

By building upon the ARIMA model, the STARIMA model is able to evaluate the space functions pertaining to ARIMA. In essence, STARIMA is an extended linear regression model, so it can only describe linear autocorrelation results. That being so, STARIMA models are not well-suited to the prediction of the incidence of diseases with a seasonal epidemic pattern.

Our above analysis of the time series results for HFRS incidence in Hubei suggests that HFRS incidence does not have a stationary temporal distribution. ARIMA and ARIMA-based models require stationary time series data as a prerequisite. In view of this, a seasonal difference method was used to remove the seasonal pattern and obtain a stationary time series. The seasonal difference method produces a new time series by calculating the difference between observations that are one seasonal cycle, of length L, apart: (3)

A new series model can be obtained after the d-order difference calculation has finished. This can be formally defined as: (4)
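A minimal sketch of seasonal differencing, assuming the standard definition in which each value is replaced by its difference from the value one cycle of length L earlier, and in which a d-order difference repeats that operation d times. The function name and the demonstration series are hypothetical.

```python
def seasonal_difference(series, period, order=1):
    """Apply a seasonal difference of the given period `order` times.

    Each pass replaces z_t with z_t - z_(t - period) and shortens the
    series by `period` observations.
    """
    out = list(series)
    for _ in range(order):
        out = [out[t] - out[t - period] for t in range(period, len(out))]
    return out


# Hypothetical series: a linear trend plus a pattern repeating every 6 months.
pattern = [1, 5, 1, -2, -3, -2]
demo_series = [t + pattern[t % 6] for t in range(24)]

first_diff = seasonal_difference(demo_series, period=6, order=1)
second_diff = seasonal_difference(demo_series, period=6, order=2)
```

In this example the first seasonal difference removes the repeating pattern and leaves only the constant contribution of the trend, and the second removes that as well. Each pass costs `period` observations, which is one reason to prefer the lowest order that yields a stationary series.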

As previously mentioned, HFRS has specific spatially-distributed epidemics, with the seasonal epidemic pattern in Hubei being characteristically bimodal. The time frame for the seasonal difference calculation was therefore set to 6 months, so the interval for each order of difference in the time series is 6 months. Establishing the STARIMA model from the stationary series data involves three steps: identification, estimation and diagnostic checking [52]. The novel SD-STARIMA model proposed in this paper can be formally expressed as follows (9): (5)

Results and discussion

Selection of the order of difference

An Augmented Dickey-Fuller (ADF) test for a unit root in levels was conducted, and the results are shown in Table 2. The p value for the ADF test is 0.07. Because this exceeds the conventional 0.05 significance level, the null hypothesis of a unit root cannot be rejected, which indicates that the HFRS case series is non-stationary.

The time series for the HFRS outbreak data in Hubei Province from 2005 to 2014 after applying a first-order difference is shown in Fig 6.

Fig 6. Time series results using first-order difference for the HFRS incidence data.

The polyline is constructed using the collected HFRS incidence points after seasonal difference adjustment.

https://doi.org/10.1371/journal.pone.0207518.g006

Fig 7 presents the stationarity analysis results relating to HFRS incidence after using a first-order difference. The time series fluctuates around the value 0, indicating an overall uniform distribution. The ACF and PACF appear to be tailing off. It can be inferred from Fig 7 that, after taking the first-order difference, the time series shown in Fig 6 is a stationary time series. Therefore, for this paper we have chosen to use the first-order difference to preprocess the data.

Fig 7. Stationarity analysis using first-order difference for the HFRS incidence data.

The ACF and PACF values for HFRS incidence after seasonal difference adjustment for each lag.

https://doi.org/10.1371/journal.pone.0207518.g007

Construction of the SD-STARIMA model and comparison with the ARIMA and STARIMA models

In this section we construct ARIMA, STARIMA and SD-STARIMA models using first-order difference for the time series relating to the HFRS incidence data.

ARIMA model.

On the basis of first-order difference, the ARIMA(p,q) model can be defined as: (6)

On the basis of the ACF and PACF across different time lag values, p = 4 and q = 2 were selected as the values for this model. The autoregressive coefficient, moving average coefficient and test parameters are shown in Table 3.

STARIMA model.

For the STARIMA model, a spatial weight matrix had to be established first. First-order and second-order spatial neighborhood matrices of size 73×73 were obtained according to the spatial neighborhood relationships of 73 counties in Hubei Province (there are actually 76 counties, but 73 were used as samples and the other 3 for validation). The main diagonal elements of the first-order adjacency matrix are 0. For the off-diagonal elements, 0 indicates that the two spatial units are not adjacent and 1 indicates that they are adjacent. The first- and second-order spatial neighborhood matrices are obtained by filling in the elements at the row and column of each pair of adjacent units and then applying row standardization.
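The construction described above can be sketched as follows. The definition of second-order neighbors used here (units reachable through a shared first-order neighbor that are not themselves first-order neighbors) is an assumption, since the text does not spell it out, and the four-county chain and function names are hypothetical.

```python
def first_order_matrix(neighbors, n):
    """Binary adjacency matrix: 1 if two units share a border, 0 otherwise."""
    w = [[0.0] * n for _ in range(n)]
    for i, adjacent in neighbors.items():
        for j in adjacent:
            if i != j:  # keep the main diagonal at 0
                w[i][j] = 1.0
    return w


def second_order_matrix(first):
    """Binary matrix of neighbors-of-neighbors (assumed definition)."""
    n = len(first)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or first[i][j]:
                continue  # skip the unit itself and its first-order neighbors
            if any(first[i][k] and first[k][j] for k in range(n)):
                w[i][j] = 1.0
    return w


def row_standardize(matrix):
    """Scale each row to sum to 1; rows with no neighbors stay all zero."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


# Hypothetical layout: four counties in a row, 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
binary_first = first_order_matrix(chain, 4)
w1 = row_standardize(binary_first)
w2 = row_standardize(second_order_matrix(binary_first))
```

Row standardization means that multiplying a weight matrix by the incidence vector gives, for each county, the average incidence among its neighbors of that order.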

The space-time autocorrelation coefficients and space-time partial autocorrelation coefficients are then calculated for the HFRS incidence series (before seasonal difference). The calculated results are shown in Tables 4 and 5.
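As a rough illustration of what a space-time autocorrelation coefficient measures, the sketch below uses one commonly used sample estimator: the sum over sites and times of the spatially lagged series multiplied by the unlagged series s steps later, normalized by the square root of the product of their sums of squares. It assumes the series has already been mean-centered; the estimator form, function names and two-site data are assumptions for illustration and may differ from the software used in this study.

```python
import math


def mat_vec(matrix, vector):
    """Multiply an N x N nested-list matrix by a length-N vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def st_acf(series, weight, time_lag):
    """Sample space-time autocorrelation at one spatial lag and one time lag.

    series:   list of T observation vectors (each of length N), mean-centered.
    weight:   N x N weight matrix for the chosen spatial lag (identity for lag 0).
    time_lag: non-negative integer s.
    """
    lagged = [mat_vec(weight, z) for z in series]  # spatially lagged series
    steps = len(series) - time_lag
    num = sum(
        a * b
        for t in range(steps)
        for a, b in zip(lagged[t], series[t + time_lag])
    )
    ss_lagged = sum(a * a for row in lagged for a in row)
    ss_plain = sum(b * b for row in series for b in row)
    return num / math.sqrt(ss_lagged * ss_plain)


# Hypothetical two-site series in which the sites move in opposite directions.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap_neighbors = [[0.0, 1.0], [1.0, 0.0]]
demo = [[1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]]

same_site_lag1 = st_acf(demo, identity, 1)
neighbor_lag0 = st_acf(demo, swap_neighbors, 0)
neighbor_lag1 = st_acf(demo, swap_neighbors, 1)
```

Evaluating such a function over a grid of time lags and spatial lags gives a table of the same shape as Tables 4 and 5, from which cut-off or tailing-off behavior can be read.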

Table 4. Autocorrelation function values of HFRS incidence before seasonal difference.

https://doi.org/10.1371/journal.pone.0207518.t004

Table 5. Partial Autocorrelation function values of HFRS incidence before seasonal difference.

https://doi.org/10.1371/journal.pone.0207518.t005

The ACF values are truncated after time lag 4 and for all of the spatial lags. The PACF values are truncated after time lag 3 and for all of the spatial lags. In that case, a STARIMA (4,3) model can be constructed using the results in Table 6.

SD-STARIMA model.

Tables 7 and 8 present the calculated values for the space-time autocorrelation coefficient and space-time partial autocorrelation coefficient after applying the first-order difference series to the HFRS outbreaks data.

Table 7. ACF values of HFRS incidence after seasonal difference adjustment.

https://doi.org/10.1371/journal.pone.0207518.t007

Table 8. PACF values of HFRS incidence after seasonal difference adjustment.

https://doi.org/10.1371/journal.pone.0207518.t008

Looking at the results in Tables 7 and 8, it can be seen that both the ACF and PACF are tailing off. This indicates that a mixed STARIMA model is appropriate. A candidate space-time autoregressive moving average model, STARIMA (1,1), can now be obtained from the behavior of the ACF and PACF. The STARIMA (1,1) model can be expressed formally as: (7)

A maximum likelihood estimate is made for the STARIMA (1,1) model to obtain its parameter estimation values and hypothesis test values. The results are shown in Table 9.

Table 9. Parameter estimation and test results for the SD-STARIMA model.

https://doi.org/10.1371/journal.pone.0207518.t009

HFRS incidence prediction

Highly representative areas or areas with a high incidence of the disease were used to validate the model. Luotian, Zhongxiang and Yicheng counties were used to undertake a comparison. The observed values and predicted values for these three counties are presented in Fig 8. It can be seen that the two values are very close, indicating that the prediction results are reliable.

Fig 8. Comparison between observed and predicted values of HFRS incidence in Luotian, Zhongxiang and Yicheng counties from 2010–2014.

The continuous lines with points represent the observed HFRS incidence and the dotted lines with points represent the predicted values for HFRS incidence for each specific date.

https://doi.org/10.1371/journal.pone.0207518.g008

We also evaluated the prediction results and general performance of the ARIMA, STARIMA and SD-STARIMA models to assess their relative effectiveness. Table 10 shows the correlation coefficient (R), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and classic Akaike Information Criterion (AIC) for each of the models. It can be seen from the table that the SD-STARIMA model is more reliable and that the error between its predicted values and actual observed values is smaller. Overall, then, we can conclude as follows:

  1. The data relating to HFRS incidence in Hubei Province has a fluctuating distribution curve and is quite different from other statistically sampled data in terms of its space and time distribution features, which are characterized by an obvious seasonal distribution. An SD-STARIMA model was therefore introduced that is able to adjust for seasonal difference and thus fit the data incorporating seasonal distribution trends more effectively.
  2. The data for HFRS incidence in Hubei Province has both spatial and temporal characteristics. The SD-STARIMA model has both spatial and temporal features that are thus able to explain and simulate the HFRS tendencies in Hubei, with a spatial-temporal weight matrix being used to quantify the influence from the neighboring counties. We found that the SD-STARIMA model has a higher degree of fit as a result of its implementation of time-space autocorrelation than would be the case with time autocorrelation alone.
  3. Although the overall trend for HFRS incidence is consistent across every county in Hubei Province, the time series for the different counties is still different because of various impacting factors such as the local environment and human demography. The SD-STARIMA model is able to combine not only historical influences, but also the spatial and temporal impact from neighboring counties to evaluate the tendencies for HFRS incidence for any one specific county.
Table 10. Results for both fit and general prediction performance for the ARIMA, STARIMA and SD-STARIMA models.

https://doi.org/10.1371/journal.pone.0207518.t010
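The error measures named for Table 10 can be computed from paired observed and predicted values as in the sketch below, which uses their standard definitions. AIC is omitted because it requires the fitted model's likelihood. The function name and the sample numbers are hypothetical and are not the values reported in Table 10.

```python
import math


def prediction_metrics(observed, predicted):
    """Return R, MAPE (in percent), RMSE and MAE for paired series.

    MAPE divides by the observed values, so they must all be nonzero.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    # Pearson correlation coefficient between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    return {"R": r, "MAPE": mape, "RMSE": rmse, "MAE": mae}


# Hypothetical observed and predicted incidence values.
demo_metrics = prediction_metrics([2.0, 4.0, 5.0, 10.0], [1.0, 5.0, 5.0, 8.0])
```

Because MAPE is undefined when an observed value is zero, months with no recorded cases would need to be handled separately before applying it to county-level series.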

Having arrived at our results, we also compared them to previous studies relating to HFRS analysis and prediction. Zhang et al., for instance, used a basic Poisson regression method to examine the potential impact of climate variability on the transmission of HFRS [20]. They incorporated climatic variables across a range of lags into a basic Poisson regression model that effectively eliminated the lagged effect of the climatic variables on the number of HFRS cases. However, spatial influences and spatial lag for the HFRS data were not considered, potentially overlooking a significant set of influencing factors.

Li et al. have used a GWR (geographically weighted regression) model to identify the impact of environmental factors and social-economic factors on the spatiotemporal heterogeneity of HFRS in China [42]. In this model, spatial characteristics are taken into account when undertaking the GWR-based analysis. However, this model suffers from the opposite flaw to the one above: it overlooks temporal correlation, which is also a key influencing factor for HFRS cases in Hubei Province. Overlooking the temporal factors may similarly undermine the accuracy of the estimated results.

Conclusion

Time series-based approaches have commonly been used in the past to predict the trends in HFRS epidemics, with ARIMA models standing as a prime example. As a result of their capacity to capture both spatial and temporal variation, simulation results based on STARIMA models have been found to be more accurate than the results provided by non-spatial models like ARIMA. However, because there are also seasonal characteristics relating to the HFRS epidemics in Hubei Province, we developed a new model named SD-STARIMA that is able to incorporate adjustments for seasonal differences into space-time series analysis of HFRS outbreaks. We compared the estimates produced by ARIMA, STARIMA and SD-STARIMA for HFRS incidence data for Hubei Province and found that the SD-STARIMA model more closely predicted observed trends.

In conclusion, our examination of various possible models in this paper demonstrated the importance of analyzing seasonal differences in relation to HFRS epidemics because of the disease’s seasonal characteristics. On top of this, we found that first-order differences most closely reflect the stability data and bimodal distribution characteristics of the disease. We then constructed a first-order difference based SD-STARIMA model that is able to make accurate predictions using both space-time autocorrelation coefficients and space-time partial autocorrelation coefficients.

To validate the proposed approach, we used data relating to three counties that have a higher incidence of HFRS in Hubei Province (Luotian, Zhongxiang and Yicheng). According to the results, the SD-STARIMA model is more accurate than the ARIMA and STARIMA models and is generally much better for counties that are consistent with overall distribution trends. In that case, the SD-STARIMA model proposed in this paper has been proven to be more reliable for predicting HFRS epidemics in Hubei Province and has the potential to be more widely used for the prediction of epidemics.

References

  1. Xu ZY, Guo CS, Wu YL, Zhang XW, Liu K. Epidemiological studies of hemorrhagic fever with renal syndrome: analysis of risk factors and mode of transmission. J INFECT DIS. 1985;152(1):137–44. pmid:2861242
  2. Zou Y, Wang JB, Gaowa HS, Yao LS, Hu GW, Li MH et al. Isolation and genetic characterization of hantaviruses carried by Microtus voles in China. J MED VIROL. 2008;80(4):680–8. pmid:18297708
  3. Fang L, Wang X, Liang S, Li Y, Song S, Zhang W et al. Spatiotemporal trends and climatic factors of hemorrhagic fever with renal syndrome epidemic in Shandong Province, China. PLOS NEGLECT TROP D. 2010;4(8):e789.
  4. Wu X, Lu Y, Zhou S, Chen L, Xu B. Impact of climate change on human infectious diseases: Empirical evidence and human adaptation. ENVIRON INT. 2016;86:14–23. pmid:26479830
  5. Lin H, Liu Q, Guo J, Zhang J, Wang J, Chen H. Analysis of the geographic distribution of HFRS in Liaoning Province between 2000 and 2005. BMC PUBLIC HEALTH. 2007;7:207. pmid:17697362
  6. Wu W, Guo J, Guan P, Sun Y, Zhou B. Clusters of spatial, temporal, and space-time distribution of hemorrhagic fever with renal syndrome in Liaoning Province, Northeastern China. BMC INFECT DIS. 2011;11:229. pmid:21867563
  7. Zuo SQ, Fang LQ, Zhan L, Zhang PH, Jiang JF, Wang LP et al. Geo-spatial hotspots of hemorrhagic fever with renal syndrome and genetic characterization of Seoul variants in Beijing, China. PLoS Negl Trop Dis. 2011;5(1):e945. pmid:21264354
  8. Guan P, Huang D, He M, Shen T, Guo J, Zhou B. Investigating the effects of climatic variables and reservoir on the incidence of hemorrhagic fever with renal syndrome in Huludao City, China: a 17-year data analysis based on structure equation model. BMC INFECT DIS. 2009. pmid:19583875
  9. Wu W, Guo JQ, Yin ZH, Wang P, Zhou BS. GIS-based spatial, temporal, and space-time analysis of haemorrhagic fever with renal syndrome. EPIDEMIOL INFECT. 2009;137(12):1766–75. pmid:19393118
  10. Wang T, Liu J, Zhou Y, Cui F, Huang Z, Wang L et al. Prevalence of hemorrhagic fever with renal syndrome in Yiyuan County, China, 2005–2014. BMC INFECT DIS. 2016;16(1):69. pmid:26852019
  11. Zhang WY, Wang LY, Liu YX, Yin WW, Hu WB, Magalhaes RJ et al. Spatiotemporal transmission dynamics of hemorrhagic fever with renal syndrome in China, 2005–2012. PLoS Negl Trop Dis. 2014;8(11):e3344. pmid:25412324
  12. Li S, Ren H, Hu W, Lu L, Xu X, Zhuang D et al. Spatiotemporal heterogeneity analysis of hemorrhagic fever with renal syndrome in China using geographically weighted regression models. Int J Environ Res Public Health. 2014;11(12):12129–47. pmid:25429681
  13. Sugumaran R, Larson SR, Degroote JP. Spatio-temporal cluster analysis of county-based human West Nile virus incidence in the continental United States. INT J HEALTH GEOGR. 2009;8:43. pmid:19594928
  14. Busch M, Cavia R, Carbajo AE, Bellomo C, Capria SG, Padula P. Spatial and temporal analysis of the distribution of hantavirus pulmonary syndrome in Buenos Aires Province, and its relation to rodent distribution, agricultural and demographic variables. TROP MED INT HEALTH. 2004;9(4):508–19. pmid:15078270
  15. Weber De Melo V, Sheikh Ali H, Freise J, Kuhnert D, Essbauer S, Mertens M et al. Spatiotemporal dynamics of Puumala hantavirus associated with its rodent host, Myodes glareolus. EVOL APPL. 2015;8(6):545–59. pmid:26136821
  16. Dobly A, Yzoard C, Cochez C, Ducoffre G, Aerts M, Roels S et al. Spatiotemporal dynamics of Puumala hantavirus in suburban reservoir rodent populations. J VECTOR ECOL. 2012;37(2):276–83. pmid:23181849
  17. Zhang YH, Ge L, Liu L, Huo XX, Xiong HR, Liu YY et al. The epidemic characteristics and changing trend of hemorrhagic fever with renal syndrome in Hubei Province, China. PLOS ONE. 2014;9(3):e92700. pmid:24658382
  18. Ge L, Zhao Y, Sheng Z, Wang N, Zhou K, Mu X et al. Construction of a Seasonal Difference-Geographically and Temporally Weighted Regression (SD-GTWR) Model and Comparative Analysis with GWR-Based Models for Hemorrhagic Fever with Renal Syndrome (HFRS) in Hubei Province (China). INT J ENV RES PUB HE. 2016;13(11).
  19. Ge L, Zhao Y, Zhou K, Mu X, Yu H, Wang Y et al. Spatio-Temporal Pattern and Influencing Factors of Hemorrhagic Fever with Renal Syndrome (HFRS) in Hubei Province (China) between 2005 and 2014. PLOS ONE. 2016;11(12):e0167836. pmid:28030550
  20. Zhang WY, Guo WD, Fang LQ, Li CP, Bi P, Glass GE et al. Climate variability and hemorrhagic fever with renal syndrome transmission in Northeastern China. Environ Health Perspect. 2010;118(7):915–20. pmid:20142167
  21. Lin H, Zhang Z, Lu L, Li X, Liu Q. Meteorological factors are associated with hemorrhagic fever with renal syndrome in Jiaonan County, China, 2006–2011. INT J BIOMETEOROL. 2014;58(6):1031–7. pmid:23793957
  22. Cliff AD, Ord JK. Space-Time Modelling with an Application to Regional Forecasting. 1975.
  23. Liu Q, Liu X, Jiang B, Yang W. Forecasting incidence of hemorrhagic fever with renal syndrome in China using ARIMA model. BMC INFECT DIS. 2011;11:218. pmid:21838933
  24. Box GEP, Jenkins GM, Reinsel GC, Ljung GM. Time series analysis: forecasting and control. John Wiley & Sons; 2015.
  25. Hoel LA, Williams BM. Modeling and Forecasting Vehicular Traffic Flow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results. 2003;129(6):664–72.
  26. Wang T, Zhou Y, Wang L, Huang Z, Cui F, Zhai S. Using an Autoregressive Integrated Moving Average Model to Predict the Incidence of Hemorrhagic Fever with Renal Syndrome in Zibo, China, 2004–2014. JPN J INFECT DIS. 2016;69(4):279–84. pmid:26370428
  27. Li S, Cao W, Ren H, Lu L, Zhuang D, Liu Q. Time Series Analysis of Hemorrhagic Fever with Renal Syndrome: A Case Study in Jiaonan County, China. PLOS ONE. 2016;11(10):e163771. pmid:27706256
  28. Song X, Xiao J, Deng J, Kang Q, Zhang Y, Xu J. Time series analysis of influenza incidence in Chinese provinces from 2004 to 2011. Medicine (Baltimore). 2016;95(26):e3929. pmid:27367989
  29. Cliff AD, Ord JK. Spatial Processes: Models and Applications. London: Pion; 1981.
  30. Liu L LRSY. Predicting the incidence of hand, foot and mouth disease in Sichuan province, China using the ARIMA model. EPIDEMIOL INFECT. 2016;144(01):144–51.
  31. den Butter F A G CRLV. The use of ARIMA models in seasonal adjustment. Empirical Economics. 1985;10(4):209–30.
  32. Zheng Y, Zhou BY, Wei J, Xu Y, Dong JH, Guan LY et al. Persistence of immune responses to vaccine against haemorrhagic fever with renal syndrome in healthy adults aged 16–60 years: results from an open-label 2-year follow-up study. Infect Dis (Lond). 2017:1–6. pmid:28703073
  33. Martin RL, Oeppen JE. The identification of regional forecasting models using space: time correlation functions. Transactions of the Institute of British Geographers. 1975:95–118.
  34. Anselin L. Spatial Econometrics: Methods and Models. Boston: Kluwer Academic Publishers; 1988.
  35. Kurt S, Tunay KB. STARMA Models Estimation with Kalman Filter: The Case of Regional Bank Deposits. WORLD CONFERENCE ON TECHNOLOGY, INNOVATION AND ENTREPRENEURSHIP. 2015:2537–47.
  36. Zheng S, Wan Q, Jia M. Short-term forecasting of waterlogging at urban storm-waterlogging monitoring sites based on STARMA model. Progress in Geography. 2014;33(7):949–57.
  37. Garrido RA, Mahmassani HS. Forecasting short-term freight transportation demand: Poisson STARMA model. TRANSPORTATION RESEARCH RECORD. 1998. p. 8–16.
  38. Huang H, Lin S, Tang T, Li J. Application of RBF-STARMA model in shipping flow forecasting. In: Wu Y, editor. Advanced Materials Research. 2010. p. 893.
  39. Li Z, Miao Z. A New Precipitable Water Vapor STARMA Model Based on Newton's Method. In: Cao BY, Liu ZL, Zhong YB, Mi HH, editors. Advances in Intelligent Systems and Computing. 2016. p. 275–87.
  40. Lee SD, Lee KM. A Comparison on Forecasting Performance of STARMA and STBL Models with Application to Mumps Data. The Korean Journal of applied Statistics. 2007;20(1):91–102.
  41. Wang S, Wang J, Chen H. STARMA-network model of space-time series prediction. Application Research of Computers. 2014;31(8):2315–9.
  42. Li S, Ren H, Hu W, Lu L, Xu X, Zhuang D et al. Spatiotemporal Heterogeneity Analysis of Hemorrhagic Fever with Renal Syndrome in China Using Geographically Weighted Regression Models. INT J ENV RES PUB HE. 2014.
  43. Ge L, Zhao Y, Sheng Z, Wang N, Zhou K, Mu X et al. Construction of a Seasonal Difference-Geographically and Temporally Weighted Regression (SD-GTWR) Model and Comparative Analysis with GWR-Based Models for Hemorrhagic Fever with Renal Syndrome (HFRS) in Hubei Province (China). INT J ENV RES PUB HE. 2016;13(11):1062. pmid:27801870
  44. Khashei M, Bijari M, Hejazi SR. Combining seasonal ARIMA models with computational intelligence techniques for time series forecasting. SOFT COMPUT. 2012;16(6):1091–105.
  45. Reichert TA. Influenza and the Winter Increase in Mortality in the United States, 1959–1999. AM J EPIDEMIOL. 2004;160(5):492–502. pmid:15321847
  46. Gaudart J, Touré O, Dessay N, Dicko AL, Ranque S, Forest L et al. Modelling malaria incidence with environmental dependency in a locality of Sudanese savannah area, Mali. 2009;8(1):61.
  47. Caputo B, Manica M, D'Alessandro A, Botta G, Filipponi F, Protano C et al. Assessment of the Effectiveness of a Seasonal-Long Insecticide-Based Control Strategy against Aedes albopictus Nuisance in an Urban Area. PLoS Negl Trop Dis. 2016;10(3):e4463. pmid:26937958
  48. Luz PM, Mendes BV, Codeco CT, Struchiner CJ, Galvani AP. Time series analysis of dengue incidence in Rio de Janeiro, Brazil. AM J TROP MED HYG. 2008;79(6):933–9. pmid:19052308
  49. Li Q, Guo N, Han Z, Zhang Y, Qi S, Xu Y et al. Application of an Autoregressive Integrated Moving Average Model for Predicting the Incidence of Hemorrhagic Fever with Renal Syndrome. AM J TROP MED HYG. 2012;87(2):364–70. pmid:22855772
  50. Pfeifer PE, Deutsch SJ. Identification and interpretation of first order space-time ARMA models. TECHNOMETRICS. 1980;22(3):397–408.
  51. Lin S L HHQZ. The application of space-time ARIMA model on traffic flow forecasting. Machine Learning and Cybernetics, 2009 International Conference on. IEEE; 2009. p. 6–3408.
  52. Pfeifer PE, Deutsch SJ. Seasonal Space-Time ARIMA Modeling. Geographical Analysis. 1981;13(2):117–33.