Predicting the influence of climate on grassland area burned in Xilingol, China with dynamic simulations of autoregressive distributed lag models

The influence of climate change on wildland fire has received considerable attention, but few studies have examined the potential effects of climate variability on grassland area burned within the extensive steppe land of Eurasia. We used a novel statistical approach borrowed from the social science literature—dynamic simulations of autoregressive distributed lag (ARDL) models—to explore the relationship between temperature, relative humidity, precipitation, wind speed, sunlight, and carbon emissions on grassland area burned in Xilingol, a large grassland-dominated landscape of Inner Mongolia in northern China. We used an ARDL model to describe the influence of these variables on observed area burned between 2001 and 2018 and used dynamic simulations of the model to project the influence of climate on area burned over the next twenty years. Our analysis demonstrates that area burned was most sensitive to wind speed and temperature. A 1% increase in wind speed was associated with a 20.8% and 22.8% increase in observed and predicted area burned respectively, while a 1% increase in maximum temperature was associated with an 8.7% and 9.7% increase in observed and predicted future area burned. Dynamic simulations of ARDL models provide insights into the variability of area burned across Inner Mongolia grasslands in the context of anthropogenic climate change.


Introduction
There is strong evidence that climate change and altered fuel characteristics associated with human activity can dramatically influence area burned by wildfire at large spatial scales [1,2]. Understanding how climate influences fire in grasslands is challenging because of complex interactions between climate parameters, ignitions, and past fire history [3][4][5][6][7][8][9][10][11][12][13]. Previous research has demonstrated complex relationships between fire activity and climate variability in the extensive grasslands of Xilingol in northern China [14,15]. Fire in this region potentially creates feedbacks between climate and fire occurrence both by climate forcing related to carbon emissions from fires and changes in flammability related to post-fire succession [16,17]. The climate thresholds that potentially accelerate area burned in the Xilingol region remain poorly resolved. Identifying these thresholds is important to managers seeking to optimize the production of key ecosystem services from grasslands. Many investigations of climate influence on fire extent rely on ordinary least squares (OLS) regression techniques [1,11,12,18,19]. The OLS method is appropriate for analysis of stationary time-series-series in which the mean, variance, and autocorrelation structure are constant over time. In the case of non-stationary time series data, application of OLS may result in spurious relationships between variables [20,21].
The goal of this study is to provide a comprehensive analysis of fire-climate relationships that accounts for potential non-stationarity in the time series analyzed and that distinguishes between "short-run" perturbations that move the time series analyzed apart over relatively short timeframes and "long-run" relationships in which relationships exhibit equilibrium over time. We accomplish this goal by implementing a method that is increasingly popular in social science investigations of political and economic trends-dynamic simulations of Autoregressive Distributed Lag (ARDL) models [22]. This method conveys results by constructing counterfactual scenarios to describe, in our case, the effects on grassland area burned by perturbations in climate and carbon emissions [23]. We provide a detailed description of the implementation of these methods. Although this analysis focused on the steppe lands of northern China, this methodology will be applicable to other investigations of broad-scale climatefire relationships. Our results will provide a better understanding of the role of anthropogenic climate change on fire and help identify adaptation strategies.

Study area and data
Xilingol is located within the Autonomous Region of Inner Mongolia in northern China ( Fig  1). The climate of this region is semi-arid, and maximum temperatures have increased by approximately 1.5˚C over the last 70 years. The extensive grasslands of this region burn mostly in the months of April, May, and September (Fig 2). Between 2001 and 2018 there were 832 grassland fires covering a total area of 42,190 ha (Fig 1). Carbon emissions from these fires and other sources in Xilingol may potentially influence area burned in this region both by contributing to atmospheric climate forcing of temperature, and because the fires that caused these emissions reset succession which potentially influences future flammability (Fig 3).
We investigated the relationship between area burned and seven variables: monthly averages of minimum and maximum temperature (in degrees C), monthly average relative humidity (percentage), monthly average precipitation (in millimeters), average monthly wind speed (in meters per second), average monthly carbon emissions from fires in Xilingol (in grams centimeter 2 ), and average sunlight (in hours per month) (Fig 4). All climate data were obtained from the China Meteorological Data Sharing Service Center (http://www.cdc.cma.gov.cn/) for the period 2001 to 2018. Data for the number of hectares of grassland area burned and carbon emissions from fire and other sources were acquired from the Monitoring Center of the Ministry of Agriculture. (http://www.moa.gov.cn/). We combined biomass consumed in fire and carbon emissions in the models we describe below. We log-transformed the data to address heteroscedasticity [24], multi-collinearity [25] and assessed autocorrelation using the Durbin-Watson statistic [26,27]. Dynamic simulations of ARDL models were performed using the Stata module [28].

Stationarity test
We followed the workflow of implementing ARDL models as outlined in Figure 1 of [29]. The information in this section is adopted from methods reported previously in [30,31]. The first step in the analysis of potentially non-stationary time series data is an Augmented Dickey-Fuller (ADF) test to investigate the order of integration of variables [32]. If a series has constant mean and variance it is represented as I(0) and we call it a stationary series. A nonstationary series has a changing mean and variance which can be made stationary by taking the first difference or second difference of the series denoted as I(1) and I (2). We used the following functional form for the ADF test from Gujarati and Porter [33]: Where X t represents the variable series, Δ represents the first difference, and the lagged difference terms are included to correct for serial correlations of the disturbance terms on the right side of the equation. The Schwarz information criterion (SIC) was used for the selection of lagged differences. When θ = 0, the X t variable series has a unit root and an I(1) process governed by a stochastic trend. If the selected time series variable appears to be integrated of order one, the investigation of 2 nd order unit root is performed by using the following expression: where Δ 2 represents the second-difference operator. The variable X t is integrated of order two or I(2) if γ = 0. Suppose d shows the number of times that the variable Xt must be differenced to become stationary, then the series X t is integrated of order d or I(d). If the ADF test statistic value was higher than the critical values at a 5 percent significance level, we considered it to be a stationary series but if the test statistic value is lower than the critical values, then, we classified it as a nonstationary series. If non-stationary, the series was differenced to make it stationary. Providing variables are either integrated of order I(0) or integrated of order I(1) or both but not I(2), it is possible to estimate an ARDL model in error-correction form and perform a bounds test for cointegration of variables.

Bounds test
The dynamic simulated ARDL approach employs a bounds test to check the long-run relationship between variables. During bounds testing, a long-run relationship between variables exists if the F-statistic is greater than the upper bound critical value at a 5 percent significance level. If the F-statistic is less than the upper bound critical value, then the null hypothesis of no longrun relationship between variables cannot be rejected. If the F-statistic value lies between the upper and lower bound critical values, a decision about co-integration remains inconclusive. The null hypothesis of no long-run relationship is represented as H 0 : δ1 = δ2 = δ3 = δ4 = δ5 = δ6 = δ7 = δ8 = 0 tested during the model estimation. The ARDL bounds testing estimation model follows: Where Δ is the change operator, ln is its natural logarithm, and t-i is the optimal number of lags selection based on Schwartz Bayesian information criterion and Akaike information criterion. Delta (δ) and beta (β) are the parameters to be estimated. If a long-run relationship is  found, short-and long-run elasticities can be estimated using dynamic simulations of an ARDL model.
After confirming the long-run relationship among the variables, we incorporated the cumulative sum (CUSUM) and cumulative sum of squares (CUSUMSQ) tests developed by Brown et al. [34] to check the goodness of fit for the ARDL model [35]. These tests are performed on the residuals of the error correction model and reported in graphical form. Stable models fall within the 5% critical bound of CUSUM and CUSUMQ plots.
Dynamic simulated ARDL models. We selected the dynamic simulation ARDL model with cointegrated variables embedded in a vector autoregressive time series model (VAR) [28,36]. This method is designed to estimate the effect of a group of explanatory variables on a single response (area burned) with variable measurements taken discretely over time (also known as a single equation model framework). In contrast to ordinary least squares (OLS) regression, both current and lagged values of explanatory variables are considered in an ARDL framework and the estimated effect on the response can be instantaneous or observed gradually on future time steps. The error correction algorithm of the dynamic ARDL technique was used to develop eight models via 5000 simulations of the vector of parameters from a multivariate normal distribution [37]. We selected the best regression model using dynamic simulations ARDL model with some methodological modifications introduced with the general formulation expressed as follows: Where Δ y represents a change in the exogenous variable, α 0 is the intercept of all exogenous variables at time t − 1, which affects the level of maximum lagged p and qk to first-differences Δ with error term ε on the response of t. The null hypothesis of a level relationship is assessed using Kripfganz and Schneider [38] critical values and approximate p-values based on response surface regressions. To reject the null hypothesis of no level relationship H 0 = θ 0 + θ 1 + � � � + θ k = 0, the F-statistic from the jointly zero estimation of all parameters on the climate threshold variables in level and the lagged grassland area burned coefficient must be above the upper bound [I(1)] critical values. The empirical specification in Eq (4) can be re-written into eight conceptual models as: MODEL 2: MODEL 3: MODEL 4: Where α 0 is the intercept, θ's and β's are the parameters to be estimated and u denotes the white noise at time t. All variables are taken in logarithmic scale to stabilize the variance and so that results can be presented as "elasticities" in which coefficients represent the estimated percent change in the burned area dependent variable for a percent change in a climate independent variable.

Results
ADF and PP unit root tests indicated that maximum temperature and carbon emissions were stationary at the 10% significance level presented (Table 1). All other variables were integrated at order one I(1). The lag selection criteria presented in Table 2 was used to select the optimal lag order for the ARDL model estimation technique. The lag selection criteria (final prediction error, Schwarz Bayesian Criterion, and Hannan-Quinn Information Criterion) confirmed lag one as the optimal lag for subsequent analysis. Table 3 shows the results of the ARDL bounds cointegration test. The F-statistic of the estimated models was above the upper bounds critical values, thus rejecting the null hypothesis of no level relationship. The absence of I(2) variable validated the application of ARDL bound testing technique.

Long-run coefficients of observed and predicted area burned
An ARDL model with lag (1,1,1,1,1,0,1) was selected based on the Schwarz Bayesian Criterion. The dependent variable (grassland area burned) and the regressors with 832 observations from 2001 to 2018 were estimated using dynamic simulations of an ARDL model. Wind speed, maximum temperature, and carbon emissions showed a significant relationship to observed and predicted area burned. All other variables tested were non-significant. A one percent increase in wind speed was associated with a 20.8% and 22.8% increase in observed and predicted area burned respectively while holding other climatic variables constant. A one percent increase in maximum temperature was associated with 8.7% and 9.8% increase in observed and predicted area burned while holding other climatic variables constant. A one percent increase in carbon emission was associated with only a 2.6% and 2.8% increase in observed and predicted area burned with other variables held constant. The estimated long-run coefficients for the effect of climate variables on the grassland area burned are shown in Table 4 while a plot of the  Fig 6 shows the results of the dynamic stimulated ARDL model in which the grassland area response is predicted at various time steps after forcing a one standard deviation increment of each climate variable. These simulations showed that a one standard deviation increase in maximum temperature and wind speed would significantly increase area burned over a twenty-year period beginning in approximately ten years. Carbon emissions were associated with an increased area burned at the 75% confidence interval after ten years, but not at a 90% or 95% confidence interval.

Short-run coefficients of observed and predicted area burned
The short-run equilibrium relationship of observed and future area burned was examined using the error correction representation of the dynamic stimulated ARDL model with 832 observations over the period 2001 to 2018. The estimated elasticity of wind speed in the short run was found to be low relative to the long-run elasticity. A 1% increase in wind speed was associated with a 6.1% and 7.8% increase in observed and predicted area burned respectively while holding other climatic variables constant. The estimated short-run elasticity of carbon emission was associated with a 1.8% and 1.9% increase in observed and predicted area burned while other climatic variables were held constant. The empirical results of the short-run equilibrium relationship are presented in Table 5 while the monthly correlation between grassland area burned and climatic variables are depicted in Table 6. The area burned exhibited stronger correlations with maximum temperature than carbon emissions. The correlation of wind speed and sunlight to area burned varied between months.

Model validation
Diagnostic tests are critical to examine the independence of the residuals of the estimated models. Several diagnostic tests such as the LM test for autoregressive conditional heteroskedasticity, Breusch-Godfrey test for autocorrelation, Ramsey RESET test for functional form and Jarque-Bera test for normality were employed to verify the estimated long-and short-run elasticities of the dynamic stimulated ARDL Model. Table 5 shows that the estimated models are free from heteroskedasticity, autocorrelation, functional misspecification and are normally distributed. Fig 7 presents the plots of the cumulative sum of recursive residuals for the dynamic stimulated ARDL model and indicates that values are within the 95% confidence bands-confirming the stability of the estimated models.

Discussion
This study demonstrates that in the long-run, grassland area burned in Xilingol is more sensitive to changes in wind speed and temperature than other climate variables [10,15,39]. Past fire history and the carbon emissions that resulted from these fires had a marginal influence on the projected future fire. Area burned is likely to increase given warmer winter and spring temperatures related to directional climate change [6]. Variability in precipitation, in contrast, had no significant effect on area burned, which is surprising because fuel moisture often plays a critical role in fire spread. There are three potential explanations for the lack of a clear relationship between precipitation and area burned. First, our study area has relatively low precipitation during every year of the fire season and fuel moisture is usually at critical levels during fire season so that ignition and wind events are the key determinates of burned area. It is also possible that the field capacity of soils in Xilingol grasslands is large enough that fuel moisture does not decrease because plants have sufficiently deep roots to keep fuel moisture at normal levels even during drought. The third possibility is that our statistical approach is unable to accurately estimate the precipitation response because there is insufficient variation in precipitation over the 18-year time series examined.
Other studies (e.g. [14]) failed to find strong correlations between area burned and wind speed, possibly because of a shorter climate record examined (2000-2014 instead of 2001- 2018) or different statistical methodology. Existing literature- [6,14,40] often ignore lag selection in statistical models, which may result in spurious regressions. Although we demonstrate an important influence of climate on area burned across Xilingol, anthropogenic ignitions, urbanization, agriculture, and management practices may account for substantial variability in fire occurrence and pattern [41,42]. Extreme wind events occur in northern China on a regular basis, but do not regularly result in large fire events. A large number of fires are reliant on the concurrency of these weather events related to human activities [43]. Over 95% of ignitions are due to humans [4,16], and as populations increase, we expect a greater chance of ignitions during severe fire weather conditions.
Our research suggests that the combination of increasing anthropogenic pressure on grasslands in concert with continued warming temperatures will likely increase burning in the northern China steppe, which may have significant effects to the livestock industry and conservation efforts [7,12]. Rehabilitation and following fire in arid and semi-arid landscapes require significant time and expense. Our research demonstrates that there may be climatic thresholds past which point rising summer temperature and high wind speed events could lead to abrupt increases in area burned. The response of wind speed related to grassland area burned is the most critical threshold, suggesting that change in intensity of wind speed is particularly impactful.

Conclusion
Anthropogenic climate change along with an increase in the human population is likely to significantly increase the impact of fire on the globally important grassland ecosystems of Xilingol. This study successfully utilized a dynamic simulated Autoregressive Distributed Lag (ARDL) model to determine the climate variables that have the greatest effect on the area of grassland burnt. Our results indicated that many factors predicted to have an influence of the area burned-such as precipitation-were not as influential as expected. The most important factors influencing area burned are maximum temperature and wind speed. Although our results indicate that an increase in area burned is inevitable, the fire environment is not independent of human activities and changes in fire pattern will also depend on human action, government policy, and social goals.