Conceived and designed the experiments: ERL DAC DWP. Performed the experiments: ERL. Analyzed the data: ERL. Wrote the paper: ERL DAC DWP. Analysed the data and performed statistical analysis: ERL DWP.

The authors have declared that no competing interests exist.

Few studies have quantified regional variation in tree mortality, or explored whether species compositional changes or within-species variation are responsible for regional patterns, despite the fact that mortality has direct effects on the dynamics of woody biomass, species composition, stand structure, wood production and forest response to climate change. Using Bayesian analysis of over 430,000 tree records from a large eastern US forest database we characterised tree mortality as a function of climate, soils, species and size (stem diameter). We found (1) mortality is U-shaped vs. stem diameter for all 21 species examined; (2) mortality is hump-shaped vs. plot basal area for most species; (3) geographical variation in mortality is substantial, and correlated with several environmental factors; and (4) individual species vary substantially from the combined average in the nature and magnitude of their mortality responses to environmental variation. Regional variation in mortality is therefore the product of variation in species composition combined with highly varied mortality-environment correlations within species. The results imply that variation in mortality is a crucial part of variation in the forest carbon cycle, such that including this variation in models of the global carbon cycle could significantly narrow uncertainty in climate change predictions.

An understanding of tree mortality is central to any predictive understanding of forest dynamics. The long-term dynamics of woody biomass are regulated by the difference between gains through individual growth and losses through mortality. This makes tree mortality a crucial determinant of the forest carbon cycle, the future of which is a major source of uncertainty in Earth System Model predictions of future climate

The simplest approach to making predictions about mortality in a changing world would be to correlate stand-level mortality obtained from permanent plot data with climatic variables, and use these relationships to predict changes under future climate scenarios. The problem with this approach is that it neglects the effects of species, individual size and competition, factors that individually have been shown to strongly affect mortality at the scale of the individual tree, with potentially serious consequences for landscape-level predictions. In order to predict the impacts of changing climate on forest-level mortality, it is therefore important to isolate the effects of these factors because they are likely to show complex, semi-independent changes in the future. For example, in much of the temperate zone, many forest stands are successional and regenerating, undergoing directional change in species composition independent of any changes in the environment

Here we use the Eastern USA Forest Inventory and Analysis (FIA) dataset to parameterise, for each of 21 common US tree species, a logistic regression model that assigns an annual probability of mortality to an individual tree given its size, species identity, competitive environment (plot basal area) and physical environment. We estimate the nature and relative magnitude of the different factors affecting tree mortality and parameterise a model that could be useful in predicting potential responses of US forest carbon stocks to climate change (e.g.

We used the pre-1999 USA Department of Agriculture Forest Inventory and Analysis (USDA FIA) dataset containing tree-level data for 182 species from a network of plots distributed across the Eastern USA ^{2} ha^{−1}). The FIA survey was designed specifically to allow accurate estimates of average forest characteristics such as species composition and average tree size through scaling from the tree, through the stand, to the regional level

Before analysis began, the dataset was filtered to include only those dead trees that we could be certain were not removed by human activity, and to remove various kinds of errors in the data (e.g. false mortality events corresponding to subplots that were measured in the first, but not the second, survey). The model was parameterised for 21 of the most common species, using 438,401 individual tree records in total, accounting for around 60% of all trees in the reduced dataset. Due to the high number of possible predictors being considered, only species with over 10,000 individuals in the data set were used for parameterising the model. Of these, two species (

Since little is known about the geographical variation in tree mortality we had little information to judge which climatic factors might correlate with mortality. However, there have been many studies linking growth with a wide variety of climatic variables; for example, solar radiation,

We assigned environmental factors to each tree using two sources of environmental data, both available on a 0.5°×0.5° grid. The first source was the CRU05 climatology product (Climatic Research Unit, University of East Anglia:

To avoid convergence problems during parameter estimation, we applied principal component analysis (PCA) to the 14 different environment variables (from both the VEMAP and CRU05 data) to remove highly correlated variables. Among highly correlated variables, the variable with the highest weighting in the principal components was retained and the rest discarded. This left four CRU05-derived climatic variables (radiation, yearly precipitation, mean annual temperature and maximum wind speed) to be included as possible mortality predictors, plus one FIA soil texture classification associated with each tree. We normalised each factor (i.e. subtracted the mean value and divided by the standard deviation) to allow for a simple comparison between the magnitudes of effects of each of the factors. We also checked that plot basal area was not highly correlated with the remaining climate variables.
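The normalisation step can be illustrated with a minimal Python sketch. The precipitation values are invented, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
import statistics

def normalise(values):
    """Return z-scores: subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    # Assumption: population standard deviation (the text does not specify).
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical yearly precipitation values (mm) for four grid cells.
precip_mm = [650.0, 800.0, 1100.0, 1250.0]
precip_z = normalise(precip_mm)
```

After this transformation every predictor has mean 0 and standard deviation 1, so fitted coefficients for different predictors are on a comparable scale.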

Tree mortality is a difficult property to estimate because, unlike growth, it has only two possible outcomes for each re-measured tree (survived or died), and typical tree mortality rates are low (on the order of 0.1 to 2% year^{−1}), such that large sample sizes and/or long re-measurement periods are required. Moreover, this dataset contained varying re-measurement intervals, meaning that a simple ‘proportion dead’ would not have been informative.
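One common way to handle varying re-measurement intervals is to work with an annual mortality probability and compound it over each tree's interval. The sketch below assumes a constant annual probability and independence between years; it illustrates the general idea and is not necessarily the exact likelihood used here. The function names and records are hypothetical.

```python
import math

def survival_probability(p_annual, interval_years):
    """Probability of surviving the whole interval, assuming a constant,
    independent annual mortality probability (an assumption of this sketch)."""
    return (1.0 - p_annual) ** interval_years

def log_likelihood(records, p_annual):
    """Log-likelihood of (died, interval_years) records for one annual rate."""
    total = 0.0
    for died, interval_years in records:
        s = survival_probability(p_annual, interval_years)
        total += math.log(1.0 - s) if died else math.log(s)
    return total

# Hypothetical records: (died?, years between surveys).
records = [(False, 10.0), (False, 8.0), (True, 12.0), (False, 10.0)]
ll_low = log_likelihood(records, 0.01)
ll_high = log_likelihood(records, 0.5)
```

Because the interval enters the likelihood directly, trees re-measured after 8 years and after 12 years contribute to the same annual rate without bias from the raw proportion dead.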

We included different combinations of the predictor variables: dbh (continuous); soil type (discrete, ranging from 1–5); plot basal area (i.e. FIA inventory plot) (continuous); and environmental variables (all continuous) as follows:_{1} is a function of the first predictor variable (e.g. dbh), _{2} is a function of the second (e.g. precipitation), and so on. Initial analysis indicated that the relationship between dbh and mortality was U-shaped, corresponding to high mortality in small trees, low mortality for medium sized trees (typically 25–40 cm) and increasing mortality in larger trees. To describe this relationship we tried several different model equations and found the best fit to the data using the following functional form_{i}^{2} ha^{−1}):_{i}

We used Bayesian methods based on Metropolis-Hastings Markov Chain Monte Carlo sampling _{i}

We used non-informative uniform priors on all parameters so the MCMC algorithm (see below) needed to refer to the log-likelihood only. However, for numerical reasons we imposed upper and lower limits on the allowable values of all parameters, i.e., a prior probability of 0 on parameter values outside of the allowable range. We set the allowable range much wider than the plausible values, and also checked the posterior distributions to make sure the tails of the posterior distributions were a long way from the edge of the allowable range.

The next step was to estimate values for the parameter set _{k}_{k}_{k}

We implemented the MCMC algorithm by initializing each parameter value at a random point close to the middle of the allowable range, allowing a suitable burn-in period (between 25,000 and 1,000,000 iterations) for the algorithm to reach quasi-equilibrium, then recording every 100^{th} sample of
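The sampling scheme described above (bounded uniform priors, a start near the middle of the allowable range, a burn-in period, then thinning) can be sketched as a plain random-walk Metropolis-Hastings sampler. This is a minimal, non-adaptive illustration: the function name, proposal step sizes and the toy target are assumptions, not the implementation used in the study.

```python
import math
import random

def metropolis_hastings(log_lik, lower, upper, n_burn, n_keep, thin, step, seed=0):
    """Random-walk Metropolis-Hastings with uniform priors on [lower, upper]."""
    rng = random.Random(seed)
    # Start each parameter at a random point close to the middle of its range.
    theta = [lo + (hi - lo) * (0.5 + 0.1 * (rng.random() - 0.5))
             for lo, hi in zip(lower, upper)]
    current = log_lik(theta)
    samples = []
    for it in range(n_burn + n_keep * thin):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, step)]
        # Uniform prior: zero probability outside the allowable range,
        # so out-of-range proposals are always rejected.
        if all(lo <= p <= hi for p, lo, hi in zip(proposal, lower, upper)):
            candidate = log_lik(proposal)
            # Inside the range the prior is flat, so only the likelihood ratio matters.
            if rng.random() < math.exp(min(0.0, candidate - current)):
                theta, current = proposal, candidate
        # After burn-in, record every `thin`-th state.
        if it >= n_burn and (it - n_burn + 1) % thin == 0:
            samples.append(list(theta))
    return samples

# Toy target (hypothetical): a standard normal log-density for one parameter.
samples = metropolis_hastings(
    log_lik=lambda th: -0.5 * th[0] ** 2,
    lower=[-10.0], upper=[10.0],
    n_burn=1000, n_keep=2000, thin=10, step=[1.0],
)
```

Setting the bounds much wider than the plausible values, as in the toy example, means the limits act only as a numerical safeguard and do not shape the posterior.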

As metrics to compare alternative models, we calculated, for each model

Given the high number of possible mortality predictors, the options of functional forms presented by Eqns (2)–(7) and the choice of species-specific or global for any parameter, there was a very large set of possible models

First, we established which of the possible predictor variables was the best single predictor of mortality by parameterising all possible mortality models featuring one predictor variable (referred to here as 1-d models). This set of models was still relatively large (28 different models), since the predictor variable in question could be included using a linear or non-linear function, and with species-specific or global parameters (see Eqns (4)–(6)). We also tested some of the closely correlated alternative climate predictors in this way, but none gave a better fit than the set we had already chosen. Comparing the AIC and BIC values associated with each model allowed us to determine whether, considered in isolation, each predictor variable was best described using species-specific vs. global parameters, and a linear vs. non-linear functional form (see Eqns (4)–(6)). This analysis suggested that all predictor variables were best described using non-linear, species-specific functional forms. Therefore we decided to retain, within the larger set of all possible models, only those models that included non-linear functional forms. Further, comparing the maximum likelihood of the different 1-d models allowed us to rank the predictor variables in descending order of importance (meaning importance considered in isolation). The rank was: size>>radiation>yearly precipitation>mean annual temperature>plot basal area>maximum wind speed>soil type. Since size (dbh) was by far the best single predictor of mortality, we decided at this point to discard, from the large set of all possible models, any models not including dbh as a predictor variable.
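For reference, the comparison of a global and a species-specific 1-d model can be sketched with the standard definitions of AIC and BIC. The log-likelihood values and parameter counts below are invented for illustration; only the number of tree records (438,401) and the number of species (21) are taken from the text.

```python
import math

def aic(max_log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * max_log_lik

def bic(max_log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * max_log_lik

N_TREES = 438_401  # tree records used for parameterisation (from the text)

# Hypothetical fits: 3 shared parameters vs. 3 parameters for each of 21 species.
global_fit = {"log_lik": -2000.0, "k": 3}
species_fit = {"log_lik": -1000.0, "k": 3 * 21}

aic_global = aic(global_fit["log_lik"], global_fit["k"])
aic_species = aic(species_fit["log_lik"], species_fit["k"])
bic_global = bic(global_fit["log_lik"], global_fit["k"], N_TREES)
bic_species = bic(species_fit["log_lik"], species_fit["k"], N_TREES)
```

Lower values indicate the preferred model. With this many observations BIC penalises each extra parameter much more heavily than AIC, so a species-specific model has to improve the fit substantially to be preferred under both criteria.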

Second, we sought, within the remaining set of models, the best set of environmental variables to include in the model. Since radiation was the best single environmental predictor, we tested each additional environmental predictor to find the best two-predictor combination, using species-specific responses, giving a model of the form:

These steps gave us four types of predictor variable: the constant α (Eqn (2)), dbh (Eqn (3)), the set of five non-linear environmental effects (Eqn (5)) and the non-linear competition effect (plot basal area: Eqn (7)). To determine the final model form we generated a set of models that allowed us to test whether each type of predictor should be species-specific or shared, and whether the extra model complexity added by including environmental and competition effects in the simple size model was justified by the improvement in fit. We tested models using every combination of species-specific or shared effects for each type of predictor, as well as every combination with or without environment and competition effects (36 models in total). The full list of models tested is shown in
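One enumeration consistent with the stated total of 36 models is sketched below. It assumes that the constant and size terms are always present (shared or species-specific) while the environment and competition terms may also be left out, giving 2 × 2 × 3 × 3 combinations; this reading is an assumption, and the labels are illustrative.

```python
from itertools import product

# "FL" = shared, forest-level effect; "SS" = species-specific effect.
ALWAYS_PRESENT = ("FL", "SS")        # assumed options for constant and size
OPTIONAL = ("none", "FL", "SS")      # assumed options for environment and basal area

models = [
    {"constant": c, "size": s, "environment": e, "basal_area": b}
    for c, s, e, b in product(ALWAYS_PRESENT, ALWAYS_PRESENT, OPTIONAL, OPTIONAL)
]

# The fully species-specific model is one member of the candidate set.
full_model = {"constant": "SS", "size": "SS", "environment": "SS", "basal_area": "SS"}
```

Each candidate would then be fitted and scored with AIC and BIC to choose the final model form.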

For the majority (74%) of parameters, the 95% posterior interval did not include 0, indicating statistically significant effects for these parameters. None of the posterior intervals for the constant or size parameters (α_{j}, β_{1j} and β_{2j} in Eqn (4)) included 0, while the least significant deviations from zero were seen for soil type and maximum wind speed parameters, and for the species

In order to compare the different mortality rates predicted for each species we calculated a single ‘baseline’ mortality of each species as the predicted mortality of a tree of standard size growing in a standard environment (we used both the mean environment, taken over the study region, for all variables together with a ‘mesic’ soil texture; and the species' own median environment). We chose to use 20 cm as the standard stem diameter because it is approximately the size of a canopy tree

In order to visualise geographical patterns in observed mortality rates, we calculated a mortality rate for each plot (“plot-averaged mortality”) by fitting a single-parameter logistic model to the data, and used the coordinates of each plot to create a regional mortality map. We visualised geographical patterns in predicted mortality rates by creating simulated datasets which were identical to the original dataset except that whether each tree died or not was determined using the model's posterior parameter values. We then used the simulated data to calculate a model-predicted mortality rate for each plot. For each tree _{i}_{i}_{i}
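The two steps described above, estimating a single annual mortality rate for a plot and simulating dead/alive outcomes from predicted probabilities, can be sketched as follows. The sketch assumes a constant annual mortality probability compounded over each tree's re-measurement interval and uses a simple ternary search in place of the fitting procedure used in the study; all names and data are hypothetical.

```python
import math
import random

def log_likelihood(p_annual, records):
    """records: (died, years) pairs; annual probability compounded over years."""
    total = 0.0
    for died, years in records:
        log_survive = years * math.log1p(-p_annual)
        # -expm1(x) = 1 - exp(x) is the probability of dying within the interval.
        total += math.log(-math.expm1(log_survive)) if died else log_survive
    return total

def fit_plot_mortality(records, iterations=200):
    """Maximise the one-parameter likelihood by ternary search (intervals >= 1 yr)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1, records) < log_likelihood(m2, records):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def simulate_statuses(p_annual_by_tree, years_by_tree, rng):
    """Draw True (died) or False (survived) for each tree from its predicted rate."""
    return [rng.random() < -math.expm1(t * math.log1p(-p))
            for p, t in zip(p_annual_by_tree, years_by_tree)]

# Hypothetical plot: 20 trees re-measured after 10 years, 2 of which died.
plot_records = [(True, 10.0)] * 2 + [(False, 10.0)] * 18
p_hat = fit_plot_mortality(plot_records)

rng = random.Random(1)
simulated = simulate_statuses([p_hat] * 1000, [10.0] * 1000, rng)
```

Applying the same plot-level fit to observed and simulated data puts the two maps on a comparable footing, because both go through the same estimator.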

We also wanted to create maps showing how mortality varies regionally in response to variation in species identity, stand structure (stem size and plot basal area) and environmental conditions, whilst controlling for variation in the other factors. We devised an approach to do this, based on creating simulated datasets in various different ways which selectively removed

Our maps are an imperfect way to partition spatial variation, but this method allows us to analyse variation in mortality due to each factor by selectively controlling for variation in the others. Had we chosen a different size of tree or a different set of environmental conditions, we would have seen the same spatial variation in mortality rates but the overall level of mortality would have been different. Since different species responded in different ways to changes in environment and stand structure, we also calculated variation in mortality due to these factors but using

We were also interested in seeing how mortality varied along the range of each predictor, both for all species together (“forest-averaged mortality”) and for each individual species (“species-averaged mortality”). We generated estimates of how observed mortality varied along the range of each predictor by binning the raw data according to the predictor of interest into equal-sized bins (i.e. each containing the same number of stems) and finding the best single annual mortality rate for the whole bin in the same way as before, using a single-parameter logistic model. We did this both for the raw data (for just the 21 species for which we parameterised the model) and for all the data (including the rare species). In order to compare this to the model predictions for all species together (the forest-averaged mortality) we created 100 sets of simulated data as before (i.e. data of the same form as the original dataset but with alive/dead status based on our model predictions), ordered and binned these according to the variable of interest and calculated a single mortality rate for each bin, and a 95% confidence interval on this rate. Thus the forest-averaged mortality accounted for simultaneous changes in species composition and size structure across whichever gradient was being considered, and could be compared to the observed data.
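The equal-count binning described above can be sketched in a few lines of Python. The record layout and the `dbh_cm` field are hypothetical, and when the number of stems is not an exact multiple of the number of bins this sketch lets bin sizes differ by at most one.

```python
def equal_count_bins(records, key, n_bins):
    """Sort records by the predictor and split them into bins of (near-)equal count."""
    ordered = sorted(records, key=key)
    n = len(ordered)
    return [ordered[b * n // n_bins:(b + 1) * n // n_bins] for b in range(n_bins)]

# Hypothetical tree records with a stem diameter and a dead/alive flag.
trees = [
    {"dbh_cm": 31.0, "died": False}, {"dbh_cm": 12.5, "died": True},
    {"dbh_cm": 45.2, "died": False}, {"dbh_cm": 18.0, "died": False},
    {"dbh_cm": 27.3, "died": False}, {"dbh_cm": 60.1, "died": True},
    {"dbh_cm": 22.4, "died": False}, {"dbh_cm": 15.9, "died": False},
    {"dbh_cm": 38.8, "died": False}, {"dbh_cm": 52.6, "died": False},
]
bins = equal_count_bins(trees, key=lambda t: t["dbh_cm"], n_bins=3)
bin_sizes = [len(b) for b in bins]
```

A single annual mortality rate would then be fitted to each bin, so that every estimate rests on a similar number of stems regardless of how unevenly the predictor is distributed.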

Finally, for each species we were interested in how mortality varied with changes in the variable of interest alone, but since the predictors (size, environment and stand basal area) all co-varied along each gradient we calculated the median conditions in which each of the species was found. For each model predictor we created 100 simulated datasets using parameter values randomly chosen from the joint parameter posterior distributions. In these datasets, each tree was given the median condition of its species (apart from the predictor of interest) and was assigned as dead or alive based on its predicted annual mortality rate. For example, in order to examine the sole effect of temperature change on mortality we re-assigned each tree the median size, precipitation, radiation, maximum wind speed, soil type and stand basal area in which its species was found in the original dataset, kept only the temperature information for each individual tree, and then created the 100 simulated datasets as before by selecting 100 parameter sets at random from the joint posterior. This gave us a spread of mortality vs. temperature functions for each species, where the spread represents parameter uncertainty (variation in parameters causing variation in probability of mortality) and sampling (random variation in whether a tree lived or died given the probability of mortality). This allowed us to consider only the effect of temperature on that species' mortality, whilst modelling the species in a reasonable environment.

Using AIC and BIC, we found that the 7 best performing models all included species-specific environment effects, even when other predictors were not species-specific, or when plot basal area was not included. Plot basal area was only found to be a worthwhile predictor if its effects were species specific but did not benefit the model if the effect was shared among species. Models with non species-specific constant or size effects performed well, but the model with all predictors included as species-specific performed significantly better than all the others, according to both AIC and BIC. Therefore in our final model the function k (Eqn (1)) took the form:_{j} and the β_{j}s were the parameters estimated (so a different function k_{j} was estimated for each species). The MLEs, Bayesian means and confidence intervals for the parameters for each species of the best fit model (Eqn (9)) are given in

Species showed very different baseline mortality rates, even when other effects were factored out (

| Species | Mortality in forest mean environment: annual mortality rate | 95% CI | Mortality in each species' median environment: annual mortality rate | 95% CI |
| | 0.0035 | (0.0034, 0.0038) | 0.0022 | (0.0020, 0.0023) |
| | 0.0108 | (0.0104, 0.0112) | 0.0052 | (0.0049, 0.0054) |
| | 0.0012 | (0.0010, 0.0014) | 0.0009 | (0.0008, 0.0011) |
| | 0.0011 | (0.0009, 0.0012) | 0.0026 | (0.0023, 0.0028) |
| | 0.0020 | (0.0018, 0.0022) | 0.0017 | (0.0014, 0.0019) |
| | 0.0016 | (0.0014, 0.0018) | 0.0008 | (0.0007, 0.0010) |
| | 0.0040 | (0.0037, 0.0042) | 0.0048 | (0.0045, 0.0052) |
| | 0.0005 | (0.0004, 0.0006) | 0.0009 | (0.0008, 0.0010) |
| | 0.0180 | (0.0170, 0.0187) | 0.0323 | (0.0311, 0.0336) |
| | 0.0044 | (0.0040, 0.0049) | 0.0098 | (0.0090, 0.0104) |
| | 0.0017 | (0.0015, 0.0018) | 0.0407 | (0.0399, 0.0414) |
| | 0.0016 | (0.0015, 0.0017) | 0.0013 | (0.0012, 0.0014) |
| | 0.0094 | (0.0088, 0.0102) | 0.0068 | (0.0063, 0.0074) |
| | 0.0004 | (0.0003, 0.0005) | 0.0005 | (0.0004, 0.0006) |
| | 0.0062 | (0.0059, 0.0064) | 0.0035 | (0.0033, 0.0038) |
| | 0.0392 | (0.0384, 0.0399) | 0.0083 | (0.0078, 0.0088) |
| | 0.0114 | (0.0110, 0.0119) | 0.0072 | (0.0068, 0.0075) |
| | 0.0157 | (0.0151, 0.0163) | 0.0054 | (0.0051, 0.0057) |
| | 0.0020 | (0.0018, 0.0022) | 0.0054 | (0.0052, 0.0057) |
| | 0.0047 | (0.0042, 0.0052) | 0.0282 | (0.0271, 0.0299) |
| | 0.0114 | (0.0110, 0.0119) | 0.0011 | (0.0010, 0.0013) |

The relationship between size (dbh) and mortality was U-shaped for all species (

Observed and predicted forest-averaged and species-averaged annual mortality rates (deaths tree^{−1} yr^{−1}, log scale) plotted against (^{−2} day^{−1}). Each panel shows the observed trends in mortality calculated using data from all species (orange) and from the 21 most common species (green), and the predicted curves for 21 common species (grey) and the combined curve from these species (purple). Individual species mortality rates are shown vs. changes in the predictor variable of interest alone, i.e. with all other predictor variables held at the median for that species (see Supporting Information). Error bars on the predictions (grey and purple) are 95% confidence intervals calculated from an error propagation procedure that accounted for parameter uncertainty. Error bars on the observations for the whole forest including rare species (orange) and 21 species combined (green) are 95% confidence intervals for mortality rates in the data (see Supporting Information).

Predicted changes in species' average annual mortality rate (calculated at each species' median size and environment) when subjected to a hypothetical 2°C temperature increase (•) and a 20% increase in annual precipitation (□), shown plotted against the current average mortality rate without this change.

Of the several environmental factors included in the model, temperature and precipitation are particularly important in this region because they are likely to change substantially, and perhaps rapidly, under anthropogenic climate change

Forest-averaged mortality rates decreased with increasing precipitation up to a threshold of around 800 mm yr^{−1} and showed no clear trend thereafter (^{−2} day^{−1} (^{2} ha^{−1} (

The model reproduces most of the geographical patterns in plot-averaged mortality observed in the FIA dataset (compare ^{2} = 0.89). Since the model reproduced geographical variation well, we were able to decompose the variation into the separate effects of stand structure (stem-size distributions and plot basal area), environment and species (

Maps of estimated annual forest-level mortality across the Eastern United States illustrating the contributions of each of the components of the model (

However, not all variation predicted by the model was explained by a simple sum of the three components, indicating strong interactions between them. For example, both stand structure and species composition (

We found that size (dbh) was the single variable with the greatest effect on mortality rate at the level of the individual tree, with trees of intermediate size exhibiting mortality rates much lower than smaller, or larger, trees. This U-shaped relationship between size and mortality appears to be a common feature of forests, whether from sub-boreal

However, despite size being the most important single predictor of mortality at the tree scale, variation in stand size structure had almost no effect on geographical variation in plot-averaged mortality (

The mismatches we found between species-averaged and forest-averaged mortality-environment correlations imply that, under climate change, forest-averaged mortality will change in ways that cannot be anticipated by examining the current relationship between observed mortality and climate. Given that mortality is highly dependent on species identity, size and environmental factors, it is important to include all these factors in predictive models of climate-change effects. For example, consider the response of carbon stocks in the coldest regions of the Eastern US to a scenario of increased temperature. Forest-averaged mortality is currently greatest in the coldest locations, suggesting that warming should decrease mortality rates, and increase carbon stocks (

At the forest-averaged level, wind speed did not have an effect on mortality yet several species-averaged mortality rates showed a strong correlation with it (

Our results suggest that species show contrasting responses to changing environmental conditions, and that these mortality responses were strongly non-linear, which suggests that individuals within a species may respond at different rates to a change in conditions, depending on where they sit within the species range. Changes in mortality have been correlated with changing temperature and precipitation levels in the USA in other studies

Although this work presents strong evidence for marked variation in mortality with a variety of different factors, we recognise several shortcomings in terms of a lack of inclusion of external disturbance factors, forest management and history, which are all likely to affect mortality. It is also important to note a significant limitation of the study, namely that the data used cover a single survey period only (1980s–1990s), so particular quantitative results are dependent on conditions in this period and must be treated with caution. This raises the possibility that some of the patterns reported here reflect particular episodic events that may not be representative of mortality patterns averaged over the longer term. However, our four main conclusions (that mortality is U-shaped against dbh, hump-shaped against plot basal area, and species exhibit both different underlying mortality rates and different responses to changes in environmental conditions) presented in the main paper are robust unless: (a) over longer periods temporal variation completely or nearly removes all effects of species, size or environment on mortality, (b) the apparent effects of the different predictor variables on mortality uncovered here were caused

We found large and statistically significant differences in mortality among species not only in baseline mortality rates (

Table comparing model fits using AIC and BIC. Thirty-six models were run, in which the four types of model predictor in Eqn (4) (constant, size, environment, basal area) were left out or included with forest-level (FL) or species-specific (SS) effects. Total number of parameters, AIC and BIC scores and rankings are reported. Models without size and species effects were rejected very strongly, and the additional inclusion of environmental and competition variables increased model fit significantly. The best-fit model, number 26, showed a very significant improvement on the next best using both AIC and BIC.

Table of maximum likelihood estimators (MLEs), Bayesian means and 2.5% and 97.5% confidence levels calculated from the posterior distributions for each of the 15 parameters of Eqn (9) for each of the 21 common species parameterised by the adaptive MCMC algorithm. The burn-in for the algorithm was 750,000 iterations and the sampling was 250,000 iterations.

Observed and predicted mortality rates against maximum wind speed. Log annual mortality rates observed for the whole forest including rare species (orange) and the 21 common species (green), and the model predictions for the 21 species combined (purple) and each species individually (grey), plotted against maximum wind speed (m/sec). Species' error bars (grey) show parameter uncertainty, forest error bars (purple, orange and green) show the 95% confidence interval for the mortality rates predicted from the model-created and real datasets.

Observed and predicted mortality rates against soil type. Log annual mortality rates plotted against soil type for the predicted forest-level mortality rate for all 21 species parameterised by the model (purple), the real forest-level mortality rates for the 21 species (green) and the whole forest including rare species (orange). Error bars (purple, orange and green) show the 95% confidence interval for the mortality rates predicted from the model-created and real datasets.

Observed and predicted mortality rates against plot basal area. Log annual mortality rates observed for the whole forest including rare species (orange) and the 21 common species (green), and the model predictions for the 21 species combined (purple) and each species individually (grey), plotted against plot basal area (m^{2}/hectare). Species' error bars (grey) show parameter uncertainty, forest error bars (purple, orange and green) show the 95% confidence interval for the mortality rates predicted from the model-created and real datasets.

Observed versus predicted plot-averaged mortality rates. Observed versus predicted plot-averaged annual mortality rate for all plots with at least 10 stems, showing the high correlation (r^{2} = 0.9).

Patterns of mortality due to regional variation in stand structure and environment alone. Maps of estimated annual forest-level mortality across the Eastern United States illustrating the contributions of variation in stand structure (stem size and plot basal area) and environment, modelled across the range of

Regional patterns of differences between observed and predicted mortality rates. Map of absolute difference between predicted and observed forest-level mortality across the Eastern United States.

The authors would like to thank an anonymous reviewer for their helpful comments on an earlier version of the manuscript. We thank all the people involved in collecting and archiving the FIA data used in our analyses.