New Insights into Tree Height Distribution Based on Mixed Effects Univariate Diffusion Processes

  • Petras Rupšys

    petras.rupsys@asu.lt

    Affiliations Centre of Mathematics, Physics and Information Technologies, Aleksandras Stulginskis University, Kaunas, Lithuania, Institute of Forest Management and Wood Sciences, Aleksandras Stulginskis University, Kaunas, Lithuania

Abstract

The aim of this paper is twofold: to introduce the mathematics of stochastic differential equations (SDEs) for forest dynamics modeling and to describe how such a model can be applied to aid our understanding of tree height distribution corresponding to a given diameter using the large dataset provided by the Lithuanian National Forest Inventory (LNFI). Tree height-diameter dynamics was examined with Ornstein-Uhlenbeck family mixed effects SDEs. Dynamics of a tree height, volume and their coefficients of variation, quantile regression curves of the tree height, and height-diameter ratio were demonstrated using newly developed tree height distributions for a given diameter. The parameters were estimated by considering a discrete sample of the diameter and height and by using an approximated maximum likelihood procedure. All models were evaluated using a validation dataset. The dataset provided by the LNFI (2006–2010) of Scots pine trees is used in this study to estimate parameters and validate our modeling technique. The verification indicated that the newly developed models are able to accurately capture the behavior of tree height distribution corresponding to a given diameter. All of the results were implemented in a MAPLE symbolic algebra system.

Introduction

Understanding the key forces that shape tree height distribution patterns and their dynamics through the average breast height diameter within a forest stand (hereafter, diameter) is a fundamental goal of forestry [1]. Stand volume, one of the most important variables in forest management, is heavily dependent on tree diameter and height distribution. The literature on forestry reports that tree height distribution varies across different stands and/or species. The tree height distribution is of prime importance from the point of view of the quality and quantity of a stand and its future growth. The importance of using the tree height rather than the tree diameter as a predictor of forest demographics arises from the former’s high potential for predicting the properties of forest productivity, as pointed out by Kempes et al. [2]. Traditional methods quantify the tree size distribution in an even-aged forest stand [3]. Unfortunately, height and diameter distributions cannot be combined if they are estimated independently using datasets from different stands. Much research has been conducted on the use of various theoretical functions for height and diameter distribution modeling to improve stand volume prediction, such as Johnson’s distribution [4–5], the beta distribution [6], and the power-normal distribution [3]. Recently, a few results have also been published on the use of the copula approach for modeling tree height and diameter distributions in stands [7], [8].

The relationship between height and diameter varies for the same tree in different forest stands, such that there is a distribution of tree heights for any given tree diameter based on environmental conditions, or a random site effect. The different height-diameter relationships affect growth predictions and stand trajectories. The newly developed modeling approach for complex stands, based on stochastic differential equations (SDEs), uses stochastic height-diameter relationships at the individual-tree level representing tree growth and neighborhood interactions, which are then aggregated to predict the stand height structure. In this study, to project the height distribution for a given diameter, a one-dimensional SDE with mixed effects was employed. The main feature of mixed effects models is that they allow parameter vectors to vary from plot to plot by splitting regression coefficients into a fixed part, common to the population, and random components, specific to each plot [9]. Mixed effects models allow fixed and random parameters to be estimated simultaneously and allow the value of the random parameters to be evaluated for a location not present in the original estimation dataset. This approach is known as calibration and can be applied if a sub-sample of trees measured for total height and breast height diameter is available [10]. Fixed effects parameter SDEs are used in a wide range of applications in environmental, engineering, and biological modeling [11–14]. Discrete stationary stochastic models defined by Markov chains have been used to describe size-structure predictions [15].

The essential features of the developed height distributions for a given diameter may be explained as follows. Heights are measured at different diameters in a number of sample plots. The diameters and number of measurements differ among plots, and the measurements of the diameters are not evenly spaced. The diameter-dependent tree height distribution models are assumed to have some fixed effects parameters that are common to all plots and random effects that are specific to each plot. Two sources of variation were simultaneously included for modeling tree height distribution: variability between plots using a random effects approach and variability in the individual tree height using system noise, which reflects the random fluctuations around the corresponding theoretical height-diameter model. The newly developed conditional probability density functions of a tree height at a given diameter, based on diffusion processes, can be used for calculating the mean value of growth and yield attributes and its coefficient of variation as a function of tree height at any specified diameter. The random effects SDE height-diameter relationships take into account the effect of multiple causal relations in the model and the influence of unknown covariates affecting the height growth, and they allow height distributions to be developed that account for spatial variability in large-scale modeling.

In this study, the evolution of a random variable (height), H(d), for a given diameter, d, is modeled using mixed effects SDEs from the Ornstein-Uhlenbeck family [16], for example, the Vasicek, the Gompertz (3-parameters and 4-parameters), the Bertalanffy, and the Gamma. We focused on mixed effects SDEs with a deterministic term depending on random effects and a stochastic term without random effects.

The aim of this study is to present the advantages of SDEs with mixed effects in analyzing tree height distributions for a given diameter and their application to describe the evolution of the height-diameter ratio, quantile curves, mean tree height, mean stem volume and the coefficients of variation for the mean tree height and stem volume. We also discuss how a conditional height’s probability density function for a given diameter can be used to construct maximum likelihood estimators using large collections of datasets provided by the Lithuanian National Forest Inventory (LNFI). A MAPLE program was used to carry out the calculations.

Materials and Methods

A typical diffusion process is modeled as a differential equation involving a deterministic (drift) term and a stochastic (diffusion) term, the latter represented by Brownian motion [14]. Traditionally used ordinary differential equation models are the Malthus, Mitscherlich, Gompertz, and Bertalanffy types [11–13] (see Appendix A).

There are alternative ways of introducing stochasticity in an ordinary differential equation. In this work, the tree height randomness was approximated as a standard Brownian motion [11–14]. Therefore, the complete deterministic models defined by Eqs A.1, A.3, A.4, A.7 and A.9 were converted into stochastic models by assuming that the deterministic parameter, α, varies randomly around the mean: $\alpha(d) = \alpha + \sigma\,\varepsilon(d)$ (1) where σ (σ>0) is the diffusion coefficient, which reflects random fluctuations around the corresponding theoretical height-diameter curve, and ε(d) is a Gaussian white noise process. If the magnitude of the parameter capturing system noise, σ, is zero, the entire system noise term vanishes, and the remaining part of the SDEs reduces to ordinary differential forms, the solutions to which are Eqs A.2, A.5, A.6, A.8 and A.10, respectively.

The relationships between total tree height and diameter are altered by environmental conditions. Plot-specific characteristics such as soil type, nutrient status, resistance of trees to windthrow, competition for light, and elevation, among others, cause the parameters to differ across plots. In the case of between-plot variations, the fixed effects parameters α, β, and σ vary from plot to plot and, hence, account for these variations. For the construction of the mixed effects parameter models, the first step is to determine which parameters should be considered mixed effects and which should be considered purely fixed effects. The parameters with high variability could be considered mixed effects. The parameter α has high variation between plots for all of the SDE models used, so it can be altered by adding plot-specific random effects to the fixed effects parameter to produce a plot-specific parameter in the following form: $\alpha_i = \alpha + \phi_i$ (2) where ϕi (i = 1, 2, …, M) are the plot-specific random effects and M is the number of plots. It is assumed that the random effects, ϕi, i = 1, 2, …, M, are independent and normally distributed with zero mean and constant variance ($\sigma_\phi^2$).

In order to derive mixed effects SDEs height-diameter models, it is sufficient to substitute Eqs 1 and 2 into Eqs A.1, A.3, A.4, A.7 and A.9. In this study, the tree height, Hi(d), i = 1,2,…,M, evolving in M different experimental plots randomly chosen from a theoretical population was described by the Itô [17] sense SDE of the Vasicek type: (3) the 3-parameters Gompertz type: (4) the 4-parameters Gompertz type: (5) the Bertalanffy type: (6) the Gamma type: (7) where Wi(d), d≥0 are the independent standard Brownian motions, Wi(d) and ϕj are mutually independent for all 1≤i,j≤M, and M is the total number of plots used for model fitting. The term P(Hi(0) = 1.3) = 1 or P(Hi(0.001) = 1.3) = 1 (for Eq 7) ensures that if d = 0, then Hi = 1.3.
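To make the structure of such a model concrete, the following minimal sketch simulates one height path with the Euler-Maruyama scheme. It assumes the usual Ornstein-Uhlenbeck form for the Vasicek-type equation, dH = β(α + ϕ − H)dd + σdW with H(0) = 1.3, which is an assumption here because Eq 3 itself is not reproduced in this text. The function names and all parameter values are hypothetical and are not the estimates of Table 3.

```python
import math
import random


def simulate_vasicek_height(alpha, beta, sigma, phi=0.0, d_max=40.0,
                            steps=4000, h0=1.3, rng=None):
    """Euler-Maruyama path of the ASSUMED form
    dH = beta * (alpha + phi - H) dd + sigma dW, with H(0) = h0."""
    rng = rng or random.Random(0)
    dd = d_max / steps                          # diameter increment
    h = h0
    path = [h]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dd))      # Brownian increment over dd
        h += beta * (alpha + phi - h) * dd + sigma * dw
        path.append(h)
    return path


def deterministic_height(alpha, beta, d, phi=0.0, h0=1.3):
    """Solution of the same equation when sigma = 0 (no system noise)."""
    asymptote = alpha + phi
    return asymptote + (h0 - asymptote) * math.exp(-beta * d)


if __name__ == "__main__":
    plot_rng = random.Random(1)
    for plot in range(3):
        phi_i = plot_rng.gauss(0.0, 3.0)        # hypothetical plot-specific random effect
        path = simulate_vasicek_height(30.0, 0.05, 0.8, phi=phi_i,
                                       rng=random.Random(plot))
        print(plot, round(phi_i, 2), round(path[-1], 2))
```

With sigma set to zero the simulated path follows the deterministic curve, which mirrors the remark that the system noise term vanishes when σ = 0.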

Taking into account the analytical expressions of the deterministic term and the stochastic term specified by Eqs 3–7, both terms satisfy the Lipschitz and growth-restriction conditions for the existence and uniqueness of the solutions of the SDEs defined by Eqs 3–7 [18]. Transforming Eq 3 by $Y_i(d) = e^{\beta d} H_i(d)$, Eqs 4, 6 and 7 by $Y_i(d) = e^{\beta d} \ln(H_i(d))$, and Eq 5 by $Y_i(d) = e^{\beta d} \ln(H_i(d) - \gamma)$, and applying Itô's [17] formula, we deduce that the solution, Hi(d), of Eq 3 has a normal distribution and the solutions, Hi(d), of Eqs 4–7 have lognormal distributions. The conditional probability density, mean, and variance functions were deduced in Appendix B.
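As a worked illustration of the normal case, the steps can be written out under the assumption that the Vasicek-type equation has the standard Ornstein-Uhlenbeck form shown in the first line below (Eq 3 is not reproduced in this text, so this form and the symbol h_0 for the initial height are assumptions; the exact expressions are those of Appendix B).

```latex
% Assumed drift and diffusion terms (standard Ornstein-Uhlenbeck form)
dH_i(d) = \beta\bigl(\alpha + \phi_i - H_i(d)\bigr)\,dd + \sigma\,dW_i(d),
\qquad H_i(0) = h_0 = 1.3

% Ito's formula applied to Y_i(d) = e^{\beta d} H_i(d) removes H_i from the drift
dY_i(d) = \beta(\alpha + \phi_i)\,e^{\beta d}\,dd + \sigma\,e^{\beta d}\,dW_i(d)

% Integrating and transforming back
H_i(d) = (\alpha + \phi_i) + \bigl(h_0 - \alpha - \phi_i\bigr)e^{-\beta d}
         + \sigma \int_0^d e^{-\beta (d - u)}\,dW_i(u)

% The stochastic integral is Gaussian, so H_i(d) given \phi_i is normal with
E\bigl[H_i(d)\mid\phi_i\bigr] = (\alpha + \phi_i) + \bigl(h_0 - \alpha - \phi_i\bigr)e^{-\beta d},
\qquad
\operatorname{Var}\bigl[H_i(d)\mid\phi_i\bigr] = \frac{\sigma^2}{2\beta}\bigl(1 - e^{-2\beta d}\bigr)
```

Under this assumed form the mean tends to α + ϕi and the variance to σ²/(2β) as d grows, which is consistent with α + ϕ being interpreted as the asymptotic maximum height.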

An approximated maximum likelihood procedure (see Appendix C) was used for the estimation of the fixed effects parameters and random effects by assuming that tree height and diameter observations are without measurement noise.

Data

The data used for developing the models were obtained from the Lithuanian National Forest Inventory (LNFI) (2006–2010). The NFI plots are systematically distributed using a grid of 4 × 4 km squares with a random starting point. The sample plots are arranged into triangle distributed clusters with a distance between angles of 2 km. Each cluster has 4 sample plots. They are situated on each 250 m length side of square 25 m from its angles [19]. At plot establishment, the following data were recorded for every sample tree: the species, the diameter over bark at a height of 1.30 m, measured to the nearest millimeter, and the total height to the nearest quarter meter. The tree diameters were measured with outside calipers in two perpendicular directions. A total of 3,455 plots (500 m²) of Scots pine trees were chosen from the LNFI 2006–2010 database. A random sample of 1,999 plots (7,343 trees) was selected for model estimation, and the remaining dataset of 1,456 plots (5,413 trees) was utilized for model validation. Only measurements from live trees without top damage were included in the statistical analysis. Summary statistics for the diameter at breast height (d), the total height (h) and the age (A) for all of the trees used in the model estimation and validation datasets are presented in Table 1. Table 2 presents the distribution of the number of trees measured per plot in both datasets. It should be noted that data on plots with more than 10 measured trees were very limited.

Table 2. Distribution of the number of trees per plot measurement.

https://doi.org/10.1371/journal.pone.0168507.t002

Results

To examine the impact of fixed and random effects on the prediction of the height distribution, the maximum likelihood estimators (Eqs A.2 and A.10) were calculated using the NLPSolve procedure in MAPLE 11 [20]. The models with fixed effects and mixed effects were evaluated based on Akaike’s [21] information criterion (AIC), which was defined as follows: $AIC = -2L(\hat{\theta}) + 2p$ (8) where $L(\hat{\theta})$ is the maximized log-likelihood function and p is the number of parameters in the model. The model with the smallest AIC value is considered to be the best. Using the estimation dataset presented in Table 1, the parameter estimates of the fixed effects and mixed effects SDE height-diameter models, defined by Eqs 3–7, are summarized in Table 3. The standard errors of the parameter estimates were calculated by Eq C.12. All of the parameters of the fixed effects and mixed effects SDE height-diameter models are highly significant (p < 0.001). The AIC values for the fixed effects SDE height-diameter models were larger than those for the mixed effects models, indicating that random effects are needed in the height-diameter SDEs.
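The AIC comparison can be sketched in a few lines, using the standard definition of the criterion (minus twice the maximized log-likelihood plus twice the number of parameters). The log-likelihood values and parameter counts below are hypothetical placeholders, not the values reported in Table 3.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params


if __name__ == "__main__":
    # Hypothetical fits: the mixed effects model has one extra parameter
    # (the variance of the random effects).
    candidates = {
        "fixed effects": aic(log_likelihood=-15200.0, n_params=3),
        "mixed effects": aic(log_likelihood=-14100.0, n_params=4),
    }
    best = min(candidates, key=candidates.get)
    print(candidates, "->", best)
```

The extra parameter is penalized by 2, so a mixed effects model is preferred only if its log-likelihood gain exceeds that penalty.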

Table 3. Estimated parameters and AIC for all height-diameter models applied to the estimation dataset.

https://doi.org/10.1371/journal.pone.0168507.t003

Height distributions

Tree height structure is a basic modeling component of many complex forest yield models relating individual tree characteristics with stand variables. The distribution of the tree height, as a diameter dependent variable, can be approximated by classifying diameter and applying the desired transformation to the mean tree of the class [22]. A more convenient way to derive tree height distributions for a given diameter is the use of SDEs. This paper describes research aimed at deriving tree height probability densities for a given diameter (Eqs B.1, B.4, B.7, B.10 and B.13) by directly fitting the SDEs (Eqs 3–7) to the diameter and height observations. Fig 1 demonstrates the estimated probability density functions of tree height for a given diameter (Eqs B.1, B.4, B.7, B.10 and B.13) for three randomly selected plots from the estimation dataset. These probability density functions indicate that density curves are steeper for the young stands and less steep for the mature stands. In addition, Fig 1 shows that the mixed effects probability density functions have smaller variances than the fixed effects probability density functions.

Fig 1. Height’s conditional probability density functions (Eqs B.1, B.4, B.7, B.10 and B.13) for three different plots within estimation dataset.

Left–fixed effects models; right–mixed effects models; first plot–solid line and height’s dataset–cross; second plot–dash line and height’s dataset–diamond; third plot–dot line and height’s dataset–box; mean diameter within a plot is recorded in the graph.

https://doi.org/10.1371/journal.pone.0168507.g001

Several empirical methods are available for comparing conditional probability densities, as has been illustrated by [23]. In the present paper, a well-known measure of distributional accuracy known as the Kullback-Leibler Information Criterion (KLIC) [24] was utilized. We are interested in comparing two conditional probability density functions and , , 1 < s,t < 2, A,B ∈ {V,G3,G4,B,G}. Therefore, in particular, we choose conditional probability density over if: (9)

Under appropriate conditions, the KLIC has limiting distribution under the null, and is consistent against all possible fixed alternatives. The expression for KLIC in Eq 9 depends on the unknown expectation E(⋅). We consider estimating KLIC by a discrete height sample () at diameters () analogue: (10) where ni is the number of observed trees of the ith plot, i = 1,2,…,M.
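The idea of a sample analogue can be sketched as the average log-ratio of two candidate densities evaluated at the observed height-diameter pairs. This is a generic estimator written under stated assumptions: Eq 10 is not reproduced in this text, so the normalization used by the author may differ, and the two normal densities and the data below are hypothetical stand-ins rather than the densities of Appendix B.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def klic_estimate(heights, diameters, density_a, density_b):
    """Sample average of ln(f_A(h | d) / f_B(h | d)); positive values favour density A."""
    terms = [math.log(density_a(h, d) / density_b(h, d))
             for h, d in zip(heights, diameters)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical conditional densities of height given diameter.
    f_a = lambda h, d: normal_pdf(h, 0.8 * d, 2.0)
    f_b = lambda h, d: normal_pdf(h, 0.5 * d, 2.0)
    heights = [16.0, 20.0, 24.0]
    diameters = [20.0, 25.0, 30.0]
    print(klic_estimate(heights, diameters, f_a, f_b))
```

Swapping the two densities changes only the sign of the estimate, so a pairwise comparison table needs one value per unordered pair.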

Pairwise comparison of the five conditional probability densities, described in Section 2 by Eqs B.1, B.4, B.7, B.10 and B.13, was performed using the KLIC calculated on the estimation dataset. The results of the comparisons are presented in Table 4. As Table 4 shows, the Vasicek type conditional probability density function of the tree height with mixed effects (see Table 4, values above diagonal) and fixed effects (see Table 4, values below diagonal) is superior to the other densities, and the worst conditional probability density function is the Gompertz (3-parameters) type with mixed and fixed effects. All mixed effects density functions are superior to the corresponding fixed effects density functions (see bold values in diagonal, Table 4).

Table 4. Comparison of the conditional probability density functions.

https://doi.org/10.1371/journal.pone.0168507.t004

Height-diameter models

Many comparisons between the different models or ecoregions have been carried out to identify the appropriate height–diameter relationships within stands. The height dynamics defined by Eqs B.16, B.18, B.20, B.22 and B.24 are affected by many processes and vary among stands. Fig 2 illustrates the influence of the plot within Lithuanian pine forests on the mean and standard deviation of height-diameter dynamics using the Vasicek and 4-parameters Gompertz diffusion processes and random-effects parameter, ϕ, for the 3 randomly selected plots from the estimation dataset. The parameter estimates for each plot are calculated by adding the fixed effect parameter and random effect. Therefore, considering the asymptotic maximum height parameter, α+ϕ, for the Vasicek type model, the values varied from plot to plot. Fig 2 shows significant differences of tree height dynamics among the sample plots.

Fig 2. Mean (Eqs B.16 and B.20) and standard deviation (Eqs B.17 and B.21) curves of the height for the 3 randomly selected plots within estimation dataset.

First plot–black color, diameter and height dataset–cross; second plot–blue color, diameter and height dataset–diamond; third plot–red color, diameter and height dataset–box; mean height curve–solid line; mean ± standard deviation curve–dash line.

https://doi.org/10.1371/journal.pone.0168507.g002

To understand the advantages of the height-diameter equations (Eqs B.16, B.18, B.20, B.22 and B.24), three scenarios (fixed effects models, mixed effects models, and mixed effects models with the random effects set to zero) were used to predict tree height in both the estimation and validation datasets. The performance statistics of the newly developed tree height equations included three statistical indices: prediction accuracy, δ, which combines the mean bias, B, and the variation, ξ, of the biases, enabling improved assessment of model accuracy; an adjusted coefficient of determination, $\bar{R}^2$, which reflects the part of the total variance explained by the equation; and Akaike’s information criterion, AICC, which measures the quality of the height-diameter equation for a given dataset. The expressions for these statistics are as follows: where n is the total number of observations used to estimate the height-diameter model, p is the number of model parameters, and $y_i$, $\hat{y}_i$, and $\bar{y}$ are the measured, predicted, and average values of the dependent variable (total tree height), respectively.
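Since the expressions themselves are not reproduced in this text, the following sketch uses common textbook definitions as an assumption: B as the mean residual, ξ as the sample standard deviation of the residuals, δ as the square root of B² + ξ², and the adjusted coefficient of determination with the factor (n − 1)/(n − p). The author's exact formulas may differ, and the function name and the data in the usage example are hypothetical.

```python
import math


def performance_statistics(observed, predicted, n_params):
    """Bias, its variation, a combined accuracy index and adjusted R^2
    (ASSUMED textbook definitions, not necessarily the author's)."""
    n = len(observed)
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    bias = sum(residuals) / n                                   # B
    spread = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))  # xi
    delta = math.sqrt(bias ** 2 + spread ** 2)                  # combines B and xi
    mean_obs = sum(observed) / n
    sse = sum(r ** 2 for r in residuals)
    sst = sum((y - mean_obs) ** 2 for y in observed)
    adj_r2 = 1.0 - (n - 1) / (n - n_params) * sse / sst
    return {"bias": bias, "spread": spread, "delta": delta, "adj_r2": adj_r2}


if __name__ == "__main__":
    measured = [10.0, 12.0, 14.0, 16.0]      # hypothetical heights, m
    fitted = [11.0, 11.0, 15.0, 15.0]        # hypothetical predictions, m
    print(performance_statistics(measured, fitted, n_params=2))
```

A lower δ and a higher adjusted coefficient of determination both indicate a better height-diameter equation under these definitions.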

Table 5 presents the performance statistics for the tree height models for all three scenarios; these include the fixed effects model, the mixed effects model and the mixed effects model with random effects set to zero, illustrating the extent to which the inclusion of the random effects improved the performance statistics for both the estimation and validation datasets. The random effects for the validation dataset were calibrated using Eqs D.1–D.5, respectively. The results of this study show (see Table 5, Akaike’s information criterion) that the SDE Vasicek and Gompertz 4-parameters type tree height models with mixed effects are significantly superior at predicting tree height compared to the other newly developed models. Compared to the basic fixed effects models, the mixed effects models show better performance, with lower bias and a lower value of the prediction accuracy statistic, δ, and with a higher adjusted coefficient of determination evaluated over the entire dataset. The mixed effects models with random effects set to zero show the worst performance, with greater bias and a greater value of δ, and with a lower adjusted coefficient of determination evaluated over the entire dataset. Within each of the three scenarios, the fit statistics are very similar between the estimation and validation datasets.

The plots of the residuals versus predicted heights and the lowess line [25], estimated for the validation dataset, in the fixed effects and mixed effects scenarios (random effects for the validation dataset were calibrated by Eqs D.1–D.5) are presented in Fig 3. Fig 3 shows that the residuals that were calculated using the mixed effects scenario are distributed more symmetrically around zero, with approximately constant variance, compared with the fixed effects scenario. A non-parametric smoothing line, called a lowess line, shows a clear trend in the middle range of predicted height; however, what happens at the extremes is dictated by relatively little data.

Fig 3. Residuals and the lowess curve of the tree height fixed effects and mixed effects models for the validation dataset.

https://doi.org/10.1371/journal.pone.0168507.g003

Quantile regression

The conditional mean of the tree height, defined by Eqs B.16, B.18, B.20, B.22 and B.24, illuminates just one aspect of the conditional distribution of a tree height and neglects all other features of possible interest. A quantile regression model allows the predictor variable to have a more complex relationship with the response variable [26]. Our developed conditional probability density functions of tree height for a given diameter (Eqs B.1, B.4, B.7, B.10 and B.13) enable us to write the quantile equation of the tree height for any desired conditional quantile of the height distribution. Forest researchers are not interested solely in quantifying the conditional central tendency of the tree height. Quantile tree height models also allow us to explore the lower boundary relationship, which covers cramped trees with a very slender trunk. Moreover, the exact conditional density functions, defined by Eqs B.1, B.4, B.7, B.10 and B.13, can be employed in practice through quantile regression, which allows us to make height predictions using intervals that contain the tree height for a given diameter with a specific probability, 0<p<1. For the Vasicek type model the quantile functions are defined as follows: and for the Gompertz 4-parameters type model:

For example, the 10% quantile function splits off the lowest 10% of tree height predictions from the highest 90%, and the 90% quantile function splits off the highest 10% of tree height predictions from the lowest 90%. For three randomly selected plots, the 10% and 90% quantile functions are presented in Fig 4.
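Because the Vasicek type height is normally distributed and the 4-parameters Gompertz type height is lognormal after subtracting γ, quantiles of both follow from the standard normal quantile. The sketch below shows only this generic relationship; the conditional means and standard deviations passed in are hypothetical numbers, whereas in the paper they come from the expressions of Appendix B.

```python
import math
from statistics import NormalDist


def normal_quantile(p, mean, sd):
    """p-quantile of a normal height distribution (Vasicek type case)."""
    return mean + sd * NormalDist().inv_cdf(p)


def shifted_lognormal_quantile(p, mu, s, shift=0.0):
    """p-quantile of H when ln(H - shift) is normal with mean mu and standard deviation s.
    shift plays the role of gamma in the 4-parameters Gompertz type model."""
    return shift + math.exp(mu + s * NormalDist().inv_cdf(p))


if __name__ == "__main__":
    # Hypothetical conditional mean 20 m and standard deviation 2 m at some diameter d.
    low = normal_quantile(0.10, 20.0, 2.0)
    high = normal_quantile(0.90, 20.0, 2.0)
    print(round(low, 2), round(high, 2))   # an 80% prediction interval for height
```

The interval between the 10% and 90% quantiles contains the height with probability 0.8, which is how the quantile curves translate into interval predictions.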

Fig 4. 10% and 90% quantile functions for the mixed-effects models and three different plots within estimation dataset.

First plot–solid line and dataset (diameter and height)–circle; second plot–dash line and dataset (diameter and height)–diamond; third plot–dot line and dataset (diameter and height)–box.

https://doi.org/10.1371/journal.pone.0168507.g004

Slenderness ratio

Tree height to diameter ratio (slenderness ratio) is regarded as an index of the resistance of trees to windthrow and competition for light, and its mean value may be useful in determining stand stability. The slenderness ratio is calculated by dividing the tree height by its diameter at breast height. For the fixed effects and mixed effects SDE height-diameter models, defined by Eqs 3–7, the slenderness functions are defined as follows:

The slenderness ratio dynamics for three randomly selected plots are presented in Fig 5.

Fig 5. Slenderness dynamics for mixed-effects models and three different plots within estimation dataset.

First plot described by solid line–dataset (height/diameter) by circle; second plot described by dash line–dataset (height/diameter) by diamond; third plot described by dot line–dataset (height/diameter) by box.

https://doi.org/10.1371/journal.pone.0168507.g005

The findings of our investigation generally support the conclusion that the conditional probability densities of height driven by diameter correctly predict the slenderness ratio (see Fig 5). Furthermore, all newly developed tree height distribution models show a decrease in slenderness ratio with increasing diameter.

Mean stem volume

The fixed effects and mixed effects height’s conditional probability density functions allow us to revise mean stem volume calculation in the following form:

Here V(d,h) is the stem volume regression function of power form [27], $V(d,h) = \beta_1 d^{\beta_2} h^{\beta_3}$, where the parameters β1, β2, and β3 are to be estimated. The selection of the stem volume model was basically motivated by the available measured tree level characteristics. Parameter estimates were calculated by a weighted least squares technique. The estimators and their standard deviations are given in [12]. The relationship between the mean stem volume and the diameter of a tree for the fixed effects and mixed effects Vasicek and Gompertz 4-parameters type models is shown in Fig 6.
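The mean stem volume at a given diameter is the expectation of the volume function over the conditional height density, which can be approximated numerically. The sketch below assumes a power-form volume function with three parameters and a normal height density (the Vasicek type case); the coefficient values, the units, and the function names are hypothetical and are not the estimates used in the paper.

```python
import math


def normal_pdf(h, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (h - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def power_volume(d, h, b1, b2, b3):
    """Power-form stem volume function b1 * d**b2 * h**b3 (assumed form)."""
    return b1 * d ** b2 * h ** b3


def mean_stem_volume(d, mean_h, sd_h, b1, b2, b3, n=2000, width=8.0):
    """Trapezoidal approximation of the integral of V(d, h) * f(h | d) over h."""
    lo = max(mean_h - width * sd_h, 1e-9)     # heights must stay positive
    hi = mean_h + width * sd_h
    step = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        h = lo + k * step
        weight = 0.5 if k in (0, n) else 1.0  # end points get half weight
        total += weight * power_volume(d, h, b1, b2, b3) * normal_pdf(h, mean_h, sd_h)
    return total * step


if __name__ == "__main__":
    # Hypothetical coefficients; conditional height mean 20 and standard deviation 2.
    print(mean_stem_volume(25.0, 20.0, 2.0, b1=5e-5, b2=2.0, b3=1.0))
```

When the height exponent exceeds 1 the volume function is convex in height, so the mean volume is larger than the volume evaluated at the mean height; this is why integrating over the density differs from plugging in the mean.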

Fig 6. Mean stem volume for three different plots within estimation dataset.

Left–fixed effects models; right–mixed effects models; first plot (mean diameter 48.23)–solid line; second plot (mean diameter 26.88)–dash line; third plot (mean diameter 17.86)–dot line; mean diameter within a plot is recorded in graph.

https://doi.org/10.1371/journal.pone.0168507.g006

The direct effects of stand variables such as site index, management practices and thinning could be included in the newly developed models (see [28], [29]); however, their indirect effect via mixed effects (see right side of Fig 6) has been included in the mixed effects mean tree volume models.

Coefficient of variation for height and volume

The coefficient of variation is typically used to indicate the precision of the dispersion of datasets and is also often used to compare numerical distributions measured at different scales. Tree height based and tree volume based quantifications of the stand structural diversity can be performed using the coefficient of variation. The coefficient of variation reaches its maximum with two-storied stands. The coefficient of variation of tree height (tree volume) measures the variability of tree height (tree volume) relative to its mean and relates the mean and standard deviation by expressing the standard deviation as a percentage of the mean. To further discuss the results of this study, the coefficient of variation, which may help examine dispersion in tree heights occurring at diameter d, is defined by: and dispersion in tree volumes occurring at diameter d:

Fig 7 shows a plot of the coefficient of variation as a function of a diameter using the mean trend and standard deviation functions. In both cases (height and volume), the coefficient of variation of the tree height and volume evolves into a stationary coefficient of variation. The coefficient of variation based on tree height (volume) decreases with an increase in diameter.
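The approach to a stationary coefficient of variation can be illustrated with the closed-form mean and standard deviation of an Ornstein-Uhlenbeck process. This is a sketch under the assumption that the Vasicek type model has the form dH = β(α + ϕ − H)dd + σdW with H(0) = 1.3; Eq 3 and Appendix B are not reproduced in this text, and the parameter values are hypothetical.

```python
import math


def vasicek_mean(d, alpha, beta, phi=0.0, h0=1.3):
    """Conditional mean height under the ASSUMED Ornstein-Uhlenbeck form."""
    asymptote = alpha + phi
    return asymptote + (h0 - asymptote) * math.exp(-beta * d)


def vasicek_sd(d, beta, sigma):
    """Conditional standard deviation of height under the same assumed form."""
    return math.sqrt(sigma ** 2 * (1.0 - math.exp(-2.0 * beta * d)) / (2.0 * beta))


def cv_percent(mean, sd):
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    return 100.0 * sd / mean


if __name__ == "__main__":
    alpha, beta, sigma = 30.0, 0.05, 1.0     # hypothetical parameter values
    for d in (5.0, 10.0, 20.0, 40.0, 80.0):
        cv = cv_percent(vasicek_mean(d, alpha, beta), vasicek_sd(d, beta, sigma))
        print(d, round(cv, 2))
```

For large diameters the value settles at 100·σ/(√(2β)·(α + ϕ)) under these assumptions, which corresponds to the stationary coefficient of variation mentioned above.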

Fig 7. Coefficient of variation of tree height and volume for three different plots within estimation dataset.

Left–fixed effects; right–mixed effects models; first plot (mean diameter 17.86)–solid line; second plot (mean diameter 25.88)–dash line; third plot (mean diameter 48.23)–dot line; mean diameter within a plot is recorded in graph.

https://doi.org/10.1371/journal.pone.0168507.g007

Discussion

The models commonly used for height distribution fitting in a forest stand rely on tree height measurements. However, in the Lithuanian National Forest Inventory foresters measure no more than 15 heights (see Table 2) of pine trees per stand. Such sample sizes are too small for estimating the parameters by the traditionally used maximum likelihood technique [5], [30]. The newly developed height distribution models based on mixed effects parameter diffusion processes overcome this weakness. The pioneer of the SDE approach in forest growth modeling is Suzuki [31]. In this paper, linear and non-linear SDEs from the Ornstein-Uhlenbeck family were used for the height-diameter evolution by incorporating random effects into the deterministic (drift) term. This extended model describes the within-stand variation in data through the system noise reflecting the random fluctuations around the corresponding theoretical height-diameter curve and the between-stand variation in data through the random effects. The maximum likelihood estimation procedure converged for all five diffusion processes using the estimation dataset from the LNFI.

In order to predict the parameters of the tree size probability density function for a new stand, regression models on the different stand variables have traditionally been used [32]. If the diameter and height of a sub-sample of trees are known, then for the newly developed height distributions based on univariate diffusion processes the random effects can be calibrated by Eqs D.1–D.5.
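The calibration idea can be sketched as a posterior-mode estimate of the random effect for one new plot. This is not the author's formula (Eqs D.1–D.5 are not reproduced in this text); it assumes the Ornstein-Uhlenbeck form dH = β(α + ϕ − H)dd + σdW with H(0) = 1.3, treats each measured tree as an independent draw from the resulting normal density at its diameter, and uses hypothetical parameter values and a hypothetical function name.

```python
import math


def calibrate_random_effect(diameters, heights, alpha, beta, sigma,
                            sigma_phi, h0=1.3):
    """Posterior-mode estimate of phi for one plot (illustrative assumptions only)."""
    numerator = 0.0
    denominator = 1.0 / sigma_phi ** 2        # prior: phi ~ N(0, sigma_phi^2)
    for d, h in zip(diameters, heights):
        decay = math.exp(-beta * d)
        c = 1.0 - decay                       # sensitivity of the mean height to phi
        fixed_mean = alpha * c + h0 * decay   # mean height when phi = 0
        var = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * beta)
        numerator += c * (h - fixed_mean) / var
        denominator += c * c / var
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical sub-sample of three trees from a new plot.
    phi_hat = calibrate_random_effect([15.0, 22.0, 30.0], [17.5, 21.0, 25.0],
                                      alpha=30.0, beta=0.05, sigma=1.0, sigma_phi=3.0)
    print(round(phi_hat, 3))
```

The prior term shrinks the estimate toward zero, so a plot with only one or two measured trees receives a smaller correction than a plot with many measurements showing the same average departure from the fixed effects curve.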

Quantifying variability in tree height at a given diameter by a distribution law has both theoretical and practical value. First, knowledge of the tree height distribution in forest stands is important for understanding competition and self-thinning, which must be studied not in terms of the mean size of trees but in terms of the size structure of trees in a forest stand. Second, understanding tree height distribution at a given diameter is important for improving estimates of stand biomass and carbon storage. To describe how tree height distributions vary across regional scales, we developed new empirical distributions of tree height at a given diameter for Scots pine trees in Lithuania. In this paper our specific objectives were to test (1) which newly developed probability density function based on stochastic differential equation height-diameter evolution provides the best fit, (2) how the newly developed models explain observed variation in the probability density functions and in the mean height-diameter, quantile height-diameter, mean slenderness ratio and mean stem volume relationships for Scots pine trees in Lithuania, and (3) how to describe the fit of the mean height-diameter and mean stem volume relationships in terms of the relative sizes defined by the coefficient of variation.

The Kullback-Leibler Information Criterion [24] was used to compare all newly developed conditional probability density functions using the estimation dataset. The conditional probability density function derived from the Vasicek type height-diameter univariate diffusion process showed better results than the other stochastic processes used (Table 4). All mixed effects probability density functions are superior to the corresponding fixed effects density functions (see bold values in diagonal, Table 4). Theoretical validation that a height dataset observed at discrete diameters follows the univariate probability density functions defined by Eqs B.1, B.4, B.7, B.10 and B.13 is not easy, and there is no simple statistical test. The goodness of fit of the estimated univariate density functions (Eqs B.1, B.4, B.7, B.10 and B.13) was illustrated graphically in Fig 1, using the fixed effects and mixed effects scenarios and three randomly selected plots from the estimation dataset, by plotting the estimated probability density functions and the height measurements. Fig 1 showed that the estimated mixed effects and fixed effects probability density functions capture the main features of the data from three randomly selected plots well. The height-diameter evolution can be written using a wide range of mathematical relationships, from linearized fixed effects regression equations to nonlinear mixed effects generalized relationships. The mathematical technique of a system of uniform diameter and height regional functions is the approach known as the generalized model. The mixed effects regression models are able to achieve the same results as the generalized model [10, 33]. In this study the newly developed mixed effects height-diameter relationships demonstrated statistical indices similar to those of the nonlinear generalized height-diameter regression models presented by Petrauskas et al. [34].

In addition, one of the advantages of using diffusion processes for quantifying tree height distribution is that they allow us to derive the first two moments of the evolution of height and volume through diameter, and hence to calculate the relative standard deviation (coefficient of variation) for the height and volume. Fig 8 shows the coefficient of variation in pine forest stands from the LNFI estimation dataset as a function of mean plot diameter, using the mixed effects Vasicek type diffusion process. The coefficient of variation increases exponentially as the mean diameter per plot decreases; the coefficients of variation for tree height vary from 6.94% to 24.72% and for stem volume from 4.77% to 17.05%.

Fig 8. Coefficient of variation of tree height and volume for plots within estimation dataset.

Left–tree height; right–stem volume.

https://doi.org/10.1371/journal.pone.0168507.g008
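The way a coefficient of variation follows from the first two moments of a diffusion process can be sketched in a few lines. The sketch below assumes the standard Vasicek (Ornstein-Uhlenbeck) form dH = β(α − H)dd + σ dW with a fixed starting point H(d0) = h0; this parametrization, the function names, and the parameter values in the usage lines are illustrative assumptions and are not taken from the fitted models or Eq 3 of this paper.

```python
import math


def vasicek_moments(d, d0, h0, alpha, beta, sigma):
    """Mean and variance of an Ornstein-Uhlenbeck (Vasicek) process at diameter d.

    Assumed SDE (textbook form, not copied from the paper):
        dH = beta * (alpha - H) dd + sigma dW,  H(d0) = h0.
    """
    decay = math.exp(-beta * (d - d0))
    mean = alpha + (h0 - alpha) * decay
    var = sigma ** 2 / (2.0 * beta) * (1.0 - decay ** 2)
    return mean, var


def coefficient_of_variation(mean, var):
    """Relative standard deviation, in percent."""
    return 100.0 * math.sqrt(var) / mean


# Hypothetical parameter values, chosen only to exercise the functions.
for diameter in (10.0, 20.0, 40.0):
    m, v = vasicek_moments(diameter, d0=0.0, h0=1.3, alpha=30.0, beta=0.05, sigma=1.0)
    print(diameter, round(m, 2), round(coefficient_of_variation(m, v), 2))
```

For large diameters the variance in this assumed form tends to σ²/(2β) and the mean tends to α, so the coefficient of variation settles at a stationary value.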

Conclusions

This study demonstrated the use of SDEs to quantify tree height distribution at a given diameter in a forest stand using the Lithuanian National Forest Inventory dataset. The results indicated that it is possible to measure mean tree height and volume evolution with an acceptable accuracy over a broad area of Lithuania. Overall, the models explained over 90% of the variation in height predictions observed in the LNFI (2006–2010) dataset. The remaining variation was likely to have resulted from stand variables. Better performance can be expected by introducing stand variables [29]. The diffusion processes based SDE models described here implicitly model spatial effects. The technique we described can be used for developing a new generation of forest growth models.

A system of bivariate stochastic differential equations with mixed-effects parameters could be used to develop a distribution model of tree diameter and height at a given age (or a trivariate model of diameter, height and stand density at a given age). This extension to multivariate SDEs comes with an increased computational burden.

Results for both tree height and volume predictions using the mixed effects SDE Vasicek type height-diameter model indicate that the coefficient of variation over all plots for the tree height and volume (at the mean diameter of a plot) takes values from the interval 6.9%–24.8% and 1.7%–16.0%, respectively, and evolves to a stationary value from the interval 6.6%–19.8% and 1.7%–13.0%, respectively.

The field of SDEs is a large and growing area of applied mathematics that is being increasingly used to model biological systems. In this paper, new mixed effects probability density functions of height for a given diameter were developed using an Ornstein-Uhlenbeck SDE family. Unfortunately, measurements from at least one tree in a stand, or a measure of their central tendency (mean, median, or mode of diameter and height), are required for the practical calibration of the random effects for a new stand. The use of the mixed effects model enables us to develop a simple model structure without including additional predictor stand variables.

The results showed that the mixed effects Vasicek type tree height distribution models are superior to the other newly developed models.

The variance functions developed here can be used to generate weights for any linear or nonlinear least squares height-diameter regression model fitted in weighted least squares form.
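As a minimal sketch of how a variance function could supply such weights, the example below fits a straight line by weighted least squares with weights 1/v, where v is the model variance at each observation. The linear form, the function name, and the toy data are assumptions made for illustration; a nonlinear height-diameter model would use the same weights inside an iterative fit.

```python
def weighted_linear_fit(x, y, variances):
    """Weighted least squares fit of y = intercept + slope * x.

    Each observation is weighted by the reciprocal of its variance, so that
    observations with a larger model variance influence the fit less.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data (hypothetical): a transformed diameter, heights, and the variance
# that a fitted variance function would return at each diameter.
x = [2.3, 2.7, 3.0, 3.4, 3.7]
y = [9.1, 13.0, 16.2, 20.1, 22.8]
variances = [1.0, 1.5, 2.0, 2.6, 3.1]
print(weighted_linear_fit(x, y, variances))
```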

Appendix A

Deterministic models

The mathematical representation of Mitscherlich growth [35] is derived from physical chemistry, where it describes a first order irreversible chemical reaction. The deterministic height-diameter model used to describe the individual growth of a tree in terms of its size (height), h(d), at instant (diameter), d, can be written in the form of an autonomous differential equation given by the following: (A.1) where D0 is the upper limit on the diameter at the breast height. Height dynamics are irreversible, and the growth rate is proportional to the difference between the asymptotic maximum height, α, and the already formed tree height, h(d); β is the proportionality constant (β>0). The formula describing a Mitscherlich type height-diameter trajectory takes the form: (A.2)
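The verbal description above (growth rate proportional to α − h(d) with constant β) corresponds to the ordinary differential equation dh/dd = β(α − h). The sketch below compares the closed-form solution of that equation with a simple Euler integration. The initial condition h(d0) = h0 and all numerical values are assumptions introduced for illustration, so the code is not a reproduction of Eqs A.1 and A.2.

```python
import math


def mitscherlich_height(d, alpha, beta, d0=0.0, h0=0.0):
    """Closed-form solution of dh/dd = beta * (alpha - h) with h(d0) = h0."""
    return alpha + (h0 - alpha) * math.exp(-beta * (d - d0))


def euler_height(d, alpha, beta, d0=0.0, h0=0.0, steps=100_000):
    """Numerical check: forward Euler integration of the same equation."""
    step = (d - d0) / steps
    h = h0
    for _ in range(steps):
        h += beta * (alpha - h) * step
    return h


# Hypothetical values: asymptotic height 30 m, rate 0.05 per cm, start at 1.3 m.
print(mitscherlich_height(40.0, alpha=30.0, beta=0.05, h0=1.3))
print(euler_height(40.0, alpha=30.0, beta=0.05, h0=1.3))
```

The two values agree closely, and the closed form approaches α as the diameter grows, which matches the role of α as the asymptotic maximum height.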

The changes in tree height, h(d), using deterministic ordinary differential equations, developed by Gompertz [36], for the 2-parameter and 3-parameter models, respectively, are described as follows: (A.3) (A.4)

The formulas describing a Gompertz type height-diameter trajectory for the 2-parameter and 3-parameter models, respectively, are as follows: (A.5) (A.6) where α is the intrinsic growth rate of the height, β is the growth deceleration factor, γ is a threshold parameter, and represents the largest height size that the tree can sustain.

Von Bertalanffy (for a review, see, for example, Román-Román et al. [13]) hypothesized that the growth of an organism could be represented as the difference between the synthesis and degradation of its building materials. There are few theoretical equations formulated specifically for biology applications. In this paper, the tree height, h(d), is described using an ordinary differential equation: (A.7) where α, β, and γ are unknown fixed effects parameters. The formula describing the Bertalanffy trajectory follows the form of a sigmoidal function: (A.8)

The changes in tree height, h(d), using the well-known regulated Malthusian growth process [37], are described in the following form: (A.9)

The formula describing the Gamma (Malthusian) trajectory follows the form: (A.10)

Appendix B

Conditional probability densities

The solution, Hi(d), of Eq 3 has a normal distribution with conditional probability density, mean, and variance, respectively [29]: (B.1) (B.2) (B.3) and the solutions, Hi(d), of Eqs 4–7 have lognormal distributions, with conditional probability density, mean, and variance, respectively [28, 38]: (B.4) (B.5) (B.6) (B.7) (B.8) (B.9) (B.10) (B.11) (B.12) (B.13) (B.14) (B.15)

The conditional mean, m(d), and variance, v(d), functions of the tree height, H(d), for all the models (Eqs 14–17) are given by the following expressions [28], [29], [38]: (B.16) (B.17) (B.18) (B.19) (B.20) (B.21) (B.22) (B.23) (B.24) (B.25)

Appendix C

Maximum likelihood procedure

We consider the SDEs height-diameter models, as defined by Eqs 3–7, from two perspectives. First, the log-likelihood functions are derived for the fixed effects parameters models (in this case the parameters of random effects, ϕi, i = 1,…,M are assumed to be equal to their mean value E(ϕi) = 0). Second, the log-likelihood functions are derived for the mixed effects. In the sequel, K ∈ {V,G3,G4,B,G}, , , , .

The fixed effects parameters , , K ∈ {V,G3,G4,B,G} are estimated by means of an approximated maximum likelihood procedure using discrete sampling and conditional probability density functions defined by Eqs B.1, B.4, B.7, B.10 and B.13. We assume that all observations are independent (no repeated measurements are used in the dataset for model estimation). Let us consider a discrete height sample () at diameters () without measurement errors, where ni is the number of observed trees of the ith plot, i = 1,2,…,M. The associated likelihood functions for the fixed effects parameters SDEs height-diameter models (the parameters of random effects, ϕi, i = 1,…,M are assumed to be equal to the mean value E(ϕi) = 0), take the following forms: (C.1) and the log-likelihood functions are: (C.2) where ni is the number of observed trees of the ith plot i = 1,2,…,M, are fixed effects parameters (the same for all plots), density functions take the forms defined by Eqs B.1, B.4, B.7, B.10 and B.13.

The likelihood functions for the mixed effects SDE height-diameter models take the following forms: (C.3) and the log-likelihood function is: (C.4) where are fixed effects parameters (the same for all plots) and ϕi are random effects (plot specific), which are assumed to follow a normal distribution with 0 mean and constant variance , and p(ϕi|σϕ) is the normal density of the random effects.

The integral in Eq C.4 does not have a closed form solution. Because the analytic expression for the integrand in Eq C.4 is known, the Laplace method [39], [40] may be used. Let us define a function g: ℝ → ℝ as follows: (C.5)

The Laplace approximation to , i = 1,2,…,M, K ∈ {V,G3,G4,B,G} is based on a second-order Taylor series expansion about mode , i = 1,2,…,M: (C.6) where is the global max of and the root of: (C.7)

Then, the Laplace approximation of , i = 1,2,…,M, K ∈ {V,G3,G4,B,G} takes the following form: (C.8) where: (C.9)

The log-likelihood function for the mixed-effects SDEs height-diameter models is approximately given by: (C.10)

The maximization of is a two-step optimization problem. The internal optimization step estimates the for every plot i = 1,2,…,M with Eq C.9. The external optimization step maximizes after plugging the into Eq C.10.
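The internal step of this procedure can be illustrated with a one-dimensional Laplace approximation. In the sketch below, g stands for the logarithm of the integrand for one plot (an assumption about notation, since the exact definition in Eq C.5 is not reproduced here). The mode is found by a golden-section search, the second derivative is taken numerically, and the integral of exp(g) is approximated from the second-order expansion about the mode. The function name and the search interval are hypothetical.

```python
import math


def laplace_integral(g, lower, upper, tol=1e-8, eps=1e-3):
    """Laplace approximation of the integral of exp(g(phi)) over the real line.

    g must be unimodal with its maximum inside [lower, upper].
    Returns (approximate integral, mode).
    """
    # Internal optimization: golden-section search for the mode of g.
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    while b - a > tol:
        c = b - ratio * (b - a)
        e = a + ratio * (b - a)
        if g(c) > g(e):
            b = e
        else:
            a = c
    mode = 0.5 * (a + b)
    # Central finite difference for the second derivative at the mode.
    curvature = (g(mode + eps) - 2.0 * g(mode) + g(mode - eps)) / eps ** 2
    # Second-order Taylor expansion about the mode gives a Gaussian integral.
    value = math.exp(g(mode)) * math.sqrt(2.0 * math.pi / -curvature)
    return value, mode


# Example: for a quadratic g the approximation is exact.
approx, mode = laplace_integral(lambda phi: -0.5 * (phi - 1.0) ** 2 / 4.0, -10.0, 10.0)
print(approx, mode)
```

In the external step, the logarithms of such per-plot approximations would be summed over plots and maximized over the fixed effects parameters with a general-purpose optimizer.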

To assess the asymptotic standard errors of the maximum likelihood estimators for the stochastic height-diameter models, a study of the Fisher [41] information matrix was performed. The approximate asymptotic variance of the approximated maximum likelihood estimators (Eq C.10) was calculated by the inverse of observed Fisher information matrix. By defining the vector, and the matrix, s = 1,2, K ∈ {V,G3,G4,B,G}, the observed Fisher information matrix takes the following form: (C.11)

The approximate asymptotic standard errors of the fixed effects parameters are defined by the diagonal elements of the matrix, s = 1,2, K ∈ {V,G3,G4,B,G} by: (C.12)
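A hedged sketch of this calculation follows: the observed Fisher information is taken as the negative Hessian of the log-likelihood at the estimate, computed here by central finite differences, and the standard errors are the square roots of the diagonal of its inverse. The helper names and the toy normal log-likelihood are assumptions for illustration; they are not the likelihoods of Eqs C.2 or C.10.

```python
import math


def numerical_hessian(f, theta, eps=1e-4):
    """Central finite-difference Hessian of f at the point theta."""
    n = len(theta)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                point = list(theta)
                point[i] += si * eps
                point[j] += sj * eps
                return f(point)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * eps * eps)
    return hess


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def standard_errors(loglik, theta_hat):
    """Square roots of the diagonal of the inverse observed information."""
    info = [[-value for value in row] for row in numerical_hessian(loglik, theta_hat)]
    cov = invert(info)
    return [math.sqrt(cov[i][i]) for i in range(len(theta_hat))]


# Toy example: normal sample with parameters (mean, variance) at their MLEs.
data = [1.0, 2.0, 3.0, 4.0, 5.0]

def normal_loglik(theta):
    mu, s2 = theta
    return (-0.5 * len(data) * math.log(2.0 * math.pi * s2)
            - sum((x - mu) ** 2 for x in data) / (2.0 * s2))

print(standard_errors(normal_loglik, [3.0, 2.0]))
```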

Appendix D

Calibration and stochastic prediction

In the literature on forestry, calibration means that random effects are predicted using a supplementary sample of observations taken from a sampling unit. The tree heights for a new stand can be predicted either by using random effects set to zero, or by adding random effects that were predicted from prior observations. When the diameter and height of a sub-sample of trees are known, the predicted random effects are added to the fixed effects parameters to obtain localized parameters for this sub-sample plot.

Let us assume that a sub-sample of m trees with height hj and diameter dj, j = 1, 2, …, m, is taken from a new plot. Using height-diameter models defined by Eqs 3–7, the random effect, ϕ, for a new stand can be approximately calibrated as follows: (D.1) (D.2) (D.3) (D.4) (D.5) where are the parameter estimates calculated using the approximated maximum likelihood procedure (Eq C.10). The height of another tree from the same plot can be predicted by adding the random effect calibrated by Eqs D.1–D.5 to the fixed effects parameter , respectively. The random effects height distribution models explain much more variability than the fixed effects models and provide better height-diameter model fitting. The calibrated height distribution models allow accurate results to be obtained with a very small sampling effort, making this approach highly effective and useful.

Mixed effects models incorporate the variability between plots using the expression of the model's parameters in terms of both fixed and random effects. Random effects are conceptually random variables; they can be simulated as such, in terms of utilizing their distribution. To address this, we can also add a random component to the height prediction. This stochastic prediction approach uses the distribution functions of the random variable, H(d), and their confidence intervals. The stochastic predictions, hstoch,K, K ∈ {V,G3,G4,B,G}, of a tree height can be defined as follows: (D.6) (D.7) (D.8) where , K ∈ {V,G3,G4,B,G} is the estimated trend of the mean (calculated using Eqs B.16, B.18, B.20, B.22 and B.24) of the tree height; and (, K ∈ {G3,G4,B,G}) is the inverse of the normal (the lognormal) distribution with a mean of , K ∈ {V,G3,G4,B,G} defined by Eqs B.2, B.5, B.8, B.11 and B.14, and a variance of , K ∈ {V,G3,G4,B,G} defined by Eqs B.3, B.6, B.9, B.12 and B.15, for a uniform random variable, U, in the interval (0;1).
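The inverse-distribution step of such a stochastic prediction can be sketched with the Python standard library. The example draws a uniform number U in (0;1) and maps it through the inverse normal distribution function; for the lognormal case it assumes that the supplied mean and variance are those of the logarithm of height, which is one possible reading and is not confirmed by the equations shown here. The function names and numerical values are hypothetical, and the sketch does not reproduce Eqs D.6–D.8.

```python
import math
import random
from statistics import NormalDist


def normal_quantile_height(mean, variance, u):
    """Inverse normal distribution function evaluated at u in (0, 1)."""
    return NormalDist(mean, math.sqrt(variance)).inv_cdf(u)


def lognormal_quantile_height(log_mean, log_variance, u):
    """Inverse lognormal distribution function at u in (0, 1).

    Assumes log_mean and log_variance describe log(height).
    """
    return math.exp(NormalDist(log_mean, math.sqrt(log_variance)).inv_cdf(u))


def draw_uniform(rng):
    """Uniform draw on the open interval (0, 1); random() may return 0.0."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


rng = random.Random(2016)  # fixed seed so the illustration is repeatable
u = draw_uniform(rng)
print(normal_quantile_height(20.0, 4.0, u))
print(lognormal_quantile_height(math.log(20.0), 0.01, u))
```

Evaluating the same functions at fixed probabilities such as 0.025 and 0.975 gives the bounds of a 95% interval for the height at a given diameter.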

Acknowledgments

I would like to thank the Academic Editor and anonymous reviewers for thorough commentary, feedback, and suggestions that improved the quality of this paper. The author would like to express appreciation for the support of the Lithuanian Association of Impartial Timber Scalers.

Author Contributions

  1. Conceptualization: PR.
  2. Data curation: PR.
  3. Formal analysis: PR.
  4. Funding acquisition: PR.
  5. Investigation: PR.
  6. Methodology: PR.
  7. Project administration: PR.
  8. Resources: PR.
  9. Visualization: PR.
  10. Writing – original draft: PR.
  11. Writing – review & editing: PR.

References

  1. Rennolls K, Päivinen R, San-Miguel-Ayanz J, Tomé M, Skovsgaard JP, Palahi M, et al. Harmonisation of european forest growing stocking data using a model-based conversion approach. Forest Biometry, Modelling and Information Sciences. 2009; 1: 1–34.
  2. Kempes CP, West GB, Crowell K, Girvan M. Predicting maximum tree heights and other traits from allometric scaling and resource limitations. PLoS ONE. 2011; 6(6): e20551. pmid:21695189
  3. Mønness E. The bivariate power-normal distribution and the bivariate Johnson system bounded distribution in forestry, including height curves. Can J For Res. 2015; 45(3): 307–313.
  4. Schreuder HT, Hafley WL. A useful bivariate distribution for describing stand structure of tree heights and diameters. Biometrics. 1977; 33: 471–478.
  5. Zucchini W, Schmidt M, von Gadow K. A model for the diameter-height distribution in an uneven-aged beech forest and a method to assess the fit of such models. Silva Fenn. 2001; 35(2): 169–183.
  6. Li F, Zhang L, Davis CJ. Modeling the joint distribution of tree diameters and heights by bivariate generalized beta distribution. For Sci. 2002; 48(1): 47–58.
  7. Wang M, Rennolls K, Tang S. Bivariate distribution modeling of tree diameters and heights: dependency modeling using copulas. For Sci. 2008; 54(3): 284–293.
  8. Rupšys P. The use of copulas to practical estimation of multivariate stochastic differential equation mixed effects models. AIP Conference Proceedings. 2015; 1684: 080011 8 p.
  9. Almquist J, Bendrioua L, Adiels CB, Goksör M, Hohmann S, Jirstrand M. A nonlinear mixed effects approach for modeling the cell-to-cell variability of Mig1 dynamics in yeast. PLoS ONE. 2015; 10 (4): e0124050. pmid:25893847
  10. Xu H, Sun Y, Wang X, Fu Y, Dong Y, Li Y. Nonlinear mixed-effects (NLME) diameter growth models for individual China-fir (Cunninghamia lanceolata) trees in Southeast China. PLoS ONE. 2014; 9(8): e104012. pmid:25084538
  11. Gutiérrez-Jáimez R, Gutiérrez-Sánchez R, Nafidi A, Ramos-Ábalos EM. A bivariate stochastic Gamma diffusion model: statistical inference and application to the joint modelling of the gross domestic product and CO2 emissions in Spain. Stoch Env Res Risk A. 2014; 28: 1125–1134.
  12. Petrauskas E, Bartkevičius E, Rupšys P, Memgaudas R. The use of stochastic differential equations to describe stem taper and volume. Baltic For. 2013; 19: 143–151.
  13. Román-Román P, Romero D, Torres-Ruiz F. A diffusion process to model generalized von Bertalanffy growth patterns: Fitting to real data. J Theor Biol. 2010; 263: 59–69. pmid:20018193
  14. García O. A stochastic differential equation model for the height growth of forest stands. Biometrics. 1983; 39: 1059–1072.
  15. Roitman I, Vanclay JK. Assessing size–class dynamics of a neotropical gallery forest with stationary models. Ecol Model. 2015; 297: 118–125.
  16. Uhlenbeck GE, Ornstein LS. On the theory of Brownian motion. Phys Rev. 1930; 36: 823–841.
  17. Itô K. On stochastic processes. Jap J Math. 1942; 18: 261–301.
  18. Wong E, Hajek B. Stochastic Processes in Engineering Systems. New York: Springer; 1985.
  19. Vidal C, Alberdi I, Hernández L, Redmond JJ. National Forest Inventories. Springer; 2010.
  20. Monagan MB, Geddes KO, Heal KM, Labahn G, Vorkoetter SM, Mccarron J, et al. Maple Advanced Programming Guide. Maplesoft: Printed in Canada; 2007.
  21. Akaike H. A new look at the statistical model identification. IEEE T Automat Contr. 1974; 19: 716–723.
  22. Chen CM, Rose DW. Direct and indirect estimation of height distributions in even-aged stand. Minnesota Forestry Research Notes. 1978; 267.
  23. Corradi V, Swanson NR. Predictive density evaluation. In: Elliott G, Granger CWJ, Timmermann AG, editors. Handbook of Economic Forecasting. Amsterdam: North-Holland; 2006. pp. 197–284.
  24. Kullback S. Information Theory and Statistics. New York: Dover; 1997.
  25. Cleveland WS. Robust locally weighted regression and smoothing scatter-plots. J Am Stat Assoc. 1979; 74: 829–836.
  26. Smith LB, Fuentes M, Gordon-Larsen P, Reich BJ. Quantile regression for mixed models with an application to examine blood pressure trends in China. Ann Appl Stat. 2015; 9: 1226–1246.
  27. Schumacher FX, Hall FDS. Logarithmic expression of timber tree volume. J Agric Res. 1933; 47: 719–734.
  28. Rupšys P. Stochastic mixed-effects parameters Bertalanffy process, with applications to tree crown width modeling. Math Probl Eng. 2015; Article ID 375270.
  29. Rupšys P. Generalized fixed-effects and mixed-effects parameters height–diameter models with diffusion processes. Int J Biomath. 2015; 8(5): 1550060.
  30. Schmidt M, von Gadow K. Baumhöhenschätzung mit Hilfe der bivariaten Johnson’s SBB-Funktion. Forstwissenschaftliches Centralblatt. 1999; 118: 355–367.
  31. Suzuki T. Forest transition as a stochastic process. Mitt Forstl Bundesversuchsanstalt Wien. 1971; 91:69–86.
  32. Duan A-g, Zhang J-g, Zhang X-q, He C-y. Stand diameter distribution modelling and prediction based on Richards function. PLoS ONE. 2013; 8(4): e62605. pmid:23638124
  33. Adamec Z. Comparison of linear mixed effects model and generalized model of the tree height-diameter relationship. J Forest Sci. 2015; 61: 439–447.
  34. Petrauskas E, Rupšys P. The generalised height-diameter equations of Scots pine (Pinus sylvestris L.) trees in Lithuania. Rural Development. 2013; 6(3): 407–411.
  35. Mitscherlich EA. Die zweite annäherung des wirkungsgesetzes der wachstumsfaktoren. Z Pflanzenernährung. 1928; 12: 273–282.
  36. Gompertz B. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philos T Roy Soc A. 1825; 115: 513–585.
  37. Capocelli RM, Ricciardi LM. Growth with regulation in random environment. Kybernetik. 1974; 15: 147–157. pmid:4852193
  38. Rupšys P. Height–diameter models with stochastic differential equations and mixed-effects parameters. J For Res. 2015; 20: 9–17.
  39. Joe H. Accuracy of Laplace approximation for discrete response mixed models. Comput Stat Data An. 2008; 52: 5066–5074.
  40. Shun Z, McCullagh P. Laplace approximation of high dimensional integrals. J Roy Stat Soc B. 1995; 57: 749–760.
  41. Fisher RA. On the mathematical foundations of theoretical statistics. Philos T Roy Soc A. 1922; 222: 309–368.