Nonlinear Mixed-Effects (NLME) Diameter Growth Models for Individual China-Fir (Cunninghamia lanceolata) Trees in Southeast China

An individual-tree diameter growth model was developed for Cunninghamia lanceolata in Fujian province, southeast China. Data were obtained from 72 plantation-grown China-fir trees in 24 single-species plots. Ordinary non-linear least squares regression was used to choose the best base model from among 5 theoretical growth equations; selection criteria were the smallest absolute mean residual and root mean square error and the largest adjusted coefficient of determination. To account for autocorrelation in the repeated-measures data, we developed one-level and nested two-level nonlinear mixed-effects (NLME) models, constructed on the selected base model; the NLME models incorporated random effects of the tree and plot. The best random-effects combinations for the NLME models were identified by Akaike's information criterion, Bayesian information criterion and −2 logarithm likelihood. Heteroscedasticity was reduced with two residual variance functions, a power function and an exponential function. The autocorrelation was addressed with three residual autocorrelation structures: a first-order autoregressive structure [AR(1)], a combination of first-order autoregressive and moving average structures [ARMA(1,1)] and a compound symmetry structure (CS). The one-level (tree) NLME model performed best. Independent validation data were used to test the performance of the models and to demonstrate the advantage of calibrating the NLME models.


Introduction
China-fir (Cunninghamia lanceolata (Lamb.) Hook) is the most commonly grown afforestation species in southeast China because of its fast growth and good wood qualities. It is widely used for buildings, furniture, bridge construction and many other purposes.
Growth and yield models are commonly used for forest management planning because they can simulate stand development and production under various management alternatives [1,2]. Individual-tree diameter growth models are a fundamental component of forest growth and yield prediction frameworks [3][4][5]. The models are based on extensive growth data obtained from diverse regions and management levels. Individual-tree diameter growth can be expressed as a function of tree size, competitive effect, stand structure and site quality [6]. A distance-independent individual-tree model structure may be flexible enough to predict diameter growth in monospecific even-aged stands and in mixedspecies and multi-aged stands [7].
Regression analysis, such as ordinary non-linear least squares (ONLS) regression, is the most commonly used statistical method in forest modeling [8]. Individual-tree diameter growth models have been fitted to growth increment data collected repeatedly over time on the same tree [9]. The hierarchical nature of the data results in spatial and temporal correlation among observations made in the same sampling unit (i.e., plot and tree) [10]. However, the stochastic structure is often ignored and independence of observations is assumed [11][12][13][14][15]. Furthermore, the data are autocorrelated and cannot be considered independent samples of the basic tree population [10]. The ONLS regression assumption of independent residuals is therefore violated, biasing the estimates of the standard error of the parameter estimates [16]. Many recent efforts to develop diameter growth models have used nonlinear mixed-effects (NLME) models [5,17]. NLME models include both fixed effects, which are parameters associated with an entire population or with certain repeatable levels of experimental factors, and random effects, which are associated with individual experimental units drawn at random from a population [18]. Random effects account for spatial and temporal correlation by defining the covariance structure of the model's random components and by using this structure during parameter estimation. NLME models provide an efficient statistical method for explicitly modeling hierarchical stochastic structure. Growth models can be calibrated by predicting random components from tree-or plot-level covariates when a new subject is available and is not used in the fitting of the model by using the empirical best linear unbiased predictors (EBLUPs) [3,4,12,19].
Statistical models in which both fixed and random effects enter nonlinearly are increasingly common in the biosciences [20]. The models are relevant to many disciplines, including forestry, agriculture, ecology, biology, biomedicine and pharmacokinetics [21]. They are used to analyze data with complex structures, including grouped data, longitudinal data, repeated measures data and multivariate multilevel data [22]. One of the most common applications is for analysis of nonlinear growth data [23]; these are data measured repeatedly over time on the same tree (multiple observations obtained from the same sampling unit or subject in sequence over time) and are also known as longitudinal data [12,24].
The main purpose of this study was to develop an individualtree diameter growth model for C. lanceolata (Lamb.) Hook growing in Fujian province, southeast China. The data were derived from 144 increment cores from 72 trees in 24 sample plots. One-level and nested two-level nonlinear mixed modeling approaches that included both fixed and random components were applied to the hierarchical structure of the data. This diminished the level of variance among the sampling units, which were included as covariates. In developing the diameter growth models, we considered nested two-level models and a single-level model. The first level is the plot and the second level is the tree, nested within the plot. Our preliminary analysis showed that the NLME models with random effects effectively removed the heteroscedasticity and autocorrelations in the repeated-measure data and therefore could be important tools for sustainable management of China-fir species within the study area. The predictive ability of the developed model and the applicability of the NLME model were demonstrated using separate validation data.

Data
The data were obtained from 24 single-species plots of plantation-grown China-fir on the Jiangle state-owned forest farm in southeast China ( Figure 1). One-hundred and forty-four increment cores were collected from 72 trees; 15 cores missed the pith and were excluded from analysis. The increment cores were extracted from three mean trees in each plot; the mean trees were trees with a diameter at breast height (dbh; 1.3 m above ground) approximately equal to the plot mean dbh. Two cores were collected perpendicular to each other from each tree at breast height. The sample plots were square and varied in size from 400 to 600 m 2 . All standing live trees (height .1.3 m) on the plots were measured for dbh (outside bark) and tree height. Three to five dominant trees on each plot were chosen to calculate plot dominant height. Using the Lintab tree-ring measurement system (Rinntech Company in Germany), the width of each annual growth ring (radial increment data) was measured; trees were assumed to be round, so the diameters were calculated as twice the radius. Growth data from the seed orchard at the forest farm indicate that China-fir requires two years to attain a height of 1.3 m. Diameter data were therefore assigned an initial age of 3 years. Independent data were used for model validation. The data were randomly divided into two groups; 75% of the points were used for model fitting, and 25% were used for model validation. The fitting data and the validation data included 54 trees from 23 plots and 18 trees from 13 plots, respectively. Summary statistics for both fitting and validation data are shown in Table 1.

Methods
Nested two-level NLME model. The model data were derived from the measured annual increment of the sampled trees. The nested sampling structure created a high degree of correlation among observations taken from the same tree and plot. The mixed-effects modeling approach is a common means of addressing the correlation structure in the data [13,23]. A general expression for a NLME model can be defined as [22,25]. where M is the number of plots, M i is the number of trees within the ith plot, and n ij is the number of observations (increments). DBH ijk is the dbh (cm) at the kth age of the jth tree taken from the ith plot, t ijk is the age, w ij is the parameter vector r61 (where r is the number of parameters in the model), f is a nonlinear function of the predictor variables and the parameter vector, and e ijk is the within-group error including the within-group variance and correlation [26]; the error is assumed normally distributed with a mean of zero and a positive-definite variance-covariance structure R ij , generally expressed as a function of the parameter vector d [27]. Moreover, w ij can be expressed as: where l is the p61 vector of fixed population parameters (where p is the number of fixed parameters in the model). m i and m ij are the q 1 61 and q 2 61 vectors of random effects associated with the first and second levels, respectively (where q 1 and q 2 are the numbers of random parameters of two-level in the model), which are assumed to be normal (or Gaussian) with a mean of 0 and have the variance-covariance matrices y i and y ij ; these are the q 1 6q 1 and q 2 6q 2 variance-covariance matrices associated with the first and second level random effects, respectively. A ij , B i,j and B ij are the design matrices r6p, r6q 1 and r6q 2 for the fixed and random effects specific to each sampling unit. Individual-tree diameter growth equation. Five theoretical nonlinear growth equations, the Richards, Weibull, Korf, Logistic and Schumacher equations, were selected as candidates for modeling diameter growth. These equations are widely used for the simulation of individual-tree growth, particularly the Richards and Korf equations. Mathematical expressions of the equations are shown in Table 2.
The five above-mentioned equations are all S-shaped growth equations with inflection points and asymptotes. A characteristic of the Richards, Weibull and Korf equations is that the coordinates of the inflection points are variable multiples of asymptotic values; in contrast, the equivalent values of the logistic and Schumacher equations are fixed multiples [28]. The five equations were initially fit by ONLS regression using the R nls function without random parameters. Different initial values for the parameters were tried to ensure that a global minimum was achieved. The best performing function was selected as the base model by applying three statistical criteria; absolute mean residual (AMR), root mean square error (RMSE), and adjusted coefficient of determination (R 2 adj ) [29]. The function with the smallest AMR and RMSE and the largest R 2 adj provides the best fit. The adjusted coefficient of determination is used similarly as an unbiased estimator in both multiple regression and canonical redundancy analysis. The formulas of the fit statistics are: whereŷ y ijk is the predicted increment at the kth age within the jth tree within the ith plot.
y is the average of observations. Table 1. Summary statistics for both fitting and validation data. Mixed parameter evaluation. A crucial issue in fitting mixed-effects models is deciding which parameters should be considered random effects and which can be treated as fixed effects. A common approach is to start with random effects for all parameters and then to examine the fitted object to decide which, if any, of the random effects can be eliminated from the model [18]. Different combinations of model parameters were therefore tested to ascertain their contribution to predictions of diameter growth; the best model was selected by Akaike's information criterion (AIC) [30], Bayesian information criterion (BIC) [31] and 22 log-likelihood (22 LL) [32]. The best model gave the smallest AIC, BIC and 22 LL. The appropriate variance function and autoregressive structure for the NLME models were determined by the likelihood ratio test (LRT) [18,33]. All NLME models presented in this paper were calibrated using the nlme function in the R statistical environment [34].
Determining the variance-covariance structure. The variance-covariance matrices y i and y ij are positive-definite and symmetric, which is to say that all their eigenvalues must be strictly positive [18]. A hypothetical 262 variance-covariance matrix is shown as follows [24,35]: where s 2 u and s 2 w are the variance for the random effects u and w, respectively, and s uw~swu is the covariance between random effects u and w.
Determining the structure of R ij . The matrix R ij is allowed to depend on both random and fixed effects, as well as on a set of common but unknown parameters. The matrix accounts for within-plot heteroscedasticity and autocorrelation [24,26,27] by including both correlation effects and weighting factors. The matrix is expressed as [24,36]: where for a tree j in plot i, with n ij increment, R ij is the n ij 6n ij within-tree variance-covariance matrix that defines within-group variability, G ij is an n ij 6n ij diagonal matrix of within-tree error variance (heteroscedasticity), I ij is an n ij 6n ij matrix of within-tree autocorrelation of the errors, and s 2 is a scaling factor for the error dispersion [13].
In individual-tree diameter growth models, the variance is often found dependent on the means, and the variance will generally increase with increasing mean tree diameter. To remove this effect, we modeled the variance as an exponential function or power function which was used for G ij matrix [18]. And for the exponential function and power function, the diagonal elements of G ij are t 2d ijk and 2dt ijk , respectively, and the off-diagonal elements are all 0.
var exp e ijk À Á~s 2 exp 2dt ijk À Á ð7Þ var power e ijk À Á~s 2 t 2d ijk Autocorrelation structures were used for I ij matrix to address the within-tree autocorrelations of the errors observed in the data [37,38]. A method was selected from among three commonly used approaches: first-order autoregressive structure [AR(1)], a combination of first-order autoregressive and moving average structures [ARMA(1,1)], and the compound symmetry structure (CS) [18].  Table 2. Mathematical expressions of the five equations.

Equation Expression
Inflection point Parameters

Abscissa Ordinate
Richards where r is the autoregressive parameter, c is a moving average component, s 2 is the residual variance, and s 1 is the residual covariance [37][38][39][40].
Parameter estimation. The parameters in the equations were estimated by maximum likelihood (ML) using the Lindstrom and Bates (LB) algorithm implemented in the R nlme function [18,22]. The LB algorithm and nlme function are detailed in several articles (see, for example, [18,22]).
Predicting the random effects parameters is more problematic during model application and prediction than during the fitting process. In this case, they were estimated by the EBLUPs [25], using the increment data.b whereb b i is the estimated random effects vector of EBLUPs,D D is the q6q estimated variance-covariance matrix (q is number of random-effects parameters) for the random effects,R R i is the estimated variance-covariance matrix for the error term,Ẑ Z i is the estimated partial derivatives matrix with respect to the random effects parameters for the new observation, andê e i is the residual vector, whose dimension is the number of observations, and whose components are given by the difference between the observed diameter growth value for each tree, and the value predicted by the model including only fixed effects. The standwise calibration was used to evaluate the accuracy of the calibration [12]. This type of calibration involves using the random plot components predicted from the increments of a small sample of trees per plot to predict the increment of the trees within the plot not used in the calibration. In this case, the calibration was made with 1, 2 and 3 trees per plot. The random parameters of new observations could be predicted with Equation 12.

Function selection
The R nls function was used to evaluate the parameter estimates and model fit statistics of the five equations (Table 2); the results are listed in Table 3. The Korf equation had slightly better predictive ability than the others. Therefore, the Korf equation was selected as the basic nonlinear model for estimating diameter growth. The final base model is given by:

NLME model construction
The approach used to construct the NLME models was to fit the models with nested effects of plot and tree for Equation 13 and then to successively remove the random effects. The results are listed in Table 4. Four of the NLME models reached convergence with nested effects of plot and tree; the fifth and sixth models converged when the one-level models included the random effects of plot and tree, respectively.
LRT, AIC, BIC and 22 LL fit statistics were compared among different combinations of random effects parameters ( Table 4). The models represented by Equations 14-16, incorporating the nested effects of plot and tree, plot effects and tree effects on w 1 and w 3 , yielded the smallest AIC, BIC and 22 LL.
where DBH p and t ijk and DBH p ijk , DBH t ijk are the diameters at breast height for the three effects; b 1 , b 2 and b 3 are fixed-effects parameters; u 1i and u 3i are random-effects parameters generated by plot on w 1 and w 3 , respectively; u 1j and u 3j are random-effects parameters generated by tree on w 1 and w 3 , respectively; and u 1ij and u 3ij are random-effects parameters generated by interaction of plot and tree on w 1 and w 3 , respectively.

NLME models with heteroscedasticity and autocorrelation
We used the power function or the exponential function as the variance functions and the AR(1), ARMA(1,1) or CS as the autocorrelation structures to fit diameter growth models incorporating different random effects. The results of the models provided the best fit are shown in Table 5. The selected models had the smallest AIC, BIC and 22 LL. Thus, the final models of plot effects, tree effects and the two nested effects are, respectively: Parameter estimates Nested effects of plot and tree. The nested two-level NLME diameter growth model is:   Plot effects. The NLME diameter growth model incorporating the effect of plot is: Tree effects. The NLME diameter growth model incorporating the effect of tree is:

Model prediction
The predictive ability of Equation 13 was evaluated using predict procedures and Equations 3-5 on both fitting and validation data. The performance of the NLME models, with and without modeling the error structure, was evaluated using cross-validation procedures for both fitting and validation data; the random effects were predicted with the EBLUPs (Equation 12), using the measurement data. Table 6 lists the three fit statistics for Equation 13 and Equations 20a-22a with and without random effects. Equation 20a was the best predictor, with increases in R 2 adj and decreases in AMR and RMSE for both fitting and validation data, but it was more complex than the others and incurred significant computing cost. In Figure 2, the residuals of Equations 13 and 20a-22a are plotted against the fitted values; the fitted values are plotted against the observed values in Figure 3. Based on the above analysis, we can conclude that, although Equation 20a is the strongest predictor, it is more complex than Equation 22a and the difference between them is small. Compared with Equation 13, Equation 22a had a higher R 2 adj , 0.9956 compared to 0.7758, and a lower RMSE, 0.5344 compared to 3.2810. Therefore, the NLME model incorporating the random effect of trees was the best model for predicting diameter growth of individual China-fir trees in the single-species plantations of the study area.

Discussion
Of the 5 theoretical growth equations tested, the Korf equation best fit the individual-tree diameter growth data of China-fir when evaluated on the basis of AMR, RMSE and R 2 adj . The Korf equation is widely used for forest growth and yield simulation models [41][42][43]. ONLS regression is commonly used to build forest growth models, but its value is limited because tree data typically violate the assumption of independent and identically distributed errors [13,44,45]. NLME models are a useful tool for analyzing repeated measures data and spatially correlated data [18,33]. A model can be constructed with a unique variancecovariance structure that eliminates the influence of the random effects (plot and tree effects in this study). The two primary challenges in fitting NLME models are determining the mixed parameters and calculating the random effects [9,18,24,33,46]. An additional source of inherent correlation would be the effect of year, where observations coming from the same year would be highly correlated; tree-ring width is largely related with yearly climate variables [47]. However, year effects were not analyzed in this study. Incorporating annual climate factors into the NLME models may be an appropriate area for future research.
The Korf equation has been widely used as the base NLME model for forest growth and yield prediction. For example, Cheng and Gordon [48] successfully used the Korf equation with NLME models to fit loblolly pine (Pinus taeda L.) diameter-age relationships; the one-level (tree) individual-tree NLME model, based on the Korf equation, with random effects parameters w 1 and w 3 had the best fit. Parameters w 1 , w 2 and w 3 are the asymptotic values, the values associated with the growth rate of the tree and the values associated with the curve shape (inflection point) of the Korf equation, respectively. Therefore, the random effects (tree) mainly influence the maximum value and the inflection point, with evidence that the growth rate of the tree affects the model fit.
Sometimes, no prior information is available from which the random parameters can be predicted. In this case, the mixedeffects model with the random parameters set to 0 is not the same Table 6. Performance criteria of each model. as the population average model and will give biased predictions. Instead, the population average model, fit without random effects, should be used.

Conclusions
Five theoretical growth equations were evaluated for estimating the diameter growth of China-fir trees grown in monospecific plantations in Fujian province, southeast China. The equations can be evaluated for both biological and statistical meaning. All 5 equations and the Korf equation in particular were commonly and successfully used to model individual-tree diameter growth. Onelevel (plot or tree) and nested two-level (tree nested within plot) NLME models based on the Korf equation, with variance functions and correlation structures, were used to estimate diameter growth of individual trees; this approach was necessitated by the hierarchical structure of the experimental design and the autocorrelated tree-ring data. The results showed that the onelevel (tree) NLME model (Equation 22a) with random effects was better than the others (Equations 13, 20a and 21a) (Table 5, Figure 2 and 3). Therefore, we recommend using nonlinear mixed-effects models to estimate individual-tree diameter growth.