A Mixed-Effects Model with Different Strategies for Modeling Volume in Cunninghamia lanceolata Plantations

A systematic evaluation of nonlinear mixed-effect taper models for volume prediction was performed. Of 21 taper equations with fewer than 5 parameters each, the best 4-parameter fixed-effect model according to fitting statistics was selected and then modified by comparing its parameter values with total height (H), diameter at breast height (DBH), and aboveground height (h) in the modeling data. Seven alternative prediction strategies were compared using the best new equation in the absence of calibration data, which are often unavailable in forestry practice. The results of this study suggest that because calibration may sometimes be a realistic option, though it is rarely used in practical applications, one of the best strategies for improving the accuracy of volume prediction is strategy 7, which calculates the total heights of 3, 6 and 9 trees in the largest, smallest and medium-size categories, respectively. Average trees or dominant trees should not be used to calculate the random parameter for further predictions. The method described here will allow the user to choose the best taper type and the best strategy for calculating random effects for each practical application and situation at tree level.


Introduction
The ability to describe the stem form of a forest tree is important for practical and theoretical reasons. Foresters require stem profile models for estimating the volume and the value of the whole stem or a part of it [1], to various utilization limits [2,3]. Such estimates are essential in forest planning, for example in evaluating the economics of different management regimes [4]. A theoretical aspect of interest is the relationship between stem form, competition and tree age for individual species, which calls for parameter-parsimonious models that can be used to make general statements about the effect of silviculture and site conditions on stem form [5,6].
Taper models can be classified into simple polynomial, segmented and variable-form models [7]. A comparison of these 3 types of models shows that although the simple polynomial taper models have a notably simple structure and converge easily, they do not describe the stem accurately. The segmented taper models are more complicated and have good accuracy but are also more difficult to calculate. The variable-form taper models have good structure, can accurately predict the stand volume, and are not overly complicated to calculate [8][9][10]. Therefore, in contrast to the simple polynomial and segmented taper models, the variable-form taper model is widely used.
Because the data for stem taper are hierarchical and involve repeated measurements [11], many researchers use Nonlinear Mixed-Effects (NLME) models to develop taper models. Compared with the regression method, NLME models consist of fixed- and random-effect parameters and have the advantage of enabling the modeling of the covariance matrix of correlated data. The variance-covariance matrix has two components: the random-effect component and the within-subject component. Both components can be used to model the heteroskedasticity and autocorrelation of a mixed-effect model [12][13][14].
However, previous studies have mostly considered the fitting of one variable exponent [15]. Many studies [15][16][17][18][19][20][21] used the segmented model of Max and Burkhart (1976) [22]. Furthermore, de-Miguel [4] compared the simple polynomial, segmented and variable-form taper model types using both fixed- and random-effect approaches to predict the volume, and selected Kozak II [13] as the best taper model. However, the Kozak II taper model has 9 parameters and a complicated structure. To date, therefore, no systematic comparison using fixed- and random-effect approaches has identified a taper model that has both a simple structure and good accuracy.
For the mixed-effect model, the high cost of measuring additional upper-stem diameters makes it difficult to calibrate the tree-specific taper functions in forestry practice. To solve this problem, de-Miguel [23] compared 3 different prediction methods in model evaluation and validation: (1) a fixed-effect model, (2) the fixed part of a mixed-effect model, and (3) Monte Carlo simulation based on a randomized mixed-effect model. Their results suggest that fixed-effect models should be used when the purpose of the model is prediction and calibration data are not available. Crecente-Campo generalized the NLME height-diameter model for Eucalyptus globulus L. in northwestern Spain [24]; random parameters for particular plots were estimated with different tree selections (5 options). Finally, the height-diameter relationships for individual plots were obtained by calibrating with the height measurements of the 3 smallest trees in a plot.
First, this study aimed to perform a consistent analysis of the performance of taper models with fewer than 5 parameters and to modify each of them for good accuracy. Using the best model that was found, 7 strategies were compared for volume prediction using a taper model in the absence of additional measurements for tree-specific calibration [25].

Materials
The study area is in the Jiangle State-owned Forest Farm in Fujian Province, China. The forest farm provided permission for each location. The compartments used as the study area are all experimental plantations, which Beijing Forestry University and the Jiangle State-owned Forest Farm use for forestry research. The farm produces Chinese fir timber over a large forest area, so studying stem taper there is useful for the timber trade, and we therefore chose it as the case study. No specific permissions were required for these locations or activities. The main species of the forest farm are Chinese fir, Masson pine and Moso bamboo. Many papers have been published using data collected from this farm, and no endangered or protected species occur there; we therefore confirm that the field studies did not involve endangered or protected species. The region is characterized by ferromagnesian (red) soils and has a mean annual precipitation of approximately 1699 mm, a mean annual frost-free season of 287 days, and a mean annual temperature of 18.7°C [26].
We sampled four regions, which were divided into 41 plots of Cunninghamia lanceolata trees (Qiantan, 15 plots; Shuinan, 8 plots; Yuhua, 11 plots; and Yuandang, 9 plots) and are represented by I, II, III and IV, respectively, in Fig 1. Established between 2010 and 2014, the plots vary in size from 400 to 600 m². In the plots, we measured the diameters at breast height (DBHs) over the bark (at 1.3 m above ground) of living trees (height > 1.3 m) and the total tree height of 41 trees that were felled for stem analysis. Before felling each tree, we measured two attributes: diameter at breast height (1.3 m above ground) and total tree height (H). After felling, we measured the diameter at intervals of 1 m or 2 m above breast height, depending on the total tree height. We further performed a laboratory analysis of the outer and inner bark of each disc. These diameters were measured along the largest axis and smallest axis (Table 1 and Fig 2).

Selection of candidate equations
The ranking and selection of the taper models were performed in three steps. First, 21 published variable-form taper equations with fewer than 5 parameters were fitted as candidates (Tables 2 and 3) [27]. The best function was selected by applying 5 statistical criteria: mean absolute bias (MAB), root mean square error (RMSE), adjusted coefficient of determination (R²), Akaike's information criterion (AIC), and Bayesian information criterion (BIC) [28].
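The five criteria can be computed from observed and predicted values. The Python sketch below is one possible way to do so and is not taken from the source: the function name `fit_statistics`, the n − p denominator in the adjusted R², and the Gaussian-likelihood form of AIC and BIC are assumptions for illustration, because the exact formulas used in the study are not stated here.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """Return MAB, RMSE, adjusted R2, AIC and BIC for one fitted model.

    Assumed definitions (not confirmed by the source):
      MAB    = mean of |observed - predicted|
      RMSE   = sqrt(SSE / n)
      adj_R2 = 1 - (1 - R2) * (n - 1) / (n - p)
      AIC, BIC use a Gaussian log-likelihood with variance SSE / n,
      counting the residual variance as one extra parameter.
    """
    n = len(observed)
    if n != len(predicted) or n <= n_params:
        raise ValueError("need matching lengths and more observations than parameters")

    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)

    mab = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sse / n)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)

    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sse / n) + 1.0)
    k = n_params + 1  # model parameters plus the residual variance
    aic = 2.0 * k - 2.0 * log_lik
    bic = k * math.log(n) - 2.0 * log_lik

    return {"MAB": mab, "RMSE": rmse, "adj_R2": adj_r2, "AIC": aic, "BIC": bic}
```

Lower MAB, RMSE, AIC and BIC and a higher adjusted R² indicate a better fit, so candidate equations can be ranked by calling the function once per fitted model.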
Second, we modified the best model because building an equation with fewer parameters while maintaining good accuracy in volume prediction was the main goal of this study. The model was modified by comparing the relationships among the total height (H), diameter at breast height (DBH), and height above ground level (h) against the modeling data. A taper model fitted for $d^2$ provides unbiased predictions for the cross-sectional area and volume [29][30]. Therefore, all candidate models were fitted with $d_{ki}^2$ as the response [4]:
$$d_{ki}^{2} = f(h_{ki}, D_k, H_k, q) + \varepsilon_{ki}$$
where $d_{ki}$ is the ith diameter measurement of tree k, which is measured at height $h_{ki}$, $D_k$ and $H_k$ are the DBH and total height of tree k, respectively, q is the vector of 1-5 parameters, and $\varepsilon_{ki}$ is the residual.

Volume calculation based on taper model
Based on the taper model selected above, Formula (2) was used to calculate the volume of trees:
$$V = \frac{\pi}{40000}\int_{0}^{H} d^{2}\,\mathrm{d}h \qquad (2)$$
where V is the stem volume, H is the total height, d is the diameter outside the bark at height h (cm), and h is the height above ground level.
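The volume integral, in which the factor π/40000 converts a squared diameter in centimetres into a cross-sectional area in square metres, rarely has a convenient closed form for variable-form taper models and can be approximated numerically. The sketch below uses a midpoint rule; the function name `stem_volume`, the number of sections, and the example stem shape are assumptions for illustration only.

```python
import math


def stem_volume(diameter_cm, total_height_m, n_sections=1000):
    """Approximate V = (pi / 40000) * integral from 0 to H of d(h)^2 dh.

    diameter_cm    : function returning the diameter (cm) at height h (m)
    total_height_m : total tree height H (m)
    Returns the volume in cubic metres, using the midpoint rule.
    """
    step = total_height_m / n_sections
    total = 0.0
    for i in range(n_sections):
        h_mid = (i + 0.5) * step  # midpoint of the i-th stem section
        total += diameter_cm(h_mid) ** 2 * step
    return math.pi / 40000.0 * total


# Hypothetical example: a perfect cone, 20 cm wide at the base and 10 m tall.
cone_volume = stem_volume(lambda h: 20.0 * (1.0 - h / 10.0), 10.0)
```

For the cone in the example the exact volume is π/30 m³, which gives a quick check on the integration before a fitted taper function is passed in.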

Testing different prediction strategies using different random parameters
For the best taper model, the calibrated response was evaluated for different height sampling designs and sample sizes within all data, in order to calculate the tree random parameter used to estimate diameters at different heights. The total heights of different numbers of trees, from 1 to 10, were used to calculate the random parameter in model calibration, and the remaining trees were used for validation, under 7 strategies. The 7 selected alternatives are as follows:
1. calculating a fixed-parameter model (with no random-effect parameter).
2. calculating the fixed part of a mixed-effect model (random-effect parameter is 0).
3. calculating the heights of the randomly selected trees (total heights of 1-10 randomly selected trees to calculate the parameters).
4. calculating the heights of the largest selected trees (total heights of 1-10 largest trees to calculate the parameters).
5. calculating the heights of the smallest selected trees (total heights of 1-10 smallest trees to calculate the parameters).
6. calculating the heights of the medium-size selected trees (total heights of 1-10 medium-size trees to calculate the parameters).
7. calculating the heights of a mix of selected trees (total heights of 3, 6 and 9 trees in the largest, smallest and medium-size categories) [4,22,24].
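The tree-selection step behind strategies 3 to 7 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: tree size is ranked by DBH, "medium-size" means the trees around the middle of that ranking, and the mixed strategy takes equal numbers of trees from the three size classes (so 3, 6 or 9 trees in total). The function name `select_calibration_trees` and the strategy labels are hypothetical.

```python
import random


def select_calibration_trees(dbh_by_tree, strategy, n, seed=0):
    """Return the ids of n trees chosen for calibration.

    dbh_by_tree : dict mapping a tree id to its DBH (cm)
    strategy    : "random", "largest", "smallest", "medium" or "mixed"
    """
    ranked = sorted(dbh_by_tree, key=dbh_by_tree.get)  # smallest DBH first
    if not 1 <= n <= len(ranked):
        raise ValueError("n must be between 1 and the number of trees")

    def middle(count):
        # Trees centred on the middle of the size ranking.
        start = (len(ranked) - count) // 2
        return ranked[start:start + count]

    if strategy == "random":
        return random.Random(seed).sample(ranked, n)
    if strategy == "largest":
        return ranked[-n:]
    if strategy == "smallest":
        return ranked[:n]
    if strategy == "medium":
        return middle(n)
    if strategy == "mixed":
        if n % 3 != 0:
            raise ValueError("mixed selection assumes n is a multiple of 3")
        per_class = n // 3
        return ranked[:per_class] + middle(per_class) + ranked[-per_class:]
    raise ValueError(f"unknown strategy: {strategy}")
```

Trees not returned by the function would form the validation set. Strategies 1 and 2 need no selection because they use no tree-specific random parameter.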

Results
Analyzing the candidate taper model to select the best model
The performance of 5 stem taper functions shows that the functions share the form $((H-h)/(H-1.3))^{b_0}$, except for Models (16) and (21), which do not have that structure for Cunninghamia lanceolata. Model (1) has the best accuracy (Table 4) and is therefore the best candidate model. The comparison of parameter b0 with H, D and h shows that b0 significantly correlates with h, correlates with D and exhibits only a normal relationship with H. The above b0 can be rewritten as Eq (3):
$$b_0 = b_1 + b_2 h^{0.007} \qquad (3)$$
where b0, b1 and b2 are parameters and h is the height above ground level. From the results in Fig 3, we deduce that including H and D in b0 is not suitable for increasing the accuracy of the model.
We found that the new taper model with two parameters has a smaller residual than Eq (1) with 4 parameters (Table 5). Because including too many parameters hinders the model's convergence, the new taper model is the simplest model that is suitable for volume prediction.

A mixed-effect taper model based on the new taper model
Fitting with the NLME function [48] of R using ML was successful with one to three tree-specific random parameters and with a higher number of parameters in some cases. However, we restricted the analysis to the models with one or two random parameters. The model did not converge when the random parameter β was applied to (b1, b2) or to (b2); it converged only when β was applied to (b1). Thus, the best model according to the likelihood ratio tests was Formula (4):
$$d^{2} = D^{2}\left(\frac{H-h}{H-1.3}\right)^{(b_1+\beta)+b_2 h^{0.007}} \qquad (4)$$
where the fixed parameter is b1 [44], β is the random parameter, H is the total height, d is the diameter outside the bark at height h (cm), h is the height above ground level, and D is the diameter at breast height. The residual variance was assumed to follow Model (5), where σ² is the within-tree residual variance, $D_i$ is the diameter at breast height of the ith tree, and δ1 and δ2 are variance-covariance parameters for random effects.
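The role of the random parameter β can be illustrated with a short Python sketch of the modified model in the form given in the Discussion, d² = D² ((H − h)/(H − 1.3))^((b1 + β) + b2 h^0.007). The function name `predict_diameter` and the parameter values in the example are placeholders chosen for illustration; they are not the estimates obtained in this study.

```python
import math


def predict_diameter(h, dbh, total_height, b1, b2, beta=0.0):
    """Predict the diameter (cm) at height h (m) from the modified taper model.

    beta is the tree-specific random parameter added to b1; beta = 0
    corresponds to using only the fixed part of the mixed-effect model.
    """
    if not 0.0 <= h <= total_height:
        raise ValueError("h must lie between 0 and the total height")
    relative_height = (total_height - h) / (total_height - 1.3)
    exponent = (b1 + beta) + b2 * h ** 0.007
    return dbh * math.sqrt(relative_height ** exponent)


# Placeholder parameters (hypothetical, not fitted values).
B1, B2 = 1.0, 0.5
d_fixed = predict_diameter(5.0, dbh=20.0, total_height=15.0, b1=B1, b2=B2)
d_calibrated = predict_diameter(5.0, dbh=20.0, total_height=15.0, b1=B1, b2=B2, beta=0.1)
```

By construction the model returns the DBH at h = 1.3 m and, for a positive exponent, zero at the tree tip, so β only changes how quickly the stem narrows between those two points.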

Evaluation of the prediction strategies based on the best model
When the random parameters estimated for all model-development trees were used to predict the volume, the MAB was 0.0108, the R² was 0.9981, and the RMSE was 0.0119. Strategy 7, which calculated the total heights of 3, 6 and 9 trees in the largest, smallest and medium-size categories, obtained the smallest MAB (0.0119), an adjusted coefficient of determination R² of 0.9900, and the smallest RMSE (0.0185). Strategies 1, 2 and 5 produced similar values for MAB, RMSE, and R². Strategy 2, which calculated the fixed part of a mixed-effect model, performed better than strategy 1, which calculated a fixed-parameter model, by 0.0006 m³ in MAB, 0.0003 in R², and 0.0003 m³ in RMSE, and better than strategy 5, which calculated the total heights of the 1-10 smallest trees, by 0.0001 m³ in MAB, 0.0002 in R², and 0.0002 m³ in RMSE. Strategy 3, which calculated the total heights of 1-10 randomly selected trees, and strategy 4, which calculated the total heights of the 1-10 largest trees, had low accuracy. The worst strategy is strategy 6, which calculated the total heights of 1-10 medium-size trees; its R² is only 0.9480, which is 0.0420 lower than the best R² (0.9900).

Discussion
This paper provides a thorough generalization of many published taper equations with fewer than 5 parameters. The accuracy of the models in an independent data set shows that the sample size and design were sufficient [50][51]. In general, taper models with more parameters produce a better fit than those with fewer parameters. However, we found that the models with fewer parameters performed better than certain models with more parameters; for example, the equation $d^{2} = D^{2}\left(\frac{H-h}{H-1.3}\right)^{(b_1+\beta)+b_2 h^{0.007}}$ with two parameters performs better than Eq 1 with four parameters or the equations with five parameters (Table 6). We also know that including too many parameters in nonlinear mixed-effect models hinders convergence. Thus, under some conditions, the modified model in this paper may be better than other models with more parameters for Cunninghamia lanceolata in Fujian Province, China, or for other trees worldwide.
In the taper Model (1), a larger height corresponds to a smaller value of parameter b0. Thus, we chose h as the only variable in b0 to modify the taper, and the validation process demonstrates that this method is accurate.
Several strategies were compared in this study using the model that was considered the best based on ease of convergence and a small number of parameters. The seven prediction strategies can readily be used in forestry practice. Strategy 7, which calculates the total heights of 3, 6 and 9 trees in the largest, smallest and medium-size categories, respectively, has the best accuracy (Fig 4), which suggests that the largest and smallest trees show substantial differences in stem form. The 3, 6 and 9 trees drawn from the large, small and medium-size categories form a nearly normal distribution. Thus, computing the random-effect parameters from the largest, smallest, and medium-size trees clearly improves the predictive accuracy. The calibrated taper model gives accurate results with a notably small sampling effort, which makes this method extremely effective and useful (Table 7). Strategy 1, which calculates a fixed-parameter model, and strategy 2, which calculates the fixed part of a mixed-effect model, have good accuracy, close to that obtained with the random parameters estimated for all model-development trees. In other words, when the purpose of the model is prediction and calibration data are not available, strategies 1 and 2 should be used based on the best taper model that was modified in this paper. These results are similar to those of de-Miguel [4,23].
Strategies 6 and 4 calculate the heights of medium-size selected trees (total heights of 1-10 medium-size trees to calculate the parameters) and the largest selected trees (total heights of 1-10 largest trees to calculate the parameters), respectively. Based on the bias results (Fig 5), we find that strategies 6 and 4 were the poorest approaches: strategy 4, using the largest trees, has a larger bias than the other strategies, and strategy 6, using medium-size trees, has a smaller bias than the other strategies. In forest practice, the sample trees are usually average trees, and the medium-size trees are always the average trees in a sample plot (medium-size trees serve as analytic trees with the average diameter at breast height). Similarly, the largest trees are always the dominant trees in a sample plot, in which they have the largest DBH or total height. Thus, when we use the NLME to predict the volume in a forest stand, we cannot use the average trees or dominant trees to calculate the random parameter to estimate the stand volume; those approaches would produce the lowest accuracy.

Conclusion
The taper model developed in this paper is the best taper model for describing stands of Cunninghamia lanceolata. It has the advantage of easy convergence and simple structure, is a useful tool for predicting the volume of Cunninghamia lanceolata and may also be useful for analyzing the taper of other trees worldwide.
In forest practice, when we use the NLME to estimate the stand volume, we cannot use the average trees or dominant trees to calculate the random parameter as the stand random parameter. We should sample some small trees in a mixed approach (strategy 7) to obtain good accuracy.
The results of this study show that when the purpose of the taper model is prediction and calibration data are not available, fixed-effect (with no random parameter) or mixed-effect (random parameter is 0) models should be used. However, because calibration may sometimes be performed for some but not all types of wood, strategy 7 is one of the best strategies to improve the volume prediction accuracy at tree level. This strategy helps the user make the best selection in random-effects calculation for practical applications and scenarios.

Figure caption: Residuals for the calibrated model with different tree sampling designs and sampling sizes to calculate the random parameters. Note: All: calculate all trees; no: with no random parameter; zero: random parameter is 0; large: largest trees; medium: medium-size trees; small: smallest trees; mixed: a mix of large, medium and small trees; random: randomly selected trees.