Tree Biomass Estimation of Chinese fir (Cunninghamia lanceolata) Based on Bayesian Method

Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) is the most important conifer species for timber production in southern China, where it is distributed over a very large area. Accurate biomass estimates are required for accounting and monitoring of Chinese forest carbon stocks. In this study, an allometric equation was used to analyze the tree biomass of Chinese fir. Allometric models are commonly fitted with the classical approach, which rests on the frequency interpretation of probability. However, many biotic and abiotic factors introduce variability into the Chinese fir biomass model, suggesting that the model parameters are better represented by probability distributions than by fixed values, as in the classical method. To address this problem, a Bayesian method was used to estimate the Chinese fir biomass model. Two kinds of priors were introduced in the Bayesian framework: non-informative priors and informative priors. For the informative priors, 32 biomass equations of Chinese fir were collected from the published literature, and the parameter distributions derived from them were used as prior distributions in the Bayesian model. The Bayesian method with informative priors performed better than both the Bayesian method with non-informative priors and the classical method, and thus provides a reasonable approach for estimating Chinese fir biomass.


Introduction
Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), a fast-growing evergreen conifer, is one of the most important tree species for timber production in southern China. As an important native tree, it has been widely planted for more than 1000 years [1]. It produces timber of excellent quality, with a straight stem, high resistance to bending and cracking, and ease of processing. Because of its high commercial value, the planting area of Chinese fir in China is around 9.215 million ha, accounting for 28.54% of all forested land [2]. This conifer is therefore expected to be highly profitable for biomass production.
The estimation of tree biomass is needed both for sustainable planning of forest resources and for studies of energy and nutrient flows in ecosystems. Hall [3] reviewed the potential role of biomass as an energy source in the 21st century. In addition, the United Nations Framework Convention on Climate Change, and in particular the Kyoto Protocol, recognizes the importance of the forest carbon sink and the need to monitor, preserve and enhance terrestrial carbon stocks, since changes in the forest carbon stock influence the atmospheric CO2 concentration [4]. Biomass equations can improve the reliability of forest carbon stock estimates and the understanding of ecosystem carbon dynamics [5], [6]. Because collecting biomass data is costly and time-consuming, equations that estimate biomass accurately are needed for accounting and monitoring of carbon stocks. Such equations can be applied directly to tree-level inventory data (diameter, height).
Many models have been developed to estimate the tree biomass of Chinese fir, in particular the allometric equations $W = aD^b$ and $W = a(D^2H)^b$ (W: tree biomass, D: diameter at breast height, H: tree height). The variable $D^2H$ is commonly used in biomass equations and gives good estimates. Lin et al. [7] fitted both equations to Chinese fir and found that the second, with $D^2H$, performed better than the first. Tian et al. [8] estimated the stem, branch, root and foliage biomass of a second-generation Chinese fir plantation with $W = a(D^2H)^b$, and the correlation coefficients of the models exceeded 0.97. However, because tree allometry is influenced by both environmental and competitive factors [9], [10], temporal changes in these conditions are likely to affect biomass estimates. A major limitation of these equations is that they can produce very different results when applied to stands other than those for which they were originally developed [11].
Bayesian inference is an alternative method of statistical inference that is frequently used to evaluate ecological models [12], [13]. Although Bayesian and classical approaches have been debated at the philosophical level in many scientific fields [14], [15], the Bayesian method has been shown to have distinct advantages in two main respects. First, Bayesian methods are fully consistent with mathematical logic, whereas classical methods are logical only when making probabilistic statements about long-run averages obtained from hypothetical replicates of sample data, not about hypotheses [16], [17]. Second, relevant prior knowledge can be incorporated naturally into Bayesian analyses, whereas classical methods ignore prior knowledge other than the sample data [15]. In forestry, Bayesian methods have been applied to diameter distributions [18], [19], tree growth [20], individual tree mortality [21], [22], tree foliar dry matter [23], stand-level height and volume growth models [24], [25], and stand basal distribution [26]. Zapata-Cuartas et al. [27] used a Bayesian method to estimate aboveground tree biomass and obtained reliable results.
The objective of this study was to estimate the stem, branch, foliage, and root biomass of Chinese fir using $W = a(D^2H)^b$ within a Bayesian framework. In addition, the Bayesian method was compared with the classical method for biomass estimation of Chinese fir.

Study Site
The plantations studied were at Weimin farm, Shaowu city (27.08°N, 117.72°E), in Fujian Province, southern China (Fig. 1), which has a subtropical maritime monsoon climate. Mean annual precipitation is 1768 mm. Mean annual temperature is 17.7°C, and monthly mean temperature ranges from 6.8°C in January to 28°C in July. The soil is red and rich in humus. The plantations were established and authorized by the Research Institute of Forestry, Chinese Academy of Forestry. The field studies did not involve endangered or protected species.

Biomass
Three stands of 7-, 16- and 28-year-old Chinese fir were selected for the investigation. Each plot covered an area of 20 m × 30 m and was surrounded by a buffer zone of similarly treated trees. Tree diameter and height were measured in all plots once tree height had reached 1.3 m (Table 1). The trees were distributed in diameter classes of 6, 8, 10, …, 28. The diameter classes ranged from 6 to 16 in the 7-year-old stand, from 6 to 22 in the 16-year-old stand, and from 8 to 28 in the 28-year-old stand. One or two trees in each diameter class were destructively sampled, giving a total of 39 sample trees. After each tree was felled, the fresh weights of stem wood, branches, foliage and roots were measured, and subsamples were selected and weighed on a portable digital balance in the field. In the laboratory, each subsample was oven-dried to constant weight at 105°C to determine the proportion of dry biomass in each component. The biomass of each component was then computed from the ratio of dry weight to fresh weight. Summary statistics of the total biomass of the 39 trees are shown in Table 1.

Biomass Model
In this study, we modeled tree biomass W (in kg) as a function of height H (in m) and diameter D (in cm) with the allometric equation $W = a(D^2H)^b$. It is convenient to take logarithms to fit the model and to deal with heteroscedasticity [28]:
$$\ln W = \alpha + b \ln(D^2 H) + \varepsilon$$
where $\alpha = \ln a$ and $b$ are the parameters of the model, and $\varepsilon$ is the error term, which is normally distributed with mean zero and variance $\sigma^2$.
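The log-transformed model can be fitted by ordinary least squares on the transformed variables. The following is a minimal Python sketch of that classical fit, not the authors' code. The tree measurements are synthetic, and the height-diameter relation, the "true" parameter values and the function name `fit_log_allometry` are assumptions made only for illustration.

```python
import numpy as np

def fit_log_allometry(D, H, W):
    """Least-squares fit of ln W = alpha + b * ln(D^2 H).

    Returns (alpha_hat, b_hat, sigma2_hat), where sigma2_hat is the
    residual variance on the log scale.
    """
    x = np.log(np.asarray(D, dtype=float) ** 2 * np.asarray(H, dtype=float))
    y = np.log(np.asarray(W, dtype=float))
    X = np.column_stack([np.ones_like(x), x])  # intercept column + predictor
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - 2)  # two fitted parameters
    return coef[0], coef[1], sigma2

# Synthetic illustration only: 39 hypothetical trees, not the study data.
rng = np.random.default_rng(0)
D = rng.uniform(6.0, 28.0, size=39)   # diameter at breast height (cm)
H = 1.3 + 0.7 * D                     # hypothetical height-diameter relation (m)
true_alpha, true_b = -3.0, 0.9        # assumed values used to generate the data
W = np.exp(true_alpha + true_b * np.log(D ** 2 * H) + rng.normal(0.0, 0.1, size=39))

alpha_hat, b_hat, sigma2_hat = fit_log_allometry(D, H, W)
print(alpha_hat, b_hat, sigma2_hat)
```

On the original scale, the multiplicative coefficient is recovered as $a = e^{\alpha}$.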

Bayes Rule
Let $y = (y_1, y_2, y_3, \ldots)$ represent a vector of data and $\theta = (\theta_1, \theta_2, \theta_3, \ldots)$ be a vector of parameters to be estimated. Bayes' rule is then expressed as:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}$$
where p represents a probability distribution or density function. In the classical approach, values for $\theta$ can be obtained by minimum least squares (MLS) or maximum likelihood estimation (MLE).
The Bayesian framework uses probability distributions to describe uncertainty in the parameters being estimated. In light of the observed data, $\theta$ has a probability distribution given by:
$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)$$
The conditional distribution of $\theta$ given the data y, $p(\theta \mid y)$, is the quantity we are interested in estimating and is referred to as the posterior probability distribution (or simply the posterior) in the Bayesian framework. $p(y \mid \theta)$ gives the distribution of y assuming $\theta$ is known, and is a likelihood function when regarded as a function of the parameters [14]. $p(\theta)$ is called the prior probability distribution for the parameters (or simply the prior), and reflects the information available about the hypothesis. The important characteristic of the Bayesian method is that the parameters are treated as random variables [15], [25]. This assumption is very different from that of the classical method, which treats parameters as true, fixed (if unknown) quantities [14], [29].
In this study, the allometric relationship between tree biomass W and its diameter D and height H is given by the statistical model:
$$\ln W \sim N\big(g(D, H; \alpha, b),\ \sigma^2\big)$$
where $g(D, H; \alpha, b) = \alpha + b \ln(D^2 H)$, with $\alpha = \ln a$, is the log of the allometric formula and gives the mean of the distribution of log-biomass. Eqn (2) can therefore be written as:
$$p(\alpha, b \mid \text{data}) \propto p(\text{data} \mid \alpha, b)\, p(\alpha, b)$$
where the data consist of triples (ln(W), D, H) measured from trees. In the current study, $p(\text{data} \mid \alpha, b)$ is the likelihood implied by Eqn (6):
$$p(\text{data} \mid \alpha, b) = \prod_{i=1}^{n} N\big(\ln W_i \mid g(D_i, H_i; \alpha, b),\ \sigma^2\big)$$

Prior Distribution Specification
The choice of prior distribution is critical for the Bayesian method [30]. In the above model specification, appropriate prior distributions must be chosen for all parameters, including $\alpha$ and $b$. Many researchers use non-informative normal (Gaussian) priors that reflect prior 'ignorance' and do not strongly influence the parameter estimates. Such priors typically take the form of a parametric distribution with large or infinite variance. Alternatively, if prior information is available from external knowledge (for example, parameters reported in the literature), it can be used to construct a prior distribution. In this study, two prior specifications were used in the Bayesian framework: a non-informative prior and an informative prior. For the non-informative prior, Gaussian priors were placed on both parameters: $\alpha \sim N(0, 1000)$ and $b \sim N(0, 1000)$. For the informative prior, we assumed that $\alpha$ and $b$ follow a bivariate normal distribution $N(\mu, \Sigma)$, where $\mu = (\mu_\alpha, \mu_b)$ is the vector of means and $\Sigma$ is the covariance matrix. The values of $\mu$ and $\Sigma$ were specified from the reported literature: 32 biomass equations of Chinese fir (Table S1 in file S1) were collected from published references (Appendix S2 in file S1). The database covers a wide geographical range in southern China (Fig. 1).
In addition, the errors were assumed to be normally distributed, $\varepsilon \sim N(0, \sigma^2)$, and, as recommended by Hadfield [31], the scalar parameters of the inverse-Wishart prior on $\sigma^2$ were set to V = 1 and ν = 0.001.
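One simple way to turn a set of published parameter estimates into a bivariate normal prior is to take their sample mean vector and sample covariance matrix. The Python sketch below illustrates that idea under stated assumptions: the five parameter pairs are hypothetical placeholders rather than values from Table S1, and the source does not say whether the authors used exactly this moment-based construction.

```python
import numpy as np

# Hypothetical (intercept, slope) pairs standing in for published log-form
# equations. These are NOT the values from Table S1.
literature_params = np.array([
    [-3.60, 0.95],
    [-3.20, 0.88],
    [-2.90, 0.87],
    [-3.45, 0.91],
    [-3.05, 0.89],
])

# Moment estimates of the prior N(mu, Sigma).
mu = literature_params.mean(axis=0)              # vector of means
Sigma = np.cov(literature_params, rowvar=False)  # 2 x 2 sample covariance

# Correlation between the two parameters implied by Sigma.
corr = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
print(mu, Sigma, corr)
```

With these placeholder values the intercept and slope are negatively correlated, which is the pattern commonly seen in collections of allometric equations.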
The Bayesian method was implemented with the R package MCMCglmm [31] to fit the linear Gaussian model. MCMCglmm uses Gibbs sampling [32] to update the parameters. We ran 25 000 iterations to ensure convergence and satisfactory posterior distributions of the estimated parameters. To reduce the correlation between neighbouring iterations, the thinning interval was set to 3 in all models.
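To show what a Gibbs sampler does for this kind of linear Gaussian model, here is a minimal Python sketch. It is not MCMCglmm and is not the authors' implementation. It assumes a normal prior on the regression coefficients, reads the 1000 in N(0, 1000) as a variance, and uses an inverse-gamma prior on the residual variance with shape and scale ν/2 = 0.0005, taken here as the scalar counterpart of an inverse-Wishart prior with V = 1 and ν = 0.001. The burn-in length, the synthetic data and the function name are likewise assumptions.

```python
import numpy as np

def gibbs_linear_model(X, y, mu0, Sigma0, shape0=0.0005, scale0=0.0005,
                       n_iter=25000, burn_in=5000, thin=3, seed=1):
    """Gibbs sampler for y = X @ beta + e, with e ~ N(0, sigma2).

    Priors: beta ~ N(mu0, Sigma0), sigma2 ~ Inverse-Gamma(shape0, scale0).
    Returns the retained draws as rows of (beta..., sigma2).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    prior_prec = np.linalg.inv(Sigma0)
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = 1.0  # arbitrary starting value
    draws = []
    for it in range(n_iter):
        # Step 1: beta | sigma2, y is multivariate normal.
        cov = np.linalg.inv(prior_prec + XtX / sigma2)
        cov = (cov + cov.T) / 2.0  # guard against round-off asymmetry
        mean = cov @ (prior_prec @ mu0 + Xty / sigma2)
        beta = rng.multivariate_normal(mean, cov)
        # Step 2: sigma2 | beta, y is inverse-gamma.
        resid = y - X @ beta
        shape = shape0 + n / 2.0
        scale = scale0 + 0.5 * resid @ resid
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / scale)
        # Keep every `thin`-th draw after the burn-in period.
        if it >= burn_in and (it - burn_in) % thin == 0:
            draws.append(np.append(beta, sigma2))
    return np.array(draws)

# Synthetic data for illustration only (39 hypothetical trees).
rng = np.random.default_rng(0)
D = rng.uniform(6.0, 28.0, size=39)
H = 1.3 + 0.7 * D
x = np.log(D ** 2 * H)
y = -3.0 + 0.9 * x + rng.normal(0.0, 0.1, size=39)
X = np.column_stack([np.ones_like(x), x])

# Non-informative prior: independent N(0, 1000) on intercept and slope.
draws = gibbs_linear_model(X, y, mu0=np.zeros(2), Sigma0=1000.0 * np.eye(2))
alpha_mean, b_mean, sigma2_mean = draws.mean(axis=0)
lower, upper = np.percentile(draws[:, :2], [2.5, 97.5], axis=0)  # 95% credible bounds
print(alpha_mean, b_mean, sigma2_mean, lower, upper)
```

An informative prior is used in the same way by passing a literature-based mean vector and covariance matrix as `mu0` and `Sigma0`.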

Model Evaluation
The Bayesian method was evaluated against the classical method (MLS) using the criteria MD, mean absolute deviation (MAD), and RMSE. Smaller values of the criteria indicate a better model.
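The results report three criteria: MD, MAD and RMSE. The Python sketch below computes them using their common textbook definitions, which are an assumption here because the exact formulas are not available in this text. MD is taken as the mean of observed minus predicted values, MAD as the mean absolute difference, and RMSE as the square root of the mean squared difference with divisor n. The example numbers are invented.

```python
import numpy as np

def evaluation_criteria(observed, predicted):
    """Return MD, MAD and RMSE under the assumed textbook definitions."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = observed - predicted
    return {
        "MD": err.mean(),                   # mean deviation (signed bias)
        "MAD": np.abs(err).mean(),          # mean absolute deviation
        "RMSE": np.sqrt((err ** 2).mean())  # root mean square error
    }

# Invented observed and predicted biomass values (kg), for illustration only.
scores = evaluation_criteria([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print(scores)
```

Because MD is signed, it is its absolute value that should be small for an unbiased model.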

Results
A total of 32 biomass equations of Chinese fir in logarithmic form were compiled from the reported literature to construct the informative prior distributions. In each component biomass model, the parameters α and b followed a bivariate normal distribution (Table 2), and the two parameters were negatively correlated with each other. With the Bayesian method, the posterior probability distributions of the two parameters were obtained for each component biomass model. The posterior distributions obtained with informative and non-informative priors were very similar (Fig. 2). The estimates of α and b from the Bayesian method with non-informative priors and from the MLS method were numerically identical in each component biomass model. The intervals of the parameter estimates from these two methods also had similar ranges, and both were wider than the intervals from the Bayesian method with informative priors (Table 3).
Evaluation statistics of the Bayesian and MLS methods for the biomass models are shown in Table 4. In the stem biomass model, the MD, MAD, and RMSE of the Bayesian method with informative priors were the smallest among the three methods. The same was true of the branch, foliage, root, and total biomass models. RMSE is a widely accepted criterion for evaluating model performance. Across the five models, the Bayesian method with informative priors lowered RMSE by 0.32% to 2.77% relative to the MLS method. For every component biomass model and for the total biomass model, the Bayesian method with informative priors was better than both the Bayesian method with non-informative priors and the MLS method. The Bayesian method with non-informative priors was slightly better than the MLS method in the stem, branch and foliage biomass models, and slightly worse in the root and total biomass models.
In this study, models were developed for each biomass component and for total biomass. With the Bayesian method with informative priors, there was little bias between the total biomass estimates and the sum of the component biomass estimates (Fig. 3).

Discussion
Chinese fir was one of the most important tree species for the biomass carbon pool in China from the 1980s to the 2000s. The total biomass stock of Chinese fir increased continuously during the last three decades. The relative contribution of Chinese fir to the Chinese forest biomass stock increased from 2.48% in 1977-1981  [34], [35], [36], especially for two equations: $W = aD^b$ and $W = a(D^2H)^b$. Previous studies have estimated Chinese fir biomass using these two equations with the classical method [37], [38]. Although the two equations often give the impression of close relationships, good performance and high values of $R^2$, they can still fail to give accurate estimates of stand biomass when applied to stands beyond the data range and site conditions for which they were fitted [39]. Many biotic and abiotic factors introduce variability into the tree biomass model, suggesting that the parameters of allometric equations are better represented by probability distributions (Fig. 2) than by fixed values, as in the classical method. The widespread use of general Chinese fir biomass equations at the biome level therefore obscures important differences among stands.
In this study, we have presented a Bayesian solution to tree biomass modeling of Chinese fir. The Bayesian method is an important statistical tool that is increasingly used by ecologists. Bayes' rule provides an alternative way of estimating parameters and of expressing the degree of confidence or uncertainty in these estimates. The Bayesian method accommodates as much or as little data or prior knowledge as is available and provides a direct measure of the probability of one or more hypotheses of interest [15]. Zapata-Cuartas et al. [27] found that the model efficiency (RMSE) of the Bayesian method was almost identical to that of the classical method when the sample size was larger than 60, and better when the sample size was smaller than 60. Here, the sample size was 39, and the model efficiency was better than that of the classical method.
Bayesian credible intervals and classical confidence intervals are usually numerically identical if the Bayesian prior is non-informative [40]. This was observed in the present study (Table 3). A non-informative prior is one under which the data (through the likelihood, $p(y \mid \theta)$ in Bayes' rule) dominate the posterior, and the prior probabilities of all reasonable parameter values are approximately equal. The posterior distribution therefore has the same form as the likelihood. Because the posterior distribution under the non-informative prior was less precise, its credible intervals were wider and its predictions were worse than those under the informative prior (Table 4).
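The relationship between prior and interval width can be illustrated with the closed-form posterior of a linear Gaussian model whose residual variance is treated as known. That is a simplification adopted only for this Python sketch, since the study estimated the variance as well. The data are synthetic and the informative prior values are hypothetical.

```python
import numpy as np

def normal_posterior(X, y, mu0, Sigma0, sigma2):
    """Posterior mean and covariance of beta for y = X @ beta + e,
    e ~ N(0, sigma2) with sigma2 known, under the prior beta ~ N(mu0, Sigma0)."""
    prec0 = np.linalg.inv(Sigma0)
    cov = np.linalg.inv(prec0 + X.T @ X / sigma2)
    mean = cov @ (prec0 @ mu0 + X.T @ y / sigma2)
    return mean, cov

# Synthetic data for illustration only (39 hypothetical trees).
rng = np.random.default_rng(0)
D = rng.uniform(6.0, 28.0, size=39)
H = 1.3 + 0.7 * D
x = np.log(D ** 2 * H)
sigma2 = 0.1 ** 2  # residual variance, treated as known in this sketch
y = -3.0 + 0.9 * x + rng.normal(0.0, 0.1, size=39)
X = np.column_stack([np.ones_like(x), x])

# Classical least-squares estimate and its standard errors.
ols = np.linalg.lstsq(X, y, rcond=None)[0]
ols_sd = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Non-informative prior: N(0, 1000) on each coefficient.
vague_mean, vague_cov = normal_posterior(X, y, np.zeros(2), 1000.0 * np.eye(2), sigma2)
vague_sd = np.sqrt(np.diag(vague_cov))

# Hypothetical informative prior with negatively correlated parameters.
mu_inf = np.array([-3.1, 0.91])
Sigma_inf = np.array([[0.08, -0.009], [-0.009, 0.0012]])
inf_mean, inf_cov = normal_posterior(X, y, mu_inf, Sigma_inf, sigma2)
inf_sd = np.sqrt(np.diag(inf_cov))

print(ols, vague_mean, inf_mean)
print(ols_sd, vague_sd, inf_sd)
```

Under the vague prior the posterior mean and standard deviations essentially reproduce the least-squares results, while the informative prior adds precision and so narrows the posterior.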
Although Bayesian methods have been adopted in several forestry applications, their use in biomass estimation is still relatively uncommon. In this study, a comprehensive sample of allometric equations for Chinese fir biomass gathered from the published literature showed that the parameters of the allometric model can be well described by a bivariate normal distribution. This bivariate normal distribution (Table 2) was used as the prior distribution in the Bayesian model for estimating tree biomass. The ability to update a model with priors is an advantage of the Bayesian method: not only are the data considered to be samples from a random variable, but the parameters to be estimated are also regarded as random variables [15]. In this study, the Bayesian method with informative priors performed better than the classical method (Table 4). There is also room to improve this research. Additional variables could be included in the analysis, and hierarchical Bayesian models could be developed to yield more accurate priors for new data. For example, it would be possible to develop procedures in which the prior information adapts to a specific site. For a more comprehensive and accurate Chinese fir biomass model, an additional variable describing site quality (e.g. site index) should be incorporated into the models [41]. We believe that tree biomass modeling could benefit from further exploration of Bayesian methods.

Conclusions
Established over a broad geographical range, the national Chinese fir resource encompasses large variations in climate, edaphic conditions, silviculture and genetic stock that affect biomass accumulation. These biotic and abiotic factors introduce variability into the Chinese fir biomass model, suggesting that the parameters of allometric equations are better represented by probability distributions than by fixed values. In the Bayesian framework, an appropriate prior distribution is essential. In this study, Bayesian methods with non-informative and informative priors were used to estimate Chinese fir biomass. For the informative priors, 32 biomass equations of Chinese fir were collected from published references, and the parameter distributions derived from them were used as prior distributions in the Bayesian model. The Bayesian method with non-informative priors and the classical method performed similarly. The Bayesian method with informative priors performed better than both, and thus provides a reasonable method for estimating Chinese fir biomass.

Supporting Information
File S1 Supporting Appendices. Appendix S1. Parameter estimates of 32 biomass equations (W~a (D 2 H)

Author Contributions
Conceived and designed the experiments: X-QZ J-GZ. Performed the experiments: X-QZ A-GD. Analyzed the data: X-QZ. Contributed reagents/materials/analysis tools: X-QZ A-GD. Wrote the paper: X-QZ.