Evaluation of Four Methods for Predicting Carbon Stocks of Korean Pine Plantations in Heilongjiang Province, China

A total of 89 trees of Korean pine (Pinus koraiensis) were destructively sampled from the plantations in Heilongjiang Province, P.R. China. The sample trees were measured and calculated for the biomass and carbon stocks of tree components (i.e., stem, branch, foliage and root). Both compatible biomass and carbon stock models were developed with the total biomass and total carbon stocks as the constraints, respectively. Four methods were used to evaluate the carbon stocks of tree components. The first method predicted carbon stocks directly by the compatible carbon stocks models (Method 1). The other three methods indirectly predicted the carbon stocks in two steps: (1) estimating the biomass by the compatible biomass models, and (2) multiplying the estimated biomass by three different carbon conversion factors (i.e., carbon conversion factor 0.5 (Method 2), average carbon concentration of the sample trees (Method 3), and average carbon concentration of each tree component (Method 4)). The prediction errors of estimating the carbon stocks were compared and tested for the differences between the four methods. The results showed that the compatible biomass and carbon models with tree diameter (D) as the sole independent variable performed well so that Method 1 was the best method for predicting the carbon stocks of tree components and total. There were significant differences among the four methods for the carbon stock of stem. Method 2 produced the largest error, especially for stem and total. Methods 3 and Method 4 were slightly worse than Method 1, but the differences were not statistically significant. In practice, the indirect method using the mean carbon concentration of individual trees was sufficient to obtain accurate carbon stocks estimation if carbon stocks models are not available.


Introduction
Carbon stocks estimation for the plot or the national level with reliable and verifiable techniques have always increased the social concerns [1][2], whereas the biomass calculation is the basis [3]. Compared to the popularized technique of remote sensing in biomass/carbon stocks monitoring and estimation, the traditional statistical technique which expensively and labourintensively depends on the destructive sampling is still widely accepted due to high accurate estimation on biomass/carbon stocks at both tree and forest levels [4][5][6]. Allometric model always showed good performance for components (i.e., stem, branch, foliage and root) biomass prediction of individual trees or forest and carbon stocks will be converted by the estimated biomass combining with carbon conversion factor [7]. However, the accuracy of this method is in great demand to the accuracy of the carbon conversion factor for that some of the conversion factors were only empirical (e.g., 0.5) or simply conducted by species-specific of the trees [8]. As a result, incorporating the comparison of biomass model into the carbon concentration analysis is meaningful to the carbon stocks estimation.
A common practice for estimating tree carbon stocks is that tree biomass (total and components) models are first developed based on field-sampled data, and then carbon stock is calculated by multiplying the model-predicted biomass by a conversion factor of carbon concentration. The acceptable conversion factors include (1) the carbon concentration of 50% for woody tissues and 50% for non-woody tissues, or (2) 50% for woody and 45% for non-woody tissues [9][10][11][12][13][14][15]. In addition, carbon stock models were also established to directly predict carbon stocks of tree and components and total [16]. However, some noteworthy issues and problems still existed in these methods. Some reported biomass and carbon models may not satisfy the additivity or compatibility property among tree components and total equations. The impact of the prediction errors of tree biomass on estimating carbon stock by using conversion factors is lack of quantification.
Over the last decades, numerous allometric biomass models have been developed for many tree species around the world [17][18][19][20][21]. To ensure the additivity or compatibility of tree total and component biomass models, a variety of parameter estimation methods have been proposed [22][23][24]. Generalized method of moments (GMM) has been proven the efficiency in parameter estimation without the specification for the nature of the heteroscedasticity [25][26][27]. Maximum likelihood (ML) [23] and two-stage error-in-variable model (TSEM) [24] has been also employed and given the introductions in the researches [27]. However, seemingly unrelated regressions (SUR) [22,25] and non-linear seemingly unrelated regressions (NSUR) [23] have been proven more general and flexible in recent years [28].
However, there have been limited efforts of developing tree carbon models, compared to biomass models, using tree attributes such as tree diameter and height [16]. Therefore, the carbon stocks of tree components and total are computed as the tree biomass predictions multiplied by carbon conversion factors, either an acceptable common constant (e.g., 0.5) or an empirical constant based on available data. Although this method worked well in some cases [16], there is lack of effort for quantitatively evaluating the error sources of estimating carbon stocks due to the biomass models and different carbon conversion factors, i.e., the prediction error through a sequence of operations on tree carbon stocks estimations. Furthermore, to our best knowledge, the additive system of biomass models performed well in estimating biomass and carbon stocks of total and tree components [29].
Korean pine (Pinus koraiensis Sieb. et Zucc) forests mainly distribute in Xiaoxing'an Mountain, and Changbai Mountain in Northeast of China. It is one of the important tree species for plantations in the area. In addition to the good quality of its lumber for industrial uses, its seeds are extensively harvested and sold as pine nuts, which have been the most widely traded pine nut in international commerce. The objectives of this study were (1) to develop additive systems of tree biomass and carbon stock models, (2) to evaluate four methods for estimating the carbon stocks of tree total and components, including a direct method (carbon stock models), and three indirect methods (i.e., 3 carbon conversion factors), (3) to compare and test the differences between the four methods.

Data and Method
Biomass and carbon stock data The study area is located in Heilongjiang province in Northeast China (43°25' − 53°33' N, 121°1 1' − 135°05' E) in which the forest area is 19.62 million hectares and the forest coverage is 43.16%. 5 counties were included in the study area (Fig 1) and the mean annual rainfall is 485 − 577mm and the mean annual temperature of 1.69 − 3.60°C. A total of 17 plots were selected in the different age of forest stand with a total of 89 Korean pine trees were destructively sampled in our study. The characteristics of the stand level for the study area were listed in Table 1. The data collection process in each site has been permitted by the local authority of each location. The names of the six official organizations were Mengjiagang forest farm, Boli forest bureau, Acheng forest bureau, Qing'an forest bureau and Jidong forest bureau. The whole process didn't damage and disturb the natural environment. All the location of our study was carried out in the state-owned forest land not the private forest.
The sample trees were destructively felled at the ground level with tree diameter at breast height (D, cm), total tree height (H, m) and length of live crown (CL, m) measured and recorded in the field immediately. Each stem was cut into sections by the length of 1 m with each section weighed and recorded. Then, the live crowns were stratified into three layers along the stem. In each crown layer, 3 − 5 branches were selected as the samples and the branches and foliage were separated and weighed, respectively. We sampled about 50 − 100 g branches and foliage with the record of fresh weight and then took the samples to laboratory for moisture content determination. The tree roots were excavated using both a lifting machine and manual digging. The zone of excavating roots was limited to a circle of 3 m in radius, and the fine roots (< 5 mm) were not included. All roots of the sampled trees were divided into three categories: large roots (diameter ! 5 cm), medium roots (diameter 2 − 5 cm), and small roots (diameter < 2 cm). Each root class was sampled (about 100 − 200 g), weighed, and taken to laboratory for moisture content determination. All stem, branch, foliage, and root samples were oven-dried at 80°C and weighted. The dry biomass of each component was calculated by multiplying the fresh weight of each component by the dry / fresh ratio of each component. For each sample tree, the sum of branch dry biomass and foliage dry biomass yielded the crown dry biomass. The sum of crown dry biomass and stem dry biomass gave the aboveground biomass. The sum of aboveground dry biomass and root dry biomass produced the total tree biomass (Fig 2).
For each oven-dried sample of stem, branch, foliage and root approximately 50 mg was used for measuring carbon concentration using a Multi N/C 3000 analyzer with 1500 Solids Module (Analytik Jena AG, Germany). The samples were then burned completely at 1200°C in a vial containing pure oxygen, and the carbon emitted was measured with a non-dispersion infrared ray (NDIR) analyzer. The carbon stock of each component was then calculated by multiplying the biomass of each tree component by the respective carbon concentration. Thus, the carbon stock of individual tree was obtained by summing the component estimates. The carbon concentration for the each component and total tree were showed as

Model specification and estimation
The data used to fit the models has been supported by the official authority and has no negative effect on the environment of human being and habitat of wildlife. Over the last two decades, researchers have tried different allometric equations for modeling tree biomass. The following equation was considered excellent in performance and remarkable in flexibility [22,30,31].
where Y is response variable (tree biomass or carbon stocks), X is independent variable, a and b are model parameters to be estimated. Diameter at breast height which was defined to be 1.3m above the ground level, total tree height (H) and the combination forms were widely employed in the models. In our study, variable D and the combination form D 2 H were used to develop biomass and carbon stock models so that X can be D or D 2 H. The additive systems of tree biomass and carbon stock models were formulated based on the methods by Tang and Li [32], in which the total tree biomass and carbon stocks were used as the constraints. The compatible biomass and carbon stock model which was derived from Eq 1 was showed as Eq 2.
where Y si , Y bi , Y fi , Y ri are biomass or carbon stock for the stem, branch, foliage and root respectively. a i , b i , r 1i , r 2i , r 3i , k 1i , k 2i and k 3i are model parameters to be estimated using the jackknifing technique. The subscript i = 1, 2 were biomass models based on Y = aD b and Y = a(D 2 H) b which were defined as BM1, BM2 respectively and i = 3, 4 were carbon stock models based on Y = aD b and Y = a(D 2 H) b which were defined as CM3, CM4. The detailed derivation process for Eq 2 was showed in the Appendix. Model [2] comprises a system of four nonlinear equations in our study. NSUR was applied to estimate the model coefficients in these nonlinear simultaneous equations using SAS/ETS PROC MODEL procedure [26]. NSUR employs the nonlinear joint generalized least squares to explain the correlations in the system of nonlinear biomass and carbon stock models [23,25,28]. Besides, NSUR also allow each model of the system to specify its own weighting function to overcome heteroscedasticity. To overcome the heteroscedasticity in biomass and carbon stock model, weighted functions were multiplied at the two side of the non-linear function with 1/D x and (D 2 H) p so that this problem can be resolved. The weighting function of each tree component for BM1 and CM3 was Y i = resid.Y i /D p (i = 1, 3 which has been defined above) and for BM2 and CM4 was Y i = resid.Y i /(D 2 H) p . Error variance model e i = x p was developed (e i is the model residual of the unweighted model, x is D or D 2 H) in which p was determined for each component [28].

Model assessment and validation
In this study, jackknifing technique was used to assess the model parameters and validation model performance. The entire data size by leaving one out (n-1, n is the sample size) were used to estimate the model parameters, thus the one left sample not used in the model fitting was employed to predict by the fitted model. The R a 2 (Eq 3) for each component and the total and the prediction for each left out independent variable were calculated for each round of fitting. The mean and standard deviation for each parameter were calculated. R a 2 was used to assess the goodness-of-fit of the model. Due to the good performance in incorporating both variance of prediction error and the bias of prediction, the root mean squared error of prediction (RMSEp) was used as the model validation criteria [33,34]. As a result, a total of five statistics (Eqs 5-9) were used to validate the model performance.
Adjusted coefficient of determination: where Root mean squared error of prediction: Mean prediction error: Mean prediction error percent: Mean absolute error: Mean absolute percent error: Where Y i is the ith observed value of response variable (i.e., biomass or carbon stock),Ŷ i;Ài is the ith predicted value for the dependent variable without inclusion in the fitting dataset, Y is the mean of observed response variable, n is sample size, and p is the number of model parameters.

Comparison of carbon stock estimation by four methods
Four methods were used in this study to evaluate the carbon stocks of tree components and total: (1) the developed additive systems of carbon models (CM3 and CM4) were directly used to compute the carbon stocks of tree components and total, given tree D and H (Method 1); (2) the developed additive systems of biomass models (BM1 and BM2) were first used to estimate the biomass of tree components and total, and then the predicted biomass was multiplied by the carbon conversion factor 0.5 (Method 2); (3) the developed additive systems of biomass models were first used to estimate the biomass of tree components and total, and then the predicted biomass was multiplied by the average carbon concentration of total tree (i.e., 0.48; Fig  3) (Method 3); and (Eq 4) the developed additive systems of biomass models were first used to estimate the biomass of tree components and total, and then the predicted biomass was multiplied by the average carbon concentration of each tree component (i.e., stem 0.47, branch 0.48, foliage 0.49, and root 0.48; Fig 3) (Method 4). The last three methods were considered as the indirect methods. Relative root mean squared error (RMSE r ) (Eq 10) was used to compare the prediction error of the four methods. Thus, the relative root mean squared error for carbon (RMSE rc ) was calculated and RMSE rc of indirect method was based on Eq 11.
where C mi is the ith observed value of carbon stock for the m tree components (m = 1, 2, 3, and 4 denotes stem, branch, foliage and root, respectively),Ĉ mi; Àmi is the ith predicted value of carbon stock without inclusion in the fitting dataset for the m tree components.Ĉ m is the mean of predicted value of carbon stock for the m tree components with the 89 rounding processes, and the other symbols were defined as above.
Given Eq 11, RMSE rc of indirect method was derived as Eqs 12 and 13.
Where W mi is the ith observed value of biomass for the m tree components (m = 1, 2, 3, and 4 denotes stem, branch, foliage and root, respectively),Ŵ mi; Àmi is the ith predicted value of biomass without inclusion in the fitting dataset for the m tree components,Ŵ m is the mean of predicted value of biomass of the m tree components with the 89 rounding processes, C mi % is the ith observed value of carbon concentration for the m tree components, C mk % is the kth carbon conversion factor used in the three indirect methods for the m tree components (k = 1 denotes carbon conversion factor 0.5, k = 2 denotes the average carbon concentration of total tree 0.48, and k = 3 denotes the average carbon concentration of each tree component).
To test whether the significant difference will be produced by the four methods for predicting carbon stocks of the tree components and total, the analysis of variance with randomized completed block design (RCBD) was used [35][36][37], in which the treatment was the four methods and blocking was each tree because the four methods were applied to each tree simultaneously. The least significant difference (LSD) was used for multiple mean comparisons. The PROC GLM procedure of SAS 9.3 was used for computation [38,39], and the significance level was set at α = 0.05.

Model fitting and validation
The mean values and standard deviations of the each parameter for the compatible biomass models (i.e., BM1 and BM2) and the carbon stock models (i.e., CM3 and CM4) using jackknifing technique were listed in Table 2. The mean values and standard deviation of R a 2 for compatible biomass and carbon stock models were listed in Table 3. Validation results based on jackknifing technique for compatible biomass models were showed in Table 4 and for compatible carbon stock models were showed in ) than did BM1 for stem and root, indicating that adding tree H into the biomass model increased the explanation power of these two components. The jackknifing validation also proved this conclusion with significant larger RMSE p , ME and MAE for stem and root for BM1 compared to BM2. However, BM2 was much inferior to BM1 when considering the branch, foliage and the total. For CM3 and CM4, the comparison results explained the similar conclusion to BM1 and BM2. However, BM2 and CM4 have much model parameters to be estimated and the correlations among these parameters could not be ignored. Thus, we preferred to specify the compatible biomass and carbon stock models using D as the only independent variable to make methods comparison.

Comparison of four methods in carbon stock prediction
Another statistic, namely, relative root mean squared error for carbon (RMSE rc ) was derived to compare the carbon stock prediction of four methods [9,16,18,40]. The RMSE rc of all tree components and total with the four methods was showed as Fig 4. For tree stem, Method 1 was the best with the lowest RMSE rc , followed by Method 4, Method 3, and Method 2. For tree branch, Method 1 was the best method with the lowest RMSE rc , followed by Method 4, Method 2 and Method 3. As for foliage, Method 1 was also the best, followed by Method 4, Method 3 and Method 2. The RMSE rc of Method 1 for root was slightly larger than Method 2, and still lower than Method 3 and Method 4. For the total, Method 1 was notably lower than Method 2 but slightly larger than Method 3 and Method 4.
To test the differences among the four methods, the prediction errors were used as the response variable using analysis of variance (ANOVA) with least significant difference (LSD) for multiple mean comparisons. The test results were showed in Table 6. The F-test in ANOVA was statistically significant (p-value < 0.01) for stem, branch, foliage, root and total tree carbon prediction. The blocking (trees) was also significant, indicating the differences among the sampled trees were also relatively large. The multiple mean comparison results indicated that (1) for stem, Method 1 was significantly different from Method 2 and Method 3, but there was no difference between Method 3 and Method 4, as well as no difference between Method 1 and Method 4 ( Table 6). One the other hand, for branch, foliage, and root, there was no difference between the four methods. For the total, there was no significant difference between Method 2 and Method 3, and no difference among Method 1, Method 3 and Method 4 ( Table 6).

Discussion
To investigate which model is better for fitting the biomass and carbon stock data, tree diameter alone and the combination of diameter and height were used as the predictor variables for predicting the tree component and total biomass and carbon stocks in this study. Some researchers also used other tree variables such as crown width, stand age, wood density, diameter outside bark at the base of the live crown, crown height, and etc., which may performed well in some studies [18,23,[41][42][43][44]. However, given the practical difficulties and costs of obtaining crown measurements, tree diameter and height are the most frequently used predictors in model construction. In our study, adding H into the biomass and carbon stock models only increased the R a 2 of stem and root because the differences of tree height at the same tree diameter contributed more information for the biomass/carbon stocks of stem and root [45]  and the estimation errors of other tree components (branch, foliage) were increased consistent with the decreased of R a 2 after adding H into the models (Tables 3, 4 and 5). Therefore, we chose D as the sole predictor variable in this study to develop the biomass and carbon stock models. To ensure the additivity of the tree components and total, the compatible biomass and carbon stock models were constructed with the total tree biomass and carbon stock as the constraints based on the allomatric equation [32]. Furthermore, the R a 2 of some tree components might be sacrificed to ensure the additivity for the components and total compared to fit each component separately. Given NSUR is more flexible and has been widely accepted by forest modelers [25,27,28], we used NSUR as the only parameter estimation method in this study and did not compare NSUR against others parameter estimation methods. Currently, the carbon stock calculation is based on either observed carbon concentrations [46] or a commonly accepted carbon conversion factor 0.5, which is multiplied by the estimated tree biomass [47]. However, the prediction error using indirect method is lack of evaluation. In this study we used RMAE rc to compare and evaluate carbon stocks of tree components  of individual trees for the indirect method. The statistical tests (ANOVA with LSD) indicated that, for stem, the indirect method using the carbon conversion factor 0.5 produced significantly larger errors than the other two methods. Consequently, the larger prediction error for stem would surely lead to the large prediction error for the total biomass because it is known that more than 50% of the total biomass is located in the stem [16]. It seemed that the other two indirect methods, using either the average carbon concentration of total tree (0.48) (Method 3) or the average carbon concentration of each tree component (stem 0.47, branch 0.48, foliage 0.49, and root 0.48) (Method 4), were superior to the carbon conversion factor 0.5 (Method 2). Although the differences between Method 3 and Methods 4 was not significant, Method 4 would require considerably more time and effort to collect in the field. Thus, Method 3 (the average carbon concentration of total tree) should be sufficient to obtain good carbon stock prediction with the estimated tree biomass from the models. For branch, foliage and root, the prediction error among the four methods was not significant, this can be explained that the real carbon concentration of the three components are more close to the total tree concentration. Lehtonen et al. (2007) [48] analyzed the prediction error of biomass expansion factors and the error from variable measurements and regression functions. Their results showed that most of the error was due to the regression models. Therefore, it is crucial to develop and select the "best" biomass models as the foundation for computing carbon stocks using indirect methods. In our study, we used diameter at breast height and the combination of diameter at breast height and tree height to establish the biomass and carbon stock models. Based on goodnessof-fit and validation results for the different kinds of models, we selected the model which employed diameter at the breast height to make comparison among the four methods to decrease the error of indirect methods deriving from the biomass model as large as possible. There were no significant differences between Method 3 and Method 4 according to the total prediction error estimation indicating that using observed average carbon concentration of individual tree and / or tree components would lead to better prediction for carbon stocks. We concluded that it was not necessary to measure the carbon concentrations for tree components because they are time-consuming and costly. The average carbon concentration of individual tree should be sufficient to obtain accurate computation for carbon stocks given reliable and accurate estimation of biomass. The results of our study were different from the study of [16], in which they used three methods: 1) estimated total biomass was multiplied by weighted mean carbon concentration; 2) estimated tree compartment biomass was multiplied by average carbon concentration of tree components; and 3) developing carbon stocks prediction model directly. Mello et al. (2012) [16] found that there was no statistical difference among the three methods. However, there were only 30 sample trees included in Mello et al. (2012) [16]. The small sample size might lead to an insignificant comparison. In addition, the biomass and carbon models developed in Mello et al. (2012) [16] did not hold the additivity or compatibility of tree components and total. In addition, the differences of prediction errors among the three methods at tree component levels were not clear in the study of Mello et al. (2012) [16].

Conclusion
Four methods were used to predict carbon stocks of tree components and total for Korean pine trees in the plantations of northeastern China. Based on the results of goodness-of-fit and model performance statistics of the biomass and carbon stock models, we chose the compatible systems of models with tree diameter as the sole independent variable in this study. The NSUR method was used to estimate the parameters in these models. The prediction errors of four methods for estimating carbon stocks of tree components and total were compared and tested.
The direct method using the compatible carbon stocks models (Method 1) was the best method for predicting the carbon stocks of stem and total. The indirect method using the average carbon concentration of tree (Method 3) and average carbon conversion of each tree component (Method 4) produced relative small prediction errors, while the indirect method using the carbon conversion factor 0.5 (Method 2) was the worse for estimating the carbon stocks for tree components and total of Korean pine in this study. However, there were no significant difference between Method 3 and Method 4. Thus, the average carbon concentration of individual tree rather than the average carbon concentration for each tree component is good enough to calculate the carbon stocks for tree components and total.
Supporting Information S1 File. Biomass data used in the study. (XLS)