Compatible Models of Carbon Content of Individual Trees on a Cunninghamia lanceolata Plantation in Fujian Province, China

We tried to establish compatible carbon content models of individual trees for a Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) plantation from Fujian province in southeast China. In general, compatibility requires that the sum of components equal the whole tree, meaning that the sum of percentages calculated from component equations should equal 100%. Thus, we used multiple approaches to simulate carbon content in boles, branches, foliage leaves, roots and the whole individual trees. The approaches included (i) single optimal fitting (SOF), (ii) nonlinear adjustment in proportion (NAP) and (iii) nonlinear seemingly unrelated regression (NSUR). These approaches were used in combination with variables relating diameter at breast height (D) and tree height (H), such as D, D2H, DH and D&H (where D&H means two separate variables in bivariate model). Power, exponential and polynomial functions were tested as well as a new general function model was proposed by this study. Weighted least squares regression models were employed to eliminate heteroscedasticity. Model performances were evaluated by using mean residuals, residual variance, mean square error and the determination coefficient. The results indicated that models with two dimensional variables (DH, D2H and D&H) were always superior to those with a single variable (D). The D&H variable combination was found to be the most useful predictor. Of all the approaches, SOF could establish a single optimal model separately, but there were deviations in estimating results due to existing incompatibilities, while NAP and NSUR could ensure predictions compatibility. Simultaneously, we found that the new general model had better accuracy than others. In conclusion, we recommend that the new general model be used to estimate carbon content for Chinese fir and considered for other vegetation types as well.


Introduction
In recent years, many ecologists and policy makers have paid great attention to searching for ways to solve the problem of increasing concentrations of greenhouse gases, especially carbon dioxide (CO 2 ) [1]. Due to the great potential of carbon sequestration, forests are viewed as one way to reduce atmospheric CO 2 concentration [2]. The amount of carbon fixed in forest vegetation accounts for 2/3 of that in terrestrial ecosystems, making forest ecosystems a crucial carbon sink [3]. Thus, forests play an important role in preventing an increase in atmospheric CO 2 concentrations and in the mitigation of global climatic changes. In this context, much research has already been conducted to estimate and predict the potential of carbon storage in different forest ecosystems [4][5][6][7][8][9]. However, it is complicated to estimate carbon storage due to the nonlinear and stochastic character of many vital processes, the complex interactions between different vegetation organs, and many limitations generated by incomplete and insufficient availability of data [10]. One appropriate approach for accurate simulation of carbon content is to develop models based on allometric relationships [11][12], and therefore, it is essential to explore more effective models for estimating carbon storage.
Carbon content of an individual plant refers to the part of carbon completely fixed in vegetation and is the net result of a series of uptake (e.g. photosynthesis) and loss processes (e.g. respiration). The amount of carbon content can be expressed as the total sum of carbon content of all components (including bole, branches, foliage leaves and roots) of an individual tree. Huang et al. [13] established carbon content models for individual trees of Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.). They used power function as a basic model form to construct models with some testability factors to create the carbon yield table for predicting carbon content of individual trees. However, the main deficiency of their research findings was only considering one basic model without compatibility for estimating carbon content, this weakness might have led to some deviations in practical application. In general, compatibility means that if one component is part of the tree, it is logical to expect the estimate of the component not to exceed the estimate of the whole tree. If a component is defined as the sum of a few subcomponents, its regression estimate value must be equal the sum of the regression estimates of all subcomponents. So like previous studies which were conducted based on allometric scaling theory [14][15], we decided to develop some equations to establish carbon content models and to provide the scientific rationale to guarantee compatibility of results.
The detailed objectives of this study are as follows: (i) to validate the performance of different carbon content models, especially incompatible and compatible ones; (ii) to analyse and compare different approaches for choosing the best way to solve the compatibility problem; and (iii) to propose a new general model for developing the form of carbon content model for individual trees.

Study area
Fujian Province is located in southeast China. The geographical coordinates range between 23°3 0' N to 28°22' N, 115°50' E to 120°40' E ( Fig 1A). It is situated near the Tropic of Cancer and is influenced by monsoon circulation such that the climate of this area is a subtropical marine monsoon. The province is characterized by abundant sunlight and rainfall. The annual average temperature is between 17°C and 21°C, while rainfall varies over a range of 1400 to 2000 mm. As one of the main production areas of Chinese fir (Cunninghamia lanceolata), according to the Eighth National Forest Resources Inventory of China (2009 to 2013), the forest area of Fujian Province is 8,012,700 hm 2 with forest coverage that has reached 65.95%, hence ranking the first in the country. The principal soil types contain red, yellow and lateritic red earth. These climate and edaphic factors provide favourable environmental conditions for production of Chinese fir in Fujian province.

Data resources
We conducted a stand survey from three Chinese fir plantation distribution areas ( Fig 1B). These include (i) The central region (Jianyang City) which is found between 27°06' N to 27°43' N, 117°31' E to 118°38' E, (ii) The general region (Yong'an City) located at 25°33' N to 26°12' N, 115°56' E to 117°47' E and (iii) The edge region (Minhou County of Fuzhou City) located between 25°47' N to 26°37' N, 118°51' E to 119°25' E in Fujian Province [16]. The authority responsible for each study location: (i) Study areas of Jianyang City belong to Forestry Bureau of Jianyang City; (ii) Study areas of Yong'an City belong to Forestry Bureau of Yong'an City; (iii) Study areas of Minhou County of Fuzhou City belong to Baisha State-owned Forest farm of Minhou County. All studies in these fields did not involve endangered or protected species and all activities for research were permitted by the authority responsible for each study location. Based on site index, stand age, elevation, slope gradient, slope aspect and slope position, we used a quadratic orthogonal rotational combining design [17][18] to lay out 54 plots (18 plots in each region, for details, see S1 Table). Some key points of sample plots included: stand age from 6 to 12 years, stand density from 1000 to 4000 plants/hm 2 , site index from 6 to 28 m, stand average diameter at breast height (DBH, D) from 7.1 to 30.8 cm, and stand average tree height from 8.5 to 23.8 m. Based on the characteristics of each diameter class, we chose one sample tree per plot to investigate the biomass.
After the tree was felled, we excavated the soil to get roots by using the whole roots digging method [19]. The roots included large sized roots (2-5 cm), middle sized roots (0.5-2 cm), small sized roots (0.2-0.5 cm), fine roots (< 0.2 cm) and stump roots (> 5cm). Then the whole plant parts were obtained. We first measured and recorded the fresh weights of bole, branches, foliage leaves and roots of each sample and took some subsamples from each component. In the laboratory, we processed the subsamples, and then all samples were oven dried at 105°C for 2 hour once a time until a constant weight for each sample was obtained. Then oven dry weights were calculated for bole, branches, foliage leaves and roots. After drying, grinding, sieving and heating with potassium dichromate (K 2 Cr 2 O 7 ), we obtained the carbon content of the subsamples for each component. Then, we calculated the average ratios of carbon content (kg per 1 kg weight) from different regions: boles = 0.5229(s.d. = 0.0059), branches = 0.5008 (s.d. = 0.0055), foliage leaves = 0.4794(s.d. = 0.0057), roots = 0.5048 (s.d. = 0.0062), aboveground = 0.516 (s.d. = 0.0058) and the whole tree = 0.514 (s.d. = 0.0057). Next, we multiplied the biomass of boles, branches, foliage leaves and roots by the corresponding ratio of carbon content. Finally, we obtained the carbon content of each component of the sample trees as model sample data to establish models.
Simultaneously, we collected validation data from previous publications on Chinese fir biomass in Fujian Province [19][20][21][22][23][24][25][26][27] and chose 27 sample trees that had complete measured data, including D, H and biomass of each component. However, because some measurement methods were inconsistent, we decided to make the following adaptations before using them: (i) We classified component data into four parts: bole (including bark), branches, foliage leaves and roots. Whenever we found existing data (from literatures) on aboveground and the whole biomass, we used them. Otherwise, we assumed that the aboveground biomass is the sum of bole, branches and leaves biomass, and the sum of aboveground and roots biomass was the whole tree biomass.
(ii) Because we needed to validate the predictive abilities of carbon content models, we converted biomass data into carbon content. Assuming that the average ratios of carbon content calculated in this study were accurate and applicable, then we used these ratios to obtain carbon content of validation samples. Table 1 shows some statistics for the modelling and validation data (details of modelling and validation samples are in S2 and S3 Tables, respectively).

Basic models and Variable selection
In general, to obtain the carbon content of plants, scholars usually establish biomass models at first, and then use conversion factors to obtain carbon content results [28][29][30]. In this study, we directly measured the carbon content of each tree component (i.e., bole, branches, foliage leaves and roots). We applied power, exponential and polynomial functions (which were used to build models for biomass [14,[31][32][33][34]) and a new general model proposed by this study to establish carbon content model of individual trees. We used F-test decider [35] to determine the maximum order of polynomial function that was equivalent to "2" so that a polynomial function form was a quadratic polynomial. According to the findings from using different variables to estimate biomass in previous studies, it is recommended to use D which is easily measured and is the best parameter for estimating the biomass of each component [36][37]. Moreover, based on different types of models reported by previous researches [38][39], the models with two dimensional variables, D and H (tree height), were much more strongly correlated with biomass than those with only one variable. For that reason, we focused on many different combinations between D and H, including D, DH, D 2 H and D&H (D&H meant two separate variables in bivariate model) as variables to establish carbon content models.
We assumed that f i (x) (i = 1, . . ., 6) represented carbon content equations of the whole tree, aboveground, for bole, branches, foliage leaves and roots. The different basic models were as follows: Single variable model with different variable combinations, including D, DH and D 2 H: Bivariate model with variables D&H: where by Eqs 1 and 4 represented a form of a power function; Eqs 2 and 5 represented a form of an exponential function; Eqs 3 and 6 represented a form of a polynomial function; e is the base of natural logarithms; a, b, c, d, f, g, h, j were parameters. In this study, we first proposed a new general model as a basic model, and the form of a single variable model could be written as: where by α, β, γ, η, θ were parameters, and when η = 0, θ = 1, Eq 7 could be transformed into a form of a power function (Eq 1); when β = 0, θ = 1, Eq 7 could be transformed r into a form of an exponential function (Eq 2); when β = 1, η = 0, and Eq 7 could be transformed into a form of a polynomial function (Eq 3). Hence, we considered Eq 7 as the general model that contained various forms of functions, including power, exponential and polynomial functions. Similarly, the general bivariate model was: where by α, β, γ, η, θ, ε were parameters.
To compare and evaluate performance of different carbon content models, we used four fit statistics: mean residuals (MR), residual variance (RV), mean square error (MSE), and determination coefficient (R 2 ). They were calculated using the following equations: where by y i represented observed values of the i th sample tree;ŷ i represented estimated values of the i th sample tree; y represented arithmetic mean of all observed values; N represented number of sample trees.

Single optimal fitting
Single optimal fitting (SOF) used the traditional least square method to establish a single optimal regression model. This approach fitted equations separately based on the respective carbon content data of each component of individual trees. Consequently, the sum of carbon content predictions from the separate models of each component might not equal carbon prediction from the whole tree carbon content model. The model was expressed as follows: where by f i (x) represented a single optimal regression model of each component; W i (i = 1, . . ., 6) represented carbon content of the whole tree, aboveground, bole, branches, foliage leaves and roots respectively; ε i represented random error of each component.

Compatible estimation approach
Nonlinear adjustment in proportion. According to the concept of compatibility, to satisfy that the sum of each component was equal to the whole tree, it was important to ensure that the sum of each component proportion was 100%. Following this logic, Tang et al. [40] adopted the nonlinear adjustment in proportion (NAP) to guarantee compatibility. This approach had two alternative procedures in common use; one was directly controlling the total using proportional functions (NAPI), and the other was jointly controlling from level to level by ratio functions (NAPII).
The procedure of NAP I was to directly assign the whole tree amount into bole, branches, foliage leaves and roots by each respective proportion, and each component function was expressed as follows: where by f 1 (x) represented a single optimal regression model of the whole tree carbon content; and f 3 (x), f 4 (x), f 5 (x), f 6 (x) represented a single optimal regression model of carbon content of bole, branches, foliage leaves and roots, respectively. The NAP II firstly assigned the carbon content of the whole tree into two parts by proportion, including aboveground and roots. Then, the aboveground carbon content was divided into bole, branches and foliage leaves. This approach not only guaranteed that the sum of the component predictions was equal to the whole tree but also that the bole, branches and foliage leaves predictions would sum up to the aboveground prediction. Detail models were expressed by the following equations: where by f 2 (x) represented the single optimal regression model of aboveground carbon content. Nonlinear seemingly unrelated regression. Parresol [41] presented an approach of nonlinear seemingly unrelated regression (NSUR) to solve the question on the addition of nonlinear biomass equations. The methodology considered the correlation among each component and guaranteed the compatibility between the whole tree and each component. In this study, we combined NSUR and the modified simplex method (MSM) [42] to estimate parameters.
The equation for applying NSUR in this study was described as follows: Heteroscedasticity removal Normally, carbon content data exhibits heteroscedasticity; the error variance is not constant across all observations. Therefore, we removed heteroscedasticity before estimating parameters. The usual methods employed to remedy heteroscedasticity are the logarithmic transformation and weighted least squares regression [43]. In this study, we used weighted least squares regression to solve the heteroscedasticity problem based on regression analysis between variable values and residual square values of a single optimal regression model of each component [44]. According to Akaike Information Criterion (AIC) index, the optimum form of weighted function was a power function [15,45], and the equation was as follows: where by the forms of g i (x) were D a 1 ,ðDHÞ a 2 ,ðDH 2 Þ a 3 and D a 4 H a 5 when setting different variables, respectively; a 1 , a 2 , a 3 , a 4 , a 5 were parameters; i represented different components.

Software
All results were calculated by using DPS 7.05 [46].

Results
According to Eq 26, weighted functions of different variables were calculated (see details in S4, S5, S6 and S7 Tables). After removing heteroscedasticity by using weighted functions, we used different variables to establish carbon content models of each component and the whole tree for a modelling sample. Due to space limitations, we only listed optimal estimation results of different variables in Table 2, and the detail comparison evaluation indices of four basic models in S8, S9, S10 and S11 Tables. We found ( Table 2) that the best basic model to estimate carbon content was the general model (Eq 7). This model had the highest R 2 as well as the lowest RV and MSE for each component and the whole tree, when only D as variable was used (S8 Table). However, there were different results when a combination DH as a variable was used. Except for foliage leaves, the results revealed that the best model form was a polynomial (quadratic polynomial, Eq 3). Moreover, using Duncan's multiple range tests, we found that there was a significant difference (at 0.01 significance level) between Eq 2 and the other three models (Eqs 1, 3 and 7) with variable DH (S9 Table). Interestingly, under the condition of rounding to four decimal places and using D 2 H as the combination variable to establish models (S10 Table), the evaluation indices of a power function and the general model were almost the same except for branches (whose optimal model was the general model, Eq 7) and aboveground (whose optimal model was polynomial, Eq 3). Meanwhile, Fig 2 compares relative errors of predictions by using optimal models with different variable combinations (based on Table 2). Fig 2A and 2B show the mean relative and maximum relative errors for each component with different variables respectively. Fig 2C compares the sum of all prediction relative errors for different variables. The bivariate models using D&H as variables had the best fitting abilities for all components and the whole tree (see Table 2 and Fig 2). Likewise, by comparing all optimal estimation results, the optimal form was the bivariate general model (Eq 8) except for roots, for which a polynomial function (Eq 6) had higher precision than other forms. D represented diameter height at breast and H represented tree height. The formula of relative error was ð jestimationÀobservationj observation Þ Â 100% Based on the results listed in Table 2, the optimal basic model is combined with D&H to establish compatible models as follows: Eq 4-power function (bole), Eq 6-polynomial function (roots), Eq 8-bivariate general model (branches, foliage leaves, aboveground and the whole tree). According to the formulas from Eq 14 to Eq 25, we obtained compatible models using Eq 1 (Eq 4), Eq 3 (Eq 6) and Eq 7 (Eq 8) represented a power function (Eq 4 was bivariate), a polynomial function (Eq 6 was bivariate) and the general model (Eq 8 was bivariate), respectively. a Under the condition of rounding four decimal places, the results were almost the same between power function (Eq 1) and general model (Eq 7) for bole, foliage leaves, roots and the whole tree models with variable D 2 H (in S10 Table). Considering Eq 1 had less parameters, we considered Eq 1 was optimal due to more convenient application in practice. b Under the condition of rounding four decimal places, the R 2 of a power function (Eq 4) and the general model (Eq 8) were same, but the general model had lower RV and MSE, so we considered Eq 8 was optimal (in S11 Table).
doi:10.1371/journal.pone.0151527.t002  Table 3. Through Duncan's multiple range tests, there was no significant difference (at 0.01 significance level) among different approaches. The accuracies of different compatible approaches were characterized by this sequence: whole tree > aboveground > bole > roots > foliage leaves and branches. Furthermore, the performance of compatible models was relatively good. The precision of aboveground (0.9932) and the whole tree (0.9947) by using NSUR were even better than the results by SOF. The highest R 2 by SOF for aboveground and the whole tree were 0.9928 and 0.9944, respectively ( Table 2). As for the models of each component, we found that the results by NAP approach were slightly better than results by NSUR.
To compare the predictive abilities between compatible and incompatible models, we estimated the parameters of the optimal models using SOF (incompatible models) and compatible models by NSUR (Tables 2 and 3). Then, we used these models to simulate validation data and compute various evaluation indices ( Table 4). The results of T-test revealed that there were no significant differences (at 0.01 significance level) between SOF and NSUR. As a complete unit, predictive effects of model by NSUR were superior to those by SOF. However, whether by SOF or by NSUR, the predictive accuracies of branches and foliage leaves were relatively low (at approximately 0.80). Additionally, the accuracy of the aboveground carbon content predicted by SOF (0.8720) was obviously lower as compared to the result simulated by NSUR (0.9548). For the carbon content of bole and roots as well as the whole tree, we found that there were no significance differences between the two approaches. The relative errors of some predictive Through Duncan's multiple range tests, there was no significant difference (at 0.01 significant level) among different approaches. a The optimal basic models for each component and the whole tree were as follows: Eq 4-a power function for bole, Eq 6-a polynomial function for roots, and Eq 8-the bivariate general model for branches, foliage leaves, aboveground and the whole tree. Specifically, we used Eq 4 as the optimal model for bole in order to reduce estimation parameters on the condition of same R 2 between Eqs 4 and 8. b On the basis of the procedure of NAP I, it did not involve aboveground in processing of estimating compatible models, so the values of evaluation indices were null. values had large deviations. The maximum deviations of the two approaches reached 95.80% (by SOF) and 131.24% (by NSUR). In terms of average relative error, the values of branches and foliage leaves were greater than 20%. However, there were two outliers (detected by triple standard deviation approach, at 0.01 significant level) in predictive value for branches by NSUR. After elimination of these two outliers, the maximum relative error and the mean relative error for branches by NSUR changed to 67.27% and 16.92%, respectively. In addition, other component predictive values were relatively good, including the bole, roots, aboveground and the whole tree. Although R 2 (0.9296%) and mean relative errors (11.39%) were acceptable for predicting the carbon content of roots by using SOF, the results showed that fourteen (14) predicting relative error values were greater than 20% (Table 4). So we considered that the effects of carbon content prediction on roots by SOF were not very good.

Necessity of establishing compatible model
Compatibility requires that the sum of components equal the whole tree, so the sum of percentages calculated from component equations must equal 100%. But using SOF to construct models would lead to greater deviations in predicting the sum of the carbon content of each component and that of the whole tree. Some examples were listed in Table 5 to illustrate incompatibility for modelling and validation samples by using SOF. It showed that Through T-test, there was no significant difference (at 0.01 significant level) between SOF and NSUR. a As results of performances of a power function were almost the same as the general model, we decided to use power function as optimal models by SOF, in order to reduce estimation parameters. b Through detecting by triple standard deviation approach, there were two outliers among results of relative error for branches (131.24% and 99.83%), so we recalculated results after elimination of these two values (the latter ones of Maximum, Mean and N of Relative Error). incompatible errors existed in estimated values for both the whole tree and aboveground carbon content while the magnitude of errors was unknown. If these estimated results were applied in a practical situation, it might lead to confusions and mistakes. Hence, it was necessary to solve the compatibility problem when estimating carbon content. Through comparisons in this study, we found that compatible models took full account of mathematical relations and inherent correlations between the whole tree and each component. These models which were constructed by using NAP and NSUR guaranteed that the sum of each component's carbon content was equivalent to that of the whole tree in the range of systematic errors permitted. Additionally, predictive abilities of the compatible models were even better than incompatible models to some extent (Table 4). So we concluded that it was better to establish compatible models for simulating carbon content of individual trees.

Comparisons of different fitting approaches
Although we could obtain single optimal models by using SOF, these models did not guarantee compatibility. However, NAP and NSUR guaranteed the property of carbon content additivity among the components of the tree and the whole tree (Table 3). Nonetheless, there were still some flaws in calculation for these two compatible estimation approaches. NAP used single optimal models calculated by SOF to determine the proportion coefficient for each component, and parameters of single optimal models were estimated using the traditional least square method. Due to the parameters being calculated without considering compatibility, there were prediction errors in the results. When using NAP, these errors were passed down as a process of grading adjustment, and the final results of the compatible models of each component were no longer optimal [15]. Referring to relevant literature [41], NSUR was based on a nonlinear joint-generalized regression with parameter restrictions. In this context, NSUR permitted each component model to use its own weighting function for removing heteroscedasticity and obtain a lower variance for the aboveground and the whole tree carbon content models. Unlike the linear model, a nonlinear estimation would pose many challenges [41], such as the initial parameters estimation needing to be specified. If we could not obtain appropriate starting values, the computing time would be long and a nearly flat gradient could lead to slow convergence. Various methods could be available for obtaining starting values such as experience and estimating linearization by transformation with ordinary least squares. If these issues were not a concern, it would be preferable to use NSUR to establish compatible models which had higher accuracies and smaller estimation errors in theory, by considering overall precision and stability. However, we noticed that according to compatible results in Table 3, there were some small contradictions between this study and previous literature [15,40], which concluded that the precision of NSUR was better than NAP. Except for the aboveground and the whole tree, the fitting accuracies of bole, branches, foliage leaves and roots by NSUR were slightly inferior to those by NAP, where the results were different from existing theories. After carefully thought theories, we speculate that reasons for this contradiction include: (I) The nonlinear parameter estimation method had affected the final results by NSUR. We only used MSM to calculate parameters, so we inferred that if we used other advanced methods to obtain parameters, the accuracies of the models would be improved; (II) Because existing uncertainty of predictive errors in every optimal model by SOF, error cancellation occurred during adjustment when we used NAP to process compatibility. Based on these two aspects, it might appear that the fitting accuracies of NAP were slightly better than NSUR, but there were no statistically significant differences between these two methods.
In addition to NAP and NSUR, Tang et al. [40] proposed another compatible estimation method called nonlinear joint equation sets estimation. This method is combined with the advantages of NSUR and NAP, but compatible models constructed by this approach have more parameters and more complicated model forms. Through theoretical consideration and empirical evidence, NSUR was recommended as a more suitable approach to solve compatibility in general.

Optimization basic model of carbon content
In this study, we compared model performances based on different variables, such as D, DH, D 2 H and D&H. The results indicated that models with two dimensional variables (DH, D 2 H and D&H) were always superior to those with a single variable (D). Moreover, we found that the most useful predictor variable combination was D&H (Table 2 and Fig 2). Actually, DH or D 2 H are both special forms of D&H combinations from mathematical aspect. In practice, D and H are two easily collected biophysical properties of individual trees and generally available in most forest inventories. Many species-specific equations for relating D&H to standing volume of wood or total biomass (carbon content) are commonly used for forest ecological studies [36][37][38][39]. So we conclude that it was better to use D&H as variables to establish carbon content models in this study.
Existing studies show that the selection of model forms always depends on subjective experience instead of objective comparisons of different models. However, in this study the performance of four basic models were compared in order to find the optimal form for simulating carbon content. Specially, the general model proposed by this study could be turned into other three function models (i.e. power, exponential and polynomial functions) by setting specific values for certain parameters. Using the Duncan test, we found that there were no significant differences among the results of the four basic models with D as a single variable. However, when we introduced H as another variable, exponential function showed a significant difference and the fitting accuracies were lower than those of other functions (S9, S10, S11 Tables). Meanwhile, we found that the optimal model was the general model for the whole tree and all components except for roots, whose best model form was a polynomial function (Table 2). But before setting a polynomial function as a basic model, the optimal maximum order must first be determined. Besides, the form of a power function was relatively simple and analogous to the Constant Allometric Ratio (CAR) model [47] often used to estimate forest biomass, but accuracy of power function was inferior to the general model in some results (Table 2). So, on balance, we did not think that these two types were optimal basic models in this study, at least for precision and practicality.
To further compare performance, we listed some models used in estimation for forest aboveground biomass from previous literature [11,12,37,39,[48][49][50][51][52][53]. These models were used to fit the modelling data from this study ( Table 6 presents only examples of aboveground carbon content. Most of models were logarithmic transformation of a power function and this transformation could be used to deal with the problem of heteroscedasticity). From the evaluation indices, we observed that accuracies of foreign models were still lower than those of the general model from Table 2. The new general model had a widespread applicability, noticeable flexibility, and could adapt to different situations through proper transformation. Some commonly used functions were actually special forms of the general model. So we recommend the use of new general model to estimate the carbon content of individual trees as the optimal basic model with D&H in this study.
For a more detailed examination of the general model extrapolation, we made comparisons between Liu's general model and our general model based on a regional database of individual trees of Chinese fir collected by Liu [54]. We used model (Eq 8 form) from Table 4 by SOF for the whole tree as General model 1 and constructed new fitting model (Eq 8 form) by SOF for the whole tree based on Liu' s regional database as General model 2. Fig 3 demonstrates the mean relative error of the whole tree carbon content prediction by using four models. The Table 6. Results of fitting modelling data by some common models usually used in estimation for forest biomass from previous literatures. result revealed that, for total province plots, the General model 2 had the best performance on estimating the whole tree carbon content. The prediction ability of the General model 1 was superior to that of Liu' Eq 1, but inferior to that of Liu's Eq 2. However, the General model 1 had better results than that of Liu's Eqs 1 and 2 for Fujian Province plots. So results from Fig 3 indicate that estimated errors were exposed and amplified when we used the General model 1 to fit regional database. The reason might be the General model 1 was constructed only based on plots of Fujian Province in this study. Over-fitting problems appeared when estimating other provinces' plots due to existing variability in the real world. So it was not suitable to use directly the General model 1 to predict carbon content of Chinese fir in other provinces. Hence, a new fitting general model was constructed to provide further comparison. The results of the General model 2 revealed that the form of Eq 8 was optimal model for estimating carbon content in the mathematical significance for all provinces except for Guangdong, whose best results were based on Liu' Eq 1. Therefore, we still recommend trying to use the general model (Eq 8) as an optimal form to estimate carbon content even based on meta-analysis of published data to date. Liu's Eq 1:ln W = a 1 +b 1 ln D; Liu's Eq 2: ln W = a 2 +b 2 ln(D 2 H). W represented carbon content, a 1 , a 2 , b 1 , b 1 were parameters. Because Liu's database was about biomass for the whole tree, we used the general carbon content ratio of Chinese fir (0.52, by "Afforestation Project Carbon Measurement and Monitoring Guidelines", language in Chinese, State Forestry Administration, 2011) to convert biomass to carbon content.
The General model 1 represented the general model (Eq 8 form) from Table 4 by SOF for the whole tree. The General model 2 represented the new fitting general model (Eq 8 form) by SOF for the whole tree based on Liu's regional database.
The formula of relative error was ð jestimationÀobservationj observation Þ Â 100%

Limitations and recommendations
We established different models of individual trees based on the perspective of carbon content. These models had excellent abilities and strong operability. However, errors in model extrapolation have two main sources in general [55]: the stochastic character of the estimated model coefficients and measurement errors in the data or lack of data used for model construction. Therefore there were still some aspects that should be given attention for practical application. First, although the new general model had good performance, there were some issues to be paid attentions. Because this model required more parameters to be estimated and the initial setting values of parameters had an effect on the final results. Then the phenomenon of slow convergence might have appeared in the process of nonlinear parameter estimation. So we must fine tune the convergence criterion to obtain the desired degree of accuracy by trying different algorithms, such as a genetic algorithm or simulated annealing algorithm.
Second, in order to obtain more precise models, all results in this study were kept in four decimal places and most of the conclusions were drawn based on statistical significance. We didn't rely entirely on statistical tests to decide optimal model in practice because differences in the third or fourth decimal place have no impacts at all in biological or ecological significance due to naturally occurring variability in the real world. Like results in S8, S9, S10 and S11 Tables, we found that there was no significant differences (at 0.01 significance level) among different model forms with different variables except between Eq 2 and other three models (Eqs 1, 3 and 7) with variable DH (S9 Table). Moreover, some results showed that the final form of the general model almost transformed into a power function (S10 and S11 Tables), i.e. carbon content might typically be characterized by an allometric scaling law in some extent [56]. So it is an alternative choice to use CAR model to estimate carbon content of individual trees of Chinese fir without extremely high precision demand in practice.
Third, we only used D and H to estimate carbon content of each component. We found that the estimated values of branches and foliage leaves had relatively larger deviations than other components. In some previous studies, scholars suggested that crown breath is a key variable to influence biomass of branches and foliage leaves [15,40]. So if we can add crown breath as another variable to model, the results may be more accurate. Moreover, specific gravity is becoming another important variable for biomass estimation [57][58]. Although the whole tree specific gravity will not be a useful prediction variable in practice due to that it is seldom measured in stand survey, it still could be a very good variable for theoretical model guidance and selection when we develop model extrapolation for other species (like some models in Table 6).
Furthermore, most validation data from previous studies were only on biomass of Chinese fir, not carbon content. There were some deviations when we directly converted biomass into carbon content by using average carbon content ratios. However, different sampling and measurement methods might have led to inconsistence and inaccuracy during biomass data collection. Some models were extrapolated poorly in some extent, such as the estimate values of branches and foliage leaves which had relatively large deviations for validation data in Table 4. So increasing quantification of potential error would be useful for practical application in further studies.

Conclusion
Accurate estimation of forest carbon content plays a major role in sustainable management of forests as well as mitigating the increasing CO 2 concentration in the atmosphere. Based on the carbon content data on the Chinese fir plantation in Fujian Province, we selected four basic models to establish carbon content models. The results indicated that models with two dimensional variables (DH, D 2 H and D&H) were always superior to those with a single variable (D) and the most useful predictor variable combination was D&H. Through theoretical analysis and predictive validation based on data from previous studies, we recommend using NSUR to guarantee additivity when constructing a compatibility model. Meanwhile, comparisons of model evaluation indices showed that the general model was superior to other present models for estimating carbon content in statistical or mathematical significance. Theoretically, the general model has more flexibility and higher availability because it could transform to different form models by setting specific initial parameters. Hence, we conclude that the general model should be worthy of promotion for predicting carbon content of individual trees, not only for Chinese fir, but also for other vegetation types.
Supporting Information S1  Table. Comparison evaluation indices of four basic models with variable DH. (DOCX) S10 Table. Comparison evaluation indices of four basic models with variable D 2 H. (DOCX) S11 Table. Comparison evaluation indices of four basic models with variables D&H. (DOCX)