Development and evaluation of height diameter at breast models for native Chinese Metasequoia

Accurate tree height and diameter at breast height (dbh) are important input variables for growth and yield models. A total of 5503 Chinese Metasequoia trees were used in this study. We fitted 53 models, of which 7 were linear and 46 were non-linear. The models were divided into two groups, single-variable models and multivariate models, according to the number of independent variables. The results show that the allometric equation of tree height with diameter at breast height as the independent variable reflects the change in tree height well; in addition, the prediction accuracy of the multivariate composite models is higher than that of the single-variable models. Although tree age is not the most important variable in the relationship between tree height and dbh, considering tree age when selecting models and parameters can make the prediction of tree height more accurate. The amount of data is also an important factor that can improve the reliability of the models. Other variables, such as tree height, dominant dbh and altitude, can also affect the models. In this study, the method of developing the recommended models for predicting the tree height of native Metasequoia trees aged 50–485 years is statistically reliable and can be used for reference in predicting the growth and production of mature native Metasequoia.


Introduction
Tree height and dbh are the two most important factors in surveys, production and management of forest resources and research on forest ecosystems [1,2]. They are usually used to calculate the volume, site index, forest growth and yield [3]; and to estimate forest volume, biomass and carbon stock [4]. Accurate tree height and dbh are necessary conditions for evaluating biomass and are of great importance for the research of forest growth models based on physiological ecology.
Compared with dbh, the observation of tree height is often affected by the complexity of the distribution of forest vegetation, forest density and landform [5,6]. At the same time, measuring tree height takes time and effort and is limited by observational error and visual obstruction [7], which increases the cost of the forest survey. Therefore, it is necessary to construct a simple and accurate tree height-dbh model to estimate the height of trees [8].
The allometric equation of tree height and dbh is usually used to estimate tree height. This method neglects the possible large deviation in estimating biomass by allometry [9]. There are currently many studies on tree height-dbh models, and some tree height-dbh models of common tree species have performed well in application [10][11][12]. These studies mainly focus on artificial forests, natural forests and pure forests [13][14][15][16][17]. An individual tree growth model is the basis of a forest growth and production forecast [18]. Huang et al. [11] evaluated the main species of Alberta by using 20 non-linear tree height-dbh models. Non-linear functions are relatively easy to fit, so non-linear tree height-dbh functions are widely used in the prediction of tree height [19][20][21][22]. However, changes in tree height and dbh may lead to a deviation in the predicted tree height. The tree height-dbh relationship is correlated with growing conditions, forest density, tree age, basal area, and dominant tree height and dbh. The correct choice of allometric model is the key to accurate prediction [23]. In this process, the choice of parameters is a key source of model error, and proper parameters can improve the fit of the models to the data [24]. Therefore, the same function cannot be applied to all tree height-dbh relationships.
Metasequoia is a relict plant and is known as a "living fossil". The earliest Metasequoia fossil plants appeared in sedimentary formations of the Mesozoic Cretaceous period, 63 to 110 million years ago [25]. The ancient Metasequoia plant originated in the Arctic Circle and then gradually expanded southward. In the Quaternary Pleistocene period, when temperature dropped dramatically, Metasequoia plants gradually disappeared [26]. The native population of existing Metasequoia is distributed only at the junction of the Hubei, Hunan, Sichuan and Chongqing Provinces of China. Metasequoia in other regions were directly or indirectly introduced from the studied area. This area has become the only existing shelter for the native population of Metasequoia [27].
The native Metasequoia population is one of the 69 endangered species in China [28]. The study area is the only existing habitat of the most primitive population of Metasequoia and is therefore the only place to study the living conditions of the natural population, which preserves the most complete gene pool of existing Metasequoia and is of great scientific value. This population plays an irreplaceable role in the study of the genetic diversity and other biological characteristics of Metasequoia. The study of the native population and its living environment is of great significance to paleobotany, paleoclimatology and paleogeology [29].
Up to now, there has not been any study worldwide on allometric models of tree height and dbh for native Chinese Metasequoia. This study aims to derive the allometric equation of native Metasequoia from tree height and dbh data, which is significant for clarifying the development and structure of this species. Previous studies were mostly conducted on plantations, where the growth time was short. However, predicting the height of native Metasequoia by using area-based tree height-dbh models has high uncertainty. Therefore, this study aims to 1) develop a model that can be used to predict the height-dbh relationship of native Metasequoia in China and 2) determine which variables have a significant effect on the relationship between tree height and dbh.

Ethics statement
Our study sites (Xingdoushan National Nature Reserve, Hubei) are owned by the National Forestry Bureau of Enshi Tujia and Miao Autonomous Prefecture. Our research work was conducted in collaboration with the National Forestry Bureau of Enshi Tujia and Miao Autonomous Prefecture. We were permitted by the Forestry Bureau to collect the related plant sample data. Our research object is Metasequoia, a locally protected species.

The profile of the study sites
The study sites are located in Enshi City, Lichuan City and Xianfeng County of southwestern Hubei Province (Fig 1). The protection zone is divided into two districts: the eastern part is the Xingdou mountain area, located at the junction of Enshi City, Lichuan City and Xianfeng County, at north latitude 29˚57'~30˚10' and east longitude 108˚57'~109˚27'; and the western part is the Xiaohe area, located in Lichuan City, at north latitude 30˚04'~30˚14' and east longitude 108˚31'~108˚48'. The total area is 68339 hm².
The Metasequoia under study is in the Xiaohe area, the western part of the protection zone. The study sites are located in a sub-tropical continental monsoon area. The annual precipitation is 1481 mm. The annual average temperature is 14.9˚C. The annual average frost-free period is 217 days. The Metasequoia trees under study are distributed at an altitude of 900-1350 m. The soil in the study sites is loam, mainly a mixture of yellow soil and brown soil, with a small amount of purple soil. There are 5763 trees of native Metasequoia in total, and we collected data from all these trees. From the material collection and investigation, we know that the age of the Metasequoia trees is 50-510 years and that these trees are distributed in 4 towns, 16 administrative zones and 45 villages, with a total area of more than 800 km². The trees are scattered, with a maximum of 20 in one district. The oldest trees are 510 years old, with a dbh of 248 cm and a height of 51 m. The average age of the trees is 97 years. The average dbh is 65.24 cm, and the average height is 27.84 m.

Data collection
The data collected from the trees are individual measurements, including dbh, tree height, elevation and tree age, obtained using a dbh ruler, height finder, Runtastic Altimeter PRO and vegetative cone, respectively. We chose the 10 tallest trees as the main samples. We also measured other stand-level variables, such as the dominant dbh (D0, cm), the dominant height (H0) and the basal area (BA). Data from a total of 5746 Metasequoia trees were available for this study. Some trees had dried or broken tops or had been struck by lightning, burned or damaged by people. These samples could introduce errors into the models; therefore, they were removed from the dataset, leaving 5503 samples for analysis.
The data from the 5503 sample trees were divided into two parts at random. The majority of the data (80%) was used for model calibration, and the remainder (20%) was used for verifying the consistency of the calibration data and the models. In the calibration data, the average dbh and height are 57.03 cm and 27.61 m, respectively; in the validation data, they are 57.3 cm and 27.73 m, respectively. Table 1 shows the description of the stand-level variables.
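The random 80/20 division described above can be sketched as follows. This is a minimal illustration only: the function name, the fixed seed and the use of integer tree identifiers are assumptions made for the example, not details reported in the study.

```python
import random


def split_calibration_validation(tree_ids, calibration_share=0.8, seed=42):
    """Randomly split tree records into calibration and validation subsets."""
    shuffled = list(tree_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    rng.shuffle(shuffled)
    n_calibration = int(len(shuffled) * calibration_share)
    return shuffled[:n_calibration], shuffled[n_calibration:]


# 5503 retained sample trees, identified here by hypothetical integer ids.
calibration, validation = split_calibration_validation(range(5503))
print(len(calibration), len(validation))  # prints: 4402 1101
```

Every tree falls in exactly one subset, so statistics computed on the validation subset are independent of the fitting step.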

Model selection and evaluation
We chose linear and non-linear models and compared their performance. In this study, we selected two groups of general tree height-dbh models. There are 53 equations in total, as shown in Table 2. The two groups are defined by the number of independent variables, as follows:
Group 1: models that use one independent variable. We only need to measure the dbh.
Group 2: models that use two or more independent variables. We need to measure the dbh and other variables.
Model selection and evaluation are based on graphical and numerical analysis of the data, using the following statistics: (1) root mean square error (RMSE) (Eq 1), which measures the accuracy of the estimates (the smaller, the better); (2) adjusted coefficient of determination (R² adj) (Eqs 2 and 3), which reflects the part of the total variance explained by the model and takes into account the number of parameters needed for the estimation (the greater the value, the stronger the relationship between the variables); (3) deviation (bias) (Eq 4) and relative deviation (relative bias) (Eq 5), which measure the deviation of the model predictions from the observed values (the smaller, the better); and (4) Akaike's information criterion (AIC) (Eqs 6 and 7), an information criterion commonly used to choose the best model (as a rule, the model with the lower AIC value is preferred) [5]. The expressions of the statistics are as follows:

Root mean square error:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (h_i - \hat{h}_i)^2}{n}} \tag{1}$$

Coefficient of determination:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (h_i - \hat{h}_i)^2}{\sum_{i=1}^{n} (h_i - \bar{h})^2} \tag{2}$$

Adjusted coefficient of determination:
$$R^2_{adj} = 1 - \frac{(n - 1)(1 - R^2)}{n - p - 1} \tag{3}$$

Bias:
$$\mathrm{Bias} = \frac{\sum_{i=1}^{n} (h_i - \hat{h}_i)}{n} \tag{4}$$

Relative bias (%):
$$\mathrm{Relative\ bias}\,(\%) = \frac{100}{n} \sum_{i=1}^{n} \frac{h_i - \hat{h}_i}{h_i} \tag{5}$$

Residual sum of squares:
$$\mathrm{RSS} = \sum_{i=1}^{n} (h_i - \hat{h}_i)^2 \tag{6}$$

Akaike's information criterion:
$$\mathrm{AIC} = n \ln(\mathrm{RSS}) + 2(p + 1) - n \ln(n) \tag{7}$$

where $h_i$ is the observed value; $\hat{h}_i$ is the predicted value; $\bar{h}$ is the average of the observed values; $n$ is the number of observations used to fit the model; and $p$ is the number of independent variables.
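As an illustration, the evaluation statistics can be computed with a few lines of Python. This is a sketch under stated assumptions: RMSE is taken with divisor n, the relative bias is averaged per tree, and the AIC follows Eq 7 with p as the number of independent variables. The function name and the four sample heights are hypothetical and are not data from the study.

```python
import math


def fit_statistics(observed, predicted, p):
    """Goodness-of-fit statistics for a height model with p independent variables."""
    n = len(observed)
    mean_h = sum(observed) / n
    residuals = [h - h_hat for h, h_hat in zip(observed, predicted)]
    rss = sum(e * e for e in residuals)             # residual sum of squares
    tss = sum((h - mean_h) ** 2 for h in observed)  # total sum of squares
    r2 = 1.0 - rss / tss
    return {
        "RMSE": math.sqrt(rss / n),                 # assumption: divisor n
        "R2": r2,
        "R2_adj": 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1),
        "bias": sum(residuals) / n,
        # assumption: relative bias averaged over individual trees
        "relative_bias_pct": 100.0 * sum(e / h for e, h in zip(residuals, observed)) / n,
        "AIC": n * math.log(rss) + 2 * (p + 1) - n * math.log(n),
    }


# Hypothetical observed and predicted heights (m), for illustration only.
heights = [20.0, 25.0, 30.0, 35.0]
predictions = [21.0, 24.0, 31.0, 34.0]
stats = fit_statistics(heights, predictions, p=1)
print(stats)
```

Returning all statistics in one dictionary makes it straightforward to tabulate them per model, as is done for the calibration and validation data.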

Data analysis
Because many of the selected models were non-linear, they were fitted with the statistical package SPSS 20.0 (SPSS for Windows version 20.0) using the Levenberg-Marquardt (LM) method. Initial parameter values were supplied to start the iterative procedure, which was then run until the fit to the data converged. The regression analysis was also made using the SPSS program.
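To make the role of the initial values concrete, the following is a minimal pure-Python sketch of a Levenberg-Marquardt iteration. It is not the SPSS procedure used in the study. The power-form curve h = 1.3 + a·d^b, the synthetic data and the starting values are all assumptions chosen for illustration, and the curve is not one of the 53 evaluated models.

```python
import math


def predict(d, a, b):
    """Illustrative power-form height curve (hypothetical, not a model from Table 2)."""
    return 1.3 + a * d ** b


def residual_sum_of_squares(data, a, b):
    return sum((h - predict(d, a, b)) ** 2 for d, h in data)


def fit_levenberg_marquardt(data, a, b, damping=1e-3, max_iter=200):
    """Fit a and b by damped Gauss-Newton steps, starting from the initial guesses."""
    best = residual_sum_of_squares(data, a, b)
    for _ in range(max_iter):
        # Accumulate J^T J and J^T r for the two parameters.
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for d, h in data:
            r = h - predict(d, a, b)
            j_a = d ** b                    # partial derivative with respect to a
            j_b = a * d ** b * math.log(d)  # partial derivative with respect to b
            s_aa += j_a * j_a
            s_ab += j_a * j_b
            s_bb += j_b * j_b
            g_a += j_a * r
            g_b += j_b * r
        # Damped normal equations with Marquardt's diagonal scaling (2x2 solve).
        m_aa = s_aa * (1.0 + damping)
        m_bb = s_bb * (1.0 + damping)
        det = m_aa * m_bb - s_ab * s_ab
        step_a = (m_bb * g_a - s_ab * g_b) / det
        step_b = (m_aa * g_b - s_ab * g_a) / det
        try:
            trial = residual_sum_of_squares(data, a + step_a, b + step_b)
        except OverflowError:
            trial = math.inf
        if trial < best:
            # Accept the step and move toward Gauss-Newton behaviour.
            a, b, best = a + step_a, b + step_b, trial
            damping = max(damping / 10.0, 1e-12)
        else:
            # Reject the step and increase the damping.
            damping *= 10.0
            if damping > 1e12:
                break
    return a, b, best


# Synthetic, noise-free data generated with a = 2.0 and b = 0.6 (illustrative only).
data = [(d, 1.3 + 2.0 * d ** 0.6) for d in range(30, 160, 10)]
a_hat, b_hat, final_rss = fit_levenberg_marquardt(data, a=1.0, b=0.5)
print(a_hat, b_hat, final_rss)
```

A step is kept only when it lowers the residual sum of squares, which is why poor starting values tend to cost extra iterations rather than make the fit worse.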

Results
The relationship between tree height and dbh
Fig 2 is a scatter diagram of the tree height and dbh of all 5503 native Metasequoia trees. The minimum dbh is greater than 26 cm, and tree height increases slowly with increasing dbh. The solid line is the trend line, whose formula is h = 12.546 + 0.264 × d (R² = 0.7582). Tables 3 and 4 give the fitting and prediction results for the calibration data and validation data from the Group 1 models and Group 2 models, respectively. Both groups of models show good results, except for models 44, 47 and 48. These three models are not suitable for predicting the data, and their results are excluded from the later statistics. As shown in Table 3, in the Group 1 models, the changes in the derived adjusted R² and RMSE values are roughly equivalent. In the Group 1 models, model 4 is the most suitable linear model (its adjusted R² value is the highest, and its RMSE value, deviation sum and AIC are the lowest). It gives the most accurate result in fitting the calibration data and is suitable for estimating the height of most trees in Group 1. Model 22 is the most suitable non-linear model. Both models use only one independent variable (dbh) to predict tree height, and their results are in good agreement. When the other variables are considered (Group 2 models), the models show different performances. For the calibration data from the Group 2 models, the R² value ranges from 0.7456 to 0.7737, and the RMSE value ranges from 1.7965 to 1.9050 (Table 4). In addition, most of the Group 2 models have greater adjusted R² values and smaller RMSE values than the Group 1 models. The AIC values for the calibration data are relatively high in the Group 1 models and lower in the Group 2 models.
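As a quick worked check of the trend line reported for Fig 2, the sketch below evaluates h = 12.546 + 0.264 × d at the calibration mean dbh of 57.03 cm. The function name is an assumption made for the example. Note that the trend line was fitted to all 5503 trees, so only approximate agreement with the calibration mean height (27.61 m) should be expected.

```python
def trend_height(dbh_cm):
    """Linear trend line reported for Fig 2: h = 12.546 + 0.264 * d."""
    return 12.546 + 0.264 * dbh_cm


mean_dbh = 57.03  # calibration mean dbh (cm) from the data description
print(round(trend_height(mean_dbh), 2))  # prints: 27.6
```

The predicted 27.60 m is within about 0.01 m of the reported calibration mean height, as expected for a least-squares line evaluated near the mean dbh.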
The choice of models should be based on ranking the performance of the models on the calibration data and validation data (the adjusted R², RMSE, the absolute deviation, the relative deviation and the AIC). In this analysis, the top two or three models are summarized in Table 5 (the numbers in brackets indicate the ranking on each statistic). The model with the highest adjusted R² value and a deviation (absolute and relative) closest to zero is considered the best. The lower the values of RMSE and AIC, the higher the ranking of the model. For each model, the final ranking is the sum of its ranks on the five evaluation statistics. The model with the minimum rank sum is ranked highest. Fig 3 shows the predicted values and the observed values obtained by using the model. The straight line between the observed value and the predicted value is the determining factor of the model evaluation standard. Each model has a higher R² value, so the solid line is closely …
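The rank-sum selection described above can be sketched as follows. The three candidate names and all of their statistic values are hypothetical and are not taken from Tables 3 to 5; ties are not handled specially in this simplified version.

```python
def rank_sum(models, criteria):
    """Rank models on each criterion (1 = best) and sum the ranks per model."""
    totals = {name: 0 for name in models}
    for key, score in criteria.items():
        # A smaller score means a better model on this criterion.
        ordered = sorted(models, key=lambda name: score(models[name][key]))
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return totals


criteria = {
    "R2_adj": lambda v: -v,             # higher is better
    "RMSE": lambda v: v,                # lower is better
    "bias": lambda v: abs(v),           # closest to zero is better
    "relative_bias": lambda v: abs(v),  # closest to zero is better
    "AIC": lambda v: v,                 # lower is better
}

# Hypothetical statistics for three candidate models (illustration only).
candidates = {
    "A": {"R2_adj": 0.75, "RMSE": 1.90, "bias": 0.020, "relative_bias": 0.30, "AIC": 5700.0},
    "B": {"R2_adj": 0.77, "RMSE": 1.80, "bias": -0.010, "relative_bias": -0.10, "AIC": 5400.0},
    "C": {"R2_adj": 0.76, "RMSE": 1.85, "bias": 0.005, "relative_bias": 0.20, "AIC": 5500.0},
}

totals = rank_sum(candidates, criteria)
best = min(totals, key=totals.get)
print(totals, best)
```

In this made-up example, candidate B has the smallest rank sum even though candidate C has the smallest absolute bias, which shows how the sum balances the five statistics.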

Parameter estimation
Initially, all model parameters were estimated using the calibration data. Table 6 shows the parameter estimates and fitting statistics of models 4, 22, 51 and 52 using all data. All the parameters were significant (P < 0.05). It is easy to obtain the parameters of each model through calculations.

Evaluation and application of models
The choice of the models is based on the degree of optimization, the accuracy and the practical application. The models were compared and ranked after fitting (Table 5). Under this approach, models from Group 1 cannot be clearly differentiated. As shown in Fig 2, tree height increases monotonically with increasing dbh. A reasonable tree height-dbh model should be consistent with this first-order monotonically increasing characteristic [56]. In the models of Group 2, the calibration and validation data for models 22, 51 and 52 are in agreement. Usually, adding new independent variables can reduce the error in the height-dbh model and increase its accuracy. Therefore, through extensive sampling and measurement of the different variables, we derived Group 2 models that are more suitable for the prediction of tree height than the Group 1 models.
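The monotonicity requirement can be checked numerically for any candidate curve. The sketch below evaluates a curve on a grid of dbh values and confirms that the predicted height never decreases. The dbh range of 26 to 248 cm follows the values reported for the sample trees, while the helper name, the grid size and the parabola used as a counter-example are assumptions made for illustration.

```python
def is_monotone_increasing(model, d_min, d_max, steps=200):
    """Return True if predicted height strictly increases over a grid of dbh values."""
    grid = [d_min + (d_max - d_min) * i / steps for i in range(steps + 1)]
    heights = [model(d) for d in grid]
    return all(h2 > h1 for h1, h2 in zip(heights, heights[1:]))


def linear_trend(d):
    """Trend line reported for Fig 2."""
    return 12.546 + 0.264 * d


def hypothetical_parabola(d):
    """Made-up curve that peaks at d = 125 cm and then declines."""
    return 5.0 + 0.5 * d - 0.002 * d * d


print(is_monotone_increasing(linear_trend, 26.0, 248.0))           # prints: True
print(is_monotone_increasing(hypothetical_parabola, 26.0, 248.0))  # prints: False
```

A grid check of this kind is only a screening step: it can miss a decrease that occurs between grid points, so a finer grid or an analytical derivative is preferable where one is available.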
In this research, model 4 is the only linear model with 2 variables of tree height and dbh, and the others are non-linear models, in which model 22 has 2 variables of tree height and dbh; model 51 has 5 variables of tree height, dbh, main tree height, sample numbers and tree age; and model 52 has 5 variables of tree height, dbh, basal area, sample numbers and tree age. The non-linear models are more accurate. Therefore, we believe that the non-linear models perform better than the linear models in the case of Metasequoia. The non-linear models are more flexible. The forms and parameters of the models are very important for the application of models. We have demonstrated the stability of the structure of the four models. Therefore, the parameter is the determining factor that affects the accuracy of the models in their application.
The development of simple and accurate models makes it possible for forestry workers to predict the height of a tree by relying on the diameter data of a region, which is very important for forest management. In this study, models 51 and 52 are similar. The four models are all consistent with statistical principles. Models 51 and 52 have a relatively larger number of parameters, and the increase in the number of parameters can improve the accuracy of the model in the prediction of tree height. The four chosen models are not only statistically reliable but also easy to apply. The practical application of the models is necessary for the prediction of tree height.

The relationship between tree height and the study area variables
According to the study, bringing the study area variables as independent variables into the height-dbh model can improve the accuracy of tree height predictions [49]. The study area variables mainly include comprehensive factors such as the dominant tree height, dbh, tree age, basal area, stand density and growth status. In the Group 1 models, only one independent variable, dbh (d), can be used to predict the relationship between height and dbh. Depending on the specific circumstances, if the other variables of the study area can be measured, Group 2 models are the best method. We can accurately assess the direct relationship between tree height and dbh using Group 2 models and can predict the future development. A significant difference exists in the growth rate of tree height and dbh, and there is an allometric relationship between them. Because the average age of the Metasequoia trees collected in this study is more than 50 years and they are a mature forest, the tree height and dbh growth relationships are relatively stable. Although tree age is not the most important variable in the study of the tree height-dbh relationship, some differences in parameter estimation can be discovered when selecting models. Therefore, considering tree age when selecting models and parameters can make the tree height prediction more accurate.

Conclusion
This study compared 53 height-dbh models for predicting the height of the native population of Metasequoia aged 50-485 years in China. The choice of models is based on the principles of good fit, high accuracy and suitability for practical application. The results show that the composite models which include additional variables can improve the quality of the models. Linear model 4 and non-linear model 22, which have 3 parameters and 2 variables each, and non-linear models 51 and 52, which have 6 parameters and 5 variables each, can better predict the height of natural Metasequoia trees. Although stand age is not the most significant variable in the relationship between height and dbh, knowing the tree age can improve the accuracy of natural Metasequoia tree height prediction. In addition, sample size can also improve the accuracy of tree height prediction to some degree. In this study, forest density will not affect the growth of the trees. However, as a referential maximum value of growth of local Metasequoia, the main tree height can also improve the accuracy of model prediction. In remote or individual regions, different ecological niches may have an important impact on the estimation of height, so the models are not suitable for application there. In our analysis, the measurement error due to the ground level was not considered because these errors are unlikely to affect the allometric models [57].
This research aims to build some models to facilitate the prediction of tree height of local Metasequoia, thus making it easier to evaluate the growth of native Metasequoia and protect it as an important plant resource. Therefore, the sanctuary workers who have mastered the original data of local Metasequoia can predict tree height using models 51 or 52. General scientific researchers, in the case that they have measured the dbh, can use models 4 or 22 to predict tree height. The methods and the models recommended in this study are based on statistics, which are reliable for forest growth and survival estimates and planting management for Chinese native Metasequoia. After verification, these models can be applied in other regions and to other tree species.