Methodological comparison of alpine meadow evapotranspiration on the Tibetan Plateau, China

Estimation of evapotranspiration (ET) for alpine meadow areas in the Tibetan Plateau (TP) is essential for water resource management. However, observation data has been limited due to the extreme climates and complex terrain of this region. To address these issues, four representative methods, Penman-Monteith (PM), Priestley-Taylor (PT), Hargreaves-Samani (HS), and Mahringer (MG) methods, were adopted to estimate ET, which were then compared with ET measured using Eddy Covariance (EC) for five alpine meadow sites during the growing seasons from 2010 to 2014. And each site was measured for one growing season during this period. The results demonstrate that the PT method outperformed at all sites with a coefficient of determination (R2) ranging from 0.76 to 0.94 and root mean square error (RMSE) ranging from 0.41 to 0.62 mm d-1. The PM method showed better performance than HS and MG methods, and the HS method produced relatively acceptable results with higher R2 (0.46) and lower RMSE (0.89 mm d-1) compared to MG method with R2 of 0.16 and RMSE of 1.62 mm d-1, while MG underestimated ET at all alpine meadow sites. Therefore, the PT method, being the simpler approach and less data dependent, is recommended to estimate ET for alpine meadow areas in the Tibetan Plateau. The PM method produced reliable results when available data were sufficient, and the HS method proved to be a complementary method when variables were insufficient. On the contrary, the MG method always underestimated ET and is, thus, not suitable for alpine meadows. These results provide a basis for estimating ET on the Tibetan Plateau for annual data collection, analysis, and future studies.

Introduction radiation-based, temperature-based, and mass transfer-based. The goal of our study was to find a best method that can be applied as a benchmark for alpine meadows with scarce data over the entire TP area in future studies.

Materials and methods
Data at study sites were chosen in our study and these sites were established and monitored by the Northwest Institute of Eco-Environment and Resources (NIEER), Chinese Academy of Sciences (CAS) (Fig 1). The NIEER, CAS provided permission for the study to collect field data at each of the sites. Field data (including meteorological and flux data) at five sites (S1 Table), which were available from the NIEER, CAS, were selected to calculate ET using the four estimation methods. The measured ET from the EC tower was used for validation. Measured air temperature, relative humidity, wind speed, net radiation, and soil heat flux were taken every 30 minute and then integrated into daily values. The mean annual temperatures at the Arou, Hulugou, Suli, Nagqu, and Tanggula sites for one single observation year were 0.9, -3.1, -4.0, -0.6, and -4.1˚C, respectively. The analyses for the five sites were listed in order from low to high elevation. The mean annual precipitation readings at the Arou, Hulugou, Suli, Nagqu, and Tanggula sites for one single observation year were 403, 553.3, 388.2, 449.5, and 538.2 mm, respectively. At all sites, the land covers were all alpine meadows with semi-arid climates, where elevation ranges from 3033 to 5100 m a.s.l. The information of each site is listed in detail in Table 1. All data were collected during one growing season at each site due to lack of data availability. Due to the only one year of the meteorological data for each site, the mean annual meteorological data (including air temperature and precipitation) of Chinese national stations Tuole and Tuotuohe, which are near the Suli and Tanggula sites, respectively, were used for analysis. The difference of mean annual temperatures of Tuole and Tuotuohe during 2010 with that the fiveyear (2010-2014) were only -0.1 and 0.6˚C, while the annual precipitation difference was 2.6% and 6.7%, which indicates that the influence of the data collected during different years for ET comparison is little.
The raw EC data were processed by the EdiRe software (University of Edinburgh, http:// www.geos.ed.ac.uk/abs/research/micromet/EdiRe). The processing procedures included the removal of spikes, coordinate rotation (3-D rotation), frequency response correction, sonic virtual temperature correction, and corrections for density fluctuation (Webb-Pearman-Leuning, WPL-correction). The raw sampling frequency was 10 Hz. The EC data were averaged with 30-min periods, and data quality assessment was performed for the 30-min EC values using turbulence stationary and integrated turbulence characteristics tests. The 30-min EC data were rejected when precipitation occurred within 1 h before and after data collection and the friction velocity was below 0.1 m s -1 at night. The missing data were interpolated by a nonlinear regression method [41,42].
While EC has been recognized as the most accurate ET estimation method [5,[43][44][45], an energy closure problem arises when using this method [46,47], where the turbulent flux (sensible and latent heat flux) is usually less than the available energy [39]. This issue has also been found at observation sites on the TP [39,48]. The energy balance closure ratios (H+LE) /(R n -G) at Arou, Hulugou, Suli, Nagqu, and Tanggula during the growing season were 1.04, 0.67, 0.82, 0.93, and 1.02, respectively. Therefore, a correcting method was applied to address the closure issue, as proposed by Twine et al. [49]: where ET is the corrected evapotranspiration; R n is net radiation; G is soil heat flux; and H ori , LE ori , and ET ori are the original sensible heat flux, latent heat flux, and evapotranspiration,

Methods for reference ET
Combination-based category: FAO PM method. The combination-based method is made up of the available energy and aerodynamic components and has been found to outperform for a variety of crops and climates compared to other methods [28]. Rahimikhoob et al. [50] suggested that the Food and Agriculture Organization (FAO) PM was the best model for this method type. Therefore, PM was selected as the representative method for the combination-based category in this study, which utilizes the following equation: where ET 0 is reference ET; R n is net radiation (MJ mm d -1 ); G is soil heat flux (MJ mm d -1 ); T is air temperature (˚C); u 2 is wind speed at the height of 2 m; e s is saturated water vapor pressure (kPa); e a is actual water vapor pressure (kPa); e s -e a represents the water vapor pressure deficit at reference height (kPa); Δ is the slope of saturated water vapor pressure curve (kPa˚C -1 ); and γ is the psychrometric constant (kPa˚C -1 ). Radiation-based category: PT method. The radiation-based category mainly depends on the available energy and ignores the influence of the aerodynamic component. PT is the most popular method for this category, which integrates an empirical coefficient to avoid the influence of surface and aerodynamic resistances compared to the PM method [51]. The equation for this method is as follows: where α is Priestley-Taylor coefficient. According to Priestly and Taylor [51] and Rahimikhoob et al. [50], we used α = 1.26 in the respective equation in this work. Temperature-based category: HS method. The temperature-based category is empirical and only considers air temperature as a representative of the available energy for ET. It ignores the influence of solar radiation and wind speed, which most likely causes poor results compared to other categories. In this study, we used the HS method [52] for this category, which is described by the following equation: where T max , T min , and T mean are the daily maximum, minimum, and average air temperatures (˚C), respectively; and R a is the water equivalent of extraterrestrial radiation (mm d -1 ). Mass transfer-based category: MG method. The mass transfer-based category is based on Dalton's gas law, which incorporates wind speed and humidity as parameters. Several equations represent this category, including the Dalton [53], Rohwer [54], Penman [55], and Mahringer [56] equations. We decided to adopt the equation proposed by Mahringer [56] (MG) in this study to analyze the mass transfer-based category: Determination of scaling factor (Kc) Actual ET can be determined by multiplying the reference ET and Kc, which shows that the accuracy of Kc is largely influential. In general, Kc values depend on the growth stages, their length, soil moisture, and leaf area index (LAI). In the FAO Irrigation and Drainage Paper 56, the Kc values in each growth stage for different crops under different climates were defined [26]. The growth stage consists of the initial, development, middle, and end stages. When the mean air temperature of 5 days is greater than 0˚C, the alpine meadow enters the initial stage. When the air temperature is consistently greater than 3˚C, the alpine meadow begins the development stage. The middle stages begins when the air temperature is greater than 5˚C. The end stage occurs when the air temperature is below 5˚C, in which the alpine meadow stops growing until the air temperature drops 0˚C [57,58]. In this work, we set a value for each growth stage of the alpine meadows (Table 2) based on the reports by Allen et al. [26], Liu et al. [57], and Li et al. [59]. During the development and end growth stages, Kc values varied linearly and can be defined as [26]: where i is the day number during the growing season; Kc i , Kc next , and Kc prev are the scaling factors at day i, beginning of the next stage, and end of the previous stage, respectively; L stage is the length of the current stage; and X L prev is the sum length of all previous stages. Detailed information about Kc values for different growing periods can be seen in Table 2

Evaluation criteria
To evaluate the performance of these methods on the alpine meadow, statistical analysis of the coefficient of determination (R 2 ), root mean square error (RMSE), mean absolute error (MAE), relative error (RE), mean bias (MB), and Nash-Sutcliffe efficiency coefficient (NSE) was performed. The equations for these variables are described as follows: where Q obs and Q est are the observed and estimated values, respectively; Q obs and Q est are the mean values corresponding to Q obs and Q est , respectively; and n is the sample number. Methodological comparison of evapotranspiration on the Tibetan Plateau

Results
R 2 of all sites between ET before and after-correction for Arou, Hulugou, Suli, Nagqu, and Tanggula were 0.92, 0.96, 0.83, 0.82 and 0.86, respectively. And the MAEs between ET before and after-correction for Arou, Hulugou, Suli, Nagqu, and Tanggula were 0.25, 0.81, 0.67, 0.40, and 0.29 mm d -1 , respectively. We used the measured data after correction as comparison. The time series of daily ET during the growing season estimated from PM, PT, HS, MG, and measured data are shown in Fig 3, and the statistics of ET estimation using the four methods are listed in Table 3. Daily ET estimated by PM, PT, HS, and MG methods over five representative sites were compared to that estimated by EC (Fig 4). Fig 5 presents   At the Hulugou site (relative lower elevation), the mean ET of PM, PT, HS, MG, and measured data were determined to be 2.5, 2.64, 2.54, 1.7, and 2.52 mm d -1 , respectively, during the growing season (Fig 3b). REs of PM, PT, HS and MG were -48.2%, -30%, -52.4%, and -26.7%, respectively (Table 3) At the Suli site (relative middle elevation), the mean ET of PM, PT, HS, MG, and measured data were 2.55, 3.03, 2.6, 1.56 and 3.27 mm d -1 , respectively (Table 3). Overall, ET was underestimated by these methods with positive RE values. The PT method exhibited good performance with the highest R 2 (0.79), lowest RMSE (0.59 mm d -1 ), and highest NSE (0.71) values (Fig 5), suggesting that this method estimated ET best. The second best method was the PM method with R 2 = 0.68, RMSE = 0.96 mm d -1 , and NSE = 0.23, which was followed by the HS method with similar RMSE and MAE values but a lower RE. However, NSE of the HS method was close to zero, which indicates poor results, while the MG method performed the worst  (Table 3). ET was underestimated by the MG method with an RE of 20.4%, which suggests large uncertainties in the method. The PM and PT methods had lower RMSE values (0.61 and 0.51 mm d -1 , respectively) compared to the HS and MG methods. ET estimated by PT had good consistency with the measured ET (R 2 = 0.87), while PM had the smallest bias (0.06 mm d -1 ). PT performed the best with the highest R 2 (0.87), lowest RMSE (0.51 mm d -1 ), and highest NSE (0.79), and PM proved to be the second-best method with a lower RMSE (0.61 mm d -1 ) and higher NSE (0.69). HS showed reduced performance with an RMSE of 0.86 mm d -1 and NSE of 0.39, which was followed by PT and PM. MG performed the worst with the highest RMSE (1.58 mm d -1 ) and negative NSE (-1.1) values, which means that MG may be not a suitable method for alpine meadows.
At the Tanggula site (relative higher elevation), the mean ET of PM, PT, HS, MG, and measured data were 2.21, 2.89, 2.21, 1.35, and 3.2 mm d -1 during the growing season in 2010 (Fig  3e), where ET estimated by PM and PT had higher correlation with measured ET. However, the REs of PM, PT, HS, and MG were determined to be 30.8%, 11.2%, 30.3%, and 56.5%, respectively (Table 3), which indicates that all methods underestimated ET at the Tanggula site. Among them, the PT method presented the best results with the highest R 2 (0.76), lowest RMSE (0.62 mm d -1 ), and highest NSE (0.44), while PM showed reduced performance with a higher R 2 (0.59) and lower RMSE (1.13 mm d -1 ). The HS and MG methods displayed the poorest performance with lower R 2 (0.37 and 0.18, respectively) and higher RMSE (1.22 and 2.0 mm d -1 ) values. All methods produced a negative NSE, except PT with an NSE of 0.44, suggesting that the PT method performed best at the Tanggula site.
Noting that all methods underestimated ET during May to mid-June at the Suli and Tanggula sites (Fig 3c and 3e). Moreover, all methods had large biases at the beginning of the growing season. Thus, we took PT method (the best method as has been shown above) as an example to explore the methods application and limitation on alpine meadow. The mean absolute deviation X n i¼1 Q obs À Q est Q obs =n (MAD) was chosen as a statistic. PT had large MAD when temperature were lower than 2˚C, especially at the initial stage of the growing season (Fig 6). The MAD during the initial stage of the growing season for Arou, Hulugou, Suli, Nagqu and Tanggula sites were 42.5%, 119.3%, 45.2%, 29.1%, and 35.4%, respectively. The MAD of Suli and Tanggula sites during May to mid-June were about 34.4% and 35.4%, which means the meadows growed slowly and the temperature during this period were 2.22 and 1.09˚C, respectively. This may be caused by the low temperature at the beginning of the growing season while some days appeared negative temperature which means that snow/ice events might occur, almost all the methods used in this paper could not predict ET of snow/ice surface accurately due to their own limitations. Therefore, all methods cannot capture high accuracy ET of alpine meadow when air temperature was less than 2˚C during the growing season, especially at the beginning of the growing season. In summary, the PT method produced the best results for all stations. The PM and HS methods showed reduced performance for all stations, while the MG method may be not suitable for alpine meadows in the TP.

Uncertainties and limitations of performance of different methods
Compared with the measured ET before correction, PT performed better with higher R 2 (0.6) and lower RMSE (0.85 mm d -1 ) than other methods among all sites, while it showed better performance with R 2 of 0.82 and RMSE of 0.52 mm d -1 compared with EC data after correction. Though EC values have been regarded as 'truth' values for validation, the uncertainties of the EC data are difficult to present which primarily due to the stochastic error and systematic error [60,61]. The stochastic error depends on the number of independent observations when time series are auto-or cross-correlated [60]. The systematic error is not only related on the instrument, but also affected by the unmet environmental conditions and the complex procedure for data processing. For the five sites, the daily EBR was 0.93 during single growing season for different years. The systematic error d sys F ¼ Fð

EBR
À 1Þ can be directly calculated from EBR of 30-min during daytime conditions (global radiation > 20 W m -2 ), where F means the scalar fluxes. Thus the systematic error of the turbulent fluxes for the five sites was about 10%. Generally, higher EBR would decreased the uncertainty of the measured data. The EBR higher than 1 indicates that the turbulent flux were larger than the available energy which was unreasonable. Similarly, lower EBR would produce large uncertainty due to the energy imbalance.
In order to acquire the data close to the "truth" value, the correction for measured data was necessary for energy closure [45]. The correction would decrease measured ET when EBR was higher than 1 and vice versa. However, the correction for energy balance still produced the uncertainty of the performance of different methods.
Higher MAE between ET before and after-correction corresponding to lower EBR implies the large uncertainty for measured ET. Therefore, the analyses of the performance of different methods over different EBR thresholds are needed (Fig 7). Among all sites, the PT method performed best over all different intervals than other three methods. It performed best with R 2 of 0.97 and RMSE of 0.35 mm d -1 at the EBR interval [0.9, 1.1) than other intervals at the Arou site while the PM method performed similar with the PT method during EBR less than 0.6 or more than 1.1. Though the EBR at the Hulugou site was 0.68, which means large uncertainty of EC data exists, the PT method performed better than other methods over different intervals. And the performance was best during the interval EBR interval [0.6, 0.9). The PT method also performed better with higher R 2 and lower RMSE over different intervals, followed by the PM method at the Suli, Nagqu, and Tanggula sites. And the MG method performed worst over different intervals. The PT and PM methods performed best during the EBR interval [0.6, 0.9) at the Suli and Tanggula sites. The PT method performed best during the EBR less than 0.6 and more than 1.1 than other intervals at the Nagqu site. The better performance of PT method at the EBR interval [0.6, 1.1) indicates that higher accuracy measured values are necessary for validation. Though the PT method could produce better results than other methods during the EBR less than 0.6 and more than 1.1, both the corrected EC data and the results of the methods have large uncertainties which make meaningless for comparison and validation.
Besides the uncertainty of EC data, several researchers have studied the performance of different methods. Yang et al. [9] considered that the PM method offered better results than the PT method when compared with ET observations using the micro-lysimeter in the Qilian Mountains. Wu et al. [62] found that the PT method showed considerable difference with the PM method, which was used as a reference in the middle of Heihe River. Therefore, the performance of the PM and PT methods may be different on the alpine meadows, while the PM method has been considered the standard method for ET estimation with more rigorous physical meaning [26]. Ma et al. [17,18] compared the complimentary relationship and PM methods in an alpine steppe of the TP and ascribed that the complimentary relationship and PM methods could provide efficient results using meteorological data. Zhu et al. [63] recommended using the PT method when available data were limited for alpine grassland on the TP.
Moreover, the PM method needs more input variables, which limits its application in datascarce region [34]. There have limited observation sites on the TP due to the harsh environment [17]. Net radiation and air temperature can be required with spatial interpolation or remote sensing data for the whole TP region while RH and wind speed may have large uncertainties. Unfortunately, it is difficult to obtain high accuracy ET using PM due to limited input data and to analyze the exact specific meteorological condition when the PT outperforms PM under the current available dataset. Future studies will aim at decreasing these difficulties. We also noted that the collected data were come from different years, which will bring some uncertainties on the evaluation of different methods.
To obtain the main limiting factor of each method, we calculated the sensitivity coefficient of input variables for each method (Fig 8).
The PM method exhibited good performance at Suli and Hulugou sites. In addition to net radiation, relative humidity is another important factor that influences water vapor deficit and the accuracy of ET estimation by the PM method [64]. Air temperature and wind speed were found to have little impact on the PM method (Fig 8a). In order to estimate the influence of wind speed over different elevations, we calculated the sensitivity of wind speed for each site which were located at different elevations (Fig 9). The sensitivity of the wind speed to ET was highest at the Suli site (relative middle elevation), while the sensitivity coefficient was lower at the Tanggula site (relative higher elevation), indicating that wind speed was not sensitive to ET in the high altitude areas which had higher wind speed. Generally, wind speed increased with increasing altitude gradients. Thus, the sensitivity of the wind speed to ET was higher at the lower elevation sites than the higher elevation sites.
It can be concluded that PT performed well at all stations, which may be due to the simple parameterization of PT. The available energy and air temperature were the most influential factors for PT, while ET estimated by PT was not sensitive to air temperature (Fig 8b). The net radiation was a dominant factor of the PT method, and had higher correlation with ET; thus, it was considered in both PM and PT methods. This may be the reason behind the better performance of the PM and PT methods compared to HS and MG [9].
HS showed reduced performance at most alpine meadow sites over the TP, which may be explainable by this method's simple empirical formula that is mainly based on air temperature and extraterrestrial radiation. Consideration of the available energy and soil moisture would most likely improve the performance of this method. R a was used rather than incoming solar radiation as radiation data, which does not consider atmospheric transmissivity [34,65], while air temperature was the only variable that had a large impact on the HS method.
MG was found to underperform compared to all methods, suggesting that it is not suitable for alpine meadow areas. This method does not describe the available energy conditions although air temperature, relative humidity, and wind speed are considered as parameters. Among the three variables, the sensitivity of relative humidity was greater than air temperature and wind speed. Overall, net radiation had a large impact on the PM and PT methods. RH showed higher sensitivity for PM and MG, and air temperature had the greatest impact on the HS method as the only input variable. Moreover, wind speed was found to have minimal influence on the PM and MG methods.

Conclusions
We evaluated PM, PT, HS, and MG as ET estimations methods by comparing observed ET via EC at alpine meadow sites on the TP in China. The PM and HS methods showed reduced results, while results for MG suggest that it may not be suitable for most alpine meadow sites. Based on its acceptable results, the PT method is recommended as the most suitable for alpine meadows on the TP to estimate and stimulate the water and energy balance, where sufficient data is not available. The present evaluations may be helpful for deriving the ET over the whole TP, enhancing our understanding of hydrological cycle of Third Pole region. Future work will focus on the estimation of ET for other vegetation cover types to obtain enhanced accuracy ET over the whole TP region.
Supporting information S1 Table. The meteorological data and flux data of the five sites. The meteorological data and flux data for each site was in the S1 Table.xlsx. (XLSX)