Selecting models for the estimation of reference evapotranspiration for irrigation scheduling purposes

Alternative models for the estimation of reference evapotranspiration (ETo) are typically assessed using traditional error metrics, such as root mean square error (RMSE), which may not be sufficient to select the best model for irrigation scheduling purposes. Thus, this study analyzes the performance of the original and calibrated Hargreaves-Samani (HS), Romanenko (ROM) and Jensen-Haise (JH) equations, initially assessed using traditional error metrics, for use in irrigation scheduling, considering the simulation of different irrigation intervals/time scales. Irrigation scheduling was simulated using meteorological data collected in Viçosa-MG and Mocambinho-MG, Brazil. The Penman-Monteith FAO-56 equation was used as the benchmark. In general, the original equations did not estimate ETo well, except for the ROM and HS equations at Viçosa and Mocambinho, respectively. Calibration and the increase in the time scale provided performance gains. When applied in irrigation scheduling, the calibrated HS and JH equations showed the best performances. Even with greater errors in estimating ETo, the calibrated HS equation performed similarly to or better than the calibrated JH equation, as its errors had greater potential to be canceled during the soil water balance. Finally, in addition to using error metrics, the performance of the models throughout the year should be considered in their assessment. Furthermore, simulating the application of ETo models in irrigation scheduling can provide valuable information for choosing the most suitable model.


Introduction
Irrigation is a very important practice to ensure good agricultural production in arid and semiarid areas. In addition, it can help reduce production risks, even in areas with reasonable rainfall levels, and can be used in greenhouse production. However, despite its benefits, irrigation should be used properly to avoid excessive or insufficient water application. In this sense, irrigation scheduling plays a key role, allowing one to provide water to different crops according to their requirements [1].
Irrigation scheduling can be performed using different approaches, but it is commonly based on reference evapotranspiration (ETo), which is typically computed using meteorological data [2][3][4][5][6][7]. ETo can be used as a basis to compute the evapotranspiration of different crops. To accomplish this, a crop coefficient (Kc) and a water stress coefficient (Ks) are used to convert ETo to the evapotranspiration of a particular crop, considering its development phase and the soil water availability [6,8]. ETo can be estimated using the Penman-Monteith FAO-56 (PM) equation, recommended by the Food and Agriculture Organization (FAO) [2,8]. This equation performs well in different regions of the world. However, in places with low meteorological data availability, its application becomes limited, since it requires air temperature, relative humidity, solar radiation and wind speed data [9,10].
To make it possible to estimate ETo using fewer meteorological data, several studies have evaluated the potential of empirical equations and machine learning models to estimate ETo under different meteorological data availability scenarios [10][11][12][13][14][15]. These alternative models can be important options for the estimation of ETo; however, they typically have limited performance. Depending on its performance, a particular model can be considered suitable or not for irrigation scheduling purposes.
To assess the performance of models for the estimation of ETo, traditional error metrics, such as root mean square error (RMSE), mean absolute error (MAE), mean bias error (MBE) and coefficient of determination (R²), are typically used [11][12][13][14][16]. Overall, these metrics compute the dissimilarity (error) or similarity between the estimates provided by a reference model, which is commonly represented by the PM equation, and a model under evaluation. Based on a single error metric or on a set of error metrics, it is possible to define the most efficient model to estimate ETo as the one with the lowest errors in relation to the reference model. However, when selecting models for irrigation scheduling, this strategy does not provide a direct assessment of the performance of the models for this specific purpose.
In irrigation scheduling, irrigation frequency can have a significant influence on the performance of the models, since prediction errors may decrease when daily ETo values are grouped into longer periods. In addition, when calculating crop evapotranspiration (ETc) using a water stress coefficient (Ks), problems with ETo overestimation, which cause ETc overestimation, can be partially reduced during the soil water balance: the estimated soil water content drops faster, promoting a larger reduction in Ks, which in turn reduces the next ETc values calculated. Another important factor is the behavior of the ETo model over time. For instance, a model with random errors over time can have its errors partially canceled during the soil water balance. Finally, the rainfall distribution over the year can also impact the performance of irrigation scheduling performed with alternative ETo models. Given the dynamics of irrigation scheduling, the simple use of error metrics may not be sufficient to select the best ETo model for irrigation scheduling purposes.
Despite the importance of developing methodologies for a better assessment of models for the estimation of ETo for irrigation scheduling purposes, to our knowledge, no study of this type has been reported so far. Thus, the objective of this study was to analyze the performance of three original and calibrated empirical equations, initially evaluated using traditional error metrics, for irrigation scheduling, considering the simulation of different irrigation intervals.

Database
Hourly data (2015-2017) from two automatic weather stations of the Brazilian National Institute of Meteorology (INMET), located in the municipalities of Viçosa and Mocambinho, state of Minas Gerais, Brazil, were used. The variables used were maximum and minimum air temperature, mean relative humidity, solar radiation, wind speed (10 m) and rainfall. Wind speed measured at 10 m height was converted to 2 m height, as suggested by Allen et al. [8]. The hourly data were converted to a daily timescale. Days with missing data were removed. The weather stations used in this study were selected because they represent relatively different climatic conditions. The mean values of the meteorological variables used, in the periods considered to calibrate the equations (2015-2016) and to assess their performances (2017), are presented in Table 1. The database is available in Supporting information or directly from INMET (https://portal.inmet.gov.br/dadoshistoricos).
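The height conversion mentioned above can be sketched with the logarithmic wind profile relation given in FAO-56 (Allen et al. [8]). The function name and the default measurement height are chosen here for illustration only.

```python
import math

def wind_speed_at_2m(u_z: float, z: float = 10.0) -> float:
    """Convert wind speed u_z (m/s) measured at height z (m) to the 2 m
    equivalent, using the FAO-56 logarithmic wind profile relation."""
    return u_z * 4.87 / math.log(67.8 * z - 5.42)

# A 10 m reading is scaled by a factor of roughly 0.75.
print(wind_speed_at_2m(3.0))
```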

Irrigation scheduling-simulation configurations
To carry out irrigation scheduling, the soil water inputs (rainfall and irrigation) and output (evapotranspiration) were computed. Crop evapotranspiration (ETc) was calculated based on Eq 1, as recommended by Allen et al. [8] and Bernardo et al. [17]. The Ks coefficient is used to adjust ETc for water deficit conditions. When adjusted for water deficit conditions, as considered in the present study, ETc is commonly referred to as actual evapotranspiration (ETa) or adjusted ETc. In this study, the notation ETc was maintained.
$$\mathrm{ET_c} = \mathrm{ET_o} \cdot K_c \cdot K_s \tag{1}$$
where ETc is crop evapotranspiration, mm d⁻¹; ETo is reference evapotranspiration, mm d⁻¹; Kc is the crop coefficient; and Ks is the water stress coefficient. ETo was obtained using different equations, which are presented later. Ks was calculated based on Eq 2 [17].
where SWC is soil water content, mm; and TAW is total available water, mm.
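As an illustration of how Eq 1 combines with a water stress coefficient, the sketch below assumes a logarithmic Ks, Ks = ln(SWC + 1)/ln(TAW + 1), a form often attributed to Bernardo et al. [17]. This form is an assumption made here, not something the text confirms, and SWC is taken as available water above the wilting point, so that SWC = TAW at field capacity. Function names are hypothetical.

```python
import math

def water_stress_coefficient(swc: float, taw: float) -> float:
    """Ks from available soil water (mm).
    ASSUMPTION: logarithmic form ln(SWC + 1) / ln(TAW + 1); the exact
    expression used in the study is not confirmed by the text."""
    swc = min(max(swc, 0.0), taw)  # keep SWC within physical bounds
    return math.log(swc + 1.0) / math.log(taw + 1.0)

def crop_evapotranspiration(eto: float, kc: float, ks: float) -> float:
    """Eq 1: ETc = ETo * Kc * Ks, all evapotranspiration terms in mm/d."""
    return eto * kc * ks

# With a full soil profile Ks equals 1, so ETc reduces to ETo * Kc.
ks_full = water_stress_coefficient(33.0, 33.0)
print(crop_evapotranspiration(5.0, 1.1, ks_full))
```

Under this assumed form, Ks stays close to 1 over most of the available water range and drops sharply only as the soil approaches the wilting point.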
Once ETc had been obtained, the soil water balance was computed based on Eq 4. The initial value of the soil water content (SWC) was equal to TAW. Effective rainfall (rainfall stored in the root zone) was considered equal to total rainfall if total rainfall did not exceed the current soil water deficit (TAW-SWC), and equal to the current soil water deficit otherwise.
$$SWC_i = SWC_{i-1} - \mathrm{ET_c} + P_e + I \tag{4}$$
where SWCi is the soil water content on the current day, mm; SWCi-1 is the soil water content on the previous day, mm; ETc is crop evapotranspiration, mm; Pe is effective rainfall, mm; and I is the net irrigation depth, mm.
Knowing the current SWC, irrigation was computed in order to return SWC to field capacity. Thus, net irrigation depth was obtained by subtracting SWC from TAW (TAW-SWC). The parameters used for the simulations were as follows: field capacity (FC) = 30%, permanent wilting point (PWP) = 15%, soil bulk density (BD) = 1.1 g cm⁻³, effective rooting depth (z) = 20 cm, and crop coefficient (Kc) = 1.1. Fixed irrigation intervals (1, 2, 4, 6 and 8 days) and variable irrigation intervals were considered. For variable irrigation intervals, the critical minimum soil water content was defined as 50% of TAW, which corresponds to a soil water depletion fraction for no stress (p), also called soil water availability factor (f), equal to 0.5. It is assumed that below this water content the crop begins to be affected by water deficit. To prevent the soil water content from falling below this critical minimum limit, irrigation was carried out when the soil water content was 40% below TAW (i.e., at 60% of TAW). The simulations were performed using data from the year 2017, with data from 2015-2016 reserved to calibrate the empirical equations.
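The scheduling procedure described above can be sketched as a daily loop. Several choices below are assumptions made for illustration: FC and PWP are treated as mass-based percentages (which gives TAW of about 33 mm for the stated parameters), Ks uses an assumed logarithmic form, and the within-day order (evapotranspiration, then rainfall, then irrigation) is not specified in the text. All function names are hypothetical.

```python
import math

def total_available_water(fc_pct, pwp_pct, bulk_density, root_depth_cm):
    """TAW in mm. ASSUMPTION: FC and PWP are mass-based percentages."""
    return (fc_pct - pwp_pct) / 100.0 * bulk_density * root_depth_cm * 10.0

def ks_log(swc, taw):
    """ASSUMED logarithmic water stress coefficient (SWC as available water, mm)."""
    return math.log(max(swc, 0.0) + 1.0) / math.log(taw + 1.0)

def simulate_schedule(eto, rain, taw, kc=1.1, interval=None, trigger=0.4):
    """Daily soil water balance (Eq 4), starting at SWC = TAW.
    interval: fixed irrigation interval in days, or None for variable
    intervals, where irrigation is applied once depletion reaches
    trigger * TAW. Returns daily lists of ETc, Pe and net irrigation depth."""
    swc = taw
    etc_out, pe_out, irr_out = [], [], []
    for day, (eto_d, rain_d) in enumerate(zip(eto, rain), start=1):
        etc = eto_d * kc * ks_log(swc, taw)      # Eq 1
        swc = max(swc - etc, 0.0)
        pe = min(rain_d, taw - swc)              # rainfall stored in the root zone
        swc += pe
        if interval is not None:
            due = day % interval == 0
        else:
            due = (taw - swc) >= trigger * taw
        irrigation = 0.0
        if due:
            irrigation = taw - swc               # refill to field capacity
            swc = taw
        etc_out.append(etc)
        pe_out.append(pe)
        irr_out.append(irrigation)
    return etc_out, pe_out, irr_out

taw = total_available_water(30, 15, 1.1, 20)
etc, pe, irr = simulate_schedule([4.0] * 8, [0.0] * 8, taw, interval=2)
print(round(taw, 1), [round(i, 2) for i in irr])
```

Because each irrigation refills the profile, total net irrigation over a rain-free period ending on an irrigation day equals total ETc, which gives a quick consistency check on the loop.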

Estimation of reference evapotranspiration
Daily ETo estimated using the PM equation (Eq 5) was employed as the standard method for calibration and evaluation of the empirical equations. All procedures necessary to calculate ETo were performed according to the recommendations of Allen et al. [8]. Although the PM equation is also subject to errors, it has good reliability and can be used as a standard for the development and calibration of other models [8,9].
$$\mathrm{ET_o} = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_{mean} + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \tag{5}$$
where ETo is reference evapotranspiration, mm d⁻¹; Rn is net solar radiation, MJ m⁻² day⁻¹; G is soil heat flux, MJ m⁻² day⁻¹ (considered to be null for daily estimates); Tmean is daily mean air temperature, °C; u2 is wind speed at a 2 m height, m s⁻¹; es is saturation vapor pressure, kPa; ea is actual vapor pressure, kPa; Δ is the slope of the saturation vapor pressure function, kPa °C⁻¹; and γ is the psychrometric constant, kPa °C⁻¹. ETo was also estimated using the empirical equations shown in Table 2.
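For reference, the daily PM computation can be sketched as below. The helper relations for saturation vapor pressure, its slope and the psychrometric constant are the standard FAO-56 expressions; net radiation is taken as an input because its derivation is lengthy. Function names and the input values in the usage lines are illustrative assumptions, not data from this study.

```python
import math

def saturation_vapor_pressure(t: float) -> float:
    """Saturation vapor pressure (kPa) at air temperature t (deg C), FAO-56."""
    return 0.6108 * math.exp(17.27 * t / (t + 237.3))

def slope_vapor_pressure_curve(t_mean: float) -> float:
    """Slope of the saturation vapor pressure curve (kPa/deg C), FAO-56."""
    return 4098.0 * saturation_vapor_pressure(t_mean) / (t_mean + 237.3) ** 2

def psychrometric_constant(elevation_m: float) -> float:
    """Psychrometric constant (kPa/deg C) from station elevation (m), FAO-56."""
    pressure = 101.3 * ((293.0 - 0.0065 * elevation_m) / 293.0) ** 5.26
    return 0.000665 * pressure

def penman_monteith_eto(rn, t_mean, u2, es, ea, delta, gamma, g=0.0):
    """Daily FAO-56 Penman-Monteith ETo (mm/d); g defaults to 0 for daily steps."""
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    return numerator / (delta + gamma * (1.0 + 0.34 * u2))

# Illustrative inputs only (not taken from the study sites).
delta = slope_vapor_pressure_curve(16.9)
gamma = psychrometric_constant(100.0)
print(penman_monteith_eto(13.28, 16.9, 2.078, 1.997, 1.409, delta, gamma))
```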
Table 2. Empirical equations used in the study.

Name (inputs) | Equation | Reference
Hargreaves-Samani (T) | ETo = 0.0023 Ra (Tmean + 17.8) (Tmax − Tmin)^0.5 | [18]
Romanenko (T, RH) | |
Jensen-Haise (T, Rs) | ETo = 0.408 Rs (0.0252 Tmean + 0.078) | [20]

T: air temperature, °C; RH: mean relative humidity, %; Ra: extraterrestrial radiation, mm d⁻¹; Tmax: maximum air temperature, °C; Tmin: minimum air temperature, °C; Tmean: mean air temperature ([Tmax + Tmin]/2), °C; Rs: solar radiation, MJ m⁻² d⁻¹.
https://doi.org/10.1371/journal.pone.0245270.t002

To adjust the empirical equations to the local climate conditions, they were calibrated based on simple linear regression, as recommended by Allen et al. [8], using data from 2015 to 2016. For this, daily ETo values estimated by the equation to be calibrated were used as the independent variable and ETo values estimated by the PM equation were used as the dependent variable. The intercept (a) and slope (b) values were used as calibration parameters, according to the following equation.
$$\mathrm{ET_{o,cal}} = a + b \cdot \mathrm{ET_{o,eq}} \tag{6}$$
where ETo,cal is the calibrated ETo, mm d⁻¹; ETo,eq is ETo estimated by the original equation, mm d⁻¹; and a and b are the calibration parameters. The values obtained for the calibration parameters "a" and "b" are presented in Table 3.
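The calibration step can be illustrated with a short sketch that implements the Hargreaves-Samani and Jensen-Haise equations as written in Table 2 and fits the intercept and slope by ordinary least squares. The function names and the synthetic data in the usage lines are assumptions for illustration; they are not the study's data or its fitted parameters.

```python
import numpy as np

def hargreaves_samani(ra, t_max, t_min):
    """HS equation from Table 2; ra in mm/d, temperatures in deg C."""
    t_mean = (t_max + t_min) / 2.0
    return 0.0023 * ra * (t_mean + 17.8) * np.sqrt(t_max - t_min)

def jensen_haise(rs, t_mean):
    """JH equation from Table 2; rs in MJ m-2 d-1, t_mean in deg C."""
    return 0.408 * rs * (0.0252 * t_mean + 0.078)

def calibrate(eto_eq, eto_pm):
    """Fit eto_pm = a + b * eto_eq by least squares; returns (a, b)."""
    b, a = np.polyfit(np.asarray(eto_eq, float), np.asarray(eto_pm, float), 1)
    return a, b

def apply_calibration(eto_eq, a, b):
    """Calibrated ETo from the original estimate."""
    return a + b * np.asarray(eto_eq, float)

# Synthetic example: an equation that overestimates the benchmark.
eto_eq = np.array([3.0, 4.0, 5.0, 6.0])
eto_pm = 0.5 + 0.8 * eto_eq
a, b = calibrate(eto_eq, eto_pm)
print(a, b, apply_calibration(eto_eq, a, b))
```

Fitting on 2015-2016 and evaluating on 2017, as the study does, keeps the calibration parameters independent of the evaluation period.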

Performance comparison criteria
The performance of the empirical equations for the estimation of ETo was evaluated using data from the year 2017, the same period considered for irrigation scheduling.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(P_i - O_i)^2} \tag{7}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|P_i - O_i\right| \tag{8}$$
$$\mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}(P_i - O_i) \tag{9}$$
$$R^2 = \frac{\left[\sum_{i=1}^{n}(P_i - \bar{P})(O_i - \bar{O})\right]^2}{\sum_{i=1}^{n}(P_i - \bar{P})^2 \, \sum_{i=1}^{n}(O_i - \bar{O})^2} \tag{10}$$
where RMSE is root mean square error, mm d⁻¹; MAE is mean absolute error, mm d⁻¹; MBE is mean bias error, mm d⁻¹; R² is the coefficient of determination; Pi is the predicted value, mm d⁻¹; Oi is the observed value, mm d⁻¹; P̄ is the mean of the predicted values, mm d⁻¹; Ō is the mean of the observed values, mm d⁻¹; and n is the number of data pairs.
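A compact way to compute these four metrics is sketched below, taking R² as the squared Pearson correlation between predicted and observed values and MBE as predicted minus observed, so that positive values indicate overestimation. The function name is hypothetical.

```python
import numpy as np

def error_metrics(predicted, observed):
    """Return RMSE, MAE, MBE (predicted minus observed) and R2
    (squared Pearson correlation) for paired daily ETo values."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    err = p - o
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "mbe": float(np.mean(err)),
        "r2": float(np.corrcoef(p, o)[0, 1] ** 2),
    }

# A constant offset of +1 mm/d yields a perfect R2 despite large errors.
print(error_metrics([2.0, 3.0, 4.0, 5.0], [1.0, 2.0, 3.0, 4.0]))
```

The usage line shows why R² alone is not enough: a systematically biased equation can still correlate perfectly with the benchmark, which is also why such an equation responds well to linear calibration.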
To assess the performance of the equations in the simulated irrigation scheduling, total ETc, total net irrigation depth and total effective rainfall estimated when using each equation were compared. In addition, after the end of the irrigation scheduling carried out with each empirical equation, the soil water balance was recomputed considering the irrigations recommended over the management period and ETc recalculated using ETo obtained with the PM equation (Fig 1). All the computations were performed on a daily basis. ETc and effective rainfall obtained in the recomputed soil water balance were denoted as ETc (true) and Pe (true), respectively. This procedure was performed to assess the real performance of the irrigation scheduling carried out with the different empirical equations. In this way, it is possible to analyze the ETc that actually occurred during the management period and check the occurrence of irrigation excesses or deficits when using the different empirical equations to schedule irrigation. Based on the recomputed soil water balance (Fig 1), the irrigation excesses and deficits that occurred during the management period were calculated. Irrigation excesses were computed as the sum of the net irrigation depths that resulted in soil water contents that exceeded field capacity. To compute irrigation deficits for the simulations with fixed irrigation intervals, deficit was considered as the reduction of ETc (true) observed for each empirical equation in relation to ETc observed when using the PM equation to schedule irrigation. This was done because water deficit reduces ETc, which is associated with poorer crop development. For irrigation scheduling with variable irrigation intervals, the occurrence of deficits was computed as the sum of the soil water content deficits in relation to the critical minimum water content considered (50% of TAW, f = 0.5).
Two classes of deficit were defined: (i) cases in which deficits were equivalent to 0.5 < f ≤ 0.6 (weak deficit), and (ii) f > 0.6 (moderate to strong deficit). These deficits, denoted Deficit(0.5 < f ≤ 0.6) and Deficit(f > 0.6), were calculated according to Eqs 11 and 12, where CL1 is the critical limit soil water content referring to f = 0.5, mm; CL2 is the critical limit soil water content referring to f = 0.6, mm; SWCi is the soil water content value, mm; and TAW is total available water, mm.
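The recomputation and deficit accounting described above can be sketched as follows. This is one possible reading, not the study's implementation: the irrigation depths recommended by an empirical equation are replayed against PM-based ETo, excess is the irrigation that would raise SWC above TAW, and both deficit classes are accumulated relative to CL1 = 0.5 TAW, with CL2 = 0.4 TAW separating them. Ks again uses an assumed logarithmic form, and all names are hypothetical.

```python
import math

def replay_with_reference(eto_pm, rain, irrigation, taw, kc=1.1):
    """Recompute the soil water balance with PM-based ETo, applying the
    irrigation depths recommended by another ETo model.
    Returns total ETc (true), total Pe (true), total excess (mm) and the
    daily SWC series taken before irrigation."""
    swc = taw
    etc_true = pe_true = excess = 0.0
    swc_before_irrigation = []
    for eto_d, rain_d, irr_d in zip(eto_pm, rain, irrigation):
        ks = math.log(max(swc, 0.0) + 1.0) / math.log(taw + 1.0)  # ASSUMED form
        etc = eto_d * kc * ks
        swc = max(swc - etc, 0.0)
        pe = min(rain_d, taw - swc)
        swc += pe
        swc_before_irrigation.append(swc)
        excess += max(swc + irr_d - taw, 0.0)   # water applied beyond field capacity
        swc = min(swc + irr_d, taw)
        etc_true += etc
        pe_true += pe
    return etc_true, pe_true, excess, swc_before_irrigation

def classify_deficits(swc_series, taw):
    """Accumulate deficits below CL1 (f = 0.5), split at CL2 (f = 0.6).
    ASSUMPTION: both classes are measured relative to CL1."""
    cl1, cl2 = 0.5 * taw, 0.4 * taw
    weak = sum(cl1 - s for s in swc_series if cl2 <= s < cl1)
    strong = sum(cl1 - s for s in swc_series if s < cl2)
    return weak, strong

print(classify_deficits([20.0, 15.0, 10.0], 33.0))
```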

Estimation of ETo
Among the non-calibrated equations, the Romanenko equation (ROM) had the best performance for the estimation of ETo at Viçosa, with lower RMSE and MAE values in the various time scales considered (Table 4). This equation was followed by the Jensen-Haise (JH) equation and the Hargreaves-Samani (HS) equation, in that order. However, after calibration, the ROM equation exhibited the worst performance. The best performance was obtained by the JH equation, followed by the HS equation. At Mocambinho, the HS equation showed the best performance among the non-calibrated equations, followed by the JH and ROM equations, in that order (Table 5). After calibration, as for Viçosa, the JH equation showed the best performance, followed by the HS and ROM equations. Possibly the HS equation obtained the best performance among the non-calibrated equations because it was developed for a dry climate region (semiarid) [21], such as Mocambinho.
By increasing the time scale, there were performance gains for all the equations at both municipalities considered, with reductions in the error metrics (RMSE, MAE and MBE) and an increase in R². This is because part of the errors in daily estimates can be canceled when considering longer time periods.
All the non-calibrated equations evaluated, with the exception of the ROM equation used at Viçosa, showed relatively high MBE values at both studied locations, which indicates that there was a systematic overestimation of ETo. These equations obtained only small performance gains with the increase of the time scale. Furthermore, they did not reach RMSE and MAE values as low as those obtained by the calibrated equations, which showed a low general tendency to overestimate or underestimate ETo (low MBE absolute values).
Calibration promoted large reductions in RMSE and MAE values. After calibration, the equations with higher R² values, with emphasis on the JH equation, exhibited low errors even when they had high RMSE and MAE values before calibration. It should be noted that equations with good structure, which can adequately map the relationship between the input and output variables and reach high R² values, can benefit from calibration [16].
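The contrast between errors that cancel over longer time scales and a persistent bias that does not can be illustrated with a small synthetic example. The aggregation used here (means over non-overlapping windows) is an assumption about how time scales might be formed, and the numbers are made up for illustration.

```python
import numpy as np

def aggregate(values, window):
    """Mean over non-overlapping windows of `window` days (trailing days dropped)."""
    v = np.asarray(values, dtype=float)
    n = len(v) // window * window
    return v[:n].reshape(-1, window).mean(axis=1)

def rmse(predicted, observed):
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((p - o) ** 2)))

observed = [4.0] * 8
alternating = [5.0, 3.0] * 4   # errors of +1 and -1 mm/d that alternate
biased = [5.0] * 8             # constant overestimation of 1 mm/d

for name, pred in (("alternating", alternating), ("biased", biased)):
    daily = rmse(pred, observed)
    two_day = rmse(aggregate(pred, 2), aggregate(observed, 2))
    print(name, daily, two_day)
```

Both series have the same daily RMSE, but only the alternating errors vanish at the 2-day scale; the biased series keeps its full error, consistent with the small gains observed for the non-calibrated equations.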
Based on the metrics presented in Tables 4 and 5, one can easily rank the performance of the models, identifying those with the highest performances. However, it can still be difficult to infer whether a particular model is suitable or not for irrigation scheduling purposes.

Irrigation scheduling
The results of the irrigation scheduling simulations with fixed irrigation intervals for Viçosa and Mocambinho are shown in Tables 6 and 7, respectively. The increase in irrigation intervals promoted, in all cases, reductions in ETc values and in the total net irrigation depths applied. The decrease in ETc occurs due to the larger reductions in the soil moisture promoted by larger irrigation intervals, which reduces Ks values and, consequently, ETc. The reduction in the net irrigation depths occurs due to the reduction in ETc and due to the increase in effective rainfall, as seen in Tables 6 and 7. Longer irrigation intervals promote greater use of rainfall (i.e., more rainwater is stored in the root zone) because they increase the chance of soil having less moisture, in relation to shorter irrigation intervals, when rainfall reaches the soil.
Among the non-calibrated equations, only the ROM equation used at Viçosa obtained total net irrigation depth close to that obtained with the PM equation. In all other cases, irrigation was overestimated. Thus, such equations promoted excessive water application, increasing the soil moisture above field capacity, as seen in the "Excess" column of Tables 6 and 7. However, after calibration, all the equations obtained total net irrigation depths close to those obtained when using the PM equation. Such behaviors corroborate the reductions in MBE absolute values observed for the estimation of ETo (Tables 4 and 5).
Although the calibrated equations obtained total net irrigation depths close to those obtained using the PM equation, it does not mean that they had the same performance as the PM equation. To analyze the performance of the equations considering their time dynamics, it is possible to evaluate the occurrence of excessive water applications, as well as reductions in ETc under lower water application (i.e., irrigation scheduling using alternative equations) in relation to ETc observed under adequate irrigation conditions (i.e., irrigation scheduling using the PM equation). In this sense, although most of the calibrated equations resulted in total net irrigation depths close to those calculated with the PM equation, there were both irrigation underestimation and overestimation during the period evaluated, as shown in columns "Deficit" and "Excess" in Tables 6 and 7. However, after calibrating the equations, there were, in general, large reductions in the excessive water applications. On the other hand, the calibrated equations promoted certain irrigation deficits, slightly reducing total ETc (true) in relation to that observed when scheduling irrigation with the PM equation. For both study sites, the calibrated HS and JH equations were the best options, promoting low excessive water applications and only small reductions in ETc (true).

Table 6. Information on the irrigation scheduling carried out at Viçosa with the PM equation and original and calibrated empirical equations considering different irrigation intervals (II). All the variables, except for II, are expressed in mm.

Equation | II (d) | ETc | ETc (true) | NID | Pe | Pe (true) | Deficit | Excess

When scheduling irrigation using variable irrigation intervals, a critical soil water content is adopted to prevent the crop from suffering water deficit. Thus, it is necessary that the current soil water content is always above or, at most, slightly below the critical minimum limit considered. Alternative models for the estimation of ETo must therefore be able to provide sufficiently reliable ETo estimates to meet this condition. The results of the irrigation scheduling simulations with variable irrigation intervals are shown in Table 8.
As previously observed, among the non-calibrated equations, only the ROM equation used at Viçosa obtained total net irrigation depth close to that obtained with the PM equation. In the other cases, the total net irrigation depths were much higher than those calculated with the PM equation. After calibration, there were, in general, reductions in the irrigation excesses. In relation to the irrigation deficits over the period evaluated, accumulated deficits in relation to the critical soil water content (f = 0.5) were computed in two classes: (i) cases in which deficits were equivalent to 0.5 < f ≤ 0.6 (weak deficit), and (ii) f > 0.6 (moderate to strong deficit). Even using the PM equation, there were some weak deficit events (0.5 < f ≤ 0.6). This behavior is expected because even though the soil has not reached the limit water content for irrigation (in this study, irrigation was carried out when the soil water content was 40% below TAW) on a particular day, it is possible that, on the next day, the soil water content is already below the critical limit adopted (50% of TAW). However, it is expected that this level of stress, which remains for a short period and is of low intensity, does not cause significant damage to the crops.
At Viçosa, the calibrated HS and JH equations performed the best, with similar performance to each other. For Mocambinho, these equations also obtained the best performances; however, the calibrated HS equation was slightly better than the calibrated JH equation since it had lower moderate to strong deficits (f > 0.6) and lower irrigation excesses. These behaviors partially contradict the results obtained when directly evaluating the equations for the estimation of ETo (Tables 4 and 5), since the calibrated JH equation was considered better than the calibrated HS equation in all the studied scenarios. To better assess the irrigation scheduling carried out with the different equations for the estimation of ETo, the soil water content behaviors during the evaluation period at Viçosa and Mocambinho are shown in Figs 2 and 3, respectively. After the end of the irrigation scheduling simulations with each empirical equation, the soil water contents were recalculated based on ETo obtained with the PM equation, as shown in Fig 1. The information presented in Figs 2 and 3 refers to these recalculated water contents. On the days when there was irrigation, the water contents presented refer to the moment before irrigation.
At Viçosa, both the calibrated HS and JH equations promoted only small water deficits below the critical limit (Fig 2). At Mocambinho, when using the calibrated HS and JH equations, the soil water content fell considerably in the period from day 250 to day 300, especially for the calibrated JH equation (Fig 3). Even though the calibrated JH equation presented better metrics than the calibrated HS equation for the estimation of ETo at Mocambinho (Table 5), this equation had continuous ETo underestimations in the period around days 250-300 (Fig 4). On the other hand, the calibrated HS equation, despite showing, in general, greater deviations in relation to ETo obtained with the PM equation, had more alternating ETo underestimates and overestimates, which contributes to partially canceling the errors that occurred during the irrigation scheduling period. Similar behavior was observed for Viçosa (Fig 4). It is also worth mentioning that in places with high rainfall levels, problems with ETo underestimation tend to be reduced. Finally, in addition to evaluating alternative models for the estimation of ETo using error metrics such as RMSE, MAE, MBE and R², it is also important to analyze their behavior throughout the year. Furthermore, the simulation of the use of these models for irrigation scheduling can help in choosing the best model. Future studies could address the development of software, including integration with crop models, for simulating the use of models for the estimation of ETo for irrigation scheduling. Thus, it would be possible to evaluate the performance of the models considering more specific scenarios of interest. For example, one could consider the application of an empirical equation only during a certain period of the year, the irrigation of crops with different cycles, and different types of soil, among other factors.
A limitation of the present study, which should be addressed in future studies, is that real field data, such as eddy covariance and/or soil water content measurements, were not used as the benchmark.

Conclusions
Alternative models for the estimation of ETo are typically assessed using error metrics. However, the model with the best metrics for the estimation of ETo may not be the best option to be used for irrigation scheduling. Despite the importance of developing methodologies for a better assessment of the performance of models for the estimation of ETo for irrigation scheduling purposes, to our knowledge, no study of this type has been reported so far. Thus, this study analyzed the performance of three original and calibrated empirical equations, initially assessed using traditional error metrics, for irrigation scheduling, considering the simulation of different irrigation intervals. Two study sites, Viçosa-MG and Mocambinho-MG, Brazil, were used.
In general, the original empirical equations did not perform well for the estimation of ETo, with the exception of the Romanenko and Hargreaves-Samani equations used at Viçosa and Mocambinho, respectively. Calibration promoted performance gains, reducing the tendency of the equations to overestimate ETo. The increase in the time scale also led to reductions in estimation errors.
When used for irrigation scheduling, the calibrated Hargreaves-Samani and Jensen-Haise equations showed the best performances at both the Viçosa and Mocambinho stations. Even with greater errors when estimating ETo, the calibrated Hargreaves-Samani equation performed similarly to or better than the calibrated Jensen-Haise equation, as its errors had greater potential to be canceled during the soil water balance. The results obtained depend on the climate conditions of the study site; thus, the performance of the equations can be very different in areas with different climatic conditions. Finally, it is suggested that the assessment of models for the estimation of ETo for use in irrigation scheduling, in addition to using traditional error metrics, consider the performance of the models throughout the year. Furthermore, simulating the application of the models in irrigation scheduling can provide valuable information for choosing the most suitable option.