Federated learning based reference evapotranspiration estimation for distributed crop fields

Muhammad Tausif; Muhammad Waseem Iqbal; Rab Nawaz Bashir; Bayan AlGhofaily; Alex Elyassih; Amjad Rehman Khan

doi:10.1371/journal.pone.0314921

Abstract

Water resource management and sustainable agriculture rely heavily on accurate estimates of Reference Evapotranspiration (ET_o). Efforts have been made to simplify ET_o estimation using machine learning models, but the existing approaches are limited to a single specific area. There is a need for ET_o estimation at multiple locations with diverse weather conditions. This study proposes ET_o estimation for multiple locations with distinct weather conditions using a federated learning approach. Traditional centralized approaches require aggregating all data in one place, which can be problematic due to privacy concerns and data transfer limitations. In contrast, federated learning trains models locally and combines the knowledge, resulting in more generalized ET_o estimates across different regions. Three geographical locations in Pakistan, each with diverse weather conditions, are selected to implement the proposed model using weather data from 2012 to 2022. At each selected location, three machine learning models, namely Random Forest Regressor (RFR), Support Vector Regressor (SVR), and Decision Tree Regressor (DTR), are evaluated for local Evapotranspiration (ET) estimation and for the federated global model. A feature importance analysis is also performed to assess the impact of weather parameters on machine learning performance at each selected location. The evaluation reveals that RFR-based federated learning outperformed the other models, with coefficient of determination (R²) = 0.97, Root Mean Squared Error (RMSE) = 0.44 mm day⁻¹, Mean Absolute Error (MAE) = 0.33 mm day⁻¹, and Mean Absolute Percentage Error (MAPE) = 8.18%. The RFR also yields the best performance among the local machine learning models at each selected site. The analysis results suggest that maximum temperature and wind speed are the most influential factors in ET predictions.

Citation: Tausif M, Iqbal MW, Bashir RN, AlGhofaily B, Elyassih A, Khan AR (2025) Federated learning based reference evapotranspiration estimation for distributed crop fields. PLoS ONE 20(2): e0314921. https://doi.org/10.1371/journal.pone.0314921

Editor: Ghani Rahman, Sejong University, KOREA, REPUBLIC OF

Received: April 22, 2024; Accepted: November 19, 2024; Published: February 5, 2025

Copyright: © 2025 Tausif et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data files are available from the GitHub repository: https://github.com/RaoTausif/FL-Based-ETo-Estimation.

Funding: The research is supported by the Artificial Intelligence & Data Analytics (AIDA) Lab, Prince Sultan University, Riyadh, Saudi Arabia. The authors would like to acknowledge the support of Prince Sultan University, Riyadh Saudi Arabia for support of the Article Processing Charges (APC) of this publication. The funders had a role in study design, data collection and analysis, decision to publish, and preparation of the manuscript.

Competing interests: The authors declared that no competing interests exist.

Abbreviations: ANN, Artificial Neural Network; BaTs, Bayesian Additive Trees; BoTs, Boosted Trees; CNN, Convolution Neural Network; DTR, Decision Tree Regressor; ELM, Extreme Learning Machine; ENR, Elastic Net Regression; ET, Evapotranspiration; ET_o, Reference Evapotranspiration; ETR, Extra Tree Regression; FFNN, Feed Forward Neural Network; FL, Federated Learning; GBR, Gradient Boosting Regression; IoT, Internet of Things; LR, Lasso Regression; LSTM, Long Short-Term Memory; M5P, M5 Model Tree; MAE, Mean Absolute Error; MAPE, Mean Absolute Percentage Error; ML, Machine Learning; MLR, Multiple Linear Regression; PCR, Principal Component Regression; PLSR, Partial Least Square Regression; POR, Poisson Regression; PSO, Particle Swarm Optimization; R², Coefficient of Determination; RBFNN, Radial Basis Function Neural Network; RDR, Ridge Regression; RF, Random Forest; RFR, Random Forest Regressor; RH, Relative Humidity; RMSE, Root Mean Square Error; SR, Solar Radiations; SVR, Support Vector Regressor; Tmax, Maximum Temperature; Tmin, Minimum Temperature; WS, Wind Speed; XGBR, eXtreme Gradient Boosting Regressor

1 Introduction

Evapotranspiration (ET) is a key water cycle concept in water resource management and planning and has many application areas [1], e.g., water management, drought monitoring, and irrigation scheduling [2]. Although its computation is critical for water management, the interrelationship between weather parameters and ET makes such computation complex. Moreover, water management across multiple locations is even more demanding. Therefore, solutions for ET estimation at multiple locations that consider their weather parameters are required to help agriculturists.

In agriculture, farming and irrigation use about 70% of fresh water around the globe [3] and significantly impact sustainability [4]. Water management uses Reference Evapotranspiration (ET_o) to improve irrigation practices. Accurate ET_o calculation that accounts for the weather conditions of a location is essential for implementing sustainable practices through irrigation water conservation.

ET_o is a standard measure for comparing the ET rate across locations and weather conditions. It can be computed by estimating the water transpired and evaporated by a crop under specific weather conditions [5]. Different weather parameters, e.g., solar radiation, temperature, precipitation, vapor pressure deficit, and wind speed, may influence ET_o. ET_o can be estimated using a variety of methods, including Penman-Monteith (PM), Thornthwaite, Hargreaves [6], satellite-based, and machine-learning-based approaches [7–9]. Machine-learning-based approaches are able to capture more complex relationships between input and output variables [10–12]. Based on weather data, these methods use machine and deep learning algorithms to estimate ET_o. Each of these approaches has its advantages and disadvantages. Using historical data, efforts were made to establish empirical relationships between weather parameters and ET_o.

Machine learning (ML) algorithms have made remarkable progress in ET estimation with limited weather parameters. ML algorithms such as Support Vector Regression (SVR), Random Forest Regressor (RFR), and Artificial Neural Networks (ANN) have proven very effective for accurate estimation of ET_o with limited weather parameters [13]. ML methods can analyze more complex relationships between the weather parameters for accurate ET_o estimation. Traditional methods for estimating ET_o, e.g., the Penman-Monteith equation, are complex and computationally demanding. Moreover, simple ML approaches also face limitations, particularly related to data centralization, and may not effectively capture the diverse climatic conditions of different regions.

The application of ML techniques to ET_o prediction has demonstrated their accuracy with limited weather data. However, such solutions are often limited to specific areas: they were proposed for a specific location and tailored to local weather conditions [14, 15], and it is very hard to apply them in different contexts with the same accuracy [16]. The inherent variations in the weather conditions of different locations further complicate ET_o modeling with limited weather parameters [17]. Understanding the relationship between weather and ET_o can be very complex, particularly for several locations with diverse weather patterns [18]. A universal ET_o model using limited weather parameters is difficult to optimize across the diverse weather conditions of different locations [17]. Traditional machine learning methods for estimating ET_o are effective within certain contexts but are constrained by their dependence on local datasets, which limits their generalizability across diverse geographical locations. This limitation is therefore not due to the models themselves but rather to the restricted scope of the input data available to them. There is a need for a model that generalizes well for ET_o prediction at different locations with distinct weather conditions. This study addresses this limitation by implementing a federated learning (FL) approach, which allows localized model training and generalization across the local models through a global model [14]. The proposed approach explores the ability of a federated model to generalize ET_o prediction across the diverse weather parameters of multiple locations.

FL has the potential to handle different weather conditions across multiple locations using a single global model. Current methods have made some progress in certain areas. However, we still need to figure out how to deal with the diversity of data when estimating ET_o across a wider range of locations using a single model. FL can handle data diversity by leveraging multiple models for different locations [19]. The FL enables distributed data across multiple sources without data centralization [20, 21]. The FL has the potential to address the issues of data heterogeneity, decentralized data distribution, real-time adaptability, and resource efficiency. The advantages of FL over traditional machine learning and deep learning approaches for ET_o estimation of multiple locations with distinct weather conditions make it different, novel, and unique [22]. The architecture of the proposed FL model is illustrated in Fig 1.

Fig 1. Overview of the federated model.

https://doi.org/10.1371/journal.pone.0314921.g001

FL is flexible in handling the weather parameters of different locations for ET_o estimation, given data from multiple locations [23]. It draws on the datasets of multiple locations, improves accuracy, and generalizes the ET_o estimation model across those locations. Furthermore, FL improves geographical coverage and preserves privacy while drawing insights from the datasets of multiple locations [24, 25]. Considering these advantages of FL, this paper proposes a novel model to improve the accuracy of ET_o prediction for multiple locations. The proposed model uses the collective power of distributed data [1, 26] and enhances the precision and spatial coverage of ET_o prediction. FL-based ET_o estimation enhances water balance analysis by promoting model generalization and privacy: training on a diverse set of localized data from various regions helps ensure that the models are robust and applicable across different environmental conditions.

The main contributions of the paper are as follows:

A novel model is proposed for estimating ET_o for three (3) geographical locations in Pakistan by exploiting FL that provides a global automated solution for ET_o prediction and connects agriculturists around the globe.
The study intends to explore the performance of local and global machine learning models in ET_o prediction and their relative comparison.
The study also explores the impact of different weather parameters on ET_o prediction, using the feature importance scores of the machine learning models.

The remaining sections of the paper are organized as follows: Section 2 explores existing literature and studies related to ET_o estimation, machine learning models, and relevant methodologies. Section 3 provides a detailed overview of this study’s materials, data sources, and methodologies. Section 4 provides the obtained results and discussion, and Section 5 contains a conclusion that summarizes the study’s key findings.

2 Related work

Machine learning techniques have been a focus of the research community in the agriculture domain [27–29] in recent years. This section reviews recent machine-learning-based approaches to ET_o estimation.

Dong et al. [30] proposed a solution that aims to improve the accuracy of ET_o estimation by analyzing the spatial and temporal variation of ET_o in China. The study used three ML models: multiple adaptive regression, convolution neural networks (CNN), and extreme learning machines (ELM). CNN provided the best ET_o estimation results.

Rai et al. [31] carried out a comparative analysis of various machine learning models for the prediction of monthly ET_o. They used India’s weather data for a period from 2009 to 2016. The results revealed that among all the models examined, the SVR model yielded the highest accuracy in reconstructing water requirements.

Bellido et al. [32] discussed the application of neural networks for determining ET_o in the Andalusia region of southern Spain. The multi-layer perceptron, ELM, SVM, generalized regression neural network (GRNN), RF, and XGBoost were the methods assessed in this study. The study employed performance measures such as the coefficient of determination (R²), the root mean squared error (RMSE), and the Nash-Sutcliffe model efficiency coefficient (NSE). Notably, the ELM approach emerged as the best of all models, with an R² of 0.89, an NSE of 0.89, and an RMSE of 0.67 mm day⁻¹.

Krishna et al. [33] have employed diverse cognitive computing models to predict ET_o. The study utilized various factors and concluded that the second order neural network was most accurate in predicting ET_o. It also showed low error and high accuracy with the use of RMSE and R2 values of 0.065 mm day⁻¹ and 99%, respectively.

Ayaz et al. [34] used different machine learning models in India and New Zealand. The focus of this study was to use only temperature data. They tried models such as Long Short-Term Memory (LSTM), XGBR, SVR, and RF. When using all weather data inputs, the LSTM model performed best, with 99% accuracy. When only temperature was used, accuracy dropped to 86%.

Samman et al. [35] examined the performance of four machine learning models in ET_o estimation. Data from five Iraqi stations were used as inputs to the models. SVM, RF, Bagged Trees (BaTs), and Boosted Trees (BoTs) were used for modeling daily ET_o. The RFR model provided the most accurate ET_o estimates at all sites, while SVM provided the least accurate results. RFR significantly enhanced estimation accuracy compared to the SVM, BoTs, and BaTs models across different locations. The improvement in RMSE ranged from 8% to 94% during the test period.

Mirzania et al. [36] proposed ET_o estimation approach for Australia. This study evaluated the performance of three models: the innovative Gunner algorithm, SVR, and hybrid innovative Gunner support vector regression. It was found that the AIG-SVR model outperformed, with r and RMSE corresponding to Marree Aero station values of 0.945 and 1.124, respectively, and St Helen Aerodrome stations of 0.951 and 0.476, respectively.

Khan et al. [16] proposed a method for reclaiming saline soil that employs the Internet of Things (IoT) and ML to estimate ET_o on monthly data. LSTM and ensemble LSTM models predict ETs based on field temperature, irrigation water salinity, and soil salinity. It was found that the ensemble LSTM-based model was more accurate than the single LSTM model, with an accuracy of 92% for the ET_o estimation.

Rashid et al. [37] aimed to develop an ET_o estimation method using four machine learning models with different input combinations. Across all combinations, the RF model was the most effective of the four models, with MAE, R², and RMSE values of 0.76 mm day⁻¹, 0.85, and 0.82 mm day⁻¹, respectively.

Yu et al. [38] aimed to assess the performance of different machine learning models for ET_o estimation with various input combinations, such as minimum and maximum temperatures, wind speed, solar radiation, relative humidity, atmospheric pressure, and sunshine duration. This study evaluated the performance of three machine learning models: ANN, SVR, and ELM. The SVR model proved to be the most accurate, with an R of 0.881, an RMSE of 0.925mm day⁻¹, MAE of 0.59 mm day⁻¹, and NSE of 0.744.

Zhang et al. [39] provide a detailed analysis of the characteristics of FL and its potential future. FL is important in numerous application contexts, in particular within IoT frameworks. The research also addresses the challenges of applying FL within the IoT framework. The work further outlines practical aspects of implementing FL and the need for corresponding development tools.

Using a federated approach, Manoj et al. [40] introduced crop yield prediction on distributed datasets across multiple client devices. The ResNet-16 and ResNet-28 regression models were trained with the “federated averaging” technique to ensure decentralized training. The results of these models were then compared with other deep learning, and machine learning models. This research indicates that federated averaging was effective when applied with the ResNet-16 regression model and Adam optimizer to enhance the performance.

Kumar et al. [41] address the issues of data privacy and security that affect the implementation of SA. The study proposes PEFL: private FL framework with distinctive depth of privacy encoding. To enhance privacy, PEFL employs perturbation based encoding while the short term memory-auto encoder enhances the capacity of the memory. Despite a somewhat ambiguous division between the standard and the attack pattern, PEFL outperforms the non-FL and other FL methods for the ToN-IoT dataset.

Nguyen et al. [42] focused on the growing popularity of FL in IoT networks. The study presents an overview of FL and IoT together with recent advances. It also examines the potential of FL to enable a range of IoT services, including data sharing, caching, attack detection, localization, mobile crowdsensing, and privacy protection. FL is extensively analyzed across critical IoT domains such as healthcare, transportation, UAVs, intelligent cities, and industry, highlighting its transformative impact.

Imteaj et al. [43] examined how distributed machine learning models could be trained on IoT devices with limited resources. The study describes prior research on FL and the assumptions made about its widespread use on IoT devices. It also discusses the difficulties of integrating FL into an IoT environment and presents a thorough analysis of new obstacles to using FL in diverse IoT scenarios. Recent studies have also contributed valuable insights to advancing ET_o estimation. Combining remote sensing techniques lays the foundation for using satellite data to understand river dynamics [44]. Another study [45] explores the potential of machine learning and satellite data for enhancing seasonal water supply forecasts. These studies collectively underscore the multidisciplinary approach required to refine reference evapotranspiration modeling and emphasize the importance of remote sensing data in ET_o estimation.

Although significant improvements in existing approaches (as shown in Table 1) have been observed, these approaches still suffer from limited geographical coverage, highlighting the need to address data diversity in ET_o estimation across a broader range. Despite significant advancements in the estimation of ET_o through various traditional and machine learning methods, a notable gap exists in the literature concerning the application of these approaches across multiple geographical locations with diverse weather conditions. Most existing studies are limited to localized datasets, effectively modeling ET_o for specific areas. The existing models are limited in generalizing findings across different locations with diverse weather parameters.

Table 1. Comparison of the state-of-the-art approaches.

https://doi.org/10.1371/journal.pone.0314921.t001

For instance, while numerous researchers have successfully applied machine learning techniques such as Support Vector Regression (SVR) and Random Forest Regressor (RFR) to predict ET_o in singular climatic contexts, the results are often not transferable to other regions with different meteorological characteristics. This limitation is primarily due to the inherent variability in weather parameters—such as temperature, humidity, wind speed, and solar radiation—that influence ET_o differently in distinct environments.

There is a need to integrate FL techniques to estimate ET_o across multiple locations simultaneously. By using an FL framework, this study aims to use localized data from diverse climatic conditions while ensuring model generalization. This approach not only addresses the existing limitations in the literature but also provides a comprehensive solution for improving ET_o estimation relevant to agricultural practices and water resource management across different geographical settings.

3 Material and methods

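This section details the key components and methodologies used in this ET_o estimation study based on ML algorithms and FL. The PM equation has been recognized as a standard [59]. However, it is complicated to apply because it requires many input variables. In its standard daily form, the PM equation is written as

$$ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \tag{1}$$

where ET_o is the reference evapotranspiration (mm day⁻¹), R_n is the net radiation at the crop surface (MJ m⁻² day⁻¹), G is the soil heat flux density (MJ m⁻² day⁻¹), T is the mean daily air temperature at 2 m height (°C), u_2 is the wind speed at 2 m height (m s⁻¹), e_s is the saturation vapor pressure (kPa), e_a is the actual vapor pressure (kPa), Δ is the slope of the vapor pressure curve (kPa °C⁻¹), and γ is the psychrometric constant (kPa °C⁻¹).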

A reliable ET_o estimate is essential for water resource management, agriculture, and weather sustainability. Using FL, we address the challenges associated with aggregating data from diverse geographic regions while maintaining the model’s generalizability and the data’s privacy.
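To make the PM computation concrete, the following is a minimal sketch that assumes the FAO-56 daily form of the equation. The function names, the default atmospheric pressure, and the simplification of deriving the actual vapor pressure from mean relative humidity are assumptions made for illustration; they are not taken from the study's code.

```python
import math


def saturation_vapour_pressure(t_c: float) -> float:
    """Saturation vapour pressure (kPa) at air temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def penman_monteith_eto(t_mean, rn, u2, rh_mean, g=0.0, pressure_kpa=101.3):
    """Daily reference evapotranspiration (mm/day), FAO-56 form.

    t_mean: mean air temperature (deg C); rn: net radiation (MJ m^-2 day^-1);
    u2: wind speed at 2 m (m/s); rh_mean: mean relative humidity (%);
    g: soil heat flux (MJ m^-2 day^-1), commonly taken as 0 for daily steps.
    """
    es = saturation_vapour_pressure(t_mean)
    # Assumption: actual vapour pressure approximated from mean RH.
    ea = es * rh_mean / 100.0
    delta = 4098.0 * es / (t_mean + 237.3) ** 2   # slope of the vapour pressure curve
    gamma = 0.000665 * pressure_kpa               # psychrometric constant
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


# Hypothetical inputs for one warm, moderately humid day.
print(round(penman_monteith_eto(t_mean=25.0, rn=15.0, u2=2.0, rh_mean=50.0), 2))
```

The sketch also shows why the equation is demanding in practice: net radiation, humidity, wind speed, and temperature are all needed at once, which motivates models that work with fewer inputs.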

3.1 Study area

Pakistan’s diverse climate and geography result in varying ET rates across different regions [60–64]. The study area comprises Punjab, Pakistan’s second-largest province in the eastern part of the country, as shown in Fig 2.

Fig 2. Geographical locations of the experiment cities.

https://doi.org/10.1371/journal.pone.0314921.g002

Agriculture is an important sector of Pakistan’s economy. This sector directly supports the country’s population and accounts for 26 percent of gross domestic product [65]. The major crops include cotton, wheat, rice, sugarcane, fruits, and vegetables. Multan, Faisalabad, and Rawalpindi are three major cities in Pakistan known for their fertile lands and significant contributions to the Pakistan agriculture sector. Geographically, Punjab is located between 24–37° N and 62–75° E. The majority of Punjab falls within the arid and semi-arid zones. Punjab, Pakistan, can be categorized into three distinct regions. The Multan region is characterized by arid conditions and experiences relatively high temperatures. The climate here is notably harsh. The Faisalabad area is semi-arid. The climate in Faisalabad is generally milder than that in the southern region. Rawalpindi features a tropical and semi-arid climate. Consequently, it typically experiences high temperatures. These weather distinctions within Punjab are essential when studying various weather and agricultural phenomena in the region. These distinctions significantly impact factors such as ET_o and water resource management.

3.2 Dataset

The data for this study were collected from three stations in Punjab, Pakistan. Multan is located at 30.1575° N, 71.5249° E, Rawalpindi at 33.5651° N, 73.0169° E, and Faisalabad at 31.4187° N, 73.0791° E. Daily data for 2012–2022 were obtained from NASA data sources [66], and the Penman-Monteith equation was used to calculate the daily ET_o. The selected stations cover the southern, central, and upper parts of Punjab. Daily data were collected on three key parameters: maximum temperature (Tmax), wind speed (WS), and relative humidity (RH). The weather in these areas is distinct, with significant differences in weather characteristics. Figs 3–5 show separate plots for the three datasets. These 3D scatter plots visualize the relationships between the selected features (Tmax, WS, and RH) and ET_o. Adding climatic conditions to traditional ML models is helpful but does not address challenges such as data centralization, regional variability, and privacy. FL addresses these challenges, making it well suited to ET_o estimation in distributed crop fields.

Fig 3. Relationship of the selected features with ET_o at Multan dataset.

https://doi.org/10.1371/journal.pone.0314921.g003

Fig 4. Relationship of the selected features with ET_o at Faisalabad dataset.

https://doi.org/10.1371/journal.pone.0314921.g004

Fig 5. Relationship of the selected features with ET_o at Rawalpindi dataset.

https://doi.org/10.1371/journal.pone.0314921.g005

The dataset displays diverse weather characteristics at Multan, Faisalabad, and Rawalpindi (presented in Table 2). Temperatures in Multan and Faisalabad are higher, with Multan recording the highest Tmax at 50°C and Faisalabad following closely at 49.5°C. Rawalpindi, on the other hand, is comparatively milder, with a Tmax of 47.4°C. Rawalpindi shows the maximum RH of 94.5% and Multan the lowest minimum RH of 7.3%. WS varies significantly between the cities, with Faisalabad having the highest maximum WS of 5.67 m s⁻¹. The ET_o rate varies greatly in Multan, indicating substantial water loss. However, the rate is lower in Faisalabad and Rawalpindi due to their relatively milder climates. These observations underscore the diverse meteorological conditions among the regions.

Table 2. Summary statistics of weather variables in Multan, Faisalabad, and Rawalpindi.

https://doi.org/10.1371/journal.pone.0314921.t002

The diagram presented in Fig 6 consists of a set of violin plots describing the data distribution of five climatic variables. It shows the probability density of the data at different values, with the plot’s width representing the density and a central box plot indicating the interquartile range, median, and potential outliers. Each subplot is labeled from A to E, with each violin plot showing the data distribution. Faisalabad and Multan show a wider spread of Tmax values from around 10°C to 50°C while mean temperatures range approximately from 10°C to 40°C. Faisalabad and Multan have similar distributions of wind speeds reaching up to 7 m/s. Faisalabad and Multan display a broader range of evapotranspiration values, with distributions extending up to around 15 mm/day. Rawalpindi’s ET_o values show fewer variations in evapotranspiration rates. These values are significant for understanding regional climatic conditions and their agricultural implications.

Fig 6. Violin plots for each variable in each dataset.

https://doi.org/10.1371/journal.pone.0314921.g006

Scatter plots are generated to allow comparison and visualization of data characteristics across datasets, as shown in Figs 7–9. These figures illustrate how the datasets are associated with the ET_o estimates and show the different statistical properties of each dataset, enabling a clear understanding of their characteristics. Variations in data distributions could significantly impact the ability of FL models to generalize and predict ET_o. ET_o is simulated more accurately when Tmax, WS, and RH are included as model inputs, since these variables correlate strongly with ET_o. These factors directly affect ET_o because they govern the rate at which water converts from liquid to vapor. Including these parameters in ET_o models allows better capture of the complex dynamics of water loss processes.

Fig 7. Distribution plot of Multan dataset.

https://doi.org/10.1371/journal.pone.0314921.g007

Fig 8. Distribution plot of Faisalabad dataset.

https://doi.org/10.1371/journal.pone.0314921.g008

Fig 9. Distribution plot of Rawalpindi dataset.

https://doi.org/10.1371/journal.pone.0314921.g009

The correlation plots are shown in Figs 10–12. Correlation plots visually represent the relationships between variables in a dataset, often using color-coded matrices. These plots display the strength and direction of correlations, with the intensity of color or the slope of the trend line indicating the degree of positive or negative correlation between pairs of variables. In our study, these correlation plots show the relationships between ET_o and feature sets. It is evident from the correlation results that there is a strong relationship between weather parameters and ET_o. Fig 13 shows weather parameters across different datasets over the years.
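As an illustration of what such a correlation matrix contains, the sketch below computes pairwise Pearson coefficients in plain Python. The sample values are made up for demonstration and are not drawn from the study's datasets; the function names are likewise hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of Pearson coefficients for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Illustrative, made-up daily values (NOT taken from the study's data).
site = {
    "Tmax": [20.0, 25.0, 30.0, 35.0, 40.0],
    "WS": [1.5, 2.0, 1.8, 2.6, 3.0],
    "RH": [70.0, 62.0, 55.0, 40.0, 30.0],
    "ETo": [2.1, 3.0, 3.9, 5.6, 7.0],
}
matrix = correlation_matrix(site)
for name in site:
    print(name, round(matrix[name]["ETo"], 2))
```

A coefficient near +1 or -1 corresponds to the most intense cells in a color-coded correlation plot, while values near 0 correspond to the faintest ones.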

Fig 10. Correlation among climatic variables at Multan dataset.

https://doi.org/10.1371/journal.pone.0314921.g010

Fig 11. Correlation among climatic variables at Faisalabad dataset.

https://doi.org/10.1371/journal.pone.0314921.g011

Fig 12. Correlation among climatic variables at Rawalpindi dataset.

https://doi.org/10.1371/journal.pone.0314921.g012

Fig 13. Comparison of input variables and ET_o over the years at all sites.

https://doi.org/10.1371/journal.pone.0314921.g013

3.3 Machine learning models

3.3.1 Decision Tree Regressor (DTR).

DTR is a decision tree model used for supervised regression tasks. It works by building a tree of decisions based on features in the dataset, such as temperature, wind speed, or humidity. DTR recursively divides the data into smaller groups until the data cannot be divided further or a stopping criterion is met. DTR accommodates various data types and is adept at capturing complex non-linear relationships within the dataset.
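The splitting step at the heart of a regression tree can be sketched as follows. This is a simplified illustration on a single feature, assuming the split is chosen to minimize the summed squared error of the two child groups (a common criterion for regression trees); the function names and sample values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(feature, target):
    """Return (threshold, cost) of the split minimising child squared error.

    threshold is None if no split improves on leaving the node unsplit.
    """
    pairs = sorted(zip(feature, target))
    best_threshold, best_cost = None, sse(list(target))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost


# Made-up example: ET_o jumps once Tmax exceeds roughly 26 deg C.
tmax = [18.0, 22.0, 31.0, 36.0]
eto = [2.0, 2.4, 5.8, 6.1]
print(best_split(tmax, eto))
```

A full tree applies the same search recursively to each child group, over all features, until a stopping criterion such as a maximum depth or minimum group size is reached.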

3.3.2 Random Forest Regressor (RFR).

RFR uses an ensemble learning approach and acts like a team of DTR models working together to improve prediction. It combines many tree models and gives a final prediction by averaging the predictions of the individual trees. Each tree in the forest is trained on a random subset of the training data, and for each tree split, a random subset of features is considered. This ensemble approach helps to prevent the model from memorizing the training data (overfitting), which could lead to poor predictions on new data. RFR can handle large datasets with high dimensionality.
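The two sources of randomness and the averaging step can be illustrated with a deliberately small sketch. It assumes depth-1 trees (stumps), one randomly chosen feature per tree, and made-up data; a practical forest grows deeper trees and would normally come from a library. All names below are hypothetical.

```python
import random


def fit_stump(rows, targets, feature):
    """Depth-1 regression tree on one feature: (feature, threshold, left_mean, right_mean)."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][feature])
    best = None
    for k in range(1, len(order)):
        lo, hi = rows[order[k - 1]][feature], rows[order[k]][feature]
        if lo == hi:
            continue
        left = [targets[i] for i in order[:k]]
        right = [targets[i] for i in order[k:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        cost = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or cost < best[0]:
            best = (cost, (lo + hi) / 2, lm, rm)
    if best is None:  # all feature values identical: always predict the mean
        mean = sum(targets) / len(targets)
        return (feature, float("inf"), mean, mean)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample of rows
        feature = rng.randrange(n_features)          # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    preds = [lm if row[f] <= thr else rm for f, thr, lm, rm in forest]
    return sum(preds) / len(preds)


# Made-up rows of [Tmax, WS, RH] with made-up ET_o targets.
rows = [[20.0, 1.0, 70.0], [25.0, 1.5, 60.0], [30.0, 2.0, 50.0],
        [35.0, 2.5, 45.0], [40.0, 3.0, 35.0], [45.0, 4.0, 20.0]]
targets = [2.0, 3.0, 4.0, 5.0, 6.5, 8.0]
forest = fit_forest(rows, targets)
print(round(predict_forest(forest, [38.0, 2.8, 40.0]), 2))
```

Because every tree sees a different resample and a different feature, their individual errors partly cancel when averaged, which is the mechanism behind the reduced overfitting described above.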

3.3.3 Support Vector Regressor (SVR).

SVR is a supervised machine learning method for regression tasks. SVR works by fitting a function (a hyperplane in feature space) to the data such that most points lie within a tolerance margin around it, while a limited number of points are allowed to fall outside. SVR focuses on controlling how far and how many points fall outside this margin rather than trying to make every prediction perfect. SVR is memory efficient, since only the support vectors are used in the decision function. Kernel functions allow it to model non-linear relationships in regression tasks.
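The idea of tolerating small errors can be made concrete with the epsilon-insensitive loss commonly associated with SVR. The sketch below only evaluates that loss for given predictions; it does not train a model, and the margin width `epsilon` and the sample values are assumptions chosen for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean loss that ignores errors smaller than epsilon."""
    penalties = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(penalties) / len(penalties)


def outside_margin(y_true, y_pred, epsilon=0.1):
    """Indices of points whose error exceeds the epsilon margin."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred))
            if abs(t - p) > epsilon]


# Made-up ET_o observations and predictions (mm/day).
observed = [3.0, 4.2, 5.1, 6.0]
predicted = [3.05, 4.0, 5.1, 6.6]
print(epsilon_insensitive_loss(observed, predicted))
print(outside_margin(observed, predicted))
```

Only the points reported by `outside_margin` contribute to the loss, which mirrors how a small subset of training points (the support vectors) determines the fitted function.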

3.4 Feature importance analysis

Feature importance analysis determines the impact of the different weather parameters on the ET_o of each location. The Gini impurity metric is used to assess the importance of the weather parameters in predicting the ET_o of each location. Gini impurity is a commonly used criterion in decision tree algorithms to evaluate the quality of a split at each node. The Gini impurity for a dataset is calculated using Eq 2:

$$Gini = 1 - \sum_{i=1}^{n} p_i^2 \tag{2}$$

Where:

p_i is the proportion of instances in class i relative to the total number of instances.
n is the number of distinct values (classes) of ET_o.

The Gini impurity is calculated for the parent and child nodes after the split when a feature is used to split the data at a node. The decrease in Gini impurity resulting from this split indicates how well the feature separates the data into different ET_o value ranges. The more significant the reduction in impurity, the more important the weather feature is considered for ET_o determination. The total decrease in Gini impurity is calculated for each feature in the model, which is determined by splits on that feature across all trees in the Random Forest model. The total decline is normalized by the number of trees to determine the average Gini importance score for each feature.
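The impurity decrease described above can be sketched as follows. Since Gini impurity is defined over discrete classes, this illustration assumes the ET_o values have been binned into ranges such as "low" and "high"; the bin labels, function names, and the example split are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_i^2) over the classes present in labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (len(left) / n * gini_impurity(left)
                         + len(right) / n * gini_impurity(right))
    return gini_impurity(parent) - weighted_children


# Made-up node: binned ET_o classes before and after a split on Tmax.
parent = ["low", "low", "low", "high", "high", "high"]
left = ["low", "low", "low"]          # samples with Tmax below the threshold
right = ["high", "high", "high"]      # samples with Tmax above the threshold
print(gini_impurity(parent))          # 0.5 for two equally frequent classes
print(gini_decrease(parent, left, right))
```

Summing these decreases over every split that uses a given feature, and averaging over the trees, gives the per-feature importance score. Note that library implementations of random forest regression commonly measure impurity with squared error rather than Gini, so the binned formulation here is an illustrative assumption.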

The resulting scores indicate the relative importance of each weather parameter in predicting ET_o. A higher Gini importance score suggests a feature significantly impacts the ET_o predictions.

The Gini impurity criterion is used to assess the importance of weather parameters for ET_o estimation because it can handle continuous variables without considering data distribution. Gini impurity is also computationally efficient and allows seamless integration with RF models to apply an ensemble learning approach to enhance the predictive accuracy of the model.

3.5 Federated Learning (FL) framework for ET_o estimation

The proposed FL framework adopts a decentralized architecture, enabling multiple clients to collaboratively train a global model for estimating evapotranspiration (ET_o). Each client is responsible for training on its local dataset, which contains weather parameters relevant to ET_o estimation, as shown in Fig 14. The central server orchestrates the training process by coordinating model updates across clients. This collaborative approach is particularly advantageous for estimating ET_o in distributed crop fields, where local weather data varies, and direct data centralization may not be feasible due to privacy, bandwidth, or regulatory concerns.

Fig 14. FL-based ET_o prediction flow chart.

https://doi.org/10.1371/journal.pone.0314921.g014

3.5.1 FL framework design and methodology.

The core of the framework involves three key components: client initialization, local training, and global aggregation. The workflow of the federated learning process applied to ET_o estimation is as follows:

Client Initialization: Each client initializes its local model by receiving parameters from the global model, which is maintained by the central server. The local model represents an initial estimate for ET_o based on the global understanding of weather patterns.
Local Training: Clients train their local models using their respective datasets, which include historical weather data such as maximum temperature, wind speed, and relative humidity collected over the period from 2012 to 2022. Each client optimizes its local model parameters based on a loss function defined specifically for ET_o prediction. The loss function typically measures the error between predicted and actual ET_o values at the local level, helping each client refine its model.
Model Update: After local training, clients compute updates to their model parameters. These updates are derived from the gradient of the loss function with respect to the local model parameters, representing the direction and magnitude of adjustments needed to improve the model’s accuracy. Clients then send these updates to the central server for aggregation.
Aggregation: The server aggregates the updates from all clients to form a new global model, which incorporates the knowledge from all participating regions. The aggregation process involves averaging the model parameter updates from each client. This ensures that the global model reflects both the local weather conditions (which vary by region) and the generalizable patterns across all locations. The typical optimization objective in FL can be expressed as:
$$\min_{\theta} F(\theta) = \frac{1}{n}\sum_{i=1}^{n} L_i(\theta) \tag{3}$$
where F(θ) is the global loss function to be minimized, representing the overall model performance across all clients, θ represents the global model parameters, n is the total number of participating clients, and L_i(θ) is the local loss function for client i, which is computed using the client’s local dataset and global model parameters θ.
The goal of FL is to minimize the average of these local losses across all clients, ensuring that the global model performs well across diverse weather conditions.
The process of aggregation is mathematically represented by the following equation for the global model update:
$$\Delta\theta_{\text{global}} = \frac{1}{n}\sum_{i=1}^{n} \Delta\theta_i \tag{4}$$
where n is the number of participating clients, Δθ_i is the update computed by client i based on its local training, and Δθ_global represents the aggregated update applied to the global model.
This aggregation ensures that the final model is a synthesis of all local models, enhancing the model’s ability to generalize across different climates and geographical regions.
Clients and Communication Process: In this study, three clients represent three distinct geographical locations: Multan, Faisalabad, and Rawalpindi. Each client collects weather data over the period from 2012 to 2022, focusing on parameters like maximum temperature, wind speed, and relative humidity. The communication process is designed to minimize the need for large-scale data transfer and ensure that local data privacy is maintained. The communication process includes the following steps:
(a) Data Characteristics: Each client’s dataset is unique, reflecting local weather conditions and variations in ET_o across the regions. This diversity in the data is essential for training a robust and generalized model that can adapt to different climates.
(b) Communication Rounds: The FL process consists of 20 communication rounds. In each round, the central server distributes the updated global model to all clients and receives their local model updates. This iterative process continues until convergence is achieved, meaning that the global model performs satisfactorily across all regions.
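The relationship between averaging updates and averaging parameters can be made explicit. Assuming that each client’s update is the difference between its locally trained parameters θ_i and the global parameters θ it received at the start of the round (an assumption; the text does not define Δθ_i in this form), applying the averaged update to θ gives the same result as averaging the client parameters directly:

```latex
\theta^{\text{new}}
  = \theta + \Delta\theta_{\text{global}}
  = \theta + \frac{1}{n}\sum_{i=1}^{n}\left(\theta_i - \theta\right)
  = \frac{1}{n}\sum_{i=1}^{n}\theta_i
```

Under this assumption, with the three clients of this study (n = 3), the new global model is simply the mean of the Multan, Faisalabad, and Rawalpindi parameter vectors.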

FL training is summarized in Algorithm 1. The FL algorithm starts with Initialization (lines 1-3), where the global model parameters (w) are set and the learning rate (α = 0.025), regularization strength (λ = 0.05), and number of training rounds (T = 20) are defined. The optimal values are found using the random search method. During the Training Process (lines 4-11), for each training round t (line 4), each client i (line 5) updates its local model parameters based on the global model w and its local dataset (X_i, Y_i). This update adjusts the local model parameters according to the gradients of the local loss function and the regularization term. After all clients have completed their local updates, the Global Model Update (lines 9-10) takes place, where the global model parameters are updated by averaging the parameters from all clients. Finally, the Convergence Check (lines 12-14) determines whether the model has converged or whether training should continue, up to the specified number of rounds T.

Algorithm 1 FL with three clients

1: Initialization:

2: Initialize global model parameters w

3: Initialize learning rate α, regularization strength λ, and number of training rounds T End Initialization

4: for t = 1 to T do

5: for each client i do

6: Update local model parameters w_i based on the global model w and the local dataset (X_i, Y_i)

7: w_i ← w − α(∇L(w, X_i, Y_i) + λ∇R(w)) ▷ Where L(w, X_i, Y_i) is the local loss function, and R(w) is the regularization term

8: end for

9: After all clients have updated their local models, update the global model as follows:

10: w ← (1/n) Σ_{i=1}^{n} w_i ▷ Where n is the number of clients

11: end for

12: Convergence Check:

13: Check for convergence or end the training after T rounds

14: End Convergence Check
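The loop in Algorithm 1 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the TensorFlow Federated implementation used in the study: the model is a linear regressor, the local loss is mean squared error, R(w) = ½‖w‖² (so ∇R(w) = w), the regularized step is taken as w − α(∇L + λ∇R), each client takes one gradient step per round, and the client data are hypothetical scaled values. Only α = 0.025, λ = 0.05, and T = 20 come from the text.

```python
def local_update(w, X, y, alpha, lam):
    """One regularized gradient step on a single client's data."""
    n = len(X)
    grad = [0.0] * len(w)
    for row, target in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, row)) - target
        for j, xj in enumerate(row):
            grad[j] += 2.0 * err * xj / n  # gradient of mean squared error
    # grad R(w) = w for the assumed L2 regularizer
    return [wj - alpha * (gj + lam * wj) for wj, gj in zip(w, grad)]


def federated_train(clients, n_features, alpha=0.025, lam=0.05, rounds=20):
    """Run a fixed number of rounds; each round averages client parameters."""
    w = [0.0] * n_features  # initialize global model parameters
    for _ in range(rounds):
        local = [local_update(w, X, y, alpha, lam) for X, y in clients]
        w = [sum(ws) / len(local) for ws in zip(*local)]  # global update
    return w


def mse(w, X, y):
    preds = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


# Hypothetical clients: rows are [bias, scaled maximum temperature],
# targets are ET_o in mm/day. Raw data never leave the client lists;
# only parameter vectors are exchanged.
clients = [
    ([[1.0, 0.9], [1.0, 0.7], [1.0, 0.5]], [6.5, 5.2, 4.0]),
    ([[1.0, 0.8], [1.0, 0.6], [1.0, 0.4]], [5.8, 4.6, 3.5]),
    ([[1.0, 0.6], [1.0, 0.4], [1.0, 0.2]], [4.4, 3.3, 2.2]),
]
w_global = federated_train(clients, n_features=2)
print(w_global, [round(mse(w_global, X, y), 3) for X, y in clients])
```

The sketch always runs the full T rounds; an early-stopping convergence check, such as stopping when the change in w falls below a tolerance, could be added inside the loop.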

3.6 Evaluation metrics

To test the reliability of the results, the models were trained and tested using 10-fold cross-validation to identify the best-performing model. Evaluation metrics were then computed for each model to compare their performance. This study used R², RMSE, MAE, MAPE, and NSE as evaluation metrics to assess the performance of the machine learning models. These metrics quantify how well a model estimates ET_o. R² measures the model’s goodness of fit to the observed ET_o values, that is, the proportion of the variance in ET_o that the model can explain. The R² value is obtained by Eq (5).

$$R^{2} = 1 - \frac{SSR}{SST} \tag{5}$$

where SSR is the sum of squared residuals (the squared differences between predicted and observed ET_o), and SST is the total sum of squares (the squared differences between observed ET_o and its mean).

RMSE represents the average magnitude of the errors between predicted and observed ET_o values. The RMSE of the proposed model’s predictions can be formalized as follows:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\text{Predicted}\,ET_{o,i} - \text{Observed}\,ET_{o,i}\right)^{2}} \tag{6}$$

where Predicted ET_o is the ET_o estimated by the proposed model, Observed ET_o is the actual observed ET_o, and n is the number of data points. In this equation, i is an index that corresponds to each specific data point in the dataset.

MAE measures the average absolute magnitude of the errors between predicted and observed ET_o values and can be formalized as follows:

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|\text{Predicted}\,ET_{o,i} - \text{Observed}\,ET_{o,i}\right| \tag{7}$$

where Predicted ET_o is the value estimated by the proposed model and Observed ET_o is the value obtained from real-world data. Here, i is an index that represents each data point in the dataset, and n is the total number of data points. The equation aggregates all the individual absolute errors into one average error.

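MAPE computes the average percentage difference between predicted and actual ET_o values and can be formalized as follows:

$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\text{Observed}\,ET_{o,i} - \text{Predicted}\,ET_{o,i}}{\text{Observed}\,ET_{o,i}}\right| \tag{8}$$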

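Nash-Sutcliffe Efficiency (NSE) is used to measure the performance of ML-based models. It assesses the predictive accuracy of a model by comparing the model’s predictions to observed data. NSE is mathematically expressed by Eq (9).

$$NSE = 1 - \frac{\sum_{t=1}^{n}\left(\text{Observed}\,ET_{o,t} - \text{Predicted}\,ET_{o,t}\right)^{2}}{\sum_{t=1}^{n}\left(\text{Observed}\,ET_{o,t} - \overline{ET_{o}}\right)^{2}} \tag{9}$$

where $\overline{ET_{o}}$ is the mean of the observed ET_o values, and n is the total number of observations. Moreover, t is an index representing each observation in the dataset.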
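The five metrics can be computed together as in the following sketch, which uses the standard textbook definitions. The function name and the sample values are hypothetical and are not taken from the study’s datasets.

```python
from math import sqrt


def evaluate(observed, predicted):
    """Return R2, RMSE, MAE, MAPE (percent), and NSE for paired values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ssr = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # squared residuals
    sst = sum((o - mean_obs) ** 2 for o in observed)              # total sum of squares
    return {
        "R2": 1.0 - ssr / sst,
        "RMSE": sqrt(ssr / n),
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "MAPE": 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted)),
        "NSE": 1.0 - ssr / sst,
    }


# Hypothetical ET_o values in mm/day; every prediction is off by 0.5.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(evaluate(observed, predicted))
# R2 = NSE = 0.95, RMSE = MAE = 0.5, MAPE is about 13.02
```

When R² is computed as one minus the ratio of residual to total sum of squares, it is numerically identical to NSE evaluated on the same predictions, which is why the two entries share one expression here. MAPE is undefined when an observed value is zero, so such records would need to be handled separately.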

4 Results

The study examined three machine learning models: RFR, DTR, and SVR. The experiments use three separate weather datasets from Multan, Faisalabad, and Rawalpindi. Performance was evaluated using key metrics, including R², RMSE, MAE, MAPE, and NSE, and the results compare the models across the different geographical locations. Notably, to our knowledge, the proposed method is the first to estimate ET_o for distributed fields using FL. Therefore, it is compared with traditional machine learning models instead of baseline models. At Multan, the RFR model achieved the highest R² and NSE value of 0.98 and a MAPE of 6.72%, indicating an excellent fit to the data. The low RMSE = 0.42 and MAE = 0.32 mm day⁻¹ reveal the model’s ability to predict ET_o accurately. The SVR and DTR models also performed well, with R² values above 0.95 and low error metrics. With the Faisalabad dataset, the RFR model achieved a high R² and NSE value of 0.97.

For Faisalabad, the RFR outperformed other models with RMSE = 0.31, MAE = 0.23 mm day⁻¹, and MAPE of 5.21%. In the case of the Rawalpindi dataset, the RFR model outperformed other models with R² = 0.96, NSE = 0.96, and MAPE = 8.31%, indicating a good fit for the model. The RFR also exhibits a low RMSE = 0.37 and MAE = 0.28 mm day⁻¹ in ET_o predictions. The DTR and SVR models also performed reasonably with R² values = 0.93, but they exhibit higher errors than the RFR model. A Kruskal-Wallis test was also performed to compare the performance of RFR, SVR, and DTR on the evaluation metrics R², RMSE, MAE, MAPE, and NSE. The results of the test are described in Table 3. The p-values for all metrics are greater than 0.05, which indicates no significant difference in performance among the models across the local and the federated model. The RMSE and MAE values are closer to the threshold value. Moreover, an ANOVA was performed for the reliability analysis. The ANOVA results give an F-ratio of 14.6 and a p-value of 0.000077, which is significant at p < 0.05.
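For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic and p-value for three groups in plain Python. It is an illustration only: the input values are hypothetical and are not the entries of Table 3, no tie correction is applied, and the p-value uses the chi-square approximation with 2 degrees of freedom, whose survival function is exp(−H/2).

```python
from math import exp


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for exactly three groups, without tie correction."""
    groups = [a, b, c]
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)
    return h, exp(-h / 2.0)  # chi-square survival function for df = 2


# Hypothetical error values for three models at three sites.
group_a = [0.42, 0.31, 0.37]
group_b = [0.45, 0.36, 0.44]
group_c = [0.47, 0.38, 0.46]
h, p = kruskal_wallis_three_groups(group_a, group_b, group_c)
print(round(h, 3), round(p, 3))  # 3.2 0.202
```

With only three observations per group the test has little power, so a p-value above 0.05 in such a setting shows that a difference was not detected rather than that the models are equivalent.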

Table 3. Kruskal-Wallis test results.

https://doi.org/10.1371/journal.pone.0314921.t003

The radar charts in Fig 15 show how the three models (RFR, SVR, DTR) perform on the five metrics for each city. The models’ performance across the different metrics is relatively consistent within each city, with no significant outliers. In the federated approach shown in Fig 16, RFR and SVR show better performance in terms of lower errors (MAE, MAPE, RMSE) and higher R² and NSE, while DTR has higher error metrics and lower R² and NSE values on average. A radar chart was selected instead of a Smith chart because it gives a clearer comparison of multiple variables within a single category.

Fig 15. Performance comparison of ML algorithms across multiple evaluation metrics.

https://doi.org/10.1371/journal.pone.0314921.g015

Fig 16. Performance of federated learning on different ML models across multiple evaluation metrics.

https://doi.org/10.1371/journal.pone.0314921.g016

The correlation and standard deviation of the different models are shown in the Taylor diagram in Fig 17, which offers a comprehensive overview of model accuracy and variability. Fig 17 suggests that the federated approach generally performs well with RFR and SVR, whereas DTR shows lower performance.

Fig 17. Performance evaluation of different models using Taylor diagram on various datasets.

https://doi.org/10.1371/journal.pone.0314921.g017

The error boxplots are presented in Fig 18. They demonstrate that RFR consistently outperforms DTR and SVR across all locations and performance metrics. RFR exhibits the highest R² and NSE values and the lowest MAPE, MAE, and RMSE values, representing greater predictive accuracy and reliability. DTR generally shows the poorest performance, with the lowest R² and NSE values and the highest MAPE, MAE, and RMSE values. SVR falls in between, performing better than DTR but not as well as RFR. The FL models show a similar trend, with RFR having the lowest MAPE values, as represented in Fig 19. In the federated setting, DTR generally performs better than SVR in terms of RMSE and MAE, but worse in terms of R², NSE, and MAPE.

Fig 18. Box plot analysis of evaluation metrics for RFR, DTR, and SVR models.

https://doi.org/10.1371/journal.pone.0314921.g018

Fig 19. Box plot analysis of evaluation metrics for RFR, DTR, and SVR models using federated learning.

https://doi.org/10.1371/journal.pone.0314921.g019

A feature-importance-based analysis is also performed to determine the impact of different weather parameters on ET_o. The feature importance analysis is shown in Fig 20. This analysis helps clarify the relationship between weather features and ET_o. It was found that Tmax and WS were the most influential parameters for ET_o determination. The FL approach is also compared with separately trained traditional machine learning models. Within the federated setting, the three regression models, RFR, SVR, and DTR, show varying performance across the evaluation criteria. The RFR outperformed other models with an R² value of 0.97 while maintaining lower errors with an RMSE of 0.44, MAE of 0.33 mm day⁻¹, and MAPE of 8.18%. The DTR results closely followed the RFR results, with R² = 0.96 and similar error values of RMSE = 0.48, MAE = 0.35 mm day⁻¹, and MAPE = 8.50%. This federated model, with R² = 0.97 and RMSE = 0.44, can explain a large part of the variance in the target variable. In the FL approach, RMSE is higher than that of the best-performing individual model (e.g., Multan’s RFR at RMSE 0.4064). The difference is relatively small, and the federated model still provides accurate predictions. The MAE of the federated model is 0.33 mm day⁻¹ and the MAPE is 8.18%, demonstrating its ability to provide accurate estimates of ET_o. Performance metrics of the different models across all the datasets are presented in Table 4.

Fig 20. Features importance analysis.

https://doi.org/10.1371/journal.pone.0314921.g020

Table 4. Performance metrics of different models across datasets.

https://doi.org/10.1371/journal.pone.0314921.t004

4.1 Discussion

The study implemented the ML and FL models in Python on Google Colab using the Scikit-learn, Keras, and TensorFlow Federated libraries. The configuration included a single-core hyper-threaded Xeon processor, 12.72 GB of RAM, and an NVIDIA Tesla K80, P100, or T4 GPU (depending on availability), providing substantial computational resources. Training time can vary with internet connectivity and the availability of Google services. The RFR model consistently performed excellently across all three datasets, with high R² values and low error metrics. SVR and DTR models also produced competitive results, but RFR often outperformed them. Many factors influence model selection, including the specific application, interpretability, and computational capacity. The FL approach combines knowledge learned from multiple regions and proved highly effective, especially when using the RFR. RFR is the most reliable model for accurate ET_o estimation in diverse crop field settings. This finding is crucial for agricultural management, enabling more precise water resource optimization and planning. The high R² and low error metrics of RFR indicate its strong capability to handle the spatial variability in ET_o data, making it a valuable tool for improving irrigation efficiency and crop yields. DTR showed more variability in its performance. Although DTR can capture non-linear relationships, it did not perform as consistently as RFR. Regarding error metrics, the Support Vector Regressor (SVR) generally lagged behind RFR and DTR. This is likely due to its sensitivity to kernel choice and hyperparameters. The findings emphasize the importance of using the right machine learning models for particular geographical locations when estimating ET_o values. They also show how well machine learning models predict ET_o for distributed crop fields.
The RFR-based model seems appropriate for accurate ET_o predictions, but the application’s requirements should be considered when choosing the final model. This study also uses feature importance analysis to prioritize and select the most suitable features for the ET_o prediction model. A model that focuses on the most significant features may reduce training time and improve interpretability. Identifying low-value features may also be useful in data preparation and quality control. FL is beneficial when dealing with data distributed across multiple locations or clients, such as geographic regions. It allows models to be trained locally on specific datasets while preserving data privacy. The federated global model is built from three different datasets (Multan, Faisalabad, and Rawalpindi). The obtained results indicate strong performance across multiple evaluation metrics. The R² value of the federated model is comparable to that of the best-performing individual models on each dataset, which indicates the excellent generalization capabilities of the federated model. FL preserves privacy, although the RMSE of the federated model is slightly higher than that of the best individual models. The trade-off is acceptable given the accuracy of the predictions, and the federated model’s MAE and MAPE for estimating ET_o across multiple locations remain reliable. Compared with the individual models for each dataset, the FL method can effectively predict ET_o. It can be beneficial when data are distributed across multiple locations and model generalization is a key concern. It is important to consider the particular requirements of the application when choosing between individual models and FL. Federated models may exhibit slightly higher RMSE, MAE, and MAPE, indicating a modest compromise in accuracy.
However, this trade-off enhances their ability to generalize across diverse datasets, making them more robust, albeit less optimized for specific individual datasets. Distributed learning can be an effective method for geographically dispersed data, but in many real-world applications, transferring local weather data to a central location for model training may be infeasible due to bandwidth limitations and data security regulations. Federated learning addresses these concerns by allowing models to be trained locally at each site.

The study focuses on three locations in Pakistan, each characterized by distinct weather conditions. Expanding the research to encompass broader geographical areas could significantly enhance the model’s adaptability and generalization. Moreover, training a model with data from various regions, each with unique weather conditions, geographical features, and farming practices, is essential for achieving high accuracy. For future work, it is recommended to implement the proposed solution in regions with even more diverse weather conditions and to incorporate advanced deep learning approaches to refine the model’s performance further.

5 Conclusion

The study proposed an FL approach for estimating reference evapotranspiration (ET_o) across multiple locations with distinct weather parameters. By employing various machine learning algorithms, including support vector machines, decision tree regression, and random forest regression (RFR), the research aimed to analyze and predict ET_o effectively. The results demonstrated that the RFR model consistently outperformed other models at local and global levels, highlighting its robustness in ET_o predictions. Feature importance analysis identified maximum temperature and wind speed as key weather parameters influencing ET_o estimation. This research offers insights into the complex relationships between weather variables and ET_o. However, the model’s adaptability might be limited by the study’s focus on three specific locations in Pakistan. Future work should explore the application of this approach in regions with more diverse weather conditions and consider the integration of deep learning techniques for further improvement.

The study proposed an FL approach for ET_o estimation of multiple locations with distinct weather parameters. Various machine learning algorithms were used to analyze and predict Reference Evapotranspiration (ET_o), including Support Vector Machines (SVM), DTR, and RFR. The implementation of the proposed solution reveals that the RFR model outperformed the other models at local and global levels with R² = 0.95, MAPE = 10.35, RMSE = 0.49, NSE = 0.95, and MAE = 0.38 (mm day⁻¹). The performance of federated learning is satisfactory for estimating ET_o with a single machine learning model trained using data from different locations. In the case of local models, the RFR model for the Multan dataset achieved R² = 0.98, MAPE = 6.72, RMSE = 0.42, NSE = 0.98, and MAE = 0.30 (mm day⁻¹). For the Faisalabad dataset, the RFR model achieved R² = 0.97, MAPE = 5.46, NSE = 0.97, RMSE = 0.32, and MAE = 0.24 (mm day⁻¹). For the Rawalpindi dataset, the RFR model achieved R² = 0.96, MAPE = 8.31, RMSE = 0.37, NSE = 0.96, and MAE = 0.27 (mm day⁻¹). The RFR-based model outperformed the SVR and DTR in ET_o predictions at global and local levels. A feature importance analysis revealed that maximum temperature and wind speed are the dominant weather parameters in ET_o estimation. The study provides a deeper understanding of the relationships between weather parameters and reference evapotranspiration. The focus on three locations in Pakistan may limit the model’s adaptability to other regions. Implementing the solution in areas with more diverse weather conditions and utilizing deep learning approaches are recommended for future work. The findings of this study underscore the potential of FL in enhancing ET predictions across varying climatic conditions, paving the way for improved agricultural management practices.

References

1. Boobalan P, Ramu SP, Pham QV, Dev K, Pandya S, Maddikunta PKR, et al. Fusion of federated learning and industrial Internet of Things: A survey. Computer Networks. 2022;212:109048.
- View Article
- Google Scholar
2. Mostafa RR, Kisi O, Adnan RM, Sadeghifar T, Kuriqi A. Modeling potential evapotranspiration by improved machine learning methods using limited climatic data. Water. 2023;15(3):486.
- View Article
- Google Scholar
3. Xie C, Chen PY, Zhang C, Li B. Improving privacy-preserving vertical federated learning by efficient communication with admm. arXiv preprint arXiv:220710226. 2022.
- View Article
- Google Scholar
4. Zouzou Y, Citakoglu H. General and regional cross-station assessment of machine learning models for estimating reference evapotranspiration. Acta Geophysica. 2023;71(2):927–947.
- View Article
- Google Scholar
5. Zhuge W, Yue Y, Shang Y. Spatial-temporal pattern of human-induced land degradation in Northern China in the Past 3 decades RESTREND approach. International Journal of Environmental Research and Public Health. 2019;16(13):2258. pmid:31248024
- View Article
- PubMed/NCBI
- Google Scholar
6. Cobaner M, Citakoğlu H, Haktanir T, Kisi O. Modifying Hargreaves–Samani equation with meteorological variables for estimation of reference evapotranspiration in Turkey. Hydrology Research. 2017;48(2):480–497.
- View Article
- Google Scholar
7. Zotarelli L, Dukes MD, Romero CC, Migliaccio KW, Morgan KT. Step by step calculation of the Penman-Monteith Evapotranspiration (FAO-56 Method). Institute of Food and Agricultural Sciences University of Florida. 2010;8.
- View Article
- Google Scholar
8. Manikumari N, Vinodhini G, Murugappan A. Modelling of reference evapotransipration using climatic parameters for irrigation scheduling using machine learning. Hydrological Sciences Journal. 2020;65(16):2669–2677.
- View Article
- Google Scholar
9. Han X, Wei Z, Zhang B, Li Y, Du T, Chen H. Crop evapotranspiration prediction by considering dynamic change of crop coefficient and the precipitation effect in back-propagation neural network model. Journal of Hydrology. 2021;596:126104.
- View Article
- Google Scholar
10. Uncuoglu E, Citakoglu H, Latifoglu L, Bayram S, Laman M, Ilkentapar M, et al. Comparison of neural network, Gaussian regression, support vector machine, long short-term memory, multi-gene genetic programming, and M5 Trees methods for solving civil engineering problems. Applied Soft Computing. 2022;129:109623.
- View Article
- Google Scholar
11. Bayram S, Çıtakoğlu H. Modeling monthly reference evapotranspiration process in Turkey: application of machine learning methods. Environmental Monitoring and Assessment. 2023;195(1):67.
- View Article
- Google Scholar
12. Citakoglu H, Cobaner M, Haktanir T, Kisi O. Estimation of monthly mean reference evapotranspiration in Turkey. Water Resources Management. 2014;28:99–113.
- View Article
- Google Scholar
13. Hu Z, Bashir RN, Rehman AU, Iqbal SI, Shahid MMA, Xu T. Machine learning based prediction of reference evapotranspiration (et 0) using iot. IEEE Access. 2022;10:70526–70540.
- View Article
- Google Scholar
14. Nauman MA, Saeed M, Saidani O, Javed T, Almuqren L, Bashir RN, et al. IoT and Ensemble Long-Short-Term-Memory-Based Evapotranspiration Forecasting for Riyadh. Sensors. 2023;23(17). pmid:37688039
- View Article
- PubMed/NCBI
- Google Scholar
15. Nauman MA, Saeed M, Saidani O, Javed T, Almuqren L, Bashir RN, et al. IoT and Ensemble Long-Short-Term-Memory-Based Evapotranspiration Forecasting for Riyadh. Sensors. 2023;23(17). pmid:37688039
- View Article
- PubMed/NCBI
- Google Scholar
16. Khan AA, Nauman MA, Bashir RN, Jahangir R, Alroobaea R, Binmahfoudh A, et al. Context Aware Evapotranspiration (ETs) for Saline Soils Reclamation. IEEE Access. 2022;10:110050–110063.
- View Article
- Google Scholar
17. Bashir RN, Saeed M, Al-Sarem M, Marie R, Faheem M, Karrar AE, et al. Smart reference evapotranspiration using Internet of Things and hybrid ensemble machine learning approach. Internet of Things. 2023;24:100962.
- View Article
- Google Scholar
18. Tausif M, Dilshad S, Umer Q, Iqbal MW, Latif Z, Lee C, et al. Ensemble learning-based estimation of reference evapotranspiration (ETo). Internet of Things. 2023;24:100973.
- View Article
- Google Scholar
19. Babar M, Qureshi B, Koubaa A. Review on Federated Learning for digital transformation in healthcare through big data analytics. Future Generation Computer Systems. 2024;160:14–28.
- View Article
- Google Scholar
20. Siddique AA, Alasbali N, Driss M, Boulila W, Alshehri MS, Ahmad J. Sustainable collaboration: Federated learning for environmentally conscious forest fire classification in Green Internet of Things (IoT). Internet of Things. 2024;25:101013.
- View Article
- Google Scholar
21. Kaleem S, Sohail A, Babar M, Ahmad A, Tariq MU. A hybrid model for energy-efficient Green Internet of Things enabled intelligent transportation systems using federated learning. Internet of Things. 2024;25:101038.
- View Article
- Google Scholar
22. El Hanjri M, Kabbaj H, Kobbane A, Abouaomar A. Federated learning for water consumption forecasting in smart cities. In: ICC 2023-IEEE International Conference on Communications. IEEE; 2023. p. 1798–1803.
23. Supriya Y, Gadekallu TR. Particle Swarm-Based Federated Learning Approach for Early Detection of Forest Fires. Sustainability. 2023;15(2):964.
- View Article
- Google Scholar
24. Ullah F, Srivastava G, Xiao H, Ullah S, Lin JCW, Zhao Y. A Scalable Federated Learning Approach for Collaborative Smart Healthcare Systems with Intermittent Clients using Medical Imaging. IEEE Journal of Biomedical and Health Informatics. 2023.
- View Article
- Google Scholar
25. Pandya S, Srivastava G, Jhaveri R, Babu MR, Bhattacharya S, Maddikunta PKR, et al. Federated learning for smart cities: A comprehensive survey. Sustainable Energy Technologies and Assessments. 2023;55:102987.
- View Article
- Google Scholar
26. Friha O, Brik B, Touati F, Al-Fuqaha A, Afyouni I. FELIDS: Federated learning-based intrusion detection system for agricultural Internet of Things. Journal of Parallel and Distributed Computing. 2022;165:17–31.
- View Article
- Google Scholar
27. Mahjoub T, Mnaouer AB, Said MB, Boujemaa H. LoRa signal propagation and path loss prediction in Tunisian date palm oases. Computers and Electronics in Agriculture. 2024;222:109027.
- View Article
- Google Scholar
28. Ahmed RA, Hemdan EED, El-Shafai W, Ahmed ZA, El-Rabaie ESM, Abd El-Samie FE. Climate-smart agriculture using intelligent techniques, blockchain and Internet of Things: Concepts, challenges, and opportunities. Transactions on Emerging Telecommunications Technologies. 2022;33(11):e4607.
- View Article
- Google Scholar
29. Boulila W, Alzahem A, Koubaa A, Benjdira B, Ammar A. Early detection of red palm weevil infestations using deep learning classification of acoustic signals. Computers and Electronics in Agriculture. 2023;212:108154.
- View Article
- Google Scholar
30. Dong J, Wang Y, Shen Y, Wang C, Chen J. Nation-scale reference evapotranspiration estimation by using deep learning and classical machine learning models in China. Journal of Hydrology. 2022;604:127207.
- View Article
- Google Scholar
31. Rai P, Kumar A, Kumar M, Kushwaha S, Chauhan A. Evaluation of machine learning versus empirical models for monthly reference evapotranspiration estimation in Uttar Pradesh and Uttarakhand States, India. Sustainability. 2022;14(10):5771.
- View Article
- Google Scholar
32. Bellido-Jiménez JA, Estévez J, García-Marín AP. New machine learning approaches to improve reference evapotranspiration estimates using intra-daily temperature-based variables in a semi-arid region of Spain. Agricultural Water Management. 2021;245:106558.
- View Article
- Google Scholar
33. Krishnashetty PH, Balasangameshwara J, Sreeman S, Desai S, Kantharaju AB. Cognitive computing models for estimation of reference evapotranspiration: A review. Cognitive Systems Research. 2021;70:109–116.
34. Ayaz A, Rajesh M, Singh SK, Rehana S, et al. Estimation of reference evapotranspiration using machine learning models with limited data. AIMS Geosciences. 2021;7(3):268–290.
35. Sammen SS, Kisi O, Al-Janabi AMS, Elbeltagi A, Zounemat-Kermani M. Estimation of Reference Evapotranspiration in Semi-Arid Region with Limited Climatic Inputs Using Metaheuristic Regression Methods. Water. 2023;15(19):3449.
36. Mirzania E, Vishwakarma DK, Bui QAT, Band SS, Dehghani R. A novel hybrid AIG-SVR model for estimating daily reference evapotranspiration. Arabian Journal of Geosciences. 2023;16(5):1–14.
37. Rashid Niaghi A, Hassanijalilian O, Shiri J. Estimation of reference evapotranspiration using spatial and temporal machine learning approaches. Hydrology. 2021;8(1):25.
38. Yu H, Wen X, Li B, Yang Z, Wu M, Ma Y. Uncertainty analysis of artificial intelligence modeling daily reference evapotranspiration in the northwest end of China. Computers and Electronics in Agriculture. 2020;176:105653.
39. Zhang T, Gao L, He C, Zhang M, Krishnamachari B, Avestimehr AS. Federated learning for the Internet of things: Applications, challenges, and opportunities. IEEE Internet of Things Magazine. 2022;5(1):24–29.
40. Manoj T, Makkithaya K, Narendra V. A federated learning-based crop yield prediction for agricultural production risk management. In: 2022 IEEE Delhi Section Conference (DELCON). IEEE; 2022. p. 1–7.
41. Kumar P, Gupta GP, Tripathi R. PEFL: Deep privacy-encoding-based federated learning framework for smart agriculture. IEEE Micro. 2021;42(1):33–40.
42. Nguyen DC, Ding M, Pathirana PN, Seneviratne A, Li J, Poor HV. Federated learning for internet of things: A comprehensive survey. IEEE Communications Surveys & Tutorials. 2021;23(3):1622–1658.
43. Imteaj A, Thakker U, Wang S, Li J, Amini MH. A survey on federated learning for resource-constrained IoT devices. IEEE Internet of Things Journal. 2021;9(1):1–24.
44. Gleason CJ, Durand MT. Remote sensing of river discharge: A review and a framing for the discipline. Remote Sensing. 2020;12(7):1107.
45. Fleming SW, Rittger K, Oaida Taglialatela CM, Graczyk I. Leveraging next-generation satellite remote sensing-based snow data to improve seasonal water supply predictions in a practical machine learning-driven river forecast system. Water Resources Research. 2024;60(4):e2023WR035785.
46. Zhu B, Feng Y, Gong D, Jiang S, Zhao L, Cui N. Hybrid particle swarm optimization with extreme learning machine for daily reference evapotranspiration prediction from limited climatic data. Computers and Electronics in Agriculture. 2020;173:105430.
47. Gong D, Hao W, Gao L, Feng Y, Cui N. Extreme learning machine for reference crop evapotranspiration estimation: Model optimization and spatiotemporal assessment across different climates in China. Computers and Electronics in Agriculture. 2021;187:106294.
48. Duhan D, Singh MC, Singh D, Satpute S, Singh S, Prasad V. Modeling reference evapotranspiration using machine learning and remote sensing techniques for semiarid subtropical climate of Indian Punjab. Journal of Water and Climate Change. 2023.
49. Aly MS, Darwish SM, Aly AA. High performance machine learning approach for reference evapotranspiration estimation. Stochastic Environmental Research and Risk Assessment. 2023; p. 1–25.
50. Nagappan M, Gopalakrishnan V, Alagappan M. Prediction of reference evapotranspiration for irrigation scheduling using machine learning. Hydrological Sciences Journal. 2020;65(16):2669–2677.
51. Dias SHB, Filgueiras R, Fernandes Filho EI, Arcanjo GS, Silva GHd, Mantovani EC, et al. Reference evapotranspiration of Brazil modeled with machine learning techniques and remote sensing. PLoS ONE. 2021;16(2):e0245834. pmid:33561147
52. Mokari E, DuBois D, Samani Z, Mohebzadeh H, Djaman K. Estimation of daily reference evapotranspiration with limited climatic data using machine learning approaches across different climate zones in New Mexico. Theoretical and Applied Climatology. 2022;147:575–587.
53. Reis MM, da Silva AJ, Junior JZ, Santos LDT, Azevedo AM, Lopes ÉMG. Empirical and learning machine approaches to estimating reference evapotranspiration based on temperature data. Computers and Electronics in Agriculture. 2019;165:104937.
54. Elbeltagi A, Srivastava A, Al-Saeedi AH, Raza A, Abd-Elaty I, El-Rawy M. Forecasting long-series daily reference evapotranspiration based on best subset regression and machine learning in Egypt. Water. 2023;15(6):1149.
55. Rajput J, Singh M, Lal K, Khanna M, Sarangi A, Mukherjee J, et al. Data-driven reference evapotranspiration (ET0) estimation: a comparative study of regression and machine learning techniques. Environment, Development and Sustainability. 2023; p. 1–28.
56. Santos PABd, Schwerz F, Carvalho LGd, Baptista VBdS, Marin DB, Ferraz GAeS, et al. Machine Learning and Conventional Methods for Reference Evapotranspiration Estimation Using Limited-Climatic-Data Scenarios. Agronomy. 2023;13(9):2366.
57. Estévez J, Bellido-Jiménez JA, Liu X, García-Marín AP. Monthly precipitation forecasts using wavelet neural networks models in a semiarid environment. Water. 2020;12(7):1909.
58. Achite M, Jehanzaib M, Sattari MT, Toubal AK, Elshaboury N, Wałęga A, et al. Modern techniques to modeling reference evapotranspiration in a semiarid area based on ANN and GEP models. Water. 2022;14(8):1210.
59. Allen R, Smith M, Perrier A, Pereira LS, et al. An update for the definition of reference evapotranspiration. ICID Bulletin. 1994;43(2):1–34.
60. Elbeltagi A, Raza A, Hu Y, Al-Ansari N, Kushwaha N, Srivastava A, et al. Data intelligence and hybrid metaheuristic algorithms-based estimation of reference evapotranspiration. Applied Water Science. 2022;12(7):152.
61. Wang J, Raza A, Hu Y, Buttar NA, Shoaib M, Saber K, et al. Development of monthly reference evapotranspiration machine learning models and mapping of Pakistan—A comparative study. Water. 2022;14(10):1666.
62. Salahudin H, Shoaib M, Albano R, Inam Baig MA, Hammad M, Raza A, et al. Using Ensembles of Machine Learning Techniques to Predict Reference Evapotranspiration (ET0) Using Limited Meteorological Data. Hydrology. 2023;10(8):169.
63. Raza A, Khaliq A, Hu Y, Zubair N, Acharki S, Zubair M, et al. Water Resources and Irrigation Management Using GIS and Remote Sensing Techniques: Case of Multan District (Pakistan). In: Surface and Groundwater Resources Development and Management in Semi-arid Region: Strategies and Solutions for Sustainable Water Management. Springer; 2023. p. 137–156.
64. Raza A, Saber K, Hu Y, Ray RL, Kaya YZ, Dehghanisanij H, et al. Modelling reference evapotranspiration using principal component analysis and machine learning methods under different climatic environments. Irrigation and Drainage. 2023;72(4):945–970.
65. Rehman A, Jingdong L, Shahzad B, Chandio AA, Hussain I, Nabi G, et al. Economic perspectives of major field crops of Pakistan: An empirical study. Pacific Science Review B: Humanities and Social Sciences. 2015;1(3):145–158.
66. NASA. NASA The Data Access Viewer; 2023. Available from: https://power.larc.nasa.gov/data-access-viewer/.



3.5 Federated Learning (FL) framework for ET_o estimation