Abstract
Accurate prediction of greenhouse temperature and relative humidity is critical for developing environmental control systems. Effective regulation strategies can help improve crop yields while reducing energy consumption. In this study, Multilayer Perceptron (MLP) and Radial Basis Function (RBF) networks were used for short-term prediction of temperature and relative humidity in a double-film greenhouse. The prediction models used indoor soil temperature, light intensity, and historical measurements of temperature and humidity from the previous 10 minutes as inputs. Results show that the MLP model with Levenberg-Marquardt optimization performs best in predicting the current temperature and humidity, with an RMSE of 0.439°C and R2 of 0.997 for temperature prediction and an RMSE of 1.141% and R2 of 0.996 for relative humidity prediction. For 30-minute short-term prediction, the Bayesian optimized RBF model showed better temperature prediction with an RMSE of 1.579°C and an R2 of 0.958, while the MLP model performed better in relative humidity prediction with an RMSE of 4.299% and an R2 of 0.948. This study provides theoretical support for advancing the intelligent regulation of greenhouse environmental factors in cold and arid regions, and the application of predictive models to intelligent environmental management systems could help optimize cultivation practices and energy efficiency.
Citation: Yan C, Na T, Zhen Q, Sun Y, Liu K (2025) Prediction of air temperature and humidity in greenhouses via artificial neural network. PLoS One 20(6): e0325650. https://doi.org/10.1371/journal.pone.0325650
Editor: Morteza Taki, Agricultural Sciences and Natural Resources University of Khuzestan, IRAN, ISLAMIC REPUBLIC OF
Received: March 1, 2025; Accepted: May 16, 2025; Published: June 9, 2025
Copyright: © 2025 Yan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting information files.
Funding: 1. Tana, University Basic Scientific Research Project of Inner Mongolia Autonomous Region Directly Interdisciplinary Research Project (No. BR22-14-03); 2. Tana, the Inner Mongolia Natural Science Foundation of China (No. 2022MS03040); 3. Tana, First Class Disciplines Research Special Project (YLXKZX-NND-009). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
Greenhouses are widely used systems for artificially creating a suitable growth microclimate for plants. Temperature and humidity are critical environmental factors influencing plant development, quality, and production quantity [1]. More specifically, exposure of plants to extreme temperatures can cause heat or frost damage, and an increase in relative humidity may exacerbate fungal diseases and affect the uptake and utilization of calcium and the normal water balance of plants [2]. Therefore, these factors need to be considered when growing and managing plants, and appropriate measures need to be taken to protect plants from the adverse effects of high-humidity environments. Greenhouse climate control must deal with a complex, nonlinear and strongly coupled system whose variables are highly dependent on external climate conditions as well as on structure type, design and orientation [3].
The Hohhot region, where the experimental study was conducted, is a typical cold and arid region in China, where the average January temperature is about −15°C and the extreme low temperature falls below −30°C. The region is vigorously developing facility agriculture and promoting the “Vegetable Basket Project”. In 2024, newly constructed facilities covered 690 ha (10,350 mu), while vegetable cultivation spanned 3,600 ha (54,000 mu). Solar greenhouses constitute a vital component of winter vegetable provisioning systems, and some greenhouses are planted with high-value cultivars such as medicinal herbs and premium fruits to increase farmers’ income. Presently, the operation of traditional greenhouses (Fig 1) depends mainly on manual expertise, with little automation. Consequently, precise regulation of greenhouse temperature and relative humidity is imperative to ensure optimal growing environments for plants, enhance crop yields, and mitigate losses caused by climatic fluctuations. Building a precise model of the indoor greenhouse climate is an important way to achieve efficient microclimate management [4,5].
Modeling of solar greenhouses can be based on either physical laws or system identification methods. Modeling via system identification methods is more advantageous than physical modeling and is more responsive to the requirements of environmental factor regulation [6,7]. The rapid development of artificial intelligence has enabled it to play an important role in a variety of application scenarios [8–10]. As a typical data-driven tool, Artificial Neural Networks (ANN) are widely used to predict environmental factors such as temperature and humidity in greenhouses because they excel at capturing nonlinear relationships between multivariate inputs such as light, humidity and temperature [11,12]. Castañeda and Castaño et al. [13] developed a Levenberg-Marquardt optimized multilayer perceptron (MLP) model, achieving determination coefficients of 0.9549 (winter) and 0.9590 (summer) for indoor temperature prediction. This implementation effectively addressed frost control challenges, validating the practical utility of ANN in complex agricultural environments. Petrakis et al. [14] used a BP neural network with 10 input variables, including historical indoor temperature and relative humidity, to predict temperature and relative humidity in a greenhouse. The number of nodes in the hidden layer of the BP neural network was derived through extensive experimental comparison. Zhang et al. [15] used the sparrow search algorithm (SSA) to optimize a radial basis function (RBF) network in a simulation study of greenhouse temperature and humidity. The coefficient of determination (R2) for temperature and humidity was approximately 0.86, but the algorithm requires a relatively large number of input variables. Hayoung et al. [16] predicted the indoor temperature and relative humidity of an eight-span greenhouse via an MLP model.
The effects of different numbers of hidden layers and nodes on the prediction accuracy were discussed, and the optimal network structure was determined to have four hidden layers and 128 nodes for air temperature (R2 = 0.988) and four hidden layers and 64 nodes for relative humidity (R2 = 0.990). However, too many hidden layer nodes can lead to an overly large network structure, making the computation slower or less widely applicable [17]. Mohmed et al. [18] proposed an LM-MLP model to predict the maximum and minimum temperature during two different seasons (warm and cold); the mean absolute error between the predicted and the measured temperatures was 0.6833°C. Liu et al. [19] established a BP neural network model to predict the temperature variation trend in greenhouses for the next 1–7 days on the basis of daily winter meteorological data from Tianjin from 2011–2013. Yue et al. [20] proposed a model to predict the temperature and humidity of a greenhouse based on an improved LM-RBF model. The model uses the meteorological data inside and outside the greenhouse as inputs and the temperature and humidity in the greenhouse as outputs, but it does not consider the effects of historical indoor temperature and humidity on the prediction accuracy of the system. Yu et al. [21] established a new prediction model based on a least squares support vector machine (LSSVM) using the environmental factors of Shandong Province to predict the temperature of a greenhouse. These authors considered the historical indoor temperature and predicted temperature well but did not predict indoor humidity. Zou et al. [22] took the indoor temperature and relative humidity of the previous moment and the external meteorological data as the input vector (6 variables in total) and established a prediction model of temperature and humidity in a greenhouse.
Despite recent advancements in greenhouse microclimate modeling, there has been less research on short-term predictions of temperature and relative humidity for traditional double-layer film greenhouses in the cold and arid regions of China. This paper employs two representative artificial neural network models (MLP and RBF) to predict the temperature and relative humidity inside the greenhouse 30 minutes after its initial state. First, the input variables were screened using Spearman correlation analysis to eliminate environmental variables with low correlation. Next, a new integration of Local Sensitivity Analysis (LSA) and Kendall’s W coefficient of concordance was used to rank the input variables by importance and to verify the statistical significance of that ranking. Then, the effects of the training sample proportion (60%−80%) on the outputs of the MLP and RBF prediction models were investigated, and the dataset partitioning scheme of each model was optimized. Finally, the key hyperparameters and training algorithms of the MLP and RBF were optimized separately, and the final performance of both models was compared to select the optimal model. In this study, 5-fold cross-validation was used in all predictive models to ensure the stability of the results. The findings of this study have the potential to provide technical support for high-precision environmental regulation and to promote the transformation and upgrading of traditional facility agriculture to smart agriculture.
2. Materials and methods
2.1 Greenhouse and measurements
All the experiments were based on a greenhouse (experimental area: 25 m × 8 m, 200 m2) located in Hohhot, Inner Mongolia, northwestern China, in the Hailiu Village Science and Technology Park of Inner Mongolia Agricultural University (lat. 40.68°N, long. 111.38°E). The crop grown in the greenhouse during the experiment was celery. The greenhouse is equipped with inner and outer double-layer film and inner and outer thermal insulation quilts. The back wall is a brick–concrete structure and has an external thermal insulation layer that is 8 cm thick. In addition, a solar heating system was added to the greenhouse. The exterior and interior of the greenhouse are shown in Fig 1(a) and Fig 1(b). Compared with other types of greenhouses, the experimental greenhouse’s minimum temperature is above 0°C, which makes it suitable for the cold, dry areas of northern China. When the sunshine is sufficient at approximately 9:00 a.m. in winter, the outer quilt and the inner quilt are opened so that the solar greenhouse can absorb more solar radiation, which promotes plant growth and makes the greenhouse temperature rise rapidly. The inner film is opened at approximately 10:30 a.m. according to weather conditions, and the top vent is opened at noon. At approximately 3:30 p.m., when the solar radiation gradually weakens, the inner and outer quilts and the inner film are put down, which slows the heat loss of the solar greenhouse to the outside and preserves heat. The solar heating circulation system starts working at approximately 8:00 p.m.
Indoor environmental factors were measured using the following sensors: air temperature (−10°C to 50°C, ±0.5°C), relative humidity (0–95% RH, ±3%) and CO2 concentration (0–5000 ppm, ±40 ppm) were measured by an integrated sensor (Shandong Renke RS-CO2WS-N01-2). Total solar radiation (0–1800 W/m2, 1 W/m2) was measured by a total radiation sensor (RS-RA-N01-AL). Soil parameters were obtained using a soil temperature and moisture sensor (Kong Saien KE-N01-TR-1), which recorded temperature and volumetric water content. The sensors were configured to synchronize the acquisition of data at 10-minute intervals via a centralized Modbus/RS-485 interface. The structure of the greenhouse environmental data acquisition system is illustrated in Fig 2.
The data on indoor air temperature and relative humidity used in this paper were obtained from integrated sensors installed at the geometric center of the greenhouse (1.8 m above ground level, positioned at the intersection of the east-west and north-south axes); data from this location were found to be representative of the mean greenhouse temperature and humidity. Soil temperature and moisture content data were obtained from sensors buried 5 cm below the soil surface at the same central location. The total solar radiation sensor was installed at the midpoint of the greenhouse’s east-west axis, 3 m from the southern film cover, and at a height of 1.3 m. This positioning minimizes structural shading effects while ensuring representative radiation data collection. The outdoor meteorological parameters were obtained from a HOBO U30 NRC weather station installed in the open area outside the greenhouse (Fig 3).
2.2 Data sources
This study used a dataset covering 22 December 2021 0:00 to 19 February 2022 14:30 for the training and testing of the model. The dataset contained the following variables: outdoor temperature (Tout), outdoor relative humidity (RHout), outdoor wind speed (WSout), indoor temperature (Tin), indoor relative humidity (RHin), indoor soil temperature (TSin), soil water content (SWCin), and indoor light intensity (PARin). Data from 26 December 2021 8:10–17:40 and from 6 January 2022 19:40 to 7 January 2022 08:00 were missing due to power outages. Therefore, the dataset contains 8441 samples in total (as shown in S1 Data).
Temperature and humidity variations inside the greenhouse are influenced by both external climatic conditions and the greenhouse’s environmental factors. These variations exhibit a significant time lag. Consequently, the environmental data collected indoors and outdoors (Tout, RHout, WSout, TSin, SWCin, PARin), along with the historical indoor temperature and relative humidity from the previous 10 minutes (RHin (t-t0), Tin (t-t0), t0 = 10 min), were used as inputs for the prediction models. The outputs of the prediction model are the temperature and relative humidity inside the greenhouse after 30 minutes. This horizon balances the response lag of greenhouse environmental regulation against the accumulation of model error: as the prediction horizon lengthens, system nonlinearity becomes more pronounced and prediction accuracy decreases [12]. The configuration of the prediction model is illustrated in Fig 4.
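The input/output pairing described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes one record per 10-minute sample (so the 10-minute lag is one step and the 30-minute horizon is three steps), uses hypothetical dictionary keys, and shows only four of the candidate inputs for brevity. Whether the non-lagged inputs are taken at the current step is also an assumption.

```python
def build_samples(records, lag_steps=1, horizon_steps=3):
    """Pair each time step with lagged indoor readings and a future target.

    records: list of dicts, one per 10-minute sample, with the
             hypothetical keys "Tin", "RHin", "TSin" and "PARin".
    lag_steps=1     -> indoor values from 10 minutes earlier (assumed).
    horizon_steps=3 -> targets 30 minutes ahead (assumed).
    """
    inputs, targets = [], []
    for t in range(lag_steps, len(records) - horizon_steps):
        now = records[t]
        past = records[t - lag_steps]
        future = records[t + horizon_steps]
        # Lagged indoor humidity/temperature plus current radiation and soil temperature.
        inputs.append([past["RHin"], now["PARin"], past["Tin"], now["TSin"]])
        # Two outputs: indoor temperature and relative humidity.
        targets.append([future["Tin"], future["RHin"]])
    return inputs, targets
```

A real pipeline would also need to drop windows that straddle the two power-outage gaps, since consecutive rows there are not 10 minutes apart; that check is omitted here.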
2.3 MLP network
The core structure of the multilayer perceptron (MLP) is that of a classical feed-forward neural network, consisting of an input layer, a hidden layer, and an output layer [23]. The nonlinear mapping of multidimensional environmental parameters is realized by means of full connectivity. The input layer receives input variables including indoor and outdoor environmental factors, which are then transformed nonlinearly by the hidden layer. The output layer then produces the predicted values of the target environmental factors. The MLP can capture the complex interaction mechanisms among greenhouse environmental variables by adaptively adjusting the weight parameters through the back-propagation algorithm. While traditional training relies on gradient descent, advanced optimizers like the Levenberg-Marquardt (LM) algorithm can be employed. The LM algorithm enhances training by adaptively blending the Gauss-Newton method (which approximates second-order curvature from the Jacobian of the errors) and gradient descent through a damping factor, significantly accelerating convergence in nonlinear regression tasks. Consequently, the LM algorithm is employed in both the primary input variable selection and data division components of the experimental design. Furthermore, there are numerous alternative training algorithms, including Bayesian Regularization and Scaled Conjugate Gradient. A comprehensive list can be found in Table 1. A comparative analysis of the different algorithms is presented in the following discussion.
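For reference, the damping-based blend can be written in the standard textbook form of the LM weight update. The source does not state the update rule, so the notation below is an assumption: w is the weight vector, e the vector of prediction errors (prediction minus target), J the Jacobian of e with respect to w, and μ the damping factor.

```latex
\Delta \mathbf{w} = -\left(\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I}\right)^{-1}\mathbf{J}^{\top}\mathbf{e}
```

When μ is close to zero, the step approaches the Gauss-Newton step; when μ is large, the step approaches −(1/μ)Jᵀe, which is a short gradient-descent step on the sum of squared errors. Typical implementations decrease μ after a successful step and increase it after a failed one.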
2.4 RBF network
The Radial Basis Function (RBF) network is a three-layer feedforward neural network that uses radial basis functions as activation units in the hidden layer. Its architecture comprises an input layer for feature mapping, a nonlinear hidden layer with Gaussian kernels, and a linear output layer for regression or classification tasks [24]. The input layer size is equal to the number of model inputs, with an equivalent weight assigned to each input. The number of hidden layer neurons must be chosen carefully to ensure prediction effectiveness. The output layer size is equal to the number of network outputs, which is two in this study since the goal is to predict greenhouse temperature and relative humidity.
The network’s hidden layer uses Gaussian activation functions centered on prototype vectors and evaluated on the Euclidean distance from the input data. The Gaussian kernel function is calculated as [20]:
$$\varphi_j(\mathbf{x}) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{c}_j \rVert^{2}}{2\sigma_j^{2}}\right) \tag{1}$$
Where $\sigma_j$ is the width of the RBF and $\mathbf{c}_j$ is the center of the RBF.
The width parameter directly influences the local response range of the Gaussian kernel function and is typically selected as [25]:
$$\sigma = \frac{d_{max}}{\sqrt{2h}} \tag{2}$$
Where dmax is the maximum Euclidean distance between all central vectors, and h is the number of hidden layer neurons.
The output layer performs a linear combination of these nonlinear transformations, weighted by the connections from the hidden nodes, and is calculated as follows [20]:
$$y_k = \sum_{j=1}^{h} w_{jk}\,\varphi_j(\mathbf{x}) \tag{3}$$
Where $w_{jk}$ is the weight between hidden neuron $j$ and output neuron $k$, and $y_k$ is the output.
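A forward pass through such a network can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it assumes the common Gaussian form exp(−‖x − c‖² / (2σ²)), a single shared width σ = dmax / √(2h), and no output bias. The function and argument names are hypothetical.

```python
import math

def rbf_width(centers):
    """Shared width sigma = d_max / sqrt(2h), with h the number of centers.

    Needs at least two distinct centers, otherwise d_max is zero.
    """
    d_max = max(math.dist(a, b) for a in centers for b in centers)
    return d_max / math.sqrt(2 * len(centers))

def rbf_forward(x, centers, weights):
    """Return the network outputs for one input vector x.

    centers:  list of h center vectors.
    weights:  weights[j][k] links hidden unit j to output k.
    """
    sigma = rbf_width(centers)
    # Gaussian activation of every hidden unit.
    phi = [math.exp(-math.dist(x, c) ** 2 / (2 * sigma ** 2)) for c in centers]
    n_out = len(weights[0])
    # Linear output layer: weighted sum of hidden activations.
    return [sum(phi[j] * weights[j][k] for j in range(len(centers)))
            for k in range(n_out)]
```

With two outputs, `rbf_forward` returns one value for temperature and one for relative humidity, matching the two-neuron output layer described in the text.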
2.5 Normalization
It is clear that environmental factors have neither the same units nor the same orders of magnitude. Thus, variables with higher values contribute more to the output error, with the result that the algorithm gives more weight to them and less to variables with a smaller range of values [26]. For this reason, data must be normalized before being presented to the network. Data normalization compresses the range of the data between 0 and 1. The normalization in this study was carried out via the following expression given by Eq. (4) [27].
$$x_n = \frac{x - x_{min}}{x_{max} - x_{min}} \tag{4}$$
Where $x_n$ is the normalized value of $x$, $x$ is a real value, and $x_{max}$ and $x_{min}$ denote the maximum and minimum values in the data, respectively.
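Min-max scaling to the range 0 to 1 can be sketched as below. The function name is hypothetical, and the guard against a constant column is an added assumption rather than something the source discusses.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable carries no information and cannot be scaled.
        raise ValueError("cannot normalize a constant column")
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `min_max_normalize([10, 20, 30])` gives `[0.0, 0.5, 1.0]`. In a train/test setting the minimum and maximum are usually taken from the training data only and reused for the test data; the source does not say which convention was followed.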
2.6 Predictive model evaluation indicators
To verify the predictive performance of the neural network models for environmental factors in solar greenhouses, the determination coefficient (R2) and root mean square error (RMSE) were selected as model evaluation indices. The calculation equations of each evaluation index are as follows [28].
1. Root mean square error (RMSE).
The root mean square error, also known as the standard error, reflects the degree of deviation between the true value and the predicted value.
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}} \tag{5}$$
2. Determination coefficient (R2).
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}} \tag{6}$$
Where R2 indicates the closeness of the linear relationship between the predicted and measured values. The closer R2 is to 1, the better the model effect is. When the sum of squares of the prediction residuals is greater than the sum of squares of the variance of the true value, R2 becomes negative, indicating that the fitting effect is extremely poor and that the prediction effect is not as good as the direct mean value.
In the above equations, n is the number of samples, $y_i$ is the measured value of the i-th sample, $\hat{y}_i$ is the predicted value of the i-th sample, and $\bar{y}$ is the average of the measured values of the n samples.
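The two indices can be computed directly from paired lists of measured and predicted values. The sketch below assumes the usual definitions (RMSE as the root of the mean squared residual, R2 as one minus the ratio of the residual to the total sum of squares), which match the description of when R2 turns negative; the function names are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(measured, predicted)) / n)

def r2(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1 - ss_res / ss_tot
```

With measured values `[1, 2, 3]`, the predictions `[1, 2, 4]` give an R2 of 0.5, while the reversed predictions `[3, 2, 1]` give −3, a case where the model does worse than predicting the mean.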
2.7 Correlation analysis
There are nonlinear coupling relationships between different environmental factors and indoor temperature and indoor relative humidity. To enhance computational efficiency in modeling, Spearman correlation analysis was used to initially identify the key input variables [29]. Table 2 shows the matrix of correlation coefficients between the two outputs and all input features. Correlation coefficients ranging from ±0.4 to ±1.0 generally reflect statistically significant associations, whereas values between −0.3 and +0.3 typically suggest weak or negligible relationships between variables [30]. Positive values indicate concordant variations, whereas negative values reflect inverse dependencies. Therefore, all input variables except soil water content (SWCin) were retained for the indoor temperature and indoor relative humidity prediction models. Subsequently, sensitivity analyses were performed on the input variables other than SWCin to further reduce the dimensionality of the prediction model and improve prediction efficiency.
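Spearman correlation is the Pearson correlation of the ranks, which is why it suits monotonic but nonlinear relationships. A dependency-free sketch follows; the function names are hypothetical, and tied values receive the average of their ranks, which is the usual convention but is not stated in the source.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A variable whose absolute coefficient against both outputs falls in the weak band would be a candidate for removal, as was done for SWCin.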
3. Results and discussion
3.1 Input variable selection via sensitivity analysis
A systematic approach based on sensitivity analysis and multivariate combinatorial validation was used in this study to determine the optimal set of input variables for the greenhouse environment prediction model. First, the sensitivity index of each input variable with respect to the outputs (temperature and humidity) was calculated by the perturbation method to rank the importance of the variables: positive and negative perturbations were applied to each variable in the validation set, and the strength of the influence was evaluated by the change in the mean square error of the feed-forward neural network output. Based on the ranking results of 10 independent experiments, Kendall’s W concordance test (W = 0.98) was used to verify the stability of the variable importance ranking. Then, a progressive combination validation strategy was adopted: the first k most sensitive variables (k = 1, …, 7) were selected in sequence, an MLP neural network was built for each subset, and the optimal model was obtained from 20 randomly initialized training runs.
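The two steps above, perturbation-based sensitivity and the concordance check, can be sketched as follows. This is an illustration under assumptions: the 5% relative perturbation size is hypothetical (the source does not give one), `predict` stands for any trained single-output model, and the Kendall's W formula shown is the standard one without a tie correction.

```python
def perturbation_sensitivity(predict, X, y, rel_step=0.05):
    """Mean absolute change in MSE when each input column is shifted up and down.

    predict: callable mapping one input row (list) to one predicted value.
    X, y:    validation inputs (list of lists) and targets (list).
    """
    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    base = mse(X)
    scores = []
    for col in range(len(X[0])):
        change = 0.0
        for sign in (+1, -1):
            # Perturb only one column, leaving the others untouched.
            shifted = [r[:col] + [r[col] * (1 + sign * rel_step)] + r[col + 1:]
                       for r in X]
            change += abs(mse(shifted) - base)
        scores.append(change / 2)
    return scores

def kendalls_w(rank_table):
    """Kendall's W for m rankings (rows) of n items (columns), assuming no ties."""
    m, n = len(rank_table), len(rank_table[0])
    totals = [sum(row[i] for row in rank_table) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

Here each row of `rank_table` would hold the importance ranks of the seven candidate inputs from one independent experiment. W equals 1 when all experiments agree on the ranking and 0 when the rank totals are identical across variables.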
As illustrated in Table 3, the input variables rank in importance as follows: RHin(t-t0), PARin, Tin(t-t0), TSin, Tout, RHout, WSout. Table 3 also shows that the 4-variable combination (RHin(t-t0), PARin, Tin(t-t0), TSin) attains near-optimal predictive performance for both temperature and humidity. For temperature prediction, the subset achieves an RMSE of 1.50°C (R2 = 0.96), an 81.5% error reduction in comparison to the single-variable baseline (RHin(t-t0) alone: RMSE = 2.43°C). The accuracy of the humidity prediction improves to an RMSE of 3.95% (R2 = 0.95), a 57.1% error reduction from the baseline (9.20%). The prediction improves substantially after TSin is added to the input variables, indicating a strong interaction between soil temperature and the air temperature and humidity in the greenhouse. Deng et al. [31] confirmed the existence of a complex coupling relationship between soil temperature and greenhouse air temperature and humidity in terms of dynamic energy exchange and feedback regulation by building an analytical model that integrates soil evaporation, solar radiation absorption, thermal radiation loss and convective heat transfer. Although further introduction of the outdoor variables (Tout, RHout, WSout) led to a slight reduction in RMSE, the improvement in prediction was not significant. Notably, the five-variable model showed an increase in the RMSE of temperature prediction (1.50°C to 1.54°C), suggesting that the redundant outdoor temperature may introduce a risk of overfitting. The 4-variable subset therefore offers a reasonable balance between model simplicity and the preservation of the physical relationships governing greenhouse environments.
During subsequent parameter optimization phases for both MLP and RBF prediction models, these 4 variables were consistently employed as input parameters for greenhouse temperature and relative humidity prediction modeling.
3.2 Data set partitioning
To rigorously evaluate the generalizability of the MLP and RBF models while preventing overfitting, a comparative analysis of data partitioning strategies was conducted (80%−20%, 70%−30%, 60%−40% training-testing splits). To ensure the stability of the results, K-fold cross-validation was used with K = 5, repeated 3 times. The prediction results are presented in Table 4 (temperature) and Table 5 (relative humidity). When the training set is 80% of the sample, the test-set RMSE of the MLP temperature model is slightly higher, while the R2 of the RBF model is lower than with the 70% training set. When the training set is 70% of the sample, the MLP exhibits the best generalization ability for both temperature and humidity prediction (temperature test set RMSE = 1.59°C, humidity test set RMSE = 4.42%), whereas the RBF network has a higher R2 and lower RMSE in temperature prediction and the highest R2 in humidity prediction. When the proportion of the training set is reduced to 60%, the prediction performance of both the MLP and the RBF models decreases, indicating that an insufficient quantity of training data results in inadequate learning and may consequently lead to underfitting. The 70% training proportion was therefore selected as the best split.
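Repeated K-fold cross-validation with K = 5 and 3 repeats yields 15 train/validation splits of the training portion. The index generator below is a minimal sketch; the shuffling before each repeat and the fixed seed are assumptions, since the source does not say how folds were drawn.

```python
import random

def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (train_idx, val_idx) index lists for `repeats` shuffled K-fold passes."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # assumed: a new random permutation per repeat
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            val = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, val
```

Averaging RMSE and R2 over the 15 splits gives the mean values and the spread used to compare partitioning schemes. For time-series data, shuffled folds place neighboring samples in both training and validation sets, so a blocked split is sometimes preferred; the source does not discuss this choice.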
3.3 MLP prediction model
This section explores the impact of the training algorithm and the number of nodes in the hidden layer on model performance, using the optimal input variables and data partitioning strategy determined above.
3.3.1 Training algorithm selection.
As illustrated in Fig 5, the Levenberg-Marquardt and Bayesian regularization algorithms performed best for both temperature and humidity. The RMSE and R2 of the predicted temperature and humidity on the training and test sets are similar. This finding suggests that the model accurately captures the primary characteristics of the input variables during the training phase and maintains a high level of accuracy on the test set, indicating excellent generalization ability and stability. The LM algorithm performed best in temperature prediction (RMSE = 1.624°C, R2 = 0.956) and humidity prediction (RMSE = 4.535%, R2 = 0.942). The Bayesian regularization algorithm ranked second, with an RMSE of 1.634°C and R2 of 0.955 in temperature prediction and an RMSE of 4.548% and R2 of 0.942 in humidity prediction. The LM algorithm thus has a slight advantage in generalization ability and was selected as the training algorithm for the MLP. Conversely, the Variable Learning Rate Gradient Descent algorithm yields poor results.
Fig 5. (a) Mean RMSE of temperature prediction. (b) Mean R2 of temperature prediction. (c) Mean RMSE of humidity prediction. (d) Mean R2 of humidity prediction.
3.3.2 Number of hidden layer nodes selection.
To determine the optimal number of hidden layer nodes, a grid search was used to observe the change in performance of the predictive model as the number of hidden layer nodes increased from 3 to 30. To enhance the reliability of the results, each node configuration was tested in 10 independent repetitions. The LM training algorithm was used, an early stopping mechanism was set (maximum number of failures is 6), and L2 regularization (coefficient = 0.005) was applied to prevent overfitting.
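The early stopping rule can be sketched independently of the network itself. The version below is an assumption about how "maximum number of failures" is counted: training stops once the validation loss has failed to improve on its best value for 6 epochs in a row, and the weights from the best epoch would be kept. The function name is hypothetical.

```python
def early_stopping_epoch(val_losses, max_fail=6):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses.

    Training stops at the first epoch where the loss has not improved on
    the best value for `max_fail` consecutive epochs; if that never
    happens, it stops at the last epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    fails = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, fails = loss, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch
```

In a grid search over 3 to 30 hidden nodes, this check would run inside each of the 10 repetitions per configuration, so every run ends either at convergence or after six non-improving validation checks.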
The prediction results for temperature and relative humidity at different numbers of nodes are shown in Fig 6. With 11 hidden layer nodes, the temperature prediction model achieves a test set RMSE of 1.68°C with the narrowest confidence interval (as shown by the shaded band in Fig 6(a)) and an R2 above 0.95, demonstrating robust performance. At 11 nodes, the humidity prediction model achieves an RMSE of 4.62% with one of the narrower confidence intervals across the tested configurations (Fig 6(c)) and an R2 exceeding 0.93, indicating both high accuracy and stability.
Fig 6. (a) RMSE for temperature prediction. (b) R2 for temperature prediction. (c) RMSE for humidity prediction. (d) R2 for humidity prediction.
As shown in Table 6, a one-way analysis of variance (ANOVA) with a Tukey’s HSD post-hoc test (significance level α = 0.05) confirmed that the difference in temperature prediction performance between the selected 11-node configuration and the 3-node configuration is statistically significant. Furthermore, no statistically significant difference was observed between the selected 11-node configuration and the best-performing configuration (29 nodes) in humidity prediction, and humidity prediction did not improve significantly as the number of nodes increased. Thus, 11 nodes were selected for the MLP model, achieving optimal prediction accuracy for both temperature and humidity within a unified network.
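The first stage of such a comparison, the one-way ANOVA F statistic, can be computed by hand as sketched below. Each group would hold the test RMSE values from the repeated runs of one node configuration; the function name is hypothetical. The p-value and the Tukey HSD pairwise comparisons require the F and studentized range distributions and are not reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F relative to the critical value at α = 0.05 indicates that at least one configuration differs; the post-hoc test then identifies which pairs differ. In practice a statistics library would normally be used for both steps.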
3.4 RBF prediction model
As with the MLP, the selection of the training algorithm has a direct impact on the accuracy of predictions. Taki et al. [32] compared 13 different algorithms with regard to their ability to predict greenhouse temperatures and found that both the LM algorithm and the Bayesian algorithm performed satisfactorily. The present study therefore compares the impact of these two algorithms on the output of the RBF prediction model, as illustrated in Fig 7. The Bayesian optimization algorithm has a clear advantage in the RBF network, with higher R2 values and lower RMSE in temperature and relative humidity predictions. Consequently, the Bayesian algorithm is selected as the optimization algorithm for the RBF prediction model.
Fig 7. (a) Temperature prediction. (b) Humidity prediction.
3.5 Optimal prediction model selection
The optimized MLP and RBF models were used to predict the temperature and relative humidity of the greenhouse 30 minutes ahead. The parameters of the MLP model are detailed in the preceding section. The RBF prediction model uses the Bayesian regularization optimization algorithm identified above. The hidden layer RBF centers are initialized via K-means clustering, with the maximum number of RBF neurons set to 300 based on empirical validation. During training, the hidden layer neuron count automatically adapts to the training sample size, while the spread parameter is dynamically adjusted via Eq. (2) to balance the expressive and generalization capabilities of the model. The maximum number of iterations is capped at 500 (typically not reached, adjustable if required).
Table 7 presents the RMSE and R2 values of both the MLP and RBF prediction models for the training, validation, and test datasets. The standard deviation of RMSE for the RBF and MLP models on the validation set is ±0.058 and ±0.074, respectively, demonstrating their robustness to data fluctuations. For temperature prediction, the high R2 value (0.958) and low RMSE are largely maintained on the test set, showing that both models generalize well. The RBF model performs marginally better, with an RMSE of 1.579°C, smaller than that of the MLP model. For humidity prediction, the MLP and RBF prediction models perform well on both the training and validation sets, with R2 reaching about 0.96, and the RBF network outperforms the MLP network on these sets with a smaller RMSE and higher R2. However, performance reverses during the testing phase, where the MLP prediction model has a smaller RMSE and higher R2, suggesting that the MLP may have superior generalization potential for humidity prediction. The MLP model is therefore preferred for humidity prediction. The overall error of the temperature prediction model was significantly lower than that of the humidity prediction model. A likely reason is that humidity varies over a wide range (36%−98%) in the course of a day and its dynamics are more difficult to capture; for example, during snowy weather or irrigation operations in the greenhouse, humidity changes differ markedly from those on sunny days and at non-irrigation times. For comparison, Jung et al. [12] employed an RNN-LSTM model for 30-minute-ahead prediction of temperature and humidity in a Venlo-type multi-span greenhouse, with an R2 of 0.96 for temperature and 0.80 for humidity.
Fig 8 compares the predicted values obtained with the RBF and MLP prediction models against the true values. Fig 8(a) shows the prediction results across the entire test set (2531 samples). To show the prediction behavior under different data conditions more clearly, segments of 500 samples are displayed separately in Figs 8(b) and 8(c).
Fig 8. (a) All testing sets. (b) 0-500 samples. (c) 500-1000 samples.
The comparison curves show that the values predicted by both the MLP and RBF models agree well with the measured values overall. Large errors between the predicted and measured temperature and humidity occur during periods when the greenhouse temperature and relative humidity change drastically, for example near extreme points, where prediction accuracy is affected by changes in the operating state of the greenhouse. For example, when the insulating quilt was unfolded in the morning on a sunny day, the intensity of solar radiation continued to increase (peaking at noon), resulting in a sudden rise in greenhouse temperature and a sudden drop in humidity [33]; superimposed on this was a secondary abrupt change in temperature and humidity triggered by the opening of the vents at noon. The nonlinear coupling formed by these multiple dynamic perturbations poses a dual challenge to the models' ability to capture rapidly changing processes.
To assess model performance comprehensively, the same model parameters were used to predict the temperature (Tin0) and relative humidity (RHin0) in the greenhouse at the current moment. Table 8 reports the performance of the prediction models on the training, validation, and test sets. The consistently low RMSE and high R2 values of both models across all datasets indicate excellent stability and generalization capability in temperature and humidity prediction, with the MLP model predicting better than the RBF model. Fig 9 compares the prediction curves of the MLP and RBF models with the actual values. Fig 9(a) presents the curves for the entire prediction dataset, while Fig 9(b) shows a local comparison for sampling points 0–500. The prediction curve of the MLP model almost perfectly matches the actual values, whereas the RBF prediction model deviates more near the highest temperature and lowest humidity points. Therefore, for real-time prediction of temperature and humidity in the greenhouse, the MLP prediction model has a clear advantage.
Fig 9. (a) All testing sets. (b) 0-500 samples.
3.6 Discussion
This study systematically evaluated the ability of MLP and RBF neural network models to predict greenhouse temperature and relative humidity over the next 30 minutes and verified the performance of the optimized models in predicting environmental factors at the current moment. By innovatively integrating correlation analysis, partial sensitivity analysis, and Kendall's W coefficient of concordance, the input variables were reduced from 8 to 4 critical parameters. This variable reduction strategy not only mitigated potential errors from redundant sensors but also reduced hardware deployment costs.
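Kendall's W measures how strongly several rankings of the same items agree, which is what allows rankings of candidate inputs from different analyses to be combined. The sketch below uses the standard tie-free formula W = 12S / (m²(n³ − n)), where m is the number of rankings, n the number of items, and S the sum of squared deviations of the rank sums from their mean. The three example rankings are invented; how the rankings were actually obtained and combined in this study is not restated here.

```python
def kendalls_w(rankings):
    """Kendall's W for m complete rankings of the same n items (no ties).

    rankings: list of m lists, each giving ranks 1..n for the n items.
    Returns a value in [0, 1]; 1 means perfect agreement.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Total rank received by each item across all rankings.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical importance ranks of four candidate inputs from three methods.
example = [
    [1, 2, 3, 4],
    [1, 2, 4, 3],
    [2, 1, 3, 4],
]
print(round(kendalls_w(example), 3))
```

A value close to 1 would indicate that the methods largely agree on which inputs matter most, which makes discarding the lowest-ranked variables easier to justify.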
Both prediction models were rigorously validated through 5-fold cross-validation and independent testing. For the prediction of temperature and humidity in the greenhouse at the current moment, both the MLP model and the RBF model showed good stability and generalization ability, and the MLP model outperformed the RBF model, with an RMSE of 0.439°C and an R2 of 0.997 for temperature prediction and an RMSE of 1.141% and an R2 of 0.996 for humidity prediction. For the 30-minute-ahead forecast, the RBF network predicted temperature better, with an RMSE of 1.579°C and an R2 of 0.958, whereas the MLP model predicted humidity better, with an RMSE of 4.299% and an R2 of 0.948. However, both prediction models exhibited significant error amplification during periods of rapid environmental fluctuation, such as those caused by ventilation activation or abrupt irradiance changes. This was attributed primarily to their inability to capture the nonlinear coupling effects between multiple control actuators. While light intensity indirectly reflected the status of the insulating quilt, critical discrete variables such as ventilation on/off states were not incorporated in the current modeling.
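The 5-fold validation procedure and the "mean ± standard deviation" style of reporting can be sketched as follows. This is an illustrative outline only: the folds are contiguous and unshuffled (an assumption that suits time-ordered data but is not stated in the paper), and `fit` and `predict` are hypothetical callables standing in for either model.

```python
import math
import statistics


def kfold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous, near-equal validation folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_rmse(X, y, fit, predict, k=5):
    """Return (mean, sample standard deviation) of validation RMSE over k folds."""
    scores = []
    for val_idx in kfold_indices(len(X), k):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Train on the other k-1 folds, then score on the held-out fold.
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in val_idx])
        mse = sum((y[i] - p) ** 2 for i, p in zip(val_idx, preds)) / len(val_idx)
        scores.append(math.sqrt(mse))
    return statistics.mean(scores), statistics.stdev(scores)
```

A small standard deviation across folds, as reported for both models, indicates that the error does not depend strongly on which part of the data is held out.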
Future research should prioritize integrating binary operational parameters of environmental actuators (ventilation/insulation) into the input features and enhancing the models' ability to map abrupt state transitions. In addition, humidity prediction is overall somewhat less accurate than temperature prediction, and the prediction of relative humidity can be improved further by optimizing the algorithm and the input variables. This study lays a theoretical foundation for the intelligent regulation of greenhouse environmental factors in cold and arid regions, which is expected to improve the efficiency of greenhouse management and crop yield through precise regulation of environmental factors. With the continuing development and expansion of facility agriculture, precision environmental control systems for greenhouses will have broader application prospects.
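One way the proposed actuator states could enter the input vector is sketched below. It is purely illustrative: the record keys, the 10-minute sampling interval (so that one lag step is 10 minutes and three horizon steps are 30 minutes), and the choice of two binary flags are all assumptions, not part of the models evaluated in this study.

```python
def build_samples(records, lag_steps=1, horizon_steps=3):
    """Build (features, targets) pairs for 30-minute-ahead prediction.

    records: time-ordered list of dicts with hypothetical keys
    'soil_temp', 'light', 'air_temp', 'rel_hum', 'vent_open',
    'quilt_unfolded'. Sampling every 10 minutes is assumed.
    """
    samples = []
    for t in range(lag_steps, len(records) - horizon_steps):
        now = records[t]
        past = records[t - lag_steps]
        future = records[t + horizon_steps]
        features = [
            now["soil_temp"],
            now["light"],
            past["air_temp"],   # temperature 10 minutes earlier
            past["rel_hum"],    # relative humidity 10 minutes earlier
            1.0 if now["vent_open"] else 0.0,       # binary actuator state
            1.0 if now["quilt_unfolded"] else 0.0,  # binary actuator state
        ]
        targets = [future["air_temp"], future["rel_hum"]]
        samples.append((features, targets))
    return samples
```

Encoding the actuator states as 0/1 inputs gives the network direct information about a switching event, instead of requiring it to infer the event from light intensity alone.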
Supporting information
S1 Data. Indoor and outdoor environmental data.
https://doi.org/10.1371/journal.pone.0325650.s001
(XLSX)
References
- 1. Jung D-H, Lee TS, Kim K, Park SH. A deep learning model to predict evapotranspiration and relative humidity for moisture control in tomato greenhouses. Agronomy. 2022;12(9):2169.
- 2. Rabbi B, Chen Z-H, Sethuvenkatraman S. Protected cropping in warm climates: a review of humidity control and cooling methods. Energies. 2019;12(14):2737.
- 3. Ma D, Carpenter N, Maki H, Rehman TU, Tuinstra MR, Jin J. Greenhouse environment modeling and simulation for microclimate control. Comput Electron Agric. 2019;162:134–42.
- 4. Jiao W, Liu Q, Gao L, Liu K, Shi R, Ta N. Computational fluid dynamics-based simulation of crop canopy temperature and humidity in double-film solar greenhouse. J Sens. 2020;2020:1–15.
- 5. Shamshiri RR, Jones JW, Thorp KR, Ahmad D, Man HC, Taheri S. Review of optimum temperature, humidity, and vapour pressure deficit for microclimate evaluation and control in greenhouse cultivation of tomato: a review. Int Agrophys. 2018;32(2):287–302.
- 6. López-Cruz IL, Fitz-Rodríguez E, Salazar-Moreno R, Rojano-Aguilar A, Kacira M. Development and analysis of dynamical mathematical models of greenhouse climate: A review. Eur J Hortic Sci. 2018;83(5):269–79.
- 7. Xu LH, Su YP, Liang YM. Requirement and current situation of control-oriented microclimate environmental model in greenhouse system. Trans Chin Soc Agric Eng. 2013;29(19):1–15.
- 8. Xu H, Zhao Y, Dajun Z, Duan Y, Xu X. Exploring the typhoon intensity forecasting through integrating AI weather forecasting with regional numerical weather model. npj Clim Atmos Sci. 2025;8(1).
- 9. Han H, Sha R, Dai J, Wang Z, Mao J, Cai M. Garlic origin traceability and identification based on fusion of multi-source heterogeneous spectral information. Foods. 2024;13(7):1016. pmid:38611322
- 10. Devi CJ, Syam B, Reddy P, Kumar KV, Reddy BM, Nayak NR. ANN approach for weather prediction using back propagation. Int J Eng Trends Technol. 2012;1(3):19–23. https://api.semanticscholar.org/CorpusID:16986276
- 11. Taki M, Ajabshirchi Y, Ranjbar SF, Rohani A, Matloobi M. Heat transfer and MLP neural network models to predict inside environment variables and energy lost in a semi-solar greenhouse. Energy Build. 2016;110:314–29.
- 12. Jung D-H, Kim HS, Jhin C, Kim H-J, Park SH. Time-serial analysis of deep neural network models for prediction of climatic conditions inside a greenhouse. Comput Electron Agric. 2020;173:105402.
- 13. Castañeda-Miranda A, Castaño VM. Smart frost control in greenhouses by neural networks models. Comput Electron Agric. 2017;137:102–14.
- 14. Petrakis T, Kavga A, Thomopoulos V, Argiriou AA. Neural network model for greenhouse microclimate predictions. Agriculture. 2022;12(6):780.
- 15. Zhang YF, Wang F. Study on temperature and humidity prediction model of solar greenhouse based on SSA-RBF network. J Hebei Agric Univ. 2021;44(3):115–21.
- 16. Choi H, Moon T, Jung DH, Son JE. Prediction of air temperature and relative humidity in greenhouse via a multilayer perceptron using environmental factors. phpf. 2019;28(2):95–103.
- 17. Xu C, Xu C. Optimization analysis of dynamic sample number and hidden layer node number based on BP neural network. Adv Intell Syst Comput. Springer Berlin Heidelberg. 2013:687–95. https://doi.org/10.1007/978-3-642-37502-6_82
- 18. Mohmed G, Grundy S, Sun W, Lotfi A, Lu C. Temperature prediction in Chinese solar greenhouse based on artificial neural networks using environmental factors. Adv Compu Intell Syst. 2024;1454. https://doi.org/10.1007/978-3-031-55568-8_24
- 19. Liu S, Xue Q, Li Z, Li C, Gong Z, Li N. An air temperature predict model based on BP neural networks for solar greenhouse in North China. J China Agric Univ. 2015;20(1):176–84.
- 20. Yue Y, Quan J, Zhao H, Wang H. The Prediction of Greenhouse Temperature and Humidity Based on LM-RBF Network. In: 2018 IEEE International Conference on Mechatronics and Automation (ICMA), 2018. 1537–41. https://doi.org/10.1109/icma.2018.8484456
- 21. Yu H, Chen Y, Hassan SG, Li D. Prediction of the temperature in a Chinese solar greenhouse based on LSSVM optimized by improved PSO. Comput Electron Agric. 2016;122:94–102.
- 22. Zou W, Yao F, Zhang B, He C, Guan Z. Verification and predicting temperature and humidity in a solar greenhouse based on convex bidirectional extreme learning machine algorithm. Neurocomputing. 2017;249:72–85.
- 23. Hsieh K-L, Lu Y-S. Model construction and parameter effect for TFT-LCD process based on yield analysis by using ANNs and stepwise regression. Expert Syst Appl. 2008;34(1):717–24.
- 24. Hosseini Monjezi P, Taki M, Abdanan Mehdizadeh S, Rohani A, Ahamed MS. Prediction of greenhouse indoor air temperature using Artificial Intelligence (AI) combined with sensitivity analysis. Horticulturae. 2023;9(8):853.
- 25. Sun M, Li B, Yi Z, Cao K, Li A, Wang Y. Optimization of surface free energy parameters for asphalt binder-aggregate system based on RBF neural network model. Constr Build Mater. 2022;357:129382.
- 26. Bhanja S, Das A. Impact of data normalization on deep neural network for time series forecasting. arXiv. 2018.
- 27. Yang Y, Gao P, Sun Z, Wang H, Lu M, Liu Y, et al. Multistep ahead prediction of temperature and humidity in solar greenhouse based on FAM-LSTM model. Comput Electron Agric. 2023;213:108261.
- 28. Kamilaris A, Prenafeta-Boldú FX. Deep learning in agriculture: A survey. Comput Electron Agric. 2018;147:70–90.
- 29. Bolandnazar E, Sadrnia H, Rohani A, Marinello F, Taki M. Application of artificial intelligence for modeling the internal environment condition of polyethylene greenhouses. Agriculture. 2023;13(8):1583.
- 30. Rezaei Melal S, Aminian M, Shekarian SM. A machine learning method based on stacking heterogeneous ensemble learning for prediction of indoor humidity of greenhouse. J Agric Food Res. 2024;16:101107.
- 31. Deng L, Huang L, Zhang Y, Li A, Gao R, Zhang L, et al. Analytic model for calculation of soil temperature and heat balance of bare soil surface in solar greenhouse. Solar Energy. 2023;249:312–26.
- 32. Taki M, Abdanan Mehdizadeh S, Rohani A, Rahnama M, Rahmati-Joneidabad M. Applied machine learning in greenhouse simulation; new application and analysis. Inf Process Agric. 2018;5(2):253–68.
- 33. Zhang G, Liu X, Fu Z, Stankovski S, Dong Y, Li X. Precise measurements and control of the position of the rolling shutter and rolling film in a solar greenhouse. J Clean Prod. 2019;228:645–57.