Predicting saturated and near-saturated hydraulic conductivity using artificial neural networks and multiple linear regression in calcareous soils

Hydraulic conductivity (Kψ) is one of the most important soil properties that influences water and chemical movement within the soil and is a vital factor in various management practices, like drainage, irrigation, erosion control, and flood protection. Therefore, it is an essential component in soil monitoring and managerial practices. The importance of Kψ in soil-water relationship, difficulties for its measurement in the field, and its high variability led us to evaluate the potential of stepwise multiple linear regression (SMLR), and multilayer perceptron (MLPNNs) and radial-basis function (RBFNNs) neural networks approaches to predict Kψ at tensions of 15, 10, 5, and 0 cm (K15, K10, K5, and K0, respectively) using easily measurable attributes in calcareous soils. A total of 102 intact (by stainless steel rings) and composite (using spade from 0–20 cm depth) soil samples were collected from different land uses of Fars Province, Iran. The common physico-chemical attributes were determined by the common standard laboratory approaches. Additionally, the mentioned hydraulic attributes were measured using a tension-disc infiltrometer (with a 10 cm radius) in situ. Results revealed that the most of studied soil structure-related parameters (soil organic matter, soluble sodium, sodium adsorption ratio, mean weight diameter of aggregates, pH, and bulk density) are more correlated with K5 and K0 than particle-size distribution-related parameters (sand, silt, and standard deviation and geometric mean diameter of particles size). For K15 and K10, the opposite results were obtained. The applied approaches predicted K15, K10, K5, and K0 with determination coefficient of validation data (R2val) of 0.52 to 0.63 for SMLR; 0.71 to 0.82 for MLPNNs; and 0.58 to 0.78 for RBFNNs. In general, the capability of the applied methods for predicting Kψ at all the applied tensions was ranked as MLPNNs > RBFNNs > SMLR. Although the SMLR method provided easy to use pedotransfer functions for predicting Kψ in calcareous soils, the present study suggests using the MLPNNs approach due to its high capability for generating accurate predictions.


Introduction
Increased population pressure and the necessity of efficient water management have recently led attentions to reuse arid and semiarid soils as potential productive soil resources [1].Hydraulic conductivity (K ψ ) is an important attribute in designing and monitoring drainage and irrigation systems and to characterize many aspects of water flow within unsaturated soils such as irrigation, aquifer recharge, infiltration of rainfall and runoff, nutrient transportation, pollutants, and pesticides, and water balance [2][3][4].There are some crucial factors such as pores size distribution and soil structure [5,6], bulk (apparent) density (BD) [7], particle-size distribution (PSD) [8], and soil organic matter, (SOM) content [9] that can significantly affect K ψ .
The values of K ψ are highly depending on the methods used for their determination [1].Moreover, its measurement both in the field (by double rings, tension-disk infiltrometer, or Guelph permeameter) and laboratory (using the constant and falling head approaches) is a laborious, challenging, and costly process [10].Consequently, various efforts have been made to indirectly determine saturated and, to a lesser extent, unsaturated or near-saturated hydraulic conductivity.This has been achieved through creation of pedotransfer functions (PTFs), which used easily measurable soil attributes.These easily measurable soil attributes include soil texture and PSD parameters, BD, mean weight diameter of aggregates (MWD), SOM content, calcium carbonate equivalent (CCE), and others.To estimate saturated and unsaturated hydraulic conductivity, different approaches have been used in literature like, linear [11][12][13][14] and non-linear [15,16] regressions, and different machine learning algorithms [13,[16][17][18][19]].Nevertheless, the potential of PTFs to accurately predict K ψ is limited due to not incorporating soil structure, which is a crucial factor influencing hydraulic conductivity [20].
Artificial neural networks (ANNs), ANNs-based PTFs, and other related approaches have been widely employed as powerful modeling tools in various fields of science and engineering [21][22][23].The ANNs are inspired by the human brain which is able to simulate the complex relation between input and output data [24].According to literature, ANNs typically exhibit higher efficiency and potentials to estimate hardly measurable soil attributes rather than conventional regression approaches [25][26][27].There are some advantages of ANNs approaches compared to traditional PTFs.For instance, in the ANNs approaches there is: i) no need to pre-assumptions for modeling; ii) no need to priori assumptions of data distribution; iii) high capability to model complex and non-linear behaviors; and iv) high compatibility with missing and noisy data [24].However, ANNs approaches in comparison to conventional PTFs have some weaknesses like: i) "black boxes" modeling; ii) necessity for a large number of data to obtain the optimal weights and biases of the network; and iii) necessity for trial and error approaches to select the most suitable parameters in their structures [28,29].
Two feed-forward NNs (FFNNs), including multilayer perceptron (MLPNNs) and radialbasis function (RBFNNs), have been commonly used among various types of ANNs as efficient tools for recognizing patterns or approximate functions [30,31].
Although ANNs have been used to predict saturated hydraulic conductivity (K 0 ) in several studies [32][33][34][35][36][37], there is a limitation in literature which applied ANNs to predict unsaturated hydraulic conductivity especially in calcareous soils.In this regard, Moosavi and Sepaskhah [38] in the Fars Province of Iran, found that the most accurate prediction of K ψ at tensions of 20, 15, 10, 6, 3, and 0 cm were obtained with a four-layer MLPNNs with three nodes in the first and four nodes in the second hidden layers.Sihag [39] in sandy soils of India reported that the prediction of unsaturated hydraulic conductivity by back propagation algorithm based on ANNs approach works better than fuzzy logic.Jian et al. [40] in the USA soils found the two-hidden layers MLPNNs approach with the first and the second layers of five and three Coefficient of variations; D, Fractal dimension of particle-size distribution; d g , Geometric mean of particles size diameter; EC, Electrical conductivity of saturated extract; FFNNs, Feed-forward neural networks; K + , Water-soluble potassium; K ψ , Hydraulic conductivity at tension ψ; K 0 , Hydraulic conductivity at tension 0 cm (saturated hydraulic conductivity); K 5 , Hydraulic conductivity at tension 5 cm; K nodes had a good performance to predict K ψ at 2 cm tension when trained with soil texture components (i.e., clay, silt, and sand percentages).
In addition to the mentioned limitation, there are no reports about the prediction of field measured near-saturated K ψ using multiple linear regression (MLR), MLPNNs, and RBFNNs approaches and comparing their modeling potential.Therefore, novel aspects of the present study were covering the potential of two common FFNNs to predict saturated and near-saturated K ψ in calcareous soils and also comparing their capability with that of the MLR approach.In general, this study aimed to: i) predict K ψ of calcareous soils at tensions of 15, 10, 5, and 0 cm using MLPNNs and RBFNNs along with MLR, and ii) compare capability of the mentioned approaches and their accuracies to predict K ψ in calcareous soils.

Study area
Totally 102 locations were randomly selected in the Fars Province, Iran to measure physicochemical and hydraulic attributes of the soils, which cover the most relevant land uses and soil types.It should be noted that for developing reliable models to predict specific soils attribute, the number of input data, which is largely depends on difficulties and costs for its determining, is very important.For this purpose, normally the 100 data are trustworthy and appropriate in soil science studies.Of course, higher numbers of input data result in more reliable models.Still, due to difficulties and spending lots of time for measuring the K ψ , the 102 data would be appropriate for performing the present study.In addition, the random selection of experimental points while considering different land uses can help to include almost all soil types with different hydraulic, physical, and chemical attributes in the study.Therefore, the obtained results can be used in larger areas with similar soil conditions.
General descriptions of the study region (Fars Province) and the soils were summed up in Table 1.The Fars Province, in south and southwest regions of Iran, has a great potential to grow many types of agricultural plants like wheat, barley, corn, rice, alfalfa, etc.On the other hand, there are different types of land uses, like croplands, pasture, garden, woodland, etc. in the mentioned Province.The soils of Fars Province are calcareous and relatively calcareous (with a CCE contents of 12.5-70.6%based on our data in the present study), which can be a Table 1.General descriptions of the study region and the studied soils.

General climate
Arid and semi-arid -Climate regime based on Ko ¨ppen-Geiger BWh, BSk, BSh [41] Mean annual temperature 17.5˚C [42] Annual precipitation 5 to 100 cm [43] Elevation from the mean sea level 0.5 to 4 km [44] Soil moisture regimes Xeric, ustic, aridic [45] Soil temperature regimes Mesic, thermic, hyperthermic [45] Parent material Soluble dolomite and calcite limestone [46] Studied soil types according to Taxonomy classification Inceptisols (71 soils), Aridisols (18 soils), and Entisols (13 soils) [44,47] Studied land uses Croplands (wheat, barley, corn, rice, and fallow), pasture, woodland (oak forests) https://doi.org/10.1371/journal.pone.0296933.t001 good representative for calcareous soils in the Middle East.Due to these mentioned reasons, the Fars Province was selected as target area for performing the present study.At each of the 102 selected locations, water infiltration was measured using a tension-disk infiltrometer apparatus.Intact soil samples of 3.5 cm diameter and 2 cm height (using stainless steel rings) and 1 kg composite samples (using spade) were taken from the depth of 0 to 20 cm to measure soil physico-chemical attributes.

Analyzing selected physico-chemical attributes of the soils
Air-dried soil samples, that were ground and sieved to sizes of 8 and 2 mm, were prepared to determine the selected physico-chemical attributes using standard laboratory procedures.Generally, in the present study the selected physico-chemical attributes were divided in two groups: i) PSD-related parameters, and ii) structure-related parameters.The PSD-related parameters contain sand, silt, clay, geometric mean (d g ) and geometric standard deviation (σ g ) of particles size diameter, and fractal dimension (D).The PSD and its related parameters affect pores size distribution due to the sizes of primary particles.For instance, water infiltrates faster in the coarse-textured soils compared with the fine-textured soils due to having a high content of sand.On the other hand, existing clay content in soil is very essential for aggregation.Therefore, the PSD-related attributes may have significant effects on hydraulic parameters, like K ψ .Subsequently, the structure-related parameters contain BD, MWD, pH, SOM, CCE, electrical conductivity (EC), water-soluble sodium (Na + ), potassium (K + ), calcium (Ca 2+ ), and magnesium (Mg 2+ ), and sodium adsorption ratio (SAR).In general, the mentioned structure-related parameters may affect the pores size distribution of soils, due to their effects on aggregation and disaggregation.The K ψ values may therefore be affected by these structure-related parameters.
The soil PSD and texture were determined using the combined sedimentation (hydrometer) and wet-sieving methods [48].More specifically, the PSD of the soils was obtained based on measuring the density of soil-water suspension at 120, 300, and 600 sec, and at 1, 3, 6, and 24 h by hydrometer (for particles of less than 0.05 mm diameter) and remaining particles on sieves with opening diameters of 1, 0.5, 0.15, and 0.05 mm (for sand fraction) [48].
In addition, the PSD data was used to calculate some indices like, d g and σ g [57], and D [58].Furthermore, SAR was calculated using the values of water-soluble Na + , Ca 2+ , and Mg 2+ (for details see Mozaffari et al., [59]).

Infiltration experiments
A single-disc tension infiltrometer with 10 cm radius was used to conduct the infiltration experiments (Fig 1).At first, at each experimental location, the residues of the plant materials were gently eliminated from the soil surface without causing any alteration to the soil structure.To ensure a suitable hydraulic connection between the soil surface and the disc membrane of the infiltrometer, we used a thin contact layer (nearly 0.5 cm) of fine sand with particle diameters of 0.1 to 0.25 mm.Fresh water (with EC of 0.6 dS m -1 and SAR of 0.5 meq 0.5 L -0.5 ) was used to fill the device reservoir for conducting the infiltration experiments.After that, the infiltrometer was gently put onto the layer of sand and fixed.The experiment was started with the intended maximum tension, which was considered as 15 cm during this study.The Mariotte tubes and air tower were used for setting the applied tensions.In the present study, for performing infiltration test, four successive tensions including 15, 10, 5, and 0 cm were applied at each experimental site.At each tension, the height of infiltrated water was manually recorded at the time intervals of 0.25 min (for the first 5 min) and 1 min to reach the steady-state conditions.Consider that, the steady-state conditions are achieved when at least five successive infiltration rates became similar [60].The time needed to reach the steady state conditions varies for different soils, and usually falls between 20 to 60 min at each applied tension.

Calculating near-saturated and saturated hydraulic conductivity (K ψ )
The K ψ at different applied tensions (ψ) was calculated using the method introduced by Ankeny et al. [61].According to their theory, the K ψ was determined based on the collected infiltration data using well-known Wooding's approach [62].In this particular approach, it is assumed that the variation of K ψ with respect to ψ (as described in Eq 1) follows an exponential pattern.This assumption holds true when water enters the soil from a circular source with a fixed radius "r" under a constant tension [62][63][64]: where K 0 and α are hydraulic conductivity (L T -1 ) at zero cm ψ and the sorptive number (L -1 ), respectively [65,66].Wooding [62] derived the subsequent analytical solution: where Q ψ represents the volume (L 3 ) of water infiltrates into the soil per unit time (T), at tension ψ under steady-state conditions and r denotes radius (L) of the disc infiltrometer apparatus.
Based on Ankeny et al. [61] analysis, by applying Wooding's solution for unsaturated steady-state conditions and substituting K ψ with Eq (1), as well as replacing ψ i and ψ i + 1 into the resulting equation, we would obtain Eqs (3) and ( 4).
Ankeny et al. [61] solved Eqs ( 3) and ( 4) simultaneously as follows: In present study, the values of 0 and 5, 5 and 10, and 10 and 15 cm tensions were used to solve three equations simultaneously.Ankeny et al. [61] stated that the arithmetic average value has the best compatibility with the K ψ , where ). Eqs ( 7) to (10) indicate how K 0 , K 5 , K 10, and K 15 were practically calculated [61]: Stepwise multiple linear regression (SMLR) The SMLR, which still is one of the most commonly used and standardized approaches for developing PTFs [20], was employed for predicting K ψ as described below: where V 1 to V N are the independent variables (easily measurable soil attributes), a 0 to a N denote regression coefficients, and N shows the number of independent variables.The significant predictor variables were selected using a forward approach.For developing SMLR-PTFs, the K ψ at different applied tensions and easily measurable soil attributes were selected as dependent and independent variables, respectively.The STATISTICA (version 8.0) software package was used for developing SMLR-PTFs.

Artificial neural networks (ANNs)
Multilayer perceptron NNs (MLPNNs).The ANNs are a subset of machine learning, which consist of several layers and processing elements called neurons.Two unknown parameters including weights (w ji ) and biases (b j ) in the structure of the MLPNNs should be adjusted by the training process [24,67].For this purpose, an error function, which is a function of weights and biases, should be minimized as follows: where O ij and P ij denote the observed (measured) and predicted values of dependent variable, respectively.m and n are the number of output and data fed into the networks, respectively.An iterative back-propagation algorithm is usually used to adjust the w ji as following: A generalized delta-learning rule is applied to determine the Δw ji (k) values [24,67].
In the current study, for training MLPNNs, various hidden layers of different numbers of nodes (neurons) were examined.The MATLAB software package was used to select the most effective hidden layers combination and their number of neurons, and to predict the K ψ .For training the MLPNNs, an architecture with two hidden layers were selected for all K ψ that includes 8 and 12, 6 and 11, 7 and 10, and 9 and 14 neurons in the first and second layers to estimate the K 0 , K 5 , K 10 , and K 15 , respectively.
Different transfer functions and learning (training) algorithms were tested in the mentioned MLPNNs to achieve the lowest value of Eq (12).The Levenberg-Marquardt algorithm was employed as learning (training) algorithm in the structure of the used MLPNNs for estimating all K ψ .In addition, the sigmoid transfer function was used in the hidden layer, whereas linear transfer function was employed for the output (target) layer.

Radial-basis function NNs (RBFNNs).
The RBFNNs is another type of FFNNs that composed of two layers with simple structure and fast in learning.There are two main stages for learning process in RBFNNs: i) obtaining the centers of the clustered input variables (data) by the unsupervised approaches [24,67,68], and ii) determining the weights of the network using the least square approach.The RBFNNs output is computed as: where X and U j show the input vector and the vector corresponding to the center of the RBF φ j (as transfer function), respectively.The Gaussian function is commonly employed as a basis function for the RBFNNs, which is given as: where σ denote the spread of the RBF.

Data analysis
At first, the selected hydraulic, physical, and chemical attributes were subjected to calculate the descriptive statistics and normality test using Excel (version 2019) and SPSS (version 26) software packages.Attributes that did not follow normal distribution were transformed using the natural logarithm (ln) transformation to be closer to the normal distribution.To evaluate the potential of selected modeling approaches to predict K ψ at different applied tensions by easily measurable soil attributes, the total of 102 laboratory/field measured soil attributes were randomly divided to 75% and 25% of the dataset as the calibration and validation subsets, respectively [38,[69][70][71].Furthermore, in order to ensure that: i) both validation and calibration subsets of K ψ have approximately the same distributions, and ii) the mean values of the aforementioned subsets have no significant difference; the 1:1 line (using Excel software) and t-test analysis (using STATISTICA software) were used, respectively [72].After that, the K ψ values at different applied tensions (as dependent variables) and studied physico-chemical attributes (as independent variables) related to calibration subset were imported to the STATISTICA (version 8) software package for developing PTFs using forward-SMLR approach.The most significant and effective easily measurable attributes were appeared in four developed SMLR-PTFs to predict K ψ at different applied tensions.To avoid the complexity of the NNs models, the appeared physico-chemical attributes in developed SMLR-PTFs were imported to MATLAB software package as independent variables for predicting K ψ using MLPNNs and RBFNNs models.It is worth mentioning that, all applied models were developed by the calibration subset and then tested by the validation subset.

Statistical evaluation of the models
Several statistical indices including the determination coefficient (R 2 ), the Nash-Sutcliffe coefficient (NS), the residual prediction deviation (RPD), and the normalized root mean square error (NRMSE) were employed to assess and compare the performance of the models (Eqs 16 to 19): NRMSEð%Þ ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi X n i¼1 where P i and O i show the predicted and measured (observed) data, n is number of data, Sd and SEP denote the standard deviation and standard error of the measured data (SEP is equal to RMSE of the predicted data).Table 2 shows the reported classifications for different ranges of NRMSE, NS, and RPD performance criteria.The statistical calculations were performed using the MATLAB software package.

Summary statistics
According to Fig 2, the soils included a broad range of textures (mainly silt loam and loam classes).Therefore, it can be expected that the PSD-related parameters (clay, silt, sand, d g , σ g , and D) be significant and effective parameters for predicting K ψ .It should be noted that we used the United State Department of Agriculture (USDA) guideline to classify the soil texture components, which means that a particle with diameter of less than 0.002 mm is clay; with diameter of 0.002 to 0.05 mm is silt; and with diameter of 0.05 to 2 mm is sand [76].The d g , σ g , and D are criteria of PSD data.They represent unique features of PSD.For example, with increasing clay content and decreasing sand content, the D values increase and the d g values decrease [57,77], but their relationships with soil texture components (sand, silt, and clay contents) are not linear.While, the σ g parameter depends on combination of all texture components in the soil [57].The primary particles (sand, silt, and clay) contents affect pores size distribution due to their sizes and also their impacts on aggregation.Furthermore, macropores and mesopores play crucial roles in water flow through the soil, especially when it comes to processes such as infiltration and the rapid movement of water, solutes, and pollutants through the soil [78,79].Therefore, the PSD-related parameters may affect infiltration rate and subsequently K ψ at different tensions.
The obtained results demonstrated, among studied soil attributes, the D and SAR with coefficient of variations (CV) values of 2.1% and 277%, respectively, represent the lowest and the highest CVs (Table 3).According to classification proposed by Wilding [80], the D, BD, and pH belong to the class of low variability (0% < CV � 15%); K 5 , MWD, silt, σ g , and CCE belong  [73]).† †: The Nash-Sutcliffe coefficient (After Ritter and Muñoz-Carpena [74]).† † †: The residual prediction deviation (After Mouazen et al. [75]).https://doi.org/10.1371/journal.pone.0296933.t002 to the class of moderate variability (15% < CV � 35%); and the remaining studied soil attributes belong to the class of high variability (CV > 35%).The higher CV values show the greater level of the data dispersion around the mean.Therefore, the high variability of an attribute in a dataset enables the researcher to model that parameter with higher certainty and the resulting model can be used on a larger scale.In contrast, a lower CV value of input data may result in a good model, while the applicability of the model is limited to a small range of target attribute.Of course, it should be noted that some attributes (for instance D, BD, and pH) inherently have low CVs due to the low values of standard deviation in comparison with their mean values.More specifically, the D value normally changes between 2.4 to 3 in soils, which in the present study varied between 2.56 to 2.85.Therefore, the minimum value of CV was obtained for this property.While, in arid and semi-arid regions, like the study area, very high values of EC and water-soluble cations can be observed.In irrigated fields with fresh water, the soluble salts and cations are leached toward the deeper sections of soil profile.While, in the pastures or rainfed fields, the soluble salts and cations of subsoil may migrate to the surface layers of soils through capillary movement and accumulate there.Therefore, the high variability of EC, water-soluble Na + , K + , Ca 2+ , and Mg 2+ , and SAR may be observed in such arid and semi-arid regions.In this regard, the EC values of 0.26 to 76.7 dS m -1 was reported by Mozaffari et al. [76] in the soils of Fars Province, Iran.In addition, moderate and high variabilities of K ψ in the present study indicate that the obtained models may work well on a larger scale with similar soils.Furthermore, the Kolmogorov-Smirnov test of normality showed the Na + , EC, SAR, K + , Mg 2+ , and Ca 2+ departed from normal distribution (Table 3).Normal distribution of the input data is an essential assumption for applying the MLR modeling procedure.There are different types of data transformations, like taking logarithm or natural logarithm, square rooting, squaring, etc.Therefore, the mentioned chemical attributes were transformed using the natural logarithm (ln or logarithm to the base e) function for being closer to the normal distribution.

Correlation between studied attributes
The Pearson's correlation coefficients (R) between the soil attributes are shown in Fig 3 .As it was expected, the PSD-related parameters (i.e., silt, sand, σ g , and d g ) and CCE had fair to relatively strong correlations with K 15 and K 10 .It means that among the studied different soil physico-chemical attributes, the PSD-related parameters are more significant for predicting K 15 and K 10 .Alternatively, by closing to saturation conditions, the soil structure-related parameters exhibited stronger correlations with K ψ , in which K 5 and K 0 were significantly related to SOM, pH, CCE, BD, and ln(SAR).Therefore, it can be concluded that with increasing tension (i.e., decreasing matric potential), the variation of unsaturated hydraulic conductivity can be better described by soil PSD-related parameters compared to the soil structure-related parameters.To the best of our knowledge, with decreasing tension from 15 to 0 cm the dependency of water flow to macropores increases.Therefore, soil structure strongly affects the K ψ values.Simultaneously, the pores size distribution changes with soil structure [81].In the well aggregated and structured soils, high amounts of macropores exist due to large aggregates.Therefore, it is expected that all factors that affect soil structure have a significant influence on K 0 and to a lesser extent K 5 .As we respected, the K 5 and K 0 were more sensitive to variations of SOM, pH, BD, and ln(SAR).By increasing tension from 0 to 15 cm, the diameter of pores contributed in water flow, and consequently the effects of macropores on water flow decreased.At high tensions (like 10 and 15 cm in the present study), the soil primary particles are placed next to each other within aggregates, and different mesopores and micropores are formed.Therefore, dependencies of K 10 and K 15 to the PSD-related parameters were remarkably increased.Tajik et al. [82] confirmed that soil particles can be dispersed and soil structure can be destroyed in response to high amounts of Na + (SAR values), due to large hydrated radius.In addition, lower BD values might indicate more macropores in soils due to lower compaction.According to Vaezi et al. [83], Ostovari et al. [84,85], and Mozaffari et al. [86], the SOM and Ca 2+ (represented by a high CCE) function as agents that bind mineral colloids together in order to facilitate flocculation.This process leads to the formation and stability improvement of aggregates.In accordance with our results, Kotlar et al. [13], in different Danish soil types, observed that log(K 10 ) was powerfully related to PSD components (i.e., clay, sand, and silt contents) and the effects of these PSD parameters on log(K 0 ) decreased, while the impact of BD (structure-related parameter) increased.A significant correlation between clay and SOM contents and the values of ln(K 0 ) and ln(K 1 ) was found by Yang et al. [87].Further, they reported ln(K 5 ) and ln(K 10 ) significantly related to MWD.Santra et al. [88] reported the ln(K 0 ) was significantly correlated with sand and clay contents, and BD in soils of Orissa, India.
Generally, the R values help us to clearly understand how each physico-chemical attribute is related to K ψ .In addition, the appeared parameters in the developed SMLR-PTFs for predicting K ψ at each applied tension are justified by R values.It should be noted that some physico-chemical attributes have collinearity, and therefore the more effective ones would appear in the SMLR-PTFs for predicting K ψ .

Predicting K ψ by SMLR method
The t-test analysis revealed that there were no statistically significant differences between the mean values of validation and calibration K ψ across all applied tensions.In addition, the distributions of K ψ values at all applied tensions in the mentioned subsets were relatively similar.Considering these conditions, we ensure that the data included in the validation subset are not concentrated in a specific small range of all data.Therefore, the validation subset can surely examine the developed models by the calibration subset.The developed PTFs using studied physico-chemical attributes by applying SMLR approach for predicting K ψ are provided in Table 4.It is worth mentioning that, for predicting K ψ , all studied soil physico-chemical attributes were imported to SMLR models as independent variables.The developed PTF for predicting K 15 (Eq 20) using d g , σ g , silt, and CCE had R 2 val (subscript val shows validation subset), NRMSE val (%), NS val , and RPD val values of 0.55, 27.4,0.52, and 1.47, respectively.In addition to these mentioned easily measurable soil attributes, the ln(SAR) was appeared to PTF developed for predicting K 10 (Eq 21).The K 10 -PTF predicted this property by R 2 val , NRMSE val (%), NS val , and RPD val values of 0.52, 25.9, 0.47, and 1.40, respectively.According to literature [89,90], the PSD and its related parameters can directly affect K ψ at all tensions.For predicting K 5 and K 0 , the silt, σ g , SOM, and CCE were appeared with positive sign, while the ln(SAR) was introduced with a negative sign in the PTFs developed (Eqs 22 and 23).Furthermore, to predict K 0 , the BD and pH contents were the influential factors by negative sign.By closing to saturated conditions, the macropores dependency of hydraulic conductivity increases [81].Although the ln(Na + ) showed a significant correlation with both K 5 and K 0 , the impact of ln (SAR) was more pronounced (Fig 3).The R 2 val , NRMSE val (%), NS val , and RPD val values were 0.56, 18.1, 0.49, and 1.44 for K 5 and 0.63, 25.4, 0.57, and 1.56 for K 0 predictions, respectively.Table 4. Developed PTFs for predicting K ψ using easily measurable soil attributes by applying the SMLR method.
Overall, in comparison with K 15 , K 10 , and K 5 , the K 0 was estimated more accurately.The moderate performance of K ψ -PTFs was anticipated due to the considerable variability of K ψ even in soils with similar characteristics.This variability is greatly affected by the geometry of the pores, which contributes to the dependency of K ψ to the mentioned factor [20].Although we used some soil structure-related parameters in the present study, they are unable to fully capture the complexity of soil structure.Nevertheless, due to the challenging, time-consuming, and costly nature of measuring K ψ [10], a model that provides even moderate prediction of this property can still be highly beneficial and valuable.In accordance with our results, in the western desert of Egypt, Gamie and De Smedt [12] discovered that the parameters related to soil structure (carbonate content, SAR, BD, and water content in saturated, field capacity, and wilting point conditions) exhibited stronger correlations with log(K 0 ) as compared to the soil texture-related attributes (clay and silt contents).In contrast, Azadmard et al. [14] reported PTFs with R 2 values of 0.17, 0.21, 0.14, 0.19, and 0.19 for prediction of K 15 , K 10 , K 5 , K 2 , and K 0 , respectively based on MLR approach using easily measurable soil attributes.Kotlar et al. [13] estimated log(K 0 ) and log(K 10 ) with R 2 of 0.26 and 0.65, respectively using the SMLR method.In general, based on the statistical indices employed in this study, the K ψ values at ψ of 15, 10, 5, and 0 cm were reasonably predicted using easily measurable soil attributes and applying SMLR method.The developed SMLR-PTFs can undergo testing, and subsequently be applied in other areas with the same soils.

Predicting K ψ by MLPNNs and RBFNNs methods
Table 5 summarized the performance of all applied approaches in the present study for predicting K ψ , i.e., SMLR, MLPNNs, and RBFNNs.As it was mentioned before, the imported independent variables to MLPNNs and RBFNNs models for predicting K ψ at different applied tensions were these which appeared at Eqs (20) to (23).Regarding capability of MLPNNs approach to predict K ψ , the K 15 , K 10 , K 5 , and K 0 were predicted by respectively R 2 val values of 0.71, 0.78, 0.82, and 0.80.According to NRMSE classification, the K 15 , K 10 , and K 0 were On the other hand, The RBFNNs predicted K 15 , K 10 , K 5 , and K 0 with R 2 val values of 0.58, 0.62, 0.78, and 0.77, respectively.Similar to SMLR and MLPNNs approaches, the RBFNNs predicted K 5 with good and the other studied K ψ s with fair accuracy based on NRMSE val values.The NS val values illustrated acceptable accuracy for K 5 and K 0 (NS val values of 0.74 and 0.73, respectively) and unsatisfactory accuracy for K 15 and K 10 (NS val < 0.65).Furthermore, based on RPD val values, just K 5 was predicted with good accuracy (RPD val = 1.87) while, the K 15 , K 10 , and K 0 were fairly predicted using mentioned ANNs approach.
Jian et al. [40] reported adjusted R 2 of 0.53 and 0.33 related to the calibration and validation subsets, respectively to predict K 2 by applying MLPNNs by 2 hidden layers (5 and 3 neurons for the first and second hidden layers) and using clay, silt, and sand contents as inputs.Merdun et al. [32] predicted K 0 by R 2 value of 0.52 using sand, silt, clay, BD, and pores with diameter > 30 μm, between 3-30 μm, and < 3 μm as input variables and cascade forward neural network.Ghanbarian-Alavijeh et al. [34] used one hidden layer MLPNNs to estimate K 0 of UNSODA database.They found the mentioned approach successfully predicted K 0 (with R 2 values > 0.90) using clay and sand contents, BD, effective porosity and van Genuchten retention model parameters (θ r , α, and n) as input variables.

Comparing potentials of SMLR, MLPNNs, and RBFNNs approaches to predict K ψ
Discovering the most powerful modeling approach for predicting hardly measurable soil attributes is very crucial in soil modeling studies.To the best of our knowledge, soil management practices are closely related to hardly measurable soil attributes.The limitations and difficulties for direct measurement of K ψ lead soil scientists to investigate different modeling procedures and discover the most accurate ones to predict this important attribute.Therefore, it is beneficial to compare the potential of classical regression methods (like SMLR) with artificial intelligence algorithms (like MLPNNs and RBFNNs) to predict K ψ at different tensions.Table 5 shows that the values of R 2 , NS, and RPD of different applied approaches for predicting K ψ across all ψ were ranked as MLPNNs > RBFNNs > SMLR-PTFs and values of NRMSE was ranked as MLPNNs < RBFNNs < SMLR-PTFs.In order to better comparison between the capabilities of different approaches to predict K ψ , we provided the values of R 2 val in Fig 4 .Although the SMLR method provided simple and easy to use PTFs to predict K ψ by using easily measurable soil attributes, their application may be limited due to relatively low accuracy.Of course, it should be pointed out that, for a property like hydraulic conductivity, these SMLR-PTFs are valuable due to difficulties in measuring K ψ and its high variability.While, the MLPNNs approach showed a high capability (R 2 val > 0.70) for prediction of K ψ across all ψ.This could be attributed to the presence of non-linear relations among the soil hydraulic and physico-chemical attributes [14].
Fig 5 shows the scatter plots of predicted against the observed (measured) K ψ values across all ψ using MLPNNs as the best predictor approach.As can be seen, the points are near to 1:1 line for both validation and calibration subsets.In accordance with our results, Moosavi et al. [24] in soils of Fars Province, Iran, stated the capability of different approaches to predict sorptivity coefficient could be ranked as MLPNNs > RBFNNs > MLR regarding their accuracies and computational times.Furthermore, Shams Emamzadeh et al. [91] in soils of Tehran Province, Iran, found MLPNNs predicted K 0 was more accurate than that of RBFNNs.While, Rezaei Arshad et al. [36], reported more capability of RBFNNs to predict K 0 compared to MLPNNs and MLR methods in soils of Khuzestan Province, Iran.They predicted K 0 by R 2 val values of 0.68, 0.66, and 0.50 for RBFNNs, MLPNNs, and MLR approaches, respectively.Overall, the present study suggests using ANNs-based methods, especially MLPNNs, to predict saturated and near-saturated K ψ .The proposed modeling approaches can be applied to predict hardly measurable soil attributes in different regions.Generally, the strengths of the present study were: i) obtaining relatively accurate to accurate predictions of saturated and near-saturated K ψ in calcareous soils using MLPNNs, and ii) showing the high capability of artificial intelligence algorithms for modeling K ψ compared to a classical regression approach (SMLR).However, other studies can be performed for testing other machine learning modeling procedures to predict soil-water-related parameters, especially K ψ , and compare their potential with artificial intelligence algorithms.Furthermore, the developed SMLR-PTFs, can practically be tested in the same regions and then be used to predict K ψ .For the ANNs-based approaches used, we suggest developing special MLPNNs and RBFNNs algorithms for predicting K ψ by using easily measurable soil attributes in different areas.

Conclusions
The importance of K ψ in water and chemical movements in the soil as well as the difficulties and time-consuming nature of its' measurement caused performing the present study aimed to investigate the potential of SMLR, MLPNNs, and RBFNNs approaches to predict K ψ at 15, 10, 5, and 0 (saturated condition) cm tensions using easily measurable soil attributes in calcareous soils.According to Pearson's correlation coefficients, the PSD-related parameters were more correlated with K 15 and K 10 compared to structure-related parameters.The opposite results were obtained for K 5 and K 0 .The MLPNNs approach provided the best prediction of K ψ at all applied tensions by 0.71 � R 2 val � 0.82 using recognized easily measurable soil attributes by SMLR-PTFs.In addition, the RBFNNs predicted K ψ at all applied tensions acceptably to accurately with 0.58 � R 2 val � 0.78.Although the developed SMLR-PTFs provided simple equations with relatively acceptable accuracy (0.52 � R 2 val � 0.63), the present study recommends ANNs-based approaches (specifically MLPNNs) to predict K ψ at different tensions due to higher capability.Generally, the accuracy of K ψ prediction at all applied tensions by using different approaches and easily measurable soil attributes was ranked as MLPNNs > RBFNNs > SMLR-PTFs.The K ψ is a crucial factor in water resources management and in the field of soil science and it should be accurately determined in arid and semi-arid regions.The calcareous soils are normally placed in arid and semi-arid regions that face shortage of fresh water for agricultural activities.In addition, lime (calcium carbonate) can help aggregation, improve soil structure, and subsequently affect K ψ in calcareous soils.The findings of the present study, especially the developed SMLR-PTFs, can practically be tested and then used in the other similar soils.Furthermore, the soils of different areas can benefit from the applied ANNs approaches to receive accurate predictions of K ψ .Overall, machine learning algorithms (like those used in the present study) are beneficial for soil science researchers and professionals to find accurate PTFs for predicting and mapping the hardly measurable soil attributes in order to using in precision agriculture.For future works, we recommend paying more attention to predict soil hydraulic attributes in arid and semi-arid areas all over the world for using in water resources management and modeling the water and material transport within the soil.In addition, future studies can be performed to evaluate the performances of different machine learning approaches, like wavelet-neural networks, support vector regression, random forest, decision tree, cubist, k-nearest neighbors, etc. for predicting K ψ .To conclude the present study, the readers should consider two important issues: i) accurate predictions of saturated and near-saturated hydraulic conductivity in arid and semi-arid regions, and ii) application of powerful machine learning algorithms for accurate prediction of hardly measurable soil attributes.

Table 3 . The descriptive statistics and related normality test parameters for the selected attributes of the studied soils (n = 102).
ns †: ns demonstrate no significant difference with the normal distribution and * indicates significant difference with the normal distribution at the probability level of 5%.https://doi.org/10.1371/journal.pone.0296933.t003

Table 5 . The performance criteria for predicting K ψ using easily measurable soil attributes by applying the SMLR, MLPNNs, and RBFNNs approaches.
NRMSE val < 30%) and K 5 was predicted with good accuracy (NRMSE val = 13.1%).The NS val values showed acceptable accuracy (0.65 < NS val < 0.80) for K ψ at all applied tensions and RPD val values showed very good predictions of K 5 and K 0 (2.05 and 2, respectively) and fair predictions of K 15 and K 10 (1.71 and 1.76, respectively).