Modelling key ecological factors influencing the distribution and content of silymarin antioxidant in Silybum marianum L.

Abstract

The increasing demand for natural medicine has increased the significance of Silybum marianum as a valuable medicinal plant. It is used to restore liver cells; reduce blood cholesterol; prevent prostate, skin, and breast cancer; and protect cervical cells and kidneys. To identify the ecological factors affecting the distribution and silymarin content of S. marianum, three machine learning algorithms, namely boosted regression trees (BRT), random forest (RF), and support vector machines (SVM), were applied in Fars Province, Iran. Fourteen factors affecting S. marianum growth and development were determined and subsequently converted into raster maps for the modeling phase using a Geographic Information System (GIS). Subsequently, the Receiver Operating Characteristic (ROC) curve and the random forest algorithm were used to evaluate the models and the significance of the factors, respectively. The results showed that the RF (ROC: 0.99), BRT (ROC: 0.98), and SVM (ROC: 0.96) models were highly accurate in predicting the habitat suitability of S. marianum. The results of the RF algorithm also revealed that distance from roads, elevation, and mean annual rainfall had the most significant influence on the habitat suitability of S. marianum. In addition, mean annual rainfall, mean annual temperature, and elevation had the highest effects on silymarin accumulation. In general, the northern and northwestern regions of Fars Province offer optimal environmental conditions for the growth of S. marianum. The southern and southwestern regions of Fars Province, characterized by higher temperatures and lower precipitation, are suitable for the enhanced biosynthesis of silymarin and for the expansion of its cultivation and production. This study provides a robust framework for understanding the ecological preferences of S. marianum and optimizing its cultivation and management for pharmaceutical applications. By identifying the most influential environmental variables, this research can support the sustainable utilization of this species, enhancing both its conservation and its use as a medicinal resource.

1. Introduction

Milk thistle (Silybum marianum L.) is one of the most important plants in the Asteraceae family and can grow as either an annual or a biennial species [1]. It is native to southern Europe, the Mediterranean, and North Africa, and is also used as a medicinal plant in Australia, New Zealand, and North and South America [2,3]. According to Shokrpour et al. [4], S. marianum ecotypes are endangered because of overgrazing, poor farm management, and increasing pastures across several geographical regions of Iran, particularly in the north, northwest, west, and southwest. This herb has been used since ancient times to treat chronic liver illnesses and to protect the liver from toxins, and it has been known for millennia as a liver tonic [5]. Silybum marianum can also be used to produce polymers or biodiesel, and as a source of animal feed and seed oil [6,7].

The therapeutic properties of S. marianum have been attributed to the concentration of active flavonolignans, with silymarin being one of the most significant components [8,9]. Silymarin, a complex mixture of flavonolignans including silybin, silydianin, and silychristin, has been acknowledged for its numerous applications in agriculture as well as its therapeutic potential in human medicine [10,11]. Silymarin is a strong antioxidant in S. marianum. It is used to restore liver cells, reduce blood cholesterol, prevent prostate, skin, and breast cancer, and protect cervical cells and kidneys [12].

In recent years, the market for medicinal herbs and nutritional supplements has experienced substantial growth, with milk thistle now ranking among the top-selling medicinal herbs in several countries, including the US and Italy [3,13]. Some wild species of medicinal plants are being overexploited because of this increasing demand [14,15]. Thus, it is necessary to cultivate them extensively under conditions conducive to enhancing the quality of their beneficial components [16].

The distribution of medicinal plants from one location to another is determined by the relationship between plants and their growth conditions [17]. Various ecological factors such as mean annual rainfall, mean annual temperature, and minimum temperature during the coldest season of the year significantly affect the geographical distribution of species [18]. Understanding this relationship is vital because environmental and climatic factors play pivotal roles in determining the success or failure of medicinal plant cultivation [19,20]. Models play a crucial role in predicting plant ecological processes, offering insights that enhance our understanding of agricultural systems and enable the informed decision-making essential to agricultural engineering [21]. In recent years, the integration of artificial intelligence (AI) and GIS into agricultural models has attracted significant interest from researchers. GIS has brought significant changes to modern precision agriculture by producing high-quality maps and capturing the spatial and temporal changes in plant and soil characteristics [22].

Species distribution models (SDM) use field-measured data, supplementary maps, and specialist knowledge to quantify the connection between the spread of plants and environmental factors and to estimate the real or potential distribution of a species [23]. SDM assesses the relationship between environmental conditions and plant dispersion to determine its potential range. These models can predict current and future species distributions, quantify the impacts of environmental factors, and estimate their distribution in data-scarce regions [24]. Critical aspects of this application require careful selection of appropriate habitats to guarantee the accuracy and reliability of predictions and access to relevant high-quality data.

At the regional scale, including Fars Province in Iran, S. marianum is of ecological and economic importance because of its medicinal value and potential as a source of income for local communities. However, understanding its distribution and predicting suitable habitats is critical for sustainable harvesting and prevention of potential invasiveness. Previous studies have demonstrated the utility of SDM based on machine learning techniques, such as RF, BRT, and SVM, to accurately predict plant species habitats [25–27]. These models have proven to be particularly effective when integrated with GIS and remote sensing data.

Globally, studies such as those by Mollalo et al. [28] and Kohansarbaz et al. [29] have highlighted the relative performance of machine-learning algorithms in mapping vegetation. For instance, RF consistently outperformed SVM and BRT in accuracy metrics, whereas other approaches, such as artificial neural networks (ANN) and gradient boosting machines (GBM), also demonstrated promise under certain conditions. Incorporating these insights at the regional level can enhance predictions for S. marianum in Fars Province, considering the unique environmental and climatic characteristics of the area.

This study employed the RF, SVM, and BRT models to identify the key ecological features and the regions where S. marianum is highly likely to occur. In addition, the RF model was used to identify the most important ecological parameters influencing the active ingredient silymarin. S. marianum is an extremely valuable therapeutic plant that can be grown in Fars Province. This plant can be cultivated by creating maps that identify the most active, ent-rich, and vulnerable areas. It is one of Iran’s most significant medicinal plants, and its cultivation plays a crucial role in preventing the ecosystem destruction that results from the region’s reliance on wild plants. Although many studies worldwide have examined the bioactivity, phytochemistry, and genetics of milk thistle, there has been relatively little exploration of its basic agronomy and farming potential. In this context, the current study aimed to determine the performance of the RF, BRT, and SVM models in predicting the distribution of S. marianum growth and silymarin production in Fars Province. The findings of this study can be used as a decision-making tool by farmers and researchers to optimize the cultivation, development, and industrial production of this medicinal plant and thereby increase yield and product quality. These findings can also help develop new strategies for the industrial production of plant active ingredients and the development of medicinal plant cultivation. This research specifically addressed the following questions: (1) How do different ecological factors influence the habitat suitability of S. marianum in different regions of Fars Province? (2) What are the most significant environmental variables affecting silymarin accumulation in S. marianum across various climatic regions?

2. Materials and methods

Fig 1 illustrates the methodological flowchart of this study, encompassing inventory mapping, the division of data into training and validation sets, the selection and preparation of environmental predictors, the modeling procedure, the validation of the outcomes, and the selection of the superior model.

Fig 1. The flowchart of the modeling process of S. marianum habitat and its production.

https://doi.org/10.1371/journal.pone.0322442.g001

2.1. Study area

This study was conducted in Fars Province, southern Iran (Fig 2), which extends between 27.05° and 31.67° N and between 50.60° and 55.58° E. The province covers an area of approximately 12.4 million ha, of which pastures represent 86.29% [30]. The region experiences a varied climate, characterized by an average annual rainfall of 300 mm, an average annual temperature of 17 °C, and an average elevation of 1500 m [31], and supports a rich variety of vegetation, including 144 valuable plant species [32]. However, overgrazing, harvesting of plant species, and land use change have led to the destruction of rangelands and the decline of natural plant species in this province over the past decade [33,34].

Fig 2. The research area’s location in Iran’s Fars province.

https://doi.org/10.1371/journal.pone.0322442.g002

2.2. Methodology

2.2.1. Mapping of S. marianum presence.

A total of 117 S. marianum sites were found across all counties of Fars Province during a field study conducted in 2021. These locations were recorded using a global positioning system (GPS) device (Garmin Map 62s, USA) (Fig 3). Soil and seed samples were collected during the physiological ripening phase at the locations where the medicinal plant was found. Ecological factors, namely slope degree, elevation, slope aspect, plan curvature, clay percentage, sand percentage, silt percentage, carbon percentage, pH, nitrogen percentage, mean annual rainfall, mean annual temperature, distance from rivers, and distance from roads, were used to determine their correlation with species distribution and with the active component silymarin [35,36]. In the current study, 70% of the presence data (82 locations) were used to train the models and the remaining 30% were used for the validation phase [37].
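The 70/30 partition described above can be illustrated with a minimal Python sketch. This is not the authors' procedure: the site identifiers, the random seed, and the function name split_presence are hypothetical, and the study does not state how the random split was drawn.

```python
import random

def split_presence(records, train_fraction=0.7, seed=42):
    """Shuffle presence records and split them into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical identifiers standing in for the 117 GPS-recorded presence sites.
sites = [f"site_{i:03d}" for i in range(1, 118)]
train, valid = split_presence(sites)
print(len(train), len(valid))  # 82 35
```

With 117 records, rounding 70% gives the 82 training locations reported in the text and leaves 35 for validation.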

Fig 3. Identification and collection of S. marianum in the study area (photo taken with a Canon SX710 HS camera by Mahbobe Hojati).

https://doi.org/10.1371/journal.pone.0322442.g003

2.3. Preparation of variables

The best sites for S. marianum habitats and their impact on the quantity of silymarin in Fars Province were modeled using a variety of ecological parameters. The topographic variables considered were elevation, slope, aspect, and plan curvature. These parameters were derived from a 30-m-resolution Digital Elevation Model (DEM) and converted into raster layers using ArcGIS 10.8.0 (http://data.aoos.org/maps/sensors/#l=sensor-stations) (Fig 4A–4D).

Fig 4. Effective factors on the growth and development of S. marianum: “slope aspect” (A), “DEM (m)” (B), “slope angle” (C), “plan curvature” (D), “clay percent” (E), “sand percent” (F), “silt percent” (G), “carbon percent” (H), “pH” (I), “Nitrogen percent” (J), “mean annual rainfall (mm)” (K), “mean annual temperature (°C)” (L), “distance from rivers (m)” (M), and “distance from road (m)” (N).

https://doi.org/10.1371/journal.pone.0322442.g004

The physical attributes of the soil (such as the proportions of sand, silt, and clay), chemical attributes (such as pH, EC, nitrogen, and organic matter), mean annual rainfall, mean annual temperature, and distance from roads and rivers were measured. Soil samples (117 sites) were collected from locations where S. marianum was found and processed at the Central Laboratory of Shiraz University, the Soil Science Laboratory, and the Azma Pars Laboratory.

To determine the physical characteristics of the soil (percentage of clay, sand, and silt), air-dried samples were first passed through a 2-mm sieve and then analyzed using the hydrometer method [38]. The soil pH was determined using a pH meter [39].

The amount of soil organic carbon (SOC) in the soil samples was measured using the Walkley-Black titration method and calculated using Equation 1 [40].

\[ \mathrm{SOC}\,(\%) = \frac{(V_1 - V_2) \times N \times 0.003 \times 100}{S \times 0.76} \tag{1} \]

where SOC is the soil organic carbon content (%); V1 and V2 are the volumes of ferrous ammonium sulfate used for the control and the soil sample (mL), respectively; N is the normality of the ferrous ammonium sulfate; S is the dry weight of the soil sample (g); 0.003 is the factor converting titrant volume to grams of carbon (the milliequivalent weight of carbon); and 0.76 is the fraction of organic carbon oxidized.
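As a worked illustration, the calculation can be sketched in Python under the assumption that Equation 1 takes the standard Walkley-Black form implied by the listed terms, SOC (%) = (V1 − V2) × N × 0.003 × 100 / (S × 0.76). The function name and the titration values below are hypothetical and are not taken from the study.

```python
def soc_percent(v_blank_ml, v_sample_ml, normality, soil_dry_g,
                carbon_factor=0.003, oxidized_fraction=0.76):
    """Soil organic carbon (%) from a Walkley-Black titration (assumed standard form)."""
    titrant_difference = v_blank_ml - v_sample_ml            # V1 - V2, in mL
    carbon_g = titrant_difference * normality * carbon_factor  # grams of carbon oxidized
    # Correct for incomplete oxidation and express per 100 g of dry soil.
    return carbon_g * 100.0 / (soil_dry_g * oxidized_fraction)

# Hypothetical titration: 10.5 mL for the blank, 7.5 mL for the sample, 0.5 N titrant, 1 g soil.
example = soc_percent(10.5, 7.5, 0.5, 1.0)
print(round(example, 2))  # 0.59
```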

The amount of nitrogen in the soil samples was determined by the Kjeldahl method following Bremner’s instructions [41]. Isotherm and isohyet lines (lines of equal temperature and precipitation) were prepared by the Regional Meteorological Organization of Fars Province to create the climatic layers. The average values of annual rainfall and temperature were calculated, and raster maps with a resolution of 30 m were created using the inverse distance weighting (IDW) algorithm [42] in ArcGIS 10.8.0 (Fig 4K and 4L). In IDW, the closer a sample point is to the cell being estimated, the greater its influence on the estimate. Therefore, sets of points with different search radii were used to create the maps and improve the interpolation [43].
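The weighting idea behind IDW can be shown with a short Python sketch. It is a generic illustration, not the ArcGIS implementation: the power of 2, the absence of a search radius, and the station coordinates and rainfall values are all assumptions made for the example.

```python
import math

def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sample_x, sample_y, value) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the target coincides with a sample point
        weight = 1.0 / distance ** power  # nearer points receive larger weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (x_km, y_km, mean annual rainfall in mm).
stations = [(0.0, 0.0, 200.0), (10.0, 0.0, 400.0), (0.0, 10.0, 300.0)]
print(idw(stations, 0.0, 0.0))  # 200.0, the value of the coincident station
print(idw(stations, 1.0, 1.0))  # dominated by the nearest station
```

Applying the function to the center of every raster cell would produce an interpolated surface; a search radius could be added by skipping samples beyond a chosen distance.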

Similarly, vector maps of roads and rivers (scale of 1:25,000) were used to prepare the distance-from-roads and distance-from-rivers layers. The layers were created using the Euclidean distance (ED) algorithm in ArcGIS 10.8.0 (Fig 4M and 4N). This algorithm creates maps based on the distance of each cell from the nearest feature, and the classification and grouping of algorithms are influenced by the effective distance metric [44].
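A brute-force Python sketch of a Euclidean distance raster is given below for illustration. It assumes the road or river has already been rasterized into a 0/1 mask and uses the 30 m cell size mentioned for the other layers; the function name and the toy mask are hypothetical, and GIS software uses far more efficient algorithms.

```python
import math

def euclidean_distance_raster(feature_mask, cell_size=30.0):
    """Distance (in map units) from each cell to the nearest feature cell."""
    rows, cols = len(feature_mask), len(feature_mask[0])
    features = [(r, c) for r in range(rows) for c in range(cols) if feature_mask[r][c]]
    if not features:
        raise ValueError("the mask contains no feature cells")
    distances = []
    for r in range(rows):
        row = []
        for c in range(cols):
            nearest = min(math.hypot(r - fr, c - fc) for fr, fc in features)
            row.append(nearest * cell_size)  # convert cell units to metres
        distances.append(row)
    return distances

# Toy mask: 1 marks cells crossed by a road, 0 marks all other cells.
road_mask = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
road_distance = euclidean_distance_raster(road_mask)
print(road_distance[0][0], road_distance[1][1])  # 30.0 0.0
```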

Finally, the IDW algorithm was used to create the clay percentage, silt percentage, sand percentage, pH, carbon, and nitrogen maps in ArcGIS 10.8.0 (Fig 4E–4J).

2.4. Determination of silymarin

To quantify the effective chemicals, samples of pyxidiums were collected while they were still in the physiological ripening stage. The seeds were first removed from the pyxidium, and 5 g of seeds from each treatment was finely ground using a mill, enclosed in filter paper envelopes, and subsequently subjected to Soxhlet extraction.

Prior to placement in the Soxhlet apparatus, the flasks were weighed and 200 mL of hexane was added to each flask attached to the Soxhlet. The flasks were then heated for 8 h at 65 °C (the boiling point of hexane is 65–70 °C), allowing the oil to separate from the sample and dissolve in hexane. The resulting oil-free powders were then extracted with methanol for 8 h to obtain the silymarin. The extracts were subsequently maintained at 50 °C for 5 h to yield a yellow powder after methanol evaporation [45] (Fig 5).

Fig 5. Procedure of measuring secondary metabolites (Silymarin).

https://doi.org/10.1371/journal.pone.0322442.g005

2.5. Modeling process

Three algorithms were used to model habitat suitability and changes in the secondary metabolites of S. marianum. The RF, BRT, and SVM algorithms were used to model habitat suitability, whereas the RF algorithm alone was used to model changes in silymarin levels in Fars Province, Iran.

2.5.1. RF.

Random forest (RF) is a non-parametric, tree-based ensemble method that combines multiple regression and classification trees [46]. Three parameters were considered when building an RF: the fraction of subsamples, the number of candidate predictors evaluated at each node, and the optimal number of trees [47]. Using bootstrap aggregation, also known as bagging, the method generates a large number of decorrelated trees and averages their predictions [25]. Within each decision tree, the data are split into progressively smaller groups by a chain of simple decision rules, and the groups become increasingly homogeneous with successive splits [48]. The accuracy of an RF depends on the predictive ability of each tree and on the correlation between trees [46]. RF is widely used in machine learning because it performs well compared with other classification algorithms and requires fewer testing samples [49,50]. Resistance to overfitting is another feature of this classification technique, which reduces the need for certain procedures such as cross-validation [46].

Because the RF technique is non-parametric, it can accommodate a variety of explanatory factors and is flexible enough to represent hierarchical relationships among the explanatory variables as well as nonlinear relationships between response and explanatory factors [51]. The models in this study were implemented using R version 3.5.3 (https://cran-archive.r-project.org/bin/windows/base/old/3.5.3/) and the SDM Package [52]. An RF classifier requires two parameters to be configured: the number of trees and the number of input factors tried at every node [53].
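The roles of the two RF parameters can be illustrated with a deliberately simplified Python sketch. It is not the R implementation used in the study: each "tree" is reduced to a one-split stump, so the random subset of mtry features is drawn once per tree rather than at every node, and the presence/absence data are invented. The names fit_stump, fit_forest, and predict_proba are hypothetical.

```python
import random

def fit_stump(X, y, feature_ids):
    """Return the one-split rule (feature, threshold, label_above) with fewest errors."""
    best = None  # (errors, feature, threshold, label_above)
    for f in feature_ids:
        for threshold in sorted({row[f] for row in X}):
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[f] > threshold else 1 - label_above) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, label_above)
    return best[1:]

def fit_forest(X, y, n_trees=50, mtry=2, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random subset of mtry features."""
    rng = random.Random(seed)
    n_rows, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # draw with replacement
        feature_ids = rng.sample(range(n_features), mtry)
        forest.append(fit_stump([X[i] for i in sample], [y[i] for i in sample], feature_ids))
    return forest

def predict_proba(forest, row):
    """Fraction of stumps voting for presence (class 1)."""
    votes = sum(
        (label_above if row[f] > threshold else 1 - label_above)
        for f, threshold, label_above in forest
    )
    return votes / len(forest)

# Invented data: [elevation_m, rainfall_mm, uninformative flag]; 0 = absence, 1 = presence.
X = [[1000 + 50 * i, 200 + 10 * i, i % 2] for i in range(20)]
y = [0] * 10 + [1] * 10
forest = fit_forest(X, y, n_trees=50, mtry=2)
print(predict_proba(forest, [1950, 390, 0]))  # high suitability expected
print(predict_proba(forest, [1000, 200, 1]))  # low suitability expected
```

Averaging the votes of many such weak learners is what turns individual presence/absence predictions into a continuous suitability score.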

2.5.2. BRT.

BRT is especially recommended when understanding ecosystem dynamics is as important as model precision [9].

BRT combines regression trees with a boosting algorithm to form a powerful modeling technique. It provides an effective framework for ecologists to examine the relationships between biological processes and predictor variables. The method accommodates a diverse array of input data types and distributions and efficiently handles missing or erroneous data [54]. The hierarchical structure of the decision trees in BRT inherently models interactions among variables, thereby removing the need to assume their independence. These capabilities, along with the ability to handle heterogeneous input data, make BRT a highly efficient instrument for ecological research [55,56].

The SDM Package [52] in R version 3.5.3 (https://cran.r-project.org/web/packages/gbm/index.html) was used to run the BRT models [57]. Given the nature of the response variable, a Gaussian error distribution was used for the loss function in the BRT analysis [58]. The following settings also affected the BRT model fit: (1) the number of cross-validation folds specifies how many times the data are randomly split for model fitting and validation, (2) the bag fraction sets the proportion of the observations used at each step, (3) tree complexity controls the level of interactions in the BRT, and (4) the learning rate determines the contribution of each tree to the growing model [54].
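How the learning rate scales the contribution of each tree under a Gaussian (squared-error) loss can be sketched in Python as follows. This is a simplified illustration, not the gbm implementation: it uses a single predictor, a tree complexity of 1 (stumps), no bag fraction, and no cross-validation, and the rainfall and silymarin values are invented.

```python
def fit_regression_stump(x, residuals):
    """Best single split on x, minimizing the squared error of the residuals."""
    best = None  # (sse, threshold, left_mean, right_mean)
    for threshold in sorted(set(x))[:-1]:  # drop the maximum so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def boost(x, y, n_trees=100, learning_rate=0.1):
    """Fit stumps sequentially to the residuals of the current model."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the ordinary residual.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_regression_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return base, learning_rate, stumps

def predict(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left_mean if xi <= threshold else right_mean
        for threshold, left_mean, right_mean in stumps
    )

# Invented data: mean annual rainfall (mm) and silymarin content (%).
rainfall = [150, 200, 250, 300, 350, 400, 450, 500]
silymarin = [4.1, 3.8, 3.6, 3.0, 2.7, 2.5, 2.1, 1.9]
model = boost(rainfall, silymarin)
print(round(predict(model, 150), 2), round(predict(model, 500), 2))
```

A smaller learning rate makes each tree contribute less, so more trees are needed to reach the same fit; this trade-off is why the two settings are usually tuned together.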

2.5.3. SVM.

Initially proposed by Cortes and Vapnik [59], SVMs are supervised learning techniques with associated learning algorithms that analyze and identify patterns in input and output data, much like ANN. SVM has shown promise as a replacement for deterministic modeling and estimation techniques in recent years, and its use warrants further investigation. ANN training, however, can be affected by the random initialization of untrained networks and by changes in the stopping criteria when the model parameters are optimized [60]. SVM-based approaches do not share these issues and restrictions; they also have a straightforward theoretical foundation and are dependable instruments for modeling and engineering [61]. Model parameters such as the number of nodes and hidden layers do not need to be tuned for SVM training techniques, which converge to an optimum faster than ANN [59].

The foundation of SVM is the structural risk minimization concept [62]. SVM is among the most advanced non-parametric supervised classification methods currently available and can be configured in a variety of ways depending on the kernel function used to transform the input space into the feature space. Several functions are commonly employed as kernel functions in SVM, such as the linear, polynomial, radial basis function (RBF), and multilayer perceptron kernels [63]. With kernels, the essential computations are performed directly in the input space [64]. Although SVM can be viewed as a linear method in a high-dimensional feature space, this does not necessarily imply that the input-output mapping problem itself involves high-dimensional features. The models in this study were implemented using R version 3.5.3 (https://cran.r-project.org/web/packages/gbm/index.html) and the SDM Package [52].
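The four kernel families named above can be written out in a few lines of Python. These are the textbook definitions rather than the settings used in the study, which does not report its kernel or hyperparameters; the default values of degree, coef0, and gamma and the example vectors are assumptions.

```python
import math

def linear_kernel(a, b):
    """Plain dot product of two feature vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def polynomial_kernel(a, b, degree=3, coef0=1.0):
    return (linear_kernel(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=0.5):
    """Similarity decays with the squared Euclidean distance between a and b."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)

def mlp_kernel(a, b, gamma=0.01, coef0=0.0):
    """Multilayer perceptron (sigmoid) kernel."""
    return math.tanh(gamma * linear_kernel(a, b) + coef0)

# Two hypothetical, already standardized sites described by two predictors.
site_a, site_b = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(site_a, site_b))                 # 11.0
print(polynomial_kernel(site_a, site_b, degree=2))   # 144.0
print(rbf_kernel(site_a, site_a))                    # 1.0
```

Each function returns a similarity computed from the original predictor values, which is what allows an SVM to work in the feature space without constructing it explicitly.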

2.6. Determining the best model

In this research, the ROC curve was used to assess the models and to determine the best one. The Y and X axes of the ROC curve are the true positive rate and the false positive rate, respectively; therefore, the closer the curve lies to the upper-left corner, the better the model performs. Likewise, a larger area under the curve (AUC) indicates a more accurate model [65,66]. Several studies have confirmed the reliability of the ROC curve [67–70]. The area under the curve indicates the accuracy of the models as follows: 0.5–0.6 (poor), 0.6–0.7 (moderate), 0.7–0.8 (good), 0.8–0.9 (very good), and 0.9–1 (excellent) [71].
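The AUC and its interpretation classes can be illustrated with a small Python sketch. It computes the AUC through its rank interpretation (the probability that a randomly chosen presence site scores higher than a randomly chosen absence site) instead of integrating the curve. The assignment of boundary values to the higher class, the label for values below 0.5, and the example scores are assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the share of presence/absence pairs ranked correctly (ties count 0.5)."""
    presences = [s for label, s in zip(labels, scores) if label == 1]
    absences = [s for label, s in zip(labels, scores) if label == 0]
    if not presences or not absences:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))

def rate_auc(auc):
    """Map an AUC value to the accuracy classes listed in the text."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "very good"
    if auc >= 0.7:
        return "good"
    if auc >= 0.6:
        return "moderate"
    if auc >= 0.5:
        return "poor"
    return "worse than random"  # label not defined in the source

# Hypothetical validation labels (1 = presence) and predicted suitability scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
auc = roc_auc(labels, scores)
print(auc, rate_auc(auc))  # 0.75 good
```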

3. Results

3.1. Examining the co-linear effect of factors

In this study, 14 factors that influence the growth and development of the medicinal plant S. marianum were used. Tolerance (TOL) and variance inflation factor (VIF) indices were used to assess collinearity. Collinearity was considered to exist between variables when TOL < 0.1 and VIF > 5 [72]. The results of this study showed no collinearity between the independent factors (Table 1).
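The two collinearity indices are linked through the coefficient of determination R² obtained by regressing one predictor on all the others: TOL = 1 − R² and VIF = 1 / TOL. The Python sketch below applies the thresholds stated above; the function names and the R² values are hypothetical and are not taken from Table 1.

```python
def tolerance_and_vif(r_squared):
    """TOL and VIF for a predictor whose regression on the other predictors gives r_squared."""
    tolerance = 1.0 - r_squared
    vif = float("inf") if tolerance == 0.0 else 1.0 / tolerance
    return tolerance, vif

def is_collinear(r_squared, tol_limit=0.1, vif_limit=5.0):
    """Apply the rule used in the text: TOL < 0.1 and VIF > 5."""
    tolerance, vif = tolerance_and_vif(r_squared)
    return tolerance < tol_limit and vif > vif_limit

# Hypothetical R-squared values for two predictors.
print(tolerance_and_vif(0.36))                # roughly (0.64, 1.56): no collinearity
print(is_collinear(0.36), is_collinear(0.95))  # False True
```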

3.2. Quantitative modeling

3.2.1. Habitat suitability by RF.

The results of the random forest model are shown in Fig 6A. The habitat suitability map demonstrated that the distribution of S. marianum in Fars Province was not uniform. For instance, the northwest, west, and south of Fars had more favorable habitats for S. marianum (Fig 6A). The habitat suitability map was divided into four classes (low, moderate, high, and very high) using the natural breaks method [73]. As shown in Table 2, the study area exhibited varying levels of habitat suitability ranging from low to very high.

Table 2. Distribution percentage of habitat suitability of Silybum marianum in Fars province.

https://doi.org/10.1371/journal.pone.0322442.t002

Fig 6. Habitat suitability of Silybum marianum using RF (A), BRT (B), and SVM (C) models and silymarin.

https://doi.org/10.1371/journal.pone.0322442.g006

3.2.2. Habitat suitability by BRT.

Based on the BRT algorithm, S. marianum had greater habitat suitability in the northwest, west, and south of Fars Province (Fig 6B). As shown in Table 2, habitat suitability was assigned to the low, moderate, high, and very high classes. The proportion of each class varied across the study region, with the very high class having the lowest proportion and the low class the highest.

3.2.3. Habitat suitability by SVM.

The SVM produced approximately the same results as RF and BRT, with only a small difference: in addition to the northwestern, western, and southern regions, parts of the eastern regions of Fars Province were also suitable habitats for S. marianum. Table 2 presents the habitat suitability across the different classes, ranging from low to very high.

3.3. Modeling the quality change of silymarin

In this study, the RF algorithm was used to model the quality change of silymarin in Fars Province. The results demonstrate that the concentration of silymarin varied among S. marianum plants from different geographic locations. Specifically, S. marianum in the southern region of Fars Province had greater silymarin concentrations. In addition, S. marianum had favorable silymarin concentrations in the central, western, and eastern regions of Fars Province. However, the silymarin concentration of S. marianum decreased significantly in the north (Fig 7). The silymarin ratio of S. marianum was categorized as very high, high, moderate, and low (Table 3).

Table 3. Percentage change in quality of silymarin in Fars province.

https://doi.org/10.1371/journal.pone.0322442.t003

3.4. Evaluation of classification algorithms

The ROC curve and area under the curve (AUC) were used to assess the quantitative models. SVM, BRT, and RF had AUC values of 0.96, 0.98, and 0.99, respectively. These results showed that the habitat suitability of S. marianum could be accurately predicted using the three aforementioned models (Table 4). The RF model predicted changes in silymarin quality with a low root mean square error (RMSE = 0.013), indicating high accuracy.
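For reference, the RMSE used to evaluate the silymarin model is the square root of the mean squared difference between observed and predicted values. A minimal Python sketch follows; the observed and predicted values are invented and are not the study's data.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Invented silymarin values (observed vs. predicted) for four validation sites.
observed = [2.10, 3.40, 1.80, 2.90]
predicted = [2.12, 3.39, 1.79, 2.91]
print(rmse(observed, predicted))
```

Because the RMSE is expressed in the units of the response variable, a value such as 0.013 is best judged against the range of the measured silymarin values.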

3.5. Determining the importance of the factors

The RF method was used to identify the most significant ecological factors affecting habitat suitability. Distance from roads, elevation, and mean annual rainfall had the greatest effects on the habitat suitability of S. marianum, whereas plan curvature, slope aspect, and distance from rivers had the lowest impacts (Fig 8). Because S. marianum in Fars Province often grows alongside roads, the high importance of distance from roads is plausible. Similarly, the random forest results for the qualitative modeling process showed that mean annual rainfall, mean annual temperature, and elevation were the most prominent factors in silymarin accumulation (Fig 9).

Fig 8. Determining the importance of factors in quantitative modeling by the RF algorithm.

https://doi.org/10.1371/journal.pone.0322442.g008

Fig 9. Determining the importance of factors in qualitative modeling by the RF algorithm.

https://doi.org/10.1371/journal.pone.0322442.g009

4. Discussion

Habitat development [74], habitat conservation [75], and invasive species control [76] have all utilized the habitat suitability modeling framework, leveraging the machine learning techniques introduced in this study. This approach is not affected by the geographic location, scale resolution, or plant species distribution. These characteristics may have important ecological applications in the management and conservation of medicinal plants in the future.

Habitat fragmentation poses a threat to biodiversity. Thus, the primary goal of future conservation programs is to preserve and restore habitat ecosystems [77]. Implementing beneficial measures requires evaluating the effectiveness of natural habitats and mapping their areas. When assessing habitat suitability, multicollinearity among the predictor variables frequently increases the noise in the models [78]. However, in this study, none of the environmental, climatic, or soil variables included as conditioning factors in the S. marianum habitat suitability model showed signs of multicollinearity.

Habitat suitability modeling predicts the number of plant species using important environmental factors [79]. In addition, it makes it possible to forecast habitats [80]. Habitat suitability modeling is used for most ecological events to prevent the extinction of plants and animals through accurate predictions [81]. For example, various investigations have used habitat suitability modeling to assess the suitability of habitats for species such as Panthera tigris and Melursus ursinus [82,83]. In general, the use of spatial modeling to predict natural events is suggested for the following reasons: (1) protecting the ecosystem, (2) controlling invasive species, (3) preventing the extinction of endangered species, and (4) assisting in managing endangered plant species [84,85]. However, few studies have investigated the qualitative characteristics of these medicinal plants. The Willingness To Pay (W.T.P.) of the local population for the preservation of the Seine Estuary Wetlands, a significant and endangered biological region in Northern France, was determined using a contingent valuation survey. In that study, a random forest was used for qualitative modeling [86].

RF, BRT, and SVM are frequently used in such studies [87,88]. RF allows for a rapid and simple evaluation of both the current and prospective occurrences of a species [89]. Mollalo et al. [28] proposed that BRT and SVM classifiers are useful and low-cost methods for determining the habitat suitability of a species when combined with GIS and remote sensing data.

The RF model is the most effective and powerful model for determining habitat suitability [90]. Accordingly, earlier research contrasting the RF and SVM techniques revealed that both models had the highest overall accuracy in terms of forecasting the appropriateness of a given environment [91]. Compared to the CART and GLM models, the BRT model has been shown to be the most sustainable model [92,93]. Massada et al. [94] reported that the BRT and RF models can be used to simulate the occurrence of fires in forests and rangelands. The number of trees can affect the quality of regression-based models such as BRT and RF [95].

In addition, the models that required a training sample (RF, BRT, and SVM) were optimized in the shortest amount of time; hence, separate training pattern optimization was required for the proximity, density, and inhomogeneity variables. Muñoz-Mas et al. [96] suggested that optimizing training data may enhance algorithmic outcomes. According to Mollalo et al. [28], AUC-ROC is a key index for presence-absence classification models (RF, BRT, and SVM) that evaluates how well a model can differentiate between presence and absence. Thus, the AUC provides a single measure of discrimination across all thresholds, equivalent to the nonparametric Wilcoxon statistic [97]. The AUC values of the RF (0.99), BRT (0.98), and SVM (0.96) models were all reasonable and appropriate, and the evaluation of the models did not show a discernible difference between the algorithms. Previous studies have confirmed small differences in AUC values in the order BRT < SVM < RF [98].

The use of machine learning algorithms, including RF, BRT, and SVM, offers significant advantages in conservation and management planning, particularly for habitat suitability modeling. These models handle complex non-linear relationships between predictor variables and species distribution. For instance, RF is robust to overfitting and can identify key environmental factors that influence species distribution [18]. BRT is effective in combining the strengths of regression and decision trees to achieve high predictive accuracy, whereas SVM is particularly suitable for small and imbalanced datasets, offering flexibility in modeling complex ecological patterns [99]. These techniques, coupled with GIS and remote sensing data, facilitate the creation of spatially explicit maps that are essential for identifying priority conservation zones [100].

Despite their benefits, however, these models have certain limitations. RF and BRT can be computationally demanding, particularly for large datasets, and may lack interpretability compared to traditional statistical approaches [101]. SVM requires careful parameter optimization, and its performance can degrade on very large datasets. Another common challenge is the reliance of these models on high-quality input data; inaccuracies in environmental or spatial datasets can propagate errors into the final predictions. Furthermore, the transferability of models across regions or under future climate scenarios requires validation to ensure their reliability [18].

By understanding these advantages and limitations, researchers can make informed decisions regarding appropriate modeling techniques for specific ecological and conservation objectives, ensuring robust and actionable outcomes for habitat management.

The geographic distribution of species within their habitats is significantly influenced by environmental conditions [102]. A substantial correlation between the size of the training dataset and eco-geographic variables (EGV) has been reported when predicting habitat suitability using RF, BRT, and SVM models [102–104]. Overall, topography, temperature, and precipitation have been found to have the greatest effects on species dispersion [105]. In its native environments, S. marianum varies according to ecological and climatic conditioning variables [106]. Previous studies have also reported this phenomenon in other plants. For example, according to Kunwar et al. [107], the primary determinants of Juniperus occidentalis abundance and distribution include long-term temperature variations, the quantity and distribution of rainfall, and the size and length of fire outbreaks. Temperature, precipitation, and altitude are the most important variables that influence J. drupacea [108]. Furthermore, elevation and precipitation have an impact on J. excelsa distribution patterns in Lebanon [109]. In the present study, the factor importance analysis showed that the three most crucial variables in S. marianum habitat suitability modeling were elevation, mean annual rainfall, and distance from roads. According to these findings, S. marianum occurred more frequently in regions close to roads. On the other hand, the variables that showed the greatest influence on the accumulation of silymarin in S. marianum were elevation, mean annual temperature, and mean annual rainfall.

The results showed that S. marianum populations were larger close to highways. Other studies have shown that roads can positively affect the distribution and survival of plant species [110]. Human disturbance is one of the primary factors affecting habitat alteration in plants [111]. Zhang and Ma [112] emphasized the impact of highways on the diversity of plant habitats. The presence of roads has also been identified as one of the key variables influencing species richness [113]. Roads can change species assemblages by affecting seed dispersal.

In the present study, the random forest algorithm’s categorization ranked elevation as the second most influential factor. These results indicated that low altitudes often exhibit the highest habitat suitability for S. marianum. Tiwari et al. [114] reported that elevational factors affect plant distribution. For example, plants usually grow better at lower elevations than at higher elevations. Vegetation is widely dispersed at altitudes above 1000 m [112]. Topography, particularly elevation, may affect plant distribution [115]. Elevation may also affect surface water flow, erosion, plant surface penetration, soil formation over time [116], climate [117], and seed dispersion [118].

These findings revealed that the density of S. marianum is directly influenced by decreasing rainfall. According to Naghipour Borj et al. [119], precipitation is one of the factors influencing the spread of medicinal plants throughout temperate and semi-arid regions. Current and future climate changes are expected to disrupt grasslands and agriculture. Temperature and precipitation play a crucial role in the distribution of medicinal plants. It has been demonstrated that ambient precipitation is a significant indicator of the distribution of medicinal plants within a given region [120]. Kefalew et al. [121] found that temperature and precipitation in semi-arid regions can affect the distribution of medicinal plants. Additionally, Dong et al. [122] reported that an increase in temperature caused by climate change leads to an increase in secondary metabolites. These responses protect against high temperatures through increased lignin, the promotion of sclerophyllous tissues, and the synthesis of secondary compounds such as phenolic compounds and sesquiterpenes. The present findings support this argument, as the concentration of silymarin increased notably in the southern regions, which are characterized by warmer temperatures.

Recent studies have demonstrated the importance of qualitative modeling and advocated its application [123,124]. Assessing the quality of S. marianum across different regions of Fars Province is crucial because of its silymarin content, a significant compound used in treating conditions such as fatty liver and chronic inflammatory liver diseases such as liver cirrhosis. Hence, this study aimed to delineate variations in silymarin content across Fars Province using a random forest algorithm. The findings suggest that the RF model achieved outstanding accuracy. Zhang and Wang [124] suggested that the random forest approach could be applied to qualitative modeling of medicinal plants.

Determining the habitat suitability and quality of S. marianum is crucial because it is generally recognized as an important medicinal plant. One noteworthy aspect of this study was the comparison between the maps of habitat suitability and changes in silymarin quality. This comparison demonstrates that while S. marianum habitats may be identified in various locations, the concentration of silymarin is not consistently high across these areas. Silymarin content tends to be higher in plants found in the southern part of Fars Province, which has a warmer climate. However, because S. marianum is a medicinal plant that grows at cooler temperatures, the western and northern locations are more desirable habitats. Nevertheless, it is important to note that elevated concentrations of silymarin were not expected in colder regions.
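The comparison between the habitat suitability map and the silymarin map described above can be sketched as a simple cross-tabulation of two classified grids. This is a minimal illustration under stated assumptions: the function names, the "high"/"low" class labels, and the 3 × 3 toy grids are hypothetical and do not represent the study's rasters or its GIS workflow.

```python
from collections import Counter


def cross_tabulate(suitability_grid, silymarin_grid):
    """Count grid cells for each (suitability class, silymarin class) pair."""
    if len(suitability_grid) != len(silymarin_grid):
        raise ValueError("grids must have the same number of rows")
    counts = Counter()
    for suit_row, sily_row in zip(suitability_grid, silymarin_grid):
        if len(suit_row) != len(sily_row):
            raise ValueError("grid rows must have the same length")
        for suit_class, sily_class in zip(suit_row, sily_row):
            counts[(suit_class, sily_class)] += 1
    return counts


def overlap_fraction(counts, suit_class, sily_class):
    """Share of cells in one suitability class that also fall in a silymarin class."""
    total = sum(n for (s, _), n in counts.items() if s == suit_class)
    return counts[(suit_class, sily_class)] / total if total else 0.0


# Hypothetical classified grids (illustrative only, not study data).
toy_suitability = [
    ["high", "high", "low"],
    ["high", "low", "low"],
    ["low", "low", "high"],
]
toy_silymarin = [
    ["low", "low", "low"],
    ["low", "high", "high"],
    ["high", "high", "high"],
]

toy_counts = cross_tabulate(toy_suitability, toy_silymarin)
print(overlap_fraction(toy_counts, "high", "high"))
```

A low overlap fraction in such a table would correspond to the pattern reported here, in which highly suitable habitat and high silymarin content do not consistently coincide.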

Distance from roads was the most important factor in the quantitative modeling. It appears that the proliferation of roads has led to the dispersal of S. marianum seeds to various locations. On the other hand, mean annual rainfall and mean annual temperature had the greatest effect on silymarin quality. This study showed that high temperatures and low rainfall increased silymarin levels. The results of this study may be used to expand the habitat of S. marianum to southern regions, thereby producing more silymarin.

5. Conclusions

This study highlights critical findings regarding habitat suitability and silymarin accumulation in S. marianum. Key environmental factors influencing habitat suitability include distance from roads, elevation, mean annual rainfall, and soil properties. Additionally, RF effectively identified these factors as the primary contributors to the spatial distribution of S. marianum and the accumulation of silymarin, a valuable medicinal compound. This study showed that high temperatures and low rainfall increased silymarin levels. The results of this study may be used to expand the habitat of S. marianum to southern regions, thereby producing more silymarin. These findings provide a robust framework for understanding the ecological preferences of S. marianum and optimizing its cultivation and management for pharmaceutical applications. By identifying the most influential environmental variables, this research advances the potential for the sustainable utilization of this species, enhancing both its conservation and use as a medicinal resource.

Supporting information

References

  1. 1. Marceddu R, Dinolfo L, Carrubba A, Sarno M, Di Miceli G. Milk thistle (Silybum marianum L.) as a novel multipurpose crop for agriculture in marginal environments: A review. Agronomy. 2022;12:729.
  2. 2. Karkanis A, Bilalis D, Efthimiadou A. Cultivation of milk thistle (Silybum marianum L. Gaertn.), a medicinal weed. Ind Crop Prod. 2011;34(1):825–30.
  3. 3. Martinelli T, Fulvio F, Pietrella M, Focacci M, Lauria M, Paris R. In Silybum marianum Italian wild populations the variability of silymarin profiles results from the combination of only two stable chemotypes. Fitoterapia. 2021;148:104797. pmid:33271258
  4. 4. Shokrpour M, Mohammadi SA, Moghaddam M, Ziai SA, Javanshir A. Variation in Flavonolignan Concentration of Milk Thistle (Silybum marianum) Fruits Grown in Iran. J Herbs Spices Med Plants. 2008;13(4):55–69.
  5. 5. Zhao Q, Bai J, Chen Y, Liu X, Zhao S, Ling G, et al. An optimized herbal combination for the treatment of liver fibrosis: Hub genes, bioactive ingredients, and molecular mechanisms. J Ethnopharmacol. 2022;297:115567. pmid:35870684
  6. 6. Bendowski W, Michalczuk M, Jóźwik A, Kareem KY, Łozicki A, Karwacki J, et al. Using Milk Thistle (Silybum marianum) Extract to Improve the Welfare, Growth Performance and Meat Quality of Broiler Chicken. Animals (Basel). 2022;12(9):1085. pmid:35565511
  7. 7. Chambers CS, Holečková V, Petrásková L, Biedermann D, Valentová K, Buchta M, et al. The silymarin composition… and why does it matter??? Food Res Int. 2017;100(Pt 3):339–53. pmid:28964357
  8. 8. Abenavoli L, Capasso R, Milic N, Capasso F. Milk thistle in liver diseases: past, present, future. Phytother Res. 2010;24(10):1423–32. pmid:20564545
  9. 9. Zhang Y, Raashid M, Shen X, Waqas Iqbal M, Ali I, Ahmad MS, et al. Investigation of the evolved pyrolytic products and energy potential of Bagasse: experimental, kinetic, thermodynamic and boosted regression trees analysis. Bioresour Technol. 2024;394:130295. pmid:38184085
  10. 10. Surai PF. Silymarin as a Natural Antioxidant: An Overview of the Current Evidence and Perspectives. Antioxidants (Basel). 2015;4(1):204–47. pmid:26785346
  11. 11. Wadhwa K, Pahwa R, Kumar M, Kumar S, Sharma PC, Singh G, et al. Mechanistic Insights into the Pharmacological Significance of Silymarin. Molecules. 2022;27(16):5327. pmid:36014565
  12. 12. Migahid MM, Elghobashy RM, Bidak LM, Amin AW. Priming of Silybum marianum (L.) Gaertn seeds with H2O2 and magnetic field ameliorates seawater stress. Heliyon. 2019;5(6):e01886. pmid:31304408
  13. 13. Jacobs BP, Dennehy C, Ramirez G, Sapp J, Lawrence VA. Milk thistle for the treatment of liver disease: a systematic review and meta-analysis. Am J Med. 2002;113(6):506–15. pmid:12427501
  14. 14. Liu C, Wang J, Ko Y-Z, Shiao M-S, Wang Y, Sun J, et al. Genetic diversities in wild and cultivated populations of the two closely-related medical plants species, Tripterygium wilfordii and T. hypoglaucum (Celastraceae). BMC Plant Biol. 2024;24(1):195. pmid:38493110
  15. 15. Schippmann U, Leaman D, Cunningham A. Impact of cultivation and gathering of medicinal plants on biodiversity: global trends and issues. In: Biodiversity and the ecosystem approach in agriculture, forestry and fisheries. 2002. p. 142–67.
  16. 16. Wang Y, Huang Q, Liu C, Ding Y, Liu L, Tian Y, et al. Mulching practices alter soil microbial functional diversity and benefit to soil quality in orchards on the Loess Plateau. J Environ Manage. 2020;271:110985. pmid:32579532
  17. 17. Das M, Jain V, Malhotra SK. Impact of climate change on medicinal and aromatic plants. Indian J Agricult Sci. 2016;86:1375–82.
  18. 18. Hama AA, Khwarahm NR. Predictive mapping of two endemic oak tree species under climate change scenarios in a semiarid region: Range overlap and implications for conservation. Ecol Inform. 2023;73:101930.
  19. 19. Ahmad Rather R, Bano H, Ahmad Padder S, Perveen K, Al Masoudi LM, Saud Alam S, et al. Anthropogenic impacts on phytosociological features and soil microbial health of Colchicum luteum L. an endangered medicinal plant of North Western Himalaya. Saudi J Biol Sci. 2022;29(4):2856–66. pmid:35531237
  20. 20. Zhan P, Wang F, Xia P, Zhao G, Wei M, Wei F, et al. Assessment of suitable cultivation region for Panax notoginseng under different climatic conditions using MaxEnt model and high-performance liquid chromatography in China. Ind Crop Prod. 2022;176:114416.
  21. 21. Bariotakis M, Georgescu L, Laina D, Oikonomou I, Ntagounakis G, Koufaki M-I, et al. From wild harvest towards precision agriculture: Use of Ecological Niche Modelling to direct potential cultivation of wild medicinal plants in Crete. Sci Total Environ. 2019;694:133681. pmid:31756796
  22. 22. Talukdar S, Naikoo MW, Mallick J, Praveen B, Sharma P, Islam ARMT, et al. Coupling geographic information system integrated fuzzy logic-analytical hierarchy process with global and machine learning based sensitivity analysis for agricultural suitability mapping. Agricultural Syst. 2022;196:103343.
  23. 23. Robinson NM, Nelson WA, Costello MJ, Sutherland JE, Lundquist CJ. A systematic review of marine-based species distribution models (SDMs) with recommendations for best practice. Front Marine Sci. 2017;4:421.
  24. 24. Petrosyan V, Dinets V, Osipov F, Dergunova N, Khlyap L. Range Dynamics of Striped Field Mouse (Apodemus agrarius) in Northern Eurasia under Global Climate Change Based on Ensemble Species Distribution Models. Biology (Basel). 2023;12(7):1034. pmid:37508463
  25. 25. Lee‐Yaw JA, McCune JL, Pironon S, Sheth SN. Species distribution models rarely predict the biology of real populations. Ecography. 2021;2022(6):e05877.
  26. 26. Frans VF, Augé AA, Fyfe J, Zhang Y, McNally N, Edelhoff H, et al. Integrated SDM database: Enhancing the relevance and utility of species distribution models in conservation management. Methods Ecol Evol. 2022;13(1):243–61.
  27. 27. Lovrenčić L, Temunović M, Gross R, Grgurev M, Maguire I. Integrating population genetics and species distribution modelling to guide conservation of the noble crayfish, Astacus astacus, in Croatia. Sci Rep. 2022;12(1):2040. pmid:35132091
  28. 28. Mollalo A, Sadeghian A, Israel GD, Rashidi P, Sofizadeh A, Glass GE. Machine learning approaches in GIS-based ecological modeling of the sand fly Phlebotomus papatasi, a vector of zoonotic cutaneous leishmaniasis in Golestan province, Iran. Acta Trop. 2018;188:187–94. pmid:30201488
  29. 29. Kohansarbaz A, Kohansarbaz A, Shabanlou S, Yosefvand F, Rajabi A. Modelling flood susceptibility in northern Iran: Application of five well‐known machine‐learning models. Irrigation Drainage. 2022;71(5):1332–50.
  30. 30. Marzban M, Haghdoost A-A, Dortaj E, Bahrampour A, Zendehdel K. Completeness and underestimation of cancer mortality rate in Iran: a report from Fars Province in southern Iran. Arch Iran Med. 2015;18(3):160–6. pmid:25773689
  31. 31. Salimi S, Balyani S, Hosseini SA, Momenpour SE. The prediction of spatial and temporal distribution of precipitation regime in Iran: the case of Fars province. Model Earth Syst Environ. 2018;4(2):565–77.
  32. 32. Masoudi M, Gore SD, Panah SA. A new methodology using geographical information system ‘GIS’ for assessing livestock pressure in the Qareh Aghaj sub-basin, southern Iran. Nature Environ Pollut Technol. 2005;4.
  33. 33. Fatemi M, Karami E, Moghaddam KR. Determinants of land use change in Fars province, Iran. Int J Agricult Res Govern Ecol. 2017;13(3):272.
  34. 34. Mohammadi F, Ahmadi A, Toranjzar H, Shams-Esfandabad B, Mokhtarpour M. The effects of environmental factors on plant diversity of Darab natural ecosystems in Fars province, Iran. Environ Monit Assess. 2023;195(12):1555. pmid:38036716
  35. 35. Kalusová V, Le Duc MG, Gilbert JC, Lawson CS, Gowing DJG, Marrs RH. Determining the important environmental variables controlling plant species community composition in mesotrophic grasslands in Great Britain. Appl Vegetat Sci. 2009;12(4):459–71.
  36. 36. Rao DVK, Eappen T, Ulaganathan A, Satisha GC. Influence of landscape attributes on soil-plant inter-relationships. Curr Adv Agricul Sci. 2014;6:142–7.
  37. 37. Gholamy A, Kreinovich V, Kosheleva O. Why 70/30 or 80/20 Relation Between Training and Testing Sets: A Pedagogical Explanation. Int J Intell Technol Appl. 2018;11:105–11.
  38. 38. Bouyoucos GJ. Hydrometer Method Improved for Making Particle Size Analyses of Soils. Agronomy J. 1962;54(5):464–5.
  39. 39. Richards LA. Diagnosis and improvement of saline and alkali soils. US Government Printing Office; 1954.
  40. 40. Walkley A, Black IA. An examination of the Degtjareff method for determining soil organic matter, and a proposed modification of the chromic acid titration method. Soil Science. 1934;37:29–38.
  41. 41. Bremner JM, Mulvaney CS. Nitrogen—Total. In: Page AL, editor. Agronomy Monographs. 1st ed. Wiley; 1982. p. 595–624. https://doi.org/10.2134/agronmonogr9.2.2ed.c31
  42. 42. Emmendorfer LR, Dimuro GP. A novel formulation for inverse distance weighting from weighted linear regression. Computational Science–ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part II 20. Springer International Publishing; 2020. p. 576–89. Available from: https://link.springer.com/chapter/10.1007/978-3-030-50417-5_43
  43. 43. Chen F-W, Liu C-W. Estimation of the spatial rainfall distribution using inverse distance weighting (IDW) in the middle of Taiwan. Paddy Water Environ. 2012;10(3):209–22.
  44. 44. Bundak CEA, Abd Rahman MA, Karim MKA, Osman NH. Fuzzy rank cluster top k Euclidean distance and triangle based algorithm for magnetic field indoor positioning system. Alexandria Eng J. 2022;61:3645–55.
  45. 45. Benthin B, Danz H, Hamburger M. Pressurized liquid extraction of medicinal plants. J Chromatogr A. 1999;837(1–2):211–9. pmid:10227181
  46. 46. Breiman L. Random Forests. Machine Learn. 2001;45:5–32.
  47. 47. Sun Z, Wang G, Li P, Wang H, Zhang M, Liang X. An improved random forest based on the classification accuracy and correlation measurement of decision trees. Expert Syst Appl. 2024;237:121549.
  48. 48. Arshad A, Mirchi A, Vilcaez J, Umar Akbar M, Madani K. Reconstructing high-resolution groundwater level data using a hybrid random forest model to quantify distributed groundwater changes in the Indus Basin. J Hydrol. 2024;628:130535.
  49. 49. Elshewey AM, Osman AM. Orthopedic disease classification based on breadth-first search algorithm. Sci Rep. 2024;14(1):23368. pmid:39375370
  50. 50. Leo GL, Jayabal R, Srinivasan D, Das MC, Ganesh M, Gavaskar T. Predicting the performance and emissions of an HCCI-DI engine powered by waste cooking oil biodiesel with Al2O3 and FeCl3 nano additives and gasoline injection–A random forest machine learning approach. Fuel. 2024;357:129914.
  51. 51. Becker T, Rousseau A-J, Geubbelmans M, Burzykowski T, Valkenborg D. Decision trees and random forests. Am J Orthod Dentofacial Orthop. 2023;164(6):894–7. pmid:38008491
  52. 52. Naimi B, Araújo MB. sdm: a reproducible and extensible R platform for species distribution modelling. Ecography. 2016;39(4):368–75.
  53. 53. Pal M. Random forest classifier for remote sensing classification. Int J Remote Sens. 2005;26(1):217–22.
  54. 54. Elith J, Leathwick JR, Hastie T. A working guide to boosted regression trees. J Anim Ecol. 2008;77(4):802–13. pmid:18397250
  55. 55. De’ath G. Boosted trees for ecological modeling and prediction. Ecology. 2007;88(1):243–51.
  56. 56. Manley W, Tran T, Prusinski M, Brisson D. Modeling Tick Populations: An Ecological Test Case for Gradient Boosted Trees. Peer Community J. 2023;3. Available from: https://peercommunityjournal.org/item/10_24072_pcjournal_353/
  57. 57. Elith J, Graham CH, Anderson RP, Dudík M, Ferrier S, Guisan A, et al. Novel methods improve prediction of species’ distributions from occurrence data. Ecography. 2006;29(2):129–51.
  58. 58. Hu X, Fu Z, Sun G, Wang B, Liu K, Zhang C, et al. Importance of forest stand structures for gross rainfall partitioning on China’s Loess Plateau. J Hydrol. 2024;631:130671.
  59. 59. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–97.
  60. 60. Guo J, Wu H, Chen X, Lin W. Adaptive SV-Borderline SMOTE-SVM algorithm for imbalanced data classification. Appl Soft Comput. 2024;150:110986.
  61. 61. Singh S, Bansal P, Hosen M, Bansal SK. Forecasting annual natural gas consumption in USA: Application of machine learning techniques-ANN and SVM. Resour Policy. 2023;80:103159.
  62. 62. Ayeleru OO, Fajimi LI, Oboirien BO, Olubambi PA. Forecasting municipal solid waste quantity using artificial neural network and supported vector machine techniques: A case study of Johannesburg, South Africa. J Clean Prod. 2021;289:125671.
  63. 63. Sánchez A VD. Advanced support vector machines and kernel methods. Neurocomputing. 2003;55(1–2):5–20.
  64. 64. Chowdhury MS. Comparison of accuracy and reliability of random forest, support vector machine, artificial neural network and maximum likelihood method in land use/cover classification of urban setting. Environ Challenges. 2024;14:100800.
  65. 65. Borges TC, Gomes TL, Pichard C, Laviano A, Pimentel GD. High neutrophil to lymphocytes ratio is associated with sarcopenia risk in hospitalized cancer patients. Clin Nutr. 2021;40(1):202–6. pmid:32446788
  66. 66. Ralbovsky NM, Lednev IK. Analysis of individual red blood cells for celiac disease diagnosis. Talanta. 2021;221:121642.
  67. 67. Ghazavi A, Ganji A, Keshavarzian N, Rabiemajd S, Mosayebi G. Cytokine profile and disease severity in patients with COVID-19. Cytokine. 2021;137:155323. pmid:33045526
  68. 68. Grifoni E, Valoriani A, Cei F, Vannucchi V, Moroni F, Pelagatti L, et al. The CALL Score for Predicting Outcomes in Patients With COVID-19. Clin Infect Dis. 2021;72(1):182–3. pmid:32474605
  69. 69. Ogura T, Ackermann J, Mestriner AB, Merkely G, Gomoll AH. The Minimal Clinically Important Difference and Substantial Clinical Benefit in the Patient-Reported Outcome Measures of Patients Undergoing Osteochondral Allograft Transplantation in the Knee. Cartilage. 2021;12(1):42–50. pmid:30463426
  70. 70. Verbakel JY, Steyerberg EW, Uno H, De Cock B, Wynants L, Collins GS, et al. ROC curves for clinical prediction models part 1. ROC plots showed no added value above the AUC when evaluating the performance of clinical prediction models. J Clin Epidemiol. 2020;126:207–16. pmid:32712176
  71. 71. Van der Schouw YT, Verbeek AL, Ruijs JH. ROC curves for the initial assessment of new diagnostic tests. Fam Pract. 1992;9(4):506–11. pmid:1490547
  72. 72. O’brien RM. A Caution Regarding Rules of Thumb for Variance Inflation Factors. Qual Quant. 2007;41(5):673–90.
  73. 73. Jenks GF. The data model concept in statistical mapping. Int Yearbook Cartograph. 1967;7:186–90.
  74. 74. Mathur M, Mathur P. Habitat suitability of Opuntia ficus-indica (L.) MILL. (CACTACEAE): a comparative temporal evaluation using diverse bio-climatic earth system models and ensemble machine learning approach. Environ Monit Assess. 2024;196(3):232. pmid:38308673
  75. 75. Georgiades P, Proestos Y, Lelieveld J, Erguler K. Machine Learning Modeling of Aedes albopictus Habitat Suitability in the 21st Century. Insects. 2023;14(5):447. pmid:37233075
  76. 76. Pasha SV, Reddy CS. Global spatial distribution of Prosopis juliflora - one of the world’s worst 100 invasive alien species under changing climate using multiple machine learning models. Environ Monitor Assess. 2024;196:196.
  77. 77. Theis S, Poesch M. Current capacity, bottlenecks, and future projections for offsetting habitat loss using mitigation and conservation banking in the United States. J Nature Conserv. 2022;67:126159.
  78. 78. Mancino C, Hochscheid S, Maiorano L. Increase of nesting habitat suitability for green turtles in a warming Mediterranean Sea. Sci Rep. 2023;13(1):19906. pmid:38062052
  79. 79. Vasquez CR, Gupta S, Miano TA, Roche M, Hsu J, Yang W, et al. Identification of Distinct Clinical Subphenotypes in Critically Ill Patients With COVID-19. Chest. 2021;160(3):929–43. pmid:33964301
  80. 80. Medinas D, Marques JT, Costa P, Santos S, Rebelo H, Barbosa AM, et al. Spatiotemporal persistence of bat roadkill hotspots in response to dynamics of habitat suitability and activity patterns. J Environ Manage. 2021;277:111412. pmid:33038670
  81. 81. Shadloo S, Mahmoodi S, Hosseinzadeh MS, Kazemi SM. Prediction of habitat suitability for the desert monitor (Varanus griseus caspius) under the influence of future climate change. J Arid Environ. 2021;186:104416.
  82. 82. Ash E, Macdonald DW, Cushman SA, Noochdumrong A, Redford T, Kaszta Ż. Optimization of spatial scale, but not functional shape, affects the performance of habitat suitability models: a case study of tigers (Panthera tigris) in Thailand. Landscape Ecol. 2021;36(2):455–74.
  83. 83. Sharma M, Thakur R, Sharma M, Sharma AK, Sharma AK. Changing scenario of medicinal plants diversity in relation to climate change: a review. Plantarchives. 2020;4389–400. Available from: http://www.plantarchives.org/20-2/4389--44%2000%20(6582).pdf
  84. 84. Gan L, Chen Y, Hu P, Wu D, Zhu Y, Tan J, et al. Willingness to Receive SARS-CoV-2 Vaccination and Associated Factors among Chinese Adults: A Cross Sectional Survey. Int J Environ Res Public Health. 2021;18(4):1993. pmid:33670821
  85. 85. Jafarov EI, Van der Jeugt J. Exact solution of the semiconfined harmonic oscillator model with a position-dependent effective mass. Eur Phys J Plus. 2021;136(7):758.
  86. 86. Laroutis D, Taibi S. Discriminant analysis versus random forests on qualitative data: Contingent valuation method applied to the Seine estuary wetlands. Int J Ecol Econ Statist. 2011;20:1–19.
  87. 87. Anand A, Srivastava PK, Pandey PC, Khan ML, Behera MD. Assessing the niche of Rhododendron arboreum using entropy and machine learning algorithms: role of atmospheric, ecological, and hydrological variables. J Appl Rem Sens. 2022;16:042402.
  88. 88. Feng L, Tian X, El-Kassaby YA, Qiu J, Feng Z, Sun J, et al. Predicting suitable habitats of Melia azedarach L. in China using data mining. Sci Rep. 2022;12(1):12617. pmid:35871227
  89. 89. Fourcade Y, Engler JO, Rödder D, Secondi J. Mapping species distributions with MAXENT using a geographically biased sample of presence data: a performance assessment of methods for correcting sampling bias. PLoS One. 2014;9(5):e97122. pmid:24818607
  90. 90. Li M, Zhang C, Xu B, Xue Y, Ren Y. Evaluating the approaches of habitat suitability modelling for whitespotted conger (Conger myriaster). Fisheries Res. 2017;195:230–7.
  91. 91. Poursanidis D, Traganos D, Reinartz P, Chrysoulakis N. On the use of Sentinel-2 for coastal habitat mapping and satellite-derived bathymetry estimation using downscaled coastal aerosol band. Int J Appl Earth Observ Geoinform. 2019;80:58–70.
  92. 92. Chen W, Lei X, Chakrabortty R, Chandra Pal S, Sahana M, Janizadeh S. Evaluation of different boosting ensemble machine learning models and novel deep learning and boosting framework for head-cut gully erosion susceptibility. J Environ Manage. 2021;284:112015. pmid:33515838
  93. 93. Khan Z, Mohsin M, Ali SA, Vashishtha D, Husain M, Parveen A, et al. Comparing the Performance of Machine Learning Algorithms for Groundwater Mapping in Delhi. J Indian Soc Remote Sens. 2023;52(1):17–39.
  94. 94. Massada AB, Syphard AD, Stewart SI, Radeloff VC. Wildfire ignition-distribution modelling: a comparative study in the Huron–Manistee National Forest, Michigan, USA. Int J Wildland Fire. 2012;22:174–83.
  95. 95. Zimmer SN, Holsinger KW, Dawson CA. A field-validated ensemble species distribution model of Eriogonum pelinophilum, an endangered subshrub in Colorado, USA. Ecol Evol. 2023;13(12):e10816. pmid:38107426
  96. 96. Muñoz-Mas R, Fukuda S, Pórtoles J, Martínez-Capel F. Revisiting probabilistic neural networks: a comparative study with support vector machines and the microhabitat suitability for the Eastern Iberian chub (Squalius valentinus). Ecol Inform. 2018;43:24–37.
  97. 97. Lobo JM, Jiménez‐Valverde A, Real R. AUC: a misleading measure of the performance of predictive distribution models. Global Ecol Biogeograph. 2007;17(2):145–51.
  98. 98. Millar CS, Blouin-Demers G. Habitat suitability modelling for species at risk is sensitive to algorithm and scale: A case study of Blanding’s turtle, Emydoidea blandingii, in Ontario, Canada. J Nature Conserv. 2012;20(1):18–29.
  99. 99. Radha KO, Khwarahm NR. An Integrated Approach to Map the Impact of Climate Change on the Distributions of Crataegus azarolus and Crataegus monogyna in Kurdistan Region, Iraq. Sustainability. 2022;14(21):14621.
  100. 100. Mirhashemi H, Ahmadi K, Heydari M, Karami O, Valkó O, Khwarahm NR. Climatic variables are more effective on the spatial distribution of oak forests than land use change across their historical range. Environ Monit Assess. 2024;196(3):289. pmid:38381166
  101. 101. Majeed KA, Khwarahm NR, Ahmed SH. Predicting the geographical distribution of the Persian leopard, Panthera pardus tulliana, a rare and endangered species. J Nature Conserv. 2023;76:126505.
  102. 102. Shang J, Zhao Q, Yan P, Sun M, Sun H, Liang H, et al. Environmental factors influencing potential distribution of Schisandra sphenanthera and its accumulation of medicinal components. Front Plant Sci. 2023;14:1302417. pmid:38162305
  103. 103. An Q, Zheng J, Guan J, Wu J, Lin J, Ju X, et al. Predicting the Effects of Future Climate Change on the Potential Distribution of Eolagurus luteus in Xinjiang. Sustainability. 2023;15(10):7916.
  104. 104. Oca G, Reyes T. Predicting the habitat suitability of the Philippine Cockatoo (Cacatua haematuropygia S. Muller) using ecological niche factor analysis. Ecosyst Develop J. 2023;13:58–67.
  105. 105. Bradie J, Leung B. A quantitative synthesis of the importance of variables used in MaxEnt species distribution models. J Biogeograph. 2016;44(6):1344–61.
  106. 106. Ahad B, Shahri W, Rasool H, Reshi ZA, Rasool S, Hussain T. Medicinal Plants and Herbal Drugs: An Overview. In: Aftab T, Hakeem KR, editors. Medicinal and Aromatic Plants. Healthcare and Industrial Applications. 2021. p. 1–40. https://doi.org/10.1007/978-3-030-58975-2_1
  107. 107. Kunwar RM, Thapa-Magar KB, Subedi SC, Kutal DH, Baral B, Joshi NR, et al. Distribution of important medicinal plant species in Nepal under past, present, and future climatic conditions. Ecol Indicators. 2023;146:109879.
  108. 108. Walas Ł, Sobierajska K, Ok T, Dönmez AA, Kanoğlu SS, Dagher-Kharrat MB, et al. Past, present, and future geographic range of an oro-Mediterranean Tertiary relict: The Juniperus drupacea case study. Reg Environ Change. 2019;19(5):1507–20.
  109. Cheikha Douaihy B, Restoux G, Machon N, Bou Dagher-Kharrat M. Ecological characterization of the Juniperus excelsa stands in Lebanon. Ecol Mediterranea. 2013;39(1):169–80.
  110. Halder M, Jha S. The Current Status of Population Extinction and Biodiversity Crisis of Medicinal Plants. In: Jha S, Halder M, editors. Medicinal Plants: Biodiversity, Biotechnology and Conservation. Singapore: Springer Nature Singapore; 2023. p. 3–38. https://doi.org/10.1007/978-981-19-9936-9_1
  111. Ribeiro EMS, Arroyo‐Rodríguez V, Santos BA, Tabarelli M, Leal IR. Chronic anthropogenic disturbance drives the biological impoverishment of the Brazilian Caatinga vegetation. J Appl Ecol. 2015;52(3):611–20.
  112. Zhang Y-B, Ma K-P. Geographic distribution patterns and status assessment of threatened plants in China. Biodivers Conserv. 2008;17(7):1783–98.
  113. Li Y, Bearup D, Liao J. Habitat loss alters effects of intransitive higher-order competition on biodiversity: a new metapopulation framework. Proc Biol Sci. 2020;287(1940):20201571. pmid:33259756
  114. Tiwari D, Kewlani P, Gaira KS, Bhatt ID, Sundriyal RC, Pande V. Predicting phytochemical diversity of medicinal and aromatic plants (MAPs) across eco-climatic zones and elevation in Uttarakhand using Generalized Additive Model. Sci Rep. 2023;13(1):10888. pmid:37407604
  115. Pan L, Yang N, Sui Y, Li Y, Zhao W, Zhang L, et al. Altitudinal Variation on Metabolites, Elements, and Antioxidant Activities of Medicinal Plant Asarum. Metabolites. 2023;13(12):1193. pmid:38132875
  116. Arai M, Minamiya Y, Tsuzura H, Watanabe Y, Yagioka A, Kaneko N. Changes in water stable aggregate and soil carbon accumulation in a no-tillage with weed mulch management site after conversion from conventional management practices. Geoderma. 2014;221–222:50–60.
  117. Chandora R, Paul S, Kanishka RC, Kumar P, Singh B, Kumar P, et al. Ecological survey, population assessment and habitat distribution modelling for conserving Fritillaria roylei—A critically endangered Himalayan medicinal herb. South African J Botany. 2023;160:75–87.
  118. Sekar KC, Thapliyal N, Pandey A, Joshi B, Mukherjee S, Bhojak P, et al. Plant species diversity and density patterns along altitude gradient covering high-altitude alpine regions of west Himalaya, India. Geol Ecol Landscapes. 2023;8(4):559–73.
  119. Naghipour Borj AA, Ostovar Z, Asadi E. The influence of climate change on distribution of an endangered medicinal plant (Fritillaria imperialis L.) in central Zagros. J Rangeland Sci. 2019;9:159–71.
  120. Xia Y, Li T, Liu X, Xu S, Wang Y, Fan X, et al. How do Environmental Variables Affect the Suitable Habitat of Medicinal Plants? A Case Study of Citrus medica L. var. sarcodactylis Swingle in China. Pol J Environ Stud. 2023;32(3):2383–91.
  121. Kefalew A, Sintayehu S, Geremew AY. Distribution analysis of wild medicinal plants in Ada’a District, Ethiopia: A means to identify most prior species for conservation. Acta Ecologica Sinica. 2023;43(2):352–62.
  122. Dong J, Ma X, Wei Q, Peng S, Zhang S. Effects of growing location on the contents of secondary metabolites in the leaves of four selected superior clones of Eucommia ulmoides. Ind Crops Prod. 2011;34(3):1607–14.
  123. Ikraoun H, Najem M, el Mderssa M, Nassiri L, Ibijbijen J. Quantitative and Qualitative Ethnobotanical Study of Medicinal Plants used in Oulmes Region, Morocco for the Treatment of Diseases and Infections. Trop J Natural Prod Res. 2023;7:3325–41.
  124. Zhang Y, Wang Y. Recent trends of machine learning applied to multi-source data of medicinal plants. J Pharm Anal. 2023;13(12):1388–407. pmid:38223450