Combining ensemble models and connectivity analyses to predict wolf expected dispersal routes through a lowland corridor

The Italian wolf (Canis lupus italicus) population has remained isolated South of the Alps for the last few thousand years. After a strong decline, the species has recolonized the Apennines and the Western Alps, while it is currently struggling to colonize the Eastern Alps. Recently, the species was detected in a lowland park connecting the Northern Apennines to the Central Alps. If the park was able to sustain a net wolf dispersal flow, this could significantly boost the connection with the Eastern Alps and the Dinaric-Balkan population. We investigated the suitability of the park as a functional ecological corridor for the wolf through the unhospitable lowland of Northern Italy. We collected wolf occurrence data in two study areas. We modeled species distribution running a separate ensemble model for each study area and then merging the output of the models to obtain an integrated suitability map. We used this map to identify corridors for the wolf adopting a factorial least-cost path and a cumulative resistant kernel approach. The connectivity models showed that only two corridors exist in the lowland areas between the Northern Apennines and the Central Alps. The Western corridor is a blind route, while the eastern corridor passes through the park and has a continuous course. However, the models also revealed a scarce resilience of corridor connectivity in the passageways between the park and the Apennines and the Prealps, which suggests that urgent management actions are necessary to ensure the future functionality of this important corridor.


Introduction
Large carnivores are top predators playing a key role in the maintenance of healthy and functional ecosystems [1]. Despite their ecological importance, most large carnivores suffered major declines in both population size and geographic range over the twentieth century [2]. The major causes of these declines were the loss and fragmentation of habitats, the decline of prey populations and the direct persecution by humans [3,4,5]. Nevertheless, in the last decades, a large-scale recovery of carnivore populations has occurred even within some of the most anthropized areas of the world, such as in Europe [6]. The recovery of large carnivores in a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 would have a high probability of occurrence in a wide area of Northern Italy based on an ensemble modeling approach. Then, we applied a synoptic connectivity approach [25,26], combining distribution models with a factorial least-cost path analysis.

Study area
The study was carried out in an area of about 11,000 km 2 in Northern Italy (centered at 45˚28' N, 8˚92' E). This wide area comprises three subregions: the Prealps in the North (about 300-2000 m a.s.l.), the lowland in the central part (below 300 m a.s.l.), and the Apennines in the South (about 300-1700 m a.s.l.) (Fig 1). The Prealps and the Apennines are characterized by a mainly continuous forest cover, while the lowland area is characterized by small and isolated residual forest fragments scattered in the intensively cultivated areas [27,28]. The lowland is crossed from North to South by the Sesia River, by the Ticino River, and by other minor rivers, and from East to West by the Po River (Fig 1). For this study, data were collected in two study areas: the Ticino study area within the Ticino Valley and the Apennine study area, located in the Northern portion of the Apennines (Fig 1). The Prealps were not investigated because this region has not yet been re-colonized by wolves [20].
The Ticino study area is an area of about 270 km 2 and corresponds to the Ticino Natural Park (Lombardy and Piedmont administrative Region). It is covered for almost half of its extension by intensively cultivated crops mainly composed of rice paddies, partially flooded from April to August every year [28]. The remaining surface is largely covered by woodlands, and to a lesser extent by water bodies and built up areas. The continuous forests of the park constitute an important refuge for several mammals (57 described species), including three wild ungulate species (the wild boar Sus scrofa, roe deer Capreolus capreolus and fallow deer Dama dama, in a decreasing abundance order), while the wolf is the only large carnivore detected in the park.
The Northern Apennines can be divided into four different areas: the foothill area (below 300 m a.s.l.), mainly composed of intensively cultivated crops and built-up areas; the low hills (300-500 m a.s.l.), mainly composed of vineyards; the high hills (500-800 m a.s.l.), mainly covered by oak and chestnut woodlands; and the mountain area (above 800 m a.s.l.) covered by continuous beech (Fagus sylvatica) forests and coniferous reforestations. The Apennine study area, designed as the minimum convex polygon containing all wolf presence signs detected in the foothill, low and high hills areas, is an area of about 610 km 2 . Mountain areas were not considered because they have been occupied by stable packs since the end of the 80's [29], while in this study we were mainly interested in gathering data from wolves colonizing new areas [30,31]. The hilly portion of the study area hosts several mammal species, including four wild ungulate species (roe deer, wild boar, fallow deer and red deer Cervus elaphus, in a decreasing abundance order).

Wolf occurrence data and land cover variables
In order to collect wolf occurrence data within the Ticino study area, 35 linear transects (mean length 8 km) covering the entire park surface were walked once per season (winter: December-February, spring: March-May, summer: June-August, autumn: September-November) in the 2017-2018 period. The transects were walked by researchers and students of the University of Milano-Bicocca and of the University of Pavia, assisted by park guards. Species presence signs (scats, tracks, predation, direct observations) were detected along the transects.
In order to collect wolf occurrence data within the Apennine study area, 18 linear transects (mean length 4.7 km) were walked once per season in the 2017-2018 period. Transect locations were defined adopting a tessellation stratified sampling design [32]. Specifically, a grid of 5 km x 5 km cells was superimposed on the study area and at least one transect was randomly identified within each cell. The transects were walked by researchers and students of the University of Pavia to detect species presence signs.
In both study areas, direct, opportunistic observations of the species during the study period were added to the presence signs systematically collected along transects.
The field work was carried out under the Law of the Republic of Italy on the Protection of Wildlife (L. 157/92). In both study areas, no specific permissions were required for walking along transects. No approval by any animal ethics committee was required because data detection did not involve sampling procedure and experimental manipulation of animals.
Nine land cover variables were selected to model wolf habitat preferences based on their relevance to the species' ecology. The land cover variables were calculated as the fractional cover of multiple land-cover types (urban areas, arable lands, rice paddies [in the Ticino study area only], vineyards [in the Apennine study area only], complex cultivations with natural elements, meadows, transitional woodlands/shrublands, woodlands and water bodies) within a 100-m buffer around each occurrence point. The reference land use cartography was the Corine Land Cover 2018 (available at https://land.copernicus.eu/pan-european/corine-landcover). The land cover variables were calculated on a fine spatial scale (100 m buffer) according to the results found by Ziółkowska et al. [33]. The authors demonstrated that habitat suitability models can be a useful alternative to movement data to parameterize resistance maps (which are required to perform connectivity models) provided that variables are calculated at a fine scale. Interestingly, the greater ability of single-scale fine-scale models to estimate resistance does not correspond to a best modeling performance, which is better in the case of multi-scale models. The reason for this is likely that dispersers and stable individuals behave differently with respect to habitat selection [34,35]. When dispersers move through the landscape they largely depend on the availability and use of local resources [30], while stable individuals, which typically carry out an active habitat selection, may be also affected by broadscale patterns [36]. For the ultimate purpose of this study, the most suitable approach was therefore to consider variables calculated on a fine scale. To avoid multicollinearity among variables, we checked pairwise Pearson's correlation coefficient between covariates and verified that no variable pair had a coefficient higher than 0.60 [1].

Building the predictive ensemble models
We used the BIOMOD2 package [37] in R [38] to develop ensemble modeling of wolf distribution. By fitting several species distribution models, ensemble modeling decreases the uncertainty associated with using a single modeling approach and, therefore, increases the accuracy of model predictions [39]. We developed a separate ensemble model for each study area. Before running the ensemble models, a set of 300 and 500 pseudo-absence points (proportional to the study areas' surface) were randomly generated across the Ticino and Apennine study area, respectively. To avoid the effect of sampling bias we generated the pseudo-absence points within the boundaries of the two study areas, which were exhaustively sampled. For this study, the ensemble models were implemented using three species distribution models, including generalized linear models (GLM), generalized boosted models (GBM), and a maximum entropy algorithm (MaxEnt). These species distribution models were selected based on their high predictive power [1,40,41,42]. For each model, we ran three replications where 75% of the occurrence points was used as training set, while the remaining 25% was used for model evaluation. The three distribution models were evaluated for their accuracy using the area under the curve (AUC) of the receiver operating characteristic (ROC) and the true skill statistic (TSS). To integrate the output of the performed distribution models and develop the ensemble predictions, we used a weighted-averaging approach through which each single distribution model was weighted according to its predictive accuracy as independently assessed [43]. The performance of the ensemble models was evaluated using the AUC of the ROC [44,45,46,47,48]. Based on the results obtained from the models run for the two study areas, we developed two ensemble predictive maps for the whole landscape. The relative contribution of each variable in the final ensemble predictive maps for each study area was estimated by variable randomizations [43]. The predictive maps depict a gradient of suitability varying from 0 to 1000. To obtain an integrated predictive map for the whole area, the two predictive maps obtained from the ensemble models separately developed in the two study areas were superimposed and the maximum suitability value was kept for each pixel of the map. We kept the maximum suitability value for each pixel for two reasons. First, to properly appreciate the selection of sub-optimal land-cover types which can potentially be used by the wolves in areas where they are forced to use them. For instance, in the Apennine study area, woodlands (known to be the optimal habitat for the wolf, e.g. 49,50) are concentrated in the southern part of the area, while they disappear in the North where wolves are forced to use sub-optimal land-covers, which would thus result as positively selected in this area. Conversely, in the Ticino study area woodlands have a significant cover value in the whole area not forcing wolves to use alternative land-cover types, which would thus result as not selected or avoided. This is also the reason that guided the choice of performing two separate ensemble models for the two study areas instead of a unique ensemble model, which would not have allowed us to control this effect. Second, we kept the maximum suitability value for each pixel because a part of the wolf occurrence data used as input data both for the model developed for the Ticino and Apennine study area could derive from stable and/or temporary stable individuals. In order to model species dispersal starting from these data, it would be more realistic to keep suitability values not excessively conservative [30,33,49]. Before integrating the two predictive maps, we set to 0 the suitability values of the pixels pertaining to land-cover types not included in the model of each study area (the vineyards in the Ticino study area and the rice paddies in the Apennine study area). Thus, the suitability values for land-cover types composing only one study area were only predicted by the model implemented in that area. Finally, the suitability value of each pixel of the final suitability map was normalized to obtain a suitability range varying from 0 to 1 [34].

Modeling connectivity
We used the universal corridor network simulator software UNICOR [50] to model the connectivity corridors for the wolf within the landscape analyzed. The software requires a resistance map layer and a point file containing the geographic coordinates of each source location. To obtain the resistance map, we converted the integrated suitability map into a resistance map representing the permeability of the landscape to species movement using an exponential decay function [34,51]. Because the wolf is a highly mobile species, an exponential decay function is appropriate for transforming habitat suitability into resistance values. In fact, the use of an exponential conversion means that large portions of the landscape are associated to low resistance while only highly unsuitable areas have high resistance values, leading to more flexibility in corridor location than it would be expected if the resistance map was obtained through a linear conversion of habitat suitability [1,31]. The most suitable locations of the study area were used as source locations. Specifically, we superimposed a grid of 1 km x 1 km cells to the study area and selected only the cells composed of pixels with suitability values higher than 0.5. The centroids of these cells corresponded to the source locations for connectivity modeling. In particular, while in the Apennines and lowland area we retained all source locations, they were fictitiously rarefied in the Prealps (one centroid every 5 km) to simulate the sporadic presence of the species in this area [20]. Connectivity modeling was developed adopting a factorial leastcost path [25] and a cumulative resistant kernel approach. Factorial least-cost path analysis was used to overcome the limitation of the least-cost path approach associated with the number of sources and target points and to produce a synoptic measure of landscape connectivity [5,26]. The resistant kernel approach shows the suitability of the whole landscape in supporting the movement of individuals [52]. Specifically, this approach calculates the least-cost dispersal Gaussian kernel around each source location up to a defined distance threshold (corresponding to the maximum dispersal distance of the species). The kernels were then combined through summation to produce a path density map [53], where the value of each pixel of the landscape corresponded to the density of the least cost paths passing through that pixel.
Because of the uncertainty about the maximum dispersal distance of the wolf, we tested two Euclidean distance thresholds: 100 km (low dispersal ability) and 850 km (high dispersal ability) [31]. We chose these two threshold distances because the mean wolf dispersal distance has been reported to range from about 13 to about 80 km [54,55], but occasionally individuals may travel much longer distances reaching 840-866 km [55,56]. In addition, genetic analyses performed on European wolf populations have revealed that the range of spatial influence is 650-850 km, meaning that the genetic diversity of a wolf population is influenced by populations up to 850 km apart [12].

Predicted habitat suitability
During the study period, 96 and 71 wolf presence signs were collected within the Ticino and Apennine study area, respectively.
Among the single distribution models, GLM had the lowest performance in predicting habitat suitability in the Ticino study area, while GBM had the highest. Conversely, MaxEnt and GBM had the lowest and the highest performance, respectively, in the Apennine study area ( Table 1). The output of the single distribution models for the two study areas are reported in Tables 2 and 3 (for further details see S1-S6 Figs in Supplementary material).
As regards the ensemble model performed in the Ticino study area, the fractional cover values of rice paddies, woodlands and water bodies were the most important variables in predicting the occurrence probability of the wolf. Conversely, the most important variables in predicting the occurrence probability of the species in the Apennine area were the fractional cover values of arable lands, urban areas, woodlands and vineyards (Table 4). Both the ensemble models performed in the Ticino (AUC = 0.767) and the Apennine (AUC = 0.825) study area showed a quite good performance in predicting species occurrence.
Predictions of the ensemble models implemented using the Ticino and Apennine data revealed that 22.8% and 32.07%, respectively, of the investigated landscape had the potential to support the presence of the wolf (considering a species occurrence probability higher than 0.5) (Fig 2A and 2B). In the integrated map, 32.3% of the landscape was suitable to support the presence of the species (Fig 2C). Specifically, the most suitable areas were located in the wooded areas of the Northern Apennines and the Prealps and along the main rivers, with a greater portion of suitable areas located along the Ticino River compared with the other rivers. Conversely, the most unsuitable areas were widely distributed in the central part of the study area and corresponded to rice paddies and urban areas (Fig 2).

Ecological corridors
The connectivity analyses were developed on 537 source locations. The predicted connectivity maps produced by the factorial least cost path models show a gradient of path density across the whole landscape (Fig 3). About 6.7% and 9.4% of the study area support wolf movements, respectively, in the low and high dispersal ability scenario, with varying degrees of probability. The major differences between the connectivity maps representing the two dispersal scenarios concern the degree of permeability of the corridors (generally higher in the high dispersal ability scenario) rather than the number and pattern of corridors, which remains virtually unchanged except in the Northern part of the study area. However, the number and pattern of corridors in the northernmost part of the landscape, i.e. the area North to the passage between the Northern border of the Ticino Natural Park and the Prealps, are not realistic in any of the two scenarios. In fact, the Prealps are a permeable land for the wolf, but the model predicts relatively few corridors because the nodes were fictitiously rarefied in this area.
In both dispersal scenarios, the Apennine area is well connected by a very large number of corridors, characterized by a not too high path density. Leaving the hilly areas of the Apennines and moving northwards towards the plain, the models identified two corridors only. The Western corridor that heads towards the Sesia River is a blind route characterized by a low path density. Conversely, the eastern corridor that heads towards the Ticino River is characterized by a high path density and has a continuous course up to the Prealps. Along this corridor the probability of wolf passage gradually decreases moving northwards, until it abruptly drops at the passage between the Ticino Natural Park and Prealps. Once they have reached the Prealps, as reported above, wolves encounter again a wide suitable area where they can easily disperse.

Discussion
The increase of connectivity in fragmented landscapes is essential to restore or improve the genetic exchanges between animal populations, generate new genotypes and increase genetic diversity [57,58,59,60]. This is fundamental to ensure population resilience to future environmental changes. For the long-term viability of wolves in Europe, and to reach a favorable conservation status of the species (mandatory by [61]), it is crucial to preserve large populations and to enhance the dispersal flow between and within them [12]. In this study, we investigated the potential dispersal flow through the lowland between the Apennine population and the recent established Alpine population in Northern Italy, which could have important implications in increasing the genetic exchanges with the Dinaric-Balkan population and interrupting the isolation of the Italian population that has lasted for thousands of years [13,17]. To reach this aim, we combined ensemble habitat suitability and connectivity modeling approaches [e.g. 5]. To predict the suitability degree of the whole study area, we integrated two ensemble models implemented in different study areas, in order to obtain a more realistic suitability prediction. Based on the integration of the two predictive models, approximately one third of the study area was predicted as suitable for the wolf. However, most of the suitable areas were in the hilly and mountain areas of Northern Apennines and Prealps, were forests are widespread and continuous. Conversely, a very small portion of the central lowland area was covered by suitable areas, which were concentrated along the main rivers.
The combination of the results of the two ensemble models showed that rice paddies, arable lands and woodlands were the most important explanatory variables in predicting wolf occurrence. Rice paddies and arable lands had a strongly negative effect on species occurrence, while woodlands had a positive effect. Wolves very often use woodlands or woodland fringes to move or rest because of the shelter provided by forest cover and the high prey densities, whereas open habitats, such as rice paddies and simple arable lands, are usually avoided [55]. The wolf does not avoid agricultural areas just because of the lack of shelter and the low density of prey species, but also due to the anthropogenic disturbance typical of these habitats. Several other studies have shown that anthropogenic pressure is the variable with the most negative effect on wolf occurrence [62,63,64,65]. Moreover, urban centers and agricultural areas, as well as linear infrastructures, such as roads or canals, can be important barriers for wolf movement [64]. The negative effect of such anthropogenic constructs also emerged from our analyses. In fact, the fractional cover of urban areas was another important variable with a strong negative effect on wolf occurrence. A variable with a slightly negative effect on the species was the fractional cover of vineyards. Unlike rice paddies and simple arable lands, vineyards are characterized by a more complex environmental pattern, with small woods scattered in the vineyard matrix. Finally, the relationship between wolf occurrence and the fractional cover of water bodies revealed that areas close to rivers are selected by the species. This choice depends on two fundamental requirements provided by woodland belts bordering rivers in the lowland part of the study area, i.e. protection from human disturbances and the availability of preys [62,55,56,66]. The predicted connectivity maps showed that a very small portion of the study area supports wolf movements, both in the low and high dispersal ability scenarios. The corridor pattern was consistent within the maps obtained for the two dispersal ability scenarios, so we are confident that these results are highly likely. In both dispersal scenarios, most of the corridors were concentrated in Northern Apennine, but very high path densities were not detected in this area. This happened because Apennine areas are widely covered by suitable habitats for wolves [29,62] and there are no obligatory passages in which many least cost paths converged. Moving northwards, the models identified two corridors crossing the lowland areas South to the Po River. Both corridors wind along the lowland, passing through the rare small forest patches scattered in the agricultural plain and reaching the banks of the Po River. The Western corridor moves northwards reaching the Po River and following its course for a while, before breaking off at the mouth of the Sesia River. The low probability of wolf passage throughout its whole extension makes this corridor not functional from an ecological perspective. Moreover, the abrupt change of habitat suitability between the Po River and the Sesia River, characterized by a too narrow belt of forested areas to be functional to the passage of the species, may turn this corridor into an ecological trap [67,68]. The Eastern corridor has a continuous course up to the Prealps and is characterized by a high path density both in the crossing area of the Southern agricultural plain and along the entire course of the Ticino River, while path density abruptly decreases at the crossing point between the Ticino River and the Prealps. On the one hand, the continuity and high permeability of this corridor encourage optimism regarding the existence of a functional ecological connection between the Northern Apennine and the Central Alps, passing through the Ticino Natural Park. On the other hand, the results predicted by the factorial least cost path models showed at least two critical points. First of all, the passage between the Northern Apennines and the Southern border of the Ticino Natural Park is supported by one corridor only, and this can be problematic from a conservation point of view because dispersers seldom follow a single optimal route in a landscape [69]. Moreover, if for any reason the only corridor present in a landscape was no longer functional to the passage of the species, the connectivity of the area would drastically collapse. Redundant corridors are fundamental to strengthen the effectiveness and resilience of the landscape ecological network [70] and to the long-term maintenance of connectivity even in the face of future environmental changes [71]. The second critical point regards the difficult passage from the Northern border of the Ticino Natural Park to the Prealps, which could compromise the functionality of the whole corridor. Here, the permeability of the corridor drops abruptly because of the large number of built-up areas and roads crossing the area [66,72,73]. This hypothesis is supported by the finding of a wolf hit by a car exactly at this crossing point in 2012. From a management point of view, in order to ensure the long-term maintenance of the effectiveness of the Ticino Natural Park as a functional corridor for the wolf it is necessary to enhance connectivity both at the passage between the Northern Apennines and the Southern border of the park and, in particular, between the northern border of the park and the Prealps. Specifically, at the passage between the Northern Apennines and the Southern border of the park, connectivity should be enhanced by reducing the resistance of the agricultural matrix, for example by setting aside fields, planting hedgerows or restoring the banks of the numerous small streams crossing the areas, which already represent natural underpasses for crossing motorways. As regards the Northern passage between the Park and the Prealps, we did not have data on the location of viaducts or drainage channels under traffic roads. Such data should be considered in a local scale study aimed at evaluating in detail the permeability conditions of this crossing area and, eventually, at identifying priority areas to defragment barriers through overpasses and underpasses [55].

Conclusions
In conclusion, the Ticino Natural Park seems to be a functional corridor for sustaining the wolf dispersal flow from the Northern Apennines to the Central Alps, which would increase the probability of a future stable genetic exchange between the Italian and the Dinaric-Balkan populations. However, the analyses also showed that this corridor is fragile and not resilient to possible environmental changes. The Ticino Natural Park is probably the only functional corridor that connects the Apennines to the Central or Eastern Alps by crossing the wide anthropized agricultural plain of Northern Italy. All the other longitudinal rivers crossing this area, even if included in protected areas, are often bordered by a very narrow and fragmented strip of natural vegetation. If, for any reason, the Ticino Natural Park was no longer functional to sustain the wolf dispersal flow, the connectivity across the whole lowland area of Northern Italy would drastically collapse. Therefore, it is crucial to maintain the functionality of the corridor, not only along the Ticino River but also, and above all, in the passageways between the park and the source (Apennines) and destination (Prealps) areas.
Supporting information S1 File. Data. Presence and pseudo-absence wolf locations in the Apennine and Ticino study area. (XLSX) S1 Fig. GLM plots (Ticino study area). Plots representing the relationship between the wolf occurrence probability (y axes) and the fractional cover of the land cover variables (x axes) obtained from the GLM run for the Ticino study area. (TIF) S2 Fig. GBM plots (Ticino study area). Plots representing the relationship between the wolf occurrence probability (y axes) and the fractional cover of the land cover variables (x axes) obtained from the GBM run for the Ticino study area. (TIF) S3 Fig. MaxEnt plots (Ticino study area). Plots representing the relationship between the wolf occurrence probability (y axes) and the fractional cover of the land cover variables (x axes) obtained from the MaxEnt model run for the Ticino study area. (TIF) S4 Fig. GLM plots (Apennine study area). Plots representing the relationship between the wolf occurrence probability (y axes) and the fractional cover of the land cover variables (x axes) obtained from the GLM run for the Apennine study area. (TIF) S5 Fig. GBM plots (Apennine study area). Plots representing the relationship between the wolf occurrence probability (y axes) and the fractional cover of the land cover variables (x axes) obtained from the GBM run for the Apennine study area. (TIF) S6 Fig. MaxEnt plots (Apennine study area). Plots representing the relationship between the wolf occurrence probability (y axes) and the fractional cover of the land cover variables (x axes) obtained from the MaxEnt model run for the Apennine study area. (TIF)