The use of geographic information system and 1860s cadastral data to model agricultural suitability before heavy mechanization. A case study from Malta

The present study seeks to understand the determinants of land agricultural suitability in Malta before heavy mechanization. A GIS-based Logistic Regression model is built on the basis of the data from mid-1800s cadastral maps (cabreo). This is the first time that such data are being used for the purpose of building a predictive model. The maps record the agricultural quality of parcels (ranging from good to lowest), which is represented by different colours. The study treats the agricultural quality as a depended variable with two levels: optimal (corresponding to the good class) vs. non-optimal quality (mediocre, bad, low, and lowest classes). Seventeen predictors are isolated on the basis of literature review and data availability. Logistic Regression is used to isolate the predictors that can be considered determinants of the agricultural quality. Our model has an optimal discriminatory power (AUC: 0.92). The positive effect on land agricultural quality of the following predictors is considered and discussed: sine of the aspect (odds ratio 1.42), coast distance (2.46), Brown Rendzinas (2.31), Carbonate Raw (2.62) and Xerorendzinas (9.23) soils, distance to minor roads (4.88). Predictors resulting having a negative effect are: terrain elevation (0.96), slope (0.97), distance to the nearest geological fault lines (0.09), Terra Rossa soil (0.46), distance to secondary roads (0.19) and footpaths (0.41). The model isolates a host of topographic and cultural variables, the latter related to human mobility and landscape accessibility, which differentially contributed to the agricultural suitability, providing the bases for the creation of the fragmented and extremely variegated agricultural landscape that is the hallmark of the Maltese Islands. Our findings are also useful to suggest new questions that may be posed to the more meagre evidence from earlier periods.


Introduction
The study of past landscapes and the way they have evolved over long periods of time is key to understanding the changing relationship between humans and their environment. Approaches to the study of long-term landscape change is increasingly making use of an ever-growing PLOS  array of different tools and data-sets, ranging from environmental reconstruction to ethnographic comparison, from to surface survey to map regression. The present study was undertaken in the context of the five-year ERC-funded FRAGSUS project, which examines fragility and sustainability in small island contexts, with a focus of human-environment interactions in prehistoric Malta. The sheer density of human activity on Malta over the past 7,000 years makes the reconstruction of past environments extremely challenging, as the evidence is at best fragmentary, and often obliterated by subsequent erosion and anthropogenic activity [1][2][3]. The challenge increases exponentially for more remote periods in time.
While the primary focus of FRAGSUS is the prehistoric environment, a sound understanding of the subsequent evolution of the landscape was considered essential and useful for a number of key reasons: first, because they form part of the same landscape palimpsest that can only be understood in diachronic terms; second, to allow more informed predictions of where evidence of earlier landscapes may be preserved; third, because different cultural responses in better documented periods may suggest new questions that may be posed to the more meagre evidence from earlier periods, and enrich their interpretations.
Archival records preserved from the early modern period onwards include census records and cadastral maps that may contain detailed records of ownership, productivity and yield of land, range of crops and size of herds, as well as human demography. This record, which tends to be increasingly rich in coverage and detail in more recent centuries, offers ample opportunities for the historical reconstruction of early modern landscapes. In a study of another Mediterranean small island context [4], it has been noted how the uneven nature of such records may give rise to a form of analytical exceptionalism which privileges better-documented periods, and that there is an obvious risk that the unusually detailed historical & ethnographic evidence available for the late eighteenth to twentieth centuries will lead to some gross methodological differences compared to earlier periods that may skew our practical interpretations. These risks notwithstanding, the same researchers conclude that the abundant historical records and standing remains of the late eighteenth to twentieth centuries are a resource that it would be foolish to ignore [4].
Bearing the above pitfalls in mind, but also mindful of the opportunities presented by early modern archival records, the present study examined a comprehensive survey undertaken by the British colonial government in Malta in the 1860s to produce a highly accurate and detailed terrier, or cabreo. It consists of around 750 ink and watercolour drawings of all Crown property in Malta [5]. Bound in three large-format volumes with plans of government properties on the island of Malta, and a fourth with properties on the island of Gozo, these cadastral records, referred to below collectively as the cabreo, are today held in the National Archives of Malta in Rabat. The records of rural properties contained in the cabreo offer a rare and detailed glimpse into the organisation of the productive landscape in the early modern period. The detailed documentation of the productivity or land quality of different parcels of land, described in detail below, presents the researcher with a number of interesting challenges and opportunities. How did the different conditions prevailing in different parts of the landscape influence the recorded productivity of the land? And could the information on productivity recorded for government-owned land be generalised for the wider landscape? How enduring was the influence of these factors on the favourability of land for agriculture? Or in other words, could an improved understanding of the variability in land quality in the early modern landscape shed light on and pose useful questions for the study of earlier landscapes?
The work presented here attempts to provide the basis to address these questions by developing a GIS-based model that aims at understanding to what extent (if any) topographic and cultural factors could have influenced the agricultural quality recorded in the cabreo, so allowing the historic information to be generalised for the entire landscape. The process required to achieve this also entailed some innovative applications of GIS and statistical methods to accommodate historical spatially-referenced data, which are also interesting from a methodological point of view. As a matter of fact, to the best of the authors' knowledge, the present work is the first attempt at using cabreo data for the purpose of model building in a GIS environment.

The study area
The Maltese archipelago is located in the central Mediterranean, about 90 km south of Sicily (Fig 1).
The combined surface area of the archipelago is a mere 316 square km, largely made up by the two principal islands of Malta and Gozo. The present study is focussed on Malta, the largest island in the group. In spite of the small size of the island, the landscape is highly varied and fragmented. The northwest of Malta is characterised by a series of parallel ridges, which are separated by sheltered valleys that have been prime agricultural land at least since the early modern period. The west of the island is characterised by windswept uplands, while the centre and southeast are made up of gently rolling hills that are rather more favourable for agriculture. Several of the factors that may influence land quality, discussed in detail below, are evidently enduring features of the landscape, which may have also influenced land use in much earlier periods. qualitative scale used in the cabreo ranges as follows, in descending order: buona (good), mediocre (mediocre), cattiva (bad), inferiore (lower), infima (lowest). Besides the key to the colour classification, an accompanying hand-written caption at the margin of each map recorded information about the type of crops grown, the presence of farmhouses and stables, the number and types of water facilities, and the presence of animals.
As a preliminary step, the dataset has been scrutinized in order to have a general understanding of the quality of the documentation itself. The whole dataset comprised 550 parcels, including maps of (a) extremely small tenements, (b) tenements that no longer exist due to modern urbanization, and (c) isolated buildings or farmhouses. Unfortunately, some of the images (d) showed colours that varied from those in the colour code noted above for land productivity quality. After excluding the maps falling in the mentioned four mentioned categories (a-d), the total sample of maps remaining was 250. Due to time constrains, and since maps were to be georeferenced and digitized, it was decided to draw a more manageable sub-sample in which maps from each of the three principal geographical regions of Malta could have the same probability of being chosen (namely, the low-lying hills and plains of south-east to central Malta, the parallel ridges and valleys in the northwest of the island, and the uplands in the western region). Random sampling was performed, stratified across the three regions, and 20 random maps for each macro-area were thus obtained. The fraction of the sub-sample relative to the parent dataset was arbitrarily set at 25%. The total area covered by the cabreo sub-sample is 6.70 sq km, corresponding to 2.72% of the area of the island of Malta, the largest island in the Maltese archipelago (246 sq km). The percentage rises up to 3.56% if the extent of areas urbanized today (58 sq km) is subtracted from the total are of the study region. The maps have been given spatial references (using ESRI's ArcGIS 10.1) against georeferenced 1940s survey sheets used as base map (scale: 6 inches to 1 mile), and have been digitized by using polygons (Fig 3).
Different information has been stored in the attribute table of the polygon layer, the most significant for the purposes of the present study being the agricultural quality class registered for each parcel or part thereof. A total of 318 polygons were used (Fig 4).

Logistic regression
Logistic regression (hereafter LR) is widely used in different research fields, spanning from social to hard sciences [10][11][12][13][14][15][16]. It finds extensive use in GIS-based studies [10][11][17][18][19] since it allows modelling of the relation between a nominal dependent variable and independent variables (i.e., predictors) of different types (nominal and/or continuous). The reader is referred to the existing literature for an in-depth treatment of the topic [20][21][22]. LR makes it possible to estimate the probability that a particular outcome of a dependent nominal variable (y) will occur based on information from one or more predictors (x m ). The technique ultimately finds the equation that best predicts the probability p of getting a particular value of y, with p taking values from 0.0 to 1.0. If m is the number of predictors, the general form of the logistic regression model is: Unlike the least-squares method used in linear regression, logistic regression finds the intercept (β 0 ) and slopes (also termed logistic regression's coefficients; β 1 , β 2 . . . β m ) of the best-fitting equation by means of the maximum-likelihood method, which is a computer-intensive technique that finds the values of the parameters under which you would be most likely to get the observed results [23]. The LR equation consists of values of the predictors plus weights estimated by the model to predict the outcome of the dependent variable [24]. Once logistic regression has been run, and the intercept and coefficients have been found, it is possible to calculate the probability of the outcome of y by plugging those parameters and any known value of the predictors into the logistic regression equation.
The model's coefficients can be meaningful interpreted once they are exponentiated and thus expressed in terms of odds ratio [22]. An exponentiated coefficient of 1 leaves the odds for the positive outcome of the dependent variable unchanged, while a coefficient greater or smaller than 1 increases or decreases the odds respectively. For instance, if the presence of, say, a landslide is modelled as dependent on the terrain slope, and assuming that the latter has an estimated coefficient of 0.1871, its odds ratio is 1.206 (i.e., e 0.1871 ). This indicates that a 1-unit increase in slope increases the odds of landslide by a factor of 1.206. In case of categorical predictors, the following interpretation holds. Let's assume that the presence of landslides is modelled as dependent also on the type of soil, and the latter categorical predictor has three levels, soil A, B, and C. Usually, one of the levels is used as reference category and used as baseline for comparison. In other words, assuming that the reference level is soil A, an exponentiated coefficient of, say, 1.5 for soil B and of 0.50 for soil C indicate that B increases the odds for landslide by 1.5 relative to soil A, while C decreases the odds by 0.50 relative to the same reference level.
In this study, the analysis of the area under the ROC curve is used to assess the discriminatory power of the model [20,[25][26][27]. It plots the proportion of cases correctly classified as a positive outcome of the dependent variable (sensitivity) versus the proportion of cases incorrectly classified as a negative outcome (1 minus specificity) for the entire range of possible cutoff points on the model's probability. The area under the ROC curve (AUC) provides an overall measure of the model's ability to discriminate between the two outcomes of the dependent variable [20] in the dataset on which the model has been trained: the more the curve deviates from 45˚, the higher is the model's power. As a rule of thumb, the AUC value can be classified as follows [20]: discriminatory ability no better than chance (0.5), poor (0.5-0.7), acceptable (0.7-0.8), excellent (0.8-0.9), outstanding (0.9-1.0).

Rationale behind the choice of the model's variables
A choice has been made as to what variables could be meaningfully used in this study. While, as touched upon earlier, the dependent variable is the quality of the land productivity as classified in the cabreo data, a critical choice needed to be made regarding the use of the whole cabreo qualitative scale. It was decided to collapse the cabreo classes into two broad ones, i.e. optimal (corresponding to the good class) vs. non-optimal quality (comprising the mediocre, bad, low, and lowest classes). This dichotomization was considered appropriate to this study's goal of understanding which predictors are likely to have contributed to optimum agricultural yield. Of course, the model could be further refined in future stages of this research, making use of the whole cabreo classification, or collapsing categories in a different fashion.
A total of 17 predictors have been used in the model. We must acknowledge that modelling agricultural suitability is not a simple task and there is no unique set of criteria to be taken into consideration when studying agricultural potential [28]. A complex interplay between at least socio-cultural, economic, climatic, environmental, and topographic factors may indeed affect the inherent properties of the land in the context of its use as well as of its abandonment [28,29]. While they cannot be considered the only determinants of agricultural suitability, the predictors used in this study were deemed useful on the basis of the literature review and data availability ( Table 1 and Fig 5).
While factors such as salinity and sodicity, soil texture, soil depth, winds, precipitation, and climate variability are likely to influence land quality in terms of suitability for agriculture [28,[30][31][32][33], the first eight variables listed in Table 1 have been considered potentially important for modelling agricultural suitability for their connection with soil moisture and water availability. Indeed they are widely used in the available literature [28,[34][35][36]. Moisture and  [37,38]. As the table summarizes, some of the predictors affect the water flow, which influences the processes of soil erosion and deposition, which in turn affect soil depth and fertility. Terrain attributes can contribute to predict a significant portion of the spatial organization of sediments and microclimate [39], soil moisture [40], and soil properties [41]. As different authors point out [26,[31][32][33][42][43][44][45][46][47][48], variation in slope, aspect, relative elevation, and curvature, affect the distribution of moisture near the land surface, with slope influencing infiltration, drainage, and runoff. Steeper slopes  are likely to be drier than flat areas due to lower infiltration rates and higher surface runoff, and are also likely to have shallower soils. The amount of solar energy is also higher on steep slopes, since the quantity of solar radiation per unit area of the land surface decreases as the slope decreases. Steeper slopes are obviously more difficult to cultivate relative to more gentle sloping terrains [28,49]. Elevation is associated with lower temperatures and lower moisture content [28,42,44]. In fact, elevation can affect water retention since terrains at greater elevation may have more soil water draining down and are subject to receive less water from upslope [44,46]. Aspect (i.e., slope orientation) influences solar radiation, and hence evapotranspiration, soil moisture, and soil nutrients [41,42,46,50]. Slopes with different orientation are differentially subject to sunlight and prevailing winds. For these reasons, aspect is taken into consideration as an assessment criterion for the selection of the land to be used for agriculture [28,49]. Curvature, which can be further characterized as planform (i.e., perpendicular to the slope direction) and profile (i.e., in the direction of the slope) [51], influences convergence/divergence and acceleration/deceleration of rainwater flow since the latter is related to the convexity or concavity of the terrain. Where acceleration of flow occurs (convex profile curvature) erosion will be higher and soil water content lower, whereas in areas with concave profile curvature erosion is lower [52] and deceleration of flow allows water to accumulate [43]. A Topographic Wetness Index (TWI) has also been taken into account. A combination of parameters such as flow-accumulation and slope of a given cell was used to give an indication of the tendency of water to accumulate at any point of the area under study [43,51]. High values correspond to converging flat terrains, while low values are typical of steep and diverging areas [53][54][55]. The Euclidean distance from the main geological faults has been used as a proxy for fresh water availability. Fault lines may facilitate infiltration of rainwater into the ground [56], also providing fissures and micro-fractures along which the water retained by geological clayish layers may escape [57]. The relation between springs and fault lines has been tested in a preliminary step of the present study. The distance to the nearest fault line of 49 Ghajn toponyms (place-names indicating the actual presence of a spring; located on the basis of an earlier work [58]) has been compared to the distance of 49 random points. The toponyms show a tendency to be closer to the fault lines (mean distance: 251 m) compared to random points (mean distance: 423 m). The mean difference in distance (172 m) is statistically significant (permuted p value based on 999 permutations: 0.011). As for the distance from the coastline, it has been considered as a predictor since lands close to the coast could be more subject to the negative effects of sea-spray and salt-laden air [37].
Soil type (Fig 6) has been chosen as a predictor (comprising 5 levels) since soils provide essential nutrients to plants and their different physical characteristics allow water and air to infiltrate, roots to explore, and biota to thrive [59].
Soils vary in their ability to drain and retain water and nutrients at a different pace, and are therefore differentially suitable for agriculture. The soil classification used for this study is that devised by Lang [60,61] in his detailed analysis of the soils of the Maltese Islands. He provided descriptions of the soils and of their distribution, mapping differences in chemistry, physics, and biology. His work was intended as an aid to agricultural planning and development in the study area. Interestingly, he also provided some remarks on the agricultural suitability of the different soil types, basing his observations on his first-hand knowledge of the Maltese manmade landscape [60,62].
In order to assess the possible influence of land accessibility on the agricultural quality, the distances to the nearest road (classified as main, secondary, or minor) and to the nearest footpath have also been taken into account as predictors (Figs 5 and 7).
Road networks may provide an important driving force in shaping the relation between human activity and spatial phenomena, such as urban growth [63], forest clearance and agricultural abandonment [12,64,65], land use changes and vegetation dynamics [66,67], and the spatial organization of land use [68]. The use of these predictors allows the importance of this cultural factor to be compared to that of environmental and topographic ones [64]. For the same reason [68], the distance to the nearest urban area (as it stood in 1895) has also been included as a predictor. Finally, the geographic coordinates have been entered as predictors (see later on).

Modelling strategy and sample size
Each independent variable was entered into ArcGIS (10.1) as a raster layer. The elevation layer is a Digital Terrain Model (1 m cell size) derived from LiDAR [69] data made available through an agreement signed between the University of Malta and the former Malta Environment and Planning Authority in 2013. Slope, aspect, and curvature layers were obtained from the DTM by using the appropriate tools in ArcGIS Spatial Analyst. Aspect was split into two components, aspect-cosine and aspect-sine [46,51,70], producing values from 1 to -1. North-facing slopes have aspect-cosine tending to 1, whereas south-facing ones tend to -1. East-facing slopes have aspect-sine tending to 1, whereas west-facing slopes will tend to -1. The TWI raster was produced in ArcGIS using the Geomorphometry and Gradient Metrics toolbox created by Jeffrey Evans and colleagues [71]. The raster layers expressing the distance from the coastline, geological faults, road types, and 1800s urban areas, were generated on the basis of vector data created by digitazing those features against different survey sheets, which have been preliminarily georeferenced. The 1993 geological map of the Maltese Islands (scale 1:25,000) [57] has been used for digitizing the fault lines, the geological formations (which have not been used as A set of random points (n = 16,643) with a minimum spacing of 20m was generated within the polygon layer representing the cabreo maps. Each point has been used as a sampling location [11] and has been given values of both the dependent variable and the predictors. The spacing of the points was chosen to alleviate spatial autocorrelation [72], which poses issues to traditional statistical methods in terms of inference, coefficients estimation, and assessment of the relative importance of predictors [72][73][74][75][76]. Two additional steps were taken beside the 20m points spacing. First, a 25% random sub-sample of points (n = 3,897) was drawn from the larger universe of sampling points to further increase the between-points distance relative to the full dataset [63]. The subjective choice of the 25% fraction was made in an attempt to find a balance between: a) dealing with a more manageable sample, which could also be less prone to assure statistical significance to otherwise small and unimportant effects, just as a result of the increase in power due to the very large sample size [20,22]; b) assuring an adequate sample size to properly perform LR (as discussed later on). Secondly, the geographic coordinates of each sampling point have been entered in the model as predictors. Besides using an auto-covariate term as predictor [72][73][74]76,77], adding geographic coordinates as extra predictors is well documented in the literature as a method to alleviate spatial autocorrelation [63,[78][79][80][81][82][83]. The presence of spatial autocorrelation in the model's residuals has been nonetheless tested [84][85][86][87][88] to formally check if the fitted model and the estimated coefficients can be considered reliable.
Predictors have been preliminarily checked for the presence of a strong correlation (i.e., collinearity) among them [89]. Pearson's r has been calculated between pairs of predictors, and 0.70 has been considered as a critical threshold [89][90][91]. Table 2 shows that there is no critical collinearity among predictors.
Variance Inflation Factor was also considered since no pair of predictors is critically correlated but there are several variables tied by interdependencies [90]. VIF shows how much the variance of the coefficient estimate is being inflated by multicollinearity [90]. In our case (Table 3), no predictor has a VIF larger than 10, which is the critical value suggested in the literature [89,90].
Geology has not been used as predictor since, in a preliminary step of the present study, it proved (as expected) strongly correlated to soil types (Table 4).
As for sample size for LR [20], we followed Peduzzi et al. [93] who provide a guideline for calculating the minimum sample size: considering at least 10 observations per predictor, divided by the proportion of negative ('non-optimal' quality; n = 2,965) or positive cases ('optimal' quality; n = 932), whichever is smaller. In our case, the minimum size is 708 observations (i.e., 10x17:0.24). The sample used in this study is more than five times larger than the minimum necessary for a model with 17 predictors. The dataset on which the model has been built is available in a tab-delimited.txt file (which can be easily imported into any statistical program) provided as Supporting Information (S1 Text. Dataset).

Predictors selection and model validation
The "best" model was selected via the backward stepwise procedure implemented by D. Rizopoulos's boot.StepAIC R package [94]. While mixed opinions exist about stepwise procedures (1) elevation - [20, [95][96][97], the aim of such an approach is to isolate a parsimonious model. The principle of parsimony suggests avoiding models with unnecessary complexities, and to give preference to simpler models (in comparative terms) that explain the greatest amount of data variability with the lowest level of complexity [98,99]. The package implements the model selection devised by Austin and Tu [100], widely used in literature [101][102][103][104]. The assessment of how much the selected model is able to generalize outside the training data (i.e., model validation [20,27,105,106]) has been performed by means of internal validation, following the method described by Arboretti Giancristofaro-Salmaso [27], which has been implemented in R [107,108]. Further details about both procedures are provided as Supporting Information (S2 Text. Predictors selection and model validation).

Results
Eleven predictors out of 17 candidates can be considered truly independent ones since they were selected in practically all the 1000 bootstrap resamples ( Table 5). The six excluded predictors (planform curvature, TWI, profile curvature, distance to the nearest urban area, cosine of the aspect, distance to the nearest main road) were selected only in tiny fractions of the samples, and their estimated coefficients were unstable, indicating that they do not contribute to the prediction of the outcome of the dependent variable (Table 6). The model comprising the selected predictors is statistically significant: the p value of the difference between the null and the full model is well below 0.01, indicating that the predictors significantly affect the outcome of the dependent variable. The discriminatory power of the model can be considered excellent/outstanding, according to the 5-tiered scale described earlier: the AUC value is 0.92. As for model validation (Fig 8A), the fitting distribution of AUC is excellent, with a median value which is practically identical to the AUC for the original full dataset, and a minimum value of 0.91 which still points to an excellent/outstanding discriminatory power. The validation distribution of AUC is excellent, with about 50% of the values falling between 0.91 and 0.92, and nearly 25% of them being larger than 0.89. This indicates that the model excellently discriminates both on the original sample and outside it. No spatial autocorrelation proves present in the model residuals. The spatial correlogram returned by the SAM 4.0 program [109] shows that the values of Moran's I are extremely close to 0 along all the 22 distance classes (Fig 8B). Their statistical significance is just the result of the very large sample size.
The details of the fitted model are summarized in Table 7.
The estimated coefficients and the constant were fed into ArcGIS via the Raster Calculator facility in order to produce a raster representing the fitted model. The raster has been given a colour scale reflecting the probability for 'optimal' land quality, ranging from the lowest (red = 0.0 probability) to the highest (green = 1.0 probability) (Fig 9).
As for the predictors related to topography, elevation and slope have a negative effect on the chances for optimal agricultural quality: when they increase by 1 unit, the odds of optimal agricultural quality decrease by a factor of 0.96 and 0.97 respectively (or, put another way, the odds are 0.96 and 0.97 times as small for an additional metre of elevation and degree of slope, respectively). The sine of the aspect turns out to have a positive effect according to the model: as it increases (i.e., turning from west to east), the odds of optimal agricultural quality increase by 1.42. The distance to the coastline and to the nearest geological fault have an opposite effect on the outcome of the dependent variable. As the distance to the coastline increases by 1 unit, the odds of optimal agricultural quality increase by 2.42, whereas 1-unit increase in distance to the nearest fault line decreases the odds by 0.09.
The fitted model also allows to assess the effect of different soil types on the outcome of the dependent variable. Brown Rendzinas soils are associated with an increase in the odds of optimal agricultural quality by a factor of 2.31, and the same holds true for Carbonate Raw and Xerorendzinas soils, by a factor of 2.62 and 9.23 respectively. On the other hand, Terra Rossa soils are associated with a decrease in the odds by 0.46. As for the cultural predictors related to land accessibility (i.e., distance to the road network), the distance to the nearest secondary road and to the nearest footpath was found to have a negative effect on the odds of optimal agricultural quality. As the model indicates, a 1-unit increase in those variables translates into a decrease of the odds by 0.19 and 0.41 respectively. On the other hand, a 1-unit increase in the distance to the nearest minor road is associated with an increase in the odds of optimal agricultural quality by a factor of 4.88.

Discussion
The model made it possible to isolate a host of variables that had an influence, either positive or negative, on the suitability for agriculture at the time when the cabreo was created. The literature reviewed earlier in this work provides grounds to understand the negative impact of elevation and slope. At least three factors may account for the negative contribution of elevation. As already mentioned, relative elevation affects water retention: terrain at higher elevation drain more readily and receives less water from upslope. In addition, terrain at lower elevations is generally better sheltered against the negative effects of winds [37]. Furthermore, changes in elevation go hand-in-hand with changes in temperature. The negative impact of the slope on agricultural quality is consistent with many aspects of the reviewed literature. Steep slopes can in fact be considered a handicap [32] for agricultural areas not only because they hamper the use of mechanical devices [30], but also because slope influences solar radiation, water infiltration rates, surface runoff, moisture, erosion, and depth of soils. The practice of terracing aims at reducing some of these negative effects [110,111]. The modelled increase in the odds of optimal agricultural quality as the terrain starts to gently slope down is consistent with Rolé's [38] remarks describing an increase in productivity as ones moves from midslope to the valley, which is also characterised by deeper soils [62]. This was already stressed by Lang [61], who noted that alluvial flats, flatter lands and (notably) damper valley bottoms were usually heavily cultivated in Malta. An interesting parallel is provided, for instance, by northwestern Keos (in the Aegean area) where in the early 1900s agricultural suitability was shaped by different factors, including physical ones. As Cherry et al. [68] argue, valley bottoms were preferred because they were more accessible, easier to irrigate, and characterised by deeper soils.
The influence of eastern aspects as opposed to western ones, with the former being more suitable for agriculture, turns out to be an interesting result. On the grounds of what was reviewed earlier in this study, it can be posited that east-facing slopes were more favourable for being relatively cooler than those exposed to south and west, so retaining more soil moisture. This can be considered a crucial factor for agriculture especially in a climate like that of the Maltese archipelago, where summers are hot and dry, and springs are characterised by rainfall deficit [62], with a resulting high rate of evapo-transpiration [112]. On the other hand, the decrease in the odds of optimal agricultural quality associated with west-facing slopes is consistent with the fact that slopes facing west and southwest receive a greater amount of solar radiance [43], resulting in drier conditions and in a different microclimate at ground level. Given that young plants and seedlings are killed by heat stresses [31], e.g. by temperature exceeding 38 degrees Celsius [37], it makes perfect sense that the cooler slopes have a positive effect on the chances for optimal agricultural quality. Eastern exposures also benefit from morning sun that allows plants to dry from dew or rain sooner than those on western slopes. Moreover, in the study area, east-facing slopes are more sheltered from the prevailing winds. Meteorological data gathered between 1997-2006 indicate that the most frequent direction is from northwest, followed in frequency by westerly winds [113].
The positive impact of the distance to the coastline can be explained in the light of what was touched upon previously. Being relatively distant from the coast implies being less prone to sea-spray and salt-laden air. On the other hand, the model shows that the distance to the nearest geological fault line has a negative impact on the odds of optimal agricultural quality. Since this predictor has been used as proxy for fresh water availability, the model seems to coherently indicate that being progressively far from a fault line translates into a decrease in the chances to have access to fresh water. This can be easily considered a crucial factor for agricultural development in an arid climate such as the Maltese one [62].
Another interesting achievement of the study is the possibility of assessing the contribution of different soil types to the agricultural quality as recorded in the cabreo. The model confirms many of the empirical observations made by Lang in the 1960s, and there seems to be a good correspondence between his remarks on land productivity and the agricultural suitability as predicted by the fitted model. The soils defined by Lang as giving satisfactory crops were Xerorendzinas, Brown Rendzinas, and Carbonate Raw [61]. The first two belong to those Rendzina soils that, elsewhere in Europe [114,115], are regarded as having favourable physical properties, characterised by high water infiltration rates when wet and high water-holding capacity, resulting in high biological activity and high natural fertility. According to the model, they are associated with an increase in the odds of optimal agricultural quality, which is consistent with Lang's remarks. The latter are consistent with the model's results also in relation to Terra Rossa soils, which were rather dry, compact, and difficult to cultivate and were usually left uncultivated, or used for bird catching or sheep/goat grazing [61]. Notably, the model proposed in this study pointed to this type of soil as a negative factor for optimal agricultural quality. It is worth stressing that the areas for which the model estimated a low probability of optimal agricultural quality can be nonetheless thought of as being potentially good for purposes other than agriculture. Indeed, historical and ethnographic sources from Malta and elsewhere reveal that thin-soiled and scrub-covered karstland was used for a variety of purposes that turned an apparently unproductive landscape into an important part of the agrarian economy. They actually provided grazing grounds for sheep and goats, quarried stone for construction, brushwood for fuel, apart from herbs, greens, wild game, and flowering plants for bee pasture [116][117][118]. It is also worthy of note that in our cabreo sample farmhouses incorporating pens for traction animals and others (stalle, or stables) tend to occur in areas with predicted low probability for optimal agriculture, or on the fringe of good agricultural zones (see the aforementioned Fig 9). Moreover, public spaces (spazi pubblici) or wasteland occur amongst such uncultivated areas. If this form of opportunistic exploitation of a common resource and resulting economic return is often hard to assess, the investment in demarcating such apparently unproductive areas with rubble walls and ensuring access to them for humans and herds through walled paths or tracks is hard to miss. As a matter of fact, our data indicate that there is a tendency for stables to be located close to public spaces. A randomized test, performed with the aid of the PASSaGE v.2 program [119], shows that the average minimum distance is 635 m, which is significantly smaller (p: <0.01) than the randomized average minimum distance of 1141 m, calculated across 999 permutations. It would appear, therefore, that there is a 'symbiosis' between the location of farmhouses incorporating stables, on the one hand, and these public spaces on the other. Indeed, we are currently investigating the latter aspect in the framework of the same project to which reference has been made earlier on in this study.
Employing the fitted cabreo model as constrain, we will use GIS to isolate potential foraging routes from stables toward those areas known from ethnographic accounts for being used as pastures. Among other things, the likely routes (generated via Least-Cost Path analysis) will be compared to information derived from interviews of local shepherds and to evidence regarding the spatial distribution of disappeared villages originally connected to the movement of flocks across the landscape [120]. All in all, our research into the pastoral foraging landscape will aim at exploring the ways in which zones flagged by the cabreo model as non optimal for agriculture could have been used for other aspects of the economic exploitation of the Maltese landscape.
The model's results regarding land accessibility are also interesting. The analysis formally shows that accessibility had an effect on agricultural suitability; in particular, the more distant a plot was from secondary roads, the lesser the odds of optimal agricultural quality. This makes sense inasmuch the secondary road network can be thought of as allowing a gradual shift from urbanized areas to more peripheral zones, going deeper into the landscape relative to the main road system (which, remarkably, turned out not to have a significant contribution to the model), and providing access to the countryside. Seen from this perspective, it is not by chance if being distant from secondary roads (i.e., being in less accessible plots of land) decreases the odds of optimal agricultural quality, according to the model. While this proves an interesting new acquisition for the study area, it is something that has been stressed in ethnoarchaeological studies of rural settlements and land use elsewhere in the Mediterranean. For instance, Cherry et al. [68] have stressed that in northwestern Keos land accessibility was one of the factors influencing the decision to cultivate a particular land as of early 1900s, before the use of motorized transport became widespread. This situation now finds an interesting parallel in mid-1800s Malta.
The issue of land accessibility and its interpretation holds true for the distance to the nearest footpath, which has a similar negative effect: as seen, the increase in distance to footpaths is associated with a decrease in the odds for optimal land quality. It must be noted that the model's results for the distance to the nearest minor road seem counterintuitive, and deserve comments. If seen from the standpoint of the secondary road network and of its ability to make the landscape accessible, one would expect minor roads to have an effect similar to that of secondary roads. Yet, unlike secondary roads, the distance to the nearest minor road is associated with an increase in odds, i.e. the larger the distance the larger the odds for optimal quality. A review of the literature shows that proximity to roads may alter the chemical composition of adjoining soils, leading to a decrease in fertility [121][122][123]. Besides chemical factors, physical ones may also account for a decrease in agricultural suitability: S. Vella [62], for instance, argues that roads increase the quantity and velocity of runoff, enhancing soil erosion and damage to rubble walls. She goes on saying that loose materials used for surfaced tracks release gravel, which may increase the erosive ability of runoff. From this standpoint, the model result as to distance to the nearest minor road may prove less counterintuitive, and may help explaining the modelled increase in the odds of optimal agricultural quality as one moves away from minor roads. Such a scenario may explain what is happening in the immediate vicinity of a road. However, it may well be that minor roads occur in areas of garrigue and karstland which in the model correspond to an area less favourable for agriculture. Optimal land quality would be found further away from these, downslope or in valley bottoms, areas highlighted by the model for their agricultural suitability.
All in all, our findings shed considerable new light on some of the questions posed in the introduction to this paper. One of the striking characteristics displayed both by the archival evidence and by the resulting model is the wide variability in land quality that is evident even over small distances. The rugged profile of the island presents dramatically different microenvironments which typically range from exposed plateaux, to well-watered but steep clay slopes, to more sheltered valley bottoms with rich and deep alluvial and colluvial fills. Each of these micro-environments may present very different agricultural affordances. As stressed, some may be unsuitable for crop cultivation, but ideal for grazing for sheep and goat. The fragmented and variable nature of the Maltese landscape is interestingly an enduring characteristic, which would have been no less variable in more remote periods, and even during prehistory. Each of these environments may of course have undergone considerable transformations over long periods of time, resulting in very different agricultural affordances in different periods. However this does not alter the basic fact that the Maltese landscape, like many small island environments, presents a range of different agricultural opportunities in close juxtaposition. Mixed subsistence strategies, such as those combining animal grazing and food collecting on more inhospitable areas, with crop cultivation in more sheltered and favourable zones, appear to be better suited to such an environment. This characteristic should be taken into account in any research efforts to reconstruct prehistoric and ancient subsistence strategies in such landscapes.

Conclusions
This work has sought to use for the first time mid-1800s cabreo maps as a basis to develop a statistical model in a GIS environment to understand possible determinants of agricultural quality in Malta before heavy mechanization. Logistic regression modelling has made it possible to gauge the effect of a host of topographic and cultural factors on the agricultural quality, enabling us to build a predictive model for the entire study area. It has been therefore possible to isolate a negative contribution of some topographic factors such as elevation, slope, distance to the nearest geological fault line (used as proxy for fresh water availability). Other factors, such as distance to the coast, turned out to have a positive effect on the chances for optimal agricultural quality. The model has also indicated that different soil types had different effects on agricultural quality, with Terra Rossa proving to have a negative influence on agriculture relative to other types of soils. This proved to be consistent with observations published almost a century later, hence pointing to an interesting continuity in some constrains to optimal agriculture in the Maltese landscape. Remarkably, the analysis also showed that cultural factors, such as roads network and related mobility and landscape accessibility, had a constraining role as well, in a way consistent with findings in other Mediterranean islands. It has also been stressed that wide zones flagged as non-optimal for agriculture by the model must have been nonetheless good for a host of other activities such as grazing grounds, quarrying, fuel procurement, as both documentary evidence and the location of farmhouses incorporating stables indeed indicate. They were therefore part of a broader economic picture. All in all, our model has shown that different factors are likely to have shaped the agricultural landscape of the Maltese Islands. A host of topographic and cultural factors, the latter related to human mobility and landscape accessibility, did contribute to differential agricultural suitability. They provided the bases to the creation of that fragmented and extremely variegated agricultural landscape that is the hallmark of the Maltese Islands.