EFForTS-LGraf: A landscape generator for creating smallholder-driven land-use mosaics

Spatially-explicit simulation models are commonly used to study complex ecological and socio-economic research questions. Often these models depend on detailed input data, such as initial land-cover maps, to set up model simulations. Here we present the landscape generator EFForTS-LGraf, which provides artificially generated land-use maps of agricultural landscapes shaped by small-scale farms. EFForTS-LGraf is a process-based landscape generator that explicitly incorporates the human dimension of land-use change. The model generates roads and villages that consist of smallholder farming households. These smallholders use different establishment strategies to create fields in their close vicinity. Crop types are distributed to these fields based on crop fractions and specialization levels. EFForTS-LGraf model parameters such as household area or field size frequency distributions can be derived from household surveys or geospatial data. This can be an advantage over the abstract parameters of neutral landscape generators. We tested the model using oil palm and rubber farming in Indonesia as a case study and validated the artificially generated maps against classified satellite images. Our results show that EFForTS-LGraf is able to generate realistic land-cover maps with properties that lie within the boundaries of landscapes from classified satellite images. An applied simulation experiment on landscape-level effects of increasing household area and crop specialization revealed that larger households with higher specialization levels led to spatially more homogeneous and less scattered crop type distributions and a reduced edge area proportion. Thus, EFForTS-LGraf can be applied both to generate maps as inputs for simulation modelling and as a stand-alone tool for specific landscape-scale analyses in the context of ecological-economic studies of smallholder farming systems.


Details on road creation and household placement
EFForTS-LGraf offers three options for road creation: (1) roads can be read in from an existing road map (option: real.shapefile); (2) a road network is artificially created based on a random elevation model (option: artificial.perlin); (3) a road network is created based on the straight road creation algorithm of the G-RaFFe landscape generator (option: artificial.graffe).
To load an existing road map, the shapefile of the roads, its projection file and a second shapefile that contains only the extent of the map must be provided. After the shapefile is loaded, the world is resized if necessary. All cells intersected by a road are identified as road cells.
The Perlin algorithm mimics situations in which roads are created depending on elevation gradients. First, a simple Perlin noise elevation model is created by adding several random noise grids with decreasing weight (Perlin parameters $p1_{r,perl}$, $p2_{r,perl}$, $p3_{r,perl}$) [1,2]. The first road is then created by connecting two random locations in the landscape. To do so, a road-building agent is created at one of these two locations. For each neighboring cell, the road-building agent calculates the distance to the destination cell (the second random location) and the elevation difference to the agent's current location. The total score of each cell is then determined by weighting the distance and elevation criteria according to the continuous parameter $p4_{r,perl} \in [0,1]$. The road-building agent then moves to the cell with the highest score, establishes a road cell there and continues to move and create road cells until it reaches the final destination. Thus, adjusting $p4_{r,perl}$ gives a higher weight either to the distance of the connection (values near 1), which yields straight roads, or to the elevation gradient (values near 0), which yields winding roads. If the target number of road cells $n_{r,art}$ has not yet been reached, another road is created by connecting one random point on an already established road with another location in the landscape that is at least $m_{r,art}$ cells away from any other road.
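The scoring step of the road-building agent can be illustrated with a minimal Python sketch. The function name, the use of the eight surrounding cells and the rescaling of both criteria to [0, 1] are assumptions made for illustration; the text does not specify how distance and elevation difference are normalized before weighting.

```python
import math


def next_road_cell(pos, dest, elevation, p4):
    """Return the neighbouring cell with the highest weighted score.

    Assumptions (not specified in the text): both criteria are rescaled
    to [0, 1] over the neighbours, and a smaller distance or a smaller
    elevation difference gives a higher score.
    """
    rows, cols = len(elevation), len(elevation[0])
    r0, c0 = pos
    candidates = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = r0 + dr, c0 + dc
            if (dr, dc) != (0, 0) and 0 <= r < rows and 0 <= c < cols:
                dist = math.hypot(dest[0] - r, dest[1] - c)
                elev = abs(elevation[r][c] - elevation[r0][c0])
                candidates.append(((r, c), dist, elev))

    def rescale(values):
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    d_scaled = rescale([d for _, d, _ in candidates])
    e_scaled = rescale([e for _, _, e in candidates])
    # p4 near 1 favours short connections, p4 near 0 favours flat terrain.
    scores = [p4 * (1 - d) + (1 - p4) * (1 - e)
              for d, e in zip(d_scaled, e_scaled)]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best][0]
```

Calling this function repeatedly from the returned cell would trace one road. For values of the weight below 1, a real implementation would also need a rule that prevents the agent from revisiting cells, which this sketch omits.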
Under the G-RaFFe algorithm, all roads are straight, start along the left or lower edge of the simulation area and are directed either vertically, horizontally or diagonally within the landscape. Road length is drawn randomly from the interval $[1, \sqrt{w_s^2 + h_s^2}]$. The parameter $m_{r,art}$ determines the minimum parallel distance in cells between two roads. Roads are created until the number of road cells reaches $n_{r,art}$. This road creation option is largely adopted from the road creation algorithm of the G-RaFFe model [3].

Details on field establishment
During the EFForTS-LGraf field establishment procedure, households attempt to establish a field in three steps: first deciding on the field size, second moving to a potential location and third making sure there is enough space to establish a field of the desired size in this location.
Step one: The size f of the field is drawn randomly from the field size frequency distribution.
Step two: Move to an 'others' start cell using one of four search strategies, which can change during the model run:
2. Random walk to an 'others' cell starting from one of the fields of this household (s2.field)
3. Determine an 'others' cell within a defined radius from the home base (s3.nearby)
4. Determine the closest 'others' cell which is surrounded only by 'others' cells (s4.avoid)
The selection and order of these search strategies can be defined on the model interface.
At the start cell, the household turns in a random compass direction (north, east, south or west) and starts the field establishment procedure by trying to establish a first row of cells that will belong to the field. The household agent moves forward cell by cell until it meets a cell that is already a field, a home base or an inaccessible area. To ensure reasonable field shapes, the minimum length $l$ of this first row needs to be the side length of the smallest square that fits into the field size. The maximum length of this first row is then set to the side length of the next larger square that fits into the field size. Because this algorithm restricts fields to nearly square shapes, the maximum length can be further modified by the parameter $s_f$, which proportionally increases the permitted length of the first row. This results in narrower field shapes.
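One possible reading of the row-length rule is sketched below in Python. The interpretation is an assumption: the minimum length is taken as the integer square root of the field size, the maximum as the side of the next larger square, and the shape parameter is treated as a multiplier (1.0 meaning no stretching). The function name is hypothetical.

```python
import math


def first_row_length_bounds(f, s_f=1.0):
    """Return (min_length, max_length) of the first field row in cells.

    Assumed reading: min = floor(sqrt(f)), max = floor(sqrt(f)) + 1,
    with the maximum stretched by the multiplier s_f (>= 1).
    """
    l_min = max(1, math.isqrt(f))
    l_max = math.floor((math.isqrt(f) + 1) * s_f)
    # A single row can never hold more cells than the whole field.
    l_max = min(f, l_max)
    return l_min, max(l_min, l_max)
```

Under these assumptions a field of 10 cells gets a first row of 3 to 4 cells, and doubling the multiplier allows rows of up to 8 cells, which leads to narrower fields.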
If the establishment of the first row is not successful, the other possible compass directions for the first row are tested. Once a first field row is established, the farmer tries to expand the field to the left or right of this row until the field reaches its predefined size $f$. If expansion to one side of the first row does not result in the designated field size, the field is also extended to the other side. If this expansion does not yield the designated field size either, all field patches added so far are removed, the household moves back to the start cell and starts over with the first field row in a different direction. If none of the four compass directions leads to a successful field establishment, the farmer selects a new random start cell and tries again to establish a field of the designated size $f$. The parameter $n_{strat}$ defines the number of locations at which a household tries to establish fields before it switches to the next search strategy. If the household fails to establish fields with all selected strategies, the model stops and reports a warning message that field establishment was not successful. In such cases, the resulting landscape is incomplete because not all households reached their final size. This can happen in particular under model parameterizations in which household and field sizes tend to be very large and the total dimensions of the landscape have not been adjusted accordingly.
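The retry logic described above (four directions per start cell, a fixed number of start cells per strategy, then the next strategy) can be summarized in a short Python sketch. The callables `find_start` and `try_field` are hypothetical placeholders for the search strategies and the row-and-expansion procedure; only the control flow is taken from the text.

```python
import random

DIRECTIONS = ["north", "east", "south", "west"]


def establish_field(f, strategies, n_strat, find_start, try_field, rng=random):
    """Try to place one field of size f.

    Returns (start_cell, direction, strategy) on success, or None if all
    strategies failed (the model would then stop with a warning).
    """
    for strategy in strategies:
        for _ in range(n_strat):
            start = find_start(strategy)
            if start is None:
                continue  # this strategy found no candidate start cell
            # Test all four compass directions in random order.
            for direction in rng.sample(DIRECTIONS, k=4):
                if try_field(start, direction, f):
                    return start, direction, strategy
    return None
```

Passing the strategies as an ordered list mirrors the statement that their selection and order are set by the user.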

Details on EFForTS-LGraf spatial output
The spatial information of generated landscapes can be exported as *.asc raster files in order to allow exchange with other model applications. EFForTS-LGraf contains export functions for the following raster files:
• Road-raster: 0 for non-road cells, 1 for cells that have a road intersecting
• Homebase-raster: -1 for non-home base cells, household identity number (p_homebase_id) for home base cells. Multiple raster files are written if $n_{s,c}$ > 1.
• Ownership-raster: -1 for cells that are not owned, household identity number (p_owner) for cells that are owned by a household. Note: inaccessible areas do not show up in this output.
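For readers who want to produce or check such files outside the model, the following Python sketch writes a grid in the common Esri ASCII raster layout. That the exported *.asc files use exactly these header values is an assumption; the function name and default arguments are hypothetical.

```python
def to_asc(grid, cellsize=1.0, xll=0.0, yll=0.0, nodata=-9999):
    """Serialize a 2D list of cell values as Esri ASCII raster text.

    The first grid row is written first and is read as the northernmost
    row by GIS software.
    """
    nrows, ncols = len(grid), len(grid[0])
    lines = [
        f"ncols {ncols}",
        f"nrows {nrows}",
        f"xllcorner {xll}",
        f"yllcorner {yll}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
    ]
    lines += [" ".join(str(v) for v in row) for row in grid]
    return "\n".join(lines) + "\n"


# Toy ownership raster: -1 = not owned, 3 = cells owned by household 3.
ownership = [[-1, 3], [3, 3]]
asc_text = to_asc(ownership)
```

The resulting string can be written to a file with the extension .asc and opened in most GIS packages.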
Besides raster output, the write-road-shapefiles procedure creates a polyline shapefile from the current road network, which is useful if one of the artificial road creation algorithms has been used.
Additionally, EFForTS-LGraf offers an output function that uses the 3D renderer of NetLogo to create a three-dimensional illustration of the generated landscape, using trees, palms and houses to visualize the different crop types (Fig A1). With minor code adjustments, this feature could also be used to display other crop types.
Figure A1: 3D-rendered snapshot of EFForTS-LGraf. White lines represent the real road polylines derived from GIS data. 'Others' cells are illustrated as green shaded tree shapes. Oil palm plantations are illustrated as orange colored palm rows. Rubber plantations are illustrated as yellow colored tree rows. Agroforestry cells are illustrated as mixed rows and trees in blue and green color. Settlements are illustrated by small house symbols.
We created a parameter input matrix in order to calculate Sobol sensitivity indices for model parameters on different outputs (here: landscape metrics). The parameter matrix was created by varying parameters within defined ranges (Table A1), using 500 samples and 10 bootstrapping replicates, which results in 9500 simulations. For calculating sensitivity indices we used the optimized Sobol estimator proposed by Jansen [4][5][6][7].
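As an illustration of how Jansen-type estimators work, the following pure-Python sketch computes first-order and total-effect indices for a toy model. It is not the analysis code used here: the uniform inputs on [0, 1], the function name and the test model are assumptions, and bootstrapping is omitted.

```python
import random


def jansen_indices(model, k, n, rng):
    """Estimate first-order and total-effect Sobol indices (Jansen).

    Uses two independent sample matrices A and B with k uniform inputs
    and evaluates the model n * (k + 2) times.
    """
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    yA = [model(x) for x in A]
    yB = [model(x) for x in B]
    both = yA + yB
    mean = sum(both) / len(both)
    var = sum((y - mean) ** 2 for y in both) / (len(both) - 1)
    first, total = [], []
    for i in range(k):
        # Matrix A with column i replaced by column i of B.
        yABi = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        first.append(1 - sum((yb - y) ** 2 for yb, y in zip(yB, yABi)) / (2 * n * var))
        total.append(sum((ya - y) ** 2 for ya, y in zip(yA, yABi)) / (2 * n * var))
    return first, total


# Toy additive model: the second input carries four times the variance.
first, total = jansen_indices(lambda x: x[0] + 2 * x[1], k=2, n=20000,
                              rng=random.Random(1))
```

For this additive toy model the first-order and total-effect indices should both be close to 0.2 and 0.8, because there are no interactions between the inputs.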

Approach 2: Validation
To study the range of landscape characteristics that EFForTS-LGraf can create, we performed a second validation approach in addition to the genetic algorithm validation approach (see Approach 2, main text). From the reclassified land-cover map that was used for the genetic algorithm validation approach, we sampled 100 randomly placed landscapes, 100 × 100 cells in size (landscapes may overlap), and calculated the five landscape metrics for each of these sampled landscapes and each land-cover type (fields and 'others'). We compared the landscape characteristics (landscape metrics) of these landscapes to the 9500 artificially generated landscapes from the Sobol sensitivity analysis (approach 1). To allow comparability, all agricultural crop types within the generated landscapes were aggregated to one class (fields) and the remaining area was classified as class 'others'.
Large heterogeneity was found in landscape metrics from 100 randomly placed sample landscapes (100 × 100 cells) within the classified land-cover map (point distributions in Fig A2). Landscape metrics were calculated as a function of 'others' cells area, because most indices are very sensitive to class proportions. The landscape shape index (LSI) quantifies the spatial aggregation of patches for each land-cover type. Compared to our generated landscapes, the sampled landscapes covered a relatively small range of aggregation levels (Fig A2). For landscapes with small 'others' area, our generated landscapes had higher LSI than the sampled landscapes. The largest patch index was almost linearly correlated with the 'others' area (negative for agricultural patches, positive for 'others' patches). With increasing 'others' area, the mean patch area increased exponentially for 'others' patches in both the generated and the sampled landscapes. For fields, the mean patch area was almost constant at a very low level. Compared to our generated landscapes, the number of 'others' patches was relatively low, especially for landscapes with small 'others' area. The patch cohesion index (PCI) quantifies the shape complexity and perimeter density of patches and showed a decreasing trend for agricultural patches with increasing 'others' area. For 'others' patches, the patch cohesion index was almost constant across the 100 sampled landscapes.
Figure A2: Points indicate distributions of landscape metrics from 100 landscape samples (100 × 100 cells) from the reclassified satellite image of the Harapan region. Background shading indicates distributions of landscape metrics from 9500 generated landscapes of the same size and resolution (Sobol sensitivity analysis, approach 1). The darker the shading, the more often a landscape with the corresponding index value has been generated. Index distributions of 'others' cells and patches are presented in the upper row (green points, 'others'), whereas index distributions of field cells and patches are presented in the bottom row (yellow points, fields).
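To make the patch-based metrics concrete, the following Python sketch computes the number of patches, the mean patch area and the largest patch index for one class of a small raster. It is a simplified illustration, not the software used for the analysis: patches are delineated with a 4-neighbour rule (an assumption; an 8-neighbour rule is also common), areas are counted in cells, and all names are hypothetical.

```python
from collections import deque


def patch_sizes(grid, cls):
    """Return the sizes (in cells) of all 4-connected patches of class cls."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != cls or seen[r][c]:
                continue
            # Breadth-first flood fill of one patch.
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def class_metrics(grid, cls):
    """Number of patches, mean patch area and largest patch index (percent)."""
    sizes = patch_sizes(grid, cls)
    total = len(grid) * len(grid[0])
    return {
        "n_patches": len(sizes),
        "mean_patch_area": sum(sizes) / len(sizes) if sizes else 0.0,
        "largest_patch_index": 100.0 * max(sizes, default=0) / total,
        "class_proportion": sum(sizes) / total,
    }


# Toy landscape: F = field, O = 'others'.
landscape = [list("FOOF"), list("FOOO"), list("OOFF")]
field_metrics = class_metrics(landscape, "F")
```

Reporting the class proportion alongside the metrics reflects the point made above that most indices are very sensitive to class proportions.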
Comparing the 9500 landscapes that were generated within the Sobol sensitivity analysis approach (approach 1) with the 100 realistic landscapes, many characteristics found in the sampled landscapes (point distributions in Fig A2) were covered by our generated landscapes (background shading in Fig A2). The generated landscapes covered a wide range of landscape shape index (LSI) values. However, generated landscapes with small 'others' area showed systematically higher LSI values. The largest patch index (LPI) distribution was matched very accurately and all sampled landscapes lay within the boundaries of the 9500 generated landscapes. EFForTS-LGraf was also able to reproduce a wide range of mean patch area although the observed heterogeneity in the sampled landscapes was significantly lower. The variety of the total number of patches was reproduced for field patches, but did not fit well for 'others' patches, especially when 'others' area was small. For the patch cohesion index all sampled landscapes lay within the boundaries of our generated landscapes.
In order to visualize the similarity between sampled and generated land-cover maps, we selected three example landscapes with different 'others' cells area from the 100 samples of the Harapan land-cover map and three example landscapes from the 9500 generated artificial landscapes with matching 'others' area (Fig A3). By visual comparison, the resulting landscapes from EFForTS-LGraf showed similar spatial clustering of patches and patch sizes, although the landscapes from the Harapan land-cover map contained slightly more small field patches that were slightly more scattered across the 'others' area. This matched the observation from the landscape metrics comparison, where the sampled landscapes from the classified land-use map had a significantly greater number of field patches compared to our generated landscapes (Fig A3, n patches).
Figure A4: Five selected landscape metrics for two crop types (oil palm and rubber) and 'others' patches (matrix), calculated for land-cover maps generated by the model using a Latin hypercube sampling approach. Plots show the value of the landscape metrics versus household specialization level on oil palm. Differences in household size distributions are shown by colors, with darker colors indicating larger household size. These are grouped into small (1-1.65 ha, yellow), medium (1.66-2.32 ha, green) and large (2.32-3 ha, purple). Lines show smoothed trends and standard error using the locally weighted scatterplot smoothing (LOESS) method.

Approach 3: Applied case study
We calculated linear regression models for each landscape metric and crop type combination and calculated standardized regression coefficients to estimate parameter and interaction effects on landscape metrics of the generated landscapes (coefficient results, see main text). Additionally, we investigated the raw data by plotting landscape metrics as a function of specialization level, grouped by household size (see Fig A4).
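The idea of standardized regression coefficients with an interaction term can be sketched in plain Python. This is an illustrative assumption about the procedure, not the code used for the analysis: all variables are z-scored, the interaction is the product of the z-scored predictors (other conventions exist), and the function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_coefficients(size, spec, metric):
    """OLS fit of metric ~ size + spec + size:spec on z-scored variables."""
    zs, zp, zy = zscore(size), zscore(spec), zscore(metric)
    X = [[1.0, a, b, a * b] for a, b in zip(zs, zp)]
    k = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, zy)) for i in range(k)]
    _, b_size, b_spec, b_inter = solve(xtx, xty)
    return {"household_size": b_size, "specialization": b_spec,
            "interaction": b_inter}
```

Because all variables share the same scale after standardization, the coefficient magnitudes can be compared directly across parameters and across landscape metrics.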