Spatial parameters associated with the risk of banana bunchy top disease in smallholder systems

The Banana Bunchy Top Disease (BBTD), caused by the Banana Bunchy Top Virus (BBTV) is the most important and devastating in many tropical countries. BBTD epidemiology has been little studied, mixed landscape smallholder systems. The relative risks associated with this disease vary between geographical areas and landscapes. This work analyzed the management and vegetation conditions in smallholder gardens to assess the factors linked to landscape-level BBTV transmission and management. Mapping was done in this study area which is in a BBTD-endemic region, involving farmers actively managing the disease, but with household-level decision making. A spatial scanning statistic was used to detect and identify spatial groups at the 5% significance threshold, and a Poisson regression model was used to explore propagation vectors and the effect of surrounding vegetation and crop diversity. Spatial groups with high relative risk were identified in three communities, Dangbo, Houéyogbé, and Adjarra. Significant associations emerged between the BBTD prevalence and some crop diversity, seed systems, and BBTD management linked factors. The identified factors form important candidate management options for the detailed assessment of landscape-scale BBTD management in smallholder communities.


Introduction
Cultivated in more than 150 countries, Musa spp (banana and plantain-hereafter referred to as ''banana") is now the fourth-largest food crop in the world, after rice, wheat, and maize [1]. It is a key staple food for millions of people and plays an important role in the social and economic security of many rural communities. In many parts of Africa, the banana is the ultimate food crop. Banana is consumed ripe (dessert), cooked, or processed into snacks, soft drinks and fermented beverages. Plantain is mostly starchy and cooked before consumption, but one of the most powerful tools in precision agriculture [26]. This approach involves data collection, field variability mapping through the use and development of algorithms, and decision-making to inform management practice [23][24][25][26][27]. Smallholder farmers produce bananas in variable landscapes and crop associations. Banana is cultivated in monoculture plantations in mixtures of varieties often of different ages. Banana association with annual or perennial crop and tree species in cleared primary and secondary forests, and in backyard gardens close to residential areas is common. Thus, the banana canopy is presented to remote scanning tools in variable backgrounds in which tools for mapping species and diseases are still poorly developed. Recently, machine learning protocols involving a first step of identifying bananas accurately in a complex matrix of crops, before narrowing down to disease symptoms have been attempted [23]. This protocol is, however, still limited to the detection of advanced disease symptoms. Ground observations and the assessment of production history are therefore important in determining the underlying disease risk, support recommendations for further manual monitoring, and supporting community management and collaboration for local production recovery.
Studies on BBTD epidemiology are mostly based on subtropical large-scale monocultures [28,29]. Some of these studies have yielded a useful framework for BBTD management in smallholder systems in Africa [20]. [30] designed a model for understanding BBTD prevalence using hypothetical model farms, based on distance from the point of infection. Recently, [28] have modelled the spread of BBTD in sections of a plantation. These studies reveal important aspects of BBTD spread, but do not consider the landscape diversity in smallholder production systems. This study, therefore, investigates the risk factors associated with BBTD prevalence in mixed canopy smallholder production systems. We used UAS, manual ground-truthing, and farmer surveys to take spatial information, such as banana mats location and vegetation diversity, for BBTD risk determination. A ground-truthing assessment was done in parallel to assess BBTD prevalence around individual gardens.
The following sections are arranged as follows: in "Materials" and "Methods", the materials and the methods are briefly presented. "Results" describes the outputs of the analysis. Finally, in "Discussion" we discuss the work, and future studies generated from this study.

Study area
This study was carried out in six municipalities in southern Benin: Akpro-Missérété, Adjarra, Dangbo, Sakété, Athiémé, and Houéyogbé, representing the main banana-growing regions, where BBTD has been reported (Fig 1). These fall in the departments of Ouémé, Plateau, Mono, respectively. Banana and plantain in Benin are mainly grown in the humid south, a region characterized by warm temperatures (27-32˚C) and bimodal rainfall with peaks in April/ May and September and totalling between 1000 and 1300 mm p.a. This study was carried out from July to August 2019, representing the period of high rainfall during the main rainy season. This represents a period after three months of vigorous growth of banana plants during the rainy season.
Authorisation for UAV flight within the territory of the Republic of Benin operation was granted following an application to the Ministry of Agriculture and military council of the presidency (letter attached). Local administration was further informed of project activities, as part of collaborator work with a local university. Farmers who were part of the data collection gave verbal consent for flights and survey (see survey document for wording), with the first demonstration of drone operation being done in a common demonstration garden before the survey was started.
Seventy-one banana gardens in six municipalities were surveyed. A banana garden was considered as a continuous unit with discernible limits, under a single owner, or planted in the same season. These farmers are participants in the BBTD recovery project, hence assumed capable of recognizing most symptoms of BBTD. Their gardens were also assumed to closely represent the banana production systems in this region. Within each municipality, all farmers available for a ground-truthing interview also had their farms surveyed using the UAS. Thus, we created a dataset of crop scouting and survey with the management history for each plot. A banana plot was the unit of the survey, thus where a farmer had more than one plot, each was treated as a separate entity.

Image capture and analysis
The protocol for data collection and analysis is summarized in Fig 2. Images were taken between 10 am and 5 pm, in a clear, calm (cloud cover < 20% and wind speeds between 7 and 34 km/h) using a Phantom 4 Pro and Phantom 4 Pro+ (DJI, China) UAV (Unmanned Aerial Vehicle), equipped with a 1-inch, 20-megapixel sensor multimedia camera. The Phantom 4 Pro+ is capable of recording in Ultra High Definition (UHD) 4K (4096x2160) at 60 fps (feet per second), at a maximum bit rate of 100 Mbps. For data collection, mission planning was done using PIX4D CAPTURE Software (4.5 version). At each garden, the UAV was flown at an altitude of 40 meters, allowing a footprint of approximately 5472 × 3648 pixels with a spatial resolution of 7 cm. The flight mission was designed to leave target plots at the centre of the area surveyed with a buffer of up to 100m. The minimum length of flight area flow is 75m ×75 m, about 5.6 ha, and the minimum number of photos obtained is 230 (Fig 3). The camera was programmed to take a picture every two seconds, ensuring 70-80% overlap between each pair of adjacent images. Thus, every single point was covered at least four times during the mission. All flights maintained the line of sight of the operator to the UAV. Airspace regulations as defined in Benin by the Civil Aviation Authority were followed, and no restricted areas were overflown. The flight time at each site was about 20 minutes and the pattern used is the grid The images collected were automatically geotagged by the onboard GPS, so the orthomosaic could be georeferenced without the need for image, map, or ground control point (GCP) references. The resulting image from the flights was imported into PIX4DFIELDS (1.5 version, PIX4D) and processed to extract an orthomosaic including a correction for topographical distortions, for each plot following a standard procedure: (i) aligning the photographs (accuracy: high; alignment: reference); (ii) constructing a dense point cloud (quality: high; depth filtering: moderate); (iii) constructing a digital elevation model (DEM) (pixel size 7 cm; interpolation: extrapolation; all classes of points to be generated digital surface model); (iv) constructing an orthomosaic (input area: DEM; mixing mode: mosaic) (See S1 Fig

Field ground-truthing survey
Parallel to the UAV study, a ground-truthing survey was conducted in each field to observe selected environmental variables. We recorded the presence and type of vegetation around the target field, wind direction and speed, banana tree density, and types of banana mats used. A survey (n = 71) questionnaire was used to assess the crop and BBTD management history of each plot. These observations were made within and around it at 30m, 60m, and 90m radius from the target plot. Brief historical information of the garden (age of the plot and cropping management) was sought from the proprietors. For all analyses, the response variable was the BBTD symptomatic plants observed, assuming that the number of plants showing BBTD closely estimated the actual number of plants infected with BBTV [20].

Data processing and analysis
The survey forms were encoded in the CSPRO software (7.2 version on www.census.gov), designed beforehand, and summarized for analysis. The database was imported into SATSCAN software (9.6 version, https://www.satscan.org/) for further analysis. SATSCAN was used to detect spatial groups and test if they were statistically significant through purely spatial approaches using discrete scan statistics of Poisson and with high-or low-rate scans for unique risk-associated attributes of geographical areas around the sampling points. A generalized linear model was done in R (3.6 version) following the procedure of Poisson regression [31].
Spatial modelling was done as follows. The notations adopted for the presentation of this method were:

Purely spatial scanning statistics
The spatial scan statistic [32] was used to identify areas with abnormal BBTD incidences that were least consistent with the null hypothesis of constant risk within the study area. This method is based on a likelihood ratio test and tests the hypothesis that disease risk remains constant within the study area. It is very powerful and can be applied to both grouped and single data points [32]. A window of predefined shape (disk or ellipses), of variable size, scans the study area. For each window, a statistic, based on the likelihood ratio and the number of observed and expected cases, was calculated. The likelihood functions follow the theoretical distribution associated with the number of cases. Two distribution models were used: the Poisson's law (aggregate data or when the number of cases is negligible compared to the size of the population) and the binomial law (individual case and control data). The window that corresponded to the maximum likelihood was the most likely group, the one least likely to occur by chance. To test the significance of the groups detected, Monte Carlo simulations are performed (with up to 999 replications).

Detection and identification of BBTD groups
Spatial scan statistics, implemented in the SATSCAN software, were used to test for the presence of BBTD spatial groups and to identify their approximate locations. Group evaluation was performed by comparing the number of cases within the window with the expected number if the cases were randomly distributed spatially. The test of significance of the identified groups was based on a likelihood test whose p-value was obtained by the Monte Carlo test [33]. The identification of spatial groups with high or low rates was carried out under the assumption of the Poisson probability model, using a maximum spatial group size of 5% of the total population. For statistical inference, 999 Monte Carlo replications were performed. The null hypothesis of no group was rejected when the simulated p-value is less than or equal to 0.05 for the primary group and 0.10 for the secondary group.
[34] defined a spatial scanning statistic that imposes a circular window and allows the centre of the disk to move across the map. For any given position of the centroid, the radius of the disk varies continuously from zero to a predefined upper limit (100m). Theoretically, the method uses an infinite number of separate disks, each with a different location and size. For the analysis, we define the disc as comprising no more than 5% of the total population. In the alternative hypothesis, there is at least one disk with a higher relative risk on the inside compared to the outside. For each disk, it is possible to calculate a likelihood that takes into account the number of cases observed inside and outside the disk. The disk that maximizes the likelihood function is called the most likely group or the candidate group. The most common disease mapping applications take into account the Poisson distribution or the binomial distribution of the number of disease cases. The Poisson model was defined as follows: Considering the map G, as our study area, and Z, the disc of current interest and a collec- [34,35]. P z i is the number of individuals in a zone Z i and C z i is the number of cases among P z i . Let A be a zone included in G, P A the number of individuals in A, and C A the random variable of the number of cases in A. For 8A � G, C A is generated by an Inhomogeneous Poisson Process (IPP) such as: where π is the probability of being a case in Z i , P A\Z i is the number of individuals at the intersection of zones A and Z i , δ is the probability of being a case outside Z i , P A\ � Z i the number of individuals in zone A and outside zone Z i . Under the null hypothesis (H 0 ) of risk homogeneity: a: Calculation of the likelihood function for all zones Z i . In the region G, knowing the Z i window, the probability of observing c case is such that: Each case x of the c cases has a probability of being in a given area of G which depends on the population at risk in that area.
The function f(x) of the probability density of the cases in G known Z i was defined such that: The likelihood function for the Z i window is equal to: By developing this equation, we obtain: find the Z i zone that maximizes the likelihood function.
To do this, we ask: When The maximum likelihood estimatorẐ is such that: The likelihood ratio is: The significance of the model was tested by Monte Carlo inference, as the distribution of the statistic is not known. The most likely groups were obtained at the 5% significance level, while the secondary groups are determined at the 10% threshold. This zone is the Z s window (defined in (2.7)) which maximizes the likelihood function and is, therefore, the most likely group.

Relative risk
The real relative risk (RR) is defined as the risk λ z in the zone Z compared to the risk λ (G\Z) in all other regions G | Z: where C Z is the Poisson random variable representing the number of cases in the region Z, with the expected value given by P Z the population of the region Z, P is the total population of the map G, E Z is the expected number of cases in region Z under the null hypothesis, and c is the total number of observed cases. Similarly, we define λ (G|Z) . In this way, the true relative risk is defined as follows: If Z and G | Z both have the same λ Z = λ (G|Z) = λ, the real relative risk is 1. Suppose that Z is selected independently of the observed data. The natural estimator of RR is given by: where c is the total number of cases, c Z is the number of cases in the group Z, E G is the expected number of cases in the region under the null hypothesis, and E Z is the expected number of cases in the region Z under the null hypothesis. This is an unbiased estimator when the region is chosen independently of the observed data. Thus, if the estimated risk in the Z region, c Z /E Z , is close to the estimated risk outside the group (c − c Z )/(E G − E Z ), we have c RR which is close to 1 and there is strong evidence that there is no group on the map G. If c Z /E Z increases relative to (c − c Z )/(E G − E Z ), the estimated relative risk increases by indicating the existence of a group in Z region. However, we are only interested in estimating the RR for the candidate group. BecauseẐ is chosen among all possible Z to maximize (2.5), it is highly dependent on the random variables realized.
The GLM model was run to determine the variables most significantly associated with the response variable (BBTD prevalence) in the examined gardens and their buffer areas. GLM enabled the use of both continuous and categorical variables (Table 1). GLM was run in R software (3.6 version) assuming a Poisson distribution, with a significance threshold of 5%. However, the variables showing a 10% limit were noted. The model was validated using the Hosmer-Lemeshow test and Cameron and Trivedi test. The Hosmer-Lemeshow test was used to assess the relevance of the Poisson regression model. The associated p-value uses the Chisquare distribution, and so the model was assumed to be adapted to the data if the p-value obtained was 1 > 0.05. The Cameron and Trivedi test were used to assess the abnormality of the dispersion of the response variable. The associated p-value obtained is equal to 0.9987 > 0.05. Thus, we conclude that there is a normal dispersion of Y values represented by the number of cases of BBTD.

Results
Spatial clustering analysis of all 71 fields identified three statistically significant high-risk areas, covering a total of 21 fields, distributed in all six communes ( Table 2). The rest of the 50 fields assessed did not have a BBTV infection risk significantly different from the general background. The total number of fields with reported cases is 64 and the total number of reported cases (mats showing BBTD) is 168. A total of 30 groups were detected, 7 of which were significant at the 5% threshold and 11 significant at the 1% threshold. The shapefile of the groups generated was exported and a Keyhole Markup Language (KML) file in GOOGLE EARTH, and projected onto a map to identify the locations of the clusters on the ground. Most of the clusters were detected in Dangbo (13 clusters, 15 fields, 20.4 mean risk factor) and the lowest was detected in Athiémé and Sakété. Results suggest the existence of BBTD "high-risk areas" in certain regions, namely, Dangbo, Houéyogbé, and Adjarra showed relative higher risk and representation of fields with high relative risk compared to Akpro-Missérété, Athiémé, and Sakété (S3 Fig).

The output of Poisson regression
The results of the Poisson regression show 11 significant explanatory variables at the 5% threshold. Crop association with banana was associated with lower BBTD occurrence. In General, 8 crops were found in association with banana/plantains spanning different growth habits annuals, perennials, biennials, and use classes-grains, legumes, and horticultural crops. The more crops are associated, the lower the number of cases of BBTD were observed. Banana monocrops thus had the greatest risk of BBTD in all communes (S4 Fig). Among the individual crop types, fruits crops, and other crops (sugar cane, legumes, taros, and palm oil) had the greatest association with BBTD, while maize and cassava had the lowest. Banana crop preceding the current stand (continuous/perennial banana cropping) gave the greatest risk of BBTD, similar to tomato crops (n = 3, p-value = 0.0069); while maize and cassava preceding the banana crops had the lowest association with BBTD risk (n = 7, p-value = 0.1087). The plots that had been planted with tomato crop one year before the current crop were 1,357 times more likely to have BBTD cases compared to those that had banana-free fallow (S5 Fig). The older the banana garden, the more the number of cases of BBTD were observed. BBTD cases increased by 1.14 times per year, keeping the other parameters constant. Older banana crops also had a higher density of stems per square meter (r = -0.2). The cropping density (mats/ stem per square meter) was positively correlated with disease risk (r = 0.0528, pvalue = 0.0367) (See S6 Fig). Biotic environmental, and crop management parameters were also correlated with the BBTD risk observed. Vector abundance and distribution were the single most important factor associated with BBTD risk (n = 63, p-value = 0.0461). The occurrence of aphids on the main pseudostem and the suckers were both positively correlated with BBTD risk (S7 and S8 Figs). Seed sourcing was also associated with BBTD. Farmers using suckers sourced from the neighbours were 53.1 times more likely to have BBTD in their gardens than those who used their own seeds (see S9 Fig). Several standard cultural disease control options were associated with reduced BBTD occurrence in the investigated fields. Farmers reporting uprooting of diseased banana mats were also 7.9 times also reported higher levels of BBTD (S10 Fig). Similarly, securing clean seed, field isolation, and intercropping were also significantly associated with reduced BBTD. The farther away the field was from an infected field, the smaller the number of cases of BBTD were reported (S11 Fig). Border vegetation of different species was also associated with reduced BBTD infested plants reported (S12 Fig).

Discussion
The present study assessed the spatial and production factors associating with the BBTD risk in banana gardens in southern Benin. This distribution of groups in these zones would be compatible with common risk factors, namely: type of variety, associated crops, the distance between fields, presence of barriers, etc., which would have considerably reduced the number of BBTD cases observed (S13 Fig). In addition, we also identified some estimators that could be useful in deriving BBTD risk in such efforts, especially in low altitude plantain production systems. These are consistent with the risk factors previously identified by [30]. Some level of BBTD risk was found in all regions, with some regions including Dangbo, Houéyogbé, and Adjarra, near the Benin-Nigeria border, being the riskiest, and closest to the regions of the first outbreak within the country. Banana production in Benin is localized in the southern end of the country. There is also some cultivation in the drier areas along river valleys and lowlands. The variables obtained in this study could be useful in initiating community-based approaches to BBTD control in West Africa. The risk of BBTD infection in replanted fields is variable even within BBTD-endemic landscapes. Overall, our data show that crop succession, production, longevity, and informal seed systems to be associated with a higher risk of BBTD; while mixed cropping matrices, higher altitude, and lower surrounding BBTD pressure were linked to reduced BBTD prevalence. Within the cropping management, the diverse kind of banana production (plantain or sweet banana) differed in their BBTD prevalence and calculated risk, as did the identity of the associated crop, the distance between fields, and the presence of barriers. This study isolates some potential combinations of approaches that could help keep banana production in BBTD endemic regions [20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36]. The results of this study suggest a set of tools that could be better refined and tested alone or in combination to determine their potential in BBTD control. Analyzing BBTD reinfection risk is important in determining where to initiate the recovery of bananas in BBTV endemic regions.
Out of the 25 variables tested, eleven showed significant association with BBTD prevalence observed. They fall into three categories: environmental factors, those related to crop growth and vector movement, seed systems and crop management factors. The role of natural barriers in reducing infection levels was an interesting finding. Rows of barriers influence the movement of vectors between plots, slowing disease spread at the landscape level, similar to associated crops within the plantation. A number of these variables could be correlated or interacting with each other. For example, older farms and those with higher banana density had a higher level of BBTD. The growth habit of bananas implies that an older crop would also have more stems per mat than recently established ones in smallholder production systems, where suckers are often not thinned. Thus, the influence of crop age on BBTD prevalence may be partly due to the effect of plant density in older plots. Conversely, intercropped fields have a lower plant density compared to monocrop bananas, both resulting in lower canopy connectivity. The individual and additive effects of these measures would be interesting to test in controlled experiments, to identify priority packages for BBTD control in smallholder systems [20]. Previously, [37] had observed the role of canopy connectivity in BBTD spread at a plantation edge in Australia (subtropical systems). This study highlights the importance of inoculum source management and landscape-scale operations in the management of BBTD in smallholder systems, and the role of population and vector movement in determining BBTD reinfection at the landscape level [20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37][38].
The direct correlation of vector-linked variables (e.g., aphid density, banana density, and presence of barriers) with BBTD risk showed an important contribution of those factors related to vector movement in reducing BBTD risk. Although vector control is typically not done in BBTD management, except in reducing the emigration of aphids during rouging or plant disturbance practices, the role of vector movement is clear. Such practices could include intercropping with non-host species and those likely to influence vector dispersal and host findings. Vector movement might take a temporal variation dependent on the effect of weather or plant growth on the formation of winged aphids, and the risk of vector migration. Thus, such control measures may be more effective in some seasons than others, or entirely unnecessary when vector movement is minimal. The development of winged aphid forms has been associated with vector population, host quality, and environmental cues [39]. Apterous aphids form when populations rise either due to sufficient feeding opportunities or reduced direct mortality from weather factors. This could pose a seasonal change in disease transmission risk from migrating aphids. Optimization in this way would include designing cost-effective packages suitable for specific production situations and seasons. We estimated that the presence of aphid populations could only explain 27% of the total BBTD in these communes. Therefore, other mechanisms of virus dispersion could be driving the BBTD, particularly at slow rows of barriers around the nearest fields.
In this study, GLM was used to analyze the current spatial distribution of BBTD. Spatial scan statistics do not make it possible to take into account explanatory variables in the calculation of the expected number of cases in a spatial unit. However, in epidemiology, the spatial heterogeneity of the number of cases of the disease can be attributable to many factors such as the socio-demographic characteristics of each spatial unit. These factors are taken into account through indirect standardization and by considering only qualitative variables. One possible solution lies in the use of log-linear spatial models to take into account all types of explanatory variables and, consequently, to give an estimate of the expected number of cases per spatial unit. Testing the limits of this approach in epidemiological modelling is beyond the scope of this work, but would be interesting especially as applied to disease risk mapping or estimation of production suitability in agricultural areas. The clusters identified in these departments enclosed more than one farm with BBTD. For banana plots that were up to 5 years old, the preceding crop influenced the associated risk of BBTD infestation. BBTD could have a more complex natural history than simple vector-borne transmission.
It is estimated that less than 5% of the seed planted with bananas in these communes comes from certified seed sources. Without an adequate seed certification system, informal trade in seed bananas might provide a key pathway for long-distance virus dissemination, expanding significantly the range of virus dispersion, and becoming a major driver of disease propagation. Short-distance dispersion (secondary infections) are a result of aphid movement. After the establishment of the infection, vector transmission could then become an efficient shortdistance mechanism of virus dispersion to other plants within the farm and along with the surrounding neighbouring crops.
Nevertheless, improvements could be made to the current model. A possible extension of this method to the Spatio-temporal framework is envisaged: it is sufficient to define an appropriate period between events. This could constitute an alternative to space-time scanning statistics hence improve the temporal response to accommodate management activities in disease modelling. Moreover, using UAVs equipped with a spectral and thermal camera to identify disease plants [23] and to take direct measurements could be possible in the next study and landscape scale to improve the accuracy of vegetation mapping by including plant stress indicators.