Mapping spatial distribution and geographic shifts of East African highland banana (Musa spp.) in Uganda

doi:10.1371/journal.pone.0263439

Fig 1.

Scanned map of historical locations with banana in 1958.

Each dot represents 2.02 km² of standing banana. A total of 2,400 dots are spread across Uganda, equivalent to 4,856 km² of banana. Image reproduced with permission of Giles Clark, the copyright holder.
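
The stated total does not follow exactly from the rounded per-dot area, since 2400 × 2.02 km² is about 4,848 km². One possible explanation, which is an assumption here and is not stated in the caption, is that each dot stands for 500 acres (about 2.0234 km²). The short Python sketch below simply checks both readings; the function name `dots_to_km2` is illustrative.

```python
# Assumption (not stated in the caption): one dot = 500 acres.
ACRE_M2 = 4046.8564224  # square metres per international acre
DOTS = 2400             # dot count given in the caption


def dots_to_km2(n_dots, km2_per_dot):
    """Total area represented by n_dots dots of equal size."""
    return n_dots * km2_per_dot


# Reading 1: use the rounded per-dot area printed in the caption.
rounded_total = dots_to_km2(DOTS, 2.02)

# Reading 2: assume each dot is 500 acres, converted to km².
acre_based_total = dots_to_km2(DOTS, 500 * ACRE_M2 / 1e6)

print(round(rounded_total), round(acre_based_total))
```

Under the 500-acre assumption the total rounds to the 4,856 km² given in the caption, which suggests the 2.02 km² figure is a rounded value.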

Fig 2.

Sampled locations with banana in 2016.

Each dot represents the centroid of a quadrat of 10,000 m² used for collecting banana presence/absence information during the Geosurvey. Data acquired with permission of Markus Walsh, Africa Soils Information Service (https://doi.org/10.17605/OSF.IO/J8Y3Z).

Fig 3.

Correlation among selected covariates: A) 29 covariates after recursive feature elimination; B) 17 covariates with an absolute Pearson’s correlation coefficient (|r|) below 0.7; C) 12 covariates selected using a subjective approach.
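
As a rough illustration of the correlation filter in panel B, the Python sketch below keeps a covariate only if its absolute Pearson correlation with every covariate already kept is below 0.7. The greedy order, the function names and the toy values are all assumptions; the caption does not say which member of a correlated pair was retained.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.7):
    """Greedy filter: keep a covariate only if |r| < threshold
    against every covariate that has already been kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept


# Toy covariates (invented values, for illustration only).
demo = {
    "elevation": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temperature": [9.8, 8.1, 6.2, 3.9, 2.0],  # falls as elevation rises
    "rainfall": [3.0, 1.0, 4.0, 1.0, 5.0],
}
print(drop_correlated(demo))
```

Because the filter is greedy, the result depends on the order in which covariates are visited; a ranking such as variable importance could be used to decide that order.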

Table 1.

List of 29 covariates selected from 71 candidate variables using recursive feature elimination, with the 17 uncorrelated covariates retained after further selection shaded grey.

Table 2.

List of 12 covariates selected subjectively and the reasons for their selection.

Table 3.

Average nearest neighbour analysis before and after filtering of data points.
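
For readers unfamiliar with the statistic, the Python sketch below computes the usual average nearest neighbour ratio: the observed mean nearest-neighbour distance divided by the distance expected under complete spatial randomness, 0.5 / sqrt(n / A). Values below 1 indicate clustering and values above 1 indicate dispersion. The point sets and the study-area value are invented, and no edge correction is applied; this is not the authors' implementation.

```python
import math


def ann_ratio(points, area):
    """Average nearest neighbour ratio for (x, y) points in a region
    of the given area (same length units squared)."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its closest other point.
        nearest.append(min(math.hypot(xi - xj, yi - yj)
                           for j, (xj, yj) in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # expectation under randomness
    return observed / expected


# Toy data: a regular 4 x 4 grid and a tight cluster, both in a 4 x 4 area.
grid = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]
clustered = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]

print(ann_ratio(grid, 16.0), ann_ratio(clustered, 16.0))
```

Filtering clustered sample points would be expected to move the ratio towards or above 1.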

Fig 4.

Performance metrics (A: Adjusted F-measure, B: Brier score, C: Geometric mean, D: Cohen’s Kappa, E: PR AUC, F: ROC AUC) for random forest (RF), gradient boosted machines (GBM) and neural networks (NN) trained on the 12 covariates chosen via subjective feature selection. Each algorithm was trained under three different sampling scenarios: oversampling (OS), undersampling (US) and without sampling (WS). The black line and red dot inside the box are the median and mean, respectively.
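
Three of the metrics in these panels have simple closed forms. The Python sketch below uses their standard textbook definitions for a binary presence/absence problem: the Brier score is the mean squared difference between predicted probability and outcome, the geometric mean is the square root of sensitivity times specificity, and Cohen's Kappa compares observed agreement with chance agreement. The labels, probabilities and the 0.5 cut-off are invented for illustration.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def brier_score(y_true, probs):
    """Mean squared error of predicted probabilities (lower is better)."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


def geometric_mean(y_true, y_pred):
    """Square root of sensitivity times specificity."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return math.sqrt(tp / (tp + fn) * tn / (tn + fp))


def cohens_kappa(y_true, y_pred):
    """Agreement beyond chance between predictions and observations."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - chance) / (1 - chance)


# Toy imbalanced sample: 2 presences, 6 absences.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.4, 0.2, 0.1, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in probs]

print(brier_score(y_true, probs),
      geometric_mean(y_true, y_pred),
      cohens_kappa(y_true, y_pred))
```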

Fig 5.

Performance metrics (A: Adjusted F-measure, B: Brier score, C: Geometric mean, D: Cohen’s Kappa, E: PR AUC, F: ROC AUC) for random forest (RF), gradient boosted machines (GBM) and neural networks (NN) trained on the 17 covariates selected using recursive feature elimination. Each algorithm was trained under three different sampling scenarios: oversampling (OS), undersampling (US) and without sampling (WS). The black line and red dot inside the box are the median and mean, respectively.
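
The three sampling scenarios can be sketched as follows, assuming plain random resampling of a binary training set. The caption does not say which resampling procedure was used, so the function `rebalance` and its behaviour are assumptions for illustration only.

```python
import random


def rebalance(labels, mode, seed=0):
    """Return training-row indices under one sampling scenario.

    mode: "OS" duplicates random minority rows until classes are equal,
          "US" keeps a random majority subset the size of the minority,
          "WS" leaves the data untouched.
    """
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for i, label in enumerate(labels):
        groups[label].append(i)
    minority, majority = sorted(groups.values(), key=len)

    if mode == "OS":
        extra = rng.choices(minority, k=len(majority) - len(minority))
        return majority + minority + extra
    if mode == "US":
        return minority + rng.sample(majority, len(minority))
    if mode == "WS":
        return list(range(len(labels)))
    raise ValueError(f"unknown sampling mode: {mode}")


# Toy labels: 2 presences and 8 absences.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
for mode in ("OS", "US", "WS"):
    idx = rebalance(labels, mode)
    print(mode, len(idx), sum(labels[i] for i in idx))
```

Resampling would normally be applied to the training folds only, so that the evaluation data keep the original class balance.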

Fig 6.

Performance metrics (A: Adjusted F-measure, B: Brier score, C: Geometric mean, D: Cohen’s Kappa, E: PR AUC, F: ROC AUC) for the ensemble models. The black line and red dot inside the box are the median and mean, respectively. Wilcoxon rank test significance values: not significant (ns) p > 0.05; * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001.

Fig 7.

Predicted banana distribution map (2016) using an ensemble model from RF, GBM and NN trained on A) 12 covariates and B) 17 covariates. The maps were refined using the SAGA majority filtering tool within QGIS. Probabilities were converted into categories of banana presence using the probability threshold of 0.25, which maximizes the sum of the true positive rate and true negative rate (Max TPR+TNR).
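
The Max TPR+TNR rule can be sketched as a search over candidate thresholds for the one with the largest sum of true positive rate and true negative rate. In the Python example below the labels and probabilities are invented and were chosen so that the optimum falls at 0.25; they are not the study's data, and `best_threshold` is an illustrative name.

```python
def best_threshold(y_true, probs, candidates):
    """Return the candidate threshold that maximizes TPR + TNR."""
    positives = sum(y_true)
    negatives = len(y_true) - positives

    def score(threshold):
        # A cell is predicted as presence when its probability
        # is at or above the threshold.
        tp = sum(1 for t, p in zip(y_true, probs)
                 if t == 1 and p >= threshold)
        tn = sum(1 for t, p in zip(y_true, probs)
                 if t == 0 and p < threshold)
        return tp / positives + tn / negatives

    return max(candidates, key=score)


# Toy validation set: 3 presences, 7 absences.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
probs = [0.8, 0.3, 0.26, 0.05, 0.1, 0.2, 0.15, 0.3, 0.1, 0.05]
candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95

threshold = best_threshold(y_true, probs, candidates)
presence = [1 if p >= threshold else 0 for p in probs]
print(threshold, presence)
```

With imbalanced data the selected threshold is often well below 0.5, because lowering it recovers more of the rare presences at a small cost in specificity.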

Table 4.

Comparison between logistic regression models of different complexity and structure derived from the 12 covariates chosen using subjective feature selection.

Table 5.

Summary results of the logistic regression model M2-12, including the significant two-way interactions, fitted to maximise the log-likelihood.

Fig 8.

Predicted banana distribution map (2016) using logistic regression model M2-12 fitted using 12 covariates and significant two-way interactions.

Fig 9.

Spatial distribution of banana: A) historical banana distribution (1958); B) latest banana distribution (2016) predicted using the ensemble model of RF, GBM and NN trained on the 12 covariates; C) percentage share of banana among administrative regions: Northern, Eastern, Central and Western. The share of banana was computed as the count of pixels with banana in each region divided by the total number of pixels with banana in Uganda.
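
The share calculation in panel C amounts to counting banana pixels per region and dividing by the national banana pixel count. A minimal Python sketch follows, using flat lists in place of rasters; the pixel values are invented and `regional_share` is an illustrative name.

```python
from collections import Counter

REGIONS = ("Northern", "Eastern", "Central", "Western")


def regional_share(region_of_pixel, banana_mask):
    """Percentage of all banana pixels that fall in each region."""
    counts = Counter(region
                     for region, has_banana
                     in zip(region_of_pixel, banana_mask)
                     if has_banana)
    total = sum(counts.values())
    return {region: 100 * counts[region] / total for region in REGIONS}


# Toy raster flattened to lists: one region label and one 0/1 flag per pixel.
region_of_pixel = (["Western"] * 4 + ["Central"] * 3
                   + ["Eastern"] * 2 + ["Northern"])
banana_mask = [1, 1, 1, 0,  1, 1, 0,  1, 0,  0]

share = regional_share(region_of_pixel, banana_mask)
print(share)
```

Because the denominator is the number of banana pixels and not the number of pixels in the region, the four shares always sum to 100%.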

Fig 10.

Geographic shifts of banana in Uganda.

A) geographic shift patterns generated by overlaying the historical (1958) and latest (2016) banana distributions; B) percentage distribution of banana geographic shift between administrative regions: Northern, Eastern, Central and Western; C) percentage distribution of banana geographic shift among agroecological zones (1: West Nile Farmlands; 2: Northwestern Farmlands-Wooded-Savanna; 3: Northern Moist Farmlands; 4: Northeastern Central Grass-Bush Farmlands; 5: Northeastern Semi-arid Short Grass Plains; 6: Western Mid-Altitude Farmlands and the Semuliki Flats; 7: Central Wooded Savanna; 8: Southern and Eastern Lake Kyoga Plains; 9: Mount Elgon Farmlands; 10: Western Medium High Farmlands; 11: Southwestern Grass Farmlands; 12: Lake Victoria Crescent and Mbale Farmlands; 13: Ssese Islands and Sango Plains; 14: Southwestern Highlands). The percentages were computed as the number of pixels in each region that correspond to each shift category divided by the total number of pixels in the geographic shift map of Uganda.
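
The overlay in panel A can be thought of as a pixel-by-pixel comparison of two presence/absence maps. The Python sketch below shows one way to do this; the category names ("stable", "contraction", "expansion", "absent") are hypothetical labels chosen for illustration, since the caption does not name the shift classes, and the pixel values are invented.

```python
CATEGORIES = ("stable", "contraction", "expansion", "absent")


def shift_category(had_1958, has_2016):
    """Classify one pixel from its presence in the two maps."""
    if had_1958 and has_2016:
        return "stable"       # banana in both years
    if had_1958:
        return "contraction"  # banana in 1958 only
    if has_2016:
        return "expansion"    # banana in 2016 only
    return "absent"           # banana in neither year


def shift_percentages(map_1958, map_2016):
    """Percentage of all pixels in each shift category."""
    cats = [shift_category(a, b) for a, b in zip(map_1958, map_2016)]
    return {c: 100 * cats.count(c) / len(cats) for c in CATEGORIES}


# Toy maps flattened to lists of 0/1 flags, one per pixel.
map_1958 = [1, 1, 0, 0, 1, 0, 0, 0]
map_2016 = [1, 0, 1, 0, 1, 1, 0, 0]

print(shift_percentages(map_1958, map_2016))
```

Here the denominator is every pixel in the shift map, matching the caption, so the percentages for a single region do not sum to 100 on their own.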

Fig 11.

Classification and regression tree (CART) showing the biophysical factors associated with geographic shift in banana at national level.

Probabilities for each geographic shift class are included within the coloured boxes. The node numbers at which splits occur are shown above the coloured boxes.
