Empiric recommendations for population disaggregation under different data scenarios

doi:10.1371/journal.pone.0274504

Fig 1.

Population density for the three geographic levels (population per hectare).

The urban sectors (L2) are used as source zones (SZ), and the urban sections (L1) and census urban blocks (L0) are the validation zones (VZ). At the bottom, the histogram of population density for each level is shown, with the total number of spatial units (n) and the mean population density marked by a dashed line.

Fig 2.

Geospatial covariates.

Four urban masks and datasets issued from remotely sensed imagery and open geodatabases. The zoomed areas correspond to the same location (6°11’35.6"N, 75°34’17.7"W).

Fig 3.

Basic outline of the dasymetric, statistical and combined methods for population disaggregation.

Dasymetric: the population from the source zones (SZ) is distributed to the urban masks at medium, high and very high resolutions (MR, HR, VHR) and to the grid target zones (TZ). The result is validated twice, using the validation zones (VZ) at levels 1 and 0 (L1, L0). Statistical: the population from the SZ is used as the dependent variable to predict the population density at grid level by means of several covariates. The prediction is used as a weight (W_p) to distribute the population into the grid TZ, which is validated using the VZ. Combined: the distribution from SZ to TZ using the weight layer is constrained by the urban masks.
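The three schemes outlined in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `disaggregate`, the toy arrays and the proportional-allocation rule are assumptions made for illustration only. Each source-zone total is shared among its grid cells in proportion to a weight, where the weight is a binary urban mask (dasymetric), a predicted density W_p (statistical), or their product (combined).

```python
import numpy as np

def disaggregate(source_pop, cell_zone, weights, mask=None):
    """Share each source-zone total among its cells in proportion to `weights`.

    Hypothetical helper: `cell_zone[i]` is the index of the source zone that
    contains grid cell i; `mask` (optional) zeroes the weight outside urban cells.
    """
    w = np.asarray(weights, dtype=float)
    if mask is not None:
        w = w * np.asarray(mask, dtype=float)
    cell_zone = np.asarray(cell_zone)
    source_pop = np.asarray(source_pop, dtype=float)
    # Total weight per source zone, then each cell's share of its zone.
    zone_w = np.bincount(cell_zone, weights=w, minlength=len(source_pop))
    denom = zone_w[cell_zone]
    share = np.divide(w, denom, out=np.zeros_like(w), where=denom > 0)
    return share * source_pop[cell_zone]

# Toy data (assumed): two source zones with three grid cells each.
source_pop = np.array([100.0, 60.0])
cell_zone = np.array([0, 0, 0, 1, 1, 1])
urban_mask = np.array([1, 1, 0, 1, 0, 1])            # 1 = built-up cell
w_p = np.array([3.0, 1.0, 4.0, 1.0, 2.0, 2.0])       # predicted density weight

binary = disaggregate(source_pop, cell_zone, np.ones(6), mask=urban_mask)
statistical = disaggregate(source_pop, cell_zone, w_p)
combined = disaggregate(source_pop, cell_zone, w_p, mask=urban_mask)

print(binary)       # [50. 50.  0. 30.  0. 30.]
print(statistical)  # [37.5 12.5 50.  12.  24.  24. ]
print(combined)     # [75. 25.  0. 20.  0. 40.]
```

In this sketch the cell values always sum back to the source-zone totals, provided a zone has at least one cell with non-zero weight. Only their allocation within each zone changes between the three schemes.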

Fig 4.

Experimental design.

Combination of urban masks, dimensions and covariates to simulate scenarios with poor- (E1, E14-E16), average- (E2, E7-8, E13, E17-E20) and rich-data (E3-E6, E9-E13, E21-E36) for population count disaggregation, by means of binary and categorical dasymetric, statistical and combined methods. The cross (x) indicates the data available for each experiment.

Table 1.

Relative accuracy statistics for the validation of the population disaggregation methods.

Table 2.

Accuracy metrics for the population disaggregation at medium resolution.

Table 3.

Accuracy metrics for the population disaggregation at high resolution.

Table 4.

Accuracy metrics for the population disaggregation at very high resolution.

Table 5.

Accuracy metrics for the population disaggregation at very high resolution with land use information.

Fig 5.

Violin plots with the distribution and density of absolute percentage errors (APE).

Errors are grouped by land use percentage and by experiment, for the L0 validation using the urban blocks. Red shows errors for mostly residential blocks (>80% of the land use is residential; the rest can be commercial and/or other; n = 9,236). Blue and green show errors for mostly commercial (n = 307) and mostly other (n = 494) land use blocks, respectively. Purple shows errors for urban blocks without a predominant land use (n = 2,050). The dot marks the median APE per land use group, while the number in brackets on the x-axis reports its mean.
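The grouping described in this caption can be sketched in Python as follows. This is illustrative only: the APE formula used here (absolute difference divided by the observed population, times 100) is the common definition and is assumed rather than taken from the article, the 80% threshold is applied to every land use class although the caption states it only for residential blocks, and the helper names and toy values are hypothetical.

```python
import numpy as np

def ape(estimated, observed):
    """Absolute percentage error per block (assumes observed > 0)."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    return np.abs(est - obs) / obs * 100.0

def dominant_land_use(shares, names, threshold=0.8):
    """Label each block by its predominant land use, or 'mixed' if none exceeds the threshold."""
    shares = np.asarray(shares, dtype=float)
    labels = np.array(names, dtype=object)[shares.argmax(axis=1)]
    labels[shares.max(axis=1) <= threshold] = "mixed"
    return labels

# Toy validation blocks (assumed values, not data from the study).
observed = [100, 200, 50, 80]
estimated = [110, 150, 50, 100]
shares = [[0.90, 0.10, 0.00],   # residential, commercial, other
          [0.85, 0.05, 0.10],
          [0.10, 0.90, 0.00],
          [0.50, 0.30, 0.20]]

errors = ape(estimated, observed)
labels = dominant_land_use(shares, ["residential", "commercial", "other"])

# Median and mean APE per land use group, as reported in the violin plots.
summary = {
    group: (float(np.median(errors[labels == group])),
            float(np.mean(errors[labels == group])))
    for group in sorted(set(labels))
}
print(summary)
# {'commercial': (0.0, 0.0), 'mixed': (25.0, 25.0), 'residential': (17.5, 17.5)}
```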

Fig 6.

Spatial distribution of absolute percentage errors (APE) for L1 (left) and L0 (right) for the best performing experiment (E6).

Yellowish colours indicate low errors in population disaggregation, while dark blue and red show high underestimation and overestimation of population, respectively. The boundaries of the L2 source zones (SZ) used in the disaggregation are shown in bold on top of both maps.

Fig 7.

Scatterplots comparing the results of the best population grid (P_L0, SZ L0, y-axis) against the estimated population map from all experiments (P_e, SZ L2, x-axis) with the accuracy metrics.

On the right, the maps show the spatial distribution of residuals (P_L0 – P_e) for the best P_e grid in each data availability scenario (highlighted in blue). Bright blue indicates an underestimation of population per cell, while red indicates overestimation. Light colours indicate low errors.
