Fig 1.
Population density for the three geographic levels (population per hectare).
The urban sectors (L2) are used as source zones (SZ), and the urban sections (L1) and census urban blocks (L0) are the validation zones (VZ). At the bottom, the histogram of population density for each level is shown with the total number of spatial units (n) and the mean population density marked by a dashed line.
Fig 2.
Four urban masks and datasets derived from remotely sensed imagery and open geodatabases. The zoomed areas correspond to the same location (6°11′35.6″N, 75°34′17.7″W).
Fig 3.
Basic outline of the dasymetric, statistical and combined methods for population disaggregation.
Dasymetric: the population of the source zones (SZ) is distributed to the urban masks at medium, high and very high resolution (MR, HR, VHR) and to the grid target zones (TZ). The result is validated twice, using the validation zones (VZ) at levels 1 and 0 (L1, L0). Statistical: the population of the SZ is used as the dependent variable to predict the population density at grid level from several covariates. The prediction is used as a weight (Wp) to distribute the population into the grid TZ, and the result is validated using the VZ. Combined: the distribution from SZ to TZ using the weight layer is constrained by the urban masks.
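The three schemes outlined in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `disaggregate`, the toy arrays and the flat per-cell layout are all assumptions made for illustration, and the weight layer Wp is taken as given rather than predicted from covariates.

```python
import numpy as np

def disaggregate(sz_pop, cell_zone, weights=None, mask=None):
    """Distribute source-zone (SZ) totals to grid cells (TZ).

    sz_pop    : population count per source zone, shape (n_zones,)
    cell_zone : index of the source zone containing each cell, shape (n_cells,)
    weights   : optional per-cell weight (stand-in for Wp); defaults to 1
    mask      : optional boolean urban mask; cells outside it receive nothing
    """
    sz_pop = np.asarray(sz_pop, dtype=float)
    cell_zone = np.asarray(cell_zone)
    if weights is None:
        w = np.ones(cell_zone.shape, dtype=float)
    else:
        w = np.asarray(weights, dtype=float).copy()
    if mask is not None:
        # Constrain the distribution to cells inside the urban mask.
        w = np.where(np.asarray(mask, dtype=bool), w, 0.0)
    # Sum of weights per source zone, then each cell's share of its zone.
    zone_sum = np.bincount(cell_zone, weights=w, minlength=sz_pop.size)
    zone_total = zone_sum[cell_zone]
    share = np.divide(w, zone_total, out=np.zeros_like(w), where=zone_total > 0)
    return sz_pop[cell_zone] * share

# Hypothetical toy data: two source zones, six grid cells.
sz_pop = [100, 60]
cell_zone = [0, 0, 0, 0, 1, 1]
urban = [True, True, False, False, True, False]
wp = [3, 1, 2, 2, 1, 1]

dasymetric = disaggregate(sz_pop, cell_zone, mask=urban)              # binary mask only
statistical = disaggregate(sz_pop, cell_zone, weights=wp)             # weight layer only
combined = disaggregate(sz_pop, cell_zone, weights=wp, mask=urban)    # weights within mask
```

In all three calls the cell values of a zone add up to that zone's total, provided the zone keeps at least one cell with a positive weight.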
Fig 4.
Combination of urban masks, dimensions and covariates used to simulate poor-data (E1, E14-E16), average-data (E2, E7-E8, E13, E17-E20) and rich-data (E3-E6, E9-E13, E21-E36) scenarios for population count disaggregation by means of binary and categorical dasymetric, statistical and combined methods. A cross (x) indicates the data available for each experiment.
Table 1.
Relative accuracy statistics for the validation of the population disaggregation methods.
Table 2.
Accuracy metrics for the population disaggregation at medium resolution.
Table 3.
Accuracy metrics for the population disaggregation at high resolution.
Table 4.
Accuracy metrics for the population disaggregation at very high resolution.
Table 5.
Accuracy metrics for the population disaggregation at very high resolution with land use information.
Fig 5.
Violin plots with the distribution and density of absolute percentage errors (APE).
Errors are grouped by land use percentage and by experiment, for the L0 validation using the urban blocks. Red shows errors for mostly residential blocks (>80% of the land use is residential; the rest can be commercial and/or other; n = 9,236). Blue and green show errors for urban blocks with mostly commercial (n = 307) and mostly other (n = 494) land uses, respectively. Purple shows errors for urban blocks without a predominant land use (n = 2,050). The dot marks the median APE per land use group, and the number in brackets on the x-axis is the mean APE.
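The grouping behind these violin plots can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the helper names, the toy values and the use of the same 80% threshold for the commercial and other groups are assumptions, and blocks are assumed to have a positive observed population so that the APE is defined.

```python
import numpy as np

def ape(observed, estimated):
    """Absolute percentage error per block (observed population must be > 0)."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return np.abs(estimated - observed) / observed * 100.0

def predominant_use(shares, names=("residential", "commercial", "other"),
                    threshold=0.8):
    """Label each block by the land use exceeding the threshold, else 'mixed'."""
    shares = np.asarray(shares, dtype=float)
    top = shares.argmax(axis=1)
    dominant = shares.max(axis=1) > threshold
    return np.where(dominant, np.array(names)[top], "mixed")

# Hypothetical blocks: observed vs. estimated population and land use shares.
observed = [100, 200, 50, 80]
estimated = [110, 150, 60, 80]
shares = [[0.90, 0.05, 0.05],
          [0.10, 0.85, 0.05],
          [0.40, 0.40, 0.20],
          [0.95, 0.00, 0.05]]

errors = ape(observed, estimated)
groups = predominant_use(shares)

# Median (the dot) and mean (the bracketed number) of the APE per group.
summary = {g: (float(np.median(errors[groups == g])),
               float(np.mean(errors[groups == g])))
           for g in np.unique(groups)}
```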
Fig 6.
Spatial distribution of absolute percentage errors (APE) for L1 (left) and L0 (right) for the best-performing experiment (E6). Yellowish colours indicate low errors in population disaggregation, while dark blue and red show high underestimation and overestimation of population, respectively. The boundaries of the L2 source zones (SZ) used in the disaggregation are drawn in bold on top of both maps.
Fig 7.
Scatterplots comparing the results of the best population grid (PL0, SZ L0, y-axis) against the estimated population map from all experiments (Pe, SZ L2, x-axis) with the accuracy metrics.
On the right, the maps show the spatial distribution of the residuals (PL0 − Pe) for the best Pe grid in each data availability scenario (highlighted in blue). Bright blue colours indicate underestimation of the population per cell, while red colours show overestimation. Light colours indicate low errors.
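The sign convention of the residual maps can be checked with a short sketch. The caption does not name the accuracy metrics shown in the scatterplots, so the RMSE and MAE computed here are assumptions, as are the function name and the toy cell values.

```python
import numpy as np

def residual_summary(p_l0, p_e):
    """Per-cell residuals (PL0 - Pe) with their sign interpretation."""
    p_l0 = np.asarray(p_l0, dtype=float)
    p_e = np.asarray(p_e, dtype=float)
    res = p_l0 - p_e
    return {
        "residuals": res,
        "underestimated": res > 0,   # Pe below the reference grid
        "overestimated": res < 0,    # Pe above the reference grid
        "rmse": float(np.sqrt(np.mean(res ** 2))),  # assumed metric
        "mae": float(np.mean(np.abs(res))),         # assumed metric
    }

# Hypothetical cell values for the reference grid (PL0) and one estimate (Pe).
p_l0 = [12.0, 0.0, 30.0, 8.0]
p_e = [10.0, 3.0, 30.0, 12.0]
summary = residual_summary(p_l0, p_e)
```

A positive residual means the estimated grid assigns fewer people to the cell than the reference grid, which matches the underestimation shown in blue.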