Deep learning with satellite images enables high-resolution income estimation: A case study of Buenos Aires | PLOS One

Fig 1.

Study area in the metropolitan area of Buenos Aires, Argentina.
The map displays population density (per km²) from the 2010 National Census. The black outline delineates the boundary of the satellite imagery used in the analysis, which is focused on the densely populated urban core, including the Autonomous City of Buenos Aires and its 24 surrounding municipalities (dashed lines).

Table 1.

Comparison of our model with other strategies.

Fig 2.

Overview of the estimation strategy by sections of the paper.
This flowchart outlines the three-stage methodology used in the paper. (Sect 4) Building the Dataset: This stage involves creating the training data by combining 2010 census and household survey data through a small area estimation technique to generate income labels for census tracts, which are then paired with high-resolution satellite imagery from 2013. (Sect 5) Training a Neural Network to Predict Income: A Convolutional Neural Network is trained on this dataset to learn the relationship between visual features in the images and per capita income. (Sect 6) Results: The trained model is used to predict income for a high-resolution 50x50 meter grid across the entire study area for multiple years (2013, 2018, 2022), producing detailed income maps.

Table 2.

Summary of datasets used in the study.

Fig 3.

Spatial distribution of per capita average income estimated for Buenos Aires at census tract level.
This map of the Buenos Aires Metropolitan Area shows the spatial distribution of the estimated per capita income used as the “ground truth” for training the model. Each colored polygon represents a census tract, with its color corresponding to the decile of average household income for 2010. These data were generated using a small area estimation method that combines microdata from the 2010 National Census and the Permanent Household Survey.

Table 3.

Model selection: configurations and mean squared error on the test set.

Fig 4.

Multi-resolution approach for generating the stack of images used as input for our model.
The lower-resolution images provide context for the prediction on the 50x50m area. This diagram illustrates the multi-resolution image stacking approach used as input for the CNN model. To capture both detailed local features and broader contextual information, a high-resolution satellite image with dimensions of 50x50 meters is combined with a lower-resolution image of 200x200 meters from the same central area. This stacked input significantly enhances prediction accuracy and generates more spatially coherent income maps. The images displayed are aerial photographs from the Instituto Geográfico Nacional de la República Argentina (2025), provided for illustrative purposes, as the underlying satellite data is subject to licensing restrictions.
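
The stacking idea in this caption can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the raster is assumed to be a NumPy array of shape (rows, columns, bands) at the native resolution (128 px per 50 m), the 200x200 m context window is assumed to be reduced to 128x128 px by block averaging, and the two images are assumed to be joined along the band axis. The function names are hypothetical.

```python
import numpy as np

TILE_PX = 128        # 50 m tile at native resolution (128x128 px, per the captions)
CONTEXT_FACTOR = 4   # 200 m context window / 50 m tile


def center_crop(raster, row, col, size):
    """Return a size x size window centered on (row, col).

    Assumption: the caller guarantees the window fits inside the raster.
    """
    half = size // 2
    return raster[row - half:row + half, col - half:col + half, :]


def block_mean(img, factor):
    """Downsample by averaging non-overlapping factor x factor pixel blocks."""
    h, w, c = img.shape
    blocks = img.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))


def multires_stack(raster, row, col):
    """Stack the 50 m tile and the downsampled 200 m context along the band axis."""
    local = center_crop(raster, row, col, TILE_PX).astype(np.float32)
    context = center_crop(raster, row, col, TILE_PX * CONTEXT_FACTOR)
    context = block_mean(context.astype(np.float32), CONTEXT_FACTOR)
    return np.concatenate([local, context], axis=-1)
```

With a hypothetical four-band raster, `multires_stack(raster, row, col)` returns a 128x128x8 array: the first four bands hold the local tile and the last four hold the context window centered on the same point.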

Fig 5.

Example of images generated from census tract data, 128x128px (50x50 meters).
This figure demonstrates the process of generating image samples for model training. On the left, two census tracts with different average income levels are shown. On the right, several 50x50 meter square images are depicted, which are randomly sampled from within these tracts. Each of these smaller images is assigned the average per capita income of its respective census tract, which serves as the training label. The images shown are aerial imagery (from Instituto Geográfico Nacional de la República Argentina, 2025) for illustrative purposes as the underlying satellite data has licensing restrictions.
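
A minimal sketch of this sampling step is shown below. It assumes tract polygons are given as vertex lists in projected coordinates (meters) and that a tile is accepted when its center falls inside the tract; the caption does not say whether the whole tile must lie inside. The functions `point_in_polygon` and `sample_tiles` are hypothetical names.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in meters."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that a horizontal ray from (x, y) crosses to the right.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_tiles(polygon, tract_income, n_tiles, tile_m=50.0, seed=0):
    """Draw tile centers uniformly inside the tract by rejection sampling.

    Every tile inherits the tract's average per capita income as its label.
    """
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    half = tile_m / 2
    tiles = []
    while len(tiles) < n_tiles:
        cx = rng.uniform(min(xs), max(xs))
        cy = rng.uniform(min(ys), max(ys))
        if point_in_polygon(cx, cy, polygon):
            bounds = (cx - half, cy - half, cx + half, cy + half)
            tiles.append({"bounds": bounds, "label": tract_income})
    return tiles
```

Because the label is shared by all tiles of a tract, within-tract variation in the images is not matched by variation in the labels; the model only sees tract-level averages during training.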

Fig 6.

Assignment of census tracts to training, validation and test sets.
This map of Buenos Aires illustrates how census tracts were partitioned to prevent data leakage and ensure a robust model evaluation. Two vertical geographic strips (green) were designated as the test set. The remaining tracts were randomly divided into a training set (orange, 70% of the total) and a validation set (purple, 5%). This spatial separation guarantees that the model is tested on geographically distinct areas it has not seen during training. Background satellite imagery comes from USGS/NASA Landsat.
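
One way such a partition could be built is sketched here. The strip boundaries, the use of tract centroids, and the function name `spatial_split` are assumptions for illustration; only the shares (70% training, 5% validation, with the strips as the test set) come from the caption.

```python
import random


def spatial_split(centroids, test_strips, val_share_of_rest=5 / 75, seed=0):
    """Split tracts into train, validation and test sets.

    centroids: {tract_id: x_coordinate of the tract centroid}
    test_strips: list of (x_min, x_max) ranges; tracts whose centroid falls in
        any range go to the test set (hypothetical strip definition).
    val_share_of_rest: validation share among the non-test tracts; 5/75 gives
        5% of the total when the strips hold 25% of all tracts.
    """
    test = [t for t, x in centroids.items()
            if any(lo <= x <= hi for lo, hi in test_strips)]
    test_ids = set(test)
    rest = sorted(t for t in centroids if t not in test_ids)
    random.Random(seed).shuffle(rest)
    n_val = round(len(rest) * val_share_of_rest)
    return {"train": rest[n_val:], "val": rest[:n_val], "test": test}
```

Only the train and validation tracts are assigned at random; the test tracts are chosen purely by location, which is what keeps them geographically separate from the training data.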

Fig 7.

Construction of a grid of 128x128px (50x50m) images to calculate the average income prediction in a census tract.
This figure illustrates the procedure used for model evaluation. A census tract (outlined in green) is systematically covered by a grid of 50x50 meter cells (orange). The model makes an income prediction for each cell, and the average of all these cell-level predictions is then calculated. This average represents the model’s overall income estimate for the tract and is compared against the tract’s known “ground truth” income to measure performance. The images shown are aerial imagery (from Instituto Geográfico Nacional de la República Argentina, 2025) for illustrative purposes as the underlying satellite data has licensing restrictions.
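
The evaluation procedure can be sketched as below, assuming projected coordinates in meters and that a cell counts toward the tract when its center lies inside it. Both `inside` (a point-in-tract test) and `predict` (the trained model applied to the cell centered at a point) are hypothetical callables supplied by the caller.

```python
def grid_cells(bounds, cell_m=50.0):
    """Yield (x, y) centers of a regular grid covering a bounding box."""
    min_x, min_y, max_x, max_y = bounds
    y = min_y + cell_m / 2
    while y < max_y:
        x = min_x + cell_m / 2
        while x < max_x:
            yield (x, y)
            x += cell_m
        y += cell_m


def tract_prediction(bounds, inside, predict, cell_m=50.0):
    """Average the cell-level predictions whose centers fall inside the tract."""
    preds = [predict(x, y) for x, y in grid_cells(bounds, cell_m) if inside(x, y)]
    if not preds:
        raise ValueError("no grid cell center falls inside the tract")
    return sum(preds) / len(preds)
```

The returned average is the value that would be compared with the tract's small-area income estimate.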

Fig 8.

Spatial per capita income estimates of Buenos Aires.
This figure highlights the primary contribution of the study by comparing the model’s output with the input data. The map on the left shows the per capita income for Buenos Aires in 2013, as predicted by the model on a high-resolution 50x50 meter grid. The map on the right shows the lower-resolution 2010 “ground truth” income data at the census tract level, which was used for training. The comparison visually demonstrates the model’s ability to create a significantly more granular and detailed income map.

Fig 9.

Comparing predicted and observed income by census tract in the test set.
This scatter plot provides a quantitative evaluation of the model’s accuracy on the test set. Each point represents a census tract, plotting the model’s average predicted income (y-axis) against the “observed” income from small area estimations (x-axis), with both values being standardized logarithms. The strong linear relationship, with a coefficient of determination (R²) of 0.878, demonstrates the model’s high predictive power at the census tract level.
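
For reference, a coefficient of determination on standardized log incomes could be computed as in the sketch below. It assumes R² is defined as one minus the ratio of residual to total sum of squares; the caption does not state which definition was used, nor which mean and standard deviation were used for standardization, so `standardize_log` shows just one possible choice.

```python
import math


def standardize_log(values):
    """Take natural logs, then subtract the mean and divide by the std. dev."""
    logs = [math.log(v) for v in values]
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))
    return [(v - mean) / std for v in logs]


def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, with both inputs on the same scale."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

Under this definition, a perfect prediction gives 1 and a prediction equal to the mean of the observed values gives 0.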

Fig 10.

Case studies taken from the test set.
This figure presents several case studies from the test set to qualitatively evaluate the model’s performance in complex urban environments. Each case consists of a satellite image (left) and the corresponding high-resolution predicted income map (right). The examples showcase the model’s ability to accurately differentiate income levels, such as identifying low-income informal settlements adjacent to formal housing and correctly mapping income disparities, even within a single, heterogeneous census tract. The images shown are aerial imagery (from Instituto Geográfico Nacional de la República Argentina, taken between 2013 and 2022) for illustrative purposes as the underlying satellite data has licensing restrictions.

Table 4.

Performance on predicting economic indicators.

Fig 11.

Comparing indicators at county level.
Note: each dot represents a county. This figure contains a series of scatter plots comparing various economic indicators (e.g., mean income, median income, Gini coefficient) at the aggregated county (municipality) level. For each plot, the x-axis represents the indicator calculated from the 2010 census-based estimates, while the y-axis shows the same indicator calculated from the model’s 2013 high-resolution predictions. The plots demonstrate that the model’s predictions strongly correlate with the census-based data, correctly capturing the relative ranking and trends across municipalities.
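
County-level indicators of this kind can be computed from unit-level incomes as in the sketch below. It assumes unweighted observations (the paper may weight cells or tracts by population) and uses the standard sorted-rank formula for the Gini coefficient; the function names and the record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def gini(incomes):
    """Unweighted Gini coefficient: 0 is perfect equality."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("gini needs at least one positive income")
    # Sum of rank * income over the sorted incomes, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def county_indicators(records):
    """records: iterable of (county, income) pairs, e.g. one pair per grid cell."""
    by_county = defaultdict(list)
    for county, income in records:
        by_county[county].append(income)
    return {
        county: {"mean": mean(v), "median": median(v), "gini": gini(v)}
        for county, v in by_county.items()
    }
```

Running `county_indicators` once on the census-based estimates and once on the model predictions yields the paired values plotted on the x- and y-axes.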

Fig 12.

Per capita income predictions over time.
This figure presents three predicted per capita income maps for the Buenos Aires Metropolitan Area for the years 2013, 2018, and 2022. All three maps were generated using the same trained model, applied to satellite imagery from each respective year. The results demonstrate the model’s ability to produce temporally consistent spatial income distributions, allowing for the analysis of urban socioeconomic patterns over time. The color scale represents income deciles based on the 2010 Census.
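
Keeping one color scale across years amounts to fixing the decile cut points on the reference year and reusing them for every later map. A minimal sketch, assuming unweighted cut points computed with Python's `statistics.quantiles` (the paper's exact decile definition is not given in the caption):

```python
from bisect import bisect_right
from statistics import quantiles


def decile_cuts(reference_incomes):
    """Nine cut points that split the reference-year incomes into deciles."""
    return quantiles(reference_incomes, n=10)


def decile_of(value, cuts):
    """Decile (1 to 10) of a value under fixed reference cut points."""
    return bisect_right(cuts, value) + 1
```

Because the cut points are never recomputed, a cell that moves from decile 4 in 2013 to decile 7 in 2022 reflects a change in predicted income relative to the 2010 distribution, not a shift of the scale itself.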

Fig 13.

Detecting development over time in Bella Vista, San Miguel.
This case study from the test set shows the model tracking socioeconomic change from urban development in Bella Vista, Buenos Aires. The top row displays a 2013 aerial image and the corresponding income prediction. The bottom row visualizes new construction by 2018, and the model correctly identifies the developed area as high-income. The heatmap color scale corresponds to the 2010 Census per capita income deciles. The 2013 aerial photo (from Instituto Geográfico Nacional de la República Argentina, 2025) and 2018 building layer (from the Open Buildings Dataset [43]) are shown for illustration, as the original satellite imagery used for prediction is not licensed for publication.

Fig 14.

Growth rates of income indicators over time.
This line chart compares the temporal change of various economic indicators as measured by two sources: official household survey data and the model’s satellite-based predictions. For each indicator (e.g., mean income, Gini, poverty rates), the chart shows the percentage change between years. The results indicate that the model’s predictions successfully capture the direction of change (the trend) for most indicators, suggesting its utility for tracking socioeconomic evolution.
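
The comparison described here reduces to two small computations: the percentage change of each indicator between consecutive years, and a check of whether the survey and the model agree on the sign of that change. The sketch below uses hypothetical function names and treats each indicator as a `{year: value}` mapping.

```python
def pct_change(series):
    """series: {year: value}; returns {(year0, year1): percent change}."""
    years = sorted(series)
    return {
        (a, b): 100 * (series[b] - series[a]) / series[a]
        for a, b in zip(years, years[1:])
    }


def same_direction(survey, model):
    """For each period, True when both sources show growth or both do not."""
    s, m = pct_change(survey), pct_change(model)
    return {period: (s[period] > 0) == (m[period] > 0) for period in s if period in m}
```

Applied to one indicator at a time, `same_direction` gives a per-period agreement flag; the magnitudes returned by `pct_change` are what the line chart displays.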