Fig 1.
Building heights in the different geographical areas used in this study.
(A) Height distributions for the state of Brandenburg in Germany, the Netherlands, the region of Friuli-Venezia Giulia in Italy and five French urban areas. These distributions correspond to the final dataset, after removing buildings with a height below 2 m and buildings with a footprint area below 10 m² (see S4 Appendix). (B) Location of the four areas representing Northern and Southern European regions, and urban and rural contexts. (C) Example of building heights in the city of Udine in Italy. This map shows a mid-size city with higher buildings in the historical center and along a main axis, and lower buildings in residential areas.
Table 1.
Summary statistics of the full dataset.
Fig 2.
Illustrations of the urban form features used.
(A) Individual building footprint geometries. The convexity values of the building footprint polygons are displayed in the legend. Convexity ranges between 0 and 1. (B) Block of adjacent buildings. The block in which a building of interest is located is depicted in dark blue. (C) Street-based block, in green, surrounding a building of interest. (D) Buildings within a circular buffer of 50, 200 and 500 m around a building of interest. (E) Streets within a circular buffer of 50, 200 and 500 m around a building of interest. (F) Betweenness centrality distinguishes main streets from secondary streets. Features include, for example, the betweenness centrality of the closest street or the average betweenness centrality within a buffer. (G) Closeness centrality shows where streets converge. Both centrality measures give information on the structure of the city and the relative position of a street in the city street network.
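The caption does not define convexity. A minimal sketch of one common definition, assumed here, is the ratio of the footprint area to the area of its convex hull, which lies between 0 and 1 and equals 1 for a convex footprint. The function names and the example coordinates are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(ring: List[Point]) -> float:
    """Unsigned area of a simple polygon (shoelace formula)."""
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull via Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def convexity(ring: List[Point]) -> float:
    """Assumed definition: footprint area divided by convex hull area."""
    hull_area = polygon_area(convex_hull(ring))
    if hull_area == 0.0:
        return 0.0  # degenerate footprint (all vertices collinear)
    return polygon_area(ring) / hull_area


# Hypothetical footprints, coordinates in meters.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
l_shape = [(0, 0), (10, 0), (10, 5), (5, 5), (5, 10), (0, 10)]
```

Under this assumed definition the square footprint has a convexity of 1, while the L-shaped footprint scores lower because its hull covers area that the building does not.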
Table 2.
Summary of urban form features used in this study.
Fig 3.
Map of the three experiments on the test set of Brandenburg.
(A) Experiment 1: No local data are available, and the model is trained only on data from other countries. (B) Experiment 2: Scarce local data are available: we add 2% of the test set to the training set, to test the hypothesis that these data provide relevant information for training the model. (C) Experiment 3: The main city of the area is available: we add this city to the training set. The areas in blue are the training set and the area in red is the test set.
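The three training configurations can be sketched as a small splitting routine. This is an illustration under stated assumptions, not the study's pipeline: the record layout (dictionaries with a `city` key), the function name `split_for_experiment`, and the choice to remove the added local buildings from the test set are all hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple

Building = Dict[str, object]


def split_for_experiment(
    foreign: List[Building],
    local: List[Building],
    experiment: int,
    local_share: float = 0.02,
    main_city: Optional[str] = None,
    seed: int = 0,
) -> Tuple[List[Building], List[Building]]:
    """Return (training set, test set) for Experiment 1, 2 or 3."""
    if experiment == 1:
        # No local data: train on other countries, test on the whole region.
        return list(foreign), list(local)
    if experiment == 2:
        # Scarce local data: move a random 2% sample of the region to training.
        rng = random.Random(seed)
        k = round(local_share * len(local))
        picked = set(rng.sample(range(len(local)), k))
        added = [b for i, b in enumerate(local) if i in picked]
        test = [b for i, b in enumerate(local) if i not in picked]
        return list(foreign) + added, test
    if experiment == 3:
        # Main city available: move every building of that city to training.
        added = [b for b in local if b["city"] == main_city]
        test = [b for b in local if b["city"] != main_city]
        return list(foreign) + added, test
    raise ValueError("experiment must be 1, 2 or 3")


# Hypothetical toy data: 50 foreign buildings, 100 local ones, 30 of them in city "A".
foreign = [{"id": f"f{i}", "city": "elsewhere"} for i in range(50)]
local = [{"id": f"l{i}", "city": "A" if i < 30 else "B"} for i in range(100)]

train1, test1 = split_for_experiment(foreign, local, experiment=1)
train2, test2 = split_for_experiment(foreign, local, experiment=2)
train3, test3 = split_for_experiment(foreign, local, experiment=3, main_city="A")
```

Fixing the random seed keeps the 2% sample reproducible between runs, which matters when comparing Experiment 2 against the other two configurations.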
Table 3.
Model comparison results.
Fig 4.
Results of the predictions for the test regions, Brandenburg (left) and Berlin (right).
(A–B) Joint plot of predicted values over target values for Experiment 1 (No local data), both in meters, with the marginal distributions as barplots. The intensity of the bins’ color represents the density of data points in the bin. On the thick diagonal grey line, points are perfectly predicted. Points above the line are under-predicted and those below the line are over-predicted. (C–D) Error distribution of different target height ranges, for Experiment 1 in Brandenburg and for Experiment 2 (2% local sample) in Berlin. The shaded areas represent an error range of ±2.5 meters, which roughly corresponds to the height of one floor. (E–F) Error distributions for Experiments 1 and 2. The shaded areas represent an error range of ±2.5 meters.
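One way to read the shaded ±2.5 m band numerically is as the share of buildings whose prediction error falls inside it, computed separately for each target height range. The sketch below assumes the error is defined as predicted minus target height. The bin edges, the function name, and the sample values are hypothetical, not results from the study.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def share_within_band_by_range(
    predicted: Sequence[float],
    target: Sequence[float],
    edges: Sequence[float] = (5.0, 10.0, 20.0),  # hypothetical height bins, in meters
    band: float = 2.5,  # roughly the height of one floor
) -> List[Optional[float]]:
    """Per target-height bin, the fraction of buildings with |error| <= band."""
    counts = [0] * (len(edges) + 1)
    hits = [0] * (len(edges) + 1)
    for p, t in zip(predicted, target):
        i = bisect_right(edges, t)  # index of the bin containing the target height
        counts[i] += 1
        if abs(p - t) <= band:  # assumed error definition: predicted minus target
            hits[i] += 1
    # None marks bins that contain no buildings.
    return [h / c if c else None for h, c in zip(hits, counts)]


# Made-up heights in meters, for illustration only.
predicted = [3.0, 4.0, 9.0, 12.0, 30.0]
target = [3.5, 8.0, 8.0, 25.0, 31.0]
shares = share_within_band_by_range(predicted, target)
```

Reporting the share per height range rather than a single overall figure shows whether errors concentrate in taller buildings, which is the pattern panels C and D are designed to expose.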
Table 4.
Results of the experiments on the test sets.
Fig 5.
Prediction errors in Berlin for Experiment 1 (no local data) and Experiment 2 (2% of local data).
The errors (in meters) are aggregated on a grid for better readability, and depicted by a color gradient. The presence of local data in Experiment 2 starkly reduces the errors and the occurrences of under-prediction, especially in the center of Berlin.
Table 5.
Feature importance.