Modelling urban vibrancy with mobile phone and OpenStreetMap data

doi:10.1371/journal.pone.0252015

Fig 1.

Presence of mobile phone users and density of urban features in Milan.

We retrieve data on the number of mobile phone users in the city of Milan, together with their age group, as derived from Telecom Italia data between March and April 2015. (A) Mobile phone users are provided for cells in a discrete grid superimposed on the city. We depict here all the cells which are included in the dataset for the metropolitan city of Milan. (B) Our analysis focuses on urban environments in densely populated areas. We present here the geographical boundaries of the administrative unit of the metropolitan city of Milan, which we use to select those cells in the dataset which are part of the large urban area of Milan. (C, D) We depict here the average density of users for two different age groups across the area of analysis. Visual inspection suggests that the overall pattern is consistent across the two groups. (E, F) We compare the presence of people across the city with the underlying urban structure of the different cells. To quantify the urban environment, we consider data derived from OpenStreetMap (OSM) and the Italian census. Here, we present (E) the number of road intersections extracted from OSM, and (F) the vertical density of buildings. All maps in this figure are oriented with north at the top.

Table 1.

Metropolitan cities.

Our analysis focuses on seven metropolitan cities across Italy. Here, we report the number of spatial cells of the mobile phone network and the population (in thousands) of each of these cities, split across six age groups. Population data are retrieved from the 2011 Italian census and comprise all the census sections within the phone cells considered for each city. It is important to highlight that each cell of the network can contain several mobile phone users, and thus we cannot estimate the fraction of the census population included in our dataset. Note that the age groups provided by the Italian census do not perfectly match those of the Telecom Italia dataset.

Fig 2.

Relationship between presence of people and geographic features.

In these scatter plots each point represents a cell in each city. The x-axis encodes the presence of all age groups aggregated together, whereas the y-axis represents the values for each feature described in the Data section. All quantities are normalised by the area of the cell. The points are plotted with some transparency so that accumulation of multiple cells with similar values looks slightly darker.

Table 2.

Investigating the relationship between urban features and vibrancy.

We perform an initial correlation analysis, using Kendall’s rank correlation coefficient, between the presence of people and urban features. Here, we present the results when aggregating different cities together, as well as when aggregating different age groups together. All p-values from the correlation tests have been adjusted using the false discovery rate (FDR) correction, and all values are significant (p < 0.05).
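
As a rough illustration of the procedure described in this caption, the following Python sketch computes Kendall's rank correlation between presence and each feature and then adjusts the p-values. It assumes the Benjamini-Hochberg procedure for the false discovery rate adjustment, which the caption does not specify. The function names and the input layout (`presence` as a sequence, `features` as a dict of sequences) are hypothetical.

```python
import numpy as np
from scipy.stats import kendalltau


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank (rank starts at 1).
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest p-value downwards and cap at 1.
    adjusted_sorted = np.minimum(np.minimum.accumulate(scaled[::-1])[::-1], 1.0)
    adjusted = np.empty(n)
    adjusted[order] = adjusted_sorted
    return adjusted


def correlate_features(presence, features):
    """Kendall's tau between presence and each feature, with adjusted p-values.

    Assumption: `features` maps a feature name to per-cell values that are
    aligned with `presence` (both already normalised by cell area).
    """
    names = list(features)
    taus, p_raw = [], []
    for name in names:
        tau, p_value = kendalltau(presence, features[name])
        taus.append(float(tau))
        p_raw.append(float(p_value))
    p_adj = benjamini_hochberg(p_raw)
    return {name: (tau, float(p)) for name, tau, p in zip(names, taus, p_adj)}
```

A result would then be read as significant when its adjusted p-value falls below 0.05.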

Fig 3.

Age group differences in a univariate spatial error model of geographical features.

For each city and each age group we run a univariate spatial regression model where the dependent variable Y is the vibrancy, and the independent variable X is one urban feature at a time. We first run the spatial error model for the aggregate case (all ages) and then subtract the resulting coefficient from that of each age group. Thus, a positive value for a certain age group indicates that a particular urban feature is more strongly related to the vibrancy of that group than in the aggregate case. X and Y axes are the same across all small multiples, and labels are only displayed in the bottom-right subplot. Before running the univariate model, all values have been standardised by subtracting the mean and dividing by the standard deviation. Straight lines along zero indicate no variation across age groups. The error bars are provided by the spatial model result. The grey bands around zero indicate the error of the aggregate value. Milan, Rome and Turin are the cities with the highest number of cells (and the largest populations) and show more significant patterns. An equivalent plot using the spatial lag model can be found in the S1 File.
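
The standardisation and coefficient-difference steps can be sketched as follows. This is only an illustration under two loud assumptions: it uses a plain ordinary least squares slope as a stand-in for the spatial error model (so it ignores spatial autocorrelation entirely), and it takes the aggregate case to be the sum of the age-group series. All function names are hypothetical.

```python
import numpy as np


def standardise(values):
    """Subtract the mean and divide by the standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()


def univariate_slope(x, y):
    """Slope of y on x after standardising both.

    Assumption: plain OLS, NOT the spatial error model used in the paper.
    """
    zx, zy = standardise(x), standardise(y)
    # For centred variables the OLS slope is sum(zx * zy) / sum(zx * zx).
    return float(np.dot(zx, zy) / np.dot(zx, zx))


def coefficient_differences(feature, vibrancy_by_age):
    """Coefficient of each age group minus the aggregate (all ages) coefficient.

    Assumption: the aggregate vibrancy is the cell-wise sum over age groups.
    """
    series = [np.asarray(v, dtype=float) for v in vibrancy_by_age.values()]
    aggregate = np.sum(series, axis=0)
    baseline = univariate_slope(feature, aggregate)
    return {
        age: univariate_slope(feature, values) - baseline
        for age, values in vibrancy_by_age.items()
    }
```

A positive entry in the returned dict corresponds to a point above zero in the plot, that is, a feature more strongly related to that age group than to the aggregate.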

Table 3.

Investigating the spatial relationship between urban features and vibrancy.

We model the relationship between the presence of people and urban features by constructing a series of multivariate spatial models to account for spatial correlations present in the data. Here, we present the results of these models when considering all cities together. The table shows the values of the model coefficients; values are reported in italic and small font size if they are not significant after FDR correction (p > 0.05). The last column reports the Nagelkerke pseudo R² value, which measures how well the model fits compared to a null model and has a similar interpretation to a traditional regression R². Results for the individual cities are presented in S1 and S2 Tables in S1 File.
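
For readers unfamiliar with the pseudo R² mentioned here, the sketch below shows one common likelihood-based form, 1 - exp(-(2/n)(LL_model - LL_null)), which is often reported for spatial regression models fitted by maximum likelihood. Whether the authors used exactly this variant is an assumption, and the function and parameter names are hypothetical.

```python
import math


def nagelkerke_pseudo_r2(loglik_model, loglik_null, n_obs):
    """Likelihood-based pseudo R squared relative to a null model.

    Assumption: the unscaled form 1 - exp(-(2 / n) * (LL_model - LL_null)),
    where the null model is typically an intercept-only model.
    """
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return 1.0 - math.exp(-2.0 / n_obs * (loglik_model - loglik_null))
```

A model whose log-likelihood equals that of the null model scores 0, and the value approaches 1 as the fitted log-likelihood grows relative to the null.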

Table 4.

Investigating the role of third places on urban vibrancy.

Further to the analysis above, we investigate in more detail the relationship between each category of third places and urban vibrancy. As before, we construct a range of multivariate spatial models and present the results in this table. The table shows the values of the model coefficients; values are reported in italic and small font size if they are not significant after FDR correction (p > 0.05). The last column reports the Nagelkerke pseudo R² value.

Fig 4.

Age group differences in a univariate spatial error model of third places.

For each city and each age group we run a univariate spatial error model where the dependent variable Y is the vibrancy, and the independent variable X is one category of third places at a time. As in Fig 3, we first run the same model for the aggregate case (all ages) and subtract the resulting coefficient from that of each age group. X and Y axes are the same across all small multiples, and labels are only displayed in the bottom-right subplot. All values have been standardised by subtracting the mean and dividing by the standard deviation before running the univariate model. Straight lines along zero indicate no variation across age groups. The error bars are provided by the spatial model result. The grey bands around zero indicate the error of the aggregate value. Milan, Rome and Turin are the cities with the highest number of cells (and the largest populations) and show more significant patterns. An equivalent plot using the spatial lag model can be found in the S1 File.
