The Collaborative Image of The City: Mapping the Inequality of Urban Perception

A traveler visiting Rio, Manila or Caracas does not need a report to learn that these cities are unequal; she can see it directly from the taxicab window. This is because in most cities inequality is conspicuous, but also because cities express different forms of inequality that are evident to casual observers. Cities are highly heterogeneous and often unequal with respect to the income of their residents, but also with respect to the cleanliness of their neighborhoods, the beauty of their architecture, and the liveliness of their streets, among many other evaluative dimensions. Until now, however, our ability to understand the effect of a city's built environment on social and economic outcomes has been limited by the lack of quantitative data on urban perception. Here, we build on the intuition that inequality is partly conspicuous to create quantitative measures of a city's contrasts. Using thousands of geo-tagged images, we measure the perception of safety, class and uniqueness in the cities of Boston and New York in the United States, and Linz and Salzburg in Austria, finding that the range of perceptions elicited by the images of New York and Boston is larger than the range of perceptions elicited by images from Linz and Salzburg. We interpret this as evidence that the cityscapes of Boston and New York are more contrasting, or unequal, than those of Linz and Salzburg. Finally, we validate our measures by exploring the connection between them and homicides, finding a significant correlation between the perceptions of safety and class and the number of homicides in a NYC zip code, after controlling for the effects of income, population, area and age. Our results show that online images can be used to create reproducible quantitative measures of urban perception and characterize the inequality of different cities.


Introduction
In ''The Image of The City'', Kevin Lynch defines the city as a form of temporal art [1]. Much like sculptures, cities are spatial structures, but unlike sculptures, cities are too large to be experienced in a single try. Hence, people experience cities through unique temporal sequences that are reversed, interrupted and cut-across from the sequences experienced by others. Ultimately, in a world in which people's experiences of urban environments are unique, this uniqueness can give rise to an alternative form of inequality, where differences in the experiences elicited by different neighborhoods, rather than income, become an important source of interpersonal contrast.
Neighborhoods often differ in their demographics, such as the income and ethnicity of the people that inhabit them, but also in how safe they feel, how clean they are, how historical they look, and how lively they are, among many other evaluative dimensions [2]. Certainly, many of these dimensions will correlate with measures of income, but income will not necessarily be a complete proxy for all of them. Because of this, it is important to create measures of cities, and their neighborhoods, that incorporate the evaluative aspects of cities that income-based measures are unable to fully capture.
In this paper, we present a high-throughput method to quantify people's perception of cities, and their neighborhoods, and use it to measure the perceptual inequality of Boston, New York, Linz and Salzburg. The method is based on image ratings created from the pairwise comparison of images in response to evaluative questions, such as ''Which place looks safer?'' or ''Which place looks more upper-class?'' The data show that the range of perceptions elicited by images from Boston and NYC is wider than the range of perceptions elicited by the images of Linz and Salzburg. Finally, we validate our measures of urban perception by studying the correlation between urban perception and homicides in New York City, finding a significant correlation between violent crime and urban perception after controlling for income, population, area and age.
We conclude that the method presented in the paper is able to capture information about a city's built environment that is relevant for the experiences of citizens, and not fully contained in income-based measures. Moreover, we conclude that these measures can be used to estimate the contrasts, or inequality, of a city's built environment with respect to these evaluative dimensions.

A tale of two literatures
Cities, and their neighborhoods, are complex entities that weave together the physical components of the built environment, and the social interactions of the citizens that inhabit them. Yet, the study of cities does not belong to a unified stream of literature, but largely to two parallel branches. On the one hand, we have the literature advanced by urban planners and architects, and on the other, we have the literature advanced by social scientists and natural scientists.
The literature advanced by architects and urban planners puts special emphasis on a city's built environment. During the 20th century, the development of this literature was punctuated by a series of movements, which have resulted in cities combining different architectural and planning styles [3]. Among the most notable of these movements are: the City Beautiful or Civic Art movement of Charles Mulford Robinson [4], which emphasizes the aesthetic aspects of a city's built environment (think of New York's Grand Central Station); the Garden City of Ebenezer Howard [5], which proposed a mixture of low density housing and parks, much like many modern suburbs; and the Radiant City of Le Corbusier [3,6], which reconciled Howard's Garden City with high density buildings, NYC's Stuyvesant village being an excellent illustration of it.
The literature of architects and urban planners has also been active in the creation of measurements of urban perception along a number of different evaluative dimensions [2]. This study is certainly inspired by these measures, which have been based mostly on visual surveys where people rate images on a 1-10 scale [2,6-14]. The justification of visual surveys is that urban environments have features, such as the exterior beauty of the architecture, or the neatness of the shrubbery, that are not traded in the market. Hence, these cannot be inferred from market mechanisms, such as the price system [2,14,15]. The offline and online studies conducted in the past, however, have lacked the throughput required to make comprehensive maps of urban perception (Table 2s in File S2), and hence, are limited in their ability to compare a large number of cities and neighborhoods.
Within the social sciences, the study of cities has focused mostly on the connection between demographic and economic variables, with the physical appearance of the built environment playing little or no role. The literature advanced by economists, for instance, has focused on the creation of mathematical models, such as those involved in the new economic geography of Krugman, Fujita and Venables [16][17], or on the establishment of empirical patterns, such as the knowledge spillovers documented by Glaeser and others [18][19].
Natural scientists, on the other hand, have a different focus than economists, but also rely on quantitative methods that do not incorporate the aesthetic features of the cities they study. Notable examples here include the study of the fractal growth of cities [20][21] and the study of allometric relations connecting population to a number of social and infrastructural variables [22]. Natural scientists have also been keen to develop automated data collection methods that use big data to study the statistical properties of citizens, such as their human mobility patterns [23][24][25] and social networks [26][27][28][29][30].
Finally, the most direct connection between these two streams of literature is the work of Jane Jacobs [31][32][33] and the Broken Windows Theory (BWT) of Wilson and Kelling [34]. In ''The Death and Life of Great American Cities'' [31], Jacobs emphasizes the connections she observed between the physical environment of neighborhoods, and the social interactions between the citizens that inhabited them. ''Death and Life'' is well cited among architects and urban planners. Social scientists and economists, on the other hand, often build on Jacobs' later works, including ''The Economy of Cities'' [32] and ''Cities and The Wealth of Nations'' [33]. Hence, the literature bridge represented by Jacobs' work is largely due to her participation in both streams of literature and, unfortunately, does not indicate a clear dialogue between them.
The Broken Windows Theory (BWT) of Wilson and Kelling [34], on the other hand, represents a more direct connection between the study of urban forms and social outcomes. In brief, the Broken Windows Theory suggests that evidence of environmental disorder, such as broken windows, litter and graffiti, can induce other kinds of disorder, like crime, and hence, policies that focus on the amelioration of minor offences can help fight more severe forms of criminal activity.
The BWT has also been politically influential. For instance, it was cited as a justification for New York City's quality-of-life initiative [35][36], an order-maintenance strategy that strictly enforces minor offenses, such as public drinking and turnstile jumping, as a way to prevent more substantial forms of crime, such as robbery.
Providing evidence to prove or disprove the BWT, however, has not been easy. In fact, several observational and longitudinal studies have argued in favor of and against the BWT [35][36][37][38]. Arguments against the BWT point to, among other things, the existence of spurious correlations in which underlying environmental features, such as liquor stores, can lead to both crime and disorder [36]. Arguments in favor of the BWT include experiments, like the ones performed by Keizer et al. [39]. Here the authors showed that in controlled settings, evidence of disorderly behavior, such as graffiti or supermarket carts left unattended in parking garages, was associated with an increase in the probability of people breaking other social norms, such as littering or stealing.
In recent years, the BWT has also been linked to health. For example, cases of gonorrhea in New Orleans have been shown to correlate more strongly with an index of neighborhood disorder than with an index of neighborhood poverty [40], and residents of disadvantaged neighborhoods in Illinois, where noise, graffiti and vandalism are more common, have been found to have worse health outcomes than residents of advantaged neighborhoods, even after controlling for individual level disadvantages [41].
All of these studies explore the link between people's perception of urban environments and social outcomes. Yet, the focus of this literature has been mainly on the association between crime and disorder, when this is only one of the many potential associations between the urban environment and social outcomes that can be of interest. In effect, urban landscapes are complex enough to demand a number of evaluative dimensions to be characterized [2], since beyond disorder places can look lively, modern, inspiring, classy, abandoned, congested, colorful or beautiful, among other things. These additional dimensions can be used to explore connections between aspects of urban perception and other social dimensions, such as entrepreneurship, civic engagement and high-school completion, among other things. To explore these connections, however, we need to extend our quantitative methods of urban perception beyond measures of disorder. In this paper, we show that it is possible to capture detailed information about other evaluative dimensions and show that this information can be used to characterize the inequality of cities with respect to these dimensions. Finally, inspired by the BWT, we validate the measures collected by comparing them with data on homicides for NYC.

Data
We collected data on urban perception by using 4,136 geo-tagged images from four cities (number of images in parentheses): New York City (1,706) and Boston (1,236) in the United States; and Salzburg (544) and Linz (650) in Austria (Fig. 1A-D). Images from New York City (NYC) and Boston were sourced digitally from Google Street View while images from Linz and Salzburg were collected manually onsite. The images and dataset used in the study can be downloaded from (http://pulse.media.mit.edu/static/dataset/). Perception data was collected using a website created for the study (Fig. 2A). Here users were shown two images, selected randomly from the dataset, and asked to click on one in response to one of three questions: ''Which place looks safer?'', ''Which place looks more upper-class?'', or ''Which place looks more unique?''. Users additionally had the option of indicating that both images were perceived as equal. The spatial location of images was not revealed to participants during the study.
We selected the phrasing ''Which place looks more X?'' because it reflected more accurately what could be evaluated from an image. We note that similar questions have been asked in preceding evaluative studies (17). 7,872 unique participants from 91 countries contributed a total of 208,738 votes and self-reported age and gender (SM and table 1s in File S2).
Some limitations of the data include the constrained amount of information that is captured in an image, since other sensory channels that can affect perception, such as sound and smell, are absent in pictographic depictions. Also, variation in image quality (i.e. contrast, hue, saturation, brightness, tint and clarity), as well as the time of day, and weather conditions, can introduce additional sources of variation in the perceptions associated with a digital image. We therefore interpret the urban perception data collected through this method as a proxy for the perceptions elicited by the actual locations [2]. Finally, we note that the mapping between images and locations is not one-to-one. In fact, for a large number of locations we captured more than one image, by pointing the camera in two or more directions. Hence, many locations are characterized by more than one quantitative value, usually two. We captured more than one image for many locations to take into account the variability of using images that are not 360-degree representations of a place, but a 90-degree wedge.

Table 2. Comparison between the means and standard deviations of the urban perception recorded for each city and question. Difference in means: t-test for equal means with unequal variances.

Measures
We scored each image using the fraction of times it got selected over another image, corrected by the ''win'' and ''loss'' ratios of all images with which it was compared. This correction allowed us to adjust for the ''strength of schedule'' [42], since by chance some images were compared with others that were more likely to be selected favorably in pairwise comparisons. We define the win (W) and loss (L) ratios of image i with respect to question u as:

$$W_{i,u} = \frac{w_{i,u}}{w_{i,u} + l_{i,u} + t_{i,u}}, \qquad L_{i,u} = \frac{l_{i,u}}{w_{i,u} + l_{i,u} + t_{i,u}} \tag{1}$$

where w is the number of times an image was selected over its paired image, l is the number of times that an image was not chosen over its paired image, and t is the number of times when an image was chosen as equal to its paired image. Using this, we define the Q-score for each image i and question u as:

$$Q_{i,u} = \frac{10}{3} \left( W_{i,u} + \frac{1}{n_i^{w}} \sum_{j_1=1}^{n_i^{w}} W_{j_1,u} - \frac{1}{n_i^{l}} \sum_{j_2=1}^{n_i^{l}} L_{j_2,u} + 1 \right) \tag{2}$$

where $n_i^{w}$ is equal to the total number of images i was preferred over, $n_i^{l}$ is equal to the total number of images i was not preferred over, and where the first sum extends over $j_1$, the images that image i was preferred over, and the second sum extends over $j_2$, the images that were preferred over i. Equation (2) simply corrects an image's win ratio ($W_{i,u}$) by adding the average win ratio of the images that it was selected over and by subtracting the average loss ratio of the images that were selected over image i. By doing this, we incorporate information about the images that were paired together with each image. The numerical factors of 10/3 and 1 are used to scale the score to fit the range [0-10], and come from the theoretical minimum and maximum of the analytic expression (2) (see SM). In sum, a score of Q = 10 represents the maximum possible score for safety, social-class or uniqueness, whereas Q = 0 represents the minimum.
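The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own implementation: the vote format `(left, right, outcome)`, the function name `q_scores`, the use of distinct opponents when averaging, and the convention that an empty average contributes zero are all assumptions made for the example.

```python
from collections import defaultdict

def q_scores(votes):
    """Compute strength-of-schedule-corrected scores on a 0-10 scale.

    votes: iterable of (left, right, outcome) with outcome in
    {"left", "right", "equal"} (hypothetical input format).
    """
    wins, losses, ties = defaultdict(int), defaultdict(int), defaultdict(int)
    beaten = defaultdict(set)     # images that i was preferred over
    beaten_by = defaultdict(set)  # images that were preferred over i
    for left, right, outcome in votes:
        if outcome == "equal":
            ties[left] += 1
            ties[right] += 1
            continue
        winner, loser = (left, right) if outcome == "left" else (right, left)
        wins[winner] += 1
        losses[loser] += 1
        beaten[winner].add(loser)
        beaten_by[loser].add(winner)

    images = set(wins) | set(losses) | set(ties)

    def ratio(counts, i):
        total = wins[i] + losses[i] + ties[i]
        return counts[i] / total if total else 0.0

    W = {i: ratio(wins, i) for i in images}    # win ratio
    L = {i: ratio(losses, i) for i in images}  # loss ratio

    scores = {}
    for i in images:
        # Assumption: an image with no wins (or no losses) gets a zero correction.
        avg_w = sum(W[j] for j in beaten[i]) / len(beaten[i]) if beaten[i] else 0.0
        avg_l = sum(L[j] for j in beaten_by[i]) / len(beaten_by[i]) if beaten_by[i] else 0.0
        scores[i] = 10.0 / 3.0 * (W[i] + avg_w - avg_l + 1.0)
    return scores
```

Because the win ratio and the two averages each lie in [0, 1], the bracketed quantity lies in [0, 3], which is why the factor 10/3 maps the score onto [0, 10].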

Robustness of Q
We test the inter-rater, or inter-observer, reproducibility of Q by comparing the scores obtained using the same number of images, but extracted from non-overlapping subsets of votes of size v. We do this using subsets containing up to 50% of the total votes, because it is not possible to construct non-overlapping subsets that are larger than 50% of the original sample. As our measure for inter-rater robustness (B), we use the average $R^2$ of the Pearson correlation between rankings calculated using the same set of images, but a different set of votes. Formally, we define B as:

$$B(v) = \left\langle \left( \frac{\langle Q_1(v)\, Q_2(v) \rangle - \langle Q_1(v) \rangle \langle Q_2(v) \rangle}{s_1 s_2} \right)^{2} \right\rangle \tag{3}$$

where $Q_1(v)$ and $Q_2(v)$ represent two sets of Q-scores calculated using disjoint sets of participants of size v, $\langle \cdot \rangle$ is used to indicate averages, and $s_1$ and $s_2$ are, respectively, the standard deviations of the Q-scores in the sets $Q_1$ and $Q_2$. We note that B is related to Cronbach's α and represents an estimate of the test-retest reliability of the method. A value of B = 100% indicates a perfectly robust ranking, since it would mean that the exact same set of Q-scores was obtained by using data collected from different people. Figure 2B shows the average B obtained for subsets of different size v (thick line) for each question. We find that the behavior of B as a function of the sample size v is well approximated by a two-parameter function (4), where a and b are fitting parameters ($R^2$ = 99.7% for safety, $R^2$ = 99.9% for social-class and $R^2$ = 99.9% for uniqueness). We use (4) to extrapolate the observed values (thin line, Fig. 2B) and infer the values expected for the totality of our dataset, finding that the 93,622 votes collected for the safety question (red square) result in B = 86.3%, the 70,157 votes available for the social-class question (blue square) result in B = 84.4%, and the 48,109 votes collected for uniqueness (green square) result in B = 56.0%. Finally, we test the internal consistency of the perceptions collected by looking at their transitivity.
We find that the overall level of transitivity of our data is high (86.76% for safety, 87.00% for social-class, and 83.34% for uniqueness).
As a rule of thumb, we find that between 22 and 32 votes per image are needed to produce a ranking with B > 75% for each of the three questions.
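The split-sample robustness check can be illustrated with the sketch below, which averages the squared Pearson correlation between scores computed from two disjoint random subsets of v votes. It is only a sketch under stated assumptions: the names `pearson_r2`, `win_fraction` and `split_half_B` are hypothetical, votes are simplified to `(winner, loser)` pairs, and a plain win fraction stands in for the Q-score so that the block is self-contained.

```python
import random
from statistics import mean, pstdev

def pearson_r2(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
    return (cov / (sx * sy)) ** 2

def win_fraction(votes):
    """Stand-in score (assumption): fraction of comparisons an image won."""
    won, seen = {}, {}
    for winner, loser in votes:
        won[winner] = won.get(winner, 0) + 1
        for img in (winner, loser):
            seen[img] = seen.get(img, 0) + 1
    return {img: won.get(img, 0) / n for img, n in seen.items()}

def split_half_B(votes, v, score=win_fraction, repeats=100, seed=0):
    """Average R^2 between scores from two disjoint vote subsets of size v."""
    if 2 * v > len(votes):
        raise ValueError("two disjoint subsets of size v need at least 2*v votes")
    rng = random.Random(seed)
    r2_values = []
    for _ in range(repeats):
        sample = rng.sample(votes, 2 * v)          # draw without replacement
        s1, s2 = score(sample[:v]), score(sample[v:])
        common = sorted(set(s1) & set(s2))         # images scored in both halves
        if len(common) < 2:
            continue
        r2_values.append(pearson_r2([s1[i] for i in common],
                                    [s2[i] for i in common]))
    return mean(r2_values) if r2_values else float("nan")
```

Any scoring function that maps a list of votes to a dictionary of image scores can be passed as `score`, so the same check could be run on Q-scores; evaluating `split_half_B` over a range of v gives the kind of curve that is then extrapolated to the full dataset.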
One important concern that needs to be addressed here is the possible biases in the measures that might come from the demographics of the participants that joined the online experiment. To test for this, participants were asked to self-report age and gender after contributing five clicks. Self-reporting was high, with 97.1% of the participants providing answers for age and gender. From these, 76.0% identified themselves as male and 21.1% as female. The median self-reported age was 28 years. Finally, participants were geo-located using their IP addresses and the 7,872 unique IP addresses were located in 91 countries. We test the significance of possible biases by comparing the Q-scores estimated using different subsets of participants. We do this for participants' age (above and below the median), gender (male and female), and location (United States vs non-United States). As controls, we show the correlations obtained for random subsets of participants of the same size (Figures 1s, 2s and 3s in File S2). For example, we compare the correlation of the scores obtained for people older and younger than the median age of 28, with the correlation obtained for two disjoint random half-samples of participants. The same procedure was used to create controls for the correlations observed between groups of participants of different gender and for participants from US and non-US locations, as proxied by participants' IP addresses. Overall, we find that the correlations obtained for groups of different demographics are not significantly lower than those obtained for the random controls, indicating that the results of our sample are not driven by biases in age, gender or location of the study's participants.

Results
We begin by asking whether perceptions of safety, class and uniqueness are perfectly collinear, or whether they have significant orthogonal components. Figures 3A-3C show typical images associated with high and low scores for safety, social-class and uniqueness. Places perceived as safe are also more likely to be perceived as upper-class (Fig. 3D [...]). Figure 3F shows the analysis for the remaining combination of social-class and uniqueness ($R^2$ = 37.04%, p-value < 0.0001). Together, these results show that data collected through this method can be used to identify images satisfying combinations of criteria, and therefore can distinguish between the perceptions of safety, social-class and uniqueness.
Next, we use Q to measure the contrast or inequality of urban perception. We begin this by asking: how wide is the range of perceptions elicited by the images of one city vis-a-vis another? Figure 4A shows the distribution of scores characterizing each image, for each city and question (values are reported in Table 1). Here, we see that images in Boston and NYC are distributed over a wider range of values. Yet, since we have considerably more images for Boston and NYC than for Linz and Salzburg, we compare the standard deviations of these distributions (s), rather than their range. We do this because the standard deviation of a distribution is independent of sample size and provides a good comparator to measure the dispersion of the Q-scores calculated for each city. Moreover, the distribution of Q-scores for each question is close to normal (see SM and Figure 4s in File S2). Table 2 compares the means and standard deviations of each city and question using, respectively, a t-test to compare the means of distributions with different variances, and an F-test. The F-test allows us to assess whether the difference between the standard deviations of two distributions is significant, after taking into consideration their sample size [43]. We find that the standard deviations of the distributions for Boston and NYC are considerably larger than those for Linz and Salzburg, even when there are no significant differences in the mean (for example with the means of Linz and Boston for social-class). This suggests that Boston and NYC are perceptually more unequal, since the average gap of the evaluative response between images is larger in NYC and Boston than in Linz and Salzburg. Moreover, we note that the standard deviation measured for NYC is not statistically larger than the one measured for Queens and Brooklyn, when it comes to the perception of safety and class.
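To make the two comparisons concrete, the sketch below computes the test statistics for a t-test with unequal variances (Welch's form, with the Welch-Satterthwaite degrees of freedom) and for a variance-ratio F-test. The function names are hypothetical, and the block stops at the statistics and their degrees of freedom; p-values would normally be read from the t and F distributions in a statistics library.

```python
from statistics import mean, variance

def welch_t(a, b):
    """t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def variance_ratio_f(a, b):
    """F statistic (ratio of sample variances) with its two degrees of freedom."""
    return variance(a) / variance(b), len(a) - 1, len(b) - 1
```

In this setting `a` and `b` would be the lists of Q-scores of two cities for one question; a large F with similar means corresponds to the pattern described above, where dispersion differs even when averages do not.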
Next, we study the segregation of urban environments by asking if the places associated with similar perceptions of safety, social-class and uniqueness co-locate, and if so, to what extent. In principle, a wider range of values is observed for Boston and NYC, but these could be spatially intermixed rather than clustered. To measure the spatial segregation of perceptions we use Moran's I statistic [44]. Values of I range from -1 to 1. A value of -1 indicates perfect anti-correlation (e.g. a checkerboard), whereas a value of 1 indicates that similar values are perfectly clustered. The null hypothesis of I is complete spatial randomness and produces values near 0. Moran's I statistic, however, cannot be used directly to make statistical inferences, since its significance depends on the sample size. Hence, we normalize the Moran I scores for each city by subtracting the city's average and dividing it by its standard deviation (creating a z-score). We also control for differences in sample size by randomly down-sampling the data for Boston, NYC and Linz to match the 544 points available for Salzburg. This guarantees that all datasets have the same sample size and ensures that variations are not due to differences in the number of points considered. Figure 4B shows the z-scores associated with Moran's I for each city and question (see Table 3s in File S2 for p-values). In general we find that all cities exhibit positive spatial autocorrelation, with Boston and New York having higher z-scores than Linz and Salzburg. These results suggest that the American cities studied have more segregated neighborhoods than the Austrian cities of Linz and Salzburg. To explore this further, we measure the length of the spatial autocorrelation using an autocorrelation function, A, of the distance between images (equation (5)). Figure 4C shows the autocorrelation function (5) for each city and for the three NYC boroughs of Manhattan, Queens and Brooklyn.
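As an illustration of the statistic used here, the following sketch computes Moran's I for a set of scored points and converts it to a z-score against random reshufflings of the scores. Several choices are assumptions made for the example rather than details taken from the study: inverse-distance weights, planar coordinates, skipping pairs of points at identical coordinates, and a permutation-based null distribution for the z-score.

```python
import math
import random

def morans_i(values, coords):
    """Moran's I with inverse-distance weights (assumed weighting scheme)."""
    n = len(values)
    m = sum(values) / n
    dev = [x - m for x in values]
    num, wsum = 0.0, 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(coords[i], coords[j])
            if d == 0:
                continue  # assumption: ignore pairs taken at the same location
            w = 1.0 / d
            wsum += w
            num += w * dev[i] * dev[j]
    den = sum(d * d for d in dev)
    return (n / wsum) * (num / den)

def morans_z(values, coords, permutations=199, seed=0):
    """z-score of the observed I against I for randomly shuffled values."""
    rng = random.Random(seed)
    observed = morans_i(values, coords)
    shuffled = list(values)
    sims = []
    for _ in range(permutations):
        rng.shuffle(shuffled)
        sims.append(morans_i(shuffled, coords))
    mu = sum(sims) / len(sims)
    sd = (sum((s - mu) ** 2 for s in sims) / (len(sims) - 1)) ** 0.5
    return (observed - mu) / sd if sd > 0 else float("nan")
```

To mirror the down-sampling step, one could draw the same number of points from each city (for instance with `random.sample`) before calling `morans_z`, so that z-scores are compared at equal sample size.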
We note that since many locations contain more than one image (images captured with the camera pointing in a different direction), A(0) < 1, since this represents the correlation between images captured in the same location but with a different heading. Finally, we measure the correlation length of each of these using a fitted function (6) whose fitting parameters include a term, g, that is included to capture the negative correlations observed for large values of |d| (> 5 km). To ease interpretation, we define l as the distance |d| at which A(|d|) = 0. To avoid measurement errors due to binning, we take the average l calculated empirically using a series of bins ranging from 100 [...] (Figures 4D-F) and Boston than in Linz and Salzburg. Finally, we note that the NYC boroughs of Manhattan, Brooklyn and Queens all exhibit strong autocorrelation, with lengths only slightly smaller than that of NYC. This suggests that the measures obtained for NYC also hold for smaller spatial scales in that city, yet a detailed evaluation of the association between the segregation of urban perception and city size will require data on a larger number of cities.
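The idea of a correlation length defined by the distance at which the autocorrelation reaches zero can be sketched as follows. This is not the paper's functional form, which is not reproduced here: the sketch assumes a simple binned, variance-normalized autocorrelation and locates the first zero crossing by linear interpolation, and the names `binned_autocorrelation` and `first_zero_crossing` are hypothetical.

```python
import math

def binned_autocorrelation(values, coords, bin_width, max_dist):
    """Return [(bin center, correlation)] for pairs grouped by separation."""
    n = len(values)
    m = sum(values) / n
    var = sum((x - m) ** 2 for x in values) / n
    nbins = int(max_dist / bin_width)
    sums, counts = [0.0] * nbins, [0] * nbins
    for i in range(n):
        for j in range(i + 1, n):
            b = int(math.dist(coords[i], coords[j]) / bin_width)
            if b < nbins:
                sums[b] += (values[i] - m) * (values[j] - m)
                counts[b] += 1
    return [((b + 0.5) * bin_width, sums[b] / counts[b] / var)
            for b in range(nbins) if counts[b]]

def first_zero_crossing(curve):
    """Distance at which the curve first goes from positive to non-positive."""
    for (d0, a0), (d1, a1) in zip(curve, curve[1:]):
        if a0 > 0 >= a1:
            return d0 + (d1 - d0) * a0 / (a0 - a1)  # linear interpolation
    return None
```

Repeating the calculation for several values of `bin_width` and averaging the resulting crossings is one way to reduce the sensitivity to binning mentioned above.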

Urban perception and violent crime
Finally, we use homicide data for NYC to look at the correlation between the urban perception of inequality and homicides. We note from the start that our intention is not to make a causal statement, but simply to use this correlation to validate the value of the information contained in our measures of urban perception. Because of the spatial nature of the dataset, we use Getis Spatially Filtered Regression (GSFR) [45][46], rather than an Ordinary Least Squares (OLS) regression. In spatial datasets it is not appropriate to use OLS regressions because of the existence of spatial autocorrelations. In other words, the fact that neighboring cells are characterized by similar values violates the independence assumption needed to perform an OLS regression. So, an OLS regression is only justified if its residuals are not spatially autocorrelated. This is because the autocorrelation of the residuals would indicate the existence of unexplained spatial variation, and therefore, the existence of a missing variable. In statistics, we would say that in this case the model is underspecified.
GSFRs solve this problem by using a transformation that separates each variable x into two estimates: one capturing the spatial variation of the variable ($L_x$), and the other capturing the local variation of this variable remaining after the spatial variation has been removed (x*). For each location i, and variable x, these variables are defined as:

$$x_i^{*} = \frac{x_i \, S_i/(n-1)}{G_i}, \qquad L_{x_i} = x_i - x_i^{*} \tag{7}$$

where $S_i = \sum_j s_{ij}$ is the sum of the spatial weights used to characterize the spatial proximity between data points (in our case 1/distance between locations i and j), n is the number of locations considered and

$$G_i = \frac{\sum_j s_{ij} x_j}{\sum_j x_j} \quad \text{for } j \neq i.$$

Finally, a GSFR regression is an OLS regression where each variable x is replaced by its spatially filtered component x* and its spatially varying component $L_x$. More details about this statistical technique can be found in [45]. To illustrate what the method does, consider the income of a zip code. This is a variable that is certainly spatially autocorrelated, since rich zip codes are more likely to be located next to other rich zip codes. Instead of incorporating income as a variable, a GSFR will incorporate an income* variable, which would be the income of a zip code that is not explained by the incomes of nearby zip codes, and an $L_{income}$ variable, which would capture the spatial variation of income across zip codes. Table 3 shows the results of a GSFR where the dependent variable is the logarithm of the number of homicides in a NYC zip code recorded between 2003 and 2011. We note that the Google Street View API does not provide information for the date and time the images were captured. As explanatory factors we use the average income of households in the zip code, population, area, age and four urban perception variables: the average Q-score for safety and class (Q_safety, Q_class), and their respective standard deviations (SQ_safety, SQ_class) calculated for each zip code.
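The filtering step can be sketched as below for a single variable. This is a minimal illustration assuming the standard form of the Getis filter with inverse-distance weights and strictly positive values; the function name `getis_filter`, the planar coordinates, and the choice to skip pairs at identical coordinates are assumptions for the example.

```python
import math

def getis_filter(values, coords):
    """Split each value into a filtered part x* and a spatial part L = x - x*.

    Assumes positive values and inverse-distance weights s_ij = 1 / d_ij.
    """
    n = len(values)
    total = sum(values)
    filtered, spatial = [], []
    for i in range(n):
        s_sum = 0.0      # S_i: sum of weights around location i
        weighted = 0.0   # sum_j s_ij * x_j over j != i
        for j in range(n):
            if j == i:
                continue
            d = math.dist(coords[i], coords[j])
            if d == 0:
                continue  # assumption: ignore coincident locations
            s = 1.0 / d
            s_sum += s
            weighted += s * values[j]
        g_i = weighted / (total - values[i])          # G_i, excluding j = i
        x_star = values[i] * (s_sum / (n - 1)) / g_i  # expected / observed G_i
        filtered.append(x_star)
        spatial.append(values[i] - x_star)
    return filtered, spatial
```

A useful sanity check is that a spatially constant variable passes through unchanged: every `x_star` equals the original value and every spatial component is zero. Both returned lists would then enter an ordinary least squares regression as separate regressors.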
Formally, the regression follows the GSFR form described above, with each explanatory variable entering through its filtered (x*) and spatial ($L_x$) components. Table 3 presents five different specifications of the statistical model. All models include the population and area of a zip code, since these are obvious correlates of crime. Model 1 also includes income and age. Model 2 adds the perception of safety, while model 3 includes the perception of class. Model 4 includes the perception of class and safety, but no information on age or income. Finally, model 5 includes all variables: population, area, income, age, average perception of safety, average perception of class, standard deviation in the perception of safety, and standard deviation in the perception of class. We note that for the full specification of our model (model 5), we find no spatial correlations among the residuals (Moran's I z-score = -0.23, p-value = 0.82), indicating that the model is not underspecified and can be used for statistical inference. Hence, the results cannot be interpreted as the result of a missing variable, such as policing or race [45][46].
Model 5 explains nearly 80% of the variation of homicides across zip codes. This is roughly 10 percentage points more than what is explained by income, age, population and area alone: from 69.88% (model 1) to 79.36% (model 5). The increase is statistically significant (F = 5.3, p-value < 1.8×10^-5), and indicates that the measures of urban perception contain information on the location of homicides that is not contained in income.
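A comparison of this kind between a reduced and a full regression is commonly carried out with a nested-model F-test on the two R² values. The sketch below shows that calculation under the assumption that this is the test intended; the function name and its arguments (number of observations and number of predictors in each model) are hypothetical, and no values from Table 3 are assumed.

```python
def nested_f(r2_reduced, r2_full, n_obs, k_reduced, k_full):
    """F statistic for the gain in R^2 from adding predictors to a nested model.

    k_reduced and k_full count predictors, excluding the intercept.
    Returns (F, numerator df, denominator df).
    """
    q = k_full - k_reduced        # number of added predictors
    df2 = n_obs - k_full - 1      # residual degrees of freedom of the full model
    f = ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / df2)
    return f, q, df2
```

For example, `nested_f(0.5, 0.75, 103, 2, 4)` compares a two-predictor model with a four-predictor model fitted to 103 observations; the p-value would then be read from the F distribution with the returned degrees of freedom.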
Overall, we find that in the full model (model 5), the spatial components (L_Q_safety, L_Q_class), and not the local intensity components (Q_safety*, Q_class*), are statistically significant, meaning that the spatial variation of urban perception across the city is what correlates significantly with the location of homicides. Moreover, we find that the local spread of perceptions within a zip code (SQ_class*, SQ_safety*) correlates with the number of homicides. These results are consistent in the sense that spatial variations for the perceptions of safety and class (rather than their absolute values) correlate with violent crime, after introducing the control variables. A visual comparison of the statistical models presented in Table 3 is presented in Figure 5.
Finally, we notice that the regression coefficients of the safety variables are negative (safer looking, less crime), whereas those of class are positive (classier looking, more crime). As expected, coefficients of safety and class are negative when introduced individually (models 2 and 3), but the one for class reverses sign when we control for safety (models 4 and 5). We interpret the opposite signs of these coefficients as evidence that the orthogonal component between class and safety (Figure 3D) carries important information, since it indicates that violent crime occurred in places that look relatively more upper class after controlling for their perception of safety.

Conclusions
The way a city looks is of central importance for the daily experience of billions of city-dwellers. Yet until now, the availability of data about urban perception has been limited, and so has our ability to compare cities with respect to it. In this paper, we presented a method to measure urban perception and found that the cities of Boston and NYC differ from the Austrian cities of Linz and Salzburg in two important dimensions. First, the perceptions recorded for the cities of Boston and NYC are distributed more broadly than the perceptions elicited by the images from the two Austrian cities of Linz and Salzburg. Second, positive and negative perceptions cluster more strongly in the two American cities than in their European counterparts. This means that the recorded gap between ''good'' and ''bad'' neighborhoods is larger in NYC and Boston and that both positively evaluated and negatively evaluated images cluster more in these American cities than in their Austrian counterparts. Finally, we showed that the inequality of perceptions helps explain the location of violent crime in a NYC zip code, even after controlling for income, population, area and age.
As the world gears towards building cities for hundreds of millions of individuals, the imperative of understanding cities becomes ever more important [3]. Therefore, there is a strong need to create quantitative bridges that can help us link urban perception with other social, political, economic and cultural aspects of cities. In this paper, we presented a method that can be used to quantify urban perception and applied it to the study of a few cities and questions. Although the method offers an important improvement in throughput over previous studies, its ability to collect data is limited by web traffic and participation. Because of this, future iterations will need to consider the use of a combination of crowdsourcing and machine learning tools to extend the patterns captured by the online participation data to higher resolution and different latitudes. Moreover, future studies might also explore the perceptual biases associated with the measurement technique presented in this paper, as well as support the development of techniques that can help identify the features that determine the evaluative responses recorded. Ultimately, the goal of this study, and those similar to it, is to contribute to our understanding of the urban environments that we have built, with the goal of improving them, and their ability to include their citizens, while also informing the construction of future cities.

Supporting Information
File S1 Q scores.