Structure of 311 service requests as a signature of urban location

While urban systems demonstrate high spatial heterogeneity, many urban planning, economic and political decisions heavily rely on a deep understanding of local neighborhood contexts. We show that the structure of 311 Service Requests enables one possible way of building a unique signature of the local urban context, thus being able to serve as a low-cost decision support tool for urban stakeholders. Considering examples of New York City, Boston and Chicago, we demonstrate how 311 Service Requests recorded and categorized by type in each neighborhood can be utilized to generate a meaningful classification of locations across the city, based on distinctive socioeconomic profiles. Moreover, the 311-based classification of urban neighborhoods can present sufficient information to model various socioeconomic features. Finally, we show that these characteristics are capable of predicting future trends in comparative local real estate prices. We demonstrate 311 Service Requests data can be used to monitor and predict socioeconomic performance of urban neighborhoods, allowing urban stakeholders to quantify the impacts of their interventions.


Introduction
Cities can be seen as a complex system composed of multiple layers of activity and interactions across various urban domains; therefore, discovering a parsimonious description of urban function is quite difficult [1][2][3][4].However, urban planners, policy makers and other types of urban stakeholders, including businesses and investors, could benefit from an intuitive proxy of neighborhood conditions across the city and over time [5][6][7].At the same time, such simple indicators could provide not only valuable information to support urban decision-making, but also to accelerate the scalability of successful approaches and practices across different neighborhood and cities, as urban scaling patterns have become an increasing topic of interest [8][9][10][11][12].As the volume and heterogeneity of urban data have increased, machine learning has become a viable tool for enhancing our knowledge of urban space and in developing predictive analytics to inform city management and policy [1,3,13,14].
However, one of the major limitations to widespread adoption of such analytics in the practice of urban management and planning is the extreme heterogeneity of the data coverage: different types of data are available for different areas and periods of time, which undermine efforts to develop universal and reliable analytic approaches.Privacy considerations are another significant issue that create additional practical and legal obstacles, restricting data access and preventing their use out of a concern for confidentiality [51][52][53][54].
Increasingly, cities are introducing systems to collect service requests and complaints from their citizens.These data, commonly referred to as 311 requests, reflect a wide range of concerns raised by city residents and visitors, offering a unique indicator of local urban function, condition, and service level.In many cities, 311 requests are publicly available through city-managed open data platforms as part of a broader movement in local government to increase transparency and good governance [55].Although potentially biased by the self-reported nature of the requests and complaints, these data provide a comparable measure of perceived local quality of life across space and time.
In this article, we develop a method for classifying urban locations based on the categorical and temporal structure of 311 Service Requests for a given neighborhood, exploring whether these spatio-temporal patterns can reveal characteristic signatures of the area.For New York, Boston, and Chicago, we present applications of this new urban classifier for predicting socioeconomic and demographic characteristics of a neighborhood and estimating the economic performance and well-being of a defined spatial agglomeration.The paper begins with a discussion of the data and methodology, followed by specific use cases relating to demographics and real estate values, and concluding with opportunities for future research.

Materials and Resources
1.1 The 311 data 311 service request and complaint data are being collected across more than 30 cities in the United States, including New York, Boston and Chicago.Through the 311 system, local government agencies offer non-emergency services to residents, visitors and businesses and respond to reported service disruptions, unsafe conditions, or quality-of-life disturbances.These 311 service requests and complaints cover a wide range of concerns, including, but not limited to, noise, building heat outages, rodent sightings, etc.Thus, these data serves as an extremely useful resource in understanding the delivery of critical city services and neighborhood conditions.
We explore the 311 datasets from New York, Chicago, and Boston as major urban centers where 311 systems are in place and commonly used.We consider a time frame between 2012 and 2015 during which the data are available for all three cities selected.
In table 1, we provide descriptive statistics of the data.Note that the number of total requests has been increasing from 2012 to 2015 in each city.Conceivably, the number of requests in New York City (which now approaches 2 million per year) is higher than the others because of its population size.However, Boston has a substantially smaller number of requests compared to the similar-sized city of Chicago, which shows the discrepancies in the use of the system across cities.Unfortunately, each city uses a different complaint/request coding convention, thus there is little consistency in the classification of particular complaint types.This fact raises certain difficulties for analysis between cities, a common challenge in comparative urban analytics given the lack of data standardization.1.General properties of the 311 data for NYC, Chicago and Boston

Demographic and socio-economic data
As we are attempting to use 311 data as a proxy for the socioeconomic characteristics and real estate values of urban neighborhoods, ground-truth data are needed to train and validate our models.For socioeconomic and demographic features, we use data from the U.S. Census 2014 American Community Survey (ACS).For real estate values, we collect housing price data from the online real estate listing site Zillow.Both are described below.

2014 census data
The 2014 ACS contains survey data on a number of socieconomic and demographic variables, at the spatial aggregation of the Census Block.For this analysis, we have selected common features representing important phenomena in population diversity, education, and income and employment.For example, our selection covers the number of population in the following categories: "Non-Hispanic White", "African-American", "Asian", "High school degree", "College degree", "Graduate degree", "Uninsured ratio", "Unemployment ratio", "Poverty ratio", and mean for "Income (all)", "Income of No Family", "Income of Families" and "Income of Households".
One important consideration is the level of spatial aggregation for this analysis.Having considered zip code, census tract and census block, we decided to proceed with census tracts providing the best trade-off between spatial granularity, in terms of having a sufficient number of sub-areas within each city, and having a statistically significant sample of 311 complaints for each areal unit.In Boston and Chicago, there are too few zipcodes within in each city to create a useful sample, and there is not a significant density of 311 complaints at the census block level (please refer to SI4 for details).In addition, given the survey methodology of the ACS data, census block data include non-trivial margins-of-error for each variable.

Zillow Housing price
One important indicator of local economic conditions is housing prices [56].We utilize Zillow housing price data that contain monthly average residential real estate sales prices by zip code.Although housing prices are a lagging indicator of neighborhood economic strength, since recorded sales occur as much as two to more than six months after a contract is signed, we use these values as one of the targets for our 311 predictions.Our spatial level of analysis will be the zipcode, rather than census tracts, given the coverage area of the Zillow aggregate data.

Normalization method and some notations
In order to better compare various areas, the census data need to be normalized.Take income per capita and population with bachelor degree for example.Firstly, these two features have different measurement units (dollars versus number of people).Secondly, this number can be affected by the area's total population.For an area with high population, there should be a higher possibility to have higher population with bachelor degree.Therefore, the normalization process is important in order to compare different features and different areas with heterogeneous population.For our analysis, we normalize our census tract data set in the following way.
Let p i be the total population in census tract, while v i denotes one feature recorded in the same census tract i, for example, "the total population who holds graduate degree in census tract i".Next, we normalize it by defining We define Ω as a set of all census tracts in New York City.
In section 2, we use 311 complaint frequency categorized by census tracts to cluster and investigate the difference in local socioeconomic features y.In section 3 we use machine learning regression models to predict these features y using normalized 311 data .
In order to get initial insights on the usage of 311 service across the considered cities, we define for each census tract a 311 service request signature -a vector of the relative frequencies of 311 requests of different types.Specifically, let the total number of service requests of each type t within an area a be s(a, t) and let s(a) = t s(a, t) be the total number of service requests in the area a. Then a vector S(a) = (s(a, t)/s(a), t = 1..T ), where T is the total number of service request types, serves as a signature of the location's aggregated 311 service request behavior.The vector S highlights the primary reasons for service requests or complaints in the specific area, as well as allowing for straightforward comparison across tracts and cities.
Signatures S(a) serve as unique characteristics of each location a, and we would expect similar spatio-temporal patterns to emerge in 311 service requests across a city or cities.Our hypothesis here is that these similarities also suggest similarities in the socioeconomic characteristics of the areas.In order to explore this further, we apply a k-means clustering approach to the set of multi-dimensional vectors S(a).In order to ensure we get an optimal clustering we run the k-mean 100 times, selecting the best solution in terms of cumulative square sum of distances from centroids.
One crucial step in this approach is to pick up an appropriate number of clusters to consider.For that purpose we have evaluated the clustering model with both Silhouette method and Elbow method.While different methods give a slightly different optimal number of clusters for the cities in our sample, in most cases it is within a range of two to four clusters.Given the socioeconomic diversity across neighborhoods in the selected cities, we determine that a minimum of four clusters is an appropriate value.Readers can find more details in SI.
We consider NYC first.In Figure 1, we see below with approximate 2000 census tracts divided into four clusters based on our clustering results.Midtown Manhattan, downtown Brooklyn and several outliers such as JFK and LGA airports belong to cluster 1; Staten Island and eastern Brooklyn/Queens constitute cluster 2; Northern Manhattan, the Bronx, and central Brooklyn are included in cluster 3; and Southern Brooklyn, Flushing and some eastern parts of Bronx comprise cluster 4.
In order to evaluate how different each cluster is with respect to the nature of 311 service requests, (see figure 2) we present the distribution of top service requests over the four clusters.We observe clear variation in this distribution.For example, complaints/requests within cluster 1 more often report noise concerns than others, cluster 2 experiences more issues relating to residential heating, cluster 3 has the highest relative complaints about blocked driveways, while cluster 4 reports concerns about street conditions.
Similarly, we repeat the same clustering process for Chicago and Boston and the clustering results for census tracts in those cities are shown in Figure 3.

Socioeconomic features among clusters
Given knowledge of the local spatial contexts for the analyzed cities, the clusters that emerge make certain intuitive sense.However, in order to quantitatively address the hypothesis formulated in the previous section -that similarities in local 311 service request signatures also imply similarities in the socioeconomic profiles of those areashere we summarize and analyze the socioeconomic characteristics for each of the discovered clusters.
Recall that thus far the clustering results are obtained based on the 311 service requests frequency alone with no socioeconomic information considered.Next we summarize 14 socioeconomic features and compare the normalized mean level for each Cluster1 Cluster2 Cluster3 Cluster4 feature in each of the considered clusters.The results for our three cities are presented in the radar plots in Figures 4-6.From the output, we can see that the socioeconomic features among the defined clusters are quite distinctive.
Take NYC for example: • Education and Income: People with higher levels of education (with graduate degree and above) are found in cluster 1, which, as expected, also has highest income level.Cluster 3 appears to show the opposite results.
• Racial diversity: There are above average concentrations of Non-Hispanic Whites living in clusters 1 and 2, of Asian origin in cluster 4, and African-American populations in cluster 3.
Similarly we have (for both Chicago and Boston): • Cluster 1 has the highest income and education level, while cluster 3 is the lowest.
• Cluster 2 is predominantly Asian and African-Americans, while Non-Hispanic Whites tend to live in clusters 1 and 4.   The observations above provide some evidence for our hypothesis, revealing links between socioeconomic features and 311 service request data structure.Indeed, while the clustering is performed based on the 311 data alone, the socioeconomic features happen to be quite distinctive among the clusters.Of course this only reveals the existence of a certain relation in principle, which might not be that practical.However this gives rise to another hypothesis -can one use 311 service request data to actually model socioeconomic features at the local scale?We find that 311 service request signatures allow the city to be divided into clusters based on distinctive patterns of socioeconomic characteristics.Following this, we explore whether 311 service requests can be used to model these socioeconomic patterns.Such a model could be useful as socioeconomic data are often unavailable or inconsistent at a given spatio-temporal scale, and therefore having a proxy derived from a model based on regularly-updated open data could have considerable potential for city operations and neighborhood planning.

Cluster1
We train regression models over the relative frequencies of 311 service requests of each type in each census tract in order to estimate the selected socioeconomic features described in subsection 1.2.1.The service requests frequencies s(a, t)/s(a) (components of the signature vectors) constitute our feature space, including 179 different features in the case of NYC, across 2000 census tracts following the data cleaning/filtering process.
We consider six target variables including income per capita, percentage of residents with a graduate degree, percentage of unemployed residents, percentage of residents living below the poverty level, as well as demographic characteristics including percentage of Non-Hispanic White and African-American populations.
The objective of the modeling is to use partial information about the target variables defined in a certain part of the city to train the model so that it can explain the target variables over the rest of the city.
For each model, we treat the different set of hyper parameters as different models.For Neural Networks, we try 5, 10, 20, 40 hidden unites and for each hidden unit, we try penalization lambda for 0.0005, 0.005, 0.05, 0.5.As to the learning process, we use mini batch size 20 and we use the following learning rate and epochs: (0.1,100), (0.05,200), (0.01,500), (0.005,1000).For RF and ETR, we use 500 trees(since increasing trees does not help) and try maximum leaf nodes: 10, 20, 30, ... 100.In total we have 64 sets of hyper parameters for NN and 10 for RF and 10 for ERF.
More details on the model selection process is presented in SI 3. Generally speaking, we select the final model with suitable hyper parameters with the help of cross-validation.We divide the data set into training and testing set by the ratio 7:3.As described above, we have 84 different models.For each model, we randomly divide our training set into training and validation sets and train the model on the training set and report the R-squared on the validation set.We repeat this process 20 times for each model and get the average R-squared.We pick the best model and use it for prediction on the test set.Finally, we report the out-of-sample R-squared in Table 2. Generally speaking, RF and ETR usually give us best performance based on R-squared.
The resulting out-of-sample R-squared for the models selected are summarized in Table 2.We consider this modeling result important because: • it indicates that a relationship exists between 311 request signature and the local socioeconomic features of each area; • it enables possible prediction and estimation of other local socioeconomic features by using 311 requests data, particularly those features for which data are collected at low temporal frequency, such as Census data; and, • it can be easily scaled by geographic aggregation for various research, operational, or planning purposes.

Prediction of the real estate prices
Following our previous analysis, we attempt to understand the practical applicability of the prediction models.Although the findings above once again highlight a strong relation between 311 service request data and socioeconomic context of urban locations, this by itself has limited practical implications except for filling gaps in the data availability.In this section we show that 311 service request data could be also used to predict future socioeconomic variations, which may have more important practical implications for urban analytics.
As an example, consider the annual average sale price of housing per square foot in different neighborhoods of NYC as the target variable for our prediction.Our housing price is reported by Zillow at the zip code level; therefore, we rescale our 311 service request frequencies to this spatial aggregation.
To match available housing price data, we only include those 311 service categories that were recorded consistently between 2012 and 2015.New York City has 145 of such categories, covering about 70 percent of total service requests.
The target variable is updated annually and is available for each year from 2012 to 2015.The Zillow data cover 112 of the 145 zip codes in New York City where the density and frequency of 311 requests is sufficient to satisfy the filtering procedure described in the Data section.Thus, our sample for this prediction is based on data from 112 zipcodes.
We do not attempt to predict the absolute level of prices, but changes over time relative to the NYC mean.Our output therefore indicates how much more (less) expensive the housing price in a given zip code area is going to be compared to the average relative increase (decrease) in housing prices across NYC from the previous year.This way we define a new log-scale target variable Y t (z) in year t as Y t (z) = log(P t (z)/P t mean ) where P t (z) is the average price per square foot in zip code z during the year t, while P t mean is the average price per square foot across the entire city during the year t, estimated as the mean of P t (z) for all the locations z weighted by residential population of the locations used as a proxy for the locations' size.
We begin by modeling the output variable Y 2015 .We train the model using 2012 and 2013 data (both -features and output variable) over the entire NYC and use 2014 data for tuning hyper-parameters, then apply it to 2015 using the features defined based on 2015 service requests.To reiterate, the feature space as before consists of the relative service requests frequencies s(a, t)/s(a) , but now including only 145 categories of service requests, while the number of observations is 112 zip codes.
The results are reported in table 3 (we also include Boston and Chicago here just for comparison, although the number of zip codes in these cities is much smaller and thus the model becomes less significant).
As one can see from the table 3, we achieve reasonable predictive power, especially with RF and ETR approaching R 2 values of 0.80 for all three cities.However, note that modeling housing prices in 2015 is not our objective here, since a simplified model Y 2015 = Y 2014 would achieve better results given the serial correlation in the time series and the relatively small year-to-year variation in price levels.Instead, we rather focus on the model's ability to predict the magnitude and direction of those fluctuations, forecasting price trends at the zip code level.
Let Y t P (z) be the predicted value of Y t (z).We define Additionally we classify the zip codes based on the actual tendency strength, i.e. let us introduce G P ositive , G N egative , G N eutral in the same way as above but replacing the estimated D P (z) with the real D(z) in the corresponding.In this way, compared to defining strong tendency using predicted results, we define strong tendency by the real values and then test the performance of our model by the following indicators.
For each group G P ositive , G N egative , G N eutral , we calculate its the normalized population weighted average value of actual D(z) using the following formulae: where N (z) is the population of the zip code z.Similarly for each of the groups G P ositive , G N egative , G N eutral we calculate the average prediction )/σ(D(z)), )/σ(D(z)), The values of those quantities for different values of the threshold (m = 0.15, example of a very loose threshold classifying most of the predictions as strong, m = 0.35, 0.65, 1) are reported in the Tables 4 and 5  holding for all the values of the threshold m, which means that our predicted trend directions are consistent with the real trends on average.Moreover, we compare the signs of the predicted values of D P (z) for the strong predicted trends G P ositive ∪ G N egative vs the ground-truth D(z), as well as the actual values D(z) for the strong actual trends G P ositive ∪ G N egative , reporting the accuracy ratio of predicting the correct trend direction for strong actual trends and the accuracy ratio for having strong predicted trends to reveal correct trend directions (D P (z)D(z) > 0).Those indicators are listed in Table 4 and Table 5 demonstrating the model's performance.
From Tables 4 and 5, we see that, for around 40 percent of strongest tendency observations or predictions (m=0.65),our prediction accuracy of a trend direction is higher than 80 percent compared to around 43/62(69%) percent random guess baseline model in Table 4 and 31/51(60.7%)baseline in Table 5.Moreover, in Table 4, we see that when the threshold m increases from 0.15 to 0.65, the accuracy ratio of prediction goes up from 70 percent to 82 percent, meaning that the stronger the actual trend, the more likely to achieve correct prediction.In Table 5, we see that while m increases from 0.15 to 1, the accuracy ratio of prediction goes up from 72 percent to 90 percent, hence the stronger the predicted trend, the more accurately our prediction reflects the reality.The results presented in this section demonstrate that the 311-based model can indeed predict future fluctuations of socio-economic characteristics, including real estate price trends.This serves as an initial proof of concept for multiple potential urban applications using 311 data as a proxy for local socio-economic conditions.

Conclusions
A quantitative understanding of urban neighborhoods can be quite challenging for urban planners and policy-makers given significant gaps in the spatial and temporal resolution of data and data collection modalities.However, this subject is crucial for urban planning and decision making, as well as for the study of urban economic and  5. Accuracy of the correspondence of the predicted strong relative real estate price trends to the actual ones neighborhood change.In this paper, we provide an approach to quantify local signatures of urban function via 311 service request data collected in various cities across the US.These datasets, which can be easily scaled by spatial (zip code, census tracts/blocks, etc.) and temporal level of aggregation, are open to the public and updated regularly.Importantly, we demonstrate consistent relationships between socioeconomic features of urban neighborhoods and their 311 service requests.
For all three cities analyzed -New York City, Boston and Chicago -we demonstrate how clustering of census tracts by the relative frequency vectors of different types of 311 requests reveal distinctive socioeconomic patterns across the city.Moreover, those frequency vectors allow us to train and cross-validate regression models successfully explaining selected socioeconomic features, such as education level, income, unemployment and racial composition of urban neighborhoods.For example, the accuracy of the model explaining local average income in NYC is characterized by a R-squared value of 0.7, while Extra Trees Regression results in a 0.9 out of sample R-squared in explaining housing prices in Chicago (although this must be considered with respect to the smaller sample size).Finally, we illustrate the predictive capacity of the approach by training and validating the model to detect comparative average real estate price trends for zip codes in New York City.
In the nascent field of urban science and more traditional disciplines of economics and urban planning, there is increasing attention on how data collected by cities can be combined with novel machine learning approaches to yield insight for researchers and policy-makers.It is possible that such data can be used to better understand the dynamics of local areas in cities, and support more informed decision-making.In addition, it is conceivable that a set of efficient instrumental variables based on widely-available 311 data can be used to replace survey-based socioeconomic statistics at spatio-temporal scale where such official survey data is non-existent or inconsistent, thus broadening opportunities for urban analytics.

S1 Text. Unsupervised model and cluster number selection
Let S be a set of observations (locations in our data set), and a clustering U on S is a way of partitioning S into non-overlapping subsets U 1 , U 2 , ..., U k .We will investigate how well the model performs with different number of clusters, i.e. different k.
Here we choose K-means clustering with four clusters as our basic model.We made this decision based on the two clustering evaluation methods: Silhouette method and Elbow method [1] .

S1.1 Silhouette method
Silhouette is a commonly used method of interpretation and validation of consistency within clusters of data.It was first described by Peter J. Rousseeuw in 1986 [3] and it measures how similar an object is to its own cluster (internal relation) compared to other clusters (external relation).The Silhouette score ranges from -1 to 1, where higher value indicates better match to its own cluster and, at the same time, poorer matched to neighboring clusters-hence, higher Silhouette score means a better model overall as it highlights the distinctions among clusters.
The Silhouette value can be calculated with any distance metric, such as the Euclidean distance we applied here.We have run the tests for all three cities.The results are plotted on Figure S1.1.We can see that: • New York City: Models with 2, 3, and 4 clusters seem closely comparable and outperform the rest; • Boston: Models with 2 and 4 clusters have the highest Silhouette scores; • Chicago: Silhouette score appears to be decreasing as the number of clusters rises, the optimal choice is 2. • Models with 2, 3, and 4 clusters are generally better than others; • 2-cluster model seems to be the best choice in terms of Silhouette's quantitative criteria.
Next we try Elbow method, another validation approach described in next subsection, before making final decisions.

S1.2 Elbow method
The Elbow method measures how "cost-efficient" a model is by looking at the percentage of variance explained as a function of the number of clusters.It searches for a balance between "more information" and "less complicated model".Intuitively, if we start from 1-cluster model (which is no processing at all, just leave them as a whole), adding another cluster should give more information about the data distinction, but one should stop when the marginal gain is insignificant compared to the cost.Then the number of clusters is chosen at this point [5] .
Equivalently, we can check the average sum of squared errors.Of course, we want our error as small as possible, and the error tends to decrease toward 0 as we increase the cluster number, k (the error is 0 when k is equal to the number of data points in the dataset, because then each data point is its own cluster, and there is no error between this point and the center of its cluster-that point itself).The goal is the same: search for the point where the marginal drop is no longer attractive beyond it.
The results are summarized in Figure S1.2: • New York City: Obviously the error drops rapidly before 4 and then slows down after 4, so 4-cluster model is the best choice here; • Chicago: Very similar to NYC, although the change is a bit mild and both -3 and 4 -seem to be good choices; • Boston: The trend does not provide any intuitive number of clusters to focus on.

S1.3 Conclusion
To sum up, we have the following observations among three major cities in Table S1: Since 4 is the only number that has appeared in all three rows, and clearly 4 clusters can reveal more details about the city structures than 2 or 3, we think that 4-cluster model may the best overall choice.Choosing 4 instead of, say, 2, in our opinion, is a reasonable trade-off between having more clusters and still decent clustering quality.S1.Optimal choices based on each evaluation method

S2 Text. Classification result based on 311 service requests timeline data
The 311 service request data is pretty rich and although types of requests considered in the paper provide an important and useful perspective for spatial clustering and modeling socio-economic quantities, there are other interesting dimensions in the data to consider.In this supplementary paragraph, we provide an alternative approach to conduct the clustering analysis based on 311 service requests data.Instead of using types of 311 service requests, we consider their timeline, building our new data set by accumulating all types of 311 services requests during each hour of the week for each zip code area.Thus this new data set includes 168 features (activity distribution per hours within an average), for all the zip code areas within New York City.
Based on the new 168 dimensional feature space, we divide the zip code areas in NYC into four clusters using K-Means clustering algorithm, highlighting different temporal patterns in 311 service request activity.The clustering result is shown on the Figure S2.2 shows that the timelines of 311 service requests among Clusters 1,2,3 are rather similar.But Cluster 4 can be distinguished since the service requests have a considerable spike after 8 PM each day and especially over the weekends, indicating evening-time activity in those areas, which largely include locations across Manhattan, which makes common sense.Further understanding this pattern might require However the overall difference between socio-economic factors among different clusters is much less significant than the result based on 311 service request type data shown in Figure 4. Therefore, for the purpose of the socio-economic analysis of this study we decided to stick to the service request types other the timeline.

S3 Test. Model selection, Cross-validation and Out of Sample R-squared
The results of the model selection are shown in Table 2.The following procedures are applied for each city.
Assuming we have data set X and labels y, where X is n × m matrix and y is n × 1 vector.All the entries are real values.
Firstly, we collect all the possible models for training.The types of the models we consider include: Lasso, Neural Networks with regularization, Random Forest Regression, and Extra Trees Regression.We treat models with different hyper-parameters set as different models even they are belong to the same model type.
Secondly, we randomly split the whole data set into training data and test data.We train each model's parameters on training set, and get the out of sample R-squared

Figure 1 .
Figure 1.Classification of urban locations based on the categorical structure of the 311 requests.

Figure 3 .
Figure 3. Classification of urban locations based on the categorical structure of the 311 service requests for Chicago and Boston.

Figure 4 .Figure 5 .Figure 6 .
Figure 4. Normalized ratio of socioeconomic features among clusters in New York as the actual tendency of relative real estate prices in the zip code z and D P (z) = Y 2015 P − Y 2014 P as the predicted tendency of comparative housing price.We classify the 112 zip codes of NYC into three groups based on the predicted tendency strength D 2015 P : G P ositive = {z : D i P > m • σ(D P ), where i = 1, 2, ..., 112}: group of areas with strong positive tendency; G N egative = {z : D i P < −m • σ(D P ), where i = 1, 2, ..., 112}: group of areas with strong negative tendency; G N eutral = {z : −m • σ(D P ) < D P (z) < m • σ(D P ), i = 1, 2, ..., 112}: group of areas with close to neutral tendency, where m is a certain threshold and σ(D P ) indicates the standard deviation of D P (z).
and we can see consistent inequalities D P ositive > 0 > D N egative and D P ositive > 0 > D N egative

Figure S2. 1 .
Figure S2.1.Classification of urban locations based on the timeline data of 311 request services For example, in 2015, New York City's 311 data are categorized into 182 types, where Chicago has only 12.Even within a particular city, request categories are subject to change over time, especially in NYC where only approximately 70% of the entire service request activity belong to common categories present in all four years.Additional adjustments are needed to re-classify complaint types into standardized categories across the different cities and over the time period of the analysis.The original data set provided by 311 Services contains one record for each customer's call.For most cities, these records include information such as: service request type, service request open/close time and date and location(longitude and latitude).Therefore, for any given time period and area(census tract area/zipcode area), we can aggregate the 311 service requests and group by type.

Table 2 .
Out of Sample R-squared

Table 3 .
Out of Sample R-squared

Table 4 .
Accuracy of discovering actual strong relative real estate price trends by the predictive model