The authors have declared that no competing interests exist.

Crime is a major threat to society’s well-being but lacks a statistical characterization that could lead to uncovering some of its underlying mechanisms. Evidence of nonlinear scaling of urban indicators in cities, such as wages and serious crime, has motivated the understanding of cities as complex systems, a perspective that offers insights into resource limits and sustainability but that usually neglects details of the indicators themselves. Notably, since the nineteenth century, criminal activities have been known to occur unevenly within a city; crime concentrates in such a way that most offenses take place in a few regions of the city. Though confirmed by different studies, this concentration lacks broad analyses of its characteristics, which hinders not only the comprehension of crime dynamics but also the proposal of sound countermeasures. Here, we developed a framework to characterize crime concentration that divides cities into regions with the same population size. We used disaggregated criminal data from 25 locations in the U.S. and the U.K., spanning from 2 to 15 years of longitudinal data. Our results confirmed that crime concentrates regardless of city and revealed that the level of concentration does not scale with city size. We found that the distribution of crime in a city can be approximated by a power-law distribution with exponent

Cities are the fundamental drivers of human societies; their capability to bring individuals together fosters innovation, wealth creation, and economic growth, but unfortunately they suffer from problems such as pollution, disease spread, and more pervasively, crime. Yet, even though crime is a danger to the development of cities, and counter-measures are greatly desired, we still fail to understand its structure and dynamics [

The very notion of a city bringing people together to interact comprises the idea of emergence of self-organized coordination derived from local activities [_{c}〉 that two peers of an individual interact presents scale-invariance with 〈_{c}〉 ≈ 0.25 [

Accordingly, human dynamics also play a major role in criminal activities, which are likely to drive patterns in crime activity [

Here we develop a framework to assess the distribution of criminal activities in cities by dividing the area of a city into regions with equal population size and aggregating offenses that happened within the same regions. This general framework allows us to perform a comprehensive analysis of the allometric relationship between crime distribution and city population. We examined criminal data from locations in the United States and the United Kingdom, and found not only that crime concentrates regardless of city, but also that population size has no influence on the level of concentration, despite the relationship between crime and total population. Crime concentration manifests in the probability distribution of crime across a city which can be described by a power law

Our analysis of crime concentration is based on official disaggregated data sets of criminal occurrences from locations of different population size from the United States and the United Kingdom, summarized in

United States (cities)

| Location | Population | Period | #Records | Location | Population | Period | #Records |
|---|---|---|---|---|---|---|---|
| Atlanta/GA | 447,841 | 2009–2015 | 241,070 | Los Angeles/CA | 3,928,864 | 2012–2015 | 944,039 |
| Baltimore/MD | 622,104 | 2011–2015 | 261,446 | New York/NY | 8,550,405 | 2006–2015 | 1,123,466 |
| Baton Rouge/LA | 229,426 | 2011–2015 | 803,934 | Philadelphia/PA | 1,567,442 | 2006–2015 | 747,743 |
| Boston/MA | 645,966 | 2012–2015 | 268,057 | Portland/OR | 609,456 | 2004–2014 | 649,349 |
| Chattanooga/TN | 173,366 | 2011–2012 | 155,241 | Raleigh/NC | 431,746 | 2005–2015 | 492,899 |
| Chicago/IL | 2,695,598 | 2001–2015 | 6,000,707 | San Francisco/CA | 837,442 | 2003–2015 | 1,856,293 |
| Dallas/TX | 1,258,000 | 2014–2015 | 161,998 | Santa Monica/CA | 92,472 | 2006–2015 | 92,456 |
| Denver/CO | 649,495 | 2011–2015 | 366,352 | Seattle/WA | 652,405 | 2008–2015 | 610,079 |
| Hartford/CT | 125,017 | 2005–2015 | 516,043 | St. Louis/MO | 318,416 | 2008–2015 | 301,713 |
| Kansas City/MO | 467,007 | 2009–2015 | 2,679,336 | | | | |

United Kingdom (police forces)

| Police force | Population | Period | #Records | Police force | Population | Period | #Records |
|---|---|---|---|---|---|---|---|
| Cleveland | 566,740 | 2011–2015 | 446,625 | Leicestershire | 1,005,558 | 2011–2015 | 439,950 |
| Metropolitan | 8,538,689 | 2011–2015 | 5,377,392 | North Wales | 687,937 | 2011–2015 | 330,527 |
| Greater Manchester | 2,732,854 | 2011–2015 | 1,701,428 | West Yorkshire | 2,264,329 | 2011–2015 | 1,337,565 |

See

We divided each city into regions with the same population size and analyzed the distribution of the number of offenses that occurred within each region. When dividing a city into areas with the same population size, it is important to note that there is a very large number of possible divisions. Hence, for each city _{c} same-population divisions of the city (see

(A) The Lorenz curves of the distributions of crime in the regions of Chicago reveal a higher tendency toward concentration for thefts than for robberies and burglaries, a tendency that (B) seems to occur systematically in all considered cities: theft concentrates more than robbery, and robbery more than burglary. The difference between these types of crime manifests itself in their respective estimated complementary cumulative distributions. For instance, (C) the probability of a place with a high rate of burglaries in Chicago decays almost as fast as an exponential with λ = 0.11, while the curve for thefts follows approximately a power-law with
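As an illustration of how a Lorenz curve can be built from per-region offense counts, the following minimal sketch sorts regions by count and accumulates their share of crime. The function names and the example counts are hypothetical, and the Gini coefficient is added only as a common one-number summary of the curve; it is not necessarily a quantity used in this study.

```python
def lorenz_curve(counts):
    """Return (xs, ys): cumulative share of regions vs. cumulative share of crime.

    Regions are sorted from the fewest to the most offenses, so a curve far
    below the diagonal means crime is concentrated in few regions.
    """
    ordered = sorted(counts)
    total = sum(ordered)
    if total == 0:
        raise ValueError("at least one offense is required")
    n = len(ordered)
    xs, ys = [0.0], [0.0]
    running = 0
    for i, c in enumerate(ordered, start=1):
        running += c
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def gini(counts):
    """Gini coefficient: 1 minus twice the trapezoidal area under the Lorenz curve."""
    xs, ys = lorenz_curve(counts)
    area = sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
               for i in range(1, len(xs)))
    return 1.0 - 2.0 * area


# Hypothetical counts for four equal-population regions (not data from the study).
evenly_spread = [10, 10, 10, 10]
concentrated = [0, 0, 0, 40]
print(gini(evenly_spread), gini(concentrated))
```

Because the regions hold the same number of residents, the horizontal axis can be read as a share of population as well as a share of regions.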

To assess the regularities in the concentration of crime, we fit the distribution of crime in each arrangement with the following distributions: power law, truncated power law, lognormal, exponential, and stretched exponential; we then compare them using the likelihood ratio test [α_{t} ≈ 2.44 for theft, α_{r} ≈ 3.31 for robbery, and α_{b} ≈ 5.45 for burglary, in agreement with the Lorenz curves given that higher values for α_{t} is between 2.1 and 3.0; whereas the exponents for burglaries α_{b} and robberies α_{r} vary in wider ranges with α_{r} between 2.4 and 4.1, and α_{b} between 2.9 and 6.0 (see α_{t} suggests independence of the dynamics of theft from the idiosyncrasies of the cities; whereas the high variance of α_{r} and α_{b} suggests a dependency on the characteristics of the city, such as city layout and demographics. Despite the differences between exponents, our results showed that α_{t} ≤ α_{r} ≤ α_{b} in all the cities with the exception of Santa Monica (α_{r} < α_{t}), and of Baton Rouge and Atlanta (α_{b} < α_{r}). Though the regions in the cities have the same population size, the distributions of crime in the regions are highly skewed and depend on crime type.
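The fitting and comparison step can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it treats counts as continuous, takes the lower cutoff `xmin` as given rather than estimating it, and compares a power law against an exponential only. The exponent is written `alpha` here by assumption, and all function names and the synthetic sample are hypothetical.

```python
import math
import random


def fit_power_law(xs, xmin):
    """Maximum-likelihood exponent for p(x) ~ x^(-alpha), x >= xmin (continuous case)."""
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    log_sum = sum(math.log(x / xmin) for x in tail)
    if n == 0 or log_sum == 0:
        raise ValueError("need values strictly above xmin")
    alpha = 1.0 + n / log_sum
    loglik = n * math.log((alpha - 1.0) / xmin) - alpha * log_sum
    return alpha, loglik


def fit_exponential(xs, xmin):
    """Maximum-likelihood rate for p(x) = lam * exp(-lam * (x - xmin)), x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    n = len(tail)
    excess = sum(x - xmin for x in tail)
    if n == 0 or excess == 0:
        raise ValueError("need values strictly above xmin")
    lam = n / excess
    loglik = n * math.log(lam) - lam * excess
    return lam, loglik


def sample_power_law(alpha, xmin, n, rng):
    """Draw n synthetic values by inverse-transform sampling."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(42)
synthetic = sample_power_law(alpha=2.5, xmin=1.0, n=20000, rng=rng)
alpha_hat, ll_power = fit_power_law(synthetic, xmin=1.0)
lam_hat, ll_exp = fit_exponential(synthetic, xmin=1.0)
# A positive log-likelihood ratio favors the power law over the exponential.
log_likelihood_ratio = ll_power - ll_exp
print(alpha_hat, log_likelihood_ratio)
```

A full comparison would also assess whether the sign of the ratio is statistically significant and would repeat the comparison against the other candidate distributions.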

The allometric scaling of crime in cities suggests, however, a similar relationship between the concentration of crime and population size. To examine this relationship, we evaluate the statistical dependence between city size and the distribution of crime across the city. We employ Hoeffding’s test of independence H between the population size of the cities and the average power-law exponent

Though the growth of a city implies an increase in crime rates, the spatial concentration of offenses seems to be independent of the population size of the city. To test this, we employed Hoeffding’s independence test from which we could not reject the hypothesis that population size and the exponent of the power-law fit of the crime distribution are independent. In the case of thefts, the well-behaved
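For readers unfamiliar with Hoeffding's test, the sketch below computes the classic Hoeffding D statistic for two samples without ties and estimates a p-value by random permutation. The permutation step is a stand-in chosen for simplicity and is an assumption, not necessarily how significance was obtained in this study. The function names and the toy values are hypothetical.

```python
import random


def hoeffding_d(x, y):
    """Hoeffding's D (scaled by 30) for paired samples with no ties, n >= 5."""
    n = len(x)
    if n != len(y) or n < 5:
        raise ValueError("need two samples of equal length >= 5")

    def ranks(values):
        order = sorted(range(n), key=lambda i: values[i])
        r = [0] * n
        for pos, i in enumerate(order, start=1):
            r[i] = pos
        return r

    R, S = ranks(x), ranks(y)
    # Q[i] = 1 + number of points that are smaller than point i in both coordinates.
    Q = [1 + sum(1 for j in range(n) if x[j] < x[i] and y[j] < y[i])
         for i in range(n)]
    d1 = sum((q - 1) * (q - 2) for q in Q)
    d2 = sum((r - 1) * (r - 2) * (s - 1) * (s - 2) for r, s in zip(R, S))
    d3 = sum((r - 2) * (s - 2) * (q - 1) for r, s, q in zip(R, S, Q))
    num = (n - 2) * (n - 3) * d1 + d2 - 2 * (n - 2) * d3
    return 30.0 * num / (n * (n - 1) * (n - 2) * (n - 3) * (n - 4))


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Share of shuffled pairings whose D is at least as large as the observed D."""
    rng = random.Random(seed)
    observed = hoeffding_d(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if hoeffding_d(x, shuffled) >= observed - 1e-12:
            hits += 1
    return (1 + hits) / (n_perm + 1)


# Toy values only: a perfectly monotone relationship, i.e. strong dependence.
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
values = [2 * s + 1 for s in sizes]
print(hoeffding_d(sizes, values), permutation_pvalue(sizes, values))
```

A large p-value means independence cannot be rejected, which is the outcome reported here for population size and the power-law exponent.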

To assess the dynamics of crime, we measure the entropy of the positions in the rank of criminal spots over time. We thus divide each data set into temporal intervals using two procedures: amount-based and time-based. In the former, data are aggregated every _{w} records, whereas the latter aggregates data every _{w} days. The two approaches are used to take into account possible discrepancies in crime dynamics due to the existence of cities with high and low crime rates. To analyze the relative variation of crime in a given city, we first rank its regions by the amount of crime using each instance of aggregation _{i}(_{a} and _{t} of each considered city and type of crime. We used two years of data so that the considered cities can be compared, given that the shortest span of longitudinal data among all cities is two years. In the case of _{t}, we aggregated data every _{w} = 7, which allows us to capture weekly variation and guarantees enough instances of aggregation to calculate the probabilities. For the amount-based approach, we constructed _{a} for each city by aggregating every

Though criminal activities exhibit regularities in their spatial concentration, the relative amount of crime in the regions of the city changes continuously over time. For that, we calculated the Shannon entropy of the positions in the criminal ranks of regions _{t} and _{a} which are created using the number of offenses aggregated by time and by the total amount of crime, respectively. In the case of _{a}, we used data slices of size that (A) minimizes the entropy of the first position of the rank in order to measure (B–C) the entropies of the positions in the rank for all considered U.S. cities. For the time-based rank, weekly data allowed us to measure (D–F) the entropies with respect to time. The overall high entropy in the positions of both ranks indicates that crime is likely to fluctuate across the city, leading to uncertainty about the regions in the rank; still, the most criminal regions have the tendency to be the same ones.
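The entropy of a rank position can be illustrated with a short sketch: for every aggregation window, regions are ranked by number of offenses, and for each position the Shannon entropy of the regions that occupied it is computed. The base-2 logarithm, the tie-breaking rule, the function names, and the toy windows are assumptions made for this example.

```python
import math
from collections import Counter


def rank_regions(counts):
    """Order region ids from most to fewest offenses; ties are broken by region id."""
    return sorted(counts, key=lambda region: (-counts[region], region))


def position_entropies(windows):
    """Shannon entropy (bits) of the region occupying each rank position.

    `windows` is a list of dicts mapping region id -> offenses in that window;
    every window is assumed to contain the same set of regions.
    """
    ranks = [rank_regions(w) for w in windows]
    entropies = []
    for pos in range(len(ranks[0])):
        occupants = Counter(rank[pos] for rank in ranks)
        total = sum(occupants.values())
        probs = [c / total for c in occupants.values()]
        entropies.append(sum(-p * math.log2(p) for p in probs))
    return entropies


# Toy windows: region A is always the hottest spot, B and C keep swapping places.
toy_windows = [
    {"A": 9, "B": 5, "C": 1},
    {"A": 8, "B": 2, "C": 4},
    {"A": 7, "B": 6, "C": 3},
    {"A": 9, "B": 1, "C": 2},
]
print(position_entropies(toy_windows))
```

In the toy case the first position has zero entropy because the same region always holds it, while the other two positions carry one bit each.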

We found that most positions in both ranks tend to have high entropy with sample means _{t}, as seen in _{a}. In other words, we have more certainty about the whereabouts of the hottest spots of theft than the hottest spots of robbery and burglary. Similarly, our results showed that the regions with few crimes are usually the same ones. As depicted in

The entropy in the ranks of a given city (A–B) increases rapidly with position, reaching a peak in which the uncertainty about the regions in this interval of positions is the highest for the particular city. After this range of minimal information, the entropy drops to an interval of steady entropy, then finally decreases to zero entropy. The intervals of increasing, highest, and steady, can be seen as different categories of regions in the criminal ranks. The steady-entropy positions vanish when the ranks are created with unstable sorting algorithms, which means that these positions hold criminal regions with a similar number of offenses. Not only regions but also (C) some cities present similar dynamics of crime—as also seen in

Beyond categories of regions, we also found categories of cities. The curves of

Crime is ubiquitous in cities but still lacks a quantitative understanding. To characterize crime in cities, we examined criminal activities in 25 locations from two different countries using longitudinal data sets spanning 2 to 15 years. We developed a method to assess the spatial concentration of crime that divides a city into regions based on the resident population, and then analyzed the distribution of crime in the regions. In all considered cities, we confirmed previous studies and found that offenses take place in a few regions of a city. We performed a comprehensive statistical characterization of the phenomenon and showed that crime not only concentrates but also presents a concentration level that depends on the type of crime and is independent of the size of the city, despite the relationship between population and number of crimes. Yet, though cities have such regularity in the concentration of crime, our results revealed that criminal ranks in the cities tend to change over time.

The regularities in the concentration of crime coupled with the constant displacement of crime suggest an understanding of crime as a complex system. Criminal activities flow continuously across the city while maintaining the organization of the system in such a way that its dynamics and regularities appear to be scale-invariant. Different types of crime exhibit particular dynamics that lead to distinct levels of concentration and allometric scaling laws. Our results revealed that thefts present a well-behaved concentration across cities, which indicates invariance with city size and with the idiosyncrasies of cities, while burglaries and robberies are more dependent on the city. These findings are particularly intriguing in light of the superlinear scaling found in thefts in contrast to the linearity in burglaries, though more conclusive analyses of the scaling laws of robberies are still needed (see

The characterization of crime paves the way for a better understanding of crime dynamics and provides the means to create and validate models. Though the proposal of a generative mechanism is beyond the scope of the present study, our framework can be employed for modeling given its implicit network of regions, which can be used to represent a city. A theory or model attempting to explain this complex phenomenon has to conform to the skewed distribution of crime and the existence of distinct concentrations of offenses for different types of crime. For instance, models for burglary are expected to be more dependent on features of the city such as the layout of the streets or demographics. One should not conclude that we argue for any universality of power laws here; instead, we present statistical characteristics of criminal activities that we systematically found in different locations [

The perspective of crime as a complex system demands analyses that cover the system as a whole. The connectedness of the city suggests that one should resist neglecting the “cold” areas by studying solely the hotspots of crime. Moreover, our results suggest that areas of high concentration of crime are expected to exist as the city grows, a finding that calls for proper government policies. Still, the notion of the city as a process implies that static policies are likely to fail and, as such, policy-makers should pursue evolving strategies based on real-time data [

Since police departments employ different nomenclature for types of crime as well as different subcategories of offenses, we preprocessed the records in order to group together thefts, burglaries, and robberies (as described in

To split a city into regions with the same population size, we use census data in order to build a graph with nodes that each represent roughly the same number of people and divide this graph into _{i} of _{i} random coordinates is created for each census block _{i} of a place _{i} is the number of people in _{i} and each x–y coordinate is uniformly generated within the geographical shape of the block. The nodes of the graph are created based on the cells of each Voronoi diagram _{i} that is constructed from each _{i}, and an edge exists between two nodes if their respective cells are neighbors of each other. Finally, this graph can be partitioned using a graph partitioning algorithm in order to generate regions (i.e., partitions) with approximately the same population size [_{c} has to be chosen to allow us to examine crime distribution. In all data sets we analyzed, we found that the number of regions that contain at least one offense ^{n ≥ 1} increases with the total number of regions ^{n ≥ 1} saturates at a point ^{n ≥ 1}(_{u}) = _{c} = _{u} with
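The first step of this procedure, generating random coordinates uniformly inside a census block, can be sketched with rejection sampling. The ratio `people_per_point` is a hypothetical parameter introduced for this example, since the exact relation between a block's population and its number of points is not recoverable from the text above. The Voronoi construction and the graph partitioning are not shown.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_block_points(polygon, population, people_per_point, rng):
    """Draw population // people_per_point points uniformly within the block shape."""
    n_points = population // people_per_point
    xs = [vx for vx, _ in polygon]
    ys = [vy for _, vy in polygon]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    points = []
    while len(points) < n_points:
        # Propose a point in the bounding box and keep it only if it is inside.
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points


# Hypothetical L-shaped census block with 1,200 residents, one point per 100 people.
block_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
block_points = sample_block_points(block_shape, population=1200,
                                   people_per_point=100, rng=random.Random(7))
print(len(block_points))
```

Each generated point then stands for roughly the same number of residents, which is what allows a balanced graph partition to yield regions of similar population.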
