Settlement size predicts extreme variation in the rates and magnitudes of many social and ecological processes in human societies. Yet, the factors that drive human settlement-size variation remain poorly understood. Size variation among economically integrated settlements tends to be heavy tailed such that the smallest settlements are extremely common and the largest settlements extremely large and rare. The upper tail of this size distribution is often formalized mathematically as a power-law function. Explanations for this scaling structure in human settlement systems tend to emphasize complex socioeconomic processes including agriculture, manufacturing, and warfare—behaviors that tend to differentially nucleate and disperse populations hierarchically among settlements. But, the degree to which heavy-tailed settlement-size variation requires such complex behaviors remains unclear. By examining the settlement patterns of eight prehistoric New World hunter-gatherer settlement systems spanning three distinct environmental contexts, this analysis explores the degree to which heavy-tailed settlement-size scaling depends on the aforementioned socioeconomic complexities. Surprisingly, the analysis finds that power-law models offer plausible and parsimonious statistical descriptions of prehistoric hunter-gatherer settlement-size variation. This finding reveals that incipient forms of hierarchical settlement structure may have preceded socioeconomic complexity in human societies and points to a need for additional research to explicate how mobile foragers came to exhibit settlement patterns that are more commonly associated with hierarchical organization. We propose that hunter-gatherer mobility with preferential attachment to previously occupied locations may account for the observed structure in site-size variation.
Citation: Haas WR Jr, Klink CJ, Maggard GJ, Aldenderfer MS (2015) Settlement-Size Scaling among Prehistoric Hunter-Gatherer Settlement Systems in the New World. PLoS ONE 10(11): e0140127. doi:10.1371/journal.pone.0140127
Editor: Bin Jiang, University of Gävle, SWEDEN
Received: June 22, 2015; Accepted: September 22, 2015; Published: November 4, 2015
Copyright: © 2015 Haas et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This study was supported by the National Science Foundation BCS-0331992 awarded to Haas, American Philosophical Society Lewis and Clark Fund awarded to Haas, National Science Foundation BCS-9816313 awarded to Aldenderfer, Wenner Gren Foundation 6174 awarded to Klink, and National Science Foundation BCS-0331992 awarded to Maggard. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Extreme settlement-size variation predicts extreme variation in the rates and magnitudes of many social and ecological processes in human societies including the rates and magnitudes of technological innovation, disease transmission, crime, and wealth [1–8]. Understanding how extreme settlement-size variation self-organizes (sensu ) and persists in human societies is therefore relevant to modeling such processes. Yet, the behavioral basis for settlement-size variation remains poorly understood [10–12]. Previous demographic research on hierarchically organized societies has observed that settlement-size variation, whether measured by census counts or areal extents, is heavy-tailed with the largest settlements in the upper tail of the distribution tending to exhibit scale-free, or power-law structure such that f(x) ∝ x^(−α), where x is settlement size and α is a scaling exponent [13–15]. Scholars commonly link this variation to central-place theory (sensu ) with various combinations of agriculture, specialized craft production (i.e., manufacturing), elite competition, and warfare driving hierarchical order among settlement systems by differentially dispersing and nucleating populations [11,17–23]. Given such behavioral models, we might expect hierarchical settlement patterns to be absent among hunter-gatherer societies, which often lack the complex socioeconomic drivers enumerated above.
Current studies of human settlement-size variation are overwhelmingly biased toward modern and historical settlement systems of western cultures [10,16,24–26], thus limiting our ability to evaluate the social and environmental contexts that foster the self-organization of hierarchical settlement structure. Considerable attention has been given to the structure of U.S. settlement systems, for example. The largest cities in U.S. settlement systems appear to exhibit power-law scaling in their size distribution [10,11,14,24,27]. Archaeological research extends the scope of settlement-size studies to include the settlement systems of prehistoric non-western cultures, especially state-organized societies such as Maya, Mesopotamia, and Tiwanaku [19,20,28,29]. To a lesser extent, non-state agricultural societies have also been examined [30,31]. Comparable analyses of settlement-size structure among hunter-gatherer societies are rare ostensibly because conventional wisdom holds that hierarchical structure of any form is antithetical to egalitarian hunter-gatherer economies. Yet, there is reason to suspect power-law structure in the absence of socioeconomic complexity. Some gregarious animal species appear to exhibit power-law scaling in their group sizes , and some predatory animals appear to exhibit power-law structure in the distribution of waiting times—the time spent in a given location before moving to another . Analogous forms of either of these behaviors—differential aggregation or waiting times—could conceivably generate power-law structure in archaeological settlement-size variation in hunter-gatherer settlement systems.
Hamilton et al.  made the surprising observation that log-linear structure characterized group-size variation among 339 ethnographic hunter-gatherer settlement systems. Though the semi-quantitative data structure may or may not reflect power-law scaling per se, it is indicative of heavy-tailed structure and suggests the possibility that power-law structure is a property of hunter-gatherer group-size variation. Moreover, two other studies have argued that power-law scaling characterized waiting times in ethnographic !Kung foraging patterns [35,36]. While these novel studies represent key data points in our understanding of hunter-gatherer settlement structure, the fact that all ethnographic hunter-gatherers were economically integrated with sedentary agricultural and industrial societies to some degree [37,38] raises the concern that the observed scaling patterns are a result of those socioeconomic relations and not an endogenous property of hunter-gatherer systems.
Resolution of the question of power-law scaling in hunter-gatherer settlement size holds implications for our understanding of how complex socioeconomic structure self-organizes in human societies. The immediate goal of this analysis, then, is to reject the null hypothesis of power-law scaling among hunter-gatherer settlement systems. We analyze the statistical structure of settlement-size variation among eight prehistoric hunter-gatherer settlement systems that unequivocally existed in the absence of agricultural and industrial economies. We assume that if waiting times among hunter-gatherer settlements were power-law distributed, then artifact quantities among those sites should also be power-law distributed. If co-resident group size was power-law distributed, then both site area and artifact quantities should be power-law distributed.
The case studies examined here include 1779 temporally diagnostic artifacts from 405 archaeological sites. The total span of occupation for the sample includes more than 4000 years of the Early and Middle Holocene Epoch and three distinct environmental contexts including montane Peru, coastal Peru, and interior U.S. Southwest. While our analysis is able to reject the power-law hypothesis for site-area variation, it is unable to reject the power-law hypothesis for hunter-gatherer site-size variation as measured by artifact-per-site counts. Although this finding should not be construed as an assertion of power-law scaling, it strongly supports the presence of heavy-tailed statistical structure in the data and offers contingent support for power-law scaling. This support forms an independent and convergent line of empirical evidence that is immune to the biases faced by ethnographic data, albeit subject to its own biases. This paper describes the materials, methods, and results of the analysis. We conclude with a discussion of the study's implications for models of hunter-gatherer mobility and the self-organization of complexity in human societies.
Materials and Methods
To test the hypothesis of power-law structure in hunter-gatherer settlement-size variation, we examine the size distribution of prehistoric archaeological sites in hunter-gatherer settlement systems. This section describes the sample, the archaeological proxies of settlement size, and the procedure used to test for power-law scaling.
The sample analyzed here consists of 1779 temporally diagnostic artifacts from 405 archaeological sites representing eight prehistoric New World settlement systems and three distinct arid environments (Figs 1 and 2, ). Each system represents an unequivocal hunter-gatherer economy marked by economic dependence on wild resources and high degree of residential mobility. Agricultural neighbors were either absent or highly unlikely in all cases. Environmental contexts range from 16° south latitude to 33° north latitude, sea level to over 3800 masl, seasonal to cold effective temperature regimes, arid to hyper-arid precipitation regimes, and desert to grassland biomes. The broad temporal and environmental scope of this sample serves to explore the generality of settlement-size structure within a narrow hunter-gatherer economic regime. For each settlement system, field researchers conducted systematic pedestrian surveys that recorded site locations and areal extents. Sites consist of spatially discrete artifact clusters separated by artifact-free expanses. One-hundred-percent surface collections of temporally diagnostic artifacts were conducted in each case. Artifact looting is absent or negligible in all cases, thus minimizing a potential source of sample bias. The surprising lack of projectile-point looting is attributable to cultural and historic circumstances specific to each region. In the Titicaca Basin, we have yet to observe an instance of avocational projectile-point collection during our combined experience in the region which spans over 30 years. In the Jequetepeque region, looters tend to overlook lithic artifacts in favor of gold and pottery from later contexts [40,41]. Sites on the Gila River Indian Reservation have remained relatively protected from collection due to cultural prohibitions against disturbing prehistoric sites .
The first environmental context considered is the western Lake Titicaca Basin in the Andes Mountains of highlands Peru. Elevations range from 3800 masl at Lake Titicaca to 6400 masl at the peak of Cerro Janq'u Uma. Human populations intensively inhabited the lower elevations in an environment known as the Altiplano—a vast expanse of rolling-hill grasslands dissected by perennial rivers and flanked by mountains . Precipitation varies from approximately 300 to 900 mm/yr depending on elevation, local physiography, and climatic conditions. Mean daily temperature lows and highs range from -10°C to 19°C according to a seven-year period of record for the Inca Manco Cápac International Airport weather station in Juliaca, Peru.
Two study areas within the Titicaca Basin were examined. The first study area is the Río Ilave Basin, centered at 16°12'40"S, 69°43'20"W (WGS 1984). Elevations range from approximately 3830 to 3900 masl with adjacent mountains to 4600 masl. In a 41-km² sample area of the Ilave Basin, Aldenderfer and colleagues recorded 90 archaeological sites with Archaic Period hunter-gatherer artifacts . In addition, the first author revisited 24 of those sites and recorded 6 new sites. The second study area in the Titicaca Basin is the Río Huenque study area, centered at 16°45'50"S, 69°43'40"W (WGS 1984) . The 33-km² sample area occurs in a relatively restricted valley where elevations range between approximately 3940 and 4070 masl. Surrounding mountains rise to 5100 masl. Klink recorded 139 archaeological sites with hunter-gatherer artifacts in this area .
A detailed projectile point typology allows us to divide the long hunter-gatherer occupation of the Andean Altiplano into Early (11,500–9000 cal. B.P.), Middle (9000–7000 cal. B.P.), and Late (7000–5000 cal. B.P.) Archaic periods . All of these periods represent subsistence economies reliant on vicuña (a wild camelid), taruca (Andean deer), wild seeds, and wild tubers . The three temporal divisions in conjunction with the two environmental sub-contexts comprise six archaeological settlement systems.
The second environmental context considered is the Jequetepeque coastal plain and foothills of northern Peru, 1400 km northwest of the Altiplano study area. The extremely arid environment rarely receives more than 50 mm of rainfall per year. Mean daily temperature lows and highs range from 16°C to 31°C according to a 19-year period of record at the Capitán FAP José A. Quiñones Gonzales Airport weather station in Chiclayo, Peru. However, the Pacific littoral and lush alluvial plains offer highly productive, localized resource zones with diverse and often-abundant marine and terrestrial resources . Dillehay and Maggard [50,51] conducted archaeological settlement surveys covering 70 km² centered at 7°9'11"S, 79°22'31"W (WGS 1984). The efforts recorded 126 hunter-gatherer sites with material evidence of Paijan culture—a hunter-gatherer tradition marked by distinctive flaked stone technologies that persisted from approximately 11,000–8500 cal. B.P. [51,52].
The third environmental context considered is located in the Middle Gila River of the U.S. Southwest at 33°8'53"N, 111°51'10"W, approximately 6000 km to the northwest of the Jequetepeque region. The Sonoran Desert environment averages 200 mm of precipitation per year. Mean daily temperature lows and highs range from 3°C to 40°C according to a 21-year period of record at the Casa Grande Municipal Airport weather station in Casa Grande, Arizona. Surface water is scarce and ephemeral. Major hunter-gatherer subsistence resources include bighorn sheep, whitetail deer, rabbit, mesquite seed pods, and cactus fruit . Gila River Indian Community archaeologists reported Middle Archaic Period (5000–4000 cal. B.P.) projectile-point counts for 50 archaeological sites in a 591-km² area . These counts comprise the eighth and final case study investigated here.
Identification numbers for previously unpublished specimens are presented in S1 Specimens. All necessary permits were obtained for the described study, which complied with all relevant regulations. The archaeological specimens that inform this study were collected and curated in compliance with Peruvian law as stipulated in Resolución Directoral No 064-2013-DGPA-VMPCIC/MC issued to Haas, Aldenderfer, and Carlos Viviano Llave; DGPA-0122-2002 issued to Maggard and colleagues; and C038-98 issued to Klink by the Ministry of Culture, Republic of Peru. The Altiplano field collections are temporarily housed at the Collasuyo Archaeological Research Institute (Jr. Nicaragua 199, Puno, Puno, Peru) as stipulated in the respective research permits. The collections will be permanently curated at an official Ministry of Culture artifact repository to be determined at the time of dispossession. The Jequetepeque region collection is permanently curated at the Ministry of Culture's Huaca Arco Iris facility in Trujillo. No permits were required to use the previously published Gila River data.
Measuring Settlement Size
We use two archaeologically tractable metrics of settlement size—artifact count and areal extent. It is first important to consider the possibility that the very act of defining sites might generate power-law structure in site-size variation (cf. ). Defining sites is a subjective process in which field investigators identify artifact clusters, define their boundaries, and if necessary decide whether to split or aggregate adjacent clusters. In the study regions considered here, inter-cluster distances tend to be large relative to cluster size; thus, distinguishing artifact clusters is relatively unambiguous in most instances. Regardless of the degree of ambiguity, we are unable to identify a clear theoretical link between the site-definition process and any of the generic mechanisms known to produce power-law or power-law-like structure [13,55]. We therefore currently have no theoretical reason to suspect that the site-definition process poses a confounding factor in this study.
Given that sites represent behaviorally meaningful units of analysis, a site's artifact count is considered a relative proxy for person-hours of occupation, or cumulative waiting times. In general, the greater the number of individuals that occupy a settlement and the longer it is occupied, the greater the deposition of cultural materials. Thus, if hunter-gatherer waiting times were power-law distributed as previous studies have suggested [35,36], then we would expect to find that artifact-per-site counts are similarly distributed. To avoid over-estimation of artifact counts due to extraneous periods of activity, only temporally diagnostic artifacts for the periods of interest are used. Temporally diagnostic artifacts include projectile points in all eight settlement systems under consideration. In the Jequetepeque case, other temporally diagnostic tool types include bifacial preforms, scrapers, and limaces. Limaces are unifacial, steep-edged flaked stone tools of uncertain function and are unique to Paijan culture . In the north coast of Peru, bifacial flaked stone tools, scrapers, and limaces generally do not extend beyond the period of Paijan occupation and therefore can be considered diagnostic of that period. In highlands Peru and the U.S. Southwest, this is not the case. Thus, we are restricted to projectile points as temporally diagnostic artifacts in those cases. The raw artifact counts used in this analysis are provided in S1 Dataset.
A site's areal extent is considered a relative proxy for co-resident population size. In general, the more individuals that contemporaneously occupy a location, the more horizontal space they require. Thus, if co-resident group sizes were power-law distributed as previous studies have suggested [35,36], then we would expect to find that site-area values are similarly distributed [14,56]. Although specific activities may also affect site artifact counts and areas, we assume they are reasonable proxies for hunter-gatherer occupation intensity. This assumption finds empirical support in the ethnoarchaeological work of Yellen .
We compiled site-area estimates as reported by field analysts for seven of the eight settlement systems analyzed. Site-area estimates are not available for the Gila River case; however, as will be seen below, this omission does not affect the consistent results obtained in the other cases. In general, all of the field procedures for site-area estimation entailed qualitative identification of the maximal extents of artifact dispersion followed by maximal length and width estimations using tape or pace-based methods. Length and width dimensions were then multiplied to give area estimates.
Estimation of archaeological site area is potentially confounded by several sources of error. In this study, error sources include subjectivity in defining sites, imprecision in estimation of site dimensions, the use of length and width to estimate the areas of non-rectangular entities, and post-depositional movement of artifacts. Nonetheless, none of these factors are likely to confound this analysis because the error in each case tends to be linearly distributed and is thus insignificant relative to the question of non-linear structure in site-size variation. In other words, if site-area variation is truly power-law distributed, then such linear sources of error would be insufficient to bias the data to the extent that they would mask the extreme variation that non-linear power-law models entail. Moreover, the cumulative effects of such error sources would also be unlikely to confound tests of power-law structure. Such multiplicative effects are well known to generate lognormal variation as opposed to power-law variation [13,55].
Another confounding factor relates to reuse of sites by exogenous populations. Such occupations can inflate site-area estimates and therefore confound the settlement-area signal of the target system. We employ several bias-control measures to minimize this effect. For the Altiplano settlement systems, a site's area is included in a given dataset only if the majority of the diagnostic projectile points for that site can be assigned to the period of interest. In doing so, it is likely that the site's area is primarily a function of occupation during the period of interest. We consider three thresholds for inclusion in the Altiplano site-area datasets: greater than 50, 70, and 90 percent of a site's temporally diagnostic artifacts assigned to the period of interest. Each threshold reflects a tradeoff between sample quality and size. Whereas higher thresholds offer more reliable samples, they tend to reduce the number of samples available for analysis. This sampling strategy produces 14 distinct datasets for the Altiplano study area (two subregions × three time periods × three data thresholds, minus four equivalent pairs).
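The inclusion rule and the dataset count can be illustrated with a short Python sketch. This is not the authors' code (their analysis is in R; see S1 Code). The site names, point counts, and the function `sites_for_period` are hypothetical, and reading "four equivalent pairs" as four duplicate datasets removed from the 18 possible combinations is an assumption.

```python
# Hypothetical counts of temporally diagnostic projectile points per site and period.
sites = {
    "site_A": {"early": 10, "middle": 0, "late": 0},
    "site_B": {"early": 6, "middle": 4, "late": 0},
    "site_C": {"early": 3, "middle": 7, "late": 0},
}


def sites_for_period(sites, period, threshold):
    """Return sites whose share of diagnostic points from `period` exceeds `threshold`."""
    selected = []
    for name, counts in sites.items():
        total = sum(counts.values())
        if total > 0 and counts.get(period, 0) / total > threshold:
            selected.append(name)
    return sorted(selected)


# Stricter thresholds give cleaner but smaller samples.
early_50 = sites_for_period(sites, "early", 0.50)
early_90 = sites_for_period(sites, "early", 0.90)

# Assumed reading of the count: 2 subregions x 3 periods x 3 thresholds, less 4 duplicates.
n_altiplano_datasets = 2 * 3 * 3 - 4
```

With three further Jequetepeque site-area datasets, this count is consistent with the 17 continuous datasets referred to in the results.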
In the Jequetepeque case where flaked stone tool traditions are more constrained in time, we take three approaches to control for site-area inflation. Again, each approach reflects a tradeoff between sample quality and size. First, we examine all sites with one or more Paijan artifacts. Second, only sites identified as single component with Paijan artifacts are examined. Third, sites with one or more Paijan artifacts and excluding sites with ceramic artifacts are examined because ceramics are associated with later agricultural occupations. Site-area estimates are provided in S1 Dataset.
Power-law Analysis of Settlement-Size Variation
Each dataset is analyzed in six steps in an effort to reject the hypothesis of power-law scaling. First, cumulative mass and cumulative density function (CMF and CDF, respectively) plots with logarithmic axes are used to inspect the data structure. Power-law distributions generate linear trends in such plots (i.e., they are log-linear) while other statistical structures tend to produce upwardly convex curves . Second, we use maximum likelihood estimation (MLE) to find the best-fit model parameters for each of a candidate set of statistical models [27,58]. Because artifact-count data are discrete integer data, we consider Poisson, geometric, and discrete power-law distributions. For site-area data, which consist of continuous data measured in square meters, the candidate set of statistical models that we consider includes normal, exponential, lognormal, and power-law (Pareto) distributions. Each of the statistical models considered has seen explicit or implicit use in the study of human settlement-size variation (e.g., [18,21,25]) and therefore merits consideration in our effort to reject the power-law hypothesis.
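As an illustration of the MLE step for the continuous (Pareto) case, the exponent has the standard closed-form estimate α̂ = 1 + n / Σ ln(x_i / x_min). The Python sketch below is a minimal example under that assumption and is not the authors' R code. The function names and the synthetic sample are hypothetical. The discrete power-law case has no closed form and needs numerical optimization, which is not shown.

```python
import math
import random


def power_law_mle(xs, xmin):
    """Closed-form MLE of the exponent for a continuous power law on x >= xmin."""
    tail = [x for x in xs if x >= xmin]
    alpha = 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)
    return alpha, len(tail)


def power_law_loglik(xs, alpha, xmin):
    """Log-likelihood under the density ((alpha - 1) / xmin) * (x / xmin) ** -alpha."""
    tail = [x for x in xs if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    return len(tail) * math.log((alpha - 1.0) / xmin) - alpha * log_sum


def sample_power_law(n, alpha, xmin, rng):
    """Inverse-transform sampling, used here only to make synthetic test data."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic "site areas" with a known exponent of 2.5 (hypothetical values).
areas = sample_power_law(5000, 2.5, 1.0, random.Random(42))
alpha_hat, n_tail = power_law_mle(areas, 1.0)
```

The log-likelihood returned by `power_law_loglik` is the quantity that an information-criterion comparison would take as input.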
Third, we assess the statistical plausibility of each model fit to the data using the goodness-of-fit test described by Clauset et al. . For each empirical dataset (i), consisting of ni sites, we first solve for the KS distance between the empirical data and the MLE model (Dm). Next, we draw a random sample of ni values from the MLE-generated statistical model. We then solve for the KS distance between the empirical and synthetic data (Ds). This procedure is then iterated 2500 times, and the fraction of times Ds is greater than Dm defines the probability (p) that the difference between the data and a given statistical model is a product of statistical chance alone. If p ≤ 0.10, then the model is rejected. If p > 0.10, then the model is considered plausible. The number of iterations and probability thresholds used here reflect the recommendations of Clauset et al.
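A bootstrap test of this kind can be sketched in Python as follows. The sketch follows the formulation published by Clauset et al., in which each synthetic sample is refit and its KS distance is measured against its own fitted model. Treating that as the procedure summarized above is an assumption. Only the continuous power-law case with a fixed lower bound is shown, and all function names are hypothetical.

```python
import math
import random


def fit_alpha(xs, xmin):
    """Continuous power-law MLE for the exponent, given the lower bound xmin."""
    return 1.0 + len(xs) / sum(math.log(x / xmin) for x in xs)


def ks_distance(xs, alpha, xmin):
    """Largest gap between the empirical CDF of xs and the power-law CDF."""
    ordered = sorted(xs)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        gap = max(gap, i / n - cdf, cdf - (i - 1) / n)
    return gap


def sample_power_law(n, alpha, xmin, rng):
    """Inverse-transform sampling from a continuous power law."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def bootstrap_p(xs, xmin, iterations=2500, seed=0):
    """Fraction of synthetic samples whose KS distance exceeds the observed one."""
    rng = random.Random(seed)
    alpha = fit_alpha(xs, xmin)
    d_model = ks_distance(xs, alpha, xmin)
    exceed = 0
    for _ in range(iterations):
        synth = sample_power_law(len(xs), alpha, xmin, rng)
        d_synth = ks_distance(synth, fit_alpha(synth, xmin), xmin)
        if d_synth > d_model:
            exceed += 1
    return exceed / iterations
```

Under the decision rule in the text, a returned value of p ≤ 0.10 would lead to rejection of the power-law model and p > 0.10 would leave it plausible.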
Fourth, we compare the relative information content of each statistically plausible model using Akaike information criterion (AIC) and AIC weights following the method described by Edwards et al. . Models that generate low AIC weights (w ≤ 0.10) are rejected in favor of those that produce high AIC weights (w > 0.10).
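For readers unfamiliar with AIC weights, the standard definitions are AIC = 2k − 2 ln L and w_i = exp(−Δ_i / 2) / Σ_j exp(−Δ_j / 2), where Δ_i is a model's AIC minus the smallest AIC in the candidate set. The Python sketch below applies them to made-up log-likelihoods. The numbers and function names are hypothetical and do not come from the paper's tables.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion for a fitted model."""
    return 2.0 * n_params - 2.0 * log_likelihood


def akaike_weights(aic_by_model):
    """Convert AIC values into weights that sum to one across the candidate set."""
    best = min(aic_by_model.values())
    relative = {m: math.exp(-0.5 * (a - best)) for m, a in aic_by_model.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


# Hypothetical log-likelihoods and parameter counts for one site-area dataset.
candidates = {
    "lognormal": aic(-120.0, 2),
    "exponential": aic(-121.0, 1),
    "power_law": aic(-126.0, 1),
}
weights = akaike_weights(candidates)

# Decision rule from the text: keep models with w > 0.10.
retained = sorted(m for m, w in weights.items() if w > 0.10)
```

In this invented example the lognormal and exponential models tie and are retained, while the power-law model falls below the 0.10 threshold.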
Fifth, we consider power-law structure in the upper tails of the distributions. Unlike other statistical distributions, theoretical power-law distributions are scale invariant, meaning that they obtain over an infinite range of values. Real-world phenomena, however, have finite size limits that restrict the potential range of applicability of power-law scaling . Even in datasets where power-law models can be rejected for the full range of data, the possibility remains that power-law structure pertains to some upper-tail fraction of the data . Defining this range is an analytical problem that must be addressed. To test for power-law scaling in the upper tails of the empirical distributions, we apply the iterative KS-test method of Clauset et al.  to find the most-probable threshold value (xmin) for the hypothesized power-law tail of a given data sample. We then use MLE to find a best-fit power-law model for the tail and the previously described goodness-of-fit test to assess the statistical plausibility of the model. An upper-tail power-law model is rejected if the difference between the data and the model is unlikely to be explained by statistical chance (p ≤ 0.10).
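The threshold search can be sketched in Python as a scan over candidate xmin values that keeps the one whose fitted tail has the smallest KS distance. This is a simplified illustration for continuous data only, not the authors' R implementation. The function names, the `min_tail` cutoff, and the synthetic sample are hypothetical.

```python
import math
import random


def ks_distance(tail, alpha, xmin):
    """Largest gap between the empirical CDF of the tail and the power-law CDF."""
    ordered = sorted(tail)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
        gap = max(gap, i / n - cdf, cdf - (i - 1) / n)
    return gap


def best_xmin(xs, min_tail=10):
    """Return (xmin, alpha, ks) for the candidate threshold with the smallest KS distance."""
    best = None
    for xmin in sorted(set(xs)):
        tail = [x for x in xs if x >= xmin]
        if len(tail) < min_tail:
            break  # later candidates leave even fewer points in the tail
        log_sum = sum(math.log(x / xmin) for x in tail)
        if log_sum <= 0.0:
            continue
        alpha = 1.0 + len(tail) / log_sum
        d = ks_distance(tail, alpha, xmin)
        if best is None or d < best[2]:
            best = (xmin, alpha, d)
    return best


# Synthetic sample: a uniform body below 10 and a power-law tail starting at 10.
rng = random.Random(7)
body = [rng.uniform(1.0, 10.0) for _ in range(200)]
tail = [10.0 * (1.0 - rng.random()) ** (-1.0 / 1.5) for _ in range(200)]
xmin_hat, alpha_hat, ks_hat = best_xmin(body + tail)
```

The fitted tail would then be passed to the goodness-of-fit test, since a minimum-KS threshold exists for any sample whether or not its tail actually follows a power law.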
Sixth and finally, we present a power analysis (not to be confused with power-law analysis), which serves two purposes. First, the power analysis serves to demonstrate that the methods and code function as intended. Second, the analysis serves to assess the probability of type I and II errors that might result given the sample size and statistical models under consideration. The power analysis consists of seven tests—one for each of the statistical models considered in this study. In each test, random samples are drawn from synthetic statistical distributions with known parameter values. The synthetic data are then analyzed with the same code used to analyze the empirical data. Sample sizes and parameter values for the synthetic data are selected at random from the set of sample sizes and parameter values observed in the empirical data. The procedure is iterated 100 times to evaluate the ratio of correct:incorrect model identifications. Correct identifications include those that produce insignificant p values (p > 0.1) and AIC weights (w > 0.1), indicating that the known model cannot be ruled out as providing a plausible fit to the data. Incorrect identifications include those that produce significant probability values (p ≤ 0.1) and AIC weights (w ≤ 0.1), indicating a poor fit between the model and the data despite the fact that their congruence is known. The ratio of correct:incorrect results gives a probabilistic measure of the procedure's efficacy, which we can then use to evaluate the robustness of the conclusions reached in the analysis of the empirical data. All calculations are performed using the R statistical computing language including functions from the MASS and poweRlaw packages [60–62]. The code is presented in S1 Code.
For the discrete artifact-count data, CMF plots reveal clear log-linearity in all datasets suggesting power-law statistical structure (Fig 3). Maximum likelihood estimations of model parameters are presented in Table 1 along with the statistical plausibility results. The goodness-of-fit test is unable to reject power-law structure in seven of ten datasets (Fig 4). Conversely, best-fit Poisson models provide plausible fits to the data in just one of seven datasets, and geometric models offer implausible fits to all datasets. Two datasets did not produce plausible fits to any of the statistical models considered. Given that each dataset produced a single plausible result, AIC comparison is not applicable to the artifact count data.
The power analysis for the discrete artifact-count data confirms the method's efficacy and suggests an exceedingly small chance of obtaining the results by statistical chance alone. The procedure correctly identifies 90 percent of the synthetic power-law data as consistent with power-law models (Fig 5, see also S1 Table). Power-law structure was never incorrectly identified given Poisson data and only once (1 percent) given geometric data. Accordingly, the power analysis results suggest an exceedingly small chance that the 7 datasets identified as consistent with power-law models came from Poisson or geometric distributions. Moreover, the fact that 10 of the 100 synthetic power-law samples were misidentified (10 incorrect:90 correct) as inconsistent with all of the models under consideration raises the possibility that type II error could account for the 2 of 8 empirical datasets found to be inconsistent with all considered models (2 incorrect:6 correct; Fisher's Exact Test odds ratio = 0.34, 95% C.I. = 0.05–3.86, p = 0.22). In sum, the artifact-per-site datasets are generally consistent with power-law models and inconsistent with the alternatives.
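The Fisher's Exact Test comparison can be checked by hand from the two ratios quoted above (10 incorrect:90 correct against 2 incorrect:6 correct). The Python sketch below computes the two-sided p-value from the hypergeometric distribution using only the standard library. The function name is hypothetical, and the sample odds ratio it reports is the simple cross-product, which differs slightly from the conditional estimate of 0.34 that R reports.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1.0 + 1e-9)
    )


# Rows: synthetic power-law samples, empirical datasets. Columns: incorrect, correct.
p_value = fisher_exact_two_sided(10, 90, 2, 6)
sample_odds_ratio = (10 * 6) / (90 * 2)
```

The computed p-value rounds to the 0.22 reported in the text.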
For the continuous site-area data, CDF plots reveal convex structure over the full range of the data, suggesting an absence of power-law structure (Fig 6). Moreover, the more-rigorous goodness-of-fit and AIC analyses indicate that power law models offer poor characterizations of the full range of data in all seven archaeological settlement systems. Power law models are plausible and parsimonious for only three of the seventeen datasets, and other statistical models are also plausible in those three cases (Tables 2 and 3, see Fig 4). Normal distributions also offer plausible and parsimonious characterizations for just three datasets. In contrast, lognormal models are plausible and parsimonious for 13 of the 17 datasets, and exponential distributions are plausible and parsimonious for 12 of the 17 datasets. Only one dataset did not produce a plausible fit to any of the models considered. Power-law models offer plausible fits to the upper tails of 8 of the 17 datasets.
The power analysis results for the continuous data are presented in Fig 7 (see also S2 Table). The results confirm the method's efficacy and suggest an exceedingly small chance of obtaining the analytical results by statistical chance alone. Given samples drawn from power-law models with parameter values in the range of the empirically estimated values, the procedure correctly identifies 97 percent as consistent with power-law models. Moreover, power-law structure is identified in the upper tails of 97 percent of the synthetic power-law data samples.
However, the procedure also incorrectly identifies models with low frequency. Because the empirical data show that power-law structure is highly unlikely to obtain over the full range of empirical data but is plausible for the upper tails of the data, we are most concerned here with how likely power-law structure is to be spuriously identified in the upper tails of non-power-law samples. The procedure incorrectly finds power-law structure in the upper tails of 67 percent of the lognormal data samples. The same misidentification occurs 27 percent of the time given exponential data. Normally distributed data generate upper tails that are spuriously identified as power-law distributed 9 percent of the time.
Recall that the archaeological site-area data produced 8 of 17 datasets with plausible power-law structure in the upper tails of the distributions. Given (a) that exponential, lognormal, and normal structure is found to be consistent with the full range of data 12, 13, and 3 times, respectively, in the 17 continuous archaeological datasets and (b) the proportion of times we expect those distributions to generate upper tails identifiable as power-law distributed, we would expect to have spuriously identified power-law structure in the upper tails of 0.27 × 12 + 0.67 × 13 + 0.09 × 3 ≈ 12 of 17 datasets. This expectation more than accounts for the 8 of 17 archaeological datasets with plausible power-law tails, leading us to conclude that in the case of site-area data, the observed plausibility of power-law structure in the upper tails may simply be an artifact of sample uncertainty.
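The expectation above can be reproduced with a few lines of arithmetic. The following is a minimal sketch that only restates the rates and counts reported in the text; the variable names are illustrative and are not taken from the S1 Code.

```python
# Share of synthetic samples whose upper tails were spuriously identified
# as power-law distributed, by generating model (values from the text).
false_positive_rate = {"exponential": 0.27, "lognormal": 0.67, "normal": 0.09}

# Number of the 17 site-area datasets for which each model was plausible
# over the full range of the data (values from the text).
plausible_datasets = {"exponential": 12, "lognormal": 13, "normal": 3}

# Expected number of spurious power-law tails contributed by each model.
terms = {
    model: false_positive_rate[model] * plausible_datasets[model]
    for model in false_positive_rate
}
expected_spurious = sum(terms.values())

observed_power_law_tails = 8  # datasets with plausible power-law upper tails

print(f"expected spurious identifications: {expected_spurious:.2f} of 17")
print(f"observed plausible power-law tails: {observed_power_law_tails} of 17")
```

The sum is about 12.2, which rounds to the 12 of 17 datasets cited above and exceeds the 8 observed.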
In sum, the results show that power-law models provide plausible and parsimonious characterizations of hunter-gatherer site-size variation when settlement-size is measured by artifact counts but that power-law models are unlikely to characterize hunter-gatherer settlement-size variation when settlement size is measured by areal extent.
On one hand, previous research on agricultural and state-organized societies has suggested that power-law scaling of settlement-size variation is a property of complex, hierarchical socioeconomic processes, which tend to be absent among hunter-gatherer societies. On the other hand, ethnographic research has suggested that power-law scaling characterized hunter-gatherer settlement-size variation. Although the latter claim would seem to trump the former, ethnographic data limitations cast some concern on the degree to which scaling properties are intrinsic to hunter-gatherer systems or are artifacts of economic connections with sedentary societies. To our knowledge, this paper presents the first rigorous analysis of hunter-gatherer settlement-size variation as observed through archaeological data. The analysis provides an independent, complementary test that is immune to the limitations faced by ethnographic observations, albeit with its own limitations.
We have reasoned that if hunter-gatherer waiting times varied as a power-law function, then artifact-per-site quantities in a given hunter-gatherer settlement system should also vary as a power-law function. If hunter-gatherer group-size varied as a power-law function, then both site areas and artifact quantities in a given hunter-gatherer settlement system should vary as a power-law function. Although the analysis rejected power-law scaling for site-area variation, it was unable to reject power-law scaling for hunter-gatherer site-size variation as measured by artifact-per-site counts. These conclusions are consistent with a model of power-law distributed site occupation spans, or cumulative waiting times, and inconsistent with a model of power-law distributed co-resident group sizes.
To be sure, our inability to reject power-law scaling in artifact-count variation should not be confused with assertion of power-law structure. Alternative statistical models may yet offer stronger fit to the data and should be explored as theory dictates. Regardless, it is clear that hunter-gatherer site-size variation in arid environments tends to exhibit heavy-tailed structure. Hunter-gatherer research now faces the challenge of explaining the structural properties described here. We currently lack models of hunter-gatherer mobility, social interaction, or site formation that explicitly predict this structure. Efficacious models will be those that predict (a) heavy-tailed statistical structure in the size-distributions of hunter-gatherer settlements as measured by cumulative occupation time and (b) exponential or lognormal structure as measured by co-resident group size.
A Preferential Attachment Model of Forager Mobility
We briefly consider a candidate model here to offer a potential guide for future research. The working model posits that heavy-tailed site-size variation in hunter-gatherer settlement systems was a property of long-term preferential attachment to places on landscapes. Preferential attachment is a term that statistical physicists use to describe a generic class of processes that entail feedback loops [13,55]. Importantly, preferential attachment is one of several mechanisms known to give rise to power-law structure. The “rich-get-richer” phenomenon is a classic example of a preferential attachment process; it gives rise to extreme wealth disparity among individuals in a given society and is often characterized by power-law (i.e., Pareto) models. We can readily imagine that preferential attachment to places on landscapes played an important role in hunter-gatherer residential mobility decisions, thus giving rise to heavy-tailed variation in the differential accumulation of site occupation times and thus artifacts. Anthropologists have suggested a variety of mechanisms by which humans become “attached” to places, including economic and symbolic mechanisms [63–67]. We might therefore imagine that as hunter-gatherers moved across landscapes to take advantage of seasonally available resources, they preferentially reoccupied certain locations to access previously discarded materials, cultural infrastructure, or cultural meaning. While most settlements would have experienced modest, short-term occupation and material accumulation, some locations would have experienced compounding occupation intensity that would have driven extreme material accumulation over the long term.
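To make the verbal model concrete, the sketch below simulates one simple way preferential attachment to places might operate. It is an illustration under stated assumptions, not the authors' model or part of the S1 Code: the function name, the fixed probability of founding a new site (`p_new_site`), and the rule that reoccupation probability is proportional to a site's prior occupation count are all hypothetical choices made here for illustration.

```python
import random


def simulate_site_occupations(n_moves, p_new_site, seed=0):
    """Return the number of occupations accumulated at each site.

    Assumed rule (hypothetical): at each residential move the group founds
    a new site with probability p_new_site; otherwise it reoccupies an
    existing site chosen with probability proportional to that site's
    number of previous occupations.
    """
    rng = random.Random(seed)
    occupations = [1]  # occupation count per site; start with a single site
    history = [0]      # site index of every past occupation event
    for _ in range(n_moves):
        if rng.random() < p_new_site:
            occupations.append(1)
            history.append(len(occupations) - 1)
        else:
            # Drawing uniformly from past events selects a site in
            # proportion to how often it was occupied before.
            site = rng.choice(history)
            occupations[site] += 1
            history.append(site)
    return occupations


counts = simulate_site_occupations(n_moves=20000, p_new_site=0.1, seed=42)
counts.sort(reverse=True)
median = counts[len(counts) // 2]

print("number of sites:", len(counts))
print("largest site (occupations):", counts[0])
print("median site (occupations):", median)
```

Under these assumptions, most simulated sites receive only one or a few occupations while a small number accumulate very many, which is the qualitative heavy-tailed pattern the working model describes for artifact accumulation.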
Importantly, this model does not predict power-law scaling in site-area variation and is thus consistent with the rejection of power-law scaling in the empirical analysis presented here. We might further consider the type of site-area variation that this preferential attachment model does predict. As a first approximation, the range of variation is expected to be more constrained than that of power-law variation. This is because temporally distributed reoccupations of locations entail some degree of spatial overlap, thus limiting a site's areal growth rate relative to its quantitative growth rate (i.e., artifact accumulation). Of the continuous statistical distributions considered in this analysis, normal, exponential, and lognormal models fit this general expectation of comparatively low dispersion. Furthermore, we can rule out normal distribution models given that site areas cannot be negative by definition. To decide between exponential and lognormal models, we consider site-formation processes in light of the generic processes known to give rise to exponential and lognormal structure. Site area is expected to vary as a function of co-resident group size, spatial non-overlap between sequential occupations, and the spatial dispersion of cultural materials in systemic and taphonomic contexts. Without additional theoretical guidance as to which of these factors are most important, we might simply consider a null expectation in which each variable randomly contributes to site-area variation. Because lognormal distributions are the product of many independent random events, we suggest lognormal site-area variation could be expected under the model of preferential attachment. Given the empirical results of this study, we currently cannot rule out this expectation of the preferential attachment model. However, it is important to note that exponential models also present plausible fits to the empirical data. Additional research is needed to determine which statistical model offers a better fit and whether or not there is a theoretical basis for exponential variation in site areas.
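The link between multiplicative random factors and lognormal structure can be illustrated numerically. The sketch below is a toy example only: the number of factors, their uniform range, and the function names are assumptions made here, not estimates of real site-formation processes. It multiplies many independent positive random factors and compares the skewness of the resulting values with the skewness of their logarithms.

```python
import math
import random
import statistics


def simulate_site_areas(n_sites, n_factors, seed=0):
    """Toy site areas formed as the product of independent random factors."""
    rng = random.Random(seed)
    areas = []
    for _ in range(n_sites):
        area = 1.0
        for _ in range(n_factors):
            # Hypothetical factor: each process scales area by 0.5 to 1.5.
            area *= rng.uniform(0.5, 1.5)
        areas.append(area)
    return areas


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


areas = simulate_site_areas(n_sites=5000, n_factors=30, seed=1)
raw_skew = skewness(areas)
log_skew = skewness([math.log(a) for a in areas])

print(f"skewness of areas:      {raw_skew:.2f}")
print(f"skewness of log(areas): {log_skew:.2f}")
```

The raw values are strongly right-skewed, while their logarithms are close to symmetric, as expected when the logarithm of a product is a sum of independent terms that tends toward a normal distribution.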
Implications of the Working Model
The preferential attachment model of forager mobility holds a number of anthropological implications, and we briefly consider several here, including implications for archaeological site formation, settlement-size variation in diverse environmental contexts, and self-organization of settlement-size hierarchies in human societies. Regarding the structure of site formation, the model suggests that artifact accumulation among the sites of a given hunter-gatherer settlement system would have been distributed through time such that the largest sites would be expected to exhibit occupation spans that approach the temporal span of the settlement system's existence. For many archaeologically visible hunter-gatherer systems, such spans may be on the order of centuries to millennia. This expectation follows from the assumption of preferential attachment, which implies that the attractiveness of a site is partially a function of the intensity of previous occupations. Thus, even a chance resource encounter at some otherwise unexceptional location on the landscape could lock hunter-gatherers into persistent reoccupation over the long term. Importantly, such long-term uses of highly localized sites can be expected even in the absence of highly localized natural resources such as caves, rockshelters, or springs. We should therefore expect chronological analyses of large open-air hunter-gatherer sites to produce decadal- to millennial-scale occupation spans even in the absence of residential sedentism or spatially localized natural resources.
The preferential attachment model also holds implications for settlement-size variation across environmental contexts. First, because all anatomically modern hunter-gatherers relied on material and symbolic culture, we should expect that preferential attachment to places applied to all hunter-gatherers in all environmental contexts. Thus, we should expect to observe heavy-tailed settlement-size variation across environmental contexts. Nonetheless, the center of mass of site-size variation should vary across environmental contexts. Resource richness should negatively predict the strength of preferential attachment and thus the scaling exponent of power-law models for a given set of settlement systems. This is because resource-poor environments would tend to exert greater pressure on resource recycling behavior and thus the reoccupation of sites.
Last, the preferential attachment model offers potential insights into how settlement-size hierarchies may have self-organized in human societies through a common behavioral process that transcends economic extremes. When populations were low, as is often the case for hunter-gatherer societies, co-resident population size at the largest settlements would have been restricted, thus limiting site growth in areal extent relative to growth in material accumulation. In other words, for low-density hunter-gatherer populations, settlement-size scaling would have been a property of preferential attachment among settlements with largely asynchronous occupation. Conversely, preferential attachment could be expected to generate different site-formation results when population densities are high, as is often the case for agrarian and industrial societies. When individuals make residential moves in these cases, albeit with lower frequency than hunter-gatherers, preferential attachment to places would tend to result in multi-resident, multi-family occupations of settlements. These co-resident populations would have required a proportionate amount of space, thus material accumulations would have expanded spatially at a rate that was commensurate with the rate of quantitative accumulation. For high-density populations, settlement-size scaling would therefore have been a property of preferential attachment among settlements with largely synchronous occupation, as is the case among most modern societies.
To the extent that this preferential attachment model is viable, it would help us understand the similarities and differences in settlement-size variation observed in mobile and sedentary societies. In turn, it tentatively suggests a trajectory for the self-organization of settlement-size hierarchies in human societies. As hunter-gatherer populations grew, land use intensified, and residential mobility decreased, asynchronous settlement-size scaling structure would have gradually given way to the synchronous scaling structure that characterizes settlement-size hierarchies in settled agrarian societies. If so, incipient forms of hunter-gatherer settlement-size hierarchy would have created a context for the self-organization of socioeconomic complexity including the hierarchical structure of political and economic organization. This cultural trajectory differs from previous thinking, which has tended to see settlement-size hierarchy as a distinctive feature of hierarchical, state-organized, and industrial societies [17,22,23]. The model proposed here does not undermine previous models that use complex socioeconomic behaviors such as agriculture, manufacturing, and warfare to explain settlement-size hierarchy (e.g., [11,19,68]), but it does suggest that such complex behaviors may be proximate to more fundamental socioeconomic behaviors that existed among states and hunter-gatherer societies alike (see also  for a similar conclusion). We suggest that the shared behavior spanning economic extremes may be residential mobility guided by preferential attachment to places.
In conclusion, this study found that prehistoric hunter-gatherer settlement systems of the New World exhibit heavy-tailed statistical structure that is consistent with power-law scaling. We have interpreted this archaeological variation to reflect extreme variation in the occupation spans of hunter-gatherer sites. We speculate that the statistical structure may have been a self-organized property of preferential attachment behavior whereby foraging populations preferentially occupied certain locations on the landscape to take advantage of material culture or symbolic resources. In turn, the behavior and its macro-scale outcomes may have laid a structural foundation for self-organized settlement hierarchies in subsequent times. We emphasize that this working model simply suggests an analytical starting point. Alternative models such as post-depositional process models (e.g., ) require additional consideration. We hope that these analytical findings and theoretical considerations stimulate additional efforts to link hunter-gatherer behavior to the observed structural properties of human settlement patterns—patterns that fundamentally shape the organization and dynamics of human societies.
S1 Specimens. Archaeological specimens.
S1 Dataset. Artifact count and site area data.
S1 Code. R code used to analyze data.
S1 Table. Numerical results of power analysis for artifact-count data.
S2 Table. Numerical results of power analysis for site-area data.
Steven Kuhn, Mary Stiner, Stephen Lansing, and David Raichlen of The University of Arizona, Shane Miller of Mississippi State University, and five anonymous reviewers offered comments that greatly improved this manuscript. Collasuyo Archaeological Research Institute, the Peruvian Ministry of Culture, Carlos Viviano Llave, Virginia Incacoña Huaraya, Mateo Incacoña Huaraya, Albino Pilco Incacoña, and many others supported field research in the Ilave region. Author order reflects “first-last-author-emphasis” convention.
Conceived and designed the experiments: WRH. Performed the experiments: WRH MSA CJK GJM. Analyzed the data: WRH. Contributed reagents/materials/analysis tools: WRH MSA CJK GJM. Wrote the paper: WRH.
- 1. Bettencourt LMA, Lobo J, Helbing D, Kühnert C, West GB. Growth, innovation, scaling, and the pace of life in cities. Proc Natl Acad Sci. 2007;104: 7301–7306. doi: 10.1073/pnas.0610172104. pmid:17438298
- 2. Bettencourt LMA, Lobo J, Strumsky D, West GB. Urban scaling and its deviations: revealing the structure of wealth, innovation and crime across cities. PLoS ONE. 2010;5: e13541. doi: 10.1371/journal.pone.0013541. pmid:21085659
- 3. Jones KE, Patel NG, Levy MA, Storeygard A, Balk D, Gittleman JL, et al. Global trends in emerging infectious diseases. Nature. 2008;451: 990–993. doi: 10.1038/nature06536. pmid:18288193
- 4. Shennan S. Population, culture history, and the dynamics of culture change. Curr Anthropol. 2000;41: 811–835.
- 5. Carneiro RL. The transition from quantity to quality: a neglected causal mechanism in accounting for social evolution. Proc Natl Acad Sci. 2000;97: 12926–12931. doi: 10.1073/pnas.240462397. pmid:11050189
- 6. Bandy MS. Fissioning, scalar stress, and social evolution in early village societies. Am Anthropol. 2004;106: 322–333. doi: 10.1525/aa.2004.106.2.322.
- 7. Alberti G. Modeling group size and scalar stress by logistic regression from an archaeological perspective. PLoS ONE. 2014;9: e91510. doi: 10.1371/journal.pone.0091510. pmid:24626241
- 8. Crema ER. A simulation model of fission–fusion dynamics and long-term settlement change. J Archaeol Method Theory. 2013;21: 385–404. doi: 10.1007/s10816-013-9185-4.
- 9. Gershenson C, Fernandez N. Complexity and information: measuring emergence, self-organization, and homeostasis at multiple scales. Complexity. 2012;18: 29–44. doi: 10.1002/cplx.21424.
- 10. Batty M. The size, scale, and shape of cities. Science. 2008;319: 769–771. doi: 10.1126/science.1151419. pmid:18258906
- 11. Krugman P. The self-organizing economy. Cambridge: Blackwell Publishers; 1996.
- 12. Adamic L. Complex systems: unzipping Zipf’s law. Nature. 2011;474: 164–165. doi: 10.1038/474164a. pmid:21654791
- 13. Newman MEJ. Power laws, Pareto distributions and Zipf’s law. Contemp Phys. 2005;46: 323–351. doi: 10.1080/00107510500052444.
- 14. Decker EH, Kerkhoff AJ, Moses ME. Global patterns of city size distributions and their fundamental drivers. PLoS ONE. 2007;2: e934. doi: 10.1371/journal.pone.0000934. pmid:17895975
- 15. Brown CT, Witschey WRT, Liebovitch LS. The broken past: fractals in archaeology. J Archaeol Method Theory. 2005;12: 37–78. doi: 10.1007/s10816-005-2396-6.
- 16. Christaller W. Central places in southern Germany. Englewood Cliffs N.J.: Prentice-Hall; 1966.
- 17. Flannery KV. The ground plans of archaic states. In: Feinman GM, Marcus J, editors. Archaic states. Santa Fe: School of American Research Press; 1998. pp. 15–57.
- 18. Johnson GA. Rank-size convexity and system integration: a view from archaeology. Econ Geogr. 1980;56: 234–247.
- 19. Brown CT, Witschey WRT. The fractal geometry of ancient Maya settlement. J Archaeol Sci. 2003;30: 1619–1632. doi: 10.1016/S0305-4403(03)00063-3.
- 20. Stanish C. Ancient Titicaca: The evolution of complex society in southern Peru and northern Bolivia. Berkeley: University of California Press; 2003.
- 21. Drennan RD, Peterson CE. Comparing archaeological settlement systems with rank-size graphs: a measure of shape and statistical confidence. J Archaeol Sci. 2004;31: 533–549. doi: 10.1016/j.jas.2003.10.002.
- 22. Fujita M, Krugman PR, Venables A. The spatial economy: cities, regions and international trade. Cambridge, Mass.: MIT Press; 1999.
- 23. Ames KM. The archaeology of rank. In: Bentley RA, Maschner HDG, editors. Handbook of Archaeological Theories. AltaMira Press; 2008. pp. 487–513.
- 24. Zipf GK. Human behavior and the principle of least effort: an introduction to human ecology. Cambridge: Addison-Wesley Press; 1949.
- 25. Paynter R. Models of spatial inequality: settlement patterns in historical archeology. New York: Academic Press; 1982.
- 26. Richardson H. Theory of the distribution of city sizes: Review and prospects. Reg Stud. 1973;7: 239–251. doi: 10.1080/09595237300185241.
- 27. Clauset A, Shalizi CR, Newman MEJ. Power-law distributions in empirical data. SIAM Rev. 2009;51: 1–43. doi: 10.1137/070710111.
- 28. Inomata T, Aoyama K. Central-place analyses in the La Entrada region, Honduras: implications for understanding the classic Maya political and economic systems. Lat Am Antiq. 1996;7: 291–312. doi: 10.2307/972261.
- 29. Wright HT. Recent research on the origin of the state. Annu Rev Anthropol. 1977;6: 379–397. doi: 10.1146/annurev.an.06.100177.002115.
- 30. Peterson CE, Drennan RD. Communities, settlements, sites, and surveys: regional-scale analysis of prehistoric human interaction. Am Antiq. 2005;70: 5–30. doi: 10.2307/40035266.
- 31. Bandy MS. Early village society in the Formative Period in the southern Lake Titicaca Basin. In: Isbell WH, Silverman H, editors. Andean archaeology. New York: Springer; 2006. pp. 210–236.
- 32. Bonabeau E, Dagorn L, Fréon P. Scaling in animal group-size distributions. Proc Natl Acad Sci. 1999;96: 4472–4477. pmid:10200286
- 33. Wearmouth VJ, McHugh MJ, Humphries NE, Naegelen A, Ahmed MZ, Southall EJ, et al. Scaling laws of ambush predator “waiting” behaviour are tuned to a common ecology. Proc R Soc B Biol Sci. 2014;281: 20132997. doi: 10.1098/rspb.2013.2997.
- 34. Hamilton MJ, Milne BT, Walker RS, Burger O, Brown JH. The complex structure of hunter-gatherer social networks. Proc R Soc B Biol Sci. 2007;274: 2195–2202.
- 35. Brown CT, Liebovitch LS, Glendon R. Lévy flights in Dobe Ju/’hoansi foraging patterns. Hum Ecol. 2006;35: 129–138. doi: 10.1007/s10745-006-9083-4.
- 36. Grove M. The quantitative analysis of mobility: ecological techniques and archaeological extensions. In: Lycett SJ, Chauhan PR, editors. New Perspectives on Old Stones: Analytical Approaches to Paleolithic Technologies. New York: Springer; 2010. pp. 83–118.
- 37. Kelly RL. The foraging spectrum: diversity in hunter-gatherer lifeways. New York: Percheron Press; 1995.
- 38. Wobst HM. The archaeo-ethnology of hunter-gatherers or the tyranny of the ethnographic record in archaeology. Am Antiq. 1978;43: 303–309. doi: 10.2307/279256.
- 39. Stöckli R, Vermote E, Saleous N, Simmon R, Herring D. The blue marble next generation—a true color earth dataset including seasonal dynamics from MODIS [Internet]. NASA Earth Observatory; 2005. Available: http://visibleearth.nasa.gov
- 40. Chauchat C. Prehistoria de la costa norte del Perú: el Paijanense de Cupisnique. Lima, Perú: Instituto Francés de Estudios Andinos; 2006.
- 41. Briceño J. Los primeros habitantes en los Andes Centrales y la tradición de puntas de proyectil Cola de Pescado de la quebrada de Santa María. In: Valle Álvarez L, editor. Desarrollo arqueológico, costa norte del Perú. Trujillo: Ediciones SIAN; 2004. pp. 29–44.
- 42. Loendorf CR. The Hohokam-Akimel O’odham continuum: sociocultural dynamics and projectile point design in the Phoenix Basin, Arizona. Sacaton, Arizona: Gila River Indian Community Cultural Resource Management Program; 2012.
- 43. Loendorf CR, Rice G. Projectile point typology: Gila River Indian Community, Arizona. Sacaton: Gila River Indian Community, Cultural Resource Management Program; 2004.
- 44. Winterhalder B, Thomas RB. Geoecology of southern highland Peru: a human adaptation perspective. [Boulder]: University of Colorado, Institute of Arctic and Alpine Research; 1978.
- 45. Aldenderfer MS, Flores Blanco LA. Reflexiones para avanzar en los estudios del Periodo Arcaico en los Andes Centro-Sur. Chungara Rev Antropol Chil. 2011;43: 531–550.
- 46. Klink CJ. Archaic Period research in the Río Huenque Valley, Peru. In: Stanish C, Cohen AB, Aldenderfer MS, editors. Advances in Titicaca Basin Archaeology-1. Los Angeles: Cotsen Institute of Archaeology at UCLA; 2005. pp. 13–24.
- 47. Klink CJ, Aldenderfer MS. A projectile point chronology for the South-Central Andean highlands. In: Stanish C, Cohen AB, Aldenderfer MS, editors. Advances in Titicaca Basin Archaeology-1. Los Angeles: Cotsen Institute of Archaeology at UCLA; 2005. pp. 25–54.
- 48. Aldenderfer MS. Montane foragers: Asana and the South-Central Andean Archaic. Iowa City: University of Iowa Press; 1998.
- 49. Moseley ME. The maritime foundations of Andean civilization. Menlo Park, Calif.: Cummings Publishing Company; 1974.
- 50. Dillehay TD, Rossen J, Maggard G, Stackelbeck K, Netherly P. Localization and possible social aggregation in the Late Pleistocene and Early Holocene on the north coast of Perú. Quat Int. 2003;109–110: 3–11. doi: 10.1016/S1040-6182(02)00198-2.
- 51. Maggard GJ. Las ocupaciones humanas del Pleistoceno Final y el Holoceno Temprano en la costa norte del Perú. Bol Arqueol PUCP. 2011;15: 1–23.
- 52. Dillehay TD. Profiles in Pleistocene history. In: Silverman H, Isbell WH, editors. Handbook of South American Archaeology. New York: Springer; 2008. pp. 29–43.
- 53. Phillips S, Comus PW. A natural history of the Sonoran Desert. Tucson: Arizona-Sonora Desert Museum; 2000.
- 54. Dunnell RC. The Notion Site. In: Rossignol J, Wandsnider L, editors. Space, time, and archaeological landscapes. New York: Plenum Press; 1992. pp. 21–42.
- 55. Mitzenmacher M. A brief history of generative models for power law and lognormal distributions. Proc Annu Allerton Conf Commun Control Comput. 2001;39: 182–191.
- 56. Ortman SG, Cabaniss AHF, Sturm JO, Bettencourt LMA. The pre-history of urban scaling. PLoS ONE. 2014;9: e87902. doi: 10.1371/journal.pone.0087902. pmid:24533062
- 57. Yellen J. Archaeological approaches to the present: models for reconstructing the past. New York: Academic Press; 1977.
- 58. White EP, Enquist BJ, Green JL. On estimating the exponent of power-law frequency distributions. Ecology. 2008;89: 905–912. doi: 10.1890/07-1288.1. pmid:18481513
- 59. Edwards AM, Phillips RA, Watkins NW, Freeman MP, Murphy EJ, Afanasyev V, et al. Revisiting Lévy flight search patterns of wandering albatrosses, bumblebees and deer. Nature. 2007;449: 1044–1048. doi: 10.1038/nature06199. pmid:17960243
- 60. The R Foundation. R statistical computing language. Wien; 2009.
- 61. Venables WN, Ripley BD. Modern applied statistics with S. 4th ed. New York: Springer; 2002.
- 62. Gillespie CS. Fitting heavy tailed distributions: the poweRlaw package. 2013.
- 63. Schlanger SH. Recognizing persistent places in Anasazi settlement systems. In: Rossignol J, Wandsnider L, editors. Space, time, and archaeological landscapes. New York: Plenum Press; 1992. pp. 91–112.
- 64. Binford LR. The archaeology of place. J Anthropol Archaeol. 1982;1: 5–31. doi: 10.1016/0278-4165(82)90006-X.
- 65. Basso KH. Wisdom sits in places: landscape and language among the western Apache. Albuquerque: University of New Mexico Press; 1996.
- 66. Laland KN, O’Brien MJ. Niche construction theory and archaeology. J Archaeol Method Theory. 2010;17: 303–322.
- 67. Schiffer MB. An alternative to Morse’s Dalton settlement pattern hypothesis. Plains Anthropol. 1975;20: 253–266.
- 68. Griffin AF, Stanish C. An agent-based model of prehistoric settlement patterns and political consolidation in the Lake Titicaca Basin of Peru and Bolivia. Struct Dyn EJournal Anthropol Relat Sci. 2007;2: 1–47.
- 69. Hodder I. The identification and interpretation of ranking in prehistory: a contextual perspective. In: Renfrew C, Shennan S, editors. Ranking, Resource, and Exchange: Aspects of the Archaeology of Early European Society. Cambridge: Cambridge University Press; 1982. pp. 150–155.