Why Levallois? A Morphometric Comparison of Experimental ‘Preferential’ Levallois Flakes versus Debitage Flakes

Background
Middle Palaeolithic stone artefacts referred to as ‘Levallois’ have caused considerable debate regarding issues of technological predetermination, cognition and linguistic capacities in extinct hominins. Their association with both Neanderthals and early modern humans has, in particular, fuelled such debate. Yet, controversy exists regarding the extent of ‘predetermination’ and ‘standardization’ in so-called ‘preferential Levallois flakes’ (PLFs).
Methodology/Principal Findings
Using an experimental and morphometric approach, we assess the degree of standardization in PLFs compared to the flakes produced during their manufacture. PLFs possess specific properties that unite them robustly as a group or ‘category’ of flake. The properties that do so relate most strongly to relative flake thicknesses across their surface area. PLFs also exhibit significantly less variability than the flakes generated during their production. Again, this is most evident in flake thickness variables. A further aim of our study was to assess whether the particular PLF attributes identified during our analyses can be related to current knowledge regarding flake functionality and utility.
Conclusions/Significance
PLFs are standardized in such a manner that they may be considered ‘predetermined’ with regard to a specific set of properties that distinguishes them statistically from a majority of other flakes. Moreover, their attributes can be linked to factors that, based on current knowledge, are desirable features in flake tools (e.g. durability, capacity for retouch, and reduction of torque). As such, our results support the hypothesis that the lengthy, multi-phase, and hierarchically organized process of Levallois reduction was a deliberate, engineered strategy orientated toward specific goals. In turn, our results support suggestions that Levallois knapping relied on a cognitive capacity for long-term working memory. This is consistent with recent evidence suggesting that cognitive distinctions between later Pleistocene hominins such as the Neanderthals and anatomically modern humans were not as sharp as some scholars have previously suggested.


Introduction
For over a century, archaeologists and palaeoanthropologists have been discussing a particular group of Palaeolithic flaked (i.e. knapped) stone cores and flake products that are collectively referred to as 'Levallois' [1], [2]. Named after the suburb of Paris (Levallois-Perret) from where they were recovered during the 19th century, Levallois artefacts are now known to occur over large parts of Africa, western Asia and Europe [3]. In Africa, they appear to have a chronological origin ~300 Kya [4], [5], and in Europe, Levallois is now also known to date from at least 300 Kya [6]. Indeed, the presence of Levallois artefacts is traditionally regarded as one of the main diagnostic features of the archaeological period referred to as the 'Middle Palaeolithic', or what in Africa is termed the 'Middle Stone Age' (MSA) [4], [5], [7]. With a wide geographic and temporal spread, the manufacturers of Levallois conservatively include at least three hominin species: Homo sapiens, H. neanderthalensis and late H. heidelbergensis (Archaic H. sapiens sensu lato) [8], [9]. The association of such artefacts with Neanderthals (e.g. [9], [10]) has, in particular, given rise to much debate regarding their potential significance for the evolution of hominin cognitive and linguistic capacities (e.g. [11], [12], [13], [14], [15]).
An important component in such debates relates to the fact that Levallois cores have frequently been thought to represent 'prepared cores'. That is, the core is shaped in a deliberate manner such that the 'Levallois flakes' removed following such preparation are deliberately 'pre-prepared' and 'predetermined' in terms of overall size and shape [12], [16], [17]. Indeed, Levallois was once popularly identified and defined on the basis of specific flake products [18], [19]. More recently, however, 'Levallois' has more typically been identified on the basis of cores with specific properties of form and geometry [12], [20], [21], [22]. This 'volumetric concept' of Levallois (Figure 1) is based on six key criteria originally outlined by Boëda [23], [24], [25]: (1) the volume of the core is bifacial, comprised of two distinct surfaces that intersect at the core's margin, ultimately identifying a 'plane of intersection'; (2) the two surfaces are organized hierarchically, whereby one surface is dedicated to the production of striking platforms that are used to detach flakes from the opposite 'Levallois' flaking surface; (3) the Levallois flake surface is shaped such that it possesses both distal and lateral convexities; (4) Levallois flakes are removed parallel to the plane of intersection; (5) the intersection (or 'hinge') of the striking platform surface and the flaking surface is perpendicular to the flaking axis of the Levallois flakes; (6) Levallois flakes are removed via direct hard hammer percussion. Although several of these stages may be achieved by a variety of different means, this volumetric concept has brought a level of coherence to Levallois such that cores identified as having been produced via this reduction process exhibit a certain level of "homogeneity" ([26]: 201).
Despite the shift in emphasis away from flake products to cores and core reduction in the definition of Levallois, the concept of flake predetermination is still inherent in Boëda's [25] 'volumetric concept', and conscious predetermination remains an important feature of Levallois according to many scholars (e.g. [12], [14], [17], [27], [28], [29]). This alleged predetermination has been used to support arguments for developed cognitive capacities in terms of foresight and 'planning depth' (e.g. [15]). Wynn and Coolidge [14], meanwhile, have used Levallois to support arguments that extinct hominins such as Neanderthals possessed a long-term working memory, which allows the rapid retrieval of knowledge from long-term memory, thus enabling 'expert' levels of performance. Notions of predetermination in Levallois have also been used to support arguments relating to linguistic capacities in extinct hominins. For instance, some time ago Holloway ([30]: 403) (in conceiving of Levallois production as a structured goal-orientated activity) suggested that "as in language, the activity is made up of units concatenated nonrandomly, there being contingencies both in language pattern and tool-making" such that there is a "grammar" involved in both activities. With regard to interconnecting concepts made up of minimal unit activities, he went on to state ([30]: 404) that "the alphabet of chipping technique is not random either … where certain of these are contingent upon prior operations (e.g. Levallois technique)". Lieberman ([11]: 163-170) also drew on concepts of Chomskian grammar to link the processes of Levallois reduction with the cognitive processes involved in language (although see [31]: 257).
All such arguments pertaining to the cognitive and linguistic capacities of extinct hominins (regardless of other details relevant to their merit) are obviously contingent upon the premise that production of 'Levallois flakes' is a deliberate, goal-orientated activity, predicated around the production of 'preferred' and 'predetermined' flakes. However, not all accept that Levallois flaking involves strong concepts of predetermination or conscious, structured planning. For instance, Noble and Davidson ([13]: 200) suggest that the alleged phases of extensive preparatory flaking involved in 'Levallois' flake production imply "a wastefulness of knapping effort and raw material that seems implausible". Likewise, Sangathe ([32]: 148) has argued that since "most flakes produced with skill and regularity … have sharp usable edges … it does not seem likely that the advantage acquired by producing a flake of specific shape was sufficient to necessitate the extra effort required by employing the Levallois technique". In the absence of predetermination, it has been argued, the "time depth of intentionality is reduced to decisions about the next flake, and not to decisions about the final form" ([33]: 376). Rejection of notions concerning predetermination or planning in Levallois industries has therefore led some to suggest that not until the Upper Palaeolithic do we see clear "marks of planning that seem to entail a capacity for consciousness" ([33]: 382).
Suspicions regarding the 'preferred' and 'planned' nature of Levallois flakes led Sangathe [32] to the novel suggestion that removal of large central flakes was primarily a core maintenance strategy intended to reduce the central mass of a core, allowing the establishment of a consistent core morphology throughout reduction. Importantly, Sangathe ([32]: 157) recommended that experimental flintknapping could be used to test his proposition. However, subsequent experiments by workers following this recommendation failed to support the central tenets of his hypothesis, and demonstrated that a consistent core morphology is readily maintained in the absence of Levallois removals [34].
Examinations of archaeological material that might shed light on the issue of Levallois predetermination have produced mixed results. In one of the most comprehensive studies of the issue, Dibble [35] examined flakes from 27 different assemblages in southern France. Specifically, he focused on the issue of predetermination in Levallois flakes and the allied notion that certain flakes were more desirable than others, such that their production could be linked to language categories ([35]: 424). The logic underlying his analysis was that if by 'predetermination' a level of standardization was implied, then it can reasonably be expected that there will be less variability in Levallois flakes compared with other flake categories. Flakes from each assemblage were divided into three categories: Levallois flakes, biface retouch flakes, and indeterminate 'normal' flakes. An analysis of flake area, length, width and thickness measurements (and their ratios) suggested that Levallois flakes were not necessarily statistically more standardized, thus leading Dibble ([35]: 425) to argue that their manufacture could not be linked to "the presence of linguistic rules, structures, or categories". A study by Schlanger [12], however, used flakes from a refitted Levallois core from the Middle Palaeolithic site of Maastricht-Belvédère (Netherlands) and reached a different conclusion. Here, he found that the lengths, widths and thicknesses of the nine Levallois flakes were, as a group, more standardized than those of the 32 non-Levallois (debitage) flakes.
Working within the restrictions automatically imposed whenever dealing with Palaeolithic archaeological materials, these previous studies inevitably possess certain weaknesses alongside particular strengths in each case. The major strength of Dibble's [35] study was a large overall sample size. However, there is an inevitable degree of subjectivity in assigning flakes to different categories (i.e. 'Levallois', 'biface', etc.) in the absence of additional information, such as might be obtained through refitting. Indeed, controlled experiments have demonstrated that even in the case of experienced workers, the accurate identification of Levallois flakes over other categories of flake is subjective and non-replicable across participants [36]. Moreover, Schlanger ([12]: 249) pointed out an apparent incongruity in Dibble's [35] study, whereby the categorization of certain flakes as 'Levallois' was achieved in the initial phases, yet the subsequent quantitative analysis did not indicate standardization. Similarly, Kuhn [37] has noted that selecting flakes from a range of varying archaeological examples and classifying them on the basis of certain properties (e.g. as 'Levallois', 'biface flake', etc.) might inevitably lead to them being regarded as 'standardized' in a subsequent metric analysis. In using a refitted core, Schlanger [12] was on a somewhat firmer, although not entirely assumption-free, basis with regard to classifying certain flakes as 'Levallois'. However, in using a sample size of just 41 total flakes from a single (incomplete) core, statistical validity is open to question since inferential statistical methods were not applied. In addition to these points, it is notable that both of these studies used simple measurement schemes (essentially three primary measurements of length, width and thickness) and neither study utilized multivariate statistical approaches.
While the use of relatively simple morphometric methodologies alone does not necessarily negate the various arguments concerning standardization and 'preference' in Levallois flakes, it does mean that only limited aspects of flake variability were examined in these previous studies.
Clearly, in light of the foregoing, a level of ambiguity concerning the 'predetermined' nature of Levallois flakes is evident. Here, therefore, we adopted an experimental approach to the issue. We focus on the production of 'classic' lineal or so-called 'preferential' Levallois ('tortoise') cores and their products ([12]: 238, [22]: 65, [25]: 56), which have figured prominently in the issues discussed previously. The use of experimental assemblages allows us to negate the problems associated with arbitrarily assigning archaeological flakes to different categories. It also enabled the generation of flake samples large enough (n = 642 flakes) to be amenable to several inferential statistical analyses. In addition, we used a morphometric scheme involving 15 size-adjusted variables, thus enabling multivariate methodologies to be applied, and issues of flake size and shape to be disentangled more directly during analysis. Prior to our main analyses of the flakes, we also established (via a comparative analysis) that the experimental cores produced in our experiments accurately replicated known archaeological examples of Levallois cores.
Our analyses focused on two issues. Firstly, if so-called 'preferential' Levallois flakes (hereafter putative PLFs) produced on classic 'tortoise' cores were genuinely a 'preferred' product with common properties uniting them as a coherent entity or 'category' of flake, then they should possess a series of particular attributes that identify them as a group more consistently than the debitage flakes produced during their manufacture. Accordingly, we tested this prediction using size-adjusted morphometric data and the multivariate statistical technique of discriminant function analysis. Secondly, if PLFs produced through tortoise 'Levallois' core reduction represent genuinely 'preferred' products engineered (via this volumetric core reduction strategy) to meet specific requirements, they should possess a greater standardization in their attributes compared with the debitage flakes produced during their manufacture. We tested this prediction using coefficients of variation for each of the attributes. Moreover, in both cases, we aimed to identify which particular attributes might unite PLFs as a coherent entity or, in the case of standardization, which particular attributes appear to be relatively more standardized in PLFs as opposed to the flakes produced during their manufacture. Our rationale here was that if particular attributes unite PLFs as a consistent and coherent flake group and the volumetric construction of the core results in them being controlled (i.e. 'standardized') in a particular manner, then it should be possible to relate these variables to current archaeological knowledge concerning the functionality and practical desirability of certain flake forms over others. In other words, our analyses aimed to establish on a firmer basis why Levallois flakes might have been a preferred and targeted product during Levallois reduction, an issue to which we turn in our discussion.

Knapping the Levallois reductions
One of us (MIE) knapped a total of 75 PLFs from a series of 25 nodules of Texas chert from the Cretaceous-aged Fredericksberg Group [38]. The number of PLFs produced from each nodule ranged from 1 to 5 (mean = 3). Each Levallois reduction was specifically configured to conform to Boëda's [25] criteria for Levallois, via the production of a classic lineal 'preferential' (tortoise) Levallois core. Following Bradley ([39]: 22), Levallois reduction was comprised of two stages using direct hard hammer percussion throughout. The first stage establishes the preliminary bifacial margin, which is continuous around the circumference of the nodule. Stage two involves three sub-stages: (1) shaping of the Levallois flaking surface and margin adjustment; (2) preparation of the PLF platform; (3) removal of PLFs.
Again, following Bradley [39], we defined 'ventral' flakes as those removed from the face from which the putative PLFs are removed, and refer to flakes removed from the non-PLF surface as 'dorsal' flakes. This is potentially confusing as Levallois cores are typically illustrated with the Levallois flaking surface facing upward (i.e. superiorly). However, it should be noted that when the putative PLFs are eventually removed from the core, it is orientated such that the Levallois surface is facing downward (i.e. ventrally), thus establishing the terminology used here. For each Levallois reduction, all debitage flakes from the dorsal and ventral surfaces were bagged separately and labeled. Each PLF was also bagged and labeled. Following this cataloging procedure, all subsequent stages of sampling, data recording, and analysis were performed by SJL thus ensuring an independence between the knapping and data analysis phases of the study.
The manufacture of Levallois products is generally considered a highly skilled activity and it has been claimed that only a relatively limited number of contemporary knappers are able to produce replications that stand close scrutiny alongside archaeological examples ([14]: 474, [15]: 118). Hence, a comparative 3D geometric morphometric analysis of the experimental cores resulting from the production of flakes used in our later analyses was also undertaken (Text S1). This analysis demonstrated that the replica cores fit comfortably within the range of variation exhibited by a sample of genuine archaeological examples of 152 Levallois cores found at sites in Africa, western Asia and Europe (Text S1, Figure S1, Figure S2, Figure S3). Importantly, this verifies quantitatively that Levallois core morphologies were replicated with high degrees of accuracy compared with known archaeological examples.

Flake sampling protocol
A total of 642 experimentally produced flakes were examined in this study, including the 75 'Preferential' Levallois flakes. There is some evidence to suggest that wherever a range of flake sizes is available, extremely small flakes (i.e. <2 cm in length) would less likely have been utilized as hand/finger held tools [40]. Moreover, in the context of the current analyses, extremely small flakes/chips are, a priori, those least likely to share form affinities with PLFs. Therefore, only debitage flakes >2 cm in maximum length were measured. A maximum of eight complete debitage flakes per PLF were measured: up to four from the PLF (ventral) surface and up to four from the non-PLF (dorsal) surface. Wherever the total number of potentially measurable flakes from a surface exceeded four specimens, four flakes were sampled randomly using a random number generator (http://www.randomizer.org). Application of this strategy resulted in a total of 567 debitage flakes being compared against the 75 putative PLFs.
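The sampling rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the record fields (`id`, `max_length_cm`, `complete`), the function names, the example catalogue and the seed are all hypothetical, and Python's `random` module stands in for the online random number generator used in the study.

```python
import random

MIN_LENGTH_CM = 2.0   # size cut-off stated in the text
MAX_PER_SURFACE = 4   # per-surface cap stated in the text

def eligible(flake):
    """Keep only complete debitage flakes longer than the cut-off."""
    return flake["complete"] and flake["max_length_cm"] > MIN_LENGTH_CM

def sample_surface(flakes, rng):
    """Return every eligible flake if there are four or fewer, else four drawn at random."""
    pool = [f for f in flakes if eligible(f)]
    if len(pool) <= MAX_PER_SURFACE:
        return pool
    return rng.sample(pool, MAX_PER_SURFACE)

# Hypothetical catalogue of the debitage associated with one PLF.
ventral = [{"id": f"V{i}", "max_length_cm": 1.5 + 0.5 * i, "complete": True}
           for i in range(8)]
dorsal = [
    {"id": "D0", "max_length_cm": 3.1, "complete": True},
    {"id": "D1", "max_length_cm": 1.2, "complete": True},   # below the size cut-off
    {"id": "D2", "max_length_cm": 4.0, "complete": False},  # broken flake
    {"id": "D3", "max_length_cm": 2.6, "complete": True},
]

rng = random.Random(2012)  # seeded only so that the sketch is repeatable
chosen_ventral = sample_surface(ventral, rng)
chosen_dorsal = sample_surface(dorsal, rng)
print([f["id"] for f in chosen_ventral], [f["id"] for f in chosen_dorsal])
```

Sampling without replacement within each surface guarantees that no flake is measured twice and that each PLF contributes at most eight debitage flakes.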

Flake attributes
A total of 15 quantitative variables were measured for all flakes and are listed in Table 1. Full details and descriptions of these measurements can be obtained in the supporting information (Text S2).

Analysis 1: Discriminant analysis of flake attributes
If PLFs were genuinely a 'preferred' product with common properties that unite them as a coherent entity or 'category' of flake, then they should possess a series of attributes that identify them as a group more consistently than the debitage flakes produced during their manufacture.
Such a prediction may be tested multivariately using Discriminant Function Analysis (DFA). Analytically, DFA provides a set of weightings (i.e. discriminant functions) that most effectively discriminate between groups that are defined a priori [41], [42]. Such weightings are linear combinations of the original variables. The relative coherency of specific groups (in terms of the original variables) may be assessed by the extent to which individual specimens can be classified back to their original group, with results frequently expressed in percentages (%). Importantly, the DFA also identifies which of the attributes are most important in assigning specimens to groups. Here, the DFA was undertaken in SPSS v16.0. Conservatively, only cross-validated results were examined, whereby specimens are classified in turn on the basis of linear functions derived from all other specimens except that specific case [42]. If PLFs are genuinely a specific category, with common properties that unite them as a group with relatively high degrees of consistency, it may in this specific instance be predicted that in a DFA of PLF, dorsal and ventral flake groups, PLFs will be classified more accurately than either dorsal or ventral flakes, and with a relatively high degree of accuracy. The ratio of correct to incorrect classifications for each flake group may be assessed for statistical significance (α = 0.05) relative to chance (H0 = 50:50) using a chi-square (χ²) test. Note here that in the original DFA, the probability of a flake being assigned to its correct group by chance alone is 33.3%. However, since the χ² test is simply asking whether the chance of a flake being classified correctly in the original DFA is significantly different from the chance of it being misclassified (i.e. in cases of misclassification the test is not taking into account which of the other two groups it has been assigned to), chance in this latter instance is 50%.
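The leave-one-out logic behind the cross-validated classification rates can be illustrated with a small sketch. Note the substitution: rather than linear discriminant functions as computed in SPSS, the sketch uses a much simpler nearest-centroid classifier, purely to show how each specimen is classified from all other specimens and how per-group rates are tallied. The function names and the two-variable toy data are hypothetical.

```python
from collections import defaultdict

def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loo_classification_rates(samples):
    """samples: list of (group_label, feature_vector) pairs.
    Each specimen is held out in turn and assigned to the nearest group
    centroid computed from all remaining specimens."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for i, (label, vec) in enumerate(samples):
        by_group = defaultdict(list)
        for j, (other_label, other_vec) in enumerate(samples):
            if j != i:  # exclude the held-out specimen
                by_group[other_label].append(other_vec)
        centroids = {g: centroid(rows) for g, rows in by_group.items()}
        predicted = min(centroids, key=lambda g: sq_dist(vec, centroids[g]))
        total[label] += 1
        correct[label] += int(predicted == label)
    return {g: correct[g] / total[g] for g in total}

# Toy data: one tight, well-separated group and two overlapping groups.
toy = (
    [("PLF", v) for v in [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.1)]]
    + [("dorsal", v) for v in [(0.0, 0.0), (0.2, 0.5), (0.5, 0.1), (0.3, 0.3)]]
    + [("ventral", v) for v in [(0.1, 0.4), (0.4, 0.0), (0.2, 0.2), (0.5, 0.5)]]
)
rates = loo_classification_rates(toy)
print(rates)
```

In this toy case the tight group is recovered perfectly while the two overlapping groups are frequently confused with each other, which is the qualitative pattern the prediction above anticipates for PLFs versus dorsal and ventral debitage.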
Given that PLFs will on average be bigger than many debitage flakes in a Levallois reduction sequence, all data were size-adjusted in order to analyse their shape properties as opposed to merely examining size differences. Moreover, size-adjusting the data ensures that results will be generally comparable across tortoise Levallois cores, regardless of overall size, which may be especially important given that archaeological examples of Levallois cores vary greatly in isometric size [43]. Attributes 1-13 were size-adjusted by the geometric mean of those measurements [44], [45], and attribute 14 (length of sharp edge) was size-adjusted using the geometric mean of all planform variables (i.e. attributes 1-6). The Index of Symmetry is a scale-free variable (Text S2) and was inputted to the DFA directly.
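Geometric-mean size adjustment divides each measurement of a specimen by the geometric mean of a chosen set of that specimen's measurements, so that two flakes of identical shape but different size yield identical adjusted values. A minimal sketch follows; the function names and the three-measurement example flakes are hypothetical, and the optional `basis` argument is an illustrative way of handling the case where one attribute is scaled by the geometric mean of a different set of variables.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive measurements."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def size_adjust(values, basis=None):
    """Divide each value by the geometric mean of `basis`
    (by default, the geometric mean of the values themselves)."""
    gm = geometric_mean(values if basis is None else basis)
    return [v / gm for v in values]

# Two hypothetical flakes with the same proportions; the second is twice as large.
small = [40.0, 20.0, 10.0]
large = [80.0, 40.0, 20.0]
adj_small = size_adjust(small)
adj_large = size_adjust(large)
print(adj_small, adj_large)  # the two adjusted vectors agree up to rounding

# A single attribute scaled by the geometric mean of a separate set of variables.
edge_adjusted = size_adjust([55.0], basis=small)
```

Because the adjusted values of a specimen always have a geometric mean of 1, the procedure removes isometric size while preserving the proportions among measurements.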

Analysis 2: comparison of standardization
If PLFs are genuinely 'preferred' products engineered to meet specific requirements, they should possess a greater standardization in their attributes compared with the debitage flakes produced during their manufacture. Following Dibble [35] and Schlanger [12], relative standardization in the attributes of PLFs compared with debitage flakes may be assessed directly through comparison of coefficients of variation (CV) of the raw measurements expressed as percentages (i.e. standard deviation/mean × 100). Hence, in order to test predictions of standardization, a CV was calculated for each attribute. Thereafter, the overall extent of standardization in PLFs versus debitage flakes was assessed for statistical significance via a Mann-Whitney U-test (α = 0.05) of the two groups of CV values. Because the Index of Symmetry is a scale-free variable (Text S2), descriptive statistics such as means and standard deviations may be compared across flake groups directly. Therefore, in this instance, the difference in flake symmetry across groups was assessed using a Mann-Whitney U-test, while an F-test was used to determine differences in the standard deviation of each group (α = 0.05).

Analysis 1: Discriminant function analysis of flake attributes
Figure 2 shows the plot of the DFA scores (functions 1 and 2) for the 642 flakes. Function 1 explained 90.1% of variance and is statistically significant (Wilks' Lambda = 0.715; p < 0.0001). As Table 2 shows, PLFs were correctly classified to group in 89.3% (cross-validated) of cases, well over twice as high (i.e. 2.682×) as would be predicted by chance alone (33.3%). Conversely, dorsal debitage flakes were correctly classified to group in only 36.7% of cases, while ventral flakes could be classified to their correct group in only 54.3% of cases. Dorsal and ventral debitage flakes were consistently misclassified with each other to a greater extent than they were as PLFs (Table 2). The ratio of correct to incorrect classifications for PLFs was significantly greater than chance (χ² = 60.840; df = 1; exact p < 0.0001). In the case of dorsal flakes, the ratio of correct to incorrect classifications was significantly below chance (χ² = 6.760; df = 1; exact p = 0.012). For ventral flakes, the ratio of correct to incorrect classifications was not significantly different from chance (χ² = 0.640; df = 1; exact p = 0.484). Hence, the results of the DFA support the hypothesis that the PLFs (as a category of flake) share a particular combination of attributes, robustly identifying them as a coherent group.
Figure 2. Plot of discriminant functions 1 (x axis) and 2 (y axis) resulting from the DFA (see Table 2 for classification scores). doi:10.1371/journal.pone.0029273.g002
It is also notable that the variables loading most highly (positively) on DF1 and thus contributing to the positioning of the PLFs on that function (and their classification rate) were the five flake thickness variables (i.e. Thickness at 25, 50 and 75% of Length, and Thickness at 25 and 75% of Maximum Width). This suggests that control of these thickness variables was an important feature of PLFs.
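The chi-square statistics reported for the correct-to-incorrect classification ratios can be checked with a short goodness-of-fit calculation against a 50:50 expectation. The sketch below assumes that the rounded cross-validated percentages were treated as counts out of 100; the text does not state this, but under that assumption the arithmetic returns the reported values of 60.840, 6.760 and 0.640. The function name is hypothetical, and exact p-values are not computed here.

```python
def chi_square_equal_split(correct, incorrect):
    """Chi-square goodness-of-fit statistic (df = 1) against a 50:50 expectation."""
    expected = (correct + incorrect) / 2
    return ((correct - expected) ** 2 + (incorrect - expected) ** 2) / expected

# Cross-validated percentages of correct classification reported in the text.
reported_pct = {"PLF": 89.3, "dorsal": 36.7, "ventral": 54.3}

stats = {}
for group, pct in reported_pct.items():
    correct = round(pct)  # assumption: percentages rounded and used as counts out of 100
    stats[group] = chi_square_equal_split(correct, 100 - correct)
    print(f"{group}: chi-square = {stats[group]:.3f}")
```

With one degree of freedom the critical value at α = 0.05 is approximately 3.84, so the PLF and dorsal statistics exceed it while the ventral statistic does not, in line with the significance pattern reported.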

Analysis 2: comparison of standardization
CVs for debitage flakes were consistently higher for all variables (Table 3). Differences between the two groups of CVs for PLF versus debitage flakes were statistically significant (Mann-Whitney U = 48.0; asymptotic p = 0.022; exact p = 0.021). Likewise, mean symmetry measures and their standard deviations were higher for debitage flakes than for PLFs, thus demonstrating that PLFs are, on average, more symmetrical and exhibit less variability in this attribute. Differences between flake categories were statistically significant for overall symmetry measures (Mann-Whitney U = 11227.0; asymptotic p < 0.0001) and for their standard deviations (F = 37.108; d.f. = 1; p < 0.0001). Hence, the results of this analysis consistently support the hypothesis that PLFs are more standardized in form than debitage flakes.
Consistent with the results of the DFA, it should also be noted that the attributes with the highest differences in CV values between debitage flakes and PLFs were the five thickness measurements taken at the various percentage points of maximum length and maximum width (Table 3). Again, this suggests that PLFs (relative to alternative flake categories) are a means of engineering consistency of flake thickness within specific bounds, across a large proportion of their surface area.
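The standardization comparison rests on two simple calculations: a coefficient of variation per attribute per flake group, and a Mann-Whitney U statistic comparing the two resulting sets of CVs. The sketch below illustrates both under stated assumptions: all measurements and CV values are invented for illustration, the sample standard deviation is used (the text does not specify sample versus population), and only the U statistic is computed, not its p-value.

```python
import statistics

def coefficient_of_variation(values):
    """CV as a percentage: sample standard deviation / mean x 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100

def mann_whitney_u(a, b):
    """Mann-Whitney U statistic (the smaller of the two U values).
    Each pair (x, y) scores 1 if x > y and 0.5 if tied."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)

# Hypothetical thickness measurements (mm) for one attribute.
plf_thickness = [10.0, 11.0, 9.0, 10.0, 10.0]
debitage_thickness = [4.0, 12.0, 7.0, 15.0, 9.0]
cv_plf = coefficient_of_variation(plf_thickness)
cv_debitage = coefficient_of_variation(debitage_thickness)
print(f"CV PLF = {cv_plf:.1f}%, CV debitage = {cv_debitage:.1f}%")

# Hypothetical per-attribute CVs for each flake group, compared as two samples.
plf_cvs = [7.1, 9.0, 8.2]
debitage_cvs = [30.0, 41.5, 28.3]
u_stat = mann_whitney_u(plf_cvs, debitage_cvs)
```

A U of zero, as in this invented example, means every CV in one group is lower than every CV in the other; larger U values indicate more overlap between the two sets of CVs.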

Discussion
In our first (DFA) analysis, dorsal flakes were correctly classified to group at levels barely above chance, and in the case of ventral flakes, almost every other flake was misclassified. Conversely, in the case of PLFs, only around one in ten flakes was misclassified. Most importantly, only in the case of PLFs was the ratio of correct to incorrect classifications statistically greater than chance. Hence, in line with the hypothesis that PLFs are a 'preferred' product with common properties that unite them as a coherent entity, this first analysis demonstrated that PLFs form a robust group with a relatively consistent relationship between measured variables. It is also notable that the most important variables driving their performance in the DFA were measurements relating to relative flake thickness.
In the second analysis, it was found that the PLFs were significantly less variable than the debitage flakes produced during their manufacture. PLFs were also found, on average, to be significantly more symmetrical than debitage flakes. Importantly, in a manner consistent with the results of the first analysis, the greatest differences between CV values for PLFs versus debitage flakes were observed in the five variables measuring flake thickness along their maximum lengths and widths. This is despite the fact that maximum thickness was found to be more variable than maximum length or width measures in both PLFs and debitage flakes. The results of this multi-core analysis are thus consistent with Schlanger's [12] examination of flakes from a single archaeological Levallois core, in terms of showing that maximum thickness is more variable than maximum length or width measures (regardless of flake category), but more importantly, in corroborating his assertion that PLFs exhibit less overall variability than the debitage flakes removed during their production. Moreover, in this instance, the statistical significance of this distinction has been established.
Overall, therefore, the results of our analyses demonstrate that PLFs form a relatively coherent entity with a set of specific properties that unite them robustly as a group or 'category' of flake. The properties that do so relate most strongly to relative flake thicknesses across the surface area of PLFs. In addition, our analyses demonstrate that PLFs exhibit significantly less variability than the flakes generated during their production, and that such relative standardization is again most evident in variables relating to flake thicknesses across the length and width of PLFs. Hence, our results are consistent with propositions (e.g. [12], [14], [28], [29]) that Levallois flakes are standardized in such a manner that they may be considered 'predetermined' with regard to a specific set of properties, even when adjusted for overall size differences.
A further specific aim of our study was to determine whether the particular PLF attributes identified during the course of our analyses can be related to existing archaeological knowledge concerning the potential functionality and practical desirability (i.e. utility) of certain flake forms over others. In other words, do our analyses provide further insight into why Levallois flakes manufactured on classic 'tortoise' cores might logically have been a 'preferred' product, having been standardized in such a manner?
Mobility is a factor in the lives of all hunter-gatherer populations, although the extent and pattern of such mobility may vary greatly ([46]: 111-160, [47]). Transport distances of lithic raw materials appear to increase during the course of the European Middle Palaeolithic, suggestive of increased mobility [48], [49], [50], with similar evidence available for the African MSA [51]. Such evidence has led to suggestions that Levallois was a technology geared specifically toward increased mobility [52].
Regardless of this, given that Pleistocene hominins were foragers, mobility was inevitably a feature of their existence. As Kuhn ([53]: 427) has noted, "mobile toolkits should tend to optimize their potential usefulness relative to weight, the primary determinate of transport cost". Moreover, such artefacts "should be durable and inherently 'maintainable'" ([53]: 428). From the viewpoint of optimality, therefore, the ideal flake cutting tool is one that provides the greatest utility/durability relative to transport cost (i.e. weight).
Modeling the potential utility of different flake sizes, Kuhn ([53]: 430-432) has shown that potential for retouch (i.e. resharpening) is directly proportional to increased flake area, although the relative increase in utility (so defined) diminishes as flake area increases (i.e. as flakes become heavier). Moreover, under the assumptions of such a model he has shown that decreasing the relative thickness of a flake increases its retouch potential relative to mass ([53]: 432). A further adjustment to the model showed that if the increased amount of cutting edge provided on larger tools was accounted for, utility declines relative to increasing mass as before, but that the rate of relative decline decreases under these conditions ([53]: 435).
The large surface area of PLFs compared to flakes from the same core is a feature that was noted in some of the earliest commentaries on Levallois ([1]: 225), and has been repeated on many occasions since (e.g. [12]: 241, [32]: 148). This is also clearly evident in our results given the mean lengths and widths of PLFs compared to debitage flakes (Table 3). PLFs removed from tortoise cores would, therefore, appear to provide a relatively large potential for retouch under the parameters of Kuhn's [53] model. However, as Kuhn ([53]: 430) himself notes, the model does not assume that differing flake thicknesses might directly impact utility (however measured), nor does the model account for the fact that flake weight itself may have functional advantages affecting optimization factors. When applying a flake tool to a task, greater force may be applied either by the tool-user exerting greater pressure [54], [55], or by choosing relatively heavier tools such that gravity increases momentum. Indeed, experiments have shown that larger flake cutting tools exhibit greater cutting efficiency than smaller flakes [40]. This suggests that alongside Kuhn's [53] observations regarding utility in terms of retouch potential, the fact that Levallois flaking enables the production of large flakes (relative to the size of the core) would also provide an advantage in terms of cutting efficiency, at least compared to debitage flakes from the same core.
However, these factors aside, the strongest patterns emerging from our analyses were related to the thicknesses of PLFs, both in terms of classification and standardization. Examination of Table 4, which shows the averages for flake thickness measurements in the size-adjusted data, gives greater insight into the precise parameters underlying this statistically significant pattern. Four factors are evident in this table. Firstly, PLFs (relative to size) are on average thicker across their surface area (as a whole) than debitage flakes. Secondly, maximum thickness (relative to size) is lower in PLFs than in debitage flakes. Thirdly, examination of the six individual thickness measurements shows that thickness is greater (relative to size) in PLFs for all thickness measurements except for maximum thickness, indicating that maximum thickness is reduced relative to the other measurements, and contributing to the relatively even thickness of PLFs throughout their surface area. Fourthly, PLFs are less variable across all thickness measurements (i.e. thickness is more evenly distributed, as indicated by the lower standard deviation of the means). These factors may be related directly to several different utility/efficiency issues. As noted, for simplicity, Kuhn's [53] model assumed that flake thickness did not affect utility, and suggested that increasing flake area equated to increased retouch potential. At the same time, his model suggested that reducing flake thickness would reduce weight without reducing utility (i.e. retouch capacity). However, thin flakes also break more easily ([56]: 150). Hence, a flake so thin that it disintegrates upon usage and/or retouching would negate any advantage of large flake size (i.e. plan-view surface area), and it is now recognized that edge durability (i.e. the capacity to withstand attrition upon use) is a factor that would have affected hominin decision making regarding cutting tools ([57], [58]: 1612). Even a flake in which only a portion of the surface area is too thin to provide a viable working edge would exhibit reduced utility relative to its absolute surface area. The relative thickness distributed evenly across PLFs would, therefore, provide support for a viable and robust working edge across the greatest extent of their surface area. Moreover, the fact that maximum thickness in PLFs does not appear to increase proportionally with regard to the other thickness variables indicates that carrying-weight is reduced directly in the portion of flake area that is typically the least utilizable in flakes (see e.g. [53]). As Turq ([59]: 77) has shown, flakes with a more evenly distributed thickness of cross-section themselves have a greater potential for retouch and re-use (Figure 3).
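The kind of size-adjusted thickness comparison summarized in Table 4 can be illustrated with a minimal sketch. It assumes geometric-mean scaling as the size adjustment (a common choice in morphometrics, used here as a stand-in rather than as a statement of the procedure applied in this study), and all measurement values, column layouts and function names (`size_adjust`, `thickness_summary`) are hypothetical, invented purely for illustration.

```python
import math
from statistics import mean, stdev

def size_adjust(flake):
    """Divide one flake's measurements by their geometric mean.

    Assumption: geometric-mean scaling stands in for the study's size
    adjustment; all values must be strictly positive.
    """
    geo_mean = math.exp(sum(math.log(v) for v in flake) / len(flake))
    return [v / geo_mean for v in flake]

def thickness_summary(flakes, thickness_cols):
    """Return (mean, sample SD) of each size-adjusted thickness variable."""
    adjusted = [size_adjust(f) for f in flakes]
    summary = []
    for col in thickness_cols:
        values = [row[col] for row in adjusted]
        summary.append((mean(values), stdev(values)))
    return summary

# Hypothetical measurements in mm (invented, NOT data from this study).
# Columns: length, width, thickness near platform, mid-flake, near distal end.
plf = [
    [92.0, 61.0, 11.0, 10.5, 9.8],
    [88.0, 58.0, 10.2, 10.0, 9.5],
    [95.0, 64.0, 11.4, 11.0, 10.1],
]
debitage = [
    [41.0, 30.0, 9.0, 4.1, 2.0],
    [55.0, 28.0, 6.5, 7.9, 3.2],
    [33.0, 35.0, 3.8, 5.6, 1.9],
]

THICKNESS = [2, 3, 4]  # indices of the thickness columns above
for label, group in (("PLF", plf), ("debitage", debitage)):
    for col, (m, sd) in zip(THICKNESS, thickness_summary(group, THICKNESS)):
        print(f"{label} column {col}: mean={m:.3f} sd={sd:.3f}")
```

Because each flake is divided by its own geometric mean, two flakes of identical shape but different absolute size yield identical adjusted values, so any remaining difference in means or standard deviations reflects shape rather than scale.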
In addition to these points, our results indicate that several factors relating to ergonomic considerations and efficiency during use may also have made PLFs desirable relative to other flakes. For instance, increased relative symmetry in a cutting tool, and an evenly distributed thickness, "puts the center-of-mass of the tool in the line corresponding to the direction of motion of the tool at the instant of impact, thus avoiding torque and, consequently, maximizing power" (i.e. efficiency) [60]. Moreover, experiments with handaxes have shown empirically that there is a statistically significant relationship between increased symmetry and increased efficiency in cutting performance [61]. An increased regularity of surface would "distribute the reaction force at impact time more evenly through the hand of the tool's user, which increases comfort" [60]. These proposed advantages of PLFs are not, of course, contingent upon a presupposition that debitage flakes were not utilized, nor are they mutually exclusive with suggestions that the volumetric reduction strategy of Levallois is itself an economic means of reducing cores and maximizing productivity [21]. Indeed, the multiple potential reasons for the utilitarian advantages of Levallois would explain its manufacture by at least three different hominin species and its widespread geographic distribution.
As some have noted, all flakes removed from a core are to some extent influenced by the morphology of the core (angle, curvature, flake scar pattern, etc.) prior to their detachment ([12]: 235, [22]: 63). A flake bearing the scars of previous removals is, therefore, both automatically 'predetermined' and predetermining with regard to any future removals. What our results suggest, however, is that predetermination via the multi-phase volumetric construction of a Levallois/tortoise core (sensu [25]) enables this predetermination of PLFs to be engineered in a particular way that ultimately (and significantly) distinguishes PLFs from a majority of other flakes. Moreover, those particular attributes may be linked to certain specific factors that, based on current knowledge, can be suggested as potentially desirable features when faced with a choice of alternative flake forms. Future experiments are now required to more accurately model the advantages of PLFs over alternative categories of flake. It should also be emphasized that our analyses have focused on 'classic' lineal Levallois, and that further experiments should explore alternative forms of core incorporated under the term 'Levallois'. Indeed, some experiments have shown previously that 'point' Levallois flakes may have functioned as projectile points [62], [63].
In suggesting that Levallois flakes were indeed a genuinely predetermined and preferred product, our results also have implications for the cognitive and linguistic debates associated with Levallois. That is, our results are consistent with the hypothesis that the execution of Levallois knapping is evidence of an ability to draw on a cognitive capacity of long-term working memory [14]. Direct evidence for language is perhaps unlikely to come from stone tools, and such evidence should always be supplemented with anatomical and palaeoneurological evidence (e.g. [64], [65]). However, our results are consistent with analogies between the hierarchical structuring of information such that it results in a specific goal, and the hierarchical organization of syntax and grammar (e.g. [66]: 129) in sentence construction [11], [30], and suggest that such analogies and their implications are worthy of future exploration. Moreover, our results also suggest that Middle-Late Pleistocene hominins attributed to H. heidelbergensis (sensu lato), H. neanderthalensis and early H. sapiens were all, at least on occasion, solving problems associated with lithic resource optimization and the optimization of flake tool technology in the same manner (i.e. via Levallois). Our results are, therefore, consistent with recent evidence (e.g. [67], [68]) suggesting that cognitive capacities in different species of Middle-Late Pleistocene hominins are not as sharply differentiated as previous generations of scholars postulated, and that the behavioural changes that eventually emerge during the Later Stone Age (African LSA) and Upper Palaeolithic may be more the product of demographic change and increased connectivity of social networks [69] than they were, necessarily, of fundamental cognitive changes.

Supporting Information
Text S1 A comparative 3D geometric morphometric analysis of the experimental Levallois cores and archaeological examples.