Revisiting the Middle and Upper Palaeolithic archaeology of Gruta do Caldeirão (Tomar, Portugal)

Gruta do Caldeirão features a c. 6 m-thick archaeological stratification capped by Holocene layers ABC-D and Ea, which overlie layer Eb, a deposit of Magdalenian age that underwent significant disturbance, intrusion, and component mixing caused by funerary use of the cave during the Early Neolithic. Here, we provide an updated overview of the stratigraphy and archaeological content of the underlying Pleistocene succession, whose chronology we refine using radiocarbon and single-grain optically stimulated luminescence dating. We find a high degree of stratigraphic integrity. Dating anomalies exist in association with the succession’s two major discontinuities: between layer Eb and Upper Solutrean layer Fa, and between Early Upper Palaeolithic layer K and Middle Palaeolithic layer L. Mostly, the anomalies consist of older-than-expected radiocarbon ages and can be explained by bioturbation and palimpsest-forming sedimentation hiatuses. Combined with palaeoenvironmental inferences derived from magnetic susceptibility analyses, the dating shows that sedimentation rates varied in tandem with the oscillations in global climate revealed by the Greenland oxygen isotope record. A steep increase in sedimentation rate is observed through the Last Glacial Maximum, resulting in a c. 1.5 m-thick accumulation containing conspicuous remains of occupation by people of the Solutrean technocomplex, whose traditional subdivision is corroborated: the index fossils appear in the expected stratigraphic order; the diagnostics of the Protosolutrean and the Lower Solutrean predate 24,000 years ago; and the constraints on the Upper Solutrean place it after Greenland Interstadial 2.2. (23,220–23,340 years ago). Human usage of the site during the Early Upper and the Middle Palaeolithic is episodic and low-intensity: stone tools are few, and the faunal remains relate to carnivore activity. The Middle Palaeolithic is found to persist beyond 39,000 years ago, at least three millennia longer than in the Franco-Cantabrian region. This conclusion is upheld by Bayesian modelling and stands even if the radiocarbon ages for the Middle Palaeolithic levels are removed from consideration (on account of observed inversions and the method’s potential for underestimation when used close to its limit of applicability). A number of localities in Spain and Portugal reveal a similar persistence pattern. The key evidence comes from high-resolution fluviatile contexts spared by the site formation issues that our study of Caldeirão brings to light—palimpsest formation, post-depositional disturbance, and erosion. These processes. are ubiquitous in the cave and rock-shelter sites of Iberia, reflecting the impact on karst archives of the variation in climate and environments that occurred through the Upper Pleistocene, and especially at two key points in time: between 37,000 and 42,000 years ago, and after the Last Glacial Maximum. Such empirical difficulties go a long way towards explaining the controversies surrounding the associated cultural transitions: from the Middle to the Upper Palaeolithic, and from the Solutrean to the Magdalenian. Alongside potential dating error caused by incomplete decontamination, proper consideration of sample association issues is required if we are ever to fully understand what happened with the human settlement of Iberia during these critical intervals, and especially so with regards to the fate of Iberia’s last Neandertal populations.

luminescence dating sample positions. The conversion factors of Guérin et al. (2011) have been used to derive gamma and beta dose rates from the measured radionuclide concentrations and specific activities. Cosmic-ray dose rates have been calculated as described in Prescott and Hutton (1994) after taking into consideration site altitude, geomagnetic latitude, and density, thickness and geometry of sediment and bedrock overburden. The beta, gamma and cosmic-ray dose rates have been corrected for long-term sediment moisture contents (Aitken, 1985;Readhead, 1987), which are taken to be equivalent to the present-day measured water contents (i.e., 10-12% of dry sediment weight) as the cave environment has remained sufficiently well-protected from major variations in external atmospheric conditions. A relative uncertainty of 20% has been assigned to the long-term water content values to accommodate any minor variations in hydrologic conditions during burial.
High-resolution gamma spectrometry (HRGS) measurements were additionally made on the homogenised bulk sediment samples to assess the presence of secular equilibrium in the 238 U and 232 Th decay series ( Table A). Daughter-parent isotopic ratios for 238 U, 226 Ra,210 Pb,228 Ra and 228 Th are consistent with unity at either 1σ or 2σ for all samples, confirming that the 238 U and 232 Th decay series are in equilibrium. Table A also includes the corresponding beta dose rates obtained using the HRGS results, which have been calculated after taking into consideration the fractional beta dose rate contributions of different isotopes in the 238 U decay series (Stokes et al., 2003;Guérin et al., 211). For all five samples, the final beta dose rates derived using HRGS are in agreement, at either 1 or 2σ, with those obtained using beta counting ( Table 6).

Equivalent dose (D e ) estimation
Multi-grain and single-grain OSL measurements have been made using a Risø TL/OSL-DA-20 reader equipped with blue LEDs (470 nm, maximum power 102 mW/cm 2 ), infrared LEDs (peak emission 850 nm, maximum power of 302 mW/cm 2 ), and a 10 mW Nd:YVO 4 (532 nm) singlegrain laser attachment (maximum power of c. 50 W/cm 2 ). Ultraviolet OSL signals were detected using Electron Tubes PDM 9107B photomultiplier tubes fitted with 7.5 mm-thick Hoya U-340 filters. Samples were irradiated with a mounted 90 Sr/ 90 Y beta source that had been calibrated to administer known doses to multi-grain aliquots and single-grain discs (average single-grain dose rate at the time of measurement = 0.095 Gy/s). Multi-grain OSL measurements (used for dose recovery tests only; see below) were made by mounting monolayers of quartz grains on 9.7 mm stainless steel discs using silicon oil spray (Silkospray). Single-grain D e measurements were made by manually loading individual 212-250 μm grains onto standard single-grain aluminium discs drilled with a 10 x 10 array of 300 μm diameter holes to ensure true single-grain resolution during D e evaluation (Arnold et al., 2012b).
Multi-grain dose recovery tests and single-grain D e measurements were undertaken using modified versions of the single-aliquot regenerative-dose (SAR) protocol described in Murray and Wintle (2000), as shown in Table B. Multi-grain D e values measured as part of the dose recovery test experiments were calculated by integrating the first 0.4 s of stimulation and subtracting a late-light background from the last 10 s. Single-grain OSL D e values were calculated by integrating the first 0.09 s of stimulation and subtracting a late-light background from the last 0.25 s. The sensitivity-corrected SAR dose-response curves were fitted with a single saturating exponential function. The uncertainty ranges of each individual D e value include three sources of error: (i) a random uncertainty term arising from photon counting statistics for each OSL measurement, calculated using Eq. 3 of Galbraith (2002); (ii) an empirically determined instrument reproducibility uncertainty of 1.9% for each single-grain measurement (calculated for the specific Risø reader used in this study according to the approach outlined in Jacobs et al., 2006); and (iii) a dose-response curve fitting uncertainty determined using 1000 iterations of the Monte Carlo method described by Duller (2007) and implemented in Analyst.
Single-grain OSL D e values were excluded from final age calculations when: (i) the net intensity of the natural test dose signal, T n , was not >3σ above the late-light background signal; (ii) the low-dose (plus high-dose in the case of single-grain OSL) recycling ratios (i.e., sensitivitycorrected luminescence responses (L x /T x ) for two identical regenerative doses) were not consistent with unity at 2σ; (iii) the OSL IR depletion ratio of Duller (2003) was not consistent with unity at 2σ (i.e., the ratio of the L x /T x values obtained for two identical regenerative doses measured with and without prior IR stimulation, designed to detect feldspar contamination or inclusions); (iv) the recuperation ratio, calculated as the ratio of the sensitivity-corrected 0 Gy dose point (L 0 /T x ) to the sensitivity-corrected natural (L n /T n ), was >5%; (v) the net T n signal had a relative error of >30%; (vi) the sensitivity-corrected natural signal (L n /T n ) did not intercept the sensitivity-corrected dose-response curve; (vii) the dose-response curve displayed anomalous properties (i.e., zero or negative response with increasing dose) or very scattered L x /T x values that could not be successfully fitted with the Monte Carlo procedure and, hence, did not yield finite D e values and uncertainty ranges; (viii) the L n /T n value intercepted the saturated part of the dose-response curve (L n /T n values were equal to the I max saturation limit of the doseresponse curve at 2σ).

SAR D e validation tests
Multi-grain dose recovery tests were initially undertaken on 160-grain aliquots of sample CLD17-2 to assess the suitability of the SAR protocol and determine optimal preheat combination for bulk grain fractions of the Caldeirão samples. Five batches of four aliquots were each bleached twice at room temperature for 1000 s using blue LEDs (with an intervening 10,000 s pause to ensure complete decay of any phototransferred charge in the 110 o C TL trap), after which a known laboratory dose of 50 Gy was administered using the calibrated beta source. The surrogate natural dose of each aliquot was then measured using protocol A shown in Table B. A series of different preheat combinations were applied to each batch of four aliquots, as follows: regenerative-dose preheat (PH1) of 200, 220, 240 or 260°C for 10 s in combination with a test-dose preheat (PH2) of 160°C for 10 s; and a PH1 of 220°C for 10 s in combination with a PH2 of 200°C for 10 s. The two preheat combinations that produced multi-grain dose recovery ratios closest to unity (PH1 = 200°C for 10 s and PH2 = 160°C for 10 s; PH1 = 240°C for 10 s and PH2 160°C for 10 s) (Fig A, panel A) were further tested via single-grain dose recovery tests. These additional dose recovery tests were performed on 200-300 quartz grains of sample CLD17-2 (Table C) after bleaching their natural signals using the same procedure described above and administering a surrogate natural dose of 75 Gy. The single-grain OSL dose recovery test results indicate that a PH2 of 240 °C for 10 s and PH1 of 160°C for 10 s is optimal for single-grain OSL burial dose estimation. This preheat combination yielded a mean measured-to-given dose ratio of 1.00 ± 0.02 (n = 55 accepted grains) and an overdispersion value of 10 ± 4% (Fig B, panel A). The data shown are for single-grain OSL measurements made using the 212-250 µm quartz fraction for all samples from Gruta do Caldeirão. (B) Examples of OSL decay curves and sensitivity-corrected dose-response curve (inset) for a representative quartz grain of sample CLD17-1. White square denotes the sensitivity-corrected natural OSL signal; filled circles denote the sensitivity-corrected regenerative dose OSL signals; white circles denote the repeated regenerative dose points used to calculate the recycling ratios. The D 0 value characterises the rate of signal saturation with respect to administered dose and equates to the dose value for which the saturating exponential dose-response curve slope is 1/e (or c. 0.37) of its initial value.  Table B. SAR protocols used in this study to undertake dose recovery tests on multi-grain aliquots (protocol A) and to obtain single-grain quartz OSL ages (protocol B). Ln and Lx refer to the natural and regenerative-dose OSL signal measurements, respectively. Tn and Tx refer to the test dose OSL signals measured after the Ln and Lx OSL signals, respectively. Each of these SAR measurement cycles was repeated for the natural dose, five different sized regenerative doses and a 0 Gy regenerative-dose (to measure OSL signal recuperation). Both the smallest and largest non-zero regenerative-dose cycles were repeated at the end of the SAR procedure to assess the suitability of the test-dose sensitivity correction. For protocol B, the smallest regenerative-dose cycle was also repeated a second time with the inclusion of step 2 to check for the presence of feldspar contaminants using the OSL IR depletion ratio of Duller (2003) A

Multi-grain OSL SAR protocol B Single-grain OSL SAR protocol
Step Treatment Symbol Step Treatment Symbol  (2006). Critical skewness values are taken to be equivalent to twice the standard error of skewness score (95% C.I.) for single-grain De datasets, following the results of sensitivity analyses performed by Bailey and Arnold (2006) and Arnold et al. (2007). b CAM = central age model; MAM-3 = 3-parameter minimum age model; MAM-4 = 4-parameter minimum age model (Galbraith et al., 1999). c De estimates have been calculated after adding, in quadrature, a relative error of 20% to each individual De measurement error to approximate the underlying dose overdispersion observed in 'ideal' (well-bleached and unmixed) sedimentary samples from this site (CLD17-2, CLD17-3, CLD17-4), the single-grain dose-recovery tests performed on the Caldeirão samples (CLD17-2) and from global overdispersion datasets (Arnold and Roberts, 2009). d Maximum log likelihood score of the CAM, MAM-3 or MAM-4 fit. For a given sample, the Lmax score of the MAM-3 is expected to be substantially higher (i.e. at least 1.92 greater) than that of the CAM when the addition of the extra model parameter improves the fit to the data. Likewise, the Lmax score of the MAM-4 is expected to be significantly greater than that of the MAM-3 (by at least 1.92 when compared with the 95% C.I. of a X 2 distribution) when the addition of the extra model parameter improves the fit to the data. If the extra parameter of the MAM-3 (or MAM-4) is not supported by the data, then its Lmax score will be similar to (i.e. within 1.92 of) the CAM (or MAM-3) Lmax score, indicating that the simpler age model explains the data equally well (Arnold et al., 2009). e Mean ± total uncertainty (68% confidence interval), calculated as the quadratic sum of the random and systematic uncertainties. Total uncertainty includes a systematic component of ±2% associated with laboratory beta-source calibration. f The preferred age for each sample is shown in bold. For these samples, the preferred age has been derived using the statistical age model that yielded the optimum Lmax score, following the criterion outlined in footnote d and Arnold et al. (2009).

Methodology
We employed three techniques to characterise the elemental and mineralogical composition of sediment samples and of residues adhering to three directly dated marine shells from Gruta do Caldeirão: P13sc491 (Fig 12, no. 1; OxA-22299) and P13-402 (Fig 12, no. 2; OxA-22300), from layer Jb (henceforth Shell 1 and Shell 2, respectively); P11sc968 (Fig 12, no. 4; OxA-22301), from layer K (henceforth Shell 3). Raman spectroscopy was applied to residues and uncoated areas of the three shells. A portable X-ray fluorescence (pXRF) analyser was used for sediment samples. Scanning electron microscopy (SEM) coupled with energy dispersive X-ray spectrometry (EDS) was used for the analysis of Shell 3 and sediment samples. The latter were also analysed by means of X-ray microdiffraction (µXRD).
Raman analyses were conducted with a Raman Senterra (BRUKER) device equipped with a 532 nm laser and using an illumination intensity of 2 mW. Scattered light was collected through a 50× objective.
The SEM-EDS instrument was a PHILIPS XL30 ESEM model with an electron gun LaB6 coupled with Si(Li) EDS. The samples were observed and analysed without any preparation, in controlled pressure mode (pressure of 10 -4 Torr). The acceleration voltage was set to 20 kV.
µXRD analysis were carried out on a dedicated, laboratory-made device using a Rigaku monochromatic source (λ=1.54186 Ǻ) and a 200 µm collimator. The maximum voltage and current were set at 45 kV and a 660 µA, respectively. The incident beam was positioned to form a grazing angle with the surface of the sample. The analysed area was about 1 mm². A 2D Rigaku imaging plate detector (R-AXIS IV++) and a motorized X,Y,Z,φ positioning system with an independent θ axis were coupled to the XRD equipment. Acquisition time was set at 3 minutes. The circular diffractograms were calibrated in 2θ and transformed into linear ones through the software Fit2D v.12.077, developed by Andy Hammersley (European Synchrotron Radiation Facility, Grenoble, France). Data treatment was performed with the EVA© software (Bruker).
pXRF measurements were carried out with a SPECTRO xSORT (AMETEK) instrument, equipped with a silicon drift detector (SDD) and a low power W X-ray tube with an excitation source of 40 kV. Measurements were acquired in the air with a constant working distance by using a lead receptacle to which the spectrometer is fixed. Light elements such as Na, Mg, and Al are not detected with this technique. An area of 8 mm in diameter was analysed. Spectra acquisition times were set to 60 s. The spectrometer is internally calibrated by an automated measure of the contents of a standard metal shutter. Data treatment was realised using standard materials and after two-time calibration of the results.

Results
Observation under a reflected light optical microscope identified remnants of a red coating and whitish residues on Shell 1, and red, orange, and white residues on Shell 2 (Figs D and E). Three superimposed deposits of different colour were detected on the surface of Shell 3 (Figs F-H): the first, adhering to the shell's test, is bright red in colour and composed of fine, sorted particles; the second is a thicker and coarser orange/reddish layer that, in places, covers the first; the third is an even thicker, whitish layer that covers both and can also be seen in the fill of the shell's aperture, indicating that it must relate to the sedimentary matrix.
SEM observation confirms the differences in texture between the bright red and orange/reddish layers of Shell 3 (Fig H): most particles composing the inner bright red layer are <1 µm and those in the order of 5 µm are rare, which indicates a clayey texture. However, the elements composing both layers are the same (Si, Al, Ca, K, Fe, P), albeit in different proportions (P is substantially more abundant in the outer orange/reddish layer).
Raman spectroscopy ( Table E; Fig D) identified the presence of hematite and calcite in, respectively, the red and the whitish residues coating Shell 1. Hematite was also identified in the red residues found on the surface of Shell 2, whose pale reddish coating was dominated by calcite with traces of hematite; the analysis of the shell's test reveals diagenetically unmodified aragonite (Fig E). The bright red layer adhering to the surface of Shell 3 is composed of hematite, possibly associated with ferrihydrite, and calcite, while the overlying orange/reddish layer contains either hematite associated with calcite or hematite (and, possibly, magnetite) associated with phosphates. In Shell 3, however, the hematite spectra identified in the bright red and orange/reddish layers differ from reference spectra in the form of line shift, change in relative intensities, and absence of bands beyond 400 cm-1 ( Table F). These anomalies can be due to excessive laser power (2 mW, wavelength 532 nm), and are not necessarily related to the crystallinity of the hematite (Faria et al., 1997). µXRD analysis of the bright red layer of Shell 3 identified quartz, calcite, aragonite, clays of the illites/glauconites family, and kaolinite (Fig G; Table G). Hematite was not detected, but the main peak of this mineral coincides with an aragonite peak, and a secondary peak coincides with a kaolinite one. In light of the Raman results, the failure of µXRD to detect hematite must be due to method limitations. Indeed, µXRD similarly failed when analysing red residues found in the overlying orange/reddish layer. The latter was found to be mainly composed of calcite and aragonite, while the bright red layer also showed traces of illite/glauconite and kaolinite.
The pXRF analysis of sediments from layers Jb and K identified a notable proportion (4-5%) of iron oxides, as intimated by their reddish colour (Table H). However, iron oxides were not detected by µXRD (Table I), which probably implies that, in the sediment, such oxides are found in poorly crystalline form only. Otherwise, both layers have a similar mineralogical composition: quartz, calcite, feldspar (microcline or other), calcium phosphate (hydroxyapatite family), and illite/muscovite are present in both; kaolinite is the single mineral found in one layer only (layer Jb).

Synthesis
The layer of residue adhering to the surface of Shell 3 differs from the overlying orange/reddish layer in colour (bright red), grain size (clayey), and composition (less calcium phosphate). The composition of the orange/reddish layer is intermediate between the underlying bright red layer and the sedimentary matrix of layer K, where the shell was retrieved. The bright red and the orange/reddish layer also feature a higher proportion of clay minerals (illite/muscovite and kaolinite), which is consistent with the clayey grain size of the bright red layer. The layer's matrix contains a small proportion of iron, but µXRD failed to identify hematite. Iron compounds (oxy-hydroxides or oxides) seem to be present in the sediment in poorly crystalline form, whereas in the residues found on all the shells they appear as hematite (although this difference would need to be confirmed by Raman analysis of the sediment).
Overall, these patterns are consistent with two different interpretations. The first is that the bright red layer seen on Shell 3 represents the remnant of a hematite-rich compound that (a) covered the shell during its use as an ornamental object, and (b) was still present on its surface at the time of loss (or discard) and eventual incorporation in the deposit. The second hypothesis is that said bright red layer (a) corresponds to the finest fraction of the sediment making up archaeological layer K, and (b) represents post-depositional accumulation. Although it cannot formally ruled out, this second hypothesis is unparsimonious and indeed rather unlikely, as it requires us to postulate an unknown mechanism by which, prior to its eventual deposition on the shell's surface, the fine fraction would have been segregated, with some of the minerals that make it up being eliminated in the process. Provided we interpret the reddish deposits found in Shells 1 and 2 alike the orange/reddish layer of Shell 3, i.e., as remnants of a pigmentatious compound diluted in a calcitic matrix, we can conclude that, originally, the three shells were coated with a red, hematite-rich colouring mixture.   Fig F. Shell 3 (P11sc968; layer K). Details of the orange/reddish and bright red deposits covering the shell and Raman spectra of those deposits compared to a white spot of the shell's test itself. Fig G. Shell 3 (P11sc968; layer K). Raman spectra and X-ray diffraction pattern of the bright red layer. Fig H. Shell 3 (P11sc968; layer K). SEM back scattered images and EDS results of the orange/reddish and bright red deposits.

Stratigraphic provenance of key finds
O13sc91 (MAMS-38336) The specimen is a non-plotted horse tooth retrieved in spit E4 of square O13 (see Fig 3 for the location of this grid unit in the Corridor area of the site). The décapage plans illustrating the excavation of that spit, carried out in 1983 between August 26 and August 31, are reproduced in Fig I. The hand-written annotations provide elevations of both the sediment and the upper and lower surfaces of the blocks exposed at the base of each spit. Note the large animal burrows and the linear disturbance features, which denote root paths. The description emphasises the large number of rabbit bones retrieved, especially in and around the larger burrow feature against the cave wall.
O13 was the first grid unit to be opened in order to extend the Back Chamber trench towards the Corridor. At the time, it was assumed that the latter would feature the same stratigraphic succession and, therefore, that the first spits of the reddish-brown deposit below layer ABC-D would correspond to layer Ea. The décapage descriptions reflect that assumption. Only subsequently, with further outward expansion of the excavation trench, was it possible to recognise that layer Ea wedged out at the transition between Back Chamber and Corridor, and that spits E1-E4 belonged in the upper part of layer Eb, not in layer Ea.

O12-84 (OxA-X-2786-13)
The specimen is a human left mandibular fragment. It belonged to an early adolescent: the Caldeirão 2 individual, as described in Trinkaus et al. (2001). The dm 2 is preserved in its socket. The fossil was retrieved in a small burrow against the cave wall, during the excavation of spit H1 of grid unit O12 (Fig J). The latter corresponds to a triangular surface created by the site's gridding against the north wall of the Back Chamber (see Fig 3 for its location on the site plan). Here, layers Fa-Jb were excavated between July 7 and August 5, 1986, i.e., after the surface of layer K in the adjacent P row had already been reached (during the previous field season, in 1985). The idea was to double-check, by careful décapage, the EW dip of the stratification suggested by observation of the P>O/11-12 profile, and to verify, by comparison with the opposite profile (P>Q/11-13), that the NS dip was indeed negligible, as the excavation of the P row had suggested.
The hand-written annotations provide elevations of the sediment at the base of each spit. They also contain summary descriptions of matrix and clasts, reflecting how the Fc/H interface was first thought to correspond to the base of spit F8, with continued excavation showing that a few cm remained before the surface of H was truly exposed, which was the case at the base of spit F9. Note that the burrow only appeared as the surface of H was reached. This evidence suggests that the disturbance was a small scale one and that the finds made in the burrow are reworked from layer H itself, not intrusive form layer Fc above.

P13-403 (OxA-5541)
The specimen is a distal metapodial of red deer from spit J6 of square P13, at the base of which the Jb/K interface was reached in most of the square (Fig K). In this part of the cave (squares O-P/13-14), at the 90° angle between Back Chamber and Corridor, controlling for the presence of a double dip (EW in the former, NS in the latter) was hindered by the relative homogeneity displayed by the matrix through the succession of layers Fa-L. Even though often aided by such clues as the presence of stone lines, incrustation lenses, or flat-bottomed slabs denoting the actual disposition of past cave floors, the décapage of stratigraphic interfaces in P13 was always rather approximate.
This difficulty may explain the erroneous assignment of P13-403 to "layer K-top" that appeared in previously published reports on the site's dating (e.g., Zilhão, 1997). As shown by the décapage plans reproduced in Fig K, P13-403 was found at the same elevation and adjacent to the retouched flint knife and directly dated Aporrhais pespelecani shell illustrated in Figs 10 and 12.
When their position is assessed against a virtual surface reconstructed from the elevations found in the more reliable excavation records -the topography of the Jb/K interface in O/13-14, and its elevation along the P>Q/11-13 profile -the three items lay at, or just above the base of layer Jb. Indeed, this exercise shows that, at the base of spit J6 of P13, the surface of layer K (a) had yet to be reached in the square's SW corner, and (b) conversely, due to the heavy induration of the deposit, which hindered a precise décapage of layer boundaries, it had been somewhat undercut in the square's SE corner (without consequence, however, as that corner was entirely devoid of finds). Note the root burrow along the wall in O13, which was not detected when, a month before (July 31, 1986), the same surface had been exposed in P13.

O13-361
The specimen is a large quartz sidescraper (Fig 11, no. 1) retrieved in square O13 at the surface of layer L. In previous publications (e.g., Zilhão, 1997), it was assigned to layer K, which we correct here.
The following reasons explain the original misassignment: (a) the excavation of squares O/13-14 and P13 down to the surface of layer L was carried out at the very end of the project (September 12 and 14, 1988, respectively), and it stopped at the elevation of that boundary; (b) no subsequent field assessment of the stratigraphic accuracy of the assignment of finds then made was therefore possible; (c) through the excavation of the site, all finds made during the last, fine-décapage stage of the exposure of stratigraphic interfaces were by convention recorded as belonging in the unit above the interface. Following this convention dictated that O13-361 be recorded as "K," even though this was the first artefact found in the excavation of O/13-14 since the quartzite flake O13-346, which lay 30 cm higher-up, at the top of layer Jb. In addition, at the time, layer K was thought to belong in the Middle Palaeolithic. Whether this item came from K or L was therefore not regarded as hugely significant, and whether the convention ought to be ignored in this case was not considered to be an issue of chronostratigraphic importance.
As shown by the décapage plans reproduced in Fig L, the elevation of O13-361 clearly places it at the very top of layer L, not in layer K. The plan also shows that, due to the same "double-dip" problems mentioned in relation to the P13-403 radiocarbon sample, the K/L interface was significantly undercut in the NE part of P13. The same happened in its SE corner, due to induration. Elsewhere along the P>Q13 profile, however, induration had the opposite effect, i.e., the décapage could not proceed to the exact interface and remained a few cm above it. Fig I. O13, field records for spits E3-E4. Description and elevation of the surfaces delimiting the thickness of deposit that yielded the O13sc91, non-plotted horse tooth; its radiocarbon age (20,077 ± 100 BP;  shows that this an upwardly moved find derived from the underlying Solutrean deposit.

Fig J. O12, field records for spits F8-H1.
Description and elevation of the surfaces delimiting the thickness of deposit across which the Fc/H interface was exposed and excavated in grid unit O12, and the O12-84 human fossil (whose [x,y] coordinates are indicated by the star) retrieved in the small burrow exposed at that interface and radiocarbon dated to 19,400 ± 150 BP (OxA-X-2786-13).

Methods
Bayesian modelling was undertaken using OxCal v4.4 (Bronk Ramsey, 2009a), following the general approach outlined in Demuro et al. (2019Demuro et al. ( , 2020. The sedimentary sequence has been modelled using a Sequence depositional model, incorporating stratigraphic units in ordered succession and separated by associated boundaries. The Gruta do Caldeirão Bayesian model focuses on the eleven layers comprising the pre-Magdalenian archaeological sequence excavated in the Back Chamber (layers Fa-O), plus the two Middle Palaeolithic layers from the Entrance Trench (Units 5-6). The dating determinations for individual units are represented as a grouped set of likelihoods (Phase) within the Sequence model. Boundaries have been used to delineate the beginning and end of each stratigraphic unit, and to specify that all likelihoods or events included in these groupings have a uniform prior likelihood of occurrence. Separate rather than shared boundaries have been used to delineate the beginning and end of each stratigraphic unit to ensure the model is able to accommodate potential depositional hiatuses or erosional discontinuities between successive layers.
The single-grain OSL dating likelihoods have been input into the model as calendar ages before year of sample collection, together with their associated 1σ uncertainty ranges, using the date command. The Bayesian model was run using the general outlier function (Bronk Ramsey, 2009b), with prior outlier probabilities of 5% assigned to all dating samples. Likelihood estimates that yielded posterior outlier probabilities >5% were not excluded from the final model but were proportionally down-weighted in the iterative Markov Chain Monte Carlo runs (Bronk Ramsey, 2009b).
To examine the sensitivity of the modelling outcomes to different assumptions about stratigraphic priors and dating likelihoods, we have run five different versions of the Gruta do Caldeirão Bayesian model (Models I to V). The structure of these models, and the main differences in representation of individual stratigraphic layers and dating determinations, are summarised in Table J. In brief, Model I is set up with separate stratigraphic units defined for each individual layer, with the exception of layers Fa-Fc, which are grouped as a single unit. Model I includes all age determinations depicted in Fig 23, together with the radiocarbon determination obtained on the Semicassis saburon ornament from layer K (OxA-22301). The radiocarbon determinations for OxA-1938 and OxA-22301 are assumed to represent maximum age estimates for layers Fa-Fc and K, respectively, and have therefore been input into the model using the After command. The radiocarbon determination for MAMS-41872 is assumed to represent a minimum age estimate for layer L and has thus been input into the model using the Before command. Model II is the same as Model I, except that layers I-Ja and layers L-N are represented as combined stratigraphic units rather than defined as individual units. Model III is the same as Model II, except that the radiocarbon determination for MAMS-33905 is considered as a minimum age estimate for layers L-N and has therefore been input into the model using the Before command. Model IV is equivalent to Model III, but all radiocarbon determinations have been removed from the Middle Palaeolithic units (layers L, M, N) to test the extreme assumption that they all suffer from methodological or stratigraphic reliability issues. Layers L, M and N are also represented as separate stratigraphic units rather than as a single combined grouping in Model IV. Model V is the same as Model III but includes the two radiocarbon determinations from the Entrance Trench (MAMS-41874 and MAMS-41876), and additionally adopts a single stratigraphic grouping for layers L, M, N (Back Chamber), and the Middle Palaeolithic layers from the Entrance Trench (Units 5-6). The CQL codes used to construct Models I to V are provided in the next section.
The results obtained for Models I to V are summarised in Tables K-O        The likelihood (unmodelled) and posterior (modelled) age ranges are presented for each of the numerical dating samples. Posterior (modelled) ranges are also shown for the boundaries and age of each stratigraphic layer. Posterior ages are presented as the 68.3% and 95.4% highest probability density ranges. The mean and 1σ uncertainty ranges of the modelled posterior distributions are shown for comparison (assuming a normally distributed probability density function). The unmodelled and modelled age estimates have been rounded to the nearest 10 years.