Towards the restoration of ancient hominid craniofacial anatomy: Chimpanzee morphology reveals covariation between craniometrics and facial soft tissue thickness

In modern humans, facial soft tissue thicknesses have been shown to covary with craniometric dimensions. However, to date it has not been confirmed whether these relationships are shared with non-human apes. In this study, we analyze these relationships in chimpanzees (Pan troglodytes) with the aim of producing regression models for approximating facial soft tissue thicknesses in Plio-Pleistocene hominids. Using CT scans of 19 subjects, 637 soft tissue, and 349 craniometric measurements, statistically significant multiple regression models were established for 26 points on the face and head. Examination of regression model validity resulted in minimal differences between observed and predicted soft tissue thickness values. Assessment of interspecies compatibility using a bonobo (Pan paniscus) and modern human subject resulted in minimal differences for the bonobo but large differences for the modern human. These results clearly show that (1) soft tissue thicknesses covary with craniometric dimensions in P. troglodytes, (2) confirms that such covariation is uniformly present in both extant Homo and Pan species, and (3) suggests that chimp-derived regression models have interspecies compatibility with hominids who have similar craniometric dimensions to P. troglodytes. As the craniometric dimensions of early hominids, such as South African australopithecines, are more similar to P. troglodytes than those of H. sapiens, chimpanzee-derived regression models may be used for approximating their craniofacial anatomy. It is hoped that the results of the present study and the reference dataset for facial soft tissue thicknesses of chimpanzees it provides will encourage further research into this topic.


Introduction
The primate family of Hominidae is comprised of the African apes, humans and all ancestors leading to these clades. Reconstructing  Hominidae, called here hominids, has become an increasingly popular practice with many approximations of their faces presented in museum exhibitions, popular science publications, and at conference presentations worldwide [1][2][3]. In these contexts, reconstructions of the face and body have proven to be an effective vehicle for the dissemination of scientific information about human evolution. However, there is a recognized problem of variability among reconstructions of the same individual. A recent study comparing approximations of LB1, the holotype of Homo floresiensis [4,5], reported that they vary significantly among one another [6]. Similarly, in a systematic survey of 860 hominid reconstructions presented in 71 museums across Australia and Europe, it was found that inconsistencies are prevalent in all other approximations of extinct hominid species [7]. If practitioners were using consistently reliable methods this variability would not have occurred. Obviously, the confounding effect of practitioner experience and competency in the reconstruction procedure also plays a role here. However, there is clearly variation in the replicability across existing reconstruction methods depending on the robustness of the empirical data supporting them and, since there is currently very little applicable data that can be extrapolated to Plio/Pleistocene hominids, results vary depending on methods used and individualities of practitioners.
Worse still, in its present state the practice of hominid reconstruction is particularly vulnerable to attack. Campbell et al. [8] argues that reconstructions based on unspecified sets of assumptions and biased misconceptions can actually do harm by perpetuating erroneous ideas about human evolution. Critics of human evolution are already using discrepancies between reconstructions of the same individual to undermine the reliability of evolutionary theory. Therefore, it is of utmost importance to strengthen methods of reconstruction as much as possible to reduce this variability and avoid such criticisms.
The term used to describe the process of building a face over a skull varies in the literature between disciplines. In forensics, the name of the process most commonly referred to as 'facial reconstruction' was updated to 'facial approximation' because it is a more accurate description of the results, whereas in paleoanthropology the term 'facial reconstruction' is still being used [2]. Regardless of what term is preferred, the results are always approximate and therefore we agree with previous authors and prefer the term 'facial approximation' [6,9,10]. Scientific testing of facial approximation methods has been a major focus in craniofacial identification of human remains with research dating back at least to Welcker [11], with important contributions by Gerasimov [12,13], Prag and Neave [14], and Wilkinson [15]. Methods using means of soft tissues of the face have received the most attention [16][17][18][19], however, there is a recognized flaw in extrapolating means to individuals. As statistically robust as means may be, they only express means for specific populations. For reconstructing individuals, population means are not appropriate because they completely ignore variation among individuals. Regarding the approximation of extinct hominids, interspecies extrapolation of means derived from either modern humans or the extant great apes (Gorilla, Pan, and Pongo), as suggested in Hanebrink [18], is equally inappropriate because it also ignores variation among individuals.
One possible solution to this problem is to identify approximation methods that are compatible across all members of the Hominoidea superfamily. If a consistent pattern in covariation between soft tissue and craniometric measurements can be identified in extant hominids, then extinct hominids can reasonably be assumed to have followed suite. Such covariations were first explored in human material by Sutton [20] and extended in Simpson and Henneberg [21]. Correlations were found and multiple linear regression models were used to generate equations for improving estimations of soft tissue thickness from craniometrics alone in modern humans, though this covariance has rarely been used in facial approximations. Reactions to the results of these studies are mixed. Stephan and Sievwright [22], using data measured with substantial random errors, report that regression models have low correlation coefficients that do not improve soft tissue thickness estimates above population means. However, Dinh, Ma [23] repeated the use of linear regression models and produced favorable results that encourage further exploration. Thus, for the purpose of hominid facial approximation, the possibility of generating soft tissue thickness values that are individualized to a specific hominid specimen is undoubtably better than extrapolation of species-specific means.
The present study is motivated by the aforementioned concerns and while we hold that the findings reported here are valuable we raise three caveats at the outset: 1) As in previous studies of chimpanzee soft tissues [24][25][26], this study includes only a small sample of chimpanzees and, therefore, the conclusions from the results are subject to further testing on larger samples; 2) We also do not include other members of the African ape clade and so we cannot expand our findings to the entire Hominoidea superfamily; and 3) We do not claim to eliminate the need for informed speculation in hominid facial approximation entirely. Not all soft tissue characters of ancient hominids are addressed here, such as the facial features (eyes, nose, mouth, and ears), which arguably have a much greater impact on the variability between reconstructions of the same individual than soft tissue thicknesses alone. This work represents a step towards an empirical method that will strengthen the practice, but it is by no means a final solution to the problem.
The aims are: (1) To validate in chimpanzees (Pan troglodytes) that facial soft tissue thicknesses covary with craniometric dimensions, and (2) to produce soft tissue prediction models with interspecies compatibility from chimpanzee material that can be used in the facial approximation of extinct hominids.

Materials and methods
Computed tomography scans of 28 chimpanzees were collected from two separate data repositories. Scans were accessed online, via the Digital Morphology Museum, KUPRI (dmm.pri. kyoto-u.ac.jp) and Morphosource (https://www.morphosource.org), and obtained as Digital Imaging and Communications in Medicine (DICOM) format bitmap files. After excluding two neonates, one infant and six subjects showing obvious pathological effects or degradation caused by decomposition, the study sample contained 19 individuals of known age, sex, and subject condition. The sex ratio was 1:1.71 (7 male and 12 female) and the mean age was 30.9 years (minimum = 9; maximum = 44; SD = 10.1). Subject condition was varied and included seven living, five fresh, five frozen, and two subjects that were preserved by immersion. Information on whether individual subjects were scanned in the supine or prone position was not available at the time of this study. Further information on whether subjects had been living in the wild prior to scanning was also not available but the study sample is assumed to consist of animals that were housed in captivity only. A complete list of all subjects used in this study is presented in the (S1 Table).
Prior to measuring, all skulls were oriented in the Frankfurt horizontal plane determined according to its original definition by a horizontal line passing through the inferior border of the orbital rim (mid-infraorbital) and the top of the external auditory meatus (porion) on both sides of the skull. Facial soft tissue thickness was then measured at 39 cephalometric landmarks (Fig 1; 17 medial and 22 bilateral) in OsiriX, v. 11.02 (Visage Imaging GmbH, Sand Diego, USA), which has been shown to produce accurate measurements that can be reliably compared between separate studies [27,28]. Cephalometric landmarks were selected based on common depths found in the facial approximation literature [21,[29][30][31]. However, to allow for other aspects of the head beyond the face to be investigated, further points were added to include a wider range of points than normal, particularly points on the lateral areas of the head. The decision to include additional points can be explained as follows. The purpose of facial approximation of modern humans is to generate a specific recognition of a target individual, however, in facial approximation of ancient hominids the purpose is to show morphological differences between separate species. As morphological differences extend beyond the face to the rest of the head, the inclusion of further points allows for such comparisons to be made. A maximum total of 61 soft tissue depth measurements were possible per individual as well as 21 measurements of craniometric dimensions. All cephalometric points were positioned onto 3D volume renderings of the skulls prior to measuring with the exception of 6 points (gnathion, metopion, mid-mandibular body, mid-mandibular border, mid-nasal, mid-philtrum) that were more precisely aligned in 3D multiplanar reformatting by halving the inter-landmark distance between two adjacent points along the sagittal plane. When analyzing scans, it was noticed that some sutures of the cranium had been obliterated. However, as sutures are needed for the positioning of bregma, dacryon, ectoconchion, nasion, pterion, and zygomaxillare, these points were positioned and cross-checked against a reference skull of a chimpanzee from the Vernon-Roberts Anatomy and Pathology Museum, University of Adelaide. Depths at all landmarks were then measured perpendicular to the bone surface using the coronal, sagittal, and transverse planes to control the direction of measurement. Xscope, v. 4.4.1 (ARTIS Software, Virginia, USA) was used to superimpose horizontal and vertical guides during the measurement procedure to keep measurements parallel to the reference planes. For thicknesses at bilateral landmarks, measurements were taken from both the right and the left side of the face and then the mean was calculated. Very few depths were unobtainable with the exception being those landmarks located in areas of incomplete DICOM data or where the soft tissues were outside of the anatomical position. For example, craniometrics crossing the occlusal line were not taken from subjects with open mandibles and soft tissue measurements were not taken for prosthion and/or infradentale in subjects with folded lips. Table 1 gives a complete list of all landmarks used in this study, their definitions, as well as their corresponding planes and angles of measurement. All soft tissue thickness data collected for chimpanzees has been made available by the authors on Figshare (https://figshare.com/s/8a3ba7df4dad9df7d70d).  Table 1. https://doi.org/10.1371/journal.pone.0245760.g001

PLOS ONE
Towards the restoration of ancient hominid craniofacial anatomy Intra-observer soft tissue and craniometric measurement reliability was assessed by testretest measurements following recently discussed data collection protocols in Stephan et al. [32]. Measurements were taken on a subsample of five individuals that were randomly selected by a volunteer. Specimens were PRI-9783, PRI-Akira, PRI-10814, PRI-Mari, and PRI-Reiko. Measurements of each specimen were conducted on separate days over a four-hour period with retest measurements taken seven days after initial assessment for 40 total measuring hours (96 hours including the entire sample). Intra-observer measurement reliability was calculated using the technical error of measurement (TEM) equation: TEM ¼ ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi SD 2 =2n p Where D is the difference between the values obtained for measurements taken on two separate occasions for each variable raised to the second power and n is the number of individuals measured. The relative TEM (r-TEM) was also calculated by dividing TEM by the mean measurement to convert to a percentage value. The addition of another investigator to determine inter-observer errors was not included in this study as Stephan et al. [32] do not appear to suggest that it is necessary to do both, especially when the entire study is conducted by a single observer. Two-tail t-tests were used to analyze differences in mean soft tissue thicknesses and craniometric dimensions between males and females as well as between living and deceased subjects. Significance levels were set at p < 0.05 but altered according to the Bonferroni adjustment for 39 comparisons of soft tissue thicknesses and 21 comparisons of craniometric dimensions. Post-hoc power analyses were also performed to validate conclusions from two-tail t-tests. These analyses were conducted in Microsoft 1 Excel 1 , v. 16.39 for Mac.
Data distribution was checked using the Kolmogorov-Smirnov and Shapiro-Wilk tests. All variables' distributions did not differ significantly from the normal distribution, so no attenuation correction was made. Stepwise multivariate linear regression analysis was then performed to examine the relationships between facial soft tissue thicknesses and craniometric dimensions. Due to the small sample size, missing values in the data set were imputed with variable means rather than using list-wise or pair-wise deletion. These analyses were carried out with the Statistical Package for the Social Sciences (SPSS 1 ) software, v. 26.0 for Mac (SPSS Inc, Chicago, II, USA).
To assess the validity of the regression models, they were tested on an in-group sample of 19 subjects. We admit that it would have been better to perform this test on a separate sample of chimpanzees, rather than the same (training) sample. However, given that the total sample size was already small (n = 19), and no other samples were available, it was decided that it would be better to maintain the largest possible sample size for the regression analyses, rather than removing some individuals to generate a separate out-of-group sample just for validation. With that said, craniometric measurements for each subject, taken from the results of the above described measurement procedure, were employed with the appropriate regression models to predict facial soft tissue thickness at 26 landmarks (8 medial and 18 bilateral). Empirical comparisons were then calculated as the simple difference, z-score and relative percent difference between predicted and observed soft tissue thicknesses. These analyses were conducted in Microsoft 1 Excel 1 , v. 16.39 for Mac.
To validate the interspecies compatibility of the regression models, an out-of-group test was conducted on computed tomography scans of one living adult modern human male of European decent and one wet specimen of a deceased sub-adult male bonobo (Pan paniscus; subject S9655). Bonobos and chimpanzees are highly similar to each other in many respects, however, they are classified as distinct species [33]. S9655 is a sub-adult (4 years) male and thus outside the age-range of this study sample. Therefore, the bonobo served as an independent test on a sub-adult individual belonging to a separate species that was not used to generate the regression models. Moreover, to the knowledge of the authors, this is the only bonobo scan that is publicly available and as such removes the possibility of selection bias. The bonobo was accessed online via Morphosource (https://www.morphosource.org). The human subject was donated specifically for the purpose of this study to the University of Adelaide by an anonymous donor. Both subjects were obtained as DICOM format bitmap files. Soft tissue measurements used in the comparison between observed and predicted values were taken following the aforementioned protocol and the landmarks are listed in Table 1.
To demonstrate the practical utility of the regression models, a 3D facial approximation was performed on the skull of subjects PRI-Cleo, S9655, as well as on an Australopithecus africanus skull (a composite reconstruction of specimens Sts 5 and Sts 52) previously described in Strait et al. [34] and Benazzi et al. [35,36]. The Sts 5 specimen, dated to 2.14 Ma [37] and found in 1947 at the South African Sterkfontein site by Robert Broom and John Robinson [38], is a perfect candidate for demonstrating the utility of the facial approximation method. Despite the edentulous maxilla and a break in the cranium associated with a dynamite explosion at the time of discovery, Sts 5 is an exceptionally preserved specimen relative to most other Plio-Pleistocene hominid skulls [39][40][41]. It is worth nothing here that while the identification of Sts 5 as a female has been and is still the subject of ongoing debate [42][43][44], the cranial features of the Sts 5 specimen suggest that it was certainly not a subadult individual; therefore, it is well within the age-range of this study sample [45,46]. The Sts 52 mandible was reconstructed using state-of-the-art digital methods from research quality casts of the original specimen. For a full description of the digital reconstruction process, see S1 Table in Benazzi et al. [35].
To begin the facial approximation procedure, soft tissue thicknesses were estimated for all three subjects by taking craniometric measurements from the skulls in digital format and inserting these measurements into the appropriate regression models. The resulting values were then used to design and place pegs corresponding to the results onto the skulls at their appropriate cephalometric points in Autodesk Maya, 2018 (Autodesk, San Rafael, CA). Both skulls were then 3D printed as recommended by Walker and Humphries [47]. Each skull was printed separately with articulated mandibles on the M200 3D printer (Zortrax 1 ) in acrylonitrile butadiene styrene via fused deposition modelling. Post-processing of prints involved the removal of all support material and then mounting the skulls in the Frankfurt horizontal plane. As a precautionary measure, all three of the 3D printed skulls were measured and crosschecked against measurements taken from their digital counterparts. No discrepancies were observed. The soft tissues were then constructed using an oil-based modelling medium by GV using the pegs to guide the thicknesses at each cephalometric point. Given that facial features (eyes, nose, mouth, and ears) were not the focus of the present study and are likely to be based on intuition rather than empirical science, especially in the case of A. africanus, the eyes were closed, the nose and mouth were left undefined, and the ears were omitted from all three approximations.

Results
The descriptive statistics for soft tissue measurements and the craniometrics are presented in Tables 2 and 3 respectively, along with the intra-observer TEM and r-TEM for each variable. The mean intra-observer r-TEM for measurements of soft tissue thickness was 2.38%. The lowest intra-observer r-TEM was observed for prosthion (0.45%) and the largest was rhinion (7.00%). For the craniometrics, the mean intra-observer r-TEM was 0.25%. The lowest intraobserver r-TEM was observed for the distance from vertex to subnasale (0.05%) and the largest was for bigonial breadth (0.93%). It may be suggested that the TEM values reported here are underestimates because the repeat measurements were performed on the same scans rather than replicating the whole measurement process to include the reacquisition of scans. However, not only are CT scans of great apes exceptionally rare but also reacquisition of scans of living apes would expose these endangered animals to needless levels of ionizing radiation and is, therefore, unethical.
Two-tail t-tests show that there were no significant differences between living and deceased chimpanzee means across all 39 soft tissue thicknesses ( Table 2) and 21 craniometric dimensions (Table 3). This implies that the use of fresh, frozen, and immersed subjects in this study did not compromise the validity of the results. However, post-hoc power analysis revealed that the probability of detecting a significant difference was low and that these results should be taken as tentative. In contrast, post-hoc power analysis revealed that comparisons of sexual dimorphisms were valid. Our results are in agreement with prior research carried out on living chimpanzees in that there are no significant differences between male and female means [18], with the exception of two soft tissue landmarks (vertex and frontotemporale) and two

PLOS ONE
Towards the restoration of ancient hominid craniofacial anatomy craniometrics (bizygomatic breadth and inter-canine breadth). The greatest difference between male and female soft tissue measurements was observed for vertex, and the lowest difference for the craniometric was observed for bizygomatic breadth. Note that this means there were only two instances out of 39 soft tissue depths and 21 craniometric dimensions that were significantly different between the sexes. In stepwise multivariate linear regression analyses, statistically significant (p < 0.05) correlations between soft tissue depth measurements (Table 2) and craniometric dimensions ( Table 3) were found. Of the 39 cephalometric landmarks assessed, statistically significant regression models were established for 26 landmarks (8 medial and 18 bilateral) and these are given in Table 4. Scatterplots showing four examples of bivariate relationships are shown in Fig 2. As such, it is now possible to reconstruct soft tissue thickness at 44 individual points on the face of chimpanzees using regression models alone. The mean standard error of the estimate (SEE) was 2.39 mm, ranging from 0.42 mm for menton to 5.29 mm for ectomolare 2 , and the mean multiple correlation coefficients (R = 0.67) far exceed those produced in previous studies of human material [21,22]. The model with the highest correlation coefficient was pterion (r = 0.93) and the lowest was mid-nasal (r = 0.46). A factor that should be considered here is that if this study sample of chimpanzees was indeed composed of subjects that were scanned in both the supine and prone positions, then the correlations observed here are likely underestimates and the true strengths of correlations higher than those reported.
The results of the regression models applied to the in-group sample of 19 subjects are shown in Table 5. Overall, the performance of the regression models was accurate as the

PLOS ONE
Towards the restoration of ancient hominid craniofacial anatomy differences between observed and predicted soft tissue thickness values were small (mean difference in mm = 1.96 mm; mean z-score = 0.54; mean relative difference 18.13%). The minimum difference in mm was observed in the regression model for mid-nasal (0.30 mm) and the maximum for ectomolare 2 (5.18 mm) The minimum z-score was observed in the regression model for pterion (0.28) and the maximum for mid-supraobital superius (0.96). The minimum relative difference was observed in the regression model for prosthion (7.62%) and the maximum for mid-mandibular border (28.50%). Fig 3 shows the results of the regression models applied in 3D facial approximations of subject PRI-Cleo (Fig 3A), the out-of-group bonobo subject S9655 (Fig 3B), and the composite skull of A. africanus (hereafter referred to as Sts 5; Fig 3C). The differences between observed and predicted soft tissue thicknesses for PRI-Cleo and S9655 were small (Fig 4). The mean difference across 25 landmarks was 1.3 mm for PRI-Cleo (minimum = 0.1 mm; maximum = 5.8 mm) and 1.5mm for S9655 (minimum = 0 mm; maximum = 4.1 mm), which demonstrates that chimpanzee-derived regression models have closely predicted the soft tissue thicknesses for a sub-adult bonobo. In contrast, in the human subject the mean difference across 26 landmarks was 14.5 mm (minimum = 0.1 mm; maximum = 61.5 mm). This difference is much higher than those reported for PRI-Cleo and S9655 and in one case (menton) the regression

PLOS ONE
Towards the restoration of ancient hominid craniofacial anatomy model produced a negative result, which clearly indicates a fundamental problem with chimpderived regression models for predicting modern human soft tissue thicknesses. As it is not possible for tissues thicknesses to be negative or equal to zero a 3D facial approximation of the human subject was not produced.

Discussion
Despite the aforementioned limitations of sample size, specimen condition, and complications related to the reacquisition of scans, we have observed that soft tissue thicknesses covary with craniometric dimensions in chimpanzees. Using Stephan's [17] previously published soft tissue thicknesses for humans, a comparison can be made for 23 cephalometric points between P. troglodytes and H. sapiens means. Fig 5A shows a number of human thicknesses that are more similar to chimpanzees than reported in previous research measuring chimpanzee tissue thicknesses via ultrasound [18]. The thickness of soft tissue in the area of the cheeks, which corresponds to landmarks ectomolare 2 and ectomolare 2 , has been reported to differ between humans and chimpanzees with humans having thicker cheeks as a result of increased adipose tissue at this region [6,18]. However, our data show that mean

PLOS ONE
Towards the restoration of ancient hominid craniofacial anatomy thicknesses at ectomolare 2 are identical between humans and chimpanzees and that ectomolare 2 is only marginally larger in humans. Similarly, zygion is identical between species and not smaller in chimpanzees as previously reported. It is worth noting here that since measurements taken by ultrasound are known to compress the tissues of the face in comparison to CT based measurements [32], soft tissue thickness at zygion was most likely underestimated in the previous study [18]. Of the larger differences shown in Fig 5B, there are only slight differences between human and chimpanzee means (minimum = 2 mm; maximum 7.8 mm). It is important to note that this comparison includes only 23 out of the 39 points that were measured in this study and, therefore, a more thorough comparison composed of a larger sample of human values may reveal less similarities than what is reported here. It may be argued also that, based on the similarities identified here, human and chimpanzee means are largely interchangeable and that this may appear like a valid option in the facial approximations of extinct hominids. However, we would like to remind the reader of two problems inherent in using means: 1) means have only been verified for a limited number of landmarks and therefore other regions of the face and head will need to be intuited or the thicknesses interpolated from species specific means; and, more   Table 1. https://doi.org/10.1371/journal.pone.0245760.g004

PLOS ONE
Towards the restoration of ancient hominid craniofacial anatomy importantly 2), means completely ignore variation among individuals. If we had interpolated chimpanzee means into our facial approximation of S9655 the average difference between the observed and predicted soft tissue thicknesses would have been 3.1 mm (minimum = 0.2 mm; maximum = 11.3 mm), which is higher than the average difference of 1.5 mm (minimum = 0 mm; maximum = 4.1 mm) that was produced using the regression models. Furthermore, with regard to our comparison of soft tissue thicknesses and craniometric dimensions between male and female chimpanzees, we have shown that in general they do not display sexual dimorphism and that this does not justify producing separate soft tissue prediction models for males and females because variation between sexes is negligible. In this respect, sexual dimorphism in chimpanzees is similar to modern humans [48], which corresponds with previous analyses of craniofacial sexual dimorphism among extant hominids [49,50].
The results of the out-of-group tests on the bonobo and human subjects suggest that chimp-derived regression models are compatible with species that have craniometrics that are more similar to chimpanzees than to those of modern humans. As is presented in Table 6 and Fig 6, chimpanzees, bonobos, and modern humans do not equally display disparate craniometric differences. Of the 15 craniometrics taken from S9655, only four were outside the range of variation observed in this study sample of chimpanzees, whereas the human subject presented 11 craniometrics that were outside the range. The slight differences observed in the craniometric dimensions for S9655, however, did not appear to largely affect the predictive accuracy of the regression models, whereas large differences observed in the craniometric dimensions of the human subject produced large estimation errors. With the understanding that the craniofacial morphology of bonobos is similar to chimpanzees, particularly in cranial dimensions and the morphology of the masticatory apparatus, this result is to be expected. We admit that this test was conducted on two individuals only and that this may be perceived as weak evidence for regression model interspecies compatibility, however, we think it unreasonable to assert that all 26 regression models have performed fittingly on the bonobo subject and poorly on the human subject as a result of random chance. Given that covariation between soft tissue thicknesses and craniometric measurements has been observed in both extant Homo and Pan species, we hold that it is reasonable to assume that such covariation was present in archaic hominids, such as Sts 5. We submit also that skull morphology is the prime determinant of regression model interspecies compatibility and that chimpanzee-derived regression models are valid for reconstructing the facial appearance of Sts 5. The justification for this is as follows. First, Sts 5's craniometrics were just as different from chimpanzees as S9655's were (Table 6 and Fig 6). We must therefore agree with previous authors [51][52][53] in that Pan appears to be the most suitable extant hominid upon which extrapolations of covariation can be made for A. africanus. It is supported by the fact that since the chimpanzee-bonobo split c.2 Ma ago there have been no musculoskeletal changes in bonobos [24]. If bonobos had gone extinct c.2 Ma ago chimpanzee-derived regression models would still have produced an accurate result. Based on this and the closer affinity of the Sts 5  Table 1.
Bolded text indicates where craniometrics were outside ± two standard deviations from chimpanzee sample means.
https://doi.org/10.1371/journal.pone.0245760.t006 skull to Pan, any estimation errors in soft tissue thickness for Sts 5 are likely to be similar to or only slightly larger than those of S9655. Second, under the assumption that sexual dimorphism of soft tissue thicknesses in A. africanus did not differ significantly from chimpanzees and modern humans, it is clear that the lack of consensus surrounding the sex of Sts 5 did not affect the precision of the regression models' predictions. Although we agree with Montagu [54] that the true soft tissue thicknesses for extinct hominids are largely unknowable, we argue that this fact does not diminish the utility of chimpanzee-derived regression models in formulating an informed hypothesis about the facial appearance of Sts 5 and other hominids with similar craniometrics.
To the best of the authors' knowledge, regression models have not been used in the facial approximations of Plio-Pleistocene hominids prior to the present study. Earlier reconstructions have relied heavily on species specific means and/or comparative anatomy of primate muscle morphology [1]. Here we will focus only on the latter as the limitations of means were discussed previously. The location and shape of muscle attachment areas on great ape skulls has been described in detail for bonobos [24], chimpanzees [26], orangutans [55], and gorillas [56], as well as in comparative anatomy textbooks [57]. In any hominid facial approximation, there is an obvious importance in knowing the origin and insertion of the various muscles of the face and head between great apes and humans. However, knowledge of correct anatomical positions of individual muscles is not a substitute for specific estimates for the volumes of the muscles themselves and their coverings, namely the thicknesses of subcutaneous adipose tissues and epithelial linings. Gurche [1] reports being able to systematically determine the size, location, and shape of muscles based on macroscopic surface markings on fossil bones. We will not be surprised if some readers support Gurche's method as found throughout the facial approximation literature is the view that that the face can be reliably approximated from the construction of the facial musculature alone [14,15,31]. However, as Ullrich and Stephan [58] have shown, this is a gross misinterpretation of the facial approximation method. In actual fact, facial approximation has always relied heavily on empirical data on soft tissue thicknesses [12,13,59]. Gerasimov, for example, implemented soft tissue thickness measurements, only ever placed four muscles onto the skull (the masseter and temporalis muscles), and considered adding any further muscles, such as those of facial expression, to be pointless since their attachments to the skull are not visible. Furthermore, research attempting to recover the size and location of 92 muscles in human material reported that only 23 could be reliably reconstructed from bone alone [60]. Gurche's reconstructions are not necessarily illogical by any means but his approximations are not produced from direct observations of bone as commonly believed [8]. While a practitioner's sculptural skills and anatomical expertise is an obvious benefit in any facial approximation, in isolation the intuited use of this knowledge alone is highly vulnerable to subjective interpretation. For example, soft tissue may be added or subtracted based on personal preference. In contrast, the regression models of the present study provide direct evidence for the approximation of hominid soft tissues. We would like to clarify here that the models do not allow for an entirely speculation-free reconstruction because subjective input is still required to interpolate the surface between landmarks. However, the regression models certainly help to inform any interpolation that is needed. The models may also be useful for studies reconstructing the physiology of extinct hominids. The masticatory system of Australopithecus, for example, may be analyzed in more detail by assigning empirical values to individual muscles of the head. Regression models for mid-ramus, temporal fossa, and euryon reflect the volume of the masseter and temporalis muscles, which may be used to further analyze the biting performance of these hominid species. Regression models may also be extended in future studies to include the postcranial skeleton and improve upon current body mass estimates for extinct hominids [61].
The strong correlations observed in this study certainly raise questions about the claim that soft tissue thicknesses do not covary sufficiently enough with craniometric dimensions to improve soft tissue estimates in craniofacial identification [22]. Given that correlation coefficients generated from regression analysis are sensitive to measurement error and that these errors can only detract from the strengths of association, it is possible that measurement error accounts for some, if not all, of the differences between the correlation coefficients of the present study and those reported in previous studies. The mean intra-observer r-TEMs for soft tissue measurements collected in the present study are lower than the mean intra-observer r-TEM of 8% recorded by Stephan and Sievwright [22], which involved measurements of living human subjects by B-mode ultrasound. The small mean intra-observer r-TEM for the craniometrics in the current study also stands in contrast to the mean intra-observer r-TEM of 2% recorded by Stephan and Sievwright [22]. These results follow the usual trend whereby soft tissue measurements pose a greater challenge to measurement accuracy than the craniometrics. However, they also show that measurements taken from CT scans in OsiriX in the present study are more accurate than measurements taken in previous studies via ultrasound [22]. As stated previously, measurements taken by ultrasound are known to compress the soft tissues in comparison to CT based measurements [32]. It is a basic fact of statistics that random errors reduce covariations and thus produce poorer results of correlations and regressions [62]. Thus, correlations generated from CT based measurements can be expected to be stronger than those obtained from ultrasound measurements. With that said, in the specific case of facial soft tissue thicknesses, it is difficult to evaluate conclusively until further analyses of covariation are made using human material and more reliable CT based measurements.
It is important for us to be transparent about the limitation of the regression models. The aim of any facial approximation is to provide an accurate model of a complete subject. Disappointingly, our regression models offer very little information about the facial features of hominids as they only provide a 3D silhouette upon which facial features can be built. In our approximation of subjects PRI-Cleo and S9655, the facial features can be extrapolated from photographic evidence of great apes as shown in the completed approximations in Fig 7A and  7B respectively. However, for Sts 5 the challenge is further complicated by the fact that practitioners of facial approximation have no direct information to extrapolate the facial features from. Numerous facial approximation studies have developed methods for approximating the facial features in modern humans, although the validity of these methods applied to other hominids has never been tested. Published methods include the approximation of eyeball diameter and anatomical placement in the orbits [63]; eyebrow size, position and shape [64][65][66][67]; nasal profile [13,31,[68][69][70][71][72]; mouth width and shape [12,14,31,58,[73][74][75]; and size and shape of the external ear [11-13, 67, 76, 77]. Given that facial features are needed to complete any facial approximation, the interspecies compatibility of these methods is worthy of detailed examination in the near future to allow for complete approximations of Plio-Pleistocene hominids to be produced. Until then, the facial features presented in any facial approximation of Sts 5 must obviously remain tentative. For this reason, we chose to present our final reconstruction of Sts 5, shown in Fig 7C, without facial features. While we could have followed in the footsteps of previous practitioners and used our intuition to estimate the facial features, we feel this would only dilute the significance of our results. Mixing up what we know with that which is unknown would only induce confusion. Thus, incomplete as it may be, in Fig 7C we present only what the results of the present study can accomplish. The undefined mass of tissue produced, as a result of the regression models predictions, highlights just how much work there is yet to be done in this domain.
Second, craniofacial morphology among Plio-Pleistocene hominid taxa is highly variable and as such not all fossil craniometrics may fall inside or close to the range modelled in the present study's regression analysis. Those hominid crania with craniometrics that are at or outside of the extreme ends of the independent variable are likely to produce large estimation errors in soft tissue thickness. As mentioned previously, the craniometrics of Sts 5, as well as subject S9655, are within and close to the range of variation observed in the present study sample of chimpanzees. However, this was not the case for the modern human, which resulted in poor approximation of soft tissue thicknesses, and will not be the case for all hominid skulls, especially specimens with craniofacial morphologies like those exhibited in Paranthropus boisei [78,79]. Repeating stepwise multivariate regression analyses on other extant apes with craniometrics that approximate these fossils, such as gorillas and orangutans, is one possible solution to be explored in future studies.
Finally, it cannot be assumed that correlations observed in adult subjects scale isometrically for very young hominid skulls like that of the Taung fossil or DIK-1 [80,81]. Changes in modern human soft tissue thicknesses between 0 and 19 years of age have been shown to be small and relatively constant throughout ontogeny [17,82], but their relations to substantially growing craniometric dimensions may not be the same as in adults. Similarly, thickness changes throughout ontogeny in other primates are unknown, therefore the regression models of the present study may only be viable for approximating adult hominid faces.

Conclusions
The results of this study show that soft tissue and craniometric measurements covary in chimpanzees, which confirms that such covariation is uniformly present in both extant Homo and Pan species. Chimpanzee-derived regression models appear to be compatible with bonobos but show a marked decrease in predictive accuracy in humans, suggesting that regression model reliability is dependent on craniometric similarity. As the craniometric dimensions of early hominids, such as South African australopithecines, are more similar to chimpanzees than those of humans, chimpanzee-derived regression models may be used to approximate their craniofacial anatomy. Additional relationships between soft tissue thickness and craniometric dimensions of non-human hominids may contribute to more precise facial approximations of early hominids bringing them even closer to standards of objectivity used in forensic sciences. It is hoped that the results of the present study and the reference dataset for facial soft tissue thicknesses of chimpanzees it provides will encourage further research into this topic.
Supporting information S1 Table. List of specimens used in this study. (DOCX)