Contacts in the last 90,000 years over the Strait of Gibraltar evidenced by genetic analysis of wild boar (Sus scrofa)

Contacts across the Strait of Gibraltar in the Pleistocene have been examined in several studies, which have demonstrated that this apparent barrier has been permeable to human and faunal movements in both directions. Our study, based on the genetic analysis of wild boar (Sus scrofa), suggests that there has been contact between Africa and Europe through the Strait of Gibraltar in the Late Pleistocene (at least in the last 90,000 years), as shown by the partial analysis of mitochondrial DNA. Cytochrome b and control region sequences from North African wild boar indicate a close relationship with European wild boar, and some specimens even belong to a haplotype that is common in Europe. The analyses suggest that wild boar phylogeography in North Africa was transformed by the emergence of a natural communication route at times when sea levels fell due to climatic changes, and possibly by human action, since the contacts coincide with both the Last Glacial period and increasing human dispersal via the strait.


Introduction
At present, Africa and Europe are geographically close, separated by only 14 km across the Strait of Gibraltar. However, it is known that major falls in sea level (~100 metres) related to glacial periods, and the consequent emergence of islands, reduced this distance to a series of smaller marine barriers of less than 5 km each [1][2][3]. In this situation, interaction between both sides of the strait seems possible, despite the strait being a barrier for some species [4][5][6][7][8].
Evidence of contacts across major marine distances is not new. Human dispersal across a marine barrier 0.88 million years ago (MYA) has been proven on Flores Island (Java) [9,10]. During glacial periods, the Strait of Sicily would not have acted as a major geographical barrier for some species [11,12]. For the Strait of Gibraltar, there are documented cases of movements of hominids and fauna across this permeable barrier [2,10,13,14,15,16,17]. For example, the arrival of humans and vertebrate fauna to the Iberian Peninsula from Africa has been recorded at the sites of Orce (southeast Spain) as early as the Plio-Pleistocene boundary [2,10,18]. Due to the complex biogeographic histories of some species, it may be complicated, or even impossible, to distinguish the cause of movements in the Late Pleistocene. The sea level remained lower until the Last Glacial Maximum (LGM), between 25,000 and 18,000 years ago. Thus contacts could have taken place through natural migrations or through colonisations resulting from anthropogenic introductions [17,19].
In any case, the North African wild boar (Sus scrofa) is closely related to the European wild boar, which indicates a strong gene flow [20,21], but no studies have focused on the possible routes of these contacts. Very few studies exist about African populations, and the history of the native wild boar in North Africa is poorly known. We found some references from historical and paleontological records about its possible origin [22][23][24], one study about the genetic structure of the wild boar population of Tunisia [25], and a number of studies about African pigs [26,27]. In GenBank we found only three cytochrome b sequences and four control region sequences that were either identified exclusively in Morocco or belonged to haplotypes also found in wild boars from other areas. In this study, five Moroccan wild boar samples were analysed and incorporated into GenBank [28]. This dataset has sufficed to allow us to test the hypothesis of the present work.
The present study aims to elucidate contacts between Africa and the Iberian Peninsula across the Strait of Gibraltar by considering the genetic similarity of the wild boar populations on both sides of the strait.
We analysed mitochondrial DNA (mtDNA) cytochrome b together with the control region because the latter is more variable. The first analysis of Y-chromosome polymorphisms of a Moroccan wild boar is also provided.

Samples and DNA extraction
Hair and tissue were obtained from five wild boar individuals. Specifically, four females and one male (WBMoroc2) were sampled from the Middle Atlas in Morocco. Samples were collected during the 2014 and 2015 hunting seasons. Throughout the study area no special permits were required to legally hunt wild boars, only a general hunting licence. Animals were killed for other purposes and no authors were involved in hunting. Samples were obtained directly from licenced hunters.
Mitochondrial DNA was extracted from hair roots and tissue samples using the material and protocols for DNA isolation of the Invisorb® Spin Forensic Kit (STRATEC Biomedical AG, Berlin).

Mitochondrial DNA amplification and sequencing
Mitochondrial DNA was amplified using the primers and amplification profiles described by Alves et al. [29] (S1 Table). The thermocycling profile was the same for both cytochrome b and the control region: one cycle at 94˚C for 2 min, followed by 30 cycles of 94˚C for 45 seconds, 55˚C for 45 seconds and 72˚C for 1 min, and finally an extension step at 72˚C for 10 min. PCR products were purified and sent to Macrogen (http://www.macrogen.com/eng/) for sequencing. We obtained two complementary fragments for each region, which were assembled using BioEdit 7.2.5 [30].
conditions: 95˚C for 10 min and 35 cycles of 94˚C for 1 min, annealing temperature (Tm) (S1 Table) for 1 min, 72˚C for 1 min, and finally an extension step at 72˚C for 15 min [21,27]. PCR products were purified and sent to Macrogen for sequencing.

Mitochondrial DNA analyses
In all, 1,152 base pairs (bp), including the entire cytochrome b, were obtained for the five analysed wild boar samples (GenBank accession numbers: KU664546, KU6645407, KU66454, KU664553 and KU608293) and aligned with the 358 wild boar sequences available in GenBank (Table A in S2 Table). Bearded pig (Sus barbatus), Celebes wild boar (Sus celebensis), Philippine warty pig (Sus philippensis) and common warthog (Phacochoerus africanus) were employed as outgroups. The partial control region (995 bp) was amplified for the five analysed wild boar samples (GenBank accession numbers: KU664554-KU664558) and aligned with the 1,210 sequences from GenBank (Table B in S2 Table). Celebes wild boar (S. celebensis) and common warthog (P. africanus) were employed as outgroups.
The sequences used for the analysis comprised wild boar with a wide geographical distribution, including Europe, Africa, Near East and Asia. These regions represent the geographic areas of interest for our study. From the available GenBank sequences, we selected those from wild boar. We ruled out most sequences from clones, feral pigs, mixed and archaeological specimens, as well as the sequences classified as unverified and predicted. The size of the final dataset to be used for the analyses varied after aligning all the selected sequences and removing the positions that contained gaps and missing data (N).
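The filtering step described above, in which alignment positions containing gaps or missing data (N) are removed, can be sketched as follows. This is a minimal illustration of complete deletion, not the procedure implemented in BioEdit or MEGA6; the function name `complete_deletion`, the symbols treated as missing and the toy sequences are all assumptions made for the example.

```python
def complete_deletion(alignment, missing="-N?"):
    """Drop every alignment column in which any sequence has a gap or missing base.

    `alignment` maps a sequence name to an aligned sequence string.
    `missing` lists the characters treated as gaps or missing data (assumed set).
    """
    if not alignment:
        return {}
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    seqs = {name: seq.upper() for name, seq in alignment.items()}
    n_cols = lengths.pop()
    # Keep a column only if no sequence shows a gap or missing symbol there.
    keep = [i for i in range(n_cols)
            if all(seq[i] not in missing for seq in seqs.values())]
    return {name: "".join(seq[i] for i in keep) for name, seq in seqs.items()}


# Toy alignment (hypothetical sequences, not data from this study).
toy = {
    "sampleA": "ACGT-ACGTA",
    "sampleB": "ACGTTACNTA",
    "sampleC": "ACGATACGTA",
}
trimmed = complete_deletion(toy)
for name, seq in trimmed.items():
    print(name, seq, len(seq))
```

Because every column with a gap or an N in any sequence is dropped, the usable length depends on which sequences are included, which is why the size of the final dataset varies between analyses.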
Sequences were aligned using the ClustalW alignment tool included in BioEdit 7.2.5. The number of haplotypes was calculated using the DnaSP 5.10 software [31]. The cytochrome b haplotypes obtained here were given the code "CB", and those obtained for the control region were coded as "CR". The best nucleotide substitution models were selected using jModelTest 2.1.7 [32] under the Bayesian Information Criterion (BIC). Pairwise genetic distances between sequences were calculated in MEGA6 [33] with 1,000 bootstrap replicates and a gamma distribution (shape parameter = 0.5).
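To make the notions of haplotype and pairwise distance concrete, the sketch below collapses identical aligned sequences into haplotypes and computes an uncorrected p-distance between two of them. It is only an illustration: DnaSP and MEGA6 were used in the study, and the p-distance here is a simpler stand-in for the model-corrected distances with gamma-distributed rates. The function names, the "H" label prefix and the toy sequences are hypothetical.

```python
from collections import defaultdict


def collapse_haplotypes(alignment, prefix="H"):
    """Group identical aligned sequences and label each distinct one as a haplotype.

    Labels are assigned by decreasing frequency (ties broken alphabetically by
    sequence), which is an arbitrary convention chosen for this sketch.
    """
    groups = defaultdict(list)
    for name, seq in alignment.items():
        groups[seq.upper()].append(name)
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return {f"{prefix}{i}": (seq, names)
            for i, (seq, names) in enumerate(ordered, start=1)}


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ (uncorrected)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy alignment (hypothetical, gap-free sequences).
toy = {
    "boar1": "ACGTACGTAC",
    "boar2": "ACGTACGTAC",
    "boar3": "ACGTACGTAT",
    "boar4": "ACATACGTAC",
}
haplotypes = collapse_haplotypes(toy)
print(len(haplotypes), "haplotypes")
print(p_distance(haplotypes["H1"][0], haplotypes["H3"][0]))
```

A haplotype shared by individuals from different regions, as with the commonest cytochrome b haplotype, corresponds here to one key whose name list spans several sampling locations.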
Time of divergence (T) was estimated using the molecular clock equation T = K/(2r) [34], where T = divergence time in years, K = genetic distance and r = rate of nucleotide substitutions. The genetic distance (K) between P. africanus and genus Sus was calculated with the Tamura 3-parameter model and gamma distribution (shape parameter = 0.5) using MEGA6 in both cytochrome b and the control region. We assumed a substitution rate (r) of 1 × 10⁻⁸ per site per year for cytochrome b. This rate was previously estimated for complete mtDNA [35]. We used a higher substitution rate (r) of 1.37 × 10⁻⁸ per site per year, as estimated by Pesole et al. [36], for the control region in mammals [37,38].
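As a worked illustration of the molecular clock equation, the sketch below evaluates T = K/(2r) with the two substitution rates given above. The factor of 2 reflects that substitutions accumulate along both lineages after the split. The distance value `k_example` is hypothetical, chosen only to show the order of magnitude involved; it is not taken from the supplementary tables.

```python
def divergence_time(k, r):
    """Return divergence time in years from T = K / (2r).

    k: genetic distance between the two lineages (substitutions per site).
    r: substitution rate per site per year along a single lineage.
    """
    if r <= 0:
        raise ValueError("substitution rate must be positive")
    return k / (2.0 * r)


CYTB_RATE = 1e-8               # per site per year, rate assumed for cytochrome b
CONTROL_REGION_RATE = 1.37e-8  # per site per year, rate assumed for the control region

k_example = 0.14  # hypothetical distance, for illustration only

t_cytb = divergence_time(k_example, CYTB_RATE)
t_cr = divergence_time(k_example, CONTROL_REGION_RATE)
print(f"cytochrome b rate:   {t_cytb:,.0f} years")
print(f"control region rate: {t_cr:,.0f} years")
```

The same distance yields a younger date under the faster control region rate, so distances from the two markers must each be paired with their own rate.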
Bayesian phylogenetic trees were constructed using BEAST 1.8.2 [39]. We assumed a strict clock and the coalescent prior with a constant size. Evolutionary parameters were given by jModelTest. At least two independent Markov chain Monte Carlo (MCMC) chains were run for 50 million generations, and parameter values were sampled every 1,000 generations. We examined the results using Tracer 1.6 [40]. We used TreeAnnotator 1.8.2 [39] to obtain the consensus trees. The first 10% of the sampled trees were discarded as burn-in and the resulting trees were visualized in FigTree 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/). Two median-joining networks [41] were generated and visualized using Network 5.0.0.1 (http://www.fluxus-engineering.com). For cytochrome b, with more complex data than the control region in the final dataset, we used the star contraction option and epsilon = 20.
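The MCMC settings above imply a fixed number of sampled trees per chain, which the following sketch works out. It is plain arithmetic on the stated settings rather than anything produced by BEAST or TreeAnnotator; the function name is hypothetical, and the real count may differ by one depending on whether the initial state is logged.

```python
def retained_samples(chain_length, sample_every, burnin_percent):
    """Return (total samples, burn-in samples, retained samples) for one chain."""
    if sample_every <= 0 or not 0 <= burnin_percent < 100:
        raise ValueError("invalid sampling interval or burn-in percentage")
    total = chain_length // sample_every
    burnin = total * burnin_percent // 100  # integer arithmetic avoids rounding issues
    return total, burnin, total - burnin


# Settings stated in the text: 50 million generations, sampling every 1,000, 10% burn-in.
total, burnin, kept = retained_samples(50_000_000, 1_000, 10)
print(f"sampled: {total:,}  burn-in: {burnin:,}  retained: {kept:,}")
```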

Results
For methodological reasons, the analyses were carried out with wild boar samples from Europe, Africa, the Near East and Asia, but we focused on the analysis of the clades that were most closely related to Africa and Europe. Asian sequences were included to improve the analysis of the divergence times and to better understand the spread of wild boar from Asia to Europe.

Cytochrome b analyses
Except for the network (1,030 bp), we used an 897 bp fragment of the cytochrome b gene for the analyses. In all, 107 haplotypes were identified from 363 wild boars (S7 Table). The Moroccan and Tunisian samples are included in haplotypes CB9 and CB90 (Table A in S2 Table). CB9 is the commonest haplotype in Europe and is shared by some North African wild boars, 33 European wild boars (including Italian ones), four Asian wild boars and four wild boars from Near Eastern countries. Focusing on the Iberian Peninsula, on the other side of the Strait of Gibraltar, 8 of the 13 Spanish wild boars were included in this haplotype.
For the phylogenetic analysis, the best model was the Generalised Time-Reversible Evolutionary Model with gamma distribution (GTR + G) [46] for cytochrome b. The Bayesian phylogenetic tree revealed the previously observed clades: E1 (European clade), E2 (Italian clade), NE (Near Eastern clade), A (Asian clade) (Fig 1A). The haplotypes with sequences from Morocco and Tunisia belonged to the European clade (E1).
The pairwise genetic distances between clades ranged from 0.0085 to 0.0138 (Table A in S3 Table). The mean distances between the haplotypes included in the European clade (E1) were calculated (Table B in S3 Table). The MoroccoA haplotype (CB90), which is exclusive to Africa, differs more from the rest than CB9, the other haplotype with African sequences, does.
The cytochrome b gene contains six single nucleotide polymorphisms (SNPs) that allow the differentiation between European and Asian haplogroups [26,37,47]. In order to understand the genetic diversity in African wild boar populations, variable sites for their sequences were analysed (S5 Table). The cytochrome b sequences (1,047 bp) have seven nucleotide polymorphic sites in the analysed fragment. There are five transitions, one transversion and one deletion.
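The distinction between transitions, transversions and deletions used above can be sketched with a small classifier. A transition exchanges two purines (A, G) or two pyrimidines (C, T), whereas a transversion exchanges a purine for a pyrimidine. The function names and the toy sequences below are hypothetical and do not reproduce the variable sites listed in S5 Table.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_difference(a, b):
    """Classify the difference between two aligned bases."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "identical"
    if a == "-" or b == "-":
        return "indel"
    if a not in PURINES | PYRIMIDINES or b not in PURINES | PYRIMIDINES:
        return "ambiguous"  # e.g. N: not counted as a substitution
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_differences(reference, sequence):
    """Count each type of difference between a sequence and a reference."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must have the same aligned length")
    counts = Counter(classify_difference(x, y) for x, y in zip(reference, sequence))
    counts.pop("identical", None)
    return dict(counts)


# Toy pair of aligned sequences (hypothetical).
reference = "ACGTACGTAC"
variant = "GCGCAC-TAA"
print(count_differences(reference, variant))
```

A count dominated by transitions, as in the variable sites reported here, is what is meant by a transitional bias.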
The time of divergence estimated between genus Sus and P. africanus was 7 million years (between 5.95 and 8.05) for cytochrome b (S6 Table). From the Bayesian phylogenetic tree, it was deduced that the time of isolation between the Asian and the other clades occurred approximately 818,300 years ago. The isolation of the European clade (E1) occurred some 429,500 years ago. The beginning of the isolation of haplotypes found in North Africa took place 63,800 years ago.
Networks were constructed to better visualize the relationships among clades (Fig 2A). We used fewer sequences with more base pairs to better understand existing relationships and to check for possible variations when using a larger segment size (1,030 bp). North African, Near Eastern and European wild boars clustered in the same way in both the network and the Bayesian phylogenetic tree.

Control region analyses
For the phylogenetic analysis, the best model was the Generalised Time-Reversible Evolutionary Model [46], with invariant sites and gamma distribution (GTR + I + G) for the control region. The Bayesian phylogenetic tree revealed the same clades seen in cytochrome b (Fig 1B). The sequences from Morocco and Tunisia clustered in the European clade (E1), except for the CR182 haplotype from Tunisia. The control region sequences from Egypt (CR46), Sudan (CR164) and Tunisia (CR182) belong to the Near Eastern clade (NE). Egypt shares a haplotype with some sequences from Iran.
The pairwise genetic distances between clades ranged from 0.0168 to 0.0293 (Table A in S4 Table). The mean distances between the haplotypes included in the European clade (E1) (Table B in S4 Table) showed that there was a longer distance between the haplotypes found in Morocco (between CR1 and CR140/141) than between some Moroccan ones and the Tunisian one (between CR1 and CR181). The higher distance values in the mtDNA control region can be explained by it being a more variable region than cytochrome b.
The analysed partial control region (406 bp) has 12 nucleotide polymorphic sites, including 11 transitions and one deletion (S5 Table).
The time of divergence estimated between genus Sus and P. africanus was 6.75 million years (between 5.51 and 7.99) for the control region. The time of the isolation between the Asian and the other clades occurred approximately 1,234,100 years ago (S6 Table). The time of divergence of the European clade (E1) was 696,000 years ago, and the beginning of the isolation of the haplotypes found in North Africa took place 116,600 years ago, according to the phylogenetic tree.
Finally, we generated the network (Fig 2B). In this case we included only the sequences that belonged to clades E1 (Europe), E2 (Italy) and NE (Near East) to focus on the connections of these groups. The relationships that we found were similar for both the network and the Bayesian phylogenetic tree.

Y-chromosome haplotype
For the Y-chromosome analysis, we sequenced the partial AMELY, USP9Y and UTY (UTYin1 and UTYin9) regions in one male from Morocco (WBMoroc2) in order to compare them with the published sequences of previous studies [21,48,49] and to identify their haplotype. There were three defined haplotypes, and our results showed that, according to Ramírez et al. [21], our sample belonged to the HY2 haplotype.

Discussion
We explored the relationships via the Strait of Gibraltar based on an analysis of the partial mtDNA of wild boar (S. scrofa). We agree with the presence of one European clade (E1) that is widely distributed in Europe and North Africa (Morocco and Tunisia), one exclusive of Italy (E2), and another with most Near Eastern sequences (NE). These results are congruent with those reported by Larson et al. [45] and Meiri et al. [50].
The control region phylogenetic tree also shows some Asian haplotypes in the basal clade. This coincides with the results of Larson et al. [45]. Regarding genetic distances, in both cases the European clade is closer to the clade of the Near East than to the Italian one. When considering the distances only in cytochrome b, the Asian clade displays the same distance with both the Italian and the Near East one, which gives rise to the different distribution in the corresponding trees. However, the cytochrome b shows some Asian sequences that are closer to the European ones, which is due to the specific dispersal process of these populations and their contacts with others [51].
Finally, the analyses confirm that the modern wild boars from Morocco and Tunisia share European haplotypes. Only the Tunisian wild boar with haplotype CR182, which is less frequent in our results, belongs to the Near Eastern clade, as do wild boars from Egypt and Sudan. The median-joining networks showed the same relationships when fewer sequences with more base pairs were used.
We obtained an interesting finding for the Moroccan wild boar (Table B in S4 Table). The estimated genetic distance between the sequences belonging to the CR1 haplotype from Morocco and the Tunisian CR181 is 0.0025. The estimated distance from CR1 to the CR140 and CR141 haplotypes from Morocco is 0.0051 and 0.0077, respectively. CR1 also displays short distances (0.0025) to some haplotypes found in Portugal, Spain, and even Greece (haplotypes CR12, 21, 48, 52, 60, 100 and 148). A long distance is seen between haplotypes CR141 and CR142. Differences between Moroccan populations are consistent with results from previous studies, and might indicate different origins or isolation due to geographical barriers in the Maghreb [6,52]. However, other studies provide very little information on the Moroccan samples used to obtain mtDNA. The specimens that belong to our study (CR1) are from the province of Khenifra. Morocco1 (CR1) is from Taforalt (Oujda) [45], and Morocco2 (CR140) and Morocco4 (CR1) are labelled with the location Atlas/Rabat [53]. Although these last two samples are referenced with the same location, this covers a wide area and they could belong to different populations. As we do not have further information, and without knowing the location of Morocco3 (CR142), more samples will be needed to confirm the hypothesis.
When focusing on the sequences from Israel, we find that they are included in the European clade (E1) and within the most frequent haplotype (CB9). Larson et al. [45] obtained a similar result for one wild boar from Armenia, and suggested that it might be due to introgression from European wild boars or feral pigs. This latter possibility does not seem plausible because Far Eastern haplotypes have a 29% frequency in international pig breeds [8,37], but there is no evidence for these haplotypes appearing in wild boars from Armenia, Israel or North Africa. The most logical explanation is the interchange of haplotypes between wild boars through human action or natural movements. In any case, even though the Near Eastern and African wild boars share this haplotype, they do not seem to be directly related to one another. Neither the phylogenetic trees nor the networks indicate any similarity between the two populations, except for one haplotype from Tunisia (CR182), which is located in the Near Eastern clade. Therefore, the commonest haplotype might have been transmitted to North Africa by contacts with the European wild boar.
The six SNPs located at specific mtDNA cytochrome b positions, and used to differentiate the origin of samples, confirm that the origin of all the sequences from Morocco and Tunisia used for these analyses is European. The results of the variable sites in the African wild boar sequences are representative of a strong transitional bias, usually found in mammals' mitochondrial evolution [54][55][56].
From the genetic distances and Bayesian phylogenetic trees included in this study, we can estimate that the Near Eastern clade originated between 857,100 and 429,500 years ago (S6 Table). After this event, the wild boar arrived in North Africa, possibly through Egypt, which would have isolated it from the Near Eastern clade. The presence of wild boar fossils from the Middle Pleistocene in North Africa and in several Near Eastern countries [24,57] supports our results, and the idea of wild boar forming part of the fauna of the Maghreb during this period. Nevertheless, the North African wild boar currently has European haplotypes.
Because the control region is more variable than cytochrome b, because the two analyses differ in aspects such as the size of the mtDNA segment used, and because the sequences analysed for the two regions do not come from the same specimens in most cases, the divergence times shown in S5 Table vary between the two markers, albeit within a narrow range. For the haplotypes found in Morocco and Tunisia, our analyses gave an approximation of their isolation of 63,800 years ago for cytochrome b, and of 116,600 years ago for the control region. Not until these haplotypes appeared could their dispersion between Europe and Africa have begun, which caused the genetic similarity. Therefore, by taking into account all these conditions and the years obtained, we offer an average approximation of 90,000 years (in the Late Pleistocene) for the time when a major genetic flow started between populations on both sides of the Strait of Gibraltar. This process must have given rise to the modern African wild boar, and something similar seems to have happened in Israel [50].
The Y-chromosome analysis shows that our sample belongs to haplotype HY2, which is present in at least Tunisia, Spain, Russia, Iran and Japan. Haplotypes HY1 and HY2 have been documented in Tunisia [21]. With the information available from the Moroccan and Tunisian Y-chromosome sequences, it seems that they might be related to the European or Near Eastern wild boar. Haplotype HY3 is relatively abundant in Far Eastern specimens, and has been detected in Kenyan and Mukota pigs [21], but is absent in the North African wild boar. These results support the hypothesis that there was no direct genetic flow between Asian and African wild boars, which is congruent with the view that the presence of the commonest mtDNA haplotype in Africa is not due to introgression by pigs, as previously mentioned. Although these data are not enough to draw definitive conclusions, given the lack of patrilineal history information on this point in Morocco, we feel the analysis has been valuable.
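The 90,000-year figure can be reproduced from the two marker-specific estimates above. The sketch assumes that the "average approximation" is a simple arithmetic mean rounded to the nearest 10,000 years; the text does not state the exact procedure, so both the mean and the rounding step are assumptions, and the function name is hypothetical.

```python
def average_onset(estimates_years):
    """Arithmetic mean of marker-specific isolation estimates, in years."""
    if not estimates_years:
        raise ValueError("at least one estimate is required")
    return sum(estimates_years) / len(estimates_years)


estimates = {
    "cytochrome b": 63_800,     # years ago, from the cytochrome b tree
    "control region": 116_600,  # years ago, from the control region tree
}
mean_years = average_onset(list(estimates.values()))
rounded_years = round(mean_years, -4)  # assumed rounding to the nearest 10,000 years
print(f"mean: {mean_years:,.0f} years; rounded: {rounded_years:,.0f} years")
```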

The role of the Strait of Gibraltar as a marine permeable barrier between Europe and Africa
The results are interesting because it would be more logical for North African wild boars to share their haplotypes with those of Egypt, Sudan, or even with those of the Near East, due to connectivity by land. However, all their haplotypes are European, except for CR182.
According to Manlius and Gautier [57], the so-called wild boars from Sudan are feral pigs, and the modern wild boars from Egypt were probably introduced by humans in the Neolithic period, like sheep and goats. Even so, presence of native wild boar at low densities in the past cannot be ruled out, and natural colonisation through Egypt is logical and supported by the existence of fossils found in North Africa. Accordingly, if the only contacts had been made by land through Near East countries, the African wild boar would form part of the Near Eastern clade, or would at least differ from European populations.
Therefore, isolation between populations from Egypt and North Africa in the past (in the Late Pleistocene) and contacts with the European wild boar across the Strait of Gibraltar, the most likely route, could have had a strong effect on the mtDNA of the African populations. Obviously, we cannot rule out contacts prior to the Late Pleistocene, during former glacial periods. The absence of a genetic footprint could be due to the genetic flow not being enough to have endured [49]. The CR182 haplotype found in Tunisia should be the result of either genetic introgression with the Near East wild boar or a trace from the past. Indeed, this haplotype is older than the rest by at least 81,000 years.
Contacts across the Strait of Gibraltar could have been possible during glacial periods, when sea level fell and it was easier to cross. The Last Glacial Maximum, the maximum extent of glaciation during the Last Glacial period, occurred during the interval between 25,000 and 18,000 years ago [17,19], and the Last Glacial period finished 11,700 years ago. The Late Glacial, the beginning of the modern warm period, began approximately 13,000 years ago [58], but the rise in temperatures was gradual. During cold periods, the currents in the strait would have been minimal or non-existent, which would have facilitated crossing the strait [2].
As for the causes of movements, if they took place from approximately 90,000 years ago onward, it is difficult to know, without further data, whether contacts were made by natural colonisations, by human action, or by a combination of both [48,49]. This is due to several events occurring at the same time. In addition, it would appear that the southern region of the Iberian Peninsula and North Africa form part of a refugial subcentre denominated Atlanto-Mediterranean, from which species could have recolonised areas in northern Europe at the beginning of the post-glacial periods [5,11,17].
In fact, the existence of human contacts across the strait during the period between 12,000 and 10,500 years ago, when sea levels were still rising, has been proven [13,59]. Human contacts have continued since then. The existence of cattle of one characteristic haplotype of African breeds in the Iberian Bronze Age suggests that contacts over the Strait of Gibraltar were caused by interactions among communities, their culture and livestock in prehistory [14]. Besides, other animals like the genet (Genetta genetta), the Barbary ape (Macaca sylvanus) or the Egyptian mongoose (Herpestes ichneumon) are accepted as introductions between Europe and North Africa [24]. Movements of people between both continents, who took pigs with them from the 15th century onward on exploratory or commercial routes, with the consequent genetic introgression, are another possible explanation for the similarity [20,21]. In our case it is more likely that migrations occurred naturally. Accessible information suggests that people took domestic varieties of livestock, such as cattle or pigs, but no information on transporting wild boar is available.
Similarly, the arrival of the wild boar from mainland Asia to the Ryukyu islands in the Late Pleistocene seems to be a fact [51]. A much closer geographic proximity between the two regions when sea levels were lower could account for this dispersion.
Finally, a strong gene flow might have been persistent over time, and a population decline or a displacement of the original population, followed by the expansion of new haplotypes, could have occurred. As a result, the African wild boar forms part of the European clade, at least it does according to its mtDNA.
Therefore, we suggest that the Strait of Gibraltar acted as a bridge for wild boar (S. scrofa) to disperse in the Late Pleistocene. In this case, the dispersion of specimens accompanied by a strong gene flow would have occurred from at least 90,000 years ago onward. Genetic analyses, the history of S. scrofa, and the fact that the Last Glacial period finished 11,700 years ago, all suggest natural dispersion, but we cannot rule out contact by human action.
Supporting information
S1 Table. Tables with information about the sequences obtained from GenBank and those from this study. Tables with information about cytochrome b (Table A), the control region (Table B) and the Y-chromosome (Table C).