Microallopatry Caused Strong Diversification in Buthus scorpions (Scorpiones: Buthidae) in the Atlas Mountains (NW Africa)

The immense biodiversity of the Atlas Mountains in North Africa might be the result of high rates of microallopatry caused by mountain barriers surpassing 4000 meters leading to patchy habitat distributions. We test the influence of geographic structures on the phylogenetic patterns among Buthus scorpions using mtDNA sequences. We sampled 91 individuals of the genus Buthus from 51 locations scattered around the Atlas Mountains (Antiatlas, High Atlas, Middle Atlas and Jebel Sahro). We sequenced 452 bp of the Cytochrome Oxidase I gene which proved to be highly variable within and among Buthus species. Our phylogenetic analysis yielded 12 distinct genetic groups one of which comprised three subgroups mostly in accordance with the orographic structure of the mountain systems. Main clades overlap with each other, while subclades are distributed parapatrically. Geographic structures likely acted as long-term barriers among populations causing restriction of gene flow and allowing for strong genetic differentiation. Thus, genetic structure and geographical distribution of genetic (sub)clusters follow the classical theory of allopatric differentiation where distinct groups evolve without range overlap until reproductive isolation and ecological differentiation has built up. Philopatry and low dispersal ability of Buthus scorpions are the likely causes for the observed strong genetic differentiation at this small geographic scale.


Introduction
The Mediterranean region comprises a highly diverse geographic area characterized by a variety of peninsulas and islands. This underlying geographic structure is reflected in biogeographical patterns of the region already described by de Lattin (1949) [1] who distinguished nine Mediterranean sub-centres. These early hypotheses have largely been confirmed by genetic analyses [2]. However, recent molecular data support even more fine-scaled patterns within these sub-centres; such patterns are especially known from the Iberian Peninsula e.g. [3], peninsular Italy e.g. [4] and the Balkans [5]. While the European part of the Mediterranean is biogeographically relatively well understood, the Maghreb still remains largely unstudied [6]. Therefore, the question emerges whether similar genetic substructures also exist in NW Africa.
De Lattin (1949) [1] suggested that the faunas of the Maghreb differ north and south of the High Atlas; this pattern is well reflected in the geographic distributions of many species e.g. [7]. On the contrary, other genetic studies revealed an east-west split along the Maghreb coast [8]. Furthermore, no genetic differentiation from Morocco to Tunisia was observed in the allozyme patterns of two butterfly species [9]. Nevertheless, all these genetic patterns are much simpler compared to findings in southern Europe. This is particularly astonishing considering the high orographic diversity of the Maghreb in general and the Atlas Mountains in particular. Yet, only few studies have addressed the phylo-or biogeographic patterns within the Atlas Mountains, although this area comprises an excellent region to study allopatric differentiation.
In order to test for the influence of orographic structures on differentiation and speciation processes, we selected the scorpion genus Buthus, which has undergone extensive speciation within NW Africa [10], as a model system. Buthus scorpions are widely distributed over large parts of Africa and show a circum-Mediterranean stronghold including the Mediterranean islands, with extraordinarily high diversity in North Africa [10][11][12][13][14]. Yet, molecular studies showed that diversity may still be underestimated and additional cryptic species may be abundant [14]. Most buthid scorpions are strictly territorial and philopatric and have low dispersal abilities. Consequently, individuals have limited home ranges which generally are restricted to small areas around their burrows [15]. Strong genetic differentiation has previously been observed for scorpion species and was probably triggered by barriers such as high elevations, steep valleys or sea barriers [16][17][18][19] in combination with low dispersal ability of the organisms [20]. In addition to being extremely limited, dispersal is also highly asymmetrical in these organisms, with male scorpions dispersing more widely. This apparently is a function of mate, prey and shelter search, where males exhibit a much more opportunistic locomotive behaviour, whereas females stay in their shelter and forage mainly close to the entrance of their burrows (''doorkeeping'' strategy) [21,22].
For this study we collected a total of 91 individuals belonging to 4 described Buthus species from 51 locations over major parts of the Antiatlas, High Atlas, Middle Atlas and Jebel Sahro during spring 2008 and 2009. Three specimens of the related genus Androctonus from two locations served as outgroup. We analysed a fragment of the mitochondrial Cytochrome Oxidase I (COI) gene, which has been shown to be highly informative at species level and consequently is one of the most commonly used markers for phylogenetic and phylogeographic studies. Due to its high potential for species identification it is often chosen as the barcoding gene in zoological studies. Here we use its properties to address the following questions: Does genetic variation within Buthus exhibit geographic structuring? (ii) Do genetic structures follow any geographical and/or evolutionary pattern? (iii) Are genetic lineages in accordance with current taxonomy?

Results
We generated 94 sequences (including three outgroup specimens) with a total length of 452 base pairs. One sequence from Androctonus mauritanicus (JF820097) was obtained from genbank as an additional outgroup. Details about analyzed individuals, including genbank accession numbers, are given in Table 1. Out of the 452 sites analysed, 135 were polymorphic (29.9%) of which 127 were parsimony informative sites (4 sites had missing data). Within the ingroup we recovered 69 haplotype with a haplotypes diversity of Hd = 0.993 (SD = 0.003) and a nucleotide diversity of Pi = 0.0916 (SD = 0.0018). The average number of nucleotide differences was k = 41.138. Fst estimates among groups obtained from phylogenetic analyses were overall high and mostly significant ( Table 2; non significant values are due to low sample sizes in some groups).
The second dataset included 84 additional sequences obtained from genbank (appendix SI). In total this more comprehensive dataset consisted of 179 sequences with 383 bp each. This included 148 variable sites (38.6%), 99 of which were parsimony informative. Many of these sites were uninformative due to a relatively large amount (60 sites) of missing data in genbank sequences.
Reconstructed trees indicate the presence of 12 distinct Buthus lineages, which were recovered in all analyses (Fig. 1a-c). However, the relationships among groups differed depending on the tree reconstruction method. This is also reflected in the low statistical support for among group relationships. The support for most groups is high. The genetic patterns strongly coincide with the geographical distribution of the respective sampling locations. Lineages that occur sympatrically (K; most probably H/G/I, Figure 1) are genetically well separated in the trees, while genetically closer groups are all strictly allopatric or parapatric (e.g. I/E; H/K). The distributional patterns of most of the genetic clusters conform to the orographic pattern of the region (e.g. along valleys, southern or northern slopes of mountain systems).
The comprehensive analyses of genbank data together with our own dataset generally yielded geographically meaningful clusters. The European samples represent two monophyletic groups, one in Portugal and one in France also including Spain. These clades likely correspond to the recently described species, Buthus montanus Lourenço & Vachon, 2004 and Buthus ibericus Lourenço & Vachon, 2004. However, the European Buthus are polyphyletic, yet restricted to one branch of the entire tree. They also make the North African Buthus paraphyletic (see Figure 1c).

Discussion
The two tree hypotheses (ML and BI) constructed from our original dataset distinguish 12 genetic lineages. One of these clusters (lineage K) is further structured into three sub-clades. The genetic clusters exhibit a clear geographic pattern with all the lineages being confined to relatively restricted areas. None of the lineages is widely distributed. Most lineages are separated from each other by geographic barriers such as high mountain ranges or steep valleys. Only some basal lineages show range overlap (e.g. K) while all younger groups are strictly allopatric or parapatric (e.g. the three sub-groups of K). Four groups (A-D) are situated around the western and north-western parts of the High Atlas. One group is located in the westernmost Antiatlas (E), three others at the northern foothills of these mountains (F-H) and one at the southern foothills of the Antiatlas (J). The largest cluster (K) is distributed south of the High Atlas, east of the Draa valley, and splits further into three subclades, which form distinct strictly parapatric groups following the orographic structures of this region (e.g. along the Dades valley or restricted to the Jebel Sahro). Another cluster (I) extends from the Draa valley to the south western edge of the Antiatlas, probably including parts of the distribution ranges of clade G and H.

Barriers and retreats
The geographic distribution of the obtained genetic clusters strongly supports significant impact of orographic structures as dispersal barriers. The extant distribution of Buthus scorpions mostly does not exceed elevations higher than 2500 m asl [23], and individuals show low dispersal ability. Thus, high mountain ranges likely acted (and act) as strong barriers to gene flow resulting in genetically distinct clusters found within the Atlas Mountains. This is reflected in our data by fine structured genetic lineages of the sampled Buthus individuals. A similar pattern was observed for the same taxonomic group using various molecular markers such as allozymes, nuclear and mitochondrial DNA all of which showed significant genetic differentiation patterns even among local populations [24,25]. Similar results were found for other organisms with low to moderate mobility [7]. These extremely sedentary species apparently diverged into several narrowly distributed genetic clusters, restricted to the slopes of mountains and river valleys (e.g. the Ourika and Draa). These lineages are found at different altitudes and might be adapted to various ecological niches. We assume that all actual distribution ranges of Buthus lineages also include their last centre of differentiation. Thus, our study area may comprise 12 allopatric refugia during the last ice age revealing a much more fine grained biogeographical sub-structure than previously observed in the Maghreb e.g. [7][8][9]. Consequently, the biogeographical structur-    ing of Buthus in Morocco is much more complex than the structures observed in other Mediterranean sub-centres [26], most probably as a consequence of strong gene flow barriers represented by high mountain ranges and a more constant climatic situation in North Africa compared to the northern part of the Mediterranean. The combination of our data with previously analysed Buthus individuals (Figure 1c) underlines this superposition of the Maghreb region as a speciation and differentiation centre: The European Buthus are polyphyletic, but restricted to one branch of the entire tree and make the North African Buthus paraphyletic. This may be indicative of multiple colonisation events from North Africa to Europe.

The trigger of genetic differentiation
This distribution of genetic lineages supports an allopatric differentiation scenario as opposed to sympatric speciation processes [27]: The lineages originating from the oldest vicariance events are most strongly differentiated from each other. Perhaps this long period of time elapsed since the onset of differentiation allowed for the evolution of characteristics enabling sympatric occurrence of representatives of different genetic groups. Furthermore, the close geographic proximity among the well differentiated lineages G, H and I suggests some areas of range overlap among them. If reproductive barriers are in effect, species might even occur in sympatry without niche differentiation when resources are abundant. Thus, the time elapsed since the onset of the evolution of these lineages must have been sufficient that neither the inter-lineage competition for space nor the strong stenotopy of Buthus scorpions has been able to prevent the geographic intermixing of these lineages. However, differentiation among them still might not be sufficient for syntopic occurrences so that these groups exclude each other at each single locality. Such a scenario of competitive exclusion would reinforce a strictly geography-dependent distribution pattern of lineages.

Discordance between morphospecies and COI phylogeny
In contrast to the stringent geographic pattern, our findings are not congruent with the current taxonomic classification. The different species identified based on morphological characters often cluster within different genetic lineages which, in turn, often comprise more than one species. In addition, many specimens could not be identified due to incomplete or doubtful identification keys, and a lack of diagnostic traits. Hence, the current taxonomy of North African Buthus species is in need of revision. Additional diagnostic characters must be identified and new species keys be developed. However, to actually define the status of the detected genetic lineages, further morphological, behavioural and ecological studies are necessary. Besides, it is well known that gene trees and species trees may differ, e. g. due to shared ancestral polymorphisms, incomplete phylogenetic lineage sorting and introgression [28]. Evidence of recombination of mtDNA has been found in Buthus, which could also influence our results [29]. Taking our results as an exclusive basis for taxonomical conclusions would therefore doubtlessly be premature. In future studies, the addition of nuclear markers is indispensable to solve the question how prominent hybridization and introgression are among these Buthus lineages and species.

Sampling and identification
Most specimens were collected at daylight under stones. Overnight collections were facilitated by black light as all scorpions fluoresce in UV light due to a specific protein in their exoskeleton [30]. Collected specimens were stored in absolute ethanol until DNA extraction. All sampling sites are shown in Figure 2 and listed in Table 1 which also gives details on the sampling date and location. Species were identified with keys and additional species descriptions [10,12,[31][32][33] based on typical morphological characters used for scorpion identification [cf. [32][33]. All species identifications were performed by the same author (IS).

PCR and sequencing
For all individuals, DNA was isolated from leg, caudal segment or telson muscle tissue using the Qiagen DNeasy kit with standard protocols for tissue samples. A 452 bp fragment of the mitochondrial Cytochrome Oxidase I (COI) gene was amplified using standard PCR procedures with the following primers: forward 59 GGT CAA CAA ATC ATA AAG ATA TTG G 39, reverse 59 TAA ACT TCA GGG TGA CCA AAA AAT CA 39 [34]. PCRs were performed in 20 ml volumes: 10 ml Mastermix (Thermozyme), 0.2 ml of each Primer (1 mM), 4.6 ml PCR grade water and 5 ml DNA template. The cycle programme comprised an activation step at 94uC for 4 min, followed by 40 cycles of 30 sec denaturation at 94uC, 30 sec annealing at 45uC and 1 min elongation at 72uC. Cycling was terminated by a final extension step at 72uC for 10 min. Amplicons were subsequently purified and sequenced in both directions on an automated sequencer (3730xl DNA Analyzer; Applied Biosystems, Carlsbad, CA, USA) at the University of Kiel. All sequences are deposited at the NCBI genbank (Appendix SI).
Significance of Fst values was tested using 1000 permutations and a significance level of 0.05. The best fitting model of sequence evolution for phylogenetic tree reconstruction was chosen using MrModeltest 2.3 [38] in PAUP 4.0 b10 [39] and was determined to be the GTR+I+G. Maximum Likelihood (ML) trees were constructed using MEGA version 5.01 [40]; 1000 bootstrap replicates were performed to obtain a statistical measure for branch support (Figure 1a). Additionally, we used the Bayesian algorithm implemented in MrBayes 3.1.2 [41] to obtain a second phylogenetic inference and evaluate the stability of our tree reconstruction (Figure 1b). Using the obtained substitution model we ran the Markov chain Monte Carlo algorithm with four chains and two independent runs for 20 million generations. Trees were sampled every 2,000 generations; the first 10% of generated trees were discarded as burn-in as recommended by Tracer v1.5 [42]. The remaining trees were used to generate a consensus tree.
In a second analysis, we included 84 sequences from Buthus scorpions obtained from genbank (see Appendix SI) resulting in a total of 175 ingroup sequences. However, due to only partial overlap the alignment was reduced to 383 bp in length. Bayesian analyses of this dataset were performed as described above. While these additional analyses provide further insights into geographic structuring of genetic divergence, they do not help in evaluating species identity and monophyly (Figure 1c). When these sequences were submitted to Genbank, the vast majority of Moroccan Buthus species known today had not yet been described and most Moroccan Buthus were characterized as subspecies of B. occitanus (B. o. occitanus -Europe, B. o. mardochei -Morocco, B. o. tunetanus -Tunisia) [31].

Supporting Information
Appendix SI DNA sequences obtained from NCBI genbank.   Table 1. Letters of the groups correspond with Figure 1. The geographic coordinates of sampling sites are given in Table 1