Interspecific Introgression in Cetaceans: DNA Markers Reveal Post-F1 Status of a Pilot Whale

Visual species identification of cetacean strandings is difficult, especially when dead specimens are degraded and/or species are morphologically similar. The two recognised pilot whale species (Globicephala melas and Globicephala macrorhynchus) are sympatric in the North Atlantic Ocean. These species are very similar in external appearance and their morphometric characteristics partially overlap; thus visual identification is not always reliable. Genetic species identification ensures correct identification of specimens. Here we have employed one mitochondrial (D-Loop region) and eight nuclear loci (microsatellites) as genetic markers to identify six stranded pilot whales found in Galicia (Northwest Spain), one of them of ambiguous phenotype. DNA analyses yielded positive amplification of all loci and enabled species identification. Nuclear microsatellite DNA genotypes revealed mixed ancestry for one individual, identified as a post-F1 interspecific hybrid employing two different Bayesian methods. From the mitochondrial sequence the maternal species was Globicephala melas. This is the first hybrid documented between Globicephala melas and G. macrorhynchus, and the first post-F1 hybrid genetically identified between cetaceans, revealing interspecific genetic introgression in marine mammals. We propose to add nuclear loci to genetic databases for cetacean species identification in order to detect hybrid individuals.


Introduction
In a progressively threatened oceanic environment where large species are ever more endangered, cetacean monitoring is increasingly important for estimating population sizes and for detecting early signs of species depletion [1]. However, visual species identification is not always accurate: some species are morphologically similar and their distribution ranges overlap. Correct taxonomic identification is therefore important for the practical management and conservation of cetacean species.
The two recognised pilot whale species, Globicephala melas and G. macrorhynchus, are cetaceans of charismatic behaviour. They are highly social and exhibit post-reproductive female care. They are sympatric across a North Atlantic latitudinal area from American to European coasts [2], [3]. Based on their osteology, Van Bree [4] demonstrated that they are two clearly distinct species; however, their external appearance is similar and the morphometric characteristics employed for visual species discrimination partially overlap [5]. Species identification based on external morphology may therefore be difficult and in some cases impossible [6]. Moreover, dead stranded individuals are sometimes highly degraded and their distinctive traits may be lost. For these reasons, as in other forensic zoological studies, DNA-based identification is necessary [7], [8].
Genetic species identification in cetaceans is generally based on maternally inherited mitochondrial DNA (mtDNA). A cetacean sequence database, DNA Surveillance [8], has been created for this purpose. It contains reference sequences for all known cetacean species. Other databases such as GenBank [9] also hold reference sequences for cetaceans and many other organisms. Although mtDNA is the marker most frequently used for cetacean identification [7], [8], nuclear markers may also be needed. Some species hybridize naturally (e.g. [10], [11], [12]). Hybrids may exhibit morphologically ambiguous phenotypes (e.g. [13]), and because mtDNA reveals only the maternal lineage, nuclear (biparentally inherited) genetic markers are needed for accurate identification of pure species and their hybrids [14]. Nuclear markers are also recommended in cases of PCR contamination [14], e.g. bacterial contamination of cetacean tissues. In this study, we have used mitochondrial and nuclear genetic markers to determine the species of stranded individuals and to investigate the possible existence of genetic mixture between the two pilot whale species found in waters off northern Spain.

Materials and Methods
Six stranded pilot whales (Table 1), 19 reference Globicephala macrorhynchus from the Canary Islands and 20 reference Globicephala melas from the Faroe Islands (donated by the Faroese Museum of Natural History) were analysed (Figure 1). All animal samples were obtained from dead specimens: dead strandings and museum collections. We obtained the CITES permit (ESBI00001/12I) and all the permissions from the Faroese Museum of Natural History to analyse the Faroese samples. No animal suffered, was injured or was killed for this study. The protocol employed was approved by the Committee on the Ethics of Animal Experiments of the University of Oviedo.
Eight  [21] and genotyped employing GeneMapper software. A multi-tube method [22] was employed to validate the allele scores. Each microsatellite locus was individually amplified four times in three different thermal cyclers (Applied Biosystems 2720 Thermal Cycler). Scoring errors, large allele dropout and null alleles were checked with MICROCHECKER [23]. Linkage disequilibrium tests were performed with GENEPOP version 4.2 [24]. Variation parameters (number of alleles; allele richness; minimum, maximum and mean allele length; expected and observed heterozygosities; FIS) and distances between genotypes for populations (Nei distances) and individuals (Fuzzy set similarity distances) were calculated with Microsatellite Analyser MSA 4.05 [25]. FST distances were calculated with Arlequin v.3.5.1.3 [26] using 1 023 permutations and a 0.05 significance level. Neighbour-Joining trees based on genetic distances, with bootstrap support (1 000 replicates), were reconstructed with PHYLIP v. 3.69 [20]. Species assignment was done using three different methods widely employed for microsatellites (e.g. [27]). The likelihood-based Bayesian method of Rannala and Mountain [28] was performed with GeneClass2 [29] with a 0.05 score threshold. Two fully Bayesian methods were also employed. The first used the program STRUCTURE 2.3.1 [30] under the ''Admixture model'', which assumes that individuals may have mixed ancestry, with a burn-in period of 100 000 steps followed by 1 000 000 Markov Chain Monte Carlo (MCMC) iterations and five runs for K = 2 (two species). The second used NewHybrids 1.0 [31], which identifies first- and second-generation hybrids and backcrosses, with 500 000 sweeps after a burn-in period of 100 000 MCMC iterations.
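As an illustration of the per-locus variation parameters listed above, the following minimal Python sketch computes observed heterozygosity, expected heterozygosity and FIS for a single microsatellite locus. It is not the MSA 4.05 implementation, whose estimators may differ; the function name `locus_stats`, the small-sample correction applied to expected heterozygosity (Nei's unbiased estimator) and the example genotypes are assumptions made only for this example.

```python
from collections import Counter


def locus_stats(genotypes, unbiased=True):
    """Return (Ho, He, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid
    individual. Individuals with missing data must be removed first.
    The function name and interface are hypothetical.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two genotyped individuals")

    # Allele frequencies from the 2n gene copies sampled at this locus.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    homozygosity = sum((c / total) ** 2 for c in counts.values())

    # Expected heterozygosity, optionally with the 2n / (2n - 1) correction.
    he = 1.0 - homozygosity
    if unbiased:
        he *= total / (total - 1)

    # Observed heterozygosity: fraction of heterozygous individuals.
    ho = sum(1 for a, b in genotypes if a != b) / n

    # FIS = 1 - Ho / He; undefined for a monomorphic locus.
    fis = float("nan") if he == 0 else 1.0 - ho / he
    return ho, he, fis


# Hypothetical allele sizes (bp) for four individuals at one locus.
example = [(100, 100), (100, 102), (102, 102), (100, 102)]
ho, he, fis = locus_stats(example)
print(f"Ho={ho:.3f} He={he:.3f} FIS={fis:.3f}")
```

FIS values close to zero indicate that observed and expected heterozygosities agree, whereas markedly positive values indicate a heterozygote deficit, which can arise from null alleles among other causes.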

Results and Discussion
The markers employed exhibited sufficient variation for discriminating between the two pilot whale species in reference samples. The mitochondrial D-loop haplotypes were clearly species-specific, as expected [12]. Intraspecific polymorphism was found for the two species, with seven and two haplotypes for G. macrorhynchus and G. melas respectively (Table 2). Eight microsatellite loci were assayed, of which two (EV94MN and 468/469) exhibited possible null alleles in our dataset (detected with MICROCHECKER) and were discarded from further analyses. For the six remaining microsatellite loci, null alleles and linkage disequilibrium were non-significant, allowing their use for genetic assignment. Allelic frequencies of the six selected microsatellite loci were deposited in LabArchives, LLC   Table 2). No significant differences between expected and observed heterozygosities were found, and FIS values were low (Table 2). Highly significant FST values between species (0.2957, P < 0.00001) confirmed sufficient resolution for species discrimination. The six stranded pilot whales analysed here yielded positive amplification of the D-loop sequence and the six microsatellite loci considered (Table 3), except for two loci that failed to amplify in one specimen (Galicia01). Genetic assignment agreed with visual species identification when available (Table 4), and was consistent for nuclear and mitochondrial markers. The male of ambiguous phenotype Galicia05 exhibited private alleles of the two parental species for 4 loci (Table 3 and   uous signals of post-F1 status. NewHybrids assigned this individual to a cross between an F2 and G. melas (Table 4). The STRUCTURE software also revealed mixed ancestry for Galicia05 (57% membership of G. melas, 43% G. macrorhynchus; Figure 2). Its mitochondrial DNA identified the maternal species as G. melas.
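The private-allele observation for Galicia05 can be illustrated with a short Python sketch that lists, for one individual, the loci carrying alleles seen in only one of the two reference samples. This is a hedged illustration rather than the procedure used in the study: the function names, locus names and allele sizes are hypothetical, and an allele that is private in a finite reference sample is not necessarily absent from the other species.

```python
def private_alleles(ref_a, ref_b):
    """Map each shared locus to (alleles only in ref_a, alleles only in ref_b).

    ref_a, ref_b: dict of locus name -> set of alleles observed in the
    reference sample of each species (hypothetical interface).
    """
    return {
        locus: (ref_a[locus] - ref_b[locus], ref_b[locus] - ref_a[locus])
        for locus in ref_a.keys() & ref_b.keys()
    }


def loci_with_private_alleles(genotype, ref_a, ref_b):
    """Return (loci carrying an allele private to A, same for B).

    genotype: dict of locus name -> (allele_1, allele_2) for one individual.
    """
    priv = private_alleles(ref_a, ref_b)
    with_a, with_b = [], []
    for locus, alleles in genotype.items():
        if locus not in priv:
            continue  # locus not typed in both reference samples
        only_a, only_b = priv[locus]
        if set(alleles) & only_a:
            with_a.append(locus)
        if set(alleles) & only_b:
            with_b.append(locus)
    return sorted(with_a), sorted(with_b)


# Hypothetical reference allele sets (sizes in bp) and one test genotype.
ref_melas = {"locus1": {150, 152, 154}, "locus2": {200, 202}, "locus3": {120, 122}}
ref_macro = {"locus1": {154, 156}, "locus2": {202, 206}, "locus3": {120, 122}}
individual = {"locus1": (150, 156), "locus2": (200, 202), "locus3": (120, 122)}

melas_loci, macro_loci = loci_with_private_alleles(individual, ref_melas, ref_macro)
print("private G. melas alleles at:", melas_loci)
print("private G. macrorhynchus alleles at:", macro_loci)
```

Carrying private alleles of both species indicates mixed ancestry, but a count of this kind cannot by itself separate F1 from post-F1 individuals; that distinction relies on model-based methods such as those implemented in NewHybrids and STRUCTURE.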
As in other studies [27], the two fully Bayesian methods (STRUCTURE and NewHybrids) performed better than the partially Bayesian assignment test (GeneClass2), which did not assign Galicia05 significantly to either species. The hybrid status of this individual is clearly visualized in the NJ tree reconstructed from nuclear markers (Figure 3): in the microsatellite-based tree, Galicia05 falls between the two species. Its clustering with a reference G. melas individual had very low bootstrap support, whereas the remaining nodes were strongly supported. These results therefore identify the first known hybrid between the two pilot whale species. These two species join blue and fin whales, Dall's and harbour porpoises, narwhals and belugas, and Risso's and bottle-nosed dolphins in the short list of sympatric cetaceans that hybridize [32]. A post-F1 hybrid foetus was described between blue and fin whales [13], but this is the first post-F1 adult cetacean documented to date, and it strongly suggests the possibility of interspecific introgression in marine mammals, a good example of the Darwinian continuum between varieties and species [32]. Finally, the present results emphasize the need to include nuclear markers in reference databases aimed at identifying cetacean species [8]. SNPs and nuclear sequence data, as well as hypervariable microsatellite loci, can be used for this purpose. As proposed long ago by Palumbi and Cipriano [14], nuclear markers will help to understand the extent of interspecific hybridization in these marine mammals and its implications for conservation.