Homoploid hybrid speciation is the formation of a new hybrid species without change in chromosome number. So far, there has been a lack of direct molecular evidence for hybridization generating novel traits directly involved in animal speciation. Heliconius butterflies exhibit bright aposematic color patterns that also act as cues in assortative mating. Heliconius heurippa has been proposed as a hybrid species, and its color pattern can be recreated by introgression of the H. m. melpomene red band into the genetic background of the yellow banded H. cydno cordula. This hybrid color pattern is also involved in mate choice and leads to reproductive isolation between H. heurippa and its close relatives. Here, we provide molecular evidence for adaptive introgression by sequencing genes across the Heliconius red band locus and comparing them to unlinked wing patterning genes in H. melpomene, H. cydno, and H. heurippa. 670 SNPs distributed among 29 unlinked coding genes (25,847bp) showed H. heurippa was related to H. c. cordula or the three species were intermixed. In contrast, among 344 SNPs distributed among 13 genes in the red band region (18,629bp), most showed H. heurippa related with H. c. cordula, but a block of around 6,5kb located in the 3′ of a putative kinesin gene grouped H. heurippa with H. m. melpomene, supporting the hybrid introgression hypothesis. Genealogical reconstruction showed that this introgression occurred after divergence of the parental species, perhaps around 0.43Mya. Expression of the kinesin gene is spatially restricted to the distal region of the forewing, suggesting a mechanism for pattern regulation. This gene therefore constitutes the first molecular evidence for adaptive introgression during hybrid speciation and is the first clear candidate for a Heliconius wing patterning locus.
Hybrid speciation challenges our view of biodiversity as a branching tree and is considered rare or absent in animals. A possible route by which it may occur is establishment of a novel “magic trait,” influencing both ecological adaptation and mating preference, via hybridization. We provide, to our knowledge, the first molecular genetic evidence for this process in the tropical butterfly Heliconius heurippa. We sampled molecular markers both linked to the locus controlling red color pattern and across the genome of Heliconius heurippa and its putative parents, H. cydno and H. melpomene. We found evidence of genetic introgression from H. melpomene into the hybrid H. heurippa only at the genomic region of the forewing red-band locus. This signature of introgression corresponds to the 3′ end of a kinesin gene that also shows a pattern of expression restricted to the distal region of the forewing. As the wing color pattern in these butterflies is crucial in maintaining the isolation of this species through mate choice, this study provides molecular support for the hybrid origin of a new adaptive trait that can lead to speciation.
Citation: Salazar C, Baxter SW, Pardo-Diaz C, Wu G, Surridge A, Linares M, et al. (2010) Genetic Evidence for Hybrid Trait Speciation in Heliconius Butterflies. PLoS Genet 6(4): e1000930. https://doi.org/10.1371/journal.pgen.1000930
Editor: Bruce Walsh, University of Arizona, United States of America
Received: January 19, 2010; Accepted: March 30, 2010; Published: April 29, 2010
Copyright: © 2010 Salazar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: CDJ and SWB were supported by the Royal Society, Leverhulme Trust, and the BBSRC. CS was supported by the Smithsonian Tropical Research Institute (STRI)-Academic Programs Office and a Leverhulme Trust grant to CDJ. CP-D was supported by a STRI internship. ML was funded by COLCIENCIAS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Identifying the genetic mechanisms involved in speciation is an important challenge in the study of evolution –. Empirical studies have shown that species differences can be localized in just a few genomic regions –, and that reproductive isolation is more easily achieved when traits causing assortative mating are also subject to natural selection , . Such characteristics have been termed ‘magic traits’  and can facilitate speciation as a side-effect of ecological divergence in the presence of ongoing gene flow , . Likely examples of such magic traits include body size and color in sticklebacks, flowering time in edaphic plants, host shifts in phytophagous insects, color patterns in Heliconius butterflies, beak size in Darwin finches, development time in melon flies and color patterns in Hypoplectrus fish –.
If ‘magic traits’ were acquired by introgression from related lineages, adaptation and speciation could proceed without the requirement for novel mutations , . Recent studies in plants and animals have shown that introgression can provide the raw material for adaptation –. Hence, it is plausible that if introgression produces new adaptive phenotypes that also generate reproductive isolation, for example through mate choice, habitat colonization or asynchronous emergence, then hybrid speciation can occur without geographic isolation , . We have called such a scenario ‘hybrid trait speciation’, as a special case of speciation through hybridization without a change in chromosome number or homoploid hybrid speciation (HHS) , . Hybrid trait speciation contrasts with what we have termed mosaic genome speciation, documented in Helianthus sunflowers, where the hybrid species genome is composed of blocks derived from both parental species . Rapid establishment of incompatibilities between parental and daughter species can result due to the large number of genes causing epistatic hybrid breakdown in hybrids . The two scenarios contrast in their genomic signature, with hybrid trait speciation potentially involving introgression of just a few adaptively important loci into the genetic background of one of the parental species, making it much more difficult to detect using traditional approaches based on ‘neutral markers’ . This is in addition to the fact that detecting hybrid speciation at the molecular level is difficult anyway, due to incomplete linage sorting and historical gene flow, which can leave similar signatures of shared variation .
There is evidence that hybridization has played an important role in the adaptive radiation of Heliconius butterflies . Heliconius have aposematic wing coloration, mimicry between divergent species and frequent hybridization, providing an excellent opportunity to study the genetics of adaptation and speciation –. In particular, studies in the closely related species H. melpomene and H. cydno, that occur sympatrically throughout Central America and in the west Andes, show that mimicry shifts are coupled with assortative mating and lead to speciation . In addition, differences in host plant use, microhabitat preferences and partial hybrid sterility also contribute to reducing genetic interchange between these species . The species hybridize in both the field and the laboratory, although natural interspecific hybrids are collected at a very low rate (one in a thousand or less) , . Nonetheless, introgression of color pattern alleles has been observed in natural hybrid zones and the same phenotypes have been recreated in experimental crosses , . Furthermore, studies using neutral markers reveal that introgression between the species has been frequent throughout their evolutionary history , .
Occasionally, novel color pattern variants produced through hybridization appear to have produced stable hybrid populations. The best studied example is Heliconius heurippa, found in the eastern slopes of Colombian Andes, which has a color pattern that can be recreated in the laboratory through crosses between H. c. cordula and H. m. melpomene, the races of the melpomene group adjacent to its current geographical range , . H. heurippa is abundant and its color pattern is stable along several hundred kilometers of the Andean slopes, although is not mimetic with any other species , . This wing pattern stability across a broad geographic area contrasts with the transient production of hybrid forms in narrow Heliconius hybrid zones .
Surprisingly, only three generations of crosses are needed to obtain a homozygous H. heurippa like color pattern . Two tightly linked loci controlling the red forewing band (B allele, hereafter HmB) and the absence of brown forceps marks in the ventral surface (br allele) from linkage group 18 are introgressed into an H. c. cordula genetic background that includes the yellow forewing band allele at the HmN locus on linkage group 15. The resultant pattern contributes to reproductive isolation through assortative mating, and therefore plays a direct role in speciation . In particular, mating experiments revealed strong pre-mating mating isolation between H. heurippa and H. melpomene (≈90%) and between H. heurippa and H. cydno (≈75%). Furthermore, assays with wing models showed that H. heurippa males use the combined red and yellow bands to discriminate females . Even first-generation backcross hybrids between H. m. melpomene and H. c. cordula, resembling H. heurippa, showed a strong preference for their own color pattern over that of either parental species, implying that mate preference can be established directly through hybridization . This result implies, first that assortative mating preferences would facilitate the initial establishment of a homozygous hybrid color pattern by increasing the probability that early generation hybrids mate among themselves. Second, once the new hybrid population was established, it would immediately possess the assortative mating preferences that generate partial reproductive isolation from the parental species. Thus, H. heurippa is an excellent candidate for speciation through adaptive introgression .
Despite extensive support for the hybrid origin of H. heurippa from biogeography, crosses, mate choice experiments and mathematical simulations , molecular evidence for a hybrid origin is inconclusive. Neutral markers reveal extensive gene flow between all three species, H. heurippa, H. melpomene and H. cydno . To definitively test the hybrid origin hypothesis we need to study the loci controlling the adaptive traits responsible for speciation, in this case wing patterns , . Here, we take advantage of the recent cloning of the HmB locus controlling the red forewing band of H. melpomene, in order to carry out such a test , .
Previous work on H. heurippa has used a panel of largely intronic markers and shown evidence for ongoing gene flow between H. heurippa and its close relatives . Here, we directly addressed the adaptive introgression hypothesis by sampling 18 contigs based on 24 amplicons representing 18,629bp across the HmB locus for ten individuals of each species (60 alleles; Table S1) , . In addition, we improved our broader genome sampling by developing a larger panel of unlinked molecular markers based on single copy genes with large exons. We analyzed around 15,000 contigs assembled from H. melpomene GSS (BAC-end) sequences, 484 of which showed strong homology with single copy genes in B. mori. From these unigenes, 27 had exons longer than 700 bp (Table S2). Additionally, we used two previously published markers, CAD and GAPDH . Thus, we used a set of 29 markers that were putatively distributed in at least 17 of the 21 chromosomes in H. m. melpomene (Table S2). Sequences of these markers were obtained for the same ten individuals per species used in the HmB locus analysis (60 sequences per gene).
Among the 29 loci sampled from across the genome, most SNP polymorphisms were shared among the three species and therefore did not associate H. heurippa with one or other of the two parental species. Only 8 SNPs (over six genes) were fixed polymorphisms shared by H. heurippa and H. c. cordula relative to H. m. melpomene (Figure 1A), consistent with previous data from mitochondrial genes that related H. heurippa with H. cydno , while no SNPs were fixed in H. heurippa and H. m. melpomene relative to H. c. cordula (Figure 1A).
Likelihood values are plotted for each variable position, where positive likelihood values indicate a SNP position at which H. heurippa and H. m. melpomene are more similar, and negative values a position at which H. heurippa shows a stronger association with H. c. cordula. Fixed sites are indicated by dotted lines showing a likelihood value of 244 for a complete association of H. heurippa with H. m. melpomene and −244 for that with H. c. cordula. Colors represent different coding regions. The majority of unlinked SNPs (634) show shared polymorphism among the three species (240> ΔLnL >−240). At unlinked loci, H. heurippa and H. c. cordula shared fixed polymorphism at only 8 SNPs whereas H. heurippa and H. m. melpomene did not share any fixed polymorphism. For the HmB locus, sequenced BAC clones are indicated above the gene annotation .
Our prediction derived from the adaptive introgression hypothesis was that within the HmB locus there should be a region introgressed from H. melpomene into the H. heurippa genome. This prediction was upheld. From nearly 19Kb of sequence analyzed across the HmB locus, there was a 6,493 bp region corresponding to the 3′ end of a putative kinesin gene (hereafter, 3′ kinesin) showing a strong association between H. heurippa and H. m. melpomene (Figure 1B). Across the remaining HmB region there was either shared variation among the three species, unique polymorphisms in one of the three species, or nearly fixed changes unique to H. heurippa. In the case of two genetic markers, kin_2 and sdp, H. heurippa was most strongly associated with H. c. cordula (Figure 1B).
The HmB locus was a significant outlier relative to the rest of the genome. We calculated a likelihood statistic that estimated the relationship of H. heurippa to the parental species, where positive values indicate sites linking H. heurippa with H. m. melpomene and negative values sites where H. heurippa is similar to H. c. cordula. The distribution of mean likelihood values across unlinked loci gives a distribution of expected values for the genome. Comparison of the 3′ kinesin region showed values that fell outside this distribution derived from unlinked markers (Figure 2; p<0.05), demonstrating that this genetic region has a significantly stronger association with H. melpomene than any other region of the H. heurippa genome. The kin_2 region was also an outlier, but with a significantly stronger association with H. cydno (Figure 2; p<0.05), implying that the kinesin gene is in fact a chimera derived from two parental species. As H. heurippa is most closely allied to H. c. cordula, we analyzed linkage disequilibrium (LD) for these two species combined. Across the HmB region, the highest r2 values (p<0.001) were observed within the 3′ kinesin and nearby genes, with LD decaying in surrounding regions, suggesting a haplotype structure across the kinesin gene resulting from strong selection on wing pattern (Figure 3). Nonetheless, consistent with the wing patterning alleles being relatively ancient, there was no evidence for a reduction in diversity or deviation from neutrality that might indicate a recent selective sweep at 3′ kinesin (Table 1). Similarly, there was no evidence for adaptive amino acid substitution at this locus, with Ka/Ks ratios not significantly greater than 1 (p>0.05; Figure S1) and McDonald-Kreitman tests showing no deviation from neutrality (p>0.05). H. heurippa also had five private amino acid substitutions not observed in the other two species (Figure S2). Only one amino acid replacement was shared between the red-banded species, H. heurippa and H. m. melpomene, representing a putative causative site for this phenotype (Figure S2).
The average of likelihood values for each unlinked marker and for 1,000 bp windows in the HmB region was calculated. This histogram shows the distribution of these values. Dotted lines represent the 95% two-tailed interval for unlinked genes. The asterisk over the bars indicates those 1,000 bp windows showing average values that lie outside the unlinked genes distribution (p<0,005). Positive values outside that distribution correspond to 3′ kinesin whereas negative values are those in kin_2.
Pairwise estimates of linkage disequilibrium (r2) among 199 SNPs in the HmB locus (those with a rare allele frequency less than 10% were excluded) for combined H. c. cordula and H. heurippa population samples. Physical distance between sites is shown in the adjacent map.
An alternative hypothesis to adaptive introgression is that the H. heurippa pattern might be ancestral. However we could rule this out by reconstructing a rooted genealogy of the 3′ kinesin with either nucleotides or amino acid sequences (Figure 4B; data not shown for AA). Gene genealogies for kin_2, sdp and 3′ kinesin confirmed the results obtained with the SNP association tests (Figure 4). In the 3′ kinesin, H. heurippa was monophyletic and branched from within the H. m. melpomene clade, with H. c. cordula an outgroup to both species (Figure 4B, Table 2). In contrast, genomic regions surrounding 3′ kinesin showed H. heurippa alleles more closely related to H. c. cordula (Figure 4A and 4C; Table 2). Thus, the SNP analysis, LD patterns and gene genealogies revealed that a clearly delimited genomic portion of the HmB region closely relates H. heurippa with H. m. melpomene. This result is in contrast to the rest of the genome where no other gene showed a similar association.
(A) sorting nexin (sdp); (B) 3′ kinesin; and (C) 5′ kinesin partial sequence (kin_2). Filled color circles represent alleles of each species. The 3′ kinesin tree is rooted with H. numata as outgroup (black). Numbers above and below the branches are bootstrap support values for likelihood and parsimony analyses respectively. (B) shows H. heurippa most closely related to H. m. melpomene, while (A,C), the genomic regions surrounding 3′ kinesin, show H. heurippa alleles more closely related to H. c. cordula. A similar tree topology was obtained from an amino acid alignment (data not shown).
When the genealogy of 3′ kinesin was used to estimate divergence times, we found that H. heurippa alleles were derived from H. m. melpomene around 0.43 Mya (0.12–0.84) ago, subsequent to the splitting of the H. c. cordula/H. m. melpomene alleles at 2.82 Mya (1.03–5.22) ago. A coalescent expansion model ,  (SSD>0.05 in all the cases) similarly suggested that H. heurippa haplotypes radiated more recently (∼0.385 Mya; 0.176–1.428) than either H. m. melpomene or H. c. cordula (∼2.055 Mya (1.175–4.219) and ∼2.745 Mya (1.584–4.489) respectively), giving a confirmation of the relative ages of the alleles found in each species, independent of tree topology. Thus, the H. heurippa 3′ kinesin alleles diverged subsequent to the split between H. m. melpomene and H. c. cordula at this locus.
To examine the role of the putative kinesin gene in specification of wing pattern, we visualized spatial localization of kinesin transcripts in developing wings using in situ hybridization. In two red-banded forms H. melpomene cythera and H. melpomene rosina, a probe from exon 13 of the 3′ kinesin showed localization to the distal portion of the developing wing in early pupal stages (72–96 hrs after pupation; Figure 5A; Figure S3). No such spatial localization was seen either in individuals of H. cydno that do not express a red band phenotype (Figure 5B), nor in H. melpomene forewings treated with riboprobes for a different gene (Figure S3). This spatial localization suggests a model for pattern specification whereby the kinesin gene interacts with another as yet unidentified gene product to specify proximal and distal boundaries of the forewing band (Figure 5C), leading to upregulation of pigmentation genes such as cinnabar .
Of (A) H. m. cythera and (B) H. cydno. A similar expression pattern to that present in (A) is also observed in H. m. rosina forewings (Figure S3), consistent with the red band phenotype of these two races. The lack of any localized kinesin expression in H. cydno forewings is consistent with the absence of a red band in this species. (C) Model of how kinesin expression (K, solid line), might interact with an unknown gene (X, dotted line) to regulate forewing red band expression.
Homoploid hybrid speciation has been considered controversial in animals. We here provide the first molecular support for this hypothesis derived from sequence analysis of a gene region directly implicated in controlling a hybrid trait. H. heurippa was originally proposed as a hybrid species based on its unusual color pattern. The main evidence in support of this hypothesis are crossing experiments demonstrating experimental introgression of the H. m. melpomene red color forewing band into the H. c. cordula genomic background . Such experiments demonstrate a plausible route for the origin of H. heurippa, and make a clear prediction: the region controlling the red forewing band should show a pattern of introgression from H. melpomene into H. heurippa. Here, we provide support for this hypothesis at a molecular level, by demonstrating a 6.5Kb region in the HmB locus that is introgressed from H. melpomene into H. heurippa.
The majority of SNPs (634) sampled in 29 coding genes, located on 17 of the 21 H. melpomene linkage groups, showed shared polymorphism among the three species (Figure 1A). H. heurippa and H. c. cordula shared fixed polymorphism relative to H. m. melpomene at only 8 SNPs, and there were no fixed SNPs in H. heurippa and H. m. melpomene relative to H. c. cordula (Figure 1A). This agrees with previous genetic data showing extensive allele sharing in the nuclear genome between the three species, but H. heurippa somewhat closer to H. c. cordula . As we have argued previously , these data do not strongly support a hybrid speciation scenario, but are more consistent with either recent gene flow among the three species or shared ancestral polymorphism.
Here we have taken advantage of the recent cloning of HmB, the key locus underlying the speciation of H. heurippa. Our hypothesis derived from previous crossing experiments and sequence surveys is that the H. heurippa genome is most closely related to H. cydno, but with the introgression of the red forewing band, controlled by HmB, from H. melpomene. Here we directly test this hypothesis by sampling markers across the 721 Kb HmB locus . From 13 genes evaluated in this region (comprising 24 molecular markers), we found a 6,493 bp region, corresponding to the 3′ end of the kinesin locus, where H. heurippa is strongly related to H. melpomene (Figure 1B). The likelihood values for species relationships at this locus differs significantly from that seen among unlinked genes, implying that this relationship cannot be explained by chance (Figure 2). The high long-range LD at 3′ kinesin relative to the H. c. cordula genetic background is also expected under the introgression hypothesis (Figure 3). The pattern is comparable to that seen across the same region in a Heliconius melpomene hybrid zone, where long-range LD is observed between sites showing significant genotype-by-phenotype association . A rather surprising observation is that across the HmB locus there is also shared variation between the three species, at sites interspersed between those generating a strong phylogenetic signal. These could be due to gene flow and recombination subsequent to speciation, recurrent mutations or alternatively a hybrid founding event for H. heurippa that transferred significant polymorphism from the parents to the hybrid species. In the data, there was no marked difference in the transition/transversion ratio among fixed and shared polymorphisms, which might be indicative of recurrent mutation (data not shown).
An alternative hypothesis to be considered is that H. heurippa pattern might be ancestral and have given rise to H. melpomene and H. cydno lineages that inherited different aspects of the ancestral wing pattern. However, this is not supported by the rooted gene genealogy for the 3′ kinesin that shows H. heurippa monophyletic, forming a well supported and derived clade within H. melpomene (Figure 4B). Furthermore, none of the dating approaches showed H. heurippa older than the other two species. Other genomic regions have shown a genealogical pattern whereby H. heurippa was similarly nested within an H. cydno clade, also arguing that H. heurippa is not an ancestral taxon .
In addition, kinesin in situ hybridizations on developing wing tissue (72–96h post-pupation) showed localized gene expression in the distal region of the wing, supporting a likely functional role in specification of the proximal boundary of the forewing band. In combination with previous analyses showing parallel differences in expression levels of the kinesin gene between color pattern races of both H. melpomene and H. erato , , these data strongly suggest that a regulatory change in the kinesin gene is functionally required for pattern determination. The kinesin gene appears to be chimeric, with the 3′ region derived from H. m. melpomene and the 5′ end more strongly related to H. c. cordula. Since crossing experiments suggest that introgression of the HmB allele from H. melpomene is sufficient to generate the H. heurippa pattern, the implication is that the functionally important sites are located at the 3′ end of the gene. We have identified one amino acid change and 11 synonymous changes shared between red-banded H. melpomene and H. heurippa, representing candidate functional sites for red band specification.
We also observed five amino acid differences between H. melpomene versus H. heurippa (Figure S2), perhaps reflecting adaptive change subsequent to formation of H. heurippa, although there was no significant evidence for selection on the locus (Figure S1). Perhaps more likely, these changes may represent fixation of nearly-neutral variation due to a population bottleneck during the origin of H. heurippa.
The implication of this gene in phenotypic control at HmB is also consistent with previous population genetic analysis of phenotypic races of H. melpomene, which showed a region of high genetic differentiation corresponding to a genomic region including the kinesin . Members of the kinesin superfamily (KIFs) are key players in cellular functioning and morphology that interact with cargo molecules such as proteins, lipids or nucleic acids , . In both vertebrates and invertebrates, kinesin molecules are implicated in pigment transport, however in Heliconius melpomene upregulation of pigment pathway genes occurs later in development relative to the localized kinesin expression observed here. This would suggest a likely upstream role in scale cell fate specification, rather than pigmentation per se.
Although adaptive introgression has recently been demonstrated at a molecular level, for example between species in the genus Senecio , H. heurippa is unusual in the fact that the hybrid trait contributes directly to reproductive isolation and hence speciation. We have also recently demonstrated that first generation backcross hybrids resembling H. heurippa also exhibit mate preferences very similar to that of wild H. heurippa. This implies that mate preferences could also have been produced by introgression in addition to color pattern . A possible mechanism for this is suggested by the recent demonstration of a genetic association between the red band and male preference for red mates, in interspecific hybrids between H. m. rosina and H. c. chioneus (Merrill et al. pers. comm.). Thus, the derived color pattern and mate preferences of H. heurippa could potentially have arisen from introgression of the same gene or tightly linked genes.
Several cases of animal homoploid hybrid species have been recently proposed, such as Rhagoletis sp., Lycaeides sp., Cottus gobio group, cichlid fishes, Xiphophorus clemensiae and Pogonomyrmex sp. ,  where ecological divergence, sexual selection or both promote reproductive isolation of a hybrid taxon. However, to our knowledge, this is the first time that molecular evidence for introgression has been established for an adaptive trait that also contributes directly to reproductive isolation and hence speciation. We feel that our results therefore represent the most convincing molecular evidence to date for homoploid hybrid speciation in animals. Similar molecular evidence also supports the hybrid origin of sunflower species in the genus Helianthus, although the pattern is very different in this case. Hybrid sunflower genomes are a mosaic of genomic blocks inherited from one or other parent , in contrast to H. heurippa which shares polymorphism with both parental species across most of the genome. Although mathematical simulation has suggested that the origin of H. heurippa probably involved an initial period of allopatry, during which the hybrid pattern became established , the contemporary genetic pattern supports our model of ‘hybrid trait speciation’ whereby localized introgression of key traits can promote the origin of hybrid species .
Materials and Methods
Specimen collection and DNA isolation
Butterflies were collected from Colombia and Venezuela: H. m. melpomene in Morcote (5°37′0.52″N; 72°18′0″W, Casanare-Colombia) and Chirajara (4°12′48″ N; 73°47′70″W, Cundinamarca-Colombia), H. c. cordula in San Cristobal (7°47′56″N; 72°11′56″W, Merida-Venezuela) and H. heurippa in Buenavista (4°10′30″N; 73°40′41″W, Meta –Colombia). Wings of 10 individuals of each species were removed and stored in glassine envelopes and are lodged in the Natural History Museum of the Universidad de los Andes. The bodies were preserved in 20% DMSO-0.25M EDTA salt saturated solution. DNA was isolated with DNeasy Blood & Tissue Kit (QIAGEN) following manufacturer's instructions. Quality of genomic DNA was confirmed by visualisation in a 0.8% agarose gel.
Development, amplification, and sequencing of genetic markers
Single copy large exons.
In order to develop new coding markers in Heliconius, we designed primers to amplify single copy large exon markers that were widely distributed across the Heliconius genome. We used the Basic Local Alignment Search Tool via nucleotide (BLASTN) to compare Heliconius melpomene genomic raw reads against unique genes (unigenes) of Bombyx mori . Those unigenes showing homology with H. melpomene were subjected to BLASTN against B. mori whole genome shotgun (WGS) to reveal the location of introns . Additionally, the program Spidey, freely available at http://www.ncbi.nlm.nih.gov/spidey/, was employed to confirm this information , . With this knowledge in hand, introns and exons limits were determined in the H. melpomene contigs. To estimate the genomic location of selected markers we used the tool SilkMap , to locate genes on the silkworm chromosomes. Given highly conserved synteny between B. mori and H. melpomene , we could infer the putative chromosome location of markers in Heliconius. Identified H. melpomene contigs containing exons longer than 700 bp and located on different chromosomes were selected for PCR primer design using Primer 3 v.0.3.0 . Two previously reported single copy large exons were also included . We performed all PCR reactions in a 10 µL reaction volume containing 1× PCR buffer, 2.5 mM MgCl2, 200µM dNTPs, 1 µM each primer, 0.5 U Taq polymerase (QIAGEN) and 30–40 ng of DNA. The PCR cycling profile was 95°C for 5 min, 30 cycles of 95°C for 30 s, Tm for 45 s (see Table S2 for Tm values for each locus), 72°C for 1 min and final extension at 72°C during 15 min. Two microlitres of the PCR reaction were visualised in a 1% agarose gel to verify the success of PCR. The amplicons were cleaned up by ExoSAP-IT (USB Corp., Cleveland, OH). The BigDye Terminator Cycle Sequencing kit (PerkinElmer Applied Biosystems, USA) was used for direct sequencing using 24 cycles of denaturation at 96°C for 10 s, annealing at 50°C for 5 s, and extension at 60°C for 4 min. Sequencing reactions were cleaned up with Sephadex G50 (SIGMA) and analyzed with an ABI 3130xl DNA genetic analyzer (Applied Biosystems, USA).
Markers in the HmB locus region.
A candidate region where the HmB locus is located in H. m. melpomene was previously described and annotated , . It comprises seven BAC clones with a total length of around 721 kb , . Here, we developed 24 PCR amplicons located in BAC clones AEHM-28L23, AEHM-21P16 and AEHM-27I5 (Table S1). PCR and clean up conditions were as described above. Direct sequencing was possible for 8 markers comprising only coding regions (Table S1). However, due to the presence of introns with considerable indel variation, the other 16 markers were cloned before sequencing (Table S1). PCR products were ligated into the pGEM-T easy vector (Promega). Five or more positive clones per individual were selected, re-amplified and again purified with ExoSAP-IT (USB Corp., Cleveland, OH) to identify distinct alleles. Direct sequencing of clones was performed as described above.
Gene sequences were read, edited and aligned with Sequencher v4.6 (Gene Codes Corporation). For sequences resulting from direct sequencing, haplotypes reconstruction was conducted using DNAsp v4.90.1  implementing the algorithm provided in PHASE . Basically, PHASE assigns a probability of the correct inference of haplotype phase at every heterozygous position. PHASE simulations were repeated eight times for each locus, four without recombination and four with recombination. Each simulation was run with 5,000 iterations. In all cases, the most common output inferring haplotypes with >95% confidence was accepted. In the case of cloned products, aligned sequences of each individual were compared to discard PCR errors and false alleles caused by recombination in the cloning reaction. Purified alignments were translated to protein and checked for stop codons in MacClade v4.08 . Sequences were deposited in GenBank under accession numbers GQ985506–GQ988326.
SNP association statistic.
Allele frequencies were calculated for each single nucleotide polymorphism (SNP) with rare allele frequency greater than 10% in the total sample of 60 haplotypes. With these numbers, a likelihood statistic was calculated to indicate the degree to which the H. heurippa SNP frequency was similar to H. m. melpomene relative to H. c. cordula.Where hi and hj are the number of H. heurippa individuals with i or j nucleotide respectively at a particular SNP position. The frequency of the nucleotide i or j in H. m. melpomene is represented by pim and pjm, respectively, and that in H. c. cordula by pic and pjc. A positive value reflects H. heurippa being more similar to H. m. melpomene, while a negative value indicates association with H. c. cordula. Shared polymorphism at similar frequency in the three species, unique polymorphism in any one of the three species or a private allele in H. heurippa all give a value near to zero. For unlinked markers the average of these values was calculated to give a frequency distribution for the genomic background. Then, average values were similarly calculated in 1,000 bp windows across the HmB region and compared to the frequency distribution for unlinked markers. In order to determine if average likelihood values in the HmB region differ significantly from those in the genomic background, a confidence interval of 95% was established over the distribution of values for unlinked markers, by computing the 2.5 and 97.5 percentiles. Furthermore, using RSXL Excel , the statistical significance of outlier values was determined by resampling with replacement the likelihood averages for all 1,000 bp windows, with 40,000 repetitions.
Linkage disequilibrium analysis.
When new polymorphisms are introduced by recent introgression at some time after species divergence, this should increase levels of linkage disequilibrium (LD) above the background within the region involved. We analyzed LD across the HmB locus for combined samples of H. heurippa and H. c. cordula. This calculation was made among 199 SNPs from the 24 genetic markers sampled in the HmB linked region. These SNPs were those where the minor allele frequency was greater than 10% (from 60 alleles) and thus, were considered as informative for the LD analysis. The LD computation was executed using the software MIDAS , which considers the distance between pairwise markers and does not assume that the genotypic phases are known. The resulting LD among the SNPs was visualized with the R package LDheatmap  plotting the r2 estimates versus physical distance.
Net divergence, nucleotide diversity, and protein evolution.
For all the loci net divergence, fixed differences, shared polymorphisms and nucleotide diversity (π) were estimated with SITES . Additionally, we calculated the number of substitutions per site for synonymous sites (Ks) and non-synonymous sites (Ka) in pairwise comparisons among the three species. The ratio Ka/Ks was determined in order to detect protein evolution in DNAsp v4.90.1 . McDonald-Kreitman test  was also performed in DNAsp v4.90.1.
Gene genealogies and time of introgression
In order to confirm the species relationships, genealogical topologies were reconstructed for three fragments within the HmB region, rooted for the 3′ kinesin (6,493 pb) using H. numata as outgroup and unrooted for 5′ kinesin partial sequence (kin_2; 1,100 pb) and sorting nexin (sdp; 402 pb) (Figure 4). Maximum Parsimony analysis was carried out in PAUP*v4.0b10  using a heuristic search with TBR branch swapping; bootstrap values were calculated with 5,000 replicates using the same search conditions. Modeltest v3.7  was used to determine the most appropriate model for nucleotide substitution based on corrected Akaike information criterion (AICc). For the 3′ kinesin data set Modeltest identified the HKY+I+G model, for 5′ kinesin the K81uf+I+G and for nexin the K80+I. Likelihood reconstructions were also made in PAUP*v4.0b10  based on selected evolutionary models. Heuristic search and bootstrapping were carried out as for parsimony.
The 3′ kinesin genealogy was used to enforce a molecular clock hypothesis. When the likelihoods were compared, constant rate evolution was rejected (x2 = 96.92, df = 48; P<0.001). Then a Bayesian framework, implemented in BEAST v1.4.8 , was employed to obtain an approximate time for the 3′ kinesin introgression. We applied the HKY+I+G model of evolution with four rate categories and assumed a relaxed lognormal clock. Based on the calibration proposed by Wahlberg et al. for Nymphalidae, with Heliconius and Eueides diverged from their common ancestor 18.4 Mya . This date was used as a prior for a probabilistic calibration to determine the splitting time between H. cydno and H. melpomene alleles and between H. heurippa and H. melpomene alleles. The rest of the parameters were sampled keeping the default prior distributions. Two independent runs were implemented, with 50 million steps and burn-ins of 5,000,000. Tracer v1.4 was used to combine runs and observe parameter convergence . Divergence time standard deviations were calculated from 95% confidence/credibility intervals using a normal approximation. We also computed the time of 3′ kinesin introgression under the assumption of species expansion. To perform this calculation, we first tested the fit of the observed mismatch distribution to the theoretical expectation as implemented in Arlequin v. 3.0 . The calculations were made with a neutral mutation rate of ∼2.99×10−10 per base per generation for this region and 10 generations per year.
kinesin RNA in situ hybridizations were performed on H. m. cythera, H. m. rosina and H. cydno 72 to 96h pupal forewings. The specific races involved in the rest of the study were not available as live tissue for this analysis. A 303bp region of exon 13 in the H. melpomene kinesin gene was cloned into the vector pSPT19 (linearised with NheI). RNA probes were prepared with the DIG RNA labeling kit (SP6/T7) (Roche, Cat. 11 175 025 910) according to the manufacturer's instructions. Tissue fixation and in situ hybridization were carried out following a procedure modified from Ramos and Monteiro, 2007 .
Protein evolution analysis via Ka/Ks ratios. The distribution of ka/ks ratios for all genes for each species pair is shown. h: H. heurippa, m: H. m. melpomene and c: H. c. cordula. None of the comparisons had ka/ks >1, suggesting a lack of strong evidence for positive selection.
(0.54 MB PDF)
Protein sequence alignment of 3′ kinesin. The kinesin protein product from exon 9 to exon 14 is shown. Two representative sequences per species are shown. Residues with amino acid changes are highlighted. Polymorphic residues are indicated by a green asterisk, those residues where H. heurippa is different from H. c. cordula and H. m. melpomene are indicated by a blue asterisk and the red asterisk indicates one residue where H. c. cordula is different from H. heurippa and H. m. melpomene. We observed five amino acid differences between H. melpomene versus H. heurippa. This might reflect adaptive change subsequent to formation of H. heurippa, although there was no significant evidence for selection on the locus. Perhaps more likely however, these changes may represent fixation of nearly-neutral variation due to a population bottleneck during the origin of H. heurippa. Only one amino acid difference was found between H. melpomene and H. cydno in a residue also relating H. melpomene with H. heurippa. Although this amino acid replacement might be responsible for a structural protein change causing the red band, several intronic sites show a similar pattern and may have regulatory functions.
Controls for in situ hybridisations. (A) Kinesin expression in the forewing of the red-banded race H. m. rosina showing a distal expression of kinesin similar to that seen in H. m. cythera. The boundary of expression is more diffuse in this individual. The exact boundary position also varies between individuals (data not shown) most probably due to developmental stage. (B) Expression of gene HMB000025 in the forewing of the red-banded race H. m. cythera. Unlike kinesin, HMB000025 (a gene that is expressed in H. melpomene hingwings) does not show any localised expression pattern in the forewings. This indicates that the localization of expression for kinesin is probe-specific and not due to non-specific probe-trapping. (C) In situ control with no riboprobe.
(1.65 MB PDF)
Genes and primer information for HmB linked markers.
(0.08 MB DOC)
The authors would like to thank M. Gonzalez for her help with sequencing, O. Sanjur for logistical advice, and Andrea Manica for advice with analysis. For the genetic access permit number 008 we thank the Ministerio de Ambiente, Vivienda y Desarrollo Territorial (Colombia).
Conceived and designed the experiments: CS ML EB CDJ. Performed the experiments: CS CPD GW. Analyzed the data: CS CPD CDJ. Contributed reagents/materials/analysis tools: CS SWB AS CDJ. Wrote the paper: CS SWB CPD CDJ. Mapped and annotated the HmB color pattern region: SWB. Specimen collection: ML.
- 1. Kirkpatrick M, Ravigné M (2002) Speciation by natural and sexual selection: models and experiments. Am Nat 159: S22–S35.
- 2. Coyne JA, Orr HA (2004) Speciation. Sunderland, Mass.: Sinauer Associates.
- 3. Nosil P, Funk DJ, Ortiz-Barrientos D (2009) Divergent selection and heterogeneous genomic divergence. Mol Ecol 18: 375–402.
- 4. Ortiz-Barrientos D, Counterman BA, Noor MA (2004) The genetics of speciation by reinforcement. PLoS Biol 2: e416.
- 5. Turner TL, Hahn MW, Nuzhdin SV (2005) Genomic islands of speciation in Anopheles gambiae. PLoS Biol 3: e285.
- 6. Gavrilets S (2004) Fitness Landscapes and the Origin of Species: Monographs in Population Biology vol. 41;. In: Levin SA, Horn HS, editors. Princeton, NJ: Princeton University Press.
- 7. Bolnick D, Fitzpatrick BM (2007) Sympatric speciation models and empirical evidence. Annu Rev Ecol Evol Syst 38: 459–487.
- 8. Otto SP, Servedio MR, Nuismer SL (2008) Frequency-dependent selection and the evolution of assortative mating. Genetics 179: 2091–2112.
- 9. Chamberlain NL, Hill RI, Kapan DD, Gilbert LE, Kronforst MR (2009) Polymorphic butterfly reveals the missing link in ecological speciation. Science 326: 847–850.
- 10. Boughman JW (2001) Divergent sexual selection enhances reproductive isolation in sticklebacks. Nature 411: 944–948.
- 11. Savolainen V, Anstett M-C, Lexer C, Hutton I, CJ J, et al. (2006) Sympatric speciation in palms on an oceanic island. Nature 441: 210–213.
- 12. Berlocher SH, Feder JL (2002) Sympatric speciation in phytophagous insects: moving beyond controversy? Ann Rev Entomol 47: 773–815.
- 13. Podos J (2001) Correlated evolution of morphology and vocal signal structure in Darwin's finches. Nature 409: 185–188.
- 14. Miyatake T (2002) Pleiotropic effect, clock genes, and reproductive isolation. Popul Ecol 44: 201–207.
- 15. Puebla O, Bermingham E, Guichard F, Whiteman E (2007) Colour pattern as a single trait driving speciation in Hypoplectrus coral reef fishes. Proc R Soc Lond B 274: 1265–1271.
- 16. Seehausen O (2004) Hybridization and adaptive radiation. Trends Ecol Evol 19: 198–207.
- 17. Mallet J (2007) Hybrid speciation. Nature 446: 279–283.
- 18. Feder JL, Berlocher SH, Roethele JB, Dambroski H, Smith JJ, et al. (2003) Allopatric genetic origins for sympatric host-plant shifts and race formation in Rhagoletis. Proc Nat Acad Sci U S A 100: 10314–10319.
- 19. Kim M, Cui M-L, Cubas P, Gillies A, Lee K, et al. (2008) Regulatory genes control a key morphological and ecological trait transferred between species. Science 322: 1116–1119.
- 20. Anderson TM, vonHoldt BM, Candille SI, Musiani M, Greco C, et al. (2009) Molecular and evolutionary history of melanism in north american gray wolves. Science 323: 1339–1343.
- 21. Jiggins CD, Salazar C, Linares M, Mavárez J (2008) Hybrid trait speciation and Heliconius butterflies. Philos Trans R Soc Lond B 363: 3047–3054.
- 22. Arnold ML (2006) Evolution through genetic exchange. Oxford, UK: Oxford University Press.
- 23. Ungerer MC, Baird SJE, Pan J, Rieseberg LH (1998) Rapid hybrid speciation in wild sunflowers. Proc Nat Acad Sci U S A 95: 11757–11762.
- 24. Meng C, Kubatko LS (2009) Detecting hybrid speciation in the presence of incomplete lineage sorting using gene tree incongruence: a model. Theor Popul Biol 75: 35–45.
- 25. Mallet J, McMillan WO, Jiggins CD (1998) Mimicry and warning colour at the boundary between races and species. In: Howard DJ, Berlocher SH, editors. Endless Forms: Species and Speciation. New York: Oxford University Press. pp. 390–403.
- 26. Baxter SW, Johnston SE, Jiggins CD (2009) Butterfly speciation and the distribution of gene effect sizes fixed during adaptation. Heredity 102: 57–65.
- 27. Joron M, Papa R, Beltrán M, Chamberlain N, Mavárez J, et al. (2006) A conserved supergene locus controls colour pattern diversity in Heliconius butterflies. PLoS Biol 4: e303.
- 28. Mallet J, Beltrán M, Neukirchen W, Linares M (2007) Natural hybridization in Heliconiine butterflies: the species boundary is a continuum. BMC Evol Biol 7: 28.
- 29. Jiggins CD, Naisbit RE, Coe RL, Mallet J (2001) Reproductive isolation caused by colour pattern mimicry. Nature 411: 302–305.
- 30. Jiggins CD (2008) Ecological speciation in mimetic butterflies. Bioscience 58: 541–548.
- 31. Gilbert LE (2003) Adaptive novelty through introgression in Heliconius wing patterns: evidence for shared genetic “tool box” from synthetic hybrid zones and a theory of diversification. In: Boggs CL, Watt WB, Ehrlich PR, editors. Ecology and Evolution Taking Flight: Butterflies as Model Systems. Chicago: University of Chicago Press. pp. 281–318.
- 32. Linares M (1989) Adaptive microevolution through hybridization and biotic destruction in the neotropics. PhD thesis. University of Texas, Austin, TX.
- 33. Bull V, Beltrán M, Jiggins CD, McMillan WO, Bermingham E, et al. (2006) Polyphyly and gene flow between non-sibling Heliconius species. BMC Biol 4: 11.
- 34. Kronforst M, Young LG, Blume LM, Gilbert LE (2006) Multilocus analyses of admixture and introgression among hybridizing Heliconius butterflies. Evolution 60: 1254–1268.
- 35. Mavárez J, Salazar C, Bermingham E, Salcedo C, Jiggins CD, et al. (2006) Speciation by hybridization in Heliconius butterflies. Nature 441: 868–871.
- 36. Salazar C, Jiggins CD, Arias CF, Tobler A, Bermingham E, et al. (2005) Hybrid incompatibility is consistent with a hybrid origin of Heliconius heurippa Hewitson from its close relatives, Heliconius cydno Doubleday and Heliconius melpomene Linnaeus. J Evol Biol 18: 247–256.
- 37. Mallet J (2009) Rapid speciation, hybridization and adaptive radiation in the Heliconius melpomene group. In: Butlin R, Bridle J, Schluter D, editors. Speciation and Patterns of Diversity. Sheffield: Cambridge University Press. pp. 177–194.
- 38. Melo MC, Salazar C, Jiggins CD, Linares M (2009) Assortative mating preferences among hybrids offers a route to hybrid speciation. Evolution 63: 1660–1665.
- 39. Duenez-Guzman EA, Mavarez J, Vose MD, Gavrilets S (2009) Case studies and mathematical models of ecological speciation. 4. Hybrid speciation in butterflies in a jungle. Evolution 63: 2611–2626.
- 40. Salazar C, Jiggins CD, Taylor J, Kronforst M, Linares M (2008) Hybrid speciation and the genealogical history of Heliconius heurippa. BMC Evol Biol 8: 132.
- 41. Baxter SW, Nadeau N, Maroja L, Wilkinson P, Counterman B, et al. (2009) Genomic hotspots for adaptation: the population genetics of Müllerian mimicry in the Heliconius melpomene clade. PLoS Genet 6(2): e1000794.
- 42. Baxter SW, Papa R, Chamberlain N, Humphray SJ, Joron M, et al. (2008) Convergent evolution in the genetic basis of Müllerian mimicry in Heliconius butterflies. Genetics 180: 1567–1577.
- 43. Wahlberg N, Wheat CW (2008) Genomic outposts serve the phylogenomic pioneers: designing novel nuclear markers for genomic DNA extractions of Lepidoptera. Syst Biol 57: 231–242.
- 44. Excoffier L (2004) Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol 13: 853–864.
- 45. Schneider S, Excoffier L (1999) Estimation of past demographic parameters from the distribution of pairwise differences when the mutation rates vary among sites: Application to human mitochondrial DNA. Genetics 152: 1079–1089.
- 46. Ferguson LC, Jiggins CD (2009) Shared and divergent expression domains on mimetic Heliconius wings. Evol Dev 11: 498–512.
- 47. Beltrán M, Jiggins CD, Brower AV, Bermingham E, Mallet J (2007) Do pollen feeding, pupal-mating and larval gregariousness have a single origin in Heliconius butterflies? Inferences from multilocus DNA sequence data. Biol J Linn Soc 92: 19.
- 48. Counterman BA, Araujo-Perez F, Hines HM, Baxter SW, Morrison CM, et al. (2009) Genomic hotspots for adaptation: the population genetics of Müllerian mimicry in Heliconius erato. PLoS Genet 6(2): e1000796.
- 49. Miki H, Okada Y, Hirkawa N (2005) Analysis of the kinesin superfamily: insights into structure and function. Trends Cell Biol 15: 467–476.
- 50. Tekotte H, Davis I (2002) Intracellular mRNA localization: motors move messages. Trends Genet 18: 636–642.
- 51. Mavárez J, Linares M (2008) Homoploid hybrid speciation in animals. Mol Ecol 17: 4181–4185.
- 52. Papanicolaou A, Gebauer-Jung S, Blaxter M, McMillan WO, Jiggins CD (2008) ButterflyBase: a platform for lepidopteran genomics. Nucleic Acids Res 36: d582–d587.
- 53. Pringle EG, Baxter SW, Papanicolaou WA, Lee SF, Jiggins CD (2007) Synteny and chromosome evolution in the Lepidoptera: evidence from mapping in Heliconius melpomene. Genetics 177: 417–426.
- 54. Mita K, Kasahara M, Sasaki S, Nagayasu Y, Yamada T, et al. (2004) The genome sequence of silkworm, Bombyx mori. DNA Res 11: 27–35.
- 55. Wang J, Xia Q, He X, Dai M, Ruan J, et al. (2005) SilkDB: a knowledgebase for silkworm biology and genomics. Nucleic Acids Res 33: 399–402.
- 56. Rozen S, Skaletsky H (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S, editors. Bioinformatics Methods and Protocols: Methods in Molecular Biology. Totowa, NJ: Humana Press. pp. 365–386.
- 57. Rozas J, Sánchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496–2497.
- 58. Stephens M, Smith N, Donnelly P (2001) A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68: 978–989.
- 59. Maddison WP, Maddison DR (2001) MacClade version 4.02: analysis of phylogeny and character evolution. Sunderland, Massachusetts: Sinauer Associates.
- 60. Blank S, Seiter C, Bruce PResampling stats in excel. 2 ed. Arlington, Virginia: Resampling stats, Inc..
- 61. Gaunt TR, Rodriguez S, Zapata C, Day IN (2006) MIDAS: software for analysis and visualisation of interallelic disequilibrium between multiallelic markers. BMC Bioinformatics 7: 227.
- 62. Shin J-H, Blay S, McNeney B, Graham J (2006) LDheatmap: An R function for graphical display of pairwise linkage disequilibria between single nucleotide polymorphims. J Statist Soft 16: 1–19.
- 63. Hey J, Wakeley J (1997) A coalescent estimator of the population recombination rate. Genetics 145: 833–846.
- 64. McDonald JH, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature 351: 1111–1117.
- 65. Swofford DL (2003) PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). 4. ed. Sunderland, Massachusetts: Sinauer Associates.
- 66. Posada D, Crandall KA (1998) Modeltest: testing the model of DNA substitution. Bioinformatics 14: 817–818.
- 67. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4: e88.
- 68. Wahlberg N, Leneveu J, Kodandaramaiah U, Pena C, Nylin S, et al. (2009) Nymphalid butterflies diversify following near demise at the Cretaceous/Tertiary boundary. Proc Biol Sci 276: 4295–4302.
- 69. Rambaut A, Drummond AJ (2007) Tracer v1.4, Available from http://beast.bio.ed.ac.uk/Tracer.
- 70. Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics date analysis. Evol Bioinform Online 1: 47–50.
- 71. Ramos D, Monteiro A (2007) In situ protocol for butterfly pupal wings using riboprobes. J Vis Exp 208.