The Whereabouts of an Ancient Wanderer: Global Phylogeography of the Solitary Ascidian Styela plicata

Genetic tools have greatly aided in tracing the sources and colonization history of introduced species. However, recurrent introductions and repeated shuffling of populations may have blurred some of the genetic signals left by ancient introductions. Styela plicata is a solitary ascidian distributed worldwide. Although its origin remains unclear, this species is believed to have spread worldwide by travelling on ship's hulls. The goals of this study were to infer the genetic structure and global phylogeography of S. plicata and to look for present-day and historical genetic patterns. Two genetic markers were used: a fragment of the mitochondrial gene Cytochrome Oxidase subunit I (COI) and a fragment of the nuclear gene Adenine Nucleotide Transporter/ADP-ATP Translocase (ANT). A total of 368 individuals for COI and 315 for ANT were sequenced from 17 locations worldwide. The levels of gene diversity were moderate for COI to high for ANT. The Mediterranean populations showed the least diversity and allelic richness for both markers, while the Indian, Atlantic and Pacific Oceans had the highest gene and nucleotide diversities. Network and phylogenetic analyses with COI and ANT revealed two groups of alleles separated by 15 and 4 mutational steps, respectively. The existence of different lineages suggested an ancient population split. However, the geographic distributions of these groups did not show any consistent pattern, indicating different phylogeographic histories for each gene. Genetic divergence was significant for many population-pairs irrespective of the geographic distance among them. Stochastic introduction events are reflected in the uneven distribution of COI and ANT allele frequencies and groups among many populations. Our results confirmed that S. plicata has been present in all studied oceans for a long time, and that recurrent colonization events and occasional shuffling among populations have determined the actual genetic structure of this species.


Introduction
Biological introductions have notably increased during the last century, posing a major threat to global biodiversity and altering the structure and function of many communities [1][2][3][4][5][6][7]. Despite some relatively recent attempts to buffer the ecological impact of these introductions [e.g. [8][9][10], oceans remain one of the most affected ecosystems [7,[11][12][13][14][15][16][17]. Among other transport vectors, non-native species arrive to new locations through ships' hulls and sea chests, in ballast water or with spats for mariculture. Thus, the increasing activity in maritime traffic and aquaculture has favoured the introduction of marine species all over the world [13,[18][19]. The establishment of new genetic variants and spread of exotic species has also been facilitated by a proliferation of harbours and other artificial structures along the coast [20][21][22][23][24][25].
Genetic diversity plays a crucial role on the successful establishment of an introduced species or variant in a new area [26][27][28][29][30]. The development of genetic tools and markers has widely contributed to enhance our knowledge on these species. A throughout assessment of the genetic structure of an introduced species, including its history of subdivision and gene flow, allows the identification of range expansions, colonization events, and an understanding of the invasive potential and the relative contributions of artificial and natural dispersal [e.g. [31][32][33][34].
The increasing pace of introductions has also fostered increased awareness. Monitoring and control programs have been established, and recent introductions are more easily detected and inventoried than in the past [e.g . 17]. However, historical invasions may still remain hidden. Some species could have arrived to a new location long before the distribution ranges of autochthonous species were assessed, and be now regarded as native [35,36]. Cosmopolitan or broadly distributed species, particularly those thriving in harbours and artificial substrata, are likely to be ''pseudoindigenous'' species [36]. Lack of historical records in many regions, taxonomic flaws and cryptic speciation further complicate the issue [e.g., 37,38]. In addition, and despite the new methods available [e.g., 33], our ability to extract information may be limited by our knowledge and access to native populations, recurrent introduction events, and shuffling of populations during a long period of time (i.e. centuries).
The paramount importance of ascidians for the study of marine introductions is well recognized, as they represent one of the most common invaders [39,40]. Ascidians have short-lived larvae, thus anthropogenic transport can greatly increase their dispersal abilities. The rate of introduction of non-indigenous ascidians has been increasing in the last decades [40], mostly linked to ship traffic or aquaculture activities [e.g., 39,[41][42][43][44][45]. However, some species may have been translocated centuries ago and have now become ancient introductions whose origins are poorly known [46]. These ancient colonizers are often species commonly found in harbours and man-made substrates, have broad distribution ranges and, while naturalized in many areas, continue to be introduced in new regions of the globe [e.g. [47][48][49][50].
Styela plicata (Lesueur, 1823) (Tunicata, Ascidiacea) is a solitary ascidian commonly found inhabiting marinas and harbours of warm and temperate oceans, usually at high-densities. In spite of its broad geographical distribution, the native range of this species is not yet elucidated [46]. Evidence to date suggests that S. plicata is native to the NW Pacific Ocean [36,[51][52][53][54]. In fact, the description of this species was based on an individual found on a ship's hull in Philadelphia (NE USA), and no other individual was observed in the surrounding natural substrata [55]. All records of S. plicata are based on observations of man-made structures, except in Japan, where this species has been observed to grow in natural habitats [Nishikawa pers. comm.,54]. A series of unique characteristics has allowed S. plicata to thrive in these diverse environments and outcompete other benthic invertebrates. S. plicata can physiologically adapt to widely fluctuating environments, particularly to changes in temperature and salinity [56,57]. This species can also tolerate highly polluted waters [58], grows rapidly until reaching sexual maturity [59][60][61], and is capable of self-fertilization (authors' current research).
To gain insight into the invasive potential of this species, we analyzed the genetic structure of seventeen populations covering most of S. plicata's distribution range. Using a mitochondrial (COI) and a nuclear (ANT) marker, we attempted to infer the global phylogeography of S. plicata, understand its dispersion patterns, and assess the diversity and connectivity of introduced populations.

Sampling
Samples of Styela plicata were collected in 2009 and 2010 from seventeen localities (Table 1): two from the Mediterranean Sea (Iberian Peninsula), three from the north-eastern Atlantic Ocean (Iberian Peninsula, Canary Islands), two from the north-western Atlantic Ocean (US east coast), one from the south-western Atlantic ocean (Brazil), five from the north-western Pacific Ocean (Japan and China), one from the south-western Pacific Ocean (Australia), one from the north-eastern Pacific Ocean (US west coast), and two from the south-western Indian Ocean (South Africa). These locations were chosen to cover as much of the distribution range of this widespread species as possible. All specimens were collected from artificial substrata (harbours, marinas or decks), except for one population collected from natural substratum in Sakushima Island (Japan). The shortest distance by sea between location pairs was calculated using the ''measure line'' tool of Google Earth (version 3.0, Google Inc., Amphitheatre Parkway, CA, USA). S. Plicata samples were obtained according to current Spanish regulations. Samples from outside Spain were collected by national researchers following their country regulations. This species is not protected by any law and all sampling was conducted outside protected areas.
All specimens were collected from depths that ranged between 0 and 2 m by pulling up harbour ropes, removing specimens from submersed docks and pilings, or pulling individuals from rocky assemblages (natural population). Samples were dissected in situ and a piece of muscular tissue from the mantle or the siphon was immediately preserved in absolute ethanol. Ethanol was changed after a few hours, and samples were then stored at 220uC until DNA extraction.

DNA extraction and sequencing
Total DNA was extracted using the REDExtract-N-Amp Tissue PCR Kit (Sigma-Aldrich). The universal primers LCO1490 and HCO2198 described in Folmer et al. [62] were used to amplify a fragment of the mitochondrial gene Cytochrome Oxidase subunit I (COI) from 368 individuals. The primer set designed by Jarman et al. [63] was used to amplify a fragment of the single-copy nuclear Adenine Nucleotide Transporter (ANT) gene. Based on the resulting sequences, we also designed the specific primers ANTf_Splic (59-TTG GCA GCT GAT ATT GGA AAA GG-39) and ANTr_Splic (59-CCA GAC TGC ATC ATC ATK CG-39), using the software Primer 3 v.0.4.0. [64]. Amplifications were carried out for 315 individuals using Jarman et al. [63] primers or the newly designed ones.
For both genes, amplifications were performed in a final volume of 20 mL using 10 mL of REDExtract-N-amp PCR reaction mix (Sigma-Aldrich), 1 mL of each primer (10 mM) for ANT or 0.8 mL for COI, and 2 mL of template DNA. The PCR program for ANT consisted of an initial denaturing step at 94uC for 2 min, 30 amplification cycles (denaturing at 94uC for 1 min, annealing at 58uC for 30 seconds and extension at 72uC for 30 seconds), and a final extension at 72uC for 6 min, on a PCR System 9700 (Applied Biosystems). The PCR program for COI was as described above, except for the amplification cycles, which were done at 94uC for 45 seconds, 50uC for 45 seconds and 72uC for 50 seconds. PCR products were purified using MultiScreenH filter plates (Millipore), labelled using BigDyeH Terminator v.3.1 (Applied Biosystems) and sequenced on an ABI 3730 Genetic Analyzer (Applied Biosystems) at the Scientific and Technical Services of the University of Barcelona (Spain). Other samples were directly sent for purification and sequencing to Macrogen Inc. (Seoul, Korea Korea). From the resulting sequences, we discarded low quality reads for ANT, hence the lower number of specimens sequenced for this marker.
Sequences were edited and aligned using BioEditH v.7.0.5.3 [65]. Some ANT sequences showed a deletion of 22 amino acids, thus heterozygotes had unequal lengths and had to be manually reconstructed by carefully analyzing both forward and reverse chromatograms. The allelic phase for ANT genotypic data was analyzed using fastPHASE 1.1 [66] implemented in the software DnaSP v.5 [67]. We also used the Recombination Detection Program (RDP3) [68] to test for recombination in our nuclear sequences. Sequences obtained in this study have been deposited in GenBank (accession numbers HQ916425 to HQ916446 for COI, and HQ916363 to HQ916423 for ANT).

Population genetics
Number of alleles (Nh), gene diversity (Hd), and nucleotide diversity (p) were computed with DnaSP v.5 [67]. Allelic richness was calculated using the program Contrib v.1.02, which implements a rarefaction method to obtain estimates independently of sample size [106]. Genetix v.4.05.2 [69] was used to calculate inbreeding coefficients for the ANT data obtained with fastPHASE. The nearly unbiased estimation of allelic differentiation between populations was based on the adjusted D est measure described by Jost [70], and calculated for each marker with SPADE [71]. The mean and SE values obtained with SPADE from 1,000 bootstrap replicates were used to calculate the confidence intervals and the degree of significance of the differentiation values (using a normal approximation). To correct for multiple comparisons, we set the p-value at 0.009, following the Benjamini and Yekutieli False Discovery Rate correction [72]. A value of D was deemed significant when the confidence interval around its mean did not contain 0. An analysis of molecular variance (AMOVA) was performed to examine population structure, and its significance was tested running 10,000 permutations in Arlequin v.3.1 [73]. The correlation of genetic and geographical distances was tested for all pairs of populations with a Mantel test [74] and 10,000 permutations using Arlequin.
Visual assessment of between-population differentiation was achieved by performing a discriminant analysis of principal components (DAPC) [75] on a dataset comprising information obtained from both genes. This recently developed technique extracts information from genetic datasets (multivariate in nature) by first performing a principal component analysis (PCA) on groups or populations, and then using the PCA factors as variables for a discriminant analysis (DA). The previous PCA step ensures that the variables input to DA meet the requirements of having less variables (alleles) than number of observations (individuals) and not having any correlation between variables [75]. DA seeks to maximize the inter-group component of variation. We performed DAPC analyses on both genes combined by using the adegenet package for R [76]. DAPC was performed (function dapc) using pre-defined groups corresponding to populations or groups of populations (see Results). Variables were centred but not scaled. In all analyses, 50 principal components of PCA were retained and input to DA. DA also provided estimates of the probability with which the analysis recovers the true membership of the individuals. Finally, in order to detect population growth and infer population demographic events, we computed Tajima's D [77], Fu's F s [78], R 2 [79], and the raggedness index (based on the mismatch distribution) [80], using DnaSP.

Phylogenetic and phylogeographical analyses
The complete dataset was used to construct a median-joining network for each marker using Network v.4.5.1.6 [81]. Resulting loops for the ANT network were solved using criteria derived from the coalescent theory [82,83]. For the COI network, only one loop was observed but it could not be resolved.
Phylogenetic analyses were conducted using Styela gibbsii as an outgroup (acc. number HQ916447 for COI and HQ916424 for ANT). The best-fit model of nucleotide substitution for each marker was selected using jModeltest v.0.1.1 [84,85], with the Akaike Information Criterion (AIC) for COI, and the corrected version for small samples (AICc) for ANT. The positions corresponding to the indel detected for ANT were not included in the analysis (see Results). For Bayesian inference (BI), MrBayes v.3.1.2 software [86] was used to infer tree topologies, implementing the corresponding likelihood model for each gene fragment. For each gene, the program was run with 1 million generations with a sample frequency of 100 (10,000 final trees). After verifying that stationarity had been reached (i.e. the average standard deviation of split frequencies between two independent chains reached less than 0.01), the first 1,000 trees were discarded in both cases as burnin. Majority-rule consensus trees were generated from the remaining 9,000 trees. Bayesian posterior probabilities were used as a measure of support for the branch nodes obtained. The obtained trees were drawn with FigTree v.1.2.2. DnaSP was used to perform the McDonald & Kreitman test [87], and check whether patterns of variation among groups of sequences were consistent with predictions for a neutral model.

Mitochondrial gene
For the mitochondrial COI gene, 368 sequences with a final alignment length of 624 bp were obtained. In total, we found 22 haplotypes with 38 polymorphic sites (6%), 6 of which corresponded to non-synonymous substitutions. The majority of haplotypes obtained (68%) corresponded to private haplotypes, most of which were found in the north-western Atlantic Ocean (Fig. 1). Remarkably, the six haplotypes found for the North Carolina population (NC) were private. The number of haplotypes per location ranged between one in Tenerife and six in Ferrol and North Carolina ( Table 2, Table S1). Regarding the oceanic basins, the Atlantic and Pacific Ocean had higher haplotype diversity (17 and 8 haplotypes, respectively) than the Mediterranean Sea and the Indian Ocean (4 and 5 haplotypes, respectively; Table 2). Mean and total haplotype diversity (Hd) were 0.497 (60.266 SD) and 0.810 (60.010 SD), respectively. Mean nucleotide diversity was 0.0055 (60.005 SD), while total nucleotide diversity (p) was 0.0135 (60.0006 SD). Variation in haplotype and nucleotide diversity between populations within basins was considerable. For instance, the populations of Knysna (KNY) and Port Elizabeth (PE) located in the Indian Ocean, had a haplotype diversity of 0.668 and 0.205 respectively. The California population (CAL) presented the highest haplotype and nucleotide diversity values (0.800 and 0.01684, respectively; Table 2). The higher allelic richness values (obtained after rarefaction to a common sample size of 11 and 40 genes per populations and basins) were found for the San Fernando (SP, 3.747) and Ferrol populations (FE, 3.793), while the lower values corresponded to the populations of Manly (AM, 0.458) and Arenys de Mar (AR, 0.555). When comparing between basins, the Atlantic Ocean showed the highest allelic richness, whereas the Mediterranean Sea had the lowest value ( Table 2). Jost's adjusted estimator (D est ) was used to assess the allelic differentiation between populations for each marker, showing high values of differentiation (mean D est = 0.660). The COI data revealed high differentiation between many population-pairs, as 88 comparisons out of 136 resulted in significant differences after correction for multiple comparisons (Table 3). For instance, the North Carolina population had no alleles in common with any other population (Fig. 2), and many other populations (e.g. Port Elizabeth, Manly, Misaki, Okinawajima) also differed considerably in their allele composition. No particular pattern was found for the only population collected from natural substratum (Sakushima Island, SKS), which was significantly different from half of the remaining populations.
The results of the hierarchical AMOVA showed higher within population variability (58.41%) than the one between populations (41.59%, P,0.001, Table 4). AMOVA analyses performed by grouping populations according to their oceanic basin revealed that most of the genetic diversity was due to variability within populations (56.97%, P,0.001), and among populations within basins (34.36%, P,0.001). However, no significant differences in genetic structure were detected between basins (8.67%, P = 0.055 for COI; Table 4). Accordingly, the Mantel test showed no correlation between genetic differentiation and geographical distance between populations (r = 0.00009, P = 0.434).
Overall, neutrality tests were not significant (Table 5), and hence did not support any lack of equilibrium due to selection or population size changes at any level (either partitioned by populations or oceanic basins). The only exceptions encountered were for the Australian population of Manly (AM), with significantly negative Tajima's D values, and for Sakushima and the Group 1 of haplotypes (see below), with a significant raggedness index ( Table 5).
The network obtained for the COI gene (Fig. 2a) revealed two divergent lineages (hereafter called Group 1 and Group 2)    (Fig. 2a). The BI tree reconstructed with COI haplotypes showed two moderately supported clades exhibiting 3.27% sequence divergence among them (Fig. 2b). These two clades matched exactly with Group 1 and 2 described for the COI network (Fig. 2a). Haplotype H_2 (inferred as ancestral) held a basal position within Group 1, while no evidence for a basal haplotype or group of haplotypes was found for Group 2.

Nuclear gene
For the ANT gene, we obtained 315 sequences of 220 bp. The ANT fragment targeted here includes an intron in many metazoans [63]. However, in our case, all sequences could be translated to amino acids and final sequence length was in accordance with what has been found for species without an intron in this position [63]. Our resulting dataset contained 80 homozygotes, which allowed a reliable reconstruction of the gametic phase of the heterozygotes (.95% confidence). No evidence was detected for recombination within our sequences. In total we obtained 61 alleles (Tables S2 and S3), 34 in the Atlantic (20 of which were exclusive to this basin) and 27 in the Pacific (Table 2). A deletion of 22 amino acids was found in 5 alleles (Table S2). Once more, the Mediterranean showed the lowest number of alleles (7, of which only one was private). Mean and total haplotype diversity (Hd) were 0.761 (60.011 SD) and 0.820 (60.012 SD), respectively. Mean nucleotide diversity was 0.0295 (60.008 SD), while total nucleotide diversity (p) was 0.0321 (60.0006 SD). Gene and nucleotide diversity did not differ between basins, except for the Mediterranean ( Table 2). The South African populations of Knysna (KNY) and Port Elizabeth (PE) showed the highest values for genetic diversity, followed by most Pacific populations and some Atlantic ones ( Table 2). Port Elizabeth (PE) was also the population showing the highest allelic richness (14.830) followed by Hong Kong (HK, 9.614), North Carolina (NC, 8.927) and Knysna (KNY, 8.145). As found for the mitochondrial gene, the lowest value of allelic richness corresponded to Manly (AM, 3.140). Low values were also retrieved for the Mediterranean populations of Javea (JA, 3.307) and Arenys de Mar (AR, 3.733). Comparisons between basins indicated that the Indian Ocean had the highest allelic richness, while the Mediterranean had the lowest ( Table 2). Eight populations had less heterozygotes than expected, five of which (Arenys de Mar, Javea, San Fernando, Ferrol and North Carolina) deviated significantly from Hardy-Weinberg equilibrium (significant F is values). Interestingly, 9 populations had an excess of heterozygotes (and negative F is ), and in 4 of them (Tenerife, Brasil, Misaki, Sakushima) these inbreeding coefficients were significant. Per basins, there was a heterozygote deficit in all populations except for the Pacific, and this deficit was most marked for the Mediterranean group of populations (0.282 H obs vs. 0.554 H exp ).
Jost's adjusted estimator showed lower values of differentiation for the nuclear intron ANT (mean D est = 0.324) than for the mitochondrial COI. D est values obtained for the ANT gene revealed fewer significant differences in pair-wise comparisons (45 out of 136). As before, the North Carolina population was significantly different from all the others (Table 3). Interestingly, the Sakushima population (on natural substratum) only differed from the North Carolina and Hong Kong populations. The hierarchical AMOVA analyses showed that most of the observed variability was found within populations (90.6%), and only a small but significant 9.4% (p,0.001) of variability was found among these populations (Table 4). When grouping populations according to their oceanic basins, AMOVA analyses' results were similar to those found for the mitochondrial marker. Most of the genetic diversity was due to variability within populations (90.19%, P,0.001), and among populations within basins (8.20%, P,0.001). No significant differences in genetic structure were detected between basins (1.61%, P = 0.127 ; Table 4). As found for COI, the Mantel test showed no correlation between genetic differentiation and geographical distance between populations (r = 0.000001, P = 0.243). Regarding the neutrality test, the same trend of COI was observed for ANT, with most tests being non-significant. However, Fu's F s were significant for the Atlantic Ocean and the Port Elizabeth population (Table 5).
Network analyses showed a considerable amount of loops that were unambiguously resolved following coalescent rules (Fig. 3a). None of these loops affected the main structures shown in the network. However, the relationship among alleles should be considered with caution and no clear ancestral allele could be reliably designated. Although less divergent than with the COI data, the ANT network also showed a distinction in two groups of sequences separated by 4 mutational steps (Fig. 3a). None of these four mutations corresponded to non-synonymous changes. Finally, the 22 amino acids deletion found in 5 alleles (H_4, H_14, H_39, H_43, H_50) was also retrieved (represented by a dot line in Fig. 3a). McDonald-Kreitman neutrality tests could not be performed between these groups, as there was no fixed difference between them. BI analysis showed that one of the groups (hereafter called Group A) occupied a basal position within the resulting tree, while a second group (Group B) formed a monophyletic, derived clade supported by a posterior probability of 1 (Fig. 3b). Within group B, the five alleles with a 22 amino acid deletion also formed a monophyletic clade (posterior probability = 1; Fig. 3b). When the sequence fragment corresponding to the deletion was removed from the analyses, these 5 alleles still grouped together, indicating that their phylogenetic relationship was independent from the indel presence. The alleles containing the deletion were found in all studied basins, not showing any apparent geographic pattern (Table S2, Figure 3a).
The private allele H_41 from North Carolina appeared genetically distinct from all the others in both the network and the BI analyses (Fig. 3a). This sample was re-extracted and sequenced de novo, but the same resulting sequence was obtained. The Mediterranean populations only presented alleles from Group A of ANT, while the remaining populations presented alleles from both groups (especially, those populations from the Pacific Ocean). This pattern explains the lower genetic diversity found in the Mediterranean basin compared with that of the other oceans. Group B seems to be a highly successful derived clade that has spread in most populations. Interestingly, in all localities in which there was an excess of heterozygotes (negative F is ), there was also a higher than expected proportion of individuals having one allele of each group (A or B; 0.75 observed vs. 0.49 expected frequency). This is especially noteworthy in the Pacific populations, where we found twice the number of ''mixed'' genotypes than expected. The only exception was for North Carolina, which had a significant deficit of heterozygotes and less than expected genotypes with an allele from each group.
Finally, DAPC analyses were performed combining results obtained for COI and ANT. In order to avoid cluttering of populations, a first DAPC was performed with 3 groups: the North Carolina population (significantly different from the rest in previous analyses), the Sakushima population (the only natural substratum population) and the remaining populations. The PCA Table 4. Analysis of the molecular variance (AMOVA) for the COI and ANT genetic markers. components retained explained 98.6% of the total variance observed. The scatterplot of the first two components of the DA (Fig. 4) showed that the first axis separates North Carolina from the rest, which form a tight cluster, while the second axis slightly sets apart the Sakushima population, although with a clear overlap of the inertia ellipses. We then repeated the analysis removing the North Carolina population and considering all populations as separate groups. 99.2% of the total variance was explained by the retained components of the PCA. The populations appeared mixed in the space of the first two axes of the discriminant analysis (Fig. 4), although the first axis separated slightly Misaki, Port Elizabeth and Manly on one extreme, and the two Mediterranean populations at the other end. The rest of the populations clustered tightly together, with the natural substratum population (Sakushima) appearing in a central position.

Discussion
Several remarkable features emerged from the recovered distribution of the genetic variability. First, there is a divergence in lineages for both markers, each featuring two groups of sequences. Second, the genetic pool is well mixed at the basin level, with little or no phylogeographic signal remaining. Third, many population pairs are genetically different, regardless of the geographic distance among them. Finally, there seems to be an effect of selection on the genetic makeup of this species, as illustrated by the highly divergent population of North Carolina and the intra-individual distribution of both groups of ANT sequences.
The most parsimonious explanation for the presence of two groups of sequences for COI (group 1 and 2) and ANT (group A and B) is that they have arisen concomitantly in a past fragmentation event within the native area of the species. We cannot, however, exclude an independent origin of these genetic splits. At present, the distribution of the groups obtained with the two markers is totally unrelated. Sequences of the Group A for ANT were found in ascidians having mitochondrial sequences of both lineages (Groups 1 and 2), and in direct proportion to their relative abundances. The same trend was observed for individuals having sequences of Group B for ANT (Table S3). If the differentiation of ANT and COI in different lineages occurred simultaneously in allopatric regions, the link between these markers was lost long ago. Mitochondrial genes are inherited maternally, while nuclear genes can be shuffled repeatedly through sexual reproduction. Thus, the lack of congruence found in the distribution of both markers could be due to frequent contact between individuals from different lineages coupled with genetic drift. A greater sensitivity of mitochondrial genes to genetic drift has been previously reported [88], and may explain the differences observed between mitochondrial and nuclear markers [e.g., [88][89][90]. In addition, no geographic pattern was observed in the distributions of the lineages observed for both markers. Even in the putative native area of S. plicata (NW Pacific), we found sequences of the two groups of COI and ANT in the same populations and, for ANT, even in the same individual. Barros et al. [54] found nine COI haplotypes for Styela plicata, 8 belonging to our Group 1 and one to our Group 2. Based on this divergent haplotype, these authors suggested that there could be a cryptic species within what is known as Styela plicata. Our results did not lend support to this hypothesis, as the nuclear marker showed a distribution unrelated to these two groups of mitochondrial sequences. Furthermore, when comparing our mitochondrial sequences with other species of the genus, the resulting genetic divergence was much higher than that found between our two COI groups (3.27% between our groups, 21.12% between S. plicata and S. gibbsii; 22.7% with S. clava, and 20% with S. montereyensis). The divergent sequences of S. plicata reported from Australia (Lake Conjola) by Pérez-Portela et al. [107] (GenBank accession numbers FJ528633-34 for COI and FH897323 for 18S rRNA) were likely the result of sample mislabelling (Pérez-Portela, pers. comm.). We sequenced 4 further specimens from the same locality and verified that they all had typical S. plicata COI sequences (i.e., Haplotype 5).
Although the native range of Styela plicata is not known with certainty, the prevailing hypothesis is that it comes from the NW Pacific area [36,54]. S. plicata would have then dispersed to other tropical and warm-water regions by ship fouling, likely since the early transoceanic navigation times [36]. Our results indicated that at present the genetic pool of S. plicata is well mixed among basins, with most genetic variability found within populations. Moreover, high genetic variability and the putatively most ancient alleles have not only been found in the NW Pacific populations (e.g. Sakushima, Hong Kong) but also in other oceanic basins (e.g. North East Pacific, Atlantic and Indian Ocean; see also David et al. [91]). Thus, we could not find any clear genetic signal in favour (or against) the hypothesis on the NW Pacific origin of this species. The only potential trend observed in our data was for the Mediterranean basin. The Mediterranean populations presented the lowest values for all diversity indexes, and only displayed group 1 for COI and group B for ANT. However, these findings should be interpreted with caution, as only two Mediterranean localities were included in this study. Lack of resolution for assessing native areas was also found in studies with other ascidian species that are believed to be ancient colonizers (e.g. Ciona intestinalis [38]). On the other hand, species that have spread more recently still have a genetic signature of their introduction history (e.g. Botryllus schlossei [41,42], Microcosmus squamiger [92], Styela clava [45]).
Long-distance dispersal of introduced marine species across oceans probably occurs via major shipping routes while further spread at a local scale may take place through local traffic and recreational boating [13,34,42,91,93]. Our results indicate that many populations of S. plicata are well differentiated from others in terms of allele frequencies. This observation is in agreement with results obtained for other ascidians inhabiting harbours and marinas [37,41,44, but see 38 for an exception]. As expected when anthropogenic transport is the vector of dispersal, genetic differentiation among S. plicata populations was unrelated to geographic distance. Some distant populations (e.g. Hong Kong and Ferrol) were genetically similar, while closer populations such as Knysna and Port Elizabeth (South Africa) were significantly divergent. The stochasticity of main transport events through international ship traffic could determine the observed patterns among basins. However, our sampling design was inappropriate to assess the degree of connectivity among closely located populations (i.e. post-border dispersion, [34]). Thus, it still remains necessary to evaluate the role of small-scale processes in colonization dynamics, and to assess the importance of recreational boating in spreading introduced species. Low genetic diversity caused by a founder effect or a bottleneck is not always the benchmark for introductory events [28,94,95]. In fact, recurrent introductions typically lead to highly diverse populations, especially if they receive migrants from native populations that are genetically structured [26,30,44,96,97]. Here, we found that genetic diversity indexes varied according to the studied population, with overall values ranging from moderate to high for both markers. Some exceptions were these populations where only one or two mitochondrial haplotypes were present (i.e. Arenys de Mar, Tenerife, Manly).
Besides recurrent introductions through ship transport, population differentiation could also be due to selection. Here, we found uneven abundances for each major group obtained for COI (Group 1 and 2) and ANT (Group A and B). For COI, haplotypes from Group 1 were considerably more frequent and diverse than haplotypes from Group 2. It is possible that these groups stand for differential adaptive capabilities of the individuals to stressful environments. This adaptive capability does not need to be directly linked to our studied gene (nonsignificant McDonald-Kreitman test), but to other mitochondrial genes. Differential adaptation to environmental factors (e.g., temperature, salinity) of mitochondrial sequences within one species is not a rare phenomenon, and has been described in many species [98][99][100][101][102][103][104].
For the ANT gene, selection may be favouring heterozygotes that have an allele of each group (A and B). In fact, the excess of heterozygotes found in most populations is due to the number of individuals with an allele each of A and B. Accordingly, the number of individuals with both alleles from the same group (A or B) was lower than expected. Homozygotes for the basal Group A occurred ca. 5 times less than expected based on allele frequencies. Thus, it is possible that populations that originally  Table 1. Labels are placed at the centre of dispersion for each group, further delineated by inertia ellipses. Dots represent individuals. doi:10.1371/journal.pone.0025495.g004 had only one group of ANT sequences were seeded with arriving individuals featuring the other group. The mingling of both groups may have favoured the heterozygotes with an allele from each group, and if this combination had an adaptive value, enhanced the fitness of those individuals. As for the COI lineages, this new adaptive capability to the environment is not necessarily linked to the ANT gene itself. Admixture between lineages can foster the emergence of novel genetic combinations with different physiological attributes and invasive characteristics [30]. In contrast to our results, solitary ascidians inhabiting artificial structures usually have a general deficit of heterozygotes [38,44,105].
Early invasions should not be considered ''naturalized,'' rather, their impacts, potential for further spread, and degree of integration in local processes and interactions should be assessed. A throughout knowledge of introduced species is required to understand and interpret the present-day structure, function, and conservation of marine communities [7,35,36]. Our genetic study of an ancient wanderer has uncovered signatures of deep divergences and recent mixing, with a phylogeographic signal mostly blurred. Current evolutionary processes may include adaptive changes and low and stochastic connectivity among established populations. More studies on S. plicata's biological cycle, interactions with other marine species, and local-scale genetic structure are necessary to understand the biology, ecology and post-border dispersal of this species and prevent ecosystem alterations.