Phylogeography of the Rock Shell Thais clavigera (Mollusca): Evidence for Long-Distance Dispersal in the Northwestern Pacific

The present-day genetic structure of a species reflects both historical demography and patterns of contemporary gene flow among populations. To precisely understand how these factors shape current population structure of the northwestern (NW) Pacific marine gastropod, Thais clavigera, we determined the partial nucleotide sequences of the mitochondrial COI gene for 602 individuals sampled from 29 localities spanning almost the whole distribution of T. clavigera in the NW Pacific Ocean (~3,700 km). Results from population genetic and demographic analyses (AMOVA, ΦST-statistics, haplotype networks, Tajima’s D, Fu’s FS, mismatch distribution, and Bayesian skyline plots) revealed a lack of genealogical branches or geographical clusters, and a high level of genetic (haplotype) diversity within each of studied population. Nevertheless, low but significant genetic structuring was detected among some geographical populations separated by the Changjiang River, suggesting the presence of geographical barriers to larval dispersal around this region. Several lines of evidence including significant negative Tajima’s D and Fu’s FS statistics values, the unimodally shaped mismatch distribution, and Bayesian skyline plots suggest a population expansion at marine isotope stage 11 (MIS 11; 400 ka), the longest and warmest interglacial interval during the Pleistocene epoch. The lack of genetic structure among the great majority of the NW Pacific T. clavigera populations may be attributable to high gene flow by current-driven long-distance dispersal of prolonged planktonic larval phase of this species.


Introduction
Most marine invertebrates remain in a pelagic larval stage for several days to months, and they may be able to disperse and metamorphose into sedentary adults [15]. It is generally thought that the propagule duration of marine species is significantly correlated with dispersal capacity [16]. Therefore, marine species with longer planktonic larval stages are expected to exhibit relatively low levels of population structure as a result of their increased opportunity for gene flow [11]. However, some surveys have shown that high levels of gene flow among populations are not always coupled with the duration of the pelagic larval stage. Small, but significant genetic structure has been observed in some marine invertebrates with high dispersal capability, including limpets [17], pen shells [18], and lobsters [19].
The rock shell Thais clavigera is the most common gastropod species in intertidal rocky shores of East Asia including China, Korea, Japan, and Taiwan. The life span of this species is a minimum of 7 years [20] and it is capable of adapting to eurythermic and euryhaline environments [21]. Although there is no precise estimate of the pelagic larval duration for T. clavigera, larvae of its congeneric species T. haemastoma [22] and T. chocolata [23] are known to last more than 60 days. Morphologically, T. clavigera exhibits a wide range of variation in shell sculpture, such as shape, the absence/presence of blotches on the nodules of shell surfaces, and shell apertures [24,25] In a previous DNA barcoding analysis of Korean Thais species, a high degree of genetic divergence was discovered in some T. clavigera populations [26]. In addition, some genetic surveys of T. clavigera populations performed in certain NW Pacific localities have resulted in inconsistent conclusions [27][28][29]: Huang [29] revealed significant population structure along the China coast based on the partial sequence of mt COI; however, based on sequence variation in mt16S and COI gene fragments, Wang [27] concluded that there is no genetic differentiation between the ESC and SCS. Moreover, taxon sampling was limited to a few local areas in these studies (Taiwan [28], China Sea [27,29]), insufficient to understand genetic relationships among NW Pacific populations.  [9]. Populations are labelled with numbers that correspond with those shown in Table 1. In the present study, to better understand contemporary genetic structure and historical demography of the NW Pacific T. clavigera populations, we sequenced a partial fragment of the mt COI gene from a total of 602 individuals sampled at 29 localities across the NW Pacific Ocean, including both its upstream (Taiwan and China mainland) and downstream (Korea and Japan) ranges. Specifically, we tested whether potential glacial refuges, ocean current systems, or freshwater outflows of the Changjiang River significantly influenced the current population genetic structure of T. clavigera in the NW Pacific.

Sample Collection and Sequencing
A total of 602 T. clavigera specimens were sampled from 29 localities across the NW Pacific Ocean spanning a distance of approximately 3,700 km in East Asia from September 2008 to June 2014 (Fig 1; Table 1). At least 20 individuals were collected and genetically analysed from each locality except two Japanese populations (Iwate and Kanagawa), from which only 10 and Phylogeography of Thais clavigera in the Northwestern Pacific 15 samples were obtained, respectively. T. clavigera is not an endangered or protected species, and therefore all collections were made from public access area without specific permits. Total genomic DNA was extracted using the E.Z.N.A. Mollusc DNA Kit (Omega Bio-Tek Inc., Norcross, GA, USA) following the instructions supplied by the manufacturer. A universal primer set (LCO1490: 5 0 -GGTCAACAAATCATAAAGATATTGG-3 0 , HCO2198: 5 0 -TAAA CTTCAGGGTGACCAAAAAATCA-3 0 ) [30] was used to amplify the partial fragment of the mitochondrial COI gene. Polymerase chain reaction (PCR) was performed in a 50 μL reaction volume containing ten units of Ex-Taq Polymerase (Takara, Shiga, Japan), 2.5 mM dNTP mixture, 2.5 mM MgCl 2 and 20 pmole of each primer with the following amplification conditions: an initial 30 s denaturation at 94°C, 40 cycles of 10 s at 98°C, 30 s at 47°C, 30 s at 72°C and a final 10 min extension at 72°C. The sequencing reaction was performed using the BigDye Terminator Cycle Sequencing Kit (Applied Biosystems, Carlsbad, CA, USA) and all COI fragments were sequenced in two directions on an ABI 3730 XL (Applied Biosystems, USA) automatic sequencer.

Molecular Diversity
Forward and reverse sequences of the COI target fragment were edited, assembled, and merged into consensus sequences using the GENEIOUS software program [31]. Consensus sequences were aligned with CLUSTALX 1.81 using default settings [32]. Standard molecular diversity indices including haplotype diversity (h), nucleotide diversity (π), and the mean number of pairwise differences (k) were calculated using the ARLEQUIN 3.5 software package [33]. The number of haplotypes was determined by a Bayesian coalescent-based program implemented in DNASP V5 [34]. All haplotype sequences were deposited in GenBank with the accession numbers KP116312-KP116913.

Phylogenetic Analyses and Genealogical Network Construction
To infer the phylogenetic relationships among haplotypes, Bayesian inference (BI) and neighbour-joining (NJ) analyses were performed with two congeneric species (T. bronni and T. luteostoma) as outgroups using MRBAYES 3.1.2 [35] and MEGA 6.0 [36], respectively. The Markov-chain Monte Carlo (MCMC) search was run with four chains for 10 million generations with a sampling frequency of 1/1,000 trees. The best-fit model of nucleotide substitutions for Bayesian analyses was estimated using jMODELTEST v.0.1.1 [37], and selected based on the Akaike Information Criterion ( [38]. HKY+G+I was identified as the most appropriate model and used for subsequent analyses. A network showing the genetic relationships among haplotypes was constructed using a median-joining algorithm in Network 4.612 [39]. The maximum parsimony (MP) option was run on the output file to delete all superfluous median vectors and links [40].

Population Genetic Structure
A hierarchical analysis of molecular variance (AMOVA) was conducted using ARLEQUIN to assess the population genetic structure of T. clavigera in the NW Pacific. Three independent AMOVA analyses were carried out based on our a priori expectation for each of the focal factors (potential glacial refuges, Changjiang River, and ocean current systems). Our groupings (see Table 2 for details) were as follows (Table 1  These AMOVA analyses partitioned the total molecular variance among groups (F CT ), among populations within groups (F SC ), and among populations whatever the groups (F ST ) and tested if those F-statistic values were statistically significant using 10,000 random permutations. The HKY model, which fit the data best according to jMO-DELTEST, cannot be implemented in ARLEQUIN, so the Tamura-Nei model was selected to correct for multiple substitutions.
Pairwise genetic differentiation between all 29 populations was further assessed with F-statistics in ARLEQUIN 3.5 [33]. The significance of each pairwise comparison was estimated using 10,000 permutations, and corrections for multiple tests were made following a sequential Bonferroni procedure [41]. To estimate isolation by distance (IBD), a nonparametric Mantel test was performed online to evaluate the association between matrices of pairwise comparisons among sampling locations of shortest geographical distances (estimated in km using Google Earth version 4.3) and genetic distances (log-transformed) with 10,000 randomizations of the data using Isolation By Distance Web Service 3.23 [42].

Demographic Analyses
Historical demographic analysis was conducted using two different methods. Tajima's D [43] and Fu's Fs [44] statistics were calculated to test for neutrality using DNASP. The significance levels of Tajima's D and Fu's Fs were evaluated under 10,000 permutations. The parameter τ obtained from the mismatch distribution was used to estimate the time elapsed since the sudden population expansion using the equation τ = 2μt, where μ is the mutation rate of the marker (per locus per generation) and t is the number of generations (Rogers & Harpending, 1992). In order to convert parameters into quantitative estimates of time, we used a mutation rate of 7.9 × 10 −9 substitutions/site/year, the genus Nucella-specific average substitution rate for COI [45], because Nucella is ever known the most closely relatives of T. clavigera (belongs to the same family Muricidae) in which fossil-based molecular clock calibration is thus far available. The generation time was assumed to be 1 year. For comparison with the mismatch distribution analysis, a Bayesian Skyline Plot (BSP) analysis was executed to examine changes in population size across time in Beast v1.7 software [46]. Three independent MCMC samplings were performed to assure the consistency of the results. Chains were run for 100 million generations and sampled every 1,000 generations, with the first 10% of generations discarded as "burn-in" under the HKY+G+I model determined by jMODELTEST v.0.1.1 with a constant skyline model and Bayesian skyline tree priors. All operators were optimized automatically. In all runs, the effective sample size yielded by MCMC chains for the parameters of interest was greater than 300. Finally, the results were visualized with the TRACER 1.6 software program [47].

Genetic Diversity and Phylogenetic Relationships
A 658 bp sequence of the mt COI gene fragment was determined for 602 individuals of 29 T. clavigera populations sampled across the NW Pacific Ocean (Table 1; Fig 1). A total of 165 polymorphic sites were identified and 298 haplotypes were encountered. No deletion or insertion mutations were detected. Of these 165 variable sites, 44 were single-nucleotide polymorphic, and the remains were parsimony informative. The COI data in NW Pacific T. clavigera populations showed a high level of genetic diversity; haplotype diversity (h) for all studied populations was 0.982, ranging from 0.944 to 1.000. Nucleotide diversity (π) for all examined samples was 0.0081 and ranged from 0.0064 to 0.0102 for individual populations. The average number of nucleotide differences (k) between haplotypes was 5.34 ( Table 1).
The phylogenetic trees for T. clavigera mtDNA COI sequence data inferred using NJ and BI methods were both unresolved, i.e., all nodes received low bootstrap support (all nodes 50%), suggesting that the 298 haplotypes sampled in all study populations lacked phylogeographic structure (S1 Fig). For three different groupings based on either potential glacial refuges, the Changjiang River, or the present-day ocean current systems, the centrality of the haplotype networks was occupied by the two most dominant haplotypes; however, these accounted for less than 20% of all sampled individuals (Fig 2). Furthermore, regardless of how we divided these populations into groups, there were no significant genealogical branches or geographic clusters detected in the three haplotype networks. For potential refuges, the two most dominant haplotypes accounted for 16.8% (101/602) of the total individuals (Fig 2A). The most dominant haplotype (9.5%, 57/602) occurred in all populations except Chungcheongnam-do (CN), Pohang-si (PH), Zhoushan (ZS), and Haikou (HK). We detected the

Population Genetic Structure
The three-level hierarchical AMOVA analyses revealed small but statistically significant genetic structure among populations of T. clavigera, irrespective of whether populations are grouped with respect to potential refuges, Changjiang River, or ocean currents (Table 2). We found that 1.47-5.14% of molecular variation occurred among groups (F CT ; P < 0.01), whereas most variation was observed within populations (F ST ; 93.84-96.92% of the total variation, P < 0.001). Among the three factors analysed, however, the Changjiang River had the largest influence on genetic structure, indicating it acts as a physical barrier to larval dispersal; the proportion of variation among groups was approximately five times greater than the proportion of variation among populations within groups (F CT = 0.051, F SC = 0.011; Table 2).
Pairwise comparisons across sampling locations showed a near absence of population structure across a wide geographical range (3,700 km 29]) were small, but statistically highly significant (P < 0.05) ( Table 3). These results substantiate the existence of a barrier to gene flow across the Changjiang River; however, there was no evidence of IBD estimated as a correlation between the shortest geographic distance by sea and genetic distance by the Mantle test (r = 0.46, P = 1.00) (Fig 3).

Demographic History
We detected significant population expansion of T. clavigera based on multiple lines of evidence from neutrality tests and the mismatch distribution (Table 1). Tajima's D rejected neutrality (P < 0.05) for 23 out of 29 populations. Similarly, Fu's Fs statistic was significantly different from zero (P < 0.05) for all populations except for the Taiwan samples. Both tests of neutrality were significant (i.e., indicated significant deviations from neutrality) when samples were grouped as a single data set. The mismatch distribution was unimodal for all examined samples, supporting a model of sudden expansion (Fig 4). The expansion time parameter τ was estimated from mismatch distribution analysis to be 4.005. Assuming a substitution rate of 7.9 × 10 −9 for COI data, the time since expansion was estimated to be approximately 385 kyr. The population expansion was further validated by the results of BSP analysis, which revealed that population sizes began to expand approximately 400 ka (Fig 5). These estimates indicate a population expansion of T. clavigera during marine isotope stage (MIS) 11, the longest and warmest interglacial interval of the past 500 kyr.

Discussion
The genetic structure of a species reflects the effect of historical demography as well as contemporary gene flow among populations. The factors that contribute to the present-day genetic structure can be inferred by genetic analysis of molecular data. In this study, we utilized sequence variation of the partial mtDNA COI gene for 602 individuals sampled from 29 localities spanning an extremely long coastline of approximately 3,700 km. This sampling represents almost the whole distribution of T. clavigera in the NW Pacific Ocean. We found that genetic   Table 1 for detailed information on site abbreviation. variation among the NW Pacific populations was generally low, perhaps owing to a combination of high contemporary gene flow and recent common ancestry of haplotypes. Nevertheless, both AMOVA and pairwise F ST analyses indicated weak, but significant genetic structure across the Changjiang River (Tables 2 & 3), suggesting the presence of geographical barriers to continuous larval dispersal at this locality; however, it should be noted that no signal of IBD was detected, nor did the haplotype networks show distinct genealogical branches or geographic clusters (Figs 2 & 3). These results support the hypothesis that the populations of T. clavigera that were examined have a high level of gene flow throughout the NW Pacific Ocean.
Although there are some exceptions [e.g., the marine clam genus Lasaea [48]], marine invertebrate species with a long-lived planktotrophic larval stage are generally capable of longdistance dispersal, and their offspring are spread several hundred to thousand kilometres away from their origin by the prevailing surface flow of the ocean current [16,49,50]; hence, the  Phylogeography of Thais clavigera in the Northwestern Pacific long-distance dispersal associated with a prolonged pelagic larval stage and the present-day oceanic current have been regarded as the most influential factors contributing to continuous gene flow over a wide geographic scale in many marine invertebrates, including molluscan species [11][12][13]18]. A reproductive ontogenetic study has reported that T. clavigera undergoes indirect development with a planktonic veliger larval stage [20], and its pelagic larval stage lasts up to approximately 2 months. Additionally, T. clavigera is perennial with an average lifespan of 7 years. The species can reach sexual maturity during their second year, and accordingly have a reproductive lifespan of approximately 6 years [20]. It should also be noted that in the NW Pacific, there are two influential current systems in surface water circulation (Fig 1), namely, the KSC (Kuroshio Current) and the CSCC (China Sea Coastal Current), with the KSC flowing northward year-round [51,52], and the CCC (China Coastal Current) entering the ECS (East China Sea) from the SCS (South China Sea) in the summer [7]. These prevailing currents transport a great number of warm-water marine species from their tropical centre to the north and expand their ranges [53]. The long planktonic larval stage in T. clavigera may facilitates gene flow by current-driven dispersal of pelagic larvae and consequently decreases genetic structure among distant populations spanning over 3,700 km in the NW Pacific coastline. Aside from long-distance dispersal ability and ocean currents, we hypothesize that its utilization of a wide range of habitats is a key factor for successful colonization in a new environments. T. clavigera is abundant in the intertidal zone over a wide range of environmental conditions including different temperatures and salinities [21]; thus it ability to inhabit a wide range of eurythermic and euryhaline environments may also indicate its potential to colonize new environments.
In this study, the AMOVA analysis and pairwise F ST values (Tables 2 & 3) suggest that the Changjiang River poses a weak but significant barrier to gene flow among some T. clavigera populations, indicating that the larval pool is not well mixed geographically across this area despite the long planktonic larval stage. Nonetheless, it is evident that there is generally a low level of genetic structure among populations, and we did not detect geographic clusters in the haplotype networks, consistent with previous results in marine invertebrates with long-lived pelagic larvae [18,19]. In addition to potential glacial refuges in the NW Pacific region, the Changjiang River and ocean circulation systems are two potentially important geographical barriers shaping current population structure [5,14]. We observed genetic divergence (F CT = 0.051, P = 0.003) between the northern and southern populations of the Changjiang River in our AMOVA analysis ( Table 2). As documented earlier, freshwater outflows from the Phylogeography of Thais clavigera in the Northwestern Pacific Changjiang River may act as physical barriers that limit northward dispersal of planktotrophic larvae from southern populations [5]. Genetic subdivision may also be attributed to the habitat of T. clavigera near the mouth of the Changjiang River. T. clavigera is most commonly found in shadowy crevices in intertidal rocky shores. There are relatively well developed mudflat areas formed by the deposition of sediments near the mouth of the Changjiang River. These conditions provide a relatively insufficient rocky shore substratum, and are consequentially unsuitable for the settlement of T. clavigera larvae. When specimens were sampled near the northern mouth of the Changjiang River (e.g., Nantong), only a few T. clavigera individuals were found; by contrast, near the southern mouth, T. clavigera were very abundant in the rocky seashore of the Zhoushan archipelago. This habitat discontinuity may have reduced effective gene flow between the northern and southern populations of the Changjiang River.
It has been reported that oceanographic patterns play an important role in maintaining genetic and phenotypic differentiation in the acorn barnacle Tetraclita japonica in the NW Pacific [14]. Moreover, in southern Australia, the major ocean currents influence the phylogeography and population structure of the intertidal barnacle Catomerus polymerus [54]; however, in the present study of T. clavigera, we found a lack of genetic structure across major ocean current systems, we found very low, but statistically significant genetic structure between the two major ocean circulation systems (CSCC and KSC) by AMOVA (F CT = 0.015, P = 0.002), but F SC (i.e., structure among populations within groups) was equivalent and statistically significant (F SC = 0.018, P < 0.001). The lack of genetic structure between the two current systems may be attributable to high gene flow owing to the long spawning time and prolonged planktonic larval phase of T. clavigera. Our findings that the most common two haplotypes occur at almost every site support this high gene flow hypothesis. Some additional observations suggest that although no water mass from a sub-branch of the KSC reaches the SCS in the summer, in some years this does occur [55]. Moreover, in other seasons, a south-westward current from Kuroshio flowing into the SCS has been observed [55]. In southern China and Taiwan, T. clavigera spawning could occur from spring to summer (February to August) [20,28]; furthermore, its pelagic larval duration lasts up to approximately 2 months. Therefore, T. clavigera larvae are likely to enter and mix into the CSCC system from the Taiwan coastline, which may increase gene flow between populations in the two circulation systems to some extent, therefore resulting in a lack of genetic differentiation.
From haplotype network analysis for COI data, we found a complicated network pattern that suggested that T. clavigera populations underwent a demographic expansion (Fig 2). Also, the observed pattern of mtDNA variation in T. clavigera further supports the hypothesis of non-equilibrium historical processes such as population range expansion. We observed very high COI haplotype diversity due to an excess of singleton variants (76% of the 298 detected haplotypes) coupled with relatively low nucleotide diversity. The retention of a surplus of rare COI variants may indicate a recent population expansion of T. clavigera; otherwise, these rare variants are predicted to be eliminated by genetic drift [56]. This suggests that mutation-drift equilibrium has not yet been attained in T. clavigera in the NW Pacific [57], an interpretation consistent with the significantly negative neutrality test statistics, a clear unimodal mismatch distribution, and BSP analysis. Furthermore, both mismatch distribution and BSP revealed an MIS 11 population expansion approximately 400 ka (Figs 4 & 5). This stage is the longest and the warmest during the Pleistocene epoch [12] and has been described as a super-interglacial period because of its long duration of 25-30 kyr [58]. The sequence of land mollusc species fossils in the Chinese loess-soil shows that the summer monsoon was particularly strengthened during MIS 11, which is typical of warmer climates [59]. Paleontological and palaeoecological estimates of MIS 11 deposits from Japan, Hawaii, Bermuda, and the Bahamas suggest a global sea level had risen during this stage [60][61][62]. Such climatic conditions are necessary to allow warm-water species to reach the northern Pacific and expand their range.
The Pleistocene glacial age, and particularly the last glacial maximum (LGM) approximately 20,000 years ago, had an important influence on the evolution and genetic structure of marine organisms. Many species in various marine realms appeared to arise at the beginning of LGM [2,63]; however, population expansion of T. clavigera is assumed to have occurred pre-LGM. These results differ from the traditional view of demographic expansion that it occurs during the period of LGM. Nevertheless, population expansion that occurred pre-LGM has been reported in species such as the cold-water barnacle Chthamalus challengeri [13] and the marine snail Concholepas concholepas [12]. In the NW Pacific, Ni et al. [7] also estimated a period of 120-140 ka that corresponded with dramatic population expansion of various species, including molluscs [13,64], fishes [65,66], and crustaceans [67,68]. These earlier reports indicate the importance of pre-LGM events in determining the demography of marine populations and should be considered in the future. Nonetheless, it is still unclear why the glacial events did not significantly impact the current population structure and demographic history of these species.

Conclusions
To better understand contemporary genetic structure and historical demography of the NW Pacific T. clavigera populations, we determined the partial sequence of the mt COI gene from 602 individuals sampled from 29 localities across the NW Pacific Ocean. We observed a high level of genetic diversity within each of sampled populations, and no significant genealogical branches or geographic clusters, suggesting high levels of gene flow among populations throughout the NW Pacific. Nevertheless, we detected low, but significant genetic differentiation that corresponds to habitat conditions and freshwater discharge from the Changjiang River. Since we used only a single mtDNA marker, further studies of T. clavigera using multiple nuclear markers are required to validate the observed genetic structure. Also, population genetic studies of other marine species with a planktonic larval phase in the region will provide additional insight into the phylogeographic patterns of NW Pacific organisms.