Genetic Population Structure of the Coral Reef Sea Star Linckia laevigata in the Western Indian Ocean and Indo-West Pacific

The coral reef sea star Linckia laevigata is common on shallow water coral reefs of the Indo-West Pacific. Its large geographic distribution and comprehensive data from previous studies makes it suitable to examine genetic differentiation and connectivity over large geographical scales. Based on partial sequences of the mitochondrial cytochrome oxidase I (COI) gene this study investigates the genetic population structure and connectivity of L. laevigata in the Western Indian Ocean (WIO) and compares it to previous studies in the Indo-Malay-Philippines Archipelago (IMPA). A total of 138 samples were collected from nine locations in the WIO. AMOVA revealed a low but significant ΦST-value of 0.024 for the WIO populations. In the hierarchical AMOVA, the following grouping rejected the hypothesis of panmixia: (1) Kenya (Watamu, Mombasa, Diani) and Tanzanian Island populations (Misali and Jambiani) and (2) the rest of the WIO sites (mainland Tanzania and Madagascar; ΦCT = 0.03). The genetic population structure was stronger and more significant (ΦST = 0.13) in the comparative analysis of WIO and IMPA populations. Three clades were identified in the haplotype network. The strong genetic differentiation (ΦCT = 0.199, P < 0.001) suggests that Indo-West Pacific populations of L. laevigata can be grouped into four biogeographic regions: (1) WIO (2) Eastern Indian Ocean (3) IMPA and (4) Western Pacific. The findings of this study support the existence of a genetic break in the Indo-West Pacific consistent with the effect of lowered sea level during the Pleistocene, which limited gene flow between the Pacific and Indian Ocean.


Introduction
The tropical Indo-West Pacific region hosts the world's greatest diversity of marine shallow water species [1], with species richness decreasing longitudinal and latitudinal from the centre [2][3][4]. Besides harbouring the world's richest marine shallow water biodiversity, this region experienced complex geologic events that strongly influenced distribution patterns of marine taxa. In particular, the frequent global fluctuation of sea level by up to 120 m created a vicariant barrier, which has been principally hypothesised to have caused genetic partitioning between the Pacific and Indian Ocean in numerous taxa [5]. However, findings from molecular genetic studies on species spanning the Indo-West Pacific show discordant population structures, implying that processes influencing the genetic structure of taxa, and hence evolution in the region, are not uniform across species. For instance, results on anemonefish Amphiprion spp [6], lionfishes Pterois spp [7], butterflyfishes Chaetodon spp [8], and sea stars Linckia spp [9] indicate a phylogenetic break between the Indian and Pacific Ocean, supporting the notion of allopatric speciation in separated ocean basins. On the contrary, organisms such as the sea urchins Eucidaris, Diadema, and Tripneustes [10][11][12], the marine snails Echinolittorina reticulate [13] and Thyca crystallina [14] lack this phylogeographic break across the Indo-Malay-Philippines Archipelago (IMPA). The lack of differentiation is attributed to their ability to have sustained dispersal throughout the IMPA during the glacial maxima, swiftly re-established gene flow after glacial maxima, or loss of their divergent lineages [15].
Most marine shallow water species are sedentary in nature, depending on their planktonic larval stage for long distance dispersal. Usually, the newly spawned larvae drift with the ocean currents, travelling 1000s of kilometres. This aspect makes larval dispersal in marine organisms a crucial factor, influencing population dynamics, population persistence, and range expansion. Genetic markers provide a tool that can determine the extent of larval dispersal. Generally, successfully dispersed larvae will leave a genetic trail of their movements, providing a proxy of estimating realised gene flow or connectivity [16]. Species with longer pelagic larval duration (PLD) are expected to exhibit higher gene flow and connectivity than those possessing a shorter PLD [17]. For example, it has been shown that broadcasting acroporid corals with a long PLD have higher gene flow than brooding corals of the genera Stylophora and Seriatopora, which have a shorter PLD [18]. Although numerous studies support the simple correlation of realised dispersal to traits affecting dispersal, such as PLD, findings in widespread organisms, such as the mantis shrimp Haptosquilla pulchella and the tropical abalone Haliotis asinina contradict this pattern. Haptosquilla pulchella, with a long-distance dispersal capacity, shows strong genetic structure among Indo-West Pacific populations, while Haliotis asinina displays high levels of gene flow, despite having a restricted PLD [19,17]. The existing discrepancy between gene flow and PLD proves that gene flow or connectivity among marine populations is not only affected by larval behaviour, but also by ocean currents [20], topographic features, local adaptation [17], geographical distance [21], and interactions with other species [19].
The blue sea star Linckia laevigata is a common benthic organism on Indo-West Pacific coral reefs, with a wide distribution from the Western Indian Ocean (WIO) to southeastern Polynesia [22][23]. It abundantly occurs on the shallow reef flat or coral patches in the lagoons of fringing reefs [23]. Linckia laevigata, such as other asteroids, is a sedentary organism depending on its planktonic larval stage for long distance dispersal. Due to its wide distribution and comprehensive data from previous studies, it is a suitable species to examine and understand levels of connectivity in coral reef ecosystems in the Indo-West Pacific [24]. It reproduces sexually, with an external fertilisation that involves the fusion of gametes freely in the water column. The peak breeding period occurs during the summer months, such as May to August in the case of the population in Guam [23] and may vary regionally. Metamorphosis takes at least 22 days in the larvae of L. laevigata [25], suggesting that its PLD may be longer than 3 weeks.
The WIO is a constituent of the tropical Indo-West Pacific, representing an important biogeographic region of tropical seas [26]. Despite being amongst the most biologically diverse tropical ecosystems, the genetic structure of WIO coral reef species remains one of the least studied globally [27]. For that reason, most coral reef species in the WIO are still managed without any consideration that some populations might be restricted in larval exchange. The WIO has a complex current system that can facilitate long-distance larval dispersal and can form barriers. The South Equatorial Current (SEC), which flows westward across the Indian Ocean, can facilitate long distance dispersal. However, the SEC bifurcates at the East African coast, forming the northward East African Coast Current (EACC) and the southward Mozambique Current, as well as the Mozambique Channel Eddies. This potentially forms an oceanographic barrier for dispersal between the northern and southern East African coast. The EACC joins the Somali Current in the North of the East African coast, forming the seasonal South Equatorial Counter Current (SECC) (Fig 1).
Although several studies have been performed on Linckia laevigata across the Indo-Pacific [9,21,24,33], none has studied gene flow within the WIO. Therefore, to complement previous work on connectivity in Linckia laevigata, the present study investigates the genetic population structure, genetic diversity and historical demography of L. laevigata at nine sites in the WIO. Due to the long PLD of Linckia laevigata broad-scale genetic homogeneity in the WIO would be expected. Additionally, this study aims to assess connectivity of populations in the WIO to the IMPA. The comparative genetic analysis of L. laevigata populations from the WIO and IMPA complements previous research in testing the vicariance hypothesis [33], but with extensive sampling that includes multiple locations in the WIO and IMPA (Fig 1). This extensive sampling facilitates detailed, fine-scale analysis of the genetic differentiation in Linckia laevigata across the Indo-West Pacific. The sequence data of populations from the IMPA were obtained from two previous studies [14,34].

Study sites and sampling
Tissue samples of L. laevigata were collected between February 2011 and February 2012 at nine sites located in the WIO (Fig 1C and Table 1). In total, 138 tissue samples were collected while SCUBA diving by cutting off a small piece of tissue (~1cm) from adult L. laevigata specimens without killing the animals. The tissue samples were preserved in absolute ethanol and stored at 4°C prior to DNA extraction. The analysis was restricted to blue colour morph individuals, because there is a congruent pattern between genetic and colour variation [33].

DNA extraction, amplification, and sequencing
Extraction of genomic DNA was done with the Nucleospin extraction kit, following the manufacturer's protocol. The amplification of the partial COI gene was through polymerase chain reaction (PCR), using the primers described by [35]: LC01490 (5'-GGT CAA CAA ATC ATA The following temperature profile was used to conduct PCR: 94°C for 5 min, followed by 35 cycles of 1 min at 94°C, 1.5 min at 45°C and 1 min at 72°C. Final extension was conducted at 72°C for 5 min [14]. Sequencing was done using the DyeDeoxy terminator chemistry (PE Biosystem) and an automated sequencer (ABI PRISM 310 and 3100, Applied Biosystems).
The 138 sequences analysed for this study are available in GenBank under accession numbers KF527577-KF527714. In total, 396 sequences from the IMPA were utilised for the comparative analysis ( Table 2).

Genetic diversity
The software ChromasPro (version 1.5; Technelysium) was used for editing the WIO sequences. The correct species identity of each sequence was verified using the Basic Local Alignment Search Tool (BLAST) at GenBank. To ensure that only functional mitochondrial DNA was used and not nuclear pseudogenes, the sequences were translated into amino acids with the software Squint Alignment Editor (version 1.02), in order to verify that the DNA sequences code for a functional protein. The online service of FaBox (version 1.4) was used to collapse the sequences into haplotypes. A multiple sequence alignment was obtained with CLUSTAL W [36] as implemented in the software BIOEDIT (version 7.0.4.1) [37]. Haplotype diversity (h) [38] and nucleotide diversity (π) [39] were calculated with the program Arlequin (http://cmpg.unibe.ch/software/arlequin35, version 3.5.1.3) [40].

Historical demography
Tajima's D-test [41] and Fu'Fs-test [42] were used to test the null hypothesis of neutral evolution of the marker in the WIO sequences. Negative Tajima  or population expansion after a recent bottleneck event [41]. To analyse the historical demography, the mismatch distribution [43] of the sums of squares deviation (SSD) [44] and Harpending's raggedness index (HRI) were used, which allowed for testing of the model of sudden population expansion [45]. The mismatch distribution is multimodal in populations under a demographic equilibrium and unimodal if a recent and fast demographic expansion of the population has taken place. Tests in the software Arlequin were carried out with 10,000 permutations.

Genetic population structure and connectivity
The significance of population structure in L. Laevigata was tested using analysis of molecular variance (AMOVA) [46] and pairwise F ST -values. The software Arlequin (version 3.5.1.3) [40] was used to carry out both statistical calculations, applying the standard AMOVA computations, while a matrix of pairwise distances was computed with 10,000 permutations. A network of haplotypes was calculated with the programme TCS (version 1.21) [47]. Clades were defined by at least three mutational steps between each other and a star-like pattern of abundant haplotypes with connections from several singletons. Single or groups of haplotypes that were also separated by at least three or more mutational steps, but lack a star-like pattern, were not defined as clades (Fig 1D). The correlation between geographical and genetic distances among L. laevigata population was tested using a Mantel test [48], as implemented in the Isolation-by-Distance Web service IBDWS [49]. Since estimates of genetic differentiation (F ST ) may be affected by unequal sample sizes, two datasets were analysed: (1) dataset excluding all sample sites with less than 10 individuals and (2) dataset with all sample sites. However, the genetic differentiation (F ST ) based on the two datasets was very similar and significant. Therefore, we present the results based on all samples sites, while the results based on the reduced dataset is provided as supplementary information (S3 Table).

Genetic diversity
Partial sequences of the mitochondrial COI gene were obtained from 138 individuals from nine sample sites in the WIO. A sequence alignment of 467 bp was obtained without indels and translated without any stop codons into an amino acid sequence. The 138 sequences yielded 61 haplotypes, of which 53 were unique to the WIO, showing 156 polymorphic sites (33%) and 184 substitutions. High values of haplotype (h) and nucleotide (π) diversity were recorded, with the average haplotype diversity and nucleotide diversity being 0.865 (0.56-1.0) and 1.9% (0.7-5.4%) respectively (Table 3). From the nine sites, two (Diani and Misali) had a haplotype diversity of 1, i.e. the sampled specimens did not share any haplotype. This could be attributed to the low sample size.

Historical demography
On the one hand, the null hypothesis of neutral evolution of the COI marker was only rejected for Mombasa and Jambiani based on Tajima's D test, while the results of Fu's Fs test could not reject the null hypothesis for all the sites. On the other hand, the mismatch distribution analysis and Rogers' test of sudden population expansion indicate population expansion for all sites, except Tuléar (Table 3). Genetic population structure and connectivity Western Indian Ocean. The genetic population structure of L. laevigata in the WIO region was determined with samples collected from three sites in Kenya, four sites in Tanzania and two sites in Madagascar. AMOVA results revealed a low fixation index, but significant genetic population structure (F ST -value = 0.024, P = 0.045). After Bonferroni correction for the 36 pairwise F ST -values, the significance level was adjusted from 0.05 to 0.0014. Correspondingly, the estimated pairwise F ST -values were low and insignificant for all sites, with the exception of one site in Kenya (Diani), which was significantly differentiated from Angel reef, Tanzania (S1 Table). Three hierarchical AMOVAs were carried out based on ocean currents and geographical location (S2 Table), but only one grouping rejected the hypothesis of panmixia: (1) Kenya (Watamu, Mombasa, Diani) and Tanzanian island populations (Misali and Jambiani) and (2) the rest of the WIO sites (mainland Tanzania and Madagascar; F CT = 0.03, P = 0.047). Nevertheless, a plot of pairwise F ST -values against geographic distance among sample sites did not exhibit a positive correlation, confirming the lack of isolation-by-distance.
Indo-West Pacific. Since the sequence obtained from WIO sites were shorter than the sequences reported by the two previous studies [14,34] in the IMPA, the combined 534 sequences alignment from 36 sites was reduced to 441 bp, yielding 194 haplotypes. However, the shortening of sequences might have reduced the number of haplotypes. Evolutionary relationships of 194 L. laevigata haplotypes found in the WIO and the IMPA are presented in the haplotype network (Fig 1D), showing three different clades separated by at least three mutational steps. The highest number of shared haplotypes was seen in clade 2, which had also the most common haplotypes. All clades were characterised by a star-like structure with many singletons. The distribution of different clades across the WIO and IMPA is presented in Fig 1B  and 1C

Discussion
In marine species, values less than 0.5 for both h and π (%) indicates low genetic diversity and little genetic divergence [50]. The estimated haplotype (h) and nucleotide diversities (π) in L. laevigata were higher in comparison with those shown in previous studies utilising COI in other invertebrates in the WIO, such as the giant tiger prawn Penaeus monodon [51], fiddler crab Uca annulipes [52], mangrove crab Perisesarma guttatum [53], but similar to the mangrove crab Neosarmatium meinerti [54]. The high haplotype and nucleotide diversity values might also confirm a recent population expansion that occurred with a pool of haplotypes that diversified prior to the population expansion [51]. Indeed, the results of Tajima's D, Fu's FS, mismatch distribution analysis, and Rogers' tests were all consistent in indicating a sudden population expansion subsequent to a bottleneck [44][45]. This change in population size could be attributed to the reduction of habitats during the sea level low stands and recolonisation of new habitats after rising of the sea level. The sea level dropped up to 120 m in the Pleistocene, which was inevitably accompanied by loss of shelf habitats and strong fragmentation, leading to the isolation of coral reef refugia. In particular, habitat reduction was intense in areas outside the IMPA, which maintained extensive coral refugia during periods of low sea level stands [55][56]. During this time, the population of L. laevigata may have undergone dramatic decline, owing to highly reduced coral reef area. The rising temperature after the last glacial maximum was associated with sea level rise that enabled re-colonisation of shelf areas, resulting in a demographic and spatial population expansion of L. laevigata in the WIO and elsewhere. Population expansion after a population bottleneck has also been observed in the IMPA populations of L. laevigata [14,34,57] and in other species in the WIO, such as the anemonefish Amphiprion akallopisos [58], mangrove crab Neosarmatium meinerti [53], fiddler crab Uca annulipes [51], mud crab Scylla serrata [59], and giant tiger prawn Penaeus monodon [50].
Previous studies have demonstrated that genetic divergence among marine populations can occur even in the absence of any apparent barriers to dispersal [60][61]. This is also true for the WIO, where larval dispesal is greatly influenced by complex hydrographic features. Despite the long PLD of L. laevigata that provides a mechanism for long distance dispersal, AMOVA results for the population from the WIO showed a weak, but significant population structure (F ST = 0.024, P = 0.045). This result is similar to the weak genetic structure in L. laevigata reported for populations in the IMPA, with a low fixation index (F ST = 0.03) but significant structure [14,34]. Shallow genetic structuring have also been observed in other marine species, such as the mud crab Scylla serata and the mangrove crab Neosarmatium meinerti along the East African coast, despite their long PLD of 3-4 weeks [54,59]. However, the isolation-by-distance analysis was not significant in the present study, indicating that the genetic structure is not attributed to distance-restricted dispersal. Thus, reduced gene flow in WIO L. laevigata populations might be caused by local ocean currents, life history traits, topographic features or success of immigrants in mating after settlement [62]. Given that the WIO experiences complex current patterns, it is more likely that restricted gene flow in L. laevigata is caused by local oceanographic features, such as local downwelling or eddies [32]. A more pronounced, but still weak significant structure in the WIO populations was observed in the hierarchical analysis, with the following grouping: (1) Kenya (Watamu, Mombasa, and Diani) and Tanzanian Island populations (Misali and Jambiani) and (2) the rest of the WIO sites (mainland Tanzania and Madagascar). An almost identical pattern was observed in the coral Acropora tenuis, separating Kenyan and northern Tanzanian sites from those in the South [63]. A plausible explanation for the observed pattern of discontinuity could be due to the strong local downwelling event that occurs along the southern Kenyan and Tanzanian coast throughout the year [64][65]. This phenomenon can influence dispersal continuity, especially when planktonic larvae are transported onshore, promoting retention. Nonetheless, the population structure revealed by hierarchical AMOVA analyses was weak, suggesting that substantial gene flow occurs due to their long PLD, even in the presence of local downwelling.
Numerous studies have shown a sharp genetic break between Indian and Pacific Ocean L. laevigata populations due to sea level fluctuations in the Pleistocene [9,14,21,33,34]. However, these previous studies could not provide a full phylogeographic context of L. laevigata, because their sampling efforts were concentrated in the IMPA. The present study complements these previous investigations by combining the sequence data from the WIO and IMPA. The AMOVA for the combined data sets revealed a strong genetic structure (F ST = 0.13, P < 0.001), comparable with other studies on L. laevigata in the Indo-West Pacific [21,33]. Moreover, the strong genetic differentiation in the Indo-West Pacific has also been shown in other species, such as the giant tiger prawn Penaeus monodon [50,66], the crown-of-thorn sea star Acanthaster planci [67], and the holothurians Holothuria atra and Holothuria nobilis [68][69]. The genetic differentiation between the Indian and Pacific Ocean populations in L. laevigata is consistent with limited gene flow between the two ocean basins for tens of thousands of years [33,67]. The Torres Strait between Australia and New Guinea was closed throughout the Pleistocene, while the Indonesian throughflow might have been restricted [70]. It is unlikely that dispersal in L. laevigata could have happened either in the past or present along the southeastern coast of Australia because the larvae of this tropical species would not survive the low temperate temperatures [33]. Furthermore, upwelling of cold water at the base of the Indonesian arc blocked the movement of tropical marine larvae through the few opened narrow channels in the eastern Indonesian islands, effectively closing any dispersal routes between the Pacific and Indian Ocean [33]. Conversely, a number of species demonstrate a lack of this apparent phylogeographic break. These include the marine snail Echinolittorina reticulata [13], the soldierfish Myripristis berndi [71], the swordfish Xiphius gladius, and the mud crab Scylla serrata [72]. This lack of an apparent phylogeographic break is attributed to the re-establishment of dispersal rapidly after glacial maxima. Hierarchical analysis displayed a stronger genetic break with the following groupings: (1) WIO, (2) EIO, (3) IMPA and (4) WP, as well as (1) Indian Ocean (WIO and EIO), (2) IMPA and (3) WP. The first hierarchical grouping shows that the WIO populations are genetically distinct from the EIO populations, suggesting a WIO-EIO divide. This break within the Indian Ocean was also observed in the giant clams Tridacna maxima and Tridacna squamosa [73], the anemonefish Amphiprion akallopisos [58], the crown-of-thorns sea star Acanthaster planci [74], and the black tiger prawn Penaeus monodon [50,66]. The limited genetic exchange between the WIO and EIO supports the notion that fewer islands exist in the Indian Ocean compared to the West Pacific in order to facilitate dispersal through "island hopping". Moreover, the enclosure of the Andaman Sea by land following sea level regression during the Pleistocene isolated the sample site in Thailand (AS) partially from the Indian Ocean and completely from the Pacific Ocean.
The genetic break between the Indian and Pacific Ocean is also corroborated by the geographical distribution of the different clades (Fig 1). On the one hand, clade 1 predominates in the WP and is completely missing in the EIO and WIO. On the other hand, clade 2 is found in almost all sites, with the exception of Sebakor/Sanggal/Papisol (SSP) and New Britain (NB), New Guinea. Clade 2 occurs in the WIO and was the only clade sampled in the two EIO sites (Andaman Sea (AS) and Kupang (Ku)). It also dominated the sites in the central IMPA, an indication that it is likely the ancestral Indian Ocean clade. Clade 1 most probably evolved in the Pacific Ocean and its westward dispersal into the central IMPA by the ITF is reduced by the Mindanao and Halmahera Eddies. Mixing of clades 1 and 2 occurs in the central IMPA, despite the genetic barrier that is observed in many species between the Indian and Pacific Ocean. Clade 3 has evolved and accumulated in the WIO and its restriction to this region also indicates limited gene flow between populations in the WIO and their counterparts in the EIO, IMPA, and WP. Clades 2 and 3 were found at all sample sites in the WIO, which could be attributed to the bifurcation of the South Equatorial Current, forming the north-flowing East African Coast Current and the south-flowing Mozambique Current. These currents might be responsible for the wide distribution of clades 2 and 3 throughout the WIO.
Supporting Information S1