Novel Polymorphic Microsatellite Markers Reveal Genetic Differentiation between Two Sympatric Types of Galaxea fascicularis

The reef-building, scleractinian coral, Galaxea fascicularis, is classified into soft and hard types, based on nematocyst morphology. This character is correlated with the length of the mitochondrial non-coding region (mt-Long: soft colony type, and nematocysts with wide capsules and long shafts; mt-Short: hard colony type, and nematocysts with thin capsules and short shafts). We isolated and characterized novel polymorphic microsatellite markers for G. fascicularis using next-generation sequencing. Based upon the mitochondrial non-coding region, 53 of the 97 colonies collected were mt-Long (mt-L) and 44 were mt-Short (mt-S). Among the 53 mt-L colonies, 27 loci were identified as amplifiable, polymorphic microsatellite loci, devoid of somatic mutations and free of scoring errors. Eleven of those 27 loci were also amplifiable and polymorphic in the 44 mt-S colonies; these 11 are cross-type microsatellite loci. The other 16 loci were considered useful only for mt-L colonies. These 27 loci identified 10 multilocus lineages (MLLs) among the 53 mt-L colonies (N MLL/N = 0.189), and the 11 cross-type loci identified 7 MLLs in 44 mt-S colonies (N MLL/N = 0.159). Significant genetic differentiation between the two types was detected based on the genetic differentiation index (F ST = 0.080, P = 0.001). Bayesian clustering also indicated that these two types are genetically isolated. While nuclear microsatellite genotypes also showed genetic differentiation between mitochondrial types, the mechanism of divergence is not yet clear. These markers will be useful to estimate genetic diversity, differentiation, and connectivity among populations, and to understand evolutionary processes, including divergence of types in G. fascicularis.


Introduction
Coral reef biodiversity estimates require unambiguous definitions of reef-building coral species because these constitute the structure of the reefs and they are increasingly endangered by both global and local disturbances. Polymorphic nuclear microsatellite loci provide invaluable information about complex sympatric species boundaries and about genetic differentiation and isolation. Next-generation sequencing is useful for surveying numerous genomic loci, including microsatellite markers. Shinzato et al. [1] sequenced the entire genome of the coral, Acropora digitifera, but genomic data for most other reef-building corals is presently lacking.
Reef-building, scleractinian corals of the genus Galaxea are distributed throughout the Indo-western Pacific. Galaxea comprises seven species, four of which (Galaxea astreata, G. fascicularis, G. longisepta, and G. paucisepta) occur around Okinawa Island [2]. Galaxea fascicularis (Linnaeus, 1767) cannot be mistaken for any other species. It is widely distributed [2], and is one of the dominant coral species in reefs around Okinawa, Japan. It is a gonochoric, broadcast spawning species; female colonies produce egg bundles, and male colonies form bundles consisting of sperm and infertile pseudo-eggs [3][4][5]. In Okinawa these colonies are classified into soft and hard types based on nematocyst (microbasic p-mastigophore) morphology [6]. These features are also correlated with the length of the non-coding region between the mitochondrial genes, cytochrome b (cyt b) and NADH dehydrogenase subunit 2 (nad 2) (mt-Long: soft colony type and nematocysts with wide capsules and long shafts; mt-Short: hard colony type and nematocysts with thin capsules and short shafts) [7,8]; therefore, these two types represent novel clades or species. Interestingly, gamete fertilization experiments revealed that these two types show partial reproductive isolation [9]. However, the reproductive barriers that separate these sympatric types and that cause their genetic differentiation are not obvious. Galaxea fascicularis has several distinctive color morphs (B: entire polyp is pale brown, BG: brown with greenish oral disks, Gs: brown, but with tentacles on the major septa that are light green, Gt: brown, but with lateral tentacles that are light green, Wt: brown with pale green (almost white) fluorescent lateral tentacles, and Ft: brown polyps with green fluorescent lateral tentacles) [8,10]. However, these color morphs are entirely unrelated to mitochondrial type [8,11].
For G. fascicularis, Suzuki et al. (unpublished data) developed five microsatellite markers (GenBank accession number: AB272099-AB272103), and Chen et al. [12] attempted to characterize the markers using colonies from two Chinese populations. However, two of these loci showed significant deviation from Hardy-Weinberg equilibrium (HWE) in both populations. Abe et al. suggested that in a third microsatellite locus (locus MS-1), the genotypic pattern is related to the length of the non-coding region [8], and that hybrid larvae show allelic features of both types [9]. However, locus MS-1 (AB272101) was identified as a zooxanthellan locus by Chen et al. [12]. Therefore, genetic conclusions drawn from it were misleading.
Acquisition of additional microsatellite and other genetic markers is essential for definitive population genetics studies in order to examine genetic differentiation, hybridization, and boundaries between types within species. We isolated and characterized novel polymorphic microsatellite markers for G. fascicularis using next-generation sequencing. We characterized one population based on the ratio of mitochondrial types and investigated clonality of each mitochondrial type and genetic differentiation between mitochondrial types within that population.

Ethical statement
All samples for this study were collected in strict accordance with good animal practice as defined by the relevant national and/or local animal welfare bodies, and for sample collection was approved by the Okinawa Prefectural Governor, Hirokazu Nakaima (No. .

Isolation of microsatellite and primer design
We constructed a genomic DNA library using sperm of a Galaxea fascicularis mt-L colony (mitochondrial type was determined following sequencing). Genomic DNA was isolated using proteinase K and phenol-chloroform extraction and purification with ethanol precipitation. Extracted DNA was sequenced using 250 bp paired end reads on a MiSeq sequencer (Illumina) according to manufacturer's instructions. Sequencing was performed with samples of other organisms using different index adaptors in a single MiSeq run. We performed subsequent bioinformatics analyses using default software parameters unless otherwise indicated. We did not use a quality filter for raw sequence data. Read pairs of at least 100 bp in both sequence directions were retained using SolexaQA [13]. Duplicated sequences were removed with filterPCRdupl ver. 1.01 and sequencing adapters were trimmed with fastq-mcf in ea-utils ver. 1.1.2-537 [14]. Sequences of each read pairs were merged. Then assembled sequences longer than 100 bp were selected. Detection of simple sequence repeats and PCR primer design in each assembled sequence were performed with PAL_FINDER ver. 0.02.04 [15]. In order to select microsatellite loci that might be highly variable, we selected primer pairs amplifying longer repeat stretches (thresholds: 3 mer repeats are 10 or more and 4 mer are eight or more, respectively).

Genotyping by PCR and fragment analysis
To characterize microsatellite loci, we screened 97 colonies of G. fascicularis collected randomly at Zampa, Okinawa Island (26°26 0 20@N/127°42 0 40@E) (Fig 1). Color morphs were ignored in this study, because they are not correlated with mitochondrial type [8,11]. Samples were preserved in ethanol and brought to the laboratory, where genomic DNA was extracted using a DNeasy Blood & Tissue kit (Qiagen). To analyze polymorphism and amplification of designed primer sets and to identify mitochondrial types, we used the tailed primer method to perform PCR. The reaction mixture (5 μL) contained template DNA (< 100 ng), AmpliTaq Gold 360 Master Mix (Qiagen), and three primers for each locus: a non-tailed forward primer (0.2 μM), a reverse primer with a U19 sequence tail (0.2 μM), and a U19 (5'-GGTTTTCCCAG TCACGACG-3') primer (0.5 μM) fluorescently labeled with FAM, VIC, NED, or PET, based on the method by Schuelke [16]. Furthermore, 188-1 (5'-GAATAGGGTATACTAGCAGG TC -3', see [7]), 188-R3-U19 (5'-GGTTTTCCCAGTCACGACGCATCATTATCCTCTTCA AGG -3'), and U19 primer fluorescently labeled with FAM were used for identification of mitochondrial types (mt-L or mt-S) based upon the non-coding region between cyt b and nad 2. Amplifications were carried out under the following conditions: 95°C for 9 min; followed by 35 cycles at 95°C for 30 s, 54°C for 30 s, and 72°C for 1 min; and a final extension at 72°C for 5 min. Amplified PCR products with an added internal size standard (GeneScan 600 LIZ; Applied Biosystems) were analyzed using an automated, capillary-based DNA sequencer, ABI 3130xl Genetic Analyzer (Applied Biosystems), and GeneMapper ver. 3.7 (Applied Biosystems). Loci showing non-amplification or multiple non-specific peaks were excluded from further analysis. Furthermore, loci with little polymorphism or few heterozygotes (e.g., almost all colonies were homozygous) were also excluded.

Statistical analysis
Multilocus lineages (MLLs) were employed in order to avoid genotyping samples incorrectly (see [17]). In this study, if only one locus differed in genotype, while all other successful loci coincided, that colony was considered to be clonal, derived asexually by fragmentation. Either a somatic mutation or a scoring error was assumed to be responsible for the variable locus. When we found a somatic mutation or a scoring error while genotyping mt-L colonies, the locus was deleted from the list of retained loci. If similar problems were encountered in an mt-S locus after characterization in mt-L, that allele was removed altogether from the analysis (input as 0 to GenAlEx allele data).
For successful microsatellite loci, the number of MLLs was counted in order to estimate clonality using GenAlEx ver. 6.5 [18]. Clonality was estimated with the following index: N MLL / N (N MLL : the number of MLLs, N: the number of colonies). After removal of clonal replicates from the data set, the number of alleles (N A ), values of expected and observed heterozygosity (H E and H O , respectively), and deviation index (F IS ) from HWE for each type were also evaluated with GenAlEx. Using all retained loci, linkage disequilibrium was estimated for mt-L and mt-S types using Genepop ver. 4.2 (available at http://genepop.curtin.edu.au/) [19,20] with default conditions. The significance level was adjusted using a false discovery rate (FDR) correction at the 95% confidence level [21].
We calculated the F ST value as a genetic differentiation index between types using GenAlEx. Population structure between the two types based on Bayesian clustering was inferred using STRUCTURE ver. 2.3.4 [22]. A burn-in period of 100,000 followed by 1,000,000 Markov chain Monte Carlo (MCMC) replications was used for population clustering without prior information under the admixture model and assuming independent allele frequencies between clusters [23]. In the admixture model, individuals were assumed to be drawn purely from genetic clusters (the number of assumed genetic clusters is shown as K) and were allowed to have mixed ancestry [22,23]. The simulations included 10 iterations, and the number of assumed genetic clusters was one to six. Mean alpha value and range were confirmed for each K to ensure convergence of MCMC replications, following instructions for STRUCTURE. After calculation of the mean value of log probability, Ln P(D), the model choice criterion to detect the most probable value for each K [22], detection of the number of K clusters that best fit the data was determined using the method of Evanno et al. [24], as implemented in STRUCTURE HAR-VESTER [25]. ΔK is an ad hoc quantity for predicting the number of possible clusters [24]. Run data were merged with CLUMPP ver. 1.1.2b [26]. Furthermore, discriminant analysis of principal components (DAPC) was conducted in R ver. 3.0.2 [27] using the package adegenet ver. 1.3-9.2 [28], to represent genetic patterns for each MLL. This clustering method does not assume a particular model, so it is free of assumptions about HWE and linkage equilibrium [28]. To avoid unstable output, we set one-third of the MLLs as the number of principal components to retain (PCs: 5; one-third of 17 MLLs, see Results and Discussion).

Determination of mitochondrial type
From the non-coding region between cyt b and nad 2, 53 colonies from Zampa were identified as mt-L while 44 were categorized as mt-S. No ambiguous or unexpected mitochondrial types were detected in this study. The finding that the number of mt-L colonies exceeded the number of mt-S colonies mirrors the results of Abe et al. [9] (mt-L: 22, mt-S: 13), but not those of Watanabe et al. [7] (mt-L: 1, mt-S: 20). The ratio of types differs in every population [7]. These two types are clearly sympatric at Zampa; however, further study is needed to clarify their distributions. Though this species also inhabits the temperate zone near mainland Japan [2,29], it is possible that other types are also present there.

Isolation of microsatellite markers
We obtained 1,730,565,720 bp (3,507,101 read pairs) of raw sequence data and each read pair was assembled. We used 3,368,983 assembled sequences longer than 100 bp (1,155,663,709 bp, average 343 bp) for SSR identification using PAL_FINDER. These SSRs were used for microsatellite detection. One hundred twenty primer pairs were selected from which to design microsatellite primer pairs (3 mer repeats: 60 loci, 4 mer: 60). With these 120 pairs, 27 loci were successfully amplified from 53 colonies of the mt-L type, without somatic mutations or scoring errors, using the 53 mt-L colonies. Eleven of the 27 loci were identified as cross-amplifiable, polymorphic microsatellite markers (Table 1); somatic mutations or scoring errors were found in mt-S colonies for five loci. Sixteen of the 27 loci were characterized as useful only for mt-L colonies ( Table 2). Four loci, Gfas3_011 and Gfas3_020 (effective for both mt-L and mt-S) and Gfas3_017 and Gfas_037 (effective for only mt-L), possess imperfect repeats within the repeat. These loci may not have followed a simple, stepwise mutation process, although they are likely to fit the infinite allele model [30]. This shows that there is little difference in the degree of clonality between types. In some coral species, clonality depends on location [31,32]. High storm frequencies tend to cause higher levels of clonality in populations of Pocillopora verrucosa [33] due to fragmentation. Thus, clonality appears to be less influenced by type than by environmental or geographic factors. Further collection of samples at additional locations will help to confirm this observation, and spatial correlations based on locality data will be needed to estimate kinship among colonies.

Clonality by multilocus lineage for each type
Clonal reproduction likely enhances the probability of population persistence when environmental conditions are unsuitable for sexual reproduction, or when low population densities supply fewer mating opportunities (i.e. Allee effects) [34,35]. If clonal colonies are dominant in a population, the success rate of fertilization appears to decrease considerably. In most cases, under experimental conditions, self-fertilization by sexual reproduction rarely occurs [36], and gametes derived from colonies with high kinship, tend to fail to fertilize in some hermaphroditic corals, such as some Acropora species [37]. However, the hermaphroditic coral genus, Goniastrea, especially Goniastrea favulus, shows a very high rate of self-fertilization compared to Acropora and Montipora [38,39]. While it is not clear whether higher kinship affects fertilization in gonochoric species, in Galaxea fascicularis, higher kinship may prevent sexual reproduction. Furthermore, populations with decreased clonal diversity are more vulnerable to pathogens, which may be increasing due to increases in sea-surface temperature [32,40]. Nevertheless, local adaptation may also decrease mortality due to pathogens, with persistence of the most fit genotypes [41]. Thus adaptation may decrease the variety of genotypes due to local selection; however, it may prevent local extinction as long as catastrophic environmental changes do not occur.

Characterization of microsatellite markers
The isolated markers revealed 6-14 alleles per locus for 10 MLLs in mt-L and 4-10 for 7 MLLs in mt-S. Fifteen of 182 alleles in 11 loci were shared between mt-L and mt-S types. Some common alleles may accord by chance due to size homoplasy of the microsatellite region [42]. H E Table 1. Characteristics of the 11 polymorphic, cross-type microsatellite loci for mt-L and mt-S in Galaxea fascicularis.

Locus
Repeat motif Primer sequence (5'-3') Size range (bp) The size range of amplification products includes the U19 sequence. Most of these loci did not show significant deviation from HWE, except for Gfas3_50 and Gfas4_076 in mt-L ( Table 2). There is no HWE deviation in 11 loci in mt-S. In addition, there is no significant linkage disequilibrium for all combinations of loci among either mt-L (27 loci) or mt-S (11 loci).

Genetic differentiation between mt-L and mt-S colonies
Significant genetic differentiation between types was detected based on the genetic differentiation index (F ST = 0.076, P = 0.001). Furthermore, the results of Bayesian clustering using STRUCTURE also indicated that these two types are genetically isolated (Fig 2). When the number of genetic clusters was assumed to be two (K = 2), ΔK as ad hoc quantity for predicting of K and mean Ln P(D) as posterior probability of the data given K, suggested the largest value (ΔK = 475.17, mean Ln P(D) = -945.52), which were indicators of most probable K value. DAPC also indicated obvious differentiation between the mt-L and mt-S clusters (Fig 3). This result showed that the most probable number of genetically clustered populations was two. It is also evident that there is genetic differentiation at the population level between these two sympatric types; therefore they appear to represent legitimate species. These markers provide evidence of significant differentiation between mt-L and mt-S; however, they may underestimate differentiation of the two G. fascicularis types, because the 11 cross-type loci are relatively well conserved between types. Species boundaries of other coral genera are often morphologically ambiguous. For example, in Acropora, hybridization has been confirmed both in the field [43,44] and in the laboratory [36]. Moreover, in that genus, morphological classification conflicts with genetic phylogeny [45] and there are cryptic species [46,47]. Furthermore, in the family Pocilloporidae, the definition of species has been debated due to the incongruence of phylogenetic data with morphotypes (Pocillopora: [48,49]; Seriatopora: [50]; Stylophora: [51]). The only species in the family Helioporidae, Heliopora coerulea, is genetically divided into two clades that are related in terms of shape (small-branch or flat type) [52]. In some species, including marine invertebrates, related species have arisen through sympatric speciation [53]. The two types of G. fascicularis may have occurred by sympatric divergence; however, the mechanism has not identified. Therefore, more detailed ecological distribution patterns of both types and genetic differentiation need to be examined. In general, classification of reef-building corals will need to be based upon both morphology and genetics in order to estimate species diversity and other biological parameters. obtained in this study revealed definite genetic differentiation between mt-L and mt-S, suggesting partial reproductive isolation. A recent study suggested that skeletal and color morphs of G. fascicularis appear related to such environmental factors as salinity and light intensity, rather than to genetic factors, such as mitochondrial type [11]. The microsatellite markers developed in this study may provide further information about the relationship between morphs and nuclear genotypes, and may elucidate evolutionary processes in the genus Galaxea. They will be available for population genetics studies of G. fascicularis, and they will be necessary in order to estimate genetic diversity, differentiation, connectivity among populations, and evolutionary processes, including divergence of types in G. fascicularis.