Hidden Diversity in Sardines: Genetic and Morphological Evidence for Cryptic Species in the Goldstripe Sardinella, Sardinella gibbosa (Bleeker, 1849)

Cryptic species continue to be uncovered in many fish taxa, posing challenges for fisheries conservation and management. In Sardinella gibbosa, previous investigations revealed subtle intra-species variations, resulting in numerous synonyms and a controversial taxonomy for this sardine. Here, we tested for cryptic diversity within S. gibbosa using genetic data from two mitochondrial gene regions and one nuclear gene region of 248 individuals collected from eight locations across the Philippine archipelago. Deep genetic divergence and subsequent clustering were consistent across both mitochondrial and nuclear markers. Clade distribution is geographically limited: Clade 1 is widely distributed in the central Philippines, while Clade 2 is limited to the northernmost sampling site. In addition, morphometric analyses revealed a unique head shape that characterized each genetic clade. Hence, both genetic and morphological evidence strongly suggest hidden diversity within this common and commercially important sardine.


Introduction
Accurately defining the limits of a species is essential to studying its biology, ecology, conservation, and management [1][2][3]. Species are typically categorized according to gross morphology [4]. However, species diversity can be masked by a lack of obvious morphological differences between cryptic species [5]. Cryptic species are morphologically similar but genetically distinct lineages, and are often overlooked by species identification using gross morphology alone [6]. Genetic differentiation can be used to distinguish morphologically similar lineages [7][8][9], in which the genotypic-clustering species definition is utilized to identify cryptic species [10][11][12]. Cryptic species are widely distributed across different taxa and geographies in the marine realm [13][14], occurring either in allopatry [15,16] or in sympatry [8,17], as sister species [16] or as a result of convergent evolution [18]. In the Indo-West Pacific, cryptic species of widely distributed reef fishes contribute significantly to overall marine biodiversity [19]. Thus, inaccurate species delimitation would overlook cryptic species and underestimate biodiversity, which can lead to flawed conservation and management strategies [5,20].
Marine small pelagic fishes comprise the majority of the world's landed fish catch [21], and members of the family Clupeidae contribute significantly to this volume [22]. The Clupeidae reach their highest diversity in the Indo-West Pacific [23], a region also proposed to be the geographic origin of this family [24]. Within the Indo-West Pacific, the goldstripe sardinella Sardinella gibbosa (Bleeker, 1849) is among the most abundant and widespread marine pelagic species. Its distribution extends from the East African coast to Taiwan, the Philippines, Indonesia and Northern Australia [23]. Historically, it has been among the most abundant and commercially important species in the Indo-West Pacific sardine fishery [25,26]. In particular, the goldstripe sardinella is the second most abundant sardine occurring in the Philippine archipelago [27]. Migratory patterns of S. gibbosa have been correlated with the availability and seasonality of planktonic prey in the environment [26,28,29]. Biological data from the coast of India suggest a peak spawning period that lasts from early March towards the end of May [26,30]. However, two distinct length and age groups have been observed [31], and variations in the number of scale striations have been observed in S. gibbosa off the South African coast [32]. Morphological classification of this sardine has been complicated by subtle intra-species variations, leading to several recorded synonyms for S. gibbosa [23]. Type specimens, now considered synonyms of S. gibbosa, were previously known as Clupea immaculata (Southern Japan and China) [33], Fimbriclupea dactyolepis (Northwest Australia) [34], and Sardinella taiwanensis (Taiwan) [35]. Such subtle biological and morphological differences documented in S. gibbosa may hint at hidden diversity within this sardine.
The objective of this work was to explore possible cryptic species within the goldstripe sardinella in the Philippine archipelago by examining molecular and morphometric data. We investigated the occurrence of cryptic species using genotypic clustering for both mitochondrial and nuclear markers. We also used morphometric variations to test for subtle morphological differentiation between the genetically partitioned groups. Both genetic and morphological evidence provide strong support for an unexpected multiple-species complex within the common and commercially important sardine S. gibbosa in the Philippine archipelago.

Sampling
A total of 378 individuals of S. gibbosa were collected from sixteen fish markets across the Philippines, Taiwan, Malaysia, Vietnam and Thailand (Table 1). Body coloration, morphometric, and meristic characters were recorded for frozen then thawed samples. Tissue samples and voucher specimens preserved in absolute ethanol were stored at the National Fisheries Research and Development Institute, Quezon City, Philippines.

PCR amplification
Genomic DNA was extracted from muscle tissue samples using either a modified Chelex® (Bio-Rad, Hercules, CA) DNA extraction protocol [36] or a salting-out method [37]. Approximately 540 bp of the ribosomal 16S gene region were amplified by polymerase chain reaction (PCR) using the primers 16Sar (5′-CGCCTGTTTATCAAAAACAT-3′) and 16Sbr (5′-CCGGTCTGAACTCAGATCACGT-3′) [38]. Additional mitochondrial and nuclear DNA sequences were obtained only for Philippine collections due to mounting sequencing costs and limited time available for this study. The primers CRA (5′-TTCCACCTCTAACTCCCAAAGCTAG-3′) and CRE (5′-CCTGAAGTAGGAACCAGATG-3′) were used to amplify 560 bp of the mitochondrial DNA control region [39]. [...] DNA polymerase. Thermal cycling conditions consisted of an initial denaturation at 94°C for 10 min, followed by 38 cycles of DNA denaturing at 94°C for 30 s, primer annealing at 45°C for 45 s, and sequence extension at 72°C for 45 s, ending with a final extension at 72°C for 10 min. For the ribosomal S7 intron, we utilized PCR conditions as previously described [40]. Successful PCR products were purified using ExoSAP-IT® (USB Corp, Cleveland, OH). The reaction mixture consisted of 2 µl of ExoSAP-IT and 22 µl of PCR product, and was incubated at 37°C for 15 min followed by another 15 min at 80°C to inactivate the enzyme. Purified PCR products were sent to either Macrogen, Inc. Korea or UC-Berkeley for DNA sequencing. Sequence data were deposited in the public database GenBank [Accession numbers pending].
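As a quick arithmetic check on the thermal profile described above, the programmed hold times can be summed. This is only a minimal sketch: the step labels and variable names are illustrative, and ramp times between temperatures (which depend on the thermocycler) are ignored, so an actual run would take longer.

```python
# Each step: (label, temperature in degrees C, hold time in seconds).
INITIAL = ("initial denaturation", 94, 10 * 60)
CYCLE = [
    ("denaturation", 94, 30),
    ("annealing", 45, 45),
    ("extension", 72, 45),
]
FINAL = ("final extension", 72, 10 * 60)
N_CYCLES = 38


def programmed_seconds(initial, cycle, final, n_cycles):
    """Sum the hold times of a PCR program, ignoring ramp times."""
    per_cycle = sum(step[2] for step in cycle)
    return initial[2] + n_cycles * per_cycle + final[2]


total = programmed_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```

Each cycle holds for 120 s, so the 38 cycles account for most of the programmed time.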

Phylogenetic reconstruction
Sequences were assembled in Geneious v5.4 [41] and aligned using MUSCLE v3.8.31 [42]. Best-fit nucleotide substitution models were determined using jMODELTEST v2 [43,44]. Maximum likelihood (ML) phylogenetic analysis was conducted in MEGA v5.2.1 [45] using the best-fit nucleotide substitution models, namely Kimura-2-Parameter (K2P) for 16S, the three-parameter model (TPM) for the control region, and Hasegawa-Kishino-Yano (HKY) for the S7 intron. Sequences of the closely related taxa Sardinella fimbriata, S. hualiensis, S. lemuru, Herklotsichthys quadrimaculatus and Amblygaster sirm were included for outgroup comparison. These species also overlap in geographic distribution with S. gibbosa throughout the Indo-West Pacific [23]. However, we excluded Amblygaster sirm and Herklotsichthys quadrimaculatus as outgroups for the control region dataset, since they are highly divergent and their inclusion created large indels in the sequence alignment. The allelic state of the nuclear S7 intron was estimated using PHASE v2.1 [46,47] as implemented in DnaSP v5.0 [48]. Phylogenetic networks were inferred using the median-joining method implemented in NETWORK v4.6 [49] with default settings.
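To illustrate what the K2P model measures, the sketch below computes the Kimura 2-parameter distance between two aligned sequences, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. It is a hedged, stand-alone illustration and not the MEGA implementation; the function name, the pairwise-deletion handling of gaps, and the toy sequences are assumptions made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguity codes (pairwise deletion, an assumption).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy 10-bp alignment: one transition (A/G) and one transversion (A/C).
toy_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(f"K2P distance = {toy_distance:.4f}")
```

Because the model corrects for multiple substitutions at the same site, the distance is somewhat larger than the raw proportion of differing sites (0.2 in the toy alignment).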

Morphological analysis
To complement genetic data, variability within S. gibbosa from 10 individuals per site was quantified by morphometric measurements representing the head shape. Measurements (in mm) obtained using a Vernier caliper were standard length, snout length (tip of snout to eye), head length (tip of snout to edge of operculum), eye diameter (horizontal diameter), upper jaw length, and post-orbital length (right edge of eye to end of operculum). All measurements were converted into ratios to represent proportion with respect to standard length. A principal component analysis implemented in PC-ORD v4.10 [50] was performed on natural log-transformed ratios, separating morphological variation into linear combinations of variables that describe overall head shape. In addition, analysis of similarities (ANOSIM) and similarity percentage analysis (SIMPER) were conducted on log-transformed morphometric ratios in Primer v5.2.4 [51] to determine the percentage contribution of morphometric ratios to the overall variation in head shape.
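The transformation and the ANOSIM test described above can be sketched as follows. This is an illustration under stated assumptions, not the Primer implementation: the measurements are hypothetical, the function names are invented here, and Euclidean distance is assumed because the dissimilarity measure is not stated in the text. The statistic follows the usual definition R = (mean between-group rank - mean within-group rank) / (M/2), with M = n(n - 1)/2 pairwise dissimilarities.

```python
import math
import random
from itertools import combinations


def log_ratios(measurements, standard_length):
    """Express measurements as proportions of standard length, then ln-transform."""
    return [math.log(m / standard_length) for m in measurements]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(samples, groups):
    """ANOSIM R statistic from ranked pairwise dissimilarities."""
    pairs = list(combinations(range(len(samples)), 2))
    ranks = average_ranks([euclidean(samples[i], samples[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim_test(samples, groups, n_perm=999, seed=0):
    """Observed R and a permutation p-value obtained by shuffling group labels."""
    rng = random.Random(seed)
    observed = anosim_r(samples, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if anosim_r(samples, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical fish: (clade, SL, snout, head, eye, upper jaw, post-orbital) in mm.
fish = [
    ("1", 120.0, 8.0, 28.0, 7.0, 11.0, 13.0),
    ("1", 125.0, 8.4, 29.5, 7.2, 11.6, 13.8),
    ("1", 118.0, 7.9, 27.6, 6.9, 10.8, 12.8),
    ("2", 121.0, 9.5, 32.0, 7.5, 13.0, 14.5),
    ("2", 119.0, 9.4, 31.6, 7.4, 12.9, 14.2),
    ("2", 124.0, 9.8, 33.0, 7.7, 13.4, 14.9),
]
samples = [log_ratios(row[2:], row[1]) for row in fish]
groups = [row[0] for row in fish]
r_stat, p_value = anosim_test(samples, groups)
print(f"R = {r_stat:.3f}, p = {p_value:.3f}")
```

R approaches 1 when all between-group dissimilarities exceed the within-group ones and is near 0 when grouping explains nothing. With only six toy specimens the permutation p-value cannot become very small, however strong the separation.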

Results
Maximum-likelihood analysis of 16S rRNA sequences supports the existence of two species within S. gibbosa (Figure 1). In concordance with the 16S data, mitochondrial control region sequences revealed similar clustering (Figure 2). Clustering for both markers exhibited monophyletic clades with high bootstrap support. In addition, nuclear DNA sequences of the first intron of the S7 gene revealed a deep divergence between Clade 1 and Clade 2 (Figure 3). None of the phylogenetic analyses indicated that the two morphotypes initially identified as S. gibbosa are sister species. Consistent across examined gene regions, genetic distances calculated for both mitochondrial and nuclear gene regions exhibited divergence comparable to species-level differentiation (Table 2). Clade 1 is broadly distributed across the collection sites except at the northernmost location (Figure 4). In contrast, Clade 2 is geographically restricted to this northernmost site in Cagayan Province. Further, the single sample from Yilan County, Taiwan did not cluster with Clade 2, despite the site's close proximity to Cagayan Province. However, the current dataset for Taiwan, Vietnam, Thailand and Malaysia is limited to the mitochondrial 16S gene region. Nevertheless, median-joining networks for all three markers, at least for Philippine sites, revealed numerous base-pair mutations between Clades 1 and 2 (Figure 5).
All specimens exhibited the diagnostic characters for S. gibbosa, including the dark spot at the dorsal fin origin. However, head shape and pigmentation of both lower and upper jaws differ between the two clades identified using genetic markers (Figure 6). Principal component analysis (PCA) revealed strong morphometric differentiation in head dimensions (Figure 7). The first four principal components (PC) account for 95.42% of overall variance (PC1: 48.74%; PC2: 21.27%; PC3: 14.94%; PC4: 10.79%) (Table 3). PC1 was highly correlated with variance in upper jaw length, eye diameter, and post-orbital length, in that order. On the other hand, PCs 2 through 4 were associated with differences in the ratios of head length, upper jaw, and eye diameter. In concordance with the genetic clades, a similar clustering was observed in plots of the principal components (Figure 7). Scatter plots for PC1 and PC3 separated collections from Quezon Province into a distinct cluster. Such grouping might indicate a unique sub-population or subspecies in Clade 1. A shorter snout and upper jaw with respect to head length in these individuals accounted for such clustering in the principal component analysis.
Multivariate analysis of morphometric ratios using ANOSIM showed significant variation in head shape between the two genotypic clades (R = 0.486; p < 0.01). Differences in snout, post-orbital, and head lengths distinguished the two clades of S. gibbosa (Table 4). Strong differentiation in head and upper jaw length accounted for 52.26% of the variation between Clade 1 and Clade 2. On the other hand, variations in eye, snout, and post-orbital length contributed 47.74% of the overall difference between the two clades. Clade 1 had shorter heads with respect to standard length. Lastly, Clade 2 individuals had a shorter post-orbital length, and consequently a shorter operculum.
In addition to morphometric differences, Clade 2 has a distinct black pigmentation on the edge of the mouth and the frontal line between the nostrils (Figure 6). Similar blackish coloration has been observed in the caudal fins of Clade 2. Further, a leaner body characterized Clade 2 in contrast with the more rounded shape of Clade 1 (data not shown). Lastly, Clade 2 lacked the gold stripe across the lateral body wall which characterized Clade 1 (Figure 6).
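The percentage contributions reported above come from SIMPER, which splits the average between-group dissimilarity into per-variable parts. The sketch below shows one common formulation based on Bray-Curtis dissimilarity. It is an assumption-laden illustration, not the Primer routine: the function name and the morphometric values (given as % of standard length) are hypothetical and do not reproduce Table 4.

```python
from itertools import product


def simper_contributions(group_a, group_b, names):
    """Percentage contribution of each variable to mean Bray-Curtis dissimilarity.

    Returns rows of (name, percent contribution, cumulative percent),
    sorted from the largest contributor to the smallest.
    """
    totals = [0.0] * len(names)
    n_pairs = 0
    for a, b in product(group_a, group_b):  # every between-group pair
        denom = sum(x + y for x, y in zip(a, b))
        for i, (x, y) in enumerate(zip(a, b)):
            totals[i] += 100.0 * abs(x - y) / denom
        n_pairs += 1
    means = [t / n_pairs for t in totals]
    overall = sum(means)
    if overall == 0:
        raise ValueError("groups are identical; nothing to partition")
    rows, cumulative = [], 0.0
    for name, mean in sorted(zip(names, means), key=lambda item: -item[1]):
        percent = 100.0 * mean / overall
        cumulative += percent
        rows.append((name, percent, cumulative))
    return rows


# Hypothetical head measurements as % of standard length.
names = ["snout", "head", "eye", "upper jaw", "post-orbital"]
clade_1 = [[6.7, 23.3, 5.8, 9.2, 10.8], [6.7, 23.6, 5.8, 9.3, 11.0]]
clade_2 = [[7.9, 26.4, 6.2, 10.7, 12.0], [7.9, 26.6, 6.2, 10.8, 11.9]]

for name, percent, cumulative in simper_contributions(clade_1, clade_2, names):
    print(f"{name:12s} {percent:6.2f}% (cumulative {cumulative:6.2f}%)")
```

By construction the contributions sum to 100%, which is why the two groups of characters reported above (52.26% and 47.74%) together account for the full difference between the clades.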

Discussion
Many discoveries of cryptic species have been based on prior observations of subtle behavioral, biological, or morphological intra-species variations [5]. However, phenotypic differentiation does not necessarily complement genotypic divergence [9], as evident in the lack of congruence between genetics and diagnostic morphological characters [6,10,52]. In extreme cases, morphological variations are randomly shared among genetically distinct lineages within a cryptic species complex [11,53]. To avoid the inconsistency between genetics and morphology, the straightforward approach is to identify cryptic species using multi-locus genetic data [54,55], a method that can help avoid the pitfalls of morphological species delimitation [12]. Cryptic species identified through genetic clustering can then be bolstered by support from additional morphological or biological traits.
In S. gibbosa, molecular evidence from both mitochondrial and nuclear DNA strongly supports two cryptic species. Clade 1 is widely distributed throughout the Philippine islands, while Clade 2 is geographically restricted to Cagayan Province. Such allopatric distribution has been reported to occur in other cryptic species [15,16]. In addition, the two clades exhibited the same clustering for all markers (Figures 1-3), a finding consistent with the 'genotypic-clustering' species definition [12], and substantiated by agreement between multi-locus genotypic data [56,57]. Likewise, a lack of reciprocal monophyly in Clades 1 and 2 showed that the two lineages are different and not sister species. Clades 1 and 2 were paraphyletic with each other, a phylogenetic pattern that commonly occurs among cryptic species [58,59]. In addition, the 10-40% genetic distances between cryptic clades are comparable to species-level differences (Table 2) [60]. In some pairs, genetic distances for 16S rRNA and the S7 intron of S. gibbosa exceed levels typically distinguishing closely related species [60]. It is also notable that Clades 1 and 2 share no haplotypes in either the conserved 16S rRNA or the polymorphic mitochondrial control region (Figure 5). Consistent patterns in both maternally and bi-parentally inherited genetic markers demonstrate a lack of gene flow between the two cryptic species. Such patterns fall within the framework of the general species concept for two unique species [1]. Hence, genetic information from both mitochondrial and nuclear DNA presents solid evidence for two biologically distinct species.
Distinct morphometric variations in head shape characterized Clades 1 and 2 of S. gibbosa. Multivariate analysis of head measurements revealed clustering comparable to the genetic clades. Similar clustering due to head shape variations has characterized sub-species within a sardine [23,61], a result later confirmed by molecular evidence from mitochondrial data [62]. Morphological differences between closely related sardines are often characterized by slight differences in measurements or meristic counts, resulting in an ambiguous and often controversial taxonomic status [23]. For instance, the sister sardine species Sardinella tawilis and Sardinella hualiensis share diagnostic characters and, excluding habitat preference, differ only in head length and lower gillraker count [63]. However, intra-species morphological variations in sardines are often presumed to be an artifact of localized adaptations to environment, due to a lack of support from significant genetic differentiation between morphological forms [64,65]. In contrast, the morphological disparity between the cryptic clades of S. gibbosa complements genetic divergence (Figure 6), and thus is not a mere localized ecological adaptation. Lastly, clustering in PCA due to head shape in Clades 1 and 2 of S. gibbosa falls within the phenetic-clustering species delimitation, as there is a lack of intermediate forms between the two clades [12,66]. Combined, genetic and morphological data reveal hidden diversity in a common and commercially important sardine. Our findings expand the previous investigations on the biology, ecology, and morphology of S. gibbosa that alluded to cryptic diversity [25,31]. Discovery of new fish species in the Northern Philippines [67,68], including a sardine beyond its previously known distribution [69], suggests that this region harbors undocumented and unique fauna. Such a pattern presents the possibility that Clade 2 might be a new species.
Alternatively, Clades 1 and 2 might be previously documented synonyms of S. gibbosa [23]. Based on geographic proximity and morphological similarity, the most likely candidate synonym is S. taiwanensis [35]; however, further scrutiny of type specimens is necessary for validation. Nevertheless, the findings in this study demonstrate that a combination of both morphological and genetic data is essential to assess diversity in taxonomically ambiguous sardines. Here, strong evidence of two ecologically similar, but genetically and morphologically distinct species warrants appropriate management strategies for separate sardine fisheries.

Table 4 note. Morphometric ratio refers to each character value as a percentage of standard length (% Ls). Contribution and cumulative differences calculated using SIMPER describe, in percentage, each morphometric character's contribution to head shape variation between Clade 1 and Clade 2. doi:10.1371/journal.pone.0084719.t004