Origin of Chinese Goldfish and Sequential Loss of Genetic Diversity Accompanies New Breeds

Background: Goldfish, Carassius auratus, have experienced strong anthropogenic selection during their evolutionary history, generating a tremendous extent of morphological variation relative to that in native Carassius. To locate the geographic origin of goldfish, we analyzed nucleotide sequences from part of the control region (CR) and the entire cytochrome b (Cytb) mitochondrial DNA genes for 234 goldfish and a large series of native specimens. Four important morphological characteristics used in goldfish taxonomy (body shape, dorsal fin, eye shape, and tailfin) were selected for hypothesis-testing to identify those that better correspond to evolutionary history.
Principal Finding: Haplotypes of goldfish rooted in two sublineages (C5 and C6), which contained the haplotypes of native C. a. auratus from southern China. Values of FST and Nm revealed a close relationship between goldfish and native C. a. auratus from the lower Yangtze River. An extraordinary, stepwise loss of genetic diversity was detected from native fish to goldfish and from Grass-goldfish relative to other breeds. Significantly negative results for the tests of Tajima’s D and Fu and Li’s D* and F* were identified in goldfish, including the Grass breed. The results identified eye-shape as being the least informative character for grouping goldfish with respect to their evolutionary history. Fisher’s exact test identified matrilineal constraints on domestication.
Conclusions: Chinese goldfish have a matrilineal origin from native southern Chinese C. a. auratus, especially the lineages from the lower Yangtze River. Anthropogenic selection of the native Carassius eliminated aesthetically unappealing goldfish and this action appeared to be responsible for the stepwise decrease in genetic diversity of domesticated goldfish, a process similar to that reported for the domestication of pigs, rice, and maize. The three-breed taxonomy (Grass-goldfish, Egg-goldfish, and Wen-goldfish) better reflected the history of domestication.


Introduction
Goldfish, one of the first animals domesticated for ornamental purposes, has experienced extreme anthropogenic selection during its evolutionary history to create aesthetically appealing forms [1,2]. Widely distributed across Eurasia [3], native feral Carassius (crucian carp) can naturally change their body color from gray to red [1,2]. Feral red goldfish are thought to be the ancestral forms of Chinese goldfish [1,2].
The ability to change color has led to aquiculture of the fish for use in religion [4]. The earliest record of anthropogenic usage dates to the Tsin Dynasty (265-419 A.D.) of China as noted in the Compendium of Materia Medica [4]. Strong anthropogenic selection during cultivation is likely responsible for much of the phenotypic variation seen today [1,2,4]. Some goldfish possess features such as egg-shaped bodies, celestial or telescopic eyes, fancy tailfins, lionhead morphotypes, a raspberry-like hood encasing the head (oranda), no dorsal fin, and other variants [2,5]. Around 1502 A.D. goldfish were exported to Japan [1] and around 1700 A.D. to Europe [1,2,5].
The tremendous extent of morphological variation in goldfish owing to anthropogenic selection causes difficulties in evolutionary taxonomy. Linnaeus (1758) originally named the goldfish as Cyprinus auratus because morphologically it is similar to the common carp, Cyprinus carpio [6]. Subsequently, Cyprinus auratus was transferred to the genus Carassius as Carassius auratus [6]. Several taxonomic schemes (Table 1) exist in China for goldfish, each of which focuses on different morphological features [4,7-9]. Most frequently, three terms are used to designate breeds: Grass-goldfish, Wen-goldfish, and Egg-goldfish. Assignment of an individual fish to one breed or another depends on body-shape (slender or egg-shaped) and condition of the dorsal fin (retained or lost) [7]. Systems involving four (Grass-goldfish, Wen-goldfish, Egg-goldfish, and Dragon-eye-goldfish) [8] and five (Grass-goldfish, Wen-goldfish, Egg-goldfish, Dragon-eye-goldfish, and Dragon-dorsal-goldfish) breeds are based on eye-shape (dragon or normal) as well as the two previous morphological characteristics [9]. Extended celestial or telescope eyes distinguish the Dragon-eye-goldfish from those with normal eyes [8,9]. Thus, the Dragon-dorsal-goldfish has dragon-eyes and no dorsal fin [9]. Although the number of tailfins (single or double) is not a standard for grouping goldfish, it is an important morphological characteristic used to describe the breeds [4,7-9]. The detailed descriptions of these four morphological characteristics for the three taxonomic schemes are listed in Table 1. Taxonomy is more informative when based on evolutionary history, but these various taxonomies (Table 1) focus only on human-selected morphological characteristics and likely obscure history. Unfortunately, no written history details the sequential development of the breeds. This necessitates a reassessment of goldfish taxonomy to ensure that it better mirrors evolutionary history, and not merely the extent of morphological divergence.
Chen [10] reported reproductive viability in hybrids between breeds of goldfish and native Carassius. The muscle proteins of native Carassius are similar to those of goldfish [11]. Analyses of nucleotide sequence data from partial fragments of the mitochondrial DNA (mtDNA) control region (CR; 471 bp), also known as the D-loop, reach the same conclusion [12]. Komiyama et al. [13] analyzed a portion of the mitochondrial genome (740 bp) from 67 specimens of Carassius, including 44 specimens of goldfish, and a further 11,180 bp from seven specimens of goldfish. Although they evaluated most of the mitochondrial genome, their sampling on mainland China was limited. Their matrilineal history hypothesizes that the ancestral breed is the Gibelio group of Chinese Carassius. Our prior research on the biogeography of the East Asian C. auratus complex used 1876 partial CR (426 bp) and 187 complete cytochrome b (Cytb; 1140 bp) gene sequences from 67 localities representing most of the species' range and identified three distinct, mostly geographically constrained matrilines [14]. These analyses provide an opportunity to investigate the origin and domestication history of goldfish from the perspective of the large population of native Carassius.
Herein, we reconstruct the matrilineal relationships using de novo sequences of goldfish as well as mtDNA data of wild Carassius from prior studies [14]. Analyses are used to infer the geographic origin of goldfish and to investigate the genetic consequences of extreme anthropogenic selection. We use CR because of its high mutation rate, which facilitates the resolution of intraspecific matrilineal relationships [15-18]. We also employ nucleotide sequences of Cytb because this gene is less subject to substitution saturation, which makes it more reliable than CR for evaluating interspecific relationships [15,17-20]. Furthermore, we select several important morphological characteristics of goldfish taxonomy in China for hypothesis-testing to identify those that better correspond to evolutionary history.

Ethics Statement
All samples of fish from China used in this study were obtained and handled following the guidelines of the by-laws on experimentation on animals, and the study was approved by the Ethics and Experimental Animal Committee of Kunming Institute of Zoology, Chinese Academy of Science, China (KIZ_YP201002).

Sampling and Molecular Methods
One hundred and ninety specimens of goldfish were collected from Hangzhou (37 specimens), Kunming (56 specimens), Changchun (20 specimens), Lanzhou (23 specimens) and Beijing (22 specimens), China, Seoul (11 specimens), South Korea, and Toronto (21 specimens), Canada. In addition, 5 sequences for Chinese goldfish [13], 39 sequences for Japanese goldfish [13] as well as 1876 sequences for native Carassius with detailed sampling localities and haplotypes [14] were obtained from GenBank (Table 2). Samples used for morphological analyses were photographed and then stored as voucher specimens. Either tailfin clips or muscle tissue samples were collected from individuals and stored at −20 °C until processing. Genomic DNA from freshly frozen or ethanol-fixed tissues was extracted by the standard phenol/chloroform method.

Analyses of Sequence Data
Sequences were assembled using DNASTAR v.5.0 (DNASTAR Inc., Madison, WI, USA) and manually verified. Sequence alignments and information on nucleotide variation were obtained using MEGA v.4.0 [21,22]. DAMBE v.4.1.19 [23] was used to identify shared haplotypes. The new haplotypes identified from goldfish, plus the 180 haplotypes of combined CR and Cytb from a prior study [14], were used for the new reconstruction of the matrilineal genealogy.
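As a rough illustration of how identical sequences collapse into shared haplotypes, the following Python sketch groups aligned sequences by their exact string. The specimen labels and toy sequences are hypothetical, and the function is only an assumption about the general idea; it is not the procedure implemented in DAMBE.

```python
from collections import defaultdict

def collapse_haplotypes(aligned):
    """Group specimen IDs by identical aligned sequence.

    `aligned` maps specimen ID -> aligned sequence (equal lengths assumed).
    Returns a list of (sequence, [specimen IDs]) sorted by descending count.
    """
    groups = defaultdict(list)
    for specimen, seq in aligned.items():
        groups[seq.upper()].append(specimen)
    return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

# Hypothetical toy alignment (not data from this study).
toy = {
    "goldfish_1": "ACGTACGT",
    "goldfish_2": "ACGTACGT",
    "native_1": "ACGTACGT",
    "native_2": "ACGAACGT",
}
haplotypes = collapse_haplotypes(toy)
for seq, members in haplotypes:
    print(seq, len(members), members)
```

A haplotype whose member list mixes goldfish and native specimens would count as shared; one containing only goldfish would count as unique to goldfish.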
Phylogenetic analyses were conducted using maximum likelihood (ML) and maximum parsimony (MP) in PAUP* 4.0b10 [24], and Bayesian inference (BI) in MrBayes v.3.0b4 [25]. All analyses were based on the concatenated Cytb and CR data. Likelihood ratio tests [26-28] implemented in MODELTEST v.3.7 [29] were employed to select the best-fitting models for the ML and BI analyses. The GTR+I+G model was selected for the combined dataset by the Akaike Information Criterion [30]. In the ML and MP analyses, a heuristic search with 100 random-addition replicates was used. BI used four simultaneous Metropolis-coupled Monte Carlo Markov chains running for 5,000,000 generations. Convergence to stationarity was evaluated in TRACER v.1.5 [31] using log-likelihood values. The first 50% of the trees were discarded as burn-in and the remaining tree samples were used to generate a consensus tree. Nodal support for the ML and MP tree-building methods was assessed using nonparametric bootstrapping (BS) [32], calculated in PAUP* for the MP analysis (MPBS) and in RAxML [33] for ML (MLBS), using 1000 pseudoreplicates each. Bayesian posterior probability (BPP) values, the frequency of nodal resolution in the majority rule consensus tree, were calculated through the BI analysis, and the BS values for each node in the MP and ML reconstructions were plotted on the tree.
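The burn-in and posterior-probability steps can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions: each sampled tree is represented only as a set of clades (frozensets of tip labels), and the toy chain is hypothetical. It is not the MrBayes implementation.

```python
from collections import Counter

def clade_support(sampled_trees, burnin_fraction=0.5):
    """Return clade -> frequency among post-burn-in tree samples.

    Each sampled tree is a set of clades; each clade is a frozenset of
    tip labels (a simplification used only for illustration).
    """
    start = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[start:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}

# Hypothetical chain of six sampled trees over tips A-D.
ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
chain = [{ac}, {ac}, {ab}, {ab, cd}, {ab, cd}, {ac}]

support = clade_support(chain, burnin_fraction=0.5)
# Clades found in more than half of the retained samples would appear
# in a majority rule consensus tree.
majority = {clade for clade, freq in support.items() if freq > 0.5}
print(sorted(sorted(clade) for clade in majority))
```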
According to the historical distribution of East Asian Carassius [3,34-38], we classified the 67 sampling localities for native Carassius from a prior study [14] into four geographic regions: northern China (NC), southern China (SC), Japan (JA), and Europe (Russia and Czech Republic; EU). Sampling localities for

Hypothesis-tests of Morphology and Genealogy
We tested for the correspondence between the three methods of morphologically grouping Chinese goldfish and genealogical history based on the concatenated dataset. The morphological data consisted of the four characters traditionally used for Chinese goldfish taxonomy (Table 1): body-shape (slender or egg-shaped), presence or absence of the dorsal fin, the eye shape (normal or derived), and single versus double tailfins. We identified specific haplotypes that were constrained for one morphological condition, and C. carassius was used as the outgroup taxon to determine character state polarities. For example, we evaluated whether the specific haplotypes for slender body-shape were constrained to one matriline, or not. The morphology-based trees were compared to the best unconstrained molecular tree. An MP molecular tree that represented a particular morphological topology was estimated using constrained tree searches in PAUP*, with a heuristic search of 100 random-addition replicates for each analysis. Each of the constrained trees was compared to the unconstrained MP topology using a non-parametric Templeton test [43] in PAUP*. Constrained and unconstrained topologies were similarly calculated under the ML criterion in a heuristic search with 100 random-addition replicates and compared using the Shimodaira and Hasegawa [44] test (SH) implemented in PAUP*.
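A constrained search keeps only topologies in which the chosen haplotypes form a single clade. The Python sketch below shows that monophyly criterion on a rooted tree written as nested tuples. The tree and tip labels are hypothetical, and the code illustrates only the criterion; it does not reproduce the PAUP* search or the Templeton and SH tests.

```python
def tips(node):
    """Return the set of tip labels below a node (tips are strings)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))

def is_monophyletic(tree, group):
    """True if some node of the rooted tree has exactly `group` as its tips."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)

# Hypothetical rooted tree; labels are placeholders, not study haplotypes.
tree = ("outgroup", (("hapA", "hapB"), ("hapC", ("hapD", "hapE"))))

print(is_monophyletic(tree, {"hapD", "hapE"}))  # these two tips form a clade
print(is_monophyletic(tree, {"hapA", "hapC"}))  # these two tips do not
```

If the haplotypes carrying one morphological state are not monophyletic on the best unconstrained tree, forcing them into one clade costs extra steps or likelihood, and that cost is what the topology tests evaluate.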
A Fisher's exact test was used to examine the matrilineal distribution of goldfish based on the concatenated Cytb and CR datasets. DNASP [42] calculated the polymorphic sites (variable and potentially parsimony-informative sites) for native Carassius and goldfish, respectively, and the Fisher's exact test was performed using SPSS version 13.0.
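For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2x2 table can be sketched in Python with only the standard library. The counts below are hypothetical placeholders, since the contingency table used in the study is not given in this section, and the code is not the SPSS procedure.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Probability of a table whose top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows are two groups, columns are two categories.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```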

Haplotype Nomenclature
To avoid confusion, we employed a nomenclature to distinguish haplotypes obtained from the two genes. Haplotypes starting with 'h' were used to denote CR data, those with the prefix 'Jap' were CR sequences unique to Japanese goldfish, and those starting with 'B' indicated Cytb data only. The designations were combined to indicate total mtDNA variation. Accordingly, a haplotype consisting of CR haplotype h13 and Cytb haplotype B10 was termed h13B10.
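The naming rule can be expressed as a short Python helper. The function name and the prefix checks are illustrative assumptions; only the prefixes and the example labels come from the text.

```python
def combined_haplotype(cr_name, cytb_name):
    """Join a CR haplotype name and a Cytb haplotype name into one label."""
    if not (cr_name.startswith("h") or cr_name.startswith("Jap")):
        raise ValueError(f"unexpected CR haplotype name: {cr_name!r}")
    if not cytb_name.startswith("B"):
        raise ValueError(f"unexpected Cytb haplotype name: {cytb_name!r}")
    return cr_name + cytb_name

print(combined_haplotype("h13", "B10"))   # h13B10
print(combined_haplotype("Jap1", "B13"))  # Jap1B13
```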

Sequence Variation
Sequence variation in CR and Cytb markers of goldfish was summarized in Table 2. The CR sequences (426 bp) of goldfish contained only 11 variable sites of which eight were potentially parsimony-informative. Analyses identified nine haplotypes from 234 specimens. Two haplotypes (Jap1, Jap2) were unique to goldfish, and seven were shared with native Carassius. Haplotype h20 was most common in goldfish, being shared by 181 specimens. Among the 1140 bp of Cytb data, 25 sites exhibited variation and among these only six sites were potentially parsimony-informative. For Cytb, eight haplotypes were identified from the 180 sequences of goldfish, of which three (B105-B107) were not shared with native Carassius. Shared by 160 specimens, haplotype B13 was the most common one. Combined, the CR and Cytb data identified 12 haplotypes. Seven of these haplotypes (h1_2B13, h20B13, h55B105, h55B106, h56B107, Jap1B13, and Jap2B13) were unique to goldfish. One hundred and forty-seven specimens of goldfish shared the most common haplotype, h20B13; this haplotype was not shared with native Carassius. Only five haplotypes (h1_2B22, h13B10, h19B13, h32B25, and h56B9) were shared with native Carassius. More specimens of native Carassius were sequenced for CR (1876) than Cytb (187) [14] and this may have resulted in the resolution of a greater number of haplotypes unique to goldfish for Cytb.
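The distinction between variable and potentially parsimony-informative sites can be illustrated with a small Python sketch. It uses the common definition of a parsimony-informative site (at least two bases, each present in at least two sequences) and a hypothetical toy alignment; it is not the routine of the software used in the study.

```python
from collections import Counter

def site_summary(alignment):
    """Count variable and parsimony-informative sites in an alignment.

    A site is variable if it shows at least two different bases, and
    parsimony-informative if at least two bases each occur in at least
    two sequences. Gaps and ambiguity codes are ignored in this sketch.
    """
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) >= 2:
            variable += 1
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative

# Hypothetical toy alignment of four sequences, six sites each.
toy_alignment = ["ACGTAC", "ACGTAT", "ATGTAT", "ATGAAC"]
print(site_summary(toy_alignment))  # (3, 2)
```

In the toy alignment the fourth site varies only through a single sequence, so it is variable but not parsimony-informative.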

Genetic Divergence between Native Carassius and Goldfish
The FST value between goldfish and native Carassius from southern China (SC) was substantially lower (0.2157) than that between goldfish and native Carassius from northern China (NC; 0.9958), Europe (EU; 0.8942), or Japan (JA; 0.7461). Values of Nm also indicated more gene flow between the goldfish and native Carassius from SC (2.9847) than between goldfish and the native Carassius from NC (0.0254), EU (0.1337), and JA (0.2922). Values of FST and Nm between the native Carassius from eight Chinese drainages and goldfish were also calculated (Table 3). We classified the goldfish by the three-breed taxonomy and subsequently compared the divergence between these breeds and native C. a. auratus. For the CR dataset, the FST value between southern Chinese C. a. auratus (sublineages C5 and C6) and Grass-goldfish (0.0591) was lower than that between southern Chinese C. a. auratus and either Egg-goldfish (0.1355) or Wen-goldfish (0.1164). The Nm value between southern Chinese C. a. auratus and Grass-goldfish (6.5712) was higher than between Egg-goldfish

Genetic Diversity
Haplotype (Hd) and nucleotide (π) diversity based on CR and Cytb sequences separately ( We also classified the goldfish by the three-breed taxonomy and subsequently compared their levels of genetic diversity. Haplotype h19 was shared by the Egg-goldfish and Wen-goldfish. Haplotypes Jap1 and Jap2 only occurred in Egg-goldfish, which have double tail fins and no dorsal fins. Six Cytb haplotypes were unique to Grass-goldfish and only haplotype B13 occurred in all three breeds. Haplotype B9 was detected in Wen-goldfish only. Grass-goldfish had a higher level of genetic diversity (Table 2) −4.423) values were also significantly negative in Grass-goldfish for the Cytb dataset (Table 2).
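For orientation, the two diversity measures can be computed from an alignment as in the Python sketch below. It uses the textbook definitions (haplotype diversity as n/(n-1) times one minus the sum of squared haplotype frequencies, and nucleotide diversity as the mean number of pairwise differences per site). The toy sequences are hypothetical, and the code is an illustration, not the software output reported in Table 2.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))

def nucleotide_diversity(seqs):
    """pi = mean number of pairwise differences per site."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * len(seqs[0]))

# Hypothetical sample: three identical sequences and one variant.
toy_seqs = ["ACGT", "ACGT", "ACGT", "ACGA"]
hd = haplotype_diversity(toy_seqs)
pi = nucleotide_diversity(toy_seqs)
print(round(hd, 3), round(pi, 3))
```

A sample dominated by one haplotype, as reported here for h20B13 in goldfish, drives both measures toward zero.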

Hypothesis-testing for Grouping Breeds
Four morphological constraint trees and the unconstrained best tree were represented in Figure S1. Results of the Templeton and SH tests were summarized in Table 4. The Templeton test rejected body-shape and eye-appearance (P<0.05) as being correlated with matrilineal history. P-values for the SH test showed that the best unconstrained ML topology differed significantly from the morphological constraint tree for eye-appearance (P<0.05). Differences in ln L values also revealed that the condition of the goldfish's eye (normal or derived) was the least informative character for grouping by history (23.66), followed by body-shape (egg-shaped or slender, 14.22), and dorsal fin (retained or lost, 14.22). The number of tailfins (single or double, 4.61) was most indicative of genealogical history. Therefore, the three-breed scheme (Grass-goldfish, Wen-goldfish, and Egg-goldfish) better reflected history than either the four-breed or the five-breed systems that emphasized eye condition.
Fisher's exact test based on the concatenated Cytb and CR datasets obtained a highly significant (P<0.001) relationship between sites of genetic variation and lineage. This indicated that the domestication of goldfish was constrained to particular matrilines.

Origin of Goldfish
Our analyses suggest that Chinese goldfish have a matrilineal origin from native southern Chinese C. a. auratus, especially lineages from the lower Yangtze River. The genealogical analyses resolve the origin of goldfish from native Chinese Carassius, a finding consistent with that of Komiyama et al. [13]. The matrilineal genealogy (Figure 1) and FST values (Table 3) further indicate a much closer relationship between the goldfish and sublineages C5 and C6 of C. a. auratus from southern China than with the gibel carp (sublineage C2) from northern China. This finding differs from the suggestion that goldfish originated from the gibel carp [13]. Values of Nm (Table 3) also suggest that strong gene flow occurs between goldfish and native Carassius from the lower Yangtze River. All analyses are consistent with the historical record, which suggests that Hangzhou and Jiaxin, Zhejiang, China might be the area of domestication [1]. Our analyses do not detect significant genetic divergence among the different regions where goldfish live; strong gene flow appears to occur among these regions. These results are not surprising considering the long history of commercialization, artificial selection, and hybridization among different breeds and regions of goldfish [1,2,4].
Other evidence excludes the gibel carp from being the ancestor of goldfish. Gibel carp are usually hexaploids with more than 150 chromosomes [45]. In contrast, goldfish are always tetraploids and have around 100 chromosomes [4,46]. Further, the historical distribution of the gibel carp (C. a. gibelio) appears to be restricted to the northern Amur River system and Europe [47][48][49][50]. Historically, the distributions for goldfish and the gibel carp did not overlap. Therefore, our resolution of a southern origin for goldfish is valid not only because of the strength of the historical geography and ploidy levels of these fishes, but also because of the incontrovertible exclusion of the matrilines of the gibel carp.

Domestication History of Goldfish
Anthropogenic selection of native Carassius eliminated aesthetically unappealing goldfish and this action appears to be responsible for the stepwise decrease in genetic diversity of domesticated goldfish, i.e. the loss of genetic variation from native goldfish to Grass-goldfish in aquiculture followed by further loss within the remaining breeds of goldfish. A strong reduction of genetic diversity should accompany domestication, and this is seen as a recent bottleneck event or founder effect, as occurs in domesticated pigs [51,52], maize [53], and rice [54]. Both the extraordinarily low genetic diversity and the significantly negative results for the tests of Tajima's D and Fu and Li's D* and F* (Table 2) indicate founder effects and bottlenecking during the domestication of goldfish. Based on recorded history, native red Carassius were initially kept in 'free life ponds' at many temples in Hangzhou and Jiaxin, Zhejiang, China, without anthropogenic breeding selection [1,4]. The morphology of the Grass-goldfish is less derived and more similar to the native Carassius than that of other breeds [1,7,8]. Our analyses reveal that the Grass-goldfish has a higher level of genetic diversity than either Egg-goldfish or Wen-goldfish. The FST values also indicate that Grass-goldfish and southern Chinese C. a. auratus differ less from each other than do either Egg-goldfish or Wen-goldfish from southern Chinese C. a. auratus. These findings indicate that the Grass-goldfish is likely the first domesticated breed of Chinese goldfish.
Strong anthropogenic selection for aesthetics is likely responsible for the further loss of genetic diversity among different breeds of goldfish. Our analyses for the three-breed system detect a further decrease in genetic diversity from Grass-goldfish to Egg- or Wen-goldfish (Table 2). Tajima's D and Fu and Li's D* and F* in Grass-goldfish are also significantly negative (Table 2). Wen-goldfish and Egg-goldfish both have many derived morphological features relative to the Grass-goldfish, such as the egg-shaped body, the possession of double tailfins, the absence of dorsal fins, and/or dragon-eyes [4,6-8]. These findings correspond to anthropogenic selection to eliminate aesthetically unappealing goldfish and the consequential dramatic reduction in genetic diversity, which occurs in Wen- and Egg-goldfish in the three-breed system.
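To make the sign of Tajima's D concrete, the Python sketch below applies the standard formula (Tajima 1989) to two hypothetical toy samples. D compares the mean number of pairwise differences with the number of segregating sites scaled by a1; an excess of rare variants pushes it below zero. The sequences are placeholders, and the code does not reproduce the values in Table 2.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no missing data)."""
    n = len(seqs)
    # S: number of segregating (variable) sites.
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # k: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical sample in which every variant is a singleton.
rare_variants = ["AAAAAA", "TAAAAA", "ATAAAA", "AATAAA", "AAATAA"]
# Hypothetical sample with two equally common, divergent haplotypes.
balanced = ["AA", "AA", "TT", "TT"]

print(tajimas_d(rare_variants) < 0)  # True: excess of rare variants
print(tajimas_d(balanced) > 0)       # True: intermediate frequencies
```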

Three-breed Taxonomy and the History of Domestication
Given the absence of a recorded history of the domestication of goldfish, we employed phylogenetic methods to reconstruct the past. Our analyses reveal that the three-breed taxonomy (Grass-goldfish, Egg-goldfish, and Wen-goldfish) better indicates the history of domestication than either the four-breed or the five-breed systems that emphasize eye-condition. The results of hypothesis-tests indicate that the condition of the fins is informative for grouping goldfish with respect to their evolutionary history. In the three-breed taxonomy, the Grass-goldfish and native Carassius differ only in the color of their scales [4,6-8], and the condition of the dorsal fin (lost or retained) distinguishes the Egg-breed from the Wen-breed [1]. Biomechanically, the dorsal fin functions to maintain balance when swimming. Without the dorsal fin, most fishes cannot stay upright. Dorsal finlessness appears after the attainment of double tailfins, which compensate for the loss of dorsal fins [55]. Therefore, Egg-goldfish (no dorsal fin) likely have a more recent origin than goldfish with double tailfins. The further examination of other genes closely related to the morphological characteristics of goldfish can test this prediction.

Figure S1 Four morphological constraint-trees and the best unconstrained matrilineal genealogy for goldfish. Unique haplotypes identified for each morphological characteristic were constrained to being monophyletic based on the concatenated Cytb and CR data, and using Carassius carassius as the outgroup taxon. The best unconstrained tree was shown at the bottom of the figure. Photographs of the goldfish for each morphological characteristic were mapped to the genealogy. (TIF)