Genetic analysis of body weight in wild populations of medaka fish from different latitudes

The genetic bases of growth and body weight are of economic and scientific interest, and teleost fish models have proven useful in such investigations. The Oryzias latipes species complex (medaka) is an abundant freshwater fish in Japan and suitable for genetic studies. We compared two wild medaka stocks originating from different latitudes. The Maizuru population from higher latitudes weighed more than the Ginoza population. We investigated the genetic basis of body weight, using quantitative trait locus (QTL) analysis of the F2 offspring of these populations. We detected one statistically significant QTL for body weight on medaka chromosome 4 and identified 12 candidate genes that might be associated with body weight or growth. Nine of these 12 genes had at least one single nucleotide polymorphism that caused amino acid substitutions in protein-coding regions, and we estimated the effects of these substitutions. The present findings might contribute to the marker-assisted selection of economically important aquaculture species.


Introduction
Growth and body weight are economically important traits in the livestock industry and in aquaculture. Such traits involve complex physiological processes that are controlled by various environmental and genetic factors. Quantitative trait locus (QTL) mapping and marker-assisted selection for economic traits, including growth and body weight in aquaculture, have recently been conducted in several studies using molecular markers such as microsatellites and single nucleotide polymorphisms (SNP) [1][2][3][4][5][6][7].
Body weight depends not only on growth traits but also on body composition and metabolism. Genome-wide association studies (GWAS) of body mass index (BMI) over the past decade have associated several hundred SNPs with body weight and obesity [8,9]. a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 Animal models play essential roles in most aspects of medicine. Diet-induced obesity in zebrafish and mammalian obesity are pathophysiologically similar; thus, the roles of genes associated with visceral adiposity have been examined in zebrafish models of human obesity [10,11]. Understanding the genetic basis of body weight in teleost fish models could help deepen the medical understanding of obesity.
Ecological profiles of body size are distributed within species according to the Bergmann's rule, which states that animals living at high latitudes are generally larger than those living in low latitudes [12]. Some exceptions exist, but the findings of several studies are in line with the Bergmann's rule and its applicability in many types of mammals [13], birds [14,15], and ectothermic vertebrate and invertebrate taxa [16]. However, the underlying genetic mechanisms that result in a body size cline remain obscure.
Medaka are small freshwater fish that are native to Japan, Korea, and China. Japanese wild populations of the Oryzias latipes species complex are widely distributed from high to low latitudes throughout the Japanese archipelago. Previous phylogeographic studies using allozymes, mitochondrial DNA (mtDNA) sequences, and genome-wide SNP analysis revealed that Japanese wild medaka comprise Northern and Southern Japanese populations [17][18][19][20]. The average rate of SNPs between two inbred strains derived from the two populations is 3.4% [21]. Inbred strains and wild stocks of medaka originating from Japanese wild populations are currently available through the National BioResource Project (NBRP) (https://shigen.nig.ac.jp/ medaka/top/top.jsp). Phenotypic variations between these two populations have been described for several traits, including brain [22] and craniofacial morphology [23], body color and sexual dimorphism [24], vertebral regionalization and number [25], aggressiveness [26], startle behavior [27], and male-specific ossified processes and sex steroid levels [28,29]. Several QTL have been detected by focusing on these phenotypic variations using genetic analyses based on draft [21] and updated medaka genome sequences (http://utgenome.org/medaka_ v2/#!Top.md). We investigated two medaka wild stocks originating from different latitudes. Body weight was higher in the Maizuru population than in the Ginoza population from high and low latitudes, respectively. We investigated the genetic basis of body weight via QTL analysis of the F 2 offspring of these medaka populations. We detected one statistically significant QTL for body weight on chromosome 4 and assessed candidate genes located within that QTL region.

Ethics statement
The Animal Experiment Committee at the National Institutes of Natural Sciences, Japan approved the study protocol (14A108, 15A047). The medaka used in these experiments were treated according to the animal experiment guidelines of the National Institutes of Natural Sciences, Japan.

Animals
The NBRP Medaka (https://shigen.nig.ac.jp/medaka/) supplied adult medaka (G 0 generations) from stocks originating from wild North and South Japanese populations at Maizuru City (Maizuru stock; strain ID, WS215) located at 35˚28 0 N 135˚23 0 E, Kyoto Prefecture) and Ginoza Village (Ginoza stock; strain ID, WS255) located at 26˚28 0 N 127˚58 0 E, Kunigamigun, Okinawa Prefecture), respectively. We then raised several generations of these fish in the laboratory. The Maizuru and Ginoza stocks were crossed to collect the G 1 generations, respectively. Then, Maizuru females were mass-mated with Ginoza males and vice versa to obtain the F 1 offspring, which was used to analyze body weight. Five Maizuru × Ginoza pairs were mated to generate the F 2 offspring for QTL mapping. Three of these pairs comprised Ginoza females and Maizuru males and two comprised Maizuru females and Ginoza males.
Two to four weeks after hatching, all the generations were transferred outdoors between June and July, 2014, and maintained for 1 year under natural temperature and photoperiod at the outdoor experimental field of the National Institute of Basic Biology (34˚57 0 N 137˚9 0 E) in Okazaki, Aichi, Japan. Between June and July 2015, the fish were transferred to experimental aquariums and maintained in water circulation systems at 26˚C ± 1˚C for 1 month. The fish were fed with artificial dry food twice daily. Body weight was analyzed in 10 Ginoza G 1 , 26 Maizuru G 1 , 15 F 1 hybrid, and 126 F 2 fish euthanized with 0.05% 3-aminobenzoic acid ethyl ester methanesulfonate salt (MS222). Water was removed from the fish by blotting them with paper towels; then, each fish was weighed. Thereafter, the fish were frozen in liquid nitrogen and stored in a deep freezer (-80˚C).

Restriction site-associated DNA sequencing (RAD-Seq) and SNP markers
Genomic DNA was extracted from muscle tissue using DNeasy Blood & Tissue Kits (Qiagen GmbH, Hilden, Germany) according to the manufacturer's instructions. The concentration of DNA was determined using a Qubit 3.0 fluorometer (Thermo Fisher Scientific Waltham, MA, USA). Genomic DNA (40 ng) from each sample was digested using two restriction enzymes, BglII and EcoRI, ligated with a Y-shaped adaptor, and amplified by polymerase chain reaction (PCR) using KAPA HiFi HS ReadyMix (Kapa Biosystems Inc., Wilmington, MA, USA). Fragments (~300-360 bp) were selected using E-Gel Size Select (Life Technologies, Carlsbad, CA, USA). Details of the library preparation method are described elsewhere [30]. The fragments were sequenced on a HiSeq 2500 platform (Illumina Inc., San Diego, CA, USA) in 50-bp single-end mode. We conducted RAD-Seq in 10 parent fish from the 5 Maizuru × Ginoza pairs, 3-4 from the F 1 generation of each Maizuru × Ginoza pair, and 126 of the F 2 generation. The reads were quality filtered using Trimmomatic [31] under the following parameters: trimmomatic-0.32.jar SE -threads 8 -phred33 ILLUMINACLIP TruSeq3-PE-2.fa:2:30:10 LEADING:19 TRAILING:19 SLIDINGWINDOW:30:20 AVGQUAL:20 MINLEN:51. The trimmed reads were mapped to the draft genome of the Hd-rR inbred medaka strain (v. 2.2.4, http:// utgenome.org/medaka_v2/#!Assembly.md), and SNPs were called using the Stacks pipeline [32]. We identified RAD tags with a homozygous genotype in all Maizuru and all Ginoza and those that had different alleles between all Maizuru and all Ginoza parents for SNP marker selection. Among these markers, we selected those that were heterozygous in all the F 1 individuals and genotyped in > 80% of the 126 F 2 fish. Finally, we selected 371 RAD markers for QTL analysis (S1 Table). Genetic distances (cM) involving each chromosome were calculated using the Kosambi map function [33].

Analysis of QTL
Quantitative trait loci associated with body weight were mapped in the 126 F 2 fish, and simple interval mapping [34] proceeded using R/qtl software [35,36]. Genome-wide significant (5%) and suggestive (10%) thresholds of a single QTL were determined by 1000 permutation tests. Bayesian credible intervals (95% CI) were computed using the R/qtl function. Physical lengths of credible intervals (Mb) were predicted by extending the physical position of the nearest flanking markers.

Analyses of SNPs in the candidate genes
We selected genes within the 95% CI that were positioned according to the NCBI ASM223467v1 (GCF_002234675.1) assembly and associated with growth, body weight, and obesity on the basis of a literature search. We sequenced the selected candidate genes for the Maizuru and Ginoza fish and cataloged the SNPs located on their coding regions (S2 Table). We then analyzed these polymorphisms using GENETYX software (version 13, GENETICS Inc., Tokyo, Japan) to detect variants that could cause amino acid substitutions. We only considered polymorphisms in which all eight sequenced Ginoza and Maizuru individuals (n = 4 each) were homozygous for the allele. We then analyzed the amino acid substitutions using Protein Variation Effect Analyzer (PROVEAN) v1.1 to estimate their functional effects on the encoded protein [37]. A non-synonymous amino acid substitution with a potential functional effect was found on one of the genes (sned1), and its protein sequence (accession number XP_011471983.1) was compared with those of two teleost fishes, Gasterosteus aculeatus (stickleback; ENSGACG00000003698) and Danio rerio (zebrafish; XP_017212114.1), Mus musculus (mice; XP_006529380.1), and Homo sapiens (humans; XP_011509233) using the sequence alignment tool, ClustalW (version 2.1, DNA Data Bank of Japan).

Identification of body weight QTL
A genetic map was constructed using 371 SNP markers obtained by RAD-seq (Fig 3). The total length of the genetic map of the 24 chromosomes was 1496.32 cM, and the average calculated interval between each marker was 4.51 cM (Table 1). Physical lengths were determined based on reference medaka genome data. The total physical length was 676.74 Mb, and the average interval between each marker was 1.75 Mb (Table 1).
A QTL analysis of the 126 F 2 fish identified a statistically significant QTL region on the distal arm of chromosome 4, with a maximum LOD of 4.14 (Figs 3 and 4). The closest SNP marker was 56900. We calculated the mean body weight of the F 2 individuals that were homozygous for the Maizuru and Ginoza alleles of the closest marker and the heterozygous fish. The mean body weight was higher among individuals that carried a homozygous or heterozygous Maizuru allele than among those that carried a homozygous Ginoza allele (Fig 4C). We confirmed a linkage between an amplified fragment length polymorphism marker located on the distal position of chromosome 4 (0.68 Mb, S3 Table) and the SNP marker 56900 (4.7 cM) by genotyping using PCR.

Differences in the amino acid sequences of the candidate genes between Maizuru and Ginoza
The 95% Bayesian CI of the QTL was 0-7 cM (Fig 4B), and the physical location of the CI estimated from the genetic and physical positions of the markers 56900 (0 cM, 2.38 Mb) and 55681 (17.7 cM, 11.01 Mb) that were the closest to the QTL (Fig 3B) was 0-5.74 Mb. Among the 141 genes encoding proteins within the CI, we identified 12 that are reportedly associated with body weight or growth. Nine of these genes had at least one SNP that caused substitutions of amino acids in the coding regions (Table 2). We estimated, using PROVEAN, that most of these amino acid substitutions exerted neutral protein functional effects. However, one substitution in the protein SNED1 encoded by the gene sned1 at G1013S appeared to have a deleterious (significantly different function) effect with a PROVEAN score of -2.648 (cutoff, -2.5). The amino acid glycine at 1013 in Ginoza SNED1 is conserved across other vertebrates, such as teleost, stickleback and zebrafish, as well mice and humans (Fig 5). In contrast, serine, which substituted for glycine in Maizuru SNED1, was not found in any other analyzed species.

Discussion
Medaka fish found at various latitudes provide an excellent model for investigating the genetic basis of body weight. Therefore, we compared two wild medaka stocks from Maizuru and Ginoza at different latitudes. Fish from the parental populations and the F 1 and F 2 generations were reared under the same environmental conditions to avoid the effects of plastic responses to temperature and other variables such as food availability during growth. Mean body weight differed significantly between the Maizuru and Ginoza individuals (Fig 1), reflecting the involvement of genetic components in the determination of body size. The Bergmann's rule is currently defined as a within-species tendency for body size to increase as latitude increases [12]. Our results conformed to Bergmann's rule, as body weight was greater among the Northern Maizuru population than the Southern Ginoza population.
The identification of genes that regulate complex multigenic traits such as growth and body weight has proven challenging. Over 6000 genes are considered to influence body weight in mice [38]. Multiple QTL regions are associated with body weight in Atlantic salmon (Salmo salar) [2]. Our QTL analysis identified a significant QTL on chromosome 4 (peak LOD, 4.14) (Fig 3), and the effects of the allele of the marker with the highest score supported the phenotype found in the parental strains (the body weight was higher for the Maizuru allele than the Ginoza allele in the F 2 medaka). This explains 14% of the variance.
We identified 12 candidate genes involved in body weight and growth regulation within the significant QTL region. The genes cilp2, expressing cartilage intermediate layer protein 2, and sned1, expressing sushi, nidogen and EGF-like domain-containing protein 1, are associated with BMI in humans [39]; mef2b (myocyte-specific enhancer factor 2B), rfxank (regulatory factor X-associated ankyrin-containing protein), and rab6b (Ras-related protein Rab-6B) are associated with body weight and growth traits in sheep [40][41][42]; sgcb (sarcoglycan beta) is   associated with body weight in broilers [43]; scly (selenocysteine lyase), cep19 (19-kDa centrosomal protein), dhcr24 (24-dehydrocholesterol reductase), and lmo4 (LIM domain transcription factor) are associated with body weight and obesity traits in mammals [44][45][46][47]; and mrpl55 (mitochondrial ribosomal protein L55) has a critical role in development and body size in Drosophila [48]. Furthermore, sgta (small glutamine-rich tetratricopeptide repeat-containing protein alpha) is a regulator of growth hormone receptors, which consequently influence body weight, because sgta knockout mice are smaller than wild-type mice [49]. Among the identified candidate genes, nine had amino acid substitutions that distinguished Maizuru from Ginoza ( Table 2). The SNED1 amino acid substitution G1013S was predicted to be deleterious (functionally different) according to the PROVEAN score. In addition, the Ginoza amino acid variant, glycine, was prevalent among other vertebrate groups, whereas the serine residue in Maizuru was not found in other investigated species (Fig 5). SNED1 in humans is known as insulin-responsive sequence DNA-binding protein 1 (IRE-BP1); it is a transcription factor involved in the determination of BMI [37] and the activation of insulin-responsive genes and obesity [50]. SNED1 is located at the terminal region of chromosome 2 in humans. Patients with a deletion in that region, with breakpoints at or within cytogenetic band 2q37, have a short stature among other features [51]. Sned1 might be involved in mouse skeletal development, as Sned1 knockout mice have craniofacial malformations and growth defects [52]. Therefore, we speculate that a functional difference in SNED1 protein activity caused by the amino substitution that distinguished Maizuru from Ginoza induced the difference in body weight. However, unidentified genes in the QTL region might also influence body weight through mechanisms other than differences in protein function, such as changes in their expression levels and/or profiles. These speculations await further investigation.
Future studies are necessary to identify the genetic variation(s) responsible for the differences in body weight between the medaka populations studied here. Nevertheless, the present results will contribute to the marker-assisted selection of economically important aquaculture species and provide a better understanding of the genetic mechanisms underlying ecological differences in body weight among populations at different latitudes.