Analysis of Anasplatyrhynchos genome resequencing data reveals genetic signatures of artificial selection

Three artificially selected duck populations (AS), higher lean meat ratios (LTPD), higher fat ratios (FTPD) and higher quality meat (CMD), have been developed in China, providing excellent populations for investigation of artificial selection effects. However, the genetic signatures of artificial selection are unclear. In this study, we sequenced the genome sequences of these three artificially selected populations and their ancestral population (mallard, M). We then compared the genome sequences between AS and M and between LTPD and FTPD using integrated strategies such as anchoring scaffolds to pseudo-chromosomes, mutation detection, selective screening, GO analysis, qRT-PCR, and protein multiple sequences alignment to uncover genetic signatures of selection. We anchored duck scaffolds to pseudo-chromosomes and obtained 28 pseudo-chromosomes, accounting for 84% of duck genome in length. Totally 78 and 99 genes were found to be under selection between AS and M and between LTPD and FTPD. Genes under selection between AS and M mainly involved in pigmentation and heart rates, while genes under selection between LTPD and FTPD involved in muscle development and fat deposition. A heart rate regulator (HCN1), the strongest selected gene between AS and M, harbored a GC deletion in AS and displayed higher mRNA expression level in M than in AS. IGF2R, a regulator of skeletal muscle mass, was found to be under selection between FTPD and LTPD. We also found two nonsynonymous substitutions in IGF2R, which might lead to higher IGF2R mRNA expression level in FTPD than LTPD, indicating the two nonsynonymous substitutions might play a key role for the regulation of duck skeletal muscle mass. Taken together, these results of this study provide valuable insight for the genetic basis of duck artificial selection.

Introduction Based on Fisher's theory of natural selection in 1930, traits are associated with evolutionary fitness, such as morphology and complex physiology [1]. Validation of Fisher's theorem has been demonstrated by the successful production of improved breeds for a wide range of species using artificial selection, including pigs [2], chickens [3], cattle [4], sheep [5], and inbred rat strains [6]. Currently, it has been shown that next generation sequencing technologies are very effective in identifying the genetic basis of improved or domesticated species. Next generation sequencing technologies are therefore widely used to explore genetic basis in a number of species including passenger pigeons [7], pigs [8], chickens [9], dogs [10], rabbits [11], polar bears [12] and cormorants [13].
Ducks (Anasplatyrhynchos) dissociated from chickens, zebra finches, and turkeys approximately 90-100 million years ago [14]. Ducks display great differences in morphology [15], physiology [16], and behavior [17] comparing to chickens, zebra finches, and turkeys due to a long period of differentiation selection. In addition, remarkable changes induced by natural and/or artificial selection have occurred in domesticated duck breeds compared to their wild ancestor (mallard) [18]. In 2013, the duck genome was sequenced [19] (http://www.ensembl. org/Anas_platyrhynchos/Info/Index), providing a platform for investigation of the mechanisms underlying artificial selection and domestication in ducks. It greatly facilitates us to investigate the genetic basis underlying differential traits through comparison of genome sequences of different duck populations.
China is the No. 1 country in the world in terms of the production and consumption of ducks, with the total number of ducks produced in China accounting for more than 90% of global duck production in 2014 (2,121,194,000 in China out of 2,324,224,000 worldwide) (FAO, http://www.fao.org/faostat/en/#data/QL). Pekin duck, a typical variety in China, was first bred in the 1960s by the Chinese Academy of Agricultural Sciences. Since then two new strains, lean-type Pekin ducks (LTPD) and fat-type Pekin ducks (FTPD), have been developed [20]. They differ significantly in lean meat ratio and fatness ratio (S1 Table). In addition, using the Sheldrake breed as a breeding material, a new duck strain named the China Micro-duck (CMD) characterized with high quality meat and white feather has also been developed in China.
To investigate the genetic signatures underlying artificial selection in these new duck breeds, genomic sequences of all three breeds (LTPD, FTPD, and CMD, referred to collectively as the artificial selection population (AS)) were compared with their ancestor (mallard, M). These populations display many differential characteristics and phenotypes (S1 Table). For example, LTPD and FTPD populations differ significantly in skeletal muscle development and fat deposition, whereas AS and M populations differ in feather color and heart rates. Therefore, we compared the genomic sequences of LTPD and FTPD to investigate the genetic signatures underlying differential skeletal muscle development and fat deposition, and the genomic sequences of AS and M to investigate the genetic signatures controlling feather color and heart rates.
In this study, we carried out whole genome resequencing for the four duck populations using pools of genomic DNA with an average coverage of~40× per pool. SNPs were detected for each population. Genomic regions under selection and genes overlapping these regions were identified and the functional effects of mutations harbored by selected genes were also investigated. We identified a two base (GC) deletion in HCN1 likely responsible for decreased heart rates observed in AS compared to M population. In addition, two SNPs harbored by IGF2R may contribute to increased skeletal muscle percentage. These results provide a better understanding of the mechanism underlying the altered skeletal muscle development, fat deposition, feather colors, and heart rates during artificial selection in ducks.

Anchoring duck scaffolds to pseudo-chromosomes
The current duck genome (http://www.ensembl.org/Anas_platyrhynchos/Info/Index) lacks the linkage information to align scaffolds at the chromosome level, which impedes the identification of continuous genomic signatures under selection and genes overlapping these signatures. Comparative genomics analysis has revealed that ducks have the closest genetic relationship with chickens and turkeys, whose genomes have been assembled with assigned chromosomes [21]. Based on chromosomal collinearity between duck and chicken, and between duck and turkey, we assigned duck scaffolds to pseudo-chromosomes. We obtained 28 pseudo-chromosomes representing 27 autosomes and one sex chromosome (chromosome Z) (Fig 1, S2 Table), which accounts for 70% (28 out of 40) of the duck chromosomes [22]. The 28 pseudo-chromosomes spanned~923 Mb, accounting for 84% of the assembled duck genome (1,104 Mb). The remaining scaffolds (length < 2 kb and not assigned to pseudo-chromosomes) were randomly linked as pseudo-chromosome UN and used for downstream analysis.

A load of variants were detected in duck genome
We performed Pool-seq to study the genomic variation of the four duck populations (LTPD, FTPD, CMD, and M; n = 30/group). The DNA pools for each population were paired-end sequenced to generate an average of~40x coverage per pool, resulting in a total coverage of 168.44x (185.96 Gb; S3 and S4 Tables). Mapping the reads to the duck genome resulted in an average alignment rate of 89.71%, of which 86.75% were uniquely mapped ( Table 1).
Analysis of genomic mutations, such as SNPs and short insertions and deletions (INDELs), is the basis of investigating the genetic mechanism underlying artificial selection at the genomic level. In this study, we detected SNPs and INDELs located in each of anchored pseudochromosomes and pseudo-chromosome UN, and summarized the results in a circular ideogram layout (Fig 1). In total, 11,393,231 high quality SNPs were identified across the four duck populations (Table 2). Of the SNPs harbored by genes (4,188,685),~95% (3,977,357) were located in introns (S5 Table). In total, 352 nonsense mutations resulting in incomplete and usually nonfunctional proteins were identified, with an additional 1,431 mutations predicted to effect protein functionality (S5 Table). In addition, we identified 620,677 INDELs across the four duck populations ( Table 2). Similar to SNPs, the majority of INDELs in gene regions (215,606) were located in intronic regions (S6 Table) with only 0.29% (2,194) of INDELs predicted to highly affect protein functionality (S6 Table). These results indicate only a small proportion of the identified mutations result in altered or loss of gene function and that loss of gene function is not the predominate cause of the differential phenotypes resulting from artificial selection in duck populations, which is consistent with previous reports in rabbits [11], pigs [23] and chickens [9].
A phylogenetic tree is a branching diagram or "tree" showing the inferred evolutionary relationships among various biological species or subspecies based upon similarities and differences in genetic characteristics. In this study, we constructed a phylogenetic tree to illustrate the genetic relationships of the four duck populations using SNPhylo software with the default parameters and the allele frequencies of SNPs detected in each population [24] (S1 Fig). The results indicated the LTPD and FTDP populations are the closest genetically related The innermost circle S � shows the anchored scaffolds ordered by pseudo-chromosomes. Circle P � represents the anchored duck pseudo-chromosomes. Circle C � and T � represent the distribution of corresponding scaffolds from the duck genome to chicken and to turkey genomes respectively. Circle SNP presents the distribution of SNP frequency within a bin size of 1k along each pseudochromosome. Circle INDEL presents the distribution of INDEL frequency within a bin size of 1k along each pseudo-chromosome.
https://doi.org/10.1371/journal.pone.0211908.g001 populations, with the CMD population more closely related to the LTPD and FTDP populations than the ancestral M population.

Many genomic regions associated with artificial selection were found
Selective sweeps occur when beneficial genetic variants increase in frequency due to positive selection [11]. In addition, positive selection always leads to reduced heterozygosity of the selected population and increased differentiation between populations around the selected site [10]. Therefore, genomic regions under selection can be identified based on reduced heterozygosity or increased differentiation in the selected populations. In this study, regions under selection between the LTPD and FTPD populations, as well as the M and AS populations were investigated by identifying regions with an increased fixation index (Fst) and reduced pooled heterozygosity (Hp) [25] using a 40Kb-sliding window (step = 20Kb). The distribution of observed Fst and Hp are shown in Fig 2A-2D for both comparisons. In total, 133 windows (76 continuous regions) overlapping 78 genes were found to be under selection between the M and AS populations (S7 Table), while 134 windows (76 continuous domains) overlapping 99 genes were found to be under selection between the FTPD and LTPD populations (S8 Table). Given the comprehensive sampling in our study and the correlation in allele frequencies amongst the populations studied, highly differentiated SNPs are likely to have either been directly targeted by selection or occurred in the vicinity of loci under selection. Therefore, we calculated the absolute allele frequency (ΔAF) for each SNP located in regions under selection and sorted them into 10% bins (i.e. ΔAF = 0.00 to 0.10, 0.10 to 0.20, etc.) (Fig 3, S1 and S2 Data). Of the 19,100 SNPs located in regions under selection between the AS and M populations, 9,761 displayed low ΔAF (� 0.40) and 9,339 displayed high ΔAF (� 0.41). Notably,~1% (1,580) of identified SNPs were fixed or nearly fixed in one group (ΔAF � 0.81). A much higher number of SNPs (26,784) were located in regions under selection in FTPD and LTPD comparison compared to that in M and AS comparison, however, only 35 of them were fixed or nearly fixed in one group (ΔAF � 0.81). These results suggest that although the artificial selection imposed on the FTPD and LTPD populations resulted in a high number of selected SNPs, the majority of detected SNPs were not fixed over the relatively short time span (~40 years).

Genes detection located in artificial selection
To detect genes targeted by artificial selection in ducks, we identified genes under selection between the M and AS populations and the LTPD and FTPD populations. In total, 78 genes were located in regions under selection between the M and AS populations (S7 Table), many of which are involved in morphology and physiology. Two of the selected genes (MITF and LYST) are known to be crucial for feather coloring [26][27][28][29]. Selected genes involved in biosynthetic processes include PTGS2, a key enzyme in prostaglandin biosynthesis that inhibits female reproductive processes when disrupted in mice [30], IGF1R, which results in growth retardation when mutated in humans [31,32], and MEF2A, which induces myogenic development and is involved in skeletal muscle regeneration [33,34]. GO analysis was carried out to identify the functional role of genes under selection. GO terms enriched for genes under  (Table 3 and S9 Table), indicating that feather color and substance synthesis have been strongly selected during the artificial selection process. LTPD and FTPD have been produced through artificial selection of ducks with higher breast muscle percentage or higher carcass fatness percentage since the 1990s, respectively [20]. In this study, 99 genes were found to be under selection between the FTPD and LTPD populations (S8 Table), 16 of which are involved in skeletal muscle development and fat deposition (S8 Table). GO analysis of the genes under selection between the FTPD and LTPD populations confirmed enrichment of genes involved in skeletal muscle development and fat deposition. Six out of top 10 terms were involved in fat deposition and muscle development (P-value = 1.3×10 −3 to 3.7 ×10 −3 ). The remained four terms were related with sensory perception (P-value = 1.5×10 −3 to 3.2 ×10 −3 ) (Table 3 and S10 Table), suggesting differences sense of smell between the FTPD and LTPD populations.

HCN1 is associated with reduced heart rates in AS populations
It had been reported that HCN1 is expressed in the sinoatrial node [35][36][37] and contributes to stable heart rates in mice [38]. In this study, HCN1 was identified as the gene under the strongest selective pressure (Fst = 0.78) between the M and AS populations with a lower Hp value (0.14) in the AS population (Figs 2A, 2B, and 4A and S11 Table). This result suggests significant differences in heart rates may exist between M and AS ducks. Therefore, we compared the heart rates of M (n = 59) and AS (n = 210) ducks, identifying a significantly higher average heart rate in M (201.70 beats/minute) compared to AS (174.60 beats/minute; pvalue = 0.00247) (Fig 4B). Although no nonsynonymous mutations were identified in the HCN1 gene, a two base (GC) deletion at base 1,357 of HCN1 resulted in expression of a splice variant in the AS population (S12 Table). In addition, M ducks displayed higher HCN1 mRNA expression level compared to AS ducks (n = 6/group) (Fig 4C). These results indicate that artificial selection resulted in reduced HCN1 expression is likely responsible for the reduced heart rates observed in AS compared to M ducks.

IGF2R is associated with increased lean meat ratios in LTPD populations
As a result of artificial selection, the percentage and thickness of breast muscle has significantly increased, while the skin fat percentage has significantly decreased in LTPD compared to FTPD (Table 4, S1 Table and S2 Fig). These differences indicate that selective pressure exerted on LTPD and FTPD populations during artificial selection has had significant effects on skeletal muscle development and fat deposition.
IGF2 plays a crucial role in muscle mass development in pigs [39] and in mice [40]. In addition, previous studies have indicated IGF2R, a negative regulator of IGF2 [41], plays a role in determining lean meat ratios [42]. Therefore, we investigated relationship between IGF2R and differential lean meat ratios and fat percentages observed between LTPD and FTPD populations. In this study, IGF2R was found to be under selection between the LTPD and FTPD populations (Fst = 0.39, HP(FTPD) = 0.45, Hp(LTPD) = 0.21) (Figs 2C, 2D, and 5A and S13 Table). Four SNPs resulting in three amino acid substitutions were identified within the IGF2R gene (S14 Table). The first nonsynonymous mutation was a 241A>G substitution resulting in an Ile81Val substitution in the duck IGF2R protein. Because this Ile81Val   substitution is not located in a conserved or functional region, we did not investigate this site further. The second nonsynonymous mutation identified was a 5119A>G mutation leading to a Val1707Ile substitution in the IGF2R protein. A Val1707Ile substation located in the highly conserved CIMR region of IGF2R was also identified based on comparison of the duck IGF2R   [43]. This Val1707Ile amino acid substitution was observed in the LTPD population and is highly conserved across 14 species (Fig 5C). Due to codon degeneracy, the third (5509T>C) nonsynonymous mutation combined with the fourth (5511G>A) nonsynonymous mutation resulted in a Trp1837Arg substitution in the LTPD population. Multi-species protein sequence alignments revealed the Arg substitution observed in LTPD is also highly conserved in the majority of bird species profiled to date (Fig 5B). Studies in humans have demonstrated that IGF2R is a transmembrane receptor molecule with a large extracellular domain comprised of 15 repeat regions and a small intracellular region. The 13th extracellular repeat region is responsible for regulating the binding affinity of IGF2R to IGF2 (Fig 5C) [44,45]. Based on comparison of the duck and human IGF2R protein sequences, the Trp1837Arg substitution was found to be located in the 13 th extracellular repeat region of the duck IGF2R protein (Fig 5C). Thus, this amino acid substitution is likely responsible for functional differences leading to increased lean meat ratios in LTPD compared to FTPD populations. In order to confirm the presence of two of the nonsynonymous mutations (5119A>G, and 5509T>C) in the IGF2R gene, which are likely responsible for the differential phenotypes observed between LTPD and FTPD populations, we cloned and sequenced the relevant regions of the IGF2R gene (S4-S7 Figs). In addition, we also compared the allele frequencies at the two SNP sites. At the 5119A>G mutation site, the G allele frequency was higher in the LTPD than in FTPD population (0.9130 and 0.2955, respectively) (S15 Table and S4 Fig), indicating near fixation of the G allele in the LTPD population. For the 5509T>C mutation, the T allele frequency was higher in the LTPD than in FTPD population (0.8913 and 0.2727, respectively) (S16 Table and S5 Fig).
To decipher whether the candidate mutations affect gene expression, we examined IGF2R mRNA expression level in the breast muscle of FTPD and LTPD populations. The result showed IGF2R mRNA level was significantly higher in breast muscle of FTPD compared to LTPD populations (Fig 5D). These results indicate that artificial selection has resulted in selection for IGF2R mutations and differential IGF2R protein levels in breast muscle between LTPD and FTPD populations. Previous studies have identified mutations in crucial genes leading to phenotypic differences between animal populations, such as the MGAM gene in dogs, which catalyzes the hydrolysis of maltose to glucose, and the MITF and KIT genes in dogs and pigs, which affect coat color [10,46]. Therefore, the mutations observed between populations in this study indicate IGF2R is a candidate gene for regulation of skeletal muscle mass in ducks.
Whole genome sequencing using pooled or individual samples provides an extremely powerful approach for detecting genetic differences associated with phenotypic traits in animals [47]. With this approach, diverse genetic signatures of domestication and evolution have been elucidated in a number of animal species [8][9][10][11][12][13]. In ducks, muscle growth and lipid deposition are two important features representing the main breeding objectives. Previously, 10 and 11 candidate genes related to duck muscle growth and lipid deposition have been reported [48]. However, the selected genes identified in this study do not overlapped with these previously reported candidate genes. As the previous study was performed using different duck breeds (a native Pekin duck with higher fat content and a Pekin duck bred crossbred with a native British duck breed with higher lean meat percentage and intramuscular fat content) [48,49], the differences observed between these studies suggest differential adaption mechanisms underlie the same phenotypic changes observed across distinct duck populations.
In conclusion, the results of this study broaden our knowledge of the effects of artificial selection on the duck genome, shedding light on the allelic variation underlying relevant production traits across industrially relevant duck populations. These results will enable the future improvement of duck breeding schemes and support further investigation of the mechanisms underlying artificial selection in ducks.

Anchoring duck scaffolds to pseudo-chromosomes
We utilized the chicken (Gallus gallus, Ensembl version: 4.76) and turkey (Meleagris gallopavo, Ensembl version: UMD2.76) chromosomal assemblies to anchor the duck scaffolds to pseudochromosomal sequences. We first performed whole genome alignment of duck scaffolds longer than 2 kb with harboring at least two protein-coding genes against chicken chromosome sequences using the promer program in the MUMmer 3.0 package [50]. We then filtered the alignments using delta-filter, with the-g parameter enabled to filter for 1-to-1 global alignments without rearrangements. Alignments with an identity < 30 and alignment uniqueness (percent of the alignment matching to unique reference and query sequences) < 50 were discarded. The resulting alignments indicating ambiguous order and orientation of duck scaffold sequences were visualized as dot plots for manually checking. The same methods were used to align scaffolds to the turkey chromosomal assembly. The anchored pseudo-chromosomes were finalized by discarding scaffolds with inconsistent ordering or orientation when aligned to the chicken and turkey chromosomes. In total, 1,141 scaffolds were anchored onto 28 pseudo-chromosomes.

Sample preparation
All protocols of birds handling and sampling were approved by the Animal Care and Use Committee of Chinese Academy of Agricultural Sciences (CAAS), and all efforts were made to minimize the suffering of animals according to recommendations proposed by the European Commission (1997). The study was carried out in accordance with the approved protocol. All methods were conducted in accordance with relevant guidelines. Birds were slaughtered using the electric shock method followed by jugular vein bloodletting method within 30 seconds to ameliorate their suffering. LTPD, FTPD, CMD, and M duck populations were used to collect venous blood for genomic DNA extraction in this study. LTPD, FTPD, and CMD populations developed by the Institute of Animal Science, Chinese Academy of Agricultural Sciences were sampled. Mallard (M), a widely accepted ancestor of domesticated ducks [18], raised by the Ji'ao Austrian Agricultural Science and Technology Co., Ltd in Fenghua City, Zhejiang Province, was also sampled. For each population, we randomly selected 30 healthy adult birds and collected blood from the wing vein to extract DNA as previously described [51]. DNA from each individual was mixed in equimolar ratios to generate a DNA pool for sequencing. Additional DNA from each individual was used to clone genes crucial for relevant production traits.

Library construction and sequencing
Paired-end sequencing libraries were generated for each pool using standard Illumina sequencing protocols. The constructed libraries were sequenced as 100 bp paired-end reads on an Illumina HiSeq 2000, resulting in~40x coverage per pool (S3 Table).

Read mapping
We used the Burrows-Wheeler Alignment tool (BWA) [52] with default settings to map the raw reads to the duck genome assembly (ensemble version: BGI_duck_1.076), resulting in an alignment rate of nearly 90% per pool (S4 Table). We sorted the alignments and removed PCR duplicates using the picard-tools MarkDuplicates.jar package (http://picard.sourceforge.net/). To improve the alignment accuracy, we performed 'multiple sequence realigning' around putative INDEL regions using the RealignerTargetCreator/IndelRealigne programs in the Genome Analysis Toolkit (GATK, http://www.broadinstitute.org/gatk/). The remained alignments were used for downstream sequence variation analysis.

SNP and INDEL detection and annotation
We used a combination of GATK [53] and freebayes [54] to detect SNPs and INDELs. First, GATK and freebayes were run using default settings independently to produce a pair of raw calling sets. Putative variants from freebayes with a QUAL score < 30 were discarded. We then applied stringent parameters to filter and integrate the two sets of results. Due to the distinct quality properties of SNPs and INDELs, we applied different filtering criteria for each. For SNPs, SelectVariants in GAKT was used to identify SNPs consistently called between GATK and freebayes. Consistently called SNPs were subjected to a hard filter step using the parameters recommended by the GATK mentor: QD < 2.0 || FS > 60.0 || MQ < 40.0 || Map-pingQualityRankSum < -12.5 || ReadPosRankSum < -8.0.3. Variants with ultra-high (> 500) or ultra-low (< 10) coverage were discarded. Finally, bi-allelic variants with allele frequency greater than 0.05 were included in the final set of variants. For INDELs, SelectVariants in GAKT was used to identify INDELs consistently called between GATK and freebayes. Consistently called INDELs were subjected to a hard filter step using the parameters recommended by the GATK mentor: QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0.3. Variants with ultra-high (> 500) or ultra-low (< 10) coverage were further discarded. Finally, bi-allelic variants shorter than 5 bp and an allele frequency greater than 0.05 were included in the final set of variants.
To explore the location and effect of sequence variants on gene transcription and functionality, the identified SNPs and INDELs were annotated against the reference genome annotation using snpEff [55].

Analysis of selective sweeps, genes under selection, and variants harbored by selected genes
We used allele counts at SNP positions of sliding windows (bin size, 40 Kb; step size, 20 Kb) to identify signatures of selection. To reduce the number of false positives, windows contained > 10 SNPs were used in downstream analysis. For each comparison between two pools of sequencing data, we calculated Fst using popoolation2 [56] and Hp according to [9]. A window was defined as a potentially selected window used in downstream analysis if it simultaneously met the following two criteria: (i) Fst value of the window should fall into the top 200 highest Fst values among all the windows across the duck genome, and (ii) Hp value of the window should fall into the smallest 400 Hp values in at least one duck population in the two compared groups (M-AS or FTPD-LTPD). Genes overlapping with those windows were defined as genes under selection. We then calculated FST and a pooled heterozygosity score, Hp, at individual variant sites. In order to identify variants associated with selective sweeps, we manually analyzed the SNPs and INDELs located in coding regions of genes under selection. We focused on variants: 1) in genes under selection; 2) resulting in nonsynonymous amino acid changes or nonsense mutations; and 3) were evolutionary conserved. The conversed positions of each analyzed protein sequence were retrieved from multiple protein sequence alignments calculated by MAFFT [57]. We also analyzed all the aforementioned variations using PROVEAN (http://provean.jcvi.org/index.php), which predicts the impact of mutations on the biological function of proteins based on conservation analysis of homologs automatically searched for in the NCBI Nr database.

GO analysis of selected genes
We used the duck Gene Ontology (GO) annotation information from the Ensembl Genome Browser to perform enrichment analysis of genes under selection. For each query, we tested over represented GO terms against a background (the whole genome set of protein-coding genes in the annotation) using the GOstats Bioconductor R package (https://www.r-project. org/) [58].

The alignment of multiple protein sequences of IGF2R proteins
To investigate the conservation of amino acid substitutions occurring at amino acid positions 1,707 and 1,837 of the duck IGF2R protein, we obtained the duck IGF2R protein sequences of FTPD and LTPD. We then downloaded the IGF2R protein sequences deriving from 12 additional species. The online multiple protein sequence alignment tool, COBALT (www.ncbi. nlm.nih.gov/tools/cobalt/) [59], was used to identify conserved amino acids at positions 1,707 and 1,837 of the duck IGF2R protein.

Validation of IGF2R SNPs
PCR amplification and DNA sequencing approaches were used to validate the two SNPs harbored by the duck IGF2R gene (5119A>G, and 5509T>C). The two primer pairs used are listed in S17 Table. The total reaction volume for each PCR reaction was 20 μL, containing 10 μL of 2×Premix Taq PCR Solution, 0.7 μL of each primer (S17 Table), 1 μL of normalized template cDNA, and 7.6 μL ultrapure water. PCRs were performed as follows: 1 cycle of denaturation at 94˚C for 5 min, 36 cycles of denaturation at 94˚C for 30 s, annealing at 50˚C for 40 s, extension at 72˚C for 2 min, and a final extension at 72˚C for 10 min. The PCR products were separated using 1.2% agarose gel electrophoresis, and the target fragments were retrieved and purified by the EZgene Gel/PCR Extraction Kit (Biomiga, Shanghai, China) for DNA sequencing.

Detection of IGF2R and HCN1 mRNA expression
qRT-PCR was used to detect IGF2R and HCN1 mRNA expression levels and the corresponding primers are presented in S18 Table. qRT-PCR was performed using the SYBR PrimeScript RT-PCR Kit (TaKaRa, Dalian, China) with SYBR Green dye as described previously [60]. Briefly, qRT-PCR reactions were carried out with an iCycler IQ5 Multicolor Real-Time PCR Detection System (Bio-Rad, USA). The qRT-PCR reaction volume was 25 μL, contained 1 μL of cDNA template, 12.5 μL of SYBR Premix ExTaq, 9.5 μL of sterile water, and 1 μL of each gene-specific primer. Thermal cycling parameters were 1 cycle at 95˚C for 2 min, 40 cycles of 95˚C for 15 s, and 60˚C for 34 s. Dissociation curve analysis was done after each real-time reaction to ensure that there was only one product. The qRT-PCR analysis of each sample was done in triplicate.