Korean Hanwoo cattle have been subjected to intensive artificial selection over the past four decades to improve meat production traits. Another three cattle varieties very closely related to Hanwoo reside in Korea (Jeju Black and Brindle) and in China (Yanbian). These breeds have not been part of a breeding scheme to improve production traits. Here, we compare the selected Hanwoo against these similar but presumably unselected populations to identify genomic regions that have been under recent selection pressure due to the breeding program. Rsb statistics were used to contrast the genomes of Hanwoo versus a pooled sample of the three unselected populations (UN). We identified 37 significant SNPs (FDR corrected) in the HW/UN comparison and 21 known protein coding genes were within 1 Mb of the identified SNPs. These genes were previously reported to affect traits important for meat production (14 genes), reproduction including mammary gland development (3 genes), coat color (2 genes), and genes affecting behavioral traits in a broader sense (2 genes). We subsequently sequenced (Illumina HiSeq 2000 platform) 10 individuals of the brown Hanwoo and the Chinese Yanbian to identify SNPs within the candidate genomic regions. Based on allele frequency differences, haplotype structures, and literature research, we singled out one non-synonymous SNP in the APP gene (APP: c.569C>T, Ala199Val) and predicted the mutational effect on the protein structure. We found that protein-protein interactions might be impaired due to increased exposed hydrophobic surfaces of the mutated protein. The APP gene has also been reported to affect meat tenderness in pigs and obesity in humans. Meat tenderness has been linked to intramuscular fat content, which is one of the main breeding goals for brown Hanwoo, potentially supporting a causal influence of the herein described nsSNP in the APP gene.
Citation: Lim D, Strucken EM, Choi BH, Chai HH, Cho YM, Jang GW, et al. (2016) Genomic Footprints in Selected and Unselected Beef Cattle Breeds in Korea. PLoS ONE 11(3): e0151324. https://doi.org/10.1371/journal.pone.0151324
Editor: Gudrun A. Brockmann, Humboldt-University Berlin, GERMANY
Received: June 14, 2015; Accepted: February 26, 2016; Published: March 29, 2016
Copyright: © 2016 Lim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All SNPs used in this study have been submitted to the NCBI dbSNP with the accession numbers: ss 1850337119-ss 1867919545.
Funding: This work was supported by Post-Genome project, Rural Development Administration (RDA), Korea, Grant Number: PJ010406.
Competing interests: The authors have declared that no competing interests exist.
Hanwoo is an indigenous cattle breed of Korea that has been maintained with minimal introduction of other germplasm for >2,000 years. The brown Hanwoo have been subjected to intensive artificial selection over the past four decades to improve meat production traits such as carcass weight, eye muscle area, marbling (intramuscular fat), and meat tenderness [2, 3]. This artificial selection has potentially increased the frequency of favorable alleles in brown Hanwoo (HW) at loci affecting the production traits. Nowadays, HW is one of the top breeds for intramuscular fat next to Japanese Wagyu cattle. Outside of the breeding program for HW, two closely related but unselected varieties survived in Korea: Brindle Hanwoo (BR) and Jeju Black Hanwoo (JB) [4, 5]. These two strains have small populations with around 4,000 BR individuals and 1,000 JB. Each population is primarily found on two islands off the coast of Korea and might be considered as island populations in which a founder effect and genetic drift contributed to a loss of variation and differentiation from the mainland HW. Chinese Yanbian (YB) cattle are found in the Yanbian Prefecture in China, north of the border of North Korea. The region has a strong Korean influence (60% of the population were Korean until the 1950s) and the Yanbian cattle were potentially fully connected to the brown Hanwoo until the split between North and South Korea. Yanbian are mainly used as draft animals and remained unselected for production traits.
Several population genetic processes shift allele frequencies, such as migration and admixture, bottlenecks, or random genetic drift. However, only sustained inbreeding or selection will increase homozygosity in a population in a short amount of time. Whilst inbreeding affects the entire genome and is minimized as far as possible in modern selection programs, selection accumulates favorable alleles in specific genetic regions. Genome scans for such signatures of selection in single breeds have identified known candidate genes such as DGAT1, the casein cluster, or GHR [7–9]. Statistics applied to a single breed include the site frequency spectrum (SFS) or extended haplotype homozygosity (EHH) [10–12]. A more powerful approach is to compare genomes of populations that have been under different selection pressures. Such comparison studies were successfully carried out between dairy and beef breeds [13, 14], New World, European, and African cattle breeds, or within East-Asian cattle breeds. Comparison statistics include FST as well as XPEHH and Rsb, the latter two of which are derived from the EHH statistic [17–19]. All of these statistics were designed to assess the similarity or difference of two populations based on genomic information.
We decided to compare the genomes of the selected HW with the unselected breeds of BR, JB, and YB to narrow down genetic regions with accumulated homozygosity in the HW. Such regions might have been shaped by the intensive artificial selection in the HW, and therefore, could be hotspots for genes affecting the traits emphasized in the Korean breeding program. We investigated the genes with known effects that are located in the identified selective sweep regions to provide explanations to why these regions might be under selection pressure in the HW. Lastly, we sequenced the entire genome of selected HW and YB individuals to obtain a more precise picture of sequence variations in regions of interest that might not have been captured by genome-wide markers. The DNA sequence was further used to predict protein structures and the impact of found sequence variations on protein structure.
Results and Discussion
Population structure and diversity
We performed a principal component analysis (PCA) and an Admixture analysis based on genotypic information, which provided information about the genetic relationship of breeds and of individuals within the breeds. The average genomic relationship between animals within the current population was 0.003, 0.049, 0.05 and 0.037 for HW, BR, JB, and YB, respectively (S1 Fig). In addition to the four East-Asian breeds, we included a European taurine breed (Angus) in the analysis to provide an anchor point for the breed diversification. The PCA results grouped individuals of the four East-Asian breeds and the Angus breed in agreement with their origins (Figs 1 and 2). Some of the YB cattle spread from the East-Asian cluster towards the out-group of Angus cattle. This could suggest that YB either had some unaccounted intercrossing with European cattle breeds, or that genetic artefacts of the same domestication origin as European taurine remain in this unselected population. Further, HW and YB cattle are genetically very similar whilst BR and JB are more distinguishable (Table 1, Fig 2), probably due to the more intense drift and founder effects in these small populations. Nevertheless, compared to other breed differences, these two cattle varieties can be classified as Hanwoo.
The first component separates the European Angus from the East-Asian cattle breeds and explains 26.4% of the variation amongst the individuals. The second component separates the East-Asian breeds and explains 6% of the variation. HW: Hanwoo; BR: Brindle Hanwoo; JB: Jeju Black Hanwoo; YB: Chinese Yanbian; AG: Angus.
Each color depicts the breed proportions per individual and based on allele frequencies of the breeds available for this study. HW: Hanwoo; BR: Brindle Hanwoo; JB: Jeju Black Hanwoo; YB: Chinese Yanbian; AG: Angus.
Based on Admixture results, the BR and JB cattle have on average 30% of their allelic frequencies in common with HW, and YB share on average 67.2% of their genetic make-up with HW (Fig 2B). As a comparison, the East-Asian cattle breeds shared only between 0.2% and 1.5% of their genetic make-up with the European Angus cattle. Further, FST statistics, which describe differences between two populations based on allele frequencies, also showed that the Angus cattle were more distantly related to the Korean breeds than these breeds are to each other (Table 1). Closest relationships were found between the YB and the other Korean breeds, especially with HW. Lastly, genomic relationships within the populations were <0.01 for the East-Asian breeds, indicating low relatedness and random sampling of animals. The similarity of the selected East-Asian cattle breeds makes them a good choice to detect recent selection footprints in the brown Hanwoo caused by the relatively recent breeding program.
Identification of footprints of selection
Footprints of selection were analyzed with the Rsb statistic, which is based on extended haplotype homozygosities (EHH). Rsb scores are designed to detect recent selective sweep regions that occur in only one population compared to another. We compared the HW against a pooled sample of unselected populations (UN, including the BR, JB, and YB) to increase the population size and create a better contrast. All chromosomes displayed regions with significant PRsb scores (PRsb = -log[Φ(Rsb)]) indicating HW specific sweep regions. To narrow down the regions of interest, we created 5,103 windows across the genome covering 1 Mb each (average 14.96 SNPs per window). Candidate regions were defined as those windows containing at least two SNPs with PRsb > 2 (P-value < 0.01). At this stage, 31 suggestive regions containing 185 SNPs were identified (S1 Table). We further applied the Benjamini-Hochberg false discovery rate (FDR P-value < 0.05) to correct for multiple testing (Fig 3). Out of the 31 suggestive regions, 16 candidate regions containing 37 SNPs remained significant (Table 2).
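To make the EHH quantity underlying Rsb concrete, the following is a minimal Python sketch of how haplotype homozygosity decays with distance from a focal SNP. The function name `ehh` and the toy haplotypes are illustrative assumptions; the analysis in this study used the rehh package in R, not this code.

```python
from collections import Counter
from math import comb

def ehh(haplotypes, core, upto):
    """Probability that two randomly drawn haplotypes are identical
    at every marker between index `core` and index `upto` (inclusive).

    haplotypes: sequences of 0/1 alleles, all of the same length,
    assumed here to carry the same allele at the core marker.
    """
    lo, hi = sorted((core, upto))
    groups = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    n = len(haplotypes)
    # Number of identical pairs divided by the number of all pairs.
    return sum(comb(count, 2) for count in groups.values()) / comb(n, 2)

# Toy example (hypothetical data): four phased haplotypes, three markers.
haps = [(1, 0, 0), (1, 0, 0), (1, 0, 1), (1, 1, 1)]
decay = [ehh(haps, 0, j) for j in range(3)]
```

In this toy example the homozygosity falls as the segment is extended away from the core marker. A recent sweep would show up as a slower decay in the selected population than in the comparison population.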
Red dots indicate significant markers remaining after FDR correction.
All but one significant candidate region overlapped with QTLs for meat and production traits according to a search in the Animal QTLdb (Release 17; http://www.animalgenome.org/cgi-bin/QTLdb/BT/index). Among the 16 significant candidate regions, 21 known protein coding genes were located within 300 kb up- or downstream of the SNPs. Most of these genes (14 genes) are known to affect meat production, obesity, or lipid metabolism, such as GHRH, APP [23–25], or TRPC1.
The brown Hanwoo cattle have been selected for meat production since the 1970s. The breeding goal heavily emphasizes traits such as marbling score and meat tenderness, carcass weight, backfat thickness, and eye muscle area. Over the past 30 years, the annual genetic gain in HW increased from 0.02 to 0.82 kg/year, with body weight (at 18 months of age) increasing from 331 to 574 kg, and intramuscular fat (final slaughter age) from 15% to 23% (NIAS, 2009). Therefore, we can assume that changes to the genetic make-up of HW cattle in favor of meat enhancing alleles have taken place, which is supported by the large number of QTLs and genes affecting meat traits within the identified candidate regions (Table 2).
Other genes affected reproduction including mammary gland development (3 genes: MTMR2, CWC15, ACCN1), coat color (2 genes: WNT1, POMC), and behavioral traits, including autism, neural development defects, and maternal behavior in humans, and aggressive behavior in pigs (2 genes: RELN, AVPR1A [34–36], Table 2).
Selection footprints in regions of fertility and mammary gland traits can be explained by the need of any breeding scheme to produce healthy and fertile individuals that are capable of nurturing their offspring to produce sufficient replacement animals (Table 2). Additionally, calm and social animals are easier to handle and keep under constricted housing conditions [38, 39], which is why the behavioral genes might have been indirectly selected for in the HW (Table 2). The coat color genes were most likely identified due to the variation in this trait between the chosen populations (Table 2).
Only one of our identified regions (chr. 29, 44.5–47 Mb) was in agreement with a previous study on selection signatures in Korean beef cattle. This region contained the calpain 1 (CAPN1) gene which was reported to be associated with carcass weight and marbling score in Hanwoo. This lack of confirmation with previous studies might be due to a sampling bias between the studies or the nature of the comparison in this study. This study was designed to identify recent signatures of selection, rather than older signatures most likely due to natural selection (e.g. adaptive signatures for different environments) as reported in Porto-Neto et al.
Detection of sequence variation in the selection signals
To provide further information and proof for the regions exhibiting positive selection footprints in HW, we sequenced the entire genome of the HW and YB populations (10 individuals of each group). The sequence data consisted of a total of 7,268,700,964 sequence-reads across the bovine genome. The average depth of sequence coverage for each sample was approximately 11-fold. The GATK and ANNOVAR analyses identified 103 SNP (53 nsSNP, 50 synonymous SNP) in the 16 candidate regions described in the previous section. Allele frequencies of the same allele in the HW and YB population were calculated for each nsSNP. Frequencies differed by more than 20% between the two populations for 23 nsSNP, however, significant differences were only found for 6 nsSNPs (Table 3). Out of these 6 nsSNP, 3 nsSNP were classified by PolyPhen as benign, 2 nsSNP as possibly damaging, and 1 nsSNP as probably damaging (Table 3).
Out of the 3 possibly or probably damaging nsSNP, i.e. protein changing, only one gene matched putative candidate genes that were found within the selective sweep regions described in Table 2: The amyloid beta (A4) precursor protein (APP) on chromosome 1 (9–10.5 Mb). The APP gene was previously associated with meat and fat traits in pigs and mice [42–44]. Further, analyses of haplotype structures around the five possibly or probably damaging nsSNP showed a haplotype block in the HW population only in proximity to the APP gene (located between markers ARS-BFGL-NGS-43310 and Hapmap38887-BTA-22281, S2 Fig). In theory, the linkage disequilibrium between markers weakens over time. However, artificial selection accumulates the favorable allele including a small region surrounding the mutation which manifests itself as a haplotype block. Therefore, we decided to investigate the APP gene in more detail.
The nsSNP in exon 5 of the APP gene (APP: c.569C>T, Ala199Val) had a 30% difference in allele frequencies between the HW and YB populations (p-value = 0.002, Table 3). The frequency of the Valine variant (minor T allele) was increased in the HW population compared to the YB indicating an accumulation of a possibly favorable variant under selection pressure.
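The text does not name the test used to compare allele frequencies between HW and YB. As one possible illustration, the sketch below applies a two-sided Fisher's exact test to allele counts from two samples of 10 diploid animals each (20 chromosomes per population). The choice of test, the function name, and the counts are all assumptions made for illustration and do not reproduce the values in Table 3.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are populations, columns are counts of the two alleles.
    Returns the sum of probabilities of all tables that are at most
    as likely as the observed one, with margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x copies of allele 1 in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))

# Hypothetical counts: T allele on 10 of 20 HW and 4 of 20 YB chromosomes.
p_value = fisher_exact_two_sided(10, 10, 4, 16)
```

With only 20 chromosomes per population, an exact test avoids the large-sample approximation of a chi-square test, which is the reason it is used in this sketch.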
The APP is a highly conserved gene and its protein is found in a variety of tissues as a cell surface receptor and transmembrane precursor. Several different splice variants are known [45–49]; however, the exact function of the protein remains unclear. The human APP protein consists of five domains including a transmembrane domain linking the ecto- and cytoplasm. Dawkins and Small  discuss several possible functions of the APP protein from trophic actions including cellular growth and neuronal recovery, to stem-cell proliferation, to blood coagulation. The APP gene is possibly best known for its involvement in Alzheimer’s disease as its protein is a precursor molecule for beta-amyloid (Aβ) which is a major component of amyloid plaques . Further, APP has also been associated with a common progressive muscle disease (inclusion body myopathy, IBM) in elderly people [52, 53]. Lobjois et al.  inferred that this progressive muscle disease might explain a link to meat tenderness in livestock populations. Currently, 40 coding mutations in the human APP gene are known; however, for exon 5 only one mutation was reported without a pathogenic effect [54, 55].
We modelled the structure of the bovine APP protein based on the entire human N-terminal APP-E1 domain with a sequence identity to the bovine sequence of 92%. Only four out of the five known subunits of the human APP protein were also predicted in the bovine protein structure (Fig 4). A Kunitz-type protease inhibitor domain that was reported for longer isoforms of the human APP was missing. The Ala199Val amino acid exchange in exon 5 was predicted to result in exposed hydrophobic surfaces at parts of the protein interfaces. Such exposure is energetically unfavorable, especially in aqueous environments. The folding free energy of each structure is defined as the free energy difference between the folded and unfolded states of the protein. Further, the folding structure of three out of the four protein subunits was affected by this nsSNP, resulting in predicted unfavorable secondary and tertiary folding energies (Table 4). The pathogenic effect of the APP gene in humans was linked to an overexpression and thus an accumulation of Aβ peptides in brain and muscle cells. Lobjois et al. also reported that meat of pigs with a lower shear force had a higher expression of APP in the longissimus dorsi muscle. Possibly, our identified nsSNP could cause an accumulation of Aβ peptide in the muscle through an impaired protein-protein interaction which could inhibit post-translational protein cleavage, and thus, affect meat tenderness.
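The folding free energy described above, and the way a mutation effect is usually expressed relative to the wild type, can be written compactly as follows. The superscripts and the sign convention (a positive difference meaning a less stable mutant) are assumptions for illustration; the values actually obtained are those reported in Table 4.

```latex
\Delta G_{\mathrm{fold}} = G_{\mathrm{folded}} - G_{\mathrm{unfolded}},
\qquad
\Delta\Delta G = \Delta G_{\mathrm{fold}}^{\mathrm{Val199}} - \Delta G_{\mathrm{fold}}^{\mathrm{Ala199}}
```

Under this convention, a positive value for a subunit would indicate that the Val199 variant folds less favorably than the Ala199 wild type.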
The schematic structure of bovine APP gene in the wild-type (A) and mutated form (B). The bovine protein structure and mutation effects were modelled according to the known template structure of the human gene (PDB code 3KTM). Mutation residue (Ala199Val) is represented by ball and stick.
By comparing a selected population of brown Hanwoo with three closely related and unselected populations we identified 16 genomic regions suggesting recent selection in Hanwoo. These regions harbor 21 genes and 14 of these genes have been previously associated to production traits, especially meat traits and fat accumulation. Through the application of several analytical and comparative methods, we were able to single out the APP gene which was previously reported to be overexpressed in pigs with tender meat. A non-synonymous SNP (Ala 199 Val) that affected the protein structure was identified through sequencing and protein structure modelling. The most notable involvement of the APP gene is in its muscle degenerative function in humans; however, whether and how these human diseases and animal production traits are linked or caused through the same pathways and sequence variations remains unclear.
The brown Hanwoo DNA was extracted either from AI bull semen straws or from blood samples obtained from different veterinary practitioners in the Hanwoo Improvement Center of the National Agricultural Cooperative Federation with the permission of the owners. The protocol was approved by the Committee on the Ethics of Animal Experiments of the National Institute of Animal Science (Permit Number: 2013–028). No ethics statement was required for the collection of DNA samples from the Brindle, Jeju Black Hanwoo, or Chinese Yanbian cattle because they were not sampled specifically for this study.
Animals and genotype assays
The data on brown Hanwoo (N = 100) were collected from steers of candidate bulls for progeny testing in the Hanwoo Improvement Center of the National Agricultural Cooperative Federation in Seosan, Chungnam province, Korea. The DNA samples of the Brindle Hanwoo population (N = 20) were collected from steers from Hankyoung National University, Anseong and Ulleunggun Agriculture technology center, Gyeongsangbuk-do, Korea. The steers of Jeju Black (N = 20) Hanwoo were obtained from the subtropical animal experiment station of the National Institute of Animal Science (NIAS), Jeju island, Korea. Samples of Yanbian cattle (N = 39) were obtained from the Yanbian Agricultural College, Jilin in China. Angus data (N = 20) were obtained through the Animal Genetics and Breeding Unit, University of New England, Australia.
Genomic DNA for genotyping assays was extracted from semen or blood samples. The DNA was isolated from semen using the DNeasy 96 Blood and Tissue Kit (Qiagen, Valencia, CA, USA) with 100 μl of sperm added to 10 ml of Buffer 1. The mix was gently vortexed and centrifuged for 10 minutes at 4000 rpm to isolate pure cell pellets. Lysis of the cell walls was achieved by adding 300 μl Buffer 1 and 100 μl proteinase K to the pellet. The samples were incubated for 2 hours at 56°C. Purification of DNA was carried out with the DNeasy Blood & Tissue Kit; protocol 1 (Qiagen, Valencia, CA, USA) according to manufacturer’s protocol.
DNA quantification was performed using a NanoDrop 1000 (Thermo Fisher Scientific Inc., Wilmington, DE, USA). DNA samples were submitted for genotyping with total DNA of 900 ng, a 260/280 ratio >1.8, and a DNA concentration of 20 ng/μl. The single nucleotide polymorphism (SNP) genotyping was performed using the Illumina Bovine SNP 50K Bead chip (Illumina Inc., San Diego, CA). Approximately 200 ng of genomic DNA was used to genotype each sample on the chip. Samples were processed according to the Illumina Infinium-II assay. Normalized bead intensity data for each sample were processed with the Beadstudio 3.0 software (Illumina) which converted fluorescent intensities into SNP genotypes.
Analysis of SNP statistics
Stringent filtering criteria were applied to the genotype data per breed. Briefly, SNPs were excluded from the analysis if they failed to provide genotypes in more than 5% of the cases, had a median GC score below 0.6, had GC scores under 0.6 in more than 90% of the samples, or deviated in heterozygosity by more than three standard deviations from the heterozygosity of other SNPs. Individual genotypes with GC scores under 0.6 were treated as missing. Loci out of Hardy-Weinberg equilibrium in a chi-square test with a cut-off p-value of 1 × 10^-15, unmapped SNPs, SNPs on sex chromosomes, and SNPs with a minor allele frequency < 0.01 were also excluded. Finally, genotype data of 38,266 SNPs remained after quality control.
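A minimal Python sketch of three of these per-SNP filters (missing rate, minor allele frequency, and the Hardy-Weinberg chi-square test) is given below. The function name `snp_passes_qc` and the genotype coding are illustrative assumptions, the Hardy-Weinberg cut-off is read as 1e-15, and the GC score and heterozygosity filters are omitted because they depend on intensity data not shown here.

```python
from math import erfc, sqrt

def snp_passes_qc(genotypes, max_missing=0.05, min_maf=0.01, hwe_p=1e-15):
    """Return True if one SNP passes the three filters.

    genotypes: list with one entry per animal, coded 0/1/2 as the count
    of one allele, or None when the genotype call is missing.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0 or 1 - n / len(genotypes) > max_missing:
        return False
    p = sum(called) / (2 * n)          # frequency of the counted allele
    q = 1 - p
    if min(p, q) < min_maf:
        return False
    # Chi-square test (1 degree of freedom) against Hardy-Weinberg counts.
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = erfc(sqrt(chi2 / 2))     # upper tail of chi-square, 1 df
    return p_value >= hwe_p

# Hypothetical SNP in perfect Hardy-Weinberg proportions.
keep = snp_passes_qc([0] * 25 + [1] * 50 + [2] * 25)
```

The very small Hardy-Weinberg cut-off removes only loci with extreme departures, which typically point to genotyping problems rather than to biology.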
Analysis of genetic structure and relationship
Population structures were assessed with a principal component analysis (PCA) based on the genomic relationship matrix (GRM). Genotype records (0, 1, 2) were used to create the GRM. The principal components were obtained through eigendecomposition of the GRM. As such, a PCA summarizes the genetic similarity between subjects through eigenvalues and eigenvectors. The genetic variation within and between breeds was assessed with Weir and Cockerham’s F-statistics (FST). F-statistics describe the reduction in heterozygosity in comparison to Hardy-Weinberg expectations. The FST-statistic in particular describes the difference in allele frequencies between two independent populations with a potential value of 0 to 1, with 1 being the most different or distantly related. FST-distance matrices were used for hierarchical clustering using Ward’s minimum variance algorithm. European Angus samples were added as an out-group to the East-Asian populations.
To provide a finer quantification of the different ancestry proportions, we performed a model-based unsupervised hierarchical clustering of the individuals using the Admixture 1.22 software. Admixture provides a likelihood estimate of breed proportions dependent on allele frequencies of the assumed ancestral populations. Ancestral populations in an unsupervised analysis are clustered based on allele frequency similarities. We analyzed breed proportions with K = 2 to 4 assumed ancestral populations. Cross validation showed that K = 3 provided the best fit to our data.
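The GRM-based PCA can be sketched in plain Python as follows. The scaling of the GRM (centering by twice the allele frequency and dividing by 2Σp(1−p)) is an assumption, since the exact formulation is not spelled out in the text, and power iteration is used here only to keep the example free of external libraries. All names and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """genotypes: one list of 0/1/2 allele counts per animal, no missing."""
    n, m = len(genotypes), len(genotypes[0])
    freqs = [sum(animal[j] for animal in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    centered = [[animal[j] - 2 * freqs[j] for j in range(m)] for animal in genotypes]
    return [[sum(a * b for a, b in zip(centered[i], centered[k])) / scale
             for k in range(n)] for i in range(n)]

def leading_component(matrix, iterations=200):
    """Largest eigenvalue and its eigenvector by power iteration."""
    n = len(matrix)
    vector = [float(i + 1) for i in range(n)]   # arbitrary starting vector
    for _ in range(iterations):
        product = [sum(matrix[i][k] * vector[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    eigenvalue = sum(vector[i] * sum(matrix[i][k] * vector[k] for k in range(n))
                     for i in range(n))
    return eigenvalue, vector

# Toy data (hypothetical): animals 0-1 and animals 2-3 form two groups.
toy = [[0, 0, 2, 2], [0, 1, 2, 2], [2, 2, 0, 0], [2, 2, 0, 1]]
grm = genomic_relationship_matrix(toy)
eigenvalue, pc1 = leading_component(grm)
```

In the toy data the first component assigns opposite signs to the two groups of animals, which is the same behavior that separates Angus from the East-Asian breeds on the first axis of the PCA plot.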
Detecting footprints of selection using Rsb score
Selection footprints per population were analyzed based on extended haplotype homozygosity (EHH), which is a measure for the breakdown of linkage disequilibrium with increasing distance from a SNP. Based on EHH, we computed Rsb scores according to Tang et al. using the rehh package in R. Rsb is the standardized log-ratio of the integrated EHHS (iES) between pairs of populations and is designed to detect sweeps that have occurred in only one population compared to another population. As our study is designed to detect sweeps specific to HW, genomic regions with extreme Rsb are indicative of signals of positive selection in HW. Rsb scores were transformed into PRsb = -log10[Φ(Rsb)], assuming Rsb are normally distributed (under neutrality). PRsb can be interpreted as log10(1 ⁄ P) where P is the one-sided P-value associated with the neutral hypothesis (no selection). To account for multiple testing, we applied the Benjamini-Hochberg false discovery rate correction (FDR P-value < 0.05).
Haplotypes were predicted for a pooled sample (UN) from the three unselected breeds (BR, JB, and YB) to create a larger counter population to the selected HW and a more accurate haplotype prediction. Haplotypes of the HW and the UN were predicted separately with fastPHASE. We used default parameters except for the number of random starts for the EM-algorithm (-T option), which was set to 10 to reduce calculation time.
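The transformation chain described here can be illustrated with a short Python sketch: a standardized log-ratio of iES values, the PRsb transformation as it is written in the text, and a Benjamini-Hochberg step-up procedure. The function names are hypothetical, standardization by median and standard deviation is an assumption about how the log-ratio is standardized, and which tail indicates a sweep in HW depends on the orientation of the ratio, which is not reproduced here.

```python
from math import erfc, log, log10, sqrt
from statistics import median, stdev

def standardized_log_ratio(ies_a, ies_b):
    """Per-SNP ln(iES_a / iES_b), centered on the median and scaled by the SD."""
    ratios = [log(a / b) for a, b in zip(ies_a, ies_b)]
    center, spread = median(ratios), stdev(ratios)
    return [(r - center) / spread for r in ratios]

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * erfc(-x / sqrt(2))

def p_rsb(rsb):
    """PRsb = -log10(Phi(Rsb)), following the formula given in the text."""
    return -log10(normal_cdf(rsb))

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests significant at FDR alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            cutoff_rank = rank          # largest rank meeting the criterion
    significant = [False] * m
    for index in order[:cutoff_rank]:
        significant[index] = True
    return significant

# Hypothetical iES values for three SNPs in two populations.
scores = standardized_log_ratio([1.0, 2.0, 4.0], [1.0, 1.0, 1.0])
flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.5])
```

A PRsb value above 2 corresponds to P < 0.01 under the normal approximation, which matches the suggestive threshold used for the window scan.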
Identification and annotation of candidate regions
Candidate regions with positive selection footprints were defined as containing at least two SNPs with a significant Rsb (FDR corrected P-value < 0.05) in a 1 Mb window (with a 0.5 Mb overlap) for the HW/UN comparison. Candidate regions were further required to contain at least two SNPs exceeding these thresholds in at least three population comparisons (S3 Fig). In case of several continuous windows with two SNPs or more, we combined the overlapping windows into one candidate region. Genes within the candidate regions were determined with the intersectBed command from the BedTools software. The BedTools software identifies genes that overlap by at least one base out of a 4-bp integration site within a given candidate region. We also selected genes within +/- 300 kb around SNPs that were identified in all four comparisons using closestBed in BedTools. The genomic position of the genes was obtained from 'refGene' information on the UCSC Genome Browser (http://genome.ucsc.edu/). To identify the functions and roles of genes in the candidate regions, we used the medScan database that provides sentences from MEDLINE abstracts. QTL regions were identified from information on Cattle QTLs in the Animal QTLdb (Release 17; http://www.animalgenome.org/cgi-bin/QTLdb/BT/index). QTL locations by bp (Btau4.0) were downloaded and meat and production trait associated QTLs were selected. The associated names of these QTL types described in the Animal QTLdb were as follows: intramuscular fat, marbling score, average daily gain, yield grade, marbling score (EBV), shear force, tenderness score, body weight, height, and feed conversion ratio.
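The window scan and the merging of overlapping windows can be sketched as below for a single chromosome. The function name, the window grid starting at position 0, and the SNP positions are illustrative assumptions rather than the procedure actually scripted for this study.

```python
def candidate_regions(positions, window=1_000_000, step=500_000, min_hits=2):
    """Merge overlapping windows that contain enough significant SNPs.

    positions: base-pair positions of significant SNPs on one chromosome.
    Returns a list of (start, end) tuples with half-open windows merged.
    """
    if not positions:
        return []
    positions = sorted(positions)
    regions = []
    start = 0
    while start <= positions[-1]:
        end = start + window
        hits = sum(1 for p in positions if start <= p < end)
        if hits >= min_hits:
            if regions and start <= regions[-1][1]:
                # Window overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
        start += step
    return regions

# Hypothetical significant SNP positions (bp) on one chromosome.
snps = [9_200_000, 9_700_000, 10_300_000, 30_100_000]
regions = candidate_regions(snps)
```

In this example two neighboring windows each hold two significant SNPs and are merged into one region, whereas the isolated SNP near 30 Mb does not form a region on its own.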
Detection of sequence variation in the selection signals
We sequenced the entire genome of 10 HW and 10 YB cattle by randomly shearing 3 μg of genomic DNA to generate about 90bp inserts (Covaris INC, Woburn, USA). The fragmented DNA was amplified and end-repaired using T4 DNA polymerase, which has an inactive 5’-3’ exonuclease function. Illumina paired-end adaptor oligonucleotides were ligated to the sticky ends (Illumina Inc, San Diego, USA). We analyzed the ligation mixture by gel electrophoresis from which we purified fragments of 200–250bp length. These fragments were then sequenced on the HiSeq 2000 platform according to manufacturer’s specifications (Illumina Inc, San Diego, USA).
The sequences were aligned to the bovine reference genome (Btau4.0) using the Burrows-Wheeler Aligner (BWA; version 0.6.1) with default parameters. SAMtools version 0.1.17 was used for converting, sorting, and indexing the alignments. Duplicated reads were excluded from downstream analysis using Picard tools (Picard 2009, http://picard.sourceforge.net/). Local re-alignments and re-calibrations were performed with the Genome Analysis Toolkit framework (GATK; version 1.5.9). The initial novel SNP discovery was performed using the multi-sample SNP-calling procedure in the GATK package. All cut-off thresholds for filtering criteria, and cluster and window size were empirically derived and specific to this study (clusterSize = 3; clusterWindowSize = 10; MQ>-4; ((MQ0/1.0*DP))>0.1; QUAL<30; QD<5; Hrun>5; FS>200).
Additionally, we performed SNP annotation according to their functional type such as intergenic, 5’UTR, 3’UTR, coding SNP, or non-synonymous (nsSNP) in the target region using the ANNOVAR software. The sequencing processes and more detailed information can be found in Choi et al.
Predicting protein structures
We used the PolyPhen program (Polymorphism Phenotype version 2.2; http://genetics.bwh.harvard.edu/pph2) to predict the effects of nsSNPs (from the ANNOVAR analysis) on the protein structure. We based this analysis on the sequence data that we established in the previous section. This sequence data was translated into a protein sequence including the amino acid exchange of the nsSNP. PolyPhen provides a prediction about the effect of a SNP on the protein structure and categorizes it as 'benign' (i.e. most likely lacking any phenotypic effect) or 'damaging' (probably or possibly damaging: affecting protein function with higher or lower confidence, respectively) according to a Bayesian approach. Regions with possibly or probably damaging nsSNPs were further analyzed by investigating the haplotype structure surrounding these nsSNPs based on markers on the Illumina Bovine SNP 50K Bead chip. Haplotypes of the regions were generated with Haploview v4.2 with the default algorithm described by Gabriel et al. 2002. Genes with damaging nsSNPs were compared to the known genes located in our identified regions under selection.
Lastly, we predicted the bovine protein structure of the gene of interest (APP) through comparative homology modelling. We used the known human protein template structure (PDB code 3KTM) to explore mutation effects of the candidate genes due to a nsSNP. The homology modelling was performed using the MODELLER9v10 program within the Discovery Studio (DS) 3.5 molecular modelling packages (http://accelrys.com/products/discovery-studio/, Accelrys Inc, San Diego, USA). The detailed process is shown in S4 Fig.
S1 Fig. The plot of genomic co-variance matrix based on the genotype data.
The results showed that average relationship was 0.049, 0.05, 0.037 and 0.003 for Brindle, Jeju Black, Yanbian and Brown Hanwoo, respectively. Yellow means more genetically related between the individuals. (A) brown Hanwoo; (B) brindle Hanwoo; (C) Jeju black Hanwoo; (D) Chinese Yanbian.
S2 Fig. Haplotype structure of candidate region around the bovine APP gene (BTA1:9–10.5Mb) in 4 East-Asian cattle breeds.
Dark shading indicates strong linkage disequilibrium. (A) brown Hanwoo; (B) brindle Hanwoo; (C) Jeju black Hanwoo; (D) Chinese Yanbian.
S3 Fig. The Rsb plots of four cattle breed comparisons (HW/BR, HW/JB, HW/YB and HW/UN).
HW: brown Hanwoo; BR: brindle Hanwoo; JB: Jeju black Hanwoo; YB: Chinese Yanbian cattle; UN: pooled unselected breeds.
S4 Fig. Flowchart of the protein structure prediction.
In the first step, the PolyPhen program was used to predict the functional effect on protein structure of the nsSNPs identified from the resequencing data. The mutation effect of the nsSNP (Ala199Val) in the APP gene was then analyzed by homology modelling with the MODELLER program.
Conceived and designed the experiments: DL CG SHL YMC THK. Performed the experiments: BHC GWJ. Analyzed the data: DL ES HHC. Contributed reagents/materials/analysis tools: DL ES CG HHC. Wrote the paper: DL ES.
- 1. Kim JB, Lee C. Historical look at the genetic improvement in Korean cattle—Review. Asian-Aust J Anim Sci. 2000;13:1467–81.
- 2. Jo C, Cho SH, Chang J, Nam KC. Keys to production and processing of Hanwoo beef: A perspective of tradition and science. Animal Frontiers. 2012;2(4):32–8.
- 3. Seideman SC, Koohmaraie M, Crouse JD. Factors associated with tenderness in young beef. Meat Sci. 1987;20(4):281–91. Epub 1987/01/01. pmid:22054614.
- 4. F.A.O. Domestic Animal Diversity Information Service (DAD-IS), accessed 2.9.2014 2012 [cited 2014]. Available: http://dad.fao.org/.
- 5. Choi TJ. Establishment of phylogenomic characteristics for Korean traditional cattle breeds (Hanwoo, Korean brindle and black): Jeon-buk National University; 2009.
- 6. Kim JH, Byun MJ, Kim MJ, Suh SW, Ko YG, Lee CW, et al. mtDNA Diversity and Phylogenetic State of Korean Cattle Breed, Chikso. Asian-Australasian journal of animal sciences. 2013;26(2):163–70. Epub 2013/02/01. pmid:25049772; PubMed Central PMCID: PMC4093160.
- 7. Schwarzenbacher H, Dolezal M, Flisikowski K, Seefried F, Wurmser C, Schlotterer C, et al. Combining evidence of selection with association analysis increases power to detect regions influencing complex traits in dairy cattle. BMC Genomics. 2012;13:48. Epub 2012/02/01. doi: 1471-2164-13-48 [pii] pmid:22289501; PubMed Central PMCID: PMC3305582.
- 8. Flori L, Fritz S, Jaffrezic F, Boussaha M, Gut I, Heath S, et al. The genome response to artificial selection: a case study in dairy cattle. PLoS One. 2009;4(8):e6595. Epub 2009/08/13. pmid:19672461; PubMed Central PMCID: PMC2722727.
- 9. Lim D, Gondro C, Park HS, Cho YM, Chai HH, Seong HH, et al. Identification of Recently Selected Mutations Driven by Artificial Selection in Hanwoo (Korean Cattle). Asian-Aust J Anim Sci. 2013;26(5):603–8.
- 10. Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123(3):585–95. Epub 1989/11/01. pmid:2513255; PubMed Central PMCID: PMC1203831.
- 11. Fu YX, Li WH. Statistical tests of neutrality of mutations. Genetics. 1993;133(3):693–709. Epub 1993/03/01. pmid:8454210; PubMed Central PMCID: PMC1205353.
- 12. Voight BF, Kudaravalli S, Wen X, Pritchard JK. A map of recent positive selection in the human genome. PLoS Biol. 2006;4(3):e72. Epub 2006/02/24. doi: 05-PLBI-RA-1239R2 [pii] pmid:16494531; PubMed Central PMCID: PMC1382018.
- 13. Hayes BJ, Chamberlain AJ, Maceachern S, Savin K, McPartlan H, MacLeod I, et al. A genome map of divergent artificial selection between Bos taurus dairy cattle and Bos taurus beef cattle. Anim Genet. 2009;40(2):176–84. Epub 2008/12/11. pmid:19067671.
- 14. MacEachern S, Hayes B, McEwan J, Goddard M. An examination of positive selection and changing effective population size in Angus and Holstein cattle populations (Bos taurus) using a high density SNP genotyping platform and the contribution of ancient polymorphism to genomic diversity in Domestic cattle. BMC Genomics. 2009;10:181. Epub 2009/04/28. doi: 1471-2164-10-181 [pii] pmid:19393053; PubMed Central PMCID: PMC2681480.
- 15. Gautier M, Naves M. Footprints of selection in the ancestral admixture of a New World Creole cattle breed. Mol Ecol. 2011;20(15):3128–43. Epub 2011/06/22. pmid:21689193.
- 16. Porto-Neto LR, Lee S-H, Sonstegard T, Van Tassell CP, Lee HK, Gondro C. Genome-wide Detection of Signatures of Selection in Korean Hanwoo Cattle. Anim Genet. 2012.
- 17. Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, Schaffner SF, et al. Detecting recent positive selection in the human genome from haplotype structure. Nature. 2002;419(6909):832–7. Epub 2002/10/25. nature01140 [pii]. pmid:12397357.
- 18. Chen H, Patterson N, Reich D. Population differentiation as a test for selective sweeps. Genome research. 2010;20(3):393–402. Epub 2010/01/21. pmid:20086244; PubMed Central PMCID: PMC2840981.
- 19. Weir BS, Cockerham CC. Estimating F-Statistics for the Analysis of Population Structure. Evolution; international journal of organic evolution. 1984;38(6):1358–70.
- 20. Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 1994;91(7):2757–61. Epub 1994/03/29. pmid:8146187; PubMed Central PMCID: PMC43449.
- 21. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B (Methodological). 1995:289–300.
- 22. Cheong H, Yoon D-H, Kim L, Park B, Choi Y, Chung E, et al. Growth hormone-releasing hormone (GHRH) polymorphisms associated with carcass traits of meat in Korean cattle. BMC genetics. 2006;7(1):35.
- 23. Lobjois V, Liaubet L, SanCristobal M, Glenisson J, Feve K, Rallieres J, et al. A muscle transcriptome analysis identifies positional candidate genes for a complex trait in pig. Animal Genetics. 2008;39(2):147–62. pmid:18366476
- 24. Lee YH, Tharp WG, Maple RL, Nair S, Permana PA, Pratley RE. Amyloid precursor protein expression is upregulated in adipocytes in obesity. Obesity. 2008;16(7):1493–500. pmid:18483477
- 25. Kunej T, Jevsinek Skok D, Zorc M, Ogrinc A, Michal JJ, Kovac M, et al. Obesity gene atlas in mammals. J Genomics. 2012;1:45–55.
- 26. Bosquez J, Pagan M, Casas E, Cianzio D, Casas A. Segregation of a molecular marker in the TRPC1 gene and its association with growth and carcass traits in beef cattle. Midwestern Section of the American Society of Animal Science. 2007;85(Suppl 2):65.
- 27. Lee SH, Park BH, Sharma A, Dang CG, Lee SS, Choi TJ, et al. Hanwoo cattle: Origin, Domestication, Breeding Strategies and Genomic Selection. Journal of Animal Science and Technology. 2014;56(2).
- 28. Pintus E, Sorbolini S, Albera A, Gaspa G, Dimauro C, Steri R, et al. Use of locally weighted scatterplot smoothing (LOWESS) regression to study selection signatures in Piedmontese and Italian Brown cattle breeds. Animal Genetics. 2013.
- 29. Sonstegard TS, Cole JB, VanRaden PM, Van Tassell CP, Null DJ, Schroeder SG, et al. Identification of a nonsense mutation in CWC15 associated with decreased reproductive efficiency in Jersey cattle. PLoS One. 2013;8(1):e54872. pmid:23349982
- 30. Onteru S, Fan B, Du ZQ, Garrick D, Stalder K, Rothschild M. A whole‐genome association study for pig reproductive traits. Animal Genetics. 2012;43(1):18–26. pmid:22221021
- 31. Quigley IK, Parichy DM. Pigment pattern formation in zebrafish: a model for developmental genetics and the evolution of form. Microscopy research and technique. 2002;58(6):442–55. pmid:12242701
- 32. Krude H, Biebermann H, Luck W, Horn R, Brabant G, Grüters A. Severe early-onset obesity, adrenal insufficiency and red hair pigmentation caused by POMC mutations in humans. Nat Genet. 1998;19(2):155–7. pmid:9620771
- 33. Skaar D, Shao Y, Haines J, Stenger J, Jaworski J, Martin E, et al. Analysis of the RELN gene as a genetic risk factor for autism. Molecular psychiatry. 2004;10(6):563–71.
- 34. Terenina E, Bazovkina D, Rousseau S, Salin F, D'Eath R, Turner S, et al., editors. Gene polymorphisms associated with aggression in pigs. 44e Journées de la Recherche Porcine en France, Paris, France, 7–8 February 2012; 2012: Institut du Porc.
- 35. Avinun R, Ebstein RP, Knafo A. Human maternal behaviour is associated with arginine vasopressin receptor 1A gene. Biology letters. 2012:rsbl20120492.
- 36. Golimbet V, Alfimova M, Abramova L, Kaleda V, Gritsenko I. Arginine vasopressin 1a receptor RS3 promoter microsatellites in schizophrenia: A study of the effect of the “risk” allele on clinical symptoms and facial affect recognition. Psychiatry research. 2015;225(3):739–40. pmid:25529259
- 37. Safus P, Pribyl J, Vesela Z, Vostry L, Stipkova M, Stadnik L. Selection Indexes for Bulls of Beef Cattle. Czech Journal of Animal Science. 2006;51(7):285–98.
- 38. Lindholm-Perry AK, Kuehn LA, Freetly HC, Snelling WM. Genetic markers that influence feed efficiency phenotypes also affect cattle temperament as measured by flight speed. Anim Genet. 2015;46(1):60–4. Epub 2014/12/18. pmid:25515066.
- 39. Haskell MJ, Simm G, Turner SP. Genetic selection for temperament traits in dairy and beef cattle. Front Genet. 2014;5:368. Epub 2014/11/07. pmid:25374582; PubMed Central PMCID: PMC4204639.
- 40. Cheong HS, Yoon D-H, Park BL, Kim LH, Bae JS, Namgoong S, et al. A single nucleotide polymorphism in CAPN1 associated with marbling score in Korean cattle. BMC genetics. 2008;9(1):33.
- 41. Porto‐Neto L, Lee S-H, Sonstegard T, Van Tassell C, Lee H, Gibson J, et al. Genome‐wide detection of signatures of selection in Korean Hanwoo cattle. Animal genetics. 2014;45(2):180–90. pmid:24494817
- 42. Lobjois V, Liaubet L, SanCristobal M, Glenisson J, Feve K, Rallieres J, et al. A muscle transcriptome analysis identifies positional candidate genes for a complex trait in pig. Anim Genet. 2008;39(2):147–62. Epub 2008/03/28. pmid:18366476.
- 43. de Koning DJ, Harlizius B, Rattink AP, Groenen MA, Brascamp EW, van Arendonk JA. Detection and characterization of quantitative trait loci for meat quality traits in pigs. J Anim Sci. 2001;79(11):2812–9. Epub 2002/01/05. pmid:11768109.
- 44. Takeshita S, Suzuki T, Kitayama S, Moritani M, Inoue H, Itakura M. Bhlhe40, a potential diabetic modifier gene on Dbm1 locus, negatively controls myocyte fatty acid oxidation. Genes & genetic systems. 2012;87(4):253–64. Epub 2012/12/12. pmid:23229312.
- 45. Kang J, Lemaire HG, Unterbeck A, Salbaum JM, Masters CL, Grzeschik KH, et al. The precursor of Alzheimer's disease amyloid A4 protein resembles a cell-surface receptor. Nature. 1987;325(6106):733–6. Epub 1987/02/19. pmid:2881207.
- 46. Tanzi RE, McClatchey AI, Lamperti ED, Villa-Komaroff L, Gusella JF, Neve RL. Protease inhibitor domain encoded by an amyloid protein precursor mRNA associated with Alzheimer's disease. Nature. 1988;331(6156):528–30. Epub 1988/02/11. pmid:2893290.
- 47. Weidemann A, Konig G, Bunke D, Fischer P, Salbaum JM, Masters CL, et al. Identification, biogenesis, and localization of precursors of Alzheimer's disease A4 amyloid protein. Cell. 1989;57(1):115–26. Epub 1989/04/07. pmid:2649245.
- 48. Pangalos MN, Shioi J, Efthimiopoulos S, Wu A, Robakis NK. Characterization of appican, the chondroitin sulfate proteoglycan form of the Alzheimer amyloid precursor protein. Neurodegeneration: a journal for neurodegenerative disorders, neuroprotection, and neuroregeneration. 1996;5(4):445–51. Epub 1996/12/01. pmid:9117561.
- 49. Tang K, Wang C, Shen C, Sheng S, Ravid R, Jing N. Identification of a novel alternative splicing isoform of human amyloid precursor protein gene, APP639. The European journal of neuroscience. 2003;18(1):102–8. Epub 2003/07/16. pmid:12859342.
- 50. Dawkins E, Small DH. Insights into the physiological function of the beta-amyloid precursor protein: beyond Alzheimer's disease. Journal of neurochemistry. 2014;129(5):756–69. Epub 2014/02/13. pmid:24517464.
- 51. Masters CL, Simms G, Weinman NA, Multhaup G, McDonald BL, Beyreuther K. Amyloid plaque core protein in Alzheimer disease and Down syndrome. Proc Natl Acad Sci U S A. 1985;82(12):4245–9. Epub 1985/06/01. pmid:3159021; PubMed Central PMCID: PMC397973.
- 52. Sugarman MC, Yamasaki TR, Oddo S, Echegoyen JC, Murphy MP, Golde TE, et al. Inclusion body myositis-like phenotype induced by transgenic overexpression of beta APP in skeletal muscle. Proc Natl Acad Sci U S A. 2002;99(9):6334–9. Epub 2002/04/25. pmid:11972038; PubMed Central PMCID: PMC122949.
- 53. Fukuchi K, Pham D, Hart M, Li L, Lindsey JR. Amyloid-beta deposition in skeletal muscle of transgenic mice: possible model of inclusion body myopathy. The American journal of pathology. 1998;153(6):1687–93. Epub 1998/12/10. pmid:9846958; PubMed Central PMCID: PMC1866340.
- 54. Sassi C, Guerreiro R, Gibbs R, Ding J, Lupton MK, Troakes C, et al. Investigating the role of rare coding variability in Mendelian dementia genes (APP, PSEN1, PSEN2, GRN, MAPT, and PRNP) in late-onset Alzheimer's disease. Neurobiology of aging. 2014. Epub 2014/08/12. pmid:25104557.
- 55. Biomedical Research Forum L. ALZFORUM 2014 [cited 2014 10.9.2014].
- 56. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91(11):4414–23. Epub 2008/10/24. doi: S0022-0302(08)70990-1 [pii] pmid:18946147.
- 57. Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome research. 2009;19(9):1655–64. Epub 2009/08/04. pmid:19648217; PubMed Central PMCID: PMC2752134.
- 58. Tang K, Thornton KR, Stoneking M. A new approach for using genome scans to detect recent positive selection in the human genome. PLoS biology. 2007;5(7):e171. Epub 2007/06/21. pmid:17579516; PubMed Central PMCID: PMC1892573.
- 59. Gautier M, Vitalis R. rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics. 2012;28(8):1176–7. Epub 2012/03/10. doi: bts115 [pii] pmid:22402612.
- 60. Scheet P, Stephens M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet. 2006;78(4):629–44. Epub 2006/03/15. doi: S0002-9297(07)63701-X [pii] pmid:16532393; PubMed Central PMCID: PMC1424677.
- 61. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. Epub 2010/01/30. doi: btq033 [pii] pmid:20110278; PubMed Central PMCID: PMC2832824.
- 62. Novichkova S, Egorov S, Daraselia N. MedScan, a natural language processing engine for MEDLINE abstracts. Bioinformatics. 2003;19(13):1699–706. pmid:12967967
- 63. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. Epub 2009/05/20. pmid:19451168; PubMed Central PMCID: PMC2705234.
- 64. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. Epub 2009/06/10. pmid:19505943; PubMed Central PMCID: PMC2723002.
- 65. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303. Epub 2010/07/21. pmid:20644199; PubMed Central PMCID: PMC2928508.
- 66. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164. Epub 2010/07/06. pmid:20601685; PubMed Central PMCID: PMC2938201.
- 67. Choi J-W, Choi B-H, Lee S-H, Lee S-S, Kim H-C, Yu D, et al. Whole-Genome Resequencing Analysis of Hanwoo and Yanbian Cattle to Identify Genome-Wide SNPs and Signatures of Selection. Molecules and Cells. 2015;38(5):466–73. pmid:26018558
- 68. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21(2):263–5. Epub 2004/08/07. pmid:15297300.
- 69. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, et al. The structure of haplotype blocks in the human genome. Science. 2002;296(5576):2225–9. Epub 2002/05/25. pmid:12029063.
- 70. Lee H-J, Jang M, Kim H, Kwak W, Park W, Hwang JY, et al. Comparative Transcriptome Analysis of Adipose Tissues Reveals that ECM-Receptor Interaction Is Involved in the Depot-Specific Adipogenesis in Cattle. PLoS One. 2013;8(6):e66267. pmid:23805208
- 71. Zhang Y-Y, Zan L-S, Wang H-B, Qing L, Wu K-X, Quan S-A, et al. Differentially expressed genes in skeletal muscle tissues from castrated Qinchuan cattle males compared with those from intact males. Livestock Science. 2011;135(1):76–83.
- 72. Hotta K, Nakamura M, Nakamura T, Matsuo T, Nakata Y, Kamohara S, et al. Association between obesity and polymorphisms in SEC16B, TMEM18, GNPDA2, BDNF, FAIM2 and MC4R in a Japanese population. Journal of human genetics. 2009;54(12):727–31. pmid:19851340
- 73. Stella R, Biancotto G, Krogh M, Angeletti R, Pozza G, Sorgato MC, et al. Protein expression changes in skeletal muscle in response to growth promoter abuse in beef cattle. Journal of proteome research. 2011;10(6):2744–57. pmid:21425879
- 74. Zhang H, Hu X, Wang Z, Zhang Y, Wang S, Wang N, et al. Selection signature analysis implicates the PC1/PCSK1 region for chicken abdominal fat content. PloS one. 2012;7(7):e40736. pmid:22792402
- 75. Taneera J, Lang S, Sharma A, Fadista J, Zhou Y, Ahlqvist E, et al. A systems genetics approach identifies genes and pathways for type 2 diabetes in human islets. Cell metabolism. 2012;16(1):122–34. pmid:22768844
- 76. Mattevi VS, Zembrzuski VM, Hutz MH. Impact of variation in ADRB2, ADRB3, and GNB3 genes on body mass index and waist circumference in a Brazilian population. American Journal of human biology. 2006;18(2):182–6. pmid:16493638
- 77. Lee SH, Gondro C, van der Werf J, Kim NK, Lim D, Park EW, et al. Use of a bovine genome array to identify new biological pathways for beef marbling in Hanwoo (Korean Cattle). BMC Genomics. 2010;11(1):623.
- 78. Cheong H, Yoon D-H, Park B, Kim L, Bae J, Namgoong S, et al. A single nucleotide polymorphism in CAPN1 associated with marbling score in Korean cattle. BMC genetics. 2008;9(1):33.
- 79. Chung H, Davis M. Effects of genetic variants for the calpastatin gene on calpastatin activity and meat tenderness in Hanwoo (Korean cattle). Meat Science. 2012;90(3):711–4. pmid:22119671