Genome Wide Association Study Identifies 20 Novel Promising Genes Associated with Milk Fatty Acid Traits in Chinese Holstein

Detecting genes associated with milk fat composition could provide valuable insights into the complex genetic networks of genes underling variation in fatty acids synthesis and point towards opportunities for changing milk fat composition via selective breeding. In this study, we conducted a genome-wide association study (GWAS) for 22 milk fatty acids in 784 Chinese Holstein cows with the PLINK software. Genotypes were obtained with the Illumina BovineSNP50 Bead chip and a total of 40,604 informative, high-quality single nucleotide polymorphisms (SNPs) were used. Totally, 83 genome-wide significant SNPs and 314 suggestive significant SNPs associated with 18 milk fatty acid traits were detected. Chromosome regions that affect milk fatty acid traits were mainly observed on BTA1, 2, 5, 6, 7, 9, 13, 14, 18, 19, 20, 21, 23, 26 and 27. Of these, 146 SNPs were associated with more than one milk fatty acid trait; most of studied fatty acid traits were significant associated with multiple SNPs, especially C18:0 (105 SNPs), C18 index (93 SNPs), and C14 index (84 SNPs); Several SNPs are close to or within the DGAT1, SCD1 and FASN genes which are well-known to affect milk composition traits of dairy cattle. Combined with the previously reported QTL regions and the biological functions of the genes, 20 novel promising candidates for C10:0, C12:0, C14:0, C14:1, C14 index, C18:0, C18:1n9c, C18 index, SFA, UFA and SFA/UFA were found, which composed of HTR1B, CPM, PRKG1, MINPP1, LIPJ, LIPK, EHHADH, MOGAT1, ECHS1, STAT1, SORBS1, NFKB2, AGPAT3, CHUK, OSBPL8, PRLR, IGF1R, ACSL3, GHR and OXCT1. Our findings provide a groundwork for unraveling the key genes and causal mutations affecting milk fatty acid traits in dairy cattle.


Introduction
Fat is the major energy substance in milk and more than 50% milk total energy comes from milk fat, which accounts 3-5% of milk contents. Fat nutrition value depends on fatty acids. Monounsaturated fatty acids (MUFA) have a favourable effect on human health because of its cholesterol-declining properties. Polyunsaturated fatty acids (PUFA) of the n-6 and n-3 series are essential nutrients that exert an important influence on plasma lipids and serve cardiac and endothelial functions for prevention and treatment of coronary heart diseases [1]. Conjugated linoleic acid (CLA) has effects on bone formation and the immune system as well as fatty acids and lipid metabolism and gene expression in numerous tissues [2]. Saturated fatty acids (SFA) lead to increase the concentration of low density lipoprotein (LDL) cholesterol and cause cardio cerebral vascular disease [3]. Therefore, changing the proportions of dietary fat by decreasing SFA and increasing MUFA and PUFA is vital to Human health. It is suggested that the ideal balance would seem to approximate 1:1.3:1 for SFA:MU-FA:PUFA [4].
From the genetics point of view, milk fatty acids are complex traits influenced by non-genetic factors, such as breed, herd, stage of lactation, etc [5,6] and genetic factors [7]. Bovine milk fatty acids have been found to be heritable, with heritability estimates ranging from 0.22 to 0.71 [8,9]. Short and medium chain C4 to C16 saturated and monounsaturated fatty acids, which are synthesized de novo in the mammary gland, have moderate to high heritability (0.4-0.6) [8,9]. Long chain fatty acids (above C16) are derived from circulating plasma lipids, whereas have low to moderate heritability (about 0.2) [8,9]. Identifying genes and loci responsible for the genetic variation is expected to contribute greatly to our understanding of milk fatty acids synthesis, and to develop a marker-assisted selection to improve fatty acids in dairy breeding program in future. In the past few years, candidate gene and quantitative trait locus (QTL) mapping approaches have been implemented to detect genes or QTLs for milk fatty acid traits. A few promising loci, e.g. DGAT1 p.Lys232Ala and SCD1 p.Ala293-Val [10,11,12] and a large number of significant or suggestive genomic regions [13,14] were identified. Although the above two methods have got a few prominent findings, identification of causal  Table 2. Genome-wise and suggestive significant SNPs for short-and medium-chain saturated fatty acid traits (SCFA and MCFA). mutations is still a challenge due to the commonly existing limitations [15]. At present, genome-wide association study (GWAS) has become a powerful strategy to identify genetic variants associated with complex traits. Since the first GWAS was published in 2005 [16], a great number of relative studies were conducted in human and domestic animals. Of them, several GWASs have been applied to detect genes or loci for milk production traits [17,18,19], conformation traits [19], reproduction traits [20,21], healthy traits [22,23], etc, in dairy and beef cattle. However, only studies have been carried out for fatty acids in Dutch dairy cattle [24,25]. We herein performed a GWAS for 22 milk fatty acid traits in a Chinese Holstein population to identify genes and chromosome segments with large effects on such traits.

Materials and Methods
The milk samples were collected along the regular quarantine inspection of the farms. The whole procedure for sample collection was carried out in strict accordance with the protocol approved by the Animal Welfare Committee of China Agricultural University (Permit Number: DK996).

Phenotypic data and traits
The Chinese Holstein population in this study comprised 784 cows, the daughters of 21 sire families. All cows in this study were from 18 farms of the Beijing Sanyuan Dairy Farm Center, where routine standard performance test, i.e. Dairy Herd Improvement system (DHI) have been carried out since 1999. A total of 50 ml milk sample was collected for each cow from the DHI laboratory of the Beijing Dairy Cattle Center during November to December, 2012. The procedure of milk sample collection was carried out corresponding to DHI sampling (dairy herd improvement). After DHI measure, the remaining milk samples were taken back to the laboratory within 4uC cooler and then stored at 220uC.
Phenotypic values of 16 kinds of main milk fatty acids were measured by gas chromatography at the Ministry of Agriculture Feed Industry Centre of China (http://www.mafic.ac.cn/intro/ default.asp), which included SFA of C8:0, C10:0, C12:0, C14:0, C16:0, C18:0, C20:0, C22:0; MUFA of C14:1, C16:1, C18:1n9c; PUFA of CLA (cis-9, trans-11 C18:2), C18:3n3, C18:3n6, C18:2n6c and C20:5n3. Before measuring, milk samples should be done with pretreatment. First, total milk fat were extracted from approximately 2 ml of each milk sample. The specific procedure was as follows: 2 ml milk was mixed with 4 ml solution of N-hexane/isopropyl alcohol (3:2) and 2 ml solution of Na 2 SO 4 , and centrifuged at 3,0006g for 20 min. The upper layer was collected into 20 ml hydrolysis tube and 200 ml of C19:0 methyl ester as the internal standard was mixed, and then the extracted fat was dried under nitrogen. Methyl esters of fat were performed in the next step. 2 ml of NaOCH3/Methanol was put into the above hydrolysis tube for 15 min water bath under 50uC, and was mixed  Table 3. Genome-wise and suggestive significant SNPs for long-chain saturated fatty acid traits (LCFA).  with 2 ml of hydrochloric acid/methanol solution (1:10) for 1.5 h water bath under 80uC. After the temperature fell to room temperature level, 3 ml of water and 6 ml of n-hexane were put into above hydrolysis tube, mixed, vortexed, and stratified. The upper layer was collected and dried under nitrogen, and finally dissolved in 1 ml of n-hexane. 1 ml methyl esters of fatty acids were prepared to be determined by gas chromatography using a gas chromatograph (6890N, Agilent) equipped with a flameionization detector and a high polar fused silica capillary column (SP TM -2560, 100 m60.25 mm ID, 0.20 mm film). About 1 ml sample was injected under the following gas chromatography conditions: Helium was used as the carrier gas at a flow of 45 ml/ min. The split ratio was 100:1. The oven temperature was programmed at 100uC and held for 10 min, then increased to 160uC at a rate of 6uC/min, held for 10 min, increased to 200uC with 5uC/min, held for 20 min, increased to 240uC at a rate of 4uC/min and held for 12 min. Both the injector temperature and the detector temperature were set on 260uC. Individual fatty acids were identified and quantified by comparing the methyl ester chromatograms of the milk fat samples with the chromatograms of pure fatty acids methyl ester standards (Supelco TM 37 Component FAME Mix), and were measured as the weight proportion of total fat weight (wt/wt%). Based on the phenotypes of 16 milk fatty acids, 6 additional traits were obtained including SFA, UFA, SFA/UFA (the ratio of SFA to UFA), C14 index, C16 index and C18 index. The 3 indices were calculated as cis{9 unsaturated cis{9 unsaturatedzsaturated |100 [26].
The descriptive statistics of these 22 fatty acid traits are presented in Table 1. Both SFA and UFA accounted for approximately 96% (wt/wt) of total fat.

Genotypes and quality control
The cows were genotyped using the Illumina BovineSNP50 BeadChip (Illumina Inc., San Diego, CA, US), of which, some   The quality control procedure was as follows, 20 daughters were excluded due to low call rate (,90%), leading to 764 daughters remaining for the association analysis. On the other hand, 11,736 SNPs were removed for falling to meet the following requirements: 652 SNPs with ,90% genotype call rate, 10,798 SNPs with a minor allele frequency (MAF) ,0.05, 286 SNPs with extreme value of Hardy-Weinberg equilibrium statistics (P,10 26 ). Eventually, 40,604 SNPs passed these quality control filters, which was 77.6% of the SNPs in the panel. The average distance between adjacent markers was quite constant among different chromosomes. The shortest average distance was 56 kb on BTA25, and the longest average distance was 75 kb on BTA5 (except for 198 kb on BTAX).

Statistical analysis
The statistical tests followed a two-step analysis. For the first step, phenotypic values were corrected for fixed non-genetic effects by using SAS 9.1 general linear model (GLM) procedure. The statistical model was: y ijkl~m zF i zP j zL k ze ijkl , where y ijkl was the unadjusted phenotype; m was the overall mean; F i was the fixed effect of farm; P j was the fixed effect of parity; L k was the fixed effect of stage of lactation; e ijkl was the random residual. In the second step, genome-wide association analyses were performed with quantitative trait procedure (additive model)of the PLINK software (v1.07) [27], and empirical p-values estimated based on the Wald-statistic. Individual pedigree of three generations was applied.

Significance level
Bonferroni correction was applied to adjust for multiple testing from the number of SNPs detected. A significant SNP at the genome-wise significance level was declared if a raw P value (unadjusted),0.05/N, N is the number of SNP markers tested in analyses [28]. In the present study, Bonferroni genome-wise significance was 1.23E-06 (0.05/40604). As the Bonferroni correction threshold levels were strict and may lead to high false negatives, we calculated suggestive significant association threshold P-value as previously described [29], which was 2.46E-05 (1/ 40604).

Indices of fatty acid traits (C14 index, C16 index, C18 index)
For indices of C14, C16 and C18, totally 84, 10 and 93 significant SNPs were detected, respectively. Forty-two SNPs associated with C14 index are located within a region of 16 Sum of fatty acid traits (SFA, UFA, SFA/UFA) A total of 108 significant associations mainly on BTA5, 10 and 20 with three sum of fatty acid traits were detected, which involved 52 distinct SNPs. Of them, 22 SNPs were simultaneously associated with three traits and 12 were common for two traits. The 0.96 Mbp region (72.69-73.65 Mbp) on BTA10 was associated with the three traits, in which the SNP (ARS-BFGL-NGS-4783) showed the strongest association for SFA (P = 6.07E-08), UFA (P = 4.01E-07) and SFA/UFA (P = 2.64E-07), respectively.

Discussions
To our knowledge, this is one of the first GWA study for milk fatty acids with high density SNP Chip. In this study, we detected a total of 83 genome-wise and 314 suggestive significant SNPs for 22 fatty acid traits. Among them, some SNPs are located within the QTL regions on BTA6, 14, 19 and 26 those have been reported by Stoop et al [13], Schenninket al [14] and Morris et al [30] for bovine milk fat composition. Sixteen SNPs on BTA14, 5 SNPs on BTA19, and 5 SNPs on BTA7 were consistent with the previous GWA study for fatty acid traits of dairy cattle [24]. However, associations of BTA19 with C16:1 and CLA were not found in this study. This is probably due to a different dairy population was tested. Several SNPs were found to be located within and/or close to genes that are known to have functions related to the milk composition. In addition, 20 novel prospective candidate genes affecting milk fatty acid traits were identified.

Chromosomes underlying novel promising candidate genes
On BTA1, 23 SNPs associated with 9 fatty acids (C10:0, C12:0, C14:0, CLA, C18:0, C18 index, C22:0, SFA and SFA/UFA) were detected. The SNP associated with SFA and SFA/UFA is 38,733 bp away from the 3-hydroxyacyl Coenzyme A dehydrogenase (EHHADH) gene. As a bi-functional enzyme, EHHADH is part of the classical peroxisomal fatty acid b-oxidation pathway, which is highly inducible via peroxisome proliferator-activated receptor a (PPARa) activation [31] and is essential for the production of medium-chain dicarboxylic acids [32]. Four SNPs for C18:0 and C18 index form an 0.40 Mbp region containing the 1-acylglycerol -3-phosphate O-acyltransferase 3 (AGPAT3) gene. AGPAT catalyzes the first step during de novo synthesis of triacylglycerol. AGPAT3 is a member of the acyltransferase family [33] and plays a key role in de novo phospholipid biosynthetic due to its function of converting lysophosphatidic acid into phosphatidic acid [34].
On BTA2, 21 SNPs showed associations with 7 traits (C10:0, C12:0, C14:0, C14:1, C18:0, C18 index and UFA). Two SNPs associated with C18:0 and C18 index are 0.50 Mbp away from the signal transducer and activator of transcription 1 (STAT1) gene, especially, one of them is the top 3 significant SNP for C18 index. STATs are transcription factors known to importance to cytokine signaling. STAT1 has a role in regulating the transcription of genes involved in milk protein synthesis and fat metabolism in Holstein [35]. The SNP associated with UFA is 0.14 Mbp and 19,295 bp away from the acyl-CoA synthetase long-chain family member 3 (ACSL3) gene and the monoacylglycerol O-acyltransferase 1 (MOGAT1) gene, respectively. ACSL3 is an isozyme of the long chain fatty acids coenzyme A ligase family that convert free long chain fatty acids into fatty acyl-CoA esters and has a substrate preference for PUFA [36]. Depletion of ACSL3 by RNAi causes a significant reduction in fatty acids uptake, thereby plays a key role in lipid biosynthesis and fatty acids degradation [37]. MOGAT1 catalyzes the synthesis of diacylglycerols, the precursor of triacylglycerol and phospholipids [38]. Two SNPs associated with C18:0 and C18 index are 5.70 Mbp and 5.74 Mbp away from the fatty acid binding protein 3 (FABP3) gene, respectively. FABP3 provides fatty acids for SCD, which is one of specific transporters for LCFA and one of the most abundant isoforms in bovine mammary tissue [39]. Eight contiguous SNPs associated with C18:0 and C18 index are located within a chromosome region of 63.58,98. 16 Mbp that overlaps a reported QTL region (67.56,68.25 Mbp) for C14 index, C16 index, C18 index, SFA, MUFA, PUFA and SFA/UFA [40].