Genome-Wide Association for Growth Traits in Canchim Beef Cattle

Studies are being conducted on the applicability of genomic data to improve the accuracy of the selection process in livestock, and genome-wide association studies (GWAS) provide valuable information to enhance the understanding on the genetics of complex traits. The aim of this study was to identify genomic regions and genes that play roles in birth weight (BW), weaning weight adjusted for 210 days of age (WW), and long-yearling weight adjusted for 420 days of age (LYW) in Canchim cattle. GWAS were performed by means of the Generalized Quasi-Likelihood Score (GQLS) method using genotypes from the BovineHD BeadChip and estimated breeding values for BW, WW, and LYW. Data consisted of 285 animals from the Canchim breed and 114 from the MA genetic group (derived from crossings between Charolais sires and ½ Canchim + ½ Zebu dams). After applying a false discovery rate correction at a 10% significance level, a total of 4, 12, and 10 SNPs were significantly associated with BW, WW, and LYW, respectively. These SNPs were surveyed to their corresponding genes or to surrounding genes within a distance of 250 kb. The genes DPP6 (dipeptidyl-peptidase 6) and CLEC3B (C-type lectin domain family 3 member B) were highlighted, considering its functions on the development of the brain and skeletal system, respectively. The GQLS method identified regions on chromosome associated with birth weight, weaning weight, and long-yearling weight in Canchim and MA animals. New candidate regions for body weight traits were detected and some of them have interesting biological functions, of which most have not been previously reported. The observation of QTL reports for body weight traits, covering areas surrounding the genes (SNPs) herein identified provides more evidence for these associations. Future studies targeting these areas could provide further knowledge to uncover the genetic architecture underlying growth traits in Canchim cattle.


Introduction
Growth traits are traditionally included in selection criteria in beef cattle breeding programs, due to their association with meat production, and therefore are of great economic importance for both breeders and the industry [1]. The most common type of growth trait used in the selection process is the body weight measurement, which can be taken from birth and throughout an animal's life. These traits are used not only for evaluation of growth and development, but also for decision making about reproduction, nutritional, and prophylactic management. Body weight usually presents heritability and genetic correlation coefficients from medium to high magnitude [2][3][4].
Technological advances allow the use of massive genotype information through single nucleotide polymorphism (SNP) panels for animal breeding. For cattle, there are several SNP panel densities commercially available from Illumina [5] and Affymetrix [6], varying from a few thousand to more than 3 million SNPs [7], which are applicable for genome-wide association and genomic selection studies. The concept of genomic selection (GS) was first presented by Meuwissen et al. [8], as an alternative to predict more accurate breeding values through genetic markers (i.e. genomic breeding values), and increase the rate of genetic gain by reducing generation intervals. Genome-wide association studies (GWAS) were designed to identify regions that could clarify the inheritance of complex traits [9]. If candidate regions are identified it could be useful for livestock improvement through GS by focusing on relevant genomic regions [10,11].
For growth traits in crossed beef cattle, important genomic associations for birth weight, weaning weight, and yearling weight were reported by Snelling et al. [12]. These authors also found the same significant SNPs for different growth traits, indicating that a pleiotropic effect exists and genes that are responsible for body weight performance at earlier ages are also acting later on. Furthermore, many of these SNPs observed by Snelling et al. [12] were located in quantitative trait loci (QTL) previously reported for birth weight [13,14], pre-and post-weaning body weight gain [15], yearling weight [13], stillborn calves [16], and calving difficulty [17]. The presence of QTL for birth weight and calving difficulty support that calf birth weight should be monitored in order to avoid problems with dystocia [18].
Many breeds specialized in meat production are reared in Brazil, such as Canchim, which is derived from crosses between Charolais and Zebu animals. Quantitative research has previously been conducted in this breed to evaluate genetic parameters for growth, reproductive performance, and meat quality traits. However, the specific regions or genes responsible for the genetic variation of these traits are still unknown. Recently, Mokry et al. [19] studied backfat thickness in Canchim cattle and found genomic associations that could aid in GS. Backfat thickness is also a trait of economic importance and has recently gained attention from producers, who wish to increase fat deposition in animals raised on pastures. Thus, the aim of this study was to identify, through GWAS, genomic regions and genes that play roles on birth weight, weaning weight, and long-yearling weight in Canchim cattle.

Ethics Statement
This study was performed with the approval of the Embrapa Southeast Livestock Ethical Committee of Animal Use (CEUA-CPPSE) under protocol number 02/2009.

Animals and Data
The BovineHD Beadchip SNP panel from Illumina was used for genotyping 194 males and 205 females, of which 285 were Canchim animals, and 114 were from the ''MA'' genetic group. The Canchim breed was developed in Brazil through mating schemes between Charolais and Zebu breeds, which generated animals with 5/8 (62.5%) Charolais and 3/8 (37.5%) Zebu fractions [20]. Different schemes were developed to ensure genetic variability and to expand the genetic base of the breed [21]. In particular, the ''MA'' genetic group produces individuals with an approximate proportion of 65.6% Charolais and 34.4% Zebu, and is very popular among producers. MA animals are the product of crossings between group ''A'' individuals (derived from crossings between Canchim and Zebu) with Charolais animals [19,20,22].
No specific permissions were required for these locations/states and the field studies did not involve endangered or protected species. All farms participate in the Canchim breeding program, in which the database is maintained by Embrapa Beef Cattle through the responsibility of the researchers Dr. Roberto Augusto de Almeida Torres Júnior and Dr. Luiz Otávio Campos da Silva. The genomic data used in this study can be available upon request from Dr. Luciana Correia de Almeida Regitano (Embrapa Southeast Livestock. Address: Rodovia Washington Luiz, km 234, São Carlos, São Paulo, 13560-970, Brazil, Telephone: 55+16 34115600).
Data on estimated breeding values (EBVs) for birth weight (BW), weaning weight adjusted for 210 days of age (WW), and long-yearling weight adjusted for 420 days of age (LYW) were provided by the National Association of Canchim Breeders (ABCCAN) and by the Embrapa-Geneplus Beef Cattle Breeding Program. The EBVs for each trait were obtained by means of multi-trait analysis, which was done using the REMLF90 software [23] under an animal model that included the additive genetic, maternal genetic (only for BW and WW), and residual random effects; contemporary group (CG) as a fixed effect for all the traits, and age of the dam at calving as a covariate. The effects considered in the formation of CGs were sex, birth year and birth season (January to March; April to September; and October to December), farm of birth, genetic groups, and feeding regimen. The relationship matrix of the genotyped animals consisted of a total of 4,095 individuals. The average inbreeding coefficient was 0.02.

Genotype Quality Control
The SNPs with a genotype calling score lower than 15% were treated as missing genotypes, as recommended by the Infinium platform [24]. Genotype quality control was applied to exclude SNPs with significant (P,10 25 ) Hardy-Weinberg Equilibrium deviation; heterozygous excess (.15%); minor allele frequency (, 5%); and call rate (,90%). Animals with call rate lower than 90% were also excluded and only autosomal SNPs with known genome position, according to the UMD_3.1 bovine assembly map [25], were used for the GWAS.

Genome-wide Association Study
GWAS were carried out by means of the Generalized Quasi-Likelihood Score method (GQLS), developed by Feng et al. [26] and implemented in the SLEUTH software by Dr. Mehdi Sargolzaei. In this method, a logistic regression was used to associate the EBVs (treated as a covariate) with genotypes (treated as a response variable). Analyses were done for one SNP at a time, in which X i~( X 1 , . . . ,X n ) 0 represents the EBVs (X i ) for the i th animal; and Y i~( Y 1 , . . . ,Y n ) 0 represents the genotypes (Y i ), considering Y i = K* (number of alleles for the i th animal's genotype). As the genotypes were coded as ''0'', ''1'', and ''2''; their respective proportions would be 0, K, and 1. The expected SNP allele frequency is represented by m, in which m~(m 1 , . . . ,m n ) 0~E (X jY ), thus 0vm i v1.
To associate m i with X i , the following logistic regression is defined as: The hypothesis to verify the association of the SNP with the EBV assumes: H 0 : b 1~0 , null hypothesis (the SNP is not associated with the EBV); , alternative hypothesis (the SNP is associated with the EBV).
Considering the null hypothesis, m i can be interpreted as To estimate the ''generalized quasi-likelihood score'', the W G statistics can be calculated by: According to Heide [27], under a null hypothesis, W G follows a Chi-squared distribution with one degree of freedom, resulting in p-values for each SNP. The advantages of the GQLS method are that it allows an unspecified distribution of EBVs, the method is suitable for studies with either a quantitative trait or a binary trait, and proposes the correction for the population stratification problem by considering the relationship matrix (A {1 ) for calculating W G . However, the effect of the SNP in this methodology is not able to be estimated. Thus, single regression analyses for each significant SNP were carried out, as described in the following section.

Allele Substitution Effects
Single regression analyses were carried out to verify the genetic additive effects of each significant SNP, previously identified in the GQLS analyses. The following model was applied: in which y i is the EBV for the trait of interest; m is the mean value of the EBV (constant variable);b is the linear regression coefficient (allele substitution effect); x i is the number of copies of a given allele (0, 1, or 2); and e i is the residual effect.

Correction for Multiple Testing
The false discovery rate (FDR) was used for multiple testing correction [28] in order to verify significant SNPs. A maximum threshold of 10% FDR for each chromosome (chromosome-wise) was considered. The p-values of each SNP were sorted in ascending order and the following formula applied: in which q is the desired level of significance, m is the total number of SNPs, and P is the p-value of the i th SNP. The SNPs were considered as significant when mP(i)=i resulted in a value lower than q.

Gene Mapping and In-silico Functional Analyses
The significant SNPs were surveyed to their corresponding genes or to surrounding genes within a distance of 250 kb, using the genome databanks National Center for Biotechnology Information (NCBI) [29] and Ensembl Genome Browser [30]. Functional analysis of the mapped genes was performed by means of the UniProt website [31] to verify functional information of the genes.
When no information was available for the Bos taurus genes, annotations from human, rat or mouse orthologs were used to proceed with the in-silico functional analyses. AnimalQTLdb [32] was accessed to verify previous QTL reported for growth traits in the surrounding regions of significant SNPs.

Results
The descriptive statistics of estimated breeding values for BW, WW, and LYW are presented in Table 1. As our individuals were a sample from a larger population (of which the breeding values were estimated), it was expected that the mean EBV for each trait was different than zero.
A total of 786,799 SNPs were originally present for each animal, and after the quality control, a total of 672,778 SNPs remained for the GWAS. The number of SNPs evaluated in each Bos taurus autosome (BTA) is presented in Table 2. On average, 9.44% of the SNPs were excluded after the genotype quality control. The BTA22 and BTA13 presented the lowest and highest quantity of excluded SNPs, respectively.
The results from the GQLS and single regression analyses, and SNP and gene identification are presented in Table 3. A total of 4, 12, and 10, SNPs were significantly associated (on a chromosomewise level) with BW, WW, and LYW, respectively. A total of 18 SNPs were located in gene intron regions, whereas seven were located in intergenic regions, and only one was located in an upstream region. Figure 1 shows the Manhattan plots for chromosome-wise significant regions for BW, WW, and LYW, respectively. Linkage disequilibrium between significant SNPs in each chromosome (BTA) were calculated using the r 2 coefficient [33], resulting in average values of 0.11 (BTA6, for WW), 0.80 (BTA7, for LYW), 0.91 (BTA9, for BW), 1.00 (BTA11, for WW), 1.00 (BTA22, for LYW), and 1.00 (BTA27, for LYW), respectively.

Birth Weight (BW)
The SNP rs43421095 was significantly associated with BW and is located in the DPP6 (dipeptidyl-peptidase 6) gene ( Table 3). The DPP6 gene is involved in proteolysis and nervous system development of mouse [34]. This SNP is located in the same region of a QTL associated with yearling weight and calving ease for Angus cattle [35]. Schrooten et al. [17] observed a QTL associated with gestation length in dairy cattle located between 102.10 and 124.80 cM.
On BTA9, the SNPs rs135754703, rs136146400, and rs109313268 are located close to MANEA (endo-alpha mannosidase) and the LOC783932 (small ubiquitin-related modifier 1-like) genes. The MANEA gene product is related to the glycoprotein endo-alpha-1,2-mannosidase activity, which acts in metabolism of protein pathways [36]. For the LOC783932 gene, no information regarding its function was observed in the consulted literature. In the same region, Alexander et al. [37] found a QTL associated with post-weaning average daily gain located between 48.73 and 80.26 cM. Snelling et al. [12] observed five significant SNPs (FDR#10%), located between 112,474,006 and 114,565,961 bp on BTA4; and three SNPs, located between 54,832,703 and 57,515,862 bp on BTA9, associated with birth weight. One of these SNPs found by these authors (rs108988191) is also located close to the MANEA and LOC783932 genes.
For Canchim cattle, Machado et al. [22] found a QTL on BTA5 associated with BW, which contained the microsatellite markers ILSTS066, TEXAN15, and BMS1248 and the microsatellite on the IGF1 gene. Andrade et al. [20] verified association of BW with microsatellite markers in the IGF1 gene, also located on BTA5. The reason why we didn't find associations for BW on Table 1. Descriptive statistics for birth weight (BW), weaning weight (WW), and long-yearling weight (LYW) in animals with genotype information. chromosome 5 could be due to the differences in experimental designs, and because the previously cited authors studied only animals from the Embrapa Southeast Livestock herd, while in our study animals from other farms and other Brazilian states were used. The genes here identified as associated with BW seem to act on an animal's body development, with special emphasis on the DPP6 gene, considering its function in the nervous system. Our discovery also provides more evidences for the genetic relationship of BW with gestation length and calving ease. In addition, our findings corroborate with quantitative genetic studies, which present genetic correlations of moderate to high magnitude between BW and reproductive traits [38,39].

Weaning Weight (WW)
On BTA 4, a pleiotropic effect for rs43421095 SNP was found between BW and WW, which corroborates with the results obtained by Snelling et al. [12]. These authors found pleiotropic effects for birth weight, weaning weight, and yearling weight; which means that the genes that are acting on body weight performance at earlier ages are also acting later on. On BTA6, the rs135156506 SNP (Table 3) is located in the upstream region of the FARSB gene (phenylalanyl-tRNA synthetase, beta subunit pseudogene); which plays a role in aminoacyl-tRNA biosynthesis pathway. For WW, McClure et al. [35] found a QTL on BTA6 between 8.05 and 43.93 cM that contains the rs135156506 SNP. The KCNIP4 gene (Kv channel-interacting protein 4) has a role in the calcium ion binding, and in potassium and voltage-gated ion channel activity [40]. A QTL associated with yearling weight was found by McClure et al. [35] between 46.86 and 53.72 cM.
The RALGDS gene product (ral guanine nucleotide dissociation stimulator), located on BTA11, is related to the guanyl-nucleotide exchange factor activity and participates in the regulation of small GTPase mediated signal transduction [41,42]. No biological function for the GTF3C5 gene (general transcription factor 3C polypeptide 5) was found in the literature. Snelling et al. [12] observed three significant SNPs for WW located between 113,506,092 and 114,302,482 bp, on BTA4; five SNPs located between 33,441,527 and 37,584,088 bp, on BTA6; and nine SNPs located between 41,178,449 and 42,609,559 bp, on BTA6. These SNPs were located in similar regions of the SNPs that were significant in the present study. Regarding the Canchim breed, Andrade et al. [20] found associations of WW with microsatellite markers in the IGF1 gene on BTA5.

Long-Yearling Weight (LYW)
For LYW, pleiotropic phenomenon was not observed with BW and WW, which means that different genes are acting in earlier and older ages. Significant SNPs close to the MARCH3 (membraneassociated ring finger (C3HC4) 3), LMNB1 (lamin B1), PHAX (phosphorylated adaptor for RNA export), ALDH7A1 (Aldehyde dehydrogenase family 7 member A1), C7H5orf48 (chromosome 7 open reading frame, human C5orf48), GRAMD3 (GRAM domain containing 3), and LOC100848523 (60S ribosomal protein L26-like) genes and micro RNA MIR2458 (microRNA mir-2458) were observed. According to Fukuda et al. [43], the MARCH3 gene participates in the processes of endocytosis and protein ubiquitination. The LMNB1 gene contributes in the maintenance of structural integrity of a complex or assembly within or outside a cell [44]. The PHAX gene acts in protein transport and in snRNA export from the nucleus to the cytoplasm [45]. The ALDH7A1 gene has a role in the glycine betaine biosynthetic pathway [46]. Information regarding the functions of C7H5orf48, LOC100848523, GRAMD3, and MIR2458 was not found in the consulted literature. Snelling et al. [12] observed a SNP (rs109544319) significantly associated with yearling weight in the C7H5orf48 gene. In the region where these significant SNPs were located, no QTL for growth traits has been previously found. However, Sherman et al. [47] found a QTL on BTA7 associated with residual feed intake located between 11.70 and 48.50 cM in Angus, Charolais, and hybrid bulls. Maltecca et al. [48] and Druet et al. [49] verified QTL associated with gestation length and spermatic motility located between 16.70 and 39.50 cM and between 25 and 71 cM, respectively.
The LARS2 gene is involved in aminoacyl-tRNA editing leucyl-tRNA aminoacylation, and in translational fidelity [50]. The ZDHHC3 gene product participates in transferase activity (transferring acyl groups) and in interaction with zinc ion [51]. The EXOSC7 gene participates in positive regulation of cell growth and rRNA processing [52]. The CLEC3B gene participates in calcium ion binding. This gene encodes a protein that is important in bone mineralization, cellular response to transforming growth factor beta stimulus, positive regulation of plasminogen activation, and skeletal system development [25,53]. QTL associated with yearling weight was found by McClure et al. [35], located between 64.08 and 82.93 cM.
The SNP rs109242147 is located in the intron region of the XYLT1 (xylosyltransferase I) gene, which participates in the cellular response to heat and in the glycosaminoglycan biosynthetic pathway [54]. McClure et al. [35] observed QTL for yearling weight located between 2.24 and 14.44 cM. On BTA27, five SNPs were significantly associated with LYW and are located in the LOC101904868 gene (CUB and sushi domain-containing protein 1-like). However, no biological functions or QTL were reported for this gene in the consulted literature.

Conclusions
The Generalized Quasi-Likelihood Score method identified regions on chromosome associated with birth weight, weaning weight, and long-yearling weight in Canchim and MA animals. New candidate regions for body weight traits were detected and some of them have interesting biological functions, of which most have not been previously reported. The observation of QTL reports for body weight traits, covering areas surrounding the genes (SNPs) herein identified provides more evidence for these associations. Future studies targeting these areas could provide further knowledge to uncover the genetic architecture underlying growth traits in Canchim cattle.