Impact of Genotype Imputation on the Performance of GBLUP and Bayesian Methods for Genomic Prediction

The aim of this study was to evaluate the impact of genotype imputation on the performance of the GBLUP and Bayesian methods for genomic prediction. A total of 10,309 Holstein bulls were genotyped on the BovineSNP50 BeadChip (50 k). Five low density single nucleotide polymorphism (SNP) panels, containing 6,177, 2,480, 1,536, 768 and 384 SNPs, were simulated from the 50 k panel. A fraction of 0%, 33% and 66% of the animals were randomly selected from the training sets to have low density genotypes which were then imputed into 50 k genotypes. A GBLUP and a Bayesian method were used to predict direct genomic values (DGV) for validation animals using imputed or their actual 50 k genotypes. Traits studied included milk yield, fat percentage, protein percentage and somatic cell score (SCS). Results showed that performance of both GBLUP and Bayesian methods was influenced by imputation errors. For traits affected by a few large QTL, the Bayesian method resulted in greater reductions of accuracy due to imputation errors than GBLUP. Including SNPs with largest effects in the low density panel substantially improved the accuracy of genomic prediction for the Bayesian method. Including genotypes imputed from the 6 k panel achieved almost the same accuracy of genomic prediction as that of using the 50 k panel even when 66% of the training population was genotyped on the 6 k panel. These results justified the application of the 6 k panel for genomic prediction. Imputations from lower density panels were more prone to errors and resulted in lower accuracy of genomic prediction. But for animals that have close relationship to the reference set, genotype imputation may still achieve a relatively high accuracy.


Introduction
Genomic selection has become a new tool for genetic improvement in livestock species and plants thanks to the discovery of many thousands of single nucleotide polymorphisms (SNP) and cost-effective high-throughput genotyping technology. Since the first publication of Meuwissen et al. [1], numerous statistical methods have been proposed for genomic prediction. Two main categories are genomic best linear unbiased prediction (GBLUP) methods [2][3][4][5][6][7], and Bayesian methods [1,[8][9][10].
Assumptions of SNP marker effects on the trait vary across different statistical methods. In general, GBLUP methods assume that all markers effects are from a normal distribution, while Bayesian methods assume non-normally distributed marker effects with distributions vary in different methods. Performance of different methods depends on the genetic architecture underlying the studied trait [11]. For traits that are affected by a few large quantitative trait loci (QTL), Bayesian methods usually outperform the GBLUP method, while for traits that are affected by many QTL with small effects, GBLUP would likely perform better than or similar as the Bayesian methods [11][12][13].
Accuracy in estimated breeding values has been increased significantly for young candidates in dairy cattle from application of genomic prediction using the BovineSNP50 BeadChip (50 k; Illumina Inc., San Diego, USA) [12,13]. While the current price of genotyping on the 50 k panel still hurdles a large number of male and female candidates to be screened, a cost-effective strategy has been proposed that uses cheap low density panels for genotyping followed by imputing to the 50 k panel [14][15][16][17]. In October 2011, Illumina, Inc. has released the BovineLD Genotyping BeadChip (6 k) to replace its predecessor, the Illumina Golden Gate Bovine3K chip (3 k). Compared to the 3 k panel, the 6 k panel has lower genotyping errors, denser genome coverage, and more accuracy in imputing to 50 k genotypes [14,18]. The GeneSeek Genomic Profiler (GGP) BeadChip has also become available since February 2012. The GGP chip has slightly higher imputation accuracy than the 6 k panel but performs similar as the 6 k panel in genomic evaluations [19]. The Illumina Golden Gate technology also allows an array to be customized containing 96, or from 384 to 3072 SNP loci. Genotype imputation from these lower density panels to 50 k genotypes was poorer than those from the 3 k, 6 k or the GGP chip. But these lower density panels may still have their applicability for genotyping selection candidates whose both parents have genotypes with at least one genotyped on the 50 k panel [16,20]. The cheap, cost-effective low density panels and high accuracy of genotyping imputation would facilitate genomic selection by allowing screening on a larger number of young bulls and allowing for the selection to be conducted on cows.
Studies have shown that imputation errors affect the accuracy of genomic prediction [21,22]. However, no comparisons of the impact of imputation errors on different genomic prediction methods have been reported. The aim of this study was to evaluate the performance of the GBLUP and Bayesian methods when imputed genotypes were used for genomic prediction.

Genotypes
A total of 10,309 Holstein dairy bulls born between 1950 and 2007 were genotyped on the Illumina BovineSNP50 BeadChip (50 k; Illumina Inc., San Diego, USA). SNPs with minor allele frequency (MAF) less than 0.05, missing rate more than 15% or Pvalue from Hardy-Weinberg disequilibrium test smaller than 0.0001 were filtered. After filtering, 35,790 SNPs with known locations on autosomal chromosomes were kept for analyses.
Five low density SNP panels including the Illumina Golden Gate Bovine3K BeadChip (3 k), the BovineLD BeadChip (6 k), and three simulated panels containing 384 (L384), 768 (L768), and 1536 (L1536) SNPs were considered in this study. In fact, 2,480 and 6,177 SNPs, from the 3 k and 6 k panel, respectively, coincided with the 50 k panel. Therefore, genotypes of the 2,480 and 6,177 SNPs extracted from the 50 k panel were used instead of re-genotyping animals on the 3 k and 6 k panels. The size of the simulated panels was chosen according to Illumina Golden Gate technology, which allows customized genotyping with 384 to 3,072 SNPs. SNPs in the three simulated panels were selected from the 50 k panel by compromising between uniform marker density and high MAF [23]. Number of SNPs on each chromosome is described in Table 1 for all panels.

Phenotypes
Milk yield, fat percentage, protein percentage, and somatic cell score (SCS) traits were considered for this study. Official bull proofs and reliabilities in April 2008 and December 2011 were obtained from Canadian Dairy Network (CDN). De-regressed proofs and corresponding reliabilities were derived by CDN from the 2008 bull proofs. Bulls born before 2004 and had reliabilities of de-regressed proofs greater than 0.80 were used as the training data set (n = 1,608). Bulls born in or after 2004 and had reliabilities of proofs greater than 0.80 in 2011 were used for validation (n = 3,232). Bulls born before 2004, but had less reliable deregressed proofs in 2008 (n = 5,469) were used for genotype imputation only. Numbers of bulls in different groups are listed by birth year in Table 2.

Scenarios
Four scenarios were designed to mimic situations that different proportion of bulls in the training and validation sets were genotyped on low density panels. In scenario 0 (S0), all bulls (n = 10,309) had genotypes on the 50 k SNP panel. In scenario 1 (S1), all bulls in the training set had genotypes from the 50 k panel, and all bulls in the validation set (n = 3,232) had genotypes on low density panels. In scenario 2 (S2) and scenario 3 (S3), 33% and 66% of bulls were randomly selected from the training set, respectively, to have low density genotypes, and all bulls in the validation set had low density genotypes. Size of the reference population was also reduced correspondingly to have the same proportion of animals that had 50 k genotypes. Animals in the training set that had 50 k genotypes were also combined with the reference population for genotype imputation and thus the actual numbers of bulls in the reference population used for genotype imputation were 7,077 for S1, 4,741 for S2, and 2,406 for S3.
Genotype imputation was implemented using software FImpute (version 2) developed by Sargolzaei et al. [24]. FImpute uses family imputation algorithm followed by population imputation steps based on a sliding window technique [25].

Genomic prediction
A GBLUP method and a Bayesian method with a spike and slab mixture prior distribution for marker effects were used to predict direct genomic values (DGV) for bulls in the validation set. The GBLUP method used a genomic relationship matrix proposed by VanRaden [5] and was implemented via the GEBV software [26]. For the Bayesian method, the statistical model can be written as: where y i is the phenotypic value (de-regressed EBVs) for animal i; m is the population mean; n is the total number of animals, and m is the total number of SNPs; x ij is the genotype coded 0, 1 or 2 as number of copies from a randomly chosen allele; b j is the regression coefficient (allele substitution effect) for SNP j; and e i is the random residual effect. The spike and slab mixture prior distribution for SNP effects is a mixture of two normal distributions with weights p and (1 -p), i.e., , where t is an arbitrarily small value. p was assigned a uniform prior distribution with a support on (0, 1). A scaled inverse Chi-square distribution with degree of freedom v b and a scale S 2 b was assigned to s 2 b . The strategy of using a group specific variance rather than a locus specific variance was to avoid the lack of Bayesian learning as in the BayesB method [1,9,27]. The arbitrarily small value t shrinks the SNP effects towards zero so that it can effectively, but not completely remove an irrelevant SNP from the model. Advantages of using such a two-group mixture distribution have been discussed previously [28,29]. m was given a flat prior distribution, and the residual was assigned a normal distribution, i.e., e i js 2 e À Á *N 0,s 2 e À Á , where s 2 e follows a scaled inverse Chi-square distribution with degree of freedom v e and a scale S 2 e . A Gibbs sampling algorithm was developed to draw inferences from the joint posterior distribution. Hyper-parameters were determined before running Gibbs sampling. v b and v e were arbitrarily set to 4 and 10, respectively. t was chosen to be 0.0001. The scale parameter S 2 b was derived from the expected value of a scaled inverse chi-square distributed random variable, i.e., where s 2 is the random environmental error variance. s 2 a and s 2 were estimated by ASReml 3.0 [30] from a preliminary analysis on de-regressed EBVs using an animal model. Residual effects were assumed to have a homogeneous variance due to a high cut-off value for the reliability of de-regressed EBVs in the training population.
A self-developed computer program was written with ANSI C language to implement the Gibbs sampler. The Gibbs sampling was run for 100,000 iterations with the first 20,000 cycles discarded as burn-in. Burn-in period were determined by visually inspecting the Gibbs sampling chain. All samples were kept after the burn-in and the sample means were used as estimates for SNP effects. Direct genomic breeding values for bulls in the validation set were estimated by summing the SNP effects over all loci.

Evaluation
Imputed genotypes were compared to the actual genotypes from the 50 k panel and the percentage of genotypes imputed correctly out of the total imputed genotypes was calculated as a measure of imputation accuracy. For genomic prediction, Pearson's correlation coefficient between DGV and bull proofs from 2011 for bulls in the validation set was estimated as a measure of accuracy.

Results and Discussion
Accuracy of genotype imputation Table 3 shows the accuracy of genotypes that were imputed from various low density panels to the 50 k SNP panel under different scenarios. The imputation accuracy was the highest (0.9841) when all bulls in the training set were genotyped with 50 k panel and bulls in the validation set were genotyped on the 6 k panel. Accuracy was dropped rapidly when the density of the SNP panels was decreased. Imputation accuracy from 6 k SNP panel were greater than that from the 3 k panel by about 2, 3 and 4 percentage points, when 0%, 33% and 66% of animals in the training set, respectively, were also genotyped on the low density panel. The accuracy decreased as more training bulls were genotyped on low density panels which, consequently, resulted in reduced reference size for imputation. The trends of imputation accuracy were consistent with reports from other studies [14,22,23].
Prediction accuracy using actual 50 k SNP genotypes Table 4 shows the accuracy of DGV in the validation set predicted via the GBLUP and Bayesian methods using the observed 50 k SNP genotypes in the training and validation sets. The Bayesian method outperformed GBLUP by 3, 11, 5 and 0 percentage points, for milk yield, fat percentage, protein percentage and SCS, respectively, with the greatest difference for fat percentage and the least for SCS.
Results also shown in the table were the posterior estimates of p, which were 0.96, 0.99, 0.99, and 0.93, for milk yield, fat percentage, protein percentage, and SCS, respectively. p is an indicator on the proportion of SNPs that were expected to have no effects on the trait. When p is small, the trait is likely affected by many QTL with small effects, and when p is large, a few large QTL are expected to influence the trait [9]. Daetwyler et al. [11] concluded that Bayesian methods would perform similarly or slightly worse than GBLUP when the trait was affected by many QTL each with a small effect, and better if the trait was influenced by a few large QTL. The largest p in our study was estimated for fat percentage, for which the Bayesian method showed the greatest advantage compared to GBLUP. A relatively smaller estimate of p for SCS corresponded to no difference of accuracy between the Bayesian and GBLUP methods. These results agreed with other studies comparing the performance between GBLUP and Bayesian methods for traits with different genetic architectures [12,13].
To show the difference of estimated SNP effects between the GBLUP and Bayesian methods, Figure 1 plots estimated SNP effects on chromosome 14 for fat percentage. Both GBLUP and the Bayesian methods highlighted a QTL region harbouring the known DGAT1 gene which has been confirmed to have a large effect on milk fat percentage in cattle [31]. The Bayesian method tended to select fewer relevant SNPs, while the GBLUP method picked many more SNPs surrounding the QTL.
Prediction accuracy using imputed 50 k SNP genotypes Table 5 presents the accuracy of DGV for bulls in the validation set under different scenarios. Accuracies achieved from the Bayesian method were greater than or similar to those from GBLUP, for all low density panels, under all scenarios. Accuracy of genomic prediction decreased when the density in the SNP panel was reduced. For scenarios where genotype imputation subjected to more errors, accuracy of genomic prediction also declined more rapidly. The scenario S1used the same training population as scenario S0, and thus the estimated p values from the Bayesian method were the same as in S0. For scenario S2 and S3 in Table S1, the estimated p values were shown.
The trend that the accuracy changed with the density of SNP panel and with the proportion of training bulls being genotyped on the low density panels agreed with results of Weigel et al. [22]. However, in their study, only up to a 3 k low density panel was evaluated. Our study revealed that the 6 k SNP panel performed better than the 3 k panel and resulted in the least reduction of genomic prediction accuracy among all the low density panels. In fact, no reduction was observed for 6 k panel using GBLUP method under all scenarios, and only slight reductions (up to 1 percentage point) were observed for the Bayesian method under S2 and S3.
The influence of imputation errors on the genomic prediction accuracy also depends on the traits. For example, when bulls in the validation set were imputed from L384 SNP panel, accuracy of DGV predicted via GBLUP was reduced by a rate of 20, 34, 20 and 15%, for milk yield, fat percentage, protein percentage, and SCS, respectively; and for the Bayesian method, the accuracy was dropped by 22, 44, 25, and 15%, respectively. This was likely due to different genetic architectures underlying different traits. For traits affected by a few large QTL, such as fat percentage, accuracy of genomic prediction seemed much more sensitive to imputation errors than traits controlled by many small QTL, such as SCS. A similar trend was observed in a simulation study [32], The reference population sizes used for imputation were n = 7,077, n = 4,741 and n = 2,406 for scenario S1, S2 and S3, respectively; 0%, 33%, and 66% of the training set in scenario S1, S2, and S3, respectively, and all bulls in the validation set were genotyped on the low density panel. where the accuracy of genomic prediction from low density panels declined much more rapidly for traits with a smaller number of QTL. Figure 1 showed that for chromosome regions with a large QTL, the Bayesian method tended to select fewer relevant SNPs, while GBLUP or the Ridge-regression picked many more SNPs surrounding the QTL. For other regions, the Bayesian method placed less weight on the SNPs than GBLUP. Therefore, the Bayesian method could suffer more if the few relevant SNPs were imputed with error, but the GBLUP method would suffer from imputation errors accumulated over many more SNPs. This could possibly explain why the Bayesian method had a greater reduction rate in accuracy for traits with large QTL and why the Bayesian method still resulted in higher or equal accuracies than GBLUP for all scenarios. Relative performance of the two methods might be more related to distributions of imputation errors. If more imputation errors were distributed around the QTL, one could speculate that the Bayesian method would suffer more from these errors than GBLUP and consequently resulted in more reductions in the accuracy of genomic prediction. Most of the economically important traits in dairy cattle may be controlled by large number of QTL with small effects, and the GBLUP and Bayesian method would perform similar for genomic prediction for these traits [11][12][13]. It is expected that for most of the complex traits the impact of imputation errors on the GBLUP and Bayesian method will be similar as observed for SCS in this study.
To examine the imputation errors on SNPs with large effects, two SNPs within the DGAT1 gene region were chosen. The two SNPs had the largest effects estimated from both the GBLUP and Bayesian methods. Imputation accuracies under scenario S1 are shown in Table 6. Both SNPs are present in the 6 k panel and thus the accuracies were 1. Imputation from the 3 k panel reached a higher accuracy than the average as shown in Table 3. But for L1536, L768, and the L384 panels, imputation accuracies for the two SNPs were below average. This could explain the rapid decline in accuracy of genomic prediction to 0.63 from 0.75 for the low density panel L1536 although the average imputation accuracy was not low (0.9430). To further investigate the potential of selecting SNPs with largest effects in the design of low density panels, actual genotypes from the two SNPs were included for genomic prediction under scenario S1 together with other imputed genotypes from the low density panels. Table 7 shows the results of genomic prediction using the GBLUP and Bayesian methods. Accuracy was increased by up to 2 percentage points for the GBLUP method. But for the Bayesian method, accuracies were increased by 1, 10, 13, and 22 percentage points, respectively, for the 3 k, L1536, L768, and L384 panels. These results suggest that a low density panel comprising SNPs with largest effects has the potential to preserve the accuracy of genomic prediction from higher density panels.
Using either the GBLUP or Bayesian method, the 6 k SNP panel performed better than the 3 k panel. In this study, genotypes on both the 3 k and 6 k panel were simulated from the 50 k genotypes. In reality, there are more genotyping errors in 3 k genotypes than in 6 k or 50 k due to the Golden Gate genotyping technology used for the 3 k panel [14]. So, results could be worse for the 3 k panel in practice. The other lower density panels performed worse due to more inaccurate imputation of SNP genotypes. Accuracy for either genotype imputation or genomic prediction was evaluated as an average for the population. Examining imputation accuracies at an individual animal level showed that even for very low density SNP panels, there were still a substantial number of bulls that achieved high imputation accuracies. These animals should have close relationships to the reference set which were used by FImpute. In results not shown above, when both parents were genotyped on the 50 k SNP panel, accuracy of imputation from L384 SNP panel to 50 k was 0.9436. The imputation accuracy was 0.7738 when only sires were genotyped on the 50 k panel, and 0.6213 when no parents had 50 k genotypes. Zhang and Druet [23] also found that when the genomes of target animals were fully inherited from reference animals, imputation errors from a 384 SNP panel to 50 k SNP panel can be as low as 3.2%. In this study, imputed genotypes were included in the training set regardless of their imputation accuracies. In practice, only animals that have their genotypes imputed with high accuracies should be included in the training set. This scenario should be further studied to investigate the impact of using imputed genotypes on the accuracy of genomic prediction. Table 4. Accuracy of genomic prediction and posterior estimates of p using observed 50 k SNP genotypes under scenario S0 1 .  In the future, more and more animals might be genotyped on low density panels. One might have to decide whether to include these animals in the training population to derive genomic prediction equations. From Table 5, the accuracy of genomic prediction was consistently reduced when more animals in the training set were imputed when the density of the SNP panel was lower than 6 k. For the 6 k panel, accuracy of genomic prediction was almost not altered even 66% of the training set was imputed. This of course, was achieved by the high genotype imputation accuracy for the 6 k panel. Currently, nearly all males used for breeding are genotyped or re-genotyped on panels with a density of 6 k or higher and results from this study justified the application of the 6 k panel. In this study, the sample size of the training set was kept constant regardless the use of imputation. The use of imputation could, however, also aid to enlarge the sample size of the training set, which might in fact increase the accuracy of the DGV. This possible scenario warrants further investigation. This study only uses bull genotypes for genomic evaluations. Large numbers of females in dairy cattle have also been genotyped on the 50 k or low density panels. Although cautions have been taken to include cow genotypes in the training population currently due to biases in genomic evaluations caused by preferential treatment of elite cows [33], the problem can be alleviated by including genotypes of randomly selected cows [33] or by appropriately adjusting for cow evaluations [34]. A large percent of the cows may be genotyped cost-effectively on low density SNP panels followed by imputing to the 50 k genotypes. Including imputed genotypes from cows in the training population and its impact on the genomic predictions require further evaluation.

Conclusions
Performance of both the Bayesian and GBLUP methods was influenced by imputation errors. For the traits considered in this study, the Bayesian method performed similar or better than the GBLUP method for genomic prediction under all scenarios and with different densities of the SNP panel or proportions of the training set being imputed. However, for traits affected by a few large QTL, the Bayesian method resulted in greater reductions of accuracy than GBLUP when a very low density of SNP panel was used. Including SNPs with largest effects in the low density panel substantially improved the accuracy of genomic prediction for the Bayesian method. When different low density panels were compared, with all animals in the training set genotyped on 50 k SNP panel, and animals in the validation set genotyped on the 6 k SNP panel, the accuracy of genomic prediction was the greatest for both GBLUP and the Bayesian methods. Imputation from SNP panels with a density lower than 6 k was more prone to errors and resulted in lower accuracy of genomic prediction. But for animals that have close relationship to the reference set, genotype imputation may still achieve a relatively high accuracy. Including genotypes imputed from the 6 k panel achieved almost the same accuracy of genomic prediction as that of the 50 k panel, even Table 5. Accuracy of genomic prediction using imputed 50 k SNP genotypes: Accuracies from using Bayesian model are parenthesized and accuracies for GBLUP are presented outside the parenthesis.  Table 6. Imputation accuracy under scenario S1 for two SNPs with largest effects on fat percentage. when 66% of the training animals were genotyped on the low density panels. These results justified the application of the 6 k panel for genomic prediction.