Improvement of Prediction Ability for Genomic Selection of Dairy Cattle by Including Dominance Effects

Dominance may be an important source of non-additive genetic variance for many traits of dairy cattle. However, nearly all prediction models for dairy cattle have included only additive effects because of the limited number of cows with both genotypes and phenotypes. The role of dominance in the Holstein and Jersey breeds was investigated for eight traits: milk, fat, and protein yields; productive life; daughter pregnancy rate; somatic cell score; fat percent and protein percent. Additive and dominance variance components were estimated and then used to estimate additive and dominance effects of single nucleotide polymorphisms (SNPs). The predictive abilities of three models with both additive and dominance effects and a model with additive effects only were assessed using ten-fold cross-validation. One procedure estimated dominance values, and another estimated dominance deviations; calculation of the dominance relationship matrix was different for the two methods. The third approach enlarged the dataset by including cows with genotype probabilities derived using genotyped ancestors. For yield traits, dominance variance accounted for 5 and 7% of total variance for Holsteins and Jerseys, respectively; using dominance deviations resulted in smaller dominance and larger additive variance estimates. For non-yield traits, dominance variances were very small for both breeds. For yield traits, including additive and dominance effects fit the data better than including only additive effects; average correlations between estimated genetic effects and phenotypes showed that prediction accuracy increased when both effects rather than just additive effects were included. No corresponding gains in prediction ability were found for non-yield traits. Including cows with derived genotype probabilities from genotyped ancestors did not improve prediction accuracy. 
The largest additive effects were located on chromosome 14 near DGAT1 for yield traits for both breeds; those SNPs also showed the largest dominance effects for fat yield (both breeds) as well as for Holstein milk yield.


Introduction
Simulations and validation studies using real data have indicated that genomic selection can provide remarkably high accuracy of predicted breeding values (BV) of individuals without their own records or without progeny records [1], [2], which offers the opportunity to select individuals as parents of the next generation accurately at an early stage of life. This technique has become a standard tool in dairy cattle breeding [3] and is rapidly expanding to other agriculturally important species (e.g., poultry [4], pig [5], and plant breeding [6]).
Few studies have attempted to generalize and apply genomic selection models that include non-additive genetic effects with large data sets [7]. Non-additive genetic variation results from interactions between alleles, and the interaction between alleles at the same locus is called dominance. Dominance is an important non-additive genetic effect, and the inclusion of dominance effects in models for the prediction of genomic BV could increase the accuracy of the predictions [8], [9]. However, genotypes and phenotypes for the same individuals must be known to detect allelic interaction. For some traits, the expression is naturally limited to females and estimated BV (EBV) or de-regressed EBV obtained from routine evaluations [10] are used as phenotypes in most applications of genomic selection. Such data allow only the estimation of allele substitution effects, and distinguishing between additive and dominance effects is not possible. The increasing availability of cows with phenotypes and genotypes in the United States now provides an opportunity to investigate models that include dominance effects. Sun et al. [11] estimated dominance variance using only cows that had genotypes and phenotypes for milk yield in the U.S. national database but did not test predictive ability for a model that included a dominance effect.
Although many cows with phenotypes do not have genotypes, their sires and dams or their sires and maternal grandsires (MGS) have genotypes. The expected genotype probabilities for those cows can be calculated using genotypes of the ancestors and the allele frequencies in the population. Boysen et al. [12] discovered significant dominance effects for yield traits in dairy cattle by regression of phenotypes on such derived genotype probabilities; however, they did not investigate whether model prediction improved when cows with derived genotype probabilities were included in the analysis.
Many statistical models and algorithms have been proposed to predict BV using genome-wide dense markers, which differ in the assumption of distributions of SNP effects [13]. Two models to compute genomic best linear unbiased predictions (BLUP) [1] assume normally distributed SNP effects. They have become popular approaches in practical genomic evaluation because they are simple and have low computational demands, as well as similar performance with variable selection models [3], [14]. One estimates marker effects using random regression on marker genotypes, and genomic BV are calculated as the sum of estimated marker effects (hereafter called SNP-BLUP). The other estimates genomic BV directly using a marker-based relationship matrix (hereafter called GBLUP). These two BLUP models can be easily extended to include dominance effects [15]. However, different sets of dominance coefficients can be derived that can result in different predictions [16].
This study had four goals. First, additive and dominance variance components were estimated using Holstein and Jersey data for eight traits. Second, predictive ability of models that included additive and dominance effects was compared with that of a model that included only additive effects. Third, predictions obtained using different dominance coefficients were compared. Fourth, model prediction was tested by expanding the data set to include cows with genotype probabilities derived based on ancestor genotypes.

Data
Genotypes were available from the Council on Dairy Cattle Breeding (Reynoldsburg, OH, USA) for Holsteins and Jerseys. Genotypes were from six different SNP arrays: the Bovine3K, BovineLD, BovineSNP50, and BovineHD (Illumina Inc., San Diego, CA), and the GeneSeek Genomic Profiler and GeneSeek Genomic Profiler HD (Neogen Agrigenomics, Lincoln, NE, USA). All genotypes were imputed to a BovineSNP50 basis using findhap.f90 software [17] before estimating genomic BV and dominance effects.
Phenotypic data were yield deviations for milk, fat, and protein; productive life (PL); daughter pregnancy rate (DPR); somatic cell score (SCS); fat percent (fat%); and protein percent (protein%) for first parity. Yield deviations for fat% and protein% were obtained indirectly as: yield deviation of fat% = [(fat mean for base cows + fat yield deviation)/(milk mean for base cows + milk yield deviation) − fat mean for base cows/milk mean for base cows] × 100, with a corresponding formula for protein%. The trait means for base cows were 11,839, 432, and 396 kg for Holstein milk, fat, and protein, respectively; the corresponding values were 8,379, 384, and 298 kg for Jerseys. DPR is defined as the percentage of nonpregnant cows that become pregnant during each 21-day period; a DPR of 1 implies that cows are 1% more likely to become pregnant during that estrus cycle than cows with an evaluation of 0. PL is defined as time in the milking herd before removal by voluntary culling, involuntary culling, or death. Credits for each month in milk are obtained from standard lactation curves and then summed across all lactations. Diminishing credits within lactation give cows more credit for beginning a new lactation than for continuing to milk in the previous lactation: cows get 8 months of credit for 305-day first-lactation records, 10 months for second lactations, 10.2 months for third and later lactations, partial credits for shorter records, and extra credits for longer records.
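As a worked illustration of the indirect formula for fat% and protein%, the following minimal Python sketch computes a percentage yield deviation from the Holstein base-cow means given above. The function name and the example cow's yield deviations (20 kg fat, 500 kg milk) are hypothetical and are used only to show the arithmetic.

```python
def percent_yield_deviation(component_mean, component_yd, milk_mean, milk_yd):
    """Deviation of a component percentage from the base-cow percentage.

    component_mean, milk_mean: trait means for base cows (kg)
    component_yd, milk_yd: yield deviations of the cow (kg)
    """
    with_deviation = (component_mean + component_yd) / (milk_mean + milk_yd)
    base = component_mean / milk_mean
    return (with_deviation - base) * 100.0


# Holstein base-cow means reported in the text (kg)
HOLSTEIN_MILK_MEAN = 11839.0
HOLSTEIN_FAT_MEAN = 432.0

# Hypothetical cow: +20 kg fat and +500 kg milk relative to her management group
fat_pct_yd = percent_yield_deviation(HOLSTEIN_FAT_MEAN, 20.0, HOLSTEIN_MILK_MEAN, 500.0)
print(f"fat% yield deviation: {fat_pct_yd:.4f}")
```

A cow with zero yield deviations for both fat and milk has a fat% yield deviation of exactly zero under this definition.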
The data set was divided into three groups. The first included cows with known genotypes and phenotypes (DATA C). The second included cows with phenotypes whose genotype probabilities were calculated from genotyped sire and dam (DATA S-D). The third included cows with phenotypes whose genotype probabilities were calculated from genotyped sire and MGS (DATA S-MGS [12]).
Tables 1 and 2 list phenotypic information for each of the data groups and six traits. Fixed effects (age and parity group, herd management group, inbreeding, and heterosis) were first estimated using a multi-trait and multi-breed linear mixed model from the full national data set of phenotype and pedigree information, and then records from first parity were adjusted for fixed effects (age and parity group and herd management group) for the subset of cows that had both phenotypic and genotypic information (Table 1) and for non-genotyped cows whose genotype probabilities were derived using genotyped sires and dams or genotyped sires and MGS (Table 2). Based on Tables 1 and 2, means and standard deviations were different for DATA C, DATA S-D, and DATA S-MGS for yield traits. Given a specific marker locus with two alleles (A and B), the probabilities of possible genotypes (AA coded as 0, AB coded as 1, and BB coded as 2) for cows were computed from the probabilities that the sire, dam, or MGS transmits allele A: P(A sire), P(A dam), or P(A MGS). The same approach was used to calculate P(B sire), P(B dam), or P(B MGS).
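The derivation of genotype probabilities from ancestor genotypes can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact formulas used in the study: gametes from the two parents are assumed independent, a genotyped animal transmits allele B with probability equal to half its genotype code, and when only the MGS is genotyped the dam's second allele is assumed to be drawn at the population frequency of B. All function names are hypothetical.

```python
def transmit_prob_b(genotype_code):
    """P(genotyped animal transmits allele B); codes are 0 = AA, 1 = AB, 2 = BB."""
    return genotype_code / 2.0


def dam_prob_b_from_mgs(mgs_code, freq_b):
    """P(dam transmits B) when only her sire (the MGS) is genotyped.

    Assumption: half the time the dam passes on the allele she received from
    the MGS; otherwise she passes on the allele from her ungenotyped dam,
    which is B with the population frequency freq_b.
    """
    return 0.5 * transmit_prob_b(mgs_code) + 0.5 * freq_b


def genotype_probs(p_b_sire, p_b_dam):
    """Return (P(AA), P(AB), P(BB)) for a cow, assuming independent gametes."""
    p_a_sire, p_a_dam = 1.0 - p_b_sire, 1.0 - p_b_dam
    p_aa = p_a_sire * p_a_dam
    p_ab = p_a_sire * p_b_dam + p_b_sire * p_a_dam
    p_bb = p_b_sire * p_b_dam
    return p_aa, p_ab, p_bb


# Sire AB and dam BB, both genotyped (sire-dam case)
probs_sd = genotype_probs(transmit_prob_b(1), transmit_prob_b(2))

# Sire BB, MGS AA, population frequency of B = 0.4 (sire-MGS case)
probs_smgs = genotype_probs(transmit_prob_b(2), dam_prob_b_from_mgs(0, 0.4))

# Expected genotype code, usable in place of an observed 0/1/2 code
expected_code = probs_smgs[1] + 2.0 * probs_smgs[2]
```

The three probabilities always sum to one, and the heterozygote probability P(AB) is the quantity a dominance model needs in place of an observed heterozygosity indicator.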
The DATA C data set was used to estimate variance components and SNP effects (additive and dominance) and to perform ten-fold cross-validation for prediction. Variance estimation and validation were also conducted using the combined data sets (DATA C + DATA S-D + DATA S-MGS). The same testing data sets were used when cross-validation was performed on DATA C only or on the combined data sets.

Variance Components
Variance components for each trait were estimated using the GBLUP method by including additive or additive and dominance genetic effects; the single-trait linear mixed models used were:
MA: $y = 1\mu + Wa + e$
MAD: $y = 1\mu + Wa_1 + Wd_1 + e_1$
MAD2: $y = 1\mu + Wa_2 + Wd_2 + e_2$
MAD3: $y = 1\mu + 1_{S-D}\mu_{S-D} + 1_{S-MGS}\mu_{S-MGS} + W_0a_3 + W_0d_3 + e_3$
where $y$ is a vector of management group deviations for each trait; $\mu$, $\mu_{S-D}$, and $\mu_{S-MGS}$ are the intercepts; $a$, $a_1$, $a_2$, and $a_3$ are vectors of additive effects for animals; $d_1$, $d_2$, and $d_3$ are vectors of dominance effects; $e$, $e_1$, $e_2$, and $e_3$ are the vectors of random residuals for animals; $1$ is a vector with elements of 1, and $1_{S-D}$ and $1_{S-MGS}$ are vectors with elements of 1 for DATA S-D and DATA S-MGS, respectively, and 0 for other records. Each animal had a single record; therefore, $W$ and $W_0$ were identity matrices. Then, $a \sim N(0, G\sigma^2_a)$, $a_1 \sim N(0, G\sigma^2_{a_1})$, $a_2 \sim N(0, G\sigma^2_{a_2})$, $a_3 \sim N(0, G\sigma^2_{a_3})$, $d_1 \sim N(0, D_1\sigma^2_{d_1})$, $d_2 \sim N(0, D_2\sigma^2_{d_2})$, $d_3 \sim N(0, D_1\sigma^2_{d_3})$, $e \sim N(0, I\sigma^2_e)$, $e_1 \sim N(0, I\sigma^2_{e_1})$, $e_2 \sim N(0, I\sigma^2_{e_2})$, and $e_3 \sim N(0, R\sigma^2_{e_3})$, where $G$ and $D_1$ (or $D_2$) are additive and dominance genomic relationship matrices, respectively; $\sigma^2_a$, $\sigma^2_{a_1}$, $\sigma^2_{a_2}$, and $\sigma^2_{a_3}$ are additive variances; $\sigma^2_{d_1}$, $\sigma^2_{d_2}$, and $\sigma^2_{d_3}$ are dominance variances; $\sigma^2_e$, $\sigma^2_{e_1}$, $\sigma^2_{e_2}$, and $\sigma^2_{e_3}$ are residual variances; and $R$ is the coefficient matrix for error variance, whose elements are functions of $\sigma^2_{e_C}$, the residual variance for genotyped cows; $\sigma^2_{e_{S-D}}$, the residual variance for cows with genotype probabilities derived from genotyped sire and dam; and $N_{S-MGS}$, the number of daughters for each sire-MGS pair.
The $G$, $D_1$, and $D_2$ matrices were constructed based on information from genome-wide markers [1], [9], [15], [16]:
$$G = \frac{ZZ'}{\sum_{i=1}^{k} 2p_iq_i}, \qquad D_1 = \frac{HH'}{\sum_{i=1}^{k} 2p_iq_i(1 - 2p_iq_i)}, \qquad D_2 = \frac{MM'}{\sum_{i=1}^{k} (2p_iq_i)^2}$$
where $k$ is the total number of SNPs; $Z$ is a centered genotype matrix in which each element is a genotype code (0, 1, or 2) minus $2p_i$; $p_i$ is the frequency of the second of two alleles at locus $i$; $q_i$ is the frequency of the first allele at locus $i$; the elements of $H$ equal $0 - 2p_iq_i$ for homozygous genotypes and $1 - 2p_iq_i$ for heterozygous genotypes; and the elements of $M$ equal $-2p_i^2$, $2p_iq_i$, and $-2q_i^2$ for genotype codes 0, 1, and 2, respectively. The differences between MAD and MAD2 were explained and investigated in detail in a previous study [16].
Table 2. Phenotypic statistics for Holstein and Jersey milk, fat, and protein yields based on cows with genotype probabilities derived using genotyped sire and dam (S-D) or genotyped sire and maternal grandsire (S-MGS).
Table 3. Holstein and Jersey estimated variance components and heritabilities for milk, fat, and protein yields, productive life (PL), daughter pregnancy rate (DPR), somatic cell score (SCS), fat percent (fat%) and protein percent (protein%) using four different models.
Table 5. Holstein and Jersey average correlations between estimated genetic effects and phenotypes for milk, fat, and protein yields, productive life (PL), daughter pregnancy rate (DPR), and somatic cell score (SCS) from training data for ten-fold cross-validation for four models.
Table 6. Holstein and Jersey average correlations between estimated genetic effects and phenotypes for milk, fat, and protein yields, productive life (PL), daughter pregnancy rate (DPR), and somatic cell score (SCS) from testing data for ten-fold cross-validation for four models as well as P-values from paired t-tests based on differences between model correlations.
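The construction of the additive and the two dominance relationship matrices from the Z, H, and M codings defined above can be sketched in a few lines of NumPy. This is an illustration rather than the software used in the study: the scaling denominators are the ones commonly paired with these codings (sum of 2pq for G, sum of 2pq(1 − 2pq) for D1, and sum of (2pq)² for D2), allele frequencies are estimated from the sample itself, and the function name and simulated genotypes are hypothetical.

```python
import numpy as np


def relationship_matrices(genotypes):
    """Build G, D1, and D2 from an (animals x SNPs) array of codes 0, 1, 2."""
    x = np.asarray(genotypes, dtype=float)
    p = x.mean(axis=0) / 2.0   # frequency of the second allele (assumed: sample estimate)
    q = 1.0 - p
    het = 2.0 * p * q          # expected heterozygosity 2pq per SNP

    z = x - 2.0 * p                    # centered additive codes
    h = (x == 1).astype(float) - het   # 0 - 2pq (homozygous), 1 - 2pq (heterozygous)
    m = np.where(x == 0, -2.0 * p**2,
                 np.where(x == 1, het, -2.0 * q**2))

    g = z @ z.T / het.sum()
    d1 = h @ h.T / (het * (1.0 - het)).sum()
    d2 = m @ m.T / (het**2).sum()
    return g, d1, d2


# Hypothetical data: 8 animals, 200 SNPs with random genotype codes
rng = np.random.default_rng(1)
example_genotypes = rng.integers(0, 3, size=(8, 200))
G, D1, D2 = relationship_matrices(example_genotypes)
```

For cows with genotype probabilities rather than observed genotypes, the same idea would apply with the 0/1/2 codes and the heterozygosity indicator replaced by their expected values.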
Variance components were estimated using average-information restricted maximum likelihood (AI-REML) [18] as implemented in MMAP (mixed models analysis for pedigrees and populations) software [19], [20]. The MMAP software incorporates the Intel Math Kernel Library [21] for optimized parallel matrix algebra and likelihood calculation.

SNP Effects
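The additive and dominance effects for each SNP were estimated using the SNP-BLUP method with the variance components described previously. Using the MAD model as an example, the mixed model equations for estimating SNP effects were
$$\begin{bmatrix} 1'1 & 1'Z & 1'H \\ Z'1 & Z'Z + I\lambda_a & Z'H \\ H'1 & H'Z & H'H + I\lambda_d \end{bmatrix} \begin{bmatrix} \hat{\mu} \\ \hat{\alpha} \\ \hat{\delta} \end{bmatrix} = \begin{bmatrix} 1'y \\ Z'y \\ H'y \end{bmatrix} \qquad (1)$$
where $\alpha$ and $\delta$ are vectors of additive and dominance SNP effects, $\lambda_a = \sigma^2_{e_1}\sum_{i=1}^{k} 2p_iq_i / \sigma^2_{a_1}$, and $\lambda_d = \sigma^2_{e_1}\sum_{i=1}^{k} 2p_iq_i(1 - 2p_iq_i) / \sigma^2_{d_1}$. Solutions were obtained by setting up (1) and inverting the left-hand side. The MA SNP, MAD SNP, and MAD2 SNP models used data only from DATA C, and equations were solved by the inversion method. However, the MAD3 SNP model used data from all three data sets (DATA C, DATA S-D, and DATA S-MGS). Because some cow genotypes were probabilities and required >1 character for storage, calculations for Z'Z, Z'H, and H'H in (1) required much more time, memory, and disk space. An iteration-based program was developed to solve MAD3 SNP for big data. A blend of first- and second-order Jacobi iteration was implemented with two relaxation factors [1]. Manhattan plots of the additive and dominance effects were created using ggplot2 [22], version 0.9.2, and R-2.15.1 [23].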
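The direct (inversion-type) solution of SNP-BLUP equations with additive and dominance effects can be sketched as below. This is a minimal illustration under assumptions: the equations are taken to have the usual ridge form, with a fixed intercept and one shrinkage parameter each for additive and dominance SNP effects (here called `lambda_a` and `lambda_d`, the ratios of residual variance to per-SNP effect variance); the function name and simulated data are hypothetical. The sketch uses a linear solve rather than an explicit inverse, which gives the same solutions.

```python
import numpy as np


def snp_blup_mad(y, z, h, lambda_a, lambda_d):
    """Solve for the intercept and additive and dominance SNP effects.

    y: phenotypes (n,); z: additive codes (n, k); h: dominance codes (n, k)
    """
    n, k = z.shape
    design = np.hstack([np.ones((n, 1)), z, h])   # columns: [1, Z, H]
    lhs = design.T @ design
    # Shrink only the SNP effects; the intercept is treated as fixed
    shrink = np.concatenate([[0.0], np.full(k, lambda_a), np.full(k, lambda_d)])
    lhs[np.diag_indices_from(lhs)] += shrink
    rhs = design.T @ y
    solution = np.linalg.solve(lhs, rhs)
    return solution[0], solution[1:1 + k], solution[1 + k:]


# Hypothetical data: 50 animals, 10 SNPs
rng = np.random.default_rng(7)
codes = rng.integers(0, 3, size=(50, 10)).astype(float)
p = codes.mean(axis=0) / 2.0
z_demo = codes - 2.0 * p
h_demo = (codes == 1).astype(float) - 2.0 * p * (1.0 - p)
y_demo = 5.0 + z_demo @ rng.normal(size=10) + rng.normal(size=50)

mu_hat, alpha_hat, delta_hat = snp_blup_mad(y_demo, z_demo, h_demo, 10.0, 50.0)
total_genetic = z_demo @ alpha_hat + h_demo @ delta_hat   # additive plus dominance
```

The dense left-hand side has 2k + 1 rows, which is why a direct solve becomes expensive as the number of SNPs and records grows and an iterative solver becomes attractive.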

Model Validation
Goodness-of-fit for each model was evaluated using likelihoods based on the whole data set as well as correlations between predicted BV and phenotypes in the training data. The superiority of the MAD and MAD2 models over MA was tested using a likelihood ratio test. Cross-validation was used to measure prediction accuracy, with the data set randomly divided into ten approximately equal portions. Nine of the portions were used in turn for training the models to estimate SNP effects, and the remaining portion was used for testing prediction accuracy. The predictive ability of the model was evaluated by comparing predictions and phenotypes of animals in the testing data set and was measured as the correlation between predicted genetic values and phenotypes. Predictions of additive genetic effect (BV) and total genetic value (defined as the sum of additive and dominance effects in the model) were both evaluated. Paired two-sample t-tests were used to test correlations for differences.
Table 3 shows estimates of variance components and heritabilities using the MA, MAD, and MAD2 models for each of the eight traits; MAD3 was only applied to yield traits. For both Holstein and Jersey yield traits, MAD had lower additive heritabilities and higher dominance heritabilities than MAD2, but the sum of additive and dominance variances was similar for both models. The MAD2 additive heritabilities were much closer than MAD additive heritabilities to MA heritabilities. Based on MAD and MAD2, dominance variance accounted for 5% and slightly less than 4%, respectively, of phenotypic variance for Holstein yield traits and 7% and 5.5% for Jersey yield traits. Additive heritability estimates from MAD3 were lower than from MAD and MAD2; MAD3 dominance variances were similar to those from MAD2 for Jerseys but smaller for Holsteins. Dominance variances from MAD and MAD2 were very small for DPR and SCS regardless of breed, especially for DPR. Dominance variance for PL was larger for Jerseys than for Holsteins. Fat% and protein% had high additive but low dominance heritabilities.
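The ten-fold cross-validation scheme described under Model Validation can be sketched as follows. This is a generic illustration, not the study's code: the ridge-regression predictor is a hypothetical stand-in for the SNP-BLUP models, and the simulated markers and phenotypes exist only to make the example runnable.

```python
import numpy as np


def ten_fold_correlations(x, y, fit_predict, n_folds=10, seed=0):
    """Correlation between predictions and phenotypes in each testing fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correlations = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        predicted = fit_predict(x[train_idx], y[train_idx], x[test_idx])
        correlations.append(np.corrcoef(predicted, y[test_idx])[0, 1])
    return np.array(correlations)


def ridge_fit_predict(x_train, y_train, x_test, shrinkage=10.0):
    """Hypothetical stand-in predictor: ridge regression on centered marker codes."""
    col_means = x_train.mean(axis=0)
    mean_y = y_train.mean()
    xc = x_train - col_means
    lhs = xc.T @ xc + shrinkage * np.eye(xc.shape[1])
    effects = np.linalg.solve(lhs, xc.T @ (y_train - mean_y))
    return mean_y + (x_test - col_means) @ effects


# Hypothetical data: 200 animals, 20 markers
rng = np.random.default_rng(42)
markers = rng.integers(0, 3, size=(200, 20)).astype(float)
phenotypes = markers @ rng.normal(size=20) + rng.normal(size=200)

fold_correlations = ten_fold_correlations(markers, phenotypes, ridge_fit_predict)
print(fold_correlations.mean(), fold_correlations.std(ddof=1))
```

Fixing the random seed keeps the fold assignment identical across models, which is what allows the per-fold correlations of two models to be compared in a paired test.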

Model Goodness-of-Fit
Measures of goodness-of-fit based on likelihood ratio tests are in Table 4. For Holstein and Jersey yield traits, the likelihood ratio test showed that MAD and MAD2 fit the data significantly (P < 0.0001) better than did MA. For PL, DPR, and SCS, the −2 log likelihoods were similar for MA, MAD, and MAD2. The models including dominance also fit the data better than MA for protein% (both breeds) and for Holstein fat%. The number of animals in MAD3 was different from that for MA, MAD, and MAD2; therefore, the likelihood for MAD3 was not comparable with that for other models.
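A likelihood ratio test of an additive-plus-dominance model against the additive-only model can be sketched from the two −2 log likelihoods. The reference distribution is an assumption here, because the text does not state it: the sketch uses a chi-square distribution with 1 degree of freedom for the single added variance component. Because a variance is tested on the boundary of its parameter space, a 50:50 mixture of chi-square distributions with 0 and 1 degrees of freedom is also commonly used, which would halve the P-value. The −2 log likelihood values in the example are hypothetical.

```python
import math


def likelihood_ratio_test(neg2_loglik_reduced, neg2_loglik_full):
    """Return (test statistic, P-value) assuming a chi-square(1) reference."""
    statistic = max(neg2_loglik_reduced - neg2_loglik_full, 0.0)
    # Upper tail of chi-square with 1 df: P(X > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical -2 log likelihoods for an additive-only and a dominance model
statistic, p_value = likelihood_ratio_test(10250.0, 10230.0)
print(f"LRT statistic = {statistic:.1f}, P = {p_value:.2e}")
```

The two models must be fitted to the same records for the comparison to be valid, which is why the likelihood from a model fitted to a different number of animals cannot be compared in this way.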
Average correlations between estimated genetic effects and phenotypes in training data for ten-fold cross-validation (Table 5) also indicated model goodness-of-fit. Correlations between total genetic effects (additive for MA and additive plus dominance for MAD, MAD2, and MAD3) and phenotypes were higher for MAD and MAD2 than for MA for all Holstein and Jersey traits. For MAD3, correlations between total genetic effects and phenotypes were higher than for MA but lower than for MAD and MAD2; correlations between additive effects and phenotypes were lowest. The standard deviations of correlations were from 0.001 to 0.003 for Holstein, and from 0.001 to 0.005 for Jersey, across different traits; PL and milk had the largest and smallest standard deviation, respectively. This was true using MAD, MAD2 or MAD3. Because the yield deviations of fat% and protein% were derived from yield traits and their dominance variances were small, the ten-fold cross-validation was not carried out on fat% and protein%.

Prediction Accuracy
Predictive ability for Holstein and Jersey yield traits was better for MAD and MAD2 than for MA based on correlations from testing data used in the ten-fold cross-validation (Table 6). For MAD and MAD2, correlations were higher between phenotype and total genetic effects than between phenotype and additive-only effects for yield traits, and both MAD and MAD2 correlations were higher than those between phenotype and additive effect from MA. The differences between correlations from MAD or MAD2 and that from MA were statistically significant for Holstein yield traits and SCS (P < 0.005) and Jersey yield traits (P < 0.001). However, for Jersey PL, DPR, and SCS as well as Holstein PL and DPR, correlations from MA, MAD, and MAD2 from testing data were almost the same and did not differ statistically (P > 0.2). Jersey correlations from testing data were lower than Holstein correlations except for PL. By enlarging the data set, MAD3 did not provide better prediction for either Holsteins or Jerseys. The standard deviation of correlations from ten-fold cross-validation ranged from 0.017 to 0.024 on different traits for Holstein, and from 0.018 to 0.043 for Jersey; yield traits had lower standard deviation than other traits.
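The paired comparison of fold-wise correlations between two models can be illustrated with a short sketch that computes the paired t statistic by hand. The correlation values below are hypothetical and are not taken from Table 6; with ten folds the test has 9 degrees of freedom, for which the two-sided 5% critical value is 2.262.

```python
import math


def paired_t_statistic(first, second):
    """Paired t statistic and degrees of freedom for two matched samples."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = sum(differences) / n
    variance = sum((d - mean_diff) ** 2 for d in differences) / (n - 1)
    return mean_diff / math.sqrt(variance / n), n - 1


# Hypothetical testing-fold correlations for two models on the same ten folds
cor_with_dominance = [0.312, 0.305, 0.298, 0.320, 0.301, 0.310, 0.295, 0.315, 0.308, 0.303]
cor_additive_only = [0.306, 0.300, 0.295, 0.313, 0.297, 0.303, 0.291, 0.309, 0.301, 0.299]

t_value, df = paired_t_statistic(cor_with_dominance, cor_additive_only)
significant_at_5pct = abs(t_value) > 2.262   # two-sided critical value for 9 df
```

Pairing by fold removes the fold-to-fold variation in correlations, so a small but consistent gain can be detected even when the standard deviation across folds is larger than the average gain.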

Largest SNP Effects
Based on additive and dominance SNP effects from MAD, Manhattan plots for eight traits were constructed, and the ten SNP with largest effect were characterized. Figures 1-3 show that the largest additive SNP effects are located on chromosome 14 near DGAT1 [24] for all three yield traits for both breeds. For Holstein milk and fat yields as well as Jersey fat yield, the SNP with largest additive effect also had the largest dominance effect. The SNP effects for PL, DPR, SCS, fat% and protein% are not shown because the dominance effects were extremely small and the plots were not informative.
For yield traits, Table 7 lists the top 10 SNPs selected by dominance effects which were estimated using MAD; SNP locations are based on the UMD 3.1 assembly of the Bos taurus genome [25]. For both Holsteins and Jerseys, several SNPs on chromosome 14 had both large additive and dominance effects for fat yield. For Holsteins, three SNPs on chromosome 14 had large dominance and additive effects for both milk and fat yields. One SNP on chromosome 26 also had a large dominance effect for milk and fat yields, and chromosomes 13 and 21 each had one SNP with a large dominance effect for both milk and protein yields. No SNP had both large additive and dominance effects for Jersey milk or protein yield. For Jerseys, two SNPs on chromosome 12 and one SNP on chromosome 22 had a large dominance effect for all three yield traits; another SNP on chromosome 12 had a large dominance effect for both milk and protein yields. Table 8 shows the top 10 SNPs selected by additive effect (from MAD) with SNPs on chromosome 14 excluded. No SNP had both large additive and dominance effects for either breed for any yield trait. Chromosome 5 had several SNPs with a large additive effect for fat yield for both Jerseys and Holsteins. For milk yield, the SNPs with the largest additive effects were on chromosomes 5, 6, 16, 18, and 25 for Holsteins and on chromosomes 2, 5, 7, and 19 for Jerseys. For protein yield, the SNPs with the largest additive effects were on chromosomes 5, 6, 7, 12, 18, 25, 26, and X for Holsteins and on chromosomes 2, 5, 7, 18, and 25 for Jerseys. One SNP on chromosome 18 for Holsteins had a large additive effect for all three yield traits as did one SNP on chromosome 5 and another on chromosome 7 for Jerseys. Chromosome 16 for Holsteins had one SNP with a large additive effect for both milk and fat yields. Two SNPs on chromosome 6 and another on chromosome 25 for Holsteins had large additive effects for both milk and protein yields as did two SNPs on chromosome 2 and one SNP on chromosome 5 for Jerseys. The X chromosome for Holsteins had one SNP with a large additive effect for both fat and protein yields.

Discussion
The magnitude of dominance variance relative to phenotypic variance for different traits varied widely for genotyped Holstein and Jersey cows in the United States. Dominance variances were larger for MAD than for MAD2. Dominance heritability from MAD for milk yield was 5% for Holsteins and 7% for Jerseys, which was slightly higher than the results reported by Sun et al. [11]. The differences in results were caused by different models for estimating yield deviation and different methods for imputing missing genotypes, but the impact on Holstein results was smaller than for Jerseys because of the large Holstein data set. Few other studies have estimated dominance variance using Holstein genomic data. We verified that our software gives the same estimates of variance components and SNP effects as GVCBLUP [15] by comparing results when both were applied to the Jersey milk data and MAD2 model (see Text S1), but GVCBLUP cannot handle all the models we considered.
Additive and non-additive variances usually have been estimated using models with pedigree-based relationship matrices. Van Tassell et al. [26] estimated additive and dominance variance using Method R and reported results consistent with the findings of the current study for yield and SCS traits (5% and 1%, respectively). Another study [27] reported a ratio of dominance to additive genetic variance of 17% for stature for U.S. Holsteins. However, Hoeschele et al. [28] reported ratios of 118% for days open and 161% for service period (days between first and last insemination) for U.S. Holsteins, and also showed that dominance variance changed significantly with slight differences in trait definition; e.g., for days open with an upper bound of 150 days, dominance heritability became very low. The change in estimates indicates some lack of precision, perhaps caused by solving for three genetic variances (A, D, and AA) in the same model, which also caused trouble in our study (results computed but not shown). Furthermore, different models (sire and maternal grandsire model vs. animal model) and relationship matrices (pedigree vs. genomic) as well as pre-selection (genotyped cows were offspring of genetically superior animals) all can lead to different results between our study and that of Hoeschele et al. [28]. In beef cattle, the ratio was > 50% for weaning weight for Herefords, Gelbvieh, and Charolais [29], [30], and for post-weaning gain in Limousin beef cattle [31].
These results indicate that the range of estimates for non-additive genetic variance in different studies is large and may reflect different features of various traits and populations or large sampling error due to insufficient data. Fixed regression on inbreeding and heterosis accounted for effects of dominance on phenotypic mean in this study, and variance estimates accounted for additional covariances among relatives. The pre-adjusted phenotypes used in this study included inbreeding and heterosis effects, and an additional analysis (results not shown) on variance component estimation for Jerseys indicated that removing inbreeding and heterosis effects from pre-adjusted phenotypes decreased dominance heritabilities slightly for yield traits (for example, 7.0% vs. 5.9% for milk), but had very small effects on other traits (for example, 1.2% vs. 1.1% for SCS). The inbreeding and heterosis effects in the model may account for changes in the mean rather than changes in the covariance among relatives. The likelihood ratio test showed that a model with a dominance effect had better goodness of fit for yield traits than did a model with only an additive effect. Therefore, non-additive genetic variance is important for complex traits, and a model with non-additive genetic effects is expected to increase prediction accuracy. In this study, MAD was approximately 2% better than MA for predicting phenotypes in testing data sets. Lee et al. [8] predicted unobserved phenotypes using whole-genome SNP data and reported that the accuracy of prediction increased considerably when dominance effects were included compared with a purely additive genetic model. Their increased accuracy was 17% for coat color and 2% for percentage of CD8+ cells in mice; however, added epistasis did not contribute to accuracy. Su et al. [9] estimated additive and non-additive genetic variances and predicted genetic merit using genome-wide dense SNP; they found that reliabilities of genomic BV for animals without performance records increased 0.7 percentage points for a model that included additive and dominance effects compared with an additive-only model; the corresponding increase for a model that included additive and epistatic effects was only 0.3 percentage points.
The difference between MAD and MAD2 was how the dominance relationship matrix was calculated. In this study, estimates for dominance variance were larger and additive variances smaller for MAD compared with MAD2. Vitezica et al. [16] reported this same result for simulated data and concluded that MAD underestimates additive genetic variance and overestimates dominance variance; however, they did not compare the predictive ability of MAD and MAD2. In this study, MAD and MAD2 had no apparent difference in predictive ability, and the correlations between total genetic effects (or additive effects only) and phenotypes in testing data (or training data) were almost the same for the two models.
The MAD3 model was expected to increase predictive ability even more than MAD and MAD2 because it included sire-dam and sire-MGS groups to increase the available data; however, it did not. Perhaps because of the more complex model needed to deal with combined data (DATA C, DATA S-D, and DATA S-MGS), MAD3 underestimated additive heritability. A better model might treat the three groups as correlated phenotypes to account for differences in genotype accuracy and phenotype distributions between them. The cows with imputed genotype probabilities were offspring of genetically superior (elite) animals, and preselection may have affected the results and caused bias. Another issue that may need to be addressed is whether including all of the genotyped females is optimal. Some elite cows were genomically tested after their phenotypes showed them to be superior and may represent only a small fraction of a herd (e.g., if a farmer tests only his five best animals). Such cows are highly selected, and predictions may become more accurate by limiting their data.
In addition to increased prediction accuracy, a model that includes additive and non-additive genetic effects could be beneficial for exploiting specific combining ability. Breeders should continue to select for additive merit but can also improve non-additive merit by considering interactions in mating programs [32]. Sun et al. [11] compared mating programs and found that expected progeny value for milk yield from linear programming using genomic relationship matrices increased 86 kg for Holsteins and 52 kg for Jerseys for the top 50 bulls for genomic BV for milk yield by including dominance effects. However, two practical limitations exist for implementing a model with both additive and non-additive genetic effects for genomic prediction [9]. First, the computational demand for models with both additive and nonadditive genetic effects is generally high because both additive and non-additive genomic relationship matrices are dense, thus requiring greater computing resources or more efficient algorithms. The iteration-based SNP-BLUP used in this study greatly decreased the amount of memory needed and converged well for each of the three data groups, but it converged poorly for the combined data. Second, a reference population often consists of bulls that have records of progeny performance, and pseudoobservations (conventional EBV, de-regressed EBV, or means of corrected progeny performance) are commonly used as response variables. However, a genomic prediction model that includes non-additive genetic effects requires that the response variable is an individual record. Therefore, pseudo-observations are appropriate for an additive genetic model but not for a model that includes non-additive genetic effects.
The DGAT1 gene is a major quantitative trait locus (QTL) on chromosome 14 that affects yield traits [24]. This study confirmed that the SNPs with the largest MAD additive effects were located on chromosome 14 for all three yield traits; those SNPs also had the largest dominance effects for fat yield for Holsteins and Jerseys as well as for Holstein milk yield. Boysen et al. [12] explored dominance effects using cow genotype probabilities based on bull genotypes and found significant (P ≤ 0.01) dominance effects for fat yield on chromosome 14 within the DGAT1 region. The current study and Boysen et al. [12] both found no significant (P ≤ 0.01) dominance effects for SCS. A QTL that affects yield traits has been identified on chromosome 6 using granddaughter designs in U.S. [33], Dutch [34], and German [35] Holstein populations. In the current study, SNPs on chromosome 6 had large additive effects for Holstein milk and protein yields. Cole et al. [36] studied the distribution and location of additive genetic effects for Holsteins using 5,285 bulls and confirmed the presence of two major genes for yield traits on chromosomes 6 and 14. Similar results also were reported by Cole et al. [37] using a population of genotyped U.S. Holstein cows. Wang et al. [38] performed a genome-wide association study for fat percentage in the German Holstein-Friesian population and uncovered a QTL region on chromosome 5. The current study also identified a region on chromosome 5 with both large additive and dominance effects for Holstein yield traits.

Conclusions
Dominance variance accounted for about 5 and 7% of total variance for yield traits for Holsteins and Jerseys, respectively, based on the MAD model. For PL, DPR, SCS, fat% and protein% dominance variances were very small, especially for Holsteins. The MAD model had smaller additive and larger dominance variance estimates compared with MAD2. The likelihood ratio test showed that a model with dominance effects included had better goodness of fit than an additive-only model for yield traits. Based on ten-fold cross-validation, the MAD and MAD2 models can increase prediction ability for Holstein and Jersey yield traits; improvements from the two models were similar. Prediction accuracy did not improve by including cows with derived genotypes. The largest additive effects were located on chromosome 14 for all three yield traits for both breeds, and those SNP also had the largest dominance effects for fat yield for Holsteins and Jerseys as well as Holstein milk yield. Dominance effects should be considered for inclusion in routine genomic evaluation models to improve prediction accuracy and exploit specific combining ability.

Supporting Information
Text S1.