Mixed Model Methods for Genomic Prediction and Variance Component Estimation of Additive and Dominance Effects Using SNP Markers

We established a genomic model of a quantitative trait with genomic additive and dominance relationships that parallels the traditional quantitative genetics model, which partitions a genotypic value into breeding value plus dominance deviation and calculates additive and dominance relationships using pedigree information. Based on this genomic model, two sets of computationally complementary but mathematically identical mixed model methods were developed for genomic best linear unbiased prediction (GBLUP) and genomic restricted maximum likelihood estimation (GREML) of additive and dominance effects using SNP markers. These two sets are referred to as the CE and QM sets, where the CE set was designed for large numbers of markers and the QM set was designed for large numbers of individuals. GBLUP and associated accuracy formulations for individuals in training and validation data sets were derived for breeding values, dominance deviations and genotypic values. A simulation study showed that GREML and GBLUP generally were able to capture small additive and dominance effects that each accounted for 0.00005–0.0003 of the phenotypic variance, and that GREML was able to differentiate true additive and dominance heritability levels. GBLUP of the total genetic value as the summation of additive and dominance effects had higher prediction accuracy than either additive or dominance GBLUP, causal variants had the highest accuracy of GREML and GBLUP, and predicted accuracies were in agreement with observed accuracies. Genomic additive and dominance relationship matrices using SNP markers were consistent with theoretical expectations. The GREML and GBLUP methods can be an effective tool for assessing the type and magnitude of genetic effects affecting a phenotype and for predicting the total genetic value at the whole genome level.


Introduction
Genomic prediction using genome-wide single nucleotide polymorphism (SNP) markers has been shown to be a powerful tool to capture small genetic effects dispersed over the genome for predicting an individual's genetic potential for a phenotype [1][2][3][4][5]. Current large-scale genomic prediction has focused on additive effects [2,4,5]. Two SNP models for genomic prediction of additive effects have been described: a traditional quantitative genetics model and a model with (−1)-0-1 SNP coding [2]. The traditional quantitative genetics model is attractive because it is equivalent to a conventional animal model with the relationship matrix calculated from the SNP genotypes [5] and it directly predicts genomic breeding values [2,4,5]. A method and computing tool are available for estimating genomic heritability using genome-wide SNP markers [6]. This method uses a standardization of the 0-1-2 additive coding, and the subtraction step of this standardization leads to additive effects that are breeding values under the traditional quantitative genetics model assuming Hardy-Weinberg equilibrium [2,6,7]. The mixed model implementation of this method is ideal for a large number of markers but is not ideal for a large number of individuals because the size of the matrix that needs to be inverted increases as the number of individuals increases.
From the point of view of missing heritability [8][9][10], the ability to estimate the genome-wide dominance contribution will help determine the total genetic contribution to a phenotype. Similarly, methods of genomic prediction that take dominance into account can predict an individual's total genetic potential for phenotypes affected by additive and dominance effects. A substantial dominance effect would justify the inclusion of dominance in genomic prediction and the design of mating systems to maximize the dominance effect. In dairy cattle, dominance variances estimated from pedigree data were reported to be 11-16% of the phenotypic variance of stature [11], and the increased availability of cows with phenotypes and genotypes provides an opportunity to estimate dominance effects and include those in mating programs [12]. However, only limited methodology studies on genomic prediction and variance component estimation of dominance were available [13][14][15][16].
Genomic best linear unbiased prediction (GBLUP) and various Bayesian methods are available for genomic prediction, and GBLUP generally had good performance in real data [17]. Restricted maximum likelihood estimation (REML) [18] has been a widely accepted method for estimating variance components.
The objectives of this study were to develop mixed model methods for the joint genomic prediction and variance component estimation of additive and dominance effects based on the traditional quantitative genetics model that partitions a genotypic value into breeding value and dominance deviation. The methodology has two complementary computing strategies for large numbers of individuals and markers, and the genomic prediction methods provide GBLUP and associated reliability for both training and validation data sets. Accuracies of the new methods are evaluated using simulation data based on a true dairy cattle SNP structure.

Genetic Model of SNP Markers and Mixed Model of Phenotypic Observations
The genetic model of SNP markers is an expansion of the additive model used in genomic evaluations [2,4,5], obtained by adding a dominance component to the additive model. Using the traditional quantitative genetics model that partitions a genetic value into breeding value and dominance deviation under the assumption of Hardy-Weinberg equilibrium [7], the genetic value of each SNP marker can be expressed as $g_{ij} = \mu + a_{ij} + d_{ij}$ (Equation 1), where $g_{ij}$ = genotypic value of SNP genotype $A_iA_j$ ($i,j = 1,2$), $\mu$ = common mean, $\alpha$ = average effect of gene substitution, $\delta$ = dominance effect, $a_{ij} = w_{\alpha ij}\alpha$ = breeding value, $d_{ij} = w_{\delta ij}\delta$ = dominance deviation, $p_1$ = frequency of the $A_1$ allele, and $p_2 = 1 - p_1$ = frequency of the $A_2$ allele. Note that the gene substitution effect ($\alpha$) is a contrast of breeding values or a contrast of allelic effects, and the dominance effect ($\delta$) is a contrast of dominance deviations or a contrast of genotypic values (Text S1: Part A). In matrix notation, the genetic model of Equation 1 can be expressed as Equation 2. The quantitative genetics model of Equation 2 has the interpretation of 'breeding value' for additive effects. Assuming equal allele frequencies and using a reparameterized $\mu$, Equation 2 can achieve the (−1)-0-1 coding or 0-1-2 coding for additive effects and the 0-1-0 coding for dominance effects, but additive effects in those equal frequency models do not have the interpretation of 'breeding value' when the actual allele frequencies are unequal (Text S1: Part A). For each SNP marker, the variance of $\alpha$ and the variance of $\delta$ are assumed to be $\mathrm{Var}(\alpha) = \sigma^2_\alpha$ and $\mathrm{Var}(\delta) = \sigma^2_\delta$, and the covariance between $\alpha$ and $\delta$ is assumed null. Let N = the number of phenotypic observations, q = the number of individuals, m = the number of SNP markers, and c = the number of fixed effects. Based on Equation 2, the mixed model with SNP breeding values and dominance deviations can be expressed as Equation 3.
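The single-SNP partition described above can be illustrated with a short sketch. The coding coefficients below are the standard textbook parameterization of breeding values and dominance deviations under Hardy-Weinberg equilibrium; they are an assumption here, because the coefficient definitions of Equation 1 are not reproduced in this text. The function names are hypothetical.

```python
def snp_coding(p1):
    """Additive and dominance coding for genotypes A1A1, A1A2, A2A2.

    Assumed textbook coding (not copied from Equation 1):
    breeding value coefficients 2*p2, p2 - p1, -2*p1 and
    dominance deviation coefficients -2*p2^2, 2*p1*p2, -2*p1^2.
    """
    p2 = 1.0 - p1
    w_alpha = {"A1A1": 2.0 * p2, "A1A2": p2 - p1, "A2A2": -2.0 * p1}
    w_delta = {"A1A1": -2.0 * p2 ** 2, "A1A2": 2.0 * p1 * p2, "A2A2": -2.0 * p1 ** 2}
    return w_alpha, w_delta


def hwe_frequencies(p1):
    """Genotype frequencies under Hardy-Weinberg equilibrium."""
    p2 = 1.0 - p1
    return {"A1A1": p1 ** 2, "A1A2": 2.0 * p1 * p2, "A2A2": p2 ** 2}


def genotypic_values(p1, mu, alpha, delta):
    """g_ij = mu + w_alpha_ij * alpha + w_delta_ij * delta for each genotype."""
    w_alpha, w_delta = snp_coding(p1)
    return {g: mu + w_alpha[g] * alpha + w_delta[g] * delta for g in w_alpha}


def weighted_mean(values, freqs):
    """Population mean of a genotype-indexed quantity."""
    return sum(freqs[g] * values[g] for g in values)
```

Under this coding, breeding values and dominance deviations each have a population mean of zero and are uncorrelated, which is consistent with the null covariance between additive and dominance effects assumed above.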

Genomic Additive and Dominance Relationship Matrices
As the number of SNP markers increases, the values of the diagonal elements of $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ increase. Two methods can be used to normalize the $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ matrices. The first method divides $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ by the expected variance of the diagonal elements of each matrix (Definition I, [2]). The second method divides $W_\alpha W_\alpha'$ by the average of the diagonal elements of $W_\alpha W_\alpha'$ (Definition II, [4]), and we apply this method to $W_\delta W_\delta'$ to define the dominance relationship matrix. In addition, we use a transformation that converts $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ into correlation matrices so that off-diagonal elements are mathematically comparable; we refer to this definition as Definition III and to the resulting correlation matrices as genomic additive and dominance correlation matrices. The additive correlations of Definition III are the genomic version of Wright's coefficient of relationship [19]. Each of these three definitions of additive and dominance relationship or correlation matrices can be represented by two transformation matrices, $T_\alpha$ and $T_\delta$ (Equations 4-5). The additive relationship or correlation matrix ($A_g$) and the dominance relationship or correlation matrix ($D_g$) can be expressed as $A_g = T_\alpha T_\alpha'$ and $D_g = T_\delta T_\delta'$ (Equations 6-7).
In Equations 6-7, subscript 'g' is used to distinguish $A_g$ and $D_g$ from the A and D matrices calculated from pedigree data [20]. In addition to representing a number of definitions of genomic relationships, $T_\alpha$ and $T_\delta$ are used to define equivalent models to achieve computing efficiency.
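A minimal sketch of Definitions II and III is given below, assuming a small coded marker matrix W (rows are individuals, columns are markers) held as a list of lists. Definition I is omitted because its scaling constant is not reproduced in this text. The function names and the toy matrix are hypothetical, and the correlation form used for Definition III (each element divided by the square roots of the two corresponding diagonal elements) is our reading of the description above.

```python
def gram(w):
    """Return W W' for a q x m matrix given as a list of rows."""
    return [[sum(x * y for x, y in zip(ri, rj)) for rj in w] for ri in w]


def definition_ii(w):
    """Definition II: divide W W' by the average of its diagonal elements."""
    g = gram(w)
    q = len(g)
    k = sum(g[i][i] for i in range(q)) / q
    return [[v / k for v in row] for row in g]


def definition_iii(w):
    """Definition III: rescale W W' into a correlation matrix."""
    g = gram(w)
    q = len(g)
    s = [g[i][i] ** 0.5 for i in range(q)]
    return [[g[i][j] / (s[i] * s[j]) for j in range(q)] for i in range(q)]


# Toy coded matrix for 3 individuals and 3 markers (illustrative values only).
w_toy = [[1.4, -0.6, 0.2],
         [-0.6, 0.4, 1.2],
         [0.4, -1.6, 0.2]]
a_ii = definition_ii(w_toy)
a_iii = definition_iii(w_toy)
```

The same two functions apply unchanged to a dominance-coded matrix. By construction, the diagonal elements average 1 under Definition II and all equal 1 under Definition III.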

Two Equivalent Mixed Models, Two Sets of Complementary Formulations
With the T matrices of Equations 4-5, two equivalent mixed models with complementary computing advantages, Model 1 and Model 2, can be defined. Model 1 can be written as $y = Xb + Za + Zd + e$, where $a = T_\alpha\alpha$ = genomic breeding values, $d = T_\delta\delta$ = genomic dominance deviations, $\mathrm{Var}(a) = A_g\sigma^2_\alpha$, and $\mathrm{Var}(d) = D_g\sigma^2_\delta$. Model 2 can be written as $y = Xb + Z_1\alpha + Z_2\delta + e$, with $Z_1 = ZT_\alpha$ and $Z_2 = ZT_\delta$. The CE set of Model 1 is the best for a large number of markers (m > q), and the MME set of Model 2, to be referred to as the QM set (QM meaning q > m), is the best for a large number of individuals (q > m). The MME set of Model 1 (to be referred to as MQ, with MQ meaning m > q) has no computing advantage because the matrix size is twice as large as that of CE and it requires the inverses of the relationship matrices. The CE set of Model 2 (CE2) also has no computational advantage because CE2 requires more memory than QM if m > q. These two sets (MQ and CE2) are not considered further. In the following, we focus on the CE and QM sets of solutions, where each set consists of GBLUP, reliability of GBLUP and GREML formulations. We first present these three types of formulations in each set, CE for m > q or QM for q > m, and then summarize the main features of the CE and QM sets.
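The equivalence of the two models can be checked from their second moments. The derivation below is a sketch under the assumptions stated in the text: $\mathrm{Var}(\alpha) = I_m\sigma^2_\alpha$, $\mathrm{Var}(\delta) = I_m\sigma^2_\delta$, null covariance between $\alpha$ and $\delta$, $A_g = T_\alpha T_\alpha'$ and $D_g = T_\delta T_\delta'$. The layout is illustrative and does not follow the original equation numbering.

```latex
\begin{aligned}
\mathrm{Var}(y)\ \text{(Model 2)}
  &= Z T_\alpha \mathrm{Var}(\alpha) T_\alpha' Z'
   + Z T_\delta \mathrm{Var}(\delta) T_\delta' Z'
   + \mathrm{Var}(e) \\
  &= Z T_\alpha T_\alpha' Z' \sigma^2_\alpha
   + Z T_\delta T_\delta' Z' \sigma^2_\delta
   + I_N \sigma^2_e \\
  &= Z A_g Z' \sigma^2_\alpha + Z D_g Z' \sigma^2_\delta + I_N \sigma^2_e \\
  &= Z \mathrm{Var}(a) Z' + Z \mathrm{Var}(d) Z' + \mathrm{Var}(e)
   = \mathrm{Var}(y)\ \text{(Model 1)}
\end{aligned}
```

Both models also share the same expectation, $E(y) = Xb$, so they imply the same first and second moments for the phenotypic observations.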

GBLUP-CE, Reliability and GREML-CE for m > q
The CE form of GBLUP from Model 1 is given by Equations 12-13, where $V$ is defined by Equation 9. We refer to the GBLUP of Equations 12-13 as GBLUP-CE. The GBLUP of genotypic values is calculated as $\hat{g} = \hat{a} + \hat{d}$. For individuals with phenotypic observations (individuals in training data set), the reliability measures of $\hat{a}$, $\hat{d}$ and $\hat{g}$ are $R^2_{ai}$, $R^2_{di}$ and $R^2_{gi}$, where $R^2_{ai}$ = the reliability of GBLUP of breeding values ($\hat{a}$) for individual $i$, $R^2_{di}$ = the reliability of GBLUP of dominance deviations ($\hat{d}$) for individual $i$, and $R^2_{gi}$ = the reliability of GBLUP of genotypic values ($\hat{g}$) for individual $i$. GREML-CE formulations via the EM type algorithm [20][21][22] are given by Equations 15-17.

GBLUP-QM, Reliability and GREML-QM for q > m
The mixed model equations (MME) for predicting SNP additive effects ($\alpha$) and dominance effects ($\delta$) based on Model 2 are given by Equation 18, where $I_m$ = $m \times m$ identity matrix, $\lambda_\alpha = \sigma^2_e/\sigma^2_\alpha$ and $\lambda_\delta = \sigma^2_e/\sigma^2_\delta$. To reduce the size of Equation 18, the equations for $b$ can be absorbed, and Equation 18 after the absorption reduces to Equation 19, where $M = I_N - X(X'X)^{-}X'$. The GBLUP of breeding values and dominance deviations for all individuals with phenotypic observations can be calculated as $\hat{a} = T_\alpha\hat{\alpha}$ and $\hat{d} = T_\delta\hat{\delta}$ (Equations 20-21), where $T_\alpha$ is defined by Equation 4, $T_\delta$ by Equation 5, and $\hat{\alpha}$ and $\hat{\delta}$ are solutions to Equation 19. We refer to the approach of Equations 19-21 as GBLUP-QM. The comparison between Equations 20-21 and Equations 12-13 shows that GBLUP-CE and GBLUP-QM are mathematically identical. Reliabilities of GBLUP-QM from Equations 19-21 are functions of $T_\alpha$ and $T_\delta$, defined by Equations 4-5, and of $C^{\alpha\alpha}$, $C^{\alpha\delta}$, $C^{\delta\alpha}$ and $C^{\delta\delta}$, which are submatrices that satisfy Equation 22. For individuals without phenotypic observations (individuals in validation data set), formulations of GBLUP-QM and associated reliability measures are given in Text S1: Part B. GREML-QM formulations via the EM type algorithm are given by Equations 23-25, where $r$ is the rank of the coefficient matrix of Equation 18, $\hat{e} = y - X\hat{b} - Z_1\hat{\alpha} - Z_2\hat{\delta}$, and $C^{\alpha\alpha}$ and $C^{\delta\delta}$ are defined by Equation 22.
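The mathematical identity of GBLUP-CE and GBLUP-QM can be checked numerically on a toy example. The sketch below is not the GVCBLUP implementation: it assumes one observation per individual (Z = I), no fixed effects (phenotypes already centered), known variance components, and the standard BLUP forms $\hat{a} = \sigma^2_\alpha A_g V^{-1} y$ for CE and $(T'T + \Lambda)\hat{u} = T'y$ for QM with $T = [T_\alpha\ T_\delta]$. All function names and input values are hypothetical.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gblup_ce(t_a, t_d, y, var_a, var_d, var_e):
    """CE form: invert the q x q matrix V built from A_g and D_g."""
    q = len(y)
    a_g = matmul(t_a, transpose(t_a))
    d_g = matmul(t_d, transpose(t_d))
    v = [[var_a * a_g[i][j] + var_d * d_g[i][j] + (var_e if i == j else 0.0)
          for j in range(q)] for i in range(q)]
    vinv_y = solve(v, y)
    a_hat = [var_a * sum(a_g[i][j] * vinv_y[j] for j in range(q)) for i in range(q)]
    d_hat = [var_d * sum(d_g[i][j] * vinv_y[j] for j in range(q)) for i in range(q)]
    return a_hat, d_hat


def gblup_qm(t_a, t_d, y, var_a, var_d, var_e):
    """QM form: solve 2m x 2m mixed model equations for SNP effects."""
    q, m = len(y), len(t_a[0])
    t = [ra + rd for ra, rd in zip(t_a, t_d)]  # q x 2m matrix [T_alpha T_delta]
    tt = transpose(t)
    lam = [var_e / var_a] * m + [var_e / var_d] * m
    lhs = matmul(tt, t)
    for i in range(2 * m):
        lhs[i][i] += lam[i]
    rhs = [sum(tt[i][k] * y[k] for k in range(q)) for i in range(2 * m)]
    eff = solve(lhs, rhs)
    a_hat = [sum(t_a[i][j] * eff[j] for j in range(m)) for i in range(q)]
    d_hat = [sum(t_d[i][j] * eff[m + j] for j in range(m)) for i in range(q)]
    return a_hat, d_hat


# Toy inputs: 3 individuals, 2 markers (illustrative values only).
t_alpha = [[0.9, -0.3], [-0.5, 0.7], [0.2, -1.1]]
t_delta = [[-0.4, 0.6], [0.8, -0.2], [-0.1, 0.3]]
y_obs = [1.2, -0.7, 0.4]
ce = gblup_ce(t_alpha, t_delta, y_obs, 0.3, 0.1, 0.6)
qm = gblup_qm(t_alpha, t_delta, y_obs, 0.3, 0.1, 0.6)
```

The CE route solves a system of size q and the QM route a system of size 2m, yet both return the same predicted breeding values and dominance deviations up to rounding error. Neither route inverts $A_g$ or $D_g$.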

Heritability Estimates
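Three heritability estimates can be obtained from estimates of variance components: additive heritability or heritability in the narrow sense, dominance heritability, and the total heritability as the summation of additive and dominance heritabilities.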
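As a minimal sketch, heritabilities can be computed from variance component estimates as ratios to the phenotypic variance. The function below assumes that the phenotypic variance is the sum of the additive, dominance and residual variance components and that the total heritability is the sum of the additive and dominance heritabilities; the function name and input values are hypothetical.

```python
def heritabilities(var_alpha, var_delta, var_e):
    """Return (additive, dominance, total) heritability estimates.

    Assumes phenotypic variance = additive + dominance + residual variance.
    """
    var_y = var_alpha + var_delta + var_e
    h2_alpha = var_alpha / var_y
    h2_delta = var_delta / var_y
    return h2_alpha, h2_delta, h2_alpha + h2_delta


# Illustrative variance component estimates only.
h2_a, h2_d, h2_total = heritabilities(0.3, 0.1, 0.6)
```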

Main Features of the CE and QM Formulations
The CE and QM sets of formulations for GBLUP, reliability and GREML are mathematically identical, offer identical results, and offer complementary computing efficiency. The CE set is designed for m > q and is the best approach for using a large number of markers for GBLUP and GREML, while the QM set is designed for q > m and is the best approach for using a large number of individuals in GBLUP and GREML. A simple rule for choosing between CE and QM is: use CE if q < 2m, and QM otherwise. This is because the size of the V matrix to be inverted is q for CE (assuming one observation per individual) and the size of the MME coefficient matrix of Equation 19 is 2m for QM, so that V becomes easier to invert than the MME coefficient matrix of Equation 19 for q < 2m. Neither set requires the inversion of the additive and dominance relationship matrices. The CE set uses relationship matrices explicitly whereas the QM set does so implicitly. Both sets are invariant to the invertibility of $A_g$ and $D_g$, i.e., both sets are applicable to singular $A_g$ and $D_g$, applicable to m > q where $A_g$ and $D_g$ are generally invertible, and applicable to q > m where $A_g$ and $D_g$ are non-invertible. The property of invariance to the invertibility of additive and dominance relationship matrices is a significant convenience because researchers do not have to require m > q and do not need to assess invertibility that is not guaranteed by m > q, e.g., the existence of identical twins results in non-invertible $A_g$ and $D_g$. The EM type algorithm of Equations 15-17 and 23-25 is known to be reliable but slow. The AI-REML algorithm [23][24][25] is fast but is not as reliable as the EM type. The implementation of AI-REML for estimating additive, dominance and residual variance components is described in Text S1: Part C. All formulations for GBLUP, reliability, genomic relationships and GREML including AI-REML are implemented by the GVCBLUP package [26], which is freely available at http://animalgene.umn.edu.
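The rule for choosing between the two sets can be written as a one-line comparison of the matrix sizes involved. The sketch below assumes one observation per individual, as in the text; the function name is hypothetical and is not part of the GVCBLUP package.

```python
def choose_formulation(q, m):
    """Return 'CE' if q < 2m, otherwise 'QM'.

    CE works with a q x q matrix V; QM works with a 2m x 2m
    coefficient matrix of the mixed model equations.
    """
    return "CE" if q < 2 * m else "QM"


# 1654 individuals with 40,000 markers: the q x q matrix is the smaller one.
small_sample = choose_formulation(1654, 40000)
# 100,000 individuals with 3,000 markers: the 2m x 2m matrix is the smaller one.
large_sample = choose_formulation(100000, 3000)
```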

Accuracy of GREML and GBLUP for Additive and Dominance Heritabilities
A simulation study with known true values of genetic effects and parameters is an effective approach to evaluate the accuracy of a new methodology because the observed GBLUP and GREML estimates can be compared with the true values. We generated a large number of simulated data sets based on a true dairy cattle SNP structure of 1654 Holstein cows assuming true additive and dominance heritability levels of 0, 0.05, 0.15 and 0.30, and we applied seven SNP sets to the simulated data: 1K causal variants, 1K SNP, 2K SNP and causal variants, 3K, 7K and 40K SNP markers, and 41K SNP markers and causal variants. Detailed information about these marker sets and the procedure to generate the simulation data are described in Text S1: Part D. For the sample size of 1654 individuals in the simulation study with seven causal and SNP marker sets, GREML was able to capture small effects that each accounted for only 0.00005-0.0003 of the phenotypic variance with high accuracy and was able to distinguish between high and low heritability levels. However, dominance GREML was less accurate and required a higher density of SNP markers than additive GREML (Table S1). These results were encouraging given the rapid data growth in genomic selection [27][28][29] that could substantially increase the GREML accuracy for both additive and dominance effects over the accuracies observed with our sample size.
GREML accuracy of causal variants. Causal SNP markers (1K_QTL, Table S1) had the best accuracy in almost all cases and had similar accuracies for both additive and dominance heritabilities except for the case with $h^2_\alpha = 0.05$ and $h^2_\delta = 0.05$, where the estimate of dominance heritability was $\hat{h}^2_\delta = 0.03 \pm 0.02$ and the estimate of additive heritability was $\hat{h}^2_\alpha = 0.06 \pm 0.01$. Adding linked SNP markers to the causal SNP (2K and 41K in Table S1) decreased GREML accuracy in most cases. Causal SNP markers had nearly unbiased estimates of heritabilities (Figure 1) and had the smallest MSE of heritability estimates (Figure 2). The bias and MSE of variance components had similar patterns to those for heritabilities (data not shown).
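Bias and mean square error of heritability estimates across simulation repeats can be computed as in the sketch below. The definition of relative bias as bias divided by the true value is an assumption, since the exact definition used for Figures 1 and 2 is not stated in this text. The function name and the replicate values are hypothetical and are not results from the study.

```python
def bias_and_mse(estimates, true_value):
    """Return (bias, relative_bias, mse) of replicate estimates.

    relative_bias is None when the true value is zero.
    """
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    rel_bias = bias / true_value if true_value != 0 else None
    return bias, rel_bias, mse


# Ten hypothetical replicate estimates for a true heritability of 0.15.
reps = [0.14, 0.17, 0.15, 0.13, 0.16, 0.15, 0.18, 0.12, 0.15, 0.15]
bias, rel_bias, mse = bias_and_mse(reps, 0.15)
```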
GREML accuracy of linked SNP markers. Linked SNP markers were less accurate than causal SNP markers in nearly all cases but were still highly accurate for estimating additive variance. For additive effects, GREML using the 40K and 41K SNP sets had a tendency to slightly overestimate additive heritabilities and variance components. For dominance effects, the marker densities in this simulation study, 1K_SNP, 3K, 7K and 40K, were all insufficient to achieve accurate estimates of dominance heritabilities and variance components, although the 40K set was able to distinguish between high and low dominance heritabilities. Accuracy of dominance GREML increased as the density of linked SNP markers increased from 1K_SNP to 40K, indicating that a further increase in marker density over 40K could improve the accuracy of dominance GREML (Table S1).
GREML estimates for '0' heritability. Estimating '0' heritability generally is considerably more difficult than estimating non-null heritability. Therefore, the accuracy in estimating '0' heritability is a strong test for the accuracy of the GREML formulations. From the same simulation data set we generated above, we generated another set of simulation data requiring additive or dominance effects to be the only genetic effects such that $h^2_\alpha = 0.00$ and/or $h^2_\delta = 0.00$ to test the performance of GREML when the true heritability and variance component for one or both effects were null. The causal variants (503_A and 503_D) again had the highest accuracy in estimating '0' heritabilities and variance components, with average heritability estimates in the range 0-0.01 for additive heritability and 0-0.02 for dominance heritability (Table S2). The 1K SNP set with half causal variants and half inter-QTL SNP (503_A + 503_D) was virtually as accurate as the causal variants of 503_A or 503_D. The 41K set also included the causal variants but was not as accurate as the 1K set and overestimated dominance heritability by 0.05 when the true dominance heritability was '0'. The 40K inter-QTL SNP markers had the same overestimates as the 41K. The results of the 1K, 40K and 41K SNP sets showed that a large number of linked SNP markers decreased the GREML accuracy when the true dominance heritability was null. Overall, the GREML formulations were surprisingly accurate in estimating null additive and dominance heritabilities except for the 40K and 41K marker sets for null dominance heritability.

Accuracy of GBLUP for breeding values, dominance deviations and genetic values. GBLUP of genotypic values ($\hat{g}$) and GBLUP of breeding values ($\hat{a}$) were less sensitive to marker density than GREML. GBLUP of dominance deviations ($\hat{d}$) was sensitive to marker density, as was dominance GREML. Observed and expected accuracies all increased as heritability levels increased. The benefit of using $\hat{g}$ over $\hat{a}$ or $\hat{d}$ for predicting the total genotypic values increased as dominance heritability increased for a given additive heritability except for the case $h^2_\delta = 0.05$ (Figure 3).
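In a simulation, observed accuracy can be measured as the correlation between true values and their predictions, as in the correlations reported in Figure 3. The sketch below computes Pearson correlations between hypothetical true genotypic values and hypothetical predictions, with the genotypic prediction formed as the sum of the additive and dominance predictions. The function name and all numbers are illustrative and are not results from the study.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical true genotypic values and GBLUP predictions for 6 individuals.
true_g = [1.2, -0.4, 0.3, -1.1, 0.8, -0.6]
a_hat = [0.7, -0.1, 0.2, -0.6, 0.3, -0.4]
d_hat = [0.2, -0.2, 0.0, -0.1, 0.3, -0.1]
g_hat = [a + d for a, d in zip(a_hat, d_hat)]

corr_g_a = pearson(true_g, a_hat)  # accuracy of breeding values for predicting g
corr_g_d = pearson(true_g, d_hat)  # accuracy of dominance deviations for predicting g
corr_g_g = pearson(true_g, g_hat)  # accuracy of genotypic values for predicting g
```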

GBLUP accuracy of causal variants and linked SNP markers. Causal variants had the best GBLUP accuracy for $\hat{a}$, $\hat{d}$ and $\hat{g}$, but the accuracy for $\hat{d}$ was lower than that for $\hat{a}$ and $\hat{g}$, unlike additive and dominance GREML that had similar accuracies using causal variants (Table S3). For various densities of inter-QTL SNP markers ranging from 3K and 7K to 40K, $\hat{R}_g$ and $\hat{R}_a$ were relatively unchanged within each combination of additive and dominance heritability levels, indicating that increasing SNP density over 3K would achieve little improvement in $\hat{R}_g$ and $\hat{R}_a$. For the 1K_SNP, $\hat{R}_g$ and $\hat{R}_a$ were lower than for the 3K, 7K and 40K by about 0.05. In contrast, $\hat{R}_d$ was similar for the 7K and 40K, had a substantial decrease for the 1K_SNP and 3K, and was considerably lower than $\hat{R}_a$ across all heritability combinations. These results indicated that dominance GBLUP required a higher density of SNP markers and was more difficult than additive GBLUP (Table S3).
Adding linked SNP markers to causal variants (1K_QTL + 1K_SNP, 1K_QTL + 40K) resulted in lower observed accuracies than causal variants alone. The decrease in $\hat{R}_a$ was 0.03 for adding the 1K_SNP to the 1K_QTL and was 0.06 for adding the 40K to the 1K_QTL. The decreases in $\hat{R}_d$ were even larger, 0.06 and 0.13, respectively. These decreases were relatively constant across heritability levels (Table S3). However, any marker set with causal variants, the 2K or 41K, was more accurate than linked SNP only, the 1K_SNP, 3K, 7K or 40K.
Predicted and observed GBLUP accuracies. Predicted accuracy for breeding values ($R_a$) and for genotypic values ($R_g$) agreed well with the observed accuracies ($\hat{R}_a$ and $\hat{R}_g$) across all heritability levels used in this study.

Comparison of Genomic Additive and Dominance Relationships with Expected Relationships
For genomic additive and dominance relationships, Definitions I-III had nearly identical results. The 1K, 3K, 7K and 41K marker sets had similar results of relationships (data not shown). For the 41K results with the removal of three full-sib outliers and nine half-sib outliers, additive and dominance relationships agreed well with theoretical expectations (Figure 4). For full-sibs, genomic additive and dominance relationships were nearly identical to theoretical expectations. The average genomic additive relationship was 0.471 for Definition I, 0.478 for Definition II, and 0.488 for Definition III, while the mean value of pedigree coancestry coefficients for full-sibs was 0.262, i.e., genomic additive relationships were about twice as large as pedigree coancestry coefficients. The mean dominance correlation for full-sibs was 0.245 for Definition I, 0.248 for Definition II and 0.254 for Definition III, compared to the expected full-sib dominance correlation of 0.25 assuming no inbreeding. The 1654 cows used in this comparison of genomic and pedigree relationships in fact were all related [30]. Therefore, the true full-sib dominance relationships should have been above 0.25. For half-sibs, Definitions I-III had a mean additive relationship of 0.213-0.221 and the average of '2×(pedigree coancestry coefficient)' was 0.282.
Genomic relationships have a distinct advantage over pedigree relationships: the calculation of genomic relationships does not need to know the pedigree. This advantage is important for assessing relatedness among individuals in species where pedigree information is unavailable or difficult to collect, such as in wildlife species. Two important differences exist between relationships based on markers and relationships based on pedigree information. The first difference is that marker density affects the invertibility of genomic relationship matrices, which are non-invertible when q > m. In contrast, pedigree relationship matrices are positive definite in the absence of identical twins. Our cattle data showed that the invertibility of a genomic relationship matrix should not affect the use of genomic relationships as measures of genomic relatedness among individuals, because the genomic relationships calculated from genomic relationship matrices that were invertible or non-invertible had nearly identical values that were consistent with theoretical expectations (data not shown). The additive and dominance relationship matrices were non-invertible for the 1K SNP set, and were invertible for the 3K, 7K and 41K sets after removing a potentially identical twin or duplicated individual. The second difference is the range of relationship values. Genomic relationships by Definitions I-III could take negative values whereas pedigree relationships are nonnegative. However, no negative values were observed for full-sib genomic relationships. Negative genomic additive relationships with small absolute values near '0' were observed for unrelated individuals and some half-sibs, and negative dominance relationships with small absolute values near '0' were observed among half-sibs (Figure 4). In all those situations, the expected relationships were '0'. Therefore, negative genomic relationships close to '0' could be interpreted as no correlation. It remains to be seen whether genomic relationship measures could detect true 'negative genomic correlations' (if such correlations exist) that are impossible to detect using pedigree information.
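The comparison between genomic additive relationships and pedigree coancestry rests on the fact that the pedigree additive relationship is twice the coancestry coefficient. The sketch below applies this to the full-sib means reported above; the function and variable names are hypothetical.

```python
def additive_relationship_from_coancestry(coancestry):
    """Pedigree additive relationship is twice the coancestry coefficient."""
    return 2.0 * coancestry


# Full-sib means reported in the text.
full_sib_coancestry = 0.262
genomic_full_sib = {"I": 0.471, "II": 0.478, "III": 0.488}

pedigree_full_sib = additive_relationship_from_coancestry(full_sib_coancestry)
# Ratio of each genomic mean to the pedigree coancestry mean ("about twice").
ratios = {name: value / full_sib_coancestry for name, value in genomic_full_sib.items()}
```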

Effect of Genomic Relationship Definitions on GBLUP and GREML Accuracies
Simulation results showed that the methods to normalize $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ (Definitions I and II of genomic relationships) had no effect on GBLUP accuracy, i.e., the original mixed model was just as accurate, as shown by the $\hat{R}_a$ and $\hat{R}_d$ values (Table S4). Without normalization or transformation, the diagonal values of $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ increased and estimates of variance components decreased as the number of SNP markers increased regardless of the true heritability level, so that heritability estimates based on such variance component estimates became meaningless, as shown by the comparison of GREML estimates and the corresponding heritability estimates in Table S4. For GREML estimation of additive and dominance variances, Definitions I-III had similar estimates that were consistent with the true values.

Random and Directional Dominance Effects
Random additive and dominance effects with zero means were assumed in the simulation study reported in the Results section. Under these assumptions, dominance effects were more difficult to predict and estimate in two aspects: the current densities of inter-QTL SNP markers up to 40K were insufficient to achieve accuracies comparable to those for additive effects, and causal variants had lower accuracy of dominance GBLUP than of additive GBLUP, although causal variants had similar accuracy for estimating additive and dominance variance components. The simulation results indicated that the number of SNP markers needed in the absence of causal variants would be considerably greater than 40K to achieve accuracies of dominance GBLUP and GREML comparable to the accuracies of additive GBLUP and GREML. A high density of SNP markers could also compensate for the lower accuracies of causal variants, whether or not causal variants were among the SNP markers. The simulation data set assuming a positive dominance deviation for each heterozygous genotype (Text S1: Part D) showed that dominance GBLUP had similar accuracies to additive GBLUP (Table S5).
Taking all evidence together, genomic prediction and variance component estimation of dominance effects were more difficult than those of additive effects in populations where additive and dominance effects had similar distributions and heritabilities, but could achieve accuracies similar to those for additive effects if heterosis exists.

An Application to Estimate Genomic Additive and Dominance Heritabilities in a Swine Population
We applied our methodology to a publicly available swine genomics data set with anonymous genome-wide SNP markers and phenotypes, with the SNP locations and true trait names masked [31], to compare genomic additive heritability with the reported heritability estimated using pedigree information and to explore whether the swine phenotypes had dominance effects. The data set included 3534 animals from a single PIC nucleus pig line with genotypes from the Illumina PorcineSNP60 chip [32]. Genotyped animals had phenotypes for five purebred traits (phenotypes in a single nucleus line), with additive heritability estimated from pedigree data ranging from 0.07 to 0.62 (Table 1). Genotypes were filtered by requiring minor allele frequency (MAF) > 0.001 and proportion of missing SNP genotypes < 0.100. Markers on the X or Y chromosome were excluded. The total number of available autosome markers used in our analysis was 52,842, with missing genotypes imputed using the software AlphaImpute [33]. The results showed that estimates of genomic additive heritability of 0.22-0.38 were substantially lower than the pedigree estimates of 0.38-0.62 for traits 3-4, the genomic additive heritability (0.27) was higher than the pedigree estimate (0.16) for trait 2, and the genomic estimate was in agreement with the pedigree estimate for trait 1, 0.03 versus 0.07. Only traits 3 and 5 had small dominance heritabilities, 0.07 for trait 3 and 0.05 for trait 5. The genomic estimates reported here provide useful information to breeders about the underlying true genetic factors and about the potential true heritability levels of the five traits.
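The genotype filtering step can be sketched as a per-marker check of minor allele frequency and missing rate, using the thresholds stated above. The data representation (allele counts 0, 1 or 2 with None for a missing call) and the function name are assumptions for illustration and do not describe the software used in the analysis.

```python
def passes_qc(genotypes, min_maf=0.001, max_missing=0.100):
    """Return True if a marker has MAF > min_maf and missing rate < max_missing.

    genotypes: list of allele counts (0, 1 or 2), with None for missing calls.
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    missing_rate = (n - len(called)) / n
    freq = sum(called) / (2.0 * len(called))  # frequency of the counted allele
    maf = min(freq, 1.0 - freq)
    return maf > min_maf and missing_rate < max_missing


# Toy markers for 10 animals (illustrative values only).
markers = {
    "snp_ok": [0, 1, 2, 1, 0, 1, 2, 1, 1, 1],
    "snp_monomorphic": [0] * 10,
    "snp_missing": [0, 1, 2, 1, 0, 1, 2, 1, 1, None],
}
kept = [name for name, g in markers.items() if passes_qc(g)]
```

In this toy example only the first marker is retained: the second has a minor allele frequency of zero, and the third has a missing rate of exactly 0.100, which does not satisfy the strict inequality.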

Conclusions
The genomic model based on the partition of a genotypic value into breeding value and dominance deviation with additive and dominance relationship matrices calculated using SNP markers parallels the traditional quantitative genetics model that calculates additive and dominance relationships using pedigree information.The GREML and GBLUP methods based on equivalent models with complementary computing advantages and identical mathematical results provide an efficient approach for the genomic estimation of variance components and heritabilities and for the genomic prediction of additive and dominance effects using SNP markers.These methods were able to capture small additive and dominance effects and were able to differentiate different levels of additive and dominance heritabilities.GBLUP of total genetic value that includes additive and dominance effects can be an effective tool to predict an individual's total genetic potential for a phenotype.
P is given by Equation 14. Note that $a_{ii} = d_{ii} = 1$ for Definition III but the $a_{ii}$ and $d_{ii}$ values generally are not '1' for Definitions I and II. The average of $a_{ii}$ values and the average of $d_{ii}$ values are '1' under Definitions II and III, and are expected to be '1' under Definition I although the observed average $a_{ii}$ and $d_{ii}$ values under Definition I may deviate from '1'. For individuals without phenotypic observations (individuals in validation data set), formulations of GBLUP-CE and associated reliability measures are given in Text S1: Part B.
For dominance deviations, predicted accuracy ($R_d$) and observed accuracy ($\hat{R}_d$) agreed well except for $h^2_\alpha = h^2_\delta = 0.05$, where $R_d$ was substantially lower than the observed accuracy. In real data sets, observed accuracies measured by $\hat{R}_a$, $\hat{R}_d$ and $\hat{R}_g$ are unavailable. The good agreement between predicted and observed accuracies indicated that predicted accuracy could reliably represent the observed accuracy in real data.

Figure 1. Bias and relative bias of GREML estimates of additive and dominance heritabilities. On the X-axis, heritabilities of the top row are dominance heritabilities and those of the bottom row are additive heritabilities (n = 10 repeats). doi:10.1371/journal.pone.0087666.g001

Figure 2. Mean square error (MSE) and relative MSE of GREML estimates of additive and dominance heritabilities. On the X-axis, heritabilities of the top row are dominance heritabilities and those of the bottom row are additive heritabilities (n = 10 repeats). doi:10.1371/journal.pone.0087666.g002

Figure 3. Correlation between the true genotypic values and GBLUP of breeding values, dominance deviations and genetic values. $\mathrm{Corr}(g,\hat{a})$ is the correlation between true genotypic values and GBLUP of breeding values, $\mathrm{Corr}(g,\hat{d})$ is the correlation between true genotypic values and GBLUP of dominance deviations, and $\mathrm{Corr}(g,\hat{g})$ is the correlation between true genotypic values and GBLUP of genotypic values. On the X-axis, heritabilities of the top row are dominance heritabilities and those of the bottom row are additive heritabilities (n = 10 repeats). doi:10.1371/journal.pone.0087666.g003

Figure 4. Genomic additive and dominance relationships among full-sibs, half-sibs and unrelated individuals. doi:10.1371/journal.pone.0087666.g004

In the mixed model of Equation 3, $Z$ = $N \times q$ model matrix allocating phenotypic observations to SNP marker genotypes of individuals, $W_\alpha$ = $q \times m$ model matrix for gene substitution effects of SNP markers, $\alpha$ = column vector of gene substitution effects of SNP markers, $W_\delta$ = $q \times m$ model matrix for dominance effects of SNP markers, $\delta$ = column vector of dominance effects of SNP markers, $X$ = $N \times c$ model matrix for fixed non-genetic effects such as herd-year-season in dairy cattle, and $b$ = vector of fixed effects. Assumptions for the first and second moments are: $E(y) = Xb$, $\mathrm{Var}(\alpha) = I_m\sigma^2_\alpha$, $\mathrm{Var}(\delta) = I_m\sigma^2_\delta$, and $\mathrm{Var}(e) = R = I_N\sigma^2_e$, where $\sigma^2_e$ = residual variance, $I_m$ = $m \times m$ identity matrix, and $I_N$ = $N \times N$ identity matrix. With the model and assumptions of Equations 1-3, methods for GBLUP and genomic variance component estimation using restricted maximum likelihood estimation (GREML) can be developed.
$\mathrm{Var}(a) = A_g\sigma^2_\alpha$ and $\mathrm{Var}(d) = D_g\sigma^2_\delta$ defined in Equation 9. However, this type of conversion practically is unnecessary because the average $a_{ii}$ and $d_{ii}$ values are '1' under Definitions II and III and are expected to be '1' under Definition I of genomic additive and dominance relationships.

Table 1. Estimated genomic additive and dominance heritabilities from a swine nucleus line. The value in each () is the pedigree-based heritability estimate [31]. doi:10.1371/journal.pone.0087666.t001

Definition III of genomic relationships had the same accuracy as Definitions I-II for breeding values and had slightly lower accuracy for dominance deviations for one case only, with $\hat{R}_d = 0.75$ for Definition III and $\hat{R}_d = 0.76$ for Definitions I and II (Table S4). For GREML, normalization or transformation of the $W_\alpha W_\alpha'$ and $W_\delta W_\delta'$ matrices was necessary.