
The authors have declared that no competing interests exist.

Conceived and designed the experiments: YD. Performed the experiments: YD CW SW GH. Analyzed the data: CW SW GH. Contributed reagents/materials/analysis tools: CW SW. Wrote the paper: YD CW GH.

We established a genomic model of quantitative traits with genomic additive and dominance relationships that parallels the traditional quantitative genetics model, which partitions a genotypic value as breeding value plus dominance deviation and calculates additive and dominance relationships using pedigree information. Based on this genomic model, two sets of computationally complementary but mathematically identical mixed model methods were developed for genomic best linear unbiased prediction (GBLUP) and genomic restricted maximum likelihood estimation (GREML) of additive and dominance effects using SNP markers. These two sets are referred to as the CE and QM sets, where the CE set was designed for large numbers of markers and the QM set was designed for large numbers of individuals. GBLUP and associated accuracy formulations for individuals in training and validation data sets were derived for breeding values, dominance deviations and genotypic values. A simulation study showed that GREML and GBLUP were generally able to capture small additive and dominance effects that each accounted for 0.00005–0.0003 of the phenotypic variance, and that GREML was able to differentiate true additive and dominance heritability levels. GBLUP of the total genetic value as the summation of additive and dominance effects had higher prediction accuracy than either additive or dominance GBLUP. Causal variants had the highest accuracy of GREML and GBLUP, and predicted accuracies were in agreement with observed accuracies. Genomic additive and dominance relationship matrices using SNP markers were consistent with theoretical expectations. The GREML and GBLUP methods can be an effective tool for assessing the type and magnitude of genetic effects affecting a phenotype and for predicting the total genetic value at the whole genome level.

Genomic prediction using genome-wide single nucleotide polymorphism (SNP) markers has been shown to be a powerful tool to capture small genetic effects dispersed over the genome for predicting an individual’s genetic potential of a phenotype

From the point of view of missing heritability

Genomic best linear unbiased prediction (GBLUP) and various Bayesian methods are available for genomic prediction, and GBLUP generally had good performance in real data

The objectives of this study were to develop mixed model methods for the joint genomic prediction and variance component estimation of additive and dominance effects based on the traditional quantitative genetics model that partitions a genotypic value into breeding value and dominance deviation. The methodology will have two complementary computing strategies for large numbers of individuals and markers, and the genomic prediction methods will have GBLUP and associated reliability for both training and validation data sets. Accuracies of the new methods will be evaluated using simulation data based on a true dairy cattle SNP structure.
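The idea of two complementary computing strategies that give the same answer can be illustrated with a minimal numerical sketch. It is not the formulation developed in this paper: it assumes a single set of random marker effects, column-centered genotype codes, centered phenotypes with no fixed effects, and an arbitrary variance ratio `lam`; all variable names are hypothetical. One route solves a system whose size is the number of markers, the other a system whose size is the number of individuals, and both return the same predicted genetic values.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 30, 200                      # individuals, markers (toy sizes)
genotypes = rng.integers(0, 3, size=(n, m)).astype(float)  # 0/1/2 allele counts
Z = genotypes - genotypes.mean(axis=0)                     # column-centered coding (assumption)
y = rng.normal(size=n)
y = y - y.mean()                    # centered phenotypes, so no fixed effects are needed
lam = 50.0                          # assumed ratio: residual variance / per-marker variance

# Marker-level route: solve an m x m system for marker effects, then sum them.
marker_effects = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)
g_marker = Z @ marker_effects

# Individual-level route: solve an n x n system built from Z Z'.
K = Z @ Z.T
g_individual = K @ np.linalg.solve(K + lam * np.eye(n), y)

# Largest absolute difference between the two sets of predictions.
max_diff = float(np.max(np.abs(g_marker - g_individual)))
print(max_diff)
```

The marker-level system grows with the number of markers and the individual-level system grows with the number of individuals, so which route is cheaper depends on which dimension is smaller.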

The genetic model of SNP markers is an expansion of the additive model used in genomic evaluations _{ij}_{ij}

As the number of SNP markers increases, the values of the diagonal elements of ^{th} diagonal element of ^{th} diagonal element of

The additive relationship or correlation matrix (

In

With the

The CE form of GBLUP from Model 1 can be calculated as:

The mixed model equations for predicting SNP additive effects (

_{α} and _{δ} are defined by

Three heritability estimates can be obtained from estimates of variance components: additive heritability or heritability in the narrow sense (

The CE and QM sets of formulations for GBLUP, reliability and GREML are mathematically identical, yield identical results, and offer complementary computing efficiency. The CE set is designed for

Formulations for GBLUP-CE and GBLUP-QM for individuals without phenotypic observations (individuals in validation data set) and reliability measures are given in

A simulation study with known true values of genetic effects and parameters is an effective approach to evaluating the accuracy of a new methodology because the observed GBLUP and GREML estimates can be compared with the true values. We generated a large number of simulated data sets based on a true dairy cattle SNP structure of 1654 Holstein cows, assuming true additive and dominance heritability levels of 0, 0.05, 0.15 and 0.30, and we applied seven SNP sets to the simulated data: 1K causal variants, 1K SNP, 2K SNP and causal variants, 3K, 7K and 40K SNP markers, and 41K SNP markers and causal variants. Detailed information about these marker sets and the procedure to generate the simulation data are described in
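As a rough illustration of how phenotypes with chosen additive and dominance heritabilities can be produced, the sketch below rescales two vectors of genetic values to the target variances and adds residual noise so that the phenotypic variance is 1 in expectation. This is a generic approach shown under stated assumptions, not the simulation procedure used in this study: the function `simulate_phenotypes` is hypothetical, and the genetic values are random placeholders rather than values derived from the Holstein SNP data.

```python
import numpy as np

def simulate_phenotypes(breeding_values, dominance_deviations, h2_add, h2_dom, rng):
    """Scale genetic components to target heritabilities and add residual noise.

    Phenotypic variance is 1 in expectation, so the variance of each
    component equals its heritability.
    """
    def rescale(values, target_variance):
        centered = values - values.mean()
        spread = centered.std()
        if target_variance == 0 or spread == 0:
            return np.zeros_like(centered)
        return centered * np.sqrt(target_variance) / spread

    a = rescale(np.asarray(breeding_values, dtype=float), h2_add)
    d = rescale(np.asarray(dominance_deviations, dtype=float), h2_dom)
    residual_sd = np.sqrt(1.0 - h2_add - h2_dom)
    e = rng.normal(0.0, residual_sd, size=a.shape)
    return a + d + e, a, d

rng = np.random.default_rng(1)
n = 1654                       # number of cows reported in the text
raw_a = rng.normal(size=n)     # placeholder genetic values, not from real SNP data
raw_d = rng.normal(size=n)
y, a, d = simulate_phenotypes(raw_a, raw_d, h2_add=0.30, h2_dom=0.15, rng=rng)
print(a.var(), d.var())
```

Setting either heritability to 0 returns a zero vector for that component, which corresponds to the null-heritability scenarios among the levels listed above.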

Causal SNP markers (1K_QTL,

On the X-axis, heritabilities in the top row are dominance heritabilities and those in the bottom row are additive heritabilities (n = 10 repeats).


Linked SNP markers were less accurate than causal SNP markers in nearly all cases but were still highly accurate for estimating additive variance. For additive effects, GREML using the 40K and 41K SNP sets tended to slightly overestimate additive heritabilities and variance components. For dominance effects, the marker densities in this simulation study (1K_SNP, 3K, 7K and 40K) were all insufficient to achieve accurate estimates of dominance heritabilities and variance components, although the 40K set was able to distinguish between high and low dominance heritabilities. Accuracy of dominance GREML increased as the density of linked SNP markers increased from 1K_SNP to 40K, indicating that a further increase in marker density beyond 40K could improve the accuracy of dominance GREML (

Estimating a heritability of 0 is generally considerably more difficult than estimating a non-null heritability. Therefore, the accuracy in estimating zero heritability is a strong test of the accuracy of the GREML formulations. From the same simulation data set generated above, we generated another set of simulation data requiring additive or dominance effects to be the only genetic effects such that

GBLUP of genotypic values (

Causal variants had the best GBLUP accuracy for

For various densities of inter-QTL SNP markers ranging from 3K, 7K to 40K,

Adding linked SNP markers to causal variants (1K_QTL + 1K_SNP, 1K_QTL + 40K) resulted in lower observed accuracies than causal variants alone. The decrease in

Predicted accuracy for breeding values (

For genomic additive and dominance relationships, Definitions I-III had nearly identical results. The 1K, 3K, 7K and 41K marker sets yielded similar relationship estimates (data not shown). For the 41K results with the removal of three full-sib outliers and nine half-sib outliers, additive and dominance relationships agreed well with theoretical expectations (

Genomic relationships have a distinct advantage over pedigree relationships: calculating genomic relationships does not require knowledge of the pedigree. This advantage is important for assessing relatedness among individuals in species where pedigree information is unavailable or difficult to collect, such as wildlife species. Two important differences exist between relationships based on markers and relationships based on pedigree information. The first difference is that marker density affects the invertibility of genomic relationship matrices, which are non-invertible when

Simulation results showed that the methods to normalize

Random additive and dominance effects with zero means were assumed in the simulation study reported in the Results section. Under these assumptions, dominance effects were more difficult to predict and estimate in two respects: the current densities of inter-QTL SNP markers up to 40K were insufficient to achieve accuracies comparable to those for additive effects, and causal variants had lower accuracy of dominance GBLUP than of additive GBLUP, although causal variants had similar accuracy for estimating additive and dominance variance components. The simulation results indicated that the number of SNP markers needed in the absence of causal variants would be considerably greater than 40K to achieve accuracies of dominance GBLUP and GREML comparable to the accuracies of additive GBLUP and GREML. A high density of SNP markers could also compensate for the lower accuracies of causal variants, whether or not causal variants were among the SNP markers. The simulation data set assuming positive dominance deviation for each heterozygous genotype (

Taking all the evidence together, genomic prediction and variance component estimation of dominance effects were more difficult than those of additive effects in populations where additive and dominance effects had similar distributions and heritabilities, but could achieve accuracies similar to those for additive effects if heterosis exists.

We applied our methodology to a publicly available swine genomics data set with anonymous genome-wide SNP markers and phenotypes with the SNP locations and true trait names masked

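| | Trait 1 | Trait 2 | Trait 3 | Trait 4 | Trait 5 |
|---|---|---|---|---|---|
| Additive heritability | 0.03 (0.07) | 0.27 (0.16) | 0.22 (0.38) | 0.35 (0.58) | 0.38 (0.62) |
| Dominance heritability | 7.22×10^{−7} | 0.02 | 0.07 | 0.01 | 0.05 |
| Total heritability | 0.03 | 0.29 | 0.29 | 0.36 | 0.44 |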

Value in each () is the pedigree-based heritability estimate
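A quick arithmetic check of the genomic estimates in this table can be sketched as follows. It assumes that the three rows of values are, in order, additive heritability, dominance heritability and total heritability (the row labels are an inference, since the sums match), and that every entry is rounded to two decimals; the variable names are illustrative only.

```python
# Genomic heritability estimates copied from the table (Traits 1 to 5).
# Row interpretation (additive, dominance, total) is an assumption.
additive = [0.03, 0.27, 0.22, 0.35, 0.38]
dominance = [7.22e-7, 0.02, 0.07, 0.01, 0.05]
reported_total = [0.03, 0.29, 0.29, 0.36, 0.44]

# Each table entry is rounded to two decimals, so allow for rounding error.
tolerance = 0.015

differences = [abs(a + d - t) for a, d, t in zip(additive, dominance, reported_total)]
consistent = all(diff <= tolerance for diff in differences)
print(consistent)
```

Under these assumptions, the additive and dominance estimates sum to the third row within rounding for all five traits, with the largest gap (about 0.01) at Trait 5.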

The genomic model based on the partition of a genotypic value into breeding value and dominance deviation with additive and dominance relationship matrices calculated using SNP markers parallels the traditional quantitative genetics model that calculates additive and dominance relationships using pedigree information. The GREML and GBLUP methods based on equivalent models with complementary computing advantages and identical mathematical results provide an efficient approach for the genomic estimation of variance components and heritabilities and for the genomic prediction of additive and dominance effects using SNP markers. These methods were able to capture small additive and dominance effects and were able to differentiate different levels of additive and dominance heritabilities. GBLUP of total genetic value that includes additive and dominance effects can be an effective tool to predict an individual’s total genetic potential for a phenotype.

GREML estimates of variance components and heritabilities of additive and dominance effects (mean ± standard deviation, n = 10 repeats).

(PDF)

GREML estimates and GBLUP accuracy for simulation data with additive or dominance effects only (mean ± standard deviation, n = 10 repeats).

(PDF)

GBLUP Accuracies for breeding values, dominance deviations and genotypic values (mean ± standard deviation, n = 10 repeats).

(PDF)

GREML estimates of variance components and GBLUP accuracies with and without genomic relationships for phenotypes with additive and dominance effects of 1006 QTL (mean ± standard deviation, n = 10 repeats).

(PDF)

GBLUP accuracies in simulation data assuming random additive effects and directional dominance effects (mean ± standard deviation, n = 10 repeats).

(PDF)

(PDF)
