The genetic connectedness calculated from genomic information and its effect on the accuracy of genomic prediction

Suo-Yu Zhang; Babatunde Shittu Olasege; Deng-Ying Liu; Qi-Shan Wang; Yu-Chun Pan; Pei-Pei Ma

doi:10.1371/journal.pone.0201400

Abstract

The magnitude of connectedness among management units (e.g., flocks and herds) determines how reliably genetic evaluations can be compared across these units. Traditionally, pedigree-based methods have been used to evaluate genetic connectedness in China. However, these methods have not yielded satisfactory results because the pedigree data lack accuracy and integrity. Therefore, it is necessary to ascertain genetic connectedness using genomic information (i.e., genome-based genetic connectedness). Moreover, the effects of various levels of genome-based genetic connectedness on the accuracy of genomic prediction remain poorly understood. A simulation study was performed to evaluate genome-based genetic connectedness across herds by applying prediction error variance of difference (PEVD), coefficient of determination (CD) and prediction error correlation (r). Genomic estimated breeding values (GEBV) were predicted using a GBLUP model from a single and a joint reference population. Overall, CD and r increased continuously and PEVD decreased correspondingly as the number of common sires rose from 0 to 19, regardless of heritability level, indicating increasing genetic connectedness between herds. Higher heritability tended to give stronger genetic connectedness. Compared to pedigree information, genomic relatedness inferred from genomic information increased the estimates of genetic connectedness across herds. Genomic prediction using the joint rather than the single reference population increased the accuracy of genomic prediction by 25%, and lower heritability benefited more. Moreover, the largest benefits were observed when the number of common sires was 0, and the gain in accuracy decreased as the number of common sires increased. We confirmed that genomic information enhanced the estimates of genetic connectedness across management units. Additionally, using the combined reference population substantially increased the accuracy of genomic prediction. However, care should be taken when combining reference data for closely related populations, which may give less reliable prediction results.

Citation: Zhang S-Y, Olasege BS, Liu D-Y, Wang Q-S, Pan Y-C, Ma P-P (2018) The genetic connectedness calculated from genomic information and its effect on the accuracy of genomic prediction. PLoS ONE 13(7): e0201400. https://doi.org/10.1371/journal.pone.0201400

Editor: Tzen-Yuh Chiang, National Cheng Kung University, TAIWAN

Received: March 15, 2018; Accepted: July 13, 2018; Published: July 31, 2018

Copyright: © 2018 Zhang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are freely available by QMSim software (https://www.ncbi.nlm.nih.gov/pubmed/19176551).

Funding: This work was supported by Agriculture Development through Science and Technology Key Project of Shanghai [ChanZi (2014-2016) 6 and grant no. TuiZi (2016) 1-1-4] and the National Natural Science Foundation of China (31701077).

Competing interests: The authors have declared that no competing interests exist.

Introduction

The reliability of genetic evaluations across management units (e.g., flocks and herds) depends on the magnitude of connectedness among these units. Comparisons of estimated breeding values (EBVs) tend to be biased when connectedness across units is poor[1]. The lower the connectedness across units, the larger the bias, and thus the lower the accuracy of comparisons of EBVs across units. It has been reported that a few highly selected sires in dairy cattle populations generally create strong genetic links owing to the wide use of artificial insemination (AI)[2]. However, this is not the case in sheep, beef cattle or pig populations, where AI is less used, leading to poor or no genetic connectedness across management units. Therefore, in these species it is necessary to estimate connectedness among management units before conducting genetic evaluation across them.

Traditionally, genetic connectedness is calculated with pedigree-based methods [1–4]. However, the integrity and accuracy of the pedigree information used in China cannot be guaranteed, which in turn may lead to low or unreasonable estimates of genetic connectedness across pig nucleus farms in China[3, 5, 6]. The lack of extensive and reliable pedigree information is a general problem in developing countries[7], particularly in China, where the sources of the pigs are extremely complex (e.g., pigs introduced from Denmark, the United States, Canada and France). Therefore, the actual genetic connectedness among Chinese pig farms might not be fully reflected by pedigree information because the pedigree recording systems of China and the foreign countries are inconsistent [2]. Moreover, Yu et al. [8] confirmed that genomic relatedness inferred from genomic information (i.e., single nucleotide polymorphisms, SNPs) increased the estimates of genetic connectedness across different management units, compared with pedigree information. Therefore, ascertaining genetic connectedness through genomic information is a plausible way to obtain more accurate estimates of genetic connectedness across pig farms in China, as well as to enhance the genetic improvement of Chinese pigs.

Recently, connectedness statistics have been used in genomic selection[9] for the sake of optimizing the design of reference population[10, 11]. However, it is important to investigate the effect of enhanced genetic connectedness estimated by genomic relatedness on the accuracy of genomic prediction, as noted by Yu et al.[8].

In this study, we simulated two populations that mimic existing Chinese pig populations, with the aims of measuring genetic connectedness across management units (i.e., populations) using genomic information and of investigating the effect of various levels of genetic connectedness across herds on the accuracy of genomic prediction.

Materials and methods

Simulation

A simulation scheme presented by E.C. Akanno[12] to mimic pig breeding programs in developing countries was adopted in our study to mimic the situation in China. The software QMSim[13] was used to simulate the genomic data, and the whole simulation process was repeated nine times. QMSim was designed to simulate a broad range of genetic architectures and population structures in livestock. Large-scale genotyping datasets and multiple livestock pedigrees can be reliably simulated. Populations are simulated in two steps: 1) a historical population is created to establish mutation-drift equilibrium, and 2) a recent population, which can be very complex, is simulated. A wide range of parameters (e.g., number of chromosomes, QTL and markers, crossover interference and location of QTL and markers) are available in order to simulate an appropriate genome. This simulator is efficient in time and memory[13].

Population structure.

The populations were generated in three steps. In the first step, 1000 generations with a gradual decrease in population size from 5000 to 1050 were simulated, and then the population size was further decreased from 1050 to 200 in the following 1000 generations for the purpose of creating initial linkage disequilibrium (LD) and establishing mutation-drift equilibrium in historical population (HP).

In the second step, an expanded population (EP) was simulated by randomly choosing 100 founder males and 100 founder females from the last generation of HP. To expand the population, six generations were simulated assuming 10 offspring per dam under random mating.

In the third step, three recent populations (RP) (i.e., Herd1, Herd2 and Herd3) were simulated, each founded by 20 males and 400 females from the last generation of EP. This size represents the median group size for pig nucleus farms in China. Herd1 was composed of the top 20 males and top 400 females of the EP, ranked on their own phenotypic values. To ensure that Herd1 had no connection with Herd2, Herd2 was formed by selecting the last 20 males and the last 400 females in that ranking. It is well recognized that genetic connectedness among Chinese pig herds is generally established through the use of common sires (i.e., sires with progeny in multiple herds or sires born in one herd with progeny in another herd) or through the transfer of seedstock from one herd to another[3]. Therefore, to mimic the genetic connectedness created by common sires, the 400 founder females of Herd3 all came from the first generation of Herd2, while the 20 founder males of Herd3 came from Herd1 and Herd2. If n (0 ≤ n ≤ 19) founder males of Herd1 are defined as common sires, the remaining 20 − n males come from Herd2. Increasing n increased the genetic connectedness between Herd1 and Herd3. Moreover, the RP parameters used in this study closely mimicked a real Chinese pig production system, with selection for high EBV and culling for low EBV, and a replacement rate of 100% for sires and 40% for dams. The best linear unbiased prediction (BLUP) method, based on Henderson’s mixed model theory[14] for an animal model, was used to estimate breeding values. Three traits corresponding to number born alive, average daily gain and backfat were mimicked, with heritabilities and phenotypic variances taken from a previous study by Akanno et al. [15]. Considering the computing time and memory requirements, only two generations of each RP were simulated. Herd1 and Herd3 each had 2020 individuals, made up of 420 founders and 800 progeny from each of the first and second generations. Details of the parameters used to generate genomic data are given in Table 1, while the simulation steps are described in Fig 1.

Fig 1. A sketch map of simulation process.

Note: Ne: effective population size; LD: linkage disequilibrium.

https://doi.org/10.1371/journal.pone.0201400.g001

Table 1. Parameters of the simulation process.

https://doi.org/10.1371/journal.pone.0201400.t001

Genome.

The genome parameters were consistent with those of a previous study [16]. To create a more realistic pig genome size, each chromosome was simulated with an average length of 100 cM[17]. The marker density approximated that of the 60 K SNP chip currently available[18]. The parameters shown in Table 1 were used to simulate the genome.

Genetic connectedness criteria

We used prediction error variance (PEV) of differences (PEVD), generalized coefficient of determination (CD) and prediction error correlation (r), defined below, to investigate genetic connectedness between Herd1 and Herd3. Here, the PEV were obtained from Henderson’s mixed model equations (MME) [14], and the PEV of the ith individual is given by

$$\mathrm{PEV}(\hat{g}_i) = \mathrm{Var}(\hat{g}_i - g_i) = D^{22}_{ii}\,\sigma_e^2$$

where $D^{22}_{ii}$ is the ith diagonal element of D₂₂, the part of the inverse of the MME coefficient matrix (D) corresponding to genetic values, and $\sigma_e^2$ is the residual variance. A detailed description of the genetic connectedness criteria was provided by Yu et al [8].

PEVD is the average PEV of all pairwise EBV differences between individuals across management units[2]. For a pair of individuals it is calculated as

$$\mathrm{PEVD}(\hat{g}_i - \hat{g}_j) = \mathrm{PEV}(\hat{g}_i) + \mathrm{PEV}(\hat{g}_j) - 2\,\mathrm{PEC}_{ij} = \left(D^{22}_{ii} + D^{22}_{jj} - 2D^{22}_{ij}\right)\sigma_e^2$$

where $\hat{g}_i$ and $\hat{g}_j$ represent the estimated genetic values of individual i and individual j, respectively, and PEC_ij is the prediction error covariance (PEC), defined by the corresponding off-diagonal element of the PEV matrix. The PEVD is used as a criterion of genetic connectedness because poorly connected individuals have a higher prediction error of their difference than strongly connected ones. In this study, a scaled PEVD was used for further analysis based on Kuehn’s suggestion[19]. A smaller PEVD indicates stronger connectedness.

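CD, the generalized coefficient of determination[20], is calculated as follows

$$\mathrm{CD}_{ij} = 1 - \lambda\,\frac{D^{22}_{ii} + D^{22}_{jj} - 2D^{22}_{ij}}{R_{ii} + R_{jj} - 2R_{ij}}$$

where $\lambda = \sigma_e^2/\sigma_g^2$ is the ratio of the residual to the additive genetic variance, $D^{22}_{ii}$, $D^{22}_{jj}$ and $D^{22}_{ij}$ are the same values defined above, and R is a relationship matrix which measures the relationship between individuals (defined below). This statistic ranges from 0 to 1, with larger values representing stronger connectedness.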

The r between the genetic values of individuals from different management units is derived as[4]

$$r_{ij} = \frac{\mathrm{PEC}_{ij}}{\sqrt{\mathrm{PEV}(\hat{g}_i)\,\mathrm{PEV}(\hat{g}_j)}}$$

Similar to CD, the statistic r also ranged from 0 to 1, and a larger r indicated stronger connectedness across management groups.
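The three criteria can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes one record per animal, herd fitted as the only fixed effect, exactly two herds, and plain (unscaled) PEVD averaged over all cross-herd pairs. The function name `connectedness` and the toy relationship matrices are hypothetical.

```python
import numpy as np

def connectedness(R, herd, h2, sigma_e2=1.0):
    """Average PEVD, CD and r over all cross-herd pairs (two herds)."""
    R = np.asarray(R, dtype=float)
    herd = np.asarray(herd)
    units = np.unique(herd)
    n = R.shape[0]
    lam = (1.0 - h2) / h2                       # lambda = sigma_e^2 / sigma_g^2
    # Assumption: herd is the only fixed effect, one record per animal.
    X = np.column_stack([(herd == u).astype(float) for u in units])
    Z = np.eye(n)
    # MME coefficient matrix and the genetic-value block of its inverse (D22).
    C = np.block([[X.T @ X, X.T @ Z],
                  [Z.T @ X, Z.T @ Z + lam * np.linalg.inv(R)]])
    D22 = np.linalg.inv(C)[len(units):, len(units):]
    i = np.flatnonzero(herd == units[0])
    j = np.flatnonzero(herd == units[1])
    d = np.diag(D22)
    pec = D22[np.ix_(i, j)]                     # cross-herd PEC / sigma_e^2
    diff = d[i][:, None] + d[j][None, :] - 2.0 * pec
    r_diag = np.diag(R)
    var_diff = r_diag[i][:, None] + r_diag[j][None, :] - 2.0 * R[np.ix_(i, j)]
    pevd = diff * sigma_e2
    cd = 1.0 - lam * diff / var_diff
    r = pec / np.sqrt(np.outer(d[i], d[j]))
    return pevd.mean(), cd.mean(), r.mean()

herd = np.array([1, 1, 2, 2])
R_disconnected = np.eye(4)                      # no relationships across herds
R_connected = np.eye(4)
R_connected[0, 2] = R_connected[2, 0] = 0.25    # e.g. half-sibs of a common sire
disc = connectedness(R_disconnected, herd, h2=0.5)
conn = connectedness(R_connected, herd, h2=0.5)
```

In this toy case the disconnected herds give r = 0, and adding a single cross-herd half-sib relationship lowers the average PEVD while raising the average CD and r, which is the direction of change the criteria are meant to capture.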

Relationship matrix

Connectedness is determined in the BLUP framework using the genetic relationship matrix. Information about the covariance structure among individuals is required to compute the three genetic connectedness criteria stated above[8]. In this study, the relationship matrices (R) measuring the relationships among individuals are the same as those used in the previous study by Yu et al [8] and are defined below.

Firstly, R = A^PED, the usual numerator relationship matrix. When genetic evaluation is under an animal model, connectedness occurs due to A^PED[2]. The A^PED is directly calculated from the known pedigree and denotes the probability of inheritance of alleles from a common ancestor indicating that they are identical by descent (IBD). The off-diagonal elements are twice coefficients of kinship and are equivalent to the numerators of Wright’s correlation coefficients[21].
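As an illustration of how a numerator relationship matrix can be built from a pedigree, the sketch below uses the standard tabular method. The function name, the integer coding of animals (parents listed before offspring, -1 for an unknown parent) and the toy pedigree are assumptions made for this example only.

```python
import numpy as np

def numerator_relationship(sires, dams):
    """Tabular method; animals are 0..n-1, parents precede offspring, -1 = unknown."""
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 + inbreeding coefficient = 1 + 0.5 * a(sire, dam).
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
        for j in range(i):
            a = 0.0
            if s >= 0:
                a += 0.5 * A[j, s]
            if d >= 0:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A

# Toy pedigree: 0, 1, 2 are unrelated founders; 3 and 4 share sire 0 (a "common sire").
sires = [-1, -1, -1, 0, 0]
dams = [-1, -1, -1, 1, 2]
A_ped = numerator_relationship(sires, dams)
```

Animals 3 and 4 are paternal half-sibs, so their off-diagonal element is 0.25; this is the kind of pedigree link that a common sire creates between two herds.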

Secondly, R = G^BASE. The basic genomic relationship matrix G^BASE was constructed according to method 1 described by VanRaden[22], i.e.,

$$G^{\mathrm{BASE}} = \frac{MM^{T}}{2\sum_i p_i(1 - p_i)}$$

where the elements in column i of M are 0 − 2p_i, 1 − 2p_i and 2 − 2p_i for genotypes A₁A₁, A₁A₂ and A₂A₂, respectively, and p_i is the allele frequency of A₂ at locus i, calculated from the available marker data.

Thirdly, as negative values were generated in this scenario, R = G^0.5 was also used, which supposes that the MAF in the base population is unknown and uses 0.5 for all p_i. The G^0.5 constructed in this way did not create any negative values for the simulated data.
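A short sketch of VanRaden's method 1 may help make the difference between G^BASE and G^0.5 concrete. The function name `vanraden_g` and the random toy genotypes are hypothetical; genotypes are assumed to be coded as 0, 1 or 2 copies of allele A₂.

```python
import numpy as np

def vanraden_g(genotypes, p=None):
    """VanRaden method 1; genotypes is (n_animals, n_markers) coded 0/1/2."""
    X = np.asarray(genotypes, dtype=float)
    if p is None:
        p = X.mean(axis=0) / 2.0            # allele frequencies from the marker data
    p = np.broadcast_to(np.asarray(p, dtype=float), (X.shape[1],))
    M = X - 2.0 * p                          # entries 0-2p, 1-2p, 2-2p
    return M @ M.T / (2.0 * np.sum(p * (1.0 - p)))

rng = np.random.default_rng(1)
geno = rng.integers(0, 3, size=(6, 50))      # toy data, not simulated pig genotypes
G_base = vanraden_g(geno)                    # p_i estimated from the sample
G_half = vanraden_g(geno, p=0.5)             # p_i fixed at 0.5
```

When the allele frequencies are estimated from the same animals, every column of M sums to zero, so each row of G^BASE sums to zero and some off-diagonal elements must be negative. Fixing p_i at 0.5 removes that constraint, although it does not by itself guarantee non-negative elements for arbitrary data.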

Fourthly, when comparing marker-based with pedigree-based relationship matrices, scaling of the genomic relationship matrix is needed to interpret the genetic connectedness criteria. A reasonable rescaling is to make the genomic elements range between 0 and 2, which are the minimum and maximum values of A^PED, respectively. Therefore, to render G^BASE on the same scale as A^PED, a scaled G^BASE matrix (G^S) was created, and the scaled genomic relationship between the ith and jth individuals was given by

$$Gs_{ij} = \frac{(G_{ij} - G_{\min})(Gs_{\max} - Gs_{\min})}{G_{\max} - G_{\min}} + Gs_{\min}$$

where Gs_ij is a scaled element of G^BASE and G_ij is a typical element of G^BASE. Gs_max = 2 and Gs_min = 0 are the maximum and minimum values that elements of the scaled matrix are allowed to take, respectively, while G_max and G_min are the maximum and minimum elements of G^BASE. In this case, G^S does not contain any negative values.
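The rescaling is a linear min-max transformation, sketched below. The function name `scale_g` and the small example matrix are hypothetical.

```python
import numpy as np

def scale_g(G, new_min=0.0, new_max=2.0):
    """Linearly map the elements of G onto [new_min, new_max]."""
    G = np.asarray(G, dtype=float)
    g_min, g_max = G.min(), G.max()
    return (G - g_min) * (new_max - new_min) / (g_max - g_min) + new_min

G = np.array([[1.1, -0.2, 0.3],
              [-0.2, 0.9, -0.1],
              [0.3, -0.1, 1.0]])
G_s = scale_g(G)
```

Because the mapping is linear and increasing, the ranking of relationships is preserved; only their scale and origin change.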

Finally, in order to simulate a more realistic scenario where not all the individuals in the population were genotyped, the H matrix (i.e., the relationship matrix combining pedigree and genomic information) was given by [23–25]

$$H = \begin{bmatrix} G_\omega & G_\omega A_{11}^{-1}A_{12} \\ A_{12}^{T}A_{11}^{-1}G_\omega & A_{22} + A_{12}^{T}A_{11}^{-1}(G_\omega - A_{11})A_{11}^{-1}A_{12} \end{bmatrix}$$

where A₁₁, A₂₂ and A₁₂ are submatrices of the A matrix representing relationships among genotyped individuals, among non-genotyped individuals, and between genotyped and non-genotyped individuals, respectively, and the superscript T represents the transpose of a matrix. The G_ω matrix describes the relationships among genotyped individuals and is defined as

$$G_\omega = (1 - \omega)\,G + \omega A_{11}$$

where ω represents the fraction of the genetic variance not captured by markers, and G = G^BASE, G^0.5 or G^S as defined above. In this study, we treated individuals from generations 0–1 (N = 2440) as non-genotyped and individuals from generation 2 (N = 1600) as genotyped. This simulated a real scenario, where individuals from more recent generations are likely to be genotyped, with a relatively small sample size compared with individuals from earlier generations.
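The block structure of H can be assembled directly from A and G, as in the following sketch. The function name `h_matrix`, the default ω = 0.05 and the toy pedigree matrix are assumptions for illustration; the value of ω used in the study is not stated in this section.

```python
import numpy as np

def h_matrix(A, G, genotyped, omega=0.05):
    """Combine pedigree (A) and genomic (G) relationships; omega is a hypothetical default."""
    A = np.asarray(A, dtype=float)
    g = np.asarray(genotyped)
    n = np.setdiff1d(np.arange(A.shape[0]), g)     # non-genotyped animals
    A11 = A[np.ix_(g, g)]
    A12 = A[np.ix_(g, n)]
    A22 = A[np.ix_(n, n)]
    G_w = (1.0 - omega) * np.asarray(G, dtype=float) + omega * A11
    B = np.linalg.solve(A11, A12)                  # A11^{-1} A12
    H = np.empty_like(A)
    H[np.ix_(g, g)] = G_w
    H[np.ix_(g, n)] = G_w @ B
    H[np.ix_(n, g)] = B.T @ G_w
    H[np.ix_(n, n)] = A22 + B.T @ (G_w - A11) @ B
    return H

# Toy pedigree relationships: founders 0-2, half-sibs 3 and 4 (the genotyped animals).
A = np.array([[1.0, 0.0, 0.0, 0.5, 0.5],
              [0.0, 1.0, 0.0, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0, 0.5],
              [0.5, 0.5, 0.0, 1.0, 0.25],
              [0.5, 0.0, 0.5, 0.25, 1.0]])
genotyped = [3, 4]
G_obs = np.array([[1.05, 0.30],
                  [0.30, 0.98]])                   # made-up genomic relationships
H = h_matrix(A, G_obs, genotyped)
H_same = h_matrix(A, A[np.ix_(genotyped, genotyped)], genotyped)
```

A useful sanity check on the construction is that H reduces to A when the genomic relationships equal the pedigree relationships among genotyped animals, which is what `H_same` demonstrates.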

Population structures of the simulated populations

Principal component analysis (PCA) was used to investigate the population structure of Herd1 and Herd3. PCA was performed using PLINK software[26] and the PC plots were drawn with the ggplot2 package[27].
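The authors ran PCA in PLINK; the sketch below is not PLINK's procedure but a plain NumPy illustration of the same idea, using a singular value decomposition of the column-centred genotype matrix. The function name and the toy genotypes are hypothetical.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """PCA scores from an SVD of the column-centred 0/1/2 genotype matrix."""
    X = np.asarray(genotypes, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)             # share of variance per component
    return scores, explained[:n_components]

# Toy example: two herds fixed for opposite alleles at four markers.
geno = np.array([[0, 0, 2, 2]] * 3 + [[2, 2, 0, 0]] * 3)
scores, explained = genotype_pca(geno)
```

In this extreme toy case PC1 separates the two herds completely; with shared ancestry or common sires the two clouds of points would overlap instead.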

Prediction of genomic breeding values

In order to investigate the impact of the various levels of genetic connectedness inferred from genomic information on the accuracy of genomic prediction, genomic breeding values were predicted using GBLUP with the different genomic matrices (G^BASE, G^0.5 and G^S) defined above. In addition, we examined the predictive ability of two other relationship matrices (i.e., A^PED and H) to better understand the possible effects of genomic connectedness on genomic prediction. The model was the same as the GBLUP model shown below, but the genomic relationship matrix was replaced by A^PED or H when predicting the (G)EBV.

The basic GBLUP model [22, 28] was defined as:

$$y = 1\mu + Zg + \varepsilon$$

where y is the vector of simulated phenotypes, μ is the population mean, g is the vector of breeding values, ε is the vector of residuals and Z is an appropriate design matrix. It was assumed that $g \sim N(0, G\sigma_g^2)$ and $\varepsilon \sim N(0, I\sigma_e^2)$, where G is the genomic relationship matrix, $\sigma_g^2$ is the additive genetic variance, I is the identity matrix and $\sigma_e^2$ is the residual variance.

Reference and validation data

The Herd1 data were divided into reference and validation data by generation. The reference population comprised 1220 individuals (420 founders and 800 progeny from the first generation). The validation population comprised the 800 individuals from the second generation. To avoid inflating the accuracy of genomic prediction, only the 1220 individuals from the founders and the first generation of Herd3 were added to form the joint reference population. The accuracy of genomic prediction was estimated as the correlation between the predicted genomic estimated breeding values (GEBV) and the true breeding values of the animals in the validation set.
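The reference/validation scheme can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes a known heritability, an overall mean as the only fixed effect, and it uses the form ĝ = GZᵀ(ZGZᵀ + λI)⁻¹(y − 1μ̂), which gives the same solutions as the mixed model equations without inverting G. The function names and the toy data are hypothetical.

```python
import numpy as np

def gblup(y_ref, G, ref_idx, h2):
    """GEBV for all animals in G from phenotypes of the reference animals only."""
    y_ref = np.asarray(y_ref, dtype=float)
    G = np.asarray(G, dtype=float)
    ref_idx = np.asarray(ref_idx)
    lam = (1.0 - h2) / h2                               # sigma_e^2 / sigma_g^2
    V = G[np.ix_(ref_idx, ref_idx)] + lam * np.eye(len(ref_idx))  # Z G Z' + lambda I
    ones = np.ones(len(ref_idx))
    mu = (ones @ np.linalg.solve(V, y_ref)) / (ones @ np.linalg.solve(V, ones))
    g_hat = G[:, ref_idx] @ np.linalg.solve(V, y_ref - mu)
    return mu, g_hat

def accuracy(gebv, tbv):
    """Correlation between predicted and true breeding values."""
    return np.corrcoef(gebv, tbv)[0, 1]

# Toy example: animals 0-2 form the reference set, animal 3 is a validation animal
# related (0.5) to reference animal 2.
G_toy = np.eye(4)
G_toy[2, 3] = G_toy[3, 2] = 0.5
mu_hat, gebv = gblup([1.0, 2.0, 3.0], G_toy, ref_idx=[0, 1, 2], h2=0.5)
```

The validation animal has no phenotype in the analysis, so its GEBV comes entirely from its relationship with reference animals. A joint reference population would simply extend `ref_idx` and `y_ref` with the animals of the second herd.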

Results

Genetic connectedness criteria

Genetic connectedness criteria between Herd1 and Herd3 for varying numbers of common sires at heritabilities of 0.08, 0.28 and 0.63 are presented in Fig 2, Fig 3 and Fig 4, respectively. PEVD, CD and r_ij gave similar results.

Fig 2. The estimates of PEVD, CD and r_ij at heritability = 0.08.

Left column: A^PED. Right column: G^BASE. For r_ij, the G^BASE was replaced by G^S.

https://doi.org/10.1371/journal.pone.0201400.g002

Fig 3. The estimates of PEVD, CD and r_ij at heritability = 0.28.

Left column: A^PED. Right column: G^BASE. For r_ij, the G^BASE was replaced by G^S.

https://doi.org/10.1371/journal.pone.0201400.g003

Fig 4. The estimates of PEVD, CD and r_ij at heritability = 0.63.

Left column: A^PED. Right column: G^BASE. For r_ij, the G^BASE was replaced by G^S.

https://doi.org/10.1371/journal.pone.0201400.g004

Firstly, when A^PED was used, increasing the number of common sires (from 0 to 19) increased the estimates of CD and r_ij and decreased PEVD at each heritability level, indicating an increasing level of connectedness across herds. However, at a heritability of 0.63 the CD for A^PED was already very high and changed little across the numbers of common sires (0.709–0.71), so any further increase in CD might be difficult. A similar trend was observed for G^BASE: as the number of common sires increased, CD and r_ij increased and PEVD decreased, indicating stronger genetic links between herds. Note that the r_ij criterion based on G^BASE behaved erratically, with negative values, making it difficult to interpret. Thus G^S instead of G^BASE was used in the comparison with A^PED. As shown in Fig 2, Fig 3, Fig 4 and Supporting Information (S1 Table), for G^S the three criteria occasionally fluctuated with increasing number of common sires, particularly at lower heritability levels. However, the general trend was that the level of connectedness increased with the number of common sires.

Secondly, as heritability increased, the levels of connectedness all increased regardless of genetic connectedness criteria, except for r_ij in A^PED in which the estimates for different heritability levels appeared similar (ranged from 0.001 to 0.005).

Finally, the connectedness estimates obtained with G^BASE, G^0.5 and G^S were all higher than those obtained with A^PED at every heritability level (as seen in S1 Table). As expected, the r_ij estimates based on A^PED were all 0 when the number of common sires was 0, regardless of heritability level. This is because the PEC between completely disconnected datasets is always 0.

We also simulated a more realistic scenario in which only the individuals of the most recent generation were genotyped. In this case, the genomic matrices (i.e., G^BASE, G^0.5 and G^S) were combined with A^PED to create H matrices. As shown in Supporting Information (S3 Table), the estimates obtained from the H matrix lie between those obtained using A^PED and those obtained using G^BASE, G^0.5 and G^S. This is reasonable because the H matrix was constructed as a combination of A^PED and the genomic matrices. Very small differences in the estimates were observed when A^PED was combined with G^BASE, G^0.5 or G^S, and thus only the results for G^BASE are shown (S3 Table).

PCA of the simulated populations

For the PCA, the first two principal components did not clearly separate the individuals from Herd1 and Herd3 into their respective groups when the number of common sires was 0, regardless of heritability level (Fig 5A, Fig 5D and Fig 5G). As the number of common sires increased, all individuals tended to cluster together as expected, especially when the number of common sires was 19 (Fig 5C, Fig 5F and Fig 5I).

Fig 5. Principal component analysis plots for the simulated populations.

PC1: Principal component 1. PC2: Principal component 2. Red: Herd1. Blue:Herd3.

https://doi.org/10.1371/journal.pone.0201400.g005

Genomic prediction

The accuracy of genomic prediction using the Herd1 reference population or the joint reference population (Herd1 + Herd3) for the specified scenarios is presented in Table 2. Compared to genomic prediction using the Herd1 reference population alone, the accuracy of genomic prediction using the joint reference population increased by 25%, averaged over numbers of common sires, heritability levels and genomic relationship matrices (detailed results are provided in Supporting Information, S2 Table). Lower heritability benefited more. Moreover, it is worth noting that the largest benefits were observed when the number of common sires was 0, and the gain in accuracy became smaller as the number of common sires increased. Additionally, the accuracy of genomic prediction using G^BASE was consistent with that using G^0.5 and G^S at each heritability level, regardless of the scenario. Furthermore, for A^PED and G^BASE, as the number of common sires increased, the accuracy of prediction generally decreased while CD and r_ij increased and PEVD decreased, regardless of heritability level (Fig 6; detailed results are provided in Supporting Information, S2 Table). The highest accuracy was observed when the number of common sires was 0, which corresponds to the lowest CD and r_ij values and the highest PEVD estimates.

Fig 6. The relationship between genetic connectedness criteria and accuracy of prediction.

For r_ij, the G^BASE was replaced by G^S and the estimates of A^PED did not clearly distinguish the r_ij values at different heritability levels due to relatively small values (ranged from 0 to 0.005).

https://doi.org/10.1371/journal.pone.0201400.g006

Table 2. Accuracies of (G)EBV in the validation population when using the Herd1 or the joint reference population.

https://doi.org/10.1371/journal.pone.0201400.t002

In order to gain a better understanding of the possible effects of genomic connectedness on genomic prediction, the accuracies of predictions based on A^PED and the H matrix were investigated as a comparison. Similar to the genomic matrices, A^PED and the H matrix both gained accuracy from using the combined reference population (on average 9% and 14%, respectively), with the largest gain when the number of common sires was 0 and a smaller gain as the number of common sires increased. In addition, relationship matrices with marker information (G^BASE, G^0.5, G^S and the H matrix) provided higher prediction accuracies than A^PED regardless of heritability level and scenario (i.e., number of common sires) (for details see S2 Table and S4 Table).

Discussion

The EBVs of individuals across management units (i.e., contemporary groups or herds) are comparable because the BLUP method is used in genetic evaluation. However, the accuracy of these comparisons depends on the extent of connectedness among these units. The lack of integrity and accuracy of the pedigrees in Chinese pig farms may lead to several practical problems. Pedigree-based methods (result unpublished) revealed no genetic links among pig nucleus farms such as BJLM, AHCF and FQYC in China. In reality, however, genetic connectedness may well exist among them because of common sires and because they all purchased seedstock from the same company. In such cases, advances in molecular biotechnology can provide novel insights by ascertaining genetic connectedness at the genomic level. Our results confirmed that genomic relatedness increased the estimates of genetic connectedness across herds compared with its counterpart (i.e., pedigree relationship). Moreover, when reference datasets were combined, the accuracy of genomic prediction, averaged over common sire scenarios, heritability levels and genomic relationship matrices, increased by 25% compared to genomic prediction using the Herd1 reference alone. The largest benefits were observed when the number of common sires was 0, and the gain in accuracy of genomic prediction became smaller as the number of common sires increased.

The effect of genomic information on genetic connectedness

Pedigree-based genetic connectedness across management units has received considerable attention in the field of animal breeding. However, connectedness ascertained from genomic information remains poorly understood. The results of our study confirmed that genomic information enhances the estimates of genetic connectedness across herds under the PEVD, CD and r criteria regardless of heritability level, which is consistent with the previous study of Yu et al. [8], who showed in 2017 that genomic relatedness strengthened genetic connectedness among management units using the same criteria. The reason for the improved genetic connectedness might be that genomic relatedness captures Mendelian sampling, which pedigree relationships do not[29].

Genetic connectedness criteria

In order to provide a better understanding of the measurement of genetic connectedness, three known criteria (i.e., PEVD, CD and r) were used in this study. Overall, genetic connectedness measured by the PEVD and r criteria increased as the number of common sires increased, which is in accordance with a previous study[8]. However, the continued growth in CD with the increasing number of common sires differed from the results reported by Yu et al [8]. Laloë noted that CD depends on both PEV and genetic variability[30]. A possible reason for the difference is that, in the present study, the genetic variability remained constant over the two simulated generations. In this case, a decrease in PEV corresponds to an increase in CD, which was confirmed in the present study. In contrast, we speculate that the genetic variability tended to change in the previous study because relatively intensive selection might have occurred.

In addition to PEVD, CD and r, Mathur et al[31] proposed another connectedness statistic, the connectedness rating (CR, ranging from 0 to 1), which measures connectedness as the correlation between the estimates of the herd effects. We recalculated CR using the three relationship matrices defined above; the CR statistic behaved erratically in all scenarios (e.g., covariances of herd effects were negative, leading to negative values of CR) (result unpublished), making it difficult to interpret. The reason could be the negative values that exist in the G matrix. How to apply this method to genomic connectedness needs to be investigated in the future.

Genomic prediction

The construction of a large reference population for genomic prediction is difficult for numerically small breeds and for traits that are difficult to measure. Particularly in China, the reference population size for pigs is normally smaller than for other livestock species, and this strongly limits the accuracy of genomic prediction for pigs. So far, the most straightforward ways to increase the reliability are to combine reference data from different populations of the same breed or of different breeds, or to use robust methods (e.g., the single step method).

In this study, Herd1 and Herd3 both descended from the same historical population. They are therefore analogous to two subpopulations (e.g., two lines in the pig industry) of one whole population, and combining their reference data corresponds to combining data from the same population (e.g., the same breed). By combining reference data, the accuracy of genomic prediction increased by 25% compared to genomic prediction using Herd1 reference data alone (S2 Table). This figure is the average over the common sire scenarios, heritability levels and three genomic relationship matrices. The increase in accuracy of genomic prediction in our study is in accordance with earlier reports, for instance, in Yorkshire populations in China[32], Holstein Friesian in North America[33, 34], in EuroGenomics [35] and in the Chinese Holstein Friesian population[36].

The accuracy of predictions based on the A^PED matrix was lower than that of predictions based on relationship matrices with marker information (i.e., G^BASE, G^0.5, G^S and the H matrix), which is in agreement with previous studies [37]. Moreover, the prediction accuracy of the H matrix was generally lower than that of the genomic matrices (i.e., G^BASE, G^0.5 and G^S) across all scenarios and heritability levels. This is largely because only a subset of individuals (N = 1600) was assumed to be genotyped for the H matrix, whereas all individuals (N = 4040) from three generations were assumed to be genotyped when GEBV were estimated with the three genomic matrices. The accuracies obtained with the H matrix indicate that the single-step method [23, 24] performed better than traditional BLUP based on A^PED even when the genotyped sample was relatively small. This is especially important for current pig populations in China, where genomic selection is still at an early stage and few individuals are genotyped. Several earlier studies have shown that the improvement in genomic prediction from a combined reference population is mainly due to the increased relatedness between the reference and validation populations [35]. Interestingly, as shown in S2 Table, combining two completely disconnected herds (i.e., number of common sires = 0) achieved the highest accuracy. A possible reason is that individuals from Herd1 and Herd3 are related through genomic information if their ancestry is traced back far enough [38]. This is supported by the PCA plots, in which individuals from Herd1 and Herd3 were not clearly separated by the first two principal components when the number of common sires was 0 (Fig 5A, Fig 5D and Fig 5G). Therefore, the simulated data in our study resemble two separate lines on one farm more than two different herds. We found that increasing the number of common sires decreased the gain in accuracy from the joint reference population. We speculate that this is largely due to the genetic links within the reference population, which increase with the number of common sires. A previous simulation study [39] showed that average reliabilities increased when the average relationship within the reference population decreased. Moreover, Herd1 and Herd3 both came from the same historical population (Ne = 200), and Ne is expected to have remained roughly constant because selection was limited (generation = 2). In such cases, increased genetic connectedness within the population may give less reliable predictions.
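The PCA check described above, whether two herds separate on the first two principal components, can be illustrated with a small sketch. It is not the workflow used in the study: the genomic relationship matrix follows VanRaden's first method [22] with observed allele frequencies (an assumption; it is not necessarily the G^BASE matrix), and the genotypes are hypothetical toy data in which both herds are sampled from the same allele frequencies.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix from an n x m matrix of 0/1/2 allele counts,
    using observed allele frequencies (assumption of this sketch)."""
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0                      # observed allele frequency per SNP
    Z = M - 2.0 * p                               # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def principal_components(G, k=2):
    """Top-k principal component scores from an eigendecomposition of G."""
    vals, vecs = np.linalg.eigh(G)                # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]              # indices of the k largest
    scores = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))
    return scores, vals[idx]

# Hypothetical toy data: two herds of 20 animals, 500 SNPs, same allele frequencies
rng = np.random.default_rng(2018)
freq = rng.uniform(0.1, 0.9, size=500)
herd1 = rng.binomial(2, freq, size=(20, 500))
herd3 = rng.binomial(2, freq, size=(20, 500))

G = vanraden_g(np.vstack([herd1, herd3]))
scores, eigvals = principal_components(G)

# Distance between herd means on PC1; a small gap relative to the spread of
# the scores suggests the herds are not separated along that axis.
gap = abs(scores[:20, 0].mean() - scores[20:, 0].mean())
print("PC1 gap between herd means:", gap, "PC1 spread:", scores[:, 0].std())
```

Plotting `scores[:, 0]` against `scores[:, 1]`, coloured by herd, gives the kind of display referred to in Fig 5.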

An extreme case of strong connectedness was simulated to investigate its effect on the accuracy of genomic prediction. In this case, the number of common sires across herds was 19 (out of 20 founder sires), so the individuals in generation 1 of Herd1 and Herd3 were nearly all half-sibs. Additionally, the CD values (which range from 0 to 1) at a heritability of 0.68 were 0.790 with A^PED and 0.800 with G^BASE, confirming the strong genetic links across herds. Encouragingly, the accuracies in this extreme case, with strong genetic connectedness within the reference data, were still higher than those obtained using the Herd1 reference data only. Consequently, the results indicate that the benefits of using combined reference data may decrease to some extent as the level of genetic connectedness within the reference data increases. However, this may not counteract the overall benefits of combining datasets.

Future direction

In this study, we focused on two simulated subpopulations (e.g., two lines in the pig industry) with a limited number of generations from the same historical population. Future research should include multiple populations, such as different selection lines or breeds. In addition, we investigated the relationship between genetic connectedness criteria (i.e., PEVD, CD and r) and the accuracy of prediction. However, which statistic (PEVD, CD or r) is best for measuring genetic connectedness and enhancing predictive ability remains poorly understood. Also, the minimum level of genetic connectedness needed to ensure accurate across-herd genomic evaluation should be established. Finally, the true genetic connectedness between populations is still unknown, which may prevent us from identifying which connectedness measure is best.

Conclusions

This study confirmed that genomic relatedness could improve the estimates of genetic connectedness across herds compared with pedigree relationships. We contend that our work contributes to a better understanding of genetic connectedness, which may have a positive impact on the genomic evaluation of pigs in China. Moreover, the results demonstrated the importance of the size of the reference population for genomic prediction. However, care should be taken in the design of the reference population, as combining closely related populations may give less reliable predictions.

Supporting information

S1 Table. Average genetic connectedness statistics between Herd1 and Herd3 in the simulation data.

https://doi.org/10.1371/journal.pone.0201400.s001

(DOCX)

S2 Table. Accuracies of (G)EBV in the validation population when using the Herd1 or the joint reference population.

https://doi.org/10.1371/journal.pone.0201400.s002

(DOCX)

S3 Table. Average genetic connectedness statistics between Herd1 and Herd3 in the simulation data using H matrix.

https://doi.org/10.1371/journal.pone.0201400.s003

(DOCX)

S4 Table. Accuracies of (G)EBV in the validation population based on H matrix when using the Herd1 or the joint reference population.

https://doi.org/10.1371/journal.pone.0201400.s004

(DOCX)

Acknowledgments

We thank Jie-Li Fu for assistance in preparing the English manuscript.

References

1. Kuehn LA, Lewis RM, Notter DR. Managing the risk of comparing estimated breeding values across flocks or herds through connectedness: a review and application. Genet Sel Evol. 2007;39(3):225. pmid:17433239
2. Kennedy B, Trus D. Considerations on genetic connectedness between management units under an animal model. J Anim Sci. 1993;71(9):2341–52. pmid:8407646
3. Sun C, Wang C, Wang Y, Zhang Y, Zhang Q. Evaluation of connectedness between herds for three pig breeds in China. animal. 2009;3(4):482–5. pmid:22444370
4. Lewis R, Crump R, Simm G, Thompson R. Assessing connectedness in across-flock genetic evaluations. Proc Brit Soc Anim Sci. 1999;121.
5. Zhang J, Zhang S, Qiu X, Gao H, Wang C, Wang Y. The Genetic Connectedness of Duroc, Landrace and Yorkshire Pigs in China. Acta Veterinaria et Zootechnica Sinica. 2017;48(9):1591–601.
6. Yachun W, Yuan Z. The connectedness on large white and landrace in regional joint breeding system in Beijing. Journal of Animal and Veterinary Advances. 2010;9(18):2338–42.
7. Akanno E. Genome-Wide Selection Program for Improvement of Indigenous Pigs in Tropical Developing Countries. Guelph University Press, Guelph; 2012.
8. Yu H, Spangler ML, Lewis RM, Morota G. Genomic Relatedness Strengthens Genetic Connectedness Across Management Units. G3: Genes, Genomes, Genetics. 2017;7(10):3543–56.
9. Meuwissen THE, Hayes BJ, Goddard ME. Prediction of total genetic value using genome-wide dense marker maps. Genetics. 2001;157(4):1819–29. pmid:11290733
10. Rincent R, Laloë D, Nicolas S, Altmann T, Brunel D, Revilla P, et al. Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics. 2012;192(2):715–28. pmid:22865733
11. Isidro J, Jannink J-L, Akdemir D, Poland J, Heslot N, Sorrells ME. Training set optimization under population structure in genomic selection. Theoretical and Applied Genetics. 2015;128(1):145–58. pmid:25367380
12. Akanno E, Schenkel F, Sargolzaei M, Friendship R, Robinson J. Persistency of accuracy of genomic breeding values for different simulated pig breeding programs in developing countries. J Anim Breed Genet. 2014;131(5):367–78. pmid:24628765
13. Sargolzaei M, Schenkel FS. QMSim: a large-scale genome simulator for livestock. Bioinformatics. 2009;25(5):680–1. pmid:19176551
14. Henderson C. 1984-Guelph. 1984.
15. Akanno E, Schenkel F, Quinton V, Friendship R, Robinson J. Meta-analysis of genetic parameter estimates for reproduction, growth and carcass traits of pigs in the tropics. Livestock Science. 2013;152(2):101–13.
16. Putz A, Tiezzi F, Maltecca C, Gray K, Knauer M. A comparison of accuracy validation methods for genomic and pedigree-based predictions of swine litter size traits using Large White and simulated data. J Anim Breed Genet. 2018;135(1):5–13. pmid:29178316
17. Vingborg RK, Gregersen VR, Zhan B, Panitz F, Høj A, Sørensen KK, et al. A robust linkage map of the porcine autosomes based on gene-associated SNPs. BMC Genomics. 2009;10(1):134.
18. Ramos AM, Crooijmans RP, Affara NA, Amaral AJ, Archibald AL, Beever JE, et al. Design of a high density SNP genotyping assay in the pig using SNPs identified and characterized by next generation sequencing technology. PLoS One. 2009;4(8):e6524. pmid:19654876
19. Kuehn L, Notter D, Nieuwhof G, Lewis R. Changes in connectedness over time in alternative sheep sire referencing schemes. J Anim Sci. 2008;86(3):536–44. pmid:18073292
20. Laloë D. Precision and information in linear models of genetic evaluation. Genet Sel Evol. 1993;25(6):557.
21. Wright S. Coefficients of inbreeding and relationship. The American Naturalist. 1922;56(645):330–8.
22. VanRaden PM. Efficient methods to compute genomic predictions. J Dairy Sci. 2008;91(11):4414–23. pmid:18946147
23. Legarra A, Aguilar I, Misztal I. A relationship matrix including full pedigree and genomic information. J Dairy Sci. 2009;92(9):4656–63. pmid:19700729
24. Christensen OF, Lund MS. Genomic prediction when some animals are not genotyped. Genet Sel Evol. 2010;42(1):2.
25. Christensen OF, Madsen P, Nielsen B, Ostersen T, Su G. Single-step methods for genomic evaluation in pigs. animal. 2012;6(10):1565–71. pmid:22717310
26. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics. 2007;81(3):559–75. pmid:17701901
27. Wickham H. ggplot2: elegant graphics for data analysis. J Stat Softw. 2010;35(1):65–88.
28. Hayes BJ, Goddard ME. Technical note: Prediction of breeding values using marker-derived relationship matrices. J Anim Sci. 2008;86(9):2089–92. pmid:18407982
29. Hill WG, Weir B. Variation in actual relationship as a consequence of Mendelian sampling and linkage. Genet Res. 2011;93(1):47–64.
30. Laloë D, Phocas F, Menissier F. Considerations on measures of precision and connectedness in mixed linear models of genetic evaluation. Genet Sel Evol. 1996;28(4):359.
31. Mathur P, Sullivan B, Chesnais J. Estimation of the degree of connectedness between herds or management groups in the Canadian swine population. Canadian Centre for Swine Improvement, Ottawa, Canada (Mimeo); 1999.
32. Song H, Zhang J, Jiang Y, Gao H, Tang S, Mi S, et al. Genomic prediction for growth and reproduction traits in pig using an admixed reference population. J Anim Sci. 2017;95(8):3415–24. pmid:28805914
33. VanRaden PM, Olson K, Null D, Sargolzaei M, Winters M, Van Kaam JB. Reliability increases from combining 50,000-and 777,000-marker genotypes from four countries. Interbull Bulletin. 2012;(46).
34. Schenkel F, Sargolzaei M, Kistemaker G, Jansen G, Sullivan P, Van Doormaal B, et al. Reliability of genomic evaluation of Holstein cattle in Canada. Interbull Bulletin. 2009;(39):51.
35. Lund MS, de Roos APW, de Vries AG, Druet T, Ducrocq V, Fritz S, et al. A common reference population from four European Holstein populations increases reliability of genomic predictions. Genet Sel Evol. 2011;43:43. pmid:22152008
36. Zhou L, Ding XD, Zhang Q, Wang YC, Lund MS, Su GS. Consistency of linkage disequilibrium between Chinese and Nordic Holsteins and genomic prediction for Chinese Holsteins using a joint reference population. Genet Sel Evol. 2013;45:7. pmid:23516992
37. Guo X, Christensen OF, Ostersen T, Wang Y, Lund MS, Su G. Improving genetic evaluation of litter size and piglet mortality for both genotyped and nongenotyped individuals using a single-step method. J Anim Sci. 2015;93(2):503–12. pmid:25549983
38. Powell JE, Visscher PM, Goddard ME. Reconciling the analysis of IBD and IBS in complex trait studies. Nat Rev Genet. 2010;11(11):800. pmid:20877324
39. Pszczola M, Strabel T, Mulder H, Calus M. Reliability of direct genomic values for animals with different relationships within and to the reference population. J Dairy Sci. 2012;95(1):389–400. pmid:22192218
