The Relationship between Runs of Homozygosity and Inbreeding in Jersey Cattle under Selection

  • Eui-Soo Kim,

    Affiliations Animal Genomics & Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland, United States of America, Department of Animal Science, Iowa State University, Ames, Iowa, United States of America

  • Tad S. Sonstegard,

    Affiliation Animal Genomics & Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland, United States of America

  • Curtis P. Van Tassell,

    Affiliation Animal Genomics & Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland, United States of America

  • George Wiggans,

    Affiliation Animal Genomics & Improvement Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland, United States of America

  • Max F. Rothschild

    Affiliation Department of Animal Science, Iowa State University, Ames, Iowa, United States of America


Abstract

Inbreeding is often an inevitable outcome of strong directional artificial selection but on average it reduces population fitness with increased frequency of recessive deleterious alleles. Runs of homozygosity (ROH) representing genomic autozygosity that occur from mating between selected and genomically related individuals may be able to reveal the regions affecting fitness. To examine the influence of genomic autozygosity on fitness, we used a genome-wide association test to evaluate potential negative correlations between ROH and daughter pregnancy rate (DPR) or somatic cell score (SCS) in US Jersey cattle. In addition, relationships between changes of local ROH and inbreeding coefficients (F) were assessed to locate genomic regions with increased inbreeding. Despite finding some decreases in fertility associated with incremental increases in F, most emerging local ROH were not significantly associated with DPR or SCS. Furthermore, the analyses of ROH could be approximated with the most frequent haplotype(s), including the associations of ROH and F or traits. The analysis of the most frequent haplotype revealed that associations of ROH and fertility could be accounted for by the additive genetic effect on the trait. Thus, we suggest that a change of autozygosity is more likely to demonstrate footprints of selected haplotypes for production rather than highlight the possible increased local autozygosity of a recessive detrimental allele resulting from the mating between closely related animals in Jersey cattle.

Introduction

Inbreeding increases autozygosity of loci throughout the genome, some of which results in homozygosity of recessive alleles that may cause expression of an unfavorable phenotype. Although the exact number of deleterious mutations has not been estimated, it is important to note that most animals are assumed to be carriers of at least one recessively inherited disorder [1]. Previous studies have reported that individual humans carry 40–110 potentially deleterious variants and 0.4 recessive lethal mutations [2]. The frequency of a recessive lethal polymorphism in protein coding genes in some experimental species such as Drosophila melanogaster is expected to be substantial (25%) [3]. While autozygosity may reduce recessive deleterious allele frequency by chance, selection is the principal force that inhibits detrimental alleles from increasing in frequency [4].

Although inbreeding on average increases the expression of recessive deleterious alleles corresponding to genetic disease [5], creative use of mating systems between related individuals can increase selection response [6]. To estimate the extent of inbreeding depression, marker heterozygosity has been used to represent all fitness-influencing loci across the genome, and the power to detect heterozygosity-fitness associations depended on the number of markers when only a few markers were genotyped [7]. Positive correlations between a trait of interest like reproductive fitness and level of marker heterozygosity are recognized as suggestive evidence of inbreeding depression [8]. Heterozygosity across a few microsatellite loci has been used to infer an inbreeding coefficient (F), but in an outbred natural population correlations between F and sparse molecular markers were relatively weak [9]. The recent development of high-density SNP genotyping tools allows more precise examination of inbreeding even without pedigree records. Indeed, the genomic inbreeding coefficient estimated using continuous homozygous SNPs, known as runs of homozygosity (ROH), was found to correlate well with the pedigree inbreeding coefficient estimated from a relatively small human cohort [10]. Varying lengths of ROH provide generational information about inbreeding levels in a reference population [11]. Moreover, the presence of unevenly distributed ROH across the genomes of individuals in European human populations suggested regional selection for adaptive variants [12]. Europeans generally have ROH shorter than 1.5 Mb, implying ancient linkage disequilibrium (LD) patterns or the inheritance of common haplotypes from both parents, whereas larger ROH reflect recent parental relatedness or a genomic region conferring a selective advantage [10].
In cattle, ROH analysis of two Holstein cow populations under 40 years of intense or no selection for milk production revealed most ROH ranged between 1.5 and 5.0 Mb across both populations, while ROH lengths of 10 Mb or higher could be found in the population under selection [13]. Because genomic autozygosity based on ROH allows for potential differentiation of inbreeding from selection for a quantitative trait [14], genomic regions encompassing extensive ROH potentially contain gene variants associated with genetic improvement in livestock [15].

The combination of intense selection for improved milk production and artificial insemination from a limited number of related elite sires has reduced effective population size in most dairy breeds, and possibly also reduced average animal fitness. For example, a negative genetic correlation between reproduction and milk yield is currently one of the obstacles to accelerated genetic improvement of fertility [16]. While most studies emphasize the overall effect of inbreeding levels on reproduction [17], few studies report the correlation between local genomic autozygosity and phenotypes commonly associated with inbreeding depression. To identify the influence of recent inbreeding on two fitness traits, an association test of phenotypes and ROH-based autozygosity, representing a proxy for potential depression due to inbreeding, was performed using BovineSNP50 genotypes from 1,602 U.S. Jerseys. In addition, association between ROH and F was examined to identify whether specific genomic regions contribute more frequently to overall autozygosity. These ROH were then compared to examine whether there was a genetic effect on the two fitness traits. Furthermore, the whole genome was scanned using haplotype windows, allowing comparison between the most frequent haplotype alleles and ROH. These latter analyses provided insights into inbreeding depression relative to locus-specific (local hypothesis) and general genome-wide effects (general hypothesis), in an attempt to elucidate possible mechanisms underlying associations between genetic diversity and fitness [18].

Materials and Methods

Animals and genotypes

Pedigree information obtained from the USDA-Animal Improvement Programs Laboratory (Beltsville, MD) on 19,966 registered Jersey animals born in the United States between 1953 and 2008 was used to compute inbreeding coefficients (F) [19] using Pedigree Viewer [20]. Additionally, recorded phenotypes including production ability, fertility and disease related traits were collected. Inbreeding coefficients for animals with no parents or ancestors in the pedigree records were set to zero. Modern dairy cattle populations comprise both inbred (F ~ 0.1) and outbred animals because of the intensive use of a small number of influential males selected for artificial insemination (AI) and mated to cows that probably originated from common ancestors born more than three generations ago. This creates complex pedigree structures consisting of multiple inbreeding loops. The genotyped animals consisted of 1,219 male and 383 female Jersey animals that were born between the 1960s and the 2000s. Most genotyped animals (N = 1,280) were born from the 1990s onward, but frozen semen samples maintained for AI enabled us to genotype influential ancestors born before the 1990s (N = 322), which may have affected the genetic polymorphisms of contemporary Jersey cattle. To reduce population stratification, animals were only sampled from small half-sib families (n = 3–30) without considering relationships in the extended pedigree. The Illumina BovineSNP50 beadchip (Illumina, CA) was then used to genotype the 1,602 animals sampled from the U.S. Jersey cattle described above. The PLINK software [21] was used to screen SNPs based on minor allele frequency (MAF > 0.01), Hardy-Weinberg Equilibrium (HWE) test (-log10p < 3), and genotyping rate (> 0.8), and to screen individuals based on missing genotypes (< 20%); data that did not meet these criteria were removed. Finally, 36,869 autosomal SNPs were retained and used in all subsequent analyses. SNP genome coordinates were obtained from the bovine genome reference assembly UMD 3.1.
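The SNP-level filters listed above can be illustrated with a minimal Python sketch. It is not the PLINK implementation used in the study: the function name `snp_passes_qc`, the 0/1/2 genotype coding, and the use of `None` for missing calls are all hypothetical, and the HWE check here is a one-degree-of-freedom chi-square approximation rather than the test PLINK performs.

```python
import math
from typing import Optional, Sequence


def snp_passes_qc(
    genotypes: Sequence[Optional[int]],
    min_maf: float = 0.01,
    min_call_rate: float = 0.8,
    max_neglog10_hwe_p: float = 3.0,
) -> bool:
    """Return True if one SNP passes the three SNP-level filters.

    genotypes: one entry per animal, coded 0/1/2 (copies of one allele),
    with None for a missing call (hypothetical coding for this sketch).
    """
    called = [g for g in genotypes if g is not None]
    # Genotyping rate must exceed the threshold.
    if not genotypes or len(called) / len(genotypes) <= min_call_rate:
        return False

    n = len(called)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_bb + n_ab) / (2 * n)  # frequency of the counted allele
    if min(p, 1 - p) <= min_maf:
        return False

    # Chi-square goodness-of-fit to Hardy-Weinberg proportions (1 df).
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    if p_value == 0.0:
        return False
    return -math.log10(p_value) < max_neglog10_hwe_p
```

Applying such a function to every SNP column and keeping those that return True mirrors the retention criteria in the text; the per-animal missingness filter (< 20%) would be a separate pass over rows.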

Definition of autozygous genomic region

Two approaches were used to define a homozygous genomic region. First, runs of homozygosity (ROH) were determined using a Perl script that followed a modified version of methods proposed for the analysis of human data [10]. Because the cattle genome generally contains regions with a low density of SNPs, a genomic region was defined as a ROH when it contained 50 or more consecutive homozygous SNPs, equivalent to approximately 2 Mb. This criterion allowed detection of homozygous regions of 2–5 Mb, which are likely to originate from common ancestors up to 10–25 generations ago [22]. To define locus autozygosity, we calculated the sum of ROH status (0 or 1) at each SNP across all genotyped animals. Furthermore, the summation of ROH across all marker genotypes for an individual was considered as overall genomic autozygosity (FROH), and this measure of genomic inbreeding was then compared to the inbreeding coefficient (F) estimated from the pedigree (FPED). Second, haplotypes were obtained using fastPHASE [23]. The analyses of ROH were performed using Perl and R scripts.
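The ROH definition above (a run of at least 50 consecutive homozygous SNPs) can be sketched in a few lines of Python. This is an illustration only, not the authors' Perl script: the function names are hypothetical, genotypes are assumed to be coded 0/1/2 with 1 meaning heterozygous, missing calls are assumed to have been removed, and FROH is expressed here as the fraction of SNPs inside ROH, which is a normalization chosen for this sketch.

```python
from typing import List, Sequence


def roh_status(genotypes: Sequence[int], min_snps: int = 50) -> List[int]:
    """Flag each SNP with 1 if it lies in a run of >= min_snps homozygous calls.

    genotypes: calls for one animal along one chromosome, coded 0/1/2,
    where 1 is heterozygous (hypothetical coding for this sketch).
    """
    status = [0] * len(genotypes)
    run_start = None
    # A trailing heterozygous sentinel closes a run that reaches the end.
    for i, g in enumerate(list(genotypes) + [1]):
        if g != 1:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_snps:
                for j in range(run_start, i):
                    status[j] = 1
            run_start = None
    return status


def f_roh(status: Sequence[int]) -> float:
    """Genomic autozygosity of one animal as the fraction of SNPs in ROH."""
    return sum(status) / len(status) if status else 0.0


def locus_autozygosity(status_by_animal: Sequence[Sequence[int]]) -> List[int]:
    """Sum ROH status (0 or 1) at each SNP across all animals."""
    return [sum(column) for column in zip(*status_by_animal)]
```

For example, with a toy threshold of 3 SNPs, `roh_status([0, 2, 2, 0, 1, 0, 0, 1], min_snps=3)` flags the first four SNPs only, because the second homozygous stretch is too short to qualify.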

Population structure

Principal components analysis (PCA) was performed using genotypes and ROH separately to examine and correct for potential population structure [24]. Association of F and ROH was assessed with and without adjusting for potential stratification. For the ROH-based analysis, the homozygous state of each SNP as defined by ROH was used as input. The resulting principal components (PCs) were included as covariates in a multiple regression model involving ROH and the inbreeding coefficient, which adjusts for underlying population structure [24]. The adegenet package [25] was used to calculate principal components.

Associations between ROH and inbreeding coefficient

To evaluate associations between ROH at individual loci and inbreeding coefficients, a linear model was used without transforming FPED. The regression model was y = β0 + β1H + e, where y is the inbreeding coefficient (FPED) of an individual, β0 is the intercept, β1 is the coefficient of the predictor, H is the ROH status at a particular locus (0 or 1), and e is the random error. To correct for the effect of population stratification, principal components from PCA were included as covariates in the linear model [24], y = β0 + β1H + Σ γiPCi + e (summed over i = 1 to n), where PCi is the ith principal component obtained using ROH, γi is its regression coefficient, and n is the number of PCs. Additionally, birth year of an animal was analyzed as a response (y) using the same model to assess the change of homozygosity over time. Statistical thresholds for a genome-wide search of associations between genomic homozygosity and the pedigree-based inbreeding coefficient were determined empirically using permutation tests [26]. The inbreeding coefficient (FPED) was permuted repeatedly, and the most significant result of each permutation was retained to obtain thresholds [27]. The genome-wide critical values (1% = significant level and 5% = suggestive level) were obtained from 1,000 permutations of association tests between FPED and ROH across the whole genome. R and Perl scripts were used to detect associations and to calculate thresholds.
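The single-locus regression and the permutation-based genome-wide threshold can be sketched as follows. This is a simplified Python illustration, not the authors' R and Perl scripts: it covers only the model without principal components, the function names are hypothetical, and it uses the absolute correlation |r| as the test statistic instead of -log10p. For a simple regression with a fixed sample size, |r| is a monotone function of the p-value, so the maximum over loci selects the same most significant locus.

```python
import random
from statistics import mean
from typing import List, Sequence, Tuple


def slope_and_r(y: Sequence[float], h: Sequence[int]) -> Tuple[float, float]:
    """Least-squares slope of y on h and the Pearson correlation of y and h."""
    my, mh = mean(y), mean(h)
    shh = sum((a - mh) ** 2 for a in h)
    syy = sum((b - my) ** 2 for b in y)
    shy = sum((a - mh) * (b - my) for a, b in zip(h, y))
    if shh == 0 or syy == 0:
        return 0.0, 0.0  # locus or response has no variation
    return shy / shh, shy / (shh * syy) ** 0.5


def permutation_threshold(
    y: Sequence[float],
    loci: Sequence[Sequence[int]],
    n_perm: int = 1000,
    alpha: float = 0.01,
    seed: int = 0,
) -> float:
    """Empirical genome-wide threshold on |r|.

    y is shuffled n_perm times; for each shuffle the largest |r| over all
    loci is kept, and the (1 - alpha) quantile of those maxima is returned.
    """
    rng = random.Random(seed)
    y_perm = list(y)
    maxima: List[float] = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        maxima.append(max(abs(slope_and_r(y_perm, h)[1]) for h in loci))
    maxima.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[index]
```

A locus would then be declared genome-wide significant when its observed |r| with the unpermuted FPED exceeds the returned threshold; calling the function with `alpha=0.05` gives the suggestive level.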

Association of ROH and haplotype homozygosity with phenotypes

Association of ROH and a trait was examined across the genome. In addition, a sliding window approach partitioned the haplotype homozygosity (HH) effect of each allele contributing to overall homozygosity. The predicted transmitting ability (PTA) of daughter pregnancy rate (DPR) and somatic cell score (SCS) from milk records of U.S. Jerseys, which are genetic values after adjusting for environmental and polygenic effects, were obtained from USDA-AIPL. DPR was calculated from days open and directly relates to the proportion of females eligible to become pregnant in a 21 day period [28]. The association between haplotype or ROH and each trait was evaluated using linear regression [26], y = β0 + β1G + e, where y is the PTA of DPR or SCS of an individual, β0 is the intercept, β1 is the recessive genetic effect, G is an indicator variable for the ROH status of an individual, and e is the random error. In this analysis, the effect of population structure in PTA was reduced by statistical adjustment using a linear mixed model including the effect of genetic relationships and environmental factors. The genome-wide statistical significance level was determined using 1,000 permutation tests [27] as described above.

Since ROH reflects the sum of haplotype homozygosity, the additive effect is inestimable. Therefore, associations of the most frequent haplotype and fertility (DPR) were also estimated using the additive or recessive model to further understand the genetic effect of the haplotype that is highly correlated with ROH. To estimate the additive genetic effect of the most common haplotype, the number of copies of the most frequent haplotype carried by an animal was treated as a SNP-like genotype (G = 0, 1 or 2) in the same model used for the ROH-trait association. Similarly, for the recessive model, the homozygote of the most frequent haplotype was assumed to be homozygous for the recessive allele and all other haplotype combinations were assumed to carry no recessive allele. In each sliding window, the genetic effect was evaluated using the same model as for the ROH-trait association test, replacing the homozygote status indicator with the allele count. Finally, functional annotation for genes located in the regions identified from the association test between ROH and phenotypes was performed using the Enrichr software [29].
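The two codings of the most frequent haplotype can be made concrete with a short Python sketch. It assumes phased haplotypes for one window are available as a pair of strings per animal; the function name and the data layout are hypothetical and are not taken from the authors' scripts.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def code_most_frequent_haplotype(
    hap_pairs: Sequence[Tuple[str, str]],
) -> Tuple[str, List[int], List[int]]:
    """Code each animal for the most frequent haplotype in one window.

    hap_pairs: one (haplotype_1, haplotype_2) tuple per animal.
    Returns the most frequent haplotype, the additive coding (0, 1 or 2
    copies), and the recessive coding (1 only when both copies match).
    """
    counts = Counter(h for pair in hap_pairs for h in pair)
    top, _ = counts.most_common(1)[0]
    additive = [int(a == top) + int(b == top) for a, b in hap_pairs]
    recessive = [1 if copies == 2 else 0 for copies in additive]
    return top, additive, recessive
```

Either coding can then be used as the predictor G in the regression y = β0 + β1G + e described above, so that the additive and recessive results for a window are directly comparable.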

Results

ROH and inbreeding

The length of an autozygous haplotype originating from a common ancestor 10 generations in the past was expected to be approximately 5 Mb, which is close to the typical size of ROH using thresholds ranging from 30–50 homozygous SNPs (S1 Table). At a 50 SNP threshold, the mean and median lengths of homozygous fragments were 8.48 and 6.09 Mb, respectively. Correlations of FPED and FROH were 0.6–0.7 under the various definitions of ROH (S1 Table), while the level of ROH was independent of the definition of ROH. In all cases, FROH was higher than the mean FPED, implying that pedigree based inbreeding coefficients could be underestimated. Using pedigrees, we found that almost all individuals shared at least one common ancestor within 3 or 4 generations except founders and their offspring. In some individuals, particularly contemporary animals, the ROH extended over 10 Mb, suggesting derivation from recent common ancestors only 5 generations ago. We could not confirm whether a ROH carried identical alleles inherited from a common ancestor, owing to limited DNA resources and the presence of complex inbreeding loops. In the genotyped animals, assuming no genetic relationships among founder animals, inbreeding coefficients ranged from 0.02 to 0.29 and the mean and median F were estimated to be 0.061 and 0.058, respectively. The inbreeding coefficient based on pedigree was closely approximated by the estimated genomic autozygosity based on the sum of ROH per animal genome (S1 Fig).
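The generation estimates quoted in the text are consistent with a standard approximation, sketched here as background and not taken from the article itself. If an autozygous segment traces back to a common ancestor g generations ago, it has passed through roughly 2g meioses, so its expected genetic length is about 1/(2g) Morgans. Converting to physical length assumes roughly 1 cM per Mb, which is an assumption of this sketch.

```latex
\[
E[L] \approx \frac{100}{2g}\ \text{cM}, \qquad
g = 25 \Rightarrow 2\ \text{cM}, \quad
g = 10 \Rightarrow 5\ \text{cM}, \quad
g = 5 \Rightarrow 10\ \text{cM}.
\]
```

Under the 1 cM per Mb assumption these values correspond to about 2, 5, and 10 Mb, which matches the 2–5 Mb range given for ancestors 10–25 generations back and the segments over 10 Mb attributed to ancestors about 5 generations back.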

Although obvious sub-populations were not separated, ROH-based PCA showed that the levels of inbreeding coefficients could be roughly separated by PC1 (Fig 1A), which agreed with the trends of changes of inbreeding coefficients in the Jersey population. Using principal components based on ROH, the correlation of PC1 and FPED was high (r = 0.68), whereas correlations of FPED with PC2-PC5 did not exceed 0.1. Based on SNP genotypes, subgroups were not clearly separated by the first two principal components, PC1 and PC2 (Fig 1B).

Fig 1. Principal component analysis of Jersey cattle using ROH (A) and SNP genotypes (B).

Principal component 1 (PC1, x axis) and principal component 2 (PC2, y axis) are plotted. Three groups that are classified based on inbreeding coefficient (FPED) are indicated with three different colors. Blue, red, and black circles represent individuals with FPED<0.03, FPED = 0.03–0.10, and FPED>0.10, respectively.

Association of local autozygosity and inbreeding coefficient

The existence of substantial variation in the levels of ROH led us to examine whether particular genomic regions were more likely to contribute to changes of the inbreeding coefficient. Associations of ROH and pedigree-based inbreeding coefficients (ROH-FPED associations) were assessed to identify genomic regions accounting for FPED increases. This analysis revealed that ROH levels had increased at 60 or more regions (≥1 Mb, genome-wide threshold p ≤ 0.01) with increasing FPED during the last five decades (Fig 2A; Fig 2B; S1 Table). Despite weak statistical evidence, the levels of ROH in most regions (>99%) have increased concomitantly with the inbreeding coefficient (FPED), whereas ROH in a few genomic regions decreased as inbreeding increased, but with weak statistical significance (Fig 2B). Notably, autozygosity in the major histocompatibility complex region (MHC, 25–30 Mb) on BTA 23 has increased as levels of inbreeding have risen, which may affect the immune response to infectious diseases such as mastitis, for which somatic cell score (SCS) is an indicator trait in cattle.

Fig 2. Genome-wide associations of ROH and F.

Associations of ROH-FPED (-log10p) are plotted against each SNP locus across the genome (A). The y axis (B) represents the effect (slope) of the ROH-FPED association, and a dotted line indicates effect = 0. The genome-wide suggestive level is -log10p = 4.3, and the significance threshold (-log10p = 5.4) is shown with a dotted line (A). PCA-adjusted associations between ROH and FPED (C) and their effects (D) are plotted. A dotted line displays the genome-wide significance level (C).

In contrast, the ROH-FPED association test carried out with five principal components (PC1-PC5), which account for >90% of the total variation, revealed only a few regions that were significantly associated with FPED (Fig 2C). Three regions with high levels of ROH, on BTA 1 (49–50 Mb), BTA 3 (39–44 Mb), and BTA 7 (24–26 Mb), were significantly associated with FPED regardless of model, whereas most associations were influenced by potential stratification. These regions were concordant with high levels of ROH (>0.4), but not all regions with high levels of ROH showed a ROH-FPED association. Interestingly, in half of the regions, increased ROH was associated with decreased levels of FPED when stratification was adjusted for (Fig 2D), demonstrating the existence of principal components that are involved in the associations of ROH and inbreeding. Using the same analysis, the effects of PCs based on genotypes were not considerable.

Mapping of ROH and DPR or SCS

Next, the association of ROH with traits, including DPR (ROH-DPR associations) and SCS (ROH-SCS associations), was examined across the genome using linear regression, and the results identified genomic regions affecting DPR (Fig 3A). ROH was associated with both increased and decreased DPR across the genome (Fig 3B). Considerable negative associations between ROH and DPR were found on BTA 3, 7, 8, and 12 (Table 1). In these regions, DPR has decreased as the level of ROH has significantly increased with the pedigree-based inbreeding coefficient (FPED), suggesting a potential influence of local autozygosity on fertility. Similarly, the association test for somatic cell score (SCS) revealed directional effects of ROH on the trait (S2 Fig). ROH that were associated with SCS were also associated with increased FPED on BTA 1, 3, 4, 5, 13, and 21 (Table 1), suggesting that elevated homozygosity could be involved in susceptibility to mastitis. On BTA 3, a region at 40 Mb overlapped with a relatively broad region (>3 Mb) encompassing ROH-DPR associations. In this region (39–44 Mb) of BTA 3, ROH also increased with increasing inbreeding coefficient (Table 1). VCAM1 and SLC35A3 are located close together between 42 and 43 Mb on BTA 3. There was an overlap between ROH-DPR and ROH-SCS associations at a region on BTA 7, but ROH was not associated with FPED in this region.

Fig 3. ROH-DPR associations.

A) Significance of ROH-DPR association. B) Effect of ROH-DPR association. A dotted line shows the genome-wide significance threshold (adjusted 1% level, A). The positive or negative effect of ROH on daughter pregnancy rate (DPR), defined by the slope of the regression, is plotted across the genome (B). A negative effect represents a region in which DPR decreased as the level of ROH increased.

The genes identified in the regions that were significantly associated with DPR or SCS and the respective biological pathways are listed in S3 and S4 Tables. The total size of the regions was less than 20 Mb, which is approximately 2% of the total size of the genome, encompassing approximately 100 annotated coding regions. In one of the largest clusters from the analysis of DPR and SCS, genes affecting cell communication and sensory cognition (COL4A2, COL4A1, GJA2, GJB3, and GJB6) are located on BTA 3 (~40 Mb). Annotation of the regions associated with SCS revealed several interesting genes that may be involved in biological pathways such as immune response, including CBLB (BTA 1) and NCK1 (BTA 1), which are involved in the T cell receptor signaling pathway.

Comparisons of ROH associated with phenotypes, F, and birth year

When comparing the regions significantly associated with traits or F, ROH that affected the change of inbreeding coefficient did not greatly influence the traits known to be influenced by inbreeding depression in most genomic regions (Fig 4). However, the association of ROH and DPR appeared to be related to local autozygosity that has increased with overall inbreeding at six regions. To assess the correlation further, a genome-wide correlation coefficient was calculated between the regression coefficients of each pair of association analyses. Overall, moderate correlations were found among the results from genome-wide associations of ROH with traits, birth year, and inbreeding coefficient (S2 Table). The effects of ROH-FPED associations and ROH-DPR associations were negatively correlated (-0.24). The association between birth year and ROH, which reflects the consistent change of ROH during the past few decades, was correlated with the effects of the ROH-SCS (0.43) and ROH-DPR (-0.28) associations.

Fig 4. Comparisons of ROH-DPR, ROH-SCS and ROH-FPED.

On each chromosome, red (upper), orange (middle) and blue (lower) bars display the significant associations of ROH-SCS, ROH-FPED, and ROH-DPR, respectively. The chromosome number is shown on the left side of each chromosome. The horizontal scales at the top and bottom indicate genomic position (Mb) on a chromosome.

Finally, the effects of ROH-FPED associations were positively correlated with those of associations between ROH and the birth year of animals (0.36). This analysis examines the considerable change of ROH during the last five decades, which represents the regions under directional selection. Inbreeding coefficients in dairy cattle have increased mostly through selection of phenotypically superior ancestors that were frequently used as parents of the next generation. When considering the annual change of ROH across the genome, local autozygosity in several regions appeared to be involved in both inbreeding and the increased level of ROH, on BTA 2, 3, 5, 7, 8, 9, 16 and other regions (Table 2, S3 Fig), which implies that local autozygosity may be dependent on other genetic forces such as selection. On the other hand, FPED has increased substantially with rising levels of ROH in several chromosomal regions, including BTA 4, 9, 10, 11, 13, 17, and 28, whereas the levels of ROH have not been significantly associated with birth year in the same regions since the 1960s (S3 Fig; Fig 4).

Effect of haplotypes affecting variation in fertility

To examine the mode of genetic effect of ROH, the correlation of the most frequent haplotype and ROH was assessed using haplotype homozygosity (HH), and then an association test between the most frequent haplotype and a trait was conducted. The HH calculated with a sliding window approach was compared with ROH-based homozygosity, allowing further understanding of ROH. With a 50-SNP sliding window, the mean haplotype length was 3.4 Mb (standard deviation = 0.52) with an average of 123.5 (standard deviation = 77.0) alleles per haplotype (Table 3). The correlation between ROH and HH was high (r = 0.9). In addition, no obvious large outlier was observed when comparing ROH and HH in Jersey cattle (S4 Fig). Although the average number of alleles was high, the distribution of haplotype frequency was typically dominated by a few alleles. The mean frequency of the most frequent allele was 0.24 across the genome, with a range of 0.03 to 0.81. When the observed HH of each allele was calculated, the 5 most frequent alleles accounted for 90% of the observed HH of all alleles (Table 3). For each haplotypic allele, observed HH did not deviate greatly from expected HH, except for the most frequent haplotype.
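The comparison of observed and expected haplotype homozygosity within a window can be sketched in Python as below. This is an illustration under stated assumptions and is not the authors' code: expected HH for an allele is taken as the square of its frequency (random mating), observed HH is the fraction of animals carrying two copies, and the function names and the input layout (a pair of haplotype strings per animal) are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def haplotype_homozygosity(
    hap_pairs: Sequence[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Per-allele frequency, expected HH (freq squared) and observed HH."""
    n = len(hap_pairs)
    counts = Counter(h for pair in hap_pairs for h in pair)
    homozygotes = Counter(a for a, b in hap_pairs if a == b)
    summary = {}
    for hap, count in counts.items():
        freq = count / (2 * n)
        summary[hap] = {
            "freq": freq,
            "expected": freq * freq,
            "observed": homozygotes.get(hap, 0) / n,
        }
    return summary


def top_k_share(summary: Dict[str, Dict[str, float]], k: int = 5) -> float:
    """Share of total observed HH contributed by the k most frequent alleles."""
    ranked = sorted(summary.values(), key=lambda s: s["freq"], reverse=True)
    total = sum(s["observed"] for s in ranked)
    if total == 0:
        return 0.0
    return sum(s["observed"] for s in ranked[:k]) / total
```

Summing the `observed` values over all alleles gives the window's overall HH, which is the quantity compared with ROH-based homozygosity in the text, and `top_k_share(summary, k=5)` corresponds to the reported share of the 5 most frequent alleles.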

Table 3. Summary of the most frequent haplotypic allele using 50 SNP window.

When testing the association between ROH and fertility, we found that many regions with high or moderate levels of ROH were positively associated with fertility, suggesting that the genetic effect of most ROH appears to be unrelated to decreasing DPR. Fig 5A shows the significance of the regression for the additive effect of the most frequent haplotype on DPR, and Fig 5B shows the significance of HH. As expected, results of the recessive model largely overlapped with regions harboring significant ROH-DPR associations, and the correlation of associations from the two models was considerably high (r = 0.8). Additionally, the results from the additive genetic model agreed with those from the recessive genetic model for the regions associated with DPR, in addition to identifying another region that influences fertility additively. In all, our results suggest that associations between ROH and DPR are concordant with the results from either the additive or recessive effects model using the most frequent haplotype, implying that ROH-DPR associations could be interpreted as quantitative trait loci as well as evidence of inbreeding depression due to increased homozygosity.

Fig 5. Association between DPR and the most frequent haplotype using additive (A) and recessive model (B).

Each bar demonstrates the association of DPR and haplotype that is defined by the 50-SNP window. Association of DPR and the most frequent haplotype (A) or homozygous status of the most frequent haplotype (B) represents an additive or recessive effect. Genome-wide significance level is shown on each plot.

Discussion

A deleterious mutation may emerge randomly or one that exists may increase in frequency due to close linkage with a favorable allele under strong selection [30]. Because thousands of genes are influenced by the opposite effects of mutation and selection, detrimental or lethal alleles could be quite important when considering the entire genome [4]. In a highly redundant system, a deleterious mutation in a single gene may have a negligible effect and selection can only operate against the combination of mutations [3]. In contrast to random mating, artificial selection has increased overall similarity of the whole genome in livestock, particularly the genomic regions influencing economic traits [31].

Some inbreeding depression has been shown to be due to rare large-effect viability and fertility mutations in dairy cattle [32, 33]. However, in general, inbreeding depression appears to be caused by rare, mildly detrimental mutations at many loci [34]. The mean total number of deleterious causative polymorphisms in zygotes that contribute to the next generation has been estimated to be 12–32 in the human genome [3]. Inbreeding depression is a universal phenomenon, but its influence varies among species and even among populations of the same species [35]. VanRaden and colleagues [36] identified five haplotypes carrying lethal alleles in dairy cattle breeds including Holstein, Jersey, and Brown Swiss in the United States. However, the number of moderate and incompletely recessive deleterious mutations has not been clarified in cattle.

In this study, more than 60 regions showed a considerable change of local autozygosity accompanied by an elevation of inbreeding during the last few decades in Jersey cattle. In contrast, only a small number of ROH were associated with lower fertility in contemporary Jersey cattle. In addition, regions with high ROH (>0.3) created by recent inbreeding have not always been detrimental. In Australian Jersey cattle, only one genomic region affecting fertility was detected on the X chromosome using a ROH-based test [37]. Systematic mating may reduce relatedness between family members [38], but genomic factors corresponding to inbreeding depression have not been adequately characterized. The inverse correlation between milk production and fertility and the general belief that reproductive traits are associated with inbreeding depression led us to hypothesize that reproduction is associated with regions with excessive or increased levels of ROH in dairy cattle. It is worth noting that the high frequency of long intact haplotypes carrying loci under selection is one of the best indicators of selection [6].

The inbreeding coefficient (FPED) tends to be affected by influential and recent common ancestors, which may introduce stratification. According to the PCA based on genotypes, there was no obvious structure in the Jersey data. However, the association tests with and without adjustment for stratification produced results that are open to debate. After removing stratification using PCA of ROH, only three regions appeared to be linked to changes in the inbreeding coefficient, which deviated from our expectation. Interestingly, the correlation of the first principal component (PC1) and FPED was 0.7, which suggests that PC1 may account for a large amount of the variation due to the inbreeding coefficient (FPED). Adjusting for PC1 would therefore remove much of the inbreeding signal itself, so we focused on the results from the analyses without PCs.

The correlation between the genomic and pedigree inbreeding coefficients is less than 0.7 in Jersey cattle, which is lower than the value observed in small human families (~0.85) [10]. In dairy cattle, the correlation between FPED and genotype-based inbreeding coefficients was 0.74 using true allele frequencies and 0.68 using estimates of base frequencies [39]. In Jersey cattle, the correlation between FPED and FROH was 0.65 using 10–100 SNP thresholds [37], which agrees with our findings. A pedigree simulation study suggested that true autozygosity varies considerably among individuals with the same level of inbreeding, even without additional genetic forces such as selection or migration [18]. Additionally, pedigree-based calculation of the inbreeding coefficient assumes that founder animals are unrelated, which can result in lower F values compared to overall genomic autozygosity. In Jersey cattle, multiple regions showed a substantial change in autozygosity during the last few decades, which suggests that genetic forces such as selection are likely to play an important role in determining genome-wide patterns of autozygosity as well as inbreeding.
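
As a minimal sketch of how a comparison like the FROH and FPED correlations cited above can be set up, the Python below computes FROH as the summed ROH length divided by an autosomal genome length and then takes the Pearson correlation with FPED. The function names, the 2.5 Gb genome length, and the three toy animals are illustrative assumptions, not values or code from this study.

```python
from math import sqrt

# Assumed autosomal genome length in base pairs (illustrative, not from the paper).
AUTOSOME_LENGTH_BP = 2_500_000_000


def f_roh(segments, genome_length=AUTOSOME_LENGTH_BP):
    """Return the fraction of the genome covered by ROH segments.

    segments is a list of (start_bp, end_bp) tuples for one animal.
    """
    return sum(end - start for start, end in segments) / genome_length


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: ROH segments and pedigree inbreeding for three animals.
roh_by_animal = {
    "A": [(0, 50_000_000), (100_000_000, 175_000_000)],
    "B": [(0, 25_000_000)],
    "C": [(10_000_000, 260_000_000)],
}
f_ped = {"A": 0.06, "B": 0.02, "C": 0.08}

animals = sorted(roh_by_animal)
froh_values = [f_roh(roh_by_animal[a]) for a in animals]
fped_values = [f_ped[a] for a in animals]
r = pearson(froh_values, fped_values)
print(f"r(FROH, FPED) = {r:.2f}")
```

In practice the ROH segments would come from a caller with explicit SNP-count or length thresholds, and the choice of threshold changes FROH and therefore the correlation.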

The local effect hypothesis assumes that heterozygosity-fitness correlations result from heterozygosity at particular genes and loci [40]. A mutation in SLC35A3 on BTA 3 causes complex vertebral malformation (CVM) [41], a recessively inherited disorder with onset during fetal development that leads to frequent abortion of fetuses or perinatal death, and to vertebral anomalies [1]. The causative mutation of CVM originated from only two influential sires born during the 1960s in the US. Similarly, alleles contributing to inbreeding depression in the US Jersey population appear to be associated with particular founders [41]. Another spontaneous mutation may occur in SLC35A3 or a neighboring gene such as vascular cell adhesion molecule-1 (VCAM1), which mediates leukocyte recruitment from blood into tissues and affects embryo survival rate in the mouse [42]. Decreases in lethal alleles from selection or inbreeding may lead to an apparent deficiency of heterozygotes compared to Hardy-Weinberg expectations [43]. Lethal genes have been reported in >5,000 Jersey cattle [36, 44], but minor alleles at a low frequency (<0.1) are rarely detected using the Hardy-Weinberg equilibrium (HWE) test in a small number of animals. Given the pedigree structure of domestic animals, multiple alleles appear to be common at major gene loci under selection for many generations [45]. This may restrict the rate of increase of a recessive mutation near favorable loci unless the mutation exists in all haplotypes carrying the loci under selection.
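
The dependence of the HWE test on sample size can be illustrated with a short sketch. The Python below runs a one-degree-of-freedom chi-square goodness-of-fit test of genotype counts against Hardy-Weinberg proportions. The function name and the genotype counts are hypothetical: the counts mimic a recessive lethal at an allele frequency near 0.05 with no homozygotes observed, and are not data from this study.

```python
from math import erfc, sqrt


def hwe_chi2(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) of genotype counts against HWE proportions.

    Returns (chi2, p_value). The p-value uses the identity that the upper
    tail of a 1-df chi-square at x equals erfc(sqrt(x / 2)).
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, erfc(sqrt(chi2 / 2.0))


# Hypothetical counts: same genotype proportions, two different sample sizes.
chi2_large, p_large = hwe_chi2(4525, 475, 0)  # 5,000 animals
chi2_small, p_small = hwe_chi2(181, 19, 0)    # 200 animals

print(f"n=5000: chi2={chi2_large:.2f}, p={p_large:.4f}")
print(f"n=200:  chi2={chi2_small:.2f}, p={p_small:.4f}")
```

With identical genotype proportions, the missing homozygotes produce a clear departure from HWE only in the larger sample, which is consistent with the point that low-frequency alleles are rarely detected by this test in a small number of animals.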

Our primary aim was to identify the regions corresponding to local autozygosity by detecting associations between ROH and DPR or SCS. Although the inbreeding coefficient is generally the most influential factor associated with the decline in fertility, reproductive traits are quantitative traits that are largely affected by environment [46]. QTL mapping has been used to identify the genetic mode inducing inbreeding depression in Drosophila [47], demonstrating incompletely recessive effects on viability. The effect of ROH was expected to be partially or nearly recessive, thus contributing to inbreeding depression. As inbreeding depression is also a consequence of nonlinear interactions between gene effects, it stands to reason that epistasis may complicate the analysis and interpretation [48]. Although some ROH decreased as FPED increased, most ROH were positively correlated with FPED, supporting the partial dominance hypothesis rather than the overdominance hypothesis [14]. Using haplotypes, the additive effects model explained variation in fertility better than the recessive effects model. Alternatively, the genetic effects estimated using a recessive model may be approximated through the use of an additive model. For more accurate estimation, daughter yield deviation (DYD) of DPR, which accounts for the additive effect and inbreeding depression, can be used instead of PTA, but most of the recent genetic trend is determined by the additive effect [16]. It is important to recognize that most genetic effects will be additive when the mutations are incompletely recessive [48]. The power to detect recessive causal alleles is poor when the minor allele frequency is not close to 50% [49]. Estimated non-additive genetic variances for fertility were not smaller than the additive genetic variance in Holstein cattle [50]. Thus, an optimum approach will be necessary to evaluate precisely the effect of local autozygosity on traits.
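
The statement that incompletely recessive mutations behave largely additively can be illustrated with the standard single-locus quantitative genetic model, in which the three genotypes have values +a, d, and -a. The sketch below is not an analysis from this study; the function names and the parameter values are assumptions chosen for illustration. It computes the additive variance 2pq[a + d(q - p)]² and the dominance variance (2pqd)².

```python
def variance_components(p, a, d):
    """Additive and dominance variance at one biallelic locus.

    p is the frequency of the favorable allele, a is half the difference
    between the two homozygotes, and d is the heterozygote deviation from
    the homozygote midpoint (d = a means the unfavorable allele is fully
    recessive; 0 < d < a means it is incompletely recessive).
    """
    q = 1.0 - p
    alpha = a + d * (q - p)            # average effect of allele substitution
    v_additive = 2 * p * q * alpha ** 2
    v_dominance = (2 * p * q * d) ** 2
    return v_additive, v_dominance


def additive_fraction(p, a, d):
    """Share of the single-locus genetic variance that is additive."""
    v_a, v_d = variance_components(p, a, d)
    return v_a / (v_a + v_d)


# Hypothetical deleterious allele at frequency 0.1 (favorable allele p = 0.9).
for d in (1.0, 0.5):
    share = additive_fraction(p=0.9, a=1.0, d=d)
    print(f"d = {d}: additive share of genetic variance = {share:.2f}")
```

Under these assumed values the additive share is small for a fully recessive allele at low frequency and becomes the majority once the allele is only partially recessive, which is one reason an additive model can approximate a recessive one.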

In humans, susceptibility to disease increased with overall autozygosity, but a local effect of autozygosity on disease has not been clarified [26]. In the primary analysis, we regarded a region with a high or increasing frequency of ROH as a product of an elevated inbreeding coefficient. However, inbreeding changes genotypic frequencies, whereas allele frequencies do not change significantly without other genetic events, which leads us to consider an alternative explanation for the associations between ROH and FPED. In addition, the amount of change in local autozygosity does not necessarily agree with increasing levels of inbreeding. This suggests that inbreeding alone does not drive a local ROH to increase faster than other regions. Thus, we attribute locally emerging ROH to other genetic events, including recent selection and inheritance from recent common ancestors. The comparison between signatures of selection in Holstein cattle and ROH-FPED associations showed consistency in several regions that accounted for variation in milk yield [13]. The correlation between ROH-FPED associations and ROH-birth year associations (0.32–0.36) was considerable and appears to be linked to an increase in the most frequent haplotype. This may suggest that multiple genetic events, such as artificial selection based on a small number of effective founder animals, result in changing levels of ROH.
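
The distinction between allele and genotype frequencies under inbreeding follows from the textbook expressions for genotype frequencies at inbreeding coefficient F: p² + Fpq, 2pq(1 - F), and q² + Fpq. The short Python sketch below uses hypothetical function names and an arbitrary allele frequency of 0.3 to show that homozygosity rises with F while the allele frequency recovered from the genotypes stays the same.

```python
def genotype_freqs(p, f):
    """Genotype frequencies (AA, Aa, aa) for allele frequency p and inbreeding f."""
    q = 1.0 - p
    return (p * p + f * p * q, 2 * p * q * (1.0 - f), q * q + f * p * q)


def allele_freq(genotypes):
    """Frequency of allele A recovered from (AA, Aa, aa) genotype frequencies."""
    f_AA, f_Aa, _ = genotypes
    return f_AA + 0.5 * f_Aa


# Arbitrary illustrative allele frequency; F values span none to substantial inbreeding.
for f in (0.0, 0.05, 0.25):
    g = genotype_freqs(0.3, f)
    print(f"F={f:.2f}: AA={g[0]:.4f} Aa={g[1]:.4f} aa={g[2]:.4f} p={allele_freq(g):.2f}")
```

Because inbreeding on its own redistributes alleles among genotypes without changing p, a region where the frequency of one haplotype rises over time points to an additional force such as selection or drift from a few founders.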

If records are not available, ROH appears to be the most powerful of several alternative estimates of F for detecting inbreeding effects when the sample size is large (n>12,000) [22]. However, contrary to expectations, association mapping of ROH with traits has demonstrated several additive QTL effects of the most frequent haplotypes, in particular when ROH was positively associated with DPR. QTL accounting for fertility and calving traits have been reported in dairy cattle [46, 51]. In US Holsteins, the top 10% of bulls had a DPR 4.9% higher than bulls in the lowest 10% [31], a difference that could be exploited to improve reproduction and that may explain why not all regions of ROH were associated with reduced fertility.

We have suggested methods for assessing the relationships between inbreeding, genomic autozygosity, and trait performance using prior information from a Jersey cattle population. ROH could be considered the optimal measure of inbreeding based on genomic information [11]. A survey of runs of homozygosity using high-density SNP data provided insight into the formation and role of autozygosity. However, the reasons for variation in the distribution and formation of ROH are unclear [14]. ROH appears to originate from selected ancestors that are likely to be related to each other, which suggests that the association between ROH and the inbreeding coefficient is evidence of recent selection resulting in increased inbreeding. Altogether, we conclude that ROH reflects homozygosity largely influenced by recent artificial selection [52, 53], suggesting the need for careful interpretation of results from analyses of associations between ROH and traits that are related to inbreeding depression in US Jersey cattle.

Supporting Information

S1 Table. Correlation and regression of FROH on FPED.

¹ Definition of ROH based on the number of continuous homozygous SNPs (30, 40, and 50 SNPs) or size (3 or 5 Mb). ² Regression coefficients of FPED on FROH are shown.



S2 Table. Correlation of associations between ROH and FPED, birth year, DPR, and SCS.

Correlation coefficients (r) are shown: r of the ROH-FPED, ROH-Year, ROH-DPR, and ROH-SCS represents associations between ROH and FPED, birth year, DPR, and SCS, respectively.



S3 Table. Genes in the regions encompassing associations between ROH and DPR.

Gene functions are annotated using Enrichr. Genes annotated by the KEGG database are summarized in the Excel file.



S4 Table. Genes in the regions encompassing associations between ROH and SCS.

Gene functions are annotated using Enrichr. Genes annotated by the KEGG database are summarized in the Excel file.



S1 Fig. Correlation of genomic autozygosity (ROH) and inbreeding coefficient.

The two axes represent the pedigree inbreeding coefficient (FPED) and the level of genomic inbreeding based on ROH (FROH).



S2 Fig. Associations of ROH and SCS.

A) Significance level of associations, B) Regression coefficient of associations. A dotted line shows the genome-wide significance threshold (adjusted 1% level, A). The positive or negative effect of ROH on somatic cell score (SCS), defined by the slope of the regression, is plotted across the genome (B). A negative effect indicates a region where SCS decreases as ROH increases.



S3 Fig. Genome-wide association of ROH and FPED or birth year.

ROH-F associations are plotted with gray bars and associations of ROH and birth year are shown in black dots. The dotted line shows the genome-wide significance level of ROH-FPED associations.



S4 Fig. The correlation between ROH and haplotype homozygosity.

ROH (y axis) corresponding to a sliding window is plotted against haplotype homozygosity (HH, x axis). Mean ROH in a sliding window is calculated for comparison with 50-SNP haplotype homozygosity.



S1 File. Data availability.




We thank the former Animal Improvement Program Laboratory for providing the US Jersey pedigree and genotype information for animals previously processed at AGIL under funding from the American Jersey Cattle Association. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. The USDA is an equal opportunity provider and employer.

Author Contributions

Conceived and designed the experiments: ESK CPVT TSS. Performed the experiments: TSS. Analyzed the data: ESK GW. Contributed reagents/materials/analysis tools: TSS MFR CPVT. Wrote the paper: ESK MFR TSS CPVT GW.


  1. Agerholm JS, Bendixen C, Andersen O, Arnbjerg J (2001) Complex vertebral malformation in Holstein calves. J Vet Diagn Invest 13:283–89. pmid:11478598
  2. Xue Y, Chen Y, Ayub Q, Huang N, Ball EV, Mort M, et al. (2012) 1000 Genomes Project Consortium. Deleterious- and disease-allele prevalence in healthy individuals: insights from current predictions, mutation databases, and population-scale resequencing. Am J Hum Genet 91:1022–32. doi: 10.1016/j.ajhg.2012.10.015. pmid:23217326
  3. Morris JA (2001) How many deleterious mutations are there in the human genome? Med Hypotheses 56:646–52. pmid:11388784
  4. Hedrick PW (2005) Genetics of populations. Sudbury: Jones and Bartlett. 737p.
  5. Kristensen TN, Pedersen KS, Vermeulen CJ, Loeschcke V (2010) Research on inbreeding in the 'omic' era. Trends Ecol Evol 25:44–52. doi: 10.1016/j.tree.2009.06.014. pmid:19733933
  6. Crow J, Kimura M (1970) An introduction to population genetics theory. New York, Evanston and London: Harper & Row Publishers.
  7. Slate J, Pemberton JM (2002) Comparing molecular measures for detecting inbreeding depression. J Evol Biol 15:20–31.
  8. Hedrick P, Fredrickson R, Ellegren H (2001) Evaluation of d2, a microsatellite measure of inbreeding and outbreeding, in wolves with a known pedigree. Evol 55:1256–60.
  9. Pemberton JM (2004) Measuring inbreeding depression in the wild: the old ways are the best. Trends Ecol Evol 19:613–15.
  10. McQuillan R, Leutenegger AL, Abdel-Rahman R, Franklin CS, Perici M, Barac-Lauc L, et al. (2008) Runs of homozygosity in European populations. Am J Hum Genet 83:359–72. doi: 10.1016/j.ajhg.2008.08.007. pmid:18760389
  11. Curik I, Ferenčaković M, Sölkner J (2014) Inbreeding and runs of homozygosity: A possible solution to an old problem. Livest Sci 26:24.
  12. Nothnagel M, Lu TT, Kayser M, Krawczak M (2010) Genomic and geographic distribution of SNP-defined runs of homozygosity in Europeans. Hum Mol Genet 19(15):2927–35. doi: 10.1093/hmg/ddq198. pmid:20462934
  13. Kim ES, Cole J, Huson HH, Wiggans GR, Van Tassell CP, Crooker BA, et al. (2013) Effect of Selection on runs of homozygosity in U.S. Holstein. PLoS One 8:e80813. doi: 10.1371/journal.pone.0080813. pmid:24348915
  14. Leroy G (2014) Inbreeding depression in livestock species: review and meta-analysis. Anim Genet 45:618–28. doi: 10.1111/age.12178. pmid:24975026
  15. Purfield DC, Berry DP, McParland S, Bradley DG (2012) Runs of homozygosity and population history in cattle. BMC Genet 13:70. doi: 10.1186/1471-2156-13-70. pmid:22888858
  16. VanRaden PM, Sanders AH, Tooker ME, Miller RE, Norman HD, Kuhn MT, et al. (2004) Development of a National Genetic Evaluation for Cow Fertility. J Dairy Sci 87:2285–92. pmid:15328243
  17. McParland S, Kearney F, Berry DP (2009) Purging inbreeding depression within the Irish Holstein-Friesian population. Genet Sel Evol 41:16. doi: 10.1186/1297-9686-41-16. pmid:19284688
  18. Markert JA, Grant PR, Grant BR, Keller LF, Coombs JL, Petren K (2004) Neutral locus heterozygosity, inbreeding, and survival in Darwin’s ground finches (Geospiza fortis and G. scandens). Hered 92:306–15.
  19. Wright S (1922) Coefficients of inbreeding and relationship. Am Naturalist 56:330–38.
  20. Kinghorn BP (1994) Pedigree Viewer-a graphical utility for browsing pedigreed data sets. 5th W Cong Genet App Livest Prod 22:85–86.
  21. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81:559–75. pmid:17701901
  22. Keller MC, Visscher PM, Goddard ME (2011) Quantification of inbreeding due to distant ancestors and its detection using dense SNP data. Genetics 189:237–49. doi: 10.1534/genetics.111.130922. pmid:21705750
  23. Scheet P, Stephens M (2006) A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet 78:629–44. pmid:16532393
  24. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38:904–9. pmid:16862161
  25. Jombart T, Ahmed I (2011) adegenet 1.3–1: new tools for the analysis of genome-wide SNP data. Bioinformatics 21:3070–1.
  26. Keller MC, Simonson MA, Ripke S, Neale BM, Gejman PV, Howrigan DP, et al. (2012) Schizophrenia Psychiatric Genome-Wide Association Study Consortium 2012 Runs of homozygosity implicate autozygosity as a schizophrenia risk factor. PLoS Genet 8:e1002656. doi: 10.1371/journal.pgen.1002656. pmid:22511889
  27. Churchill GA, Doerge RW (1994) Empirical threshold values for quantitative trait mapping. Genetics 138:963–71. pmid:7851788
  28. VanRaden PM (2003) Longevity and fertility trait definitions compared in theory and simulation. Interbull 30:43–46.
  29. Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV, et al. (2013) Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinfo 14:129.
  30. Chun S, Fay JC (2011) Evidence for hitchhiking of deleterious mutations within the human genome. PLoS Genet 7:e1002240. doi: 10.1371/journal.pgen.1002240. pmid:21901107
  31. Andersson L, Georges M (2004) Domestic-animal genomics: deciphering the genetics of complex traits. Nat Rev Genet 5:202–212. pmid:14970822
  32. Thomsen B, Horn P, Panitz F, Bendixen E, Petersen AH, Holm LE, et al. (2006) A missense mutation in the bovine SLC35A3 gene, encoding a UDP-N-acetylglucosamine transporter, causes complex vertebral malformation. Genome Res 16:97–105. pmid:16344554
  33. Khatib H, Maltecca C, Monson RL, Schutzkus V, Rutledge JJ (2009) Monoallelic maternal expression of STAT5A affects embryonic survival in cattle. BMC Genet 10:13. doi: 10.1186/1471-2156-10-13. pmid:19284551
  34. Charlesworth D, Willis JH (2009) The genetics of inbreeding depression. Nat Rev Genet 10:783–96. doi: 10.1038/nrg2664. pmid:19834483
  35. Kärkkäinen K, Koski V, Savolainen O (1996) Geographical variation in inbreeding depression of Scots pine. Evol 50:111–19.
  36. VanRaden PM, Olson KM, Null DJ, Hutchison JL (2011) Harmful recessive effects on fertility detected by absence of homozygous haplotypes. J Dairy Sci 94:6153–61. doi: 10.3168/jds.2011-4624. pmid:22118103
  37. Pryce JE, Haile-Mariam M, Goddard ME, Hayes BJ (2014) Identification of genomic regions associated with inbreeding depression in Holstein and Jersey dairy cattle. Genet Sel Evol 46:71. doi: 10.1186/s12711-014-0071-7. pmid:25407532
  38. Weigel KA (2006) Prospects for improving reproductive performance through genetic selection. Anim Reprod Sci 96:323–30. pmid:16962265
  39. VanRaden PM, Olson KM, Wiggans GR, Cole JB, Tooker ME (2011) Genomic inbreeding and relationships among Holsteins, Jerseys, and Brown Swiss. J Dairy Sci 94:5673–82. doi: 10.3168/jds.2011-4500. pmid:22032391
  40. David P (1998) Heterozygosity-fitness correlations: new perspectives on old problems. Hered 80:531–37.
  41. Nielsen US, Aamand GP, Andersen O, Bendixen C, Nielsen VH, Agerholm JS (2003) Effects of complex vertebral malformation on fertility traits in Holstein cattle. Livest Prod Sci 79:233–38.
  42. Gurtner GC, Davis V, Li H, McCoy MJ, Sharpe A, Cybulsky MI (1995) Targeted disruption of the murine VCAM1 gene: essential role of VCAM-1 in chorioallantoic fusion and placentation. Genes Dev 9:1–14. pmid:7530222
  43. Gillespie JH (1998) Population Genetics: A concise guide. Baltimore: Johns Hopkins University Press. 174p.
  44. Sonstegard TS, Cole JB, VanRaden PM, Van Tassell CP, Null DJ, Schroeder SG, et al. (2013) Identification of a nonsense mutation in CWC15 associated with decreased reproductive efficiency in Jersey cattle. PLoS One 8:e54872. doi: 10.1371/journal.pone.0054872. pmid:23349982
  45. Kühn C, Thaller G, Winter A, Bininda-Emonds OR, Kaupe B, Erhardt G, et al. (2004) Evidence for multiple alleles at the DGAT1 locus better explains a quantitative trait locus with major effect on milk fat content in cattle. Genetics 167:1873–81. pmid:15342525
  46. Holmberg M, Andersson-Eklund L (2006) Quantitative trait loci affecting fertility and calving traits in Swedish dairy cattle. J Dairy Sci 89:3664–71. pmid:16899702
  47. Vermeulen C, Bijlsma R, Loeschcke V (2008) A major QTL affects temperature sensitive adult lethality and inbreeding depression in life span in Drosophila melanogaster. BMC Evol Biol 8:297. doi: 10.1186/1471-2148-8-297. pmid:18957085
  48. Houle D, Hoffmaster DK, Assimacopoulos S, Charlesworth B (1992) The genomic rate of mutation for fitness in Drosophila. Nature 359:58–60. pmid:1522887
  49. Lettre G, Lange C, Hirschhorn JN (2007) Genetic model testing and statistical power in population-based association studies of quantitative traits. Genet Epidemiol 31:358–62. pmid:17352422
  50. Palucci V, Schaeffer LR, Miglior F, Osborne V (2005) Non-additive genetic effects for fertility traits in Canadian Holstein cattle. Genet Sel Evol 39:181–93.
  51. Druet T, Fritz S, Boussaha M, Ben-Jemaa S, Guillaume F, Derbala D, et al. (2008) Fine mapping of quantitative trait loci affecting female fertility in dairy cattle on BTA03 using a dense single-nucleotide polymorphism map. Genetics 17:2227–35.
  52. Howard JT, Maltecca C, Haile-Mariam M, Hayes BJ, Pryce JE (2015) Characterizing homozygosity across United States, New Zealand and Australian Jersey cow and bull populations. BMC Genomics 16:187. doi: 10.1186/s12864-015-1352-4. pmid:25879195
  53. Kim ES, Sonstegard TS, Rothschild MF (2015) Recent artificial selection in U.S. Jersey cattle impacts autozygosity levels of specific genomic regions. BMC Genomics 16:302. doi: 10.1186/s12864-015-1500-x. pmid:25887761