
Detectability of runs of homozygosity is influenced by analysis parameters and population-specific demographic history

Abstract

Wild populations are increasingly threatened by human-mediated climate change and land use changes. As populations decline, the probability of inbreeding increases, along with the potential for negative effects on individual fitness. Detecting and characterizing runs of homozygosity (ROHs) is a popular strategy for assessing the extent of individual inbreeding present in a population and can also shed light on the genetic mechanisms contributing to inbreeding depression. Here, we analyze simulated and empirical datasets to demonstrate the downstream effects of program selection and long-term demographic history on ROH inference, leading to context-dependent biases in the results. Through a sensitivity analysis we evaluate how various parameter values impact ROH-calling results, highlighting its utility as a tool for parameter exploration. Our results indicate that ROH inferences are sensitive to factors such as sequencing depth and ROH length distribution, with bias direction and magnitude varying with demographic history and the programs used. Estimation biases are particularly pronounced at lower sequencing depths, potentially leading to either underestimation or overestimation of inbreeding. These results are particularly important for the management of endangered species, as underestimating inbreeding signals in the genome can substantially undermine conservation initiatives. We also found that small true ROHs can be incorrectly lumped together and called as longer ROHs, leading to erroneous inference of recent inbreeding. To address these challenges, we suggest using a combination of ROH detection tools and ROH length-specific inferences, along with sensitivity analysis, to generate robust and context-appropriate population inferences regarding inbreeding history. We outline these recommendations for ROH estimation at multiple levels of sequencing effort, which are typical of conservation genomics studies.

Author summary

In this study, we explored how various factors affect the accuracy of detecting runs of homozygosity (ROHs). ROHs are long sections of the genome where both inherited copies of the DNA are identical due to shared ancestry in the maternal and paternal lines. Because these sections reflect shared ancestry, ROHs are useful for measuring inbreeding in populations, which can negatively impact individual fitness and species survival. We analyzed simulated and empirical data to examine how factors such as demographic history and sequencing depth influence ROH detection. Our results show that these estimates of inbreeding are highly sensitive to the analysis software and data quality. Misidentifying ROHs can lead to significant overestimates or underestimates of inbreeding, potentially complicating conservation efforts. To improve accuracy, we recommend using multiple analysis methods and conducting sensitivity analyses to account for data variability. These findings are particularly relevant for managing endangered species, where precise genetic information is critical for effective conservation planning.

Introduction

Climate change and expanding human land use are increasingly partitioning wild populations into smaller areas of suitable habitat, often leading to declining population sizes [1,2]. Reduced population size can increase inbreeding, which can have negative fitness consequences for individuals in many wild populations [3–6]. Severe inbreeding depression may threaten populations with extirpation, making it crucial to assess inbreeding for understanding and mitigating risk in populations of conservation concern. Prior to widespread application of whole-genome sequencing strategies to non-model species, genetic estimates of inbreeding were obtained using allozyme or microsatellite data or inferred from known pedigrees [7–10]. These studies have been critically important to understanding the genetic dynamics of stable and shrinking populations and have led to increasing recognition of inbreeding depression’s prevalence and ability to affect wild population persistence [4,11]. However, whole-genome sequencing strategies offer new opportunities for insights previously not possible via pedigree or microsatellite-based studies [12].

Runs of homozygosity (ROHs) are contiguous genomic segments wherein both inherited haplotypes are identical by descent, typically resulting from mating between closely related individuals and their offspring inheriting large genomic segments from a common ancestor [13]. Recombination plays a key role in breaking down these segments over generations such that the presence of long ROHs is indicative of recent inbreeding, whereas shorter ROHs may point to more ancient inbreeding events. As a result, ROH length and frequency distributions can be used in population genetics for assessing inbreeding [13,14]. In conservation contexts, monitoring FROH (the proportion of an individual’s genome that is autozygous as measured by total length of ROHs [15]) can help establish reference points or empirical baselines for managing genetic diversity, especially in small or endangered populations where inbreeding may be common and maintaining healthy levels of genetic variation is essential for long-term viability.

The identification and quantification of inbreeding depression, largely a result of homozygosity of strongly deleterious mutations, is critical to conservation as it guides effective management strategies for myriad species. For example, the introduction of genetically diverse individuals into declining populations to reduce the proportion of deleterious alleles present in a population is expected to reduce genetic load and, as a result, reduce extinction risks. Further, coupling genomic data with fitness outcomes has shown that longer ROHs are associated with higher mutational loads [16,17] and ROH analyses in historical and extant populations have identified genetic trajectories leading to species declines and extinctions [18,19]. Thus, the integration of genetic data via ROH estimation and existing demographic data can provide important clues about population stressors that support management decision making, ultimately reducing extinction risks for target species.

Across the complex and varied histories of natural populations, particular scenarios represent critical turning points for managing populations in conservation contexts [20]. Large populations are often a high priority in conservation genetics due to the relatively weak force of genetic drift they experience, which allows for the maintenance of genetic variation in these populations [21]. Small populations, by contrast, experience stronger effects of drift and may experience inbreeding and inbreeding depression. As a result, small populations tend to have higher risk of extinction compared to larger populations [22,23]. In general, bottlenecked populations suffer from reduced genetic diversity and increased genetic drift, making them vulnerable to inbreeding depression and fitness loss [24], and declining populations may enter an extinction vortex, where inbreeding depression reduces fitness, leading to further population declines and exacerbating inbreeding in a cycle that can lead to extinction [25]. Therefore, reduced population size should result in longer ROHs due to higher inbreeding, and this signature should be detectable even after population expansion, as observed in several species [5,26,27]. However, detecting these ROHs requires identifying long genomic segments of reduced genetic variation that, in small and declining populations, comprise increasing proportions of the genome.

Estimating ROHs can provide crucial insights into populations’ evolutionary histories, but these histories can in turn affect which ROH-calling software and combination of parameter values are most appropriate. For example, the optimal settings for inferring ROHs in a small, inbred population may differ from those required for a large, genetically diverse population because the underlying sources of bias differ (e.g., differences in ROH length distributions, numbers of variable sites, expected minor allele frequencies; [13]). While some studies include comparisons of results from multiple programs or parameter value combinations (e.g., [26,28–30]), many more studies rely on default settings and do not explore the effects of varying these parameter values on their results. This has the potential to be misleading because it is generally impossible to know how closely the resulting estimates approximate reality, and yet ROHs remain an alluring metric for decision makers because of the phenomenon they illustrate.

Various methods are available for detecting ROHs in genome resequencing data. Scanning approaches, such as PLINK’s sliding window method, identify successive homozygous SNPs to define ROHs once a specified threshold of consecutive homozygous SNPs is met [31,32]. This method is widely used due to its simplicity and efficiency in handling large datasets, but it may miss shorter ROHs or produce false positives in regions with low SNP density. Model-based approaches, like BCFtools/RoH [33], use hidden Markov models to detect autozygosity in SNP data. These approaches can account for uncertainties in genotype data, making them particularly useful in cases where distinguishing true identical-by-descent (IBD) regions from background noise is critical. However, they are computationally more intensive, which might be a consideration in very large datasets. For both classes of methods, the extent to which ROH calls may be biased by population history and the underlying ROH length and frequency distributions is not known.

Here, we address the challenge of interpreting ROH population patterns using simulated and empirical genomic sequencing data to compare ROH identification patterns among different demographic histories. We focus on whole-genome sequencing data, building on previous studies that have examined ROH inference for data sets with lower marker densities [34–36]. Specifically, we test a wide array of setting combinations for two programs commonly used—PLINK and BCFtools/RoH—and, for PLINK, apply a sensitivity analysis to evaluate the effects of parameter values on ROH inference. These programs were chosen for their widespread use in population genomic studies and because they apply different approaches to ROH identification [34,37–39]. Based on these results, we outline a set of recommendations for ROH estimation for population genomics studies.

Methods

Part I: Simulated data

Data generation and genotype calling.

We used SLiM v4.0.1 and modifications of scripts published by Stoffel et al. [6] to simulate four distinct demographic scenarios: (i) a long-term large population, (ii) a long-term small population, (iii) a bottlenecked population, and (iv) a long-term declining population. Each simulation was initiated as a population of 10,000 individuals, wherein each individual consisted of a homologous pair of 30-Mb chromosomes [40,41]. Each population was simulated for 10,000 generations, followed by scenario-specific demographic changes (Fig 1). The large population scenario involved an additional 1,000 generations at a population size of 1,000 individuals, whereas the small population scenario involved 1,000 additional generations at a population size of 250 individuals. The bottlenecked population persisted with 1,000 individuals for 900 additional generations, followed by a 50-generation interval with 50 individuals (i.e., the bottleneck) and a final 50-generation interval with 250 individuals. The declining population continued for 850 generations with a population size of 1,000, followed by three 50-generation intervals of decreasing population sizes, from 500 to 250 and finally to 50 individuals. The population sizes for the large and small populations were selected to balance realism, particularly in the context of IUCN listing criteria, with computational efficiency. Recombination rate (9.15 x 10−9 per site per generation) and base mutation rate (1.39 x 10−8 per site per generation) were chosen based on estimates calculated for Tasmanian devils (Sarcophilus harrisii)—the focal species for our empirical tests [42,43]. Each VCF file output from SLiM was converted to FASTA sequence files using a custom script in R v4.0.3 and a haploid ancestral sequence comprising a 30-Mbp segment randomly sampled from the Tasmanian devil genome [44].

Fig 1.

A) Diagrams describing population size changes for each of the simulated demographic scenarios. Each diagram shows the final 1,000 generations of each scenario, which was preceded in each by a burn-in of 10,000 generations with a population size of 10,000 diploid individuals. Population sizes for each interval are noted for each scenario. B,D,F,H) True overall FROH frequencies. C,E,G,I) Length bin-specific true FROH values; horizontal lines correspond to bin median values.

https://doi.org/10.1371/journal.pcbi.1012566.g001

Using the known genotypes for all individuals, we generated a coordinate file for each demographic scenario recording the start and end coordinates for all true ROHs ≥ 100 kb in length (i.e., ≥ 100 kb of sequential homozygous loci). We imposed this lower limit on ROH length because ROHs less than 100 kb in length likely trace to a common ancestor at least approximately 500 generations in the past (assuming a recombination rate of 1 cM/1 Mb; [45]), and would not be expected to influence contemporary individual fitness as strongly as more recently acquired autozygous segments [16]. This threshold is widely applied in population genetics studies of non-model species [5,46–48], and we follow this convention for all downstream analyses.
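
This coordinate extraction can be implemented directly from the known genotypes. Below is a minimal sketch in R of one way to do so (not the authors' script), assuming that, for one simulated individual, 'pos' holds the sorted variant positions on the 30-Mb chromosome and 'is_het' flags heterozygous genotypes; here a true ROH is taken as any homozygous stretch between successive heterozygous sites (or a chromosome end) spanning at least 100 kb.

    ## Sketch: derive true ROH coordinates (>= 100 kb) for one individual.
    ## Assumes 'pos' (sorted variant positions) and 'is_het' (logical vector,
    ## TRUE at heterozygous genotypes); names are illustrative.
    true_rohs <- function(pos, is_het, chrom_len = 30e6, min_len = 1e5) {
      het_pos <- pos[is_het]
      ## candidate ROH boundaries: chromosome start, each heterozygous site, chromosome end
      bounds <- c(0, het_pos, chrom_len + 1)
      starts <- head(bounds, -1) + 1
      ends   <- tail(bounds, -1) - 1
      keep   <- (ends - starts + 1) >= min_len
      data.frame(start = starts[keep], end = ends[keep])
    }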

We randomly selected 50 individuals from the final generation of each population simulation and generated FASTQ read files from each of the two FASTA files representing homologous chromosomes using ART (version MountRainier-2016-06-05) [49]. We simulated 150-bp paired-end reads using the HiSeq 2500 error model to a depth of 50X per individual (i.e., 25X per homologous chromosome). Each FASTQ file was quality-checked using FASTQC v0.11.9 [50]. We aligned reads to the ancestral sequence using the BWA-MEM algorithm implemented in BWA v2.0 and downsampled the resulting BAM files using SAMtools v1.17 to simulate four additional levels of coverage per individual: 5X, 10X, 15X, and 30X [51,52].
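
As a rough illustration of this step, the following R sketch shells out to ART, BWA, and SAMtools for a single simulated individual. It assumes the two haplotype FASTA files are named hapA.fa and hapB.fa, that the ancestral reference ancestral.fa has been indexed with bwa index, and that all three tools are on the PATH; file names and the fragment-size settings are placeholders rather than the exact values used in the study.

    ## Sketch: simulate 150-bp paired-end HiSeq 2500 reads at 25X per haplotype,
    ## align to the ancestral sequence, and merge into one BAM per individual.
    for (hap in c("hapA", "hapB")) {
      system(sprintf("art_illumina -ss HS25 -i %s.fa -p -l 150 -f 25 -m 400 -s 10 -o %s_",
                     hap, hap))
      system(sprintf("bwa mem ancestral.fa %s_1.fq %s_2.fq | samtools sort -o %s.bam -",
                     hap, hap, hap))
    }
    system("samtools merge ind01_50X.bam hapA.bam hapB.bam")
    ## downsampling to 5-30X would then use, e.g., samtools view -s on this BAM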

For each sorted BAM file, we called genotypes using the ‘HaplotypeCaller’ algorithm in Genomic Variant Call Format (GVCF) mode as implemented in GATK v4.1.9.0 [53]. For each level of coverage, individual GVCF files were combined using ‘CombineGVCFs’ and genotyped using ‘GenotypeGVCFs’. We applied ‘VariantFiltration’ to these VCF files in GATK to flag SNPs with low variant confidence (QualByDepth < 2), exhibiting strand bias (FisherStrand > 40), or with low mapping quality (RMSMappingQuality < 20). Finally, SNPs failing these filters and indels were removed using ‘SelectVariants.’ Although population history and genetic dataset characteristics can influence SNP filtering decisions, we chose to apply consistent filtering across all demographic scenarios to instead focus on how changing software parameters affect ROH inference. The effects of various SNP-filtering strategies on ROH inference have been addressed elsewhere [36,54]. Specifically, the impact of linkage disequilibrium (LD) SNP pruning on ROH calling has been assessed by Meyermans et al. [36] and Howrigan et al. [54], with differing results across populations including inhibition of ROH detection. To avoid introducing additional potential biases, we opted not to prune the data with respect to LD.
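
The flagging and removal thresholds above can be expressed as in the hedged sketch below, which assembles GATK 4 commands from R. It assumes gatk is on the PATH, uses illustrative file names, and relies on the annotation keys QD, FS, and MQ corresponding to QualByDepth, FisherStrand, and RMSMappingQuality.

    ## Sketch: flag low-confidence SNPs, then drop flagged sites and indels.
    vcf <- "cohort_15X.vcf.gz"   # hypothetical genotyped multisample VCF

    flag_cmd <- paste("gatk VariantFiltration -V", vcf,
                      "--filter-expression 'QD < 2.0'  --filter-name lowQD",
                      "--filter-expression 'FS > 40.0' --filter-name highFS",
                      "--filter-expression 'MQ < 20.0' --filter-name lowMQ",
                      "-O cohort_15X.flagged.vcf.gz")

    select_cmd <- paste("gatk SelectVariants -V cohort_15X.flagged.vcf.gz",
                        "--select-type-to-include SNP --exclude-filtered",
                        "-O cohort_15X.snps.filtered.vcf.gz")

    system(flag_cmd)
    system(select_cmd)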

ROH calling: Hidden Markov model approach (BCFtools).

We applied the same ROH calling approaches to all multisample VCF files produced from the simulated data set using two of the programs most commonly applied to non-model species. First, we tested an extension of the BCFtools software package, BCFtools/RoH v1.17 [33]. This program uses a hidden Markov model to detect regions of autozygosity, requiring only a VCF file for all samples, population allele frequency information, and an optional recombination map. Because additional genetic information is not likely to be available for many wild populations, we relied on allele frequencies calculated from each of our sample sets. The main decision faced when running BCFtools/RoH is whether to estimate autozygous regions using called genotypes or genotype likelihood values. We tested the effects of this decision on ROH estimation by either including the --GTs-only setting to limit inference based on genotypes (hereafter, BCFtools Genotypes) or omitting it and allowing genotype likelihood values to be considered (hereafter, BCFtools Likelihoods) (Table 1). There are also two HMM options that can be set: --hw-to-az, which sets the transition probability from the Hardy-Weinberg state to the autozygous state, and --az-to-hw, which sets the transition probability from the autozygous state to the Hardy-Weinberg state. The default values for these two probabilities are 6.7 x 10−8 and 5 x 10−9, respectively. However, BCFtools/RoH can be run in Viterbi training mode (--viterbi-training option) to estimate these two probabilities for a specific data set. In total for BCFtools/RoH, we applied both methods, Genotypes and Likelihoods, using default transition probabilities and population-specific probabilities calculated using the --viterbi-training option and a custom R script (available in the GitHub repository for this project) to each demographic scenario and coverage level.
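
For concreteness, the two BCFtools/RoH configurations can be invoked as in the sketch below (shell commands assembled from R). It assumes bcftools is on the PATH, a bgzipped multisample VCF, and that per-site allele frequencies are available in an INFO/AF tag computed from the sample set; file names are illustrative and this is not the authors' exact command line.

    vcf <- "small_pop_15X.vcf.gz"    # hypothetical multisample VCF

    ## Genotypes mode: ignore genotype likelihoods (--GTs-only assigns a fixed
    ## phred-scaled likelihood to each called genotype)
    system(sprintf("bcftools roh --GTs-only 30 --AF-tag AF -o roh_GT.txt %s", vcf))

    ## Likelihoods mode: omit --GTs-only so FORMAT/PL values are used
    system(sprintf("bcftools roh --AF-tag AF -o roh_PL.txt %s", vcf))

    ## Viterbi training estimates data-set-specific transition probabilities;
    ## trained values can then be supplied via --hw-to-az and --az-to-hw
    system(sprintf("bcftools roh --GTs-only 30 --AF-tag AF --viterbi-training 1e-10 -o roh_GT_vt.txt %s",
                   vcf))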

Table 1. Parameter values applied during ROH calling for both simulated and empirical data.

For PLINK, a total of 486 combinations were tested. PLINK default values are underlined. ROH = run of homozygosity.

https://doi.org/10.1371/journal.pcbi.1012566.t001

ROH calling: Sliding window approach (PLINK).

We tested a large number of parameter value combinations in PLINK v1.90b6.26 [31,32]. Unlike BCFtools/RoH, PLINK employs a sliding window approach to ROH identification: for each window placement, SNPs are examined for conformity to the PLINK parameter values (e.g., no more than the allowed number of heterozygous or missing calls). For each SNP, PLINK then determines whether a sufficient proportion of the windows overlapping that SNP are homozygous and thus whether the SNP is located within a ROH. PLINK has multiple parameters that can be set by the user, and we initially tested a total of 486 combinations of six of these parameters for each level of coverage (see Table 1 for the list of parameters, initial values, and parameter descriptions). To explore the effects of these combinations on ROH inference, we applied an iterative approach designed by Mathur et al. [55], using visualization to assess how varying parameter settings influence overall individual autozygosity (FROH) estimates across samples within each population and coverage level (S1 Box).
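
The parameter sweep can be automated as in the following sketch, which builds a grid of PLINK --homozyg settings and runs each combination from R. It assumes PLINK 1.9 is on the PATH and that the VCF has already been converted to a binary fileset (data.bed/.bim/.fam); the parameters shown are standard --homozyg options, but the levels are placeholders rather than the values in Table 1.

    ## Sketch: run PLINK's sliding-window ROH caller over a grid of settings.
    grid <- expand.grid(win_snp = c(50, 100),      # --homozyg-window-snp
                        win_het = c(1, 3),         # --homozyg-window-het
                        min_snp = c(25, 50, 100),  # --homozyg-snp
                        min_kb  = c(100, 300),     # --homozyg-kb
                        density = c(50, 100),      # --homozyg-density
                        gap     = c(500, 1000))    # --homozyg-gap

    for (i in seq_len(nrow(grid))) {
      p <- grid[i, ]
      cmd <- paste("plink --bfile data --homozyg",
                   "--homozyg-window-snp", p$win_snp,
                   "--homozyg-window-het", p$win_het,
                   "--homozyg-snp",        p$min_snp,
                   "--homozyg-kb",         p$min_kb,
                   "--homozyg-density",    p$density,
                   "--homozyg-gap",        p$gap,
                   "--out", sprintf("roh_run_%03d", i))
      system(cmd)   # each run writes a .hom file with the called ROHs per sample
    }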

For each iteration, we performed four steps:

  1. Run PLINK with all possible combinations of different parameters to be tested, ultimately generating a matrix of parameter values (predictor variables) and inferred FROH (response variable) for each sample.
  2. Create a linear model relating inferred FROH to the tested parameter values (FROH = b0 + Σi bi xi, where bi = weight of parameter xi), with the values of each parameter xi standardized.
  3. Extract standardized rank regression coefficients (SRC) from the linear regression models using the sensitivity package in R and visualize sensitivity indices for each parameter (SRCi) to rank their weights [56] (a minimal sketch of this step is given after this list).
  4. If SRCi ≈ 0 with little individual variation across samples, then set the parameter i to its default value. If SRCi is > 0 or < 0, then consider the effect described by SRCi (i.e., whether increasing the value of the parameter increases or decreases FROH and how SRCi varies with called FROH) and either define a new set of parameter values to test or select a value from the tested set. The rationale for this step is that if the SRCi for the parameter i is approximately zero (SRCi ≈ 0), the default value for the parameter is retained, as it is unlikely to have a significant influence on the FROH estimates. On the other hand, when SRCi is either greater than 0 (SRCi > 0) or less than 0 (SRCi < 0), this suggests that the parameter does influence FROH estimates. Therefore, when SRCi deviates from zero, we investigate the magnitude and direction of its effect on FROH to guide our decision making.
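
A minimal sketch of steps 2–3 for a single sample is given below; it assumes 'runs' is a data frame with one row per PLINK parameter combination, the tested parameters as columns, and the inferred FROH in a column named froh (all names illustrative), and that the sensitivity package is installed.

    library(sensitivity)

    params <- c("win_snp", "win_het", "min_snp", "min_kb", "density", "gap")

    ## standardize parameter values (step 2) and fit the linear model via src(),
    ## which returns standardized regression coefficients (step 3)
    X <- as.data.frame(scale(runs[, params]))
    y <- runs$froh

    fit <- src(X, y, nboot = 100)
    print(fit$SRC)   # SRC_i near zero suggests parameter i has little effect on F_ROH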

We began the first iteration by reading the results from the initial 486 combinations of parameter values for each demographic scenario into R v4.0.3 [44]. After examining the output figures, we selected new sets of parameter values to test for each demographic scenario; the final parameter values tested are outlined in S2 Table and reflect these adjustments.

Data summarization and statistical analyses.

Output files from BCFtools/RoH and the final PLINK runs were read into R for summarization and statistical analyses. We also read in true ROH data (i.e., start and end coordinates for known ROHs ≥ 100 kb in length) and calculated true FROH values for each individual. We filtered all called ROHs to retain ROHs ≥ 100 kb in length and calculated inferred FROH for each demographic scenario, individual, coverage level, and method. To describe relationships between true FROH and called FROH values, we constructed a linear model for each method and coverage level with true FROH as the predictor variable and called FROH as the response variable. For each model, we calculated the 95% confidence intervals (CIs) for the slope and y-intercept parameters using the confint function in R. To determine whether true and called FROH values differed for each model, we tested whether the model’s y-intercept differed from zero and whether the slope differed from one (i.e., whether the 95% CIs included zero or one, respectively). We also used the y-intercept and slope parameters to compare the results obtained from each BCFtools method when using default vs. Viterbi-trained HMM transition probabilities within each demographic scenario and, for all three approaches, to determine whether each method over- or underestimated true FROH at each coverage level and how the degree of over- or underestimation changed with increasing true FROH values and across demographic scenarios.
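
The model-based comparison can be carried out as in this brief sketch, assuming 'd' is a data frame holding true_froh and called_froh for one method and coverage level (column names illustrative):

    fit <- lm(called_froh ~ true_froh, data = d)
    ci  <- confint(fit, level = 0.95)

    ## called and true F_ROH are judged to differ if the intercept CI excludes 0
    ## or the slope CI excludes 1
    intercept_differs <- ci["(Intercept)", 1] > 0 | ci["(Intercept)", 2] < 0
    slope_differs     <- ci["true_froh",   1] > 1 | ci["true_froh",   2] < 1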

To further evaluate the accuracy of each ROH identification method, we also calculated false negative (i.e., failing to call a ROH present in an individual) and false positive (i.e., calling a ROH that was not present in an individual) rates for called ROHs. We began by identifying overlap between true and called ROHs on a per-position basis by summing the number of bases covered by both the true ROH and called ROH(s). From this information, we calculated (i) the false negative rate: the total chromosomal length covered by true ROHs but not by called ROHs divided by the total length of true ROHs; and (ii) the false positive rate: the total chromosomal length covered by called ROHs but not by true ROHs divided by the total chromosomal length not covered by true ROHs. For each demographic scenario, method, and level of coverage, we calculated median false positive and negative rates and compared these medians and the 50% quantiles between all method and coverage level combinations to provide insight into method-specific differences in ROH calling errors.
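
These per-position rates can be computed with simple base-pair masks, as in the sketch below; it assumes 'true_roh' and 'called_roh' are data frames of start and end coordinates for one individual on the 30-Mb chromosome (names illustrative).

    roh_mask <- function(roh, chrom_len) {
      covered <- logical(chrom_len)            # one element per base position
      for (i in seq_len(nrow(roh)))
        covered[roh$start[i]:roh$end[i]] <- TRUE
      covered
    }

    error_rates <- function(true_roh, called_roh, chrom_len = 30e6) {
      t_mask <- roh_mask(true_roh,   chrom_len)
      c_mask <- roh_mask(called_roh, chrom_len)
      c(false_negative = sum(t_mask & !c_mask) / sum(t_mask),    # true ROH bases missed
        false_positive = sum(!t_mask & c_mask) / sum(!t_mask))   # non-ROH bases called as ROH
    }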

We calculated FROH for ROHs in four different length bins to explore how ROH identification methods may differ in their capabilities to accurately call ROHs of different sizes. We defined length bins as:

  1. Short ROHs: ≥ 100 kb and < 500 kb in length;
  2. Intermediate ROHs: ≥ 500 kb and < 1 Mb in length;
  3. Long ROHs: ≥ 1 Mb and < 2 Mb in length; and
  4. Very long ROHs: ≥ 2 Mb in length.

We examined how FROH for each bin changed with increasing coverage and also how patterns of over- and underestimation of FROH varied with increasing coverage by subtracting true FROH from called FROH for each individual. For each method, level of coverage, and length bin, we compared mean called FROH−true FROH and the 95% CI around these means (estimated using the quantile function in R), with CIs < 0 indicating underestimation of true FROH and CIs > 0 indicating overestimation. We further explored relationships between true and called ROHs by examining how true and called ROHs overlap. We tabulated how many true ROHs each called ROH overlaps (or contains) and vice-versa for each unique combination of demographic scenario, ROH detection method, coverage level, and ROH length bin.
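
The binning and length-specific FROH calculations follow directly from these definitions; a minimal sketch (assuming 'roh' is a data frame of called ROHs with columns sample, start, and end, and a 30-Mb assayed length; names illustrative) is:

    genome_len <- 30e6
    roh$length <- roh$end - roh$start + 1
    roh$bin    <- cut(roh$length,
                      breaks = c(1e5, 5e5, 1e6, 2e6, Inf),
                      labels = c("short", "intermediate", "long", "very_long"),
                      right  = FALSE)

    ## length-bin-specific F_ROH per sample: summed ROH length in the bin
    ## divided by the total assayed genome length
    froh_by_bin <- aggregate(length ~ sample + bin, data = roh,
                             FUN = function(x) sum(x) / genome_len)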

Part II: Empirical data

Data curation and genotype calling.

To test the effects of program and parameter value selection on identifying ROHs from empirical data, we analyzed publicly available whole genome sequencing data for a species of conservation concern, the Tasmanian devil (BioProject PRJNA549794 in NCBI’s Sequence Read Archive; [57]). From the full data set, we selected the 15 individuals with the highest number of reads. The accession numbers and relevant metadata for each set of sequences are provided in S1 Table. Adapters and low-quality bases were trimmed from raw sequences using Trim Galore v0.6.6 [58], and cleaned reads were mapped to the chromosome-level mSarHar1.11 S. harrisii reference genome (NCBI GenBank accession GCA_902635505.1) using BWA-MEM [51]. This species was chosen due to its known demographic history of severe population declines [43,59–61], allowing us to draw comparisons between the empirical results and the outcomes observed in our analyses for the simulated declining population. We used Qualimap v2.2.1 to determine mean coverage per individual from each sorted BAM file [62]. These results were used to calculate the downsampling proportions required to approximate 5X, 10X, 15X, and 30X coverage for each individual. Following downsampling, BAM files were processed in the same manner as for the simulated data, with additional SNP filtering criteria applied in VCFtools v0.1.17 [63], including filtering SNPs within 5 bp of indels and requiring minor allele frequencies ≥ 0.05 and < 20% missing data across individuals.
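
The downsampling fractions follow directly from the Qualimap coverage estimates, e.g. as in this sketch (sample names and coverages are illustrative; assumes samtools is on the PATH):

    mean_cov <- c(devil01 = 48.2, devil02 = 51.7)   # Qualimap mean coverage per sample
    targets  <- c(5L, 10L, 15L, 30L)                # target coverages

    for (s in names(mean_cov)) {
      for (t in targets) {
        frac <- min(t / mean_cov[[s]], 1)           # proportion of reads to retain
        system(sprintf("samtools view -b -s %.4f %s.sorted.bam > %s.%dX.bam",
                       frac, s, s, t))
      }
    }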

ROH calling and sensitivity analyses.

We called ROHs from the final multisample VCF files using the same approaches as for the simulated data. We called ROHs in two ways: (i) using BCFtools/RoH (i.e., relying on genotypes or on genotype likelihood values and using default or Viterbi-trained HMM transition probabilities) and (ii) testing 486 parameter combinations in PLINK at each level of coverage and visualizing the effect of altering each parameter following the same sensitivity analysis process described above. The parameter values for all iterations tested are provided in S2 Table, with additional details on the workflow outlined in S1 Box.

Data summarization and statistical analysis.

Output files from BCFtools/RoH and the final PLINK runs for the empirical data were read into R for summarization and statistical analyses. Following the approach we used for the simulated data, we filtered all called ROHs to retain ROHs ≥ 100 kb in length and calculated inferred FROH for each individual in each scenario, coverage level, and method. We also calculated FROH for ROHs in four different length bins, where length bins were defined as described above for the simulated data. To compare results across methods, coverage levels, and ROH lengths, we calculated mean length-specific FROH values and compared the 95% CIs around these means among methods and coverage levels.

Results

Part I: Simulated data

Data collection and curation.

For each simulated population, all analyses were based on 50 randomly sampled individuals. For the declining population, extremely low heterozygosity in two individuals precluded estimation of HMM transition probabilities via Viterbi training, and all downstream analyses for this demographic scenario were based on the remaining 48 individuals. Mean heterozygosity (± SD) varied across scenarios: large population mean = 3.17 x 10−5 ± 1.26 x 10−5; bottlenecked population mean = 3.05 x 10−5 ± 5.81 x 10−6; small population mean = 2.87 x 10−5 ± 1.22 x 10−5; declining population mean = 2.44 x 10−5 ± 8.81 x 10−6 (S17 Fig). Mean FROH generally increased with decreasing mean heterozygosity: large population mean = 0.21; bottlenecked population mean = 0.48; small population mean = 0.62; declining population mean = 0.65 (Fig 1B, 1D, 1F, and 1H). For all demographic scenarios, length-specific FROH (Methods) mostly decreased with increasing ROH length with some notable exceptions in the bottlenecked and declining population scenarios (Fig 1C, 1E, 1G, and 1I). For the bottlenecked and declining population scenarios, a few individuals had high proportions of their genomes located in very long ROHs (Fig 1G and 1I), consistent with very recent inbreeding events in each population. Following downsampling and SNP filtering, the final mean coverage across all demographic scenarios was 4.87, 9.77, 14.69, 29.20, and 44.99 for the 5X, 10X, 15X, 30X, and 50X VCF files, respectively.

ROH calling accuracy across methods.

Before comparing results from the three main methods—BCFtools Genotypes, BCFtools Likelihoods, and PLINK—we examined the effect of using either default or Viterbi-trained HMM transition probabilities on the results produced by each BCFtools method. Within each method, we compared overall FROH values obtained when using default HMM transition probabilities (default values: --hw-to-az = 6.7 x 10−8, --az-to-hw = 5 x 10−9) against those calculated when using probabilities estimated via Viterbi training (mean values across all groups: --hw-to-az = 9.2 x 10−7, --az-to-hw = 3.8 x 10−6; S3 Table). (In practice, mean probabilities were calculated and applied for each demographic scenario and BCFtools method combination, but note the small magnitude of variation in Viterbi-trained probabilities across all groups relative to the difference between these probabilities and the program defaults.) The slope and intercept parameters never significantly differed between the two sets of results across all demographic scenarios and coverage levels, but for most comparisons, the slope and intercept parameters of the model built using Viterbi-trained results were closer to 1 and 0, respectively, than for the default values model (S4 Table and S1–S4 Figs). Using Viterbi-trained transition probabilities slightly increased false positive rates but also decreased false negative rates (S5–S8 Figs). Because of these slight, though not statistically significant, improvements in overall FROH estimation accuracy and false negative rates, we chose to proceed with Viterbi-trained HMM transition probabilities for both BCFtools methods and present only these results below (we also describe one additional benefit of Viterbi-trained probabilities below).

We used our simulated data set and linear models to determine whether each approach tends to over- or underestimate true FROH. All three methods underestimated FROH, with all model intercepts across coverage levels negative and different from zero, except for PLINK when calling ROHs in the large population at 5X coverage (i.e., no other 95% CIs for intercepts included zero; Fig 2 and S5 Table). Within each demographic scenario, the three methods produced similar results with respect to overall FROH accuracy. For example, within all four scenarios, intercepts did not differ among methods or coverage levels, with exceptions at 5X coverage when using PLINK: intercepts for this group differed from others in the large and declining populations and across the majority of coverage levels in the small population. Perhaps the most visually striking points of comparison across demographic scenarios are the slope estimates specific to each. Within each method, demographic scenarios were consistently ordered by slope estimates, with the declining population having the largest slope estimate (for BCFtools, all slopes > 1 with no 95% CIs including one), followed by the small, bottlenecked, and large populations (for the large population, all BCFtools slopes < 1, all PLINK slope 95% CIs include one; Fig 2 and S5 Table). Thus, accuracy of overall FROH estimates calculated for the declining population varied the most with respect to true FROH (i.e., of all demographic scenarios, slope parameter estimates for this population most greatly differed from one), followed by the large population. With respect to true FROH, accuracy increased with increasing true FROH for the declining population and decreased with increasing true FROH for the large population.

Fig 2. The relationship between true and inferred FROH values depends on inference method and population demographic scenario.

Each regression line represents linear model results for a single level of coverage with the shaded areas representing 95% confidence intervals. Each point represents data for a single simulated individual. Panels display outcomes using BCFtools in Genotypes mode (A-D), BCFtools in Likelihoods mode (E-H), and PLINK (I-L), as well as by population scenario: large (A, E, I), small (B, F, J), bottlenecked (C, G, K), and declining (D, H, L) populations. The dashed line is the 1:1 line and x- and y-axes are consistent within each demographic scenario. Note the differing slopes across demographic scenarios (e.g., among panels A-D) and differing overall accuracies across methods (e.g., differing distances between regression lines and 1:1 line among panels D, H, and L). Another version of this figure with consistent axis limits across panels and colorized by sequencing depth is available in S19 Fig.

https://doi.org/10.1371/journal.pcbi.1012566.g002

We calculated false negative (i.e., failing to call a ROH present in an individual) and false positive (i.e., calling a ROH that was not present in an individual) rates to further assess each method’s accuracy. With respect to false positive rates, PLINK performed poorly relative to the other two methods, with false positive rates consistently higher than BCFtools Genotypes and BCFtools Likelihoods (Fig 3A). Across all demographic scenarios and coverage levels, median false positive rates ranged from 6.14 x 10−7 to 0.031 for BCFtools Genotypes, from 4.54 x 10−8 to 0.007 for BCFtools Likelihoods, and from 0.008 to 0.112 for PLINK. For all three methods, increasing coverage to 10X corresponded to decreasing false positive rates, but rates did not continue to improve at coverages above 10X (i.e., 50% quantiles around median values overlap at 10-50X within each method and demographic scenario; results for all coverage levels provided in S9 and S10 Figs). Variation in false positive rates among samples at each coverage level was smallest for BCFtools Likelihoods, followed by BCFtools Genotypes, with PLINK showing the greatest variation across samples (summary statistics provided in S6 Table).

Fig 3. PLINK outperforms BCFtools with respect to false negative rates, but underperforms with respect to false positive rates.

A) False positive (i.e., incorrectly calling a base position as being located in a ROH) and B) false negative (i.e., failing to identify a base position as being located in a ROH) rates across demographic scenarios and methods. Horizontal lines indicate median values and shaded boxes are 50% quantiles. Note the difference in scale of y-axis between panels A and B. Both BCFtools approaches outperform PLINK with respect to false positive rates but the reverse is true for false negative rates. Increasing coverage corresponds to decreasing false positive rates and to increasing false negative rates. Values displayed for 5X and 50X coverages; data for all coverage levels presented in S9 and S10 Figs, as well as a scatter plot in S18 Fig.

https://doi.org/10.1371/journal.pcbi.1012566.g003

Patterns in false negative rates were generally the reverse of those we observed for false positives: both BCFtools methods performed poorly relative to PLINK, with BCFtools Genotypes producing slightly lower rates (median range across all demographic scenarios and coverage levels = 0.36–0.74) than BCFtools Likelihoods (median range = 0.49–0.78; Fig 3B). PLINK exhibited lower false negative rates than the BCFtools approaches (median range = 0.32–0.53) and less variation among samples at each coverage level. All three methods produced false negative rates that increased slightly at 10X coverage relative to 5X, with rates leveling off at 10X and higher coverage levels. Examples of false negative and false positive scenarios can be seen in S10 Fig, which illustrates a full chromosome of true and called ROHs for one exemplar individual from each demographic scenario.

We also examined how true and called values of FROH varied for ROHs of different lengths. For the simulated data, all three methods almost always underestimated the proportion of the genome located in short ROHs across demographic scenarios, with the 95% CI less than zero for all tests other than PLINK at 5X coverage (Fig 4; all levels of coverage presented in S12–S15 Figs). For longer ROHs, results were variable and dependent on demographic scenario. For example, all three methods appeared to approach accuracy for longer ROHs in the large population, but this is likely due to the relative paucity of longer ROHs in this demographic scenario (Fig 1). For the bottlenecked and declining populations (wherein some individuals had very long ROHs), both PLINK and BCFtools Genotypes tended to overestimate FROH for very long ROHs. PLINK generally produced the highest overestimates of FROH and the most variation across samples of the three approaches, followed by BCFtools Genotypes. Finally, one coverage-related trend emerged across ROH length categories, methods, and demographic scenarios, with FROH estimates calculated at 5X coverage often exceeding estimates calculated at higher coverage levels. Across all length bins and demographic scenarios, individual estimates of length-specific FROH calculated at 5X were greater than those calculated at 10X for 37%, 53%, and 53% of BCFtools Likelihoods, BCFtools Genotypes, and PLINK estimates, respectively.

Fig 4. Increasing true ROH length corresponds to increasing detection.

Called FROH−True FROH displayed by length bin (short, intermediate, long, very long) and demographic scenario (A: large population; B: small population; C: bottlenecked population; D: declining population) at 15X (results for all coverage levels presented in S9–S10 Figs). For BCFtools Genotypes and PLINK, FROH for short ROHs is consistently underestimated whereas FROH for very long ROHs is overestimated when these ROHs are present. BCFtools Likelihoods does not overestimate ROHs in any length bin.

https://doi.org/10.1371/journal.pcbi.1012566.g004

To further investigate how called ROHs correspond to true ROHs, we identified regions of overlap between true and called ROHs within each individual and at each coverage level using a unique identifier for each true and called ROH. We found no instances of true ROHs being split into multiple called ROHs, but multiple true ROHs were often lumped together into a single called ROH. This pattern held true for all three methods at most coverage levels (Figs 5 and S16). For BCFtools Genotypes and PLINK, increasing coverage did not appear to ameliorate this problem (i.e., the mean number of true ROHs lumped into a single called ROH changed very little with increasing coverage; Fig 5C and 5D). However, for BCFtools Likelihoods, the number of true ROHs contained in a single called ROH decreased with increasing coverage, reaching a 1:1 ratio at 30X. Across methods, the mean number of true ROHs combined into a single called ROH increased with increasing ROH length, except for BCFtools Likelihoods at coverage levels ≥ 30X (S16 Fig). This lumping can be seen in Fig 5B.

Fig 5. All three methods tested combine multiple true ROHs into single called ROHs, with increasing coverage only providing improvements for BCFtools Likelihoods.

A) Diagram illustrating this lumping issue. B) Examples of this issue at 5X and 50X in a single simulated individual drawn from the small population demographic scenario. C) Number of true ROHs combined into a single called ROH for ROHs of varying lengths when called by all three methods at 5X and (D) at 50X in the small population (results for all coverage levels and demographic scenarios provided in S16 Fig). Points correspond to mean values and vertical and horizontal error lines indicate 95% confidence intervals. Dashed horizontal line corresponds to y = 1 (a 1:1 relationship between numbers of true and called ROHs).

https://doi.org/10.1371/journal.pcbi.1012566.g005

Part II: Empirical data

Genotype and ROH detection in Tasmanian devils.

For the 15 sets of reads we downloaded from NCBI, the mean number of reads per sample was 9.75 x 10^8. Read mapping rates to the mSarHar1.11 S. harrisii reference genome were high, with an average of 95.4% of reads mapped and properly paired. For the final sets of filtered SNPs (n = 1,532,598), average depth across samples was 48.43 for the full coverage set (i.e., not downsampled) and 6.37, 11.84, 16.63, and 30.75 for the 5X, 10X, 15X, and 30X downsampled sets, respectively (S1 Table). Given the modest improvements in FROH accuracy conferred by using Viterbi-trained HMM transition probabilities rather than the default values for the simulated data, we calculated Viterbi-trained values for both BCFtools methods applied to the empirical data.

Across methods, FROH estimated at 5X coverage was significantly higher than FROH estimates at all higher levels of coverage (95% CIs did not overlap, Fig 6A–6C). At 5X coverage, FROH estimates differed significantly among all three methods, with the lowest estimate produced by PLINK and the highest by BCFtools Genotypes. For all higher levels of coverage, FROH estimates produced by the two BCFtools methods did not differ from one another but estimates from both methods differed from those produced by PLINK, with PLINK again providing lower estimates.

Fig 6. When applied to the empirical data set, the three ROH calling methods differ greatly in their inferences, particularly at 5X coverage.

A-C) Overall FROH and (D-G) length bin FROH results for each method and level of coverage, with means and 95% confidence intervals indicated by points and vertical lines, respectively. Lighter background lines indicate results for individual samples.

https://doi.org/10.1371/journal.pcbi.1012566.g006

When comparing how the three methods estimated length-specific FROH values, patterns varied across ROH length categories. For short ROHs, PLINK produced much lower FROH estimates than the two BCFtools methods across all coverage levels, with differences among the three methods significant (i.e., non-overlapping 95% CIs) at 5X coverage (Fig 6D). For longer ROHs, results of the three methods were generally concordant, with all methods producing higher estimates of FROH at 5X than at higher coverages (Fig 6D–6G). However, PLINK produced much higher estimates of FROH for very long ROHs at 5X than the other two methods, with a mean value more than double those inferred by the BCFtools methods (Fig 6G).

Discussion

In this study, we systematically analyzed the discrepancies that arise in ROH detection when populations with varied demographic histories are analyzed, and we highlight the effects of these differences on downstream interpretation. This is important because the demographic scenarios affected estimation accuracy, with declining populations showing the highest variance in FROH estimation relative to true FROH values. Additionally, ROH detection accuracy varied with sequencing coverage; improvements were observed up to a 10X coverage threshold, beyond which the benefits plateaued. Our findings emphasize the impact of choosing appropriate computational parameters and underscore the challenges in standardizing ROH detection methods and parameter values (e.g., [26,28–30]), highlighting the necessity for robust, systematic comparisons to enhance reproducibility and reliability in genomic studies.

We demonstrated the exploratory utility of the sensitivity analysis process we followed to select parameter values for our data (S1 Box). For the data considered here, this exploration process suggested the default parameters were reasonable selections for our analyses (S20 Fig). This process is important because disparate sequencing data characteristics are likely to require different parameter values, meaning that it may not be appropriate to use the values we used herein when analyzing other data. For example, studies that use fewer SNPs (e.g., populations that are less genetically diverse, studies with reduced sequencing efforts) should test the effects of altering the required minimum SNP density on ROH inference results. Sensitivity analysis provides a relatively quick and convenient way to visualize how different parameter values affect FROH estimates for an entire data set and the degree of variation in those effects across individuals.

Inference methodology biases FROH values

Between the two BCFtools methods when considering identified ROHs of all lengths, Genotypes produced more accurate overall FROH estimates than Likelihoods, with FROH estimates from Likelihoods also increasingly diverging from the true FROH value with increasing true FROH in the genome (Fig 2). For populations expected to have considerable variation in FROH among individuals (e.g., a population that has remained somewhat small for an extended period of time with evidence of recent immigration), applying the BCFtools Likelihoods approach could result in increasingly skewed values for the individuals with the highest levels of inbreeding. For example, using the linear model parameters estimated for 15X coverage, an individual with a true FROH of 0.10 would be assigned an inferred FROH of 0.01 (difference = -0.09), whereas an individual with a true FROH value of 0.40 would be assigned 0.23 (difference = -0.17). This could be particularly problematic when dealing with species or populations of conservation concern because the individuals with the highest true FROH also have the largest magnitude of error, meaning that concerning signals of inbreeding could go undetected.

In contrast to the underestimations produced by the BCFtools/RoH methods, the sliding window approach implemented in PLINK overestimated FROH. This was particularly evident at 5X coverage, where FROH estimates differed more from their true values than for any other method and coverage level combination in our study (Fig 2). However, at coverages above 5X, PLINK produced better estimates than either BCFtools approach (i.e., in our linear models, intercepts for PLINK at 10X-50X are closer to zero than for either BCFtools method and 95% CIs for these parameter estimates do not overlap with any BCFtools intercept 95% CIs). In the context of endangered species conservation, small overestimations of FROH may be more desirable than underestimations because these are likely to be more conservative (i.e., indicating more close inbreeding than is present in reality) in many situations. Importantly though, as with BCFtools Likelihoods, FROH estimates diverged from true FROH at increasing values of true FROH. However, these values diverged at a much lower rate in the PLINK estimates compared to BCFtools Likelihoods. Again, using our simulated data as a model, an individual with a true FROH value of 0.40 would be assigned an FROH of 0.46 (difference = 0.06) when estimated at 10X-50X with PLINK.

For the two BCFtools methods, patterns of underestimation were consistent with these approaches’ high false negative rates and low false positive rates (Fig 3), indicating that they are more conservative in identifying ROHs and may miss some shorter regions. Conversely, PLINK produced higher false positive rates and lower false negative rates than either BCFtools method, consistent with overestimation of FROH. In terms of absolute difference between true and called FROH values (S12–S15 Figs), PLINK outperformed BCFtools at 10X coverage and above, suggesting that PLINK will often provide the most robust estimate of FROH. However, at lower coverages (5X-10X), BCFtools Genotypes could be considered, given that this method produces FROH estimates closer to true FROH than either PLINK or BCFtools Likelihoods. On the other hand, the underestimates produced by this approach are likely related to the high false negative rates we observed (especially relative to PLINK), and the appearance of convergence on true FROH may be due to length-specific ROH calling rates by this program (see below) and may therefore be highly variable across populations. It is important to note that while the trends we describe may be consistent with some empirical results (e.g., [5]), individual variation in genomic characteristics exerts strong influence over FROH inference results as suggested by comparisons between our simulated and empirical results.

The differences in behavior between PLINK and the two BCFtools methods (Figs 3 and S5–S8) emphasize the need for careful consideration when selecting a tool for ROH calling. The higher false positive rates of PLINK may be acceptable in conservation contexts, as this method tends to overestimate inbreeding, thus erring on the side of caution. However, BCFtools, with its more conservative approach that leads to lower false positive rates but higher false negative rates, may be more appropriate for studies requiring strict accuracy. Ultimately, the choice of tool should be informed by the specific goals of the study, population characteristics, and acceptable trade-offs between false positive and negative rates. Employing both tools and cross-validating results could provide a more balanced understanding of inbreeding patterns across populations.

Demographic history influences FROH value accuracy

The relationship between true and inferred FROH values varied with demographic scenario, demonstrating the potential for uncertainty when incorporating these inbreeding values into management action plans. For example, in the large population where true FROH values are small (< 0.3), inferred FROH accuracy decreased with increasing values of true FROH for the BCFtools methods, with both tending to underestimate FROH with increasing true FROH (Fig 2A and 2E). The opposite trend occurred for the declining population, where true FROH values exceed 0.9, with all three methods approaching more accurate FROH estimates with increasing values of true FROH (Fig 2D, 2H, and 2L). These contrasting patterns are likely due to different ROH length distributions in these two demographic scenarios. In the declining population, ROHs in the individuals with the highest overall FROH values tend to comprise very long ROHs (Fig 1I). Consistent overestimation of FROH for these very long ROHs by BCFtools Genotypes and PLINK directly leads to higher inferred FROH estimates in these individuals. However, individuals with lower overall FROH values have a greater proportion of their ROHs in shorter length categories, which are underestimated by all three methods (Fig 4D), leading to inferred overall FROH estimates that diverge from true FROH as true FROH decreases (Fig 2D, 2H, and 2L). In the large population, although some individuals have higher true FROH values, most ROHs are still short (Fig 1C), and likely to be underestimated by all three methods. Therefore, as true FROH increases, so do the number of true short ROHs and the cumulative effects of underestimation across methods, leading to overall FROH estimates that diverge from true FROH as true FROH increases.

Overall, we found that the accuracy of ROH detection varies significantly across populations with different demographic histories. Our analysis revealed that declining populations exhibited the highest variance in FROH estimation relative to true FROH values. In contrast, large populations tended to show more stable estimates but with a general tendency for underestimation of true FROH, particularly as true FROH values increased. These findings highlight the need for an analysis approach that considers the demographic history of the population. For conservationists and researchers working with genomic data from populations known or suspected to be in decline, it is vital to interpret ROH analysis outputs with the breadth of demographic scenarios in mind. When demographic history is unknown and cannot be estimated, the distribution of FROH present in the dataset could be compared to the distributions simulated here (Fig 1B, 1D, 1F, and 1H), or to distributions of FROH simulated under more similar natural history parameters. Although tedious, these approaches will minimize the effects of misestimation of FROH, reducing the risk of exacerbating genetic diversity losses in populations and species that are most at risk of extinction.

Coverage ≤ 10X strongly influences called ROH lengths

Our results demonstrate that coverage levels at or below 10X significantly compromise the reliability of ROH estimates, with higher rates of false positives and false negatives affecting the accuracy of FROH values. This bias is evident not only in the overestimation of long ROHs but also in the overall discrepancy in ROH calling across all size bins (Figs 3 and 6). For the empirical data at 5X coverage, relative to higher coverage levels, all methods consistently produced lower FROH estimates for short ROHs and higher FROH estimates for longer ROHs (Fig 6). The overcalling of intermediate to very long ROHs at 5X could be related to the ROH-lumping issue noted in the simulated results, wherein multiple true ROHs are erroneously called as a single ROH (Fig 5). While we cannot confirm the accuracy of FROH inference for the empirical data, comparisons between results generated at 5X and higher levels of coverage are consistent with the simulated results of a declining population, suggesting that these patterns are accurate (Fig 7). For the Tasmanian devil samples we analyzed, the results from 5X coverage suggest much more frequent, recent inbreeding than the results from ≥ 10X coverage, painting a much more dire demographic scenario than is presented when more coverage is obtained. These findings serve as a caution for researchers using low coverage sequencing to assess FROH, as these data can lead to unreliable inferences. While lower coverage sequencing may be a tempting option for reducing costs [64], it introduces the risk of biased ROH estimates, and researchers should be aware of these limitations when interpreting the resulting ROHs. Therefore, if one of the goals of a whole-genome sequencing project is to assess recent or historical patterns of inbreeding from ROH lengths, ~10X coverage appears to be a minimum requirement for generating robust ROH inferences.

Fig 7. Comparison between empirical and simulated declining population across coverages.

In all tools, FROH is consistently underestimated. Increasing coverage from 5X to 10X can have significant effects on FROH estimates. A-C) Inferred FROH values for declining population data and D-F) inferred FROH values for empirical data at varying coverage levels for all three methods. True mean FROH values for simulated data are indicated by the horizontal dashed line. For the simulated data, error bars are bootstrapped 95% CIs and points represent mean values; lines for 15 randomly subsampled individuals are displayed for simplicity. For the empirical results, points represent mean values (n = 15) and error bars correspond to 95% CIs.

https://doi.org/10.1371/journal.pcbi.1012566.g007

Patterns of under- or overestimation may vary with ROH length distributions

In our simulated and empirical data, we observed patterns indicating that underlying ROH length distributions influence the patterns of FROH under- and overestimation. For example, even though PLINK produced higher FROH estimates than both BCFtools methods for the simulated data and PLINK and BCFtools Genotypes produced statistically indistinguishable estimates for the empirical data (Fig 2), length-specific FROH estimates suggest that differences in underlying true ROH length distributions between the simulated and empirical data may be responsible for the differences in relative FROH results we observed. For the simulated data, BCFtools Genotypes increasingly overestimated FROH as ROH length increased (Fig 4), with increasing numbers of true ROHs erroneously combined into single called ROHs (Fig 5). Although we cannot know the true ROH length distributions for the empirical data, long ROHs were called at higher frequencies in the empirical data relative to the simulated data. The tendency of BCFtools Genotypes to overestimate FROH for long ROHs combined with the presence of more called long ROHs in our empirical data set may have minimized differences in overall FROH estimates between BCFtools Genotypes and PLINK in the empirical results relative to the simulated results (Fig 2). Increased frequencies of long ROHs in the empirical data may have also led to greater differences in FROH between 5X and 10X across all three methods for the empirical results compared to the simulated results (Fig 2). All three methods called significantly more intermediate to very long ROHs from the empirical data at 5X than at 10X, and this may be related to the increased false positive rates we noted at 5X in the simulated data. These results again illustrate the effects of a population’s or individual’s actual ROH complement, which is determined by typically unknown demographic and breeding patterns, on the relative reliability and utility of ROH identification programs.

Particularly for endangered species with potentially complicated demographic histories, interpreting ROH patterns in a population may be most accurate when multiple tools are used to create an integrated picture. For example, comparing overall and length-specific FROH estimates between BCFtools/RoH and PLINK can help characterize the underlying length distribution: an abundance of shorter ROHs would be indicated by higher overall FROH estimates from PLINK than from BCFtools Genotypes but similar length-specific FROH patterns, whereas a ROH complement comprising many longer ROHs would be indicated by similar overall FROH estimates between PLINK and BCFtools Genotypes but higher intermediate to very long FROH estimates from BCFtools Genotypes relative to PLINK. Such assessments of past and ongoing inbreeding could then be used to inform management options, such as translocations to ameliorate close inbreeding. While our simulations focus on the parameters of ROH-calling programs, other variables, such as the number of generations since population decline, sample size, and simulated genome size, could also be explored to further quantify and understand the sources of bias and ultimately inform a more comprehensive understanding of how demographic factors influence ROH inference. In particular, considering demographic events that occurred over much longer time frames may illustrate the boundaries of ROH utility in investigating inbreeding, including how far back in time ROHs can reflect historical events while still providing relevant insights. Furthermore, the inclusion of additional sampled individuals might reveal more variation in identified ROHs, and longer simulated genomes may yield even longer ROHs than those we identified. The results presented here provide a foundation for these considerations, which could help establish a clearer framework for making informed choices among ROH detection tools across different species and population structures.
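
Comparing programs in this way requires placing their calls on a common footing. The hedged R sketch below reads PLINK --homozyg (.hom) output and BCFtools/RoH per-region ('RG') lines into one table so that overall and length-binned FROH can be contrasted between methods; the file names are hypothetical, and the assumed column layouts should be verified against the output of the specific program versions used.

```r
# Hedged sketch: load PLINK --homozyg (.hom) and BCFtools/RoH ('RG' line)
# calls into one table so that overall and length-binned F_ROH can be
# compared between methods with the helpers sketched above.

read_plink_hom <- function(path) {
  hom <- read.table(path, header = TRUE)  # expects IID, CHR, POS1, POS2, KB among the columns
  data.frame(sample = hom$IID, chrom = hom$CHR,
             start = hom$POS1, end = hom$POS2,
             length_bp = hom$KB * 1000, method = "PLINK")
}

read_bcftools_rg <- function(path) {
  # Keep only the per-region ('RG') lines; the assumed field order is
  # tag, sample, chromosome, start, end, length (bp), number of markers, quality.
  rg_lines <- grep("^RG", readLines(path), value = TRUE)
  rg <- read.table(text = rg_lines,
                   col.names = c("tag", "sample", "chrom", "start", "end",
                                 "length_bp", "n_markers", "quality"))
  data.frame(sample = rg$sample, chrom = rg$chrom,
             start = rg$start, end = rg$end,
             length_bp = rg$length_bp, method = "BCFtools")
}

# Hypothetical file names:
# calls <- rbind(read_plink_hom("cohort.hom"),
#                read_bcftools_rg("cohort_bcftools_roh.txt"))
```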

Conclusions

Inferring the presence and characteristics of ROHs can shed important light on population demographic histories, detect inbreeding depression when combined with fitness information, and even disentangle the mechanisms underlying, or the loci contributing to, inbreeding depression. However, given the variation in ROH-calling accuracy demonstrated here, we caution against direct comparisons of FROH values generated from different data types or sources or using different inference parameters. Data from disparate studies could be combined and re-analyzed in a standardized fashion, although special attention should be paid to variation in reference genome assembly quality for interspecific comparisons [14]. Regardless of the number of data sets to be analyzed, we strongly recommend that studies relying on ROH inference (i) employ at least two ROH-calling programs and interpret their results with each method's biases in mind, and (ii) compare multiple parameter value combinations via sensitivity analysis, taking care to vary the parameters of particular relevance to each data set.
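
As one minimal illustration of recommendation (ii), the sketch below loops PLINK --homozyg over a small grid of parameter values, writing one set of calls per combination for downstream comparison of FROH estimates. The input prefix, parameter values, and output naming are hypothetical and should be tailored to the parameters most relevant to a given data set; this is a starting point for a sensitivity analysis, not the procedure used in our study.

```r
# Minimal sensitivity-analysis sketch, assuming PLINK 1.9 is on the PATH and a
# binary fileset with the (hypothetical) prefix "cohort" exists. Each parameter
# combination writes its own .hom file, which can then be summarized with the
# F_ROH helpers above to gauge how strongly each setting shifts the estimates.
grid <- expand.grid(window_het = c(0, 1, 3),        # passed to --homozyg-window-het
                    min_kb     = c(100, 300, 1000)) # passed to --homozyg-kb

for (i in seq_len(nrow(grid))) {
  out_prefix <- paste0("roh_het", grid$window_het[i], "_kb", grid$min_kb[i])
  system2("plink",
          args = c("--bfile", "cohort",
                   "--homozyg",
                   "--homozyg-window-het", grid$window_het[i],
                   "--homozyg-kb", grid$min_kb[i],
                   "--out", out_prefix))
}
```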

Supporting information

S1 Table. SRA accession numbers and read statistics for Tasmanian devil sequencing data used in empirical analyses.

Mean depths at filtered SNPs are provided for VCF files produced from downsampled reads at each target coverage level (5X, 10X, 15X, 30X) and the undownsampled, full-coverage files.

https://doi.org/10.1371/journal.pcbi.1012566.s001

(CSV)

S2 Table. All PLINK parameter setting combinations tested via sensitivity analysis for the simulated and empirical data.

Default settings are underlined.

https://doi.org/10.1371/journal.pcbi.1012566.s002

(CSV)

S3 Table. Viterbi-trained HMM transition probabilities for all demographic scenarios, BCFtools/RoH methods, and levels of coverage.

Default values for both settings are provided directly below column headers.

https://doi.org/10.1371/journal.pcbi.1012566.s003

(CSV)

S4 Table. Linear model parameter estimates and corresponding 95% CI lower and upper limits for true FROH vs. called FROH across population demographic scenarios, BCFtools methods, HMM parameter selection approach, and levels of coverage.

Likelihoods = BCFtools Likelihoods; Genotypes = BCFtools Genotypes.

https://doi.org/10.1371/journal.pcbi.1012566.s004

(CSV)

S5 Table. Linear model parameter estimates and corresponding 95% CI lower and upper limits for true FROH vs. called FROH across demographic scenarios, methods, and levels of coverage.

Likelihoods = BCFtools Likelihoods; Genotypes = BCFtools Genotypes.

https://doi.org/10.1371/journal.pcbi.1012566.s005

(CSV)

S6 Table. Summary statistics for false negative and false positive rates for each method and coverage level combination.

Genotypes = BCFtools Genotypes; Likelihoods = BCFtools Likelihoods; SD = standard deviation; CV = coefficient of variation.

https://doi.org/10.1371/journal.pcbi.1012566.s006

(CSV)

S1 Fig. True FROH vs. called FROH across coverage levels and BCFtools methods for the large population demographic scenario using default HMM transition probability values and Viterbi-trained values.

Solid lines indicate linear regression results and the dashed line is the 1:1 line. Within each method and coverage level combination, the slope and intercept parameters did not significantly differ between the approaches using Viterbi-trained or default probabilities, but the regression line calculated using Viterbi-trained results was consistently closer to the 1:1 line.

https://doi.org/10.1371/journal.pcbi.1012566.s007

(TIFF)

S2 Fig. True FROH vs. called FROH across coverage levels and BCFtools methods for the small population demographic scenario using default HMM transition probability values and Viterbi-trained values.

Solid lines indicate linear regression results and the dashed line is the 1:1 line. Within each method and coverage level combination, the slope and intercept parameters did not significantly differ between the approaches using Viterbi-trained or default probabilities, but the regression line calculated using Viterbi-trained results was consistently closer to the 1:1 line.

https://doi.org/10.1371/journal.pcbi.1012566.s008

(TIFF)

S3 Fig. True FROH vs. called FROH across coverage levels and BCFtools methods for the bottlenecked population demographic scenario using default HMM transition probability values and Viterbi-trained values.

Solid lines indicate linear regression results and the dashed line is the 1:1 line. Within each method and coverage level combination, the slope and intercept parameters did not significantly differ between the approaches using Viterbi-trained or default probabilities, but the regression line calculated using Viterbi-trained results was consistently closer to the 1:1 line.

https://doi.org/10.1371/journal.pcbi.1012566.s009

(TIFF)

S4 Fig. True FROH vs. called FROH across coverage levels and BCFtools methods for the declining population demographic scenario using default HMM transition probability values and Viterbi-trained values.

Solid lines indicate linear regression results and the dashed line is the 1:1 line. Within each method and coverage level combination, the slope and intercept parameters did not significantly differ between the approaches using Viterbi-trained or default probabilities, but the regression line calculated using Viterbi-trained results was consistently closer to the 1:1 line.

https://doi.org/10.1371/journal.pcbi.1012566.s010

(TIFF)

S5 Fig.

A) False positive (i.e., incorrectly calling a base position as being located in a ROH) and B) false negative (i.e., failing to identify a base position as being located in a ROH) rates across coverage levels for the large population demographic scenario. Horizontal lines indicate median values and shaded boxes are 50% quantiles. Note difference in scale of y-axis between panels A and B. Although not significantly different, rates calculated for ROHs called using Viterbi-trained HMM transition probabilities tend to be lower than rates calculated for ROHs called using default probabilities.

https://doi.org/10.1371/journal.pcbi.1012566.s011

(TIFF)

S6 Fig.

A) False positive (i.e., incorrectly calling a base position as being located in a ROH) and B) false negative (i.e., failing to identify a base position as being located in a ROH) rates across coverage levels for the small population demographic scenario. Horizontal lines indicate median values and shaded boxes are 50% quantiles. Note difference in scale of y-axis between panels A and B. Although not significantly different, rates calculated for ROHs called using Viterbi-trained HMM transition probabilities tend to be lower than rates calculated for ROHs called using default probabilities.

https://doi.org/10.1371/journal.pcbi.1012566.s012

(TIFF)

S7 Fig.

A) False positive (i.e., incorrectly calling a base position as being located in a ROH) and B) false negative (i.e., failing to identify a base position as being located in a ROH) rates across coverage levels for the bottlenecked population demographic scenario. Horizontal lines indicate median values and shaded boxes are 50% quantiles. Note difference in scale of y-axis between panels A and B. Although not significantly different, rates calculated for ROHs called using Viterbi-trained HMM transition probabilities tend to be lower than rates calculated for ROHs called using default probabilities.

https://doi.org/10.1371/journal.pcbi.1012566.s013

(TIFF)

S8 Fig.

A) False positive (i.e., incorrectly calling a base position as being located in a ROH) and B) false negative (i.e., failing to identify a base position as being located in a ROH) rates across coverage levels for the declining population demographic scenario. Horizontal lines indicate median values and shaded boxes are 50% quantiles. Note difference in scale of y-axis between panels A and B. Although not significantly different, rates calculated for ROHs called using Viterbi-trained HMM transition probabilities tend to be lower than rates calculated for ROHs called using default probabilities.

https://doi.org/10.1371/journal.pcbi.1012566.s014

(TIFF)

S9 Fig. False positive (i.e., incorrectly calling a base position as being located in a ROH) rates across coverage levels for all population demographic scenarios.

Horizontal lines indicate median values and shaded boxes are 50% quantiles.

https://doi.org/10.1371/journal.pcbi.1012566.s015

(TIFF)

S10 Fig. False negative (i.e., failing to identify a base position as being located in a ROH) rates across coverage levels for all demographic scenarios.

Horizontal lines indicate median values and shaded boxes are 50% quantiles.

https://doi.org/10.1371/journal.pcbi.1012566.s016

(TIFF)

S11 Fig. True and called ROHs ≥ 100 kb in length for one individual from each demographic scenario simulated.

True ROHs are indicated by black polygons and light gray shading.

https://doi.org/10.1371/journal.pcbi.1012566.s017

(TIFF)

S12 Fig. Called FROH–True FROH by length bin and coverage level for the large population demographic scenario.

https://doi.org/10.1371/journal.pcbi.1012566.s018

(TIFF)

S13 Fig. Called FROH–True FROH by length bin and coverage level for the small population demographic scenario.

https://doi.org/10.1371/journal.pcbi.1012566.s019

(TIFF)

S14 Fig. Called FROH–True FROH by length bin and coverage level for the bottlenecked population demographic scenario.

https://doi.org/10.1371/journal.pcbi.1012566.s020

(TIFF)

S15 Fig. Called FROH–True FROH by length bin and coverage level for the declining population demographic scenario.

https://doi.org/10.1371/journal.pcbi.1012566.s021

(TIFF)

S16 Fig. Mean number of true ROHs combined into a single called ROH at each coverage level by length bins.

Results are presented for all four demographic scenarios.

https://doi.org/10.1371/journal.pcbi.1012566.s022

(TIFF)

S17 Fig. Heterozygosity variation across scenarios.

Each dot represents the mean heterozygosity of an individual.

https://doi.org/10.1371/journal.pcbi.1012566.s023

(TIFF)

S18 Fig. Scatter plots of false-negative versus false-positive rates across different sequencing depths (5X, 10X, 15X, 30X, and 50X) and population scenarios.

The plots are organized by sequencing depth, with rows corresponding to 5X (A-D), 10X (E-H), 15X (I-L), 30X (M-P), and 50X (Q-T). Each point represents the performance of genotype calling in an individual sample.

https://doi.org/10.1371/journal.pcbi.1012566.s024

(TIFF)

S19 Fig. Comparison of true versus called FROH across different sequencing depths and population scenarios.

Each regression line represents linear model results for a single level of coverage with the shaded areas representing 95% confidence intervals. Each point represents data for a single simulated individual. Dashed line is 1:1 line.

https://doi.org/10.1371/journal.pcbi.1012566.s025

(TIFF)

S20 Fig. Sensitivity analysis for different parameters across simulated population scenarios and empirical data at 50X sequencing depth.

https://doi.org/10.1371/journal.pcbi.1012566.s026

(TIFF)

S1 Box. PLINK parameter exploration through sensitivity analysis.

https://doi.org/10.1371/journal.pcbi.1012566.s027

(DOCX)

Acknowledgments

We thank members of the Willoughby lab for constructive comments on earlier versions of this manuscript as well as M Kardos.

References

1. Diffenbaugh NS, Field CB. Changes in Ecologically Critical Terrestrial Climate Conditions. Science. 2013;341: 486–492. pmid:23908225
2. Haddad NM, Brudvig LA, Clobert J, Davies KF, Gonzalez A, Holt RD, et al. Habitat fragmentation and its lasting impact on Earth’s ecosystems. Sci Adv. 2015;1: e1500052. pmid:26601154
3. Crnokrak P, Roff DA. Inbreeding depression in the wild. Heredity. 1999;83: 260–270. pmid:10504423
4. Keller L. Inbreeding effects in wild populations. Trends in Ecology & Evolution. 2002;17: 230–241.
5. Robinson JA, Räikkönen J, Vucetich LM, Vucetich JA, Peterson RO, Lohmueller KE, et al. Genomic signatures of extensive inbreeding in Isle Royale wolves, a population on the threshold of extinction. Sci Adv. 2019;5: eaau0757. pmid:31149628
6. Stoffel MA, Johnston SE, Pilkington JG, Pemberton JM. Genetic architecture and lifetime dynamics of inbreeding depression in a wild mammal. Nat Commun. 2021;12: 2972. pmid:34016997
7. Gibbs HL, Grant PR. Inbreeding in Darwin’s Medium Ground Finches (Geospiza fortis). Evolution. 1989;43: 1273–1284. pmid:28564504
8. Liberg O, Andrén H, Pedersen H-C, Sand H, Sejberg D, Wabakken P, et al. Severe inbreeding depression in a wild wolf Canis lupus population. Biol Lett. 2005;1: 17–20. pmid:17148117
9. Saccheri I, Kuussaari M, Kankare M, Vikman P, Fortelius W, Hanski I. Inbreeding and extinction in a butterfly metapopulation. Nature. 1998;392: 491–494.
10. Slate J, Kruuk LEB, Marshall TC, Pemberton JM, Clutton-Brock TH. Inbreeding depression influences lifetime breeding success in a wild population of red deer (Cervus elaphus). Proc R Soc Lond B. 2000;267: 1657–1662. pmid:11467429
11. O’Grady JJ, Brook BW, Reed DH, Ballou JD, Tonkyn DW, Frankham R. Realistic levels of inbreeding depression strongly affect extinction risk in wild populations. Biological Conservation. 2006;133: 42–51.
12. Kardos M, Taylor HR, Ellegren H, Luikart G, Allendorf FW. Genomics advances the study of inbreeding depression in the wild. Evolutionary Applications. 2016;9: 1205–1218. pmid:27877200
13. Ceballos FC, Joshi PK, Clark DW, Ramsay M, Wilson JF. Runs of homozygosity: windows into population history and trait architecture. Nat Rev Genet. 2018;19: 220–234. pmid:29335644
14. Brüniche-Olsen A, Kellner KF, Anderson CJ, DeWoody JA. Runs of homozygosity have utility in mammalian conservation and evolutionary studies. Conserv Genet. 2018;19: 1295–1307.
15. McQuillan R, Leutenegger A-L, Abdel-Rahman R, Franklin CS, Pericic M, Barac-Lauc L, et al. Runs of Homozygosity in European Populations. The American Journal of Human Genetics. 2008;83: 359–372. pmid:18760389
16. Stoffel MA, Johnston SE, Pilkington JG, Pemberton JM. Mutation load decreases with haplotype age in wild Soay sheep. Evolution Letters. 2021;5: 187–195. pmid:34136268
17. Szpiech ZA, Xu J, Pemberton TJ, Peng W, Zöllner S, Rosenberg NA, et al. Long Runs of Homozygosity Are Enriched for Deleterious Variation. The American Journal of Human Genetics. 2013;93: 90–102. pmid:23746547
18. Liu S, Westbury MV, Dussex N, Mitchell KJ, Sinding M-HS, Heintzman PD, et al. Ancient and modern genomes unravel the evolutionary history of the rhinoceros family. Cell. 2021;184: 4874–4885.e16. pmid:34433011
19. Palkopoulou E, Mallick S, Skoglund P, Enk J, Rohland N, Li H, et al. Complete Genomes Reveal Signatures of Demographic and Genetic Declines in the Woolly Mammoth. Current Biology. 2015;25: 1395–1400. pmid:25913407
20. Willi Y, Kristensen TN, Sgrò CM, Weeks AR, Ørsted M, Hoffmann AA. Conservation genetics as a management tool: The five best-supported paradigms to assist the management of threatened species. Proc Natl Acad Sci USA. 2022;119: e2105076119. pmid:34930821
21. Kardos M. Conservation genetics. Current Biology. 2021;31: R1185–R1190. pmid:34637729
22. Lande R. Genetics and Demography in Biological Conservation. Science. 1988;241: 1455–1460.
23. Gilpin ME, Soulé ME. Minimum viable populations: processes of species extinction. In: Soulé ME, editor. Conservation biology: the science of scarcity and diversity. Sunderland, Mass: Sinauer Associates, Inc.; 1986. p. 584.
24. Kardos M, Armstrong EE, Fitzpatrick SW, Hauser S, Hedrick PW, Miller JM, et al. The crucial role of genome-wide genetic variation in conservation. Proc Natl Acad Sci USA. 2021;118: e2104642118. pmid:34772759
25. Soulé ME, Mills LS. No Need to Isolate Genetics. Science. 1998;282: 1658–1659.
26. Grossen C, Guillaume F, Keller LF, Croll D. Purging of highly deleterious mutations through severe bottlenecks in Alpine ibex. Nat Commun. 2020;11: 1001. pmid:32081890
27. Dussex N, Kurland S, Olsen R-A, Spong G, Ericsson G, Ekblom R, et al. Range-wide and temporal genomic analyses reveal the consequences of near-extinction in Swedish moose. Commun Biol. 2023;6: 1035. pmid:37848497
28. Saremi NF, Supple MA, Byrne A, Cahill JA, Coutinho LL, Dalén L, et al. Puma genomes from North and South America provide insights into the genomic consequences of inbreeding. Nat Commun. 2019;10: 4769. pmid:31628318
29. Von Seth J, Dussex N, Díez-del-Molino D, Van Der Valk T, Kutschera VE, Kierczak M, et al. Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nat Commun. 2021;12: 2393. pmid:33896938
30. Mueller SA, Prost S, Anders O, Breitenmoser-Würsten C, Kleven O, Klinga P, et al. Genome-wide diversity loss in reintroduced Eurasian lynx populations urges immediate conservation management. Biological Conservation. 2022;266: 109442.
31. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaSci. 2015;4: 7. pmid:25722852
32. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. The American Journal of Human Genetics. 2007;81: 559–575. pmid:17701901
33. Narasimhan V, Danecek P, Scally A, Xue Y, Tyler-Smith C, Durbin R. BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinformatics. 2016;32: 1749–1751. pmid:26826718
34. Ceballos FC, Hazelhurst S, Ramsay M. Assessing runs of Homozygosity: a comparison of SNP Array and whole genome sequence low coverage data. BMC Genomics. 2018;19: 106. pmid:29378520
35. Duntsch L, Whibley A, Brekke P, Ewen JG, Santure AW. Genomic data of different resolutions reveal consistent inbreeding estimates but contrasting homozygosity landscapes for the threatened Aotearoa New Zealand hihi. Molecular Ecology. 2021;30: 6006–6020. pmid:34242449
36. Meyermans R, Gorssen W, Buys N, Janssens S. How to study runs of homozygosity using PLINK? A guide for analyzing medium density SNP data in livestock and pet species. BMC Genomics. 2020;21: 94. pmid:31996125
37. Forutan M, Ansari Mahyari S, Baes C, Melzer N, Schenkel FS, Sargolzaei M. Inbreeding and runs of homozygosity before and after genomic selection in North American Holstein cattle. BMC Genomics. 2018;19: 98. pmid:29374456
38. Caballero A, Fernández A, Villanueva B, Toro MA. A comparison of marker-based estimators of inbreeding and inbreeding depression. Genet Sel Evol. 2022;54: 82. pmid:36575379
39. Escoda L, Castresana J. The genome of the Pyrenean desman and the effects of bottlenecks and inbreeding on the genomic landscape of an endangered species. Evolutionary Applications. 2021;14: 1898–1913. pmid:34295371
40. Haller BC, Messer PW. SLiM 3: Forward Genetic Simulations Beyond the Wright–Fisher Model. Hernandez R, editor. Molecular Biology and Evolution. 2019;36: 632–637. pmid:30517680
41. Haller BC, Messer PW. Evolutionary Modeling in SLiM 3 for Beginners. Hernandez R, editor. Molecular Biology and Evolution. 2019;36: 1101–1109. pmid:30590560
42. Feigin CY, Newton AH, Doronina L, Schmitz J, Hipsley CA, Mitchell KJ, et al. Genome of the Tasmanian tiger provides insights into the evolution and demography of an extinct marsupial carnivore. Nat Ecol Evol. 2017;2: 182–192. pmid:29230027
43. Patton AH, Margres MJ, Stahlke AR, Hendricks S, Lewallen K, Hamede RK, et al. Contemporary Demographic Reconstruction Methods Are Robust to Genome Assembly Quality: A Case Study in Tasmanian Devils. Molecular Biology and Evolution. 2019;36: 2906–2921. pmid:31424552
44. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2022. Available: https://www.R-project.org/
45. Thompson EA. Identity by Descent: Variation in Meiosis, Across Genomes, and in Populations. Genetics. 2013;194: 301–326. pmid:23733848
46. Xie H-X, Liang X-X, Chen Z-Q, Li W-M, Mi C-R, Li M, et al. Ancient Demographics Determine the Effectiveness of Genetic Purging in Endangered Lizards. Harris K, editor. Molecular Biology and Evolution. 2022;39: msab359. pmid:34919713
47. Sánchez-Barreiro F, Gopalakrishnan S, Ramos-Madrigal J, Westbury MV, De Manuel M, Margaryan A, et al. Historical population declines prompted significant genomic erosion in the northern and southern white rhinoceros (Ceratotherium simum). Molecular Ecology. 2021;30: 6355–6369. pmid:34176179
48. Hasselgren M, Dussex N, Von Seth J, Angerbjörn A, Olsen R, Dalén L, et al. Genomic and fitness consequences of inbreeding in an endangered carnivore. Molecular Ecology. 2021;30: 2790–2799. pmid:33955096
49. Huang W, Li L, Myers JR, Marth GT. ART: a next-generation sequencing read simulator. Bioinformatics. 2012;28: 593–594. pmid:22199392
50. Andrews S. FastQC. 2023. Available: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
51. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv; 2013.
52. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–2079. pmid:19505943
53. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20: 1297–1303. pmid:20644199
54. Howrigan DP, Simonson MA, Keller MC. Detecting autozygosity through runs of homozygosity: A comparison of three autozygosity detection algorithms. BMC Genomics. 2011;12: 460. pmid:21943305
55. Mathur S, Tomeček JM, Tarango-Arámbula LA, Perez RM, DeWoody JA. An evolutionary perspective on genetic load in small, isolated populations as informed by whole genome resequencing and forward-time simulations. Evolution. 2023;77: 690–704. pmid:36626799
56. Iooss B, Veiga SD, Janon A, Pujol G, Broto B, Boumhaout K, et al. sensitivity: Global Sensitivity Analysis of Model Outputs and Importance Measures. 2021.
57. Wright BR, Farquharson KA, McLennan EA, Belov K, Hogg CJ, Grueber CE. A demonstration of conservation genomics for threatened species management. Molecular Ecology Resources. 2020;20: 1526–1541. pmid:32562371
58. Krueger F, James F, Ewels P, Afyounian E, Weinstein M, Schuster-Boeckler B, et al. FelixKrueger/TrimGalore: v0.6.10—add default decompression path. Zenodo; 2023.
59. Morris K, Austin JJ, Belov K. Low major histocompatibility complex diversity in the Tasmanian devil predates European settlement and may explain susceptibility to disease epidemics. Biol Lett. 2013;9: 20120900. pmid:23221872
60. Brüniche-Olsen A, Jones ME, Austin JJ, Burridge CP, Holland BR. Extensive population decline in the Tasmanian devil predates European settlement and devil facial tumour disease. Biol Lett. 2014;10: 20140619. pmid:25376800
61. Miller W, Hayes VM, Ratan A, Petersen DC, Wittekindt NE, Miller J, et al. Genetic diversity and population structure of the endangered marsupial Sarcophilus harrisii (Tasmanian devil). Proc Natl Acad Sci USA. 2011;108: 12348–12353. pmid:21709235
62. Okonechnikov K, Conesa A, García-Alcalde F. Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics. 2016;32: 292–294. pmid:26428292
63. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27: 2156–2158. pmid:21653522
64. Lou RN, Jacobs A, Wilder AP, Therkildsen NO. A beginner’s guide to low-coverage whole genome sequencing for population genomics. Molecular Ecology. 2021;30: 5966–5993. pmid:34250668