Genealogical analysis of European bison population revealed a growing up population despite very low genetic diversity

In 1919, the European bison population became extinct in the wild. The rescue of the lowland subspecies and the whole species was achieved mainly thanks to individuals from the Białowieża Forest (Polish-Belarusian border). There are currently two breeding lines—the lowland (purebred B. b. Bonasus) founded by 7 individuals and the lowland-Caucasian (hybrids of B. b. Bonasus and B. b. caucasicus) founded by 12 individuals. This genealogical study was conducted on 15,071 individuals recorded in the pedigree book between 1881 and 2020. Its objective was to determine the level of genetic variability and inbreeding almost 100 years after the rescue measures were initiated. The completeness of the pedigree of the reference population was 77% in the fifth generation backwards. A maximum of 23 generations can be traced back in the pedigree. The average inbreeding coefficient and the mean average relatedness of the reference population were very high, about 17% and 16% respectively. No significant amount of new inbreeding was discovered. The reference population has lost 9.11% of the total genetic diversity compared to the population of founders. A male of the Caucasian subspecies Kaukasus was discovered among the ancestors of the lowland lineage reference population. The effective population size calculated based on the increase in inbreeding was 23.93 individuals, based on complete generations equivalent it was 16.1 individuals. Wright’s F-statistics showed very small differences in genotypic frequencies between individuals within the two lineages in the reference population (FIS = 0.10), between individuals and the total population (FIT = 0.04) and low differentiation between lineages (FST = 0.06). The population of the European bison from the Białowieża Forest is generally very uniform but still shows good fitness.


Introduction
The Wisent or European bison (Bison bonasus) is Europe's largest herbivore inhabiting almost the entire continent during the middle and late Holocene [1]. It belongs to the genus Bovini and is crossable with other genera Bos and Bison members, while fertility is preserved in female F1 hybrids [2]. Originally, there were two species of wild aurochs present in Europe since ancient times, in addition to the wisent, it was also Bos primigenius, which became extinct The low variability of the bison genome can be attributed to the low number of foundersonly 17 individuals. Of these, only one male Plebejer (No. 45) established the lowland lineage and five males the Caucasian-lowland lineage, of which only three (45 Plebejer,100 Kaukasus,15 Bergründer) also contributed with male offspring to the population [36]. The degree of inbreeding is 44% in the lowland lineage, while it is 26% in the lowland-Caucasian lineage [4]. Although the value of the average inbreeding coefficient is lower in the second lineage, there are apparent signs of inbred depression, namely shortening or lengthening of the neurocranium and narrowing of the splanchnocrania [18]. In contrast, no obvious signs of inbreeding depression have yet been recorded for the lowland lineage [36]. This may be due to the enhancement effect of genes that have resulted in a fitness increase of the species [37]. Or by the fact that the founders of the lowland lineage did not carry genes that would burden the population and thus reduce the fitness of the population [36]. However, we cannot underestimate possible inbreeding depression in the lowland lineage, even though it has not yet been recorded. The close relationship of individuals is a risk factor for example for genetically based diseases. One of them could be an inflammation of the penis and foreskin (posthitis), which was the cause of 8% male mortality in the western part of Białowieża Forest between 1952 and 2000 [4] and also appears in the lowland lineage [38]. The disease probably has a genetic predisposition based on the fixation of a harmful recessive allele [28]. However, no significant match was found between the MHC II genotype of DRB3 and the incidence of this disease [39]. Even the mitochondrial heteroplasmy found in animals in the Białowieża Forest does not appear to be related to the disease [40]. The first candidate genes were not identified until 2014 by Oleński et al. [41] on the 15th chromosome in the range of 2Mb. The same team of authors repeated the same genome-wide association study five years later on a larger number of animals, including clinical information on disease severity, evaluated by a more reliable statistical method and more strict quality controls for SNPs [38]. They discovered on chromosome 25 the candidate gene for posthitis coding protein periplakin (PPL) associated with skin development [38].
All this knowledge is difficult to apply to breeding without information from the pedigree book. However, extensive genealogical analysis of bison has not been performed for more than 20 years. This paper aims to describe the pedigree structure of a wisent based on pedigree books and to identify the main factors that affect the genetic diversity of the European bison and its possible losses.

Materials and methods
The data were obtained through the Białowieża National Park, which has a pedigree book on its website registered since 1947 [42]. All information, date of birth, the inclusion of the new bison in the book, date of death, and all other changes (mother or father changes, sex specification) have been supplemented over the years (even respectively) to the books to the section of corrections and transfers. The pedigree dates to 1881 and contained 15,071 animals born until 2020. The reference population for genetic analysis was determined according to the list of all living bison, i.e., animals that could potentially give rise to future generations, as of 31 December 2020. The following populations were also used as reference populations for lineage comparison: The completeness of the pedigree was determined by the maximum number of generations traced, defined as the number of generations between the individual and his farthest ancestor [43]. In addition, the complete generations equivalent (CGE) was calculated as the arithmetic mean of the sum of the expected genetic benefits of ancestors to the individual from the reference population [44], and the pedigree completeness index (PCI), which is a method based on the expected genetic contributions of maternal and paternal lineages to the reference population [45]. In this study, 5 generations back were considered for PCI computation.
The inbreeding coefficient of the individual was computed according to Meuwissen and Luo [46]. Average values for the birth year (only animals with a known date of birth were included) or generation were used to show the changes in inbreeding over time. The increase of inbreeding coefficient between the two generations in the whole population was calculated as a regression coefficient from the means of the individual inbreeding coefficients per maximum number of generations or CGE [47]. Furthermore, ancestral and new inbreeding were computed according to Kalinowski et al. [48] with the program Grain v2.2 [49,50]. As an additional indicator of inbreeding was chosen average relatedness, defined as twice the probability that two randomly selected alleles from a population are identical in origin [51]. It was obtained based on the kinship matrix of individuals from which are the AR (average relatedness) coefficients obtained as a line vector based on the equation described by Dunner et al. [52].
The probability of gene origin was computed on the reference population. The effective number of founders (f e ) and ancestors (f a ), the total number of founders and ancestors, as well as the founder genome equivalent (f ge ) were computed. The f e and f a were calculated based on the method established by Lacy [53] and adjusted according to Boichard et al. [44]: where p i expresses the share of genes that the founder i contributed to the creation of the next generation of descendants; p k is a marginal contribution of each ancestor k as its genetic contribution, which is determined by the proportion of genes it contributes, and this proportion (or part of it) has not yet been explained by the contribution of its ancestor selected before it.
Founder genome equivalent (f ge ) was estimated according to Caballero and Toro [54] as half the reciprocal of the average relatedness of the reference population.
In the next step, founders and ancestors with the biggest genetic contribution to the reference population were identified. Genetic variability losses were calculated as follows: Loss of genetic diversity (1-GD) due to genetic drift and the bottleneck effect [55]: loss due to the unequal contribution of the founders [54]: and loss due to genetic drift (as the difference between the two previous ones). Based on the birth dates of the animals, generation intervals were calculated, defined as the average age of the parents at the birth of their offspring used for reproduction [56]. The average age of the parents at the birth of the offspring was also computed. The parameter is calculated for four paths, namely: father-son, father-daughter, mother-son, and mother-daughter. Effective population size N e was found in two ways. Firstly, by regressing the birth data for the whole population or a given reference population: where b is the individual inbreeding coefficient over the CGE. Secondly, through an increase in the inbreeding of an individual in a given population [57]: Wright's F-statistic [58] was used to compare both wisent lineages, calculated according to Caballero and Toro [54,59] in program R using the package kinship2 [60] kinship matrix computation. The degree of individual inbreeding (F IS ) relative to the subpopulation in which it is found was used to determine the predominance of heterozygosity or homozygosity due to non-random mating within the subpopulation. The fixation index (F ST ) was used to determine the level of genetic variability between lineages. And the total inbreeding coefficient (F IT ) was used to evaluate the loss/increase of genetic diversity in the population from an individual's perspective.
All other genetic parameters and demographic data were processed in the program Endog 4.8 [47,61].

Basic demographic data and completeness of the pedigree
The total number of individuals registered in the pedigree book from 1881 to 31 December 2020 was 15,071. Of these, 1,646 lacked all the necessary data (father, mother and sex). Beside of these, there were 116 individuals with no gender determination, mostly due to death at birth or shortly thereafter. Pedigree analysis was performed on 6,797 females and 6,512 males. From 1960 to 1980, the number of animals born doubled every ten years. Between 1960 and 1970, 1,244 calves were born, and between 1970 and 1980, the total number increased by 2,232 to 4,827. The reference population numbered 1,971 individuals, of which 1,195 were females, 773 were males and 3 individuals had unknown sex.
A maximum of 23 ± 6.31 generations and 10 ± 2.55 complete generations could be traced in the reference population, which coincided with the results for the entire population. The mean CGE value for the reference population was 7.85 ± 3.65 (8.98 ± 3.61) before the inclusion records from 2020) and 6.72 ± 3.62 for the entire population. The PCI of the reference population for the last five generations was 90.08%, 88.74%, 86.20%, 83.64%, 77.26%. PCI was more than less constant from the 6 th to the 9 th generation back, but from the tenth generation it goes down sharply and before the 13 th generation, the pedigree contains only minimal information. Its values over the generations for the whole reference population and both lineages are shown in Fig 1. From 1881 to 1920, 25 males out of 48 were used in breeding. As the population grew, so did the number of offspring per male and the number of males used in breeding. Since 1970, about 189 males have been used every ten years. Of the total number of males, 1,289 (19.80%) participated in reproduction, i.e., 5,222 males never reproduced until the end of 2020. The average number of offspring per reproducing male is 9.73 ± 10.53.
A strong selection also occurred in females. Out of the total number of 6,795, 2,621 (38.57%) of them participated in reproduction, i.e. almost 4,174 did not reproduce until the end of 2020. The average number of offspring per female involved in reproduction is 4.74 ± 3.47. Most calves were born by a female 106 Friga (20 calves), who died in 26 years. Fig 2 shows the numbers of males and females involved in reproduction with different numbers of calves.

PLOS ONE
European bison population more than 100 years after extinction in the wild

Inbreeding coefficient and average relatedness
The highest value of the coefficient of inbreeding was 71.83% and was found in five animals, which were in the 15th generation. These were individuals with registration numbers: 12022, 12587, 12907, 13199, and 13872. Their AR to the whole pedigree was 27.90 ± 0.00%. Their parents are female Placka (11221) and male Plawiant (8940). The highest value of AR was 33.40% in the male Plisch (229). In the reference population, it was 28.14% in the male Sphinx (9054). The average value of F and AR for the whole population was 17.81 ± 15.36% and 16.07 ± 8.94% and for the reference population 17.02 ± 15.10% and 16.11 ± 7.96%. A comparison of AR and the average F for each generation is shown in Table 1. The distribution of the inbreeding coefficient in the reference population is shown in Fig 3. The most common range of the inbreeding coefficient was between 10-20% in 641 animals, and 511 animals had an inbreeding coefficient of up to 10%. No individual had this coefficient higher than 72%. A comparison of the development of the average inbreeding coefficient over time (1881-2020) with the number of registered animals is shown in Fig 4. There was a 3% difference between ancestral inbreeding (F a-Kal ) and total inbreeding (F). F had a value of 0.17 ± 0.15, while F a-Kal had only 0.14 ± 0.13. There was no disproportionately large recent inbreeding present in the whole pedigree.

Contribution of ancestors and founders to the reference population and loss of genetic diversity
The contribution of 17 founders to the reference population is shown in   For the reference population, the effective number of founders (f e ) was 11 and the effective number of ancestors (f a ) was also 11. The result of comparing these values (f a /f e ) was 1, so no bottleneck effect was observed. The genome equivalent of the founders (f ge ) of the entire population was 6.22 and 5.49 for the reference population. Only four ancestors would be needed to explain the 50% genetic contribution to the reference population. The total loss of genetic diversity of the reference population compared to the total population was found to be 9.11%. Of which, the loss caused by accidental genetic drift explains 4.56%, and 4.55% of the loss of genetic diversity is caused by the uneven contribution of the founders.

Population recovery rate
The effective population size of the reference population calculated via the individual increase in inbreeding was 23.93 ± 4.26, and via regression on equivalent generations, it was 16.1. Generation intervals and the mean age of parents at the birth of the first offspring were in the reference population almost the same, 9.09 ± 3.94 and 9.08 ± 3.93 years, respectively. There were 722 animals in the reference population, whose offspring were already involved in breeding. The shortest generation interval in the reference population was from mother to son (8.56 ± 4.10 years), as well as in the whole population (8.69 ± 3.91 years). The mean age at birth of offspring was slightly lower in the whole population (8.98 ± 3.97) than in the reference population (9.08 ± 3.93). Both parameters (see Table 3) show high deviations from the mean (up to 4.25 years) in the reference population.

Lineage comparison
The lowland lineage includes 3,735 individuals (1,737 males and 1,973 females), and the lowland-Caucasian 9,659 individuals (4,807 females and 4,750 males). Pedigree completeness showed slightly better results in the lowland-Caucasian lineage, 87% in the third generation The effective number of founders and ancestors of the lowland lineage was 7 and 7, respectively. The f ge was 3.76. The genetic loss of this lineage was caused by 7.14% due to an uneven contribution of the founders and by 6.16% due to genetic drift. In total, it lost 13.30% of the total variability. For the lowland-Caucasian lineage, f e and f a were 11 and 9, respectively, and the effective number of the founders' genomes was 4.32. The bottleneck effect is therefore noticeable in this lineage. The genetic loss for this lineage was calculated to be 11.58% and was caused also by both the uneven distribution of founders (4.55%) and genetic drift (7.03%). The largest contribution of the founders in both lineages was made by the male 45 Plebejer (lowland-33.53%, lowland-Caucasian-17.99%) and the female 42 Planta (lowland-18.62%, lowland-Caucasian-14.81%).
In the reference population of the lowland lineage, a very low contribution of male 100 Kaukasus was recorded, namely 0.02%. The genetic variability of the lowland lineage part of the reference population could be theoretically completely explained by 207 ancestors. Female 42 Planta (16.50%) and male 45 Plebejer (31.88%) were the most contributing founders, followed by the male Kabir contributed the most (7.14%). Other ancestors and founders accounted for less than 5%. The contribution of the original 17 founders to the lowland lineage is shown in Table 4B.

PLOS ONE
European bison population more than 100 years after extinction in the wild The total fixation index of F ST is 0.06. which means that the monitored lineages are not genetically different. A positive F IS value suggests that homozygous individuals may predominate in the population, but its value of 0.10 is also very low. The F IT value was also low, only 0.04.

Discussion
This study focuses on the whole registrated population of European bison (animals from enclosure-based centres, semi-wild breeding centres, or returned from freedom). Nearly 100 years have passed since the first rescue attempts. The size of the worldwide population has increased more than 23 times compared to the 84 purebred individuals living in early 1937 [9]. According to the IUCN Red List, in about a century, the European bison went from the category of extinct in the wild to the category of near threatened [16]. The maximum number of generations 23 is proportional to the age of the population since it is the longest line of ancestors that can be found in the family tree. The average age of the parents of the next generation would be somewhere between 4-5 years of age. However, the mean value of the equivalent of complete generations is significantly lower, almost 7.85 ± 3.65 CGE for the reference population and 6.72 ± 3.62 CGE for the whole pedigree. CGE simply tells how many ancestors are known. The smaller the number, the fewer ancestors are known. The CGE value we calculated is low even though it is certain, that all animals undoubtedly come from the same ancestors from Białowieża Forest [62], and thus it should be possible to trace all individuals back to these ancestors. The pedigree book has been struggling with errors in entries since the beginning of its existence. Although most of them are corrected with the release of the next part in the section "Additions and corrections to former lists", a certain degree of error can be assumed and naturally brings some limitations to our study. In addition, a lot of information is lost by removing some animals from the register due to a lack of information about the animals, or just by releasing them into the wild. Likewise, an animal from a freedom or semi-freedom breeding centre included in the register has no information about its ancestors and can therefore be considered a founder. Despite this, the quality of the European bison pedigree is comparable with pedigrees of production breeds of cattle. For example, for the reference populations of the Danish breeds Danish Holstein, Danish Jersey, and Danish Red, the following CGE values of 7.20, 7.36, 6.77 were observed and PCI values of 0.94, 0.9 and 0.93, respectively [63]. Not so high CGE was found in the breed Brown Swiss in Germany-6.24 [64], in Lidia cattle breed -5.5 [65] or Normand cattle in Colombia-5.21 [66].
The completeness of the pedigree also affects the coefficients used to analyze the origin of the genes, for example, the effective number of founders or the genome equivalent of the founders [44]. Their values are naturally very low in the European bison population due to its historical development. Its closed family tree has only 12 original genotypes since its inception. The variability of this population can therefore only decrease if we neglect the effect of a mutation. In this study, however, values related to founders are non-negligible influenced by individuals with unknown ancestors, who are understood as other founders by the program we used. In addition, errors in the pedigree, loss of alleles through genetic drift, and the assumption of absolute unrelatedness of ancestors affect the accuracy of the coefficients of gene origin. Therefore, the actual values will be even lower. Closed populations with such a small number of founding animals normally occur in the wild and some even go through in situ speciation [67]. However, the emergence of such populations is more often captured in new breeds of domesticated animals or laboratory strains, in the creation of which humans are directly involved [68][69][70][71]. The most common problem of such populations is severe inbred depression [72]. In some cases, however, the population is cleansed of strongly deleterious mutations due to a bottleneck, and the population is then still viable for hundreds of years [73]. This could hopefully be the case for the European bison, whose population shows no severe signs of inbred depression [36].
To identify losses of genetic variability caused by both the bottleneck effect and the uneven use of some individuals for breeding, the ratio between the effective number of ancestors and founders-f a / f e -was used. Its 1 value shows no reduction in reproducing animals, and it could be logical because the first bottleneck effect cannot be seen in our data-the records of individuals living before that event do not exist. The second bottleneck effect caused by the Second World War is also not apparent. The presence of false founders is the main effect to blame, because, before the last pedigree book was included in the file, the value of this ratio was 1.13. The others are a low number of animals living at that time and an overall small number of founders. The f ge / f e ratio (0.50) suggests that random drift had a greater effect on the diversity of the reference population. The contribution of the founders to the reference population is very unbalanced. Most involved were Planta and Plebejer (15.05% and 22.73%) whose share since 1954 (18.8% Planta, Plebejer 26.4%) slightly fluctuates [11]. Founders 1, 2, 7, 32, 33, 46, 122, and 123 still contributed only minimally. Very surprising was the finding of a small relative share of the founder Kaukasus in the reference population of the lowland lineage. This is a significant finding that should be reviewed in the pedigree book for error. Otherwise, it would mean that the lowland lineage is not as pure as it was originally thought. However, the actual genetic contribution of the Kaukasus may have long since been swept away by genetic drift.
Regarding the values of the average inbreeding coefficient and average relatedness, the numbers we find differ from previous studies. The Olech and Perzanoski study [74] dealt with two herds in the Bieszczady, in the first herd in Stuposiana the F was 13.70%, in the second herd in Komańcza it was even 37.63%, and the average relatedness was 24.44% and 32.98%, respectively. Previous research focused on the diversity of both lineages also reached significantly higher values. Olech mentions values of nearly 50% in the lowland and 30% in the lowland-Caucasian individuals born 1996-2002 [75]. These findings indicate that the 17% inbreeding, which we measured in both lineages in the reference population, must be strongly underestimated in this context. A close relative of the European bison, the American Bison, has also undergone several bottleneck effects, and its current population was founded on less than 100 individuals [76]. However, the European bison shows lower genetic variability than the American bison, according to a study by Skotarczak et al. [77]. This was done on 4,269 American bison, with an inbreeding coefficient of 3.26% (17.81% for the European bison), with the highest value recorded at 46.87% (71.83% for the European bison). The value of the average relatedness was also very low-0.31%, compared to the AR of the European bison-16.07%. The absence of recent inbreeding indicates good breeding management. However, this fact must be taken with caution, due to the inclusion of individuals with unknown parents.
The effective size of the population is very small, the number of individuals that would cause the same increase in inbreeding as the reference population is only 23.93 animals. This coincides with the effective population size computed based on microsatellite analysis of 71 individuals born in the Białowieża Forest between 1996 and 2005, which was 28 [78]. This value is less than half the minimum effective size of 50 individuals, which is the goal of management to minimize inbreeding for the survival of the population in the short term [79]. A lower value was found for the American bison, namely, N e = 11.64 [77]. Malhado et al. [80] also found a lower value for buffalo Jafarabadi, N e = 10.4 ± 3.69. Despite the rapid growth of the bison population in recent decades, the value of N e is very low and could hardly truly grow in the future. As with most other diversity parameters monitored here, its increase (or decrease in case of inbreeding) over the years is mainly due to the inclusion of animals from wildlife as founders, even though they share the same gene pool. The current bison population appears to be healthy. After nearly a century of inbreeding, this could be a purging effect of inbreeding or, in the case of herds kept in captivity for a long time, adaptation to captivity. This study included all types of breeding from free range to captive breeding, so determining the extent of these effects was not possible.