Mitochondrial DNA variants of Podolian cattle breeds testify for a dual maternal origin

Background Over the past 15 years, 300 out of 6000 breeds of all farm animal species identified by the Food and Agriculture Organization of the United Nations (FAO) have gone extinct. Among cattle, many Podolian breeds are seriously endangered in various European areas. Podolian cattle include a group of very ancient European breeds, phenotypically close to the aurochs ancestors (Bos primigenius). The aim of the present study was to assess the genetic diversity of Podolian breeds and to reconstruct their origin. Methodology The mitochondrial DNA (mtDNA) control-regions of 18 Podolian breeds have been phylogenetically assessed. Nine non-Podolian breeds have been also included for comparison. Conclusion The overall analysis clearly highlights some peculiarities in the mtDNA gene pool of some Podolian breeds. In particular, a principal component analysis point to a genetic proximity between five breeds (Chianina, Marchigiana, Maremmana, Podolica Italiana and Romagnola) reared in Central Italy and the Turkish Grey. We here propose the suggestive hypothesis of a dual ancestral contribution to the present gene pool of Podolian breeds, one deriving from Eastern European cattle; the other arising from the arrival of Middle Eastern cattle into Central Italy through a different route, perhaps by sea, ferried by Etruscan boats. The historical migration of Podolian cattle from North Eastern Europe towards Italy has not cancelled the mtDNA footprints of this previous ancient migration.


Introduction
Over the past 15 years, 300 out of 6000 livestock breeds identified by Food and Agriculture Organization of the United Nations (FAO) have gone extinct [1]. Risk factors for farm animal breeds are mainly: i) a reduction of genetic variability due to strict selection processes; ii) a strong economic pressure focused on specific traits, such as milk production, which leads to the replacement of local less productive breeds with highly productive industrial breeds; iii) an unrestricted and indiscriminate cross-breeding, especially in developing countries [2]. Bos taurus is one of the most economically important livestock species [3]. Both in historic and current societies it has fulfilled agricultural, economic, cultural, and even religious key roles, often paralleling human evolution [4]. Among cattle, many Podolian breeds are seriously endangered in various European countries [5]; [6]; [7]. Podolian cattle include a group of very ancient European breeds, with a grey coat colour and long horns, phenotypically close to the aurochs (Bos primigenius). According to many traditional notes the name Podolian refers to a common ancestral origin in Podolia (the modern western Ukraine). However place of origin and timing of spread out of the source area are both debated. Alternative hypotheses have been proposed: Podolian cattle might have spread from the eastern steppe southward into Anatolia and westward into the Balkans and Italy in historical times (3 rd -5 th century AD) along with East-European Barbarian people [8]; other authors suggest a more ancient migration (~3 kya BP) from the Near East to Central Italy through the Mediterranean Sea [9], together with a possible contribution from local wild aurochs through a secondary local domestication/introgression events [10]; [11].
Nowadays, some phenotypic distinctions stand out among Podolian cattle [12]. The noble aurochs-shaped ancient breeds with long horns (such as Hungarian Grey, Katerini, Podolsko, Slavonian Syrmian and Maremmana) are considered as the only true Podolian breeds by some scholars. However, some local breeds (i.e. Podolica Italiana, Ukrainian Grey, Turkish Grey and other Balkan breeds) do not necessarily show the long horns, but maintained some distinctive Podolian traits such as a red coat in calves and light grey in adults [13]. In general, a commercial trait shared by all Podolian cattle is that they are more suitable for beef production rather than for dairy. Because of that some improved beef breeds (Chianina, Marchigiana, Romagnola and Piemontese) are also considered within the Podolian group, although the inclusion of Chianina and Piemontese is still debated [14].
During the last decades, mitochondrial DNA (mtDNA) has been widely used as a molecular tool to investigate genetic origin, history and diversity of livestock species [15] [16] [17] [18] [19] [20]. Following this trend, the aim of the present study is to re-assess the mtDNA diversity of the major Podolian cattle breeds (ten of which classified as endangered or critical by FAO) to obtain additional information on their ancestral origin and ancient dispersal routes.
The molecular analysis of 221 base pairs of the control region (from np 16042 to np 16262) on the entire dataset of 1,957 samples revealed a total of 247 distinct haplotypes (from four to 70 haplotypes per breed) and 91 polymorphic sites (S), all represented by single nucleotide polymorphisms (SNPs). The average nucleotide diversity (π) was comparable between Podolian and non-Podolian breeds (~0.010-0.011; Table 1), while haplotype diversity was significantly lower (P-value < 0.01) in Podolian (Hd = 0.837 ± 0.010) than in non-Podolian breeds (Hd = 0.879 ± 0.013). Both indices varied largely across breeds as already seen in previous mtDNA studies [11]; [21]; [22]; [23]. Among all Podolian, the highest Hd values (!0.90) were identified in Chianina, Marchigiana and Turkish Grey, while the lowest values of Hd (<0.70), as well as of nucleotide diversity ( 0.008), were scored in Bianca di Val Padana, Calvana, and Slavonian Syrmian Podolian. As for 1,617 sequences, we were able to extend the analysis to a longer control-region fragment encompassing 731 bps ( Table 1). The results largely confirmed the same trend, with the only notable exception of Piemontese, Romagnola and Maremmana showing a higher Hd (>0.960) on this extended fragment. It is also interesting that the highest Hd values were identified in Chianina, and Maremmana, which showed values (>0.970) comparable to the Turkish Grey (0.971).
All control-region haplotypes have been classified in haplogroups and sub-haplogroups through an accurate analysis of mutational motifs ( Table 2 and S3 Table), according to previously published classification criteria [11]; [24]; [25]; [26]. Haplogroup T3 was the most common (83%) in all breeds, with the highest value in MP and PO (both 100%), followed by PI (96%) (acronyms are listed in Table 1). The second and third most common haplogroups (both 7%) were T1 and T2, which were missing in MP, PO and SS. The frequency of T2 is lower in non-Podolian (3.85%) than in Podolian breeds (8.35%) with extraordinary high peaks in three breeds, KA (42%), RG (24%) and BG (22%), followed by UK (16%) and MA and TK (both 15%). T1 haplogroup was predominantly found among breeds from Central and Southern Italy, both in Podolian (10%) and non-Podolian (19%) groups (Fig 1). Haplogroup T5 was found exclusively in non-Podolian breeds, and was restricted to IB, RP and VA except for one sequence found in SS and one in PI. Finally, haplogroups Q and R showed very low incidences  Table 1 (see also Table 2 and S3 Table for details). Note that RP and IB are also widespread in Italy. Maps (www.histgeo.ac-aix-marseille. fr/ancien_site/carto/) were used for illustrative purposes only and largely modified by the authors.
https://doi.org/10.1371/journal.pone.0192567.g001 restricted to Italian non-Podolian (Q = 1.15%, and R = 0.58%) and Podolian (Q = 0.77%, and R = 0.49%) breeds. Overall, the haplogroup distribution differed significantly between the Podolian and non-Podolian groups of breeds included in the current analysis (Table 2; chisquare P-value < 0.001) with the highest contribution given by the T2 haplogroup. This result was also verified by considering haplogroup frequencies based on different haplotypes (chisquare P-value < 0.001) in order to mitigate the effect of inbreeding and recent founder effects.
Thus, we performed a principal component analysis (PCA) to graphically display the different haplogroup distributions among the Podolian breeds. In order to consider as many populations as possible, the dataset based on the short fragment was included. After variables reduction to PCs, the coordinates of the observations for the 18 breeds were displayed in a two-dimensional plot representing the European Podolian genetic landscape (Fig 2). PC1 Because of the peculiar position of some Italian breeds, we used an analysis of molecular variance (AMOVA) to investigate fixation indices in three (artificially created) population groups, one including the Italian Podolian breeds, the other encompassing the European Podolian breeds and the Turkish Grey, and the third group covering the Italian non-Podolian breeds. Most of the variance (about 98%) observed in the Italian Podolian populations explained differences among samples within breeds, while less than 2% represented differences between breeds (Table 3), a value three times lower than in the other Podolian breeds.

Discussion
To date, only a limited number of studies have addressed the genetic composition of Podolian breeds. These investigations were generally limited to few breeds and focused on the nuclear genome [8]; [27]; [28]; [29]; [30]. Mitochondrial DNA data have been previously reported by  T1  T2  T3  T1'2'3  T5  Q  Q1  Q2  R  R1  R2  Ivankovic et al. [6] and Ilie et al. [7] who analysed the control-region sequences of some Croatian and Romanian cattle breeds, respectively. The present study extended the analysis of the mitochondrial genetic variation to 18 Podolian breeds by evaluating their haplogroup distributions, which were eventually compared among them and to nine non-Podolian breeds. Genetic diversity considered in terms of number of haplotypes and nucleotide diversity revealed some peculiarities of several Podolian breeds. The low mtDNA diversities of Bianca di Val Padana,  Calvana and Slavonian Syrmian Podolian could be due to a combination of factors, such as i) a sampling bias depending on the low consistence of current herds (S1 Table), ii) a bottleneck effect caused by the strong reduction in population size experienced by these breeds during the last decades, iii) genetic drift acting on small populations. On the contrary, the large diffusion of Piemontese, Marchigiana and Chianina cattle probably favoured the accumulation and maintenance of a high level of mtDNA variation, which is evident also in the Maremmana breed in spite of its lower consistency (S1 Table). In general, the most common European haplogroup T3 is predominant (83%) in our dataset. High frequencies of T1 in central and southern Italy might be due to the intensive migrations across the Mediterranean Sea, eased by the proximity to northern Africa, where T1 is prevalent [20]; [25]; [26]; [31] and to the Near East where T1 is also present [32,33]. It is worth noting that the central-Italian Podolian breeds show higher frequencies of T2, while the presence of Q and R within the Podolian group is limited to three Italian breeds: Romagnola, Chianina and Marchigiana, the latter derived from crossbreeding between the first two in the early 20th century. The significant differences between the haplogroup distributions of Podolian and non-Podolian and the low genetic differentiation among the Italian Podolian breeds (three times less than in other Podolian breeds or in the non-Podolian group) points to a common (and perhaps peculiar) origin. As a matter of fact, according to the first component of the PCA, five Italian beef cattle breeds formed a clearly separated group and were also closer to the Turkish Grey than to any other Podolian breeds. At first this peculiarity could be explained as the effect of a stronger beef-oriented selection carried out on these breeds compared to the other Podolian populations. However, another important Italian beef cattle, the Piemontese, is placed within the other Podolian cluster, which shares the common feature of a strong grey coat. Thus, an alternative explanation might assume a different ancestral origin for the two groups of Podolian breeds, as summarized in Fig 1: a first group, mostly consisting of breeds from East Europe and northern Italy that share a similar mitochondrial gene pool, may have originated from ancestors migrated through an inland route from Podolia across eastern Europe all the way into northern Italy, in accordance with the great wave of cattle migration occurred during the Barbarian invasions. Whereas, a second group, including the white Podolian cattle closely related to the Turkish Grey, may descend from ancestral bovines brought to Italy through a different and likely maritime route crossing the Mediterranean Sea. A previous study [34] suggested also a genetic link between the Turkish Grey and Bulgarian and Hungarian breeds, but our results do not support such hypothesis, highlight instead a stronger maternal relationship between the Turkish Grey and five central-southern Italian Podolian breeds. Those cattle are bred since the medieval time [29] in an area that largely overlaps with the ancient territory of Etruria. This finding further supports and extends another hypothesis, according to which at least part of the maternal ancestry of those breeds could be related to the Etruscan migration from Lydia, a region on the south-western coast of ancient Anatolia [9]; [35]; [36], [37]; [38]. It is worth noting that the five Podolian breeds are also the main Italian beef cattle together with the Piemontese, and that previous studies suggested a possible common genetic origin [39]. Our findings suggest that, in spite of a stronger beef-oriented selection, their mitochondrial gene pool still preserves genetic traces of a different maternal origin, confirming that the selection practices were mostly male-mediated and enforcing the importance of the mtDNA screening to reconstruct the ancestry and history of current breeds.

Ethics statement
All experimental procedures were reviewed and approved by the Animal Research Ethics Committee of the Universities of Perugia and Pavia in accordance with the European Union Directive 86/609.

Samples
The entire dataset analyzed in this study encompasses 1,957 mtDNA control-region sequences including 1,321 from our previous studies [11]; [24]; [40], 428 retrieved from GenBank, and 208 additional samples specifically collected for this study ( Table 1,

DNA extraction, amplification and sequencing
As for the 208 novel mtDNAs, blood samples were collected from the jugular vein of each animal in vacutainer tubes, containing EDTA as anticoagulant. These animals were chosen in different farms in order to avoid closely related individuals and gather a representative sample of the breeds. Whole blood was stored at -20˚C until DNA extraction. DNA was isolated using the GenElute Blood Genomic DNA kit (Sigma Aldrich, St. Louis, MO, USA) and stored at -20˚C until genotyping. PCR amplification of the control region was performed using forward and reverse primers (5'-CCTAAGACTCAAGGAAGAAACTGC-3' and 3'-AACCTAGAGGGC ATTCTCACTG-5' respectively) specifically designed on the Bovine Reference Sequence (BRS; GenBank V00654). The 1138 bp PCR fragment encompassed the mtDNA control region from np 15718 to 517. Amplicons were first purified using exonuclease I and alkaline phosphatase (ExoSAP-IT1 enzymatic system-USB Corporation, Cleveland, OH, USA), then sequenced with the primer 15757F (5'-CCCCAAAGCTGAAGTTCTAT-3'), as previously described [40]. A dataset of 1,321 sequences was already available in our laboratories. All data were recorded in GenBank with accession numbers MF474376-MF475904 (S3 Table) and compared to those retrieved from the database (S2 Table).

Data analyses
Sequences were aligned to the Bovine Reference Sequence (BRS; V00654) using the software Sequencher TM 5.0. For a total of 1,617 samples, we were able to analyse a 731-bps fragment trimmed from np 15823 to np 215, while only a short fragment (221 bps, from np 16042 to np 16262) was considered in order to include the widest possible number of samples (N = 1,957). Haplotypes were classified in haplogroups and sub-haplogroups according to previously identified mutational motifs [24]. Indices of molecular variation were calculated using the DNAsp 5.1 software [41], while an analysis of molecular variance was computed using AMOVA program implemented in the ARLEQUIN 3.01 package [42]. Finally, principal component analyses (PCA) were performed using Excel software implemented by XLSTAT, as described elsewhere [43,44]. The PCA is a widely used dimension-reduction method that summarizes the variance of multivariate data in a smaller number of variables (the principal components, PCs), which are linear functions of the original variables, here expressed as haplogroup and sub-haplogroup frequencies, The rarest haplogroups were phylogenetically grouped and frequencies were calculated by considering only different haplotypes within the same breed.
Supporting information S1