Ancient mitochondrial DNA pathogenic variants putatively associated with mitochondrial disease

Mitochondrial DNA variants associated with diseases are widely studied in contemporary populations, but their prevalence has not yet been investigated in ancient populations. The publicly available AmtDB database contains 1443 ancient mtDNA Eurasian genomes from different periods. The objective of this study was to use these data to establish the presence of pathogenic mtDNA variants putatively associated with mitochondrial diseases in ancient populations. The clinical significance, pathogenicity prediction and contemporary frequency of mtDNA variants were determined using online platforms. The analyzed ancient mtDNAs contain six variants designated as being “confirmed pathogenic” in modern patients. The oldest of these, m.7510T>C in the MT-TS1 gene, was found in a sample from the Neolithic period, dated 5800–5400 BCE. All six have well-established clinical associations, and their pathogenic effect is corroborated by very low frequencies in contemporary populations. Analysis of the geographic location of the ancient samples, contemporary epidemiological trends and probable haplogroup association indicates diverse spatiotemporal dynamics of these variants. The dynamics in prevalence and distribution are conceivably the result of de novo mutations or human migrations and subsequent evolutionary processes. In addition, ten variants designated as possibly or likely pathogenic were found, but the clinical effect of these is not yet well established and further research is warranted. All detected mutations putatively associated with mitochondrial disease in ancient mtDNA samples are in tRNA coding genes. Most of these mutations are in an mt-tRNA type (Model 2) that is characterized by loss of D-loop/T-loop interaction. Exposing pathogenic variants in ancient human populations expands our understanding of their origin and prevalence dynamics.


Introduction
The scarcity of prehistoric human remains hampers obtaining a complete picture of disorder incidence in ancient times. Altered or affected bones in skeletal remains might provide information about certain diseases, such as cancers [1,2] and rheumatic diseases [3]. A lesion on an  [4] and a fibrous dysplasia on a Neanderthal rib (older than 120000 years) from the site of Krapina, Croatia [5] are early confirmation of neoplastic disease. Neoplastic tumors have, however, been detected in early Homo samples as old as 1.7 million years, and these provide further insight into the outset of human cancers [6]. Mummified human remains of a 5300-year-old Neolithic man (Ötzi, the Tyrolean Iceman) show hardening of the arteries, suggesting a predisposition for coronary heart disease [7]. Based on information on subsistence, geography and sample age, Berens et al. (2017) estimate the genetic disease risk for 3180 loci in 147 ancient genomes, and find it to be similar to that of modern-day humans [8]. Focusing on individual genomes, however, they estimate that the overall genomic health of the Altai Neanderthal is worse than that of 97% of present-day humans and that Ötzi, the Tyrolean Iceman, had a genetic predisposition to gastrointestinal and cardiovascular diseases [8].
Data on the prehistoric origin of mitochondrial diseases are, however, notably lacking. The first disease linked directly to an mtDNA mutation was discovered in 1988 [9]. Recently, whole-genome sequencing of mtDNA has led to significant advances in our understanding of mitochondrial diseases. Rare pathogenic mutations in mitochondrial DNA cause monogenic mitochondrial diseases involving multiple systems and are associated with variable clinical phenotypes. The severity of the clinical and biochemical phenotype caused by pathogenic mtDNA mutations has been found to be roughly proportionate to the percentage of mutant heteroplasmy [10,11]. The mitochondrial haplogroup harboring the mutation might also alter the penetrance of mitochondrial diseases [12]. Specific subclades of haplogroup J, for example, have been shown to affect the penetrance and pathogenicity of Leber's hereditary optic neuropathy [13]. Certain mtDNA mutations and haplogroups are also predictors of both lifespan and risk of various age-associated diseases, including degenerative diseases, cancer, diabetes, heart failure, sarcopenia and Parkinson's disease [14].
We had previously performed whole-genome sequencing on 25 Thracian mtDNA samples dated 3000-2000 BCE and detected 608 mtDNA variants [15]. Only one of these, however, m.15326A>G (rs2853508), is designated as likely pathogenic, associated with familial breast cancer [16]. This variant was found in all the samples we analyzed, and the MitoMap database (2019, update no. 3) estimates its population frequency at 0.98 [17]. Such a high frequency indicates that this variant is common and probably not disease related.
The objective of this study was to investigate the prevalence of pathogenic mutations in ancient mtDNA and to provide further insight into the emergence and dynamics of mitochondrial diseases.

Materials & methods
We used the comprehensive data of the ancient mtDNA genome sequences from the Ancient mtDNA database [18]
• The GenBank database provides access to up-to-date and exhaustive DNA sequence information, and was used to get information on variant frequencies in contemporary populations [24].
• MitoTIP is an in silico tool for predicting pathogenicity of novel mitochondrial tRNA variants [25]. It integrates multiple sources of information, including the position of the variant within the tRNA, conservation across species and population frequencies, to provide a prediction for the likelihood that novel single nucleotide variants would cause disease.
• HmtVar uses algorithms to determine the importance of the variant position in tRNAs and was utilized to predict the pathogenicity and potential impact of mtDNA variants [26].
• Complementing the information obtained using the abovementioned tools, a literature survey was conducted on variants designated as "confirmed pathogenic" in an effort to acquire a comprehensive picture of the evidence for their disease-causing effect.

Results
Out of 3191 unique mtDNA variants established in the 1443 analyzed ancient samples, six are designated as being "confirmed pathogenic" and 10 as "likely/possibly pathogenic" by MitoMap (Table 1). For each of these variants, we review the available evidence in HmtVar and in the scientific literature for their pathogenic effects.

Confirmed pathogenic mutations
m.5703G>A (rs199476130). The variant m.5703G>A was found in five ancient mtDNA samples, one from the Neolithic period, three from the Iron Age and one from the Middle Ages. Two of the Iron Age samples are from the same site and time period in today's Russia, extracted from the remains of a male and a female who were probably related. This is also the pathogenic variant established in ancient mtDNA from archeological sites spanning the widest geographical range, i.e. from Poland to Mongolia. Our literature survey for the m.5703G>A mutation finds that it has been reported to cause mitochondrial myopathy (MM) and ophthalmoplegia [28,29]. Recently, its phenotypic spectrum was broadened by a report of a patient with typical myoclonic epilepsy with ragged red fiber (MERRF) syndrome carrying a heteroplasmic m.5703G>A mutation [30]. In another recent study, however, it has been argued that the investigations carried out to confirm the pathogenicity of this variant are insufficient [71].
m.3243A>G (rs199474657). This variant was detected in a sample from a Bronze Age site in Germany (2029-1911 BCE) (Table 1). Population-based studies suggest the m.3243A>G mutation is one of the most common disease-causing mtDNA mutations, with a carrier rate of 1 in 400 people [72,73]. This mutation has been shown to be associated with a wide range of symptoms, and there is evidence that the outcome is also determined by nuclear genetic factors [34]. It is associated with mitochondrial encephalopathy, lactic acidosis and stroke-like episodes (MELAS) [74], maternally inherited deafness and diabetes (MIDD) [75] and chronic progressive external ophthalmoplegia (CPEO) [76]. Other reported features include renal failure [77], isolated myopathy, cardiomyopathy, seizures, migraine, ataxia, cognitive impairment, bowel dysmotility and short stature [78]. Low to moderate levels of mutant heteroplasmy in the m.3243A>G mutation are often associated with MIDD, whereas higher levels are variably associated with myopathy, high-frequency sensorineural hearing loss, short stature, epilepsy, strokes and dementia [11,79,80]. Elevated heteroplasmy levels have in general been shown to lead to neurologic, movement, metabolic, and cardiopulmonary impairments [81].
m.5650G>A. This variant was found in three samples from the Iron Age, but they are from different Central Asian sites and time periods (ranging from 900 BCE to 134 BCE), so the individuals they were taken from are in all likelihood unrelated. McFarland and colleagues (2008) report a family where proximal myopathy has become increasingly severe with successive generations of the maternal lineage, and this pure myopathy is shown to be caused by the m.

Likely/possibly pathogenic mutations
This group includes ten mutations for which the clinical effect designation differs between the two platforms used. Whereas HmtVar classifies them as pathogenic, MitoMap classifies them as likely or possibly pathogenic (Table 1).
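The comparison between the two platforms can be sketched as a simple lookup of disagreeing labels. The snippet below is only an illustration: the function name `discordant` and both dictionaries are hypothetical, the labels are typed in by hand rather than retrieved from HmtVar or MitoMap, and `placeholder_variant` is an invented entry included only to show a concordant case.

```python
def discordant(calls_a, calls_b):
    """Return variants present in both tables whose labels differ.

    Both arguments map a variant name to a classification label.
    The result maps each discordant variant to (label_a, label_b).
    """
    shared = calls_a.keys() & calls_b.keys()
    return {
        variant: (calls_a[variant], calls_b[variant])
        for variant in sorted(shared)
        if calls_a[variant] != calls_b[variant]
    }


# Hand-entered, illustrative labels (not queried from either platform).
hmtvar_calls = {
    "m.8296A>G": "pathogenic",
    "m.8328G>A": "pathogenic",
    "placeholder_variant": "pathogenic",  # invented concordant example
}
mitomap_calls = {
    "m.8296A>G": "likely/possibly pathogenic",
    "m.8328G>A": "likely/possibly pathogenic",
    "placeholder_variant": "pathogenic",  # invented concordant example
}

for variant, (label_a, label_b) in discordant(hmtvar_calls, mitomap_calls).items():
    print(f"{variant}: HmtVar={label_a!r}, MitoMap={label_b!r}")
```

Comparing raw label strings is the simplest possible rule; a real comparison would probably first map each platform's vocabulary onto a shared scale.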

Discussion
This study establishes for the first time the presence of pathogenic mtDNA variants in 1443 ancient mtDNA Eurasian genomes from different periods, publicly available in the AmtDB database. Among the 3191 unique variants detected in the analyzed samples, six are "confirmed pathogenic" with well-established clinical association in contemporary patients. Our results suggest that the prevalence of pathogenic mtDNA mutations might have been far greater in ancient populations: eleven cases of six unique confirmed pathogenic mtDNA mutations were established in 1443 ancient samples, compared to 1.5-2.9 in 100000 in contemporary populations [72]. This marked drop in prevalence might be related to the prevalence of particular mitochondrial haplogroups, as there is evidence that certain pathogenic mutations are associated with a particular haplogroup background [83].
The m.5703G>A mutation is established in European and Asian ancient samples, whereas in contemporary patients it has been detected in a 16-year-old Caucasian girl in Europe and a Chinese girl, along with an African-American woman (cf. Table 1, Fig 1). The contemporary geographic distribution of this variant thus generally corresponds to that in the examined past periods. This variant was established in samples assigned to five different haplogroups (cf. Table 1, Fig 1). The sample found in today's Russia belongs to haplogroup W1c, which is determined to have arisen 8 kya [84] and is presently found in Europe, the Near East, the Caucasus and India [85]. The samples found in today's Kazakhstan and Mongolia belong to East Asian haplogroups F1b1 and D4b2b2b, respectively [86]. The estimated distribution of m.5650G>A in ancient times thus generally corresponds to the contemporary distribution of the haplogroups harboring this variant. The incidence of the variant in contemporary European populations, where F1b1 and D4b2b2b lineages are rare, indicates a lack of association of the variant with these haplogroups. The presence of m.5650G>A in contemporary European samples, and its absence from ancient European samples, might be the result of de novo mutation or migration and subsequent evolutionary events.
The remaining four ancient pathogenic variants are established in single samples belonging to different mitochondrial haplogroups. The m.14674T>C variant was determined in only one ancient sample from the territory of present-day Mongolia, but contemporary carriers of this variant are found in the U.S., Germany, Brazil, Italy, Japan, the UK and Sweden. This ancient sample is assigned to haplogroup U2e1h, a sub-branch of haplogroup U2e, presently found at low frequencies in Europe and Western Asia [87]. The correspondence of the contemporary distribution of m.14674T>C with the geographic range of the relatively rare U2e1h lineage indicates a possible association between this variant and haplogroup.
Overlap between the geographical location of the ancient sample and the present-day distribution of the corresponding mitochondrial haplogroup is found in three more pathogenic mutations: m.8340G>A, established in an ancient sample from present-day Poland (assigned to the Western Eurasian haplogroup HV18); m.7510T>C, detected in an ancient sample from present-day Bulgaria (assigned to haplogroup J2b1, found among modern populations in Atlantic Europe, the East European Plain and the Near East); and m.3243A>G, found in an ancient sample from Germany (belonging to a subclade of T2f, considered almost exclusively European, with rare instances in the Near East) [88].
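To make the scale of this comparison concrete, the arithmetic can be written out as below. This is a minimal sketch using only the figures quoted in the text (eleven cases in 1443 samples, and 1.5-2.9 per 100000 from [72]); the function name `per_100k` is an assumption for illustration, and the calculation does not address whether the two figures were measured in a comparable way.

```python
def per_100k(cases, sample_size):
    """Convert a raw count into a rate per 100,000 individuals."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return 100_000 * cases / sample_size


# Figures quoted in the text.
ancient_rate = per_100k(11, 1443)
modern_low, modern_high = 1.5, 2.9  # per 100,000, as cited from [72]

# Smallest and largest fold difference implied by the modern range.
fold_min = ancient_rate / modern_high
fold_max = ancient_rate / modern_low

print(f"ancient samples: {ancient_rate:.0f} per 100,000")
print(f"fold difference: {fold_min:.0f}x to {fold_max:.0f}x")
```

With these inputs the ancient figure is roughly 760 per 100000, a few hundred times the contemporary estimate.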
The established "confirmed pathogenic" mutations are detected in samples estimated to be between 1600 and 7800 years old. The oldest of these, m.7510T>C, is detected in a Neolithic sample (5800-5400 BCE). The m.5703G>A mutation was detected in a Neolithic sample that could be as old as 4800 years, and it is found again in samples from later periods: the Iron Age (900-600 BCE) and the Middle Ages (10-375 CE). Other mutations with established long histories, detected in Bronze Age samples more than 4000 years old, are m.3243A>G, the most common mtDNA mutation with pathogenic effect, present in contemporary populations at 0.02% frequency, and m.14674T>C, found in a 2600-year-old Bronze Age sample and also found in contemporary populations at 0.01% frequency (cf. Table 1). The m.5650G>A mutation is detected in an Iron Age sample that could be as old as 2900 years. The low frequency of these variants in contemporary populations is further corroboration of their pathogenic significance.
Two mutations with likely/possibly pathogenic effect, m.7543A>G and m.7554G>A, detected in Neolithic samples from the same archeological site and time span, are the oldest established mutations with putatively pathogenic effect (10200-10000 years old). In addition, two Bronze Age mutations, m.8296A>G and m.4440G>A, could be as old as 4800 years. It is noteworthy that m.4440G>A is detected in as many as 12 ancient samples from different time periods and locations. Although it is the most common clinically significant mutation detected in ancient mtDNA samples in this study, it has been described as a novel mutation causing MM in contemporary patients [57], which corroborates its pathogenic effect.
It is noticeable that all the established ancient mutations putatively associated with diseases are located in tRNA genes, and none is in genes encoding the 13 essential polypeptides of the OXPHOS system, even though tRNAs comprise only about 10% of the total coding capacity of the mitochondrial genome [89]. Epidemiological studies have highlighted that point mutations in the mt-tRNA genes are among the most common defects observed [73,90]. Mitochondrial tRNA mutations have been shown to be the most prevalent genetic defect by a survey of an adult population with mtDNA disease, accounting for more than 50% of all genetically diagnosed cases [91]. More than 150 different point mutations have been described in mt-tRNA genes, including novel disease-causing mutations. Associated pathogenic mechanisms continue to be identified [92], yet the role of mt-tRNA mutations in interfering with the translation mechanism remains unclear.
Ancient mutations putatively associated with mitochondrial diseases are in different tRNA genes and affect nucleotides in different functional parts of the encoded tRNA molecule (Table 2).
The MT-TK gene, coding for tRNA lysine, is most commonly affected by mutations putatively associated with mitochondrial diseases in ancient mtDNA samples, i.e. one confirmed pathogenic (m.8340G>A) and three likely/possibly pathogenic (m.8296A>G, m.8328G>A and m.8342G>A). It is conceivable that these mutations have different clinical significance related to their impact on the stability of the tRNA molecule.
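Because this paragraph mixes calendar ranges (BCE/CE) with approximate ages in years, a small conversion helper shows how the two relate. This is a sketch under stated assumptions: the reference year 2020 is chosen here for illustration and is not given in the text, and the function name is hypothetical. The subtraction of one for BCE dates reflects the absence of a year 0 in the BCE/CE calendar.

```python
REFERENCE_YEAR = 2020  # assumption for illustration; not stated in the text


def years_before_reference(year, era, reference_year=REFERENCE_YEAR):
    """Approximate age in years of a calendar date given as (year, era).

    era must be "BCE" or "CE". There is no year 0, so 1 BCE is
    immediately followed by 1 CE.
    """
    if year <= 0:
        raise ValueError("year must be a positive integer")
    if era == "CE":
        return reference_year - year
    if era == "BCE":
        return reference_year + year - 1
    raise ValueError("era must be 'BCE' or 'CE'")


# Date ranges quoted in the text.
for label, start, end, era in [
    ("Neolithic, m.7510T>C", 5800, 5400, "BCE"),
    ("Iron Age", 900, 600, "BCE"),
]:
    oldest = years_before_reference(start, era)
    youngest = years_before_reference(end, era)
    print(f"{label}: about {youngest}-{oldest} years old")
```

Under this assumption the 5800-5400 BCE range corresponds to roughly 7400-7800 years, in line with the upper bound quoted in the paragraph.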
A recent study on another mtDNA mutation in the same gene, m.8344A>G, known to be associated with MERRF, has demonstrated that tRNA modifications have distinct effects on the stability and synthesis of mitochondrial proteins [82]. Such regulating mechanisms might play a role in the etiology of human disease, and new RNA sequencing approaches to mitochondria should provide insights. Two ancient mutations were established in the MT-TD gene, coding for tRNA aspartic acid. Aspartic acid is a neurotransmitter, and recent studies show that it may be involved in the pathogenesis of stroke-like episodes [43]. The remaining putatively pathogenic mutations established in ancient mtDNA are located in different tRNA genes. It is notable that the established ancient mutations are located in different cloverleaf models of tRNAs (Fig 2). The tRNA model indicates one of the four groups into which human mt-tRNAs are classified based on their structural diversity and tertiary interactions [93]. Model 0 represents the quasi-canonical cloverleaf structure, with standard D-loop/T-loop interaction; Model 1 is a single tRNA with an atypical anticodon stem; Model 2, the most common among mt-tRNAs, is characterized by loss of D-loop/T-loop interaction; and Model 3 lacks the D-stem. Thirteen out of the sixteen mutations putatively associated with mitochondrial disease presented in this study are in Model 2.
Seven out of the 16 mutations considered here are located in the anticodon stem (CS), four in the acceptor stem (AS), two in the TΨC stem (TS), and single mutations are located in the dihydrouridine loop (DL), the anticodon loop (CL) and the dihydrouridine stem (DS) (Fig 2).
Confirmed pathogenic mutation m.5703G>A is located at position 27 of a Model 2 tRNA, a position that is involved in post-transcriptional modifications and is adjacent to position 26, which participates in tertiary folding and is also subject to post-transcriptional modifications. Confirmed pathogenic mutation m. cutback of mitochondrial protein synthesis, including reduction of OXPHOS proteins. Despite the clinical significance, the molecular mechanisms leading to such disturbances remain poorly understood.
As is often the case in ancient DNA analyses, the amount of endogenous template DNA is typically very low, and the surviving molecules are short and affected by postmortem cytosine deamination damage, which appears as C>T and G>A variants in sequence data [94]. It is noteworthy that 12 out of the 19 (63.2%) pathogenic or putatively pathogenic mutations that we detect in the analyzed ancient mtDNA samples are G>A or C>T substitutions. Review of the publications that present the analyzed ancient mtDNA genomes substantiates that the authors employed adequate analyses to mitigate the effect of postmortem damage (PMD), strengthening our confidence that these are real variants and not the result of PMD [27,31,33].
Still, pathogenic mutations in mitochondrial DNA often show highly variable phenotypes for any given point mutation, and the severity of the clinical and biochemical phenotype is roughly proportionate to the percentage of mutant heteroplasmy [10,79]. Identifying heteroplasmic variants and establishing the level of heteroplasmy in ancient samples is not a trivial task. Heteroplasmic variants, however, constitute the bulk of disease-associated mtDNA variants in contemporary humans, and most of the confirmed or putatively pathogenic variants detected in our study have a pathogenic effect in the heteroplasmic state. Nevertheless, due to insufficient phenotypic data about the human remains, there is no way of knowing exactly whether disease-associated mutations, or those predicted to have a strong functional effect, were indeed pathogenic in ancient populations.
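The substitution check described here can be sketched in a few lines. The snippet below is an illustration only, not the procedure used in the cited studies: it parses variant names of the form used in this paper (e.g. m.5703G>A) and flags those whose base change matches the C>T or G>A pattern. The function names are assumptions, and a flag only marks a variant as compatible with deamination damage; it says nothing about whether the call is actually an artifact.

```python
import re

# Variant names as written in this paper, e.g. "m.5703G>A".
VARIANT_RE = re.compile(r"^m\.(\d+)([ACGT])>([ACGT])$")

# Base changes that postmortem cytosine deamination can mimic.
DEAMINATION_LIKE = {("C", "T"), ("G", "A")}


def parse_variant(name):
    """Split a variant name into (position, reference base, variant base)."""
    match = VARIANT_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized variant name: {name!r}")
    position, ref, alt = match.groups()
    return int(position), ref, alt


def is_deamination_like(name):
    """True if the substitution is C>T or G>A."""
    _, ref, alt = parse_variant(name)
    return (ref, alt) in DEAMINATION_LIKE


# The six confirmed pathogenic variants named in the text.
confirmed = [
    "m.5703G>A", "m.3243A>G", "m.5650G>A",
    "m.14674T>C", "m.8340G>A", "m.7510T>C",
]
flagged = [v for v in confirmed if is_deamination_like(v)]
print(f"{len(flagged)} of {len(confirmed)} match the damage pattern: {flagged}")
```

Applied to the six confirmed pathogenic variants, the check flags the three G>A substitutions (m.5703G>A, m.5650G>A and m.8340G>A).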

Conclusion
The established mtDNA pathogenic mutations in the analyzed ancient samples, all in tRNA coding genes, are putatively associated with a wide range of mitochondrial diseases found in contemporary populations. Studying putative pathogenic mutations from ancient mtDNA provides information on the mitochondrial disease spectrum in ancient times, and comparing their frequencies among populations separated by significant time periods sheds light on the history of the disease. Our findings suggest that disease-associated genes are often genes with a long history, and that pathogenic variants in mtDNA exhibit diverse temporal dynamics with regard to their geographic distribution and mitochondrial haplogroup background association. Exposing pathogenic variants in ancient human populations contributes to our understanding of their origin and prevalence dynamics.