Evidence for Sub-Haplogroup H5 of Mitochondrial DNA as a Risk Factor for Late Onset Alzheimer's Disease

Background Alzheimer's Disease (AD) is the most common neurodegenerative disease and the leading cause of dementia among senile subjects. It has been proposed that AD can be caused by defects in mitochondrial oxidative phosphorylation. Given the fundamental contribution of the mitochondrial genome (mtDNA) to the respiratory chain, a number of studies have investigated the association between inherited mtDNA variants and multifactorial diseases; however, no general consensus has yet been reached on the correlation between mtDNA haplogroups and AD.
Methodology/Principal Findings We applied for the first time a high resolution analysis (sequencing of the displacement loop and restriction analysis of specific markers in the coding region of mtDNA) to investigate the possible association between mtDNA-inherited sequence variation and AD in 936 AD patients and 776 cognitively assessed normal controls from central and northern Italy. Among over 40 mtDNA sub-haplogroups analysed, we found that sub-haplogroup H5 is a risk factor for AD (OR = 1.85, 95% CI:1.04–3.23), in particular for females (OR = 2.19, 95% CI:1.06–4.51), and independently of the APOE genotype. Multivariate logistic regression revealed an interaction between H5 and age. When the whole sample is considered, the H5a subgroup of molecules, harboring the 4336 transition in the tRNAGln gene, already associated with AD in early studies, was about threefold more represented in AD patients than in controls (2.0% vs 0.8%; p = 0.031), and it might account for the increased frequency of H5 in AD patients (4.2% vs 2.3%). The complete re-sequencing of the 56 mtDNAs belonging to H5 revealed that AD patients showed a trend towards a higher number (p = 0.052) of sporadic mutations in tRNA and rRNA genes when compared with controls.
Conclusions Our results indicate that high resolution analysis of inherited mtDNA sequence variation can help in identifying both ancient polymorphisms defining sub-haplogroups and the accumulation of sporadic mutations associated with complex traits such as AD.


Introduction
Alzheimer's Disease (AD) is the most common neurodegenerative disease and the leading cause of dementia among senile subjects. A few AD cases are early-onset and familial, with autosomal dominant inheritance, while the majority of cases are late-onset (over 60 years old) and sporadic [1]. Sporadic AD (SAD) has a complex aetiology due to environmental and genetic factors which, taken alone, are not sufficient to cause the disease. At present, the major recognized genetic risk factor for SAD is the ε4 allele of apolipoprotein E (APOE4).
However, the causal factors in the majority of late-onset AD patients are still unknown and there is likely a complex interaction between genetic and environmental factors that eventually results in the pathology [2].
It has been proposed that SAD can be caused by defects in mitochondrial oxidative phosphorylation (OXPHOS) [3]. Structurally abnormal mitochondria have been observed in AD brains [4], and deficiencies in mitochondrial OXPHOS enzymes, such as cytochrome c oxidase, have been repeatedly reported in the brains and other tissues of AD patients [5,6]. Mitochondria are deeply involved in various cellular processes such as ATP synthesis, heat production and reactive oxygen species (ROS) generation through oxidative phosphorylation, but also calcium signalling and apoptosis [7,8]. Defects in OXPHOS inhibit ATP production and also increase mitochondrial ROS production, which, in turn, can damage the mitochondrial genome (mtDNA). Thus, this hypothesis on AD pathogenesis claims that mitochondrial impairment may result from accumulated mtDNA damage that accompanies normal aging, amplified by disease-specific factors [9,3]. Although there were reports of somatic mutations specifically at sites of known mtDNA regulatory elements, the available data from hybrid studies and post-mortem brain examinations did not provide conclusive proof that somatic mtDNA mutations can play a major or dominant role in AD aetiology [10,11].
Human mtDNA codes for the 12S and 16S mitochondrial rRNAs, 22 tRNAs, and 13 polypeptides; these polypeptides are essential subunits of mitochondrial OXPHOS enzyme complexes, which generate the principal source of intracellular energy, ATP [12,8]. It is inherited almost exclusively through the maternal lineage, does not recombine and is highly polymorphic. Mitochondrial genomes can be affiliated within haplogroups and subhaplogroups on the basis of their specific sequence motifs, reflecting mutation events accumulated over time along diverging maternal lineages. These haplogroups tend to show a continental or regional localization, and a very large number of European, African, and Asian/Native American-specific haplogroups have been identified so far [13,14].
Given the fundamental contribution of the mitochondrial genome for the respiratory chain, there have been a number of studies investigating the association between mtDNA lineages, aging and multifactorial diseases. It has been hypothesised that some mtDNA haplogroups are not neutral and have been selected through adaptation to climate and nutritional conditions [15,8,16]. Moreover, there are many reports on the association between mtDNA haplogroups and longevity [17][18][19], Leber hereditary optic neuropathy [20], as well as complex diseases like diabetes [21], ischaemic disease [22] and neurodegenerative diseases like Parkinson's Disease [23] and AD.
As for AD, contrasting data have been published. Some studies suggested that the transition at np 4336 of the mitochondrial tRNA glutamine gene is a risk factor for AD [24][25][26]. This mutation characterizes a European sub-branch of haplogroup H [13], which was later termed H5a [27,28]. It was subsequently reported that haplogroup T is under-represented whereas haplogroup J is over-represented in AD patients [29]. Haplogroup U has been reported to be under-represented in female and over-represented in male AD patients of European ancestry [30], while more recently the HV cluster was found to be significantly associated with the risk of AD regardless of gender and APOE4 status [31]. On the other hand, several studies of different European populations did not find any association between mtDNA haplogroups and AD [32][33][34][11,35]. On the whole, no general consensus has been reached on the correlation between mtDNA haplogroups and AD [36].
Since AD is a common neurodegenerative disorder, affecting about 3% of people older than 60 years worldwide [37], even a small risk associated with mtDNA haplotype could have major implications at the population level. It is therefore critically important to determine the role of mtDNA polymorphisms in AD by studying large cohorts of subjects at a level of molecular and phylogenetic resolution as fine as possible. Our study therefore aimed to investigate the possible association of mtDNA with AD at the sub-haplogroup level in a population of 936 AD patients and 776 controls from central-northern Italy, in order to clarify whether specific sub-haplogroup polymorphic sites are involved in AD, and to assess the possibility that associations detectable only at the sub-haplogroup resolution level might account for the discrepancies present in the literature regarding AD and European mtDNA inherited variability.

Results
In our total sample of 1712 Italian subjects (936 AD patients and 776 controls) there is a higher proportion of female (72.9%) and APOE4+ (43.5%) subjects among AD patients relative to controls (60.6% and 13.1%, respectively). A separate comparison by gender further confirmed that the proportion of APOE4 carriers differed significantly between AD patients and controls (Table 1).
The hierarchical survey of diagnostic markers in the coding region allowed the classification of mtDNAs from patients and controls into more than 40 haplogroups/sub-haplogroups (Table S1). Most of these are typical of modern European populations, but a few East Asian (M) [38] and sub-Saharan African (L1b, L2a, L2c, L3d, L3e) [16,39] mtDNAs were also detected. This latter finding is not unexpected, since low frequencies of African and East Asian haplogroups are not uncommon in populations of southern Europe. We then grouped, using a phylogenetic rationale whenever possible, all sub-haplogroups with frequencies lower than 1.5%, thus reducing the overall number of categories from over 40 to 19. No difference was found between AD patients and controls in the distribution of mtDNA groupings (Table 2). We then compared the 19 sub-haplogroup frequencies in AD patients and controls, taking into account the adjustment for multiple comparisons (p-value cut-off = 0.003); again, no statistical difference was observed. When frequencies were compared separately by gender, we likewise found no difference between AD patients and controls in the mtDNA distribution, and overall no significant differences among the sub-haplogroup frequencies (Tables S2 and S3).
We then performed a univariate logistic regression analysis with the mtDNA variable to identify potential sub-haplogroups associated with AD. Using haplogroup H as reference, the only sub-haplogroup that was statistically associated with AD was H5 (OR = 1.89, 95%CI = 1.03-3.42). Hence we created a dichotomous variable (H5 versus all other sub-haplogroups) and repeated the logistic regression analysis including sex, age and APOE4. Overall, sub-haplogroup H5 has a slightly higher AD risk (OR = 2.19, 95%CI = 1.06-4.51) in the female group than in the total sample (OR = 1.83, 95%CI = 1.04-3.23). As expected, in our population APOE4 carriers have a higher AD risk than non-APOE4 carriers (OR = 5.08, 95%CI = 3.98-6.50) and, among them, women have a higher risk than men (OR = 6.16, 95%CI = 4.45-8.53, and OR = 3.97, 95%CI = 2.69-5.88, respectively).
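As an illustration of how an odds ratio and its confidence interval arise from a two-by-two table, the following minimal sketch computes a crude (unadjusted) OR with a Woolf log-normal interval. It is not the SAS logistic regression used in the study. The carrier counts (39 AD patients and 18 controls with H5) are an assumption, back-calculated from the reported frequencies of 4.2% of 936 and 2.3% of 776, and the function name is hypothetical.

```python
import math

def odds_ratio_ci(exposed_cases, exposed_controls, total_cases, total_controls, z=1.959964):
    """Crude odds ratio with a Woolf (log-normal) 95% confidence interval."""
    a = exposed_cases                      # cases carrying the variant
    b = exposed_controls                   # controls carrying the variant
    c = total_cases - exposed_cases        # cases without the variant
    d = total_controls - exposed_controls  # controls without the variant
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# ASSUMPTION: 39 and 18 carriers are rounded back-calculations from the
# reported H5 frequencies (4.2% of 936 patients, 2.3% of 776 controls).
h5_or, h5_lo, h5_hi = odds_ratio_ci(39, 18, 936, 776)
print(f"crude OR = {h5_or:.2f}, 95% CI = {h5_lo:.2f}-{h5_hi:.2f}")
```

Under these assumed counts the crude estimate comes out at roughly 1.83 (1.04 to 3.23). A regression adjusted for sex, age and APOE4 would not necessarily give the same value.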
Subsequently, mitochondrial sub-haplogroup H5 and the major acknowledged risk factors for AD (gender, age, APOE4) were compared in multivariate regression analysis. As shown in Table 3, gender (female), APOE4 allele, age and sub-haplogroup H5 are confirmed to be risk factors for AD. Moreover, sub-haplogroup H5 interacts with age in modifying AD risk. Subjects younger than 75 years (corresponding to the median age in the total sample) and carrying sub-haplogroup H5 have a four-fold higher AD risk than subjects belonging to other sub-haplogroups, while among subjects who do not carry sub-haplogroup H5, those older than 75 years have a three-fold higher risk than younger subjects.
We then performed the multivariate logistic regression by gender. In the female group, APOE4 allele, age and sub-haplogroup H5 are confirmed to be risk factors for AD, and the APOE4 allele interacts with age in modulating AD risk. Among female subjects aged 76 years (corresponding to the median age in the female sample) or younger, those carrying the APOE4 allele have a ten-fold higher AD risk than non-APOE4 carriers, while among subjects older than 76 years, those carrying the APOE4 allele have a nearly four-fold higher AD risk than non-APOE4 carriers. Among female non-APOE4 carriers, those older than 76 years have a four-fold higher AD risk than younger subjects (Table 4). In the male group only APOE4 allele (OR = 3.87, 95%CI = 2.60-5.77) and age >74 years (OR = 2.15, 95%CI = 1.51-3.07) are confirmed to be risk factors for AD (data not shown).
To determine whether the increased risk associated with sub-haplogroup H5 could be attributed to specific mutations or mutational motifs, the complete sequence of virtually all H5 mtDNAs (56 out of 57; 38 AD patients and 18 controls) in our collection was determined and the phylogenetic distribution among the samples was analyzed. The results of the phylogenetic distribution and of the sequence analysis are summarized in Table 5 and Figure 1. A total of 159 mutated positions relative to the reference sequence [40] were detected and analyzed in the network, including 104 nucleotide changes in the coding regions: the count of each kind of mutation is important in order to understand the mutational spectrum of the molecule and identify mutational hotspots. Overall, 17 of the mutations (underlined and in italics in Table 5) were not previously reported in either MITOMAP (www.mitomap.org) or mtDB (www.genpat.uu.se/mtDB), and each of these mutations was observed in only a single H5 mtDNA.
The topology of our mtDNA network matches published European trees in most respects [41]. The network data set included 38 AD cases and 18 healthy controls, all belonging to H5. A reduced-median network was constructed based on the sequence variation in the entire mtDNA molecules, with the rCRS as reference (Figure 1).
The H5 network turned out to be star-like and made up of a number of sub-clusters. The major dichotomy is due to the presence/absence of the 4336 mutation. The presence of this mutation characterizes H5a (N = 15), the cluster over-represented in AD patients (1.1% in patients vs. 0.6% in controls). Moreover, complete mitochondrial sequence data show that H5a can be further subdivided; in particular there is one sub-branch, termed H5a1 and defined by the mutation 15833, which encompasses many of the mtDNAs within H5a (9 out of 15).
Twenty-nine of the mutations (in bold) listed in Table 5 result in amino acid changes, but excluding the mutations 8860 and 15326 (rCRS private mutations), none was particularly common. The highest frequencies were indeed reached by the mutations 5319 and 8563, each observed in one AD patient (2.6%) and two controls (11.1%).
Among all mutations observed in the 56 H5 mtDNAs, the mutated position at 15833, defining the H5a1 subgroup, was observed in 21.1% of the H5 AD patients and in 5.5% of the H5 controls; when compared across the entire sample, this difference was statistically significant (0.9% vs 0.1%; p = 0.039). The nucleotide change at np 4336 in the tRNA glutamine gene, already associated with AD in previous studies [24,25], was observed in 50.0% of the H5 AD patients and 33.3% of the controls. When the whole sample is considered, the H5a subgroup of molecules, harboring the 4336 transition, was about threefold more represented in AD patients than in controls (2.0% vs 0.8%; p = 0.031), and it might account for the H5 frequency increase in AD patients (4.2% vs 2.3%).
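The step from within-H5 percentages to whole-sample frequencies can be made explicit with a little arithmetic. The sketch below converts each reported percentage into a carrier count among the sequenced H5 mtDNAs (38 patients, 18 controls) and then divides by the full group size (936 patients, 776 controls). Rounding to the nearest whole carrier is an assumption, and the function name is hypothetical.

```python
N_AD, N_CTRL = 936, 776      # whole sample sizes reported in the text
H5_AD, H5_CTRL = 38, 18      # sequenced H5 mtDNAs reported in the text

def whole_sample_freq(pct_within_h5, n_h5, n_group):
    """Turn a percentage within H5 into (carrier count, % of the whole group)."""
    # ASSUMPTION: the reported percentage is rounded, so round to a whole count.
    carriers = round(pct_within_h5 / 100 * n_h5)
    return carriers, 100 * carriers / n_group

# Transition at np 4336: 50.0% of H5 patients, 33.3% of H5 controls
n4336_ad, f4336_ad = whole_sample_freq(50.0, H5_AD, N_AD)
n4336_ctrl, f4336_ctrl = whole_sample_freq(33.3, H5_CTRL, N_CTRL)

# Mutation at np 15833: 21.1% of H5 patients, 5.5% of H5 controls
n15833_ad, f15833_ad = whole_sample_freq(21.1, H5_AD, N_AD)
n15833_ctrl, f15833_ctrl = whole_sample_freq(5.5, H5_CTRL, N_CTRL)

print(f"4336:  {n4336_ad} vs {n4336_ctrl} carriers, {f4336_ad:.1f}% vs {f4336_ctrl:.1f}%")
print(f"15833: {n15833_ad} vs {n15833_ctrl} carriers, {f15833_ad:.1f}% vs {f15833_ctrl:.1f}%")
```

With these inputs the whole-sample frequencies round to 2.0% vs 0.8% for the 4336 transition and 0.9% vs 0.1% for the 15833 mutation, in line with the values quoted in the text.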
We did not find any statistically significant result when comparing the number of mutations along the mtDNA molecule by mtDNA region (Table 6). Subsequently, we searched for groups of singleton mutations falling in specific mtDNA regions that may increase the susceptibility to AD. In particular we focused on singleton mutations falling in tRNA and rRNA genes, considering that previous studies have shown that mutations in these genes, including the transition at np 4336 (tRNAGln) and the G3196A mutation (16S rRNA), may be associated with AD [24,42]. We found that ten sporadic mutations (single occurrences) were present in the tRNA plus rRNA genes from AD mtDNA sequences, while only two sporadic mutations were found in the same genes from controls. We verified the significance of this finding by comparing the number of tRNA+rRNA mutations with the number of mutations falling in the remainder of the coding region (43 in AD patients and 37 in controls). The χ² test, under the hypothesis of a homogeneous distribution of mutations, showed borderline significance (p = 0.05).
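The comparison just described can be sketched as a Pearson χ² test on a two-by-two table (rows: AD patients and controls; columns: tRNA+rRNA mutations and mutations in the rest of the coding region). The version below applies no continuity correction, which is an assumption, since the text does not say whether one was used. The function name is hypothetical, and the p-value uses the identity for one degree of freedom, p = erfc(sqrt(χ²/2)).

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts taken from the text: 10 vs 2 sporadic tRNA+rRNA mutations,
# 43 vs 37 mutations in the remainder of the coding region.
stat, p_value = chi2_2x2(10, 43, 2, 37)
print(f"chi2 = {stat:.2f}, p = {p_value:.3f}")
```

Under these assumptions the statistic is about 3.74 with a p-value just above 0.05, which is consistent with the borderline result reported.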

Discussion
Although there is a large amount of evidence about the role of mitochondria in the pathogenesis of AD, a definitive conclusion regarding the association of mtDNA haplogroups with AD has not been reached yet [36]. To date, several studies have reported a lack of association between mtDNA haplogroups and AD [32][33][34][11,35], while others, analyzing different Caucasian populations, found positive associations with specific haplogroups as well as specific haplogroup clusters [24][29][30][31][43].
To our knowledge, this is the first case-control study investigating, at the sub-haplogroup level, the association of mtDNA with Alzheimer's Disease. We analyzed a large cohort of AD patients (N = 936) and controls (N = 776) comparable for age and ethnicity from the central-northern regions of Italy, where the controls were directly assessed for their cognitive status. We found that sub-haplogroup H5 appears to be associated with a higher risk of AD in both the total sample and the female group. This result was in keeping with a recent study on the Polish population which revealed that the super-haplogroup HV and haplogroup H are associated with a higher risk of AD [31].
Studies of both survival after sepsis [44] and sperm motility [45] have shown significant associations with mtDNA haplogroups, leading to the proposal that mitochondria bearing haplogroup H mtDNAs, and particularly haplogroups H3, H4, H5 and H6 [46], are associated with a more tightly coupled oxidative phosphorylation, and consequently should produce more ROS than those with haplogroup T. Accordingly, it was hypothesized that subjects with haplogroup H could be more prone to oxidative stress than those with other haplogroups [8], and consequently more susceptible to neurodegenerative diseases, in which oxidative stress plays a major role. This hypothesis would account for the higher AD risk we observed for H5 subjects.
We also found that sub-haplogroup H5 interacts with age in modifying AD risk. H5 subjects younger than 75 years old have a higher AD risk than non-H5 subjects, while in those older than 75 years no increased risk is observed. Age is a strong risk factor for AD [37] and this interaction suggests that age 75, corresponding to the median age in our sample, could be considered as a threshold, below which risk factors such as sub-haplogroup H5 and APOE could independently exert their major effect on the development of the disease, while above this age other risk factors, such as aging itself, would largely prevail.
The presence of mtDNA sub-haplogroup H5 confers a greater risk of AD in women than in the total sample of AD patients considered. In our sample the proportion of females was higher among AD patients than among controls; however, we can exclude that the association is merely due to a gender imbalance, because the frequencies of all haplogroups in AD females were comparable to those in female controls (and also in male controls), the latter being similar to those of the northern Italian general population [47].
In males we did not find any significant association but only a trend toward an increase in H5 frequency in AD males relative to controls. It should be noted that males are about one third of our total AD sample, because they are less represented both among the elderly [48] and in AD patients [49]. Thus, the lack of a significant association between sub-haplogroup H5 and AD in men could be explained by the smaller size of the AD male group. However, a specific or more pronounced risk in women cannot at present be excluded, given the growing evidence that components of the AD phenotype differ significantly based on gender [50].
As for APOE, no association emerged between mtDNA sub-haplogroup variation and APOE4 genotype, suggesting that they likely exert independent effects on AD. Previous results on this topic are discordant, with some authors reporting a positive association [43,31] and others no association [30,35] between haplogroups and APOE4. Indeed, we found a strong association of APOE4 with age, this effect being particularly evident in females younger than 76 years, i.e. the same age group where mtDNA sub-haplogroup H5 appears to exert its major influence on AD.
In order to assess whether the H5 background of the patients harboured mutations responsible for the increased AD risk, we investigated the complete mtDNA sequences of our H5 subjects. We did not find any particular mutational pattern in the H5 patients (compared to controls) except for a threefold increase of the H5 subgroup (H5a) characterized by a transition at position 4336 (tRNAGln gene) that has been previously shown to be associated with AD [24,25], and a sevenfold increase of the synonymous mutation at np 15833 which defines the H5a1 sub-haplogroup. The latter finding is difficult to explain, unless the high frequency of this C>T transition at np 15833 in AD patients is due to a mutational bias toward specific codon usage at synonymous sites [16].
In addition, sporadic mutations (showing up in one sample only) are slightly more numerous in the tRNA plus rRNA genes of AD samples than in those of controls. This finding is also in keeping with growing evidence that mutations in the protein synthesis machinery of mitochondria may be a frequent cause of human degenerative diseases [42]. Thus our results support a relationship between mtDNA sequence variation and AD and suggest that both ancient polymorphisms defining sub-haplogroups and the accumulation of sporadic mutations are most likely involved.
Further study will need to better focus on the genetic variation falling in tRNA and rRNA genes, that is the mitochondrial translation machinery. In fact a better characterization of the molecular problems caused by these mutations in combination with may be of help in characterizing different subgroups of sporadic AD patients and help in their treatment.
In summary, the main limitation of the present study is the lack of a replication study in another population. Accordingly, we cannot exclude population-specific effects. However, the large number of patients and controls in our study makes false positive results less likely. A replication of our study is quite demanding, owing to the difficulty of enrolling a comparably large number of patients and of cognitively well assessed controls, either in the same geographic area or in other European countries. The strengths are: (i) our study refers to a considerable number of clinically well assessed AD patients (936) and controls (776); (ii) both AD patients and controls have been recruited in a specific, relatively large geographic area, thus avoiding possible bias related to a founder effect or population heterogeneity; (iii) the control subjects have been tested for their cognitive capability; (iv) we have identified for the first time a candidate risk sub-haplogroup for AD, which extends previous data regarding the HV cluster obtained on a more limited number of AD patients and controls in a Polish population [31]; (v) we have provided 56 new complete mtDNA sequences, all belonging to the H5 sub-haplogroup, which not only suggest that sporadic mutations in mitochondrial tRNA and rRNA genes could be involved in AD risk but also, more generally, represent a major contribution to mtDNA genetics regarding this specific haplogroup in a Caucasian population; (vi) we have pointed out the relationship between mtDNA sub-haplogroup H5 and other well known risk factors for AD, such as the APOE4 genetic polymorphism and age. These data allowed us to identify the subgroup of AD patients (younger than 75 years of age; female sex) in which sub-haplogroup H5 has a stronger effect.

Methods
Patients were diagnosed by skilled evaluation units as suffering from probable AD, according to NINCDS-ADRDA criteria [51], and underwent a comprehensive geriatric assessment, including an extended neuropsychological evaluation.
The control group was comparable for age, sex and ethnicity to the patient group. A major characteristic of this control group is that its members were directly assessed by MMSE in order to exclude subjects affected by cognitive deficiency. Written informed consent was obtained from all control individuals and from primary caregivers on behalf of AD patients. Each institution which provided the DNA samples received approval from its own ethics committee. In particular, approval was given by the ethics committees of the Italian National Research Center for Ageing in Ancona, the Department of Neurological and Psychiatric Sciences of the University of Florence, the Regional Center for Cerebral Ageing in Valdagno and the S. Giovanni di Dio-Fatebenefratelli Center in Brescia.
DNA was recovered from fresh blood by phenol-chloroform standard procedures. MtDNA profiles in patients and controls were determined by sequencing the entire mtDNA control region for each subject from nucleotide position (np) 16024 to np 576. This was followed by a hierarchical survey of haplogroup and subhaplogroup diagnostic markers in the coding region [20,47]. APOE genotyping was performed as previously described [52]. Patients and controls were stratified into two subgroups, according to their APOE4 status: those carrying at least one APOE4 allele (APOE4+) and non-APOE4 carriers (APOE4-).
Complete mtDNA sequences were obtained after treatment of total DNA samples with REPLI-g Mitochondrial DNA kit (Qiagen, Valencia, CA) for specific whole genome amplification of mtDNA, followed by PCR amplification with MitoALL Resequencing kit (Applera, Foster City, CA). The PCR products (5-10 ng) were purified by EXOSAPit (U.S. Biochemical, Cleveland, Ohio) and used for direct sequencing with BigDye kit version 3.1 (Applera). Electropherograms were inspected with SeqScape version 2.5 software (Applera). All the samples were aligned and compared with the revised Cambridge Reference Sequence (rCRS) [40].
Reduced median network analyses, based on nucleotide variation in the complete sequences, were carried out with Network 4.5.1.0 (http://fluxus-engineering.com) [53], with the default ε value of zero.
Because of mutation rate heterogeneity in mtDNA, there is no consensus on the weights to be assigned to the mutated nucleotide positions. As reported in other studies [27,54] and after experimental tests, we chose to give different weights to nucleotide positions in the D-loop and coding regions of our samples. After an initial run, we summed the statistics (counts of each mutation) obtained for all samples and used them to weight the characters from 10 to 90, with an inverse linear relation to the number of occurrences in the statistics.
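One possible reading of this weighting scheme is sketched below: the position with the most occurrences receives weight 10, the one with the fewest receives weight 90, and the rest are interpolated linearly in between. The exact formula, the rounding to integers, the function name and the example counts are all assumptions made for illustration; the text gives only the weight range and the inverse linear relation.

```python
def inverse_linear_weights(counts, w_min=10, w_max=90):
    """Map mutation counts to weights: rare positions weigh more, frequent ones less."""
    lowest, highest = min(counts.values()), max(counts.values())
    if highest == lowest:
        # All positions equally frequent: nothing to rank, give the top weight.
        return {pos: w_max for pos in counts}
    span = highest - lowest
    return {
        pos: round(w_max - (n - lowest) * (w_max - w_min) / span)
        for pos, n in counts.items()
    }

# HYPOTHETICAL occurrence counts for three illustrative positions
example_counts = {"pos_A": 1, "pos_B": 5, "pos_C": 9}
weights = inverse_linear_weights(example_counts)
print(weights)
```

In this toy example the rarest position gets weight 90, the most recurrent gets 10, and the intermediate one gets 50, so that fast-mutating sites contribute less to the network topology.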

Statistical Analysis
Mitochondrial sub-haplogroups, genotype frequencies and gender were compared between AD patients and controls using the χ² or Fisher exact test, while differences in age were investigated through the non-parametric Wilcoxon rank-sum test.
The mitochondrial sub-haplogroup variability comprises at least 40 subgroups, which were further grouped into 19 subgroups considering their frequencies and phylogenetic relationships.
The comparisons between each of the 19 mtDNA sub-haplogroups in AD patients and controls were computed by applying both Bonferroni's adjustment and Holm's step-down procedure. All analyses were performed separately for women and men, as well as for the total group, and multivariate logistic regression analysis using a stepwise procedure, adjusting for significant interactions (considering a threshold level of 0.15), was used to assess the relationship between independent variables (mtDNA, APOE4 status, age and gender) and the outcome. Tests for statistical significance were two-sided with α = 0.05; effect size for the association was measured as odds ratio (OR) with 95% confidence intervals (CI). SAS (9.1) statistical software from SAS Institute, Cary (NC), was used for all statistical analyses.
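The two multiple-comparison procedures named above can be illustrated with a short sketch. It is not the SAS code used in the study. The function names and the example p-values are hypothetical; only α = 0.05 and the 19 comparisons come from the text.

```python
def bonferroni_cutoff(alpha, m):
    """Per-test significance threshold for m comparisons."""
    return alpha / m

def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: returns one reject flag per input p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared with alpha / (m - k + 1)
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject

# 19 sub-haplogroup comparisons at alpha = 0.05 (values from the text)
cutoff_19 = bonferroni_cutoff(0.05, 19)
print(f"Bonferroni cut-off for 19 tests: {cutoff_19:.4f}")

# HYPOTHETICAL p-values, chosen only to show where the two procedures differ
example_p = [0.013, 0.04, 0.03, 0.005]
bonf = [p <= bonferroni_cutoff(0.05, len(example_p)) for p in example_p]
holm = holm_reject(example_p)
print(bonf, holm)
```

For 19 tests the Bonferroni threshold is about 0.0026, which rounds to the cut-off of 0.003 quoted in the Results. Holm's procedure controls the same family-wise error rate but is never less powerful, as the toy example shows: it rejects one hypothesis that Bonferroni retains.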