Expression of Novel Alzheimer’s Disease Risk Genes in Control and Alzheimer’s Disease Brains

Late onset Alzheimer’s disease (LOAD) etiology is influenced by complex interactions between genetic and environmental risk factors. Large-scale genome wide association studies (GWAS) for LOAD have identified 10 novel risk genes: ABCA7, BIN1, CD2AP, CD33, CLU, CR1, EPHA1, MS4A6A, MS4A6E, and PICALM. We sought to measure the influence of GWAS single nucleotide polymorphisms (SNPs) and gene expression levels on clinical and pathological measures of AD in brain tissue from the parietal lobe of AD cases and age-matched, cognitively normal controls. We found that ABCA7, CD33, and CR1 expression levels were associated with clinical dementia rating (CDR), with higher expression being associated with more advanced cognitive decline. BIN1 expression levels were associated with disease progression, where higher expression was associated with a delayed age at onset. CD33, CLU, and CR1 expression levels were associated with disease status, where elevated expression levels were associated with AD. Additionally, MS4A6A expression levels were associated with Braak tangle and Braak plaque scores, with elevated expression levels being associated with more advanced brain pathology. We failed to detect an association between GWAS SNPs and gene expression levels in our brain series. The minor allele of rs3764650 in ABCA7 was associated with age at onset and disease duration, and the minor allele of rs670139 in MS4A6E was associated with Braak tangle and Braak plaque score. These findings suggest that the expression of some GWAS genes, namely ABCA7, BIN1, CD33, CLU, CR1, and the MS4A family, is altered in AD brains.


Introduction
Late onset Alzheimer's disease (LOAD) is the most common form of dementia. AD is pathologically defined by extensive neuronal loss and the accumulation of extracellular amyloid plaques and intracellular neurofibrillary tangles in the brain. While the familial form of AD is associated with heritable mutations in the APP, PSEN1, and PSEN2 genes, LOAD onset and progression appear to be influenced by complex interactions between genetic and environmental risk factors. The ε4 allele of apolipoprotein E (APOE4) is the strongest genetic risk factor for LOAD [1][2][3][4] but only accounts for 10-20% of LOAD risk, suggesting that susceptibility to LOAD involves additional genetic and environmental risk factors.
Despite the identification of numerous SNPs that occur in genes that function in pathways relevant to AD, we still know little of the specific functional impact of the LOAD GWAS SNPs and the specific role of these genes in AD. Thus, we sought to measure the influence of GWAS SNPs on gene expression in a cohort of AD cases and age-matched, cognitively normal control brains. We found that ABCA7, BIN1, CD33, CLU, CR1, and MS4A6A expression are associated with clinical and neuropathological measures of AD. The GWAS SNPs, however, were not associated with gene expression. Thus, we found that the expression patterns of some GWAS genes are altered in AD brains.

Subjects
Parietal lobes from European American, autopsy-confirmed AD (N = 73) and age-matched, cognitively normal control (N = 39) brains were obtained from the Charles F. and Joanne Knight Alzheimer's Disease Research Center (Table 1). AD pathology was measured using Braak and Braak staging [11,12]. Clinical dementia rating (CDR) is a clinical measure of dementia, which incorporates six domains of cognitive and functional abilities: memory, orientation, problem solving, community involvement, home, and personal care [13].
The Washington University IRB reviewed the Knight ADRC Neuropathology Core (from whom the brains were obtained) operating protocol as well as this specific study and determined it was exempt from approval. In the state of Missouri, individuals can give prospective consent for autopsy. Our participants provide this consent by signing the hospital's autopsy form. If the participant does not provide future consent before death the DPOA or next of kin provide it after death. All data were analyzed anonymously.

RNA Extraction
RNA was extracted from brain tissue with an RNeasy kit (Qiagen) according to the manufacturer's protocol. Extracted RNA (10 µg) was converted to cDNA by PCR using the High-Capacity cDNA Reverse Transcriptase kit (ABI). RNA integrity (RIN) was measured in an Agilent Bioanalyzer with an Agilent RNA Pico Kit (Table S1).
Real-time data were analyzed by the comparative CT method [14]. Average CT values for each sample were normalized to the average CT values for the housekeeping gene GAPDH (Figure S1). The resulting value was then corrected for assay efficiency. Samples with a standard error of 20% or less were subsequently analyzed. GAPDH expression was highly correlated with PPIA expression, an additional endogenous housekeeping gene (Figure S2); thus, all subsequent analyses used GAPDH expression as a control.
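The normalization described above can be illustrated with a minimal sketch. It assumes an efficiency-corrected ratio of the form E_ref^CT_ref / E_target^CT_target, which reduces to 2^-(CT_target - CT_ref) when both assays double perfectly each cycle. The function name, the example CT values, and the default efficiency of 2.0 are hypothetical and are not taken from the study.

```python
from statistics import mean

def relative_expression(target_cts, reference_cts, eff_target=2.0, eff_reference=2.0):
    """Efficiency-corrected expression of a target gene relative to a reference gene.

    Assumption: efficiencies are expressed as fold amplification per cycle
    (2.0 means perfect doubling). With both efficiencies equal to 2.0 the
    result is 2 ** -(CT_target - CT_reference).
    """
    ct_target = mean(target_cts)        # average CT across replicates of the target assay
    ct_reference = mean(reference_cts)  # average CT of the housekeeping gene (e.g. GAPDH)
    return (eff_reference ** ct_reference) / (eff_target ** ct_target)

# Hypothetical replicate CT values for a single sample.
target = [26.0, 26.0, 26.0]
gapdh = [20.0, 20.0, 20.0]
ratio = relative_expression(target, gapdh)  # six cycles later than GAPDH: 2 ** -6
```

A target gene that crosses threshold six cycles after the reference is therefore scored at 1/64 of the reference level under these assumptions.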

Statistical Analysis
Relative gene expression values were log transformed to achieve a normal distribution (Figure S3). To identify covariates that influence the expression of each gene, a stepwise discriminant analysis was performed using CDR, age, gender, disease status, PMI (post mortem interval), RIN (RNA integrity number), and APOE genotype (Table S2). After applying the appropriate covariates to the model, analysis of covariance (ANCOVA) was used to test for association between genotypes and gene expression. SNPs were tested using an additive model. All analyses were performed using statistical analysis software (SAS).
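As an illustration of the additive model, the sketch below codes each genotype as the number of minor alleles (0, 1, or 2) and fits log-transformed expression against that dosage plus one covariate by ordinary least squares. It is not the SAS analysis used in the study: it estimates the per-allele effect only and does not compute the ANCOVA p-value. The helper names, the choice of age as the single covariate, and the synthetic data (generated from a known model so the fit can be checked) are all assumptions.

```python
from math import log10

def additive_code(genotype, minor_allele):
    """Additive coding: number of copies of the minor allele (0, 1, or 2)."""
    return genotype.count(minor_allele)

def ols(rows, y):
    """Least-squares coefficients for y ~ rows (each row starts with 1 for the intercept)."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Synthetic, noise-free data from a known model (hypothetical values).
true_intercept, true_snp, true_age = -1.0, 0.2, 0.01
design = [("TT", 70), ("TG", 80), ("GG", 75), ("TT", 85), ("TG", 65), ("GG", 90)]
samples = [(g, age, 10 ** (true_intercept + true_snp * additive_code(g, "G") + true_age * age))
           for g, age in design]

# Design matrix: intercept, minor-allele dosage, age covariate.
rows = [[1.0, additive_code(g, "G"), age] for g, age, _ in samples]
log_expr = [log10(expr) for _, _, expr in samples]  # log transform before modeling
intercept, snp_effect, age_effect = ols(rows, log_expr)
```

Because the synthetic data contain no noise, the fitted coefficients recover the generating values; with real data, the dosage coefficient would be tested against its standard error.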

Replication Dataset
The replication dataset was obtained from Myers et al. [15]. Brains were obtained from National Institute on Aging Alzheimer's Centers and the Miami Brain Bank. The 193 brains came from 18 sites and were composed of 20% frontal lobe, 70% temporal lobe, and 1% parietal lobe. The sample was 46% female with a mean age of 81 (range 65-100) and an average postmortem interval of 10 hours. Expression levels were measured on an Illumina Human Refseq-8 Expression Bead Chip System. To analyze expression levels, we used log-transformed residual values that incorporated site, brain region, post-mortem interval, age, APOE genotype, and hybridization date as covariates.

Results
Recent large-scale LOAD GWAS have identified SNPs in ABCA7, BIN1, CD2AP, CD33, CLU, CR1, EPHA1, MS4A6A, MS4A4E, and PICALM [5][6][7][8][9][10]. To determine if gene expression is altered in AD, mRNA levels for each gene were measured by real-time PCR in the parietal lobe of AD case and age-matched, cognitively normal control brains. All gene expression values were normalized to GAPDH, a housekeeping gene that accounts for total cell number. Because AD brains are characterized by neuronal loss, reactive gliosis, and microglial activation, we also corrected gene expression levels for specific subpopulations of cells (neurons [MAP2], microglia [AIF1], and astrocytes [GFAP]) to determine if there were cell-specific effects on gene expression.
ABCA7 expression was associated with CDR (p = 0.0304), where higher expression levels are correlated with elevated CDR (Table 2). CDR scores increase with cognitive and functional decline [13]. This association remained significant after correcting for subpopulations of cells (Table 2). After correcting expression for neuronal number, BIN1 expression was associated with age at onset (p = 0.0407) and disease duration (p = 0.0407), where higher expression levels are correlated with later age at onset and shorter disease duration (Table 2). The expression of the neuronal isoform of BIN1 (BIN1n) was also associated with disease duration after correcting for total, neuronal, and microglial cell populations (Table 2). Correcting expression levels for neuronal and microglial cell populations produced significant associations of CD33 and CR1 expression with disease status and CDR (Table 2). After correcting for neuronal cell populations, CLU expression was associated with disease status (p = 0.0159) (Table 2). CLU is alternatively spliced into two isoforms [16]. CLU isoforms containing exon 5 (CLU1) produced similar association patterns after correcting for neuronal and microglial cell populations (Table 2). Additionally, MS4A6A expression levels were weakly associated with Braak tangle and Braak plaque scores (p = 0.0564 and p = 0.0559, respectively), where higher expression levels are correlated with higher Braak scores (Table 2). Higher Braak scores are indicative of more extensive tau and amyloid pathology in the brain [11,12]. The association between MS4A6A expression and Braak tangle and Braak plaque scores was slightly stronger after correcting for neuronal expression (p = 0.0437 and 0.0215, respectively; Table 2). Accounting for microglia number revealed an association between MS4A6A expression and CDR (p = 0.0311) and Braak tangle score (p = 0.0453). BIN1, CD2AP, EPHA1, and PICALM expression levels, however, were not associated with AD status or AD pathology (Table 2). Together, we demonstrate that in the absence of strong statistical associations between gene expression and clinical/neuropathological AD outcomes, accounting for subpopulations of cells reveals additional gene expression effects that are likely related to gene function and/or AD-specific cell loss.
The top LOAD risk genes fall into three functional categories: immune response (CLU, CR1, ABCA7, MS4A, CD33, and EPHA1), cholesterol metabolism (CLU and ABCA7), and synaptic function (PICALM, BIN1, CD33, CD2AP, and EPHA1). We used the expression data for these genes to test whether expression levels of genes in a similar functional class are correlated. Expression levels of CD33 and MS4A6A, both of which function in immune response, were highly correlated (Figure 1A). Furthermore, expression levels of CD33 and MS4A6A were highly correlated with expression of AIF1, a marker for microglia, the immune cell of the brain (Figure 1B-C). Expression levels of genes related to synaptic function, BIN1, BIN1n, CD2AP, and PICALM, were highly correlated (Figure 1D-G). BIN1 and PICALM expression levels were also highly correlated with expression of GFAP, an astrocytic marker (Figure 1H-I). Expression of ABCA7, which is involved in immune response and cholesterol metabolism, was highly correlated with expression of BIN1 and CD2AP, which are involved in synaptic function (Figure 1J-K). Together, these results demonstrate that genes that fall into the same functional category are related at the RNA level. Thus, their dysfunction may be linked in AD.
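The pairwise comparisons above can be sketched as a correlation between log-transformed expression values of two genes across samples. The text does not state which correlation statistic was used, so the Pearson coefficient here is an assumption, and the expression values are hypothetical placeholders rather than study data.

```python
from math import log10, sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical relative expression values for two genes in the same five samples.
gene_a = [0.010, 0.018, 0.041, 0.075, 0.160]
gene_b = [0.11, 0.22, 0.37, 0.85, 1.50]

# Correlate on the log scale, matching the log transform applied before analysis.
r = pearson_r([log10(v) for v in gene_a], [log10(v) for v in gene_b])
```

A value of r close to 1 would indicate that samples with high expression of one gene tend to have high expression of the other.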
To determine if the LOAD GWAS SNPs influence gene expression, we analyzed the association of SNP genotype with gene expression using an ANCOVA and testing for association with an additive model, the model utilized when originally reporting association between these SNPs and risk for AD [5][6][7][8][9][10]. We failed to detect an association between GWAS SNPs and cis-acting expression quantitative trait loci (eQTL) after correcting for the total cell population (Table 3) or specific cell types (Table S3).
LOAD GWAS SNPs were identified based on their association with disease status. To determine if these SNPs contribute to AD pathology, independent of gene expression, we analyzed the association of each SNP with clinical (disease status, age at onset, disease duration, and CDR) and neuropathological (Braak tangle and Braak plaque score) measures of AD. The minor allele of rs3764650 in ABCA7 was associated with a later age at onset and shorter disease course (p = 0.0040, p = 0.0040, respectively; Table 4; Figure 2). The minor allele of rs670139 in MS4A6E was associated with Braak tangle and Braak plaque score (p = 0.0411, p = 0.0581, respectively; Table 4). We failed to detect an association between the remaining GWAS SNPs and the clinical/neuropathological measures of AD (Table 4).
To replicate our findings, we analyzed a publicly available AD dataset [15], in which RNA was measured by the Illumina Human Refseq-8 Expression Bead Chip System. Of the nine genes analyzed in our cohort, only five survived quality control measures in the replication dataset: ABCA7, BIN1, CLU, MS4A6A, and PICALM. We analyzed residual expression levels for association with disease status. MS4A6A and CLU expression levels were significantly associated with disease status (p = 0.0346 and p = 0.0334, respectively), where MS4A6A and CLU expression was upregulated in the AD brains compared with controls (Table 5). BIN1 expression levels were marginally associated with disease status (p = 0.0540), where expression was also upregulated in AD brains compared with controls (Table 5).

Discussion
AD is the most common form of dementia. AD etiology is influenced by complex interactions between genetic and environmental risk factors. APOE4 is the strongest risk factor for LOAD; however, variation in APOE accounts for only 10-20% of LOAD risk, suggesting that additional risk genes exist for LOAD. Recent LOAD GWAS have identified genes that are involved in cholesterol metabolism, synaptic function, and immune response. Yet, the functional impact of these genes in LOAD remains to be determined. In this study, we measured the influence of LOAD GWAS SNPs and gene expression levels on clinical and neuropathological measures of AD in parietal brain tissue from AD cases and cognitively normal individuals. ABCA7, BIN1, CD33, CLU, and CR1 expression levels were associated with clinical measures of AD (disease status, age at onset, disease duration, and/or CDR), and MS4A6A expression levels were associated with neuropathological measures of AD (Braak tangle and Braak plaque score). We failed to detect an association between GWAS SNPs and gene expression levels. We found that the minor allele of rs3764650 in ABCA7 was associated with clinical measures of AD (age at onset and disease duration), and the minor allele of rs670139 in MS4A6E was associated with neuropathological (Braak tangle and Braak plaque score) measures of AD. Together, these findings demonstrate that ABCA7, BIN1, CD33, CLU, CR1, and the MS4A gene family are affected at the mRNA level in AD brains.

ABCA7, BIN1, CD33, CLU, CR1, and MS4A Gene Family Expression are Marginally Associated with AD Phenotypes
In this study, we found that ABCA7 expression levels are significantly associated with CDR, with higher expression levels of this gene being correlated with more extensive cognitive decline. We also demonstrated that the minor allele of rs3764650 in ABCA7 was associated with age at onset and disease duration, where the minor allele was associated with later age at onset and shorter disease duration. ABCA7 is an ATP-binding cassette transporter protein [17][18][19]. ABCA7 transports xenobiotics, metals, inorganic ions, carbohydrates, vitamins, amino acids, peptides, and lipids [20][21][22][23]. ABCA7 is highly expressed in the CA region of the hippocampus [24], where microglia express the protein at levels ten times greater than is observed in neurons [25]. ABCA7 has been predicted to stimulate the cellular cholesterol efflux to a lipid-free acceptor. ABCA7 may also play a role in phagocytosis [26].
BIN1 and the neuron-specific BIN1 isoform (BIN1n) expression levels were associated with clinical measures of AD, where elevated expression was associated with later age at onset and shorter disease duration. Bin1 is implicated in receptor-mediated endocytosis and recycling of endosomes in the cell. Bin1 knockout mice do not exhibit deficiency in synaptic vesicle recycling [27,28] but have less age-associated inflammation [29].
CD33 and CR1 expression levels were associated with clinical measures of AD, where elevated expression levels were associated with AD after correcting for neuron and microglia number in the brain. CD33 and CR1 function in immune response pathways. CD33 is a transmembrane receptor expressed on cells from the myeloid lineage. CD33 functions in the innate and adaptive immune response [30], and it may play a role in receptor-mediated endocytosis independent of clathrin [31]. CR1 plays an essential role in the adaptive immune response. CR1 is highly expressed in red blood cells [32], where it mediates cell binding to particles and immune complexes. CR1 is a negative regulator of the complement cascade; mediates immune adherence and phagocytosis; and inhibits the classical and alternative complement pathways [33].
CLU expression levels are associated with clinical measures of AD, where elevated CLU levels occur in individuals with AD. Clusterin (ApoJ) exists as two isoforms and is highly expressed in astrocytes [16]. Clusterin is secreted from cells and is reported to have several roles: chaperone function [34,35], lipid trafficking [36,37], and inhibition of the complement cascade [38]. Clusterin inhibits complement activation and the membrane attack complex [38], which is relevant to AD in that neuroinflammation is a key feature of the disease. Clusterin has been implicated in AD through its ability to assist in refolding of misfolded proteins [35], bind to fibrillar proteins [39,40], clear Aβ [41], and interact with ApoE [41]. Neuritic dystrophy and fibrillar amyloid deposits are markedly reduced when CLU is knocked out in PDAPP mice [42], suggesting that CLU may have deleterious effects when upregulated in AD brains. However, in the absence of APOE and CLU, PDAPP mice have accelerated disease onset, elevated CSF and ISF beta-amyloid levels, and more extensive amyloid deposition in the brain [41]. Thus, the role of clusterin in the brain is complex and influenced by other genes.
In our cohort, genes in the MS4A gene cluster showed association with clinical and neuropathological measures of AD. MS4A6A expression levels were found to be associated with elevated Braak tangle and Braak plaque scores. Additionally, the minor allele of rs670139 in MS4A6E was associated with CDR, Braak tangle score, and Braak plaque score. The MS4A family of genes is reported to play a role in the immune response via expression on high affinity IgE receptors [43]; however, little is known about the function of each family member. While several genes in the MS4A gene cluster have been identified in recent LOAD GWAS [9,10], we only measured expression levels of the MS4A6A gene. Due to extensive sequence conservation between the MS4A genes, we were unable to identify Taqman probes in other MS4A genes that would specifically detect a single gene; thus, we are limited in our interpretation of the role of each of the MS4A genes in AD brains. While our replication data set only contained the MS4A6A gene, we were able to replicate the association with disease status.

Factors Contributing to the Absence of Robust Findings
The associations we describe in this study are only marginal and would not survive multiple test correction. We interpret these findings to point to subtle effects in gene expression. However, type I errors are also a possible explanation. Our observation that the association of gene expression with clinical and neuropathological measures of AD can change after correction for neuronal, astrocytic, and microglial subpopulations indicates that cell-specific gene expression plays an important role in disease.
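To make the multiple-testing point concrete, the sketch below applies a Bonferroni threshold to several p-values reported in the Results. The number of tests performed in the study is not stated; the value of 54 (nine genes by six clinical and neuropathological measures) is an illustrative assumption, as are the function name and the choice of Bonferroni rather than another correction method.

```python
def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Flag which p-values pass a Bonferroni-corrected threshold of alpha / n_tests."""
    threshold = alpha / n_tests
    return {name: p <= threshold for name, p in p_values.items()}

# Nominal p-values quoted in the Results section.
reported = {
    "ABCA7 expression vs CDR": 0.0304,
    "BIN1 expression vs age at onset": 0.0407,
    "CLU expression vs disease status": 0.0159,
    "rs3764650 vs age at onset": 0.0040,
}

# Assumed test count: 9 genes x 6 measures = 54 (not reported in the study).
survives = bonferroni_significant(reported, n_tests=54)
```

Under this assumed test count the corrected threshold is roughly 0.0009, so none of the listed associations would remain significant.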
We chose to examine measures of AD (disease status, CDR, Braak plaque score and Braak tangle score) because each trait represents a different, not completely overlapping, aspect of Alzheimer's disease. AD status, a dichotomous trait, is assigned at autopsy based on several criteria, including clinical dementia, neuronal death and Braak plaque and Braak tangle scores. CDR, ...
Covariates included in analyses are reported in Table S2. doi:10.1371/journal.pone.0050976.t003
Figure 2. Rs3764650 in ABCA7 is associated with age at onset. Kaplan-Meier curve. AAO, age at onset in years. SNPs were analyzed using an additive model. G, minor allele. Blue line, TT (11). Red line, TG (12). Green line, GG (22).
We failed to detect expression differences in the clinical and neuropathological measurements of AD in some of the genes tested in this study. These findings do not eliminate the possibility that changes are occurring in these genes during disease that we are unable to capture in our cohort. With a sample size of 112, this study may be underpowered to observe more subtle changes in gene expression that could contribute to LOAD. Furthermore, our study was limited to the parietal lobe, where AD pathology occurs late in the disease. It is possible that testing other brain regions that are susceptible to AD pathology at earlier time points in the disease course could produce additional associations. Environmental factors may also contribute to or obscure gene expression levels; however, at this time, we do not possess adequate phenotypic data to analyze this properly.

The Complexities of Defining the Functional Impact of LOAD GWAS SNPs
In this study, we analyzed genotype association with gene expression level to determine if the LOAD GWAS SNPs were functionally relevant. We failed to identify any SNPs that influence gene expression levels independent of disease status. Thus, it is possible that the functional polymorphisms that exist within these genes are rare, alter gene splicing, or impact inducible expression rather than constitutive expression. These findings fit with our previous study: we were unable to identify statistically significant associations of GWAS SNPs and SNPs in linkage disequilibrium with GWAS SNPs with CSF tau and Aβ levels [44]. Thus, it is essential to exploit deep sequencing techniques to identify functional variants in these genes.

LOAD GWAS Genes are Functionally Linked
The genes identified in recent LOAD GWAS fall into three functional categories: immune response (CLU, CR1, ABCA7, MS4A family, CD33, and EPHA1), cholesterol metabolism (CLU and ABCA7), and synaptic function (PICALM, BIN1, CD33, CD2AP, and EPHA1). The genes with the most significant association with clinical and neuropathological measures of AD function in immune response and cholesterol metabolism. Despite an absence of association with the remaining GWAS genes, it is possible that these genes are affected at the protein level in AD brains.
Changes in genes that influence immune response may be difficult to identify in autopsied brain tissue, as the immune response can be transient. Additionally, alterations of the immune response in AD may primarily occur in organs other than the brain. CD2AP is localized in the cytoplasm, where it has several functions: cytoskeletal remodeling [45], cell survival [46,47], and endocytosis [48][49][50]. CD2AP functions in the immune response by interacting with CD2, a T-cell and natural killer cell membrane protein, and facilitates T-cell adhesion to antigen-presenting cells [45].
For GWAS SNPs and their corresponding genes that are associated with synaptic function, the influence on AD may be more apparent at the protein level. Picalm functions in receptor-mediated endocytosis, where it is essential in clathrin assembly, axogenesis, and dendritic outgrowth in neurons [51]. EphA1 is highly expressed in the adult brain, where it participates in forward signaling in receptor-bearing cells and reverse signaling in ligand-bearing cells by binding to GPI-linked A ephrins, which together facilitate axon guidance and communication between neighboring cell populations [52][53][54][55][56][57]. CD2AP knockout mice also exhibit deficiencies in receptor trafficking to the lysosome.

Conclusions
This study provides evidence for the involvement of ABCA7, BIN1, CD33, CLU, CR1, and the MS4A gene family in AD brain pathology. As AD is a complex disorder, it is likely that many genes are affected at the RNA and protein levels and that an understanding of the complex interactions that may occur between these genes is essential to understanding and treating AD.