Gene expression patterns associated with neurological disease in human HIV infection

The pathogenesis and nosology of HIV-associated neurological disease (HAND) remain incompletely understood. Here, to provide new insight into the molecular events leading to neurocognitive impairments (NCI) in HIV infection, we analyzed pathway dysregulations in gene expression profiles of HIV-infected patients with or without NCI and HIV encephalitis (HIVE) and control subjects. The Gene Set Enrichment Analysis (GSEA) algorithm was used for pathway analyses in conjunction with the Molecular Signatures Database collection of canonical pathways (MSigDb). We analyzed pathway dysregulations in gene expression profiles of patients from the National NeuroAIDS Tissue Consortium (NNTC), which consists of samples from 3 different brain regions, including white matter, basal ganglia and frontal cortex of HIV-infected and control patients. While HIVE is characterized by widespread, uncontrolled inflammation and tissue damage, substantial gene expression evidence of induction of interferon (IFN), cytokines and tissue injury is apparent in all brain regions studied, even in the absence of NCI. Various degrees of white matter changes were present in all HIV-infected subjects and were the primary manifestation in patients with NCI in the absence of HIVE. In particular, NCI in patients without HIVE in the NNTC sample is associated with white matter expression of chemokines, cytokines and β-defensins, without significant activation of IFN. Altogether, the results identified distinct pathways differentially regulated over the course of neurological disease in HIV infection and provide a new perspective on the dynamics of pathogenic processes in the course of HIV neurological disease in humans. 
These results also demonstrate the power of the systems biology analyses and indicate that the establishment of larger human gene expression profile datasets will have the potential to provide novel mechanistic insight into the pathogenesis of neurological disease in HIV infection and identify better therapeutic targets for NCI.

Introduction
While the prevalence of severe HIV-associated dementia (HAD) has decreased since the introduction of combination antiretroviral therapy (cART), milder and chronic forms of neurocognitive impairment (NCI), including asymptomatic neurocognitive impairment (ANI) and HIV-associated neurocognitive disorders (HAND), as well as HIV-associated major depressive disorder, remain highly prevalent [1][2][3][4][5][6][7]. HIV encephalitis (HIVE) is considered to be the main neuropathological substrate of HAD [8][9][10]. NCI in the setting of cART is associated with synaptodendritic degeneration [7,11,12]. While the brain represents a sanctuary where HIV can persist due to suboptimal penetration of antiretroviral drugs [13], various studies have highlighted the occurrence of NCI even in the setting of viral suppression [14,15]. Chronic neuroinflammation is believed to drive neurodegeneration in cART-era HAND [7,9,16,17]. However, the pathogenic mechanisms behind HAND remain unclear.
For pathway analysis we used the Gene Set Enrichment Analysis (GSEA), a computational method to assess whether a priori defined sets of genes show statistically significant differences between biological states [28]. GSEA was used in conjunction with gene sets from the Molecular Signatures Database (MSigDb), including canonical pathways in the C2 collection [29]. GSEA uses the Kolmogorov-Smirnov statistical test to assess whether a predefined gene set, here a pathway from the C2 collection, is statistically enriched in differentially expressed genes, by testing their distribution in the full list of genes ranked by their differential expression between two biological states [28].
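The running-sum idea behind the enrichment score can be illustrated with a minimal Python sketch. It assumes the classic unweighted Kolmogorov-Smirnov-like statistic; the function name and the toy gene list (including the placeholder genes GENE_X, GENE_Y and GENE_Z) are illustrative assumptions, not the authors' R implementation.

```python
def running_enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-like running sum over a ranked gene list.

    Returns (ES, index of the peak), where ES is the running-sum value
    with the largest absolute deviation from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_total = len(ranked_genes)
    n_hit = len(members)
    if n_hit == 0 or n_hit == n_total:
        raise ValueError("gene set must overlap the list without covering it")
    hit_step = 1.0 / n_hit                 # increment when a set member is met
    miss_step = 1.0 / (n_total - n_hit)    # decrement for every other gene
    running, best, best_idx = 0.0, 0.0, -1
    for i, gene in enumerate(ranked_genes):
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best, best_idx = running, i
    return best, best_idx


# Toy ranked list (most upregulated first); GENE_* names are placeholders.
ranked = ["ISG15", "IFI6", "HLA-A", "GENE_X", "GENE_Y", "GENE_Z"]
es, peak = running_enrichment_score(ranked, {"ISG15", "IFI6", "HLA-A"})
print(es, peak)
```

Genes ranked at or before the peak index correspond to the leading edge of the gene set in this simplified setting.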

Dataset and quality assurance (QA)
Clinical and demographic features of the subjects in the NNTC gene expression dataset used for the study are shown in S1 Table. The gene expression dataset consisted of Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays of the frontal cortex (Brodmann area 9), basal ganglia (head of the caudate nucleus), and white matter (deep frontal lobe). Raw data were downloaded from GEO (GSE35864). We filtered out 9 samples based on quality controls (actin3/actin5 ratio, gapdh3/gapdh5 ratio, NUSE (Normalized Unscaled Standard Errors) and RLE (Relative Log Expression) computed with the packages simpleaffy and affyPLM in R). Normalization was done using GCRMA with the option fast set to FALSE, which has demonstrated performance similar to MAS5 normalization for reconstructing gene networks, and hence for correctly retrieving gene-gene correlations, a key aspect of pathway dysregulation identification [30]. We further checked the expression of markers of neurons (RBFOX3) and oligodendrocytes (MBP) to validate the brain region profiled for white matter and frontal cortex samples, and excluded 2 samples whose marker expression conflicted with their classification (D1 WM and D1 FC).
Pathway analysis. For pathway analysis, we selected one representative probe per gene based on the highest observed coefficient of variation of the probes across the samples. The dataset was interrogated for pathway enrichment using the GSEA algorithm and the canonical pathways from the MSigDb C2 collection (1,237 pathways with at least 10 genes). Here the gene list is ranked according to the statistic of a Welch t-test. For each pathway tested, a running enrichment score is calculated and the maximum enrichment score obtained is assigned to the gene set. To test the significance of the enrichment score obtained, it is compared to a null distribution of enrichment scores generated by interrogating the same gene set on randomly generated ranked lists, obtained by testing the significance of the genes after shuffling sample labels 1,000 times. Significance was assessed using the False Discovery Rate (FDR), computed as defined in the original GSEA publication for controlling the number of false positives in each GSEA analysis [28]. Differential expression was computed using a Welch t-test from the package Class Comparison in R 3.3.1. We defined the pathways commonly differentially regulated in each comparison as the pathways satisfying FDR < 0.01 in at least 2 of the 3 brain regions for that comparison, while pathways specific to one region were defined as pathways satisfying FDR < 0.01 in that region and FDR > 0.25 in the other 2 regions.
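The two selection rules stated above (common versus region-specific pathways) can be expressed as a short Python sketch. The function name, the label strings and the toy FDR values are assumptions for illustration. The sketch checks only the FDR thresholds and does not verify that the direction of regulation agrees across regions.

```python
REGIONS = ("WM", "FC", "BG")  # white matter, frontal cortex, basal ganglia


def classify_pathway(fdr_by_region, hit=0.01, miss=0.25):
    """Label one pathway from its per-region GSEA FDR values.

    'common'        : FDR < hit in at least 2 of the 3 regions
    'specific:<R>'  : FDR < hit in region R and FDR > miss in the other 2
    'unclassified'  : anything else
    """
    hits = [r for r in REGIONS if fdr_by_region[r] < hit]
    if len(hits) >= 2:
        return "common"
    if len(hits) == 1:
        others = [r for r in REGIONS if r != hits[0]]
        if all(fdr_by_region[r] > miss for r in others):
            return "specific:" + hits[0]
    return "unclassified"


# Toy FDR values, invented for illustration only.
print(classify_pathway({"WM": 0.001, "FC": 0.004, "BG": 0.30}))
print(classify_pathway({"WM": 0.40, "FC": 0.002, "BG": 0.60}))
print(classify_pathway({"WM": 0.10, "FC": 0.002, "BG": 0.60}))
```

A pathway that is significant in one region but falls between the two thresholds elsewhere is left unclassified, which follows from the definitions in the text.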
Pathways activity. The activity of a pathway in a sample was computed the following way: we first z-transformed the gene expression profiles to normalize the expression of each gene across samples. We then computed the enrichment score (ES) of a gene set using this z-transformed matrix of expression, as described in the original description of GSEA [28]. The ES corresponds to the relative activity of a gene set in a sample as compared to all others. Hence, the samples with the highest ES are the samples with the highest relative expression of the genes belonging to this set among the samples belonging to the gene expression matrix.
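As a rough illustration of this per-sample activity score, the Python sketch below z-transforms each gene across samples, ranks genes within each sample by z-score, and computes an unweighted running-sum enrichment score. The unweighted statistic, the function names and the toy expression values (with placeholder genes GENE_X and GENE_Y) are assumptions, not the authors' R code.

```python
import statistics


def z_transform(expr):
    """expr maps gene -> list of expression values, one per sample."""
    out = {}
    for gene, values in expr.items():
        mu = statistics.mean(values)
        sd = statistics.stdev(values)
        # A gene with no variance carries no ranking information.
        out[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return out


def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-like running sum; returns the extreme deviation."""
    members = set(gene_set) & set(ranked_genes)
    hit_step = 1.0 / len(members)
    miss_step = 1.0 / (len(ranked_genes) - len(members))
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


def pathway_activity(expr, gene_set):
    """One enrichment score per sample, computed on z-scored expression."""
    z = z_transform(expr)
    n_samples = len(next(iter(z.values())))
    scores = []
    for j in range(n_samples):
        ranked = sorted(z, key=lambda g: z[g][j], reverse=True)
        scores.append(enrichment_score(ranked, gene_set))
    return scores


# Toy matrix: 4 genes x 3 samples, values invented for illustration.
expr = {
    "ISG15": [1.0, 2.0, 9.0],
    "IFI6": [1.5, 2.5, 8.0],
    "GENE_X": [5.0, 5.5, 5.2],
    "GENE_Y": [7.0, 6.0, 3.0],
}
scores = pathway_activity(expr, {"ISG15", "IFI6"})
print(scores)
```

In this toy example the third sample, where both set members are most highly expressed relative to the other samples, receives the highest score.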
GSEA. Gene set enrichment analysis was implemented in R and follows the method described in [28]. The null distribution was obtained by shuffling the reference list 1,000 times. Gene signatures were obtained by ranking the genes according to the sign of the test statistic (S) and the p-value (p) of the test, using the following metric: -1 × sign(S) × log10(p).
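The ranking metric can be written out as a small Python sketch. The function name and the toy test results are hypothetical; only the formula itself comes from the text.

```python
import math


def signed_log_p(stat, p_value):
    """Ranking metric: -1 * sign(S) * log10(p).

    Positive for upregulated genes, negative for downregulated genes,
    with magnitude growing as the p-value shrinks. Requires p_value > 0.
    """
    sign = (stat > 0) - (stat < 0)
    return -1 * sign * math.log10(p_value)


# Toy (statistic, p-value) pairs, invented for illustration only.
results = {
    "ISG15": (4.2, 0.0001),
    "GENE_X": (0.3, 0.8),
    "GENE_Y": (-3.1, 0.001),
}
signature = sorted(results, key=lambda g: signed_log_p(*results[g]), reverse=True)
print(signature)
```

Sorting in descending order places the most significantly upregulated genes at the top of the signature and the most significantly downregulated genes at the bottom.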

Pathway analysis of the NNTC dataset
The NNTC dataset was interrogated for pathway enrichment using the canonical pathways from the MSigDb C2 collection and the GSEA algorithm [28] (see Methods). We compared the following four groups: A: controls; B: HIV-infected, no NCI, no HIVE; C: HIV-infected with NCI, no HIVE; D: HIV-infected with NCI and HIVE (Fig 1). All results are presented in S2-S6 Tables.

Identification of pathways dysregulated in HIV-infected patients without NCI vs. uninfected controls (B-A comparison)
We identified 24 pathways concordantly differentially regulated in at least 2 brain regions in this transition (Table 1). Genes driving the enrichment (on the left of the leading edge corresponding to the peak of the running enrichment in GSEA as shown in Fig 2A) were retrieved for each region (Fig 3). These pathways and genes indicate a significant activation of IFN and cytokine signaling even in the absence of NCI. Genes regulated by both type I and type II IFN were activated in HIV infection without NCI (Figs 2 and 3). IFN-regulated genes, such as MHC class I genes, were induced in all brain regions of HIV-infected patients without NCI as compared to uninfected controls (Figs 2 and 3).
We then looked at pathways specifically differentially regulated in one brain region as compared to the two other regions. To this end, we selected pathways enriched at FDR < 0.01 in one region and FDR > 0.25 in the other two regions [...] NON LYMPHOID CELL). We also observed increased expression of calpain-related genes (BIOCARTA UCALPAIN PATHWAY) in the frontal cortex and of calpain-related and caspase-related genes in the basal ganglia (KEGG APOPTOSIS), as well as evidence of activation of the apoptosis-mediating p75 receptor (PID P75 NTR PATHWAY) and TNF-α signaling (PID TNF PATHWAY) in both frontal cortex and basal ganglia, indicative of tissue damage. Downregulation of genes related to neurotransmission was also evident in the frontal cortex and basal ganglia of patients with HIV but no NCI (group B) compared to control subjects (e.g., REACTOME LIGAND GATED ION CHANNEL TRANSPORT, KEGG NEUROACTIVE LIGAND RECEPTOR INTERACTION, Fig 4) (S2 Table).

Identification of pathways differentially regulated in HIV-infected patients with NCI without HIVE vs. uninfected controls (C-A comparison)
HIV-infected patients with NCI and no HIVE (group C) showed significant changes specific to the white matter compared to uninfected controls. Upregulated pathways are indicative of immune activation involving chemokine, cytokine and β-defensin induction (e.g., [...]), oxidative stress and cytochrome P450 enzymes (KEGG DRUG METABOLISM CYTOCHROME P450, REACTOME BIOLOGICAL OXIDATIONS), matrix metalloproteases (MMPs) (NABA MATRISOME ASSOCIATED), and downregulation of genes related to RNA transcription and processing (e.g., REACTOME RNA POL II TRANSCRIPTION, KEGG SPLICEOSOME) (S3 Table).
Table 1. Pathways differentially regulated in multiple brain regions in patients infected with HIV without NCI as compared to uninfected controls. The table shows pathways dysregulated in at least 2 brain regions in patients infected with HIV without NCI compared to uninfected controls. The dataset was interrogated for pathway enrichment using the canonical pathways from the MSigDb C2 collection using GSEA. The GSEA pathway analysis results show gene expression changes involving significant immune activation and neuronal injury even in the absence of clinical NCI. NES: normalized enrichment score; FDR: false discovery rate; WM: White Matter; FC: Frontal Cortex; BG: Basal Ganglia.
Two pathways were concordantly differentially regulated between groups B and C in all brain regions. These pathways are indicative of type I IFN activation in HIV-infected patients without NCI (group B), as indicated above. We also identified 47 pathways specifically differentially regulated in the frontal cortex.
Among the pathways upregulated in group B as compared to C in the frontal cortex were pathways indicative of tissue damage (e.g., REACTOME REGULATION OF APOPTOSIS), RNA transcription and processing (e.g., REACTOME METABOLISM OF RNA, KEGG RIBOSOME), and pathways related to protein degradation (e.g., KEGG PROTEASOME, REACTOME AUTODEGRADATION OF THE E3 UBIQUITIN LIGASE COP1, REACTOME APC C CDC20 MEDIATED DEGRADATION OF MITOTIC PROTEINS) (S4 Table).
Interestingly, the white matter in group D did not show any specific differentially regulated pathways in comparison to group B; conversely, the frontal cortex had 121 pathways and the basal ganglia had 16 pathways significantly activated in group D in comparison to group B. Among the pathways upregulated in the frontal cortex in group D were pathways indicative of production of cytokines, chemokines and β-defensins (e.g., KEGG CYTOKINE CYTOKINE RECEPTOR INTERACTION, REACTOME CHEMOKINE RECEPTORS BIND CHEMOKINES, REACTOME BETA DEFENSINS). Pathways indicative of neurodegeneration were differentially regulated between groups D and B in the frontal cortex, including KEGG PARKINSONS DISEASE and KEGG HUNTINGTONS DISEASE (Fig 6). These pathways include genes indicative of trophic interaction, protein misfolding and mitochondrial function. Pathways related to mitochondria and energy metabolism were also downregulated in group D in all brain regions at FDR < 0.2 (e.g., REACTOME TCA CYCLE AND RESPIRATORY ELECTRON TRANSPORT, REACTOME PYRUVATE METABOLISM AND CITRIC ACID TCA CYCLE, REACTOME GLYCOLYSIS) (S5 Table).

Identification of pathways differentially regulated between HIV-infected patients with HIVE vs. patients with NCI and no HIVE (D-C comparison)
We identified 27 pathways concordantly differentially regulated in the C to D comparison. Seventeen pathways were upregulated in group D (HIVE) as compared to group C (NCI without HIVE) and largely reflected activation of the IFN response in HIVE (e.g., REACTOME INTER- [...]). Ten pathways were downregulated in group D and included pathways related to translation and transcription, as seen in the A to B transition, likely reflecting transcriptional/translational dysregulations brought about by IFN activation (e.g., REACTOME TRANSPORT OF [...]). Neurodegenerative and neuronal pathways were differentially regulated in the frontal cortex and basal ganglia in patients with HIVE, including KEGG HUNTINGTONS DISEASE, KEGG PARKINSONS DISEASE and REACTOME NEURONAL SYSTEM (Fig 6). Genes involved in mitochondrial function and energy metabolism are a significant component of these pathways. No pathways were specifically different in white matter between groups C and D, while 37 pathways were specific to the frontal cortex and 11 to the basal ganglia. Several frontal cortex-specific pathways were downregulated in group D and included cell cycle regulation, while basal ganglia pathways were related to translation/transcription and immune regulation (BIOCARTA D4GDI PATHWAY, BIOCARTA 41BB PATHWAY, PID CD8 TCR DOWNSTREAM PATHWAY, PID IL12 STAT4 PATHWAY) (S6 Table).

Discussion
While dementia and HIV encephalitis are late consequences of HIV/AIDS, HIV enters the brain early after infection and remains in the brain throughout the course of infection. A current major challenge is the identification of the pathogenic mechanisms behind HAND, which remains prevalent despite cART [7,11,12,13,14,15]. To better understand the pathogenetic processes behind HIV-associated neurological complications, here we employed the GSEA computational method for pathway analysis in conjunction with the MSigDb pathway collection [28,29]. This approach is particularly suited for this analysis as it allows for a non-biased interrogation of the data by inputting all genes from transcriptional profiling results without a priori selecting individual genes based on their differential expression. In this way, all genes in the experiment contribute to the statistical analysis, allowing interrogation of the whole transcriptional landscape in a systems biology framework. Additionally, as GSEA interrogates differential expression within gene sets, it is more sensitive than traditional threshold-based analysis, where cutoffs can have a dramatic impact on gene selection. The brain regions profiled included white matter (representative of myelinated fibers and tracts), basal ganglia (subcortical region) and frontal cortex (cortical region), which are affected in neurodegenerative and inflammatory brain disease and HAND [23][24][25][26][27].
A considerable body of observations indicates that neuroinflammatory markers correlate with disease progression and the emergence of NCI in neuroAIDS [31][32][33]. Proinflammatory cytokines and chemokines including IFN-α, TNF-α and CCL2 that are secreted by astrocytes and microglia have long been implicated in the pathogenesis of neuroAIDS [34][35][36][37][38]. For instance, IFN-α in the cerebrospinal fluid has been observed to be higher in HAD compared with HIV-infected patients without HAD [34,37,39]. Here, we show that HIV infection is associated with substantial dysregulations of gene expression related to immune activation in the absence of NCI. This observation suggests that considerable immune activation and neuroinflammation can precede the onset of NCI and/or occur in individuals resistant to progression to clinical NCI.
In particular, a primary finding in the present study is that gene expression evidence of the induction of both IFN type I and type II responsive genes was seen in patients with HIV infection without NCI (group B) in all brain regions studied as well as in patients with HIVE (group D). Among the genes differentially regulated within these pathways were IFN-responsive genes such as HLA-A, -B, -C, -G, -F, adhesion molecules such as VCAM-1, and ISG15 and IFI6 [40][41][42].
Chronic IFN expression is considered a key contributor to inflammation in neuroAIDS as well as a potential cause of NCI and depression vulnerability. However, data on the contribution of IFN activation to NCI are conflicting. Mice with transgene expression of IFN-α in astrocytes develop a dose-dependent inflammatory encephalopathy [43]. Yet IFN-α transgenic expression in the central nervous system induced only mild effects in an egocentric spatial working memory test [44]. However, the latter may also reflect compensatory changes, as passively administered IFN-β impaired spatial memory in mice in another study [45]. A recent study suggested a role for IFN-γ in shaping fronto-cortical connections and social behavior [46], which is consistent with a potential role of excessive IFN activation in the pathogenesis of NCI. In a recent large multi-center trial, depression was not significantly increased by IFN-β treatment for multiple sclerosis (MS) [47]. The induction of IFN in patients of the NNTC dataset without NCI (group B) is reminiscent of previous studies in which IFN induction was not closely correlated with NCI, e.g., [48], and raises the possibility that either protracted IFN dysregulation may be required to produce NCI or that it may be a co-factor in NCI pathogenesis.
Also evident in HIV-infected patients without NCI was the activation of mechanisms indicative of tissue injury, such as expression of matrix metalloproteases (MMP) and complement-related genes in the white matter. MMP expression by HIV-1 infected monocytes and macrophages is recognized as a pathogenic mechanism in neuroAIDS [49]. Elevated MMP levels can contribute to microglial activation, infiltration through cleavage of adhesion molecules, neuronal and synaptic injury, as well as blood-brain barrier disruption [50][51][52][53]. MMP increases were present in the white matter in HIV-infected patients with NCI and no HIVE, while in patients with HIVE, induction of MMPs was evident both in the white matter and in the frontal cortex.
Another key finding in the study is that patients with NCI without HIVE (group C) in the NNTC cohort did not show significant activation of IFN, unlike patients in groups B and D. This discordant regulation of IFN signaling did not appear to be associated with antiretroviral therapy, as patients with NCI and no HIVE include both patients treated with antiretrovirals and untreated patients. Conversely, patients with NCI without HIVE (group C) had increased expression of chemokines, cytokines and β-defensins in the white matter. Other proinflammatory markers were also concomitantly increased in the white matter of patients with NCI and no HIVE. Evidence of chemokine and cytokine expression was present in all HIV-infected groups in the study. β-defensins were also induced in patients with HIVE.
Chemokines have been implicated in impairing cognition, Alzheimer's disease and depression as well as other psychiatric conditions [54]. Increased immunoreactivity for MCP-2 was noted in MS lesions [55]. A chemokine gene cluster has been associated with age of onset of Alzheimer's [56]. A higher level of CCL2 in CSF, and a CCL2 -2578G allele, have been associated with worse neurocognitive functioning in HIV [57]. Animal studies, while scant, are consistent with a possible role for chemokines in NCI. For instance, chemokine signaling was increased by SIV infection and methamphetamine exposure in macaques [58,59]. Chemokines can induce changes leading to impaired hippocampal synaptic transmission, plasticity and memory [59,60]. Evidence also suggests a role for defensins in the chronic inflammation associated with degenerative brain diseases, and in particular Alzheimer's disease [61,62]. Defensins-related pathways were also induced in HIVE, but showed no consistent regulation in HIV-infected patients without NCI, suggesting a possible contribution to the pathogenesis of NCI.
In HIV without NCI, genes related to neurotransmission were also downregulated in the frontal cortex and basal ganglia while genes related to apoptosis, such as calpain-related mechanisms, which contribute to neurodegeneration in HIV [63], were induced in the basal ganglia and frontal cortex. Conversely, no pathways showed significant dysregulations in the frontal cortex and basal ganglia in HIV patients with NCI and no HIVE. In HIVE, multiple pathways indicative of impaired mitochondria and energy metabolism were differentially regulated. In NCI without HIVE, we observed increased expression of cytochrome P450 enzymes, which may indicate oxidative stress [64].
The anatomical distribution of the gene expression programs dysregulated in the NNTC dataset appears to reflect brain-region specific dynamics in neurological disease progression in HIV/AIDS. In particular, we observed some degree of white matter alteration of gene expression in all HIV-infected groups with and without NCI and HIVE. However, gene expression changes in patients with NCI without HIVE (group C) were localized to the white matter and had a specific gene expression profile. Lack of gene expression changes suggestive of neuronal injury in the frontal cortex and basal ganglia in patients with NCI without HIVE (group C) suggests that they may not be accompanied by significant neuronal atrophy, but that white matter pathology likely drives NCI in these patients. Prominent white matter gene expression changes were also present in HIVE, which was also characterized by considerable gene expression changes in the frontal cortex and basal ganglia, consistent with earlier clinical literature, e.g. [8]. White matter damage correlating with the severity of cognitive manifestations has been observed since the early days of the HIV pandemic [8,25]. Evidence of white matter injury in HIV-infected patients with and without NCI is also demonstrated in recent imaging studies [65,66]. Importantly, white matter changes are increasingly recognized as predictive of cognitive impairment and progression to dementia in aging and neurodegenerative diseases such as Alzheimer's disease and Parkinson's disease [19][20][21][22]. In addition to white matter changes, gene expression in HIVE was characterized by considerable changes in the frontal cortex and basal ganglia, which is in agreement with the association of NCI with progression of functional abnormalities involving the basal ganglia and the frontal cortex [23][24][25][26][27].
The present study has several limitations. Primarily, the NNTC dataset groups are of small sample size that was further reduced as part of the quality control analysis. Larger studies will be needed to better understand the pathogenesis and progression of neurological disease and to adequately represent all possible variants of central nervous system disease. For instance, gene expression results of the group of HIV-infected patients with NCI and without HIVE raise several questions, including if this is a distinct nosologic variant of neuroAIDS or if it is a stage in the progression of HIV brain disease.
In conclusion, in the present study we explored patterns of gene expression dysregulation in patients in the NNTC neuroAIDS gene expression dataset. Results point to gene expression changes indicative of immune activation characterized by IFN and cytokine expression as well as evidence of neuronal injury preceding NCI. Interestingly, the group of HIV-infected patients with NCI without HIVE showed a preeminently white matter dysfunction characterized by a distinct pattern of immune activation with low IFN. Larger studies are necessary to better understand the pathogenesis of neurological disease and its progression, to evaluate the impact of therapy on various HIV disease conditions, and to identify better therapeutic targets and strategies for NCI in HIV.
Supporting information S1