Caudwell Xtreme Everest: A prospective study of the effects of environmental hypoxia on cognitive functioning

Background: The neuropsychological consequences of exposure to environmental hypobaric hypoxia (EHH) remain unclear. We thus investigated them in a large group of healthy volunteers who trekked to Mount Everest base camp (5,300 m).
Methods: A neuropsychological (NP) test battery assessing memory, language, attention, and executive function was administered to 198 participants (age 44.5±13.7 years; 60% male). These were studied at baseline (sea level), 3,500 m (Namche Bazaar), 5,300 m (Everest Base Camp) and on return to 1,300 m (Kathmandu) (attrition rate 23.7%). A comparable control group (n = 25; age 44.5±14.1 years; 60% male) was tested at or near sea level over an equivalent timeframe so as to account for learning effects associated with repeat testing. The Reliable Change Index (RCI) was used to calculate changes in cognition and neuropsychological function during and after exposure to EHH relative to controls.
Results: Overall, attention, verbal ability and executive function declined in those exposed to EHH when the performance of the control group was taken into account (RCI .05 to -.95) with decline persisting at descent. Memory and psychomotor function showed decline at highest ascent only (RCI -.08 to -.56). However, there was inter-individual variability in response: whilst NP performance declined in most trekkers, it improved in some. Cognitive decline was greater amongst older people (r = .42; p < .0001), but was otherwise not consistently associated with socio-demographic, mood, or physiological variables.
Conclusions: After correcting for learning effects, attention, verbal abilities and executive functioning declined with exposure to EHH. There was considerable individual variability in the response of brain function to sustained hypoxia, with some participants not showing any effects of hypoxia. This might have implications for those facing sustained hypoxia as a result of any disease.


Introduction
Tissue hypoxia occurs in response to a diverse range of acute and chronic disease conditions [1] and may impact on neurocognitive outcome [2]. Mild to moderate perioperative hypoxaemia is also implicated in the pathogenesis of postoperative cognitive impairment [3], and cognitive impairment is recognised in those suffering conditions associated with chronic hypoxaemia, such as chronic obstructive pulmonary disease and obstructive sleep apnea [4]. However, apparently normal cognition (as evidenced by effective completion of tasks) is possible in the face of profound hypoxaemia [5] while the role of confounding factors and preexisting vulnerabilities (such as older age, comorbidity, and/or frailty) in the pathogenesis of cognitive decline cannot be completely elucidated in studies involving patient populations.
Research on healthy volunteers offers the opportunity to explore the direct effects of hypoxia on cognition and to identify pathways and avenues for neuroprotection [6,7,8]. Prior work has been instrumental in delineating the cognitive domains most affected [9] yet limited by small sample sizes, restricted ranges of neuropsychological tests, and lack of control groups and/or of serial assessments [8,10,11,12,13,14,15]. Importantly, improvements in neuropsychological test performance, which occur with repeat testing ("practice effects") can mask declines in cognitive functioning [16,17,18] and such effects are rarely adjusted for.
As part of the Caudwell Xtreme Everest Medical Research Expedition (February 2007 to June 2007), we compared cognitive function in a large sample of healthy participants exposed to environmental hypobaric hypoxia (EHH) during a trek to the base camp of Mount Everest with a comparator group assessed at the same time points at or near sea level [19,20]. We also sought to identify trajectories of cognitive function across ascent and examined the associations between cognitive performance and concurrently assessed physiological parameters.

Participants
Participants were recruited from the general public. Inclusion criteria were: age over 18; fluent English speaker; lowland resident; non-professional trekker status; availability for baseline testing; and good physical health (determined by two health-screening tests). Excluded were those with recent (i.e., 3 months prior to departure) altitude exposure (e.g. trekking, skiing) [20]. To estimate the magnitude of practice effects, we used the same inclusion criteria to enroll a group of control participants who did not ascend to altitude. Control participants were recruited opportunistically (snowball sampling [21]) through a network of acquaintances and coworkers of trekkers and researchers.
Written informed consent was obtained from all participants. The University College London Research Ethics Committee approved all study procedures (in accordance with the Declaration of Helsinki).

Experimental protocol/course of expedition
A detailed ascent profile is provided in "Design and Conduct of Caudwell Xtreme Everest: An Observational Cohort Study of Variation in Human Adaptation to Progressive Environmental Hypoxia" [18]. In brief, all participants were initially studied at sea level in London, UK (75 m), and then followed an identical 11-day ascent route to Everest Base Camp (EBC). Subjects were assessed at 3500 m (Namche), 5300 m (EBC) and at 1300 m (upon return to Kathmandu). Laboratory altitudes, barometric pressures and inspired partial pressures of oxygen are summarised in Table 1. The ascent rate was chosen to minimise the incidence of high altitude illness and therefore maximise the number of participants able to contribute data.
Each subject underwent neurocognitive assessment consistently on either day 1 or day 2 after arrival at each point. The control group underwent neuropsychological testing at sea level at the same time intervals as the trekkers.
Cognitive/neuropsychological tests and mood assessments. The neuropsychological assessment test battery was assembled to cover all major cognitive domains (attention, memory, executive function, and language) using standardised tests commonly used in clinical and research settings [22]: Trail Making Test parts A and B [23], Controlled Oral Word Association Test [24], Letter Cancellation Test [22,25], Stroop Test [26], Grooved Pegboard [27], Rey Auditory Verbal Learning Test [28], Symbol Digit Modalities Test [29] and Block Design Test [30] (see Table 2 for description). Alternate forms of the tests were used when available [22]. Tests were administered in a fixed order by trained research personnel. Intelligence (IQ) testing (using the Wechsler Intelligence Test [30]) was only performed at baseline.
The Brief Center for Epidemiologic Studies Depression Scale [31] and State Trait Anxiety Inventory [32] measured depression and anxiety symptoms respectively. Symptoms of Acute Mountain Sickness (AMS) were evaluated using the Lake Louise Symptoms Score [33].

Statistical analysis
In order to assess change at a group level, cognitive outcomes were compared between baseline and follow-up assessments for the trekker and control groups using a series of mixed factor analyses of variance. Three analyses were performed. The first compared group differences in changes of performance on each of the neuropsychological tests. The second and most critical analysis used the scores of the control group to construct a practice-adjusted Reliable Change Index (RCI) [34,18], which adjusts individual differences in test scores for measurement error and practice effects. RCIs were determined by subtracting the baseline score ($X_1$) from the follow-up score ($X_2$), giving $\Delta X = X_2 - X_1$ for each individual participant on a given task. The mean measured change for the controls, $\Delta X_c$, calculated in the same way, was then subtracted from this, removing any practice effect. This score was then divided by the within-subject standard deviation for the control group, controlling for expected variability. This formula gives a precise estimate of relative change controlling for variability and practice; it is comparable with various regression-based estimates and clearly superior to simple change scores in its ability to detect reliable and clinically meaningful change [18]. These RCI scores were then used to create individual and combined test scores ($Z_{combined}$) using the sum of standardized RCI scores ($z_{RCI}$) for each test divided by the standard deviation of this summation in the control group. This technique identifies cognitive change by comparing the changes in test scores of an individual trekker with changes in the test scores of the control group over the same interval. The sign was adjusted so that negative z scores indicated deterioration from the baseline test. In the third analysis, the RCI scores were examined for each individual to determine the percentage of individuals who demonstrated a decline in performance for each test (i.e., negative RCI).
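The practice-adjusted RCI described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `higher_is_better` flag, and all numbers are hypothetical, and the "within-subject standard deviation for the control group" is assumed here to be the standard deviation of the controls' change scores, which the text does not spell out.

```python
from statistics import mean, stdev


def practice_adjusted_rci(baseline, follow_up, ctrl_baseline, ctrl_follow_up,
                          higher_is_better=True):
    """Practice-adjusted RCI for one participant on one test.

    ctrl_baseline / ctrl_follow_up are paired lists of control scores.
    Assumption: the denominator is the SD of the controls' change scores.
    """
    ctrl_changes = [x2 - x1 for x1, x2 in zip(ctrl_baseline, ctrl_follow_up)]
    practice_effect = mean(ctrl_changes)   # mean control change
    sd_change = stdev(ctrl_changes)        # expected variability in change
    rci = ((follow_up - baseline) - practice_effect) / sd_change
    # Flip the sign for timed tests so that negative always means decline.
    return rci if higher_is_better else -rci


def combined_z(participant_rcis, control_rcis_by_person):
    """Sum of one participant's per-test RCIs, scaled by the SD of the
    same sum computed across control participants."""
    control_sums = [sum(rcis) for rcis in control_rcis_by_person]
    return sum(participant_rcis) / stdev(control_sums)


# Hypothetical data: controls improve by 2 points on average through practice.
ctrl_t1 = [10, 12, 14, 16]
ctrl_t2 = [12, 13, 17, 18]
# A trekker whose raw score is unchanged has fallen behind the expected gain.
print(practice_adjusted_rci(12, 12, ctrl_t1, ctrl_t2))
```

In this example a raw change of zero yields a negative RCI, which is how an unchanged or even slightly improved raw score can still count as decline once practice effects are removed.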
To further examine patterns of cognitive performance, Latent Profile Analysis (LPA) was conducted on total/combined RCI scores in Mplus Version 6.12 [34]. An increasing number of classes were fit in a series of iterative steps until the resultant model was not well identified [35]. The optimal number of classes was determined using the Akaike Information Criterion (AIC) [36], Bayesian Information Criterion (BIC) [37], and the Bootstrapped Likelihood Ratio Test (BLRT) [38,39]. We sought a model with lower values for the information criteria, and p values < .05 for the likelihood ratio test. In addition, classification quality was indicated by higher relative entropy (≥ .80), average posterior class probability (≥ .70), and odds of correct classification (≥ 5) [40,41,42]. The associations of cognitive change indices (individual RCI; and trajectories of total RCI scores) with demographic and clinical parameters were explored using correlations or ANOVA (as appropriate). For physiological parameters, both absolute levels as well as the percentile change at each ascent point relative to baseline were employed.
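For readers unfamiliar with the fit indices named above, the following Python sketch shows the standard definitions of AIC, BIC and relative entropy. It does not reproduce the Mplus analysis; the function names and the example log-likelihood, parameter count and posterior probabilities are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def relative_entropy(posteriors):
    """Relative entropy of a classification, between 0 and 1.

    posteriors: one row per participant, each row holding that person's
    posterior probability of membership in each class (rows sum to 1).
    Values near 1 indicate clearly separated classes.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if k < 2:
        raise ValueError("relative entropy needs at least two classes")
    uncertainty = -sum(p * math.log(p)
                       for row in posteriors for p in row if p > 0)
    return 1 - uncertainty / (n * math.log(k))


# Hypothetical model: log-likelihood -100, 5 free parameters, 100 participants.
print(aic(-100.0, 5), bic(-100.0, 5, 100))
# Hypothetical posteriors for three participants and two classes.
print(relative_entropy([[0.95, 0.05], [0.10, 0.90], [0.99, 0.01]]))
```

A candidate solution would be preferred when its AIC and BIC are lower than those of the solution with one class fewer and its relative entropy meets the stated threshold.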
To control for missing data, we repeated all analyses using all cases enrolled at baseline after missing data imputation (last value carried forward approach) as a sensitivity analysis. Significance levels were set at p < .05. All values reported are mean ± SD unless stated otherwise.
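The last-value-carried-forward approach used in the sensitivity analysis can be sketched as follows. This is an illustration only: the function name `locf`, the use of `None` for a missed assessment, and the scores are hypothetical.

```python
def locf(scores):
    """Fill missing assessments (None) with the most recent observed score.

    scores: one participant's scores in assessment order, for example
    baseline, Namche, Everest Base Camp, Kathmandu.
    Leading missing values stay None because nothing precedes them.
    """
    filled = []
    last_seen = None
    for score in scores:
        if score is not None:
            last_seen = score
        filled.append(last_seen)
    return filled


# Hypothetical participant who dropped out after the second assessment.
print(locf([50, 52, None, None]))
```

Carrying the last score forward assumes no further change after dropout, which is why it serves here as a sensitivity check rather than as the primary analysis.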

Results
One hundred and ninety-eight trekker participants ( The characteristics and baseline neuropsychological performance of the trekkers who provided data for analysis of cognitive outcomes across all four assessments were comparable to those of the total sample and the non-completers (p > .05 in all cases), indicating that the missing data did not reflect a selective loss to follow up. Details of physiological parameters and symptoms across ascent are depicted in Table 3.

Group comparisons on cognition
Mean neuropsychological scores for both trekkers and control groups were within the reference range for all NP tests (within one standard deviation (SD) of general population test norms) [43] (see Table 4 for mean scores).
Trekkers and controls had equivalent NP scores with the exception of RAVLT-Total, where control subjects outperformed trekkers on all assessments including baseline. Generally, mean cognitive performance across NP tests was significantly better from baseline to follow-up assessments in both trekkers and controls, albeit not uniformly for all tests, providing evidence of the expected practice effects with repeated NP administrations over short time intervals (p values < .05). Post-hoc tests showed that performance improvements over repeat assessments were greater in the control group relative to trekkers where slopes were less steep but still significant.
As expected, evidence of a significant decline in raw NP scores at the group level was not revealed on any of the neuropsychological tests for either group.

Course of cognition using reliable cognitive change index (RCI)
Data are summarized in Table 5. Controlling for practice effects using RCI methodology demonstrated an altitude-related decline in cognitive performance for the trekkers. Mean RCI scores revealed cognitive decline with ascent for all tasks except for the GP-D, GP-ND and RAVLT-L, which only deteriorated at the highest altitude (i.e., Everest Base Camp) (Table 4). Inspection of RCI values indicated that there was a greater decline for most tests at the first ascent point (Namche). Decline was noted in all tests at EBC and persisted on return to Kathmandu for all domains except GP-D, GP-ND and RAVLT-L. Decline was more pronounced in tests of verbal ability/language (e.g. COWA; LCT) and executive function (e.g. BD; TMT-B), whereas tests of psychomotor function and verbal memory were only affected at highest ascent. Sensitivity analyses using missing data imputation indicated a similar pattern of decline to that identified in the primary analysis above.

Incidence of cognitive decline
Individual variability in performance was assessed by classifying individual participant RCIs as either declined or not declined on each of the individual NP tests. With practice effects accounted for, between two thirds and three quarters of trekkers demonstrated decline in individual NP tests (see Table 4 above; see S1 Fig and S2 Fig). The mean number of test scores showing decline was comparable across ascent points but varied across trekkers. The mean number of test scores showing a decline was 5.38±2.28 (range = 10) in Namche; at EBC the mean was 6.23±2.49 test scores (range = 9) whereas in Kathmandu, decline was noted in 5.15±3.42 (range = 11) NP test scores. The latent profile analysis of summary NP scores indicated similar decline rates. In this analysis, one- to three-class solutions were fit in a series of iterative steps to determine the optimal number of profiles (the four-class solution was not well-identified). Model fit improved as the number of classes increased (Table 6). As a result, the three-class solution was selected. This solution demonstrated good relative entropy (.82), average posterior class probabilities (.91-.93) and odds of correct classification (7.23-48.49). The three distinct profiles of cognitive

Factors related to cognitive decline
Associations with demographic and clinical parameters were examined separately at each assessment point (for individual and summary RCI scores) and in relation to overall cognitive trajectories across the expedition (mild decline; pronounced decline; improvement). The gender, education, anthropometric indices (height; weight) and mood (anxiety and depression symptoms) of the group of participants who exhibited cognitive decline did not differ from those who exhibited no such decline. There were a few significant associations between RCI scores and age, education and mood, albeit not consistently for all NP tests and across all ascent points. Any significant associations of mood and education with cognitive decline were mainly found at the highest point of ascent (EBC) but not for all cognitive tests. Anxiety and depression were in the main unrelated to cognitive decline, with only a handful of significant (small-sized) associations between symptoms of anxiety and decline in memory (r = -.16; p = .035 for RAVLT-T and r = -.15; p = .043 for RAVLT-D) and overall total RCI summary scores at EBC (r = -.17; p = .032). Symptoms of depression correlated only with decline in one NP test, namely SD at EBC (r = -.19; p = .011).
Analyses on cognitive profiles indicated no significant differences between the three trajectories (mild decline; pronounced decline; improvement) in any of the socio-demographic, physiological or psychological variables examined.

Discussion
This is the largest prospective study to investigate the impacts of environmental hypoxia on cognitive ability. As expected, absolute cognitive performance improved with experience of repeated testing in both trekkers and controls. Adjustment for such 'practice effects', using RCI methodology and profiling of cognitive trajectories across the expedition, provided clear evidence of overall cognitive impairment across several domains in the group of trekkers exposed to altitude and environmental hypoxia. Cognitive declines were evident in the group on all NP tasks at the highest point of ascent, i.e. Everest Base Camp (5,300 m). Of note, whilst cognitive performance improved on descent from Everest Base Camp to Kathmandu (1,300 m), it remained impaired when compared to pre-trek levels and even to those recorded at greater altitude (Namche: 3,500 m) on ascent. Thus, return to lower altitude does not immediately reverse the cognitive effects of exposure to hypoxia. Age was unrelated to RCI-based NP tests or summary scores at high altitudes (both Namche and Everest Base Camp) but was significant on descent to Kathmandu, suggesting poorer cognitive recovery in older participants, although whether this persists is not known. This finding is in keeping with other studies where the vulnerability of the cognitive functioning of older people has been found to be associated with poorer recovery in a range of conditions including traumatic brain injury and surgery [44,45].
Cognitive decline was greatest in executive function, complex attention, and verbal skills [8,9,46,47,48,49]. There was a small improvement in psychomotor abilities at 1,300 m (Kathmandu) and 3,500 m (Namche) compared with baseline, but these abilities declined at the highest altitude (EBC). Verbal memory did not decline at first ascent testing (Namche; 3,500 m) yet declines in memory were found at all subsequent testing points. The notion of a threshold may explain the lack of decline in motor function and memory at altitudes lower than 4,000 m. Motor precision skills reflected in GP are thought to be affected at higher altitudes in comparison to motor speed [12]. It has been suggested that a distinct pattern of neuropsychological change develops above 4,000 m altitude and this is consistent with the cognitive decline in all cognitive domains at the highest point of ascent (Everest Base Camp at 5,300 m). These findings are in line with the proposed transitional zone at 4,000 m for loss of autoregulation in cerebral blood flow [50]. Variation in the susceptibility to hypoxia of different brain regions (i.e. hippocampus, basal ganglia, cerebellum and occipital cortex), each associated with particular cognitive functions, may also account for the observed pattern of results [51]. An alternative explanation may relate to differences in test sensitivity and levels of difficulty. Grooved Pegboard, a test of motor precision, may be considered a less complex task relative to the tests of attention and executive function deployed in this study. These more complicated tasks showed deterioration at lower altitudes. The observed declines in cognitive performance at the lower altitudes cannot be attributed to AMS. AMS symptoms were related to decline in verbal abilities and executive function only at the highest point of ascent (5,300 m) but not at lower altitudes.
There was also no clear physiological pathway, as none of the measures taken showed associations that were reliable enough across tests or testing points to support a plausible physiological mechanism.
While cognitive decline was demonstrated for most trekkers, some were unaffected or even performed better relative to baseline after practice effects were accounted for. However, decline with altitude was the norm for most. This pattern was shown both in analyses of individual tests as well as overall cognitive profiles across ascent. Such variability is consistent with that observed in other physiological responses to high altitude. The reasons for such variation remain unclear. With the exception of differential rates of recovery on descent being related to age, variability in performance was not consistently explained by socio-demographic characteristics (age, education, smoking), cognitive reserve (i.e., IQ), physiological parameters such as oxygen saturation levels or time intervals between assessments. It is of note that NP decline in some tests and on the overall measure of NP was associated with the symptoms of AMS only at the highest level of ascent. It is possible that the threshold for symptoms of AMS at that point is of sufficient severity to have a generalised effect including on cognition.
The heterogeneity in cognitive outcomes in individual performance represents a challenge for prediction in conditions of hypoxia. This is in common with a growing but incomplete understanding of the mechanisms underlying cognitive decline, resilience or recovery. Whilst hypoxia itself appears the most cogent explanation of the overall results, the variability can be attributed to factors such as inflammatory processes, oxidative stress, cortical damage, cumulative effects of repeated ascents, prolonged altitude exposure, immune alterations and fatigue, sleep disturbances, or psychological stressors [52]. All of these possibilities are highly speculative at this point, but generate potential hypotheses for future research on individual differences apparent in these results.
Particular strengths of the current study are our sample size (the largest to date for such a study), which allowed the statistical modeling of trajectories of cognitive change; the use of controls matched for age and educational background, so as to control for measurement error and practice effects; and the use of a comprehensive cognitive battery. Most other studies have used small samples (that precluded statistical modeling) and limited neuropsychological tests. The current study adds to the literature in two important ways. First, we have used latent profile analysis to model trajectories of cognitive test scores over incremental exposure to hypobaric hypoxia. The results of these analyses reveal that there is heterogeneity in the profiles of change over time in a large sample. Second, these results also help to tease apart the relative contributions of maturation (i.e., normal aging) and practice effects that can affect follow-up test scores.
Our study does, however, have weaknesses. Limitations in carrying equipment to altitude, and unreliability of power supply, meant that assessments had to be based on oral or written (pen and paper) NP tests which have been shown to be associated with greater variability than computerised tests [53].
The control group was recruited opportunistically. The quality of RCI analyses is largely dependent upon how closely the control group matches the characteristics of the group of interest. Barring randomisation, which would have been ideal yet unfeasible, we have undertaken careful matching to ensure that any significant differences in performance are not due to differences in age, education and oxygen saturation levels. It remains possible that other unmeasured variables may account for the effects, such as differences in levels of exercise between trekkers and control participants. Although none of the NP tests were performed immediately after exercise or on reaching designated ascent points, so as to reduce possible proximal effects, differences between trekkers and controls were not directly controlled for. Although recruiting healthy participants was deemed the best technique to estimate practice effects, the small sample size of the control group and potential differences in levels of exercise as a result of the expedition may limit the accuracy of the calculation of practice effects.
Further work is required to elucidate the mechanisms of cognitive decline in conditions of acute hypoxia and to account for individual differences in cognitive performance and recovery rates.

Authors' translational perspective
The clinical neurocognitive effects of hypoxia manifest in diverse environments: from the agitated patient in the pre-hospital phase, to the deteriorating intensive care patient, to the octogenarian developing a chest infection and concurrent confusion. By studying the neurocognitive effects of hypoxia on otherwise "healthy" individuals in a prospective manner, this study aimed to investigate these effects in a structured way. The results imply that, in some individuals at least, hypoxia results in a reduced ability to learn. This may link to the disorientation in time and place that is commonly seen in hypoxic patients, particularly in critically ill patients. The notable predilection to loss of verbal ability may equate to an inability to express oneself. Clinically it is often perceived that the elderly are more prone to confusion. In our study age was significantly associated with decline, and notably with delayed recovery (i.e., on return to the relative normoxia of Kathmandu). This suggests that cognitive recovery from hypoxia may take longer in older people, which chimes with clinical experience of reduced cognitive resilience and delayed return to normal cognitive function in elderly patients exposed to hypoxia and other pathological stressors. Possibly of greatest importance is the marked heterogeneity of cognitive decline. Hypoxia did not have the same effect on everyone; some were more prone to neurocognitive decline than others. Further studies are needed to help identify biomarkers of predisposition and response for hypoxia-related cognitive impairment, particularly in older patients, in order to guide therapy and thereby minimise the magnitude and duration of impairment.