Gene expression is implicated in the ability of pikas to occupy Himalayan elevational gradient

Species are shifting their ranges due to climate change, many moving to cooler and higher locations. However, with elevation increase comes oxygen decline, potentially limiting a species’ ability to track its environment depending on what mechanisms it has available to compensate for hypoxic stress. Pikas (Family Ochotonidae), cold-specialist small mammal species, are already undergoing elevational range shifts. We collected RNA samples from one population of Ochotona roylei in the western Himalaya at three sites– 3,600, 4,000, and 5,000 meters–and found no evidence of significant population genetic structure nor positive selection among sites. However, out of over 10,000 expressed transcripts, 26 were significantly upregulated at the 5,000 m site and were significantly enriched for pathways consistent with physiological compensation for limited oxygen. These results suggest that differences in gene expression may play a key role in enabling hypoxia tolerance on this local scale, indicating elevational flexibility that may facilitate successful range shifts in response to climate change.


Introduction
Current climate trends indicate a continued increase in global mean temperature to as much as 4˚C above present temperatures by 2100 [1]. Many species are already shifting their ranges in order to follow their preferred climate [2], with flatland species generally shifting northward, and mountainous species moving to higher elevations [3]. However, moving up in elevation to escape warmer temperatures comes with new environmental stressors, including oxygen deprivation, or hypoxia [4]. Limitations that hypoxia may place on an animal's immediate ability to move upslope as it responds to warmer conditions have not been explored and likely depend on what hypoxia-compensation mechanisms a population has at its disposal.
Pikas (Order Lagomorpha, Family Ochotonidae) are an ideal model system with which to address this question. Pikas are small mammals related to rabbits and hares. At least 28  broadest elevational range possible on one mountain in the Western Himalayas (Fig 1) to quantify genetic structure, genetic adaptation, and gene expression. In comparing pikas along an elevational transect we expected four possible, non-mutually exclusive, outcomes. Pikas from different elevations could exhibit 1) population structure because they have undergone genetic drift due to a lack of gene flow; 2) differential adaptation at each location apparent through outlier loci under positive selection; 3) differences in gene expression; and 4) no detectable differences at the genetic or transcriptomic levels.
In this study, we address each of these possibilities to determine the relative role of each in generating tolerance to hypoxia in pikas living in this extreme environment. Each mechanism has different implications concerning the response of pikas to climate change. Genetic isolation between elevations (outcome 1) would indicate barriers to dispersal. Genetic adaptations (outcome 2) generally occur much slower than the time frame on which pikas are currently shifting their ranges [22]. If high-elevation individuals within this species have evolved unique genetic adaptations for hypoxia tolerance, then lower-elevation individuals may be genetically unfit to live in high-elevation environments (and vice-versa). Modulation of gene expression (outcome 3) on the other hand, a mechanism that has not yet been explored in pikas, can act on a much shorter time scale [23,24]. Phenotypic plasticity is likely to be one of the most powerful tools that an animal species can have at its disposal to successfully respond to rapid climate change [33]. If high-elevation pikas are dependent on alterations in gene expression to compensate for additional oxygen stress, then this is a mechanism that lower-elevation individuals could potentially be capable of quickly replicating to facilitate range shifts. No detectable differences (outcome 4) would indicate that pikas in this species across this gradient are confronting the challenges of elevation equally, or that the responses to elevation were not detectable at the genomic or transcriptomic level with the samples we collected.

Sample collection
Blood samples were collected from 24 pikas between September 24, 2013 andOctober 17, 2013 at three different locations along the Southwest side of Mount Kanamo in Spiti Valley, Himachal Pradesh, India (Fig 1). All samples were collected between 32.246˚-32.349˚North and 78.000˚-78.056˚East (Table 1). Permission to conduct research in this location was granted by the Chief Wildlife Warden of the Himachal Pradesh Forest Department Wildlife Wing under Wildlife Research Study 3853. All samples were obtained by live-trapping using baited Tomahawk traps baited with local red berries and greens as well as fresh fruits and vegetables, such as apples, carrots and corn. Pikas self-transferred from the trap into an anesthetizing chamber through a flexible funnel. Approximately 1 ml of isoflurane was applied to a cotton ball and held in a separate perforated container within a larger anesthetizing chamber. The larger chamber was also perforated in a design that allows adequate air exchange with the outdoor environment. The pika was observed through the clear walls of the chamber and removed for sampling when unresponsive. If the pika became physically active during sampling, it was returned to the chamber for further anesthesia. 500 μL of blood was collected by retro-orbital abrasion using a 100 μL non-heparinized capillary tube. Blood was collected directly into Qiagen RNAprotect animal blood tubes for RNA stabilization. After blood collection, pikas were transferred to an open mesh bag for weighing via a hanging scale and monitored in the bag until they had made a full recovery. Pikas were then released at their point of capture. We experienced no complications during live-trapping and all pikas were successfully sampled and released alive. This study and all procedures described were approved by the Stanford University Administrative Panel on Laboratory Animal Care (APLAC protocol 27547). Using the methods described above, blood collection was performed under isoflurane anesthesia and all efforts were made to minimize discomfort. Pikas were visually located at the lowest elevations they occurred and the highest elevations they occurred on the accessible parts of the mountain. Pikas were then sampled from three locations within 12.5 km of each other. Five samples were collected from around 3,600 m near the town of Kaza (~65% oxygen compared to sea-level). Twelve samples were collected from around 4,000 m near Kibber village (~63% oxygen compared to sea-level), approximately 12 km from the 3,600 m site. Seven samples were collected from around 5,000 m near Tinum (~55% oxygen compared to sea-level), approximately 6 km from the 4,000 m site and 7 km from the 3,600 m site (Fig 1). Studies of American pikas show that the microclimates that pikas experience in their underground talus habitat are often significantly different from ambient temperatures and depend on factors such as shade, vegetation, and rock ice features and can also be highly heterogeneous within one talus patch [34,35]. Additionally, pikas are known to use these microclimates to employ precise behavioral thermoregulation to avoid extreme temperatures [36]. Due to this disconnect between the ambient temperature and that experienced by pikas, the role of ambient temperature in gene expression was believed to be negligible compared to that of hypoxia.
Blood was targeted because it is a tissue that could be collected with minimal harm to the animals, and is biologically relevant for responses to limited oxygen. Blood is spread throughout the body and has a relatively quick turnover rate, making it an ideal tissue to gain real-time information of whole body status in response to physiological and environmental pressures [37,38]. Previous studies indicate that 61-80% of protein-coding genes are expressed in the blood transcriptome [37,39].
RNA-stabilized blood samples were stored in a glacier-melt river following collection and transferred to a -80˚C freezer within 14 days of collection. Total RNA excluding miRNA (less than 200 nucleotides in length) was extracted and purified from the RNAprotect-stabilized blood samples using the Qiagen RNeasy Protect Animal Blood Kit following the manufacture's protocol. RNA concentration was assessed using Qubit RNA HS reagents (Thermo Fisher Scientific). RNA integrity was assessed on a bioanalyzer.

Transcriptome data generation and processing
All samples that had an RNA integrity value greater than 6 and at least 0.5 μg of RNA were used for library preparation. A total of 20 out of the 24 samples passed this quality step and were used to create cDNA libraries to sequence-six samples from the 5,000 m site, nine samples from the 4,000 m site, and five samples from the 3,600 m site (  [41]. We then aligned the sequence data for each individual to this hemoglobin mRNA reference using Bowtie2 and reads that successfully mapped were removed from the sequence dataset. Reads were treated as single-end to remove adapter sequences and low-quality reads were then removed using Trimmomatic [42]. Reads were then sorted and paired using custom python scripts. Paired reads were then sorted again. Read quality was checked before and after each filtering step using FastQC [43]. The O. princeps annotated genome and transcriptome were downloaded from NCBI (GCF_000292845.1_Och-Pri3.0 with 26,240 transcripts). Ochotona princeps is the closest relative to O. roylei with an annotated genome and is approximately 15 million years diverged [44].

Species verification
The identity of our samples as O. roylei were confirmed by aligning MT-CYB gene sequence from our samples to voucher specimen MT-CYB sequence available on Genbank for species in the Con- Ochotona subgenus as an outgroup. Partial MT-CYB sequence was recovered from our transcriptomic dataset and sequences were aligned to published sequences using Geneious v7.1.4. We identified the best nucleotide substitution model for our alignment using jModelTest [45] and created a phylogeny using MrBayes [46,47] run for 1 million generations with a sampling frequency of 200 trees and a burn in of 10%. The phylogeny with posterior probabilities was visualized using Geneious v7.1.4

Variant identification
Variant identification was conducted by following the Broad Institute best practices for variant discovery in RNAseq using GATK [48]. Paired reads for each sample were mapped to the annotated O. princeps reference genome (GCF_000292845.1_OchPri3.0_genomic.fna) using multi-sample two-pass mapping in STAR aligner [49]. We then used picard tools (http:// picard.sourceforge.net) to add read group information, sort, mark duplicates, and index the data for each sample. GATK was used to split reads into exons, trim off any intron sequence, call haplotypes for each sample, and perform joint genotyping of all of the samples together. Base recalibration was not performed because known SNPs are not available for our data; however, variants were filtered using GATK VariantFiltration following GATK recommendations for hard-filtering. GATK VariantFiltration was used to remove variants with a Phred-scaled probably of strand bias (FS) greater than 30, variant quality score normalized by depth of coverage (QD) less than 17, a measure of strand bias (SOR) greater than 3, averaged root mean square mapping quality (MQ) less than 40, read position rank sum (ReadPosRankSum) less than -4, and a depth of coverage (DP) less than 5. We then used VCFtools [50] to remove indels, SNPs missing data for any of the 20 samples, SNPs that were different from the reference but identical across all samples, and singletons. GATK VariantFiltration was then used to remove SNPs in a cluster of three or more within a 30bp window.

Tests for population structure and selection
SNPs were used to investigate population structure using two Bayesian clustering programs, fastStructure [51] and Admixture v1.3.0 [52] using K = 1 to K = 10. In fastStructure, the optimal K was identified using the "chooseK" script that is part of the fastStructure program. In Admixture, the best K value was determined using cross validation scores. Pair-wise F st values and confidence intervals were calculated using the StAMPP package [53] in R [54] by bootstrapping across loci with 1,000 replicates. We calculated the number of shared and private SNPs in each sampling location using Arlequin 3.5.2.2 [55]. Additionally, we used MT-CYB sequences in order to calculate pair-wise F st between sites also using Arlequin 3.5.2.2 [55]. We assessed diversifying selection among the sampling sites using BayeScan v2.1 [56] with a prior odds of 100. PGDSpider v2.1.0.3 [57] was used to create our BayeScan input file.

Gene expression analysis
The O. princeps annotated mitochondrial genome was downloaded from Genbank (AJ537415.1) and the 14 annotated mitochondrial genes were extracted and added to the O. princeps reference transcriptome (GCF_000292845.1_OchPri3.0_rna.fna). This reference transcriptome was indexed and paired reads for each individual were pseudoaligned to this reference using Kallisto v0.42.5 [58]. Kallisto output transcript abundances in transcripts per million (TPM). Sequence based bias correction was implemented and 100 bootstraps were performed on each sample to measure uncertainty in the abundance estimates. Bootstrapping in Kallisto allows us to estimate the probability that a read is assigned to the correct transcript by accounting for technical variance. Transcript abundances and bootstrap values were analyzed in R v.3.2.3 [54] using the package Sleuth v.0.28.0 [59] to identify differentially expressed genes using the Wald test.
Transcripts identified as differentially expressed were run through DAVID v.6.8 [60] where enriched GO categories or KEGG pathways were identified. A heat map summarizing the relative expression and average TPM of each transcript identified as differentially expressed was made using the package gplot v. 3 [54].
Hemoglobin mRNA sequences were removed to allow for accurate sample quality assessment and data analyses. However, we also performed gene expression analyses on hemoglobin transcripts independently by performing the same procedure described above without the initial removal of hemoglobin transcripts.

Species verification
In this study, RNA-stabilized blood samples were successfully collected and sequenced from 20 pikas, representing three collection sites of varying elevations (3,600 m, 4,000 m, and 5,000 m) (Fig 1 and Table 1). We confirmed the species identification of our samples by aligning 664 bp of MT-CYB that we recovered sequence for in each of our samples to that of voucher specimens available on Genbank. The favored model for nucleotide substitution in our alignment was TIM2+G based on AIC and TPM2uf+G based on BIC. As in Lecocq et al. [61], models that could not be implemented in MrBayes were replaced by the most similar supported model. In this case, GTR+G was the most similar supported model for both TIM2+G and TPM2uf+G. The phylogeny output from MrBayes with posterior probabilities show that our samples group with O. roylei (S1 Fig). The MT-CYB sequences for all our samples were between 98.3-99.8% identical to the O. roylei voucher specimen sequence (JX682573.1) and were less than 89.5% identical to any other species in the subgenus to which O. roylei belongs, the Conothoa subgenus.
Between 67-89% of the reads for each sample mapped to the hemoglobin mRNA reference and were removed from the sequence dataset. After all filtering steps, between 2.9-14.7 (average 6.3) million paired reads per sample remained.

Variant identification
Between 65-74% of the final reads for each sample mapped to the reference genome using STAR aligner two-pass mapping [49]. The original variant file contained 2,004,114 sites; however, Genome Analysis Tookit (GATK) filtration steps, indel removal, and the removal of single nucleotide polymorphisms (SNPs) with missing data left 68,788 sites. The reference genome is from a different pika species (O. princeps), so most of these SNPs were differences between the reference and all of our samples but were not variable within our samples. Once removing these SNPs and singletons, 5,038 SNPs remained and were used in population structure and selection analyses.

Tests for population structure and selection
Both fastStructure [51] and Admixture v1.3.0 [52] found one population (K = 1) to be the best fit for our SNP dataset. We calculated Weir and Cockerham [62]

Gene expression analysis
In each sample, 45-58% (55% average) of all reads mapped to the reference transcriptome with reads mapping to 10,704 of the 26,254 transcripts. After accounting for the false discovery rate, when comparing the sample from the 5,000 m site to samples from the two other locations, 26 transcripts were significantly upregulated in the 5,000 m individuals and 40 transcripts were significantly down-regulated in the 5,000 m individuals (q-value < 0.05) (Fig 2). Our heat map (Fig 3), which also displays transcripts per million (TPM) for each of these transcripts, indicates that transcripts upregulated in the high-elevation group also generally have much higher TPM than the down-regulated transcripts. The set of upregulated and down-regulated genes were each queried in DAVID v.6.8 [60]. The set of upregulated genes ( Table 2), 23 of which were successfully annotated with the human genome through DAVID, displayed enrichment for numerous gene ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways ( Table 3). The oxidative phosphorylation KEGG pathway showed the most significant enrichment (Benjamini-Hochberg multiple test correction (BH) p-value = 4.66E-09). DAVID identified six GO categories as being significantly over represented in our list of upregulated genes with BH p-value of < 0.05 (Table 3) all of which are related to the mitochondrial respiratory chain, also known as the mitochondrial electron transport chain. The set of down-regulated genes ( Table 4), 39 of which were successfully annotated with the human genome through DAVID, had no enrichment for any GO category or KEGG pathway (BH p-value > 0.27).
We also conducted a literature search on significantly upregulated genes that were not identified in DAVID as playing a role in the over representation of any GO category or KEGG pathway. Interestingly, GSTA1 plays a role in breaking down toxic products resulting from oxidative stress [63]. As such, the increased expression of this gene is considered to be an adaptation to oxidative stress [63]. The cold-inducible RNA binding protein (CIRBP), as evident by its name, displays upregulation in response to mild hypothermia, but is also known to be upregulated in response to hypoxia in human cells [64]. The most significantly upregulated gene, insulin-like growth factor binding protein 4 (IGFBP4), regulates growth and development of tissues by negatively regulating insulin-like growth factors (IGFs) and has been seen to be significantly upregulated in response to hypoxia in experiments using human brain cancer cell lines [65].
While there were no enriched GO categories or KEGG pathways in the down-regulated gene set, we also further investigated these genes through a literature search. We found that a number of these genes are known to be down-regulated in response to hypoxia in human cells, such as PLA2G4C [66] and MACF1 [67]. We also found that some of these genes are potentially related to immune response such as HNRNPR which positively regulates MHC class 1 expression [68], DOCK8 which is important in the functioning of natural killer cells and receptor-γt-positive innate lymphoid cells [69,70], and ELF1 which plays a role in the functioning and development of lymphocytes [71]. Additionally, the second most down-regulated gene, GGNBP2, is believed to play a role in spermatogenesis [72], and another significantly down-regulated gene, ARID4B, is a positive regulator of male fertility [73].
Analyses were also run comparing low-elevation samples to all other samples, as well as mid-elevation samples to all other samples. The low-elevation samples showed no significantly differentially expressed genes (all q-values > 0.91). The mid-elevation samples displayed one transcript that was marginally significantly different (transcript ID: XM_012929922.1, gene: ADGRE3, beta value = 1.18, q = 0.052) with all other transcripts displaying q > 0.33. Additionally, gene expression analysis of the data with hemoglobin transcripts included showed that none of the hemoglobin transcripts were significantly differentially expressed between the 5,000 m site sand the two lower-elevation sites (S1 Table).

Discussion
Our results indicate that differences in gene expression is likely an important mechanism facilitating hypoxia-tolerance in pikas along the elevational gradient we sampled. After testing for evidence of population structure, differential selection, and differences in gene expression between the sites, we only found evidence of significant differences in gene expression.
We performed multiple analyses to determine if there was any evidence in our data that these three sites could be considered discrete populations. Our SNP analyses indicate that individuals from these three sampling locations are not structured and thus can be considered one population, indicating no substantial barriers to gene flow across this steep elevational gradient. While our results suggest that elevation is not an effective barrier to gene flow, nothing is known about the dispersal distance or effective dispersal barriers for O. roylei, which has a home range of at least 42 m in diameter [74]. However, in the American pika, dispersal distances of 3 km have been observed for individual pikas [75] and dispersal distances have been estimated with genetic data to be up to 5 km [76] with home ranges of up to about 32 m in diameter [8]. In this study, our three sampling sites are between 6-12 km from each other (Fig 1).
Our SNP analyses also show no signs of differential selection between the sampling locations consistent with the absence of population structure, which implies that gene flow between the sites would be acting against the accumulation of any local adaptations. However, as these transcriptome data are limited to expressed genes, we cannot say for sure that there is not selection occurring in other non-coding regions or in genes that were not captured in our dataset. Further studies assessing selection across the entire genome are necessary to conclusively address differential selection between elevations. Additionally, our limited sample size makes it difficult to estimate the proportion of private vs. shared SNPs and future studies with more samples, and thus more power, could add to our understanding of the mechanisms at play.
The only significant difference between the sampling sites was found in our gene expression analyses. In a field study such as this, we are unable to control for all variables between sites that could impact gene expression; however, confounding environmental variables have been minimized by the proximity of our sampling sites. Additionally, our results show that the functions of the genes undergoing changes in gene expression are consistent with what we would expect in an organism compensating for limited oxygen. We found significant upregulation of genes involved in oxidative phosphorylation and mitochondrial electron transport in the highest elevation individuals. The process of oxidative phosphorylation (OXPHOS) creates 95% of the cell's energy [77] through the electron transport chain and is an essential cellular process for maintaining the health of the cell and survival of the organism. This process depends on the availability of oxygen, which is used as a terminal electron acceptor; thus limited oxygen, or hypoxia, directly affects cellular viability [78]. Due to the direct effect that hypoxia has on this vital pathway, there have been numerous examples of genes in this pathway undergoing selective pressure in hypoxia-adapted species [14,17,79,80]. Specifically in pikas, Lemay et al. [81] compared the transcriptome of American pikas along an elevational gradient and found different haplotypes of ND5, a mitochondrial gene important to the OXPHOS process, fixed at different elevations.
In fact, similar differences in expression of genes in the OXPHOS pathway have been identified in other species in response to hypoxic stress, however, these studies compared geographically distant high and low-elevation populations. When comparing the gene expression of rufous-collared sparrows living at 2,000 m to those living at 4,100 m in the Peruvian Andes, 187 annotated transcripts were upregulated in the high-elevation individuals and these transcripts were enriched for genes involved in oxidative phosphorylation, oxidative stress response, protein biosynthesis and signal transduction [24]. Cheviron et al. (2008) also found that when high-elevation individuals were brought to low elevation none of these transcripts remained differentially expressed [24], suggesting a within-individual plasticity in gene expression to compensate for elevational stress. Similarly, when comparing gene expression in deer mice from high and low elevations, 221 genes were found to be significantly differentially expressed, with many genes in the OXPHOS pathway upregulated in the high-elevation individuals and linked to elevated oxidative capacity and thermogenic capacity [27,28]. However, in the deer mouse, these changes in gene expression persisted even in a low-elevation common garden but were lost in the F1 generation [27]. The current study does not address whether changes in gene expression can occur within an individual, as in the rufous-collared sparrow [24], or if these expression profiles are hardwired within an individual and can only be reset in their progeny, as seen in the deer mouse [27], perhaps indicating genetic or epigenetic regulation of gene expression [82,83].
Our study adds to the evidence that genes in the OXPHOS pathway are upregulated in response to hypoxia and validates the value of utilizing blood in such a study, a tissue that does not require sacrificing specimens as in similar studies. We hope that the methods outlined here may broaden the options available for gene expression studies without requiring lethal sampling and yet still yield rich, biologically meaningful, data.
Additionally, there was no enrichment for any pathways or GO categories in the set of genes significantly down-regulated in the 5,000 m group; however, we did identify a few genes in which down-regulation may indicate a down-regulation of parts of the immune system and fertility. It is physiologically costly to compensate for increased hypoxia, and further exploration of genes identified here may provide insight as to what trade-offs may be taking place.
Previous studies have shown that gene expression profiles can be very different between geographically distant and genetically distinct populations [24,27,28,84]; however, divergence in gene expression increases with greater genetic distance [85]. Our study capitalized on the uniquely precipitous mountains of the Himalayan massif to draw both high and low-elevation samples from one area and has allowed us to begin to tease apart the role of genetic differences versus expression differences in response to hypoxia.
The results of our study indicate that plasticity in gene expression may be a key mechanism in allowing this pika species to live at 5,000 m versus 4,000 or 3,600 m. Changes in gene expression, unlike genetic adaptations, occur on a time scale that can keep pace with rapid climate change [23,24]. Other studies indicate that different pika species have evolved unique adaptations to hypoxia, perhaps specializing each species for the elevational range that it occupies and potentially limiting range movement outside of that elevational range [13,14]. However, this study suggests that, within a species, plasticity in gene expression may also facilitate range movement at a finer scale. We have not investigated the trade-offs that might be involved in this plasticity, however. Thus, while each species, or even populations, may be ideally suited to its general elevational range through genetic adaptations, within a population, plasticity in gene expression may be responsible for facilitating movement within the species' elevational envelope. This flexibility in elevation is likely to be an important source of resilience for lowerelevation pika populations impacted by climate change, helping to facilitate successful range shifts to higher, cooler, elevations within their species' elevational range.