Temporal transcriptome analysis suggest modulation of multiple pathways and gene network involved in cell-cell interaction during early phase of high altitude exposure

High altitude (HA) conditions induce several physiological and molecular changes, prevalent in individuals who are unexposed to this environment. Individuals exposed towards HA hypoxia yields physiological and molecular orchestration to maintain adequate tissue oxygen delivery and supply at altitude. This study aimed to understand the temporal changes at altitude of 4,111m. Physiological parameters and transcriptome study was conducted at high altitude day 3, 7, 14 and 21. We observed changes in differentially expressed gene (DEG) at high altitude time points along with altered BP, HR, SpO2, mPAP. Physiological changes and unsupervised learning of DEG’s discloses high altitude day 3 as distinct time point. Gene enrichment analysis of ontologies and pathways indicate cellular dynamics and immune response involvement in early day exposure and later stable response. Major clustering of genes involved in cellular dynamics deployed into broad categories: cell-cell interaction, blood signaling, coagulation system, and cellular process. Our data reveals genes and pathways perturbed for conditions like vascular remodeling, cellular homeostasis. In this study we found the nodal point of the gene interactive network and candidate gene controlling many cellular interactive pathways VIM, CORO1A, CD37, STMN1, RHOC, PDE7B, NELL1, NRP1 and TAGLN and the most significant among them i.e. VIM gene was identified as top hub gene. This study suggests a unique physiological and molecular perturbation likely to play a critical role in high altitude associated pathophysiological condition during early exposure compared to later time points.


Introduction
Reduced barometric pressure with increased altitude concurrently reduces the inspired oxygen partial pressure in humans [1]. In spite of reduced partial pressure of inhaled oxygen to the cells and tissues body adjusts to survive the extreme hypoxic environment referred as acclimatization [2]. These adjustments results in series of physiological responses to maintain adequate tissue oxygen delivery and supply at high altitude [3]. Increased pulmonary ventilation, an increase in cardiac output by increasing heart rate, changes in vascular tone, as well as increased hemoglobin concentration are well reported and observed as compensatory mechanism [4][5][6]. These responses to altitude are dependent upon individual to individual, their history of exposure to altitude, ventilation physiology, age, and altitude attained and stay [7,8]. Individuals un-acclimatized to altitude are bit prone to altitude induced immediate effects like acute mountain sickness (AMS). It is the consequence of severe AMS that can lead to life threatening conditions like high altitude pulmonary edema and high altitude cerebral edema [9]. AMS is not life threatening if occur at initial days of altitude stay and then gradually subsides. There are several physiological, metabolic, and transcriptional level changes observed in humans, which can cause initial AMS and prevalence of AMS signs during preliminary altitude acquaintance [10,11]. Highland area which had gained interest in past few years is Tien-Shen and Pamir mountain ranges in Kyrgyzstan. Very less is known about the Kyrgyzstan adaptation feature, besides few studies [12,13]. Kyrgyz acclimatization responses at an altitude of 3200m recently published by us was the first study to explore the signaling by genome wide expression analysis associated with hypoxia exposure [14] but the gene expression at high altitude >4,000 m remains unexplored in Kyrgyz. We analyzed the time based exposure induced perturbation in physiological and gene expression pattern at high altitude in Kyrgyz. Global transcriptome analysis using RNA sequencing, recent advanced technique, could effectively detect thousands of genes and their consecutive expression patterns. Systemic bioinformatics tool could further enhance the data presented by the RNA sequencing. Time series analysis benefits biology to dynamically capture the complete drift impinges in humans and specific selection of decisive time point for further therapeutic interventions. Present study was done with an objective to comprehend the responses during long 21 days stay at altitude of 4,111m in Kyrgyzstan by comparing the temporal response at each time point as compared to basal response. We studied molecular orchestration by enriched pathway analysis, gene clustering, and network analysis to find out critical time point indicative of acclimatization and top candidate genes controlling acclimatization process and cross validating it with physiological and biochemical markers. . . Time point analysis was performed because use of single time point can lead to static outcomes. This all can comprehensively help understand the acclimatization physiology at 4,111m.

Human volunteers and study design
The population group under study consists of young male residents of Kyrgyzstan (�800m) origin (n = 30) matched for age and BMI. Volunteers were fresh inductees to high altitude, did not show any kind of cardiovascular or neurological disease at basal as per their medical record data and were not into any kind of medication or supplementation. Volunteers were provided the same food/ration ad libitum with similar nutritional aspects during entire study duration. Protocol of the study was approved and carried out according to the guidelines by Ethics Committee of Defence Institute of Physiology and Allied Science (DIPAS), Delhi. After explaining the study protocol written informed consent was obtained from each individual. Basal studies started at Bishkek, Kyrgyzstan (800 m) and data was collected. Volunteers were inducted by road to~3000m (for 4 days) for acclimatization as per prescribed protocol of the study. After that, volunteers were moved to 4,111m by road where they stayed for 21 days and throughout the altitude stay temporal analysis was performed at Day3 (HAD3), Day7 (HAD7), Day14 (HAD14) and Day21 (HAD21). The basic characteristics of the individuals selected for the study has been provided in the Table 1.

Lake Louise Acute Mountain Sickness score recording
A very straightforward and common grading system for diagnosis of acute mountain sickness is the Lake Louise self-assessment questionnaire [15] with a headache and a score of �3 (ranges 0 to 12) represent AMS. The standard Lake Louise symptom questionnaire includes assessments like headache, gastrointestinal symptoms, fatigue and dizziness to produce a total score between 0 and 12. Each symptom was scored on a scale between 0 and 3 by volunteers (0 = none; 1 = mild; 2 = moderate; 3 = severe). Before initial recording volunteers were introduced with the terminologies, meaning of each symptom and how to identify signs of AMS and symptoms of AMS appeared in volunteers after ascent to high altitude were recorded. If volunteers had question regarding the meaning of the symptoms, question was rephrased and provided with an example to make them understand. AMS recording was done on first 7 days at 4,111m altitude stay.

Physiological measurements
Blood pressure, heart rate, arterial oxygen saturation, body weight and height. Systolic and diastolic blood pressure along with heart rate was measured using mercury free BP instrument (OMRON, USA) in the morning hours (0600-0700 hrs) in supine position when the participants were awake but still lying on bed. Saturation of peripheral oxygen was measured using finger pulse oximeter (Nonin Medical Inc, USA). It specifically measures percentage of oxygenated hemoglobin compared to total hemoglobin in blood giving an estimate of arterial oxygen saturation. Body mass was recorded after voiding between 0600-0800 hrs before breakfast using electronic platform balance (Deca 770, Seca incorporation, USA) and height was recorded using calibrated height rod (Seca Ltd, Medical Scale and measuring system, Brimingham, UK; Least Count:1 mm).
Pulmonary artery pressure by two-dimensional Doppler echocardiography. Twodimensional and Doppler echocardiography were performed from the standard parasternal, apical, and subcostal views in the resting state, in the supine or left lateral position using ultrasound systems (Phillips SX50, with transducer S5-1, Bothell, USA). The standard M-mode measurements of aorta, left atrium, left and right ventricular wall thickness, and left ventricular end-diastolic and systolic dimensions and right ventricular dimension at end-diastole and end-systole were made from the parasternal long-axis view as recommended by the American Society of Echocardiography [16]. All reported values represent the mean of at least 3 measurements. Peak E and A diastolic inflow velocities from mitral and tricuspid valves and tissue Doppler imaging (TDI) derived velocities of lateral and septal mitral annulus and lateral tricuspid annulus were measured to characterize left and right ventricular diastolic functions. Systolic pulmonary artery pressure (SPAP) was estimated using maximal tricuspid regurgitation (TR) jet velocity (V), assessed by continuous Doppler and transformed by the modified Bernoulli equation (4 � V 2 ) to which right atrial pressure (RAP) was added. RAP was considered as equal to 5 mmHg, taking into account that all volunteers displayed IVC < 2 1 mm and > 50% collapsibility. Mean pulmonary artery pressure (mPAP) can be approximated from the SPAP, using the following formula = 0.6 � SPAP + 2mmHg [17]. These calculations were based on multiple TR velocity measurements in nearly all cases (median number of TR velocity measurements = 6).

Blood collection; serum, plasma separation and storage
Blood samples were collected at basal, high altitude day 3, day 7, day 14 and day 21 between 0600-0800 hrs from anticubital vein after 12 hours fasting in ethelnediaminetetraacetic acid (EDTA) and gel tubes at basal (control) and at different altitude time points (test). The tubes were centrifuged; plasma and serum were separated by centrifugation at 3000 rpm for 20 min at 4˚C and stored at -80˚C until assayed. Venous blood samples from individuals at sea level and at different altitude time points was also collected in PAX gene Blood RNA Tubes (PAXgene, PreAnalytix, Hombrehtikon, Switzerland, distributed by Qiagen, catalogue no. 762165). PAX contains 6.5 ml of RNA stabilizing solution in which 2.5 ml of whole blood is added using a blood collection accessory at room temperature. After blood collection, PAX tube was inverted 8-10 times, were kept upright at room temperature (18-25˚C) for 2 hours than transferred to freezer (-20˚C) for 24 hours and finally stored at -80˚C.

RNA processing
RNA isolation, quantification and quality check. Isolation of total cellular RNA was done using PAX gene blood RNA Kit (PreAnalytix) as per manufacturer's protocol. DNase digestion was carried out for the removal of DNA using RNase free DNase set (Qiagen) on the RNA spin column. Purified RNA was immediately chilled on ice. RNA concentration and quality was evaluated by measuring absorbance at 260 and 280 nm using a spectrophotometer (NanoDrop, USA) before being stored at -80˚C. The RNA samples were quantified using Qubit HS RNA kit (which specifically quantitates RNA). The samples were checked for degradation using Agilent Bioanalyzer RNA 6000 nano kit for determination of RNA Integrity Number (RIN) with 1 being the most degraded and 10 being the least, calculated by computing the ratio of the areas under 18S and 28S rRNA peaks, the total area under the graph and by measuring the height of 28S peak. The RIN values more than 7 were considered for further analysis.
RNA sequencing and data availability. Two technical replicate samples for each control i.e. Basal and test i.e. high altitude D3, D7, D14 and D21 were transferred to Illumina HiseqX sequencing system and loaded sequencing by synthesis (SBS) reagents as per Illumina recommendation to perform 2x150 paired end sequencing. There were approximately 45 million (150-nt length) reads. The median quality score was >30 across all the samples. The sequence reads were mapped to the Human reference genome (HG19) with STAR aligner (V.2.0). The STAR mapped reads were processed to remove PCR and optical duplicates using Picard tools and raw expression count was generated using feature Counts. The RNA sequencing data is submitted to Gene Expression Omnibus (GEO): Accession Number: GSE133702.
RNA sequencing data normalization. The sequence data quality was checked using FastQC and MultiQC software. The data was checked for base call quality distribution, % bases above Q20, Q30, %GC, and sequencing adapter contamination. All the samples in technical replicates have passed QC threshold (Q30>80%). Raw sequence reads were processed to remove adapter sequences and low quality bases using Trimgalore. The QC passed reads were mapped onto indexed Human reference genome (HG19) using STAR v2 aligner. Overall 94.56% of the reads aligned onto the reference genome. The PCR and optical duplicates were marked and removed using Picard tools. Gene level expression values were obtained as read counts using featureCounts software. Expression similarity between biological replicates was tested by spearman correlation. For differential expression analysis the biological replicates were grouped as test (the altitude time points group) and control (the basal group). Differential expression analysis was carried out using edgeR package. The read counts were normalized and 40509 features (70.06%) have been removed from the analysis because they did not have at least 1 counts-per-million in at least 2 samples.

Differentially expressed gene analysis
Differentially expressed genes data representation. Gene expression pattern signal intensities of each gene probe were acquired by clustering according to the Euclidean distance. It reveals the correlation during temporal analysis in Kyrgyz for each gene probe to generate heat map in which matrix of values is mapped with matrix of colors. Heat map was constructed using web based matrix visualization and analysis platform Morpheus for group wise analysis [18] and CimMiner for sample wise analysis (http://discover.nci.nih.gov/cimminer). Venn diagram representation of differentially expressed genes in Kyrgyz was done using 'InteractiVenn' [19].
Gene enrichment analysis using DAVID, functional annotation clustering using Morpheus and Gene Ontology analysis using cytoscape BinGO plugin. The differentially expressed genes were studied for their overabundance in different GO terms as well as pathways. The overabundance of a particular term was decided based on the number of significant genes in the analysis. Functional annotation clustering was achieved using the database for annotation, visualization and integrated discovery (DAVID) (https://david.ncifcrf.gov/) can provide systematic and comprehensive biological function annotation information for high throughput gene expression. Therefore, we applied KEGG pathway analyses to the DEGs using DAVID online tool at the functional level. Differentially expressed genes which were segregated both numerically log2FC� ±0.58 and significantly pvalue <0.05 were assigned for official gene number and accession number. These genes were put into DAVID analysis based on official gene name and accession number to extract KEGG enrichment. Pathways obtained for the enriched genes were ranked based on the adjusted p value (Benjamini-Hochberg adjustments). The identified gene ontologies were screened using Morpheus for their expression at different altitude time points. The file was uploaded to the Morpheus heat map building tool available from the Broad Institute. The file was formatted to conform to the expected column headings as described in the input file documentation. Custom colors were selected to represent high expression (red), moderate expression (pink), and low expression (blue). 394 common genes during temporal analysis were matched for the similar gene transcripts obtained from functional annotation clustering (using Morpheus) and those common genes were analysed for gene ontology using BinGO. Overrepresentation and visualization of gene ontology (GO) terms and construction of enrichment GO network using Cytoscape 3.2.1 with Biological Networks Gene Ontology tool BinGO plug-in. Node size represented number of targets and color represents significance of the GO category.
cDNA conversion of RNA, quantitative and technical validation of genes using real time quantitative reverse transcription-polymerase chain reaction. Technical validation was performed using samples from basal and high altitude day3 time point group. Samples for the technical validation were kept same as taken for the RNA sequencing with RNA integrity number >7. 1μg of RNA was converted into double stranded cDNA by reverse transcription using cDNA Synthesis Kit (RT 2 First strand kit, Qiagen, USA) according to the manufacturers protocol with an oligo (dT) primer containing a T7 RNA polymerase site added 3' of the poly (T). The selection of Gene of interest (GOI) was to validate the RNA sequencing data. The genes selected were considered because they were significantly differentially expressed and are involved in pathways obtained from DAVID analysis. The GOI's are: TBXA2R, GP9, ITGB2, HBA1, VEGFB and TNF. The primers used for technical validation of GOI's are shown in Table 2. Two steps qRT-PCR (Applied Biosystems, Thermo Fisher Scientific Corporation, USA) was performed to confirm the differential expression results obtained from the RNA sequencing experiments. Primers were designed for 7 genes and polymerase chain reaction conditions were optimized individually using SYBR green (Qiagen, Hilden, Germany). The quantitative RT-PCR was run with SYBR green mix in a final volume of 25μl. The PCR thermo-cycler program was set at on Step one Plus software for 95˚C for 10 minutes followed by 40 cycles of 95˚C for 15sec, 55˚C for 60 seconds. Melt curve analysis followed the PCR step. Each RTPCR experiment was replicated before final data analysis for final interpretation. 2^-ΔΔCT formula was used for determination of the fold change in each GOI. Endogenous control used was 18s for the normalization of gene of interest expression values.
Network analysis of the Gene coexpression using cytoscape GeneMANIA plugin and identification of significant hub genes. GeneMANIA cytoscape plugin, which is an open resource platform for visualizing complex networks, was used to establish gene interaction relationship network, gene coexpression and hub genes were identified. Time point day 3 differentially expressed genes in comparison to basal were taken for analysis. Gene showing coexpression network analysis from cytoscape was done using five calculation methods: Average shortest path length, degree, clustering coefficient, closeness centrality and betweenness centrality. The intersecting genes derived using these five algorithms encodes core proteins and may represent key candidate genes with important biological regulatory function.

Biochemical analysis for biological validation
In accordance with the physiological, DAVID and network analysis expression changes we determined quantitatively proteins showing involvement in significantly perturbed biological

Statistical analysis
Physiological measurements were represented as mean±SEM in graph. For RNA sequencing analysis, the final set of differentially expressed genes shown in S1 File was filtered based on statistical significance of adjusted p value of <0.05 and log2 fold change �±0.58 in comparison to basal for each time point. For the functional enrichment analysis the genes with considerably enriched GO terms in DEG's were selected statistically by Bonferroni/ Benjamini-hochberg FDR correction. Biochemical analysis for biological validation of gene expression was represented as mean±SEM and statistically analyzed using student t-test. A p value of �0.05 was applied as the cut off value for statistical significance at all time.

Results
To comprehend the complex response towards high altitude hypoxia we studied physiological perturbation in Kyrgyz population along with whole genomic changes in temporal manner. We designed experiment strategies adopted for the current study as schematically summarized and represented in Fig 1.

Time point study at high altitude present considerable effect on physiological parameters
To determine the severity of high altitude hypoxic stress during different time points investigations from our sample size of n = 30 were done. Physiological parameters that are directly

PLOS ONE
related to high altitude condition like blood pressure, heart rate, arterial oxygen saturation, mean pulmonary artery pressure and acute mountain sickness score were investigated. Systolic blood pressure increased significantly on D3 (p<0.05) of HA as compared to basal. Thereafter, it reduced to almost basal level on D14 and D21 of high altitude (Fig 2A). Diastolic blood pressure increased upon ascent to high altitude on day3 (p<0.001) and remained almost similar on rest of the days at altitude (Fig 2B). Resting heart rate increased at HA D3, D7, and D14. At D3 increase was statistically significant as compared to basal (p<0.01) and heart rate increased at later time points significantly D7 & 14 (p<0.001). On D21 there was slight drop in heart rate but as compared to basal the increase was significant (p<0.01) ( Fig 2C). Oxygen saturation in blood decreased to a great extent on HAD3 as compared to basal, thereafter it regained but remained significantly less than basal at each time point (p<0.001) ( Fig 2D). Mean pulmonary artery pressure increased with altitude exposure. The significant up regulation (p<0.001) is observed at all-time points (Fig 2E). We investigated the prominence of responses individual recognized on reaching to high altitude as compared to basal. Acute mountain sickness prevalence was recorded at initial 7 days of altitude. To determine the severity of stress, investigations from our sample size of individuals was done, the individual and mean score remained less than 3, indicating no prevalence of AMS (Fig 2F). The sequential changes in physiology i.e. BP (systolic and diastolic), HR, SpO 2 , pulmonary artery pressure & AMS scoring allowed us to capture the transition between early high altitude exposed physiological changes and later stable response.

Data acquiring and analysis of differentially expressed genes at different high altitude time point in comparison to basal
The

DAVID Bioinformatics of differentially regulated genes in Kyrgyz at different time points of high altitude exposure
To integrate the complex physiological analysis and transcriptomic data clustering of differentially expressed genes, DAVID Bioinformatics was performed. Transcriptomic data sets of different time points were assessed after applying cut off of Log2FC�0.58. Up regulated genes and down regulated genes were considered for finding out the considerably enriched KEGG pathways (      PRF1, LAT, TNF, RAC2, ARAF, KIR2DL1, ITGB2, KIR2DS4   changes like reduced SpO 2 , increased BP, HR and mPAP response could directly relate the perturbation of whole genome for acclimatization at HAD3. The response majorly covers the fact that there is a clear transition state in human body when reached to high altitude from basal and at later stage the response get stabilized.

Clustering the temporal expression of genes, biological pathways using Morpheus and BinGo analysis
After data mining, we sought to understand how expression pattern of these genes occur during different time points post high altitude exposure. Grouping of pathways helped us unify genes in a common category and further comprehend our data. After data compilation broad categories obtained were cell-cell interaction, blood signalling, coagulation system, cellular processes and inflammatory responsive pathways. Separate heat maps were prepared consistent with the gene expression of KEGG pathway enrichment analysis and were further compiled. These KEGG pathways comprised of both down and upregulated expression of different high altitude exposure temporals. First heat map (Fig 4A) (Fig 4B) largely involved in inflammatory responsive pathways. Pathways shown in heat map were indicative of upregulation of majority of genes at HAD3. Such a time progression hypoxic exposure clustering analysis from genes to biological roles direct us to a preliminary conclusion that maximum genes were upregulated at initial time point 'HAD3', some genes were perturbed on subsequent time point, while majority of genes showed a stable response at later time points. Heat map indicated distinct expression pattern and importance of cellular responses at early time point. Interestingly we observed common biological process emerging in independent data mining strategies, suggesting specificity of analysis and also unique nature of gene ontology functions for Kyrgyz. 394 common genes were sorted for overrepresentation of GO terms using cytoscape 3.2.1 with BinGO plugin in FDR correction using Benjamini-Hochberg generated enriched clusters. Enriched terms included Biological process, immune response and immune system process ( Fig 5A). Common genes sorted from biological process data mining of 394 genes and from Morpheus gene annotation clustering were CD274, CDH2, ESAM, LAMC1, FOLR3, FOLR2, GUCY1A2, TUBB4A, RPS26, RPS15A, CD19, SERPING1, C4BPA, CCR9, TNFRSF17, CCL3, CCR9, CXCL5, IL9R, PFN1, ACTB. These gene transcripts mined with BinGO plugin revealed biological terms which were clustered further as biological and cellular processes, transport process, immune response, regulation of immune response and regulation of metabolic process ( Fig 5B).

Validation of RNA sequencing experiments using qRT-PCR
We also performed technical validation of DEG's using qRT-PCR by selecting random genes which show critical and overlapping involvements in different gene ontologies. Thus, 6 genes were selected. Substantial agreement was observed between the RNA sequencing and qRT-PCR results in all 6 genes. The fold change obtained in each gene from both the techniques showed similar drift (Fig 6). Candidate gene expression was normalized each time in different time points with 18s as the endogenous control.

Differentially expressed genes network analysis and hub genes identification
GeneMANIA coexpression network of high altitude day3 time point genes was constructed using GeneMANIA cytoscape plugin. After importing the data from cytoscape and running the network analysis, the top 10 genes were evaluated by the 5 topological feature average shortest path length, degree, clustering coefficient, closeness centrality and betweenness centrality. The network obtained from network analysis is provided in S1 Fig. As the network was very complex so we figured out important nodes/genes of the network on the basis of topological features. The most significant genes were different in different topological categories but out of them 3 categories showed overlapping genes (Table 4). Thus we short down the network Proteins were picked to deeply understand the responses acquired by sequential analysis performed. Cortisol expression was measured to confirm the physiological changes. Cortisol increase was~19% at high altitude day 3 as compared to basal. Further, angiotensin converting enzyme, coagulation factor 8 and vascular endothelial growth factor were picked for validating the pathway enrichment analysis expression levels,~12%, 32% and 33% increase was observed at D3 time point in  Table 4. Top 10 genes obtained from network analysis of HAD3 DEG's using three different topological features: Degree; closeness centrality and betweenness centrality. Three categories designate Vimentin as the mutual gene.

Gene Name
Degree Gene Name Closeness centrality Gene Name Betweenness centrality comparison to basal respectively. Also, the protein expression level of hub gene obtained from GeneMANIA cytoscape analysis Vimentin was observed to be~69% increase at day 3 time point in Kyrgyz individuals when compared to basal (Fig 7A-7E).

Discussion
High altitude imposes phenomenon which lead to preliminary pathophysiology and acclimatization benchmarks to survive the same. In our study we found that response to high altitude is complex and depends on time of exposure. Transcriptome analysis using RNA sequencing can provide novel biological insights into the pathways that are critical for high altitude survival.
Our study adopted an approach in Kyrgyzstan group to understand the biological basis of high altitude induced acclimatization response in a temporal manner. We used approach in which whole blood gene expression profile at high altitude and control i.e. basal were compared using different bioinformatics tool to find out the novel biological players pertinent to high altitude. Our study reflected increased systolic and diastolic blood pressure during initial time point and significant increased levels of cortisol in Kyrgyz significantly at D3 (p<0.05) point out towards stress condition after high altitude exposure. Mean pulmonary artery pressure increased considerably on day3 of high altitude made them prone to pulmonary vasoconstriction and increased pulmonary arterial vascular pressure. SpO 2 has been suggested as a useful tool for prediction of hypoxia. SpO 2 significantly (p<0.001) dropped at altitude during initial time point HAD3 and regained at later time points. This result is consistent with the previous studies in which lower arterial oxygen saturation has been observed in hypobaric hypoxia [20,21]. The reason may be increased dead space ventilation, triggered by lower tidal volume and higher breathing frequency [20,22]. Heart rate significantly (p<0.001) increased on Day7 during initial time point as the blood pressure and mPAP increased contrary with the Tibetans that mPAP did not change and they develop only minimal hypoxic pulmonary hypertension [23]. BP, HR and SpO 2 helped us monitor the progression of high altitude induced hypoxia and gave a lead for gene expression level changes. Overall, temporal analysis of Kyrgyz physiology helped understand the pattern of responses and cover all phases in this group to study the dynamic changes trend. Physiology parameters reveal time point D3 to be perturbed significantly, thus this phase could be point of analysis. We found a time window where physiological changes are observed and thereafter it subsides gradually. It has given us an opportunity to understand the molecular basis of this phenomenon. In our study when we subjected gene expression data to unsupervised learning we found a similar trend that day 3 has a unique molecular signature compared to other time point.
Further analysing day 3 time point touched upon pathways related to Cellular dynamics and cell adhesion molecules; Rap1 signalling, MAPK, vascular morphogenesis and angiogenesis, Platelet Activation and Hypercoagulation, Angiotensin converting enzyme, hypoxic pulmonary vasoconstriction and immune response which are further separately discussed.
The mechanical interaction between the cell and extracellular matrix influence the cell behaviour and function. Cell adhesion is generally the ability of a single cell to stick to another cell or an extracellular matrix forming an endothelium which is a dynamic organ to regulate vascular tone [24,25]. Vascular cell adhesion molecules are expressed on activated endothelium. Along with it cell-cell adhesion CAM's are involved in the transmission of signals across cell membrane [25]. It is reported that continuous hypobaric hypoxia cause altered expression of junctional protein complex of vascular endothelial cells [26,27]. As evident from our data gene transcripts like integrin ITGAV, ITGB2, ITGB7, ITGA4 and ITGA2; in cell adhesion molecule, focal adhesion and extracellular matrix receptor interaction, involvement of collagens COL6A3, COL6A2, claudins CLDN5 and other cell adhesion molecules like, ICAM3, CADM1, CDH2, CD58, ACTB, TLNN1, FLNA, ACTG1, LAMA5, PARVB, MYLK all are upregulated during initial time point in Kyrgyz suggesting modulation of cytoskeleton dynamics as a part of acclimatization to overcome hypobaric hypoxia. Similarly, in Tibetans and Sherpas gene network analysis suggest involvement of collagen and integrin in multiple pathways which are involved in the cellular functions related to angiogenesis [28].
Temporal cellular level changes at high altitude reveal occurrence of endothelial dysfunction. These cell adhesion dynamics might point towards enhanced Rap1 signalling which is evident in our present transcriptome data. Importance of cell dynamics is in several processes like coagulation, angiogenesis etc. Angiogenesis is a process of formation of new blood vessels from the existing vasculature [29]. This process needs cellular level proliferation which is evident by the up regulation of Mitogen activated protein kinase signalling during hypoxia. Evidence suggested role of hypoxia in phagosome formation, as evident in our study, for initiation of phagocytosis by activation of intracellular pathways i.e. MAPK pathway that could mediate the early events required for internalization/phagocytosis [30]. Gene transcripts like MAP2K3 log2Fold change = 0.72, MAPK7 log2Fold change = 0.65, RAC2 log2 Fold change = 0.70 (a small GTPase that regulate cellular responses) [31] and JUN log2 Fold change = 0.71 which is involved in proliferation and survival [32] were upregulated. Rap1 is a Ras family of GTPase that is activated by guanine nucleotide exchange factors (GEFs) and is subsequently converted back to its inactive GDP-bound state by GTPase-activating proteins (GAPs). Ras family of GTPase i.e. Rap1 play a key role in regulation of angiogenesis by modulating endothelial cell functions [33]. Rap1 promotes angiogenic signalling by several factors including growth factors like EGFR, FGF and VEGF. In our study the gene transcript level of VEGFB log2 Fold Change = 0.66, FLT4 log2 fold change = 0.631 and FGFR1 log2 fold change = 1.412 has been noticed to be upregulated on Day3 time point thereafter the expression got down regulated. VEGF increase vascular permeability and allow extravasation of plasma proteins thus forming a primitive scaffold for endothelial cell migration. VEGF protein in Kyrgyz show significant (p<0.05) upregulation at HAD3 supports and validate gene data. Angiopoietins, proteins that play important role in vascular development and angiogenesis that is important in hypoxic stress [34]. In our present study ANGPT4 is upregulated at day3 time point and thereafter it downregulated. ANGPT's mechanism by which they promote angiogenesis is involved in regulation of endothelial cell interaction with other vascular cells. FGFR2, NF1 and RasGef4, are the candidate genes which function in the Ras/ERK signalling pathway and which commonly promotes angiogenesis with the HIF pathway under hypoxia [35]. In our study gene transcripts like RAPGEF5, RAP1GAP played an important role in activating the RAP1 signalling at initial time point of altitude.
Arterial venous thrombosis is observed under some cases of acute hypobaric hypoxia wherein activation of platelets has been observed during hypoxia exposure in human and animal studies [36][37][38]. Identification of membrane glycoproteins like GP9, GP6, GP4 and integrin subunits like ITGA2B, ITGB3 has been reported in study in which volunteers exposed to continuous hypobaric hypoxia. Activation of membrane glycoproteins and integrin involvement has important role in platelet signalling and thrombosis [39,40]. In our study, activation was reflected during day7 of temporal analysis. Activation of membrane glycoproteins gene transcript GP1BB log2fold change = 1.78 is observed on Day7 of high altitude. GP9 log 2 fold change = 0.958 gene transcript is observed to be slightly upregulated during Day3 but major upregulation is observed at day7 time point. Integrin activation ITGA2B log2 fold change = 0.69 and 0.81 was observed at time point day3 and day7 respectively. This might indicate platelet activation not immediately after exposure to altitude rather at later time point. TXA2 is responsible for multiple biological processes through the cell surface TXA2 receptor, or Tprostanoid (TP) receptor [41,42]. Activation of gene transcript TBXA2R log2 fold change = 1.07 is also observed at HAD7 in Kyrgyz. This gene transcript is involved in the platelet aggregation. During hypoxic microenvironment TBXA2R act as paracrine mediator due to its abundant expression in platelets surface [42]. TXA2 is responsible for multiple biological processes through its cell surface receptor TP. High altitude has been associated with hyper coagulation state due to various environmental stresses and also RAP1 activation, as discussed in last section, in the Kyrgyz might be associated with platelet activation. This is confirmed by the studies which indicate that about 8% of the known proteins expressed in platelets are small GTPase [43,44]. RAP is the most abundant of protein in platelets [45]. Rap1 activation is evident in our transcriptome data regulates multiple responses in platelets, most notably integrin and glycoprotein activation on high altitude day7 in Kyrgyz. This is also supported by the other studies that rap1 activation leads to platelet activation [46,47]. Also, activation of platelet proteome/ reactivity correlates with a prothrombotic phenotype [38]. We measured coagulation factor 8 (F8) in serum and found to be upregulated during HAD3 which supports activated platelet gene transcripts. Platelet activation also induces Rap1 activation. It is a kind of cycle in which both promote each other. Platelet stimulation involves the interaction of ADP with platelet receptor P2Y12 leading to inhibition of RASA3 which in turn prevent the conversion of RAP1-GDP. Thus RAP1 remain in activated form and result in fast and sustained integrin αIIβ3 activation leading to coagulation effect [45].
Angiotensin converting enzyme is a protease which is capable of cleaving two amino acids from angiotensin 1 and form Angiotensin II, which is a powerful vasoconstrictor. ACE is found highest in pulmonary circulation and form angiotensin there causing hypoxic pulmonary vasoconstriction in humans [48,49]. This alteration in systemic blood pressure is due to vasoconstriction. ACE log2 fold change = 0.6 gene transcript in our study is upregulated in the Kyrgyz on day7 of high altitude. Previous studies on Kyrgyz individuals revealed the polymorphism associated with ACE gene transcript. It was studied that I/I genotype was more strongly related with pulmonary hypertension in Kyrgyz than in comparison to other phenotypes like I/D, D/D [12]. It has been studied that during initial 2-3 days results in very high level of renin due to strong sympathetic stimulation of kidney as per effects of hypoxia [50]. These very high renin level increases the angiotensin I levels and further angiotensin II. ACE levels in our study show significant (p<0.001) increase in Kyrgyz at HAD3 as compared to basal time point. This is suggestive of initial vasoconstriction and increased blood pressure and mPAP as evident from our physiology analysis. In present study physiological data mPAP and BP remained elevated even at later time points of altitude exposure.
In present study, it is noticeable that immune system was highly regulated during temporal analysis in Kyrgyz individuals. During HAD3, which is the initial time point for estimating the gene level changes in this group, it is evident that antigen processing and presentation, B cell receptor signaling, Natural Killer cell mediated cytotoxicity, Toll like receptor signaling pathways all are up regulated. The gene transcripts involved were different Major histocompatibility complexes, HLA-A, HLA-B, HLA-C, HLA-G log2 fold change = 0.65, 0.82, 0.91, 0.64 respectively are found on the surface of antigen presenting cells (called as APCs) for presenting peptides/antigens at cell surface; gene transcripts like CD22 CD79A, CD81 log2 fold change = 0.80, 0.92, 0.81 respectively which mediate B-cell B-cell interaction; gene transcript like LAT, ARAF, PRF1 log 2 fold change = 0.62, 0.65, 0.81 respectively all induce T cell activation. MHC activation during initial time point is a protective mechanism of Kyrgyz at high altitude for presenting antigens and reflects boosted adaptive immunity. Similarly, upstream regulated pathway is Complement and coagulation cascade. Activation of gene transcripts like C2, C3, C4BPA, CFB, C1QB during day7 also supports the activation of immune system for fighting against antigens. Along with adaptive immunity, innate immune system is also getting upregulated during initial temporal analysis in Kyrgyz individuals. Further, during Day7 of time point analysis cytokine-cytokine receptor interaction is getting down regulated. Cytokines, chemokines and their receptors like CCR9 log 2 fold change = -1.691, TNFSF10 log2 fold change = -0.583, CCL4L1 log2 fold change = -0.99 were down regulated which is indicative of reduced migration, activation of immune cells and future pathogenesis of human disease as the individuals getting acclimatized. Chemokines are essential players in immune and inflammatory reactions as well as infections [51,52] and this reveals a possible protective mechanism against inflammation or infection. Investigation has been done on probable association of inflammation in the pathogenesis of acute mountain sickness [53]. It has also been reported that epithelial gene transcript perturbation during hypoxia causes inflammation [54]. As, epithelial cells provide mucosal barrier thus cellular dysfunction cause less defense of mucosal barrier function. It is evident from our gene data that the epithelial function is disturbed during hypoxia exposure and this might also be a cause of strong inflammatory upregulation during initial time point. Endothelial activation and inflammation has strong correlation [55].
It is evident from the clustering of gene ontologies analysis that initial time point HAD3 is very important in terms of dynamic changes occurring in Kyrgyz. Thus, we further analysed this time point to find out the involvement of important genes and their associated functions. Genes obtained from the degree; betweenness centrality and closeness centrality were checked for their specific roles. It was found that among the genes 50% (i.e. 5 genes out of 10 in each in each category is involved in cellular level changes and cell to cell interaction. VIM, CORO1A, CD37, STMN1, RHOC, PDE7B, NELL1, NRP1 and TAGLN. Involvement of all these genes during hypoxia at cellular level suggests hypoxia affect structure and function of endothelial cells. Among it, VIM is lying in all 3 analysis and has the highest degree and could be pointed as a hub gene and rest all genes lie in at least 2 of the category. Hypoxia exerts effect on Vimentin, an important component of endothelial intermediate filament network, which help maintain structure and function of endothelial cells. As we have observed from the KEGG pathway analysis that activation of signalling pathway like MAP Kinase, cell adhesion molecules, focal adhesion and extracellular matrix receptor interaction pathways can alter the actin cytoskeleton and alter the endothelial cell migration, and barrier permeability during hypoxia. Thus, in the KEGG pathway analysis, the top pathways were associated with the cell-cell interaction which is in consonance with the hub gene functions. It has been shown that hypoxia causes redistribution of Vimentin to a more soluble and extensive filamentous network that could play a role in endothelial barrier stabilization. Results of our study support other studies which suggest up regulation of Vimentin in hypoxic cells [56,57]. It has also been observed that in endothelial cell surface Vimentin binding peptide induces angiogenesis under hypoxic conditions [58]. In our study VIM, on protein level, is significantly up regulated with p�0.001 at HAD3. In brain capillary endothelial cells upregulation of Vimentin has been reported [59]. In bone marrow endothelial cell line, Vimentin has been shown to regulate focal adhesion [60]. Other critical genes have been reported to be involved in cellular dynamics during hypoxic acclimatization. Being multifactorial in nature, high altitude is characterized with changes in physiology, vascular tone, inflammatory milieu, thrombotic factors and cellular dynamics. At molecular level high altitude is associated with an extensive alteration of systemic gene expression. Such extensive changes at high altitude induce pathophysiology which is not caused due to perturbation of single pathway but rather a cumulative alteration of many signalling modules. In summary, by using a series of analysis, genomics and bioinformatics we illustrated the decisive time frame i.e. high altitude day 3 for maximum perturbations in genome of Kyrgyz individuals cross validated with physiological and biochemical markers. Temporal analysis suggested cell-cell interaction and immune responsive genes to be majorly involved in the early phase of acclimatization response. Endothelial activation (cell-cell interaction) and immune response generation/ inflammatory response are both interrelated. It is evident from the clustering of gene ontologies analysis that initial time point HAD3 is very important in terms of dynamic changes occurring in Kyrgyz. Moreover, in this study we also tried to find the nodal point of the gene interactive network and candidate gene controlling many cellular interactive pathways VIM, CORO1A, CD37, STMN1, RHOC, PDE7B, NELL1, NRP1 and TAGLN and the most significant among them i.e. VIM gene was identified. These hub genes may be used in the future as a biomarker and therapeutic target for accurate diagnosis and treatment of high altitude induced hypoxia. This study will contribute to the knowledge of the molecular mechanisms underlying the acclimatization of Kyrgyz towards high-altitude environment.

Limitations and future prospect
We acknowledge that study was limited to 21 days at high altitude and was not investigated after descent from high altitude to baseline. Further similar studies can be conducted at similar altitude on larger number of volunteers for assessing the effect after 21 days and after descent to baseline.