Genomic Scan Reveals Loci under Altitude Adaptation in Tibetan and Dahe Pigs

High altitude environments are of particular interest in the studies of local adaptation as well as their implications in physiology and clinical medicine in human. Some Chinese pig breeds, such as Tibetan pig (TBP) that is well adapted to the high altitude and Dahe pig (DHP) that dwells at the moderate altitude, provide ideal materials to study local adaptation to altitudes. Yet, it is still short of in-depth analysis and understanding of the genetic adaptation to high altitude in the two pig populations. In this study we conducted a genomic scan for selective sweeps using FST to identify genes showing evidence of local adaptations in TBP and DHP, with Wuzhishan pig (WZSP) as the low-altitude reference. Totally, we identified 12 specific selective genes (CCBE1, F2RL1, AGGF1, ZFPM2, IL2, FGF5, PLA2G4A, ADAMTS9, NRBF2, JMJD1C, VEGFC and ADAM19) for TBP and six (OGG1, FOXM, FLT3, RTEL1, CRELD1 and RHOG) for DHP. In addition, six selective genes (VPS13A, GNA14, GDAP1, PARP8, FGF10 and ADAMTS16) were shared by the two pig breeds. Among these selective genes, three (VEGFC, FGF10 and ADAMTS9) were previously reported to be linked to the local adaptation to high altitudes in pigs, while many others were newly identified by this study. Further bioinformatics analysis demonstrated that majority of these selective signatures have some biological functions relevant to the altitude adaptation, for examples, response to hypoxia, development of blood vessels, DNA repair and several hematological involvements. These results suggest that the local adaptation to high altitude environments is sophisticated, involving numerous genes and multiple biological processes, and the shared selective signatures by the two pig breeds may provide an effective avenue to identify the common adaptive mechanisms to different altitudes.


Introduction
Qinghai-Tibetan plateau, the highest region on the world, offers an ideal nature laboratory for studying adaptation to extreme environments. It is well known that many mammalian species living on this plateau have acquired various anatomical and physiological traits that contribute to their ability to survive in high mountains or plateau, including loss of hypoxia pulmonary vasoconstriction, right ventricle hypertrophy, thinner-walled pulmonary arterioles, high blood flow, large lungs and hearts, and high energy metabolism [1]. All the above adaptive features are considered having a genetic basis. In the past few years, increasing attentions have been devoted to identify the underlying genetic factors for Tibetan people [2,3], yak [4], Tibetan chicken [5], Tibetan antelope [6], pika [7] as well as Tibetan pigs [8,9]. A list of selective genes possibly responsible for high-altitude adaptation was identified for each species. However, even for the most studied Tibetan people, we lack adequate evidence from independent studies demonstrating their truly roles of these genes in the local adaptation to high altitudes.
Pig has many similarities to human at the levels of anatomy, physiology and genetics. Therefore, understanding the genetic basis of pigs for adaptation to the high altitude environments would provide vital information for human traits. However, compared to Tibetan people, the pig residents on this plateau were less studied. Two recent population genetics studies [8,9] have proposed that a set of genes are responsible for adaptation to high altitudes in TBP. In addition, an previous experimental study revealed that several physiological measures, including pulmonary artery pressure, pulmonary artery wedge pressure, CO cardiac output and pulmonary vascular resistance in pigs, increased progressively with the simulated conditions corresponding to altitudes of 0 (low-altitude), 2300 (moderate-altitude) and 4500 meters (high-altitude) [10]. Yet, our understanding of the genetic basis underlying the adaptation process to hypoxia is still very limited, and several fundamental questions remain unresolved. For instances, it is still not very clear if moderate altitude could trigger the genetic adaptation in pigs, and whether pigs living in different level of altitude have the same or similar genetic adaptive mechanism(s) to the unfavorable harsh environment. Two local pig breeds living in South-Western China, Tibetan pig (TBP) and Dahe pig (DHP), provide golden opportunities to obtain the preliminary results pertaining to these issues. TBP, surviving in a wild state of a national natural reservation zone at the altitude of above 2500 meters, is deemed to have a local origin on Qinghai-Tibetan Plateau and has evolved exceptional mechanisms to favor their survival under high-altitude environment [11]. DHP is a native pig breed originating from a limited region in Yunnan province. Its habitat is at an altitude of about 2000 meters with oxygen pressure decreasing to ,70% of that at sea level [12].
To carefully examine the genetic adaptation to different levels of altitude in TBP and DHP, we conducted two genomic scans for selective sweeps using F ST based on genetic differentiation, compared with WZSP, a low-land breed living in the low altitude environment in Hainan province. Similar to TBP, WZSP is raised in a flexible diet system and adapts well to grazing conditions of the forests and pastures [13]. Hence, WZSP is expected to have less artificial selection. This feature may help us to easily identify the loci shaped by natural selection, in particular, the selection signatures due to various environmental factors at high altitudes.

Ethics statement
This study was approved by the ethics committees of all the participating institutes (Institute of Animal Science, Chinese Academy of Agricultural Sciences, and Guangdong Medical College). All the animal experimental procedures were performed according to the guidelines for the care and use of experimental animals established by the Ministry of Agriculture of People's Republic of China.

Samples
A total of 96 pigs were randomly sampled from three geographic regions. TBP sample (n = 35) came from Nyingchi Prefecture in the Tibetan Autonomous Region of China, with an average altitude of 3000 meters. DHP sample (n = 27) was collected from Dahe Town in Fuyuan county of Yunnan province, which is situated at a moderate altitude (from 1700 to 2341 meters above the sea level). WZSP sample (n = 34) was collected from Hainan province with an average altitude of 200 meters. In an effort to avoid relatedness between the collected pigs, we consulted with the local technicians or farmers regarding their distribution and breeding history. Genomic DNA of all samples was extracted from blood according to the standard protocols provided by the manufactures.
Genotyping and data quality control DNA samples were genotyped using the PorcineSNP60 chip array, which included 65,163 single nucleotide polymorphisms (SNPs), with an average inter-marker distance of 40 kb. Pig individuals with an average call rate below 90% were removed from this study. SNPs were removed if any of the following conditions was met: (1) with call rate ,90%; (2) with minor allele frequency (MAF) #0.05; (3) did not conform to Hardy-Weinberg equilibrium (HWE) (i.e., multiple tests adjusted p,10 25 ) in any breed; and (4) not included in the latest reference assembly of the porcine genome Sscrofa 10.2. After filtering, 44,433 autosomal SNPs were remained for further analysis.

Relationship test
To ensure independence among the collected pigs, cryptic relatedness among individuals within each breed were identified using pair wise Identity-By-Descent (IBD) metric (referred to as PI-HAT in PLINK [14]). One individual from a pair of pigs were

Principal component analysis
To avoid influence of linkage disequilibrium on principal component analysis, only 21,188 SNPs with pair-wise r 2 ,0.2 were actually used in this analysis. The principal component analysis was performed using R package SNPRelate [15] to test whether population stratification was present within the sampled animals, and the individuals outside of their expected breed clusters were excluded from further analysis.

Detection of genomic regions under selection
To identify the genomic selective signatures related to altitude adaptation for TBP and DHP, we performed two separated analyses for the two breeds, both compared with the low-land breed, WZSP. The unbiased estimate of F ST as described by Weir and Cockerham [16] was calculated using Genepop 4.2 software [17] to identify genome wide spots or patterns for positive selection. To assess statistical significance, we first carried out permutation test (100 permutations). For each replicate the individuals were randomly assigned to one of two breed groups. The null hypothesis is that all rearrangements of the alleles among the two populations are equally probable. The permutation pvalue (P P ), as computed previously [18], is the probability that the value of F ST in the null-distribution exceeds the observed value of F ST . And then, we obtained the empirical p-value (P E ) based on the distribution of nongenic SNPs on which selection is deemed to be weak, since selection preferentially targets gene regions. The empirical p-value for F ST was calculated as follows: number of nongenic lociwx)=(total number of nongenic loci): Here, the nongenic SNPs were defined as those located in genomic regions that are at least 50 kb away from the start or stop position of all known genes in pig genome. The 50 kb buffer was used to avoid the linkage with some genes or promoter regions. To further control false positive due to multiple tests for massive genomic SNPs, a Bonferroni corrected significance level (roughly equal to 0.01/number of SNPs analyzed) was used to assess statistical significance. SNPs with P P ,Bonferroni corrected significance level, and also P E ,0.01 were considered as the promising selection signatures. Furthermore, to account for stochasticity in single SNPs, we then clustered all significant SNPs within 500 kb of each other into single highly differentiated regions.

Bioinformatics analysis
Genes located in the genomic regions significantly differentiated between pig breeds were acquired by the use of the data mining tool Biomart, with the latest reference assembly of the porcine genome Sscrofa10.2 as the reference. Then, human homologous genes were retrieved by using the 'Homolog filters' option. Genes without functional annotations or human orthologs were removed. The final homolog list was analyzed by using the online DAVID database tool. In this functional analysis, statistical significance was assessed by using P value of a modified Fisher's exact test and Benjamini correction for multiple tests. Furthermore, a comprehensive literature review was conducted to verify whether these genes have some relevance with adaptive phenotypes in pigs or other mammals.

Data preprocessing
A total of 61,565 SNPs were genotyped by using Porcine 60 K Beadchip, of which 9,093 markers were not annotated to genomic porcine genome Sscrofa10.2. For this analysis, data for chromosome X were not included. A total of 44,433 autosomal markers passed the quality control for SNP call rate, consistency with HWE and MAF. The average physical distance between two adjacent SNPs were 54.50 kb, ranging from 46.12 kb on chromosome 12 to 64.59 kb on chromosome 8 (Table S1). One DHP individual with .10% missing data was excluded from further analysis. Two pairs of individuals in both TBP and WZSP were found related (with a PI-HAT value of over 0.5), and therefore only one individual from these pairs was randomly selected, yielding the final sample consisting of 91 animals.

Principle component analysis
The principle component analysis using a subset of 21,188 SNPs with pair-wise r 2 ,0.2 showed that the first two principal components (PC1 and PC2), which explained 10.4% and 7.9% of the variance, respectively, clearly separated the three pig breeds according to their geographic origins. As shown in Figure 1, PC1 provided a clear-cut division between two plateau pigs (TBP and DHP) and the low-land pigs (WZSP), while PC2 provided a good separation between TBP and DHP. As none was located outside of its expected breed clusters, all the 91 individuals were kept for further analyses.

Genome-wide detection of significant SNPs
We calculated F ST for 37,893 and 32,729 SNPs in TBP vs WZSP and DHP vs WZSP breed comparisons, respectively. The genome-wide distribution of F ST values for the two analyses was shown in Figure 2. In general, higher proportion of large F ST values (F ST bins$0.2) was observed in the breed pair DHP vs WZSP compared to TBP vs WZSP, reflecting that DHP is relatively more diverged from WZSP. The permutation test showed that the observed F ST distributions for both breed comparisons markedly deviated the permutated ones. The maximum F ST produced by the permutation was 0.362 for TBP vs WZSP and 0.450 for DHP vs WZSP, respectively, corresponding to the cutoffs for Bonferroni corrected significant levels (P = 2.64E-07 and 3.06E-07, respectively) at a = 0.01. Jointly considering the empirical p-value (P E #0.01), a total of 395 (observed F ST $0.557) and 365 (observed F ST $0.716) significant SNPs were obtained in the two breeds comparisons (see Table S2 and Table S3 for full list of these SNPs). And then, all significant SNPs within 500 kb of each other were merged into single highly differentiated regions for detection of selective sweeps or genomic regions containing biologically interesting genes, as described below.

Adaptive signals of selective sweeps for TBP
For TBP, a total of 53 highly differentiated regions (selective sweeps) containing two or more SNPs were derived ( Table 1). Totally, these regions contain 100 genes, of which 18 genes were found having some known biological functions associated with  Hereafter, we focus on reviewing the 18 promising genes, of which three genes, CCBE1, VPS13A and GNA14, are situated on chromosome 1, and all have potential relevance to the physiological functions for altitude adaptation. CCBE1 contributes to the lymphatic vascular development [19], while VPS13A was known to be associated with cell membrane deformation of circulating erythrocytes [20]. Another gene, GNA14, was previously shown to be induced by hypoxia and plays an important role in placental and fetal vascular endothelial functions under chronic hypoxia [21] (http://www.erp.wisc.edu/symposium/2012_abstracts.pdf), although the precise mechanisms of its response to hypoxia was not completely clear. The selective sweep on chromosome 2 (genomic location: 87.34-87.73 Mb) harbors two interesting functional genes, F2RL1 and AGGF1. Genetic variants in F2RL1 was found to be associated with blood pressure [22], and AGGF1 is an angiogenic factor and essential for embryonic and pathological angiogenesis [23]. Two promising selection signatures on chromosome 4 are ZFPM2, which plays a critical role in heart development and coronary vascular development [24], and GDAP1, which is associated with mitochondrial fission [25]. Two selective sweeps on chromosome 8 harbor two interesting genes, of which IL2 (genomic location: 108.70-109.10 Mb) is an important molecule for maintaining natural immunologic self-tolerance [26], and FGF5 (genomic location: 146.65-146.73 Mb) plays a role in increasing blood flow and contractile function in an ischemic region of the heart [27]. An interesting selective gene was also found on chromosomes 9 and 13. PLA2G4A, located on chromosome 9 (genomic location: 140.59-140.74 Mb), contributes to the pulmonary vasoconstrictor response to hypoxia [28], and ADAMTS9 on chromosome 13 (genomic location: 50.67-50.69 Mb), an essential element for normal cardiovascular development and adult homeostasis [29], was also identified to be highly selective in a previous study of TBP [9]. Four selective regions on chromosome 14 harbor two interesting genes, NRBF2 and JMJD1C. NRBF2 was deemed to be an important molecule for exercise-mediated mitochondrial expansion and altered fuel selection, which enhances aerobic ATP-generating capacity in skeletal muscle [30]. JMJD1C, encoding a probable histone demethylase, was found to be involved in hematopoiesis in humans and mouse [31]. Interestingly, VEGFC on chromosome 15 and FGF10 on chromosome 16 were also identified in a previous study of TBP [9]. VEGFC, a vascular endothelial growth factor [32], is particular interesting as it is involved in hypoxia-inducible factor (HIF) pathway that is initiated in response to hypoxic environmental conditions by regulating oxygen homeostasis in humans and other mammals [33]. And FGF10 plays a crucial role in lung branching morphogenesis during early embryonic lung development. Finally, three additional interesting genes were found on chromosome 16, of which PARP8 is related to DNA repair [34], ADAM19 plays essential roles in cardiovascular morphogenesis and heart development [35,36] and ADAMTS16 is associated with blood pressure [37].
To further elucidate the functional involvements of the 100 selective genes, we performed a GO-based enrichment analysis using the DAVID database tools. A total of 12 functional categories achieved the nominal significance at a = 0.05 (as shown in Table 2), based Fisher's exact test. However, they did not reach the highly conservative Benjamini multiple-tests corrected level. These categories were mainly involved in cardiovascular system, including ''blood vessel morphogenesis'' (P = 0.014, GO: 0048514), ''blood vessel development'' (P = 0.023, GO: 0001568), ''vasculature development'' (P = 0.025, GO: 0001944) and ''angiogenesis'' (P = 0.028, GO: 0001525), reflecting the adaptive response of developed blood vessels to increase the efficiency of oxygen utilization in TBP. In addition, categories involved in ''positive regulation of response to external stimulus'' (P = 0.033, GO: 0032103) and ''regulation of response to external stimulus'' (P = 0.033, GO: 0032101) were significantly overrepresented, which may reflect the adaptive response of TBP to the external stresses of low oxygen concentration and strong ultraviolet radiation at the high-altitude plateau. Significant GO terms associated with breeding traits such as ''gamete generation'' (P = 0.008, GO: 0007276) may reflect the effort of TBP to maintain an adequate fitness in the harsh high altitude environment by maximize its reproduction success.

Adaptive signals of selective sweeps for DHP
To determine whether the moderate altitude environments could also trigger the genetic adaptation to mild hypoxia, we further analyzed the genomic Fst profile of selection for DHP, a breed living at an altitude of about 2000 meters, compared with the low-land breed WZSP. The 365 significant SNPs mapped onto 42 genomic regions defined as the selective sweeps ( Table 3). These regions encompass 137 genes, of which 12 genes were found having known biological functions related to altitude adaptation and were illustrated in Figure 4. Interestingly, six genes were overlapped with the list of genes identified in TBP as mentioned above, including VPS13A, GNA14, GDAP1, PARP8, FGF10 and ADAMTS16.
Similar to the above analysis for TBP, further GO-based analysis of these highly divergent genes did not reveal any functional categories reaching the Beniamini corrected level for statistical significance. However, 14 GO categories related to nucleus and chromosome organization were of nominally significance with a modified Fisher's p-value,0.05 (Table 4). These functional terms were known to be associated with DNA repair and genome maintenance [45], and these findings may reflect the functional changes of DHP in order to adapt well to the harsh environments such as low ambient oxygen, increased risk of oxidative stress, and strong ultraviolet radiation.
The selective sweeps and genes shared by TBP and DHP As mentioned above, six genes (VPS13A, GNA14, GDAP1, PARP8, FGF10 and ADAMTS16) from completely or partly overlapped genomic regions for selective sweeps were shared by TBP and DHP ( Table 5). The first overlapped region is located on chromosome 1, covering a 0.07 Mb segment (from 257.07 to 257.14 Mb). There were four significant SNP signals in TBP, and two in DHP, of which two SNP loci (INRA0006383 and H3GA0003828) were duplicated in the two breeds. Based on the biological functions of VPS13A and GNA14 located in this genomic region, both breeds might follow a similar route to adapt under chronic hypoxia. The second overlapped region is situated on chromosome 4, and 0.15 Mb long (from 67.05 to 67.20 Mb). One identical locus (ALGA0025367) was found in both breeds, although DHP had larger Fst values. This region accommodates GDAP1 (location: 66.95-67.20), a gene encoding gangliosideinduced differentiation-associated protein 1. All the remaining three shared sweep regions are situated on chromosome 16. For the sweep ranging from 30.26 to 30.52 Mb, two of three selective SNP signals (DRGA0016021 and DRGA0016027) were repeatly identified in the two breeds. As the contained gene FGF10 reveals, this shared region may implicate of a similar adaptive physiological and development changes in the lungs of the two pig breeds. Only a single selective SNP (ALGA0090039) was identified in the two breeds on the fourth sweep and its associated gene was PARP8, which is related to DNA repair. However, for the sweep region ranging from 82.68 to 82.84 Mb, two different SNP signals (M1GA0021378 and ALGA0092291) were observed in the two breeds. This region may be associated with pigs' adaptive blood pressure changes to high altitudes, inferred from the gene (ADAMTS16) located in this region. Overall, these shared sweep regions provide an avenue for us to further unveil the common molecular mechanisms for the local adaptation to high altitude environments.

Discussion
In the present study, we identified genomic selective sweeps in TBP and DHP using a dense genome wide panel of SNPs, with a major focus on the detection of signals relevant for altitude phenotypes. Generally, an F ST above 0.25 indicates salient differentiation between two populations [46]. However, the F ST values in the significant SNP lists for both breed comparisons all were above 0.5, indicating very high divergence between these pig breeds. Similar degree of divergence were also observed between Northern and Southern Chinese indigenous pig breeds in a recent report by Yang et al. [47]. In our own view, these results are not Notes: P P , permutation p-value, and for comparison of DHP and WZSP, the Bonferroni corrected significant level (at a = 0.01) for P P = 0.01/32, 729 (# of SNPs analyzed) = 3.06E-07; P E , empirical p-value; Adaptive genes that have plausible biological functions contributing local adaptation are in bold. doi:10.1371/journal.pone.0110520.t003 very surprising, as many Chinese native pig breeds have lived in highly isolated niches for long time. Nevertheless, we could not deny other evolutionary forces than the natural selection and genetic drift occurred in these ecological niches. Consequently, differences in the adaptive traits associated with these selective sweeps may also be due to differences in production, reproduction, and survivability which may have been emphasized during breed formation and/or selected for over time.
Another notable finding is that DHP appears to be more diverged from the low-land breed WZSP than TBP, based on their corresponding F ST distributions ( Figure 2) and the longer list of significant SNPs for DHP. This pattern is not what we expected since TBP is deemed to suffer from higher level of environmental pressure. This unexpected result might be due to several reasons. First, DHP is distributed in a very restricted region in Yunnan province and its population size have decreased dramatically in history, which may generate a strong bottle-neck effect [48]. Currently, only a limited number of DHP are raised in several conservation farms. The smaller effective population size of DHP, compared with that of TBP, may lead to more genetic divergence. Second, some artificial breeding practice occurred in these farms for conserving this breed in recent years may also accelerate its genetic divergence from the low-land breed WZSP.
Despite TBP has evolved excellent adaptive capabilities to the high-altitude Qinghai-Tibetan plateau, its genetic basis remains not very clear. Although two recent studies have proposed a set of candidate genes, there is no consensus regarding their importance for the local adaptation. Ai et al. [8] have highlighted three candidate genes (ADAMTS12, SIM1 and NOS1) that are likely important for adaptation to high altitude, which, unfortunately,  were not detected in both our study and another previous study [9]. The sampling variations and the different methods to capture the genomic traces may be able to explain these incongruities. Nevertheless, three genes, VEGFC, FGF10 and ADAMTS9, previously identified by Li et al [9], were replicated by the present study, all having sounding biological functions. Furthermore, VEGFC was also included in a set of positively-selected genes for andeans [49,50], suggesting that this gene is a key element for multiple species.
DHP dwells at moderate altitudes of approximately 2000 m in Yunnan province. At this altitude, the standard barometric pressure reduces nearly 20% when compared to the sea level [12]. Previous studies have proved that residence at moderate altitude may also trigger some physiological changes due to the selective pressure of hypoxia. For example, birth weight declined significantly with the elevation of altitude from 1500 m to 2000 m [51], indicating that moderate-altitude has profound effects on fetal growth. Previous studies [52,53] also provided evidence of genetic adaptation to moderate-altitude in humans. However, to our knowledge, there is no published literature related to genetic adaptation to moderate-altitude in pigs. In this study, we identified 12 candidate genes (VPS13A, GNA14, GDAP1, FOXM1, RTEL1, RHOG, FLT3, OGG1, CRELD1, FGF10, PAPR8 and ADAMTS16), which may be responsible for the local adaptation in DHP. Of these genes, six (FOXM1, RTEL1, RHOG, FLT3, OGG1 and CRELD1) were unique to DHP and all have some relevant functions to the local altitude adaptation. Among them, five genes are related to DNA repair for oxidative damage or regulation of ROS generation (i.e., FOXM1, RTEL1, FLT3, OGG1 and RHOG). It is known that both the oxidative stress [54][55][56] and ROS [57] would increase at altitude due to hypoxia, cold, or ultra violet light, which may lead to DNA strand breaks and tissue injury [57]. Therefore, these unique genes may have shaped some specific molecular adaptation mechanisms of DHP to the local environment of moderate altitude.
Finally, to test whether pigs dwelling different levels of altitudes share some common adaptive mechanisms, we further compared genome-wide selection profiles for TBP, living at high altitude, and DHP, living at moderate altitude. Although a previous study of humans living in high-and moderate-altitude successfully identified EGLN1 as a common genetic factor for different levels of altitudes [52], no similar study of pigs is reported so far. Our genome-wide comparison of selective profiles for TBP and DHP yielded six common genes (VPS13A, GNA14, GDAP1, PARP8, FGF10 and ADAMTS16). Further literature reviews suggest that TBP and DHP may have some similar adaptive changes in physiology and development, such as developmental changes in lung and vascular endothelial cells etc. However, to further unveil these common molecular mechanisms requires careful designed experiments.

Supporting Information
Table S1 Distribution of SNPs in the autosomal chromosomes. (DOCX)