Genetic Structure and Demographic History Should Inform Conservation: Chinese Cobras Currently Treated as Homogenous Show Population Divergence

An understanding of population structure and genetic diversity is crucial for wildlife conservation and for determining the integrity of wildlife populations. The vulnerable Chinese cobra (Naja atra) has a distribution from the mouth of the Yangtze River down to northern Vietnam and Laos, within which several large mountain ranges and water bodies may influence population structure. We combined 12 microsatellite loci and 1117 bp of the mitochondrial cytochrome b gene to explore genetic structure and demographic history in this species, using 269 individuals from various localities in Mainland China and Vietnam. High levels of genetic variation were identified for both mtDNA and microsatellites. mtDNA data revealed two main (Vietnam + southern China + southwestern China; eastern + southeastern China) and one minor (comprising only two individuals from the westernmost site) clades. Microsatellite data divided the eastern + southeastern China clade further into two genetic clusters, which include individuals from the eastern and southeastern regions, respectively. The Luoxiao and Nanling Mountains may be important barriers affecting the diversification of lineages. In the haplotype network of cytchrome b, many haplotypes were represented within a “star” cluster and this and other tests suggest recent expansion. However, microsatellite analyses did not yield strong evidence for a recent bottleneck for any population or genetic cluster. The three main clusters identified here should be considered as independent management units for conservation purposes. The release of Chinese cobras into the wild should cease unless their origin can be determined, and this will avoid problems arising from unnatural homogenization.


Introduction
The elapid genus Naja comprises 28 species of non-spitting and spitting cobras and was recently split into four subgenera. Six African and Arabian non-spitting cobras inhabiting savanna and open habitats are included in the subgenus Uraeus, four African non-spitting cobras using forest habitats in the subgenus Boulengerina, seven African spitting cobras in the subgenus Afronaja, and 11 Asian cobras in the subgenus Naja [1]. Spitting adaptations appear to have evolved three times within the genus Naja [2]. All 11 Asian species except N. naja and N. oxiana have some degree of adaptation to spitting [1,3]. Phylogenetic analysis of 1333 bp of mitochondrial DNA (606 for ND4 and 727 for cytochrome b) supports monophyly of the Asian cobras [2]. This result is consistent with that based on morphological data [4], but invalidates the hypothesis that spitting and non-spitting cobras in Asia are the outcome of two separate ancestral stocks migrating from Africa [5,6].
There are two Naja cobras in China, the Chinese cobra (N. atra) and the monocled cobra (N. kaouthia). The Chinese cobra is found from the mouth of the Yangtze River south to northern Vietnam and Laos [7]. The cobra is of economic importance and has consequently been overexploited. It is listed as a highly vulnerable species according to the China Red Data Book of Endangered Animals [8] and is facing a realistic threat of local extinction in provinces such as Guangdong and Hainan [9]. Poachers are facing increasingly severe punishment, and great achievements have been made in artificial culture [10]. Unfortunately, however, wide populations of N. atra have received little conservation attention. Until now, all local populations were treated as homogenous, so that if populations were under severe threat of extinction in a part of the cobra's range, this would not necessarily ring alarm bells, as there are individuals left in other places or in farms. Within the cobra's range large mountain ranges and water bodies such as the Wuyi Mountains, Luoxiao Mountains and Nanling Mountains, Taiwan Strait and Qiongzhou Strait can be found (Fig. 1). However, it remains unknown whether these geographic barriers play a role in restricting migration and thus influencing population structure and the distribution of genetic diversity within this species.
Knowledge of population structure and genetic diversity is crucial for effective wildlife preservation because of its importance in determining the integrity and viability of wildlife populations. Moreover, these parameters can reveal more general processes acting on populations and species, such as how the population genetic structure of widespread species is formed and maintained and whether cryptic genetic groups exist within taxa [11]. The population structure and genetic diversity of a handful of snake species with large and continuous ranges has been determined for the five-paced pit-viper Deinagkistrodon acutus [12], the olive sea snake Aipysurus laevis [13], the eastern Massasauga rattlesnake Sistrurus c. catenatus [14], the short-tailed pit viper Gloydius brevicaudus [15], the Chilean long-tailed snake Philodryas chamissonis [16] and two hot-spring keel-back snakes, Thermophis baileyi and T. zhaoermii [17]. These studies generally show the existence of independent management units (MUs) for different lineages that should be taken into account for conservation plans.
Mitochondrial DNA can yield information about the historical processes behind intraspecific matrilineal relationships, and also allows insight into the importance of ongoing gene flow [18]. However, the maternally inherited haploid mitochondrial genome has a fourfold smaller effective population size than nuclear markers, which enhances the effects of genetic drift in subdivided populations and results in more rapid fixation or loss of alleles and stronger population subdivision at mitochondrial than nuclear loci [19]. While overall mutation rates tend to be higher for the mitochondrial genome [20], much of the mitochondrial genome is protein coding, potentially under selection [21] and may not always evolve sufficiently rapidly to infer levels of contemporary gene flow [22]. To reveal processes in contemporary populations influenced by both parental lineages, highly variable nuclear markers such as SNPs (single nucleotide polymorphisms) or microsatellites are required. Microsatellites are biparentally inherited and most loci appear to be selectively neutral and accumulate mutations rapidly [23].
Here, we used both mitochondrial and nuclear markers to investigate genetic processes in Chinese cobras from mainland China and Vietnam. This is the first attempt to describe genetic structure and phylogeography in this vulnerable species covering almost its entire range in mainland China and Vietnam, and is especially important because current conservation programs that release Chinese cobras reared in captivity and confiscated officially into the wild treat the population as homogenous. Several important questions regarding the population genetic characteristics of this species were addressed in this study: (1) Has genetic diversity been retained or lost following the recent and rapid decline of Chinese cobras across their range? (2) Does the cobra show genetic and population structuring across its range? (3) Are genes flowing between and within populations and are large mountain ranges and water bodies acting as geographic barriers? Answering these questions will allow us to assess the integrity, viability and population history of this taxon, thereby informing conservation efforts towards this vulnerable snake.

Genetic variation
Cytochrome b gene sequences (1117 bp) from 269 individuals yielded 39 distinct haplotypes (GenBank Accession Nos EF206656, JN160643-JN160680). Eighty-six variable nucleotide sites were found, comprising 77 transitions, six transversions and three transition/transversion. While haplotypes C1 and C15 occupied a wide geographic distribution, 33 of 39 haplotypes were restricted to one population ( Fig. 2). High haplotype diversity (0.863) and nucleotide diversity (0.827%) were detected for the whole population (Table 1). Individuals from ZW possessed the lowest haplotype diversity and nucleotide diversity, and individuals from M and QX the second lowest; individuals from GY, HN and VS had similar levels of mtDNA diversity (Table 1).
For microsatellites, 262 individuals were successfully genotyped. MICRO-CHECKER detected no evidence of null alleles or genotyping errors such as large allele dropout and stuttering. The mean number of alleles per locus (N A ) was 3. 33 Table 1 Table 1 for sample sites abbreviations. Sites where most individuals were assigned to the same cluster on the basis of microsatellite data have gradations of the same color (s). doi:10.1371/journal.pone.0036334.g002

Phylogenetics and phylogeography
The phylogenetic tree resulting from ML revealed three clades (Fig. 3). This finding was also supported by the median-joining network (Fig. 2). Clade A included haplotypes mainly from HN, VS and QX in the western region (in green in Fig. 2). Clade B included haplotypes mainly from M and GY in the southeastern region (in red in Fig. 2) and from the eastern region (in blue in Fig. 2). Clade C comprised only two individuals sampled from the westernmost site in the western region (Fig. 2). Clade C was divergent from the other two clades by at least 44 mutation steps (Fig. 2). Thirty-three haplotypes were unique to a single population.
Genetic distances analyses using mtDNA and microsatellite data did not produce similar results. The minimum and maximum genetic distances based on mtDNA data were from the comparisons of ZW-M (0.063) and of ZW-QX (13.836), respectively; whereas the minimum and maximum genetic distance based on microsatellite data were from the comparisons of M-GY (0.056) and of ZW-M (0.672), respectively ( Table 2). The genetic distance between the ZW and M populations was its maximum based on microsatellite data but its minimum based on mtDNA cytochrome b.
Cluster analyses with STRUCTURE based on microsatellite data identified two major clusters and a lower peak for K = 3, indicating probable substructure with three genetic groups (Fig. 4). Although the peak of DK for K = 2 was much higher than for K = 3, the average likelihood values for K = 3 from different runs were higher than for K = 2 (28901.5 and 29585.3, respectively). Therefore, we reanalyzed both clusters individually as suggested by the authors of STRUCTURE [24]. Reanalysis of cluster 1, mostly consisting of samples from M and GY, did not detect any subdivision. Repeated analysis with samples from the larger second cluster (n = 212, omitting snakes from M and GY) suggested the existence of two genetic subgroups (Fig. 4). Samples mainly from HN, VS and QX formed one cluster, while the other cluster consisted of samples from ZW. Each sampling site was represented in one of the three clusters with a high value of estimated membership coefficient (Table 3; Fig. 1). There was relatively few (3.8%) individuals having intermediate cluster membership (i.e. an estimated membership coefficient ,0.7 in all inferred clusters), but some populations have individuals with very different cluster membership. For example, two ''blue'' individuals were in the red ''M'' sample, which could reflect very recent migrants (Fig. 1, Fig. 4). However, the varying levels of ''blue'' ancestry in more distant ''green'' samples might be a somewhat artefactual result, based on ''blue'' having just a subset of the diversity present in the ''green'' cluster. Some individuals in ''green'' locations might be assigned higher ''blue'' ancestry if they randomly happen to have some of the same alleles that drifted to high frequency in ''blue''.

Demographic history
Although Fu's Fs test shows a non-significantly negative value, a significantly negative Tajima's D test value rejected the null hypothesis of neutral evolution and demographic equilibrium of the cytochrome b marker for the genetic cluster in the western region based on mtDNA data (Table 1). Moreover, many haplotypes were represented within a ''star'' cluster with C17 being the central haplotype in the haplotype network of cytchrome b (Fig. 2). This could indicate population expansion in the western region, which is supported by the mismatch distribution analysis and Rogers test [25] of sudden population expansion ( Table 1). The value of t for the western region was 2.36 (Table 1). Using the 95% confidence interval of 0.005-0.009 site 21 myr 21 , the

Genetic diversity
Our assessment of genetic variation based on mtDNA and microsatellites indicates high levels of genetic diversity in N. atra. For the whole population, mtDNA cytochrome haplotypic (h = 0.863) and nucleotide (p = 0.827%) diversities are higher than values reported for the Jamaican boa (Epicrates subflavus), a similarly vulnerable snake with fragmented distributions, using the same marker (h = 0.79 and p = 0.76%) [26]. In a study of A. laevis where values were not calculated for all populations sampled as a whole but for each population [13], haplotypic (h = 0.55-0.63) and nucleotide (p = 0.12-0.52%) diversities were lower for most populations of N. atra (Table 1). Nucleotide diversity detected in this study is lower than that (p = 1.41%) detected across 86 individual D. acutus sampled across their range in China using the entire mitochondrial ND2 gene [12].
High microsatellite diversity was also detected in N. atra, with a N A of 11.42 and H E of 0.768 (Table 1). At the species level, microsatellite diversity in N. atra is slightly higher than, or similar to, those reported for other snakes such as E. subflavus (N A = 7. 44 and H E = 0.64) [26], A. laevis (N A = 8.4 and H E = 0.263-0.881) [13] and S. c. catenatus (N A = 13.8 and H E = 0.49-0.77) [14]. These results suggest that despite a declining population, high genetic variation remains amongst wild populations of N. atra, perhaps because of a large effective population size, or because bottlenecks due to very recent ecological disturbance are unlikely to dramatically reduce genetic diversity within a handful of generations unless they lead to catastrophically low population sizes.

Genetic structure and barriers to gene flow
Our study utilized more and wider sampling locations compared to an earlier study of N. atra [27]. Here, phylogenetic tree and network analyses using mtDNA reveal three clades comprising individuals from the western (Clade A), eastern and southeastern (Clade B), and westernmost (Clade C) regions, and show a certain degree of geographic structure for the species. This pattern is consistent with that in snakes such as D. acutus distributed almost in the same region, whereby an east-west division of the whole D. acutus population was found [12]. With a different pattern of inheritance and rate of evolution from mtDNA, microsatellitebased analyses further divided Clade B into two genetic clusters: one included individuals from the eastern (ZW) region, and one included individuals from the southeastern (M and GY) region (Fig. 4).
Generally, the degree of genetic structure in a continuously distributed species is expected to fall somewhere between (1) a mosaic of subpopulations delimited by effective physical (e.g. mountains, rivers, roads) or ecological (e.g. prey-habitat specialization) [28] barriers to gene flow; and (2) a single genetic group with no definite boundaries or clear subdivision where gene flow between distant parts of the population may be restricted only by limited dispersal capabilities [29]. Large mountains are a dominant feature of the landscape throughout the cobra's range, and we hypothesized that these mountain ranges may affect gene flow. Based on the Bayesian clustering results the hypothesis was accepted with regard to the Luoxiao Mountains and partly accepted for the Nanling Montains, but rejected with regard to the Wuyi Mountains (Fig. 1). M and GY are located to the east of the Luoxiao Moutains and are separated from QX. As expected, significant differentiation occurred between M and QX and between GY and QX. M and GY are located to the north of the Nanling Mountains and are separated from VS. Significant differentiation occurred between M and VS, and between GY and VS. Although QX and VS are located on two different sides of  Table 1   the Nanling Mountains, they are only partly separated from each other by this mountain range. Consequently, significant differentiation was not found between QX and VS, possibily because of gene flow. With regard to the Wuyi Mountains, M and GY are located on two different sides of the mountain range; however, significant differentiation was not found in individuals from M and GY. More surprisingly, genetic divergences between ZW and M, and between ZW and GY are present, although no significant mountain ranges exist between these localities. Lineage sorting between island and mainland populations has not yet been completed in the most northeastern part of the cobra's range [27], primarily because of the young age of the Zhoushan Islands which were separated from the mainland 10,000 years ago [30]. Hainan is currently separated from the mainland by the Qiongzhou Strait, which first appeared 2.5 million years ago, with the last separation occurring some 10,000 years ago [31,32]. We found that: the haplotypes from Hainan are distributed in both Clade A and Clade B (Fig. 2, Fig. 3); haplotype C1 is shared among HN, M and ZW (Fig. 2); and haplotype C15 is shared among HN, VS and QX (Fig. 2). These findings suggest that, historically, the Qiongzhou Strait did not act as an important barrier to gene exchanges between the Hainan and mainland populations. A similar conclusion has also been drawn for the Reevese's butterfly lizard Leiolepis reevesii [33].
F ST estimates based on microsatellite data also support the Bayesian clustering results ( Table 2). Slatkin's genetic distances (F ST /1-F ST ) between ZW and M (0.672) and between ZW and GY (0.597) are the largest when compared with values between other populations, presumably suggesting that high latitude and cool climates play a significant role in the separation of northern and southern populations of the cobra. ZW is the northern limit of the species' range, and the cobra can be very abundant in the southern region (including islands) but absent in the northern part of the region [34]. Chinese cobras never thermoregulate when air temperatures are lower than 15uC, and a prolonged exposure of cobras to temperatures lower than 9uC can be lethal [35]. The cobra may therefore have a lower capacity to tolerate cold temperatures compared to other species. Glaciation and deglaciation and the accompanying falling and rising of temperature during the Pleistocene may have greatly affected the dispersion of this species. At times of maximum glaciation most cobras in ZW moved south to seek warmer habitats, resulting in increased genetic drift, inbreeding and loss of genetic diversity. As a consequence, genetic divergence has occurred between ZW and the adjacent populations M and GY.

Population history
There were many unobserved intermediate haplotypes missing along the long branches between the two central haplotypes (C17 and C14), whereas many haplotypes were represented within a ''star'' cluster in the haplotype network (Fig. 2). Is this phenomenon the outcome of a bottleneck event causing massive population extinction before subsequent recent expansion? Our answer to this question is no because BOTTLENECK tests based on microsatellites did not support a recent bottleneck for any population or genetic cluster (Table 4). Thus, the missing intermediate haplotypes could have resulted from insufficient sampling. Snakes are sensitive to changes in temperature, especially low temperatures in winter. The lack of a bottleneck and the evidence of expansion both indicate a less significant harmful effect from Quaternary glaciations on N. atra. In fact, no distinct Quaternary glacier landform or deposits have been found within the distribution range of N. atra thus far [36]. It seems likely that isolation, dispersal and expansion jointly explain the history of N. atra. This differs partly from conclusions drawn using the Asiatic toad Bufo gargarizans, an important prey item of N. atra. Phylogenetic analysis suggests that dispersal, instead of vicariance, dominated the history of the B. gargarizans species group [37]. Instead, the history of N. atra is somewhat similar to that of D. acutus where isolation, dispersal, bottlenecks and expansion jointly constitute its history [12].

Conclusions and conservation implications
Phylogeography and population genetic studies can aid the identification of evolutionarily significant units and management units [38] and result in evidence-based conservation decisionmaking regarding endangered species [39]. Phylogenetic and phylogeographic analyses based on our mtDNA data support the existence of two main and a minor (comprising only two individuals) clades. Analyses based on microsatellite data further divided Clade B into two distinctive genetic clusters. Thus, besides the minor clade from the westernmost site, the three main clusters (ZW, M-GY and HN-VS-QX) should be considered as independent management units (MUs) for conservation purposes (Fig. 1). Conservation strategies such as re-introductions and translocations are required to protect or re-establish natural populations of N. atra, but great care should be taken to ensure that individuals are re-introduced to or translocated between appropriate populations within the same MU. Furthermore, as how much adaptive differentiation is present between these MUs or within them is unknown, we caution against long-distance transfers within a group, especially when environmental differences are apparent. In China current practices regarding the translocated release and artificial breeding of Chinese cobras in commercial farms will likely cause serious problems due to unnatural homogenization [40]. We strongly recommend that cobras should not be released into the wild unless their origin can be determined with confidence.

Sample collection
We employed local people to collect adult cobras larger than 900 mm snout-vent length between May and September of 2005-2010 from various localities (18u489-30u179N, 103u579-122u189E Table 1). Great effort was made to avoid collecting more than one individual from the same site. The most distal 25 mm of the tail tip of each cobra was excised using a sterilized scalpel. Individual cobras were released at their site of capture after tissue sampling. Tissue samples were preserved in 95% ethanol before they were deposited at Nanjing Normal University under voucher numbers identified by locality-haplotype numbers. Our experimental procedures complied with the current laws on animal welfare and research in China, and were specifically approved by the Animal Research Ethics Committee of Nanjing Normal University (Permit No. AREC 2004-04-020). The Provincial Forestry Departments of Anhui, Fujian, Guangdong, Guangxi, Guizhou, Hainan, Hunan, Jiangxi, Yunnan and Zhejiang provided permits for capturing cobras in China. The collection of Vietnamese specimens was conducted under the ethics license from Vietnam National Museum of Nature, which was accepted by the Animal Research Ethics Committee of Nanjing Normal University.

MtDNA data analysis
Sequences were translated to amino acids with the program SQUINT [42] to verify if a functional mitochondrial DNA sequence was obtained and that nuclear pseudogenes were not being amplified. We compiled and aligned sequences using MEGA 5.05 [43]. We used ARLEQUIN 3.5 [44] to identify haplotypes and estimate genetic diversity within populations by haplotype (h) and nucleotide diversities (p) [45]. We tested for substitution saturation in cytochrome b (whole gene and each codon position separately). Within N. atra, signs of saturation were not present at any codon position; therefore, saturation was not considered to be a significant factor and all nucleotide positions were used in subsequent analyses.
We reconstructed a phylogenetic tree based on the maximum likelihood (ML), using N. kaouthia (GenBank Accession No. AF217835) as the outgroup. ML analysis was carried out by a heuristic search of 10 random addition analyses with treebisection-reconnection (TBR) branch swapping using PAUP 4.0 beta [46]. The GTR+I+G substitution model was selected by MODELTEST 3.7 [47] based on Akaike information criterion (AIC) [48]. The confidence level of the nodes in the ML tree was estimated using 1000 bootstrap pseudoreplicates. We also conducted a median-joining network (MJN) approach [49] to depict relationships among haplotypes. This approach has been shown to yield the best-resolved genealogies relative to other rooting and network procedures [50]. The MJN was estimated using NETWORK 4.5.0.0 [49].
We used mismatch distributions to test demographic signatures of population expansions within mtDNA lineages [51]. To compare observed distributions with those expected under the expansion model we calculated the sum of square deviation (SSD) and the Harpending's raggedness index [52]. Tajima's D test [53] and Fu's Fs test [54] were used to test equilibrium of the populations in ARLEQUIN 3.5. The statistics was expected to have large negative values under demographic expansion. The equation t = 2ut [25] was used to estimate the approximate expansion time in generations (t), where t is the date of the growth or decline measured in units of mutational time and u is the mutation rate per sequence and per generation. The approximate time of expansion in years was calculated by multiplying t by the generation time of N. atra. The generation time for large snakes was estimated as four years based on the approximate time at which animals mature [12,55]. The substitution rate of mtDNA sequences had been calibrated in studies of lizards [56] and other vertebrates [57] as approximately 0.65% per million years. Based on geological events (the final emergence of the Isthmus of Panama), Wüster et al. (2002) [58] suggest a substitution rate of 0.007 site 21 myr 21 (95% confidence interval: 0.005-0.009) for cytochrome b within the Viperidae. We used the upper and lower values (0.005-0.009) to estimate the overall range of potential dates. Although this dating must be taken with extreme caution due to the lack of calibration of the substitution rate in N. atra and to the sensible overestimation of timing recent events induced by the time-dependency of molecular rates [59], it provides an approximate time frame.

Microsatellite data analysis
All microsatellite loci were screened for null alleles and large allele dropouts using MICRO-CHECKER 2.2.3 [60]. CON-VERT [61] was used to detect private alleles, which were alleles present in one population and not shared with any other. The mean number of alleles (N A ) per locus and observed (H O ) and expected heterozygosities (H E ) were calculated using ARLEQUIN 3.5. FSTAT 2.9.3.2 [62] was used to test linkage disequilibrium and to calculate allelic richness (AR) on a minimum of 17 individuals. Deviations from Hardy-Weinberg equilibrium across all loci for each population were assessed using the exact probability test in GENEPOP 4.0 [63]. Significance values for multiple comparisons were adjusted using the Bonferroni correction.
Genetic distances were computed as Slatkin's (1995) genetic distance (F ST /1-F ST ) derived from pairwise F ST , which were estimated for 15 comparisons between six populations [64]. Geographical distances were measured as the shortest overwater distances between pairs of locations [65]. The significance of each test was assessed using 30000 data randomizations.
A Bayesian clustering method, STRUCTURE [66,67], was used to detect genetic clustering in the whole data set. Under STRUCTURE 2.3.3 the range of possible clusters (K) tested was set from 1 to 10, and 10 independent runs were carried out for each using no prior information, assumed admixture and correlated allele frequencies. The lengths of MCMC iteration and burn-in were set at 300 000 and 50 000, respectively. The true K is selected using the maximal value of the log likelihood [Ln Pr(X/K)] of the posterior probability of the data for a given DK [67]. Further, theDK statistic, the second-order rate of change in the log probability of the data between successive values of K, was also estimated [68].
Demographic history based on microsatellites was assessed using two different and complementary methods. First, the Wilcoxon's sign rank test was used to examine whether populations exhibit a greater level of heterozygosity than predicted in a population at mutation-drift equilibrium. This test is most sensitive to detecting bottlenecks occurring over approximately the last 2-4 Ne generations and, for most parameters, has more power to detect more recent bottlenecks (e.g. 0.2-1Ne generations ago). Second, a mode-shift test [69] was carried out to detect a distortion of the expected L-shaped distribution of allele frequency. This test is most appropriate for detecting population declines which have occurred more recently, specifically over the last few dozen generations [69,70]. Heterozygosity deficiency, heterozygosity excess and mode-shift tests were implemented in BOTTLENECK 1.2.02 [71]. We performed 10000 simulations in six populations and three genetic clusters under the stepwise mutation model (SMM) and the two-phase model (TPM), with 95% single step mutations and 5% multi-step mutations and a variance of 12 as recommended by Piry et al. (1999) [71]. P-values from the Wilcoxon's test were used as evidence for bottlenecks and were assessed for significance at the 0.05 level.