Molecular Phylogeny and Biogeography of Petaurista Inferred from the Cytochrome b Gene, with Implications for the Taxonomic Status of P. caniceps, P. marica and P. sybilla

The polymorphic genus Petaurista includes a group of diverse species and subspecies that are adapted for gliding and arboreal life. This morphological diversity has resulted in taxonomic discrepancies, and molecular phylogenetic studies have been limited by taxon sampling. To clarify this controversial taxonomy, we used the cytochrome b gene to reconstruct the phylogeny to obtain a more accurate picture of the evolutionary relationships, species differentiation and divergence pattern of Petaurista. The results revealed a significant inconsistency between taxonomic designations, phylogeny and genetic distances. When 6 recognized species were included, species delimitation revealed 15 putative species, a finding that warrants a comprehensive morphological diagnosis and a re-assessment of the species status. The validity of P. caniceps and P. marica was discussed. An estimation of the molecular divergence time demonstrated that the diversification and speciation of Petaurista began during the later Miocene and may have been affected by the uplifting of the Qinghai-Tibet plateau and subsequent climate change.


Introduction
The giant flying squirrels, Petaurista Link, 1795, belong to the subfamily Sciurinae and are distributed from Pakistan and Nepal to East Asia, North Indochina and Southeast Asia [1][2][3][4][5][6][7][8][9][10]. This polymorphic genus includes a group of diverse species/subspecies that are adapted for gliding and arboreal life. The head and body lengths of these animals range widely from 305 mm to 585 mm, and the dorsal pelage exhibits a great variety of colors including yellowish gray, buffy gray, bright brown, chestnut and black [1,10].
Within Petaurista, highly variable external morphology presents taxonomic difficulties, particularly for the trans-Himalayan taxa. At least 8 contradictory taxonomic hypotheses have been proposed based on dental and cranial characteristics and external morphology since 1940 (Table 1). In the latest taxonomic revision, 8 species were recognized [6]. Nonetheless, the number of recognized species has continuously changed from 5 to 31, and long-standing controversies remain regarding the taxonomic status of P. albiventer, P. caniceps, P. hainana, P. marica, P. marica sybilla and P. yunanensis (Table 1). For example, P. caniceps, P. marica and P. sybilla have been recognized as species, subspecies or synonyms in different revisions [2][3][4][5][6][7][8][9][10]. Notably, these taxonomic revisions were based on preliminary morphological comparison, and comprehensive morphological or morphometric analyses have not yet been performed.
Only recently have molecular phylogenies of Petaurista been proposed, all of which are based on cytochrome b (cyt b) [11][12][13][14][15][16]. However, due to the disparate sampling of taxa, no broad picture of the phylogeny is available (Figure 1). In addition, no genetic information has been published for P. caniceps, P. marica and P. sybilla.
In this study, to better understand the phylogeny and evolutionary history of Petaurista, we obtained the cyt b sequences of P. caniceps, P. philippensis, P. yunanensis, P. marica and P. sybilla. With additional sequences from GenBank, 6 of 8 species recognized by Thorington et al. (2005) were sampled, enabling us to develop a broad picture of Petaurista evolution and to investigate the validity of some debatable Petaurista species, such as P. sybilla and P. caniceps. Furthermore, we used Bayesian relaxed molecular clock approaches and fossil data to analyze the correlation between the evolutionary history of Petaurista and climate change.

Ethics Statement
All samples used in this study were obtained from specimens deposited in the Kunming Natural History Museum of Zoology (KNHMZ) at the Kunming Institute of Zoology (KIZ) of the Chinese Academy of Sciences (CAS). Our sampling did not violate any law, rule or regulation in China and thus required no ethical or institutional approval. Additionally, we obtained permission from the KNHMZ and the KIZ of the CAS to use the samples in our study (no permit number).

Sampling
In total, 10 specimens of Petaurista were collected and deposited in KNHMZ, KIZ. Additionally, 25 cyt b sequences of Belomys, Petaurista and Pteromys were obtained from GenBank. Thus, our sampling included 6 Petaurista species recognized by Thorington and Hoffmann [6] (Table 2).

DNA Preparation and Sequencing
The samples used in this study were muscle tissues preserved in 95% ethanol or pedal skin specimens. Before DNA extraction, pedal skins were treated in a series of 48-hour washes in 90, 70, 50, 30 and 10% ethanol, followed by successive 24-hour immersions in phosphate-buffered saline (PBS). Total DNA was extracted using a Tissue DNA Kit (BioTeke Corporation, Beijing, China), according to the manufacturer's protocols. Cyt b sequences were amplified using a set of primer pairs, including L14724, L14979, H15149, H15915 [17,18], L15306, H15347, and H15603 [12], as well as 2 other primers that were designed in this study: L15460 (59-CTC ATA ATC CTA GTC CTA T T-39) and L15550 (59-ACA TTA AAC CAG AAT GAT ACT TCC TAT-39). The 50-ml polymerase chain reaction (PCR) mixture contained 25 ml of 26Power Taq PCR MasterMix (BioTeke Corp.), 2 ml (10 ng) of genomic DNA and 2 ml of each primer (10 pmol). PCR amplification was performed using the following program: 5 min at 94uC, followed by 35 cycles of 1 min at 94uC, 1 min and 10 s at 48-54uC, then 1 min at 72uC and a post-extension step for 10 min at 72uC. The PCR products were purified and sequenced using the BigDye Terminator Cycle kit 3.1 on an

Sequence Alignment
Nucleotide sequences were proofread using SeqMan (DNAstar Inc., Madison, WI) and were aligned using Clustal W [19]. Quantitative pairwise comparisons between Petaurista putative species were performed, and the average genetic distances between phylogenetic clades were calculated using Kimura's (1980) 2-parameter (K2P) method in MEGA 5.0 [20]. To test the homogeneity of base frequencies across taxa, PAUP* 4.0b10 [21] was used to conduct a chi-squared test.

Phylogenetic Analysis
To elucidate the phylogenetic relationships among the Petaurista species, phylogenetic analyses were performed to assess maximum likelihood (ML) with GARLI v2.0 [22] and Bayesian inference (BI) using MrBayes v3.2.1 [23]. The cyt b data used in this study were partitioned according to the codon position for both ML and BI analyses. The best-fit evolutionary model of each codon position was calculated using jModeltest v2.1 [24] and determined using the Bayesian information criterion (BIC) because of its high accuracy and precision [25]. The models used for the 1 st , 2 nd and 3 rd codon positions were SYM+G, HYK+G and GTR+G, respectively.
ML tree calculation was performed using a random starting tree, 5 replicate searches and 5 million generations for each replicate, and were sampled every 1,000 generations to estimate the best tree. The bootstrap support (BS) was assessed based on 1,000 bootstrap replicates. PAUP* 4.0b10 [21] was used to generate the strict consensus tree.
Partitioned Bayesian analyses were executed using a random starting tree and the program's default distributions for model parameters. The analyses were repeated twice, and each analysis included 30 million generations. The results were sampled every 3,000 generations. Convergences were assessed by calculating the effective sample sizes (ESSs) using Tracer v1.5 [26]. Conservatively, the first 25% of the sampled trees were discarded as ''burn in'', and the remaining 75% of the sampled trees were used to calculate the Bayesian posterior probabilities (PP). Nineteen alternative phylogenetic hypotheses were also tested using CONSEL v0.2 [27] and PAUP4.0b10 by calculating the p-value in the approximately unbiased (AU) [28], Kishino-Hasegawa (KH) [29] and Shimodaira-Hasegawa (SH) [30] tests. Selection bias results from comparing many trees; in this case, the AU test is less biased regarding tree selection [28].

Divergence Time Estimation and Species Delimitation
The divergence times were estimated using the uncorrelated relaxed molecular clock approach [31] implemented in BEAST v1.7.4 [32]. Fourteen additional Sciurid taxa (the GenBank accession numbers are shown in the tree) were included as outgroups. Two calibration ages were treated as lognormal distributions with soft boundaries [33] and were defined based on fossil records in the Paleobiology Database (http://paleodb. org) and the NOW (New and Old Worlds) database of fossil mammals [34]. The oldest fossil squirrel (Douglassciurus jeffersoni) is known from the late Eocene (37.2-33.9 million years ago [Ma]). A previous study demonstrated that this calibration should be applied to a crown group of squirrels [35]. Thus, we used this fossil to calibrate the most recent common ancestor (MRCA) of living squirrels. The fossil sites were dated to 37.86-35.75 Ma, and changing the calibration date to between 37.8 and 35 Ma has an insignificant effect (see [35] and references therein). We used a lognormal distribution such that the earliest possible sample age was 33.9 Ma and the older 95% credible interval (CI) included 37.2 Ma (offset = 33.9, mean = 1.05, and standard deviation = 1.0). Note that we used a younger lower boundary because the cyt b sequences of Ratufa and Sciurillus, which represent the basal taxa of squirrels, were not included (or available) [35]. The  [37]. The substitution models used for each codon position were the same as those used in the MrBayes analyses. Each BEAST analysis included a randomly generated starting tree, an uncorrelated lognormal relaxed molecular clock, a birth-death model for the tree, and 10 million generations that were sampled every 1,000 generations. Tracer 1.5 [26] was used to confirm that each independent analysis had reached stationary states (i.e., ESSs .200). Based on the time-calibrated tree calculated using BEAST, the number of putative species was identified using the single threshold GMYC model [38]. This method used maximum-likelihood statistics and divergence times in a tree to identify the split point from the species to the population level. In some cases, this method performed very well (errors less than 25%) [39], and the temporal pattern of diversification was visualized using lineages-throughtime (LTT) plots. We calculated Pybus and Harvey's c to determine whether diversification occurred earlier (c,0) or later (c.0) [40]. These analyses were implemented using the APE v3.0, Laser v2.3 and SPLITS v2.1 packages for the R statistical environment [41,42].

Gene Sequences
We analyzed 35 cyt b sequences (1068-1140 bp), including 6 of 8 Petaurista species recognized by Thorington and Hoffmann [6] ( Table 2). The sequences of P. caniceps, P. marica and P. marica are novel data ( Table 2). The average nucleotide composition of the cyt b genes was 28.1% A, 29.2% T, 12.8% G and 29.9% C. The sequence alignment included 449 variable sites along with 386 parsimony informative sites (33.9% of the entire sequence). Analysis of the base composition (P = 1.0, df = 96, Chi-Squared = 48.77) indicated homogeneity among the taxa. The K2P distances between pairs of species are listed in Table 3. The pairwise distance values among P. caniceps, P. elegans, P. marica and P. sybilla were between 4.80 and 16.47% (Table 3).

Phylogenetic Relationships among Petaurista
Phylogenetic reconstructions using ML and BI generated the same tree topology, with overall strong supports (i.e., PP.95% and BS .70%; [43,44]) for most but not all relationships ( Figure 2). With Belomys pearsonii and Pteromys volans as outgroup taxa, Petaurista was consistently supported as a monophyletic clade (PP = 100% and BS = 97%), in which 4 major phylogroups were recovered (clades I, II, III and IV). Clade I, which is represented by a single species, P. leucogenys, occupied a basal position within the genus (PP = 100% and BS = 72%). Clade II (represented by P. xanthotis) and clades III+IV are sister groups, but the sister relationship between clades III and IV was not supported by bootstrap replicates or Bayesian probabilities (PP = 92% and BS = 40%), indicating that the relationship was not stable. Further AU, KH and SH tests found that 11 alternative phylogenetic scenarios could not be rejected by at least 1 test, and 4 could not be rejected by all 3 tests (P.0.05; Table 4). Thus, the relationships among the 4 clades remained ambiguous. Even so, P. caniceps represented a distinct lineage, which is the sister group of P. petaurista+P. grandis (PP = 100% and BS = 97%). P. marica and P. sybilla were supported as sister taxa (PP = 100% and BS = 99%), and the K2P distance between them was 4.80%.

Molecular Divergence Dating, Species Delimitation and Species Diversification
BEAST analyses recovered the same topology as GARLI and MrBayes analyses (Figure 2). The MRCA of the genus existed at approximately 12.51 Ma (95% CI = 16.16-9.04) (Figure 3). The divergences among clades II -IV occurred approximately between 10.94 and 10.30 Ma (95%CI = 14.03-7.75). Note that because the relationships among the 4 clades are not stable, the divergence times and the results of the LTT plots and the Pybus and Harvey's tests should be treated with caution. The early splits within clades III and IV occurred at 8.80 and 7.49 Ma, respectively (95%CI = 11. 45-5.17). The results also revealed that the divergence of P. caniceps and P. grandis+petaurista, P. elegans and P. sybilla+marica as well as P. albiventer+petaurista+yunanensis and P. alborufus+hainana+philippensis occurred almost simultaneously at  (Figure 4). Species delimitation analyses revealed 15 lineages as putative species; lineages that diverged older than 0.79 Ma were identified as potential species (Figure 3). Thus, despite the 6 species recognized by Thorington et al., 2005, P. albiventer, P. hainana, P. lena, P. marica, P. sybilla and P. yunnanensis were also recognized as potential species, and P. caniceps was recognized as 2 potential species. The Pybus and Harvey's c-value from our tree was 21.02 (P = 0.15). Thus, the pure-birth model was not significantly rejected, and does not support early-burst or late-burst/high early extinction.

Phylogenetic Relationships among Petaurista
Although more sequences and species were included in our phylogenetic analyses, the topology is largely congruent with all previous hypotheses; that is, no conflicting relationship with high BS or PP was observed. Unfortunately, alternative phylogenetic scenarios could not be rejected statistically (Table 4); therefore, the relationships among the 4 major lineages, including the basal position of the genus, remain obscure. Indeed, the values of the log-likelihood (-lnL) of several alternative phylogenies are very close to that of the ML/BI topology ( Table 4). The unresolved nature of the phylogeny might be attributed to insufficient phylogenetic information in the cyt b sequences. Other potential reasons include a rapid radiation. In the time-calibrated tree, the branches representing the 4 clades are short, and the divergences may have occurred within 2 million years ( Figure 3). Regardless of the reason, more robust data, such as multiple unlinked nuclear genes, are required to fully resolve the relationships.

Taxonomic Implications
Although we were not able to fully resolve the relationships, the 4 major clades and 15 putative species recognized in our analyses enable us to discuss the taxonomy of Petaurista at a preliminary level. We note that our species delimitation analysis was based on a single gene and a very simple hypothesis and relied on a genetic species concept [45,46]. Therefore, these putative species represent only a crude estimate rather than a fully described model. To better understand the taxonomic status of these putative species, further investigation using multiple unlinked genes, comprehensive morphological and/or morphometric analyses, karyotypic studies and ecological and reproductive studies are warranted. Even so, the putative species recognized appear to be congruent with the previous taxonomic hypotheses (Table 1) and have been implied in phylogenetic studies [11,12,13]. The species status of P. albiventer, P. grandis, P. hainana, P. lena and P. yunanensis have been discussed by Oshida et al. [11,12] and Yu et al. [13], and we will focus on the taxonomic status of P. marica, P. caniceps and P. sybilla herein.
P. caniceps was first recognized as Sciuropterus caniceps in 1842 [47]. The following taxonomic rearrangements appear to be authordependent [2][3][4][5][6]8]. In the present study, the distinct phylogenetic position and strikingly large genetic distances indicate that this species should be considered valid. In addition, P. caniceps is morphologically distinguishable from all other Petaurista by the absence of any unique white speckling over the back and a grey forehead. P. caniceps is sympatrically distributed with P. marica in southwestern China [5]. It is noteworthy that the species from western and middle Yunnan, China are also genetically distinguishable and were identified as 2 putative species in species delimitation analyses. Examination of the morphological differences among populations is warranted.
P. marica was first described by Thomas (1912) based on specimens from Yunnan (most likely near Mong-tze), China [48], and P. sybilla was named by Thomas and Wroughton in 1916 [49]. Since then, the taxonomic status of these species has been author dependent [2][3][4][5][6]8]. In this study, P. marica is represented by 2 specimens from locations in Lvchun and Jinpin ( Table 2) that are very close to its type locality ( Figure 5); P. sybilla is represented by 2 samples from western Yunnan. The results suggest that P. marica and P. sybilla may have diverged from P. elegans 4.47 Ma (95%CI = 6.46-2.78) and that the former 2 taxa split during the early to middle Pleistocene (3.12-0.97 Ma). The results justify a re-assessment of these 2 taxa and call for comprehensive morphological diagnoses.

Correlation between Petaurista Evolution and Climate Change
The higher-level phylogenetic relationships of Petaurista were not fully resolved and characterized by relatively short branches. Therefore, we assumed the diversification among the 4 major lineages to have occurred at approximately 12.51-7.49 Ma (95%CI = 16. 16-5.17) and may be associated with episodes of  (I,(II,III,IV) global cooling since the middle Miocene (from 15 Ma) as well as the accelerated uplift of the Qinghai-Tibet Plateau (at approximately 10-8 Ma) [50][51][52]. The uplift of the plateau also strengthened the East Asia monsoon and increased the aridity of the dry seasons [51]. The climate change and consequent habitat turnover could have led to fragmentation of the Petaurista distribution. This suggestion is based on the observed short branches but is not supported by the Pybus and Harvey's test results. Nonetheless, rapid diversification was also observed at approximately 12-10 Ma among tree squirrel genera on the Sunda Shelf islands and has been connected to climate change and the subsequent drop in sea levels [35]. Most of the diversification among species occurred from the early Pliocene to the early Pleistocene (5-2 Ma), a finding that may be related to global cooling and desiccation, particularly around the Miocene/ Pliocene boundary and during Pleistocene climate fluctuations [53][54][55][56]. However, these correlations require stronger evidence and should be tested in other East Asian taxa.