Phylogeography of Aphyocypris normalis Nichols and Pope, 1927 at Hainan Island and adjacent areas based on mitochondrial DNA data

We investigated the genetic structure of the freshwater fish Aphyocypris normalis, in 33 populations around Hainan Island and southern mainland China. Sequencing of the mitochondrial DNA (mtDNA) cytochrome b from 127 specimens yielded 47 haplotypes, from which we inferred a Bayesian tree. This revealed three major divergences: a principal clade of specimens with widespread geographic distribution, plus two clades with limited distribution. We estimated that these diverged between 1.05–0.16 Ma. Additionally, based on molecular data and comparing with the climate patterns of Hainan Island, eight phylogeographic ranges (populations) of A. normalis were constructed: the eastern plain (E), northeastern hills and plain (NE), northwestern hills and lowlands (NW), central mountains (C), southeastern hills and plain (SE), southern mountains and hills (S), southwestern mountains and lowlands (SW), and western lowlands (W). The patterns of geographical divergence in this species do not reflect the isolation caused by the Qiongzhou (Hainan) Strait, which would generally be experienced by terrestrial animals on isolated islands. The present results indicate that the major clades within A. normalis have diverged before the temporary land bridge existed across the strait during the Last Glacial Maximum.


Introduction
Biogeography is tied closely to both ecology and phylogenetics and is a key topic in classic texts on phylogenetic systematics [1][2][3]. Comparing the phylogenies of co-distributed taxa provides insight into how factors such as movement capability and life history influence the genetic signature left by geographic events. Beyond descriptive biogeography, understanding the factors that affect the evolution of taxa in a region allows for the development of testable predictions relating to patterns of genetic diversity [4].
Hainan, a c. 33,920 km 2 tropical island off the shore of southern China, is widely recognized as one of the world's biodiversity hotspots [5][6][7][8][9][10]. This large island is located in Beibu Bay (Gulf of Tonkin) and is isolated from Guangdong's Leizhou Peninsula to the north by the such as freshwater fishes, should share a similar evolutionary history to terrestrial species, and provide excellent opportunities for historical biogeographic research because of their limited dispersal capabilities. As a distinct biogeographical unit in China's zoogeography, Hainan Island hosts a unique assemblage of primary freshwater animals, mostly cyprinids [9,13,14]. At least 769 species of Cyprinidae have been reported in China [15,16], of which at least 66 have been reported on Hainan Island [10,17]. Their high biodiversity and wide distribution make Hainan cyprinids ideal subjects for phylogeographic studies.
Aphyocypris normalis Nichols and Pope, 1927 is a freshwater fish that ranges over southern mainland China and Hainan Island [18], as well as Vietnam on the Indochina Peninsula [19]. In mainland China and Hainan, we have observed that A. normalis is distributed in small brooks on hills and mountains up to elevations of ca. 640 m ASL. Due to its wide distribution in this area, analyses of the phylogenetic and demographic history of A. normalis across Hainan Island and adjacent areas could clarify the historical connections of the major river systems in these regions. The information of this fish on Indochina Peninsula is very limited, for example, the first formal record of A. normalis on this peninsula was in 2011 from northern and central Vietnam [19].
Herein, we present the first densely sampled phylogeographic study focused on the historical biogeography of a Hainan native species. Specifically, we examined the phylogeographic structuring of A. normalis in rivers across Hainan Island, plus in its adjacent range in mainland China, to assess genetic diversity through comprehensive geographic sampling. Moreover, we introduced a climate division hypothesis [36] to discuss the possible isolation mechanism of freshwater fauna in Hainan Island. The aims of our study were to i) construct a detailed phylogeny of A. normalis on Hainan Island based on data from the mitochondrial cytochrome b gene; ii) infer the distribution history of A. normalis; and iii) understand its evolutionary divergence in relation to Hainan Island's historic climatic patterns. On the basis of our results, we propose a hypothesis on the division of phylogeographic ranges for freshwater fishes of Hainan Island. In addition, short comments on the population diversity of a closely-related species, Aphyocypris moltrechti (Regan 1908) from Taiwan, and on the taxonomic status of this sibling cyprinid are provided.

Sample collection
We utilized ethanol-fixed tissue samples from 127 specimens of A. normalis, collected at 25 sites in Hainan (83 specimens) and at 8 sites in southern mainland China (44 specimens). This sampling encompasses almost the entire geographic range of the species. The Hainan specimens were collected by the authors, plus by members of the first author's laboratory at National Taiwan Ocean University. As an outgroup, we utilized samples from 22 specimens of A. moltrechti, a freshwater fish native to Taiwan, that were collected at 3 localities (Fig 1 and  Table 1). A total of 127 A. normalis specimens were successfully sequenced and the sequences were deposited in the GenBank database (accession codes: KC209831-KC209957). Eighty- three of these specimens were collected from 25 localities within 12 river systems on Hainan Island, and 44 specimens were collected from 8 localities in five river systems in southern mainland China. We also sequenced 22 A. moltrechti specimens (accession codes: KC209958-KC209979) as outgroups (Table 1 and Fig 1). All of the samples used in this study are specimen-vouchered, coming from collections established in 2009-2012 as part of projects supported by the National Science Council, Taiwan (now Ministry of Science and Technology,  Taiwan), or from the National Museum of Natural Science, Taiwan (Table 1).

DNA amplification and sequencing
Genomic DNA was isolated using the Tissue and Cell Genomic DNA Purification Kit (Hopegen Biotechnology Development Enterprises, Taichung, Taiwan). The extraction of geno-micDNA was performed according to the manufacturer's instructions with repeat membrane binding, salt washing, and centrifugation. Polymerase chain reaction (PCR) amplification was performed using the primer pair CypbF1 5 0 -TGACTTGAAGAACCACCGTTGTA-3 0 and

Phylogenetic analysis
Haplotype phylogenetic reconstructions, based on the resulting cytb sequences, were performed using neighbor-joining (NJ), maximum likelihood (ML), and Bayesian inference (BI) methods. The first two analyses were performed on unique haplotypes in PAUP � version 4 beta [40] with 1000 bootstrap replicates, and with HKY+I as the best-fitting model of sequence evolution, as determined by the AIC criterion in jModeltest0.  [43]. Standard error estimates were obtained from bootstrapping over 1000 replicates. Haplotype networks were inferred using TCS v1.21 [44]. A parsimony network was constructed using the statistical parsimony algorithm [45,46] and haplotypes were grouped hierarchically into a set of nested clades [47].

Divergence times among major clades
The divergence time of the most recent common ancestor (MRCA) of A. normalis specimens was estimated by a Bayesian analysis using BEAST ver. 1.7.4 [57]. We assumed the constant population size setting as a coalescent tree prior, which is applicable for trees reporting the relationships between individuals within the same species. Posterior distributions were obtained via Markov Chain Monte Carlo analysis with 2 x 10 7 steps (length of chain) sampling every 1000 steps under a fixed pairwise evolutionary rate with a strict clock model. We used previously reported molecular substitution rates estimated for cyprinid cytb, ranging from roughly 0.6% to 1.3% per million years (Myr; e.g., 0.76%, [58,59]; 0.60-1.24%, [60]; 0.58-1.1%, [61]; 0.76-0.92%, [62]; 1.05%, [26]). We adopted 1.0% Myr as the mean value of a normally distributed clock rate covered by above range of cytb clock rate following Ketmaier et al. [63] and Zhang et al. [64]. The parameters of the substitution model GTR were estimated independently for each haplotype. Results were viewed using Tracer v 1.5 [65]. The effective sample sizes (ESSs) for all parameters exceeded 200. A low ESS (<200) means that the parameter contained a lot of correlated samples and thus may not represent the posterior distribution well.

Population structure
The relative contributions of genetic variation within and between populations in A. normalis were tested via hierarchical analysis of molecular variance (AMOVA) [66] in Arlequin 3.5 [67], with significance assessed by using 1023 random permutations of the dataset. In this analysis, we focused on four a priori hypotheses we modeled that reflect different ways of partitioning populations to test the amount of variation explained between geographical units: isolated by the strait (i. e. Hainan Island vs. mainland; hypothesis 1a), drainage by river system, both on Hainan Island and on the mainland (hypothesis 1b), and two hypotheses of phylogeography within Hainan Island, as hypothesis 2a and hypothesis 2b explained below. For the hypothesis 1a, all sampled populations were combined into two groups representing the localities of their origin to evaluate movement across the strait. Alternatively, to evaluate movement between river systems (1b), we combined populations into 17 groups based on drainage basins or close geographic proximity (i. e., the following populations were combined: sites 1-2, 4-5, 7-8, 12-14, 15-17, 19-25, and A2-A5; see Fig 1). If genetic structuring is driven by isolation among drainages due to drainage basin boundaries, then we expect to see high levels of structuring among drainages. For the hypotheses 2a and 2b, we evaluated movement between rivers on Hainan Island and checked the robustness of dividing the region into eight biogeographical areas based on climate [36] (hypothesis 2b; also see Discussion), compared with 12 groups based on drainage basins on Hainan Island (hypothesis 2a; S1 Fig shows the four hypotheses mentioned above). Cases where high percentages of variation occurred between groups suggest that an important historic barrier to gene flow exists, which is consistent with the relevant hypothesis [68,69]. The F ST values were computed using the information of haplotype using Arlequin 3.5 as well.
In addition, we applied SAMOVA 2.0 to define groups of populations that are geographically homogeneous and maximally differentiated from each other, and to evaluate their hierarchical differentiation [70]. A total of 100 initial independent processes were tested, followed by 10,000 steps of the simulated annealing process, which maximizes the proportion of total genetic variance among groups. We set the population/sampling localities of a single river system as a grouping unit (17 counts), following hypothesis 1b in AMOVA analysis. SAMOVA analyses were run under hypotheses of two to 16 groups (K) without geographic restrictions to examine the grouping pattern and to explore if a larger number of groups existed. The F CT index (proportion of total genetic variance due to differences between groups of populations) was used to select the best grouping, that is, the most suitable K value. We selected the number of groupings that maximizes this component, meaning that under that hypothesis the groups of populations are maximally differentiated from each other [70].

Sequence diversity and haplotype distribution
All sequences obtained from 127 Aphyocypris normalis and 22 A. moltrechti specimens were 1,091 bp in length. No insertions, deletions, or stop codons were found in these sequences. Among the sequences of A. normalis, 88 polymorphic sites (8.07%) were identified and 57 sites (5.22%) were parsimony-informative. The estimated Transition/Transversion ratio is 11.50. In total, 47 haplotypes were detected and are reported in Table 1, as well as their frequencies as represented by the number of individuals. Forty-one of them were private haplotypes. Withinpopulation genetic variations of nucleotide diversity (π) of sampling locality populations, excluding those localities (sites 5, 6, 9, 11, 13, 25, and A1) where only one specimen was collected, ranged from 0−1.28%; the mean of all individuals was 0.69%. The haplotype diversity (Hd) within each locality ranged from 0.00−1.00 ( Table 1). Most of the southeastern Hainan Island localities/rivers (sites 1-11) showed low haplotype diversities (0-0.4) and low nucleotide diversities (0-0.073%). In contrast, localities in southern mainland China showed higher haplotype (0-1) and nucleotide diversities (0-0.295%). Nucleotide diversity was highest at site 15 (1.283%) on the Zhubi River, which was represented by only two specimens.
On Hainan Island, 23 individuals sampled from the Nandu River exhibited 12 haplotypes, ten of which were exclusive to the river. These haplotypes could be separated into two groups. One belonged to Clade I (h4, h6, h26, h28, and h29), in which they were single or two nucleotide substitutions to each other. Another belonged to Subclade Ib (h20, h21, h22, h23, h24, h25, and h27), in which they are single or several substitutions (up to 11) to each other. The five individuals from the Changhua River carried three haplotypes (two exclusive) whereas 11 individuals from the Wanquan River carried only one, which was shared with the Nandu River and the Xi River (the Pearl River system) specimens in mainland China. Moreover, the 24 individuals from the Pearl River system in mainland China carried 12 haplotypes, 11 being exclusive to that stream; copious haplotypes were detected in a single population. Only four haplotypes (h1, h4, h6, h8) were present in more than three populations, with haplotype 4 being the most frequent by far. Haplotype 4 is also the only haplotype shared between Hainan Island and mainland China (sites 4, 5, 23, A5; Fig 1).

Phylogenetic relationships
Bayesian, NJ, and ML haplotype trees (Fig 2, Bayesian tree with BI, NJ, and ML supporting values) showed three distinct clades among the A. normalis cytb haplotypes. Clades I and II, defined by 43 haplotypes from 116 specimens, represented almost all populations including those on Hainan Island and in southern mainland China, except those from the Ninyuan River (site 9), the Wanlou River (site 10), and the Changhua River (sites [12][13][14], which were occupied by Clade III with four haplotypes from 11 individuals. Clade II was represented by two haplotypes (h30, h31) from three individuals collected from the Lou River (site A1), which  Table 1. � indicates the haplotypes from the Zhoushui River, which were assigned to Aphyocypris amnis by Liao et al. [71]. (2) Subclade Ib, represented by eight haplotypes (h13, h20, h21, h22, h23, h24, h25, h27) from 13 individuals collected from the Zhubi River (site 15) and the Nandu River (sites 19-23) in Hainan; and (3) Subclade Ic, represented by five haplotypes (h32-h36) from 10 individuals collected from the mainland's Dong River (site A3). These clades/subclades are generally supported by high bootstrap/supporting values inferred from Bayesian, NJ, and ML analyses (Fig 2). Fig 3 shows the phylogenetic relationships between A. normalis and closely related cyprinid species. The sequences of A. normalis and A. moltrechti used in this study were tightly grouped, and the two together grouped with Aphyocypris arcus and other two species of Aphyocypris (A. kikuchii and A. chinensis). Like A. normalis and A. moltrechti, A. kikuchii and A chinensis also formed a close monophyletic group. All these species grouped together monophyletically at a high confidence level.

Haplotype network
In our haplotype network (Fig 4), two major clusters were identified (separated by the red broken line in Fig 4); only the smaller cluster of haplotypes h7, h8, h10, h11 was agreed with the Bayesian haplotype tree and ML as the Clade III (see Fig 2). Another two major clades (Clades I and II) identified by Bayesian, NJ, and ML methods were not monophyletic in the network result. Haplotypes of these two clades comprised the super-cluster of this network. In the super-cluster, the Clade I of phylogenetic tree contained at least 12 mutational differences from haplotypes of Clade II (h30 and h31). Haplotype 4 (h4) was the major haplotype (represented by 13 individuals, mainly from the Wanquan River), occupying the central position of all haplotypes, and occurred at three sampling locations on Hainan (sites 4, 5, 23) and a fourth location on the mainland (site A5). Furthermore, another haplotype (h1) represented by 15 individuals detected in two small rivers of eastern Hainan (sites 1−3) contained only one mutational difference from h4. The haplotype h1 was located between h4 and five haplotypes of Subclade Ic, which represented by 10 individuals from a single site on the mainland (site A3).
In the smaller cluster of the whole network (Clade III), the variation among the haplotypes was low with no more than six mutational differences. Three of the four Clade III haplotypes clustered closely with only one base variation between each other. The fourth haplotype (h7, represented by a single specimen from the Ninyuan River at site 9) contained five or six mutational differences relative to the other Clade III haplotypes.

Divergence times among major clades
Divergence times among clades/subclades of A. normalis were estimated at 0.031−1.049 million years ago (Ma;

Population structure
For all of the a priori phylogeographic hypotheses that we examined, AMOVA recovered highly significant population structure (F ST ) values which implied genetic variation between tested groups. These models are summarized below. First, the Hainan Island vs. mainland hypothesis (Hypothesis 1a in Table 4) revealed that only 1.1% of the total genetic variation was Table 3. Estimates of time since the most recent common ancestor (tMRCA; time scale in millions of years; standard errors are presented in parentheses) and 95% HPD estimates for clades/sub-clades for the main internal nodes of Aphyocypris normalis phylogeny (Fig 2)

PLOS ONE
Phylogeography of Aphyocypris normalis partitioned among the two regions, indicating a high level of genetic movement across the strait between Hainan Island and the mainland. We predicted that populations from both sides of Qiongzhou Strait and Beibu Bay would be genetically distinct, but this was not the case. Besides, the F CT values of this model was low (0.1104). Our subsequent hypothesis of this experiments (Hypothesis 1b) was designed to test the isolation power that independent river systems without freshwater connections would show lower levels of gene flow, and we found that this was more possible than hypothesis 1a, but also might not be true. The variations among groups and among populations within groups in Hypothesis 1b were similar as 32.77% and 47.01%, respectively. Second, we tested the isolation mechanism of A. normalis occurred on Hainan Island. In Hypothesis 2a, the variations among groups and among populations within groups for all river systems on Hainan Island are 51.14% and 36.34%, respectively. The results of Hypothesis 2a, as well as Hypothesis 1b, did not strongly support genetic isolation between river systems. The last hypothesis (2b), dividing the region into eight biogeographical areas based on climate revealed that most of the variation was between groups (70.15%; Table 4), which indicates that the hypothetical biogeographic region division is a strong barrier to gene flow.
In the SAMOVA analyses, the highest F CT value was observed at K = 3 (S1 Table). Under this K, 78.76% of the variation could be explained by variation among groups ( Table 5). The identified groups were the Lou River population in mainland China (site A1; see Table 1 and Fig 1), populations of Changhua, Ninyuan and Wanlou Rivers in Hainan (sites 9, 10, and [12][13][14], and the remained populations both in mainland and Hainan Island (S3 Fig). These groups coincide with the results of the AMOVA that the strait between Hainan Island and the mainland does not form a barrier impeding the genetic interaction between them. The second highest F CT value in the SAMOVA analyses was observed at K = 4, which showed very close to the highest one (0.78714 vs. 0.78762; see S1 Table). In this four-groups hypothesis, the specimen collected from Ninyuan River (site 9) was separated from the populations of Changhua and Wanlou Rivers in advance, in which indicated that the specimen from Ninyuan River is unique to other specimens in this study.
In general, SAMOVA analyses detected a group (populations of Changhua, Ninyuan and Wanlou Rivers in Hainan) within Hainan populations found in AMOVA hypothesis 2b (S1C Fig), which showed a unique phylogenetic group different from all other A. normalis populations tested in this study (S3 Fig). This group was also identified as Clade III in the phylogenetic analysis mentioned above.

Genetic diversity and structuring in Aphyocypris normalis
Our novel sequence data revealed three major genetically differentiated clades among the A. normalis specimens analyzed in this study, as supported by NJ, ML, and Bayesian analyses. It is generally accepted that geographical isolation is a very efficient isolation mechanism for freshwater animals; e.g., straits between islands and the mainland [74][75][76][77]. Nonetheless, concurring with the results of Huang et al. [25] and Li et al. [26], the geographic isolation of the Qiongzhou strait between Hainan Island and mainland China did not create efficient isolation barriers for A. normalis populations. Furthermore, Clade I was composed of specimens collected from 15 currently isolated rivers on Hainan Island and the mainland. Moreover, haplotypes h4, h6, and h8 all were found in separate rivers on Hainan Island and even across the strait (h4). Li et al. [26] reported that the Yinggeling and Wuzhishan Mountains (see Fig 5A, 5B) provides isolation mechanisms on A. normalis in Hainan and separated them into northeastern group and southwestern group. With more sampling sites in our work, we further proposed that Yinggeling Mountain formed a more efficiency isolation barrier than Wuzhishan Mountain, which might separate Subclade Ib and Subclade III. The pattern of lower genetic diversity in the populations of Hainan Islands is common with respect to the continent. Most of the Hainan populations showed low haplotype and nucleotide diversities, and populations in southern mainland China showed higher haplotype and nucleotide diversities in comparison, except the populations in Zhubi River (sites [15][16][17]. The highest nucleotide diversity was detected at site 15 (1.283%) on the Zhubi River, as well as the haplotype diversity, however, which were represented by only two specimens. Limited sampling size may mislead the accuracy of analysis results, nevertheless, all the analyzed data presented in this study were based on valid specimens and are the only information of these populations at present.
Our comprehensive phylogenetic analyses, AMOVA and SAMOVA testing conclude that the phylogeography of A. normalis on Hainan Island was not shaped solely by the geographical barriers. Instead, effective isolation mechanisms could have been formed by events other than geographic barriers, such as the elevation of different habitats after climatic events and rainfall, leading to the divergence of breeding strategies (e.g., [78]), or other water characteristics of lotic systems. It is therefore possible that the A. normalis population diversity of Hainan was shaped not only by geographic barriers or by habitat separation in different river systems, but also resulted from climatic events, such as floods, which provided opportunities for interaction between populations. He and Zhang [36] divided Hainan Island into eight climate regions primarily according to rainfall distribution: (1) NE seashore plain, more cool, windy but more wet; (2) NW seashore plain, more cool, windy but more dry; (3) NE hill district and basin, more cold and wet in winter; (4) NW mountain, hill district and basin, occasionally fierce cold winter; (5) SE seashore plain, more windy, more warm and wet; (6) SW seashore plain, more warm and fierce dry; (7) SE mountain and hill district, valley, and basin climate, more wet and warm; and (8) SW mountain and hill district basin, more warm and dry (Fig 5A). Their divisions fit the results of phylogeography of A. normalis on Hainan, as inferred from the cytb analyses of this study. Considering the climate divisions, topology of river systems, and the results of our cytb analyses, we propose eight phylogeographic ranges of A. normalis (Fig 5B): (1) eastern plain (E), including several small rivers, (2) northeastern hills and plains (NE), generally composed of hills and lowlands of the Nandu River system, (3) northwestern hills and lowlands (NW), including the northwestern lowland and catchment area of the Beimen and the Zhubi Rivers, (4) central mountain area (C), comprised of the catchment area upstream of the Nandu River of west-central Hainan Island, (5) southeastern hills and plains (SE), including the Wanquan and the Longto Rivers, (6) southern mountains and hills (S), mainly being the catchment area of the Lingshui River, (7) southwestern mountains and lowlands (SW), including the Ninyuan and the Wanlou Rivers and the watershed upstream of the Changhua River, and (8) western lowlands (W), including the Jianfengshui River and several lowland rivers on the west side of Hainan Island. The main difference between our phylogeographic ranges and He and Zhang's climate regions is that their division did not make allowances for the continuity of the river systems. However, the connections made by rivers are important interaction routes for freshwater animals. In this consideration, most river systems are catalogued separately except for areas upstream areas of the Nandu River (range C) and the Changhua River (range SW).
It is noteworthy that populations in upstream sections of the Nandu River formed a unique and highly supported phylogenetic subclade (Ib) in the central mountain area, the highest portion of the island. However, an individual identified with the haplotype characteristic of range C was found at a distant site within the upstream Nandu River system (range NE; also site 23 in Fig 1B). This implies that upstream Nandu River populations of A. normalis in ranges C and NE once interacted and later were separated by topographical events such as river capture.
Besides, in the upper reach of the Nandu River, a huge reservoir with a water surface area of approximately 144 km 2 and a backwater length of approximately 51 km was formed since 1970. The Songtao Reservoir is the largest body of water in Hainan, with a total reservoir capacity of 3.345 × 10 9 m 3 and covers 0.17 percent of this island [79]. In this study, A. normalis populations of sites 19-25 were separated by Songtao Reservoir into two parts: sites 19-22 in the upstream of Songtao Reservoir, and sites 23-25 in the downstream of it. Noteworthy, the genetic data were in accordance with this distribution pattern. The populations of sites 19-22 kept the haplotypes h21, h22, h23, h24 and h25, which belonged to Subclade Ib. On the other hand, the populations of sites 23-25 kept haplotypes h4, h6, h26, h27, h28 and h29, in which all belonged to Clade I but h27 (subclade Ib). A huge lentic system such as Songtao Reservoir can be an efficiency barrier for small-sized non-pelagic freshwater fish like A. normalis.
Subclade Ib is currently limited mainly to the west-central mountain area and shows clear differentiation between nearby populations in ranges NW and SW. Upstream Changhua River (range SW, haplotypes h8, h10 and h11) populations show a close relationship to individuals collected from small rivers (i.e., the Ninyuan and Wanlou Rivers with haplotypes h7 and h8, respectively) in the southwestern seashore lowlands that run south-westward into the Southern China Sea. The specimens collected from these three rivers are grouped as a distinct monophyletic Clade III group (see Fig 2). There is no connection between the Changhua River and those small rivers presently, so the existence of Clade III provides powerful evidence that upstream reaches of the Changhua River once ran southwestward into the catchment area of the Ninyuan and Wanlou Rivers. A river capture event must have subsequently occurred to reroute the upstream portions of the Changhua River to flow northward as the Changhua River of today. Furthermore, the populations (those of Changhua, Ninyuan and Wanlou Rivers) in range SW are also grouped together by SAMOVA analysis (S3 Fig). They are possibly different from other populations and show a unique phylogeographic range on Hainan Island. However, because our inferences were based on mtDNA only, nuclear genome data should be examined for further information on population relationships and history.
It is widely accepted that the ichthyological fauna of Hainan Island show close relationships with that of the Pearl River System [17,80,81]. The isolation effect of the Qiongzhou Strait is not apparent based on current cytb sequence data. This study's two eastern-most collection sites on the mainland (A1 and A2) form Clade II, a sister clade of Clade I, and site A3 of the Pearl River also shows the distinct Subclade Ic. These populations were probably isolated by distance or other alternatives such as isolation by environment (the A2 and A3; for they are located in the large Pearl River System, same as A4, A5 and A6) or by vicariance (A1) from their ancestor population. The other populations of southern mainland China were all identified as members of Clade I and showed no obvious differentiation from parts of Hainan Island populations.

Distribution processes of Aphyocypris normalis in Hainan and southern China with a comment on Aphyocypris arcus in Hainan
Hainan Island was formed by shifts in the location of regional land masses due to plate tectonics and subsequently separated by rising sea levels, and this island was connected to southern mainland China and northern Vietnam again by land bridge formation during the Last Glacial Maximum (LGM) around 19-26.5 kya (thousand years ago) while the sea level was around 80-100 m below the present day [82,83]. With the advent of the Holocene warm period, Hainan again became isolated once more [84], with the Qiongzhou Strait having a minimum width of 20-40 km. These geographic events provided opportunities for animals to migrate across the strait and become isolated.
According to our divergence time estimation of the major clades of A. normalis, the order of phylogenetic isolation events began with (1) Clade III separated from all other populations due to separation within ancient river systems running southwestward at about 1.05 Ma, (2) Clade II separated from Clade I due to isolation by vicariance or by distance at about 0.83 Ma, followed by the formation of three subclades of Clade I, i.e., (3) Subclade Ia separated from other Clade I when its range was limited to the mountainous area of central Hainan at about 0.33 Ma, (4) Subclade Ic in site A3 separated from remaining Clade I due to isolation by distance (A3 is located in the large Pearl River System, same as A2, A4, A5 and A6 in which no efficient geographic barrier is shown between them) at about 0.24 Ma, and (5) populations of Subclade Ia separated from the remaining Clade I populations by being limited to rivers that run northwestward in Hainan. The common ancestor of A. normalis was presumably distributed widely in the lowlands of southern mainland China and Hainan Island. The major clades within A. normalis have diverged before the temporary land bridge that existed across the strait during the LGM. Moreover, with the stream capture and flood events [10,25,30], the river systems within Hainan Island did not provide efficient isolation mechanisms similar to the strait, as there are shared haplotypes among geographically distant/ isolated river systems (e.g., haplotype h4 in the Wanquan River, sites 4 and 5; the Nandu River, site 23; and the Xi River, site A5. See Table 1). This research demonstrates that apparent vicariance events do not necessarily explain interregional population differentiation even in freshwater fish.
A unique phylogenetic group (Clade III) composed of populations of southwest Hainan was detected in this study (S2 Fig). It is possible that this group shares a close phylogenetic relationship with the populations in the Indochina Peninsula if A. normalis occurred there [19]. It would be helpful to analyze specimens collected from the Indochina Peninsula in future, both to clarify the migratory history of this species and to better understand the natural history of terrestrial fauna near the South China Sea.
Aphyocypris normalis and A. arcus are very similar, both in morphology [85,86] and phylogeny [48]. The taxonomic confusion between these two species has been the subject of ichthyologists' debate for decades. Du et al. [87] described the differences between A. normalis and A. arcus by comparing the morphology and anatomy of specimens collected from both southern mainland China and Hainan Island. They provided a morphological key for identifying these two species and claimed that they have sympatric distributions on Hainan Island. However, we did not detect any haplotype belonging to A. arcus from our Hainan Island specimens. We suggest a high possibility that A. arcus does not naturally occur on Hainan Island based on the DNA analysis result in this study. More comprehensive comparative studies of A. arcus both on DNA and morphology are therefore necessary in the future.

Comment on the genetic diversity of Aphyocypris moltrechti in Taiwan
We used Aphyocypris moltrechti as the comparison outgroup in this study. We used 22 museum specimens from three rivers in central Taiwan, including two from the Dajia River (site B1 in Fig 1, collected in 2003), ten from the Wu River (B2, 2008), and ten from the Zhoushui River (B3, 2011), which latter were assigned to a newly described species, Aphyocypris amnis Liao, Kullander and Lin [71]. Only four haplotypes were identified from 22 sequences that were 1,091 bp long and only one haplotype was found in the Dajia and Wu Rivers' population ( Table 1). The K2P distances among haplotypes of Aphyocypris moltrechti (only the specimens from the Dajia R. and Wu R.) and A. amnis (the specimens from the Zhoushui R.) were estimated to be 0.0018-0.0065 (2-7 bp variation) and 0.0035 on average within all samples of these two groups (n = 22). For the specimens collected from the Zhoushui River, two haplotypes were identified that had 5 bp of variation between them (K2P distances = 0.0046) and 0.00215 within the population (n = 10). These results disclosed small nucleotide diversity among A. moltrechti specimens used in this study and are similar to another report that focused on population structure using complete cytb sequences (1,140 bp) and microsatellite DNA data [88], which revealed extremely low levels of intra-population polymorphism for the four populations in three rivers (same as this study) of A. moltrechti on nucleotide diversity of mtDNA analyses, though they concluded that the Zhoushui River population (assigned as a new species A. amnis later in the same year 2011) possesses unique haplotypes and should be handled as a management unit different from other populations. Comparing with the cytb genetic diversities among A. moltrechti populations in this study, the variation between A. moltrechti and A. amnis is comparatively small. The validity of A. amnis should be reconsidered carefully due to its comparatively low genetic divergence in comparison to A. moltrechti populations, as well as their ambiguous morphologic difference. Since the mtDNA is of maternal origin, the conclusion that comes with using only mitochondrial data may differ from phylogenies at the species level or higher. We suggest that more appropriate nuclear markers, or reduced representation genome-wide markers should be used to evaluate the validity of A. amnis in the future.
Supporting information S1 Table. F