Phylogeography of the termite Macrotermes gilvus and insight into ancient dispersal corridors in Pleistocene Southeast Asia

Dispersal of soil-dwelling organisms via the repeatedly exposed Sunda shelf through much of the Pleistocene in Southeast Asia has not been studied extensively, especially for invertebrates. Here we investigated the phylogeography of an endemic termite species, Macrotermes gilvus (Hagen), to elucidate the spatiotemporal dynamics of dispersal routes of terrestrial fauna in Pleistocene Southeast Asia. We sampled 213 termite colonies from 66 localities throughout the region. Independently inherited microsatellites and mtDNA markers were used to infer the phylogeographic framework of M. gilvus. Discrete phylogeographic analysis and molecular dating based on fossil calibration were used to infer the dynamics of M. gilvus dispersal in time and space across Southeast Asia. We found that the termite dispersal events were consistently dated within the Pleistocene time frame. The dispersal pattern was multidirectional, radiating eastwards and southwards out of Indochina, which was identified as the origin for dispersal events. We found no direct dispersal events between Sumatra and Borneo despite the presence of a terrestrial connection between them during the Pleistocene. Instead, central Java served as an important link allowing termite colonies to be established in Borneo and Sumatra. Our findings support the hypothesis of a north-south dispersal corridor in Southeast Asia and suggest the presence of alternative dispersal routes across Sundaland during the Pleistocene. For the first time, we also propose that a west-east dispersal through over-water rafting likely occurred across the Pleistocene South China Sea. We found at least two independent entry routes for terrestrial species to infiltrate Sumatra and Borneo at different times.


Introduction
Southeast Asia (SEA) harbours 20-25% of the planet's plant and animal species and is thus a highly biodiverse region [1][2]. SEA today is relatively small and fragmented compared to the past few million years. During the Pliocene-Pleistocene glacial periods, the land area was and Java Seas through the Malacca Straits and a broader corridor expanding from the eastern slope of the Sumatran highlands well into the interior of western Borneo. On contrary, Slik et al. [16] found no evidence for a savanna corridor between Borneo and Sumatra, but show a coarse soil barrier in the then-exposed sea floor. Identifying such corridors and understanding their role as dispersal routes is therefore important for understanding the causes and development of modern biogeographical patterns in the region, including early humans' ability to exploit such corridors for dispersal.
Here we reconstruct the potential dispersal corridors for an extant subterranean fungusgrowing termite species, Macrotermes gilvus (Hagen). Termites are ideal subjects for biogeography and evolutionary studies because they are easily sampled, feed and live in a broad range of landscapes and are sensitive to environmental disturbances. Furthermore, colony maturation is slow and dispersal of reproductive alates is short ranged [10]. M. gilvus was particularly suitable for this study for several reasons: 1) it has a vast distribution in the lowlands of tropical SEA and can be found as far west as India [23]. Recent findings by Bourguignon et al. [24] show that Asian Macrotermes arose around 20 Ma. The location and timing of divergence of this genus in SEA is well suited to investigate the effect of past geological events in shaping the distribution of the local taxa; 2) dispersal flight range is short and establishment of new colonies depends primarily on soil structure and the availability of the obligate mutualistic fungal symbiont, Termitomyces sp.. Therefore, the genetic signature of this species should reflect historical patterns rather than contemporary gene flow between different regions of SEA; and 3) M. gilvus can inhabit both lowland forest habitat and the soil in open vegetation environments or urban settings, such as parks, underneath trees with an open-canopy along roadsides, in areas where vegetation is dominated by grasses, or in agricultural sites such as paddy field terraces and oil palm plantations formerly covered by rainforest [25][26][27][28][29]. Therefore, a wide range of vegetated Pleistocene land connection is suitable for M. gilvus dispersal, making it a suitable candidate for defining the extent of early dispersal corridors in pre-historic SEA. In this study, we used both nuclear and mitochondrial DNA markers to infer the evolutionary history of, and dispersal pathways, for the species. The genetic patterns observed allow inferences to be drawn regarding the palaeoenvironmental history and spatiotemporal dynamics of ancient dispersal routes of the region.

Sampling strategy
We used two independently inherited molecular markers (mtDNA and microsatellites) to accurately reflect population history. Two hundred and thirteen M. gilvus colonies were sampled in the field from 66 locations around SEA of various habitat types mainly from the urban environment and agricultural sites (see Fig 1, S1 and S2 Tables). Samples from 160 colonies were used for mtDNA analysis and all 213 colonies were used for microsatellite analysis (S1 and S2 Tables). M. gilvus colonies are founded by a pair of mating reproductives, and therefore we regarded each colony as a single sampling unit. Only one individual per colony was used for mtDNA and microsatellite analyses. The field studies did not involve endangered or protected species and no specific permissions were required for field studies done in public parks and roadsides. For agricultural sites, the owner of the land has given permission to collect samples for the study.
was polymerase chain reaction (PCR) amplified at two mitochondrial DNA regions: the partial cytochrome c oxidase subunit II (COII) and the large ribosomal subunit RNA (16S rRNA) genes. The PCR conditions and primer information can be found in S1 Appendix.
Phylogenetic relationships among mtDNA haplotypes were constructed using three different approaches: maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI). MP analysis was performed using the tree-bisection-reconnection branch-swapping algorithm and 1,000 random taxon addition replicates under a heuristic search in PAUP Ã [30], saving no more than 100 equally parsimonious trees per replicate. Nonparametric bootstrap values for branch support were assessed using 100 bootstrap pseudo-replicates with 10 random taxon additions in each replicate. ML analyses were conducted in GARLI 2.0 [31] with two runs of 100 replicates; each was executed in the best-fit substitution model as determined in MOD-ELTEST 3.7 [32] (S3 Table) and incorporated the best partitioning scheme selected based on Akaike's information criterion using different subset rates and 5,000 nonparametric bootstrap iterations (S4 Table). To mitigate any possible effect of over-parameterization in data partitions, an alternative ML analysis was performed in RAXMLGUI 1.3 [33], which incorporated a more generalized GTR + G substitution model with rate heterogeneity across all data partitions using the "ML + thorough bootstrap" option with 10,000 bootstrap replicates under the rapid hill-climbing algorithm. Similarly, BI was conducted in MRBAYES 3.1 [34] and incorporated the best partitioning scheme based on the bayes factor analysis (S5 Table).
Two independent runs starting from different random trees were performed using partition-specific rates, and four MCMC chains were run for 25 million generations with sampling frequency every 1000 generations (burn-in = 6250). Convergence was assessed by observing when the standard deviation of split frequencies fell below 0.01. Gaps were treated as missing data. Non-fungus-growing termites' mtDNA sequences from Genbank (Drepanotermes sp. (JX144938.1), Macrognathotermes errator (JX144939.1), and Nasutitermes triodiae (JX144940.1)) were used as outgroups in all phylogenetic trees.
Alternative phylogenetic hypotheses for evaluating the monophyly of clades in the phylogenetic tree using constrained tree topologies were tested by calculating the probabilities of tree topologies using the approximately unbiased (AU), Kisino-Hasegawa (KH), weighted KH (WKH), Shimoidara-Hasegawa (SH), and weighted SH (WSH) tests that were determined from the multi-scale bootstrap in CONSEL [35]. A P-value of < 0.05 was interpreted as strong evidence for rejecting the phylogenetic hypothesis.
A statistical parsimony network was reconstructed using the software TCS 1.21 [36], which links haplotypes through a series of evolutionary steps based on algorithms that estimate the 95% reconnection limit between haplotypes. This topology was compared with a minimum spanning network produced by the median-joining method in NETWORK 4.6.1.1 [37]. We made such comparisons so that any conflicting results could be identified and subsequently interpreted with caution. Additionally, pairwise F ST between pairs of populations was compared in ARLEQUIN 2.0 [38] using 10,000 permutations at α = 0.05.
Phylogeographic reconstruction of the spread of M. gilvus through space and time was conducted using the Bayesian MCMC approach in the BEAST 2.0 software package [39] that includes a discretized spatial diffusion model that can quantify the rate at which termite lineages moved between sampled locations [40]. Analyses were carried out using an uncorrelated lognormal relaxed molecular clock model that allows rates to vary among branches. Three sets of analyses were conducted using an insect mitochondrial substitution rate with an upper limit of 2.28% Myr -1 and a lower limit of 0.9% Myr -1 as well as a standard invertebrate mitochondrial divergence rate of 1.5% Myr -1 as the rough median; a similar approach according to Schutze et al [41]. Two runs of 30 million generations, each incorporating the GTR + G nucleotide substitution model, were run for each analysis. The resulting geotree files were combined and annotated with location states in TREEANNOTATOR 1.6.1 [39] before being converted into a keyhole markup language (KML) file using SPREAD [42] and subsequently visualized as animation in the free web service GOOGLE EARTH [43] The software TRACER 1.6 [44] was used to monitor support for stationary and efficient mixing, as diagnosed using effective sample size statistics. The discrete phylogeographic analysis required that each sequence be assigned a specific character state based on its geographic origin. In this study, we considered movement among major countries and regions (15 character states: Thailand, Vietnam, the Philippines, the Malay Peninsula, Singapore, North Sumatra, Riau, West Sumatra, West Java, Central Java, East Java, Madura, Southwest Borneo (B1), Northwest Borneo (B2), and East Kalimantan). Where more than two locations were grouped, the latitude and longitude of the centroid of the polygon defined by them was used.
Divergence dating using fossil calibrations from an inter-species termite phylogeny was conducted separately to attain an alternative resolution of divergence times. A set of 101 termite taxa containing the~681 bp COII gene haplotypes derived from this study as well as additional sequences obtained from previous studies [45][46] aimed at dating the related fungusgrowing termite group (subfamily Macrotermitinae) was used to construct a starting tree in the software MRBAYES 3.1 [34] using GTR + G nucleotide substitution models. Only the COII gene was used due to lack of complementary sequences from the 16S rRNA gene for other macrotermitines. Divergence time was estimated using the Bayesian relaxed-clock uncorrelated exponential approach as implemented in BEAST 1.7.4 [39] based on fossil calibration procedures from previous published works [45][46] (S1 Fig).

Microsatellite analysis
Multiplex PCR was conducted using 15 novel polymorphic microsatellite loci developed previously for M. gilvus following PCR conditions and fragment analysis as detailed in Veera Singham et al [47]. Phylogenetic inference of microsatellite genotypes was initially deduced without a priori assumptions of population definition based on the genetic distance measures (Dps' and Dkf') generated with the [1-ps/kf] option in MICROSAT [48] that were used to construct a neighbor-joining (NJ) phylogenetic tree with the program NEIGHBOR in the PHYLIP 3.69 package [49]. Additionally, another NJ tree was reconstructed to view relationships between pre-defined populations groups as inferred from MICROSAT [48] using D A distance in POPTREE2 [50] at10,000 bootstrap iterations.
The probability of individual assignments to population clusters (K) without prior information about the origin of individuals was calculated using STRUCTURE 2.3 [51] under the assumption of an admixture model and correlated allele frequency with the lambda parameter set to one, as the user's manual advises; the length of burn-in for MCMC chains was set to 10,000 each. We performed 20 runs for each dataset to determine the amount of variation of the likelihood for each K ranging from one to 15, and the most probable number of groups was determined as described in Evanno et al [52]. DISTRUCT [53] was used for producing the plot for visualization.
Multiple measures of genetic diversity (expected heterozygosity (H E ), observed heterozygosity (H O ), average number of alleles per locus and size range per locus, departures from Hardy-Weinberg equilibrium (HWE), and genotypic linkage disequilibrium) were analysed using ARLEQUIN [38]. Detection of null alleles and scoring bias due to stuttering or large allelic dropouts was performed using MICRO-CHECKER 2.2.3 [54]. The inbreeding coefficient (F IS ) and allelic richness (A R ) as a standardized measure of the number of alleles per locus independent of the sample size based on rarefaction method were assessed using FSTAT 2.9.3. [55]. Assessments of population pairwise comparisons using F ST and sum of squared size differences (R ST ) were derived from ARLEQUIN [38]. Isolation-by-distance analysis was conducted for both mtDNA and microsatellite analyses by regressing the Euclidean distance between the geographic localities and the population's pairwise genetic measure following Rousset's genetic distance [F ST (1 -F ST ) -1 ] for a linear relationship using IBDws 3.15 [56]. Significance was assessed using the Mantel test with 10,000 permutations, and each axis was log-transformed before analysis.

Phylogeographic analysis of mtDNA and microsatellites
The concatenated sequence data for the mtDNA genes (1,116 bp) revealed 50 haplotypes with 98 variable sites from 141 individuals (out of 160) that produced amplifiable gene fragments for both the COII and 16S rRNA genes (S1 Table). Seventy-five of the variables sites were observed in more than one individual and therefore were phylogenetically informative. Transition to transversion bias was observed at a ratio of 5.027, which could be indicative of sequence saturation or recent divergence, among other possibilities [57]. Phylogenetic analyses of the mtDNA haplotypes using MP, two alternative ML estimates, and BI approaches produced congruent tree topologies. Eight major phylogenetic clusters were observed (clades I-VIII) (Fig 2). The Bornean termites (BN) were partitioned geographically into two clades: clade I involving termites from Sematan to Sibu and East Kalimantan located in the lower parts of Borneo and clade VIII involving termites from Miri, Bintulu and Sabah located in the upper parts of Borneo. Sumatran termites (SU) were also segregated into two distinct geographical clusters: termites from North Sumatra that forms an association with the Malay Peninsula (MP) populations in clades III and IV and a Riau-West Sumatra complex that forms an association with the West Java and Borneo populations in clade I. Javanese (JV) termites were clustered in clades I and VII, with no obvious pattern of haplotype segregation among localities within Java. Haplotypes from Thailand (TH) and the Philippines (PP) formed monophyletic groups in clades II and VI, respectively, although haplotypes TH1 and PP2 were unresolved. Positions of the Vietnamese haplotypes (VT) were poorly resolved and formed a paraphyletic assemblage on the phylogenetic tree.
Subsequently, we performed alternative phylogenetic hypothesis testing to validate the deep geographical separation observed among the Bornean and Sumatran termites, respectively. The statistical parameters AU, KH, SH, WKH, and WSH significantly rejected the monophyly of Sumatra and Borneo (i.e., alternative scenario, S6 Table) and thus showed support for the inferred phylogenetic split shown in Fig 2. A similar approach was used to test on whether the Vietnamese haplotypes could alternatively form a monophyletic group. All tests parameters failed to reject the hypothesis of Vietnam being monophyletic (S6 Table). This result suggests that an evolutionary relationship between haplotypes from Vietnam and Java, as observed in clade VII, may be present, but the relationship is poorly resolved. The discrepancy for the monophyly of Vietnam may also arise due to small sample size and the ambiguous position of haplotype VT5, which forms a sister group to all other M. gilvus (we noticed a similar condition in the divergence dating analysis; see S1 Fig). To avoid any spurious results due to the ambiguous status of VT5, we excluded this sample from subsequent analyses.
The median-joining minimum spanning network of the termite mtDNA sequences displayed a distinct phylogeographic structure, with most haplotypes being regionally specific (Fig 3). Haplotype sharing was rare and restricted to geographically proximate localities, as demonstrated by haplotypes MP1/SG and MP3/SU that displayed close affinity between the Malay Peninsula with Singapore and Sumatra, respectively (see Fig 3). All Singapore haplotypes were derived from the Malay Peninsula lineage, as illustrated by the starburst pattern of haplotype MP1/SG. The absence of haplotypes with broad geographical distribution reflects a restricted gene flow across the geographical regions. A distinct population subdivision was observed within Borneo and Sumatra, which corroborated findings from the phylogenetic analysis. Haplotypes BN1-BN3 from Southwest Sarawak and haplotype BN7 from East Kalimantan shared a similar evolutionary pathway, with West Java's JV2 being the centrum connecting these haplotypes. On the other hand, haplotypes BN4 and BN5 from Northwest Sarawak and BN6 from Sabah formed another distinct evolutionary group 29 mutation steps away from the former group. To ease interpretation in all subsequent analyses, we designated Southwest Borneo as B1 for the former group and Northwest Borneo as B2 (see Fig 1). In Sumatra, distinct population subdivision was observed between haplotypes from North Sumatra derived from the Malay Peninsula and haplotypes of the Riau-West Sumatra complex originating from West Java (Fig 3). An alternative statistical parsimony network generated using TCS 1.21 [36] resulted in a similar topology, albeit with minor differences. The exclusion of  S1 Table for haplotype details on tip labels). The bootstrap values and posterior probabilities of the nodes were stated in the order of maximum parsimony, Bayesian, GARLI [31], and RAXML [33]. Only nodes with values > 50% are shown. Roman numerals represent geographical clades inferred based on significant nodal support [58][59]. haplotypes TH3, TH4, and TH5 from northern Thailand because the 95% confidence level exceeded the parsimony connection limit of 14 as determined by the TCS program [36] (dashed box, Fig 3) and the presence of homoplasmy (red triangle, Fig 3) suggest that these haplotypes could be old or ancestral and had been long separated from the rest of the region.
A composite genotype from 15 M. gilvus specific microsatellite loci was obtained from 207 termite individuals. East Kalimantan (n = 2), West Java (n = 3), and one sample (V7) from Vietnam were excluded due to small sample size or poor amplification quality. A NJ analysis of individual termite genotypes based on the proportion of shared allele (Dps') and kinship coefficient (Dkf') genetic distances produced concordant topologies that support the population geographic subdivision observed in the mtDNA analysis (Fig 4). Termites from the Malay Peninsula were clustered together with samples from Singapore and North Sumatra (100% bootstrap support). The remaining M. gilvus genotypes were partitioned into six clearly differentiated clusters (Fig 4). Borneo samples were split into B1 (Southwest Borneo) and B2 (Northwest Borneo) clusters that were identical to those of the mtDNA analysis, except for the exclusion of the East Kalimantan sample from B1 and with the Philippines being a sister group to B2 (78% bootstrap support). The Indochinese subcontinent was separated into a single group of termites from Thailand (except sample no. 16) and an independent assemblage of the Vietnam samples. The remaining samples from Riau and West Sumatra collectively formed a single cluster, and all of the Javanese samples were clustered into another group (78% bootstrap support).  Genetic variation and molecular diversity Quantitative estimates of mtDNA diversity recorded a wide range of genetic variation within each termite population, with an overall haplotype diversity (H d ) of 0.31 (Table 1). Several regions showed extremely low or high genetic diversity. The Indochinese populations (Thailand and Vietnam) had the highest H d index and thus were most likely ancestral, whereas the Sumatran termites were the least genetically diverse and therefore may have been recently introduced or experienced ancestral genetic bottleneck. Similarly, the Bornean termites displayed an unexpectedly low H d of 0.27 given the extensive sampling and effective sample size (Fig 1). Overall, genetic diversity assessment consistently showed that Thailand and Central Java harbored the highest genetic variation across various diversity measures (see Table 1).  [48]. The NJ tree based on the Dkf' distance produced concordant topologies. Branch labels are individual sampling numbers (see S1 and S2 Tables). The bootstrap value for nodal support is indicated at the branches (only > 50% is shown) as inferred based on the NJ tree reconstruction in POPTREE2 [50]. https://doi.org/10.1371/journal.pone.0186690.g004

Phylogeography of Macrotermes gilvus and dispersal corridors in Southeast Asia
Additionally, a marked increase in the sequence divergence among individuals within Sumatra and Borneo (π > 1.0%) followed by a substantial gain in θ S values (bold faced, Table 1) validates the observed population subdivision within these regions. The neutrality test displayed no significant departures from expectations in any population at α = 0.01 (Table 1).
Similar to the mtDNA results, overall microsatellite diversity was highest in Thailand with an average H E of 0.726 and allelic richness of 4.88 followed by populations from Java and Vietnam ( Table 2). Eleven of the 13 geographical groups showed population-specific alleles that tended to represent the extreme sizes of allele distributions (S8 Table). Of the 42 private alleles, 17 were either the smallest or the largest size class among all groups, and 24 were either the smallest or largest for a specific population, suggesting a recent derivation. The number of alleles per locus across all populations ranged from 8 to 21 alleles, indicating a modest level of polymorphism (S8 Table). Only 5 of the 13 populations displayed significant departures from HWE at several microsatellite loci (Table 2). This deviation was most substantial in the Malay Peninsula and B1 populations. This was associated with a significant heterozygosity deficit, which could be a result of considerable inbreeding within these populations (F IS = 0.318 in B1;  (Table 2). These results indicate a close genetic association among these regions, which supports the results of the mtDNA analyses. Tests of null alleles and linkage disequilibrium (LD) displayed an inconsistent pattern and often associated with significant departures from HWE within the respective population. This suggests that either the presence of null alleles or a significant LD could be the result of local sampling effect in the respective populations rather than non-random association of alleles between the loci. There were no scoring errors due to stutters or large allelic drop out detected from this analysis. All populations in this study displayed a relatively high Garza-Williamson index (P > 0.80), indicating that the populations were not experiencing recent bottleneck effect ( Table 2).

Genetic differentiation
Pairwise genetic differentiation between population pairs broadly supported the observed phylogeographic structure of M. gilvus across SEA. The global F ST index (weighted average of population specific F ST values) was remarkably high at 0.831 (P < 0.0001), indicating profound genetic differentiation among sites. Unexpectedly, geographically distant population pairs such as Thailand and the Philippines as well as Vietnam and Java displayed close genetic association with no significant genetic differentiation (F ST bold faced, S7 Table). In contrast, the geographically close B1 and B2 populations of Borneo displayed significant genetic differentiation (F ST = 0.950; P < 0.0001) despite the absence of any obvious geophysical barriers separating them. Similarly, North Sumatran termites were genetically isolated from the Riau and

Phylogeography of Macrotermes gilvus and dispersal corridors in Southeast Asia
West Sumatran termites, with a high F ST > 0.930; P < 0.0001. These conditions further suggest that the genetic association between sites was not restricted to contemporary processes but rather was influenced by past complex historical processes that took place in SEA. For microsatellites, all pairwise F ST comparisons also revealed significant genetic differentiation except for three geographically proximate pairs: West Sumatra-Riau, East Java-Central Java, and East Java-Madura (S9 Table). Genetic differentiation was most substantial between population pairs from Borneo and Sumatra suggesting restricted contemporary gene flow between these regions. In fact, the Bornean termites were markedly differentiated from the rest of the mainland SEA populations (R ST > 0.60 among all pairs). Similar to the mtDNA findings, unexpectedly high pairwise F ST /R ST estimates were found among population subdivisions within Borneo and Sumatra (S9 Table). The Mantel test revealed a significant isolationby-distance pattern between geographic distance and genetic distance for both mtDNA (r = 0.371, P < 0.05) and microsatellite analysis (r = 0.473, P < 0.001).

Genetic structure
An assignment test based on the Bayesian clustering of multilocus microsatellite genotypic data as implemented in the program STRUCTURE [51] supported the partitioning of the termite individuals into seven distinct clusters (ΔK = 7, Fig 5) as observed from the NJ tree with one exception. In this analysis, the termites from the Philippines were clustered with the Thailand group instead of forming an association with B2. To validate the clustering accuracy, the genotypic data from each termite were assigned back into the respective clusters as determined by STRUCTURE [51]. We found that 99.6% of the total individuals were correctly assigned to the respective clusters with MIGPRIOR option set at 0.01. Only one individual sample from Vietnam (colony V6) showed significant probability of being a migrant (P < 0.001) that may have had recent ancestry in the Thailand-Philippines cluster; the probabilities were 0.005 and 0.769 that the individual came directly from this group or had a single 'parent' from this population, respectively.

Spatiotemporal dynamics of M. gilvus dispersal in SEA
Ancestral reconstruction through discrete phylogeographic analysis in BEAST [39] revealed a pattern of historical migration of M. gilvus among study sites across SEA. Thailand was supported as the root location for all lineages across SEA with the highest root probability at 95% Ma) based on the 0.9% Myr -1 , 1.5% Myr -1 , and 2.28% Myr -1 of mtDNA substitution rates, respectively. In the absence of independent criteria to select between the different time estimates, the age estimate using the 1.5% Myr -1 divergence rate was selected as a rough median to illustrate the time dynamics of M. gilvus divergence, but the estimates obtained with the other calibration ages are also shown (Fig 6).
Under this scenario, the first migration event involved a bidirectional movement of termites eastwards to Vietnam and southwards to central Java, where their respective ancestral populations were founded around 0.67 Ma (0.33-1.04 Ma) (Fig 6B). Eastward migration extended further until it reached the modern Philippines around 0.53 Ma (0.34-0.72 Ma) ( Fig  6C). A single colonization event from Vietnam into Malaysia took place around 0.31 Ma (0.16-0.49 Ma) and was followed closely by a migration event from central Java into B1 in Borneo at 0.28 Ma (0.14-0.43 Ma) (Fig 6D). A West Java lineage originated from B1 around 0.14 Ma (0.05-0.23 Ma) (Fig 6E), and subsequently the B2 lineage was established independently much later around 0.06 Ma (0.01-0.12 Ma) from the Philippines. The North Sumatran termites derived from the Malay Peninsula lineage following two migration events between 0.04 Ma (0.005-0.09 Ma) and 0.02 Ma (0.0004-0.05 Ma) (Fig 6F). Concurrently, another independent lineage of Sumatran termites originating from West Java emerged at Riau and subsequently in the adjacent West Sumatra about 0.05 Ma (0.007--0.11 Ma) (Fig 6F). The termites from central Java eventually spread out to establish the East Java and Madura lineages around

Discussion
The patterns of spatial structure resolved from mtDNA and microsatellite data were mostly consistent in displaying a highly structured M. gilvus populations in SEA. Eight phylogeographic groups were identified in this study based on phylogenetic relationships, genetic structure, and molecular diversity indices at both markers: Thailand, Vietnam, the Malay Peninsula/ Singapore/North Sumatra, Riau/West Sumatra, Java, Southwest Borneo (B1), Northwest Borneo (B2), and the Philippines. The global F ST /R ST value at 0.34/0.64 was highly significant (P < 0.0001), signaling a substantive genetic differentiation among the populations as a whole, and most of the pairwise comparisons displayed a pronounced level of genetic differences (> 0.25). Recent gene flow and haplotype sharing between the phylogeographic groups were rare, which corroborates this finding. Nevertheless, the genetic similarity and haplotype sharing between North Sumatra and the Malay Peninsula (MP3/SU) suggests possibilities for retention of ancestral traits and/or recent gene sharing across these two populations.
Regression of genetic distance against geographical distance revealed a significant positive isolation-by-distance pattern for both mtDNA and microsatellite data. This indicates that the  [39]. The different panels represent temporal projections of the reconstructed migration events that have occurred up to present day in chronological order (a-f). Link heights indicate relative durations of the branches upon which the inferred migration occurred, and the red-black colour gradient illustrates relative age of transition (older-recent). Circles represent ancestral population size, and the deep-faint blue colour gradient represents age (older-recent). Age estimates of the migration events for each projection are given in the order of mtDNA age calibrations from top to bottom: 0.9% Myr -1 , 1.5% Myr -1 , and 2.28% Myr -1 . To comply with PLoS ONE submission requirements for figures under the CC BY license, a present day South China Sea may in fact act as an effective barrier to dispersal among the insular termites (i.e., Borneo, Sumatra, Java, and the Philippines) and the termites from mainland SEA. However, the correlation coefficient values from both molecular markers were weak (r 2 < 0.22), suggesting that the genetic differentiation between populations was not a simple function of genetic isolation-by-distance but rather a complex interaction of other historical processes followed by an isolation mechanism. This is plausible given the complexity of geological perturbations taking place surrounding SEA in the past, including repeated sea level fluctuations [4], considerable tectonic activity [60], and associated climatic and vegetation changes [11,20]. The absence of a recent bottleneck event and an overall high average H E from microsatellite data along with neutral expectation following mtDNA's Tajima's D and Fu's Fs tests indicate that each studied population is demographically stable, mixing randomly, and potentially in mutation-drift equilibrium. The average low inbreeding coefficient in all populations corroborates the occurrence of random mating and potentially reflects a substantial effective population size.
Dating phylogenies remains an uncertain process, as estimated divergence times vary with DNA regions, calibration procedures, and fossil calibration points. Nevertheless, the credibility intervals of the different time estimates used in the present study often overlapped, and the dispersal events were consistently placed within the Pleistocene time frame. This result is in line with the belief that climatic fluctuations of the Pleistocene were a vital driving force for genetic divergence and speciation [61]. Our finding based on the discrete phylogeographic analysis has provided novel insight on the dispersal framework of early faunal migration associated with open vegetation in Pleistocene SEA. The dispersal pattern was multidirectional, and Thailand was predicted as the ancestral root of M. gilvus radiation in SEA. This finding is congruent with previous studies on a northern Southeast Asian origin of tephritid fruit fly in this region [41,62,63].
We identified at least three major range expansions of M. gilvus from this study: 1) 1.09-0.42 Ma (early Pleistocene)-migration of termites due east and south from Indochina to establish the Java and the Philippines lineages, respectively, 2) 0.45-0.17 Ma (mid Pleistocene)-establishment of the Malay Peninsula and Southwest Borneo lineages from Vietnam and Central Java, respectively, and 3) 0.06-0.02 Ma (late Pleistocene)-independent colonization of Sumatra from the Malay Peninsula and West Java, respectively. The establishment of the Javanese lineage from Thailand (Fig 6B) provides important evidence for the presence of an open corridor connecting Indochina to Java during the Pleistocene. This finding agrees in part with the presence of the north-south savanna corridor proposed by Heaney [9] (see Fig 1). Following Heaney's proposal, the migration of savanna-dependent species would have commenced in Indochina and progressed through the Malay Peninsula before reaching insular SEA. Spatiotemporal patterns from our study, however, showed that an inland dispersal route developed first, long before it expanded into the eastern margin of the Malay Peninsula. The invasion of M. gilvus from Vietnam into the Malay Peninsula approximately 0.31 Ma (95% HPD: 0.16-0.49), after its dispersal to Java 0.67 Ma (95% HPD: 0.33-1.04) (Fig 6D), illustrates this scenario. This mode of dispersal corresponds to the presence of the oldest hominin fossil in the central Java region at ca. 1.6-1.02 Ma [19] before other insular habitats of SEA, which supports the existence of a dispersal corridor that allowed the migration of terrestrial fauna (including Homo erectus) from Asia through Sundaland into Java well into the Pleistocene [9,11]. For the first time, we hypothesize the presence of a west-east dispersal corridor stretching from mainland SEA across Sundaland towards the Philippines during the Pleistocene. Today, the northern boundary of Sundaland is hard to delineate because the region is now mostly submerged [11], which makes it difficult to determine the type and extent of vegetation (if any) necessary for the west-east migration of terrestrial species from mainland SEA. However, discrete phylogeographic analysis provided evidence for an effective termite migration event taking place from Indochina to establish new colonies in the east (Fig 6C). In addition, the microsatellite genotypic data from STRUCTURE [51] analysis revealed a significant close association between the Philippines and Thailand despite physical isolation by the present-day South China Sea (Fig 5). The M. gilvus population from the Philippines is genetically similar to the Thailand and Vietnam populations as compared to other regions (even Borneo, which is geographically closer to the Philippines; F-statistics, S7 Table). In fact, the three populations displayed no significant genetic differentiation from one another.
Schutze et al [41] reported that a similar eastward movement of tephritid fruit flies from Thailand to the modern Philippines took place approximately 360 ka, which further supports the existence of an effective dispersal across the South China Sea. Although natural dispersal mode of M. gilvus does not favour transoceanic dispersal, long-distance dispersal of terrestrial organisms via rafting on vegetation mats is a possibility [64]. Besides M. gilvus, there are many other soil-dependent termite taxa in Sundaland including the fungus feeding genus Odontotermes and soil feeders like Pericapritermes, Discupiditermes, Coxocapritermes as well as many nasute genera [65,10]. These rainforest taxa probably had even more restrictive over-water dispersal requirements than M. gilvus, however, various species from the genera Odontotermes, Pericapritermes, Bulbitermes, Grallotermes, Hospitalitermes, and Lacessitermes are common to the Philippines. Their presence necessitates over water rafting over very narrow Pleistocene seas or possibly larger gaps during interglacial high sea stands. Moreover, Nobre et al. [66] show that the Macrotermitinae (fungus-growing termite family) overcame a wide 300+ km Miocene oceanic gap between Africa and Madagascar. Since evidence for a direct land connection between Indochina and the Philippines is still lacking, under this scenario, over water dispersal mechanism would be most likely explanation for M. gilvus to have dispersed across these two regions.
We found no evidence of effective migration events between Sumatra and Borneo. The high genetic divergence between the two regions is congruent with results of other studies on speciation in neotenic net-winged beetles [67] population structure of the striped snakehead fish [68], and also as displayed by the dispersal pattern of the tephritid fruit fly [41], suggesting that the two regions have restricted dispersal event for selected taxa despite both landmasses were joined numerous times during the Pleistocene due to shallow waters (46m) of the Java sea. Slik et al. [16] show that there was no evidence for a savanna corridor between Borneo and Sumatra, but show evidence for coarse soil barrier of exposed sea floor. The soil condition might be deemed unsuitable for M. gilvus to disperse across these two regions during the Pleistocene. Instead, central Java displayed remarkable habitat stability (i.e., pronounced genetic diversity), serving as a central link to the south of the Equator and radiating out to establish termite lineages in Southwest Borneo, West Sumatra, and East Java. Our findings support the hypothesis of a minimal savanna corridor [11] that would allow dispersal of open vegetation adapted species between Java and Borneo. The data provide no support for the presence of a broader maximal savanna corridor [11] that would allow mixing of populations between Sumatra, Java, and Borneo. A land bridge may have connected the insular environments during the LGP, but it did not persist for long enough and/or the vegetation type did not allow effective migration to take place between Sumatra and Borneo.
Based on the termite dispersal events, we suggest two alternative routes of entry by which the termite could have colonized insular Sumatra as inferred from the discrete phylogeographic analysis. The first route involves a northern incursion from the southwest region of the Malay Peninsula and Singapore across the Malacca Straits anytime between 60 and 20 ka. The timing of the dispersal events suggests that colonization of M. gilvus occurred soon after the catastrophic super-eruption of Toba approximately 73.5 ka in northern Sumatra [69]. Such a post Toba-eruption re-colonization has been previously suggested for several mammals species in Sumatra [70][71][72][73]. The aftermath may have provided a suitable habitable environment to develop in the lowland areas surrounding the caldera. Geological records associated with ash layers attributed to the Toba eruption [74][75] suggest the possibility of an open vegetation phase of landform development related to the LGP in the southwest Malay Peninsula. Similarly, pollen records support the premise that the vegetation during the penultimate glacial maximum (~195-128 ka) near Kuala Lumpur was an open savanna dominated by pines and grasses [76]. The timing and the location of this open vegetation phase fits the inferred dispersal route presented in this study, suggesting that a stretch of savanna corridor across the Malacca Straits would have allowed effective migrations to take place. Our study suggests that the dispersal event existed after the Toba explosion, but it is possible that preexisting termite populations in North Sumatra resulting from an earlier migration event could have been erased by the Toba event.
The second route involves the incursion of termites in the south originating from Western Java, spreading to Riau Province, and subsequently residing in West Sumatra. Despite an ongoing debate about the timing of the emergence of the modern Sunda Straits (see Fig 1) that separate Java and Sumatra (5 th -6 th century AD ( [77][78] or 56-74 ka ( [11,79]), our results show that termite migration took place between Java and Sumatra sometime between 60 and 20 ka (based on all time estimates; Fig 6F). This suggests that a land bridge connecting these areas was present at that time. Alternatively, if the Sunda Straits did exist by this time, transoceanic dispersal would have been required to traverse the two islands. To our knowledge, this is the first time the occurrence of such a migratory route has been proposed based on phylogeographic evidence. On the other hand, the two disjunct M. gilvus populations in Sumatra could have directly resulted from the massive impact of super-eruption of Toba that could have splintered the populations sufficiently to enhance haplotype divergence as both populations share similar timing for colonization event to take place.
An unexpected genetic break was observed between B1 and B2 in Borneo despite the absence of contemporary geophysical barriers separating them. Genetic isolation-by-distance is possible, but we observed contrasting scenario based on the genetic affinity shared between the Malay Peninsula and North Sumatra termites, which are separated by a similar geographical distance as seen between B1 and B2 (~150 km). Alternatively, unfavorable soil condition between the two habitats may play a role as barriers to dispersal but this feature remains to be investigated. It is also important to note that B1 and B2 termites displayed different evolutionary origins, and the time dynamics of their colonization events were hundreds of thousands years apart. This finding suggests multiple points of entry for ancient terrestrial fauna to colonize Borneo at different time points during the Pleistocene glacial phase.
This study has provided insight about the biogeography of SEA and the relative impacts of palaeoenvironmental changes in shaping the distribution and dispersal of local taxa during the Pleistocene glacial periods. Different dispersal events and genetic isolation have shaped the present day biogeography of M. gilvus in this region. Our findings support several existing hypotheses of dispersal corridors in SEA and additionally suggest an alternative dispersal routes for terrestrial fauna to disperse across Sundaland during the Pleistocene. Despite several attempts to simulate the vegetation distribution at the last glacial maximum based on climatic/vegetation modelling (e.g., [80]), evidence of the extent to which a forest or a continuous/broken savanna corridor covered parts of the core of Sundaland remains sparse and conflicting [11]. In any case, our study suggests that effective dispersal events did exist on Sundaland at various points in time enabling dispersal of soil-dependent terrestrial fauna in Pleistocene SEA.
Supporting information S1