Independent allopatric polyploidizations shaped the geographical structure and initial stage of reproductive isolation in an allotetraploid fern, Lepisorus nigripes (Polypodiaceae)

Although polyploidy is pervasive and its evolutionary significance has been recognized, it remains unclear how newly formed polyploid species become established. In particular, the impact of multiple origins on genetic differentiation among populations of a polyploid species and whether lineages of independent origins have different evolutionary potentials remain open questions. We used population genetic and phylogenetic approaches to identify genetic differentiation between lineages with independent origins within an allotetraploid fern, Lepisorus nigripes. A total of 352 individuals from 51 populations were collected throughout the distribution range. To examine the genetic structure, multilocus genotyping, Bayesian population structure analysis, and neighbor-net analysis were carried out using single-copy nuclear genes. Phylogenetic trees were constructed to detect recurrent polyploid origins. Proportions of abortive spores were analysed as the measure of postzygotic reproductive isolation. Two genetically distinct lineages, the East-type and the West-type, were distributed mainly in the eastern and western parts, respectively, of the Japanese archipelago. Phylogenetic analyses indicated independent origins of these types and detected additional independent origins within each type. We also revealed limited genetic recombination between both types, even in their sympatric regions. F1 hybrids between the East- and West-types showed a reduction in fertility. It is likely that the East- and West-types formed independently in the eastern and western parts of Japan, respectively. The limited genetic recombination and reduced fertility of hybrids suggest that the two types are at an incipient stage of speciation. Two polyploid lineages with independent geographic origins could develop reproductive isolation barrier(s).


Introduction
Polyploidy or whole-genome duplication is prevalent in all eukaryotic species, including plants, animals, and fungi, and this process plays an important role in their diversification and evolution [1,2]. Polyploidy is a particularly significant process in the plant kingdom. During the last two decades, researchers have revealed that most lineages of vascular plants, including ferns, have experienced at least one whole-genome duplication event, namely an ancient polyploidization event [3][4][5], respectively. In addition, many contemporary species are of recent polyploid origin [6]. Although the evolutionary significance of polyploidy has been widely recognized, some aspects of its evolutionary consequences remain poorly resolved [7][8][9][10]. In particular, because recently formed polyploid lineages show the reduction in net diversification rate comparing with those that the closely related diploid lineages show, most of them are thought to be evolutionary dead ends [11,12]. Thus, further understanding of how nascent polyploid individuals increase their number and become established as species is needed to evaluate the evolutionary consequences of polyploidy. Because chromosome doubling directly serves as a postzygotic barrier against parental species, polyploid speciation is believed to be an example of non-ecological and sympatric speciation [13]. However, recent studies suggest that polyploid speciation might also require the development of prezygotic barriers to avoid reproductive interference by parental species, which include a transition to selfing and niche differentiation [10,12,[14][15][16].
Another problem that we focus on in this paper is the evolutionary consequences of multiple origins. In the traditional view, each polyploid species was considered to have originated only once (a single origin). This is because polyploidization was regarded as a very rare event [17]. Therefore, Stebbins, who developed central tenets of polyploid evolutionary thinking in the 20 th century, viewed polyploid species as genetically depauperate with limited evolutionary potential [17,18]. Through the advent of molecular approaches, however, it has revealed that many polyploid species originated recurrently from different parental genotypes [19,20]. Multiple origins contribute to an increase in genetic diversity within polyploid species; the introduced genetic diversity is considered to facilitate ecological adaptation under a new genetic/ genomic background and to lead to the evolutionary success of polyploid species [21]. For two decades, however, whether independent lineages with different origins interbreed with each other or have different evolutionary potentials, ultimately forming cryptic species has remained an open question [2,9,22]. Symonds et al. [23] hypothesized that the outcomes of multiple origins in polyploid species range from a single homogeneous species to multiple genetically distinct lineages that ultimately form cryptic species, owing to genetic variations contributed by parent species and the degree of gene flow among lineages with independent origins. In fact, a few studies have indicated no or limited gene flow among populations of independent origins [23,24]. For example, Perrie et al. [24] conducted a population genetic analysis of allooctoploid Asplenium ferns in New Zealand using the AFLP method. They found that allopolyploid lineages with independent origins remained genetically distinct even when sympatric and they called this phenomenon "parallel polyploid speciation". Currently, however, only a few researchers have examined genetic differentiation and gene flow between independently formed polyploid lineages within a species [23][24][25].
Homosporous ferns provide ideal research materials for understanding polyploid speciation because they show a high frequency of polyploid speciation among vascular plants [6]. This may be partly due to the inherent characteristics of the homosporous plant life cycle. In homosporous ferns, a polyploid sporophyte can be formed in a single step through selfing within a hermaphrodite gametophyte from an unreduced spore that is produced by sterile hybrids [26]. Additionally, allozyme analyses revealed that congeneric fern species are

Plant collection
To cover its known distribution range, a total of 352 samples of Lepisorus nigripes were collected from 51 populations (Table 1 and S1 Table). No permits for collections were required because our all collection sites did not belong to nature reserve such as Japanese National forest and the plant material did not involve endangered or protected species. Lepisorus nigripes is morphologically similar to its parental diploid species, L. thunbergianus [28]. Additionally, these two species frequently co-occur and readily hybridize with each other, producing triploid hybrids [31]. To avoid the inclusion of triploid hybrids in our samples, spore morphology was used. Sporangia of the triploid hybrids are highly abortive and hardly produce viable spores. Therefore, we removed individuals with abortive sporangia. For phylogenetic analyses, the two parental species, L. angustus and L. thunbergianus, and closely related species, L. hachijoensis Sa. Kurata Table).   Lepisorus kuratae is an allotetraploid of hybrid origin between L. thunbergianus and an unknown diploid race of L. tosaensis and was described as a new species jointly with L. nigripes [28].

DNA extraction, PCR amplification and nucleotide sequencing
Total genomic DNA was extracted from a silica gel-dried sample using the CTAB method [36]. A chloroplast DNA region, an intergenic spacer between rps4 and trnS (hereafter rps4-trnS), was amplified using the primers rps4_PTER_F and trnS_PTER_R1 [28]. For single-copy nuclear genes, we selected three genes, PgiC (cytosolic phosphoglucose isomerase), GapCp (NAD-dependent glyceraldehyde-3-phosphate dehydrogenase) and pTPI (plastidic triose phosphate isomerase). Intron 15 of PgiC was amplified using the primers reported by  5  3  1  0  1  1  1  0  1 ------44  5  0  2  1  2  0  1  1  2 ------45  5  0  5  0  0  0  3  0  0 ------46 10 Fujiwara et al. [28]. For GapCp, we used a forward primer, GAPCP_ POL_8F, that was newly designed based on the sequence of Polypodium amorphum Suksd. (KJ748228) presented in Sigel et al. [37], and a reverse primer (ESGAPCP11R1) reported in Schuettpelz et al. [38] to amplify a region from intron 8 to intron 10 of the gene. For pTPI, we developed a new primer set to amplify intron 5 of the gene based on the sequence of scaffold-YLJA-2013048-Polypodium_amorphum, which was obtained by a BLASTn search of the 1KP project database using the AT2G21170 gene of Arabidopsis thaliana (L.) Heynh. as a query. Primers used for PCR amplification and sequencing are listed in Table 2. PCR thermocycling conditions involved initial denaturation at 94˚C for 3 min followed by 35 cycles at 94˚C for 45 s, 56˚C for 45 s (rps4-trnS) or 54˚C for 45 s (nuclear genes), 72˚C for 90 s, and a final extension at 72˚C for 7 min. In order to infer genotype at the nuclear loci for all individuals and to determine the sequences of alleles separated on gels, we conducted single-strand conformation polymorphism (SSCP) gel electrophoresis of PCR products, generally following the methods of Jaruwattanaphan et al. [39] and Fujiwara et al. [40]. For determining DNA sequence of each of alleles, a part of the DNA band separated on the SSCP gel was excised, and DNA extracted from the band was used as a template for further PCR amplification [39]. Per each of DNA bands considered to represent different allele, multiple bands from individuals collected in different populations were sequenced.
For DNA sequencing, the PCR products were purified using Illustra ExoStar 1-Step (GE Healthcare, Chicago, Illinois, USA) and used as templates for direct sequencing. Cycle sequencing was conducted with a BigDye Terminator version 3.1 cycle sequencing kit (Applied Biosystems, Foster City, California, USA). The sequencing products were analysed by an ABI3500 genetic analyser (Applied Biosystems) and also partly by Eurofins Genomics (Tokyo, Japan). The resulting nucleotide sequences were deposited into the International Nucleotide Sequence Databases (INSD) (see S3 Table).

Multilocus genotype analyses
For assignment of homoeologous loci (a pair of loci derived from two parental genomes) into each sub-genome, phylogenetic approach is commonly performed. However, this approach is inapplicable in the case that a gene phylogeny exhibits incomplete lineage sorting between parental species. In our study, as shown in the results of phylogenetic analysis, incomplete lineage sorting between Lepisorus thunbergianus and Lepisorus angustus was highly expected. To avoid this problem, we calculated genetic distances between DNA sequences from allotetraploids and those of L. angustus, based on Kimura's two parameter model [41] implemented in MEGA7 [42]. For individuals with only two sequences (putative homozygote at both homoeologous loci), the one closer to L. angustus sequences was considered to be coded at the L. angustus-derived locus (designated as the "nigripes-A" locus), and the other at the L. thunbergianusderived locus (designated as the "nigripes-T" locus). Although some L. nigripes individuals had three or four different sequences, all of these sequences were found in the individuals having only two sequences. Therefore, each of these sequences could be assigned to the "nigripes-A" locus or the "nigripes-T" locus based on the results for the individual having only two sequences. With respect to L. nigripes, alleles at the "Ang" locus were labelled as nigripes-A1, nigripes-A2, and so on, and those at the "nigripes-T" locus as nigripes-T1, nigripes-T2, and so on. Alleles from progenitor diploid species were named allele 1, allele 2, and so on. Alleles from the other Lepisorus species were coded alphabetically if multiple sequences were obtained. The assignment was confirmed phylogenetically for each gene as described in the Results section.
Allelic richness (Ar) values were calculated to show the genetic diversity for each population using FSTAT v. 2.9.3 [43]. The level of inbreeding was estimated for each population by calculating Wright's fixation index (F), as implemented in FSTAT. Analysis of molecular variance (AMOVA) [44] was carried out to analyse the hierarchical population genetic structure using GenAlEx 6.5 [45]. Ar and F value estimations and AMOVA were conducted only for the populations that had more than five individuals.
Population genetic structure analysis was also conducted using the Bayesian clustering method based on multilocus genotypes (MLGs) at six nuclear loci (three pairs of homoeologous loci). Masuyama et al. [46] suggested a high gametophytic selfing ability of L. nigripes ("the tetraploids of L. thunbergianus" in the original paper) based on the observation that nearly all of the gametophytes can form sporophytes in isolated culture condition. In fact, as mentioned in the Results section of this paper, genotype data showed an excess of homozygosity, suggesting that L. nigripes is a predominantly selfing species. Although STRUCTURE software [47] is the most commonly used software for Bayesian genetic clustering, it assumes random mating within population; therefore, we employed InStruct [48], which does not assume random mating and can also infer selfing rates for each cluster. The model was run with the likely number of clusters (K) set to values from 1 to 10, using a burn-in of 300,000 MCMC iterations followed by 600,000 MCMC iterations, and 15 independent chains were run for each K. Subsequently, we chose 10 runs in 15 runs based on the likelihood value to remove runs suspected to be outliers. Because total number of individuals per population was variable (from N = 3 to 28) and such a uneven sampling leads to wrong inference of population structure and misestimating of cluster number [49], we performed the correction method proposed by Puechmaille [49] in addition to the commonly used method, ΔK method [50] to determine optimal K in L. nigripes populations for two different datasets. For this, we conducted Instruct analysis for 1) a full dataset (total number of populations: 51 and total number of individuals: 352) and 2) trimmed dataset for which six individuals were randomly selected from each population with N � 6 after removing the population with N < 6 (total number of populations: 31 and total number of individuals: 186). Subsequently, using the outputs from Instruct result for each dataset, we calculated ΔK and the corrected estimators for optimal K determination that is robust to uneven sampling, MedMeaK, MaxMeaK, MedMedK, and MaxMedK with threshold value 0.8 [49]. Cluster assignments were executed using the CLUMPP program [51], and the result was illustrated using the Distruct program [52]. We plotted the result for each population at K = 2 in map made by GeoMapApp (http://www.geomapapp.org/) [53] / CC BY / CC BY (Ryan et al. [54]). In order to discriminate between pure and admixed individuals at the optimal number of K, we used the results of cluster assignments (Q-matrix) generated by InStruct. We assigned individuals with q values of 95% and higher to non-admixed ones and those with q < 95% to admixed ones. Additionally, NewHybrids [55] was used to classify individuals into pure individuals, F 1 , F 2 or backcrosses based on the default genotype categories. The analysis was run for 50,000 sweeps after a burn-in period of 10,000 sweeps.
Agglomerate clustering of MLGs was performed using the neighbor-net method [56], which was implemented using the SplitsTree4 application [57]. Pairwise genetic distance matrices among MLGs were generated using Kosman & Leonard's similarity index [58], as implemented in the R package PopGenReport [59].

Phylogenetic analyses
To detect lineages with independent origin, we conducted phylogenetic analyses for cpDNA and nuclear genes. The number of lineages with independent origins was interpreted as the number of the lineages with different alleles independently derived from different individuals of parental species. Because the transmission of alleles from parental species into polyploid is independent among different gene, the use of species tree combining multiple genes can lead to under-or overestimation of the number of origins. Thus, we separately reconstructed each gene phylogeny. The sequences determined in the present study and those generated in a previous study [28] were aligned using MUSCLE [60] implemented in AliView [61], and the resultant multiple alignment was edited manually in AliView. INSD accession numbers of the sequences used are shown in S3 Table. For chloroplast DNA and nuclear genes, indels were coded as binary data using the "simple index coding" method [62] using the IndelCoder option of SeqState 1.4.1 software [63]. Binarized characters were added to the data matrix, and the corresponding nucleotide sites with gaps were excluded as missing data. Only one sequence for each haplotype was included in each dataset. For phylogenetic analysis, each of the plastid regions, rps4-trnS, and nuclear genes, PgiC, GapCp, and pTPI, were separately analysed by the maximum likelihood (ML) method using Garli 2.01 [64] and Bayesian inference (BI) using MrBayes v 3.1.2 [65]. The best fitting substitution model for each DNA region was determined using the Akaike information criterion (AIC) [66] in jModelTest [67]. For binarycoded data, the Mkv model [68] was applied. In the ML analysis, eight independent runs were performed with randomly started trees and 'genthreshfortopoterm' set to 100,000. The maximum likelihood bootstrap support (MLBS) values were assessed using 500 bootstrap pseudoreplicate data sets. The consensus tree of bootstrap replicates was obtained using SumTrees [69]. In the BI analysis, four MCMC chains were run for 3,000,000 generations with samples taken every 100 generations. For evaluating convergence, Tracer 1.6 [70] was used. The first 500,000 generations were discarded as burn-in.

Observation of spore fertilities
Spore fertility was evaluated by the proportion of normal spores per sporangium. Samples were selected based on the result of genetic clustering by InStruct: 11 F 1 individuals (from six populations) between different genetic clusters, ten non-admixed individuals (from six populations) with heterozygous genotypes, and 17 non-admixed individuals (from 17 populations) with a homozygous genotype. At least five sporangia per sample were examined. Each sporangium was transferred to a glycerol solution (glycerol: water = 3: 1) on a glass slide, and spores were forcibly discharged from each sporangium using a dissecting needle. Shrunken small spores and transparent ones were interpreted as abnormal spores based on the criteria presented by Masuyama et al. [71]. The differences in the spore fertilities among F 1 hybrids and heterozygous, and homozygous non-admixed individuals were tested using the Steel-Dwass test performed in R [72].

Multi-locus genotyping and population genetic structure
In nuclear DNA markers, nearly all of the samples showed heterozygous SSCP banding patterns at all three nuclear genes, PgiC, GapCp, and pTPI. Mean pairwise genetic distances between each sequence from the samples of L. nigripes and those from L. angustus were calculated for each gene ( Table 3). Each of the sequences found in L. nigripes was successfully assigned to the "nigripes-A" locus or the "nigripes-T" locus; there was a gap between the distances for the sequences of the "nigripes-A" locus and those of the "nigripes-T" locus (Table 3). Finally, we obtained sequences at six loci: two homoeologous loci for each of PgiC, GapCp, and pTPI. Polymorphisms were found in all loci examined. The number of alleles at each locus ranged from two (PgiC-nigripes-A, pTPI-nigripes-T, pTPI-nigripes-A) to 12 (PgiCnigripes-T) ( Table 3). Combining genotypes at six nuclear loci yielded 51 multi-locus genotypes (MLGs) in a total of 352 samples (S4 and S5 Tables). As for cpDNA, we found five haplotypes (A, B, C, D1, and D2) among the sequences determined from the 62 representative individuals: one or two from each of 51 locations (S4 Table). The InStruct results for full dataset and trimmed dataset indicated a clear population genetic structure among populations in L. nigripes (Figs 1 and 2 and S1 Fig). The ΔK method suggested that K = 2 was the optimal number of clusters for both datasets, although other methods proposed by Puechmaille [49], MedMeaK, MaxMeaK, MedMedK and MaxMedK indicated that K = 4 or 5 and K = 4 were the optimal number of clusters for the full dataset and trimmed dataset, respectively (S5 Table). At K = 2, one cluster (blue) included most individuals from north-eastern to central Honshu and some of the individuals from western Honshu and Shikoku, whereas the other (yellow) cluster comprised individuals from Shikoku and Kyushu and some of the individuals from central and western Honshu (Fig 1A). At K = 4 and 5, although each of two clusters at K = 2 were separated into additional clusters, the distinctness between the two clusters at K = 2 was consistently maintained (Fig 1A). Given that both of datasets, full dataset and trimmed dataset supported K = 2 in ΔK method, the result from ΔK method was robust against uneven sampling in this study. Thus, we used the result at K = 2 for subsequent analyses. The membership coefficients of the two clusters (Q-values) were plotted on the map (Fig 2). It was clear that L. nigripes showed a pattern of East-West genetic differentiation along the Japanese archipelago, although individuals of both clusters were commonly observed on the Kii Peninsula and on Shikoku Island (Fig 2B). Hereafter, we refer to the individuals belonging to the blue cluster as the East-type and those belonging to the yellow cluster as the West-type (Fig 2). The bar plot ordered by the Q-value at K = 2 indicated that admixed individuals (Q-values within the range of 5-95%) were relatively limited (34 individuals; 9.7%), despite the fact that the two clusters co-existed in the wide transitional zone (Figs 2B and 3). Among the admixed individuals, 16 individuals were interpreted as F 1 because their genotypes (six MLGs) exhibit heterozygosity with the alleles derived from each of pure Eastand West-types at all loci polymorphic among them ( Fig 3A, Table 4). Their Q-values ranged from 0.476 to 0.552. The other individuals could be later-generation hybrids or backcross hybrids. The admixed individuals always co-existed with the East-and/or West-type within populations, and there was no case where a population consisted of only admixed individuals (Figs 1 and 2). On the other hand, NewHybrids assigned 18 individuals as F 1 with high probability (> 0.9) (Fig 3B). The other individuals suggested to be admixed in the InStruct analysis were interpreted as the East-type or West-type, except for one individual (MLG43) that was assigned to F 2 or backcross with a relatively higher probability than that observed for other individuals.
To assess the genetic relationship among MLGs, neighbor-net was constructed (Fig 4). The MLGs of the East-type and those of the West-type were separated into two different groups, consistent with the InStruct result at K = 2. The six MLGs estimated to be F 1 (Fig 3) formed a cluster with MLG22 in the middle of the branches connecting the East and West types. The other admixed MLGs were located discretely between the two groups or were nested in the group of the East-type. Because both the Bayesian clustering (InStruct) and agglomerate clustering (neighbor-net) methods revealed two genetically distinct entities, the East-and West-types in L. nigripes, we treated samples of the East-and West-types as belonging to different populations in the subsequent population genetic analyses when they were collected from the same location. The AMOVA results revealed that most of the genetic variation in L. nigripes was distributed between the East-and West-types: Frt = 0.759 (Table 5). Within each type, genetic differentiation among geographic regions (Frt) was not significant. However, genetic differentiation among populations (Fst) was high for both types: 0.652 (East-type) and 0.399 (West-type). Allelic richness (Ar) and inbreeding coefficient (F) values for each population are presented in Table 1. The F values ranged from 0.000 to 1.000 with a mean of 0.730 in the East-type populations. The West-type populations also showed high levels of F values, ranging from 0.250 to 1.000 with a mean of 0.824. High levels of inbreeding were also suggested by Instruct (Fis = 0.900 and 0.940) and AMOVA (Fis = 0.767 and 0.821).

Phylogenetic analysis
The aligned sequences of nuclear genes were 650 bp for PgiC, 690 bp for GapCp and 281 bp for pTPI. GTR+G was selected as the best fitting model for PgiC, and HKY+G for GapCp and pTPI. We obtained a multiple sequence alignment of 1009 bp including 14 gaps for the chloroplast rps4 -trnS region. GTR+G was selected as the best-fitting model.
The ML tree of PgiC (Fig 5A) showed four highly supported clades (L. thunbergianus clade, L. angustus_1 clade, L. angustus_2 clade, and a clade of L. hachijoensis + L. onoei), as well as a weakly supported sister clade to the L. thunbergianus clade. Most of the sequences from L. thunbergianus and those at the "nigripes-T" locus in L. nigripes (12 sequences) were included in the L. thunbergianus clade. The sequences from L. angustus occurred mainly in two separate clades: L. angustus_1 and L. angustus_2. Two sequences at the 'nigripes-A' locus in L. nigripes (nigripes-A1 and nigripes-A2) were included in the L. angustus_1 clade. Two sequences from L. thunbergianus (L. thunbergianus alleles 4 and 11), a sequence from the "nigripes-T" locus in L. nigripes (nigripes-T9), and a sequence from L. angustus (L. angustus allele 3) were included in the weakly supported clade that was sister to the L. thunbergianus clade, together with the sequences from L. tosaensis and L. kuratae.
GapCp tree (Fig 5B) comprised four clades and a unique sequence (L. tosaensis / L. kuratae). All sequences from L. thunbergianus and four sequences at the "nigripes-T" locus in L. nigripes The ML tree of pTPI (Fig 5C) displayed three highly supported clades and an unsupported clade of L. thunbergianus alleles and L. kuratae B. nigripes-T1 and nigripes-T2 from L. nigripes were included in the L. thunbergianus_1 clade and clustered with L. thunbergianus alleles 4 and 6 and with L. thunbergianus allele 1, respectively. All of the sequences from L. angustus and those at the 'nigripes-A' locus in L. nigripes formed a highly supported clade (98/1.0). In this clade, nigripes-A1 from L. nigripes was identical to L. angustus allele 1, and nigripes-A2 was grouped with L. angustus allele 3.
With respect to the cpDNA tree, the ML and BI analyses yielded the same topology (S3 Fig). All of the haplotypes of L. nigripes and those of L. thunbergianus were included in the same clade (MLBS = 99, BIPP = 1.0). Hap_C was the most frequent haplotype and was shared by both East-and West-type individuals from 35 populations throughout the Japanese archipelago. Hap_D1 and Hap_D2 were unique to the East-type, and Hap_B was unique to the West-type.  Fig 6). With respect to non-admixed individuals, regardless of homozygosity or heterozygosity, mean proportions of normal spores ranged from 0.75 to 1.0, and average values exceeded 0.9. The proportion of normal spores of F 1 hybrids was significantly lower than that of non-admixed homozygous individuals (P < 0.001) and heterozygous individuals (P < 0.01).

Genetic differentiation between East and West Japan
In Bayesian clustering analysis, ΔK method selected K = 2 as optical number of clusters for full dataset and trimmed dataset (S5 Table), which suggested the result of ΔK method in this study  Table 4

PgiC
GapCp pTPI Num. Type   Other methods to determine optimal number of clusters, MedMeaK, MaxMeaK, MedMedK and MaxMedK selected K = 4 and 5 as the optimal number (S5 Table). However, the separations were the subdivision of each of cluster at K = 2, East-and West-types (Fig 1). Additionally, most of admixed individuals between clusters inside East-or West-type at K = 4 and 5 were heterozygous individuals within each type, whose fertilities were same as those in nonadmixed individuals (Fig 6 and S4 Table), as discussed later. Thus, it was concluded that the distinction at K = 2 is more biologically significant. The genetic differentiation between the East-and West-types at K = 2 was also supported by the neighbor-net analysis (Fig 4), the sharing of a few alleles in nuclear phylogenetic trees (Fig 5), and the high Frt values (0.759) in Fstatistics (Table 5). An East-West pattern of intraspecific genetic differentiation in the Japanese archipelago has often been reported in phylogeographic studies on woody angiosperm species [34,35,[73][74][75][76]. The present study is the first case of the East-West differentiation pattern in fern species in the Japanese archipelago. The scarcity of this differentiation pattern in ferns may reflect the high dispersal ability of ferns that have small spores as propagules [77], or merely that few phylogenetic studies have been conducted on Japanese ferns so far. The phylogeographic studies on Japanese plants concluded that such a geographic genetic structure would be caused by climatic changes during repeated glacial periods in the Quaternary period. During the last glacial maximum (LGM), the climate in the Japanese archipelago was much colder than at present. Evergreen and deciduous forests have been considered to have reduced their population sizes and have been restricted to small refugia. The East-West differentiation pattern is considered to have been formed by the past geographical isolation between eastern and western refugia during the LGM, followed by range expansion. We suggest that the geographical distinction between the East-and West-types of L. nigripes was also generated through geographical isolation between refugia formed during the glacial period(s) in the Quaternary. It is likely that epiphytic L. nigripes shifted its distribution together with its host tree species. We present two possible scenarios depending on the different timing of allopolyploidization events. (1) Ancestors of the East-and West-types originated before the last glacial period. With the arrival of the last glacial period, their geographical distribution range decreased, and their populations were restricted to eastern and western refugia. The geographic isolation during the last glacial period eventually resulted in the formation of genetically distinct groups, the East and West types. If (1) did not occur, (2) East-and West-types of L. nigripes originated independently in geographically isolated refugia in Eastern and Western Japan, respectively, during the last glacial period. The genetic difference between the two may reflect that they originated in different locations and from different parental source populations. AMOVA analysis revealed that the levels of genetic differentiation between East-and Westtypes is Frt = 0.759 and the differentiation is the primary source of whole variation observed in L. nigripes (Table 5). Some studies also detected East-West differentiation in other Japanese plant species and estimated the levels of genetic differentiation between eastern and western groups using several nuclear markers such as AFLP Thunb. with AFLP [81]. Importantly, AMOVA analysis conducted in these studies showed that genetic differentiation between the groups was not the primary source of genetic variation. Comparing with East-West differentiation in other plants that experienced population isolation between eastern and western Japan during LGM, the genetic differentiation between East-and West-types in L. nigripes showed much higher level of differentiation and had stronger effect on whole genetic variation in this species, which suggested that the genetic differentiation in L. nigripes cannot be explained only by bottleneck effect in geographical isolation of populations, that was cause for East-West differentiation in other plants. In addition, it should be noted that the East-and West-types shared few or no alleles at the nuclear loci examined: 1/12 and 1/2 for PgiC, 0/4 and 0/3 for GapCp, and 1/2 and 0/2 for pTPI (Fig 5). A high level of genetic differentiation would be unlikely if these types diverged only through genetic drift during the last glacier period, i.e., less than one hundred thousand years ago. Therefore, the scenario 2 is more plausible and it is likely that the genetic distinctness between the East-and West-types of L. nigripes reflects the differential introduction of parental allelic variants to polyploids in their respective place of origin, i.e., one type of founder effect. Independent origin of East-and West-types will also be supported by phylogenetic analyses (see the discussion below).

nigripes-T nigripes-A nigripes-T nigripes-A nigripes-T nigripes-A
Although L. nigripes is widely distributed in both cool and warm temperate regions [28], its parental species, L. thunbergianus and L. angustus, are restricted to warm temperate and cool temperate regions in Japan, respectively [33] Thus, it is likely that the geographical distribution areas of the parental species were strongly influenced by climatic oscillations during the Quaternary Era, together with the evergreen and/or deciduous forest trees on which they grow. Aoki et al. [74] revealed genetic differentiation in Curculio sikkemensis (Heller) between the north-eastern and southwestern parts of the mainland of Japan; this is a generalist acorn weevil of Fagaceae plants inhabiting deciduous forest and evergreen forest. Because no significant genetic differentiation of the weevil was observed between vegetation types of its utilized host plant species, Aoki et al. [74] concluded that deciduous and evergreen forests might have survived together or adjacent to each other during glacial periods. Qi et al. [82] estimated the potential geographic distribution of a Japanese deciduous cool temperate tree, Cercidiphyllum japonicum Siebold et Zucc. ex Hoffm. et Schult., during the LGM based on ecological niche modelling. They found that C. japonicum exhibited a southward range shift during the LGM and occupied areas where warm temperate forests develop at present. In this light, it is likely that L. thunbergianus and L. angustus also survived in the same or adjacent regions during glacial periods. Therefore, we can anticipate more opportunities for interspecific hybridization that is essential for producing L. nigripes during glacial periods compared with current and past warm interglacial periods.

Independent multiple origins
Nuclear DNA phylogenies revealed that L. nigripes recurrently originated from independent polyploidization events. In pTPI trees (Fig 5C), nigripes-A1 fixed in the East type of L. nigripes was identical to L. angustus allele 1, while nigripes-A2 fixed in the West-type clustered with L. angustus allele 3 with high support. This result indicates that nigripes-A1 and nigripes-A2 are the sequences derived from different individuals of L. angustus. Such a derivation pattern was observed also in the GapCp tree (Fig 5B). nigripes-A3 fixed in the West-type formed a clade together with L. angustus alleles 4, 5 and 6, whereas nigripes-A1 and nigripes-A2, specific to the East-type, were not included in the clade, suggesting that nigripes-A3 in the West-type and nigripes-A1 and nigripes-A2 in the East-type emerged from different origins. The result that each allele in GapCp-nigripes-A and pTPI-nigripes-A from the East-and West-types emerged independently from different parents supports scenario 2 in the above section, i.e., that the East-and West-types originated independently.
Alternatively, the possession of distinct alleles between East-and West-types also can be explained by two different mechanisms. One is that an ancestral allopolyploid with heterozygosity originated from single polyploidization and each allele was fixed in each of East-and Westtypes throughout gametophytic selfing. Another is introgression between L. nigripes and parental diploid species. Firstly, an ancestral allopolyploid with heterozygosity from single origin requires a fusion of two unreduced gametes with heterozygosity from two different diploid species. Given that gametophyte of Lepisorus is cordate-thalloid gametophyte that be known as short-live [83,84], the probability that unreduced spores were simultaneously produced from both parental species interbreeding with each other, is extremely low. It is rather much more likely that L. nigripes recurrently originated from single unreduced gametophytes produced in diploid hybrid species throughout gametophytic selfing, the ability unique to ferns. Additionally, two independent loci, pTPI in L. angustus subgenome and GapCp in L. angustus subgenome supported the pattern that can be interpreted as independent origins of East-and West-types. It is highly unlikely that the fixation of different alleles between the types in the two independent loci occurred throughout single origin. Secondly, interploidal introgression commonly occurs via triploid bridge in angiosperms. However, such a triploid bridge is unlikely in the case of L. nigripes, because triploid hybrid between L. nigripes and parental species has highly withered sporangia that cannot produce any spores and the evidence of introgression between two species with different ploidal level has never been reported so far in ferns. Therefore, independent origin is the most plausible explanation for the pattern observed in pTPI and GapCp phylogenies.
The PgiC tree provides evidences for additional independent origins. nigripes-T8 and nigripes-T6 formed a clade together with L. thunbergianus allele 6 (arrow 1 in Fig 5A), whereas nigripes-T1, 3, 5, 10, and 11 branched from the base of the clade (arrow 2), suggesting at least two independent acquisition events of diploid alleles. nigripes-T7 and nigripes-T12 formed a clade that diverged from a more basal position (arrow 3), which reflected the other origin. nigripes-T4 was included in the clade that consisted of L. thunbergianus alleles 2, 7, 10, and the sequence of L. kuratae homoeologous loci A (arrow 4). nigripes-T2 was a sister clade to all of the other sequences of the L. thunbergianus clade (arrow 5). Thus, nigripes-T4 and nigripes-T2 also have independent origins. Lastly, although nigripes-T9 was the sequence from the "nigripes-T" locus of L. nigripes, this sequence was sister to L. angustus allele 3. Because the genetic distance between nigripes-T9 and L. angustus allele 3 was relatively high and the sequences of L. thunbergianus were also nested and polyphyletic in this clade, we interpreted this sister relationship to be the result of incomplete lineage sorting, not as a result of nigripes-T9 having been derived from L. angustus as opposed to L. thunbergianus. Therefore, nigripes-T9 might have independently derived from L. thunbergianus. According to scenario 2, if we assume that the East-and West-types originated independently in different refugia, the Eastand West-types had at least four (arrows 1, 2, 3, and 5) and five (arrows 1, 2, 3, 4, and 6) origins, respectively, based on the result of the PgiC tree. Multiple origins have been reported among several polyploid taxa in ferns: Asplenium ceterach has at least six origins [85], A. cimmeriorum has at least two origins, A. gracillimum has at least four origins [24], A. majoricum has at least four origins [86], Astrolepis integerrima has at least ten origins [87], and Polypodium hesperium has at least two origins [37]. It was suggested that the East-and West-types of L. nigripes originated recurrently approximately the same number of times as allopolyploid ferns examined in previous studies.

Mating system of Lepisorus nigripes
Populations of Lepisorus nigripes exhibited high values of the inbreeding coefficient (F): a mean of 0.730 for the East-type populations and 0.824 for the West-type ones (Table 1). This level of inbreeding coefficient is comparable for the values in the species categorized as 'exclusively inbreeding' [88]. Masuyama et al. [46] performed gametophytic selfing tests for L. nigripes (treated as tetraploid L. thunbergianus in their study) and L. thunbergianus (treated as diploid L. thunbergianus in their study) individuals. In their test, 50 gametophytes per sporophyte were isolated and forced to undergo gametophytic selfing. In L. thunbergianus, only 0-16% of gametophytes produced sporophytes and some of the sporophytes were small and irregularly shaped. On the other hand, in L. nigripes, 98-100% produced sporophytes with a normal shape. Based on their results, the authors concluded that L. nigripes has high gametophytic selfing potential, which is consistent with our results. We found that populations of L. nigripes were characterized by low levels of genetic variation within the populations; ten out of 51 populations comprised only one MLG (Table 1). Gametophytic selfing enables ferns to colonize a new open habitat by a single spore [89,90], and it is expected that the population established through this process is initially genetically monomorphic [91]. The existence of such single-MLG populations observed in this study and the relatively high levels of Fst values among populations (Table 5) could be evidence that gametophytic selfing frequently occurs in L. nigripes under natural conditions. Although our understanding of the genetic, epigenetic, and genomic consequences of polyploid evolution in plants has greatly improved during the last two decades, how new polyploids establish and then persist in natural populations remains poorly understood [8]. Fowler and Levin [92] examined factors critical to the establishment of a new allopolyploid population based on a simulation model and suggested that a higher level of self-fertilization of polyploids increases the success rate because it mitigates the minority disadvantage [14] of the newly formed polyploids. Masuyama and Watano [93] suggested that gametophytic selfing might occur more frequently in polyploid than in diploid species. The contrasting mating systems of tetraploid L. nigripes and diploid L. thunbergianus [46] were adopted in the paper as a representative case. If evolution of selfing were triggered by the reduction of inbreeding depression in polyploids as suggested by Lande and Schemske [94], we could expect a substantial increase in selfing ability, even in a nascent polyploid. The idea that the mating system shift caused by polyploidization would assist the initial establishment of polyploids should be evaluated in future studies.

Initial stage of reproductive isolation between the East-and West-types
InStruct revealed that admixed individuals between the East-and West-types were relatively limited despite their coexistence across wide areas (Figs 2 and 3). We found 36 admixed individuals among the 352 samples examined. 16 of them were interpreted as F 1 hybrids, and 20 individuals were recombinants (Fig 3 and Table 4). In western Honshu and Shikoku, where the East-and West-types frequently co-exist, 15 recombinant individuals were found. The remaining five individuals with MLG15 and 40 were observed in northern Honshu, where the West-types were not found (Figs 1, 2 and 4). It is possible that MLG15 and 40 might not be recombinant genotypes, but rather MLGs unique to northern Honshu as suggested by neighbour-net analysis (Fig 4). Moreover, NewHybrids (Fig 3B) assigned 18 individuals as F 1 hybrids with high probabilities, and only one individual showed a higher probability for F 2 or backcrossing than that observed for the pure strain. The paucity of recombinants between the East-and West-types observed in this study suggested the presence of reproductive isolation between the two types. As L. nigripes is a predominantly selfing species, this mating system seems to have contributed towards the reduction of gene flow between the East-and Westtypes. However, it is unlikely that selfing alone could maintain the genetic distinctness across wide sympatric regions. Even in predominantly selfing species, occasional mating between selfing strains can occur, generating recombinant genotypes [95]. For example, although particular MLGs were maintained within each population in the predominantly selfing fern Sceptridium ternatum (Thunb.) Lyon, these MLGs were not maintained across populations [96]. Together, these findings suggest that intrinsic reproductive isolation is likely responsible for the genetic distinctness between the East-and West-types of L. nigripes. As shown in Fig 6, most of the F 1 hybrids between the East-and West-types showed significant reductions in fertility compared to the spore fertility of non-admixed individuals. The mean proportion of normal spores among F 1 hybrids was 0.59. Although two individual F 1 hybrids of POP39 showed a high proportion of normal spores, a reduction in the fertility of F 1 hybrids was widely observed across six populations from central Honshu, western Honshu, and Shikoku. This suggests that the reductions in spore fertility were not caused by external factors in each population, but rather by intrinsic factors of hybrid individuals.
Artificial crossing experiments between cryptic species that were detected by genetic markers in ferns were conducted in previous studies [71,97,98]. In the Ceratopteris thallicroides species complex, Masuyama et al. [71] showed that the germination rates in F 1 hybrids from intraspecific crossing between different geographic sources were as high as 50%, while those from interspecific crossing were nearly 0%. They also calculated the frequency of abortive spores in each crossing, revealing frequency ranges between 0.0% and 18.6% for intraspecific crossing and between 64.6% and 91.0% for interspecific crossing. In the present study, the frequencies of abortive spores in F 1 hybrids between the East-and West-types (38.7-61.9% except for the hybrids in POP39) were lower than those of interspecific hybrids in Ceratopteris. Additionally, because recombinant genotypes were found in the sympatric regions, we concluded that the intrinsic reproductive isolation between the East-and West-types would be incomplete. During the initial stage of divergence, the loci responsible for isolation can be polymorphic among individuals. As a result, the degree of incompatibility is expected to vary among crossing pairs [99]. Our observation that F 1 hybrids in POP39 showed high proportions of normal spores might indicate variability in reproductive isolation between the East-and Westtypes.

Conclusions
It remains unclear whether lineages with independent origins interbreed [8,9,22]. Symonds et al. [23] hypothesized the outcomes of multiple origins in polyploid species ranging from a single homogenous species to multiple genetically distinct lineages that ultimately form cryptic species, according to genetic variations contributed by parental species and the degree of gene flow among independent origins.
In this study, we detected two genetically distinct lineages (the East-and West-types) in L. nigripes that seem to have originated from independent polyploidization events. Based on the limited number of recombinants between the East-and West-types and the partial reduction of spore fertility in F 1 hybrids, it is suggested that the divergence between the East-and Westtypes represents an incipient stage of speciation with incomplete reproductive isolation. Although the genetic distinctness and isolation between them could suggest that they represent evolutionary distinct lineages respectively, it is concluded that they should be still treated as single species considering the existence of recombinants and morphological indistinctness.
Given that the East-and West-types of L. nigripes originated allopatrically during the last glacial period, we need to consider that reproductive isolation between the two types developed rapidly, i.e., on the order of tens of thousands of years. Genetic and epigenetic studies of polyploidy have demonstrated that gene silencing and chromosome rearrangement occur immediately following polyploidization [100][101][102][103], even within the first few generations in synthesized Brassica polyploids [104]. Such gene silencing after polyploidization has also been reported in allopolyploid ferns [105]. Werth and Windham [106] proposed a model in which reciprocal silencing at each of the duplicated loci of the same gene leads to reproductive isolation between allopatric polyploid populations. Future studies on the genetic basis of the intrinsic reproductive isolation between the East-and West-types of L. nigripes would greatly contribute to the understanding of allopatric speciation at the polyploid level. Intriguingly, heterozygous individuals within each type showed no reduction in the proportion of normal spores, although these individuals could be generated from hybridization between individuals of independent origins within each type. This result suggests that independent origins per se did not contribute to the development of reproductive isolation, but rather, allopatric formation and distinct genetic contribution from parental species as a consequence of independent origins could be responsible for speciation between allopolyploid lineages of independent origin. This is consistent with hypotheses previously presented by Soltis et al. [8] and Symonds et al. [23].