Genetic diversity and population structure of Miscanthus lutarioriparius, an endemic plant of China

Miscanthus lutarioriparius is a native perennial Miscanthus species of China, which is currently used as raw material of papermaking and bioenergy crop. It also has been considered as a promising eco-bioindustrial plant, which can offer raw material and gene for the biomass industry. However, lack of germplasm resources and genetic diversity information of M. lutarioriparius have become the bottleneck that prevents the stable and further development of the biomass industry. In the present study, genetic diversity of 153 M. lutarioriparius individuals nine populations was studied using 27 Start Codon Targeted (SCoT) markers. High polymorphic bands (97.67%), polymorphic information content (0.26) and allele number (1.88) showed SCoT as a reliable marker system for genetic analysis in M. lutarioriparius. At the species, the percentage of polymorphic loci [PPL] was 97.2%, Nei’s gene diversity [H] was 0.36, Shannon index [I] was 0.54 and Expected Heterozygosity [He] was 0.56. Genetic variation within populations (84.91%) was higher than among populations (15.09%) based on analysis of molecular variance (AMOVA). Moderate level of genetic differentiation was found in M. lutarioriparius populations (Fst = 0.15), which is further confirmed by STRUCTURE, principal coordinates analysis (PCoA) and an unweighted pair group method with arithmetic mean (UPGMA) analysis that could reveal a clear separation between groups of the north and south of Yangtze River. The gene flow of the populations within the respective south and north of Yangtze River area was higher, but lower between the areas. There was no obvious correlation between genetic distance and geographic distance. The breeding systems, geographical isolation and fragmented habitat of M. lutarioriparius may be due to the high level of genetic diversity, moderate genetic differentiation, and the population, structure. The study further suggests some measure for conservation of genetic resources and provides the genetic basis for improving the efficiency of breeding based on the results of diversity analysis.

Introduction codon (ATG) in plant genes. SCoT employs long primers (18-mers), and can generate polymorphisms that are reproducible. It is considered as a dominant marker system, requiring no prior sequence information, and the polymorphism is correlated to functional genes and their corresponding traits. Other excellent characteristics include their simplicity of use, high polymorphism, the use of universal primers, low cost and gene targeted markers. This technique has been successfully used to assess genetic diversity and structure [46,47], construct DNA fingerprints [48,49], identify QTLs [50], and analyze differential gene expression and screen stress tolerance genes [51].
The present study is the first attempt to use SCoT markers to assess the level of genetic diversity of M. lutarioriparius, which were collected from the wild populations. The main objectives of this study were to assess the genetic diversity and genetic relationship of M. lutarioriparius in China. These results could benefit M. lutarioriparius germplasm collection, conservation and future breeding.

Ethics statement
This research did not involve the rare and endangered plants. Samples of M. lutarioriparius were gathered from areas, which are not in any nature reserve or private land. Therefore, collection of the material did not require an additional approval of the ethics committee or other specific permission.

Plant materials
One hundred and fifty-three M. lutarioriparius accessions were used in this study. The accessions were collected from 153 sites which covered the major distribution areas in China (S1 Table). Over five rhizomes from a site were sampled. Sites of accessions separated by a distance of more than 1 km were collected during the autumn and winter of 2017. The rhizomes of each accession from a site were planted in the Miscanthus Germplasm Nursery Garden (281 1'14.42"N, 113˚4'7.69"E) in Hunan Agriculture University, Changsha, Hunan, China.

Genomic DNA extraction
In the spring of 2018, young leaves from 153 plants were collected from the Miscanthus Germplasm Nursery Garden and stored at -20˚C for genomic DNA isolation. Total genomic DNA was isolated from 1 g fresh young leaves of M. lutarioriparius following the method described by Doyle and Doyle [52]. The concentration and purity of DNA samples were determined by agarose gel electrophoresis analysis and spectrophotometer absorbance (Biochrom Libra S22, Biochrom Ltd., Cambridge, UK). Only DNA samples with an optical density (OD) at 260 nm/ OD at 280 nm (OD260/OD280) >1.8 were diluted in ultrapure sterile water to 50 ng μl -1 and then stored at -20˚C for further PCR amplification.

SCoT-PCR amplification
A total of 36 SCoT primers developed by Collard and Mackill [45], which were produced by Shanghai Sangon Biological Engineering Technology and Service Co. Ltd., and 27 primers with clear, enlarged, and rich polymorphism bands were chosen (

Data analysis
Genetic diversity. Excel 2013 was used to calculate the total number of bands (TNB), the number of polymorphic bands (NPB), and the percentage of polymorphic bands (PPB). The polymorphism information content (PIC) of SCoT primers was determined using POWERMARKER v3.25 [53]. The software POPGENE V1.32 was used to estimate the level of genetic diversity, with five parameters: observed number of alleles (Na), effective number of alleles (Ne), Nei's gene  [54]. Population structure. The population structure of the 153 M.lutarioriparius was analyzed in the software STRUCTURE v2.3.4 [55] using admixture model, correlated allele frequencies, and a burn-in period of 100,000 iterations, followed by 1,000,000 Markov Chain Monte Carlo (MCMC) repetitions [56]. The value of K ranged from 1 to 9, with 30 independent runs. Maximum likelihood (LnP(K)) and delta K (ΔK) were used to identify the optimum number of subpopulations following Evanno's methods [56]. The structure result was analyzed in Structure Harvester v0.6.94 [57] Cluster analysis. Principal coordinate analysis (PCoA) was performed using GenALEx v6.1 software to detect the genetic relationships among populations. Cluster analysis was performed using SAHN from NTSYS-pc version 2.10 with the unweighted pair group method with arithmetic mean (UPGMA) algorithm. SPSS 19.0 was used to calculate the correlation between the genetic distance and geographic distance matrices. The latitude and longitude of each population was replaced by the latitude and longitude of the midpoint of the population.
Genetic differentiation. Analysis of molecular variance (AMOVA) was performed to analyze the genetic differentiation (F st ) using "pegas" [58]in R. The gene flow (N m ) among populations (calculated as N m = (1-F st ) / 4 F st ), Nei's genetic distance, and the genetic similarity were calculated using the software POWERMARKER v3.25 [53].

SCoT polymorphisms
Thirty-six SCoT primers were tested with three M. lutarioriparius accessions as DNA templates; all primers produced amplification products, and only primers showing clear and reproducible band patterns were selected for further analysis. Twenty-seven primers were then chosen for species identification and phylogenetic analysis. As shown in Table 1, all 27 primers used for SCoT analysis A total of 429 fragments were obtained, and 419 of the fragments were polymorphic. The number of polymorphic fragments for each SCoT primer ranged from 7 (ST15) to 21 (ST27, 32), with an average of 15.52. The percentage of polymorphic fragments was from 88.89% to 100.00%, with an average of 97.67% polymorphism. Polymorphism information content (PIC) values were 0.22 to 0.29, with an average of 0.26. The number of different alleles was 1.97 at the species (Table 2). These results indicated that a high level of polymorphism could be detected among M. lutarioriparius accessions using SCoT markers.

Population genetic diversity
The percentage of polymorphic loci (PPL) and Nei's gene diversity (H) were important parameters for measuring the level of genetic diversity [59]. In Table 2

Population structure and cluster analysis
The population structure of the 153 accessions belonging to nine populations was analyzed by using STRUCTURE V2.3.4. There were two peaks for ΔK by Evanno's method [55], at K = 2 and 5(S1 Fig). This result indicated that the 153 individuals could be divided into two clusters at K = 2, namely, Cluster I and Cluster II ( Fig 1A). Cluster I contained 92 individuals, among which 19 accessions were from Pop1, two accessions were from Pop2, 18 accessions came from Pop3, 24 accessions were from Pop4, two accessions came from Pop5, 14 accessions were   from Pop7, seven accessions came from Pop8, and six accessions were from Pop9. Most of individuals were from the region of south Yangtze River. Cluster II contained 61 accessions, where 44 individuals came from the north area of the Yangtze River and 17 accessions came from the south area of the Yangtze River.
There were thirteen accessions with membership probability lower than 0.60 of belonging to one subpopulation at K = 5 ( Fig 1B, S2 Table). They were assigned to an 'admixed' group. Cluster 1 included 34 individuals, 85.29% of their members came from the south of Yangtze River. Cluster 2 (26 accessions) included accessions from Pop2, Pop6, Pop7 and Pop9. Cluster 3 contained 30 accessions, which all are members from the north of the Yangtze River. Cluster 4 (14 accessions) was only composed by the accessions from Luhu. Cluster 5 contained 33 accessions from the south of Yangtze River and 3 accessions from the north of Yangtze River.
A dendrogram was created using the unweighted pair group method with arithmetic mean (UPGMA) algorithm and cluster analysis, and the result is shown in Fig 2. Five clear clusters were recovered according to the genetic distance. 14 accessions of Pop4 formed a single cluster, and these accessions were all from Luhu in the Dongting Lake. The individuals of other populations were scattered among different clusters. All individuals were grouped into two main categories, as A and B. Most accessions from south of the Yangtze River merged together and formed cluster A. It contained 3 sub-clusters, including a total of 91 individuals. Cluster B contained two sub-clusters, and most of individuals were primarily collected from Hubei, He'nan, north of Anhui, and Jiangsu. There were 62 individuals in this cluster. This result was similar to the result of the STRUCTURE analysis at K = 5.
The principal coordinate analysis (PCoA) for 9 populations of M. lutarioriparius revealed that these populations divided into 2 groups (Fig 3). All the populations from the south of the Yangtze River gathered into group I, and the group II contained 4 populations of north of the Yangtze River. Then the first principal vector accounting for 78.44% of the genetic variance, the second principle vector accounted for 10.70% and the third principle vector explaining 3.21%. The results of PCoA were the same from the other cluster analyses as shown above.
Genetic distance among the nine populations were calculated by the software of POWER-MARKER v3.25, which could clearly reflect the genetic relationships (S3 Table). Pairwise comparisons of populations indicated relative genetic distances between populations ranging from a minimum of 0.0178 to a maximum of 0.1603 with a mean of 0.0756. The genetic distance between Pop2 from the northern of Anhui and Pop6 from Jiangsu was the closest (0.018). The distance between Pop8 from Hubei and Pop9 from Zhejiang was the farthest (0.1603). The populations of M. lutarioriparius were separated into two clusters in the UPGMA dendrogram (Fig 4). Cluster I consisted of 4 populations, as Pop 5, Pop8, Pop2 and Pop6. All populations were from the north of the Yangtze River. Cluster II contained Pop1, Pop3, Pop4, Pop7 and Pop9. There was no obvious correlation between genetic distance and geographic distance (r = -0.48, P = 0.782) (Fig 5).

Genetic differentiation and gene flow
The analysis of molecular variance (AMOVA) among populations showed that 15% of the genetic differentiation (F st ) was found among the populations, and the variance within populations was 85% (Table 3). The genetic variation between and within populations of M. lutarioriparius was both significant (P<0.001). The results showed that variation was more abundant within populations than among populations, and the genetic variation within populations was the main source of total variation. Results of S4 Table revealed  The results show that there was stronger connection within populations from the north area of Yangtze River, which is also true for the south group of Yangtze River. The low level of gene flow was found between populations from the north and the south of Yangtze River (ranging from 0.31 to 1.12), suggesting that the Yangtze River can server as a barrier to gene flow between populations on the both sides of the river.

The universality of SCoT primers
SCoT markers are novel molecular markers that target the translation initiation site and preferentially bind to genes that are actively transcribed. These primers have been shown to exhibit relatively high levels of polymorphism [45]. It was more informative than IRAP and ISSR for  [61]. The PIC values of M. lutarioriparius were low compared with Hemarthria [62], Elymus sibiricus [63], and other plants. This might be due to different research materials and different numbers of individuals in each population. Taken together, SCoT markers can be used as an effective tool for studying M. lutarioriparius, laying the foundation for identifying M. lutarioriparius germplasm resources, genetic map construction, gene mapping, and cloning.

Genetic diversity analysis of M. lutarioriparius germplasm
Miscanthus lutarioriparius is not only important as a fine raw material for papermaking, but also as a biofuel crop and ecological improvement plant. Therefore, it is important to understand the genetic diversity of M. lutarioriparius populations and the potential for genetic improvement. In this study, the mean of genetic diversity within each population was 0.32, which was similar to Chinese and U.S. M. sinensis although different markers were used in these study [23,24,64], and higher than previous measurements from cross-pollinated plants (He = 0.162) and monocotyledons (He = 0.181) [59,65]. At the species level, the genetic diversity of M. lutarioriparius was similar to Panicum virgatum [66], Maytenus emarginata [67] and Ziziphus mauritiana [68], which were analyzed by the same markers. But it was lower than microsatellite markers [34]. This may be related to the different types of makers used. Genetic diversity of populations is influenced by many factors, including breeding systems, population size, genetic drift, natural selection, mutation rate, and gene flow [69,70]. First, genetic diversity at the species level is greatly related to the breeding system of Miscanthus, which are cross-pollinated plants that have self-incompatibility. Miscanthus lutarioriparius can propagate through subterraneous stems under its natural state, as well as through sexual propagation through seeds. These reproductive characteristics have vital significance for maintaining the genetic diversity in populations [71], likely resulting in higher genetic diversity. Second, M. lutarioriparius has a relatively large distribution area from 111˚to 120˚E and 26˚to 33˚N according to our field research. Generally, the larger the distribution area, the greater the genetic diversity [72]. Miscanthus lutarioriparius is a perennial herb, and its perennial long-life habits could provide more opportunity to accumulate mutations or special microstructure in different populations due to biotic processes. Third, since the 1950s, M. lutarioriparius has attracted attention as a paper-making material, especially in Huanan and Hubei provinces, where large-scale introduction and variety screening have taken place. Therefore, these types of perennial plants preserve variants between generations, thus increasing the genetic diversity of populations [73].
In generally, the genetic diversity of populations in downstream of river was higher than that of populations in the upper and middle reaches of a river. The genetic diversity of Pop9 was the highest, it located in the downstream of the Yangtze River. However, the populations from the middle reaches of the Yangtze River (Pop3, Pop4, Pop8 and Pop7) showed higher genetic diversity than those in the downstream Yangtze River (Pop1, Pop2 and Pop6). Pop3 and Pop4 located in the Dongting Lake area, where contains the largest, continuous distribution of this species in China. However, in Jiangsu, He'nan, and Anhui, however, the population size of M. lutarioriparius were small, with only a few individuals found, due to predatory collection and habitat destruction. Population size correlates significantly with genetic diversity [74,75], and this was consistent with the low genetic diversity we observed in M. lutarioriparius from these  areas. In the natural environment, a low seed germination rate and vegetative reproduction through the rhizomes lead to low genetic diversity within populations in small population.

Genetic differentiation and gene flow of Miscanthus lutarioriparius
The genetic structures of populations are extremely important for the evolution and adaptation of a species [76]. Significant genetic differentiation was found between populations that came from the either side of Yangtze River. Miscanthus lutarioriparius exhibited moderate differentiation among populations. Our results were not consistent with previous studies of Yan [34]. This could be explained by our wider geographical accession collection and different molecular markers used in this study. Many factors might be responsible for moderate population differentiation, including, but not limited to pollen and seed transfer, breeding system, geographic isolation, and environment heterogeneity. In present study, the most plausible explanation for moderate genetic differentiation could be due to geographical isolation of M. lutarioriparius populations.
The results of AMOVA analysis showed that 15% of the variance was attributed to the differences among populations (P < 0.001), and 85% was attributed to the differences among individuals. The greater similarity of populations among rather than within individuals indicated that these populations were probably founded by many individuals and that there were The N m was high within M. lutarioriparius populations from the north group and south group of Yangtze River. When N m is higher than 1, gene flow could prevent genetic differentiation between populations caused by genetic drift [77]. Factors affecting gene flow include the dispersal modes of pollens and seeds, as well as the population size in the natural distribution. Miscanthus lutarioriparius is wind-pollinated, and its gene flow rate is consistent with findings that cross-pollinated plants have high gene flow [78]. Miscanthus lutarioriparius mainly grows along rivers and lakes, and seasonal floods can transfer seeds and subterraneous stems between different regions, further promoting genetic exchange between M. lutarioriparius populations. In addition, since the 1950s, M. lutarioriparius has been a vital raw material for paper making, and this plant was introduced widely to the area; therefore, humans have also contributed to the increased gene flow between different M. lutarioriparius populations.
Analysis indicated that gene flow was low between M. lutarioriparius populations from the north group and south group of Yangtze River. The River and mountains were usually the main reasons for limited gene communication. M. lutarioriparius was mainly distributed at elevations below 300 m (altitude) by field investigation. For example, the easternmost individuals of Pop 8 are very close to the Pop 7, but the two places were isolated by the Yangtze River, Dabie Mountain and Luoxiao Mountain. The gene flow (0.62) between the two populations was very low. Therefore, the geographic isolation could have prevented gene flow between the north group and south group of Yangtze River.

Genetic relationships among populations of Miscanthus lutarioriparius
In population genetics research, sampling strategies have a strong relationship with the reliability of research results because individual localities will affect genetic diversity of a population and the population's genetic structure [79]. Determination of a suitable population size, which can realistically reflect the population's genetic information, is difficult because a relatively small population might originate from asexual reproduction from one or a few origins. In addition, without biotic and abiotic damage, M. lutarioriparius could theoretically enter an infinite growth cycle. Under such a scenario, it would be difficult to determine the extent of each genotype by the range of plant growth on the ground. In this research, only scattered populations were found out of the large-scale concentrated range at in some distribution areas beside the coastal area of DongTing Lake and TaiHu Lake. In these areas, the number of individuals in each population is limited and populations are separated by mountains and rivers. There was no obvious correlation between genetic distance and geographical distance. This indicated that our increased sample scale and genetic diversity at the population level does not have a significant relationship. Additionally, it demonstrated that our sample scale is reasonable and that sampled individuals are included in populations.
The genetic structure of a population is not directly correlated with its geographical distribution. According to the analysis of structure, M. lutarioriparius was divided into two groups: those distributed to the north and to the south of the Yangtze River. To further distinguish the origins of the M. lutarioriparius populations, we used PCoA and UPGMA to study the distributions of M lutarioriparius, resulting again in classification into two types with the Yangtze River as a natural boundary between them. Judging by certain historical records, M. lutarioriparius was considered as a member of Phragmites in the past, being introduced across China for the last 50 years as the papermaking industry has expanded. Therefore, rather than relying on pollen and seed diffusion, human activities have had a more positive influence on the distribution of M. lutarioriparius.
Here, populations with smaller geographical distances were not clustered together, while the populations on either side of the Yangtze River were clustered together. This may be a result of population isolation along the Yangtze River, causing the populations (Pop1 and Pop2) with smaller geographical distances to generate certain differences. Similar findings have been observed in other species [80].

Protection strategies and utilization of wild M. lutarioriparius resources
Miscanthus lutarioriparius is endangered in some regions, this is likely not due to a loss of genetic diversity. According to our field investigation of M. lutarioriparius over many years, human activities have severely damaged the habitats of M. lutarioriparius. For example, there was large-scale M. lutarioriparius distribution in He'nan, Jiangsu, and Hubei from the 1950s to the 1970s, but now only small populations of this species are found in this area. In Hunan, after the recession of the papermaking industry, the habitat of M. lutarioriparius was used to plant poplar, leading to the decline of M. lutarioriparius populations. Therefore, M. lutarioriparius must be protected. We applied Andrew's method to estimate the sampling size needed to protect these populations [81]. The results indicated that sampling in only one population could preserve 95% of the genetic differentiation of the species. Based on this, we recommend a protection strategy as follows: (1) preservation of the M. lutarioriparius population in the Dongting Lake area; (2) increasing the sampling of other populations with high genetic diversities (including Hubei, Jiangxi and Jiangsu), and collecting subterraneous stems to encourage reproduction during sampling.
Miscanthus lutarioriparius has adapted to various conditions and plays an important role in the wetland ecological environment. Simultaneously, M. lutarioriparius is considered to have potential as a second-generation cellulosic energy plant and biomaterial. For adapting to the requirements of developing energy plants in China, breeding of M. lutarioriparius germplasm suitable for growing in marginal land is necessary; therefore, understanding the genetic diversity level of M. lutarioriparius germplasm resources is the foundation for hybrid breeding. Using the results from this study, we can select distantly related materials as parents that give rise to offspring with higher genetic variation, providing better resources for selective breeding. For example, the materials in Hubei and Zhejiang populations with the greatest genetic distance could be used as parents.

Conclusions
In conclusion, the SCoT marker system provides a highly efficient, reproducible and powerful tool for studying the genetic diversity and population structure of M. lutarioriparius. The results revealed that high genetic diversity and gene flow were detected at species level, which were attributed to its breeding system, large distribution area and anthropogenic movement of plant material. In addition, moderate genetic differentiation was found among populations, which was supported by habitat fragmentation and geographical isolation. Nine populations can be divided into two main groups by STRUCTURE analysis, PCoA and UPGMA, with the Yangtze River as a natural boundary between the two groups. All M. lutarioriparius accessions could be divided into two groups, with 92 accessions in Cluster A and 61 accessions in Cluster B. Lastly, we offered scientific measures for M. lutarioriparius protection. Thus, these results should help for selecting parents in hybridized breeding to exploiting new Miscanthus species and for further utilization in biomass energy and conservation.