Multivariate analysis reveals phenotypic diversity of Euscaphis japonica population

Fruit traits affect population genetic diversity by affecting seed protection and dispersal strategies, and thus constitute important components of phenotypic variation. Understanding phenotypic variation is an indispensable first step in developing breeding strategies. However, little is known about genetic variation in E. japonica, a monotypic species with abundant phenotypes that is mainly distributed in southern China. In this study, we evaluated the phenotypic diversity of 67 E. japonica samples using 23 phenotypic traits. Our results showed that the Shannon–Wiener index (I) of qualitative traits ranged from 0.55 to 1.26, and the color traits had relatively high I values. The average coefficient of variation of compound leaf traits (14.74%) was higher than that of fruit and seed traits (12.77% and 11.47%, respectively). Principal component analysis also showed that compound leaf and fruit traits were important components of total variation. Furthermore, correlation analysis revealed significant correlations between elevation and fruit color, irregular ribs, leaf margin, and leaf texture. The F value within populations was smaller than that among populations, indicating that the variation in phenotypic traits among populations was much greater than within populations. The Dehua and Zunyi populations had the highest coefficients of variation, whereas the Wenzhou population had the smallest, which may be attributed to habitat destruction. According to Q-type clustering, the 67 samples clustered into four groups, with those having similar phenotypes clustering into the same group. In general, leaf and fruit traits had abundant phenotypic diversity, representing the main sources of phenotypic variation. Combined with clustering results and field surveys, this study suggests that the phenotypes of E. japonica fall into two main categories: deciduous E. japonica present at high altitudes, and evergreen E. japonica present at low altitudes. Exploring variation in E. japonica provides a theoretical reference for its classification and diversity, and is of great significance for planning genetic resources and establishing conservation strategies.

Introduction
Tree evolution is largely driven by the adaptation of earlier seed protection and dispersal strategies, which allows diversification into new niches [1]. Seed maturation and dispersal are inextricably linked to fruit type; for example, dry fruits mainly aid seed dispersal by dehiscing. In most trees, fruit color is also crucial for attracting seed-dispersing animals (such as squirrels and birds), and contributes to both commercial and ornamental value [2,3]. Therefore, variation in fruit traits is an important aspect of a population's survival and genetic diversity during evolution.
Genetic diversity, which includes phenotypic, protein, and DNA variation, is a fundamental source of biodiversity and provides the raw material for evolution by natural selection [4]. Studying phenotypic variation is highly recommended as a first step before attempting more in-depth biochemical or molecular studies [5]. Hence, morphological description remains the first step in the preservation of plant genetic diversity [6]. Using variable traits to study phenotypic diversity, and thereby revealing the genetic structure and extent of variation of a population, forms the basis of genetic breeding. Phenotypic variance is thought to be the result of natural selection, reflecting both adaptation to local environmental characteristics and genetic diversity [7][8][9], and is shaped by factors such as meteorological change, trait mutation, and genetic drift [10]. Related studies have confirmed that phenotypic plasticity is a major means by which plants cope with variability in environmental factors [11]. Phenotypic plasticity can evolve when sufficient genetic variation is present [12,13], either due to genetic correlations with other traits that are under selection or due to genetic drift [14]. Thus, plant phenotypes result from interactions between genotype and environment, reflecting genotype adaptation to environmental changes. Phenotypes are formed as a result of long-term stress selection, and represent irreversible processes that can be stably inherited by offspring. Phenotypes reflect plants' environmental adaptability, and thus phenotypic variation is of great significance in adaptation and classification.
Euscaphis (Staphyleaceae) consists of a single species, Euscaphis japonica, distributed only in southern China, Japan, and Korea. It occurs as populations of scattered individuals, associated with forests and broad-leaved species along streams or in valleys [15,16]. According to Flora of China records [15], E. japonica is a deciduous tree or shrub, flowering from April to June, and fruiting from August to December. However, during long-term field observations, we found that E. japonica occurs as either a deciduous or an evergreen tree, with significant phenotypic differences at different altitudes, especially with respect to leaves and fruit. In addition, it has been reported that E. japonica has existed on Earth for 33.9 million years [17], which made us curious as to why Euscaphis contains only one species despite its wide distribution and long survival time. Exploring phenotypic variation could therefore help explain some causes of phenotypic diversity in E. japonica.
Since ancient times, the fruit of E. japonica has been used as a valuable medicinal material by people in southern China to treat headaches, dizziness, colds, urticaria, hernia, and rheumatism [18][19][20][21]. In recent years, several classes of chemical compounds have been isolated from E. japonica, such as triterpenes [22,23], phenolic acids [24,25], flavonoids [8,22], and others [26,27]. In addition, E. japonica is an excellent ornamental tree species because of its unique fruit shape and red pericarp. This tree has been cultivated on a large scale as an ornamental and medicinal tree species in Fuzhou, Jiangxi and Guizhou Provinces, in southern China. However, the economic potential of its cultivation has been poorly addressed. Thus, E. japonica is considered an underutilized species in this region, mainly due to the limited information on its genetic structure and phenotypic characterization, as well as the lack of growing guidelines for producers. Therefore, a strategy for evaluating and conserving genetic resources is necessary to preserve the existing genetic variability of E. japonica in China. In the current study, the main objectives were to use phenotypic characters (i) to reveal the degree of phenotypic variation among and within populations, (ii) to explore the correlation between phenotypic traits and their geographical ecology, and (iii) to provide a theoretical reference for the classification, diversity, and conservation of E. japonica.

Plant material
The field collection was conducted from 2016 to 2017 in southern China during the fruiting period of E. japonica. E. japonica populations occur in small areas of natural communities. We sampled the individuals found with fruit and recorded the latitude, longitude, altitude, and specific habitat of each population (Table 1). Sixty-seven samples from nine populations were studied.
Sampling of the Zunyi and Jiangshi populations was permitted and approved by Guizhou Xishui Nature Reserve and Jiangshi Nature Reserve, respectively; sampling of the Dehua population was permitted and approved by Quanzhou Forestry Bureau; sampling of the Taining population was permitted and approved by Sanming Forestry Bureau; sampling of the Jianyang and Jianou populations was permitted and approved by Nanping Forestry Bureau; and sampling of the Qingliu population was permitted and approved by the Company of Yisheng Agriculture and Forestry, which privately owns the land accessed. Ou Bin (College of Forestry, Jiangxi Environmental Engineering Vocational College, Jiangxi Province) and Zou Zhuang (Zhejiang Academy of Agricultural Sciences) assisted us in collecting samples from the Jiangxi and Wenzhou populations. Furthermore, E. japonica is not a nationally protected tree species. We collected only young leaves, so no damage was done to the trees.

Phenotypic evaluation
Observation of flower morphology in different E. japonica populations showed that it was stable, including color and size. Therefore, flower characteristics are not discussed in this paper. Twenty-three phenotypic traits, both quantitative and qualitative, were measured (Table 2). Because the color traits (compound petiole color, annual branch color, fruit color, and fruit sequence color) hardly change after the fruit cracks, the original records of qualitative traits were obtained from field observations in November 2016 and 2017. The original records were then converted into a form suitable for mathematical operations according to Table 2. There were seven important qualitative traits with corresponding codes (Fig 1). The color codes were divided into four types, encoded by the ratio of green to red (Table 2; Fig 1). There were sixteen quantitative characters, for which the measured or counted value was recorded. Fruit width, fruit length, pericarp thickness, seed length, seed width, and petiole length were measured using Vernier calipers (precision 0.01 mm). Compound leaf length, leaflet length, and leaf width were measured using a ruler (precision 0.1 cm).
The leaf, fruit, and seed indices were calculated as follows: leaf index = leaf length/leaf width; fruit index = fruit length/fruit width; seed index = seed length/seed width.
Phenotypic observations were carried out on twenty fruits, twenty leaves, twenty seeds, and twenty compound leaves per sample. For the qualitative traits of fruits and annual branches, all observations were made during the period of pericarp dehiscence. Fruits and annual branches were randomly collected from the canopy of each tree and brought back to the laboratory for measurement of the quantitative traits of fruits, leaves, and seeds. Seeds were taken from the sampled fruits. Leaves were sampled from the middle of the second or third pairs of compound leaves of the annual branches. To minimize environmental effects on all the phenotypes, data were collected across two years (2016-2017) for each sample [28,29].

Statistical methods
R (version 3.2.5) was used to calculate the Shannon-Wiener index (I) and the distribution and frequency of qualitative traits, as well as the maximum, minimum, average (X), standard deviation (SD), and coefficient of variation (CV) of each quantitative trait, where CV = SD/X. In addition, R was used to analyze the correlation between phenotypic traits and geographical environmental factors in order to explore the effects of geographical environmental factors on phenotypic traits. Each trait was also subjected to one-way analysis of variance (ANOVA) at a significance level of p < 0.05 [30]. Principal component analysis (PCA) is commonly used to study patterns of variation in a set of interrelated traits through the identification of subsets, called factors, which are substantially correlated and simultaneously affect these traits to a large extent [31]. The purpose of principal component analysis was to reduce the number of observed variables to a smaller number of components [6]. Examination of the eigenvalue is required to judge whether a principal component (PC) is meaningful. If the eigenvalue is >1.0, then theoretically, the corresponding PC provides more information than any single variable and may be considered a major factor [32]. In this paper, principal component analysis was performed in R after standardizing all traits. Furthermore, the eigenvalues and the relative proportion of the variance explained by each component were calculated [33].
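The two descriptive statistics named above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and it rests on two assumptions not stated in the text: that I is computed with the natural logarithm, and that SD is the sample standard deviation. The function names are hypothetical. The leaf-texture counts are back-calculated from the frequencies reported in the Results for 67 samples, and the second data set is invented for illustration.

```python
import math
import statistics


def shannon_wiener(counts):
    """Shannon-Wiener index I = -sum(p_i * ln p_i) over the observed classes."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]  # skip empty classes
    return -sum(p * math.log(p) for p in proportions)


def coefficient_of_variation(values):
    """CV = SD / mean, using the sample standard deviation (n - 1 denominator)."""
    return statistics.stdev(values) / statistics.mean(values)


# Leaf-texture counts implied by the reported frequencies for 67 samples
# (73.13%, 20.90% and 5.97% correspond to 49, 14 and 4 trees).
leaf_texture_counts = [49, 14, 4]
print("I =", round(shannon_wiener(leaf_texture_counts), 3))

# Hypothetical measurements of one quantitative trait, for illustration only.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]
print("CV =", round(100 * coefficient_of_variation(measurements), 2), "%")
```

Multiplying the CV by 100 gives the percentage form used in the Results.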
R was also used to cluster all data with Q-type and R-type clustering. Because the traits were measured in different units, their scales and weights were unequal; the data were therefore standardized using standard deviation standardization. To explore the overall similarity of phenotypic traits among different populations, Q-type clustering was performed using the group-average linkage method and Euclidean distance. To explore the degree of correlation between the different traits, R-type clustering was performed using the same clustering method and distance coefficient.
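The clustering procedure described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the R code used in the study: the function names are hypothetical, "standard deviation standardization" is taken to mean z-scores, and the four samples with two traits are invented values.

```python
import math


def standardize(rows):
    """Z-score each column (trait): subtract the mean, divide by the sample SD."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def euclidean(a, b):
    """Euclidean distance between two trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows):
    """Agglomerative clustering; returns a list of (members_a, members_b, height)."""
    clusters = [[i] for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Group-average distance: mean over all cross-cluster pairs.
                d = sum(euclidean(rows[p], rows[q])
                        for p in clusters[i] for q in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical samples (rows) by traits (columns), for illustration only.
samples = [[6.1, 9.8], [6.3, 10.1], [9.0, 14.2], [9.2, 13.9]]
merges = average_linkage(standardize(samples))
for a, b, height in merges:
    print(a, b, round(height, 3))
```

Passing the samples-by-traits matrix groups samples (Q-type); passing its transpose groups traits (R-type). A trait with zero variance would need to be dropped before standardizing, since its SD is zero.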

Analysis of phenotypic diversity
The Shannon-Wiener index (I) is an important index for evaluating the diversity of germplasm resources. Among the seven qualitative traits in this study, I ranged from 0.55 to 1.26, with fruit color having the highest value (FC: 1.26) (Table 3). For color traits, fruit color (FC) and fruit sequence color (FSC) mainly belonged to type 2, with distribution frequencies of 38.80% and 43.28%, respectively, followed by type 1. Compound petiole color (CPC) was mainly type 0. Thus, CPC was greenish, while FC and FSC were reddish. In addition, leaf texture was mainly membranous (73.13%), followed by papery (20.90%) and thick papery (5.97%) (Table 3).
The coefficient of variation (CV) for quantitative traits indicates the degree of dispersion of phenotypic traits. Thus, the larger the coefficient of variation, the greater the dispersion of the measured trait values. The average coefficients of variation of the quantitative traits of E. japonica ranked as follows: petiole length (PL: 23.95%) > seed number (SN: 21.54%) > leaflet circumference (LC: 18.30%) > leaflet area (LA: 17.85%) > pericarp thickness (PT: 15.82%) > fruit length (FL: 14.05%) > compound leaf length (CLL: 13.91%) > fruit shape index (FS: 12.75%) > leaflet width (LW: 12.10%) > leaflet number (LN: 11.42%) > leaf shape index (LS: 10.88%) > seed shape index (SS: 9.96%) > leaflet length (LL: 9.47%) > seed width (SW: 8.57%) > fruit width (FW: 8.43%) > seed length (SL: 5.79%). The coefficients of variation of most seed traits therefore ranked near the bottom, indicating that seed variation was small. Comparing the average coefficients of variation of the eight compound leaf traits, four fruit traits, and four seed traits showed that the average for compound leaf traits (14.74%) was higher than that for fruit traits (12.77%) and seed traits (11.47%), indicating that seed traits were relatively stable during evolution, while compound leaf and fruit traits showed rich variation (Table 4).
To explore the contribution of individual traits to phenotypic variation, we performed principal component analysis on all quantitative traits (Table 5). Using an eigenvalue greater than one as the measure of PC significance, five PCs accounted for 74.23% of the total variability in the data, retaining the majority of the information represented by the original factors. Because PC1 was highly correlated with leaflet area (LA: 0.912), leaflet length (LL: 0.861), leaflet width (LW: 0.806), and pericarp thickness (PT: 0.804), the first principal component represented these inter-correlated traits to a large extent and was therefore the most important genetic factor, accounting for 28.96% of the total variance. Thus, PC1 mainly reflects compound leaf and fruit characters. PC2 correlated positively with leaf index (LI: 0.629) and leaflet number (LN: 0.587), and negatively with seed width (SW: -0.716) and seed length (SL: -0.645). Therefore, PC2 represented leaf and seed traits, accounting for 16.04% of the total variance. PC3 correlated positively with fruit width (FW: 0.701) and fruit index (FI: 0.801), which represent fruit characters. PC4 correlated positively with seed index (SI: 0.614) and petiole length (PL: 0.518), and PC5 correlated positively with petiole length (PL: 0.597). PC4 and PC5 therefore mainly reflected leaf characters (Table 6). In summary, leaf and fruit traits contributed the most to the phenotypic variation of E. japonica.
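The eigenvalue criterion used above can be sketched in a few lines. The sketch assumes that the PCA was run on the 16 standardized quantitative traits, in which case the eigenvalues sum to 16 and each component's share of variance is its eigenvalue divided by 16. The function name and the eigenvalue list are hypothetical; the values are not those of Table 5 and were chosen only so that five components pass the threshold.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Return (k, cumulative share) for components whose eigenvalue exceeds threshold."""
    total = sum(eigenvalues)  # equals the number of traits for standardized data
    kept = [ev for ev in sorted(eigenvalues, reverse=True) if ev > threshold]
    return len(kept), sum(kept) / total


# With 16 standardized traits, the reported shares of 28.96% and 16.04%
# imply eigenvalues of about 0.2896 * 16 and 0.1604 * 16 for PC1 and PC2.
print(round(0.2896 * 16, 2), round(0.1604 * 16, 2))

# Hypothetical eigenvalue spectrum for 16 traits (illustrative, not from Table 5).
eigenvalues = [4.6, 2.6, 2.0, 1.5, 1.2, 0.9, 0.8, 0.6, 0.5, 0.4,
               0.3, 0.2, 0.15, 0.1, 0.1, 0.05]
k, share = retained_components(eigenvalues)
print(k, round(share, 4))
```

The threshold of 1.0 reflects the reasoning in the Statistical methods: a component with an eigenvalue above one carries more variance than any single standardized trait.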
Although principal component analysis (PCA) revealed which traits played an important role in phenotypic variation, it could not identify the relationships between phenotypic traits. Thus, R-type clustering of the 23 traits was carried out. As shown in Fig 2, the clustering result was clearly divided into five groups. Group A contained leaflet area (LA), leaflet width (LW), leaflet length (LL), fruit color (FC), leaf margin (LM), and irregular ribs (IR), which mainly reflected leaflet characters. Group B contained compound leaf length (CLL), fruit width (FW), pericarp thickness (PT), leaflet number (LN), seed shape index (SI), seed length (SL), seed width (SW), fruit length (FL), fruit shape index (FI), and seed number (SN), which mainly reflected fruit and seed characters. Group C contained leaflet circumference (LC), annual branch color (ABC), compound petiole color (CPC), and fruit sequence color (FSC), which mainly reflected color characters. Group D contained only the leaf shape index (LI), and Group E contained petiole length (PL) and leaf texture (LT). Groups D and E also reflected leaf characters. The results of the R-type clustering reflected the correlation between some phenotypic traits, such as the colors of the annual branch, compound petiole, and fruit sequence. Conversely, some traits that appeared close together on the dendrogram, such as leaf margin and irregular ribs, had no obvious relationship, and their genetic background requires further study.
Our sampling range was wide, covering four provinces from low to high altitude. We therefore explored whether phenotypic variation is related to altitude, latitude, and longitude. Comparing the correlation coefficients between geographic factors and phenotypic traits showed that the value for altitude was the highest (8.343), followed by latitude (6.262), and that for longitude was the lowest (4.732) (Table 6), indicating that the effect of altitude on the phenotypic variation of E. japonica was greater than that of latitude and longitude. Furthermore, altitude showed a significant negative correlation with fruit color (FC) and leaf margin (LM), and a significant positive correlation with leaf texture (LT) and irregular ribs (IR).

Analysis of phenotypic diversity within and between populations
A population's phenotypic diversity correlates positively with its coefficient of variation: the higher the coefficient of variation, the richer the phenotypic diversity. Comparing the coefficients of variation of the nine populations showed that Dehua had the highest (18.89%), followed by Zunyi, and Wenzhou had the smallest (Table 4).
The results of the analysis of the 16 quantitative traits in different populations are shown in Tables 7 and 8. The seed index (SI), seed width (SW), leaflet area (LA), petiole length (PL), and leaflet circumference (LC) of E. japonica differed significantly within populations. Among the 16 quantitative traits, all except fruit length (FL) differed significantly among populations. This implied that the variation in quantitative traits among populations was much greater than within populations. The F values of leaflet circumference (LC), fruit length (FL), and seed width (SW) were lower among populations than within populations, whereas the F values of the remaining quantitative traits were higher among populations than within populations, further suggesting the presence of abundant phenotypes among populations. Among compound leaf traits, except for leaflet circumference (LC), the F values among populations were more than double those within populations. Among fruit traits, the F value among populations for pericarp thickness (23.73) was 25.51 times that within populations, followed by fruit width (FW), for which it was 5.55 times that within populations. Among seed traits, the within-population F values of seed index (SI) and seed width (SW) were the highest, at 9.67 and 5.30, respectively. Therefore, compound leaf and fruit traits varied widely among populations, while seed traits were relatively stable (Tables 7 and 8).
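For readers unfamiliar with the F statistic compared above, the following minimal sketch computes the among-group F value of a one-way ANOVA for a single trait. It is a simplification: the study reports F values both among and within populations, which implies a design with more than one level, and that design is not reproduced here. The function name is hypothetical and the three groups are invented values.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical trait values for three populations, for illustration only.
populations = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(populations))
```

A large F value means that the differences between population means are large relative to the spread of individuals inside each population.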
Multiple comparative analyses of compound leaf traits among populations (Table 7) showed that the mean values of leaflet area (LA), leaflet length (LL), leaflet width (LW), and compound leaf length (CLL) were highest in Zunyi, followed by Wenzhou, while Dehua had the lowest values. Interestingly, the leaf index (LI) of Wenzhou was the lowest (1.87 ± 0.18), while Dehua had the highest leaf index (2.46 ± 0.29). Therefore, the Zunyi and Wenzhou populations were defined as large-leaf E. japonica, and Dehua as small-leaf E. japonica. The mean values of fruit width (FW), fruit length (FL), and pericarp thickness (PT) were highest in the Zunyi population, while the Jianyang population exhibited the lowest mean values of fruit width (FW), pericarp thickness (PT), and fruit shape index (FS) (Table 8). Finally, the ranges of seed number (SN), seed length (SL), seed width (SW), and seed shape index (SS) were 1.00-3.00, 3.81-5.47, 2.20-5.55, and 0.88-1.21, respectively, among which the range of seed width was the largest (Table 8).

Cluster analysis between different populations
The dendrogram obtained from the Q-type clustering performed on the 23 phenotypic traits is shown in Fig 3. Using the value 7.42, i.e., approximately 50% of the standardized maximum distance for the separation of groups, as the dendrogram cutting-point criterion [34,35], four groups were distinguished (Fig 3). DH3 and DHY3 comprised group I, and group II contained only sample ZYY5. Fifteen samples comprised group IV, which contained the remaining samples of Zunyi, all samples of Wenzhou, as well as DHY1 and DHY2 of Dehua. Group III included 49 samples from the Jiangxi and Fujian populations, and was further subdivided into five subgroups. The first subgroup included JX2 and JX9 of Jiangxi, and QL5-7 of Qingliu. The second subgroup included some samples of Jiangshi, Taining, and Jianou. TN2 and TN3 were grouped together in the third subgroup. The fourth subgroup mainly included most samples from Jiangxi and Qingliu. The fifth subgroup contained 17 samples from Fujian Province.
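The cutting-point rule described above can be sketched as follows, assuming a linkage method whose merge heights never decrease (as is the case for group-average linkage). Each merge below the cut reduces the number of groups by one, so the group count is the number of samples minus the number of merges at or below the cut. The function name and the merge heights are hypothetical and are not taken from Fig 3.

```python
def groups_at_cut(n_samples, merge_heights, fraction=0.5):
    """Count the groups left after cutting at fraction * maximum merge height."""
    cut = fraction * max(merge_heights)
    merges_below = sum(1 for h in merge_heights if h <= cut)
    return n_samples - merges_below, cut


# Hypothetical merge heights for six samples (five merges), for illustration only.
heights = [1.0, 1.5, 2.0, 8.0, 10.0]
n_groups, cut = groups_at_cut(6, heights)
print(n_groups, cut)
```

Applied to the study's own numbers, four groups from 67 samples would correspond to 63 merges at or below the cutting value of 7.42.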

Discussion
It is known that endemic woody species have low genetic variation, while widely distributed species maintain high variation [35][36][37][38]. In this study, we focused on natural E. japonica populations, and our sampling range covered four provinces, from low to high altitude (100-1000 m). Analysis of 23 traits in nine populations showed abundant phenotypic variation in the leaves and fruits of E. japonica, and principal component analysis also showed that leaf and fruit variation were important components of the total variability. This indicates that variation in leaflet and fruit traits can serve as an important index for evaluating the diversity of E. japonica from different provenances, and that leaves and fruits are important genetic resources for E. japonica breeding. In the R-type clustering, the color traits clustered into group C, and leaf margin and irregular ribs clustered closely together. In addition, altitude had a great influence on phenotypic characters. Interestingly, these results are consistent with our observations in the field. Field investigation showed that the higher the altitude, the redder the fruit; and the redder the fruit, the redder the fruit sequence, annual branch, and compound petiole. We also found that E. japonica with red fruit, prominent epidermal ribs, papery or thick papery leaves, and serrulate margins was present in high-altitude areas (300-1000 m), while E. japonica with inconspicuous fruit epidermal ribs, membranous leaves, and obtusely serrate margins was present in low-altitude areas (100-500 m).
Variation among populations is an important component of species diversity, reflecting population adaptation to different environments, where the magnitude of variation indicates to some extent the species' ability to adapt to different environments [39,40]. In this study, the variation in phenotypic traits among populations was much greater than within populations. Furthermore, the Zunyi and Wenzhou populations were defined as large-leaf E. japonica, and Dehua as small-leaf E. japonica. Analysis of the average coefficient of variation for each population showed that the Dehua and Zunyi populations had the highest coefficients of variation and the Wenzhou population the smallest. This may be because the Dehua and Zunyi populations are located in forests and nature reserves, where human interference is limited and the habitat is well protected.
The evolution of life is driven by natural selection acting on phenotypic trait variation among individuals in a population [41]. Variation in these traits is genetically controlled but can be strongly influenced by environmental conditions [42]. In this study, E. japonica from different provenances clustered into four groups according to similarities in phenotypic traits. In combination with long-term field observation, these four groups clearly formed two types, deciduous E. japonica and evergreen E. japonica, whose qualitative and quantitative traits were quite different. The deciduous E. japonica, comprising the Wenzhou, Zunyi, and Dehua (DHY1, DHY2, DHY3) populations, was located in mountainous areas at 300 m above sea level. The characteristics of these populations are: deciduous trees or shrubs with papery or thick papery leaves, serrulate margins, prominent epidermal ribs, and red exocarp; flowering from April to May, fruiting from July to December, fruit color change in June, and leaves turning yellow in autumn. The Jiangxi population and the populations in Fujian Province constituted the evergreen E. japonica, located in mountainous areas at low altitude (≤500 m). The characteristics of these populations are: evergreen trees or shrubs with membranous leaves, obtusely serrate margins, inconspicuous fruit epidermal ribs, and red and green exocarp; flowering from May to June, fruiting from September to March, and fruit color change in August. We suggest that E. japonica populations at similar elevations experience similar ecological conditions and, over long-term evolution, develop similar adaptations to similar environments, resulting in phenotypic similarities.

Conclusion
In this study, we revealed considerable phenotypic diversity among nine populations of E. japonica from four provinces. Variation in leaf and fruit traits was abundant, indicating broad potential for the improvement of E. japonica, and this rich phenotypic variation provides a material basis for germplasm resources and diversity. Principal component analysis also verified that leaflet and fruit traits contributed greatly to the principal components. In addition, the correlation analysis revealed significant correlations between elevation and fruit color, irregular ribs, leaf margin, and leaf texture. The variation in phenotypic traits among populations was much greater than within populations. Furthermore, the Zunyi and Wenzhou populations were defined as large-leaf E. japonica, and the Dehua population as small-leaf E. japonica. The Dehua and Zunyi populations had the highest coefficients of variation, and the Wenzhou population had the lowest, possibly due to habitat destruction. Q-type clustering grouped the samples according to their phenotypic similarities. All samples clustered into four groups, largely distinguishing E. japonica of different provenances. Combining the clustering results with field surveys showed that the samples clearly formed two types: deciduous E. japonica at high altitudes and evergreen E. japonica at low altitudes. This inventory of E. japonica based on phenotypic descriptions is the first study to give insight into the extent of its phenotypic diversity, and is of great importance for planning genetic resource preservation strategies and establishing collections.