Variation of phenotypic and physiological traits of Robinia pseudoacacia L. from 20 provenances

To select elite Robinia pseudoacacia L. germplasm resources for production, 13 leaf phenotypic traits and three physiological indicators of 214 seedlings from 20 provenances were systematically evaluated and analyzed. The leaf phenotypic and physiological coefficients of variation among the genotypes ranged from 3.741% to 19.599% and from 8.260% to 42.363%, respectively. The Kentucky provenance had the largest coefficient of variation (18.541%). The average differentiation coefficients between and within provenances were 34.161% and 38.756%, respectively. These close percentages showed that R. pseudoacacia presented high genetic variation both among and within provenances, which can be useful for assisted migration and breeding programs. Furthermore, based on the results of correlation analysis, principal component analysis and cluster analysis, selections targeting the ornamental value, food value, and stress resistance of R. pseudoacacia were performed. Forty and 30 excellent individuals, accounting for 18.692% and 14.019% of the total resources, respectively, were ultimately screened after comprehensively considering leaf phenotypic traits, including compound leaf length, leaflet number and leaflet area, and physiological characteristics, including proline and soluble protein contents. These selected individuals could provide base material for improved variety conservation and selection.


Introduction
Germplasm resources are the collection of all the genes of a species' individuals and play an important role in developing new varieties, discovering important agronomic traits, conserving endangered species, maintaining ecological balance and stabilizing the environment [1][2][3]. Germplasm can be considered a carrier of biodiversity and is highly important [4]. In situ preservation and ex situ preservation are the two standard methods for the protection of germplasm resources. Although in situ conservation maintains the original ecosystem and natural habitat of plants better than ex situ preservation, it requires a large cultivation area and large investments of labor, materials, finances, and time for administration and management. Ex situ preservation acts as a backup for certain aspects of diversity that might otherwise be lost in human-dominated ecosystems and in nature [5][6][7]. Forest trees require a long period and a large area for growth, so ex situ preservation is commonly used to protect these resources. This method is convenient for breeders, as it allows research to be carried out in a timely and efficient manner [8,9]. Since the early 1990s, China has carried out systematic research on protecting forest genetic resources. A preservation system for forest genetic resources was established according to the country's actual situation, forming a framework for preserving and utilizing forest germplasm resources. It was coordinated by the National Forest Germplasm Resources Platform and National Forest Base. Breeding efforts have been carried out on many endangered species, such as Taxus chinensis var. mairei [10,11], dove tree (Davidia involucrata Baill) [12], Emmenopterys henryi Oliv. [13,14], and Cathaya argyrophylla [15], so that populations of endangered species can expand. Forest germplasm resources with excellent characteristics and important economic value have been examined and approved as improved, new and local varieties.
By this means, excellent seedlings and propagation materials should be produced by establishing seed standards, seed orchards, cutting orchards, demonstration forests, and so on. The goal of tree genetic resource preservation is to create ecological and social benefits to maintain the sustainable development of a biological environment. Nevertheless, the utilization of forest resources is still occurring at a slow pace, given the destruction of the environment and the demand for economic development. This situation can be reversed by paying attention to the investigation, protection and utilization of forest germplasm; increasing funding support for forest resource research; improving publicity and public education about forest genetic resources; and raising the whole population's awareness of the need to protect genetic resources [16][17][18][19][20].
Black locust (Robinia pseudoacacia L.) is a multipurpose deciduous tree species. It is suitable for land reclamation, windbreaks, fence posts, raw material for energy plantations, timber, bee-keeping, wood fibre, and forage [21,22]. It was first introduced to China in 1877 and has been extensively planted in 27 provinces [23,24]. Robinia pseudoacacia L. has become an important pioneer afforestation tree in northwest China because it can withstand slightly saline-alkaline and dry soils and grows well in barren mountains [25]. In addition, black locust is an economically valuable tree: bees feed on its flowers to produce honey, the tree grows rapidly, and the durability and strength of its timber make it suitable for building [26][27][28]. However, the haphazard introduction of black locust into many areas of China and a related lack of records about introduced samples have led to confusion concerning the black locust germplasm resources in China. This has greatly restricted breeding efforts and effective utilization of black locust. The germplasm resources of black locust were not systematically utilized and protected until the target-oriented breeding stage began in the 1990s [29][30][31][32].
A prerequisite for plant breeding is the study of genetic variation among individuals and groups of individuals. Plant phenotypes and physiological traits are affected by environmental conditions. Therefore, different plant phenotypes and physiological indicators can reflect the degree of plant adaptability to current site conditions [33,34]. Knowledge about trait variation can provide a more comprehensive understanding of germplasm resource diversity among breeding materials. It can be used to carry out further targeted breeding work based on the characteristics of each germplasm resource. The degree and pattern of variation of germplasm resources can be determined relatively easily. This overall concept forms the basis of genetic breeding and is the most common and effective method for breeders [35][36][37].
Unfortunately, there have been few studies of provenance differences in the leaf phenotypic and physiological parameters of black locust collected within its natural distribution [38]. However, an analysis based on morphological characteristics including germination ability, plant height and diameter at the ground level was performed to evaluate the seed characteristics and variation of black locust from 19 provenances in China. The results showed that interactions between the genotype and environment caused significant differences in tree growth from those provenances under different site conditions [39]. Zhang et al. [40] analyzed the seed survival rate, average height and diameter at breast height (DBH) of black locust trees from different Chinese provenances. They found that the further the provenance was from the test site, the lower its survival and growth rates. At the origin provenance level, Zhou [41] investigated the differences in fruits, seeds, seedling height and diameter at the ground level for annual seedlings from different original black locust provenances. He also assessed the geographical variation in traits of different provenances, and made a preliminary selection of two provenances with high growth. Li et al. [42] surveyed 183 black locust families from 38 provenances and found that the regular geographic variation reflected in the origin of black locust's cold resistance presented a meridional gradient, which increased with increasing longitude.
To explore the diversity of leaf phenotypes and physiology across the natural range of black locust, 20 provenances (19 covering almost the entire natural distribution and one from China, CN) were used to comparatively analyze the variation among sixteen measured traits. Elite trees were then selected. The results provide valuable resources for efficient breeding and germplasm preservation of black locust trees in the future.

Plant materials
In this study, 214 samples of black locust were collected from 20 locations from September to October 2010. They comprised the 19 main black locust natural distribution areas in the United States and one main cultivation area in Henan, China (Table 1) [43]. In these areas, several fruits were collected from normally growing black locust trees with a diameter at breast height (DBH) greater than 20 cm, spaced at 500 m intervals, and mailed to the Henan Academy of Forestry Sciences in China. From April to July 2011, the collected seeds were placed in a greenhouse for 24 h until germination. The successfully germinated seedlings were placed in nutrient bowls filled with nutritive soil and managed normally until they reached approximately 30 cm in height, and were then transplanted to Mengjin Forest Farm (34°49′18″N, 112°28′12″E), Luoyang, Henan, China. Mengjin County lies in a transition zone between subtropical and temperate climates; from 2010 to 2017, the annual average temperature was 15.4°C and the annual average precipitation was 593 mm. The soil is mainly brown soil (accounting for 93%), followed by alluvial soil (accounting for 7%). There was at least one site for each provenance, and each site contained at least two accessions randomly distributed with a plant spacing of 4 m × 4 m.
Two hundred and fourteen well-grown black locusts of different provenances were selected as experimental material from the Mengjin Forest Farm in August 2017. Each specimen was chosen by selecting those whose annual branches were free of pests and disease at the same height in four directions (north, south, east, and west).

Data collection
In this study, 16 quantitative traits of black locust trees were evaluated. They included 13 leaf traits: compound leaf length (CLL), compound leaf width (CLW), compound leaf length/width (CLL/CLW), compound petiole length (CPL), leaflet length (LL), leaflet width (LW), leaflet length/width (LL/LW), leaflet area (LA), leaflet perimeter (LPM), leaflet circularity (LC), leaflet pairs (LP), leaflet number (LN), and petiole angle (PA). The 16 traits also included three physiological traits: chlorophyll content (Chl), total protein content (Spro), and proline (PRO). The 13 leaf traits and Chl were evaluated at maturity in August 2017. The two remaining physiological traits were measured in healthy leaves from April to May 2017 (Table 2). The leaves were collected from trees and rapidly placed in sealed bags containing dry ice, and then transferred to the National Engineering Laboratory for Tree Breeding, Beijing Forestry University, China (40°0′22″N, 116°21′1″E), and stored at -80°C until tested. The compound leaf traits were measured with a ruler with a precision of 0.01 cm. The petiole angle was measured with an electronic protractor with a precision of 0.01°. The leaves were scanned and saved in the same manner, and the remaining leaf characteristics were analyzed using LAMINA version 1.0.2 software. The chlorophyll content in mature leaves of black locust was determined with a SPAD-502 Plus chlorophyll meter (Konica Minolta, Japan); three leaflets were selected randomly from each direction (north, south, east, and west), and each leaflet was measured at 6 positions. The results were then averaged into one measurement. Total protein content and proline were evaluated with a Total Protein Assay Kit (Art. No. A045-4) and a Proline Assay Kit (Art. No. A107), produced by the Nanjing Jiancheng Bioengineering Institute (http://www.njjcbio.com/).

PLOS ONE
Evaluation of differences and selection analysis of 20 Robinia pseudoacacia L. provenances

Data and statistics
Microsoft Excel 2016 was used to examine the variation in leaf phenotypic and physiological traits, including the mean value, standard error, amplitude, and coefficient of variation (CV). SPSS version 24 was used to perform analyses of variance (ANOVAs) in conjunction with Duncan's multiple range tests for multiple comparisons. Principal component analysis was applied to the sixteen traits of the black locust provenances. A p-value for the ANOVA F tests ≤0.05 was considered significant. The formula $V_{st}\,(\%) = \delta^2_{t/s} / (\delta^2_{t/s} + \delta^2_{s})$ was used to calculate the percentage of variance among and within the provenances, where $V_{st}$ is the differentiation coefficient of the trait, $\delta^2_{t/s}$ is the variance component between provenances, and $\delta^2_{s}$ is the variance component within provenances [44]. A covariance correlation matrix was then used to analyze the correlations between clonal populations and geographical populations. The Euclidean distance of each quantitative trait was calculated with the open-source statistical package R; graphical visualization of the results was carried out using MEGA ver. 6.0 [45] after all the tested data had been processed in SPSS version 24. In addition, Mantel's correlation tests were conducted on the Euclidean and geographical distances of all the traits of the black locust trees.
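As an illustration of the differentiation coefficient defined above, the following minimal Python sketch computes Vst from the two variance components. The helper `variance_components` is an assumption added for illustration: it uses the textbook method-of-moments estimate for a balanced one-way layout, whereas the study's own sampling was unbalanced and its variance components came from SPSS, so this is not the authors' procedure. All function names and numbers are hypothetical.

```python
from statistics import mean


def variance_components(groups):
    """Between- and within-group variance components for a balanced
    one-way layout (assumption: every provenance has the same sample size)."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    grand = mean(x for g in groups for x in g)
    # Mean squares of the one-way ANOVA table.
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))
    # Method-of-moments estimate, truncated at zero.
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_between, ms_within


def differentiation_coefficient(var_between, var_within):
    """Vst (%) = between / (between + within), expressed as a percentage."""
    return 100.0 * var_between / (var_between + var_within)


# Made-up trait values for three hypothetical provenances.
demo_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
demo_vst = differentiation_coefficient(*variance_components(demo_groups))
```

The share of variation within provenances is then simply 100 minus Vst, which is how the among/within percentages are paired in the results.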

Differentiation coefficient analysis of sixteen trait parameters
The analysis results (S1 Table in S1 File, Fig 1) showed that the coefficient of variation of the 13 leaf phenotypic traits among the different provenances ranged from 3.741% to 19.599%. It was lowest for LC and highest for LA, which indicates that LC has the smallest dispersion and the highest stability among the leaf phenotypic traits, whereas LA is the opposite. The coefficient of variation of the compound leaf traits was 11.236%, close to that of the leaflet traits (11.301%). At the provenance level, the total coefficient of variation was 11.281%, with the lowest and highest coefficients of variation at 7.286% (MS/AL) and 14.788% (TN), respectively, which shows that the TN provenance had the most abundant leaf phenotypic diversity among the 20 black locust provenances. Similarly, the mean coefficient of variation of the three physiological parameters was 30.993%; it was highest for soluble protein (Spro, 42.363%), followed by proline (PRO, 42.356%) and chlorophyll content (Chl, 8.260%). Spro had the highest stability of these parameters, as revealed by the maximum-to-minimum ratios of each indicator: 6.364%, 8.735% and 19.178%, respectively. Of the 20 provenances, the KY provenance had the highest total average coefficient of variation (52.786%), which was approximately 3.8 times that of the MS provenance, which presented the smallest coefficient of variation.
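For readers who want to reproduce this kind of summary, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation (n - 1 denominator); the text does not say which form was computed in Excel, and the leaflet values are made up.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical leaflet areas (cm^2) for one provenance, for illustration only.
demo_leaflet_area = [10.0, 12.0, 14.0]
demo_cv = coefficient_of_variation(demo_leaflet_area)
```

A trait with a low CV, such as LC in the results above, varies little relative to its mean, which is why it is described as the most stable trait.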

Analysis of the differentiation coefficients of sixteen trait parameters
The differentiation coefficients among provenances ranged from 25.843% (CLW) to 71.655% (PA) at the leaf phenotypic trait level, with an average of 48.829%. The mean leaf phenotypic differentiation coefficient of the nine leaflet traits was 52.259%, higher than that of the four compound leaf indicators (41.112%). Thus the differentiation level for the leaflets was slightly higher than for the compound leaves. In terms of single traits, the differentiation coefficients of CPL, LW, LP, LN and PA among provenances were larger than those within provenances, indicating that the variation among provenances was higher than within them (Table 4). Furthermore, the total mean leaf phenotypic differentiation coefficient among provenances was lower than that within provenances, suggesting that the main variation of black locust was intra-provenance rather than inter-provenance. Similarly, among the three physiological parameters, Spro had the lowest and PRO the highest differentiation coefficient, at 23.956% and 49.137%, respectively. The total mean differentiation coefficient of the three indicators was 40.383% among provenances, but it was 59.617% within provenances. These results are consistent with the findings for leaf phenotypic traits, further demonstrating the importance of individual variation in black locust trees.

Table in S1 File, Fig 2). Among these traits, there were 50 significantly correlated leaf phenotypic traits (P<0.05). For compound leaf traits, CLL/CLW showed a highly significant negative correlation with CLW (P<0.01) but was not significantly correlated with CPL (P>0.05). In terms of leaflets, LC showed significant negative correlations with LW, LL/LW, and LPM, and LW was significantly positively correlated with LA, LPM, and LC (P<0.01). However, CLL/CLW showed significant negative correlations with LL, LW, LA, LPM, and LP (P<0.01). There were 10 significant correlations between phenotypic and physiological traits. CPL and LW exhibited extremely significant negative correlations with all three physiological traits; Chl presented significant positive correlations with LP and LN, and PRO was significantly negatively correlated with CLW and LA (P<0.05). There were no significant correlations between physiological traits, except for a correlation between Spro and PRO.

Principal component analysis of sixteen trait parameters
Five principal components with eigenvalues greater than 1.0 explained the investigated characteristics (Table 5).
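The eigenvalue-greater-than-1 rule used here can be sketched as follows. The function assumes the eigenvalues come from a PCA on the correlation matrix, so they sum to the number of traits; the eigenvalues in the demo are invented for illustration and are not the values in Table 5.

```python
def retained_components(eigenvalues, threshold=1.0):
    """Return (rank, eigenvalue, contribution %, cumulative %) for every
    component whose eigenvalue exceeds the threshold."""
    total = sum(eigenvalues)
    kept = []
    cumulative = 0.0
    for rank, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        if value <= threshold:
            break  # remaining components are smaller still
        share = 100.0 * value / total
        cumulative += share
        kept.append((rank, value, share, cumulative))
    return kept


# Hypothetical eigenvalues for 16 standardized traits (they sum to 16).
demo_eigenvalues = [5.2, 3.1, 2.1, 1.4, 1.1, 0.9, 0.7, 0.5,
                    0.4, 0.3, 0.15, 0.1, 0.03, 0.01, 0.005, 0.005]
demo_kept = retained_components(demo_eigenvalues)
```

The last element of each tuple is the cumulative contribution rate, the quantity typically reported when deciding how many components summarize the traits adequately.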

Cluster analysis based on sixteen trait parameters
Cluster analysis was performed for the twenty R. pseudoacacia provenances using hierarchical clustering with between-groups linkage, taking the sixteen trait parameters as variables (Fig 3). When the Euclidean distance was set to 10, the twenty R. pseudoacacia provenances could be divided into four groups. In addition, based on the results of the principal component analysis, systematic clustering analysis was performed on the eigenvectors of all 16 parameters in the four principal components using the same method described above (Fig 4). The results showed that the sixteen indicators could be classified into four groups when the Euclidean distance was set to 23.
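The clustering step can be illustrated with a small pure-Python sketch of agglomerative clustering using average (between-groups) linkage on Euclidean distances, stopping when the closest pair of clusters is farther apart than a chosen cut. This is an assumption-laden illustration, not the SPSS procedure used in the study: the points are made up, traits are not standardized here, and the cut values of 10 and 23 quoted above may refer to the rescaled distance axis of an SPSS dendrogram rather than to raw distances.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage_clusters(points, cut_distance):
    """Merge the closest pair of clusters (average linkage) until the
    smallest inter-cluster distance exceeds cut_distance."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        pairs = combinations(range(len(clusters)), 2)
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[a], clusters[b]) > cut_distance:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical two-trait means for five provenances, for illustration only.
demo_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (50.0, 50.0)]
demo_groups = average_linkage_clusters(demo_points, cut_distance=5.0)
```

Raising the cut distance merges more clusters, which is why the number of groups reported depends on the distance at which the dendrogram is cut.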

Mantel's test based on sixteen trait parameters
The correlations between the Euclidean distance and the geographical distance of the 13 phenotypic parameters and 3 physiological parameters of the 20 R. pseudoacacia provenances were analyzed. The results are shown in Fig 5 and indicate no significant correlations between the tested traits and the geographical distance of R. pseudoacacia at either the leaf phenotypic or the physiological level.
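A Mantel test of this kind correlates the off-diagonal entries of two distance matrices and judges significance by permuting the rows and columns of one of them. The sketch below is a generic two-sided permutation version written with the standard library only; the text does not say which implementation or how many permutations were used, so the defaults (999 permutations, fixed seed) and the demo matrices are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Return (observed r, two-sided permutation p-value)."""
    identity = list(range(len(dist_a)))
    a_values = upper_triangle(dist_a, identity)
    r_obs = pearson(a_values, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        r_perm = pearson(a_values, upper_triangle(dist_b, perm))
        if abs(r_perm) >= abs(r_obs) - 1e-12:  # tolerance for float rounding
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Made-up positions on a line; both matrices are identical, so r is 1.
demo_positions = [0.0, 1.0, 3.0, 7.0, 15.0, 31.0]
demo_dist = [[abs(p - q) for q in demo_positions] for p in demo_positions]
demo_r, demo_p = mantel_test(demo_dist, demo_dist)
```

In the study's setting, `dist_a` would hold the Euclidean trait distances between provenances and `dist_b` their geographical distances; a large p-value corresponds to the nonsignificant pattern reported above.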

Elite tree selection based on different breeding goals
According to the above analysis, the 13 leaf phenotypic traits showed abundant variation in each provenance, which is helpful for breeding improved R. pseudoacacia plants for ornamental use and food production. Because of the significant positive correlations between CLL, LA, and LN, these three characteristics could also reflect the quantity and quality of R. pseudoacacia leaves. Therefore, when these three traits were combined with respect to food-based breeding objectives, 40 elite trees were selected after the 214 accessions were analyzed (Table 6).
Analysis of the three physiological traits, especially Spro and PRO, provided the basis for the selection of high-quality resistance resources for R. pseudoacacia. Firstly, based on the evaluation of PRO, sixty-three excellent trees (29.439%) were selected. Secondly, based on the evaluation of Spro, eighty-four excellent trees (39.252%) were selected, whose Spro content was approximately 1.5 times that in the original provenances. Lastly, based on the evaluation of both PRO and Spro, thirty excellent trees were obtained (14.019%), whose PRO and Spro contents were 1.8 times those of the original provenances, and the other two indicators also increased compared with those of the original provenances, indicating that the stress resistance of these elite trees increased relative to the original population level (Table 7). The specific names of these elite trees are listed in S3 and S4 Tables in S1 File.
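The selection rates quoted in this section follow directly from the counts and the 214 sampled trees, as the short sketch below shows. The second function is a purely hypothetical selection rule (keep trees above the population mean on every chosen trait); the text does not state the exact criterion used, and the demo records are invented.

```python
from statistics import mean

TOTAL_ACCESSIONS = 214  # number of sampled trees reported in the text


def selection_rate(n_selected, n_total=TOTAL_ACCESSIONS):
    """Percentage of the collection retained, rounded to three decimals."""
    return round(100.0 * n_selected / n_total, 3)


def select_above_mean(records, traits):
    """Hypothetical rule: keep trees above the mean on every listed trait."""
    means = {t: mean(r[t] for r in records) for t in traits}
    return [r["id"] for r in records if all(r[t] > means[t] for t in traits)]


# Made-up records for illustration only (not data from the study).
demo_records = [
    {"id": "T1", "PRO": 30.0, "Spro": 2.0},
    {"id": "T2", "PRO": 55.0, "Spro": 4.5},
    {"id": "T3", "PRO": 20.0, "Spro": 5.0},
    {"id": "T4", "PRO": 60.0, "Spro": 1.0},
]
demo_elite = select_above_mean(demo_records, ["PRO", "Spro"])
```

Requiring a tree to pass on several traits at once necessarily shrinks the selected set, which is consistent with the combined PRO and Spro selection (30 trees) being smaller than either single-trait selection (63 and 84 trees).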

Discussion
Plant variation is closely related to the genetic characteristics of plants and their growth environment. In general, the larger the distribution range of a tree species, the greater the variation, and the smaller the distribution range, the smaller the variation [46,47]. Morphological variation is an important part of genetic variation; the greater the area over which a tree species is distributed, the larger its genetic variation and its leaf phenotypic and physiological differences [48,49].

Analysis of the variation characteristics of the leaf phenotypic and physiological characteristics of different Robinia pseudoacacia L. provenances
In this study, different R. pseudoacacia provenances were planted at the same site to reduce environmental variation. Our results showed that all black locust traits measured in the field varied among the provenances. Thirteen leaf phenotypic traits and 3 physiological indexes of black locust showed significant differences among the 20 provenances. Among all provenances, the TN provenance presented the largest coefficient of variation of phenotypic traits, and the KY provenance presented the largest coefficient of variation both for the physiological indexes and across all 16 tested parameters. The LA, PA, and Spro of the KS provenance were the largest; conversely, its CLL/CLW, LL/LW, LC, LP, LN, Chl, and PRO were the smallest. The degree of variation differed among traits within a single provenance, indicating an imbalance in the degree of variation of leaf phenotypic traits and physiological indexes between different provenances of R. pseudoacacia. At the leaf trait level, different R. pseudoacacia provenances exhibited significant differences, consistent with the results of Granata et al. regarding Acer campestre leaf area morphological characteristics [50]. At the physiological level, compared to Spro and PRO, Chl had the smallest variation coefficient, with maximum-to-minimum ratios of 5.167% (Spro vs Chl) and 5.128% (PRO vs Chl). The large difference between these physiological indicators may be due to the fact that the Chl data were obtained with the portable SPAD-502 Plus chlorophyll meter. Factors such as plant variety (genotype), environmental conditions, planting density, and nutrient conditions can affect SPAD values [51]. Using a spectrophotometric method instead of the portable SPAD-502 chlorophyll meter, a study by Wang et al. showed a high coefficient of variation in the main greening tree species of China's northwestern Liaoning Province [52].
It is commonly held that a coefficient of variation of traits greater than 10% represents a large difference between individuals and equates to a rich variation in traits [53]. In our research, the coefficients of variation of 13 indexes were higher than 10%, with an average total coefficient of variation of 17.924%, showing abundant variation in the leaf phenotypic and physiological indexes of black locust trees, which is the basis of selection. This variability is consistent with the findings of previous studies of the national R. pseudoacacia fine variety base in Ji, Shanxi, which showed a rich diversity in leaf phenotypic traits among 96 genotypes [38]. In addition, the degree of variation of LA and PRO was relatively large, and the degree of variation of LC was small, indicating that different traits were affected differently in the same habitat. This may be related to the black locust's inherent genetic factors and to the influence of environmental factors [46,54].
The differentiation coefficients of the tested indexes in our results showed that the main variation of black locust was intra-provenance variation. This is consistent with the results of previous studies on black locust [43] and other plant species [55,56], which reveals a large potential for the selection of individual variation and can provide opportunities for black locust genetic improvement and germplasm preservation.
Correlation coefficients can be used to reveal the relationships between measured traits and thus greatly influence selections as part of breeding strategies [57,58]. For all the tested parameters, compound leaf and leaflet traits generally revealed moderate and strong relationships, respectively. In particular, LA, LPM, and CLL, as well as CLW, LL, and LW, have been found to be significantly positively correlated in most studies, such as those involving Salix psammophila [59] and Phoebe bournei [60]. Moreover, LL, LW, LL/LW, LA, LPM and LP were significantly negatively correlated with CLL/CLW, indicating that leaflet traits were greatly affected by compound leaf shape. For black locust, a comprehensive assessment of compound leaf and leaflet traits was able to faithfully reflect the phenotypic traits of leaves. In addition, leaf size and shape can effectively reflect changes in the plant's natural environment, as leaves adjust morphologically to water evaporation and heat loss to adapt to the environment. The CLL/CLW, LL/LW, LA, LPM, and LC traits of black locust leaves could also reflect adaptation to the growth environment to a certain extent. For the above reasons, most leaf traits of black locust exhibited complex relationships. Among the three physiological indexes, only PRO and Spro were significantly positively correlated; the lack of correlation with Chl may be because Chl values were obtained with the SPAD-502 chlorophyll meter rather than spectrophotometrically.
Principal component analysis is a multivariate technique widely used for dimension reduction, that is, for condensing multiple related variables into one or a few comprehensive indicators [61]. In our research, leaf traits and physiological parameters were analyzed by principal component analysis. The cumulative contribution rate of the first three principal components was 65.033%, which was lower than that for the R. pseudoacacia germplasm in Shanxi and for Phoebe bournei. Possible explanations for these results could be the weak correlations between these traits and/or the differences in the number and types of measured parameters [3,38,60]. Furthermore, judging from the eigenvalues and variance contribution rates, traits such as compound leaf length, compound leaf width, petiole size, leaflet length, leaflet size, leaflet circumference and leaflet area are the main factors in the phenotypic differences among black locust samples, and Spro and PRO are the main factors in the physiological differences. These traits can be focused on in actual breeding.

Robinia pseudoacacia L. provenances
Cluster analysis is a mathematical method used to find similarities between measured indexes or materials in a group by revealing the real categories of the population and reducing the number of data points [62]. Multiple test parameters were therefore divided into dominant groups by cluster analysis, the most common and most effective classification method. The sixteen indicators could be classified into four groups, as shown in Fig 4. After comprehensive evaluations were performed, group II mainly reflected physiological indexes, and groups I and III mainly reflected leaf phenotypic characteristics of R. pseudoacacia. Groups I, II and III contained the preferred test indicators for achieving the breeding objectives for practical production applications, including stress resistance, ornamental value and food production. Our results are consistent with those of previous work on morphological variation in Paeonia rockii, in which four categories were distinguished based on 12 fruit traits and group II was screened to maximize the economic yield per plant [63].
The results of Mantel's test were similar to the clustering results based on Euclidean distance, revealing a nonsignificant geographical variation pattern of phenotypic and physiological parameters. This is consistent with the results of previous studies not only of black locust, via molecular markers such as SSRs [43] and ISSRs [64] and via allozymes [65,66], but also of other tree species, such as Prosopis alba [67]. Possible explanations for these results are that the collection range of provenances is broad while the numbers are small and/or that black locust has migrated to its present-day range relatively recently.

Elite tree selection of different Robinia pseudoacacia L. provenances
Phenotypic changes following a change in natural selection are particularly important for continuous adaptation [68]; this reflects the ability of plants to grow normally in nature and indicates an ability to protect the environment. Excellent tree selection is a basic method for the genetic improvement of forest trees; this method involves selecting individuals with relatively good comprehensive leaf phenotypic traits after comparisons with other trees under the same site conditions [69]. Some tree breeding programs aim to select a group of black locust trees that could be used to improve ornamental quality, tolerance to soil infertility and food production for livestock. For practical applications in the present study, forty and thirty elite trees (18.692% and 14.019% of the total sample) were selected according to aggregate indicators that were significantly correlated, namely three leaf phenotypic traits (CLL, LA, and LN) and two physiological indexes (PRO and Spro), respectively. The selection rate in our study was lower than that of a comprehensive scoring method for selecting Taxodium distichum by Wang et al. [70]. Different elite trees were selected based on different evaluation indexes, methods and breeding objectives, resulting in different selection rates.

Conclusions
The present study showed the following: (1) LC and LA exhibited opposite variations at the phenotypic level, of which LC had the highest stability; (2) when the differentiation coefficients of the four compound leaf traits, nine leaflet traits and three physiological traits were compared, the differentiation level of leaflet traits was higher than that of the other two types of indexes; (3) the variation of the tested traits is mainly attributed to differences within provenances, although the variation between provenances cannot be ignored; and (4) there was a nonsignificant geographical variation pattern of phenotypic and physiological parameters. Suggestions and elite tree resources for germplasm preservation strategies and efficient breeding are provided to preserve the genetic resources of R. pseudoacacia.