Assessment of Soybean Flowering and Seed Maturation Time in Different Latitude Regions of Kazakhstan

Soybean is still a minor crop in Kazakhstan despite an increase in planting area from 4,500 to 11,400 km2 between 2006 and 2014. However, the Government’s recently accepted crop diversification policy projects the expansion of soybean cultivation area to more than 40,000 km2 by 2020. The policy is targeting significant expansion of soybean production in South-eastern, Eastern, and Northern regions of Kazakhstan. Successful realization of this policy requires a comprehensive characterization of plant growth parameters to identify optimal genotypes with appropriate adaptive phenotypic traits. In this study 120 soybean accessions from different parts of the World, including 18 accessions from Kazakhstan, were field tested in South-eastern, Eastern, and Northern regions of the country. These studies revealed positive correlation of yield with flowering time in Northern Kazakhstan, with seed maturity time in Eastern Kazakhstan, and with both these growth stages in South-eastern Kazakhstan. It was determined that in South-eastern, Eastern and Northern regions of Kazakhstan the majority of productive genotypes were in maturity groups MGI, MG0, and MG00, respectively. The accessions were genotyped for four major maturity genes (E1, E2, E3, and E4) in order to assess the relationship between E loci and agronomic traits. The allele composition of the majority of accessions was e1-as/e2/E3/E4 (specific frequencies 57.5%, 91.6%, 65.0%, and 63.3%, respectively). Accessions with dominant alleles in either E3 or E4 genes showed higher yield in all three regions, although the specific genotype associated with greatest productivity was different for each site. Genotype-environment interaction studies based on yield performances suggest that South-east and East regions formed one mega-environment, which was well separated from North Kazakhstan where significantly earlier time to maturation is required. 
The results provide important insights into the relationship between genetic and phenotypic patterns in new soybean growing territories in Kazakhstan.


Introduction
Soybean is an important crop globally, with high nutrition, protein and oil values [1], [2]. Despite this, it is still a minor crop in Kazakhstan, where farmers predominantly grow wheat. However, the recently adopted crop diversification policy in Kazakhstan has driven a steady increase in soybean harvesting area from 4,500 km2 in 2006 to 11,400 km2 in 2014. In 2012 the Ministry of Agriculture of the Republic of Kazakhstan proposed to increase the area under soybeans to 40,000 km2 by 2020, and lands in South-eastern (SEK), Eastern (EK), and Northern (NK) regions of the country were designated for this purpose [3]. The total yield of soybean increased from 17,000 tons in 2007 to 20,000 tons in 2014. The increase in production is not straightforward, as yield from 2012 to 2014 did not grow while the sowing area for that period increased nearly to 40,000 hectares [4]. This suggests that in addition to basic agronomic knowledge a detailed understanding of soybean adaptation to the new environments is required.
The productivity of soybean is largely dependent on flowering and maturity times in diverse ecological environments [2], [5], [6], [7]. Therefore, in Northern America 13 maturity groups (MG) were classified for breeding purposes [8]. The classification was found useful as it helped to compare patterns of seed maturation across many different environments [5], [6]. Soybean adaptability is primarily defined by expression of flowering genes [7], which may adjust duration of heading and maturity to maximize resource capture while minimizing the effect of abiotic stresses for specific environments. Flowering genes have been assigned to a series (E1-E8) [6], [9]. E1, E2, E3, and E4 and their roles in flowering time and maturation have been characterized [9], [10], while the identity and function of the E5-E8 genes remain largely unknown [5]. It was shown that E1-E4 genes are involved in regulation of both pre-flowering and post-flowering growth of plants under different photoperiod length [11]. E1 is a flowering repressor and encodes a transcription factor that contains a putative nuclear localization signal and a region related to the B3 DNA-binding domain [7], [12]. E2 is an orthologue of the Arabidopsis flowering gene GIGANTEA [2], [13]. E3 and E4 encode phytochrome A (PHYA) proteins GmPHYA3 and GmPHYA2, respectively [14], [15]. Dominant alleles of E1 and E2, as well as the e1-as allele at the E1 gene, delay time to maturation, while recessive alleles at E3 and E4 provide insensitivity to photoperiod length [11]. It was suggested that in the signaling pathway the two photoreceptor genes E3 and E4 suppress transcription of E1 and correspondingly elevate GmFT expression, which directly influences earliness of flowering time [9].
To date in Kazakhstan there are 12 officially registered commercial soybean cultivars, developed mainly for the EK and SEK regions, which have a longer history of soybean cultivation than other regions in the country [16]. These cultivars were previously genotyped using DNA microsatellite markers and compared to a large number of foreign accessions [3], [17]. The authors characterized each cultivar with 50 SSR (simple sequence repeat) markers and found that local accessions were clearly distinct from those cultivated in Northern America and East Asia [17]. However, no characterization of specific flowering genes has been carried out for these accessions. Therefore, little information is available on the genetic bases of soybean adaptation to different environments within Kazakhstan.
The objective of this study was to evaluate whether differences in time of flowering and maturation in a genetically diverse soybean panel grown in different latitudes of Kazakhstan can be associated with particular allele combinations of major E series flowering genes. The data obtained provides useful information for selection of the most appropriate parental genotypes to be used in local breeding programs for improved soybean productivity.

Materials and Methods
The soybean material consisted of 120 genetically diverse cultivars and lines from the collection of the Kazakh Agricultural Research Institute (http://www.kiz.kz, Almaty region, Kazakhstan). The accessions can be obtained by direct request from Dr. Didorenko (svetl_did@mail.ru). The collection is currently a prime source for ongoing breeding projects in SEK, EK, and NK regions of Kazakhstan. The research material represents 12 countries from 5 geographic regions, including Western and Eastern Europe, Northern America, Eastern Asia, and Kazakhstan (S1 Table). It includes 18 accessions from Kazakhstan containing 5 officially registered cultivars in SEK. The collection was grown in three randomized replicates in breeding stations of SEK, EK, and NK regions of Kazakhstan. The locations and meteorological data of field conditions are shown in Fig 1. SEK was irrigated while EK and NK stations were non-irrigated sites. Plants were grown in 1 meter rows with 30 cm between rows and 5 cm between plants within rows. Times to flowering and maturation were studied at R stages described by Fehr et al. [18]. The averaged data for triplicated trials were analyzed statistically for assessments of plant growth and yield components in three studied locations. The accessions were genotyped for E1, E2, E3, and E4 alleles according to protocols published elsewhere [5], [19], [20], [14], respectively. DNA extraction and agarose gel electrophoresis were performed according to Abugalieva et al. [17]. Assignment of samples to maturity groups in each region was carried out using thermal time (TT) degrees (°Cd) at R8 stage (TT-R8) according to Fehr et al. [18]. Time from emergence to flowering was designated as VE-R2, from flowering to pod development as R2-R4, from pod development to seed maturation as R4-R8, and time from emergence to maturation as VE-R8.
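As an illustration of how thermal time in °Cd accumulates over a growth interval such as VE-R8, the following minimal Python sketch sums daily degree-days above a base temperature. The base temperature of 10 °C, the use of the daily (min + max) / 2 mean, the function name `thermal_time`, and the temperature values are all assumptions made for illustration; they are not taken from this study or from Fehr et al. [18].

```python
from typing import Iterable, Tuple


def thermal_time(daily_min_max: Iterable[Tuple[float, float]],
                 t_base: float = 10.0) -> float:
    """Accumulate thermal time (degree-days, °Cd) over a run of days.

    daily_min_max: (t_min, t_max) air temperatures in °C for each day.
    t_base: base temperature in °C (assumed value, not from the study).
    Days whose mean temperature is below t_base contribute zero.
    """
    total = 0.0
    for t_min, t_max in daily_min_max:
        t_mean = (t_min + t_max) / 2.0
        total += max(0.0, t_mean - t_base)
    return total


# Hypothetical five-day temperature record (°C); not data from this study.
days = [(12.0, 24.0), (14.0, 28.0), (8.0, 10.0), (15.0, 27.0), (16.0, 30.0)]
tt = thermal_time(days)
print(f"Accumulated thermal time: {tt:.1f} °Cd")
```

Summing from emergence to the day a plot reaches R8 would give a TT-R8 value of the kind used here for maturity group assignment, under the stated assumptions.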
Statistical analyses of data, including Pearson's correlation and t-test, were performed using GraphPad Prism (version 5.00 for Windows, GraphPad Software, San Diego California USA, www.graphpad.com). Genotype-environment interaction patterns, including AMMI (Additive Main effects and Multiplicative Interaction) and GGE Biplot methods, were studied using the GenStat package (16th release, VSN International, Hertfordshire, UK). The symmetric scaling option of both methods and available field data for all three sites were used in estimations.
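For readers who wish to reproduce this kind of analysis outside GraphPad Prism, the two statistics can be sketched in plain Python as below. This is a hedged illustration only: the function names `pearson_r` and `student_t` are hypothetical, the pooled-variance (Student) form of the unpaired t-test is an assumption about which variant was applied, and the numbers are invented rather than taken from the field trials. The sketch returns the t statistic and degrees of freedom; the P value would then be read from a t distribution.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def student_t(a, b):
    """Unpaired two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Equal variances are assumed.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Invented example values: days to flowering and yield per plant (g).
days_to_flowering = [31.0, 35.0, 40.0, 44.0, 49.0]
ypp = [18.0, 21.0, 22.0, 26.0, 29.0]
print("r =", round(pearson_r(days_to_flowering, ypp), 3))

# Invented yield values for two allele groups at one locus.
t, df = student_t([22.0, 25.0, 27.0, 24.0], [18.0, 20.0, 19.0, 21.0])
print("t =", round(t, 3), "df =", df)
```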

Ethics Statement
No specific permissions for field studies were required as all breeding organizations involved are governmental research institutions and participants of the ongoing project supported by the Ministry of Education and Science of the Republic of Kazakhstan.
Five major groups of ten or more accessions were identified among the seventeen E genotype combinations found in this study (Table 1). The number of genotypes characterised was reduced to sixteen in EK and fifteen in NK because accessions with genotypes e1-n/E2/E3/E4 in NK and E1/e2/E3/E4 in EK and NK did not form seeds.

Evaluation of adaptability of soybean collection in three regions of Kazakhstan
The diverse collection (S1 Table) was analyzed for key plant growth stages by three breeding organizations representing SEK, EK and NK. Accessions were studied for the length of six developmental stages at TT-R2 and TT-R8 stages (S2 Table).
As locations differed in latitude and longitude there were differences in daylength, meteorological conditions (Fig 1), and sowing dates (S2 Table). The correlations for all growth stages across the three locations were very high (S3 Table), except for stage R4-R8, where the correlation between SEK and NK stations was less significant (P<0.02). The TT-R2 duration was similar between SEK and EK sites but was 300 degrees shorter in NK (Table 2). The duration of TT-R8 in SEK was longer by nearly 300 degrees compared to EK and NK sites, indicating that the greatest contrast at these two stages is between SEK and NK (Table 2).
The durations of TT-R8 were used to classify accessions into maturity groups (MG) in the three sites (Table 3). Since the collection was characterized by 5 MGs in SEK, 3 MGs in EK, and only 2 MGs in NK, it was concluded that the number of MGs reduced in higher latitudes. The largest groups were MG00 in SEK and EK (86 and 61 accessions, respectively), and MG000 (44) in NK. Local breeding lines fell into the largest groups, except in SEK where Kazakh accessions were assigned to MGI and MGII (S1 Table). In each site, the highest average yield per plant (YPP) was recorded for the latest maturity group (Table 3).
Performance of soybean collection in South-eastern region. Four out of five local cultivars belonged to MGI, which contained 16 accessions. The fifth local accession was one of the two samples in MGII (S1 Table). As the TT-R8 duration of the 16 samples in MGI was significantly higher than those of MG0 and MG00 accessions (P<0.0001), their yield components were assessed in comparison to samples from two earlier flowering MGs. It was also determined that MGI was superior to MG0 and MG00 for the average YPP (Table 3). The highest yielding individual accessions were SD74 and SD117, which were characterized by the e1-as/e2/E3/E4 genotype and assigned to MGI and MGII groups, respectively (S1 Table). It was noted that the local standard cultivar SD116 (E1/e2/E3/E4) also performed as a high yield accession with late time to flowering and maturation.
The role of each gene in plant performance was examined by comparing field data for each parameter. For the SEK location a two-tailed t-test indicated that alleles at E1 and E4 were highly significant for plant growth performance (S4 Table). Specifically, diversity at E1 was statistically significant for time to flowering (P<0.001), while that at E4 played an important role for time to maturity and yield (P<0.01). In pairwise analysis significant allele combinations for maturation and YPP were also recorded for E1/E4 (P<0.001). This trend was confirmed for three E gene comparisons (S4 Table).
Performance of soybean collection in Eastern region. Five local accessions bred in SEK did not form pods in the EK field conditions and, therefore, only 115 out of 120 accessions were analyzed. The TT-R8 values of the collection fell into 3 MGs (Table 3), suggesting that later ripening groups may not be suited to the region. The analysis of growth stages showed similar TT-R2 duration and 264 degree shorter TT-R8 duration in comparison with the SEK region. Most accessions (61) of the collection were grouped into MG00, including 4 out of 5 accessions bred in EK region. However, the highest YPP was recorded in the smallest group MG0 (10 accessions). Within MG0 three accessions had e1-as/e2/e3/E4, three accessions e1-as/e2/E3/E4, two accessions e1/e2/E3/E4, and the remaining two accessions had e1/e2/E3/e4 and e1/E2/E3/E4 genotypes. SD91 (e1/e2/E3/e4) and SD82 (e1/e2/E3/E4) showed 31 days to flowering time and were the earliest flowering genotypes within MG0. The latest flowering accession was SD75 (e1-as/e2/e3/E4), followed by SD74 (e1-as/e2/E3/E4), which flowered at 49.7 and 49.0 days, respectively. When averages of flowering time for these contrasting pairs of accessions were compared by using yield data, it was determined that early flowering genotypes were maturing 13.5 days later and their yield per plant was higher by 11.4 g.
Nine accessions marked in S1 Table were also analyzed in a high latitude region of China [6]. Therefore, it was interesting to compare MG reference varieties with results in this study (Table 4). The data from field trials in EK was selected for comparison, as latitude in this site was similar to reported Chinese conditions (49°57' vs 50°15'). It appeared that 5 out of 9 accessions showed similar MGs, while 4 others were found in earlier MGs in China.
The t-test was applied in order to determine the significance of each gene and their combinations in the analysis of growth stages and yield components for all studied 115 accessions in EK (S4 Table). It was found that the E1 genotype is statistically significant for pre-flowering time ([…]). Hence the largest difference between these groups was 16.5 days, which was the biggest factor contributing to the YPP in the region (P<0.0001). It is interesting to note that none of five local promising breeding lines was found to have the e1/E3/E4 genotype in EK (S1 Table).
Table 3 (partial). NK: n = 85; MG000, 44 accessions, YPP 21.3±1.5; MG00, 41 accessions, YPP 27.4±2.1. SEK, South-eastern Kazakhstan; EK, Eastern Kazakhstan; NK, Northern Kazakhstan; n, number of accessions; MG, maturity groups; YPP, yield per plant (g); n/a, not available.
Performance of soybean collection in Northern region. NK was the highest latitude testing site in this work and only 85 accessions of the collection formed seeds in all three replication blocks. Therefore, only those 85 accessions were selected for further comparative studies. The YPP in NK and EK were mostly similar to each other, but significantly higher than in SEK (P<0.0001, Table 2). The TT-R8 values in NK separated the accessions into MG00 and MG000, with the number of accessions nearly evenly split between the two groups (41 and 44, respectively; Table 3). All eight local accessions from NK were grouped into MG000, although the TT-R8 was longer (P<0.0001) and the YPP was higher (P<0.05) in MG00 in comparison to MG000. Within MG00 the largest group was e1-as/e2/E3/E4 genotype (14 accessions) and the highest YPP was recorded for the E1/e2/E3/e4 genotype (5 accessions), which included three accessions from the Ukraine (S1 Table).
Pairwise comparisons of genes showed that E1 and E4 had the highest statistical correlation with time to flowering and maturity and YPP (S4 Table). As in the other two environments (SEK and EK), the individual contributions of E2 alleles to flowering and maturation in NK were not significant in this study. In three gene analyses the most frequent allelic combination was e1-as/E3/E4 (30 accessions), which included five locally bred accessions. In an E1 background all three genotypes had the same duration of seed maturation but differed by 4.2 days in flowering time between E3/e4 and e3/E4 genotypes, suggesting that late flowering is beneficial for higher yield components in the region. The comparison of E1/E3/e4 and e1-as/E3/E4 using t-test showed statistically significant difference for duration of R2 and R8, and YPP (P<0.01).

Genotype-environment interaction patterns
Flowering and seed maturation times, as two key developmental phases of soybean, were analyzed for correlation to YPP in the three sites (S2 Table). Late flowering time was positively correlated with yield in SEK (P<0.0001) and NK (P<0.001) but not in EK. Late maturation time was very significant for the yield in SEK and EK (P<0.0001, for both cases), but was insignificant for NK site. The comparison of time to flowering and maturity in top productive genotypes in each region to performances of genotypes recommended for each MG reported in [2] is shown in Table 5. In SEK and EK regions the two genotypes with highest productivity in either site have been selected because their best genotype was represented by one accession only (Table 5). Results confirmed the advantages of the selected top genotypes (Table 1) and suggested the optimal range of time to flowering and maturity for each experimental site. The allele combinations in four E genes between selected top genotypes in this study and suggested genotypes for each MG proposed in [2] were different in all three sites (Table 5). There were differences in growth performances of top high-yielding genotypes among studied sites, indicating the reduction of TT-R8 duration from 2754 °Cd in SEK to 2138 °Cd in NK.
The YPP values from field trials were used for estimation of genotype-environment interaction (GEI) patterns based on AMMI and GGE Biplot methods (Fig 2). The AMMI effects of ANOVA (analysis of variance) detected large environmental contribution to the interaction (89.52%), while the sum of Genotype (G) and Genotype x Environment (GE) impacts was only 10.48%. The principal coordinate analysis of AMMI (Fig 2A) suggested that PC1 effectively discriminated SEK and EK sites from NK site, and PC2 allowed the differentiation of SEK from EK. Rather similar results were obtained using GGE Biplot methods, although in this case SEK and EK were combined to the same Mega-environment (Fig 2B).
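The percentage contributions reported above come from partitioning the yield variation into environment (E), genotype (G), and genotype x environment (GE) sums of squares. The following Python sketch shows only that additive partition for a table of genotype-by-site means; it is not the GenStat procedure used in this study. AMMI additionally applies a singular value decomposition to the interaction residuals to obtain PC1 and PC2, which is omitted here. The function name `partition_ss` and the yield values are hypothetical.

```python
def partition_ss(table):
    """Percentage of G, E and GE sums of squares for a two-way table.

    table[g][e] is the mean yield of genotype g in environment e.
    One value per cell is assumed, so no error term is estimated.
    """
    n_g, n_e = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n_g * n_e)
    g_means = [sum(row) / n_e for row in table]
    e_means = [sum(table[g][e] for g in range(n_g)) / n_g for e in range(n_e)]

    ss_g = n_e * sum((m - grand) ** 2 for m in g_means)
    ss_e = n_g * sum((m - grand) ** 2 for m in e_means)
    # Interaction residual: what remains after removing both main effects.
    ss_ge = sum(
        (table[g][e] - g_means[g] - e_means[e] + grand) ** 2
        for g in range(n_g)
        for e in range(n_e)
    )
    total = ss_g + ss_e + ss_ge
    return {"G": 100 * ss_g / total,
            "E": 100 * ss_e / total,
            "GE": 100 * ss_ge / total}


# Hypothetical YPP means (g) for three genotypes at three sites.
ypp_table = [
    [10.0, 24.0, 26.0],
    [12.0, 27.0, 23.0],
    [15.0, 25.0, 28.0],
]
shares = partition_ss(ypp_table)
print({k: round(v, 2) for k, v in shares.items()})
```

A large E share with small G and GE shares, as reported here, indicates that most yield variation is explained by site differences rather than by genotype ranking changes between sites.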

Characterization of genotyping diversity in tested soybean accessions
Despite soybean being a typical short-day species it is successfully cultivated in long-day high latitude regions of North America, Europe and China [2], [6]. There are many reports suggesting that adaptation to certain environments is directly related to the relationship between flowering gene alleles and time to maturation of soybean [7], [21], [22]. Hence, it was important to find out which genotypes show highest productivity in different regions of Kazakhstan. Therefore, the breeding collection of soybean both in current harvesting areas and in recently developed high latitude territories in Kazakhstan was characterized for E flowering gene composition. Five major groups of genotype combinations (n >10) were identified from the seventeen variants found for the four E gene alleles of the breeding collection in this study (Table 1). The genotypic structure of the breeding collection was comparable to those reported in China [2], [6] and for photoperiod-insensitive accessions in Japan [11]. It was found that most accessions tested in Chinese reports were E1/e2/E3/E4 followed by e1-as/e2/e3/E4, while in the Japanese report they were e1-as/e2/e3/e4 followed by e1-as/e2/e3/E4 and e1/e2/e3/E4. The largest group of accessions in the current study was e1-as/e2/E3/E4, which was moderately common in studies in China but absent in field trials in Japan. The discrepancy in genotypic content in this study may be due to more frequent exchange of germplasm with partners from those Eastern European countries that were part of former USSR.
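The grouping of accessions by four-locus genotype and the per-locus allele frequencies can be tallied as in the short Python sketch below. It assumes that each accession is recorded as a slash-separated string in the order E1/E2/E3/E4, as written in this article; the function name `allele_frequencies` and the five-accession panel are hypothetical and are not the study data.

```python
from collections import Counter

LOCI = ("E1", "E2", "E3", "E4")


def allele_frequencies(genotypes):
    """Per-locus allele frequencies (%) from strings like 'e1-as/e2/E3/E4'."""
    counts = {locus: Counter() for locus in LOCI}
    for genotype in genotypes:
        alleles = genotype.split("/")
        if len(alleles) != len(LOCI):
            raise ValueError(f"expected 4 alleles, got: {genotype!r}")
        for locus, allele in zip(LOCI, alleles):
            counts[locus][allele] += 1
    n = len(genotypes)
    return {
        locus: {allele: 100 * c / n for allele, c in counter.items()}
        for locus, counter in counts.items()
    }


# Hypothetical panel of five accessions; not the study collection.
panel = [
    "e1-as/e2/E3/E4",
    "e1-as/e2/E3/E4",
    "E1/e2/E3/e4",
    "e1/E2/e3/E4",
    "e1-as/e2/e3/E4",
]
freqs = allele_frequencies(panel)
combos = Counter(panel)  # accessions per four-locus genotype combination
print(freqs["E1"])
print(combos.most_common(1))
```

Filtering `combos` for counts above a threshold would give the major genotype groups of the kind listed in Table 1.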

Comparison of MG in regions with different latitudes
The comparison of MGs in three different latitude regions of Kazakhstan showed a reduction from five groups in SEK (MGII-MG000) to two groups in NK (MG00-MG000); in each trial the latest MG always provided better yield (Table 3). In 23 out of 120 soybean accessions the MG groups were similar across all three sites, and 93 accessions showed similarity for at least two sites. The MG classification is therefore still a convenient instrument for breeding purposes [2], [5]. In spite of a few exceptions the trend was the reduction of TT-R8 duration from warmer to colder regions with higher latitude (Table 3). Since the highest yield components were registered for EK and NK (Table 2) it was important to determine an optimum duration of growth in those previously unstudied high latitude regions. It was found that the most productive accessions in EK and NK had average TT-R8 values of 2150 and 2138 °Cd, respectively (Table 5). Therefore, the accessions within this range of maturation were assigned to MG00. According to [2], MG0 and MG00 both were genotyped as e1-as/e2/E3/E4 and e1-as/e2/e3/E4. However, in this study the optimal MG00 genotypes were e1-n/e2/E3/E4 and e1-n/E2/E3/E4 in EK, and E1/e2/E3/e4-n in NK, suggesting that additional genetic factors associated with time to maturation should be involved in the determination of maturity groups.

Associations of flowering genes and maturity time in different latitude regions
The E series of genes were shown to be important factors for plant adaptation in soybean growing regions [12], [14]. It was recently hypothesized that three of those E genes (E1, E3, and E4) play larger roles in pre- and post-flowering photoperiod responses [11]. The authors underlined the role of E3 and E4 genes in photoperiod insensitive genotypes for early flowering and seed maturation. In particular, they proposed a gene regulatory network comprising three known maturity genes, a determinate habit gene (Dt1) […] Table).
In higher latitude regions of EK and NK the patterns of associations between flowering genes and yield were different to some extent. The highest yield in the collection in EK was recorded for the e1/e2/E3/E4 genotype (11 accessions). Dominant alleles of E3/E4 significantly delayed the R2-R8 stage, resulting in the longest VE-R8 among 17 genotypes (S5 Table). The e1/e2/E3/E4 genotype showed 3.4 days earlier flowering and 10.3 days later post-flowering difference in comparison with e1-as/e2/E3/E4, suggesting that E3/E4 is prolonging seed maturation when E1 and E2 are dysfunctional. The GEI analysis based on yield performance suggested that early flowering time is essential in SEK and EK, and, therefore, this result effectively separated SEK and EK from NK (Fig 2).

Conclusions
This study is the first attempt to assess soybean adaptation in different latitude regions of Kazakhstan based on E gene variation and field trial results. Although e1-as/e2/E3/E4 was the most common genotype in this collection, it was not among the top yield performing genotypes in three studied regions. Specific allele combinations of the four E genes and optimal ranges of time to flowering and maturity were proposed for each experimental site based on identification of top genotypes with higher productivity. The results confirmed the high importance of E1 for length of flowering, and of E3/E4 for time to maturation. The MG analyses based on yield performance and GGE Biplot method showed that SEK and EK regions have more similarity and comprised one mega-environment while NK region formed another.
Supporting Information S1