Allele combinations of maturity genes E1-E4 affect adaptation of soybean to diverse geographic regions and farming systems in China

Appropriate flowering and maturity times are important for soybean production. Four maturity genes, E1, E2, E3 and E4, have been molecularly identified and found to play major roles in the control of flowering and maturity of soybean. Here, to further investigate the effect of different allele combinations of E1-E4, we performed Kompetitive Allele Specific PCR (KASP) assays based on single nucleotide polymorphisms (SNPs) at these four E loci, and genotyped the E1-E4 genes across 308 Chinese cultivars covering a wide range of maturity groups. In total, twenty-one allele combinations of the E1-E4 genes were identified across these Chinese cultivars. Various combinations of mutations at the four E loci gave rise to the diversity of flowering and maturity time, which was associated with the adaptation of soybean cultivars to diverse geographic regions and farming systems. In particular, the cultivars with mutations at all four E loci reached flowering and maturity very early, and were adapted to high-latitude cold regions. The allele combinations e1-as/e2-ns/e3-tr/E4, E1/e2-ns/E3/E4 and E1/E2/E3/E4 played important roles in Northeast China, the Huang-Huai-Hai (HHH) Rivers Valley and South China, respectively. Notably, E1 and E2, especially E2, affected flowering and maturity time of soybean significantly. Our study will be beneficial for germplasm evaluation, cultivar improvement and regionalization of cultivation in soybean production.

The SNPs and InDels that contribute to the phenotypic variations of crops are indispensable for the development of functional markers (FMs), which are important for germplasm evaluation and marker-assisted selection (MAS). InDels can be identified easily, but it has long been a challenge to genotype SNPs with traditional molecular techniques. For genotyping of the E1-E4 genes, some allele-specific markers, such as CAPS and dCAPS markers, which rely on PCR, enzyme digestion and gel electrophoresis, have been designed and used for genotyping of SNPs at E1-E4 [15][16][17][18]. However, the procedure of genotyping with these markers is complicated. In recent years, the KASP (Kompetitive Allele Specific PCR) genotyping technology has been applied to identify SNPs in some crops, such as wheat and rice [27,28]. KASP assays can achieve high throughput at low cost, which greatly improves the efficiency of selection in breeding programs. Here, we developed KASP assays for high-throughput genotyping of the E1-E4 genes in soybean. We compared the effects of E1-E4 on soybean flowering at different locations, and analyzed the associations between the allelic variations of the four E genes and the adaptation of Chinese soybean cultivars to diverse geographic regions and farming systems.

Plant materials
A total of 308 soybean [Glycine max (L.) Merrill] cultivars released in 21 provinces of China were collected for genotyping in this study (S1 Table). The cultivars were classified into twelve MGs, including MG 0000, MG 000, MG 00, MG 0 and MGs I to VIII [29,30].

Field trials and phenotyping
One hundred and ninety representative cultivars covering a wide range of MGs (S1 Table) were planted for phenotyping. Field trials were conducted in Beijing (40°13′N, 116°33′E) and Xinxiang (Henan province, China) (35°08′N, 113°45′E) in 2017. The cultivars were planted with two replicates in a randomized complete block design. Each cultivar was planted in a 1.5 m row, with 0.4 m between rows and 0.1 m between adjacent plants. The field management followed local routine practices.
Five uniform plants of each line were selected for measurement of flowering time and maturity time. The flowering time was calculated as the number of days from emergence (VE) to beginning bloom (R1), and the maturity time was calculated as the number of days from emergence (VE) to beginning maturity (R7) [31]. It should be noted that only the date of R1 was recorded in Beijing because of the serious lodging of late-maturing plants at later stages of growth.
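The two traits defined above are simple date differences. A minimal sketch of the calculation follows; the function name and the dates are hypothetical and are not taken from the field records of this study.

```python
from datetime import date


def days_between(emergence: date, stage: date) -> int:
    """Number of days from emergence (VE) to a later growth stage (R1 or R7)."""
    if stage < emergence:
        raise ValueError("stage date precedes emergence date")
    return (stage - emergence).days


# Hypothetical dates for a single plant (illustration only, not study data).
ve = date(2017, 6, 20)   # emergence (VE)
r1 = date(2017, 7, 25)   # beginning bloom (R1)
r7 = date(2017, 9, 28)   # beginning maturity (R7)

flowering_time = days_between(ve, r1)  # days from VE to R1
maturity_time = days_between(ve, r7)   # days from VE to R7
print(flowering_time, maturity_time)
```

In practice the values for the five plants of each line would be summarized per line, but how they were summarized is not stated here.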
KASP assays were developed with the SNPs identified by sequencing. Two allele-specific forward primers were designed, each carrying a unique tail sequence, FAM (5′-GAAGGTGACCAAGTTCATGCT-3′) or HEX (5′-GAAGGTCGGAGTCAACGGATT-3′), and with the targeted SNP at the 3′ end; a common primer was designed to pair with both forward primers. The InDels at the E3 and E4 loci were genotyped with traditional InDel markers [18]. All primers for sequencing and genotyping are provided in S3 Table.
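The primer layout described above (tail, then target-specific sequence, with the SNP base at the 3′ end) can be sketched as follows. The two tail sequences are those given in the text; the flanking sequence, the alleles and the function name are hypothetical placeholders, and the actual primers are those listed in S3 Table.

```python
FAM_TAIL = "GAAGGTGACCAAGTTCATGCT"  # FAM tail sequence given in the text
HEX_TAIL = "GAAGGTCGGAGTCAACGGATT"  # HEX tail sequence given in the text


def kasp_forward_primers(upstream_flank: str, allele_fam: str, allele_hex: str) -> dict:
    """Build two allele-specific forward primers (5'->3').

    Each primer is: tail + sequence immediately upstream of the SNP + SNP base,
    so that the discriminating base sits at the 3' end.
    """
    for base in (allele_fam, allele_hex):
        if len(base) != 1 or base not in "ACGT":
            raise ValueError("each allele must be a single base (A, C, G or T)")
    if allele_fam == allele_hex:
        raise ValueError("the two alleles must differ")
    return {
        "FAM": FAM_TAIL + upstream_flank + allele_fam,
        "HEX": HEX_TAIL + upstream_flank + allele_hex,
    }


# Hypothetical flank and alleles, for illustration only.
flank = "ATGGCTTCAGGATCAACTCC"
primers = kasp_forward_primers(flank, "A", "G")
for dye, seq in primers.items():
    print(dye, seq)
```

This sketch ignores practical design criteria such as melting temperature and primer length, which would normally constrain how much of the flank is used.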

Genotyping of E1-E4
The genomic DNA of the 308 cultivars was extracted from fresh leaves using the CTAB method. The KASP assays were tested with the above-mentioned thirty soybean cultivars (S2 Table). After initial testing, all 308 cultivars were genotyped with the KASP assays and InDel markers. In the KASP assays, the primer mixture comprised 46 μl of ddH2O, 30 μl of common primer (100 μM), and 12 μl of each allele-specific forward primer (100 μM). Assays were tested in 10 μl reactions (4.986 μl of 10-30 ng/μl genomic DNA, 5 μl of 2× KASP master mixture, and 0.014 μl of primer mixture). PCR cycling was performed using the following protocol: hot start at 95°C for 15 min, followed by 10 touchdown cycles (95°C for 20 s; touchdown from 65°C to 55°C with a 1°C decrease per cycle, for 60 s), followed by 30 additional cycles (95°C for 20 s; 57°C for 60 s). The fluorescent endpoint readings were performed on a real-time PCR system (ABI 7500). Traditional PCR was performed using EasyTaq DNA polymerase with 30 cycles of 94°C for 30 s, 58°C for 30 s and 72°C for 1 min, and the PCR products were separated on a 1.0% agarose gel [18].
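The volumes above imply the primer concentrations in the mixture and in the final reaction. The following sketch only carries out that arithmetic on the volumes as stated in the text; the variable names are hypothetical.

```python
STOCK_UM = 100.0  # concentration of each primer stock, in micromolar (from the text)

# Primer mixture volumes in microlitres, as stated in the text.
primer_mix_ul = {"ddH2O": 46.0, "common": 30.0, "allele_FAM": 12.0, "allele_HEX": 12.0}
mix_total_ul = sum(primer_mix_ul.values())

# Concentration of each primer in the mixture (micromolar).
conc_in_mix_um = {
    name: STOCK_UM * vol / mix_total_ul
    for name, vol in primer_mix_ul.items()
    if name != "ddH2O"
}

# Reaction composition in microlitres, as stated in the text.
reaction_ul = {"genomic_DNA": 4.986, "master_mix_2x": 5.0, "primer_mix": 0.014}
reaction_total_ul = sum(reaction_ul.values())

# Final concentration of each primer in the reaction (nanomolar).
final_conc_nm = {
    name: conc * reaction_ul["primer_mix"] / reaction_total_ul * 1000.0
    for name, conc in conc_in_mix_um.items()
}

print(mix_total_ul, conc_in_mix_um)
print(round(reaction_total_ul, 3), {k: round(v, 1) for k, v in final_conc_nm.items()})
```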

Statistical analysis
Analysis of variance (ANOVA) was performed based on four factors including E genes. Data analysis was conducted with the statistical software R (version 3.3.1) [32].
All 308 cultivars were genotyped with six KASP assays and InDel markers. For the E1 locus, two assays were performed for the genotyping of e1-as and e1-fs. The e1-nl allele was identified with PCR-electrophoresis, and amplification products were obtained for all cultivars except 'Dongnong 41'. Among these cultivars, 0.6% carried the e1-fs allele, 31.8% carried the e1-as allele, and 67.2% carried the photoperiod-sensitive E1 allele. For the E2 locus, the e2-ns allele was identified in 83.1% of cultivars, and the wild-type E2 allele was mainly identified in late-maturing cultivars from the south. Five allelic variations of E3 were identified: e3-fs and e3-ns were identified in 2.9% of cultivars, the e3-tr allele was identified in 31.2% of cultivars, and 65.9% of cultivars carried the photoperiod-sensitive alleles E3-Ha or E3-Mi. Three alleles of E4 were
Furthermore, we analyzed the effects of E1-E4 on flowering and maturity of soybean by ANOVA. E4 dropped out of the analysis because no allelic variation was identified at the E4 locus across the cultivars used for phenotyping. The sum of squares of each E gene indicated the effect of the locus on flowering and maturity [18]. The results showed that the effect of E1 on flowering time was larger than that of E2 in Beijing, but the effect of E2 on flowering and maturity time was the largest in Xinxiang (Table 3).
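To illustrate how a sum of squares can express the effect of a locus, the sketch below computes the between-genotype sum of squares for one locus at a time. This is a simplified, one-factor-at-a-time calculation in Python, not the multi-factor ANOVA run in R for this study, and the numbers are toy values invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def between_group_ss(values, groups):
    """Between-group sum of squares: sum over groups of n_g * (group mean - grand mean)^2."""
    grand_mean = mean(values)
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    return sum(len(vs) * (mean(vs) - grand_mean) ** 2 for vs in by_group.values())


# Toy data (NOT from the study): days to R1 for six hypothetical cultivars.
days_to_r1 = [30, 32, 40, 42, 31, 41]
e1_alleles = ["e1-as", "e1-as", "E1", "E1", "e1-as", "E1"]
e2_alleles = ["e2-ns", "E2", "e2-ns", "E2", "e2-ns", "e2-ns"]

ss_e1 = between_group_ss(days_to_r1, e1_alleles)
ss_e2 = between_group_ss(days_to_r1, e2_alleles)
print(ss_e1, ss_e2)
```

In this toy set the E1 genotype accounts for far more of the variation than the E2 genotype; in a full ANOVA the sums of squares of several factors are estimated together and can depend on the order and type of the decomposition.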

Distribution of E genotypes in different maturity groups
The soybean cultivars carrying mutant alleles of E1-E4 genes reached flowering and maturity earlier. The percentages of mutant alleles for E1, E2, E3 and E4 genes were 32.8%, 83.1%, 34.1%, and 3.2% in all 308 cultivars belonging to different MGs, respectively. Some infrequent alleles of E1 (e1-nl, e1-fs), E3 (e3-fs, e3-ns) and E4 (e4-SORE, e4-kes) were identified in a few super early-maturing cultivars from MGs 0000-00 (Fig 2), indicating that the rare mutations may be associated with early maturity. The proportion of mutant alleles for these E genes became smaller from early-MGs to late-MGs (Fig 2). The mutant alleles e1-as and e3-tr were distributed in MGs 0000-Ⅳ and MGs 0000-Ⅱ respectively (Fig 2), while a majority of the Chinese cultivars carried e2-ns and E4 alleles.

Distribution of E genotypes in soybean cultivars from diverse geographic regions and farming systems in China
Soybean is widely distributed in China, and the three major production regions are the Northeast, the Huang-Huai-Hai (HHH) Rivers Valley Region and the South (Fig 3), which differ considerably in terms of day-length and temperature. All the soybean cultivars in the Northeast are planted in the spring, thus the ecotype of the cultivars in this region is simplified as 'North spring (N-sp)'. The soybean cultivars of the HHH region planted in the spring and summer are classified as 'HHH spring (H-sp)' and 'HHH summer (H-su)' respectively. In South China, the cultivars are assigned to 'South spring (S-sp)', 'South summer (S-su)' and 'South autumn (S-au)' based on their growing seasons. Cultivars from twenty provinces were used to analyze the geographic distribution of E genotypes in China. The proportion of mutant alleles for these E genes decreased from north to south (Fig 3). Many mutant alleles or their combinations were identified in the Northeast. Most cultivars from the HHH region carried E1/e2-ns/E3/E4. The allele combination E1/E2/E3/E4 was mainly distributed in the South.
Most mutations or mutant alleles were identified in N-sp cultivars, which were planted across a broad latitude range from 40°N to 53°N (Fig 3) and covered MGs 0000-IV (Table 4). The combination e1-as/e2-ns/e3-tr/E4 accounted for 36.6% of N-sp cultivars and played a dominant role in Northeast China. The cultivars which carried mutations at all four E loci were distributed in high-latitude (above 47°N) cold regions [3,29]. The cultivars from MGs 0-II were adapted to the lower latitudes in the Northeast, and abundant allele combinations were also identified in these MGs. In the HHH region, E1/e2-ns/E3/E4 accounted for 76.0% of HHH cultivars, while e1-as/E2/E3/E4 was mainly identified in H-sp soybean cultivars. Some cultivars carrying E1/E2/E3/E4 were identified in this region. However, the cultivars carrying E1/E2/E3/E4 in the HHH region reached maturity earlier than those from the South region, which resulted in a wide range of flowering and maturity times for E1/E2/E3/E4 (Fig 1).
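The slash-separated notation used for allele combinations lends itself to simple tabulation. The sketch below parses such strings and lists the loci carrying mutant alleles. It assumes, as a convention of this illustration, that mutant alleles are written with a lower-case "e" (e.g. e1-as) and functional alleles with an upper-case "E" (e.g. E3-Ha); the function names and the short cultivar list are hypothetical.

```python
from collections import Counter

LOCI = ("E1", "E2", "E3", "E4")


def parse_combination(combo: str) -> dict:
    """Split a string such as 'e1-as/e2-ns/e3-tr/E4' into a locus -> allele mapping."""
    alleles = [a.strip() for a in combo.split("/")]
    if len(alleles) != len(LOCI):
        raise ValueError(f"expected {len(LOCI)} alleles, got {len(alleles)}: {combo!r}")
    return dict(zip(LOCI, alleles))


def mutant_loci(combo: str) -> list:
    """Loci whose allele is written in lower case (assumed here to mark a mutant allele)."""
    return [locus for locus, allele in parse_combination(combo).items()
            if allele.startswith("e")]


# Hypothetical genotype list (illustration only, not the 308 cultivars of the study).
genotypes = [
    "e1-as/e2-ns/e3-tr/E4",
    "E1/e2-ns/E3/E4",
    "E1/e2-ns/E3/E4",
    "E1/E2/E3/E4",
]

combination_counts = Counter(genotypes)
mutant_counts = Counter(locus for g in genotypes for locus in mutant_loci(g))
print(combination_counts.most_common(1))
print({locus: mutant_counts[locus] for locus in LOCI})
```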

Alleles of E1-E4 were identified to different proportions in Chinese soybean
Genetic and molecular analyses have revealed that various mutations of the E1-E4 genes affect flowering and maturity of soybean [15][16][17][18][24][25][26]. Mutations at the four E loci influence flowering and maturity to different degrees [18,25,26]. The effect of some mutations in the 5′ upstream regions or introns on flowering of soybean is still unclear [18]. Some alleles (e4-kam, e4-tsu, e4-oto and e3-Mo) identified previously in Japanese, Russian and a few Chinese cultivars [16,17] were not detected in this study, implying that these alleles might play minor roles in the improvement of maturity time in soybean breeding in China. Alternatively, some alleles may have gone undetected because too few cultivars were sequenced; therefore, more cultivars of different origins are needed for further identification of natural variations.
Multiple allelic variations of E1 and E3 were identified across the Chinese soybean cultivars used in the study. e1-as and e3-tr were identified in numerous cultivars, indicating that they contributed greatly to the regulation of maturity in Chinese soybean. In contrast, the photoperiod-insensitive alleles of E4 were identified only in several super early cultivars belonging to MGs 0000-000. This implies that the E4 locus might make a small contribution to flowering and maturity of Chinese soybean. For the E2 gene, a majority of the Chinese soybean cultivars possessed the e2-ns allele, which was consistent with previous studies [24,25]. This may be because the E2 locus was not responsible for photoperiod sensitivity but contributed to the expansion of cultivated soybean [33,34]. Understanding the natural variations of the loci controlling flowering and maturity is very important for MAS and molecular design breeding of soybean in the future.

PLOS ONE

E1-E4 played different roles in controlling flowering and maturity of soybean
Each of the E1-E4 genes performs a different function and influences photoperiod sensitivity and flowering time of soybean to a different extent. ANOVA results showed that E1 had a larger effect on flowering time than E2 in Beijing (40°13′N, 116°33′E), but a smaller effect than E2 in Xinxiang (35°08′N, 113°45′E), which may be due to the longer daylength in Beijing than in Xinxiang. Because E1 is induced by long daylength to suppress flowering of soybean [8], it may have had a larger effect under the longer daylength of Beijing. This may also be the reason that soybean flowered earlier in Xinxiang than in Beijing. In previous studies, E1 was shown to have the largest effect on flowering in soybean [18,35,36], and E2 had an obvious effect on maturity under short-day conditions [37]. In this study, E2 showed a large effect on flowering and maturity of soybean. Previous reports also suggested that E2 exhibited pleiotropy across different traits including flowering and maturity in soybean, and also played an important role in regulating traits related to yield and seed quality [38][39][40]. Further studies are thus needed to reveal the molecular basis of the role of the E2 gene in the control of flowering and maturity.
In the background of e2-ns/e3-tr/E4, the cultivars carrying the photoperiod-sensitive E1 allele reached flowering and maturity very early, similar to the cultivars carrying e1-as (Fig 1). However, in the background of e2-ns/E3/E4, E1 delayed flowering and maturity in comparison with e1-as. Thus, E1 hardly repressed flowering in the background of E3 inactivation, suggesting that the inhibitory effect of E1 was under the control of the PHYA gene E3 [8].
The various allele combinations of E1-E4 gave rise to the diversity of flowering and maturity time of soybean cultivars. Cultivars carrying the same allele combination may vary in flowering and maturity time, suggesting that other genes (e.g. E5-E10, J) or other unknown mutations at the four E loci are involved in the control of flowering [12][13][14][18][41][42][43][44]. Gene-environment interaction may be another important factor that affects flowering and maturity time. Further studies should be conducted to identify other genes controlling flowering and maturity time, and to comprehensively analyze interactions among these genes as well as genotype × environment interactions.

Allele combinations of E1-E4 affected adaptation of soybean to diverse geographic regions and farming systems
Mutations of the E1-E4 genes reduced the photoperiod sensitivity and shortened the growth period of soybean cultivars [16][17][18][21][22][23], which is essential for soybean adaptation to long daylength or a short growing season. Various combinations of mutations at the four maturity E loci allowed cultivars to adapt to environments at different latitudes and to different farming systems.
In the Northeast, the N-sp cultivars carrying recessive alleles of the four E genes reached maturity early and belonged to early MGs. Multiple mutations of the four E genes made these soybean cultivars photoperiod-insensitive and able to adapt to the long-day conditions during the growing season at high latitudes in China. In particular, cultivars with mutations at all four E genes were distributed in MGs 0000-00 and matured very early, which is essential for adapting to high-latitude cold regions. Interestingly, these cultivars with mutations at all four E loci were not uniform in flowering and maturity time, indicating that other novel loci besides E1-E4 are involved in the control of flowering and maturity in these early cultivars. In the HHH region, most of the cultivars belonged to medium MGs. A majority of H-su cultivars carried E1/e2-ns/E3/E4, while e1-as/E2/E3/E4 was mainly identified in H-sp cultivars, which were photoperiod-insensitive relative to the summer cultivars, indicating that e1-as/E2/E3/E4 may confer greater insensitivity to long days than E1/e2-ns/E3/E4. In the South, the photoperiod-sensitive allele combination E1/E2/E3/E4 was mainly identified in S-su and S-au cultivars belonging to late MGs and contributed to the adaptation of soybean to short-day conditions. The photoperiod-insensitive S-sp cultivars reached flowering and maturity earlier than S-su and S-au cultivars. Some of the S-sp cultivars belonged to the early MGs Ⅰ-Ⅱ; among them, some carried early-flowering allele combinations, such as e1-as/e2-ns/E3/E4, while others carried the late-flowering allele combination E1/E2/E3/E4 (S1 Table). The mechanism by which some S-sp cultivars carrying E1/E2/E3/E4 flower and mature earlier than other cultivars carrying the same allele combination needs to be further studied.

Conclusion
Various combinations of mutations of the E1-E4 genes gave rise to the diversity of flowering and maturity time of soybean and enabled the cultivars to adapt to different ecological regions and multiple cropping systems in China. The allele combinations e1-as/e2-ns/e3-tr/E4, E1/e2-ns/E3/E4 and E1/E2/E3/E4 played important roles in the adaptation of the soybean cultivars to the Northeast, the HHH region and the South of China, respectively. The KASP assays developed in this study will facilitate germplasm characterization and MAS for maturity in soybean breeding programs.