The Genetic Basis of Natural Variation in Kernel Size and Related Traits Using a Four-Way Cross Population in Maize

Kernel size is an important component of grain yield in maize breeding programs. To extend the understanding of the genetic basis of kernel size traits (i.e., kernel length, kernel width and kernel thickness), we developed a four-way cross mapping population derived from four maize inbred lines with varied kernel sizes. In the present study, we investigated the genetic basis of natural variation in seed size and other components of maize yield (e.g., hundred kernel weight, number of rows per ear, number of kernels per row). In total, ten QTL affecting kernel size were identified, three of which (two for kernel length and one for kernel width) co-localized with QTL for other components of maize yield. The possible genetic mechanism behind the trade-off between kernel size and yield components is discussed.


Introduction
Maize (Zea mays L.) is one of the most important cereal crops in the world, and increasing maize production by selecting for the components of grain yield is the main objective of maize breeding programs [1]. Maize kernel size, measured by kernel length, width and thickness, is an important component of grain yield. Moreover, kernel size is also an important factor in appearance quality, which may influence corn market grades and consumer preference [2]. Therefore, investigating the genetic basis of kernel size, and discovering any genetic constraints on optimizing it, will facilitate the improvement of grain yield in maize breeding programs.
The QTL detected so far, generally in bi-parental mapping populations, have greatly contributed to the understanding of the genetic basis of kernel size. However, QTL mapping in such populations is limited by low allele numbers and limited recombination [29]. In recent years, multi-parent advanced generation inter-cross (MAGIC) populations have provided an additional option for QTL mapping. Compared with bi-parental linkage populations, the development of MAGIC populations usually involves inter-crossing multiple parental lines, which may introduce more than two independent alleles at a locus and thereby increase the probability that a QTL is polymorphic across the parents [30]. In addition, the precision and resolution of QTL detection can be increased by the larger number of recombination events [31]. In view of these merits, an increasing number of MAGIC populations have been created in model animals and plants. For example, two MAGIC populations have been developed in mice and used to identify candidate genes for serum cholesterol and coat color traits [32,33]. In plants, MAGIC populations were first developed in Arabidopsis and subsequently extended to crops [34]. In recent years, encouraging results have been reported for flowering time, leaf morphology and seed traits of Arabidopsis thaliana [35-37], fruit weight of tomato [38], plant height and shoot traits of wheat [39,40], biotic and abiotic stress of rice [41] and flowering time of barley [42]. Very recently, MAGIC populations were also developed in maize and used for QTL mapping of traits such as flowering time, plant height, ear height and grain yield [43].
Seed size in crops is a life-history trait, and the availability of the resource pool during seed development drives seed production [2,44,45]. Because resources must be apportioned between competing fitness components (i.e., seed size and seed number), a trade-off between seed number and seed size must occur [46]. A better understanding of natural variation in seed size therefore requires simultaneous consideration of its trade-off with kernel number related traits during seed development [37].
Given the potential benefits of multi-parental mapping populations, we developed a four-way cross mapping population in maize [6]. In the present study, we investigated the natural variation in seed size and other seed-related traits. The objectives of this study were to dissect the genetic architecture underlying seed size in maize and, specifically, to examine the genetic mechanism behind the trade-off among seed traits, in order to better understand the genetic basis of kernel size.

Materials and Methods
The experiment was conducted at the Zhengzhou Experiment Station (34°51'N 113°35'E) and the Jiyuan Experiment Station (35°4'N 112°36'E) of Henan Agricultural University (HAU). At the two experimental locations, HAU has set up experimental field bases for non-profit agricultural research with a wide array of partners in China. In the present study, the field experiments at the two stations were approved by HAU. Further, the stations where the field studies were conducted are not protected locations for endangered or protected species.

Plant materials
The four-way cross mapping population, comprising 305 individuals, was developed from the four-way cross D276/D72//A188/Jiao51. The four parental lines were selected based on their agronomic performance for a range of traits in maize breeding programs. All 305 individuals were self-pollinated to develop progeny families. Twenty-eight of the 305 individuals lacked enough self-pollinated seeds, so 277 four-way cross F1 individuals were genotyped for genetic map construction, and their selfed progeny, known as four-way cross families, were used for phenotyping [6].

Field trials and trait evaluation
In 2010, the 277 four-way cross families, together with their four parents, were planted at the Jiyuan Experiment Station and the Zhengzhou Experiment Station. Field experiments at each location were arranged in a randomized complete block design with three replicates. Each plot consisted of one row 4 m long and 0.67 m wide, and was overplanted and then thinned to 15 plants per row at a density of 52,500 plants per hectare.
To determine whether flowering time (FT) affects the trade-off between kernel size and the other traits, FT was recorded as the number of days from planting to the date when 50% of the plants in a row were shedding pollen. At physiological maturity, eight consecutive plants from the center of each plot were harvested by hand for trait measurements. Two ear traits were evaluated: ear row number (ERN) and kernel number per row (KNR). After the ears were dried to a constant weight, the kernels from the middle of the ears in each plot were shelled and bulked. Four kernel traits were measured: 100-kernel weight (HKW), kernel length (KL), kernel width (KW) and kernel thickness (KT). HKW was estimated as the average of three measurements of the weight of 100 randomly selected kernels; KL, KW and KT were estimated as the average of three replicated measurements of 50 kernels randomly selected from the bulked kernels, using electronic digital calipers.

Phenotypic data analysis
Analysis of variance for the phenotypic data was performed using the General Linear Model (Proc GLM) procedure in SAS software [47], and Fisher's Least Significant Difference (LSD) method was used for multiple comparisons. The components of variance were estimated using a random-effect model, and broad-sense heritability (H2) for each trait across the two environments was calculated as defined by Knapp et al. [48]. Phenotypic correlations among traits were calculated by the Pearson correlation method using the mean values of genotypes across environments.
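As a minimal sketch of the heritability calculation, the snippet below uses the entry-mean formulation that is common in multi-environment trials, H2 = Vg / (Vg + Vge/e + Ve/(e·r)), where Vg, Vge and Ve are the genotypic, genotype-by-environment and residual variance components, e is the number of environments and r the number of replicates. We assume, without having verified it against the original derivation, that this matches the definition of Knapp et al. [48]. The function name and the variance components in the usage line are hypothetical and are not values from this study.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean broad-sense heritability (assumed formulation).

    var_g  : genotypic variance component
    var_ge : genotype-by-environment variance component
    var_e  : residual (error) variance component
    n_env  : number of environments (2 in the trial described above)
    n_rep  : number of replicates per environment (3 in the trial)
    """
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be positive")
    # Phenotypic variance of a genotype mean across environments and replicates.
    var_p = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / var_p


# Hypothetical variance components, for illustration only.
h2 = broad_sense_heritability(var_g=4.0, var_ge=2.0, var_e=6.0, n_env=2, n_rep=3)
print(f"H2 = {h2:.3f}")
```

Dividing the interaction and residual components by the number of environments and replicates reflects that heritability here refers to genotype means, not to single-plot observations.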

Genetic map and QTL mapping
The genetic linkage map was constructed using the algorithm proposed by Zhang et al., which is implemented in the software package GACD as functionality CDM [49]. A total of 221 markers were relatively evenly distributed across the 10 maize chromosomes, and the total length of the map was 1799.03 cM [6].
The algorithm of inclusive composite interval mapping (ICIM) for four-way crosses, implemented in GACD software (http://www.isbreeding.net) as functionality CDQ [50], was used for QTL mapping of six traits, i.e., KL, KW, KT, HKW, KNR and ERN. QTL analysis was performed on the mean values of each genotype across the two environments. Inclusive linear models that include marker variables and marker interactions, so as to completely control both additive and dominance effects, were built separately for each trait. Stepwise regression was used to select significant marker variables, which were then used for background control in ICIM of QTL [50]. The probabilities for entering and removing variables were set at 0.001 and 0.002, respectively. The scanning step was 1 cM. The LOD threshold was set at 3.97 using the empirical formula of Zhang et al. [50]. The original genotypes, phenotypes and linkage maps of the four-way cross population are available in S1 Dataset.
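For orientation, the average marker spacing implied by these figures can be worked out as follows. This is a rough illustration only: the second estimate assumes that each chromosome map starts and ends at a marker, which the text does not state, and the variable names are ours.

```python
# Figures taken from the map description above.
n_markers = 221
n_chromosomes = 10
map_length_cm = 1799.03

# Simple density: map length divided by the number of markers.
cm_per_marker = map_length_cm / n_markers

# Mean interval length, ASSUMING every chromosome begins and ends at a marker,
# so that the number of intervals is markers minus chromosomes.
n_intervals = n_markers - n_chromosomes
mean_interval_cm = map_length_cm / n_intervals

print(f"{cm_per_marker:.2f} cM per marker")
print(f"{mean_interval_cm:.2f} cM per marker interval")
```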
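To illustrate how a LOD score relates to the 3.97 threshold, the sketch below computes a LOD score at a single, fully informative marker by comparing a null model (one overall mean) with a model that fits one mean per four-way genotype class. It uses the standard regression approximation LOD = (n/2) log10(RSS_null / RSS_full); it is not the ICIM algorithm implemented in GACD, which additionally uses background markers selected by stepwise regression. The genotype labels, function name and toy phenotypes are hypothetical.

```python
import math
from collections import defaultdict

LOD_THRESHOLD = 3.97  # threshold reported in the text


def lod_at_marker(genotypes, phenotypes):
    """Regression-based LOD score for one marker (illustrative sketch)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    # Null model: a single mean for all individuals.
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Full model: one mean per genotype class at the marker.
    classes = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        classes[g].append(y)
    rss_full = 0.0
    for values in classes.values():
        class_mean = sum(values) / len(values)
        rss_full += sum((y - class_mean) ** 2 for y in values)

    if rss_full <= 0.0:
        raise ValueError("full model fits perfectly; LOD is undefined")
    return (n / 2.0) * math.log10(rss_null / rss_full)


# Toy data: four genotype classes (AC, AD, BC, BD), two individuals each.
genotypes = ["AC", "AC", "AD", "AD", "BC", "BC", "BD", "BD"]
phenotypes = [11.0, 13.0, 9.0, 11.0, 9.0, 11.0, 7.0, 9.0]
lod = lod_at_marker(genotypes, phenotypes)
print(f"LOD = {lod:.2f}, significant: {lod >= LOD_THRESHOLD}")
```

With only eight toy individuals the score stays below the threshold; in a population of 277 families, the same relative reduction in residual variance would give a much larger LOD, because the score scales with sample size.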

Phenotypic variation and heritability
The phenotypic variation of kernel size and related traits among the four parental lines was investigated at the Jiyuan and Zhengzhou locations in 2010, and significant variation was observed for all traits measured in this study, including three kernel size traits (i.e., KL, KW and KT), ear traits (i.e., ERN and KNR) and kernel weight (HKW) (Table 1). Among the four-way cross families, comprising 277 entries, extensive phenotypic variation was observed in kernel size, HKW, ERN, KNR as well as FT (

Correlation of seed size and other traits
Of the traits surveyed in this study, a number of significant pairwise correlations were observed between kernel size and the other traits (i.e., FT, ERN, KNR and HKW) (Table 3).

QTL mapping results of kernel size and related traits
A summary of the QTL detected across environments, including the positions, LOD scores, genetic effects (additive effects a_F and a_M and dominance effect d), phenotypic variation explained (PVE) and the mean values of the four different genotypes, is shown in Table 4. A total of 10 QTL were identified for kernel size, including 5 QTL for KL, 3 QTL for KW and 2 QTL for KT. Single QTL for kernel size explained from 5.51% to 17.94% of the phenotypic variation. Five QTL were identified for HKW, located on chromosomes 1, 3, 5 and 7, and single QTL explained from 6.62% to 8.23% of the phenotypic variation. Six QTL were identified for KNR, including 4 QTL on chromosome 5 and 1 each on chromosomes 1 and 3. Single QTL for KNR explained from 5.13% to 6.85% of the phenotypic variation. Seven QTL were identified for ERN, located on chromosomes 1, 4, 6, 7, 9 and 10; the largest QTL for ERN was located on chromosome 6 and explained 11.24% of the phenotypic variation.
Notes to Table 4: a_F and a_M are the additive genetic effects of the two single crosses, D276×D72 and A188×Jiao51, respectively; d is the dominance effect between the two single crosses. PVE: phenotypic variation explained.
The positions of all detected QTL were marked on the linkage maps, and overlaps between QTL for kernel size and QTL for other traits were observed (Table 4 and Fig 1). The first overlapping region was located on chromosome 5 (bin 5.03/04). In this region, qKW5-1, which affects kernel width, shared the same flanking markers with qHKW5-1 and also with qKNR5-2. The second region was located on chromosome 3 (bin 3.04/05). In this region, qHKW3-1 shared the same flanking markers with qKNR3-1, which had the largest effect for KNR. Moreover, one QTL for KL, qKL3-1, was also detected, which shared the flanking marker umc1347 with qHKW3-1 and qKNR3-1. Another region with closely linked QTL was identified on chromosome 7 (bin 7.02/03). In this region, QTL for both KL (qKL7-1) and ERN (qERN7-1) were identified, and they shared the marker bnlg1792 within the QTL region. Despite the significant correlations among the kernel size traits (i.e., KL, KW and KT), we did not detect any overlapping QTL region for the three kernel size traits.
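The relation between the genetic effects and the four genotype means reported for each QTL can be sketched as follows. The snippet assumes the orthogonal parameterization commonly used for four-way crosses, with A and B denoting the alleles from the first single cross (D276×D72) and C and D those from the second (A188×Jiao51); the sign conventions used in GACD may differ, and the function name and the example means are hypothetical.

```python
def four_way_effects(mean_ac, mean_ad, mean_bc, mean_bd):
    """Decompose four genotype means into mu, a_F, a_M and d.

    Assumed model (orthogonal contrasts):
        AC = mu + a_F + a_M + d
        AD = mu + a_F - a_M - d
        BC = mu - a_F + a_M - d
        BD = mu - a_F - a_M + d
    """
    mu = (mean_ac + mean_ad + mean_bc + mean_bd) / 4.0
    a_f = (mean_ac + mean_ad - mean_bc - mean_bd) / 4.0  # first single cross
    a_m = (mean_ac - mean_ad + mean_bc - mean_bd) / 4.0  # second single cross
    d = (mean_ac - mean_ad - mean_bc + mean_bd) / 4.0    # dominance
    return mu, a_f, a_m, d


# Hypothetical genotype means (e.g., kernel length in mm), for illustration only.
mu, a_f, a_m, d = four_way_effects(12.0, 10.0, 10.0, 8.0)
print(mu, a_f, a_m, d)
```

Under this parameterization, a_F and a_M with the same sign mean that the A and C alleles shift the trait in the same direction, which is the sense in which the direction of additive effects is compared across traits.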
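The overlap criterion described above, in which QTL are grouped when they share a flanking marker, can be sketched as a simple lookup. Only the markers named in the text are included (bnlg1700 and umc2298 for the chromosome 5 region, umc1347 for chromosome 3, bnlg1792 for chromosome 7); the remaining flanking markers from Table 4 are omitted, so the dictionary is deliberately incomplete and the function name is ours.

```python
from collections import defaultdict

# Markers per QTL, restricted to those named in the text (incomplete by design).
qtl_markers = {
    "qKW5-1": {"bnlg1700", "umc2298"},
    "qHKW5-1": {"bnlg1700", "umc2298"},
    "qKNR5-2": {"bnlg1700", "umc2298"},
    "qKL3-1": {"umc1347"},
    "qHKW3-1": {"umc1347"},
    "qKNR3-1": {"umc1347"},
    "qKL7-1": {"bnlg1792"},
    "qERN7-1": {"bnlg1792"},
}


def group_by_shared_marker(markers_by_qtl):
    """Return, for each marker, the sorted QTL that share it (two or more)."""
    qtl_by_marker = defaultdict(set)
    for qtl, markers in markers_by_qtl.items():
        for marker in markers:
            qtl_by_marker[marker].add(qtl)
    return {m: sorted(q) for m, q in qtl_by_marker.items() if len(q) > 1}


shared = group_by_shared_marker(qtl_markers)
for marker in sorted(shared):
    print(marker, "->", ", ".join(shared[marker]))
```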

Discussion
The trade-off between kernel size traits in maize
Seed size is a life-history trait, and the trade-off between seed size and related traits has been widely reported in many plant species [2,37,44,45]. However, few studies have addressed the genetic mechanism behind the trade-off among the factors involved in maize kernel development by taking life-history traits into account. In the present study, a complex genetic mechanism behind the trade-off of seed traits was observed in the four-way cross population. On the one hand, overlapping QTL for kernel size (i.e., qKL3-1, qKL7-1 and qKW5-1) and yield components were observed, and most of them had the same direction of additive effects (a_F and a_M) (Table 4). This indicates that the alleles increasing kernel size and yield components come from the same natural accession, which suggests past directional selection for kernel size and yield components. On the other hand, kernel size (i.e., KW and KT) showed significant negative correlations with ERN and KNR (Table 3), which implies a potential trade-off between them. However, there is little evidence for overlap in their genetic architecture, since no common QTL between these traits were detected.

Comparison with published QTL/gene
In the present study, we mapped 10 QTL for kernel size, three of which (qKW5-1, qKL3-1 and qKL7-1) co-localized with or were adjacent to QTL for one of the components of maize yield (Table 4). We compared these QTL with published kernel-size QTL and identified overlapping QTL that were independent of the genetic background.
qKW5-1, with flanking markers bnlg1700 and umc2298 in the present study, shared the same QTL region with CQTL5-1, a common QTL for kernel width in multiple connected RIL populations in maize [8]. In this region, another QTL for kernel width (qKW5) was also identified in an independent QTL mapping study of kernel size [9] (S1 Table). More importantly, qKW5-1 overlapped with qHKW5-1 (the QTL with the largest effect for 100-kernel weight in the present study). Similar results were reported by Li et al. [8] and Liu et al. [9], who also found that QTL for kernel width overlapped with QTL for kernel weight in this region. Within the qKW5-1 region, ZmGW2-Chr5, which affects kernel size in maize, has been identified and is a possible candidate gene for the QTL [51]. Therefore, this genomic region appears to be very important for grain yield, since the QTL is stably expressed across different genetic backgrounds.
qKL7-1, located in bin 7.02/03, marks another important region for the genetic control of grain yield and kernel traits. In this region, clustered QTL for kernel size and yield-related traits were also identified in independent studies. For example, Li et al. [8] found one common QTL (CQTL7-1) affecting kernel weight, kernel width and kernel thickness in multiple connected RIL populations in maize. Peng et al. [7] identified two QTL, Qqknpp7 and Qqgypp7, affecting kernel number and grain yield per plant, respectively. Other kernel-size QTL in the present study that coincided with known QTL or genes included qKL5-1 with gln1-3 [20] and Yqknpp5 [7], qKT1-1 with CQTL1-2 [8], and qKT5-1 with qKT5-1 [9]. The consistency of QTL and genes across independent studies implies a common genetic basis for these traits (S1 Table).

Joint analysis for multiple related traits
In this study, QTL analysis was performed separately for each of the six traits. In total, 10 QTL for kernel size (KL, KW and KT) and 18 QTL for the other three traits (ERN, KNR and HKW) were detected. However, these traits were highly correlated (Table 3). Multiple-trait analysis takes into account the correlation structure of multiple traits, and can improve the statistical power of QTL detection and the precision of parameter estimation [52]. To date, few studies have focused on QTL mapping methods for jointly analyzing multiple traits in four-way crosses. We will try to develop a statistical method for joint analysis of four-way cross populations in the future.

Implications for molecular-assisted selection (MAS) breeding
In the present study, a total of 10 significant QTL for kernel size were identified, ranging from two for KT to five for KL. These QTL appeared to regulate the individual kernel size traits independently, since no QTL was shared among KL, KW and KT. Such QTL could be valuable, because improvement in one trait can be accomplished without a corresponding decrease in another. We also found at least three QTL (i.e., qKL3-1, qKL7-1 and qKW5-1) that co-localized with QTL for at least one of the other kernel-related traits and had the same direction of additive effects. These QTL may reflect shared genetic regulation of seed size and the components of maize yield, and may be of high value for improving maize yield through MAS.