Time-Course Association Mapping of the Grain-Filling Rate in Rice (Oryza sativa L.)

Detecting quantity trait locus (QTLs) and elite alleles that are associated with grain-filling rate (GFR) in rice is essential for promoting the utilization of hybrid japonica rice and improving rice yield. Ninety-five varieties including 58 landraces and 37 elite varieties from the core germplasm collection were genotyped with 263 simple sequence repeat (SSR) markers. The GFR of the 95 varieties was evaluated at five stages, 7, 14, 21, 28 and 35 days after flowering (DAF) both in 2011 and 2012. We found abundant phenotypic and genetic diversity in the studied population. A population structure analysis identified seven subpopulations. A linkage disequilibrium (LD) analysis indicated that the levels of LD ranged from 60.3 cM to 84.8 cM and artificial selection had enhanced the LD. A time-course association analysis detected 31 marker-GFR associations involving 24 SSR markers located on chromosomes 1, 2, 3, 4, 5, 6, 8, 9, 11 and 12 of rice at five stages. The elite alleles for high GFR at each stage were detected. Fifteen excellent parental combinations were predicted, and the best parental combination ‘Nannongjing62401×Laolaihong’ could theoretically increase 4.086 mg grain-1 d-1 at the five stages. Our results demonstrate that the time-course association mapping for GFR in rice could detect elite alleles at different filling stages and that these elite alleles could be used to improve the GFR via pyramiding breeding.


Introduction
Rice (Oryza sativa L.) is a globally important cereal crop and is grown on 132 million hectares annually [1]. Although rice yield has increased in recent decades mainly due to genetic improvement [2], higher productivity is needed to meet the rapid population increase, especially with the reduction of arable land and water [3,4]. The rice yield trait consists of several key components, including grain weight, grain size, grain number, panicle number, and days to heading [5]. And a few QTLs and genes of rice yield related, such as GS3, GS5, GW2, GW5, GW8, GL3, Gdh7, and DTH7 were isolated recently [6][7][8][9][10][11][12][13][14][15][16][17]. Among all rice yield related traits, grain-filling is a complicated and dynamic process determining the final grain yield [18]. In China, hybrid rice has made great contributions to increasing yield since 1976. Compared to conventional rice, the yield of hybrid rice can increase up to 20% [19]. Although the acreage of hybrid indica rice accounts for approximately 80% of the total area of indica rice, hybrid japonica rice only accounts for approximately 5% of the total area of japonica rice in China. One major reason is that the grain-filling and grain plumpness in hybrid japonica rice are poor on large panicles, which is the main manifestation of heterosis of F 1 hybrid rice [20][21][22]. Poor grain-filling and inferior grain plumpness result in decreases of not only the grain yield but also the commodity value [23,24]. Therefore, improving grain-filling will provide new opportunity to increase grain productivity of F 1 hybrid rice.
Despite its importance, only several studies have addressed rice grain-filling in the last decade [25][26][27]. Among the grain-filling related QTLs and genes, the grain incomplete filling 1 (GIF1) is a key regulator of sucrose transport and unloading [25], which encodes a cell-wall invertase required for carbon partitioning during early grain-filling. Using the time-related mapping method, Toshiyuki et al. [26] detected two major QTLs on chromosomes 8 and 12 that were strongly associated with increased filling percentage per panicle. They also reported QTLs of days to heading, accumulated non-structural carbohydrate (NSC) and leaf nitrogen content. In another study, Jia et al. [27] mapped ten additive QTLs for the grain-filling rate using a recombination inbred line (RIL) population that was derived from a cross between Milyang 46 (small grain) and FJCD (large grain) (F 10 generation). Other studies about sucrose synthase (SUS), starch synthase (SS) and ADP-Glc pyrophosphorylase (AGP) [28][29][30][31] revealed that the genes (such as SUS1, SUS2, SUS3, GBSS, OsAPL2 etc.) were the key factors regulating the starch synthase during grain-filling process.
Association mapping using diverse germplasm resources in rice is a new and powerful tool for the elite allele dissection of complex quantitative traits [32][33][34][35]. Agrama et al. [32] detected 25 marker-trait associations using yield data and the components of 92 rice germplasm accessions and 123 SSR markers, suggesting that association mapping in rice is a viable alternative to QTL mapping based on crosses between different lines. To our knowledge, there is not report on the GFR of japonica rice using time-course association mapping. Here, we report marker loci that are significantly associated with GFR at five stages (7, 14, 21, 28 and 35 DAF) using time-course association mapping with 263 SSR markers and a core collection of 95 japonica rice accessions.

Plant materials
Of the 95 diverse rice accessions, 58 were landraces (1-58) from a core germplasm collection that was constructed by Jin et al. [36], and the remaining 37 (59-95) were newly released cultivars. The 95 accessions were collected from six provinces in China (S1 Table).

Field experiment and measurement
The experiment was conducted at Jiangpu Experimental Station, Nanjing Agricultural University, Nanjing, China, in 2011 and 2012. The seeds of 95 rice accessions were sown in the seedling nursery on 15 May, and the seedlings were transplanted with one seedling per hill on 15 June with three replications. Each plot consisted of five rows with eight hills per row, and the hill spacing was 17 cm×20 cm.
Twenty flowers bloomed on the same day, and five plants from each plot were marked with a black color magic pen (product code, 00633385, ML-T1, made in Japan, http://www.guitarmg.co.jp/Japan) on the glume surface. Seven days after marking, the marked fresh grains of one plant in each plot were picked and dried in an oven at 105°C to a constant weight. Then, the dried grains were hulled by hand, and five randomly selected grains of brown rice were weighed on a balance with precision up to 0.001 gram and averaged across three replications.
Similarly, the grains at 14, 21, 28 and 35 days after flowering were harvested, dried, hulled and weighed. The GFR at each stage was calculated as follows: where GFR i is the grain-filling rate (mg grain -1 d -1 ), and GWi is the grain weight (mg grain -1 ) at stage i.

Phenotypic data analysis
The phenotypic data were statistically analyzed using Microsoft Excel 2010. The genotypic and environmental variances of the traits were estimated using the general linear model (GLM) procedure based on multiple environments. The variances were then used to estimate the broad-sense heritability [41] using the following formula: H 2 B = σ 2 g / (σ 2 g +σ 2 e /n), where σ 2 g is the genetic variance, σ 2 e is the error variance, and n is the number of replicates.

Genotypic data analysis
The number of alleles per locus, gene diversity and polymorphism information content (PIC) were analyzed using PowerMarker V3.25 software [42]. The LD coefficient (D') of all markers in pairs was evaluated using the software TASSEL 2.1 [43].
where u and v represent the allelic variation of two loci; p i and q j represent the frequency of the i allelic variation of locus A and the frequency of the j allelic variation of locus B, respectively; x ij represents the frequency of gamete A i B j ; and p i and q j represent the frequency of allelic variation A i and B j , respectively. The theory D' value ranges from 0 to 1. A D' value less than 0.5 indicates LD decay. The decay of LD (with distance in cM) between the SSR loci within the same chromosome was evaluated. The model-based program STRUCTURE 2.2 [44] was used to determine the population structure. Twenty independent runs were performed for each k (from 2 to 10) using a burn-in length of 50,000, a run length of 100,000 and a model for the admixture and independent allele frequency. The true number of populations (K) was often identified using the maximal value of L (K) that was returned by STRUCTURE. A neighbor-joining tree was built by PowerMarker 3.25 based on Nei's genetic distance [45] as calculated by the allele frequencies. The tree was constructed using the MEGA 5.0 software. The coefficient of genetic differentiation (F ST ) was calculated to measure the fixation of different alleles in different populations using the method that was proposed by Weir and Hill [46], and the computing process was completed using the software Arlequin 3.11 [47].

Association analysis
The general linear model (GLM) in the TASSEL 2.1 software was used for association mapping. The population structure (Q) was included as a covariate in the model to test for marker-trait association [48]. The false discovery rate (FDR) was calculated according to method that was proposed by Benjamini et al. [49] to control the expected proportion of falsely rejected hypotheses, and it is the desirable control against errors originating from multiplicity. The allelic effects were estimated compared to the 'null allele' (non-amplified alleles) for each locus [50]. The formula that was used to calculate the average positive (negative) allelic effects (AAE) within a locus was where a c represents the phenotypic value of the c th allele with a positive (negative) effect, and n c represents the number of alleles with positive (negative) effects within the locus.

Temperatures during rice grain-filling seasons
In 2011, the variety Yangguang200 headed on August 12, which was the earliest heading date (HD), and the varieties Huangshanshi and Manyedao headed on September 13, which was the latest HD. The corresponding dates were August 13 and September 10, 2012. The daily maximum, minimum and average air temperatures at Jiangpu Experiment Station throughout the periods of grain-filling of the 95 accessions in 2011 and 2012 are shown in Fig. 1. The daily average temperatures during the rice grain-filling were between 18°C and 30°C, which were under the normal climate conditions for rice grain-filling [51]. The changes in the temperatures during rice grain-filling seasons in both years were similar. These data indicate that the grain-filling among 95 rice accessions proceeded under normal and similar climate conditions in both years.

Phenotypic variation
There were significant differences in the GFR among the varieties at five stages in both 2011 and 2012, with the CV ranging from 36.49% to 118.26% (Table 1). A high value of broad sense heritability was observed for each stage. The means of GFR over 95 accessions at 14 DAF were 1.01 and 1.11 mg grain -1 d -1 in 2011 and 2012, respectively, which were the highest among five stages. Among the 95 accessions, the GFR of sixty-two varieties peaked at 14 DAF and that of thirty-three varieties peaked at 21 DAF (S1 Table). Thus, the 95 accessions could be divided into two groups, group 1 corresponding to 62 varieties and group 2 to the 33 aforementioned varieties, according to the date that the GFR peaked. Using variety Yangguang200 (No. 91) to represent group 1 and variety C-bao (No. 94) to represent group 2, the changes in the GFR at 5 stages in both groups are shown in Fig. 2. The GFR of Yangguang200 was higher than that of C-bao at 7 DAF and 14 DAF, and the situation was reversed after 14 DAF. These results indicate that the varieties in group 1 had a faster GFR than that of the varieties in group 2 during the early stages of grain-filling, while the varieties in group 2 had a faster GFR than that of the varieties in group 1 during the late stages of grain-filling. Ninety-five rice varieties also had significant variation in the mature brown rice weight (BRW). Variety Huangsandannuo (No. 58) had the smallest BRW (16.4 mg), and variety H35 (No. 70) had the largest BRW (29.2 mg) (S1 Table).

Genetic diversity
A total of 263 SSR markers were used to measure the genetic diversity of the population. The average number of alleles per SSR locus was 5.93, ranging from 2 to 19. The average genetic diversity throughout all of the SSR loci was 0.5265, ranging from 0.0208 (RM433) to 0.8915 (RM7545). The average PIC value was 0.4776, ranging from 0.0206 (RM433) to 0.8841 (RM7545) (S2 Table).

Linkage disequilibrium
A linkage disequilibrium analysis was performed for landraces and elite varieties. The frequency of D' value was showed in Fig. 3A.The average D' value was lower in the landraces (0.640) than in the elite varieties (0.713), suggesting that hybridization and artificial selection increased LD during breeding. The regression analysis of the D' value and the genetic distance of syntenic marker pairs (marker pairs on the same chromosome) demonstrate that the minimum distance     of LD decay (D'<0.5) of the landraces and elite varieties is 84.8 cM and 60.3 cM, respectively and that the extent of LD decay is less in landraces than in elite varieties ( Fig. 3B and 3C).

Population structure
The Bayesian model-based simulation of the population structure showed that the L (K) value increased with the increase in k and peaked at K = 7 (Fig. 4A). This result indicates that the population that was used in this study is a mixed population consisting of seven subpopulations. The posterior probability value of each variety belonged to seven subpopulations (SP1, SP2, SP3, SP4, SP5, SP6 and SP7) is shown in Fig. 4B. To support the results of the group structure analysis, we constructed a neighbor-joining tree based on Nei's (1983) genetic distance among the 95 rice varieties (Fig. 4C). The neighbor-joining tree showed that the 95 varieties were clearly divided into seven subpopulations. The landraces were divided into four subpopulations, while the elite varieties were divided into three subpopulations. The analysis results based on the STRUCTURE model and the neighbor-joining tree were basically identical. The average F ST value among the seven subpopulations was 0.5434 (

Time-course association mapping of GFR
The GLM analysis of the marker-trait association revealed twelve SSR markers that were associated with the GFR (FDR<0.05) at 7 DAF located on chromosomes 2, 3, 4, 5, 6, 8, 9 and 11. The PVE ranged from 8.70% to 22.59%. RM1240_Chr11, residing on 1.46Mb, explained the maximum number of phenotypic variations, viz. 18.59% in 2011 and 22.59% in 2012 (Table 3). RM480_Chr5 had the largest positive AAE, and its effect was 3.6 times as large as the smallest that was observed at 7 DAF (Table 3). Eight markers were associated with the GFR at 14 DAF and were distributed on seven chromosomes; of these markers, RM528_Chr6 had the highest PVE of 25.87% in 2011 and 27.19% in 2012 and the highest positive AAE of 0.505 averaged over both years (Table 3). Two markers were associated with the GFR at 21 DAF and were distributed on chromosomes 5 and 11 ( Table 3). The PVE of RM224_Chr11 was slightly larger than that of RM5818_Chr5. Five markers that were associated with the GFR at 28 DAF were detected on chromosomes 2, 4, 5 and 8. RM525_Chr2 had the highest positive AAE of 0.412 at 21 DAF (Table 3). Four markers were associated with the GFR at 35 DAF and were distributed on chromosomes 2, 3, 9 and 12; of these markers, RM1013_Chr9 had the highest PVE and positive AAE (Table 3). Of the 24 SSR markers that were associated with the GFR in both years, seven SSR markers (RM480, RM5818, RM525, RM6361, RM6314, RM224 and RM72) were synchronously associated with the GFR at two stages, and the remaining markers were only associated with one stage (Table 3). RM480, RM525, RM6314 and RM72 were associated with the GFR at both 7 DAF and 28 DAF, RM5818 and RM224 were associated with the GFR at 14 DAF and 21 DAF, and RM6361 was associated with the GFR at 7 DAF and 35 DAF.

Best alleles for GFR at the 5 stages
Among the alleles whose phenotypic effects ranked within the top three, RM3170-160 from variety 'Nannongjing62401' had the largest phenotypic effect at 7 DAF (Table 4) Excellent parental combinations predicted for GFR improvement Fifteen excellent parental combinations for GFR improvement were predicted (Table 5) based on the data of Table 4 and S4 Table. As shown in Table 3, twenty-four SSR loci were significantly associated with the GFR at 5 stages. Among the 11 SSR loci that were detected in variety Nannongjing62401    (Table 5).

Effects of the variability in HD and BRW on GFR
The variation in crop ontogeny, one of the most essential aspects of the field experiment, can cause environmental variation in their subsequent developmental stages. In terms of the GFR in rice, the variation in HD and BRW may lead to environmental variation, which may significantly influence GFR. In this study, while variability in days to heading was observed among the 95 rice accessions, the climate conditions, especially the average daily temperature, were normal and favorable for rice grain-filling in both years (Fig. 1). We also found that the BRW and GFR of the 95 varieties were independent. For example, 'Wanhuangdao' (No. 3) and 'Guozinuo' (No. 4) with nearly the same BRW belonged to group 1 and group 2, respectively, while 'Hongmangshajing' (No. 2) and 'Wanhuangdao' (No. 3) belonged to group 1 with significant differences in the BRW (S1 Table). These results indicate that using 95 core accessions to study the GFR was feasible, even though there was variability in the HD and BRW.

Importance of the time-course method in studying the GFR
Grain-filling in rice is a critical and dynamic process, which highlights the need for a timecourse method. The significant variation in the rice GFR at five stages demonstrates that the dissection of the grain-filling rate by time course is necessary. Grain-filling has also been studied at different stages [25,26], e.g., Toshiyuki et al. [26] studied the grain-filling related QTLs at four stages. If the average GFR of the entire filling stage instead of the individual stage was considered, we could not detect elite alleles at different stages. Additionally, the study of grainfilling is needed at different stages because of the complexity of the grain-filling process involving various genetic mechanisms [52] and environmental factors [26].

Comparison of the genetic diversity of the rice core accessions
Conducting a successful association study requires an appropriate sample size and abundant diversity in phenotype and genotype [53]. Although only 95 rice varieties were included in this study, these varieties were core germplasm accessions that were selected from six provinces in China. These accessions have large variation in phenotype, a coefficient of variation ranging from 36.49% to 118.26%, and an average number of alleles per locus over 263 SSR markers of 5.93, which exceeds most of the reported values in other diversity studies of rice germplasm; the average number of alleles per locus in those studies ranged from 3.88 to 5.89 [34,[53][54][55][56]. The wide range of genetic diversity makes this collection one of the best for pyramiding elite alleles that associated with the GFR in rice. Bold markers represent that they were associated with two grain-filling stages. Digit in parentheses of the third column is the chromosome number.

Implication of the population structure and LD in association mapping
Determining the population structure is essential to avoid false-positive results between the phenotype and genotype in association mapping because of the linkage disequilibrium in natural populations [57]. The genetic structure of rice (O. sativa) has been previously reported [32,58]. Agrama et al. [32] detected eight subpopulations among 103 rice accessions with 123 SSR markers. Similarly, our study has detected seven subpopulations with 263 SSR markers by STRUCTURE and a neighbor-joining tree. Fifty-eight landraces were divided into four subpopulations, while the elites were divided into three subpopulations. The results that were obtained by STRUCTURE analysis and neighbor-joining clustering were consistent except for the clustering of a few varieties. Linkage disequilibrium is the basis of association analysis [59]. In this study, an LD analysis was conducted between landraces and elite populations, and the genetic distance of decay was 60 cM, which is longer than that of previous studies [32,34]. Agrama et al. [32] indicated that LD decays at 20-30 cM using SSR markers, and the LD of the germplasm accessions of Jin et al. [34] did not decay until 25-50 cM. These results indicate that the 95 rice accessions in our study experienced more hybridization and artificial selection than did the accessions of other studies. As a self-pollinated crop, the genetic distance of decay of rice was much longer than that of cross-pollinated crops, such as maize, whose LD diminished over a distance of 2000 bp [60].
Superiority of time-course association mapping in studying the GFR Linkage mapping in families and population-based association mapping are the two main approaches to mapping the relevant genes and identifying the variants that are associated with the traits. Linkage mapping mainly identifies only those loci with the strongest influence. However, for complex traits such as GFR, genetic association mapping has greater power than linkage studies to identify variants with weak effects [61]. Using time-course association mapping, we detected 24 SSR markers on chromosomes 1, 2, 3, 4, 5, 6, 8, 9, 11 and 12 that were associated with the GFR both in 2011 and 2012 (FDR<0.05). By comparing with other studies, we found eight of the 24 SSR markers detected in this study were novel, and the other 16 SSR markers were located near to the chromosome regions harboring grain-filling and yield related QTLs or genes which have been reported. Among the sixteen SSR markers, RM6361_Chr2, RM6266_Chr3, RM3170_Chr5 and RM480_Chr5 were near to the chromosome regions containing QTLs or genes for grain size, such as GW2, GL3, GS5, indicating that grain size might have impaction on grain-filling in the grain maturation process. GL3 was found to have favorable effect not only on grain length but also on grain filling [11]. RM263_Chr3 and RM5475_Chr3 were near to the chromosome regions covering EHD4 and Hd6 of heading-related genes shown in Fig. 6, respectively. Marker RM471_Chr4 associated with GFR at 14DAF and RM6314_Chr4 associated with GFR at both 7DAF and 28DAF were near to chromosome region harboring the gene GIF1 [25] (Fig. 6). RM72_Chr8, RM5389_Chr1, RM5475_Chr3, RM1013_Chr9 and RM528_Chr6 were near to chromosome regions containing the QTLs of filling percentage per panicle, leaf nitrogen content and NSC content detected by Toshiyuki et al [26]. RM314_Chr6, RM276_Chr6 and RM5818_Chr5 were near to chromosome region harboring the starch synthase related genes SUS 2,SSII-3 and Os APL 3 (Fig. 6). RM480_Chr5, RM3170_Chr5 and RM5818_Chr5 were near to chromosome regions containing the qGR-5-8 and qGR-5-9 for GFR [27]. Percentages of phenotypic variations explained by SSR markers detected in our study ranged from 7.48% to 27.19%. These comparisons elucidated that timecourse association mapping could not only detect more loci underlying GFR in rice than linkage mapping could, but also detect loci with weak effects.

Analysis of excellent parental combinations predicted for GFR improvement
Utilization of all the elite alleles that were detected at different filling stages in this study may improve the GFR of F 1 hybrid japonica rice. Among the fifteen excellent predicted combinations, the best parental combination 'Nannongjing62401×Laolaihong' can theoretically improve the GFR by 4.086 mg grain -1 d -1 at all of the stages ( Table 5). Eight of the 15 combinations, especially the top 3, have 'Nannongjing62401' as a parent. Therefore, 'Nan-nongjing62401' can be used as an outstanding parent to improve the GFR of F 1 hybrid japonica rice. Due to the complexity of grain-filling in rice, the performances of all of the excellent predicted parental combinations in rice yield improvement require further verification in rice production.

Conclusions
Eight novel GFR-associated markers were detected in this study by time-course association mapping with the core rice germplasm with a large variation in phenotype. Pyramiding the elite alleles at different grain-filling stages can significantly improve the GFR of rice. Fifteen excellent parental combinations were predicted, of which 'Nannongjing62401×Laolaihong' was the best. Variety 'Nannongjing62401' (No. 73) can serve as an elite parent in combination with other varieties, aiming at the genetic improvement of the GFR. The results of this study indicate that time-course association mapping in elite germplasm seems to be a better approach than Supporting Information S1