High-Density Genetic Linkage Map Construction and QTL Mapping of Grain Shape and Size in the Wheat Population Yanda1817 × Beinong6

High-density genetic linkage maps are necessary for precisely mapping quantitative trait loci (QTLs) controlling grain shape and size in wheat. Using the Infinium iSelect 9K SNP assay, we constructed a high-density genetic linkage map with 269 F8 recombinant inbred lines (RILs) derived from a cross between Yanda1817, a Chinese cornerstone wheat breeding parental line, and Beinong6, a high-yielding line. The map contains 2431 SNPs and 128 SSR and EST-SSR markers, covering 3213.2 cM in total with an average interval of 1.26 cM per marker. Eighty-eight QTLs for thousand-grain weight (TGW), grain length (GL), grain width (GW) and grain thickness (GT) were detected in nine ecological environments (at Beijing, Shijiazhuang and Kaifeng) over five years from 2010 to 2014 by inclusive composite interval mapping (ICIM) (LOD≥2.5). Among these, 17 QTLs for TGW were mapped on chromosomes 1A, 1B, 2A, 2B, 3A, 3B, 3D, 4A, 4D, 5A, 5B and 6B, explaining 2.62% to 12.08% of the phenotypic variation. Four stable QTLs for TGW were each detected in five or seven environments. Thirty-two QTLs for GL were mapped on chromosomes 1B, 1D, 2A, 2B, 2D, 3B, 3D, 4A, 4B, 4D, 5A, 5B, 6B, 7A and 7B, explaining 2.62% to 44.39% of the phenotypic variation. QGl.cau-2A.2 was detected in all environments and explained the largest share of phenotypic variation, indicating that it is a major and stable QTL. For GW, 12 QTLs were identified, explaining 3.69% to 12.30% of the phenotypic variation. We found 27 QTLs for GT, explaining 2.55% to 36.42% of the phenotypic variation. In particular, QGt.cau-5A.1, which explained 6.82–23.59% of the phenotypic variation, was detected in all nine environments. Moreover, pleiotropic effects were detected for several QTLs responsible for grain shape and size; these could serve as target regions for fine mapping and marker-assisted selection in wheat breeding programs.


Introduction
Wheat is the third highest producing cereal crop after maize and rice, and is the leading source of vegetable protein in human food. The demand for wheat in the developing world is projected to increase by 60% by 2050, while production is expected to be affected negatively by climate change and natural resource depletion (FAO). Wheat is a staple food used to make flour for different kinds of products. Grain weight and size are targets for breeding, not merely because they are major components of grain yield, but also because of their impact on milling and baking quality [1]. Moreover, grain size can partially explain the process of crop domestication [2].
Grain weight and size are complex quantitative traits controlled by a number of genes and significantly influenced by the environment. Grain weight and size can be divided into a number of components, including thousand-grain weight (TGW), grain length (GL), grain width (GW) and grain thickness (GT) [3][4][5][6][7]. Previous studies have shown that TGW has high heritability and is phenotypically the most stable yield component [8].
Genetic dissection of grain weight and size in bread wheat, however, is greatly hampered by an enormous genome size (~17 Gb), a complex genome structure (allohexaploid, 2n = 42, AABBDD), and the prevalence of repetitive DNA. A well-saturated genetic linkage map is a powerful tool to dissect the genetic elements responsible for grain weight and size. Both restriction fragment length polymorphisms (RFLP) and simple sequence repeats (SSR) have been used in linkage map construction, and an increasing number of QTL studies have been conducted in attempts to analyze the genetic basis of grain weight and grain size in wheat [5,7,20–27]. However, RFLP markers have shown very low levels of polymorphism between wheat cultivars, although they are co-dominant and highly reliable. In contrast, SSR markers reveal a higher level of polymorphism in wheat, but constructing high-density genetic linkage maps with them is very laborious. The need to study complex traits with very high-density genetic linkage maps, together with progress in polymorphism detection and genotyping techniques, has promoted the recent development of single nucleotide polymorphism (SNP) markers in wheat. Meanwhile, next generation sequencing technology makes it possible to find more SNPs between wheat cultivars. The wheat Infinium iSelect 9k SNP genotyping assay was developed based on transcriptomes of 26 accessions of hexaploid wheat generated using Roche 454 and Illumina platforms [28].
In this paper, we report: 1) construction of an integrated SNP and SSR high-density genetic linkage map using Yanda1817/Beinong6 recombinant inbred lines (RILs) and an Illumina Infinium 9k SNP chip, and 2) QTL mapping of TGW, GL, GW and GT traits controlling grain shape and size in common wheat.

Ethics Statement
No specific permission was required for the study. The field studies did not involve endangered or protected species.

Plant Materials and Field Trials
Yanda1817, a pure line derivative of wheat landrace Pingyao Xiaobaimai from Shanxi Province, was one of the 'cornerstone parental' breeding lines for the Northern China Winter Breeding Program between 1950 and 1960. Yanda1817 is highly tolerant to drought, cold winters and poor soil fertility, and has very strong tillering ability and taller plant height. More than fifty registered wheat cultivars, mostly grown in the Northern Winter Wheat Zone of China, have been generated from Yanda1817 in different breeding programs [29]. Beinong6 is a semi-dwarf high-yielding 1B/1R derivative released in the 1990s by Beijing University of Agriculture. Beinong6 consistently has larger grain size and higher kernel weight than Yanda1817. Recombinant inbred lines (RILs) of Yanda1817/Beinong6 were selected for high-density linkage map construction and QTL mapping because the RIL population is known to segregate widely for agronomic traits, such as plant height, yield and presence/absence of awns.
The mapping population for QTL analysis comprised 269 F8 to F12 recombinant inbred lines (RILs) derived from Yanda1817/Beinong6 by single seed descent. Compared to Yanda1817, Beinong6 shows a higher TGW and larger grain size.
Yanda1817, Beinong6 and the 269 RILs were grown in Beijing (BJ, E116.10, N40.08), Shijiazhuang (HB, E114.36, N37.38) and Kaifeng (HN, E114.23, N34.52) (S1 Fig.). The trials were performed in a randomized complete block design, and each treatment had three replicates except for E1 and E2, which had one replicate. Each plot had two rows that were 2 m long and 25 cm wide, and 30 seeds were evenly planted in each row. Field management was the same as commonly practiced in wheat production.

Testing of Grain Traits
From the center of the rows, ten representative plants were selected and harvested as samples for measuring TGW in grams, and GL, GW and GT in millimeters. The seeds were fully cleaned and dried, and broken grains were removed before trait evaluation. TGW was recorded using an electronic balance to determine the average weight of two (E1, E2, E3, E8, E9) or three (E4, E5, E6, E7) independent samples of 100 grains. GL, GW and GT were measured for 10 random grains from each RIL for each replication using vernier calipers. Trait values of each year-location combination (defined as one environment) were used for QTL analysis.

Statistical Analysis
The broad-sense heritability (H = VG/VP, where VG is the genotypic variance and VP is the phenotypic variance) of each trait was estimated from the variance components obtained by ANOVA. The correlation coefficients (r) between pairs of all four traits were calculated using SPSS 20.
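As an illustration of the H = VG/VP estimate, the following minimal sketch derives variance components from a balanced one-way ANOVA with lines as the only factor. The function name, the data layout, and the choice to express VP on an entry-mean basis (VP = VG + VE/r) are assumptions made for this example; the text does not state which ANOVA model or basis was used.

```python
from statistics import mean

def broad_sense_heritability(plots):
    """Estimate H = VG / VP from a balanced one-way ANOVA.

    plots: dict mapping line name -> list of replicate trait values.
    Assumption: VP is taken on an entry-mean basis, VP = VG + VE / r.
    """
    lines = list(plots)
    g = len(lines)                     # number of lines (genotypes)
    r = len(plots[lines[0]])           # replicates per line
    if g < 2 or r < 2 or any(len(v) != r for v in plots.values()):
        raise ValueError("needs >= 2 lines and a balanced design with >= 2 replicates")

    grand = mean(x for v in plots.values() for x in v)
    line_means = {k: mean(v) for k, v in plots.items()}

    # Sums of squares between lines and within lines (error)
    ss_g = r * sum((m - grand) ** 2 for m in line_means.values())
    ss_e = sum((x - line_means[k]) ** 2 for k, v in plots.items() for x in v)

    ms_g = ss_g / (g - 1)
    ms_e = ss_e / (g * (r - 1))

    var_g = max((ms_g - ms_e) / r, 0.0)  # negative estimates are truncated at zero
    var_p = var_g + ms_e / r
    return var_g / var_p if var_p > 0 else 0.0
```

For example, `broad_sense_heritability({"a": [1, 3], "b": [5, 7]})` gives a value close to 1 because most of the variation lies between lines rather than between replicates.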

DNA Extraction
Genomic DNA was extracted from two-week-old leaf tissue of Yanda1817, Beinong6 and each RIL using the cetyltrimethyl ammonium bromide (CTAB) method [30]. DNA was quantified by 1% agarose gel electrophoresis with λ DNA as the standard.

SSR and EST-SSR Genotyping
Genomic SSR and EST-SSR markers (Xcau) were screened for polymorphisms between Yanda1817 and Beinong6. Primer sequences for the Beltsville Agricultural Research Center (BARC), Gatersleben wheat microsatellite (GWM), Wheat Microsatellite Consortium (WMC), INRA Clermont-Ferrand (CFA, CFD) and Gatersleben D-genome microsatellite (GDM) markers were obtained from the GrainGenes website (http://wheat.pw.usda.gov/GG2/index.shtml) and publicly available information [31][32][33]. EST-SSR markers were developed according to flanking sequences of microsatellite motifs in wheat ESTs deposited in public EST databases. The polymorphic markers were used to genotype the RIL population. The PCR reactions were performed with an ABI9700 in a total volume of 10 μL containing 10 mM Tris-HCl, pH 7.5, 50 mM MgCl2, 0.2 mM dNTP, 25 ng of each primer, 0.75 U of Taq polymerase, and 50 ng of genomic DNA as the template. After an initial denaturing step for 5 min at 94°C, 35 cycles were performed of 45 s at 94°C, 45 s at 55-60°C (depending on the specific primers), and 70 s at 72°C, with a final extension at 72°C for 10 min. PCR products were separated in 8% non-denaturing polyacrylamide gels, visualized by silver staining and photographed.

Infinium iSelect SNP Genotyping
A total of 9,000 SNPs were selected based on their distribution across genome and frequency in the discovery population [28]. SNP genotyping was performed on the BeadStation and iScan instruments and conducted at the Genome Center of the University of California at Davis according to the manufacturer's protocols (Illumina). Single nucleotide polymorphism allele clustering and genotype calling was performed with GenomeStudio v2011.1 software as described in Cavanagh et al. [28]. A genotype calling algorithm was generated for bread wheat using an iterative process to account for observed shifts in SNP allele cluster positions caused by differences in the number of duplicated (homeologous and paralogous) gene copies detected between assays [28,34].

High Density Linkage Map Construction and QTL Analysis
The linkage map was constructed with MultiPoint software and MAPMAKER/EXP version 3.0 [35] with a minimum LOD of 3.0 and a maximum recombination fraction of 0.372. The Kosambi mapping function was used to convert the recombination frequencies into map distances in centiMorgans (cM) [36], and the genetic linkage maps were drawn using the software Map Draw V2.1 [37]. Co-segregating markers were regarded as a single polymorphic locus. QTL analysis was performed with inclusive composite interval mapping in IciMapping 3.2/4.0, which is based on stepwise regression with simultaneous consideration of all marker information (http://www.isbreeding.net/). The 'Deletion' command was used to handle missing phenotypes, and the step size chosen was 1.0 cM. A QTL was declared significant at a LOD value of at least 2.5.
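The Kosambi conversion mentioned above has a closed form, d = 25 ln[(1 + 2r)/(1 − 2r)] cM for a recombination fraction r. The sketch below implements it and its inverse; the function names are chosen here for illustration and do not correspond to any of the mapping programs cited.

```python
import math

def kosambi_cm(r):
    """Convert a recombination fraction r (0 <= r < 0.5) to Kosambi map distance in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_r(d_cm):
    """Inverse conversion: map distance in cM (>= 0) back to a recombination fraction."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)

# The maximum recombination fraction used for linkage (0.372)
# corresponds to roughly 48 cM under the Kosambi function.
max_gap_cm = kosambi_cm(0.372)
```

For small r the distance is close to 100r cM (for example, r = 0.01 gives about 1 cM), and the correction for interference grows as r approaches 0.5.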

Phenotypic Variation and Correlation Analysis
Field trials were conducted at Beijing, Shijiazhuang and Kaifeng under different agro-climatic conditions for five consecutive years (2010-2014, nine environments in total) to evaluate TGW, GL, GW and GT variation among the two parents (Yanda1817 and Beinong6) and the RIL population. Beinong6 consistently showed higher values than Yanda1817 for all the grain traits tested in the nine environments (Table 1; S1 Table). The frequency distributions of the investigated traits revealed continuous variation and transgressive segregation in the RIL population, suggesting that the phenotypic data of TGW, GL, GW and GT are normally distributed and that the traits are controlled by multiple loci. The heritability estimates for TGW, GL, GW and GT are 85.58%, 95.72%, 88.22% and 91.78%, respectively, indicating that grain shape and size are stable and are mainly under genetic control (Table 1; S1 Table).
Correlation coefficients (r) among the TGW, GL, GW and GT traits in different environments were calculated. All four traits showed significant positive correlations with each other (significant at P = 0.01) (Table 2). The highest positive correlation was observed between TGW and GW (r = 0.796), followed by TGW and GT (r = 0.761). The correlation between GL and GW was very weak (r = 0.204), as was that between GL and GT (r = 0.210).

Genetic Linkage Map Construction
Out of 500 genomic SSR and EST-SSR primer pairs screened, 150 polymorphic markers were selected for RIL genotyping. Out of 8632 designated and validated SNPs in the 9k Infinium chip, 2873 SNPs were polymorphic between the parental lines Yanda1817 and Beinong6, as well as among the RILs. Based on the 90K SNP consensus map [34] and after removing ambiguous and unlinked markers, the final genetic linkage map consists of 128 SSR and EST-SSR markers and 2431 SNP markers (mapped in 1062 polymorphic loci) that cover all 21 wheat chromosomes (Table 3; S2 Table). Chromosomes 4A, 7B and 7D were each represented by two linkage groups. The entire map spanned 3213.2 cM, including nine gaps (>30 cM) distributed on chromosomes 1D, 2D, 3A, 3D, 6A and 7D. However, the number of markers on each chromosome was uneven, ranging from 5 on 4D to 329 on 5B. The genetic coverage of each chromosome varied from 19.1 cM (4D) to 292.9 cM (5A). Altogether, more markers mapped on the B genome (1301) than on the A genome (1093), and considerably fewer markers (165) mapped on the D genome. Only the long arm was mapped for chromosome 1B, which is consistent with the fact that Beinong6 is a 1BL/1RS translocation line (data not shown). In addition, only 41.3 cM of genetic coverage was found for the centromeric region of chromosome 1AL, and this was far below the average coverage of the A genome chromosomes (Table 3; S2 Table). No polymorphic SNPs were identified between Yanda1817 and Beinong6 for chromosome 1AS and the distal region of 1AL after checking the SNP mapping data [28,34]. A possible reason is that Yanda1817 and Beinong6 share the same genetic background in 1AS and the distal region of 1AL.

Marker Density
The marker density of the individual chromosomes ranged from 0.72 cM/marker for 1A to 17.83 cM/marker for 3D, with an average marker density of 1.26 cM/marker in the genetic map of Yanda1817/Beinong6 (Table 3). More markers were mapped on the A and B subgenomes, which had similar marker densities of 1.09 and 1.06 cM/marker, while fewer markers mapped on the D subgenome, which had a density of 3.91 cM/marker. Most of the gaps were found in the D genome. For example, 7 gaps (>30 cM) were identified on chromosomes 1D, 2D, 3D, 6D and 7D. In addition to the gaps, 40 marker clusters (10 markers at one locus) were spread over chromosomes 1A, 1B, 2A, 2B, 3A, 3B, 4A, 4B, 5A, 5B, 6A, 6B, 6D and 7A (S2 Table). The largest marker cluster was found on chromosome 5B, which contains 63 markers, while only 6 clusters mapped on chromosome 2B. To facilitate data analysis of the 2559 markers mapping in the 1062 loci, only one marker was selected from each locus for QTL mapping. The average locus  (Table 3).
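The whole-map figures quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only the totals reported in the text (map length, marker counts per subgenome); the helper name `cm_per_marker` and the variable names are illustrative, and per-subgenome lengths are not recomputed because they appear only in Table 3.

```python
MAP_LENGTH_CM = 3213.2                                     # total map length reported in the text
MARKERS_BY_SUBGENOME = {"A": 1093, "B": 1301, "D": 165}    # marker counts reported in the text

def cm_per_marker(length_cm, n_markers):
    """Average interval between markers, in cM per marker."""
    if n_markers <= 0:
        raise ValueError("marker count must be positive")
    return length_cm / n_markers

total_markers = sum(MARKERS_BY_SUBGENOME.values())         # SSR/EST-SSR plus SNP markers
average_density = cm_per_marker(MAP_LENGTH_CM, total_markers)

# Share of mapped markers that fall on each subgenome, in percent
share_by_subgenome = {
    name: 100.0 * count / total_markers
    for name, count in MARKERS_BY_SUBGENOME.items()
}

print(f"{total_markers} markers, {average_density:.2f} cM/marker")
for name, share in share_by_subgenome.items():
    print(f"subgenome {name}: {share:.1f}% of markers")
```

The subgenome counts sum to the 2559 mapped markers (128 SSR and EST-SSR plus 2431 SNP), and the resulting average agrees with the reported 1.26 cM/marker.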

Segregation Distortion Regions
Of the 1062 loci mapped in the Yanda1817/Beinong6 genetic linkage map, 328 loci (31%) showed segregation distortion (chi-square test, P < 0.05) in the RIL population (S3 Table). Among the segregation distortion (SD) loci, one third (32.9%) were distorted in favor of Beinong6 and two thirds (67.1%) favored Yanda1817. Thirty-eight SD regions (SDR, 3 SD loci) were distributed across the whole genome except for chromosomes 1D, 2A, 3D, 5D and 7D (S3 Table). Among the SDRs, 24 were found in the B subgenome and 10 were identified in the A subgenome.
Twelve chromosome regions were found to be associated with GW, with phenotypic variations ranging from 3.69% to 12.30% (Table 5; S4 Table). QGw.cau-5A.1 was found in seven environments and had the strongest association with GW, explaining up to 12.30% of the phenotypic variation. QGw.cau-6B.1 was detected in five environments, whereas QGw.cau-5A.2 and QGw.cau-7D were identified in three and two environments, respectively.
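The segregation distortion test reported at the start of this section can be sketched as a chi-square goodness-of-fit test at a single locus. The 1:1 expected ratio of parental alleles in a RIL population and the function name are assumptions for this example; the text states only that a chi-square criterion of 0.05 was applied. With one degree of freedom, the P value has the closed form erfc(sqrt(chi2 / 2)), so no statistics library is needed.

```python
import math

def segregation_chi2(n_parent_a, n_parent_b):
    """Chi-square test of a 1:1 allele ratio at one locus (1 degree of freedom).

    n_parent_a, n_parent_b: numbers of RILs carrying each parental allele
    (missing and heterozygous calls are assumed to have been excluded).
    Returns (chi2, p_value).
    """
    total = n_parent_a + n_parent_b
    if total <= 0:
        raise ValueError("at least one genotyped line is required")
    expected = total / 2.0
    chi2 = ((n_parent_a - expected) ** 2 + (n_parent_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def is_distorted(n_parent_a, n_parent_b, alpha=0.05):
    """Flag a locus as showing segregation distortion at significance level alpha."""
    return segregation_chi2(n_parent_a, n_parent_b)[1] < alpha
```

With hypothetical counts for 269 RILs, a 160:109 split would be flagged as distorted at the 0.05 level, whereas a 150:119 split would not.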

A High-Density Linkage Map for QTL Mapping
Improvement of grain weight and size has always been a challenging task for breeders because it is very difficult to select directly for complex quantitative traits such as TGW, GL, GW and GT. Therefore, marker-assisted selection (MAS) has been proposed as an alternative approach for indirect selection to improve grain weight and size. Because common wheat has a large and complex genome, high-density marker coverage of the genome is crucial for QTL mapping. Previous genetic linkage maps used for QTL detection in wheat generally contained hundreds of markers, mostly AFLPs and SSRs, which were laborious and time-consuming to develop [38][39][40][41][42]. Here, we have constructed a high-density genetic linkage map consisting of 2559 markers (1062 polymorphic loci) that spans 3213.2 cM and covers all 21 wheat chromosomes, using the recently developed Infinium iSelect 9K SNP assay integrated with SSR and EST-SSR markers. The coverage of the Yanda1817/Beinong6 genetic linkage map is in agreement with previously reported maps in common wheat, whose genetic coverage ranges from 1070 cM [5] to 4223.1 cM [41]. The B genome has the longest length and the most markers, which is also consistent with earlier reports [5,7,22,27]. In our genetic linkage map, the average marker interval was 1.26 cM/marker, far smaller than the 3.7 cM to 14.8 cM per marker in previously reported wheat genetic linkage maps [5,7,27,40]. Another advantage of the Infinium iSelect 9K SNP assay is the high-throughput genotyping of multiple DNA samples at the same time. Because the accuracy of a genetic linkage map is heavily influenced by population size, our mapping population of 269 RILs is large enough to develop a high-density genetic linkage map with adequate genetic information for QTL analysis.
A notable phenomenon in our study was the presence of SNP marker clusters and gaps in the SNP-only genetic linkage map. One possible reason is that the SNPs were developed from the transcriptomes of 26 hexaploid wheat accessions, so that most of the SNPs were derived from gene-rich regions. Another possible reason is that the mapping population used for linkage map construction was developed from a cross between two Chinese wheat lines, and some of the SNPs in the 9k Infinium chip were absent in the RILs. Therefore, we integrated 128 SSR and EST-SSR markers into the genetic linkage map to close some of the gaps.

Low Level of Polymorphism in the D Genome
The polymorphic ratio of SSR and EST-SSR markers is about 30% (150/500) at the whole genome level in our mapping population, whereas a higher polymorphic ratio (33.3%, 2873/8632) was observed for SNPs. We detected relatively high SNP and SSR polymorphism levels in our mapping population, possibly because of the high divergence between the two parental lines: Yanda1817 is a Chinese landrace while Beinong6 is an advanced semi-dwarf high-yielding breeding line.
Among the 3 sub-genomes, B and A have the most polymorphic markers and the D genome has the lowest number of markers. Out of 2559 polymorphic markers, only 165 markers (6.4%) mapped on the D genome, which is consistent with previous studies [33,43–46]. The low genetic coverage of the D genome may be responsible for the low number of QTLs detected in the D genome.
After the two polyploidization events in the evolution of common wheat, gene flow between Ae. tauschii and T. aestivum was limited, with only a small population or a few accessions of Ae. strangulata from northern Iran and the southwestern Caspian Sea region introgressed into hexaploid wheat, whereas continuous gene flow occurred through frequent hybridization between T. aestivum and tetraploid wheat species, which increased the diversity of the A and B genomes [47][48][49]. Increasing the genetic diversity of the D genome is still an urgent task for wheat breeders. Considering that many important genes and QTLs controlling agronomic traits are located on the D genome, additional work to increase the number and density of markers in the D genome should be considered, applying new approaches such as next generation sequencing (NGS).

QTLs for Grain Shape and Size
TGW has been subjected to QTL analysis in many studies, but very limited information is available for QTL mapping of GL, GW and GT in wheat. To date, QTLs for grain shape and size have been detected on almost all 21 wheat chromosomes [3,4,9–12,20,22–25,27,38,39,50–53]. Using introgression lines (ILs), Röder et al. described fine mapping of QTgw.ipk-7D associated with the microsatellite marker Xgwm1002-7D [13]. Due to the low coverage of chromosome 7D in our genetic map, we did not detect this QTL. QTLs for TGW were mapped to the same genetic region of chromosome 6AS by Huang et al. [39] and Sun et al. [7] using F1-derived doubled haploid (DH) populations and RILs, respectively. Furthermore, TaGW2, the ortholog of OsGW2 in rice [54], was mapped earlier on 6AS and considered to be a candidate gene related to wheat grain development [6]. However, we did not detect any QTL on chromosome 6AS in our genetic map. In addition, because of differences in mapping populations, field trial conditions, and genetic coverage of the linkage maps used for QTL mapping, QTLs for grain weight and size have often been observed in different chromosome regions and with different phenotypic effects across analyses. Therefore, for more refined analyses we focused on the QTLs detected in at least two environments.
We detected 17 QTLs for TGW and thirteen of these were found in at least two environments. In order to compare our QTL mapping data with published results, we used the integrated high-density SSR genetic linkage map [33] as a reference to anchor SSR markers and mapped QTLs (Table 4; S4 and S5 Tables). QTgw.cau-6B.1 and QTgw.cau-3D.1 were newly identified QTLs detected in three environments, with phenotypic variation from 2.98% to 9.90%. QTgw.cau-1B was located on chromosome 1B near the Xgwm268-1B locus, where an important QTL for TGW was previously identified using 262 accessions from a mini-core collection of Chinese wheat [26]. Similarly, QTgw.cau-2A for TGW was detected in the interval Xgwm249-Xgwm473 on chromosome 2A, which corresponds to the QTLs previously found by Sun et al. [7], Huang et al. [24] and Wu et al. [55]. QTgw.cau-4D was closely linked to marker Xcfd71 and may correspond to the TGW QTL reported by Huang et al. [39]. QTgw.cau-5A.1, detected in two environments, mapped to a position that has also been described by many researchers [22,23,26,27,40,52,53], indicating its stability and major effects. QTgw.cau-5A.2 was detected in two environments and located at the end of 5AL. In the same genetic region, QTLs for TGW were also reported by Mir et al. [56] with interval and association mapping and by Wu et al. [55] with a DH population genetic map. The interval of QTgw.cau-5B was described by Patil et al. [52] and is similar to the TGW QTL identified by Groos et al. [38] and Wang et al. [27]. In addition, Cui et al. [57] and Wu et al. [55] also detected QTLs for kernel weight per spike (KWPS) or TGW in the same chromosome region. QTgw.cau-6B.2, identified in two environments, may be the same as that reported by Sun et al. [7].
Out of the 14 QTLs for GL that we detected in more than one environment, five were described in previous studies and the remaining nine may be new loci. QGl.cau-1B, present in 5 environments in our study, was linked to marker Xgwm259. Sun et al. [7] also detected a QTL for GL that is associated with SSR marker Xgwm140, which is closely linked with Xgwm259. A GL QTL reported by Gegas et al. [23] maps at the same location as QGl.cau-3B.1 in our mapping study. Using two hexaploid wheat mapping populations, Breseghello and Sorrells [21] detected two QTLs for GL on 4B and 5B, which are close to the locations of QGl.cau-4B and QGl.cau-5B.2 in our study. On chromosome 7A, QGl.cau-7A.2 seems to correspond to the QTL previously detected by Williams et al. [11].
Four GW QTLs located on 5A, 6B and 7D were detected in more than one environment, and among these, only QGw.cau-5A.1 was previously described [23].
QTLs for grain thickness have rarely been reported in wheat [10][11][12]. In our study, the GT QTLs QGt.cau-3B.2, QGt.cau-5A.1, QGt.cau-5A.3 and QGt.cau-6B were identified in more than four environments. Due to the diversity of molecular markers, it was difficult to align and compare QTLs detected by these studies. TGW, GL and GW QTLs were also detected in the same chromosome regions, indicating possible linkage or pleiotropic effects.

Trait Correlations and QTL Clustering
Interestingly, QTLs for grain size and shape clustered in the same chromosome regions. In our study, co-localized QTLs were found on chromosomes 1B, 2A, 3B, 4A, 4D, 5A and 6B, and were especially prevalent on chromosomes 3B, 5A and 6B. Two QTL clusters were identified on chromosome 5A. One was located at the distal end of 5AS and is involved in controlling TGW, GW and GT. The other cluster, at the distal end of 5AL, comprises a GT QTL detected in four environments, a TGW QTL detected in two environments, a GL QTL detected in six environments, and a GW QTL identified in only one environment. On chromosome 6B, QTL clusters were mainly related to TGW, GW and GT. As previously described [7,23,38,50,53,58,59], this is consistent with the positive relationships between the four grain shape and size traits, especially among TGW, GW and GT. These QTL clusters for TGW, GL, GW and GT provide important information for wheat breeders seeking to improve grain shape and size via marker-assisted selection.
Supporting Information