Genome wide genetic dissection of wheat quality and yield related traits and their relationship with grain shape and size traits in an elite × non-adapted bread wheat cross

The genetic gain in yield and quality are two major targets of wheat breeding programs around the world. In this study, a high density genetic map consisting of 10,172 SNP markers identified a total of 43 genomic regions associated with three quality traits, three yield traits and two agronomic traits in hard red spring wheat (HRSW). When compared with six grain shape and size traits, the quality traits showed mostly independent genetic control (~18% common loci), while the yield traits showed moderate association (~53% common loci). Association of genomic regions for grain area (GA) and thousand-grain weight (TGW), with yield suggests that targeting an increase in GA may help enhancing wheat yield through an increase in TGW. Flour extraction (FE), although has a weak positive phenotypic association with grain shape and size, they do not share any common genetic loci. A major contributor to plant height was the Rht8 locus and the reduced height allele was associated with significant increase in grains per spike (GPS) and FE, and decrease in number of spikes per square meter and test weight. Stable loci were identified for almost all the traits. However, we could not find any QTL in the region of major known genes like GPC-B1, Ha, Rht-1, and Ppd-1. Epistasis also played an important role in the genetics of majority of the traits. In addition to enhancing our knowledge about the association of wheat quality and yield with grain shape and size, this study provides novel loci, genetic information and pre-breeding material (combining positive alleles from both parents) to enhance the cultivated gene pool in wheat germplasm. These resources are valuable in facilitating molecular breeding for wheat quality and yield improvement.


Introduction
Wheat (Triticum aestivum L.) is one of the major food crops of the world and has an important role to play in achieving the food security. Therefore, enhancing grain yield (GY) and PLOS  improving resistance to biotic and abiotic stresses is always a target for genetic improvement in this crop. In recent times, due to demands of high quality wheat in domestic and international markets, end use quality has also become one of the major target of various wheat breeding programs around the world. Hard red spring wheat (HRSW), grown primarily in the northern plains of the United States, is a specialty wheat with high protein content and superior gluten quality. Some of the world's best baked products are made from HRSW. Hard red spring wheat is also used extensively around the world as a blending wheat to increase the gluten strength and protein quantity in a batch of flour. The resulting flour has dough with improved mixing characteristics and water absorption, which can be used to make different type of bread products and noodles. The wheat quality or end-use quality traits are mostly quantitatively inherited and controlled by a large number of QTL/genes and gene networks that are greatly influenced by environmental conditions [1][2][3][4]. More importantly, the direct estimation of several quality traits (e.g. bread making quality), through full-scale tests is expensive, time consuming and requires a large amount of grain, which is usually not available in early generation breeding lines. These limitations generally result in screening only limited number of advanced "candidate" lines. In some cases, simple indirect tests (e.g. SDS sedimentation volume for bread making quality) are used in earlier generations. However, several studies showed a week relationship between indirect test parameters and the final test scores [5][6][7]. If a line with desirable quality parameters could be identified in early generations, this would allow us to 1) discard undesirable lines in early generations, and 2) evaluate a larger number of populations or greater number of lines within each population at the earlier generations (head-row stage). This will help save tremendous amount of resources in the breeding program and also enhance our chances of recovering improved lines. Modern genomic tools offer a great opportunity to predict the end use performance and other quality characteristics of wheat lines in early generations without conducting the direct tests for quality [4,8,9]. Moreover, the availability of important genomic resources like whole genome sequence information for wheat [10] and improved gene editing technologies in recent time [11], also offer new opportunities to manipulate important individual genes to improve phenotypic performance. Therefore, to fully exploit the available genomic resources to achieve genetic gain in short time, it is important to have the knowledge about the genetics of the target traits. The identification of the QTL/genes influencing quality traits and estimates of the effects of the important alleles at these loci, can increase the efficiency of genomics assisted breeding in wheat.
Wheat quality, a complex association between many traits and factors, determines the market value of this crop. Although many factors define wheat quality, grain protein content (GPC), grain hardness (GH), flour extraction (FE) or milling yield are foremost determinants. Grain protein content defines the nutritional and end-use properties of dough mixing and rheological characteristics and thus effects the efficiency of the bread making process and product quality [12]. Flour extraction or milling yield is of great economic value. According to some estimates, a 2% reduction in milling yield results in a loss of approximately $7 per ton, which translates to millions of dollars loss every year for the wheat industry [13]. Endosperm texture or grain hardness defines wheat classification and determines the flour extraction and other end use quality parameters [14][15][16]. At the same time, yield is always a primary target in wheat breeding programs. Grain yield is influenced by many correlated traits including grains per spike (GPS), number of spikes per square meter (SPMS), thousand grain weight (TGW), heading date (HD), and plant height (PH). Because of high economic importance, a number of studies have investigated genetic control of quality [1,3,17,18] and yield related traits [19][20][21][22][23][24] in wheat. However, majority of those studies were based on low density genetic maps [25], thus limiting their use in marker assisted breeding programs. Most importantly, very limited or almost no information is available about the genetics of quality and yield traits in hard red spring wheat [25,26], which is a necessary first step in marker based improvement of those traits in this important wheat class.
Wheat quality and grain yield are affected by grain shape and size. It is believed that large, spherical grains are optimal for milling [27], whereas, small and shriveled grain reduce milling yield (flour extraction and quality). Grain yield is positively affected by grain size as it increases the grain weight [22, [28][29][30] and promotes seedling vigor [31,32]. Therefore, obtaining an optimal grain shape and size could be a way to improve grain yield and quality in wheat. However, the genetic association of grain shape and size with wheat quality and yield is very poorly studied. Therefore, in the present study, a recombinant inbred line (RIL) population developed from a cross of an elite HRSW and a non-adapted wheat genotype was used to 1) dissect the genetics of wheat quality and yield traits using a high density iSelect 90K SNP assay based linkage map, 2) identify novel QTL and markers closely associated with quality and yield for marker-assisted breeding, 3) identify the role epistasis in the genetic control of quality and yield and 4) understand the genetic association of wheat quality and yield traits with grain shape and size traits. To the best of our knowledge, this is the first such comprehensive study in hard red spring wheat.

Plant material and field evaluation
The present study used an RIL population of 160 lines developed from a cross between an elite line "ND 705" and an exotic line "PI 414566" (or WCB462). The details about the mapping population are provided in Kumar et al. [22]. Briefly, the line ND 705 was developed by the HRSW breeding program at North Dakota State University (NDSU), Fargo, ND, USA, while PI 414566 is originally from China and was obtained from the USDA National Small Grains Collection. PI141566 is a facultative type of tall wheat and has longer grains with high protein content. The RIL population was advanced through single seed descent (SSD) method to F 8 generation. The plant material (160 RILs, two parental genotypes and seven checks) was evaluated for two (2009,2010) years at two locations/year (Prosper and Carrington) of North Dakota, USA. Lines were planted in a 13 × 13 partially balanced square lattice design, with two replicates. Each genotype in a replication was planted in a plot consisting of seven rows of 2.44 m length and a distance of 12.7 cm between rows. Sowing rate was 113 kg ha -1 in all environments. All the plant material was harvested only after complete maturity and the seeds were uniformly dried (90˚F) for 2-3 days in a drier, before phenotypic analyses.
bran was discarded from the flour. Flour extraction was reported on clean dry wheat basis [34]. 4. Grain yield (kg ha -1 ), was measured for each plot (1.86 sq mt) and converted to kilogram/hectare. 5. Number of spikes (spikes m -2 ), determined by counting the number of stems in two one meter long rows per plot and subsequently converted to stems per square meter.
6. Number of grains (grains spike -1 ), measured as an average number of grains in five spikes per genotype.
7. Plant height (cm), measured at maturity from the soil surface to the tip of the spike, excluding the awns.
8. Days to heading (days), determined as the number of days from planting to heading. 9-14. Details about the phenotypic data for six grain shape and size (grain length, grain width, grain area, grain length width ratio, test weight or grain weight by volume and thousand grain weight) traits are reported earlier in Kumar et al. [22]. Briefly, digital image analysis was used to estimate grain length, width, area and length/width ratio. A digital camera was used to manually capture the images of seeds samples placed within a predefined field of view. The images were then processed using Aphelion software (Amerinex Applied Imaging, Amherst, MA). Using image analysis, grain length (GL) (mm), grain width (GW) (mm), and area (GA) (mm 2 ) were calculated from a 10 g samples for each genotype from each replication [35]. Grain length / width ratio (GLWR) was calculated by dividing the grain length mean by the grain width mean for each genotype. Thousand grain weight (TKW) (g) was calculated using the number of grains in a 10 g sample and test weight (TW) (kg m−3) was measured according to the American Association of Cereal Chemist International (AACCI) method 55-10.01 [33].

Phenotypic data analysis
The analysis of variance (ANOVA) was conducted using the MIXED procedure of the Statistical Analyses System [36]. Within the mixed model, the RIL, their parents and checks were considered fixed effects, while environments and blocks were considered random effects. As the error variances were homogeneous among the four environments, a combined analysis of variance was conducted for all the traits. Mean separation test was performed using an F-protected least significant difference (LSD) value at P�0.05 level of significance for each evaluated trait. Pearson correlations, between different traits and between environments of individual traits, were calculated and plotted in R 3.2.2 (https://www.r-project.org/) using cor.matrix and corrplot from the corrplot package. The frequency distribution of different traits was also plotted using R software. The t-test (assuming unequal variances) function in excel 2013, was used to compare the means of two group of lines differing for any particular QTL alleles associated with a phenotypic trait of interest.

Framework linkage map
In this study, we used a high density framework linkage map developed using Infinium iSelect 90K assay [22]. Briefly, the linkage maps consist of a total of 10,172 markers mapped onto 21 wheat chromosomes with an average of 484.4 markers per chromosome. The total genetic map length covered by 10,172 markers was 4,676.1 cM, with an average distance of 0.46 cM between any two markers. The 10,172 markers represented a total of 2,591 unique loci, with an average of 123.4 unique loci per chromosome. The average distance between two unique loci was 1.80 cM. The marker order was found to be consistent when compared with earlier published genetic maps [37]. For QTL mapping in this study, a total of 2,591 unique loci were used (S1 Table).

QTL mapping
In order to identify main-effect QTL, composite interval mapping (CIM) was conducted using QTL Cartographer V2.5_011 [38]. Quantitative trait loci mapping was conducted using the phenotypic data collected for each trait in individual environment as well for the mean data across environments (M). The CIM was conducted using default settings {model 6 (standard model), forward and backward regression, five control markers (cofactors), windows size of 10 cM, and walk speed of 1 cM} in QTL Cartographer. The threshold LOD scores calculated using 1000 permutations were used to declare a significant QTL, however, if any such significant QTL was identified with LOD below the threshold but >2 in other environments, the QTL were also included in the results as supporting information. Only the QTL identified in at least two data sets {out of five (four environments and mean)} or the QTL associated with at least two phenotypic traits were included in this study. The confidence intervals (CI) using ±1 (from the peak) method were estimated using QTL Cartographer. Any two QTL showing overlapping CIs or located within~10 cM region were considered as one QTL. Inclusive composite interval mapping (ICIM) using QTL IciMapping V 4.1 software [39] was also used to confirm the main effect QTL identified by Cartographer. For ICIM, only those main effect QTL which were detected above threshold LOD score based on 1,000 permutations, were considered. QTL IciMapping V 4.1 was also used to identify QTL×QTL epistatic interactions via a two-dimensional scan. In the IciMapping V 4.1, digenetic epistatic interactions were identified using ICI-M-EPI (QICE model) with a fixed LOD scores of 5.0 and walk speeds of 2 cM. In ICIM, a twostage stepwise regression strategy is applied to identify the most significant markers and marker-pair multiplications in a linear model. First, the best regression model that properly identifies markers and marker pairs explaining additive and epistatic variations, is selected. In the second stage, two-dimensional scanning or interval mapping approach is applied to the adjusted phenotypic values to identify significant QTL in marker intervals and estimate their effects. The phenotypic values are adjusted by using the regression model selected in the first step. The phenotypic values for interval mapping are adjusted to control genetic background effect, just like using co-factors in CIM by QTL Cartographer. It means that the adjusted values retain the information of QTL on the current mapping interval but exclude the influence of QTL on other intervals and chromosomes [40]. The ICIM approach identifies the epistatic QTL, with or without any significant main effect. To suggest, if the QTL identified in this study are located in the already reported genomic regions or if they are putative novel QTL, we compared the genomic locations of the identified QTL with earlier studies using associated markers. It may be noted that the genetic map used in this study was based on only SNP markers. However, many earlier studies are based on SSRs or DArT markers. Therefore, the genetic maps which contain both SSRs and SNPs from Infinium 9K or 90K assays [29,30,41] or SSRs and DArT markers were used as a bridge to compare the locations of QTL identified in this study with earlier reported studies. If needed, the information about SSR markers (closely mapped to the QTL associated SSRs) was also obtained from GrainGenes website (http://wheat.pw.usda.gov/cgi-bin/graingenes/browse.cgi? class=marker).

Phenotypic variation
The RIL population showed continuous variation for all the eight traits, suggesting a polygenic inheritance (Fig 1). Among the parents, the mean data showed that elite genotype ND 705 had higher values for FE, GPC, GH, GY, SPMS, GPS, PH, and lower values for DH (Table 1; Fig 1). The same trend was observed in individual environments for all the traits with few exceptions (S2 Table). Transgressive segregation in the desirable direction was observed in almost all the environments for all the traits in the RIL population (Fig 1) suggesting the presence of superior QTL alleles in both the parents. The ANOVA showed significant differences among the genotypes as well as significant genotype × environment interaction for all the traits. In general, all the traits showed higher values in Carrington compared to Prosper (S2 Table).

Phenotypic correlation
Phenotypic association between quality, yield and grain shape and size traits were also investigated (Fig 2). Among the three quality traits, FE and GH showed positive and significant (P = 0.05) correlation (0.24 to 0.49). However, FE, GPC showed a weak negative correlation (-0.13 to -0.33). Grain protein content and GH showed both positive and negative correlations in different environments. The correlations between grain quality traits (FE, GPC and GH) and GY were mostly non-significant. Grain yield had non-significant correlation with GPC, except for one location (Prosper 2010), where it showed a weak negative correlation (-0.29). Grain yield was positively correlated with yield components number of spikes per square meter (SPMS) and GPS (grains per spike) in all the environments, although the values were small. Days to heading and PH showed both positive and negative correlation with GY, although the association was weak.
Grain shape and size traits (GL, GW, GLWR, GA, TGW and TW) showed both positive and negative correlations with quality and agronomic traits. However, the correlations were mostly weak. Grain shape and size traits had positive but low correlation with flour extraction, low to moderate negative correlations with grain hardness, and both positive and negative correlation with grain protein (depending on location). Grain width, GA, TGW and TW showed positive low to moderate correlation with GY, depending upon the year and location. Test weight had a significant positive effect on GY in all the environments. The associations were weak in 2009, but moderate in 2010. Spikes per square meter, grain number per spike and thousand grain weight are considered three major components of yield. In this study, although these three yield components showed positive correlation with yield, only spikes per square meter, grain number per spike showed significant correlation with yield in all the environments (Fig 2). Thousand grain weight (TGW) showed significant correlation at both location in only in the year 2010, suggesting that among the three important components of yield in wheat, TGW is probably most influenced by the environment conditions. Among the top five high yielding RILs (30, 86, 88, 92 and 143), which showed higher yield than both the parents, all five had more grains per spike; three lines had more spikes per square meter and two lines had higher TGW, when compared to the parental genotypes. This suggests the importance of these component traits in enhancing yield in wheat.

QTL analysis
Composite interval mapping for eight traits dissected in this study identified a total of 60 main effect QTL located in 43 genomic regions belonging to 19 different chromosomes (all except 1D and 4A) ( Table 2; S3 Table). The majority (52 out of 60) of these QTL were also identified by IciMapping. The few QTL which were only detected by CIM (using QTL cartographer) were specific to individual environments ( Table 2).
A total of four, eight and six QTL for FE, GPC and GH respectively, were identified in at least three (out of five) datasets and could be considered stable QTL. The most important QTL identified for FE was located on 5BL (QFE.ndsu.5B) in a 2.7 cM interval (280.6-283.3 cM on 5B). This QTL had both stable and major effect (PVE upto 12.6%) on FE and ND 705 alleles increased the trait value. Interestingly, PI 414566 also contributed alleles for increased FE at two stable loci (located on 3A and 5A) ( Table 2). For GPC, the two most stable and major QTL were identified one each on 7A and 7B; both being independent of yield. The QTL located on the long arm of 7A (QGPC.ndsu.7A.2) was consistently detected in all the environments as well in the mean data, and contributed up to 14.6% of PV for GPC. The QTL located on short arm of 7B (QGPC. ndsu.7B) had the most significant effect on GPC and was also detected in four of the five data sets. Most importantly, the positive alleles for increased grain protein for six (out of 8) stable QTL, including the important loci on 7A and 7B, were contributed by PI 414566, the nonadapted genotype. For GH, one QTL located on 4B (QGH.ndsu.4B.1) was detected above threshold in all the five data sets and contributed up to 10.5% of PV for GH. The elite genotype ND 705 contributed alleles for increased hardiness at this locus. Other two most consistent QTL with major effect on GH were detected one each on 1A (QGH.ndsu.1A) and 1B (QGH.ndsu.1B), but the alleles for increased hardiness at those two loci were contributed by PI 414566.

Yield traits {grain yield (GY), grains per spike (GPS) and spikes per square meter (SPMS)}
Three yield traits, grain yield (GY), grains per spike (GPS) and spikes per square meter (SPMS) were included in this study. Composite interval mapping identified a total of five QTL  each for GY and GPS and seven QTL for SPMS in this population (Table 2; S3 Table). The 17 QTL were located in 14 genomic regions; one genomic region (genomic region 19) had QTL for two and another (genomic region 17) had QTL for three traits (S3 Table). Overall, the PVE by these QTL ranged from 7.5 to 19.7%. Four QTL for GY, three QTL for GPS and five QTL for SPMS explained >10% PV and could be considered major. For GY, one QTL located on 4B (QGY.ndsu.4B) was detected in three data set and could be considered stable; all other QTL were detected in one or two data sets. The elite genotype ND 705 contributed the alleles for increased GY at all the identified loci (

Agronomic traits {days to heading (DH), plant height (PH)}
This study identified four QTL for DH and seven QTL for PH (Table 2; S3 Table). Individually, these QTL contributed 10.8-38.8% and 5.3 to 31.2% of the phenotypic variation for DH and PH, respectively. For DH, the QTL identified on 5A (QDH.ndsu.5A) could be considered most important in this population, as it was detected in all the environments and explained up to 38.8% of PV. Another stable QTL identified in all the environment was located on 5B (QDH. ndsu.5B) and contributed up to 10.8% PV for DH. The QTL located on 7D has also major effect (PVE up to 15.6%) on DH and could also be considered stable as it was detected in three data sets. The elite line ND 705 contributed alleles for fewer days to heading at three loci (out of four), including the stable loci located on 5A and 5B (Table 2). For PH, the most significant QTL (QPH.ndsu.2D) in this population was identified on chromosome arm 2DS. The QTL was located in a 3.5 cM interval flanked by BS00067046_51 from  Novel QTL for wheat improvement and relationship of grain shape and size traits with wheat quality and yield one side and six Co-localized markers on the other side (Kukri_c51992_290, Kukri_rep_ c106786_230, wsnp_Ex_c14779_22892053, wsnp_JD_rep_c63957_40798083, wsnp_JD_rep_ c63957_40798121, Kukri_rep_c113120_104). These six co-localized markers were mapped at a distance of 0.5 cM from the peak. This QTL was detected above threshold LOD score with a consistent phenotypic variation (26.1-31.2%) in all the environments. Interestingly, the alleles for reduced height at this locus were contributed by the non-adapted parent PI 414566 and explained up to 31.2% phenotypic variation for PH. When the whole population was divided into two groups based on parental alleles at QPH.ndsu.2D locus, PI 414566 alleles reduced plant height by 8-9.4 cm in different environments with an average of 8.9 cm (Table 3). Four other QTL located one each on 1B, 3A, 3D and 7B were identified for PH in three or four data sets and could be considered stable. Overall, ND 705 contributed reduced height alleles at four loci, including two stable loci. However, at both of these loci, the alleles for reduction in height were associated with reduced yield. The non-adapted parent PI 414566 contributed reduced height alleles at three loci; all of them being stable across environments and independent of yield QTL.

Genetic association of grain quality, and yield traits with grain shape and size traits
Out of 28 genomic regions associated with three wheat quality traits (GPC, GH, FE) in this study, only five genomic regions (17.8%; S3 Table) were associated with grain shape and size traits. These genomic regions were 5 (1B), 6 (1B), 19 (4B), 27 (5A) and 39 (7A). Among the quality traits, four GPC QTL and two GH QTL (one being common for GPC and GH), were co-localized with QTL for grain shape and size. None of the QTL identified for FE showed any association with grain shape and size traits. On the other hand, co-localization of QTL for yield related traits with grain shape and size was more frequent (Table 2; S3 Table). Out of the 14 genomic regions associated with three yield traits (GY, GPS, SPMS), six (~43%) genomic regions were also associated with grain shape and size traits in the same population. These genomic regions were 9 (2A), 17 (3D), 19 (4B), 27 (5A), 34 (6A), and 35 (6B). The loci for grain quality and yield traits showed both positive and negative associations with grain shape and size traits. The ND 705 allele at the genomic region 27 (5A) was associated with 3.64 days reduction in heading as well as significant increase in GA and TGW. The same parental allele was also responsible for increase in SPMS and decrease in GPC, but SPMS and GPC QTL were significant in only one environment. The genomic region 39 (7A) seems to be a another good candidate for breeding as the PI 414566 allele at this locus was associated with increase in GPC as well as TGW, GL, GW, GA. The same alleles were, however associated with minor effect on grain softness. All the QTL in this region on 7A were stable across environments. In the genomic region 23 (6A), GA was positively associated with GY and TGW, with ND 705 providing the desirable alleles.

QTL×QTL epistatic interactions
Inclusive composite interval mapping identified a total of 359 digenic epistatic interaction for 14 different traits (including six grain shape and size traits) at a minimum LOD score of five (Table 4; S4 Table). The total number of epistatic interactions for an individual trait in a single environment ranged from zero to 29. For SPMS and KA, epistasis was detected only in one dataset (out of 5), while for the rest of the traits, epistasis was detected in at least four datasets. More epistatic interactions were identified for quality traits (FE, GPC and GH) compared to yield or grain shape and size related traits. The phenotypic variation (PV) explained by all the epistatic interaction belonging to a single trait in individual environments ranged from 7.3 to 74.3%. The average PV explained by all epistatic interactions for a trait in a single environment was 41.8%, compared to 48.2% for the main effect QTL, suggesting the importance of epistasis in the genetic control of different traits.
Although a large number of the epistatic interactions were associated with individual traits, a total of 51 epistatic interactions representing 20 unique QQ pairs were found associated with Table 4  two or more environments (called stable QQ interactions) and/or traits ( Table 5). The maximum number of six stable QQ interactions were detected for grain hardness, followed by four stable interactions for DH. For FE, KL, KW and TW, one QTL interaction was detected for each trait in two or more environments (Table 5). Both parental (two loci from the same parent increasing the trait values) and recombinant (one locus from each parent increasing the trait values) QQ interactions were observed. The QTL×QTL digenic interaction could involve two main-effect QTL (M-QTL), an M-QTL and an epistatic QTL (E-QTL; a QTL with no main effect), or two E-QTL. In the present study, almost all the QQ interactions were observed between E-QTL. Such QTL will escape detection in the analyses, which detect interactions between M-QTL only.

Discussion
Hard red spring wheat is considered a specialty wheat because of its high protein content and quality, traits which help produce excellent quality bread and determine the marker value of the crop. Therefore, grain protein content and flour extraction are the major targets of wheat breeding programs. At the same time, improvement in grain yield is always the primary target in any breeding program. This study investigates the high resolution genetic dissection of quality, yield and agronomic traits in hard red spring wheat to identify new useful alleles from adapted and non-adapted genotypes. The study also provides insight into the genetic association of grain quality and yield traits with grain shape and size traits. The population used is this study was derived from an elite line ND 705 and a non-adapted line PI 414566, and thus segregates for yield, quality and grain shape and size traits. The non-adapted germplasm is required to identify new sources of desirable alleles to enrich the cultivated gene pool. The cross, therefore, resulted in the identification of desirable alleles fixed in the breeding program as well as novel and desired alleles contributed by the un-adapted genotype. The use of high density Infinium 90k assay based genetic map allowed the identification of tightly linked markers with the traits of interest, which is critical for the success of marker assisted selection.

Novel QTL for flour extraction (FE)
The few studies conducted in the past to genetically dissect FE were based on low resolution genetic maps [17,25,42,43]. Although multiple QTL were associated with FE in our study, the inconsistency of most of them across environments suggests a significant environmental effect, as has been reported in the past [17]. The QTL QFE.ndsu.5B seems to be the most significant for FE as it was detected across multiple environments. The separation of RILs based on parental alleles at this locus showed that the ND 705 allele increased FE by 2% (61.8 vs 59.8%). The QTL for milling or flour yield were also reported on 5B in earlier studies [17,25,42]. However, when compared with those studies, the location of these QTL seems far from QFE.ndsu.5B, suggesting it to be a novel QTL. The QTL QFE.ndsu.5B did not show any association with grain shape and size traits or any other quality traits, suggesting an independent control for FE.

Lack of Gpc-B1, and novel QTL contributed by the non-adapted genotype, offer opportunities to enhance grain protein content (GPC) in adapted wheat germplasm
For GPC, the contribution of the positive alleles for the majority (seven out of nine) of stable QTL by the non-adapted genotype, provides an opportunity to enhance the cultivated gene , however, found a negatively associated QTL for yield or yield components in this region as well. In contrast, the QTL identified in this study did not show any genomic association with GY, GPS, SPMS, GL, GW, GA and TGW. This could be because of differences in germplasm and environmental conditions. We could not find any QTL on 6BS, suggesting that GPC-B1, the major GPC QTL cloned in wheat [54], does not play a role in our germplasm. This is expected as the majority of the modern wheat varieties carry the non-functional allele of Gpc-B1 gene [54]. The cloning of Gpc-B1 has now facilitated the successful introgression of a functional copy of this gene to adapted germplasm, through MAS ([55]; for review see, Kumar et al. [12]). This suggests an opportunity to further improve GPC in our spring wheat germplasm, through introgression of the high GPC allele of Gpc-B1 gene.  Novel QTL for wheat improvement and relationship of grain shape and size traits with wheat quality and yield

Novel QTL for grain hardness on chromosome 4B
Grain hardness or grain texture has a direct relationship with important end use quality traits like milling yield or flour extraction in wheat. This was further confirmed in our study, where we also observed a moderate positive association of GH with flour extraction in all the environments. Many studies have shown that grain hardness in wheat is controlled by a major hardness locus (Ha) located at a sub-telomeric position on chromosome 5DS [56, 57]. The Ha locus also harbors genes encoding 15-kD lipid binding endosperm specific protein, called friabilins. The friabilins are composed of mainly two proteins puroindoline a (Pina) and puroindoline b (Pinb) [58]. Characterization of different wheat types have shown that hard wheat varieties either lack or carry specific mutations for the puroindoline coding genes [57]. The soft wheat varieties carry the wild-type puroindoline alleles [59]. In addition to the role of Ha locus, studies in the past have also reported several other QTL associated with hardness as well [16,[60][61][62][63]. We did not find any QTL for GH on 5DS, suggesting that both the parental genotypes probably carry the hardness alleles at Ha locus. In this population, the differences in GH are attributed to several other QTL, which were mostly detected in multiple environments, showing a greater influence of genotype than environment on the GH, as has been observed in the past [64].

Complex genetics of yield and component traits
Grains per spike (GPS), number of spikes per square meter (SPMS) and thousand grain weight (TGW) are considered the three most important components of yield in wheat. The phenotypic structure in our study showed similar results, where GPS, SPMS and TGW had low to moderate positive correlations (mostly significant) with GY. However, yield is one of the most complex traits which is highly influenced by the environment. The genetic dissection further confirms this, as most of the identified QTL were environment specific and did not show stability across locations. The genomic region 17 (3D), where ND705 alleles were associated with increased GY (QGY.ndsu.3D), GPS (QGPS.ndsu.3D) and SPMS (QSPMS.ndsu.3D), and decreased PH (QPH.ndsu.3D), is a good candidate for yield increase without any effect on GPC. However, ND705 alleles at this locus were associated with decrease in TGW and GA [22]. This could probably mean that the yield increase through this locus is a result of increase in grains per spike and spikes per square meter. The genomic region 19 (4B) has been found associated with significant effect on grain shape and size, not only in wheat (discussed in detail by Kumar et al. [22]), but in other cereals as well [65][66][67]. In this population, the genomic region 19 was associated with major and stable effect on TGW, GL, GLWR,GA and GW [22]. We observed that the same region is also associated with GY, SPMS and GPC. For this locus, the ND 705 alleles, although increased GY, SPMS, GPC and TW, they were associated with decrease in other components, suggesting that the yield increase is due to an increase in SPMS. Due to its expression across environments, QGPS.ndsu.2B (on 2BL) associated with grains per spike could be an important candidate for yield improvement. Many other studies in the past have also identified QTL for grains per spike on chromosome 2B [68][69][70][71]. However, comparison of map locations showed that those QTL were located on the short arm, whereas QGPS.ndsu.2B was located in the centromeric bin of 2BL, suggesting that this is probably a novel QTL specific to our germplasm. Lines with ND alleles at both QGPS.ndsu.2B and QGPS. ndsu.3D had on average three more grains per spike (36.1 vs 33.1) compared to the lines with PI 414566 alleles at both loci, suggesting the potential of these two QTL for yield improvement.

Heading date controlled by major loci in the genomic regions harboring VRN genes on group 5 chromosomes
The photoperiod (Ppd) response, vernalization (Vrn) and earliness per se (Eps) genes control flowering time in wheat. The Eps genes influence flowering time independently, whereas, Ppd and Vrn genes have a modifying effect. To date, several QTL/genes for heading date are reported in wheat [72]. The most significant loci associated with photoperiod response, Ppd-D1, an orthologous to the Arabidopsis photoperiod pathways gene CONSTANS [73], was mapped on chromosome 2D in wheat [74]. The photoperiod insensitive allele of this gene Ppd-D1a was widely used in the "green revolution". We did not find any loci for HD on chromosome 2D, suggesting that the major developmental gene Ppd-D1 did not play any role in the variation for heading date in our population. This could also probably mean that both parents have the same alleles for this gene. The quantitative variation for heading date in our population was due to stable loci on 5AL and 5BL. Comparison with earlier studies suggest that QDH.ndsu.5A and QDH.ndsu.5B are in the region of VRN-A1 and VRN-B1, respectively [37,75]. This is probably expected as the parental genotype PI 414566 is a facultative type (or winter type) wheat. In contrast to Guedira et al. [75], we observed that QDH.ndsu.5A has a significantly higher contribution compared to QDH.ndsu.5B in our population. This could be attributed to differences in germplasm.

Major locus for plant height in the region of Rht8 on chromosome arm 2DS
Plant height affects lodging, disease resistance [76], quality [77] and yield [78] in wheat and therefore, has been an important breeding target since the mid-nineteenth century [79]. The reduced height phenotype which revolutionized the wheat yield worldwide could be mainly attributed to the Japanese varieties 'Norin 10' carrying the Rht-B1b (or Rht1 on chromosome 4B), Rht-D1b (or Rht2 on chromosome 4D) genes and 'Akakomugi' carrying the Rht8 (on chromosome 2D) gene [79,80]. In addition, over the years, many other reduced height or dwarfing genes/QTL have been identified in wheat [21, 69,81]. In our population, although both the cultivars were semi-dwarf and differed significantly for PH (ND705 = 95cm, PI 414566 = 70 cm), the major reduced height genes Rht-B1b and Rht-D1b did not play any role in the variation for PH. The QTL QPH.ndsu.2D, which was identified in the region of the Rht8 gene [82], played larger role in controlling PH. The comparison of map locations suggests that QPH.ndsu.2D is most likely Rht8. The two groups of lines with ND705 and PI 414566 alleles at the QPH.ndsu.2D locus showed that the PI 414566 alleles significantly reduced plant height. The semi-dwarfed wheat genotypes show improved spikelet fertility and increased grain number per spike compared to the tall genotypes because they distribute a major portion of assimilate to the developing spikes [83]. The Rht8 locus has been found associated with yield and several yield components including spike length, spikelet number, spikelet compactness, and thousand grain weight [2,68,81,82,84]. In this population, we did not find any QTL associated with yield, yield components, grain shape and size related traits in the Rht8 region. However, comparison of groups with ND705 or PI 414566 alleles at the Rht8 locus showed that the reduced height allele from the PI 414566 at the Rht8 locus was associated with a significant increase in grains per spike and flour extraction. But, the same allele also showed a decrease in the number of spikes per square meter and test weight. No significant association of the reduced height allele at the Rht8 was observed with grain yield, or other component traits, including thousand grain weight, grain length, grain width, grain length width ratio and grain area.
According to some estimates, over 70% of the registered wheat cultivars in the world are fixed for the Rht-B1b and the Rht-D1b dwarfing genes [85]. Since Norin 10 was the source of introgression of reduced height to US genotypes [79,80], Rht-B1b and Rht-D1b are much more prevalent in the US wheats, while the frequency of the Rht8 is much lower [86]. On the other hand, the reduced height in the European and Asian cultivars was contributed initially by the Rht8 gene using the Japanese cultivar Akakomugi [79]. In a Chinese wheat collection, Zhang et al. [87] observed that the frequency of Rht8 was highest (46.8%), followed by Rht-D1b (45.5%) and Rht-B1b (24.5%). Since PI 414566 was originally from China, it is more likely that it harbors the reduced height allele Rht8c, in addition to either the Rht-B1b or Rht-D1b. Overall, the historical evidence and the genotypic and phenotypic structure suggests that most likely, either the Rht-B1b or Rht-D1b locus is fixed in both the parental genotypes of our population, while the variation for PH was contributed mainly by the Rht8. The availability of molecular markers, now provides the opportunity to exploit the benefits of the Rht8 in US wheats. Introgression and selection of gibberellins insensitive genes Rht-B1b or Rht-D1b with gibberellins insensitive gene Rht8 using MAS may facilitate rapid development of high-yielding varieties [88].

Association of grain shape and size with wheat quality and yield
The knowledge about the genetic association of grain shape and size traits with grain quality, particularly flour extraction or milling yield in wheat is, although important, but mostly missing. The phenotypic and genotypic architecture of our population suggested that grain shape and size traits (GL, GW, GA, TGW and TW) were although correlated with quality traits, the association was not strong. The fact that only~18% of the loci for quality share the genomic locations with loci for grain shape and size, suggested that the genetic control of wheat quality traits like protein content and flour extraction is mostly independent of grain shape and size.
Although theoretical studies suggested that large, spherical grains are optimal for milling yield, presumably due to increased endosperm [89], experimental studies based on phenotypic evaluations did not find a strong correlation between flour extraction and GL, GW, GA, TGW, TW [90,91]. Similarly, in our study, flour extraction was not correlated with GL in any of the environments. However, a significant positive but weak correlation between GW and FE in Prosper location suggests that the association is probably due to environmental factors and not because of genetic loci. This was further confirmed by genetic analysis, which showed that the loci for FE do not share any genomic region with grain shape and size.
As expected, grain shape and size traits had more significant effect on yield compared to their effect on quality in wheat. The observation that about half of the genomic regions for yield traits were also associated with grain shape and size suggests a significant role of grain morphology in enhancing yield in wheat. The genomic regions for GA and TGW were most commonly associated with yield or yield traits, thus confirming that TGW is one of the most important yield components in wheat. Since GA is highly correlated with TKW ([22] ; Fig 2), targeting an increase in GA in the breeding programs may help enhancing wheat yield through an increase in TGW.

Role of epistasis in the genetic control of quantitative traits
Under the effect of epistasis, the value of an allele or genotype at a locus depends on the genotype at other epistatically interacting loci, thus complicating the picture of gene action. A seemingly "favorable" allele at one locus may be an "unfavorable" allele in a different genetic background. Knowledge about the role of epistasis in the genetic control of target traits is crucial, because epistasis complicates the genotype-phenotype relationship, and thus affects the rate of genetic gain in plant breeding. In this study, a comparable PV explained by main effect and epistasis (41.8% & 48.2%, respectively) clearly suggests that in addition to main effects, epistasis plays an important role in the genetic architecture of most of the quantitative traits. Also, the observation that almost all the QTL involved in epistasis did not have any effect of their own, but contributed to the genetic variation of the traits purely through epistatic interactions, suggests the need to use statistical methods/software which have the ability to detect such QTL.
The role of epistasis in the genetic control of quantitative traits has been documented in many crops [92], including wheat [1,70,71], barley [93], maize [94], and rice [95,96]. For grain protein content in two RIL populations, Kulwal et al. [1] reported that epistatic interactions contributed a significantly higher portion of phenotypic variation than main effect QTL. Similar results were also reported for yield related traits in wheat [97]. From these studies, we can conclude that epistatic interactions are an important component of the genetic basis of most of the quantitative traits. Neglecting epistasis could lead to reduced selection response and thus need serious consideration in breeding for genetic improvement. In the past, despite the identification of epistatic interactions in different crops and the success in marker assisted selection, epistatic QTL were hardly employed in genetic improvement programs. This could be due to many reasons, including the complexity of the epistatic gene networks and minor effects contributed by individual interactions. This means if we have to achieve greater selection response through the application of genomic tools, we need to develop improved tools to understand the role of epistatic interactions and their incorporation into the breeding program. The methods of genomic selection [98] might help us to incorporate epistasis into the breeding pipeline to achieve the desired genetic gains in crops.

Implications of the current study for wheat improvement
Wheat quality and yield improvement are two important target areas of any wheat breeding program worldwide. These traits have complex genetic control and low to moderate heritability. Here, the high resolution genetic dissection provided tightly linked molecular markers for quality and yield that could expedite the process of development of improved varieties in spring wheat. This study not only enhanced our knowledge about the genetics of wheat quality and yield, but also provided new detailed evidence about the phenotypic and genetic association of wheat quality and yield with grain shape and size related traits. The quality traits showed mostly independent genetic control whereas, yield traits showed moderate genetic association with grain shape and size traits. Based on these observations, it could be suggested that targeting grain shape and size, particularly an increase in GA, may help enhancing wheat yield through an increase in TGW. The novel and yield independent QTL for GPC identified in this study offer the opportunity to improve GPC in hard red spring wheat germplasm, without any yield loss (Table 3). Five lines (RIL No: 4, 10, 40, 90, and 130) with the highest GPC combine all the positive alleles from both the parents at the eight stable loci for this trait, providing important pre-breeding material to enhance protein content in the adapted germplasm. When all positive alleles were present together, these lines showed about two LSD higher GPC compared to the parental genotypes (or about 12% higher compared to ND 705 alleles) (S5 Table). Another line (RIL-138), which combines the allele from PI 414566 at QGPC.ndsu.7A.2 with other alleles from ND 705, showed almost 1.5 LSD higher GPC and no significant loss in grain yield when compared to the elite line ND 705 (S5 Table). This line could also serve as valuable pre-breeding germplasm to enhance protein content without compromising grain yield. Similarly, alleles from both the parents at the stable loci could be combined to increase flour extraction in wheat ( Table 3). The novel and consistent QTL for grains per spike identified on chromosome 2B and for flour extraction on chromosome 5B need to be incorporated into wheat breeding programs and could be very useful in enhancing grain and milling yield, respectively.
About 32% of the positive QTL alleles were contributed by PI 414566, explaining the transgressive segregation in the population, as a result of combination of superior alleles. The unadapted line PI 414566 also contributed novel, stable and major loci for FE, GPC, GH, and PH, offering an opportunity to enhance the adapted gene pool for these important traits. This further confirmed that non-adapted germplasm is essential to enhance genetic diversity in cultivated germplasm. The RILs having combinations of positive alleles from both the parents at important loci could serve as important pre-breeding material for improvement of target quality and yield traits. In conclusion, the genomic resources developed in this study will facilitate marker assisted breeding efforts for wheat improvement.
Supporting information S1