Common wheat is one of the most important crops in China, which is the largest producer in the world. A set of 230 cultivars was used to identify yield-related loci by association mapping. This set was tested for seven yield-related traits, viz. plant height (PH), spike length (SL), spikelet number per spike (SNPS), kernel number per spike (KNPS), thousand-kernel weight (TKW), kernel weight per spike (KWPS), and sterile spikelet number (SSN) per plant, in four environments. A total of 106 simple sequence repeat (SSR) markers distributed on all 21 chromosomes were used to screen the set. Twenty-one and 19 of them had previously been reported to be associated with KNPS and TKW, respectively. Association mapping detected 73 significant associations across 50 SSRs, and the phenotypic variation explained (R2) by the associations ranged from 1.54 to 23.93%. The associated loci were distributed on all chromosomes except 4A, 7A, and 7D. Significant and potentially new alleles were present on 8 chromosomes, namely 1A, 1D, 2A, 2D, 3D, 4B, 5B, and 6B. Further analysis showed that genetic effects of associated loci were greatly influenced by association panels, and the R2 values of crucial loci were lower in modern cultivars than in the mini core collection, probably because of strong selection in wheat breeding. To confirm the results of association analysis, the yield-related favorable alleles Xgwm135-1A138, Xgwm337-1D186, Xgwm102-2D144, and Xgwm132-6B128 were evaluated in a doubled haploid (DH) population derived from Hanxuan10 × Lumai14. These favorable alleles, validated in different populations, might be valuable in breeding for high yield.
Citation: Guo J, Hao C, Zhang Y, Zhang B, Cheng X, Qin L, et al. (2015) Association and Validation of Yield-Favored Alleles in Chinese Cultivars of Common Wheat (Triticum aestivum L.). PLoS ONE 10(6): e0130029. doi:10.1371/journal.pone.0130029
Academic Editor: Liuling Yan, Oklahoma State University, UNITED STATES
Received: January 22, 2015; Accepted: May 15, 2015; Published: June 11, 2015
Copyright: © 2015 Guo et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: This work was supported by grants from the Chinese Ministry of Science and Technology (2011CB100100; http://www.973.gov.cn/AreaAppl.aspx), the China Agricultural Research System (CARS-3-1-1; http://www.caaswheat.cn/), the National High-Tech R&D Program of China (2012AA101105; http://www.863.gov.cn/), the National Science and Technology Infrastructure Program (2011BAD35B03; http://www.nsfc.gov.cn/), and the National Natural Science Foundation of China (31301320; http://www.nsfc.gov.cn/).
Competing interests: The authors have declared that no competing interests exist.
Wheat is one of the most important crops in the world, with a total production of about 713 million tonnes in 2013 [1]. With an increasing world population, it is necessary to raise production continuously, mainly through higher yields. Identification of new yield-related loci is becoming increasingly important in all food crops.
Wheat yield is determined by three key factors, viz. spikes per unit area, kernel number per spike and thousand-kernel weight. Most yield-related traits in wheat are controlled by genes with low heritability [2]. Many yield-related QTLs have been identified in studies using bi-parental populations segregating for traits such as plant height [3–8], spike length [9, 10], spikelet number per spike [10–12], kernel number per spike [10, 13, 14, 15], thousand-kernel weight [10, 16, 17, 18], kernel weight per spike [10, 19] and sterile spikelet number per spike [7, 10, 20, 21]. For example, the diagnostic marker Xgwm261, closely linked to Rht8 on 2D, has played an important role in wheat yield improvement in southern Europe [3, 4]. Although there has been progress in mapping yield-related QTLs in bi-parental populations, only a relatively small part of the total phenotypic variation within a crop species can be captured in a single cross.
Association analysis identifies trait-marker relationships based on linkage disequilibrium. This method has several advantages compared with mapping in bi-parental populations: (1) materials used in association analysis can be existing germplasm ranging from landraces to modern varieties and advanced lines; (2) novel and superior (favorable) alleles associated with the best phenotypes can be identified and ranked for use in breeding; (3) association mapping is more efficient and cheaper than other methods; and (4) the results of association mapping apply to a wider range of genetic backgrounds. For example, Sajjad et al. identified six SSR loci associated with yield-related traits on chromosome 3A, explaining 10.7 to 17.3% of the yield-related phenotypic variation in 94 wheat cultivars using 39 SSRs. Among them, Xgwm155 and Xwmc527, Xcfa2134 and Xgwm369, Xgwm155, and Xgwm369 were associated with grain yield per plant, fertile florets per spikelet, plant height, and spike length, respectively. Wang et al. genotyped 531 SSR markers in the Chinese mini core wheat collection; 22 SSR loci were associated with TKW, each explaining phenotypic variation ranging from 1.56 to 21.99%. Six loci, Xcfa2234-3A, Xgwm156-3B, Xbarc56-5A, Xgwm234-5B, Xwmc17-7A and Xcfa2257-7A, each accounted for more than 10% of the variation. Using the same association panel, Zhang et al. identified 23 SSR loci significantly associated with KNPS, and reported that favorable alleles combined with additive effects. They also identified favorable alleles at the Xwmc304-1A, Xgwm311-2A, Xcfa2234-3A, Xgwm2-3A, Xgwm131-3B, Xgwm156-3B, Xgwm2-3D, Xcfe273-6A and Xcfa2257-7A loci with positive effects on both TKW and KNPS. However, relatively few studies in wheat have mapped multiple yield-related traits using both bi-parental populations and association panels.
In the present study, 230 diverse common wheat cultivars were genotyped at 106 SSR loci prior to association analysis of data for seven yield-related traits obtained in multiple environments, with the aim of identifying favorable loci or alleles. The purposes of the study were to provide insights into the use of association and linkage analysis to dissect the genetic basis of traits, and to provide information that may be useful for future molecular breeding in the Yangtze River Valley.
Materials and Methods
The association panel of 230 wheat genotypes included 222 Chinese, 1 USA, 1 Chilean, 4 Italian, 1 Mexican and 1 Romanian cultivars. The Chinese accessions included 39 cultivars from Jiangsu, 10 from Anhui, 6 from Hubei, 14 from Hunan, 2 from Jiangxi, 2 from Zhejiang, 5 from Fujian, 9 from Sichuan, 3 from Guizhou, 2 from Yunnan, 36 from Henan, 19 from Shandong, 6 from Gansu, 7 from Shanxi, 18 from Beijing, 8 from Hebei, 32 from Shaanxi, 3 from Heilongjiang and 1 from Qinghai (S1 Table). A bi-parental DH population of 150 lines from the cross Hanxuan10 × Lumai14 was also used. Both parents were historically important cultivars; Hanxuan10 was released in 1966 and Lumai14 was a high-yielding cultivar during the 1990s. Lumai14 has higher KNPS, TKW and yield than Hanxuan10.
The cultivar panel was planted in four environments, viz. 2008 and 2009 at the Sichuan Academy of Agricultural Sciences in Chengdu (designated 08CD and 09CD, respectively), and in 2008 and 2009 at the Lixiahe Agricultural Institute of Jiangsu Province in Yangzhou (08YZ and 09YZ, respectively).
The field experiment consisted of three randomized complete blocks. Each cultivar was planted in three 133 cm rows with 40 seeds per row and a row spacing of 25 cm. The yield-related traits PH (cm), SL (cm), SNPS, KNPS, TKW (g), KWPS (g) and SSN were measured on an average of 20 plants in the middle of each plot and expressed as means.
The 150 DH lines and parents were planted in two environments, viz. 2010 and 2011 at Changping, Beijing (DH10 and DH11, respectively). The field design was three randomized complete blocks. Each line was planted in two-row plots with a row length of 2 m and a row spacing of 30 cm. Yield-related traits included PH (cm), SL (cm), SNPS, KNPS, TKW (g) and SSN, measured on 20 plants in the middle of each plot.
Mean values of yield-related traits, standard deviations, standard errors, variation coefficients (CV) and broad sense heritabilities for each environment were analyzed by IBM SPSS Statistics 21.0.0 software (http://www.brothersoft.com/ibm-spss-statistics-469577.html). The best linear unbiased predictor (BLUP) method was used to estimate mixed means of the phenotypic traits as in the association analysis [29–31].
Genomic DNA from 10 seedling leaves of each cultivar was extracted by the CTAB method. A total of 106 SSR markers distributed across all 21 chromosomes were used to genotype the association set and the DH population. Among them, 21 and 19 markers were previously reported to be associated with KNPS and TKW, respectively (S3 Table). Primer sequences and annealing temperatures (S2 Table) were obtained from GrainGenes (http://archive.gramene.org/markers/) and Somers et al. An ABI 3730 Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) was used to separate amplified products after purification. Fragment sizes were determined using an internal size standard (GeneScan™-500 LIZ, Applied Biosystems). GeneMapper V3.7 software (Applied Biosystems) was used to estimate fragment sizes (http://www.appliedbiosystems.com.cn/).
General parameters of genetic diversity of each SSR marker, including MAF (major allele frequency), allele number, genetic diversity and PIC (polymorphism information content), were evaluated using PowerMarker V3.25 software. To reduce spurious associations, population structure of the 230 cultivars was analyzed using Structure V2.3.2. The number of presumed sub-populations (K) was set from 1 to 15 with an admixture model and correlated allelic frequencies. This process was repeated five times. For each run, burn-in and Markov Chain Monte Carlo iterations were set to 50,000 and 100,000, respectively. The number of sub-populations and the best output were determined following the ΔK method. Kinship analysis was also performed using genotypic data with SPAGeDi software to determine genetic covariance between individuals. Evaluation of pairwise kinship coefficients was based on Loiselle et al. with 10,000 permutation tests. All negative values between individuals were then set to 0, indicating that they were less related than random individuals.
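The ΔK criterion used to choose the number of sub-populations can be sketched as follows. This is a minimal illustration in Python, not the procedure as run by the authors: the function name `delta_k`, the input format and the log-likelihood values are all hypothetical, and the second-order difference is taken on replicate means, which is a common simplification of the original definition.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Return {K: delta K} from replicate Ln P(D) values.

    log_likelihoods maps each tested K to the list of Ln P(D) values from
    its replicate runs (hypothetical input format). Delta K is defined only
    for K values that have both neighbours K - 1 and K + 1.
    """
    means = {k: mean(runs) for k, runs in log_likelihoods.items()}
    result = {}
    for k, runs in sorted(log_likelihoods.items()):
        if k - 1 not in means or k + 1 not in means:
            continue
        spread = stdev(runs)
        if spread == 0:
            continue  # delta K is undefined when replicates are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_difference / spread
    return result


# Hypothetical Ln P(D) values for K = 1..4, five replicate runs each.
runs = {
    1: [-9000, -9002, -8998, -9001, -8999],
    2: [-8500, -8501, -8499, -8500, -8500],
    3: [-8400, -8410, -8390, -8405, -8395],
    4: [-8350, -8360, -8340, -8355, -8345],
}
scores = delta_k(runs)
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 1))
```

In this made-up data the likelihood rises sharply from K = 1 to K = 2 and only slowly afterwards, so ΔK peaks at K = 2. ΔK cannot be evaluated at the smallest or largest K tested, which is one reason to test a range wider than the expected number of sub-populations.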
The MLM (mixed linear model) module with Q + K was used for association analysis between phenotypic traits and SSRs in TASSEL 2.1 software (http://www.maizegenetics.net/) [40, 41]. The phenotypic variation explained (R2) for each associated locus was calculated for alleles with frequencies >5% [26, 27]. The heritability ($h^2$) of each trait in each environment, defined as the proportion of genetic variance in the total variance, was calculated in TASSEL 2.1 from the phenotypic data and the kinship matrix according to the formula $h^2 = \sigma_a^2 / (\sigma_a^2 + \sigma_e^2)$, with the MLM options of no compression or re-estimation for each marker. Here, $\sigma_a^2$ is the genetic variance and $\sigma_e^2$ is the residual variance. Genetic effects of favorable alleles of associated loci were evaluated by multiple comparisons and ANOVA using IBM SPSS Statistics 21.0.0 (http://www.brothersoft.com/ibm-spss-statistics-469577.html).
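The heritability formula above is a simple ratio of variance components. The short Python sketch below only illustrates that arithmetic; the function name and the variance components are hypothetical, chosen so that the ratio matches the magnitude of an average heritability of about 72%, and they are not estimates from this study.

```python
def heritability(genetic_variance, residual_variance):
    """Proportion of total variance that is genetic: va / (va + ve)."""
    if genetic_variance < 0 or residual_variance < 0:
        raise ValueError("variance components must be non-negative")
    total = genetic_variance + residual_variance
    if total == 0:
        raise ValueError("total variance must be positive")
    return genetic_variance / total


# Hypothetical variance components (arbitrary units), for illustration only.
h2 = heritability(genetic_variance=7.23, residual_variance=2.77)
print(f"h2 = {h2:.1%}")
```

Because the estimate depends on the residual variance, the same genetic variance gives a lower heritability in a noisier environment, which is why the values are reported per environment.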
Yield-related traits for the association panel were determined over environments 08CD, 09CD, 08YZ and 09YZ. Average values of yield-related traits were calculated according to the BLUP method and a summary of parameters for the seven traits is listed in Table 1. The CV of phenotypic traits in each environment were higher than 10% with the exception of SNPS, indicating that trait values differed between cultivars. Moreover, the average h2 of PH and TKW were 72.3 and 51.1%, respectively, and higher than those of other traits in all environments.
A total of 907 alleles were detected in the association panel using 106 SSR markers. MAF ranged from 0.164 to 0.987 with a mean of 0.542. Numbers of alleles per locus varied from 2 to 25 with an average of 8.6. PIC values ranged from 0.026 to 0.903, with an average of 0.552 (S3 Table). These values indicated that the association panel had a relatively high level of molecular genetic diversity.
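The per-marker statistics reported here can be illustrated with a short Python sketch based on the textbook definitions of gene diversity (1 minus the sum of squared allele frequencies) and PIC. The function name, input format and allele calls are hypothetical, and the exact estimators in PowerMarker may differ, for example through sample-size corrections, so this should be read as an approximation of the idea rather than a reimplementation.

```python
from collections import Counter


def marker_diversity(allele_calls):
    """Summarise one SSR marker from one allele call per cultivar.

    allele_calls holds fragment sizes in bp; None marks missing data.
    Returns allele number, major allele frequency, gene diversity and PIC.
    """
    calls = [call for call in allele_calls if call is not None]
    if not calls:
        raise ValueError("no genotype calls for this marker")
    total = len(calls)
    freqs = [count / total for count in Counter(calls).values()]
    gene_diversity = 1 - sum(p * p for p in freqs)
    # PIC subtracts 2 * p_i^2 * p_j^2 summed over all pairs of alleles i < j.
    pair_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "allele_number": len(freqs),
        "major_allele_frequency": max(freqs),
        "gene_diversity": gene_diversity,
        "pic": gene_diversity - pair_term,
    }


# Hypothetical calls for ten cultivars at one marker.
example = [138] * 6 + [142] * 3 + [150]
print(marker_diversity(example))
```

PIC is never larger than gene diversity, and both approach zero when one allele is nearly fixed, which is the pattern seen at the low end of the reported PIC range.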
Genetic structure and relative kinship among cultivars
The population structure of the 230 cultivars was calculated based on 106 SSR markers with 907 alleles by Structure V2.3.2. The cultivars were basically divided into two sub-populations according to their geographic origins (Fig 1a). The number of presumed sub-populations (K) was set from 1 to 15 for calculating ΔK values, which reached the highest value at K = 2 (Fig 1b), confirming that the population should be divided into two sub-populations.
Fig 1. a: Genetic structure produced by Structure V2.3.2; b: Number of sub-populations estimated by ΔK at a range of K values.
Relative kinship coefficients between individuals were also calculated using data for 106 SSR markers (Fig 2). About 74.1% of the pairwise kinship coefficients ranged from 0 to 0.05, indicating that most cultivars had no, or only a weak, relationship with each other.
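The two kinship steps described in this study, setting negative coefficients to 0 and summarising how many pairs fall in a low range, can be sketched in Python as below. The function names and the 4 × 4 matrix are hypothetical; the real matrix would be the 230 × 230 output of the kinship software.

```python
def truncate_negative_kinship(matrix):
    """Set negative pairwise coefficients to 0, leaving other values unchanged."""
    return [[max(0.0, value) for value in row] for row in matrix]


def share_of_pairs_in_range(matrix, low, high):
    """Fraction of distinct pairs (i < j) whose coefficient lies in [low, high]."""
    size = len(matrix)
    pairs = [matrix[i][j] for i in range(size) for j in range(i + 1, size)]
    if not pairs:
        raise ValueError("need at least two individuals")
    return sum(low <= value <= high for value in pairs) / len(pairs)


# Hypothetical symmetric kinship matrix for four cultivars.
raw = [
    [1.00, -0.02, 0.03, 0.20],
    [-0.02, 1.00, 0.00, 0.08],
    [0.03, 0.00, 1.00, -0.10],
    [0.20, 0.08, -0.10, 1.00],
]
kinship = truncate_negative_kinship(raw)
print(share_of_pairs_in_range(kinship, 0.0, 0.05))
```

Only the upper triangle is counted so that each pair of cultivars contributes once and the diagonal (self-kinship) is excluded.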
Association analysis between seven yield-related traits and SSR markers
Association analysis was performed between the 907 alleles at 106 SSR loci and seven yield-related traits over four environments using a mixed linear model. Seventy-three significant associations were identified at 50 SSR loci (Fig 3, Table 2) located on all chromosomes except 4A, 7A and 7D. SSR loci associated with KNPS were located on chromosomes 1A, 1D, 2D, 3B, 5B, 5D and 6B; KWPS-associated loci on 1A, 2D and 5B; PH-associated loci on 1B, 2A, 2B, 3A, 4B, 5A, 5B, 6B and 7B; SL-associated loci on 1A, 2B, 4B, 4D, 5D and 6A; SNPS-associated loci on 1D, 5A, 5D, 6B and 7B; SSN-associated loci on 2B, 2D, 3A, 3D, 5A, 5B, 6D and 7B; and TKW-associated loci on 2A, 2B, 4B and 5A. In addition, seven SSR loci were significantly associated with yield-related traits in two or more environments, namely Xgwm135-1A with KNPS, Xgwm515-2A and Xgwm132-6B with PH, Xgwm219-6B with SNPS, and Xgwm102-2D, Xgwm297-7B and Xgwm383-3D with SSN. Furthermore, nine SSR loci, Xgwm135-1A, Xwmc361-2B, Xgwm102-2D, Xgwm495-4B, Xbarc56-5A, Xgwm186-5A, Xgwm540-5B, Xgwm182-5D and Xgwm132-6B, were significantly associated with two or more traits across environments.
Fig 3. The red dotted line indicates the threshold value of significant association.
The phenotypic variation explained (R2) varied among traits and SSR loci; the R2 of each association ranged from 1.54 to 23.93% with a mean of 10.00% (Table 2). Among the 73 associations, 27 had R2 values higher than 10% and 33 were between 5% and 10%. For example, Xgwm132-6B had R2 values of more than 10% for KNPS in 08CD and for PH in 08CD and 08YZ; Xgwm102-2D had R2 values of more than 10% for KNPS in 08CD and for SSN in 08CD and 09CD; and Xgwm135-1A had R2 values of more than 5% for KNPS in all four environments and for KWPS in 08CD.
Genetic effects of favorable alleles
The genetic effects of favorable alleles were calculated as differences between the mean of cultivars carrying an allele and the overall mean. A total of 50 associated favorable alleles were identified by comparing mean phenotypic data among alleles using multiple comparisons for alleles with allelic frequencies >5% (Table 3). Of these, the frequencies of Xgwm135-1A138 for KNPS, Xwmc361-2B216 for TKW, Xgwm102-2D144 for KNPS, KWPS and SSN, and Xgwm540-5B115 for KNPS, KWPS and PH were higher than 50%, indicating that these loci might have undergone strong selection pressure during modern breeding.
Genetic effects of favorable alleles among various loci were also evaluated in four environments (Table 3). Seven alleles showing the largest effects on different yield-related traits included Xgwm389-3B130 on KNPS (5.61), Xgwm540-5B115 on KWPS (0.16 g), Xgwm495-4B154 on PH (-7.92 cm), Xgwm194-4D133 on SL (0.38 cm), Xgwm219-6B186 on SNPS (1.00), Xgwm297-7B150 on SSN (-0.34), and Xgwm148-2B162 on TKW (2.09 g). In addition, we detected some loci associated with multiple traits, such as Xgwm102-2D144 having positive effects on KNPS, KWPS and SSN, and Xgwm132-6B112 with positive effects on KNPS and PH.
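The allele effect used in this section, the mean of the cultivars carrying an allele minus the overall mean, restricted to alleles with a frequency above 5%, can be sketched in Python as follows. The function name, the data layout and the KNPS values are hypothetical; the significance testing by multiple comparisons is not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def allele_effects(allele_calls, trait_values, min_frequency=0.05):
    """Return {allele: (frequency, effect)} for one marker and one trait.

    effect = mean trait value of carriers - mean of all genotyped cultivars.
    Alleles with frequency <= min_frequency are skipped.
    """
    pairs = [
        (allele, value)
        for allele, value in zip(allele_calls, trait_values)
        if allele is not None and value is not None
    ]
    if not pairs:
        raise ValueError("no complete genotype/phenotype pairs")
    overall_mean = mean(value for _, value in pairs)
    by_allele = defaultdict(list)
    for allele, value in pairs:
        by_allele[allele].append(value)
    effects = {}
    for allele, values in by_allele.items():
        frequency = len(values) / len(pairs)
        if frequency > min_frequency:
            effects[allele] = (frequency, mean(values) - overall_mean)
    return effects


# Hypothetical KNPS data for 20 cultivars at one SSR locus.
calls = [138] * 12 + [142] * 7 + [150]
knps = [42] * 12 + [38] * 7 + [40]
print(allele_effects(calls, knps))
```

In this example the 150 bp allele occurs in exactly 5% of cultivars and is therefore excluded, since the threshold requires a frequency strictly above 5%. For traits where lower values are desirable, such as PH or SSN, the favorable allele is the one with the most negative effect.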
Favorable alleles at crucial loci were not always the major ones in breeding panels
The 40 SSR loci associated with TKW and KNPS in Wang et al. and Zhang et al. were re-evaluated in the current population, and only three, viz. Xbarc56-5A with TKW, and Xwmc24-1A and Xgwm132-6B with KNPS, were significantly associated in this study. Failure to confirm most of the previously associated loci led us to question the cause. We therefore used ANOVA to verify the allelic effects at these loci in the panel. Significant allelic differences were detected at 18 loci (Table 4). Favorable alleles assigned previously at 10 loci were the major alleles, including five loci associated with KNPS (Xgwm2-3A116, Xgwm108-3B127, Xcfd64-3D239, Xgwm2-3D220 and Xcfe273-6A306) and five loci associated with TKW (Xgwm312-2A190, Xgwm372-2A331, Xcfa2234-3A142, Xcfd266-5D167 and Xcfa2257-7A129). At the other eight loci, the favorable alleles assigned previously were not the major ones in the current panel. They were Xgwm259-1B102, Xgwm337-1D168, Xgwm609-4D111 and Xgwm132-6B112, which were associated with KNPS, and Xgwm234-5B227, Xgwm174-5D209, Xwmc168-7A305, and Xwmc17-7A180, which were associated with TKW. We also found that the R2 values for TKW at nine loci were lower than previously reported. For example, the R2 values for Xcfa2234-3A and Xcfa2257-7A in the previous reports were 18.20 and 21.99%, respectively [26, 27], whereas in the current population they were 4.09 and 2.05%, indicating lower genetic effects in a breeding population.
Validation of favorable alleles in the DH population
To validate the genetic effects of alleles detected in the association panel, SSR loci significantly associated with yield-related traits were investigated for polymorphism between cultivars Hanxuan10 and Lumai14. A total of 19 SSRs were polymorphic between the two cultivars and were used to genotype the DH population. Statistical comparisons of phenotypic data identified significant differences between alleles at Xgwm135-1A, Xgwm337-1D, Xgwm102-2D and Xgwm132-6B in at least one of the two environments in which the DH population was grown (Table 5, Figs 4e, 4f, 5d and 5e). Trait data for favorable alleles were higher than those for pooled ‘other’ categories, although the differences were not statistically significant (Table 5). Therefore, the genetic effects of favorable alleles at these loci were confirmed in both a cultivar association panel and a DH population. Xgwm132 was linked with a PH QTL in a previous report (Fig 4a). Xgwm132 was associated with PH in two environments, and the favorable 128 bp allele had the highest frequency among 10 alleles at this locus (Fig 4b and 4c). The PH effects of Xgwm132-6B128 were -2.76 and -2.56 cm in 08CD and 08YZ, respectively (Fig 4d). This was further verified in the DH population, where the effects were -3.20 and -5.50 cm in environments DH10 and DH11, respectively (Fig 4e and 4f). Xgwm135 was also associated with KNPS in four environments (Fig 5a). The favorable 138 bp allele occurred at the highest frequency among seven alleles (Fig 5b). The phenotypic effects of Xgwm135-1A138 were 1.14, 1.19, 1.84 and 0.89 in 08CD, 08YZ, 09CD and 09YZ, respectively (Fig 5c). Positive effects of 0.97 and 2.15 on KNPS were also confirmed in DH10 and DH11, respectively (Fig 5d and 5e).
Fig 4. a: QTL locus Xgwm132 for PH on chromosome 6B; b: Associations of PH with 106 SSR markers illustrated as dot plots of compressed MLM at P<0.01. Red points represent association signals of Xgwm132 in different environments; c: Allelic frequencies for Xgwm132 among 230 wheat cultivars; the green band represents the 128 bp allele and the blue band represents the 136 bp allele; d: Phenotypic effect of favorable allele Xgwm132-6B128 on PH in the association panel used in this study; e and f: Comparison of average PH values between two alleles in two environments in the DH population. *, significant at P = 0.05.
Fig 5. a: Associations of KNPS with 106 SSR markers illustrated as dot plots of compressed MLM at P<0.01. Red points represent association signals of Xgwm135 in different environments; b: Allelic frequencies for Xgwm135 among 230 wheat cultivars; the green band represents the 138 bp allele and the blue band represents the 142 bp allele; c: Phenotypic effects on KNPS of favorable allele Xgwm135-1A138 in the association panel used in this study; d and e: Comparison of average KNPS between two alleles in two environments in the DH population. *, significant at P = 0.05.
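The logic of validating an allele in a DH population, in which each line carries one of the two parental alleles, is a two-group comparison of trait means. The Python sketch below illustrates this with an exact permutation test on the difference of means. This is a swapped-in technique chosen because it needs only the standard library; it is not the statistical comparison used in this study, and the function name and the PH values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_test(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means.

    Returns (observed difference, p-value). Enumerates every way of
    splitting the pooled values, so it is only practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    size_b = len(pooled) - size_a
    observed = mean(group_a) - mean(group_b)
    pooled_sum = sum(pooled)
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), size_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = sum_a / size_a - (pooled_sum - sum_a) / size_b
        splits += 1
        # Small tolerance guards against floating-point ties.
        if abs(difference) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / splits


# Hypothetical plant heights (cm) of DH lines grouped by parental allele.
favorable = [78, 80, 79, 81, 77]
other = [84, 83, 86, 85, 82]
effect, p_value = exact_permutation_test(favorable, other)
print(effect, p_value)
```

With 150 DH lines the number of possible splits is far too large to enumerate, so a real analysis would rely on ANOVA, a t-test, or a randomly sampled permutation test instead.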
Association analysis is more effective than biparental crosses in identifying yield-related genes
Association mapping and bi-parental population mapping utilize information about genetic recombination and the methods are complementary in identifying genes or QTLs. However, association mapping is more powerful in detecting superior alleles from a large sample of germplasm collections.
Several of the SSR loci were reported to be associated with yield-related traits in previous studies (Table 2). Xwmc24-1A and Xgwm132-6B were associated with QTLs for KNPS [7, 41]; Xgwm484-2D with KWPS; Xgwm155-3A, Xgwm186-5A, Xgwm132-6B and Xgwm46-7B with PH [7, 10, 25, 43]; Xgwm182-5D with SL; Xgwm297-7B and Xgwm186-5A with SSN [7, 45]; and Xbarc56-5A and Xwmc361-2B with TKW [45–49]. By association analysis of a panel of cultivars, we not only confirmed earlier marker/trait associations, but also found several new associations of markers and yield-related traits (Tables 2 and 5), such as an association of Xgwm135-1A with KNPS in four environments, Xgwm102-2D with KNPS, KWPS and SSN, and Xgwm337-1D with SNPS. Association mapping combined with bi-parental population analysis is even more powerful in identifying closely linked molecular markers involving yield-related genes [50, 51]. For example, in a QTL analysis of drought tolerance in three RIL populations using SNP markers and 305 diverse inbred lines in maize, Lu et al. found that joint linkage-LD mapping identified 18 QTLs additional to those detected in separate linkage and LD analyses. Korir et al. detected five markers associated with aluminum tolerance in both an association panel of 188 cultivars and 184 RILs from a bi-parental soybean cross, confirming that these loci should be the best candidate regions to target. Twenty-two seed weight and silique length-related QTLs were detected in three bi-parental populations in rapeseed. Among them, uq.A09-1 and uq.A09-3 were identified in all four environments and fine mapped in a set of 576 inbred lines using association analysis. Four associated SSR loci, Xgwm135-1A, Xgwm337-1D, Xgwm102-2D and Xgwm132-6B, were detected in our bi-parental population (Table 5, Figs 4e, 4f, 5d and 5e). They had effects on increasing spikelet and kernel numbers and decreasing plant height.
These results demonstrate the power of combined association and bi-parental analyses in identifying closely linked molecular markers for economic traits.
Genetic effects of associated loci were panel-dependent
In previous studies, Wang et al. and Zhang et al. used the Chinese wheat mini core collection (MCC) to perform association analysis between TKW, KNPS and SSR markers. That collection represented 1% of the national germplasm collection, but more than 70% of the genetic diversity. We genotyped 40 SSR loci associated with TKW and KNPS from Wang et al. and Zhang et al. Only three loci, Xbarc56-5A associated with TKW, and Xwmc24-1A and Xgwm132-6B associated with KNPS, showed significant associations. A possible reason for the low number was that the present set comprised released cultivars, among which the allelic profiles were very different. ANOVA showed that 18 of 40 SSR loci had genetic effects on either TKW or KNPS (Table 4). This indicated that genetic effects of loci were also influenced by the population entries. In addition, the R2 values for TKW at nine loci were much lower than reported for the MCC panel. For example, the earlier R2 values for Xcfa2234-3A142 and Xcfa2257-7A129 were 18.20 and 21.99%, respectively [26, 27], but were only 4.09 and 2.05% in the present study (Table 4). This lower variation is likely due to the effects of long-term selection in breeding programs, because the frequencies of Xcfa2234-3A142 and Xcfa2257-7A129 in the earlier reports were 43.9 and 20.6%, respectively [26, 27], but were 93.5 and 55.7% in the current cultivar panel (Table 4). Because released cultivars usually carry superior alleles at crucial loci, the genetic effects of those loci were greatly reduced and the R2 values were lower. For example, Qin et al. detected four haplotypes, Hap-6B-1, Hap-6B-2, Hap-6B-3 and Hap-6B-4, at TaGW2-6B; the frequencies of Hap-6B-1 and Hap-6B-2 showed increasing trends over time, and by the late 1980s Hap-6B-3 and Hap-6B-4 had disappeared. The allelic difference between Hap-6B-1 and Hap-6B-2 was much smaller than that involving either of them with Hap-6B-3 or Hap-6B-4.
Therefore, the more important a locus is for an agronomic trait, the stronger it will be selected in breeding. Hence the R2 value should decline from a random germplasm collection to a released cultivar population.
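One way to see why selection can lower R2 is a deliberately simplified two-class model, sketched below in Python. The assumptions are mine and not the authors': homozygous lines are split into carriers and non-carriers of the favorable allele, the difference between the two class means is fixed, and the total phenotypic variance does not change. Under these assumptions the variance contributed by the locus is proportional to p(1 - p), where p is the frequency of the favorable allele. The function names, the class difference and the phenotypic variance are hypothetical; only the four frequencies are taken from the text.

```python
def frequency_term(p):
    """p * (1 - p): the frequency-dependent part of a two-class locus variance."""
    if not 0 <= p <= 1:
        raise ValueError("allele frequency must lie between 0 and 1")
    return p * (1 - p)


def expected_r2(p, class_difference, phenotypic_variance):
    """Share of phenotypic variance explained by a two-class locus."""
    if phenotypic_variance <= 0:
        raise ValueError("phenotypic variance must be positive")
    return frequency_term(p) * class_difference ** 2 / phenotypic_variance


# Frequencies quoted in the text: mini core collection vs current cultivar panel.
for allele, p_mcc, p_panel in [
    ("Xcfa2234-3A142", 0.439, 0.935),
    ("Xcfa2257-7A129", 0.206, 0.557),
]:
    ratio = frequency_term(p_panel) / frequency_term(p_mcc)
    print(allele, round(ratio, 2))
```

The term is largest at p = 0.5 and shrinks toward zero as an allele approaches fixation. For Xcfa2234-3A142 the shift from 43.9 to 93.5% cuts the term to roughly a quarter, in line with the direction of the reported drop in R2. For Xcfa2257-7A129 the shift from 20.6 to 55.7% does not reduce the term, so under this simplification the lower R2 at that locus would have to come from other sources, such as a smaller difference among the alleles remaining in released cultivars.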
Genetic effects were also affected by environment (G x E). Flowering time in maize is a complex trait affected by both genes and environment. ZmCCT is one of the most important genes affecting photoperiod response. Hung et al. found that many maize inbred lines carried ZmCCT alleles with no sensitivity to day length, allowing breeders to produce more widely adapted maize varieties. In the current study, association signals were detected in multiple environments, i.e. 08CD, 09CD, 08YZ and 09YZ. Average values of yield-related traits were also calculated according to the BLUP method and associated with SSRs. Based on association detection using BLUP mean values, seven SSR loci had significant association signals in two or more environments, namely Xgwm135-1A with KNPS, Xgwm515-2A and Xgwm132-6B with PH, Xgwm219-6B with SNPS, and Xgwm102-2D, Xgwm297-7B and Xgwm383-3D with SSN. Therefore, the influence of environments on genetic effects of the associated loci was indirectly reflected by comparison of their values in different environments. Hence, environmental influence was greater for associated loci that had an effect in only one or a few environments (Table 3).
Favorable alleles in past and future wheat breeding
Some loci have played important roles in wheat breeding. A good example is Xgwm261, which is 0.6 cM from Rht8 on chromosome 2D. The favorable Xgwm261 allele (192 bp) is associated with an approximate 10 cm reduction in plant height. Rht8 is also closely linked with Ppd-D1, which affects varietal adaptability, leading to increased grain yield in certain environments. Zhou et al. identified Rht8 in many Chinese wheat varieties widely grown in the last 30 years. About 40% of varieties contained Rht8 based on pedigree, but its frequency varied in different ecological zones. Italian cultivars Funo, Villa Glory, St1472/506 and St2422/464, widely used as founder genotypes in Chinese wheat breeding [58, 59], are all carriers of Rht8. Grain weight also underwent strong selection during wheat breeding. For example, TaGW2-A1, which is significantly associated with thousand-grain weight (TGW), was mapped to chromosome 6A. The superior haplotype Hap-6A-A increases TGW by more than 3 g. Among loci that were validated in the DH population in this study, the frequencies of favorable alleles Xgwm135-1A138 and Xgwm102-2D144 with positive effects on KNPS exceeded 50%, suggesting that these loci have contributed to Chinese wheat breeding (Tables 3 and 5). On the other hand, the frequency of Xgwm337-1D186, with a clear genetic effect verified in both populations, was only 5.22%. Clearly, that allele could be selected to increase SNPS in future marker-assisted selection (MAS) breeding. Thus, strong selection in the breeding of newly released cultivars has already focused on some favorable alleles [26, 27, 61]. Modern Chinese varieties produced over the last 60 years are based on 16 founder parents. Some of these founder parents were included in the 230 wheat cultivars used in this study. Abbondanza, Funo, and St2422/464, for example, carried more favorable alleles than some other founder parents in our study, as was also reported by Ge et al.
In summary, four favorable alleles, namely Xgwm135-1A138, Xgwm337-1D186, Xgwm102-2D144, and Xgwm132-6B128, identified in this study will be useful in future breeding for high yield.
S1 Table. The 230 wheat accessions used in association analysis.
S2 Table. The 106 SSR loci used in association analysis.
S3 Table. Allele number, MAF and PIC of 106 polymorphic SSR markers detected in the association panel.
We gratefully acknowledge help from Professor Robert A. McIntosh, University of Sydney, with English editing. We also thank Dr. Caixia Lan, CIMMYT, for advice.
Conceived and designed the experiments: XZ SC. Performed the experiments: JG YZ X. Chang RJ WY WS. Analyzed the data: JG CH. Contributed reagents/materials/analysis tools: YZ BZ X. Chang LQ TL X. Cheng RJ WY WH. Wrote the paper: JG CH XZ SC.
- 1. Food and Agriculture Organisation (FAO). FAOSTAT database; 2014. Available: http://faostat.fao.org/.
- 2. Shi JQ, Li RY, Qiu D, Jiang CC, Long Y, et al. (2009) Unraveling the complex trait of crop yield with quantitative trait loci mapping in Brassica napus. Genetics 182: 851–861. doi: 10.1534/genetics.109.101642. pmid:19414564
- 3. Korzun V, Röder MS, Ganal MW, Worland AJ, Law CN (1998) Genetic analysis of the dwarfing gene (Rht8) in wheat. Part I. Molecular mapping of Rht8 on the short arm of chromosome 2D of bread wheat (Triticum aestivum L.). Theor Appl Genet 96: 1104–1109.
- 4. Worland AJ, Korzun V, Röder MS, Ganal MW, Law CN (1998) Genetic analysis of the dwarfing gene Rht8 in wheat. Part II. The distribution and adaptive significance of allelic variants at the Rht8 locus of wheat as revealed by microsatellite screening. Theor Appl Genet 96: 1110–1120.
- 5. Kato K, Miura H, Sawada S (1999) QTL mapping of genes controlling ear emergence time and plant height on chromosome 5A of wheat. Theor Appl Genet 98: 472–477.
- 6. Cui F, Li J, Ding AM, Zhao CH, Wang L, et al. (2011) Conditional QTL mapping for plant height with respect to the length of the spike and internode in two mapping populations of wheat. Theor Appl Genet 122: 1517–1536. doi: 10.1007/s00122-011-1551-6. pmid:21359559
- 7. Wu XS, Chang XP, Jing RL (2012) Genetic insight into yield-associated traits of wheat grown in multiple rain-fed environments. PLOS ONE 7: e31249. doi: 10.1371/journal.pone.0031249. pmid:22363596
- 8. Tyagi S, Mir RR, Kaur H, Chhuneja P, Ramesh B, et al. (2014) Marker-assisted pyramiding of eight QTLs/genes for seven different traits in common wheat (Triticum aestivum L.). Mol Breed 34: 167–175.
- 9. Wu XY, Cheng RY, Xue SL, Kong ZX, Wan HS, et al. (2014) Precise mapping of a quantitative trait locus interval for spike length and grain weight in bread wheat (Triticum aestivum L.). Mol Breed 33: 129–138.
- 10. Xu YF, Wang RF, Tong YP, Zhao HT, Xie QG, et al. (2014) Mapping QTLs for yield and nitrogen-related traits in wheat: influence of nitrogen and phosphorus fertilization on QTL expression. Theor Appl Genet 127: 59–72. doi: 10.1007/s00122-013-2201-y. pmid:24072207
- 11. Kato K, Miura H, Sawada S (2000) Mapping QTLs controlling grain yield and its components on chromosome 5A of wheat. Theor Appl Genet 101: 1114–1121.
- 12. Quarrie SA, Quarrie SP, Radosevic R, Rancic D, Kaminska A, et al. (2006) Dissecting a wheat QTL for yield present in a range of environments: from the QTL to candidate genes. J Exp Bot 57: 2627–2637. pmid:16831847
- 13. Shah MM, Gill KS, Baenziger PS, Yen Y, Kaeppler SM, et al. (1999) Molecular mapping of loci for agronomic traits on chromosome 3A of bread wheat. Crop Sci 39: 1728–1732.
- 14. Campbell BT, Baenziger PS, Gill KS, Eskridge KM, Budak H, et al. (2003) Identification of QTLs and environmental interactions associated with agronomic traits on chromosome 3A of wheat. Crop Sci 43: 1493–1505.
- 15. Marza F, Bai GH, Carver BF, Zhou WC (2006) Quantitative trait loci for yield and related traits in the wheat population Ning7840 × Clark. Theor Appl Genet 112: 688–698. pmid:16369760
- 16. Groos C, Robert N, Bervas E, Charmet G (2003) Genetic analysis of grain protein-content, grain yield and thousand-kernel weight in bread wheat. Theor Appl Genet 106: 1032–1040. pmid:12671751
- 17. Laperche A, Brancourt-Hulmel M, Heumez E, Gardet O, Hanocq E, et al. (2007) Using genotype × nitrogen interaction variables to evaluate the QTL involved in wheat tolerance to nitrogen constraints. Theor Appl Genet 115: 399–415. pmid:17569029
- 18. Ramya P, Chaubal A, Kulkarni K, Gupta L, Kadoo N, et al. (2010) QTL mapping of 1000-kernel weight, kernel length, and kernel width in bread wheat (Triticum aestivum L.). J Appl Genet 51: 421–429. pmid:21063060
- 19. Hai L, Guo HJ, Wagner C, Xiao SH, Friedt W (2008) Genomic regions for yield and yield parameters in Chinese winter wheat (Triticum aestivum L.) genotypes tested under varying environments correspond to QTL in widely different wheat materials. Plant Sci 175: 226–232.
- 20. Li SS, Jia JZ, Wei XY, Zhang XC, Li LZ, et al. (2007) A intervarietal genetic map and QTL analysis for yield traits in wheat. Mol Breed 20: 167–178.
- 21. Ma ZQ, Zhao DM, Zhang CQ, Zhang ZZ, Xue SL, et al. (2007) Molecular genetic analysis of five spike-related traits in wheat using RIL and immortalized F2 populations. Mol Genet Genomics 277: 31–42. pmid:17033810
- 22. Myles S, Peiffer J, Brown PJ, Ersoz ES, Zhang Z, et al. (2009) Association mapping: critical considerations shift from genotyping to experimental design. Plant Cell 21: 2194–2202. doi: 10.1105/tpc.109.068437. pmid:19654263
- 23. Flint-Garcia SA, Thornsberry JM, Buckler ES (2003) Structure of linkage disequilibrium in plants. Annu Rev Plant Biol 54: 357–374. pmid:14502995
- 24. Atwell S, Huang YS, Vilhjálmsson BJ, Willems G, Horton M, et al. (2010) Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 465: 627–631. doi: 10.1038/nature08800. pmid:20336072
- 25. Sajjad M, Khan SH, Ahmad MQ, Rasheed A, Mujeeb-Kazi A, et al. (2013) Association mapping identifies QTLs on wheat chromosome 3A for yield related traits. Cereal Res Commun 42: 177–188.
- 26. Wang LF, Ge HM, Hao CY, Dong YS, Zhang XY (2012) Identifying loci influencing 1,000-kernel weight in wheat by microsatellite screening for evidence of selection during breeding. PLOS ONE 7: e29432. doi: 10.1371/journal.pone.0029432. pmid:22328917
- 27. Zhang DL, Hao CY, Wang LF, Zhang XY (2012) Identifying loci influencing grain number by microsatellite screening in bread wheat (Triticum aestivum L.). Planta 236: 1507–1517. doi: 10.1007/s00425-012-1708-9. pmid:22820969
- 28. Jing RL, Chang XP, Jia JZ, Hu RH (1999) Establishing wheat doubled haploid population for genetic mapping by anther culture. Biotechnology 9: 4–8 (English abstract).
- 29. Bernardo R (1996) Test cross additive and dominance effects in best linear unbiased prediction of maize single-cross performance. Theor Appl Genet 93: 1098–1102. doi: 10.1007/BF00230131. pmid:24162487
- 30. Bernardo R (1996) Marker-based estimate of identity by descent and alikeness in state among maize inbreds. Theor Appl Genet 93: 262–267. doi: 10.1007/BF00225755. pmid:24162227
- 31. Bernardo R (1996) Best linear unbiased prediction of maize single-cross performance. Crop Sci 36: 50–56.
- 32. Sharp PJ, Chao S, Desai S, Gale MD (1989) The isolation, characterization and application in Triticeae of a set of wheat RFLP probes identifying each homoeologous chromosome arm. Theor Appl Genet 78: 342–348. doi: 10.1007/BF00265294. pmid:24227239
- 33. Somers DJ, Isaac P, Edwards K (2004) A high-density microsatellite consensus map for bread wheat (Triticum aestivum L.). Theor Appl Genet 109: 1105–1114. pmid:15490101
- 34. Liu K, Muse SV (2005) PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21: 2128–2129. pmid:15705655
- 35. Pritchard JK, Stephens M, Rosenberg NA, Donnelly P (2000) Association mapping in structured populations. Am J Hum Genet 67: 170–181. pmid:10827107
- 36. Evanno G, Regnaut S, Goudet J (2005) Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 14: 2611–2620. pmid:15969739
- 37. Hardy OJ, Vekemans X (2002) SPAGeDi: a versatile computer program to analyze spatial genetic structure at the individual or population levels. Mol Ecol Notes 2: 618–620.
- 38. Loiselle BA, Sork VL, Nason J, Graham C (1995) Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). Am J Bot 82: 1420–1425.
- 39. Yu JM, Pressoir G, Briggs WH, Bi IV, Yamasaki M, et al. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet 38: 203–208. pmid:16380716
- 40. Bradbury PJ, Zhang ZW, Kroon DE, Casstevens TM, Ramdoss Y, et al. (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23: 2633–2635. pmid:17586829
- 41. Zhang ZW, Ersoz E, Lai CQ, Todhunter RJ, Tiwari HK, et al. (2010) Mixed linear model approach adapted for genome-wide association studies. Nat Genet 42: 355–360. doi: 10.1038/ng.546. pmid:20208535
- 42. Li N, Shi JQ, Wang XF, Liu GH, Wang HZ (2014) A combined linkage and regional association mapping validation and fine mapping of two major pleiotropic QTLs for seed weight and silique length in rapeseed (Brassica napus L.). BMC Plant Biol 14: 114. doi: 10.1186/1471-2229-14-114. pmid:24779415
- 43. Schnurbusch T, Paillard S, Fossati D, Messmer M, Schachermayr G, et al. (2003) Detection of QTLs for Stagonospora glume blotch resistance in Swiss winter wheat. Theor Appl Genet 107: 1226–1234. pmid:12928778
- 44. Kumar N, Kulwal PL, Balyan HS, Gupta PK (2007) QTL mapping for yield and yield contributing traits in two mapping populations of bread wheat. Mol Breed 19: 163–177.
- 45. Zhang LY, Liu DC, Guo XL, Yang WL, Sun JZ, et al. (2010) Genomic distribution of quantitative trait loci for yield and yield-related traits in common wheat. J Integr Plant Biol 52: 996–1007. doi: 10.1111/j.1744-7909.2010.00967.x. pmid:20977657
- 46. Cuthbert JL, Somers DJ, Brûlé-Babel AL, Brown PD, Crow GH (2008) Molecular mapping of quantitative trait loci for yield and yield components in spring wheat (Triticum aestivum L.). Theor Appl Genet 117: 595–608. doi: 10.1007/s00122-008-0804-5. pmid:18516583
- 47. Gupta PK, Rustgi S, Kumar N (2006) Genetic and molecular basis of grain size and grain number and its relevance to grain productivity in higher plants. Genome 49: 565–571. pmid:16936836
- 48. Wang RX, Hai L, Zhang XY, You GX, Yan CS, et al. (2009) QTL mapping for grain filling rate and yield-related traits in RILs of the Chinese winter wheat population Heshangmai × Yu8679. Theor Appl Genet 118: 313–325. doi: 10.1007/s00122-008-0901-5. pmid:18853131
- 49. Sun XC, Marza F, Ma HX, Carver BF, Bai GH (2010) Mapping quantitative trait loci for quality factors in an inter-class cross of US and Chinese wheat. Theor Appl Genet 120: 1041–1051. doi: 10.1007/s00122-009-1232-x. pmid:20012855
- 50. Wu RL, Zeng ZB (2001) Joint linkage and linkage disequilibrium mapping in natural populations. Genetics 157: 899–909. pmid:11157006
- 51. Wu RL, Ma CX, Casella G (2002) Joint linkage and linkage disequilibrium mapping of quantitative trait loci in natural populations. Genetics 160: 779–792. pmid:11861578
- 52. Lu YL, Zhang SH, Shah T, Xie CX, Hao ZF, et al. (2010) Joint linkage-linkage disequilibrium mapping is a powerful approach to detecting quantitative trait loci underlying drought tolerance in maize. Proc Natl Acad Sci USA 107: 19585–19590. doi: 10.1073/pnas.1006105107. pmid:20974948
- 53. Korir PC, Zhang J, Wu KJ, Zhao TJ, Gai JY (2013) Association mapping combined with linkage analysis for aluminum tolerance among soybean cultivars released in the Yellow and Changjiang River Valleys in China. Theor Appl Genet 126: 1659–1675. doi: 10.1007/s00122-013-2082-0. pmid:23515677
- 54. Hao CY, Dong YC, Wang LF, You GX, Zhang HN, et al. (2008) Genetic diversity and construction of a core collection in Chinese wheat genetic resources. Chin Sci Bull 53: 1518–1526.
- 55. Qin L, Hao CY, Hou J, Wang YQ, Li T, et al. (2014) Homoeologous haplotypes, expression, genetic effects and geographic distribution of the wheat yield gene TaGW2. BMC Plant Biol 14: 107. doi: 10.1186/1471-2229-14-107. pmid:24766773
- 56. Hung HY, Shannon LM, Tian F, Bradbury PJ, Chen C, et al. (2012) ZmCCT and the genetic basis of day-length adaptation underlying the postdomestication spread of maize. Proc Natl Acad Sci USA 109: E1913–E1921. doi: 10.1073/pnas.1203189109. pmid:22711828
- 57. Zhou Y, He ZH, Zhang GS, Xia LQ, Chen XM, et al. (2003) Rht8 dwarf gene distribution in Chinese wheats identified by microsatellite marker. Acta Agron Sinica 29: 810–814 (English abstract).
- 58. Zhang XY, Li CW, Wang LF, You GX, Dong YC (2002) An estimation of the minimum number of SSR alleles needed to reveal genetic relationships in wheat varieties. I. Information from large-scale planted varieties and corner-stone breeding parents in Chinese wheat improvement and production. Theor Appl Genet 106: 112–117. pmid:12582878
- 59. Zhuang QS (2003) Chinese wheat improvement and pedigree analysis. Beijing: Agricultural Press. (In Chinese).
- 60. Su ZQ, Hao CY, Wang LF, Dong YC, Zhang XY (2011) Identification and development of a functional marker of TaGW2 associated with grain weight in bread wheat (Triticum aestivum L.). Theor Appl Genet 122: 211–223. doi: 10.1007/s00122-010-1437-z. pmid:20838758
- 61. Ge HM, You GX, Wang LF, Hao CY, Dong YS, et al. (2012) Genome selection sweep and association analysis shed light on future breeding by design in wheat. Crop Sci 52: 1218–1228.