SNP markers for low molecular glutenin subunits (LMW-GSs) at the Glu-A3 and Glu-B3 loci in bread wheat.

The content and composition of seed storage proteins are largely responsible for wheat end-use quality. These proteins mainly consist of polymeric glutenins and monomeric gliadins. According to their electrophoretic mobility, gliadins and glutenins are subdivided into several fractions. Glutenins are classified as high-molecular-weight or low-molecular-weight glutenin subunits (HMW-GSs and LMW-GSs, respectively). LMW-GSs are encoded by multigene families located at the orthologous Glu-3 loci. We designed a set of 16 single-nucleotide polymorphism (SNP) markers that are able to detect SDS-PAGE alleles at the Glu-A3 and Glu-B3 loci. The SNP markers captured the diversity of alleles in 88 international reference lines and 27 Mexican cultivars when compared to SDS-PAGE and STS markers; however, they showed a slightly larger percentage of multiple alleles, mainly for Glu-B3. The SNP markers were then used to determine the Glu-1 and Glu-3 allele composition in 54 CIMMYT historical lines and proved to be a useful tool for breeding programs aiming to improve wheat end-product properties.


Introduction
The world population is growing exponentially and therefore demands more and a greater diversity of food while facing less available land and the need to conserve soil, water, and genetic resources. More than any other crop, wheat (Triticum aestivum L.) provides calories and protein and is present in thousands of everyday foods worldwide. The global bread wheat consumption supplies nearly 16 g of protein per capita daily and is quickly increasing in developing countries, which are predicted to have the largest population increases [1].
The major problem is that even though wheat yields are increasing [2,3], the percentage increase is below the projected percentage demand, with a deficit of about 0.6% projected annually until 2050 [4,5]. And while overall production must increase, high-quality standards for human nutrition, end-use functional properties and commodity value must also be maintained.
Wheat processing quality and end product properties are determined by a set of complex traits, the most important being the content and composition of seed storage proteins (gluten), the endosperm texture or grain hardness, the composition of starch and non-starch polysaccharides, and, for some specific products, the color of the flour/semolina. Gluten proteins, representing the major protein fraction of the starchy endosperm, are the main factors responsible for the unique viscoelastic properties (elasticity or strength and extensibility) of wheat dough, which underlie the utilization of wheat to prepare bread and other wheat products [6,7]. Gluten is composed of a large number of proteins, mainly glutenins and gliadins. Glutenins contribute more to gluten strength while gliadins contribute to extensibility and viscosity. Among the glutenins there are high-molecular-weight glutenin subunits (HMW-GS) and low-molecular-weight glutenin subunits (LMW-GS). Common wheat possesses three to five HMW-GS encoded by the Glu-A1, Glu-B1 and Glu-D1 loci on the long arms of chromosomes 1A, 1B and 1D, respectively. Although HMW-GS account for only 10% of the wheat storage proteins, they play a key role in determining bread wheat quality. The LMW-GS are encoded by the Glu-A3, Glu-B3 and Glu-D3 loci (located on the short arms of chromosomes 1A, 1B and 1D, respectively), and large allelic variation has been reported, with different alleles being designated by the name of the locus followed by a letter [8,9].
The LMW-GS represent about one-third of the total seed protein and around 60% of total glutenins [10]. Therefore, the characterization of allelic variation for LMW-GS in addition to HMW-GS among cultivars and investigation of their relationships with end-use quality has been a key area of research on quality improvement [11]. The LMW-GS have, however, proven to be more difficult to identify because of their complexity, heterogeneity, and similarity to each other and to some gliadin components.
To ensure good quality products, wheat quality analyses are an integral part of most breeding programs. Typically, parental lines for new crosses and advanced lines are evaluated. Quality analyses are rarely conducted during the early breeding generation selection process, mainly because (i) screening entries in a heterozygous state within segregating populations is less indicative than screening pure lines; (ii) early testing for quality must not compromise other breeding priorities, which must run with finite testing resources; and (iii) high-throughput, accurate methodologies are not available to analyze the extremely high number of entries present in early generations.
Single nucleotide polymorphisms (SNPs) and insertion/deletions (InDels) are common genetic polymorphisms within the HMW-GS and LMW-GS genes. PCR-based sequence-tagged site (STS) markers exist for ascertaining most HMW-GS [12-15]. Recently, LMW-GS genes were also isolated, and functional markers identifying the subunit composition on chromosomes 1A and 1B in common wheat were developed to allow marker-assisted breeding [16,17]. However, although these markers are closely linked or functional, their application remains limited, largely due to their dominant inheritance coupled with the cost and time required for screening large progenies via conventional PCR and gel electrophoresis.
The objective of our study was therefore to develop and validate a set of SNP markers that are functional for Glu-A3 and Glu-B3 alleles and useful for screening large numbers of samples in marker-assisted breeding. SNP marker results were validated by full comparison to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and STS marker methods.

Plant materials
Two germplasm sets were used to validate the designed SNP markers for Glu-A3 and Glu-B3 alleles. The first set (Set 1) comprised 88 international reference lines with maximized diversity for Glu-A3 and Glu-B3 alleles (Table 1). The set was developed by the Expert Working Group on Improving Wheat Quality for Processing and Health of the Wheat Initiative (https://www.wheatinitiative.org/). Seed was compiled from different sources and multiplied in Mexicali, México, and is available for distribution through the CIMMYT Germplasm Bank.
Lines were selected from a set of 103 standard cultivars previously used to establish a uniform classification system of LMW-GS alleles [18,19] and analyzed in various laboratories for the identification of the alleles. The second set (Set2) comprised a selection of 27 Mexican cultivars derived from an overall collection of 150 national cultivars (Table 2). An allele survey using the validated SNP markers was then performed on 54 historical CIMMYT lines (Set 3) representing key germplasm distributed by CIMMYT over the last 50 years (Table 3).

SDS-PAGE analyses
LMW-GS were initially separated by SDS-PAGE in the CIMMYT wheat quality lab based on the extraction method described by [9] with modifications. Separation gels were 15.0% T and 1.3% C at pH 8.5, and were run at a current of 12.5 mA. The LMW-GS compositions were identified according to [9] and [20], and the gliadins were used as indicators of LMW-GS based on the linkage between LMW-GS and gliadin. The nomenclature system of LMW-GS followed the catalogue of gene symbols for wheat (http://wheat.pw.usda.gov/ggpages/awn/53/Textfile/WGC.html).

DNA extraction and genotyping with STS markers
Genomic DNA was extracted from dried leaves using a modified CTAB (cetyltrimethylammonium bromide) method as described in CIMMYT laboratory protocols [21] and quantified using a NanoDrop 8000 spectrophotometer V 2.1.0 (Thermo Fisher Scientific). To avoid inconsistent results derived from using different seed sources or residual heterogeneity within seed lots, the same seed (cut in half) was used for the quality and genetic analyses for Set 1. For Sets 2 and 3, leaves from a random bulk of ten seedlings per entry were collected. Functional STS markers published by [16,17] were applied to identify Glu-A3 and Glu-B3 alleles in Set 1 and Set 2. PCR assays in single 10 μl reactions contained final concentrations of 1× Buffer with Green Dye (Promega Corp., US), 200 μM dNTPs, 1.2 mM MgCl2, 0.25 μM of each primer, 1 U of DNA polymerase (GoTaq® Flexi, Promega Corp., Cat. # M8295) and 50 ng of DNA template. The PCR profile was 94°C for 2 min followed by 30 cycles of 94°C for 1 min, 54 to 60°C for 2 min (dependent on the STS primers), and 72°C for 2 min. The amplified products were separated on 2-3% agarose gels in TAE buffer.

SNP marker design and genotyping
KASP assays (LGC Genomics, LLC, Beverly, MA, USA) were developed to genotype the relevant SNPs. Coding sequences of the different Glu-A3 and Glu-B3 alleles at each locus were retrieved from the published literature [16,17]. The diagnostic SNPs/InDels for different alleles were identified and KASP primers were developed using the PolyMarker pipeline (http://polymarker.tgac.ac.uk/) following standard KASP guidelines (described in CIMMYT laboratory protocols [21]). The allele-specific primers were designed carrying the standard FAM (5′ TGAAGGTGACCAAGTTCATGCT 3′) and HEX (5′ aGAAGGTCGGAGTCAACGGATT 3′) tails and with the targeted SNP at the 3′ end. A common primer was designed so that the total amplicon length was less than 120 bp. The primer mixture comprised 46 μl ddH2O, 30 μl common primer (100 μM), and 12 μl of each tailed primer (100 μM). Assays were tested in 384-well formats and set up as 4 μl reactions containing 2.5 μl water, 2.5 μl 2× KASP Reaction mix, 0.07 μl Assay mix and 50 ng of dried DNA with a PCR profile of 94°C for 15 min
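The primer layout and the primer mixture described above can be sketched in a few lines of Python. This is only an illustration, not the PolyMarker pipeline or a supplier tool: the function names, the `core_len` default and the example upstream sequence are hypothetical, and the tail sequences are copied as printed in the text and should be checked against the supplier's documentation.

```python
# Illustrative sketch only; not the PolyMarker pipeline or a supplier tool.

# Tail sequences copied as printed in the text above (assumption: verify
# them against the supplier's documentation before ordering oligos).
FAM_TAIL = "TGAAGGTGACCAAGTTCATGCT"
HEX_TAIL = "aGAAGGTCGGAGTCAACGGATT"


def allele_specific_primers(upstream, fam_base, hex_base, core_len=20):
    """Return (FAM primer, HEX primer) with the SNP base at the 3' end.

    upstream -- sequence immediately 5' of the SNP (hypothetical input)
    fam_base -- SNP base reported by the FAM-tailed primer
    hex_base -- SNP base reported by the HEX-tailed primer
    core_len -- length of the template-matching part, SNP base included
                (hypothetical default, not taken from the text)
    """
    if core_len < 2:
        raise ValueError("core_len must be at least 2")
    if len(upstream) < core_len - 1:
        raise ValueError("upstream sequence is shorter than core_len - 1")
    core = upstream[-(core_len - 1):].upper()
    return (FAM_TAIL + core + fam_base.upper(),
            HEX_TAIL + core + hex_base.upper())


def assay_mix_concentrations(water_ul=46.0, common_ul=30.0,
                             tailed_ul=12.0, stock_um=100.0):
    """Final primer concentrations (in uM) in the primer mixture.

    Defaults are the volumes and stock concentration given in the text;
    two tailed primers are added at `tailed_ul` each.
    """
    total_ul = water_ul + common_ul + 2 * tailed_ul
    return {
        "total_ul": total_ul,
        "common_uM": common_ul * stock_um / total_ul,
        "each_tailed_uM": tailed_ul * stock_um / total_ul,
    }


if __name__ == "__main__":
    # Made-up upstream sequence, for demonstration only.
    fam, hex_ = allele_specific_primers("ACGTACGTACGTACGTACGTACGT",
                                        "A", "G", core_len=10)
    print(fam)
    print(hex_)
    print(assay_mix_concentrations())
```

With the volumes stated in the text, the mixture totals 100 μl, which corresponds to 30 μM common primer and 12 μM of each tailed primer.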

SNP marker validation
Segregation of LMW-GS using SDS-PAGE. The LMW-GS compositions identified by SDS-PAGE are listed in Tables 1 and 2. At the Glu-A3 locus, alleles Glu-A3b and Glu-A3e could be easily detected. Alleles Glu-A3c, Glu-A3d and Glu-A3f were more difficult to separate [17] (Fig 1). At the Glu-B3 locus, the alleles Glu-B3d, Glu-B3h and Glu-B3i each carried the slowest LMW-GS bands in SDS-PAGE. As previously reported, allele Glu-B3f could not always be reliably discriminated from Glu-B3g since these bands had very similar mobilities. This was the case for six lines in Set 1. In four lines in Set 2, the allele Glu-B3h was detected together with alleles Glu-B3b, c or i.

Fig 1. High- and low-molecular-weight glutenin subunit composition (HMW- and LMW-GS) by SDS-PAGE. Glu-A3 alleles are indicated by a rhombus, Glu-B3 alleles by an arrow. Labels correspond to the entry numbers in Table 1.

Verification of LMW-GS via allele-specific STS markers. Seven primer pairs were used to identify Glu-A3 alleles and nine primer pairs for Glu-B3 alleles. The amplified fragments were of the same size as previously published in [16,17] (Fig 2). The Glu-A3 and Glu-B3 alleles could be differentiated well; however, the STS markers could not identify allele Glu-B3j, which resulted in 16 missing values for the 176 data points. Overall 78.3% of the detected
Validation of LMW-GS using the newly designed KASP assays. The KASP assays designed are presented in Table 3, and the assays are aligned to their respective allele sequences in S1 File. All assays formed clear sample clusters, with the allele labeled with the fluorescence 'VIC' representing the designated Glu-A3 and Glu-B3 allele (Fig 3). Similar to the STS markers, a KASP assay for the Glu-B3j allele was not designed, leading to 20 missing values for 176 data points. Among Set 1 and Set 2, 77.7% of the data points were identical between the KASP assays and SDS-PAGE. In 12.6% of the lines, the KASP assays showed the same allele as SDS-PAGE plus an additional allele. This was mainly due to Glu-B3c and Glu-B3g alleles being detected simultaneously with the KASP assays. In 10.0% of the lines, alleles identified by KASP were different from alleles identified by SDS-PAGE. A higher percentage (83.6%) of alleles was identical and a lower percentage of alleles (1.9%) differed between the KASP assays and the STS markers, while in 14.5% of cases one or the other marker type showed more than one allele. In Set 1, parallel to the STS markers, different Glu-A3 alleles were identified for the lines ERNEST  Table 1.
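The three agreement categories used above (identical, same allele plus an additional allele, different) can be expressed as a small set comparison. The following Python sketch is an assumption about how such a tally could be computed, not the procedure actually used in the study; the function names are hypothetical and the example calls are made-up toy data.

```python
from collections import Counter


def classify(reference, test):
    """Compare two allele calls for one line at one locus.

    Each call is a set of allele letters, e.g. {"c"} or {"c", "g"};
    an empty set marks a missing call.
    Returns 'missing', 'identical', 'additional' or 'different'.
    """
    ref, tst = set(reference), set(test)
    if not ref or not tst:
        return "missing"
    if ref == tst:
        return "identical"
    if ref < tst:  # test shows the reference allele(s) plus extra ones
        return "additional"
    return "different"


def concordance(pairs):
    """Percentage of scored data points in each category.

    Missing data points are excluded from the denominator.
    """
    counts = Counter(classify(ref, tst) for ref, tst in pairs)
    scored = sum(n for label, n in counts.items() if label != "missing")
    if scored == 0:
        return {}
    return {label: 100.0 * n / scored
            for label, n in counts.items() if label != "missing"}


if __name__ == "__main__":
    # Toy data only: (SDS-PAGE call, KASP call) per line and locus.
    toy_pairs = [
        ({"b"}, {"b"}),
        ({"c"}, {"c", "g"}),
        ({"d"}, {"f"}),
        ({"j"}, set()),      # no assay available, counted as missing
        ({"h"}, {"h"}),
    ]
    print(concordance(toy_pairs))
```

For the comparison between KASP and STS markers, where either method may show more than one allele, the subset test would have to be applied in both directions.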

Allele survey in CIMMYT historical lines
The newly designed and validated SNP markers were subsequently applied to survey the Glu-3 alleles in Set 3, which included 54 CIMMYT historical lines. Furthermore, SNP markers previously developed for the Glu-A1, Glu-B1 and Glu-D1 loci were deployed (Table 4) to determine the HMW-GS in the same set of lines. At the Glu-A1 locus, 72% of the lines carried the Ax2* subunit and 28% the Ax1 subunit (Table 3). None of the lines carried the null allele, which causes a low-quality score. Subunit Bx13 at the Glu-B1 locus was the only allele to be amplified with SNP markers. KASP assays for the subunit Bx7OE exist [22] but were not found to provide reliable results based on a lack of clustering in this study. Among the CIMMYT historical lines, two lines (OPATA M85 and WHEAR/SOKOLL) carried this allele. For Glu-D1, the subunit combination of Dx5 and Dy10 was found in 75% of the lines, while the allelic pair Dx2 + Dy12, which has negative effects on bread making quality, was found in 25% of the lines, including six lines released after the year 2000. The Glu-A3 locus showed lower allelic diversity than the Glu-B3 locus. Five different alleles were found at the Glu-A3 locus. The most frequent Glu-A3 allele was allele c, present in 75% of the lines, followed by allele b, present in 15% of the lines. One to three lines carried alleles d, e, and f (Table 3). Seven different alleles were found at the Glu-B3 locus. Thirty-four percent of the lines carried Glu-B3 allele h, while 23% and 17% of the lines carried alleles g and b, respectively. Alleles d, f, i and c were observed in less than 10%
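Allele frequencies of the kind reported in this survey follow from a simple count of the allele calls per locus. The Python sketch below is a minimal illustration under the assumption that each line has one call per locus and that missing calls are excluded from the denominator; the function name and the example calls are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def allele_frequencies(calls):
    """Percentage of lines carrying each allele at one locus.

    calls -- one allele letter per line; None marks a missing call and
             is excluded from the denominator (an assumption).
    """
    observed = [call for call in calls if call is not None]
    if not observed:
        return {}
    counts = Counter(observed)
    return {allele: round(100.0 * n / len(observed), 1)
            for allele, n in counts.most_common()}


if __name__ == "__main__":
    # Toy calls for five lines, one of them missing.
    print(allele_frequencies(["c", "c", "b", None, "c"]))
```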

Discussion
To determine wheat quality, several measurements of wheat grain, flour, dough, and final products must be assessed within breeding programs [23]. Several of the measurements are greatly limited by the amount of seed required and the cost and time of testing. Grain tests can be done on a small scale, quickly and cost effectively, making high-throughput implementation possible. However, dough rheology and end-use tests require large quantities of grain, which is destroyed when milled into flour, limiting their implementation to advanced stages in a breeding pipeline [24]. Therefore, marker-assisted selection (MAS) could be beneficial for these traits.
Gluten is the most important factor in determining bread making quality and is largely optimized by deployment of appropriate HMW-GS and LMW-GS alleles [25]. DNA markers linked to most of the individual gluten subunits or alleles exist but still require gel electrophoresis, even though higher-throughput and more cost-effective genotyping platforms are available. The newly designed SNP markers for Glu-A3 and Glu-B3 alleles in this study were validated by comparing results to those derived by SDS-PAGE and STS marker methods, with SDS-PAGE taken as the standard. Various laboratories have tried to unify the nomenclature system of LMW-GS based upon the relative mobilities in SDS-PAGE and identified the protein composition of 103 reference lines, of which 75 were included in Set 1 of our study [18,19]. The agreement between the three methods in this study was relatively high. The SNP markers agreed slightly better with SDS-PAGE than the STS markers did (10% vs. 11.5% disagreement) but showed overall a somewhat higher proportion of multiple alleles (12.6% vs. 10.9%). We expect that the disagreement in allele detection has several causes. Each method has a different discrimination power. Several of the alleles that were inconsistent across methods involved SDS-PAGE bands that were difficult to separate (e.g., Glu-A3c and d) and thus might not have been correctly identified. In addition, SDS-PAGE does not reveal residual heterozygous loci. The Glu-3 STS markers are dominant markers and can therefore produce false negatives, which resulted in more missing data. As with SDS-PAGE, heterozygous loci cannot be detected. Because they target only a single SNP and not a larger gene region, the KASP assays can be less allele specific. A larger number of multiple alleles was derived from the KASP assays, mainly those for the alleles Glu-B3c and Glu-B3g, indicating some lack of allele specificity. KASP assays, however, can detect heterozygous loci.
The disagreement of alleles in Set 2 might also have derived from the different seed lots used for the SDS-PAGE and molecular marker analyses. Residual heterozygosity within and between seed lots is not uncommon. As a high-throughput genotyping platform, the KASP assays offer some extra benefits in comparison to SDS-PAGE and/or STS markers: 1) early breeding generation segregating materials can be evaluated, owing to the co-dominant inheritance of the markers; 2) production costs are reduced; and 3) highly automated options are available that require less time to generate results. The advantage of co-dominant markers in MAS strategies is that they allow a more direct assessment of the frequencies of the target alleles, and thus progeny testing of selected individuals at later generations (e.g., F5 or F6) to recover homozygotes is not required. The cost for evaluating a line or sample with SDS-PAGE is around 30 USD at CIMMYT. Using a multi-gel buffer chamber allows running both gliadin and glutenin extracts at the same time, providing reliable results, but not all alleles can be easily distinguished and results across different laboratories vary significantly. The sample cost using DNA markers consists of a DNA extraction cost and the genotyping cost. Genotyping using KASP is cheaper than using STS markers, especially when many samples are evaluated. In this study, 21 KASP assays were used to evaluate 54 historical CIMMYT lines for Glu-1 and Glu-3 alleles. The current genotyping cost per sample using 21 KASP assays with CIMMYT service providers varies between 2.65 and 6.00 USD depending on the overall number of samples to be evaluated (1536 and 384 formats, respectively). However, while the cost of KASP assays is significantly lower, not all HMW-GS and LMW-GS alleles can be identified yet (e.g., Glu-B1 and Glu-D3 alleles).
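The cost figures quoted above lend themselves to a simple per-project comparison. The Python sketch below uses only the per-sample prices stated in the text; the DNA extraction cost is not given in the text and is therefore left as a parameter that defaults to zero, and the function names are hypothetical.

```python
# Per-sample prices as stated in the text (USD).
SDS_PAGE_USD = 30.0
KASP_21_ASSAYS_USD = {"1536": 2.65, "384": 6.00}  # keyed by plate format


def sds_page_cost(n_samples):
    """Total SDS-PAGE cost for n_samples lines."""
    return n_samples * SDS_PAGE_USD


def kasp_cost(n_samples, plate_format, dna_extraction_usd=0.0):
    """Total cost of genotyping n_samples with the 21 KASP assays.

    dna_extraction_usd -- per-sample extraction cost; not stated in the
                          text, so it defaults to 0 (an assumption).
    """
    per_sample = KASP_21_ASSAYS_USD[plate_format] + dna_extraction_usd
    return n_samples * per_sample


if __name__ == "__main__":
    n = 384  # arbitrary example batch size
    print("SDS-PAGE:", sds_page_cost(n))
    print("KASP, 384 format:", kasp_cost(n, "384"))
    print("KASP, 1536 format:", kasp_cost(n, "1536"))
```

The comparison is only indicative, since the two methods do not resolve exactly the same set of alleles and the lower KASP price applies only to large sample numbers.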
Thus, using the currently available KASP assays alone will not provide the full composition of alleles, but KASP assays are ideal for use in MAS when certain allele combinations are to be selected. Attempts to develop markers for other Glu-B1, Glu-D1 and Glu-D3 alleles are ongoing [26]. At CIMMYT, which has a very well-established wheat quality laboratory, the KASP assays presented in this study are currently used to verify uncertain SDS-PAGE results during the evaluation of advanced breeding lines in preliminary yield trials. KASP assays associated with Glu-1 alleles are used in MAS in early breeding generations. For breeding programs with limited resources or limited access to a wheat quality laboratory, the SNP markers present a great alternative tool for screening glutenin protein composition. [19] reported the composition of LMW-GS alleles of 103 wheat cultivars, including 75 of the 88 cultivars in our Set 1. Almost all our SDS-PAGE results were similar to those of [19], with only a few exceptions (four out of 75). For example, line AC VISTA showed allele Glu-A3g in our study but allele Glu-A3e in [19], and line BLUE SKY showed allele Glu-B3c in our study but Glu-B3g in [19]. Of the eight lines in total with different allele scores between SDS-PAGE and both DNA marker types (STS and KASP), five lines were also genotyped with STS markers in the same study of [19]. In four cases, the STS marker results matched the SDS-PAGE results in [19], while in one case the DNA marker results were the same in both studies, so the markers reported a clearly different allele from SDS-PAGE. One reason for any discrepancy between the two studies could be the use of a different seed source or residual heterogeneity in the lines.
Finally, 54 historical CIMMYT lines were evaluated with the available Glu-1 and Glu-3 KASP assays. At the Glu-1 loci, the frequencies of the subunits Ax1 or Ax2* and Dx5 + Dy10 were high. These subunits are positively associated with bread making quality [11,27-31]. [32] carried out an extensive survey of LMW-GS in common wheat cultivars by SDS-PAGE and detected 20 banding patterns. Subsequently, six protein alleles were found for the Glu-A3 locus (a, b, c, d, e, f) and nine for the Glu-B3 locus (a, b, c, d, e, f, g, h, i). With respect to effects on dough quality, various Glu-3 alleles were ranked for Rmax (maximum dough resistance, an indicator of dough strength), and the rankings of alleles were b > d > e > c at the Glu-A3 locus and i > b = a > e = f = g = h > c at Glu-B3 [29,33,34]. [35] found that the composition bbb for Glu-A3, Glu-B3 and Glu-D3, respectively, gave the best extensibility, and the composition bbc was almost as extensible. [36] considered that allele d at the Glu-A3 locus and allele b at the Glu-B3 locus conferred better quality parameters than their counterpart alleles. The most frequent Glu-A3 allele in the CIMMYT historical lines was allele Glu-A3c (76%), followed by allele Glu-A3b (15%). This result was consistent with the study by [37], who analyzed 273 CIMMYT lines with SDS-PAGE. The same authors found four different Glu-A3 alleles overall. In this study we found the same four Glu-A3 alleles, and additionally allele Glu-A3f in line HUW234+LR34/PRINIA*2//KIRITATI. Allele Glu-A3e, a null allele in hexaploid wheat, was found in only one line (HD2687). The most frequent Glu-B3 allele in the same set of 54 CIMMYT historical lines was allele Glu-B3h (34.0%), followed by allele Glu-B3g (22.6%) and allele Glu-B3b (17.0%). These results were similarly consistent with the study of [37], in which, however, 23% of the CIMMYT lines were also reported to carry the allele Glu-B3j.
This allele is associated with the 1B.1R translocation and exhibits a strongly negative effect on all quality parameters. Despite several attempts, a reliable SNP marker for the Glu-B3j allele has not yet been developed. The Glu-B3j allele was therefore not detected in lines known to carry the 1B.1R translocation, such as Seri M82. Overall, while the frequencies of HMW-GS positively associated with bread making quality are high in CIMMYT wheat varieties, there is an opportunity to improve the deployment of LMW-GS alleles.

Supporting information
S1 File. Sequence alignment of designed KASP assays associated with Glu-A3 and Glu-B3 alleles. All 200-bp sequences are derived from the S1 File in the publications of Wang et al. 2009 and 2010. The target SNP is highlighted in brackets; the KASP allele-specific primers and common primer are underlined. (DOC)