Identification of Low Molecular Weight Glutenin Alleles by Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight Mass Spectrometry (MALDI-TOF-MS) in Common Wheat (Triticum aestivum L.)

Low molecular weight glutenin subunits (LMW-GS) play an important role in determining dough properties and breadmaking quality. However, the resolution of the methodologies currently used for analyzing LMW-GS is rather low, which prevents efficient use of the genetic variation associated with these alleles in wheat breeding. The aim of the current study is to evaluate and develop a rapid, simple, and accurate method to differentiate LMW-GS alleles using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. A set of standard single LMW-GS allele lines as well as a suite of well documented wheat cultivars were collected from France, CIMMYT, and Canada. Method development and optimization focused on protein extraction procedures and MALDI-TOF instrument settings to generate reproducible diagnostic spectrum peak profiles for each of the known wheat LMW-GS alleles. Results revealed a total of 48 unique allele combinations among the studied genotypes. Characteristic MALDI-TOF peak patterns were obtained for 17 common LMW-GS alleles, including 5 (b, a or c, d, e, f), 7 (a, b, c, d or i, f, g, h) and 5 (a, b, c, d, f) patterns or alleles for the Glu-A3, Glu-B3, and Glu-D3 loci, respectively. In addition, some reproducible MALDI-TOF peak patterns were obtained that did not match any known alleles. The results demonstrated the high resolution and high throughput of MALDI-TOF technology in analyzing LMW-GS alleles, which makes it suitable for wheat breeding programs that must process a large number of wheat lines with high accuracy in limited time. They also suggested that the variation of LMW-GS alleles is more abundant than what has been defined by the current nomenclature system, which is mainly based on SDS-PAGE. The MALDI-TOF technology is useful for differentiating these variations. An international joint effort may be needed to assign allele symbols to these newly identified alleles and determine their effects on end-product quality attributes.


Introduction
Wheat seed storage proteins are composed of two major fractions, gliadins and glutenins. Based on their electrophoretic mobility, glutenin proteins are divided into high molecular weight glutenin subunits (HMW-GS) and low molecular weight glutenin subunits (LMW-GS) [1]. About 20% of the whole glutenin fraction is HMW-GS and 80% is LMW-GS [2]. LMW-GS are highly polymorphic and mainly encoded by genes at the complex loci Glu-A3, Glu-B3, and Glu-D3 on the short arms of the group 1 chromosomes 1A, 1B, and 1D, respectively [3,4]. They have highly significant effects on dough physical properties, especially dough extensibility, which is highly important for breadmaking [5][6][7][8][9]. Utilization of the genetic variation associated with LMW-GS is currently an important task in modern wheat breeding.
In bread wheat cultivars, Gupta and Shepherd identified 20 different LMW-GS banding patterns by SDS-PAGE: six controlled by Glu-A3 (a, b, c, d, e, f), nine by Glu-B3 (a, b, c, d, e, f, g, h, i) and five by Glu-D3 (a, b, c, d, e) [3]. These banding patterns were then defined as LMW-GS alleles. The allele effect rankings for dough physical properties were also established as Glu-A3: b>d>e>c; Glu-B3: i>b = a>e = f = g = h>c; Glu-D3: e>b>a>c>d. Ma et al. [6] demonstrated that selecting appropriate LMW-GS alleles is vitally important in achieving balanced wheat dough physical properties.
Currently, two analytical systems are predominantly used for differentiating LMW-GS alleles: sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) [10] and reversed-phase high-performance liquid chromatography (RP-HPLC) [11,12]. However, the Glu-3 loci for LMW-GS consist of a multigene family of about 30-40 variable genes. The LMW-GS composition is highly polymorphic and one allele is often composed of multiple proteins; it is therefore often difficult to identify and analyze LMW-GS accurately by the established methods, owing to the large number of expressed subunits and their overlapping mobilities with other proteins such as the abundant gliadins. Because scoring errors are common when LMW-GS compositions are determined by the current methods [13,14], LMW-GS variation is less utilized in wheat breeding than that of HMW-GS. A large international collaborative effort is currently focused on refining the LMW-GS nomenclature system [15].
In recent years, new technologies such as two-dimensional electrophoresis and N-terminal amino acid sequencing have been developed to characterize and define LMW-GS [16], which have greatly improved the accuracy of identifying LMW-GS alleles and the understanding of their structures and functions. However, these technologies are costly and low in throughput, and so are not suitable for large-scale wheat breeding programs that must accurately process a large number of samples in a short period. Matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF-MS) has been proven to be a powerful tool for wheat storage protein analysis [17,18]. It appears to be highly accurate and sensitive, requires only a small sample (normally less than 1 pmol), and is faster (about one minute per sample) than other common separation methods [18,19]. The high throughput is particularly attractive for rapid variety identification and is well suited to applications that handle a large number of samples in a short time, such as wheat breeding programs and wheat grain trading. Liu et al. [17] successfully applied this technology to the analysis of HMW-GS alleles and identified a number of new alleles from old wheat varieties.
For the LMW-GS alleles, Muccilli et al. [20] analyzed the characteristics of the B- and C-type low molecular weight glutenin subunits by MALDI-TOF. However, allele-specific MALDI-TOF profiles for LMW-GS alleles have not been established, and MALDI-TOF-MS is still not efficiently used as an analytical procedure in wheat breeding. The aim of the current study is to use MALDI-TOF technology as a tool for rapidly and accurately differentiating LMW-GS alleles in wheat breeding by establishing allele-specific MALDI-TOF spectrum profiles.

Wheat Material
A total of 60 hexaploid wheat lines with known LMW-GS compositions were used to establish the characteristic MALDI-TOF peak pattern for each LMW-GS allele. Aroona and its 16 substitution lines with different Glu-3 alleles detected by protein mobility were sourced from the South Australian Research & Development Institute Grain Quality Research Laboratory and were initially used to obtain allele-specific spectrum peak patterns (Table 1). A collection of 18 international reference varieties [21] and 25 hexaploid gene deletant lines with different Glu-3 alleles defined by SDS-PAGE were then used to verify the patterns obtained from the Aroona lines. The final allele patterns were used to analyze another 202 hexaploid wheat lines, including commercial cultivars and advanced breeding lines.

2.2.1 Protein extraction. [...] [22] and Melas et al. [23]. About 15 mg of crushed seeds or flour was weighed into a tube, 1 ml of 70% ethanol was added, and the tube was vortexed for 30 min at room temperature. The sample was centrifuged at 10,000 rpm for 5 min and the supernatant was discarded. Then 1 ml of 55% iso-propanol was added to the pellet, mixed well, and incubated in a 65°C water bath for 30 min. The sample was centrifuged at 10,000 rpm for 5 min and the supernatant was discarded. The above steps were repeated twice. The supernatant must be discarded entirely at every step to thoroughly remove the albumin, globulin and gliadin fractions. Next, 150 μl of extraction buffer (50% iso-propanol, 80 mM Tris-HCl, pH 8.0, and 1% DTT) was added and the sample was incubated in a 65°C water bath for 30 min. After centrifugation, glutenin fractions were alkylated by adding to the supernatant an equal volume of extraction buffer containing 1.4% vinylpyridine (v/v) and incubating at 65°C for 20 min. The sample was then centrifuged at 10,000 rpm for 10 min and 60 μl of the supernatant was transferred into a new tube. To this supernatant, 240 μl of pre-cooled acetone (-20°C) was added to a final concentration of 80% (v/v). The samples were kept in a -20°C freezer for 1-2 hours or overnight, then centrifuged at 10,000 rpm for 10 min, and the pellet was dried at room temperature. The final pellet was stored in a -20°C freezer for further use.
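The stated final acetone concentration can be checked with a short calculation. The sketch below is only an illustration: it assumes the volumes are simply additive, and the function name `final_fraction` is hypothetical rather than part of the published protocol.

```python
def final_fraction(added_volume_ul: float, sample_volume_ul: float) -> float:
    """Volume fraction (v/v) of the added solvent after mixing.

    Assumes the two volumes are additive, which is an approximation.
    """
    return added_volume_ul / (added_volume_ul + sample_volume_ul)


# Volumes quoted in the extraction protocol: 240 ul acetone added to 60 ul supernatant.
acetone_fraction = final_fraction(240.0, 60.0)
print(f"Final acetone concentration: {acetone_fraction:.0%} (v/v)")
```

With 240 μl of acetone and 60 μl of supernatant the fraction is 240/300, which agrees with the 80% (v/v) given in the text.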
2.2.2 Sample preparation. The precipitate was dissolved in 60 μl of 50% acetonitrile (ACN) and 0.05% trifluoroacetic acid (TFA) for 1 hour at room temperature. Sinapinic acid (SA), dissolved in 50% ACN and 0.05% TFA (10 mg/ml), was used as the matrix. Samples were spotted onto a MALDI-TOF Voyager DE Pro 100-sample plate in the order 0.7 μl SA, 0.7 μl sample, 0.7 μl SA. The sample plate was air dried before analysis by MALDI-TOF-MS.

MALDI-TOF-MS
A Biosystems Voyager DE Pro MALDI-TOF mass spectrometer with delayed extraction technology was used in this study. The mass spectrometer was operated in linear mode. The optimised instrument settings were as follows: 25 kV acceleration voltage, 0.15% guide wire voltage, 94% plate voltage, and 900 ns delay time, over the mass range from 10 kDa to 50 kDa. Laser power was set from a minimum of 1,800 to a maximum of 2,100. The final mass spectra recorded were the sum of 500 laser shots. For all samples, the shots were accumulated automatically in a random pattern over the sample area to provide the final spectra.

Optimization of Samples Extraction and MALDI-TOF-MS Settings
3.1.1 Sample optimization for MALDI-TOF-MS analysis. Since some gliadins have molecular weights similar to those of LMW-GS, they usually interfere with the detection of LMW-GS, especially when a highly sensitive instrument such as MALDI-TOF is used. It is essential to eliminate gliadins together with albumins and globulins to ensure a reliable result. To test the optimized sample extraction method (described in 2.2.1. Protein extraction), 150 μl of 55% iso-propanol was added to the sample tube in place of the 150 μl of extraction buffer and incubated at 65°C for 30 min. After centrifugation, the supernatant was analyzed by MALDI-TOF-MS and no peaks appeared. This indicates that the non-glutenin proteins had been eliminated completely from the pellet by the optimized protein extraction method, which is therefore suitable for MALDI-TOF-MS analysis.
The effect of vinylpyridine on LMW-GS extraction was also investigated. Adding vinylpyridine affected peak resolution differently in different molecular weight regions. It caused a slight reduction in peak resolution in the 30-34 kDa region. However, in the region above 34 kDa, the addition of 1.4% vinylpyridine (v/v) significantly enhanced the resolution and reproducibility of the LMW-GS profiling.
Sample concentration is also one of the main factors in MALDI-TOF-MS analysis. A sample concentration that is too high or too low causes some peaks to disappear. A range of sample preparation factors affect the final sample concentration, including the dissolving time, the type of matrix solution, the composition and ratios of the solvents, the resolving times, and the final sample volume. The tested concentration of TFA varied from 0.1% to 3.0%, and that of ACN from 0 to 50.0%. After the best TFA, ACN and H2O composition had been chosen based on the MALDI-TOF-MS spectra, sample dissolving times of 30 min, 1 h, 2 h, 3 h, 4 h and overnight were compared. Five sample dissolving volumes (30, 60, 100, 200 and 300 μl) were also compared. The final sample concentration was set by using 60 μl of 50% acetonitrile (ACN) and 0.05% trifluoroacetic acid (TFA) solution to dissolve the precipitate for 1 h at room temperature.

Identification of LMW-GS Alleles
The MALDI-TOF profiles of the 16 single Glu-3 substitution lines of Aroona (Table 1, S1 File) were initially used to establish a suite of characteristic protein peak combinations for all alleles. These allele-specific spectrum peak patterns were then tested and verified with 25 hexaploid gene deletant lines (Table 2, S2 File) and 18 reference varieties from three countries (Table 3, S3 File). As a result, characteristic spectrum peak patterns were obtained for 17 LMW-GS alleles. These include 5 (b, a or c, d, e, f), 7 (a, b, c, d or i, f, g, h) and 5 (a, b, c, d, f) alleles at the Glu-A3, Glu-B3 and Glu-D3 loci, respectively (Figs 1-5).
Each Glu-A3 allele is typically composed of one or two peaks, making Glu-A3 the simplest of the Glu-3 loci. The characteristic spectrum peak patterns of the Glu-A3 alleles are: 36,320 Da for Glu-A3b, 37,665 + 41,852 Da for Glu-A3c, 43,568 Da for Glu-A3d, 35,409 Da for Glu-A3e, and 37,444 Da for Glu-A3f (Figs 1 and 2a). [...] (Figs 2b, 2c, 2d and 3). It is noted that Glu-B3f is similar to Glu-B3g, with the latter having an extra 37,221 Da peak.
The Glu-D3 alleles were found to be the most complicated among the three LMW-GS loci. The number of characteristic peaks per allele ranges from 1 to 6. Some characteristic peaks were clustered together and most alleles contained more than one clustered peak group. [...] (Fig 4). The Glu-D3f allele was found to contain only one characteristic peak, 37,026 Da (Fig 5a).
Alleles Glu-A3a and Glu-A3c could not be differentiated by MALDI-TOF-MS because of their identical molecular masses, and were therefore assigned the same spectrum peak pattern. For the same reason, Glu-B3d and Glu-B3i were difficult to differentiate and also share one characteristic pattern. Some alleles possessed similar spectrum peak patterns, such as three Glu-B3 alleles (Glu-B3c, Glu-B3d and Glu-B3h) and two Glu-D3 alleles (Glu-D3a and Glu-D3b). However, their distinct peak appearances and molecular weight combinations made these alleles easy to differentiate. For example, Glu-B3c (39,791 + 42,949 Da) and Glu-B3d (39,599 + 42,848 Da) were highly similar, but the two peaks of Glu-B3c were 192 Da and 101 Da higher, respectively, than the corresponding peaks of Glu-B3d. In practice, this makes it rather simple to differentiate these two alleles. Furthermore, the reproducibility of the MALDI-TOF spectra also made allele differentiation a straightforward task.
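The scoring logic described above, in which an allele is called when all of its characteristic peaks are present, can be sketched as a simple lookup. This is a minimal illustration and not the authors' software: the reference masses are those quoted in the text (a subset of alleles only), while the matching tolerance of 40 Da, the function name and the example spectrum are assumptions made for the sketch. The tolerance is deliberately kept below half of the 101 Da gap separating Glu-B3c from Glu-B3d.

```python
# Characteristic peaks (Da) quoted in the text; only a subset of alleles is listed.
REFERENCE_PATTERNS = {
    "Glu-A3b": (36320,),
    "Glu-A3a/c": (37665, 41852),
    "Glu-A3d": (43568,),
    "Glu-A3e": (35409,),
    "Glu-A3f": (37444,),
    "Glu-B3c": (39791, 42949),
    "Glu-B3d/i": (39599, 42848),
    "Glu-D3f": (37026,),
}

TOLERANCE_DA = 40.0  # hypothetical matching window, not a value from the study


def matching_alleles(observed_peaks, patterns=REFERENCE_PATTERNS, tolerance=TOLERANCE_DA):
    """Return alleles whose every reference peak lies within `tolerance` of an observed peak."""
    hits = []
    for allele, reference_peaks in patterns.items():
        if all(
            any(abs(observed - reference) <= tolerance for observed in observed_peaks)
            for reference in reference_peaks
        ):
            hits.append(allele)
    return hits


# Hypothetical peak list for one sample (Da), invented for illustration.
observed = [36325.0, 37030.0, 39610.0, 42840.0]
print(matching_alleles(observed))  # ['Glu-A3b', 'Glu-B3d/i', 'Glu-D3f']
```

Requiring every reference peak to be present means that a sample showing only one peak of a two-peak pattern is not assigned that allele.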
The LMW-GS compositions of the eighteen reference wheat cultivars identified by MALDI-TOF-MS are listed in Table 3. For most cultivars, the MALDI-TOF results were consistent with the SDS-PAGE results [21]. The only exception was cultivar Festin at the Glu-A3 locus. Among these cultivars, allele Glu-A3c could not be differentiated from Glu-A3a, and Glu-B3d could not be distinguished from Glu-B3i, by MALDI-TOF-MS. It is worth noting that Zhang et al. [24] also reported difficulties in differentiating Glu-B3d and Glu-B3i owing to the nearly identical SDS-PAGE banding patterns of the two alleles. The results confirmed the feasibility of using MALDI-TOF-MS to analyze the compositions of LMW-GS. The MALDI-TOF results of the LMW-GS compositions for some accessions are shown in Figs 5b, 5c, 5d and 6.

Analysis of LMW-GS Alleles in Wheat Cultivars and Breeding Lines
Two hundred and two lines of hexaploid wheat (including cultivars and some advanced breeding lines) were analyzed by MALDI-TOF-MS, and the LMW-GS allele compositions obtained from the above allele-specific spectrum peak patterns are listed in Table 4. The results revealed a total of 48 allele combinations among the studied genotypes and a total of 24 allelic variants at the three Glu-3 loci, which are shown in Table 5.
At the Glu-A3 locus, eight different spectrum peak patterns were detected. Among these, five corresponded to known alleles while three did not match any known allele. Alleles Glu-A3a or c and Glu-A3b were present at frequencies of 50.5% and 36.1%, respectively. Alleles Glu-A3d (6.9%), Glu-A3e (2.0%) and Glu-A3f (2.5%) occurred at lower frequencies. The three new spectrum peak patterns at the Glu-A3 locus were detected in three Chinese wheat lines. Wheat line Baipimai possessed a spectrum peak with a molecular weight of 43,665 Da, which can be confidently treated as a new allele (Fig 5c). Chinese wheat landraces Hongmangmai and Hongmaoqiu contained another two abnormal peaks, 43,267 Da (Fig 5d) and 41,758 Da (Fig 6a), respectively, representing two new alleles.
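Allele frequencies of this kind are simply the share of lines carrying each call. The following sketch shows the calculation under stated assumptions: the function name is hypothetical and the ten example calls are invented for illustration, not taken from Table 4.

```python
from collections import Counter


def allele_frequencies(calls):
    """Percentage of lines carrying each allele call, rounded to one decimal place."""
    counts = Counter(calls)
    total = len(calls)
    return {allele: round(100.0 * n / total, 1) for allele, n in counts.items()}


# Hypothetical Glu-A3 calls for ten lines, not data from this study.
calls = ["a/c"] * 5 + ["b"] * 3 + ["d"] + ["f"]
print(allele_frequencies(calls))  # {'a/c': 50.0, 'b': 30.0, 'd': 10.0, 'f': 10.0}
```

Applied to a panel of 202 lines, a count of 102 would give 50.5%, which is one way to read the reported Glu-A3a or c frequency, although the underlying counts are not stated in the text.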
At the Glu-B3 locus, nine LMW-GS spectrum peak patterns were detected, including seven known allele-specific patterns and two new patterns that did not match any known alleles. Two alleles, Glu-B3b and Glu-B3h, were the most frequent, with frequencies of 42.6% and 41.1% among the accessions, respectively. Five allele-specific patterns, Glu-B3a, c, d or i, f and g, were [...]

Table 3. Identification of LMW-GS alleles composition at loci Glu-A3, Glu-B3, Glu-D3 in common wheat.

Discussion
The method development involved optimization of sample extraction and instrument settings to generate reproducible diagnostic spectrum profiles for wheat LMW-GS. Based on MALDI-TOF settings and models, over 100 wheat samples can be readily analyzed for LMW-GS alleles, indicating the high-throughput nature of the method. A total of 17 known LMW-GS alleles were found with matching spectrum peak patterns, including 5 (b, a or c, d, e, f), 7 (a, b, c, d or i, f, g, h) and 5 (a, b, c, d, f) alleles for the Glu-A3, Glu-B3 and Glu-D3 loci, respectively. According to the characteristic peak patterns of the LMW-GS alleles, 48 LMW-GS allele combinations or genotypes in common wheat (Triticum aestivum L.) were identified. For the 18 reference cultivars, the spectrum scoring results were consistent with the previously published SDS-PAGE results except for cultivar Festin, which was identified as carrying the Glu-A3ef allele by Branlard et al. [25] but appeared as Glu-A3f in our study. Because it was difficult to differentiate between Glu-A3e and Glu-A3f by SDS-PAGE, the two alleles were previously combined as Glu-A3ef [25]. However, our established MALDI-TOF procedure can clearly differentiate these two alleles.
Lines Aril 15-4 (Glu-(A3a, B3b, D3c)) and Aroona (Glu-(A3c, B3b, D3c)) displayed the same MALDI-TOF spectra. The spectra of cv Chinese Spring, which is the Glu-A3a donor parent for Aril 15-4, were identical to those of Aril 15-4 and Aroona, all having the same characteristic peaks at the Glu-A3 locus (37,665 + 41,852 Da). For this reason, alleles Glu-A3a and Glu-A3c were not differentiable by MALDI-TOF. Aril 16-1 (Glu-(A3b, B3b, D3c)) and Aril 18-5 (Glu-(A3d, B3b, D3c)) both expressed one of the Glu-A3ac characteristic peaks, 41,852 Da, but did not contain the 37,665 Da peak; thus the spectrum peak combination (37,665 + 41,852) Da was used to represent Glu-A3ac. Glu-A3d contained another two characteristic peaks, 39,985 Da and 43,587 Da; the latter is the highest molecular weight among all LMW-GS spectra and existed only in Glu-A3d. This made it convenient to identify Glu-A3d by simply checking for the maximal peak (43,587 Da). Glu-A3f contained another two peaks, 39,682 Da and 37,444 Da, with the latter existing exclusively in Glu-A3f. For this reason, the 37,444 Da peak was adopted as a scoring marker for allele Glu-A3f. All of these were confirmed by the Aril lines, the gene deletant lines, the reference cultivars, and most of the varieties analyzed. By using MALDI-TOF technology, we were able to identify the LMW-GS allelic compositions of 197 wheat lines out of 202. Three Chinese landrace lines expressed no characteristic peak patterns for known Glu-B3 alleles, suggesting novel Glu-B3 alleles in these wheat lines. No peak pattern matching known alleles could be identified in landrace Yumai, indicating novel alleles at the three Glu-3 loci. The characteristic peaks for Glu-D3c were (33,229 + 33,316 + 33,476) Da, which were used as the core spectrum peak pattern and the criterion for determining the Glu-D3c allele. However, four types of sub-allele variants were found for Glu-D3c, each containing a different set of additional peaks apart from the three core peaks, including [...] known Glu-D3c allele. This clearly demonstrates the enhanced power of the MALDI-TOF procedure in analyzing LMW-GS allelic compositions. Such analytical power is desirable in modern wheat breeding, since LMW-GS composition is highly polymorphic and high-resolution identification of LMW-GS protein compositions is critical for efficiently utilizing the genetic variation [26][27][28].
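The idea of a core pattern plus additional peaks for Glu-D3c can be illustrated as follows. This is a rough sketch rather than the procedure used in the study: the three core masses come from the text, whereas the 20 Da tolerance, the function names and the example peak list are assumptions. In practice the observed peaks would also need to be restricted to those attributable to the Glu-D3 locus, which the sketch does not attempt.

```python
GLU_D3C_CORE = (33229, 33316, 33476)  # core peaks (Da) quoted in the text
TOLERANCE_DA = 20.0  # hypothetical window, smaller than half the 87 Da gap between core peaks


def has_core_pattern(observed_peaks, core=GLU_D3C_CORE, tolerance=TOLERANCE_DA):
    """True when every core peak has an observed peak within `tolerance`."""
    return all(
        any(abs(observed - reference) <= tolerance for observed in observed_peaks)
        for reference in core
    )


def additional_peaks(observed_peaks, core=GLU_D3C_CORE, tolerance=TOLERANCE_DA):
    """Observed peaks that are not explained by any core peak."""
    return [
        observed
        for observed in observed_peaks
        if all(abs(observed - reference) > tolerance for reference in core)
    ]


# Hypothetical peak list (Da), invented for illustration.
observed = [33231.0, 33310.0, 33480.0, 34100.0]
print(has_core_pattern(observed))   # True
print(additional_peaks(observed))   # [34100.0]
```

Samples that pass the core check but differ in their additional peaks would then be candidates for the sub-allele variants described above.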
It is worth re-emphasizing that new Glu-3 alleles have been identified from our limited germplasm collection through the established MALDI-TOF procedure. The novel subunits associated with the abnormal spectrum peaks 43,665 Da, 43,267 Da and 41,758 Da may play a particular role in determining the viscoelastic properties of wheat dough. A more detailed study is required to characterize those novel alleles.

Table 5. Allele combinations and variants at the three Glu-3 loci in common wheat. Columns: Glu-A3, Glu-B3, Glu-D3, Varieties, Frequency (%).

In summary, a highly efficient MALDI-TOF-MS procedure was established in the current study, providing a rapid, simple, accurate and reliable method to identify wheat LMW-GS allele compositions. Through this approach, the complex LMW-GS can be readily differentiated. It can be used as an alternative approach for rapid identification of wheat LMW-GS in wheat breeding, and is most suitable for dealing with a large number of samples in a short period.
Disclaimer: This work is financially funded by Australian Grain Research & Development Corporation project: UMU00028. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Supporting Information
S1 File. (Raw Data for Table 1) Mass spectrum files for each wheat line listed in Table 1.
S2 File. (Raw Data for Table 2) Mass spectrum files for each wheat line listed in Table 2. The data is compressed and in Biosystems MALDI-TOF software Data-Explorer format. (ZIP)
S3 File. (Raw Data for Table 3) Mass spectrum files for each wheat line listed in Table 3. The data is compressed and in Biosystems MALDI-TOF software Data-Explorer format. (ZIP)