Quantitative Trait Loci Associated with the Tocochromanol (Vitamin E) Pathway in Barley

A Genome-Wide Association Study (GWAS) approach was used to detect Quantitative Trait Loci (QTL) associated with tocochromanol concentrations in a panel of 1,466 barley accessions. All major tocochromanol types (α-, β-, δ-, and γ-tocopherol and tocotrienol) were assayed. We found 13 single nucleotide polymorphisms (SNPs) associated with the concentration of one or more of these tocochromanol forms in barley, seven of which were within 2 cM of sequences homologous to cloned genes associated with tocochromanol production in barley and/or other plants. These associations confirmed a prior report based on bi-parental QTL mapping. This knowledge will aid future efforts to better understand the role of tocochromanols in barley, with specific reference to abiotic stress resistance. It will also be useful in developing barley varieties with higher tocochromanol concentrations, although at current recommended daily consumption amounts, barley would not be an effective sole source of vitamin E. However, it could be an important contributor in the context of whole grains in a balanced diet.

Despite the well-established nutritional requirement of tocochromanols for reproductive health and normal neurological development in mammals [3], the precise physiological function of these compounds remains elusive. The scientific literature is replete with laboratory studies on the nutritional benefits of tocochromanols, particularly with respect to cardiovascular disease [4]. Oddly, depending on the specific health risk, human epidemiological studies have been equivocal [5], with some reporting that the impact of αT is positive [4,6], negative [7], or relatively neutral [8]. In one exceptionally large trial, in which 39,876 apparently healthy women were administered either vitamin E or a placebo over an average of 10.1 years, very little evidence was found that vitamin E reduced the risk of either cardiovascular diseases or cancer [9]. However, most of the current literature is based on experiments where supplements, in the form of natural or synthetic αT, were used to test the effects of vitamin E on human health. High doses of αT are known to inhibit absorption of other tocochromanols in humans [10,11], and these effects may be long lasting [12]. More research is needed to fully understand the effects of consuming tocochromanols in a natural form (i.e. in whole grains).
In addition to their possible implications for human health, tocochromanols play an important role in plant stress tolerance. One key function of tocochromanols is to protect lipid membranes in the photosynthetic machinery from a range of oxidative stresses, primarily by deactivating the reactive oxygen species ¹O₂ and OH• [13]. When used to scavenge lipid peroxyl radicals in plants, tocochromanols must be restored by another reducing agent, such as ascorbate (vitamin C), to regain functionality. In scavenging ¹O₂, the antioxidant is irreversibly damaged [13]. The functions of other tocochromanols in plant physiology remain to be elucidated.
To date, there have been two major studies of the genetic controls of tocochromanol synthesis in barley. In one study, the cDNA sequence encoding homogentisate geranylgeranyl transferase (HGGT), an enzyme necessary for tocotrienol synthesis, was isolated in barley [14]. In the same study, the barley HGGT sequence was used for Agrobacterium-mediated transformation of maize, resulting in a six-fold increase of tocotrienols in the seed. However, the gene encoding this cDNA was not assigned a linkage or physical map position. In a more recent study [15], analysis of a bi-parental mapping population resulted in the identification of three Quantitative Trait Loci (QTL) associated with the concentrations of one or more tocochromanol forms in barley, one on chromosome 6H, and two on chromosome 7H. The QTL on chromosome 6H was attributed to VTE4, and one of the QTL on chromosome 7H was attributed to either HGGT or VTE2, based on orthology between rice and barley.
The availability of a comprehensive linkage map and a genome sequence in barley makes it possible to assess which regions in the barley genome are associated with variations in tocochromanol concentration, using a Genome-Wide Association Studies (GWAS) approach. GWAS is now widely used in a range of crop plants and is a powerful tool for rapidly detecting QTL and possibly even specific candidate genes [16,17]. In barley, GWAS has been used to identify QTL related to flowering time [18,19], disease resistance [18], and food quality [18,20].
Our objectives were to a) quantify the concentration of each tocochromanol form in cultivated barley using accessions from eight US spring barley breeding programs, b) identify QTL in the barley genome associated with the concentration of each tocochromanol form and fraction, and c) use identified QTL in conjunction with the barley genome sequence to identify candidate genes.
Tocochromanols were analyzed and quantitated using a modified saponification method [22]. Approximately 1 g of grain was ground in a Retsch ZM-1 mill (Haan, Germany), and an aliquot (approximately 0.5 g) was weighed and the weight recorded. The freshly ground sample was then extracted by addition of 0.5 ml 10 M KOH, 0.5 ml 95% ethanol, 0.5 ml 0.15 M NaCl and 1.25 ml of a 0.5 M solution of pyrogallol (in 95% ethanol) and shaken in a water bath at 70°C for 30 min, with vortexing every 10 min. The tubes were cooled on ice and an additional 3.75 ml of 0.15 M NaCl was added. This suspension was extracted twice with hexane/ethyl acetate (9:1 v/v) by vortexing, centrifuging at 1,000 × g for 5 min, and transferring the supernatant to a glass test tube. The combined organic phase was reduced to dryness in a Thermo-Savant SPD1010 speed-vac system (Asheville, NC) at 45°C. The dried extract was re-suspended in 1.0 ml hexane and centrifuged to remove particulates prior to analysis by High Performance Liquid Chromatography (HPLC).
For HPLC analysis, each sample was analyzed with a Shimadzu LC-5a HPLC (Kyoto, Japan) using a 4.6 × 250 mm, 5 μm Adsorbosil silica column (Grace Co., Deerfield, IL) with an isocratic mobile phase at a flow rate of 2.0 ml/min. Samples from the Barley CAP I germplasm were separated using a mobile phase of 0.5% isopropanol in hexane. This solvent system did not effectively separate γT and βT3. Therefore, a different mobile phase consisting of 2% ethyl acetate and 2% dioxane in hexane, which did separate these two congeners, was used for the Barley CAP II germplasm.
Fluorescence detection was employed using a Shimadzu RF-10A spectrofluorometer with excitation at 295 nm and detection at 330 nm. Peaks were integrated and compared to tocochromanol standards. Tocotrienols were quantitated using the standard curve developed for the corresponding tocopherol [23]. Tocochromanol data for germplasm arrays are available at The Triticeae Toolbox (T3) (http://triticeaetoolbox.org/; verified 13 October 2014) [24].

Methods
Barley accessions in the "Barley CAP I" and "Barley CAP II" germplasm arrays were genotyped for 3,072 single nucleotide polymorphism (SNP) markers with two GoldenGate Oligonucleotide Pool Assays (OPAs), as described by Close et al. [25] and Szücs et al. [26]. The genotyping was conducted at the USDA-ARS Small Grains Genotyping Center in Fargo, North Dakota. After excluding markers with more than 10% missing data and markers that were cosegregating in this set of germplasm, 2,204 of the 3,072 SNP markers from the two OPAs were used in this analysis. Of the 1,534 accessions genotyped, 68 were excluded from the analysis because they had more than 10% missing genotypic data. Therefore, the GWAS is based on 1,466 barley accessions. SNP data were retrieved from The Triticeae Toolbox (T3) (http://triticeaetoolbox.org/; verified 13 October 2014) (Blake et al. 2012).
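The missing-data filter described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the toy genotype calls, and the use of `None` for a missing call are all assumptions, and the text does not state whether markers or accessions were filtered first.

```python
from typing import Dict, List, Optional

Genotype = Optional[int]  # 0/1 allele call; None marks a missing call (assumed encoding)


def missing_fraction(calls: List[Genotype]) -> float:
    """Fraction of calls that are missing (None)."""
    if not calls:
        return 1.0
    return sum(c is None for c in calls) / len(calls)


def filter_by_missing(
    matrix: Dict[str, List[Genotype]], max_missing: float = 0.10
) -> Dict[str, List[Genotype]]:
    """Keep rows (markers or accessions) with at most max_missing missing calls."""
    return {
        name: calls
        for name, calls in matrix.items()
        if missing_fraction(calls) <= max_missing
    }


# Hypothetical toy data: m1 has 10% missing calls, m2 has 20%.
toy = {
    "m1": [0, 1, 0, 1, 0, 1, 0, 1, 0, None],
    "m2": [0, None, None, 1, 0, 1, 0, 1, 0, 1],
}
kept = filter_by_missing(toy)
```

The same function can be applied to a marker-by-accession table and to its transpose, which is one way to apply the 10% threshold to both markers and accessions.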
Linkage map positions from the barley consensus map [27] were used to identify the position of SNP markers. One SNP marker that was significant in this analysis, 11_20311, had not been assigned a position in this consensus map. Therefore, its position in the barley genome sequence [28], relative to SNPs with known linkage map positions, was used to approximate its cM position. Linkage Disequilibrium (LD) between these markers was calculated using the "Measure.R2S" function in the R package "LDcorSV." The breeding program of each accession's origin was used to partially account for population structure for LD calculations in this panel.
An R script based on the "GWAS" function in the package rrBLUP version 4.1 [29], with minor modifications, was employed using R version 3.0.1 to conduct GWAS. Markers with a minor-allele frequency below 5% were removed. The Efficient Mixed-Model Association eXpedited (EMMAX) method, using a kinship matrix and five principal components, was used to account for genetic structure in this set of accessions [16]. P-values were adjusted to account for multiple comparisons using the False Discovery Rate (FDR), developed by Benjamini and Hochberg [30]. In instances where multiple closely linked markers were significant, and one of the markers was more significant than every other marker in that region for every significant trait, only the most significant marker was reported. Marker effects were based on Best Linear Unbiased Estimates (BLUEs).
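The Benjamini-Hochberg adjustment mentioned above can be illustrated with a short, self-contained sketch. The analysis itself was run in R; this Python version, its function name, and the example p-values are illustrative assumptions only.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical marker p-values, not taken from the study.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A marker is then declared significant when its adjusted p-value falls below the chosen FDR level.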
Data from 2006 and 2007 were combined into a single analysis, using a fixed effect to account for differences across years, as described by Evangelou and Ioannidis [31]. This method of combining years was also used to combine barley food-quality data from an overlapping set of trials by Mohammadi et al. [20].
Positional information, Gene Ontology (GO) annotations [32], and InterPro assignments [33] were obtained for barley genes (ISBC_1.0.030312v22) through the Gramene version of the BioMart [34]. This database was scanned for genes that could be involved in the tocochromanol biosynthesis pathway, using a set of keywords to identify promising candidates. This list of genes was manually curated to remove genes that were identified by the automatic search but, after further review, were not determined to be associated with the tocochromanol biosynthesis pathway. A manual search was also conducted in which sequences from other species that are associated with the tocochromanol biosynthesis pathway were compiled from NCBI. To determine the linkage group and cM positions of candidate genes, OPA SNP markers were aligned with the barley genome sequence using the BLAST-Like Alignment Tool (BLAT) [36]. The base-pair position of SNP markers in the barley genome was determined from the BLAT output by percent identity and level of significance. The positions of candidate genes relative to their flanking markers in the genome sequence were then used to calculate approximate cM positions.
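One simple way to approximate a cM position from flanking markers is linear interpolation on base-pair position. The text does not state the exact formula that was used, so the sketch below is an assumption, and the coordinates in the example are hypothetical.

```python
from typing import Tuple

MarkerPos = Tuple[int, float]  # (base-pair position, cM position)


def interpolate_cm(bp: int, left: MarkerPos, right: MarkerPos) -> float:
    """Linearly interpolate a cM position between two flanking markers."""
    left_bp, left_cm = left
    right_bp, right_cm = right
    if left_bp == right_bp:
        raise ValueError("flanking markers must have distinct bp positions")
    fraction = (bp - left_bp) / (right_bp - left_bp)
    return left_cm + fraction * (right_cm - left_cm)


# Hypothetical gene halfway between markers mapped at 70 cM and 72 cM.
approx_cm = interpolate_cm(1500, (1000, 70.0), (2000, 72.0))
```

Linear interpolation assumes a constant recombination rate between the two flanking markers, which is only an approximation over short intervals.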

Phenotypic data
There were detectable concentrations of all tocochromanols in all accessions in both years (Table 1; Fig 1; S1 and S2 Figs). Including both years, αT concentrations ranged from 6.8 mg/kg to 23.9 mg/kg and total tocochromanol (TTC) concentrations ranged from 30.9 mg/kg to 94.1 mg/kg. Considering all forms, αT3 had the highest average concentration, and δT had the lowest average concentration. Means and standard errors for all tocochromanol forms are presented in Table 1. An analysis of variance showed that both year and breeding program had significant effects on αT and TTC concentrations (S1 Table). Row-type also had a significant effect on both αT and TTC (S1 Table). αT, αT3 and δT concentrations were higher in 2006 (irrigated) than in 2007 (dryland), whereas the reverse was true for βT, δT3, γT3 and TTC (p<0.0001 for all comparisons). As noted in the Materials and Methods, βT3 and γT were not distinguished in the analysis of 2006 samples. Therefore, it is not possible to assess the effect of year/management practice on these forms. The breeding program whose germplasm had the highest average tocochromanol concentration varied by year and tocochromanol form.
Significant SNPs were detected on chromosomes 1H, 6H, and 7H (Figs 2 and 3). The two significant SNPs on chromosome 1H were at cM 110 (associated with total tocotrienol (TT3) and TTC) and cM 128 (associated with βT). Based on the linkage distance between these SNPs and on an analysis of linkage disequilibrium (S1 File), these are two distinct regions with different QTLs/candidate genes and are described as 1H-A and 1H-B (Table 3). On chromosome 6H, two SNPs were significant: one at cM 58 (associated with δT), and one at cM 71 (associated with γT3). These two regions are described as 6H-A and 6H-B.
The remaining eight SNPs were on 7H and formed three regions (7H-A, 7H-B, and 7H-C): one at cM 1 (associated with γT3); two at cM interval 95-96 (associated with βT and δT); and five at cM interval 136-145 (associated with αT3, βT3, δT, δT3, γT, γT3, TT3, and TTC). There were no significant associations of any mapped SNPs with αT or total tocopherol (TTP). In region 7H-B the two significant markers were in linkage disequilibrium, but they were significant for different tocochromanol forms. In region 7H-C, some adjacent significant markers were in linkage disequilibrium. However, the middle significant marker was not in linkage disequilibrium with either the first or last significant marker, providing some evidence for multiple QTL in this region.

Candidate genes
None of the significant markers were within an Expressed Sequence Tag (EST) that was annotated within HarvEST as being potentially related to tocochromanol biosynthesis. Of the thirteen significant markers, seven were within 2 cM of at least one sequence homologous with genes known to be associated with the tocochromanol biosynthesis pathway in barley and/or other plants (Table 3). On 1H, candidate gene MLOC_16149 was identified at cM 108, 2 cM from marker 11_20021. This gene has sequence homology to VTE2 (as described by Collakova and DellaPenna [37]), homogentisate geranylgeranyltransferase (as described by Cahoon et al. [14]), and multiple enzymes upstream of geranylgeranyl diphosphate biosynthesis (a precursor to all tocochromanols; Cahoon et al. [14]), including farnesyl diphosphate synthase (as described by Matsushita et al. [38]) and homogentisate farnesyltransferase (as described by Sadre et al. [39]). No candidate genes were identified within 2 cM of marker 11_10586. On 6H, three candidate genes were identified within 2 cM of marker 12_30802: MLOC_72891, MLOC_44750, and MLOC_66290. Each of these candidate genes has sequence homology to multiple enzymes upstream of geranylgeranyl diphosphate biosynthesis. On 6H at cM 71, the candidate gene MLOC_13082, with sequence homology to enzyme VTE4 (as described by Shintani and DellaPenna [40]), was 0 cM from marker 12_30637. On 7H, no candidate genes were identified within 2 cM of markers 12_30296, 11_21201, or 11_20311. Two candidate genes were identified within 2 cM of the group of markers in cM interval 136-150: MLOC_12567 encoding HGGT, and MLOC_37476 with sequence homology to VTE2 and multiple enzymes upstream of geranylgeranyl diphosphate biosynthesis.
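The 2 cM proximity criterion used above can be expressed as a small lookup. This is a sketch under stated assumptions: the function name and data layout are hypothetical, the positions of MLOC_16149 (1H, cM 108) and MLOC_13082 (6H, cM 71) come from the text, and the cM 110 position of marker 11_20021 is inferred from the stated 2 cM distance.

```python
from typing import Dict, List, Tuple

Locus = Tuple[str, float]  # (chromosome, cM position)


def genes_near_marker(
    marker: Locus, genes: Dict[str, Locus], window_cm: float = 2.0
) -> List[str]:
    """Return candidate genes on the marker's chromosome within window_cm."""
    chrom, pos = marker
    return sorted(
        name
        for name, (g_chrom, g_pos) in genes.items()
        if g_chrom == chrom and abs(g_pos - pos) <= window_cm
    )


candidates = {
    "MLOC_16149": ("1H", 108.0),
    "MLOC_13082": ("6H", 71.0),
}

near_1h_a = genes_near_marker(("1H", 110.0), candidates)  # marker 11_20021 (inferred position)
near_6h_b = genes_near_marker(("6H", 71.0), candidates)   # marker 12_30637
near_1h_b = genes_near_marker(("1H", 128.0), candidates)  # second 1H region
```

The comparison is inclusive because the text counts a gene exactly 2 cM from a marker as falling within the window.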

Allele effects and distributions
As shown in Table 4, the best linear unbiased estimators (BLUEs) for allele effects reveal substantial phenotypic variation associated with allele substitutions at the significant SNPs. Both alleles at each significant SNP were present in most breeding programs (S2 Table). The accessions from the USDA-ARS-ID program and UT had the highest levels of allelic diversity, never having less than 9% and 6% of the minor allele, respectively.
The highest αT concentration observed was 1.72 times higher than the average αT concentration, and the highest TTC concentration observed was 1.49 times higher than the average TTC concentration. Differences were observed in tocochromanol concentrations over the two years of this study, although this was confounded by the different germplasm arrays grown each year (Table 1; Fig 1; S1 and S2 Figs). While the 2006 growing season in Bozeman, Montana was relatively typical, the 2007 growing season was characterized by extremes, with 18.5 cm of snowfall recorded on May 29th, followed by a July that was possibly the hottest on record and had little precipitation (National Weather Service; http://nws.noaa.gov/climate/local_data.php?wfo=tfx; verified 3 November 2014). Irrigation in the 2006 growing season, but not in the 2007 growing season, resulted in differential moisture stress in the two years. Oliver et al. [15] also reported that moisture availability and temperature are important environmental factors associated with tocochromanol concentrations. Future experiments, in which barley varieties are replicated and a range of environmental factors are controlled, would help to better understand the effect of specific environmental factors on tocochromanol concentrations.
Table 3. Significant SNPs associated with tocochromanols, and annotated sequences known or predicted to be associated with the tocochromanol biosynthesis pathway that occurred within 2 cM of a significant marker.
Given the observed values for tocochromanol forms, a key question is whether barley can be a viable source of these compounds for human nutrition. The answer to this question is complicated by the fact that a Recommended Dietary Allowance (RDA) has only been established for αT, which is 15 mg/day for adults (National Institutes of Health Office of Dietary Supplements; http://ods.od.nih.gov/factsheets/VitaminE-HealthProfessional/; verified 29 October 2014). Using the accession with the highest αT concentration, 06MT-55 with 23.9 mg/kg αT, a healthy adult would need to consume approximately 628 g of barley (dry weight) per day to meet their RDA. Therefore, it is not realistic to imagine barley as a sole or principal source of αT in human diets: other plant products are superior sources of αT. Sunflower seeds, for example, contain approximately 351.7 mg/kg of αT (USDA-ARS National Nutrient Database; http://ndb.nal.usda.gov/ndb/foods/show/3658; verified 29 October 2014).
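The intake estimate above follows directly from dividing the RDA by the αT concentration. The short sketch below reproduces that arithmetic; the function name is illustrative, and the sunflower figure is a comparison computed here from the concentration cited above, not a value reported in the text.

```python
def grams_needed(rda_mg_per_day: float, conc_mg_per_kg: float) -> float:
    """Grams of a food per day that supply the RDA at a given concentration."""
    kg_per_day = rda_mg_per_day / conc_mg_per_kg
    return kg_per_day * 1000.0


# RDA for alpha-tocopherol: 15 mg/day; best barley accession: 23.9 mg/kg.
barley_g = grams_needed(15.0, 23.9)
# Sunflower seeds at 351.7 mg/kg, for comparison.
sunflower_g = grams_needed(15.0, 351.7)

print(f"barley: {barley_g:.0f} g/day, sunflower seeds: {sunflower_g:.0f} g/day")
```

Under these figures, the same RDA would be met by roughly 43 g of sunflower seeds, compared with about 628 g of the highest-αT barley accession.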

Tocochromanols other than αT are also reputed to provide nutritional benefits. γT, for example, is superior to αT in detoxifying reactive nitrogen species [42], an important consideration in chronic inflammation, and for smokers or individuals subject to air pollution. Furthermore, in cellular assays, γT was shown to provide neuroprotective effects at concentrations 4- to 10-fold lower than typically found in human plasma [43]. Diets rich in tocotrienols have been shown to reduce cholesterolgenesis in chicks [44] and in humans [45]. Tocopherols do not exhibit this property.
Whole grain barley, however, brings valuable components to human diets in addition to tocochromanols, including β-glucan [46,47]. Therefore, the focus for breeding food barley should more realistically be on the total nutritional composition, and not exclusively on tocochromanol content.
An important question to address in this context is the role of tocochromanols in barley growth, development, and reproductive fitness. Studies with Arabidopsis mutants deficient in tocopherol biosynthesis clearly illustrate a role for these metabolites in cold tolerance [48] and these mutants were employed to demonstrate a critical role for tocopherols in germination and seed storage [49]. In monocots, a correlation between γT concentration and enhanced germination and root growth has been shown in barley [50], and in emmer, seeds collected from a location with higher abiotic stresses had higher tocochromanol concentrations than those collected from locations with lower abiotic stresses [51]. Given the available phenotype and genotype data, and reserve seed, the GWAS panel used for this study could be used, in future analyses and experiments, to compare the genome locations of tocochromanol genes/QTLs in relationship to genes/QTLs related to productivity and stress resistance. Coincident genes/QTLs would be justification to proceed with testing hypotheses regarding pleiotropic effects of tocochromanol concentration on plant health and productivity.

QTL and candidate genes
Thirteen significant SNP-tocochromanol trait associations were detected on three chromosomes (1H, 6H, and 7H) (Table 1). The significant SNPs within each linkage group can be further subdivided into seven discrete genomic regions, based on the large linkage distances separating groups of significant markers (Table 3) and an analysis of linkage disequilibrium (S1 File). In five of these regions, there are candidate genes (1H-A, 1H-B, 6H-A, 7H-B and 7H-C) based on annotation: HGGT, VTE1, VTE2, VTE4, and a sequence with similarity to a gene encoding geranylgeranyl diphosphate synthase. In the remaining two regions (6H-B and 7H-A) there are significant QTL-marker associations without candidate genes. Possible explanations include (i) there are structural genes in these regions involved in tocochromanol synthesis but they are as yet undetected due to gaps in the genome sequence, and/or (ii) the presence of regulatory elements with functions in the tocochromanol pathway. Alignment of the barley consensus map and the map used by Oliver et al. [15] reveals some, but not complete, overlap with significant markers we detected (S2 File). That our findings are consistent with those of Oliver et al. [15] confirms that GWAS and bi-parental QTL mapping can be effective in the dissection of complex traits, given adequate population sizes, robust phenotyping and reasonable marker density. An advantage of GWAS is that a panel can be assembled immediately, whereas with a bi-parental population, for self-pollinated crops, it can take several years after an initial cross to achieve the amount of seed and the desired level of homogeneity before phenotypic evaluations can begin [52]. GWAS can provide fundamental insights into the genetic basis of economically important traits, as evidenced by recent reports in a range of crop plants, including barley [19-21].
By providing estimates of the number and genomic context of sequences affecting target traits in relevant germplasm, GWAS can also provide targets for Marker Assisted Selection (MAS) that will increase the efficiency of developing superior varieties. The panel used in this GWAS could be used to quickly generate near-isogenic lines for QTL by taking advantage of heterogeneous inbred families [53]. The Barley CAP germplasm used in this study was previously used to identify QTL for disease resistance [54], and subsequent lines from the panel that were heterozygous at QTL were used to develop sets of near-isogenic lines to validate those QTL [55]. Near-isogenic line pairs developed from these lines could be used to study environmental effects, refine map positions, identify multiple alleles at QTL, or investigate QTL interaction with genetic background.
In terms of future breeding applications, two accessions from AB (2AB04-01084-6 and 2AB04-01084-15) have the favorable alleles at each of the 13 SNPs significantly associated with one or more tocochromanol forms and/or fractions. These two accessions may be a valuable resource for developing varieties with enhanced tocochromanol concentrations. The TTC concentrations of these lines (81.79 mg/kg and 80.73 mg/kg, respectively) are higher than that of the average accession in 2007, the year that these lines were grown, but lower than that of the highest accession grown in that year (6B05-0788, from BA), which had a TTC of 90.02 mg/kg. The αT concentrations for these accessions, 17.51 mg/kg and 14.29 mg/kg, were also higher than that of the average accession in 2007, but lower than that of the highest accession grown in that year (MT050165 from MT), which had an αT concentration of 23.88 mg/kg. In this set of germplasm, no accession had all negative alleles at the 13 significant SNPs.

Conclusions
This study demonstrates that GWAS can detect genetic determinants of complex traits in a panel of elite germplasm. This approach to QTL and candidate gene identification can complement the use of bi-parental mapping populations specifically tailored to each trait. A total of 13 marker-trait associations for tocochromanol concentrations in barley were identified. The significant SNPs were found in seven genomic regions on three chromosomes. Five of the seven associations were with markers near genes associated with the tocochromanol pathway. The availability of the draft of the barley genome sequence, published by the International Barley Genome Sequencing Consortium [28], enabled the alignment of QTLs with candidate genes. This information will be useful in future studies directed at understanding the role(s) of tocochromanols in barley growth, development, stress resistance and productivity. It will also be useful in breeding food barley varieties that can supply moderate amounts of tocochromanols in human diets within a framework of whole grain nutrition.

Disclaimer
Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. The USDA is an equal opportunity provider and employer.