Identification of QTLs Controlling Genes for Se Uptake in Lentil Seeds

Lentil (Lens culinaris Medik.) is an excellent source of protein and carbohydrates and is also rich in essential trace elements for the human diet. Selenium (Se) is an essential micronutrient for human health and nutrition, providing protection against several diseases and regulating important biological systems. Dietary intake of 55 μg of Se per day is recommended for adults, with inadequate Se intake causing significant health problems. The objective of this study was to identify and map quantitative trait loci (QTL) of genes controlling Se accumulation in lentil seeds using a population of 96 recombinant inbred lines (RILs) developed from the cross “PI 320937” × “Eston” grown in three different environments for two years (2012 and 2013). Se concentration in seed varied between 119 and 883 μg/kg. A linkage map consisting of 1,784 markers (4 SSRs, and 1,780 SNPs) was developed. The map spanned a total length of 4,060.6 cM, consisting of 7 linkage groups (LGs) with an average distance of 2.3 cM between adjacent markers. Four QTL regions and 36 putative QTL markers, with LOD scores ranging from 3.00 to 4.97, distributed across two linkage groups (LG2 and LG5) were associated with seed Se concentration, explaining 6.3–16.9% of the phenotypic variation.


Introduction
Lentils are grown and consumed in many developing countries and are an important dietary staple because of their high protein content and nutrient density [1]. In particular, lentil is an excellent source of macro- and micronutrients and trace elements [2], including Se [3]. Se is an essential micronutrient for organisms and has beneficial effects on animal and human health [4], although toxicity can occur with acute or chronic ingestion of excess Se [5]. Se is taken up from the soil by plants in two forms: inorganic (selenate and selenite) and organic (selenomethionine and selenocysteine) [6]. Both inorganic and organic forms can be good dietary sources of Se for humans [6]. The recommended dietary allowance (RDA) of 55-200 μg of Se per day for adults is considered essential for healthy living [7]. The recommended daily intake of Se is 55 μg in the US [7] and 60-75 μg in the UK [8], whereas a single portion of cooked rice provides only 12 μg of Se [9]. Low levels of dietary Se have been linked to a number of diseases. Se deficiency may lead to serious muscle weakness [10], cardiovascular disruption [11], delayed child development [12], reduced eye health [13], early aging [14], nervous system disorders, and mental retardation [15].
Se is incorporated into selenoproteins (SPs) as selenocysteine (SeCys) residues in polypeptides [16]. Over 25 SPs have been identified in mammals [17]. SPs have antioxidant properties that protect cells from free radicals [18], strengthen the immune system [19], contribute to proper thyroid gland function, and delay the aging process [20]. SPs help maintain cardiovascular health by stimulating tissue flexibility and supporting heart cells [21,22], and also decrease the effects of toxic substances [23]. Se also plays an important role in the prevention of various cancers, including prostate [24], lung [25], colorectal [26], bladder [27], and various gastrointestinal cancers [28]. Therefore, developing micronutrient-enriched crop varieties using genetic and genomic tools is considered a promising and cost-effective approach for controlling malnutrition worldwide [29], and increasing Se concentration in lentil seed could provide an additional dietary source of Se. One way to increase the level of Se in seed is biofortification through conventional breeding [30]. The aim of biofortification is to enrich the micronutrient concentration in the edible parts of plants using biotechnological approaches in combination with plant breeding. An effective biofortification strategy can lower the cost of reducing micronutrient deficiencies in rural populations in developing countries. Lentil is a key crop in biofortification efforts because it also fixes atmospheric nitrogen and plays a role in the management of soil fertility [31].
Determining the genetic factors controlling micronutrient concentration is useful for map-based cloning and marker-assisted selection (MAS). The application of molecular markers for QTL analysis has provided an effective approach to determine these genetic factors. RILs or doubled haploid populations have been used to identify QTLs associated with micronutrient concentration [32,33,34]. Therefore, identification of QTL alleles associated with micronutrient efficiency and discovery of tightly linked molecular markers for MAS are very important for lentil breeding.
A biofortification approach similar to that proposed for other crops can be used to develop lentil varieties that are enriched with micronutrients [8,35]. To date, a number of QTL studies have been conducted on micronutrient accumulation in seeds of several crops. For example, Waters et al. [36] identified several QTLs for accumulation of Ca, Cu, K, Mg, Mn, P, S, and Zn in seeds of Arabidopsis thaliana; Blair et al. [37] detected 26 QTLs for Fe and Zn concentration in seeds of common bean; Tiwari et al. [34] reported two QTLs for Fe and one QTL for Zn in wheat; Sankaran et al. [38] identified 46 QTLs for mineral concentration (Ca, Cu, Fe, K, Mg, Mn, P, and Zn) in Medicago truncatula; Tezuka et al. [39] identified three QTLs for Cd accumulation in rice; Norton et al. [40] detected six QTLs for Se concentration in rice; and Pu et al. [29] identified five QTLs associated with Se concentration using composite interval mapping in two different RIL populations of wheat. However, no studies have been performed to identify QTL controlling Se uptake in lentil seed. Thus, the aims of this study were to (i) determine phenotypic variation within a RIL population and (ii) identify DNA markers linked to the gene(s) controlling Se uptake in lentil seed using a QTL mapping approach.

Soil analysis
Soil analyses were carried out in the Department of Plant and Soil Science at Ege University in Turkey. Soils were sampled from three experimental fields in Turkey (located at Izmir, Adana, and Sanli Urfa) to determine their structural and chemical properties. Analyses were conducted for soil pH [41], total soluble salt [42], texture [43], organic matter [41], and CaCO3 [44]. Macro- and micronutrient analyses of soil samples were carried out according to Bingham [45], Pratt [46], and Lindsay and Norvell [47]. Total Se concentration of the experimental soils was determined by inductively coupled plasma mass spectrometry (ICP-MS) and is reported as the mean of three replicates with standard error.

Plant material and DNA extraction
The experiment was carried out on a population of 96 RILs of LR-39, which was generated from the cross "PI 320937" (P1) × "Eston" (P2) developed at the University of Saskatchewan, Canada. The population was generated by advancing F1 plants from the simple cross to the F2 generation and then through single-seed descent from the F2 to the F7 generation. The LR-39 population was planted with three replications at each of the three field sites during the 2012 and 2013 growing seasons.
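As a rough check on how inbred F7 lines are expected to be, the sketch below computes the expected residual heterozygosity under strict selfing. It assumes every segregating locus is heterozygous in the F1 and that no selection occurs; the function name is illustrative and not part of the original study.

```python
def expected_heterozygosity(generation: int) -> float:
    """Expected fraction of heterozygous loci in generation F_n under selfing.

    Assumption (not from the study): the F1 is heterozygous at every
    segregating locus, and each round of selfing halves heterozygosity.
    """
    if generation < 1:
        raise ValueError("generation must be >= 1 (F1 or later)")
    return 0.5 ** (generation - 1)


if __name__ == "__main__":
    for gen in (2, 7):
        h = expected_heterozygosity(gen)
        print(f"F{gen}: expected heterozygosity = {h:.4%}")
```

Under these assumptions, roughly 1.6% of segregating loci would still be heterozygous at F7, which is why scoring RILs as two homozygous classes is a reasonable approximation.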
For DNA extraction, young leaves from 4-6 week old seedlings of the 96 RILs and parents were collected in aluminum foil and placed in liquid nitrogen. Leaf tissue from each individual was ground in liquid nitrogen with a tissue lyser. Then, DNA was extracted from 100 mg of fresh leaf tissue using a DNA isolation kit (Plant Genomic DNA Purification Kit Cat no: K0792 Fermentas, Germany). The purified DNA was quantified using a Qubit 2.0 Fluorometer (Invitrogen, CA, USA). The isolated DNA was kept at -86°C in a freezer until use. The stock DNA was diluted for each polymerase chain reaction (PCR) protocol (AFLP and SSR) as described below.

Selenium analysis
Sample preparation and analyses were performed according to previously described procedures [48]. A total of 1 g of each ground sample was acid digested with an HNO3-HClO4 (4:1) mixture using a heating protocol, increasing from 100 to 220°C over 2 h on a digestion block. The final solutions were diluted with ultrapure water to a final volume of 50 mL. Se concentration was determined by ICP-MS with an Agilent 7700 instrument [49]. A commercially available Se standard solution (Cat no: 1197960100, traceable to SRM from NIST, SeO2 in HNO3, ρ = 1000 mg/L Se, Certipur®, Merck, Darmstadt, Germany) was used to prepare calibration curves. Details of this procedure are described by Gupta [50]. Se analysis for each RIL was replicated three times.
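The digestion figures above (1 g of sample, 50 mL final volume) fix the conversion from the concentration measured in the digest to the concentration in seed. The following is a minimal sketch of that arithmetic; the function name and the optional blank correction are hypothetical and are not described in the source.

```python
def seed_se_ug_per_kg(
    solution_ug_per_l: float,
    final_volume_ml: float = 50.0,   # final digest volume stated in the text
    sample_mass_g: float = 1.0,      # digested sample mass stated in the text
    blank_ug_per_l: float = 0.0,     # hypothetical reagent-blank correction
) -> float:
    """Convert an ICP-MS reading of the digest (ug/L) to seed Se (ug/kg)."""
    if final_volume_ml <= 0 or sample_mass_g <= 0:
        raise ValueError("volume and mass must be positive")
    # Mass of Se in the whole digest, in micrograms.
    se_ug = (solution_ug_per_l - blank_ug_per_l) * (final_volume_ml / 1000.0)
    # Divide by the sample mass expressed in kilograms.
    return se_ug / (sample_mass_g / 1000.0)


if __name__ == "__main__":
    # Illustrative reading only, not a measured value from the study.
    print(round(seed_se_ug_per_kg(10.0), 1))
```

With these defaults each 1 μg/L in the digest corresponds to 50 μg/kg in seed, so the seed concentrations reported in this study correspond to digest concentrations of only a few μg/L up to roughly 20 μg/L.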

Variance analysis
Analysis of variance (ANOVA) was carried out using TOTEMSTAT software [51] for Se concentrations of the RILs grown in different locations and years. Significance was assessed at the 0.001 and 0.05 probability levels.

DNA analysis with genetic markers
SSR analysis. For SSR analysis, each 20 μL PCR reaction contained 40 ng of genomic DNA, 10 pmol of each forward and reverse primer, 2.5 mM dNTP, 1X PCR buffer (75 mM Tris-HCl, pH 8.8, 20 mM (NH4)2SO4, 25 mM MgCl2, 0.1% Tween), and 0.6 U Taq DNA polymerase. PCR was carried out in a thermocycler (MJ Research™, Nevada, USA) with an initial denaturation at 94°C for 5 min, followed by 30 cycles of 94°C for 30 s, 55°C for 35 s, and 72°C for 60 s, followed by a final extension at 72°C for 5 min. The PCR products were resolved using 6% polyacrylamide gel electrophoresis on a LI-COR 4300 DNA analyzer (LI-COR, Lincoln, NE, USA). Forward primers were modified by adding an M13 tail (5′-TGTAAAACGACGGCCAGT-3′) and labelled with IRDye 700 or IRDye 800. The band sizes were analyzed using SagaGT software (LI-COR, Lincoln, NE, USA).
SNP analysis. SNP analysis was performed using a genotyping by sequencing (GBS) approach. GBS analysis followed the procedure described by Raman et al. [52]. Briefly, two different restriction enzymes were used to digest 100 ng of reference DNA. PstI and MseI enzymes were applied to the DNA to create overhangs for adapter ligation. A staggered, varying length barcode region, the Illumina flowcell attachment, and a sequencing primer sequence were included in the PstI-compatible adapter. The MseI-compatible overhang sequence was included in the reverse adapter along with the flowcell attachment sequence. The following PCR protocol was performed for 30 rounds to effectively amplify only the PstI/MseI fragments: 94°C for 1 min, followed by 29 cycles of 94°C for 20 s, ramp at 2.4°C/s to 58°C, hold at 58°C for 30 s, ramp at 2.4°C/s to 72°C, hold at 72°C for 45 s, extend for 7 min at 72°C, and then finally cool to 10°C. To sequence on the Illumina HiSeq 2000, the PCR products were multiplexed in equimolar amounts and a c-Bot (Illumina) bridge PCR was applied. A single read was run for 77 cycles. All amplicons were sequenced in a single lane and analyzed by proprietary DArT analytical pipelines. In the primary pipeline, the FASTQ files were first processed to filter out poor quality sequences. A Phred pass score of 30 was chosen for the barcode region, a more stringent selection than for the rest of the sequence. Thus, in the barcode split step, sequences were assigned to specific samples very reliably. Approximately 2,000,000 sequences per barcode/sample were identified and used in marker calling. In the secondary pipeline, these files were processed with DArT P/L's proprietary SNP and Presence/Absence Markers (PAM) calling algorithms (DArTsoftseq). All polymorphic sequences for the DArT-Seq markers were derived from the parental lines ("PI 320937" × "Eston") of the LR-39 RIL population. SNP data are provided as S1 Supporting Information.
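To make the Phred 30 barcode criterion concrete, the sketch below converts Phred scores to base-call error probabilities and applies a threshold to a barcode region. The DArT pipeline is proprietary, so this is not its implementation: the Phred+33 quality encoding and the rule that every barcode base must reach the threshold are assumptions made here for illustration.

```python
def phred_to_error_probability(q: float) -> float:
    """Base-call error probability for Phred score Q: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-q / 10.0)


def decode_phred33(quality_string: str) -> list[int]:
    """Decode a FASTQ quality string, assuming Phred+33 (Sanger) encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def passes_barcode_filter(qualities: list[int], threshold: int = 30) -> bool:
    """Assumed rule: every base in the barcode must score >= threshold."""
    return all(q >= threshold for q in qualities)


if __name__ == "__main__":
    barcode_quals = decode_phred33("IIII?")   # illustrative quality string
    print(barcode_quals, passes_barcode_filter(barcode_quals))
    print(phred_to_error_probability(30))
```

A Phred score of 30 corresponds to an expected error rate of one miscalled base in 1,000, which is why a strict threshold on the barcode makes assignment of reads to samples reliable.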

Genetic linkage mapping and QTL analysis
Polymorphic bands were scored for each RIL and recorded as either type 'A' ("PI 320937") or 'B' ("Eston") according to the presence or absence of bands, and the scores were later combined to construct a genetic linkage map. All genotypic marker data were tested for segregation distortion using JoinMap 4.0 linkage map software [53]. Distorted markers were eliminated and not used for linkage mapping. Linkage analysis was performed using the maximum likelihood mapping algorithm with the RIL population type, the Kosambi mapping function, a logarithm of the odds (LOD) range of 2.0-10.0, and a recombination fraction of 0.35 as mapping parameters. The genotypic and phenotypic data sets were imported into MapQTL software for QTL analyses [54]. QTL were detected by Simple Interval Mapping (SIM) [55]. The significance thresholds of the LOD score for QTL detection were calculated from 1,000 permutations at significance levels of 0.05 and 0.01 [53]. The proportion of observed variation explained by a particular QTL was estimated by the coefficient of determination (R²) using maximum likelihood for SIM.
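The Kosambi mapping function used above converts a recombination fraction r into a map distance d = 25 ln((1 + 2r) / (1 - 2r)) in cM. A minimal sketch of the function and its inverse follows; the function names are illustrative and this is not JoinMap's internal code.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for recombination fraction r (Kosambi function)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    if d_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(d_cm / 50.0)


if __name__ == "__main__":
    for r in (0.01, 0.10, 0.35):
        print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small r the distance in cM is close to 100r, while near the recombination fraction limit of 0.35 used here the two diverge markedly (about 43 cM).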

Total Se concentration and phenotypic variation in seeds of lentil genotypes
Mean Se concentrations for the RIL parents, "Eston" and "PI 320937", were 166 and 858 μg/kg, respectively, across the two study years and three growing locations (Table 1). Mean seed Se concentration within the RIL population ranged from 92 to 908 μg/kg (mean 421 μg/kg), and was very consistent across the three growing locations and over both study years (Table 1). Se concentrations in lentil seed from the LR-39 RIL population were normally distributed (Fig 1A and 1B). Soil analyses for Adana, Izmir, and Sanli Urfa indicate high levels of Se at Adana (89 μg/kg) and Sanli Urfa (53 μg/kg) and adequate levels at Izmir (29 μg/kg) (Table 1).
The ANOVA for Se concentration for all environments and years (Table 2) shows that genotype, location, year, and the interactions between year × genotype and location × genotype were highly significant at the 0.001 probability level. The interaction between year × location was not significant. Significant variation in Se concentration was detected among the RILs at the three locations and in the two years at the 0.001 probability level.

SSR and SNP marker analysis
To determine whether the DNA markers used in this study deviated from the expected Mendelian ratio, segregation distortion analysis was carried out using JoinMap 4.0 software (Table 3). A total of 3,030 markers (30 SSRs and 3,000 SNPs) were analyzed for genome mapping. Of these, 1,784 markers (4 SSRs and 1,780 SNPs) were mapped in the genome (Fig 2), whereas 1,076 markers (16 SSRs and 1,060 SNPs) remained unlinked. A total of 170 markers (5.6% of the polymorphic markers) showed segregation distortion.
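In a RIL population the two parental alleles are expected in a 1:1 ratio at each marker, so segregation distortion can be screened with a one-degree-of-freedom chi-square test. The sketch below illustrates that test; the exact settings used in JoinMap are not stated in the text, and residual heterozygotes and missing calls are ignored here for simplicity.

```python
import math


def segregation_chi_square(count_a: int, count_b: int) -> tuple[float, float]:
    """Chi-square test of a 1:1 ratio of 'A' and 'B' genotype calls.

    Returns (chi2, p_value). With one degree of freedom the upper-tail
    probability of chi2 equals erfc(sqrt(chi2 / 2)).
    """
    total = count_a + count_b
    if total <= 0:
        raise ValueError("at least one genotype call is required")
    chi2 = (count_a - count_b) ** 2 / total
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Illustrative counts for 96 lines, not data from the study.
    for a, b in ((50, 46), (60, 36)):
        chi2, p = segregation_chi_square(a, b)
        flag = "distorted" if p < 0.05 else "fits 1:1"
        print(f"A={a}, B={b}: chi2={chi2:.2f}, p={p:.4f} ({flag})")
```

Markers flagged in this way would be the ones excluded before map construction, as described in the methods.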

Linkage mapping results
The map spanned a total length of 4,060.6 cM and was composed of seven LGs with an average of 2.3 cM between adjacent markers (Fig 2). The longest LG was LG5 (1,783.4 cM) and the shortest was LG7 (111.7 cM). Average marker distance varied by LG; the smallest distance between markers was noted for LG3 (1.8 cM) and the greatest for LG4 (2.7 cM). SNPs were distributed throughout all LGs; however, SSRs were only mapped on LG5, which contained the most markers (782) with an average distance between markers of 2.3 cM. LG7 contained the fewest markers (42) with an average distance between markers of 2.6 cM (Table 4).
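The average marker spacing reported above can be checked from the map length and marker count. The sketch below shows two common conventions (dividing by the number of markers or by the number of intervals); which convention the mapping software reports is not stated in the text, so both are shown, and the function name is illustrative.

```python
def mean_marker_spacing(
    length_cm: float,
    n_markers: int,
    n_groups: int = 1,
    per_interval: bool = False,
) -> float:
    """Average spacing in cM.

    per_interval=False divides by the number of markers;
    per_interval=True divides by the number of intervals, i.e.
    markers minus linkage groups (each group has one fewer interval).
    """
    divisor = n_markers - n_groups if per_interval else n_markers
    if divisor <= 0:
        raise ValueError("not enough markers to compute a spacing")
    return length_cm / divisor


if __name__ == "__main__":
    # Values taken from the text: whole map and LG5.
    print(round(mean_marker_spacing(4060.6, 1784), 1))
    print(round(mean_marker_spacing(4060.6, 1784, n_groups=7, per_interval=True), 1))
    print(round(mean_marker_spacing(1783.4, 782), 1))
```

For the whole map and for LG5, both conventions round to the 2.3 cM reported above.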

QTL analysis results
A total of four significant QTL regions for Se concentration in lentil seeds were detected in the LR-39 population using SIM (Fig 2). LG2 contained only one QTL region whereas LG5 contained three QTL regions for Se concentration. The QTL regions for Se located on LG2 and LG5 showed clusters of SNP markers; however, no SSR markers were located in these regions. A total of 36 putative QTL markers were detected. These markers were statistically significant in only one year and/or in two locations. Because their LOD scores were not stable (some were below LOD 3 in some years and in some locations), they can be considered putative QTL. Four QTL regions and 36 putative QTL markers, with LOD scores ranging from 3.00 to 4.97, were distributed across LG2 and LG5 and were associated with seed Se concentration, explaining 6.3-16.9% of the phenotypic variation (Table 5).
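The LOD thresholds against which these QTL were judged come from permutation testing. The sketch below illustrates the idea with a simplified single-marker LOD score rather than the interval mapping implemented in MapQTL, so it is a stand-in technique for illustration only; all function names and the toy data are hypothetical.

```python
import math
import random


def single_marker_lod(genotypes: list[str], phenotypes: list[float]) -> float:
    """LOD comparing a two-mean model (by marker class) with a one-mean model."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss1 = 0.0
    for allele in set(genotypes):
        ys = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        class_mean = sum(ys) / len(ys)
        rss1 += sum((y - class_mean) ** 2 for y in ys)
    if rss0 <= 0.0:
        return 0.0                      # no phenotypic variation at all
    if rss1 <= 0.0:
        return float("inf")             # marker explains everything
    return (n / 2.0) * math.log10(rss0 / rss1)


def permutation_threshold(
    markers: list[list[str]],
    phenotypes: list[float],
    n_perm: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> float:
    """Genome-wide LOD threshold: the (1 - alpha) quantile of the maximum
    LOD obtained after shuffling phenotypes relative to genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(single_marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return max_lods[index]


if __name__ == "__main__":
    # Toy data: 20 lines, 2 markers; not data from the study.
    toy_markers = [["A", "B"] * 10, ["A"] * 10 + ["B"] * 10]
    toy_pheno = [float(i) for i in range(20)]
    print(permutation_threshold(toy_markers, toy_pheno, n_perm=200))
```

Shuffling phenotypes breaks any real marker-trait association while preserving the marker correlation structure, so the resulting threshold controls the genome-wide false-positive rate; markers that exceed it in only some environments are the ones treated here as putative.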

Discussion
Breeding for improved Se uptake by lentil may be an effective and sustainable strategy to address micronutrient malnutrition in the long term [56]. QTL studies have been carried out on a number of staple crops to determine micronutrient accumulation in seeds (e.g., [36,40]). The present work aimed to add to the literature in this regard for Se in lentil, and further bridge the gap between agriculture and human health.
In the current study, mean Se concentrations of the LR-39 parents ("Eston" and "PI 320937") were 166 and 858 μg/kg, respectively, while the range for the RIL population was 119 to 883 μg/kg. Rahman et al. [56] reported similar results for lentil seeds grown in Bangladesh, with total Se concentrations ranging from 74 to 965 μg/kg (mean 312 μg/kg). Thavarajah et al. [3] reported Se concentrations of 425-673 μg/kg for lentil grown in Canada, which also fall within the range of results obtained here. Se concentrations for lentil seed determined in the current study are higher than values determined for other crops, including wheat [4], rice [40], broccoli [57], and soybean [58].
The significant effects noted here for genotype, years, location, and interactions between year × genotype and location × genotype agree with results reported by Thavarajah et al. [16]. Rahman et al. [56] also report that location × genotype effects were significant for seed Se in lentil; however, year × location was not significant in the present study. The lack of significance in the year × location interaction for seed Se can be attributed to the differences between genotypes being consistent across the three locations for two years.
In the current study, a total of 170 markers (5.6% of the polymorphic markers) showed segregation distortion. Similar levels of segregation distortion have been detected in several lentil mapping studies [31,59,60,61]. Segregation distortion can be explained by many factors, such as small chromosomal rearrangements with little impact on fertility, chromosome pairing during meiosis [62], abortion of the male or female gametes or zygotes, or linkage to a lethal allele in gametes [63,64]. Chromosomal rearrangements may also lead to segregation distortion [65].
The current map covers 4,060.6 cM and is composed of seven linkage groups, which corresponds to the haploid chromosome number of lentil (n = 7; 2n = 2x = 14) [59,66,67]. Previous studies with lentil using a low number of markers obtained several LGs [31,59,60,61,67]; however, genome coverage in this study is far higher than in many earlier efforts (e.g., 751 cM [59] [66]. This could be due to variation in the physical length of lentil chromosomes. Supporting our results, several karyotype analyses of lentil showed that the physical length of lentil chromosomes varied between 8.59 μm (chromosome (chr) 1) and 5. Among the previously published reports, Sharpe et al. [66] used the highest number of markers to date for lentil linkage mapping; they identified a large collection of SNPs and subsequently developed a genotyping platform to establish the first comprehensive genetic map of the L. culinaris genome. In an LR-18 population of lentil, they mapped six SSRs and 454 SNPs, covering the genome in 834.7 cM. The lentil genome is ~4 Gb in size [31], and therefore our genome coverage is closer to the size of the lentil genome than maps developed in other studies. The average distance between markers in the current map is 2.3 cM, which is shorter than reported for previous maps (6.87 cM [60] [66].
A number of QTL studies have been conducted to date on micronutrient accumulation in seeds of several crops [29,32,33,34,36,40,58]. In the current study, four QTL regions and 36 putative QTL markers controlling Se uptake in lentil seed were identified. These QTL, with LOD scores ranging from 3.00 to 4.97, were distributed across two LGs (LG2 and LG5) and explained 6.3-16.9% of the phenotypic variation in seed Se concentration. These values for explained phenotypic variation are higher than those in Norton et al.'s study of rice [40] and Ramamurty et al.'s study of soybean [58]. The QTL results are similar to those of studies in other crops, in which Norton et al. [40] identified six QTLs for Se concentration in rice and Pu et al. [29] identified four QTL regions associated with Se concentration in wheat. In Pu et al.'s study [29], one QTL region for Se concentration located between 129.2 and 161.7 cM explained 23.4% of the phenotypic variance and clustered with three markers covering 32.5 cM. Three QTL regions located between 25.8 and 35.9 cM, 121.7 and 141.2 cM, and 159.2 and 165.9 cM on different chromosomes explained 6.4, 10.1, and 28.5% of the phenotypic variance, respectively. Overall, these studies show that a high level of genetic variation and the use of more individuals can help resolve QTL regions and also allow for the characterization of regions with high LOD scores.

Conclusions
Large phenotypic variation in seed Se accumulation was observed among the lentil RILs. However, the use of molecular markers for developing maps and other breeding applications has been limited in lentil by the low genetic variation at the species level and the lack of available marker resources. This study addressed these shortcomings by using a large number of SNP markers and reports the first QTL analysis results for Se accumulation in lentil seed. Data from the current study may help to advance molecular breeding techniques that would allow breeders to develop varieties with higher seed Se concentrations. The use of GBS technology in lentil showed that this marker system is promising for mapping the lentil genome at a higher density and that the resulting markers can be used for QTL analysis. Construction of a linkage map of lentil using a RIL population with variation for Se concentration in seeds enabled detection of potential QTLs controlling Se accumulation. This study also showed that Se concentration in seed was quantitatively inherited. Because RIL LR-39-98 had the highest Se concentration, it could be used as a donor parent in breeding programs to develop Se-biofortified varieties.