Association mapping unveils favorable alleles for grain iron and zinc concentrations in lentil (Lens culinaris subsp. culinaris)

Lentil is a major cool-season grain legume grown in South Asia, West Asia, and North Africa. Populations in developing countries of these regions have micronutrient deficiencies; therefore, breeding programs should focus more on improving the micronutrient content of food. In the present study, a set of 96 diverse germplasm lines were evaluated at three different locations in India to examine the variation in iron (Fe) and zinc (Zn) concentration and identify simple sequence repeat (SSR) markers that associate with the genetic variation. The genetic variation among genotypes of the association mapping (AM) panel was characterized using a genetic distance-based and a general model-based clustering method. The model-based analysis identified six subpopulations, which satisfactorily explained the genetic structure of the AM panel. AM analysis identified three SSRs (PBALC 13, PBALC 206, and GLLC 563) associated with grain Fe concentration explaining 9% to 11% of phenotypic variation and four SSRs (PBALC 353, SSR 317–1, PLC 62, and PBALC 217) were associated with grain Zn concentration explaining 14%, to 21% of phenotypic variation. These identified SSRs exhibited consistent performance across locations. These candidate SSRs can be used in marker-assisted genetic improvement for developing Fe and Zn fortified lentil varieties. Favorable alleles and promising genotypes identified in this study can be utilized for lentil biofortification.


Introduction
Lentil (Lens culinaris subsp. culinaris) is an annual, self-pollinating, herbaceous, cool-season grain legume originating from the Near East center of origin [1]. From the Mediterranean region, the crop spread to different parts of the world, which led to the evolution of six a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 geographical groups based on morphology, physiology, and functional variation [2]. Pilosaetype lentil with low biomass, small seeds, short rudimentary tendrils, pubescent foliage, and precocity in flowering and maturity is grown in South Asia. Several research groups have recommended the introgression of the Mediterranean lines [3][4][5][6] and wild species for increasing the genetic diversity and broadening the genetic base of the pilosae-type lentil. This crop is used mainly for food and fodder. Lentil grains are rich source of protein, fiber, minerals and carbohydrates playing a crucial role in reducing micronutrient deficiency in developing countries [7]. The crop is usually grown in rotation with cereals to break disease cycles and to fix atmospheric nitrogen [8].
The global lentil production during 2014 was 4.95 million tons [9]. Major lentil-growing regions included South Asia, West Asia, and North Africa. Micronutrient deficiency is prevalent in these regions because of high population densities and poor resources. India is a major producer and consumer of lentil. In India, lentil is grown in 1.89 million hectare with a production of 1.13 million ton [9]. This crop is mainly grown in the rain-fed areas of Central India and parts of Eastern India on residual moisture from the rainy season. The productivity of lentil is low because of short growing period and moisture stress during flowering and fruiting. Earlier study [5] has reported narrow genetic base of Indian lentil.
Micronutrients represent the essential vitamins and minerals required for normal cellular and molecular functions. Micronutrient deficiency is widespread in developing countries because of the poor quality of diet, which consists mainly of staple crops and very small amounts of meat, pulses, and fruits owing to low income. Micronutrient deficiency in food crops is primarily because of the low natural levels of available micronutrients in the soil owing to the decreased use of animal manure, crop residue and compost [10], which further results in low availability of micronutrients to plants [11]. Fe has been reported as heme and non heme forms. Non vegetarian food is source of heme iron while non heme iron is found in plants. In human beings Fe is essential constituent of many proteins and enzymes and is involved in cell growth and differentiation and oxygen transport. The heme Fe bioavailability is 12%-25% and non-heme Fe bioavailability is less than 5% [12]. Fe deficiency results in fatigue due to decreased Fe delivery as well as poor immunity and performance [13]. Fe deficiency affects more than 30% people worldwide [14]. Fe deficiency results in the disruption of the optimal function of both endocrine and immune systems [15]. It can cause anemia, which increases the risk of hemorrhage and bacterial infections during childbirth, thereby resulting in maternal deaths [16]. Babies may be born prematurely and may be liable to infections, learning disabilities, and delayed development [17]. Almost 40% pregnant women and more than 40% children under the age of 5 years in developing countries are anemic [18]. Almost 50% of these anemia cases are estimated to be due to Fe deficiency. Globally, half of the cultivated soil is zinc (Zn) deficient [19]. Zn is an essential component of proteinases, dehydrogenases, and peptidases in plants and is found in soil in the form of Zn 2+ and Zn (OH) + [20]. In humans, Zn is essential for the normal growth and development of fetuses and adolescent children [21]. Regular intake of Zn is required because the human body cannot store Zn [22]. Zn deficiency impairs immune functions and is associated with an increased risk of gastrointestinal infections [23,24]. It is also a contributing factor in child deaths due to diarrhea [25].
Breeding for quantitative or complex traits is a tedious and time-consuming process. Both genotypic and environmental factors play a role in the expression of a phenotype. The identification of molecular markers in the late 1980s has led to the development of tools for the directed manipulation of quantitative traits in field crops. Collard et al. [52] proposed a scheme for quantitative trait loci (QTL) mapping in crop plants for their further use in marker-assisted selection. Linkage analysis (using biparental populations) and association mapping (using diverse germplasm lines) are tools for dissecting quantitative traits. For linkage analysis from biparental crosses, several types of mapping populations (F2, RILs, NILs, double haploid, and backcross) can be developed. The progenies of the developed populations are screened for a trait of interest. Bulk segregant analysis, suggested by Michelmore et al. [53], is used to identify markers linked with the trait of interest. Although QTL mapping is a crucial tool for tagging and mapping of genes in crop plants, the method has limitations such as few meiotic events, difficulty in building a segregating population, reduced diversity derived from two parents, high cost involved, low resolution, and simultaneous evaluation of few alleles [54,55]. The association mapping (AM) approach, also known as linkage disequilibrium (LD) mapping [56,57], has been initially used by geneticists to map and clone many genes governing complex traits in humans [58][59][60]. However, in recent years, plant geneticists have also used it for identifying QTL in different crops [61][62][63][64][65][66][67]. AM harnesses genetic diversity of natural population, for searching genotypephenotype correlations among unrelated individuals. The main advantage of AM is likelihood for a higher resolution mapping because of the utilization of majority recombination events from a large number of meiosis throughout the germplasm development history [68]. The association mapping approach is based on the LD between the marker and the QTL, and it provides higher mapping resolution. The detection power of QTL depends on the LD between the QTL and the marker. A strong LD between the markers enhances the detection of QTL with a small effect [69]. The variance explained by QTL is underestimated if the LD between the marker and QTL is incomplete. In this study, we used association mapping for identifying QTL linked to grain Fe and Zn concentration in lentil. Inductively coupled plasma-mass spectrometry (ICP-MS) was used to estimate grain Fe and Zn concentration in a set of 96 Indian and Mediterranean lentil genotypes with 73 genomic and EST-SSRs. We then conducted an association mapping study by using a general linear model (GLM) to detect the significant loci responsible for natural variations of grain concentrations of Fe and Zn. We present results on significantly associated loci and favorable alleles. This approach can be used for identifying germplasm lines with desirable traits for accelerating lentil breeding through marker-assisted selection.

Plant material and field experiment
An AM panel of 96 diverse L. culinaris subspecies culinaris genotypes was used in this study ( Table 1). The panel consisted of advanced breeding lines developed at different lentil breeding centers of India, released lentil varieties, and exotic germplasm lines obtained from the International Center for Agricultural Research in the Dry Areas (ICARDA). The genotypes were  Table. In previous season mungbean was planted across all the three locations in these plots. No micronutrient spray or basal application was given. Only fertilizer applied to mungbean crop was 100 kg Di ammonium Phosphate / ha. To ensure proper homogenization, the soil was pulverized and thoroughly mixed and the field was leveled at each location. The experiment was planted in two replication. From each replication in each location ten samples were drawn to estimate soil Fe and Zn reveal significant variation ensuring the plot homogeneity. The panel was planted in a randomized complete block design with two replicates per entry (3 rows per replication) with a plant distance of 5 × 30 cm and a row length of 4 m. Standard agronomic practices were followed for crop cultivation. Fe and Zn concentration in the soil were estimated using a procedure proposed by Singh et al. [70].

Analysis of grain sample for Fe and Zn concentration
The grains of each genotype were harvested separately at maturity. Care was taken to prevent contamination from dust and metallic equipment. Plants were hand-threshed to avoid any type of contamination. Then the grains were placed in a clean plastic tray manually (using contaminant-free gloves). The grains of each sample were washed with Milli-Q water and dried at 35˚C for five days in a contamination-free uncorroded oven. From each sample, 10 g of grains was grounded manually into a fine powder using a mortar and pestle. The micronutrient analyses were performed at the Division of Soil Science and Agricultural Chemistry, IARI, New Delhi, India. The grain powder sample (0.5 g) was digested following the modified diacid protocol [70] by using a microwave digestion system (Multiwave ECO, Anton Paar, les Ulis, France). Fe and Zn concentration (in mg/kg seed) were estimated through ICP-MS (Perkin Elmer, model: NexION 300 ICP-MS, USA) by using an automatic sampling protocol. Two technical replications per biological replication were followed while estimating the concentration.

Genomic DNA extraction, purification and SSR amplification
Genomic DNA of the genotypes was extracted from 5 g of fresh leaf tissue by using the CTAB method proposed by Murray and Thompson [71]. The quality and quantity of DNA were determined using a spectrophotometer, and the samples were diluted to 10 ng/μL.

Diversity analysis and population structure
Polymorphism information content (PIC) was computed using the formula PIC = 1-SPi − SSPi Pj, where "i" is the total number of alleles detected for the SSR marker, "Pi" is the frequency of the i allele in the set of 96 genotypes investigated, and j = I + 1 [73]. A binary matrix was then transformed to a genetic similarity matrix by using Jaccard's coefficient. Unweighted neighbor joining (UNJ) method available in DARwin 5.0 (http://darwin.cirad.fr/) was used to visualize the dendrogram. STRUCTURE 2.3.4 was used [74] to determine the number of subgroups in the population assuming prior values of k = 1 to 10. The data was analyzed at a run length of 2,50,000 as the burning period length followed by 2,50,000 Markov Chain Monte Carlo iterations by keeping α constant. Each k value was repeated 10 times with an admixture model and correlated the allele frequency. The optimum k value was determined by plotting the Ln P (D) value against the given k value. Structure harvester v 6.92 [75] was used for obtaining the optimum k value determined by plotting the Ln P (D) value against k. The highest plateau was observed at delta k = 6; hence, the number of inferred populations was assumed to be six for further analysis.

Association mapping and favorable allele mining
To identify QTL for grain Fe and Zn concentration, association analysis were performed using 73 SSRs. The LD was estimated between each pair of polymorphic loci by calculating the square of the correlation coefficient (r 2 ) using TASSEL 3.01 [76] with 10,000 permutation. The General Linear Model (GLM with Q matrix to reduce false associations) was used to examine association between grain micronutrients and markers. An LD plot with p and r 2 values was generated by TASSEL to depict the overall LD among the entire SSR set. The markers considered to be significantly associated with the trait are represented in Manhattan plot [76]. Quantile-quantile (QQ) plots of the expected and observed p values was plotted to evaluate the adequacy of controlling Type I error.
Favorable allele for a marker loci associated with grain Fe and Zn concentration were identified using formula suggested by Li et al. [77]. The phenotypic allele effect (ai) was calculated as a i = ∑x ij /n i − ∑N k /n k , where a i is the phenotypic effect of the i th allele, x ij is the phenotypic value over the j th material with the i th allele, n i is the number of genotypes with the i th allele, N k is the phenotypic value over all genotypes, and n k is the number of genotypes [78]. It represents a comparison of the average phenotypic value of genotypes with a specific allele with that of all genotypes. When, ai > 0, then this allele is supposed to have positive effect on the trait. When ai < 0, the allele gives a negative effect [79].

Phenotypic variation in lentil germplasm
The variation in grain Fe and Zn concentration across locations and years are shown in the S1 Considering the data from the four environments, P 3220 (78.75 mg/kg) and P 13129 (70.45 mg/kg) were determined as the most promising genotype for the grain Zn concentration. The indigenous lines were found to be rich in grain Fe, whereas the exotic lines were found to be rich in grain Zn. A mean Fe grain concentration higher than 70 mg/kg seed was considered high, whereas that lower than 60 mg/kg seed was considered low. Similarly, a mean Zn grain concentration higher than 50 mg/kg seed was considered high and that lower than 40 mg/kg seed was considered low. Promising genotypes such as P 2130 (Fe: 85

Genetic diversity and population structure
A total of 220 alleles were generated through 73 polymorphic SSRs in the AM panel ( Table 2). The number of alleles produced per locus ranged from 2 to 5, with an average of 2.97 alleles per locus. The highest number of alleles (5) was detected for SSRs PBALC 353 and SSR 33, whereas 17 SSRs exhibited only two alleles. The PIC value ranged from 0.08 to 0.68, with an average of 0.36, indicating that the SSRs used in the study were informative. Genomic SSRs produced a higher average number of alleles (Na), and PIC over EST SSRs (Table 2). SSR 33 produced the maximum number of alleles (5 with the highest PIC value of 0.68).
Admixture model-based simulations revealed six subgroups (SGs) in the panel. The six SGs represented as SG I, SG II, SG III, SG IV, SG V, and SG VI, which included 9, 12, 21, 12, 23, and 19 genotypes, respectively (Fig 1). SG I comprised of nine indigenous genotypes including released varieties and advanced breeding lines. SG II comprised of twelve genotypes including two exotic genotypes. SG III consisted of 21 genotypes including two exotic genotypes. SG IV comprised 12 indigenous genotypes developed at IARI, New Delhi. SG V included 23 exotic genotypes from ICARDA. SG VI comprised 19 indigenous genotypes. The grouping of genotypes in SG I, IV, V and VI was based on the origin of genotype. The SG II and III included mainly indigenous genotypes and few exotic genotypes.
The UNJ separated the lentil genotypes into three clusters. Cluster I comprised 25 lentil genotypes including 23 exotic germplasm from ICARDA and two Indian lentil genotypes L   (Fig 2).

LD and marker trait association analysis
The LD patterns of all 2628 pairwise combinations of the 73 SSRs were assessed using TASSEL (Fig 3). The LD ranged from 0.0 to 0.70. The highest LD value was recorded between PLC 81 and GLLC 563, PLC 38, and PLC 60. In the present study, association mapping was used to identify linked markers for grain Fe and Zn concentration by using the GLM with the Q model. In total, eight SSRs (contributing to 8-22% of the phenotypic variation) were significantly associated with the grain Fe concentration, and five SSRs (contributing to 4-21% of the phenotypic variation) with the grain Zn concentration (Table 3). Environment-wise different SSRs associated with grain Fe and Zn concentration with -log10 p value >2 are presented in  14) in three datasets each. Few SSRs exhibited a consistent association with grain Fe and Zn concentration across environments. Three SSRs (PBALC 13, PBALC 206, and GLLC 563) exhibiting an association with the grain Fe concentration revealed phenotypic variation of 11%, 9%, and 11%, respectively. SSRs PBALC 353, SSR 317-1, PLC 62, and PBALC 217 were found to be associated with the grain Zn concentration with the phenotypic variation of 21%, 18%, 14%, and 16%, respectively. The markers that were significantly associated with the trait were represented in Manhattan plots (Figs 4 and 5). A QQ plot is a graphical method of depicting the observed and expected probability distributions by plotting their quantiles next to each other. The results derived from the GLM analysis were also explained in these plots (S2 Fig). The hypothetical QQ plots of the marker-trait association study for grain Fe and Zn concentration are a good approximation of normality.

Favorable allele mining
Phenotypic allele effects (ai) were calculated for SSRs associated with grain Fe and Zn concentration. The a i value greater than zero indicates that the allele has a positive effect, whereas the value of a i less than zero indicates that the allele has a negative effect. The favorable alleles identified for the grain Fe concentration include PBALC 13-1, PBALC 206-1, GLLC 563-1, GLLC 563-2, and PBALC 32-1, of which PBALC 13-1 exerted the maximum positive effect on a phenotype. Similarly, the favorable alleles identified for the grain Zn concentration include  Table 4.

Genetic variation in grain Fe and Zn concentration
Breeding micronutrient-rich crops is required to combat micronutrient deficiencies in humans [80]. The characterization of genotypes for micronutrient concentrations in different environments is essential for identifying genotypes rich in Fe and Zn. In the present study, high genetic variation for grain Fe and Zn concentration was recorded. Approximately 29.1% of genotypes had a high mean grain Fe concentration (>70 mg/kg) and 31.2% of genotypes had a low mean grain Fe concentration (<60 mg/kg). Similarly, 22.9% of genotypes had a high mean grain Zn concentration (>50 mg/kg) and 20.8% of the genotypes had a low mean grain Zn concentration (<40 mg/kg). The results indicated that the grain Fe concentration was higher in the indigenous genotypes than in the exotic genotypes. By contrast, the exotic genotypes were relatively richer in grain Zn concentrations than were the indigenous genotypes. The germplasm has been characterized for grain Fe and Zn concentration in wheat [81,82], rice [83] and chickpea [84,85]. A wide range of variability in grain Fe and Zn concentration in lentil [86,87] indicated the role of genotypes and environmental interactions in the expression  Table 1). https://doi.org/10.1371/journal.pone.0188296.g002 Association mapping for grain Fe and Zn in lentil of these traits. The variation in grain micronutrient concentration of the lentil genotypes at different locations and years in the present study was due to the sensitivity of genotypes toward variations in weather and soil conditions. Soil parameters affect the availability of Zn uptake in plants and control the amount of organic matter and the grain Zn concentration in soil solutions [88]. By hybridizing indigenous Fe-rich grain varieties with exotic Zn-rich grain varieties, the concentration of both micronutrients can be improved simultaneously and the genetic base of indigenous genotypes can be broadened. Biofortified lentil varieties can be sustainable as well as cost-effective for alleviating the micronutrient deficiency, thereby complementing the process of food fortification. Thavarajah et al. [89] reported that 100 g of dry lentil can provide the recommended daily allowance of micronutrients (Fe and Zn) in adults. Therefore, lentil can be consumed as a whole grain to meet the daily requirement of grain micronutrient concentration in humans. Correlation is a crucial aspect of the crop improvement program and is used to improve correlated traits simultaneously or reduce undesirable effects of some traits on another. We observed a positive correlation (r = 0.11) for grain Fe and Zn concentration. Similar results have been reported in other studies [90,91]. Fe and Zn are found in similar types of foods, and their absorption mechanisms are affected by food compounds. Therefore, the biochemical status of Fe and Zn may be interconnected [92]. A significant correlation between grain Fe and Association mapping for grain Fe and Zn in lentil Zn concentration was reported in fieldpea [93], barley [94] wheat [95] and maize [96,97]. The positive association between these traits suggests some common genetic mechanisms for the uptake, accumulation, and concentration of Fe and Zn. Correlation is very useful in the crop improvement program as a breeder can improve the micronutrient concentration of both the traits (Fe and Zn) simultaneously. Although the levels of Zn and Fe in grains are positively related, fertilization of one element did not affect the grain concentration of the other [98,99].  [101]. The PIC is a critical factor for selecting SSRs for the characterization of germplasm and tagging of genes [102]. PIC offers a more accurate assessment of diversity than does the raw number of alleles because PIC takes into account the relative frequencies of each allele [103]. It also indicates the ability to discriminate among genotypes. Kushwaha et al. [104] reported that markers with a PIC value ranging between 0.4 and 0.8 can be considered as informative exhibiting high polymorphism. The PIC observed in the current study were comparable with those reported in previous study by Andeden et al. [105], where PIC ranged from 0.13 to 0.79. In our study, the genetic diversity of genomic SSRs was higher than that of EST-SSRs and these findings are consistent with those of previous studies on lentil [100] and barley [106]. A lower level of polymorphism in EST-SSRs may be due to The population structure explained the presence of six subgroups in the AM panel. (Fig 2). The subgroup SG 5 consisted of 23 genotypes from the Mediterranean region. This group separated the Mediterranean genotypes from the remaining genotypes from India. The initial report [5] indicated that lentil genotypes from India have a narrow genetic base and are genetically more isolated than that of the the ICARDA genotypes. However in the last two decades Mediterranean material has been used for broadening the genetic base. Previous studies on lentil diversity by using molecular markers have revealed mostly two distinct groups, namely South Asian and all other origins [86,107,108] results from PCoA and cluster analyses also demonstrated a narrower genetic variability among the Indian material. Similarly Mekonnen et al. [109] and Koul et al. [110] reported five subgroups in the lentil germplasm. The

Association analysis and favorable allele identification
In the present study, we identified markers that are consistent across environments. Three SSRs were associated with the grain Fe concentration and four SSRs with the grain Zn concentration. After the validation of these trait-associated SSRs, the markers can be used for identifying genes/QTLs regulating grain Fe and Zn concentration. Furthermore, they can be useful in the marker-assisted genetic enhancement programs. The markers exhibiting an association in more than two environments were considered more reliable than the markers present in a particular environment. The SSRs PBALC 13 and GLLC 563 accounted for phenotypic variation of 11%, whereas PBALC 206 explained 9% of phenotypic variation for the grain Fe concentration. For the grain Zn concentration, PBALC 353 exhibited the highest phenotypic variation (21%), followed by SSR 317-1 (18%), PBALC 217 (16%), and PLC62 (14%). Similar level of trait variation for grain Fe and Zn concentration was recorded in fieldpea [93] and chickpea [84]. The size of our lentil germplasm panel was adequate. Even with small population size the high quality extensive phenotyping offers a reasonable basis for the association of mapping studies [111]. Association mapping studies in bean [112], peanut [113], barley [114], tomato [115] and sugarcane [111] also used approximately 90 genotypes. Atwell et al. [63] reported that for some of the traits, significant results can be achieved in a population size of less than 100. The putative functions of PBALC 217 reported by Kaur et al. [51] were assigned through comparison with the non redundant sequence database at the National Center for Biotechnology Information by using the basic local alignment search tool (BLAST) BLASTX program (http://blast.ncbi.nlm.nih.gov/Blast.cgi). This SSR exhibited homology with dormancy or auxin-associated protein in Medicago. truncatula (E value = 3e-42). Similarly, Upadhayay et al. [85] reported the auxin/IAA as a known/putative function of SNP (CakSNP1628/SNP55) associated with the seed Fe concentration.
By using genomic SSRs, association mapping has been successfully demonstrated in rice [116], wheat [117] and chickpea [118]. Our study in lentil using SSRs is the first attempt to identify QTLs or genes for grain Fe and Zn concentration. Gupta et al. [119] used 50 SSRs for The association mapping approach was useful in identification of marker loci linked with the grain Fe and Zn concentration. This approach also aided in mining alleles and further utilizing these favorable alleles with the maximum positive effect in marker-assisted selection, as suggested by Wan [122]. The identified favorable alleles can be used in the lentil breeding program for improving grain Fe and Zn concentration. For the grain Fe concentration, the maximum positive effect for the favorable allele PBALC 13-1 was recorded in seventeen genotypes (L 404, L 830, L 4596, L 4602, L 4603, L 4698, DPL 21, PL 02, P 2124, P 2125, P 2126, P 2127, P 2130, P 3113, P 3204, P 3208, and P 15104). Similarly, for the grain Zn concentration, the maximum positive effect for the favorable allele PLC 62-4 was recorded in seven genotypes (P 2124, P 2126, P 2130, P 3113, P 3204, P 3208, and P 3220) originating from ICARDA. These genotypes can be utilized in a hybridization program for simultaneous improvement of grain Fe-and Zn-rich varieties. Similar studies have also been performed in other crops such, wheat [106] and tomato [115].
Biofortification (breeding micronutrient-rich crops) of lentil can be achieved through plant breeding without affecting the yield or quality. The process like transfer of other traits is tedious and involves screening of germplasm, hybridization, study of mode of inheritance, molecular marker-assisted selection, and regular phenotyping of segregating populations. The approach is a sustainable and cost-effective solution for delivering micronutrients [123] and has emerged as an agriculture-based strategy to fulfill the nutritional requirement of malnourished people throughout the world. Considerable knowledge has been obtained on the molecular mechanisms affecting the accumulation of Fe [124] and Zn [125] in plants. In the future, these studies can be applied to develop crops with enhanced mineral concentration with the help of biotechnological tools in conventional breeding. In this study, markers were identified as being linked to grain Fe and Zn concentration. These identified SSRs can be further validated and deployed in marker-assisted selection for developing grain Fe and Zn rich lentil varieties. Superior positive alleles can be pyramided by hybridization for enhancement of grain Fe and Zn concentration.