A high-density genetic map and QTL analysis of agronomic traits in foxtail millet [Setaria italica (L.) P. Beauv.] using RAD-seq

Foxtail millet (Setaria italica), a very important grain crop in China, has become a new model plant for cereal crops and biofuel grasses. Although its reference genome sequence was released recently, quantitative trait loci (QTLs) controlling complex agronomic traits remains limited. The development of massively parallel genotyping methods and next-generation sequencing technologies provides an excellent opportunity for developing single-nucleotide polymorphisms (SNPs) for linkage map construction and QTL analysis of complex quantitative traits. In this study, a high-throughput and cost-effective RAD-seq approach was employed to generate a high-density genetic map for foxtail millet. A total of 2,668,587 SNP loci were detected according to the reference genome sequence; meanwhile, 9,968 SNP markers were used to genotype 124 F2 progenies derived from the cross between Hongmiaozhangu and Changnong35; a high-density genetic map spanning 1648.8 cM, with an average distance of 0.17 cM between adjacent markers was constructed; 11 major QTLs for eight agronomic traits were identified; five co-dominant DNA markers were developed. These findings will be of value for the identification of candidate genes and marker-assisted selection in foxtail millet.


Introduction
Foxtail millet (Setaria italica) is one of the oldest cereals in the world, and is thought to have been domesticated from the wild species green foxtail, more than 8,000 years ago in northern China [1]. Foxtail millet has many excellent characteristics, among which C 4 photosynthesis, which is the primary mode of carbon capture for some of the world's most important food, feed, and fuel crops, such as maize, sorghum, sugarcane and switchgrass [2]. In addition, foxtail millet is known for its nutritional value: its grains have high protein, folic acid, vitamin E, carotenoids, and selenium [3][4][5]. In recent years, foxtail millet has become a valuable model PLOS

Plant materials
Two foxtail millet cultivars, Hongmiaozhangu (P 1 ) and Changnong35 (P 2 ) were selected as female and male parents, respectively, for the mapping population. Hongmiaozhangu is characterized by low plant height, multiple tillers, and a small panicle. Changnong35 is characterized by high plant height, no tillering, and large panicle. The mapping parents were crossed at the Millet Research Institute (Changzhi, Shanxi) in 2013, and three real F 1 hybrids were obtained. F 2 seeds were obtained from a self-pollinated F 1 individual in 2014. In 2015, the parents and 124 F 2 plants obtained from one F 1 individual were planted, and plant height (PH, cm), main panicle length (MPL, cm), main panicle diameter (MPD, cm), first main internode diameter (FMID, cm), second main internode diameter (SMID, cm), and third main internode diameter (TMID, cm) were assessed in the mature stage. Main panicle weight per plant (MPWP, g) and main grain weight per plant (MGWP, g) were measured after harvest. Data were analyzed using SPSS 17. Genomic DNA was extracted from young leaf tissues of parental and 124 F 2 plants using a Plant DNA Kit (OMEGA, USA, D3485-02).

RAD-seq of the parental lines and F 2 population
We employed the RAD protocol described by Baird et al. [35]. The enzymes and restriction fragment sizes were evaluated based on the reference genome sequence (https://www.ncbi. nlm.nih.gov/genome/?term=foxtail+millet). TapI was selected for RAD library construction. The library for Illumina sequencing was constructed from 200 ng of each DNA sample. All library were sequenced using Illumina HiSeq X Ten at Shanghai Major Biological Medicine Technology Co., Ltd.

SNP identification and genotyping
For SNP calling, the Burrows-Wheeler Aligner [38] was applied for sequence alignment between the individual reads and the reference genome sequence, the Genome Analysis ToolKit [39] was used to detect SNP loci, and SAMtools [40] was used to filter out SNP loci. In this study, filtering of SNP loci was based on three criteria: (i) average sequence depth is < 5-fold in parents and < 3-fold in the progeny; (ii) no polymorphism between the parents; (iii) heterozygous in parents.

Development of new DNA markers
Primers were designed according to the flanking sequences (300 bp upstream and 300 bp downstream of the selected SNPs). The primer sequences used for the new DNA markers are listed in S4 Table. PCR was carried out in a 10 μL volume containing 40 ng genomic DNA, 1.0 μL 10× reaction buffer, 0.2 μL 10 mmol L -1 dNTPs, 1.0 μL primer, 1 U rTaq DNA polymerase (TaKaRa, Dalian), and 5.7 μL ddH 2 O. The PCR was performed by initially denaturing the template DNA at 94˚C for 5 min, followed by 35 cycles at 94˚C for 30 s, 58˚C for 30 s, and 72˚C for 30 s, and then terminated by a final extension for 10 min at 72˚C. PCR fragments were separated on 8% non-denatured polyacrylamide gel electrophoresis (PAGE) and visualized by silver staining [41].
construction. Construction of the linkage map was performed using MSTmap software [42]. The major parameters for loci partition were as follows: distance_function for Haldane; p_value for 0.0000001; no_map_dist for 10; missing_threshold for 0.3; objective_function for Maximum likelihood. A total of 10,016 SNP markers were considered for linkage map construction. LOD = 10 was used to partition the SNP markers into linkage groups. Linkage groups from the same chromosome were merged together, and SNP markers of the same chromosome were again reordered with MSTmap.

QTL analysis
QTL analysis was conducted using CIM (composite interval mapping) of the R/qtl package [43]. LOD thresholds for testing the significance of QTL peaks of each trait were calculated using 1,000 permutations, with a confidence interval of at least 80%. The position and effect of significant QTL was assessed for additive effects and percentage phenotypic variation explained (PVE%) by fitting a model containing all QTL identified for a given trait in R/qtl. The method of naming QTLs was as follows: q plus trait abbreviation and chromosome number; plus -1, -2, etc., when multiple QTLs for one trait were detected on the same chromosome.

Results and discussion
Phenotypic data of eight agronomic traits Phenotypic data of the eight agronomic traits were analyzed (  [44]. In the present study, all eight traits showed positive correlations with one another, except PH which had non-significant correlations with MPD, FMID, SMID, and TMID (Table 2), corroborating the results of previous studies.

RAD-seq analysis and SNP identification in parental lines and F 2 individuals
High-throughput genotyping by sequencing is an option for efficient marker-assisted breeding [45]. Currently, there are many new methods using NGS for identifying genes or QTLs, including GWAS [46], QTL-seq [47], MutMap [48], RAD-seq [35], and SLAF-seq [49], and some have been applied to foxtail millet [15, 30, 31]. However, the utility of RAD-seq has yet not been reported. RAD-seq sequences short DNA fragments with restriction sites digested by restriction endonucleases, regardless of length. It has been applied for SNP identification and linkage map construction in various organisms, including barley, snail, and ryegrass [50][51][52][53].
In the present study, a total of 830,740,674 reads were obtained for the parental lines and F 2 population. After removal of low-quality reads, 760,378,099 high-quality reads were obtained, representing 91.5% of all reads. The high-quality reads were subsequently mapped to the reference genomic sequence, with a mapping ratio of at least 82% (S1 Table).
A total of 2,668,587 SNP loci were detected, including 2,629,567 located on nine chromosomes and the remaining 39,020 found on scaffolds. Among the 2,629,567 SNP loci, 97.5% (2,564,566) were filtered because of low sequence depth and lack of polymorphism between parents. The remaining 65,001 SNP loci were further screened to remove markers unsuitable for genetic map construction.
Among the 65,001 SNP markers, the numbers of markers on the chromosomes ranged from 4,414 to 10,625; the markers covered at least 98.37% of the physical length of the genome ( Table 3). The marker density along each chromosome ranged from 104.82 to 261.63 markers per Mb, averaging 164.64 markers per Mb. The highest marker density (261.63/Mb) was found on chromosome 8, followed by chromosome 7 (213.19/Mb); the lowest marker density (104.82/Mb) was found on chromosome 1. In this study, we genotyped the parental lines using different letters and determined the segregation patterns of the mapping population. The genotypes of the SNP loci were encoded according to the paternal and maternal genotypes instead of the reference sequence. One locus with a homozygous SNP in the paternal and maternal genotypes would be encoded as aa × bb; if a certain SNP was heterozygous for one or two parents, the genotypes of the markers for the two parents were encoded as, for example, cc × ab or ef × eg. In total, we successfully coded 65,001 polymorphic SNP loci. Furthermore, these SNPs were classified into six segregation patterns (ab × cc, cc × ab, ef × eg, nn × np, lm × ll, and aa × bb). Finally, according to the two parent genotypes, 39,299 markers, which fell into the aa × bb segregation pattern, were used in linkage analysis (Fig 1).

High-density genetic linkage map with SNPs
Genetic linkage maps play a major role in clarifying the genetic control of important traits. In particular, high-density molecular markers can be used to quickly map agronomic traits and to identify candidate genes within a region of interest [54]. In this study, after removing incomplete (26,906), significant segregation distortion (24,706), and non-aa × bb (3.373) SNP markers, 10,016 SNP markers were retained for genetic map construction. Finally, a high-density genetic map was constructed containing a total of 9,968 SNP markers. The markers were grouped into nine linkage groups and ordered (Fig 2 and S2 Table). The total genetic distance of the generated map was 1648.8 cM, with an average distance of 0.17 cM between adjacent markers. The largest linkage group was Chr. 8 with 2541 SNP markers and a length of 199.9 cM; the smallest was Chr. 6 with 841 SNP markers and a length of 144.3 cM. Two large gaps (>20 cM) were identified on Chr. 1 (34.23 cM) and Chr. 4 (23.21 cM), respectively. The highest missing data was found on Chr. 6 with 8.42%; the lowest missing data (6.52%) was found on Chr. 7 (Table 4).
SNP markers are efficient for high-density genetic map construction since they allow highthroughput assessment, compared to RFLP and SSR markers. Currently, the most saturated intervarietal map was constructed by Fang et al. [16]. Compared with this map, the number of mapped loci (1035 SSR makers vs. 9968 SNP markers), marker density (26.25 marker/Mb vs. 164.64 marker/Mb), average distance between adjacent markers (1.27 cM vs. 0.17 cM) and total map length (1318.8 cM vs 1648.8 cM) were significantly increased in the newly constructed SNP maker genetic map. The marker number (9,968 SNPs) in this map was also Bennetzen et al. [13] (992 SNPs). The current map provides not only a large number of SNP markers for foxtail millet, but also useful data for QTL analysis, gene fine mapping, and molecular breeding.  The collinearity of each chromosome with the reference genome was also analyzed (Fig 3). The average ratios of genetic-to-physical distance in low-and high-recombination chromosomes were 3.3 cM/Mb (Chr. 9) and 4.98 cM/Mb (Chr. 8), respectively (S3 Table). The 9,968 SNP markers in the genetic map covered 393.53 Mb of the physical length, spanning approximately 76.4% of the foxtail millet genome (*515 Mb).
In this study, high levels of collinearity for each chromosome were revealed compared with the reference genome. A relatively low collinearity was observed between Chr. 8 and the reference genome. Non-collinearity could result from multiple factors, including intra-and interchromosomal locus duplication, genome rearrangement, transposon-mediated marker transposition, discrepancy in recombination rate among different genomic regions, small mapping population size, compromised marker ordering in the consensus map, missing data, and genotyping errors [55][56][57].

QTLs for agronomic traits
Agronomic traits play an important role in the breeding of crops. The more QTLs of agronomic traits that are identified, the more they promote breeding by MAS. Recently, Chinese scientists have reported the successful development of new elite varieties in rice by pyramiding major genes that significantly contribute to grain quality and yield from three parents over 5 years, and demonstrated that rational design is a powerful strategy for meeting the challenges of future crop breeding [58]. In the present study, with the exception of MGWP, a total of 11 QTLs were identified for eight agronomic traits using the F 2 population and SNP markers ( Table 5, Fig 2). The 11 QTLs were mapped to Chr. 1, Chr. 2, Chr. 5, Chr. 7, Chr. 8, and Chr. 9.
MPL had three QTLs, the most prominent of which was designated qMPL8.1, that explained 12.1% of the phenotypic variance. Sixty-five SNP markers covered this interval. The other two QTLs (qMPL1.1 and qMPL1.2) were detected on Chr. 1 with LOD scores of 3.48 and 3.57, and explained 23.3% of the phenotypic variance. Of these, qMPL1.1 and qMPL1.2 have also been detected in the haplotype and SSR maps [16,30]. Two QTLs were detected for FMID, with the largest effect displayed by qFMID9.1, which explained 15.5% of the phenotypic variance. Forty-seven SNP markers were identified within the chromosomal region of qFMID9.1. Two QTLs were detected for TMID, with the largest effect displayed by qTMID2.1, which explained 11.0% of the phenotypic variance. A total of twenty-two SNP markers were found within the chromosomal region of qTMID2.1. The other QTL, qTMID5.1 with 10.8% of the phenotypic variance was also detected in the haplotype map [30].
With the exception of qMPL1.1 and qTMID5.1, the remaining nine QTLs were identified for the first time, indicating that differences in QTL number and position might be attributed to different mapping populations (genotypes, population sizes, etc.), type and number of markers (i.e., RFLPs, SSRs, and SNPs), and environmental effects.

Newly developed DNA markers
In order to utilize the SNPs identified between Hongmiaozhangu and Changnong35, five SNPs were randomly selected in the intervals of identified QTLs. Based on the flanking sequences of selected SNPs, we designed primers and amplified the target sequences by PCR using genomic DNA of the two parents and their progenies. Finally, five co-dominant DNA markers were developed (Fig 4 and S4 Table). These markers will be useful for gene cloning and molecular breeding of foxtail millet.

Mapping population
The construction of F 2 populations is general straightforward, and F 2 populations provide abundant information suitable for gene or QTL mapping and genetic analysis for many qualitative and quantitative traits. To date, there have been many attempts to use F 2 populations directly for QTL analysis in crops. For example, QTLs were obtained that were associated  with: crown rust susceptibility in ryegrass; drought-induced flag leaf senescence in wheat; awn, incomplete panicle exertion, and total spikelet number in rice; and aluminum-toxicity tolerance in soybean [32, [59][60][61]. In this study, the two parental lines have contrasting values across a wide range of agronomic traits, i.e. PH (154.0 and 176.8), MPD (1.8 and 3.1) (Table 1), which are essential for QTL identification. Finally, 11 major QTLs (PVE% > 10) were identified for eight agronomic traits. However, QTLs are sensitive to different environmental factors, such as years, regions, and to sometimes even materials [30,62]. Therefore, different QTLs were more likely to be obtained in various studies. In particular, in the F 2 population, it is important to verify the identified QTLs because of the lack of repeats. In future experiments, we will construct an recombinant inbred line (RIL) population to repeatedly verify the QTLs identified in the present study.

Conclusions
Using a high-throughput and cost-effective RAD-seq approach, we developed a total of 9,968 SNPs to construct a high-density genetic linkage map for foxtail millet, spanning 1648.8 cM, with an average distance of 0.17 cM between adjacent markers. In total, 11 major QTLs were identified for eight agronomic traits, and nine QTLs (qPH1.

Accession number
Raw sequence data obtained in this study have been deposited in the NCBI Sequence Read Archive (SRA) with accession number SRP102319.