A High-Density Genetic Map of Tetraploid Salix matsudana Using Specific Length Amplified Fragment Sequencing (SLAF-seq)

As a salt-tolerant arbor tree species, Salix matsudana plays an important role in afforestation and greening in the coastal areas of China. To select superior Salix varieties that can adapt to extensive saline areas, it is essential to identify and understand the mechanisms of salt tolerance at the whole-genome level. Here, we describe a high-density genetic linkage map of S. matsudana that provides good coverage of the Salix genome. An intraspecific F1 hybrid population was established by crossing the salt-sensitive “Yanjiang” variety as the female parent with the salt-tolerant “9901” variety as the male parent. This population, along with its parents, was genotyped by specific length amplified fragment sequencing (SLAF-seq), yielding 277,333 high-quality SLAF markers. By marker analysis, we found that both the parents and offspring were tetraploid. The mean sequencing depth was 53.20-fold for “Yanjiang”, 47.41-fold for “9901”, and 11.02-fold for the offspring. Of the SLAF markers detected, 42,321 were polymorphic and of sufficient quality for map construction. The final genetic map was constructed using 6,737 SLAF markers distributed over 38 linkage groups (LGs). The genetic map spanned 5,497.45 cM, with an average distance of 0.82 cM between adjacent markers. As the first high-density genetic map of S. matsudana constructed from varieties differing in salt tolerance, this map will provide a foundation for mapping quantitative trait loci that modulate salt tolerance and resistance in Salix and an important reference for molecular breeding of this important forest tree.


Introduction
Willows, the general name for species of Salix (Salicaceae), are deciduous trees or shrubs distributed mainly in the temperate and frigid zones of the Northern Hemisphere. There are more than 500 willow species worldwide, about half of which can be found in China. Willows are important tree species for energy production, afforestation, and greening [1][2][3]. Some willow species, including S. matsudana and Salix psammophila, have been receiving increasing attention because of their salt tolerance [4][5][6][7][8]. Some varieties of S. matsudana have become potential strategic resources for coastal forestry in China [9,10]. To date, the responses of S. matsudana to salt stress have been studied at the levels of physiology [10], gene expression [9], and miRNA expression [11]. However, the mechanisms of salt tolerance in S. matsudana have not yet been extensively explored at the whole-genome level.
Genetic maps, which are constructed according to the linkage relationships among genetic markers at the whole-genome level, are the basis for quantitative trait locus (QTL) mapping, map-based gene cloning, comparative genomics, and marker-assisted breeding. Willows have relatively high recombination rates and low levels of linkage disequilibrium (LD) [12], which makes them suitable for genetic mapping. The first genetic linkage map of Salix was constructed using a population of 87 hybrids derived from a cross between "Björn" (the male hybrid clone of Salix viminalis × Salix schwerinii) and "78183" (the female clone of Salix viminalis). The map consisted of 325 amplified fragment length polymorphism (AFLP) and 38 restriction fragment length polymorphism (RFLP) markers, with an average distance between markers of 14 cM [13]. Later, two linkage maps of Salix containing 495 single nucleotide polymorphism (SNP) and 221 AFLP markers were constructed, with average distances of 5.0 and 8.1 cM, respectively [14]. All these maps were constructed using diploid (2n = 38) willows and were of low density, with average marker distances of 5.0-14 cM. Barcaccia et al. [15] successfully constructed a genetic map for tetraploid (2n = 4x = 76) Salix, but its marker density needs to be increased for a better understanding of the genome structure and organization of tetraploid Salix.
Single nucleotide polymorphisms (SNPs) represent DNA sequence variation among individuals caused by single-base mutations. SNP markers have become a powerful tool in genetics because of their abundance and even distribution. Recent advances in next-generation sequencing (NGS) technologies have provided enormous impetus for the rapid development and extensive application of SNP markers [16]. Technologies that can develop SNP markers in a short time include complexity reduction of polymorphic sequences (CRoPS) [17], restriction-site-associated DNA sequencing (RAD-seq) [18], and genotyping-by-sequencing [19]. Specific length amplified fragment sequencing (SLAF-seq) is a newly developed NGS technology that can be used to rapidly develop SNP markers by constructing a SLAF-seq library [20]. SNP marker development and high-density genetic map construction based on SLAF-seq have been applied to a number of species [20][21][22][23][24].
In this study, "Yanjiang" (a salt-sensitive variety of S. matsudana native to the Jiangsu riverine areas of China) and "9901" (a salt-tolerant variety of S. matsudana from the Shandong coastal areas of China) were used as female and male parents, respectively, and an intraspecific F1 hybrid population containing 3,520 individuals was obtained through controlled pollination. A total of 200 individuals selected randomly from the F1 population, along with the two parents, were used as the mapping population. SLAF-seq technology [20] was used to develop SLAF markers (SLAFs) and to construct a high-density intraspecific genetic map of tetraploid S. matsudana from varieties differing in salt tolerance. Results from this study will provide a foundation for genetic map-based QTL fine mapping, gene cloning, comparative genomics, and marker-assisted breeding of salt tolerance-related traits in this important tree species.

Materials and Methods
Plant Material, DNA Extraction, and Identification of Ploidy Levels
Main branches of "Yanjiang" (female parent) and "9901" (male parent) with 5 cm diameters were collected on December 15, 2013, and then cultured hydroponically in an intelligent greenhouse at 20-25°C. (This work was conducted at the Jiangsu Riverine Institute of Agricultural Sciences, of which we are members and which granted us full permission for the work. This study did not involve endangered or protected species.) Mature pollen was collected from "9901" and used to pollinate receptive female flowers on "Yanjiang". A total of 3,520 F1 offspring were obtained and sown. Next, 200 individuals were selected randomly from the F1 population on July 3, 2014, and used as the mapping population. Genomic DNA of the mapping population was extracted using the cationic detergent cetyltrimethylammonium bromide (CTAB) method [25]. The extracted DNA was checked by agarose gel electrophoresis (1%) and then analyzed on an ND-1000 spectrophotometer (NanoDrop, Wilmington, DE, USA) for concentration and purity. Ploidy levels of the parents and offspring were examined by flow cytometry [26] according to the method of Serapiglia et al. [27].

Construction and Sequencing of the SLAF Library and Development of Polymorphic SLAF Markers
The genome of Populus trichocarpa (http://www.ncbi.nlm.nih.gov/assembly/GCF_000002775.3) was chosen as the reference genome for the preliminary restriction enzyme digestion analysis on the basis of its genome size and GC content. The HaeIII and Hpy166II restriction enzymes were finally chosen to digest genomic DNA of the mapping population. After digestion by HaeIII and Hpy166II, the obtained SLAFs (314-364 bp in length) of the mapping population had an A-tail added to the 3′ ends, were ligated with Dual-index sequencing adaptors, amplified by PCR, screened, and then used to construct the SLAF library of S. matsudana (for the detailed process of SLAF library construction, refer to the methods of Sun et al. [20]).
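The enzyme selection step described above can be illustrated with a small in silico digestion. The following is only a minimal sketch, not the pipeline used in this study: it assumes the commonly documented recognition sites HaeIII (GG^CC) and Hpy166II (GTN^NAC), both blunt cutters, treats the reference as a plain string, and counts fragments whose length falls in a chosen window. The function names and the toy sequence are hypothetical.

```python
import re

# Assumed recognition sites and cut offsets (blunt cutters):
# HaeIII cuts GG^CC, Hpy166II cuts GTN^NAC.
ENZYMES = {
    "HaeIII": (r"GGCC", 2),
    "Hpy166II": (r"GT[ACGT][ACGT]AC", 3),
}


def cut_positions(seq, enzymes=ENZYMES):
    """Return the sorted, de-duplicated cut coordinates for all enzymes."""
    seq = seq.upper()
    cuts = set()
    for pattern, offset in enzymes.values():
        # A lookahead pattern also reports overlapping recognition sites.
        for match in re.finditer(f"(?={pattern})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def internal_fragment_lengths(seq):
    """Lengths of fragments that are bounded by a cut site on both ends."""
    cuts = cut_positions(seq)
    return [right - left for left, right in zip(cuts, cuts[1:])]


def count_in_window(seq, min_len, max_len):
    """Number of double-cut fragments with min_len <= length <= max_len."""
    return sum(min_len <= n <= max_len for n in internal_fragment_lengths(seq))


# Toy sequence: one HaeIII site followed by one Hpy166II site.
toy = "AAAA" + "GGCC" + "T" * 10 + "GTAAAC" + "CCCC"
print(cut_positions(toy))              # cut coordinates on the toy sequence
print(internal_fragment_lengths(toy))  # a single 15 bp internal fragment
```

On a real reference, `count_in_window(sequence, 314, 364)` would give a rough count of candidate fragments in the size range quoted in the text; whether that range refers to the insert alone or includes adaptors is not stated here, so the window should be treated as an assumption.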
SLAFs in the quality-tested library were sequenced using the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA). The genome of Oryza sativa, used as a control, underwent the same treatments of library construction and sequencing as the S. matsudana mapping population to check the reliability of testing processes.
Reads of the samples were obtained by identifying Dual-index sequences. The adaptor-filtered reads were evaluated for quality and data size, and then clustered to develop SLAFs in the parents and offspring. Polymorphic SLAFs were selected according to the single nucleotide mutations (SNP markers) and insertion and deletion mutations (InDel markers).

Construction of High-Density Genetic Maps
The high-quality polymorphic SLAFs were allocated to different linkage groups (LGs) according to their modified logarithm of odds (MLOD) values. Genetic maps were constructed and corrected using HighMap software according to the methods described in [22]. The genetic map was evaluated according to the haplotype maps, the percentage of missing SLAFs, and the heat maps of each LG.
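The idea of grouping markers by a pairwise linkage score can be sketched as follows. This example uses the classical two-point LOD score, not the modified score (MLOD) computed by HighMap, and it makes strong simplifying assumptions that are not taken from the study: markers are coded as phase-known 0/1 calls in coupling phase, there are no missing data, and groups are formed by simple single-linkage clustering at a hypothetical threshold of 3.0.

```python
import math


def two_point_lod(n_recombinant, n_total):
    """Classical two-point LOD: log10 L(r_hat) - log10 L(r = 0.5)."""
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("invalid recombinant / total counts")
    # Maximum-likelihood recombination fraction, capped at 0.5.
    r_hat = min(n_recombinant / n_total, 0.5)
    n_parental = n_total - n_recombinant

    def log10_likelihood(theta):
        value = 0.0
        if n_recombinant:
            value += n_recombinant * math.log10(theta)
        if n_parental:
            value += n_parental * math.log10(1.0 - theta)
        return value

    return r_hat, log10_likelihood(r_hat) - log10_likelihood(0.5)


def group_markers(calls, lod_threshold=3.0):
    """Single-linkage grouping of markers whose pairwise LOD >= threshold.

    calls maps a marker name to a list of 0/1 calls, one per individual.
    """
    names = list(calls)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            n_rec = sum(x != y for x, y in zip(calls[a], calls[b]))
            _, lod = two_point_lod(n_rec, len(calls[a]))
            if lod >= lod_threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy data for 20 individuals: m1 and m2 differ in one call, m3 is unlinked.
m1 = [0, 1] * 10
m2 = [1] + m1[1:]
m3 = [0, 0, 1, 1] * 5
print(group_markers({"m1": m1, "m2": m2, "m3": m3}))
```

In the toy data, m1 and m2 show one recombinant in 20 individuals and are grouped together, while m3 recombines freely with both and forms its own group.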

Ploidy Levels of the Parents and Offspring
Salix suchowensis, a known diploid (2n = 2x = 38), was used as a control to identify the ploidy levels of the parents and offspring. Our results showed that the peak values of the parents were twice that of S. suchowensis, indicating that both the female parent and the male parent were tetraploid (2n = 4x = 76) (Fig 1). Similarly, the 15 randomly selected offspring in the mapping population were all tetraploid (2n = 4x = 76) (Fig 2).
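The reasoning behind the ploidy call can be written out as a short calculation: a sample's ploidy is inferred from the ratio of its fluorescence peak to that of the diploid control. The sketch below assumes a linear relationship between peak position and DNA content and a base chromosome number of x = 19 (from 2n = 2x = 38); the function names and peak values are hypothetical.

```python
def estimate_ploidy(sample_peak, reference_peak, reference_ploidy=2):
    """Nearest integer ploidy implied by the peak ratio to a known control."""
    if sample_peak <= 0 or reference_peak <= 0:
        raise ValueError("peak values must be positive")
    return round(sample_peak / reference_peak * reference_ploidy)


def chromosome_count(ploidy, base_number=19):
    """Somatic chromosome number for a given ploidy (x = 19 assumed)."""
    return ploidy * base_number


# Hypothetical peak positions: the sample peak is twice the diploid control.
ploidy = estimate_ploidy(sample_peak=200.0, reference_peak=100.0)
print(ploidy, chromosome_count(ploidy))
```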

Quality of SLAF-seq Data
Evaluation of the Oryza sativa SLAF library showed that the cleavage efficiency of the HaeIII and Hpy166II restriction enzymes was 92.06%, and that paired-end reads accounted for 96.67% of all reads obtained. These results indirectly indicated that the SLAF library of S. matsudana was of high quality. DNA sequencing generated a total of 472.53 M reads. The mean Q30 percentage (the percentage of bases with a sequencing quality value ≥ 30 among all bases) and the mean GC percentage (the percentage of G and C bases among all bases) of the S. matsudana parents and offspring were 90.15% and 38%, respectively. Basic statistics of the SLAF-seq data are listed in Table 1.
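The two summary statistics defined above can be computed directly from read sequences and their quality strings. The following is a minimal sketch that assumes Phred+33 quality encoding and takes reads as (sequence, quality) pairs rather than parsing FASTQ files; the function name and the toy reads are illustrative only.

```python
def q30_and_gc(records, phred_offset=33):
    """Return (Q30 percentage, GC percentage) over all bases in the reads.

    records is an iterable of (sequence, quality_string) pairs.
    Phred+33 encoding is assumed unless another offset is given.
    """
    total = q30 = gc = 0
    for seq, qual in records:
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        total += len(seq)
        gc += sum(base in "GCgc" for base in seq)
        q30 += sum(ord(ch) - phred_offset >= 30 for ch in qual)
    if total == 0:
        raise ValueError("no bases supplied")
    return 100.0 * q30 / total, 100.0 * gc / total


# Two toy reads: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
reads = [("GCAT", "IIII"), ("AATT", "I!!!")]
print(q30_and_gc(reads))
```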

Development of SLAF Markers
After analyzing the SLAF-seq data, five offspring in which abnormal SLAFs accounted for 0.3% of total SLAFs were removed from the mapping population, and the remaining 195 offspring and the parents were retained. In total, 277,333 SLAFs were detected; the average sequencing depths of the male parent, female parent, and offspring were 53.20-fold, 47.41-fold, and 11.02-fold, respectively (Table 2).
The 277,333 SLAFs were classified into the three categories of polymorphic, non-polymorphic, and repetitive according to differences in allele number and sequences. Of all the SLAFs obtained, 99,526 (accounting for 35.89% of the total SLAFs) were polymorphic (Table 3).
Of the 99,526 polymorphic SLAFs, those lacking parental information were filtered out; the 58,763 that were retained were further classified into eight segregation patterns (Fig 3). The number of SLAFs per pattern ranged from 806 to 16,442. The detailed distribution of the markers is shown in Fig 3. Since F1 offspring were used as the mapping population, SLAFs with an aa×bb pattern were filtered out. The SLAFs in the remaining seven segregation patterns (42,321) were candidate markers for constructing the genetic map.
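The classification into segregation patterns can be sketched from the parental genotypes alone. The text names only the aa×bb class, so the other seven labels used below (ab×cd, ef×eg, hk×hk, lm×ll, nn×np, ab×cc, cc×ab) are an assumption based on the codes commonly used for cross-pollinated populations, and each parent is represented by a two-allele genotype, which presumes diploid-like (disomic) scoring. The function name is hypothetical.

```python
def segregation_pattern(female, male):
    """Classify a marker by parental genotypes (female x male).

    Each genotype is a pair of allele labels, e.g. ("A", "G").
    """
    f_alleles, m_alleles = set(female), set(male)
    f_het, m_het = len(f_alleles) == 2, len(m_alleles) == 2
    shared = f_alleles & m_alleles

    if f_het and m_het:
        if not shared:
            return "ab×cd"  # four distinct alleles
        return "ef×eg" if len(shared) == 1 else "hk×hk"
    if f_het:
        # Only the female parent is heterozygous.
        return "lm×ll" if shared else "ab×cc"
    if m_het:
        # Only the male parent is heterozygous.
        return "nn×np" if shared else "cc×ab"
    # Both parents homozygous: uninformative in an F1 population.
    return "aa×bb" if not shared else "non-polymorphic"


print(segregation_pattern(("A", "G"), ("A", "A")))  # female-informative marker
print(segregation_pattern(("A", "A"), ("G", "G")))  # removed for F1 mapping

# Counts quoted in the text: classified markers minus candidate markers.
print(58763 - 42321)
```

The difference between the classified and candidate marker counts is 16,442, which equals the upper end of the per-pattern range reported in the text and suggests that aa×bb was the largest class.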

Construction of the High-Density Genetic Map
To ensure the quality of the genetic map, SLAFs among the 42,321 polymorphic candidates that showed any of the following characteristics were filtered out: 1) an excessive number of SNPs (threshold: 3); 2) parental sequencing depth below the 10-fold threshold; 3) integrity below the 85% threshold; 4) distorted segregation; 5) homozygous parents. Finally, 6,744 high-quality SLAFs belonging to five segregation patterns (Fig 4) were suitable for constructing the genetic map. The MLOD values between the 6,744 SLAFs were calculated, and 6,737 SLAFs were allocated to 38 LGs (S1 Fig). The linear arrangement of all SLAFs and the genetic distances between adjacent SLAFs within each LG were analyzed using HighMap software. An integrated genetic map with a total length of 5,497.45 cM and an average distance of 0.82 cM between adjacent markers was finally constructed (S1 Fig, Table 4). A total of 9,488 SNP markers were included on the map. The number of SNP markers in different LGs ranged from 125 to 511 (Fig 5). The mean sequencing depth of the SLAFs on the map was 190.38-fold for the parents and 28.68-fold for the offspring. The basic characteristics of the LGs on the male map and the female map are listed in Tables 5 and 6, respectively.
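Two of the quantities in this paragraph lend themselves to a short worked example: the test for distorted segregation and the average marker spacing. The sketch below assumes that distortion is assessed with a chi-square goodness-of-fit test at a 0.05 significance level, which is a common choice but is not stated in the text, and it recomputes the average spacing from the reported totals. Function names and the genotype counts are hypothetical.

```python
# Chi-square critical values at alpha = 0.05 for 1 to 3 degrees of freedom.
CRITICAL_005 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square(observed, expected_ratio):
    """Goodness-of-fit statistic for observed genotype counts."""
    n = sum(observed)
    ratio_total = sum(expected_ratio)
    statistic = 0.0
    for count, ratio in zip(observed, expected_ratio):
        expected = n * ratio / ratio_total
        statistic += (count - expected) ** 2 / expected
    return statistic


def is_distorted(observed, expected_ratio):
    """True if the counts deviate from the expected ratio at alpha = 0.05."""
    df = len(observed) - 1
    return chi_square(observed, expected_ratio) > CRITICAL_005[df]


def mean_marker_spacing(total_cm, n_markers, n_groups=0):
    """Map length divided by markers, or by intervals if n_groups is given."""
    return total_cm / (n_markers - n_groups)


# Hypothetical counts for a marker expected to segregate 1:1 in 195 offspring.
print(is_distorted([100, 95], [1, 1]))   # close to 1:1
print(is_distorted([130, 65], [1, 1]))   # strongly skewed

# Reported totals: 5,497.45 cM, 6,737 SLAFs, 38 linkage groups.
print(round(mean_marker_spacing(5497.45, 6737), 2))
print(round(mean_marker_spacing(5497.45, 6737, 38), 2))
```

Dividing by the number of markers or by the number of inter-marker intervals (markers minus linkage groups) gives the same value after rounding to two decimals, consistent with the 0.82 cM reported.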

Evaluation of the Genetic Map
The genetic map of S. matsudana was evaluated using haplotype maps and heat maps. Haplotype maps were generated for each of the 195 F1 individuals using the 6,737 SLAFs (S1 File). The recombination events of each individual on the LGs of the integrated genetic map are displayed intuitively on the haplotype maps. As can be seen in S2 File, the majority of recombination blocks were clearly defined. The percentage of missing markers in each LG of the integrated genetic map ranged from 0.14% to 0.52% (Fig 6), which did not significantly affect the quality of the genetic map. The markers were also distributed uniformly across the LGs. Heat maps were generated to evaluate the quality of the genetic map using pair-wise recombination values for the 6,737 mapped SLAFs (S2 File). Heat maps reflect the recombination relationships between markers in each LG and were used to find potential ordering errors. It can be seen in S2 File that most of the LGs performed well in visualization, indicating that the markers were well ordered in each LG. Consequently, the genetic map of S. matsudana was of high quality.
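The quantities behind these evaluations can be sketched in a few lines: a pair-wise recombination matrix (the values a heat map displays), a simple ordering check, and the percentage of missing calls. This is an illustration under stated assumptions rather than the HighMap procedure: markers are given in map order as phase-known 0/1 calls with None for missing data, and a linkage group is treated as well ordered when recombination never decreases moving away from the diagonal. All names and data are hypothetical.

```python
def pairwise_recombination(calls):
    """Symmetric matrix of recombination fractions between ordered markers.

    calls is a list of markers in map order; each marker is a list of
    0/1 calls per individual, with None marking a missing call.
    """
    n = len(calls)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            pairs = [(a, b) for a, b in zip(calls[i], calls[j])
                     if a is not None and b is not None]
            if pairs:
                rf = sum(a != b for a, b in pairs) / len(pairs)
            else:
                rf = float("nan")
            matrix[i][j] = matrix[j][i] = rf
    return matrix


def is_well_ordered(matrix):
    """True if values never decrease when moving right from the diagonal."""
    for i, row in enumerate(matrix):
        right = row[i:]
        if any(later < earlier for earlier, later in zip(right, right[1:])):
            return False
    return True


def missing_percentage(calls):
    """Percentage of missing calls across all markers and individuals."""
    total = sum(len(marker) for marker in calls)
    missing = sum(call is None for marker in calls for call in marker)
    return 100.0 * missing / total


# Three toy markers scored in four individuals, listed in their true order.
a, b, c = [0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 0]
print(is_well_ordered(pairwise_recombination([a, b, c])))  # correct order
print(is_well_ordered(pairwise_recombination([a, c, b])))  # misplaced marker
```

Plotting the matrix as a color grid gives the kind of heat map described above; a misplaced marker appears as a stripe of high recombination close to the diagonal.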

Discussion
Willows show high-level variation in chromosome number, including diploid, tetraploid, hexaploid, and even dodecaploid forms [28,29]. Chromosome numbers vary not only among willow species but also among varieties of the same species, which has made willow improvement difficult. S. matsudana is an important arbor willow species, certain varieties of which have become potential strategic resources for coastal development in China because of their salt tolerance [9,10]. The chromosome number of S. matsudana has remained unknown, which has hindered studies on this species. To ensure accuracy in the subsequent experiments, the ploidy levels of the parents and the offspring were determined by flow cytometry [26], comparing their peak values with those of diploid S. suchowensis (2n = 2x = 38) [27,30]. Our results showed that both the parents and the 15 randomly selected offspring were tetraploid (2n = 4x = 76) (Figs 1 and 2), which provided a reference for genetic map construction in S. matsudana.
Genetic maps are the basis for QTL fine mapping of traits of interest, map-based gene cloning, and marker-assisted breeding. A suitable mapping population is the basis for successful construction of a genetic map. In this report, salt-sensitive "Yanjiang" and salt-tolerant "9901", two intraspecific varieties of S. matsudana, were chosen as female and male parents, and their F1 hybrid population was used as the mapping population for genetic map construction. The segregation population may present with a large number of variations due to
Genetic markers are powerful tools for genetic map construction, and the selection of molecular markers is key to its success. The molecular markers used to construct the genetic maps of willows have included amplified fragment length polymorphisms (AFLPs), simple sequence repeats (SSRs), and SNPs [13][14][15]. Among all the markers developed, SNPs are ideal for constructing genetic maps because of their abundance and even distribution across the genome. SNP markers were used in this report and were suitable for constructing the genetic map of S. matsudana.
The number of molecular markers is one of the indices for evaluating the quality of a genetic map. A small number of markers leads to a map with low density and large distances between markers, which decreases the effective utilization of the map. Although previous genetic maps of willows have been utilized to a certain degree [13][14][15], the maps contained only several hundred markers, which has hindered their effective utilization. SNP markers are used widely in genetic map construction not only because of the advantages of the markers themselves, but also because advances in high-throughput sequencing technologies allow a large number of SNPs to be detected in a short time [17][18][19][20]. RAD-seq and SLAF-seq are among the sequencing technologies that can detect SNPs on a large scale. RAD-seq is a reduced-genome sequencing technology that randomly digests genomic DNA with restriction enzymes followed by sequencing. SLAF-seq is a reduced-genome sequencing technology that digests genomic DNA with two restriction enzymes followed by sequencing of paired-end reads of a specific length. Compared with RAD-seq, the largest advantage of SLAF-seq is its repeatability. SLAF-seq has been used to construct the genetic maps of a number of species in the last few years [21][22][23][24]. This report also adopted SLAF-seq technology, which proved suitable for constructing the genetic map.
Using two intraspecific varieties of tetraploid S. matsudana ("Yanjiang" and "9901") as parents, 200 randomly selected F1 hybrid offspring as the mapping population, and high-throughput SLAF-seq, we obtained a total of 277,333 SLAFs, of which 99,526 were polymorphic (Table 3). After filtering, the final numbers of SLAFs were 6,737 for the integrated genetic map (Table 4), 4,575 for the male genetic map (Table 5), and 2,857 for the female genetic map (Table 6). The integrated genetic map consisted of 38 linkage groups, with total and average genetic distances of 5,497.45 cM and 0.82 cM, respectively (S1 Fig, Table 4). The number of SLAFs in each LG of the integrated genetic map ranged from 90 to 361. A total of 9,488 SNP markers were included on the integrated map. The map was of high quality based on evaluation of haplotype maps and linkage relationships (S1 and S2 Files). These results showed that SLAF-seq is a reliable technology for detecting SNPs on a large scale and constructing genetic maps.
The intraspecific F1 offspring used for map construction were derived from two tetraploid parents. Previous cytological and marker-segregation analyses show that tetraploid willows are allotetraploid, that is, tetraploids with two sets of chromosomes, each from a different ancestral species [15]. Because allotetraploids usually undergo disomic inheritance, approaches for their linkage analysis can be directly borrowed from those for diploids. However, these approaches do not characterize how meiotic pairing between homologous chromosomes differs from pairing between homeologous chromosomes, and therefore cannot fully reveal the genome structure, organization, and evolution of allotetraploids. By implementing a preferential pairing factor, Wu and colleagues have developed a series of statistical models that can identify the difference between homologous pairing and homeologous pairing [31,32]. These models can be used to analyze our tetraploid marker data and estimate the degree to which homologous chromosomes pair preferentially over homeologous chromosomes. This information could provide new insight into the evolution of Salix genomes.
In nature, the meiotic recombination rate may differ between the two sexes, a phenomenon called heterochiasmy [33]. Wu et al. [34] modified a linkage analysis model to estimate and test heterochiasmy using marker data generated from a controlled cross. Applying this model could provide additional insight into the genome-wide occurrence of heterochiasmy in Salix.
The high-density genetic map constructed in this study provides a basis for QTL mapping, map-based gene cloning, and molecular breeding of willows, and gives important information for the selection, breeding, protection, and utilization of salt-tolerant willow varieties; furthermore, the map will advance the genetic improvement of willows. Since no whole-genome sequences of willows are available, the 38 linkage groups could not be further divided into two sets. This genetic map will be improved once a whole-genome sequence of willows becomes available.

Conclusion
Using the intraspecific F1 hybrid population of salt-tolerant and salt-sensitive varieties of tetraploid S. matsudana as the mapping population, a total of 277,333 SLAFs were obtained, of which 99,526 were polymorphic, and 42,321 of the polymorphic markers could be used as candidate markers for constructing the genetic map. After filtering out low-quality markers and markers unsuitable for the F1 population, 6,737 high-quality SLAFs were used to construct a high-quality genetic map. The genetic map consisted of 38 LGs, with total and mean genetic distances of 5,497.45 cM and 0.82 cM, respectively. The mean sequencing depths of the mapped SLAFs for the parents and offspring were 190.38-fold and 28.68-fold, respectively. The results from this study will provide a basis for fine mapping of salt tolerance-related QTLs and molecular breeding of S. matsudana.