Sequence-Based Genotyping for Marker Discovery and Co-Dominant Scoring in Germplasm and Populations

doi:10.1371/journal.pone.0037565

Figure 1.

Overview of SBG.

(A) The sequencing complexity of genomic DNA is reduced using a combination of rare- and frequent-cutting restriction enzymes. (B) Sequencing adapters containing sample identification tags are ligated to the restriction fragments to construct SBG libraries. The SBG libraries are amplified and sequenced on Illumina sequencing platforms. Single-end sequencing yields read 1 only, while paired-end sequencing yields both read 1 and read 2. (C) SNPs are mined between the samples and simultaneously genotyped using the SBG bioinformatics analysis workflow.
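As a rough illustration of how the sample identification tags in panel (B) allow pooled reads to be traced back to their samples, the sketch below sorts reads by a leading tag. It is not the authors' pipeline: the function name `demultiplex`, the tag sequences, the sample names, and the assumptions that tags sit at the start of read 1, share one length, and must match exactly are all hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample):
    """Assign each read to a sample by its leading tag and strip the tag.

    Hypothetical helper: assumes every tag has the same length and that
    a tag must match exactly (no mismatch tolerance).
    """
    tag_len = len(next(iter(tag_to_sample)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag)
        if sample is not None:  # reads with an unknown tag are discarded
            by_sample[sample].append(insert)
    return dict(by_sample)


# Made-up tags and reads, for illustration only.
tags = {"ACGT": "parent_A", "TGCA": "parent_B"}
reads = ["ACGTGGATCCAA", "TGCAGGATCCAT", "NNNNGGATCCAA"]
result = demultiplex(reads, tags)
```

Here the third read carries no recognised tag and is dropped, while the first two are assigned to `parent_A` and `parent_B` with their tags removed.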

Figure 2.

Bioinformatics analysis workflow for SBG.

The Illumina data are first processed to remove low-quality reads. Reference sequences are then generated by clustering the unique reads present in the dataset. The reads are subsequently aligned to the reference sequences, and variants are called using the GATK Unified Genotyper. Lastly, the final set of SNPs and genotypes is generated by removing SNPs that do not meet the thresholds for percentage of missing data and expected genotypic frequencies.
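The final filtering step of the workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: genotypes are encoded as "AA", "AB", "BB" or `None` for missing, a SNP counts as having extreme genotypic frequencies when any single genotype class exceeds `max_class_freq`, and both threshold values are placeholders rather than the values used in the study.

```python
from collections import Counter


def missing_rate(genotypes):
    """Fraction of samples without a genotype call (None)."""
    return sum(g is None for g in genotypes) / len(genotypes)


def genotype_frequencies(genotypes):
    """Frequency of each genotype class among the called samples."""
    called = [g for g in genotypes if g is not None]
    counts = Counter(called)
    return {g: counts[g] / len(called) for g in ("AA", "AB", "BB")}


def keep_snp(genotypes, max_missing=0.2, max_class_freq=0.9):
    """Return True if the SNP passes both filters.

    The thresholds are hypothetical placeholders, not the study's values.
    """
    if missing_rate(genotypes) > max_missing:
        return False
    if all(g is None for g in genotypes):  # nothing called, nothing to score
        return False
    freqs = genotype_frequencies(genotypes)
    return all(f <= max_class_freq for f in freqs.values())


# Made-up genotype calls for three SNPs across eight samples.
snps = {
    "snp1": ["AA", "AB", "BB", "AA", "BB", "AB", "AA", "BB"],
    "snp2": ["AA", None, None, None, "BB", "AA", "AB", "BB"],
    "snp3": ["AA"] * 8,
}
kept = [name for name, calls in snps.items() if keep_snp(calls)]
```

In this toy input, `snp2` fails the missing-data filter (3 of 8 calls missing) and `snp3` fails the frequency filter (a single genotype class in every sample), so only `snp1` is retained.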

Table 1.

Summary statistics for generating the reference sequences.

Table 2.

Variant calling for the arabidopsis and lettuce sequence datasets.

Table 3.

Parent-based SNP genotyping in the arabidopsis and lettuce sequence datasets.

Table 4.

Parent-based SNP genotyping in the arabidopsis and lettuce sequence datasets after removing SNPs displaying extreme genotypic frequencies and an excessive number of missing genotypes.
