Fig 1.
Workflow of the modified UNEAK pipeline.
(A) Short reads are collapsed into tags for each inbred line. (B) The tags from all inbred lines in the association panel are collapsed again. (C) Tags found in only a single inbred line are removed as sequencing errors; only tags present in more than one inbred line are kept for further analysis. (D) Pairwise alignment is performed between all pairs of tags to form networks, allowing at most two mismatches on each end of the PE reads. (E) Error tags (shaded circles) are removed from the networks using the error tolerance rate (ETR) parameter. (F) Scoring of allelic tag pairs, using one network as an example. Each tag in the network is scored as an allele. “+” and “-” represent the presence and absence, respectively, of the corresponding allele in an inbred line, and the scores are tabulated for the association panel. (G) The co-occurrence or combination of any two alleles from a locus within a network forms a possible genotype, and a co-occurrence matrix containing all possible homozygous and heterozygous genotypes is created. (H) Allelic tag pairs are discriminated from complicated networks that are mixed with homologous tags using relative heterozygosity (HR). Tag pairs with HR smaller than the empirical value (0.2) are considered allelic tags and are then genotyped for each inbred line.
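Steps (A) to (D) of this workflow can be illustrated with a minimal Python sketch. It is not the UNEAK implementation: the function names, the representation of a paired-end tag as a tuple of two equal-length strings, and the use of a plain mismatch count in place of a real aligner are all assumptions made for illustration. The ETR filter (E) and the HR filter (H) are not sketched because the caption does not define them.

```python
from collections import Counter, defaultdict
from itertools import combinations


def collapse_reads(reads):
    """Step A: collapse identical reads of one inbred line into tags with counts."""
    return Counter(reads)


def tags_in_multiple_lines(tags_by_line):
    """Steps B-C: merge tags across lines and keep tags seen in more than one line."""
    lines_per_tag = defaultdict(set)
    for line, tags in tags_by_line.items():
        for tag in tags:
            lines_per_tag[tag].add(line)
    return {tag: lines for tag, lines in lines_per_tag.items() if len(lines) > 1}


def mismatches(a, b):
    """Count mismatching positions between two equal-length sequences (assumed)."""
    return sum(x != y for x, y in zip(a, b))


def is_neighbor(tag1, tag2, max_mismatch=2):
    """Step D: distinct tags whose two read ends each differ by at most max_mismatch."""
    return tag1 != tag2 and all(
        mismatches(end1, end2) <= max_mismatch for end1, end2 in zip(tag1, tag2)
    )


def build_networks(tags, max_mismatch=2):
    """Step D: group tags into networks (connected components of neighboring tags)."""
    adjacency = {tag: set() for tag in tags}
    for t1, t2 in combinations(tags, 2):
        if is_neighbor(t1, t2, max_mismatch):
            adjacency[t1].add(t2)
            adjacency[t2].add(t1)
    networks, seen = [], set()
    for tag in adjacency:
        if tag in seen or not adjacency[tag]:
            continue  # already assigned, or a tag with no neighbors
        stack, component = [tag], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        networks.append(component)
    return networks


# Toy data: three hypothetical inbred lines with paired-end reads (end1, end2).
reads_by_line = {
    "line1": [("ACGT", "TTGA"), ("ACGT", "TTGA"), ("GGGG", "CCCC")],
    "line2": [("ACGA", "TTGA"), ("ACGT", "TTGA")],
    "line3": [("ACGA", "TTGA")],
}
tags_by_line = {line: collapse_reads(r) for line, r in reads_by_line.items()}
kept = tags_in_multiple_lines(tags_by_line)
networks = build_networks(list(kept))
```

In this toy example the tag seen only in `line1` is discarded, and the two remaining tags, which differ at one position on the first read end, fall into a single network.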
Fig 2.
Alignments between the LD-based linkage map and the B. napus reference genome.
The horizontal axis represents the genetic position (cM) of the SNPs on the genetic linkage map based on the BnaNZDH population [21] and LD-based mapping. The vertical axis represents the physical position (Mb) according to the B. napus reference genome ‘Darmor-bzh’ [20].
Table 1.
LD (r²) and LD decay distance along chromosomes.
Fig 3.
LD decay in the A and C subgenomes in the 189 B. napus diverse lines.
The black dots indicate the r² values of all SNP pairs within chromosomes for each subgenome. The red curve represents a nonlinear function of r² against the physical distance between SNPs (Mb). The horizontal blue dashed line indicates the estimated background level of LD (r² = 0.26).
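The quantities plotted here can be sketched in a few lines of Python. The first function computes r² for two biallelic SNPs coded 0/1 across inbred lines, using the standard definition r² = D² / (pA(1 − pA)pB(1 − pB)). The second estimates a decay distance from binned mean r² values, which is a simpler stand-in for the nonlinear curve shown in the figure, not the fitting procedure used for it. The function names, the 0/1 coding, and the bin size are assumptions for illustration.

```python
def r_squared(snp_a, snp_b):
    """LD r² between two biallelic SNPs coded 0/1 across the same inbred lines."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(x * y for x, y in zip(snp_a, snp_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


def decay_distance(pairs, bin_size_mb, background):
    """Return the start (Mb) of the first distance bin whose mean r² is at or
    below the background level, or None if no bin reaches it.

    pairs: iterable of (distance_mb, r2) for SNP pairs on the same chromosome.
    """
    bins = {}
    for distance, r2 in pairs:
        index = int(distance // bin_size_mb)
        total, count = bins.get(index, (0.0, 0))
        bins[index] = (total + r2, count + 1)
    for index in sorted(bins):
        total, count = bins[index]
        if total / count <= background:
            return index * bin_size_mb
    return None


# Toy SNP pairs; 0.26 is the background LD level given in the caption.
toy_pairs = [(0.1, 0.8), (0.3, 0.6), (1.2, 0.3), (2.5, 0.2)]
estimated = decay_distance(toy_pairs, bin_size_mb=1.0, background=0.26)
```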
Fig 4.
Genome-wide distribution of haplotype blocks for the entire panel and two groups in the association mapping panel.
(A) The heatmap above indicates the LD distribution. The color density represents the average r² values of all SNP pairs in a sliding window of 500 kb. The map below indicates the distribution of haplotype blocks with length > 100 kb. The gray color represents the background and the blue color indicates the haplotype blocks. (B) The distributions of different types of haplotype blocks for the P1 and P2 groups across each chromosome. Gray rectangles represent genomic regions that do not contain any haplotype block. Red rectangles represent the P1-specific haplotype blocks and blue rectangles represent the P2-specific haplotype blocks. Orchid rectangles represent the common haplotype blocks with frequency difference ≤0.4, and black rectangles represent the group-preferential haplotype blocks with frequency difference >0.4.
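The color coding in panel (B) amounts to a small decision rule, sketched below. The caption does not say how "specific" blocks or block frequencies are determined, so this sketch assumes that a block absent from a group is passed as `None` and that each frequency is a value between 0 and 1. The function name and return labels are hypothetical.

```python
def classify_block(freq_p1, freq_p2, threshold=0.4):
    """Classify a haplotype block from its frequency in the P1 and P2 groups.

    A frequency of None means the block was not detected in that group (assumed).
    """
    if freq_p1 is None and freq_p2 is None:
        return "no block"            # gray
    if freq_p2 is None:
        return "P1-specific"         # red
    if freq_p1 is None:
        return "P2-specific"         # blue
    if abs(freq_p1 - freq_p2) > threshold:
        return "group-preferential"  # black
    return "common"                  # orchid


labels = [
    classify_block(0.7, None),
    classify_block(None, 0.3),
    classify_block(0.6, 0.5),
    classify_block(0.9, 0.2),
]
```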
Table 2.
Length of haplotype blocks (HBs) along chromosomes in the B. napus genome.
Fig 5.
The haplotype block associated with genes controlling glucosinolate content.
The gray windows represent the haplotype block for the entire panel, and the black dots indicate the PIC values of the SNPs in this region. The red boxes and the corresponding text indicate the positions and names of the genes possibly associated with the formation of the haplotype block.
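For reference, the polymorphism information content (PIC) of a marker is commonly computed from its allele frequencies as PIC = 1 − Σ pᵢ² − Σ_{i<j} 2pᵢ²pⱼ². The sketch below implements that standard formula. The caption does not state which PIC formula was used, so treating it as this one is an assumption.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content from allele frequencies that sum to 1."""
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1 - homozygosity - cross_term


# A biallelic SNP reaches its maximum PIC at equal allele frequencies.
max_biallelic_pic = pic([0.5, 0.5])
```

For a biallelic SNP with equal allele frequencies this gives 1 − 0.5 − 0.125 = 0.375, the upper bound for biallelic markers.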
Fig 6.
Manhattan and quantile-quantile plots of GWAS for seed oil content.
The BLUP values of oil content across three years were used as phenotypes in GWAS. (A) Manhattan plot for oil content. The dashed horizontal line indicates the Bonferroni-adjusted significance threshold (P = 9.7×10⁻⁵). Red dots above the threshold indicate the significant SNPs for oil content. (B) Quantile-quantile plot for oil content.
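The two plot elements described here can be sketched as follows. A Bonferroni-adjusted threshold divides a chosen significance level by the number of tests, and a quantile-quantile plot compares sorted observed −log10(P) values with those expected under a uniform null. The caption gives neither the significance level nor the number of tests behind 9.7×10⁻⁵, so the values in the usage lines are placeholders, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Bonferroni-adjusted P-value threshold for n_tests independent tests."""
    return alpha / n_tests


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs for a quantile-quantile plot.

    Expected quantiles use the midpoint rule (i + 0.5) / n; all P must be > 0.
    """
    observed = sorted(p_values)
    n = len(observed)
    return [
        (-math.log10((i + 0.5) / n), -math.log10(p))
        for i, p in enumerate(observed)
    ]


# Placeholder inputs, not values from the study.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1000)
points = qq_points([0.5, 0.05])

# Height of the dashed line in the Manhattan plot for the reported threshold.
line_height = -math.log10(9.7e-5)
```

On the −log10 scale used by Manhattan plots, the reported threshold of 9.7×10⁻⁵ corresponds to a line at about 4.01.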