Figure 1.
Strategy to identify genes targeted by selection during the domestication process.
A hypothetical example of a chromosome with polymorphism data is shown. The domestication of O. sativa from O. rufipogon should have changed traits such as shattering, seed dormancy, awn length, and grain quality, among others, and the QTLs related to these traits have been roughly mapped onto the genome (shown by triangles). In addition, genomic regions that contain domestication genes should show reduced levels of polymorphism due to selective sweeps. We reasoned that these regions should overlap with the domestication-related QTLs and contain genes with functions related to such QTLs (those indicated by colors).
Table 1.
Accessions and sampling locations.
Figure 2.
Summary of population genetics analyses.
In each panel, red is used for O. rufipogon, blue for indica, and green for japonica. (A) Distribution of π of protein-coding genes. The first 1,000 synonymous sites from the translation start site were used for each gene to correct for differences in gene length. The 13,471 genes with 1,000 sites with reliable SNP data were used. (B) Neighbor-joining tree of all sequenced strains. (C) Population structure estimated by PCA. (D) Population structure estimated by the Bayesian clustering program STRUCTURE (K = 3). The results for K = 2 to 6 are shown in Figure S1. (E) Decay of LD with distance. The bin size is 2,000 bp (measured up to 1,000 kb).
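The per-gene π in panel (A) can be illustrated with a minimal sketch, assuming each gene is available as a list of aligned, equal-length strings of synonymous sites (one string per strain). The function name `nucleotide_diversity` and the input layout are hypothetical and are not taken from the study's pipeline; the sketch only shows the standard definition of π as the mean number of pairwise differences per site, restricted to the first 1,000 sites.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes, n_sites=1000):
    """Mean number of pairwise differences per site over the first n_sites.

    haplotypes: aligned, equal-length strings (hypothetical input format).
    """
    # Truncate every sequence to the same number of leading sites.
    trimmed = [h[:n_sites] for h in haplotypes]
    if len(trimmed) < 2:
        raise ValueError("at least two sequences are required")
    length = len(trimmed[0])
    if length == 0 or any(len(h) != length for h in trimmed):
        raise ValueError("sequences must be non-empty and of equal length")

    # Count mismatches for every unordered pair of sequences.
    pairs = list(combinations(trimmed, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in pairs
    )
    return total_diffs / (len(pairs) * length)
```

Fixing the number of sites per gene, as the caption describes, keeps the per-gene estimates comparable regardless of gene length.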
Table 2.
Summary of SNPs and nucleotide diversity.
Figure 3.
Genome-wide analysis of population structure and π for chromosomes 1 (A) and 3 (B).
The upper panel shows the STRUCTURE results for K = 3; that is, three clusters are assumed, which generally (but not strictly) correspond to O. rufipogon (red), indica (blue), and japonica (green). Unaligned regions (mostly due to gaps) are in gray. The middle panel shows the genome-wide distributions of π for each taxon. O. rufipogon is in red, indica in blue, japonica in green, and O. sativa (indica and japonica pooled together) in black. The lower panel shows the statistical scores (logP) of the observed π (black) and θw (gray) of O. sativa/O. rufipogon, calculated by coalescent simulation [59]. The top 10 low-diversity regions are indicated by black boxes (S01, S02, and S04). The region that shows exceptionally high FST between tropical and temperate japonica is indicated by a green box (JJ01). Results for the other chromosomes are shown in Figure S3.
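For readers unfamiliar with θw, the following is a minimal sketch of the standard Watterson estimator per site. The function name `wattersons_theta` and its arguments are hypothetical; the sketch does not reproduce the coalescent simulation used to obtain the logP scores.

```python
def wattersons_theta(n_segregating, n_samples, n_sites):
    """Watterson's theta per site: S / (a_n * L).

    n_segregating: number of segregating sites S in the window.
    n_samples: number of sampled sequences n.
    n_sites: number of sites L analyzed in the window.
    """
    if n_samples < 2:
        raise ValueError("at least two sequences are required")
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    # a_n is the harmonic number sum_{i=1}^{n-1} 1/i.
    a_n = sum(1.0 / i for i in range(1, n_samples))
    return n_segregating / (a_n * n_sites)
```

Unlike π, this estimator depends only on the number of segregating sites and not on their frequencies, which is why the two statistics can respond differently to a selective sweep.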
Figure 4.
Spatial distribution of the level of polymorphism around the top 10 low diversity regions.
The black line shows the scaled π (O. sativa/O. rufipogon), and the gray line shows the scaled θw (O. sativa/O. rufipogon). The dotted line indicates the genome-wide average of the scaled π and θw. The positions of two known domestication genes, sh4 [28], [29] and PROG1 [30], [31], are indicated by red arrows. The positions of some other interesting candidate domestication genes, namely seed storage proteins and a number of transcription factors that contain fixed variants in O. sativa, are indicated by green and purple arrows, respectively. The numbers of QTLs related to awn length, shattering, seed dormancy, and quality (in blue, red, orange, and brown, respectively), and the number of annotated genes with fixed variants in O. sativa over the total number of annotated genes in each region, are indicated. The black arrowed lines indicate the approximate regions of selective sweeps (the selective sweep regions in S01 and S09 are likely to extend into the regions shown by broken lines, which could not be analyzed due to low sequencing coverage, etc.).
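The scaled statistics plotted here are ratios of the O. sativa value to the O. rufipogon value. A minimal sketch follows, assuming per-window estimates are held in two parallel lists with `None` for windows that could not be analyzed. The function names are hypothetical, and averaging the per-window ratios is an assumption made for illustration; the source does not state how the genome-wide average was computed.

```python
def scaled_diversity(domesticated, wild):
    """Per-window ratio of domesticated to wild diversity.

    Windows with missing data or zero wild diversity yield None.
    """
    ratios = []
    for d, w in zip(domesticated, wild):
        if d is None or w is None or w == 0:
            ratios.append(None)
        else:
            ratios.append(d / w)
    return ratios


def mean_ratio(ratios):
    """Average of the defined ratios (assumed form of the reference line)."""
    valid = [r for r in ratios if r is not None]
    if not valid:
        raise ValueError("no windows with a defined ratio")
    return sum(valid) / len(valid)
```

Windows whose ratio falls well below the reference value are the ones that stand out as candidate sweep regions in a plot of this kind.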
Figure 5.
Spatial distribution of the level of polymorphism around a typical low diversity region in O. sativa ssp. indica.
The statistical scores of the scaled π (indica/O. rufipogon) are shown. The dotted horizontal lines indicate the genome-wide average values of π. Indica is shown in blue, japonica in green, and O. sativa (indica and japonica pooled together) in black. Results for the entire genome are shown in Figure S4.
Figure 6.
Spatial distribution of the level of polymorphism around two typical low diversity regions in O. sativa ssp. japonica.
The statistical scores of the scaled π (japonica/O. rufipogon) are shown. The dotted horizontal lines indicate the genome-wide average values of π. Indica is shown in blue, japonica in green, and O. sativa (indica and japonica pooled together) in black. Results for the entire genome are shown in Figure S5.
Figure 7.
(A) Between japonica and indica. (B) Between temperate japonica and tropical japonica.