Iterative Usage of Fixed and Random Effect Models for Powerful and Efficient Genome-Wide Association Studies
The first setting contains no QTNs (left panel). The second setting restricts QTNs to the first three chromosomes and leaves the remaining two chromosomes as controls (middle panel). The last setting spreads QTNs across all five chromosomes (right panel). All SNPs under the first setting, and all SNPs on chromosomes four and five under the second setting, were used to derive the null distributions (a and b). The tests of SNPs on chromosomes one to three, including the SNPs used as QTNs, are displayed in the bottom middle panel (d). Under the third setting, SNPs are classified into QTN areas and non-QTN areas. A QTN area includes a QTN and its adjacent SNPs within 100,000 base pairs on each side; all remaining SNPs fall in non-QTN areas. The null distribution of the non-QTN SNPs is displayed at the top right (c), and the tests of the SNPs in QTN areas are displayed at the bottom right (e). Three statistical methods were examined: FarmCPU, Naïve (t-test), and MLM. The MLM included the top six PCs, derived from a random sample of 10% of the SNPs, as covariates to control for population structure. FarmCPU did not include PCs. The data are from a structured Arabidopsis thaliana population of 1,178 individuals with 214,545 SNP markers. P values are from association tests on a simulated trait controlled by 100 QTNs with a heritability of 50%; QQ plots over 100 replicates are displayed.
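The QTN-area rule used under the third setting can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `classify_snps`, the `(chromosome, position)` tuple layout, and the choice to treat the 100,000 bp boundary as inclusive are assumptions made for illustration only.

```python
from bisect import bisect_left

WINDOW_BP = 100_000  # flanking distance on each side of a QTN, as stated in the caption


def classify_snps(snps, qtns, window=WINDOW_BP):
    """Label each SNP as lying in a "QTN" area or a "non-QTN" area.

    snps, qtns: iterables of (chromosome, position) tuples (hypothetical layout).
    A SNP is in a QTN area if some QTN on the same chromosome lies within
    `window` base pairs of it (boundary treated as inclusive, an assumption).
    """
    # Collect sorted QTN positions per chromosome for binary search.
    qtn_positions = {}
    for chrom, pos in qtns:
        qtn_positions.setdefault(chrom, []).append(pos)
    for positions in qtn_positions.values():
        positions.sort()

    labels = []
    for chrom, pos in snps:
        positions = qtn_positions.get(chrom, [])
        # First QTN at or beyond the left edge of the window around this SNP.
        i = bisect_left(positions, pos - window)
        in_area = i < len(positions) and positions[i] <= pos + window
        labels.append("QTN" if in_area else "non-QTN")
    return labels


if __name__ == "__main__":
    example_qtns = [(1, 500_000)]
    example_snps = [(1, 450_000), (1, 700_000), (2, 450_000)]
    print(classify_snps(example_snps, example_qtns))
    # prints ['QTN', 'non-QTN', 'non-QTN']
```

SNPs labeled "non-QTN" would feed the null-distribution panel (c), and SNPs labeled "QTN" would feed panel (e). Sorting the QTN positions once per chromosome keeps each lookup logarithmic, which matters when classifying a marker set of the size described here.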