Figure 1.

A diagram for the three main sets of SNPs used in the text.

The first set of PCA SNPs is used to identify hidden population substructure. The set of genomic control SNPs is used to evaluate the over-dispersion factor in a given study and, in the proposed permutation procedure, to select the principal components (PCs) relevant for correcting population stratification (PS). The second set of PCA SNPs is used to validate findings from the first set. In applications, only the first set of PCA SNPs is recommended.

Table 1.

Tracy-Widom tests and associated P-values (in parentheses) for the significance of principal components.

Table 2.

Spearman rank correlation coefficients between pairs of principal component directions from the original PLCO prostate cancer and NHS breast cancer studies.
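
As a rough illustration of the statistic reported in this table, the sketch below computes a Spearman rank correlation between two numeric vectors using average ranks for ties. The function names and the idea of passing two PC vectors as plain lists are assumptions made for illustration; the source does not state how the authors computed the coefficients.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near 1 or -1 would indicate that two PC directions order their entries in nearly the same (or the opposite) way; the sign of a PC is arbitrary, so the absolute value is usually the quantity of interest.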

Figure 2.

Samples represented by their first two principal components.

Principal components (PCs; the 1st along the horizontal axis, the 2nd along the vertical axis) were obtained by applying PCA to the joint sample of the PLCO prostate cancer and NHS breast cancer studies. A) First two PCs for subjects from the PLCO prostate cancer study. B) First two PCs for subjects from the NHS breast cancer study.

Table 3.

Principal component comparisons (P-values) between cases and controls based on the Wilcoxon rank-sum test.

Figure 3.

Q-Q plot based on the test without PC adjustment.

For each of the four analyses, the Q-Q plot is based on P-values (in log10 scale) that correspond to the 1 d.f. Wald test on 475,116 testing autosomal SNPs by assuming an additive risk model (in logit scale) and without PC adjustment. A) Results for the original prostate cancer study (prostate cancer cases and controls from PLCO). B) Results for the reconstructed prostate cancer study using external controls (prostate cancer cases from PLCO, and external controls from NHS). C) Results for the original breast cancer study (breast cancer cases and controls from NHS). D) Results for the reconstructed breast cancer study using external controls (breast cancer cases from NHS, and external controls from PLCO).
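
For readers unfamiliar with how such a plot is built, the following minimal sketch pairs observed P-values with their expected values under the null hypothesis on the -log10 scale. The function name `qq_points` and the plotting position (i - 0.5)/n are assumptions chosen for illustration, not details taken from the article.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 P-values, largest first.

    Assumes every P-value lies in (0, 1].
    """
    n = len(p_values)
    # Smallest P-values give the largest -log10 values, so sort descending.
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    # Under the null, the i-th smallest of n uniform P-values is near (i - 0.5) / n.
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))
```

Points lying along the diagonal indicate agreement with the null distribution; a systematic upward departure across the whole range is the pattern usually attributed to over-dispersion.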

Table 4.

Over-dispersion factors and empirical type I errors for the association test without the correction of PS.
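
The two quantities in this table can be sketched as follows, assuming the common median-based genomic control estimator (median of the observed 1 d.f. chi-square statistics divided by the median of the null distribution). The authors' exact estimator is not given in this caption and may differ; the function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def over_dispersion_factor(chi2_stats):
    """Median-based inflation factor; values near 1 suggest no over-dispersion."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def empirical_type1_error(p_values, alpha=0.05):
    """Fraction of null SNPs whose P-value falls below the nominal level `alpha`."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

In a setting like the one described for Figure 1, both functions would be applied to test results on the genomic control SNPs, which are assumed not to be associated with the disease.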

Table 5.

Over-dispersion factor (and empirical type I error at the 0.05 significance level) for association tests with adjustment for various numbers of PCs.

Figure 4.

Q-Q plot based on the test with PC adjustment.

For each of the four analyses, the Q-Q plot is based on P-values (in log10 scale) that correspond to the 1 d.f. Wald test on 475,116 testing autosomal SNPs by assuming an additive risk model (in logit scale) and with PC adjustment. The PCs used in adjustment are selected by the proposed permutation procedure. A) Results for the original prostate cancer study (prostate cancer cases and controls from PLCO). B) Results for the reconstructed prostate cancer study using external controls (prostate cancer cases from PLCO, and external controls from NHS). C) Results for the original breast cancer study (breast cancer cases and controls from NHS). D) Results for the reconstructed breast cancer study using external controls (breast cancer cases from NHS, and external controls from PLCO).

Figure 5.

SNP ranking correlation in prostate cancer studies.

In each plot, SNPs' rankings based on the 1 d.f. Wald test on 475,116 testing autosomal SNPs without PC adjustment are compared with their rankings based on the 1 d.f. Wald test with adjustment for PCs chosen by the permutation procedure. The SNPs in blue are ranked among the top 5% by tests both with and without PC adjustment. The SNPs in green and orange are ranked among the top 5% by only one of the tests. A) Results based on the original prostate cancer study (prostate cancer cases and controls from PLCO). The 1st PC was chosen for PS correction. B) Results based on the reconstructed prostate cancer study using external controls (prostate cancer cases from PLCO, and external controls from NHS). The 1st, 2nd and 4th PCs were chosen for PS correction.
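
The three groups of SNPs highlighted in these plots can be derived along the following lines. This is a sketch under assumptions: the function names are hypothetical, SNPs are ranked by ascending P-value, and ties are broken by input order; the caption does not specify how the authors handled ties.

```python
def top_fraction(p_values, fraction=0.05):
    """Indices of the SNPs whose P-values rank within the smallest `fraction`."""
    k = max(1, int(len(p_values) * fraction))
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    return set(order[:k])


def classify_top_snps(p_unadjusted, p_adjusted, fraction=0.05):
    """Split top-ranked SNPs by whether one or both tests place them in the top set."""
    top_unadj = top_fraction(p_unadjusted, fraction)
    top_adj = top_fraction(p_adjusted, fraction)
    return {
        "both": top_unadj & top_adj,
        "unadjusted_only": top_unadj - top_adj,
        "adjusted_only": top_adj - top_unadj,
    }
```

A large "both" set relative to the two "only" sets would suggest that PC adjustment changes little about which SNPs are carried forward.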

Figure 6.

The conditional ranking distribution for the original PLCO prostate cancer study.

Each plot shows the histogram of ranks according to the test without PC adjustment for SNPs ranked within a given range by the test with the adjustment for the 1st PC (chosen by the proposed permutation procedure). The ranking ranges (%) are shown on the horizontal axis. The frequencies (%) are shown on the vertical axis. A) The histogram of ranks for SNPs ranked in the top 0–1% by the test with PC adjustment. B) The histogram of ranks for SNPs ranked in the top 1–2% by the test with PC adjustment. C) The histogram of ranks for SNPs ranked in the top 2–3% by the test with PC adjustment. D) The histogram of ranks for SNPs ranked in the top 3–4% by the test with PC adjustment. E) The histogram of ranks for SNPs ranked in the top 4–5% by the test with PC adjustment.

Figure 7.

The conditional ranking distribution for the reconstructed prostate cancer study using external controls.

Each plot shows the histogram of ranks according to the test without PC adjustment for SNPs ranked within a given range by the test with the adjustment for the 1st, 2nd, and 4th PCs (chosen by the proposed permutation procedure). The ranking ranges (%) are shown on the horizontal axis. The frequencies (%) are shown on the vertical axis. A) The histogram of ranks for SNPs ranked in the top 0–1% by the test with PC adjustment. B) The histogram of ranks for SNPs ranked in the top 1–2% by the test with PC adjustment. C) The histogram of ranks for SNPs ranked in the top 2–3% by the test with PC adjustment. D) The histogram of ranks for SNPs ranked in the top 3–4% by the test with PC adjustment. E) The histogram of ranks for SNPs ranked in the top 4–5% by the test with PC adjustment.

Table 6.

Discrepancy in SNP selection for the follow-up study between the permutation procedure and an alternative PC adjustment strategy.

Table 7.

Detection and correction for population stratification using various numbers of SNPs for PCA in the reconstructed study comparing prostate cancer cases from PLCO with controls from NHS.
