Figure 1.

Frequency Distribution Pattern of Estimated Average SNP HET Rates in the dbSNP Database

Blue bars show the distribution of SNP HET rates in dbSNP; the red line is the fitted beta distribution. A chi-square goodness-of-fit test (20 bins) did not reject the beta fit at the α = 0.01 level.

Figure 2.

Frequency Distribution of Estimated CVs of SNP HET Rates in the dbSNP Database

The x-axis is truncated at a CV of 500% for display; SNPs with higher CVs were still included in the distribution analysis. About 30% of SNPs had an estimated CV of ≤50%, fewer than 13% had an estimated CV of ≤20%, and fewer than 4% had a low CV (≤5%).

Table 1.

Relationship between Sample Size and Estimation of SNP Heterozygous Rates

Table 2.

Sample Sizes* for Testing SNP Heterozygous Rate at Different Thresholds

Table 3.

Average Sample Number of Sequential Probability Ratio Test Method for SNP HET Rate Test

Figure 3.

Probability Density Function (Bars) and Cumulative Density (Lines) of the Number of SNPs Needed To Have at Least One Heterozygous SNP Based on Simulation Results Using the Distribution Shown in Figure 1

At the α = 0.01 level, (A) and (C) show results for the left-hand side of Equation 2 ≥ 0.95, and (B) and (D) for ≥ 0.99. (A,B) exclude SNPs with HET rates > 0.5; (C,D) include all SNPs. The required number of SNPs ki can be estimated from the cumulative distribution function (cdf): [1 − P(k ≤ ki)] ≤ α. The simulation shows that if SNPs are used at random for LOH detection, then ki = 15 for threshold = 0.95 and ki = 20 for threshold = 0.99 (both calculated at the α = 0.01 level for the cdf).
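
The simulation described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that the left-hand side of Equation 2 is the probability that at least one of k SNPs is heterozygous, 1 − ∏(1 − HET_i), and it draws HET rates from a beta distribution with placeholder parameters (a = 1, b = 3), since the fitted values are not given here. The function names and parameters are hypothetical.

```python
import math
import random


def snps_until_threshold(threshold, draw_het, max_snps=10_000):
    """Count SNPs drawn until P(at least one is heterozygous) >= threshold."""
    p_none_het = 1.0
    for k in range(1, max_snps + 1):
        p_none_het *= 1.0 - draw_het()
        if 1.0 - p_none_het >= threshold:
            return k
    return max_snps  # safety cap, not expected to be reached


def required_snps(threshold, alpha=0.01, trials=5_000,
                  a=1.0, b=3.0, max_het=None, seed=0):
    """Estimate ki as the smallest k with 1 - P(K <= k) <= alpha.

    a and b are placeholder beta parameters (an assumption);
    max_het=0.5 mimics the exclusion used in panels (A,B).
    """
    rng = random.Random(seed)

    def draw_het():
        # Rejection sampling implements the optional HET-rate exclusion.
        while True:
            h = rng.betavariate(a, b)
            if max_het is None or h <= max_het:
                return h

    counts = sorted(snps_until_threshold(threshold, draw_het)
                    for _ in range(trials))
    # Empirical (1 - alpha) quantile of the simulated SNP counts.
    return counts[math.ceil((1.0 - alpha) * trials) - 1]
```

Calling `required_snps(0.95)` and `required_snps(0.99, max_het=0.5)` gives estimates for the two thresholds with and without the exclusion. Because the beta parameters are placeholders, the output is not expected to match the ki = 15 and ki = 20 reported above.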

Figure 4.

Relationship between Size of Chromosome Loss (kb) and Probability of Detection of LOH Assuming Use of All Chromosome 1, 3, 9, and 17 SNPs in HapMap

Blue and black lines are simulated results using the HET rate distribution pattern in dbSNP (Figure 1), assuming successful detection with k = 15 (95%) or k = 20 (99%) SNPs per lost segment, as given by Equation 2. Red lines represent the probability of detecting LOH using HET SNPs, based on real genotype data of 90 patients in the CEU group of the HapMap. Magenta lines represent the probability of LOH detection predicted by a fitted model (the HET SNP distribution was fitted with a negative binomial distribution). The simulation results (blue, black) indicate a detection probability of about 75%–85% for a 30 kb loss; the probability reaches 95% or higher when the loss size is approximately 50–60 kb or larger. The LOH size has to be approximately 250 kb or larger to achieve a detection probability of 99% or higher (except for Chromosome 9, which has slightly lower probability values). The results based on real genotyping data (red lines) indicate a detection probability of about 70% for a 30 kb loss; the probability reaches 95% or higher when the loss size is approximately 200 kb or larger, and the loss size has to be 450 kb or larger to achieve a 99% detection probability. The model-fitting results (magenta lines) appear to be a good approximation of the results based on genotyping data (red lines).

Figure 5.

Relationship among Inter-SNP Distance, Size of LOH, and Probability of Detection of LOH with Heterozygous SNPs, Assuming an Even Distribution of SNPs

Red lines: inter-SNP distance = 12 kb; green lines: inter-SNP distance = 120 kb; blue lines: inter-SNP distance = 200 kb. For each color, the three lines from bottom to top correspond to SNP HET rates of 0.2 (bottom), 0.3 (middle), and 0.4 (top).

(A) Results when the lost chromosomal region is smaller than the inter-SNP distance. For example, with a 100 kb lost region and a 200 kb inter-SNP distance, the LOH detection probabilities are 8%, 15%, and 20% for SNP HET rates of 0.2, 0.3, and 0.4, respectively (blue lines). The maximum detection probability is about 40% or less, depending on the SNP HET rate.

(B) Results when the lost region is larger than the inter-SNP distance. For a 300 kb lost region and a 120 kb inter-SNP distance, the calculated detection probability is about 40%, 60%, and 70% for SNP HET rates of 0.2, 0.3, and 0.4, respectively (green lines). As the lost region grows toward 900 kb, the LOH detection probability approaches 0.9 or higher when the SNPs have a HET rate of 0.3 or higher. Similarly, with a 200 kb inter-SNP distance and a 300 kb lost region, the LOH detection probabilities are about 28%, 40%, and 52% for SNP HET rates of 0.2, 0.3, and 0.4, respectively (blue lines). If the inter-SNP distance is 12 kb, the LOH detection probability is fairly high (more than 85%) when the loss size is about 100 kb or longer (red lines). These results assume that the SNPs selected and arrayed on the chips are evenly distributed on the chromosome, which gives the most optimistic detection probability for genome-wide screening. If the selected SNPs on a chip are not evenly distributed, the detection probability will be reduced. If all currently available SNPs are used (arrayed on a chip), the detection probabilities follow the pattern shown in Figure 4.
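
The caption does not state the formula behind these curves. One simple model that comes close to the quoted numbers is sketched below, under three labeled assumptions: SNPs are evenly spaced, the lost region is placed uniformly at random, and each covered SNP is independently heterozygous with the given HET rate. The function name is hypothetical.

```python
import math


def loh_detection_probability(loss_kb, spacing_kb, het_rate):
    """P(at least one heterozygous SNP falls inside the lost region).

    Assumed model (not taken from the source): with evenly spaced SNPs,
    a randomly placed region of length L covers floor(L/d) SNPs, or one
    more with probability equal to the fractional part of L/d.
    """
    ratio = loss_kb / spacing_kb
    n_low = math.floor(ratio)
    frac = ratio - n_low
    # Probability that every covered SNP is homozygous (LOH goes undetected).
    p_miss = ((1.0 - frac) * (1.0 - het_rate) ** n_low
              + frac * (1.0 - het_rate) ** (n_low + 1))
    return 1.0 - p_miss


if __name__ == "__main__":
    for het in (0.2, 0.3, 0.4):
        print(het, loh_detection_probability(300, 120, het))
```

For a 300 kb loss with 120 kb spacing this gives roughly 42%, 58%, and 71% for HET rates of 0.2, 0.3, and 0.4, and for 200 kb spacing roughly 28%, 40%, and 52%, close to the values quoted in (B). For the 100 kb loss with 200 kb spacing in (A) it gives 10%, 15%, and 20%, slightly above the 8% quoted for a HET rate of 0.2, so the authors' calculation may differ in detail.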

Figure 6.

Spatial Distribution Pattern of LOH Detection Probabilities with Heterozygous SNPs on Chromosomes 1, 3, 9, and 17 for Various Loss Sizes

The HapMap genotype data are from two randomly selected individuals from the CEU group.
