Conceived and designed the experiments: DS. Performed the experiments: DS. Analyzed the data: DS. Contributed reagents/materials/analysis tools: AA CNR. Wrote the paper: DS AA CNR.

The authors have declared that no competing interests exist.

For samples of admixed individuals, it is possible to test for both ancestry effects via admixture mapping and genotype effects via association mapping. Here, we describe a joint test called BMIX that combines admixture and association statistics at single markers. We first perform high-density admixture mapping using local ancestry. We then perform association mapping using stratified regression, wherein for each marker genotypes are stratified by local ancestry. In both stages, we use generalized linear models, providing the advantage that the joint test can be used with any phenotype distribution with an appropriate link function. To define the alternative densities for admixture mapping and association mapping, we describe a method based on autocorrelation to empirically estimate the testing burdens of admixture mapping and association mapping. We then describe a joint test that uses the posterior probabilities from admixture mapping as prior probabilities for association mapping, capitalizing on the reduced testing burden of admixture mapping relative to association mapping. By simulation, we show that BMIX is potentially orders-of-magnitude more powerful than the MIX score, which is currently the most powerful frequentist joint test. We illustrate the gain in power through analysis of fasting plasma glucose among 922 unrelated, non-diabetic, admixed African Americans from the Howard University Family Study. We detected loci at 1q24 and 6q26 as genome-wide significant via admixture mapping; both loci have been independently reported from linkage analysis. Using the association data, we resolved the 1q24 signal into two regions. One region, upstream of the gene

Most genome-wide association studies performed to date have focused on individuals with European ancestry. Admixed African Americans tend to have a disproportionately high risk for many common, complex diseases. Disease or trait mapping in admixed individuals can benefit from joint analysis of ancestry and genotype effects. We developed a joint test that is more powerful than either admixture mapping of ancestry effects or association mapping of genotype effects performed separately. Our joint test fully capitalizes on the reduced testing burden of admixture mapping relative to association mapping. The test is based on generalized linear models and can be performed using standard statistical software. We illustrate the increased power of the joint test by detecting two loci for fasting plasma glucose in a sample of unrelated African American individuals, neither of which was detected as significant by traditional association analysis.

Genome-wide association studies are conventionally performed with an implicit assumption that the prior probability of association is uniform across loci

Three approaches to combine admixture mapping and association mapping have been described. Tang

We illustrate application of the joint test by analyzing fasting plasma glucose among 922 non-diabetic, admixed African Americans from the Howard University Family Study (HUFS) conducted in the Washington, D.C., metropolitan area. The prevalence of type 2 diabetes (diagnosed mainly on the basis of elevated fasting plasma glucose levels) among adults in the USA is currently 11.3%, ranging from 10.2% among European Americans to 18.7% among African Americans

We first describe the characterization of local ancestry for the 922 admixed African Americans using 797,831 autosomal SNPs. The mean proportion of African ancestry was 0.797 (95% confidence interval 0.770 to 0.819, Supplementary

To empirically estimate the testing burdens of admixture mapping and association mapping, we fit autoregressive models and estimated the effective number of tests based on autocorrelation. For example, for the first individual in our sample, there were five ancestry switches along chromosome 22 (

For this individual, the chromosome is a mosaic of six segments, reflecting five ancestry switches.
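As a rough illustration of how ancestry switches and autocorrelation relate to the testing burden, the sketch below counts switches along a hypothetical chromosome and applies a first-order autoregressive approximation to the effective number of tests. The ancestry vector, the function names, and the AR(1) closed form n(1 - rho)/(1 + rho) are assumptions made for illustration; the text states only that autoregressive models were fit, not their order or the exact estimator.

```python
from statistics import mean

def count_switches(ancestry):
    """Count markers whose local ancestry differs from the previous marker."""
    return sum(1 for prev, cur in zip(ancestry, ancestry[1:]) if prev != cur)

def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a sequence."""
    m = mean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    den = sum((v - m) ** 2 for v in values)
    return num / den if den > 0 else 0.0

def effective_number_of_tests(values):
    """AR(1) approximation (an assumption): n_eff = n * (1 - rho) / (1 + rho)."""
    rho = lag1_autocorrelation(values)
    return len(values) * (1 - rho) / (1 + rho)

# Hypothetical local ancestry (copies of African ancestry: 0, 1, or 2)
# at 200 consecutive markers on one chromosome.
ancestry = [2] * 40 + [1] * 30 + [2] * 50 + [1] * 20 + [0] * 25 + [1] * 35

switches = count_switches(ancestry)   # five switches
segments = switches + 1               # hence a mosaic of six segments
n_eff = effective_number_of_tests(ancestry)
print(switches, segments, round(n_eff, 2))
```

Because local ancestry changes only at switch points, neighboring markers are highly correlated and the effective number of tests is far smaller than the number of markers.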

Adjusting for global ancestry will not completely control confounding due to local ancestry in association mapping

If the posterior probability of a local ancestry effect is smaller than the prior probability of association in the absence of performing admixture mapping,

The change in association sample size as a function of

We also compared the average power of our joint test to the MIX score

| Local Ancestry Odds Ratio | Genotype Odds Ratio | BMIX | MIX |
| --- | --- | --- | --- |
| 1.000 | 1.000 | 0.0004 | 0.0002 |
| 1.200 | 1.000 | 0.0263 | 0.0004 |
| 1.000 | 1.200 | 0.0508 | 0.0289 |
| 1.200 | 1.200 | 0.1804 | 0.0670 |
| 1.200 | 0.833 | 0.1610 | 0.0220 |
| 1.500 | 1.000 | 0.7006 | 0.0070 |
| 1.000 | 1.500 | 0.2954 | 0.3588 |
| 1.500 | 1.500 | 0.8572 | 0.3777 |
| 1.500 | 0.667 | 0.8829 | 0.1850 |
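To make the comparison in the table easier to read, the following sketch computes the ratio of the BMIX value to the MIX value for each scenario. The numbers are copied from the table; the variable and function names are illustrative only.

```python
# (local ancestry OR, genotype OR, BMIX, MIX), copied from the table
rows = [
    (1.000, 1.000, 0.0004, 0.0002),
    (1.200, 1.000, 0.0263, 0.0004),
    (1.000, 1.200, 0.0508, 0.0289),
    (1.200, 1.200, 0.1804, 0.0670),
    (1.200, 0.833, 0.1610, 0.0220),
    (1.500, 1.000, 0.7006, 0.0070),
    (1.000, 1.500, 0.2954, 0.3588),
    (1.500, 1.500, 0.8572, 0.3777),
    (1.500, 0.667, 0.8829, 0.1850),
]

def power_ratios(table):
    """Return (ancestry OR, genotype OR, BMIX / MIX) for each scenario."""
    return [(anc, gen, bmix / mix) for anc, gen, bmix, mix in table]

ratios = power_ratios(rows)
for anc, gen, ratio in ratios:
    print(f"ancestry OR {anc:.3f}, genotype OR {gen:.3f}: BMIX/MIX = {ratio:.2f}")
```

The largest ratios occur when there is a local ancestry effect but no genotype effect (genotype odds ratio of 1.000). The one scenario with a ratio below 1 has a genotype odds ratio of 1.500 and no local ancestry effect. In the first row both odds ratios equal 1, so those entries are rejection rates under the null.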

We performed admixture mapping for fasting plasma glucose by linearly regressing fasting plasma glucose on local ancestry, adjusted for age, global ancestry, and sex. We detected two genome-wide significant loci (

The y-axis shows the posterior probability that a locus affects the phenotype. The red line indicates the genome-wide significance level.

We performed association mapping for fasting plasma glucose by linearly regressing fasting plasma glucose on genotype stratified by local ancestry, assuming an additive genotype model, adjusted for age, global ancestry, and sex. The genomic control inflation factor was 1.009 (Supplementary

The y-axes indicate the posterior probability that a locus affects the phenotype. The red lines indicate the genome-wide significance level. (A) Association testing under the uniform prior probability. (B) Joint ancestry and association testing.
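For readers unfamiliar with the genomic control inflation factor reported for the association scan, a common definition is the median of the observed 1-degree-of-freedom chi-square statistics divided by the median of the chi-square distribution with 1 degree of freedom (about 0.4549). The sketch below uses that definition; the text does not state which estimator was used, and the input statistics are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control_lambda(chi2_stats):
    """Median-based inflation factor; values near 1 suggest little inflation."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN

# Hypothetical association test statistics (1 degree of freedom each)
chi2_stats = [0.10, 0.30, 0.459, 0.90, 2.50]
lam = genomic_control_lambda(chi2_stats)
print(round(lam, 3))
```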

To functionally annotate these two SNPs, we first identified the intervals based on linkage disequilibrium surrounding these two SNPs containing all SNPs with pairwise

We present a joint test of ancestry and association applicable to mapping disease susceptibility loci or trait loci in admixed individuals. Although we proceed through the calculations sequentially by performing admixture mapping first followed by association mapping, equivalence to a joint test can be seen by recognizing that the joint probability of ancestry and association effects equals the product of the probability of an ancestry effect and the probability of association conditional on ancestry. Conditional independence of association given ancestry is necessary for validity of the joint test. For any given marker, admixture mapping is based on the “between” component of local ancestry strata and association mapping is based on the “within” component of local ancestry strata, so that even though both admixture mapping and association mapping are fundamentally based on observed genotypes the data are not used twice. Our joint test is based on generalized linear models and so can be performed with standard statistical software. The admixture mapping step can also accommodate a case-only test
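The factorization behind the sequential calculation can be written compactly. Here $A$ denotes the event that a locus has an ancestry effect and $G$ the event that it has a genotypic association effect; these symbols are notation introduced for this illustration only.

```latex
\Pr(A \cap G) = \Pr(A)\,\Pr(G \mid A)
```

Admixture mapping supplies the first factor and association mapping stratified by local ancestry supplies the second, which is why performing the two analyses in sequence is equivalent to a joint test.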

In our joint test, admixture mapping and association mapping are both genome-wide at equally high marker density. Every marker in a sample is tested by both admixture mapping and association mapping,

Compared to previous approaches, our joint test has several favorable characteristics. The approach of Deo

By sequentially updating the probability that a locus is a trait locus based on ancestry with the probability that the locus is a trait locus based on genotypic association conditional on ancestry, our procedure estimates the joint probability that a locus has ancestry and association effects. At chromosome 1q24, association mapping resolved the admixture signal into two regions,

In summary, we describe a joint test of ancestry and association for mapping disease susceptibility loci and trait loci in admixed individuals. Key properties of our test are that it maintains conditional independence of genotype and local ancestry and that it fully capitalizes on the reduced testing burden of admixture mapping relative to association mapping, making it more powerful than all existing joint tests. Upon application to fasting plasma glucose in African Americans, we identified two loci at genome-wide significance levels, whereas conventional association mapping yielded no new discoveries. Both loci have been identified previously by genome-wide linkage analysis, providing evidence of replication and indicating that linkage analysis, admixture mapping, and association mapping are all converging on the same loci. By taking advantage of fine-mapping afforded by association mapping and background linkage disequilibrium, we resolved one locus into two separate intervals. One of these intervals contains a promoter with multiple binding sites for transcription factors previously implicated in type 2 diabetes. The fact that both loci were discovered via admixture mapping directly implies that the genetic architecture of fasting plasma glucose is different in individuals of European ancestry

First, we briefly review Bayes' Theorem

Let the likelihood function

Two main quantities in Bayesian inference are Bayes factors and posterior probabilities. One advantage of Bayes factors over

The algorithm consists of six steps.

Using generalized linear regression, perform admixture mapping by regressing phenotype on local ancestry, adjusting for global ancestry (and other covariates as appropriate). For example, the model is specified in terms of the phenotype of the i^{th} individual, the local ancestry of the i^{th} individual at the j^{th} marker, and the global ancestry of the i^{th} individual (local ancestry averaged across all markers). We require the

Convert the

Using generalized linear regression, perform association mapping by regressing phenotype on genotype, stratified by local ancestry, adjusting for global ancestry (and other covariates as appropriate). For example, the model is specified in terms of the phenotype of the i^{th} individual in the k^{th} stratum and the genotype of the i^{th} individual in the k^{th} stratum at the j^{th} marker (

Combine the regression coefficients for genotype for the strata of local ancestry using inverse variance-weighted fixed effects. The pooled estimate of the genotype effect is given by

Obtain association

Convert the association
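Two generic building blocks of the steps above can be sketched in a few lines: inverse variance-weighted fixed-effects pooling of the stratum-specific genotype coefficients, and updating a prior probability with a Bayes factor so that the posterior from admixture mapping serves as the prior for association mapping. This is a minimal sketch in Python, not the authors' R code. The function names, the example coefficients, the prior, and both Bayes factors are hypothetical, and the conversion from test statistics to Bayes factors is not shown.

```python
import math

def pool_fixed_effects(betas, ses):
    """Inverse variance-weighted fixed-effects pooling across ancestry strata."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se

def posterior_from_bayes_factor(prior, bayes_factor):
    """Bayes' theorem in odds form: posterior odds = Bayes factor * prior odds."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical genotype coefficients from the three local ancestry strata
betas = [0.10, 0.20, 0.30]
ses = [0.10, 0.05, 0.10]
pooled_beta, pooled_se = pool_fixed_effects(betas, ses)

# Hypothetical prior and Bayes factors (alternative versus null)
prior = 1e-4
bf_admixture = 50.0
bf_association = 200.0

# Posterior from admixture mapping becomes the prior for association mapping
post_admixture = posterior_from_bayes_factor(prior, bf_admixture)
post_joint = posterior_from_bayes_factor(post_admixture, bf_association)
print(pooled_beta, pooled_se, post_admixture, post_joint)
```

Chaining the two updates gives the same result as a single update with the product of the two Bayes factors, which is one way to see that the sequential calculation amounts to a joint test.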

All calculations were performed in R

The procedure to simulate admixed data under a vicariance model has been detailed previously _{ST}

To investigate whether adjusting for local ancestry is sufficient to control confounding due to global ancestry, we simulated two independent SNPs for a sample of 1,000 admixed individuals. The first SNP was the test SNP and the second SNP was untested. We estimated global ancestry by averaging local ancestries.
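The general shape of such a simulation can be sketched as follows. This is an illustrative sketch, not the authors' simulation procedure: the Beta distribution for individual ancestry proportions, the allele frequencies, the random seed, and all names are assumptions.

```python
import random

def simulate_individual(rng, snp_freqs, alpha=8.0, beta=2.0):
    """Simulate local ancestry and genotype at each SNP for one admixed individual.

    snp_freqs: one (African, European) allele-frequency pair per SNP (hypothetical).
    alpha, beta: Beta parameters for the ancestry proportion (assumed; mean 0.8).
    """
    theta = rng.betavariate(alpha, beta)  # individual's true African ancestry proportion
    local, genotype = [], []
    for freq_african, freq_european in snp_freqs:
        copies = alleles = 0
        for _ in range(2):  # two chromosomes, drawn independently at each SNP
            is_african = rng.random() < theta
            freq = freq_african if is_african else freq_european
            copies += int(is_african)
            alleles += int(rng.random() < freq)
        local.append(copies)
        genotype.append(alleles)
    # Global ancestry estimated by averaging local ancestries, as in the text
    global_estimate = sum(local) / (2 * len(local))
    return local, genotype, global_estimate

rng = random.Random(1)
snp_freqs = [(0.8, 0.2), (0.6, 0.4)]  # test SNP, untested SNP (hypothetical)
sample = [simulate_individual(rng, snp_freqs) for _ in range(1000)]
mean_global = sum(ind[2] for ind in sample) / len(sample)
print(round(mean_global, 3))
```

Each element of `sample` holds the local ancestries and genotypes at the two SNPs together with the estimated global ancestry, which is the input needed to compare regressions adjusted for local ancestry, global ancestry, or both.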

Ethical approval was obtained from the Howard University Institutional Review Board and written informed consent was obtained from each participant.

We used BMIX to analyze fasting plasma glucose among 922 non-diabetic, unrelated African Americans from the HUFS (Supplementary

Average proportion of African ancestry across the genome, estimated using LAMPANC version 2.3

(EPS)

Quantile-quantile plot for association

(EPS)

Adjusting for local ancestry does not control confounding due to global ancestry.

(DOC)

Association results for 1q24 stratified by local ancestry.

(DOC)

Clinical characteristics of the 922 participants.

(DOC)

R code implementing the BMIX joint test.

(TXT)

We thank the four anonymous reviewers for their comments. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official view of the National Institutes of Health.