• Loading metrics

High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants

High Trans-ethnic Replicability of GWAS Results Implies Common Causal Variants

  • Urko M. Marigorta, 
  • Arcadi Navarro


Genome-wide association studies (GWAS) have detected many disease associations. However, the reported variants tend to explain small fractions of risk, and there are doubts about issues such as the portability of findings over different ethnic groups or the relative roles of rare versus common variants in the genetic architecture of complex disease. Studying the degree of sharing of disease-associated variants across populations can help in solving these issues. We present a comprehensive survey of GWAS replicability across 28 diseases. Most loci and SNPs discovered in Europeans for these conditions have been extensively replicated using peoples of European and East Asian ancestry, while the replication with individuals of African ancestry is much less common. We found a strong and significant correlation of Odds Ratios across Europeans and East Asians, indicating that underlying causal variants are common and shared between the two ancestries. Moreover, SNPs that failed to replicate in East Asians map into genomic regions where Linkage Disequilibrium patterns differ significantly between populations. Finally, we observed that GWAS with larger sample sizes have detected variants with weaker effects rather than with lower frequencies. Our results indicate that most GWAS results are due to common variants. In addition, the sharing of disease alleles and the high correlation in their effect sizes suggest that most of the underlying causal variants are shared between Europeans and East Asians and that they tend to map close to the associated marker SNPs.

Author Summary

Describing and identifying the genetic variants that increase risk for complex diseases remains a central focus of human genetics and is fundamental for the emergent field of personalized medicine. Over the last six years, GWAS have revolutionized the field, discovering hundreds of disease loci. However, with only a handful of exceptions, the causal variants that generate the associations unveiled by GWAS have not been identified, and their frequency and degree of sharing across populations remains unknown. Here, we present a comprehensive comparison of GWAS results designed to try to understand the nature of causal variants. By examining the results of GWAS for 28 diseases that have been performed with peoples of European, East Asian, and African ancestries, we conclude that a large fraction of associations are caused by common causal variants that should map relatively close to the associated markers. Our results indicate that many of the disease risk variants discovered by GWAS are shared across Eurasians.


Genome-wide association studies (GWAS) have detected hundreds of risk alleles [1], generating novel biological knowledge and widening the range of diagnostic and treatment tools [2]. However, the reported effect sizes of these variants are small and their impact in individual risk prediction remains modest, raising doubts about the relevance of GWAS results [1], [3][6]. Some of the most hotly debated topics are how to account for the unexplained risk [4]; what may be the role of rare variants as a source of synthetic GWAS results [7][10]; and up to what extent GWAS results are portable between populations [11][15].

Answering to these questions is pressing for two reasons. First, the description of the genetic architecture of disease lies at the foundation of personalized medicine and, in particular, finding predictors of individual disease risk that could be applicable to different ancestries would be a major step forward [1] and would also allow the development of prioritizing strategies to identify disease-associated loci. Second, if sharing of causal variants across populations were common, it would suggest trans-ethnic mapping as a powered tool that would take profit of population heterogeneity in LD and allele frequencies to identify the causal variants underlying disease susceptibility [1], [15].

The available reports on the allele frequency distribution of GWAS risk variants point at an excess of common variants [16] that, at least for some particular diseases [17], present consistent effects across populations. If repeated, these observations constitute empirical evidence against rare alleles as a source of synthetic associations and would point at common variants that are in LD with the associated tagSNPs in all populations. However, such studies have not been generalized across different diseases and, currently, most evidence accumulating in the field comes from either re-sequencing efforts aimed to capture rare variants [18] or multi-ethnic replication efforts for a few risk variants [13], [15], [17], [19]. In addition, most meta-analysis of GWAS data, that could shed light on these issues, either have ignored population heterogeneity [2], [20] or have focused on a limited set of traits [21] and GWAS [22].

By compiling data from 299 GWAS for 28 different diseases, we build a large database of discovery-and-replication patterns of SNPs associated with complex disease. We evaluate the extent to which risk variants discovered in Europeans replicate in posterior studies performed on individuals of European, East Asian and African ancestries and compare the risk effect sizes found across populations. We also examine the extent up to which statistical power and differences in Linkage Disequilibrium among populations explain replication failures. Our results describe the patterns of replicability of GWAS across disease, evaluate how transportable these results are across populations and allow for inferences about the relative roles of rare and common variants in explaining current GWAS results.


We started by downloading all the associations in the GWAS Catalog [23] (last accessed in February 2012, see Materials and Methods), which represents a total of 7,145 associations with P<10−5 reported in 1,171 papers. To study trans-ethnic replicability across diseases, we first focused on GWAS performed upon the two most studied ancestral groups: Europeans and East Asians. We avoided quantitative traits, such as height or BMI, because they could be subject to different evolutionary pressures than disease traits and, thus, may present different replicability patterns. We selected for analysis diseases for which GWAS had been performed several times. In particular, we required (i) two or more GWAS with peoples of the same ancestry group; and (ii) at least one GWAS with subjects from the other ancestry group. A total of 28 diseases and 277 GWAS (206 European and 71 East Asian) fulfilled our criteria (average ∼10 papers per disease, range = 3–27 papers; see Materials and Methods and Tables S1 and S2). We further performed an exhaustive search of the literature to detect any GWAS published before February 2012 but not available in the Catalog (see Materials and Methods). This effort yielded six new GWAS that we included in our database (Table S3 and Table S2, in yellow). Finally, and to be as comprehensive as possible, we included 16 GWAS performed on peoples of African American ancestry that were available for eight of the previously ascertained 28 diseases. This rendered a final dataset of 299 GWAS papers reporting 419 associations to 28 diseases (Table S2). Out of these, we ascertained 190 SNPs initially reported as genome-wide significant (P<5×10−7, or P<5×10−8 if the study included imputed SNPs) in European GWAS and for which one or more replication attempts had been performed in subsequent European, East Asian and/or African GWAS (181, 225 and 61 attempts, respectively, Tables S4 and S5). We studied patterns of replication across studies, using the criterion that a replication was successful if the same risk allele achieved P<0.05. To obtain that information we examined every individual paper, since the GWAS Catalog records only P<10−5 (see Materials and Methods for a detailed description of the filtering process). Our database is available at

Replicability rates and sharing across Europeans and East Asians

Replicability rates are high within Europeans, with 155 successful out of 181 attempts (85.6%), when only 9 positive replications (∼5%) would be expected under the null hypothesis of no association (binomial test, P<10−16). This excess was robust to the significance threshold (e.g. 122 observed vs. 0.18 expected if only replication attempts achieving P<0.001 are considered successful and 56 observed vs. 1.8×10−5 expected for a threshold of P<10−7, Table S5). Moreover, replicability rates within Europeans approach 100% when accounting for statistical power. For the 168 attempts for which we could calculate the power to replicate the original finding (Table S5), we observed 147 positive replications, which is almost identical to the expectation of 149.1 positive replications given that average power is 89.1% (see Materials and Methods). This is expected, since most GWAS already contain an internal replication phase [1], [24]. Interestingly, all diseases presented similarly high replicability patterns, with no traces of heterogeneity in replicability (Table S6). These results were consistent with previous partial reports of replication for individual diseases [17], [19] and confirmed that the subset of 190 genome-wide significant SNPs map in loci truly associated with disease in peoples of European ancestry.

Next, we considered the replication attempts in East Asian populations. Out of 225 replication attempts, 103 were successful at a P<0.05 threshold (45.8%). This replicability departs significantly from the null expectation (103 vs. 11.3 expected, P<10−16) and is robust across replication thresholds (e.g. 49 observed vs. 0.23 expected for P<0.001 and 19 observed vs. 2.3×10−5 expected for P<10−7). Nevertheless, that figure is smaller than for Europeans, which can be expected since East Asian GWAS tend to have smaller sample sizes and, thus, less power [15]. We tested this hypothesis by calculating replicability rates after controlling for statistical power. First, we focused on the 81 attempts with ≥80% power to replicate the Odds Ratio (OR) found in Europeans (Table S5 and Materials and Methods). For that subset, replicability increases dramatically to 76.5% (62 out of 81 attempts are successful with a P<0.05 threshold). Second, we calculated that at most 132 positive replications would be expected out of statistical power (59% on average for the 225 attempts in East Asians, Table S5). The 103 observed replications thus correspond to an effective replicability rate of 77.9%, which suggests that a noticeable fraction of GWAS associations are shared across Eurasians. Again, we found no heterogeneity across diseases (Table S7).

Finally, we considered replication attempts performed upon individuals of African ancestry. Even if GWAS on individuals of non-Eurasian ancestry are scarce, we could find 16 GWAS performed on African Americans from which we gathered a total of 73 replication attempts (61 and 12 for SNPs discovered in Europeans and East Asians, respectively; see Materials and Methods and Table S1). Overall, we observed a low replicability rate (9.6%, 7 out of 73 attempts) that was not attributable to lack of statistical power (59.2% on average, see Table S5). This figure would cast doubts about the sharing of causal variants between Eurasians and Africans, but the inherent limitations of this part of the analysis warrant for caution. For instance, lower levels of LD in Africans than in Eurasians make it difficult to ensure that potentially shared causal variants are tagged by the same marker SNP [25]. Additionally, the 16 African-American GWAS form a rather small dataset corresponding to only five diseases (asthma, cardiovascular disease, hypertension, prostate cancer and type 2 diabetes). A complete study of African replicabilities will be possible when more studies are available. In the meantime, we focused on data gathered from European and East Asian GWAS.

Odds Ratios for Europeans and East Asians are strongly correlated

The observed rates of trans-ethnic replication between Europeans and East Asians indicate that a considerable fraction of risk loci associated with the 28 diseases is shared between the two Continental groups. As to the sharing and frequency of risk variants, it can be explored even if the causal variants themselves remain undiscovered. First, the possibility that the same causal variants underlie association in the two continents is reinforced by the strong correlation between the ORs for SNPs discovered in European GWAS and their replication OR in the largest East Asian study (Spearman's ρ = 0.82, P<10−16, Figure 1; we used the log(OR) to ensure symmetry around 1). Also, the slope of the linear regression of the two log(OR) is very close to 1 (1.03, SE = 0.064, P<10−16), which indicates that the log(OR) found in Europeans is the best predictor of the East Asian log(OR). These figures would be unexpected if GWAS hits were synthetically generated by population-specific rare causal variants, as their effect size and Linkage Disequilibrium (LD) with the replicated SNP would be different in each population. Moreover, when considering only replication attempts that did not achieve P<0.05 in East Asians, there is still a strong and significant correlation between the two ancestries' OR (Spearman's ρ = 0.53, P<2·10−9), which suggests that a fraction of these associations might also be shared even if not successfully replicated in East Asians. A final further piece of evidence indicates that causal alleles behind non-replicated SNPs might actually be common and shared. Since most rare variants occurred after the split of Europeans and East Asians [4], [12], [26][28], they would have accumulated randomly in the genealogy of each allele of the tagSNP used in GWAS. Therefore, if causal variants were rare, risk alleles would not be necessarily shared even if discovered through the same tagSNP. Strikingly, when considering the direction of effects of SNPs non-replicated in East Asians instead of only their significance, the same risk allele as in Europeans was observed for 73.6% of attempts. This proportion clearly departs from the 50% expectation in a scenario of independent sets of rare variants generating different synthetic associations if each continent (P<10−16, binomial test).

Figure 1. East Asian GWAS find the same risk allele and similar log(OR) than European discovery GWAS.

X axis: log(OR) for the replication stage of the discovery European GWAS. Y axis: log(OR) for the initial stage of East Asian GWAS (Materials and Methods). Dots in blue indicate significant (P<0.05) replication attempts in East Asia; dots in grey indicate non-significant replication attempts. (A) Using all replication attempts; (B) Using only the most-powered replication attempt per SNP; (C) Using the most-powered replication attempt per region. Attempts correspondent to SNPs with MAF<0.01 in East Asians are not shown.

Assessing the potential effect of publication bias

Publication bias could have inflated our replicability estimates [29], [30]. Due to the large number of SNPs that are tested, the usual practice when publishing a GWAS has been to report all newly discovered associations, plus the replication status of previously associated SNPs. However, this is not always the case and, in some cases, not all previous results are discussed in each publication. Therefore, our ability to gather replication attempts depends on how many of them are explicitly reported, which presents enormous variability among papers. This opens the possibility of a reporting bias, in which GWAS authors could prioritize mentioning successful replication attempts, while overlooking failed replications. If so, our chance of gathering a replication attempt might be heavily biased towards positive results, thus inflating our estimates of replicability [30].

In the most extreme version of this scenario, the 103 observed replications in East Asians at P<0.05 that find the same risk allele that had been previously discovered in European studies would be the result of type I error with a P = 0.05 threshold. In that case, the 103 positive replications would be just the 2.5% ( = 5% type I error×50% probability of the same risk allele) of a large pool of 4,120 replication attempts in East Asians (95% C.I. = 3,418–4,959, assuming a Poisson distribution). In other words, 4,017 ( = 4,120−103) associations failing to find the same risk allele at P<0.05 would have remained unreported. Given the huge amount of unpublished GWAS that this scenario would imply, we discard a big impact of publication bias in our analysis.

To obtain a more precise assessment of the potential size of reporting bias in East Asian GWAS, we estimated the maximum number of failed (P>0.05) but unreported replication attempts that is actually possible in our database [30]. Specifically, and for each East Asian GWAS, we computed the number of disease-associated SNPs discovered in Europeans that were not mentioned in East Asian studies (neither a p-value nor any other information was reported in the main text or in the supplemental information). In total, we found 416 such instances. Most of these cases may not constitute reporting bias at all, since the SNPs in question may not be included in the array used for the East Asian GWAS, may be monomorphic in the studied population, may have been filtered out during QC and so on. Still, making the extreme assumption that all these 416 cases are failed replications, they constitute the maximum number of biased reports that we could have not included in our database (Table S8). Under this extreme scenario, the 103 positive replications would not have been drawn from the 225 replication attempts gathered in our database, but from a larger set of 641 ( = 225+416) replication attempts in East Asians. This calculation allows us to estimate the lower-bound replicability rate at 16.1% (103 out of 641), which still departs from the null expectation of 5% (P = 10−16; Binomial test). In other words, the figure of 416 ungathered replication attempts is about an order of magnitude lower than the total number of unreported cases needed to explain all East Asian replications as type I errors (4,017, see previous paragraph) and, therefore, it is very unlikely that systematic reporting bias accounts for our results.

Replicability and differences in Linkage Disequilibrium and Heterozygosity

A clear prediction can be made if, as our results suggest, a substantial fraction of associations reported by European GWAS are caused by common variants with similar effect sizes across the two ancestral groups: whenever associations were successfully replicated, the frequencies of tagSNPs and causal variants and their LD relationships should be similar in the two groups. In other words, levels of heterozygosity and LD patterns should be more similar between populations in the genomic regions that contain successfully replicated SNPs than in the genomic regions with European-associated SNPs that have not reached significance in East Asians. To test this prediction, we compared the inter-continental similitude of heterozygosity and LD in genomic regions harboring two different groups of disease-associated SNPs: the 47 SNPs discovered in Europeans that have been successfully replicated in every attempt with East Asians and the 65 SNPs that have never been positively replicated.

We compared heterozygosity patterns by measuring the differences in average heterozygosity between Europeans and East Asians. We measured these differences between the two ancestry groups in a 600-SNP region around each SNP under study. We used sliding windows of 50 consecutive SNPs, with a step of 5 SNPs. As predicted, windows immediately centered on non-replicated SNPs presented significantly larger differences in average heterozygosity across populations than windows centered on replicated SNPs (0.048 vs. 0.019, P<0.009, Figure 2).

Figure 2. Difference in Average Heterozygosity between Europeans and East Asians.

The x axis represents the distance of each 50-SNP window from the associated SNPs. The y axis shows the average heterozygosity for Europeans minus that of East Asians (SEM indicated in shadow). The band in bisque indicates the windows with significant differences (P<0.01).

Analogous patterns were observed when comparing the differences in LD. To assess differences in LD between populations, we computed the varLD score [31] in the same 50-SNP sliding windows we used for heterozygosity. As predicted, differences in LD were significantly larger for the windows centered in non-replicated than in replicated SNPs (varLD = 17.64 vs. 12.66, P<0.002). Indeed, varLD differences are only significant in the immediate vicinity of the associated SNP and they quickly cancel out as the distance for the associated allele increases (Figure 3 and Table S9). We obtained the same result when using only attempts with ≥80% statistical power and contrasting 39 replicated versus 14 non-replicated SNPs (varLD = 20.42 vs. 12.49, P = 0.045).

Figure 3. Regions harboring non-replicated SNPs present larger differences in LD between Europeans and East Asians.

Measures of difference in LD (varLD scores) are given for sliding windows of 50 SNPs with a 5-SNP step. Measures for replicated and non-replicated SNPs are given as blue and black lines, respectively. Shadowed areas represent the standard error of the mean. The vertical red band indicates that all windows with significant differences (P<0.01) locate in the vicinity of the associated SNPs.

To study how differences in LD patterns compare with the genome-wide average, we focused on the region immediately adjacent to the marker SNPs. For a window of 50 SNPs around the marker, we compared differences between Europeans and East Asians in LD patterns around replicated and non-replicated SNPs to genome-wide average differences for random SNPs. We used the permutation method included in varLD to assign an empirical p-value to the observed differences in LD for each analyzed window (see Materials and Methods). We considered three different sets of 50-SNP windows centered on each of the (i) 47 replicated SNPs, (ii) 64 non-replicated SNPs, and (iii) 100 groups of 47 SNPs randomly selected from across the genome. Because the two populations differ in their LD patterns, we observed a trend towards significant differences in LD (empirical P<0.05) for the three datasets. However, the proportion of significant windows was larger for non-replicated (78%) than for replicated (62%) and random genomic SNPs (66%). Figure 4 shows the cumulative distributions of empirical p-values for the three groups of SNPs. The cumulative distribution of p-values correspondent to replicated SNPs does not depart from genome-wide expectations, while non-replicated SNPs clearly map into regions of the genome with extreme differences in LD between Europeans and East Asians. In other words: our observations on LD differences suggest that a proportion of associations would have failed to replicate in East Asians because of population heterogeneity in LD between causal variants and tagSNPs. Yet, the possibilities of heterogeneity across populations in the effect size of causal variants themselves (see [22], [32]) or the presence of European-specific causal rare variants in some associations cannot be discarded as a source of lack of replication.

Figure 4. Regions with non-replicated SNPs depart from the genome-wide expectation of regional differences in LD patterns between Europeans and East Asians.

The cumulative distributions of varLD targeted P-values after 1,000 permutations for non-replicated SNPs (n = 64, in gray) and replicated SNPs (n = 47, in blue) are shown. The median (black dashed line) and 95% empirical CI (green area) calculated for 100 groups of 47 SNPs randomly selected from across the genome are also depicted.

Comparison with previous results

Our results indicate that many causal variants underlying GWAS results are common and shared between Europeans and East Asians, extending the observation of reports that focused in single traits [17], [19]. This would seem to contradict results by us and others that highlighted heterogeneity in the genetic etiology of disease across human populations [14], [21], [22]. This observation contrasts with the large replicability and large correlation in OR that we observe, as well as with the suggested role of differences in LD in explaining associations non-replicated in East Asians. The apparent contradiction between the present and previous papers can be explained by two facts. First, our previous results focused on candidate-gene studies, which have been largely dominated by false positives [14]; and, second, studies that considered GWAS data addressed different questions, used different approaches and gathered different sets of traits [21] and/or associations [22].

An examination of previous datasets confirms a general trend to consistency of GWAS results across continents and emphasizes the benefits of incorporating as many associations as possible. Fu et al. [21] focused on associated SNPs discovered in East Asian GWAS. Although they used only four traits and 47 SNPs (43 loci), they demonstrated the challenges of multi-ethnic studies, and provided a framework to cope with these difficulties. As discussed by the authors, caution is warranted as to whether the disease loci and/or causal variants are population-specific. For instance, they suggested that two signals for type 2 diabetes located in 9p24.1 (PTPRD, rs17584499) and 17p13.3 (SRR, rs391300) could be East Asian-specific, as they fail to replicate in a well-powered study in Europeans. However, we gathered several replication attempts of these two signals in more recent East Asian GWAS (Table S4), and out of 8 replication attempts only one was successful (for PTPRD, rs17584499) at P<0.05, when a total of 7.49 replications would be expected by power alone (4 for PTPRD and 3.49 for SRR, see Table S5). Also, in only 4 out of 8 cases the risk allele was the same (two for each gene). Overall, the replication attempts gathered in our database suggest that both associations were false positive findings in East Asians. These results make it clear that Fu et al. [21] were right in asking for caution, since putative population-specific associations may well turn out to be false positives. Moreover, the inclusion of more recent studies in our dataset helps discarding the population-specific status of some true associations. For instance, the association of 10p13 (CAMK1D, rs12779790) to type 2 diabetes was considered as European-specific, but it has been eventually replicated in East Asians [33].

Ntzani et al. [22] examined differences in effect sizes from a set of 108 associations discovered by GWAS and for which data for various ancestries was available. Because of the sophistication of their approach, they had to focus on 12 diseases and 4 anthropometric traits, as well as on a relatively short (∼30) list of GWAS that either use samples with different ancestries in the replication stage or compare their own results with previous papers using different ancestries [22]. In contrast, we took the simpler approach of studying replicability in the studies with largest sample size, so we could gather attempts from multiple GWAS on the same diseases and were able to construct a larger database. Ntzani et al. [22] found overall consistency in effect direction across ancestries (∼82%, similar to ours of 85%), but with modest correlations in effect sizes, (rho≈0.33) that would seem contradictory with the large correlation in odds ratios we report here. Nevertheless, an almost identical correlation in OR would have been observed if limiting the study to the 22 SNPs that are shared between Ntzani et al. [22] and our dataset (rho = 0.58 and 0.53, respectively). Barring possible differences due to the different nature of the anthropometric traits analyzed by Ntzani et al. [22], the previous results stress the importance of continuously updating the list of replication attempts to increase the statistical power upon which inferences can be based.

Effective replicability rates of larger GWAS hints at weaker but common causal variants

Of course, the finding of shared variants underlying GWAS results holds only for associations that have been published so far. Ongoing efforts to join cohorts into large consortia [34] ensure steady progress in the field and guarantee the discovery of new genetic associations to complex disease [6], [35]. It is tempting to make inferences about what may be the results of future, much larger, association studies; particularly about the frequency and degree of trans-ethnic sharing of as yet undiscovered variants. We approximated this question by considering the allele frequencies and effect sizes of associated SNPs along with their patterns of replicability across time. Specifically, it is clear that if the GWAS with larger sample sizes that have been published recently for peoples of European ancestry had discovered variants with lower frequencies (variants that should be increasingly population-specific), their results should be less likely to replicate across populations. If this observation were made, it would predict decreased replicabilities in future, even larger GWAS with increased power to discover lower-frequency risk variants.

As observed in Figure 5, more recent GWAS have gathered larger sample sizes and unveiled associations with lower ORs. Although more recent GWAS present decreased replicability rates, an interesting inference can be made by observing effective replicability rates, the ratio between the proportion of positive replications and their statistical power. Effective replicability would be expected to decrease if the lower ORs detected by GWAS were due to lower-frequency (and thus increasingly population-specific) causal variants. In contrast, we observed a constant effective replicability rate of ∼80% that was independent of the OR reported in the European discovery GWAS (red line in Figure 5), indicating that the associations discovered by larger GWAS present similarly high replicability rates regardless of their weaker effect size.

Figure 5. Replicabilities against ORs in the discovery study.

For every SNP discovered in Europeans, all the replication attempts in East Asians were considered and classified by bins of OR found in Europeans. The OR of SNPs with risk alleles being major was transformed to ensure OR>1. By means of windows with step 0.3, the average statistical power (empty black circles), average replication success (solid black circles) and effective replicability (the ratio between observed and expected replicability, the two former quantities, red circles) are shown (using only windows with ≥20 attempts). Top values of the graph represent the average date of publication and sample size of discovery GWAS, for bins of 0.1 OR.

Changing focus to minor allele frequencies, it is possible that, regardless of the reported OR, genotyped marker SNPs with lower MAFs are more efficient in tagging low frequency causal variants. If that were the case, patterns of replicability may change as a function of the MAF of associated SNPs. Nevertheless, we observed similar rates of effective replicability across all the frequency spectrum of disease-associated SNPs, with no drastic decrease for markers of increasingly smaller MAFs (Figure 6).

Figure 6. Replicabilities against MAFs in the discovery study.

For every SNP discovered in Europeans, all the replication attempts in East Asians were considered and classified by bins of OR found in Europeans. By means of windows with step 0.3, the average statistical power (empty black circles), average replication success (solid black circles) and effective replicability (the ratio between observed and expected replicability, the two former quantities, red circles) are shown (using only windows with ≥20 attempts). Top values of the graph represent the average date of publication and sample size of discovery GWAS, for each bin of MAF.

All these inferences are confirmed after categorizing European discovery GWAS into two groups using a threshold of 10,000 individuals to distinguish between “small” and “large” studies. First, we did not observe differences in the MAF distribution of associated SNPs according to the discovery sample size (average MAF of 0.301 vs. 0.333 for “small” and “large” GWAS respectively; P = 0.12, Wilcoxon test). Second, even if larger GWAS do indeed detect associations with smaller ORs (average OR 1.15 vs. 1.28; P<3×10−7), the trans-continental correlation of ORs between Europeans and East Asians was the same for “small” and “large” GWAS (Figure 7). Both results show, yet again, that causal variants of different effect sizes are equally shared across populations, independently of the sample size of the discovery GWAS.

Figure 7. Similar correlation between European and East Asian log(OR), regardless of the discovery GWAS sample size.

This figure is a redraw of the points shown in Figure 1A but colored according to the discovery GWAS sample size. The same correlations arose when using all replication attempts (as it is shown in Figure) or the filtered (n = 123) set of largest replication attempt per SNP (not shown).

Building a database of replication attempts from non-GWAS studies

GWAS focus on describing new variants across the genome rather than validating the findings from previous GWAS. Instead, many replication attempts consist on genotyping limited sets of SNPs previously discovered by GWAS in independent samples. They tend to be published independently from GWAS and, hence, our replicability database may have failed to gather many replication attempts that occur outside the setting of genome-wide studies. Since endeavoring to analyze all the literature available for the 28 diseases in our database would have required a massive effort, we randomly selected six diseases to address this possibility (Table S10). For each disease, we searched all the publications citing each of the disease-associated variants present in our database, as well as for the original GWAS publications initially describing them.

In total, we looked at 1,706 and 6,068 citations available at PubMed and Google Scholar by December 2012, respectively (Table S10). In doing so, we gathered a total of 59 replication attempts from 38 candidate studies targeting GWAS variants discovered in Europeans (40 and 19 attempts used individuals of European and East Asian ancestry, respectively, see Table S11). Nonetheless, the observed effective replicability rates after accounting for statistical power of attempts gathered from GWAS and non-GWAS studies are very similar in both Europeans and East Asians (93.8% vs. 89.5%; P = 0.20 and 80.6% vs. 88.4%, P = 0.69; respectively). Thus, the inclusion of replication attempts that occur outside from the setting of GWAS should not have affected the patterns of replicability we report in the present study.


The relevance of our findings is that they allow for three inferences. First, they contribute to the debate on the possible synthetic origin of GWAS associations [7], [8], [10], since trans-continental replicability confirms that most –even if not all– of the associations detected by GWAS are not caused by population-specific, rare variants. Second, they clarify the contribution of common variants to extant GWAS results, since practically all GWAS have delivered precisely what they were designed to detect: associations with common variants [1]. Third, our results show that a substantial proportion of causal variants are shared across European and East Asian populations and that they probably lie in the regions close to marker SNPs, which may allow leveraging on the increasingly varied ancestries of GWAS to track them down [25], [36][38]. Finally, since larger GWAS did not detect variants with lower frequencies, our findings support a model of common variants of varying effect sizes, closer to the infinitesimal model than to a pure rare variant model of the genetic architecture of disease [4]. However, it is not simple to extrapolate our results to associations that so far remain undiscovered. Whether the heritability that is not yet explained by GWAS will be partly due to risk variants in insufficient LD with common SNP markers, as suggested by some authors [6], [39] or whether this heritability exists at all [40] will only be resolved by further empirical research.

Materials and Methods

Creating a database of SNPs associated with disease

We considered the 1,171 studies indexed in the catalog of Published Genome-Wide Association Studies as to February 2012 (, last accessed 14th February 2012) and classified them according to the trait under study. Each study was classified according to the genetic ancestry of the samples, considering only individuals used in the GWAS stage. Studies performed on a mixed panel were considered only if separate ancestry-specific analyses were provided and we recorded them as independent studies. We observed a strong bias towards GWAS performed with “European” (78.4%) and “East Asian” (14.9%) individuals, while much fewer studies are available for “African” (4.3%), “Hispanic” (1.2%), “Middle Eastern” (0.5%), “Native American” (0.4%) and “Oceanian” (0.3%) ancestries. Therefore, and to increase the reliability of our results, we focused on GWAS performed with peoples of European and East Asian ancestry to select frequently studied diseases. We ascertained only dichotomous disease traits, avoiding anthropometric traits such as height. To produce reliable replicability estimates across ancestries we included in our analysis the 28 diseases for which two or more GWAS were available in any of the two ancestral groups and at least one in the other group (e.g. 11 GWAS for lung cancer in Europeans and 5 in East Asians; 4 GWAS for Kawasaki Disease, 1 in Europeans and 3 in East Asians). Finally, we also added GWAS performed upon individuals of African ancestry for any of the 28 selected diseases.

We built a database with 28 dichotomous disease phenotypes (Table S1), with data coming from 206 European, 71 East Asian and 16 African GWAS. Several features of interest were recorded for each GWAS: first author, journal, year of publication, genetic ancestry, sample size in GWAS stage, total sample size in replication stage, array genotyped, genomic control factor in GWAS stage (if available), use of imputed SNPs (Y/N) and number of genomic regions achieving genome-wide significance in the initial and final stage (Table S2). The publications corresponding to each GWAS were downloaded from the respective journals.

To explore the full range of published GWAS, we performed a comprehensive independent search for studies not gathered in the Catalog. For each of the 28 diseases, we mined three resources: (i) the PubMed database of biomedical literature, (ii) the HuGE Navigator tool available at the Human Genome Epidemiology Network [41] and (iii) specific reviews available in the literature. Specifically, we searched the PubMed to identify potential new GWAS (i.e. “Asthma AND genome-wide”) and the “HuGE Literature Finder” available at the HuGE Navigator. Finally, we used the PubMed (“Review” tool in Article types) to identify 59 reviews covering the literature available for each disease (∼2.4 reviews per disease). After examination of all these sources, we complemented the list of 277 GWAS with six new genome-wide studies performed on Europeans that had remained unnoticed in the Catalog (Tables S2 and S3).

For each disease, the selected studies were sorted per date of publication regardless of the population of study. Starting for the first study, we built a cumulative database of disease-associated SNPs and their replicability in successive studies. After excluding GWAS with pooled DNAs or focusing on CNVs, each GWAS publication was visually screened for two kinds of association data: the report of a new disease-associated SNPs (discovered SNPs); and the replication status of disease-associated SNPs discovered in previous GWAS (replicated SNPs). In both cases, we recorded three features from each association: (i) Odds Ratio (OR) (ii) confidence interval of the OR and (iii) the p-value.

We used several conservative criteria to include newly discovered SNPs in our database. First, to avoid the winner's curse bias, we used the OR and p-value from the replication stages of the discovery GWAS. Second, when several replication stages from the same GWAS were available, the OR from the stage with largest sample size was recorded. Only when no replication stages were available did we use the OR from the GWAS stage. Third, SNPs associated uniquely in sex-specific analyses were excluded. Fourth, ORs coming from allelic tests and additive models were prioritized over genotypic tests and other genetic models. Fifth, the genome-wide significance level for a newly discovered SNP to be included in our analysis was set at P<5×10−7, unless imputed SNPs were used in the GWAS, in which we toughened up the threshold to P<5×10−8. Sixth, for genomic regions with several genome-wide significant SNPs (SNPs less than 200 Kb from each other), we included in the study the SNP with lowest p-value. Finally, disease-associated SNPs from the MHC region and HLA alleles were not included in the study. In several analyses, we used the log(OR) to ensure symmetry, which does not happen if using OR (i.e. an OR of 2 is equivalent to an OR of 0.5).

For replication attempts to be included in our database, several conservative conditions had to be met. We only recorded attempts in which exactly the same SNP than in the discovery GWAS had been genotyped. Moreover, and to avoid any bias towards associations that replicate across ancestries, we did not gather any replication attempt from the same “discovery” GWAS in which a new disease-associated SNP is described. Third, in all these cases, the p-value considered for the replication report was the one from the GWAS stage. Finally, the OR for each disease-associated SNP was referenced for the allele that had been the risk allele in the discovery study. Thus, OR<1 (and log(OR)<0) means that the minor allele was found as protective in the discovery study, while OR>1 (and log(OR)>0) means that the minor allele appeared as the risk allele. For SNPs with different minor alleles across populations, OR were referenced to the minor allele specific for each population. Instances of the latter are indicated in column “Shift” in Table S5 and the shifted OR is represented in all Figures except when otherwise indicated.

A total of 419 discovered SNPs from 337 genomic regions were found to be associated with disease, 320 of those SNPs being reported for the first time in Europeans, 97 in East Asians and 2 in Africans (Table S4). In total, we gathered 543 replication reports, dealing with 227 out of the 419 discovered SNPs (Table S5). Out of the 543 replication reports, 210, 260 and 73 corresponded, respectively, to attempts performed on Europeans, East Asians and on Africans. Since East Asian and African GWAS are more recent, most of the replication attempts (465 out of 536, 87%) reported the replication status of discovered SNPs that had been reported for the first time in Europeans. Therefore, we focused on the subset of 465 replication attempts gathered for 190 associated SNPs discovered in European GWAS. Out of these, a total of 181, 225 and 61 replication attempts had been reported for Europeans, East Asians and Africans, respectively.

The 225 replication attempts in East Asians aimed to replicate a total of 131 SNPs associated with disease with genome-wide significance in Europeans, which results in an average of 1.75 replication attempts per associated SNP (range = 1–7). Thus, our estimates of replicability could be biased if replicated SNPs gathered more replication attempts per SNP, or more associated SNPs in European populations. During the analysis, and as noted in the text, we applied an additive filtering to ensure no bias in the estimates of replicability and correlations between European and East Asian OR. Specifically, we repeated the analysis selecting only the largest replication attempt per SNP, resulting in a filtered set of 123 attempts. The SNPs ascertained for the filtering are indicated in Table S5.

Population genetics analysis (varLD and Heterozygosity)

Polymorphism data was downloaded from HapMap Project Phase 2 (release 24, November 2008). For each ascertained SNP, we downloaded two data sets: (i) genotypes for the associated SNP and (ii) genotypes for a 600-SNP window centered on the associated SNP. We downloaded all genotypes for all unrelated samples from the three populations of European and East Asian ancestry (CEU, JPT and CHB). JPT and CHB samples were clustered together due to their close genetic relationship.

Population differences in local patterns of Linkage Disequilibrium (LD) around disease associated SNPs were measured with the varLD software ( [3], using the targeted option for 50-SNP windows. For each population and genomic region, varLD builds a matrix of pairwise signed r2 values among all the SNP pairs and provides a raw score corresponding to the absolute difference in the eigen-decompositions between two matrices. This score is a summary measure of the overall LD levels in a given genomic region between two populations. We used it to measure the extent of differences in local LD between two kind of genomic regions: these containing replicated and non-replicated SNPs. To rule out the possibility that differences in LD between replicated and non-replicated SNPs are not related to the presence of the disease associated SNP, we scanned varLD differences in consecutive windows of the same size (50 SNP), starting 300 SNPs upstream of the disease associated SNP and finishing 300 SNPs downstream, with an step of 5 SNPs. In total, we checked 121 consecutive windows around the disease associated SNP. On average, we were examining a window of 503.61 Kb centered on each associated SNP.

We used a similar sliding window approach to summarize the differences in allele frequencies between Europeans and East Asians. Again, we did it for each SNP, calculating the average heterozygosity in each window for replicated and non-replicated SNPs. Differences in heterozygosity are simply the result of subtracting the average heterozygosity in East Asians from that in Europeans (Figure 2).

To compare LD differences of associated SNPs to the genome-wide background, we used the varLD targeted option that tests the null hypothesis that the correlations in LD between SNPs from a given window are equal in both populations. We implemented 1,000 permutations to calculate the empirical p-value for each 50-SNP window. Then, we built three cumulative distributions correspondent to each of the three sets of SNPs: replicated (n = 47), non-replicated (n = 64) and random genomic (n = 4,700) SNPs. The SNPs selected for the latter dataset were ascertained from HapMap Phase 2 in order to randomly match the minor allele frequencies of replicated and non-replicated SNPs in Europeans. Finally, we randomly created 100 groups of 47 genomic SNPs to calculate the median and 95% empirical CI of permutation p-values available at Figure 4.

Power and statistical analyses

As noted in the text, for some analysis we focused on the attempts that had >80% power to replicate the effect size found in Europeans. Statistical Power was calculated with the CaTS Power Calculator ( [42]. For each replication attempt we checked the power under a log-additive model to find the same effect size as in the discovery European GWAS, given the sample size of the replication GWAS and the allele frequency of the risk allele in East Asians. The number of expected replications was approached by multiplying the total number of replication attempts with the statistical power to replicate averaged for all attempts (see Table S5)

Statistical analyses were performed using standard R procedures. The significance of the replicability estimates was checked by means of a binomial test, with an expected replicability rate of 0.05 under the null hypothesis of no shared associated SNPs between Europeans and East Asians or Europeans and Africans. Similarly, the significance in the risk allele direction was checked by means of a binomial test, using a null expected ratio of 0.5. As indicated in the first section, differences in LD between replicated and non-replicated SNPs were checked by means of Mann-Whitney tests comparing the distributions of varLD scores for sliding 50-SNP windows centered on the disease-associated SNPs. The same procedure was used for the average difference in heterozygosity and distributions of OR found by “small” and “large” GWAS.

Supporting Information

Table S1.

List of ascertained diseases. A comprehensive list of all diseases considered is available, along with the reason for exclusion if not present in the list of 28 ascertained diseases. The number of GWAS available for each of the three ancestries is also provided.


Table S2.

Main features of the GWAS ascertained for the present study. The main features of the 299 analyzed GWAS is provided. The six GWAS not present in the Catalog and selected after a literature survey are marked in yellow (Table S3).


Table S3.

Survey of the literature and reviews looking for GWAS not gathered in the Catalog. For each disease, the HuGE Navigator and several available reviews were screened to look for GWAS not available in the Catalog. The title, PubMed ID, year of publication along with a comment about all considered non-Catalog GWAS is provided.


Table S4.

List of genome-wide significant associated SNPs and ongoing replication attempts gathered.


Table S5.

Main features of gathered replication attempts.


Table S6.

Replicability per disease in Europeans. A) Using all replication attempts and B) Using only replication attempts with ≥80%.


Table S7.

Replicability per disease in East Asians. A) Using all replication attempts and B) Using only replication attempts with ≥80%.


Table S8.

List of potentially unreported replication attempts for each East Asian GWAS.


Table S9.

Differences in LD among replicated and non-replicated SNPs in 50-SNP consecutive windows.


Table S10.

Literature survey looking for candidate-based replication attempts of variants discovered through GWAS. The total number of PubMed and Google Scholar citations considered for each of the disease-associated SNPs along with their discovery GWAS is indicated. In total, up to six conditions were considered in-depth: Alzheimer's disease, osteoarthritis, non-Hodgkin lymphoma, narcolepsy, psoriasis and rheumatoid arthritis. The main features about the candidate-based studies selected from the selected citations are also indicated.


Table S11.

Main features of gathered replication attempts, including candidate-based attempts, for the 6 analyzed diseases.



We acknowledge colleagues from Navarro's lab for technical support and helpful comments. We also thank Greg Gibson, Peter M. Visscher, Isabel Mendizabal, Rui Faria, Diego A. Hartasánchez, and Juan A. Rodríguez for valuable discussion and smart insight.

Author Contributions

Analyzed the data: UMM. Wrote the paper: UMM AN. Conceived and designed the study: UMM AN.


  1. 1. Visscher PM, Brown MA, McCarthy MI, Yang J (2012) Five years of GWAS discovery. American journal of human genetics 90: 7–24.
  2. 2. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, et al. (2009) Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences of the United States of America 106: 9362–9367.
  3. 3. Eichler EE, Flint J, Gibson G, Kong A, Leal SM, et al. (2010) Missing heritability and strategies for finding the underlying causes of complex disease. Nature reviews 11: 446–450.
  4. 4. Gibson G (2012) Rare and common variants: twenty arguments. Nature Reviews Genetics 13: 135–145.
  5. 5. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, et al. (2009) Finding the missing heritability of complex diseases. Nature 461: 747–753.
  6. 6. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, et al. (2010) Common SNPs explain a large proportion of the heritability for human height. Nature genetics 42: 565–569.
  7. 7. Anderson CA, Soranzo N, Zeggini E, Barrett JC (2011) Synthetic associations are unlikely to account for many common disease genome-wide association signals. PLoS biology 9: e1000580.
  8. 8. Dickson SP, Wang K, Krantz I, Hakonarson H, Goldstein DB (2010) Rare variants create synthetic genome-wide associations. PLoS biology 8: e1000294.
  9. 9. Orozco G, Barrett JC, Zeggini E (2010) Synthetic associations in the context of genome-wide association scan signals. Human molecular genetics 19: R137–144.
  10. 10. Wray NR, Purcell SM, Visscher PM (2011) Synthetic associations created by rare variants do not explain most GWAS results. PLoS biology 9: e1000579.
  11. 11. Adeyemo A, Rotimi C (2009) Genetic variants associated with complex human diseases show wide variation across multiple populations. Public Health Genomics 13: 72–79.
  12. 12. Bustamante CD, Burchard EG, De la Vega FM (2011) Genomics for the world. Nature 475: 163–165.
  13. 13. Ioannidis JP (2009) Population-wide generalizability of genome-wide discovered associations. J Natl Cancer Inst 101: 1297–1299.
  14. 14. Marigorta UM, Lao O, Casals F, Calafell F, Morcillo-Suarez C, et al. (2011) Recent human evolution has shaped geographical differences in susceptibility to disease. BMC Genomics 12: 55.
  15. 15. Rosenberg NA, Huang L, Jewett EM, Szpiech ZA, Jankovic I, et al. (2010) Genome-wide association studies in diverse populations. Nature reviews 11: 356–366.
  16. 16. Park JH, Gail MH, Weinberg CR, Carroll RJ, Chung CC, et al. (2011) Distribution of allele frequencies and effect sizes and their interrelationships for common genetic susceptibility variants. Proceedings of the National Academy of Sciences of the United States of America 108: 18026–18031.
  17. 17. Waters KM, Stram DO, Hassanein MT, Le Marchand L, Wilkens LR, et al. (2010) Consistent association of type 2 diabetes risk variants found in europeans in diverse racial and ethnic groups. PLoS genetics 6: e1001078.
  18. 18. Durbin RM, Abecasis GR, Altshuler DL, Auton A, Brooks LD, et al. (2010) A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073.
  19. 19. Waters KM, Le Marchand L, Kolonel LN, Monroe KR, Stram DO, et al. (2009) Generalizability of associations from prostate cancer genome-wide association studies in multiple populations. Cancer Epidemiol Biomarkers Prev 18: 1285–1289.
  20. 20. Mathieson I, McVean G (2012) Differential confounding of rare and common variants in spatially structured populations. Nature genetics 44: 243–246.
  21. 21. Fu J, Festen EA, Wijmenga C (2011) Multi-ethnic studies in complex traits. Human molecular genetics 20: R206–213.
  22. 22. Ntzani EE, Liberopoulos G, Manolio TA, Ioannidis JP (2012) Consistency of genome-wide associations across major ancestral groups. Human genetics 131: 1057–1071.
  23. 23. Hindorff LA, MacArthur J, (European Bioinformatics Institute), Wise A, Junkins HA, Hall PN, et al.. A atalog of Published Genome-Wide Association Studies. Available:
  24. 24. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, et al. (2008) Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nature reviews 9: 356–369.
  25. 25. N'Diaye A, Chen GK, Palmer CD, Ge B, Tayo B, et al. (2011) Identification, replication, and fine-mapping of Loci associated with adult height in individuals of african ancestry. PLoS genetics 7: e1002298.
  26. 26. Gravel S, Henn BM, Gutenkunst RN, Indap AR, Marth GT, et al. (2011) Demographic history and rare allele sharing among human populations. Proceedings of the National Academy of Sciences of the United States of America 108: 11983–11988.
  27. 27. Keinan A, Clark AG (2012) Recent explosive human population growth has resulted in an excess of rare genetic variants. Science 336: 740–743.
  28. 28. Nelson MR, Wegmann D, Ehm MG, Kessner D, St Jean P, et al. (2012) An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 337: 100–104.
  29. 29. Barsh GS, Copenhaver GP, Gibson G, Williams SM (2012) Guidelines for genome-wide association studies. PLoS genetics 8: e1002812.
  30. 30. Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN (2003) Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nature genetics 33: 177–182.
  31. 31. Ong RT, Teo YY (2010) varLD: a program for quantifying variation in linkage disequilibrium patterns between populations. Bioinformatics 26: 1269–1270.
  32. 32. Han B, Eskin E (2012) Interpreting meta-analyses of genome-wide association studies. PLoS genetics 8: e1002555.
  33. 33. Cho YS, Chen CH, Hu C, Long J, Ong RT, et al. (2012) Meta-analysis of genome-wide association studies identifies eight new loci for type 2 diabetes in east Asians. Nature genetics 44: 67–72.
  34. 34. Sullivan PF, Daly MJ, O'Donovan M (2012) Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nature reviews 13: 537–551.
  35. 35. Park JH, Wacholder S, Gail MH, Peters U, Jacobs KB, et al. (2010) Estimation of effect size distribution from genome-wide association studies and implications for future discoveries. Nature genetics 42: 570–575.
  36. 36. Saxena R, Elbers CC, Guo Y, Peter I, Gaunt TR, et al. (2012) Large-scale gene-centric meta-analysis across 39 studies identifies type 2 diabetes loci. American journal of human genetics 90: 410–425.
  37. 37. Twee-Hee Ong R, Wang X, Liu X, Teo YY (2012) Efficiency of trans-ethnic genome-wide meta-analysis and fine-mapping. Eur J Hum Genet 20: 1300–7.
  38. 38. Zaitlen N, Pasaniuc B, Gur T, Ziv E, Halperin E (2010) Leveraging genetic variability across populations for the identification of causal variants. American journal of human genetics 86: 23–33.
  39. 39. Yang J, Manolio TA, Pasquale LR, Boerwinkle E, Caporaso N, et al. (2011) Genome partitioning of genetic variation for complex traits using common SNPs. Nature genetics 43: 519–525.
  40. 40. Zuk O, Hechter E, Sunyaev HR, Lander ES (2012) The mystery of missing heritability: Genetic interactions create phantom heritability. PNAS 109: 1193–1198.
  41. 41. Yu W, Gwinn M, Clyne M, Yesupriya A, Khoury MJ (2008) A navigator for human genome epidemiology. Nature genetics 40: 124–125.
  42. 42. Skol AD, Scott LJ, Abecasis GR, Boehnke M (2006) Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nature genetics 38: 209–213.