Gene-set analysis (GSA) methods are used as complementary approaches to genome-wide association studies (GWASs). The single marker association estimates of a predefined set of genes are either contrasted with those of all remaining genes or with a null non-associated background. To pool the p-values from several GSAs, it is important to take into account the concordance of the observed patterns resulting from single marker association point estimates across any given gene set. Here we propose an enhanced version of Fisher’s inverse χ2-method META-GSA, however weighting each study to account for imperfect correlation between association patterns.
Simulation and Power
We investigated the performance of META-GSA by simulating GWASs with 500 cases and 500 controls at 100 diallelic markers in 20 different scenarios, simulating different relative risks between 1 and 1.5 in gene sets of 10 genes. Wilcoxon’s rank sum test was applied as GSA for each study. We found that META-GSA has greater power to discover truly associated gene sets than simple pooling of the p-values, by e.g. 59% versus 37%, when the true relative risk for 5 of 10 genes was assume to be 1.5. Under the null hypothesis of no difference in the true association pattern between the gene set of interest and the set of remaining genes, the results of both approaches are almost uncorrelated. We recommend not relying on p-values alone when combining the results of independent GSAs.
We applied META-GSA to pool the results of four case-control GWASs of lung cancer risk (Central European Study and Toronto/Lunenfeld-Tanenbaum Research Institute Study; German Lung Cancer Study and MD Anderson Cancer Center Study), which had already been analyzed separately with four different GSA methods (EASE; SLAT, mSUMSTAT and GenGen). This application revealed the pathway GO0015291 “transmembrane transporter activity” as significantly enriched with associated genes (GSA-method: EASE, p = 0.0315 corrected for multiple testing). Similar results were found for GO0015464 “acetylcholine receptor activity” but only when not corrected for multiple testing (all GSA-methods applied; p≈0.02).
Citation: Rosenberger A, Friedrichs S, Amos CI, Brennan P, Fehringer G, Heinrich J, et al. (2015) META-GSA: Combining Findings from Gene-Set Analyses across Several Genome-Wide Association Studies. PLoS ONE 10(10): e0140179. https://doi.org/10.1371/journal.pone.0140179
Editor: Manu Sharma, Institute for Clinical Epidemiology and Applied Biometry, GERMANY
Received: May 28, 2015; Accepted: September 21, 2015; Published: October 26, 2015
Copyright: © 2015 Rosenberger et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: The summary data of individual data which are used in our meta-analysis approach for all four included genome-wide association studies will be found in dbGaP, accession number phs000877.
Funding: TRICL (Transdisciplinary Research in Cancer of the Lung): The investigations in this article were supported by the grant from the National Institute of Health (NIH) (U19CA148127). The SLRI study was supported by Canadian Cancer Society Research Institute (no. 020214), Ontario Institute of Cancer and Cancer Care Ontario Chair Award to Dr. Hung. The German lung cancer study consists of three investigations: The sample collection of the Heidelberg lung cancer study was in part supported by grant 70-2919 from the Deutsche Krebshilfe. The work was additionally supported by a Helmholtz-DAAD fellowship (A/07/97379) and by the National Institutes of Health (USA) (U19CA148127). The KORA Surveys were financed by the Helmholtz-Zentrum München (HMGU), which is funded by the German Federal Ministry of Education, Science, Research and Technology and the State of Bavaria. The LUCY study was funded in part by the National Genome Research Network (NGFN), the DFG (BI 576/2-1; BI 576/2–2), the Helmholtzgemeinschaft (HGF) and the Federal office for Radiation Protection (BfS:STSch4454). Genotyping was performed in the Genome Analysis Center (GAC) of the Helmholtz Zentrum Muenchen. Funding for the MD Anderson Cancer Study was provided by NIH grants – P50 CA70907, R01CA121197, RO1 CA127219, U19CA148127, and CPRIT grant RP100443. Genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number HHSN268200782096C. Support for the Central Europe study was provided by Institut National du Cancer, France. Support for the Central Europe study, Czech Republic, was also provided by the European Regional Development Fund and the State Budget of the Czech Republic (RE-CAMO, CZ.1.05/2.1.00/03.0101). The Central Europe study was a multicenter study conducted in seven central European countries. The following investigators are responsible for the collection of data at each of the sites: Neonila Szeszenia-Dabrowska, Jolanta Lissowska, David Zaridze, Peter Rudnai, Eleonora Fabianova, Lenka Foretova, Vladimir Janout, Vladimir Bencko, and Miriam Schejbalova.
Competing interests: The authors have declared that no competing interests exist.
Genome-wide association studies (GWASs) enable us to identify single markers or narrow genomic regions associated with a disease after genotyping thousands of single nucleotide polymorphisms (SNPs) throughout the whole genome. However, the interplay of genes in the etiology of the phenotype in question has still not been considered. “It is well known that genes do not work in isolation; instead, complex molecular networks and cellular pathways are often involved in disease susceptibility and disease progression” . Hence, to respect pathways, which are typically sets of genes connected through known or proposed mechanisms on a molecular, cellular, or organic level, can help discover genetic susceptibility to complex traits .
Gene-set analysis (GSA) was therefore proposed as a complementary approach in the investigation of the genetic basis of disease using GWAS results [3–9]. In principle, GSA either contrasts the entirety of observed marker-specific associations of a common set of genes of interest (GS) with those of all remaining genes (GS´) (competitive test) or compares these to a null hypothesis (H0) of a non-associated background (self-contained test). GSA approaches provide in general no estimate of the strength of association but only p-values (pGS), indicating some kind of accumulation in significance of observed associations with a phenotype for genes or markers within the GS (denoted as accumulated marker significance). These pGS-values usually result from one-sided statistical tests. Hence, low pGS–values result from low single marker p-values pm of genes in GS. GSA in general does not provide any estimate of the strength of a joint association for the set of markers or genes in question, respectively. The appeal of GSA as an analysis tool complementary to GWAS is that gene sets which are enriched with moderate association signals may be discovered, even if no individual markers within genes of the GS demonstrate genome-wide significance.
If one aims to perform a meta-analysis across several existing GSA studies (s = 1 …ns) based on GWAS results, it seems straightforward simply to apply e.g. Fisher’s inverse χ2-method  as a quantitative method to pool p-values (pGS,s) from independent, one-sided tests (further denoted as simple p-pooling (SPP)) of concordant null hypotheses (direction of the test) . Nevertheless, low pGS–values can theoretically arise through accumulated marker significance, in which the minor alleles of all markers are observed for example as protective factors in one study, while being seen at the same time as risk factors in another study. Thus, significance for GS can appear simultaneously in several studies without concordance of the patterns of all observed associations of markers, respectively genes, contained in GS (briefly denoted as association pattern). Consequently, concordance of the direction of the test (of pGS) is not given a priori.
Hence, it is of vital importance to take the concordance of association patterns between studies into account when combining pGS,s-values of several studies (reflecting significance of GSs). Furthermore, replicated accumulated marker significance (low pGS–values) alone is insufficient to state the consistency in GSA findings in terms of the Bradford Hill criteria .
We propose an approach for a quantitative meta-analysis of GSA results, which we have named META-GSA. This follows the idea of an enhanced version of Fisher’s inverse χ2-method , weighting each study to account for imperfect correlation between its own association pattern with an overall association pattern across studies. Because this overall association pattern is a hidden construct, we use principal component analysis (PCA) to determine appropriate study weights. Doing so, the weights are neither pre-specified nor strictly positive, as presumed by the above-mentioned method. Thus the distribution of the final test statistic under the H0 of “no accumulated marker significance taking concordance of association patterns into account” needs to be derived by data permutation.
In this manuscript we outline the principle of META-GSA. We then present simulation results considering 20 different scenarios for the “pattern of true marker-phenotype association”, simulating different relative risks between 1 and 1.5 in gene sets of 10 genes. Finally, we apply META-GSA to four lung cancer GWASs from the Transdisciplinary Research in Cancer of the Lung / International Lung Cancer Consortium (TRICL / ILCCO).
Assume that for several independent studies s = 1…ns gene-set analyses (GSA) based on the results of GWASs have already been performed (see Fig 1). Within each GWAS, a particular association measure θm,s (e.g. odds ratio OR) for each of m = 1,…, nm markers has been estimated. Additionally, corresponding p-values pm,s and/or test statistics Tm,s from two-sided tests of no association (in the case of odds ratios the null hypothesis is: H0: θm,s = 1, the alternative hypothesis is HA: θm,s≠1) are available. We introduce the parameter dm,s as indicator for the direction of the point estimator of θm,s, taking values of -1 or +1 (e.g. of dm,s = -1 if ORm,s<1 and dm,s = +1 if ORm,s≥1).
Genome-wide association study (GWAS); gene-set analysis (GSA); meta-analysis of GSAs (META-GSA); gene set of interest (GS, containing genes 3,5,…); complementary gene set (GS’, containing genes 1, 2, 4,…); gene-level statistic of gene g and study s (Γg,s); “enrichment score” (ES) as test statistic for GSA in study s.
Within each GSA, a common set of genes GS, identical for all studies s, was compared to another set GS`of genes, possibly all remaining genes or to a non-associated background. A gene-level statistic Γg,s for each gene g in each study s was constructed from θm|g,s, Tm|g,s or pm|g,s, respectively (e.g. the lowest observed marker-specific pm|g,s-value for all markers of gene m|g). A GSA test statistic for each study ESs, e.g. an enrichment score, was calculated from Γg,s. Usually, p-values pGS,s(ESs) are gained from one-sided tests of H0,s: ESs = 0 versus e.g. HA,s: ESs>0, i.e. the alternative is an enrichment of low pm|g,s-values of genes from GS.
A meta-analytical approach with several levels such as markers, genes, and gene sets is necessarily rich in notation. Thus the relationship between GWAS, GSA, and META-GSA is summarized in Fig 1. All the important notation is listed in Table 1.
The term nominal significance is used to indicate significance without correcting for multiple testing.
Aim and principle of META-GSA
The aim of META-GSA is to increase statistical evidence by pooling p-values of GSAs (pGS,s reflecting GS-significance, this is equivalent to accumulated marker significance), taking also into account the concordance of association patterns (reflected by the magnitude of pm|g,s and the direction of θm|g,s). We only consider situations in which all studies use the same association measure (e.g. odds ratio).
A well-known p-pooling technique to combine evidence from independent one-sided tests of significance is Fisher’s inverse χ2-method or enhancements thereof [10, 12, 13]. Assuming a p-value p to follow a uniform distribution, if H0 is true, the sum of ns independent tests then follows a χ2-distribution, with df = 2ns degrees of freedom. is meaningful if and only if all related test statistics Ts point towards the same direction in terms of the target measure. Makambi enhanced this method by introducing pre-specified weights ws for each study ( and ws > 0 for all s), to account for imperfect concordance of the direction of the test . More details are given in S1 and S2 Figs and S1 Text.
The principle of META-GSA is to make use of the test statistic by allocating both necessary aspects (GS-significance and concordance of association patterns) to its parameters:
- ○. pGS,s representing GS-significance:
- Testing in GSA is usually performed one-sided, thus study-specific pGS,s are used as the p-values to be combined.
- ○. ws representing concordance of association patterns
- (patterns of all or a selected number of observed marker-specific associations of markers, respectively genes, contained in GS)
PDR: A single quantity combing significance and direction of association
A core element of META-GSA is the combination of significance (p-values: pg,s) and direction (e.g. sign of association measure) of an observed gene-specific association in a manner similar to a correlation coefficient, such that zero represents no evidence and the limits +1 and -1 represent strong evidence of positive or negative association. Thus we define a directed reversed p-value (PDR)
This definition proved superior to two alternative definitions, which are presented in S2 Text.
The workflow of META-GSA
To perform META-GSA, the following four steps are necessary to select and concentrate marker-specific information up to study-specific weights ws, to determine the null distribution of the test statistics , as well as to assess the overall significance:
- step I) From the marker- to the gene-level:
- Combination of significance (p-value) and direction (sign of effect estimate) to a directed reverse p-value (PDR) for each gene;
- step II) From the gene-level to between-study concordance:
- Determination of study weights from PDRs,
- step III) Performance of significance testing for a GS and
- step IV) Correction for multiple testing.
Although summarizing marker-specific association results is necessary as an intermediate step when performing META-GSA, it is not our aim to perform gene-level tests of association. Gene-level statistics are considered only to assess the concordance of association patterns between studies.
Step I: Combination of significance and direction to a PDR for each gene
Marker information is aggregate to a gene-level statistic for each gene and each study by calculating PDRs , since often several markers are allocated to a gene. In general, there are three possible strategies of marker selection / aggregation with the aim of calculating PDRs:
- ○. Choosing the best (most significant) marker for a gene
- ○. “Averaging” over all markers allocated to a gene
- ○. “Averaging” over only “promising” markers allocated to a gene
For META-GSA it is advisable, but not essential, to apply the same strategy of marker selection / aggregation that was used for individual GSAs.
This step, as stated previously, is solely to assess the concordance of association patterns. This is the purpose of “PDR-profiles”, one for each study, comprising all genes in GS.
Choosing the best (most significant) marker for a gene.
Gene-level direction dg,s: The determination of the direction dg,s is a bit more complicate. In the simplest case, all representative markers are the same across the studies. The direction dg,s is then the sign of the association point estimate (if H0: θm,s = 0 as for log(odd ratio)): (3b)
If the representative markers are not the same, LD between these markers needs to be taken into account to determine dg,s. As measure of choice for LD, we consider r the correlation between the alleles of marker and . A motivation for this choice is given in S3 Text.
Please note, if the representative markers of one study is in linkage equilibrium to any selected marker for another studies , the direction dg,s = 0. Regardless of the observed significance of this marker the PDR will get .
“Averaging” over all markers allocated to a gene.
Another way to generate a single gene-level statistic is to combine evidence for association of all markers assigned to a gene .
Gene-level significance pg,s: Performing the weighted Fisher inverse χ2-method for single marker p-values pm|g,s is one possibility. Thus the sum is calculated and a p-value pg,s is derived from a χ2-distribution. Weights vm|g,s are assigned to markers in order to account for pairwise LD and the LD-block structure, respectively. For mSUMSTAT  or SLAT  permutation approaches to determine these marker weights were previously proposed, but these are time consuming. Instead we choose vm|g,s proportional to the inverse of the sum of r2 across any pair of markers in the same LD-block: (m´ indicating any marker within the same LD-block). Thus the sum of marker weights ∑g,svm|g,s is equal to the “number of independent markers” which also defines the degrees of freedom for the considered χ2-distribution.
Hence, the p-values for a gene (within each study) is, in contrast to Eq 3a, a function of all markers: (4a)
Gene-level direction dg,s: The direction dg,s is simply determined as the weighted mean of marker specific direction (as calculated according Eq 3c), using the same marker weights vm|g,s as before: (4b)
“Averaging” over only “promising” markers allocated to a gene.
The same procedures as just explained can be applied restricted to “promising” markers (e.g. filtered for pm|g,s<0.05).
Re-orientate the gene-level directions dg,s across studies.
To be able to calculated comparable PDRs, one needs to re-orientate these gene-level directions dg,s along a “profile of reference directions”, represented by the mean of directions across studies for each gene: . In the case the is negative for a gene g*, all dg*,s (direction within a study) are multiplied with -1, to ensure a positive value indicating, that a marker is pointing towards the reference direction (mean of all studies).
As soon as the direction dg,s and the significance pg,s of all genes and for each study are determined, PDR-values can be calculated according to Eq 1 and comprised (for genes in GS only) in a vector of observed gene-phenotype associations . It is important to note that a true “PDR-profile” of the GS of interest is unknown. The average profile is used as reference instead.
Step II: Determination study weights ws from rank correlations of PDRs
Determination of the rank correlation of PDRs between studies.
Next, we have to quantify the pairwise concordance of observed PDR-profiles between studies. For this, we simply calculate all pairwise correlations τs,s’.
Regarding as a function of the true association θm,s, the sample size nm,s, perhaps population stratification, and the fitted statistical model, PDRs may be “scaled” differently between studies. Hence, we propose applying Kendall’s τ. Alternative correlation measures were also investigated (see S2 Text).
Deriving study weights ws by principal component analysis.
Once the correlation matrix is found, we use principal component analysis (PCA) to determine the load of each study on a common but unknown general PDR-profile of the GS. We assume that these loads can be represented by the first principal component (PC1s). The corresponding eigenvalue (EV1) is ns-times the fraction of the variance accounted for by PC1s. Only in the case of perfect pairwise correlation between (total concordance of) PDR-profiles between all studies s, this proportion will be 1 and hence EV1 = n2 and PC1s = 1. We interpret the fraction as “effective number of studies”.
This satisfies the condition that the sum of weights is equal to the effective number of studies: . However this does not satisfy and 1≥ws≥0 for all s as would be necessary for Makambi’s enhancement of Fisher’s inverse χ2-method.
Step III: Performing significance testing
The assumption that the test statistic asymptotically follows a χ2-distribution is based on a few conditions: The p-values are independent, identically distributed (iid), and follow a uniform distribution under H0. The weights ws are pre-specified, range between 0 and 1, and sum up to one ( and ws≥0 for all s). Thus the test statistic is strictly non-negative: .
Since the assumptions for ws and are not fulfilled, the null distribution of the test statistic M0 needs to be determined by a permutation procedure, permuting the allocation of genes to GS (to determine ws) and drawing pGS,s-values by random from a uniform distribution. The statistical significance is expressed by the permutation p-value ppermute,GS, which is assessed by the fraction of permutations achieving a test statistic Mj (j = 1 to x) at least as extreme as the non-permuted (original) test statistic M0. Details and techniques to reduce the computational burden are given in S4 Text and S5 Text.
Step IV: Correction of multiple testing
Correction for multiple testing is required if several GSs have been investigated. Because a single genes can appear across multiple gene sets and genes can be in LD to each other, ppermut,GS-values are maybe not independent. Application of a Bonferroni or Šidák adjustment for multiple testing would lead to overcorrection. We propose applying the bootstrap method of Storey and Tibshirani [17, 18] to estimate the proportion of true null hypotheses π0 and use this in a Bonferroni-like manner to correct p-values for each GS: (6)
Details of how π0 is derived are given in S6 Text.
We performed power simulations in order to compare the performance of META-GSA and the simple p-pooling of GSA results. To this end, we set up 20 different scenarios for “patterns of true marker-phenotype associations” investigated by GSAs based on GWAS data in two to ten studies (see Table 2). For simplicity, we assumed only one genotyped marker for each gene. These patterns comprised the following situations (H0 and HA are expressed according to a competitive test strategy):
- H0: no enrichment in the GS of interest: F(RR|GS) = F(RR|GS´) scenarios 1–3
- HA: markers with RR>1 only in GS: F(RR|GS)≠F(RR|GS´) scenarios 4–16
- HA/H0: mixed structured gene sets (markers with RR≠1 in GS and GS’) scenarios 17–20
where F(RR|GS) is the cumulative distribution function of “true” relative risks (RR) of markers/genes in GS. For example, in scenario 1 the true relative risk (RR) of all markers was set to RR = 1, in scenario 2 to RR = 1.2. Both scenarios represent a H0 situation for a competitive test strategy. In scenario 5 we assumed RR = 1.2 for half of the markers of GS and a RR = 1 for the other half and all of the markers of GS´. Not necessarily the same markers of GS needed to be truly associated in each study. However, a true association only appears in genes belonging to the gene set of interest, which is an idealized situation for the competitive test strategy.
To complete each simulation run, we needed to perform the following steps:
For the MA step we considered samples of 500 cases and 500 controls. The minor allele frequencies of all markers were set to 30%. The distribution of the genotypes was assumed to be in Hardy-Weinberg Equilibrium (HWE). The prevalence of a binary phenotype in the population was set to 5%. The relative risk was pre-specified and ranged between 1.0 and 1.5. The direction of an association was chosen randomly, assuming both directions as being equally likely.
In the GWAS step we considered 100 genes with one genotyped marker each. For each of these 100 markers, we performed a two-sided Armitage trend test, yielding pm,s-values and fitted log-additive models to determine the direction of observed association (dm,s = +1 if OR≥1 and dm,s = -1 if OR<1). Markers strongly deviating from HWE in controls (pm,s<1×10−7) were excluded from further analysis.
In the GSA step, we applied a one-sided Wilcoxon’s rank-sum test  to investigate enrichment of low pm,s-value in GS (yielding pGS,s). The gene set GS under investigation always consisted of 10 genes, the remaining 90 genes were considered as GS’.
In the META-GSA step, marker-specific pm,s-values and directions dm,s were subsequently combined to PDR-values. Finally, we determined gene-set-specific pGS,s-values. There was no need to correct for multiple testing because we only regarded a single GS.
Subsequently and for the purpose of comparison, we simply summarized pGS,s-values applying Fisher’s inverse χ2-method to pSPP-values.
As an alternative approach, one may consider a so-called mega-analysis (individual participant data meta-analysis) directly based on all participants’ geno- and phenotype data, if or when available, as gold standard. Advantages of such an approach would be the use of only one statistical model throughout the analysis, the inclusion of available unpublished data, a larger sample size that allows subset analysis, the avoidance of ecological bias, and potentially higher power [20, 21]. However, it has been demonstrated mathematically that summarizing summary results, i.e. performing a meta-analysis for main effects of each marker, is as efficient as pooling genotype data, i.e. performing a mega-analysis for each marker, if no covariates are considered . This is similarly true for investigations of gene x environment interactions . In addition, limitations associated with mega-analysis need to be dealt with, such as ethics or confidentiality constraints in sharing study data, comparable data quality and completeness, or sufficient system capacity for storage and transfer. Analyzing studies separately and summarizing their results in a meta-analysis also enables adjustment for different sets of covariates for different studies, as well as allowing for heterogeneous genetic effects between studies .
Since the single marker associations of all studies are available (which is a prerequisite for META-GSA), we also accomplished a random effects meta-analysis for each marker and then performed a single GSA (denoted as pooledGWAS-GSA) for the purpose of comparison with META-GSA and SPP. Thus the combining of studies is switched to the level of markers, respectively genes. This approach is located halfway between an individual participant data meta-analysis and an aggregate study-level data meta-analysis.
We considered 500 simulations for each scenario. The percentages of pMETA-GSA-values, ppooledGWAS-GSA-values or pSSP-values lower than the level of significance of 5% were used as power for the respective test. Given a true type I error of 5%, the observed type I error may range from 3% to 7% (95% random dispersion interval for 500 simulations).
An overview of the scenarios, the achieved type I error, and power is presented in Table 2.
Type I error
As expected, in about 5% of the simulations, a GS is falsely discovered when the pattern of association does not differ between GS and GS’ (scenarios 1–3). Any observed type I error is within random variation according to the number of simulations. However, in the case of mixed structures with risk markers in GS and GS’ (scenario 20), SPP, pooledGWAS-GSA and META-GSA seem to be conservative (type I error ranges between 2.6% and 4.8%).
Interestingly, there seems to be almost no systematic overlap in discovered GSs between META-GSA and SPP. For example, in scenario 1 the GS was falsely discovered in 20 simulation runs by each approach. However, GS was concordantly discovered by META-GSA and SPP in only four runs (see Table 3). Furthermore, almost no correlation between pMETA-GSA and pSPP was observed (see Fig 2). False-positive gene sets found by META-GSA and SPP only partially overlap. We assume that this may result at least in part from respecting the concordance of association patterns.
The numbers of simulations out of a total of 500 are depicted. Gene sets are classified as significant (p ≤ 0.05) or not significant using SPP and META-GSA.
Next, we consider those scenarios in which patterns of association differ between GS and GS’ (scenarios 4–19). The power of a META-GSA exceeded that of SPP in all considered scenarios, that of a pooledGWAS-GSA was always in between. Only in the presence of stronger marker-specific associations (e.g. true RR = 1.5) and a larger number of studies (scenarios 8, 15, and 16) were pooledGWAS-GSA and META-GSA comparable in terms of power.
For example, the simulated power of META-GSA was almost 60% if five markers with a true RR = 1.5 belong to GS (scenario 8). For SPP, we achieved a power of only 37% and a power of 56% for a pooledGWAS-GSA in the same scenario. The GS was identified by META-GSA in 294 out of 500 runs, while SPP identified the GS in only 184 of those 294 runs (approximately two out of three runs). In every simulation run in which META-GSA failed to identify the GS correctly, SPP also failed (see Table 3). Using SPP, the largest p-value found in these 294 runs was pSPP = 0.62, indicating that gene sets are missed by SPP, which were identified by META-GSA (see Fig 3).
The numbers of simulations out of a total of 500 are depicted. Gene sets are classified as significant (p ≤ 0.05) or not significant using SPP and META-GSA.
For the pattern combination of scenario 8, we also simulated the power dependent on the number of studies to be combined (scenarios 9–16) and found a steady increase in power for all three approaches. META-GSA always proved superior to both SPP and pooledGWAS-GSA.
Regarding scenarios 4 to 8, we found that the more GS differs from GS’, the larger the benefit becomes in terms of power of META-GSA compared to SPP.
Finally, we simulated three patterns with varying associations in both gene sets GS and GS’ (scenarios 17–19). Note that a third to a half of all the genes in GS’ were set to be associated, which is an extremely and unrealistically high proportion. We found a remarkable loss of power if the GS of interest is dominated by other gene sets in GS’, meaning that truly associated genes are more frequent in GS’ than in GS (scenario 19). The power of a META-GSA even falls below the level of significance, as was the case for a pooledGWAS-GSA. This was to be expected, since we simulated according to the “competitive hypothesis”. SPP behaves better in such a scenario. However, a demonstrated power of 8% is far from promising though.
Application: GO-Pathways and Lung Cancer
Previous comparison of GSA methods
Data from four case-control GWASs of lung cancer risk were previously implemented  to compare the performance of four different GSA approaches: EASE , GenGen (developed from GSEA) , SLAT , and mSUMSTAT  a modification of the SUMSTAT approach . The lung cancer cases and controls were taken from the Central Europe Study , a study of the Lunenfeld-Taneberg Research Institute / Toronto , the German Lung Cancer Study [27, 28], and a study of the MD Anderson Cancer Center / Texas (MDACC Study, non-small cell lung cancer cases and controls) . This research was granted a renewed protocol approval from the Dartmouth Committee for Protection of Human Subjects on 7/30/2014 with id STUDY00023602. The specific protocol for this research covers meta-analysis of data from several studies for which participants previously signed consent documents and for which genome-wide association studies have already been completed. The current research covers the combination of data across studies. These studies are shared amongst members of the Transdisciplinary Research in Cancer of the Lung / International Lung Cancer Consortium (TRICL / ILCCO). Data from these four studies were combined into two data sets: 1) Central Europe and Toronto (CETO); and 2) Germany and MDACC (GRMD), in order to create adequate sample sizes and statistical power to detect associations in the pathway analyses. More details on the studies are provided by Fehringer et al. .
In summary, the genotypic information from 8,542 individuals (3,897 cases and 4,646 controls) was used. Genotyping of more than 300,000 SNPs assigned to about 18,000 genes was performed using either the Illumina HumanHap300 or HumanHap550 chips. In total, 7,163 genes were assigned to at least one of 421 GO level 4 pathways (obtained from the GenGen website), comprising 15 to 200 genes to avoid testing overly large or small gene sets. Only a single gene set (GO0015464: acetylcholine receptor activity) was found nominally significant in both data sets by two of the four GSA methods. Some biological interpretation of how acetylcholine receptor activity can influence the risk for lung cancer was given .
The resulting files, containing p-values according to gene set and method as well as SNP-to-gene assignment, gene-to-pathway assignment, and SNP-wise association, were the same as used previously. The only information needed which had not been used previously was the LD measure between markers within genes to homogenize PDRs. These quantities were therefore estimated based on a sample of 484 Caucasians used as controls in the German Lung Cancer Study. All controls were genotyped for 561,466 SNPs using the Illumina Sentrix HumanHap550 Beadchip. How we corrected for multiple testing is outlined in S6 Text.
At the nominal level of significance of α = 5%, 30 out of all 421 gene sets were found to be significant according to ppermut by at least one of the GSA methods (EASE, GenGen, SLAT, mSUMSTAT), when only the most significant marker was selected to determine concordance of the direction of association. Remarkably, 26 gene sets thereof were identified irrespective of which GSA approach was chosen. In contrast, 134 pathways would have been identified as nominally significant by at least one of the GSA methods when using SPP and only 8 by all of the methods. This impressively demonstrates the selective character of META-GSA. Moreover, a strong variation in the number of nominally significant gene sets between GSA methods was observed when analyzed by SPP (28 to 72 gene sets). This maybe can be attributed to the principles of the applied GSA methods. EASE or GeneGen are based on the count of genes classified as significant or not significant. Analyzed by SPP the number of nominal significant gene sets are 32 or 28, respectively. SLAT an mSUMSTAT make use of the magnitude of single marker significance (p-values or χ2-values). Analyzed by SPP the number of nominal significant gene sets are 89 or 72, respectively.
Applying META-GSA, results were consistent irrespective of which GSA method was used. Using all SNPs at a gene, 46 to 47 gene sets were nominally significant; using only significant SNPs at each gene, 22 to 24 gene sets appeared nominally significant. Using the most significant SNPs for each gene, the number of significant gene sets lay in between (27 to 30 gene sets, see Table 4). This demonstrates that META-GSA substantially reduces the influence of the GSA method applied on the final results. On the contrary, the dependency of SPP on the GSA method applied can clearly be demonstrated for GO0015291, for which pSPP, adjusted for multiple testing ranges, from 0.0003 to 1.0. Further details demonstrating the inhomogeneity between GSA-methods and the homogenizing effect of META-GSA are given in S7 Text.
The homogenizing effect of META-GSA can also be demonstrated by the strong pairwise rank correlations between the pMETA-GSA-values of the GSA-methods, when using the most significant SNP to determine concordance (ρ > 0.85) in contrast with only moderate correlations regarding pSPP-values (ρ ≈ 0.5; see Table 5).
A comparison here with pooledGWAS-GSA cannot be made, since gene set analyses were previously applied to the presented data only for each study, but not for a pooled GWAS.
Using all markers allocated to a gene to determine concordance and applying the GSA approach EASE revealed GO-pathway GO0015291 (“transmembrane transporter activity”) to be significantly enriched with associated genes (ppermut = 0.0001; pMETA_GSA = 0.315). When GSA was performed using another method, significance was reached nominally (p permut ≤ 0.001) but not when adjusted for multiple testing (pMETA_GSA ≈ 0.2). In comparison with META-GSA, even nominal significance was missed applying SPP. For the previously identified gene set GO0015464 , we achieved almost the same nominal ppermut-values as for GO0015291 and across GSA methods.
It is increasingly recognized that GSA can extend GWAS approaches by incorporating existing knowledge of biological processes, with the aim of identifying disease-related pathways. GSA has gained great popularity and several approaches have been proposed. Although the pros and cons have been discussed [9, 19] and points to improve have been formulated , it has only been mentioned that there is a need to replicate pathway association findings to avoid false positive results . According to our knowledge, there is no formal method to combine the results of several GSAs. The basic criticism on applying simple p-pooling (SPP) is a lack in interpretability of results when single markers differ in their attributed role as risk or predictive factor. Here we propose the quantitative approach META-GSA to combine such results, respectively GS-significance, by incorporating concordance of single-marker association patterns between studies, relevant for the GS of interest.
The main steps of META-GSA are first to determine the concordance of association patterns; second, to use these to derive a weight for each study; and third, to apply a weighted version of Fisher’s inverse χ2-method to summarize significance of GSAs in a single meta-analytical p-value. Thus significance of GSA-results and concordance of single-marker association are combined. META-GSA can be further considered as a conditional approach, testing GS-significance conditional to, or in the presence of concordance of association patterns.
Since the mathematical conditions of the weighted version of Fisher’s inverse χ2-method are not fulfilled, the application of a CPU-intensive permutation procedure is required. Thus we investigated the effort and benefits of META-GSA in comparison with SPP, which is fast but does not address concordance of association patterns in any way. Both methods keep type 1 error at the specified level. However, under H0 the results of META-GSA and SPP were found to be almost uncorrelated. We believe this to be down to addressing or ignorance of the concordance of association patterns, respectively. Moreover, META-GSA was found to be more powerful than SPP. The greater the number of studies combined, the larger the advantages in power became. The gain in power can be clearly attributed to the fact that META-GSA is more efficient at separating the “wheat from the chaff” for gene sets in terms of false replicate findings when concordance of association patterns is not given. Given a unique truly associated gene set, META-GSA successfully yielded a significant result twice out of 3 times, whereas SPP failed to find even one of these two significant results.
We also compared META-GSA to a pooledGWAS-GSA approach. For the latter, the combining of studies is switched to the level of markers, followed by a single GSA performed on the pooled marker-specific associations. In general, we found META-GSA to outperform pooledGWAS-GSA. This is similar to the fact that a mega-analysis does not outperform a meta-analysis when testing for single marker association . One further advantage of META-GSA is the fact that heterogeneity in the strength of association for single markers or genes, respectively, between studies does not necessarily cause lower power, as long as other genes belonging to GS compensate such deficiency. “Between-study heterogeneity … can offer valuables insight for further clarification of gene-disease associations” .
Furthermore, META-GSA is applicable to any GSA method selected, even those using individual participants’ genotype data, which may prove to be more suitable and more powerful than methods based on GWAS summary results (pooledGWAS-GSA) [2, 32]. Resting the GSA on common effect estimates can become critical in the case of strong study heterogeneity for few or many markers, since the existence of a common marker-specific association in such a situation is doubtful. For META-GSA, study heterogeneity results in low concordance of the patterns of study-specific association estimates and subsequently in low study weights, which simply reduces the power, however without violating such a critical assumption.
Nevertheless, the simulations were configured to compare a gene set of interest to all other genes. If the investigated gene set is truly dominated by other, containing more strongly associated genes, the simulation revealed META-GSA as performing poorly, although META-GSA is not based on the so-called “competitive hypothesis”. This disadvantage is related to the genetic architecture of complex diseases that always needs to be taken into account when performing pathway analysis [1, 9]. GSA is likely to be informative if the interplay of hundreds of genes in pathways contributes to the susceptibility to a disease or trait. In the presence of a single, strongly associated gene, GSA methods may be of less interest to research that addresses the “competitive hypothesis”. However, such a drawback can be obviated by e.g. excluding such genes and other genes of already identified pathways from the analysis. GSA methods addressing the “self-contained hypothesis” are said to be more appropriate, as is the case in studies limited to candidate genes. Note that META-GSA is applicable to both types of hypotheses underlying a GSA. META-GSA, however, has only limited use for pleiotropic investigation, since concordance of gene effects between all studies is regarded as important. This matters when considering only one outcome entity. Summarizing genetic associations with distinct but putatively related traits, one may allow susceptibility loci to be associated in subsets or in different directions for different traits .
Applying META-GSA to (only) two independent GWAS/GSA investigations into lung cancer revealed that META-GSA is more robust than SPP compared to the previously used GSA method. Furthermore, we discovered a positive relationship between the sufficiency in condensing single-SNP information to a gene-level statistic and the likelihood of identifying a related gene set. Nevertheless, this observation is related to the general discussion of how to condense SNP information within a gene, addressing LD structures, gene size and length, overlapping genes, or the statistical method applied [1, 9, 30].
All the same, META-GSA has some critical points that need to be mentioned. First, it is necessary to estimate LD between neighboring markers to be able to calculate PDRs of differing markers across studies. However, patterns of LD in the human genome are said to be noisy, they vary from region to region, and are difficult to determine for broadly defined ethnicities such as Caucasians, Africans, and Asians. For example, LD in non-African populations extends over longer genomic distances than in Africans . Since it is known that LD-patterns in human subpopulations are different , this can be problematic if one aims to combine results across different ethnicities or if the source population of a study is known to be admixed. To circumvent the problem of multiple ethnicities, we considered only studies of Caucasians in our analysis.
Secondly, a number of alternative ways to combine marker-level estimates to a gene-level statistic have already been published. This includes the inverse gamma method (GM), the rank truncated product method (RTP), the truncated product method (TPM); the adaptive rank truncated product method (ARTP), or any approach for that matter, in which the number of true null hypotheses  is estimated. Any of these methods could have been implemented instead. The pros and cons of several methods have been discussed elsewhere [36–39].
Thirdly, the use of a permutation procedure is time and CPU-intensive. We therefore developed a “quick but dirty” test to avoid the computational burden caused by unnecessary permutations. This test consists of a marginal combination of significance and concordance of association patterns (details are given in S8 Text). However, we must recognize that the ability to sort out only “bad candidate gene sets” for the permutation was fairly limited. That said, we could speed up the program by implementing early-stopping rules. That again also means that the resulting p-values are insufficiently approximated and imprecise, in particular for gene sets clearly not significant.
Fourthly, p-values are the only results. As with most GSA approaches META-GSA does not deliver any effect estimation.
Fifthly, gene-set analysis results are often prone to sources of bias including a) unequal gene set size, a) LD patterns and c) overlapping genes [6, 30, 40]. To be robust against a) and b) multiple comparison procedures, such as Sidak's correction ; decorrelation tests ; or omnibus tests, such as Fisher’s inverse Chi2-method (e.g. SLAT ) are used. To be robust against a), b) and c) permutation procedures (e.g. ALIGATOR ) or single model approaches comprising all SNPs allocated to a gene-set and correcting for the number of effective markers (e.g. kernel methods) can be used. An additional source of bias results from d) low resolution knowledge bases. To perform GWASs and GSAs one needs to annotate SNPs to genes which are annotated to pathways in a static way, referring to public databases. In contrast, genes act dynamically and may have for instance several transcripts which can be active or passive in a certain pathway. Thus, a static annotation maybe doesn’t matches well to a dynamic biological process. Furthermore, e) both SNP-to-gene as gene-to-pathway annotation databases reflect at the best the state of knowledge and should therefore considered as incomplete (to an unknown degree). Because GSA methods, as they have been proposed, use genomewide genotype data therefore are f) unable to model dynamic response of a gene in the course of disease development. In the same they are g) limited to model response to external stimuli . These additional sources of bias (d-g) affect all GSA approaches. Hence, META-GSA is concerned (biased) to the same extend as the GSA approaches aimed to be combined.
All in all, we demonstrate that META-GSA may be a powerful add-on tool in the research of the genetic architecture of complex traits or diseases. One can attribute its benefit to the incorporation of the concordance of single-marker association into the test statistics.
All programs were implemented in SAS 9.3 (SAS Institute, NC, USA) and are provided as S11 Text.
S1 Fig. Graphical representation of Fisher’s p-pooling method for independent and identical target measures.
The test statistic M is the sum of -2ln(ps) for s = 1 to ns studies, which follows a χ2-distribution, assuming all tests point towards the same (common) direction (identical target measures). Note: the lower the p-value, the higher the statistical evidence and the longer the arrow.
S2 Fig. Graphical representation of Makambi’s weighted p-pooling method for different, hence imperfect correlated target measures.
The test statistic M is the sum of -2wsln(ps) for s = 1 to ns studies, which follows a χ2-distribution, allowing for some deviations in the direction of the target measures. No study points towards a common direction.
We would like to thank all the individuals who participated in the four studies, as well as the patients’ physicians, who made this meta-analysis possible. We would like to thank all the researchers involved in the four studies and who made their data available to us.
Conceived and designed the experiments: A. Rosenberger. Performed the experiments: A. Rosenberger. Analyzed the data: A. Rosenberger SF GF. Contributed reagents/materials/analysis tools: CA PB JH RH TM MMN A. Risch HB. Wrote the paper: A. Rosenberger HB.
- 1. Wang K, Li M, Hakonarson H. Analysing biological pathways in genome-wide association studies. Nature reviews Genetics. 2010;11(12):843–54. pmid:21085203.
- 2. Ramanan VK, Shen L, Moore JH, Saykin AJ. Pathway analysis of genomic data: concepts, methods, and prospects for future development. Trends in genetics: TIG. 2012;28(7):323–32. pmid:22480918; PubMed Central PMCID: PMC3378813.
- 3. Hosack DA, Dennis G Jr, Sherman BT, Lane HC, Lempicki RA. Identifying biological themes within lists of genes with EASE. Genome Biol. 2003;4(10):R70. pmid:14519205
- 4. Lewinger JP, Conti DV, Baurley JW, Triche TJ, Thomas DC. Hierarchical Bayes prioritization of marker associations from a genome-wide association scan for further investigation. Genet Epidemiol. 2007;31(8):871–82. pmid:17654612
- 5. Luo L, Peng G, Zhu Y, Dong H, Amos CI, Xiong M. Genome-wide gene and pathway analysis. European journal of human genetics: EJHG. 2010;18(9):1045–53. pmid:20442747
- 6. Peng G, Luo L, Siu H, Zhu Y, Hu P, Hong S, et al. Gene and pathway-based second-wave analysis of genome-wide association studies. European journal of human genetics: EJHG. 2010;18(1):111–7. pmid:19584899
- 7. Sohns M, Rosenberger A, Bickeboller H. Integration of a priori gene set information into genome-wide association studies. BMC Proc. 2009;3 Suppl 7:S95. pmid:20018092
- 8. Wang K, Li M, Bucan M. Pathway-based approaches for analysis of Genomewide Association Studies. Am J Hum Genet. 2007;81(6).
- 9. Fridley BL, Biernacka JM. Gene set analysis of SNP data: benefits, challenges, and future directions. European journal of human genetics: EJHG. 2011;19(8):837–43. pmid:21487444
- 10. Petitti DB. Meta-analysis, decision analysis, and cost-effectiveness analysis methods for quantitative synthesis in medicine. New York: Oxford University Press; 2000.
- 11. Rothman KJ, Greenland S. Modern epidemiology. 1998;2. ed.
- 12. Makambi K. Weighted inverse chi-square method for correlated significance tests. Journal of Applied Statistics. 2003;30(2):225–34.
- 13. Hedges LV, Olkin I. Statistical methods for meta-analysis. 1985.
- 14. Lehne B, Lewis CM, Schlitt T. From SNPs to genes: disease association at the gene level. PLoS One. 2011;6(6):e20133. pmid:21738570; PubMed Central PMCID: PMC3128073.
- 15. Fehringer G, Liu G, Briollais L, Brennan P, Amos CI, Spitz MR, et al. Comparison of pathway analysis approaches using lung cancer GWAS data sets. PLoS One. 2012;7(2):e31816. Epub 2012/03/01. pmid:22363742; PubMed Central PMCID: PMCPMC3283683.
- 16. De la Cruz O, Wen X, Ke B, Song M, Nicolae DL. Gene, region and pathway level analyses in whole-genome studies. Genetic epidemiology. 2010;34(3):222–31. pmid:20013942; PubMed Central PMCID: PMC4061611.
- 17. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 2003;100(16):9440–5. pmid:12883005
- 18. Publishing SAS. SAS/Stat 9.22 User's Guide the MULTTEST Procedure: SAS Publishing; 2010.
- 19. Ackermann M, Strimmer K. A general modular framework for gene set enrichment analysis. BMC Bioinformatics. 2009;10.
- 20. Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221. pmid:20139215.
- 21. Blettner M, Sauerbrei W, Schlehofer B, Scheuchenpflug T, Friedenreich C. Traditional reviews, meta-analyses and pooled analyses in epidemiology. Int J Epidemiol. 1999;28(1):1–9. pmid:10195657.
- 22. Lin DY, Zeng D. Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data. Genetic epidemiology. 2010;34(1):60–6. pmid:19847795; PubMed Central PMCID: PMC3878085.
- 23. Sung YJ, Schwander K, Arnett DK, Kardia SL, Rankinen T, Bouchard C, et al. An empirical comparison of meta-analysis and mega-analysis of individual participant data for identifying gene-environment interactions. Genetic epidemiology. 2014;38(4):369–78. pmid:24719363.
- 24. Lee S, Teslovich TM, Boehnke M, Lin X. General framework for meta-analysis of rare variants in sequencing association studies. Am J Hum Genet. 2013;93(1):42–53. pmid:23768515; PubMed Central PMCID: PMC3710762.
- 25. Tintle NL, Borchers B, Brown M, Bekmetjev A. Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16. BMC Proc. 2009;3 Suppl 7:S96. pmid:20018093
- 26. Hung RJ, McKay JD, Gaborieau V, Boffetta P, Hashibe M, Zaridze D, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008;452(7187):633–7. pmid:18385738
- 27. Koshiol J, Rotunno M, Consonni D, Pesatori AC, De Matteis S, Goldstein AM, et al. Chronic obstructive pulmonary disease and altered risk of lung cancer in a population-based case-control study. PLoS One. 2009;4(10):e7380. pmid:19812684
- 28. Sauter W, Rosenberger A, Beckmann L, Kropp S, Mittelstrass K, Timofeeva M, et al. Matrix metalloproteinase 1 (MMP1) is associated with early-onset lung cancer. Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology. 2008;17(5):1127–35. pmid:18483334
- 29. Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T, et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet. 2008;40(5):616–22. pmid:18385676
- 30. Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics. 2011;98(1):1–8. pmid:21565265
- 31. Ioannidis JP, Patsopoulos NA, Evangelou E. Heterogeneity in meta-analyses of genome-wide association investigations. PLoS One. 2007;2(9):e841. pmid:17786212; PubMed Central PMCID: PMC1950790.
- 32. Freytag S, Manitz J, Schlather M, Kneib T, Amos CI, Risch A, et al. A network-based kernel machine test for the identification of risk pathways in genome-wide association studies. Human heredity. 2013;76(2):64–75. pmid:24434848; PubMed Central PMCID: PMC4026009.
- 33. Bhattacharjee S, Rajaraman P, Jacobs KB, Wheeler WA, Melin BS, Hartge P, et al. A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits. Am J Hum Genet. 2012;90(5):821–35. pmid:22560090; PubMed Central PMCID: PMC3376551.
- 34. Wall JD, Pritchard JK. Haplotype blocks and linkage disequilibrium in the human genome. Nature reviews Genetics. 2003;4(8):587–97. pmid:12897771.
- 35. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, et al. The structure of haplotype blocks in the human genome. Science. 2002;296(5576):2225–9. pmid:12029063.
- 36. Zaykin DV, Zhivotovsky LA, Czika W, Shao S, Wolfinger RD. Combining p-values in large-scale genomics experiments. Pharmaceutical statistics. 2007;6(3):217–26. pmid:17879330; PubMed Central PMCID: PMC2569904.
- 37. Dudbridge F, Koeleman BP. Rank truncated product of P-values, with application to genomewide association scans. Genet Epidemiol. 2003;25(4):360–6. pmid:14639705
- 38. Yu K, Li Q, Bergen AW, Pfeiffer RM, Rosenberg PS, Caporaso N, et al. Pathway analysis by adaptive combination of P-values. Genetic epidemiology. 2009;33(8):700–9. pmid:19333968; PubMed Central PMCID: PMC2790032.
- 39. Hsueh HM, Chen JJ, Kodell RL. Comparison of methods for estimating the number of true null hypotheses in multiplicity testing. Journal of biopharmaceutical statistics. 2003;13(4):675–89. pmid:14584715.
- 40. Khatri P, Sirota M, Butte AJ. Ten years of pathway analysis: current approaches and outstanding challenges. PLoS computational biology. 2012;8(2):e1002375. pmid:22383865; PubMed Central PMCID: PMC3285573.
- 41. Holmans P, Green EK, Pahwa JS, Ferreira MA, Purcell SM, Sklar P, et al. Gene ontology analysis of GWA study data sets provides insights into the biology of bipolar disorder. Am J Hum Genet. 2009;85(1):13–24. pmid:19539887