Evaluating the Cauchy combination test for count data

  • Huda Alsulami ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review & editing

    haalsulami@kau.edu.sa

    Affiliations School of Mathematical Sciences/Centre for Probability, Statistics and Data Science, Queen Mary University of London, London, England, United Kingdom, Department of Statistics, King Abdulaziz University, Jeddah, Makkah, Saudi Arabia

  • Silvia Liverani

    Roles Conceptualization, Methodology, Project administration, Supervision, Writing – review & editing

    Affiliation School of Mathematical Sciences/Centre for Probability, Statistics and Data Science, Queen Mary University of London, London, England, United Kingdom

Abstract

The Cauchy combination test (CCT) is a p-value combination method used in multiple-hypothesis testing that is robust to dependence among the tests. This study evaluates the CCT for independent and correlated count data where the individual p-values are derived from tests based on the normal approximation to the negative binomial distribution. The correlated count data are modelled via copula methods. The performance of the CCT is evaluated in a simulation study that assesses its type 1 error rate and statistical power and compares it with existing methods. Our results indicate that the number of combined tests, the negative binomial success parameter, and the sample size substantially affect the type 1 error rate of the CCT under independence or moderate correlation. The CCT controls the type 1 error rate more tightly as the dependence strength increases in the Gumbel-Hougaard copula. In general, the choice of copula and the strength of its correlation have a strong influence on the type 1 error rates of both the CCT and MinP tests. Our simulation findings support broader application of the CCT under multivariate copulas that model upper-tail dependence with higher correlations. This knowledge may have significant implications for practical applications.

Introduction

Combining p-values from various statistical tests is a fundamental procedure in multiple testing for applied statistics. It is a tool for detecting an overall effect, for example in meta-analysis and bioinformatics. These combination tests unify a large number of p-values into a single p-value, potentially providing a more powerful test than testing each hypothesis separately. Suppose there are m hypotheses to be tested simultaneously. Let H0i and Hai represent the null and alternative hypotheses for the ith variable, respectively, where $i = 1, \dots, m$. Let the m test statistics corresponding to these hypotheses have associated p-values $p_1, \dots, p_m$. The purpose of the p-value combination test is to test:

$$H_0 = \bigcap_{i=1}^{m} H_{0i} \quad \text{versus} \quad H_a = \bigcup_{i=1}^{m} H_{ai} \qquad (1)$$

where H0, the global null hypothesis, holds when all the individual null hypotheses H0i are true, and Ha, the global alternative hypothesis, holds if at least one of the individual alternatives Hai is true. Combining multiple tests provides a comprehensive conclusion about a specific research question. Moreover, it improves statistical power and controls the inflation of type 1 errors. These p-value combination methods inherently account for the number of combined tests, thereby avoiding the need for multiple testing corrections.

In the literature, many p-value combination tests, which differ in their underlying assumptions, have been proposed to combine independent individual p-values [1,2]. Extensions of these combination tests have been developed to include dependence and weights [3–7]. The significance of the global test may be strongly affected when the test statistic of the combined p-values does not appropriately account for the correlations among the individual p-values. It is crucial to investigate the impact of the correlation on the significance of the combined test [8]. For an overview of these methods see [8–11]. Recently, there has been interest in the Cauchy combination test (CCT) [12] due to its advantageous features over other methods in addressing challenges arising from correlations, computations, and sparse signals in high-dimensional settings.

This study set out to evaluate the CCT’s performance for count data. The objective is to investigate the type 1 error rate and the statistical power of the CCT when the individual p-values are derived from test statistics based on the normal approximation to the negative binomial distribution. The study offers insight into the influence of the negative binomial success parameter and the number of individual p-values on the combination test. Moreover, it discusses the implications for power in the case of independent and correlated data.

This paper is organized as follows. The next section, Materials and methods, reviews three p-value combination methods: Fisher, MinP, and the Cauchy combination tests, and introduces the copula methods to construct correlations between count variables. The Simulation study section describes the simulation design. The Results and discussion section presents findings and applications to meta-analysis data. Finally, a conclusion is given in the Conclusion section.

Materials and methods

In this section, we briefly review the CCT and two other p-value combination tests, the MinP and Fisher’s tests. In addition, we introduce the copula methods for producing correlated data.

P-value combination methods

MinP test.

The minimum p-value test (MinP) [13] sorts the individual p-values in ascending order, $p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$. Under the assumption that they are independent and identically uniformly distributed on the unit interval [0,1], the MinP test statistic is p(1), which follows a beta distribution with parameters 1 and m, and its p-value is calculated as $1 - (1 - p_{(1)})^m$.

Fisher’s combination test.

Fisher’s combination test [1] combines non-linear transformations of the m p-values, where each transformation, $-2\log p_i$, has a chi-square distribution with 2 degrees of freedom under the null hypothesis. Therefore, when the p-values are independent, Fisher’s test statistic has the following distribution:

$$T_F = -2\sum_{i=1}^{m} \log p_i \sim \chi^2_{2m} \qquad (2)$$
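As an illustration of the two classical methods described above, the following is a minimal Python sketch (the study itself used R). The function names `minp_pvalue` and `fisher_pvalue` are hypothetical, the input p-values are assumed to be independent and strictly positive, and the chi-square tail probability uses the closed form that is available when the degrees of freedom are even.

```python
import math


def minp_pvalue(pvalues):
    """MinP combined p-value: 1 - (1 - p_(1))^m, valid for independent uniform p-values."""
    m = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** m


def fisher_pvalue(pvalues):
    """Fisher combined p-value for independent p-values (all assumed > 0)."""
    m = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals T_F / 2
    # For even degrees of freedom 2m:
    # P(chi2_{2m} > t) = exp(-t/2) * sum_{k=0}^{m-1} (t/2)^k / k!
    term = total = 1.0
    for k in range(1, m):
        term *= half_stat / k
        total += term
    return math.exp(-half_stat) * total


example = [0.20, 0.04, 0.70]  # made-up p-values for illustration only
print(minp_pvalue(example), fisher_pvalue(example))
```

With a single p-value both functions return that p-value unchanged, which is a quick sanity check on the formulas.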

Cauchy combination test.

The Cauchy combination test (CCT) by Liu and Xie [12] possesses advantageous features over other tests. The CCT is robust against correlation structure and powerful against sparse signals. In addition, it is computationally efficient, which makes it suitable for high-dimensional data analysis. The CCT statistic is a weighted sum of non-linear transformations of the p-values through the tangent function. It is defined as:

$$T_{\mathrm{CCT}} = \sum_{i=1}^{m} w_i \tan\{(0.5 - p_i)\pi\} \qquad (3)$$

where $w_i \ge 0$ and $\sum_{i=1}^{m} w_i = 1$. If no prior information is provided, the CCT becomes the simple average of the transformations, where $w_i = 1/m$. If the weights are random variables that are independent of the individual test statistics, the tail approximation still holds. Under various correlation structures, the correlation has a minimal impact on the tails of the CCT distribution. The tails of the CCT distribution are approximately standard Cauchy, and its p-value is approximated by:

$$p_{\mathrm{CCT}} = \frac{1}{2} - \frac{\arctan(T_{\mathrm{CCT}})}{\pi} \qquad (4)$$

It has been theoretically and empirically demonstrated in [12] that the CCT can effectively control type 1 error rates across different significance levels. The ratio of the size of the CCT, which represents the type 1 error rate, to the significance level approaches 1 as the significance level converges to 0. This indicates its validity in large-scale multiple testing.
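The definition of the CCT and its Cauchy tail approximation can be sketched in a few lines of Python (an illustration only; the study used R). The function name `cct_pvalue` and the optional `weights` argument are hypothetical, and equal weights are assumed when none are supplied.

```python
import math


def cct_pvalue(pvalues, weights=None):
    """Cauchy combination test p-value via the standard Cauchy tail approximation."""
    m = len(pvalues)
    if weights is None:
        weights = [1.0 / m] * m  # equal weights when no prior information is given
    # Weighted sum of tangent-transformed p-values
    stat = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # Upper tail probability of a standard Cauchy variable at stat
    return 0.5 - math.atan(stat) / math.pi


print(cct_pvalue([0.20, 0.04, 0.70]))  # made-up p-values for illustration only
```

Because the transformation is odd around p = 0.5, a pair such as 0.2 and 0.8 cancels and yields a combined p-value of 0.5. Individual p-values extremely close to 0 or 1 produce very large tangent values, so in practice they may need separate numerical handling.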

Under the global null hypothesis, the tail probability of the CCT is approximated by a standard Cauchy distribution, which is valid under the assumptions of bivariate normality of the individual tests and some regularity conditions on the correlation matrix. However, Long et al. [14] broadened the applicability of the CCT to settings where the assumption of bivariate normality may not hold. They demonstrated, both theoretically and by simulation, that the standard Cauchy approximation to the tail probability of the CCT remains valid across a broader range of bivariate copula distributions. These include six popular copulas: the product copula, Farlie–Gumbel–Morgenstern (FGM) copula, Cuadras-Augé copula, normal copula, Ali-Mikhail-Haq (AMH) copula, and survival copula. While their study focused on bivariate copula dependence structures, our research simulates more complex joint distributions under multivariate copulas, which is realistic for many real-world applications where higher-order correlations exist.

Copula methods

We utilise copulas to simulate correlated count data [15,16]. Copula methods are powerful tools for capturing complex dependencies between variables that go beyond simple linear relationships [17]. They model various dependence structures, including tail dependence. A copula is a multivariate distribution function that models the dependence structure between multiple variables, each following a standard uniform marginal distribution, U(0,1) [18]. The basic theorem of copula theory is known as Sklar’s Theorem [19]. Two types of Archimedean parametric copulas for asymmetric dependencies are considered: the Clayton and the Gumbel-Hougaard copulas. As both copulas are widely used in applications to model asymmetric tail dependence, they are suitable for evaluating the CCT and studying the effect of weakly and strongly correlated p-values on the properties of the combination tests. In addition, they are exchangeable copulas, as their bivariate margins share a common correlation structure through Kendall’s tau value. They are easy to interpret and serve our aim of evaluating the combination method under a simple controlled correlation structure; for example, the Clayton copula is appropriate when the original data exhibit jointly low counts.

The Clayton copula models positive dependence in the lower tail, while the Gumbel-Hougaard copula models positive dependence in the upper tail. They are defined by:

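$$C_{\theta}(u_1, \dots, u_m) = \left(\sum_{i=1}^{m} u_i^{-\theta} - m + 1\right)^{-1/\theta}, \quad \theta > 0 \qquad (5)$$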

and

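$$C_{\theta}(u_1, \dots, u_m) = \exp\left\{-\left[\sum_{i=1}^{m} (-\log u_i)^{\theta}\right]^{1/\theta}\right\}, \quad \theta \ge 1 \qquad (6)$$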

Both copulas have a parameter θ that controls the strength of dependence: as θ increases, the dependence becomes stronger. The associated Kendall’s τ for each copula depends on the parameter θ. It takes the form $\tau = \theta/(\theta + 2)$ for the Clayton copula and $\tau = 1 - 1/\theta$ for the Gumbel-Hougaard copula [18]. For illustration, S1 Fig and S2 Fig present wireframe and contour plots of bivariate Clayton and Gumbel–Hougaard copulas, and scatter plots of a sample of size n = 1000 simulated from the bivariate copulas with ().
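The relationship between θ and Kendall’s τ can be checked numerically. The sketch below is illustrative Python (the study used the R copula package) and assumes the standard forms of the two Archimedean copulas, with τ = θ/(θ + 2) for Clayton and τ = 1 − 1/θ for Gumbel-Hougaard; the function names are hypothetical.

```python
import math


def clayton_cdf(u, theta):
    """Clayton copula C(u_1, ..., u_m) for theta > 0 (standard form, assumed here)."""
    return (sum(ui ** (-theta) for ui in u) - len(u) + 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, theta):
    """Gumbel-Hougaard copula C(u_1, ..., u_m) for theta >= 1 (standard form, assumed here)."""
    return math.exp(-sum((-math.log(ui)) ** theta for ui in u) ** (1.0 / theta))


def clayton_tau(theta):
    return theta / (theta + 2.0)


def gumbel_tau(theta):
    return 1.0 - 1.0 / theta


for theta in (1, 3, 5):
    print(theta, round(clayton_tau(theta), 2), round(gumbel_tau(theta), 2))
```

For θ = 1 the Gumbel-Hougaard copula reduces to the product (independence) copula, which is why its Kendall’s τ is 0 at that value.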

Evaluating the Cauchy combination test for count data

Biological data, such as RNA-sequencing data, are best fitted with the negative binomial distribution. Unlike the Poisson distribution, the negative binomial distribution accounts for overdispersion when the variance exceeds the mean [20,21]. In this context, the CCT has been adopted as a gene-set test to combine p-values from individual genes and identify differentially expressed genes [22]. In another application of the CCT to count data, a comparative study evaluating different methods for analysing microbiome data [23] found that the CCT outperformed other methods of combining p-values and provided an accurate p-value while controlling the type 1 error rate. In addition, the ranked combined p-values produced by the CCT had high rank similarity with the true ranks. The CCT successfully replicated and identified microbiome taxa associated with colorectal cancer in a real dataset, where the most highly ranked microbiome taxa under the CCT have been reported to be associated with this condition. Consequently, evaluating the CCT is pivotal to providing a robust statistical tool for analyzing non-Gaussian count data.

We aim to evaluate the type 1 error rate and the power of the CCT when combining individual p-values obtained from the normal approximation to the negative binomial distribution, with the correlation between the discrete data modelled via copula methods. There are several formulations of the negative binomial distribution in the literature. In this paper, we use the following definition. In a sequence of independent Bernoulli trials, the negative binomial distribution is the distribution of the number of trials (or failures) X needed until a fixed number (r) of successes occurs. Then $X \sim \mathrm{NB}(r, p)$, where the first parameter r is the number of successes and p is the probability of success in each trial, and the probability mass function of X is:

(7)

with mean and variance:

(8)

The normal approximation to the negative binomial is applied here for large r and moderate p. Both parameters affect the shape and symmetry of the negative binomial distribution, and for large r and moderate p the distribution closely resembles a normal distribution. This approximation enables us to meet the assumptions required by the Cauchy combination test for the individual tests. The normal approximation to the negative binomial distribution is as follows:

(9)

Then, we approximate the sampling distribution of the sample mean using the central limit theorem where the sampling distribution converges to normal for large sample size.
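To make the normal approximation concrete, the sketch below is illustrative Python under one explicit assumption: X counts the number of failures before the r-th success, which is the convention used by R's negative binomial functions. Under the alternative "number of trials" convention the mean shifts by r while the variance is unchanged. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def nb_pmf(x, r, p):
    """P(X = x) when X counts failures before the r-th success (assumed convention)."""
    return math.comb(x + r - 1, x) * p ** r * (1.0 - p) ** x


def nb_moments(r, p):
    """Mean and variance under the failures convention."""
    return r * (1.0 - p) / p, r * (1.0 - p) / p ** 2


def z_test_pvalue(sample, r, p):
    """Two-sided p-value for the sample mean using the CLT-based normal approximation."""
    n = len(sample)
    mean0, var0 = nb_moments(r, p)
    z = (sum(sample) / n - mean0) / math.sqrt(var0 / n)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Numerical check of the moments for NB(10, 0.5) by summing over the support
xs = range(400)
mean = sum(x * nb_pmf(x, 10, 0.5) for x in xs)
var = sum((x - mean) ** 2 * nb_pmf(x, 10, 0.5) for x in xs)
print(mean, var, nb_moments(10, 0.5))
```

A sample whose mean equals the null mean exactly gives z = 0 and a p-value of 1, which can happen with discrete data.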

Simulation study

This simulation was designed to assess the type 1 error rates and the power of three different p-value combination methods: the Cauchy combination test (CCT), Fisher’s test, and the MinP test using independent and correlated p-values obtained from the normal approximation to the negative binomial distribution. In this context, the number of variables refers to the number of individual tests, or similarly p-values, denoted as m. We denote the sample size as n and the number of simulations as M.

Data generation

Datasets for independent and correlated variables were simulated from the negative binomial (NB) distribution using the R software version 4.5.0 [24]. Correlation among the variables was introduced through the Clayton and the Gumbel-Hougaard copulas using the copula package [17]. The dependence parameter, θ, denotes the dependency strength between the variables. We considered θ values of 1, 3, and 5, which represent weak (independence in the Gumbel-Hougaard copula), moderate, and strong correlations in the lower or upper tail. The corresponding Kendall’s τ values are 0.33, 0.60, and 0.71 for the Clayton copula, and 0, 0.67, and 0.80 for the Gumbel-Hougaard copula. First, we generated data using copulas with dimension m and parameter θ. Following this, the simulated uniform variates from these copulas were transformed by the negative binomial quantile function to obtain count data from the negative binomial distribution.
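The two-step generation described above (copula uniforms, then the negative binomial quantile function) can be sketched as follows. This is illustrative Python, not the authors' R code: it covers only the Clayton copula, sampled with the Marshall-Olkin gamma-frailty algorithm (a swapped-in technique that may differ from what the copula package does internally), and it assumes the "number of failures" convention for the negative binomial. All function names are hypothetical.

```python
import random
from bisect import bisect_left


def nb_cdf_table(r, p, tol=1e-12, max_x=100_000):
    """Tabulate the NB(r, p) CDF (failures convention) for x = 0, 1, 2, ..."""
    pmf = p ** r
    cdf = [pmf]
    x = 0
    while cdf[-1] < 1.0 - tol and x < max_x:
        pmf *= (x + r) / (x + 1) * (1.0 - p)  # pmf(x + 1) from pmf(x)
        x += 1
        cdf.append(cdf[-1] + pmf)
    return cdf


def clayton_uniforms(m, theta, rng):
    """One m-dimensional draw from the Clayton copula (Marshall-Olkin algorithm)."""
    v = rng.gammavariate(1.0 / theta, 1.0)  # shared gamma frailty
    return [(1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta) for _ in range(m)]


def correlated_nb_sample(n, m, r, p, theta, rng):
    """n rows of m Clayton-dependent NB(r, p) counts."""
    cdf = nb_cdf_table(r, p)
    rows = []
    for _ in range(n):
        u = clayton_uniforms(m, theta, rng)
        # Quantile function: smallest x whose CDF value is at least u
        rows.append([min(bisect_left(cdf, ui), len(cdf) - 1) for ui in u])
    return rows


rng = random.Random(2024)
data = correlated_nb_sample(n=30, m=5, r=10, p=0.5, theta=3.0, rng=rng)
print(data[0])
```

Sampling the Gumbel-Hougaard copula in the same way requires a positive stable frailty variable, which is omitted here to keep the sketch short.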

Type 1 error rate

To evaluate the type 1 error rate, the data were simulated under the null hypotheses H0i: $\mu_i = \mu_0$, $i = 1, \dots, m$, using M = 105 replications at the 0.05 and 0.01 levels of significance. We generated datasets from the negative binomial distribution with parameters r and p = 0.5, NB(r,0.5), each with a sample size n = 30. Fixing a large sample size of 30 and a moderate probability of success of 0.5 helped us to isolate their effects, satisfy part of the assumptions of the normal approximation and the CLT, and then study the influence of the success parameter r. We varied the following parameters: the number of variables m, the success parameter r of the negative binomial distribution, and the copula parameter θ. We calculated the Z test for each variable as $Z_i = (\bar{X}_i - \mu_0)/(\sigma_0/\sqrt{n})$, with a two-sided p-value given by $p_i = 2\{1 - \Phi(|Z_i|)\}$, for $i = 1, \dots, m$, where $\mu_0$ and $\sigma_0^2$ denote the mean and variance of NB(r, 0.5) and $\Phi$ is the standard normal distribution function. We combined the individual p-values using the three combination methods. Finally, we estimated the type 1 error rate as the proportion of replications in which the combined p-value fell below the prespecified significance level.
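The procedure in this subsection can be sketched for the independent case as a small Monte Carlo loop. This is an illustrative Python version rather than the authors' R code: it assumes the "number of failures" convention for NB(r, p), uses far fewer replications than the study so that it runs quickly, and all function names are hypothetical.

```python
import math
import random
from bisect import bisect_left
from statistics import NormalDist


def nb_cdf_table(r, p, tol=1e-12, max_x=100_000):
    """Tabulated NB(r, p) CDF (failures convention) for inverse-CDF sampling."""
    pmf, x = p ** r, 0
    cdf = [pmf]
    while cdf[-1] < 1.0 - tol and x < max_x:
        pmf *= (x + r) / (x + 1) * (1.0 - p)
        x += 1
        cdf.append(cdf[-1] + pmf)
    return cdf


def cct_pvalue(pvals):
    stat = sum(math.tan((0.5 - q) * math.pi) for q in pvals) / len(pvals)
    return 0.5 - math.atan(stat) / math.pi


def fisher_pvalue(pvals):
    half = -sum(math.log(q) for q in pvals)
    term = total = 1.0
    for k in range(1, len(pvals)):
        term *= half / k
        total += term
    return math.exp(-half) * total


def minp_pvalue(pvals):
    return 1.0 - (1.0 - min(pvals)) ** len(pvals)


def type1_error_rates(m, r, p, n, alpha, reps, seed=1):
    """Share of null replications in which each combined p-value falls below alpha."""
    rng = random.Random(seed)
    cdf = nb_cdf_table(r, p)
    mean0 = r * (1.0 - p) / p
    se0 = math.sqrt(r * (1.0 - p) / p ** 2 / n)
    phi = NormalDist().cdf
    methods = {"CCT": cct_pvalue, "Fisher": fisher_pvalue, "MinP": minp_pvalue}
    rejections = dict.fromkeys(methods, 0)
    for _ in range(reps):
        pvals = []
        for _ in range(m):
            draws = (min(bisect_left(cdf, rng.random()), len(cdf) - 1) for _ in range(n))
            z = (sum(draws) / n - mean0) / se0
            # Floor at a tiny positive value so that log(0) cannot occur in Fisher's test
            pvals.append(max(2.0 * (1.0 - phi(abs(z))), 1e-300))
        for name, combine in methods.items():
            rejections[name] += combine(pvals) < alpha
    return {name: count / reps for name, count in rejections.items()}


rates = type1_error_rates(m=10, r=30, p=0.5, n=30, alpha=0.05, reps=1000)
print(rates)
```

Because the counts are discrete, a sample mean can equal the null mean exactly, giving an individual p-value of 1. The tangent transform of such a value is a very large negative number, so this sketch would report a CCT p-value close to 1 for that replication.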

Power comparisons

The statistical powers of the three combination methods were compared in the presence of sparse signals. The evaluation was performed against the sample size and the correlation strength using different correlation structures. Data were generated from a negative binomial distribution consisting of nine variables with parameters r = 10 and p = 0.5, along with one variable with r = 11 and p = 0.5. This scenario reflects the sparsity of signals commonly encountered in multiple hypothesis testing, where only a small number of alternative hypotheses are true relative to the number of true nulls. Specifically, in our simulation setting, there is one true signal among 10 tests. In practice, for instance when detecting rare variants in Genome-Wide Association Studies (GWAS) or in RNA-seq analysis, few genes are expected to be associated with a phenotype or a disease.

We varied the correlation structures and evaluated the power against values of sample size using replicated samples. The correlation structures included independence among all variables, and correlated variables modelled via the Clayton and the Gumbel-Hougaard copulas with θ = 3.

Results and discussion

To evaluate the performance of the CCT, we conducted a simulation study to assess the type 1 error and power when the individual p-values were obtained from tests based on the normal approximation to the negative binomial distribution. In addition, we studied the effect of the success parameter, the sample size, and the correlation structures. We also provide applications of the combination methods to real meta-analysis datasets for count data.

Table 1 presents the type 1 error rates for different numbers of independent p-values, r, and significance levels. To demonstrate the effect of the success parameter r, Fig 1 presents the results when the numbers of combined tests are 10 and 50. Across different values of the parameter r and varying the number of tests m, the Fisher’s test consistently controls the type 1 error well. When the number of tests is small, all methods manage to control false positives within different significance thresholds. However, the impact of the parameter r on the type 1 error rates of the CCT and MinP tests is evident as m increases.

Fig 1. Type 1 error rates of the Cauchy combination test (smooth line), Fisher’s test (dotted line), and MinP test (dashed line) at the significance level = 0.05 against the success parameter r, using M = 10,000 replications.

Datasets were simulated from m independent negative binomial variables with sample size n = 30 and parameters (r,0.5). A: m = 10. B: m = 50.

https://doi.org/10.1371/journal.pone.0334663.g001

Table 1. Type 1 error rates of the CCT, Fisher, and MinP tests at different values of the level of significance using M = 10,000 replications.

Datasets were simulated from m independent negative binomial variables NB(r, 0.5) with sample size n = 30, where r is the success parameter and the probability of success is 0.5.

https://doi.org/10.1371/journal.pone.0334663.t001

The MinP method exhibited relatively stable type 1 error rates as r increased, whereas the CCT remained conservative. The CCT had conservative type 1 error rates at the 0.05 significance level, whereas at 0.01 its rate was around 0.01, except when m = 100 and r equals 5 or 30. As r increased, the rate increased for the CCT, especially for large m. For instance, when m = 50 at $\alpha = 0.05$, the type 1 error rate increased from 0.0222 to 0.0401.

The results in Table 1 show that a small value of r (r = 5) leads to conservative type 1 error rates for the CCT and liberal rates for the MinP method. We expect that as the value of the parameter r increases, the rate will be around the significance level of 0.05. A possible interpretation is that the skewness of the negative binomial distribution affects the sampling distribution of the sample mean. When r is small, the negative binomial distribution is positively skewed, and its skewness is defined as $(2 - p)/\sqrt{r(1 - p)}$. The skewness value decreases and becomes closer to zero as r increases, which affects the symmetry of the sampling distribution and, therefore, the accuracy of the individual p-values.

When r is small, the normal approximation of the sample mean through the Central Limit Theorem might be inaccurate and produce p-values, under the null hypothesis, that are stochastically larger than a uniform distribution U(0,1) (conservative individual p-values). As a result, the MinP method may still have higher rejection rates because of its sensitivity to the few small p-values. The CCT, which is based on the average of the transformed p-values, is less likely to reject the null hypothesis due to the excess number of large p-values. In practical contexts, the trade-off between type 1 error rate and power is crucial. Applying the CCT or MinP method has implications for results that rely on normal approximations to skewed count data, such as negative binomial data with small r. The CCT has a conservative type 1 error rate and reduces false positives, i.e. it rejects the null hypothesis less frequently than the nominal level. Consequently, it has less power and may fail to detect true positives (true signals). On the other hand, the MinP method increases the risk of false positives and therefore has misleadingly high power. For example, in applications such as genomics, the aim is to detect differentially expressed genes; the MinP method may incorrectly detect genes that are not truly differentially expressed, and the CCT may not be able to declare the truly significant genes.

In contrast, the Fisher method shows stable type 1 error rates closer to the nominal level regardless of the value of the success parameter r. This indicates its robustness to the value of r and means that even when we have conservative individual p-values, Fisher tends to control the type 1 error rate. This finding is also supported by [25], who found that the Fisher method works well with conservative null p-values.

As a practical guideline, the success parameter r should be sufficiently large. A lower bound on r should be maintained such that the standardised skewness (the skewness of the sample mean), $(2 - p)/\sqrt{n r (1 - p)}$, is below a small threshold that is close to zero. A sensitivity analysis of the CCT type 1 error rate was conducted by varying the success parameter r, the sample size n, and the number of combined tests m. The results are presented in Table 2. For example, when m = 30, and across sample sizes of 30, 100, and 200, we observed that the CCT type 1 error rates were around 0.05 (after rounding to two digits) when the standardised skewness values ranged between 0.05 and 0.07. A smaller sample size requires a larger r value, and conversely, a larger sample size requires a smaller r. Specifically, when the sample sizes n were 30, 100, or 200, the corresponding r values that maintain the type 1 error rate were 50, 20, and 5, respectively. Additionally, a larger number of individual tests required a more stringent threshold. For instance, when m = 50 (results not shown), the thresholds were around 0.04–0.05. This approach may improve the normal approximation of the individual test statistics, so that the individual p-values more closely follow the uniform distribution under the null. It helps to effectively control the type 1 error rate of the combination methods.
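The guideline can be turned into a quick calculation. The sketch below is illustrative Python with hypothetical function names; it assumes the negative binomial skewness (2 − p)/√(r(1 − p)) and the standard result that the skewness of the mean of n independent draws is the population skewness divided by √n.

```python
import math


def nb_skewness(r, p):
    """Skewness of NB(r, p); it is the same under the trials and failures conventions."""
    return (2.0 - p) / math.sqrt(r * (1.0 - p))


def standardised_skewness(r, p, n):
    """Skewness of the mean of n independent NB(r, p) observations."""
    return nb_skewness(r, p) / math.sqrt(n)


def min_r_for_threshold(p, n, threshold):
    """Smallest integer r whose standardised skewness does not exceed the threshold."""
    return math.ceil((2.0 - p) ** 2 / (threshold ** 2 * n * (1.0 - p)))


# The (n, r) pairs quoted in the text for m = 30 and p = 0.5
for n, r in ((30, 50), (100, 20), (200, 5)):
    print(n, r, round(standardised_skewness(r, 0.5, n), 4))

print(min_r_for_threshold(p=0.5, n=30, threshold=0.06))
```

For the three pairs quoted in the text the standardised skewness evaluates to roughly 0.055, 0.047, and 0.067, in line with the reported 0.05 to 0.07 range after rounding.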

Table 2. Sensitivity analysis of the type 1 error rates of the CCT using M = 10,000 replications.

Datasets were simulated from m = 30 independent negative binomial variables NB(r, 0.5) with varying sample sizes, where r is the success parameter and the probability of success is 0.5.

https://doi.org/10.1371/journal.pone.0334663.t002

In addition, we carried out a diagnostic assessment of the CCT type 1 error rates across a grid of sample sizes n and success parameters r, shown as a heatmap in Fig 2. Each cell reports the average type 1 error rate over 1000 replications, each based on 10,000 simulations combining m = 30 individual p-values. The heatmap shows that as n and r increase, the type 1 error rates approach the nominal level of 0.05.

Fig 2. A heatmap presenting the average type 1 error rates of the Cauchy combination test at the significance level = 0.05.

https://doi.org/10.1371/journal.pone.0334663.g002

It is important to note that the choice of sample size of 30 is based solely on the normal approximation using the CLT. Although it is considered large enough, in practical applications, there are more rigorous methods to estimate the sample size and evaluate the precision of the normal approximation of skewed count data; see, for example, [26]. This may help explain the observed conservative type 1 error rates for large numbers of individual tests (m) and (r), meaning that the CCT requires larger sample sizes. As both the negative binomial success parameter (r) and sample size (n) increase, the distribution of the sample mean becomes more symmetric and continuous, and the corresponding p-values better approximate the uniform distribution under the null. Consequently, the CCT rejection rate stabilizes around the nominal level, confirming the theoretical validity of the method under approximately continuous individual p-values. With lower values of (r) and (n), the individual tests are influenced by the discreteness and skewness of the underlying count data and produce discrete p-values that may affect the validity of the CCT. Alternatively, the Fisher method is more appropriate than other combination methods under the assumption of independence.

Two types of correlation structures using the Clayton and the Gumbel-Hougaard copulas were introduced. The results are presented in Tables 3 and 4. Fisher’s test has the highest type 1 error rates, except in the Gumbel-Hougaard copula when θ is 1, which represents the independence case. As expected, this is due to the violation of the independence assumption. In contrast, the MinP test tends to be more conservative in controlling type 1 errors across different copula structures and levels of dependence. Our findings show that the CCT outperforms the MinP test in controlling false positives, particularly when the dependence strength θ increases from 1 to 5 in the Gumbel-Hougaard copula.

Table 3. Type 1 error rates for the CCT, Fisher, and MinP tests at 0.05 significance level using M = 10,000 replications.

Datasets were modeled using the Clayton copula with θ = 1, 3, and 5 and simulated from m negative binomial variables NB(r, 0.5) with sample size n = 30, where r is the success parameter and the probability of success is 0.5.

https://doi.org/10.1371/journal.pone.0334663.t003

Table 4. Type 1 error rates for the CCT, Fisher, and MinP tests at 0.05 significance level using M = 10,000 replications.

Datasets were modeled using the Gumbel-Hougaard copula with θ = 1, 3, and 5 and simulated from m negative binomial variables NB(r, 0.5) with sample size n = 30, where r is the success parameter and the probability of success is 0.5.

https://doi.org/10.1371/journal.pone.0334663.t004

For the CCT, the choice of copula model significantly affected controlling the type 1 error rates. In Table 3, as the parameters r and θ increased, the CCT had slightly higher type 1 error rates for different numbers of dependent tests through the Clayton copula. On the other hand, modelling tests based on the Gumbel-Hougaard copula showed that as the correlation strength increased, the type 1 error rate decreased and became more controlled for highly correlated tests. For example, at r = 30 and , across different numbers of tests, the type 1 error rates for the CCT ranged from 0.0588 to 0.0625, while in the Gumbel-Hougaard copula they ranged from 0.051 to 0.0514. Similarly, when r = 30 and , CCT, the type I error rates ranged from 0.0691-0.0603, but in the Gumbel-Hougaard copula, they were between 0.0539-0.0568.

Both copulas differ in modelling extreme values. The dependence in the lower tail of the Clayton copula tends to produce small counts that occur together across variables, resulting in low sample means and large absolute Z test statistics values. Furthermore, the discreteness of the data can lead to tied sample means and p-values. When these effects are combined, small p-values can occur more frequently than expected under the global null hypothesis. Since the CCT is sensitive to small p-values, this can significantly inflate the type 1 error rate. On the other hand, the Gumbel-Hougaard copula better controls type 1 error rates because it shows weak dependence in the lower tail. Thus, the sample means increase compared to Clayton and therefore reduce the likelihood of many false positives under the null hypothesis.

Under multivariate copulas, a lower-tail dependence structure can inflate the type 1 error rate, as shown in the Clayton copula results, whereas copulas that model the upper tail remain valid for higher correlations, i.e. a higher copula parameter θ. We further explored the validity of the CCT under the multivariate survival Clayton copula [27]. Similar to the Gumbel-Hougaard copula, this copula captures upper-tail dependence. We aimed to assess whether the tail dependence structure affects the type 1 error rate of the CCT. It was proved that under the bivariate survival copula, the CCT is valid and its tail is well approximated by the Cauchy distribution [14]. For the multivariate copula, the results (see S1 Table) indicated that the CCT maintained type 1 error rates under the survival Clayton copula for a higher copula parameter θ. These findings support the broader application of the CCT to multivariate copulas that model upper-tail dependence.

Fig 3 compares the power of the three combination methods against the sample size. As expected, power generally increases as the sample size increases. The Fisher and MinP tests have higher power than the CCT when the combined individual tests are independent. The CCT and MinP exhibit comparable power for sparse signals and small effect sizes, regardless of the correlation structure.

Fig 3. The power comparison of the Cauchy combination test (smooth line), Fisher’s test (dotted line), and MinP test (dashed line) from negative binomial data in three different correlation structures.

A: Independent variables. B: Clayton copula. C: Gumbel-Hougaard copula.

https://doi.org/10.1371/journal.pone.0334663.g003

The MinP method showed relatively higher power compared to the Fisher method and the CCT when combining independent tests. At a sample size of 30, both the MinP and Fisher methods approached the maximum power of 1, while the power curve for the CCT increased slowly. It remained below the other two methods even for larger sample sizes, which suggested that it may be conservative in detecting the true signal when r is small, specifically when r = 11, where the normal approximation may not be accurate.

For correlated count data, the Fisher method exhibited higher power, especially for small to moderate sample sizes. The independence assumption for the Fisher method was violated in this case, leading to inflated type 1 error rate, which reflected a misleading increase in power. Although the power curve for the CCT achieved higher power at smaller sample sizes, this must be interpreted with caution due to its liberal type 1 error rates at n = 30. However, as the sample size exceeded 30, the normal approximation of the negative binomial variables became more accurate, and the power observed at larger sample sizes became more reliable. On the other hand, the MinP method, which represents a conservative method under dependence, offered more reliable power results.

Real data applications

In addition to simulation results, we provide a real data analysis to illustrate the application of combination methods to combine p-values from count data. Meta-analyses of Genome-Wide Association Studies (GWAS) show that two SNPs (rs4570625-T and rs17110747-A) on the TPH2 gene are associated with major depressive disorder (MDD) using fixed effects models [28]. We apply the Cauchy combination test (CCT) and the Fisher method to combine p-values in two meta-analysis studies. The first study includes six independent case-control studies to test the association between the rs4570625-T SNP and MDD. The second meta-analysis involves five studies to test the association of the rs17110747-A SNP and MDD. The individual p-values were calculated from the forest plots 1 and 2 in [28].

For the first meta-analysis, the individual p-values were 0.78, 0.002, 0.74, 0.016, 0.89, and 0.10. At a significance level of 0.05, both the CCT and the Fisher method gave significant results, with combined p-values of 0.011 and 0.009, respectively.

In the second meta-analysis, the individual p-values were 0.94, 0.0015, 0.97, 0.79, and 0.81. The CCT produced a combined p-value of 0.0082, while the Fisher method yielded a non-significant result of 0.17. Notably, [29] applied their newly proposed p-value combination method to the same five individual p-values and obtained a combined p-value of 0.0081. Together with the CCT, theirs was the only significant result among the existing p-value combination methods.

The Cauchy combination test (CCT) yielded significant findings in both real meta-analysis studies, while the Fisher method was significant in only one. Under the independence assumption, the difference between the CCT and the Fisher method in detecting true signals is due to their different sensitivities to small p-values in a sparse setting. These findings highlight the potential of the CCT for meta-analyses of count data in which only a subset of studies exhibits strong effects.
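The combined p-values reported for the two meta-analyses can be checked with a short sketch. The Python code below is not the authors' code. It assumes equal weights in the CCT and uses the closed-form chi-square tail for even degrees of freedom in the Fisher method. The function and variable names are illustrative.

```python
import math

def cauchy_combination(pvalues, weights=None):
    """Cauchy combination test p-value (equal weights assumed unless given)."""
    m = len(pvalues)
    if weights is None:
        weights = [1.0 / m] * m
    # Map each p-value to a standard Cauchy variate and take the weighted sum.
    stat = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi

def fisher_combination(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2m degrees of freedom."""
    m = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)
    # Chi-square survival function for even df = 2m:
    # exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, m):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

study1 = [0.78, 0.002, 0.74, 0.016, 0.89, 0.10]   # rs4570625-T, six studies
study2 = [0.94, 0.0015, 0.97, 0.79, 0.81]         # rs17110747-A, five studies

for name, pvals in [("rs4570625-T", study1), ("rs17110747-A", study2)]:
    print(name,
          "CCT:", round(cauchy_combination(pvals), 4),
          "Fisher:", round(fisher_combination(pvals), 4))
```

After rounding, the printed values should agree with the combined p-values reported above. The second study shows the effect of sparsity: the single small p-value dominates the Cauchy statistic, whereas the four large p-values dilute the Fisher statistic.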

It is important to note that multivariate copulas with discrete marginal distributions do not have a unique copula representation [19]. This limitation of non-uniqueness arises in the estimation of the copula parameter and the joint distribution of observed discrete data. In practice, a specific copula may not capture this, and different copulas may lead to the same marginals and joint distribution. However, this limitation does not affect the validity of our simulation study. We explicitly specified a known correlation structure in advance, simulated latent uniform variables using a copula, and then applied the inverse cumulative distribution function of the negative binomial distribution. The goal was to assess the influence of the correlation structures on the p-value combination methods [15,16].
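The simulation procedure described above (latent uniforms from a copula, then the negative binomial inverse cumulative distribution function) can be sketched as follows. This is an illustration rather than the authors' implementation. It assumes a Clayton copula sampled with the Marshall-Olkin frailty algorithm and a negative binomial variable that counts failures before the r-th success. The values theta = 2, m = 5 and r = 5 are arbitrary choices for the example.

```python
import random

def clayton_uniforms(m, theta, rng):
    """One draw of m dependent uniforms from a Clayton copula (Marshall-Olkin)."""
    # Shared gamma frailty; the floor avoids division by zero on underflow.
    v = max(rng.gammavariate(1.0 / theta, 1.0), 1e-300)
    return [(1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta) for _ in range(m)]

def nb_quantile(u, r, p):
    """Smallest k with P(X <= k) >= u, where X counts failures before the
    r-th success (assumed parametrisation) with success probability p."""
    u = min(u, 1.0 - 1e-12)          # guard against u == 1
    k = 0
    pmf = p ** r                     # P(X = 0)
    cdf = pmf
    while cdf < u:
        pmf *= (k + r) / (k + 1) * (1.0 - p)   # P(X = k + 1) from P(X = k)
        k += 1
        cdf += pmf
    return k

def simulate_counts(n, m, r, p, theta, seed=1):
    """n observations of m correlated negative binomial counts."""
    rng = random.Random(seed)
    return [[nb_quantile(u, r, p) for u in clayton_uniforms(m, theta, rng)]
            for _ in range(n)]

counts = simulate_counts(n=30, m=5, r=5, p=0.5, theta=2.0)
print(counts[:3])
```

Because the dependence is imposed on the latent uniforms, the copula and its parameter are known by construction, so the non-uniqueness issue for discrete margins does not arise when generating the data.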

Further study is needed to evaluate combination methods, particularly the CCT, under highly dispersed count data using distributions such as the Poisson Inverse Gaussian (PIG) and Sichel. Models based on these distributions offer flexibility in fitting and modelling highly dispersed count data and outperform negative binomial models in the analysis of crash and infectious disease count data [30–32]. Furthermore, the performance of the CCT could be compared with other methods that account for dependence structures, such as the Z or Empirical Brown’s methods, in discrete settings. It is also of interest to examine the robustness of the CCT under other types of copulas, for instance a symmetric and heavy-tailed dependence structure like the Student-t copula, or under more complex mixed correlation structures such as the vine copula. Future work could evaluate the combination methods and extend them to count data that exhibit overdispersion and temporal or spatial dependence, such as traffic data, under other distributional models and more complex dependence structures [30,32]. Such comparisons would improve our understanding of the performance of p-value combination methods and guide practitioners in selecting appropriate approaches for real-world count data analysis.

Conclusion

In this paper, we compare three p-value combination tests, where individual p-values are obtained from count data based on the normal approximation to the negative binomial distribution. The Cauchy combination test (CCT) is a powerful and robust method against sparse alternatives under arbitrary dependence structures. The observed variations in type 1 error rates of the CCT when combining multiple independent or correlated tests based on the normal approximation emphasize the need for caution to ensure the validity of statistical inferences. We find that the number of combined tests influences the type 1 error rate, as does the accuracy of the normal approximation, which depends on both the sample size and the success parameter. The choice of copula and its parameter are further factors to consider. Together, these factors determine the robustness and validity of the CCT. Our simulation findings support the broader application of the CCT to multivariate copulas that model upper-tail dependence with higher correlations.

Supporting information

S1 Fig. Graphical representations of the bivariate Clayton copula.

Wireframe plot of the bivariate Clayton copula density (top left), contour plot of the copula distribution function (top right), contour plot of the copula density (bottom left), and scatter plot of a sample of size n = 1000 simulated from the bivariate Clayton copula with (, illustrating lower-tail dependence.

https://doi.org/10.1371/journal.pone.0334663.s001

(TIFF)

S2 Fig. Graphical representations of the bivariate Gumbel-Hougaard copula.

Wireframe plot of the bivariate Gumbel-Hougaard copula density (top left), contour plot of the copula distribution function (top right), contour plot of the copula density (bottom left), and scatter plot of a sample of size n = 1000 simulated from the bivariate Gumbel-Hougaard copula with (, illustrating upper-tail dependence.

https://doi.org/10.1371/journal.pone.0334663.s002

(TIFF)

S1 Table. Survival Clayton copula results.

Type 1 error rates for the CCT, Fisher, and MinP tests at 0.05 significance level using replications. Datasets were modeled using the survival Clayton copula with and simulated from m negative binomial variables NB(r, 0.5) with sample size n = 30, where r is the success parameter and the probability of success is 0.5. Abbreviations: CCT: Cauchy combination test; MinP: Minimum P-value test.

https://doi.org/10.1371/journal.pone.0334663.s003

(PDF)

Acknowledgments

The authors acknowledge with thanks WAQF and the Deanship of Scientific Research (DSR) for technical and financial support.

References

  1. Fisher RA. Statistical Methods for Research Workers. 4th ed. London: Oliver and Boyd; 1932.
  2. Stouffer SA, Suchman EA, DeVinney LC, Star SA, Williams RM. The American Soldier. Princeton: Princeton University Press; 1949.
  3. Liptak T. On the combination of independent tests. Magyar Tudomanyos Akademia Matematikai Kutato Intezetenek Kozlemenyei. 1958;3:171–97.
  4. Lancaster HO. The combination of probabilities: an application of orthonormal functions. Australian Journal of Statistics. 1961;3(1):20–33.
  5. Brown MB. 400: a method for combining non-independent, one-sided tests of significance. Biometrics. 1975;31(4):987.
  6. Kost JT, McDermott MP. Combining dependent P-values. Statistics & Probability Letters. 2002;60(2):183–90.
  7. Poole W, Gibbs DL, Shmulevich I, Bernard B, Knijnenburg TA. Combining dependent P-values with an empirical adaptation of Brown’s method. Bioinformatics. 2016;32(17):i430–6. pmid:27587659
  8. Alves G, Yu Y-K. Accuracy evaluation of the unified P-value from combining correlated P-values. PLoS One. 2014;9(3):e91225. pmid:24663491
  9. Loughin TM. A systematic comparison of methods for combining p-values from independent tests. Computational Statistics & Data Analysis. 2004;47(3):467–85.
  10. Heard NA, Rubin-Delanchy P. Choosing between methods of combining p-values. Biometrika. 2018;105(1):239–46.
  11. Zhang H, Wu Z. The generalized Fisher’s combination and accurate p-value calculation under dependence. Biometrics. 2023;79(2):1159–72. pmid:35178716
  12. Liu Y, Xie J. Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. J Am Stat Assoc. 2020;115(529):393–402. pmid:33012899
  13. Tippett L. The Methods of Statistics. London: Williams & Norgate; 1931.
  14. Long M, Li Z, Zhang W, Li Q. The Cauchy combination test under arbitrary dependence structures. The American Statistician. 2022;77(2):134–42.
  15. Safari-Katesari H, Samadi SY, Zaroudi S. Modelling count data via copulas. Statistics. 2020;54(6):1329–55.
  16. Geenens G. Copula modeling for discrete random vectors. Dependence Modeling. 2020;8(1):417–40.
  17. Hofert M, Kojadinovic I, Maechler M, Yan J. Copula: Multivariate Dependence with Copulas. 2024.
  18. Hofert M, Kojadinovic I, Maechler M, Yan J. Elements of copula modeling with R. Cham, Switzerland: Springer; 2018.
  19. Sklar A. Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut de Statistique de l’Université de Paris. 1959;8:229–31.
  20. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106. pmid:20979621
  21. Mi G, Di Y, Schafer DW. Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data. PLoS One. 2015;10(3):e0119254. pmid:25787144
  22. Liu Y, Chen S, Li Z, Morrison AC, Boerwinkle E, Lin X. ACAT: a fast and powerful p value combination method for rare-variant analysis in sequencing studies. Am J Hum Genet. 2019;104(3):410–21. pmid:30849328
  23. Ham H, Park T. Combining p-values from various statistical methods for microbiome data. Front Microbiol. 2022;13:990870. pmid:36439799
  24. R Core Team. R: A Language and Environment for Statistical Computing; 2021. https://www.R-project.org/
  25. Hoang A-T, Dickhaus T. Combining independent p-values in replicability analysis: a comparative study. Journal of Statistical Computation and Simulation. 2022;92(10):2184–204.
  26. Cundill B, Alexander NDE. Sample size calculations for skewed distributions. BMC Med Res Methodol. 2015;15:28. pmid:25886883
  27. Mucha V, Páleš M, Teplanová P. Modelling risk dependencies in insurance using survival Clayton copula. Statistika. 2024;104(3):320–35.
  28. Gao J, Pan Z, Jiao Z, Li F, Zhao G, Wei Q, et al. TPH2 gene polymorphisms and major depression–a meta-analysis. PLoS One. 2012;7(5):e36721. pmid:22693556
  29. Chen Z, Yang W, Liu Q, Yang JY, Li J, Yang M. A new statistical approach to combining p-values using gamma distribution and its application to genome-wide association study. BMC Bioinformatics. 2014;15 Suppl 17(Suppl 17):S3. pmid:25559433
  30. Zha L, Lord D, Zou Y. The Poisson inverse Gaussian (PIG) generalized linear regression model for analyzing motor vehicle crash data. Journal of Transportation Safety & Security. 2014;8(1):18–35.
  31. Moshi Ouma V. Poisson Inverse Gaussian (PIG) model for infectious disease count data. AJTAS. 2016;5(5):326.
  32. Zou Y. Modeling highly dispersed crash data with Sichel GAMLSS: an alternative approach to traditional methods. Multidiscip Sci J. 2025;7(8):2025392.