Statistical methods for comparing two independent exponential-gamma means with application to single cell protein data

  • Jia Wang,

    Roles Data curation, Formal analysis, Methodology, Software, Writing – original draft, Writing – review & editing

    Affiliation Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America

  • Lili Tian,

    Roles Conceptualization, Methodology, Supervision, Writing – review & editing

    Affiliation Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America

  • Li Yan

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Supervision, Writing – original draft, Writing – review & editing

    Li.Yan@RoswellPark.org

    Affiliation Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States of America

Abstract

In genomic studies, log transformation is a common preprocessing step to adjust for skewness in data. This standard approach often assumes that log-transformed data are normally distributed, and two sample t-test (or its modifications) is used for detecting differences between two experimental conditions. However, it was recently shown that two sample t-test can lead to exaggerated false positives, and the Wilcoxon-Mann-Whitney (WMW) test was proposed as an alternative for studies with larger sample sizes. In addition, studies have demonstrated that the specific distribution used in modeling genomic data has a profound impact on the interpretation and validity of results. The aim of this paper is three-fold: 1) to present the Exp-gamma distribution (exponential-gamma distribution, i.e. the distribution of a log-transformed gamma random variable) as a proper biological and statistical model for the analysis of log-transformed protein abundance data from single-cell experiments; 2) to demonstrate the inappropriateness of two sample t-test and the WMW test in analyzing log-transformed protein abundance data; 3) to propose and evaluate statistical inference methods for hypothesis testing and confidence interval estimation when comparing two independent samples under the Exp-gamma distributions. The proposed methods are applied to analyze protein abundance data from a single-cell dataset.

1 Introduction

Recent investigations of physical models [1–4] of individual cells have demonstrated that the protein copy number (or abundance) distribution can be approximated by a gamma distribution. These studies claimed that the shape parameter of the gamma distribution can be interpreted as the number of mRNAs produced per cell cycle, and the scale parameter as the number of protein molecules produced per mRNA within individual cells. Although studies at the single-cell level were costly and scarce a decade ago, recent technological advances have made large-scale protein abundance data at the single-cell level widely available [5, 6].

In practice, up-regulated and down-regulated genes between samples are assessed using fold change, which represents a proportional rather than additive change from a reference (e.g. healthy) to an alternative (e.g. tumor) state. Hence log-transformed abundance level is more biologically relevant, and expression (or concentration) of genes is usually pre-processed using log-transformation before statistical modeling. Additionally, log-transformation is used to adjust for skewness and for variance stabilization [7–11]. Such transformation is widely used as a preprocessing step for many types of molecular markers. Therefore, the exponential-gamma distribution (Exp-gamma), derived from the gamma distribution by applying a logarithmic transformation, is an ideal candidate for modeling log-transformed protein abundance data.

However, researchers often resort to two sample t-test or the Wilcoxon-Mann-Whitney (WMW) test in differential analysis of log-transformed protein abundance and other molecular data [12–14] to detect the difference between two experimental conditions, often with some pre- and post-model adjustment to reduce the false positive rate [15, 16]. Fay and Proschan [17] argued that two sample t-test decision rules are asymptotically valid under quite general conditions even if the normality assumption is rejected. Recently, Li et al. [14] pointed out that two sample t-test often results in exaggerated false positive rate, and recommended using the WMW test for comparing two sets of expression levels measured under two conditions for a gene in population-level RNA-seq studies with large sample sizes. Hao et al. [5] analyzed the differential abundance of cell types across experimental conditions using the WMW test after log-normalization of the protein data. However, some researchers pointed out [17, 18] that although the WMW test does not require parametric assumptions, it assumes that the two distributions are equal under the null hypothesis; hence it could result in inflated type I errors when testing the equality of means. Recently, Torrente et al. [19] studied the shape of gene expression and discovered that the gamma distribution was the predominant non-normal category of genes in both microarray and RNA-seq datasets.

Although there exists some research on the appropriateness of two sample t-test and the WMW test in the differential analysis of log-transformed protein abundance data [5, 12–14], no such investigation exists under the Exp-gamma distribution. Furthermore, accurate statistical inference methods for comparing two Exp-gamma means are of particular interest, since identifying differences in log-transformed protein abundance data under two different experimental conditions is a fundamental research question in genomic studies. Despite the rich statistical literature on gamma means [20–29], to our knowledge, there is no literature on inference for Exp-gamma means. Therefore, the aim of this paper is three-fold: 1) to present the Exp-gamma distribution as a proper biological and statistical model for the analysis of log-transformed protein abundance data from single cell experiments; 2) to demonstrate the inappropriateness of using two sample t-test and the WMW test in analyzing log-transformed protein abundance data; 3) to propose and evaluate statistical inference methods for hypothesis testing and confidence interval estimation for comparing two independent samples under Exp-gamma distributions.

This paper is organized as follows. In Section 2, we provide some preliminary results on features of the Exp-gamma distribution, along with its characteristics. In Section 3, the motivation for this research is addressed by a more detailed description of the molecular process of protein production and its critical role in human traits and disease, as given in Section 3.1, followed by an investigation into the inappropriateness of two sample t-test and the WMW test for testing the equality of two Exp-gamma means in Section 3.2. In Section 4, methods for hypothesis testing for the equality of two independent Exp-gamma means and confidence interval estimation for mean difference are proposed. In Section 5, we present the simulation studies on the type I error control and power of the proposed tests, as well as the coverage probability of proposed confidence intervals. In Section 6, a subset of Seurat data used in scRNA-seq studies is analyzed using the proposed methods. Finally, concluding remarks are provided in Section 7.

2 The setting

Let Y1 and Y2 denote two independent random variables representing log-transformed gene expression/protein abundance, where Yi ∼ Exp-gamma(αi, βi), i.e. $$f(y; \alpha_i, \beta_i) = \frac{\beta_i^{\alpha_i}}{\Gamma(\alpha_i)} \exp\left(\alpha_i y - \beta_i e^{y}\right),$$ where y ∈ (−∞, ∞), and αi, βi > 0. Note that Xi = exp(Yi) follows a gamma distribution, i.e. Xi ∼ gamma(αi, βi), where αi and βi stand for the shape parameter and rate parameter, respectively. Fig 1 contains two graphs of the probability density functions of Yi and Xi at (αi, βi) = (1, 1) and (αi, βi) = (3, 1), respectively. The Exp-gamma distribution is skewed to the left (negatively skewed), with both of its tails extending indefinitely.
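As a minimal sketch of this setting, the density implied by Y = ln X with X following a gamma distribution (shape α, rate β) can be evaluated and sampled with the Python standard library. The function names `exp_gamma_logpdf`, `exp_gamma_sample` and `integrate_pdf` are illustrative and do not come from the paper; the numerical integration is only a sanity check that the density integrates to about 1.

```python
import math
import random

def exp_gamma_logpdf(y, alpha, beta):
    """Log-density of Y = ln(X) for X ~ gamma(shape=alpha, rate=beta)."""
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + alpha * y - beta * math.exp(y))

def exp_gamma_sample(alpha, beta, rng):
    """Draw one Y by log-transforming a gamma variate.

    random.gammavariate expects (shape, scale), so the rate is inverted.
    """
    return math.log(rng.gammavariate(alpha, 1.0 / beta))

def integrate_pdf(alpha, beta, lo=-40.0, hi=10.0, steps=5000):
    """Trapezoidal check that the density integrates to about 1."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(exp_gamma_logpdf(lo + k * h, alpha, beta))
    return total * h

rng = random.Random(1)
draws = [exp_gamma_sample(3.0, 1.0, rng) for _ in range(5)]
area_11 = integrate_pdf(1.0, 1.0)   # the (alpha, beta) = (1, 1) case of Fig 1
area_31 = integrate_pdf(3.0, 1.0)   # the (alpha, beta) = (3, 1) case of Fig 1
```

The integration limits (−40, 10) are an assumption that suits the two parameter pairs in Fig 1; much smaller shape parameters would need a wider lower limit.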

Fig 1. Probability density of Y ∼ Exp-gamma(α, β) and X = exp(Y) ∼ gamma(α, β), for (α, β) = (1, 1) and (3, 1), respectively.

https://doi.org/10.1371/journal.pone.0314705.g001

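Let δi and σi² denote the population mean and variance for Yi, respectively. It can be proved that $$\delta_i = E(Y_i) = \psi(\alpha_i) - \ln(\beta_i) \qquad (1)$$ $$\sigma_i^2 = \mathrm{Var}(Y_i) = \psi^{(1)}(\alpha_i) \qquad (2)$$ for i = 1, 2, where ψ(·) is the digamma function and ψ^(1)(·) is the trigamma function. The details of the proof are presented in S1 Appendix.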
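The mean and variance formulas can be checked numerically. The sketch below assumes the mean is ψ(α) − ln(β) and the variance is ψ^(1)(α), as stated in the text; the helper names `digamma`, `trigamma`, `exp_gamma_mean` and `exp_gamma_var` are illustrative. The two special functions are implemented with the standard recurrence plus asymptotic series, so that no third-party library is needed.

```python
import math

def digamma(x):
    """psi(x) for x > 0: shift x upward by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x          # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def trigamma(x):
    """psi'(x) for x > 0: shift x upward by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result += 1.0 / (x * x)    # psi'(x) = psi'(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return result + inv + 0.5 * inv2 + series

def exp_gamma_mean(alpha, beta):
    """Population mean of Exp-gamma(alpha, beta), beta being the rate."""
    return digamma(alpha) - math.log(beta)

def exp_gamma_var(alpha):
    """Population variance of Exp-gamma(alpha, beta); it does not depend on beta."""
    return trigamma(alpha)

mean_11 = exp_gamma_mean(1.0, 1.0)   # psi(1) is minus the Euler-Mascheroni constant
var_1 = exp_gamma_var(1.0)           # psi'(1) equals pi^2 / 6
```

Because the variance depends on the shape parameter only, changing the rate β shifts the distribution of Y without changing its spread.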

Skewness and excess kurtosis are two further measures that describe the distributional properties of a probability distribution. Skewness measures the asymmetry of the probability distribution, and excess kurtosis measures how much the distribution deviates from a normal distribution in terms of tails. Both the skewness (skew) and the excess kurtosis (ex-kurt) of the Exp-gamma distribution depend only on its shape parameter αi, (3) for i = 1, 2, where ψ^(k−1)(·) is the kth derivative of the log gamma function. The detailed proof is shown in S1 Appendix.

Fig 2 shows the skewness and excess kurtosis of the Exp-gamma distribution as the shape parameter (α) ranges from 0.1 to 50. The negative skewness confirms that the Exp-gamma distribution is left skewed. The excess kurtosis of the Exp-gamma distribution can be positive or negative: a positive value means that the distribution is fat-tailed relative to the normal distribution and produces more outliers, and a negative value means that it is thin-tailed and produces fewer outliers. When α = 0.7689, the Exp-gamma distribution has the same kurtosis as the normal distribution. As α tends to infinity, the value of skew converges to 0, and the value of ex-kurt converges to −3.

Fig 2. Plot of skewness and excess kurtosis for Exp-gamma distribution as α ranges from 0.1 to 50. When α = 0.7689, the excess kurtosis is 0.

https://doi.org/10.1371/journal.pone.0314705.g002

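We are interested in testing the hypothesis H0 : δ1 = δ2 vs. H1 : δ1 ≠ δ2, as well as constructing a confidence interval for the mean difference δ1 − δ2. The mean difference of two independent Exp-gamma distributions is given by $$\eta = \delta_1 - \delta_2 = \left[\psi(\alpha_1) - \ln(\beta_1)\right] - \left[\psi(\alpha_2) - \ln(\beta_2)\right].$$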

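Let $\hat{\alpha}_i$ and $\hat{\beta}_i$ stand for the maximum likelihood estimates for αi and βi, respectively. The maximum likelihood estimator (MLE) of δi is $\hat{\delta}_i = \psi(\hat{\alpha}_i) - \ln(\hat{\beta}_i)$. Then $\hat{\eta} = \hat{\delta}_1 - \hat{\delta}_2$. The variance of $\hat{\eta}$ is $$\mathrm{Var}(\hat{\eta}) = \frac{\psi^{(1)}(\alpha_1)}{n_1} + \frac{\psi^{(1)}(\alpha_2)}{n_2}, \qquad (4)$$ where ni is the size of the ith sample.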
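A short derivation, offered as a sketch, shows why the plug-in MLE of δi has a simple variance. It uses only the gamma log-likelihood in the rate parametrization. The symbols ni (size of the ith sample), xij (the untransformed observations) and yij = ln xij are notation introduced for this illustration.

```latex
\begin{align*}
\ell(\alpha_i,\beta_i) &= n_i\alpha_i\ln\beta_i - n_i\ln\Gamma(\alpha_i)
  + (\alpha_i-1)\sum_{j=1}^{n_i}\ln x_{ij} - \beta_i\sum_{j=1}^{n_i}x_{ij},\\
\frac{\partial \ell}{\partial\alpha_i} = 0
  &\;\Longrightarrow\; \psi(\hat\alpha_i) - \ln\hat\beta_i
  = \frac{1}{n_i}\sum_{j=1}^{n_i}\ln x_{ij} = \bar y_i,\\
\operatorname{Var}(\hat\delta_i) &= \operatorname{Var}(\bar Y_i)
  = \frac{\psi^{(1)}(\alpha_i)}{n_i},
\qquad
\operatorname{Var}(\hat\eta)
  = \frac{\psi^{(1)}(\alpha_1)}{n_1} + \frac{\psi^{(1)}(\alpha_2)}{n_2}.
\end{align*}
```

In other words, the score equation for the shape parameter implies that the MLE of δi coincides with the sample mean of the log-transformed data, and the variance of the estimated mean difference follows from the independence of the two samples.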

3 Motivation

In this section, we provide detailed arguments about the compelling importance of the exponential-gamma (Exp-gamma) distribution in analyzing log-transformed protein abundance data from single-cell experiments, as well as the paramount significance of developing statistical inference procedures under the Exp-gamma distribution.

3.1 Justification of using the Exp-gamma distribution for cellular protein abundance measurements

The central dogma of molecular biology is a fundamental theory developed by Francis Crick in 1958 that explains how genetic information flows within a biological system. The core idea can be simply stated as: “DNA makes (messenger) RNA, and RNA makes protein”. Since then, this theory has withstood the test of time and intensive investigations, with only minor exceptions and enrichment. The abundance of cellular protein is intimately linked to all biological functions in living cells. The expression levels of messenger RNAs (mRNAs) and proteins are essential measurements of an organism’s genetic makeup (genotypes), and are often directly related to many observable characteristics or traits (phenotypes), including morphology, development, and biochemical and physiological properties. Common phenotypes in humans include height and blood type, as well as disease related characteristics, e.g. cancer subtypes. Understanding the differences in genotypes (e.g. protein abundance) and their relationships with phenotypes (e.g. cancer progressions) is the focus of molecular biology.

Since its introduction in 2008 [30], cost effective and rapid mRNA quantification of the whole genome (transcriptome) has become a standard tool in the life sciences research community. Initially developed for bulk samples, this method evolved to quantify mRNA levels in single cells, and revolutionized the field of cancer research. Numerous analysis methods and pipelines have been developed for mRNA quantification, based on the organism under study, platform characteristics, and researcher’s goals [31]. Due to its wide-spread usage, mRNA quantification is often used as a synonym for gene expression in many studies. In fact, however, protein abundance is a more accurate measurement of gene activity. It is well established that mRNA transcript level only partially correlates with protein abundances [32], and transcriptomics alone is often incapable of distinguishing between categories of cells that are molecularly similar, but functionally distinct. Due to the high cost and experimental complexities, studies that assess protein abundance remain scarce, especially at the single-cell level because of the low abundance of proteins in cells. Only in the past few years has genome-wide analysis of protein abundance at the single-cell level become practical [5]. Unfortunately, this belated development also means that protein-specific statistical analysis methods have received little investigation. Most of the methods that were adopted from RNA-seq analysis [7] overlooked the sample distribution, except for some preprocessing and normalization steps to compensate for the obvious skewness of the protein data. Statistical methods specific to single-cell protein abundance are therefore greatly needed for accurate assessment of such data. Consistent with the two-stage model of gene expression described in the central dogma of molecular biology, the intriguing physical models [1–4] unveiled an intrinsic association between gamma distribution parameters and the biological process of protein synthesis. Based on these observations, we propose to use the Exp-gamma distribution for modeling single-cell protein levels in molecular biology and cancer research, since log-transformed protein abundances are often biologically more relevant to their cellular functions.

It is worth mentioning that many molecular biology data can be modeled by gamma distribution. For example, microRNA sequencing data often align closely with a gamma distribution due to the stochastic nature of exponential PCR amplification [19, 33, 34]. Additionally, the absolute abundance levels of metabolic [35] and microbiome [36] data exhibit characteristics that align with gamma distributions. Furthermore, log-transformation is a standard preprocessing step in the statistical analysis of these data. Hence, the Exp-gamma distribution is a good candidate for modeling log-transformed cellular protein abundance measurements.

3.2 Two sample t-test and the Wilcoxon-Mann-Whitney (WMW) test could be misleading

The two sample t-test and the WMW test are widely used in differential analysis for log-transformed protein abundance data in proteomics [5, 7, 14, 37–39]. However, the appropriateness of using these two tests in differential analysis under the Exp-gamma has not been investigated. Hence, in this section, we aim to use a simulation study to demonstrate their limitations in differential analysis.

Assume two samples of protein abundance obtained under different experimental conditions are from gamma distributions, i.e. X1 ∼ gamma(α1, β1), and X2 ∼ gamma(α2, β2). The differential analysis is based on log-transformed data from Y1 and Y2, where Y1 = log(X1) ∼ Exp-gamma(α1, β1) and Y2 = log(X2) ∼ Exp-gamma(α2, β2). We are interested in testing the equality of two Exp-gamma means.

We carried out simulations to evaluate the type I error control of two sample t-test and the WMW test under H0 : δ1 − δ2 = 0. Four parameter settings for (α1, β1) vs. (α2, β2) are considered: A) (0.2, 0.005) vs. (5, 4.509); B) (0.5, 0.14) vs. (10, 9.504); C) (1, 0.561) vs. (5, 4.509); and D) (5, 0.048) vs. (5, 0.048). In settings A, B, and C, the two Exp-gamma distributions differ, while in setting D, they are identical. Fig 3 presents the density plots under these four settings. It can be seen that these settings vary considerably despite the fact that they are all under H0 : δ1 − δ2 = 0. Under the null hypothesis of equal population means, the probability that Y1 is greater than Y2 (i.e. P(Y1 > Y2)), a measure for the difference between two populations, is 0.621, 0.593, 0.556, and 0.5, for settings A, B, C, and D, respectively, indicating setting A has the largest difference between two populations and setting D has the smallest difference. Note that generally speaking, P(Y1 > Y2) = 0.5 does not necessarily imply two populations are identical. In this simulation study, we deliberately design setting D to have two identical populations for the purpose of checking the applicability of two sample t-test and the WMW test under two identical Exp-gamma distributions. For each setting, we considered sample sizes from small (10) to large (75). For a given set of sample sizes and parameter configuration, 2000 observed datasets are generated. The simulated type I errors by two sample t-test and the WMW test are reported in Figs 4 and 5, respectively.
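The four settings can be checked with a short script. The sketch below assumes the mean formula ψ(α) − ln(β) with β as the rate, computes the mean gap for each setting, and estimates P(Y1 > Y2) by Monte Carlo. The names `SETTINGS`, `exp_gamma_mean` and `prob_y1_greater`, the seed and the number of draws are illustrative choices, not part of the paper's simulation code.

```python
import math
import random

def digamma(x):
    """psi(x) for x > 0: shift x upward by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

# (shape, rate) pairs for the two populations in each setting
SETTINGS = {
    "A": ((0.2, 0.005), (5.0, 4.509)),
    "B": ((0.5, 0.14), (10.0, 9.504)),
    "C": ((1.0, 0.561), (5.0, 4.509)),
    "D": ((5.0, 0.048), (5.0, 0.048)),
}

def exp_gamma_mean(alpha, beta):
    return digamma(alpha) - math.log(beta)

def prob_y1_greater(p1, p2, n_draws, rng):
    """Monte Carlo estimate of P(Y1 > Y2), equal to P(X1 > X2) since ln is increasing."""
    hits = 0
    for _ in range(n_draws):
        x1 = rng.gammavariate(p1[0], 1.0 / p1[1])   # gammavariate takes (shape, scale)
        x2 = rng.gammavariate(p2[0], 1.0 / p2[1])
        hits += x1 > x2
    return hits / n_draws

rng = random.Random(2024)
mean_gap = {}
prob = {}
for name, (p1, p2) in SETTINGS.items():
    mean_gap[name] = exp_gamma_mean(*p1) - exp_gamma_mean(*p2)
    prob[name] = prob_y1_greater(p1, p2, 100_000, rng)
    print(name, round(mean_gap[name], 4), round(prob[name], 3))
```

Because the rate parameters are reported to three decimals, the computed mean gaps are close to zero rather than exactly zero.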

Fig 3. Density plots of samples from four pairs of comparisons Y1 vs. Y2, where Y1 ∼ Exp-gamma(α1, β1) and Y2 ∼ Exp-gamma(α2, β2).

(A : (0.2, 0.005) vs. (5, 4.509); B: (0.5, 0.14) vs. (10, 9.504); C: (1, 0.561) vs. (5, 4.509); D: (5, 0.048) vs. (5, 0.048)).

https://doi.org/10.1371/journal.pone.0314705.g003

Fig 4. Estimated type I errors of two sample t-test for testing the equality of mean of two Exp-gamma distributions as a function of sample sizes.

The middle dashed line represents the nominal significance level at α = 0.05; and upper and lower dashed lines are upper and lower limits for satisfactory type I error rates, which are 0.06 and 0.04 with 2000 simulations runs, respectively.

https://doi.org/10.1371/journal.pone.0314705.g004

Fig 5. Estimated type I errors of the Wilcoxon-Mann-Whitney (WMW) test for testing the equality of mean of two Exp-gamma distributions as a function of sample sizes.

The middle dashed line represents the nominal significance level, which is set to α = 0.05; and upper and lower dashed lines are upper and lower limits for the type I error rates, which are 0.06 and 0.04, respectively.

https://doi.org/10.1371/journal.pone.0314705.g005

As shown in Fig 4, the type I errors of two sample t-test (or Welch’s test for unequal variances) converge to the nominal level as sample sizes increase, as guaranteed by the central limit theorem. In addition, when two Exp-gamma distributions are identical (setting D), the two sample t-test maintains controlled type I errors even when sample sizes are small. Note that the type I errors for setting D lie completely between the two dashed lines in Fig 4, which indicate the boundaries for satisfactory type I error rates given 2000 simulation runs. However, if two Exp-gamma distributions are different (i.e. settings A, B and C), the type I errors for testing the equality of means can be as high as 0.1, particularly when sample sizes are small (e.g. less than (50, 50) for settings A and B, and less than (30, 30) for setting C). Thus, two sample t-test is appropriate for testing the equality of two means of log-transformed protein abundance data when sample sizes are larger than (50, 50). When dealing with small to medium sample sizes, we should exercise caution with two sample t-test, especially when the two underlying distributions are very different.
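The text does not state how the limits 0.04 and 0.06 were obtained. One plausible reading, shown here purely as an assumption, is a normal-approximation band for a proportion estimated from 2000 independent simulation runs at a nominal level of 0.05:

```python
import math

nominal = 0.05      # nominal significance level
n_sim = 2000        # number of simulation runs per setting
z = 1.96            # two-sided 95% normal quantile (assumed)

half_width = z * math.sqrt(nominal * (1.0 - nominal) / n_sim)
lower = nominal - half_width
upper = nominal + half_width
print(round(lower, 2), round(upper, 2))
```

Under this assumption, an estimated type I error outside the band would be unlikely to arise from Monte Carlo noise alone if the true rate were 0.05.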

When the assumption of normality is in doubt, it is common practice to use the WMW test as an alternative, as it is a non-parametric test. However, while non-parametric tests such as the WMW test do not require normality, they test the null hypothesis that two populations are identical. Hence, when two populations have the same mean but are not identical, the WMW test does not guarantee that the significance level will be preserved. More details can be found in the paper by Pratt [18], which thoroughly investigated the effect of differences between two populations on the level of the WMW test for normal, double exponential, and rectangular distributions. In this simulation study, we investigate the effect of the difference between two Exp-gamma distributions on the significance level of the WMW test under the null hypothesis of equality of two Exp-gamma means. In Fig 5, we observe inflated type I errors for settings A, B, and C in the WMW test, and the magnitude of inflation increases as sample sizes grow. Furthermore, for given sample sizes, as the disparity measured by P(Y1 > Y2) increases, the inflation of type I error becomes worse; setting A has the worst type I error control among all settings. It is also notable that the type I errors are well controlled for setting D, in which the two distributions are identical. Hence, for testing equality of two Exp-gamma means, the WMW test can control type I error only when the two distributions are exactly the same, and the type I error can be severely out of control when the two distributions are not the same.

In summary, both two sample t-test and the WMW test have limitations in testing of equality of two independent Exp-gamma means. While two sample t-test is not ideal when sample sizes are below medium, the limitations for the WMW test are more severe as it requires the two distributions to be exactly the same under the null hypothesis. In practice, small to medium sizes are common in genomics studies, and scenarios with identical populations under the null hypothesis could be rare. Therefore, accurate procedures for statistical inference for the mean difference of two independent Exp-gamma distributions are desirable.

4 Inferences on the mean difference of two independent Exp-gamma distributions

Let Y1 and Y2 be two independent Exp-gamma random variables, i.e. Y1 ∼ Exp-gamma(α1, β1) and Y2 ∼ Exp-gamma(α2, β2). Note that Yi = ln Xi where Xi ∼ gamma(αi, βi), i = 1, 2, and X1 and X2 are independent. Then the population means for Y1 and Y2 are given as follows: $$\delta_i = \psi(\alpha_i) - \ln(\beta_i), \quad i = 1, 2.$$ Thus, the research interest is to perform hypothesis testing with satisfactory type I error control for H0 : δ1 = δ2 vs. H1 : δ1 ≠ δ2, and to construct a confidence interval for the mean difference η = δ1 − δ2 with satisfactory coverage probability.

4.1 The method based on generalized inference

The concepts of generalized variables and generalized pivots were introduced by Tsui and Weerahandi [40] and Weerahandi [41]. More details can be found in the book by Weerahandi [42]. In S2 Appendix, a brief summary of the core concepts is presented. The concepts of generalized pivotal quantity and generalized confidence interval have been successfully applied to a variety of practical problems when standard exact solutions do not exist, and it has been shown that generalized inference methods generally have good performance, even when sample sizes are small; see e.g. [43–45].

Although exact generalized pivots do not exist for gamma parameters, approximate generalized pivots have been proposed [22–26]. These approximate pivots have been utilized to make inference for gamma distributions, including single gamma means and the difference between two gamma means under different scenarios [27–29]. Utilizing the existing approximate generalized pivots for gamma parameters, we will develop the generalized inference methods for hypothesis testing and confidence interval estimation for the mean difference of two independent Exp-gamma distributions.

4.1.1 Generalized pivots for population parameters: A review.

Assume X ∼ gamma(α, β). In the following, we will first briefly review the existing approximate generalized pivots for gamma parameters α and β.

Krishnamoorthy and Wang’s method [25, 26]: By applying the Wilson-Hilferty normal approximation, W = X^(1/3) ∼ N(μ, σ²) approximately, generalized pivotal quantities for the normal mean and variance, Rμ and Rσ², can be obtained for the transformed data. Let $\bar{w}$ and $s_w^2$ be the observed sample mean and sample variance based on the transformed data W. The generalized pivotal quantities for α and β can be further expressed as: (5) where , and , with Z ∼ N(0, 1), , , and Z, U1, and U2 are independent.

Chen and Ye’s method: [22, 23] It is known that approximately, where v = 2E2(V1)/Var(V1) and c = E(V1/v). The detailed formulas for E(V1) and Var(V1) can be found in Chen and Ye [22]. Using this result, an approximate generalized pivotal quantity for α can be written as where , and are observed values of and . Furthermore, utilizing a well-known result regarding gamma distribution, i.e. , the generalized pivot quantity for β can be written as (6) where .

Wang and Wu’s method [24]: Let . Note that U = F(T) ∼ U(0, 1), where F(·) is the c.d.f. of T. On the basis of the Cornish-Fisher expansion, the Uth percentile of T can be approximated by κ1(α) + [κ2(α)]^(1/2) Q(α, U), where κj(α) is the jth cumulant of T and Q(α, U) is a function of the κj(α)’s. The detailed formulas can be found in Wang and Wu [24]. Let t denote the observed value of T. An approximate generalized pivotal quantity for α, i.e. Rα, can be obtained by solving t = κ1(α) + [κ2(α)]^(1/2) Q(α, U). Similar to Chen and Ye’s method, the approximate generalized pivotal quantity for the rate parameter, Rβ, can be obtained by Eq (6). This method improves Chen and Ye’s method and can work well even when the shape parameter α is small.

4.1.2 The generalized inference methods for hypothesis testing and confidence interval estimation for two independent Exp-gamma means.

For the ith (i = 1, 2) sample, the generalized pivotal quantities $R_{\alpha_i}$ and $R_{\beta_i}$ can be obtained by one of the three approximate generalized inference methods for gamma parameters, i.e. Krishnamoorthy and Wang’s method [25, 26], Chen and Ye’s method [22, 23], and Wang and Wu’s method [24], as reviewed in Section 4.1.1. Replacing αi with $R_{\alpha_i}$ and βi with $R_{\beta_i}$ in Eq (1), the generalized pivotal quantity for δi can be expressed as $$R_{\delta_i} = \psi(R_{\alpha_i}) - \ln(R_{\beta_i}). \qquad (7)$$

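The generalized pivotal quantity we propose for the mean difference (η) of two independent Exp-gamma distributions can be expressed as $$R_\eta = R_{\delta_1} - R_{\delta_2} = \left[\psi(R_{\alpha_1}) - \ln(R_{\beta_1})\right] - \left[\psi(R_{\alpha_2}) - \ln(R_{\beta_2})\right]. \qquad (8)$$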

It is easy to verify that Rη is a bona fide generalized pivotal quantity for η approximately. For a given data set and , the following holds: 1) the distribution of Rη is independent of any unknown parameters; 2) the value of Rη is η approximately when the statistics used in the definitions of and (i = 1, 2) are equal their observed value (e.g. in and in Chen and Ye’s method).

For testing the hypothesis of equality of two Exp-gamma means, $$H_0: \eta = \eta_0 \quad \text{vs.} \quad H_1: \eta \neq \eta_0, \qquad (9)$$ where η0 = 0. The generalized test variable is defined as $$T_\eta = R_\eta - \eta, \qquad (10)$$ where Rη is the generalized pivotal quantity defined in Eq (8). Note that Tη satisfies the three conditions to be a bona fide generalized test variable: 1) the distribution of Tη is free of nuisance parameters; 2) tη, the observed value of Tη, is 0, and hence is free of any unknown parameters; and 3) Tη is stochastically decreasing in η.

The generalized p-value for testing the hypothesis of equality of two Exp-gamma means is given by (11)

4.1.3 Computing algorithm.

Consider a given data set of Yij’s (i = 1, 2, j = 1, 2, …, ni), where the ith sample Yi ∼ Exp-gamma(αi, βi). The generalized p-value for testing equality of two Exp-gamma means, and the estimated confidence interval for the mean difference of two Exp-gamma distributions, can be computed by the following steps:

  1. Using one of the three methods presented above, generate $R_{\alpha_i}$ and $R_{\beta_i}$ for i = 1, 2, then compute the generalized pivot for δi following (7) for i = 1, 2.
  2. Compute the generalized pivot for η following (8).
  3. Repeat steps 1-2 a total of B (B = 2000) times and obtain an array of $R_\eta^{(b)}$’s for b = 1, 2, …, B.

Let Rη : p denote the 100p percentile of the B Rη’s generated in the preceding steps. Then (Rη : p/2, Rη : 1 − p/2) is a 100(1 − p)% confidence interval for the mean difference of two independent Exp-gamma distributions.

Under H0 : δ1 = δ2, the generalized p-value can be obtained by (11), i.e. (12) H0 is rejected if the p-value is less than a given significance level a.
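The steps above can be sketched in code. The following is an illustration only, not the authors' implementation: it uses a cube-root (Wilson-Hilferty) route to the gamma pivots with the standard normal-theory generalized pivots for a mean and variance, inverts the Wilson-Hilferty moment relations to get a shape and a rate, and forms a two-sided p-value as twice the smaller tail proportion of the simulated pivots. All of these choices, and the names `gk_pivot_draw` and `generalized_inference`, are assumptions made for this sketch.

```python
import math
import random

def digamma(x):
    """psi(x) for x > 0: shift x upward by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def gk_pivot_draw(x, rng):
    """One draw of (R_alpha, R_beta) from a gamma sample x; R_beta is a rate."""
    n = len(x)
    w = [v ** (1.0 / 3.0) for v in x]                 # cube-root transform
    w_bar = sum(w) / n
    s2 = sum((v - w_bar) ** 2 for v in w) / (n - 1)
    chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)       # chi-square, n - 1 df
    r_sigma2 = (n - 1) * s2 / chi2                    # pivot for the normal variance
    r_mu = w_bar + rng.gauss(0.0, 1.0) * math.sqrt(r_sigma2 / n)
    ratio = r_mu * r_mu / r_sigma2
    half = 1.0 + 0.5 * ratio
    r_alpha = (half + math.sqrt(half * half - 1.0)) / 9.0
    r_scale = 27.0 * math.sqrt(r_alpha) * r_sigma2 ** 1.5
    return r_alpha, 1.0 / r_scale

def generalized_inference(y1, y2, n_draws=2000, level=0.95, seed=0):
    """Percentile interval and two-sided p-value for eta = delta1 - delta2."""
    rng = random.Random(seed)
    x1 = [math.exp(v) for v in y1]
    x2 = [math.exp(v) for v in y2]
    r_eta = []
    for _ in range(n_draws):
        a1, b1 = gk_pivot_draw(x1, rng)
        a2, b2 = gk_pivot_draw(x2, rng)
        r_eta.append((digamma(a1) - math.log(b1)) - (digamma(a2) - math.log(b2)))
    r_eta.sort()
    tail = (1.0 - level) / 2.0
    lower = r_eta[int(tail * (n_draws - 1))]
    upper = r_eta[int(math.ceil((1.0 - tail) * (n_draws - 1)))]
    frac_le = sum(r <= 0.0 for r in r_eta) / n_draws
    frac_ge = sum(r >= 0.0 for r in r_eta) / n_draws
    return (lower, upper), min(1.0, 2.0 * min(frac_le, frac_ge))

data_rng = random.Random(7)

def draw(alpha, beta, n):
    return [math.log(data_rng.gammavariate(alpha, 1.0 / beta)) for _ in range(n)]

y_a = draw(5.0, 0.048, 30)
y_b = draw(5.0, 0.048, 30)     # same distribution as y_a
y_c = draw(5.0, 4.8, 30)       # mean lower than y_a by ln(100)
ci_same, p_same = generalized_inference(y_a, y_b)
ci_diff, p_diff = generalized_inference(y_a, y_c)
```

The cube-root approximation is known to degrade for small shape parameters, so this sketch is not suitable for settings such as α = 0.2.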

We refer to the three methods based on the generalized pivotal quantity of the Exp-gamma mean difference as GK, GC, and GW, corresponding to the methods used for gamma parameters, i.e. Krishnamoorthy and Wang’s method [25, 26], Chen and Ye’s method [22, 23], and Wang and Wu’s method [24], respectively.

4.2 The parametric bootstrap method

The parametric bootstrap (PB) method has been widely used in estimating confidence intervals when the parametric model is justified, e.g. [21, 46]. In this section, we propose a PB method for hypothesis testing and confidence interval estimation for the mean difference of two independent Exp-gamma distributions.

Let $\bar{Y}_i$ denote the mean based on a sample of size ni from an Exp-gamma(αi, βi) distribution, i = 1, 2. Let $\hat{\alpha}_i$ and $\hat{\beta}_i$ denote the MLEs of αi and βi, respectively. Similarly, let $\bar{Y}_i^*$ denote the mean based on a bootstrap sample of size ni from the Exp-gamma($\hat{\alpha}_i$, $\hat{\beta}_i$) distribution. Let $\hat{\alpha}_i^*$ and $\hat{\beta}_i^*$ denote the MLEs based on a bootstrap sample, i = 1, 2. The PB pivot to estimate the difference between two means δ1 = ψ(α1) − ln(β1) and δ2 = ψ(α2) − ln(β2) is given by (13)

The following steps can be used to obtain the p-values for hypothesis testing in Eq (9), decision rules, and confidence interval for η based on PB method:

  1. For a given sample of size ni, calculate the MLEs $\hat{\alpha}_i$ and $\hat{\beta}_i$, i = 1, 2.
  2. Generate bootstrap samples of size ni from gamma($\hat{\alpha}_i$, $\hat{\beta}_i$). Then calculate $\bar{Y}_i^*$ and the MLEs $\hat{\alpha}_i^*$ and $\hat{\beta}_i^*$ based on the bootstrap samples for i = 1, 2.
  3. Calculate Qη as in Eq (13).
  4. Repeat steps 2–3 a total of B (B = 2000) times and obtain an array of $Q_\eta^{(b)}$’s for b = 1, 2, …, B.
  5. The p-value can be obtained by
  6. H0 is rejected if the p-value is less than a given significance level a.
  7. The 100(1 − p)% PB confidence interval can be obtained as where Qη;p denotes the 100p percentile of Qη.
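Steps 1 and 2 of the list above can be sketched as follows. This sketch covers only the MLE computation and the generation of one bootstrap sample; it does not implement the pivot of Eq (13) or the later steps. The function names `gamma_mle` and `bootstrap_sample`, the Newton iteration with its closed-form starting value, and the simulated data are all assumptions made for illustration.

```python
import math
import random

def digamma(x):
    """psi(x) for x > 0: shift x upward by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def trigamma(x):
    """psi'(x) for x > 0: shift x upward by recurrence, then asymptotic series."""
    result = 0.0
    while x < 10.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return result + inv + 0.5 * inv2 + series

def gamma_mle(x, tol=1e-12, max_iter=100):
    """MLE (shape, rate) of a gamma sample via Newton's method on the shape."""
    n = len(x)
    x_bar = sum(x) / n
    # s > 0 by Jensen's inequality unless all observations are equal
    s = math.log(x_bar) - sum(math.log(v) for v in x) / n
    alpha = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    for _ in range(max_iter):
        f = math.log(alpha) - digamma(alpha) - s
        new_alpha = alpha - f / (1.0 / alpha - trigamma(alpha))
        if new_alpha <= 0.0:
            new_alpha = alpha / 2.0       # keep the iterate positive
        converged = abs(new_alpha - alpha) <= tol * alpha
        alpha = new_alpha
        if converged:
            break
    return alpha, alpha / x_bar

def bootstrap_sample(alpha_hat, beta_hat, n, rng):
    """One parametric bootstrap sample from gamma(alpha_hat, rate beta_hat)."""
    return [rng.gammavariate(alpha_hat, 1.0 / beta_hat) for _ in range(n)]

rng = random.Random(11)
x = [rng.gammavariate(5.0, 1.0 / 0.048) for _ in range(50)]    # simulated data
alpha_hat, beta_hat = gamma_mle(x)                              # step 1
delta_hat = digamma(alpha_hat) - math.log(beta_hat)
y_bar = sum(math.log(v) for v in x) / len(x)
x_star = bootstrap_sample(alpha_hat, beta_hat, len(x), rng)     # step 2
alpha_star, beta_star = gamma_mle(x_star)
```

In this sketch `delta_hat` agrees with `y_bar` up to numerical tolerance, which follows from the score equation for the shape parameter of the gamma likelihood.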

5 Simulation studies

In the previous section, we presented several methods for hypothesis testing and confidence interval estimation for the mean difference between two independent Exp-gamma distributions: three methods based on the generalized pivots (i.e. GC, GW, and GK), and a parametric bootstrap method (i.e. PB). Simulation studies are carried out to evaluate the performance of the proposed methods for hypothesis testing and confidence interval estimation.

5.1 Hypothesis testing

Sample sizes are set from small (10) to large (75), including both balanced and unbalanced settings. The parameter settings for type I error control include scenarios of equal/unequal shape parameters, with the common mean of two samples ranging from −1.369 to 4.634. The parameter settings for power study include scenarios with equal/unequal shape parameters with the mean difference ranging from 0.5 to 1.386. For each parameter setting, 2000 random samples are generated with given sample sizes. For the type I error and power based on generalized inference methods (GC, GW, and GK), 2000 values of generalized pivots are obtained for each random sample. For the type I error and power obtained by PB method, 2000 bootstrap samples are generated for each random sample.

Table 1 presents the type I error rate estimates of hypothesis testing based on the proposed methods (GC, GW, GK, and PB), in comparison with t-test and the WMW test, for testing the equality of means of two Exp-gamma distributions. Note that for the first three scenarios, the two Exp-gamma distributions are identical. The remaining scenarios are ranked using P(Y1 > Y2) in ascending order, indicating a larger disparity between two Exp-gamma distributions under the null hypothesis.

Table 1. Estimated type I errors for testing the equality of means of two independent Exp-gamma distributions (2000 simulations).

https://doi.org/10.1371/journal.pone.0314705.t001

Out of the three proposed methods based on the generalized pivots, GC and GW have excellent type I error control regardless of shapes, rates, sample sizes, and the value of P(Y1 > Y2). In contrast, GK can have inflated type I errors when the shape parameter(s) are small (e.g. scenarios 10, 15–18). The reason is that GK obtains approximate generalized pivotal quantities based on the normal approximation of the distribution with a cube root transformation, and such an approximation can be very inaccurate when the shape parameter is small [22]. For all scenarios, the PB method has inflated type I errors when sample sizes are less than (50, 50). As sample sizes increase, the PB method shows improved type I error control. The inflation of type I errors in the PB method with small sample sizes is primarily due to the instability in estimating parameters from limited data. Furthermore, this improvement in type I error control does not occur when the shape parameter(s) are small (e.g. scenarios 10, 15–18). Small shape parameters in Exp-gamma distributions lead to highly left-skewed data, which exacerbates the difficulty of accurately estimating the parameters. The type I error of two sample t-test converges to the nominal level as sample sizes increase, as guaranteed by the central limit theorem. However, it can have inflated type I errors when sample sizes are less than (50, 50), especially when the disparity between two distributions is obvious (e.g. when P(Y1 > Y2) is larger than 0.555). For the first three scenarios, for which the two Exp-gamma distributions are identical, the WMW test has excellent type I error control. However, as the value of P(Y1 > Y2) deviates from 0.5, the WMW test tends to have more severely inflated type I errors as sample sizes increase. Moreover, the magnitude of type I error inflation increases as the value of P(Y1 > Y2) becomes larger.

Note that scenarios 2, 12, 16, and 18 in Table 1 correspond to scenarios D, C, B, and A, respectively, discussed in Section 3.2, and the type I errors of the two sample t-test and the WMW test for these scenarios are presented in Figs 4 and 5. To help visualize the performance of the proposed methods, Fig 6 presents the type I errors obtained by GC, GW, GK, and PB for these four scenarios.

Fig 6. Estimated type I errors of the hypothesis tests based on generalized pivots (GC, GW, and GK) and the parametric bootstrap method (PB) for testing the equality of means of two Exp-gamma distributions as a function of sample sizes.

The middle dashed line represents the nominal significance level, which is set to α = 0.05; the upper and lower dashed lines mark the upper and lower limits for the type I error rates, which are 0.06 and 0.04, respectively.

https://doi.org/10.1371/journal.pone.0314705.g006

Table 2 presents the estimated power of the hypothesis tests based on the proposed methods (GC, GW, GK, and PB), in comparison with the two sample t-test and the WMW test.

Table 2. Estimated powers for testing the equality of means of two independent Exp-gamma distributions under H1: δ1 ≠ δ2 (2000 simulations).

https://doi.org/10.1371/journal.pone.0314705.t002

In light of the type I error results presented in Table 1, caution should be exercised when interpreting the estimated power and comparing methods. Note the following: 1) the power of the WMW test when the value of P(Y1 > Y2) deviates from 0.5 is not interpretable because of its inflated type I error in these cases; 2) the power of the two sample t-test might be inflated when sample sizes are less than (50, 50) because of its inflated type I error in these cases; 3) the power of the PB method could be inflated because of its poor type I error control, especially at small sample sizes; 4) GK can have inflated power because of its inflated type I error when the shape parameter (α) is small. For example, for scenarios 26 and 27, where the value of P(Y1 > Y2) deviates more from 0.5, the WMW test and the two sample t-test have higher power than GC, GW, and GK when sample sizes are small. Such observations are due to the inflated type I errors of the WMW test and the two sample t-test; hence they should not be interpreted as evidence that the t-test and the WMW test are more powerful.

Overall, the two generalized inference methods with good type I error control, i.e. GC and GW, have comparable power. When sample sizes exceed (50, 50), the power of the two sample t-test and of the PB test is comparable to that of GC and GW.

In summary, we recommend both the GC and GW methods for hypothesis testing of two independent Exp-gamma distributions because they provide decent power with excellent type I error control, even when sample sizes are small. The GK method is not recommended because it has inflated type I errors in certain scenarios, such as when the shape parameter is less than 0.5. The PB method has inflated type I errors when sample sizes are small or when the shape parameter(s) are small, leading to incorrect rejection of the null hypothesis. The two sample t-test may exhibit inflated type I errors at small sample sizes. The WMW test maintains controlled type I errors only when the two distributions are identical; hence it is not a reliable choice.

5.2 Confidence intervals

The three proposed methods based on generalized pivots (i.e. GC, GW, and GK) and the parametric bootstrap (PB) method can provide confidence intervals for the mean difference between two Exp-gamma distributions. Confidence intervals from the two sample t-test are also provided for comparison. Note that, in theory, the WMW method cannot yield a confidence interval for the mean difference.

Simulation studies are carried out to evaluate the performance of the proposed methods in terms of coverage probabilities and average lengths of the confidence intervals for the mean difference of two independent Exp-gamma distributions. The sample sizes are set as (10, 10), (20, 20), (30, 30), (20, 50), (50, 50), (50, 75), and (75, 75). We consider settings with equal means (i.e. η = 0) as well as different means (i.e. η ≠ 0), and with equal or unequal shape parameters. For each parameter setting, 2000 samples are simulated. For the generalized confidence intervals, 2000 Rη’s are obtained. For the PB method, B = 2000 bootstrap samples are used.
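The simulation design just described can be sketched as a small harness that estimates the coverage probability and average length of any interval method. The sketch below is illustrative only. It assumes the Exp-gamma mean is digamma(α) − log(β) for a gamma shape α and rate β, and it plugs in a Welch t interval as the example method, since the generalized pivots themselves are defined elsewhere in the paper. The names `welch_ci` and `coverage_and_length` are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.special import digamma


def welch_ci(y1, y2, level=0.95):
    """Welch t confidence interval for the difference of two means."""
    n1, n2 = len(y1), len(y2)
    v1 = np.var(y1, ddof=1) / n1
    v2 = np.var(y2, ddof=1) / n2
    se = np.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    half = stats.t.ppf(0.5 + level / 2.0, df) * se
    diff = np.mean(y1) - np.mean(y2)
    return diff - half, diff + half


def coverage_and_length(ci_fn, shape1, rate1, shape2, rate2, n1, n2,
                        n_sim=2000, seed=0):
    """Estimate coverage probability and average length of ci_fn for eta."""
    rng = np.random.default_rng(seed)
    # True mean difference eta between the two Exp-gamma distributions
    eta = (digamma(shape1) - np.log(rate1)) - (digamma(shape2) - np.log(rate2))
    hits = 0
    total_length = 0.0
    for _ in range(n_sim):
        y1 = np.log(rng.gamma(shape1, 1.0 / rate1, n1))
        y2 = np.log(rng.gamma(shape2, 1.0 / rate2, n2))
        lo, hi = ci_fn(y1, y2)
        hits += int(lo <= eta <= hi)
        total_length += hi - lo
    return hits / n_sim, total_length / n_sim
```

Any other interval method with the same `(y1, y2) -> (lower, upper)` signature can be passed as `ci_fn`, which makes it straightforward to compare methods over the same grid of sample sizes.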

Table 3 presents the coverage probabilities and average lengths of the proposed confidence intervals. Overall, the GC and GW methods, which are based on generalized pivots, maintain satisfactory coverage probabilities for all settings, except that they might be slightly conservative at small sample sizes such as (10, 10). The GK method is not recommended when the shape parameter is less than 0.5, because this normal-based method does not work well when the shape parameter is small [22]. The confidence intervals obtained by the PB method are liberal when sample sizes are small, although their coverage probabilities converge to the nominal level when sample sizes reach (50, 50). The coverage probabilities of the two sample t-test converge to the nominal level as sample sizes increase. However, for some scenarios, the t-test can be liberal when sample sizes are small, such as scenario 19 when sample sizes are less than (30, 30), and scenario 20 at (10, 10). In terms of the length of the confidence intervals, the PB method appears to provide the shortest intervals among the proposed methods when sample sizes are small. However, this is a consequence of the PB method being liberal at small sample sizes, and hence it should not be interpreted as an advantage. Once sample sizes reach (50, 50), all four methods are generally comparable in terms of length.

Table 3. Coverage probabilities and average lengths of proposed 95% confidence intervals for mean difference of two independent Exp-gamma distributions (2000 simulations).

https://doi.org/10.1371/journal.pone.0314705.t003

In summary, we generally recommend the proposed GC and GW methods over the GK method, the PB method, and the two sample t-test, because the GC and GW methods maintain satisfactory coverage probabilities even at small sample sizes and when the shape parameters are small.
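To make the parametric bootstrap idea concrete, the following is a minimal sketch of a generic percentile parametric bootstrap interval for the mean difference η. It may differ from the PB procedure defined earlier in the paper and should be read as an assumption-laden illustration. It fits a gamma distribution to each back-transformed sample by maximum likelihood with `scipy.stats.gamma.fit`, simulates new samples from the fitted models, and uses the sample mean of the logged values as the estimate of each Exp-gamma mean (for the gamma model, the maximum likelihood estimate of digamma(α) − log(β) coincides with the mean of the log data). The function names are hypothetical.

```python
import numpy as np
from scipy import stats


def fit_gamma(x):
    """Maximum likelihood gamma fit with location fixed at 0; returns (shape, scale)."""
    shape, _, scale = stats.gamma.fit(x, floc=0)
    return shape, scale


def pb_percentile_ci(y1, y2, n_boot=2000, level=0.95, seed=0):
    """Percentile parametric bootstrap interval for the Exp-gamma mean difference."""
    rng = np.random.default_rng(seed)
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    # Fit gamma models to the back-transformed (original scale) data
    a1, s1 = fit_gamma(np.exp(y1))
    a2, s2 = fit_gamma(np.exp(y2))
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Simulate from the fitted models and return to the log scale
        yb1 = np.log(rng.gamma(a1, s1, y1.size))
        yb2 = np.log(rng.gamma(a2, s2, y2.size))
        diffs[b] = yb1.mean() - yb2.mean()
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(diffs, [tail, 1.0 - tail])
    return float(lo), float(hi)
```

With small samples the fitted shape and scale are themselves noisy, so the simulated distribution can be too narrow, which is one way to see why bootstrap intervals of this kind may be liberal at sizes such as (10, 10).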

6 Data examples

In this section, we illustrate the proposed methods using publicly accessible data from a recent study by Hao et al. [5] that simultaneously measured mRNA expression and protein abundance at the single-cell level. In this study, peripheral blood mononuclear cell (PBMC) samples from eight volunteers were collected before (day 0) and after (days 3 and 7) HIV vaccination, yielding a total of 210,911 cells. The CITE-seq method was used to simultaneously quantify RNA and surface protein abundance in single cells via the sequencing of antibody-derived tags (ADTs). The analyses identified 57 clusters of different cell types, encompassing all major and minor immune cell types and revealing striking cellular diversity.

For demonstration purposes, and without delving deeply into the biological details of immune cell function, we focus on protein abundance data in the cluster of plasmacytoid dendritic cells (pDCs). pDCs release type 1 interferon in response to viral infection [47] and thus could serve as an indicator of the immune response to vaccination. Although pDC counts are usually low in PBMC samples, as shown in Hao’s study, these cells may play a critical role in regulating gene expression and innate immune responses [48]. The data used in this manuscript are provided in S1 Table. Based on the physical model [14], it is reasonable to assume that the measured protein abundances of individual cells follow a gamma distribution. This assumption is further verified by a goodness-of-fit test for the gamma distribution [49], using the method implemented in the R goft package. The biological effects were estimated by comparing the log-transformed protein abundances.
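As a rough illustration of checking the gamma assumption, the sketch below uses a parametric bootstrap Kolmogorov-Smirnov test. This is a substitute technique chosen for illustration and is not the variance ratio test [49] implemented in the R goft package. The function name `gamma_gof_pvalue` is hypothetical, and the input is assumed to be strictly positive protein abundances on the original (not log) scale.

```python
import numpy as np
from scipy import stats


def gamma_gof_pvalue(x, n_boot=500, seed=0):
    """Parametric bootstrap p-value for the KS distance to a fitted gamma."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Fit the gamma model with the location fixed at 0
    shape, _, scale = stats.gamma.fit(x, floc=0)
    d_obs = stats.kstest(x, "gamma", args=(shape, 0, scale)).statistic
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        # Simulate from the fitted model and refit, so that the reference
        # distribution accounts for the parameters being estimated
        xb = rng.gamma(shape, scale, x.size)
        sb, _, cb = stats.gamma.fit(xb, floc=0)
        d_boot[b] = stats.kstest(xb, "gamma", args=(sb, 0, cb)).statistic
    return float((1 + np.sum(d_boot >= d_obs)) / (n_boot + 1))
```

A small p-value suggests that the gamma model fits poorly. Refitting inside the loop matters because the plain KS p-value is not valid when the parameters are estimated from the same data.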

In this section, different analyses are performed to investigate the variation in protein abundance between donors and across time points within pDC cells. According to the simulation results in Section 5, the PB method is not suitable, as it yields inflated type I errors when sample sizes are small, which is very common for protein abundance data. Furthermore, GK, one of the methods based on generalized pivots, generates inaccurate results when the shape parameter is small, potentially leading to unreliable testing outcomes. Thus, we apply the two recommended testing approaches based on generalized pivots (i.e. GC and GW) to the log-transformed data. For comparison, we also analyze the data using the two sample t-test and the WMW test, which are commonly used in the differential analysis of protein abundance data and in other genomic studies. More details are described in Example 1 and Example 2 below. To enable direct comparisons between donors regardless of differences in sample size, we use relative counts (RC) of protein abundance. In this single-cell study, the sample sizes are the counts of pDC cells, which are generally small, making the proposed methods well suited to these data.

Example 1. Comparison of log-transformed protein abundance data for two different donors at the same time point.

Analyzing protein abundance data across donors at given time points allows us to assess variation in the immune response to vaccination among individuals. Since the consistency of the immune response is crucial for vaccine success, accurately assessing protein levels at fixed times relative to vaccination is essential for evaluating a vaccine's quality and effectiveness.

Table 4 lists summary statistics for four genes: Rat-IgG1–2 (donor P1 vs. P3 at day 7), CD3–2 (donor P3 vs. P8 at day 0), CD226 (donor P1 vs. P6 at day 0), and CD44–2 (donor P1 vs. P3 at day 3). The estimated p-values and confidence intervals from the proposed methods (GC and GW), the two sample t-test, and the WMW test are also presented. For these four proteins, the GC and GW methods reach conclusions about significance that differ from those of the two sample t-test and the WMW test.

Table 4. Testing the equality of protein abundance data from different donors at the same time point (p-value and estimated confidence interval for mean difference).

https://doi.org/10.1371/journal.pone.0314705.t004

As observed in the simulation studies, the two sample t-test is unreliable when the disparity between the two distributions is large, especially when sample sizes are less than (50, 50). Moreover, the WMW test is very sensitive to differences between the shapes of the distributions. For this data set, the sample sizes (counts of pDC cells) are generally small, and the data exhibit different distributions for different donors at the same time point. Therefore, the results of the two sample t-test and the WMW test should not be trusted.

For instance, when comparing the protein abundance of Rat-IgG1–2 between donor P1 and donor P3 at day 7, our proposed approaches (GC and GW) reveal a significant mean difference, whereas the two sample t-test and the WMW test fail to identify this difference. On the other hand, when examining the protein abundance of CD226 between donor P1 and donor P6 at day 0, the GC and GW methods indicate no significant difference between the two donors, whereas the two sample t-test and the WMW test indicate otherwise. These erroneous conclusions based on the t-test and the WMW test may lead us to mischaracterize the nature of the vaccine response related to these genes.

Furthermore, the estimated 95% confidence intervals for the mean difference (η) are also presented in Table 4; the intervals from the GC and GW methods generally have comparable lengths.

Example 2. Comparison of log-transformed protein abundance data at two different times for the same donor.

One important aspect of the study by Hao et al. [5] is to characterize the response to vaccination for each of the previously identified cell types, with particular interest in identifying the cell populations that contribute most strongly to the innate immune response. This response is expected to be highly activated at the first time point after vaccination (day 3) and to subsequently dampen at the second time point (day 7), as observed with another non-replicating viral vectored HIV vaccine [50]. Therefore, donors with a strong innate immune response to vaccination, and the activated genes, can be identified by comparing the gene expression at day 0 with that at days 3 and 7, as described in Hao et al. [5]. The ability to identify individuals with a strong innate immune response is critical to our understanding of antibody production related to the vaccine and other factors. Such insight may also help us develop more personalized treatments. Finally, we would like to point out that, due to the nature of single-cell experiments, the protein levels at different time points are measured on independent sets of cells. Therefore, the t-test and the WMW test for two independent samples are used when we investigate the changes in the means of log-transformed protein abundance between two times for a given donor in this example.

Table 5 lists summary statistics for three genes (CD48, CD45–1, and CD337) of donor P8 for day 0 vs. day 3, and day 0 vs. day 7. The estimated p-values and confidence intervals from the proposed methods (GC and GW), and from the commonly used two sample t-test and WMW test, are presented. For these three genes, the proposed methods may or may not reach conclusions about significance that differ from those of the two sample t-test and the WMW test.

Table 5. Testing the equality of protein abundance data from the same donor at different time points (p-value and estimated confidence interval for mean difference).

https://doi.org/10.1371/journal.pone.0314705.t005

For example, when comparing the protein abundance of CD48 between day 0 and day 3 for donor P8, our proposed approaches (GC and GW) identify a significant mean difference in the log-transformed samples. Furthermore, the immune response dampens at day 7, and the GC and GW methods find no significant difference between day 0 and day 7. This pattern aligns with the characteristics of the innate immune response described above. However, both the two sample t-test and the WMW test fail to detect a significant difference between day 0 and day 3, which would suggest that the changes in CD48 abundance do not fit the pattern. Similar patterns are observed for CD337 and CD45–1 by the GC and GW methods, while such discoveries would have been missed by either the WMW test or the t-test.

Interestingly, the genes CD48 [51], CD45–1 [52], and CD337 [48] all play important roles in the human immune system. The multiple protein abundance changes observed in donor P8 may indicate an innate immune response that differs from that of the other donors, who do not show the patterns of change described above. This discovery may point to the existence of minority subtypes with different responses to the vaccine. A closer examination of changes in immune-related gene profiles in response to the vaccine might have clinical value.

Furthermore, the estimated 95% confidence intervals for the mean difference (η) by proposed methods are presented in Table 5.

7 Summary and discussion

In genomics, the two sample t-test and the Wilcoxon-Mann-Whitney (WMW) test are commonly used to identify proteins that differentiate between experimental conditions, and the comparison is usually applied to log-transformed protein abundance data [39]. However, protein abundance data can be modeled by the gamma distribution [3, 53, 54], and the shape of the protein abundance distribution needs to be taken into consideration in differential analysis [19].

In this paper, we demonstrated the inappropriateness of using the two sample t-test and the WMW test for testing the equality of means of two log-transformed protein abundance samples. We proposed several methods for two-sample hypothesis testing and confidence interval estimation for the mean difference of two independent Exp-gamma distributions.

Through comprehensive simulation studies, we demonstrated that two of the proposed methods (i.e. GC and GW), which are based on the concepts of generalized inference, have excellent type I error control for testing the equality of two Exp-gamma means. Additionally, these methods provide satisfactory confidence intervals for the Exp-gamma mean difference, with consistent performance across different parameter settings and sample sizes. Furthermore, the GC and GW methods work well even when the data are highly skewed. On the other hand, the GK method is not recommended when the shape parameter(s) are less than 0.5, as the cube root transformation is not accurate with small shape parameter(s). The PB method is not advised when sample sizes are small or when the shape parameter(s) are less than 0.5, because highly skewed data increase the risk of unstable parameter estimates. Possible inflation of the type I error when using parametric bootstrap analysis has been observed by many researchers in different settings; e.g. Golzarri-Arroyo et al. [55]. Hence, we advise practitioners to use caution when applying the parametric bootstrap method in practice.

We expect the proposed methods to have broad applicability to differential analysis in genomics studies and other applied fields. The proposed approaches for hypothesis testing and confidence interval estimation are easy to implement, and their running time is feasible on standard computer platforms.

The R program is available upon request from Dr. Yan at li.yan@roswellpark.org.

Supporting information

S1 Table. Real data example.

The data used in Tables 4 and 5.

https://doi.org/10.1371/journal.pone.0314705.s001

(XLSX)

S1 Appendix. The characteristics of exponential-gamma (Exp-gamma) distribution.

https://doi.org/10.1371/journal.pone.0314705.s002

(PDF)

S2 Appendix. Generalized pivots and generalized test variables.

https://doi.org/10.1371/journal.pone.0314705.s003

(PDF)

References

  1. Friedman N, Cai L, Xie XS. Linking stochastic dynamics to population distribution: an analytical framework of gene expression. Physical Review Letters. 2006;97:168302. pmid:17155441
  2. Shahrezaei V, Swain PS. Analytical distributions for stochastic gene expression. Proceedings of the National Academy of Sciences. 2008;105(45):17256–17261. pmid:18988743
  3. Taniguchi Y, Choi PJ, Li GW, Chen H, Babu M, Hearn J, et al. Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science. 2010;329(5991):533–538. pmid:20671182
  4. Li GW, Xie XS. Central dogma at the single-molecule level in living cells. Nature (London). 2011;475(7356):308–315. pmid:21776076
  5. Hao Y, Hao S, Andersen-Nissen E, Mauck WM, Zheng S, Butler A, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184(13):3573–3587.e29. pmid:34062119
  6. Xie H, Ding X. The intriguing landscape of single-cell protein analysis. Advanced Science. 2022;9(12):2105932. pmid:35199955
  7. Kammers K, Cole RN, Tiengwe C, Ruczinski I. Detecting significant changes in protein abundance. EuPA Open Proteomics. 2015;7(C):11–19. pmid:25821719
  8. Scherf U, Ross DT, Waltham M, Smith LH, Lee JK, Tanabe L, et al. A gene expression database for the molecular pharmacology of cancer. Nature Genetics. 2000;24(3):236–244. pmid:10700175
  9. Tauman R, Ivanenko A, O’Brien LM, Gozal D. Plasma C-reactive protein levels among children with sleep-disordered breathing. Pediatrics. 2004;113(6):e564–e569. pmid:15173538
  10. Zybailov B, Mosley AL, Sardiu ME, Coleman MK, Florens L, Washburn MP. Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae. Journal of Proteome Research. 2006;5(9):2339–2347. pmid:16944946
  11. Taylor SC, Nadeau K, Abbasi M, Lachance C, Nguyen M, Fenrich J. The ultimate qPCR experiment: producing publication quality, reproducible data the first time. Trends in Biotechnology (Regular ed). 2019;37(7):761–774. pmid:30654913
  12. Thomas JG, Olson JM, Tapscott SJ, Zhao LP. An efficient and robust statistical modeling approach to discover differentially expressed genes using genomic expression profiles. Genome Research. 2001;11(7):1227–1236. pmid:11435405
  13. Gontarz P, Fu S, Xing X, Liu S, Miao B, Bazylianska V, et al. Comparison of differential accessibility analysis strategies for ATAC-seq data. Scientific Reports. 2020;10(1):10150. pmid:32576878
  14. Li Y, Ge X, Peng F, Li W, Li JJ. Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology. 2022;23(1):79. pmid:35292087
  15. Law CW, Chen Y, Shi W, Smyth GK. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology. 2014;15(2):R29. pmid:24485249
  16. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 2014;15(12):550. pmid:25516281
  17. Fay MP, Proschan MA. Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Statistics Surveys. 2010;4:1. pmid:20414472
  18. Pratt JW. Robustness of some procedures for the two-sample location problem. Journal of the American Statistical Association. 1964;59(307):665–680.
  19. de Torrenté L, Zimmerman S, Suzuki M, Christopeit M, Greally JM, Mar JC. The shape of gene expression distributions matter: how incorporating distribution shape improves the interpretation of cancer transcriptomic data. BMC Bioinformatics. 2020;21(21):1–18. pmid:33371881
  20. Fraser DAS, Reid N, Wong A. Simple and accurate inference for the mean of the gamma model. Canadian Journal of Statistics. 1997;25(1):91–99.
  21. Krishnamoorthy K, León-Novelo L. Small sample inference for gamma parameters: one-sample and two-sample problems. Environmetrics (London, Ont). 2014;25(2):107–126.
  22. Chen P, Ye ZS. Approximate statistical limits for a gamma distribution. Journal of Quality Technology. 2017;49(1):64–77.
  23. Chen P, Ye ZS. Estimation of field reliability based on aggregate lifetime data. Technometrics. 2017;59(1):115–125.
  24. Wang BX, Wu F. Inference on the Gamma distribution. Technometrics. 2018;60(2):235–244.
  25. Krishnamoorthy K, Wang XG. Fiducial confidence limits and prediction limits for a Gamma distribution: censored and uncensored cases. Environmetrics. 2016;27:479–493.
  26. Krishnamoorthy K, Mathew T, Mukherjee S. Normal-based methods for a Gamma distribution: prediction and tolerance intervals and stress-strength reliability. Technometrics. 2008;50(1):69–78.
  27. Wang X, Zou C, Yi L, Wang J, Li X. Fiducial inference for gamma distributions: two-sample problems. Communications in Statistics - Simulation and Computation. 2021;50(3):811–821.
  28. Wang X, Li X, Zhang L, Liu Z, Li M. Fiducial inference on gamma distributions: two-sample problems with multiple detection limits. Environmental and Ecological Statistics. 2022;29(3):453–475.
  29. Gao Y, Tian L. Confidence interval estimation for the difference and ratio of the means of two gamma distributions. Communications in Statistics - Simulation and Computation. 2022;0(0):1–14.
  30. Lister R, O’Malley RC, Tonti-Filippini J, Gregory BD, Berry CC, Millar AH, et al. Highly Integrated Single-Base Resolution Maps of the Epigenome in Arabidopsis. Cell. 2008;133(3):523–536. pmid:18423832
  31. Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, et al. A survey of best practices for RNA-seq data analysis. Genome Biology. 2016;17(1):13. pmid:26813401
  32. Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, et al. Global quantification of mammalian gene expression control. Nature. 2011;473(7347):337–342. pmid:21593866
  33. Wang J, Huang M, Torre E, Dueck H, Shaffer S, Murray J, et al. Gene expression distribution deconvolution in single-cell RNA sequencing. Proceedings of the National Academy of Sciences. 2018;115(28):E6437–E6446. pmid:29946020
  34. Qin LX, Tuschl T, Singer S. Empirical insights into the stochasticity of small RNA sequencing. Scientific Reports. 2016;6(1):24061. pmid:27052356
  35. Stopka SA, Khattar R, Agtuca BJ, Anderton CR, Pasa-Tolic L, Stacey G, et al. Metabolic noise and distinct subpopulations observed by single cell LAESI mass spectrometry of plant cells in situ. Frontiers in Plant Science. 2018;9:1646. pmid:30498504
  36. Cappellato M, Baruzzo G, Di Camillo B. Investigating differential abundance methods in microbiome data: A benchmark study. PLOS Computational Biology. 2022;18:e1010467. pmid:36074761
  37. Devore JL, Peck R. Statistics: The exploration and analysis of data. Taylor & Francis; 1997.
  38. Pan W. A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments. Bioinformatics. 2002;18(4):546–554. pmid:12016052
  39. Old WM, Meyer-Arendt K, Aveline-Wolf L, Pierce KG, Mendoza A, Sevinsky JR, et al. Comparison of label-free methods for quantifying human proteins by shotgun proteomics. Molecular & Cellular Proteomics. 2005;4(10):1487–1502.
  40. Tsui KW, Weerahandi S. Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters. Journal of the American Statistical Association. 1989;84(406):602–607.
  41. Weerahandi S. Generalized confidence intervals. Journal of the American Statistical Association. 1993;88(423):899–905.
  42. Weerahandi S. Exact statistical methods for data analysis. Springer-Verlag; 1995.
  43. Tian L, Cappelleri JC. A new approach for interval estimation and hypothesis testing of a certain intraclass correlation coefficient: the generalized variable method. Statistics in Medicine. 2004;23(13):2125–2135. pmid:15211607
  44. Lin SH, Lee JC, Wang RS. Generalized inferences on the common mean vector of several multivariate normal populations. Journal of Statistical Planning and Inference. 2007;137(7):2240–2249.
  45. Lai CY, Tian L, Schisterman EF. Exact confidence interval estimation for the Youden index and its corresponding optimal cut-point. Computational Statistics and Data Analysis. 2012;56(5):1103–1114. pmid:27099407
  46. Yan L. Confidence interval estimation of the common mean of several gamma populations. PloS One. 2022;17(6):1–13. pmid:35714130
  47. Collin M, McGovern N, Haniffa M. Human dendritic cell subsets. Immunology. 2013;140(1):22–30. pmid:23621371
  48. Xi Y, Troy NM, Anderson D, Pena OM, Lynch JP, Phipps S, et al. Critical role of plasmacytoid dendritic cells in regulating gene expression and innate immune responses to human rhinovirus-16. Frontiers in Immunology. 2017;8:1351. pmid:29118754
  49. Villaseñor JA, González-Estrada E. A variance ratio test of fit for Gamma distributions. Statistics & Probability Letters. 2015;96:281–286.
  50. Zak DE, Andersen-Nissen E, Peterson ER, Sato A, Hamilton MK, Borgerding J, et al. Merck Ad5/HIV induces broad innate immune activation that predicts CD8+ T-cell responses but is attenuated by preexisting Ad5 immunity. Proceedings of the National Academy of Sciences of the United States of America. 2012;109(50):E3503–E3512. pmid:23151505
  51. McArdel S, Terhorst C, Sharpe A. Roles of CD48 in regulating immunity and tolerance. Clinical Immunology. 2016;164. pmid:26794910
  52. Donovan JA, Koretzky GA. CD45 and the immune response. Journal of the American Society of Nephrology. 1993;4(4):976–985. pmid:8286719
  53. Fisher RA, Corbet AS, Williams CB. The relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology. 1943; p. 42–58.
  54. Koziol J, Griffin N, Long F, Li Y, Latterich M, Schnitzer J. On protein abundance distributions in complex mixtures. Proteome Science. 2013;11:1–9. pmid:23360617
  55. Golzarri-Arroyo L, Dickinson SL, Jamshidi-Naeini Y, Zoh RS, Brown AW, Owora AH, et al. Evaluation of the type I error rate when using parametric bootstrap analysis of a cluster randomized controlled trial with binary outcomes and a small number of clusters. Computer Methods and Programs in Biomedicine. 2022;215:106654. pmid:35093646