Abstract
The first table in many articles reporting results of a randomized clinical trial compares baseline factors across arms. Results that appear inconsistent with chance trigger suspicion, and in one case, accusation and confirmation of data falsification. We confirm theoretically results of simulation analyses showing that inconsistency with chance is extremely difficult to prove in the absence of any information about correlations between baseline covariates. We offer a reasonable diagnostic to trigger further investigation.
Citation: Proschan MA, Shaw PA (2020) Diagnosing fraudulent baseline data in clinical trials. PLoS ONE 15(9): e0239121. https://doi.org/10.1371/journal.pone.0239121
Editor: Vance Berger, National Cancer Institute, UNITED STATES
Received: August 16, 2019; Accepted: August 31, 2020; Published: September 30, 2020
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Data Availability: All relevant data were provided in the paper or were generated by the code available at https://github.com/PamelaShaw/FraudRCT.
Funding: The authors received no specific funding for this work.
Competing interests: No authors have competing interests.
1 Introduction
In clinical trials, baseline variables are used to: 1) document that the trial recruited its target population, 2) summarize the natural history of the disease in the control arm, and 3) adjust the treatment effect for baseline differences in prognostic factors. Because baseline variables are measured before randomization, any differences between arms are attributable to chance. That is, the null hypothesis of no treatment effect should hold marginally for each baseline variable. For each continuous baseline variable compared using a continuous test statistic, the marginal distribution of its p-value should be uniform if the assumptions underlying the test (e.g., the data are normally distributed) are satisfied.
Clinical trialists sometimes go one step further and assume that p-values should behave like independent uniform deviates. Seeing appreciably more or fewer than the expected one in 20 statistically significant differences at α = 0.05 arouses suspicion. In one case uncovered by Bolland et al. [1], that suspicion led to the accusation and later verification of data falsification. A randomized study in dogs uncovered by Carlisle et al. [2] also appeared to show implausibly little variability of baseline covariates across arms. It and other publications by the same authors were retracted.
Betensky and Chiou [3] and Bland [4] use simulation to show that, in practice, p-values for baseline variables in clinical trials frequently do not behave like independent uniforms for several reasons: 1) the assumptions underlying a test may not be satisfied (e.g., skewed data do not fit the normality assumption), 2) the covariate may be binary, in which case even marginal uniformity of p-values does not hold exactly, and 3) many baseline covariates are correlated, so their p-values are also correlated. The authors conclude that interpretation of standard tests of uniformity applied to p-values is problematic. A natural question is whether a legitimate case for data falsification can be made based solely on p-values reported in a baseline table (i.e., with no information presented on correlations between baseline covariates).
We propose a statistical test based on the sum of squared z-scores of baseline covariates that can be used to determine whether further investigation of fraud is warranted. This article complements the simulation results in [3–4] with theoretical results showing the difficulty of actually proving fraud. The problem is that the distribution of the test statistic depends critically on the correlation between z-scores comparing arms on baseline covariates. Naively treating these z-scores as independent rejects the null hypothesis of no fraud too often if there is any true correlation. On the other hand, using the worst case correlation matrix leads to an extremely conservative test that virtually never rejects the null hypothesis. We characterize correlation matrices that ought to be conservative, but not so conservative as to be useless.
2 Test statistic 
2.1 Weighted combination of iid χ2(1)s
Let Pi be the one-tailed p-value for testing whether treatment observations tend to be larger than control observations for the ith continuous baseline covariate, i = 1, …, k. Assume that the corresponding test statistic has a continuous distribution and its underlying distributional assumptions are satisfied. In the absence of data falsification, the Pi are dependent uniform (0, 1) random variables. Betensky and Chiou [3] evaluate the impact of correlation on chi-squared and Kolmogorov-Smirnov statistics of uniformity of the Pi. We consider instead a test specifically targeting too little variability of actual results from expected results, an indication of possible data falsification.
We begin by transforming the dependent uniforms Pi to dependent standard normals Zi by Zi = Φ−1(1 − Pi), where Φ−1 is the inverse of the standard normal distribution function. Although the Zi need not have a multivariate normal distribution, they will be approximately multivariate normal if the test statistics are asymptotically sums of iid random variables and the sample size dwarfs the number of baseline covariates. Given that our intent is to show the difficulty of proving data falsification even under the best circumstances, we assume that the Zi are exactly multivariate normal with mean vector (0, 0, …, 0) and nonnegative-definite covariance matrix Σ whose diagonal elements are 1. That is, the Zi are correlated (unless Σ is the identity matrix) but marginally standard normal.
We will suspect data falsification if the Zi are too close to their expected value of 0; i.e., there is too much balance between arms. The sign of each Zi is not important, so we will be suspicious if Zi^2 is very small for multiple baseline variables. A natural way to combine the Zi^2 is through

L2 = Z1^2 + Z2^2 + ⋯ + Zk^2,

the squared length of the vector Z = (Z1, …, Zk)T of z-statistics for the k baseline covariates. If the Zi were iid, we could determine whether L2 is “too small” by comparing its value to the αth percentile of a chi-squared distribution with k degrees of freedom. But the Zi are dependent, so we must derive the distribution of L2 under a marginal standard normal assumption, but with arbitrary correlation matrix Σ. We derive this distribution using standard results in linear algebra (see, for example, [5]) as follows.
2.2 Distribution of L2
We can find an orthonormal basis γ1, …, γk of eigenvectors of Σ and form the orthogonal matrix Γ whose columns are γ1, …, γk. Then ΓT ΣΓ = D, a diagonal matrix whose diagonal entries λ1, …, λk are the eigenvalues of Σ, which are all nonnegative real numbers because Σ is a nonnegative-definite, symmetric matrix. Therefore, Σ = ΓDΓT.

Let Σ^{1/2} denote the matrix ΓD^{1/2}ΓT, where D^{1/2} is the diagonal matrix whose diagonal elements are the square roots of those of D. Then Z = (Z1, …, Zk)T has the same distribution as Σ^{1/2}W, where W is a vector of k iid standard normals, because cov(Σ^{1/2}W) = Σ^{1/2}(Σ^{1/2})T = Σ. It follows that

L2 = ZT Z =d (Σ^{1/2}W)T(Σ^{1/2}W) = WT ΣW = WT ΓDΓT W = UT DU,     (1)

where =d denotes equality in distribution and U = ΓT W. Also, cov(U) = ΓT Γ = I, so the distribution of U = (U1, …, Uk)T is that of k iid standard normals. Eq (1) implies that

L2 =d λ1U1^2 + λ2U2^2 + ⋯ + λkUk^2     (2)

is a weighted sum of squares of iid standard normals. The weights λi are the eigenvalues of Σ, which sum to k.
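The eigendecomposition argument can be checked numerically. The sketch below (assuming NumPy; the equicorrelated Σ is our illustrative choice, not from the paper) simulates Z ~ N(0, Σ) and compares the moments of L2 with those implied by the eigenvalues of Σ.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5

# An equicorrelated correlation matrix (rho = 0.6) as an illustrative Sigma.
rho = 0.6
Sigma = (1 - rho) * np.eye(k) + rho * np.ones((k, k))
lam = np.linalg.eigvalsh(Sigma)              # eigenvalues; they sum to k

# Simulate Z ~ N(0, Sigma) and form L2 = ||Z||^2.
Z = rng.multivariate_normal(np.zeros(k), Sigma, size=200_000)
L2 = np.sum(Z ** 2, axis=1)

mean_theory = lam.sum()                      # = k
var_theory = 2 * np.sum(lam ** 2)            # variance of a weighted chi-squared sum
```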
We interpret these eigenvalues in terms of variances of linear combinations of the Zi as follows. The vector a maximizing the variance of the linear combination aT Z, subject to ‖a‖ = 1, is γ(k), the eigenvector associated with the largest eigenvalue, λ(k), of Σ. We can view γ(k)T Z as the projection of Z onto the axis defined by γ(k) (Fig 1). The variance of γ(k)T Z is λ(k). The second largest eigenvalue λ(k−1) of Σ is the maximum variance of linear combinations aT Z such that 1) ‖a‖ = 1, and 2) a is orthogonal to γ(k). The variance of γ(k−1)T Z is λ(k−1). Continuing in this fashion, the smallest eigenvalue λ(1) of Σ is the maximum variance of linear combinations aT Z such that 1) ‖a‖ = 1, and 2) a is orthogonal to each of γ(k), …, γ(2). The variance of γ(1)T Z is λ(1). If there are a few very large eigenvalues and the rest are close to 0, the Zi are highly correlated. On the other hand, if the eigenvalues are all of similar size, the Zi are close to being uncorrelated.
Summary:
- The distribution of L2 under correlation matrix Σ for Z = (Z1, …, Zk)T is a weighted sum of iid chi-squared random variables with 1 degree of freedom.
- The weights are the eigenvalues of Σ, which sum to k.
- If all eigenvalues are 1, the Zi are iid and L2 has a chi-squared distribution with k degrees of freedom.
- If k − 1 eigenvalues are 0, the Zi are maximally correlated and L2 =d k times a chi-squared random variable with 1 degree of freedom.
2.3 Peakedness as a function of Σ
We have characterized the distribution of the test statistic L2 in the absence of fraud and under an assumed correlation matrix Σ. Next we investigate the peakedness of this distribution. If the distribution is peaked, then a small value of L2 suggests possible fraud, as small values would be quite unlikely otherwise. On the other hand, if the distribution of L2 is very dispersed, then small values of L2 might be common even under the null hypothesis of no fraud. It is critical, therefore, to determine limits on the peakedness of the null distribution to properly interpret evidence engendered by small values of L2. There are different ways to quantify peakedness, but perhaps the simplest is based on the variance.
2.3.1 Minimum and maximum variance.
We consider next how the variance of L2 depends on the eigenvalues of the correlation matrix Σ of Z. This will be important for evaluating how misleading it can be to treat p-values for baseline covariates as if they were independent. The mean and variance of L2 are

E(L2) = λ1E(U1^2) + ⋯ + λkE(Uk^2) = λ1 + ⋯ + λk = k,     (3)

V = var(L2) = λ1^2 var(U1^2) + ⋯ + λk^2 var(Uk^2) = 2(λ1^2 + ⋯ + λk^2).     (4)
Thus, the mean of L2 does not depend on the correlation matrix Σ for the baseline z-scores, but the variance of L2 does. For this reason, it is important to find the minimum and maximum values Vmin and Vmax of var(L2) and determine which correlation matrices yield those extreme values. Because the λi sum to k, their mean is λ̄ = 1. Write V as

V = 2{(λ1 − 1)^2 + ⋯ + (λk − 1)^2 + k} = 2{(k − 1)sλ^2 + k},     (5)

where sλ^2 = {(λ1 − λ̄)^2 + ⋯ + (λk − λ̄)^2}/(k − 1) is the sample variance of the λi. It is clear that V is minimized when λi = 1 for each i. In other words, the independence case, Σ = I, produces the smallest variance, Vmin = 2k, of L2. In a sense, this produces the greatest peakedness for the null distribution of L2. To see the serious implications of this fact, suppose that the observed value of L2 is small. If we wrongly assume that the z-scores for baseline covariates are independent, then we will be using the minimum possible variance of L2 to determine whether L2 is implausibly small. Consequently, the observed L2 value might be many assumed standard deviations away from its mean value of k. The resulting p-value will be tiny, and the level of evidence supporting data falsification will be greatly overstated if the true correlation matrix Σ is far from the identity matrix corresponding to independent z-scores.
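The algebraic identity relating V to the sample variance of the eigenvalues is easy to verify numerically. A short sketch (assuming NumPy; the random eigenvalue vector is our illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 8

# A random nonnegative eigenvalue vector rescaled so that it sums to k.
lam = rng.random(k)
lam *= k / lam.sum()

V = 2 * np.sum(lam ** 2)               # variance of L2 for these eigenvalues
s2 = np.var(lam, ddof=1)               # sample variance of the lambdas
V_identity = 2 * ((k - 1) * s2 + k)    # the same quantity via the identity
V_min = 2 * k                          # attained at lam = (1, ..., 1)
```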
To avoid inflating the probability of erroneously suspecting data falsification, we could assume the correlation matrix Σ yielding the largest variance of L2. It can be shown that the Σ maximizing the variance of L2 assigns value k to one of the λi and 0 to the remaining λi. In that case, var(L2) = 2k^2. In summary, the smallest and largest values of var(L2) are:

Vmin = 2k, attained when (λ1, …, λk) = (1, …, 1),     (6)

Vmax = 2k^2, attained when (λ1, …, λk) = (0, …, 0, k).     (7)
We will see that using Vmax is almost always too conservative to be useful. Therefore, we want to select conservative values of V that are not so conservative that they are useless. To see how to do this, notice that the vectors (1, …, 1) and (0, …, 0, k) in (6) and (7) are at opposite ends of a certain spectrum. Imagine two different communities, each with k luxury cars divided among k people. In one community, everyone has 1 luxury car, and in the other community, one person has all k luxury cars. Vectors (1, …, 1) and (0, …, 0, k) correspond to these least and most polarized distributions. This concept can be formalized as follows. A vector y = (y1, …, yk) is said to majorize another vector x = (x1, …, xk), written y ≻ x, if the ordered values x(1) ≤ … ≤ x(k) and y(1) ≤ … ≤ y(k) satisfy

y(j) + y(j+1) + ⋯ + y(k) ≥ x(j) + x(j+1) + ⋯ + x(k)

for j = 1, …, k, with equality when j = 1. In other words, y is more polarized (the rich are richer and the poor are poorer) than x. The smallest and largest vectors, in terms of majorization, with sum k are (1, …, 1) and (0, …, 0, k), respectively. The generalization of the ordering of variances in (6) and (7) is as follows.
Theorem 2.1. V = var(L2) increases as the vector λ = (λ1, …, λk) of eigenvalues of Σ becomes more polarized (i.e., increases in the majorization ordering).
The proof follows from D.2 on page 101 of [6].
Therefore, computing the null distribution of L2 assuming that λ is one of the larger (although not the largest possible) vectors in the majorization ordering should be conservative, but not prohibitively so.
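Theorem 2.1 can be illustrated with a small numerical example. The chain of eigenvalue vectors below (our illustrative choice, each summing to k = 4) increases in the majorization ordering, and the corresponding variances of L2 increase along the chain:

```python
# A chain of eigenvalue vectors, each summing to k = 4, ordered from least
# to most polarized in the majorization ordering.
chain = [
    (1.0, 1.0, 1.0, 1.0),
    (0.5, 0.5, 1.5, 1.5),
    (0.0, 1.0, 1.5, 1.5),
    (0.0, 0.0, 2.0, 2.0),
    (0.0, 0.0, 0.0, 4.0),
]

# var(L2) = 2 * sum of squared eigenvalues for each vector in the chain.
V = [2 * sum(l * l for l in lam) for lam in chain]
```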
Our treatment in this section implicitly assumed that the distribution of L2 is approximately normal for large k, which is reasonable for many λ because L2 is a linear combination of iid chi-squared random variables with 1 degree of freedom. However, for certain extreme vectors λ such as (0, …, 0, k), L2 is not approximately normal. We would like to show, without invoking asymptotic normality, that the distribution of L2 has fatter tails as λ becomes more polarized (increases in the majorization ordering). We defer discussion of this technical and difficult topic to the appendix.
2.4 Simple Σs allowing exact calculation
For any given critical value C, we can compute P(L2 ≤ C) analytically without using a normal approximation for certain classes of correlation matrices Σ. Equivalently, we can think in terms of the eigenvalues of Σ, which, as we have seen, can be interpreted in terms of variances of projections of Z onto directions defined by its eigenvectors. Suppose the total variance is spread equally among a small number of directions. Then all but a few eigenvalues are 0, and the remainder all have the same value. For instance, with only 1 direction, all but one eigenvalue is 0, and the nonzero eigenvalue is k. This is the most extreme correlation matrix, in which all the Zi are identical. More generally, if all variability is focused equally in j directions, then each of the j nonzero eigenvalues has value k/j. In that case, expression (2) becomes k(Xj/j), where Xj has a chi-squared distribution with j degrees of freedom. The probability of a type 1 error is

P(L2 ≤ C) = P{k(Xj/j) ≤ C} = Gj(C/k),     (8)

where Gj is the distribution function of 1/j times a chi-squared random variable with j degrees of freedom. The appendix shows that Gj has fatter tails as j decreases. Therefore, the distribution of L2 has fatter tails if the total variability of Z is spread equally over a smaller number of directions.
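The type 1 error inflation from Eq (8) is straightforward to compute. A sketch (assuming SciPy; function and variable names are ours) that reproduces the “1 direction” entries of Table 1:

```python
from scipy.stats import chi2

def true_alpha_j_directions(k, j, alpha=0.05):
    """Actual type 1 error of the naive (independence-based) test when all
    variability of the z-scores is split equally over j directions (Eq (8))."""
    C = chi2.ppf(alpha, df=k)          # critical value assuming L2 ~ chi-squared(k)
    # L2 =d k * X_j / j, so P(L2 <= C) = P(X_j <= j * C / k) = G_j(C / k).
    return chi2.cdf(j * C / k, df=j)

a1_k10 = true_alpha_j_directions(10, 1)     # about 0.470, as in Table 1
a1_k100 = true_alpha_j_directions(100, 1)   # about 0.623, as in Table 1
```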
Another relatively simple class of correlation matrices corresponds to the same correlation ρ for all pairs of z-scores. It can be shown that all but one eigenvalue is 1 − ρ, and the remaining eigenvalue is 1 + (k − 1)ρ. In that case, expression (2) can be written as

L2 =d (1 − ρ)Xk−1 + {1 + (k − 1)ρ}X1,     (9)

where Xk−1 and X1 are independent chi-squared random variables with k − 1 and 1 degrees of freedom, respectively. Let Hj and hj denote the chi-squared distribution and density functions with j degrees of freedom, j = 1, …, k. From (9), the type 1 error rate is

P(L2 ≤ C) = ∫0^{C/{1+(k−1)ρ}} Hk−1[{C − (1 + (k − 1)ρ)x}/(1 − ρ)] h1(x) dx.     (10)
Table 1 uses Eqs (8) and (10) to compute the inflation of the type 1 error rate when one erroneously assumes that the z-scores comparing baseline covariates across arms are independent, when the true Σ is either equicorrelated or has all variability focused in a few directions. For the rows labeled by directions, the true Σ corresponds to total variability of z-scores divided equally among 1, 2, or 3 directions. For rows labeled by ρ, the z-scores for baseline comparisons all have the same pairwise correlation ρ. For example, the “1 direction” row shows that if critical value C is computed assuming the Zi are independent, but the truth is that all variability of the Zi is focused in only 1 direction, the actual type 1 error rate is 47.0 percent or 62.3 percent if k = 10 or k = 100, respectively. On the other hand, if the true Σ has the same pairwise correlation ρ = 0.50 for all pairs, the true type 1 error rate is 15.1 percent or 54.1 percent for k = 10 or k = 100, respectively. The probability of falsely becoming suspicious increases as the Zi become more correlated.
The true Σ either has all variability focused equally in 1, 2, or 3 directions, or each off-diagonal element has value ρ.
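The equicorrelated entries of Table 1 can be computed by one-dimensional numerical integration of Eq (10). A sketch, assuming SciPy; the change of variables is our implementation detail, not from the paper:

```python
import numpy as np
from scipy.stats import chi2, norm
from scipy.integrate import quad

def true_alpha_equicorrelated(k, rho, alpha=0.05):
    """Actual type 1 error of the naive (independence-based) test when the
    z-scores are equicorrelated with correlation rho (Eq (10))."""
    C = chi2.ppf(alpha, df=k)              # critical value assuming chi-squared(k)
    a, b = 1.0 - rho, 1.0 + (k - 1) * rho  # weights in Eq (9)
    # Substituting x = t^2 turns h_1(x) dx into 2 * phi(t) dt, removing the
    # square-root singularity of the chi-squared(1) density at 0.
    integrand = lambda t: chi2.cdf((C - b * t * t) / a, df=k - 1) * 2 * norm.pdf(t)
    val, _ = quad(integrand, 0.0, np.sqrt(C / b))
    return val

a_half = true_alpha_equicorrelated(10, 0.5)   # roughly 0.15, cf. Table 1
```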
On the other hand, if one assumes perfect correlation and sets the critical value using Vmax, the test becomes extraordinarily conservative. Table 2 shows that when k = 25, the actual type 1 error rate of the Vmax test if the z-scores have common correlation ρ = 0.75 is 7.9 × 10−20 instead of 0.05. In other words, if we want to protect against the most drastic correlation matrix Σij = 1 for all i and j, the test becomes incredibly conservative even if the true correlation matrix still has unrealistically high correlation. Likewise, even if the true correlation matrix has all variance focused in only 3 directions, the Vmax test has ultraconservative type 1 error rate 0.0003. Remember that the L2 test is being used as a diagnostic to see if further investigation is warranted. Further investigation would provide an estimate of Σ that could be used to compute the true distribution of L2, resulting in a much more accurate test. Thus, a reasonable option for the diagnostic test is to make a conservative assumption, such as that all correlations are 0.75 or all variability is focused equally in only 3 directions. This is almost guaranteed to overstate the degree of correlation in a real clinical trial.
The true Σ either has all variability focused equally in 1, 2, or 3 directions, or each off-diagonal element has value ρ.
2.4.1 Example.
Fujii et al. [7] randomized 24 dogs to one of three doses of midazolam to evaluate the effect of midazolam on contractility of the diaphragm. Table 3 shows baseline means and standard deviations in each of the three arms for each of 8 continuous variables. There appears to be little variability across arms. We apply our test to dose groups 1 and 2. For each variable, we compute a one-tailed p-value Pi using an unpaired t-statistic with alternative hypothesis that group 2 has a higher mean than group 1. Then we convert each p-value to a z-score by Zi = Φ−1(1 − Pi) and compute the test statistic L2 = Z1^2 + ⋯ + Z8^2. We find that L2 = 0.2556. If we erroneously assume independence of baseline covariates, and therefore of z-statistics, the p-value is P(L2 ≤ 0.2556) computed from a chi-squared distribution with 8 degrees of freedom, approximately 1.0 × 10−5. This overstates the strength of evidence for fraud. On the other hand, assuming perfect correlation between z-scores for baseline covariates almost certainly understates the evidence for fraud. That p-value, P{8X1 ≤ 0.2556}, is approximately 0.14. We feel confident that a real randomized experiment would not result in all correlations being 0.9. Therefore, making the assumption of a common ρ of 0.9 should still be highly conservative. The p-value using (10) with ρ = 0.9 is 0.0055. In other words, even under what we feel is an unrealistically large degree of correlation, namely 0.9 for all pairs, the evidence for fraud certainly seems sufficient to warrant further investigation.
Shown are the eight continuous baseline covariates.
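The three p-values for this example can be recomputed directly from the observed L2 = 0.2556. A sketch, assuming SciPy; the substitution in the integral is our implementation detail:

```python
import numpy as np
from scipy.stats import chi2, norm
from scipy.integrate import quad

L2_obs, k = 0.2556, 8     # observed statistic for dose groups 1 and 2

# Independence assumption: L2 ~ chi-squared(8) (anticonservative).
p_indep = chi2.cdf(L2_obs, df=k)

# Perfect correlation: L2 =d 8 * chi-squared(1) (ultraconservative).
p_perfect = chi2.cdf(L2_obs / k, df=1)

# Common correlation rho = 0.9: Eq (10) with C equal to the observed L2.
rho = 0.9
a, b = 1 - rho, 1 + (k - 1) * rho
# Substituting x = t^2 turns h_1(x) dx into 2 * phi(t) dt and removes the
# square-root singularity at 0.
integrand = lambda t: chi2.cdf((L2_obs - b * t * t) / a, df=k - 1) * 2 * norm.pdf(t)
p_rho, _ = quad(integrand, 0.0, np.sqrt(L2_obs / b))
```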
3 Number of significant z-scores
We have focused on L2 as a test statistic for detecting cheating, but other goodness of fit statistics such as those considered by Betensky and Chiou [3] and Bland [4] have similar behavior. A particularly simple statistic is the number J of statistically significant z-scores. We might be suspicious if the number of continuous baseline covariates is large and none result in a statistically significant difference between arms. Suppose these z-scores are equicorrelated with nonnegative correlation ρ. Then Z1, …, Zk have the same distribution as X + ϵ1, …, X + ϵk, where X and the ϵi are mutually independent normal random variables with zero means and variances var(X) = ρ and var(ϵi) = 1 − ρ. Let zα satisfy 1 − Φ(zα) = α. Given X = x, the indicators I(Zi > zα) are iid Bernoulli(p) random variables, where

p = p(x) = P(x + ϵi > zα) = 1 − Φ{(zα − x)/(1 − ρ)^{1/2}}.     (11)
The specific distribution function F(p) for the random variable P = p(X) is unimportant. The important fact is that P has mean α, as the following calculation shows:

E(P) = E{P(Z1 > zα | X)} = P(Z1 > zα) = α.
Accordingly, the distribution of J is a mixed binomial:

P(J = j) = ∫0^1 C(k, j) p^j (1 − p)^{k−j} f(p) dp,  j = 0, 1, …, k,     (12)

where C(k, j) is the binomial coefficient and f(p) is the density function corresponding to distribution function F(p). Note that ∫pf(p)dp = α.
Under independence, the number of significant Zi has an ordinary binomial distribution with parameters k and α. Let J and J′ denote random variables from the mixed binomial (12) and the unmixed binomial bin(k, α), respectively. To see that extreme results are more likely for J than for J′, note that by Jensen’s inequality, for k > 1,

P(J = 0) = E{(1 − P)^k} ≥ (1 − EP)^k = (1 − α)^k = P(J′ = 0).     (13)
More generally, Shaked [8] has shown that a mixed binomial has fatter tails than an ordinary binomial with the same mean. This explains the common phenomenon of observing what appear to be too few or too many statistically significant baseline differences in clinical trials. Therefore, if one falsely assumes that the Zi are independent, the chance of falsely suspecting fraud will be inflated.
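The mixing effect can be seen concretely in a small simulation (a sketch assuming NumPy; k, ρ, and the number of replications are our illustrative choices). It compares P(J = 0) under equicorrelation with the ordinary binomial value (1 − α)^k:

```python
import numpy as np

rng = np.random.default_rng(1)
k, rho, alpha = 10, 0.5, 0.05
z_alpha = 1.6448536269514722          # Phi^{-1}(1 - 0.05)

# Equicorrelated z-scores: Z_i = sqrt(rho) * X + sqrt(1 - rho) * eps_i.
n = 200_000
X = rng.standard_normal((n, 1))
eps = rng.standard_normal((n, k))
Z = np.sqrt(rho) * X + np.sqrt(1 - rho) * eps

J = np.sum(Z > z_alpha, axis=1)       # number of significant z-scores
p_none = np.mean(J == 0)              # estimated P(J = 0) under equicorrelation
p_none_indep = (1 - alpha) ** k       # ordinary binomial value under independence
```

Even though E(J) is the same in both cases, the event J = 0 is noticeably more likely under correlation, consistent with Eq (13).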
4 Discussion
We have proposed a diagnostic for detecting suspiciously low between-arm variability in baseline covariates in clinical trials. The test statistic L2, the squared length of the vector Z of z-scores measuring balance in baseline covariates, has a distribution that depends on the correlation matrix Σ of Z only through its eigenvalues. We confirm analytically for L2 what is demonstrated through simulation in [3–4] for similar goodness of fit tests applied to baseline covariates in clinical trials when no information about correlations is available. Assuming independence between covariates (and, therefore, between the Zi) results in an unacceptably high probability of falsely suspecting fraud. In fact, the distribution of L2 has thinnest tails when one falsely assumes that z-scores for baseline covariates are independent, and fattest tails when one assumes the most extreme possible correlation. We draw two conclusions: 1) one should never conclude fraud solely because L2 is unusually small under the independence assumption and 2) to feel confident that the Zi are too small to have occurred by chance, L2 must be unusually small even assuming unrealistically high correlation. Assuming perfect correlation produces a test that virtually never triggers further investigation. Therefore, we suggest using a practical upper bound on the correlation matrix such as all correlations equal to 0.75 or all variability focused equally in only 3 directions.
The final verdict will almost always be based on the totality of evidence. The case made by Bolland et al. [1] was based on numerous trials by the same authors that contained warning signs such as suspiciously fast enrollment and few deaths and dropouts despite recruiting older patients with substantial comorbidity. Our test statistic is a useful diagnostic that can be used in conjunction with other evidence to bolster the case for data falsification.
A Appendix: Fat-tailed distributions
The argument in Section 2.3 that L2 should have fatter tails as λ becomes more polarized was based on approximate normality of L2 for large k. But if λ = (0, …, 0, k), L2 has the distribution of k times a chi-squared variate with 1 degree of freedom, which is not normal. This section addresses whether L2 has fatter tails as λ becomes more polarized even without the approximate normality assumption.
The first problem lies in defining “fatter tails.” This is easy for normal distributions or other distributions symmetric about their mean: the same mean and larger variance implies fatter tails. For asymmetric distributions such as those of linear combinations of chi-squared random variables, we must use a different definition. One possibility is the following.
Definition A.1. Distribution function F2 has fatter tails than distribution function F1, denoted by F1 ≤t F2 or F2 ≥t F1, if there exists a number x* such that F2(x) ≥ F1(x) for x ≤ x*, and F2(x) ≤ F1(x) for x > x*.

In other words, the left tail F2(x) is at least as large as the left tail F1(x) for all x ≤ x*, and the right tail 1 − F2(x) is at least as large as the right tail 1 − F1(x) for all x > x*. Another way of expressing this fact is that if F1 − F2 has any sign changes, then there is exactly one, and the sequence of signs is −, + as x increases.
Bock et al. [9] conjectured, but did not prove, the following.

Conjecture A.1. (Bock et al. [9]) If U1, …, Uk are iid standard normals and λ = (λ1, …, λk) majorizes λ′ = (λ1′, …, λk′), then

λ1′U1^2 + ⋯ + λk′Uk^2 ≤t λ1U1^2 + ⋯ + λkUk^2.
Theorem 1 of Roosta-Khorasani and Székely [10] is closely related, but it shows that large values are more likely for λ1U1^2 + ⋯ + λkUk^2 than for λ1′U1^2 + ⋯ + λk′Uk^2. We are interested in the opposite tail, namely that very small values are also more likely for λ1U1^2 + ⋯ + λkUk^2 than for λ1′U1^2 + ⋯ + λk′Uk^2.
Although we have been unable to prove Conjecture A.1 in complete generality, empirical evidence and heuristic arguments support its veracity. For example, we investigated the k = 2 case using an extensive grid of possible values of λ and λ′ with λ ≻ λ′, computing the distributions of λ1U1^2 + λ2U2^2 and λ1′U1^2 + λ2′U2^2 through numerical integration. For k = 3, we repeatedly generated random vectors λ and λ′ from a simplex in a way that ensured λ ≻ λ′, and used simulation to compute the distributions of the corresponding weighted sums. Further details are available from the authors.
One special case of Conjecture A.1 is when all of the variability of Z is concentrated equally in each of j directions. In that case, the distribution of L2 is that of (k/j)(δ1U1^2 + ⋯ + δkUk^2), where (δ1, …, δk) contains j ones and k − j zeroes. Since δ1U1^2 + ⋯ + δkUk^2 has a chi-squared distribution with j degrees of freedom, L2 is distributed as (k/j)χj^2, where χj^2 denotes a chi-squared random variable with j degrees of freedom. Thus, Conjecture A.1 says that (k/j)χj^2 has fatter tails as j diminishes. Although we are unable to prove Conjecture A.1 in general, we prove this special case at the end of the appendix.
Theorem A.1. (k/j)χj^2 ≤t (k/i)χi^2 for integers i, j, i < j.
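The ordering in Theorem A.1 is easy to check numerically at particular points. The sketch below (assuming SciPy; the evaluation points x = 1 and x = 30 are our illustrative choices for k = 10) confirms that fewer directions give fatter tails on both sides:

```python
from scipy.stats import chi2

k = 10

def F(j, x):
    """Distribution function of (k/j) * chi-squared(j)."""
    return chi2.cdf(j * x / k, df=j)

# Fewer directions (smaller j) should give fatter tails on both sides:
left = [F(j, 1.0) for j in (1, 2, 3)]          # left-tail probabilities at x = 1
right = [1 - F(j, 30.0) for j in (1, 2, 3)]    # right-tail probabilities at x = 30
```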
Another important special case of Conjecture A.1 is when all pairs (Zi, Zj) have the same correlation ρ ≥ 0. In that case, the eigenvalue vector is (1 − ρ, …, 1 − ρ, 1 + (k − 1)ρ), which increases in the majorization ordering as ρ increases. The conjecture implies that L2 has fatter tails as ρ increases. Eq (9) shows that L2 =d k{(1 − τ)Xk−1/(k − 1) + τX1}, where τ = {1 + (k − 1)ρ}/k. Therefore, Conjecture A.1 applied to the special case of equicorrelated Zi is equivalent to k{(1 − τ)Xk−1/(k − 1) + τX1} having fatter tails as τ increases from 1/k to 1.
It should be noted that one could define fatness of tails of a distribution function in ways other than Definition A.1. For example, suppose that E{ψ(X)} ≤ E{ψ(Y)} for every convex function ψ such that the expectations exist. Then not only is the variance of X no greater than that of Y, but the same is true for the fourth central moment, the sixth central moment, etc. This is one way of formulating the idea that the distribution function of Y has fatter tails than that of X.
It is very easy to prove, using the alternative definition above, that assuming the Zi are independent produces the thinnest tailed distribution. This is demonstrated in Theorem A.2.
Theorem A.2. Let U1, …, Uk be iid standard normals and L2 = ∑λiUi^2, where ∑λi = k. Then for any convex function ψ such that E{ψ(kU1^2)} is finite, E{ψ(L2)} is largest when λ = (0, …, 0, k).
This follows from the definition of a convex function: for any u1, …, uk and nonnegative w1, …, wk with ∑wi = 1, ψ(∑wi ui) ≤ ∑wi ψ(ui). Set wi = λi/k, i = 1, …, k. Then

E{ψ(∑λiUi^2)} = E[ψ{∑(λi/k)(kUi^2)}] ≤ ∑(λi/k)E{ψ(kUi^2)} = E{ψ(kU1^2)},     (14)

which is E{ψ(L2)} when λ = (0, …, 0, k), completing the proof.
Proof of Theorem A.1
Let fj(x) be the density of (k/j)(δ1U1^2 + ⋯ + δkUk^2), where (δ1, …, δk) has k − j zeroes followed by j ones. The distribution of δ1U1^2 + ⋯ + δkUk^2 is chi-squared with j degrees of freedom. It follows that

fj−1(x)/fj(x) = cj exp{x/(2k)}/x^{1/2} ≡ h(x)     (15)

for a positive constant cj depending only on j and k. The derivative of g(x) = exp{x/(2k)}/x^{1/2} is negative for 0 < x < k and positive for x > k. Thus, g(x) decreases for x < k and increases for x > k. Moreover, g(x) has a limit of +∞ as either x ↓ 0 or x → ∞. These facts also hold for h(x), which is just a positive constant times g(x). It follows that the number m1 of x such that h(x) = 1 is either 0, 1, or 2. But m1 cannot be 0 or 1 because that would imply that fj−1(x) > fj(x) for all x or for all but one x, contradicting the fact that fj−1(x) and fj(x) both integrate to 1. Therefore, h(x) = 1 for exactly two values, x = x1 and x = x2, x1 < x2.
Let Fj−1(x) and Fj(x) be the distribution functions corresponding to the densities fj−1(x) and fj(x). Because fj−1(x) > fj(x) for x < x1 and x > x2, Fj−1(x) > Fj(x) for x < x1 and Fj−1(x) < Fj(x) for x > x2. Because Fj−1 and Fj are continuous, there must be a point x* with x1 < x* < x2 for which Fj−1(x*) = Fj(x*). But fj−1(x) < fj(x) for x ∈ (x1, x2), so if Fj−1(x*) = Fj(x*), then Fj−1(x) < Fj(x) for all x ∈ (x*, x2). We have already established that Fj−1(x) < Fj(x) for x > x2, so Fj−1(x) < Fj(x) for x > x*. Putting these facts together, we have established that

Fj−1(x) ≥ Fj(x) for x ≤ x* and Fj−1(x) ≤ Fj(x) for x > x*;

that is, Fj ≤t Fj−1. This completes the proof.
References
- 1. Bolland MJ, Avenell A, Gamble GD, Grey A. Systematic review and statistical analysis of the integrity of 33 randomized controlled trials. Neurology. 2016;87:2391–2402. pmid:27920281
- 2. Carlisle JB, Dexter F, Pandit JJ, Shafer SL, Yentis SM. Calculating the probability of random sampling for continuous variables in submitted or published randomised controlled trials. Anaesthesia. 2015;70:848–858. pmid:26032950
- 3. Betensky RA, Chiou SH. Correlation among baseline variables yields non-uniformity of p-values. PLoS ONE. 2017;12(9):e0184531. pmid:28886190
- 4. Bland M. Do baseline p-values follow a uniform distribution in randomised trials? PLoS ONE. 2013;8(10): e76010. pmid:24098419
- 5. Noble B, Daniel JW. Applied Linear Algebra, 3rd ed. New York: Pearson; 1987.
- 6. Marshall AW, Olkin I, Arnold BC. Inequalities: Theory of Majorization and Its Applications, 2nd ed. New York: Springer; 2011.
- 7. Fujii Y, Hoshi T, Uemura A, Toyooka H. Dose-response characteristics of midazolam for reducing diaphragmatic contractility. Anesthesia and Analgesia. 2001;92:1590–1593 (later retracted). pmid:11375852
- 8. Shaked M. On mixtures from exponential families. Journal of the Royal Statistical Society B. 1980;42:192–198.
- 9. Bock ME, Diaconis P, Huffer FW, Perlman MD. Inequalities for linear combinations of gamma random variables. The Canadian Journal of Statistics. 1987;15:387–395.
- 10. Roosta-Khorasani F, Székely G. Schur properties of convolutions of gamma random variables. Metrika. 2015;78:997–1014.