Abstract
The first table in many articles reporting results of a randomized clinical trial compares baseline factors across arms. Results that appear inconsistent with chance trigger suspicion, and in one case, accusation and confirmation of data falsification. We confirm theoretically results of simulation analyses showing that inconsistency with chance is extremely difficult to prove in the absence of any information about correlations between baseline covariates. We offer a reasonable diagnostic to trigger further investigation.
Citation: Proschan MA, Shaw PA (2020) Diagnosing fraudulent baseline data in clinical trials. PLoS ONE 15(9): e0239121. https://doi.org/10.1371/journal.pone.0239121
Editor: Vance Berger, National Cancer Institute, UNITED STATES
Received: August 16, 2019; Accepted: August 31, 2020; Published: September 30, 2020
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Data Availability: All relevant data were provided in the paper or were generated by the code available at https://github.com/PamelaShaw/FraudRCT.
Funding: The authors received no specific funding for this work.
Competing interests: No authors have competing interests.
1 Introduction
In clinical trials, baseline variables are used to: 1) document that the trial recruited its target population, 2) summarize the natural history of the disease in the control arm, and 3) adjust the treatment effect for baseline differences in prognostic factors. Because baseline variables are measured before randomization, any differences between arms are attributable to chance. That is, the null hypothesis of no treatment effect should hold marginally for each baseline variable. For each continuous baseline variable compared using a continuous test statistic, the marginal distribution of its p-value should be uniform if the assumptions underlying the test (e.g., the data are normally distributed) are satisfied.
Clinical trialists sometimes go one step further and assume that p-values should behave like independent uniform deviates. Seeing appreciably more or fewer than the expected one in 20 statistically significant differences at α = 0.05 arouses suspicion. In one case uncovered by Bolland et al. [1], that suspicion led to the accusation and later verification of data falsification. A randomized study in dogs uncovered by Carlisle et al. [2] also appeared to show implausibly little variability of baseline covariates across arms. It and other publications by the same authors were retracted.
Betensky and Chiou [3] and Bland [4] use simulation to show that, in practice, p-values for baseline variables in clinical trials frequently do not behave like independent uniforms for several reasons: 1) the assumptions underlying a test may not be satisfied (e.g., skewed data do not fit the normality assumption), 2) the covariate may be binary, in which case even marginal uniformity of p-values does not hold exactly, and 3) many baseline covariates are correlated, so their p-values are also correlated. The authors conclude that interpretation of standard tests of uniformity applied to p-values is problematic. A natural question is whether a legitimate case for data falsification can be made based solely on p-values reported in a baseline table (i.e., with no information presented on correlations between baseline covariates).
We propose a statistical test based on the sum of squared z-scores of baseline covariates that can be used to determine whether further investigation of fraud is warranted. This article complements the simulation results in [3–4] with theoretical results showing the difficulty of actually proving fraud. The problem is that the distribution of the test statistic depends critically on the correlation between z-scores comparing arms on baseline covariates. Naively treating these z-scores as independent rejects the null hypothesis of no fraud too often if there is any true correlation. On the other hand, using the worst case correlation matrix leads to an extremely conservative test that virtually never rejects the null hypothesis. We characterize correlation matrices that ought to be conservative, but not so conservative as to be useless.
2 Test statistic 
2.1 Weighted combination of iid χ2(1)s
Let Pi be the one-tailed p-value for testing whether treatment observations tend to be larger than control observations for the ith continuous baseline covariate, i = 1, …, k. Assume that the corresponding test statistic has a continuous distribution and its underlying distributional assumptions are satisfied. In the absence of data falsification, the Pi are dependent uniform (0, 1) random variables. Betensky and Chiou [3] evaluate the impact of correlation on chi-squared and Kolmogorov-Smirnov statistics of uniformity of the Pi. We consider instead a test specifically targeting too little variability of actual results from expected results, an indication of possible data falsification.
We begin by transforming the dependent uniforms Pi to dependent standard normals Zi by Zi = Φ−1(1 − Pi), where Φ−1 is the inverse of the standard normal distribution function. Although the Zi need not have a multivariate normal distribution, they will be approximately multivariate normal if the test statistics are asymptotically sums of iid random variables and the sample size dwarfs the number of baseline covariates. Given that our intent is to show the difficulty of proving data falsification even under the best circumstances, we assume that the Zi are exactly multivariate normal with mean vector (0, 0, …, 0) and nonnegative-definite covariance matrix Σ whose diagonal elements are 1. That is, the Zi are correlated (unless Σ is the identity matrix) but marginally standard normal.
We will suspect data falsification if the Zi are too close to their expected value of 0; i.e., there is too much balance between arms. The sign of each Zi is not important, so we will be suspicious if Zi^2 is very small for multiple baseline variables. A natural way to combine the Zi^2 is through

L2 = Z1^2 + Z2^2 + ⋯ + Zk^2,

the squared length of the vector Z = (Z1, …, Zk)T of z-statistics for the k baseline covariates. If the Zi were iid, we could determine whether L2 is “too small” by comparing its value to the αth percentile of a chi-squared distribution with k degrees of freedom. But the Zi are dependent, so we must derive the distribution of L2 under a marginal standard normal assumption, but with arbitrary correlation matrix Σ. We derive this distribution using standard results in linear algebra (see, for example, [5]) as follows.
2.2 Distribution of L2
We can find an orthonormal basis γ1, …, γk of eigenvectors of Σ and form the orthogonal matrix Γ whose columns are γ1, …, γk. Then ΓT ΣΓ = D, a diagonal matrix whose diagonal entries λ1, …, λk are the eigenvalues of Σ, which are all nonnegative real numbers because Σ is a nonnegative-definite, symmetric matrix. Therefore, Σ = ΓDΓT.

Let Σ^{1/2} denote the matrix ΓD^{1/2}ΓT, where D^{1/2} is the diagonal matrix whose diagonal elements are the square roots of those of D. Then Z = (Z1, …, Zk)T has the same distribution as Σ^{1/2}W, where W is a vector of k iid standard normals, because cov(Σ^{1/2}W) = Σ^{1/2}(Σ^{1/2})T = Σ. It follows that

L2 = ZT Z =d (Σ^{1/2}W)T(Σ^{1/2}W) = WT ΣW = WT ΓDΓT W = UT DU,     (1)

where =d denotes equality in distribution and U = ΓT W. Also, cov(U) = ΓT Γ = I, so the distribution of U = (U1, …, Uk)T is that of k iid standard normals. Eq (1) implies that

L2 =d λ1U1^2 + λ2U2^2 + ⋯ + λkUk^2     (2)

is a weighted sum of squares of iid standard normals. The weights λi are the eigenvalues of Σ, which sum to k.
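The eigendecomposition argument can be checked numerically. The sketch below (assuming NumPy; the equicorrelated Σ is our illustrative choice, not from the paper) simulates Z ~ N(0, Σ) and compares the moments of L2 with those implied by the eigenvalues of Σ.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5

# An equicorrelated correlation matrix (rho = 0.6) as an illustrative Sigma.
rho = 0.6
Sigma = (1 - rho) * np.eye(k) + rho * np.ones((k, k))
lam = np.linalg.eigvalsh(Sigma)              # eigenvalues; they sum to k

# Simulate Z ~ N(0, Sigma) and form L2 = ||Z||^2.
Z = rng.multivariate_normal(np.zeros(k), Sigma, size=200_000)
L2 = np.sum(Z ** 2, axis=1)

mean_theory = lam.sum()                      # = k
var_theory = 2 * np.sum(lam ** 2)            # variance of a weighted chi-squared sum
```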
We interpret these eigenvalues in terms of variances of linear combinations of the Zi as follows. The vector a maximizing the variance of the linear combination aT Z, subject to ‖a‖ = 1, is γ(k), the eigenvector associated with the largest eigenvalue, λ(k), of Σ. We can view γ(k)T Z as the projection of Z onto the axis defined by γ(k) (Fig 1). The variance of γ(k)T Z is λ(k). The second largest eigenvalue λ(k−1) of Σ is the maximum variance of linear combinations aT Z such that 1) ‖a‖ = 1, and 2) a is orthogonal to γ(k). The variance of γ(k−1)T Z is λ(k−1). Continuing in this fashion, the smallest eigenvalue λ(1) of Σ is the maximum variance of linear combinations aT Z such that 1) ‖a‖ = 1, and 2) a is orthogonal to each of γ(k), …, γ(2). The variance of γ(1)T Z is λ(1). If there are a few very large eigenvalues and the rest are close to 0, the Zi are highly correlated. On the other hand, if the eigenvalues are all of similar size, the Zi are close to being uncorrelated.
Summary:
- The distribution of L2 under correlation matrix Σ for Z = (Z1, …, Zk)T is a weighted sum of iid chi-squared random variables with 1 degree of freedom.
- The weights are the eigenvalues of Σ, which sum to k.
- If all eigenvalues are 1, the Zi are iid and L2 has a chi-squared distribution with k degrees of freedom.
- If k − 1 eigenvalues are 0, the Zi are maximally correlated and L2 =d k times a chi-squared random variable with 1 degree of freedom.
2.3 Peakedness as a function of Σ
We have characterized the distribution of the test statistic L2 in the absence of fraud and under an assumed correlation matrix Σ. Next we investigate the peakedness of this distribution. If the distribution is peaked, then a small value of L2 suggests possible fraud, as small values would be quite unlikely otherwise. On the other hand, if the distribution of L2 is very dispersed, then small values of L2 might be common even under the null hypothesis of no fraud. It is critical, therefore, to determine limits on the peakedness of the null distribution to properly interpret evidence engendered by small values of L2. There are different ways to quantify peakedness, but perhaps the simplest is based on the variance.
2.3.1 Minimum and maximum variance.
We consider next how the variance of L2 depends on the eigenvalues of the correlation matrix Σ of Z. This will be important for evaluating how misleading it can be to treat p-values for baseline covariates as if they were independent. The mean and variance of L2 are

E(L2) = λ1E(U1^2) + ⋯ + λkE(Uk^2) = λ1 + ⋯ + λk = k,     (3)

V = var(L2) = λ1^2 var(U1^2) + ⋯ + λk^2 var(Uk^2) = 2(λ1^2 + ⋯ + λk^2).     (4)
Thus, the mean of L2 does not depend on the correlation matrix Σ for the baseline z-scores, but the variance of L2 does. For this reason, it is important to find the minimum and maximum values Vmin and Vmax of var(L2) and determine which correlation matrices yield those extreme values. Because the λi sum to k, their mean is λ̄ = 1. Write V as

V = 2{(λ1 − 1)^2 + ⋯ + (λk − 1)^2 + k} = 2{(k − 1)sλ^2 + k},     (5)

where sλ^2 = {(λ1 − λ̄)^2 + ⋯ + (λk − λ̄)^2}/(k − 1) is the sample variance of the λi. It is clear that V is minimized when λi = 1 for each i. In other words, the independence case, Σ = I, produces the smallest variance, Vmin = 2k, of L2. In a sense, this produces the greatest peakedness for the null distribution of L2. To see the serious implications of this fact, suppose that the observed value of L2 is small. If we wrongly assume that the z-scores for baseline covariates are independent, then we will be using the minimum possible variance of L2 to determine whether L2 is implausibly small. Consequently, the observed L2 value might be many assumed standard deviations away from its mean value of k. The resulting p-value will be tiny, and the level of evidence supporting data falsification will be greatly overstated if the true correlation matrix Σ is far from the identity matrix corresponding to independent z-scores.
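The algebraic identity relating V to the sample variance of the eigenvalues is easy to verify numerically. A short sketch (assuming NumPy; the random eigenvalue vector is our illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 8

# A random nonnegative eigenvalue vector rescaled so that it sums to k.
lam = rng.random(k)
lam *= k / lam.sum()

V = 2 * np.sum(lam ** 2)               # variance of L2 for these eigenvalues
s2 = np.var(lam, ddof=1)               # sample variance of the lambdas
V_identity = 2 * ((k - 1) * s2 + k)    # the same quantity via the identity
V_min = 2 * k                          # attained at lam = (1, ..., 1)
```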
To avoid inflating the probability of erroneously suspecting data falsification, we could assume the correlation matrix Σ yielding the largest variance of L2. It can be shown that the Σ maximizing the variance of L2 assigns value k to one of the λi and 0 to the remaining λi. In that case, var(L2) = 2k^2. In summary, the smallest and largest values of var(L2) are:

Vmin = 2k, attained when (λ1, …, λk) = (1, …, 1),     (6)

Vmax = 2k^2, attained when (λ1, …, λk) = (0, …, 0, k).     (7)
We will see that using Vmax is almost always too conservative to be useful. Therefore, we want to select conservative values of V that are not so conservative that they are useless. To see how to do this, notice that the vectors (1, …, 1) and (0, …, 0, k) in (6) and (7) are at opposite ends of a certain spectrum. Imagine two different communities, each with k luxury cars divided among k people. In one community, everyone has 1 luxury car, and in the other community, one person has all k luxury cars. Vectors (1, …, 1) and (0, …, 0, k) correspond to these least and most polarized distributions. This concept can be formalized as follows. A vector y = (y1, …, yk) is said to majorize another vector x = (x1, …, xk), written y ≻ x, if the ordered values x(1) ≤ … ≤ x(k) and y(1) ≤ … ≤ y(k) satisfy

y(j) + y(j+1) + ⋯ + y(k) ≥ x(j) + x(j+1) + ⋯ + x(k)

for j = 1, …, k, with equality when j = 1. In other words, y is more polarized (the rich are richer and the poor are poorer) than x. The smallest and largest vectors, in terms of majorization, with sum k are (1, …, 1) and (0, …, 0, k), respectively. The generalization of the ordering of variances in (6) and (7) is as follows.
Theorem 2.1. V = var(L2) increases as the vector λ = (λ1, …, λk) of eigenvalues of Σ becomes more polarized (i.e., increases in the majorization ordering).
The proof follows from D.2 on page 101 of [6].
Therefore, computing the null distribution of L2 assuming that λ is one of the larger (although not the largest possible) vectors in the majorization ordering should be conservative, but not prohibitively so.
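Theorem 2.1 can be illustrated with a small numerical example. The chain of eigenvalue vectors below (our illustrative choice, each summing to k = 4) increases in the majorization ordering, and the corresponding variances of L2 increase along the chain:

```python
# A chain of eigenvalue vectors, each summing to k = 4, ordered from least
# to most polarized in the majorization ordering.
chain = [
    (1.0, 1.0, 1.0, 1.0),
    (0.5, 0.5, 1.5, 1.5),
    (0.0, 1.0, 1.5, 1.5),
    (0.0, 0.0, 2.0, 2.0),
    (0.0, 0.0, 0.0, 4.0),
]

# var(L2) = 2 * sum of squared eigenvalues for each vector in the chain.
V = [2 * sum(l * l for l in lam) for lam in chain]
```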
Our treatment in this section implicitly assumed that the distribution of L2 is approximately normal for large k, which is reasonable for many λ because L2 is a linear combination of iid chi-squared random variables with 1 degree of freedom. However, for certain extreme vectors λ such as (0, …, 0, k), L2 is not approximately normal. We would like to show, without invoking asymptotic normality, that the distribution of L2 has fatter tails as λ becomes more polarized (increases in the majorization ordering). We defer discussion of this technical and difficult topic to the appendix.
2.4 Simple Σs allowing exact calculation
For any given critical value C, we can compute P(L2 ≤ C) analytically without using a normal approximation for certain classes of correlation matrices Σ. Equivalently, we can think in terms of the eigenvalues of Σ, which, as we have seen, can be interpreted in terms of variances of projections of Z onto directions defined by its eigenvectors. Suppose the total variance is spread equally among a small number of directions. Then all but a few eigenvalues are 0, and the remainder all have the same value. For instance, with only 1 direction, all but one eigenvalue is 0, and the nonzero eigenvalue is k. This is the most extreme correlation matrix, in which all the Zi are identical. More generally, if all variability is focused equally in j directions, then each of the j nonzero eigenvalues has value k/j. In that case, expression (2) becomes k(Xj/j), where Xj has a chi-squared distribution with j degrees of freedom. The probability of a type 1 error is

P(L2 ≤ C) = P{k(Xj/j) ≤ C} = Gj(C/k),     (8)

where Gj is the distribution function of 1/j times a chi-squared random variable with j degrees of freedom. The appendix shows that Gj has fatter tails as j decreases. Therefore, the distribution of L2 has fatter tails if the total variability of Z is spread equally over a smaller number of directions.
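The type 1 error inflation from Eq (8) is straightforward to compute. A sketch (assuming SciPy; function and variable names are ours) that reproduces the “1 direction” entries of Table 1:

```python
from scipy.stats import chi2

def true_alpha_j_directions(k, j, alpha=0.05):
    """Actual type 1 error of the naive (independence-based) test when all
    variability of the z-scores is split equally over j directions (Eq (8))."""
    C = chi2.ppf(alpha, df=k)          # critical value assuming L2 ~ chi-squared(k)
    # L2 =d k * X_j / j, so P(L2 <= C) = P(X_j <= j * C / k) = G_j(C / k).
    return chi2.cdf(j * C / k, df=j)

a1_k10 = true_alpha_j_directions(10, 1)     # about 0.470, as in Table 1
a1_k100 = true_alpha_j_directions(100, 1)   # about 0.623, as in Table 1
```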
Another relatively simple class of correlation matrices corresponds to the same correlation ρ for all pairs of z-scores. It can be shown that all but one eigenvalue is 1 − ρ, and the remaining eigenvalue is 1 + (k − 1)ρ. In that case, expression (2) can be written as

L2 =d (1 − ρ)Xk−1 + {1 + (k − 1)ρ}X1,     (9)

where Xk−1 and X1 are independent chi-squared random variables with k − 1 and 1 degrees of freedom, respectively. Let Hj and hj denote the chi-squared distribution and density functions with j degrees of freedom, j = 1, …, k. From (9), the type 1 error rate is

P(L2 ≤ C) = ∫0^{C/{1+(k−1)ρ}} Hk−1[{C − (1 + (k − 1)ρ)x}/(1 − ρ)] h1(x) dx.     (10)
Table 1 uses Eqs (8) and (10) to compute the inflation of the type 1 error rate when one erroneously assumes that the z-scores comparing baseline covariates across arms are independent, when the true Σ is either equicorrelated or has all variability focused in a few directions. For the rows labeled by directions, the true Σ corresponds to total variability of z-scores divided equally among 1, 2, or 3 directions. For rows labeled by ρ, the z-scores for baseline comparisons all have the same pairwise correlation ρ. For example, the “1 direction” row shows that if critical value C is computed assuming the Zi are independent, but the truth is that all variability of the Zi is focused in only 1 direction, the actual type 1 error rate is 47.0 percent or 62.3 percent if k = 10 or k = 100, respectively. On the other hand, if the true Σ has the same pairwise correlation ρ = 0.50 for all pairs, the true type 1 error rate is 15.1 percent or 54.1 percent for k = 10 or k = 100, respectively. The probability of falsely becoming suspicious increases as the Zi become more correlated.
The true Σ either has all variability focused equally in 1, 2, or 3 directions, or each off-diagonal element has value ρ.
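The equicorrelated entries of Table 1 can be computed by one-dimensional numerical integration of Eq (10). A sketch, assuming SciPy; the change of variables is our implementation detail, not from the paper:

```python
import numpy as np
from scipy.stats import chi2, norm
from scipy.integrate import quad

def true_alpha_equicorrelated(k, rho, alpha=0.05):
    """Actual type 1 error of the naive (independence-based) test when the
    z-scores are equicorrelated with correlation rho (Eq (10))."""
    C = chi2.ppf(alpha, df=k)              # critical value assuming chi-squared(k)
    a, b = 1.0 - rho, 1.0 + (k - 1) * rho  # weights in Eq (9)
    # Substituting x = t^2 turns h_1(x) dx into 2 * phi(t) dt, removing the
    # square-root singularity of the chi-squared(1) density at 0.
    integrand = lambda t: chi2.cdf((C - b * t * t) / a, df=k - 1) * 2 * norm.pdf(t)
    val, _ = quad(integrand, 0.0, np.sqrt(C / b))
    return val

a_half = true_alpha_equicorrelated(10, 0.5)   # roughly 0.15, cf. Table 1
```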
On the other hand, if one assumes perfect correlation and sets the critical value using Vmax, the test becomes extraordinarily conservative. Table 2 shows that when k = 25, the actual type 1 error rate of the Vmax test if the z-scores have common correlation ρ = 0.75 is 7.9 × 10−20 instead of 0.05. In other words, if we want to protect against the most drastic correlation matrix Σij = 1 for all i and j, the test becomes incredibly conservative even if the true correlation matrix still has unrealistically high correlation. Likewise, even if the true correlation matrix has all variance focused in only 3 directions, the Vmax test has ultraconservative type 1 error rate 0.0003. Remember that the L2 test is being used as a diagnostic to see if further investigation is warranted. Further investigation would provide an estimate of Σ that could be used to compute the true distribution of L2, resulting in a much more accurate test. Thus, a reasonable option for the diagnostic test is to make a conservative assumption, such as that all correlations are 0.75 or all variability is focused equally in only 3 directions. This is almost guaranteed to overstate the degree of correlation in a real clinical trial.
The true Σ either has all variability focused equally in 1, 2, or 3 directions, or each off-diagonal element has value ρ.
2.4.1 Example.
Fujii et al. [7] randomized 24 dogs to one of three doses of midazolam to evaluate the effect of midazolam on contractility of the diaphragm. Table 3 shows baseline means and standard deviations in each of the three arms for each of 8 continuous variables. There appears to be little variability across arms. We apply our test to dose groups 1 and 2. For each variable, we compute a one-tailed p-value Pi using an unpaired t-statistic with alternative hypothesis that group 2 has a higher mean than group 1. Then we convert each p-value to a z-score by Zi = Φ−1(1 − Pi) and compute the test statistic L2 = Z1^2 + ⋯ + Z8^2. We find that L2 = 0.2556. If we erroneously assume independence of baseline covariates, and therefore of z-statistics, the p-value is P(L2 ≤ 0.2556) computed from a chi-squared distribution with 8 degrees of freedom, approximately 1.0 × 10−5. This overstates the strength of evidence for fraud. On the other hand, assuming perfect correlation between z-scores for baseline covariates almost certainly understates the evidence for fraud. That p-value, P{8X1 ≤ 0.2556}, is approximately 0.14. We feel confident that a real randomized experiment would not result in all correlations being 0.9. Therefore, making the assumption of a common ρ of 0.9 should still be highly conservative. The p-value using (10) with ρ = 0.9 is 0.0055. In other words, even under what we feel is an unrealistically large degree of correlation, namely 0.9 for all pairs, the evidence for fraud certainly seems sufficient to warrant further investigation.
Shown are the eight continuous baseline covariates.
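The three p-values for this example can be recomputed directly from the observed L2 = 0.2556. A sketch, assuming SciPy; the substitution in the integral is our implementation detail:

```python
import numpy as np
from scipy.stats import chi2, norm
from scipy.integrate import quad

L2_obs, k = 0.2556, 8     # observed statistic for dose groups 1 and 2

# Independence assumption: L2 ~ chi-squared(8) (anticonservative).
p_indep = chi2.cdf(L2_obs, df=k)

# Perfect correlation: L2 =d 8 * chi-squared(1) (ultraconservative).
p_perfect = chi2.cdf(L2_obs / k, df=1)

# Common correlation rho = 0.9: Eq (10) with C equal to the observed L2.
rho = 0.9
a, b = 1 - rho, 1 + (k - 1) * rho
# Substituting x = t^2 turns h_1(x) dx into 2 * phi(t) dt and removes the
# square-root singularity at 0.
integrand = lambda t: chi2.cdf((L2_obs - b * t * t) / a, df=k - 1) * 2 * norm.pdf(t)
p_rho, _ = quad(integrand, 0.0, np.sqrt(L2_obs / b))
```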
3 Number of significant z-scores
We have focused on L2 as a test statistic for detecting cheating, but other goodness of fit statistics such as those considered by Betensky and Chiou [3] and Bland [4] have similar behavior. A particularly simple statistic is the number J of statistically significant z-scores. We might be suspicious if the number of continuous baseline covariates is large and none result in a statistically significant difference between arms. Suppose these z-scores are equicorrelated with nonnegative correlation ρ. Then Z1, …, Zk have the same distribution as X + ϵ1, …, X + ϵk, where X and the ϵi are mutually independent normal random variables with zero means and variances var(X) = ρ and var(ϵi) = 1 − ρ. Let zα satisfy 1 − Φ(zα) = α. Given X = x, the indicators I(Zi > zα) are iid Bernoulli(p) random variables, where

p = p(x) = P(x + ϵi > zα) = 1 − Φ{(zα − x)/(1 − ρ)^{1/2}}.     (11)
The specific distribution function F(p) for the random variable P = p(X) is unimportant. The important fact is that P has mean α, as the following calculation shows:

E(P) = E{P(Z1 > zα | X)} = P(Z1 > zα) = α.
Accordingly, the distribution of J is a mixed binomial:

P(J = j) = ∫0^1 C(k, j) p^j (1 − p)^{k−j} f(p) dp,  j = 0, 1, …, k,     (12)

where C(k, j) is the binomial coefficient and f(p) is the density function corresponding to distribution function F(p). Note that ∫pf(p)dp = α.
Under independence, the number of significant Zi has an ordinary binomial distribution with parameters k and α. Let J and J′ denote random variables from the mixed binomial (12) and the unmixed binomial bin(k, α), respectively. To see that extreme results are more likely for J than for J′, note that by Jensen’s inequality, for k > 1,

P(J = 0) = E{(1 − P)^k} ≥ (1 − EP)^k = (1 − α)^k = P(J′ = 0).     (13)
More generally, Shaked [8] has shown that a mixed binomial has fatter tails than an ordinary binomial with the same mean. This explains the common phenomenon of observing what appear to be too few or too many statistically significant baseline differences in clinical trials. Therefore, if one falsely assumes that the Zi are independent, the chance of falsely suspecting fraud will be inflated.
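The mixing effect can be seen concretely in a small simulation (a sketch assuming NumPy; k, ρ, and the number of replications are our illustrative choices). It compares P(J = 0) under equicorrelation with the ordinary binomial value (1 − α)^k:

```python
import numpy as np

rng = np.random.default_rng(1)
k, rho, alpha = 10, 0.5, 0.05
z_alpha = 1.6448536269514722          # Phi^{-1}(1 - 0.05)

# Equicorrelated z-scores: Z_i = sqrt(rho) * X + sqrt(1 - rho) * eps_i.
n = 200_000
X = rng.standard_normal((n, 1))
eps = rng.standard_normal((n, k))
Z = np.sqrt(rho) * X + np.sqrt(1 - rho) * eps

J = np.sum(Z > z_alpha, axis=1)       # number of significant z-scores
p_none = np.mean(J == 0)              # estimated P(J = 0) under equicorrelation
p_none_indep = (1 - alpha) ** k       # ordinary binomial value under independence
```

Even though E(J) is the same in both cases, the event J = 0 is noticeably more likely under correlation, consistent with Eq (13).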
4 Discussion
We have proposed a diagnostic for detecting suspiciously low between-arm variability in baseline covariates in clinical trials. The test statistic L2, the squared length of the vector Z of z-scores measuring balance in baseline covariates, has a distribution that depends on the correlation matrix Σ of Z only through its eigenvalues. We confirm analytically for L2 what is demonstrated through simulation in [3–4] for similar goodness of fit tests applied to baseline covariates in clinical trials when no information about correlations is available. Assuming independence between covariates (and, therefore, between the Zi) results in an unacceptably high probability of falsely suspecting fraud. In fact, the distribution of L2 has thinnest tails when one falsely assumes that z-scores for baseline covariates are independent, and fattest tails when one assumes the most extreme possible correlation. We draw two conclusions: 1) one should never conclude fraud solely because L2 is unusually small under the independence assumption and 2) to feel confident that the Zi are too small to have occurred by chance, L2 must be unusually small even assuming unrealistically high correlation. Assuming perfect correlation produces a test that virtually never triggers further investigation. Therefore, we suggest using a practical upper bound on the correlation matrix such as all correlations equal to 0.75 or all variability focused equally in only 3 directions.
The final verdict will almost always be based on the totality of evidence. The case made by Bolland et al. [1] was based on numerous trials by the same authors that contained warning signs such as suspiciously fast enrollment and few deaths and dropouts despite recruiting older patients with substantial comorbidity. Our test statistic is a useful diagnostic that can be used in conjunction with other evidence to bolster the case for data falsification.
A Appendix: Fat-tailed distributions
The argument in Section 2.3 that L2 should have fatter tails as λ becomes more polarized was based on approximate normality of L2 for large k. But if λ = (0, …, 0, k), L2 has the distribution of k times a chi-squared variate with 1 degree of freedom, which is not normal. This section addresses whether L2 has fatter tails as λ becomes more polarized even without the approximate normality assumption.
The first problem lies in defining “fatter tails.” This is easy for normal distributions or other distributions symmetric about their mean: the same mean and larger variance implies fatter tails. For asymmetric distributions such as those of linear combinations of chi-squared random variables, we must use a different definition. One possibility is the following.
Definition A.1. Distribution function F2 has fatter tails than distribution function F1, denoted by F1 ≤t F2 or F2 ≥t F1, if there exists a number x* such that F2(x) ≥ F1(x) for x ≤ x*, and F2(x) ≤ F1(x) for x > x*.

In other words, the left tail F2(x) is at least as large as the left tail F1(x) for all x ≤ x*, and the right tail 1 − F2(x) is at least as large as the right tail 1 − F1(x) for all x > x*. Another way of expressing this fact is that if F1 − F2 has any sign changes, then there is exactly one, and the sequence of signs is −, + as x increases.
Bock et al. [9] conjectured, but did not prove, the following.

Conjecture A.1. (Bock et al. [9]) If U1, …, Uk are iid standard normals and λ = (λ1, …, λk) majorizes λ′ = (λ1′, …, λk′), then

λ1′U1^2 + ⋯ + λk′Uk^2 ≤t λ1U1^2 + ⋯ + λkUk^2.
Theorem 1 of Roosta-Khorasani and Székely [10] is closely related, but it shows that large values are more likely for λ1U1^2 + ⋯ + λkUk^2 than for λ1′U1^2 + ⋯ + λk′Uk^2. We are interested in the opposite tail, namely that very small values are also more likely for λ1U1^2 + ⋯ + λkUk^2 than for λ1′U1^2 + ⋯ + λk′Uk^2.
Although we have been unable to prove Conjecture A.1 in complete generality, empirical evidence and heuristic arguments support its veracity. For example, we investigated the k = 2 case using an extensive grid of possible values of λ and λ′ with λ ≻ λ′, computing the distributions of λ1U1^2 + λ2U2^2 and λ1′U1^2 + λ2′U2^2 through numerical integration. For k = 3, we repeatedly generated random vectors λ and λ′ from a simplex in a way that ensured λ ≻ λ′, and used simulation to compute the distributions of the corresponding weighted sums. Further details are available from the authors.
One special case of Conjecture A.1 is when all of the variability of Z is concentrated equally in each of j directions. In that case, the distribution of L2 is that of (k/j)(δ1U1^2 + ⋯ + δkUk^2), where (δ1, …, δk) contains j ones and k − j zeroes. Since δ1U1^2 + ⋯ + δkUk^2 has a chi-squared distribution with j degrees of freedom, L2 is distributed as (k/j)χj^2, where χj^2 denotes a chi-squared random variable with j degrees of freedom. Thus, Conjecture A.1 says that (k/j)χj^2 has fatter tails as j diminishes. Although we are unable to prove Conjecture A.1 in general, we prove this special case at the end of the appendix.
Theorem A.1. (k/j)χj^2 ≤t (k/i)χi^2 for integers i, j, i < j.
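The ordering in Theorem A.1 is easy to check numerically at particular points. The sketch below (assuming SciPy; the evaluation points x = 1 and x = 30 are our illustrative choices for k = 10) confirms that fewer directions give fatter tails on both sides:

```python
from scipy.stats import chi2

k = 10

def F(j, x):
    """Distribution function of (k/j) * chi-squared(j)."""
    return chi2.cdf(j * x / k, df=j)

# Fewer directions (smaller j) should give fatter tails on both sides:
left = [F(j, 1.0) for j in (1, 2, 3)]          # left-tail probabilities at x = 1
right = [1 - F(j, 30.0) for j in (1, 2, 3)]    # right-tail probabilities at x = 30
```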
Another important special case of Conjecture A.1 is when all pairs (Zi, Zj) have the same correlation ρ ≥ 0. In that case, the eigenvalue vector is (1 − ρ, …, 1 − ρ, 1 + (k − 1)ρ), which increases in the majorization ordering as ρ increases. The conjecture implies that L2 has fatter tails as ρ increases. Eq (9) shows that L2 =d k{(1 − τ)Xk−1/(k − 1) + τX1}, where τ = {1 + (k − 1)ρ}/k. Therefore, Conjecture A.1 applied to the special case of equicorrelated Zi is equivalent to k{(1 − τ)Xk−1/(k − 1) + τX1} having fatter tails as τ increases from 1/k to 1.
It should be noted that one could define fatness of tails of a distribution function in ways other than Definition A.1. For example, suppose that E{ψ(X)} ≤ E{ψ(Y)} for every convex function ψ such that the expectations exist. Then not only is the variance of X no greater than that of Y, but the same is true for the fourth central moment, the sixth central moment, etc. This is one way of formulating the idea that the distribution function of Y has fatter tails than that of X.
It is very easy to prove, using the alternative definition above, that assuming the Zi are independent produces the thinnest tailed distribution. This is demonstrated in Theorem A.2.
Theorem A.2. Let U1, …, Uk be iid standard normals and L2 = ∑λiUi^2, where ∑λi = k. Then for any convex function ψ such that E{ψ(kU1^2)} is finite, E{ψ(L2)} is largest when λ = (0, …, 0, k).
This follows from the definition of a convex function: for any u1, …, uk and nonnegative w1, …, wk with ∑wi = 1, ψ(∑wi ui) ≤ ∑wi ψ(ui). Set wi = λi/k, i = 1, …, k. Then

E{ψ(∑λiUi^2)} = E[ψ{∑(λi/k)(kUi^2)}] ≤ ∑(λi/k)E{ψ(kUi^2)} = E{ψ(kU1^2)},     (14)

which is E{ψ(L2)} when λ = (0, …, 0, k), completing the proof.
Proof of Theorem A.1
Let fj(x) be the density of (k/j)(δ1U1^2 + ⋯ + δkUk^2), where (δ1, …, δk) has k − j zeroes followed by j ones. The distribution of δ1U1^2 + ⋯ + δkUk^2 is chi-squared with j degrees of freedom. It follows that

fj−1(x)/fj(x) = cj exp{x/(2k)}/x^{1/2} ≡ h(x)     (15)

for a positive constant cj depending only on j and k. The derivative of g(x) = exp{x/(2k)}/x^{1/2} is negative for 0 < x < k and positive for x > k. Thus, g(x) decreases for x < k and increases for x > k. Moreover, g(x) has a limit of +∞ as either x ↓ 0 or x → ∞. These facts also hold for h(x), which is just a positive constant times g(x). It follows that the number m1 of x such that h(x) = 1 is either 0, 1, or 2. But m1 cannot be 0 or 1 because that would imply that fj−1(x) > fj(x) for all x or for all but one x, contradicting the fact that fj−1(x) and fj(x) both integrate to 1. Therefore, h(x) = 1 for exactly two values, x = x1 and x = x2, x1 < x2.
Let Fj−1(x) and Fj(x) be the distribution functions corresponding to the densities fj−1(x) and fj(x). Because fj−1(x) > fj(x) for x < x1 and x > x2, Fj−1(x) > Fj(x) for x < x1 and Fj−1(x) < Fj(x) for x > x2. Because Fj−1 and Fj are continuous, there must be a point x* with x1 < x* < x2 for which Fj−1(x*) = Fj(x*). But fj−1(x) < fj(x) for x ∈ (x1, x2), so if Fj−1(x*) = Fj(x*), then Fj−1(x) < Fj(x) for all x ∈ (x*, x2). We have already established that Fj−1(x) < Fj(x) for x > x2, so Fj−1(x) < Fj(x) for x > x*. Putting these facts together, we have established that

Fj−1(x) ≥ Fj(x) for x ≤ x* and Fj−1(x) ≤ Fj(x) for x > x*;

that is, Fj ≤t Fj−1. This completes the proof.
References
- 1. Bolland MJ, Avenell A, Gamble GD, Grey A. Systematic review and statistical analysis of the integrity of 33 randomized controlled trials. Neurology. 2016;87:2391–2402. pmid:27920281
- 2. Carlisle JB, Dexter F, Pandit JJ, Shafer SL, Yentis SM. Calculating the probability of random sampling for continuous variables in submitted or published randomised controlled trials. Anaesthesia. 2015;70:848–858. pmid:26032950
- 3. Betensky RA, Chiou SH. Correlation among baseline variables yields non-uniformity of p-values. PLoS ONE. 2017;12(9):e0184531. pmid:28886190
- 4. Bland M. Do baseline p-values follow a uniform distribution in randomised trials? PLoS ONE. 2013;8(10): e76010. pmid:24098419
- 5. Noble B, Daniel JW. Applied Linear Algebra, 3rd ed. New York: Pearson; 1987.
- 6. Marshall AW, Olkin I, Arnold BC. Inequalities: Theory of Majorization and Its Applications, 2nd ed. New York: Springer; 2011.
- 7. Fujii Y, Hoshi T, Uemura A, Toyooka H. Dose-response characteristics of midazolam for reducing diaphragmatic contractility. Anesthesia and Analgesia. 2001;92:1590–1593 (later retracted). pmid:11375852
- 8. Shaked M. On mixtures from exponential families. Journal of the Royal Statistical Society B. 1980;42:192–198.
- 9. Bock ME, Diaconis P, Huffer FW, Perlman MD. Inequalities for linear combinations of gamma random variables. The Canadian Journal of Statistics. 1987;15:387–395.
- 10. Roosta-Khorasani F, Székely G. Schur properties of convolutions of gamma random variables. Metrika. 2015;78:997–1014.