A New Powerful Nonparametric Rank Test for Ordered Alternative Problem

We propose a new nonparametric test for the ordered alternative problem based on the rank difference between two observations from different groups. These groups are assumed to be independent of each other. The exact mean and variance of the test statistic under the null distribution are derived, and its asymptotic distribution is proven to be normal. Furthermore, an extensive power comparison between the new test and other commonly used tests shows that the new test is generally more powerful than the others under various conditions, including data from a single type of distribution and data from mixed distributions. A real example from an antihypertensive drug trial is provided to illustrate the application of the tests. The new test is therefore recommended for use in practice because it is easy to calculate and offers a substantial power gain.


Introduction
The problem of statistically testing the equality of three or more populations has been studied for decades, and many efficient nonparametric tests have been proposed. Kruskal and Wallis [1] introduced a nonparametric test for a general alternative where at least two independent populations differ in median under the alternative. This test does not identify the pairwise group differences or the number of these differences. Specific ordered alternatives, such as a trend among groups, may be more interesting to practitioners and researchers. Many tests have been proposed for different types of ordered alternatives, for example, the test proposed by Mack and Wolfe [2] for an umbrella alternative, the one proposed by Fligner and Wolfe [3] for a tree alternative, the Cochran-Armitage test [4,5] for a monotonic alternative with binary endpoints, and the Jonckheere-Terpstra (JT) test [6,7] for a monotonic alternative with continuous endpoints.
The monotonic ordering problem with continuous endpoints occurs frequently in a wide range of statistical and medical applications [8,9]. For example, in typical toxicity studies, the risk of adverse events that are caused, or possibly caused, by the treatment's action is often expected to rise with increasing doses. This problem has received considerable attention in the literature. After Jonckheere [6] and Terpstra [7] developed the nonparametric test for the nondecreasing ordered alternative based on the Mann-Whitney (MW) testing procedure, many nonparametric tests were developed for this problem based on the MW test or other tests. Recently, Neuhauser et al. [10] introduced a modified JT (MJT) test weighted by the distance between groups, and this test was shown to be more powerful than the JT test in small sample sizes because its null sampling distribution is less discrete. However, the power gain vanishes as the sample size increases. This MJT test is a special case of the generalized JT test proposed by Tryon and Hettmansperger [8]. Cuzick [11] extended the Wilcoxon rank sum test to the k-sample ordered problem (referred to as the CU test). The CU test is a special case of the linear rank test, and is a locally most powerful test for location shifts under the logistic distribution [12]. Later, Le [13] proposed a test for monotonic ordering alternatives analogous to the Kruskal-Wallis test, which was shown to be equivalent to the CU test when the sample sizes were equal across groups. Mahrer and Magel [14] performed a numerical comparison among the JT test, the CU test, and the Le test, and found that all three tests were comparable in terms of power. Most of the aforementioned tests are constructed on pairwise comparisons. More recently, Terpstra and Magel [15] proposed a nonparametric test based on simultaneous comparisons with one observation from each group.
In addition, interested readers are referred to Kossler [16] and Alonzo et al. [17].
In this article, we propose a new nonparametric test for the monotonic ordering problem based on the rank difference between two observations from different independent groups. The commonly used JT test statistic is calculated as the total number of pairs whose observation in the second group is greater than that in the first group. In addition to the sign of the difference between two observations, the size of the difference is also important for detecting the ordered alternative. In the nonparametric setting, the size of the difference can be measured by the rank difference. The new nonparametric test captures not only the sign of the difference between observations, but also the value of the difference. We are the first to propose this idea for detecting a monotonic ordering, and it can be readily extended to other important statistical problems.
The remainder of this article is organized as follows. In Section 2, we introduce the proposed new nonparametric rank test, derive the exact mean and variance of the test statistic under the null hypothesis, and prove the asymptotic null distribution. In Section 3, we compare the performance of the proposed test and other commonly used nonparametric tests with regard to power under a wide range of conditions. A real example from an antihypertensive drug trial is given in Section 4 to illustrate the application of the nonparametric tests. Section 5 is devoted to discussion and future work.

Nonparametric tests
The underlying distribution functions of k independent populations are assumed to be absolutely continuous and of the form $F_i(x) = F(x - \mu_i)$, where $\mu_i$ is the location parameter for the i-th group, $i = 1, 2, \ldots, k$. The total number of subjects in the study is N, with $n_i$ subjects in the i-th group, and $N = \sum_{i=1}^{k} n_i$. There is no difference among the k populations under the null hypothesis, and the distributions under the monotone ordering alternative differ by their location parameters $\mu_i$, $i = 1, 2, \ldots, k$. Specifically, the hypotheses are
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
and
$$H_a: \mu_1 \le \cdots \le \mu_k \ \text{and} \ \mu_1 < \mu_k.$$
Let $X_{il}$ be the l-th observation in the i-th group, and let $R_{il}$ denote the rank in the combined data of the l-th observation in the i-th group, where $i = 1, 2, \ldots, k$ and $l = 1, 2, \ldots, n_i$. The commonly used JT test is based on the $k(k-1)/2$ possible pairwise comparisons between two groups, and within each two-group comparison the MW test statistic [18] is used. The JT test statistic is expressed as
$$JT = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} U_{ij},$$
where $U_{ij} = \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} I(X_{il} < X_{jm})$ is the MW test statistic for comparing the i-th and j-th populations, and $I(y) = 1$ if y is true, and 0 otherwise.
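The JT statistic can be illustrated with a minimal Python sketch. The function names `mw_count` and `jt_statistic` and the data in `example` are hypothetical and chosen only for illustration; the sketch assumes continuous data without ties.

```python
from itertools import combinations


def mw_count(x, y):
    """Mann-Whitney count: number of pairs (a, b), a in x and b in y, with a < b."""
    return sum(1 for a in x for b in y if a < b)


def jt_statistic(groups):
    """Jonckheere-Terpstra statistic: MW counts summed over all group pairs i < j."""
    return sum(mw_count(groups[i], groups[j])
               for i, j in combinations(range(len(groups)), 2))


# Hypothetical data: three groups with an upward trend and no ties.
example = [[1.2, 2.5, 3.1], [2.0, 3.4, 4.8], [3.9, 5.0, 6.2]]
print(jt_statistic(example))  # 7 + 9 + 8 = 24
```

Larger values of the statistic point toward an increasing trend across the groups; a perfectly reversed ordering gives 0.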
2.1 Existing nonparametric tests. In addition to the JT test, we considered three more frequently used nonparametric tests for monotonic ordering alternative problems, in order to compare their performance with that of the newly proposed test. They are the modified JT (MJT) test introduced by Neuhauser et al. [10], the test proposed by Terpstra and Magel [15] (referred to as the TM test), and the CU test proposed by Cuzick [11] based on the Wilcoxon rank sum test. The MJT test is a special case of the generalized JT test, with the distance between the groups as the weight, and the test statistic is given as
$$MJT = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} (j - i) \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} I(X_{il} < X_{jm}).$$
Neuhauser et al. [10] showed that the MJT test has an actual type I error closer to the nominal level and is substantially more powerful than the common JT test in small sample sizes.
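As a minimal sketch of the distance weighting, the following Python function multiplies each pairwise Mann-Whitney count by the difference in group index. The name `mjt_statistic` and the data are hypothetical, and the weight (j - i) is the reading of "distance between the groups" assumed here.

```python
from itertools import combinations


def mjt_statistic(groups):
    """Modified JT statistic: pairwise Mann-Whitney counts weighted by (j - i)."""
    total = 0
    for i, j in combinations(range(len(groups)), 2):
        # Number of pairs in which the observation from group j is larger.
        u_ij = sum(1 for a in groups[i] for b in groups[j] if a < b)
        total += (j - i) * u_ij
    return total


# Hypothetical data: three groups with an upward trend and no ties.
example = [[1.2, 2.5, 3.1], [2.0, 3.4, 4.8], [3.9, 5.0, 6.2]]
print(mjt_statistic(example))  # 1*7 + 2*9 + 1*8 = 33
```

Comparisons between groups that are further apart contribute more, which is what makes the null distribution less discrete than that of the unweighted count.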
Terpstra and Magel [15] introduced a nonparametric test based on the k-tuplet simultaneous comparison, not the pairwise comparison as in the JT test. A k-tuplet is constructed with one observation from each group, and the total number of k-tuplets is $n_1 n_2 \cdots n_k$. The TM test statistic is
$$TM = \sum_{m_1=1}^{n_1} \cdots \sum_{m_k=1}^{n_k} I(X_{1m_1} \le X_{2m_2} \le \cdots \le X_{km_k}, \text{ with at least one strict inequality}).$$
It is noted that the MW test is a special case of the TM test when $k = 2$.
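The k-tuplet comparison can be sketched in Python as follows. This is an illustrative reading of the TM statistic that counts k-tuplets ordered consistently with the alternative (nondecreasing across groups, with at least one strict increase); the function name and data are hypothetical.

```python
from itertools import product


def tm_statistic(groups):
    """Count k-tuplets (one observation per group) that are nondecreasing
    across groups with at least one strict increase."""
    count = 0
    for tup in product(*groups):
        steps = list(zip(tup, tup[1:]))  # consecutive pairs within the k-tuplet
        if all(a <= b for a, b in steps) and any(a < b for a, b in steps):
            count += 1
    return count


# Hypothetical data: three groups with an upward trend and no ties.
example = [[1.2, 2.5, 3.1], [2.0, 3.4, 4.8], [3.9, 5.0, 6.2]]
print(tm_statistic(example))  # 18 of the 27 triplets are ordered
```

The loop visits all $n_1 n_2 \cdots n_k$ k-tuplets, so this direct enumeration is practical only for small samples or few groups. With two groups it reduces to the Mann-Whitney count.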
The Wilcoxon rank sum test is one of the most popular nonparametric tests for comparing two independent populations. An extension of the Wilcoxon test was proposed by Cuzick [11]. The sum of ranks for each group is first calculated, and then the CU test statistic is computed as a weighted sum of these rank sums, with the group number as the weight,
$$CU = \sum_{i=1}^{k} i \sum_{l=1}^{n_i} R_{il}.$$
The CU test is generally more powerful than other tests under monotonic alternatives [17]. Although other tests may be considered, these four existing nonparametric tests are typically used in applications and are considered representative of the available tests for the monotonic ordering problem.
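A minimal Python sketch of the CU statistic follows. The helper `combined_ranks` and the data are hypothetical; the ranking assumes distinct values, as is the case for continuous data.

```python
def combined_ranks(groups):
    """Rank every observation in the pooled sample (1 = smallest).
    Assumes no ties, as for continuous data."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    return [[rank_of[v] for v in g] for g in groups]


def cu_statistic(groups):
    """Cuzick-type statistic: sum over groups of (group number) * (rank sum)."""
    ranks = combined_ranks(groups)
    return sum(i * sum(r) for i, r in enumerate(ranks, start=1))


# Hypothetical data: three groups with an upward trend and no ties.
example = [[1.2, 2.5, 3.1], [2.0, 3.4, 4.8], [3.9, 5.0, 6.2]]
print(cu_statistic(example))  # 1*8 + 2*14 + 3*23 = 105
```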
2.2 Proposed rank test. The MW test statistic used in the JT test counts the number of pairs such that the observation from one group is greater than that from another group; however, it does not differentiate pairs by the size of the pair differences. In other words, the actual differences between observations are not well captured. We consider the actual differences to be important information that should be utilized in the testing procedure to improve the test's efficiency. Following Shan [19] for comparing two groups, the new rank-based nonparametric test incorporating the actual differences is given as
$$S = \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} D_{ij},$$
where $D_{ij} = \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} Z_{ijlm}$, $Z_{ijlm} = (R_{jm} - R_{il}) I(X_{jm} > X_{il})$, and $R_{il}$ ($R_{jm}$) denotes the rank of the observation $X_{il}$ ($X_{jm}$) in the combined data. This new test can be considered as an extension of the sign test and the Wilcoxon rank sum test, since $I(X_{jm} > X_{il})$ and $R_{jm} - R_{il}$ are used in the sign test and the Wilcoxon test, respectively. The exact mean and variance of the null sampling distribution are given in the following theorem.
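Before the moments are stated, the definition of S can be made concrete with a short Python sketch. The function names and data are hypothetical, and distinct values are assumed so that $X_{jm} > X_{il}$ is equivalent to $R_{jm} > R_{il}$.

```python
from itertools import combinations


def combined_ranks(groups):
    """Rank every observation in the pooled sample (1 = smallest); no ties assumed."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    return [[rank_of[v] for v in g] for g in groups]


def s_statistic(groups):
    """Sum of positive rank differences R_jm - R_il over all pairs of groups i < j."""
    ranks = combined_ranks(groups)
    total = 0
    for i, j in combinations(range(len(ranks)), 2):
        for r_il in ranks[i]:
            for r_jm in ranks[j]:
                if r_jm > r_il:  # same event as X_jm > X_il when there are no ties
                    total += r_jm - r_il
    return total


# Hypothetical data: three groups with an upward trend and no ties.
example = [[1.2, 2.5, 3.1], [2.0, 3.4, 4.8], [3.9, 5.0, 6.2]]
print(s_statistic(example))  # D_12 + D_13 + D_23 = 21 + 45 + 28 = 94
```

Unlike the JT count, a pair whose ranks are far apart adds more to S than a pair whose ranks are adjacent.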
Theorem 2.1 Under the null hypothesis, the new test statistic S has mean
$$E(S) = \frac{N+1}{6} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} n_i n_j$$
and variance [...].
Proof. The calculation for the mean of S is straightforward.
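The mean calculation can be spelled out for a single term $Z_{ijlm}$. The sketch below assumes only what the null hypothesis implies for continuous data: the ranks $(R_{il}, R_{jm})$ of two distinct observations are equally likely to be any of the $N(N-1)$ ordered pairs of distinct values in $\{1, \ldots, N\}$. The summation indices a, b, and d are introduced here for illustration.

```latex
\begin{align*}
E(Z_{ijlm})
  &= \frac{1}{N(N-1)} \sum_{1 \le a < b \le N} (b - a)
   = \frac{1}{N(N-1)} \sum_{d=1}^{N-1} d\,(N - d) \\
  &= \frac{1}{N(N-1)} \cdot \frac{N(N-1)(N+1)}{6}
   = \frac{N+1}{6}.
\end{align*}
```

The second equality groups the pairs by their difference $d = b - a$, of which there are $N - d$. Every term of S has this expectation, so the mean of S follows by linearity.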
Under the null hypothesis, every $Z_{ijlm}$ has the same expectation, $(N+1)/6$, so the expectation of S is given as
$$E(S) = \sum_{i<j} \sum_{l=1}^{n_i} \sum_{m=1}^{n_j} E(Z_{ijlm}) = \frac{N+1}{6} \sum_{i<j} n_i n_j.$$
The calculation of the variance requires more effort. The variance of S can be written as a summation of covariances,
$$Var(S) = \sum_{i<j} \sum_{l,m} \sum_{i'<j'} \sum_{l',m'} Cov(Z_{ijlm}, Z_{i'j'l'm'}).$$
[...] We use these notations interchangeably in this article. We consider two observations as a pair when they have the same value. Because $i < j$ and $i' < j'$, one observation from a pair is from $(X_{il}, X_{jm})$ and the other is from $(X_{i'l'}, X_{j'm'})$.
The covariance is non-zero only when at least one pair exists in the observations $(X_{il}, X_{jm}, X_{i'l'}, X_{j'm'})$. When two pairs exist, the two terms coincide and the covariance is the variance of $Z_{ijlm}$. Thus, $Var(Z_{ijlm})$ under the null hypothesis is expressed as
$$Var(Z_{ijlm}) = E(Z_{ijlm}^2) - E(Z_{ijlm})^2 = \frac{N(N+1)}{12} - \frac{(N+1)^2}{36} = \frac{(N+1)(2N-1)}{36}.$$
When only one pair exists in $(X_{il}, X_{jm}, X_{i'l'}, X_{j'm'})$, there are four possible outcomes: (a) $X_{il} = X_{i'l'}$; (b) $X_{jm} = X_{j'm'}$; (c) $X_{il} = X_{j'm'}$; and (d) $X_{jm} = X_{i'l'}$. In cases (a) and (b), the observations $X_{il}, X_{jm}, X_{i'l'}, X_{j'm'}$ are either from two groups, where the two unpaired observations are from one group and the pair is from the other, or from three groups, where the pair is from either the first group or the third group after the groups have been sorted.
The first type of covariance in the case with only one pair is [...]. In cases (c) and (d), the observations $X_{il}, X_{jm}, X_{i'l'}, X_{j'm'}$ are from three different groups and the pair is from the second group (the middle group) after sorting the groups.
Then, the second type of covariance in the case with only one pair is given as [...]. In the case with no pair in the observations $(X_{il}, X_{jm}, X_{i'l'}, X_{j'm'})$, [...]. Therefore, the variance of S is given as [...] $n_i n_j n_l)\, Cov_B$. The standardized test statistic of S is
$$S_t = \frac{S - E(S)}{\sqrt{Var(S)}}.$$
The following theorem shows the asymptotic normality of the test statistic $S_t$ under the null hypothesis.
Theorem 2.2 Under the null hypothesis, the test $S_t$ has an asymptotic standard normal distribution as $N \to \infty$ and $n_i \to \infty$.
It should be noted that $Q(x_{1m_1}, x_{2m_2}, \ldots, x_{km_k})$ [...]. By applying the results of Problem 42 in the Appendix of Lehmann [20], $S - E(S)$ asymptotically follows a normal distribution without scaling by the standard deviation, which can be proven by projecting the test statistic S onto a sum of independent random variables [21] and then applying the central limit theorem.
The newly proposed test can be performed by comparing $S_t$ with the appropriate quantile of the standard normal distribution. For example, at the significance level $\alpha$, the null hypothesis is rejected in favor of an increasing ordered alternative if $S_t \ge q_{1-\alpha}$, where $q_{1-\alpha}$ is the $100(1-\alpha)$th percentile of the standard normal distribution.
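The decision rule can be sketched in a few lines of Python. The null mean and variance are passed in as arguments because they must come from Theorem 2.1 or from another estimate; the numeric inputs in the example call are hypothetical and are not results from this article.

```python
from math import sqrt
from statistics import NormalDist


def standardized_statistic(s, mean_s, var_s):
    """S_t = (S - E(S)) / sqrt(Var(S)) for a supplied null mean and variance."""
    return (s - mean_s) / sqrt(var_s)


def reject_null(s, mean_s, var_s, alpha=0.05):
    """One-sided test: reject when S_t is at least the (1 - alpha) normal quantile."""
    q = NormalDist().inv_cdf(1 - alpha)
    return standardized_statistic(s, mean_s, var_s) >= q


# Hypothetical inputs, for illustration only: S_t = (290 - 200) / 50 = 1.8.
print(reject_null(s=290.0, mean_s=200.0, var_s=2500.0))  # True, since 1.8 >= 1.645
```

The test is one-sided because only large values of S support an increasing order.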
The asymptotic cumulative distribution function (CDF) and the Monte Carlo simulation-based exact distribution of $P(S_t \le s)$ for $k = 3$, $n = (5, 5, 5)$ are displayed in Figure 1. The simulated exact distribution was based on 20,000 iterations from the standard normal distribution for each group. As seen in the figure, the simulated exact distribution is well approximated by the asymptotic distribution.
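A comparison of this kind can be sketched in Python as follows. This is an illustrative simulation, not the code behind Figure 1: it uses 2,000 rather than 20,000 iterations to keep the run short, the seed is arbitrary, and S is standardized with the simulated mean and standard deviation as a stand-in for the exact null moments.

```python
import random
from itertools import combinations
from statistics import NormalDist, mean, pstdev


def s_statistic(groups):
    """New rank statistic S: sum of positive rank differences; assumes no ties."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    ranks = [[rank_of[v] for v in g] for g in groups]
    return sum(rj - ri
               for i, j in combinations(range(len(ranks)), 2)
               for ri in ranks[i] for rj in ranks[j] if rj > ri)


def simulate_null_s(sizes, iterations, rng):
    """Draw S repeatedly with every group sampled from N(0, 1)."""
    draws = []
    for _ in range(iterations):
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for n in sizes]
        draws.append(s_statistic(groups))
    return draws


rng = random.Random(2024)  # arbitrary seed for reproducibility
draws = simulate_null_s((5, 5, 5), 2000, rng)
center, spread = mean(draws), pstdev(draws)

# Empirical P(S_t <= s) next to the standard normal CDF at a few points.
for s in (-1.0, 0.0, 1.0):
    empirical = sum(1 for d in draws if (d - center) / spread <= s) / len(draws)
    print(s, round(empirical, 3), round(NormalDist().cdf(s), 3))
```

For $n = (5, 5, 5)$ the simulated mean should be close to $(N+1)/6 \cdot \sum_{i<j} n_i n_j = (16/6) \cdot 75 = 200$, which gives a quick sanity check on the simulation.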

Numerical study
We conduct extensive exact Monte Carlo simulation studies to compare the five tests: 1) the JT test; 2) the MJT test; 3) the TM test; 4) the CU test; and 5) the newly proposed test. The nominal level is set to $\alpha = 0.05$. To make a fair comparison between tests and to avoid unsatisfactory type I error rate control for tests that rely on asymptotic distributions, an exact permutation approach is used with data simulated from normal distributions with the same location and scale, e.g., N(0,1). A total of 20,000 iterations are used to obtain the 95% cutpoint, and the same 20,000 simulated data sets are used for all the methods. For a given number of groups and sample size within each group, the 95% cutpoint for each test is computed from the same simulated null distribution. In other words, the simulated null data under each configuration are reused to calculate the cutpoint for each test. The same rule is applied to the simulated alternative distribution for the power comparison. This procedure reduces the bias in the cutpoint and power estimates between tests and makes the comparison between them fair. [...] N(1,1). The parameters for alternative distributions (a), (b), (c), and (d) are also used for the t distribution with df = 3 of the form $t_3 + \mu$. In addition to symmetric distributions, we also consider a skewed distribution (the exponential distribution) and a mixed distribution of normal and exponential distributions. We consider similar distributions for the case of $k = 4$, but with the sample sizes $(n_1, n_2, n_3, n_4)$: (8,8,8,8), (10,6,6,10), (20,20,10,10), and (10,20,10,20), and three alternatives: (A) $\mu = (0, 0.2, 0.5, 1)$, (B) $\mu = (0, 0.5, 0.5, 0.5)$, and (C) $\mu = (0, 0, 0, 1)$. The power comparison between the five tests is examined for each configuration of sample size and alternative hypothesis.
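The shared-sample procedure can be sketched in Python for two of the tests. This is a minimal illustration under stated assumptions: 1,000 iterations instead of 20,000, an arbitrary seed, sample sizes (5, 5, 5), and a hypothetical alternative $\mu = (0, 1, 2)$ that is not one of the configurations studied in this article. A test rejects when its statistic exceeds the simulated 95% cutpoint.

```python
import random
from itertools import combinations
from math import ceil


def jt_stat(groups):
    """JT statistic: number of pairs with the later-group observation larger."""
    return sum(1 for i, j in combinations(range(len(groups)), 2)
               for a in groups[i] for b in groups[j] if a < b)


def s_stat(groups):
    """New rank statistic S: sum of positive rank differences; assumes no ties."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    ranks = [[rank_of[v] for v in g] for g in groups]
    return sum(rj - ri
               for i, j in combinations(range(len(ranks)), 2)
               for ri in ranks[i] for rj in ranks[j] if rj > ri)


def simulate(sizes, shifts, iterations, rng):
    """Simulate data sets with group i drawn from N(shifts[i], 1)."""
    return [[[rng.gauss(mu, 1.0) for _ in range(n)]
             for n, mu in zip(sizes, shifts)]
            for _ in range(iterations)]


def cutpoint(values, level=0.95):
    """Empirical quantile: at most (1 - level) of the values exceed it."""
    ordered = sorted(values)
    return ordered[ceil(level * len(ordered)) - 1]


def rejection_rate(stat, datasets, cut):
    """Fraction of data sets whose statistic exceeds the cutpoint."""
    return sum(1 for d in datasets if stat(d) > cut) / len(datasets)


rng = random.Random(7)  # arbitrary seed for reproducibility
sizes = (5, 5, 5)
null_data = simulate(sizes, (0, 0, 0), 1000, rng)  # reused for every test
alt_data = simulate(sizes, (0, 1, 2), 1000, rng)   # reused for every test

power = {}
for name, stat in (("JT", jt_stat), ("S", s_stat)):
    cut = cutpoint([stat(d) for d in null_data])
    power[name] = rejection_rate(stat, alt_data, cut)
print(power)
```

Because every test sees the same null and alternative data sets, differences in estimated power reflect the statistics themselves rather than sampling noise between separate simulations.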
The simulated power under normal distributions for $k = 3$ is shown in Table 1. The actual sizes were obtained by simulating samples from standard normal distributions using the simulated 95% cutpoint. Simulated sizes are generally close to the nominal level across the tests and sample sizes considered. We observe that the MJT test and the test due to Cuzick have the same power, which is also observed under other distributions. Although we do not theoretically prove that both tests have the same power under the exact permutation approach, it may be the case that they are equivalent to each other. For this reason, we present only one of them in the following power comparison results. The TM test has some power gain compared to other tests under the convex shape alternative (c) with decreasing sample sizes across groups. We have seen this trend in the other three distributions. The TM test has some power advantage as compared to others under the normal distribution with unequal variances. In all other configurations, the power of the TM test is lower than that of other tests. Out of the total 20 configurations from the alternatives (a)-(e) and four different sample sizes, the new test has more power than the JT test in 19 cases, and is at least as powerful as the CU test in 15 cases.
The power study results under other distributions for $k = 3$ are shown in Table 2 for the t alternative and in Table 3 for the exponential distribution. The exponential distribution is examined as an example of skewed distributions, with mean values: [...] We also compare the tests with mixed distributions for $k = 3$ in Table 4. The mixed distribution considered here is: normal distribution for the first group, and exponential distributions for [...] has more power than other tests in 12 out of the total 16 configurations.
Table 3. Simulated power study based on exponential distribution for $k = 3$.
The power comparison results for $k = 4$ are shown in Tables 5, 6, 7, and 8 for the normal distribution, the t distribution, the exponential distribution, and the mixed distribution, respectively. The mixed distribution is the one with normal distributions $N(\mu, 1)$ for the first two groups, and exponential distributions $\exp(\mu)$ with mean $\mu$ for the last two groups. As can be seen from these tables, the new test generally has more power than all other existing tests, and is almost uniformly more powerful than the commonly used JT test.

Example
A clinical trial for an antihypertensive drug [22] is provided to illustrate the use of the discussed tests. The primary objective of the study was to examine the effect of the selected doses on diastolic blood pressure by measuring the mean reduction in [...]
Table 6. Simulated size and power study based on the t distribution with df = 3 of the form $t_3 + \mu$ for $k = 4$.

Conclusion
In this article we propose a new powerful nonparametric test, based on the rank difference between observations, for the monotonic ordering alternative in the k-sample problem. The rank difference between observations for two groups is analogous to the two-sample t test when the parametric assumptions are satisfied. The positive rank differences used in the test statistic are motivated by the idea of the sign test. We derive the asymptotic distribution of the new test statistic and study the convergence rate of the simulation-based exact distribution to the asymptotic distribution. The power comparison between the new test and other existing tests shows that the new test is generally more powerful than other tests for various distributions. We recommend using the new test in practice because of its substantial power gain.
The asymptotic distribution of the new test statistic was derived for continuous endpoints, for which ties do not occur. For ordinal and binary data, one has to consider the frequency of ties in the data, and the variance of the new test needs to be investigated. However, for given data, permutation-based or simulation-based approaches are readily employed for the p-value calculation. The application of the new test to ordinal or binary data is left for future work. Other alternative hypotheses may be studied, such as the general alternative [1], the umbrella alternative [2], and the tree alternative [3]. An extension of the new test to the exact testing framework [23,24,25,26] and to repeated data from randomized block designs is also of interest.
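A permutation-based p-value for given data can be sketched in Python as follows. This is a minimal Monte Carlo version under stated assumptions: the function names, the seed, the number of permutations, and the data are hypothetical, and the ranking assumes distinct values, so tied data would first need a rank convention that this article leaves for future work.

```python
import random
from itertools import combinations


def s_statistic(groups):
    """New rank statistic S: sum of positive rank differences; assumes no ties."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {v: r for r, v in enumerate(pooled, start=1)}
    ranks = [[rank_of[v] for v in g] for g in groups]
    return sum(rj - ri
               for i, j in combinations(range(len(ranks)), 2)
               for ri in ranks[i] for rj in ranks[j] if rj > ri)


def permutation_p_value(groups, n_perm, rng):
    """One-sided Monte Carlo permutation p-value for an increasing order."""
    observed = s_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of observations to groups
        shuffled, start = [], 0
        for n in sizes:
            shuffled.append(pooled[start:start + n])
            start += n
        if s_statistic(shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts includes the observed arrangement itself.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data with a perfect increasing order across three groups.
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
p_value = permutation_p_value(data, 2000, random.Random(11))
print(p_value)
```

The p-value is the share of reshuffled data sets whose statistic is at least as large as the observed one, so no variance formula or normal approximation is needed.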