Statistical tests under Dallal’s model: Asymptotic and exact methods

This paper proposes asymptotic and exact methods for testing the equality of correlations across multiple groups of bilateral data under Dallal’s model. Three asymptotic test statistics are derived for large samples. Because these are not suitable for small samples, several conditional and unconditional exact methods are proposed based on the same three statistics. Numerical studies are conducted to compare all these methods with regard to type I error rates (TIEs) and powers. The results show that the asymptotic score test is the most robust, and that two of the exact tests have satisfactory TIEs and powers. Real examples are provided to illustrate the effectiveness of these tests.


Introduction
In clinical medicine, we often encounter bilateral data taken from paired organs of patients, such as eyes and ears. For the same patient, the intraclass correlation between the responses of the paired parts should be taken into account to avoid misleading results. Various models have been proposed to analyze such data. For example, Rosner [1] introduced a positive constant R as a measure of dependency by assuming that the probability of a response at one side of the paired body, given a response at the other side, is R times the response rate. Donner [2] provided an alternative approach and considered a common correlation coefficient in each of two groups. Under these two models, asymptotic and exact methods have been studied for many years and have achieved significant progress.
Under Rosner's model, Tang et al. [3] developed exact and approximate procedures for cases where the sample size is small or the data structure is sparse. Qiu et al. [4] derived sample formulas for testing the difference between two proportions. Shan and Ma [5] and Ma et al. [6] presented several asymptotic and exact methods to investigate the equality of proportions. Peng et al. [7] constructed asymptotic confidence intervals (CIs) of the proportion ratio for correlated paired data. Under Donner's model, Pei et al. [8,9] used asymptotic methods to analyze test statistics and CIs in two treated groups. Liu et al. [10,11] provided exact methods to test the homogeneity of prevalence across multiple groups. Generally, asymptotic methods can produce empirical type I error rates (TIEs) close to the pre-specified nominal level for large samples.

Dallal's model and estimators
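Suppose that $N$ patients are randomly allocated into $g$ groups, with $m_i$ patients in the $i$th group ($i = 1, \ldots, g$). Let $m_{li}$ be the number of patients in the $i$th group who have $l$ ($l = 0, 1, 2$) organ(s) with improvement response(s), and let $S_l$ be the total number of patients with $l$ response(s). Obviously, $m_i = \sum_{l=0}^{2} m_{li}$ and $S_l = \sum_{i=1}^{g} m_{li}$. The data structure is shown in Table 1. Let $p_{li}$ be the probability that a patient in the $i$th group has $l$ response(s). The vector $\mathbf{m}_i \triangleq (m_{0i}, m_{1i}, m_{2i})^T$ follows a multinomial distribution $M(m_i; p_{0i}, p_{1i}, p_{2i})$, whose probability density satisfies

$$f(\mathbf{m}_i \mid p_{0i}, p_{1i}, p_{2i}) = \frac{m_i!}{m_{0i}!\, m_{1i}!\, m_{2i}!}\, p_{0i}^{m_{0i}}\, p_{1i}^{m_{1i}}\, p_{2i}^{m_{2i}}, \quad i = 1, \ldots, g.$$

Let $Z_{ijk} = 1$ if the $k$th organ of the $j$th patient in the $i$th group has an improvement response, and $Z_{ijk} = 0$ otherwise, for $k = 1, 2$, $i = 1, 2, \ldots, g$ and $j = 1, 2, \ldots, m_i$. Under Dallal's model, we assume

$$\Pr(Z_{ijk} = 1) = \pi_i, \qquad \Pr(Z_{ijk} = 1 \mid Z_{ij(3-k)} = 1) = \gamma_i, \tag{1}$$

where $0 \le \pi_i, \gamma_i \le 1$. In particular, $\gamma_i = \pi_i$ means that the two organ responses of each patient are completely independent, and $\gamma_i = 1$ means that they are completely dependent in the $i$th group. By using Eq (1), the probabilities of no, one or both response(s) are

$$p_{0i} = 1 - 2\pi_i + \pi_i\gamma_i, \qquad p_{1i} = 2\pi_i(1 - \gamma_i), \qquad p_{2i} = \pi_i\gamma_i,$$

where $0 \le p_{li} \le 1$, $p_{0i} + p_{1i} + p_{2i} = 1$, and $\max\{0,\ 2 - 1/\pi_i\} \le \gamma_i \le 1$ (the lower bound keeps $p_{0i}$ non-negative).

In this work, we are interested in testing whether the correlations of the $g$ groups are identical. Thus, the hypotheses are given by

$$H_0: \gamma_1 = \cdots = \gamma_g \triangleq \gamma \quad \text{versus} \quad H_1: \gamma_i \ne \gamma_j \ \text{for some } i \ne j,\ i, j \in \{1, \ldots, g\}.$$

Denote $\mathbf{m} = (\mathbf{m}_1, \ldots, \mathbf{m}_g)$, $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_g)$ and $\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_g)$. Given the observation $\mathbf{m}$, the log-likelihood function is

$$l(\boldsymbol{\pi}, \boldsymbol{\gamma} \mid \mathbf{m}) = \sum_{i=1}^{g} \left[ m_{0i}\log(1 - 2\pi_i + \pi_i\gamma_i) + m_{1i}\log\big(2\pi_i(1 - \gamma_i)\big) + m_{2i}\log(\pi_i\gamma_i) \right] + \log C, \tag{2}$$

where $C = \prod_{i=1}^{g} m_i!/(m_{0i}!\, m_{1i}!\, m_{2i}!)$ does not depend on the parameters. Let $\hat{\pi}_i$ and $\hat{\gamma}_i$ be the unconstrained MLEs of $\pi_i$ and $\gamma_i$ under $H_1$.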
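As an illustration of the model just described, the following minimal Python sketch maps $(\pi_i, \gamma_i)$ to the three cell probabilities and evaluates the kernel of the log-likelihood. It assumes the usual Dallal parameterisation, in which $\pi_i$ is the marginal response rate of one organ and $\gamma_i$ is the conditional response rate given a response in the other organ. The function names `cell_probs` and `log_likelihood` are hypothetical, and the multinomial constant is omitted.

```python
import math


def cell_probs(pi, gamma):
    """Return (p0, p1, p2) for one group under the assumed Dallal parameterisation."""
    # gamma must be at least 2 - 1/pi so that p0 stays non-negative.
    lower = max(0.0, 2.0 - 1.0 / pi) if pi > 0 else 0.0
    if not (0.0 <= pi <= 1.0 and lower - 1e-12 <= gamma <= 1.0):
        raise ValueError("(pi, gamma) lies outside the admissible region")
    p2 = pi * gamma                    # both organs respond
    p1 = 2.0 * pi * (1.0 - gamma)      # exactly one organ responds
    p0 = max(0.0, 1.0 - p1 - p2)       # no response, clamped against rounding
    return p0, p1, p2


def log_likelihood(tables, pis, gammas):
    """Log-likelihood kernel (multinomial constant omitted).

    tables: list of (m0, m1, m2) counts, one tuple per group.
    pis, gammas: per-group parameter values of the same length.
    """
    total = 0.0
    for counts, pi, gamma in zip(tables, pis, gammas):
        for count, p in zip(counts, cell_probs(pi, gamma)):
            if count == 0:
                continue               # a zero count contributes nothing
            if p <= 0.0:
                return float("-inf")   # observed cell has probability zero
            total += count * math.log(p)
    return total
```

For instance, `cell_probs(0.3, 0.3)` corresponds to the independence case $\gamma_i = \pi_i$, where the probability of no response reduces to $(1 - \pi_i)^2$.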
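Differentiating (2) with respect to $\pi_i$ and $\gamma_i$ and setting the derivatives to 0, the MLEs $\hat{\pi}_i$ and $\hat{\gamma}_i$ are the solutions of the following equations

$$-\frac{m_{0i}(2 - \gamma_i)}{1 - 2\pi_i + \pi_i\gamma_i} + \frac{m_{1i} + m_{2i}}{\pi_i} = 0, \qquad \frac{m_{0i}\pi_i}{1 - 2\pi_i + \pi_i\gamma_i} - \frac{m_{1i}}{1 - \gamma_i} + \frac{m_{2i}}{\gamma_i} = 0, \quad i = 1, \ldots, g.$$

Because each group has two free parameters and two free cell probabilities, the solution matches the fitted probabilities to the observed proportions $m_{li}/m_i$, which gives $\hat{\pi}_i = (m_{1i} + 2m_{2i})/(2m_i)$ and $\hat{\gamma}_i = 2m_{2i}/(m_{1i} + 2m_{2i})$.

Let $\tilde{\pi}_i$ and $\tilde{\gamma}$ be the constrained MLEs of $\pi_i$ and $\gamma$ under $H_0$. Similarly, they are the solution of the equations

$$-\frac{m_{0i}(2 - \gamma)}{1 - 2\pi_i + \pi_i\gamma} + \frac{m_{1i} + m_{2i}}{\pi_i} = 0 \ \ (i = 1, \ldots, g), \qquad \sum_{i=1}^{g} \frac{m_{0i}\pi_i}{1 - 2\pi_i + \pi_i\gamma} - \frac{S_1}{1 - \gamma} + \frac{S_2}{\gamma} = 0.$$

From the first equation, we have

$$\frac{m_{0i}}{1 - 2\tilde{\pi}_i + \tilde{\pi}_i\tilde{\gamma}} = \frac{m_{1i} + m_{2i}}{\tilde{\pi}_i(2 - \tilde{\gamma})}.$$

Substituting this into the second equation, it can be simplified as $\tilde{\gamma}(S_1 + 2S_2) - 2S_2 = 0$. Then, the constrained MLEs are obtained as

$$\tilde{\gamma} = \frac{2S_2}{S_1 + 2S_2}, \qquad \tilde{\pi}_i = \frac{m_{1i} + m_{2i}}{m_i(2 - \tilde{\gamma})}, \quad i = 1, \ldots, g.$$

Test methods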

Asymptotic test statistics
In this section, we propose three asymptotic tests for large samples based on the unconstrained and constrained MLEs.
Under $H_0$, each test statistic $T_l$ ($l = L, SC, W$) asymptotically follows a chi-square distribution with $g - 1$ degrees of freedom. Let $\chi^2_{g-1,1-\alpha}$ be the $(1 - \alpha)$ quantile of the chi-square distribution with $g - 1$ degrees of freedom. Given the nominal level $\alpha$, the null hypothesis $H_0$ is rejected if the value of $T_l$ is larger than $\chi^2_{g-1,1-\alpha}$.
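The decision rule above can be sketched in Python for the likelihood ratio statistic. The sketch is an assumption-laden illustration rather than the paper's implementation: it takes $T_L$ to be the usual form, twice the difference between the unconstrained and constrained maximised log-likelihoods, uses the closed-form MLEs that follow from the likelihood equations, and evaluates the chi-square tail probability with a small helper valid for integer degrees of freedom. All function names are hypothetical; the score and Wald-type statistics are not covered.

```python
import math


def constrained_mle(tables):
    """Closed-form MLEs under H0: common gamma and per-group pi."""
    s1 = sum(t[1] for t in tables)
    s2 = sum(t[2] for t in tables)
    if s1 + 2 * s2 == 0:
        raise ValueError("no responses observed; gamma is not estimable")
    gamma = 2.0 * s2 / (s1 + 2.0 * s2)
    pis = [(m1 + m2) / ((m0 + m1 + m2) * (2.0 - gamma)) for m0, m1, m2 in tables]
    return pis, gamma


def lr_statistic(tables):
    """Assumed form T_L = 2 * [l(unconstrained MLE) - l(constrained MLE)]."""
    pis, gamma = constrained_mle(tables)
    stat = 0.0
    for (m0, m1, m2), pi in zip(tables, pis):
        m = m0 + m1 + m2
        # Fitted cell probabilities under H0; the first one equals m0 / m.
        fitted = (m0 / m, 2.0 * pi * (1.0 - gamma), pi * gamma)
        for count, p in zip((m0, m1, m2), fitted):
            if count > 0:  # zero counts contribute nothing
                stat += 2.0 * count * (math.log(count / m) - math.log(p))
    return stat


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2.0) / j
            total += term
        return math.exp(-x / 2.0) * total
    total = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total


def lr_test(tables, alpha=0.05):
    """Return (statistic, p-value, reject) for H0: equal gamma across groups."""
    stat = lr_statistic(tables)
    p_value = chi2_sf(stat, len(tables) - 1)
    return stat, p_value, p_value < alpha
```

Rejecting when the p-value falls below $\alpha$ is equivalent to rejecting when the statistic exceeds the $(1 - \alpha)$ quantile, so the helper avoids the need for a quantile table.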

Exact methods
Given the observed data $\mathbf{m} = (\mathbf{m}_1, \ldots, \mathbf{m}_g)$, let $T_l(\mathbf{m})$ be the value of the aforementioned statistic $T_l$ ($l = L, SC, W$). The asymptotic (A) p-values of these statistics are defined by

$$p^A_l(\mathbf{m}^*) = \Pr\big(\chi^2_{g-1} \ge T_l(\mathbf{m}^*)\big), \quad l = L, SC, W,$$

where $\mathbf{m}^*$ is the observed value of $\mathbf{m}$. For convenience, we refer to $p^A_L$, $p^A_{SC}$ and $p^A_W$ as the "A approach" based on statistics $T_L$, $T_{SC}$ and $T_W$. Asymptotic tests work well when the sample size is large, but they have limitations when the sample size is relatively small. Several exact conditional and unconditional methods are therefore proposed for small samples based on these statistics.
An exact conditional method is introduced under the assumption that all of the $m_i$ ($i = 1, \ldots, g$) and $S_l$ ($l = 0, 1, 2$) in Table 1 are fixed. The reference set is $S(\mathbf{m}^*) = \{\mathbf{m} : S_l = S_l^*,\ l = 0, 1, 2\}$, and the exact conditional (C) p-values are calculated from the tail area $C_l(\mathbf{m}^*)$ of statistic $T_l$ within this set. Here, $p^C_L$, $p^C_{SC}$ and $p^C_W$ are described as the "C approach" based on statistics $T_L$, $T_{SC}$ and $T_W$.

Another exact p-value comes from Basu's maximization approach [14]. It is obtained by maximizing the tail probability over all nuisance parameters, instead of evaluating it at the constrained MLEs under $H_0$. In this case, the tail area of statistic $T_l$ ($l = L, SC, W$) is defined in terms of $\boldsymbol{\pi} = (\pi_1, \ldots, \pi_g)$ and $\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_g)$. Under the maximization (M) method, the three exact unconditional p-values are computed from $L(\boldsymbol{\pi}, \boldsymbol{\gamma} \mid \mathbf{m}) = \exp(l(\boldsymbol{\pi}, \boldsymbol{\gamma} \mid \mathbf{m}))$, where $l(\boldsymbol{\pi}, \boldsymbol{\gamma} \mid \mathbf{m})$ is defined in (2). Corresponding to the "A approach" and "C approach", $p^M_L$, $p^M_{SC}$ and $p^M_W$ are called the "M approach" based on $T_L$, $T_{SC}$ and $T_W$.
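The reference set used by the conditional method can be enumerated directly when the samples are small. The following Python sketch lists every table whose group sizes $m_i$ and column totals $S_l$ match the observed ones, and then keeps the tables whose statistic is at least as large as the observed value. It covers only the enumeration step: the probability weights attached to each table are not reproduced here, and the names `reference_set` and `tail_region`, as well as the `statistic` callable, are hypothetical.

```python
def reference_set(row_totals, col_totals):
    """Yield every g x 3 table of counts with the given margins.

    row_totals: group sizes (m_1, ..., m_g).
    col_totals: totals (S_0, S_1, S_2) of patients with 0, 1, 2 responses.
    Each table is a tuple of (m0, m1, m2) rows.
    """
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must have the same sum")

    def fill(i, remaining):
        if i == len(row_totals) - 1:
            # The last row is forced by what is left in each column.
            yield (tuple(remaining),)
            return
        m = row_totals[i]
        for m0 in range(min(m, remaining[0]) + 1):
            for m1 in range(min(m - m0, remaining[1]) + 1):
                m2 = m - m0 - m1
                if m2 > remaining[2]:
                    continue
                rest = (remaining[0] - m0, remaining[1] - m1, remaining[2] - m2)
                for tail in fill(i + 1, rest):
                    yield ((m0, m1, m2),) + tail

    yield from fill(0, tuple(col_totals))


def tail_region(observed, statistic, tol=1e-10):
    """Tables in the reference set whose statistic is >= the observed value."""
    rows = [sum(row) for row in observed]
    cols = [sum(row[l] for row in observed) for l in range(3)]
    t_obs = statistic(observed)
    return [t for t in reference_set(rows, cols) if statistic(t) >= t_obs - tol]
```

Full enumeration grows quickly with $g$ and the group sizes, which is one reason exact methods are usually reserved for small samples.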

Numerical studies
In this section, we investigate the performance of the proposed asymptotic and exact tests in terms of TIEs and powers under different parameter settings.
We first compare the asymptotic methods $T_L$, $T_{SC}$ and $T_W$ in terms of empirical TIEs. Let $g = 2, 3, 4$, $\pi = 0.3: 0.02: 0.5$, $\gamma = 0.3: 0.02: 0.8$ and $m = m_1 = \cdots = m_g = 15, 50, 100$. Here, $a: b: c$ means increasing from $a$ to $c$ in steps of $b$. For each parameter setting, 10,000 samples are randomly generated under the null hypothesis $H_0$. Given the nominal level $\alpha = 0.05$, the empirical TIE is calculated as the proportion of samples in which $H_0$ is rejected, i.e., the number of rejections/10,000. Figs 1, 2 and 3 show the surfaces of empirical TIEs for all the tests under $\pi_i = \pi$ and $\gamma_i = \gamma$ ($i = 1, 2, \ldots, g$; $g = 2, 3, 4$). According to Tang et al. [3], a test is liberal if its empirical TIE is greater than 0.06, conservative if it is less than 0.04, and robust otherwise. We observe that the score test is more robust than the other tests, since its TIEs are closer to the pre-determined level $\alpha = 0.05$. All the tests work well for larger sample sizes. However, the likelihood ratio and Wald-type tests have inflated TIEs and are especially liberal when the sample size is small. Some of their TIEs may be less than 0.04 or greater than 0.06.
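The simulation procedure described above can be sketched as follows for the likelihood ratio test only. This is a minimal illustration under stated assumptions: the statistic is taken to be twice the log-likelihood difference between the unconstrained and constrained fits, the critical value is supplied by the caller (for $\alpha = 0.05$ the chi-square quantiles are about 3.841, 5.991 and 7.815 for $g = 2, 3, 4$), and samples with no responses at all are counted as non-rejections. The names `lr_statistic`, `simulate_tie` and `classify` are hypothetical.

```python
import math
import random


def lr_statistic(tables):
    """Assumed likelihood ratio statistic for H0: common gamma across groups."""
    s1 = sum(t[1] for t in tables)
    s2 = sum(t[2] for t in tables)
    if s1 + 2 * s2 == 0:
        return 0.0  # degenerate sample: treated as a non-rejection
    gamma = 2.0 * s2 / (s1 + 2.0 * s2)
    stat = 0.0
    for m0, m1, m2 in tables:
        m = m0 + m1 + m2
        pi = (m1 + m2) / (m * (2.0 - gamma))
        fitted = (m0 / m, 2.0 * pi * (1.0 - gamma), pi * gamma)
        for count, p in zip((m0, m1, m2), fitted):
            if count > 0:
                stat += 2.0 * count * (math.log(count / m) - math.log(p))
    return stat


def simulate_tie(g, m, pi, gamma, critical_value, reps=10_000, seed=0):
    """Empirical TIE: share of H0-generated samples in which H0 is rejected."""
    rng = random.Random(seed)
    p2 = pi * gamma
    p1 = 2.0 * pi * (1.0 - gamma)
    p0 = 1.0 - p1 - p2
    rejections = 0
    for _ in range(reps):
        tables = []
        for _ in range(g):
            counts = [0, 0, 0]
            for _ in range(m):
                u = rng.random()  # one patient: 0, 1 or 2 responding organs
                counts[0 if u < p0 else 1 if u < p0 + p1 else 2] += 1
            tables.append(tuple(counts))
        if lr_statistic(tables) > critical_value:
            rejections += 1
    return rejections / reps


def classify(tie):
    """Label an empirical TIE using the 0.04 / 0.06 thresholds quoted above."""
    if tie > 0.06:
        return "liberal"
    if tie < 0.04:
        return "conservative"
    return "robust"
```

A call such as `classify(simulate_tie(g=2, m=50, pi=0.4, gamma=0.5, critical_value=3.841))` evaluates one point of a TIE surface; looping over the grids of $\pi$ and $\gamma$ would produce the full surface.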
Next we calculate the empirical powers of these tests according to the parameter settings for m = 15, 50, 100: (i) g = 2, π = (0.

PLOS ONE
show TIE surfaces of all the exact methods for $\pi_i = \pi$ and $\gamma_i = \gamma$ ($i = 1, \ldots, g$; $g = 2, 3$). We observe that the A approach $p^A_{SC}$ is closer to the pre-specified nominal level $\alpha = 0.05$ for $m = 10$ and $g = 2, 3$. However, $p^A_L$ and $p^A_W$ have inflated TIEs. For the C approach, $p^C_W$ is better than $p^C_L$ and $p^C_{SC}$, since the latter two have inflated TIEs. The M approaches $p^M_L$ and $p^M_{SC}$ can produce satisfactory TIEs.

Real examples
In this section, two real examples with unbalanced designs are provided to illustrate our proposed methods at the nominal level α = 0.05. We first show an example with large samples based on asymptotic test statistics.

Let $m_{li}$ be the number of patients with $l$ ($l = 0, 1, 2$) affected eyes in the $i$th ($i = 1, 2, 3, 4$) group. Under Dallal's model, we are interested in testing whether the correlations of these four groups are equal, i.e., $H_0: \gamma_1 = \gamma_2 = \gamma_3 = \gamma_4$ (see Table 3). For the small-sample case, we provide another example to compare the effectiveness of the asymptotic and exact methods.
Through calculating (4) and (5), Table 5 provides the comparison of asymptotic and exact methods. The result shows that there is no significant difference between the correlations of the two groups, regardless of the approach used.

Numerical studies are conducted to investigate the performance of asymptotic and exact methods in terms of TIEs and powers. When the sample size is larger, the empirical TIEs and powers of $T_L$, $T_{SC}$ and $T_W$ are close to each other. In general, the score test $T_{SC}$ is more robust than the other two tests. However, some of these tests, such as the Wald-type test, may produce unacceptable TIEs when the sample size is smaller. The results are similar to those under Rosner's and Donner's models; see Ma et al. [6] and Liu et al. [10]. For small samples, we obtain TIE surfaces and power curves of the exact C and M approaches with two and three groups and compare them with the A approach. As for TIEs, the A approaches $p^A_L$ and $p^A_W$ are liberal, and $p^A_{SC}$ is close to the nominal level 0.05 under different parameter configurations. The C approaches $p^C_L$ and $p^C_{SC}$ tend to be more inflated than $p^C_W$. The M approach $p^M_{SC}$ is better than $p^M_L$ and $p^M_W$. On the other hand, the powers of the exact methods based on the likelihood ratio $T_L$ and score $T_{SC}$ statistics are very close. For the C approach, $p^C_W$ has higher power, while $p^C_l$ ($l = L, SC$) has lower power. Moreover, in the M approach, $p^M_{SC}$ has higher power but $p^M_W$ has lower power.

The ideas behind the asymptotic and exact methods can be extended to other data structures with large or small samples, such as crash data. For example, Zeng et al. [17-19] proposed some models for the analysis of crash rates by injury severity. Dong et al. [20] introduced a mixed logit model to investigate the difference between single- and multi-vehicle accident probability. Chen et al. [21-23] analyzed unbalanced panel models by using real-time environmental and traffic big data. We leave these problems for future research.