Abstract
This paper proposes asymptotic and exact methods for testing the equality of correlations for multiple bilateral data under Dallal’s model. Three asymptotic test statistics are derived for large samples. Since these are unreliable for small samples, several conditional and unconditional exact methods are proposed based on the three statistics. Numerical studies are conducted to compare all these methods with regard to type I error rates (TIEs) and powers. The results show that the asymptotic score test is the most robust, and that two exact tests have satisfactory TIEs and powers. Real examples are provided to illustrate the effectiveness of these tests.
Citation: Li Z, Ma C, Ai M (2020) Statistical tests under Dallal’s model: Asymptotic and exact methods. PLoS ONE 15(11): e0242722. https://doi.org/10.1371/journal.pone.0242722
Editor: Feng Chen, Tongji University, CHINA
Received: August 21, 2020; Accepted: November 9, 2020; Published: November 30, 2020
Copyright: © 2020 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript.
Funding: This research is supported by the National Natural Science Foundation of China (Grant Nos: 12061070, 12071014, 11661076), and the Science and Technology Department of Xinjiang Uygur Autonomous Region (Grant No: 2018Q011).
Competing interests: The authors have declared that no competing interests exist.
Introduction
In clinical medicine, we often encounter bilateral data taken from paired organs of patients, such as eyes and ears. For the same patient, the intraclass correlation between the responses of paired parts should be taken into account to avoid misleading results. Various models have been proposed to analyze such data. For example, Rosner [1] introduced a positive constant R as a measure of dependency by assuming that the probability of a response at one side of the paired body, given a response at the other side, is R times the response rate. Donner [2] provided an alternative approach that assumes a common correlation coefficient in each of two groups. Under these two models, asymptotic and exact methods have been studied for many years, with significant progress.
Under Rosner’s model, Tang et al. [3] developed exact and approximate procedures for cases in which the sample size is small or the data structure is sparse. Qiu et al. [4] derived sample size formulas for testing the difference between two proportions. Shan and Ma [5] and Ma et al. [6] presented several asymptotic and exact methods to investigate the equality of proportions. Peng et al. [7] constructed asymptotic confidence intervals (CIs) for the proportion ratio for correlated paired data. Under Donner’s model, Pei et al. [8, 9] used asymptotic methods to analyze test statistics and CIs in two treated groups. Liu et al. [10, 11] provided exact methods to test the homogeneity of prevalence across multiple groups. Generally, asymptotic methods produce empirical type I error rates (TIEs) close to the pre-specified nominal level for large samples. However, they may yield inflated TIEs for small samples. Exact tests are then an alternative.
Dallal [12] pointed out that Rosner’s model may give a poor fit if the characteristic is almost certain to occur bilaterally with widely varying group-specific prevalence. He instead assumed that the probability of a response at one organ, given a response at the other organ, is independent of the prevalence, and introduced a likelihood ratio test for large samples. However, the approach performs poorly in small samples, with unsatisfactory TIE control. To date, statistical inference under Dallal’s model, both asymptotic and exact, has received little attention. This paper aims to propose asymptotic and exact methods for testing the homogeneity of correlations among multiple bilateral data under Dallal’s model.
The remainder of the paper is organized as follows. In Section 2, we review the bilateral data structure, introduce Dallal’s model, and derive the maximum likelihood estimates (MLEs) under the different hypotheses. Three asymptotic statistics and six exact procedures are proposed in Section 3. In Section 4, numerical studies are conducted to compare these methods in terms of TIEs and power. Two examples are provided in Section 5 to illustrate the proposed approaches. Conclusions are given in Section 6.
Dallal’s model and estimators
Suppose that N patients are randomly allocated to g groups, with mi patients in the ith group (i = 1, …, g). Let mli be the number of patients in the ith group who have l (l = 0, 1, 2) organ(s) with an improvement response, and let Sl be the total number of patients with l response(s). Obviously, $m_i = \sum_{l=0}^{2} m_{li}$, $S_l = \sum_{i=1}^{g} m_{li}$ and $N = \sum_{i=1}^{g} m_i = \sum_{l=0}^{2} S_l$. The data structure is shown in Table 1. Let pli be the probability that a patient in the ith group has l (l = 0, 1, 2) response(s). The vector mi ≜ (m0i, m1i, m2i)T follows a multinomial distribution M(mi; p0i, p1i, p2i), with probability mass function

$$f(m_{0i}, m_{1i}, m_{2i}) = \frac{m_i!}{m_{0i}!\, m_{1i}!\, m_{2i}!}\; p_{0i}^{m_{0i}}\, p_{1i}^{m_{1i}}\, p_{2i}^{m_{2i}}.$$
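Let Zijk = 1 if the kth organ of the jth patient in the ith group has an improvement response, and Zijk = 0 otherwise, for k = 1, 2, i = 1, 2, …, g and j = 1, 2, …, mi. Under Dallal’s model, we assume

$$\Pr(Z_{ijk} = 1) = \pi_i, \qquad \Pr(Z_{ijk} = 1 \mid Z_{ij(3-k)} = 1) = \gamma_i, \tag{1}$$

where 0 ≤ πi, γi ≤ 1. In particular, γi = πi means that the two organ responses of each patient are independent, and γi = 1 means that they are completely dependent in the ith group. By Eq (1), the probabilities of no, one or both response(s) are

$$p_{0i} = 1 - 2\pi_i + \pi_i\gamma_i, \qquad p_{1i} = 2\pi_i(1 - \gamma_i), \qquad p_{2i} = \pi_i\gamma_i,$$

where 0 ≤ pli ≤ 1, p0i + p1i + p2i = 1, and $\pi_i \le 1/(2 - \gamma_i)$, so that p0i ≥ 0. In this work, we are interested in testing whether the correlations of the g groups are identical. Thus, the hypotheses are given by

$$H_0: \gamma_1 = \cdots = \gamma_g = \gamma \quad \text{versus} \quad H_1: \gamma_i \ne \gamma_j \ \text{for some } i \ne j.$$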
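Denote m = (m1, …, mg), π = (π1, …, πg) and γ = (γ1, …, γg). Given the observation m, the log-likelihood function is

$$l(\pi, \gamma \mid m) = \sum_{i=1}^{g} \left[ m_{0i}\log(1 - 2\pi_i + \pi_i\gamma_i) + m_{1i}\log\{2\pi_i(1 - \gamma_i)\} + m_{2i}\log(\pi_i\gamma_i) \right] + \log C, \tag{2}$$

where $C = \prod_{i=1}^{g} m_i!/(m_{0i}!\, m_{1i}!\, m_{2i}!)$ does not depend on the parameters. Let $\hat\pi_i$ and $\hat\gamma_i$ be the unconstrained MLEs of πi and γi under H1. Differentiating (2) with respect to πi and γi and setting the derivatives to 0, the MLEs $\hat\pi_i$ and $\hat\gamma_i$ are the solutions of the following equations

$$\frac{\partial l}{\partial \pi_i} = \frac{m_{1i} + m_{2i}}{\pi_i} - \frac{m_{0i}(2 - \gamma_i)}{1 - 2\pi_i + \pi_i\gamma_i} = 0, \qquad \frac{\partial l}{\partial \gamma_i} = \frac{m_{0i}\pi_i}{1 - 2\pi_i + \pi_i\gamma_i} - \frac{m_{1i}}{1 - \gamma_i} + \frac{m_{2i}}{\gamma_i} = 0. \tag{3}$$

Let $\tilde\pi_i$ and $\tilde\gamma$ be the constrained MLEs of πi and γ under H0. Similarly, they are the solutions of the equations

$$\frac{m_{1i} + m_{2i}}{\pi_i} - \frac{m_{0i}(2 - \gamma)}{1 - 2\pi_i + \pi_i\gamma} = 0 \ (i = 1, \ldots, g), \qquad \sum_{i=1}^{g} \left[ \frac{m_{0i}\pi_i}{1 - 2\pi_i + \pi_i\gamma} - \frac{m_{1i}}{1 - \gamma} + \frac{m_{2i}}{\gamma} \right] = 0. \tag{4}$$

For the first equation, we have $\pi_i = (m_{1i} + m_{2i})/\{m_i(2 - \gamma)\}$. The second equation can be simplified as γ(S1 + 2S2) − 2S2 = 0. Then, the constrained MLEs are obtained as

$$\tilde\gamma = \frac{2S_2}{S_1 + 2S_2}, \qquad \tilde\pi_i = \frac{m_{1i} + m_{2i}}{m_i(2 - \tilde\gamma)} = \frac{(m_{1i} + m_{2i})(S_1 + 2S_2)}{2m_i(S_1 + S_2)}, \quad i = 1, \ldots, g. \tag{5}$$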
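The estimators of this section can be checked numerically with a minimal sketch. It assumes the cell probabilities p0i = 1 − 2πi + πiγi, p1i = 2πi(1 − γi), p2i = πiγi implied by Dallal's model. The closed forms used for the unconstrained MLEs are obtained by solving the score equations directly and are not quoted from the text; the function names and the counts are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Group = Tuple[int, int, int]  # (m0, m1, m2): patients with 0, 1, 2 responses


def cell_probs(pi: float, gamma: float) -> Tuple[float, float, float]:
    """Return (p0, p1, p2) implied by Dallal's model for one group."""
    return (1 - 2 * pi + pi * gamma, 2 * pi * (1 - gamma), pi * gamma)


def unconstrained_mle(group: Group) -> Tuple[float, float]:
    """Closed-form (pi_hat, gamma_hat); assumes m1 + 2*m2 > 0."""
    m0, m1, m2 = group
    m = m0 + m1 + m2
    return (m1 + 2 * m2) / (2 * m), 2 * m2 / (m1 + 2 * m2)


def constrained_mle(groups: Sequence[Group]) -> Tuple[List[float], float]:
    """Closed-form ([pi_tilde_i], gamma_tilde) under H0; assumes S1 + 2*S2 > 0."""
    s1 = sum(g[1] for g in groups)
    s2 = sum(g[2] for g in groups)
    gamma = 2 * s2 / (s1 + 2 * s2)
    pis = [(g[1] + g[2]) / (sum(g) * (2 - gamma)) for g in groups]
    return pis, gamma


# Hypothetical counts, for illustration only (not data from the paper).
groups = [(10, 4, 6), (12, 2, 6)]
pi_hat, gamma_hat = unconstrained_mle(groups[0])
pis_tilde, gamma_tilde = constrained_mle(groups)
```

Under both sets of estimates the fitted probability of no response equals the observed proportion m0i/mi, which is a convenient sanity check on any implementation.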
Test methods
An information matrix
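Denote β = (γ1, …, γg, π1, …, πg). According to Eq (3), the second-order derivatives of l with respect to πi and γi are

$$\frac{\partial^2 l}{\partial \gamma_i^2} = -\frac{m_{0i}\pi_i^2}{(1 - 2\pi_i + \pi_i\gamma_i)^2} - \frac{m_{1i}}{(1 - \gamma_i)^2} - \frac{m_{2i}}{\gamma_i^2}, \qquad \frac{\partial^2 l}{\partial \gamma_i \partial \pi_i} = \frac{m_{0i}}{(1 - 2\pi_i + \pi_i\gamma_i)^2}, \qquad \frac{\partial^2 l}{\partial \pi_i^2} = -\frac{m_{0i}(2 - \gamma_i)^2}{(1 - 2\pi_i + \pi_i\gamma_i)^2} - \frac{m_{1i} + m_{2i}}{\pi_i^2}$$

for i = 1, …, g, and

$$\frac{\partial^2 l}{\partial \gamma_i \partial \gamma_j} = \frac{\partial^2 l}{\partial \gamma_i \partial \pi_j} = \frac{\partial^2 l}{\partial \pi_i \partial \pi_j} = 0$$

for i ≠ j. Thus, the information matrix Iβ with respect to β is

$$I_\beta = (I_{ij})_{2g \times 2g},$$

where, writing $p_{0i} = 1 - 2\pi_i + \pi_i\gamma_i$,

$$I_{ii} = m_i\pi_i\left[\frac{\pi_i}{p_{0i}} + \frac{1 + \gamma_i}{\gamma_i(1 - \gamma_i)}\right], \qquad I_{i,g+i} = I_{g+i,i} = -\frac{m_i}{p_{0i}}, \qquad I_{g+i,g+i} = \frac{m_i(2 - \gamma_i)}{\pi_i p_{0i}}.$$

Otherwise, Iij = 0. By calculation, its inverse matrix is

$$I_\beta^{-1} = (I^{ij})_{2g \times 2g}, \tag{6}$$

where

$$I^{ii} = \frac{\gamma_i(1 - \gamma_i)(2 - \gamma_i)}{2m_i\pi_i}, \qquad I^{i,g+i} = I^{g+i,i} = \frac{\gamma_i(1 - \gamma_i)}{2m_i}, \qquad I^{g+i,g+i} = \frac{\pi_i(1 + \gamma_i - 2\pi_i)}{2m_i}$$

for i = 1, …, g, and all other elements are 0.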
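Because the information matrix decouples across groups, its inverse can be checked one 2 × 2 block at a time. The sketch below assumes the expected information for (γi, πi) obtained by taking expectations of the negative second derivatives of the log-likelihood under the multinomial model, inverts the block by hand, and compares the result with the closed-form asymptotic variance γi(1 − γi)(2 − γi)/(2miπi) of the estimator of γi. The parameter values are hypothetical.

```python
from typing import Tuple


def info_block(m: int, pi: float, gamma: float) -> Tuple[float, float, float]:
    """Expected information (I_gg, I_gp, I_pp) for one group of size m."""
    p0 = 1 - 2 * pi + pi * gamma
    i_gg = m * (pi ** 2 / p0 + 2 * pi / (1 - gamma) + pi / gamma)
    i_gp = -m / p0
    i_pp = m * ((2 - gamma) ** 2 / p0 + (2 - gamma) / pi)
    return i_gg, i_gp, i_pp


def inverse_block(m: int, pi: float, gamma: float) -> Tuple[float, float, float]:
    """Inverse of the symmetric 2x2 block: (var_gamma, cov, var_pi)."""
    i_gg, i_gp, i_pp = info_block(m, pi, gamma)
    det = i_gg * i_pp - i_gp ** 2
    return i_pp / det, -i_gp / det, i_gg / det


# Hypothetical parameter values, for illustration only.
m, pi, gamma = 50, 0.4, 0.6
var_gamma, cov, var_pi = inverse_block(m, pi, gamma)
closed_form = gamma * (1 - gamma) * (2 - gamma) / (2 * m * pi)
```

The numerical inverse and the closed form agree up to floating-point error, which gives some reassurance that the diagonal elements used later in the Wald-type statistic are the right ones.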
Asymptotic test statistics
In this section, we propose three asymptotic tests for large samples based on the unconstrained and constrained MLEs.
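- (i). Likelihood ratio test. Let $\hat\pi_i$, $\hat\gamma_i$ be the unconstrained MLEs, and $\tilde\pi_i$, $\tilde\gamma$ be the constrained MLEs under H0. Denote $\hat\pi = (\hat\pi_1, \ldots, \hat\pi_g)$, $\hat\gamma = (\hat\gamma_1, \ldots, \hat\gamma_g)$, $\tilde\pi = (\tilde\pi_1, \ldots, \tilde\pi_g)$ and $\tilde\gamma = (\tilde\gamma, \ldots, \tilde\gamma)$. Given the observation m, the likelihood ratio statistic is given by

$$T_L = 2\{l(\hat\pi, \hat\gamma \mid m) - l(\tilde\pi, \tilde\gamma \mid m)\},$$

where l(π, γ|m) is defined in (2). From (4) and (5), and because the fitted probability of no response equals m0i/mi under both H0 and H1, the likelihood ratio statistic can be represented as

$$T_L = 2\sum_{i=1}^{g}\left[ m_{1i}\log\frac{\hat\pi_i(1 - \hat\gamma_i)}{\tilde\pi_i(1 - \tilde\gamma)} + m_{2i}\log\frac{\hat\pi_i\hat\gamma_i}{\tilde\pi_i\tilde\gamma} \right].$$

- (ii). Score test. Denote

$$U = \left(\frac{\partial l}{\partial \gamma_1}, \ldots, \frac{\partial l}{\partial \gamma_g}, 0, \ldots, 0\right).$$

Under H0, the score test statistic can be defined as

$$T_{SC} = U I_\beta^{-1} U^T \Big|_{\gamma_i = \tilde\gamma,\ \pi_i = \tilde\pi_i}.$$

A direct calculation shows that the simplified form of TSC is

$$T_{SC} = \sum_{i=1}^{g} \frac{\{2m_{2i} - \tilde\gamma(m_{1i} + 2m_{2i})\}^2}{2(m_{1i} + m_{2i})\tilde\gamma(1 - \tilde\gamma)}.$$

- (iii). Wald-type test. Let $\hat\beta = (\hat\gamma_1, \ldots, \hat\gamma_g, \hat\pi_1, \ldots, \hat\pi_g)$. The null hypothesis H0: γ1 = … = γg is equivalent to C βT = 0, where 0 is a zero vector, and $C = (C_1, 0_{(g-1) \times g})$ with

$$C_1 = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & -1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{pmatrix}_{(g-1) \times g}.$$

Hence, the Wald-type test statistic can be written as

$$T_W = (C\hat\beta^T)^T (C I_\beta^{-1} C^T)^{-1} (C\hat\beta^T) \Big|_{\beta = \hat\beta},$$

where $I_\beta^{-1}$ is defined in (6). Let

$$a_i = \frac{\hat\gamma_i(1 - \hat\gamma_i)(2 - \hat\gamma_i)}{2m_i\hat\pi_i}, \quad i = 1, \ldots, g, \tag{7}$$

be the ith diagonal element of $I_\beta^{-1}$ evaluated at $\hat\beta$.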
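For convenience, denote $A = C I_\beta^{-1} C^T$, evaluated at the unconstrained MLEs. Obviously, A is a symmetric tridiagonal matrix of order g − 1, with diagonal elements $a_j + a_{j+1}$ (j = 1, …, g − 1) and off-diagonal elements $-a_{j+1}$ (j = 1, …, g − 2), where ai denotes the ith diagonal element of the inverse information matrix, as in (7). Following [13], A−1 is also a symmetric matrix and is available in closed form. Substituting it into TW, we obtain the simplified form

$$T_W = \sum_{i=1}^{g} \frac{(\hat\gamma_i - \bar\gamma)^2}{a_i}, \qquad \bar\gamma = \frac{\sum_{i=1}^{g} \hat\gamma_i / a_i}{\sum_{i=1}^{g} 1/a_i}.$$

Next we provide the expressions of TW for g = 2, 3, 4. If g = 2, it follows that

$$T_W = \frac{(\hat\gamma_1 - \hat\gamma_2)^2}{a_1 + a_2}.$$

If g = 3, then

$$T_W = \frac{a_3(\hat\gamma_1 - \hat\gamma_2)^2 + a_2(\hat\gamma_1 - \hat\gamma_3)^2 + a_1(\hat\gamma_2 - \hat\gamma_3)^2}{a_1a_2 + a_1a_3 + a_2a_3}.$$

Denote $D = a_1a_2a_3 + a_1a_2a_4 + a_1a_3a_4 + a_2a_3a_4$. If g = 4, then

$$T_W = \frac{1}{D}\sum_{1 \le i < j \le 4} \Big(\prod_{k \ne i, j} a_k\Big)(\hat\gamma_i - \hat\gamma_j)^2,$$

where ai is defined in (7).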
Under H0, the test statistic Tl (= TL, TSC or TW) has an asymptotic chi-square distribution with g − 1 degrees of freedom. Let $\chi^2_{g-1,1-\alpha}$ be the (1 − α)th quantile of the chi-square distribution with g − 1 degrees of freedom. Given the nominal level α, the null hypothesis H0 will be rejected if the value of Tl is larger than $\chi^2_{g-1,1-\alpha}$.
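The three asymptotic tests can be sketched in a few lines. The sketch below is a hedged illustration, not the paper's implementation: TL is computed directly as twice the log-likelihood difference, TSC uses a compact form obtained by evaluating the score at the constrained MLEs, and TW uses the weighted-sum form, which is algebraically equivalent to the quadratic form in adjacent differences when the covariance matrix of the estimated γi is diagonal. These compact forms, the helper `chi2_sf` (an upper-tail chi-square probability built from the incomplete-gamma recurrence) and the counts are assumptions of this sketch. Zero cells that push an estimated γ to 0 or 1 are not handled.

```python
import math
from typing import List, Sequence, Tuple

Group = Tuple[int, int, int]  # (m0, m1, m2) for one group


def chi2_sf(x: float, df: int) -> float:
    """Upper tail Pr(chi2_df >= x), via Q(a+1, h) = Q(a, h) + h^a e^-h / Gamma(a+1)."""
    if x <= 0:
        return 1.0
    h = x / 2
    a = 0.5 if df % 2 else 1.0
    q = math.erfc(math.sqrt(h)) if df % 2 else math.exp(-h)
    while a < df / 2:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q


def loglik(groups: Sequence[Group], pis: Sequence[float], gammas: Sequence[float]) -> float:
    """Log-likelihood of Dallal's model without the constant log C."""
    total = 0.0
    for counts, p, g in zip(groups, pis, gammas):
        probs = (1 - 2 * p + p * g, 2 * p * (1 - g), p * g)
        total += sum(c * math.log(q) for c, q in zip(counts, probs) if c)
    return total


def three_statistics(groups: Sequence[Group]) -> Tuple[float, float, float]:
    """Return (T_L, T_SC, T_W); assumes 0 < gamma_hat_i < 1 for every group."""
    s1 = sum(g[1] for g in groups)
    s2 = sum(g[2] for g in groups)
    g_t = 2 * s2 / (s1 + 2 * s2)                      # constrained gamma
    p_t = [(m1 + m2) / ((m0 + m1 + m2) * (2 - g_t)) for m0, m1, m2 in groups]
    p_h = [(m1 + 2 * m2) / (2 * (m0 + m1 + m2)) for m0, m1, m2 in groups]
    g_h = [2 * m2 / (m1 + 2 * m2) for _, m1, m2 in groups]

    t_l = 2 * (loglik(groups, p_h, g_h) - loglik(groups, p_t, [g_t] * len(groups)))
    t_sc = sum((2 * m2 - g_t * (m1 + 2 * m2)) ** 2 / (2 * (m1 + m2) * g_t * (1 - g_t))
               for _, m1, m2 in groups)
    # a_i: estimated asymptotic variance of gamma_hat_i
    a: List[float] = [g * (1 - g) * (2 - g) / (2 * sum(grp) * p)
                      for grp, p, g in zip(groups, p_h, g_h)]
    g_bar = sum(g / ai for g, ai in zip(g_h, a)) / sum(1 / ai for ai in a)
    t_w = sum((g - g_bar) ** 2 / ai for g, ai in zip(g_h, a))
    return t_l, t_sc, t_w


# Hypothetical counts, for illustration only (not data from the paper).
groups = [(10, 4, 6), (12, 2, 6)]
t_l, t_sc, t_w = three_statistics(groups)
p_values = [chi2_sf(t, len(groups) - 1) for t in (t_l, t_sc, t_w)]
```

Each p-value is the upper-tail probability of the chi-square distribution with g − 1 degrees of freedom, so H0 is rejected at level α when the p-value falls below α.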
Exact methods
Given the observed data m = (m1, …, mg), let Tl(m) be the value of the aforementioned statistic Tl (l = L, SC, W). The asymptotic (A) p-values of these statistics are defined by

$$p^A_l(m^*) = \Pr\{\chi^2_{g-1} \ge T_l(m^*)\}, \quad l = L, SC, W, \tag{8}$$

where m* is the observed value of m. For convenience, we refer to $p^A_L$, $p^A_{SC}$ and $p^A_W$ as the “A approach” based on statistics TL, TSC and TW. Asymptotic tests work well when the sample size is large. However, they have limitations if the sample size is relatively small. Several exact conditional and unconditional methods are therefore proposed for small samples based on these statistics.
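An exact conditional method is introduced under the assumption that all of mi (i = 1, …, g) and Sl (l = 0, 1, 2) are fixed in Table 1. Thus, the cell values of the table follow a (multivariate) hypergeometric distribution. Define the tail area of statistics TL, TSC and TW as

$$\Psi_l(m^*) = \{m \in \Omega(m^*): T_l(m) \ge T_l(m^*)\}, \quad l = L, SC, W,$$

where $\Omega(m^*)$ is the set of all tables with the same margins mi and Sl as m*. According to the tail area Ψl(m*), the exact conditional (C) p-values can be calculated by

$$p^C_l(m^*) = \sum_{m \in \Psi_l(m^*)} \frac{\prod_{i=1}^{g} m_i! \, \prod_{k=0}^{2} S_k!}{N! \, \prod_{i=1}^{g} \prod_{k=0}^{2} m_{ki}!}. \tag{9}$$

Here, $p^C_L$, $p^C_{SC}$ and $p^C_W$ are described as the “C approach” based on statistics TL, TSC and TW.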
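For two groups the conditional enumeration is small enough to write out in full. The following sketch fixes the group sizes and the totals S0, S1, S2, enumerates every table with those margins, and sums the hypergeometric probabilities of the tables whose score statistic is at least as large as the observed one. The compact form of the score statistic, the convention that it equals 0 when S1 = 0 or S2 = 0, the small numerical tolerance in the comparison, and the counts are assumptions made for illustration.

```python
from math import comb
from typing import Sequence, Tuple

Group = Tuple[int, int, int]  # (m0, m1, m2) for one group


def score_stat(groups: Sequence[Group]) -> float:
    """Score statistic; set to 0 when gamma_tilde is 0 or 1 (convention of this sketch)."""
    s1 = sum(g[1] for g in groups)
    s2 = sum(g[2] for g in groups)
    if s1 == 0 or s2 == 0:
        return 0.0
    g_t = 2 * s2 / (s1 + 2 * s2)
    total = 0.0
    for _, m1, m2 in groups:
        if m1 + m2:  # groups without responders contribute nothing
            total += (2 * m2 - g_t * (m1 + 2 * m2)) ** 2 / (2 * (m1 + m2) * g_t * (1 - g_t))
    return total


def conditional_pvalue(g1: Group, g2: Group) -> float:
    """Exact conditional p-value for g = 2 with all margins fixed."""
    n1 = sum(g1)
    s = [a + b for a, b in zip(g1, g2)]  # column totals S0, S1, S2
    n = sum(s)
    t_obs = score_stat([g1, g2])
    p = 0.0
    for a0 in range(min(s[0], n1) + 1):
        for a1 in range(min(s[1], n1 - a0) + 1):
            a2 = n1 - a0 - a1
            if a2 > s[2]:
                continue
            first = (a0, a1, a2)
            second = (s[0] - a0, s[1] - a1, s[2] - a2)
            if score_stat([first, second]) >= t_obs - 1e-12:
                # multivariate hypergeometric probability of this table
                p += comb(s[0], a0) * comb(s[1], a1) * comb(s[2], a2) / comb(n, n1)
    return p


# Hypothetical counts, for illustration only (not data from the paper).
p_c = conditional_pvalue((10, 4, 6), (12, 2, 6))
```

With more than two groups the same idea applies, but the enumeration of tables with fixed margins needs a recursive or network-type algorithm rather than two nested loops.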
Another exact p-value is from Basu’s maximization approach [14]. It is obtained by maximizing the tail probability over all nuisance parameters, instead of evaluating it at the constrained MLEs under H0. In this case, we define the tail area of statistic Tl (l = L, SC, W) as

$$\Psi_l(m^*) = \{m: T_l(m) \ge T_l(m^*)\}$$

for a given table m*. Denote Θ = {π: πi ∈ [0, 1], i = 1, …, g} and

$$\Gamma = \{\gamma: \gamma_1 = \cdots = \gamma_g \in [0, 1]\},$$

where π = (π1, …, πg) and γ = (γ1, …, γg). Hence, under the maximization (M) method, the three exact unconditional p-values are given by

$$p^M_l(m^*) = \sup_{\pi \in \Theta,\ \gamma \in \Gamma} \ \sum_{m \in \Psi_l(m^*)} L(\pi, \gamma \mid m), \quad l = L, SC, W, \tag{10}$$

where L(π, γ|m) = exp(l(π, γ|m)) and l(π, γ|m) is defined in (2). Corresponding to the “A approach” and “C approach”, $p^M_L$, $p^M_{SC}$ and $p^M_W$ are called the “M approach” based on TL, TSC and TW.
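The maximization step can be approximated by a grid search, as in the following sketch for two small groups. It is an assumption-laden illustration rather than the procedure used in the paper: the supremum is replaced by a maximum over a coarse grid of the common γ and of π1, π2 (each restricted to πi ≤ 1/(2 − γ) so that all cell probabilities are valid), the statistic is the score statistic in a compact form, and the statistic is set to 0 for tables in which it is undefined. The counts are hypothetical.

```python
from math import factorial
from typing import List, Sequence, Tuple

Group = Tuple[int, int, int]  # (m0, m1, m2) for one group


def score_stat(groups: Sequence[Group]) -> float:
    """Score statistic; set to 0 when gamma_tilde is 0 or 1 (convention of this sketch)."""
    s1 = sum(g[1] for g in groups)
    s2 = sum(g[2] for g in groups)
    if s1 == 0 or s2 == 0:
        return 0.0
    g_t = 2 * s2 / (s1 + 2 * s2)
    return sum((2 * m2 - g_t * (m1 + 2 * m2)) ** 2 / (2 * (m1 + m2) * g_t * (1 - g_t))
               for _, m1, m2 in groups if m1 + m2)


def group_tables(m: int) -> List[Group]:
    """All (m0, m1, m2) with m0 + m1 + m2 = m."""
    return [(m - a - b, a, b) for a in range(m + 1) for b in range(m + 1 - a)]


def pmf(t: Group, pi: float, gamma: float) -> float:
    """Multinomial probability of one group's counts under Dallal's model."""
    m0, m1, m2 = t
    p0 = max(1 - 2 * pi + pi * gamma, 0.0)  # guard against rounding below 0
    coef = factorial(m0 + m1 + m2) // (factorial(m0) * factorial(m1) * factorial(m2))
    return coef * p0 ** m0 * (2 * pi * (1 - gamma)) ** m1 * (pi * gamma) ** m2


def m_pvalue(obs: Sequence[Group], steps: int = 10) -> float:
    """Grid approximation to the maximized tail probability for g = 2."""
    t_obs = score_stat(obs)
    tabs = [group_tables(sum(g)) for g in obs]
    tail = [(i, j) for i, a in enumerate(tabs[0]) for j, b in enumerate(tabs[1])
            if score_stat([a, b]) >= t_obs - 1e-12]
    best = 0.0
    for k in range(steps + 1):
        gamma = k / steps
        grid = [j / steps / (2 - gamma) for j in range(steps + 1)]
        f = [[[pmf(t, pi, gamma) for t in tab] for pi in grid] for tab in tabs]
        for u in f[0]:
            for v in f[1]:
                best = max(best, sum(u[i] * v[j] for i, j in tail))
    return best


# Hypothetical counts, for illustration only (not data from the paper).
p_m = m_pvalue([(2, 2, 1), (1, 1, 3)])
```

A finer grid, or a numerical optimizer started from the best grid point, would approximate the supremum more closely at a higher computational cost.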
Numerical studies
In this section, we investigate the performance of the proposed asymptotic and exact tests in terms of TIEs and powers under different parameter settings.
We first compare the asymptotic methods TL, TSC and TW in terms of empirical TIEs. Let g = 2, 3, 4, π = 0.3: 0.02: 0.5, γ = 0.3: 0.02: 0.8 and m = m1 = ⋯ = mg = 15, 50, 100. Here, a: b: c denotes the grid from a to c in steps of b. For each parameter setting, 10,000 samples are randomly generated under the null hypothesis H0. Given the nominal level α = 0.05, the empirical TIE is the proportion of samples in which H0 is rejected, i.e., the number of rejections/10,000. Figs 1, 2 and 3 show the surfaces of empirical TIEs for all the tests under πi = π and γi = γ (i = 1, 2, …, g; g = 2, 3, 4). According to Tang et al. [3], a test is liberal if its empirical TIE is greater than 0.06, conservative if it is less than 0.04, and robust otherwise. We observe that the score test is more robust than the other tests, since its TIEs are closer to the pre-determined level α = 0.05. All the tests work well for larger sample sizes. However, the likelihood ratio and Wald-type tests have inflated TIEs and are especially liberal when the sample size is small; some of their TIEs fall below 0.04 or above 0.06.
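The empirical TIE computation described above can be sketched as a small Monte Carlo loop. The sketch makes several assumptions that are not taken from the paper: it uses only the score statistic in a compact form, draws multinomial counts with the standard library generator, uses 2,000 replications instead of 10,000 to keep the run short, sets the statistic to 0 when it is undefined, and hard-codes the 0.95 chi-square quantiles for 1 to 3 degrees of freedom.

```python
import random
from typing import Sequence, Tuple

Group = Tuple[int, int, int]  # (m0, m1, m2) for one group

CRIT_95 = {1: 3.841, 2: 5.991, 3: 7.815}  # chi-square 0.95 quantiles by df


def score_stat(groups: Sequence[Group]) -> float:
    """Score statistic; set to 0 when gamma_tilde is 0 or 1 (convention of this sketch)."""
    s1 = sum(g[1] for g in groups)
    s2 = sum(g[2] for g in groups)
    if s1 == 0 or s2 == 0:
        return 0.0
    g_t = 2 * s2 / (s1 + 2 * s2)
    return sum((2 * m2 - g_t * (m1 + 2 * m2)) ** 2 / (2 * (m1 + m2) * g_t * (1 - g_t))
               for _, m1, m2 in groups if m1 + m2)


def draw_group(rng: random.Random, m: int, pi: float, gamma: float) -> Group:
    """Draw (m0, m1, m2) for m patients under Dallal's model."""
    p0 = 1 - 2 * pi + pi * gamma
    p1 = 2 * pi * (1 - gamma)
    counts = [0, 0, 0]
    for _ in range(m):
        u = rng.random()
        counts[0 if u < p0 else 1 if u < p0 + p1 else 2] += 1
    return (counts[0], counts[1], counts[2])


def empirical_tie(g: int, m: int, pi: float, gamma: float,
                  reps: int = 2000, seed: int = 1) -> float:
    """Proportion of simulated null samples in which the score test rejects H0."""
    rng = random.Random(seed)
    crit = CRIT_95[g - 1]
    rejections = 0
    for _ in range(reps):
        sample = [draw_group(rng, m, pi, gamma) for _ in range(g)]
        rejections += score_stat(sample) > crit
    return rejections / reps


tie = empirical_tie(g=2, m=50, pi=0.4, gamma=0.5)
```

Repeating the call over a grid of π and γ values gives the kind of TIE surface summarized in the figures, and the result can be classified as liberal, robust or conservative with the 0.04 and 0.06 thresholds.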
Next we calculate the empirical powers of these tests under the following parameter settings for m = 15, 50, 100: (i) g = 2, π = (0.2, 0.3), γ1 = 0.2: 0.05: 0.95, γ2 = 0.1; (ii) g = 3, π = (0.2, 0.3, 0.3), γ1 = 0.2: 0.05: 0.95, γ2 = γ3 = 0.1; and (iii) g = 4, π = (0.2, 0.3, 0.3, 0.3), γ1 = 0.2: 0.05: 0.95, γ2 = γ3 = γ4 = 0.1. For each parameter setting, we randomly generate 10,000 samples under the alternative hypothesis H1. The empirical power is the proportion of samples in which H0 is rejected. Fig 4 shows the empirical powers of the three proposed tests for g = 2, 3, 4. The power increases as the sample size or the number of groups increases. In particular, the powers of all the tests are very close when m = 50, 100. However, there are some differences between the tests for smaller samples: the Wald-type test has higher power and the likelihood ratio test has lower power.
Considering the limitations of asymptotic methods, we analyse the A, C and M approaches for small samples. Instead of drawing 10,000 random samples as for the asymptotic tests, we enumerate all possible tables. For m = 10 and g = 2, 3, there are 4,356 and 287,492 tables in total, respectively. The TIEs and powers are obtained for m = m1 = ⋯ = mg = 10 and g = 2, 3 over the grid π = 0: 0.04: 1, γ = 0: 0.04: 1, restricted to values satisfying 0 ≤ pli ≤ 1 (l = 0, 1, 2; i = 1, …, g). The probability of each table is calculated from the log-likelihood (2), and the null hypothesis H0 is rejected at the nominal level α = 0.05 if the p-value is less than 0.05. Figs 5 and 6 show the TIE surfaces of all the exact methods for πi = π and γi = γ (i = 1, …, g; g = 2, 3). We observe that the A approach is closer to the pre-specified nominal level α = 0.05 for m = 10 and g = 2, 3. However,
and
have the inflated TIEs. For C approach,
is better than
and
since they have the inflated TIEs. The M approaches
and
can produce satisfactory TIEs.
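For small samples the TIE of a test can be computed exactly by summing the probabilities of all rejected tables rather than by simulation. The sketch below counts the possible tables and evaluates the exact TIE of the asymptotic score test for two balanced groups. It is illustrative only: the compact score statistic, the convention that the statistic is 0 when undefined, the fixed critical value 3.841 for one degree of freedom, and the parameter values are assumptions of the sketch.

```python
from math import comb, factorial
from typing import List, Sequence, Tuple

Group = Tuple[int, int, int]  # (m0, m1, m2) for one group


def n_tables(m: int, g: int) -> int:
    """Number of tables when each of g groups splits m patients into 3 cells."""
    return comb(m + 2, 2) ** g


def group_tables(m: int) -> List[Group]:
    """All (m0, m1, m2) with m0 + m1 + m2 = m."""
    return [(m - a - b, a, b) for a in range(m + 1) for b in range(m + 1 - a)]


def pmf(t: Group, pi: float, gamma: float) -> float:
    """Multinomial probability of one group's counts under Dallal's model."""
    m0, m1, m2 = t
    p0 = max(1 - 2 * pi + pi * gamma, 0.0)  # guard against rounding below 0
    coef = factorial(m0 + m1 + m2) // (factorial(m0) * factorial(m1) * factorial(m2))
    return coef * p0 ** m0 * (2 * pi * (1 - gamma)) ** m1 * (pi * gamma) ** m2


def score_stat(groups: Sequence[Group]) -> float:
    """Score statistic; set to 0 when gamma_tilde is 0 or 1 (convention of this sketch)."""
    s1 = sum(g[1] for g in groups)
    s2 = sum(g[2] for g in groups)
    if s1 == 0 or s2 == 0:
        return 0.0
    g_t = 2 * s2 / (s1 + 2 * s2)
    return sum((2 * m2 - g_t * (m1 + 2 * m2)) ** 2 / (2 * (m1 + m2) * g_t * (1 - g_t))
               for _, m1, m2 in groups if m1 + m2)


def exact_tie(m: int, pi: float, gamma: float, crit: float = 3.841) -> float:
    """Exact rejection probability of the asymptotic score test for g = 2 under H0."""
    tabs = group_tables(m)
    probs = [pmf(t, pi, gamma) for t in tabs]
    return sum(pa * pb
               for a, pa in zip(tabs, probs)
               for b, pb in zip(tabs, probs)
               if score_stat([a, b]) > crit)


count_g2 = n_tables(10, 2)
tie = exact_tie(m=10, pi=0.4, gamma=0.5)
```

Replacing the comparison against the critical value by a comparison of an exact p-value with α gives the corresponding TIE of a C or M procedure, at the cost of computing that p-value for every table.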
Fig 7 provides the powers of exact methods according to parameter settings for m = 10: (i) g = 2, π = (0.2, 0.3), γ1 = 0.2: 0.05: 0.9 and γ2 = 0.1, and (ii) g = 3, π = (0.2, 0.3, 0.3), γ1 = 0.2: 0.05: 0.9 and γ2 = γ3 = 0.1. We observe that the powers will increase when m or γ1 increases under other fixed parameters. The powers of A, C and M approaches are relatively close based on statistics Tl(l = L, SC).
Note that all parameter settings for the asymptotic and exact methods above are balanced designs, that is, m = m1 = ⋯ = mg. Unbalanced designs are illustrated through the examples in the next section.
Real examples
In this section, two real examples with unbalanced designs are provided to illustrate our proposed methods at the nominal level α = 0.05. We first show an example with large samples based on asymptotic test statistics.
Example 1 [15] There were 216 patients aged 20-39 with retinitis pigmentosa (RP) at the Massachusetts Eye and Ear Infirmary. They were divided into four genetic groups (Table 2): autosomal dominant RP (DOM), autosomal recessive RP (AR), sex-linked RP (SL) and isolate RP (ISO).
Let mli be the number of patients with l (l = 0, 1, 2) affected eyes in the ith (i = 1, 2, 3, 4) group. Under Dallal’s model, we are interested in testing whether the correlations of these four groups are equal, i.e., H0: γ1 = γ2 = γ3 = γ4. Table 3 provides the results of statistics, p-values and constrained MLEs. Moreover, the unconstrained MLEs
and
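. Given the nominal level α = 0.05, TL, TW and TSC are all smaller than the critical value $\chi^2_{3,0.95}$ = 7.81, and the p-values are greater than 0.05. Thus, there is no evidence to reject H0. That is to say, the data are consistent with a common correlation in the four groups, estimated by the constrained MLE as 0.8246.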
For small sample case, we provide another example to compare the effectiveness of asymptotic and exact methods.
Example 2 [16] A double-blind clinical trial was conducted to study amoxicillin treatment of acute otitis media with effusion (OME) in twenty-four children at 14 days. Each child had no, unilateral or bilateral OME, and the children were divided into three groups according to age: <2, 2-5 and ≥6 years (Table 4). Denote m* = (2, 2, 11, 5, 1, 3, 6, 0, 7). Next we apply asymptotic and exact methods to test H0: γ1 = γ2 = γ3 = γ.
Through calculating (4) and (5), the unconstrained MLEs ,
, and the constrained MLEs
,
under H0. Then, TL(m*) = 0.2445, TSC(m*) = 0.2525 and TW(m*) = 0.2285. Table 5 provides the comparison of asymptotic and exact methods. The results show that there is no significant difference among the correlations of the three groups, regardless of the approach used.
Conclusions
In this paper, we propose asymptotic statistics and exact procedures to test whether the correlations of multiple bilateral data are equal under Dallal’s model. The three asymptotic test statistics for large samples are the likelihood ratio TL, score TSC and Wald-type TW statistics. Explicit expressions of these statistics are obtained, and their asymptotic p-values are referred to as the A approach. For small samples, six exact methods are derived based on the statistics TL, TSC and TW: three conditional exact (C) procedures and three unconditional exact (M) approaches.
Numerical studies are conducted to investigate the performance of the asymptotic and exact methods in terms of TIEs and powers. When the sample size is larger, the empirical TIEs and powers of TL, TSC and TW are close to each other. In general, the score test TSC is more robust than the other two tests. However, these tests, the Wald-type test in particular, may produce unacceptable TIEs when the sample size is smaller. The results are similar to those under Rosner’s and Donner’s models; see Ma et al. [6] and Liu et al. [10]. For small samples, we obtain TIE surfaces and power curves of the exact C and M approaches with two and three groups, and compare them with the A approach. As for TIEs, the A approaches and
are liberal, and
is close to the nominal level 0.05 under different parameter configurations. The C approaches
and
tend to be more inflated than
. The M approach
is better than
and
. On the other hand, the powers of exact methods are very close based on likelihood ratio TL and score TSC. For C approach,
has higher power, while
has lower power. Moreover,
has higher power, but
has lower power in M approach.
The ideas behind these asymptotic and exact methods can be extended to other data structures with large or small samples, such as crash data. For example, Zeng et al. [17–19] proposed several models for the analysis of crash rates by injury severity. Dong et al. [20] introduced a mixed logit model to investigate the difference between single- and multi-vehicle accident probabilities. Chen et al. [21–23] analyzed unbalanced panel data models using real-time environmental and traffic big data. We leave these problems for future research.
Acknowledgments
The authors thank the editor and referees for constructive comments that helped improve the manuscript.
References
- 1. Rosner B. Statistical methods in ophthalmology: an adjustment for the intraclass correlation between eyes. Biometrics 1982 38: 105–114. pmid:7082754
- 2. Donner A. Statistical methods in ophthalmology: an adjusted chi-square approach. Biometrics 1989 45:605–611. pmid:2765640
- 3. Tang ML, Tang NS, Rosner B. Statistical inference for correlated data in ophthalmologic studies. Statistics in Medicine 2006 25:2771–2783. pmid:16381067
- 4. Qiu SF, Tang NS, Tang ML, Pei YB. Sample size for testing difference between two proportions for the bilateral-sample design. J Biopharm Stat. 2009 19:857–871. pmid:20183448
- 5. Shan G, Ma CX. Exact methods for testing the equality of proportions for binary clustered data from otolaryngologic studies. Stat Biopharm Res. 2014 6(1):115–122.
- 6. Ma CX, Shan G, Liu S. Homogeneity test for binary correlated data. PLoS One 2015 10: e0124337. pmid:25897962
- 7. Peng X, Liu C, Liu S, Ma CX. Asymptotic confidence interval construction for proportion ratio based on correlated paired data. J Biopharm Stat. 2019; 29(6):1137–1152. pmid:30831053
- 8. Pei YB, Tang ML, Wong WK, Guo JH. Confidence intervals for correlated proportion differences from paired data in a two-arm randomised clinical trial. Stat Meth Med Res. 2012 21(2):167–187.
- 9. Pei YB, Tang ML, Wong WK, Tang NS. Testing equality of correlations of two paired binary responses from two treated groups in a randomized trial. Stat Biopharm Res. 2011 21:511–525. pmid:21442523
- 10. Liu XB, Liu S, Ma CX. Testing equality of correlation coefficients for paired binary data from multiple groups. Journal of Statistical Computation and Simulation 2016 86(9):1686–1696.
- 11. Liu XB, Yang ZY, Liu S, Ma CX. Exact methods of testing the homogeneity of prevalences for correlated binary data. Journal of Statistical Computation and Simulation 2017 87(15):3021–3039.
- 12. Dallal GE. Paired Bernoulli trials. Biometrics 1988 44(1):253–257. pmid:3358992
- 13. Meurant G. A review on the inverse of symmetric tridiagonal and block tridiagonal matrices. SIAM J Matrix Anal Appl. 1992 13(3):707–728.
- 14. Basu D. On the elimination of nuisance parameters. J. American Statistical Association 1977 72(358):355–366.
- 15. Berson EL, Rosner B, and Simonoff E. An outpatient population of retinitis pigmentosa and their normal relatives: Risk factors for genetic typing and detection derived from their ocular examinations. American Journal of Ophthalmology 1980 89:763–775.
- 16. Mandel EM, Bluestone CD, Rockette HE, Blatter MM, Reisinger KS, Wucher FP, et al. Duration of effusion after antibiotic treatment for acute otitis media: comparison of cefaclor and amoxicillin. Pediatric Infectious Disease 1982 1:310–316. pmid:6760146
- 17. Zeng Q, Wen HY, Huang HL, Pei X, Wong SC. A multivariate random parameters Tobit model for analyzing highway crash rate by injury severity. Accident Analysis and Prevention 2017 99:184–191. pmid:27914307
- 18. Zeng Q, Guo Q, Wong SC, Wen HY, Huang HL, Pei X. Jointly modeling area-level crash rates by severity: A Bayesian multivariate random-parameters spatio-temporal Tobit regression. Transportmetrica A: Transport Science 2019 15(2):1867–1884.
- 19. Zeng Q, Wen HY, Wong SC, Huang HL, Guo Q, Pei X. Spatial joint analysis for zonal daytime and nighttime crash frequencies using a Bayesian bivariate conditional autoregressive model. Journal of Transportation Safety and Security 2020 12(4):566–585.
- 20. Dong BW, Ma XX, Chen F, Chen S. Investigating the differences of single- and multi-vehicle accident probability using mixed logit model. Journal of Advanced Transportation, 2018:1–9.
- 21. Chen F, Chen S, Ma XX. Crash frequency modeling using real-time environmental and traffic data and unbalanced panel data models. International Journal of Environmental Research and Public Health 2016 13:1–16. pmid:27322306
- 22. Chen F, Chen S, Ma XX. Analysis of hourly crash likelihood using unbalanced panel data mixed logit model and real-time driving environmental big data. Journal of Safety Research 2018 65: 153–159. pmid:29776524
- 23. Chen F, Song MT, Ma XX. Investigation on the injury severity of drivers in rear-end collisions between cars using a random parameters bivariate ordered probit model. International Journal of Environmental Research and Public Health 2019 16(14):1–12. pmid:31340600