Statistical tests under Dallal’s model: Asymptotic and exact methods

  • Zhiming Li,

    Roles Funding acquisition, Methodology, Software, Writing – original draft

    Affiliation College of Mathematics and System Science, Xinjiang University, Urumqi, China

  • Changxing Ma,

    Roles Writing – original draft, Writing – review & editing

    cxma@buffalo.edu

    Affiliation Department of Biostatistics, University at Buffalo, Buffalo, NY, United States of America

  • Mingyao Ai

    Roles Writing – review & editing

    Affiliation LMAM, School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing, China

Abstract

This paper proposes asymptotic and exact methods for testing the equality of correlations for multiple bilateral data under Dallal’s model. Three asymptotic test statistics are derived for large samples. Since they are not reliable for small samples, several conditional and unconditional exact methods are proposed based on these three statistics. Numerical studies are conducted to compare all these methods with regard to type I error rates (TIEs) and powers. The results show that the asymptotic score test is the most robust, and two exact tests have satisfactory TIEs and powers. Real examples are provided to illustrate the effectiveness of these tests.

Introduction

In clinical medicine, we often encounter bilateral data taken from paired organs of patients, such as eyes and ears. For the same patient, the intraclass correlation between responses of the paired parts should be taken into account to avoid misleading results. Various models have been proposed to analyze such data. For example, Rosner [1] introduced a positive constant R as a measure of dependency by assuming that the probability of a response at one side of the paired body, given a response at the other side, is R times the response rate. Donner [2] provided an alternative approach and considered a common correlation coefficient in each of two groups. Under these two models, asymptotic and exact methods have been studied for many years and significant progress has been made.

Under Rosner’s model, Tang et al. [3] developed exact and approximate procedures for cases where the sample size is small or the data structure is sparse. Qiu et al. [4] derived sample size formulas for testing the difference between two proportions. Shan and Ma [5], and Ma et al. [6] presented several asymptotic and exact methods to investigate the equality of proportions. Peng et al. [7] constructed asymptotic confidence intervals (CIs) of the proportion ratio for correlated paired data. Under Donner’s model, Pei et al. [8, 9] used asymptotic methods to analyze test statistics and CIs in two treated groups. Liu et al. [10, 11] provided exact methods to test the homogeneity of prevalence from multiple groups. Generally, asymptotic methods produce empirical type I error rates (TIEs) close to the pre-specified nominal level for large samples. However, they may yield inflated TIEs for small samples. Exact tests are therefore an alternative for dealing with this problem.

Dallal [12] pointed out that Rosner’s model may fit poorly if the characteristic is almost certain to occur bilaterally with widely varying group-specific prevalence. He instead assumed that the probability of a response at one organ, given a response at the other organ, does not depend on the response probability itself, and introduced a likelihood ratio test for large samples. However, this approach controls the TIE poorly in small samples. To date, statistical inference under Dallal’s model, both asymptotic and exact, has received little attention. This paper proposes asymptotic and exact methods for testing the homogeneity of correlations among multiple bilateral data under Dallal’s model.

The remainder of the work is organized as follows. In Section 2, we review the bilateral data structure and introduce Dallal’s model. The maximum likelihood estimates (MLEs) are derived under the different hypotheses. Three asymptotic statistics and six exact procedures are proposed in Section 3. In Section 4, numerical studies are conducted to compare these methods in terms of TIEs and power. Two examples are provided in Section 5 to illustrate the proposed approaches. Conclusions are given in Section 6.

Dallal’s model and estimators

Suppose that N patients are randomly allocated into g groups, with mi patients in the ith (i = 1, …, g) group. Let mli be the number of patients who have l (l = 0, 1, 2) organ(s) with improvement response(s) in the ith (i = 1, …, g) group, and let Sl be the total number of patients with l (l = 0, 1, 2) response(s). Obviously, $\sum_{i=1}^{g} m_{li} = S_l$ and $\sum_{l=0}^{2} S_l = N$. The data structure is shown in Table 1. Let pli be the probability that a patient has l (l = 0, 1, 2) response(s) in the ith (i = 1, …, g) group. The vector mi ≜ (m0i, m1i, m2i)T follows a multinomial distribution M(mi; p0i, p1i, p2i). The probability density satisfies

$$f(m_{0i}, m_{1i}, m_{2i}) = \frac{m_i!}{m_{0i}!\,m_{1i}!\,m_{2i}!}\; p_{0i}^{m_{0i}}\, p_{1i}^{m_{1i}}\, p_{2i}^{m_{2i}}.$$

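Let Zijk = 1 if the kth organ of the jth patient in the ith group has an improvement response, and Zijk = 0 otherwise, for k = 1, 2, i = 1, 2, …, g, and j = 1, 2, …, mi. Under Dallal’s model, we assume

$$\Pr(Z_{ijk} = 1) = \pi_i, \qquad \Pr(Z_{ijk} = 1 \mid Z_{ij,3-k} = 1) = \gamma_i, \tag{1}$$

where 0 ≤ πi, γi ≤ 1. In particular, γi = πi means that the two organ responses of each patient are completely independent, and γi = 1 means that they are completely dependent in the ith group. By Eq (1), the probabilities of no, one or both response(s) are

$$p_{0i} = 1 - 2\pi_i + \pi_i\gamma_i, \qquad p_{1i} = 2\pi_i(1 - \gamma_i), \qquad p_{2i} = \pi_i\gamma_i,$$

where 0 ≤ pli ≤ 1, p0i + p1i + p2i = 1, and $\pi_i \le 1/(2 - \gamma_i)$ so that p0i ≥ 0. In this work, we are interested in testing whether the correlations of the g groups are identical. Thus, the hypotheses are given by

$$H_0: \gamma_1 = \cdots = \gamma_g \triangleq \gamma \quad \text{versus} \quad H_1: \gamma_1, \ldots, \gamma_g \text{ are not all equal}.$$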
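To make the parameterization concrete, the following minimal sketch maps a pair (π, γ) for one group to the three cell probabilities implied by Dallal’s model, namely p0 = 1 − 2π + πγ, p1 = 2π(1 − γ) and p2 = πγ. The function name `cell_probs` is a hypothetical helper introduced only for illustration; it is not part of the original work.

```python
def cell_probs(pi, gamma):
    """Return (p0, p1, p2) for one group given marginal rate pi and conditional rate gamma.

    Hypothetical helper: the name and interface are illustrative assumptions.
    """
    p0 = 1.0 - 2.0 * pi + pi * gamma   # neither organ responds
    p1 = 2.0 * pi * (1.0 - gamma)      # exactly one organ responds
    p2 = pi * gamma                    # both organs respond
    if min(p0, p1, p2) < 0.0:
        raise ValueError("(pi, gamma) lies outside the admissible region")
    return p0, p1, p2


# gamma == pi: independent organs, so the cells are Binomial(2, pi) probabilities.
independent = cell_probs(0.3, 0.3)
# gamma == 1: complete dependence, so a unilateral response is impossible.
dependent = cell_probs(0.3, 1.0)
```

The two calls correspond to the special cases mentioned in the text: γ = π reproduces the independent (binomial) cell probabilities, and γ = 1 forces the one-response probability to zero.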

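Denote m = (m1, …, mg), π = (π1, …, πg) and γ = (γ1, …, γg). Given the observation m, the log-likelihood function is

$$l(\pi, \gamma \mid m) = \sum_{i=1}^{g}\Bigl[ m_{0i}\log(1 - 2\pi_i + \pi_i\gamma_i) + m_{1i}\log\bigl(2\pi_i(1-\gamma_i)\bigr) + m_{2i}\log(\pi_i\gamma_i)\Bigr] + \log C, \tag{2}$$

where $C = \prod_{i=1}^{g} m_i!/(m_{0i}!\,m_{1i}!\,m_{2i}!)$ does not depend on the parameters. Let $\hat\pi_i$ and $\hat\gamma_i$ be the unconstrained MLEs of πi and γi under H1. Differentiating (2) with respect to πi and γi and setting the derivatives to 0, the MLEs $\hat\pi_i$ and $\hat\gamma_i$ are the solutions of the following equations

$$\frac{\partial l}{\partial \pi_i} = \frac{m_{0i}(\gamma_i - 2)}{1 - 2\pi_i + \pi_i\gamma_i} + \frac{m_{1i} + m_{2i}}{\pi_i} = 0, \qquad \frac{\partial l}{\partial \gamma_i} = \frac{m_{0i}\pi_i}{1 - 2\pi_i + \pi_i\gamma_i} - \frac{m_{1i}}{1-\gamma_i} + \frac{m_{2i}}{\gamma_i} = 0. \tag{3}$$

Then,

$$\hat\pi_i = \frac{m_{1i} + 2m_{2i}}{2m_i}, \qquad \hat\gamma_i = \frac{2m_{2i}}{m_{1i} + 2m_{2i}}. \tag{4}$$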

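Let $\tilde\pi_i$ and $\tilde\gamma$ be the constrained MLEs of πi and γ under H0. Similarly, they are the solution of the equations

$$\frac{\partial l}{\partial \pi_i} = \frac{m_{0i}(\gamma - 2)}{1 - 2\pi_i + \pi_i\gamma} + \frac{m_{1i} + m_{2i}}{\pi_i} = 0 \quad (i = 1, \ldots, g), \qquad \frac{\partial l}{\partial \gamma} = \sum_{i=1}^{g}\left[\frac{m_{0i}\pi_i}{1 - 2\pi_i + \pi_i\gamma} - \frac{m_{1i}}{1-\gamma} + \frac{m_{2i}}{\gamma}\right] = 0.$$

From the first equation, we have $\pi_i = (m_{1i} + m_{2i})/\{m_i(2 - \gamma)\}$. Substituting this into the second equation, it simplifies to γ(S1 + 2S2) − 2S2 = 0. The constrained MLEs are therefore

$$\tilde\gamma = \frac{2S_2}{S_1 + 2S_2}, \qquad \tilde\pi_i = \frac{m_{1i} + m_{2i}}{m_i(2 - \tilde\gamma)} = \frac{(m_{1i} + m_{2i})(S_1 + 2S_2)}{2m_i(S_1 + S_2)}. \tag{5}$$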
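The closed-form estimators can be sketched in a few lines. The code below assumes the unconstrained MLEs are π̂i = (m1i + 2m2i)/(2mi) and γ̂i = 2m2i/(m1i + 2m2i), which follow from equating the cell probabilities to the observed proportions mli/mi, and the constrained MLEs are γ̃ = 2S2/(S1 + 2S2) and π̃i = (m1i + m2i)/{mi(2 − γ̃)}, which follow from the simplified equation γ(S1 + 2S2) − 2S2 = 0 stated above. The function names and the table `example_table` are hypothetical and chosen only for illustration; the table is not data from the paper.

```python
def unconstrained_mle(table):
    """Per-group MLEs under H1. `table` is a list of (m0, m1, m2) counts, one per group."""
    estimates = []
    for m0, m1, m2 in table:
        m = m0 + m1 + m2
        responding_organs = m1 + 2 * m2          # number of organs with a response
        pi_hat = responding_organs / (2 * m)
        # gamma is not estimable when no organ responds in the group.
        gamma_hat = 2 * m2 / responding_organs if responding_organs else float("nan")
        estimates.append((pi_hat, gamma_hat))
    return estimates


def constrained_mle(table):
    """MLEs under H0 (common gamma). Returns ([pi_1, ..., pi_g], gamma)."""
    s1 = sum(row[1] for row in table)            # patients with one response
    s2 = sum(row[2] for row in table)            # patients with two responses
    if s1 + 2 * s2 == 0:
        raise ValueError("no responses observed; gamma is not estimable")
    gamma = 2 * s2 / (s1 + 2 * s2)
    pis = [(m1 + m2) / ((m0 + m1 + m2) * (2 - gamma)) for m0, m1, m2 in table]
    return pis, gamma


# Hypothetical two-group table (not from the paper): rows are (m0, m1, m2).
example_table = [(6, 2, 2), (4, 4, 2)]
free_fit = unconstrained_mle(example_table)
null_pis, null_gamma = constrained_mle(example_table)
```

A quick consistency check is that when all groups share the same γ̂i, the constrained and unconstrained estimates coincide.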

Test methods

An information matrix

Denote β = (γ1, …, γg, π1, …, πg). According to Eq (3), the second-order derivatives of l with respect to πi and γi are for i = 1, …, g, and for i ≠ j. Thus, the information matrix Iβ with respect to β is where

Otherwise, Iij = 0. By calculation, its inverse matrix is (6) where for i = 1, …, g.

Asymptotic test statistics

In this section, we propose three asymptotic tests for large samples based on the unconstrained and constrained MLEs.

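  (i) Likelihood ratio test. Let $\hat\pi_i$, $\hat\gamma_i$ be the unconstrained MLEs in (4), and $\tilde\pi_i$, $\tilde\gamma$ be the constrained MLEs in (5) under H0. Denote $\hat\pi = (\hat\pi_1, \ldots, \hat\pi_g)$, $\hat\gamma = (\hat\gamma_1, \ldots, \hat\gamma_g)$ and $\tilde\pi = (\tilde\pi_1, \ldots, \tilde\pi_g)$. Given the observation m, the likelihood ratio statistic is given by
    $$T_L = 2\bigl[\,l(\hat\pi, \hat\gamma \mid m) - l(\tilde\pi, \tilde\gamma \mid m)\bigr],$$
    where l(π, γ|m) is defined in (2) and $\tilde\gamma$ is common to all groups. From (4) and (5), the likelihood ratio statistic can be represented as
    $$T_L = 2\sum_{i=1}^{g}\left[ m_{1i}\log\frac{m_{1i}(S_1 + S_2)}{(m_{1i} + m_{2i})S_1} + m_{2i}\log\frac{m_{2i}(S_1 + S_2)}{(m_{1i} + m_{2i})S_2}\right],$$
    with the convention 0 log 0 = 0.
  (ii) Score test. Denote Under H0, score test statistic can be defined as
    A direct calculation shows that the simplified form of TSC is
  (iii) Wald-type test. Let . The null hypothesis H0: γ1 = … = γg is equivalent to C βT = 0, where 0 is a zero vector, and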

Hence, Wald-type test statistic can be written as where is defined in (6). Let (7)

Then,

For convenience, denote . Obviously, A is a symmetric tridiagonal matrix of order g − 1. Let , for j = 2, …, g − 1, and , . Following [13], A−1 is also a symmetric matrix denoted by where

Since , we obtain the simplified form

Next we provide the expressions of TW for g = 2, 3, 4. If g = 2, it follows that

If g = 3, we have

Denote . If g = 4, then where ai is defined in (7).

Under H0, each test statistic Tl (= TL, TSC or TW) asymptotically follows a chi-square distribution with g − 1 degrees of freedom. Let $\chi^2_{g-1,\,1-\alpha}$ be the (1 − α)th quantile of the chi-square distribution with g − 1 degrees of freedom. Given the nominal level α, the null hypothesis H0 is rejected if the value of Tl is larger than $\chi^2_{g-1,\,1-\alpha}$.
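As a hedged illustration of the likelihood ratio test and the chi-square decision rule, the sketch below computes TL from a table of counts and converts it to an asymptotic p-value. It relies on two assumptions that are derived here rather than quoted from the paper: under the constrained fit the no-response probability equals m0i/mi, so only the one- and two-response cells contribute to TL, and the chi-square tail probability is evaluated with the standard recurrence for the regularized incomplete gamma function. The names `chi2_sf`, `lr_statistic` and `lr_test` are hypothetical, and the example table is not data from the paper.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(chi2_df >= x) for a positive integer df, standard library only."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)               # Q(1, h) = exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))    # Q(1/2, h) = erfc(sqrt(h))
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def lr_statistic(table):
    """Likelihood ratio statistic for H0: common gamma. Rows are (m0, m1, m2)."""
    s1 = sum(row[1] for row in table)
    s2 = sum(row[2] for row in table)
    if s1 + 2 * s2 == 0:
        return 0.0                                # no responses: nothing to compare
    gamma = 2 * s2 / (s1 + 2 * s2)                # constrained MLE of gamma
    stat = 0.0
    for m0, m1, m2 in table:
        m = m0 + m1 + m2
        pi = (m1 + m2) / (m * (2 - gamma))        # constrained MLE of pi_i
        if m1 > 0:
            stat += 2 * m1 * math.log((m1 / m) / (2 * pi * (1 - gamma)))
        if m2 > 0:
            stat += 2 * m2 * math.log((m2 / m) / (pi * gamma))
    return max(stat, 0.0)                         # guard against rounding below zero


def lr_test(table, alpha=0.05):
    """Return (statistic, asymptotic p-value, reject H0?)."""
    stat = lr_statistic(table)
    p_value = chi2_sf(stat, len(table) - 1)
    return stat, p_value, p_value < alpha


# Hypothetical two-group table (not from the paper).
stat, p_value, reject = lr_test([(6, 2, 2), (4, 4, 2)])
```

Rejecting when the p-value is below α is equivalent to rejecting when the statistic exceeds the (1 − α)th chi-square quantile with g − 1 degrees of freedom.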

Exact methods

Given the observed data m = (m1, …, mg), let Tl(m) be the value of the aforementioned statistic Tl (l = L, SC, W). The asymptotic (A) p-values of these statistics are defined by

$$p_l^{A}(m^*) = \Pr\bigl(\chi^2_{g-1} \ge T_l(m^*)\bigr), \qquad l = L, SC, W, \tag{8}$$

where m* is an observed value of m. For convenience, we call $p_L^{A}$, $p_{SC}^{A}$ and $p_W^{A}$ the “A approach” based on statistics TL, TSC and TW. Asymptotic tests work well when the sample size is large. However, they have limitations if the sample size is relatively small. Several exact conditional and unconditional methods are therefore proposed for small samples based on these statistics.

An exact conditional method is introduced under the assumption that all of mi(i = 1, …, g) and Sl(l = 0, 1, 2) are fixed in Table 1. Thus, the cell values of the table follow a hypergeometric distribution. Define the tail area of statistics TL, TSC and TW as where . According to the tail area Ψl(m*), the exact conditional (C) p-values can be calculated by (9)

Here, and are described as “C approach” based on statistics TL, TSC and TW.

Another exact p-value comes from Basu’s maximization approach [14]. It is obtained by maximizing the tail probability over all nuisance parameters instead of evaluating it at the constrained MLEs under H0. In this case, we define the tail area of statistic Tl (l = L, SC, W) as for a given table m*. Denote Θ = {π: πi ∈ [0, 1], i = 1, …, g} and where π = (π1, …, πg) and γ = (γ1, …, γg). Hence, under the maximization (M) method, the three exact unconditional p-values are given by (10) where L(π, γ|m) = exp(l(π, γ|m)) and l(π, γ|m) is defined in (2). Corresponding to “A approach” and “C approach”, , and are called “M approach” based on TL, TSC and TW.

Numerical studies

In this section, we investigate the performance of the proposed asymptotic and exact tests in terms of TIEs and powers under different parameter settings.

We first compare the asymptotic tests TL, TSC and TW in terms of empirical TIEs. Let g = 2, 3, 4, π = 0.3: 0.02: 0.5, γ = 0.3: 0.02: 0.8 and m = m1 = ⋯ = mg = 15, 50, 100. Here, a: b: c denotes the grid from a to c in steps of b. For each parameter setting, 10,000 samples are randomly generated under the null hypothesis H0. Given the nominal level α = 0.05, the empirical TIE is the proportion of samples for which H0 is rejected, i.e., the number of rejections/10,000. Figs 1, 2 and 3 show the surfaces of empirical TIEs for all the tests under πi = π and γi = γ (i = 1, 2, …, g; g = 2, 3, 4). Following Tang et al. [3], a test is called liberal if its empirical TIE is greater than 0.06, conservative if it is less than 0.04, and robust otherwise. We observe that the score test is more robust than the other tests since its TIEs are closer to the nominal level α = 0.05. All the tests work well for larger sample sizes. However, the likelihood ratio and Wald-type tests have inflated TIEs and are especially liberal when the sample size is small, and some of their TIEs fall below 0.04 or above 0.06.
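The simulation design described above can be sketched as follows for the likelihood ratio test only. This is a minimal illustration, not the authors’ code: the score and Wald-type statistics are omitted, the function names are hypothetical, the random seed and the reduced number of replications in the usage lines are arbitrary choices, and TL is computed from the closed-form constrained MLEs γ̃ = 2S2/(S1 + 2S2) and π̃i = (m1i + m2i)/{mi(2 − γ̃)}.

```python
import math
import random

# 0.95 quantiles of the chi-square distribution for 1, 2 and 3 degrees of freedom.
CRIT_05 = {1: 3.841459, 2: 5.991465, 3: 7.814728}


def lr_statistic(table):
    """Likelihood ratio statistic for H0: common gamma. Rows are (m0, m1, m2)."""
    s1 = sum(row[1] for row in table)
    s2 = sum(row[2] for row in table)
    if s1 + 2 * s2 == 0:
        return 0.0
    gamma = 2 * s2 / (s1 + 2 * s2)
    stat = 0.0
    for m0, m1, m2 in table:
        m = m0 + m1 + m2
        pi = (m1 + m2) / (m * (2 - gamma))
        if m1 > 0:
            stat += 2 * m1 * math.log((m1 / m) / (2 * pi * (1 - gamma)))
        if m2 > 0:
            stat += 2 * m2 * math.log((m2 / m) / (pi * gamma))
    return max(stat, 0.0)


def draw_group(rng, m, pi, gamma):
    """Draw one multinomial row (m0, m1, m2) of size m under Dallal's model."""
    probs = [1 - 2 * pi + pi * gamma, 2 * pi * (1 - gamma), pi * gamma]
    draws = rng.choices((0, 1, 2), weights=probs, k=m)
    return tuple(draws.count(level) for level in (0, 1, 2))


def rejection_rate(pis, gammas, m, reps=10000, seed=0):
    """Proportion of simulated tables for which the LR test rejects at alpha = 0.05."""
    rng = random.Random(seed)
    crit = CRIT_05[len(pis) - 1]
    rejections = 0
    for _ in range(reps):
        table = [draw_group(rng, m, p, g) for p, g in zip(pis, gammas)]
        if lr_statistic(table) > crit:
            rejections += 1
    return rejections / reps


# Empirical TIE: equal gammas (H0 true), g = 2, m = 100.
tie = rejection_rate([0.4, 0.4], [0.5, 0.5], m=100, reps=2000, seed=1)
# Empirical power: unequal gammas (H1 true).
power = rejection_rate([0.2, 0.3], [0.9, 0.1], m=100, reps=500, seed=1)
```

Passing equal values of γ estimates the TIE, while passing unequal values estimates the power, so the same function covers both parts of the numerical study.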

Fig 1. Empirical TIE surfaces of asymptotic tests for g = 2, πi = π and γi = γ.

https://doi.org/10.1371/journal.pone.0242722.g001

Fig 2. Empirical TIE surfaces of asymptotic tests for g = 3, πi = π and γi = γ.

https://doi.org/10.1371/journal.pone.0242722.g002

Fig 3. Empirical TIE surfaces of asymptotic tests for g = 4, πi = π and γi = γ.

https://doi.org/10.1371/journal.pone.0242722.g003

Next we calculate the empirical powers of these tests under the following parameter settings for m = 15, 50, 100: (i) g = 2, π = (0.2, 0.3), γ1 = 0.2: 0.05: 0.95, γ2 = 0.1, (ii) g = 3, π = (0.2, 0.3, 0.3), γ1 = 0.2: 0.05: 0.95, γ2 = γ3 = 0.1, and (iii) g = 4, π = (0.2, 0.3, 0.3, 0.3), γ1 = 0.2: 0.05: 0.95, γ2 = γ3 = γ4 = 0.1. For each parameter setting, we randomly generate 10,000 samples under the alternative hypothesis H1. The empirical power is the proportion of samples for which H0 is rejected. Fig 4 shows the empirical powers of the three proposed tests for g = 2, 3, 4. The powers increase as the sample size or the number of groups increases. In particular, the powers of all the tests are very close when m = 50, 100. However, there are some differences between these tests for smaller samples: the Wald-type test has higher power and the likelihood ratio test has lower power.

Fig 4. Empirical power curves of asymptotic tests for g = 2, 3, 4.

https://doi.org/10.1371/journal.pone.0242722.g004

Considering the limitations of asymptotic methods, we analyse the A, C and M approaches for small samples. Instead of drawing 10,000 random samples as for the asymptotic tests, we enumerate all possible tables of cell counts. For m = 10 and g = 2, 3, there are in total 4,356 and 287,492 tables, respectively. The TIEs and powers are obtained for m = m1 = ⋯ = mg = 10 and g = 2, 3 over the grid π = 0: 0.04: 1, γ = 0: 0.04: 1, restricted to values satisfying 0 ≤ pli ≤ 1 (l = 0, 1, 2; g = 2, 3). At the given nominal level α = 0.05, the probability of each possible table is computed from the likelihood in (2), and the null hypothesis H0 is rejected if the corresponding p-value is less than 0.05. Figs 5 and 6 show TIE surfaces of all the exact methods for πi = π and γi = γ (i = 1, …, g; g = 2, 3). We observe that A approach is closer to the pre-specified nominal level α = 0.05 for m = 10 and g = 2, 3. However, and have the inflated TIEs. For C approach, is better than and since they have the inflated TIEs. The M approaches and can produce satisfactory TIEs.
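The enumeration idea can be sketched for the asymptotic likelihood ratio approach with g = 2 and m = 10. The code lists every possible row (m0, m1, m2) with m0 + m1 + m2 = m, forms all tables as products of rows, and sums the multinomial probabilities of the tables for which the test rejects, which gives an exact rather than simulated TIE at a chosen (π, γ). This is an illustrative sketch under stated assumptions: the function names are hypothetical, only TL with its chi-square critical value is shown (the C and M p-values are not implemented), and the parameter point in the usage line is arbitrary.

```python
import math
from itertools import product


def group_tables(m):
    """All rows (m0, m1, m2) of non-negative counts with m0 + m1 + m2 == m."""
    return [(m - a - b, a, b) for a in range(m + 1) for b in range(m + 1 - a)]


def multinomial_pmf(row, probs):
    """Multinomial probability of one row given cell probabilities (p0, p1, p2)."""
    m0, m1, m2 = row
    coef = math.factorial(m0 + m1 + m2) // (
        math.factorial(m0) * math.factorial(m1) * math.factorial(m2))
    value = float(coef)
    for count, p in zip(row, probs):
        value *= p ** count                      # note: 0.0 ** 0 == 1.0 in Python
    return value


def lr_statistic(table):
    """Likelihood ratio statistic for H0: common gamma. Rows are (m0, m1, m2)."""
    s1 = sum(row[1] for row in table)
    s2 = sum(row[2] for row in table)
    if s1 + 2 * s2 == 0:
        return 0.0
    gamma = 2 * s2 / (s1 + 2 * s2)
    stat = 0.0
    for m0, m1, m2 in table:
        m = m0 + m1 + m2
        pi = (m1 + m2) / (m * (2 - gamma))
        if m1 > 0:
            stat += 2 * m1 * math.log((m1 / m) / (2 * pi * (1 - gamma)))
        if m2 > 0:
            stat += 2 * m2 * math.log((m2 / m) / (pi * gamma))
    return max(stat, 0.0)


def exact_tie(pi, gamma, m=10, g=2, crit=3.841459):
    """Exact TIE of the asymptotic LR test when every group shares (pi, gamma).

    `crit` is the 0.95 chi-square quantile for g - 1 degrees of freedom
    (3.841459 for g = 2); it must be changed when g changes.
    """
    probs = (1 - 2 * pi + pi * gamma, 2 * pi * (1 - gamma), pi * gamma)
    rows = group_tables(m)
    pmf = {row: multinomial_pmf(row, probs) for row in rows}
    tie = 0.0
    for table in product(rows, repeat=g):
        if lr_statistic(table) > crit:
            tie += math.prod(pmf[row] for row in table)
    return tie


n_tables_g2 = len(group_tables(10)) ** 2         # number of tables for m = 10, g = 2
tie_at_point = exact_tie(0.4, 0.5)               # arbitrary illustrative parameter point
```

For m = 10 each group has 66 possible rows, so the g = 2 enumeration visits 66 × 66 = 4,356 tables, matching the count quoted in the text.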

Fig 5. TIE surfaces of exact approaches for m = 10, g = 2, πi = π and γi = γ.

https://doi.org/10.1371/journal.pone.0242722.g005

Fig 6. TIE surfaces of exact approaches for m = 10, g = 3, πi = π and γi = γ.

https://doi.org/10.1371/journal.pone.0242722.g006

Fig 7 provides the powers of exact methods according to parameter settings for m = 10: (i) g = 2, π = (0.2, 0.3), γ1 = 0.2: 0.05: 0.9 and γ2 = 0.1, and (ii) g = 3, π = (0.2, 0.3, 0.3), γ1 = 0.2: 0.05: 0.9 and γ2 = γ3 = 0.1. We observe that the powers will increase when m or γ1 increases under other fixed parameters. The powers of A, C and M approaches are relatively close based on statistics Tl(l = L, SC).

Fig 7. Power curves of exact approaches for m = 10 and g = 2, 3.

https://doi.org/10.1371/journal.pone.0242722.g007

Note that all parameter settings for the asymptotic and exact methods above are studied under balanced designs, that is, m = m1 = ⋯ = mg. Unbalanced designs are illustrated through the real examples that follow.

Real examples

In this section, two real examples with unbalanced designs are provided to illustrate our proposed methods at the nominal level α = 0.05. We first show an example with large samples based on asymptotic test statistics.

Example 1 [15] There were 216 patients aged 20-39 with retinitis pigmentosa (RP) at the Massachusetts Eye and Ear Infirmary. They were divided into four genetic groups (Table 2): autosomal dominant RP (DOM), autosomal recessive RP (AR), sex-linked RP (SL) and isolate RP (ISO).

Let mli be the number of patients with l (l = 0, 1, 2) affected eyes in the ith (i = 1, 2, 3, 4) group. Under Dallal’s model, we are interested in testing whether the correlations of these four groups are equal, i.e., H0: γ1 = γ2 = γ3 = γ4. Table 3 provides the results of statistics, p-values and constrained MLEs. Moreover, the unconstrained MLEs and . At the nominal level α = 0.05, TL, TW and TSC are all smaller than the critical value 7.81 (the 0.95 quantile of the chi-square distribution with 3 degrees of freedom), and the p-values are greater than 0.05. Thus, there is no evidence to reject H0. That is to say, the data are consistent with a common correlation across the four groups, estimated as γ1 = γ2 = γ3 = γ4 = 0.8246.

Table 3. Test statistics, p-values and constrained MLEs under H0.

https://doi.org/10.1371/journal.pone.0242722.t003

For the small-sample case, we provide another example to compare the effectiveness of the asymptotic and exact methods.

Example 2 [16] A double-blind clinical trial was conducted to study amoxicillin treatment of acute otitis media with effusion (OME) in twenty-four children at 14 days. Each child had no, unilateral or bilateral OME and was assigned to one of three groups according to age: <2, 2-5 and ≥6 years (Table 4). Denote m* = (2, 2, 11, 5, 1, 3, 6, 0, 7). Next we apply the asymptotic and exact methods to test H0: γ1 = γ2 = γ3 = γ.

Through calculating (4) and (5), the unconstrained MLEs , , and the constrained MLEs , under H0. Then, TL(m*) = 0.2445, TSC(m*) = 0.2525 and TW(m*) = 0.2285. Table 5 provides the comparison of asymptotic and exact methods. The results show that there is no significant difference among the correlations of the three groups, regardless of the approach used.

Conclusions

In this paper, we propose asymptotic statistics and exact procedures to test whether the correlations of multiple bilateral data are equal under Dallal’s model. The three asymptotic test statistics for large samples are the likelihood ratio TL, score TSC and Wald-type TW statistics. The explicit expressions of these tests are obtained, and their asymptotic p-values are referred to as the A approach. For small samples, six exact methods are derived based on statistics TL, TSC and TW, including three conditional exact C procedures and three unconditional exact M approaches.

Numerical studies are conducted to investigate the performance of the asymptotic and exact methods in terms of TIEs and powers. When the sample size is larger, the empirical TIEs and powers of TL, TSC and TW are close to each other. In general, the score test TSC is more robust than the other two tests. However, some of these tests, such as the Wald-type test, may produce unacceptable TIEs when the sample size is smaller. The results are similar to those for Rosner’s and Donner’s models; see Ma et al. [6] and Liu et al. [10]. For small samples, we obtain TIE surfaces and power curves of the exact C and M approaches with two and three groups, and compare them with the A approach. As for TIEs, the A approaches and are liberal, and is close to the nominal level 0.05 under different parameter configurations. The C approaches and tend to be more inflated than . The M approach is better than and . On the other hand, the powers of exact methods are very close based on likelihood ratio TL and score TSC. For C approach, has higher power, while has lower power. Moreover, has higher power, but has lower power in M approach.

The ideas behind the asymptotic and exact methods can be extended to other data structures with large or small samples, such as crash data. For example, Zeng et al. [17–19] proposed some models for the analysis of crash rates by injury severity. Dong et al. [20] introduced a mixed logit model to investigate the difference between single- and multi-vehicle accident probability. Chen et al. [21–23] analyzed unbalanced panel models by using real-time environmental and traffic big data. We leave these problems for future research.

Acknowledgments

The authors thank the editor and referees for constructive comments that helped improve the manuscript.

References

  1. Rosner B. Statistical methods in ophthalmology: an adjustment for the intraclass correlation between eyes. Biometrics 1982 38:105–114. pmid:7082754
  2. Donner A. Statistical methods in ophthalmology: an adjusted chi-square approach. Biometrics 1989 45:605–611. pmid:2765640
  3. Tang ML, Tang NS, Rosner B. Statistical inference for correlated data in ophthalmologic studies. Statistics in Medicine 2006 25:2771–2783. pmid:16381067
  4. Qiu SF, Tang NS, Tang ML, Pei YB. Sample size for testing difference between two proportions for the bilateral-sample design. J Biopharm Stat. 2009 19:857–871. pmid:20183448
  5. Shan G, Ma CX. Exact methods for testing the equality of proportions for binary clustered data from otolaryngologic studies. Stat Biopharm Res. 2014 6(1):115–122.
  6. Ma CX, Shan G, Liu S. Homogeneity test for binary correlated data. PLoS One 2015 10:e0124337. pmid:25897962
  7. Peng X, Liu C, Liu S, Ma CX. Asymptotic confidence interval construction for proportion ratio based on correlated paired data. J Biopharm Stat. 2019 29(6):1137–1152. pmid:30831053
  8. Pei YB, Tang ML, Wang WK, Guo JH. Confidence intervals for correlated proportion differences from paired data in a two-arm randomised clinical trial. Stat Meth Med Res. 2012 21(2):167–187.
  9. Pei YB, Tang ML, Wong WK, Tang NS. Testing equality of correlations of two paired binary responses from two treated groups in a randomized trial. Stat Biopharm Res. 2011 21:511–525. pmid:21442523
  10. Liu XB, Liu S, Ma CX. Testing equality of correlation coefficients for paired binary data from multiple groups. Journal of Statistical Computation and Simulation 2016 86(9):1686–1696.
  11. Liu XB, Yang ZY, Liu S, Ma CX. Exact methods of testing the homogeneity of prevalences for correlated binary data. Journal of Statistical Computation and Simulation 2017 87(15):3021–3039.
  12. Dallal GE. Paired Bernoulli trials. Biometrics 1988 44(1):253–257. pmid:3358992
  13. Meurant G. A review on the inverse of symmetric tridiagonal and block tridiagonal matrices. SIAM J Matrix Anal Appl. 1992 13(3):707–728.
  14. Basu D. On the elimination of nuisance parameters. Journal of the American Statistical Association 1977 72(358):355–366.
  15. Berson EL, Rosner B, Simonoff E. An outpatient population of retinitis pigmentosa and their normal relatives: Risk factors for genetic typing and detection derived from their ocular examinations. American Journal of Ophthalmology 1980 89:763–775.
  16. Mandel EM, Bluestone CD, Rockette HE, Blatter MM, Reisinger KS, Wucher FP, et al. Duration of effusion after antibiotic treatment for acute otitis media: comparison of cefaclor and amoxicillin. Pediatric Infectious Disease 1982 1:310–316. pmid:6760146
  17. Zeng Q, Wen HY, Huang HL, Pei X, Wong SC. A multivariate random parameters Tobit model for analyzing highway crash rate by injury severity. Accident Analysis and Prevention 2017 99:184–191. pmid:27914307
  18. Zeng Q, Guo Q, Wong SC, Wen HY, Huang HL, Pei X. Jointly modeling area-level crash rates by severity: A Bayesian multivariate random-parameters spatio-temporal Tobit regression. Transportmetrica A: Transport Science 2019 15(2):1867–1884.
  19. Zeng Q, Wen HY, Wong SC, Huang HL, Guo Q, Pei X. Spatial joint analysis for zonal daytime and nighttime crash frequencies using a Bayesian bivariate conditional autoregressive model. Journal of Transportation Safety and Security 2020 12(4):566–585.
  20. Dong BW, Ma XX, Chen F, Chen S. Investigating the differences of single- and multi-vehicle accident probability using mixed logit model. Journal of Advanced Transportation 2018:1–9.
  21. Chen F, Chen S, Ma XX. Crash frequency modeling using real-time environmental and traffic data and unbalanced panel data models. International Journal of Environmental Research and Public Health 2016 13:1–16. pmid:27322306
  22. Chen F, Chen S, Ma XX. Analysis of hourly crash likelihood using unbalanced panel data mixed logit model and real-time driving environmental big data. Journal of Safety Research 2018 65:153–159. pmid:29776524
  23. Chen F, Song MT, Ma XX. Investigation on the injury severity of drivers in rear-end collisions between cars using a random parameters bivariate ordered probit model. International Journal of Environmental Research and Public Health 2019 16(14):1–12. pmid:31340600