A comparative study of estimators in multilevel linear models

Sabz Ali; Said Ali Shah; Seema Zubair; Sundas Hussain

doi:10.1371/journal.pone.0259960

Abstract

Multilevel models are widely used in organizational research, educational research, epidemiology, psychology, biology and medical fields. In this paper, we identify the situations in which bootstrap procedures based on the Minimum Norm Quadratic Unbiased Estimator (MINQUE) can be more useful than Restricted Maximum Likelihood (REML) in multilevel linear regression models. In our simulation study, the bootstrap by means of MINQUE is superior to REML in conditions where normality does not hold. Moreover, the real data application also supports our findings in terms of accuracy of estimates and their standard errors.

Citation: Ali S, Shah SA, Zubair S, Hussain S (2021) A comparative study of estimators in multilevel linear models. PLoS ONE 16(11): e0259960. https://doi.org/10.1371/journal.pone.0259960

Editor: Feng Chen, Tongji University, CHINA

Received: February 1, 2021; Accepted: November 1, 2021; Published: November 18, 2021

Copyright: © 2021 Ali et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting Information files.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Multilevel or clustered data are commonly observed in schools, health institutions, and epidemiology. Multilevel models are also called hierarchical, mixed effects, or random effects models (Snijders and Bosker [1], Raudenbush and Bryk [2]).

Maas and Hox [3] used the maximum likelihood (ML) method to obtain estimates and their standard errors. Wen et al. [4] concluded that a Bayesian spatial-temporal model is superior to the random effects model and the spatial model for investigating the effects of weather and roadway characteristics on crash incidence.

Browne and Draper [5] used the ML method of estimation and concluded that the estimates are biased in small samples. MINQUE was proposed by Rao [6] as an alternative to the ML estimator. Unlike ML, the method does not rely on the assumption of normality in multilevel linear models. According to Bagaka's [7], one major problem with MINQUE is that the standard errors of the estimators cannot be computed because no formulae exist for them. MINQUE on its own is therefore not appropriate when a researcher wants to construct confidence intervals or test hypotheses about the parameters. The researcher then needs an alternative scheme such as bootstrapping, in which both the parameter estimates and their standard errors can be obtained under different estimation methods such as MINQUE or ML.

In practice, both the parametric and the nonparametric bootstrap can be used. However, when the assumption of normality does not hold, the nonparametric bootstrap is the natural choice. Because the MINQUE method of estimation is free from the normality assumption, the bootstrap by means of MINQUE is used in this study. Swallow and Monahan [8] compared REML, ML and MINQUE estimators.

Bagaka's [7] used the bootstrap by means of MINQUE. Similarly, Meijer et al. [9] concluded that the multilevel bootstrap performed very well in small samples. Carpenter et al. [10] carried out a simulation study comparing the relative performance of the parametric bootstrap and the nonparametric residuals bootstrap in multilevel linear models. Hutchison et al. [11] carried out a simulation study on a two-level model; they applied the nonparametric cases bootstrap and obtained promising standard errors of the estimates. Wang et al. [12] applied the nonparametric residual bootstrap to a multilevel linear model through a SAS macro, and the resulting standard errors were promising. Delpish [13] also compared REML and the bootstrap by means of MINQUE. Ali et al. [14] concluded that ML gave better results than Penalized Quasi-likelihood (PQL) in small-sample conditions in multilevel models. ML also requires a smaller sample than PQL to give accurate estimates of both fixed and random effects in multilevel logistic models (Ali et al. [15]). Zeng et al. [16] found that a univariate spatial model gave a lower deviance information criterion (DIC) and more accurate parameter estimates than a bivariate spatial model when investigating the factors responsible for vehicle crashes on freeways. A multivariate random-parameters spatio-temporal Tobit model gave lower DIC, Mean Absolute Deviance (MAD) and Mean Squared Prediction Error (MSPE) than competing models such as the multivariate random-parameters Tobit model and the multivariate random-parameters spatial Tobit model (Zeng et al. [17]). The results confirmed that spatio-temporal correlation and interaction are significant in area-wide crash data.

In this paper, we investigate the performance of REML and the bootstrap by means of MINQUE under varying numbers of groups, intra-class correlations and skewed distributions.

Materials and methods

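For this study, a random intercept and random slope multilevel linear model was used. The model has a single explanatory variable at each level and is given below.

Level 1 model

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij} \tag{1}$$

Level 2 models

$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_j + u_{0j} \tag{2}$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_j + u_{1j} \tag{3}$$

The combined model was obtained by substituting the level 2 models into the level 1 model:

$$Y_{ij} = \underbrace{\gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} W_j + \gamma_{11} W_j X_{ij}}_{\text{Fixed part}} + \underbrace{u_{0j} + u_{1j} X_{ij} + e_{ij}}_{\text{Random part}} \tag{4}$$

where $Y_{ij}$ is the response of individual i in group j, $X_{ij}$ is the Level 1 explanatory variable, $W_j$ is the Level 2 explanatory variable, $\gamma_{00}$, $\gamma_{10}$, $\gamma_{01}$ and $\gamma_{11}$ are the fixed effects, and $e_{ij}$ is assumed to follow a normal distribution, i.e. $e_{ij} \sim N(0, \sigma_e^2)$. In case of normality, $u_{0j}$ and $u_{1j}$ are assumed to follow a multivariate normal distribution:

$$\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{u0}^2 & \sigma_{u01} \\ \sigma_{u01} & \sigma_{u1}^2 \end{pmatrix} \right) \tag{5}$$

where $\sigma_{u0}^2$ is the random intercept variance, $\sigma_{u1}^2$ is the random slope variance and $\sigma_{u01}$ is the covariance term.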
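As an illustration of the combined model, the following minimal Python sketch generates one data set from a random intercept and random slope model with one covariate per level. The study itself used SAS; the function name `simulate_two_level`, the group size of 20, the fixed effect values, the standard normal covariates and the variance values are all assumptions chosen for the example, not settings reported in the paper.

```python
import numpy as np

def simulate_two_level(J, n_j, gammas, sigma2_e, cov_u, rng):
    """Generate one data set from the combined two-level model (hypothetical helper)."""
    g00, g10, g01, g11 = gammas
    W = rng.normal(size=J)                                   # level 2 covariate (assumed N(0, 1))
    u = rng.multivariate_normal([0.0, 0.0], cov_u, size=J)   # rows are (u_0j, u_1j)
    group = np.repeat(np.arange(J), n_j)                     # group index of every row
    X = rng.normal(size=J * n_j)                             # level 1 covariate (assumed N(0, 1))
    e = rng.normal(scale=np.sqrt(sigma2_e), size=J * n_j)    # level 1 residual
    fixed = g00 + g10 * X + g01 * W[group] + g11 * W[group] * X
    random = u[group, 0] + u[group, 1] * X + e
    return group, X, W, fixed + random

rng = np.random.default_rng(0)
cov_u = [[0.11, 0.0], [0.0, 0.11]]   # assumed intercept/slope variances, zero covariance
group, X, W, Y = simulate_two_level(30, 20, (1.0, 0.3, 0.3, 0.3), 1.0, cov_u, rng)
```

Keeping the fixed and random parts as separate terms mirrors the decomposition of the combined model and makes it easy to swap in a different error distribution for the random part.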

Design factors

1. Three levels of the number of groups were used in this study: 30, 100 and 120.
2. Three levels of intra-class correlation were used: 0.01, 0.10 and 0.20, where the intra-class correlation coefficient (ICC) is given as

$$\text{ICC} = \frac{\sigma_{u0}^2}{\sigma_{u0}^2 + \sigma_e^2} \tag{6}$$

3. Three distributions were used: normal, lognormal and exponential.
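The design factors can be turned into simulation settings along the following lines. This is a sketch under stated assumptions: the ICC is taken as the usual ratio of intercept variance to total variance, and the skewed errors are centred and rescaled to a target variance, which the paper does not describe. The helper names `sigma2_u0_for_icc` and `draw_errors` are hypothetical.

```python
import numpy as np

def sigma2_u0_for_icc(icc, sigma2_e):
    """Intercept variance that yields the target ICC for a given residual variance."""
    return icc * sigma2_e / (1.0 - icc)

def draw_errors(dist, sigma2, size, rng):
    """Draw zero-mean errors with variance sigma2 (centring/scaling is an assumption)."""
    sd = np.sqrt(sigma2)
    if dist == "normal":
        return rng.normal(0.0, sd, size)
    if dist == "exponential":
        # Exponential with scale sd has mean sd and variance sd**2.
        return rng.exponential(sd, size) - sd
    if dist == "lognormal":
        # Lognormal(0, 1) has mean exp(0.5) and variance (e - 1) * e.
        z = rng.lognormal(0.0, 1.0, size)
        return (z - np.exp(0.5)) * sd / np.sqrt((np.e - 1.0) * np.e)
    raise ValueError(f"unknown distribution: {dist}")

rng = np.random.default_rng(0)
for icc in (0.01, 0.10, 0.20):
    s2_u0 = sigma2_u0_for_icc(icc, sigma2_e=1.0)
    u0 = draw_errors("lognormal", s2_u0, size=30, rng=rng)  # one draw of 30 random intercepts
```

Centring keeps the fixed intercept interpretable as the mean response, so that bias is measured against the same true value under all three distributions.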

Analysis

Two estimation procedures, Restricted Maximum Likelihood and the bootstrap by means of MINQUE, were used under all three distribution conditions. All simulations and bootstrapping were performed in SAS 9.2 to obtain estimates and their standard errors.

Algorithm. The procedure for the cases bootstrap is as follows:

1. Draw with replacement J group level units along with the corresponding scores on the group level variable $W_j$.
2. Then draw with replacement $n_j$ individual level units within each drawn group level unit j, j = 1, 2, …, J. This results in the bootstrap data (Y*, X*), which are then combined with the group level variable to give the desired bootstrap sample (Y*, X*, W*).
3. Obtain the minimum norm quadratic unbiased estimates of the model parameters from the bootstrap sample.
4. Replicate steps 1–3 B times, b = 1, 2, 3, …, B, obtaining the minimum norm quadratic unbiased estimates of the model parameters from each replicate.
5. Obtain the mean value of the estimates by using

$$\bar{\theta}^{*} = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^{*}_{b} \tag{7}$$

and the bootstrap standard error of the parameter estimate as

$$SE(\bar{\theta}^{*}) = \sqrt{\frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^{*}_{b} - \bar{\theta}^{*} \right)^{2}} \tag{8}$$

where $\hat{\theta}^{*}_{b}$ denotes the estimate of a parameter obtained from the b-th bootstrap replicate.
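The cases bootstrap steps can be sketched in Python as below. The authors worked in SAS, and no MINQUE routine is assumed to be available here, so the estimator is passed in as a function and the example plugs in ordinary least squares for the four fixed effects purely as a stand-in. The names `cases_bootstrap` and `ols_fixed_effects`, the toy data, and the use of the B − 1 divisor for the standard error are assumptions of this sketch.

```python
import numpy as np

def ols_fixed_effects(X, W, Y):
    """Stand-in estimator (NOT MINQUE): OLS for gamma00, gamma10, gamma01, gamma11."""
    design = np.column_stack([np.ones_like(X), X, W, W * X])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef

def cases_bootstrap(group, X, W, Y, estimator, B, rng):
    """Two-stage cases bootstrap; W holds the group level score repeated on every row."""
    ids = np.unique(group)
    rows_by_group = {g: np.flatnonzero(group == g) for g in ids}
    replicates = []
    for _ in range(B):
        # Step 1: draw J groups with replacement (their W scores travel with the rows).
        drawn = rng.choice(ids, size=len(ids), replace=True)
        # Step 2: within each drawn group, draw n_j rows with replacement.
        picked = np.concatenate([
            rng.choice(rows_by_group[g], size=len(rows_by_group[g]), replace=True)
            for g in drawn
        ])
        # Step 3: estimate the parameters on the bootstrap sample (Y*, X*, W*).
        replicates.append(estimator(X[picked], W[picked], Y[picked]))
    replicates = np.asarray(replicates)
    # Step 5: bootstrap estimate is the mean; standard error uses the B - 1 divisor.
    return replicates.mean(axis=0), replicates.std(axis=0, ddof=1)

# Toy data with assumed parameter values, only to exercise the function.
rng = np.random.default_rng(1)
J, n_j = 30, 20
group = np.repeat(np.arange(J), n_j)
W = rng.normal(size=J)[group]
X = rng.normal(size=J * n_j)
Y = (1.0 + 0.5 * X + 0.3 * W + 0.2 * W * X
     + rng.normal(size=J)[group] + rng.normal(size=J * n_j))
boot_mean, boot_se = cases_bootstrap(group, X, W, Y, ols_fixed_effects, B=200, rng=rng)
```

An estimator that models the grouping structure would additionally need fresh group labels for groups drawn more than once; the OLS stand-in ignores the grouping, so that bookkeeping is left out here.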

The real data were taken from the High School & Beyond Survey data set, a national survey in the United States conducted by the National Center for Education Statistics (NCES) covering public and Catholic schools. For the purpose of illustration, a data set of 30 schools was randomly selected from the data on 160 schools.

Results

Tables 1 and 2 show that the bootstrap procedure gave highly accurate fixed and random effects estimates, while the REML estimates were comparable to those of the bootstrap procedure at 100 and 120 groups. Similarly, Table 3 shows that the bootstrap CI outperformed the REML CI at the first two levels of the number of groups factor (30 and 100) when the distribution was normal.
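The two accuracy criteria reported in the tables, relative parameter bias and 95% coverage probability, are not given as formulas in the text. A minimal sketch using their conventional definitions (an assumption) is shown below; the function names and the small input arrays are hypothetical and serve only to show the calculation.

```python
import numpy as np

def relative_bias(estimates, true_value):
    """Relative parameter bias: (mean estimate - true value) / true value."""
    return (np.mean(estimates) - true_value) / true_value

def coverage(lower, upper, true_value):
    """Share of simulated intervals [lower, upper] that contain the true value."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    return np.mean((lower <= true_value) & (true_value <= upper))

# Made-up estimates and interval limits from three hypothetical simulation runs.
rb = relative_bias([1.1, 0.9, 1.3], true_value=1.0)
cov = coverage([0.0, 0.5, 2.0], [1.0, 3.0, 3.0], true_value=1.0)
```

Averaging these quantities over the simulated data sets within each combination of number of groups, ICC and distribution gives one table entry per condition.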

Table 1. Average relative parameter bias of fixed effects estimates obtained for normal distribution data (first entry = REML estimates; bootstrap estimates in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t001

Table 2. Average relative parameter bias of the random effects estimates obtained for normal distribution data (first entry = REML estimates; bootstrap estimates in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t002

Table 3. Impact of number of groups and ICC on the 95% coverage probability of the estimates for normal distribution data (first entry = REML; percentile bootstrap in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t003

The bootstrap procedure was superior to REML in terms of accuracy of the fixed and random effects estimates for the lognormal distribution, as can be seen in Tables 4 and 5. Moreover, Table 6 reveals that the bootstrap CI outperformed the REML CI at all levels of the number of groups when the data were generated from a lognormal distribution. When the data were exponential, the bootstrap estimates again outperformed the REML estimates, as shown in Tables 7–9.

Table 4. Average relative parameter bias of fixed effects estimates obtained for lognormal distribution data (first entry = REML estimates; bootstrap estimates in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t004

Table 5. Average relative parameter bias of the random effects estimates obtained for lognormal distribution data (first entry = REML estimates; bootstrap estimates in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t005

Table 6. Impact of number of groups and ICC on the 95% coverage probability of the estimates for lognormal distribution data (first entry = REML; percentile bootstrap in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t006

Table 7. Average relative parameter bias of fixed effects estimates obtained for exponential distribution data (first entry = REML estimates; bootstrap estimates in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t007

Table 8. Average relative parameter bias of the random effects estimates obtained for exponential distribution data (first entry = REML estimates; bootstrap estimates in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t008

Table 9. Impact of number of groups and ICC on the 95% coverage probability of the estimates for exponential distribution data (first entry = REML; percentile bootstrap in parentheses).

https://doi.org/10.1371/journal.pone.0259960.t009

Real data application

The application of the bootstrap by means of MINQUE to real data is demonstrated in this section. A two-level model was fitted to a subsample drawn from the High School & Beyond (HSB) data. The data have two levels, i.e. the school level and the student level. The HSB data consist of 7185 students nested within 160 schools and contain four level 1 (individual level) variables and six level 2 (group level) variables in total. For the purpose of illustrating the bootstrap by means of MINQUE, only 30 schools were drawn randomly from the 160 schools. This gives 1447 level 1 units and 30 level 2 units. The students' MATH ACHIEVEMENT SCORE was taken as the response variable, SES was selected as the level 1 variable and MEANSES was selected as the level 2 variable. The two-level model used in this real data application is given below.

Level-1 model

$$Y_{ij} = \beta_{0j} + \beta_{1j}\,\text{SES}_{ij} + e_{ij} \tag{9}$$

Level-2 models

$$\beta_{0j} = \gamma_{00} + \gamma_{01}\,\text{MEANSES}_{j} + u_{0j}, \qquad \beta_{1j} = \gamma_{10} + \gamma_{11}\,\text{MEANSES}_{j} + u_{1j} \tag{10}$$

The combined model can be written as

$$Y_{ij} = \gamma_{00} + \gamma_{10}\,\text{SES}_{ij} + \gamma_{01}\,\text{MEANSES}_{j} + \gamma_{11}\,\text{MEANSES}_{j}\,\text{SES}_{ij} + u_{0j} + u_{1j}\,\text{SES}_{ij} + e_{ij} \tag{11}$$

where $Y_{ij}$ is the math achievement score of student i in school j.

REML and the bootstrap by means of MINQUE were used to estimate both fixed and random effects from the HSB 30-school data set for the model in equation (11). The SAS procedure PROC MIXED was used to obtain the REML estimates and their standard errors. The REML confidence intervals were then constructed for each parameter using normal theory. For all eight parameters in model (11), B = 1000 bootstrap replicates were obtained using the cases bootstrap. The mean of the 1000 estimates was then taken as the bootstrap estimate, so the bootstrap estimate of any parameter is the average of one thousand estimates, whereas a single estimate of each parameter was obtained under REML. Bootstrap confidence intervals were constructed for each parameter in the model by using the percentile method. The data set of 30 schools randomly selected from the 160 schools is presented in Table 10.
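The two kinds of interval used here can be sketched as follows. This is a minimal Python illustration rather than the SAS code used in the study; the function names are hypothetical, the replicate values are synthetic placeholders, and the normal-theory interval is assumed to be the usual estimate ± 1.96 standard errors.

```python
import numpy as np

def percentile_ci(boot_estimates, level=0.95):
    """Percentile bootstrap CI: empirical quantiles of the B replicate estimates."""
    alpha = 1.0 - level
    lo, hi = np.quantile(boot_estimates, [alpha / 2.0, 1.0 - alpha / 2.0])
    return lo, hi

def normal_theory_ci(estimate, se, z=1.959964):
    """Normal-theory CI: estimate +/- z * standard error (z is the 97.5% normal quantile)."""
    return estimate - z * se, estimate + z * se

# Synthetic replicate estimates standing in for B = 1000 bootstrap estimates of one parameter.
rng = np.random.default_rng(2)
replicates = rng.normal(loc=12.0, scale=0.4, size=1000)
boot_lo, boot_hi = percentile_ci(replicates)
reml_lo, reml_hi = normal_theory_ci(estimate=12.0, se=0.5)
```

The percentile interval makes no symmetry assumption, which is why it pairs naturally with an estimator that does not rely on normality.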

Table 10. High school & beyond data (30 schools data set).

https://doi.org/10.1371/journal.pone.0259960.t010

Table 11 gives the estimates and estimated standard errors under REML and the bootstrap by means of MINQUE, together with the 95% CIs. There is little to choose between the two procedures as far as the accuracy of the estimates is concerned. However, the standard errors of both fixed and random effects estimates were lower under the bootstrap by means of MINQUE, and the REML CIs were clearly wider than the percentile bootstrap CIs. Overall, for the real data, the bootstrap by means of MINQUE performs better than REML, especially in terms of precision. The simulation results also showed that the bootstrap by means of MINQUE outperformed REML, particularly in terms of the standard errors of the estimates.

Table 11. Fixed and random effects parameter estimates and CI Limits under both REML and bootstrap by means of MINQUE methods of estimation for real data.

https://doi.org/10.1371/journal.pone.0259960.t011

Conclusion

REML produced unbiased fixed effects estimates at the second and third levels of the number of groups factor (100 and 120). On the other hand, the bootstrap fixed effects estimates were unbiased across all conditions. Additionally, the bootstrap procedure outperformed REML in terms of accuracy of the random effects estimates when the number of groups was 30. Based on these normal data results, at least 30 groups are recommended to obtain unbiased fixed effects estimates and their standard errors under REML, and 100 groups to achieve accurate random effects estimates and their standard errors under REML. The bootstrap by means of MINQUE can be superior to REML when the number of groups is 30 and normality holds.

In general, under REML the estimates and estimated standard errors were biased for the two skewed distributions when the number of groups was 30. On the other hand, the bootstrap estimates and estimated standard errors were unbiased across all conditions. Put differently, the coverage rates of the bootstrap fixed and random effects estimates were not only acceptable but also superior to those of the REML estimates across all conditions. Furthermore, the coverage rates of the REML level 2 random effects estimates were unacceptable across all conditions under both skewed distributions. The real data results agree with the simulation results.

On the basis of these results, it is recommended that REML should not be used, particularly for inference, whenever the data come from skewed distributions and the normality assumption does not hold. In such situations, the bootstrap standard errors by means of MINQUE can be used for inference to achieve precise results.

Supporting information

S1 File.

https://doi.org/10.1371/journal.pone.0259960.s001

(RAR)

References

1. Snijders TA, Bosker RJ. Multilevel analysis: An introduction to basic and advanced multilevel modeling.
2. Raudenbush SW, Bryk AS. Hierarchical linear models: Applications and data analysis methods. Sage; 2002. https://doi.org/10.2466/pms.2002.94.2.671 pmid:12027363
3. Maas CJ, Hox JJ. Robustness issues in multilevel regression analysis. Statistica Neerlandica. 2004 May;58(2):127–37.
4. Wen H, Zhang X, Zeng Q, Sze NN. Bayesian spatial-temporal model for the main and interaction effects of roadway and weather characteristics on freeway crash incidence. Accident Analysis & Prevention. 2019 Nov 1; 132:105249. pmid:31415995
5. Browne WJ, Draper D. A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Bayesian analysis. 2006;1(3):473–514.
6. Rao CR. Estimation of heteroscedastic variances in linear models. Journal of the American Statistical Association. 1970 Mar 1;65(329):161–72.
7. Bagaka’s JG. Two-level nested hierarchical linear model with random intercepts via the bootstrap.
8. Swallow WH, Monahan JF. Monte Carlo comparison of ANOVA, MIVQUE, REML, and ML estimators of variance components. Technometrics. 1984 Feb 1;26(1):47–57.
9. Meijer E, Van Der Leeden R, Busing FM. Implementing the bootstrap for multilevel models. Multilevel Modelling Newsletter. 1995;7(2):7–11.
10. Carpenter J, Goldstein H, Rasbash J. A non-parametric bootstrap for multilevel models. Multilevel modelling newsletter. 1999;11(1):2–5.
11. Hutchison D, Morrison J, Felgate R. Bootstrapping the effects of measurement errors. Multilevel Modelling Newsletter. 2003; 15:2–10.
12. Wang J, Carpenter JR, Kepler MA. Using SAS to conduct nonparametric residual bootstrap multilevel modeling with a small number of groups. Computer methods and programs in biomedicine. 2006 May 1;82(2):130–43. pmid:16569459
13. Delpish AN. Comparison of estimators in hierarchical linear modeling: Restricted maximum likelihood versus bootstrap via minimum norm quadratic unbiased estimators.
14. Ali S, Ali A, Khan SA, Hussain S. Sufficient sample size and power in multilevel ordinal logistic regression models. Computational and mathematical methods in medicine. 2016 Sep 22;2016. pmid:27746826
15. Ali A, Ali S, Khan SA, Khan DM, Abbas K, Khalil A, Manzoor S, Khalil U. Sample size issues in multilevel logistic regression models. PloS one. 2019 Nov 22;14(11): e0225427. pmid:31756205
16. Zeng Q, Wang X, Wen H, Yuan Q. An empirical investigation of the factors contributing to local-vehicle and non-local-vehicle crashes on freeway. Journal of Transportation Safety & Security. 2020 Jul 3:1–5. pmid:27648455
17. Zeng Q, Guo Q, Wong SC, Wen H, Huang H, Pei X. Jointly modeling area-level crash rates by severity: a Bayesian multivariate random-parameters spatio-temporal Tobit regression. Transportmetrica A: Transport Science. 2019 Nov 29;15(2):1867–84.

