
Inferential procedures for random effects in generalized linear mixed models

  • Xu Ning — Roles: Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing. Email: nickson.ning@unsw.edu.au

  • Francis K.C. Hui — Roles: Conceptualization, Formal analysis, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing

  • Alan Welsh — Roles: Conceptualization, Formal analysis, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing

Xu Ning, Francis K.C. Hui, and Alan Welsh contributed equally to this work. Affiliation (all authors): Research School of Finance, Actuarial Studies and Statistics, The Australian National University, Canberra, ACT, Australia

Abstract

We study three commonly applied measures of uncertainty for random effects prediction in generalized linear mixed models (GLMMs), namely the unconditional and conditional mean squared errors of prediction (UMSEP and CMSEP, respectively), and the unconditional variance of the prediction gap used by the popular R package glmmTMB. We demonstrate that, although the three theoretical measures differ in how they quantify uncertainty, the resulting estimators all turn out to be very similar in form. We derive asymptotic results regarding the consistency of the three measures of uncertainty, and in doing so resolve a contradiction between theoretical and empirical results for the glmmTMB variance estimator by re-interpreting it conditionally on a finite subset of the random effects. Our results have important implications for predictive inference in GLMMs, particularly around the legitimacy and implications of coupling these measures with a normality assumption to construct prediction intervals for the random effects.

1 Introduction

Generalized linear mixed models (GLMMs) are widely used to analyze clustered data, e.g., in longitudinal and multilevel studies [1–3]. In GLMMs, unobserved random effects included in the linear predictor induce correlations between observations within a cluster; these random effects play an important role in many applications, e.g., predictions in small-area estimation [4,5], and their realized values or functions thereof are often of direct interest to domain experts [5–7]. The empirical distribution of the predicted random effects is also often examined in model diagnostics, e.g., [8–10], to assess whether the normality assumption for the random effects is reasonable.

For the empirical Bayes and Laplace predictors of the random effects [11,12], several measures of uncertainty have been proposed for performing inference about the random effects in a GLMM. Arguably the two most recognized ones are the unconditional (i.e., neither responses nor random effects are conditioned on) mean squared error of prediction (UMSEP) [13], and the conditional (on the responses from a finite subset of clusters) mean squared error of prediction (CMSEP) [14]. The UMSEP was first explored for the special case of linear mixed models (LMMs) by [13] and [15], where Taylor expansions are used to provide approximations to the UMSEP assuming the variance components are unknown. Similar approaches have since been suggested to approximate the UMSEP in GLMMs; see for instance [16,17]. For GLMMs, [14] introduced the CMSEP and derived an estimator of it for a linear combination of fixed and random effects within a given cluster, conditional on the responses in that cluster. While arguments have been provided for favoring either UMSEP or CMSEP, e.g., [6], to date there is no consensus as to which is the more appropriate measure of uncertainty for random effects predictions in GLMMs. Also, when using CMSEP and UMSEP to construct prediction intervals, normality of the prediction gap, i.e., the difference between the predicted and realized random effect, is commonly assumed in practice, e.g., [18–20].

Most software used to fit GLMMs does not employ UMSEP or CMSEP for quantifying uncertainty of the prediction gap. For example, both the R packages lme4 [21] and glmmTMB [3,12] instead attempt to compute the unconditional variance of the prediction gap, which differs from the UMSEP by a squared bias term, using two different approximations: the lme4 package considers the second derivative (with respect to the random effects) of the joint likelihood of the data and the random effects, while glmmTMB includes an extra correction term to account for estimation of the fixed effects. To construct prediction intervals for the random effects, both packages (again) couple their respective estimators with an assumption of normality of the prediction gap. We refer to such intervals as normal intervals constructed using the glmmTMB/lme4 estimator. We refer the reader to [22,23] among others for instances where such prediction intervals have been used in real GLMM analyses. Furthermore, we acknowledge the recent work of [20] and [24] on asymptotic distributions for random effects in LMMs specifically; these are however not directly applicable to the GLMM, i.e., non-normal response, context.

Even though the UMSEP, CMSEP, and unconditional variance of the prediction gap are all distinct methods for quantifying uncertainty in random effects predictions, remarkably their estimators end up being very similar to each other. Also, the coverage probability of prediction intervals constructed using glmmTMB (but not lme4) tends to be close to the nominal level in practice, when both the number of clusters and cluster sizes in the GLMM are large. This is despite there being no formal consistency results for the estimators of prediction error used by CMSEP, UMSEP, glmmTMB or lme4, as well as no justification for the assumption of normality for the prediction gap. The empirical results are made even more surprising by the recent work of [25], who showed the prediction gap is not in general asymptotically normal, but instead converges to a convolution of a normal distribution and a normal scale-mixture distribution. In this article, we will refer to this contradiction between the strong empirical performance of glmmTMB and the theoretical results of [25] as the “normality paradox”.

1.1 A motivating example

To offer a concrete example which illustrates some of the above conceptual/theoretical issues and their implications for GLMM analysis, we consider data for 65 patients from a clinical trial comparing two groups (bolus/lock-out combinations). The data can be found in the cold package in R [26], and contain the number of bolus requests per interval in 12 successive four-hourly intervals following abdominal surgery. Thus, there are 12 observations (time points) of the number of requests, a discrete count response, for each individual. Preliminary analysis suggests the bolus/lock-out combination the patient belonged to does not contribute to the number of requests. Hence for illustrative purposes, we do not include a group effect, and include time as the only covariate in our analysis. Let $y_{ij}$ denote the number of requests of the $i$th individual in the $j$th period, for $i = 1, \ldots, 65$ and $j = 1, \ldots, 12$.

We model this dataset using a GLMM. Assuming the individuals (clusters) are independent, we use a Poisson GLMM with canonical log link function, such that $\log(\mu_{ij}) = \beta_0 + \beta_1 j + b_{0i} + b_{1i} j$, where $\mu_{ij} = E(y_{ij} \mid b_i)$ denotes the conditional mean response and $\beta_0$ and $\beta_1$ denote a fixed intercept and slope for the time period, $j$, respectively. Independently and identically distributed (i.i.d.) normal random intercepts and random slopes for time are also included, $b_i = (b_{0i}, b_{1i})^\top \sim N(0, G)$, for some unstructured $2 \times 2$ random effects covariance matrix $G$.
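To make the model concrete, the following minimal sketch simulates data from a Poisson GLMM of this form (the intercept, slope, and covariance values below are hypothetical placeholders, not estimates from the bolus data):

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 65, 12                          # clusters (patients) and cluster size (time periods)
beta = np.array([1.0, -0.1])           # hypothetical fixed intercept and time slope
G = np.array([[0.50, 0.05],            # hypothetical 2x2 unstructured random effects
              [0.05, 0.05]])           # covariance matrix

t = np.arange(1, n + 1)                # time periods j = 1, ..., 12
X = np.column_stack([np.ones(n), t])   # shared 12x2 design matrix (fixed = random)

b = rng.multivariate_normal(np.zeros(2), G, size=m)  # i.i.d. N(0, G) random effects
eta = X @ (beta[:, None] + b.T)        # linear predictors, shape (12, 65)
y = rng.poisson(np.exp(eta)).T         # counts y_ij via the canonical log link

print(y.shape)  # (65, 12)
```

Each row of `y` then plays the role of one patient's 12 successive counts.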

We are interested in constructing prediction intervals for the random effects of the individuals. Under the GLMM structure above, all individuals have the same design matrix for both their fixed and random effect covariates, namely the $12 \times 2$ matrix whose first column is a column of ones and whose second column is $(1, 2, \ldots, 12)^\top$. Together with the assumption that the individuals (clusters) are independent, it follows that individuals are exchangeable, since there is no information distinguishing between them. If we work in an unconditional framework (i.e., neither responses nor random effects are conditioned on), then we should expect the prediction gap, i.e., the difference between the predictor of the random effects and the true realized random effects, to have the same marginal distribution across individuals. Put another way, lme4 and glmmTMB (both of which aim to estimate the unconditional variance of the prediction gap) and UMSEP should all produce prediction intervals which have the same length across all individuals. However, CMSEP should produce prediction intervals that differ between individuals (unless the realized responses are identical).

Fig 1 presents the results for the random intercepts, $b_{0i}$, and random slopes, $b_{1i}$, of each individual, based on fitting the proposed Poisson GLMM using lme4 and glmmTMB. Note the latter package produces results that are identical to those using the estimators for CMSEP and UMSEP; see Sect 2.1 for a detailed derivation of this. For both packages, the interval lengths for each individual differ substantially; this contradicts the unconditional interpretation that prediction intervals constructed from these packages (as well as from UMSEP) aim to produce. Indeed, the black horizontal line shown in Fig 1 represents results for an unconditional interval derived in [25] based on a normal scale-mixture distribution. These unconditional intervals do not rely on first deriving the UMSEP or unconditional variance and then applying a distributional assumption; rather, the bounds are determined directly from the quantiles of the appropriate normal scale-mixture distribution.

Fig 1. Estimated 95% prediction interval lengths for each individual.

Horizontal black line represents the length of the unconditional interval as derived in [25].

https://doi.org/10.1371/journal.pone.0320797.g001

It is also worth noting glmmTMB produces consistently wider prediction intervals than lme4, even though both packages use the Laplace approximation to fit the GLMM and obtain almost identical point estimates of the model parameters and predictions of the random effects (see Sect B of the supplementary material). Thus, a possible implication is that either glmmTMB is overcovering or lme4 is undercovering. In summary, this simple motivating example raises some interesting questions regarding random effects inference in GLMMs, notably: what are these prediction intervals from glmmTMB and lme4 estimating (i.e., what are they consistent for), and how should they be interpreted?

1.2 Main contributions

This article aims to resolve the lack of consensus on random effects inference in the GLMM literature, to provide a deeper understanding of how uncertainty is quantified and estimated for random effects predictions, and to clarify the appropriate way to interpret and use these measures in practice. The main contributions are as follows:

  1. We study proposed estimators of UMSEP and CMSEP for the empirical Bayes and Laplace predictors of the random effects respectively, establishing new connections between these methods for constructing and interpreting prediction inference, and the practical approaches used by glmmTMB and lme4;
  2. Leveraging recent theoretical work in [25], we reconcile the normality paradox and offer a formal justification and correct interpretation for the glmmTMB (but not the lme4) approach to constructing prediction intervals of random effects in GLMMs. This is done by deriving novel asymptotic consistency and distributional results for the prediction gap in a sampling framework conditional on a finite subset of the random effects, which is an extension to the results obtained under an unconditional sampling framework in [25];
  3. Through a simulation study along with several examples, we offer theoretical justification for, and discuss the practical implications of, applying the glmmTMB approach/CMSEP/UMSEP coupled with a normality assumption for prediction inference in GLMMs;
  4. Focusing particularly on prediction interval lengths, we offer new insights into differences that arise when performing inference in an unconditional framework versus a conditional framework, where the conditioning variables are either the true random effects or the responses of a finite subset of clusters.

The remainder of this article is structured as follows. In Sect 2.1, we define independent cluster GLMMs and discuss two common methods of estimation. Sect 2.2 develops the various measures of prediction uncertainty under consideration. Sect 3.1 introduces consistency and distributional results for the prediction variance estimators used in glmmTMB, and resolves the normality paradox. Sect 3.2 presents results from a simulation study that empirically verifies our theoretical developments, and in the Discussion we examine the implications of our results.

2 Materials and methods

2.1 Generalized linear mixed models

We study the independent cluster generalized linear mixed model, often applied in longitudinal studies and official statistics for small-area estimation, among other fields. For $j = 1, \ldots, n_i$ and $i = 1, \ldots, m$, let $y_{ij}$ denote the $j$th measurement of cluster $i$, $x_{ij}$ denote a $p$-vector of fixed effect covariates, and $z_{ij}$ denote a $q$-vector of random effect covariates. Define $N = \sum_{i=1}^{m} n_i$ to be the total sample size, $\bar{n} = N/m$ to be the mean cluster size, $n_{\min} = \min_i n_i$, and $n_{\max} = \max_i n_i$, and assume the $m$ clusters are independent. Conditional on a $q$-vector of random effects $b_i$, the responses $y_{i1}, \ldots, y_{in_i}$ are assumed to be independent observations from the exponential family with mean $\mu_{ij}$ and dispersion parameter $\phi$. That is, $f(y_{ij} \mid b_i) = \exp[\{y_{ij}\theta_{ij} - a(\theta_{ij})\}/\phi + c(y_{ij}, \phi)]$, for known functions $a(\cdot)$, $c(\cdot)$, such that for a given link function $g(\cdot)$ we have $g(\mu_{ij}) = \eta_{ij} = x_{ij}^\top \beta + z_{ij}^\top b_i$, where $\eta_{ij}$ denotes the linear predictor and $\beta$ denotes a $p$-vector of unknown fixed effect coefficients. Commonly used distributions within the exponential family include the normal, Poisson, Binomial, and Gamma distributions. The random effects are assumed to be independently drawn from a multivariate normal distribution with zero mean vector and unstructured random effects covariance matrix $G$, that is, $b_i \sim N_q(0, G)$.
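For instance, the Poisson distribution fits this exponential family form with canonical parameter $\theta = \log\mu$, $a(\theta) = \exp(\theta)$, $\phi = 1$, and $c(y, \phi) = -\log y!$. A quick self-contained numerical check (a sketch for illustration, not code from the paper):

```python
import math
import numpy as np

# Exponential family form: log f(y) = {y*theta - a(theta)}/phi + c(y, phi).
# Poisson with canonical log link: theta = log(mu), a(theta) = exp(theta),
# phi = 1, and c(y, phi) = -log(y!).
mu = 3.7
theta = math.log(mu)
y = np.arange(10)
log_c = -np.array([math.lgamma(k + 1) for k in y])          # -log(y!)
logpmf_expfam = y * theta - math.exp(theta) + log_c

# Direct Poisson pmf for comparison: mu^y * exp(-mu) / y!
pmf_direct = np.array([mu**k * math.exp(-mu) / math.factorial(k) for k in y])

assert np.allclose(np.exp(logpmf_expfam), pmf_direct)
print("Poisson matches the exponential family form")
```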

Let $X_i = (x_{i1}, \ldots, x_{in_i})^\top$ denote the $n_i \times p$ matrix of fixed effect covariates, and $Z_i = (z_{i1}, \ldots, z_{in_i})^\top$ the $n_i \times q$ matrix of random effect covariates, so that for each cluster, the mean model for the GLMM can be written as $g(\mu_i) = X_i\beta + Z_i b_i$ for $i = 1, \ldots, m$, where $\mu_i = (\mu_{i1}, \ldots, \mu_{in_i})^\top$ and $g(\cdot)$ is applied element-wise. We can further concatenate across clusters and write $g(\mu) = X\beta + Zb$, where $\mu = (\mu_1^\top, \ldots, \mu_m^\top)^\top$, $X = (X_1^\top, \ldots, X_m^\top)^\top$, $Z = \operatorname{bdiag}(Z_1, \ldots, Z_m)$, and $b = (b_1^\top, \ldots, b_m^\top)^\top$, where $\operatorname{bdiag}(\cdot)$ denotes the block-diagonal matrix operator. Here, $X$ is of dimension $N \times p$, and $Z$ is an $N \times mq$ sparse block-diagonal matrix, with at most $q$ non-zero components per row.

Let $f(y_i \mid b_i; \beta, \phi) = \prod_{j=1}^{n_i} f(y_{ij} \mid b_i)$ denote the conditional likelihood of cluster $i$, and let $\psi = (\beta^\top, \operatorname{vech}(G)^\top, \phi)^\top$ denote the full vector of model parameters. The marginal log-likelihood function for the independent cluster GLMM is given by

$$\ell(\psi) = \sum_{i=1}^{m} \log \int f(y_i \mid b_i; \beta, \phi)\, f(b_i; G)\, db_i, \qquad (1)$$

where $f(b_i; G)$ denotes the $N_q(0, G)$ density, and vech(·) denotes the half-vectorization operator which stacks elements from the lower triangular portion of a (symmetric) matrix, column-wise, into a vector. The integral in (1) is not available analytically except in the special case of a normal response with an identity link function, i.e., LMMs. Many methods have been devised in order to compute approximate maximum likelihood estimators of the unknown parameters [27–29]. Of these, the most widely used is the Laplace-approximated marginal log-likelihood, which is employed in the R packages lme4 and glmmTMB for fitting GLMMs. This approach is obtained by approximating the logarithm of the integrand of (1) with a quadratic function around its mode. Denote $l(b, \psi)$ as the joint log-likelihood function of the responses and random effects, and let $\partial$ denote a partial derivative and $d$ denote a total derivative. The Laplace approximation to the marginal log-likelihood can be derived as

$$\ell_{\mathrm{LA}}(\psi) = \frac{mq}{2}\log(2\pi) + l\{b(\psi), \psi\} - \frac{1}{2}\log\det\left[-\frac{\partial^2 l\{b(\psi), \psi\}}{\partial b\, \partial b^\top}\right], \qquad (2)$$

where $b(\psi)$ satisfies $\partial l(b, \psi)/\partial b = 0$, i.e., it is the maximizer of $l(b, \psi)$ over $b$ for a given $\psi$. The Laplace estimates of the model parameters are then obtained as $\hat{\psi}_{\mathrm{LA}} = \arg\max_\psi \ell_{\mathrm{LA}}(\psi)$, and Laplace predictors of the random effects as $\hat{b} = b(\hat{\psi}_{\mathrm{LA}})$.
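To make the two-step Laplace recipe concrete, the following self-contained sketch treats a single cluster of a random-intercept Poisson GLMM (all values hypothetical; this is an illustration, not the packages' implementation). The inner step finds the mode of the joint log-likelihood over the random effect by Newton's method, the outer step adds the Gaussian-curvature correction, and a fine-grid numerical integral of the marginal likelihood serves as the reference:

```python
import math
import numpy as np

# One cluster of a Poisson GLMM with a single random intercept b ~ N(0, sigma2):
# l(b, psi) = sum_j {y_j*(beta + b) - exp(beta + b) - log(y_j!)}
#             - b^2/(2*sigma2) - 0.5*log(2*pi*sigma2)
y = np.array([2, 3, 1, 4, 2, 5])
beta, sigma2 = 0.8, 0.5
log_fact = sum(math.lgamma(k + 1) for k in y)

def joint_loglik(b):
    eta = beta + b
    return (y.sum() * eta - len(y) * math.exp(eta) - log_fact
            - b**2 / (2 * sigma2) - 0.5 * math.log(2 * math.pi * sigma2))

# Inner step: maximize l(b, psi) over b by Newton's method -> mode b_hat
b_hat = 0.0
for _ in range(50):
    grad = y.sum() - len(y) * math.exp(beta + b_hat) - b_hat / sigma2
    hess = -len(y) * math.exp(beta + b_hat) - 1 / sigma2
    b_hat -= grad / hess

# Outer step: l(b_hat) + 0.5*log(2*pi) - 0.5*log(-d^2 l/db^2), cf. Eq (2) with m = q = 1
neg_hess = len(y) * math.exp(beta + b_hat) + 1 / sigma2
loglik_laplace = (joint_loglik(b_hat) + 0.5 * math.log(2 * math.pi)
                  - 0.5 * math.log(neg_hess))

# Reference: numerical integration of the marginal likelihood over b
grid = np.linspace(-6, 6, 20001)
dx = grid[1] - grid[0]
integrand = np.exp([joint_loglik(b) for b in grid])
loglik_numeric = math.log(integrand.sum() * dx)

print(round(loglik_laplace, 4), round(loglik_numeric, 4))
```

With this cluster size the two values agree to roughly two decimal places, illustrating the accuracy of the approximation for moderate cluster sizes.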

A method closely related to the Laplace approximation is penalized quasi-likelihood (PQL) estimation for GLMMs [30]. By treating the random effects covariance matrix $G$ as known (or fixed at some value), such that $\psi = (\beta^\top, \phi)^\top$, the PQL objective function for an independent cluster GLMM and unknown parameters $\theta = (\beta^\top, b^\top, \phi)^\top$ is defined as $Q(\theta) = \sum_{i=1}^{m} \log f(y_i \mid b_i; \beta, \phi) - \frac{1}{2}\sum_{i=1}^{m} b_i^\top G^{-1} b_i$, and the PQL estimator of the fixed effects, random effects and dispersion parameters is defined as $\hat{\theta} = \arg\max_\theta Q(\theta)$.

For known $G$, the PQL objective function $Q(\theta)$ can also be seen as an approximation to the marginal log-likelihood of the GLMM, since $\ell(\psi) \approx l\{b(\psi), \psi\} - \frac{1}{2}\log\det[-\partial^2 l\{b(\psi), \psi\}/\partial b\, \partial b^\top] + K$, where $K$ is a constant with respect to $\psi$. Then maximizing $Q(\theta)$ jointly over $(\beta, b, \phi)$ amounts to maximizing $l\{b(\psi), \psi\}$ over $\psi$. The PQL objective function thus ignores the log-determinant term in the Laplace-approximated marginal log-likelihood (2), which [30] argue is reasonable if the cluster sizes are large, since in this case the log-determinant term varies slowly as a function of $(\beta, \phi)$. Put another way, the PQL objective function is similar to the Laplace-approximated objective function provided all the $n_i$'s are sufficiently large and growing. For this reason, although the theoretical results in this article for prediction inference are for the PQL estimator, we can reasonably apply them to the estimator based on the Laplace approximation; see also the work of [31,32] as well as the simulation results of Sect 3.2 which corroborate this point.

The Laplace and PQL procedures both provide estimates of the model parameters (i.e., fixed effects and dispersion parameters), as well as predictors of the random effects. For the latter, it is straightforward to show both produce the modal predictor of the true realized random effects, i.e., the predictor is equal to the mode of $f(b \mid y; \hat{\psi})$, the conditional distribution of the random effects given the observed data evaluated at the estimated model parameter values.

2.2 Measures of prediction uncertainty for random effects

In this section, we outline the derivation of three prominent measures of prediction uncertainty in GLMMs: the unconditional and conditional mean squared errors of prediction (UMSEP and CMSEP, respectively), and the unconditional variance of the prediction gap for a given cluster employed by glmmTMB. By synthesizing the existing literature, in Sects 2.2.1-2.2.3 we highlight common steps in the derivation of all three approaches as well as steps where assumptions are made without formal justification. In doing so, we show that although the aims are different for these three measures, the final estimators of each measure are similar and sometimes even identical in form. In Sect 2.2.4, we also discuss the measure of prediction uncertainty used by lme4, and how it relates to the aforementioned three measures.

Let $\psi_0$ denote the true parameter values of the GLMM, $\hat{\psi}$ denote the estimates of the model parameters obtained from either Laplace or PQL estimation, and $b_i$ denote the true realized random effect for the $i$th cluster. The CMSEP, UMSEP and glmmTMB uncertainty measures are defined as $E\{(\hat{b}_i - b_i)(\hat{b}_i - b_i)^\top \mid y_i\}$, $E\{(\hat{b}_i - b_i)(\hat{b}_i - b_i)^\top\}$, and $\operatorname{Var}(\hat{b}_i - b_i)$, respectively. The derivations of the estimators of all three measures follow a similar general approach: all are decomposed into two terms, with one representing the uncertainty in prediction if the model parameters are known, and another representing the uncertainty from having to estimate the model parameters. A Taylor expansion of $b_i(\psi)$ around $\psi_0$ is then applied to the second term. Finally, estimators are proposed for the resulting approximations.

Two further remarks are worth making regarding the derivation of all three measures. First, a common assumption made is that the distribution of the random effects conditional on the observed data is multivariate normal. This implies the mean and mode of the random effects given the observed data are equal. Asymptotic equality of the mean and mode of the true random effects given the observed data can also be rationalized without this multivariate normal assumption, if cluster sizes are growing. A second assumption common to all three derivations is that the conditional variance of the random effects given the observed data is given by the inverse of the negative second derivative (with respect to the random effects) of the joint log-likelihood of the responses and random effects. This assumption is often rationalized by appealing to the Laplace approximation to the (marginal) likelihood, which also relies on growing cluster sizes. To our knowledge, there are no formal results in the literature justifying either of these assumptions.
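The second assumption is in fact exact for LMMs, where the conditional distribution of the random effects given the data is normal. The following self-contained sketch (hypothetical numbers; a single cluster with one random intercept) checks numerically that the conditional variance of the random effect given the data equals the inverse of the negative Hessian of the joint log-likelihood:

```python
import numpy as np

# Random-intercept LMM, one cluster: y_j = beta + b + eps_j, eps_j ~ N(0, sigma2),
# b ~ N(0, tau2). We compare Var(b | y) computed by numerical integration with
# the inverse negative Hessian of the joint log-likelihood l(b, psi) in b.
y = np.array([1.2, 0.7, 1.9, 1.1])
beta, sigma2, tau2 = 1.0, 0.8, 0.5

def joint_loglik(b):
    return -((y - beta - b)**2).sum() / (2 * sigma2) - b**2 / (2 * tau2)

# Posterior variance via a fine grid (normalizing away the constant terms)
grid = np.linspace(-5, 5, 40001)
w = np.exp([joint_loglik(b) for b in grid])
w /= w.sum()
post_mean = (grid * w).sum()
post_var = ((grid - post_mean)**2 * w).sum()

# Inverse of -d^2 l/db^2 = (n/sigma2 + 1/tau2)^{-1}, a constant in b for LMMs
inv_neg_hess = 1.0 / (len(y) / sigma2 + 1.0 / tau2)

print(round(post_var, 6), round(inv_neg_hess, 6))
assert abs(post_var - inv_neg_hess) < 1e-4
```

For GLMMs with non-identity links the Hessian depends on $b$ and the equality only holds approximately, which is where the growing-cluster-size rationale enters.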

We introduce some notation used in the derivations below. Let $b_i(\psi)$, which is a function of both the parameters $\psi$ and the observed data $y_i$, denote a predictor of $b_i$. By construction, the predicted random effects from either the PQL or Laplace procedure in Sect 2.1 can be obtained by maximizing $l(b, \psi)$ over $b$ for any given $\psi$. Thus, they may also be written as $\hat{b}_i = b_i(\hat{\psi})$, where $b_i(\psi)$ is the mode of $f(b_i \mid y_i; \psi)$ and $\hat{\psi}$ can be either the PQL or the Laplace estimator. Lastly, recall $l(b, \psi)$ denotes the joint log-likelihood of the data and random effects, and $\ell(\psi)$ denotes the marginal log-likelihood given by (1).

2.2.1 CMSEP.

We begin by deriving an estimator of CMSEP, based on the work of [14]. For GLMMs, [14] advocated inference should be performed conditional on the observed responses belonging to the same clusters as the random effects of interest. That is, they suggested and studied the quantity , which is the mean squared prediction error of the linear predictor for a particular cluster and some and , conditional on the observations in that cluster. To better highlight the connection between CMSEP and the UMSEP/glmmTMB derivations later, we will apply the same steps as their derivation to a multivariate version of the CMSEP, specifically, the quantity .

We begin by writing

Assume $b_i(\psi) = E(b_i \mid y_i; \psi)$, i.e., the predictor for $b_i$ is the mean of the random effect given the data. Note for the modal predictor considered by [12] for glmmTMB and [16] for UMSEP, this assumption holds if $f(b_i \mid y_i; \psi)$ is multivariate normal. On the other hand, [14] explicitly advocated using the conditional mean $E(b_i \mid y_i; \psi)$ instead of the modal predictor in GLMMs. Since $b_i$ and the responses from the other clusters are conditionally independent given $y_i$, the CMSEP can be further rewritten as

The two terms in the decomposition may be thought of as encapsulating the naive prediction variance arising from predicting $b_i$ when $\psi_0$ is known, plus an adjustment term for needing to estimate $\psi_0$. Again using $b_i(\psi_0) = E(b_i \mid y_i; \psi_0)$, the first term equals the conditional variance $\operatorname{Var}(b_i \mid y_i; \psi_0)$. Also, from the Laplace approximation [27] we know the inverse of the negative Hessian of the joint log-likelihood approximates this conditional variance with a relative error that vanishes as the cluster size grows. Thus, we can estimate the first term by its plug-in version evaluated at $\hat{\psi}$, provided $\hat{\psi}$ is a consistent estimator (conditional on $y_i$) of $\psi_0$.

Turning to , we have from a Taylor expansion of around that

(3)

where the order of the error term on the right hand side is shown in [33]. Since by definition for any ψ, by the multivariate chain rule we obtain , and thus

(4)

By substituting Eq (4) into Eq (3) and noting is block-diagonal, we obtain

where the second equation follows because $b_i(\psi)$ is not a function of $y_{(-i)}$, the $(N - n_i)$-vector formed by deleting $y_i$ from $y$. Since the conditional (on $y_i$) and unconditional variance of $\hat{\psi}$ are in agreement to the order considered in [14], we may subsequently estimate the former by estimating $\operatorname{Var}(\hat{\psi})$. Furthermore, from standard results on maximum likelihood estimation, we can approximate $\operatorname{Var}(\hat{\psi})$ by the inverse of the expected or observed information matrix, after which we obtain the estimate by replacing $\psi_0$ with $\hat{\psi}$. This estimate is reasonable provided both the number of clusters (in order to appeal to standard maximum likelihood asymptotics, e.g., [34]) and the cluster sizes (to guarantee the PQL/Laplace approximation is sufficiently accurate for the true marginal log-likelihood) grow.

Combining the estimators of the two terms above, we obtain the estimator for CMSEP,
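Schematically, the combined estimator has a plug-in form along the following lines (a sketch in our notation of the decomposition described in this subsection; the derivative and variance terms are written generically and are not necessarily the paper's exact display (5)):

```latex
% CMSEP = conditional prediction variance (psi known) + parameter-estimation term,
% each replaced by a plug-in estimate at the estimated parameters psi-hat
\begin{align*}
\mathrm{CMSEP}_i
  &= \operatorname{Var}(b_i \mid y_i;\, \psi_0)
   + \operatorname{Var}\{ b_i(\hat\psi) - b_i(\psi_0) \mid y_i \} \\
  &\approx \left\{ -\frac{\partial^2 l(\hat b_i, \hat\psi)}{\partial b_i \,\partial b_i^\top} \right\}^{-1}
   + \frac{\partial b_i(\hat\psi)}{\partial \psi^\top}\,
     \widehat{\operatorname{Var}}(\hat\psi)\,
     \left\{ \frac{\partial b_i(\hat\psi)}{\partial \psi^\top} \right\}^{\!\top} .
\end{align*}
```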

Remark 1. [14] propose a bootstrap procedure to adjust for the error introduced when replacing $\psi_0$ with $\hat{\psi}$, which is non-negligible to second order when the cluster sizes are fixed. However, this error is negligible when both the number of clusters and cluster sizes are allowed to grow. We will see later that, without the bootstrap adjustment, the CMSEP, glmmTMB, and UMSEP estimators are identical.

We point out two key aspects regarding Eq (5):

  1. [14] do not actually show (5) is a consistent estimator of the true CMSEP; that is, there is no proof that the estimator converges (in an appropriate sense) to the true CMSEP as the number of clusters and cluster sizes grow.
  2. To construct prediction intervals for $b_i$ using the estimated CMSEP, we require a distributional result. For example, it turns out the package glmmTMB uses (5) as if it were a variance, and assumes normality of the prediction gap. Using (5) as if it were a variance and assuming normality would require exact or asymptotic normality of $\hat{b}_i - b_i$ conditional on $y_i$; to our knowledge this has not been proven in the literature (see Sect 3.1 for further discussion).

2.2.2 UMSEP.

While there are several results on the UMSEP for linear predictors in LMMs [13,15,35], few such results exist for GLMMs. Two notable exceptions are the works of [16,17], who consider logistic link binomial GLMMs and estimate the UMSEP by considering a Taylor linearization of the mean responses in the linear predictor, before applying the LMM results of [15]. By contrast, the derivation of UMSEP we develop below more closely follows the work of [13], which does not adjust for the error introduced when replacing $\psi_0$ with $\hat{\psi}$; this error is negligible if both the number of clusters and cluster sizes are allowed to grow.

For UMSEP, the goal is to estimate the quantity . From the law of iterated expectation, we have

where in the second line we have assumed . As in the derivation for CMSEP, this equality holds if is (assumed to be) multivariate normally distributed. As an aside, [13] require the estimator of the random effects covariance parameters to be translation-invariant in order for the cross-product terms to be zero. We avoid this requirement as we consider the UMSEP of the prediction gap as opposed to the linear predictor.

Next, making the assumption again we have , which we subsequently estimate using . Note an equivalent term also appears in the derivation of UMSEP in [16,17], and is estimated in essentially the same way.

Turning to , we first note the inner expectation disappears because conditional on y is not a function of . Next, we make use of the Taylor expansion of around i.e., using Eqs (3) and (4), we obtain

Next, we draw a parallel with the work of [13] and make the following approximations, which are discussed in Sect E of the supplementary material. Assume the smaller order term has finite expectation, treat b ( ψ )  as if it were not a function of y, and assume . By approximating with , we may then approximate directly by

where, similar to the CMSEP derivation, the variance of $\hat{\psi}$ is approximated by the inverse of the expected or observed information matrix. Finally, by replacing $\psi_0$ with $\hat{\psi}$, combining the two terms, and focusing on the $i$th cluster, we obtain the UMSEP estimator

where we note is block-diagonal in structure. It is not hard to see the resulting estimator of UMSEP is identical to the CMSEP estimator.

Remark 2. Since

then one may view the CMSEP as a ‘central approximation’ [12] for the UMSEP, and in this sense it is not surprising the estimators are the same. Indeed, approximating by its central approximation (i.e., approximating an expectation by a single realization of a random variable with that expectation), and treating b ( ψ )  as if it is not a function of y in the approximation to , are the steps that make the UMSEP estimator the same as the CMSEP estimator. Thus one way to view the estimator is as a central approximation of the true UMSEP.

In seeking to apply the estimator for the UMSEP as the basis for constructing prediction intervals for random effects in GLMMs, we again run into the same issues faced by the estimator for CMSEP. That is, to our knowledge there is no consistency result for the UMSEP estimator, and neither is there a distributional result for the prediction gap.

2.2.3 The glmmTMB estimator.

We follow the work of [12] and [33] to derive the estimator used in the glmmTMB package for the quantity $\operatorname{Var}(\hat{b}_i - b_i)$, i.e., the unconditional variance of the prediction gap, where $\hat{b}_i$ can be either the PQL or the Laplace predictor. Henceforth, we will also refer to this estimator simply as the glmmTMB estimator.

To begin, from the law of total variance we have
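The law-of-total-variance decomposition referred to here can be sketched as follows (a reconstruction in our notation, not necessarily the paper's exact display; since $\hat b_i$ is a function of $y$, conditioning on $y$ leaves only the variability of $b_i$):

```latex
% Unconditional variance of the prediction gap, split by conditioning on y
\begin{align*}
\operatorname{Var}(\hat b_i - b_i)
  &= \operatorname{E}\{ \operatorname{Var}(\hat b_i - b_i \mid y) \}
   + \operatorname{Var}\{ \operatorname{E}(\hat b_i - b_i \mid y) \} \\
  &= \operatorname{E}\{ \operatorname{Var}(b_i \mid y) \}
   + \operatorname{Var}\{ \hat b_i - \operatorname{E}(b_i \mid y) \}.
\end{align*}
```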

We replace the expectation by its central approximation - see [12] as well as Remark 2 above. Similarly to the UMSEP, this step has no formal justification, and we will see in Sect 3.1 that this step prevents the final estimator from being a consistent estimator of the unconditional variance in general. Next, identical to the derivation of the UMSEP, we approximate the conditional variance of the random effects given the data by the inverse of the negative Hessian of the joint log-likelihood. From this, a reasonable estimator of the first term is thus given by its plug-in version evaluated at $\hat{\psi}$, provided $\hat{\psi}$ is a consistent estimator for $\psi_0$. This is then the same estimator as for U1 in the UMSEP, which is expected since the two terms coincide when the predictor is the conditional mean. It is also analogous to the first term in the derivation of the CMSEP, in the sense that the $i$th diagonal block of the former is the estimator for the latter.

Turning to , write . Assuming is used, then . By applying Eqs (3)–(4) once more, we obtain , which, by treating as if it is not a function of y, can be further approximated as

[33] show that when is the Laplace predictor, , which justifies this approximation. Note the estimator for is, again, similar to the estimator for in the derivation of the CMSEP: the latter is a sub-matrix of the former due to the block-diagonality of .

On combining the estimators for the two terms above, we obtain the estimated unconditional variance of the prediction gap for the full vector of random effects. Finally, to obtain the estimated covariance matrix of $\hat{b}_i - b_i$, we can extract the corresponding sub-matrix; since the estimator is block-diagonal, this gives

Note this is identical to both the estimator of the CMSEP, in Eq (5), and that of the UMSEP.

Remark 3. The assumption is the key step that makes the glmmTMB estimator (6) identical to the UMSEP estimator. It is not surprising these two estimators are identical because

and [33] show . Thus UMSEP and the unconditional variance of the prediction gap only differ by a smaller order term, so they can be estimated in the same way when the number of clusters is large.

The glmmTMB estimator suffers the same two critical problems as the CMSEP and UMSEP estimators: to our knowledge there is no consistency result for it, and neither is there a distributional result for the prediction gap.

2.2.4 lme4.

Finally, we consider the approach taken by the lme4 package, which estimates the unconditional variance of the prediction gap using only the first part of the glmmTMB estimator in (5). [12] claimed the second part of the glmmTMB estimator is negligible in large samples due to the assumed consistency of the model parameter estimates. However, this statement is only true if the number of clusters m grows while all cluster sizes are fixed. On the other hand, growing cluster sizes are required for the consistency of the PQL or Laplace estimators, the latter of which is used for estimation by glmmTMB.

In Sects 3.1 and 3.2, we show the lme4 estimator is, in general, not an adequate measure of prediction uncertainty when the cluster sizes are larger than the number of clusters, and can result in substantial undercoverage when used as the basis for constructing prediction intervals. Indeed, the omission of the second term of the glmmTMB estimator causes the noticeable differences in the prediction interval lengths between the packages lme4 and glmmTMB (as exemplified in Fig 1 shown earlier).

3 Results

3.1 Normality paradox

To construct prediction intervals for the random effects in GLMMs, both the glmmTMB and lme4 packages assume the prediction gap is multivariate normally distributed with mean vector zero and covariance matrix given by their respective estimators, e.g., Eq (5) in the case of glmmTMB. The assumption of normality is also used when using CMSEP and UMSEP to build prediction intervals, e.g., [18,19]. However, as discussed throughout Sect 2.2, there is no formal justification for this assumption under GLMMs. Indeed, recent research by [25] shows the prediction gap is, in general, not asymptotically normal: under an unconditional framework when both $m$ and the cluster sizes grow, and if the random and fixed effects have the same covariates, i.e., are partnered, they prove that the prediction gap derived from the PQL estimator converges in distribution to a convolution of a normal distribution and a normal scale-mixture distribution; see Sect 3.1.2 for a formal definition of this. Recall also this result was applied to obtain the horizontal solid black line in Fig 1, from which it is apparent the unconditional distribution of the prediction gap can yield markedly different prediction intervals compared to glmmTMB and lme4.
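To see what a convolution of a normal with a normal scale-mixture looks like, the following purely illustrative sketch simulates one (the mixing distribution below is hypothetical and is not the one derived in [25]) and checks that it is heavier-tailed than a normal:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Normal scale-mixture: Z * S with Z ~ N(0, 1) and a random scale S > 0
# (hypothetical log-normal mixing distribution, for illustration only)
S = np.exp(rng.normal(0.0, 0.4, size=n))
mixture = rng.normal(size=n) * S

# Convolution with an independent normal component
gap = mixture + rng.normal(0.0, 0.5, size=n)

# A normal has excess kurtosis 0; the convolution is heavier-tailed
kurt = ((gap - gap.mean())**4).mean() / gap.var()**2 - 3
print(round(kurt, 2))
assert kurt > 0.1
```

Quantiles of such a distribution differ from normal quantiles with the same variance, which is why intervals built directly from the scale-mixture convolution (the black line in Fig 1) need not match normal intervals.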

The results from [25] call into question the justification for assuming normality as the basis for constructing prediction intervals. Despite this, however, in practice (e.g., see the simulation study in Sect 3.2) prediction intervals constructed using the glmmTMB estimator under a normality assumption actually achieve close to nominal coverage when both the number of clusters and the cluster sizes in the GLMM are large. As such, there appears to be a contradiction between empirical performance (which suggests assuming normality for the prediction gap is reasonable) and the theoretical results in [25] (which show this is asymptotically incorrect).

In the next subsection, we reconcile this “normality paradox” by providing a formal justification for using the glmmTMB estimator together with a normality assumption on the prediction gap to construct intervals for the random effects. In doing so, we also resolve an issue identified in our motivating example in Sect 1.1, namely that the prediction interval lengths from glmmTMB differ across clusters even when the clusters are exchangeable. This contradicts the stated goal of the glmmTMB estimator as discussed in Sect 2.2.3, which is to estimate the unconditional variance of the prediction gap and produce unconditional intervals. We develop our asymptotic results for the PQL estimator, although in Sect 3.2 we demonstrate empirically that they also apply to the Laplace estimator when m and  grow.

3.1.1 Technical results.

We consider independent cluster GLMMs, and make the following additional simplifying assumptions.

  (S1)  for i = 1 , … , m. That is, all covariates included as fixed effects are also included as random effects, and vice versa. Note that in this partnered case, .
  (S2) The canonical link function is used, so that  for  and i = 1 , … , m.
  (S3)  uniformly for i = 1 , … , m, and the number of clusters and minimum cluster size satisfy , and .
  (S4) The random effects covariance matrix and dispersion parameter are known, such that ψ = β and .

Assumptions (S1)-(S4) are also made in [25]. Assumption (S1), while restrictive, is relevant in practice where, due to a lack of knowledge on which random effects should be included in the model, a fully partnered model is often first fitted [36]. Assumption (S2) is often assumed in GLMM asymptotics, e.g., [32,37]. Assumption (S3) requires cluster sizes to grow at the same rate when deriving asymptotics of the PQL estimator (see also [31]), while the number of clusters cannot grow too fast relative to the cluster size. Finally, assumption (S4) does not detract from the arguments we make in this article given our main focus is on the interpretation of prediction intervals for the random effects. Moreover, it is possible to relax this assumption to employ a working G and ϕ analogous to [25], but we leave this extension as an avenue of future research.

We note that although very small cluster sizes are important in many real data settings, there are increasingly many applications where cluster sizes are not that small and for which the asymptotic approximations and large sample results established in this article are relevant, e.g., educational studies with large numbers of students (units) grouped within schools (clusters) [38], medical studies with large groups (clusters) of patients (units) treated at different hospitals [39], and settings where the data for each cluster are recorded at relatively high temporal frequency [40].

We begin by stating our main consistency result.

Theorem 1. Assume (S1)–(S4) and Conditions (C1)–(C4) in the supplementary material are satisfied. Then for the independent cluster GLMM and conditional on the random effects , it holds that

for all i = 1 , … , m, where is defined in Eq (6).

The above result states that for any cluster, the glmmTMB estimator, and thus the estimators of the CMSEP and UMSEP derived in Sect 2.2, are consistent for the conditional variance of the prediction gap given the random effects of that cluster. Importantly, this is not the unconditional variance that the glmmTMB estimator sets out to estimate, nor is it the UMSEP or the CMSEP. This consistency result is also distinct from the results in [25], which are derived under an unconditional sampling framework. Theorem 1 explains why we observe differing prediction interval lengths for each cluster in Fig 1: even when clusters are exchangeable in the motivating Poisson GLMM example, the true conditional variances differ for each cluster as they depend upon the corresponding random effect . Hence the glmmTMB estimator, which is consistent for this quantity, will also produce different prediction intervals across the clusters.

Remark 4. Theorem 1 is based on conditioning on a finite subset of the random effects, which differs from conditioning on (a finite subset of) the observed y as in [14] and [12]. Put another way, Theorem 1 does not state whether the estimator for the CMSEP in Sect 2.2.1, which happens to take the same form as the glmmTMB estimator, is a consistent estimator of the true CMSEP.

With regard to the true unconditional variance and true UMSEP, we provide the following remark.

Remark 5. The glmmTMB estimator cannot, in general, be a consistent estimator of the unconditional variance of the prediction gap, . Similarly, the glmmTMB estimator cannot be a consistent estimator of the UMSEP, .

The above lack of consistency occurs due to employing the central approximation for (see Remark 2), as converges in probability to , which is unconditionally a random variable and a function of the random effects . By contrast, the correct unconditional variance is actually , which is not a function of . We illustrate the differences between these via a concrete example in Sect 3.1.2.

Next, we offer a distributional result for the prediction gap of the PQL estimator.

Theorem 2. Assume (S1)–(S4) and Conditions (C1)–(C4) are satisfied. Then for the independent cluster GLMM and conditional on the random effects , we have the following for each i = 1 , … , m:

  (a) If , then .
  (b) If , then .
  (c) If , then .

Theorem 2 can be summarized by stating that, conditional on , a correct finite sample approximation for the prediction gap is given by the  distribution. Note Theorem 2c is the only part of the result where the asymptotic distribution does not involve the random effects . An immediate consequence is that the case of  offers one setting where the glmmTMB estimator can support an unconditional interpretation, and is in fact consistent for the unconditional variance of the prediction gap. Again, this asymptotic distributional result is distinct from the result for the prediction gap obtained in [25], which was derived in an unconditional framework and involves a normal scale-mixture distribution.

By combining Theorems 1 and 2, we obtain the following result which resolves the normality paradox.

Corollary 1. Assume (S1)–(S4) and Conditions (C1)–(C4) are satisfied. Then normal prediction intervals constructed using the glmmTMB estimator asymptotically achieve nominal coverage.

Theorems 1 and 2 are conditional on ; however, Corollary 1 uses the fact that correct conditional coverage implies correct unconditional coverage to arrive at an unconditional result. Moreover, taken together the results above imply that one appropriate way to interpret prediction intervals constructed using the glmmTMB estimator plus a normal approximation is to do so conditionally on (a finite subset of) the random effects. By the equivalence of the glmmTMB estimator and the CMSEP and UMSEP estimators, the same interpretation also applies to prediction intervals constructed using the estimators of the CMSEP and UMSEP derived in Sect 2.2. Note this differs from the unconditional interpretation appropriate for prediction intervals based on the results in [25]; this is expected, since the two intervals differ substantially in behavior, as exemplified in Fig 1.
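The step from conditional to unconditional coverage used in Corollary 1 can be made explicit. Writing \(\mathcal{I}_i\) for the prediction interval of the \(i\)th cluster's random effects (our notation here is purely illustrative), the law of total probability together with dominated convergence gives

```latex
\Pr\bigl( \mathbf{b}_i \in \mathcal{I}_i \bigr)
  = \mathrm{E}\bigl[ \Pr\bigl( \mathbf{b}_i \in \mathcal{I}_i \mid \mathbf{b}_i \bigr) \bigr]
  \;\longrightarrow\; \mathrm{E}[\, 1 - \alpha \,] = 1 - \alpha ,
```

since the conditional coverage probabilities are bounded above by one, so that convergence to 1 − α for (almost) every realization of the random effects carries over to their expectation.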

3.1.2 Poisson intercept-only GLMM example.

To offer greater insight into the technical results above, we present a simple but instructive example involving a Poisson intercept-only GLMM. Consider the model , where the mean is modeled as , and . By assumption (S4), we assume the variance component  is known and  for all i = 1 , … , m, i.e., the design is balanced. Using PQL estimation to fit the GLMM, and following the developments of [25], for  we can show the prediction gap for the ith cluster can be written as

(7)

Suppose . Then conditional on , the asymptotic variance of  is straightforwardly seen to be , which agrees with Theorem 2b. When , the second term disappears and the asymptotic variance becomes  as in Theorem 2a. Finally, when , the first term disappears and the asymptotic variance of  is  as in Theorem 2c. Overall, from (7) an appropriate finite sample approximation of the conditional variance of  is given by .
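To make this decomposition concrete, the leading term of the conditional variance can be checked by direct Monte Carlo simulation. The sketch below is in Python purely for illustration (the article's computations use R), and the values of β0, σ², n, and the simplification of treating β0 as known are our own assumptions. It holds a single random intercept fixed, repeatedly redraws the cluster's Poisson responses, computes the penalized conditional mode, and compares the Monte Carlo variance of the prediction gap against the leading-order conditional variance exp(−β0 − b_i)/n:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, sigma2, n = 1.0, 1.0, 2000   # assumed values; beta0 treated as known
b_i = 0.8                           # the fixed, conditioned-on random intercept
reps = 4000

def conditional_mode(y, beta0, sigma2):
    """Newton's method for the penalized conditional mode of b_i given the cluster's responses."""
    s, n_i = y.sum(), len(y)
    b = 0.0
    for _ in range(100):
        mu = np.exp(beta0 + b)
        # Gradient and (negated) Hessian of the Poisson log-likelihood plus N(0, sigma2) penalty
        step = (s - n_i * mu - b / sigma2) / (n_i * mu + 1.0 / sigma2)
        b += step
        if abs(step) < 1e-10:
            break
    return b

gaps = [conditional_mode(rng.poisson(np.exp(beta0 + b_i), size=n), beta0, sigma2) - b_i
        for _ in range(reps)]

emp_var = np.var(gaps)
approx = np.exp(-beta0 - b_i) / n   # leading term of the conditional variance
print(f"Monte Carlo variance: {emp_var:.3e}, approximation: {approx:.3e}")
```

Rerunning with a different value of b_i changes both quantities in tandem, illustrating that the conditional variance genuinely depends on the realized random effect, which is why intervals consistent for this quantity differ in length across clusters.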

Meanwhile, using the derivations from Sect 2.2.3, we can show the glmmTMB estimator for the uncertainty of  is given by . Since  and  converge in probability to  and , respectively, we can further write the glmmTMB estimator as . Ignoring the smaller order term, we see this leads to the same expression as the finite sample approximation of the conditional variance of  based on (7). Thus when , it follows that the glmmTMB estimator is consistent for the conditional variance of , irrespective of the precise rate of . This agrees with Theorem 1.

In the case , and are asymptotically independent because has the same distribution regardless of whether a finite subset of is constant or not. The glmmTMB, CMSEP and UMSEP estimators also reduce to the correct unconditional variance , which is identical to the unconditional result of [25]. Therefore in this particular setting there is no difference between -conditional and unconditional inference. This is in agreement with Theorem 2c.

Consider now deriving an (unnormalized) asymptotic unconditional variance of . This can be explicitly calculated from (7) as

This variance is evidently not , i.e., it is not the glmmTMB estimator and, by extension, not the estimators of UMSEP and CMSEP. Indeed, the glmmTMB estimator is unconditionally a random variable because a central approximation is used (see Remark 2). Moreover, the dependence on the estimated value of  is the reason why prediction intervals constructed using glmmTMB differ in length between clusters.

[25] showed the sum of uncorrelated, (unconditionally) dependent random variables  converges in distribution not to a normal distribution but to a normal scale-mixture distribution . The latter is characterized by P having a  distribution conditional on , and . Let H ( ⋅ )  denote the cumulative distribution function of this normal scale-mixture distribution.

On the other hand, if we condition on , then by the model definition  is a sum of i.i.d. random variables, and is independent of . Thus the central limit theorem applies, and we have an asymptotic normality result for the (appropriately normalized) prediction gap . This agrees with our result in Theorem 2, and together with the consistency established in Theorem 1, explains the correct unconditional coverage stated in Corollary 1.

To better elucidate the differences between inference conditional versus unconditional on  for the Poisson intercept-only GLMM, we now focus on the  case, i.e., when  is the dominating term in (7). First, conditional on , the quantity P converges to a normal distribution with mean zero and variance . As such, at the 100(1 − α)% nominal level, the average length (normalized by n) of the conditional intervals is given by . On the other hand, unconditional on , the length of the asymptotic central prediction interval is given by , by the symmetry of the normal scale-mixture distribution. One can then show that, due to the heavy tails of the normal scale-mixture distribution relative to the normal distribution, the unconditional interval will be longer than the average conditional interval for small values of α, and shorter for large values of α.
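The direction of this discrepancy is easy to verify numerically. In the sketch below (Python for illustration; the mixing variance exp(−β0 − b) with b ~ N(0, σ²) is our own choice, made to mimic the Poisson example above), we compare the average conditional interval length 2·z·E[√v] against the central interval of the normal scale-mixture distribution:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
beta0, sigma2 = 1.0, 1.0
draws = 500_000

# Normal scale mixture: P | b ~ N(0, v), with mixing variance v = exp(-beta0 - b)
b = rng.normal(0.0, np.sqrt(sigma2), size=draws)
v = np.exp(-beta0 - b)
p = rng.standard_normal(draws) * np.sqrt(v)   # draws from the scale mixture

lengths = {}
for alpha in (0.001, 0.05, 0.5):
    z = NormalDist().inv_cdf(1 - alpha / 2)
    cond = 2 * z * np.sqrt(v).mean()                 # average conditional length
    uncond = 2 * np.quantile(np.abs(p), 1 - alpha)   # central mixture interval length
    lengths[alpha] = (cond, uncond)
    print(f"alpha={alpha}: avg conditional {cond:.3f}, unconditional {uncond:.3f}")
```

At α = 0.001 the unconditional interval is markedly longer than the average conditional interval, while at α = 0.5 the ordering reverses, in line with the heavy tails of the scale mixture relative to the normal distribution.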

The discrepancy between the two interval lengths also depends on the value of , and Fig 2 explores this in more detail. For small values of , the expected lengths of the conditional intervals are roughly equal to the lengths of their unconditional counterparts across all significance levels. This is not surprising since, when  is small, the normal scale-mixture distribution is close to a normal distribution. Indeed, the two intervals are identical when . Conversely, when  is large, the expected length of the conditional intervals is greater than that of the unconditional intervals at higher significance levels, and smaller at lower significance levels.

Fig 2. Expected conditional interval lengths versus unconditional interval lengths, for .

https://doi.org/10.1371/journal.pone.0320797.g002

Finally, for a fixed 5% nominal level, Fig 3 examines the empirical distribution of 50,000 simulated conditional interval lengths, compared with the corresponding unconditional interval length. Results show larger values of result in the expected length of the conditional intervals being smaller than their unconditional counterpart. The variance of the conditional intervals’ length increases with increasing , as the empirical distribution of lengths becomes increasingly right-skewed.

Fig 3. Histograms of conditional interval lengths, for a fixed significance level of 5% and .

Blue line is the expected conditional interval length, and red line represents the corresponding unconditional interval length.

https://doi.org/10.1371/journal.pone.0320797.g003

Ultimately, this Poisson intercept-only GLMM example can serve as a litmus test for practitioners to understand the consequences of performing analysis conditional on (a subset of) the random effects or not. Specifically, conditional inference implies prediction intervals will differ in length across each cluster, while unconditional inference implies prediction intervals will be of the same length. This may influence a practitioner’s decision on which inferential framework to work under.

3.2 Simulation study

We performed a simulation study to empirically verify the differences between the glmmTMB estimator and the results presented in [25]. Note [25] derived results for the PQL estimator, but for the purposes of this study we applied them directly to the Laplace estimator. The PQL and Laplace estimators are very similar when cluster sizes are sufficiently large (see Sect C of the supplementary material).

We simulated data from an independent cluster GLMM with fixed and random effect covariates as follows. First, we generated the fixed effect covariates  by setting the first element equal to one for a fixed intercept, and simulating the remaining elements from a standard normal distribution. As in assumption (S1), we then set the random effect covariates . Next, we set the 2-vector of true fixed effect coefficients as , the 2 × 2 random effects covariance matrix , and simulated the random effects as . Finally, conditional on , the responses were generated from a Poisson distribution with log link, i.e., . We varied the number of clusters as m = { 25 , 50 , 100 , 200 }  and the cluster sizes . For each combination of  ( m , n ) , we simulated 1000 datasets.
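For readers wishing to reproduce the design, the data-generating process just described can be sketched as follows. This is an illustrative Python translation (the study itself uses R), and the coefficient vector and covariance matrix below are placeholders standing in for the true values set above:

```python
import numpy as np

def simulate_dataset(m, n, beta, G, rng):
    """One dataset from the independent cluster Poisson GLMM with partnered covariates."""
    L = np.linalg.cholesky(G)
    data = []
    for _ in range(m):
        # Fixed effect covariates: intercept plus a standard normal covariate;
        # by assumption (S1) the random effect covariates are identical (partnered design).
        X = np.column_stack([np.ones(n), rng.standard_normal(n)])
        b = L @ rng.standard_normal(2)              # b_i ~ N(0, G)
        y = rng.poisson(np.exp(X @ beta + X @ b))   # Poisson responses, log link
        data.append((X, y, b))
    return data

rng = np.random.default_rng(2024)
beta = np.array([0.1, 0.1])                 # placeholder fixed effect coefficients
G = np.array([[0.5, 0.25], [0.25, 0.5]])    # placeholder random effects covariance
datasets = [simulate_dataset(m=25, n=25, beta=beta, G=G, rng=rng) for _ in range(3)]
print(len(datasets), len(datasets[0]), datasets[0][0][1].shape)
```

Each simulated dataset would then be passed to the model-fitting routine, with the above repeated 1000 times per (m, n) combination.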

For each simulated dataset, we fitted the corresponding GLMM using glmmTMB with  set as known. We then examined the unconditional empirical coverage probability of 95% prediction intervals constructed from the glmmTMB estimator (as derived in Sect 2.2.3), and compared these to prediction intervals constructed using the results in [25] as well as from lme4. Note that to construct the unconditional intervals of [25], we computed the relevant asymptotic variance and the quantiles of the corresponding normal scale-mixture distribution via direct simulation with 10,000 samples. For simplicity, we focus on the prediction intervals for the random effects of the first cluster , and for the quantities  and .

For the random effect predictions, the empirical coverage probabilities of the glmmTMB intervals and the unconditional intervals of [25] were fairly close to the nominal level across all combinations of  ( m , n ) , with the exception of a tendency to overcover when n = 25 (Table 1). This is concordant with our theorems in Sect 3.1 and Corollary 1. By contrast, the prediction intervals from lme4 undercover severely when n grows at a rate faster than or equal to m, in agreement with the discussion in Sect 2.2.4. Turning to prediction interval widths, the conditional intervals from glmmTMB had a smaller average length than the unconditional intervals of [25] (Table 2), although the unconditional intervals also had higher coverage in this case. On the other hand, Table 3 presents the same results scaled by n, from which we see the variance of the interval lengths using the glmmTMB and lme4 estimators remains roughly constant as  ( m , n )  grow. This implies the standard errors are converging to a random variable rather than a constant.

Table 1. Empirical coverage probabilities of prediction intervals for , constructed using glmmTMB, lme4, and the unconditional approach of [25].

https://doi.org/10.1371/journal.pone.0320797.t001

Table 2. Empirical average interval lengths of prediction intervals for , constructed using glmmTMB, lme4, and the unconditional approach of [25].

https://doi.org/10.1371/journal.pone.0320797.t002

Table 3. Scaled empirical interval length variance of prediction intervals for , separately constructed using glmmTMB and lme4.

https://doi.org/10.1371/journal.pone.0320797.t003

4 Discussion

In this article, we examined several well-known measures of prediction uncertainty for predictions of the random effects in GLMMs, namely estimators of the CMSEP and UMSEP, as well as the estimators used in the software packages glmmTMB and lme4. We demonstrated that the first three measures arrive at the same estimator despite having different aims, while the lme4 estimator differs from the other three. When the fixed and random effects are partnered, the lme4 estimator results in severe undercoverage of the true random effects when the cluster size grows faster than the number of clusters. We also leveraged the theoretical results of [25] to explain why using the glmmTMB estimator together with a normality assumption yields asymptotically correct inference for the true random effects. Our derivations showed that this glmmTMB procedure can be asymptotically justified by conditioning on the random effects for a finite number of clusters; this is contrary to the original underlying goal of glmmTMB and the UMSEP, which is to make unconditional inference. Through a motivating example and simulation studies, we empirically demonstrated the differences between the conditional inference provided by typical software packages and correct unconditional inference. Moreover, the litmus test given by the Poisson intercept-only GLMM (based on the exchangeability of clusters) leads to prediction intervals of the same length for the random intercepts of any cluster when working in an unconditional framework. On the other hand, inference conditional on  allows prediction interval lengths to differ across clusters.

It is important to note that the theoretical developments in this article suggest an order of magnitude difference between m and n is unlikely to greatly degrade the asymptotic approximations which lme4 and glmmTMB essentially use, so long as the actual cluster sizes themselves (regardless of their ratio to the number of clusters) are sufficiently large for the asymptotic approximations to be accurate. That is, the findings of this work can still be relevant in practical applications where the number of clusters greatly exceeds the cluster size, provided the actual values of the latter are not extremely small. This is based on our theory requiring , i.e., that the number of clusters does not grow faster than the cluster size squared, and indeed our numerical results support this, e.g., coverage of the intervals is close to nominal when m = 200 and n = 25. This is also consistent with recent simulation results in, e.g., [20,41], who consider even smaller cluster sizes of n = 10. Our theory and simulation results further suggest the coverage of lme4 random effects prediction intervals tends toward nominal (from undercoverage), even in the partnered case, as the ratio m ∕ n increases.

Although our conclusions hold under the assumptions made in [25], they offer a first step towards justifying procedures for predictive inference in GLMMs that have previously lacked formal support in the literature. We conjecture some of the assumptions can be relaxed without greatly affecting the results, and establishing this formally is an interesting future direction to pursue. Furthermore, the viability, both in practice and in theory, of inference conditional on y or  is not explored deeply in this article, and is another avenue of further research. Finally, beyond Remark 1 for the CMSEP, it is important to acknowledge that resampling approaches such as the parametric bootstrap and variations thereof are another class of methods used to construct prediction intervals in GLMMs; see for instance [42,43], as well as [44,45] on the use of the jackknife for small area estimation. We conjecture the way in which resampling is performed, e.g., conditional on the random effects or not, is likely to affect the properties of the resulting prediction intervals, and may be closely related to the theory and conclusions reached in this article.

Finally, GLMMs are commonly used in situations where there exist missing data [46,47], and so the extension of our theory and finite sample results to incorporate various missing data patterns is an important avenue of future research. As a starting point, although our theoretical framework is currently not set up to incorporate any missingness in the response or fixed/random effect covariates, the fact that we allow uneven cluster sizes suggests we could rely on the standard literature in the case of Missing Completely at Random (MCAR) data and perform complete case analysis without any theoretical properties changing [47]. On the other hand, we conjecture there are likely to be biases in both point and interval estimates if our major theoretical results are directly applied to the case of Missing Not at Random (MNAR) responses and/or covariates, and this would also be consistent with the standard MNAR literature (e.g., [46,48]). Finally, in the case of Missing at Random (MAR) data, a common approach is to employ some form of multiple imputation [49]. We conjecture our asymptotic developments which condition on , e.g., Theorems 1 and 2, will still be valid for fixed effects (which the  are in this case) inference if classic methods of imputation, such as using Rubin’s rules and increasing the variance of the asymptotic approximating normal distribution, are employed [50]. However, further investigation is required when it comes to random effects inference/prediction intervals in the unconditional case, since the extra variability from imputation would need to be incorporated into our theoretical results, either within the normal mixture distribution component or the normal distribution component of the asymptotic approximating distribution.
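As a point of reference for the multiple imputation route mentioned above, the pooling step of Rubin’s rules combines the M imputation-specific estimates and variances as sketched below; this is a generic, minimal illustration (in Python) and is not specific to GLMMs:

```python
import numpy as np

def rubins_rules(estimates, variances):
    """Pool M imputation-specific estimates and variances via Rubin's rules."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    M = len(estimates)
    qbar = estimates.mean()            # pooled point estimate
    wbar = variances.mean()            # within-imputation variance
    bvar = estimates.var(ddof=1)       # between-imputation variance
    total = wbar + (1 + 1 / M) * bvar  # total variance, inflated for imputation noise
    return qbar, total

# Hypothetical imputation-specific estimates and variances for one fixed effect
est, tot = rubins_rules([1.0, 1.2, 0.9, 1.1], [0.04, 0.05, 0.04, 0.05])
print(est, tot)
```

The between-imputation term is exactly the "increasing the variance" adjustment referenced above: it inflates the pooled standard error to reflect the extra uncertainty introduced by imputing the missing values.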

Supporting information

S1 Fig 1. Comparison of glmmTMB and lme4 point predictions of the random effects.

https://doi.org/10.1371/journal.pone.0320797.s001

(TIF)

S2 Fig 2. Naive interval lengths versus unconditional interval lengths, for .

https://doi.org/10.1371/journal.pone.0320797.s002

(TIF)

References

  1. Molenberghs G, Verbeke G. Models for discrete longitudinal data. Springer Science & Business Media; 2006.
  2. Fitzmaurice GM, Davidian M, Verbeke G, Molenberghs G. Longitudinal data analysis. CRC Press; 2008.
  3. Brooks ME, Kristensen K, van Benthem KJ, Magnusson A, Berg CW, Nielsen A, et al. glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. R J. 2017;9:378–400.
  4. Jiang J. Empirical best prediction for small-area inference based on generalized linear mixed models. J Stat Plan Inference. 2003;111(1–2):117–27.
  5. Rao JN, Molina I. Small area estimation. John Wiley & Sons; 2015.
  6. Skrondal A, Rabe-Hesketh S. Prediction in multilevel generalized linear models. J R Stat Soc Ser A: Stat Soc. 2009;172(3):659–87.
  7. McCulloch CE, Neuhaus JM. Prediction of random effects in linear and generalized linear models under model misspecification. Biometrics. 2011;67(1):270–9. pmid:20528860
  8. Hui FKC, Müller S, Welsh AH. Random effects misspecification can have severe consequences for random effects inference in linear mixed models. Int Statistical Rev. 2020;89(1):186–206.
  9. Gu Z. Model diagnostics for generalized linear mixed models. ProQuest Dissertations and Theses; 2008. p. 123.
  10. Rabe-Hesketh S, Skrondal A, et al. Diagnostics for generalised linear mixed models. In: United Kingdom Stata Users’ Group Meetings 2003. Stata Users Group; 2003.
  11. Carlin BP, Louis TA. Empirical Bayes: past, present and future. J Am Stat Assoc. 2000;95(452):1286–9.
  12. Kristensen K, Nielsen A, Berg CW, Skaug H, Bell BM. TMB: automatic differentiation and Laplace approximation. J Stat Soft. 2016;70(5).
  13. Kackar RN, Harville DA. Approximations for standard errors of estimators of fixed and random effects in mixed linear models. J Am Stat Assoc. 1984;79(388):853–62.
  14. Booth JG, Hobert JP. Standard errors of prediction in generalized linear mixed models. J Am Stat Assoc. 1998;93(441):262–72.
  15. Prasad NGN, Rao JNK. The estimation of the mean squared error of small-area estimators. J Am Stat Assoc. 1990;85(409):163–71.
  16. Saei A, Chambers R. Small area estimation under linear and generalized linear mixed models with time and area effects. Southampton Statistical Sciences Research Institute; 2003.
  17. González-Manteiga W, Lombardía MJ, Molina I, Morales D, Santamaría L. Estimation of the mean squared error of predictors of small area linear parameters under a logistic mixed model. Comput Stat Data Anal. 2007;51(5):2720–33.
  18. Korhonen P, Hui F, Niku J, Taskinen S. Fast and universal estimation of latent variable models using extended variational approximations. Stat Comput. 2023;33:26.
  19. Tawiah R, Bondell H. Multilevel joint frailty model for hierarchically clustered binary and survival data. Statistics in Medicine; 2023.
  20. Lyu Z, Welsh A. Asymptotics for EBLUPs: nested error regression models. J Am Stat Assoc. 2021:1–15.
  21. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Softw. 2015;67(1):1–48.
  22. Douma JC, Weedon JT. Analysing continuous proportions in ecology and evolution: a practical introduction to beta and Dirichlet regression. Methods Ecol Evol. 2019;10(9):1412–30.
  23. Karmakar M, Lantz PM, Tipirneni R. Association of social and demographic factors with COVID-19 incidence and death rates in the US. JAMA Netw Open. 2021;4(1):e2036462. pmid:33512520
  24. Lyu Z, Welsh A. Small area estimation using EBLUPs under the nested error regression model. Stat Sinica. 2023.
  25. Ning X, Hui F, Welsh A. Asymptotic results for penalized quasi-likelihood estimation in generalized linear mixed models. Stat Sinica. 2026.
  26. Gonçalves MH, Cabral MS. cold: an R package for the analysis of count longitudinal data. J Stat Softw. 2021;99(3).
  27. Tierney L, Kadane JB. Accurate approximations for posterior moments and marginal densities. J Am Stat Assoc. 1986;81(393):82–6.
  28. Verbeke G, Fieuws S, Molenberghs G, Davidian M. The analysis of multivariate longitudinal data: a review. Stat Methods Med Res. 2014;23(1):42–59. pmid:22523185
  29. Ormerod JT, Wand MP. Gaussian variational approximate inference for generalized linear mixed models. J Comput Graph Stat. 2012;21(1):2–17.
  30. Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. J Am Stat Assoc. 1993;88(421):9–25.
  31. Vonesh EF, Wang H, Nie L, Majumdar D. Conditional second-order generalized estimating equations for generalized linear and nonlinear mixed-effects models. J Am Stat Assoc. 2002;97(457):271–83.
  32. Hui FKC. On the use of a penalized quasilikelihood information criterion for generalized linear mixed models. Biometrika. 2020;108(2):353–65.
  33. Zheng N, Cadigan N. Frequentist delta-variance approximations with mixed-effects models and TMB. Comput Stat Data Anal. 2021;160:107227.
  34. Nie L. Convergence rate of MLE in generalized linear and nonlinear mixed-effects models: theory and applications. J Stat Plan Inference. 2007;137(6):1787–804.
  35. Fuller WA. The multivariate components of variance model for small area estimation. Small Area Stat. 1987.
  36. Cheng J, Edwards LJ, Maldonado-Molina MM, Komro KA, Muller KE. Real longitudinal data analysis for real people: building a good enough mixed model. Stat Med. 2010;29(4):504–20. pmid:20013937
  37. Jiang J, Wand MP, Bhaskaran A. Usable and precise asymptotics for generalized linear mixed model analysis and design. J R Stat Soc Ser B: Stat Methodol. 2021;84(1):55–82.
  38. Hui FKC, Müller S, Welsh AH. Testing random effects in linear mixed models: another look at the F-test (with discussion). Aus NZ J Stat. 2019;61(1):61–84.
  39. Pratesi M. Analysis of poverty data by small area estimation. John Wiley & Sons; 2016.
  40. Ployhart RE, Bliese PD, Strizver SD. Intensive longitudinal models. Annu Rev Org Psychol Org Behav. 2025;12(1):343–67.
  41. Lyu Z, Welsh AH. Increasing cluster size asymptotics for nested error regression models. J Stat Plan Inference. 2022;217:52–68.
  42. Hall P, Maiti T. On parametric bootstrap methods for small area prediction. J R Stat Soc Ser B: Stat Methodol. 2006;68(2):221–38.
  43. Chatterjee S, Lahiri P, Li H. Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models. Ann Stat. 2008;36(3).
  44. Jiang J, Lahiri P, Wan S-M. A unified jackknife theory for empirical best prediction with M-estimation. Ann Stat. 2002;30(6).
  45. Jiang J, Lahiri P. Mixed model prediction and small area estimation. TEST. 2006;15(1).
  46. Ibrahim JG, Chen MH, Lipsitz SR. Missing responses in generalised linear mixed models when the missing data mechanism is nonignorable. Biometrika. 2001;88(2):551–64.
  47. Ibrahim JG, Molenberghs G. Missing data methods in longitudinal studies: a review. Test (Madr). 2009;18(1):1–43. pmid:21218187
  48. Wu K, Wu L. Generalized linear mixed models with informative dropouts and missing covariates. Metrika. 2006;66(1):1–18.
  49. Little RJ, Rubin DB. Statistical analysis with missing data, vol. 793. John Wiley & Sons; 2019.
  50. Huque MH, Carlin JB, Simpson JA, Lee KJ. A comparison of multiple imputation methods for missing data in longitudinal studies. BMC Med Res Methodol. 2018;18(1):168. pmid:30541455