Abstract
Modeling repeated measures of arterial occlusive diseases, such as peripheral artery disease (PAD), using data with mixed-type outcomes poses unique challenges due to complex dependency structures and diverse distributional assumptions. This study proposes a comprehensive Bayesian hierarchical modeling framework for the simultaneous analysis of binary and continuous outcomes observed repeatedly within individuals. We focus on the methodological comparison of three major Markov Chain Monte Carlo (MCMC) Bayesian computational methods, namely Metropolis-Hastings, Gibbs sampling, and Hamiltonian Monte Carlo (HMC), applied to hierarchical models without random effects as well as models with random intercepts and slopes, utilizing arterial occlusive disease (AOD) data that include repeated leg measurements on 16 patients with a total of 256 observations. We evaluate model performance across multiple criteria, including the widely applicable information criterion (WAIC), the leave-one-out information criterion (LOO-IC), K-fold cross-validation (K = 10), the deviance information criterion (DIC), the Bayesian information criterion (BIC), and the information complexity criterion (ICOMP). Our results reveal that the full random effects model estimated via HMC performed best and achieved the highest predictive accuracy across the considered information criteria for this small-sample, historical dataset. This work emphasizes the importance of model selection strategies in hierarchical Bayesian analysis and highlights the advantages of modern MCMC techniques in medical applications. However, these findings may depend on the specific priors and parameterizations used and may not apply to all small-sample hierarchical datasets. Thus, extending this model to larger, contemporary datasets will improve its generalizability and clinical relevance.
Citation: Ebrahim EA, Cengiz MA (2026) Bayesian hierarchical models for multivariate mixed responses with repeated measures: A case study in arterial occlusive disease. PLoS One 21(4): e0346331. https://doi.org/10.1371/journal.pone.0346331
Editor: Denekew Bitew Belay, Bahir Dar University, ETHIOPIA
Received: August 13, 2025; Accepted: March 18, 2026; Published: April 15, 2026
Copyright: © 2026 Ebrahim, Cengiz. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data were obtained from https://doi.org/10.2307/2532336. The extracted data used in this study are available at https://doi.org/10.5281/zenodo.17621089.
Funding: This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2601).
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: AOD, Arterial Occlusive Disease; BHM, Bayesian Hierarchical Modeling; BRMS, Bayesian regression models using ‘Stan’ (R package); CI, Credible Interval; DIC, Deviance Information Criterion; EAS, Expected Value Posterior Estimation; ESS, Effective Sample Size; HMC, Hamiltonian Monte Carlo; HPD, 95% High Posterior Density (Credible) Interval; ICOMP, Bozdogan’s Information Complexity Criterion; LKJ, Lewandowski-Kurowicka-Joe; LOO-IC, Leave-One-Out Information Criterion; MCMC, Markov Chain Monte Carlo; MCMCglmm, Generalized Linear Mixed Models with MCMC (R package); R2MLwiN, A package for running MLwiN from within R using the Metropolis-Hastings algorithm; RCP, Reduced Cuff Pressure measurement; WAIC, Widely Applicable (Watanabe-Akaike) Information Criterion
1. Introduction
Repeated measures with multivariate mixed outcomes—comprising both categorical and continuous response types—are prevalent in many fields, including medicine, social sciences, and engineering. Such data structures require sophisticated modeling strategies that account for intra-subject correlations and the complexities of joint distributions. Traditional univariate approaches often fail to exploit correlations across outcomes, leading to inefficient parameter estimation [1]. Hierarchical Bayesian models provide a flexible framework for addressing these challenges by incorporating prior knowledge and nested random structures. In this study, we apply these modeling strategies to a real-world dataset on AOD, in which repeated binary and continuous outcomes were measured for each subject. Our objective is not limited to the application; we emphasize evaluating different MCMC algorithms in estimating model parameters and selecting the most appropriate model structure for such data.
Repeated measures data are a type of longitudinal data in which repeated measurements (observations) of the same dependent or response variable are taken at two or more points in time, space, or across multiple occasions. Repeated measures are obtained by taking multiple measurements on the same experimental unit, such as an individual, animal, or machine [2]. Research on repeated-measurement data is increasing rapidly due to the high demand for advanced statistical techniques to analyze it across disciplines such as the life sciences, public health, biostatistics and epidemiology, agriculture, econometrics, the social sciences, and environmental studies [3].
Statistical modeling of repeated-measures data is complicated by two types of dependencies [4]: first, the dependency between the dependent and explanatory variables, and second, the dependency among the repeated measurements of the dependent variables. In many fields of study, multivariate data are generated with multiple outcomes, which may combine continuous and discrete response variables. Due to this complexity, such data are often handled by applying regression analysis to each outcome independently, even when the outcomes are related. In studies with mixed data comprising both continuous and binary outcomes, multivariate techniques are more efficient than univariate approaches; in general, more accurate parameter estimates can be obtained by jointly estimating parameters using MCMC estimation methods in Bayesian inference. The dependent variables can be of two types: categorical (qualitative), either ordinal or nominal, and quantitative, either discrete or continuous. When analyzing repeated-measures multivariate data, the dependency between measurements arising from repeated events within the same participants is of great concern; in other words, the nature of the correlation within and between individuals must be taken into account [4–6].
Hierarchical statistical models, also called random effects, mixed effects, or multilevel models, are widely used for data that are structured into groups of units. By examining repeated-measures data as a two-level hierarchy, researchers can build more complex hierarchical model structures [7]. Bayesian statistical techniques provide flexible modeling assumptions that enable models to accurately capture the complexity of real-world data. Examples of these adaptable modeling tools include the dependent variable’s distribution, probability functions or prior distributions, regression structure, and multiple layers of observational units. A collection of estimation techniques known as MCMC methods is used to fit Bayesian models with realistic, intricate parameter structures, and such models are often compared with more conventional alternatives [8].
According to research by Lages and Scheel (2016), Bayesian logistic mixed models with latent imputation and group-specific parameters better accounted for the data [9], while the classical logistic hierarchical model with mixed effects also performed well. Different MCMC approaches vary in speed and convergence, depending on the model structure [10]. MCMC approaches are presented both theoretically and practically using the R, BUGS, and Stan packages. Lemoine (2019) used R simulations to show how informative priors affect posterior parameter estimates and model performance, illustrating how the random-effects structure of hierarchical models affects statistical inference [11]. However, computational problems may increase as the number of outcomes increases, leading to a strong dependency structure [12].
A Bayesian probabilistic technique is proposed to analyze regression models with mixed (continuous and categorical) response variables by incorporating prior distributions. Among the important aspects of multivariate models for mixed-type data is the investigation of the correlation structure of dependent-variable vectors to answer multivariate queries [13–15]. Joint modeling provides control over the rise in type I error that occurs when separate univariate analyses are performed across different dependent variables, avoiding multiple comparisons of model parameters that ignore random terms. Moreover, because different sets of individuals can contribute to each analysis, it provides more precise parameter estimates by accounting for relationships among dependent variables and facilitates interpretation even when some dependent variables are missing [16].
Tidemann-Miller et al. (2016) proposed a Bayesian approach for jointly analyzing multiple functional dependencies of different types (e.g., binary/categorical and continuous data) [17]. In this case, it models the dependence between functional responses via the multivariate latent normal process and the dependence of the latent process. There is a need to jointly analyze multivariate mixed outcomes while accounting for explanatory variable effects and considering all variance-covariance structures [18]. Tate and Pituch (2007) demonstrated the usefulness of multivariate multiple outcomes in a hierarchical linear model from randomized field experiments and simulation data for a hypothetical scenario [19]. Analyzing multi-outcome longitudinal data in a linear hierarchical model provides great flexibility for modeling between- and within-group correlation in multi-outcome repeated measures. A robust Bayesian approach to the multivariate Student-t linear mixed model can be equipped with a computationally efficient inverse-Bayesian-formula (IBF) sampler, along with a Gibbs sampler [20].
According to Alfo and Giordani (2022), a flexible regression model for multivariate dependent variables with mixed distributions can be fitted by considering several (conditionally independent) univariate regression models with random effects specific to the outcome [21]. This multivariate model accounts for subtle heterogeneity specific to the outcomes and for dependence among them through their joint distributions. It also opens up more general dependence structures among random effects in regression equations with different dependent variables.
The Bayesian Hierarchical Modeling (BHM) with mixed categorical (Binomial) and continuous (normal) dependent variables can be modeled on incomplete cross-sectional data. In this case, the model-dependent variable indicator for the categorical dependent variable can depend on the continuous dependent variable. Such a modeling phenomenon is demonstrated by data from an observational study, in which the effect of parents’ psychological disorders on both verbal comprehension scores and the presence of negative symptoms in their children was modeled despite missing dependent variables [22].
Hierarchical models with multiple continuous dependent (outcome) variables, each with a Normal distribution, are common in applied research. However, combining different types of dependent variables in a hierarchical multivariate mixed-response model is less commonly applied, though possible; for example, we can include a continuous dependent variable, such as blood pressure, alongside a categorical dependent variable, such as smoking status, in multivariate models [23]. In this study, a mixed-response modeling approach was applied to repeated-measures data to examine correlational effects.
This manuscript’s main objective is to apply the Metropolis-Hastings, Gibbs sampling, and HMC Bayesian MCMC methods for parameter estimation in hierarchical models of repeated-measures data with mixed dependent variable types, and to examine model selection methods.
2. Materials and methods
Most statistical models assume that observations are independently and identically distributed (iid), drawn from a single distribution with one or more unknown parameters. However, in many cases, it does not make sense to treat measurements as iid with the same parameter(s) from the same distribution [24]. For continuous (e.g., income, score, CD4 count, grade point average) and categorical (e.g., students’ attendance status, patient health status, academic status) dependent variables, single-level classical linear and binary logistic or multinomial logistic regression models can be generalized to a Bayesian hierarchical (multilevel) model by adding prior distributions and allowing the regression coefficients to vary randomly over the clusters.
2.1. Bayesian hierarchical mixed (binary and continuous) responses modeling
Modeling mixtures of Normal, Bernoulli (Binomial), and Multinomial distributed dependent variables in a repeated-measures data structure is not a simple task. This is because of differences in the distributional assumptions, link functions, and exponential families for the different types of dependent variables [25]. Currently, using MCMC estimation approaches in R2MLwiN, we can only fit mixtures of Normal and Binomial dependent variables, and then only with a probit link function for the Binomial. This is because only this combination of dependent variables results in a definable distribution for the lowest level of residuals. Other combinations of dependent variables can be handled by assuming independence at the lowest measurement level (often unrealistic) [26].
In the continuous response case, a linear hierarchical (mixed) model is typically assumed, in which the conditional response distribution is normal, \(y_{1ij} \mid b_i \sim N(\mu_{1ij}, \sigma_e^2)\), and the link function is the identity link, \(\mu_{1ij} = \eta_{1ij}\). The variance \(\sigma_e^2\) is the residual variance, and the random variation parameter is \(\sigma_b^2\).

Binary responses are assumed to be independently Bernoulli distributed as \(y_{2ij} \mid b_i \sim \text{Bernoulli}(\pi_{ij})\) for given random effects \(b_i\). Here, the conditional expectation \(E(y_{2ij} \mid x_{ij}, b_i)\) is also the conditional probability \(\pi_{ij} = P(y_{2ij} = 1 \mid x_{ij}, b_i)\) for given values of the covariates (predictors) and random effects. The logit link is the most widely used link function, expressed as

\[\text{logit}(\pi_{ij}) = \log\!\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \eta_{2ij}.\]
The goal of the Bayesian paradigm is to calculate (or estimate) the joint posterior distribution, treating all unknown model parameters as random variables. In this study, we used a probit link function across all software implementations (R2MLwiN, MCMCglmm, and BRMS/Stan) to maintain comparability [27]. The general hierarchical bivariate mixed dependent variables model can be written as a more complex Bayesian general hierarchical linear model that can be expressed in matrix algebra as follows:
\[y = X\beta + Zb + e,\]

where \(X\beta\) is the linear estimator of fixed effects, \(Zb\) is the linear estimator of random effects, and \(e\) is the residual term. In this view, Bayesian analysis treats all fixed and random effects as random variables with their own distributions. The multivariate (mixed) dependent variables can now be represented in matrix form: \(y = (y_1, y_2)'\) is the vector of the dependent variables, with \(y_1\) a continuous and \(y_2\) a categorical response. The Bayesian hierarchical model considered here includes both binary (Binomial) and continuous (Normal) outcomes. Generalizing the model to multiple categorical (ordinal and nominal) response variables is straightforward in principle, since each categorical variable can be represented by a set of binary indicator variables.
Suppose \(y_{1ij}\) is a continuous response having \(n_i\) measurement units. For these individuals, \(x_{ij}\) is the vector of predictor (independent) variables. Suppose there are also \(n_i\) observations on a categorical response, \(y_{2ij}\). Thus, \(y_{kij}\) is the response (dependent) variable for individual subject \(i\) \((i = 1, \dots, N)\) at observed measurement \(j\) \((j = 1, \dots, n_i)\) of response (outcome) type \(k\) \((k = 1, 2)\). Therefore, \(y_{1ij}\) and \(y_{2ij}\) are two distinct mixed outcome variables, with, in the general case, a normalized latent response \(y^{*}_{2ij}\) underlying the categorical outcome.
For a 2-level hierarchical model with measurement observations nested within individual persons, the multivariate mixed response (dependent) variable model can be written as:

\[y_{kij} = \delta_{k1}\,(x'_{ij}\beta_1 + b_{1i} + e_{ij}) + \delta_{k2}\,(x'_{ij}\beta_2 + b_{2i}),\]

with a simultaneous response indicator, a dummy variable \(\delta_{kl}\), equal to 1 if the stacked observation belongs to response type \(l\) (for \(l = 1, 2\)) and 0 otherwise.
In Bayesian statistics, three popular MCMC algorithms are used for estimating Bayesian marginal and joint posterior densities: Metropolis-Hastings, Gibbs sampling, and the hybrid (Hamiltonian) Monte Carlo algorithm. The HMC algorithm updates the parameters by suppressing random-walk behavior and utilizing the gradient of the log-posterior density [28].
2.1.1. Specification of Models.
Likelihoods: The likelihood function is constructed by defining the conditional distributions of the observed dependent variable, given the model parameters, including the subject-specific random effects. A common approach for mixed responses is to use a latent-variable formulation for the binary outcome, which simplifies modeling the correlation between the two response types.
Let \(y_{1ij}\) be the continuous response variable and \(y_{2ij}\) be the binary response variable for individual subject \(i\) at observed measurement \(j\), with \(i = 1, \dots, N\) and \(j = 1, \dots, n_i\). Let \(x_{ij}\) be a vector of common predictor variables for both the \(y_{1ij}\) and \(y_{2ij}\) response variables.
The continuous response variable, \(y_{1ij}\), can be modeled with a linear hierarchical structure:

\[y_{1ij} = x'_{ij}\beta_1 + z'_{ij}b_{1i} + e_{ij},\]

where \(\beta_1\) represents the fixed effects on \(y_{1ij}\) and \(b_{1i}\) is the individual subject-specific random effect, which accounts for the repeated measures in \(y_{1ij}\) within each subject. The error term is \(e_{ij} \sim N(0, \sigma_e^2)\), and \(\sigma_e^2\) is the residual variance of the continuous response, \(y_{1ij}\). The parameter \(\beta_1\) is the vector of population-level (fixed-effect) regression coefficients for \(y_{1ij}\), and \(b_{1i}\) is the vector of subject-specific random effects.
To specify the likelihood (sub-model) for a binary response variable, \(y_{2ij}\), it is necessary to introduce a continuous latent variable, \(y^{*}_{2ij}\), linked to the binary response via \(y_{2ij} = 1\) if \(y^{*}_{2ij} > 0\) and \(y_{2ij} = 0\) otherwise. A probit link is often preferred in a Bayesian context, as it can simplify posterior sampling; the latent residual variance is fixed at 1 for identifiability, i.e., \(\epsilon_{ij} \sim N(0, 1)\).

Then, the latent variable can be modeled with a linear hierarchical structure:

\[y^{*}_{2ij} = x'_{ij}\beta_2 + z'_{ij}b_{2i} + \epsilon_{ij},\]

where \(\beta_2\) represents the fixed effects on \(y^{*}_{2ij}\) and \(b_{2i}\) is the individual subject-specific random effect, which accounts for the repeated measures in \(y_{2ij}\) within each subject. \(\beta_2\) is the vector of population-level (fixed-effect) regression coefficients for the binary response variable, \(y_{2ij}\).
The full likelihood for individual subject \(i\) is the product of the likelihoods for each response type, which requires integrating out the random effects:

\[L_i(\theta, \Sigma_b) = \int \prod_{j=1}^{n_i} f\big(y_{1ij} \mid b_i, \theta\big)\, f\big(y_{2ij} \mid b_i, \theta\big)\, p\big(b_i \mid \Sigma_b\big)\, db_i,\]

where \(\theta\) includes all fixed effects and variance components, and \(\Sigma_b\) contains the variance-covariance parameters of the random effects \(b_i = (b_{1i}, b_{2i})'\). The full random Bayesian model combines this joint likelihood with prior distributions for all random and fixed effects:

\[p(\theta, \Sigma_b \mid y) \propto \Big[\prod_{i=1}^{N} L_i(\theta, \Sigma_b)\Big]\, p(\theta)\, p(\Sigma_b).\]
Prior distributions: Bayesian inference requires specifying prior distributions for all model parameters. These priors can be informative (based on previous research) or non-informative (diffuse), and they can vary depending on the model formulation and the R package used [29,30]. For the fixed-effect parameters, \(\beta_1\) and \(\beta_2\), weakly informative priors (such as Normal(0, 1)) or flat priors are common choices in most R packages:

\[\beta_1 \sim N(0, \Sigma_{\beta_1}), \qquad \beta_2 \sim N(0, \Sigma_{\beta_2}),\]

where \(\Sigma_{\beta_1}\) and \(\Sigma_{\beta_2}\) are diagonal matrices with large variances.
For the continuous response’s residual variance, a common prior choice is an inverse-gamma distribution in MCMCglmm and R2MLwiN, and a half-t or half-Cauchy distribution in BRMS. Thus, a weakly informative prior can be specified as, for example,

\[\sigma_e^2 \sim \text{Inverse-Gamma}(0.001,\, 0.001) \quad \text{or} \quad \sigma_e \sim \text{half-Cauchy}(0,\, 5).\]
The prior for the vector of random effects, \(b_i = (b_{1i}, b_{2i})'\), can be specified as a multivariate normal distribution with mean zero and a covariance matrix, \(\Sigma_b\), which captures the correlation between the random intercepts for the two response variables. Thus,

\[b_i \sim N(0,\, \Sigma_b), \qquad \Sigma_b = \begin{pmatrix} \sigma_{b_1}^2 & \rho\,\sigma_{b_1}\sigma_{b_2} \\ \rho\,\sigma_{b_1}\sigma_{b_2} & \sigma_{b_2}^2 \end{pmatrix},\]

where \(\sigma_{b_1}^2\) is the variance of the random intercept for the continuous response variable, \(y_{1ij}\); \(\sigma_{b_2}^2\) is the variance of the random intercept for the binary response variable, \(y_{2ij}\); and \(\rho\) is the correlation between the random intercepts for \(y_{1ij}\) and \(y_{2ij}\). The common prior choices for the random-effects covariance matrix, \(\Sigma_b\), are an inverse-Wishart distribution in MCMCglmm and R2MLwiN, and the Lewandowski-Kurowicka-Joe (LKJ) distribution in BRMS:

\[\Sigma_b \sim \text{Inverse-Wishart}(\nu,\, S),\]
where \(\nu \geq q\) for \(q\) the number of random effects in the model, and the scale matrix \(S\) is an identity matrix. The full posterior distribution, \(p(\theta \mid y)\), can be formulated as:

\[p(\theta \mid y) \propto L(y \mid \theta)\, p(\theta),\]

where \(\theta\) is the vector of all model parameters, \(L(y \mid \theta)\) is the joint likelihood function, and \(p(\theta)\) is the prior distribution for \(\theta\).
2.2. Markov Chain Monte Carlo computational methods
Bayesian computation methods are employed when the posterior distribution cannot be obtained in closed form, requiring the application of alternative techniques to estimate or sample from it. The joint posterior distribution can be sampled using a class of computational techniques known as MCMC methods [31]. These techniques rely on constructing a Markov chain from the posterior distribution, which serves as the stationary distribution. Thus, samples from the joint posterior distribution are generated after multiple iterations by repeatedly sampling from this Markov chain.
Model formulation directly affects the effective sample size (ESS) through factors such as spatial correlation and autocorrelation, distinct likelihoods, prior specification, and design matrix complexity. An MCMC chain with substantial autocorrelation may have an ESS far below its nominal number of draws, even with default priors, since each sample depends on the one before it [32]. A lower ESS is likewise found in more complicated models with many correlated parameters. Conversely, models with low autocorrelation and independent predictors will have an ESS closer to the actual sample size. Although guidelines vary, one should aim for an ESS of 1,000 or more for stable estimates, with 200 as a bare minimum; greater values (400+) are essential for precise interval estimation, even though 200 may suffice for a very basic point estimate [29,30].
In MCMC convergence diagnostics, the ESS measures the number of independent samples that would provide the same precision as the current autocorrelated sample. The Metropolis-Hastings ESS is sensitive to the variance of the proposal distribution; high autocorrelation leads to a low ESS. HMC and its variants generally achieve higher ESS than basic Metropolis-Hastings because their gradient-based updates enable more efficient exploration of the parameter space, although the ESS is still affected by adaptation, tuning, and the model’s specific formulation. Gibbs samplers can achieve high ESS if the conditional distributions are easy to sample from and the variables are not highly correlated. Different MCMC samplers yield different ESS values because of their distinct mechanisms for exploring the posterior distribution: HMC generally yields higher ESS per iteration in many complex problems, whereas Metropolis-Hastings and Gibbs samplers can be less efficient, especially in high-dimensional or highly correlated posteriors [33].
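As a rough illustration of why autocorrelation shrinks the ESS, the following is a minimal sketch of the initial-positive-sequence idea (the chain length divided by the integrated autocorrelation time); production diagnostics such as Stan's use a more refined multi-chain estimator. The AR(1) chain and all settings are illustrative.

```python
import numpy as np

def effective_sample_size(chain):
    """Crude ESS estimate: n / (1 + 2 * sum of positive autocorrelations)."""
    n = len(chain)
    x = chain - np.mean(chain)
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.var(chain) * n)
    tau = 1.0
    for k in range(1, n):
        if acf[k] <= 0:          # truncate at the first non-positive lag
            break
        tau += 2.0 * acf[k]      # integrated autocorrelation time
    return n / tau

rng = np.random.default_rng(0)
iid = rng.normal(size=5000)      # independent draws: ESS near n

ar = np.empty(5000)              # highly autocorrelated AR(1) chain
ar[0] = 0.0
for t in range(1, 5000):
    ar[t] = 0.95 * ar[t - 1] + rng.normal()

print(effective_sample_size(iid) > effective_sample_size(ar))  # True
```

For the AR(1) chain with coefficient 0.95, the integrated autocorrelation time is roughly (1 + 0.95)/(1 - 0.95) ≈ 39, so 5,000 correlated draws carry roughly the information of about 130 independent ones.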
The R-hat statistic (potential scale reduction factor) compares between- and within-chain variability; R-hat = 1.0 is the ideal value and indicates convergence across all MCMC chains. Historically, values < 1.1 were taken to indicate chain convergence, while values > 1.1 signal poor mixing or non-stationarity. The modern, stricter standard used by probabilistic programming software such as Stan is R-hat of 1.01 or less [34,35]; values noticeably above this threshold suggest that the Markov chains have not mixed well or have not converged to the target distribution. Proposed improvements to the traditional R-hat also recommend an ESS > 400 threshold for reliable diagnostics, which is the standard used in Stan [36].
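The between/within-chain comparison behind R-hat can be sketched as follows. This is the classic Gelman-Rubin form; modern implementations additionally split each chain in half and rank-normalize the draws before computing it. The simulated chains are illustrative.

```python
import numpy as np

def r_hat(chains):
    """Potential scale reduction factor for an (m, n) array of m chains."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    var_plus = (n - 1) / n * W + B / n       # pooled posterior-variance estimate
    return float(np.sqrt(var_plus / W))

rng = np.random.default_rng(1)
mixed = rng.normal(size=(4, 1000))                   # 4 well-mixed chains
stuck = mixed + np.array([[0.], [0.], [0.], [3.]])   # one chain off target

print(round(r_hat(mixed), 3), round(r_hat(stuck), 3))
```

The well-mixed chains give R-hat very close to 1, while shifting one chain away from the target inflates the between-chain variance and pushes R-hat well above the 1.01 threshold.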
MCMC techniques have been widely used to generate samples from high-dimensional, complex distributions; among the Bayesian computational techniques considered here, HMC is the most efficient [37]. The HMC method converges more quickly than conventional Metropolis-Hastings and Gibbs techniques and is most effective when approximating models with complex data structures [31]. Well-known MCMC approaches are used in complex parameter structures [38], but they can exhibit poor performance and slow convergence. Unlike random-walk Metropolis, HMC does not draw proposals from a distribution centered on the current parameter value, and unlike the Gibbs algorithm, it does not rely on sampling from conditional posterior distributions. HMC has two advantages over other MCMC methods: first, its samples have little or no autocorrelation; second, it mixes fast, so the chain converges quickly to the target distribution. Accordingly, it is the most effective method for continuous distributions, with low sample rejection and low (auto)correlation [39].
The HMC typically performs better for hierarchical models [40–42], but performance depends on the parameterization, marginalization, and data size. With a small dataset, discrepancies in the information criterion may be due to implementation differences.
The Metropolis algorithm, first proposed by Metropolis et al. (1953), is used when an analytical expression for the posterior distribution is not available [43]. As explained in Table 1, the algorithm requires a proposal distribution with a density function. It proceeds one step at a time, based on simulations from this distribution, and accepts or rejects each step based on a Metropolis-Hastings ratio. Let \(\pi(\theta)\) denote the target distribution (the posterior distribution) and \(q(\theta^{*} \mid \theta^{(t)})\) be the proposal distribution from which the candidate point \(\theta^{*}\) is sampled at iteration \(t\).
The Metropolis approach, which uses a symmetric proposal distribution, \(q(\theta^{*} \mid \theta) = q(\theta \mid \theta^{*})\), historically preceded Metropolis-Hastings. Sometimes we may need an asymmetric proposal distribution, \(q(\theta^{*} \mid \theta)\), which leads to a different acceptance probability than the random-walk Metropolis; therefore, Tables 1 and 2 differ in how they calculate the acceptance probability. Here, a sample drawn from the proposal distribution is put forward as the new value \(\theta^{*}\), which is accepted with probability

\[\alpha = \min\!\left(1,\; \frac{\pi(\theta^{*})\, q(\theta^{(t)} \mid \theta^{*})}{\pi(\theta^{(t)})\, q(\theta^{*} \mid \theta^{(t)})}\right).\]

Otherwise, the chain remains at the current value. The Metropolis-Hastings method thus uses a proposal distribution to generate new approximations to the parameter ensemble.
Gibbs sampling [39] can be viewed as a specific instance of the Metropolis-Hastings method, in which the proposal distributions are the full conditional distributions of the model parameters (Table 3). This ensures that all proposals are automatically accepted, since the Metropolis-Hastings acceptance probability is always 1.
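A minimal Gibbs sketch for a bivariate normal target illustrates the point: because each coordinate is drawn exactly from its full conditional distribution, every draw is accepted. The correlation value and settings are illustrative.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, seed=0):
    """Gibbs sampling for (x, y) ~ N(0, [[1, rho], [rho, 1]]): each update
    draws from a full conditional, so every 'proposal' is accepted."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    out = np.empty((n_iter, 2))
    cond_sd = np.sqrt(1.0 - rho ** 2)
    for t in range(n_iter):
        x = rng.normal(rho * y, cond_sd)   # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, cond_sd)   # y | x ~ N(rho*x, 1 - rho^2)
        out[t] = x, y
    return out

draws = gibbs_bivariate_normal(rho=0.8, n_iter=20000)
print(round(np.corrcoef(draws[2000:, 0], draws[2000:, 1])[0, 1], 1))  # ≈ 0.8
```

The trade-off is visible here as well: the stronger the correlation between the coordinates, the more the alternating updates crawl along the ridge of the target, which is exactly the regime where Gibbs sampling loses efficiency.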
The HMC, also known as hybrid Monte Carlo, can suppress such random-walk behavior when model parameters are continuous rather than discrete by employing a clever auxiliary-variable strategy that transforms the sampling problem from one target distribution to another [44]. By following a series of steps guided by first-order gradient information, the HMC method circumvents the random-walk behavior and the sensitivity to correlated parameters that plague conventional MCMC techniques. HMC requires appropriate parameter settings based on attributes and population structure to fit the BHM. Moreover, HMC outperformed Gibbs sampling on simulated data [39]. The HMC algorithm (Table 4) is based on the Hamiltonian (total energy), which simulates the trajectory for a time \(T\) and then obtains the final position and momentum, \((\theta^{*}, p^{*})\).
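A minimal HMC sketch, assuming a differentiable log-posterior: auxiliary momentum variables are resampled, a leapfrog trajectory is simulated using the gradient, and a Metropolis correction on the total energy accepts or rejects the endpoint. The standard-normal target and tuning values are illustrative, not those used by the Stan implementation in the paper.

```python
import numpy as np

def hmc_sample(log_post, grad, theta0, n_iter, step_size=0.1, n_leap=20, seed=0):
    """Basic HMC: gradient-guided leapfrog trajectories replace the
    random-walk proposals of Metropolis-Hastings."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    chain = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        p = rng.normal(size=theta.size)               # resample auxiliary momentum
        th, pm = theta.copy(), p.copy()
        pm += 0.5 * step_size * grad(th)              # leapfrog: half momentum step
        for i in range(n_leap):
            th += step_size * pm                      # full position step
            if i < n_leap - 1:
                pm += step_size * grad(th)            # full momentum step
        pm += 0.5 * step_size * grad(th)              # final half momentum step
        # Metropolis correction on the Hamiltonian H = -log_post + |p|^2 / 2
        dH = (log_post(th) - 0.5 * pm @ pm) - (log_post(theta) - 0.5 * p @ p)
        if np.log(rng.uniform()) < dH:
            theta = th
        chain[t] = theta
    return chain

# Illustrative target: standard normal, log pi = -theta^2/2, gradient = -theta.
chain = hmc_sample(lambda th: -0.5 * th @ th, lambda th: -th, [3.0], 2000)
print(round(float(chain[500:].std()), 1))  # ≈ 1.0
```

Because the leapfrog integrator nearly conserves the Hamiltonian, the energy-based correction accepts almost every distant proposal, which is the mechanism behind HMC's low autocorrelation and fast mixing.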
2.3. Application data set
Arterial occlusive disease data has two widely used intrusive techniques for classifying each leg as either healthy (0) or diseased (1): ultrasound imaging and reduced cuff pressure (RCP) measures. The variables can be expressed as follows: \(y_{2ij}\) is the \(i\)th patient’s health status from the measurement on the \(j\)th side of the leg; \(y_{1ij}\) is the \(i\)th patient’s disease severity score measurement on the \(j\)th side of the leg; \(x_{1ij}\) is the \(i\)th patient’s ultrasound image score measurement on the \(j\)th side of the leg; and \(x_{2ij}\) is the \(i\)th patient’s RCP measurement on the \(j\)th side of the leg. Here, \(i = 1, \dots, 16\) and \(j = 1, \dots, 8\). Peripheral vessels are the artery branches that emerge from the aorta and deliver oxygen-rich blood to the arms, legs, brain, and organs, together with the vein branches that join the main veins to return deoxygenated blood to the heart [45].
Arterial occlusive disorders, which involve occlusion or constriction of the arteries in the legs (and sometimes the arms), are typically caused by atherosclerosis, which decreases blood flow. The data on disease severity, patient health status, and other factors were obtained at Broadgreen Hospital in Liverpool in 1988/89 [46]. This small dataset might not be a representative sample for clinical practice and inference. However, we used it because the study emphasizes methodological application across distinct MCMC approaches, whose default priors and likelihood implementations are not identical. The smooth lining of healthy peripheral arteries encourages continuous blood flow and inhibits clotting. Although it can affect any artery, peripheral artery disease most frequently affects the legs [47]. Thus, in the data, the total number of measurements collected was 256 (16 patients, with four measurements on each of the two legs [left/right], including upper and lower positions on each leg). The dataset includes a categorical (binary) variable representing health status, as well as three distinct continuous variables: disease severity score, RCP, and ultrasound image measurement scores.
The patients’ health status and disease severity scores were considered outcome variables, and the patients’ RCP and ultrasound image measurement scores were independent variables (predictors). The categorical (binary) response was the patient’s health status, and the continuous outcome variable was the patient’s disease severity score. In contrast, the ultrasound imaging score and reduced cuff pressure measurements were explanatory variables expected to jointly predict the patient’s health status and disease severity scores. The structural summary of the AOD dataset in long format is shown in Table 5.
The summary in Table 5 of the AOD dataset clarifies that the study includes N = 16 individuals (patients), 2 legs (left/right) per patient, and 4 measurements (repeats/occasions, upper/lower) per leg side, totaling eight (8) measurements per patient; with the two outcome variables stacked in long format, this yields 256 observations. The dataset is fully observed (balanced) for the primary variables used in the analysis and has no missing values.
2.4. Ethical statements
As secondary data analysis from publicly available sources was used in this study, it is noted that the original study from which the data were obtained received ethical approval. The current analysis uses anonymized, de-identified data and does not allow for re-identification of participants. We have also made efforts to mitigate bias by establishing clear criteria for data inclusion and ensuring transparency in our methods, thereby promoting the reproducibility of our results.
3. Results and discussion
3.1. Descriptive findings on correlation and random effects
From Fig 1, the bivariate relationships were all positive. Patients’ health status shows a strong correlation with ultrasound image and disease severity scores. In addition, patients’ disease severity score has a moderate positive correlation with RCP and a strong correlation with ultrasound image score. The ultrasound image score shows a stronger correlation with both response variables than RCP.
Fig 2 shows moderate variation in health status among patients, as reflected in their leg-side measurements, and a large dispersion in disease severity scores across patients.
3.2. Fitted model results and Bayesian inferences
In Bayesian inference for joint modeling of patients’ disease severity scores and health status, the null (empty) model, random-intercept model, and full random-slope models were fitted separately using three popular MCMC approaches. Thus, nine (9) distinct multivariate mixed-response models were fitted and their predictive performance evaluated. The models were fitted with default likelihoods and priors, using the default settings of the R2MLwiN, MCMCglmm, and BRMS (HMC) packages in R version 4.0.
In Table 6, the null (empty), random-intercept, and full random-slope models were fitted separately using the Metropolis-Hastings approach for the joint multivariate mixed outcomes. In the null model, the intercepts for both the disease severity score and health status were significant, and the subject-level (level-2) random part was statistically significant. In the random-intercept-with-fixed-slopes model, the intercepts for both outcomes, the leg-side effect (for the disease severity score only), and the level-2 random variation were credible at the 5% level. In the full random-intercepts-and-slopes model, the intercepts for both outcomes, the leg-side and ultrasound-image-score effects on the joint outcomes, and the level-2 and ultrasound-related subject variation were statistically significant at the 5% level.
In Table 7, the null, random-intercept, and full random-slope models were fitted separately for joint multivariate prediction of the disease severity score and patients’ health status as mixed responses using the Gibbs sampling approach; Table 8 presents the same three models fitted using the HMC approach.
According to Table 8, the posterior parameter estimates (mean and SD) from this best-fitting model via the Hamiltonian Monte Carlo method indicate that ultrasound and RCP were significantly associated with the patient’s health status. A unit increase in the ultrasound score has a significant effect (posterior mean of 12.5) on the health status of patients with arterial occlusive disease. Ultrasound is a primary non-invasive tool for diagnosing, grading, and monitoring arterial occlusive diseases such as PAD; its results typically correlate highly with the extent and severity of disease and are linked to cardiovascular risk. A unit increase in the RCP score has a significant effect (posterior mean of 0.02) on the health status of patients with AOD. Moreover, there was significant variation in disease severity scores between patients (level 2) and significant variation in health status within patients (between leg-side measurements). The random-effect SD of 0.12 indicates that disease-progression trajectories vary between subjects, representing the typical deviation of an individual’s trajectory from the posterior mean disease severity score.
3.3. Model comparisons: Bayesian hierarchical multivariate mixed response
The three Bayesian hierarchical multivariate mixed models were assessed under the three MCMC methods across several information criteria, as shown in Table 9. The full random-slopes hierarchical model has the lowest DIC, ICOMP, WAIC, LOO-IC, and 10-fold cross-validation estimates across the Metropolis-Hastings, Gibbs sampling, and Hamiltonian Monte Carlo techniques. Among the established models, the full random-intercepts-and-slopes model under the Hamiltonian Monte Carlo algorithm therefore appears best. That is, a full random-intercepts-and-slopes model, specified as a joint multivariate function of the two responses and the two explanatory variables (ultrasound and cuff pressure measurement) with random coefficients for leg, ultrasound, and cuff pressure measurement, has the best predictive performance for the disease severity score (Y1) and patients’ health status (Y2). The consistency of the ensemble of Markov chains was supported by ESS values greater than 1000 and R-hat values near 1.00 and not exceeding 1.10 [48]. Furthermore, both Bulk-ESS and Tail-ESS should be at least roughly 100 per Markov chain for the corresponding posterior quantile estimates to be reliable. These findings support model convergence and the consistency of the MCMC chains.
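As a reference for how one of these criteria is computed, the sketch below (a minimal Python/NumPy illustration, not the paper’s code) evaluates WAIC from a matrix of pointwise log-likelihood draws, the form in which most MCMC software reports it:

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (S draws x N observations) pointwise log-likelihood matrix.

    lppd  : log pointwise predictive density (log of the posterior-mean likelihood)
    p_waic: effective number of parameters (posterior variance of the log-likelihood)
    """
    S = log_lik.shape[0]
    # log-mean-exp over draws, computed stably per observation
    lppd = np.sum(np.logaddexp.reduce(log_lik, axis=0) - np.log(S))
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return -2.0 * (lppd - p_waic)  # deviance scale: lower is better
```

On the deviance scale used here, the model with the lowest WAIC is preferred, matching the direction of the comparisons in Table 9.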
The R-hat, Bulk-ESS, and Tail-ESS findings for the null, random-intercept, and random-coefficient models satisfied the convergence diagnostic metrics across all models using HMC. As a result, for stable estimates in each established model, the ESS and potential scale reduction (R-hat) convergence diagnostic measures are adequate [49]. The R2MLwiN package employs a hybrid approach that can generate both frequentist and Bayesian-style output, serving as an interface to the standalone software MLwiN, which provides both estimation methods: Iterative Generalized Least Squares (IGLS) for maximum likelihood and MCMC for Bayesian inference. R2MLwiN offers the option to display the posterior mean and its corresponding posterior probability, but this relies on the assumption that the posterior distribution is approximately normal [50].
MCMCglmm is a fully Bayesian package for fitting generalized linear mixed models via MCMC. In a strict Bayesian framework, inference typically rests on credible intervals (the probability that the true parameter value lies within a given range) or posterior probabilities (e.g., the proportion of posterior samples above or below zero). Model fitting under Metropolis-Hastings relies on the random-walk behavior of Algorithms 1 and 2, which can be inefficient for exploring complex or high-dimensional distributions. In contrast, the HMC algorithm moves along trajectories guided by the gradient of the posterior, taking longer, more deliberate steps into distant regions of the parameter space while retaining high acceptance rates. In this study, all continuous predictors were standardized (z-scored) before analysis to ensure coefficient comparability across algorithms, and a probit link was used for the binary outcome (Y2) across all software implementations (R2MLwiN, MCMCglmm, and BRMS/Stan) to maintain comparability. Nevertheless, the marked differences in coefficient and intercept estimates between Tables 6 and 8 were expected, owing to parameterization differences between the Metropolis-Hastings and HMC packages, such as prior distributions and residual variance scaling. Metropolis-Hastings packages such as R2MLwiN frequently use the inverse-Wishart distribution as a conjugate prior for covariance matrices; when variances are close to zero, this prior can be informative and biased, yielding overly optimistic coefficient estimates. HMC packages such as BRMS/Stan instead advocate an LKJ prior for correlation matrices with separate priors for standard deviations. This decoupling enables far more flexible and precise correlation estimation, yielding distinct (and often more reliable) coefficients than inverse-Wishart models.
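To make the random-walk behavior concrete, here is a minimal random-walk Metropolis-Hastings sampler (an illustrative Python/NumPy sketch under a toy standard-normal posterior, not the paper’s implementation): each proposal is a local Gaussian perturbation, accepted with probability min(1, posterior ratio), which is exactly why exploration is slow in high dimensions.

```python
import numpy as np

def random_walk_mh(log_post, init, n_iter=5000, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings: propose x' = x + N(0, step^2) and
    accept with probability min(1, exp(log_post(x') - log_post(x)))."""
    rng = np.random.default_rng(seed)
    x, lp = init, log_post(init)
    draws = np.empty(n_iter)
    accepted = 0
    for i in range(n_iter):
        prop = x + step * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis acceptance rule
            x, lp = prop, lp_prop
            accepted += 1
        draws[i] = x
    return draws, accepted / n_iter

# demo: a standard normal "posterior"; the random walk explores it only locally
draws, acc_rate = random_walk_mh(lambda x: -0.5 * x**2, init=0.0, n_iter=20000)
```

HMC replaces the blind Gaussian proposal with a trajectory that follows the gradient of `log_post`, which is what allows its larger, well-accepted moves.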
The PSIS k-diagnostics report shown in Fig 3 improves the stability of importance-sampling estimates by smoothing the importance weights. The stability and reliability of the importance-sampling estimates, as assessed by the Pareto shape parameter k for LOO, are within the acceptable range. The threshold in BRMS is 0.7, and since no Pareto shape parameter k reaches 0.7, there are no problematic observations in the full random-slopes model.
A prior predictive check, shown in Fig 4, is a crucial step in the Bayesian workflow: it evaluates whether the chosen prior distributions for a model’s parameters generate data consistent with domain knowledge or expectations before the observed data are considered [52]. It diagnoses priors that are too strong, too weak, or poorly located, any of which can lead to problematic posteriors [51]. These checks generate parameter draws and predicted values based only on the prior distributions of the model parameters. The weakly informative prior, shown in Fig 4(b), yields simulated lines that overlay the data better than the default prior setup in Fig 4(a).
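The idea can be sketched as follows (illustrative Python/NumPy code under assumed priors, not the priors used in the paper): simulate outcomes using draws from the priors alone, before any data enter the model, and compare the scale of the simulations against domain expectations.

```python
import numpy as np

def prior_predictive(n_sims=1000, n_obs=50, beta_scale=1.0, seed=1):
    """Simulate outcomes from the priors alone (no data). Assumed priors:
    beta ~ Normal(0, beta_scale) and sigma ~ half-Cauchy(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_obs)               # a standardized predictor
    sims = np.empty((n_sims, n_obs))
    for s in range(n_sims):
        beta = rng.normal(0.0, beta_scale)       # prior draw for the slope
        sigma = abs(rng.standard_cauchy())       # half-Cauchy(0, 1) scale draw
        sims[s] = beta * x + rng.normal(0.0, sigma, n_obs)
    return sims

# a vague slope prior produces implausibly extreme simulated outcomes,
# while a weakly informative one keeps them on a realistic scale
wide = prior_predictive(beta_scale=100.0)
narrow = prior_predictive(beta_scale=1.0)
```

A prior whose simulations sit far outside plausible clinical ranges is "poorly located" in the sense described above and should be tightened before fitting.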
In this application, hierarchical models, particularly the full random model in HMC, frequently mix better; however, performance depends on data size, marginalization, and parameterization. Because the dataset is so small, the information criteria can vary across different implementations [53].
3.4. Evaluation of MCMC convergence diagnostics and conditional/marginal effects for the best-fitted model
The MCMC convergence diagnostics for ESS values in the models fitted under Metropolis-Hastings confirm that the posterior estimates are adequately characterized for the null and full random-intercepts-and-slopes models. However, a slight discrepancy in posterior estimates was observed for the random-intercept-with-fixed-slopes model.
In contrast, for all models fitted using Gibbs sampling and HMC, the ESS values were adequate (above 400), ensuring reliable posterior estimates and better convergence [36]. The ESS metric confirmed that the MCMC simulations ran long enough and mixed well enough to produce reliable results via HMC.
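For reference, the split-R-hat diagnostic used alongside ESS throughout this section can be computed as below (a simplified Python/NumPy sketch of the Gelman-Rubin statistic; modern implementations add rank-normalization and folding per Vehtari et al. [36]):

```python
import numpy as np

def split_rhat(chains):
    """Split-R-hat: split each chain in half, then compare between-chain
    variance (B) with within-chain variance (W). `chains` has shape
    (n_chains, n_draws); values near 1.00 indicate convergence."""
    half = chains.shape[1] // 2
    halves = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    m, n = halves.shape
    B = n * halves.mean(axis=1).var(ddof=1)    # between-chain variance
    W = halves.var(axis=1, ddof=1).mean()      # within-chain variance
    var_plus = (n - 1) / n * W + B / n         # pooled posterior variance estimate
    return np.sqrt(var_plus / W)
```

Well-mixed chains give values just above 1.00, while a chain stuck in a different region inflates B and pushes R-hat well past the 1.10 threshold cited here.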
The full random-intercepts-and-slopes model estimated via HMC, with default priors for the fixed terms and a half-Cauchy(0, 1) prior for the scale parameters, is the best-fitting model for this dataset. This choice yielded good model convergence and sufficient ESS values (i.e., greater than or equal to 1000). For this Bayesian hierarchical random-slope model, we checked convergence diagnostics, effect directions, and the marginal effects of the predictors, and present a posterior predictive check (PPC) plot of predicted versus observed data.
In addition to R-hat, the Bulk Effective Sample Size (Bulk-ESS), which indicates whether posterior means and medians are reliable, and the Tail Effective Sample Size (Tail-ESS), which indicates whether posterior variances and tail quantiles are reliable, were adequate; the corresponding trace plots for the HMC approach are shown in Figs 5 and 6. Figs 7 and 8 show the marginal effects of each fixed and random term. Accordingly, ultrasound and leg type showed increasingly positive effects on patients’ health status, but did not affect the disease severity score. Specifically, the marginal effects of ultrasound and RCP on patients’ health status and disease severity score were positive and linear, as shown in Fig 8.
According to the posterior predictive check (PPC) in Fig 9, the random-intercepts-and-slopes model fits well, producing nearly identical observed and posterior predicted densities. The prediction plots show the dependent variables (patients’ health status and disease severity score) of the atherosclerotic disease dataset under the joint multivariate Bayesian hierarchical model; the PPC indicates that the model reproduces the observed data closely.
4. Conclusion
This study demonstrates the effectiveness of hierarchical Bayesian modeling for jointly analyzing multivariate repeated-measures data with mixed outcome types. Using clinical data on AOD, we compared the performance of three widely used MCMC methods—Metropolis-Hastings, Gibbs sampling, and HMC—under various model specifications. The results showed that the full random-intercepts-and-slopes model estimated via HMC provides the best fit for this small clinical dataset, as measured by the DIC, WAIC, LOO-IC, and ICOMP criteria. Although the coefficient estimates were inconsistent across methods, such inconsistency is common in small clinical datasets and cautions against generalizing the model to other hierarchical structures. Model-fitting and performance differences among MCMC samplers are often attributable to their distinct approaches to sampling from a probability distribution, including how the model is formulated (e.g., direct versus conditional sampling), the prior distributions used, and the specific decisions made regarding scaling and proposal parameters. Each approach and package has its own advantages and disadvantages in managing model complexity and the target distribution [54–56].
Clinically, the findings highlight the importance of jointly modeling patient health status (binary) and disease severity scores (continuous) to capture the correlation structure better and improve prediction. The significant association between ultrasound measurements and both outcomes supports its utility as a key non-invasive diagnostic indicator in peripheral arterial disease assessments. In addition, the best-fitted model showed that RCP and ultrasound image score had a significant effect on patients’ health status.
In this small AOD dataset, with the specific priors and parameterizations used, HMC produced substantially higher ESS and more stable R-hat diagnostics than Metropolis-Hastings and Gibbs sampling. However, we recognize that these results depend on those priors and parameterizations and may not generalize to all small-sample hierarchical structures.
From a methodological perspective, our work underscores the value of gradient-based sampling methods, especially HMC, for estimating complex hierarchical models with high-dimensional parameter spaces. These models not only yield better fit and interpretability but also provide robust uncertainty quantification, which is critical in medical decision-making. The main contribution of this study is the methodological application of HMC to intricate hierarchical structures, despite the use of a historical dataset from 1988/89. The analysis demonstrates that, when properly parameterized, HMC effectively navigates the high-dimensional geometry created by hierarchical correlation structures, which often causes convergence problems in standard Gibbs sampling and Metropolis-Hastings. Specifically, using identical link functions, standardized predictors, and the packages’ default prior settings, we observed that the random slopes were estimated with high efficiency. These methodological strengths must nevertheless be balanced against two key caveats: prior sensitivity and sample size. The posterior distributions remain sensitive to the choice of hyper-priors, particularly when historical data are high-variance, and although HMC is robust, the ESS values for the late-period parameters suggest that the results should be viewed as a proof of concept for the algorithm rather than a definitive clinical profile.
Future research should focus on applying this HMC framework to contemporary longitudinal datasets to validate these methodological lessons in a modern clinical context, and explore integrating more flexible priors (e.g., shrinkage or nonparametric priors), including time-varying covariates, and comparing with frequentist multilevel approaches. Moreover, extending this modeling framework to larger sizes and more recent clinical datasets can enhance its generalizability and translational potential in cardiovascular research.
5. Appendix
5.1. Posterior estimates (mean and median) and sensitivity analysis results
The choice of a prior depends on the data structure and software package [57]. In R2MLwiN, we used a non-informative flat prior for the fixed-effect parameters, and the scaled inverse-Wishart (IW) distribution, as recommended, for the random-effect variance–covariance matrices [58]. For scale parameters, weakly informative priors beyond the defaults were preferred [11].
The MCMCglmm library in R can fit BHMs for continuous and mixed dependent variables. Its default priors for model coefficients and intercepts are uninformative standard normal distributions [59]. The prior for the higher-order random effects is specified by two separate terms, the G-structure and the residual R-structure, with the defaults V and nu (mean mu). A wide range of distributions and link functions is supported in BRMS [60]. By default, BRMS [61] uses a half-Student-t prior with 3 degrees of freedom; this prior is generally less informative but leads to better model convergence than the half-Cauchy prior. Owing to differences in model parameterizations and scaling, the predictive performance of models fitted with R2MLwiN, MCMCglmm, and RStan/BRMS was not identical. Different thinning, burn-in, and iteration values were applied across the packages. All HMC models were run with four chains on four cores, a warm-up of 1000 iterations, a thinning interval of 1, and 10,000 iterations.
Bayesian models in MCMCglmm were fitted via Gibbs sampling for 5000 iterations after 100,000 burn-in iterations, with a thinning interval of 2000. The R2MLwiN models using the Metropolis-Hastings approach were fitted for 10,000 iterations after 1,000 burn-in iterations, with a thinning interval of 10, as the default setup [62].
Various priors have been suggested for the variability parameters of hierarchical models, depending on the fitted Bayesian model structure and the MCMC method [63–65]. Some researchers have proposed non-informative prior distributions, including the uniform and inverse-gamma families, for use in Gibbs sampling [66]. Others have suggested the half-t family as a relatively weakly informative prior for hierarchical models. The half-Student-t prior, the default in BRMS for SD parameters, leads to better convergence, although local shrinkage parameters can lead to more divergent transitions in BRMS/Stan [67]. This study considered robust choices for group-level standard deviations in Bayesian hierarchical models, including half-normal (0, 5), half-Student-t (3 degrees of freedom), and half-Cauchy prior distributions [68,69].
Kallioinen et al. (2024) and Wesner & Pomeranz (2021) proposed a half-normal distribution for SD priors in BRMS; a truncated normal distribution is a sensible choice because the standard deviation cannot be negative [70,71]. However, a prior on the random-effect parameter with a long right tail is “conservative” because it yields large SD estimates.
Other scholars (Gelman & Hennig, 2016; Zwet & Gelman, 2022) proposed a half-Cauchy prior with a mode at zero and a large scale parameter [72,73]. This reflects the half-Cauchy prior’s ability to supply sufficient information for the small number of groups inherent in the hierarchical structure of the data. To reduce the likelihood of unrealistically large SD estimates, the BRMS/Stan documentation suggests a half-Cauchy prior, which automatically bounds the SD at 0; RStan renormalizes the distribution so that the area between the bounds integrates to 1. The half-Cauchy(0, 1) prior is the special case of the half-Student-t distribution with one degree of freedom. It occupies a reasonable middle ground between prior classes, performing well near the origin without drastic compromises in the estimates of the population-level (location) and group-level effects of the parameter space [74].
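The relation between the half-Cauchy and the half-Student-t family can be checked numerically (an illustrative Python/NumPy sketch): a half-Cauchy(0, scale) draw is the absolute value of a Cauchy draw, which coincides in distribution with a half-Student-t draw with one degree of freedom, and both have median equal to the scale.

```python
import numpy as np

rng = np.random.default_rng(42)
scale = 1.0

# half-Cauchy(0, scale): the absolute value of a Cauchy(0, scale) draw
half_cauchy = np.abs(scale * rng.standard_cauchy(100_000))

# half-Student-t with 1 degree of freedom and the same scale
half_t1 = np.abs(scale * rng.standard_t(df=1, size=100_000))

# both samples share the same distribution; their medians are close to `scale`
```

The heavy right tail visible in such samples is exactly the "conservative" long-tail behavior discussed above: it keeps mass on large SD values rather than forcing shrinkage toward zero.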
Sensitivity analysis of priors (S1 Table) demonstrates the robustness of the Bayesian analysis across different prior distributions. According to Aguilar and Bürkner (2022), any prior for an SD component of the random parameters in a hierarchical model is practically well-defined only on the non-negative real numbers [75]. In this study, we used the BRMS default, a truncated Student’s t distribution with 3 degrees of freedom, as the reference prior. Since negative values do not apply to a standard deviation, we also used a very informative truncated normal prior with a mean of 5 and a standard deviation of 0.01. A half-Cauchy prior was likewise employed in the sensitivity analysis of prior impact on the Bayesian hierarchical models for the applied dataset [76].
Models with an ESS greater than 1000 and an R-hat value close to 1.00 (and not exceeding 1.10) demonstrated the consistency of the ensemble of Markov chains [48]. Moreover, both Bulk-ESS and Tail-ESS should be at least roughly 100 per Markov chain for the estimates of the respective posterior quantiles to be reliable. In this study, the R-hat, Bulk-ESS, and Tail-ESS results for the null (empty), random-intercept, and full random-slopes models met these convergence diagnostics. Therefore, the ESS and potential scale reduction (R-hat) diagnostics are sufficient for stable estimates in each fitted model [32].
We conducted a sensitivity analysis of priors to scrutinize the results of the fully specified hierarchical model (random intercepts and slopes via Hamiltonian Monte Carlo), comparing the default (reference) prior with different prior distributions. The posterior mean estimates and 95% highest posterior density (HPD) intervals for the fixed and random effects, including the SDs of the jointly modeled disease severity score (Y1) and health status (Y2), changed little across the specified priors, indicating a practically identical interpretation of the estimates regardless of the prior. Since there was no meaningful percentage deviation across the alternative prior specifications, we reported the model results using the half-Cauchy prior (alternative prior I).
As Depaoli et al. (2020) explain, sensitivity analysis results can be presented visually, as in Shiny app plots, or in a table indicating the degree of discrepancy in estimates or HPD intervals across parameters, as we do in the Appendix table below. For the final full random hierarchical model in BRMS, the posterior medians and 95% HPD credible intervals (95% CI) for the fixed and random effects, including the SDs, again changed little with the specified priors, indicating a practically identical interpretation of the estimates regardless of the priors. We therefore report the model results using the half-Cauchy prior, which provided good model convergence and adequate ESS values (i.e., at least roughly 100 per chain), with no significant percentage deviation between models regardless of the alternative prior specification.
Most primary inferences (S1 Table) regarding the covariate (ultrasound) are robust to the choice of prior, as the signs and credible-interval behaviors do not change meaningfully. However, the variance components and RCP effects are prior-sensitive; for these parameters, the posterior is influenced more heavily by the prior distribution than by the data. Ultrasound (Y2) is highly robust, with a negligible −0.27% change under the half-Cauchy prior and an 8.2% change under the normal prior. RCP (Y1) shows high numerical sensitivity (−96.4% and −90.8% deviation), likely due to the small absolute magnitude of the estimate (0.0210 vs 0.5801); the practical inference nonetheless remains similar, as the effect size is near zero. The level-2 variance (Y1) parameter is highly sensitive to the choice of prior, with a −93.7% deviation under the half-Cauchy prior and a −98.9% deviation under the normal prior, suggesting that the data provide limited information about this variance component and making the prior choice influential. The correlation between Y1 and Y2 is sensitive to the scale prior, shifting from 0.0401 (default) to 0.0801 (+99.8% change) under the half-Cauchy specification, but it remained relatively stable under the normal prior (−0.6%) and did not alter the primary conclusion that HMC provided a better model fit.
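For transparency, the percentage deviations quoted above are presumably computed as the relative change of the alternative-prior estimate from the reference (default-prior) estimate; the hypothetical helper below illustrates the arithmetic.

```python
def pct_deviation(alternative, reference):
    """Percentage deviation of an alternative-prior posterior estimate
    from the reference (default-prior) estimate."""
    return 100.0 * (alternative - reference) / reference

# reproduces the reported correlation shift under the half-Cauchy prior:
# 0.0401 (default) -> 0.0801 is roughly a +99.8% change
```
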
Supporting information
S1 Table. Posterior estimates (mean and median) and sensitivity analysis results.
https://doi.org/10.1371/journal.pone.0346331.s001
(DOCX)
References
- 1. Ebrahim EA, Cengiz MA, Terzi E. The best fit Bayesian hierarchical generalized linear model selection using information complexity criteria in the MCMC approach. J Math. 2024;2024:1459524.
- 2. Liu X. Longitudinal transition models for categorical response data. Methods and Applications of Longitudinal Data Analysis. Elsevier. 2016. 379–410.
- 3. Twisk JW. Applied multilevel analysis: A practical guide for medical researchers. Cambridge University Press. 2006.
- 4. Kang L, Kang X, Deng X, Jin R. A Bayesian hierarchical model for quantitative and qualitative responses. J Qual Technol. 2018;50:290–308.
- 5. Rolfe M. Bayesian models for longitudinal data. Queensl Univ Technol Discip Math Sci. 2010:293.
- 6. Tango T. Repeated measures design with generalized linear mixed models for randomized controlled trials. Minato-ku Tokyo, Japan: Center for Medical Statistics. 2017.
- 7. Vanbrabant K, Boddez Y, Verduyn P, Mestdagh M, Hermans D, Raes F. A new approach for modeling generalization gradients: A case for hierarchical models. Front Psychol. 2015;6:652. pmid:26074834
- 8. Hadfield JD. MCMC methods for multi-response generalized linear mixed models: The MCMCglmm R package. J Stat Softw. 2010;33:1–22.
- 9. Lages M, Scheel A. Logistic mixed models to investigate implicit and explicit belief tracking. Front Psychol. 2016;7:1681. pmid:27853440
- 10. Korner-Nievergelt F, Roth T, von Felten S, Guélat J, Almasi B, Korner-Nievergelt P. Markov chain Monte Carlo simulation. Bayesian data analysis in ecology using linear models with R, BUGS, STAN. 2015. 197–212.
- 11. Lemoine NP. Moving beyond noninformative priors: Why and how to choose weakly informative priors in Bayesian analyses. Oikos. 2019;128(7):912–28.
- 12. Santos-Fernandez E, Wu P, Mengersen KL. Bayesian statistics meets sports: A comprehensive review. J Quant Anal Sport. 2019;15:289–312.
- 13. Wu D, Goldfeld KS, Petkova E, Park HG. A Bayesian multivariate hierarchical model for developing a treatment benefit index using mixed types of outcomes. BMC Med Res Methodol. 2024;24(1):218. pmid:39333874
- 14. Seedorff N, Brown G, Scorza B, Petersen CA. Joint Bayesian longitudinal models for mixed outcome types and associated model selection techniques. Comput Stat. 2023;38(4):1735–69. pmid:38292019
- 15. Goldstein H, Carpenter J, Kenward MG, Levin KA. Multilevel models with multivariate mixed response types. Statistical Modelling. 2009;9(3):173–97.
- 16. Bai H, Zhong Y, Gao X, Xu W. Multivariate mixed response model with pairwise composite-likelihood method. Stats. 2020;3(3):203–20.
- 17. Tidemann-Miller B, Reich B, Staicu AM. Modeling multivariate mixed-response functional data. 2016.
- 18. Kapur K, Li X, Blood EA, Hedeker D. Bayesian mixed-effects location and scale models for multivariate longitudinal outcomes: An application to ecological momentary assessment data. Stat Med. 2015;34(4):630–51. pmid:25409923
- 19. Tate RL, Pituch KA. Multivariate hierarchical linear modeling in randomized field experiments. The Journal of Experimental Education. 2007;75(4):317–37.
- 20. Wang WL, Fan TH. Bayesian analysis of multivariate t linear mixed models using a combination of IBF and Gibbs samplers. J Multivar Anal. 2012;105:300–10.
- 21. Alfò M, Giordani P. Random effect models for multivariate mixed data: A Parafac-based finite mixture approach. Statistical Modelling. 2021;22(1–2):46–66.
- 22. Ganjali M. A model for mixed continuous and discrete responses with possibility of missing data. J Sci Islam Repub Iran. 2003;14:53–60.
- 23. Leyland AH, Groenewegen PP. Multilevel Modelling for Public Health and Health Services Research. Springer International Publishing. 2020.
- 24. Hilbe JM. Data analysis using regression and multilevel/hierarchical models. J Stat Soft. 2009;30(Book Review 3).
- 25. Liu X. Linear mixed-effects models. Methods and Applications of Longitudinal Data Analysis. Elsevier. 2016. 61–94.
- 26. Rasbash J, Steele F, Browne WJ, Goldstein H. A User’s Guide to MLwiN, v3.05. University of Bristol: Centre for Multilevel Modelling. 2020.
- 27. Halliwell B, Holland BR, Yates LA. Multi-response phylogenetic mixed models: Concepts and application. Biol Rev Camb Philos Soc. 2025;100(3):1294–316. pmid:40192008
- 28. Depaoli S, van de Schoot R. Improving transparency and replication in Bayesian statistics: The WAMBS-Checklist. Psychol Methods. 2017;22(2):240–61. pmid:26690773
- 29. Betancourt M, Girolami M. Hamiltonian Monte Carlo for hierarchical models. Current Trends in Bayesian Methodology with Applications. 2013. 79–102.
- 30. Brooks S, Gelman A, Jones GL, Meng XL. Handbook of Markov Chain Monte Carlo. 2011.
- 31. Kruschke JK, Liddell TM. The Bayesian new statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychon Bull Rev. 2018;25:178–206.
- 32. Turek D, de Valpine P, Paciorek CJ. Efficient markov chain monte carlo sampling for hierarchical hidden markov models. Environ Ecol Stat. 2016;23:549–64.
- 33. Gamerman D, Lopes HF. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, Second Edition. 2006.
- 34. Gelman A, Rubin DB. Inference from iterative simulation using multiple sequences. Stat Sci. 1992;7:457–72.
- 35. Gelman A, Shirley K. Inference from simulations and monitoring convergence. Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC. 2011. 162–74.
- 36. Vehtari A, Gelman A, Simpson D, Carpenter B, Bürkner P-C. Rank-normalization, folding, and localization: An improved R̂ for assessing convergence of MCMC (with discussion). Bayesian Anal. 2021;16(2).
- 37. Almond RG. A comparison of two MCMC algorithms for hierarchical mixture models. CEUR Workshop Proc. 2014;1218:1–19.
- 38. Coly S, Garrido M, Abrial D, Yao A-F. Bayesian hierarchical models for disease mapping applied to contagious pathologies. PLoS One. 2021;16(1):e0222898. pmid:33439868
- 39. Nishio M, Arakawa A. Performance of Hamiltonian Monte Carlo and No-U-Turn Sampler for estimating genetic parameters and breeding values. Genet Sel Evol. 2019;51(1):73. pmid:31823719
- 40. Yamada T, Ohno K, Ohta Y. Comparison between the Hamiltonian Monte Carlo method and the Metropolis–Hastings method for coseismic fault model estimation. Earth Planets Space. 2022;74(1).
- 41. Bradley JR. Joint bayesian analysis of multiple response-types using the hierarchical generalized transformation model. Bayesian Anal. 2022;17(1).
- 42. Zhang C, Li Z, Shen Z, Xie J. A Hybrid Stochastic Gradient Hamiltonian Monte Carlo Method. 2021. 1–2.
- 43. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21:1087–92.
- 44. Neal RM, Rosenthal JS. Efficiency of reversible MCMC methods: elementary derivations and applications to composite methods. 2023. https://arxiv.org/abs/2305.18268v2
- 45. Loenneke JP, Fahs CA, Rossow LM, Sherk VD, Thiebaud RS, Abe T, et al. Effects of cuff width on arterial occlusion: implications for blood flow restricted exercise. Eur J Appl Physiol. 2012;112(8):2903–12. pmid:22143843
- 46. Percy DF. Blocked arteries and multivariate regression. Biometrics. 1992;48:683.
- 47. Kasapis C, Gurm HS. Current approach to the diagnosis and treatment of femoral-popliteal arterial disease. A systematic review. Curr Cardiol Rev. 2009;5(4):296–311. pmid:21037847
- 48. Dominique A, Denton A, Brownhill S. Doing Bayesian Data Analysis. 2015.
- 49. Vehtari A, Simpson D, Gelman A, Yao Y, Gabry J. Pareto Smoothed Importance Sampling. 2015.
- 50. Zhang Z, Parker RMA, Charlton CMJ, Leckie G, Browne WJ. R2MLwiN: A package to run MLwiN from within R. J Stat Softw. 2016;72:1–43.
- 51. van Zundert C, Somer E, Miočević M. Prior predictive checks for the method of covariances in Bayesian mediation analysis. Struct Equ Model A Multidiscip J. 2022;29:428–37.
- 52. Nott DJ, Wang X, Evans M, Englert BG. Checking for prior-data conflict using prior-to-posterior divergences. Stat Sci. 2020;35:234–53.
- 53. Vehtari A, Gelman A, Gabry J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat Comput. 2016;27:1413–32.
- 54. Cook SR, Gelman A, Rubin DB. Validation of software for Bayesian models using posterior quantiles. J Comput Graph Stat. 2006;15:675–92.
- 55. Karras C, Theodorakopoulos L, Karras A, Krimpas GA, Bakalis C-P, Theodoropoulou A. MCMC methods: from theory to distributed Hamiltonian Monte Carlo over PySpark. Algorithms. 2025;18(10):661.
- 56. Shukla A, Vats D, Chi EC. Proximal Hamiltonian Monte Carlo. 2025. https://arxiv.org/pdf/2510.22252v1
- 57. Alvarez I, Niemi J, Simpson M. Bayesian inference for a covariance matrix. Conf Appl Stat Agric. 2014.
- 58. Leckie G, Steele F. Multilevel modelling of repeated measures data: MLwiN practical. 2016.
- 59. Grömping U. Multilevel modeling using R. J Stat Softw. 2015;62.
- 60. McElreath R. Statistical rethinking: A Bayesian course with examples in R and Stan. 2nd ed. Chapman and Hall/CRC. 2018.
- 61. Bürkner PC. brms: An R package for Bayesian multilevel models using Stan. J Stat Softw. 2017;80:1–28.
- 62. Mai Y, Zhang Z. Review of software packages for Bayesian multilevel modeling. 2018.
- 63. Röver C, Bender R, Dias S, Schmid CH, Schmidli H, Sturtz S, et al. On weakly informative prior distributions for the heterogeneity parameter in Bayesian random-effects meta-analysis. Res Synth Methods. 2021;12(4):448–74. pmid:33486828
- 64. Depaoli S, Winter SD, Visser M. The importance of prior sensitivity analysis in bayesian statistics: Demonstrations using an interactive shiny App. Front Psychol. 2020;11:608045. pmid:33324306
- 65. Sugasawa S. Prior Sensitivity Analysis without Model Re-fit. 2024. http://arxiv.org/abs/2409.19729
- 66. Ghosh J, Li Y, Mitra R. On the Use of Cauchy Prior Distributions for Bayesian Logistic Regression. Bayesian Anal. 2018;13(2).
- 67. Piironen J, Vehtari A. Projection predictive model selection for Gaussian processes. 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 2016. 1–6.
- 68. Congdon PD. Bayesian Hierarchical Models: With Applications Using R. 2nd ed. Chapman & Hall/CRC. 2020.
- 69. Chien Y-F, Zhou H, Hanson T, Lystig T. Informative g-Priors for Mixed Models. Stats. 2023;6(1):169–91.
- 70. Kallioinen N, Paananen T, Bürkner PC, Vehtari A. Detecting and diagnosing prior and likelihood sensitivity with power-scaling. Stat Comput. 2024;34.
- 71. Wesner JS, Pomeranz JPF. Choosing priors in Bayesian ecological models by simulating from the prior predictive distribution. Ecosphere. 2021;12(9).
- 72. Zwet E, Gelman A. A proposal for informative default priors scaled by the standard error of estimates. Am Stat. 2022;76:1–9.
- 73. Gelman A, Hennig C. Beyond subjective and objective in statistics. 2016.
- 74. Polson NG, Scott JG. Local Shrinkage Rules, Lévy Processes and Regularized Regression. Journal of the Royal Statistical Society Series B: Statistical Methodology. 2012;74(2):287–311.
- 75. Aguilar JE, Bürkner PC. Intuitive joint priors for Bayesian linear multilevel models: The R2-D2-M2 prior. 2022.
- 76. Wiesenfarth M, Calderazzo S. Quantification of prior impact in terms of effective current sample size. Biometrics. 2020;76(1):326–36. pmid:31364156